oecgi4 timing out
Hi All,
Just posting here because I'm kind of desperate for a quick resolution.
I'm having trouble with oecgi4 timing out and not processing the requests.
We're in the final stages of development/testing of our APIs before we go live and now things just aren't working any more.
I've had inconsistent failures up till now which I've been able to let slide because subsequent requests would be successful. Now though, it is persistent, preventing users from even logging in.
The common theme seems to be only when the request is a "POST" and the "POST" has a body. Doesn't seem to matter what is in the body. "GET" requests seem to work and even the "POST" requests work if you omit the body. (by work, I mean the OECGI processes them).
Changing the timeout config is not the answer. I originally had it set to 15 minutes, so I thought the problem was IIS because the requests just seemed to hang. But when I changed the timeout to two minutes, I got a 502 Bad Gateway response indicating that the CGI application had timed out.
I am currently testing with just the login call that has a body containing just a json string of username and password. Previously, calls with far more complicated bodies have worked without issue so it seems just the existence of a body is now causing problems.
Appreciate any thoughts/suggestions.
Mark
Comments
I had just finished reading your post on the Rev board and was about to respond when this came in. So, I'll give preference to our own forum. :)
Timeouts can occur at different links in the HTTP chain. Where exactly is the timeout occurring? For instance, is it the client, the server, or OECGI? What you wrote "changing the timeout config is not the answer", where is that config? Is this IIS?
Have you tried using a persistent connection? This helped me with a specific project. However, I must say that I have not experienced the behavior you are describing as being regular. This does make me think it is an IIS issue.
I also give preference to yours. Rarely get a response on the general forum and it appears my works subscription has expired so I can't get to the other one.
The timeout is oecgi.
I changed it within the cgi config of IIS.
When the call does time out, I get this response:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
IIS 10.0 Detailed Error - 502.1 - Bad Gateway
HTTP Error 502.1 - Bad Gateway
The specified CGI application misbehaved by not returning a complete set of HTTP headers. The headers it did return are "".
Most likely causes:
Things you can try:
Detailed Error Information:
More Information:
This error occurs when the request is configured to handle by a CGI application and the CGI application takes longer than the configured timeout to handle the request. If the request usually takes longer than the configured timeout to run, increase the timeout value. Otherwise, troubleshoot the CGI application to determine the delay.
How long does it take before that error message gets returned? What I am wondering is whether this is a true timeout or an error condition that makes IIS think it is a timeout.
For comparison purposes have you tried OECGI3.exe?
Does it matter what client you are testing with? For instance, I presume this is a REST API you are calling. Therefore, does it make a difference if you are using a browser/JavaScript versus Postman to call the API?
I'm still not convinced this isn't a red herring. IIS may be pointing the finger at OECGI4.exe, but like I said, I have never had this occur other than when I am debugging an API and truly creating a timeout situation.
The request should take seconds at most. The error message is being returned once the cgi timeout value has been reached whether I set that at two minutes or fifteen. I'm happy to entertain that it is some other error message but I have no idea how to determine if that is the case.
I haven't tried oecgi3 in this instance but I think I tried that a few months ago for something else (which may actually have been related to this) but couldn't actually get oecgi3 to run at all so just moved on. I will try again if I don't come up with anything else.
Just using the browser doesn't help because I can't change the verb to a "POST" and add a body. So the same call in the browser will work or at least make its way through to OI because it is a "GET". The same thing works in Postman. I don't know enough about javascript to set up a browser/javascript test.
Incidentally, it came to my attention yesterday because an end user was testing from the mobile app on their phone. They had closed the app, then went to log back in and it just sat there spinning its wheels. There's some hoops for the app to jump through before it gets to our backend but when I test locally through Postman, I'm getting the same issue and my calls are going direct to IIS and oecgi4.
Basically I was hoping the app guys had broken their end but alas the buck stops here.
Did you test any of the POST APIs that came pre-installed with the SRP HTTP Framework? Do they time out as well?
Since you can duplicate this from different clients (especially Postman) then I do not think the client is at issue.
You noted originally that the POST requests do get processed. Have you confirmed that the BASIC+ code is getting all the way back to the final return in HTTP_MCP? Have you enabled logging so you can inspect the response getting back to OECGI?
Finally, are you absolutely sure that you aren't doing something in the way you handle your POST requests that would be causing problems on the return trip? I might suggest you hardcode a simple response rather than let the API create the response to a POST request and see if you still get a timeout. If you do not get a timeout, then that tells me the problem is in the response. If you do get a timeout, then I am inclined to believe that something is wrong with the way POST methods are being processed. You might want to test the PUT method as well just to see if that produces similar results.
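A small script can run the same comparison outside Postman by sending the identical POST with and without a body and timing each attempt. This is only a sketch under assumptions: `LOGIN_URL` is a placeholder and not the real endpoint, the helper names `make_post` and `probe` are made up for illustration, and the credentials are dummies.

```python
import json
import time
import urllib.error
import urllib.request

# Placeholder URL (an assumption): substitute the real login endpoint.
LOGIN_URL = "http://localhost/api/login"


def make_post(url, body=None):
    """Build a POST request; body is a dict sent as JSON, or None for no body."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(url, data=data, method="POST")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def probe(url, body=None, timeout=10):
    """Send the request and return (outcome, seconds elapsed)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(make_post(url, body), timeout=timeout) as resp:
            outcome = str(resp.status)
    except urllib.error.HTTPError as err:
        outcome = str(err.code)        # e.g. 401 or 502 returned by the server
    except (urllib.error.URLError, OSError) as err:
        outcome = f"failed: {err}"     # includes client-side timeouts
    return outcome, round(time.monotonic() - start, 2)
```

Calling `probe(LOGIN_URL)` and then `probe(LOGIN_URL, {"username": "u", "password": "p"})` gives one result per variant. If only the second call runs until the timeout, that matches the "POST with a body" pattern described in this thread. The same helper could be adapted to PUT by changing the `method` argument.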
It is both a mobile and web app. Depends on the user and the feature. The web is like an extended version.
I'm doing the OI APIs and a third party mobile app developer is doing the front end so I'm not privy to their toolset.
Did I test any of the pre-installed POST APIs? No, not recently at least, but I didn't really need to. I was successfully posting using my code as recently as Friday. In fact even yesterday I had intermittent success. When it works, it works well and is all very exciting.
When I said the POST requests did get processed, they then do exactly as I expect. All the code returns generally with an error because there's no body to post or an unauthorised error. Either way it returns the response I've coded for. The issue arises now when I add a body to the request. I have the oesocketserver running in debug mode so I can see when stuff is happening but when there's a body in the POST the socketserver doesn't do a thing, no engine is started, so it's not the return trip because it doesn't even get to the beginning of http_mcp let alone the end of it.
As I finished typing the above I glanced back over to Postman and noticed that the last POST request I sent actually worked. So I sent it again. And it worked again. This is the one I had working on Friday. I only brought it back up because I was documenting it.
So I retried the login request which is a very simple one. It times out without so much as a flicker on the socketserver. Aaaarrrgghhhh!!!
So I took the slightly more complicated body from the working request and placed it in the non working request and whammo, the non working one returned immediately with a correct "Unauthorized" response.
So this body works
{
"whs":
{
"incidentDt": "2016-06-17 18:42:22",
"site": "Johhnos place",
"employeeName": "HavaCookie",
"details": "These are some new facts. This is a pretty busy place",
"incidentType": "Incident"
}
}
yet this one doesn't (anymore at least)
{
"username": "markb",
"password": "pw"
}
Somebody please share with me what I am not seeing???
I see nothing unusual with the body that would cause any problems. Of course, I don't know everything about your request, like the header names and values, but I would seriously doubt that your body should be relevant. It's just string data.
You reported something that I misunderstood before. I thought you had originally indicated that these timed out requests are actually being processed in OpenEngine. It seems clear to me from the above comments that these requests are simply not getting through at all. Am I correct?
Have you tried running the socket server in debug mode to even see if the requests get that far?
I am even more convinced this is not a timeout issue. Timeout is a symptom of the problem, not the cause. Rather, IIS is simply unable to talk to OECGI at all.
If you have some degree of control over the web server then I would install a lightweight server like Abyss just to see if it handles your requests. This would only be for comparison purposes...unless, of course, you are able to switch over permanently. But at least you can get some peace of mind. I am beginning to suspect this is an IIS problem.
Sorry for not being clear enough previously though I suspect it may have also had something to do with you really needing to be sleeping rather than trying to solve my problems. :)
That's correct. When it doesn't work, it doesn't get through at all. Timeout likely is just a symptom but the symptoms are about all I can report. Yes I run the socketserver in debug mode. That's why I'm focussing on an issue with the oecgi. The socketserver doesn't do anything and IIS returns its own error when it determines that the cgi process has timed out.
I will see what I can do with installing Abyss
1Login failed-3101
Does that turn any lightbulbs on for anyone?
I've only seen that error when I try to launch CTO/ARev32 and there are not enough licenses. However, it simply means that a connection to the engine server could not be made.
I've been monitoring the discussion between you and Bryan on the forum. I'll keep my own comments isolated to this forum to avoid confusion and so as to not become a distraction to Bryan's analysis.
That said, your latest post has me confused. I think I would like clarification on the following items:
1. Made sure every user has full control over the entire directory that oecgi4 is in.
By this do you mean network users as opposed to app users? I hope so, as the latter would make no sense at all.
2. Retyped the body of the request (as opposed to using the saved one or copy/pasting the body)
In what way was the body saved? Are you referring to Postman's feature of preserving requests?
One possible answer to why retyping makes a difference is perhaps your original body includes a non-printable character that is choking the request. Retyping allowed you to eliminate that problematic character.
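One way to check the non-printable character theory is to scan the saved body text for characters that should not normally appear in hand-typed JSON. The sketch below is an illustration only: the function name `suspicious_chars` and the choice of which characters count as suspicious are assumptions, not anything taken from Postman or the SRP HTTP Framework.

```python
import unicodedata


def suspicious_chars(body: str):
    """Return (index, codepoint, name) for characters that are unusual in a JSON body."""
    allowed_controls = {"\n", "\r", "\t"}  # whitespace that JSON permits between tokens
    found = []
    for i, ch in enumerate(body):
        category = unicodedata.category(ch)
        # Cc = control characters, Cf = invisible format characters (BOM, zero-width space)
        is_control = category in ("Cc", "Cf") and ch not in allowed_controls
        # Zs = space separators; anything other than a plain space is suspect
        is_odd_space = category == "Zs" and ch != " "
        if is_control or is_odd_space:
            found.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")))
    return found
```

Pasting the saved request body into `suspicious_chars(...)` returns an empty list when nothing unusual is present. Any entries give the position and Unicode name of each character worth a closer look.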
Thanks for monitoring and clarifying.
1. Yes network users though personally neither makes any sense to me. If a network user doesn't have full control how can they post some requests and not others?
2. Yes Postman's feature of preserving requests. A non-printable character is possible but where did it suddenly come from and how come an end user has the same issue on their mobile device (causing me to investigate in the first place)?
My head is spinning with the "doesn't make any sense" symptoms.
1. Yes network users though personally neither makes any sense to me. If a network user doesn't have full control how can they post some requests and not others?
No, it doesn't make sense which is why I really don't think this is relevant. I just wanted to confirm what you meant. I don't even think network rights are utilized, at least not directly with the OECGI. IIS is the application that invokes the OECGI.
2. Yes Postman's feature of preserving requests. A non-printable character is possible but where did it suddenly come from...
I know this is a rhetorical question, but we might not ever know the answer to this question so I suggest putting it aside.
...and how come an end user has the same issue on their mobile device (causing me to investigate in the first place).
Indeed, that part is a mystery that I don't have a plausible explanation for yet. Does this mean the mobile app users are still having trouble or have you not tested this since you retyped the Postman request body?
Of course, you actually have no idea what your web app developer is passing in, do you?
My head is spinning with the "doesn't make any sense" symptoms.
If you don't mind me saying so, I think you need a fresh pair of eyes looking at this directly. But, here are a few more comments and questions for you to review:
1. Why are you using the body for credentials in the first place rather than the Authorization request header? I know this doesn't address the issue, but I had been wanting to ask this from the beginning.
2. One of the reasons I dislike IIS is because it doesn't handle the Authorization request header as well as I would like. Try removing HTTP_AUTHORIZATION from your registry, restart the engine server, and see if this makes any difference.
3. I would assume IIS has the ability to log requests. Have you tried to go that route and see what is coming in when you have the problematic requests?
4. You haven't said anything about Abyss. Have you not tried to test with that yet?
Fresh pair of eyes? Yep. Have had some others looking today and that's how we ended up getting back to working but with me doubting the resolution.
1. Body for credentials? Only for the login. This was for consistency with a previous approach already developed which talked to a Ruby-based (or similar) app. It then subsequently passed credentials using cookies. I retained the login concept but pass the credentials in the authorisation header as you suggest for all subsequent calls.
2. I will keep this one in the back of my mind as a step to try. The login call I'm using has no header anyway.
3. Yep, I've tried IIS logs but they don't tell me much. They basically just say what the request was (the url) and little more than that. I've tried using Debug Diagnostics tool but it is the opposite and litters me with verbiage with the disclaimer that it couldn't access something and so the results are probably inaccurate. So I haven't been able to decipher that as yet. The fresh eyes I've been calling on are more for people who can point me in the right direction to successfully debugging IIS.
4. Aaaahhh Abyss. Yes I downloaded it. I installed it. I struggled to configure it so I didn't really get to test with it. In conjunction with trying to get that to work and trying to figure out these symptoms, my machine started crashing and rebooting with increasing regularity. Now that may have had nothing to do with Abyss but the timing was such that I was ready to throw my machine out the window. Instead I uninstalled Abyss.
Again, possibly coincidence but on Friday Windows crashed five or six times. Today, not once.
A long shot idea just came to me. Could this be due to your rewrite rules? Maybe there is a problem there that is causing IIS not to resolve the URL properly and thus the OECGI is never really invoked. It could be something as simple as the URL ending with a "/" versus not ending with a "/".
1. I retained the login concept but pass the credentials in the authorisation header as you suggest for all subsequent calls.
How are you doing that? I mean, this would have to be done via JavaScript and I thought you are relying on a third-party for this.
2. I will keep this one in the back of my mind as a step to try. The login call I'm using has no header anyway.
Yes, but the presence of the header - even if the value is empty - might cause interference. However, if your above comment is correct, you are using the Authorization header for subsequent calls and this is not causing you any problems. Therefore I think this is likely not the issue.
3. Yep, I've tried IIS logs but they don't tell me much.
Too bad. I thought perhaps just being able to see the request body would have been helpful.
4. Aaaahhh Abyss. Yes I downloaded it. I installed it. I struggled to configure it so I didn't really get to test with it.
Let me know if you need any guidance. I've been thinking of posting a page on our wiki for Abyss, but no one really seems to be using it since IIS or Apache are either required or preferred.
In conjunction with trying to get that to work and trying to figure out these symptoms, my machine started crashing and rebooting with increasing regularity. Now that may have had nothing to do with Abyss but the timing was such that I was ready to throw my machine out the window. Instead I uninstalled Abyss.
Again, possibly coincidence but on Friday Windows crashed five or six times. Today, not once.
I have to assume coincidence. We have Abyss running on several servers and I've tested it on various workstations. I've never seen that happen. What version of Windows?
It includes the resolved address and it is correct.
1. Not sure what you're getting at here? I'm writing the APIs so am telling the third party what's required. Postman lets me simulate both scenarios.
2. Not likely but not necessarily unrelated. My stumbling block a couple of months ago was exactly that and I thought I resolved it by using 'X-Authorization' instead. However that seems no longer necessary so that's another example of now it works, now it doesn't, now it does.
3. Me too but the logs I was getting didn't contain the body, only the call and the response code. There may be some other boxes I can tick to get more value but I haven't seen them yet.
4. Thanks. If I head down that road again, I'll be sure to ask. I thought I had configured it correctly but it didn't play ball. Happy to take some coaching next time.
Windows 10. It crashes from time to time and it's annoying but from time to time I mean maybe once a week. On this occasion I had just applied a config change to Abyss when it crashed with an error that I hadn't seen before. "impersonating_worker_thread". I put it down to coincidence as well but over the next couple of days the crashes became more frequent and with that same error. Like I said, once it had crashed five times in the one day I was over it so I removed the only new thing I had added.
So you are saying that the full URL (which contains the reference to OECGI4.exe) looks correct? Per chance had you done any testing with the fully resolved URL within Postman just to verify that rewriting was not the problem?
1. Not sure what you're getting at here? I'm writing the APIs so am telling the third party what's required. Postman lets me simulate both scenarios.
Perhaps I simply misunderstood your original explanation. This was what I got from you:
* The app uses the body to pass in a JSON object containing the username and password for the initial login.
* This was done this way to preserve the pre-existing design that originally was managed by Ruby.
* However, all subsequent requests into the API utilize the Authorization header.
Thus, the first API call does not use the Authorization header but all subsequent API calls do use the Authorization header. Is that correct? Therefore, your app developers are caching the credentials that were originally passed in as a JSON object and then pass them in through the Authorization header going forward. Yes? When you wrote "I...pass the credentials in the authorisation header as you suggest for all subsequent calls", it came across as something you yourself were doing rather than the app developers. That's why I was confused.
You don't owe me an explanation, but I am still unconvinced about the explanation as to why one API uses a JSON object to pass in the credentials but all the others use the header. Seems like a simple thing to just unify all of your APIs and less prone to maintenance issues.
Confusion explained. I pass the credentials ... when I'm testing with Postman. Perhaps better to say, the API code checks the header for the credentials for every call other than the login.
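For anyone reproducing the "credentials in the header" calls outside Postman, the header can be built as in the sketch below. It assumes the HTTP Basic scheme, which the thread does not actually confirm, and the helper name `basic_auth_header` is hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an Authorization header using the HTTP Basic scheme (an assumption)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The returned dict can be passed as the headers of any test request. If the API expects a different scheme or a custom header name, only the returned key and prefix would need to change.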
Technically no I don't owe you an explanation but I think your effort warrants it.
Firstly. One big learning curve for me.
remove the line feeds from the post body
this: should be this:
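The workaround described here (sending the body without line feeds) can be sketched in Python by parsing the JSON and writing it back out on a single line. This only illustrates the workaround: the thread does not establish that line feeds are the root cause, and the helper name `to_single_line` is made up for the example. The sample body is the login body quoted earlier in the thread.

```python
import json


def to_single_line(body_text: str) -> str:
    """Parse a JSON body and re-serialise it on one line with no line feeds."""
    return json.dumps(json.loads(body_text), separators=(",", ":"))


pretty = """{
"username": "markb",
"password": "pw"
}"""

compact = to_single_line(pretty)
print(compact)  # {"username":"markb","password":"pw"}
```

Because the text is parsed and re-serialised, the content of the body is unchanged. Only the whitespace between tokens is dropped.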
Might I suggest you keep an eye on any responses to my post
revelation.com/o4wtrs/oecgi3.exe/O4W_RUN_FORM?INQID=WORKS_READ&KEY=D3EB4E62253992A2A96BD91A7#/section/breadcrumb/UPDATETABLE/Display
Yes you are right, but this was consistent in getting any JSON through.
I would like to call it a workaround, hoping on a response in my rti forum post!!