oecgi4 timing out

AusMarkB · June 2016

Hi All,

Just posting here because I'm kind of desperate for a quick resolution.

I'm having trouble with oecgi4 timing out and not processing the requests.
We're in the final stages of development/testing of our APIs before we go live and now things just aren't working any more.

I've had inconsistent failures up till now which I've been able to let slide because subsequent requests would be successful. Now though, it is persistent, preventing users from even logging in.

The common theme seems to be only when the request is a "POST" and the "POST" has a body. Doesn't seem to matter what is in the body. "GET" requests seem to work and even the "POST" requests work if you omit the body. (by work, I mean the OECGI processes them).

Changing the timeout config is not the answer. I originally had it set to 15 minutes, and because the requests just seemed to hang I thought the problem was IIS. When I changed the timeout to two minutes, I got a 502 Bad Gateway response indicating that the CGI application had timed out.

I am currently testing with just the login call that has a body containing just a json string of username and password. Previously, calls with far more complicated bodies have worked without issue so it seems just the existence of a body is now causing problems.

Appreciate any thoughts/suggestions.

Mark

DonBakke · June 2016

Mark,

I had just finished reading your post on the Rev board and was about to respond when this came in. So, I'll give preference to our own forum. :)

Timeouts can occur at different links in the HTTP chain. Where exactly is the timeout occurring? For instance, is it the client, the server, or OECGI? You wrote "changing the timeout config is not the answer". Where is that config? Is this IIS?

Have you tried using a persistent connection? This helped me with a specific project. However, I must say that I have not experienced the behavior you are describing as being regular. This does make me think it is an IIS issue.

AusMarkB · June 2016

Hey Don,

I also give preference to yours. Rarely get a response on the general forum and it appears my works subscription has expired so I can't get to the other one.

The timeout is oecgi.
I changed it within the cgi config of IIS.

When the call does timeout, I get this response

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

IIS 10.0 Detailed Error - 502.1 - Bad Gateway

HTTP Error 502.1 - Bad Gateway

The specified CGI application misbehaved by not returning a complete set of HTTP headers. The headers it did return are "".

Most likely causes:

The CGI application handling the request took longer than the configured CGI timeout to process the request.
The CGI application is taking too long to process the request.

Things you can try:

Verify the configured timeout for the CGI application.
1. Open IIS Manager.
2. In the Connections pane, click the site or application where the problem is occurring.
3. In the Features pane, double-click CGI.
4. In the CGI page, under Behavior, verify that a value is set for the Timeout property.
5. Change the timeout value, if necessary, and then in the Actions pane click Apply.
Use DebugDiag to troubleshoot the CGI application and determine why it is taking so long to process the request.

Detailed Error Information:

Module	CgiModule
Notification	ExecuteRequestHandler
Handler	OECGI4
Error Code	0x00000000

Requested URL	http://localhost:80/e94http/oecgi4.exe/inet_trace
Physical Path	D:\pf\edmen94http\oecgi4.exe\inet_trace
Logon Method	Anonymous
Logon User	Anonymous

More Information:

This error occurs when the request is configured to handle by a CGI application and the CGI application takes longer than the configured timeout to handle the request. If the request usually takes longer than the configured timeout to run, increase the timeout value. Otherwise, troubleshoot the CGI application to determine the delay.


DonBakke · June 2016

How long should the request actually take if you didn't get this timeout response?

How long does it take before that error message gets returned? What I am wondering is whether this is a true timeout or an error condition that makes IIS think it is a timeout.

For comparison purposes have you tried OECGI3.exe?

Does it matter what client you are testing with? For instance, I presume this is a REST API you are calling. Therefore, does it make a difference if you are using a browser/JavaScript versus Postman to call the API?

I'm still not convinced this isn't a red herring. IIS may be pointing the finger at OECGI4.exe, but like I said, I have never had this occur other than when I am debugging an API and truly creating a timeout situation.

AusMarkB · June 2016

Thanks for the thoughts.

The request should take seconds at most. The error message is being returned once the cgi timeout value has been reached whether I set that at two minutes or fifteen. I'm happy to entertain that it is some other error message but I have no idea how to determine if that is the case.

I haven't tried oecgi3 in this instance but I think I tried that a few months ago for something else (which may actually have been related to this) but couldn't actually get oecgi3 to run at all so just moved on. I will try again if I don't come up with anything else.

Just using the browser doesn't help because I can't change the verb to a "POST" and add a body. So the same call in the browser will work or at least make its way through to OI because it is a "GET". The same thing works in Postman. I don't know enough about javascript to set up a browser/javascript test.

Incidentally, it came to my attention yesterday because an end user was testing from the mobile app on their phone. They had closed the app, then went to log back in and it just sat there spinning its wheels. There's some hoops for the app to jump through before it gets to our backend but when I test locally through Postman, I'm getting the same issue and my calls are going direct to IIS and oecgi4.
Basically I was hoping the app guys had broken their end but alas the buck stops here.

DonBakke · June 2016

I see. So this is a mobile app rather than a web app. What is being used to make the HTTP requests? Is this a native app wrapper (PhoneGap/Cordova) around a web stack or is it a pure mobile platform SDK (Objective-C/Java) being used?

Did you test any of the POST APIs that came pre-installed with the SRP HTTP Framework? Do they timeout as well?

Since you can duplicate this from different clients (especially Postman) then I do not think the client is at issue.

You noted originally that the POST requests do get processed. Have you confirmed that the BASIC+ code is getting all the way back to the final return in HTTP_MCP? Have you enabled logging so you can inspect the response getting back to OECGI?

Finally, are you absolutely sure that you aren't doing something in the way you handle your POST requests that would be causing problems on the return trip? I might suggest you hardcode a simple response rather than let the API create the response to a POST request and see if you still get a timeout. If you do not get a timeout, then that tells me the problem is in the response. If you do get a timeout, then I am inclined to believe that something is wrong with the way POST methods are being processed. You might want to test the PUT method as well just to see if that produces similar results.
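The comparison suggested above (GET, POST with and without a body, PUT with a body) can also be scripted so that every case is sent the same way and timed. This is only a minimal sketch in Python using the standard library. The URL is the one shown in the IIS error page earlier in the thread and should be swapped for the real API route; the function names `build_probes` and `run_probe` are made up for illustration.

```python
import json
import time
import urllib.error
import urllib.request

# Assumption: URL copied from the IIS error page in this thread; replace with the real route.
URL = "http://localhost:80/e94http/oecgi4.exe/inet_trace"


def build_probes(url, payload):
    """Return one Request per case worth comparing."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return [
        urllib.request.Request(url, method="GET"),
        urllib.request.Request(url, method="POST"),  # POST with no body
        urllib.request.Request(url, data=body, headers=headers, method="POST"),
        urllib.request.Request(url, data=body, headers=headers, method="PUT"),
    ]


def run_probe(request, timeout=10.0):
    """Send one request and return (outcome, seconds elapsed)."""
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            outcome = str(response.status)
    except urllib.error.HTTPError as err:
        outcome = str(err.code)  # e.g. 401 or 502 still counts as an answer
    except (urllib.error.URLError, OSError) as err:
        outcome = "no response: " + str(err)
    return outcome, time.monotonic() - started
```

Calling `run_probe(p)` for each `p` in `build_probes(URL, {"username": "markb", "password": "pw"})` and printing `p.get_method()` next to the outcome and elapsed time would show whether only the requests that carry a body stall. A client-side timeout shorter than the IIS CGI timeout keeps each run quick.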

AusMarkB · June 2016

Don, thanks again for your input and suggestions.

It is both a mobile and web app. Depends on the user and the feature. The web is like an extended version.
I'm doing the OI APIs and a third party mobile app developer is doing the front end so I'm not privy to their toolset.

Did I test any of the pre-installed POST APIs? No, not recently at least but didn't really need to. I was successfully posting using my code as recently as Friday. In fact even yesterday I had intermittent success. When it works, it works well and is all very exciting.

When I said the POST requests did get processed, they then do exactly as I expect. All the code returns generally with an error because there's no body to post or an unauthorised error. Either way it returns the response I've coded for. The issue arises now when I add a body to the request. I have the oesocketserver running in debug mode so I can see when stuff is happening but when there's a body in the POST the socketserver doesn't do a thing, no engine is started, so it's not the return trip because it doesn't even get to the beginning of http_mcp let alone the end of it.

As I finished typing the above I glanced back over to Postman and noticed that the last POST request I sent actually worked. So I sent it again. And it worked again. This is the one I had working on Friday. I only brought it back up because I was documenting it.

So I retried the login request which is a very simple one. It times out without so much as a flicker on the socketserver. Aaaarrrgghhhh!!!

So I took the slightly more complicated body from the working request and placed it in the non working request and whammo, the non working one returned immediately with a correct "Unauthorized" response.

So this body works

{
"whs":
{
"incidentDt": "2016-06-17 18:42:22",
"site": "Johhnos place",
"employeeName": "HavaCookie",
"details": "These are some new facts. This is a pretty busy place",
"incidentType": "Incident"
}
}

yet this one doesn't (anymore at least)

{
"username": "markb",
"password": "pw"
}

Somebody please share with me what I am not seeing???

DonBakke · June 2016

Mark,

I see nothing unusual with the body that would cause any problems. Of course, I don't know everything about your request, like the header names and values, but I would seriously doubt that your body should be relevant. It's just string data.

You reported something that I misunderstood before. I thought you had originally indicated that these timed out requests are actually being processed in OpenEngine. It seems clear to me from the above comments that these requests are simply not getting through at all. Am I correct?

Have you tried running the socket server in debug mode to even see if the requests get that far?

I am even more convinced this is not a timeout issue. Timeout is a symptom of the problem, not the cause. Rather, IIS is simply unable to talk to OECGI at all.

If you have some degree of control over the web server then I would install a lightweight server like Abyss just to see if it handles your requests. This would only be for comparison purposes...unless, of course, you are able to switch over permanently. But at least you can get some peace of mind. I am beginning to suspect this is an IIS problem.

AusMarkB · June 2016

Hey Don,

Sorry for not being clear enough previously though I suspect it may have also had something to do with you really needing to be sleeping rather than trying to solve my problems. :)

That's correct. When it doesn't work, it doesn't get through at all. Timeout likely is just a symptom but the symptoms are about all I can report.

I have the oesocketserver running in debug mode so I can see when stuff is happening but when there's a body in the POST the socketserver doesn't do a thing, no engine is started, so it's not the return trip because it doesn't even get to the beginning of http_mcp let alone the end of it.

Yes I run the socketserver in debug mode. That's why I'm focussing on an issue with the oecgi. The socketserver doesn't do anything and IIS returns its own error when it determines that the cgi process has timed out.

I will see what I can do with installing Abyss

AusMarkB · June 2016

Just tried the login request on the server again (as opposed to my local machine) and this time got this response

1Login failed-3101

Does that turn any lightbulbs on for anyone?

DonBakke · June 2016

Mark,

I've only seen that error when I try to launch CTO/ARev32 and there are not enough licenses. However, it simply means that a connection to the engine server could not be made.

AusMarkB · June 2016

Actually looks like that one was a true red herring but I'm hesitant to say more till I see more.

DonBakke · July 2016

Mark,

I've been monitoring the discussion between you and Bryan on the forum. I'll keep my own comments isolated to this forum to avoid confusion and so as to not become a distraction to Bryan's analysis.

That said, your latest post has me confused. I think I would like clarification on the following items:

1. Made sure every user has full control over the entire directory that oecgi4 is in.

By this do you mean network users as opposed to app users? I hope so, as the latter would make no sense at all.

2. Retyped the body of the request (as opposed to using the saved one or copy/pasting the body)

In what way was the body saved? Are you referring to Postman's feature of preserving requests?

One possible answer to why retyping makes a difference is perhaps your original body includes a non-printable character that is choking the request. Retyping allowed you to eliminate that problematic character.
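If the saved body can be exported or pasted into a script, the non-printable character theory can be checked directly. The following is a rough sketch in Python, not a definitive test: the function name `find_suspect_characters` is hypothetical, and which characters count as "suspect" is an assumption (control characters, invisible format characters, and spaces other than the plain space).

```python
import unicodedata


def find_suspect_characters(body):
    """Return (index, code point, Unicode name) for characters unlikely to be typed on purpose."""
    suspects = []
    for index, char in enumerate(body):
        if char in "\r\n\t":
            continue  # ordinary whitespace in a pretty-printed JSON body
        category = unicodedata.category(char)
        # Cc = control characters, Cf = invisible format characters (zero-width space, BOM),
        # Zs other than " " = no-break and other special spaces.
        if category in ("Cc", "Cf") or (category == "Zs" and char != " "):
            name = unicodedata.name(char, "UNNAMED")
            suspects.append((index, "U+%04X" % ord(char), name))
    return suspects
```

An empty list for the retyped body and a non-empty list for the saved one would support the theory; an empty list for both would suggest looking elsewhere.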

AusMarkB · July 2016

Hey Don,

Thanks for monitoring and clarifying.
1. Yes network users though personally neither makes any sense to me. If a network user doesn't have full control how can they post some requests and not others?
2. Yes Postman's feature of preserving requests. A non printable character is possible but where did it suddenly come from and how come an end user has the same issue on their mobile device (causing me to investigate in the first place).

My head is spinning with the "doesn't make any sense" symptoms.

DonBakke · July 2016

Mark,

1. Yes network users though personally neither makes any sense to me. If a network user doesn't have full control how can they post some requests and not others?

No, it doesn't make sense which is why I really don't think this is relevant. I just wanted to confirm what you meant. I don't even think network rights are utilized, at least not directly with the OECGI. IIS is the application that invokes the OECGI.

2. Yes Postman's feature of preserving requests. A non printable character is possible but where did it suddenly come from...

I know this is a rhetorical question, but we might not ever know the answer to this question so I suggest putting it aside.

...and how come an end user has the same issue on their mobile device (causing me to investigate in the first place).

Indeed, that part is a mystery that I don't have a plausible explanation for yet. Does this mean the mobile app users are still having trouble or have you not tested this since you retyped the Postman request body?

Of course, you actually have no idea what your web app developer is passing in, do you?

My head is spinning with the "doesn't make any sense" symptoms.

If you don't mind me saying so, I think you need a fresh pair of eyes looking at this directly. But, here are a few more comments and questions for you to review:

1. Why are you using the body for credentials in the first place rather than the Authorization request header? I know this doesn't address the issue, but I had been wanting to ask this from the beginning.

2. One of the reasons I dislike IIS is because it doesn't handle the Authorization request header as well as I would like. Try removing HTTP_AUTHORIZATION from your registry, restart the engine server, and see if this makes any difference.

3. I would assume IIS has the ability to log requests. Have you tried to go that route and see what is coming in when you have the problematic requests?

4. You haven't said anything about Abyss. Have you not tried to test with that yet?

AusMarkB · July 2016

True I don't know for sure what the web app developer is passing in but the symptoms were the same.

Fresh pair of eyes? Yep. Have had some others looking today and that's how we ended up getting back to working but with me doubting the resolution.

1. Body for credentials? Only for the login. This was for consistency with a previous approach already developed which talked to a ruby or some other based app. It then subsequently passed credentials using cookies. I retained the login concept but pass the credentials in the authorisation header as you suggest for all subsequent calls.
2. I will keep this one in back of mind as a step to try. The login call I'm using has no header anyway.
3. Yep, I've tried IIS logs but they don't tell me much. They basically just say what the request was (the url) and little more than that. I've tried using Debug Diagnostics tool but it is the opposite and litters me with verbiage with the disclaimer that it couldn't access something and so the results are probably inaccurate. So I haven't been able to decipher that as yet. The fresh eyes I've been calling on are more for people who can point me in the right direction to successfully debugging IIS.
4. Aaaahhh Abyss. Yes I downloaded it. I installed it. I struggled to configure it so I didn't really get to test with it. In conjunction with trying to get that to work and trying to figure out these symptoms, my machine started crashing and rebooting with increasing regularity. Now that may have had nothing to do with Abyss but the timing was such that I was ready to throw my machine out the window. Instead I uninstalled Abyss.
Again, possibly coincidence but on Friday Windows crashed five or six times. Today, not once.

DonBakke · July 2016

Mark,

A long shot idea just came to me. Could this be due to your rewrite rules? Maybe there is a problem there that is causing IIS not to resolve the URL properly and thus the OECGI is never really invoked. It could be something as simple as the URL ending with a "/" versus not ending with a "/".

DonBakke · July 2016

Mark,

1. I retained the login concept but pass the credentials in the authorisation header as you suggest for all subsequent calls.

How are you doing that? I mean, this would have to be done via JavaScript and I thought you are relying on a third-party for this.

2. I will keep this one in back of mind as a step to try. The login call I'm using has no header anyway.

Yes, but the presence of the header - even if the value is empty - might cause interference. However, if your above comment is correct, you are using the Authorization header for subsequent calls and this is not causing you any problems. Therefore I think this is likely not the issue.

3. Yep, I've tried IIS logs but they don't tell me much.

Too bad. I thought perhaps just being able to see the request body would have been helpful.
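One way to see the request body when the web server logs do not record it is to point the client at a throwaway server that echoes back exactly what it received. The sketch below uses Python's built-in `http.server`; it is a stand-in for inspection only and says nothing about how IIS or OECGI treat the same bytes. The names `BodyEchoHandler` and `make_echo_server` are invented for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class BodyEchoHandler(BaseHTTPRequestHandler):
    """Reply to POST/PUT with a description of exactly what was received."""

    def _echo(self):
        length = int(self.headers.get("Content-Length") or 0)
        raw = self.rfile.read(length) if length else b""
        report = {
            "method": self.command,
            "path": self.path,
            "content_length": length,
            "headers": dict(self.headers.items()),
            # repr() shows line feeds and non-printable bytes as visible escapes
            "body_repr": repr(raw),
        }
        payload = json.dumps(report, indent=2).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    do_POST = _echo
    do_PUT = _echo

    def log_message(self, format, *args):
        pass  # keep the console quiet; the report is in the response


def make_echo_server(port=0):
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), BodyEchoHandler)
```

Running `make_echo_server(8099).serve_forever()` (the port number is arbitrary) and sending the failing request from Postman to `http://127.0.0.1:8099/` would return the headers and a byte-for-byte view of the body, which can then be compared between a request that works and one that does not.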

4. Aaaahhh Abyss. Yes I downloaded it. I installed it. I struggled to configure it so I didn't really get to test with it.

Let me know if you need any guidance. I've been thinking of posting a page on our wiki for Abyss, but no one really seems to be using it since IIS or Apache are either required or preferred.

In conjunction with trying to get that to work and trying to figure out these symptoms, my machine started crashing and rebooting with increasing regularity. Now that may have had nothing to do with Abyss but the timing was such that I was ready to throw my machine out the window. Instead I uninstalled Abyss.
Again, possibly coincidence but on Friday Windows crashed five or six times. Today, not once.

I have to assume coincidence. We have Abyss running on several servers and I've tested it on various workstations. I've never seen that happen. What version of Windows?

AusMarkB · July 2016

I too suspected the rewrite rule might be playing havoc till I read the entire error message.
It includes the resolved address and it is correct.

1. Not sure what you're getting at here? I'm writing the API's so am telling the third party what's required. Postman lets me simulate both scenarios
2. Not likely but not necessarily unrelated. My stumbling block a couple of months ago was exactly that and I thought I resolved it by using 'X-Authorization' instead. However that seems no longer necessary so that's another example of now it works, now it doesn't, now it does.
3. Me too but the logs I was getting didn't contain the body, only the call and the response code. There may be some other boxes I can tick to get more value but I haven't seen them yet.
4. Thanks. If I head down that road again, I'll be sure to ask. I thought I had configured it correctly but it didn't play ball. Happy to take some coaching next time.

Windows 10. It crashes from time to time and it's annoying but from time to time I mean maybe once a week. On this occasion I had just applied a config change to Abyss when it crashed with an error that I hadn't seen before. "impersonating_worker_thread". I put it down to coincidence as well but over the next couple of days the crashes became more frequent and with that same error. Like I said, once it had crashed five times in the one day I was over it so I removed the only new thing I had added.

DonBakke · July 2016

It includes the resolved address and it is correct.

So you are saying that the full URL (which contains the reference to OECGI4.exe) looks correct? Per chance had you done any testing with the fully resolved URL within Postman just to verify that rewriting was not the problem?

1. Not sure what you're getting at here? I'm writing the API's so am telling the third party what's required. Postman lets me simulate both scenarios

Perhaps I simply misunderstood your original explanation. This was what I got from you:

* The app uses the body to pass in a JSON object containing the username and password for the initial login.
* This was done this way to preserve the pre-existing design that originally was managed by Ruby.
* However, all subsequent requests into the API utilize the Authorization header.

Thus, the first API call does not use the Authorization header but all subsequent API calls do use the Authorization header. Is that correct? Therefore, your app developers are caching the credentials that were originally passed in as a JSON object and then pass them in through the Authorization header going forward. Yes? When you wrote "I...pass the credentials in the authorisation header as you suggest for all subsequent calls", it came across as something you yourself were doing rather than the app developers. That's why I was confused.

You don't owe me an explanation, but I am still unconvinced about the explanation as to why one API uses a JSON object to pass in the credentials but all the others use the header. Seems like a simple thing to just unify all of your APIs and less prone to maintenance issues.
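For readers comparing the two credential styles discussed here, this is roughly what each looks like on the client side. It is a sketch only. The JSON shape matches the login body posted earlier in the thread, but the header scheme is an assumption: the thread never says which Authorization scheme the API expects, so HTTP Basic is used purely as an example, and both function names are hypothetical.

```python
import base64
import json


def login_body(username, password):
    """Credentials as a JSON request body, the shape used by the login call in this thread."""
    return json.dumps({"username": username, "password": password}).encode("utf-8")


def basic_authorization_header(username, password):
    """Credentials as an Authorization header.

    Assumption: HTTP Basic (base64 of "user:password"); the real API may use another scheme.
    """
    pair = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(pair).decode("ascii")
    return {"Authorization": "Basic " + token}
```

With the header form, a login request needs no body at all, which is one reason unifying the APIs on the header might sidestep body-related problems.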

AusMarkB · July 2016

Yes the full URL containing the reference to oecgi4.exe is spot on. No I hadn't.

Confusion explained. I pass the credentials .... when I'm testing with Postman. Perhaps better to say, the API code checks the header for the credentials for every call other than the login.

Technically no I don't owe you an explanation but I think your effort warrants it.
Firstly. One big learning curve for me.

When I started on this, I didn't have a clue what I was doing so I went with what existed already.
What existed already still exists. Different purpose. Same client.
So before I had written any API, the login and one other was at least partially documented. I passed the login documentation onto the third party just so they had something even if I didn't understand what it was.
Avoiding re-inventing the wheel. Keep the boss happy.
Learnt about headers and started creating the new wheels encompassing authorisation header
Now don't want two different login methods depending on which API you are calling. (Other prototypes were being developed using the previous API's)

Don't know if that sheds any further light on the subject but it's basically a process of evolution and I just need to decide when to forge ahead and when to leave well enough alone. When the forging ahead proves itself then I may well revisit the old and better align with the new.

BarryStevens · August 2016

Mark,

remove the line feeds from the post body

this:

{
"username": "markb",
"password": "pw"
}

should be this:

{ "username": "markb", "password": "pw" }
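If the bodies are generated or stored as pretty-printed JSON, the line feeds can be stripped programmatically rather than by hand. A minimal sketch in Python, assuming the body is valid JSON; the function name `strip_line_feeds` is made up for illustration. Whitespace between JSON tokens carries no meaning, so the data is unchanged.

```python
import json


def strip_line_feeds(body):
    """Re-serialise a JSON body on a single line without changing its data."""
    # json.dumps without an indent argument never emits line breaks between tokens.
    return json.dumps(json.loads(body), ensure_ascii=False)
```

For the login body above this yields `{"username": "markb", "password": "pw"}`, which differs from the hand-edited version only in the spaces just inside the braces.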

DonBakke · August 2016

Barry - Do you think that removing line feeds alone is the issue? Haven't you had success with other JSON POSTs where the line feeds are present? I don't doubt that this allowed your API to work, but I am still convinced this is an IIS problem. I believe both you and Mark are using IIS whereas I use something else and never have this problem.

BarryStevens · August 2016

Mark,

Might I suggest you keep an eye on any responses to my post

revelation.com/o4wtrs/oecgi3.exe/O4W_RUN_FORM?INQID=WORKS_READ&KEY=D3EB4E62253992A2A96BD91A7#/section/breadcrumb/UPDATETABLE/Display

BarryStevens · August 2016

Don,

Yes you are right, but removing the line feeds was the one thing that consistently got any JSON through.

I would like to call it a workaround, and I'm hoping for a response to my RTI forum post!!