Reliability issue with Web interface using OECGI talking to 9.4 OI OESocketServer.jar (via Service)
Has anyone experienced issues recently with the above setup?
Basically, we have had a couple of clients that have been running the above setup in production without issue for quite a while. All of a sudden, over the last few weeks or so, communication has started hanging regularly. However, a simple restart of the wrapper service for oesocketserver restores access immediately. I should note that while the hang is occurring the wrapper service is still listed as running, not stopped.
The server setup looks something like this:
- Win Server 2012 R2 with IIS 8 (their IT service company appears to be on the ball when it comes to patch maintenance). We also have another client on 2016 Server that is also experiencing some recent flakiness that may, or may not be, ultimately the same issue.
- oecgi3 (was a renamed oecgi 4.0.0.1). I have just changed this to oecgi 4.0.0.3, with no apparent stability improvement. (We require oecgi3 as it is hardcoded LOTS in some deprecated legacy code)
- OI 9.4 (I didn’t run a full update on OI to 9.4.4 but I tried with the latest oesocketserver.jar from patch 5.1)
- Was running Oracle JRE 8.211. Upgraded to 8.241, and then replaced with Adopt OpenJDK. Neither change made it any more stable.
- Wrapper. Nothing obvious in the logs (that hasn’t been there for four years).
STATUS | wrapper | 2020/03/09 11:14:10 | --> Wrapper Started as Service
STATUS | wrapper | 2020/03/09 11:14:11 | Launching a JVM...
INFO | jvm 1 | 2020/03/09 11:14:11 | Wrapper (Version 3.1.2) http://wrapper.tanukisoftware.org
INFO | jvm 1 | 2020/03/09 11:14:11 |
INFO | jvm 1 | 2020/03/09 11:14:11 | WARNING - System.in can not be used when the JVM is being controlled by the Java Service Wrapper. Calls will block indefinitely.
INFO | jvm 1 | 2020/03/09 11:14:11 | Version: 3.0.0.411 - Licensed for use to CN=Revelation Software
INFO | jvm 1 | 2020/03/09 11:14:11 | Started at 2020-03-09 11:14:11
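For anyone wanting to line up JVM launches against the times users reported hangs, a minimal Python sketch for pulling launch timestamps out of a wrapper log follows. It assumes only the pipe-delimited layout visible in the lines above; the function names are made up for illustration and are not part of the wrapper or of OI.

```python
from datetime import datetime

def parse_wrapper_line(line):
    """Split one 'LEVEL | source | timestamp | message' line into a dict, or None."""
    parts = [p.strip() for p in line.split("|", 3)]
    if len(parts) != 4:
        return None
    level, source, stamp, message = parts
    try:
        when = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
    except ValueError:
        # Not a timestamp in the layout shown above, so skip the line.
        return None
    return {"level": level, "source": source, "time": when, "message": message}

def jvm_launches(lines):
    """Return the timestamps at which the wrapper reported launching a JVM."""
    launches = []
    for line in lines:
        entry = parse_wrapper_line(line)
        if entry and entry["source"] == "wrapper" and "Launching a JVM" in entry["message"]:
            launches.append(entry["time"])
    return launches

# The first three log lines from the post above.
sample = [
    "STATUS | wrapper | 2020/03/09 11:14:10 | --> Wrapper Started as Service",
    "STATUS | wrapper | 2020/03/09 11:14:11 | Launching a JVM...",
    "INFO | jvm 1 | 2020/03/09 11:14:11 | Wrapper (Version 3.1.2) http://wrapper.tanukisoftware.org",
]
launch_times = jvm_launches(sample)
```

In practice the list of lines would come from reading the wrapper log file itself, wherever it lives on the server.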
Again, I feel I should note that it was working fine on the original configuration for ages and then just started to be flaky.
I know these are pretty broad brushstrokes at the moment but I am just putting it out there in case a lightbulb goes off for somebody who may have experienced a similar issue…
Comments
My feel is OEngineServer issue rather than OECGI but I am certainly not across the ins and outs of how the whole structure communicates. I am effectively just restarting the DB 'listener' (for want of a better term) while doing nothing to the 'caller' (OECGI) so it would appear that this issue is past the point of OECGI.
As this has been happening on a production site I am getting less and less time to gather intelligence since they are getting more and more agitated with the downtime. In those early days I stole a few minutes to collect information while the issue was occurring. Now I need to restart that service pretty quickly to restore law and order!
I probably should have posted here for ideas a week or 2 ago :(
CONTENT_LENGTH = 0
CONTENT_TYPE =
GATEWAY_INTERFACE = CGI/1.1
HTTPS = off
HTTP_ACCEPT = text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
HTTP_COOKIE =
HTTP_FROM =
HTTP_REFERER =
HTTP_USER_AGENT = Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4077.0 Safari/537.36
PATH_INFO = /inet_trace
PATH_TRANSLATED = C:\inetpub\wwwroot\$PATH\inet_trace
QUERY_STRING =
REMOTE_ADDR = xxx.xxx.xxx.xxx
REMOTE_HOST = xxx.xxx.xxx.xxx
REMOTE_IDENT =
REMOTE_USER =
REQUEST_METHOD = GET
SCRIPT_NAME = /cgi-bin/oecgi3.exe
SERVER_NAME = yyy.yyy.yyy.yyy
SERVER_PORT = zzzz
SERVER_PROTOCOL = HTTP/1.1
SERVER_SOFTWARE = Microsoft-IIS/8.5
SERVER_URL =
SERVER_SERIAL = $SERIAL
RegistryKey = SOFTWARE\RevSoft\OECGI3
EngineName =
ServerURL = localhost
ServerPort = $PORT
ApplicationName = $APP_NAME
UserName = $USER_NAME
StartupFlags = 65
ShutdownFlags = 1
FileMode = 1
FilePath =
The fact that INET_TRACE fails suggests to me that the engine isn't getting called. Have you inspected the IIS logs to see what they reveal?
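If it helps with the IIS log check, here is a rough Python sketch for pulling out just the OECGI requests. It assumes the site logs in the W3C extended format (space-separated values described by a `#Fields:` header line) and that the `cs-uri-stem`, `sc-status` and `time-taken` fields are enabled; the function names are illustrative only.

```python
def parse_w3c_log(lines):
    """Yield one dict per request, keyed by the most recent '#Fields:' header."""
    fields = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The header names the columns for the entries that follow it.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives (#Software, #Version, #Date)
        values = line.split()
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))

def oecgi_requests(lines, script="oecgi3.exe"):
    """Return (date, time, status, time-taken) for requests whose URI stem names the script."""
    hits = []
    for entry in parse_w3c_log(lines):
        if script.lower() in entry.get("cs-uri-stem", "").lower():
            hits.append((
                entry.get("date"),
                entry.get("time"),
                entry.get("sc-status"),
                entry.get("time-taken"),
            ))
    return hits
```

Feeding it the lines of the log file covering an outage should show whether requests to oecgi3.exe were reaching IIS at all during the hang, and with what status and duration. Fields that are not enabled in the log come back as None.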
Are you running the OEngineServer in debug mode or as a service? If as a service, consider switching to debug mode so you can watch the console and the engines when the system is unresponsive. It will be telling to see if either or both get activity.
Running the OEngineServer in debug mode once I had an active report of that root issue eventually led to the solution as I was then able to trace through the entire process. It was not quite as direct as it should have been due to the nature of the specific setup but ultimately it turned out to be some 'hidden' corrupt indexes.
As an aside, I did some investigation into the Debug Intercept mode you mentioned, Don. I will keep that in the back of my mind... aka 'potential toolbox'!
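Another quick check during a hang, before the service gets restarted, is whether the OEngineServer port still accepts TCP connections at all. The Python sketch below only opens and closes a socket; it does not speak the OEngineServer protocol, and the host and port must be whatever OECGI is configured with (ServerURL and ServerPort in the trace above). The function name is made up.

```python
import socket

def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A False result points at the listener itself being down. A True result is weaker evidence: a hung server process can still have connections accepted on its behalf by the OS, so a successful connect only shows the port is open, not that requests are being serviced.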