Orphaned connections

Morning,

Is there a reliable way to prevent or manage orphaned connections?
I checked a syncserver today and there were 53 connections.
I stopped it and re-started it and there were 31 connections. This is a more accurate number.

My concern is again the rising error count, assuming the errors are caused by the orphaned connections.
I didn't have any reports of people crashing out of OI, let alone 22 of them, so that is not likely to be the cause in this case. The ever-increasing error count is a concern because once it reaches a certain number the app halts and people can't log in at all: they are either connecting or checking the server valid property, and it doesn't want to play any more.

It's now Monday morning. The sync server was checked on Friday because people couldn't log in, and it had 64 connections. The max number for this organisation should be somewhere in the 30s. So it was re-started on Friday, and over the course of the weekend and Monday morning we've picked up another 22 orphaned connections.

This is the only site where we've implemented this so far, and it looks like we might have to turn it off just so users can continue without interruption and without someone constantly monitoring the server.
That would be disappointing because it provides potential for some really cool features that will have to be shelved in the interest of uptime.

Would appreciate any insight, suggestions, or experiences with successful implementations.

Thanks

Comments

  • Forgot to add, using version 2.1.1
  • I sent you an email with a link to a new build. Give it a test, and if it works, I'll make an official release.
  • Thanks Kevin.

    I've put it in. I'll let you know.
  • Morning,

    We are still having issues with this, though it is difficult to tell exactly what the problem is.
    As in my earlier post, the syncserver stopped working/responding with 64 connections. The number of connections may be coincidental but I feel you will have a better idea on that. Is there a limit on the number of workable connections?
    Orphaned connections may be a problem, but not to the extent that we thought. When this was failing with 64 connections, and knowing that the max users should be somewhere in the 30s, I "assumed" there was a consistent problem with orphans. I have just been informed that some users may have two or even three copies of the application running at once, so in reality sixty-odd connections is a definite possibility.

    So now I have three questions to try and help troubleshoot the problems we're experiencing.
    1. Is there a limitation on the number of workable connections?
    2. What, if any, issues may arise with multiple connections to the same sync server from the same workstation? 1 user. 3 applications therefore 3 DC controls all doing the same thing.
    3. Are there any known issues with the syncserver running on a virtual server?
  • App hung on this line again.
    IsConnected = Get_Property(ctrlentid, 'OLE.ServerValid')

    And the number of connections was again 64 so I'm guessing this has answered my first question above.
    64 seems to be the magic number where things don't want to play any more.
    Is there a way to get around this?
  • Try version 2.1.3 RC1. This should do a better job dropping orphaned connections. Let me know if you still have any issues and I'll work to resolve them right away.
  • Thanks Kevin.

    Am using that one and the issue continues.
    Whilst I've kept it to this thread, I'm no longer sure if orphaned connections is really the issue.
    It may be part of the problem but it's hard to tell as we've since found out that some users have multiple instances of the application open and therefore multiple connections.

    So the troubleshooting went like this.
    More connections than users and an ever increasing error count meant orphaned connections and an eventual failure.
    The view of the errors was removed so we had only the number of connections to work with. This still exceeded the number of users so we still thought orphaned connections must be causing errors.

    Then we learned about the one user, multiple instances scenario which made the number of connections more accurate.

    Now I'm down to two definitive points based on about the last four failures.
    1. When I re-start the sync server, whilst it always automatically reconnects it never reconnects with as many controls as it originally had connections.
    2. Each time it has failed there have been 64 connections.

    I don't know if point one means there were orphaned connections or if the automatic reconnect just doesn't always find all previous connections.
    As for point two, I initially ignored the 64 as a coincidence but it has been the same number for at least the last four failures moving it from coincidence to consistent.

    Hopefully some of this rambling provides something of value to you.
  • There does turn out to be a 64 connection limit in the winsock API. It was easy to overlook because it only applies to one function. After some work, I managed to increase the limit to 1024. It can go higher, but Windows tends to bog down beyond that many connections anyway. Hopefully this will suffice.

    The reason there seemed to be orphaned connections was because hitting that limit just caused all the connections to get "stuck" for lack of a better term. That's why nothing would connect after that.

    You can download 2.1.3 RC2 for testing on your end.
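
For readers wondering what a limit like this looks like in code, here is a minimal sketch. It assumes the limit in question is Winsock's `FD_SETSIZE`, which defaults to 64 and sizes the socket sets passed to `select()`; the thread does not name the macro or the function, and this is not the control's actual source. `fd_set_capacity` and `can_track_connection` are hypothetical helpers, and the non-Windows branch is only there so the sketch compiles elsewhere.

```c
/* Sketch only: assumes the 64-connection limit is Winsock's FD_SETSIZE. */
#ifdef _WIN32
#define FD_SETSIZE 1024   /* must be defined before winsock2.h is included */
#include <winsock2.h>
#else
#include <sys/select.h>   /* non-Windows fallback; the platform fixes the value */
#endif
#include <assert.h>

/* Hypothetical helper: the FD_SETSIZE value this build was compiled with. */
int fd_set_capacity(void)
{
    return FD_SETSIZE;
}

/* Hypothetical helper: 1 if one more connection still fits, 0 otherwise. */
int can_track_connection(int current_connections)
{
    assert(current_connections >= 0);
    return current_connections < fd_set_capacity();
}
```

Because the value is baked in at compile time, a server built this way could refuse or log a new connection when `can_track_connection` returns 0 instead of silently stalling.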
  • Woohoo!

    Will put it in and let you know if there are any more problems, like when we reach 1024 users.
  • Well Gentlemen,
    pleased to say, we lasted a week and no issues.
    That's the longest period yet and no reason to suspect that there will be any from here on in.

    Well done.

    Thank you.
  • This has raised its ugly head again.
    Due to other issues I had cause to download the latest version of this control at the same client.
    Now the 64 connection limit is back.

    Is it possible the fix for this didn't actually make it into the official release?
  • edited November 2016
    For clarification, I've restored the version we were using before. It is version 2.1.3 RC2.

    It has no hang-ups when reaching 64 connections.
    The version currently available from the products page says "no more thanks" at 64.
    Well actually it doesn't bother to say anything and just leaves us wondering.
  • Are you hitting the limit naturally or is this still due to orphaned connections?
  • This thread started out as orphaned connections because I knew we didn't have that many users. Along the way I found out that whilst there weren't that many users, it was common among them to have multiple instances open making it very possible that it was reaching that number naturally. So I can't confirm definitively if there are orphaned connections or not, just that hitting 64 makes everything stop.

    That said, it was brought to my attention this time because the IT guys here were being made aware rather than myself and their resolution was to stop and restart the syncserver. This was happening each morning as the morning influx of users occurred. The fact that a restart of the syncserver seemed to resolve the daily issue would suggest that perhaps there were orphaned connections hanging around from the day before.

    Since the restore of 2.1.3 RC2 about a week ago, there has been no need to stop and restart.
  • The way you have to increase the connection limit in the code for Windows is unusual. It's not a setting, but a macro that is compiled. The macro is not in my code, it's in Microsoft code. I suspect that when I made the official build, I used a different machine (I bounce between a PC and a laptop) and the old macro was used. I will make an official build when I get a chance, but it might not be until next week since I am onsite at a client's this week. In the meantime, the 2.1.3 RC2 is not functionally different from 2.1.3. The only difference was my sad attempt to rebuild it without the RC2 suffix. In other words, it's a stable build until I can correct this.
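
One way to stop a build from quietly picking up the old value is a compile-time check. The sketch below assumes the macro is Winsock's `FD_SETSIZE` and that the raised value comes from the build machine's headers or compiler settings; neither is confirmed in this thread. `REQUIRED_FD_SETSIZE` and `build_connection_limit` are hypothetical names. On a Windows machine with the stock value of 64, this is meant to fail to compile rather than produce a binary with the lower limit.

```c
/* Sketch only: assumes the limit is FD_SETSIZE and is raised outside this file. */
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/select.h>   /* non-Windows fallback so the sketch compiles elsewhere */
#endif
#include <assert.h>       /* provides static_assert in C11 and later */

/* Hypothetical project constant: the limit every official build must have. */
#define REQUIRED_FD_SETSIZE 1024

/* Stops the build if this machine still has the smaller limit. */
static_assert(FD_SETSIZE >= REQUIRED_FD_SETSIZE,
              "FD_SETSIZE is below the required connection limit");

/* Hypothetical helper: reports the limit this binary was compiled with. */
int build_connection_limit(void)
{
    return FD_SETSIZE;
}
```

Logging `build_connection_limit()` at startup would also make it easy to tell from a running server which limit a given build carries.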
  • No problem. Just caught me out.
    Hopefully the client won't get hit by a virus again anytime soon and I won't get caught out again. :)
  • Mark. I fixed this for real, and gave it a new version number so there would be no confusion. I tested it on my end as well. Sorry for the inconvenience. You can download it from the product page.