Debugging a crash using WinDbg
I've finally managed to replicate an OI crash on demand with a very specific sequence of steps! (Usually these crashes appear to be quite random.)
- The crash occurs at the same point with SRPeditTable.ocx v3.0.7, 3.0.8, or SRPcontrols.ocx v4.0.0 as the Faulting Module. This doesn't occur with SRPeditTable v3.0.6.
- The crash occurs on closing an MDI child window. The OEprofile log shows this occurring during a Utility() function called from End_Window().
- I've replicated it on a Windows 10 and a Windows 8.1 machine. On another Windows 7 machine, a crash can still occur, but not until closing the MDI frame, in an 'unknown' faulting module, and not consistently.
- All OLE events are qualified explicitly (i.e. no ALL_OLES) and asynchronous.
So I've installed WinDbg, as Jared suggested the other day on the Revsoft forum, and I'm trying to understand how to use it. I'm posting here as it seems relevant and this is getting quite technical ;).
Here's a portion of the List Loaded Module report (lm command):
start end module name
00000000`54b10000 00000000`550c0000 mscorwks
00000000`550c0000 00000000`561ec000 srpcontrols
00000000`59d20000 00000000`59d9f000 vbscript
Does it matter that the start of the address space for srpcontrols is the same as the end of the previous module's address space? This seems rare in the complete module list. Is it suspicious, or too close for comfort?
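The adjacency question can be checked mechanically across the full module list rather than by eye. The following is a minimal sketch, not part of the original post: it assumes the lm rows have been pasted into a string, and that the end column is the first address past the image (start plus image size), in which case a module that starts exactly where the previous one ends is touching it but not overlapping it. The function names parse_lm, adjacent_pairs and overlapping_pairs are made up for illustration.

```python
import re

# Rows copied from the lm listing in this post; paste the full report here.
LM_OUTPUT = """\
start end module name
00000000`54b10000 00000000`550c0000 mscorwks
00000000`550c0000 00000000`561ec000 srpcontrols
00000000`59d20000 00000000`59d9f000 vbscript
"""

# Two hex addresses (with WinDbg's backtick separator) followed by a name.
ROW = re.compile(r"^([0-9a-f`]+)\s+([0-9a-f`]+)\s+(\S+)", re.IGNORECASE)


def parse_lm(text):
    """Return (start, end, name) tuples sorted by start address."""
    modules = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # skips the header row and blank lines
        start, end, name = match.groups()
        modules.append(
            (int(start.replace("`", ""), 16), int(end.replace("`", ""), 16), name)
        )
    return sorted(modules)


def adjacent_pairs(modules):
    """Pairs where the next module starts exactly at the previous end."""
    return [(a[2], b[2]) for a, b in zip(modules, modules[1:]) if a[1] == b[0]]


def overlapping_pairs(modules):
    """Pairs where the next module starts before the previous one ends."""
    return [(a[2], b[2]) for a, b in zip(modules, modules[1:]) if b[0] < a[1]]


if __name__ == "__main__":
    mods = parse_lm(LM_OUTPUT)
    print("adjacent:", adjacent_pairs(mods))
    print("overlapping:", overlapping_pairs(mods))
```

Under that assumption, only an entry in the overlapping list would point to two images claiming the same addresses; an adjacent pair on its own is probably just the loader packing modules back to back.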
And here is the Stack Backtrace (k command). There's no mention of SRPcontrols at all! But ntdll seems to be recursing until it finally crashes.
# Child-SP RetAddr Call Site
00 00000000`000966e0 00000000`6399b9b9 wow64!Wow64NotifyDebugger+0x1d
01 00000000`00096710 00000000`6399c873 wow64!HandleRaiseException+0x13d
02 00000000`00096c00 00000000`6399cb54 wow64!Wow64NtRaiseException+0x9b
03 00000000`00096c90 00000000`63996245 wow64!whNtRaiseException+0x14
04 00000000`00096cc0 00000000`63a61c87 wow64!Wow64SystemServiceEx+0x155
05 00000000`00097580 00000000`6399bdb2 wow64cpu!ServiceNoTurbo+0xb
06 00000000`00097630 00000000`639a8c36 wow64!RunCpuSimulation+0x22
07 00000000`00097660 00000000`639e2e0d wow64!Wow64KiUserCallbackDispatcher+0x426
08 00000000`00097a70 00007ff8`eff98b94 wow64win!whcbfnNCDESTROY+0xbd
09 00000000`00098450 00000000`63a621bc ntdll!KiUserCallbackDispatcherContinue
0a 00000000`000984d8 00000000`63a6217f wow64cpu!CpupSyscallStub+0xc
0b 00000000`000984e0 00000000`6399bdb2 wow64cpu!Thunk0Arg+0x5
0c 00000000`00098590 00000000`639a8c36 wow64!RunCpuSimulation+0x22
0d 00000000`000985c0 00000000`639e440a wow64!Wow64KiUserCallbackDispatcher+0x426
0e 00000000`000989d0 00007ff8`eff98b94 wow64win!whcbfnDWORD+0x21a
0f 00000000`000993d0 00000000`63a621bc ntdll!KiUserCallbackDispatcherContinue
10 00000000`00099458 00000000`63a6217f wow64cpu!CpupSyscallStub+0xc
11 00000000`00099460 00000000`6399bdb2 wow64cpu!Thunk0Arg+0x5
12 00000000`00099510 00000000`639a8c36 wow64!RunCpuSimulation+0x22
13 00000000`00099540 00000000`639e440a wow64!Wow64KiUserCallbackDispatcher+0x426
14 00000000`00099950 00007ff8`eff98b94 wow64win!whcbfnDWORD+0x21a
15 00000000`0009a350 00000000`63a621bc ntdll!KiUserCallbackDispatcherContinue
16 00000000`0009a3d8 00000000`63a6217f wow64cpu!CpupSyscallStub+0xc
17 00000000`0009a3e0 00000000`6399bdb2 wow64cpu!Thunk0Arg+0x5
18 00000000`0009a490 00000000`639a8c36 wow64!RunCpuSimulation+0x22
19 00000000`0009a4c0 00000000`639e440a wow64!Wow64KiUserCallbackDispatcher+0x426
1a 00000000`0009a8d0 00007ff8`eff98b94 wow64win!whcbfnDWORD+0x21a
1b 00000000`0009b2d0 00000000`639f3844 ntdll!KiUserCallbackDispatcherContinue
1c 00000000`0009b358 00000000`639e5aa2 wow64win!NtUserMessageCall+0x14
1d 00000000`0009b360 00000000`639e5ef8 wow64win!whNT32NtUserMessageCallCB+0x32
1e 00000000`0009b3b0 00000000`63996245 wow64win!whNtUserMessageCall+0x128
1f 00000000`0009b470 00000000`63a61c87 wow64!Wow64SystemServiceEx+0x155
20 00000000`0009bd30 00000000`6399bdb2 wow64cpu!ServiceNoTurbo+0xb
21 00000000`0009bde0 00000000`639a8c36 wow64!RunCpuSimulation+0x22
22 00000000`0009be10 00000000`639e440a wow64!Wow64KiUserCallbackDispatcher+0x426
23 00000000`0009c220 00007ff8`eff98b94 wow64win!whcbfnDWORD+0x21a
24 00000000`0009cc20 00000000`639f3844 ntdll!KiUserCallbackDispatcherContinue
25 00000000`0009cca8 00000000`639e5aa2 wow64win!NtUserMessageCall+0x14
26 00000000`0009ccb0 00000000`639e5ef8 wow64win!whNT32NtUserMessageCallCB+0x32
27 00000000`0009cd00 00000000`63996245 wow64win!whNtUserMessageCall+0x128
28 00000000`0009cdc0 00000000`63a61c87 wow64!Wow64SystemServiceEx+0x155
29 00000000`0009d680 00000000`6399bdb2 wow64cpu!ServiceNoTurbo+0xb
2a 00000000`0009d730 00000000`639a8c36 wow64!RunCpuSimulation+0x22
2b 00000000`0009d760 00000000`639e440a wow64!Wow64KiUserCallbackDispatcher+0x426
2c 00000000`0009db70 00007ff8`eff98b94 wow64win!whcbfnDWORD+0x21a
2d 00000000`0009e570 00000000`639f3844 ntdll!KiUserCallbackDispatcherContinue
2e 00000000`0009e5f8 00000000`639e5aa2 wow64win!NtUserMessageCall+0x14
2f 00000000`0009e600 00000000`639e5ef8 wow64win!whNT32NtUserMessageCallCB+0x32
30 00000000`0009e650 00000000`63996245 wow64win!whNtUserMessageCall+0x128
31 00000000`0009e710 00000000`63a61c87 wow64!Wow64SystemServiceEx+0x155
32 00000000`0009efd0 00000000`6399bdb2 wow64cpu!ServiceNoTurbo+0xb
33 00000000`0009f080 00000000`6399bcc0 wow64!RunCpuSimulation+0x22
34 00000000`0009f0b0 00007ff8`eff7fca1 wow64!Wow64LdrpInitialize+0x120
35 00000000`0009f360 00007ff8`effb166d ntdll!LdrpInitializeProcess+0x176d
36 00000000`0009f750 00007ff8`eff66d5e ntdll!_LdrpInitialize+0x4a8b9
37 00000000`0009f7d0 00000000`00000000 ntdll!LdrInitializeThunk+0xe
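To see at a glance which modules and call sites repeat in a backtrace like this, the k output can be tallied. This is a rough sketch under the assumption that the trace has been pasted into a string; the sample holds only a few frames copied from the listing, and the helper names call_sites and summarize are hypothetical. Every frame in the listing belongs to wow64, wow64cpu, wow64win or ntdll, which are part of the 64-bit WOW64 layer that hosts 32-bit processes, so a likely reason srpcontrols never appears is that this is the 64-bit view of the thread rather than its 32-bit stack.

```python
import re
from collections import Counter

# A few frames copied from the k listing; paste the full trace here.
SAMPLE = """\
# Child-SP RetAddr Call Site
07 00000000`00097660 00000000`639e2e0d wow64!Wow64KiUserCallbackDispatcher+0x426
08 00000000`00097a70 00007ff8`eff98b94 wow64win!whcbfnNCDESTROY+0xbd
09 00000000`00098450 00000000`63a621bc ntdll!KiUserCallbackDispatcherContinue
0d 00000000`000985c0 00000000`639e440a wow64!Wow64KiUserCallbackDispatcher+0x426
0e 00000000`000989d0 00007ff8`eff98b94 wow64win!whcbfnDWORD+0x21a
0f 00000000`000993d0 00000000`63a621bc ntdll!KiUserCallbackDispatcherContinue
"""

# Frame number, Child-SP, RetAddr, then module!symbol with an optional +offset.
FRAME = re.compile(r"^[0-9a-f]+\s+\S+\s+\S+\s+([^!\s]+)!([^+\s]+)", re.IGNORECASE)


def call_sites(text):
    """Return (module, symbol) for each frame line, innermost frame first."""
    sites = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:  # the header row starts with '#', so it never matches
            sites.append((match.group(1), match.group(2)))
    return sites


def summarize(text):
    """Count frames per module and per module!symbol."""
    sites = call_sites(text)
    by_module = Counter(module for module, _ in sites)
    by_symbol = Counter(f"{module}!{symbol}" for module, symbol in sites)
    return by_module, by_symbol


if __name__ == "__main__":
    modules, symbols = summarize(SAMPLE)
    for name, count in modules.most_common():
        print(f"{count:3d}  {name}")
    for name, count in symbols.most_common():
        print(f"{count:3d}  {name}")
```

A symbol that shows up many times, such as the callback dispatcher frames here, marks the repeating cycle; the one-off frames near the top (for example whcbfnNCDESTROY) are usually the more interesting ones to look at.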
Are there any clues here on what I can look for in our application, or in WinDbg?
Cheers, M@
Comments
My compliments and gratitude for engaging in such extensive efforts to dig deep into this issue. I am going to have to let Kevin review and comment on the technical bits in your post. However, if you are able to replicate this consistently, can you provide us with the steps or an RDK that replicates the experience? There is probably no better way for us to troubleshoot than to duplicate the crash in-house and work on a solution.
Download this debug file for our controls along with the latest pre-release of SRPControls.ocx (4.0.1 RC1). Place them into the same directory and re-register the OCX. Then do the same steps you did before, with WINDBG, and see if we get more information in the dump. If not, then we'll have to fall back to Don's suggestion of an RDK and/or specific steps to reproduce the problem.
Thanks for the downloads, Kevin. I can still replicate the crash and get a dump file. I can't see any extra info in the lm or k commands though, but I'm not sure if I'm using this correctly. Is WinDbg supposed to be running alongside OI in real time? If so, I'm not sure how to set that up.
I've uploaded the dump file in case there's anything you might be able to glean from it.
I've been trying for several months now to reduce the issue to its core so that I could send you a simple RDK, but I don't think that's going to work. I could send you the whole application with instructions/video on how to replicate, but I suspect the operating environment is also a factor. Our current application release uses SRPeditTable v3.0.8 but we have some customers reporting crashes and others not - and it doesn't appear to be as simple as Windows version.
Alternatively, now that I have WinDbg installed and the .pdb file, it may be easier to have a shared remote session where I can replicate the crash and you can use WinDbg? Let me know either way.
Cheers, M@
Are you only using SRPEditTable.ocx? Or do you also use SRPControls.ocx? Because we don't support using both. One always overrides the other, which could lead to unexpected behavior.
I am sympathetic to the fact that you cannot create an RDK. If you have a way for me to download a copy of the system, I'm happy to walk through the steps on my end. With inconsistent crashes like these, I feel like it would be the shortest path to isolating the problem and getting a fix.
Let me know what you think.
Ok, I'll prepare a system you can download and do a video on how to crash it.
Cheers, M@
Your copy of the system along with steps to repeat were the key. It was a tough fix with a lot of trial and error, but I think I nailed it. Try 4.0.1 RC3 and let me know if things are improved on your end.
Yes, WinDbg can be useful, particularly for determining which control within SRPcontrols is not happy. That then gives us something to target in our code for resolution.
Thanks very much for working on this! This one has been a small straw in a haystack.
Cheers, M@