This is the first time I’ve debugged in a while. The example comes from a dump file which my friend Patrick on Sysnative has been debugging, and I wanted to write another debugging post explaining a few additional clues you can check with a Stop 0x50.
The problem I find with some bugchecks is that their names can be a little generic and do not really pinpoint the exact problem. Yes, they give an idea of the problem, but they do not give any major clues; for example, a paged pool address referenced at the wrong IRQL would likely lead to a Stop 0x50 or a Stop 0xA. Let’s take a look at a few of the little clues available within the dump file.
Within the description there are two clues which point to the type of possible problem. An invalid system address was being referenced, which is quite obvious, since it must have been a Kernel-Mode address (otherwise we wouldn’t have gotten the bugcheck), and the address has been freed. Of course, the address could have been paged out to the page file, and corruption within the file system may then have led to that address being corrupted too.
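As a rough illustration of the "it must have been a Kernel-Mode address" check, here is a minimal Python sketch. It assumes x64 Windows with 48-bit canonical addressing, where the kernel half of the address space starts at 0xFFFF800000000000. The function name and the sample addresses are hypothetical and are not taken from the dump.

```python
# Assumption: x64 Windows with 48-bit canonical addressing, where the
# kernel half of the virtual address space begins at this boundary.
KERNEL_SPACE_START_X64 = 0xFFFF800000000000
MAX_ADDRESS_X64 = 0xFFFFFFFFFFFFFFFF


def is_kernel_address_x64(address: int) -> bool:
    """Return True if a 64-bit virtual address lies in the kernel half."""
    return KERNEL_SPACE_START_X64 <= address <= MAX_ADDRESS_X64


# Hypothetical addresses for illustration only.
print(is_kernel_address_x64(0xFFFFF8A012345678))  # kernel-mode range
print(is_kernel_address_x64(0x00007FF612345678))  # user-mode range
```

If Arg 1 falls in the kernel-mode range, the faulting reference was to system address space, which is consistent with a Stop 0x50.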
Now, a good next step is to check CR2 and see whether it matches the address referenced in Arg 1. The CR2 register, or Control Register 2, contains the Page Fault Linear Address, that is, the address whose access triggered the most recent page fault. The Linear Address is pretty much the Virtual Address: the offset from the Logical Address with the segment base added, and the segment register in this case is DS (Data Segment).
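The relationship between the logical address, the linear address and CR2 can be sketched in a few lines of Python. This is only an illustration: it assumes a flat memory model in which the DS base is zero (as it is treated in 64-bit mode), and the function names and values are hypothetical rather than taken from the dump.

```python
def linear_address(segment_base: int, offset: int, width_bits: int = 64) -> int:
    """Linear address = segment base + offset, wrapped to the address width."""
    mask = (1 << width_bits) - 1
    return (segment_base + offset) & mask


def cr2_matches_arg1(cr2: int, arg1: int) -> bool:
    """True when CR2 holds the same address the bugcheck reports in Arg 1."""
    return cr2 == arg1


# Hypothetical values for illustration only.
ds_base = 0x0                  # assumption: flat model, DS base is zero
offset = 0xFFFFF8A012345678    # offset part of the logical address
arg1 = 0xFFFFF8A012345678      # address reported in Arg 1

cr2 = linear_address(ds_base, offset)
print(hex(cr2), cr2_matches_arg1(cr2, arg1))
```

With a zero segment base the linear address is simply the offset, which is why CR2 and Arg 1 are expected to show the same value.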
The CR2 register appears to be pointing to the referenced address within Arg 1. We can investigate further and gather some small but important clues by gathering a stack trace and then viewing the registers saved in the trap frame when the page fault occurred.
Using the .trap command on the trap frame address, we can view the registers, the referenced addresses and the last called function, which caused the page fault. Note that a trap frame is the register state saved when an exception occurs, which is what a page fault is technically considered. It would only lead to problems if the exception couldn’t be handled by the page fault handler within MmAccessFault.
The concatenation of the two registers gives the address referenced through the ds register, which is the same address found in CR2 and in Arg 1 of the bugcheck description. We have found the referenced address. Going back to the bugcheck description, notice “pointing to freed memory”: the memory address has been wrongly freed with the nt!ExFreePoolWithTag function and paged out to disk when it shouldn’t have been, since this is a non-paged pool memory address.
We can even check the IRQL with the !irql extension to see whether the problem could have been IRQL-related, since at DISPATCH_LEVEL (2) and above only non-paged pool can be accessed and any page fault is illegal. Since the IRQL was 0, the IRQL possibility is moot, as page faults are legal at that level.
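The IRQL rule used here can be written down as a tiny Python sketch. The constants follow the standard Windows IRQL values (PASSIVE_LEVEL 0, APC_LEVEL 1, DISPATCH_LEVEL 2); the function name is hypothetical and only illustrates the reasoning.

```python
# Standard Windows IRQL values relevant to paging.
PASSIVE_LEVEL = 0
APC_LEVEL = 1
DISPATCH_LEVEL = 2


def page_faults_legal(irql: int) -> bool:
    """Page faults can only be serviced below DISPATCH_LEVEL."""
    return irql < DISPATCH_LEVEL


# The dump discussed here reported an IRQL of 0.
print(page_faults_legal(PASSIVE_LEVEL))   # page faults are legal
print(page_faults_legal(DISPATCH_LEVEL))  # page faults are illegal
```

An IRQL of 0 therefore rules out the "pageable memory touched at elevated IRQL" explanation, which leaves the invalid or freed address itself as the thing to investigate.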
In my opinion, the problem most likely points to a software issue. It’s been a while since I last debugged, so it was nice to be able to write a blog post on the subject again.
The full thread is here with Patrick’s analysis.