The Stop 0x124 is mostly caused by hardware and, in some exceptional cases, can be caused by buggy device drivers. There isn’t much of a methodology for debugging a Stop 0x124, but there is plenty of background information that is useful for understanding the terminology seen in a Stop 0x124 bugcheck.
When a Stop 0x124 cannot be created successfully, the result is usually a Stop 0x122. A debugging tutorial for Stop 0x122 can be found here: Debugging Stop 0x122 – WHEA_INTERNAL_ERROR
WHEA (Windows Hardware Error Architecture) was introduced in Windows Vista and Windows Server 2008 to provide an effective error reporting system that makes debugging easier, and to take precedence over the MCA (Machine Check Architecture) as the primary error reporting architecture for hardware devices. MCA and MCE (Machine Check Exception) still exist on Windows Vista and later operating systems, but are delivered through WHEA instead.
Structure of WHEA:
WHEA consists of a number of different components. The main concepts are LLHEHs (Low-Level Hardware Error Handlers), PSHEDs (Platform-Specific Hardware Error Drivers) and WHEA error records. The following diagram, obtained from the Microsoft documentation, provides an overview of how these components interact with the rest of the operating system:
The LLHEH is the first component to handle the error discovered by the error source. Error sources will be discussed later in this guide, but for now, I will simply mention that the error source is the hardware component which discovered the hardware error, which is not necessarily where the error originated. The following flow diagram will hopefully help to illustrate the entire WHEA process.
Hardware Error -> Error Source Alerts OS -> LLHEH for corresponding error source is invoked -> Error Packet is created -> Error Packet is processed into an Error Record -> Error Record is processed by PSHED -> Bugcheck is produced
It is important to note that the above flow diagram is rather crude and doesn’t show the details of each step involved in the WHEA bugcheck process. It also only illustrates what happens with a fatal hardware error, that is, one which leads to a bugcheck.
I will now discuss Error Sources and their purpose within a WHEA bugcheck. To begin, note that the first parameter of the Stop 0x124 is the value of the error source.
2: kd> .bugcheck
Bugcheck code 00000124
Arguments 00000000`00000000 fffffa80`04ba6028 00000000`be000000 00000000`00800400
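When this output needs to be handled outside the debugger (for example, when comparing several dumps), it can help to turn it into plain integers. The following is a minimal Python sketch, not part of any debugger tooling: the function name parse_bugcheck is hypothetical, and it assumes the output has the same layout as the example above, with a backtick separating the high and low halves of each 64-bit value.

```python
import re

def parse_bugcheck(text):
    """Parse `.bugcheck` output into (code, [arg1, arg2, arg3, arg4]).

    Hypothetical helper: assumes the layout shown above, where 64-bit
    values are printed in hex with a backtick between the two halves.
    """
    code_match = re.search(r"Bugcheck code\s+([0-9A-Fa-f]+)", text)
    args_match = re.search(r"Arguments\s+(.+)", text)
    if not code_match or not args_match:
        raise ValueError("text does not look like .bugcheck output")
    code = int(code_match.group(1), 16)
    # Drop the backtick so each argument becomes one plain hex number.
    args = [int(tok.replace("`", ""), 16)
            for tok in args_match.group(1).split()]
    return code, args

sample = (
    "Bugcheck code 00000124\n"
    "Arguments 00000000`00000000 fffffa80`04ba6028 "
    "00000000`be000000 00000000`00800400"
)
code, args = parse_bugcheck(sample)
print(hex(code), [hex(a) for a in args])
```

The first element of args is the error source value discussed here.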
All error sources are stored within an enumeration called WHEA_ERROR_SOURCE_TYPE, which can be used to find the name of the error source. The enumeration currently has 13 entries: twelve error sources plus the WheaErrSrcTypeMax marker. The most common error sources are MCE (0x0) and PCIe (0x4).
2: kd> dt nt!_WHEA_ERROR_SOURCE_TYPE
   WheaErrSrcTypeMCE = 0n0
   WheaErrSrcTypeCMC = 0n1
   WheaErrSrcTypeCPE = 0n2
   WheaErrSrcTypeNMI = 0n3
   WheaErrSrcTypePCIe = 0n4
   WheaErrSrcTypeGeneric = 0n5
   WheaErrSrcTypeINIT = 0n6
   WheaErrSrcTypeBOOT = 0n7
   WheaErrSrcTypeSCIGeneric = 0n8
   WheaErrSrcTypeIPFMCA = 0n9
   WheaErrSrcTypeIPFCMC = 0n10
   WheaErrSrcTypeIPFCPE = 0n11
   WheaErrSrcTypeMax = 0n12
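The lookup from the first bugcheck parameter to an error source name can be sketched in a few lines of Python. This simply mirrors the values in the listing above; the class and function names (WheaErrorSourceType, error_source_name) are made up for illustration and do not correspond to any real API.

```python
from enum import IntEnum

class WheaErrorSourceType(IntEnum):
    """Mirror of the nt!_WHEA_ERROR_SOURCE_TYPE values listed above."""
    MCE = 0
    CMC = 1
    CPE = 2
    NMI = 3
    PCIe = 4
    Generic = 5
    INIT = 6
    BOOT = 7
    SCIGeneric = 8
    IPFMCA = 9
    IPFCMC = 10
    IPFCPE = 11
    Max = 12  # end-of-enumeration marker rather than a real source

def error_source_name(arg1):
    """Return the error source name for bugcheck parameter 1."""
    try:
        return WheaErrorSourceType(arg1).name
    except ValueError:
        # Value not present in the listing this sketch was built from.
        return f"unknown (0x{arg1:x})"

print(error_source_name(0x0))  # parameter 1 from the example bugcheck
print(error_source_name(0x4))
```

For the example bugcheck, parameter 1 is 0x0, so the lookup yields MCE.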
Our current error source type is the Machine Check Exception. When the error source alerts the operating system to a hardware error, the corresponding LLHEH is run to handle that error condition. The LLHEH isn’t actually a separate entity; it is simply a category of handlers, so an LLHEH can be an interrupt handler, an exception handler or a callback function. The LLHEH processes the error condition into an error packet, and then alerts the operating system to the hardware error condition.
2: kd> .frame /r 3
03 fffff880`02f6db00 fffff800`02c26052 hal!HalpMcaReportError+0x4c
rax=0000000000000000 rbx=fffffa8004c17ea0 rcx=0000000000000124
rdx=0000000000000000 rsi=fffff88002f6de00 rdi=fffffa8004c17ef0
rip=fffff80002c26700 rsp=fffff88002f6db00 rbp=fffff88002f6de30
 r8=fffffa8004ba6028  r9=00000000be000000 r10=0000000000800400
r11=0000000000000002 r12=00000000ffffff02 r13=0000000000000000
r14=0000000000000000 r15=0000000000000001
iopl=0         ov up ei pl nz na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00000a06
hal!HalpMcaReportError+0x4c:
fffff800`02c26700 488b8c2430010000 mov     rcx,qword ptr [rsp+130h] ss:0018:fffff880`02f6dc30=ffff00906cfd8774
hal!HalpMcaReportError is the LLHEH for this bugcheck. Notice that the bugcheck information is visible in the registers for this stack frame: rcx holds the bugcheck code (0x124), while rdx, r8, r9 and r10 hold the same values as the four bugcheck parameters.
As mentioned previously, an LLHEH will produce an error packet, which in turn can be investigated with the debugger.
Each error packet is represented by the WHEA_ERROR_PACKET macro, and there are currently two different versions: WHEA_ERROR_PACKET_V1 and WHEA_ERROR_PACKET_V2. The V1 type is supported by Windows Vista SP1 and Windows Server 2008; V2 is supported from Windows 7 onward.
The only difference between the two structures is the Signature member, which takes the value WHEA_ERROR_PACKET_V2_SIGNATURE for Version 2 or WHEA_ERROR_PACKET_V1_SIGNATURE for Version 1. Since Windows Vista systems are pretty much obsolete now, there isn’t any real reason to bother examining the Version 1 structure.
2: kd> dt _WHEA_ERROR_PACKET_V2
nt!_WHEA_ERROR_PACKET_V2
   +0x000 Signature        : Uint4B
   +0x004 Version          : Uint4B
   +0x008 Length           : Uint4B
   +0x00c Flags            : _WHEA_ERROR_PACKET_FLAGS
   +0x010 ErrorType        : _WHEA_ERROR_TYPE
   +0x014 ErrorSeverity    : _WHEA_ERROR_SEVERITY
   +0x018 ErrorSourceId    : Uint4B
   +0x01c ErrorSourceType  : _WHEA_ERROR_SOURCE_TYPE
   +0x020 NotifyType       : _GUID
   +0x030 Context          : Uint8B
   +0x038 DataFormat       : _WHEA_ERROR_PACKET_DATA_FORMAT
   +0x03c Reserved1        : Uint4B
   +0x040 DataOffset       : Uint4B
   +0x044 DataLength       : Uint4B
   +0x048 PshedDataOffset  : Uint4B
   +0x04c PshedDataLength  : Uint4B
The most important members of the data structure are ErrorType, ErrorSourceType and NotifyType.
The ErrorType field holds a value from the WHEA_ERROR_TYPE enumeration, which describes the hardware that reported the error.
2: kd> dt nt!_WHEA_ERROR_TYPE
   WheaErrTypeProcessor = 0n0
   WheaErrTypeMemory = 0n1
   WheaErrTypePCIExpress = 0n2
   WheaErrTypeNMI = 0n3
   WheaErrTypePCIXBus = 0n4
   WheaErrTypePCIXDevice = 0n5
   WheaErrTypeGeneric = 0n6
The ErrorSourceType has been explained earlier in this post. The NotifyType is the type of mechanism which reports the error to the operating system; for example MCE or BOOT. The _GUID is given the following values:
We can examine WHEA Error Packets using the !errpkt extension, but unfortunately that requires a WHEA Error Record with the Error Record Section named Error Packet/Hardware Error Packet. I started debugging in 2012, and I still haven’t seen a BSOD where !errpkt has worked.