![Page 2: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/2.jpg)
Windows Crash Dump Analysis
Daniel Pearson
David Solomon Expert Seminars
SVR302
![Page 3: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/3.jpg)
Daniel Pearson
Started working with Windows NT 3.51
Three years at Digital Equipment Corporation
  Supporting Intel and Alpha systems running Windows NT
Seven years at Microsoft
  Senior Escalation Lead in Windows base team
  Worked in the Mobile Internet sustained engineering team
Instructor for David Solomon, co-author of the Windows Internals book series
![Page 4: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/4.jpg)
Agenda
Causes of Windows crashes
What happens during a crash
Configuring Windows crash options
Writing a crash dump
Automated and manual crash analysis
Using Driver Verifier to detect errors
Attaching a kernel debugger
* Portions of this session are based on material developed by Mark Russinovich and David Solomon
![Page 5: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/5.jpg)
Why Analyze a Crash?
When Windows Error Reporting has no solution or when it blames “a device driver”
![Page 6: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/6.jpg)
Why Does Windows Crash?
A device driver or part of the operating system incurs an unhandled exception
A device driver or part of the operating system explicitly crashes the system due to an unrecoverable condition
A page fault occurs at an interrupt request level (IRQL) of DISPATCH_LEVEL or higher
A hardware condition occurs, such as a nonmaskable interrupt or faulty memory, disk, etc.
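The page-fault cause above comes down to one rule: servicing a page fault can require waiting on disk I/O, and the kernel cannot wait or reschedule at DISPATCH_LEVEL or above. The following is a minimal Python sketch of that rule only. The numeric values are the standard low software IRQLs; the helper name `page_fault_is_fatal` is made up for illustration and is not a Windows API.

```python
from enum import IntEnum


class Irql(IntEnum):
    """The three lowest software IRQLs on x86 and x64 Windows."""
    PASSIVE_LEVEL = 0
    APC_LEVEL = 1
    DISPATCH_LEVEL = 2


def page_fault_is_fatal(current_irql: int) -> bool:
    """Return True when a page fault at this IRQL cannot be serviced.

    Hypothetical helper: at DISPATCH_LEVEL or higher the faulting code
    cannot block while the page is read in, so the system bugchecks.
    """
    return current_irql >= Irql.DISPATCH_LEVEL
```

A fault on pageable memory at `Irql.APC_LEVEL` is resolved normally; the same access at `Irql.DISPATCH_LEVEL` is the classic path to the 0xA stop code.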
![Page 7: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/7.jpg)
Causes of Windows Crashes
[Pie chart: Percentage of Top 500 Crashes for Windows Vista with Service Pack 1 (see note 1). Slice values: 70%, 13%, 11%, 6%. Legend: Third-party device drivers, Microsoft code, Crash too corrupt for analysis, Hardware errors]
1. Microsoft Corporation. 2008. Online Crash Analysis research performed in September.
![Page 8: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/8.jpg)
What Happens During a Crash
When a condition is detected that requires a crash, the kernel API KeBugCheckEx is called
KeBugCheckEx accepts a bugcheck code that indicates the reason for the crash and four parameters that supply additional information
VOID KeBugCheckEx(
  IN ULONG BugCheckCode,
  IN ULONG_PTR BugCheckParameter1,
  IN ULONG_PTR BugCheckParameter2,
  IN ULONG_PTR BugCheckParameter3,
  IN ULONG_PTR BugCheckParameter4
);
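To make the five arguments concrete, here is a small Python sketch that holds a bugcheck code with its four parameters and renders them as one line of hex. The class name `BugCheck`, the method `stop_line`, the exact text layout, and the sample values used with it are assumptions for illustration; this is not how the kernel formats the stop screen.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BugCheck:
    """Hypothetical container mirroring the KeBugCheckEx arguments."""
    code: int
    parameters: Tuple[int, int, int, int]

    def __post_init__(self) -> None:
        if len(self.parameters) != 4:
            raise ValueError("a bugcheck carries exactly four parameters")

    def stop_line(self, pointer_bits: int = 32) -> str:
        # ULONG_PTR is pointer sized, so the parameter width follows the platform.
        width = pointer_bits // 4
        params = ", ".join(f"0x{p:0{width}X}" for p in self.parameters)
        return f"STOP: 0x{self.code:08X} ({params})"
```

For example, `BugCheck(0xA, (0, 2, 0, 0x804EF9D3)).stop_line()` gives `STOP: 0x0000000A (0x00000000, 0x00000002, 0x00000000, 0x804EF9D3)`. The meaning of each parameter depends on the bugcheck code.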
![Page 9: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/9.jpg)
Inside of KeBugCheckEx
KeBugCheckEx performs several functions
  Disables interrupts
  Notifies other CPUs to halt execution
  Notifies registered drivers
  Writes crash dump information to disk*
  Restarts the system*
* Only if the system is configured to do so
![Page 10: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/10.jpg)
The Windows Stop Screen
[Screenshot: the Windows stop screen with five numbered callouts]
![Page 11: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/11.jpg)
Bugcheck Codes
Shared by many components and drivers
The Windows Driver Kit currently documents over 250 unique bugcheck codes
Two of the most common bugcheck codes are
  0xA IRQL_NOT_LESS_OR_EQUAL
    Usually caused by an invalid memory access
  0x1E KMODE_EXCEPTION_NOT_HANDLED
    Generated when executing garbage instructions
    Usually caused when a stack has been trashed
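When triaging many dumps it helps to turn a numeric stop code into its symbolic name. The Python sketch below covers only the two codes named on this slide; the table name `BUGCHECK_NAMES` and the function `describe_bugcheck` are illustrative, and a real table would be filled in from the WDK bugcheck reference.

```python
# Only the two codes from the slide; extend from the WDK documentation.
BUGCHECK_NAMES = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x1E: "KMODE_EXCEPTION_NOT_HANDLED",
}


def describe_bugcheck(code: int) -> str:
    """Return 'hex-code NAME', or mark the code as not in this small table."""
    name = BUGCHECK_NAMES.get(code, "(not in table)")
    return f"0x{code:X} {name}"
```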
![Page 12: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/12.jpg)
Memory Dump Types
Small memory dump
  Records the smallest set of useful information
Kernel memory dump*
  Records only kernel memory, which speeds up the process of writing a crash dump
Complete memory dump*
  Records the entire contents of system memory
* If either a Kernel or Complete memory dump is selected, the system will also create a minidump and store it in the %SystemRoot%\minidump directory
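The selected dump type is stored in the CrashDumpEnabled value under HKLM\SYSTEM\CurrentControlSet\Control\CrashControl. The Python sketch below maps those values to the files one would expect afterward, following the footnote above. The function name `expected_outputs` is made up, and the paths are the default locations, which can be changed in the recovery options.

```python
from enum import IntEnum
from typing import List


class CrashDumpEnabled(IntEnum):
    """Values of the CrashDumpEnabled registry setting."""
    NONE = 0
    COMPLETE = 1
    KERNEL = 2
    SMALL = 3


def expected_outputs(setting: CrashDumpEnabled) -> List[str]:
    """Default locations written for each dump type (assumed defaults)."""
    full_dump = r"%SystemRoot%\MEMORY.DMP"
    minidump_dir = r"%SystemRoot%\minidump"
    if setting == CrashDumpEnabled.NONE:
        return []
    if setting == CrashDumpEnabled.SMALL:
        return [minidump_dir]
    # Kernel and complete dumps also leave a minidump behind.
    return [full_dump, minidump_dir]
```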
![Page 13: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/13.jpg)
Configuring Debugging Information Options
demo
![Page 14: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/14.jpg)
Writing a Crash Dump
Crash dump information is written to the paging file on the boot volume
  Too risky to create a new file on the system
How does the system know it’s safe?
  The boot volume paging file’s on-disk mapping is obtained when the system starts
  Critical crash components are checksummed
  When a crash occurs, if the checksum doesn’t match, a memory dump is not written
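The checksum safeguard can be pictured with a short Python sketch: record a checksum over the dump-writing components at startup, then refuse to write a dump if it no longer matches at crash time. CRC-32 from `zlib` is a stand-in chosen for the example, since the slide does not say which checksum Windows uses, and the class and component names are hypothetical.

```python
import zlib
from typing import Dict


class CrashDumpPath:
    """Hypothetical model of the components needed to write a dump."""

    def __init__(self, components: Dict[str, bytes]) -> None:
        self._components = components
        # Taken once, while the system is known to be healthy.
        self._boot_checksum = self._checksum()

    def _checksum(self) -> int:
        crc = 0
        for name in sorted(self._components):
            crc = zlib.crc32(self._components[name], crc)
        return crc

    def can_write_dump(self) -> bool:
        # A mismatch means the dump path itself may be corrupted, so
        # writing to disk could damage data; skip the dump instead.
        return self._checksum() == self._boot_checksum
```

If one byte of a component changes after startup, `can_write_dump()` returns False, which corresponds to the "critical crash components are corrupted" case in which no dump appears.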
![Page 15: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/15.jpg)
Why Would You Not Get a Dump?
Problems with page file configuration
  The paging file on the boot volume is too small or one does not exist
  The system crashed before the paging file was initialized
Critical crash components are corrupted
Windows didn’t crash!
  The system spontaneously restarted
  The system is hung
![Page 16: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/16.jpg)
When the System Restarts
[Diagram: five numbered steps spanning user mode and kernel mode. Labels: Session Manager (SMSS), NtCreatePagingFile, Paging file, DUMPxxxx.tmp, “MachineCrash”, WinInit, WerFault, Memory.dmp]
![Page 17: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/17.jpg)
Analyzing a Crash Dump
The Microsoft kernel debuggers can be used to open and analyze a crash dump
  kd, a command-line tool, and WinDbg, a GUI tool
  Available as part of the Debugging Tools for Windows
  http://www.microsoft.com/whdc/devtools/debugging/default.mspx
Configure the debugger to point to symbols
  srv*C:\SYMBOLS*http://msdl.microsoft.com/download/symbols
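As a rough illustration of how the symbol path is put together and handed to the debugger, the Python sketch below builds the `srv*cache*server` string and a kd argument list. The kd options `-z` (dump file), `-y` (symbol path) and `-c` (initial commands) are standard; the helper names and the dump file location in the usage note are assumptions.

```python
from typing import List


def symbol_path(cache_dir: str, server: str) -> str:
    """Symbol-server path: downloaded symbols are cached in cache_dir."""
    return f"srv*{cache_dir}*{server}"


def kd_arguments(dump_file: str, sympath: str, commands: List[str]) -> List[str]:
    """Argument list for opening a dump and running initial commands."""
    return ["kd", "-z", dump_file, "-y", sympath, "-c", "; ".join(commands)]
```

For example, `kd_arguments(r"C:\Windows\MEMORY.DMP", symbol_path(r"C:\SYMBOLS", "http://msdl.microsoft.com/download/symbols"), ["!analyze -v", "q"])` could be passed to `subprocess.run` on a machine with the Debugging Tools installed.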
![Page 18: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/18.jpg)
Automated Analysis
When you open a crash dump with WinDbg or kd, the debugger performs basic crash analysis*
  Displays stop code and parameter information
  Takes a guess at the offending driver
The analysis is the result of the automated execution of the !analyze debugger command
  !analyze uses the bugcheck parameters and a set of heuristics to determine what component is the likely cause of the crash
* Set the environment variable DBGENG_NO_BUGCHECK_ANALYSIS=1 to disable
![Page 19: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/19.jpg)
Automated Analysis Using !analyze
demo
![Page 20: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/20.jpg)
Buffer Overruns
Occur when a driver writes past the end (an overrun) or before the beginning (an underrun) of its memory allocation
Usually detected when the overwritten data is referenced by the kernel or another driver
There can be a long delay between corruption and detection
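The delay between corruption and detection is easier to see in a toy model. In the Python sketch below, two "drivers" receive adjacent allocations from one shared pool; an unchecked 20-byte write into a 16-byte allocation silently damages the neighbor, and nothing fails until the neighbor reads its data. The `Pool` class is a deliberately naive stand-in, not a model of the real kernel pool allocator.

```python
class Pool:
    """Naive bump allocator over one shared buffer (illustration only)."""

    def __init__(self, size: int) -> None:
        self.memory = bytearray(size)
        self.next_free = 0

    def allocate(self, size: int) -> int:
        base = self.next_free
        if base + size > len(self.memory):
            raise MemoryError("pool exhausted")
        self.next_free += size
        return base

    def write(self, base: int, data: bytes) -> None:
        # Like a raw pointer write: no check against the allocation size.
        end = base + len(data)
        if end > len(self.memory):
            raise IndexError("write beyond the pool")
        self.memory[base:end] = data

    def read(self, base: int, size: int) -> bytes:
        return bytes(self.memory[base:base + size])
```

With `a = pool.allocate(16)` and `b = pool.allocate(16)`, writing 20 bytes at `a` succeeds without complaint, and the first four bytes belonging to `b` are now wrong. The component that eventually crashes is the owner of `b`, the victim, not the culprit.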
![Page 21: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/21.jpg)
Viewing the Effects of a Buffer Overrun
demo
![Page 22: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/22.jpg)
Crash Transformation
For crashes that are difficult to analyze
  The “victim” crashed the system, not the culprit
  The debugger points to ntoskrnl.exe, win32k.sys or other Windows components
  You get many different crash dumps all pointing at different causes
Your goal isn’t to analyze difficult crashes …
  It’s to try to make an “unanalyzable” crash into one that can be easily analyzed
![Page 23: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/23.jpg)
Driver Verifier
Useful for identifying code defects in drivers
Performs more thorough checks on the system and device drivers as well as simulating failures
Support is built into the operating system
The requirements for the Windows logo program state that a driver must not fail while running under Driver Verifier
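One of Driver Verifier's checks, special pool, places a verified driver's allocation next to an inaccessible page so that an overrun faults at the offending instruction instead of corrupting a neighbor. The Python sketch below imitates only that effect with an explicit bounds check; `GuardedAllocation` and `GuardPageViolation` are hypothetical names, and the real mechanism relies on page protection rather than a software check.

```python
class GuardPageViolation(Exception):
    """Stands in for the immediate fault raised at a guard page."""


class GuardedAllocation:
    """An allocation whose out-of-bounds writes fail at the point of the write."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.data = bytearray(size)

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            # The culprit is still on the stack when this fires.
            raise GuardPageViolation(
                f"write of {len(data)} bytes at offset {offset} "
                f"exceeds allocation of {self.size} bytes"
            )
        self.data[offset:offset + len(data)] = data
```

Here a 20-byte write into a 16-byte allocation raises immediately, so the failure points at the code that made the bad write.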
![Page 24: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/24.jpg)
Using Driver Verifier to Catch a Buffer Overrun
demo
![Page 25: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/25.jpg)
Manual Analysis
Sometimes !analyze isn’t enough
  It might not tell you anything useful
  You want to know in more detail what was happening at the time of the crash
Several useful commands and techniques
  Verify the time of the crash, .time
    A short uptime value can mean frequent problems
  Check the stack on each CPU; stacks are read from the bottom to the top
    !cpuinfo will display a list of all the CPUs
    Use ~s, prefixed with the CPU number (for example ~1s), to switch to a different CPU for investigation
    k to display the stack
![Page 26: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/26.jpg)
Manual Analysis
Several useful commands and techniques
  Look at memory usage, !vm
    Make sure memory pools are not depleted and do not contain errors
    Use !poolused to identify large users
  Check the currently running thread, !thread
    May or may not be related to the crash
    Check pending I/O requests using !irp
  List all processes on the system, !process 0 0
    Make sure you understand what was running at the time
  List loaded drivers, lm t n
    Make sure all the drivers are recognizable and up to date
* Refer to the Debugging Tools for Windows documentation for additional commands
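The manual-analysis commands from the two slides above can be collected into one repeatable sequence. The Python sketch below generates that sequence, walking each CPU's stack in turn. The function name `triage_commands` and the ordering are a suggestion, not a prescribed procedure; `!irp` is left out because it needs an IRP address taken from earlier output.

```python
from typing import List


def triage_commands(cpu_count: int) -> List[str]:
    """Debugger commands for a first manual pass over a crash dump."""
    commands = [".time", "!cpuinfo"]
    for cpu in range(cpu_count):
        commands.append(f"~{cpu}s")  # switch to this CPU
        commands.append("k")         # display its stack
    commands += ["!vm", "!poolused", "!thread", "!process 0 0", "lm t n"]
    return commands
```

Joining the result with `"; "` produces a string that can be pasted into the debugger command window or supplied as the debugger's initial command.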
![Page 27: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/27.jpg)
Manual Analysis of a Crash Dump
demo
![Page 28: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/28.jpg)
Attaching a Kernel Debugger
Required for debugging initialization failures and crashes where no dump file is created
Requires that the system be started with the debugger enabled to work
Support for null-modem, IEEE 1394 and USB 2.0 cables, as well as virtual machines and over the network in Windows 7
Limited support for local kernel debugging
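For a null-modem connection, enabling the debugger on the target and attaching from the host comes down to a few commands. The Python sketch below only builds those command lines. `bcdedit /debug` and `bcdedit /dbgsettings` apply to Windows Vista and later, `kd -k com:...` is the host-side serial syntax, and the port numbers and baud rate shown are example values that must match the actual cabling.

```python
from typing import List


def target_setup_commands(com_port: int = 1, baud: int = 115200) -> List[List[str]]:
    """Commands to run elevated on the target; a restart is needed afterward."""
    return [
        ["bcdedit", "/debug", "on"],
        ["bcdedit", "/dbgsettings", "serial",
         f"debugport:{com_port}", f"baudrate:{baud}"],
    ]


def host_debugger_command(host_port: str = "COM1", baud: int = 115200) -> List[str]:
    """Command to start kd on the host and wait for the target."""
    return ["kd", "-k", f"com:port={host_port},baud={baud}"]
```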
![Page 29: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/29.jpg)
Attaching a Kernel Debugger to a Live System
demo
![Page 30: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/30.jpg)
Hung Systems
Sometimes systems become unresponsive
  Keyboard and mouse frozen
Two types of hangs
  Instant lockup
    Kernel synchronization deadlock
    Infinite loop at a high IRQL or a very high priority thread
  Slowly grinding to a halt
    Resource depletion
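A synchronization deadlock is a cycle of waiters: thread 1 holds a lock thread 2 needs while thread 2 holds one thread 1 needs. The Python sketch below finds such a cycle in a "waits for" map of the kind one might write down by hand while reading thread and lock information in the debugger. The function name and the input format are assumptions for illustration.

```python
from typing import Dict, List


def find_deadlock(waits_for: Dict[str, str]) -> List[str]:
    """Return the threads forming a wait cycle, or an empty list.

    waits_for maps a thread to the thread that owns the lock it is
    blocked on; threads that are not waiting are simply absent.
    """
    for start in waits_for:
        seen = [start]
        current = waits_for.get(start)
        while current is not None and current not in seen:
            seen.append(current)
            current = waits_for.get(current)
        if current is not None:
            # We came back to a thread already on the path: a cycle.
            return seen[seen.index(current):]
    return []
```

A chain such as `{"T1": "T2"}` with T2 still running is just contention and returns an empty list; `{"T1": "T2", "T2": "T1"}` returns both threads.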
![Page 31: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/31.jpg)
Initiating a Manual Crash
Using the keyboard
  Requires a PS/2 keyboard + registry key
    HKLM\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters\CrashOnCtrlScroll
Using an NMI button
  Requires specialized hardware + registry key
    HKLM\SYSTEM\CurrentControlSet\Control\CrashControl\NMICrashDump
Using the debugger
  Break in and execute the .crash command
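Both registry settings above are DWORD values set to 1, and they take effect after a restart. The Python sketch below generates a .reg file for them as text, so it can be reviewed before being imported on a test machine. The function and constant names are illustrative. With CrashOnCtrlScroll enabled, the crash is triggered by holding the right Ctrl key and pressing Scroll Lock twice.

```python
from typing import Dict

# Registry key path -> value name, taken from the two settings above.
MANUAL_CRASH_VALUES: Dict[str, str] = {
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters":
        "CrashOnCtrlScroll",
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl":
        "NMICrashDump",
}


def reg_file_text(enabled: bool = True) -> str:
    """Return .reg file content that turns both manual-crash options on or off."""
    data = 1 if enabled else 0
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, name in MANUAL_CRASH_VALUES.items():
        lines.append(f"[{key}]")
        lines.append(f'"{name}"=dword:{data:08x}')
        lines.append("")
    return "\r\n".join(lines)
```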
![Page 32: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/32.jpg)
Debugging a Hung System
demo
![Page 33: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/33.jpg)
Additional Information
Windows Internals 5th edition
Debugging Tools for Windows documentation
Mark Russinovich’s Blog
  http://blogs.technet.com/markrussinovich
Advanced Windows Debugging Blog
  http://blogs.msdn.com/ntdebugging
Crash Dump Analysis and Debugging Portal
  http://www.dumpanalysis.org
![Page 34: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/34.jpg)
Additional Information
David Solomon Expert Seminars offers training on Windows Internals as public and private workshops and as public webinars via the Internet
Currently scheduled upcoming classes
  Public workshop in London scheduled for March 2010
  Public webinar scheduled for January 2010
Visit http://www.solsem.com for further course descriptions and up-to-date information
![Page 35: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/35.jpg)
question & answer
![Page 36: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/36.jpg)
Resources
www.microsoft.com/teched
  Sessions On-Demand & Community
http://microsoft.com/technet
  Resources for IT Professionals
http://microsoft.com/msdn
  Resources for Developers
www.microsoft.com/learning
  Microsoft Certification & Training Resources
![Page 39: Daniel Pearson David Solomon Expert Seminars SVR302](https://reader030.vdocuments.net/reader030/viewer/2022032701/56649c975503460f9495309b/html5/thumbnails/39.jpg)
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.