kernel recipes 2015 - kernel dump analysis

Linux crashdump analysis Dumping and analysing system state Kernel-Recipes 2015 Adrien Mahieux - Sysadmin & microsecond hunter gh: github.com/Saruspete tw: @Saruspete

Upload: anne-nicolas

Post on 09-Jan-2017


Page 1: Kernel Recipes 2015 - Kernel dump analysis

Linux crashdump analysis

Dumping and analysing system state
Kernel-Recipes 2015

Adrien Mahieux - Sysadmin & microsecond hunter
gh: github.com/Saruspete tw: @Saruspete

0 - Agenda

1. What’s a (crash)dump ?
- Get a dump - from hypervisor
- Get a crashdump - with kdump

2. Dump analysis
- GDB based tool : crash
- Requirements : debuginfo
- What to look for

3. Live analysis (+ edition)
- Using crash on a live system

4. Tools & Links
- Source browsing
- Script helpers
- Analysis

1 - What’s a (crash)dump ?

What : A snapshot of a system’s memory at a specific time

Who : Mostly for sysadmins and guardians of production

Where : Physical and virtual Linux-based servers

When : Your server is unresponsive (from ssh / console / application…)

Why : To know what happened (kernel bug, external attack, limit missing…)

How : Physical : kexec & panic the server
      Virtual : same, or from the hypervisor

How much : Uses between 64M and 512M of RAM to boot the secondary kernel
           On virtual, you may do it from the hypervisor at no cost

1.1 - Get a dump - hypervisor

VMware

- Suspend / resume (.vmss file) or snapshot with memory (.vmsn file)
- Use the vmss2core tool (VMware Labs) to transform the raw dump into an ELF dump

libvirt

- virsh : virsh dump MyGuestName /storage/MyGuestName.dump
- QEMU Monitor : dump-guest-memory [-z|-l|-s] FILENAME

Xen

- xl : xl dump-core domain-id filename
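The hypervisor commands above can be wrapped in a small helper when dumps have to be taken from scripts. This is only a sketch: the function name and the `memory_only` parameter are hypothetical, and the `--memory-only` flag of `virsh dump` is not on the slide (it is added here on the assumption that an ELF image of guest memory alone is wanted for crash).

```python
def build_dump_command(hypervisor, guest, output_path, memory_only=True):
    """Return the argv list that would dump the memory of `guest`.

    Nothing is executed here; pass the result to subprocess.run() if wanted.
    """
    if hypervisor == "libvirt":
        argv = ["virsh", "dump"]
        if memory_only:
            # Assumption: not on the slide, requests a memory-only ELF dump.
            argv.append("--memory-only")
        return argv + [guest, output_path]
    if hypervisor == "xen":
        # Same form as on the slide: xl dump-core domain-id filename
        return ["xl", "dump-core", guest, output_path]
    raise ValueError(f"unsupported hypervisor: {hypervisor}")


print(build_dump_command("libvirt", "MyGuestName", "/storage/MyGuestName.dump"))
```

Returning the argv instead of running it keeps the helper safe to call on a production host before the dump is really needed.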

1.2 - Get a crashdump - kexec / kdump

Kernel configuration

- CONFIG_KEXEC=y : to boot the secondary kernel
- CONFIG_SYSFS=y : for /sys/kernel/kexec_crash_{loaded,size}
- CONFIG_CRASH_DUMP=y
- CONFIG_PROC_VMCORE=y : export the dump to /proc/vmcore
- (CONFIG_DEBUG_INFO=y) : will not be in live kernel
- (CONFIG_RELOCATABLE=y) : to use the same kernel for live & dump
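A quick way to check the mandatory options above is to scan a kernel config file. The sketch below is an illustration, not part of the talk: the function name is hypothetical, and it assumes the config is available as text (many distributions ship it as /boot/config-<release>, which is an assumption about your system).

```python
REQUIRED_OPTIONS = (
    "CONFIG_KEXEC",
    "CONFIG_SYSFS",
    "CONFIG_CRASH_DUMP",
    "CONFIG_PROC_VMCORE",
)


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments: skip them.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [option for option in required if option not in enabled]


sample = """CONFIG_KEXEC=y
CONFIG_SYSFS=y
CONFIG_CRASH_DUMP=y
# CONFIG_PROC_VMCORE is not set
"""
print(missing_options(sample))
```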

Boot option : crashkernel=X@Y

- X is the amount of memory to be reserved + 2 bytes for each 4KB
- Y is the offset at which memory will be reserved
- You can specify only X and the kernel will find Y
- If you have more than 2G of RAM, you can use “auto”
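To see which X and Y a machine was booted with, the crashkernel option can be pulled out of the kernel command line (for instance the content of /proc/cmdline). A minimal sketch, with a hypothetical function name; it only splits the value at "@" and does not interpret size suffixes or more elaborate forms of the option.

```python
import re

_CRASHKERNEL = re.compile(r"(?:^|\s)crashkernel=(\S+)")


def parse_crashkernel(cmdline):
    """Return (X, Y) from a kernel command line, or None if the option is absent.

    Y is None when only X is given, which includes crashkernel=auto.
    """
    match = _CRASHKERNEL.search(cmdline)
    if match is None:
        return None
    size, _, offset = match.group(1).partition("@")
    return size, (offset or None)


print(parse_crashkernel("ro root=/dev/sda1 crashkernel=128M@16M quiet"))
```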

1.2 - Get a crashdump - kexec / kdump

Configure kdump

- Feature of the kernel that exports an ELF memory image via /proc/vmcore
- kdump often refers to the whole process to dump a core
- Relies on kexec to boot a secondary kernel / initrd to do the job
- Uses the memory reserved by the “crashkernel” boot option to load the “dump-capture” kernel & initrd
- Upon panic, the running kernel will start the new one, which will do the dump (ssh, ftp, local disk... depending on your script) and reboot the system
- kdump can use makedumpfile to filter memory data by type (free pages, userland pages, private cache, cache pages, zero pages)
- Check status with /sys/kernel/kexec_crash_{loaded,size}
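The status check mentioned in the last bullet can be scripted by reading the two sysfs files. A minimal sketch, assuming the files hold "0"/"1" and a size in bytes; the function name and the `sys_kernel` parameter (there so that the code can be tried against a scratch directory) are hypothetical.

```python
from pathlib import Path


def kdump_status(sys_kernel=Path("/sys/kernel")):
    """Report whether a dump-capture kernel is loaded and how much RAM is reserved."""
    sys_kernel = Path(sys_kernel)
    loaded = (sys_kernel / "kexec_crash_loaded").read_text().strip() == "1"
    reserved = int((sys_kernel / "kexec_crash_size").read_text().strip())
    return {"loaded": loaded, "reserved_bytes": reserved}
```

On a host where kdump is set up, `kdump_status()` should report `loaded` as True and a non-zero reservation; a zero size usually means the crashkernel boot option is missing.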

1.2 - Get a crashdump - kexec / kdump

Dumping an unresponsive system : PANIC !

Manually

- SysRq : echo c > /proc/sysrq-trigger
- NMI via IPMI : ipmitool power diag
- NMI via virsh : virsh inject-nmi MyGuestName
- Beware of kernel.unknown_nmi_panic=1

Automatically

- Watchdog : boot cmdline nmi_watchdog=1
- Softlockup : sysctl kernel.softlockup_panic=1
- Out Of Memory : sysctl vm.panic_on_oom=1
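The sysctl settings listed above can be made persistent with a small configuration fragment. The sketch below only renders the text; the function name is hypothetical, and where to store the result (for example a file under /etc/sysctl.d/) is an assumption about your distribution.

```python
PANIC_SYSCTLS = {
    "kernel.unknown_nmi_panic": 1,  # panic when an unknown NMI is received
    "kernel.softlockup_panic": 1,   # panic when a soft lockup is detected
    "vm.panic_on_oom": 1,           # panic instead of invoking the OOM killer
}


def render_sysctl_conf(settings=PANIC_SYSCTLS):
    """Render the settings as sysctl.conf lines, sorted by key."""
    return "".join(f"{key} = {value}\n" for key, value in sorted(settings.items()))


print(render_sysctl_conf(), end="")
```

Panicking on OOM or on a soft lockup is a heavy-handed choice: it trades availability for a dump, so it is best limited to hosts where a crashdump is really wanted.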

1.2 - Get a crashdump - kexec / kdump (non-server)

Desktops / laptops usually don’t have an external source to generate an NMI.

The kernel provides other ways :

Hard/Soft lockup detectors

- Kernel config {SOFT,HARD}LOCKUP_DETECTOR / BOOTPARAM_{SOFT,HARD}LOCKUP_PANIC
- Hard : a CPU stays in kernel mode for more than 10 sec without letting interrupts run
- Soft : a CPU stays in kernel mode for more than 20 sec without letting other tasks run (a task blocked for 120 sec is reported by the separate hung-task detector)

Watchdog daemon

- Kernel config {SOFT,CLOCKSOURCE}_WATCHDOG
- Boot option “nmi_watchdog=1”
- watchdog daemon (http://sourceforge.net/projects/watchdog)

2 - Dump Analysis

Your weapon : Crash

- Tool by Dave Anderson (Red Hat)
- Based on GDB
- x86, x86_64, arm, ia64, ppc64, s390
- Extensible (snap, trace, appdump, memory, dm, ipcs, cgroups, sockets, openvz…)
- Quick evolution and active mailing list

Your gunsmith : debuginfos

- We don’t want debug in production, but we’d like to be able to debug
- Split debuginfo are DWARF debug data in separate files, to be used on demand
- Most distributions provide them for stock kernels
  Red Hat : debuginfo-install kernel
  Debian : apt-get install linux-image-$(uname -r)-dbg
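Once the debuginfo package is installed, crash needs the path of the uncompressed vmlinux that matches the dumped kernel. The sketch below lists where it might be found: the first path is the one visible in the crash output later in these slides, while the Debian-style path is an assumption to verify on your system. Both function names are hypothetical.

```python
def debug_vmlinux_candidates(release):
    """Return paths where a debug vmlinux for `release` might be installed."""
    return [
        # Red Hat style, as seen in the KERNEL: line of the crash output
        f"/usr/lib/debug/lib/modules/{release}/vmlinux",
        # Debian style (assumption, check the content of the -dbg package)
        f"/usr/lib/debug/boot/vmlinux-{release}",
    ]


def crash_argv(vmlinux, vmcore):
    """Return the argv to open `vmcore` with crash, using `vmlinux` for symbols."""
    return ["crash", vmlinux, vmcore]


print(debug_vmlinux_candidates("2.6.32-431.29.2.el6.x86_64")[0])
```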

2.1 - What to look for

Summary of the system state : sys

      KERNEL: /var/crash/127.0.0.1-2015-08-20-20:00:00/vmcore
    DUMPFILE: vmcore.myserver [PARTIAL DUMP]
        CPUS: 24
        DATE: Mon Aug 20 20:00:00 2015
      UPTIME: 32 days, 17:12:02
LOAD AVERAGE: 1625.88, 1603.11, 1509.73
       TASKS: 25639
    NODENAME: myserver
     RELEASE: 2.6.18-371.8.1.el5
     VERSION: #1 SMP Fri Mar 28 05:53:58 EDT 2014
     MACHINE: x86_64 (2933Mhz)
      MEMORY: 284 GB
       PANIC: "Kernel panic - not syncing: An NMI occured"
         PID: 61015
     COMMAND: "java"
        TASK: ffff8135b50e5830 [THREAD_INFO: ffff8104bd256000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

System logs               log
Memory usage              kmem
Swap usage                swap
Running processes         ps
Set PID to analyze        set
Task struct of PID        task
Files opened by PID       files
Backtrace of PID          bt
Available devices         dev
Available NICs            net
Interrupts                irq
Mountpoints               mount
Processes using a file    fuser
IPC                       ipcs
Kernel modules            mod
Run queue                 runq
Symbols info              sym
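For a first triage, several of the commands above can be run in one go by feeding crash a command file. This is a sketch under assumptions: it relies on the `-i` (read commands from a file) and `-s` (silent startup) options of crash, and the function names are hypothetical.

```python
def crash_batch(commands):
    """Return the text of a crash command file, ending with 'quit'."""
    return "".join(f"{command}\n" for command in commands) + "quit\n"


def crash_batch_argv(command_file, vmlinux, vmcore):
    """Return the argv that runs crash non-interactively on a dump."""
    return ["crash", "-s", "-i", command_file, vmlinux, vmcore]


print(crash_batch(["sys", "log", "bt", "ps"]), end="")
```

Redirecting the output of such a run to a file gives a report that can be attached to a ticket without shipping the whole vmcore.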

2.2 - Crash : GDB for Kernel

Let’s check for a real kernel bug

      KERNEL: /usr/lib/debug/lib/modules/2.6.32-431.29.2.el6.x86_64/vmlinux
    DUMPFILE: vmcore [PARTIAL DUMP]
        CPUS: 64
        DATE: Wed Jun 14 11:23:14 2015
      UPTIME: 44 days, 04:14:21
LOAD AVERAGE: 0.70, 0.58, 0.55
       TASKS: 1917
    NODENAME: myredhat65
     RELEASE: 2.6.32-431.29.2.el6.x86_64
     VERSION: #1 SMP Sun Jul 27 15:55:46 EDT 2014
     MACHINE: x86_64 (1997 Mhz)
      MEMORY: 64 GB
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at (null)"
         PID: 2120
     COMMAND: "scsi_eh_6"
        TASK: ffff880437dcf540 [THREAD_INFO: ffff880435a94000]
         CPU: 50
       STATE: TASK_RUNNING (PANIC)
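When many dumps have to be compared, the summary printed by `sys` can be turned into a dictionary. A minimal sketch, assuming one "KEY: value" pair per line as crash prints it; the function name is hypothetical.

```python
def parse_sys_output(text):
    """Parse the 'KEY: value' lines printed by the crash `sys` command."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only: values such as DATE contain colons.
        key, separator, value = line.partition(":")
        if separator and key.strip():
            fields[key.strip()] = value.strip()
    return fields


sample = '''      KERNEL: /usr/lib/debug/lib/modules/2.6.32-431.29.2.el6.x86_64/vmlinux
        CPUS: 64
        DATE: Wed Jun 14 11:23:14 2015
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at (null)"
     COMMAND: "scsi_eh_6"
'''
summary = parse_sys_output(sample)
print(summary["PANIC"])
```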

2.2 - Crash : GDB for Kernel

crash> bt
PID: 2120 TASK: ffff880437dcf540 CPU: 50 COMMAND: "scsi_eh_6"
 #0 [ffff880435a95890] machine_kexec at ffffffff81038f3b
 #1 [ffff880435a958f0] crash_kexec at ffffffff810c5af2
 #2 [ffff880435a959c0] oops_end at ffffffff8152ca50
 #3 [ffff880435a959f0] no_context at ffffffff8104a00b
 #4 [ffff880435a95a40] __bad_area_nosemaphore at ffffffff8104a295
 #5 [ffff880435a95a90] bad_area_nosemaphore at ffffffff8104a363
 #6 [ffff880435a95aa0] __do_page_fault at ffffffff8104aabf
 #7 [ffff880435a95bc0] do_page_fault at ffffffff8152e99e
 #8 [ffff880435a95bf0] page_fault at ffffffff8152bd55
    [exception RIP: scsi_send_eh_cmnd+99]
    RIP: ffffffff813860e3 RSP: ffff880435a95ca0 RFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff880c2d600ec0 RCX: 0000000000002710
    RDX: ffff880c3002f000 RSI: ffffffff82017288 RDI: ffff880c2d600ec0
    RBP: ffff880435a95da0 R8: 0000000000000000 R9: 0000000000000000
    R10: 000d8f6a631f7b23 R11: 0000000000000001 R12: 0000000000000001
    R13: ffff880435a95e90 R14: 0000000000000000 R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 #9 [ffff880435a95da8] scsi_eh_tur at ffffffff81386672
#10 [ffff880435a95dd8] scsi_eh_test_devices at ffffffff8138675a
#11 [ffff880435a95e28] scsi_error_handler at ffffffff81387d4c
#12 [ffff880435a95ee8] kthread at ffffffff8109abf6
#13 [ffff880435a95f48] kernel_thread at ffffffff8100c20a

crash> gdb set disassembly-flavor intel
crash> dis scsi_send_eh_cmnd
0xffffffff81386080 <scsi_send_eh_cmnd>: push rbp
0xffffffff81386081 <scsi_send_eh_cmnd+1>: mov rbp,rsp
0xffffffff81386084 <scsi_send_eh_cmnd+4>: push r15
0xffffffff81386086 <scsi_send_eh_cmnd+6>: push r14
0xffffffff81386088 <scsi_send_eh_cmnd+8>: push r13
0xffffffff8138608a <scsi_send_eh_cmnd+10>: push r12
0xffffffff8138608c <scsi_send_eh_cmnd+12>: push rbx
0xffffffff8138608d <scsi_send_eh_cmnd+13>: sub rsp,0xd8
0xffffffff81386094 <scsi_send_eh_cmnd+20>: nop DWORD PTR [rax+rax*1+0x0]
0xffffffff81386099 <scsi_send_eh_cmnd+25>: mov rax,QWORD PTR gs:0x28
0xffffffff813860a2 <scsi_send_eh_cmnd+34>: mov QWORD PTR [rbp-0x38],rax
0xffffffff813860a6 <scsi_send_eh_cmnd+38>: xor eax,eax
0xffffffff813860a8 <scsi_send_eh_cmnd+40>: mov QWORD PTR [rbp-0xc8],rsi
0xffffffff813860af <scsi_send_eh_cmnd+47>: mov DWORD PTR [rbp-0xcc],edx
0xffffffff813860b5 <scsi_send_eh_cmnd+53>: mov rbx,rdi
0xffffffff813860b8 <scsi_send_eh_cmnd+56>: mov rax,QWORD PTR [rdi+0x80]

    crash> rd -o 0x80 0xffff880c2d600ec0
    ffff880c2d600f40: ffff880c372afd00

0xffffffff813860bf <scsi_send_eh_cmnd+63>: mov rdx,QWORD PTR [rdi]
0xffffffff813860c2 <scsi_send_eh_cmnd+66>: mov r14d,r8d
0xffffffff813860c5 <scsi_send_eh_cmnd+69>: mov rax,QWORD PTR [rax+0xb0]

    crash> rd -64 -o 0xb0 ffff880c372afd00
    ffff880c372afdb0: ffff880c2a4d0400

0xffffffff813860cc <scsi_send_eh_cmnd+76>: mov QWORD PTR [rbp-0xe8],0x0
0xffffffff813860d7 <scsi_send_eh_cmnd+87>: test rax,rax
0xffffffff813860da <scsi_send_eh_cmnd+90>: je 0xffffffff813860ed <scsi_send_eh_cmnd+109>
0xffffffff813860dc <scsi_send_eh_cmnd+92>: mov rax,QWORD PTR [rax+0x2c8]

    crash> rd -64 -o 0x2c8 ffff880c2a4d0400
    ffff880c2a4d06c8: 0000000000000000

0xffffffff813860e3 <scsi_send_eh_cmnd+99>: mov rax,QWORD PTR [rax]

Looking at the kernel code, there is this function :

static inline struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd)
{
        if (!cmd->request->rq_disk)
                return NULL;
        return *(struct scsi_driver **)cmd->request->rq_disk->private_data;
}

The "test/je" matches the "if (!cmd->request->rq_disk)".
The value loaded at +92 is 0 (RAX in the register dump), so the "mov rax,QWORD PTR [rax]" at +99 dereferences a NULL pointer : this is the crash.
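The pointer chase done by hand with `rd` can be replayed on paper. The sketch below is an illustration only: it copies the three memory words read on the slide into a dictionary and follows the same offsets as the disassembly (0x80, 0xb0, 0x2c8). The names `MEMORY` and `follow` are hypothetical.

```python
# Memory words read with `rd` on the slide: address -> 64-bit value.
MEMORY = {
    0xffff880c2d600f40: 0xffff880c372afd00,  # scsi_cmnd + 0x80 -> request
    0xffff880c372afdb0: 0xffff880c2a4d0400,  # request + 0xb0   -> rq_disk
    0xffff880c2a4d06c8: 0x0,                 # gendisk + 0x2c8  -> private_data
}


def follow(base, offsets, memory=MEMORY):
    """Load memory[base + offset] for each offset, returning every pointer seen."""
    seen = [base]
    for offset in offsets:
        base = memory[base + offset]
        seen.append(base)
    return seen


# RDI holds the scsi_cmnd address when the function is entered.
chain = follow(0xffff880c2d600ec0, [0x80, 0xb0, 0x2c8])
print([hex(pointer) for pointer in chain])
```

The last element of the chain is 0: that is the value the instruction at +99 tries to dereference.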

2.2 - Crash : GDB for Kernel

Let’s check the structures involved in this bug :

crash> struct scsi_cmnd
struct scsi_cmnd {
    …
    unsigned int transfersize;
    struct request *request;
    unsigned char *sense_buffer;
    …
}

crash> struct request
struct request {
    …
    struct gendisk *rq_disk;
    …
}

crash> struct -xo gendisk
struct gendisk {
    …
    [0x2c0] struct request_queue *queue;
    [0x2c8] void *private_data;
    [0x2d0] int flags;
    …
}

So here we have a scsi_cmnd, which contains a “request”, which contains a “gendisk”.

Here are the addresses of our different instances :
ffff880c2d600ec0 = scsi_cmnd
ffff880c372afd00 = request
ffff880c2a4d0400 = gendisk

Offset 0x2c8 matches the code just before the crash :

0xffffffff813860dc <scsi_send_eh_cmnd+92>: mov rax,QWORD PTR [rax+0x2c8]

crash> struct gendisk.disk_name ffff880c2a4d0400
  disk_name = "sg96\000\000\000\000...\000"

Disk is /dev/sg96.
In "scsi_cmnd", the first element is a pointer to a scsi_device object (addr : 0xffff880c3002f000, the value loaded in RDX by "mov rdx,QWORD PTR [rdi]")

crash> scsi_device.vendor 0xffff880c3002f000
  vendor = 0xffff880c2a3a2ac8 "QUANTUM Scalar i6000 656Q656Q.GS01501 \001"

crash> scsi_device.model 0xffff880c3002f000
  model = 0xffff880c2a3a2ad0 "Scalar i6000 656Q656Q.GS01501 \001"

crash> scsi_device.rev 0xffff880c3002f000
  rev = 0xffff880c2a3a2ae0 "656Q656Q.GS01501 \001"

Red Hat bug : https://access.redhat.com/solutions/1231363
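The vendor, model and rev strings overlap because the three pointers land in the same buffer and the fields are fixed-width, not NUL-terminated: the addresses printed by crash are 8 bytes apart (vendor to model) and 16 bytes apart (model to rev). The sketch below slices such a buffer. The 4-byte width of rev and the space padding of the sample are assumptions (the transcript appears to have collapsed repeated spaces), and the function name is hypothetical.

```python
VENDOR_LEN = 0xffff880c2a3a2ad0 - 0xffff880c2a3a2ac8  # 8, from the pointers above
MODEL_LEN = 0xffff880c2a3a2ae0 - 0xffff880c2a3a2ad0   # 16, from the pointers above
REV_LEN = 4                                           # assumption, not shown by crash


def split_identity(raw):
    """Slice a fixed-width identification buffer into vendor, model and rev."""
    model_end = VENDOR_LEN + MODEL_LEN
    return {
        "vendor": raw[:VENDOR_LEN].rstrip(),
        "model": raw[VENDOR_LEN:model_end].rstrip(),
        "rev": raw[model_end:model_end + REV_LEN].rstrip(),
    }


# Assumed padding: each field is space-filled up to its width.
raw = "QUANTUM".ljust(8) + "Scalar i6000".ljust(16) + "656Q" + "656Q.GS01501"
print(split_identity(raw))
```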

3 - Live modifications

Yes, you can tinker with the kernel memory too !

Through /dev/mem, you can access memory… but not on most distributions.

Dave Anderson says : Defeat CONFIG_STRICT_DEVMEM with kretprobes
http://www.redhat.com/archives/crash-utility/2008-March/msg00036.html

/* Return-probe handler: force return value to be 1. */
static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
#if defined(__i386__) && !defined(__KERNEL__)
        regs->eax = 1;
#else
        regs->ax = 1;
#endif
        return 0;
}

3.1 - Live modifications - Network Parameters

Get the list of the NICs :
crash> net
   NET_DEVICE     NAME   IP ADDRESS(ES)
ffff88003e999020  lo     127.0.0.1
ffff88003e228020  eth0   192.168.122.13

Check the value (net_device)
crash> struct net_device.mtu ffff88003e228020
  mtu = 1500

Get the offset
crash> struct -o net_device.mtu ffff88003e228020
struct net_device {
  [ffff88003e22818c] unsigned int mtu;
}

Read the memory
crash> rd -32 -D ffff88003e22818c
ffff88003e22818c: 1500

And change it
crash> wr -32 ffff88003e22818c 1400

[root@centos6 ~]# ifconfig eth0 | grep -Po 'MTU:[0-9]+'
MTU:1400
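The address passed to `rd` and `wr` is just the structure address plus the offset of the field. A short worked example with the values of this slide, assuming a little-endian machine (x86_64); the variable names are hypothetical.

```python
import struct

NET_DEVICE = 0xffff88003e228020  # eth0, from `net`
MTU_ADDR = 0xffff88003e22818c    # from `struct -o net_device.mtu <address>`

# Offset of the mtu field inside struct net_device on this kernel build.
mtu_offset = MTU_ADDR - NET_DEVICE

# What `wr -32 <address> 1400` stores: a 32-bit little-endian integer.
new_mtu_bytes = struct.pack("<I", 1400)

print(hex(mtu_offset), new_mtu_bytes.hex())
```

The offset depends on the kernel build, so it has to be recomputed with `struct -o` for each kernel rather than reused from this slide.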

4 - Tools and useful links

4.1 - Tools : OpenGrok

Wicked fast source code browser
http://opengrok.github.io/OpenGrok/

Grok : "to understand intuitively or by empathy; to establish rapport with" / "to empathize or communicate sympathetically (with); also, to experience enjoyment"

Uses ctags and Lucene to index code with context : search for “text”, “definitions”, “symbols”, “file path” and “history”

Understands : Mercurial, Git, SCCS, RCS, CVS, Subversion, Teamware, ClearCase, Perforce, Monotone and Bazaar

4.2 - Tools : kdumptools

Set of scripts to ease your kdump usage (tries to work with all distributions)
https://github.com/saruspete/kdumptools

kdump_setup.sh      Helper: set up kdump on your distribution
kdump_analyze.sh    Helper: analyze a crashdump (retrieve dbg + crash)
kdump_live.sh       Helper: analyze your running system
kdump_getdbg.sh     Helper: retrieve debuginfos for a given OS / release
src/crash           Crash + compile scripts (latest version)
src/allow_devmem    Kernel module to allow /dev/mem usage

4.3 - Links - kdump

Kdump-Tool : kexec is part of kexec-tools
Sources : https://git.kernel.org/cgit/utils/kernel/kexec/kexec-tools.git
Distrib : https://kernel.org/pub/linux/utils/kernel/kexec/

Kernel Doc : http://www.kernel.org/doc/Documentation/kdump/kdump.txt

MakeDumpFile : Select the memory regions to be stripped from the dump
https://github.com/chitranshi/makedumpfile

Fence Kdump : Avoid kdump being interrupted by sending heartbeats
http://www.ovirt.org/Fence_kdump

4.4 - Links - crash

Official Page : Download, tools and help
http://people.redhat.com/anderson

Linux Crash Cook Book : Detailed step-by-step guide
http://www.dedoimedo.com/computers/crash-book.html

Defeating /dev/mem restrictions : How to tinker with /dev/mem
http://www.redhat.com/archives/crash-utility/2008-March/msg00036.html

Dwarf debuginfo format : Details on the DWARF format compatible with ELF binaries
http://dwarfstd.org

4.5 - Links - Kernel

Linux Insides : https://0xax.gitbooks.io/linux-insides

Understanding the Linux Kernel
ISBN 10 : 0-596-00565-2

Linux Kernel Development
ISBN 10 : 0-672-32946-8

Linux Kernel Architecture
ISBN 10 : 0-470-34343-5

The Linux Programming Interface
ISBN 10 : 1-59327-220-0

Thank you