chen haibo

37
Optimizing Crash Dump in Virtualized Environments Haibo Chen Assistant Processor Parallel Processing Institute Fudan University http://ppi.fudan.edu.cn/haibo_chen 1 Collaborative work with Yijian Huang and Binyu Zang

Upload: the-linux-foundation

Post on 20-May-2015

933 views

Category:

Technology


6 download

DESCRIPTION

Xen Summit Asia 2009

TRANSCRIPT

Page 1: Chen Haibo

Optimizing Crash Dump in

Virtualized Environments

Haibo Chen

Assistant Processor

Parallel Processing Institute

Fudan University

http://ppi.fudan.edu.cn/haibo_chen

1

Collaborative work with Yijian Huang and Binyu Zang

Page 2: Chen Haibo

Warning!!! Advertising ^.^

• Parallel Processing Institute, Fudan University

– Started from 1992

– http://ppi.fudan.edu.cn/

• Started research on Xen at the beginning of

2004

• Research focus

– Using virtualization to improve the dependabilitydependability

and scalabilityscalability of computing systems

2

Page 3: Chen Haibo

Xen-related Projects in PPI

3

OSOS

LUCOSLUCOS

ApplicationsApplications

MercuryMercuryCORCOR

EE

CORCOR

EE

NewNew OSOS

Mercury Mercury Combining Performance with Dependability Combining Performance with Dependability

Using SelfUsing Self--virtualizationvirtualization

LucosLucos Live Updating Operating Systems Using VirtualizationLive Updating Operating Systems Using Virtualization

Page 4: Chen Haibo

Xen-related Projects in PPI

4

OSOS

ApplicationsApplicationsCloud Cloud

ApplicationApplicationAttackAttack

CORCOR

EE

CORCOR

EE

ChaosChaos TamperTamper--Resistant Execution in an Untrusted OSResistant Execution in an Untrusted OS

Using A Virtual Machine MonitorUsing A Virtual Machine Monitor

Page 5: Chen Haibo

Xen-related Projects in PPI

5

OSOS

CORCOR

EE

CORCOR

EE

Cloud Cloud ApplicationApplication

ChaosChaos

ChaosChaos TamperTamper--Resistant Execution in an Untrusted OSResistant Execution in an Untrusted OS

Using A Virtual Machine MonitorUsing A Virtual Machine Monitor

Page 6: Chen Haibo

Xen-related Projects in PPI

6

OSOS

ApplicationsApplications

CORCOR

EE

CORCOR

EE

CORCOR

EE

CORCOR

EE

CORCOR

EE

CORCOR

EE

ManyMany--core core ApplicationApplication

CerberusCerberus Scaling ManyScaling Many--core Applications with Clustering of core Applications with Clustering of

Commodity OSesCommodity OSes

Page 7: Chen Haibo

Xen-related Projects in PPI

7

OSOS

ManyMany--corecoreApplicationApplication

CORCOR

EE

CORCOR

EE

CORCOR

EE

CORCOR

EE

CORCOR

EE

CORCOR

EE

CerberusCerberus

OSOS OSOS

system call virtualizationsystem call virtualization

CerberusCerberus Scaling ManyScaling Many--core Applications with Clustering of core Applications with Clustering of

Commodity OSesCommodity OSes

Page 8: Chen Haibo

Core Dump on System Crash

• System Crash

– Painful, but a fact of life

– Reboot the whole system for recovery

• Crash Dump (or Core Dump)

– Saving System States into Persistent Storage

• For future analysis to avoid recurring reboot

• Different level of dumps

– Minidump, Kernel Dump and Full Dump

8

Page 9: Chen Haibo

Core Dump and Recovery

• On system crash, We need both…

– Core dump – for future diagnosis

– Recovery – minimum downtime

9

Core Dump

Core Core

DumpDumpSystem Reboot

System System

RebootReboot

Service available again

SystemSystem

CrashCrash

Downtime

Page 10: Chen Haibo

Performance Implication

• High latency for full dump

• Low CPU/memory Utilization

– by I/O-Intensive core dump

– Low CPU Utilization

– Long-term full memory reservation

10

Core Dump

Core Core

DumpDumpSystem Reboot

System System

RebootReboot

Service available again

SystemSystem

CrashCrash

Page 11: Chen Haibo

Optimizing Opportunities in Xen

• Optimization opportunities in virtualized

environments

– Crash of a VM is not the end of the world

• VMM software is still alive

– Concurrent existence of multiple VMs

• Goal: minimize downtime for core dump &

recovery

11

Page 12: Chen Haibo

Optimizing Crash Dump in Xen

• Proposed optimizations

– Concurrent core dump & recovery

– Selective core dump by VM introspection

– Disk I/O Rate Control

12

Core Dump

Core Core

DumpDumpSystem Reboot

System System

RebootReboot

Service available again

SystemSystem

CrashCrash

Core Dump

Core Core

DumpDump

System Reboot

System System

RebootReboot

Downtime

Page 13: Chen Haibo

Outline

• Design

• Implementation

• Evaluation

• Conclusion

13

Page 14: Chen Haibo

Typical Crash Dump Tools

• Kdump

– Based on Kexec

– Load new kernel from reserved area on panic

• Xen

– “xc dump-core <domid>”

14

Page 15: Chen Haibo

Architecture

• Core dump by Xend

• Reboot-based recovery with recovery DomainU

• The 2 DomUs share the file system

– Retain persistent states

– Safe sharing15

VMMVMM

Dom0 DomUDomURecovery

DomU

Recovery

DomU

DaemonDaemon

Page 16: Chen Haibo

Concurrent Core Dump & Recovery

• Start recovery DomU while core dump still on-the-fly

• Break full reservation of memory by core dump

– CPU, I/O resources immediately released on crash

• Divide DomU memory into chunks

– Core dump, reallocate memory to recovery

DomainU in chunks

16

Page 17: Chen Haibo

Core Dump and Reallocation of Memory

DiskDisk

17

The Xen HypervisorThe Xen Hypervisor

0-16 MB00--16 MB16 MB0-16 MB00--16 MB16 MB

Core Dump Core Dump

by Xendby XendReallocation Reallocation

by Hypervisorby Hypervisor

0-32 MB00--32 MB32 MB

0-48 MB00--48 MB48 MB

Crashed DomU Recovery DomU

(booting)

0-32 MB00--32 MB32 MB

0-48 MB00--48 MB48 MB

Page 18: Chen Haibo

Selective Dump

• Selective Dump by VM Introspection

1. Access Memory of Crashed DomainU

2. Extract page descriptor array

3. Identify free pages

4. Skip core-dumping free pages

• Rely on the integrity of guest memory

management

18

Page 19: Chen Haibo

Selective Dump

• Introspect the page descriptor array in guest

Linux

– mem_map

• Identify free pages

– Whose _count == 0 in the page descriptor

• Skip core-dumping free pages

– in xc_domain_dumpcore_via_callback()

19

Page 20: Chen Haibo

Disk I/O Rate Control

• Allocating I/O bandwidth

– between concurrent core dump and recovery

• Tune the allocation policy by user

• Trade-off between

– Speed to release memory by core dump

– Speed to boot recovery DomainU

20

Page 21: Chen Haibo

Disk I/O Rate Control

• Available in enterprise virtualization products

• Scheduling in open-source Xen

– ionice in Domain0

– over kernel threads handling DomainU I/O in

Domain0

• Alternative

– Schedule in hypervisor, which is OS-agnostic

21

Page 22: Chen Haibo

Outline

• Design

• Implementation

• Evaluation

• Conclusion

22

Page 23: Chen Haibo

The Prototype

• Based on Xen 3.3.0

• A utility program in Xen-tools

– Get notified of DomainU crash by Xend

– Initiate core dump

– Start the recovery DomainU

• Optimized core dump routine

– Modified xc_domain_dumpcore_via_callback()

• Added a new hypercall

– Memory reallocation

23

Page 24: Chen Haibo

Outline

• Design

• Implementation

• Evaluation

• Conclusion

24

Page 25: Chen Haibo

Environment

• Xen 3.3.0; Debian Linux (kernel 2.6.28)

• TPC-W

– The TPC-W-lycos Java Implementation

• Machine

– Intel Core Duo 2

2.33 GHz

– 2 GB memory

– 320 SATA disk

25

Page 26: Chen Haibo

Methodology

• Domain0 – 512 MB

• Crashed / Recovery DomainU – 1024 MB

• Disable ballooning

• Event log

– Record moments of starting/ending of core dump/recovery

• Downtime = End of recovery – Start of dump

• Domain Resource consumption monitoring

26

Page 27: Chen Haibo

Concurrent Core Dump & Recovery

27

Page 28: Chen Haibo

Concurrent Core Dump & Recovery

28

core dumpcore dump

recoveryrecovery

downtimedowntimeMemoryMemory

Page 29: Chen Haibo

Selective Dump

# Pages Skipped

# Pages Core-dumped

Downtime

00 262144262144 4545

192474192474 6967069670 1919

29

* # Pages in Total = 262144

NOTE: not integrated with

concurrent dump

Page 30: Chen Haibo

Disk I/O Rate Control

• Make I/O contention occur

– Between core dump and the recovery DomainU

– Copy a small file when booting the recovery

DomainU

• Not memory-critical

• Case 1: recovery is given higher I/O priority

• Case 2: core dump is given higher I/O priority

30

Page 31: Chen Haibo

Disk I/O Rate Control

31

CaseCase--11� Recovery has higher I/O

priority� Downtime – 31 sec� Slow memory reallocation

CaseCase--22� Core Dump has higher I/O

priority� Downtime – 54 sec� Fast memory reallocation

Page 32: Chen Haibo

Outline

• Design

• Implementation

• Evaluation

• Conclusion

32

Page 33: Chen Haibo

Conclusion

• Optimize core dump in Xen

– Concurrent core dump & recovery

• Better hardware resource utilization

– Selective core dump by VM introspection

• Shorter core dump latency

– Disk I/O Rate Control

• Improve I/O QoS

• Future work

– Integrate the three techniques

– More tuned QoS control

– Compression on multi-core VM 33

Page 34: Chen Haibo

Optimizing Crash Dump in

Virtualized Environments

Haibo Chen

Assistant Processor

Parallel Processing Institute

Fudan University

http://ppi.fudan.edu.cn/haibo_chen

34

Collaborative work with Yijian Huang and Binyu Zang

Page 35: Chen Haibo

The End

•• Thank youThank you

35

Page 36: Chen Haibo

Selective Core Dump

• To access the page descriptor array

– Start with the virtual address of the 1st elem in array

• Symbol info stored in kernel image

– DomainU virtual address => DomainU phys address

• Linux linear mapping in kernel space

– DomainU phys address => machine address

• Xen p2m table of DomainU, accessible by Xend

– Map machine address

• xc_map_foreign_batch() in Xend

36

Page 37: Chen Haibo

Concurrent Core Dump & Recovery

1. Core dump tool grabs DomainU pages

2. Release reference by DomainU to its pages

1. Drop reference count to 1, by relinquish_memory()

3. for each chunk:

– Core_dump(chunk)

– Release(chunk)

• Drop its reference count to 0

• Reclaimed by hypervisor

– Add chunk to recovery DomainU

– Start recovery DomainU if 64 MB reclaimed in total