virtunoid : breaking out of kvm

Post on 23-Feb-2016

43 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Virtunoid : Breaking out of KVM. Nelson Elhage Black Hat USA 2011. Outline. Introduction Related work Background K nowledge Attack Detailed CVE 2011-1751 Bug Detailed Exploit Detailed (Take Control of %rip) Inject Shellcode into host Disable non executable page Bypassing ASLR - PowerPoint PPT Presentation

TRANSCRIPT

Nelson ElhageBlack Hat USA 2011

Virtunoid: Breaking out of KVM

IntroductionRelated workBackground KnowledgeAttack Detailed

CVE 2011-1751 Bug DetailedExploit Detailed (Take Control of %rip)Inject Shellcode into host Disable non executable pageBypassing ASLR

ConclusionsReference

Outline

It was found that the PIIX4 Power Management emulation layer in qemu-kvm did not properly check for hot plug eligibility during device removals. A privileged guest user could use this flaw to crash the guest or, possibly, execute arbitrary code on the host. (CVE-2011-1751)

Introduction

a generic and open source machine emulator and virtualizer.

Three components:Kvm.koKvm-intel.ko or kvm-amd.koQemu-kvm

QEMU-KVM Background Knowledge

The core KVM kernel moduleProvides ioctls for communicating the kernel

modulePrimarily responsible for emulating the

virtual CPU and MMUEmulates a few devices in-kernel for

efficiencyContains an emulator for a subset of x86 used

in handling certain traps

Kvm.ko [3]

Provides support for Intel’s VMX and AMD’s SVM virtualization extensions

Relatively small compared to the rest of KVM

Kvm-intel.ko/kvm-amd.ko

Provides the most direct user interface to KVM

Based on the classic x86 emulatorImplements the bulk of the virtual devices a

VM usesImplements a wide variety of possible devices

and busesAn order of magnitude more code than the

kernel module

Qemu-kvm

QEMU-KVM Background Knowledge

Static QEMUTimer *active_timers[QEMU_NUM_CLOCKS] Struct QEMUTimer { QEMUClock *clock; int64_t expire_time; QEMUTimerCB *cb; /* call back function*/ void *opaque; /* parameter */ struct QEMUTimer *next; /* link list */}

QEMUTimer (qemu_src/qemu_common.h)

QEMUTimer

Active_timers QEMUTimer

Related functions:Qemu_new_timer: allocate a memory region for the new

timer.Qemu_mod_timer: modify the current timer add it to link

list. Qemu_run_timers: loop through the link list and execute the

timer structure call back function with the opaque as the parameter

QEMUTimer(qemu_src/vl.c)

The main_loop_wait function will iterate through the active_timers and call qemu_run_timers()

QEMUTimer(qemu_src/vl.c)

A computer clock that keep track of the current time

MC146818 RTC hardware manual can be found

http://wiki.qemu.org/File:MC146818AS.pdf

Real Time Clock

RTCState structureStruct RTCState { ….. QEMUTimer *second_timer; QEMUTimer *second_timer2;}

Real Time Clock Emulations [2]

Related functions:Rtc_initfn : initialize the RTCRtc_update_second : update the expire time of

the QEMUTimer and add it to the link list.

Real Time Clock Emulations

rtc_initfn : RTCState *s = …. s->second_timer = qemu_new_timer(rtc_clock, rtc_updated_second, s) s->second_timer2 = qemu_new_timer(rtc_clock, rtc_update_second2, s) qemu_mod_timer(s->second_timer2, s->next_second_time)

Real Time Clock Emulations

RTCState and QEMUTimer

………

Second_timer

Second_timer2 opaqueNext

RTCState

Active_timer

QEMUTimer

QEMUTimer

opaqueNext

Cb

Cb

……….………..……….

……….………..……….

Rtc_update_second

Rtc_update_second2

Real Time Clock Emulations

A south bridge chip.Default south bridge chip used by qemu-kvmInclude ACPI, PCI-ISA, and an embeded MC146818

RTC.Support PCI device hotplug, write values to IO port

0xae08Qemu use qdev_free to emulate device hotplug.Certain devices don’t support device hotplug but qemu

didn’t check this.

PIIX 4

It should not be possible to unplug the ISA bridge

KVM’s emulated RTC is not designed to be unplugged.

PCI-ISA hot plug

Bug Detailed

Did not checkThe device can Be unplug or not

Being dealloc

Add the second timer to link list.

#include <sys/io.h>Int main(){ iopl(3); outl(2, 0xae08); return 0;}

Bug reproducer

………

Second_timer

opaqueNext

Cb

opaqueNext

Cb

……….………..……….

RTCState

QEMUTimer

QEMUTimer

Rtc_update_second

Active_timerUnplug RTC

opaqueNext

Cb

opaqueNext

Cb

……….………..……….

RTCState

QEMUTimer

QEMUTimer

Rtc_update_second

Active_timerUnplug RTC

Second_timer

Dummy memory region

………………

opaqueNext

Cb

opaqueNext

Cb

……….………..……….

RTCState

QEMUTimer

QEMUTimer

Rtc_update_second

Active_timerReturn to main_loop_waitCall qemu_run_timers

Second_timer

Dummy memory region

………………

opaqueNext

Cb

opaqueNext

Cb

……….………..……….

RTCState

QEMUTimer

QEMUTimer

Rtc_update_second

Active_timerQEMUTimer call backRtc_update_second(opaque)

Second_timer

Dummy memory region

………………

opaqueNext

Cb

opaqueNext

Cb

……….………..……….

RTCState

QEMUTimer

QEMUTimer

Rtc_update_second

Active_timerNext Main_loop_wait

Second_timer

Dummy memory region

………………

1. Inject a Controlled QEMUTimer into qemu-kvm

2. Eject ISA bridge3. Force an allocation into the freed

RTCState, with second timer point to our fake QEMUTimer

Exploit Detailed ( Take Control of %rip)

The guest RAM is backed by mmap()ed region inside the qemu-kvm process.

Allocate in the guest RAM and calculate the the host address by the following formula:Hva = physmem_base + gpagpa = page_traslation(gva) <= linux kernel project 1

Gva = guest virtual addressGpa = guest physical addressHva = host virtual addressPhysmem_base = mmap start regionFor now assume we know physmem_base(no aslr)

Inject a Fake QEMUTimer

Force qemu to call mallocUtilize the qemu-kvm user-mode networking

stackQemu-kvm implement DHCP server, DNS

server and NAT gateway in user-mode networking stack

User-mode stack normally handle packets synchronously

To prevent recursion, if a second packet is emitted while handling a first packet, the second packet is queued using malloc.

ICMP ping.

Force allocation into the freed RTCState

1. Allocate a Fake QEMUTimer2. calculate the Fake timer address3. unplug ISA bridge4. ping the gateway containing pointers to

your fake timer.

Putting it together

………

Second_timer

opaqueNext

Cb

opaqueNext

Cb

……….………..……….

RTCState

QEMUTimer

QEMUTimerRtc_update_second

Active_timerAllocate Fake QMEUTimer

opaqueNext

CbFake QEMUTimer

……….………..……….

Evil function(Shellcode)

opaqueNext

Cb

opaqueNext

Cb

……….………..……….

RTCState

QEMUTimer

QEMUTimerRtc_update_second

Active_timerUnplug ISA bridge

opaqueNext

CbFake QEMUTimer

……….………..……….

Evil function(Shellcode)

Second_timer

Ping the gateway

opaqueNext

Cb

opaqueNext

Cb

……….………..……….

RTCState

QEMUTimer

QEMUTimerRtc_update_second

Active_timerFirst Main_loop_wait

opaqueNext

CbFake QEMUTimer

……….………..……….

Evil function(Shellcode)

Second_timer

opaqueNext

Cb

RTCState

QEMUTimer

Active_timerSecond Main_loop_wait

opaqueNext

CbFake QEMUTimer

……….………..……….

Evil function(Shellcode)

Second_timer

1. we have %rip control 2. Where is the Evil function

Inject shellcode to host virtual memoryHost virtual memory has page protection(NX

bit)3. Solutions:

A. ROP B. something clever

What’s Next

1. we can control the QEMUTimer data structure.

2. create multiple QEMUTimer object and chain them together.

Multiple QEMUTimer Chaining

opaqueNext

CbopaqueNext

CbopaqueNext

Cb

……….………..……….

……….………..……….

……….………..……….

QEMUTimer

F1(X) F2(Y) F3(Z)

We now have multiple on argument function calls.

We want to do more arguments function calls. For example, mprotect take three arguments.

Multiple QEMUTimer Chaining

Arguments of types Bool, char, short, int, long, long long, and pointers are in the INTEGER class.

If the class is INTEGER, the next available register of the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9 is used

More detailed check out the reference 7

System V AMD64 ABI [7]

Suppose we can find a function with the following property.

Let f1(x) be set_rsi%rsi register will not be modified during

qemu_run_timer() in most qemu version.Therefore, F2(y) becomes F2(y,x) since we control the

%rsi from f1(x)

Multiple QEMUTimer Chaining

Set_rsi: movl %rdi, %rsi; return

This function will copy its first parameter to the second parameter of ioport_write

%rdi is the first parameter and %rsi is the second parameter. Therefore we get a function with the previous property. (Movl %rdi, %rsi)

Set_rsi for QEMUTimer Chaining

Void cpu_outl(pio_addr_t addr, uint32_t val) { ioport_write(2, addr, val);}

Mprotect prototype:Mprotect(addr, lens, prot)PROT_EXEC = 4

Use the following function

We control the “opaque/ioport” by QEMUTimer and control the “addr” by set_rsi()

Seems like we control everything in this function

Mprotect for QEMUTimer Chaining

Structure of IORange and IORangeOps

Allocate a fake IORangeOps with fake_ops->read = mprotect

Allocate a page-aligned IORange with Fake_ioport->ops = fake_ops Fake_ioport->base = -PAGE_SIZE

Copy shellcode following the IORangeConstruct a timer chain that calls

Cpu_outl(0, *)Ioport_readl_thunk(fake_ioport, 0)Fake_ioport + 1

Putting it All Together

opaqueNext

CbopaqueNext

CbopaqueNext

Cb

IORange (PAGE_ALIGN)

ops

Fill with shellcode

IORangeOps

Read

…….…….…….

…….…….…….

…….…….…….

Cpu_outl Ioport_readl_thunk

mprotectQEMUTimer Chain

The base address of the qemu-kvm binary, to find code address(such as mprotect ….)

Physmem_base, the address of the physical memory mapping inside kvm

Solutions:Find an information leakAssume non-PIE. Every major distribution compile qemu-

kvm as non position independent executable. How about physmem_base

Address

Emulated IO ports 0x510 (address) and 0x511 (data)

Used to communicate various tables to the qemu BIOS (e820 map, ACPI tables, etc)

Also provides support for exporting writable tables to the BIOS

However, fw_cfg_write doesn’t check if the target table is supposed to be writable

Fw_cfg

Several fw_cfg areas are backed by statically-allocated buffers.

Net result: nearly 500 writable bytes inside static variables.

Static data

Mprotect needs a page-aligned address, so these aren’t suitable for our shellcode

We can construct fake timer chains in this space to build a read4() primitive. (Create Information Leak)

Follow pointers from static variables to find physmem_base

Proceed as before

Read4 your way to victory

Sandbox qemu-kvmBuild qemu-kvm as PIELazily mmap/mprotect guest RAMXOR-encode key function pointersMore auditing and fuzzing of qemu-kvm

Possible hardening directions

VM breakouts aren’t magicHypervisors are just as vulnerable as

anything elseDevice drivers are the weak spot.

Conclusions

[1] http://qemu.weilnetz.de/qemu-tech.html [2] http://qemu.weilnetz.de/doxygen/structRTCState.html [3] http://www.linuxinsight.com/files/kvm_whitepaper.pdf [4] https://www.ibm.com/developerworks/cn/linux/l-virtio/ [5] http://smilejay.com/kvm_theory_practice/ [6] http://www.linux-kvm.org/page/Documents [7] http://www.cs.tufts.edu/comp/40/readings/amd64-abi.pdf [8]http://

linuxfromscratch.xtra-net.org/hlfs/view/unstable/glibc-2.4/chapter02/pie.html

qemu source code

Reference

top related