
Title - Software based Remote Attestation: measuring integrity of user

applications and kernels

Authors: Raghunathan Srinivasan1 (corresponding author), Partha Dasgupta1,

Tushar Gohad2

Affiliation: 1. School of Computing, Informatics and Decision Systems

Engineering, Arizona State University, Tempe, AZ, USA

2. MontaVista Software LLC

Address:

Email: [email protected]

Phone: (1) 480-965-5583

Fax: (1)-480-965-2751

Abstract:

This research describes a method known as Remote Attestation, which attests the integrity of a process using a trusted remote entity. Remote attestation has

mostly been implemented using hardware support. Our research focuses on

the implementation of these techniques based entirely on software utilizing

code injection inside a running process to attest its integrity.

A trusted external entity issues a challenge to the client machine and the client

machine has to respond to this challenge. The result of this challenge

provides the external entity with an assurance on whether or not the software

executing on the client machine is compromised. This paper also shows

methods to determine the integrity of the operating system on which software

based remote attestation occurs.

Keywords: Remote Attestation, Integrity Measurement, Root of Trust, Kernel

Integrity, Code Injection.

1. Introduction

Many consumers utilize security sensitive applications on a machine (PC)

along with other vulnerable software. Malware can patch various software in the system by exploiting these vulnerabilities. A regular commodity OS

consists of millions of lines of code (LOC) [1]. Device drivers usually range

in size between a few lines of code to around 100 thousand lines of code

(KLOC), with an average of 1 bug per device driver [2]. Another empirical

study showed that bugs in the kernel may have a lifetime of nearly 1.8 years

on average [3], and that there may be as many as 1000 bugs in the 2.4.1 Linux

kernel. The cumulative effect of such studies is that it is difficult to prevent

errors that can be exploited by malware. Smart malware can render anti-malware detection techniques ineffective by disabling them. Hardware detection schemes are considered non-modifiable by malware. However, mass

scale deployment of hardware techniques remains a challenge, and they also

have the stigma of digital rights management (DRM) attached. Another issue

with hardware measurement schemes is that software updates have to be

handled such that only legitimate updates get registered with the hardware. If

the hardware device offers an API to update measurements, malware can

attempt to use that API to place malicious measurements in the hardware. If

the hardware device is not updatable from the OS, then reprogramming has to

be performed on it to reflect updated measurements.

Software based attestation schemes offer flexibility and can be changed

quickly to reflect legitimate updates. Due to the ease of use and the potential

of mass scale deployment, software based attestation schemes offer significant

advantages over hardware counterparts. However, every software based

attestation scheme is potentially vulnerable to some corner case attack

scenario. In extreme threat model cases and cases where updates are rare,

network administrators can switch to using hardware based measurement

schemes. For the general consumer, software based schemes offer a

lightweight protocol that can detect intrusions prior to serious data losses.

Remote Attestation is a set of methods that allows an external trusted agent to

measure the integrity of a system. Software based solutions for Remote

Attestation schemes vary in their implementation techniques. Pioneer [4],

SWATT [5], Genuinity [6], and TEAS [7] are well known examples. In

TEAS, the authors prove mathematically that it is highly difficult for an

attacker to determine the response for every integrity challenge, provided the

code for the challenge is regenerated for every instance. However, TEAS does

not provide any implementation framework.

In Genuinity, a trusted authority sends executable code to the kernel on the untrusted machine, and the kernel loads the attestation code to perform the

integrity measurements. Genuinity has been shown to have some weaknesses

by two studies [8], [5]. However, the authors of Genuinity have since claimed

that these attacks may work only on the specific cases mentioned in the two

works; a regeneration of the challenge by the server would render the attacks ineffective [9].

This work is quite similar to Genuinity with certain differences in technique.

Like Genuinity, this work focuses on the importance of regenerating code that

performs integrity measurement of an application on the client. We do not

utilize Operating System support to load the challenge; the application has to receive the code and execute it. In addition, this paper also deals with the

problem of what we term a ‘redirect’ attack where an attacker may direct the

challenge to a different machine.

The attestation mechanisms presented in this work use the system call

interface of the client platform. Due to this, the problem of determining

integrity of an application on a client platform is split into two orthogonal

problems. The first involves determining the integrity of the user application

in question by utilizing system calls and software interrupts. The orthogonal

problem is determining the integrity of the system call table, interrupt

descriptors, and the Text section of the kernel that runs on the client platform.

For the first problem, it is assumed that the system calls will produce the

correct results. Rootkits are assumed to be absent from the system. We

assume that there may be various other user level applications on the client

platform that may attempt to tamper with the execution of the challenge. For

the second problem, this paper presents a scheme where an external entity can

determine the state of the OS Text section, System call Table, and the

Interrupt Descriptor table on the client machine. It can be noted that the

external entities obtaining the integrity measure for the application and the OS

can be different.

The solution in this paper is designed to detect changes made to the code

section of a process. This allows the user (Alice) to determine whether one

application is clean on the system. The same technique can be extended to

every application on the system to determine whether all installed applications

are clean. Trent is a trusted entity who has knowledge of the structure of an

un-tampered copy of the process (P) to be verified. Trent may be the

application vendor or Trent may be an entity that offers attestation services for

various applications. It should be noted that Trent only needs to know the

contents and behavior of the clean program image of P to generate challenges.

Trent provides executable code (C) to Alice (the client/ end user), which Alice

injects into P. C takes overlapping MD5 hashes over the sub-regions of P and

returns the results to Trent. Trent has to be a trusted agent as the client

downloads program code or performs certain operations based on Trent’s

instructions. If Trent is not trusted then Alice cannot run the required code

with certainty that it will not compromise Alice’s machine (MAlice).
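The overlapping measurement can be sketched as follows. This is an illustrative sketch, not Trent's actual code: the hash function is a simple FNV-1a stand-in for MD5, and the window size and half-window step are made-up parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for MD5 (FNV-1a); the actual scheme hashes with MD5. */
static uint32_t hash_region(const uint8_t *p, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) { h ^= p[i]; h *= 16777619u; }
    return h;
}

/* Hash overlapping windows of the code section of P: each window starts
   half a window after the previous one, so interior bytes are covered by
   two windows and a single-byte patch perturbs at least one hash. */
static size_t attest_overlapping(const uint8_t *code, size_t len,
                                 size_t win, uint32_t *out) {
    size_t n = 0;
    for (size_t off = 0; off + win <= len; off += win / 2)
        out[n++] = hash_region(code + off, win);
    return n;
}
```

In the real protocol, the resulting hash array is what C returns to Trent for comparison against the measurement of his clean copy of P.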

C is newly generated randomized code that executes on the user end to

determine the integrity of an application on an x86 based platform. This

ensures that an attacker cannot determine the results of the integrity

measurement without executing C. Trent places some programming

constructs in C that ensure that C is difficult to execute in a sandbox or a

controlled environment. A software protocol means that there exists

opportunity for an attacker (Mallory) to forge results. The solution provided

in this paper protects itself from the following attacks.

Replay Attack: Mallory may provide Trent forged results by replaying the

response to a previous attestation challenge. To prevent this scenario, Trent

changes the operations performed in every instance of C. This is done by

placing some lines in the source code of C that depend on various constants.

C is recompiled for every attestation request. These constants are generated

prior to code compilation using random numbers. Consequently, the outputs of these measurements change with every change of constant. The

code produced by Trent forces Mallory to monitor and adapt the attack to suit each challenge. We utilize the concept that program analysis of

obfuscated code is complex enough to prevent attacks [7].
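The role of the baked-in constants can be illustrated with a sketch; K1 and K2 below are illustrative placeholders that Trent would replace with fresh random values before each compilation, not values from the actual system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative per-challenge constants: Trent substitutes fresh random
   values here before compiling each instance of C, so the measurement
   output changes from one challenge to the next. */
#define K1 0x3C6EF372u
#define K2 0x9E3779B9u   /* odd, so multiplication is invertible mod 2^32 */

static uint32_t keyed_measure(const uint8_t *p, size_t len) {
    uint32_t m = K1;
    for (size_t i = 0; i < len; i++)
        m = (m ^ p[i]) * K2;   /* result depends on the baked-in constants */
    return m;
}
```

Because the constants live in the compiled instructions rather than in the input, a replayed response computed against an earlier instance of C no longer matches what Trent expects.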

Tampering: Mallory may analyze the operations performed by the challenge

to return forged values. Trent places dummy instructions, randomizes

locations of variables, and places some self modifying instructions to prevent

static analysis of the application. It must be noted that self modifying code is

normally not permitted in the Intel x86 architecture as the code section is

protected against writes. However, we use a Linux OS call ‘mprotect’ to

change the protections on the code section of the process in which C executes

to allow this feature. Furthermore, Trent also maintains a time threshold by

which the results are expected to be received; this reduces the window of

opportunity for Mallory to launch a successful attack.
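The mprotect step can be sketched as below; make_code_writable is an illustrative helper name, and the real C applies the same call to the page holding its own instructions. Note that hardened kernels enforcing W^X policies may refuse the write-plus-execute combination.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Make the page containing addr writable as well as readable and
   executable, so the instructions on it can be patched at run time. */
static int make_code_writable(void *addr) {
    long pagesz = sysconf(_SC_PAGESIZE);
    uintptr_t page = (uintptr_t)addr & ~((uintptr_t)pagesz - 1);
    return mprotect((void *)page, (size_t)pagesz,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}
```

mprotect requires a page-aligned starting address, which is why the helper masks the low bits of the target address first.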

Redirect: Mallory may re-direct the challenge from Trent to a clean machine

or execute it in a sandbox which will provide correct integrity values as the

response to Trent. The executable code sent by Trent obtains machine

identifiers to determine whether it executed at the correct machine. It also

executes certain tests to determine if it was executed inside a sandbox. C

communicates multiple times to Trent while executing tests on P. This makes

it harder for Mallory to prevent C from executing. These techniques are

discussed in detail in section 5.

For obtaining the integrity measurement of the OS Text section, the attestation

service provider Trent′ provides executable code (Ckernel) to the client OS

(OSAlice). OSAlice receives the code into a kernel module and executes it.

It is assumed that OSAlice has means such as Digital Signatures to verify that

Ckernel did originate from Trent′. The details of implementation of this scheme

are in section 7.

The rest of the paper is organized as follows. Section 2 contains a review of

the related work. Section 3 describes the problem statement, threat model and

assumptions made in this solution. Section 4 describes the overall design of

the system; section 5 describes the obfuscation techniques used in creating C.

Section 6 describes the implementation of the application attestation system,

section 7 describes the implementation of kernel runtime measurements and

section 8 concludes the paper.

2. Related Work

Code attestation involves checking if the program code executing within a

process is legitimate or has been tampered with. It has been implemented using

hardware, virtual machine and software based detection schemes. In this

section we discuss these schemes as well as methods to perform program

analysis and obfuscation techniques available in literature.

2.1 Hardware based integrity checking

Some hardware based schemes operate off the TPM chip provided by the

Trusted Computing Group [10],[11], [12], while others use a hardware

coprocessor which can be placed into the PCI slot of the platform [13], [14].

The schemes using the TPM chip involve the kernel or an application executing on the client obtaining integrity measurements and providing them to the TPM; the TPM signs the values with its private key and may forward them to an external agent for verification. The coprocessor based schemes read

measurements on the machine without any assistance from the OS or the CPU

on the platform, and compare measurements to previously stored values. The

hardware based scheme can allow a remote (or external) agent to verify

whether the integrity of all the programs on the client machine is intact or not.

Hardware based schemes have a stigma of DRM attached to them, may be

difficult to reprogram and are not ideally suited for mass deployment. The

TPM based schemes have little backward compatibility in that they do not work on legacy systems which lack a TPM chip.

Integrity Measurement Architecture (IMA) [15] is a software based integrity

measurement scheme that utilizes the underlying TPM on the platform. The

verification mechanism does not rely on the trustworthiness of the software on

the system. IMA maintains a list of hash values of all possible executable

content that is loaded in the system. When an executable, library, or kernel

module is loaded, IMA performs an integrity check prior to executing it. IMA

measures values while the system is being loaded; however, it does not provide means to determine whether a program already in execution has been tampered with in memory. IMA also relies on being called by the OS when any application is

loaded; it relies on the kernel functions for reading the file system, and relies

on the underlying TPM to maintain an integrity value over the measurement

list residing in the kernel. Due to this, each new measurement added to a

kernel-held measurement list results in a change required for values stored in

the Platform Configuration Register (PCR) of the TPM security chip on the

system.

2.2 Virtualization based Integrity checking

Virtualization implemented without hardware support has been used for

security applications. This form of virtualization was implemented prior to

large scale deployment of platforms containing in built hardware support for

virtualization. Terra uses a trusted virtual machine monitor (TVMM) and

partitions the hardware platform into multiple virtual machines that are

isolated from one another [16]. Hardware dependent isolation and

virtualization are used by Terra to isolate the TVMM from the other VMs.

Terra implements a scheme where potentially every class of operation is

performed on a separate virtual machine (VM) on the client platform. Terra is

installed in one of the VMs and is not exposed to external applications like

mail, gaming, and so on. The TVMM is provided the role of a Host OS. The

root of trust in Terra is present in the hardware TPM; the TPM takes

measurements on the boot loader, which in turn takes measurements on the

TVMM. The TVMM takes measurements on the VMs prior to loading them.

Terra relies on the underlying TPM to take some measurements. Most

traditional VMM based schemes are bulky and need significant resources on

the platform to appear transparent to the end user; this holds true for Terra, where the authors advocate multiple virtual machines.

2.3 Integrity checking using hardware assisted virtualization

Hardware support for virtualization has recently been deployed in widely used x86 consumer platforms. Intel and AMD have come out with Intel VT-x

and AMD-V which provide processor extensions where a system

administrator can load certain values in the hardware to set up a VMM and

execute the operating system in a guest environment. The VMM runs in a

mode that has higher privileges than the guest OS and can therefore enforce

access control between multiple guest operating systems and also between

application programs inside an OS. The system administrator can also set up

events in the hardware which cause the control to exit from the guest OS to

the VMM in a trap and emulate model. The VMM can take a decision based

on the local policy whether to emulate or ignore the instruction.

VIS [17] is a hardware based virtualization scheme which determines the

integrity of client programs that connect to a remote server.  VIS contains an

Integrity Measurement Module which reads the cryptographically signed

reference measurement (manifest) of a client process.  VIS verifies the

signature in a scheme similar to X.509 certificate measurement and then takes

the exact same measurements on the running client process to determine

whether it has been tampered.  The OS loader may perform relocation of

certain sections of the client program, in which case the IMM reverses these

relocations using information provided in the manifest and then obtains the

measurement values.  VIS requires that the pages of the client programs are

pinned in memory (not paged out).  VIS restricts network access during the

verification phase to prevent any malicious program from bypassing

registration.  VIS does not allow the client programs unrestricted access to

network before the program has been verified.

2.4 Software based integrity measurement schemes

Genuinity [6] implements a remote attestation system in which the client

kernel initializes the attestation for a program. It receives executable code and

maps it into the execution environment as directed by the trusted authority.

The system maps each page of physical memory into multiple pages of virtual

memory creating a one to many relationship between the physical and virtual

pages. The trusted external agent sends a pseudorandom sequence of

addresses; the Genuinity system then takes the checksum over the specified

memory regions. Genuinity also incorporates various other values like the

Instruction and Data TLB miss counts, and counters that record the number of branches and instructions executed. The executable code performs various

checks on the client kernel and returns the results to a verified location in the

kernel on the remote machine, which returns the results back to the server.

The server verifies that the results are in accordance with the checks performed; if so, the client is verified. This protocol requires OS support on the remote

machine for many operations including loading the attestation code into the

correct area in memory, and obtaining hardware values such as TLB miss counts. A commodity OS hosts many applications; requiring OS support or a kernel module for each specific application is a major overhead.

In Pioneer [4] the verification code resides on the client machine. The verifier

(server) sends a random number (nonce) as a challenge to the client machine.

The result returned as response determines if the verification code has been

tampered or not. The verification code then performs attestation on some

entity within the machine and transfers control to it. This forms a dynamic

root of trust in the client machine. Pioneer assumes that the challenge cannot

be redirected to another machine on a network; however, in many real world

scenarios a malicious program can attempt to redirect challenges to another

machine which has a clean copy of the attestation code. In its checksum

procedure, it incorporates the values of Program Counter and Data Pointer,

both of which hold virtual memory addresses. An adversary can load another

copy of the client code to be executed in a sandbox like environment and

provide it the challenge. This way an adversary can obtain results of the

computation that the challenge produces and return it to the verifier. Pioneer

also assumes that the server knows the exact hardware configuration of the

client for performing a timing analysis; this restricts the client from upgrading or changing hardware components. In TEAS [7] the authors

propose a remote attestation scheme in which the verifier generates program

code to be executed by the client machine. Random code is incorporated in

the attestation code to make analysis difficult for the attacker. The analysis

provided by them proves that it is very unlikely that an attacker can clearly

determine the actions performed by the verification code; however

implementation is not described in the research.

A Java Virtual Machine (JVM) based root of trust method has also been

implemented to attest code [18]. The authors implement programs in Java

and modify the JVM to attest the runtime environment. However, the JVM

has known vulnerabilities and is itself software that operates within the

Operating System, and hence is not a suitable candidate for checking integrity.

SWATT [5] implements a remote attestation scheme for embedded devices.

The attestation code resides on the node to be attested. The code contains a

pseudorandom number generator (PRG) which receives a seed from the

verifier. The attestation code includes memory areas which correspond to the

random numbers generated by PRG as part of the measurement to be returned

to the verifier. The obtained measurements are passed through a keyed MAC

function; the key for each instance of the MAC operation is provided by the

verifier. The problem with this scheme is that if an adversary obtains the seed

and the key to the MAC function, the integrity measurements can be spoofed

as the attacker would have access to the MAC function and the PRG code.
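SWATT's traversal can be sketched as follows; the xorshift generator and the mixing step below are simplified stand-ins for the PRG and keyed MAC the scheme actually uses, so this illustrates the structure of the protocol rather than its exact construction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Seeded PRG: a simple xorshift stand-in for SWATT's actual generator. */
static uint32_t xorshift32(uint32_t *s) {
    *s ^= *s << 13;
    *s ^= *s >> 17;
    *s ^= *s << 5;
    return *s;
}

/* Fold pseudo-randomly chosen bytes of the attested memory into a keyed
   checksum; the seed and key come from the verifier for each challenge. */
static uint32_t swatt_checksum(const uint8_t *mem, size_t len,
                               uint32_t seed, uint32_t key, int rounds) {
    uint32_t s = seed ? seed : 1u;   /* xorshift must not start at zero */
    uint32_t c = key;
    for (int i = 0; i < rounds; i++) {
        size_t off = xorshift32(&s) % len;   /* unpredictable traversal */
        c = ((c + mem[off]) * 0x01000193u) ^ (c >> 7);
    }
    return c;
}
```

Because the traversal order depends on the verifier's seed, the attestation code cannot precompute the answer; this is also why leaking the seed and key to an attacker breaks the scheme, as the text notes.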

2.5 Attacks against software based attestation schemes

Genuinity has been shown to have weaknesses by two works [8], [5]. In [8] it

is described that Genuinity would fail against a range of attacks known as

substitution attacks. The paper suggests placing attack code on the same

physical page as the checksum code. The attack code leaves the checksum

code unmodified and writes itself to the zero-filled locations in the page. If the

pseudo random traversal maps into the page on which the imposter code is

present, the attack code redirects the challenge to return byte values from the

original code page. Authors of Genuinity countered these findings by stating

that the attack scenario does not take into account the time required to extract

test cases from the network, analyze it, find appropriate places to hide code

and finally produce code to forge the checksum operations [9]. The attacks

were specifically constructed against one instance of the checksum generation,

and would require complex re-engineering to succeed against all possible test

cases. This would require a large scale code base to perform the attack. Such

a large code base would not be easy to hide.

In [5] it is suggested that Genuinity has a problem of mobile code where an

attacker can exploit vulnerabilities of mobile code as code is sent over the

network to be executed on the client platform. In addition, the paper also

states that Genuinity reads 32 bit words for performing a checksum and hence

will be vulnerable if the attack is constructed to avoid the lower 32 bits of

memory regions. These two claims are countered by the authors of Genuinity

[9]. The first is countered by stating that Genuinity incorporates public key

signing which will prevent mobile code modifications by an attacker, while

the second is countered by stating that Genuinity reads 32 bits at a time, and

not the lower 32 bits of an address.

A generic attack on software checksum based operations has been proposed

[19]. This attack is based on installing a kernel patch that redirects data

accesses of integrity measurement code to a different page in the memory

containing a clean copy of the code. This attack constitutes installation of a

rootkit to change the page table address translation routine in the OS.

Although this scheme potentially defeats many software based techniques, the

authors have themselves noted that it is difficult for this attack to work on an

x86 based 64-bit machine, which does not use segmentation; this is because the

architecture does not provide the ability to use offsets for code and data

segments. Moreover, an attack like this requires the installation of a kernel

level rootkit that continuously redirects all read accesses to different pages in

memory. The attestation scheme presented in this paper for the user

application cannot defend itself against this attack, however, the scheme

presented in this work to determine the integrity of the kernel is capable of

detecting such modifications. In addition, Pioneer [4] suggests a workaround for this class of attacks: multiple virtual address aliases create extra entries in the page table, which will eventually lead the OS to flush out the spurious pages.

2.6 Program analysis and code obfuscation

Program analysis requires disassembly of the code and generation of the control flow graph (CFG). The Linux tool ‘objdump’ is one of the simplest linear

sweep tools that perform disassembly. It moves through the entire code once,

disassembling each instruction as and when encountered. This method suffers

from a weakness: it misinterprets data embedded inside instructions, hence carefully constructed branch statements induce errors [20]. Linear sweep is

also susceptible to insertion of dummy instructions and self modifying code.

Recursive Traversal involves decoding executable code at the target of a

branch before analyzing the next executable code in the current location. This

technique can also be defeated by opaque predicates [21] where one target of a

branch contains complex instructions which never execute [22].
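A minimal example of an opaque predicate, written in C for illustration: x*x + x = x*(x + 1) is always even, so the guarded branch can never execute, yet a disassembler cannot easily prove it dead and must still decode whatever it protects.

```c
#include <assert.h>
#include <stdint.h>

/* Opaque predicate: x*x + x = x*(x + 1) is always even, and this also
   holds modulo 2^32, so the guarded branch below never executes. */
static int opaque_guarded(uint32_t x) {
    if (((x * x + x) & 1u) != 0u) {
        return -1;   /* dead branch: could hide misleading instruction bytes */
    }
    return 1;
}
```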

CFG generation involves identifying blocks of code such that they have one

entry point and only one branch instruction with target addresses. Once

blocks are identified, branch targets are identified to create a CFG. Compiler

optimization techniques such as executing instructions in the delay slot of a

branch cause issues for the CFG and require iterative procedures to generate an accurate CFG. The execution time of these algorithms is non-linear, O(n²) [23].

2.7 Kernel integrity measurement schemes

An attacker can compromise any measurements taken by a user level program

by installing a kernel level rootkit. The kernel provides file system, memory

management and system calls for user applications. The remote attestation

scheme as implemented in this work requires kernel support. This section

describes prior work done in implementing kernel integrity measurement.

Co-processor schemes that are installed on the PCI slot of the PC have been

used to measure the integrity of the kernel as mentioned in section 2.1. One

scheme [13] computes the integrity of the kernel at installation time and stores

this value for future comparisons. The core of the system lies in a co-

processor (SecCore) that performs integrity measurement of a kernel module

during system boot. The kernel interrupt service routine (SecISR) performs

integrity checks on a kernel checker and a user application checker. The

kernel checker proceeds with attesting the entire kernel .TEXT section and

modules. The system determined during installation that, on the machine used for building the prototype, the .TEXT section began at virtual address 0xC0100000, which corresponded to the physical address 0x00100000; measurements begin at this address.
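The correspondence between those two addresses follows from the fixed kernel mapping offset on 32-bit Linux; a one-line sketch, with PAGE_OFFSET set to the conventional value implied by the addresses above:

```c
#include <assert.h>
#include <stdint.h>

/* On the 32-bit Linux split described above, kernel virtual addresses map
   to physical addresses at a fixed offset below PAGE_OFFSET. */
#define PAGE_OFFSET 0xC0000000u

static uint32_t kvirt_to_phys(uint32_t vaddr) {
    return vaddr - PAGE_OFFSET;
}
```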

Another work focuses on developing a framework for classifying rootkits

[24]. The authors state that there are three classes of rootkits, those that

modify system call table, those that modify targets of system calls, and those

that redirect references to the system call table by redirecting to a different

location. A kernel level rootkit may perform these actions by using

the /dev/kmem device file; an example of such a rootkit is the knark rootkit [25].

The rootkit detector keeps a copy of the original System.map file and

compares the current system call table’s addresses with the original values. A

difference between the two tables indicates system call table modification.

This system of detecting changes to the system call table detected the presence of the knark rootkit, which modifies 8 system calls. The framework also detects

rootkits like SucKIT [26] which overwrite kernel memory to create a fake

system call table. Any user access to the system calls re directs to the new

table. The rootkit checker determines if the current system call table starts at

a location different from the original address, in which case a compromise is

detected.
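The table comparison at the heart of this detector reduces to an element-wise address check; a sketch, with purely illustrative addresses standing in for the values read from System.map and from the live table:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compare the system call addresses recorded at install time (e.g. from
   System.map) against the addresses currently in the table; any mismatch
   indicates a hooked or redirected entry. */
static int count_hooked(const uintptr_t *recorded,
                        const uintptr_t *current, size_t n) {
    int hooked = 0;
    for (size_t i = 0; i < n; i++)
        if (recorded[i] != current[i]) hooked++;
    return hooked;
}
```

A rootkit like SucKIT that substitutes an entire fake table is caught by the same idea applied to the table's base address rather than its entries.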

LKIM [27] obtains hashes and contextual measurements to determine the

integrity of the platform. In addition to taking hash measurements on the kernel Text section and system call table, LKIM also takes measurements on other

descriptors such as inodes, executable file format handlers, Linux security

model hooks and so on. The measurements taken are defined by a set of

measurement instructions. The paper states that there is no silver bullet to

prevent the Linux OS from forging results; hence the authors propose a hypervisor based

scheme instead of a native OS scheme. The hypervisor scheme involves

changing Xen’s domain U to host the LKIM infrastructure. The domain

hosting LKIM is provided Domain 0 privileges.

3. Threat model and Assumptions

We assume that Mallory, an attacker, has complete control over software

residing on Alice’s machine and Mallory possesses the power to start a clean

copy of Alice’s installed program P to execute it in a controlled environment

to return results to Trent. Mallory can also attempt to re-direct the challenge

to another machine which runs a clean copy of P. We assume that Mallory

will not perform transient attacks like patching P with malicious code at any

given time t and then at a later time t + ∆ replacing the old instructions and removing any modifications. This behavior can be classified as rootkit like

behavior which will not be determined by the application level remote

attestation. However, a rootkit like this would get detected in the kernel level

remote attestation as described in section 7.

We assume that Alice will trust the code provided by Trent and allow it to

execute on the machine to be verified, and that Alice has means such as

certificates and digital signatures to verify that the verification code (C) has

been generated by Trent. We also assume that Alice is not running MAlice

behind a NAT and that the machine has only one network interface. The

reason to make these assumptions is that C takes measurements on MAlice to

determine if it is the same machine that contacted Trent. If MAlice is behind a

NAT then Trent would see the request coming from a router and

measurements from MAlice. This work focuses on the general client platform

where only one network interface is installed, and each network interface has

only one IP address associated with it. In the case that there are many

addresses configured on the same network interface, the code can be altered to

populate all possible IP addresses that it reads from the interface and send

them to Trent. Trent can parse through the result to find the matching IP

address.

For the user application attestation part, this work does not assume a

compromised kernel. The verification code C relies on the kernel to handle

the system calls executed through interrupts, and to read the file structure

containing the open connections on the system. There are many system call

routines in the Linux kernel and monitoring and duplicating the results of each

of these may be a difficult task for malware. Reading the port file structure

also requires support from the operating system. We will assume that the OS

provides correct results when the contents of a directory and file are read out.

Without this assumption, Remote Attestation cannot be performed entirely

without kernel support.

For the kernel attestation part, we assume that the kernel is compromised;

system call tables may be corrupted, and malware may have changed the interrupt descriptors. Runtime code injection is performed on a kernel module

to measure the integrity of the kernel. It is assumed that Alice has means such

as digital certificates to determine that the code being injected is generated by

a trusted server. It is also assumed that the trusted server is the OS vendor or

a corporate network administrator who has knowledge of the OS mappings for

the client.

4. Overview of operations to be performed on the client end

If Alice could download the entire copy of P every time the program had to be executed, then Remote Attestation would not be required. However, since P is an installed application, Alice will have customized certain profile options and saved data that would be cumbersome to recreate every time.

Alice uses P to contact Trent for a service; Trent returns to P a challenge in the form of executable code (C). P must inject C into its virtual memory and

execute it at a location specified by Trent. C computes certain measurements

and communicates integrity measurement value M1 directly to Trent. This

process is depicted in Fig. 1. Trent has a local copy of P on which the same

sets of tests are executed as above to produce a value M0. Trent compares M1

and M0; if the two values are the same, then Alice is informed that P has not been tampered with. This raises the issue of verifiable code execution, in which

Trent wants to be certain that C took its measurements on P residing inside

MAlice. To provide this guarantee C executes some more tests on MAlice and

returns their results to Trent. These checks ensure that C was not bounced to

another machine, and that it was not executed in a sandbox environment

inside a dummy P process within MAlice.

There are many ways in which Mallory may tamper with the execution of C.

Mallory may substitute values of M1 being sent to Trent such that there is no

evidence of any modification to P having taken place. It is also possible that

Mallory may have loaded another untampered copy of P inside a sandbox, executed C within it, and provided the results back to Trent.

Mallory may have also redirected the challenge to another machine on the

network making it compute and send the responses back to Trent. Without

addressing these issues, it is not possible for Trent to correctly determine

whether the measurements accurately reflect the state of P on MAlice. If Trent

can determine that C executed on MAlice, and C was not executed in a sandbox

then Trent can produce code whose results are difficult to guess and the

results can indicate the correct state of P. Achieving these guarantees requires that C provide Trent with a machine identifier and a process identifier.

Trent can retain a sense of certainty that the results are genuine by producing

code that makes it difficult for Mallory to pre-compute results. Once these

factors are satisfied, Trent can determine whether P on MAlice has been

tampered. The entire process of Remote Attestation is shown in Fig. 2.

4.1 Determining checksum and MD5 on P

C computes an MD5 hash of P to determine if the code section has been tampered with. Downloading the MD5 code is an expensive operation as the code size is fairly large, and the MD5 code cannot be randomized as it may lose its properties. For these reasons, the MD5 code permanently resides in P. To prevent Mallory from exploiting this aspect, a two-phase hash protocol is

implemented. Trent places a mathematical checksum inside C which

computes the checksum on the region of P containing the MD5 executable

code along with some other selected regions. Trent computes the results of

the checksum locally and verifies if C is returning the expected value. C

proceeds with the rest of the protocol if Trent responds in affirmative.

Trent changes the operations of the checksum in every instance so that

Mallory cannot use prior knowledge to predict the results of the mathematical

operations. C does not take the checksums over fixed-size regions; instead, Trent divides the entire area over which the checksum is taken into multiple overlapping sub-regions. The boundaries of the sub-regions are defined inside C by Trent by moving the data pointer back by a random number that is

generated during compilation of the C source code. For the prototype implementation, the random numbers were generated with the 'rand' call; since 'rand' alone may not be truly random, we seeded it with 'srand', using the current stack pointer of the code-generating program as the seed. The stack of every process is randomized using Address Space Layout Randomization (ASLR) [28]. It can be noted

that this is not as secure as using a cryptographically secure random number

generator. In real world applications, Trent can use the Linux ‘/dev/random’

file [29] to read random numbers.
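As an illustrative sketch of that alternative, the random back-step can be read from the kernel entropy pool; /dev/urandom (the non-blocking counterpart of /dev/random) is used here, and the function name is ours, not the prototype's:

```c
#include <stdint.h>
#include <stdio.h>

/* Read one random back-step in [1, max] from /dev/urandom instead of
 * seeding rand() with the stack pointer. Falls back deterministically
 * if the device cannot be read. Illustrative sketch only. */
uint32_t random_backstep(uint32_t max) {
    uint32_t v = 0;
    FILE *f = fopen("/dev/urandom", "rb");
    if (f) {
        if (fread(&v, sizeof v, 1, f) != 1)
            v = 0;                     /* fall back to a fixed step */
        fclose(f);
    }
    return (v % max) + 1;              /* always in [1, max] */
}
```

Trent would call such a routine once per sub-region boundary while generating the C source.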

The individual checksums are then combined and sent to Trent. This is

depicted in Fig. 3. C performs MD5 hash on overlapping sub-regions of P

defined in a similar fashion as above. A degree of obfuscation is added by

following the procedure in Fig. 4. C initially takes the MD5 hash of the first

sub-region (H1). It then obtains the MD5 hash of the next sub-region (H2). It

then concatenates the two values to produce H1H2. Then a MD5 Hash of H1H2

is taken to produce H12. H12 is then concatenated with H3 to produce H12H3.

H12H3 is hashed again to produce H123, and so on. This process is followed for

all the sub-regions and sent to Trent.
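The chaining of Fig. 4 can be sketched as follows. A 32-bit FNV-1a hash stands in for MD5 only to keep the sketch self-contained; chained_hash, its parameters, and the fixed back-step are illustrative, not the prototype's actual code:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for MD5 in this sketch: 32-bit FNV-1a. In the prototype the
 * resident MD5 code of P plays this role. */
static uint32_t hash32(const uint8_t *p, size_t n) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 16777619u; }
    return h;
}

/* Chain hashes over overlapping sub-regions of [data, data+len):
 * H12 = H(H1 || H2), H123 = H(H12 || H3), and so on. `back` is the
 * back-step that makes consecutive sub-regions overlap. */
uint32_t chained_hash(const uint8_t *data, size_t len,
                      size_t sub, size_t back) {
    uint32_t acc = 0;
    size_t start = 0;
    int first = 1;
    while (start < len) {
        size_t n = (start + sub <= len) ? sub : len - start;
        uint32_t h = hash32(data + start, n);
        if (first) { acc = h; first = 0; }
        else {
            uint8_t buf[8];
            memcpy(buf, &acc, 4);          /* acc || h ...       */
            memcpy(buf + 4, &h, 4);
            acc = hash32(buf, 8);          /* ... hashed again   */
        }
        start += (sub > back) ? (sub - back) : 1;  /* overlap by `back` */
    }
    return acc;
}
```

With a single sub-region the chain degenerates to one hash of the whole range, which matches the description above.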

Drawing inferences from executable code is considered difficult as discussed

in section 2. Randomizing the boundary overlaps between the sub-regions

makes it difficult to predict the hash values being generated. Mallory has to

execute the code to observe the computation being performed. The

checksums are taken on overlapping sub regions to make the prediction of

results more difficult for Mallory. This creates multiple levels of

indeterminacy for an attack to take place. Mallory has to not only predict the

boundaries of the sub-regions, but has to also deal with the overlap among the

sub-regions. Overlapping checksums also ensure that if, by accident, the sub-regions are defined identically in two different versions of C, the results of

computation produced by C are still different. This also ensures that some

random sections of P are present more than once in the checksum, making it

more difficult for Mallory to hide any modifications to such regions.

The MD5 checksum has been used in this prototype even though collisions have been discovered for it. However, MD5 can easily be substituted with a different hashing algorithm in a software-based attestation scheme; the same cannot be done easily in a TPM or hardware-based attestation scheme.

4.2 Determining process identifiers.

C determines whether it was executed inside a fake process or the correct P

process by obtaining some identifiers. C determines the number of processes

having an open connection to Trent on MAlice. This is obtained by determining

the remote address and remote port combinations on each of the port

descriptors in the system. C communicates to Trent using the descriptor

provided by P and does not create a new connection. This implies that in an

ideal situation there must be only one such descriptor on the entire system,

and the process utilizing it must be the process under which C is executing.

The passing of the socket descriptor from P to C also partially addresses the issue of redirecting the challenge to another machine. The only way for such a

connection to exist on a machine is if Trent accepts the incoming request,

otherwise the machine will not have a socket descriptor with the ability to

communicate with Trent.

If there is more than one process having such a connection then an error

message is sent to Trent. If there is only one such process, C computes its

own process id and compares the two values. If they match an affirmative

message is sent to Trent. If the values do not match then it reports an error

with an appropriate message to Trent.

4.3 Determining the Identifier for MAlice

C has to provide Trent the guarantee that it was not re-directed to another

machine and that it was not executed in a sandbox environment or pasted on

another clean copy of P within MAlice. The first is achieved by obtaining any

particular unique machine identifier. In this case the IP address of the

machine can serve as an identifier. Trent has received a request from Alice

and has access to the IP address of MAlice. If C returns the IP address of the machine it is executing on, Trent can determine whether both are the same machine. It can be argued that IP addresses are dynamic; however, there is little possibility that any machine will change its IP address in the small time window between Alice's request and the measurements being taken and provided to Trent. C determines the IP address of MAlice using System

Interrupts. Mallory will also find it hard to tamper with the results of an

Interrupt. The interrupt ensures that the address present on the Network

interface is correctly reported to Trent. It can again be noted that Mallory

may have changed the address of the network interface to match that of MAlice,

but as these machines are not behind a NAT it would be quite difficult for

Mallory to provide the identical address to another machine on an external

network and communicate with that machine. On receiving the results of the

four tests, Trent knows that P has not been tampered from the time of

installation to the time of request of verification being sent from MAlice.

5. Design of Checksum code produced by Trent

Trent has to prevent Mallory from analyzing the operations performed by C.

Trent places a series of obfuscations inside the generated code along with a

time threshold (T) by which the response from MAlice is expected. If C does not respond within the stipulated period of time (allowing for network delays), Trent will know that something went wrong at MAlice. This includes denial-of-service attacks, in which case Trent will inform Alice that C is not communicating back.

Fig. 5 shows a sample snippet of the C mathematical checksum code. The

send function used in the checksum snippet is implemented using inline ASM.

It is evident that in order to forge any results, Mallory must determine the

value of checksum2 being returned to Trent. This requires that Mallory

identifies all the instructions modifying checksum2 and the locations on stack

that it uses for computation. To prevent Mallory from analyzing the injected

code, certain obfuscations are placed in C as discussed below:

5.1 Changing execution flow and locations of variables on stack

To prevent Mallory from utilizing knowledge about a previous instance of C

in the current test, Trent changes the checksum operations performed by

selecting mathematical operations on memory blocks from a pool of possible

operations and also changes the order of the instructions. The results of these

operations are stored temporarily in the stack. Trent changes the pointers on

the stack for all the local variables inside C for every instance. These steps

prevent Mallory from successfully launching an attack similar to those used

for HD-DVD key stealing [30, 31].

5.2 Inserting Dummy Instructions

Program analysis is a non-linear operation, as discussed in section 2. An

increase in the number of instructions that Mallory has to analyze decreases

the time window available to forge the results of these operations. Trent

inserts instructions that never execute and also inserts operations that are

performed on MAlice but not included as part of the results sent back to Trent.

These additions to the code make it difficult for Mallory to correctly analyze

C within a reasonable period of time.
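One way such never-executing instructions can be constructed is with opaque predicates; the paper does not name a specific construction, so the sketch below assumes one, using the always-even product x*(x+1) (the function and constants are illustrative):

```c
#include <stdint.h>
#include <stddef.h>

/* Dead-code insertion via an opaque predicate: x*(x+1) is the product
 * of two consecutive integers and is therefore always even (this also
 * holds under unsigned wraparound), so the decoy branch never runs,
 * yet an analyzer that cannot prove the predicate false must follow
 * both paths. The trailing computation is performed but never sent. */
uint32_t checksum_with_decoys(const uint8_t *p, size_t n, uint32_t x) {
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) sum += p[i];
    if ((x * (x + 1)) % 2u != 0u) {        /* opaque: never true    */
        sum ^= 0xDEADBEEFu;                /* decoy, never executed */
    }
    uint32_t scratch = sum * 2654435761u;  /* computed, not reported */
    (void)scratch;
    return sum;
}
```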

5.3 Changing instructions during execution

Mallory may perform static analysis on the executable code C sent by Trent.

A good disassembler can provide significant information on the instructions

being executed, and allow Mallory to determine when system calls are made

and when function calls are made. In addition it may also allow Mallory to

see the area of code which reads memory recursively. If these tools do not

have access to the code to be executed before it actually executes, then

Mallory cannot determine the operations performed by C. Trent removes

some instructions in C while sending the code to MAlice and places code inside

C with data offsets such that during execution, this section in C changes the

modified instructions to the correct values. This way without executing C it is

difficult for Mallory to determine the exact contents of C.
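The restore step described above can be sketched as a small patch table applied before the payload runs. In the prototype the writes land on code pages already made writable; here a plain byte buffer stands in for the code, and all names are illustrative:

```c
#include <stdint.h>
#include <stddef.h>

/* Trent ships C with some instruction bytes mangled, plus an
 * (offset, value) table; the restore loop writes the original bytes
 * back before the rest of C executes. */
struct patch { size_t offset; uint8_t value; };

void restore_instructions(uint8_t *code, const struct patch *tbl, size_t n) {
    for (size_t i = 0; i < n; i++)
        code[tbl[i].offset] = tbl[i].value;   /* undo the mangling */
}
```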

6. Implementation of user application attestation

In this section the implementation of the techniques proposed in this paper is described. All the coding was done in the C language on Intel x86 architecture machines running the Linux kernel, using the gcc compiler.

6.1 Generation of C by Trent

Trent generates C for every instance of a verification request. If Trent sent out the same copy of the verification code each time, Mallory could gain significant knowledge of the individual checks performed by C; by generating new code for every instance of verification, Trent mitigates this possibility. Trent also

places obfuscations inside the code to prevent static analysis of the executable

code. The operations performed by Trent to obfuscate the operations

performed during verification are discussed below.

6.1.1 Changing execution flow and locations of variables on stack

Changing execution flow and locations of stack serves to prevent the program

analysis on C. The source code of C was divided into four blocks which are

independent of each other. Trent assigns randomly generated sequence

numbers to the four blocks and places them accordingly inside C source code.

The checksum block is randomized by creating a pool of mathematical

operations that can be performed on every memory location and selecting

from the pool of operations. The pool is created by substituting each mathematical operation with another mathematical operation acting on the exact same location.
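The pool-based selection can be sketched with function pointers; the three operations and the per-word choice vector below are assumptions for illustration, not the prototype's actual pool:

```c
#include <stdint.h>
#include <stddef.h>

typedef uint32_t (*op_fn)(uint32_t acc, uint32_t v);

static uint32_t op_add(uint32_t a, uint32_t v) { return a + v; }
static uint32_t op_xor(uint32_t a, uint32_t v) { return a ^ v; }
static uint32_t op_rot_add(uint32_t a, uint32_t v) {
    return ((a << 3) | (a >> 29)) + v;    /* rotate-left 3, then add */
}

/* Trent picks one operation per memory word from this pool when
 * generating C; here the random choice is passed in as an index
 * sequence fixed at generation time. */
static const op_fn pool[] = { op_add, op_xor, op_rot_add };

uint32_t pooled_checksum(const uint32_t *mem, size_t n,
                         const uint8_t *choice) {
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = pool[choice[i] % 3](acc, mem[i]);
    return acc;
}
```

Two different choice sequences over the same memory generally yield different results, which is exactly what prevents Mallory from reusing a previous instance.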

Once the mathematical operations are selected in the C source code, Trent

changes the sub-regions for the checksum code and the MD5 calling

procedure. This is done by replacing the numbers defining the sub-regions. C has sub-regions defined in its un-compiled code. To randomize the sub-regions, a pre-processor is executed on the un-compiled C such that it changes

the numbers defining the sub-regions. The numbers are generated such that

the sub-regions overlap by a random value.

C allocates space on the local stack to store computational values. Instead of

utilizing fixed locations on the stack, Trent replaces all variables inside C with

pointers to locations on the stack. To allocate space on the stack Trent

declares a large array of type ‘char’ of size N, which has enough space to hold

contents of all the other variables simultaneously. Trent executes a pre-

processor which assigns locations to the pointers. The pre-processor

maintains a counter which starts at 0 and ends at N-1. It randomly picks a

pointer to be assigned a location and assigns it to the value on the counter and

increments the counter using the size of the corresponding variable in

question. This continues until all the pointers are assigned a location on the

stack. Trent compiles C source code to produce the executable after placing

these obfuscations.
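The slot-assignment pass of the pre-processor can be sketched as follows, with the random permutation supplied as input (function and parameter names are illustrative):

```c
#include <stddef.h>

/* All former locals become pointers into one char arena; slots are
 * handed out in a randomized order by walking a permutation and
 * advancing a counter by each variable's size. `order` is the random
 * permutation Trent fixes at generation time. Returns the bytes
 * consumed, or 0 if the arena would overflow. */
size_t assign_slots(size_t arena_size, const size_t *sizes,
                    const size_t *order, size_t nvars, size_t *offsets) {
    size_t counter = 0;                   /* runs from 0 toward N-1 */
    for (size_t i = 0; i < nvars; i++) {
        size_t v = order[i];              /* variable taking the next slot */
        if (counter + sizes[v] > arena_size) return 0;
        offsets[v] = counter;
        counter += sizes[v];              /* advance by the variable size */
    }
    return counter;
}
```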

6.1.2 Obfuscating instructions executed

Mallory cannot obtain a control flow graph (CFG) or perform program analysis on the executable code of C, provided the instructions being executed by C cannot be determined. Trent changes the instructions inside

the executable code such that they cause analysis tools to produce incorrect

results. C contains a section (Crestore) which changes these modified

instructions back to their original contents when it executes. Crestore contains

the offset from the current location and the value to be placed inside the

offset. Trent places information to correct the modified instructions inside

Crestore. Crestore is executed prior to executing other instructions inside C and

Crestore corrects the values inside the modified instructions.

6.2 Execution of C on Client’s Machine

The executable code is received by the Client’s (Alice) machine. The

received information contains the length of the code and the location where it

should be placed and executed. Normally it is not possible to introduce new

code into a process during run time. However Alice’s software (P) can use a

Linux library call to place C at the required location and execute the code. C

communicates the results of the verification back to Trent without relying on

P. The details of its execution are discussed below.

6.2.1 Injection of code by P on itself

P makes a connection request to Trent. Trent grants the request and provides

the number of bytes of challenge to be received and follows it with providing

the executable code of C. Trent also sends the information on the location

inside P where C should be placed. P receives the code and prepares the area

for injection by executing the library utility mprotect on the area. The code

section of a process on the Intel x86 architecture is write-protected. This utility changes the protection on the specified area of the code section and allows this area to be overwritten with new values. Once the injection is

complete P creates a function pointer which points to the address of the

location where the code was injected and calls the function using the pointer,

transferring control to C.

6.2.2 Obtaining measurements on the target machine

C obtains certain identifiers on MAlice that allow Trent to identify whether it

indeed executed at the correct machine and process. These identifiers have to

be located outside the process space of P; therefore C computes the following

values in order to send them to Trent. The IP address of MAlice, mathematical

checksum on the MD5 code residing inside P, MD5 hash values of

overlapping sub-regions inside P, and the process state that allows C to

determine whether it was executed inside a sandbox.

The first involves identifying the machine on which it is executing. Trent

received an incoming connection from Alice; hence it is possible to keep track of the IP address of MAlice. Although most IP addresses are dynamic, there is

little probability of an IP address changing in the small time window between

a request being sent and C taking its measurements. C does not utilize the

system call libraries to obtain values. It utilizes interrupts to execute system

calls. This involves loading the stack with the correct operands for the system

call, placing the system call number in the A register and the other registers

and executing the interrupt instruction. The sample code for creating a socket

is shown in Fig. 6.
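A portable approximation of this pattern is the syscall(2) wrapper, shown here for getpid; the prototype itself loads the registers and issues int $0x80 inline, which this sketch does not reproduce:

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* Invoke getpid by system call number rather than through the usual
 * libc convenience function, standing in for the prototype's inline
 * `int $0x80` sequence. */
long raw_getpid(void) {
    return syscall(SYS_getpid);
}
```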

Reading the IP address involves creating a socket on the network interface and obtaining the address from the socket by means of another system call, ioctl.

The obtained address is in the form of an integer which is converted to the

standard A.B.C.D format. After this, the address is sent to Trent using the

send routine inside the socketcall system call. It must be noted that the send is

done using the socket provided by P and not using a new socket. This is done

so that Mallory cannot bounce C to another machine. If Mallory did that, then

Mallory must provide an existing connection to Trent. However as

connections to any machine can exist only with Trent’s knowledge, this

situation cannot arise.

Trent verifies the address of the machine and sends a response to C which

then proceeds to take checksum on some portions of the code and follows up

with an MD5 hash of the entire code section. As discussed in section 4.2 and

6.3, the sub-regions are defined randomly and such that they overlap. C sends

the checksum and MD5 results to Trent utilizing the system interrupt method

for send as discussed above. C obtains the pid of the process (P0) under which it is executing using the system interrupt for getpid. It then locates all

the remote connections established to Trent from MAlice. This is done by

reading the contents of the '/proc/net/tcp' file. The file has the structure shown

in Fig. 7.

As seen in the figure, there is remote address and port information for every connection, which allows C to identify any open connection to Trent. Once all

the connections are identified, C utilizes the inode of each socket descriptor to locate any process utilizing it. This is done by scanning the '/proc/<pid>/fd' folder for all the running processes on MAlice. In the ideal situation there should be only one process id (P1) utilizing the identified inode. If it encounters more than one such process, then it sends an error message back to Trent. Once the process id P1 is obtained, C checks whether the id P0 and the id P1 are the same. If so, C sends an affirmative to Trent.

These measurements allow Trent to be certain that C executed on P residing

on MAlice.
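Parsing one /proc/net/tcp line for the fields C needs (remote address, remote port, socket inode) can be sketched as follows; the format string mirrors the layout of Fig. 7, addresses appear as little-endian hex on x86, and the function name is illustrative:

```c
#include <stdio.h>
#include <string.h>

/* Extract the remote endpoint and the socket inode from one line of
 * /proc/net/tcp. The address field is the hex image of the in-memory
 * big-endian word, so on x86 the low byte of the parsed value is the
 * first octet of the dotted-quad form. Returns 0 on success. */
int parse_tcp_line(const char *line, char ip[16],
                   unsigned *rport, unsigned long *inode) {
    unsigned laddr, lport, raddr;
    /* fields: sl local rem st tx:rx tr:tm retrnsmt uid timeout inode */
    if (sscanf(line, "%*d: %x:%x %x:%x %*x %*x:%*x %*x:%*x %*x %*d %*d %lu",
               &laddr, &lport, &raddr, rport, inode) != 5)
        return -1;
    snprintf(ip, 16, "%u.%u.%u.%u", raddr & 0xFF, (raddr >> 8) & 0xFF,
             (raddr >> 16) & 0xFF, (raddr >> 24) & 0xFF);
    return 0;
}
```

C would call this for each line, keep the entries whose remote endpoint matches Trent, and then search '/proc/<pid>/fd' for the matching inode.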

7. Remote kernel attestation

To measure the integrity of the kernel we implement a scheme which is

similar to the user application attestation scheme. Trent′ is a trusted server

who provides code (Ckernel) to MAlice. It is assumed that Alice has means such

as digital signature verification scheme to determine whether Ckernel was sent

by Trent′. Alice receives Ckernel using a user level application Puser, verifies

that it was sent by Trent’ and places it in the kernel of the OS executing on

MAlice. Ckernel is then executed which obtains integrity measurements (Hkernel)

on the OS Text section, system call table, and the interrupt descriptors table.

Ckernel passes these results to Puser, which returns these results to Trent′. If

required Ckernel can encrypt the integrity measurement results using a one time

pad or a simple substitution cipher, however as the test case generated is

different in every instance, this is not a required operation. Figure 8 depicts

this process. Trent′ also provides a kernel module Pkernel that provides ioctl

calls to Puser. As seen in figure 8a, Puser receives Ckernel from Trent′. In figure

8b, Puser forwards the code to Pkernel. It is assumed that Pkernel has the ability to

verify that the code was sent by Trent′. Pkernel places the received code in its

code section at a location specified by Trent′ and executes it. Ckernel obtains an

arithmetic and MD5 checksum on the specified regions of the kernel on MAlice

and returns the results to Puser as seen in figure 8c. Puser then forwards the

results to Trent′ who determines whether the measurements obtained from the

OS on MAlice match with existing computations (figure 8d). Since Trent′ is an

OS vendor or a corporate network administrator, it can be assumed that Trent′

has local access to a pristine copy of the kernel executing on MAlice to obtain

expected integrity measurement values generated by Ckernel. Although it may seem that Trent′ would need unbounded memory to keep track of every client, most OS installations are identical off-the-shelf builds. In addition, if Trent′ is the system administrator of a number of machines on a corporate network, Trent′ would have knowledge of the OS on every client machine.

7.1 Implementation

The kernel attestation was implemented on an x86-based 32-bit Ubuntu 8.04

machine executing the 2.6.24-28-generic kernel. In Linux, an identical copy of the kernel is mapped into every process in the system. Since we use the

system calls, and software interrupts for the application attestation part, this

section describes the integrity measurement of the text section (which contains

the code for system calls and other kernel routines), the system call table and

the interrupt descriptor table.

The /boot/System.map-2.6.24-28-generic file on the client platform was used

to locate the symbols to be used for kernel measurement. The kernel text

section was located at virtual address 0xC0100000; the end of the kernel text section was at 0xC03219CA, which corresponded to the symbol

'_etext'. The system call table was located at 0xC0326520; the next symbol in the maps file was at 0xC0326B3C, a difference of 1564 bytes. The 'arch/x86/include/asm/unistd_32.h' file for the kernel build showed the number of system calls to be 337. Since MAlice was a 32-bit system, the space required for the address mappings would be 1348 bytes. We took integrity

measurements from 0xC0326520 - 0xC0326B3B. The Interrupt descriptor

table was located at 0xc0410000 and the next symbol was located at

0xC0410800, which gives the IDT a size of 2048 bytes. A fully populated IDT has 256 entries of 8 bytes each, giving a 2 KB IDT; this is consistent with the System.map file on the client machine.
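As a quick restatement of the arithmetic behind these sizes (the constants are copied from the System.map values above):

```c
#include <stdint.h>

/* Region sizes follow directly from the symbol addresses: the syscall
 * table region and the IDT size are address differences, and 337
 * four-byte entries account for 1348 of the table's 1564 bytes. */
#define SYSCALL_TABLE 0xC0326520u
#define NEXT_SYMBOL   0xC0326B3Cu
#define IDT_BASE      0xC0410000u
#define IDT_END       0xC0410800u

uint32_t syscall_region_bytes(void)  { return NEXT_SYMBOL - SYSCALL_TABLE; }
uint32_t syscall_entries_bytes(void) { return 337u * 4u; }
uint32_t idt_bytes(void)             { return IDT_END - IDT_BASE; }
```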

Trent′ also provides a kernel module (Pkernel) to the client platform which is

installed as a device driver for a character device. Pkernel offers functionalities

using the ioctl call. Puser receives the code from the trusted authority and

opens the char device. Puser then executes an ioctl which allows the kernel

module to receive the executable code. As in the user application attestation

case, Trent′ does not send the MD5 code for every attestation instance. Instead

the trusted authority sends a driver code which populates a data array and

provides it to the MD5 code which stays resident on Pkernel. To prevent

Mallory from exploiting this, the trusted authority also provides an arithmetic

checksum computation routine which is downloaded for every attestation

instance. This provides a degree of extra unpredictability to the results

generated by the integrity measurement code.

Kernel modules are relocated when they are loaded. This means that Trent′ would not know where the MD5 code ends up after installation of the module. In order to execute the MD5 code, Trent′ requests the

location of MD5 function in the kernel module from the client end. After

obtaining the address, Trent′ generates the executable code Ckernel which has

numerous calls to the MD5 code. At generation, the call address may not

match the actual function address at the client end. Once Ckernel is generated,

the call instructions are identified in the code and the correct target address is

patched on the call instruction. Once this patching is done, Trent′ sends the

code to the client end. The call address calculation is done as follows:

    call_target = -((address_injected_driver + call_locations[0] +
                     length_of_call) - address_mdstring);
    code_in_file[call_locations[0] + 1] = call_target;

Ckernel is loaded into a char array code_in_file. The location where Ckernel is to be injected is determined by Trent′ by selecting one of a number of 'nop' locations in the module; this address is termed address_injected_driver

in the above code snippet. The call location in the generated executable code

is determined by scanning the code for the presence of the call instruction.

The length of the call instruction is a constant value that depends on the current architecture. Finally, the address of mdstring (which is the location of

MD5 code) is obtained from the client machine as described above. The

second statement changes the code array by placing the correct target address.

This procedure is repeated for all the call instructions in the generated code. It

must be noted that Ckernel calls only the MD5 code and no other function. If

obfuscation is required, Trent′ can place some junk function calls which get

executed by evaluating an ‘if statement’. Trent′ can construct several if

statements such that they never evaluate to true. It can be noted that even if

the client does not communicate the address of the MD5 code, Pkernel can be

designed such that the MD5 driver provided by the trusted authority and the

MD5 code reside on the same page. This means that the higher 20 bits of the

address of the MD5 code and the downloaded code will be the same and only

the lower 12 bits would be different. This allows the Trent′ to determine

where Ckernel will reside on the client machine, and automatically calculate the

target address for the MD5 code. This is possible because the C compiler produces the lower 12 bits of function addresses while creating a kernel module and allows the higher 20 bits to be populated during module insertion.
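The patching loop can be sketched as follows; patch_call computes the rel32 displacement of a near call (opcode 0xE8) exactly as in the snippet above, the little-endian byte order matches x86, and the names are illustrative:

```c
#include <stdint.h>
#include <string.h>

/* Patch a near call at offset `call_off` inside the injected buffer so
 * that it lands on the resident MD5 code: rel32 = target - (address of
 * the next instruction), where the call instruction is 5 bytes long.
 * `base` is where the buffer will live in the module
 * (address_injected_driver in the snippet above). */
void patch_call(uint8_t *code, uint32_t base, uint32_t call_off,
                uint32_t target) {
    int32_t rel = (int32_t)(target - (base + call_off + 5));
    memcpy(code + call_off + 1, &rel, 4);   /* rel32 follows the 0xE8 */
}
```

Trent′ repeats this for every call location found while scanning the generated code.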

Once the code is injected, Trent′ issues a message to the user application

requesting the kernel integrity measurements. Puser executes another ioctl

which causes the Pkernel to execute the injected code. Ckernel reads various

memory locations in the kernel and passes the data to the MD5 code. The

MD5 code returns the MD5 checksum value to Ckernel which in turn returns the

value to the ioctl handler in the Pkernel. Pkernel then passes the MD5 and

arithmetic checksum computations back to Puser which forwards the results to

the Trent′.

If required, the disable-interrupts instruction can be issued by Ckernel to prevent any other process from taking hold of the processor. It must be noted that in multiprocessor systems the disable-interrupts instruction may not prevent a second processor from smashing kernel integrity measurement values.

However, as the test cases are different for every attestation instance, Mallory

may not gain anything by smashing integrity measurement values.

8. Results

The time threshold (T) is an important parameter in this implementation. We

aim to prevent an attacker Mallory from intercepting C and providing fake

results to Trent. If T is too large then Mallory may be able to obtain some

information about the execution of C. The value of T must take into account

network delays. Network delays between cities in IP networks are of the

order of a few milliseconds [32]. Hence measuring the overall time required

for one instance of Remote Attestation and adding a few seconds to the

execution time can suffice for the value of T.

We obtained the source code for the VLC media player interface [33]. We

removed some sections of the interface code and left close to 1000 lines of C

code in the program. We measured various stages of the integrity

measurement process. We took two pairs of machines running Ubuntu 8.04. One pair were legacy machines with an Intel Pentium 4 processor and 1 GB of RAM; the second pair were Intel Core 2 Quad machines with 3 GB of RAM. The tests measured were the time taken to

generate code including compile time, time taken by the server to do a local

integrity check on a clean copy of the application and time taken by the client

to perform the integrity measurement and send a response back to the server.

To obtain an average measurement for code generation, we executed the program in a loop 1000 times and measured the elapsed time with a stopwatch. We also measured the time reported by the system clock and found only a slight variation (on the order of 1 second) between the time perceived by the human eye and that reported by the system clock at the end of the loop. The time taken to compile the freshly generated code was measured similarly. These two times are reported in Table 1.

We then executed the integrity measurement code C locally on the server and

sent it to the client for injection and execution. The time taken on the server is the compute time the code takes to generate the integrity measurement on the server, as both machines in each pair had the same configuration.

These times are reported in table 2. It must be noted that the client requires a

higher threshold to report results because it has to receive the code from the

network stack, inject the code, execute it, return results back through the

network stack to the server. Network delays also affect the time threshold.

We can see from the two tables that it takes on the order of a few hundred
milliseconds for the server to generate code, while the integrity measurement
itself is very lightweight and returns results in a few milliseconds. Code
generation is therefore the dominant overhead. However, the server need not
generate new code for every client connection: it can regenerate the
measurement code once per second and ship the same integrity measurement
code to all clients connecting within that second, alleviating the workload on
the server. A value for T can be computed from the tables, taking the required
network hops into consideration, and set to a value less than 5 seconds.
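The per-second caching scheme just described can be sketched as follows. The buffer size and the `regenerate` body are placeholders (the real generator emits fresh measurement code); the point is that regeneration happens at most once per wall-clock second, regardless of how many clients connect.

```c
#include <string.h>
#include <time.h>

#define CODE_SIZE 4096

static char   cached_code[CODE_SIZE];
static time_t cached_at = 0;

/* Placeholder for the actual code generator; fills the buffer with
 * x86 NOPs (0x90) purely for illustration. */
static void regenerate(char *buf, size_t len)
{
    memset(buf, 0x90, len);
}

/* Return the measurement code for the current second, regenerating
 * only when the cached copy belongs to an earlier second. Every
 * client connecting within the same second receives the same code. */
static const char *get_measurement_code(void)
{
    time_t now = time(NULL);
    if (now != cached_at) {
        regenerate(cached_code, CODE_SIZE);
        cached_at = now;
    }
    return cached_code;
}
```

This trades a one-second window of code reuse across clients for a large reduction in server-side generation cost, which is acceptable since T itself is measured in seconds.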

9. Conclusion and Future work

This paper presented a method for implementing Remote Attestation entirely
in software. We also surveyed a number of other schemes in the literature
that address the problem of program integrity checking. We reduced the
window of opportunity for the attacker Mallory to provide fake results to the
trusted authority Trent by applying various forms of obfuscation and
generating new executable code for every run. We implemented this scheme
on the Intel x86 architecture and set a time threshold for the response.

As future work, we plan to implement this scheme using virtualization
extensions. We also plan to extend this work to determine whether the client
process continued executing after a successful Remote Attestation.

References

[1] Web link. In brief and statistics: The H open source. Retrieved on October 4, 2010, http://www.h-online.com/open/features/What-s-new-in-Linux-2-6-35-1047707.html?page=5

[2] T. Ball, E. Bounimova, B. Cook, V. Levin, J. Lichtenberg, C. McGarvey, B. Ondrusek, S. K. Rajamani and A. Ustuner, "Thorough static analysis of device drivers," ACM SIGOPS Operating Systems Review, vol. 40, pp. 73-85, 2006.

[3] A. Chou, J. Yang, B. Chelf, S. Hallem and D. Engler, "An empirical study of operating systems errors," in Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles, 2001, pp. 73-88.

[4] A. Seshadri, M. Luk, E. Shi, A. Perrig, L. Van Doorn and P. Khosla, "Pioneer: Verifying code integrity and enforcing untampered code execution on legacy systems," in ACM SIGOPS Operating Systems Review, 2005, pp. 1-16.

[5] A. Seshadri, A. Perrig, L. van Doorn and P. Khosla, "SWATT: SoftWare-based ATTestation for embedded devices," in 2004 IEEE Symposium on Security and Privacy, 2004, pp. 272-282.

[6] R. Kennel and L. H. Jamieson, "Establishing the genuinity of remote computer systems," in Proceedings of the 12th USENIX Security Symposium, 2003, pp. 295-308.

[7] J. A. Garay and L. Huelsbergen, "Software integrity using timed excutable agents," in Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, 2006, pp. 189-200.

[8] U. Shankar, M. Chew and J. D. Tygar, "Side effects are not sufficient to authenticate software," in Proceedings of the 13th USENIX Security Symposium, 2004, pp. 89-102.

[9] R. Kennel and L. H. Jamieson, "An Analysis of proposed attacks against GENUINITY tests," CERIAS Technical Report, Purdue University, 2004.

[10] F. Stumpf, O. Tafreschi, P. Röder and C. Eckert, "A robust integrity reporting protocol for remote attestation," in Second Workshop on Advances in Trusted Computing (WATC’06 Fall), 2006.

[11] R. Sailer, X. Zhang, T. Jaeger and L. Van Doorn, "Design and implementation of a TCG-based integrity measurement architecture," in SSYM'04: Proceedings of the 13th Conference on USENIX Security Symposium, 2004, pp. 223-228.

[12] K. Goldman, R. Perez and R. Sailer, "Linking remote attestation to secure tunnel endpoints," in STC '06: Proceedings of the First ACM Workshop on Scalable Trusted Computing, 2006, pp. 21-24.

[13] L. Wang and P. Dasgupta, "Coprocessor-based hierarchical trust management for software integrity and digital identity protection," Journal of Computer Security, vol. 16, pp. 311-339, 2008.

[14] N. L. Petroni Jr, T. Fraser, J. Molina and W. A. Arbaugh, "Copilot-a coprocessor-based kernel runtime integrity monitor," in Proceedings of the 13th Conference on USENIX Security Symposium-Volume 13, 2004.

[15] R. Sailer. IBM research - integrity measurement architecture. Retrieved on November 3, 2010, http://domino.research.ibm.com/comm/research_people.nsf/pages/sailer.ima.html

[16] T. Garfinkel, B. Pfaff, J. Chow, M. Rosenblum and D. Boneh, "Terra: A virtual machine-based platform for trusted computing," ACM SIGOPS Operating Systems Review, vol. 37, pp. 193 - 206, 2003.

[17] R. Sahita, U. Savagaonkar, P. Dewan and D. Durham, "Mitigating the lying-endpoint problem in virtualized network access frameworks," 18th IFIP/IEEE international conference on Managing virtualization of networks and services, 2007, pp. 135-146.

[18] V. Haldar, D. D. Chandra and M. M. Franz, "Semantic remote attestation: A virtual machine directed approach to trusted computing," in USENIX Virtual Machine Research and Technology Symposium, 2004, pp. 29-41.

[19] G. Wurster, P. C. van Oorschot and A. Somayaji, "A generic attack on checksumming-based software tamper resistance," in 2005 IEEE Symposium on Security and Privacy, 2005, pp. 127-138.

[20] B. Schwarz, S. Debray and G. Andrews, "Disassembly of executable code revisited," in Proceedings of Working Conference on Reverse Engineering, 2002, pp. 45-54.

[21] C. Collberg, C. Thomborson and D. Low, "Manufacturing cheap stealthy opaque constructs," in Proceedings of Working Conference on Reverse Engineering, 1998, pp. 184-196.

[22] C. Linn and S. Debray, "Obfuscation of executable code to improve resistance to static disassembly," in Proceedings of the 10th ACM Conference on Computer and Communications Security, 2003, pp. 290-299.

[23] K. D. Cooper, T. J. Harvey and T. Waterman, "Building a control flow graph from scheduled assembly code."

[24] J. F. Levine, J. B. Grizzard and H. L. Owen, "Detecting and categorizing kernel-level rootkits to aid future detection," IEEE Security & Privacy, pp. 24-32, 2006.

[25] Web link, "Information about the knark rootkit," Retrieved on November 9 2010. http://www.ossec.net/rootkits/knark.php

[26] sd, "Linux on-the-fly kernel patching without LKM," Phrack Magazine, 2001.

[27] P. A. Loscocco, P. W. Wilson, J. A. Pendergrass and C. D. McDonell, "Linux kernel integrity measurement using contextual inspection," in 2007 ACM Workshop on Scalable Trusted Computing, 2007, pp. 21-29.

[28] Web link, "Address space layout randomization," Retrieved on April 25, 2010. http://pax.grsecurity.net/docs/aslr.txt

[29] Web link, "Linux man pages online - kernel random number generator," Retrieved on August 30, 2010. http://linux.die.net/man/4/random

[30] Web link, "Hackers discover HD DVD and Blu-ray processing key - all HD titles now exposed," Retrieved on November 3, 2009. http://www.engadget.com/2007/02/13/hackers-discover-hd-dvd-and-blu-ray-processing-key-all-hd-t/

[31] Web link, "Hi-Def DVD Security is bypassed," Retrieved on November 3, 2009. http://news.bbc.co.uk/2/hi/technology/6301301.stm

[32] Web link, "Global IP Network Latency," Retrieved on January 17, 2010. http://ipnetwork.bgtmo.ip.att.net/pws/network_delay.html

[33] Web link, "VLC media player source code FTP repository," Retrieved on February 24, 2010. http://download.videolan.org/pub/videolan/vlc/

Machine      Test generation   Compilation time   Total time
Pentium 4    12.3              320                332
Quad Core    5.2               100                105

Table 1: Average code generation time in milliseconds on server end for Intel

Pentium 4 and Core 2 Quad machines for one instance of the measurement

Machine      Server-side execution time   Client-side execution time
Pentium 4    0.6                          22
Quad Core    0.4                          16

Table 2: Time taken in milliseconds to compute the measurements on server

and on the remote client

Figure Captions

Figure 1: Challenge response overview
Figure 2: Protocol overview
Figure 3: Hash obtained on overlapping sub-regions; two instances have different sub-regions
Figure 4: Procedure for obtaining the MD5 hash of the entire code section
Figure 5: Snippet from the checksum code
Figure 6: ASM code for creating a socket
Figure 7: Contents of the /proc/net/tcp file
Figure 8: Kernel remote attestation scheme
    a. User application initiates attestation request
    b. User application sends attestation code to kernel
    c. Kernel returns integrity values to user application
    d. Verification of kernel integrity by trusted server

Figures

Fig. 1 [diagram: Trent sends a Request with code C to process P on machine M (Alice); C returns Measurements and Results to Trent]

Fig. 2 [protocol diagram]
1. Alice → Trent: Verification request
2. Trent → Alice: Inject code at location, execute it
3. C → Trent: Machine identifier
4. Trent → C: Proceed
5. C → Trent: Initial checksum
6. Trent → C: Proceed
7. C → Trent: MD5 hash of specified regions
8. Trent → C: Proceed
9. C → Trent: Test of correct process ID
10. Trent → C: Proceed/Halt

Fig. 3 [diagram: two instances compute Checksums 1-4 over overlapping sub-regions with different boundaries (0, 50, 110, 160, 200 in one instance; 0, 60, 80, 150, 200 in the other)]

Fig. 4 [diagram: the MD5 hash of each region (Region 1 through Region N) is computed; adjacent hashes are concatenated and re-hashed (H1 + H2 → H12, and so on) to produce the final MD5 result for the entire code section]

Fig. 5

{
    ...
    x = <random value>;
    a = 0;
    while (a < 400) {
        checksum1 += Mem[a];
        if ((a % 55) == 0) { checksum2 += checksum1 / x; }
        a++;
    }
    send checksum2;
    ...
}

Fig. 6

__asm__("sub  $12, %%esp\n"
        "movl $2, (%%esp)\n"
        "movl $1, 4(%%esp)\n"
        "movl $0, 8(%%esp)\n"
        "movl $102, %%eax\n"
        "movl $1, %%ebx\n"
        "movl %%esp, %%ecx\n"
        "int  $0x80\n"
        "add  $12, %%esp\n"
        : "=a" (new_socket));

Fig. 7

sl local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 0100007F:1F40 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 5456 1 f6eb0980 299 0 0 2 -1
1: 00000000:C3A9 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 4533 1 f6ec0000 299 0 0 2 -1
2: 00000000:006F 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 4473 1 f6f60000 299 0 0 2 -1
3: 0100007F:0277 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 5690 1 f6ec0980 299 0 0 2 -1
4: 0100007F:0019 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 5358 1 f6ec04c0 299 0 0 2 -1
5: 0100007F:743A 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 5411 1 f6eb04c0 299 0 0 2 -1

Figure 8 [four-panel diagram]
Fig. 8a [user application Puser in userland sends a kernel attestation request to Trent′; the operating system hosts Pkernel]
Fig. 8b [Puser passes the attestation code Ckernel into the kernel]
Fig. 8c [the kernel returns integrity values Hkernel to Puser]
Fig. 8d [Puser forwards the kernel integrity measurements to Trent′, which responds OK]