Detecting Deadlock, Double-Free and Other Abuses in a Million Lines of Linux Kernel Source (SEW 30)

(Needles in a Haystack) Detecting Deadlock, Double-Free and Other Abuses in a Million Lines of Linux Kernel Source

Peter T. Breuer, Simon Pickin (Universidad Carlos III de Madrid)

Maria Larrondo Petrie (Florida Atlantic University)

Upload: peter-breuer

Post on 10-Jun-2015


DESCRIPTION

Presentation at 30th Annual IEEE/NASA Software Engineering Workshop (SEW-30), Loyola College Graduate Center, Columbia, MD, USA, April 25, 2006. The preprint of the paper is at http://www.academia.edu/1413564/Detecting_deadlock_double-free_and_other_abuses_in_a_million_lines_of_linux_kernel_source. DOI 10.1109/SEW.2006.1 .

TRANSCRIPT

Page 1: Detecting Deadlock, Double-Free and Other Abuses in a Million Lines of Linux Kernel Source (SEW 30)

(Needles in a Haystack)

Detecting Deadlock, Double-Free and Other Abuses in a Million Lines of Linux Kernel Source

Peter T. Breuer, Simon Pickin
Universidad Carlos III de Madrid

Maria Larrondo Petrie
Florida Atlantic University

Page 2:

Goal

• Add quality assurance from Formal Methods to the Linux kernel

– post hoc

– capable of application by non-experts

– handle 6.5 million lines of rapidly changing C code

Page 3:

A little poem

"The time has come," the Walrus said,
"To talk of many things:
Of shoes - and ships - and sealing-wax -
Of cabbages - and kings -
And why the sea is boiling hot -
And whether pigs have wings."

L. Carroll, The Walrus and the Carpenter

• Pigs with wings have seemed about as likely as Formal Methods in the Linux kernel!

Page 4:

Frightening Naïveté in FM

• "For example in p. 13 you claim that the kernel does not treat an infinite number of user request[s] between DiskTQ events. This is not completely true ..."

• BUT HERE is a fundamental problem with the paper: it ignores early work on operating system correctness. In 1969, A.N. Habermann (of Dijkstra's T.H.E. team) wrote a thesis entitled "On the harmonious cooperation of abstract machines". His results [are] at least very similar to yours. Whether they are equivalent I cannot say ...

Page 5:

Analysis Example: Sleep under Spinlock Hunt (SluSH - needs funding)

Page 6:

Challenges for FM in OS

• Programming in the large

– 6.5 Million LoC, 15 hardware architectures, hundreds or thousands of authors, hundreds of changes every day

– industrial programming "the code should be commented"

– literate programming "the comments should be coded"

– open source "the code is the commentary"

Page 7:

Methodology

• Apply combinatory logic to C programs (think precisely how afterwards)

• Parallel abstract interpretation of state to guide analysis

• Further interpretation of logic to generate actions

• Initial idea of evaluating loops "almost once" evolved into finding the loop invariant automatically

• Did not initially realise that C statements return a value (did realise that C expression evaluations affect state!)

Page 8:

SluSH run

Page 9:

Example of detected miscreant code

• snd_sb_csp_load() in sb16_csp.c

Page 10:

Another piece of guilty code

• Kernel 2.6.12 sound/oss/sequencer.c midi_outc()

Page 11:

Alan Cox owns up

Page 12:

How many Type II errors?

• 16 false alarms per 1000 files

• 2 real alarms

– in kernel coding, nearly any over-reporting is acceptable to authors

• There will always be over-detection. Why?

Page 13:

Output summarises likelihoods

Page 14:

How many Type I errors

• Ideally 0!

– abstract approximation

– under-specifies, over-estimates

– if we say/see it doesn't happen, it doesn't happen

● provisos

  data/code memory separation

  library functions do not modify current environment

  parallel thread does not rewrite local data

Page 15:

Example of kfree/access

• drivers/scsi/aic7xxx_old.c in kernel 2.6.3

Page 16:

Basic method - state descriptions

Page 17:

Three Ps...

• Program

– x = 1; if (x) ...

• Predicate description of program state

– n 1

• Perception of description

– upper[n:p]

● spincount > 0 on a sleepy node is bad!

● trigger/action system raises alarms, generates relations ...
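
The three Ps can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the analyser's own code: the program is a flat list of operation names, the predicate description is a closed interval for the spinlock count, and the perception is the interval's upper bound. A sleep at a point where that upper bound is positive raises an alarm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Predicate description of an integer n: lo <= n <= hi."""
    lo: int
    hi: int

    def shift(self, k: int) -> "Interval":
        return Interval(self.lo + k, self.hi + k)


def upper(state: Interval) -> int:
    """Perception: the largest value the description allows."""
    return state.hi


def analyse(program):
    """Walk a straight-line program, tracking the spinlock count.

    Returns the 1-based positions of sleep calls made while the
    count may be positive. The operation names are hypothetical.
    """
    spincount = Interval(0, 0)
    alarms = []
    for lineno, op in enumerate(program, start=1):
        if op == "spin_lock":
            spincount = spincount.shift(+1)
        elif op == "spin_unlock":
            spincount = spincount.shift(-1)
        elif op == "sleep" and upper(spincount) > 0:
            alarms.append(lineno)
    return alarms


buggy = ["spin_lock", "sleep", "spin_unlock"]
fixed = ["spin_lock", "spin_unlock", "sleep"]
print(analyse(buggy))  # [2]
print(analyse(fixed))  # []
```

The sketch handles only straight-line code; branches and loops need the joining and invariant-finding steps described on later slides.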

Page 18:

What's new?

• Combining logic is 3 phase

● PRE

● DURING

● POST

• Logic constraints reduced to NF on the fly

● ∪ ∩ simple constraints x < k, x > k, x = k (NP-complete)

• Traces are often joined

● p → 1 | q → 2 becomes p∪q → [1,2]
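
A minimal sketch of the trace join, under stated assumptions: a condition is held as a union of intersections of simple constraints (written here as plain strings), and [1,2] is read as a closed range of possible values. Both readings and all names are illustrative, not taken from the analyser.

```python
from typing import FrozenSet, Tuple

Atom = str                  # a simple constraint such as "x < 0"
Conj = FrozenSet[Atom]      # intersection of simple constraints
Cond = FrozenSet[Conj]      # union of intersections (a normal form)
Range = Tuple[int, int]     # closed interval of possible values
Trace = Tuple[Cond, Range]  # "condition -> value"


def cond(*atoms: Atom) -> Cond:
    """Build a condition consisting of a single intersection."""
    return frozenset({frozenset(atoms)})


def join(a: Trace, b: Trace) -> Trace:
    """p -> a | q -> b becomes (p union q) -> hull of a and b."""
    (p, (alo, ahi)), (q, (blo, bhi)) = a, b
    return p | q, (min(alo, blo), max(ahi, bhi))


t1 = (cond("x < 0"), (1, 1))   # p -> 1
t2 = (cond("x > 0"), (2, 2))   # q -> 2
joined = join(t1, t2)
print(joined[1])  # (1, 2)
```

Joining trades precision for fewer traces: after the join it is no longer known which condition led to which value.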

Page 19:

During?

• Observe process in execution

• In C means

– exceptional program exit revealing internal state

● Return, Break (continue), Goto, (interrupt)

● R B G

– "Blue box"

[Figure: "blue box" labelled Pre, Post and Dur]

Page 20:

Blue box processing

[Figure: boxes A and B, labelled Pre, Post and Dur]

Page 21:

Empty Statement - NRB

• Pre → (Post, Dur)

– maintains P normally

– cannot return (F)

– cannot break (F)

Page 22:

Sequence - NRB

• normal : traverse A then B

• return : return from A OR traverse A then return from B

• break : break from A OR traverse A then break from B
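
The empty-statement and sequence rules can be sketched as follows. This is a toy model under stated assumptions, not the analyser's implementation: a state is just the value of a spinlock counter, a description is a finite set of such values, and the empty set plays the role of F (cannot happen). All names are hypothetical.

```python
from typing import Callable, FrozenSet, NamedTuple

States = FrozenSet[int]       # toy description: possible counter values
EMPTY: States = frozenset()   # "false": this exit cannot be taken


class NRB(NamedTuple):
    normal: States  # states on falling off the end of the statement
    ret: States     # states at a return inside the statement
    brk: States     # states at a break inside the statement


Stmt = Callable[[States], NRB]


def skip(pre: States) -> NRB:
    """Empty statement: maintains pre normally, cannot return or break."""
    return NRB(pre, EMPTY, EMPTY)


def ret_stmt(pre: States) -> NRB:
    """A return statement: every incoming state leaves via return."""
    return NRB(EMPTY, pre, EMPTY)


def add(k: int) -> Stmt:
    """Statement that adds k to the counter (lock: +1, unlock: -1)."""
    def stmt(pre: States) -> NRB:
        return NRB(frozenset(s + k for s in pre), EMPTY, EMPTY)
    return stmt


def seq(a: Stmt, b: Stmt) -> Stmt:
    """A; B combined by the normal / return / break rules."""
    def stmt(pre: States) -> NRB:
        ra = a(pre)
        rb = b(ra.normal)              # B starts where A ends normally
        return NRB(rb.normal,          # traverse A then B
                   ra.ret | rb.ret,    # return from A, or A then return from B
                   ra.brk | rb.brk)    # break from A, or A then break from B
    return stmt


lock, unlock = add(+1), add(-1)
prog = seq(lock, seq(ret_stmt, unlock))   # lock; return; unlock
result = prog(frozenset({0}))
print(sorted(result.ret))     # [1]
print(sorted(result.normal))  # []
```

The example returns with the counter at 1, so the return channel exposes a lock that is still held, while the normal exit is unreachable.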

Page 23:

Forever Loop - NRB

• break from body is only normal exit from while(1)

• relax p until it is invariant
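
One way to picture "relax p until it is invariant" is interval widening, sketched below. Widening is a standard abstract-interpretation technique swapped in here for illustration; the slides do not say how the analyser relaxes p. The loop body modelled is the hypothetical `if (n >= 3) break; n++;`.

```python
import math
from typing import Callable, NamedTuple, Optional, Tuple

Interval = Optional[Tuple[float, float]]  # None means unreachable (false)


def meet(a: Interval, b: Interval) -> Interval:
    """Intersection of two intervals."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def widen(old: Interval, new: Interval) -> Interval:
    """Relax old to cover new; an unstable bound jumps to infinity."""
    if old is None:
        return new
    if new is None:
        return old
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)


class NRB(NamedTuple):
    normal: Interval
    ret: Interval
    brk: Interval


def forever(body: Callable[[Interval], NRB]) -> Callable[[Interval], NRB]:
    """while (1) body: a break from the body is the only normal exit."""
    def stmt(pre: Interval) -> NRB:
        p = pre
        while True:
            r = body(p)
            relaxed = widen(p, r.normal)  # states re-entering the loop
            if relaxed == p:              # p is now invariant
                break
            p = relaxed
        return NRB(normal=r.brk, ret=r.ret, brk=None)
    return stmt


def body(p: Interval) -> NRB:
    """Toy body: if (n >= 3) break; n++;"""
    brk = meet(p, (3, math.inf))
    cont = meet(p, (-math.inf, 2))
    nxt = None if cont is None else (cont[0] + 1, cont[1] + 1)
    return NRB(normal=nxt, ret=None, brk=brk)


result = forever(body)((0, 0))
print(result.normal)  # (3, inf)
```

The true exit value is exactly 3, but the sketch reports 3 or more. The over-estimate is safe and is the kind of imprecision that leads to false alarms.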

Page 24:

+ Trigger/Action engine

• Three rules propagate call graph and do other housekeeping.

• One ... says that a sleep call while the objective function is positive causes alarm output:
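
The rule itself is not reproduced in this transcript. As a rough sketch of what such a trigger/action pass might look like, assume a call graph given as (caller, callee, call site) triples, a set of functions known to sleep, and an upper bound on the spinlock count at each call site. All names and data structures below are hypothetical and are not the analyser's rule language.

```python
def run_rules(calls, sleepy, spin_upper):
    """Apply two toy rules to a call graph.

    calls:      list of (caller, callee, site) triples
    sleepy:     set of functions known to sleep
    spin_upper: dict mapping site -> upper bound of the spinlock count
    """
    sleepy = set(sleepy)

    # Rule 1 (housekeeping): a function that calls a sleepy function
    # is itself sleepy. Repeat until nothing changes.
    changed = True
    while changed:
        changed = False
        for caller, callee, _site in calls:
            if callee in sleepy and caller not in sleepy:
                sleepy.add(caller)
                changed = True

    # Rule 2 (alarm): a call to a sleepy function at a site where the
    # spinlock count may be positive triggers alarm output.
    alarms = [(caller, callee, site)
              for caller, callee, site in calls
              if callee in sleepy and spin_upper[site] > 0]
    return sleepy, alarms


# Hypothetical driver: drv_write holds a lock when it calls helper,
# and helper calls a function that sleeps.
calls = [("drv_write", "helper", "s1"), ("helper", "do_sleep", "s2")]
sleepy, alarms = run_rules(calls, {"do_sleep"}, {"s1": 1, "s2": 0})
print(sorted(sleepy))  # ['do_sleep', 'drv_write', 'helper']
print(alarms)          # [('drv_write', 'helper', 's1')]
```

The alarm lands on the call in drv_write even though the sleep happens one level down, which is why the call graph has to be propagated first.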

Page 25:

Using the analyser

• Call with same parameters as gcc compiler

Page 26:

At this point I should do an example run

Page 27:

Summary

• Has formal methods come to the hoi polloi?

• It's a step in the right direction.

– No expertise needed

– Fast

– Copes with massive amounts of code

– Sound

• Negatives

– Not good at tracking program state (cries wolf)

– Extension to new problems needs an expert