Accelerating your forensic & incident response workflow: the case for a new standard in forensic imaging Dr. Bradley Schatz Director, Schatz Forensic AusCERT Conference 2016 © Schatz Forensic 2016

Upload: bradley-schatz

Post on 14-Jan-2017


Accelerating your forensic & incident response workflow: the case for a new standard in forensic imaging

Dr. Bradley Schatz
Director, Schatz Forensic

AusCERT Conference 2016
© Schatz Forensic 2016

The volume problem increases the latency between evidence identification and useful findings

Identify → Acquire → Analyse → Reporting

Latency

Pick one of the below: you can’t have both

[Chart: completeness against latency]

• Physical acquisition: you preserve everything, but analysis will have to wait
• Triage and live forensics: near immediate results at the expense of potentially missing evidence

How can we reduce latency while maximising completeness?

[Chart: completeness against latency, showing physical acquisition, triage and live forensics]

• Increase I/O throughput?
• Live analysis while we acquire?
• Dynamic partial acquisition?

Background: what’s stopping me increasing I/O throughput?

Forensic Imaging v1.0: Raw

Linear bitstream copy + linear bitstream hash

$ dd if=/dev/hda bs=4k conv=sync,noerror | tee C1.D1.raw | md5sum > C1.D1.md5.txt
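The same idea can be sketched in Python: every block read from the source is written to the image and fed, in order, to one running hash. This is an illustration only. The function name `linear_image` and the in-memory streams are assumptions, and the sketch does not reproduce dd's `conv=sync,noerror` handling of read errors.

```python
import hashlib
import io


def linear_image(src, dst, block_size=4096):
    """Copy src to dst block by block; return the MD5 of the whole stream."""
    md5 = hashlib.md5()
    while True:
        block = src.read(block_size)
        if not block:
            break
        dst.write(block)   # the "tee" step: write the copy
        md5.update(block)  # the "md5sum" step: hash the same bytes, in order
    return md5.hexdigest()


# In-memory stand-ins for /dev/hda and C1.D1.raw (hypothetical test data).
source = io.BytesIO(b"\x00" * 10000 + b"evidence")
image = io.BytesIO()
digest = linear_image(source, image)
```

Because each `update` depends on every byte before it, the hash has to see the stream strictly in order, which is the property the later slides work around.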

[Diagram: the source hard drive is copied to ACMECo.C1.D1.raw; an MD5 linear bitstream hash of the same stream is recorded in ACMECo.C1.D1.raw.txt]

What affects throughput in acquisition?

Target storage → Interconnect → Hash → Filesystem → Interconnect → Evidence storage

I/O throughput in acquisition is a systems problem: target storage

Target storage                           Sustained read
1TB Seagate 3.5” 7200rpm SATA            100 MB/s
Current generation 3.5” 7200rpm SATA     200 MB/s
Intel 730 SSD                            550 MB/s
Macbook Pro 1TB                          ~1 GB/s
RAID 15000rpm SAS                        > 1 GB/s

I/O throughput in acquisition is a systems problem: interconnect

Interconnect                   Gb/s    Max MB/s
PCIe / NVMe / Thunderbolt              > 1000
SATA3 / SAS                    6       600
USB3                           5       500
Gigabit Ethernet               1       100
USB2                           0.48    48
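The Max MB/s column follows from the line rate if roughly 10 bits on the wire are budgeted per byte of payload. That is exact for 8b/10b-encoded links such as SATA3, and only a rule-of-thumb allowance for overhead on the others. A small sketch of the arithmetic, where the function name and the 10-bit default are assumptions for illustration:

```python
def usable_mb_per_s(line_rate_gbps, bits_per_byte_on_wire=10):
    """Approximate payload rate in MB/s for a given line rate in Gb/s."""
    return line_rate_gbps * 1000 / bits_per_byte_on_wire


# Line rates taken from the table above.
line_rates = {"SATA3 / SAS": 6, "USB3": 5, "Gigabit Ethernet": 1, "USB2": 0.48}
max_mb_s = {name: usable_mb_per_s(gbps) for name, gbps in line_rates.items()}
```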

I/O throughput in acquisition is a systems problem: hash

Algorithm    Throughput MB/s
SHA1         619.23
MD5          745.65
Blake2b      601.87
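Figures like these depend heavily on the CPU and the hash implementation, so they are worth re-measuring on your own workstation. A rough single-thread benchmark using Python's `hashlib` might look like the following. The function name, buffer size and data volume are arbitrary choices, and the result is in MiB/s.

```python
import hashlib
import time


def hash_throughput_mib_s(name, total_mib=32, block_size=1 << 20):
    """Hash total_mib blocks of 1 MiB with one hasher; return MiB per second."""
    block = bytes(block_size)  # zero-filled; hash speed does not depend on content
    hasher = hashlib.new(name)
    start = time.perf_counter()
    for _ in range(total_mib):
        hasher.update(block)
    hasher.digest()
    elapsed = time.perf_counter() - start
    return total_mib / elapsed


results = {name: hash_throughput_mib_s(name) for name in ("sha1", "md5", "blake2b")}
for name, rate in results.items():
    print(f"{name}: {rate:.0f} MiB/s")
```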

Example: Forensic Duplicator, 1TB Seagate target

Target storage: SATA3 spinning disk, 93.6 MB/s
Interconnect: SAS 6G, 500 MB/s
Hash: SHA1, 600 MB/s
Interconnect: SAS 6G, 500 MB/s
Evidence storage: SATA3 spinning disk, 200 MB/s

Acquisition: 1TB @ 93.6 MB/s = 2h 58m
Verification: 1TB @ 200 MB/s = 1h 23m
TOTAL = 4h 21m
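The timings on this slide come from treating the pipeline as only as fast as its slowest stage. A short sketch of that arithmetic, using the slide's figures, with decimal units (1TB = 10^12 bytes, 1 MB = 10^6 bytes) and minutes rounded down. The function and key names are illustrative.

```python
def bottleneck_mb_s(stages):
    """A pipeline runs at the rate of its slowest stage."""
    return min(stages.values())


def duration_hm(size_bytes, rate_mb_s):
    """Return (hours, minutes) to move size_bytes at rate_mb_s, minutes rounded down."""
    seconds = size_bytes / (rate_mb_s * 1_000_000)
    return divmod(int(seconds // 60), 60)


ONE_TB = 10**12

# Acquisition touches every stage; the target disk is the slowest.
acquisition_stages = {
    "target_storage": 93.6,
    "source_interconnect": 500,
    "hash": 600,
    "evidence_interconnect": 500,
    "evidence_storage": 200,
}
# Verification only re-reads and re-hashes the evidence copy.
verification_stages = {"evidence_storage": 200, "evidence_interconnect": 500, "hash": 600}

acquire = duration_hm(ONE_TB, bottleneck_mb_s(acquisition_stages))
verify = duration_hm(ONE_TB, bottleneck_mb_s(verification_stages))
total = divmod((acquire[0] + verify[0]) * 60 + acquire[1] + verify[1], 60)
```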

LiveCD Ancient Workstation Acquisition

Target storage: SATA3 spinning disk, 100 MB/s
Interconnect: USB2, 45 MB/s
Hash: SHA1, 600 MB/s
Evidence storage: SATA3 spinning disk, 200 MB/s

Acquisition: 1TB @ 45 MB/s = 6h 10m
Verification: 1TB @ 45 MB/s = 6h 10m
TOTAL = 12h 20m

LiveCD Ancient Workstation Acquisition, verifying elsewhere

Target storage: SATA3 spinning disk, 100 MB/s
Interconnect: USB2, 45 MB/s
Hash: SHA1, 600 MB/s
Evidence storage: SATA3 spinning disk, 200 MB/s

Acquisition: 1TB @ 45 MB/s = 6h 10m
Verification: 1TB @ 200 MB/s = 1h 23m
TOTAL = 7h 33m

After copy, verify the image on a device with a faster interconnect.

Forensic Imaging v2.0: EWF

Original design

[Diagram: Source Hard Drive → MD5 (linear bitstream hash) and Deflate (linear compressed block stream) → ACMECo.C1.D1.e01]

The deflate algorithm is a significant bottleneck

Target storage → Interconnect → Hash → Compress → Filesystem → Interconnect → Evidence storage

Data            Deflate MB/s    Inflate MB/s
High entropy    40.4            IO bound
Low entropy     259             439

*Single core of quad core i7-4770 3.4GHz, measured with gzip
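The gap between high and low entropy data is easy to reproduce with Python's `zlib` module, which implements the same deflate algorithm. The sketch below is an assumption-laden stand-in for the slide's gzip measurement: random bytes serve as high entropy data, zero bytes as low entropy data, and the function name and buffer size are arbitrary.

```python
import os
import time
import zlib


def deflate_stats(data, level=6):
    """Return (throughput in MB/s, compressed size / original size) for one buffer."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(data) / 1_000_000 / elapsed, len(compressed) / len(data)


size = 8 * 1024 * 1024
high_rate, high_ratio = deflate_stats(os.urandom(size))  # incompressible
low_rate, low_ratio = deflate_stats(bytes(size))         # all zeros

print(f"high entropy: {high_rate:.0f} MB/s, ratio {high_ratio:.3f}")
print(f"low entropy:  {low_rate:.0f} MB/s, ratio {low_ratio:.3f}")
```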

FTK Imager EWF acquisition: 1TB Seagate 75% full, 4 core i5-750

Target storage: SATA3 spinning disk, 100 MB/s
Interconnect: SATA3, 500 MB/s
Hash: SHA1, 600 MB/s
Compress: Deflate, 67.8 MB/s
Evidence storage: SATA3 spinning disk, 200 MB/s

Acquisition: 1TB @ 67.8 MB/s = 4h 06m
Verification: 1TB @ 106 MB/s = 2h 36m
TOTAL = 6h 42m

Forensic Imaging v2.1: EWF

Guymager (2008), X-Ways, recent ewfacquire

[Diagram: Source Hard Drive → MD5 (linear bitstream hash) and three parallel Deflate stages → ACMECo.C1.D1.e01]
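In this design the chunks are compressed on several threads while one MD5 still runs over the stream in order. A minimal sketch of that arrangement follows. The 32 KiB chunk size and the function name are assumptions for illustration, and this is not the code of Guymager, X-Ways or ewfacquire.

```python
import hashlib
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 32 * 1024  # assumed chunk size, for illustration only


def acquire_threaded(data, workers=4):
    """Deflate chunks in parallel; keep a single linear hash over the stream."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    md5 = hashlib.md5()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Compression of each chunk is independent, so it can fan out.
        futures = [pool.submit(zlib.compress, chunk) for chunk in chunks]
        # The linear hash cannot fan out: it must see the chunks in order.
        for chunk in chunks:
            md5.update(chunk)
        compressed = [future.result() for future in futures]
    return compressed, md5.hexdigest()


sample = (b"log line\n" * 5000) + bytes(100_000)
compressed_chunks, linear_md5 = acquire_threaded(sample)
```

Compression now scales with the number of cores, but the hash remains a single serial stage.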

Lacklustre throughput reports (2013)

• Practitioner reports
  – Low 100s of MB/s [Zimmerman 2013]

• Research publications
  – FastDD <= 110 MB/s [Bertasi & Zago 2013]

• Our experience
  – Low powered CPUs give low throughput

Our approach to increasing I/O throughput

Scale to 8-core i7 & uncontended IO? Threaded EWF is CPU bound

Target storage: SATA3 Intel 730 SSD, 500 MB/s
Interconnect: SATA3, 500 MB/s
Hash: SHA1, 600 MB/s
Compress: Deflate, 31.9 MB/s/core
Evidence storage: SATA3 Samsung 850 EVO Pro, 500 MB/s

Acquisition: 240GB @ 255 MB/s = 14m 35s
Verification: 240GB @ 350 MB/s = 10m 37s
TOTAL = 25m 12s

*8 core i7-5820k @ 3.20 GHz

How about using a faster compression algorithm?

Target storage → Interconnect → Hash → Compress → Interconnect → Evidence storage

Compression algorithm                  Throughput MB/s/core*
Deflate (ZIP, gzip)                    31.9
Snappy (Google BigTable/MapReduce)     1,400
LZO (ZFS)                              1,540

Forensic Imaging v4.0: AFF4 (2009)

• ZIP64 based container
• Storage virtualization
• Open source implementation & specification

AFF4: Storage Virtualisation

[Diagram labels: ACMECo.S1.RAID0.af4; ACMECo.S1.D1.af4 and ACMECo.S1.D2.af4, each with a linear bitstream hash; Compressed Block Storage Stream; Virtual Storage Stream (Map)]

• Storage virtualisation
• Inter-container referencing
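The virtual storage stream (map) can be pictured as a list of extents, each saying "these bytes of the virtual stream live at that offset in that other stream". The sketch below rebuilds a two-disk RAID0 volume this way. The class and field names, the in-memory byte strings standing in for containers, and the 4-byte stripe are all illustrative assumptions rather than the AFF4 on-disk format.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Extent:
    virtual_offset: int  # where this run starts in the virtual stream
    length: int          # run length in bytes
    target: str          # name of the stream that holds the bytes
    target_offset: int   # where the run starts inside that stream


class MapStream:
    """Read-only virtual stream assembled from extents over other streams."""

    def __init__(self, extents, targets):
        self.extents = sorted(extents, key=lambda e: e.virtual_offset)
        self.starts = [e.virtual_offset for e in self.extents]
        self.targets = targets  # name -> bytes, standing in for containers

    def read(self, offset, length):
        out = bytearray()
        while length > 0:
            i = bisect_right(self.starts, offset) - 1
            if i < 0:
                break
            extent = self.extents[i]
            within = offset - extent.virtual_offset
            if within >= extent.length:
                break  # unmapped gap or end of the virtual stream
            n = min(length, extent.length - within)
            start = extent.target_offset + within
            out += self.targets[extent.target][start:start + n]
            offset += n
            length -= n
        return bytes(out)


def raid0_extents(disk_names, stripe_size, disk_size):
    """Interleave fixed-size stripes from each disk, RAID0 style."""
    extents, virtual = [], 0
    for row in range(disk_size // stripe_size):
        for name in disk_names:
            extents.append(Extent(virtual, stripe_size, name, row * stripe_size))
            virtual += stripe_size
    return extents


disks = {"D1": b"AAAACCCC", "D2": b"BBBBDDDD"}
volume = MapStream(raid0_extents(["D1", "D2"], 4, 8), disks)
```

Here `volume.read(0, 16)` walks the map and returns the striped data in logical order without ever making a second copy of it.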

Linear bitstream hashing isn’t parallelizable

Max. rate ~600 MB/s on current gen. CPUs

Algorithm    Throughput MB/s
SHA1         619.23
MD5          745.65
Blake2b      601.87

Our solution: block based hashing

[Diagram: blocks read from the source hard drive are hashed and compressed in parallel. The per-block hashes are stored as Block Hashes, and a single hash is computed over them (the Block Hashes Hash).]
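A minimal sketch of block based hashing, assuming a 32 KiB block size and SHA1 throughout: each block is hashed independently (so the work spreads across cores), and one further hash over the concatenated block hashes stands in for the whole image. The function names are illustrative, and the exact construction used by AFF4 is defined in its specification and papers, not here.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 32 * 1024  # assumed block size, for illustration only


def block_hashes(data, workers=4):
    """Hash every block independently; order of results matches block order."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda block: hashlib.sha1(block).digest(), blocks))


def hash_of_block_hashes(hashes):
    """One hash over the ordered block hashes covers the whole image."""
    return hashlib.sha1(b"".join(hashes)).hexdigest()


original = bytes(range(256)) * 1024  # 256 KiB of sample data
hashes = block_hashes(original)
image_hash = hash_of_block_hashes(hashes)

# Flip one byte in the third block and hash again.
tampered = bytearray(original)
tampered[2 * BLOCK + 5] ^= 0xFF
tampered_hashes = block_hashes(bytes(tampered))
changed = [i for i, (a, b) in enumerate(zip(hashes, tampered_hashes)) if a != b]
```

A side effect of this layout is that a mismatch can be localised to a block (`changed` above) instead of merely invalidating the whole image.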

Block hashing shifts the bottleneck from CPU to source I/O

Target storage: SATA3 Intel 730 SSD, 500 MB/s
Interconnect: 4x SATA3, 2 GB/s
Hash: SHA1, 600 MB/s/core
Compress: Snappy, avg 1.5 GB/s/core
Evidence storage: RAID0, 4x SATA3 2TB, 800 MB/s

*8 core i7-5820k @ 3.20 GHz

Acquisition application    Linear acquisition               Verification
X-Ways Forensics           14:35, 255 MB/s (15.3 GB/min)    10:37, 350 MB/s (21.0 GB/min)
Wirespeed (linear)         7:23, 500 MB/s (30.3 GB/min)     4:12, 888 MB/s (53.33 GB/min)

How can we take advantage of these speeds?

Realistic? More likely USB3 or 1GbE.

Idea: can we aggregate output I/O? Use 2x USB3 drives?

Target storage: SATA3 Intel 730 SSD, 500 MB/s
Interconnect: 2x USB3, 1 GB/s
Hash: SHA1, 600 MB/s/core
Compress: Snappy, avg 1.5 GB/s/core
Evidence storage: 2x SATA3 2TB, 400 MB/s

*8 core i7-5820k @ 3.20 GHz

AFF4 Striping

[Diagram: ACMECo.S1.D1.1.af4 on Disk 1 and ACMECo.S1.D1.2.af4 on Disk 2, joined by a Virtual Storage Stream (Map)]

• Source blocks are striped over multiple containers on multiple output disks.
• A copy of the map is stored in each container.
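Striping can be sketched as dealing source blocks out to several containers while recording, for each block, where it went. The round-robin placement, the 4-byte block size and the tuple layout of the map below are assumptions for illustration; a real tool could just as well send each block to whichever output disk is idle.

```python
def stripe_blocks(data, n_containers=2, block_size=4):
    """Deal blocks round-robin into containers; return the containers and the map."""
    containers = [bytearray() for _ in range(n_containers)]
    mapping = []  # (virtual_offset, length, container_index, container_offset)
    for k, start in enumerate(range(0, len(data), block_size)):
        block = data[start:start + block_size]
        index = k % n_containers
        mapping.append((start, len(block), index, len(containers[index])))
        containers[index] += block
    return containers, mapping


def reassemble(containers, mapping):
    """Follow the map in virtual-offset order to rebuild the source stream."""
    out = bytearray()
    for _, length, index, offset in sorted(mapping):
        out += containers[index][offset:offset + length]
    return bytes(out)


source_data = b"AAAABBBBCCCCDD"
containers, mapping = stripe_blocks(source_data)
```

Keeping a copy of `mapping` with every container means any one of them is enough to learn how the pieces fit together.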

How can we analyse while we acquire?

How can we reduce latency while maximising completeness?

[Chart: completeness against latency, showing physical acquisition, triage and live forensics]

• Increase I/O throughput?
• Live analysis while we acquire?
• Dynamic partial acquisition?

Acquire and access in parallel? dd + iSCSI access to target

[Diagram: Source Hard Drive, MD5, ACMECo.C1.D1.raw and ACMECo.C1.D1.raw.txt (linear bitstream hash), with remote analysis tools connected over iSCSI]

• Access is contended: poor interactive performance (lag)
• If acquisition is terminated early, the image may not contain a complete filesystem

Idea: Start with a non-linear partial image and add from there

[Diagram: acquisition scopes driven by analysis, listed from widest to narrowest]

• Entire disk
• All allocated
• Interactive analysis artifacts
• High value files
• Volume & FS metadata, memory

Raw image: non-linear acquisition driven by live analysis?

[Diagram: Source Hard Drive, ACMECo.C1.D1.raw, ACMECo.C1.D1.raw.txt (linear bitstream hash), iSCSI]

How do you generate a hash over a non-linear image?

Forensic Imaging v4.1: AFF4 (2010)

• Non-linear acquisition
• Hash based imaging (deduplication)

Partial, non-linear, block based hashing

[Diagram: regions of the source hard drive (volume metadata, filesystem metadata, sparse data, file content, unknown) are hashed and compressed in parallel into ACMECo.C1.D1.af4, which holds the compressed block stream, the block hashes, the block hashes hash and the virtual block stream (map)]

Forensic Imaging v4.2: AFF4 (2015)

• Partial acquisition
  – Represent what we didn’t acquire vs. what we couldn’t acquire
• Block based hashing
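The distinction between "didn't acquire" and "couldn't acquire" can be sketched as a per-block state recorded in the image map. The enum, its member names and the reader callback below are hypothetical and chosen for illustration; they are not the identifiers used by the AFF4 specification.

```python
from enum import Enum


class Region(Enum):
    ACQUIRED = "acquired"
    NOT_ACQUIRED = "not acquired"  # skipped by choice: outside the requested scope
    UNREADABLE = "unreadable"      # a read was attempted and failed


def acquire_partial(read_block, n_blocks, wanted):
    """Acquire only the wanted blocks, recording the state of every block."""
    image_map, stored = {}, {}
    for index in range(n_blocks):
        if index not in wanted:
            image_map[index] = Region.NOT_ACQUIRED
            continue
        try:
            stored[index] = read_block(index)
            image_map[index] = Region.ACQUIRED
        except OSError:
            image_map[index] = Region.UNREADABLE
    return image_map, stored


def fake_read(index):
    """Stand-in for a device read; block 2 simulates a bad sector."""
    if index == 2:
        raise OSError("bad sector")
    return bytes([index]) * 4


image_map, stored = acquire_partial(fake_read, n_blocks=5, wanted={0, 2, 3})
```

With the state recorded explicitly, a later pass can fill in the `NOT_ACQUIRED` blocks, and a reviewer can tell a deliberate omission from a failing disk.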

Partial acquisition brings reproducibility and elasticity to IR and triage

Target storage: SATA3 spinning disk, 200 MB/s
Hash: SHA1, 600 MB/s/core
Compress: Snappy, avg 1.5 GB/s/core
Network: 1GbE, 100 MB/s
Evidence storage: RAID0, 4x SATA3 2TB, 800 MB/s

*8 core i7-5820k @ 3.20 GHz

Partial IR acquisition: 21.9GiB @ 102MiB/s = 3m 39s

Acquired: volume metadata, filesystem metadata, 16G pagefile, Registries, Logs, Link files, Jump lists, WMI CIM Repo, Prefetch, USN Journal, $Logfile, Scheduler artefacts

How can I work with AFF4 images?

Why adopt this? My toolset doesn't support AFF4.

• Wait for support from vendors?

• Convert AFF4 to EWF on a fast workstation
  – Can be done in nearly the same time it takes to simply copy, by only deflate compressing low entropy blocks

• Emulate a raw image in the filesystem?
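One way to picture "only deflate compressing low entropy blocks" is to estimate each block's byte entropy and skip the compressor when the block already looks random. The sketch below is an assumption about how such a filter could work, not a description of any converter: the 7.0 bits-per-byte threshold and the function names are invented for illustration.

```python
import math
import os
import zlib
from collections import Counter


def shannon_entropy(block):
    """Byte-level Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    return -sum(n / total * math.log2(n / total) for n in Counter(block).values())


def pack_block(block, threshold=7.0):
    """Deflate only blocks that look compressible; store the rest as-is."""
    if shannon_entropy(block) < threshold:
        return "deflate", zlib.compress(block)
    return "stored", block


zero_kind, zero_payload = pack_block(bytes(32768))
random_block = os.urandom(32768)
random_kind, random_payload = pack_block(random_block)
```

Skipping the high entropy blocks avoids the slowest case for deflate, where the compressor works hardest and saves nothing.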

Emulation of AFF4 containers as RAW

Emulated raw is faster than native EWF.

X-Ways processing task         X-Ways native EWF    X-Ways w/ Wirespeed FS Bridge
Verify                         0:42                 0:08
FS Data Recovery               3:35                 3:20
Hashing & header validation    1:59:03              1:05:25
Carving unallocated            0:41                 0:44
Total                          3:25:43              2:02:09

Image: 1TB Macbook Pro i7, processed on 8 core i7

How does this affect workflow?

Native EWF Acquisition vs AFF4; Native EWF Processing vs AFF4 FS Bridge

[Chart comparing Single Threaded EWF?, Multi Threaded EWF and AFF4]

• AFF4: Copies in half the time due to striped acquisition over 2 x 200 MB/s spinning disks. EWF: I/O bound on a single 200 MB/s disk.
• AFF4: Verification completes in 8m, I/O bound by RAID. EWF: CPU bound.
• AFF4: Filesystem search in around ½ the time. EWF: CPU bound?
• AFF4 & EWF around the same throughput.

Will the courts accept the AFF4 format?

Courts accept expert evidence

Is it reliable?

• Is the expert reliable?

• Is the underlying theory reliable?
  – Reliable by way of the application of scientific methods (e.g. Daubert)
  – 4 scientifically peer reviewed papers, unrefuted

• Are the methods implementing the theory reliable?
  – Tool testing (as always, the expert’s ultimate responsibility)

Adoption: who is using AFF4?

AFF4 is used in the following

Evimetry Wirespeed

More information

Implementations
• https://evimetry.com/
• https://github.com/google/aff4
• http://www.rekall-forensic.com/docs/Tools/
• https://github.com/google/grr

Ongoing specification and papers
• http://www.aff4.org/
• http://dfrws.org/2009/proceedings/p57-cohen.pdf
• http://dfrws.org/2010/proceedings/2010-314.pdf
• http://dfrws.org/2015/proceedings/DFRWS2015-16.pdf

Conclusion

• Optimising forensic workflow is a systems problem

• Existing forensic formats are a bottleneck for today's systems

• Existing forensic formats are incompatible with triage and reproducible live analysis

• The Advanced Forensic Format 4 solves the above

Contact

Dr Bradley Schatz
http://schatzforensic.com.au/
[email protected]

Schatz BL (2012) Digital Evidence (Chapter) in Expert Evidence, Freckelton & Selby Eds. Available online via Westlaw AU and Thomson Legal Online.

Image credits: Hard disk head by amckgill; Footprints by kimba