spotless sandboxes: evading malware analysis systems ?· spotless sandboxes: evading malware...

Download Spotless Sandboxes: Evading Malware Analysis Systems ?· Spotless Sandboxes: Evading Malware Analysis…

Post on 11-Jun-2018




4 download

Embed Size (px)


  • Spotless Sandboxes: Evading Malware AnalysisSystems using Wear-and-Tear Artifacts

    Najmeh Miramirkhani, Mahathi Priya Appini, Nick Nikiforakis, Michalis PolychronakisStony Brook University

    {nmiramirkhani, mappini, nick, mikepo}

    AbstractMalware sandboxes, widely used by antiviruscompanies, mobile application marketplaces, threat detectionappliances, and security researchers, face the challenge ofenvironment-aware malware that alters its behavior once it de-tects that it is being executed on an analysis environment. Recentefforts attempt to deal with this problem mostly by ensuring thatwell-known properties of analysis environments are replaced withrealistic values, and that any instrumentation artifacts remainhidden. For sandboxes implemented using virtual machines, thiscan be achieved by scrubbing vendor-specific drivers, processes,BIOS versions, and other VM-revealing indicators, while moresophisticated sandboxes move away from emulation-based andvirtualization-based systems towards bare-metal hosts.

    We observe that as the fidelity and transparency of dynamicmalware analysis systems improves, malware authors can resortto other system characteristics that are indicative of artificialenvironments. We present a novel class of sandbox evasiontechniques that exploit the wear and tear that inevitably occurson real systems as a result of normal use. By moving beyond howrealistic a system looks like, to how realistic its past use looks like,malware can effectively evade even sandboxes that do not exposeany instrumentation indicators, including bare-metal systems. Weinvestigate the feasibility of this evasion strategy by conductinga large-scale study of wear-and-tear artifacts collected from realuser devices and publicly available malware analysis services. Theresults of our evaluation are alarming: using simple decision treesderived from the analyzed data, malware can determine that asystem is an artificial environment and not a real user devicewith an accuracy of 92.86%. As a step towards defending againstwear-and-tear malware evasion, we develop statistical models thatcapture a systems age and degree of use, which can be used toaid sandbox operators in creating system images that exhibit arealistic wear-and-tear state.


    As the number and sophistication of the malware samplesand malicious web pages that must be analyzed every dayconstantly increases, defenders rely on automated analysissystems for detection and forensic purposes. Malware scanningbased on static code analysis faces significant challengesdue to the prevalent use of packing, polymorphism, andother code obfuscation techniques. Consequently, malwareanalysts largely rely on dynamic analysis approaches to scanpotentially malicious executables. Dynamic malicious codeanalysis systems operate by loading each sample into a heavilyinstrumented environment, known as a sandbox, and monitoringits operations at varying levels of granularity (e.g., I/O activity,system calls, machine instructions). Malware sandboxes aretypically built using API hooking mechanisms [1], CPU

    emulators [2][5], virtual machines [6], [7], or even dedicatedbare-metal hosts [8][10].

    Malware sandboxes are employed by a broad range ofsystems and services. Besides their extensive use by antiviruscompanies for analyzing malware samples gathered by deployedantivirus software or by voluntary submissions from users,sandboxes have also become critical in the mobile ecosystem,for scrutinizing developers app submissions to online appmarketplaces [11], [12]. In addition, a wide variety of industrysecurity solutions, including intrusion detection systems, securegateways, and other perimeter security appliances, employsandboxes (also known as detonation boxes) for on-the-flyor off-line analysis of unknown binary downloads and emailattachments [13][17].

    As dynamic malware analysis systems become more widelyused, online miscreants constantly seek new methods to hinderanalysis and evade detection by altering the behavior ofmalicious code when it runs on a sandbox [8], [9], [11], [18][20]. For example, malicious software can simply crash orrefrain from performing any malicious actions, or even exhibitbenign behavior by masquerading as a legitimate application.Similarly, exploit kits hosted on malicious (or infected) webpages have started employing evasion and cloaking techniquesto refrain from launching exploits against analysis systems [21],[22]. To effectively alter its behavior, a crucial capability formalicious code is to be able to recognize whether it is beingrun on a sandbox environment or on a real user system. Suchenvironment-aware malware has been evolving for years,using sophisticated techniques against the increasing fidelityof malware analysis systems.

    In this paper, we argue that as the fidelity and non-intrusiveness of dynamic malware analysis systems continuesto improve, an anticipated next step for attackers is to relyon other environmental features and characteristics to distin-guish between end-user systems and analysis environments.Specifically, we present an alternative, more potent approach tocurrent VM-detection and evasion techniques that is effectiveeven if we assume that no instrumentation or introspectionartifacts of an analysis system can be identified by malwareat run time. This can be achieved by focusing instead on the

    wear and tear and aging that inevitably occurs on realsystems as a result of normal use. Instead of trying to identifyartifacts characteristic to analysis environments, malware candetermine the plausibility of its host being a real system usedby an actual person based on system usage indicators.

  • The key intuition behind the proposed strategy is that, incontrast to real devices under active use, existing dynamicanalysis systems are typically implemented using operatingsystem images in almost pristine condition. Although somefurther configuration is often performed to make the systemlook more realistic, after each binary analysis, the system isrolled back to its initial state [23]. This means that any possiblewear-and-tear side-effects are discarded, and the system isalways brought back to a pristine condition. As an indicativeexample, in a real system, properties like the browsing history,the event log, and the DNS resolver cache, are expected tobe populated as a result of days or weeks worth of activity.In contrast, due to the rollback nature of existing dynamicanalysis systems, malicious code can easily identify that theabove aspects of a system may seem unrealistic for a devicethat supposedly has been in active use by an actual person.

    Even though researchers have already discussed the pos-sibility of using specific indicators, such as the absence ofRecently Open Files [24] and the lack of a sufficient numberof processes [25] as ways of identifying sandbox environments,we show that these individual techniques are part of a largerproblem, and systematically assess their threat to dynamicanalysis sandboxes. To assess the feasibility and effectivenessof this evasion strategy, we have identified a multitude of wear-and-tear artifacts that can potentially be used by malware asindicators of the extent to which a system has been actuallyused. We present a taxonomy of our findings, based on whichwe then proceeded to implement a probe tool that relies on asubset of those artifactsones that can be captured in a privacy-preserving wayto determine a systems level of wear andtear. Using that tool, we gathered a large set of measurementsfrom 270 real user systems, and used it to determine thediscriminatory capacity of each artifact for evasion purposes.

    We then proceed to test our hypothesis that wear and tearcan be a robust indicator for environment-aware malware, bygauging the extent to which malware sandboxes are vulner-able to evasion using this strategy. We submitted our probeexecutable to publicly available malware scanning services,receiving results from 16 vendors. Using half of the collectedsandbox data, we trained a decision tree model that picks afew artifacts to reason whether a system is real or not, andevaluated it using the other half. Our findings are alarming:for all tested sandboxes, the employed model can determinethat the system is an artificial environment and not a real userdevice with an accuracy of 92.86%.

    We provide a detailed analysis of the robustness of thisevasion approach using different sets of artifacts, and showthat modifying existing sandboxes to withstand this type ofevasion is not as simple as artificially adjusting the probedartifacts to plausible values. Besides the fact that malware caneasily switch to the use of another set of artifacts, we haveuncovered a deeper direct correlation of many of the evaluatedwear-and-tear artifacts and the reported age of the system,which can allow malware to detect unrealistic configurations.As a step towards defending against wear-and-tear malwareevasion, we have developed statistical models that capture the

    age and degree of usage of a given system, which can be usedto fine-tune sandboxes so that they look realistic.

    In summary, our work makes the following main contribu-tions:

    We show that certain, seemingly independent, techniquesfor identifying sandbox environments are in fact allinstances of a broader evasion strategy based on a systemswear-and-tear characteristics.

    We present an indicative set of diverse wear-and-tearartifacts, and evaluate their discriminatory capacity byconducting a large-scale measurement study based on datacollected from real user systems and malware sandboxes.

    We demonstrate that using simple decision trees derivedfrom the analyzed data, malware can robustly evade alltested sandboxes with an accuracy of 92.86%.

    We present statistical models that can predict the age ofa system based on its system


View more >