    Cloud Computing: Challenges and Responses

    Nguyen The Huy

Abstract

Cloud Computing (CC) is on the rise. CC differs in a number of ways from traditional computing models, and it presents numerous challenges to the digital forensics community, whose research and practice largely fall within the realm of traditional computing. In this paper we name a few of these challenges, together with some approaches the community has taken to overcome them. On the other hand, CC is also full of economic and computational advantages. We visit several approaches that employ CC to improve the quality of digital forensics work.

1. Introduction

Cloud Computing is defined by NIST as a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. In this definition, CC has five essential characteristics (on-demand self-service, broad network access, resource pooling, rapid elasticity, measured service), three service models (Software as a Service, Platform as a Service, Infrastructure as a Service) and four deployment models (private cloud, community cloud, public cloud, hybrid cloud) [8].

On the other hand, according to Armbrust et al., CC is either Software as a Service (SaaS) or Utility Computing, but excludes the private cloud deployment model [2]. Although their definition is narrower than NIST's, their claims and arguments are at least valid for the public cloud deployment model, which is currently the most popular. Examples of public CC offerings nowadays are Amazon EC2, Google AppEngine and Microsoft Azure.

Although not required by CC's definition, virtualization is considered essential to achieve elasticity and the illusion of infinite capacity [2]. Armbrust et al. also noted that the construction and operation of extremely large-scale data centers are necessary for economic and efficient use of CC, and high automation is expected in such large-scale cloud infrastructure. While private or smaller-scale clouds are possible [5], the scale of many popular CC deployments nowadays is extremely large. As a result of that large-scale infrastructure, access to the internal construction and operation of the cloud is severely limited for outsiders (e.g. cloud users, digital forensics investigators). A private cloud, however, may offer more relaxed access to its internals. Another commonly found characteristic of CC is its distributed environment; the broad network access and rapid elasticity characteristics of CC help the distributed environment scale even more easily.

2. Challenges of Cloud Computing as a Digital Forensic Target

Many challenges from traditional computing models are still present in CC. For example, encryption and vast storage sizes still pose great difficulty in performing digital forensics. Traditional digital forensic tools and practices, while already showing insufficient capability in handling those challenges, probably cannot readily adapt to handle the new challenges from CC.

A prominent challenge from CC is the difficulty in collecting digital evidence, especially from infrastructure sources. The ability to perform data preservation and isolation is greatly hindered by extremely limited access to the sources of digital evidence (e.g. infrastructure, logs). Garfinkel argues that those fundamental forensic activities cannot be performed in a CC environment [6]. Birk even questioned the availability and validity of digital evidence given the abstraction of internal CC infrastructure and the lack of standards within the cloud [4].

It is also obvious that locating digital evidence in CC is very difficult given the highly dynamic, virtualized and distributed nature of CC. For instance, a cloud similar to the one in [5] can load a virtual instance from its library onto any node in its pool. Or, as Marty noticed, on Amazon AWS the load balancer's IP address changes constantly [7].

In addition, legal issues also pose a big challenge for forensics work. Besides existing legal complications, the fact that data belonging to multiple cloud users may reside in the same cloud entity introduces new legal obstacles [3].

3. Taking Advantage of Cloud Computing Power

As mentioned earlier, a distributed environment is an often encountered characteristic of CC. Moreover, CC can offer a scalable distributed environment with ease. Therefore, any mention of early work leveraging CC for the benefit of digital forensics must include the feasibility assessment of developing digital forensics tools in a scalable, distributed environment by Roussev and Richard [9]. Their work was motivated by the fact that current forensics tools were insufficient in handling the challenge of today's massive storage sizes and huge network bandwidth. Going further, they argued that as digital forensics tools get more sophisticated, as they should, current single-machine CPU power would not be enough. They therefore proposed a specialized distributed framework upon which digital forensics tools can build to tackle both I/O and CPU constraints. Their proposed framework was built on a coordinator-workers architecture communicating through simple, predefined text-based system commands (initialization, termination, cache management and reply messages) and processing commands (e.g. hash, grep, crack). Early results from their prototype showed a significant reduction in processing time. That their software remained interactive during the run was a bonus.
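To make the coordinator-workers idea concrete, the following is a minimal sketch in Python of a coordinator dispatching text-based commands (here HASH and GREP, plus TERMINATE) to worker processes. The command names, the queue-based transport and the sample evidence file are illustrative assumptions, not the protocol actually used by Roussev and Richard.

import hashlib
import re
from multiprocessing import Process, Queue

def worker(tasks, results):
    # Consume text commands until a TERMINATE message arrives.
    while True:
        cmd = tasks.get()
        op, _, arg = cmd.partition(" ")
        if op == "TERMINATE":
            break
        if op == "HASH":                                   # "HASH <file>"
            digest = hashlib.sha256(open(arg, "rb").read()).hexdigest()
            results.put(f"HASH {arg} {digest}")
        elif op == "GREP":                                 # "GREP <regex> <file>"
            pattern, _, path = arg.partition(" ")
            hits = [ln for ln in open(path, errors="ignore") if re.search(pattern, ln)]
            results.put(f"GREP {path} {len(hits)} matching lines")

if __name__ == "__main__":
    # Tiny sample "evidence" file so the sketch is self-contained (hypothetical content).
    with open("evidence.txt", "w") as f:
        f.write("user=alice password=secret\nnothing of interest here\n")

    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results)) for _ in range(2)]
    for w in workers:
        w.start()

    # The coordinator only issues commands; the I/O- and CPU-heavy work runs on the workers.
    tasks.put("HASH evidence.txt")
    tasks.put("GREP password evidence.txt")
    for _ in workers:
        tasks.put("TERMINATE")
    for w in workers:
        w.join()
    for _ in range(2):
        print(results.get())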

The application of distributed computing shows great potential for improving the performance of digital forensic tools or for developing more sophisticated ones. However, the evaluated prototype implemented only one digital forensics operation: regular expression matching (grep). As a result, the usefulness of the proposed distributed framework was less convincing. In addition, Roussev et al.'s decision to develop a brand new distributed framework from scratch, instead of making use of an existing generic one, may not have been right. For instance, with only one forensic task ever implemented in that new framework, it is possible that the framework is not suitable for, or does not scale well in, the development of other forensics software.

Later, in 2009, Roussev and other researchers presented a cloud-based implementation of several elementary digital forensics tools. Their implementation was reported to achieve linear and sub-linear speedups compared to traditional implementations. Their work (called MMR) was an implementation of Google's MapReduce framework using the Message Passing Interface (MPI). In particular, MMR consisted of three abstraction layers: MPI, providing distributed communication; a middleware platform, providing synchronization and the MapReduce abstraction; and finally the software code containing the application logic. A comprehensive evaluation of using MMR to develop elementary forensics software and the like (such as wordcount, grep, a Bloom filter, or a pi estimator representing CPU-bound image processing algorithms) confirmed the feasibility of achieving scalable and robust performance runs with MMR. MMR was also reported to have better performance than Hadoop, the Java implementation of MapReduce [10].
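As an illustration of the MapReduce-over-MPI layering, the following is a minimal word-count sketch using mpi4py (a Python binding for MPI). It is not the authors' MMR code, whose implementation language is not stated in the paper; the input text and the scatter/gather layout are assumptions made for the example.

# Run with, e.g.:  mpiexec -n 4 python wordcount_mpi.py
from collections import Counter
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

if rank == 0:
    # Coordinator splits the corpus into one chunk per rank (the map inputs).
    words = "the quick brown fox jumps over the lazy dog the end".split()
    chunks = [words[i::size] for i in range(size)]
else:
    chunks = None

# Map phase: each rank counts the words in its own chunk.
local_counts = Counter(comm.scatter(chunks, root=0))

# Reduce phase: partial counts are gathered on rank 0 and merged.
partials = comm.gather(local_counts, root=0)
if rank == 0:
    total = Counter()
    for part in partials:
        total.update(part)
    print(dict(total))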

This work by Roussev et al. was another attempt to apply distributed computing to solve the performance issues of digital forensics tools. Compared to the previous work, this work's implementation and evaluation were more robust and comprehensive. Much of the robustness, we believe, was the result of using MPI and Google's MapReduce instead of developing a new specialized distributed framework.

The evaluation also included a comparison between the performance of MMR and Hadoop. Because of the limitations in Hadoop's Java implementation [10], the result, as mentioned earlier, was expected. However, it would be very interesting if there were a comparison between the two in terms of ease of design and implementation. As forensics software becomes more sophisticated, ease of development will become more important. Java is well known for its object-oriented design and automatic memory management. The implementation language of MMR was unfortunately not mentioned.

Taking advantage of CC in a different way, Buchanan et al. used CC to methodically evaluate the quality of digital forensics tools [5]. What inspired them was the fact that the credibility of digital forensics findings is impacted by the lack of standardization in procedures, toolkits and data sets. Building on their success in using virtualization to teach computer security and digital forensics, Buchanan et al. presented an infrastructure based on virtualization within a CC deployment. In brief, they created a set of evaluation criteria and a CC-based testing system. This system was capable of script-automating different modes of testing and of creating and preparing portable, reproducible test environments. Each test environment was a virtual instance stored in a shared library. Results from the evaluation of their system showed reliable, robust and scalable execution. In addition, the system demonstrated better energy consumption and CPU utilization compared to a traditional stand-alone test system.
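To give a flavour of what such script-automated testing looks like, below is a minimal sketch of an evaluation loop in Python. The environment names, the tool command and the recorded criteria are hypothetical placeholders; the actual D-FET platform drives VMware-based virtual instances rather than local commands.

import subprocess
import time

# Hypothetical library of reproducible test environments (virtual instance names).
TEST_ENVIRONMENTS = ["winxp_ntfs_baseline", "ubuntu_ext4_deleted_files"]

# Stand-in for the forensic tool under test, so the sketch runs locally.
TOOL_CMD = ["python", "-c", "print('scanning disk image...')"]

def evaluate(environment):
    # Run the tool once against one environment and record simple criteria.
    start = time.perf_counter()
    proc = subprocess.run(TOOL_CMD, capture_output=True, text=True)
    return {
        "environment": environment,
        "completed": proc.returncode == 0,                    # criterion: tool finished
        "runtime_s": round(time.perf_counter() - start, 3),   # criterion: performance
        "output_lines": len(proc.stdout.splitlines()),        # criterion: findings reported
    }

if __name__ == "__main__":
    for env in TEST_ENVIRONMENTS:
        print(evaluate(env))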

Buchanan et al.'s work showed great promise in improving the quality and credibility of digital forensics research and practice. Their work can also promote and facilitate collaboration among members of the digital forensics community, because it would be easier to create, collect and transfer testing and training data sets. However, copyright issues may still constrain the creation and sharing of certain types of digital forensics data. Binary scrambling techniques may be used in some cases, but they were reported to create new issues for forensics techniques which rely on known binary signatures [5].

In addition, the current implementation based on VMware technologies cannot accommodate non-desktop (e.g. mobile phones, handheld devices) digital forensics data and tools. For instance, the two most popular mobile platforms, Apple's iOS and Android, have not been supported as guest OSes by VMware [11]. Given the current trend of mobile computing, we expect demand for digital forensics research and practice on those mobile platforms to increase rapidly. Therefore, it is highly desirable that future versions of Buchanan et al.'s test system support such guest OSes.

4. A Different Approach to Forensics in Cloud Computing

Marty from Loggly, Inc. highlighted the importance of log analysis in a digital forensics investigation. In his paper, he presented a list of challenges encountered when analyzing logs from a CC deployment for forensics purposes. Those challenges are often the result of failures to properly manage log files and properly generate log records. In order to overcome those difficulties, Marty proposed a proactive approach to generating and managing logs. His suggestion is a comprehensive set of guidelines on log management (e.g. enabling, storing, transporting and tuning logs) and log creation (e.g. when, what and how to log). A sample setup in a SaaS cloud was presented, together with practical tips and the actual difficulties encountered during the setup. The goal was to collect logs from the Django web application, JavaScript, the Apache web server, the load balancer, MySQL, the operating system and the back-end [7].
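As a rough illustration of the when/what/how guidance, the sketch below emits structured, machine-parseable application log records in Python. The field names (user, session, action, status, reason) are assumptions chosen for the example, not the exact schema Marty prescribes.

import json
import logging
import time

logger = logging.getLogger("webapp")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def audit(action, user, session, status, **extra):
    # One self-describing record per forensically relevant event.
    record = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%S"),  # when: a precise timestamp
        "action": action,                          # what: the event itself
        "user": user,                              # who: the acting user
        "session": session,                        # context: session identifier
        "status": status,                          # outcome: success or failure
        **extra,
    }
    logger.info(json.dumps(record))                # how: structured, one line per event

# Log both the attempt and its outcome, not only errors.
audit("login", user="alice", session="a1b2c3", status="failure", reason="bad password")
audit("login", user="alice", session="a1b2c3", status="success")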

Even though Marty's guidelines comprehensively cover logging issues from the infrastructure level (e.g. OS, transport) to the application level (e.g. when, what and how to log), Marty also noted that application developers' full support is necessary for the success of the framework.

In addition, it is not easy to implement those guidelines; some may not even be possible. Extremely limited access to the internal structure and operation of the cloud leaves little room for any outsider to improve the log management process to support digital forensics. This is usually the case for logs from infrastructure sources. For instance, in his sample setup, Marty was not able to obtain the MySQL logs. Even where this is possible, the large scale and dynamic nature of CC can hamper the log management process.

Marty's guidelines on infrastructure logs were also rather short and abstract compared to his guidelines on how to generate application logs. Yet even logging at the application level may fail to follow all the guidelines. While the guidelines on what, when, where, who, why and how to log are useful and extensive, they are perhaps rather overwhelming for any developer who is not familiar with such practice. More importantly, it is often the case that logging is not mentioned, or only briefly mentioned, in business user requirements. Thus to implement all the guidelines is to over-invest time and effort while obtaining no clear economic gain (measured against delivering all required functionality on time and within budget).

In another work on log analysis, Accorsi et al. took the practice to a higher level. While current digital forensics toolkits only performed analysis on logs from infrastructure sources, they invented a forensics technique called RECIF for the analysis of logs at the application level. Probably inspired by the success of research in the mature field of business informatics, their technique offers a pioneering approach to digital forensics investigation. In brief: a data flow is a transition between two events in which the output of one is used as the input of the other; a policy is a set of constraint-exception relations expressed in a simple special-purpose language. From the data collected in application logs, a propagation graph of data flows is reconstructed, and the resulting graph is matched against a set of predefined business process policies to detect information leaks [1]. An interesting aspect was the use of MXML, the standardized log format for business processes. It was mentioned that tools for transforming logs from major business process systems such as SAP, Oracle and Sage into MXML were available. An example run was reported in which the technique correctly detected an information leak from a set of generated data flows and a separation-of-duty policy. Further work was ongoing to demonstrate the correctness of the technique in more complex scenarios.
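To make the core idea concrete, the following is a minimal sketch of reconstructing a propagation graph from logged events and checking it against a separation-of-duty constraint. The event encoding, the policy representation and the traversal are illustrative assumptions; RECIF's actual policy language and algorithms are not reproduced here.

from collections import defaultdict

# Logged events: who performed what, and which data item each event produced.
events = {
    "e1": {"role": "clerk",   "data": "purchase_order"},
    "e2": {"role": "manager", "data": "approval"},
    "e3": {"role": "clerk",   "data": "payment"},
}
# A data flow links two events when the output of one feeds the other.
flows = [("e1", "e2"), ("e2", "e3")]

# Reconstruct the propagation graph from the data flows.
graph = defaultdict(list)
for src, dst in flows:
    graph[src].append(dst)

def reachable(start):
    # All events reachable from `start` along data flows.
    seen, stack = set(), [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Separation-of-duty policy: whoever creates a purchase order must not also
# perform a payment that the order eventually flows into.
for eid, ev in events.items():
    if ev["data"] == "purchase_order":
        for did in reachable(eid):
            d = events[did]
            if d["data"] == "payment" and d["role"] == ev["role"]:
                print(f"policy violation: {ev['role']} both ordered and paid ({eid} -> {did})")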

Overall, the technique looks very promising. It works on a standardized log format into which multiple other common formats can be transformed. It can readily support many common business policies, such as separation of duties and conflict of interest. It is also complementary to other digital forensics techniques. And it doesn't require any special tools or skills (besides the ability to write business process policies in a predefined syntax, which looks simple). This technique is probably most useful in information leak scenarios where the so-called crime does not involve much technical detail; those scenarios are often the result of flaws in security policy specifications or software designs. However, the expressiveness of the language can still be enhanced. Obviously, the language cannot express a policy of the type "deny all, allow a few", because it is intrinsically "allow all, deny a few".

5. Conclusion and Future Directions

In this paper, we have reviewed a few challenges posed by CC to the digital forensics community. We also discussed a few approaches focusing on infrastructure and application log analysis to tackle those challenges. On the other hand, CC also delivers many computational advantages which can be utilized by the community to improve the performance and quality of digital forensics tools. Digital forensics training in the virtualized environment of the cloud was also explored.

From the above findings, we think that the future of digital forensics, given the rise of CC, lies in exploiting its economic and computational advantages and in exploring non-traditional approaches to collecting and analyzing digital evidence. Among the possible tactics, borrowing expertise from other related research fields may bring in new ideas and firmer approaches. A good example is the application of business informatics to the detection of information leaks [1].

6. References

[1]. Accorsi, R., Wonnemann, C., & Stocker, T. (2011). Towards Forensic Data Flow Analysis of Business Process Logs. IT Security Incident Management & IT Forensics.

[2]. Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R., Konwinski, A., et al. (2009). Above the Clouds: A Berkeley View of Cloud Computing. University of California at Berkeley.

[3]. Beebe, N. (2009). Digital Forensic Research: The Good, the Bad and the Unaddressed. Advances in Digital Forensics V, IFIP AICT 306, pp. 17-36.

[4]. Birk, D. (n.d.). Technical Challenges of Forensic Investigations in Cloud Computing Environments. Retrieved April 8, 2011, from http://www.zurich.ibm.com/~cca/csc2011/submissions/birk.pdf

[5]. Buchanan, W. J., Macfarlane, R. J., Flandrin, F., Graves, J., Buchanan, B., Fan, L., et al. (2011). Cloud-based Digital Forensics Evaluation Test (D-FET) Platform. Cyberforensics.

[6]. Garfinkel, S. L. (2010). Digital forensics research: the next 10 years. Digital Forensics Research Workshop.

[7]. Marty, R. (2011). Cloud Application Logging for Forensics. SAC.

[8]. National Institute of Standards and Technology. (n.d.). The NIST Definition of Cloud Computing. Retrieved April 28, 2011, from http://csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-v15.doc

[9]. Roussev, V., & Richard, G. G. (2004). Breaking the performance wall: the case for distributed digital forensics. Proceedings of the Fourth Digital Forensic Research Workshop.

[10]. Roussev, V., Wang, L., Richard, G., & Marziale, L. (2009). A Cloud Computing Platform for Large-Scale Forensic Computing. Advances in Digital Forensics V, IFIP AICT 306, pp. 201-214.

[11]. VMware Supports the Largest Number of Guest Operating Systems. (n.d.). Retrieved April 12, 2011, from http://www.vmware.com/technical-resources/advantages/guest-os.html