8/2/2019 Cloud Computing Challenges and Responses
Cloud Computing: Challenges and Responses
Nguyen The Huy
Abstract
Cloud Computing (CC) is on the rise. CC differs in a number of ways from traditional computing
models. CC presents numerous challenges to the digital forensics community, whose research and
practice largely fall within the realm of traditional computing. We name a few of them in this paper,
together with some approaches the community has taken to overcome those challenges. On the other
hand, CC also offers economic and computational advantages. We visit several approaches that
employ CC to improve the quality of digital forensics work.
1. Introduction
Cloud Computing is defined by NIST as a model for enabling convenient, on-demand network access to a
shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and
services) that can be rapidly provisioned and released with minimal management effort or service
provider interaction. In this definition, CC has 5 essential characteristics (On-demand self-service,
Broad network access, Resource pooling, Rapid elasticity, Measured service), 3 service models (Software
as a service, Platform as a service, Infrastructure as a service) and 4 deployment models (Private cloud,
Community cloud, Public cloud, Hybrid cloud) [8].
On the other hand, according to Armbrust et al., CC is either Software as a Service (SaaS) or Utility
Computing, but excludes the Private Cloud deployment model [2]. Although their definition is narrower
than NIST's, their claims and arguments are at least valid for the Public cloud deployment model,
which is currently the most popular. Examples of public CC nowadays are Amazon EC2, Google
AppEngine and Microsoft Azure.
Although not required by CC's definition, virtualization is considered essential to achieve elasticity and
the illusion of infinite capacity [2]. Armbrust et al. also noted that the construction and operation of
extremely large-scale data centers is necessary for economic and efficient use of CC. High automation is
expected in such large-scale cloud infrastructure. While private or smaller-scale CCs are possible [5], the
scale of many popular CC deployments nowadays is (extremely) large. As a result of that large-scale
infrastructure, access to the internal construction and operation of the cloud is (severely) limited for
outsiders (e.g. cloud users, digital forensics investigators). A private cloud, however, may offer more
relaxed access to its internals. Another commonly found characteristic of CC is a distributed environment.
The broad network access and rapid elasticity of CC help the distributed environment scale even more
easily.
2. Challenges of Cloud Computing as Digital Forensic Target
Many challenges in traditional computing models are still present in CC. For example, encryption and
vast storage size still pose great difficulty in performing digital forensics. Traditional digital forensic tools
and practices, which already show insufficient capability in handling those challenges, probably
cannot readily adapt to handle the new challenges from CC.
A prominent challenge from CC is the difficulty of collecting digital evidence, especially from
infrastructure sources. The ability to perform data preservation and isolation is greatly hindered by
(extremely) limited access to the sources of digital evidence (e.g. infrastructure, logs). Garfinkel argues
that those fundamental forensic activities cannot be performed in a CC environment [6]. Birk even
questioned the availability and validity of digital evidence given the abstraction of internal CC
infrastructure and the lack of standards within the cloud [4].
Locating digital evidence in CC is also very difficult given the highly dynamic, virtualized
and distributed nature of CC. For instance, a cloud similar to the one in [5] can load a virtual instance
from its library onto any node in its pool. Or, as Marty noticed, on Amazon AWS the load balancers' IP
addresses constantly change [7].
In addition, legal issues also pose a major challenge for forensic work. Besides existing legal
complications, the fact that data belonging to multiple cloud users may reside in the same cloud entity
introduces new legal obstacles [3].
3. Taking Advantage of Cloud Computing Power
As mentioned earlier, a distributed environment is an often encountered characteristic of CC. Moreover,
CC can offer a scalable distributed environment with ease. Any account of early work leveraging
CC for the benefit of digital forensics must therefore include the feasibility assessment of developing
digital forensics tools in a scalable, distributed environment by Roussev and Richard [9]. Their work was
motivated by the fact that current forensics tools were insufficient to handle the challenge of today's
massive storage sizes and huge network bandwidth. Going further, they argued that as digital forensics
tools become more sophisticated, as they should, current single-machine CPU power would not be enough.
They then proposed a specialized distributed framework upon which digital forensics tools can be built to
tackle both IO and CPU constraints. Their proposed framework was built on a coordinator-workers
architecture communicating through simple text-based predefined system commands (initialization,
termination, cache management and reply messages) and processing commands (e.g. hash, grep, crack).
Early results from their prototype showed a significant reduction in processing time. That their software
remained interactive during the run was a bonus.
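The coordinator-workers idea can be illustrated with a minimal sketch. It runs in a single process with a thread pool standing in for remote workers, and the text command format ("COMMAND chunk_id [argument]"), the function names and the reply format are all assumptions made for illustration; they are not the protocol of Roussev and Richard's prototype.

```python
import hashlib
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical text protocol: "<COMMAND> <chunk_id> [argument]".
# Only the command names (hash, grep) come from the paper's examples.


def worker(command: str, chunks: dict[int, bytes]) -> str:
    """Execute one text command against a cached chunk and reply in text."""
    parts = command.split(" ", 2)
    name, chunk_id = parts[0], int(parts[1])
    data = chunks[chunk_id]
    if name == "HASH":
        return f"REPLY {chunk_id} {hashlib.sha256(data).hexdigest()}"
    if name == "GREP":
        hits = len(re.findall(parts[2].encode(), data))
        return f"REPLY {chunk_id} {hits}"
    return f"ERROR {chunk_id} unknown-command"


def coordinator(image: bytes, chunk_size: int, command: str, arg: str = "") -> list[str]:
    """Split the evidence image into chunks and fan one command out to workers."""
    chunks = {
        i: image[offset:offset + chunk_size]
        for i, offset in enumerate(range(0, len(image), chunk_size))
    }
    commands = [f"{command} {i} {arg}".rstrip() for i in chunks]
    # Threads stand in for remote worker nodes; map() keeps replies in chunk order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda c: worker(c, chunks), commands))


grep_replies = coordinator(b"aaXbbXcc", 4, "GREP", "X")
hash_replies = coordinator(b"abcd", 4, "HASH")
```

Note that this naive chunking misses any match that straddles a chunk boundary; a real tool would need overlapping chunks or a boundary-aware search.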
The application of distributed computing shows great potential for improving the performance of digital
forensic tools or developing more sophisticated ones. However, the evaluated prototype implemented
only 1 digital forensics operation: regular expression matching (grep). As a result, the case for the
usefulness of the proposed distributed framework was less convincing. In addition, Roussev et al.'s
decision to develop a brand new distributed framework from scratch instead of making use of an existing
generic one may not
have been the right choice. For instance, with only 1 forensic task ever implemented in that new
framework, it is possible that the framework is not suitable for, or does not scale well in, the
development of other forensics software.
Later, in 2009, Roussev and other researchers presented a cloud-based implementation of several
elementary digital forensics programs. Their implementation was reported to achieve linear and
sub-linear speedup compared to a traditional implementation. Their work (called MMR) was an
implementation of Google's MapReduce framework using the Message Passing Interface (MPI). In particular,
MMR consisted of 3 abstraction layers: MPI, providing distributed communication; a middleware platform,
providing synchronization and the MapReduce abstraction; and finally software code containing the
application logic. A comprehensive evaluation of using MMR to develop elementary forensics software and
the like (such as wordcount, grep, a bloom filter, or a pi estimator representing CPU-bound image
processing algorithms) confirmed the feasibility of achieving scalable and robust performance with MMR.
MMR was also reported to perform better than Hadoop, the Java implementation of
MapReduce [10].
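To make the MapReduce abstraction concrete, the following is a minimal single-process sketch of the map, shuffle and reduce phases applied to two of the workloads named above (wordcount and grep). It does not use MPI and is not MMR's API; the function names are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run map, shuffle (group by key) and reduce sequentially in one process."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)  # shuffle: collect values per key
    return {key: reducer(key, values) for key, values in grouped.items()}


def wordcount_mapper(line):
    """Emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    return sum(values)


def grep_mapper(pattern):
    """Build a mapper that emits each line matching the regular expression."""
    compiled = re.compile(pattern)

    def mapper(line):
        if compiled.search(line):
            yield "matches", line

    return mapper


def collect_reducer(key, values):
    return list(values)


word_counts = map_reduce(["a b a", "b c"], wordcount_mapper, sum_reducer)
grep_hits = map_reduce(["error: disk", "ok", "error: net"],
                       grep_mapper(r"^error"), collect_reducer)
```

In a distributed setting the records would be partitioned across nodes and the shuffle would move intermediate pairs over the network; the application code (the mapper and reducer) would stay the same, which is the separation of layers the abstraction is meant to provide.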
This work by Roussev et al. was another attempt to apply distributed computing to solve the
performance issues of digital forensics tools. Compared to the previous work, this work's implementation
and evaluation were more robust and comprehensive. Much of the robustness, we believe, was the
result of using MPI and Google's MapReduce instead of developing a new specialized distributed
framework.
The evaluation also included a comparison between the performance of MMR and Hadoop. Because of
the limitations in Hadoop's Java implementation [10], the result, as mentioned earlier, was expected.
However, it would be very interesting to see a comparison between the two in terms of ease
of design and implementation. As forensics software becomes more sophisticated, ease of development
will become more important. Java is well known for its object orientation and automatic memory
management. The implementation language of MMR was unfortunately not mentioned.
Taking advantage of CC in a different way, Buchanan et al. used CC to methodically evaluate the
quality of digital forensics tools [5]. They were motivated by the fact that the credibility of digital
forensics findings suffers from the lack of standardization in procedures, toolkits and data sets. Building
on their success in using virtualization to teach computer security and digital forensics, Buchanan et al.
presented an infrastructure based on virtualization within a CC deployment. In brief, they created a set of
evaluation criteria and a CC-based testing system. This system was capable of script-automating different
modes of testing and of creating and preparing portable and reproducible test environments. Each test
environment was a virtual instance stored in a shared library. Results from the evaluation of their system
showed reliable, robust and scalable execution. In addition, the system demonstrated better energy
consumption and CPU utilization than a traditional stand-alone test system.
Buchanan et al.'s work showed great promise in improving the quality and credibility of digital forensics
research and practice. Their work can also promote and facilitate collaboration among members of
the digital forensics community because it would be easier to create, collect and transfer testing and
training data sets. However, copyright issues may still constrain the creation and sharing of certain types
of digital forensics data. Binary scrambling techniques may be used in some cases, but they were reported
to create new issues for forensics techniques that rely on known binary signatures [5].
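A small sketch illustrates why scrambling interferes with signature-based techniques. The XOR scrambler, the sample bytes and the helper names below are assumptions chosen for illustration; the scrambling method discussed in [5] may differ.

```python
import hashlib


def xor_scramble(data: bytes, key: int = 0x5A) -> bytes:
    """Hypothetical scrambler: XOR every byte with a fixed one-byte key."""
    return bytes(b ^ key for b in data)


# Made-up sample standing in for a copyrighted binary.
original = b"MZ\x90\x00example-binary-payload"
known_hashes = {hashlib.sha256(original).hexdigest()}
byte_signature = b"MZ\x90\x00"


def matches_known_hash(data: bytes) -> bool:
    """Known-file lookup by whole-file SHA-256 digest."""
    return hashlib.sha256(data).hexdigest() in known_hashes


def contains_signature(data: bytes) -> bool:
    """Naive byte-pattern signature search."""
    return byte_signature in data


scrambled = xor_scramble(original)
```

Both checks succeed on the original bytes and fail on the scrambled copy, although the scrambling is trivially reversible by applying the same XOR again.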
In addition, the current implementation, based on VMware technologies, cannot accommodate non-desktop
(e.g. mobile phones, handheld devices) digital forensics data and tools. For instance, the two most
popular mobile platforms, Apple's iOS and Android, have not been supported as guest OSs by VMware
[11]. Given the current trend of mobile computing, we expect demand for digital forensics research
and practice on those mobile platforms to increase rapidly. Therefore, it is highly desirable that future
versions of Buchanan et al.'s test system support such guest OSs.
4. Different Approach to Forensics in Cloud Computing
Marty from Loggly, Inc. highlighted the importance of log analysis in a digital forensics investigation. In
his paper, he presented a list of challenges encountered when analyzing logs from a CC deployment for
forensic purposes. Those challenges often result from failures to properly manage log files and
properly generate log records. To overcome those difficulties, Marty proposed a proactive
approach to generating and managing logs. His suggestion is a comprehensive set of guidelines on log
management (e.g. enabling, storing, transporting and tuning logs) and log creation (e.g. when, what,
how to log). A sample setup in a SaaS CC was presented together with practical tips and the actual
difficulties encountered during the setup. The goal was to collect logs from a Django web application,
JavaScript, the Apache web server, the load balancer, MySQL, the operating system and the back-end [7].
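As an illustration of what such log creation guidelines can look like in application code, here is a minimal sketch using Python's standard logging module. The field names (time, level, where, who, what, why) and the key="value" layout are assumptions made for this example, not a reproduction of the record format specified in [7].

```python
import io
import logging
from datetime import datetime, timezone


class KeyValueFormatter(logging.Formatter):
    """Render each record as space-separated key="value" pairs."""

    def format(self, record: logging.LogRecord) -> str:
        fields = {
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "where": f"{record.module}:{record.lineno}",
            # "who" and "why" are hypothetical fields passed via `extra`.
            "who": getattr(record, "who", "-"),
            "what": record.getMessage(),
            "why": getattr(record, "why", "-"),
        }
        return " ".join(f'{key}="{value}"' for key, value in fields.items())


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes key-value records to the given stream."""
    logger = logging.getLogger("forensic-demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(KeyValueFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("password change", extra={"who": "alice", "why": "user request"})
```

The sketch does not escape quotes inside values and writes to an in-memory buffer; a deployment would ship records to a central log store, in line with the transport and storage guidelines.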
Although Marty's guidelines comprehensively cover logging issues from the infrastructure level (e.g. OS,
transport) to the application level (e.g. when, what, how to log), Marty noted that application
developers' full support is necessary for the success of the framework.
In addition, it is not easy to implement those guidelines; some may not even be possible. (Extremely)
limited access to the internal structure and operation of the cloud leaves little room for any outsider to
improve the log management process to support digital forensics. This is usually the case for logs from
infrastructure sources. For instance, in his sample setup, Marty was not able to obtain MySQL logs. Even
where such access is possible, the large scale and dynamic nature of CC can hamper the log management
process. Marty's guidelines on infrastructure logs were also rather short and abstract compared to his
guidelines on how to generate application logs. Yet even log generation at the application level may fail
to follow all guidelines. While the guidelines on what, when, where, who, why and how to log are useful and
extensive, they are perhaps rather overwhelming to any developer who is not familiar with such practice.
More importantly, logging is often not mentioned, or only briefly mentioned, in business user
requirements. Implementing all guidelines therefore means investing extra time and effort with
no clear economic gain (measured against delivering all required functionality on time and
within budget).
In another work on log analysis, Accorsi et al. took the practice to a higher level. Whereas current digital
forensics toolkits only analyzed logs from infrastructure sources, they devised a forensics
technique called RECIF for the analysis of logs from the application level. Probably inspired by the
success of research in the mature field of business informatics, their technique offered a pioneering
approach to digital forensics investigation. In brief, a data flow is a transition between 2 events in
which the output of one is used as the input of the other. A policy is a set of constraint-exception
relations expressed in a special, simple language. From data collected in application logs, a propagation
graph of data flows is reconstructed, and the resulting graph is matched against a set of predefined
business process policies to detect information leaks [1]. An interesting detail was the use of MXML, the
standardized log format for business processes. It was mentioned that tools for transforming logs from
major business process systems such as SAP, Oracle and Sage to MXML were available. An example run was
reported in which the technique correctly detected an information leak from a set of generated data flows
and a separation-of-duty policy. Further work was ongoing to demonstrate the correctness of the technique
in more complex scenarios.
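The reconstruction-and-matching idea can be sketched as follows. This is a simplified illustration under stated assumptions, not RECIF itself: events are assumed to carry unique names and explicit input and output sets, and a policy is reduced to a list of forbidden (source, sink) pairs with optional exceptions rather than RECIF's policy language or MXML input.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One logged activity; names are assumed unique within a log."""
    name: str
    inputs: frozenset
    outputs: frozenset


def build_propagation_graph(events):
    """Add edge a -> b when a later event b reads an item an earlier event a wrote."""
    graph = defaultdict(set)
    for i, earlier in enumerate(events):
        for later in events[i + 1:]:
            if earlier.outputs & later.inputs:
                graph[earlier.name].add(later.name)
    return graph


def reachable(graph, start):
    """Return every event that data from `start` can reach (breadth-first search)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def find_leaks(events, forbidden, exceptions=frozenset()):
    """Return forbidden (source, sink) flows that occur and are not excepted."""
    graph = build_propagation_graph(events)
    return [
        (source, sink)
        for source, sink in forbidden
        if sink in reachable(graph, source) and (source, sink) not in exceptions
    ]


# Made-up log: a loan decision ends up in an externally emailed report.
events = [
    Event("approve_loan", frozenset(), frozenset({"decision"})),
    Event("export_report", frozenset({"decision"}), frozenset({"report"})),
    Event("email_external", frozenset({"report"}), frozenset()),
]
policy = [("approve_loan", "email_external")]
leaks = find_leaks(events, policy)
```

In this sketch any flow that is not explicitly listed as forbidden is allowed, so the check is permissive by default.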
Overall, the technique looks very promising. It works on a standardized log format into which
multiple other common formats can be transformed. It can readily support many common business
policies such as Separation of Duties and Conflict of Interests. It is also complementary to other digital
forensics techniques. And it doesn't require any special tools or skills (besides the ability to write
business process policies in a pre-defined syntax, which looks simple). This technique is probably most
useful in information leak scenarios where the so-called crime does not involve many technical details.
Those scenarios are often the result of flaws in security policy specifications or software designs.
However, the expressiveness of the language can still be enhanced. In particular, the language cannot
express policies of the type "deny all, allow a few" because it is intrinsically "allow all, deny a few".
5. Conclusion and Future Direction
In this paper, we have reviewed a few challenges posed by CC to the digital forensics community. We
also discussed a few approaches focusing on infrastructure and application log analysis to tackle those
challenges. On the other hand, CC also delivers many computational advantages which can be utilized by
the community to improve the performance and quality of digital forensics tools. Digital forensics
training in the virtualized environment of the cloud was also explored.
From the above findings, we think the future of digital forensics, given the rise of CC, lies in exploiting
its economic and computational advantages and exploring non-traditional approaches to collecting and
analyzing digital evidence. Among the possible tactics, borrowing expertise from other related research
fields may bring in new ideas and firmer approaches. A good example was the application of business
informatics to the detection of information leaks [1].
6. References
[1]. Accorsi, R., Wonnemann, C., & Stocker, T. (2011). Towards forensics data flow analysis of business
process logs. IT Security Incident Management & IT Forensics.
[2]. Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R., Konwinski, A., et al. (2009). Above the
Clouds: A Berkeley View of Cloud Computing. University of California at Berkeley.
[3]. Beebe, N. (2009). Digital Forensic Research: The Good, the Bad and the Unaddressed. Advances in
Digital Forensics V, IFIP AICT 306, pp. 17-36.
[4]. Birk, D. (n.d.). Technical Challenges of Forensic Investigations in Cloud Computing Environments.
Retrieved April 8, 2011, from http://www.zurich.ibm.com/~cca/csc2011/submissions/birk.pdf
[5]. Buchanan, W. J., Macfarlane, R. J., Flandrin, F., Graves, J., Buchanan, B., Fan, L., et al. (2011).
Cloud-based Digital Forensics Evaluation Test (D-FET) Platform. Cyberforensics.
[6]. Garfinkel, S. L. (2010). Digital forensics research: the next 10 years. Digital Forensics Research
Workshop.
[7]. Marty, R. (2011). Cloud Application logging for forensics. SAC.
[8]. National Institute of Standards and Technology. (n.d.). The NIST Definition of Cloud Computing.
Retrieved April 28, 2011, from http://csrc.nist.gov/groups/SNS/cloud-computing/cloud-def-v15.doc
[9]. Roussev, V., & Richard, G. G. (2004). Breaking the performance wall: the case for distributed digital
forensics. Proceedings of the fourth digital forensic research workshop.
[10]. Roussev, V., Wang, L., Richard, G., & Marziale, L. (2009). A Cloud Computing Platform for
Large-Scale Forensic Computing. Advances in Digital Forensics V, IFIP AICT 306, pp. 201-214.
[11]. VMware Supports the Largest Number of Guest Operating Systems. (n.d.). Retrieved 04 12, 2011,
from http://www.vmware.com/technical-resources/advantages/guest-os.html