Trojan transformation using ontology approach

Upload: intan-nurfarahin-ahmad

Post on 03-Jun-2018


  • 8/12/2019 trojan transformation using ontology apprach

    1/22

    1

    Abstract

    Many malware attacks have been reported recently, and they have caused serious negative impacts on organizations, such as loss of money, frozen company operations and decreased productivity. A Trojan is one example of malicious code, originally created to attack services and devices. On 24 December 2012, a Trojan called Zeus was involved in numerous DDoS attempts and in an attack on Ascent Builder that caused a loss of more than USD 900,000. For this reason, the Trojan horse was chosen as the domain of this research paper. An in-depth study of Trojan horse classification shows that not much research on the topic has been done. Therefore, this research paper presents a new Trojan horse classification based on an ontology approach. This classification will later be used as a basis for building a new model to detect Trojan horses efficiently. The proposed method uses static and dynamic analysis to understand the behavior of the Trojan, followed by an ontology approach to classify the dataset and transform it into an understandable form.


    1.0 INTRODUCTION

    1.1 BACKGROUND

    A Trojan is a type of malicious code that has been used to attack users' computers for more than a decade. It first appears to be useful software but actually does damage once it is installed or run on a computer. Statistics from Cyber Security 2013 for January to October (Figure 1) show that three major types of security incident are reported most often: fraud with 47.2%, followed by intrusion with 27.2% and malicious code with 17.1% of 9,369 reported incidents. A study by the Ponemon Institute (2013) estimates the economic impact of cybercrime and finds that the average cost to resolve a single attack may total more than USD 1 million. A Trojan can remotely control the victim's devices, such as computers and tablets, steal confidential information such as usernames, passwords and credit card numbers, and delete files (Mangrae, 2006). In contrast to worms, viruses and other malicious code, it can steal the victim's information without being noticed, and it does not replicate itself (Saudi, 2008). Furthermore, Trojans keep changing and updating themselves regularly, which makes their presence harder to detect even with anti-virus software. A good security strategy is needed to prevent and defend against this problem, one that embraces incident response and technologically sound security measures covering, but not limited to, Trojan threats (Saudi and Jomhari, 2006; Hawkins et al., 2000). In this research, an ontology approach is applied to conduct the Trojan horse classification and analysis. The ontology approach is used to extract the dataset and transform it into an understandable format.

    Figure 1: General Incident Classification Statistics (Jan-Oct) 2013 (MyCERT)
    [Pie chart. Legend entries shown: Content Related, Cyber Harassment, Denial of Service, Fraud, Intrusion. Slice values shown: 0.50%, 4.30%, 0.20%, 40.70%, 27.20%, 0.60%, 17.10%, 9.20%, 0.20%]


    1.2 MOTIVATIONS

    This research has two main motivations:

    I. The difficulty researchers face in obtaining a clean dataset for their analysis. Ontology is the approach applied here to classify the dataset. The expected outcome is a Trojan classification using an ontology approach.

    II. Cleaning a large dataset consumes a lot of time. Various techniques can be used to clean up a dataset, but which one is the easiest and least time consuming? Many researchers have stopped working in this field because it is time consuming and requires a lot of manpower (Witten et al., 2005). To clean up the dataset, the researcher needs to test each sample one by one.

    1.3 PROBLEM STATEMENT

    The number of Trojans is growing as technology grows. The situation is made worse because Trojans now keep updating and changing, which makes them difficult to detect with antivirus software. On 24 December 2012, anonymous cyber crooks launched an attack involving Distributed Denial of Service (DDoS) on an account belonging to Ascent Builder, netting the thieves more than USD 900,000. In this case, a Trojan called Zeus was involved in numerous DDoS attacks. This Trojan is commonly used among cybercriminals and is the most prolific malware used in financial cyber attacks. Through this Trojan infection, its creator successfully caused chaos, and a lot of money was lost. This research is motivated by the harmful impact of Trojans and by the lack of clean, freely available Trojan datasets that can be used for further analysis.

    1.4 RESEARCH QUESTION

    In conducting this research, the following questions need to be answered for the research to succeed:

    1) How is the raw data transformed into an understandable format?

    2) What approach can be used to transform the Trojan dataset?

    3) What procedures are involved in providing a clean Trojan dataset?


    1.5 OBJECTIVES

    The objectives of this research are:

    1) To investigate and evaluate the work related to Trojan data transformation.

    2) To design a new Trojan classification using an ontology approach.

    3) To evaluate the transformed dataset.

    1.6 SCOPE AND LIMITATION

    This research uses only a Trojan horse dataset for the Windows platform. There are approximately 1,987 samples of Trojan data. The research focuses on the ontology approach used to classify this dataset.

    1.7 ORGANIZATION OF THE RESEARCH REPORT

    This thesis is organized into five related chapters:

    Chapter 1: This chapter explains the research background, problem statement, research motivation, research objectives, research questions, research scope and limitations, significance of the research, research schedule and expected outcome.

    Chapter 2: This chapter reviews other papers, articles and books related to this research. It discusses the Trojan horse, including its definition, architecture and classification. The ontology approach and the KDD approach are also discussed.

    Chapter 3: This chapter discusses the methodology used to achieve the objectives, namely how the data is analyzed using the integration of static and dynamic analysis and the ontology methodology for data transformation.

    Chapter 4: This chapter discusses the expected outcome of the experimental analysis. It also describes in detail how the ontology approach is conducted to classify Trojan horses. Different testing techniques are compared with the ontology to demonstrate its effectiveness in Trojan transformation.


    Chapter 5: This chapter summarizes the research. It explains the research contributions and future work.

    2.0 LITERATURE REVIEW

    2.1 OVERVIEW

    The transformation of malware datasets is now very important in the security field, especially in network security. The resulting information is very helpful for future research and for data analysis, and it can only be obtained by performing the dataset transformation.

    This section combines several aspects of the research study. For this research, the malware dataset is limited to the Trojan horse. This chapter reviews the fundamental knowledge of the Trojan horse and the ontology approach. The first section discusses the Trojan itself, its classification and architecture, the types of detection technique and the differences between Trojans and other malicious code. The second section discusses ontology, including the types of ontology that exist in computing. Lastly, previous work related to this research is reviewed in order to guide the study and to narrow the gaps found in that previous research.

    2.2 DEFINITION OF TROJAN HORSE

    The Trojan horse is one of the most serious and dangerous threats in computing, especially in computer security. It spreads widely and vigorously as technology grows. The threat is becoming more serious because Trojans keep changing and updating themselves regularly, which makes their presence harder to detect. The name originates from Greek mythology, in the story of how the Greeks entered and destroyed the city of Troy: a wooden horse allowed the Greek army to sneak through the city's high gate, and this attack destroyed Troy.

    Academics have given many definitions of the Trojan. In computing, Trojan horses are defined as computer programs that are presented as useful in order to induce the user to install and run them, but that also have some hidden malicious goals,


    such as enabling remote access and control with the aim of gaining full or partial access

    to the infected system (Liu et al, 2010).

    Others have defined a Trojan horse as a program in which malicious or harmful code is contained inside an apparently harmless program or data in such a way that it can take control and do its chosen form of damage, such as ruining or erasing data on the hard drive (Alsadoon et al., 2011). Once a Trojan horse has been installed on a victim's computer, the hacker can access it remotely and execute programs by command.

    According to more recent research, a Trojan is a program or file that appears useful and harmless but, after urging the user to install it on their computer, begins to carry out malicious acts such as enabling the hacker to control the victim's computer remotely and steal data (Areej, 2013).

    Based on these studies, this research defines a Trojan as malicious code, in the form of a program or file, that becomes harmful and dangerous once the victim installs it on a device. It carries out malicious activities such as stealing and destroying data on the hard drive, directly accessing private information and enabling the hacker to control the device remotely. It does not replicate itself, which distinguishes it from other malicious code.

    2.2.1 TROJAN CLASSIFICATION AND ARCHITECTURE

    Most ordinary users assume that all malicious code, such as Trojans, worms, adware and spyware, is a virus. In reality, Trojans are not the same as other malicious code. Each type is created with a specific function to achieve its goals. Therefore, it is important to know the specific differences between types of malicious code such as viruses, Trojans, worms, adware and spyware, so that the correct detection technique can be applied based on their characteristics.

    Classification is one of the most crucial processes that must be in place to ensure the effectiveness of the detection process (Siti Suraya, 2013). Generally, malicious code can be classified based on characteristics such as infection target, technique and other distinguishing features (Babak et al., 2011). In addition, an effective classification algorithm or technique can improve the accuracy of malicious code detection (Nguyen


    et al., 2012). In general, there are many different types of Trojan, each with different goals and targets. However, most Trojans cannot replicate themselves, which makes them different from other malicious code.

    Worms are one example of a network security threat. Saudi (2011) defines a worm as a malicious program that can replicate itself, moving from one computer to another, or can propagate via a network without human intervention or the owner's consent. Other researchers define worms as autonomous, self-replicating threats that do not infect or alter computer programs in the same way as viruses and that have a different objective (Hughes et al., 2007). A worm is also defined as malicious code that usually spreads by exploiting vulnerabilities in network services (Farrukh, 2013).

    2.3 DEFINITION OF ONTOLOGY

    Ontology is one of the approaches now used for extracting data. Ontologies are defined as knowledge representation frameworks that allow us to express knowledge in an explicit and expressive way using well-defined semantics (Daconta et al., 2003). McMullen et al. (2005) compare this to how sentences are created by combining words to give meaning: an ontology provides this functionality by linking concepts together using relationships, which can in turn be processed to produce meaningful data.

    An ontology has also been defined as a formal, explicit specification of a shared conceptualization (Gruber, 1993). An ontology can be thought of as a set of semantic primitives that specify a particular domain of knowledge (Saudi, 2008). An ontology is an inventory of the kinds of entities that exist in a domain, their salient properties, and the salient relationships that can hold between them (Benjamin et al., 1995).

    Ontology is a method that focuses on extracting the essential nature of the concepts in any domain and representing it in a structured manner. For example, to illustrate this in natural language, the ontology triple (car, has, wheel) formalizes the sentence "(a) car has wheel(s)". In the ontological form, the concepts "car" and "wheel" are linked using the "has" property. By connecting concepts with properties and instances (examples), we obtain a knowledge map of a given domain.
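    The triple idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the extra triples and the helper names `build_index` and `objects_of` are hypothetical and do not come from the proposal, which builds its ontology in a dedicated tool.

```python
from collections import defaultdict

# Hypothetical example triples in (subject, property, object) form.
triples = {
    ("car", "has", "wheel"),
    ("car", "has", "engine"),
    ("wheel", "has", "tire"),
}

def build_index(triples):
    """Group (property, object) pairs by subject to form a simple knowledge map."""
    index = defaultdict(set)
    for subject, prop, obj in triples:
        index[subject].add((prop, obj))
    return index

def objects_of(index, subject, prop):
    """Return, sorted, every concept linked to `subject` through `prop`."""
    return sorted(obj for p, obj in index.get(subject, ()) if p == prop)

knowledge_map = build_index(triples)
print(objects_of(knowledge_map, "car", "has"))  # ['engine', 'wheel']
```

    Querying the map by subject and property is what turns a flat list of statements into something that can be navigated like a map of the domain.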


    2.3.1 TYPE OF ONTOLOGY AND ITS BENEFITS

    According to Nguyen (2010), ontologies can be characterized according to their granularity, formality, generality and computational capability. In terms of granularity, an ontology can be defined as either coarse-grained or fine-grained (Broekstra et al., 2002). In terms of generality, ontologies may be classified as top-level, mid-level, task, domain and application ontologies. In terms of computational capability, ontologies may be classified as heavy-weight or light-weight. Ontologies can also be classified according to their expressiveness. For example, ontologies may be controlled vocabularies, glossaries, thesauri, formal instance relation ontologies, frame ontologies, value restriction ontologies and general logical constraint ontologies (Broekstra et al., 2002). However, for ontologies to be processable by computer, they must be represented in a computer-readable language such as the Web Ontology Language (OWL) or F-logic.

    Ontological analysis clarifies the structure of knowledge. According to Chandrasekaran et al. (1999), an ontology is beneficial because it forms the heart of any system of knowledge representation for its domain. In addition, an ontology enables knowledge sharing and captures the intrinsic conceptual structure of the domain. In this research paper, the ontology is applied to extract a large Trojan dataset and transform it into knowledge that can be shared with others.

    2.4 RELATED RESEARCH

    Much research on malware has been carried out over the past decade. One area is the Trojan, whose study was started by Thimbleby et al. (1998). It was followed by further studies that focused instead on hardware Trojan taxonomy and hardware Trojan detection. The study of Trojans has since continued, and different approaches have been applied to Trojan data transformation.

    There are many studies related to Trojan data transformation. Liu et al. (2010) proposed a study on malware detection using machine learning methods, choosing Trojans on the Windows platform as the domain. The study concluded that classification accuracy may increase when more relevant features are used in the


    process. However, the more features are selected, the more time it costs to build the classifier, which makes Trojan detection respond slowly in real time. They also compared classification accuracy on the same training dataset with different test datasets. The results show that the number of Trojan horses collected from a real network environment is limited.

    Saudi (2011) presented an improved detection method based on the STAKCERT KDD process. This study uses worms as the domain. Data pattern extraction is achieved using data mining: the k-means algorithm is used for clustering and SMO for worm classification. The research enhanced the KDD data pre-processing and pattern extraction processes. In addition, statistical methods comprising chi-square, symmetric measures and a security metric are introduced. This approach outperforms the existing work by Dai et al. (2009), with 98.13% overall accuracy.

    Ren & Qian (2013) presented a SPID-based method of Trojan horse detection, focusing on how to identify various Trojans efficiently and accurately. SPID is used to analyze common protocols and generate a model to identify Trojans. Their results show that an optimized combination of attribute meters identifies Trojans with high efficiency while keeping SPID's detection accuracy. They found that this technique is a web-based, real-time detection technology. Using network characteristic attribute meters to generate a protocol model library, together with statistics-based identification of Trojans, gives a high recognition rate and a wide range of adaptability.

    Huang et al. (2010) proposed an ontology-based intelligent system for malware behavior analysis. The Taiwan Malware Analysis Net (TWMAN) was presented to analyze malware behavior and to act as an ontology agent. The malware behavioral analysis collects malware behavioral information to build a malware behavioral ontology and malware behavioral rules. The results from the system logs show that TWMAN can work effectively to protect the computer from attacks by computer viruses and Trojans based on the malware behavioral analysis.


    Building on this previous work, this research will introduce a new Trojan classification based on an ontology approach and will compare the classification on the same training dataset with different test datasets.

    3.0 RESEARCH METHODOLOGY

    3.1 OVERVIEW

    This section explains the research methodology, including detailed explanations of the approaches used to collect and analyze the data. It also explains how the research will be conducted, including the domain, tools and laboratory environment used, and how the results of this research will be tested and verified. A systematic research methodology will produce high-quality research findings.

    3.2 RESEARCH DESIGN

    Figure 2 shows the full research design that will be applied in this study.

    Figure 2: Research Methodology Design for Trojan Transformation using Ontology
    [Flowchart boxes: Set up laboratory environment; Tools are installed; Dataset from VXHeavens is downloaded; Dataset analysis; Data transformation is conducted (ontology approach); Trojan classification is obtained; A new format of dataset is gained; The result is tested; Valid? Published to other researchers; Invalid? Do the correction]


    3.3 SETUP LABORATORY ENVIRONMENT

    To carry out this research, a controlled laboratory environment is proposed, as illustrated in Figure 3. The laboratory will be set up with two computers that have VMware installed. The lab is built separately from the production network, and no outgoing network traffic is allowed in this architecture. This controlled lab architecture is used for three reasons. Firstly, any Trojan horse infection, propagation, operating algorithm, activation and payload can be monitored without any constraint in terms of network connectivity. Secondly, the lab is portable and easy to move. Lastly, the controlled lab environment reduces harm, since the lab is separated from the operational network.

    Figure 3: Lab Architecture
    [Diagram labels: Windows 7, monitoring; Windows 7, VMware; Windows 7, VMware]


    3.4 LOADING SPECIMEN

    For this experiment, the Trojan is the domain. The dataset is downloaded from the VXHeavens (2013) website. All Trojans and variants were downloaded for testing. However, only Trojans for the Windows platform are chosen for this experiment, because more attacks and exploited vulnerabilities have been discovered on Windows. In addition, fewer Trojans attack other platforms than attack Windows. Windows is more exposed to attacks by worms, viruses and Trojans. The problem is that Windows is poorly coded, and therefore many Trojans have appeared for Windows.

    There are several reasons why this research gathers its dataset from the VXHeavens source. Firstly, many researchers have used this source of data for their testing, for example Stibor (2010), Saudi et al. (2011) and Siti Suraya (2013). Secondly, the variety of variants is more important than the quantity of the dataset. Lastly, the scope of this research focuses only on Windows Trojans, and VXHeavens is one of the largest Trojan databases freely available on the internet.

    3.5 SETUP TOOLS

    For this experiment, almost 80% of the software used in testing is open source or available free of charge. Table 1 lists the tools used in this lab.

    Table 1: Tools and their functionalities.

    Function | Tools | Purpose of action
    Scan tools | AVG antivirus | To prepare the scan tool to detect various forms of malicious code, including those with newer signatures.
    String research tool | Strings.exe (from Sysinternals) | To display and extract suspicious sets of ASCII characters included in a file.
    Unpack tool | Proc dump 4.01; Unpack tool; UPX tool | To decompress and unpack the Trojan code.
    Virtual PC | VMware Workstation | To allow multiple operating systems to run on a single computer.
    TCP view | TCPView | TCPView is a Windows program that shows a detailed listing of all TCP and UDP endpoints on the system, including the local and remote addresses and the state of TCP connections.
    Disassembler/Debug Tool | OllyDbg | To perform detailed code analysis.
    Process Monitoring | Preview v3.7.3.1 | To identify the resources used by all running processes, including DLLs and registry keys. Process explorer provides a wealth of useful information regarding how the Trojan was impacting the victim computer.
    Database | MS Access | To store the transformed dataset.
    Automated analysis | Cuckoo | To analyze Trojan horse behavior and document it.
    Ontology approach | Protégé | To classify the Trojan using the ontology approach.

    3.6 DATASET ANALYSIS

    The dataset for this experiment goes through the processes illustrated in Figure 4:

    Figure 4: Trojan Dataset Analysis Process

    3.6.1 DATA PROCESSING

    Data processing involves two kinds of analysis: static and dynamic. The raw Trojan horse dataset downloaded from the VXHeavens source needs to be transformed into a format that can easily be used for subsequent analysis. Therefore, the dataset goes through a Standard Operating Procedure (SOP) that cleans the data and removes noise and duplicate data.
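    The duplicate-removal part of this cleaning step could be sketched as follows. The proposal does not say how duplicates are detected, so comparing SHA-256 content hashes is an assumption made here for illustration, and the names `sha256_of` and `unique_samples` are hypothetical. Files are only read, never executed.

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Hash a file in chunks so large samples need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def unique_samples(sample_dir):
    """Return one path per distinct file content (assumed duplicate criterion)."""
    seen = {}
    for path in sorted(Path(sample_dir).iterdir()):
        if path.is_file():
            # setdefault keeps the first path seen for each hash.
            seen.setdefault(sha256_of(path), path)
    return list(seen.values())
```

    A content hash only catches byte-identical copies. Repacked or slightly modified variants would hash differently, so they would still need the static and dynamic analysis described in this section.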

    [Figure 4 flowchart boxes: Input dataset; Data processing using Standard Operating Procedure (SOP): cleaning data to remove noise and duplication; Extraction and classification of dataset: data transformation using ontology approach (clustering, classification); Post processing; Output = knowledge]


    a) Static Analysis

    Static analysis works by examining the files associated with the Trojan on the computer without running the program. Figure 5 illustrates the stages of static analysis.

    Figure 5: Static analysis

    Anti-virus check: once the dataset has been loaded onto the testing computers, the file type or compression type is identified. Then the anti-virus installed on the testing computer is run. For this experiment, AVG antivirus is chosen to check whether the Trojan can be detected. If it is, the name of the Trojan horse is checked and analyzed further using the anti-virus vendor's website.

    [Figure 5 flowchart boxes: Start Static Analysis; Run Antivirus; Detect? (Yes/No); Use tools to uncompress malicious code; String Analysis; Identify Language Script; Disassemble code; Static Analysis Finish]


    String analysis: a string tool called Strings.exe (from Sysinternals) is used to extract strings from the Trojan horse code. This is helpful in identifying the Trojan horse's characteristics.
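    To make the idea of string extraction concrete, the following is a minimal Python sketch that pulls runs of printable ASCII characters out of raw bytes. It is not the Strings.exe tool itself; the function name `extract_strings` and the default minimum length of 4 are assumptions chosen for illustration.

```python
import re

def extract_strings(data, min_length=4):
    """Return runs of printable ASCII characters at least `min_length` long."""
    # Bytes 0x20-0x7e are the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]

# Made-up byte sequence standing in for a file's contents.
sample = b"\x00\x01MZ\x90\x00cmd.exe /c del\x00ab\x00http://example.invalid\xff"
print(extract_strings(sample))  # ['cmd.exe /c del', 'http://example.invalid']
```

    Short runs such as "MZ" and "ab" fall below the minimum length and are dropped, which keeps the output focused on strings likely to be meaningful, such as commands or URLs.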

    Looking for script: based on the strings extracted from the Trojan horse code, the common scripting or programming languages are identified. Table 2 can be used as guidance.

    Table 2: Trojan Horse Script Analysis Guidance

    Programming and Scripting Language | Identifying characteristics inside the file | File's common suffix
    Perl | Starts with the line #!/usr/bin/perl | .pl, .perl
    Bourne Shell Scripting Language | Starts with the line #!/bin/sh | .sh
    C | C programming language | .c
    C++ | Can be a standalone program or many files referenced within the language. | .cpp
    Java | Contains Java source code. | .java, .j, .jav
    Assembly Language | Close to binary machine code | .asm
    Active Server Page (ASP) | Can be built using Visual Basic, JScript or Perl. Can combine HTML, scripts and ActiveX server components. | .asp
    JavaScript | Includes the word javascript or JavaScript, especially in the form [...] | .js, .html, .htm
    Visual Basic Script (VBScript) | Includes the word VBScript, or the characters vb scattered throughout the file | .vbs, .html, .htm
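    The kind of lookup that Table 2 supports could be automated roughly as below. This is a hedged sketch only: the function name `guess_language`, the order of the checks and the subset of suffixes in `SUFFIX_HINTS` are assumptions, and the interpreter line is matched in its usual `#!` form.

```python
from pathlib import Path

# Suffix hints drawn from Table 2; an illustrative subset, not exhaustive.
SUFFIX_HINTS = {
    ".pl": "Perl", ".perl": "Perl", ".sh": "Bourne shell", ".c": "C",
    ".cpp": "C++", ".java": "Java", ".asp": "ASP", ".js": "JavaScript",
    ".vbs": "VBScript",
}

def guess_language(filename, text):
    """Guess a script's language from its first line, then keywords, then suffix."""
    lines = text.splitlines()
    first_line = lines[0].strip() if lines else ""
    if first_line.startswith("#!"):
        if "perl" in first_line:
            return "Perl"
        if first_line.endswith("/sh"):
            return "Bourne shell"
    lowered = text.lower()
    if "vbscript" in lowered:
        return "VBScript"
    if "javascript" in lowered:
        return "JavaScript"
    return SUFFIX_HINTS.get(Path(filename).suffix.lower(), "unknown")
```

    Content checks run before the suffix check because a file name is easy to change, while an interpreter line or language keyword inside the file is a stronger hint.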

    Disassemble code: a disassembler and debugger, namely OllyDbg and IDA Pro, are used to translate a raw binary executable into assembly language and to debug the code.


    b) Dynamic analysis

    Dynamic analysis involves executing the Trojan horse dataset in the controlled lab and carefully watching its behavior and actions. All steps involved are illustrated in Figure 6.

    Figure 6: Dynamic Analysis

    Monitoring file activities: most Trojan horses read from or write to the file system. They might try to write files, alter existing programs, add new files or append themselves to the file system.

    Monitor process: Preview v3.7.3.1 is a tool used to monitor programs, files, registry keys and all the DLLs on the victim's computer.

    Automatic analysis (malware sandbox): a sandbox is a mechanism for analyzing untrusted files or programs in a system. It is an alternative way to analyze a binary file.
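    File activity monitoring can be approximated by comparing a snapshot of a directory taken before a sample runs with one taken afterwards. The sketch below illustrates that idea only; it is not how the tools named in this proposal work, and the function names `snapshot` and `diff_snapshots` are hypothetical.

```python
import hashlib
from pathlib import Path

def snapshot(root):
    """Map each file under `root` (by relative path) to a SHA-256 of its content."""
    state = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            key = str(path.relative_to(root))
            state[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return state

def diff_snapshots(before, after):
    """Report files added, removed, or modified between two snapshots."""
    common = before.keys() & after.keys()
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(name for name in common if before[name] != after[name]),
    }
```

    In use, `snapshot` would be called on a monitored directory inside the isolated virtual machine before and after execution, and `diff_snapshots` would summarize what changed.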

    [Figure 6 flowchart boxes: Start Dynamic Analysis; Monitoring File Activities; Monitoring Processes; Monitoring Network Activities; Monitoring Registry Access; Dynamic Analysis Finish]


    3.6.2 EXTRACTION AND CLASSIFICATION OF DATASET

    An ontology approach is used for data extraction and classification. Ontologies are knowledge representation frameworks that allow us to express knowledge in an explicit and expressive way using well-defined semantics. This process aims to extract the dataset by clustering and classifying the samples according to their characteristics and behaviors. The process of defining concepts in an ontology is also called categorization, which involves taking closely related terms and grouping the Trojans into concepts or categories (Saudi, 2008).

    a) Ontology Model

    The design proposed for this study presents a new Trojan classification model using ontology. The study introduces a novel structure for the domain ontology, comprising a domain layer, a category layer and a behavior layer.
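    One way to picture the three layers is as a nested structure in which the domain contains categories and each category contains behaviors. The sketch below is purely illustrative: the category names, the behavior names and the overlap-counting rule in `classify` are all hypothetical, since the proposal does not list them.

```python
# Hypothetical three-layer model: domain -> categories -> behaviors.
ONTOLOGY = {
    "domain": "Trojan",
    "categories": {
        "RemoteAccess": {"opens_listening_port", "accepts_remote_commands"},
        "InformationStealer": {"reads_credentials", "sends_data_out"},
        "Destructive": {"deletes_files", "modifies_registry"},
    },
}

def classify(observed, ontology=ONTOLOGY):
    """Return categories ordered by how many of their behaviors were observed."""
    observed = set(observed)
    scores = {}
    for category, behaviors in ontology["categories"].items():
        overlap = len(behaviors & observed)
        if overlap:
            scores[category] = overlap
    # Highest overlap first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))

print(classify(["reads_credentials", "sends_data_out", "deletes_files"]))
# ['InformationStealer', 'Destructive']
```

    A sample can fall into more than one category, so the function returns a ranked list rather than a single label.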

    b) OWL-based Trojan Behavioral Ontology

    The OWL-based Trojan behavioral ontology is built with Protégé and describes the behavioral ontology of Trojans.

    Figure 7: Ontology Model
    [Diagram labels: Domain = Trojan; Domain layer; Category layer; Behavior layer]


    3.6.3 POST PROCESSING

    This stage continues the data processing by interpreting the patterns in the data already extracted using the ontology approach. The data is transformed into useful information, also known as knowledge, and is then stored in the database.

    3.7 TESTING

    To verify and validate the proposed Trojan data transformation and Trojan classification, the
    result reports from the static and dynamic analysis are compared against, and verified with,
    the result report of the automated analysis (Cuckoo). This comparison is done manually.
    Throughout this research, all data will be extracted using the ontology and then tested. Once
    the three stages of data analysis (data processing, data extraction, and post processing) are
    complete, the new result will be tested. The data will be run many times through the ontology
    code to obtain multiple results. These results are then compared to establish their validity
    and to find the Trojan behaviors and characteristics with the highest percentage frequency of
    occurrence. The final result is then stored in the database.
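
    The comparison of repeated runs could be sketched as follows. This is a minimal illustration,
    assuming each run yields a set of observed behavior labels. The function name
    `behavior_frequency` and the labels are hypothetical, not taken from this study.

```python
from collections import Counter


def behavior_frequency(runs: list[set[str]]) -> dict[str, float]:
    """Return, for each behavior, the percentage of runs in which it occurs."""
    if not runs:
        return {}
    # Each run is a set, so a behavior is counted at most once per run.
    counts = Counter(behavior for run in runs for behavior in run)
    return {b: 100.0 * n / len(runs) for b, n in counts.items()}


# Hypothetical results of four runs over the same sample.
runs = [
    {"modifies_registry", "opens_port"},
    {"modifies_registry"},
    {"modifies_registry", "drops_file"},
    {"opens_port", "modifies_registry"},
]

freq = behavior_frequency(runs)
most_frequent = max(freq, key=freq.get)
print(most_frequent, freq[most_frequent])
```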

    4.0 EXPECTED OUTCOME

    This section discusses the expected outcomes of this research:

    1) A new Trojan classification is formed using an ontology approach.

    Using the ontology approach, a new Trojan classification is produced. The ontology model
    structure, which includes a domain layer, a category layer, and a behavior layer, is extracted
    using the OWL-based Trojan behavioral ontology. Protégé is used to build and describe the
    ontology of Trojan behavior.

    2) A repository of clean datasets is formed by the end of this research.

    To produce a new Trojan classification, the domain dataset has to go through the Standard
    Operating Procedure (SOP). The SOP, which includes static and dynamic analysis, cleans the
    dataset by removing all noise and duplicated data. Thus, a repository of clean datasets is
    produced by the end of this research.
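
    The duplicate-removal part of the cleaning step could be sketched as follows. This is a
    minimal illustration, assuming that duplicates are byte-identical samples identified by their
    SHA-256 digest. The text does not specify the SOP at this level of detail, and the function
    name `deduplicate` and the sample names are hypothetical.

```python
import hashlib


def deduplicate(samples: dict[str, bytes]) -> dict[str, bytes]:
    """Keep the first sample seen for each distinct SHA-256 digest."""
    seen: set[str] = set()
    unique: dict[str, bytes] = {}
    for name, content in samples.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = content
    return unique


# Hypothetical in-memory samples; real samples would be read from disk.
samples = {
    "a.bin": b"\x4d\x5a\x90",
    "b.bin": b"\x4d\x5a\x90",  # byte-identical to a.bin
    "c.bin": b"\x7fELF",
}
clean = deduplicate(samples)
print(sorted(clean))
```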

    5.0 CONCLUSION

    In the world of computer security, the malware threat cannot be neglected. Malware capabilities
    vary and are constantly updated, which makes its existence more difficult to detect. This
    proposal presented a Trojan data transformation using an ontology approach. The domain used in
    this research is the Trojan horse on the Windows platform. The research will be conducted in a
    controlled laboratory environment. The large dataset downloaded from VXHeaven is cleaned up
    using the SOP technique. The clean dataset is then used for Trojan classification using the
    ontology approach. By following the development processes of this study, the ontology approach
    can be expanded to solve more complex problems. Research on Trojan classification and detection
    techniques has been conducted to confront Trojan horse attacks, and this research is part of
    that effort. This Trojan data transformation can be used as a reference by other researchers to
    construct a better Trojan detection model, using either the same approach or a different one.

    REFERENCES:

    Malaysia Cyber Security (2013). Malaysian Computer Emergency Response Team. Available from:
    http://www.mycert.org.my/en/services/statistic/mycert/2013/main/detail/914/index.html
    (Accessed 11th December 2013).

    Mangatae, Aelphases (2006). Trojan White Paper [igniteds.NET]. Available from: http://igniteds.
    (Accessed October 2013).

    Hawkins, S., Yen, D. C. and Chou, D. C. (2000). Awareness and challenges of Internet security.
    Information Management & Computer Security, 8(3), 131-143.

    Saudi, M. M. and Jomhari, N. (2006). Knowledge structure on virus for user education. 2006
    International Conference on Computational Intelligence and Security, 1515-1518.

    Witten, I. H. and Frank, E. (2005). Data Mining: Practical Machine Learning Tools and
    Techniques. Morgan Kaufmann.

    Paganini, P. (2013). Impact of Cybercrime, 1st November 2013. Available from:
    www.infosecinstitute.com/2013-impact-cybercrime (Accessed 29th November 2013).

    Farrukh S., M. Ali Akhbar and Muddassar F. (2013). The Droid Knight: a Silent Guardian for the
    Android Kernel, Hunting for Rogue Smartphone Malware Applications.

    Areej M. A., Saudi M. M., Bachok M. T. and Zul H. A. (2013). An Efficient Trojan Classification
    (ETC). IJCSI International Journal of Computer Science Issues, Vol. 10, Issue 2, No 3.

    Mohd Saudi, Madihah (2011). A New Model for Worms Detection and Responds (electronic version).

    Babak, R., Maslin, B. and Suhaimi, I. (2011). Evolution of Computer Virus Concealment and
    AntiVirus Techniques: A Short Survey. International Journal of Computer Science Issues, 8(1).

    Nguyen, V. T., Kha, V. V. and Anh, A. P. (2012). Research Some Algorithm in Machine Learning
    and Artificial Immune System, Apply to Set Up A Virus Detection System. International Journal
    of Computer Science Issues, 9(4).

    Suraya S. O., Saudi M. M. and Zul H. A. (2013). Standard Operating Procedures (SOP) to build up
    Malware Dataset.

    Daconta, M. C., Obrst, L. J. and Smith, K. T. (2003). The Semantic Web. Wiley Publishing Inc.,
    Indianapolis, Indiana.

    McMullen, D., Holohan, E., Melia, M. and Pahl, C. (2005). Knowledge-driven Learning Technology
    Systems. 6th Annual Irish Educational Technology Users Conference, EdTech2005, ILTA

    Gruber, T. R. (1993). A translation approach to portable ontology specifications. Knowledge
    Acquisition, 5, 199-220.

    Benjamin, P., Menzel, C. and Mayer, R. J. (1995). Towards a method for acquiring CIM
    ontologies. International Journal of Computer Integrated Manufacturing, 8(3), 225-234.

    Azni A. H., Saudi M. M., Azreen A., Emran M. T. and Yamani M. I. I. (2008). An Efficient
    Network Security System through an Ontology Approach. IEEE Xplore 978-1-4244-3397-1.

    Broekstra, J., Kampman, A. and van Harmelen, F. (2002). Sesame: A generic architecture for
    storing and querying RDF and RDF Schema. In Proceedings of the 1st International Semantic Web
    Conference, Lecture Notes in Computer Science, Vol. 2342, pp. 54-68, Springer.

    Van Nguyen (2010). Command Control Communications and Intelligence Division. DSTO-TH-1002.

    Al-Saadoon, G. and Al-Bayatti, H. (2011). A Comparison of Trojan Horse Virus Behavior in Linux
    and Windows Operating Systems. World of Computer Science and Information Technology Journal,
    Vol. 1, No. 3, 56-62.

    Liu, Y., Zhang, L., Liang, J., Qu, S. and Ni, Z. (2010). Detecting Trojan horses based on
    system behavior using machine learning method. 2010 Machine Learning and Cybernetics
    Conference, IEEE, Vol. 2, 855-860.

    Dai, J., Guha, R. and Lee, J. (2009). Efficient Virus Detection Using Dynamic Instruction
    Sequences. Journal of Computers, Vol. 4, No. 5, pp. 405-414.

    Xun-yi Ren and Guui-bing Qian (2013). SPID-based Method of Trojan Horse Detection.
    International Conference on Information, Business and Education Technology (ICIBT 2013).

    Thimbleby, H., Anderson, S. and Cairns, P. (1998). A Framework for Modelling Trojans and
    Computer Virus Infection. Computer Journal, 41(7), 444-458.

    Huang, H., Chuang, T., Tsai, Y. and Lee, C. (2010). Ontology-based Intelligent System for
    Malware Behavioral Analysis. WCCI 2010 IEEE World Congress on Computational Intelligence,
    July 18-23, 2010, CCIB, Barcelona, Spain.

    Chandrasekaran, B., Josephson, J. R. and Benjamins, V. R. (1999). What are Ontologies, and Why
    Do We Need Them? IEEE Intelligent Systems 1004-7167, January/February 1999.

    Stibor, Thomas (2010). A Study of Detecting Computer Viruses in Real-Infected Files in the
    n-gram Representation with Machine Learning Methods (electronic version). URL:
    http://www.sec.in.turn.de/assets/staff/stiborr/iea.aie.final.extended.pdf
