a deep dive into dex file format--chiossi

  • 8/19/2019 A Deep Dive Into Dex File Format--chiossi

    Rodrigo Chiossi ABS 2014

     A deep dive into DEX file format

    Rodrigo Chiossi

    Bio

    ● Rodrigo Chiossi

     – Android Engineer @ Intel OTC

     – AndroidXRef

    ● www.androidxref.com

     – Dexterity

    ● https://github.com/rchiossi/dexterity

    Overview

    ● DEX File Structure

     – Characteristics

     – LEB128

     – Relative Indexing

     – MUTF8

     – The Big Header and the data.

    ● DEX Instrumentation

     – The String Add case

    ● DEX Limitations

     – Bitness restrictions

    DEX Structure

    DEX Properties

    ● Reduced Memory Footprint

     – LEB128 encoding

     – Relative Indexing

     – Single file for all classes (vs. 1 file per class in .class format)

     – No duplicate strings

    ● Modified UTF8 String Encoding

    ● Strict requirements for alignment

    ● Even more strict runtime verifier (DexOpt)

    LEB128

    ● Encoding format from DWARF3.

    ● Used to encode signed (SLEB128 and ULEB128p1) and unsigned (ULEB128) numbers.

    ● Used in DEX for encoding 32-bit numbers.

    ● Numbers are encoded using 1 to 5 bytes.

     – Depending on the highest '1'-bit
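
As a minimal sketch of the size rule above (the helper name `encode_uleb128` is illustrative and does not come from the talk): each output byte carries 7 payload bits and sets its top bit when more bytes follow, so a 32-bit value takes between 1 and 5 bytes depending on where its highest set bit sits.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as ULEB128 (illustrative helper)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits are the payload
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


# Encoded length for values on either side of each 7-bit boundary.
sizes = {v: len(encode_uleb128(v))
         for v in (0, 0x7F, 0x80, 0x3FFF, 0x4000, 0xFFFFFFFF)}
```

Here `sizes` maps 0 and 0x7F to 1 byte, 0x80 and 0x3FFF to 2 bytes, 0x4000 to 3 bytes, and 0xFFFFFFFF to 5 bytes.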

    LEB128 Example

    HEX     BIN                 SLEB128   ULEB128   ULEB128p1

    00      00000000                  0         0          -1

    01      00000001                  1         1           0

    7f      01111111                 -1       127         126

    80 7f   10000000 01111111      -128     16256       16255

    ● -1 is used to represent the NO_INDEX value.

    ● Encoded as ULEB128p1, NO_INDEX requires only one byte to be encoded.
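
The example rows can be checked with a small decoder. This is a sketch under the assumption of well-formed, non-empty input; the function names are illustrative, not taken from the talk or from any DEX tool.

```python
def decode_uleb128(data: bytes) -> int:
    """Decode an unsigned LEB128 value from the start of data."""
    result = 0
    for position, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * position)
        if not byte & 0x80:          # continuation bit clear: last byte
            break
    return result


def decode_sleb128(data: bytes) -> int:
    """Decode a signed LEB128 value from the start of data."""
    result = 0
    bits = 0
    for byte in data:
        result |= (byte & 0x7F) << bits
        bits += 7
        if not byte & 0x80:
            break
    # Sign-extend from the top payload bit of the last byte read.
    if result & (1 << (bits - 1)):
        result -= 1 << bits
    return result


def decode_uleb128p1(data: bytes) -> int:
    """ULEB128p1 stores value + 1, so a single 0x00 byte decodes to -1."""
    return decode_uleb128(data) - 1


rows = [bytes.fromhex(h) for h in ("00", "01", "7f", "807f")]
table = [(decode_sleb128(r), decode_uleb128(r), decode_uleb128p1(r)) for r in rows]
```

`table` evaluates to `[(0, 0, -1), (1, 1, 0), (-1, 127, 126), (-128, 16256, 16255)]`, one tuple per row in SLEB128, ULEB128, ULEB128p1 order.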

    Relative Indexing

    ● Many DEX ob

    Relative Indexing Example

    Field ID    Field Name

    ...

    1024        field_1

    1025        field_2

    ...

    1036        field_3

    ...

    ● Field List:

     – field_1, field_2, field_3

    ● Encoding:

     – 1024, 1, 11
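
The encoding in this example is a plain delta scheme: the first index is stored directly and every later entry is stored as the difference from its predecessor. A minimal sketch, with hypothetical helper names:

```python
from itertools import accumulate


def to_relative(indices):
    """Keep the first index absolute; store later ones as differences."""
    deltas = []
    previous = 0
    for index in indices:
        deltas.append(index - previous)
        previous = index
    return deltas


def from_relative(deltas):
    """Rebuild absolute indices with a running sum."""
    return list(accumulate(deltas))


encoded = to_relative([1024, 1025, 1036])
decoded = from_relative(encoded)
```

`encoded` is `[1024, 1, 11]`, matching the slide, and `decoded` restores `[1024, 1025, 1036]`. Small differences stay in a single LEB128 byte, which is where the footprint saving comes from.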

    Modified UTF8

    ● Used for encoding all strings in the DEX format.

    ● Characters may have 1, 2 or 3 bytes.

    ● Strings are terminated by a single null byte.

    ● When parsing string_data_item, the utf16_size field cannot be used to calculate the size of the following data as it only represents the number of characters in the MUTF8 string.

    ● ASCII strings are MUTF8 legal strings
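
A sketch of what the parsing note implies, assuming a `string_data_item` is laid out as a ULEB128 `utf16_size` followed by the MUTF8 bytes and a null terminator: the byte length has to be found by scanning for the terminator, not by reading `utf16_size`. The function names are illustrative.

```python
def read_uleb128(buf: bytes, pos: int):
    """Return (value, next_position) for the ULEB128 starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_string_data_item(buf: bytes, offset: int):
    """Return (utf16_size, raw_mutf8_bytes, offset_after_item)."""
    utf16_size, start = read_uleb128(buf, offset)
    # The string body holds no raw 0x00 byte, so the first one is the terminator.
    end = buf.index(0, start)
    return utf16_size, buf[start:end], end + 1


# U+00E9 is one character but two bytes (0xC3 0xA9), followed by ASCII "a".
item = bytes([2]) + b"\xc3\xa9" + b"a" + b"\x00"
utf16_size, raw, next_offset = read_string_data_item(item, 0)
```

Here `utf16_size` is 2 while `len(raw)` is 3, which is exactly the mismatch the slide warns about.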

    The Big Header

    ● Besides the header_item, we have six other structures that describe the DEX file:

     – string_id_item list

     – type_id_item list

     – proto_id_item list

     – field_id_item list

     – method_id_item list

     – class_def_item list

    ● These structures define all the functional content of the DEX file.
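
The header_item records where each of these six lists lives. The sketch below assumes the standard little-endian 0x70-byte header layout from the DEX format documentation (magic, checksum, signature, then twenty 32-bit fields); `parse_header` and `id_lists` are hypothetical helper names.

```python
import struct

# magic (8 bytes), checksum (u4), SHA-1 signature (20 bytes), then 20 u4 fields.
HEADER_FMT = "<8sI20s20I"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 0x70 bytes
HEADER_FIELDS = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off", "map_off",
    "string_ids_size", "string_ids_off", "type_ids_size", "type_ids_off",
    "proto_ids_size", "proto_ids_off", "field_ids_size", "field_ids_off",
    "method_ids_size", "method_ids_off", "class_defs_size", "class_defs_off",
    "data_size", "data_off",
)


def parse_header(buf: bytes) -> dict:
    """Unpack the header_item at the start of a DEX image."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("buffer is shorter than a DEX header")
    magic, checksum, signature, *rest = struct.unpack_from(HEADER_FMT, buf, 0)
    header = dict(zip(HEADER_FIELDS, rest))
    header.update(magic=magic, checksum=checksum, signature=signature)
    return header


def id_lists(header: dict) -> dict:
    """Map each of the six lists to its (entry count, file offset)."""
    names = ("string_ids", "type_ids", "proto_ids",
             "field_ids", "method_ids", "class_defs")
    return {n: (header[n + "_size"], header[n + "_off"]) for n in names}
```

Note that the `*_size` fields count entries, not bytes; the byte length of each list is the count multiplied by the fixed size of its item type.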

    The Map

    ● The DEX file may contain an optional structure called the Map, composed by map_item structures.

    ● The Map structure contains information about all the offsets in the file and what is the type of content in that offset.

    ● Although optional according to the file format specification, the existence and correctness of the map is enforced by DexOpt.
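
A sketch of reading the map, assuming the layout given in the DEX format documentation: a 32-bit entry count at `map_off`, followed by 12-byte map_item records holding a 16-bit type code, 16 unused bits, a 32-bit item count and a 32-bit file offset. `parse_map_list` is a hypothetical helper name.

```python
import struct

MAP_ITEM_FMT = "<HHII"                       # type, unused, size, offset
MAP_ITEM_SIZE = struct.calcsize(MAP_ITEM_FMT)  # 12 bytes


def parse_map_list(buf: bytes, map_off: int):
    """Return a list of (type_code, item_count, file_offset) tuples."""
    (count,) = struct.unpack_from("<I", buf, map_off)
    entries = []
    for i in range(count):
        position = map_off + 4 + MAP_ITEM_SIZE * i
        type_code, _unused, size, offset = struct.unpack_from(
            MAP_ITEM_FMT, buf, position)
        entries.append((type_code, size, offset))
    return entries
```

A checker in the spirit of DexOpt could then compare each entry against the counts and offsets in the header; that comparison is left out here.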

    The Data

    ● All the content of the DEX file not in the big header goes to the Data area.

    ● Offsets to structures in the data area must be bigger than the end of the big header. This property is enforced by DexOpt.

    ● It is ok to have gaps in the middle of the data section.

    ● The map is part of the data area.

    The Link Data

    ● Optional area at the end of the Data area.

    ● Format unspecified.

    ● Never present in normal APKs.

    DEX Instrumentation

    ● Case Study: String add

     – String manipulation is required for most obfuscation/deobfuscation techniques.

     – Can be extended for replacing and removing strings.

    ● *

    String Structure

    ● Represented by the pair (string_id_item, string_data_item)

    ● string_id_item list must be sorted

     – Sorted by the utf16 code points of the string

    ● Strings are referenced by their index position in the string_id_item list.

    Adding a string_id_item

    ● Must be added in the position of the list that will keep the list sorted.

    ● Header adjustments:

     – Data offset.

     – File size.

    ● Maps adjustments:

     – string_id_item map size.

    ● Entire file adjustments:

     – Offset references in the data area must be shifted by 4 bytes.

     – String references equal to or bigger than the added string must be increased by 1.
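
Two of these steps, finding the sorted position and renumbering existing string references, can be sketched as follows. The helper names are hypothetical, and comparing big-endian UTF-16 bytes is an assumption used here to get an ordering by UTF-16 values; this is not code from Dexterity.

```python
from bisect import bisect_left


def utf16_key(text: str) -> bytes:
    """Sort key that orders strings by their UTF-16 code unit values."""
    return text.encode("utf-16-be", "surrogatepass")


def insertion_index(sorted_strings, new_string) -> int:
    """Index at which new_string keeps the string list sorted."""
    keys = [utf16_key(s) for s in sorted_strings]
    return bisect_left(keys, utf16_key(new_string))


def remap_string_index(old_index: int, inserted_at: int) -> int:
    """References at or past the insertion point move up by one."""
    return old_index + 1 if old_index >= inserted_at else old_index


strings = ["<init>", "LFoo;", "V", "bar"]
position = insertion_index(strings, "Z")
```

With this sample list `position` is 3, so an existing reference to index 3 ("bar") becomes 4 while references to indices 0 through 2 are unchanged.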

    LEB128 Expansion

    ● Some offsets are encoded as ULEB128.

     – E.g. code_off inside encoded_method object.

    ● Some string_id_item references are encoded as ULEB128.

     – E.g. name_idx inside annotation_element object.

    ● After shifting offsets or increasing string_id_item references, the size of the LEB128 in bytes may increase.

    ● If the expansion occurs, further shifting of offsets is needed in the file.

    ● Maps size and offset must be updated.
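
A short sketch of how the expansion can be detected: compare the encoded size of a value before and after the adjustment. Both function names are illustrative.

```python
def uleb128_size(value: int) -> int:
    """Number of bytes needed to encode value as ULEB128."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def growth_after_shift(value: int, shift: int) -> int:
    """Extra bytes needed once value has been increased by shift."""
    return uleb128_size(value + shift) - uleb128_size(value)
```

For instance, `growth_after_shift(0x3FFE, 4)` is 1 because 0x4002 no longer fits in two bytes, while `growth_after_shift(0x100, 4)` is 0. Since every expansion moves later data again, an instrumentation tool would need to repeat the check until no encoded value grows any further.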

     Alignment

    ● Some structures in the DEX file must be 4-byte aligned.

     – E.g., code_item.

    ● string_id_item is 4 bytes in size, so adding a new object will not misalign the DEX.

    ● LEB128 expansion will often add 1 byte of shifting, which will break alignment.

    ● If realignment is required, offset references must be updated.

    ● Maps size and offset must be updated.
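
The realignment step reduces to rounding an offset up to the next multiple of 4. A minimal sketch with hypothetical helper names, assuming the alignment is a power of two:

```python
def align_up(offset: int, alignment: int = 4) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def padding_needed(offset: int, alignment: int = 4) -> int:
    """Padding bytes to insert so an item at offset becomes aligned."""
    return align_up(offset, alignment) - offset
```

A code_item pushed from 0x1000 to 0x1001 by a one-byte LEB128 expansion needs `padding_needed(0x1001)`, that is 3 bytes, to land on 0x1004, and that padding in turn shifts every offset that follows it.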

    Adding a string_data_item

    ● Must be inside the data area.

    ● Header adjustments:

     – Data size.

     – File size.

    ● Maps adjustments:

     – string_data_item map size.

    ● Entire file adjustments:

     – Offset references after the offset of the new string_data_item must be shifted by the size of the added object.

     – String references equal to or bigger than the added string must be increased by 1.

    ● Check for LEB128 expansion and apply shifting.

    ● Check for alignment and apply shifting.
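
A sketch of two pieces of this procedure, restricted to ASCII strings so that the MUTF8 bytes equal the ASCII bytes and `utf16_size` equals the string length. The helper names are hypothetical, and `shift_offset` treats anything that used to start at or beyond the insertion point as moved.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as ULEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def build_string_data_item(text: str) -> bytes:
    """Build utf16_size + data + terminator for an ASCII-only string."""
    data = text.encode("ascii")      # raises UnicodeEncodeError for non-ASCII
    if b"\x00" in data:
        raise ValueError("a raw null byte is not allowed inside the string")
    return encode_uleb128(len(text)) + data + b"\x00"


def shift_offset(offset: int, inserted_at: int, inserted_size: int) -> int:
    """New value for a file offset after inserting bytes at inserted_at."""
    return offset + inserted_size if offset >= inserted_at else offset


item = build_string_data_item("abc")
```

`item` is the 5-byte sequence `03 61 62 63 00`, so inserting it at 0x100 turns an existing offset of 0x100 into 0x105 and leaves 0xFF alone. The LEB128 expansion and alignment checks from the last two bullets still have to run afterwards.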

    DEX Bit Restrictions

    ● 32 bits encoding

     – Static fields with fixed 32 bit size (e.g. string_id_item).

     – Offsets expected to be within 32 bit range.

    ● Less than 32 bits encoding

     – Class, type, proto and other lists alike are limited to 16 bits in size.
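
One place where the 16-bit restriction shows up is method_id_item, which, per the DEX format documentation, stores its class and prototype references as 16-bit indices and its name as a 32-bit string index. A sketch with a hypothetical helper name:

```python
import struct

U16_MAX = 0xFFFF


def pack_method_id_item(class_idx: int, proto_idx: int, name_idx: int) -> bytes:
    """Pack class_idx (u2), proto_idx (u2) and name_idx (u4), little-endian."""
    if class_idx > U16_MAX or proto_idx > U16_MAX:
        raise ValueError("type/proto index does not fit in 16 bits")
    return struct.pack("<HHI", class_idx, proto_idx, name_idx)
```

Each packed item is 8 bytes long, and a type or prototype index of 0x10000 or more cannot be represented at all, whereas the string index has the full 32-bit range.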

    Rodrigo Chiossi

    r.chiossi