pdf secrets - hiding & revealing secrets in pdf documents

129
2014/05/17 secrets PDF hiding & revealing secrets in PDF documents Mannheim Germany RaumZeitLabor Ange Albertini CTF PDF stegano 101

Upload: ange-albertini

Post on 15-Jan-2015

11.109 views

Category:

Technology


3 download

DESCRIPTION

recording of this talk: https://www.youtube.com/watch?v=JQrBgVRgqtc

TRANSCRIPT

Page 1: PDF secrets - hiding & revealing secrets in PDF documents

2014/05/17

secretsPDF

hiding & revealing secrets in PDF documents

Mannheim GermanyRaumZeitLabor

Ange Albertini CTF PDF stegano 101

Page 2: PDF secrets - hiding & revealing secrets in PDF documents

reverse engineering&

visual documentations

corkami.com

Page 3: PDF secrets - hiding & revealing secrets in PDF documents

The problem

You need to remove sensitive elementsof a PDF document for public release

are they actually removed ?can someone reveal your secrets ?

Page 4: PDF secrets - hiding & revealing secrets in PDF documents

https://www.schneier.com/blog/archives/2005/05/pdf_radacting_f.html

It’s not a new fact

Page 5: PDF secrets - hiding & revealing secrets in PDF documents

http://download.repubblica.it/pdf/rapportousacalipari.pdfseen in its metadata: “EmailSubject (Another Redact Job For You)”

You just need to:1. uncompress the PDF2. remove all “ re\n” occurences

(“re” = rectangle operator)

There are plenty of real

examples

Page 6: PDF secrets - hiding & revealing secrets in PDF documents

AFAIK

the topic wasn’t reallycovered technically

Page 7: PDF secrets - hiding & revealing secrets in PDF documents

The reverse problem

You need to carry a sensible PDF,or exfiltrate some information:

Can you convincingly pretendthat it was a mistake,

and yet easily re-enable the contents?

Page 8: PDF secrets - hiding & revealing secrets in PDF documents

...and, more importantly...

it still makes it an interesting exerciseto learn and experiment with PDF internals ☺

...and it might also be useful for a CTF steganography challenge...

Page 9: PDF secrets - hiding & revealing secrets in PDF documents

it’s about hidingparts of the PDF document

not hiding data in a PDF file+ nothing reader-specific

Page 10: PDF secrets - hiding & revealing secrets in PDF documents

General outline of this talk

3 relatively independent parts:1. a non-technical approach2. a basic introduction to the PDF file format3. a technical perspective

Page 11: PDF secrets - hiding & revealing secrets in PDF documents

a non-technical approachPart I / III

Page 12: PDF secrets - hiding & revealing secrets in PDF documents

What about that NSA doc ?

there is an NSA document on the topic.worth a read, but Adobe Acrobat (Pro) only

http://www.nsa.gov/ia/_files/app/pdf_risks.pdf

Page 13: PDF secrets - hiding & revealing secrets in PDF documents

Preamble

this presentation has a lot of hands-on examples,that you can find at:

http://pdf.corkami.com

Page 14: PDF secrets - hiding & revealing secrets in PDF documents

Outline

1. the problem (introduction)2. outline

a. see Google “recursion” ☺3. examples

a. colori. forgotten text

b. overlapped textc. secured documents

i. bypassing securityd. overlapped image

i. extracting image4. Conclusion

Page 15: PDF secrets - hiding & revealing secrets in PDF documents

So, you tried to hide elements in a PDF...

Page 16: PDF secrets - hiding & revealing secrets in PDF documents

“well, I don’t see them anymore”

try with the next slide:nothing is visible… and yet...

1. “Select All” text with your favorite PDF viewer2. Copy and paste in a text editor

You can’t work on these slides via online viewing like SlideShare:it displays rendered pictures of the PDF file, not the file itself→ try in a dedicated viewer with the PDF file itself

Page 17: PDF secrets - hiding & revealing secrets in PDF documents

Example: color

hidden viawhite color

Page 18: PDF secrets - hiding & revealing secrets in PDF documents

hint

Page 19: PDF secrets - hiding & revealing secrets in PDF documents

It worked, right?

you can’t see the text,but it’s still on the page

→ the software can select it

Example: colorhidden viawhite color

Page 20: PDF secrets - hiding & revealing secrets in PDF documents

Btw...

this can lead to unexpected results, so be careful before publishing slides,even if you think you have nothing to remove

try with next slide ☺

Page 21: PDF secrets - hiding & revealing secrets in PDF documents

HyperVortex 1.0a publication software

Roberto Martinez

title

authors

insert stupid footer here -- LaTeX sucks!!!

god, I hate making slides!!!

Example: forgotten text

Page 22: PDF secrets - hiding & revealing secrets in PDF documents

Oops

maybe it wasn’t a secret to be removed,but’s still there!

put extra hidden content for easier indexing

god, I hate making slides!!!Example: forgotten textHyperVortex 1.0a publication softwaretitleRoberto Martinezauthorsinsert stupid footer here -- LaTeX sucks!!!

Page 23: PDF secrets - hiding & revealing secrets in PDF documents

Another try

Try to get the secret from the next slide,with the same copy-paste trick...

Page 24: PDF secrets - hiding & revealing secrets in PDF documents

Example: overlapped text

hidden viaoverlapping shapeCONFIDENTIAL

Page 25: PDF secrets - hiding & revealing secrets in PDF documents

Once again...

the text is behind the “CONFIDENTIAL” shape,but it’s still there!the software selects everything(not only the front layer)

Example: overlapped textCONFIDENTIALhidden viaoverlapping shape

Page 26: PDF secrets - hiding & revealing secrets in PDF documents

pdftotext does it for youinstantly see which text is still hidden

Better than “Select all”

D:\>pdftotext -layout -l 1 "PDF Secrets.pdf"Syntax Warning (631): Badly formatted number

D:\>_

PDF secrets hiding & revealing secrets in PDF documents

Page 27: PDF secrets - hiding & revealing secrets in PDF documents

But PDF can prevent that?

● yes, in theory● but the text is still there, and decrypted→ it can be circumvented

Page 28: PDF secrets - hiding & revealing secrets in PDF documents

either:● some readers just ignore it

○ like Evince● generate a new file out of the original one

○ print PDF as PDF(not 100% compatible, but fast and usually works)

○ decrypt

Bypassing copy/paste protection

D:\>qpdf -decrypt protected.pdf unprotected.pdf

D:\>_

Page 29: PDF secrets - hiding & revealing secrets in PDF documents

1. open in chrome2. print

Page 30: PDF secrets - hiding & revealing secrets in PDF documents

1. change printer as “Save as PDF”2. Save

Page 31: PDF secrets - hiding & revealing secrets in PDF documents

final document looks identicalnot (SECURED) anymore

Page 32: PDF secrets - hiding & revealing secrets in PDF documents

● sometimes, text can be copied,but it comes as corrupted

● it’s not protection, just incompatibility

→ try with another reader

● it could be abused○ but it’s not easy to implement○ and it’s still easy to recover content

(it’s just a substitution cipher)

Copy/paste corruption

Page 33: PDF secrets - hiding & revealing secrets in PDF documents

copy/paste weirdness

Page 34: PDF secrets - hiding & revealing secrets in PDF documents

Ok, a last one

is it hopeless?

try this one...

Page 35: PDF secrets - hiding & revealing secrets in PDF documents

Example: overlapped image

SECRET

Page 36: PDF secrets - hiding & revealing secrets in PDF documents

Failure?

the secret behind the shape is a picture:→ it’s not copied as text by standard software

(common softwares don’t copy pictures)

Example: overlapped imageSECRET

Page 37: PDF secrets - hiding & revealing secrets in PDF documents

Does it means we’re safe?

No:the image is still present in the PDF document.→ it’s trivial to extract it with a standard tool

Example:use PDFImages (or mutool)

Page 38: PDF secrets - hiding & revealing secrets in PDF documents

extracting our secret image directly from the file

D:\>pdfimages -f 32 -l 32 "PDF Secrets.pdf" .

D:\>_ D:\>mutool extract "PDF Secrets.pdf"extracting image img-0015.pngextracting image img-0016.png...

Page 39: PDF secrets - hiding & revealing secrets in PDF documents

Conclusionon Part I / III

Page 40: PDF secrets - hiding & revealing secrets in PDF documents

text can be copiedimages can be extracted

Page 41: PDF secrets - hiding & revealing secrets in PDF documents

the “Select All” trick often works,but not always

Page 42: PDF secrets - hiding & revealing secrets in PDF documents

even if “Select All” does not work,secrets may still be recovered

Page 43: PDF secrets - hiding & revealing secrets in PDF documents

but there aremore advanced tricks!

→ need to study PDF internals

Page 44: PDF secrets - hiding & revealing secrets in PDF documents

PDF 101basics of the PDF file format

Part II / III

Page 45: PDF secrets - hiding & revealing secrets in PDF documents

My poster on the PDF format (free to print, reuse…) http://pics.corkami.comto order a print: http://prints.corkami.com

Page 46: PDF secrets - hiding & revealing secrets in PDF documents

A simple examplehelloworld.pdf

reminder: this is simplified, PDF is actually much more complex

Page 47: PDF secrets - hiding & revealing secrets in PDF documents

==

Page 48: PDF secrets - hiding & revealing secrets in PDF documents

binary stream →

(text)

(text)

Page 49: PDF secrets - hiding & revealing secrets in PDF documents

A PDF file is

● text-based○ white-space tolerant

● with binary streams

→ it can be explored with a decent text editor

if you need one, try Notepad++http://notepad-plus-plus.org/

Page 50: PDF secrets - hiding & revealing secrets in PDF documents

Recommended environment

● text editor● Sumatra

○ single-file viewer○ updates on the fly

● a tool to decompress streams○ (explanations later)

● check mistakes with qpdf --check or pdfinfo

Page 51: PDF secrets - hiding & revealing secrets in PDF documents

editing and viewing the changes on the fly

Page 52: PDF secrets - hiding & revealing secrets in PDF documents

A PDF structure

1. header○ signature

2. body○ objects

3. cross-reference table4. trailer5. xref pointer6. end of file signature

Page 53: PDF secrets - hiding & revealing secrets in PDF documents

1. PDF signature○ %PDF-1.0 - %PDF-1.7

2. charset identifier○ not required○ tells tools it’s not ASCII○ 4 non-ASCII chars in a

comment

Signature

Page 54: PDF secrets - hiding & revealing secrets in PDF documents

made of objects● <number> <generation> obj

<content>endobj

Body

Page 55: PDF secrets - hiding & revealing secrets in PDF documents

Xref

● table● offsets of each objectxref

0 5 5 objects, starting at 0

0000000000 65535 f obj #0: always null

0000000016 00000 n obj #1: offset 16

0000000051 00000 n obj #2: offset 51

0000000111 00000 n …0000000283 00000 n

● each line = 20 chars○ space before CR

Page 56: PDF secrets - hiding & revealing secrets in PDF documents

Trailer 1/2

● structurea. “trailer”b. object-like content

● defines the “root” object○ /Size = #(xref elements)

Page 57: PDF secrets - hiding & revealing secrets in PDF documents

Trailer 2/2

1. pointer to xrefa. “startxref”b. offset to xref

■ (decimal)2. End Of File marker

a. %%EOF

Page 58: PDF secrets - hiding & revealing secrets in PDF documents

Basic typesnames, strings, dictionaries...

Page 59: PDF secrets - hiding & revealing secrets in PDF documents

● (string)● <hex>● %comment until line return

● some others, less-used types(PDF is quite f*cked up)

Literals

Page 60: PDF secrets - hiding & revealing secrets in PDF documents

equivalent files

Page 61: PDF secrets - hiding & revealing secrets in PDF documents

points● <object> <generation> Rwith● the actual contents of the

object

some object CAN’T be inlined

<generation> is very rarely non-null

Object reference

Page 62: PDF secrets - hiding & revealing secrets in PDF documents

57

Object reference - example 1

354 0 R

354 0 obj

57

endobj

2 equivalent examples via object reference

Page 63: PDF secrets - hiding & revealing secrets in PDF documents

Object reference syntax

it’s odd, but critical to understand● 3 0 1 ⇒ 3 elements (3 numbers):

a. 3b. 0c. 1

● 3 0 R ⇒ 1 element:a. reference to “3 0”

■ object 3■ generation 0

Other PDF syntax rules follow common-sense

Page 64: PDF secrets - hiding & revealing secrets in PDF documents

● “reserved keywords”○ like symbols in Ruby

● starts with /○ /Pages , /Kids …

● case sensitive○ CamelCase by default○ undefined names are ignored

⇒/pages != /Pages(useful to disable tags)

Name objects

Page 65: PDF secrets - hiding & revealing secrets in PDF documents

Syntax● [ <values>* ]

Examples:● [3 0 R] = 1 value

a. “3 0 R”● [0 0 612 792] = 4 values

a. 0b. 0c. 612d. 792

Array

Page 66: PDF secrets - hiding & revealing secrets in PDF documents

Syntax:● << [<name> <value>]* >>

Object 1 sets:1. /Pages to “2 0 R”Object 2 sets:1. /Kids to “[3 0 R]”2. /Count to “1”3. /Type to /Pages

Dictionaries

Page 67: PDF secrets - hiding & revealing secrets in PDF documents

/Pages 2 0 R

is “equivalent” to/Pages <<

/Kids [3 0 R]

/Count 1

/Type /Pages

>>

and then ”3 0 R“ is replaced too...

Object reference - example 2

Page 68: PDF secrets - hiding & revealing secrets in PDF documents

Binary streamsparameters, filters...

Page 69: PDF secrets - hiding & revealing secrets in PDF documents

syntax:1. usual object declaration

2. parameters dictionary3. stream

+ return character

4. stream data5. endstream

+ return character6. usual endobjstream data is not interpreted(at object level)

Streams

Page 70: PDF secrets - hiding & revealing secrets in PDF documents

object 4● stream parameters

○ /Filter = /FlateDecode○ /Length = 57

● stream content (binary)xœsáRPÐw3T044²BÒ€„¡�‚‰��BH�-á‘š““¯ž_”“¢¨©’Åå !0×

Example

Page 71: PDF secrets - hiding & revealing secrets in PDF documents

Binary streams

● can be stored with different encodings○ /Filter○ encodings can be cascaded

● content is decoded● after each filter

only the final data matters

Page 72: PDF secrets - hiding & revealing secrets in PDF documents

Streams don’t enforce encodings

as long as the result is correctonce decoded by the filters

Page 73: PDF secrets - hiding & revealing secrets in PDF documents

<< /Length 53 >>

stream

BT

/F1 110 Tf

10 400 Td

(Hello World!) Tj

ET

endstream

<< /Filter /FlateDecode

/Length 57 >>

stream

xœsáRPÐw3T044 ²BÒ€„¡ �‚‰��BH

�-á‘š““¯ž_”“¢¨©’Åå !0×

endstream

these 2 streams are equivalent,just using a different encoding

Page 74: PDF secrets - hiding & revealing secrets in PDF documents

<< /Filter

[/ASCIIHexDecode /FlateDecode]

/Length 170 >>

stream78 9C 73 0A E1 52 50 D0 77 33 54 30 34

34 00 B2 42 D2 80 84 A1 81 82 89 81 81

42 48 0A 90 AD E1 91 9A 93 93 AF 10 9E

5F 94 93 A2 A8 A9 10 92 C5 E5 1A C2 05

00 21 30 0B D7

endstream

<< /Filter /FlateDecode

/Length 57 >>

stream

xœsáRPÐw3T044 ²BÒ€„¡ �‚‰��BH

�-á‘š““¯ž_”“¢¨©’Åå !0×

endstream

/ASCIIHexDecode willdecode ASCII Hex to binary

Page 75: PDF secrets - hiding & revealing secrets in PDF documents

Main filters

● <none>: direct raw binary in the file● /FlateDecode : ZIP’s deflate decompression

→ smaller● /ASCIIHexDecode: turns hex into binary

○ 41 0A ⇒ “A\n”

→ easy text editing (but binary is very common)mutool has a specific option for that

Page 76: PDF secrets - hiding & revealing secrets in PDF documents

Images● /DCTDecode to store JPEG files directly

○ not just the data, even the header!● JPEG2000, Fax

Encryption● Crypt

○ RC4 or AES

Other filters

Page 77: PDF secrets - hiding & revealing secrets in PDF documents

Let’s put it all togetherhow is the file actually parsed?

Page 78: PDF secrets - hiding & revealing secrets in PDF documents

Parsing 1/7

1. Signature is checked

%PDF-1.1%âãÏÓ

1 0 obj<< /Pages 2 0 R >>endobj

2 0 obj<< /Kids [3 0 R] /Type /Pages /Count 1 >>endobj

3 0 obj<< /Parent 2 0 R /MediaBox [0 0 612 792]/Resources << /Font << /F1 <</BaseFont /Arial /Subtype /Type1 /Type /Font>>>> >> /Contents 4 0 R /Type /Page >>endobj

4 0 obj<< /Length 53 >>streamBT /F1 110 Tf 10 400 Td (Hello World!) TjETendstreamendobj

xref0 50000000000 65535 f 0000000016 00000 n 0000000051 00000 n 0000000109 00000 n 0000000281 00000 n

trailer << /Root 1 0 R /Size 5 >>

startxref384%%EOF

Page 79: PDF secrets - hiding & revealing secrets in PDF documents

Parsing 2/7

2. %%EOF is located

%PDF-1.1%âãÏÓ

1 0 obj<< /Pages 2 0 R >>endobj

2 0 obj<< /Kids [3 0 R] /Type /Pages /Count 1 >>endobj

3 0 obj<< /Parent 2 0 R /MediaBox [0 0 612 792]/Resources << /Font << /F1 <</BaseFont /Arial /Subtype /Type1 /Type /Font>>>> >> /Contents 4 0 R /Type /Page >>endobj

4 0 obj<< /Length 53 >>streamBT /F1 110 Tf 10 400 Td (Hello World!) TjETendstreamendobj

xref0 50000000000 65535 f 0000000016 00000 n 0000000051 00000 n 0000000109 00000 n 0000000281 00000 n

trailer << /Root 1 0 R /Size 5 >>

startxref384%%EOF

Page 80: PDF secrets - hiding & revealing secrets in PDF documents

Parsing 3/7

3. xref is located via startxref

%PDF-1.1%âãÏÓ

1 0 obj<< /Pages 2 0 R >>endobj

2 0 obj<< /Kids [3 0 R] /Type /Pages /Count 1 >>endobj

3 0 obj<< /Parent 2 0 R /MediaBox [0 0 612 792]/Resources << /Font << /F1 <</BaseFont /Arial /Subtype /Type1 /Type /Font>>>> >> /Contents 4 0 R /Type /Page >>endobj

4 0 obj<< /Length 53 >>streamBT /F1 110 Tf 10 400 Td (Hello World!) TjETendstreamendobj

xref0 50000000000 65535 f 0000000016 00000 n 0000000051 00000 n 0000000109 00000 n 0000000281 00000 n

trailer << /Root 1 0 R /Size 5 >>

startxref384%%EOF

Page 81: PDF secrets - hiding & revealing secrets in PDF documents

Parsing 4/7

4. xref gives offsets of each objects

%PDF-1.1%âãÏÓ

1 0 obj<< /Pages 2 0 R >>endobj

2 0 obj<< /Kids [3 0 R] /Type /Pages /Count 1 >>endobj

3 0 obj<< /Parent 2 0 R /MediaBox [0 0 612 792]/Resources << /Font << /F1 <</BaseFont /Arial /Subtype /Type1 /Type /Font>>>> >> /Contents 4 0 R /Type /Page >>endobj

4 0 obj<< /Length 53 >>streamBT /F1 110 Tf 10 400 Td (Hello World!) TjETendstreamendobj

xref0 50000000000 65535 f 0000000016 00000 n 0000000051 00000 n 0000000109 00000 n 0000000281 00000 n

trailer << /Root 1 0 R /Size 5 >>

startxref384%%EOF

Page 82: PDF secrets - hiding & revealing secrets in PDF documents

Parsing 5/7

5. trailer is parsed→ gives /Root object

%PDF-1.1%âãÏÓ

1 0 obj<< /Pages 2 0 R >>endobj

2 0 obj<< /Kids [3 0 R] /Type /Pages /Count 1 >>endobj

3 0 obj<< /Parent 2 0 R /MediaBox [0 0 612 792]/Resources << /Font << /F1 <</BaseFont /Arial /Subtype /Type1 /Type /Font>>>> >> /Contents 4 0 R /Type /Page >>endobj

4 0 obj<< /Length 53 >>streamBT /F1 110 Tf 10 400 Td (Hello World!) TjETendstreamendobj

xref0 50000000000 65535 f 0000000016 00000 n 0000000051 00000 n 0000000109 00000 n 0000000281 00000 n

trailer << /Root 1 0 R /Size 5 >>

startxref384%%EOF

Page 83: PDF secrets - hiding & revealing secrets in PDF documents

Parsing 6/7

6. objects are parseda. /Root object contains /Pagesb. /Pages contains page array

■ /Kidsc. each /Page has:

■ size: /MediaBox■ /Contents

● as stream object■ /Resources

● define /Font dictionary

%PDF-1.1%âãÏÓ

1 0 obj<< /Pages 2 0 R >>endobj

2 0 obj<< /Kids [3 0 R] /Type /Pages /Count 1 >>endobj

3 0 obj<< /Parent 2 0 R /MediaBox [0 0 612 792]/Resources << /Font << /F1 <</BaseFont /Arial /Subtype /Type1 /Type /Font>>>> >> /Contents 4 0 R /Type /Page >>endobj

4 0 obj<< /Length 53 >>streamBT /F1 110 Tf 10 400 Td (Hello World!) TjETendstreamendobj

xref0 50000000000 65535 f 0000000016 00000 n 0000000051 00000 n 0000000109 00000 n 0000000281 00000 n

trailer << /Root 1 0 R /Size 5 >>

startxref384%%EOF

Page 84: PDF secrets - hiding & revealing secrets in PDF documents

7. the page is rendereda. BT BeginTextb. <name> <size> Tf select fontc. <x> <y> Td move cursord. <string> Tj display stringe. ET EndText

Parsing 7/7

%PDF-1.1%âãÏÓ

1 0 obj<< /Pages 2 0 R >>endobj

2 0 obj<< /Kids [3 0 R] /Type /Pages /Count 1 >>endobj

3 0 obj<< /Parent 2 0 R /MediaBox [0 0 612 792]/Resources << /Font << /F1 <</BaseFont /Arial /Subtype /Type1 /Type /Font>>>> >> /Contents 4 0 R /Type /Page >>endobj

4 0 obj<< /Length 53 >>streamBT /F1 110 Tf 10 400 Td (Hello World!) TjETendstreamendobj

xref0 50000000000 65535 f 0000000016 00000 n 0000000051 00000 n 0000000109 00000 n 0000000281 00000 n

trailer << /Root 1 0 R /Size 5 >>

startxref384%%EOF

BT /F1 110 Tf 10 400 Td (Hello World!) TjET

Page 85: PDF secrets - hiding & revealing secrets in PDF documents

In practice

● that was the ‘strict’ minimum● a typical PDF embeds more information

● fonts● fonts encoding● metadata● …

a generated Hello World typically weights >5 Kb

Page 86: PDF secrets - hiding & revealing secrets in PDF documents

In practice - in the malware world

● most readers accept malformed files○ many elements missing

■ EOF, startxref, xref, /Length, endobj, endstream■ /MediaBox /Font

● each reader has its own weirdness○ see my “Schizophrens” talks and PoCs

● so much for the so-called “standard”

Page 87: PDF secrets - hiding & revealing secrets in PDF documents

%PDF-\01 0 obj<</Kids[<</Parent 1 0 R/Contents[2 0 R]>>]/Resources<<>>>>2 0 obj<<>>stream\nBT/F1 105 Tf 0 400 Td(Hello Adobe!)Tj ETendstream\nendobj\ntrailer<</Root<</Pages 1 0 R>>>>

a “Hello World” for Adobe, in 179 bytes

Page 88: PDF secrets - hiding & revealing secrets in PDF documents

Conclusion

we’ve covered the basics of:● file structure● objects relation● file parsing● page rendering

→ enough to play with PDF internals!

Page 89: PDF secrets - hiding & revealing secrets in PDF documents

A technical perspectivePart III / III

Wkroczyłem do sektora smoków

Page 90: PDF secrets - hiding & revealing secrets in PDF documents

Isn’t copy/paste enough?

● why not editing the file itself ?and restoring the secrets perfectly?

want to hide something?● create your own methods!

Page 91: PDF secrets - hiding & revealing secrets in PDF documents

Easy PDF editing

1. decompress streams○ PDFTk , qpdf○ optional: use ASCIIHex to get an ASCII-only file

2. open in text editor3. view results via Sumatra

overwrite, or comment (don’t delete)⇒ no offset to adjust D:\>pdftk "PDF Secrets.pdf" output uncompressed.pdf uncompress

D:\>qpdf --qdf "PDF Secrets.pdf" uncompressed.pdf

Page 92: PDF secrets - hiding & revealing secrets in PDF documents

Reminder

technically speaking, a PDF page is:1. a stream object2. as the /Contents of a /Type /Page object3. in the /Kids array of a /Type /Pages object4. as the value of /Pages in root object5. as the value of /Root in the trailer

and a text on the page is a simple (string) Tj

Page 93: PDF secrets - hiding & revealing secrets in PDF documents

Remove a page ?

easy hiding1. remove reference from /Kids2. write it back later

Page 94: PDF secrets - hiding & revealing secrets in PDF documents

locate the /Kids array

Page 95: PDF secrets - hiding & revealing secrets in PDF documents

Edit out your page’s reference

Page 96: PDF secrets - hiding & revealing secrets in PDF documents

and don’t forget to update the pages’ /Count ☺(may lead to funny results)

Page 97: PDF secrets - hiding & revealing secrets in PDF documents

● tools such as PDFtk can operate on pages○

but:● they don’t erase pages!

○ they extract the other pages

→ the whole page is lost

but the image contents (as objects) are still left!and extractable!!

Erasing a page with a tool

D:\>pdftk "PDF Secrets.pdf" cat 1-3 5-end output no4.pdf

Page 98: PDF secrets - hiding & revealing secrets in PDF documents

Erase overlapping element?

● remove paint/text operators from binary stream

Hint:overlapping elements might beat the end of the stream,as they were likely added last

Page 99: PDF secrets - hiding & revealing secrets in PDF documents

paint operators(PDF 32000-1:2008, page 135)

Page 100: PDF secrets - hiding & revealing secrets in PDF documents

text showing operators(PDF 32000-1:2008, page 250-251)

Page 101: PDF secrets - hiding & revealing secrets in PDF documents

Example:manually remove

overlapping elements

Page 102: PDF secrets - hiding & revealing secrets in PDF documents

take the uncompressed PDFlocate the /Contents stream object

locate the S (Stroke path)(you can search for \nS\n)

Page 103: PDF secrets - hiding & revealing secrets in PDF documents

erase the S⇒ no more black border

Page 104: PDF secrets - hiding & revealing secrets in PDF documents

locate the f (path Filling)

Page 105: PDF secrets - hiding & revealing secrets in PDF documents

⇒ no more gray surface

Page 106: PDF secrets - hiding & revealing secrets in PDF documents

and the “obvious” Tj after the string (...)Note: the letters are different, due to the font mapping

&→C, 2→O, 1→N...

Page 107: PDF secrets - hiding & revealing secrets in PDF documents

→ no more hidden elements!

bonus: the operation can be easily automated!(on all pages, etc…)

Page 108: PDF secrets - hiding & revealing secrets in PDF documents

Page size tricks

● a page isn’t just a /MediaBox :(○ PDF is not so simple!

■ CropBox/BleedBox/TrimBox/ArtBox/...

● What you see is /CropBox○ Copy/Paste and (some) pdftotext respect that

⇒ what is in Mediabox (but not CropBox)is not extracted

Page 109: PDF secrets - hiding & revealing secrets in PDF documents

disable /CropBox to see the full contents

Page 110: PDF secrets - hiding & revealing secrets in PDF documents

OS-X actually does a /CropBox when you copy/paste out of a PDF,and you can see the full original content by rotating the page.

Page 111: PDF secrets - hiding & revealing secrets in PDF documents

Hidden text

● White color○ 1 1 1 rg (filling’s color)

● text rendering mode○ 3 Tr = invisible

■ OCRs use it to store text

Page 112: PDF secrets - hiding & revealing secrets in PDF documents

A more ‘deniable’ hiding

altering /Kids or the page’s /Contents work,

but there is another elegant solution:incremental updates

Page 113: PDF secrets - hiding & revealing secrets in PDF documents

PDF incremental updates

● not commonly used○ required for signing

● but still supported by readers

the concept: add another set of objects, xref, trailer, …to update the objects’ hierarchy

Page 114: PDF secrets - hiding & revealing secrets in PDF documents

Example

a confidential objectwith a secret stream object 4to be hidden

%PDF-1.1%âãÏÓ

1 0 obj<< /Pages 2 0 R >>endobj

2 0 obj<< /Kids [3 0 R] /Type /Pages /Count 1 >>endobj

3 0 obj<< /Parent 2 0 R /MediaBox [0 0 612 792]/Resources << /Font << /F1 <</BaseFont /Arial /Subtype /Type1 /Type /Font>>>> >> /Contents 4 0 R /Type /Page >>endobj

4 0 obj << /Length 50 >>streamBT /F1 120 Tf 10 400 Td (Top Secret) TjETendstream endobj

xref0 50000000000 65535 f 0000000016 00000 n 0000000052 00000 n 0000000110 00000 n 0000000282 00000 n

trailer << /Size 5 /Root 1 0 R >>

startxref385%%EOF

Page 115: PDF secrets - hiding & revealing secrets in PDF documents

New /Contents

append a new object 44 0 obj<< /Length 52 >>streamBT /F1 110 Tf 10 400 Td (Hello World!) TjETendstreamendobj

Page 116: PDF secrets - hiding & revealing secrets in PDF documents

Extra xref

append a new xrefthat references it

xref0 10000000000 65535 f 4 10000000551 00000 n

Page 117: PDF secrets - hiding & revealing secrets in PDF documents

Extra trailer 1/2

● same /Size & /Root● references the previous xref via /Prev

(not the previous trailer)trailer <<

/Size 5/Root 1 0 R/Prev 385

>>

Page 118: PDF secrets - hiding & revealing secrets in PDF documents

Extra trailer 2/2

points to the new xref

startxref654%%EOF

Page 119: PDF secrets - hiding & revealing secrets in PDF documents

Result

⇒ different content !

restore content by cutting after the first %%EOF

Page 120: PDF secrets - hiding & revealing secrets in PDF documents

Incremental update to hide page

use the same trickto override /Type /Pages

…%%EOF

1 0 obj<</Type /Pages/Kids [ 6 0 R 21 0 R]/Count 2>>endobj

xref0 10000000000 65535 f 1 10000118783 00000 n

trailer << /Size 41 /Root 4 0 R /Prev 117882 >>

startxref118849%%EOF

Page 121: PDF secrets - hiding & revealing secrets in PDF documents

Actual leaks in the wild ?

in any PDF with /Prev in the trailer:restore each intermediate versionby truncating after each %%EOF

Page 122: PDF secrets - hiding & revealing secrets in PDF documents

incremental PDF found in the wild(removed parts, incorrect page number)

Page 123: PDF secrets - hiding & revealing secrets in PDF documents

“Printed USA”

Page 124: PDF secrets - hiding & revealing secrets in PDF documents

some files produced corrupted text when copying(mentioned in the first part)this is due to fonts:

○ /Subtype /Type3○ with no /ToUnicode mapping

Copy/Paste corruption

Page 125: PDF secrets - hiding & revealing secrets in PDF documents

Conclusion

Page 126: PDF secrets - hiding & revealing secrets in PDF documents

Conclusion

● the PDF file format is awkward ○ not too complex if you just want to hide/reveal secrets

● be careful when removing sensitive elements!○ quite easy to check if elements are still removed or not○ overlapping DOESN’T work

● hiding and recovering elements is ‘easy’○ content is still there!

Page 127: PDF secrets - hiding & revealing secrets in PDF documents

Suggestions?

I’m interested in:● hiding technics● automated revealing technics● documents that are a pain to ‘rebuild’

○ split fonts in small paths ?○ licensed fonts are converted to glyphs

⇒ no more text

Page 128: PDF secrets - hiding & revealing secrets in PDF documents

ACK

@pdfkungfoo@Daeinar @veorq @_Quack1 @MunrekFR @dominicgs @mwgamera @kevinallix @munin @kristamonster @ClaudioAlbertin @push_pnx @JHeguia @doegox @gynvael @nst021 @iamreddave