privacy & stylometry: practical attacks against authorship recognition techniques michael...

Privacy & Stylometry: Practical Attacks Against Authorship Recognition Techniques

Michael Brennan and Rachel Greenstadt{mb553,greenie}@cs.drexel.edu

Overview

2

Introduction to Authorship RecognitionThe Threat to Privacy & Anonymity

Experimental ResultsAttacking Authorship Recognition

Study Setup Methodology Results

Updated Work Translation Attacks Data Set Release Building an Attack Corpus

Future Work

What is Authorship Recognition?The basic question: “who wrote this

document?”Stylometry: The study of attributing

authorship to documents based only on the linguistic style they exhibit. “Linguistic Style” Features: sentence length, word

choices, syntactic structure, etc. Handwriting, content-based features, and

contextual features are not considered. In this presentation, stylometry and

authorship recognition are used interchangeably.

Stylometry: A Brief HistoryThe classic stylometry problem: The Federalist

Papers.85 anonymous papers to persuade

ratification of the Constitution. 12 of these have disputed authorship.

Stylometry has been used to show Madison authored the disputed documents.

Used as a data set for countless stylometry studies.

Modern Stylometry Based in Machine LearningSVMs, Genetic Algorithms, Neural Networks,

Bayesian Classifiers… used extensively.

Why is Stylometry Important?Great, you can figure out who wrote a 200

year old document, so what?From the Institute for Linguistic Evidence:

“In some criminal, civil, and security matters, language can be evidence… When you are faced with a suspicious document, whether you need to know who wrote it, or if it is a real threat or real suicide note, or if it is too close for comfort to some other document, you need reliable, validated methods.”

Plagiarism, Forensics, Anonymity…

Stylometry: the Threat to Privacy

6

Good techniques for location privacy (Tor, Mixes, etc). But it may be insufficient!

Stylometry can identify authors based on their writing. Can anonymous authors defend against this? ~6500 words to leak identity – Rao, Rohatgi. 2000. “The Multidisciplinary Requirement for Privacy” –

Carlisle Adams. 2006.

Supervised Stylometry

7

Given a set of documents of known authorship, classify a document of unknown authorship. Classifier trained on undisputed text.

Scenario: Alice the Anonymous Blogger vs. Bob the Abusive Employer. Alice blogs about abuses at Bob’s company.

Blog posted anonymously (Tor, pseudonym, etc). Bob obtains 5000-10000 words of each

employee’s writing. Bob uses stylometry to identify Alice as the blogger.

Unsupervised Stylometry

8

Given a set of documents of unknown authorship, cluster them into author groups. No existing author information, “Similarity Detection”

Scenario: Anonymous Forum vs. Oppressive Government. Participants organize protests.

Posts are completely unlabeled (no pseudonyms) Unknown organizational structure, number of authors, etc.

The government applies unsupervised stylometric techniques. # of authors may be discovered, author profiles created. Fed into supervised stylometry system to identify individuals.

The Threat in Numbers (Unsupervised)

9

Experiments with “Forum” Scenario9 Anonymous authors, raw forum data:

Only 35% Accuracy. Posts 25-200 words long

5 authors, same data set, modified for longer posts: 88.2% Accuracy Passages 500-750 words long

Other Research on 50 Unique Authors (WritePrints): Unsupervised “Pairing”: 95%

The Threat in Numbers (Supervised)

10

“Workplace” Scenario is Well ResearchedSupervised Classification of 2-5 Authors: 79-

99% Accuracy (IAAI ‘09). Random sets taken from 15 subject pool. 3 Different Methods Used

Other Research on 50 Unique Authors (WritePrints): Supervised: 90%

Protecting Privacy: Attacking Stylometry

11

Problem Can stylometry be attacked? How so? How Easily?

Evaluation In depth study on attacking multiple methods of

Stylometry.Conclusion

Stylometry is very vulnerable to attack by inexperienced human adversaries.

Attacks can be used to protect privacy.

Related WorkAdversarial stylometry is not a well-

researched part of the field. Somers & Tweedie, 2003: Authorship Attribution &

Pastiche. Mixed Results. Kacmarcik & Gamon, 2006: Obfuscating Document

Stylometry to Preserve Anonymity. Encouraging results.

Our goal: Assess the general vulnerability of authorship recognition systems when under attack by human adversaries.

Results Published in “Practical Attacks Against Authorship Recognition Techniques.” (IAAI '09)

We’re Under Attack!Obfuscation Attack

An author attempts to write a document in such a way that their personal writing style will not be recognized.

Imitation Attack An author attempts to write a document such that

the writing style will be recognized as that of another specific author.

Study Setup & Format3 representative methods of stylometry.15 Individual Authors. Participation had three

parts: Submit 5000 words of pre-existing writing from a

formal source. Write a new 500 word passage as an obfuscation

attack. Task: Describe your neighborhood.

Write a new 500 word passage as an imitation attack. Task: Imitate Cormac McCarthy, describe your day.

Authors had no formal training or knowledge in linguistics or stylometry.

Imitating Cormac McCarthy

“On the far side of the river valley the road passed through a stark black burn. Charred and limbless trunks of trees stretching away on every side. Ash moving over the road and the sagging hands of blind wire strung from the blackened lightpoles whining thinly in the wind.”

Imitation Attack Examples“Light sliced through the blinds, and

construction began in the adjacent apartment. The harsh cacophony crashed through the wall.”

“Hot water in the mug. Brush in the mug. The blade read ‘Wilkinson Sword’ on the layered wax paper packaging.”

“He fills the coffee pot with water, after cleaning out the putrid remains of yesterday's brew. The beans are in the freezer, he remembers.”

Methodology & EvaluationValidation Experiment: Verify accuracy claims

of all 3 methods.Attack Experiment: Determine accuracy when

attacked using the same experimental conditions.

Experimental Conditions: Each test was applied to four random sets of 2, 3,

4, and 5 distinct authors. Same sets used in each method.

Cross-validation used to measure accuracy (repeated sub sampling and leave-one-out).

Machine Learning: Quick IntroductionMachine Learning is AI

Learn & recognize patterns automatically, make decisions.

Most common incarnation of ML: ClassifiersSpam Filter: Bayesian ClassifierGiven a message, what is the probability

that it is spam or not spam given the extracted features.

Important to pick good features.

A Closer Look at Linguistic FeaturesBasic Measurements:

Average syllable/word/sentence count, letter distribution, punctuation.

Lexical DensityUnique_Words / Total_Words

Gunning-Fog Readability Index:0.4 * ( Average_Sentence_Length +

100 * Complex_Word_Ratio )Result: years of formal education required to

read the text.

Stylometry MethodsMethod 1: Signature Stylometric System.

Three features: word length, letter usage, punctuation usage. 95% base accuracy.

Stylometry MethodsMethod 2: Neural Network Classifier

Based on Singh & Holmes, 1996. 9 linguistic features.

Lexical density, unique word count, character counts with and without spaces, two readability indexes, sentence count, average sentence length, and average syllables per word.

78.5% base accuracy. NN: simple switches,

complex overall system.

Input OutputHidden

Stylometry Methods

Method 3: Synonym-Based Classifier. Based on Clark & Hannon, 2007. Examines word choice when presented with all

possible synonyms. 88-99% accuracy.

(Diagram from “A Classifier System for Author RecognitionUsing Synonym-Based Features” by Clark & Hannon)

Accuracy on Training Set

All three methods were highly accurate in recognizing the correct author of the training set.

Obfuscation Attack Detection

Obfuscation attacks brought the accuracy of each system to that of random chance or worse.

Imitation Attack Detection

Imitation attacks all but completely circumvented all three authorship recognition methods.

Imitation Attack Success

The imitation attacks were very successful in targeting the intended victim of the attack.

Questioning the ResultsWhy are some authors more susceptible to

attack?Why did certain methods perform better than

others?Would these results hold up in a stricter

domain?What if the participants were skilled in

linguistics or stylometry?What features and methods could offer some

resistance to these attacks?

Translation AttacksCommon suggestion: run some text through a

translator and back!Limited testing show this to be ineffective.Translation Test Setup:

Five Subjects, (88-98% base accuracy)Single passage translated to/from

German and Japanese.Tested on Neural Network & Synonym

Methods100% effectiveness in identifying true author.

Using Attacks to Improve Privacy

29

What you can do: Important: These these methods failed, others

might not. Best: Try to imitate someone else. Write less.

What is needed A tool that will assist in anonymizing writing styles.

Building a ToolWhat a style-modifying tool needs:

A variety of methods & features. A large corpus of existing authors. AI-Assisted (Human oversight to preserve content) Open Source Non-Stylometry Features:

Contextual Clue Identifiers (Time, Content)

Building A Corpus IRB Approved Survey: dragonauth.comWe need your help!Mandatory for participation:

6000 words untampered writing1 Obfuscation Attack1 Imitation Attack

Optional:Demographic InformationVerification (untampered) passageSecond Imitation Attack

Building A CorpusEarly Stages of Data CollectionPlease Tell Others!English Targeted (For Now)

Suggestions for imitation authors for other languages are welcome

To participate in another language, do the imitation attack for an author of your choosing & let us know who it is.

We aren't web developers!

Releasing the Original Data Set8 Authors Consented (post-IRB approval)Available Jan 6th at psal.cs.drexel.eduContains:

Original sample passages1 Obfuscation Attack1 Imitation Attack

Future WorkTesting more stylometric methods. Imitation attacks against other authors.Effectiveness in unique domains (aggregated

short messages)Extracting additional information

(demographic)Better demonstration of effective

unsupervised stylometry.

ConclusionNot the end of stylometry in sensitive areas

New methods should test for adversarial threats.Stylometry is useful, but can also present a

threat to privacy. Attacking stylometry to preserve privacy has high

potential.Potential for arms race:

Developing attack-resistant methods of stylometry vs. creating new attacks to preserve privacy.

Questions?Contact:

Mike Brennan: [email protected] Rachel Greenstadt: [email protected] psal.cs.drexel.edu

Interested in learning more? “Writing Style Fingerprint Tool Easily Fooled” -

New Scientist, August 2009. “Can Pseudonymity Really Guarantee Privacy?” –

Josyula Rao, Pankaj Rohatgi. 2000. “Writeprints: A stylometric approach to identity-

level identification and similarity detection in cyberspace” – A. Abbasi, H. Chen. 2008.

FAQ“When will your data set be released?”“I can think of features for a stylometry

system that I bet these authors did not consider”

“I know a person in a certain Three Letter Organization that has subtle ways of detecting authorship that would surely not be broken by these attacks.”

“What about function words?”

Message Board Posts vs. Original Data

Study Setup: Detailed 3 representative methods of stylometry. 15 Individual Authors. Participation had three

parts: Submit 5000 words of pre-existing writing from a

formal source. Formal means school essays, professional reports, etc. No

slang, abbreviations, casual conversation. Write a new 500 word passage as an obfuscation

attack. Task: Describe your neighborhood.

Write a new 500 word passage as an imitation attack. Task: Imitate Cormac McCarthy using a passage from The

Road, write a third person narrative about your day starting from when you wake up.

Authors had no formal training or knowledge in linguistics or stylometry.

Discussion Training Set Content and Size

The amount of training text for each author not exceptionally large, but the number of authors is significantly larger than most studies. Allowed us to see interesting patterns, such as authors who

were better at creating attacks. Certain authors particularly susceptible to obfuscation

attacks. Do they have a “generic” writing style? Could be beneficial to study the effects of adversarial

attacks in stricter domains. Participant Skill Level

The lack of linguistic expertise of the participants strengthens our conclusions. It would be reasonable to expect authors with some level of expertise to do a better job at attacking these methods.

Examining what features and methods might offer resistance to these attacks is a viable avenue of research.

privacy & stylometry: practical attacks against authorship recognition techniques michael...

Documents