Privacy & Stylometry: Practical Attacks Against Authorship Recognition Techniques
Michael Brennan and Rachel Greenstadt
{mb553,greenie}@cs.drexel.edu
Overview
Introduction to Authorship Recognition
The Threat to Privacy & Anonymity
Experimental Results
Attacking Authorship Recognition
  Study Setup
  Methodology
  Results
Updated Work
  Translation Attacks
  Data Set Release
  Building an Attack Corpus
Future Work
What is Authorship Recognition?
The basic question: “who wrote this document?”
Stylometry: The study of attributing authorship to documents based only on the linguistic style they exhibit.
  “Linguistic Style” Features: sentence length, word choices, syntactic structure, etc.
  Handwriting, content-based features, and contextual features are not considered.
In this presentation, stylometry and authorship recognition are used interchangeably.
Stylometry: A Brief History
The classic stylometry problem: The Federalist Papers.
  85 anonymous papers arguing for ratification of the Constitution.
  12 of these have disputed authorship.
  Stylometry has been used to show Madison authored the disputed documents.
  Used as a data set for countless stylometry studies.
Modern Stylometry Based in Machine Learning
  SVMs, Genetic Algorithms, Neural Networks, Bayesian Classifiers… used extensively.
Why is Stylometry Important?
Great, you can figure out who wrote a 200-year-old document, so what?
From the Institute for Linguistic Evidence:
“In some criminal, civil, and security matters, language can be evidence… When you are faced with a suspicious document, whether you need to know who wrote it, or if it is a real threat or real suicide note, or if it is too close for comfort to some other document, you need reliable, validated methods.”
Plagiarism, Forensics, Anonymity…
Stylometry: the Threat to Privacy
Good techniques exist for location privacy (Tor, Mixes, etc.), but they may be insufficient!
Stylometry can identify authors based on their writing.
Can anonymous authors defend against this?
  ~6500 words to leak identity – Rao, Rohatgi. 2000.
  “The Multidisciplinary Requirement for Privacy” – Carlisle Adams. 2006.
Supervised Stylometry
Given a set of documents of known authorship, classify a document of unknown authorship.
  Classifier trained on undisputed text.
Scenario: Alice the Anonymous Blogger vs. Bob the Abusive Employer.
  Alice blogs about abuses at Bob’s company.
  Blog posted anonymously (Tor, pseudonym, etc.).
  Bob obtains 5000-10000 words of each employee’s writing.
  Bob uses stylometry to identify Alice as the blogger.
Unsupervised Stylometry
Given a set of documents of unknown authorship, cluster them into author groups.
  No existing author information, “Similarity Detection”
Scenario: Anonymous Forum vs. Oppressive Government.
  Participants organize protests.
  Posts are completely unlabeled (no pseudonyms).
  Unknown organizational structure, number of authors, etc.
The government applies unsupervised stylometric techniques.
  Number of authors may be discovered, author profiles created.
  Fed into supervised stylometry system to identify individuals.
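The clustering step described above can be illustrated with a minimal sketch. It assumes each post has already been reduced to a numeric feature vector, and it uses a simple distance threshold with single-link grouping. Both the function name cluster_by_similarity and the threshold are hypothetical choices for illustration, not the technique used in the experiments reported in this talk.

```python
import math

def cluster_by_similarity(vectors, threshold):
    """Group feature vectors so that any two closer than `threshold`
    (Euclidean distance) end up in the same group (single-link)."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if math.dist(vectors[i], vectors[j]) < threshold:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Each inner list holds the indices of posts attributed to one unknown author.
posts = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1], [10.0, 0.0]]
print(cluster_by_similarity(posts, threshold=1.0))
```

The number of groups returned is an estimate of the number of authors, and each group could then serve as a training profile for a supervised system.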
The Threat in Numbers (Unsupervised)
Experiments with “Forum” Scenario
9 Anonymous authors, raw forum data:
  Only 35% Accuracy.
  Posts 25-200 words long
5 authors, same data set, modified for longer posts:
  88.2% Accuracy
  Passages 500-750 words long
Other Research on 50 Unique Authors (WritePrints):
  Unsupervised “Pairing”: 95%
The Threat in Numbers (Supervised)
“Workplace” Scenario is Well Researched
Supervised Classification of 2-5 Authors: 79-99% Accuracy (IAAI '09).
  Random sets taken from a pool of 15 subjects.
  3 Different Methods Used
Other Research on 50 Unique Authors (WritePrints):
  Supervised: 90%
Protecting Privacy: Attacking Stylometry
Problem
  Can stylometry be attacked? How so? How easily?
Evaluation
  In-depth study on attacking multiple methods of stylometry.
Conclusion
  Stylometry is very vulnerable to attack by inexperienced human adversaries.
  Attacks can be used to protect privacy.
Related Work
Adversarial stylometry is not a well-researched part of the field.
  Somers & Tweedie, 2003: Authorship Attribution & Pastiche. Mixed results.
  Kacmarcik & Gamon, 2006: Obfuscating Document Stylometry to Preserve Anonymity. Encouraging results.
Our goal: Assess the general vulnerability of authorship recognition systems when under attack by human adversaries.
Results Published in “Practical Attacks Against Authorship Recognition Techniques.” (IAAI '09)
We’re Under Attack!
Obfuscation Attack
  An author attempts to write a document in such a way that their personal writing style will not be recognized.
Imitation Attack
  An author attempts to write a document such that the writing style will be recognized as that of another specific author.
Study Setup & Format
3 representative methods of stylometry.
15 Individual Authors. Participation had three parts:
  Submit 5000 words of pre-existing writing from a formal source.
  Write a new 500 word passage as an obfuscation attack.
    Task: Describe your neighborhood.
  Write a new 500 word passage as an imitation attack.
    Task: Imitate Cormac McCarthy, describe your day.
Authors had no formal training or knowledge in linguistics or stylometry.
Imitating Cormac McCarthy
“On the far side of the river valley the road passed through a stark black burn. Charred and limbless trunks of trees stretching away on every side. Ash moving over the road and the sagging hands of blind wire strung from the blackened lightpoles whining thinly in the wind.”
Imitation Attack Examples
“Light sliced through the blinds, and construction began in the adjacent apartment. The harsh cacophony crashed through the wall.”
“Hot water in the mug. Brush in the mug. The blade read ‘Wilkinson Sword’ on the layered wax paper packaging.”
“He fills the coffee pot with water, after cleaning out the putrid remains of yesterday's brew. The beans are in the freezer, he remembers.”
Methodology & Evaluation
Validation Experiment: Verify accuracy claims of all 3 methods.
Attack Experiment: Determine accuracy when attacked using the same experimental conditions.
Experimental Conditions:
  Each test was applied to four random sets of 2, 3, 4, and 5 distinct authors.
  Same sets used in each method.
  Cross-validation used to measure accuracy (repeated subsampling and leave-one-out).
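Leave-one-out cross-validation, mentioned above, can be sketched as follows. Each sample is held out in turn, the classifier is built from the remaining samples, and the held-out sample is then classified. The nearest-centroid classifier here is a stand-in chosen for brevity and is not one of the three methods studied; the function names are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(column) for column in zip(*vectors)]

def nearest_author(vector, profiles):
    """Return the author whose profile centroid is closest to `vector`."""
    return min(profiles, key=lambda author: math.dist(vector, profiles[author]))

def leave_one_out_accuracy(samples):
    """samples: list of (author, feature_vector) pairs.
    Hold out each sample, train on the rest, and count correct attributions."""
    correct = 0
    for i, (true_author, vector) in enumerate(samples):
        by_author = {}
        for j, (author, other) in enumerate(samples):
            if j != i:  # exclude the held-out sample from training
                by_author.setdefault(author, []).append(other)
        profiles = {a: centroid(vs) for a, vs in by_author.items()}
        if nearest_author(vector, profiles) == true_author:
            correct += 1
    return correct / len(samples)

samples = [
    ("A", [1.0, 1.0]), ("A", [1.1, 0.9]), ("A", [0.9, 1.1]),
    ("B", [5.0, 5.0]), ("B", [5.1, 4.9]), ("B", [4.9, 5.1]),
]
print(leave_one_out_accuracy(samples))
```

Holding the test passage out of training matters here: measuring accuracy on text the classifier has already seen would overstate how well it recognizes an author.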
Machine Learning: Quick Introduction
Machine Learning is a branch of AI
  Learn & recognize patterns automatically, make decisions.
Most common incarnation of ML: Classifiers
Spam Filter: Bayesian Classifier
  Given a message, what is the probability that it is spam or not spam given the extracted features?
  Important to pick good features.
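The spam filter example can be made concrete with a minimal naive Bayes sketch. It assumes the "extracted features" are simply the lowercase words of the message and applies add-one smoothing. The function names train and posterior are illustrative only.

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (text, label) pairs.
    Returns per-label word counts and per-label message counts."""
    word_counts = {}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts

def posterior(text, word_counts, label_counts):
    """Return P(label | text) for every label, with add-one smoothing."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_messages = sum(label_counts.values())
    log_scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(label_counts[label] / total_messages)  # prior
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Convert log scores to normalized probabilities.
    peak = max(log_scores.values())
    weights = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}

training = [
    ("win cash prize now", "spam"),
    ("cheap prize win", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training)
print(posterior("win a prize", *model))
```

Replacing the labels "spam" and "ham" with author names turns the same classifier into a simple authorship recognizer, which is why feature choice matters so much in stylometry.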
A Closer Look at Linguistic Features
Basic Measurements:
  Average syllable/word/sentence count, letter distribution, punctuation.
Lexical Density:
  Unique_Words / Total_Words
Gunning-Fog Readability Index:
  0.4 * ( Average_Sentence_Length + 100 * Complex_Word_Ratio )
  Result: years of formal education required to read the text.
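The two formulas above translate directly into code. The sketch below assumes non-empty English text, treats a "complex word" as one with three or more syllables (the usual Gunning-Fog convention, not stated on the slide), and estimates syllables with a rough vowel-run heuristic. The helper names are hypothetical.

```python
import re

def split_words(text):
    """Extract alphabetic word tokens."""
    return re.findall(r"[A-Za-z']+", text)

def split_sentences(text):
    """Split on sentence-ending punctuation, dropping empty fragments."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]

def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def lexical_density(text):
    """Unique_Words / Total_Words, case-insensitive."""
    words = [w.lower() for w in split_words(text)]
    return len(set(words)) / len(words)

def gunning_fog(text):
    """0.4 * (Average_Sentence_Length + 100 * Complex_Word_Ratio)."""
    words = split_words(text)
    sentences = split_sentences(text)
    average_sentence_length = len(words) / len(sentences)
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    complex_word_ratio = complex_words / len(words)
    return 0.4 * (average_sentence_length + 100 * complex_word_ratio)

sample = "The cat sat. The cat ran."
print(lexical_density(sample), gunning_fog(sample))
```

For the sample text, 4 of the 6 words are unique and there are no complex words, so the index reduces to 0.4 times the average sentence length of 3.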
Stylometry Methods
Method 1: Signature Stylometric System.
  Three features: word length, letter usage, punctuation usage.
  95% base accuracy.
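The three feature families named for Method 1 can be extracted as relative frequency tables. This is only a sketch of that kind of feature extraction, not a reproduction of the Signature system itself. The function names and the exact tokenization are assumptions.

```python
import string
from collections import Counter

def relative_frequencies(items):
    """Map each distinct item to its share of the total count."""
    counts = Counter(items)
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()} if total else {}

def signature_style_features(text):
    """Word-length, letter, and punctuation distributions for one text."""
    words = [w.strip(string.punctuation) for w in text.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    return {
        "word_length": relative_frequencies(len(w) for w in words),
        "letters": relative_frequencies(
            c for c in text.lower() if c in string.ascii_lowercase
        ),
        "punctuation": relative_frequencies(
            c for c in text if c in string.punctuation
        ),
    }

features = signature_style_features("Hi, you. Hi!")
print(features["word_length"])
```

Two texts can then be compared by measuring how far apart their distributions are, with the smallest distance suggesting the most likely author.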
Stylometry Methods
Method 2: Neural Network Classifier
  Based on Singh & Holmes, 1996.
  9 linguistic features: lexical density, unique word count, character counts with and without spaces, two readability indexes, sentence count, average sentence length, and average syllables per word.
  78.5% base accuracy.
  NN: simple switches, complex overall system.
(Diagram: neural network with input, hidden, and output layers)
Stylometry Methods
Method 3: Synonym-Based Classifier.
  Based on Clark & Hannon, 2007.
  Examines word choice when presented with all possible synonyms.
  88-99% accuracy.
(Diagram from “A Classifier System for Author Recognition Using Synonym-Based Features” by Clark & Hannon)
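The idea of treating word choice among synonyms as a feature can be illustrated with a toy sketch. This is not Clark & Hannon's algorithm: the synonym table is tiny and hand-made, and the weighting (one point per alternative the author passed over) and the overlap score are assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical, hand-made synonym groups; a real system would use a thesaurus.
SYNONYM_GROUPS = [
    {"big", "large", "huge"},
    {"quick", "fast", "rapid", "swift"},
    {"start", "begin"},
]

def synonym_choices(text):
    """Count each chosen word, weighted by how many alternatives it beat."""
    choices = Counter()
    for token in text.lower().split():
        word = token.strip(".,;:!?\"'")
        for group in SYNONYM_GROUPS:
            if word in group:
                choices[word] += len(group) - 1
    return choices

def match_score(unknown, profile):
    """Overlap between the weighted word choices of two texts."""
    return sum(min(weight, profile[word]) for word, weight in unknown.items())

def most_likely_author(unknown_text, known_texts):
    """known_texts: dict mapping author name to a sample of their writing."""
    unknown = synonym_choices(unknown_text)
    profiles = {a: synonym_choices(t) for a, t in known_texts.items()}
    return max(profiles, key=lambda a: match_score(unknown, profiles[a]))

known = {
    "alice": "A big dog made a quick start.",
    "bob": "A large dog made a rapid, swift run to begin.",
}
print(most_likely_author("The big cat was quick.", known))
```

The intuition is that a word with many synonyms reveals more about an author's habits than a word with few, because the author had more alternatives to choose from.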
Accuracy on Training Set
All three methods were highly accurate in recognizing the correct author of the training set.
Obfuscation Attack Detection
Obfuscation attacks brought the accuracy of each system to that of random chance or worse.
Imitation Attack Detection
Imitation attacks all but completely circumvented all three authorship recognition methods.
Imitation Attack Success
The imitation attacks were very successful in targeting the intended victim of the attack.
Questioning the Results
Why are some authors more susceptible to attack?
Why did certain methods perform better than others?
Would these results hold up in a stricter domain?
What if the participants were skilled in linguistics or stylometry?
What features and methods could offer some resistance to these attacks?
Translation Attacks
Common suggestion: run some text through a translator and back!
Limited testing shows this to be ineffective.
Translation Test Setup:
  Five subjects (88-98% base accuracy).
  Single passage translated to/from German and Japanese.
  Tested on Neural Network & Synonym Methods.
100% effectiveness in identifying true author.
Using Attacks to Improve Privacy
What you can do:
  Important: These methods failed, others might not.
  Best: Try to imitate someone else.
  Write less.
What is needed:
  A tool that will assist in anonymizing writing styles.
Building a Tool
What a style-modifying tool needs:
  A variety of methods & features.
  A large corpus of existing authors.
  AI-Assisted (Human oversight to preserve content)
  Open Source
  Non-Stylometry Features:
    Contextual Clue Identifiers (Time, Content)
Building A Corpus
IRB Approved Survey: dragonauth.com
We need your help!
Mandatory for participation:
  6000 words untampered writing
  1 Obfuscation Attack
  1 Imitation Attack
Optional:
  Demographic Information
  Verification (untampered) passage
  Second Imitation Attack
Building A Corpus
Early Stages of Data Collection
Please Tell Others!
English Targeted (For Now)
  Suggestions for imitation authors for other languages are welcome.
  To participate in another language, do the imitation attack for an author of your choosing & let us know who it is.
We aren't web developers!
Releasing the Original Data Set
8 Authors Consented (post-IRB approval)
Available Jan 6th at psal.cs.drexel.edu
Contains:
  Original sample passages
  1 Obfuscation Attack
  1 Imitation Attack
Future Work
Testing more stylometric methods.
Imitation attacks against other authors.
Effectiveness in unique domains (aggregated short messages)
Extracting additional information (demographic)
Better demonstration of effective unsupervised stylometry.
Conclusion
Not the end of stylometry in sensitive areas
  New methods should test for adversarial threats.
Stylometry is useful, but can also present a threat to privacy.
  Attacking stylometry to preserve privacy has high potential.
Potential for arms race:
  Developing attack-resistant methods of stylometry vs. creating new attacks to preserve privacy.
Questions?
Contact:
  Mike Brennan: [email protected]
  Rachel Greenstadt: [email protected]
  psal.cs.drexel.edu
Interested in learning more?
  “Writing Style Fingerprint Tool Easily Fooled” – New Scientist, August 2009.
  “Can Pseudonymity Really Guarantee Privacy?” – Josyula Rao, Pankaj Rohatgi. 2000.
  “Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace” – A. Abbasi, H. Chen. 2008.
FAQ
“When will your data set be released?”
“I can think of features for a stylometry system that I bet these authors did not consider”
“I know a person in a certain Three Letter Organization that has subtle ways of detecting authorship that would surely not be broken by these attacks.”
“What about function words?”
Message Board Posts vs. Original Data
Study Setup: Detailed
3 representative methods of stylometry.
15 Individual Authors. Participation had three parts:
  Submit 5000 words of pre-existing writing from a formal source.
    Formal means school essays, professional reports, etc. No slang, abbreviations, casual conversation.
  Write a new 500 word passage as an obfuscation attack.
    Task: Describe your neighborhood.
  Write a new 500 word passage as an imitation attack.
    Task: Imitate Cormac McCarthy using a passage from The Road, write a third person narrative about your day starting from when you wake up.
Authors had no formal training or knowledge in linguistics or stylometry.
Discussion
Training Set Content and Size
  The amount of training text for each author is not exceptionally large, but the number of authors is significantly larger than in most studies.
    Allowed us to see interesting patterns, such as authors who were better at creating attacks.
  Certain authors are particularly susceptible to obfuscation attacks. Do they have a “generic” writing style?
  Could be beneficial to study the effects of adversarial attacks in stricter domains.
Participant Skill Level
  The lack of linguistic expertise of the participants strengthens our conclusions. It would be reasonable to expect authors with some level of expertise to do a better job at attacking these methods.
  Examining what features and methods might offer resistance to these attacks is a viable avenue of research.