parsing for fun and profit
Post on 17-Oct-2014
Slides from my talk Parsing for Fun and Profit. Code is available here: https://github.com/patchspace/parsing_for_fun_and_profit
![Page 1: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/1.jpg)
Parsing for Fun and Profit (but mainly fun)
PatchSpace Ltd
Saturday, 23 February 13
![Page 2: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/2.jpg)
What?
![Page 3: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/3.jpg)
Parsing
Adding structure and meaning to text
![Page 4: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/4.jpg)
Parsing Human Languages
Jake stretched his legs
“Jake”, “stretched”, “his”, “legs”
“Jake”<noun>, “stretched”<verb, past>, “his”<possessive pronoun>, “legs”<noun>
“Jake”<noun, subject>, “stretched”, (“his”, “legs”)<noun phrase, object>
![Page 5: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/5.jpg)
Parsing Computer Languages
“foo = bar + 123”
“foo”, “=”, “bar”, “+”, “123”
“foo”<var>, “=”<assignment_op>, “bar”<var>, “+”<op_plus>, “123”<int_literal>
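The two steps on this slide (split the text, then label each piece) can be sketched with Ruby's standard-library StringScanner. This is a minimal illustration, not the code from the talk: the `TOKEN_PATTERNS` table and the `tokenise` method are hypothetical names, and the patterns only cover the four token types shown above.

```ruby
require "strscan"

# Token patterns, tried in order. The type names mirror the labels on the slide.
# (Hypothetical table for illustration; a real lexer would have many more entries.)
TOKEN_PATTERNS = [
  [:int_literal,   /\d+/],
  [:var,           /[a-z_]\w*/],
  [:assignment_op, /=/],
  [:op_plus,       /\+/]
].freeze

# Turn source text into [text, type] pairs, skipping whitespace.
def tokenise(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    # Find the first pattern that matches at the current position.
    type, pattern = TOKEN_PATTERNS.find { |_, candidate| scanner.match?(candidate) }
    raise ArgumentError, "unexpected character at #{scanner.pos}" unless type
    tokens << [scanner.scan(pattern), type]
  end
  tokens
end

tokenise("foo = bar + 123").each { |text, type| puts "#{text.inspect} <#{type}>" }
```

Anything the table does not recognise raises an error rather than being silently dropped, which makes gaps in the patterns easy to spot.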
![Page 6: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/6.jpg)
Why?
![Page 7: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/7.jpg)
Not just compiling!
Compilers breathe fire.
![Page 8: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/8.jpg)
Pretty Printing
![Page 9: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/9.jpg)
Pretty Printing
gofmt
http://gofmt.com/
![Page 10: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/10.jpg)
Code Smell Detectors
https://rubygems.org/gems/reek
![Page 11: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/11.jpg)
Code Smell Detectors
![Page 12: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/12.jpg)
Other ideas
Code metrics
Bug detectors
Domain-specific languages
Language translators (e.g. Ruby -> PHP)
Code obfuscators
Alternative syntaxes (e.g. CoffeeScript)
Refactoring tools
![Page 13: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/13.jpg)
How?
![Page 14: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/14.jpg)
Step 1
3 year computer science degree
![Page 15: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/15.jpg)
Lexing/Tokenising
if x > 100 then return “big” else return “small”
if x > 100 then return “big” else return “small”
![Page 16: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/16.jpg)
Tree Building
if x > 100 then return “big” else return a + b
if
  >
    x
    100
  then
    return
      “big”
  else
    return
      +
        a
        b
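One way to build a tree like this is a hand-written recursive-descent parser, where each grammar rule becomes a method. The sketch below is an assumption-laden illustration rather than the talk's code: `TinyParser` is a hypothetical class, it uses straight quotes for strings, and it supports only `if`/`then`/`else`, `return`, `>` and `+`. The tree comes back as nested arrays.

```ruby
# Minimal recursive-descent parser for the example on the slide.
# (Hypothetical class; the grammar is deliberately tiny.)
class TinyParser
  def initialize(source)
    # Crude tokeniser: quoted strings, words/numbers, and the operators > and +.
    @tokens = source.scan(/"[^"]*"|\w+|[>+]/)
  end

  # statement := "if" comparison "then" statement "else" statement
  #            | "return" sum
  def parse_statement
    case peek
    when "if"
      expect("if")
      condition = parse_comparison
      expect("then")
      consequent = parse_statement
      expect("else")
      alternative = parse_statement
      [:if, condition, consequent, alternative]
    when "return"
      expect("return")
      [:return, parse_sum]
    else
      raise ArgumentError, "unexpected token #{peek.inspect}"
    end
  end

  private

  # comparison := sum ">" sum
  def parse_comparison
    left = parse_sum
    expect(">")
    [:>, left, parse_sum]
  end

  # sum := atom ("+" atom)*
  def parse_sum
    node = parse_atom
    node = [:+, node, parse_atom] while accept("+")
    node
  end

  # atom := identifier | integer | string, kept as the raw token text
  def parse_atom
    token = @tokens.shift
    raise ArgumentError, "unexpected end of input" unless token
    token
  end

  def peek
    @tokens.first
  end

  # Consume and return the next token only if it equals `expected`.
  def accept(expected)
    @tokens.shift if peek == expected
  end

  def expect(expected)
    accept(expected) or raise ArgumentError, "expected #{expected.inspect}, got #{peek.inspect}"
  end
end

tree = TinyParser.new('if x > 100 then return "big" else return a + b').parse_statement
p tree
```

The operator sits at the root of each sub-tree with its operands as children, which is the shape a later tree walk (for evaluation or pretty printing) usually wants.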
![Page 17: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/17.jpg)
Parsing Expression Grammars
Like regular expressions, but can handle recursion, e.g. HTML
Not actually that much harder to use
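To show what "can handle recursion" means in practice, here is a hand-written Ruby equivalent of a three-rule PEG for nested tags. It is only a sketch: it does not use Treetop, `NestedTagParser` is a hypothetical name, and the "HTML" is limited to bare tags such as `<b>` with no attributes. The point is that `element` refers to `content`, which refers back to `element`.

```ruby
require "strscan"

# Hand-written equivalent of a small PEG (illustration only):
#   content <- (element / text)*
#   element <- "<" name ">" content "</" name ">"
#   text    <- [^<]+
class NestedTagParser
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  def parse
    nodes = parse_content
    raise ArgumentError, "unexpected input at #{@scanner.pos}" unless @scanner.eos?
    nodes
  end

  private

  # Ordered choice: try element first, then fall back to text.
  def parse_content
    nodes = []
    while (node = parse_element || parse_text)
      nodes << node
    end
    nodes
  end

  def parse_element
    start = @scanner.pos
    return nil unless @scanner.scan(/<(\w+)>/)
    name = @scanner[1]
    children = parse_content # recursion: elements may contain elements
    return [name.to_sym, children] if @scanner.scan(/<\/#{name}>/)
    @scanner.pos = start # backtrack, as a PEG does when a sequence fails
    nil
  end

  def parse_text
    @scanner.scan(/[^<]+/)
  end
end

p NestedTagParser.new("<b>bold <i>both</i></b> plain").parse
```

A mismatched closing tag, as in `<b>oops</i>`, makes the `element` rule fail and the parser raises an error instead of returning a partial result.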
![Page 18: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/18.jpg)
Regexes and HTML
![Page 19: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/19.jpg)
Treetop PEG grammar
![Page 20: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/20.jpg)
Doing Sums
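The live demo for this slide is in the repository linked from the talk. As a stand-in, the sketch below evaluates sums with a hand-written parser in plain Ruby. It is not the Treetop grammar from the demo: `Calculator` is a hypothetical class, it handles only non-negative integers, the four basic operators and parentheses, and `/` is Ruby's integer division.

```ruby
require "strscan"

# Grammar in PEG-like notation (illustration only):
#   sum     <- product (("+" / "-") product)*
#   product <- atom (("*" / "/") atom)*
#   atom    <- integer / "(" sum ")"
class Calculator
  def initialize(source)
    # Spaces carry no meaning here, so drop them up front.
    @scanner = StringScanner.new(source.delete(" "))
  end

  def evaluate
    result = sum
    raise ArgumentError, "unexpected input at #{@scanner.pos}" unless @scanner.eos?
    result
  end

  private

  # Lowest precedence: + and -, applied left to right.
  def sum
    value = product
    while (operator = @scanner.scan(/[+-]/))
      value = operator == "+" ? value + product : value - product
    end
    value
  end

  # Higher precedence: * and /, applied left to right.
  def product
    value = atom
    while (operator = @scanner.scan(%r{[*/]}))
      value = operator == "*" ? value * atom : value / atom
    end
    value
  end

  def atom
    if (digits = @scanner.scan(/\d+/))
      digits.to_i
    elsif @scanner.scan(/\(/)
      value = sum
      @scanner.scan(/\)/) or raise ArgumentError, "expected ')'"
      value
    else
      raise ArgumentError, "expected a number or '(' at #{@scanner.pos}"
    end
  end
end

puts Calculator.new("1 + 2 * 3").evaluate
puts Calculator.new("(1 + 2) * 3").evaluate
```

Operator precedence falls out of the rule nesting: `sum` calls `product`, so multiplication binds tighter than addition without any explicit precedence table.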
![Page 21: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/21.jpg)
Switch to Sublime Text, idiot
Code is now available: https://github.com/patchspace/parsing_for_fun_and_profit/
![Page 22: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/22.jpg)
A Ruby Syntax Highlighter
![Page 23: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/23.jpg)
What
A tool to read in simple Ruby source and output syntax highlighted HTML
![Page 24: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/24.jpg)
Why
Because I thought it would be fun
It was
Because I thought it would be easy…
![Page 25: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/25.jpg)
Why
![Page 26: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/26.jpg)
How
Build a parse tree of the Ruby source
Walk the tree and spit out a <span> element for each bit of text
Oh yes, make sure each line goes in <div> and <pre> tags
Wrap it in <html>
And for bonus points, do some fancy method highlighting
![Page 27: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/27.jpg)
Switch to Chrome, idiot
![Page 28: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/28.jpg)
Switch to Sublime Text again, idiot
Code is now available: https://github.com/patchspace/parsing_for_fun_and_profit/
![Page 29: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/29.jpg)
We’re doing this the hard way
Ruby’s grammar is too complex and undefined to easily implement as a PEG
Tools for parsing Ruby already exist
![Page 30: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/30.jpg)
Ripper (Ruby 1.9.3)
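As a rough idea of how much work Ripper saves, the sketch below builds a bare-bones highlighter on top of `Ripper.lex` from Ruby's standard library. It is an illustration, not the highlighter from the talk: `highlight` and `escape_html` are hypothetical helpers, the CSS class names are simply Ripper's token types with the `on_` prefix removed, and the output is a single `<pre>` rather than one `<div>` per line.

```ruby
require "ripper"

HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

# Escape the characters that would otherwise be read as markup.
def escape_html(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Wrap every token from Ripper.lex in a <span> whose class is the token type
# without its "on_" prefix (e.g. :on_kw becomes "kw").
def highlight(source)
  spans = Ripper.lex(source).map do |(_position, type, text)|
    css_class = type.to_s.sub(/\Aon_/, "")
    %(<span class="#{css_class}">#{escape_html(text)}</span>)
  end
  "<pre>#{spans.join}</pre>"
end

puts highlight(%(def greet(name)\n  puts "Hello, " + name\nend\n))
```

Because whitespace and newlines come back as tokens too, joining the spans reproduces the original layout, and a stylesheet keyed on classes such as `kw`, `ident` and `int` supplies the colours.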
![Page 31: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/31.jpg)
Learn more!
Skip theoretical physics, start by playing with Lego
![Page 32: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/32.jpg)
Do more
Ideas you might like to try:
CSV parser
JSON parser (return arrays & hashes)
XML parser
JSON highlighter
A simple JavaScript minifier (just kill whitespace)
![Page 33: Parsing for Fun and Profit](https://reader033.vdocuments.net/reader033/viewer/2022050919/5441ff0dafaf9f62208b480b/html5/thumbnails/33.jpg)
Thank you
PatchSpace Ltd