Parser Combinators in Scala, Ilya Kliuchnikov (@lambdamix)

Upload: vasil-remeniuk

Post on 12-May-2015


Talk on Parser Combinators in Scala by Ilya Kliuchnikov (@lambdamix) at scalaby#8 (scala.by)


Page 1: Parsers Combinators in Scala, Ilya @lambdamix Kliuchnikov

Parser Combinators in Scala

Ilya Kliuchnikov (@lambdamix)

Combinator libraries

● Actors
● Parsers
● ScalaCheck, Specs
● Scalaz
● SBT
● EDSLs
● ...


Intro: combinators, parsers

Scala Parser Combinators from the Ground Up

Pros, cons

Advanced techniques

How to write a typical parser

Parser?

● Transforms text into a structure

[Figure: the expression "2*3 + 4" shown next to its parse tree]

Hello, parser

import scala.util.parsing.combinator._
import syntactical.StandardTokenParsers

sealed trait Expr
case class Num(i: Int) extends Expr
case class Var(n: String) extends Expr
case class Plus(e1: Expr, e2: Expr) extends Expr
case class Mult(e1: Expr, e2: Expr) extends Expr

object ArithParsers extends StandardTokenParsers with ImplicitConversions {
  lexical.delimiters += ("(", ")", "+", "*")
  def expr: Parser[Expr] = term ~ ("+" ~> expr) ^^ Plus | term
  def term: Parser[Expr] = factor ~ ("*" ~> term) ^^ Mult | factor
  def factor: Parser[Expr] =
    numericLit ^^ { s => Num(s.toInt) } | ident ^^ Var | "(" ~> expr <~ ")"

  def parseExpr(s: String) = phrase(expr)(new lexical.Scanner(s))
}

scala> ArithParsers.parseExpr("1")
res1: ArithParsers.ParseResult[parsers2.Expr] = [1.2] parsed: Num(1)

scala> ArithParsers.parseExpr("1 + 1 * 2")
res2: ArithParsers.ParseResult[parsers2.Expr] = [1.10] parsed: Plus(Num(1),Mult(Num(1),Num(2)))

scala> ArithParsers.parseExpr("a * (a * a)")
res3: ArithParsers.ParseResult[parsers2.Expr] = [1.12] parsed: Mult(Var(a),Mult(Var(a),Var(a)))

Example 2: Lambda calculus

t ::=            terms:
      x          variable
      λx.t       abstraction
      t t        application

x y z = ((x y) z)

λx.λy.y = λx.(λy.y)

Example 2

sealed trait Term
case class Var(n: String) extends Term
case class Lam(v: Var, body: Term) extends Term
case class App(t1: Term, t2: Term) extends Term

object LamParsers extends StandardTokenParsers with ImplicitConversions with PackratParsers {
  lexical.delimiters += ("(", ")", ".", "\\")
  lazy val term: PackratParser[Term] = appTerm | lam
  lazy val vrb: PackratParser[Var] = ident ^^ Var
  lazy val lam: PackratParser[Term] = ("\\" ~> vrb) ~ ("." ~> term) ^^ Lam
  lazy val appTerm: PackratParser[Term] = appTerm ~ aTerm ^^ App | aTerm
  lazy val aTerm: PackratParser[Term] = vrb | "(" ~> term <~ ")"
  def parseTerm(s: String) = phrase(term)(new lexical.Scanner(s))
}

scala> LamParsers.parseTerm("x y z")
res1: LamParsers.ParseResult[parsers.Term] = [1.6] parsed: App(App(Var(x),Var(y)),Var(z))

scala> LamParsers.parseTerm("""\x.\y.x y""")
res2: LamParsers.ParseResult[parsers.Term] = [1.10] parsed: Lam(Var(x),Lam(Var(y),App(Var(x),Var(y))))

scala> LamParsers.parseTerm("""(\x.x x) (\x. x x)""")
res3: LamParsers.ParseResult[parsers.Term] = [1.19] parsed: App(Lam(Var(x),App(Var(x),Var(x))),Lam(Var(x),App(Var(x),Var(x))))

Combinators

Principles of combinator libraries

● The library's terminology matches the terminology of the problem domain.
● Components:
  ● types,
  ● primitives,
  ● first-order combinators,
  ● higher-order combinators.
● Closure property (compositionality).
● Possibility of an efficient implementation.

E. Kirpichev. Elements of Functional Languages. Практика функционального программирования (Practice of Functional Programming), No. 3.

Parsers

Problem domain

● Grammars
  ● Regular
  ● Context-free
  ● Left-recursive
  ● Right-recursive
  ● Attribute
  ● Boolean
  ● PEG
  ● ...

● Parsers
  ● LL parsers
  ● LR parsers
  ● Top-down
  ● Bottom-up
  ● GLL
  ● Packrat parsers
  ● Parsing with derivatives

Approaches to building parsers

● Parser generators
  ● Yacc
  ● Lex
  ● JavaCC
  ● ANTLR
  ● Rats!

● Hand-written
  ● Low-level
  ● High-level

Parsers in Scala

● C9 Lectures: Dr. Erik Meijer - Functional Programming Fundamentals, Chapter 8 of 13
● A. Moors, F. Piessens, M. Odersky. Parser Combinators in Scala. Report CW 491 // Feb 2008

Scala parser combinators are a form of recursive descent parsing with infinite backtracking.
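What backtracking means here can be sketched in a few lines. This is a minimal illustration and not the library's implementation; the names `lit`, `seq` and `or` are hypothetical helpers introduced for the example. The point is that when the first alternative fails halfway, the second one restarts from the same input.

```scala
object BacktrackingSketch {
  // Assumed toy parser type: consume a prefix, return the value and the rest.
  type Parser[A] = String => Option[(A, String)]

  // Hypothetical primitive: accept a literal prefix.
  def lit(s: String): Parser[String] =
    in => if (in.startsWith(s)) Some((s, in.drop(s.length))) else None

  // Hypothetical sequencing: run p, then q on what p left over.
  def seq(p: Parser[String], q: Parser[String]): Parser[String] =
    in => p(in).flatMap { case (a, rest) =>
      q(rest).map { case (b, rest2) => (a + b, rest2) }
    }

  // Ordered choice: if p fails, q is run on the SAME input (backtracking).
  def or[A](p: Parser[A], q: Parser[A]): Parser[A] =
    in => p(in).orElse(q(in))

  // "ab" | "ac": on "ac" the first branch consumes 'a', fails on 'b',
  // and the second branch starts over from the beginning.
  val abOrAc: Parser[String] =
    or(seq(lit("a"), lit("b")), seq(lit("a"), lit("c")))
}
```

For example, `BacktrackingSketch.abOrAc("ac!")` yields `Some(("ac", "!"))` even though the first alternative had already consumed the `a` before failing.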

Parsers in Scala are functional

Background:
● W. Burge. Recursive Programming Techniques. Addison-Wesley, 1975.
● Ph. Wadler. How to Replace Failure by a List of Successes. A method for exception handling, backtracking, and pattern matching in lazy functional languages // 1985
● G. Hutton. Higher-order functions for parsing // Journal of Functional Programming. 1992/2
● J. Fokker. Functional Parsers // 1995


A parser is a function

type Parser[A] = String => A

    Functions of this type do not compose, and a parser need not consume the whole string

type Parser[A] = String => (A, String)

    Parsing may fail

type Parser[A] = String => Option[(A, String)]
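A minimal sketch of what the last type buys: a parser returns its value together with the unconsumed rest, so a second parser can continue from there, and `None` signals failure. The helpers `char` and `seq` are hypothetical names for this example, not part of any library.

```scala
object ParserAsFunction {
  // The final type from the slide.
  type Parser[A] = String => Option[(A, String)]

  // Hypothetical primitive: accept exactly one expected character.
  def char(c: Char): Parser[Char] =
    in => if (in.nonEmpty && in.head == c) Some((c, in.tail)) else None

  // Hypothetical sequencing helper: run p, then run q on what p left over.
  def seq[A, B](p: Parser[A], q: Parser[B]): Parser[(A, B)] =
    in =>
      p(in) match {
        case Some((a, rest)) =>
          q(rest) match {
            case Some((b, rest2)) => Some(((a, b), rest2))
            case None             => None
          }
        case None => None
      }

  // 'a' followed by 'b'; anything after that stays in the remainder.
  val ab: Parser[(Char, Char)] = seq(char('a'), char('b'))
}
```

`ParserAsFunction.ab("abc")` gives `Some((('a', 'b'), "c"))`, while `ParserAsFunction.ab("ax")` gives `None`.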


Attempt #1

Results

trait SimpleResults {
  type Input
  trait Result[+T] {
    def next: Input
  }
  case class Success[+T](result: T, next: Input) extends Result[T]
  case class Failure(msg: String, next: Input) extends Result[Nothing]
}

object XParser extends SimpleResults {
  type Input = String
  val acceptX: Input => Result[Char] = { (in: String) =>
    // nonEmpty guard: charAt(0) would throw on empty input
    if (in.nonEmpty && in.charAt(0) == 'x') Success('x', in.substring(1))
    else Failure("expected an x", in)
  }
}

scala> XParser.acceptX("xyz")
res0: parsers.XParser.Result[Char] = Success(x,yz)

scala> XParser.acceptX("yz")
res1: parsers.XParser.Result[Char] = Failure(expected an x,yz)

The basis: Parser, |, ~, accept

trait SimpleParsers extends SimpleResults {
  trait Parser[+T] extends (Input => Result[T]) {
    def apply(in: Input): Result[T]

    // Alternation: try this parser; on failure, run p on the same input.
    def |[U >: T](p: => Parser[U]): Parser[U] = new Parser[U] {
      def apply(in: Input) = Parser.this(in) match {
        case Failure(_, _) => p(in)
        case Success(x, n) => Success(x, n)
      }
    }

    // Sequencing: run this parser, then p on the remaining input.
    def ~[U](p: => Parser[U]): Parser[(T, U)] = new Parser[(T, U)] {
      def apply(in: Input) = Parser.this(in) match {
        case Success(x, next) => p(next) match {
          case Success(x2, next2) => Success((x, x2), next2)
          case Failure(m, n) => Failure(m, n)
        }
        case Failure(m, n) => Failure(m, n)
      }
    }
  }
}

trait StringParsers extends SimpleParsers {
  type Input = String
  private val EOI = 0.toChar

  def accept(expected: Char) = new Parser[Char] {
    def apply(in: String) =
      if (in == "") {
        if (expected == EOI) Success(expected, "")
        else Failure("no more input", in)
      } else if (in.charAt(0) == expected) Success(expected, in.substring(1))
      else Failure("expected \'" + expected + "\'", in)
  }
  def eoi = accept(EOI)
}

The simplest parser

object OXOParser extends StringParsers {
  def oxo = accept('o') ~ accept('x') ~ accept('o')
  def oxos: Parser[Any] = (oxo ~ accept(' ') ~ oxos | oxo)
}

scala> OXOParser.oxos("123")
res2: parsers.OXOParser.Result[Any] = Failure(expected 'o',123)

scala> OXOParser.oxos("oxo")
res3: parsers.OXOParser.Result[Any] = Success(((o,x),o),)

scala> OXOParser.oxos("oxo oxo")
res4: parsers.OXOParser.Result[Any] = Success(((((o,x),o), ),((o,x),o)),)

scala> OXOParser.oxos("oxo oxo 1")
res5: parsers.OXOParser.Result[Any] = Success(((((o,x),o), ),((o,x),o)), 1)

scala> (OXOParser.oxos ~ OXOParser.eoi)("oxo oxo 1")
res6: parsers.OXOParser.Result[(Any, Char)] = Failure(expected '?', 1)

Be careful!

In the SimpleParsers code above, the argument p of both | and ~ is a call-by-name parameter (p: => Parser[U]). The recursive rule oxos in OXOParser relies on this.

Be careful!

With call-by-value parameters instead (the rest of SimpleParsers and OXOParser unchanged):

def |[U >: T](p: Parser[U]): Parser[U] = ...    // call-by-value param
def ~[U](p: Parser[U]): Parser[(T, U)] = ...    // call-by-value param

the recursive rule oxos never finishes constructing itself:

scala> OXOParser.oxos("123")
java.lang.StackOverflowError
  at parsers.OXOParser$.oxo(stepbystep.scala:67)
  at parsers.OXOParser$.oxos(stepbystep.scala:69)
  at parsers.OXOParser$.oxos(stepbystep.scala:69)
  at parsers.OXOParser$.oxos(stepbystep.scala:69)
  at parsers.OXOParser$.oxos(stepbystep.scala:69)
  at parsers.OXOParser$.oxos(stepbystep.scala:69)
  at parsers.OXOParser$.oxos(stepbystep.scala:69)
  ...
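Why call-by-name matters can be shown on a stripped-down recognizer. This is a sketch under assumptions: the type `P` and the helpers `ch`, `seq` and `or` are hypothetical stand-ins for the slide's `Parser`, `~` and `|`. The right operand is evaluated only when the parser runs and actually reaches it, so a rule may mention itself.

```scala
object ByNameSketch {
  // Assumed toy recognizer: returns the remaining input on success.
  type P = String => Option[String]

  def ch(c: Char): P =
    in => if (in.nonEmpty && in.head == c) Some(in.tail) else None

  // Both combinators take their right operand by name (q: => P),
  // so q is not evaluated while the rule is being constructed.
  def seq(p: P, q: => P): P = in => p(in).flatMap(rest => q(rest))
  def or(p: P, q: => P): P = in => p(in).orElse(q(in))

  // xs ::= 'x' xs | 'x'   (recursive in the same way as `oxos`)
  def xs: P = or(seq(ch('x'), xs), ch('x'))
}
```

With `q: P` instead of `q: => P`, calling `xs` would have to evaluate `xs` again to build the argument of `seq`, and so on without end, which is the StackOverflowError shown on the slide.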

Attempt #2 (Factoring out Plumbing)

Where is the problem?

(The same SimpleParsers and OXOParser code as in “The basis” and “The simplest parser”.)

Too much “threading”

Both | and ~ match on Success and Failure by hand and pass the remaining input along explicitly.

Improved Results

trait SimpleResults {
  type Input
  trait Result[+T] {
    def next: Input
    def map[U](f: T => U): Result[U]
    def flatMapWithNext[U](f: T => Input => Result[U]): Result[U]
    def append[U >: T](alt: => Result[U]): Result[U]
  }
  case class Success[+T](result: T, next: Input) extends Result[T] {
    def map[U](f: T => U) = Success(f(result), next)
    def flatMapWithNext[U](f: T => Input => Result[U]) = f(result)(next)
    def append[U >: T](alt: => Result[U]) = this
  }
  case class Failure(msg: String, next: Input) extends Result[Nothing] {
    def map[U](f: Nothing => U) = this
    def flatMapWithNext[U](f: Nothing => Input => Result[U]) = this
    def append[U](alt: => Result[U]) = alt
  }
}

● map - ...
● flatMapWithNext - ...
● append – for multiple results (we do not consider it here)
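A small usage sketch of the three operations, assuming `Input` is fixed to `String` and with explicit result types added for clarity. The sample values `ok` and `ko` are made up for illustration.

```scala
object ResultOps {
  type Input = String

  sealed trait Result[+T] {
    def map[U](f: T => U): Result[U]
    def flatMapWithNext[U](f: T => Input => Result[U]): Result[U]
    def append[U >: T](alt: => Result[U]): Result[U]
  }
  case class Success[+T](result: T, next: Input) extends Result[T] {
    def map[U](f: T => U): Result[U] = Success(f(result), next)
    def flatMapWithNext[U](f: T => Input => Result[U]): Result[U] = f(result)(next)
    def append[U >: T](alt: => Result[U]): Result[U] = this
  }
  case class Failure(msg: String, next: Input) extends Result[Nothing] {
    def map[U](f: Nothing => U): Result[U] = this
    def flatMapWithNext[U](f: Nothing => Input => Result[U]): Result[U] = this
    def append[U](alt: => Result[U]): Result[U] = alt
  }

  val ok: Result[Int] = Success(1, "rest")
  val ko: Result[Int] = Failure("no digit", "rest")

  // map transforms the value of a success and keeps the remaining input.
  val mapped: Result[Int] = ok.map(_ + 1)

  // flatMapWithNext hands both the value and the remaining input to the next step.
  val chained: Result[String] =
    ok.flatMapWithNext[String](x => in => Success(x.toString + in, ""))

  // append keeps a success as it is and falls back to the alternative on failure.
  val firstWins: Result[Int] = ok.append(ko)
  val fallback: Result[Int] = ko.append(ok)
}
```

On a `Failure`, `map` and `flatMapWithNext` return the failure unchanged, which is what lets the combinators stop matching on `Success` and `Failure` by hand.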


Parser is a function with many results

type Parser[A] = String => A

type Parser[A] = String => (A, String)

type Parser[A] = String => Option[(A, String)]

type Parser[A] = String => List[(A, String)]

After improving

(SimpleResults exactly as in “Improved Results”.)

trait SimpleParsers extends SimpleResults {
  abstract class Parser[+T] extends (Input => Result[T]) {
    def apply(in: Input): Result[T]
    def flatMap[U](f: T => Parser[U]): Parser[U] = new Parser[U] {
      def apply(in: Input) = Parser.this(in) flatMapWithNext (f)
    }
    def map[U](f: T => U): Parser[U] = new Parser[U] {
      def apply(in: Input) = Parser.this(in) map (f)
    }
    def |[U >: T](p: => Parser[U]): Parser[U] = new Parser[U] {
      def apply(in: Input) = Parser.this(in) append p(in)
    }
    def ~[U](p: => Parser[U]): Parser[(T, U)] =
      for (a <- this; b <- p) yield (a, b)
  }
}

Hey!


So, Parser is a Monad!!
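The practical consequence is that `for` comprehensions work on parsers, because the compiler rewrites them into `flatMap` and `map`. A minimal sketch with a deliberately tiny parser type (a hypothetical `Parser` case class, not the library's):

```scala
object MonadSketch {
  // Assumed toy parser: a wrapped function from input to value plus rest.
  final case class Parser[+A](run: String => Option[(A, String)]) {
    def map[B](f: A => B): Parser[B] =
      Parser(in => run(in).map { case (a, rest) => (f(a), rest) })
    def flatMap[B](f: A => Parser[B]): Parser[B] =
      Parser(in => run(in).flatMap { case (a, rest) => f(a).run(rest) })
  }

  def char(c: Char): Parser[Char] =
    Parser(in => if (in.nonEmpty && in.head == c) Some((c, in.tail)) else None)

  // Sequencing written as a for comprehension...
  val sugared: Parser[(Char, Char)] =
    for (a <- char('o'); b <- char('x')) yield (a, b)

  // ...and the same thing after desugaring.
  val desugared: Parser[(Char, Char)] =
    char('o').flatMap(a => char('x').map(b => (a, b)))
}
```

Both `sugared.run("oxo")` and `desugared.run("oxo")` give `Some((('o', 'x'), "o"))`.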

Where is my “withFilter”?

● In Scala 2.10
● It was not easy...

Removing noise...

trait SimpleParsers extends SimpleResults {

  // Removing boilerplate: wraps the repeated `new Parser { def apply ... }`
  def Parser[T](f: Input => Result[T]) = new Parser[T] {
    def apply(in: Input) = f(in)
  }

  abstract class Parser[+T] extends (Input => Result[T]) {
    def apply(in: Input): Result[T]

    def flatMap[U](f: T => Parser[U]): Parser[U] =
      Parser { in => Parser.this(in) flatMapWithNext (f) }

    def map[U](f: T => U): Parser[U] =
      Parser { in => Parser.this(in) map (f) }

    def |[U >: T](p: => Parser[U]): Parser[U] =
      Parser { in => Parser.this(in) append p(in) }

    def ~[U](p: => Parser[U]): Parser[(T, U)] =
      for (a <- this; b <- p) yield (a, b)
  }

}


Real Parsers

Real Parsers

package scala.util.parsing.combinator

trait Parsers {
  type Elem
  type Input = Reader[Elem]

  sealed abstract class ParseResult[+T]
  case class Success[+T](result: T, override val next: Input) extends ParseResult[T]
  sealed abstract class NoSuccess(val msg: String, override val next: Input)
    extends ParseResult[Nothing]
  case class Failure(override val msg: String, override val next: Input)
    extends NoSuccess(msg, next)
  // Controlling backtracking
  case class Error(override val msg: String, override val next: Input)
    extends NoSuccess(msg, next)

  ...
  abstract class Parser[+T] extends (Input => ParseResult[T]) { ... }
  // Deconstructing sequencing
  case class ~[+a, +b](_1: a, _2: b) {
    override def toString = "("+ _1 +"~"+ _2 +")"
  }

}

package scala.util.parsing.input

// Stream annotated with coordinates
abstract class Reader[+T] {
  def first: T
  def rest: Reader[T]
}

Simplified picture

package scala.util.parsing.combinator

trait Parsers {
  type Elem
  type Input = Reader[Elem]

  sealed abstract class ParseResult[+T]
  abstract class Parser[+T] extends (Input => ParseResult[T]) {
    combinators
  }

  combinators
}


Combinators

Basic Combinators

package scala.util.parsing.combinator

trait Parsers {

  def elem(kind: String, p: Elem => Boolean): Parser[Elem]
  def elem(e: Elem): Parser[Elem]
  implicit def accept(e: Elem): Parser[Elem]

  abstract class Parser[+T] extends (Input => ParseResult[T]) {
    def ~ [U](q: => Parser[U]): Parser[~[T, U]]
    def <~ [U](q: => Parser[U]): Parser[T]
    def ~! [U](p: => Parser[U]): Parser[~[T, U]]
    def | [U >: T](q: => Parser[U]): Parser[U]
    def ||| [U >: T](q0: => Parser[U]): Parser[U]
    def ^^ [U](f: T => U): Parser[U]
    def ^^^ [U](v: => U): Parser[U]
    def ^? [U](f: PartialFunction[T, U], error: T => String): Parser[U]
    def ^? [U](f: PartialFunction[T, U]): Parser[U]
    def >> [U](fq: T => Parser[U]): Parser[U]
    def *: Parser[List[T]]
    def +: Parser[List[T]]
    def ?: Parser[Option[T]]
  }
}

Swiss army knife Combinators

package scala.util.parsing.combinator

trait Parsers {

  def commit[T](p: => Parser[T]): Parser[T]
  def accept[ES <% List[Elem]](es: ES): Parser[List[Elem]]
  def accept[U](expected: String, f: PartialFunction[Elem, U]): Parser[U]
  def failure(msg: String): Parser[Nothing]
  def err(msg: String): Parser[Nothing]
  def success[T](v: T): Parser[T]
  def rep[T](p: => Parser[T]): Parser[List[T]]
  def repsep[T](p: => Parser[T], q: => Parser[Any]): Parser[List[T]]
  def rep1[T](p: => Parser[T]): Parser[List[T]]
  def rep1[T](first: => Parser[T], p0: => Parser[T]): Parser[List[T]]
  def repN[T](num: Int, p: => Parser[T]): Parser[List[T]]
  def rep1sep[T](p : => Parser[T], q : => Parser[Any]): Parser[List[T]]
  def chainl1[T](p: => Parser[T], q: => Parser[(T, T) => T]): Parser[T]
  def chainl1[T, U](first: => Parser[T], p: => Parser[U], q: => Parser[(T, U) => T]): Parser[T]
  def chainr1[T, U](p: => Parser[T], q: => Parser[(T, U) => U], combine: (T, U) => U, first: U): Parser[U]
  def opt[T](p: => Parser[T]): Parser[Option[T]]
  def not[T](p: => Parser[T]): Parser[Unit]
  def guard[T](p: => Parser[T]): Parser[T]
  def positioned[T <: Positional](p: => Parser[T]): Parser[T]
  def phrase[T](p: Parser[T]): Parser[T]
}

Inspired by G. Hutton and E. Meijer. Monadic Parser Combinators.


Lexing

The simplest (low-level) parser

import scala.util.parsing.combinator.Parsers
import scala.util.parsing.input.CharSequenceReader
import scala.util.parsing.input.CharSequenceReader.EofCh

trait SimplestParsers extends Parsers {
  type Elem = Char
  def whitespaceChar: Parser[Char] = elem("space char", ch => ch <= ' ' && ch != EofCh)
  def letter: Parser[Char] = elem("letter", _.isLetter)

  def whitespace: Parser[List[Char]] = rep(whitespaceChar)
  def ident: Parser[List[Char]] = rep1(letter)

  def parse[T](p: Parser[T], in: String): ParseResult[T] = p(new CharSequenceReader(in))
}

scala> val p1 = new SimplestParsers{}
p1: java.lang.Object with parsers.SimplestParsers = $anon$1@17d59ff0

scala> import p1._
import p1._

scala> parse(letter, "foo bar")
res0: p1.ParseResult[Char] = [1.2] parsed: f

scala> parse(ident, "foo bar")
res1: p1.ParseResult[List[Char]] = [1.4] parsed: List(f, o, o)

scala> parse(ident, "123")
res2: p1.ParseResult[List[Char]] = [1.1] failure: letter expected

123
^

Towards AST

trait Token
case class Id(n: String) extends Token
case class Num(n: String) extends Token
case object ErrorToken extends Token

trait TokenParsers extends Parsers {

  type Elem = Char

  private def whitespaceChar: Parser[Char] = elem("space char", ch => ch <= ' ' && ch != EofCh)
  def letter: Parser[Char] = elem("letter", _.isLetter)
  def digit: Parser[Char] = elem("digit", _.isDigit)

  def whitespace: Parser[List[Char]] = rep(whitespaceChar)
  def idLit: Parser[String] = rep1(letter) ^^ { _.mkString("") }
  def numLit: Parser[String] = rep1(digit) ^^ { _.mkString("") }

  def id: Parser[Token] = idLit ^^ Id
  def num: Parser[Token] = numLit ^^ Num

  def token = id | num

  def parse[T](p: Parser[T], in: String): ParseResult[T] = p(new CharSequenceReader(in))

}

Lexer/Scanner

trait Scanners extends TokenParsers {
  class Scanner(in: Reader[Char]) extends Reader[Token] {
    def this(in: String) = this(new CharArrayReader(in.toCharArray()))
    // Skip whitespace, then read one token; remember where the token ended.
    private val (tok, rest1, rest2) = whitespace(in) match {
      case Success(_, in1) =>
        token(in1) match {
          case Success(tok, in2) => (tok, in1, in2)
          case ns: NoSuccess => (ErrorToken, ns.next, ns.next.rest)
        }
      case ns: NoSuccess => (ErrorToken, ns.next, ns.next.rest)
    }
    def first = tok
    def rest = new Scanner(rest2)
  }
}

scala> val scs = new Scanners {}
scs: java.lang.Object with Scanners = $anon$1@68a750a

scala> val reader = new scs.Scanner("foo 123")
reader: scs.Scanner = Scanners$Scanner@6a75863f

scala> reader.first
res0: Token = Id(foo)

scala> reader.rest.first
res1: Token = Num(123)

scala> reader.rest.rest.first
res2: Token = ErrorToken

Lexing

Low-level parsing: Reader[Char] → Reader[Token]
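The two-layer idea (characters in, tokens out) can be sketched without the library at all. This is an assumption-laden simplification: the hypothetical `tokens` function returns a plain `List[Token]` instead of a lazy `Reader[Token]`, and it stops at the first character it cannot classify.

```scala
object LexSketch {
  sealed trait Token
  case class Id(n: String) extends Token
  case class Num(n: String) extends Token
  case object ErrorToken extends Token

  // Skip whitespace, then read the longest run of letters or digits.
  def tokens(in: String): List[Token] = {
    val s = in.dropWhile(_ <= ' ')
    if (s.isEmpty) Nil
    else if (s.head.isLetter) {
      val (word, rest) = s.span(_.isLetter)
      Id(word) :: tokens(rest)
    } else if (s.head.isDigit) {
      val (digits, rest) = s.span(_.isDigit)
      Num(digits) :: tokens(rest)
    } else List(ErrorToken) // unknown character: give up, as a scanner would report an error
  }
}
```

`LexSketch.tokens("foo 123 ?")` returns `List(Id("foo"), Num("123"), ErrorToken)`; the grammar-level parser then works on these tokens rather than on raw characters.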

Typical Parser

RAM++

AST

Parser

Implicit magic

“~” magic

So, ...

● Parser combinators in Scala let you write executable grammars in a form close to BNF.

● The internals of Parser Combinators are a true Programming Pearl.

● Internal DSL for External DSLs.

Discussion (Parser Combinators vs Parser Generator)

PROS

● Same language (Scala): no new tool to learn.

● An executable grammar: the code is always up to date.

● Brevity plus rich expressiveness: LL(*) and beyond (including context-sensitive grammars).

● Parsing can be fused with other processing.

● Modularity
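The context-sensitive point in the list above comes from monadic sequencing (`flatMap`, listed as `>>` among the basic combinators): the next parser may depend on a value that was just parsed. A minimal sketch with a hypothetical toy `Parser` type, where a leading digit says how many characters follow:

```scala
object ContextSensitiveSketch {
  // Assumed toy parser type, not the library's.
  final case class Parser[+A](run: String => Option[(A, String)]) {
    def map[B](f: A => B): Parser[B] =
      Parser(in => run(in).map { case (a, rest) => (f(a), rest) })
    def flatMap[B](f: A => Parser[B]): Parser[B] =
      Parser(in => run(in).flatMap { case (a, rest) => f(a).run(rest) })
  }

  // One decimal digit, as an Int.
  val digit: Parser[Int] =
    Parser(in =>
      if (in.nonEmpty && in.head.isDigit) Some((in.head.asDigit, in.tail)) else None)

  // Exactly n characters, whatever they are.
  def take(n: Int): Parser[String] =
    Parser(in => if (in.length >= n) Some((in.take(n), in.drop(n))) else None)

  // "3abc": how much to read depends on the value parsed first.
  val lengthPrefixed: Parser[String] = digit.flatMap(n => take(n))
}
```

`lengthPrefixed.run("3abcd")` gives `Some(("abc", "d"))`, and `lengthPrefixed.run("3ab")` fails with `None`. A fixed context-free grammar cannot express "read as many characters as the number says".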

CONS

● Some simple things can be surprisingly hard to encode.

● Performance.

Performance

The hand-written lift-json parser is 350 times faster than the version based on parser combinators (proof link)


Packrat Parsers

Parsing “9”: Too much backtracking

(ArithParsers exactly as in “Hello, parser”.)

On the input "9", expr first tries term ~ ("+" ~> expr): it parses a term, fails to find "+", backtracks, and parses the same term again for the second alternative. term does the same with factor, so the single token 9 is parsed as a factor four times.


Idea: Memoization (Really, Laziness)
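The memoization idea can be sketched as a table from start position to result. This is an illustration under assumptions and not how PackratParsers is implemented; the type `P` and the helpers `digitAt` and `memo` are hypothetical, and the `calls` counter exists only to make the effect visible.

```scala
import scala.collection.mutable

object MemoSketch {
  // Assumed toy parser over a fixed input: start position => value and next position.
  type P[A] = Int => Option[(A, Int)]

  var calls = 0 // how many times the underlying parser really ran

  def digitAt(input: String): P[Int] = pos => {
    calls += 1
    if (pos < input.length && input(pos).isDigit) Some((input(pos).asDigit, pos + 1))
    else None
  }

  // Remember the result per start position, so that re-parsing the same
  // position after backtracking becomes a table lookup.
  def memo[A](p: P[A]): P[A] = {
    val table = mutable.Map.empty[Int, Option[(A, Int)]]
    pos => table.getOrElseUpdate(pos, p(pos))
  }
}
```

After `val p = MemoSketch.memo(MemoSketch.digitAt("9"))`, calling `p(0)` any number of times runs the underlying parser once; each rule at each position is computed at most once.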

+ Left Recursion

(LamParsers exactly as in “Example 2”: it mixes in PackratParsers and every rule is a PackratParser.)

The rule for application is left-recursive:

lazy val appTerm: PackratParser[Term] = appTerm ~ aTerm ^^ App | aTerm

Note that all rules are declared as lazy val.

Without Left Recursion

sealed trait Term
case class Var(n: String) extends Term
case class Lam(v: Var, body: Term) extends Term
case class App(t1: Term, t2: Term) extends Term

object LamParsers extends StandardTokenParsers with ImplicitConversions {
  lexical.delimiters += ("(", ")", ".", "\\")
  lazy val term: Parser[Term] = appTerm | lam
  lazy val vrb: Parser[Var] = ident ^^ Var
  lazy val lam: Parser[Term] = ("\\" ~> vrb) ~ ("." ~> term) ^^ Lam
  // One or more atoms, folded from the left into nested applications.
  lazy val appTerm: Parser[Term] = (aTerm +) ^^ { _.reduceLeft(App) }
  lazy val aTerm: Parser[Term] = vrb | "(" ~> term <~ ")"
  def parseTerm(s: String) = phrase(term)(new lexical.Scanner(s))
}
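The `reduceLeft` step is what restores left associativity once the left-recursive rule is replaced by repetition. A sketch on plain data, where the list `atoms` stands in for what `aTerm +` would return for the input "x y z" (an assumption for illustration; no parsing happens here):

```scala
object LeftAssocSketch {
  sealed trait Term
  case class Var(n: String) extends Term
  case class App(t1: Term, t2: Term) extends Term

  // Stand-in for the result of `aTerm +` on "x y z": the atoms in order.
  val atoms: List[Term] = List(Var("x"), Var("y"), Var("z"))

  // Folding from the left rebuilds the left-nested application tree.
  val term: Term = atoms.reduceLeft((f, arg) => App(f, arg))
}
```

`LeftAssocSketch.term` is `App(App(Var("x"), Var("y")), Var("z"))`, the same shape the left-recursive packrat version produced for "x y z".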


Packrat Performance

Other Parsers

● Parboiled (PEG parser)
● GLL parser
● Derivative combinators

http://stackoverflow.com/questions/4423514/scala-parsers-availability-differences-and-combining

Trends

● Merging two worlds
● Compositionality (Functional programming)
● Performance

Thank you!


https://github.com/ilya-klyuchnikov/tapl-scala