Solving AOC#2 with Parslet

by Paweł Świątkowski
02 Dec 2020

It’s December again and this means that another edition of Advent of Code is under way. Most of us have probably already solved day 3 challenge, but I want to go back a bit and show an alternative solution to Day 2: Password Philosophy.

tl;dr (for those who don’t want to read the whole backstory): you are given 1000 lines like these: 1-3 a: abcde. What it means is: an abcde string should contain between 1 and 3 letters a. You need to count for how many lines this is true.

For most of us probably an intuitive solution is to take each line, apply some regex (or even string splitting) on it, do some casting and checking on matches/split results, and, based on the result, increment some counter or not. This is, by the way, exactly what I did in my first iteration. But then I stared some more at the input and realized that we might, in fact, treat the whole file as a program written in some exotic language. And as a result, that we could use some tools for writing parsers and interpreters to run this program and get a result.

Solving it with Parslet

I chose Parslet to write my alternative solution, because I had some experience with it in the past. Also because I could write everything in one Ruby file - in many other tools you define your grammar in a separate file, then “compile it” into a Ruby code, being an actual parser. There are usually faster but also have a higher entry barrier.

I’m not gonna explain Parslet in detail. Others, especially its author, already did a great job doing it and you should probably at least read “Getting started” first. What I’ll do is just present my solution with some comments.

Working with Parslet consists of two steps. First is parsing the input into an internal hash-like representation. Then we do transformation step. In many tutorials the result of the transformation is an AST (abstract syntax tree) on which the actual interpreter operates, but we can (and will) omit that step, writing a bit of logic directly into a transform class.

Parser / grammar

Our program we want to parse consists of many lines. Each line has a following structure:

number -> dash -> number -> space -> letter -> colon -> space -> string -> (optional) new line

(the new line is optional because the last line of the file does not have it)

This is how it looks expressed in Parslet’s DSL:

class Parser < Parslet::Parser
  rule(:space) { match('\s') }
  rule(:newl?) { match('\n').maybe }
  rule(:dash) { str('-') }
  rule(:colon) { str(':') }

  rule(:num) { match('[0-9]').repeat(1).as(:num) }
  rule(:letter) { match('[a-z]').as(:letter) }
  rule(:string) { letter.repeat(1) }

  rule(:line) { num.as(:min) >> dash >> num.as(:max) >> 
    space >> letter.as(:target) >> 
    colon >> space >> string.as(:string) >> newl? }
  rule(:lines) { line.repeat(1).as(:lines) }
  root :lines
end

As you can see, this is very similar to a pseudocode written above. That’s what’s great with Parslet and PEG grammars in general: they are quite easy to wrap your head around writing them, even without much knowledge how parsers work. If you run the parser on our example from the beginning of the post and pretty-print it, this should be the result:

{:lines=>
  [{:min=>{:num=>"1"@0},
    :max=>{:num=>"3"@2},
    :target=>{:letter=>"a"@4},
    :string=>
     [{:letter=>"a"@7},
      {:letter=>"b"@8},
      {:letter=>"c"@9},
      {:letter=>"d"@10},
      {:letter=>"e"@11}]}]}

This is a hash-like representation I told you about. Now it’s time to do something with it.

Transformation

The role of transformation step is to take a hash and transform it into something nicer. In our case we will go all in and transform it into a single number: the amount of lines that tell the truth (meet the condition they specify).

This is a class doing it:

class Transform < Parslet::Transform
  rule(:num => simple(:int)) { Integer(int) }
  rule(:letter => simple(:char)) { char.to_s }
  rule(
    :min => simple(:min),
    :max => simple(:max),
    :target => simple(:target),
    :string => sequence(:letters)
  ) { frequency = letters.tally.fetch(target, 0); frequency >= min && frequency <= max ? 1 : 0 }
  rule(:lines => sequence(:lines)) { lines.sum }
end

This is not that straightforward as writing a parser, but the basic rules are still simple. We write a rule with a kind of pattern matching for each hash-like part of the parser’s output and then we add a block doing a transformation. For example, the first rule(:num => simple(:int)) { Integer(int) } matches both {:num=>"1"@0} and {:num=>"3"@2}. It treats a value of num as a simple value and applies casting to integer on it.

One rule certainly stands out here. It’s a rule for matching a line. But it’s no surprise as the grammar for a line was also the most complex one. This rule reduces each line to number 1 (for a line which is correct) or 0 (for a line that is incorrect). Finally, a lines rule just takes all the lines and sums it.

Wrapping up

I’m sorry to tell you this, but that’s it. We wrote a simple parser and transform-o-interpreter, now we can just run our program with:

parsed = Parser.new.parse(File.read('input'))
puts Transform.new.apply(parsed)

The result will be our answer for the riddle. The complete file is here.

Why would I do it like that? Solution using built-in regexes is probably more familiar, faster and you can write it quicker. Sure. But I find this kind of problem-solving a good way to learn something new. The grammar is relatively easy so you won’t be caught up in recursive rules or other things that might be surprisingly difficult in PEG grammars. And, what is probably most important, it’s much more enjoyable than doing it like everyone else.

end of the article

Tags: ruby parsers

This article was written by me – Paweł Świątkowski – on 02 Dec 2020. I'm on Fediverse (Ruby-flavoured account, Elixir-flavoured account) and also kinda on Twitter. Let's talk.

Related posts: