Reporting on the first workshop on Parsing@SLE

20 Nov 2013 - Jurgen Vinju and Eric Van Wyk

In 2013 we organized the first workshop on Parsing at SLE (Parsing@SLE) as a pre-conference workshop with SLE in Indianapolis. The goal of this workshop was to bring together as many parsing experts as possible and find out what the state-of-the-art in parsing technology is and what the future will hold.

Parsing@SLE is also a vehicle for sparking the interest of more people to join the SLE community.

Participation

During the summer of 2013 we ran a targeted email campaign, inviting to the workshop everybody who had ever written or maintained a parser generator, written a paper about one, or blogged about one. As a result we had 40 enthusiastic participants at the workshop!

Format

The format of the workshop was as follows:

  • We invited people to submit short position papers.
  • From the 13 submissions we composed an interactive program, consisting of:
    • 10 minute talks, hard time limits
    • inter-mixed with longer discussions, free form
    • tightly scheduled coffee breaks

This formula seems to suit this kind of workshop: the attendees unanimously said that the workshop was enjoyable and useful.

We started in the morning with three 10-minute talks in one block followed by 30 minutes of discussion, but after lunch we moved naturally to 10-minute talks interspersed with 10-minute discussions.

The discussions were usually sparked by the presentations but did not always stick to the original point, and we did not enforce this either. This freedom of discussion topic was deemed to be good for this first instance of Parsing@SLE, but some participants expressed the hope that next year at least a part of the day would be reserved for focused discussion on particular topics.

Example topics that were proposed to focus on next year:

  • The principled definition, evaluation and comparison of parser run-time behavior on typical use cases, as opposed to plain old worst case analysis or benchmarking without a background theory of what it is that parsers do at run-time.
  • Implementation tactics: what makes the difference between a fast and a slow parser implementation.
  • Off-the-beaten-track: step away from our warts, e.g. the parsing-takes-a-string-as-input-and-produces-a-tree-as-output meme, or parsing programming languages. What else needs parsing or is like parsing and what other effects do parsers have? Examples are “abstract parsing”, “incremental parsing”, “parsing streams of events”, “safety/security application area”, “fuzzy parsing”, “statistical parsing”.

Output

As output of Parsing@SLE 2013 we can mention:

  • parsing@sleconf.org - mailing list of interested participants
  • new SLE community members
  • a handful of registrants who came only for Parsing@SLE.

Testimonials

At the end of the workshop we asked each individual participant what he or she got out of attending this workshop. The following list summarizes the responses, removing duplicates and generalizing towards sound bites:

  • “Modularity and composability in parsing [technology] is in more people’s minds than I had thought”
  • “I did not know so many people cared about deterministic parsing still, while we have all these great general[ized] parsing technologies.”
  • “I did not know about these general[ized] parsing technologies, and I am going to look into this to see what they have to offer over deterministic parsing”
  • “It was brought to my attention again that the use case of a parser greatly influences its design choices; consider an editor in an IDE versus a compiler.”
  • “Parser performance is more important than I had considered before. It’s still a number one priority in this age of superfast computers”
  • “What a surprise! More than 10 people actually showed up!”
  • “I was interested most to hear about Boolean Grammars, they are a novel concept to me and I would like to hear more about it.”
  • “Great workshop, but next time let’s consider our application areas with more focus. It’s all in the details!”
  • “Aha! So there is more going on in the parsing world than just ANTLR!”
  • “It was particularly good to see such a wide coverage over technological and theoretical backgrounds between the different participants in this workshop.”
  • “I believe the natural language processing people would find a home here too”
  • “We should teach this stuff more”
  • “OH: Incremental parsing is a competitor of projectional editing”
  • “Regular expressions meet context-free grammars at this workshop, in terms of meta notation and performance optimization questions”
  • “We realized today we need to characterize precisely (in a principled manner) what a parser does at run-time in order to start comparing different parser technologies.”
  • “Parsing interacts at many points with general computer science and software engineering”
  • “This kind of interactive workshop is what I prefer most.”
  • “I learned some techniques that may help me improve my next parser [generator].”

Program

This is the program as it was planned for the day:

8:30-10:00 First session, welcome

  • Vinju, Van Wyk: Introduction, goal of the day

  • Participant Introduction/Parsing visions round

  • 10 minute talks:
    • Erdweg - partial evaluation of GLL interpretive parser to improve speed
    • Johnstone/Scott - performance characterization of GLL
    • Afroozeh - OO lazy scannerless parser - performance using DFA for ‘scanning’
  • Discussion
    • what the speakers bring, or:
    • Trade-offs: non-determinism versus determinism, benefits & pitfalls

10:30-12:00 Second session

  • Watson - Exotic Grammatical Operators
  • Patterson/Hirsch - combinators, in C, for parsing binary data
  • Grigorev/Kirilenko - abstract parsers - analysis of dynamically created strings during program execution
  • McDirmid - incremental compilation, live programming

  • Discussion
    • what the speakers bring, or:
    • Parsing in the IDE: special requirements and opportunities.

13:30-15:00 Third session

  • Diekmann/Tratt - language boxes
  • Omar - Wyvern - extensible type driven parsing
  • Kaminski/Van Wyk - LALR(1)
  • Stevenson/Cordy - boolean grammars, for different views of programs, representing ambiguous parse trees

  • Discussion:
    • What the speakers bring, or:
    • Global synergy: how can we converge the effort on parser tooling?

15:30-17:00 Final session

  • Parr - ALL(*) model of parsing in ANTLR
  • Break out in groups of three or four for 45 minutes
    • Assignment: what is the future for the field?
    • open problems in parsing and parser generation
    • Formulate, explore, related work
  • Get together again for 45 minutes
    • to harvest, each group chooses a representative who presents results for 5 minutes
  • The end

Participants

TBD