Driver License Parser Generators

A list of Python parsing tools, initially imported from @nedbat's blog post.

Human-parser-generator - A straightforward recursive descent Parser Generator with a focus on 'human' code generation and ease of use.

The List

| Name | Description | License | Updated | Parses | Used By | Notes |
|---|---|---|---|---|---|---|
| Ply | Docstrings are used to associate lexer or parser rules with actions. The lexer uses Python regular expressions. | LGPL | v3.6 4/2015 | LALR(1) | lesscpy | ply-hack group |
| pyparsing | Direct parser objects in Python, built to parallel the grammar. | MIT | v2.0.3 8/2014 | | twill | |
| ANTLR | Parser and lexical analyzer generator in Java. Generates parsing code in Python (as well as Java, C++, C#, Ruby, etc). | BSD | v4.4 7/2014 | LL(*) | | |
| pyPEG | A parsing expression grammar toolkit for Python. | GPL | v 2.15 1/2015 | PEG | | |
| pydsl | A language workbench written in Python. | GPLv3 | v 0.5.2 11/2014 | | | |
| LEPL | A recursive descent parser. | dual licensed MPL/LGPL | v 5.1.3 9/2012 | | | Discontinued |
| Codetalker | Python-based grammar definitions. | MIT | v 1.1 3/2014 | | | |
| funcparserlib | Recursive descent parsing library for Python based on functional combinators. | MIT | v0.3.6 5/2013 | | | |
| picoparse | | | v0.9 3/2009 | | | |
| Aperiot | | Apache 2.0 | v0.1.12 1/2012 | | | |
| PyGgy | Lexes with DFA from a specification in a .pyl file. Parses GLR grammars from a specification in a .pyg file. Rules in both files have Python action code. Unlike most parser generators, the rule actions are all executed in a post-processing step. The parser isn't represented as a discrete object, but as globals in a module. | Public Domain | v0.4 8/2004 | | | Python 3 compatible fork 0.4.1, discussion group |
| Parsing | LR(1) parser generator as well as CFSM and GLR parser drivers. | MIT | v1.4 12/2012 | LR(1), CFSM, and GLR | | |
| Rparse | | GPL | v 1.1.0 4/2010 | | | LL(1) parser generator with AST generation. |
| SableCC | Java-based parser and lexical analyzer generator. Generates parsing code in Java, with alternative generators for other languages including Python. | LGPL | v 3.7 11/2012 | | | |
| GOLD Parser | | zlib/libpng | v 5.2.0 8/2012 | LALR | | |
| Plex | Python module for constructing lexical analysers. | LGPL | v 2.0 12/2009 | | | Compiles all of the regular expressions into a single DFA. |
| Plex3 | Python3 port of Plex | | 8/2012 | | | No official release |
| yeanpypa | Yeanpypa is (yet another) framework to create recursive-descent parsers in Python. | Public Domain | 4/2010 | | | Parsers are created by writing an EBNF-like grammar as Python expressions. |
| ZestyParser | | MIT | v 0.8.1 4/2007 | | | |
| BisonGen | | | v 0.8.0b1 4/2005 | | | |
| DParser for Python | A scannerless GLR parser | BSD | v 1.3.0 3/2013 | | | Charming Python: DParser for Python: Exploring Another Python Parser |
| Yapps | Produces recursive-descent parsers, as a human would write. Designed to be easy to use rather than powerful or fast. Better suited for small parsing tasks like email addresses, simple configuration scripts, etc. | MIT | v 2.1.1 8/2003 | | | |
| PyBison | Python binding to the Bison (yacc) and Flex (lex) parser-generator utilities | GPL | v 0.1.8 6/2004 | LALR(1) | | Doesn't yet support Windows. |
| Yappy | | | v 1.9.4 8/2014 | SLR, LR(1) and LALR(1) | | Uses Python strings to declare the grammar. |
| Toy Parser Generator | | LGPL | v 3.2.2 12/2013 | | | |
| kwParsing | An experimental parser generator implemented in Python which generates parsers implemented in Python. | Python License | v 1.3 | SLR | Gadfly | |
| Martel | Martel uses regular expression format definition to generate a parser for flat- and semi-structured files. The parse tree information is passed back to the caller using XML SAX events. In other words, Martel lets you parse flat files as if they are in XML. | BSD | v 0.8 12/2001 | | | Last version included in BioPython |
| SimpleParse | Lexing and parsing in one step, but only deterministic grammars. | BSD | 2.11a2 8/2010 | | | |
| mxTextTools | An unusual table-based parser. There is no generation step, the parser is hand-coded using primitives provided by the package. The parser is implemented in C for speed. | eGenix Public License, similar to Python, compatible with GPL. | v 3.2.8 7/2014 | | SimpleParse, Martel (both listed just above) | |
| SPARK | Uses docstrings to associate productions with actions. Unlike other tools, also includes semantic analysis and code generation phases. | MIT | v 0.7 pre-alpha 7 5/2002 | | | |
| FlexModule and BisonModule | Macros to allow Flex and Bison to produce idiomatic lexers and parsers for Python. The generated lexers and parsers are in C, compiled into loadable modules. | Pythonesque | v 2.1 3/2002 | | | |
| Bison in a box | Uses standard Bison to generate pure Python parsers. It actually reads the bison-generated .c file and generates Python code. | GPL | v 0.1.0 6/2001 | LALR(1) | | |
| Berkeley YACC | Classic YACC, extended to generate Python code. Python support seems to be undocumented. | Public Domain | v 20141128 11/2014 | LALR(1) | | |
| PyLR | Lexer is based on regular expressions. | | 12/1997 | LR | | |
| PyLR | PyLR is a partial Python implementation of the OpenLR specification | Apache 2.0 | 12/2014 | | | announcement |
| Construct | A declarative parser (and builder) for binary data. | BSD | v 2.5.2 4/2014 | | | |
| ModGrammar | A general-purpose library for constructing language parsers and interpreters for context-free grammar definitions. | BSD | v 0.10 2/2013 | | | |
| lrparsing | Differs from other Python LR(1) parsers in using Python expressions as grammars, and offers disambiguation tools. | AGPLv3 | v 1.0.11 3/2015 | LR(1) parser and a tokeniser | | |
| docopt | Generates a parser based on formalized conventions that are used for help messages and man pages for program interface description. | MIT | v 0.6.2 6/2014 | | | |

Standard Modules

The Python standard library includes a few modules for special-purpose parsing problems. These are not general-purpose parsers, but don't overlook them. If your need overlaps with their capabilities, they're perfect:

  • shlex lexes command lines using the rules common to many operating system shells.
  • configparser (named ConfigParser in Python 2) implements a basic configuration file parser language which provides a structure similar to what you would find in Microsoft Windows INI files.
  • argparse makes it easy to write user-friendly command-line interfaces. The program defines what arguments it requires, and argparse will figure out how to parse those out of sys.argv. The argparse module also automatically generates help and usage messages and issues errors when users give the program invalid arguments.
  • email provides many services, including parsing email and other RFC-822 structures.
  • parser parses Python source text (deprecated in Python 3.9 and removed in 3.10).
  • cmd implements a simple command interface, prompting for and parsing out command names, then dispatching to your handler methods.
  • json is a JSON (JavaScript Object Notation) encoder and decoder.
  • tokenize is a lexical scanner for Python source code, implemented in Python.
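
As a quick illustration of how far the standard library gets you, here is a minimal sketch that exercises four of the modules above. The command line, INI text, JSON document, and source snippet are made-up inputs chosen only for the example.

```python
import configparser
import io
import json
import shlex
import tokenize

# shlex: split a command line using shell-like quoting rules.
args = shlex.split('convert "my file.txt" --quality 90')

# configparser: read INI-style text from a string (example content is invented).
config = configparser.ConfigParser()
config.read_string("""
[server]
host = example.org
port = 8080
""")
port = config.getint("server", "port")

# json: decode a JSON document into plain Python objects.
record = json.loads('{"name": "Ply", "parses": ["LALR(1)"]}')

# tokenize: scan Python source text and keep only the identifier tokens.
source = io.StringIO("x = y + 1\n")
names = [tok.string
         for tok in tokenize.generate_tokens(source.readline)
         if tok.type == tokenize.NAME]

print(args)
print(port)
print(record["parses"])
print(names)
```

None of these is a general-purpose parser, but each handles its own format's quoting and edge cases, which is usually the tedious part of writing a parser by hand.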

Articles

  • Simple Top-Down Parsing in Python - A methodology for writing top-down parsers in Python. (7/2008)
  • Pysec: Monadic Combinatoric Parsing in Python - An exposition of using monads to build a Python parser. (2/2008)

Licensing and Attribution


Python Parsing Tools by Michael R. Bernstein is licensed under a Creative Commons Attribution 4.0 International License.
Based on a work at https://github.com/webmaven/python-parsing-tools/.

Parsing

The Parsing module implements an LR(1) parser generator, as well as the runtime support for using a generated parser, via the Lr and Glr parser drivers. There is no special parser generator input file format, but the parser generator still needs to know what classes/methods correspond to various aspects of the parser. This information is specified via docstrings, which the parser generator introspects in order to generate a parser. Only one parser specification can be embedded in each module, but it is possible to share modules between parser specifications so that, for example, the same token definitions can be used by multiple parser specifications.

The parsing tables are LR(1), but they are generated using a fast algorithm that avoids creating duplicate states that result when using the generic LR(1) algorithm. Creation time and table size are on par with the LALR(1) algorithm. However, LALR(1) can create reduce/reduce conflicts that don't exist in a true LR(1) parser. For more information on the algorithm, see:

Parsing table generation requires non-trivial amounts of time for large grammars. Internal pickling support makes it possible to cache the most recent version of the parsing table on disk, and use the table if the current parser specification is still compatible with the one that was used to generate the pickled parsing table. Since the compatibility checking is quite fast, even for large grammars, this removes the need to use the standard code generation method that is used by most parser generators.
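
The caching idea can be sketched in a few lines of standard-library Python. This is not the Parsing module's own code: `spec_fingerprint`, `build_tables`, and `load_or_build` are hypothetical names, the grammar is a plain dict, and a SHA-256 hash stands in for whatever compatibility check the module really performs.

```python
import hashlib
import os
import pickle
import tempfile


def spec_fingerprint(grammar):
    """Hash a grammar description so a cached table can be checked for staleness."""
    canonical = repr(sorted(grammar.items())).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def build_tables(grammar):
    """Stand-in for the expensive table construction (hypothetical)."""
    return {lhs: len(rhs) for lhs, rhs in grammar.items()}


def load_or_build(grammar, cache_path):
    """Return (tables, from_cache), rebuilding only when the grammar changed."""
    fingerprint = spec_fingerprint(grammar)
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            cached = pickle.load(fh)
        if cached.get("fingerprint") == fingerprint:
            return cached["tables"], True
    tables = build_tables(grammar)
    with open(cache_path, "wb") as fh:
        pickle.dump({"fingerprint": fingerprint, "tables": tables}, fh)
    return tables, False


grammar = {"Expr": ["Expr + Term", "Term"], "Term": ["id"]}
cache_path = os.path.join(tempfile.mkdtemp(), "tables.pickle")

first = load_or_build(grammar, cache_path)   # no cache yet: builds and writes it
second = load_or_build(grammar, cache_path)  # same grammar: served from the cache
grammar["Term"].append("( Expr )")
third = load_or_build(grammar, cache_path)   # grammar changed: rebuilt
```

Because the check is only a hash comparison, it stays cheap however large the grammar is, which is the property the paragraph above relies on. As with any pickle file, the cache should only be loaded from a location you trust.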

Parser specifications are encapsulated by the Spec class. Parser instances use Spec instances, but are themselves based on separate classes. This allows multiple parser instances to exist simultaneously, without requiring multiple copies of the parsing tables. There are two separate parser driver classes:

Lr:

Standard Characteristic Finite State Machine (CFSM) driver, based on unambiguous LR(1) parsing tables. This driver is faster than the Glr driver, but it cannot deal with all parsing tables that the Glr driver can.

Glr:

Generalized LR driver, capable of tracking multiple parse trees simultaneously, if the %split precedence is used to mark ambiguous actions. This driver is closely based on Elkhound's design, which is described in a technical report:

Parser generator directives are embedded in docstrings, and must begin with a '%' character, followed immediately by one of several keywords:

Precedence:
%fail, %nonassoc, %left, %right, %split
Token:
%token
Non-terminal:
%start, %nonterm
Production:
%reduce

All of these directives are associated with classes except for %reduce. %reduce is associated with methods within non-terminal classes. The Parsing module provides base classes from which precedences, tokens, and non-terminals must be derived. This is not as restrictive as it sounds, since there is nothing preventing, for example, a master Token class that subclasses Parsing.Token, which all of the actual token types then subclass. Also, nothing prevents using multiple inheritance.
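
To make the docstring mechanism concrete, here is a self-contained sketch of how directives could be collected by introspection. It does not use or reproduce the Parsing module: the `Token` and `Nonterm` base classes are local stand-ins, `directive_of` and `collect` are hypothetical helpers, and the text after each keyword (such as `plus` or `Expr plus Expr`) is illustrative rather than the module's exact grammar syntax.

```python
import inspect

DIRECTIVES = ("%fail", "%nonassoc", "%left", "%right", "%split",
              "%token", "%start", "%nonterm", "%reduce")


def directive_of(obj):
    """Return the leading directive keyword of obj's docstring, or None."""
    words = (obj.__doc__ or "").split()
    if words and words[0] in DIRECTIVES:
        return words[0]
    return None


# Local stand-ins for the base classes a parser specification would subclass.
class Token:
    pass


class Nonterm:
    pass


class Plus(Token):
    "%token plus"


class Number(Token):
    "%token number"


class Expr(Nonterm):
    "%start"

    def reduce_number(self, number):
        "%reduce number"

    def reduce_add(self, left, plus, right):
        "%reduce Expr plus Expr"


def collect(classes):
    """List (directive, name) pairs: class-level first, then %reduce methods."""
    found = []
    for cls in classes:
        keyword = directive_of(cls)
        if keyword:
            found.append((keyword, cls.__name__))
        for name, member in vars(cls).items():
            if inspect.isfunction(member) and directive_of(member) == "%reduce":
                found.append(("%reduce", f"{cls.__name__}.{name}"))
    return found


found = collect([Plus, Number, Expr])
for keyword, name in found:
    print(keyword, name)
```

The sketch mirrors the split described above: every directive is read from a class docstring except %reduce, which is read from methods of non-terminal classes.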

The following base classes are to be subclassed by parser specifications:

  • Precedence
  • Token
  • Nonterm

The Parsing module implements the following exception classes:

  • SpecError - when there is a problem with the grammar specification
  • ParsingException - any problem that occurs during parsing
  • UnexpectedToken - when the input sequence contains a token that is not allowed by the grammar (including end-of-input)

In order to maintain compatibility with legacy code, the Parsing module defines the following aliases. New code should use the exceptions above that do not shadow Python's builtin exceptions.

  • Exception - superclass for all exceptions that can be raised
  • SyntaxError - alias for UnexpectedToken

Additionally, trying to set private attributes may raise:

  • AttributeError

Author: Jason Evans jasone@canonware.com

Github repo: http://github.com/sprymix/parsing