A simple, straightforward implementation of PikaParser in pure Julia, following the specification by Luke A. D. Hutchison (see https://github.com/lukehutch/pikaparser).
Pika parsers are pretty fast and easy to specify, can unambiguously match all PEG grammars, including the left-recursive ones, and provide great mechanisms for recovering from parsing errors.

    import PikaParser as P

All grammar clauses are subtypes of `Clause`. The types are indexed by the
labels for your grammar rules -- Julia symbols are a natural choice, but you
are free to use integers, strings, or anything else.

    rules = Dict(
        # match a sequence of characters that satisfies `isdigit`
        :digits => P.some(:digit => P.satisfy(isdigit)),

        # expression in parentheses
        :parens => P.seq(
            P.token('('),
            # you can name the rules in nested contexts
            :expr => P.first(:plusexpr, :minusexpr, :digits, :parens),
            P.token(')'),
        ),

        # some random operators
        :plusexpr => P.seq(:expr, P.token('+'), :expr),
        :minusexpr => P.seq(:expr, P.token('-'), :expr),
    )

    g = P.make_grammar(
        [:expr], # the top-level rule
        P.flatten(rules, Char), # process the rules into a single level and specialize them for crunching Chars
    )

The grammar is now prepared for parsing.
Parsing is executed simply by running your grammar on any indexable input using
parse.
(Notably, PikaParsers require frequent indexing of inputs, and incremental
parsing of streams is thus complicated. To improve the performance, it is also
advisable to lex your input into a vector of more complex tokens, using e.g.
parse_lex.)
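
The idea of lexing the input into richer tokens can be sketched in plain Julia, independently of the library. The `Token` type and the `lex` function below are hypothetical names made up for this illustration; they are not `parse_lex` and not part of PikaParser's API. The sketch only shows how the characters of an input such as `12-(34+567-8)` collapse into a shorter vector of tokens, which is what a parser would then have to index.

```julia
# Hypothetical token type for this sketch; not part of PikaParser.
struct Token
    kind::Symbol   # :number, :op, :lparen or :rparen
    text::String
end

# Hand-written lexer for the tiny arithmetic language used in this README.
function lex(input::AbstractString)
    tokens = Token[]
    i = firstindex(input)
    while i <= lastindex(input)
        c = input[i]
        if isdigit(c)
            # extend `j` to the last digit of the current run
            j = i
            while j < lastindex(input) && isdigit(input[nextind(input, j)])
                j = nextind(input, j)
            end
            push!(tokens, Token(:number, input[i:j]))
            i = nextind(input, j)
        elseif c in ('+', '-')
            push!(tokens, Token(:op, string(c)))
            i = nextind(input, i)
        elseif c == '('
            push!(tokens, Token(:lparen, "("))
            i = nextind(input, i)
        elseif c == ')'
            push!(tokens, Token(:rparen, ")"))
            i = nextind(input, i)
        else
            error("unexpected character $(repr(c)) at index $i")
        end
    end
    return tokens
end

tokens = lex("12-(34+567-8)")
```

A grammar working over such a vector would presumably match whole tokens rather than single `Char`s; how that is wired into the grammar is not covered by this sketch.
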

    input = "12-(34+567-8)"
    p = P.parse(g, input)

You can find out whether an expression was matched at a certain position:

    P.find_match_at!(p, :expr, 1)

...which returns an index in the match table (if found), such as `45`.
You can have a look at the match: `p.matches[45]` should return:

    PikaParser.Match(10, 1, 13, 2, 52, 0, 41, 0)

where 10 is the renumbered rule ID for :expr, 1 is the starting position
of the match in the input, 13 is the last position of the match (here, that
means the whole input); 2 is the option index (in this case, it points to
:expr option 2, which is :minusexpr). The rest of the Match structure is
used for internal values that organize the match tree and submatches.
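
The field layout described above can be illustrated with a small stand-in structure. `MatchSketch` and its field names are assumptions made for this example only; the real `Match` type may name its fields differently, and the four trailing values are kept opaque here.

```julia
# Hypothetical mirror of the values printed above; field names are illustrative.
struct MatchSketch
    rule_id::Int               # renumbered rule ID (10 stands for :expr here)
    first::Int                 # starting position of the match in the input
    last::Int                  # last position of the match in the input
    option_idx::Int            # which alternative of the rule matched
    internals::NTuple{4,Int}   # match-tree bookkeeping, not interpreted here
end

m = MatchSketch(10, 1, 13, 2, (52, 0, 41, 0))

input = "12-(34+567-8)"
matched = input[m.first:m.last]

# The alternatives of :expr, in the order given in the grammar.
options = [:plusexpr, :minusexpr, :digits, :parens]
chosen = options[m.option_idx]
```

Slicing the input by the first and last positions recovers the matched text (here the whole input), and the option index selects the alternative of `:expr` that succeeded (here `:minusexpr`).
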
You can use traverse_match to recursively walk the parse trees, to produce
ASTs, and translate, interpret or evaluate the expressions:

    P.traverse_match(p, P.find_match_at!(p, :expr, 1))

By default, this runs through the whole match tree and transcodes the matches
to a Julia `Expr` AST. In this case, if you pipe the output through
JuliaFormatter, you will get something like:

    expr(
        minusexpr(
            expr(digits(digit("1"), digit("2"))),
            var"minusexpr-2"("-"),
            expr(
                parens(
                    var"parens-1"("("),
                    expr(
                        plusexpr(
                            expr(digits(digit("3"), digit("4"))),
                            var"plusexpr-2"("+"),
                            expr(
                                minusexpr(
                                    expr(digits(digit("5"), digit("6"), digit("7"))),
                                    var"minusexpr-2"("-"),
                                    expr(digits(digit("8"))),
                                ),
                            ),
                        ),
                    ),
                    var"parens-3"(")"),
                ),
            ),
        ),
    )

It is straightforward to specify your own method of evaluating the parses by supplying the match tree opening and folding functions. For example, you can evaluate the expression as follows:

    P.traverse_match(p, P.find_match_at!(p, :expr, 1),
        fold = (m, p, subvals) ->
            m.rule == :digits ? parse(Int, m.view) :
            m.rule == :expr ? subvals[1] :
            m.rule == :parens ? subvals[2] :
            m.rule == :plusexpr ? subvals[1] + subvals[3] :
            m.rule == :minusexpr ? subvals[1] - subvals[3] : nothing,
    )

You should get the expected result (-581).
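
To make the bottom-up order of such a fold explicit, here is a self-contained sketch in plain Julia that does not use the library at all. The `Node` type, `fold_tree`, and the helper constructors are hypothetical names introduced for this illustration, and the tree is built by hand to mirror the parse shown earlier; it is not PikaParser's internal representation or traversal code.

```julia
# Hypothetical stand-in for a match tree node: rule label, matched text, children.
struct Node
    rule::Symbol
    text::String
    children::Vector{Node}
end

leaf(rule, text) = Node(rule, text, Node[])
tok(s) = leaf(:token, s)                      # anonymous token such as "-" or "("
num(s) = Node(:expr, s, [leaf(:digits, s)])   # expr wrapping a run of digits

# Post-order traversal: fold all children first, then the node itself.
function fold_tree(fold, node::Node)
    subvals = [fold_tree(fold, child) for child in node.children]
    return fold(node, subvals)
end

# Same dispatch-on-rule idea as the `fold` function passed to traverse_match.
evaluate(n, subvals) =
    n.rule == :digits ? parse(Int, n.text) :
    n.rule == :expr ? subvals[1] :
    n.rule == :parens ? subvals[2] :
    n.rule == :plusexpr ? subvals[1] + subvals[3] :
    n.rule == :minusexpr ? subvals[1] - subvals[3] : nothing

# Hand-built tree for 12-(34+567-8), nested as in the formatted output above.
difference = Node(:expr, "567-8",
    [Node(:minusexpr, "567-8", [num("567"), tok("-"), num("8")])])
addition = Node(:expr, "34+567-8",
    [Node(:plusexpr, "34+567-8", [num("34"), tok("+"), difference])])
parens = Node(:expr, "(34+567-8)",
    [Node(:parens, "(34+567-8)", [tok("("), addition, tok(")")])])
tree = Node(:expr, "12-(34+567-8)",
    [Node(:minusexpr, "12-(34+567-8)", [num("12"), tok("-"), parens])])

result = fold_tree(evaluate, tree)
```

Because the parse nests `567-8` inside the addition, the value is computed as 12 - (34 + (567 - 8)), which is where -581 comes from.
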

PikaParser.jl was developed at the Luxembourg Centre for Systems
Biomedicine of the University of Luxembourg
(uni.lu/lcsb).
The development was supported by the European Union's Horizon 2020 Programme under
the PerMedCoE project (permedcoe.eu),
agreement no. 951773.