diff --git a/docs/features.md b/docs/features.md
index bf74726..e9f9109 100644
--- a/docs/features.md
+++ b/docs/features.md
@@ -1,7 +1,7 @@
 # Features
 
  - EBNF-inspired grammar, with extra features (See: [Grammar Reference](grammar.md))
- - Builds a parse-tree (AST) automagically based on the grammar 
+ - Builds a parse-tree (AST) automagically based on the grammar
  - Stand-alone parser generator - create a small independent parser to embed in your project.
  - Automatic line & column tracking
  - Automatic terminal collision resolution
@@ -39,11 +39,11 @@ Lark extends the traditional YACC-based architecture with a *contextual lexer*,
 
 The contextual lexer communicates with the parser, and uses the parser's lookahead prediction to narrow its choice of tokens. So at each point, the lexer only matches the subgroup of terminals that are legal at that parser state, instead of all of the terminals. It’s surprisingly effective at resolving common terminal collisions, and allows to parse languages that LALR(1) was previously incapable of parsing.
 
-This is an improvement to LALR(1) that is unique to Lark. 
+This is an improvement to LALR(1) that is unique to Lark.
 
 ### CYK Parser
 
-A [CYK parser](https://www.wikiwand.com/en/CYK_algorithm) can parse any context-free grammar at O(n^3*|G|). 
+A [CYK parser](https://www.wikiwand.com/en/CYK_algorithm) can parse any context-free grammar at O(n^3*|G|).
 
 Its too slow to be practical for simple grammars, but it offers good performance for highly ambiguous grammars.
 
diff --git a/docs/how_to_use.md b/docs/how_to_use.md
index 5113d27..782e54f 100644
--- a/docs/how_to_use.md
+++ b/docs/how_to_use.md
@@ -10,7 +10,7 @@ This is the recommended process for working with Lark:
 
 3. Try your grammar in Lark against each input sample. Make sure the resulting parse-trees make sense.
 
-4. Use Lark's grammar features to [[shape the tree|Tree Construction]]: Get rid of superfluous rules by inlining them, and use aliases when specific cases need clarification. 
+4. Use Lark's grammar features to [[shape the tree|Tree Construction]]: Get rid of superfluous rules by inlining them, and use aliases when specific cases need clarification.
 
  - You can perform steps 1-4 repeatedly, gradually growing your grammar to include more sentences.
 
@@ -32,7 +32,7 @@ grammar = """start: rules and more rules
 rule1: other rules AND TOKENS
      | rule1 "+" rule2 -> add
      | some value [maybe]
- 
+
 rule2: rule1 "-" (rule2 | "whatever")*
 
 TOKEN1: "a literal"
diff --git a/docs/index.md b/docs/index.md
index 7f9e142..7b513f6 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -44,4 +44,4 @@ $ pip install lark-parser
     * [Classes](classes.md)
     * [Cheatsheet (PDF)](lark_cheatsheet.pdf)
 * Discussion
-    * [Forum (Google Groups)](https://groups.google.com/forum/#!forum/lark-parser)
\ No newline at end of file
+    * [Forum (Google Groups)](https://groups.google.com/forum/#!forum/lark-parser)
diff --git a/docs/philosophy.md b/docs/philosophy.md
index 2b72347..9d77ee0 100644
--- a/docs/philosophy.md
+++ b/docs/philosophy.md
@@ -27,7 +27,7 @@ In accordance with these principles, I arrived at the following design choices:
 
 ### 1. Separation of code and grammar
 
-Grammars are the de-facto reference for your language, and for the structure of your parse-tree. For any non-trivial language, the conflation of code and grammar always turns out convoluted and difficult to read. 
+Grammars are the de-facto reference for your language, and for the structure of your parse-tree. For any non-trivial language, the conflation of code and grammar always turns out convoluted and difficult to read.
 
 The grammars in Lark are EBNF-inspired, so they are especially easy to read & work with.
 
@@ -51,7 +51,7 @@ You can skip the building the tree for LALR(1), by providing Lark with a transfo
 
 The Earley algorithm can accept *any* context-free grammar you throw at it (i.e. any grammar you can write in EBNF, it can parse). That makes it extremely useful for beginners, who are not aware of the strange and arbitrary restrictions that LALR(1) places on its grammars.
 
-As the users grow to understand the structure of their grammar, the scope of their target language and their performance requirements, they may choose to switch over to LALR(1) to gain a huge performance boost, possibly at the cost of some language features. 
+As the users grow to understand the structure of their grammar, the scope of their target language and their performance requirements, they may choose to switch over to LALR(1) to gain a huge performance boost, possibly at the cost of some language features.
 
 In short, "Premature optimization is the root of all evil."
 
@@ -60,4 +60,4 @@ In short, "Premature optimization is the root of all evil."
 - Automatically resolve terminal collisions whenever possible
 
 - Automatically keep track of line & column numbers
- 
+
diff --git a/docs/recipes.md b/docs/recipes.md
index 68c2ee4..6c36564 100644
--- a/docs/recipes.md
+++ b/docs/recipes.md
@@ -54,7 +54,7 @@ parser = Lark("""
     %import common (INT, WS)
     %ignore COMMENT
     %ignore WS
-""", parser="lalr", lexer_callbacks={'COMMENT': comments.append}) 
+""", parser="lalr", lexer_callbacks={'COMMENT': comments.append})
 
 parser.parse("""
 1 2 3 # hello
@@ -71,4 +71,4 @@ Prints out:
 [Token(COMMENT, '# hello'), Token(COMMENT, '# world')]
 ```
 
-*Note: We don't have to return a token, because comments are ignored*
\ No newline at end of file
+*Note: We don't have to return a token, because comments are ignored*
diff --git a/docs/tree_construction.md b/docs/tree_construction.md
index 8d2c059..47deab2 100644
--- a/docs/tree_construction.md
+++ b/docs/tree_construction.md
@@ -126,4 +126,4 @@ Lark will parse "hello world" as:
 
     start
         greet
-        planet
\ No newline at end of file
+        planet
diff --git a/examples/custom_lexer.py b/examples/custom_lexer.py
index 965490f..786bf4f 100644
--- a/examples/custom_lexer.py
+++ b/examples/custom_lexer.py
@@ -49,7 +49,7 @@ def test():
     res = ParseToDict().transform(tree)
 
     print('-->')
-    print(res) # prints {'alice': [1, 27, 3], 'bob': [4], 'carrie': [], 'dan': [8, 6]} 
+    print(res) # prints {'alice': [1, 27, 3], 'bob': [4], 'carrie': [], 'dan': [8, 6]}
 
 
 if __name__ == '__main__':
diff --git a/examples/python2.lark b/examples/python2.lark
index b0d5e14..6fbae45 100644
--- a/examples/python2.lark
+++ b/examples/python2.lark
@@ -162,7 +162,7 @@ IMAG_NUMBER: (_INT | FLOAT) ("j"|"J")
 
 
 %ignore /[\t \f]+/ // WS
-%ignore /\\[\t \f]*\r?\n/ // LINE_CONT 
+%ignore /\\[\t \f]*\r?\n/ // LINE_CONT
 %ignore COMMENT
 %declare _INDENT _DEDENT
 
diff --git a/lark/load_grammar.py b/lark/load_grammar.py
index 3c11830..69b7b01 100644
--- a/lark/load_grammar.py
+++ b/lark/load_grammar.py
@@ -549,7 +549,7 @@ def import_from_grammar_into_namespace(grammar, namespace, aliases):
 
     imported_terms = dict(grammar.term_defs)
     imported_rules = {n:(n,deepcopy(t),o) for n,t,o in grammar.rule_defs}
-    
+
     term_defs = []
     rule_defs = []
 
diff --git a/lark/reconstruct.py b/lark/reconstruct.py
index a21f155..24a08d7 100644
--- a/lark/reconstruct.py
+++ b/lark/reconstruct.py
@@ -100,9 +100,9 @@ class Reconstructor:
         for origin, rule_aliases in aliases.items():
             for alias in rule_aliases:
                 yield Rule(origin, [Terminal(alias)], MakeMatchTree(origin.name, [NonTerminal(alias)]))
-            
+
             yield Rule(origin, [Terminal(origin.name)], MakeMatchTree(origin.name, [origin]))
-        
+
 
 
     def _match(self, term, token):
diff --git a/tests/grammars/test.lark b/tests/grammars/test.lark
index ab8d2a1..3c3cbcf 100644
--- a/tests/grammars/test.lark
+++ b/tests/grammars/test.lark
@@ -1,3 +1,3 @@
 %import common.NUMBER
 %import common.WORD
-%import common.WS
\ No newline at end of file
+%import common.WS
diff --git a/tests/test_parser.py b/tests/test_parser.py
index 2cb97fd..8bc4661 100644
--- a/tests/test_parser.py
+++ b/tests/test_parser.py
@@ -1275,8 +1275,8 @@ def _make_parser_test(LEXER, PARSER):
             self.assertEqual(p.parse("bb").children, [None, 'b', None, None, 'b', None])
             self.assertEqual(p.parse("abbc").children, ['a', 'b', None, None, 'b', 'c'])
             self.assertEqual(p.parse("babbcabcb").children,
-                [None, 'b', None, 
-                 'a', 'b', None, 
+                [None, 'b', None,
+                 'a', 'b', None,
                  None, 'b', 'c',
                  'a', 'b', 'c',
                  None, 'b', None])
diff --git a/tox.ini b/tox.ini
index 2c92629..b9b4794 100644
--- a/tox.ini
+++ b/tox.ini
@@ -21,4 +21,4 @@ recreate=True
 commands=
     git submodule sync -q
     git submodule update --init
-    python -m tests
\ No newline at end of file
+    python -m tests