mirror of https://github.com/lark-parser/lark.git
Update README.md
parent e22536fc9b
commit c319ace48d
21 README.md
@@ -176,6 +176,27 @@ You can use the output as a regular python module:
```
0.38981434460254655
```

### Using Unicode character classes with `regex`

Python's builtin `re` module has a few persistent known bugs and does not support
some advanced regex features, such as Unicode character classes (for example `\p{Lu}`).
With `pip install lark-parser[regex]`, the `regex` module is installed alongside `lark`
and can act as a drop-in replacement for `re`.

Any instance of `Lark` instantiated with `regex=True` will use the `regex` module
instead of `re`. For example, we can now use Unicode character classes to match PEP 3131 compliant Python identifiers.

```python
>>> from lark import Lark
>>> g = Lark(r"""
                    ?start: NAME
                    NAME: ID_START ID_CONTINUE*
                    ID_START: /[\p{Lu}\p{Ll}\p{Lt}\p{Lm}\p{Lo}\p{Nl}_]+/
                    ID_CONTINUE: ID_START | /[\p{Mn}\p{Mc}\p{Nd}\p{Pc}·]+/
                """, regex=True)

>>> g.parse('வணக்கம்')
'வணக்கம்'
```
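
As a rough illustration of why the `regex` module is needed here, the sketch below uses only the standard library to check whether a pattern compiles under `re`. The helper name `compiles_with_re` is hypothetical and not part of Lark, and the sketch assumes Python 3.7 or later, where unknown escapes such as `\p` raise `re.error`.

```python
import re


def compiles_with_re(pattern: str) -> bool:
    """Return True if the builtin `re` module accepts the pattern.

    Hypothetical helper for illustration only; not part of Lark.
    """
    try:
        re.compile(pattern)
    except re.error:
        # `re` rejects escapes it does not know, including \p{...}.
        return False
    return True


# A plain ASCII character class is accepted by `re`.
ascii_ok = compiles_with_re(r"[A-Za-z_]+")

# Unicode property classes, as used in ID_START, are rejected by `re`.
unicode_ok = compiles_with_re(r"[\p{Lu}\p{Ll}_]+")

print(ascii_ok, unicode_ok)
```

With the `regex` module installed, the second pattern should compile via `regex.compile`, which is why a grammar using `\p{...}` classes needs `regex=True`.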

## License