Ines Montani
de11ea753a
Merge branch 'master' into develop
2020-02-18 14:47:23 +01:00
adrianeboyd
5ee9d8c9b8
Add MORPH attr, add support in retokenizer (#4947)
* Add MORPH attr / symbol for token attrs
* Update retokenizer for MORPH
2020-01-29 17:45:46 +01:00
Sofie Van Landeghem
a1b22e90cd
serialize ENT_ID (#4852)
* expand serialization test for custom token attribute
* add failing test for issue 4849
* define ENT_ID as attr and use in doc serialization
* fix few typos
2020-01-06 14:57:34 +01:00
Matthew Honnibal
ef666656b3
Fix attrs alignment
2019-07-12 17:59:47 +02:00
svlandeg
8608685543
ensure Span.as_doc keeps the entity links + unit test
2019-06-25 15:28:51 +02:00
Matthew Honnibal
c0caf7cf27
Fix LANG symbol
2018-02-17 18:10:50 +01:00
Matthew Honnibal
0bf2f6be29
Add missing symbol for LANG attr. Fixes inconsistent numeric ID
2018-02-17 17:37:02 +01:00
4altinok
3deef1497a
removed 18 and replaced 18 with is_currency
2018-02-11 18:51:09 +01:00
Matthew Honnibal
16122f566e
Fix cpdef enum in attrs.pyx
2017-09-17 12:28:53 -05:00
Matthew Honnibal
d68dd1f251
Add SENT_START attribute, for custom sentence boundary detection
2017-05-23 18:37:58 +02:00
Matthew Honnibal
1b31c05bf8
Whitespace
2016-12-18 16:51:40 +01:00
Wolfgang Seeker
03fb498dbe
introduce lang field for LexemeC to hold language id
put noun_chunk logic into iterators.py for each language separately
2016-03-10 13:01:34 +01:00
Matthew Honnibal
c4017a06d9
* Add placeholders for the new flags in attrs and symbols
2016-02-04 15:49:45 +01:00
Matthew Honnibal
064bd69ad0
* Refactor symbols, so that frequency rank can be derived from the orth id of a word.
2015-10-10 16:03:48 +11:00
Matthew Honnibal
c2d8edd0bd
* Add PROB attribute in attrs.pxd
2015-08-26 19:14:19 +02:00
Matthew Honnibal
9c667b7f15
* Set a value in attrs.pxd on the first flag, to reduce bugs
2015-08-06 16:08:04 +02:00
Matthew Honnibal
8e4c69ee8c
* Add is_oov property, and fix up handling of attributes
2015-07-27 01:50:06 +02:00
Matthew Honnibal
6bb96c122d
* Host IS_ flags in attrs.pxd, and add properties for them on Token and Lexeme objects
2015-07-26 16:37:16 +02:00
Matthew Honnibal
efa80096f1
* Upd attrs id list
2015-07-16 01:26:54 +02:00
Jordan Suchow
3a8d9b37a6
Remove trailing whitespace
2015-04-19 13:01:38 -07:00
Matthew Honnibal
6640386b25
* Fix Issue #43: TAG attr not supported. Also add DEP attr, while I'm at it. Need better way of ensuring future changes don't break in similar way.
2015-04-07 06:00:57 +02:00
Matthew Honnibal
d4c99f7dec
* Add attrs.pxd
2015-01-26 22:22:09 +11:00