mirror of https://github.com/explosion/spaCy.git
Prefer _SP over SP for default tag map space attrs
If `_SP` is already in the tag map, use the mapping from `_SP` instead of `SP` so that `SP` can be a valid non-space tag. (Chinese has a non-space tag `SP` which was overriding the mapping of `_SP` to `SPACE`.)
parent 1eed101be9
commit b6b5908f5e
@@ -152,7 +152,10 @@ cdef class Morphology:
         self.tags = PreshMap()
         # Add special space symbol. We prefix with underscore, to make sure it
         # always sorts to the end.
-        space_attrs = tag_map.get('SP', {POS: SPACE})
+        if '_SP' in tag_map:
+            space_attrs = tag_map.get('_SP')
+        else:
+            space_attrs = tag_map.get('SP', {POS: SPACE})
         if '_SP' not in tag_map:
             self.strings.add('_SP')
             tag_map = dict(tag_map)
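The effect of the change can be sketched outside of Cython. The snippet below is a minimal illustration, not the library's code: `POS`, `SPACE`, and `PART` are plain-string stand-ins for spaCy's symbols, the function names are hypothetical, and the `PART` value assigned to the Chinese-style `SP` tag is an assumption made only for the example.

```python
# Plain-string stand-ins for spaCy's symbols (assumption for illustration).
POS = "POS"
SPACE = "SPACE"
PART = "PART"


def space_attrs_before(tag_map):
    """Old lookup: always read 'SP', falling back to a SPACE default."""
    return tag_map.get("SP", {POS: SPACE})


def space_attrs_after(tag_map):
    """Patched lookup: an explicit '_SP' entry takes priority over 'SP'."""
    if "_SP" in tag_map:
        return tag_map.get("_SP")
    return tag_map.get("SP", {POS: SPACE})


# Hypothetical tag map in which 'SP' is a real, non-space tag.
zh_like_tag_map = {"SP": {POS: PART}, "_SP": {POS: SPACE}}

# A tag map with neither entry still gets the SPACE default.
empty_tag_map = {}

old_result = space_attrs_before(zh_like_tag_map)  # picks up the non-space 'SP'
new_result = space_attrs_after(zh_like_tag_map)   # picks up '_SP' instead
default_result = space_attrs_after(empty_tag_map)
```

With the old lookup, the non-space `SP` entry leaks into the space attributes; with the patched lookup, `_SP` wins, and tag maps that define neither key behave as before.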