Matthew Honnibal
|
72bbcc0871
|
Handle lemmatization for unknown string IDs
|
2017-09-24 05:01:31 -05:00 |
Matthew Honnibal
|
b78cc318c3
|
Fix loading of morphology exceptions
|
2017-06-04 16:34:32 -05:00 |
Matthew Honnibal
|
805495af27
|
Fix off-by-one in number of tags
|
2017-06-03 13:29:23 -05:00 |
Matthew Honnibal
|
11840ff5dd
|
Store tag map before normalizing props
|
2017-05-29 17:53:48 -05:00 |
Matthew Honnibal
|
fe11564b8e
|
Finish stringstore change. Also xfail vectors tests
|
2017-05-28 15:10:22 +02:00 |
Matthew Honnibal
|
84e66ca6d4
|
WIP on stringstore change. 27 failures
|
2017-05-28 14:06:40 +02:00 |
ines
|
d24589aa72
|
Clean up imports, unused code, whitespace, docstrings
|
2017-04-15 12:05:47 +02:00 |
ines
|
561f2a3eb4
|
Use consistent formatting for docstrings
|
2017-04-15 11:59:21 +02:00 |
Matthew Honnibal
|
c748907a66
|
Fix errors in previous commit
|
2017-03-25 22:25:01 +01:00 |
Matthew Honnibal
|
850d35dcb3
|
Make morphology use int attributes internally
The morphology class was calling the lemmatizer inconsistently,
with some string-valued attributes. This caused Issue #903.
|
2017-03-25 21:49:10 +01:00 |
Raphaël Bournhonesque
|
f332bf05be
|
Remove unused import statements
|
2017-03-21 21:08:54 +01:00 |
Roman Inflianskas
|
66e1109b53
|
Add support for Universal Dependencies v2.0
|
2017-03-03 13:17:34 +01:00 |
Matthew Honnibal
|
95a52005df
|
Revert "Fix Issue #683: Add 'SP' to tag_map, if it's not there already, within the Morphology class."
This reverts commit 40e71586d6.
|
2017-01-09 09:55:55 -06:00 |
Matthew Honnibal
|
40e71586d6
|
Fix Issue #683: Add 'SP' to tag_map, if it's not there already, within the Morphology class.
|
2016-12-18 23:44:05 +01:00 |
Matthew Honnibal
|
813249f826
|
Work on morphology class. Still not fully consistent with rest of library.
|
2016-12-18 17:35:22 +01:00 |
Matthew Honnibal
|
837a5d4100
|
Update morphology class so that exceptions can be added one-by-one, and so that arbitrary attributes can be referenced.
|
2016-12-18 16:49:46 +01:00 |
Matthew Honnibal
|
e6fc4afb04
|
Whitespace
|
2016-12-18 15:48:00 +01:00 |
Matthew Honnibal
|
57c4341453
|
Refactor loading of morphology exceptions, adding a method add_special_case.
|
2016-12-18 14:59:44 +01:00 |
Ines Montani
|
8350d65695
|
Change morphology and lemmatizer API
Take morphology features as object instead of keyword arguments
|
2016-12-07 21:12:49 +01:00 |
Matthew Honnibal
|
1fb09c3dc1
|
Fix morphology tagger
|
2016-11-04 19:19:09 +01:00 |
Matthew Honnibal
|
6e37ba1d82
|
Fix #602, #603 --- Broken build
|
2016-11-04 09:54:24 +01:00 |
Matthew Honnibal
|
293c79c09a
|
Fix #595: Lemmatization was incorrect for base forms, because morphological analyser wasn't adding morphology properly.
|
2016-11-04 00:29:07 +01:00 |
Matthew Honnibal
|
07776d8096
|
Fix pos name conflict in lemmatize
|
2016-09-27 17:35:58 +02:00 |
Matthew Honnibal
|
bb4f201ad2
|
Pass morphological features from tag map into the lemmatizer.
|
2016-09-27 14:01:43 +02:00 |
Matthew Honnibal
|
7abe653223
|
* Fix imports
|
2016-01-19 03:36:51 +01:00 |
Matthew Honnibal
|
590f38bdb2
|
* Add hacky solution to Issue #220. Currently specials.json only supports literal patterns, which doesn't allow us to pre-tag whitespace with the correct token, SP, as a rule. The data-driven approach should be easy but for some reason fails here. Adding a hard code in Morphology isn't a good solution, but we do want to fix the behaviour right away, and don't want to wait for an architecturally better solution.
|
2016-01-19 03:35:20 +01:00 |
Matthew Honnibal
|
9d1b2a103a
|
* Fix capitalization in lemmatizer
|
2015-11-06 05:44:35 +11:00 |
Matthew Honnibal
|
5b2af4864f
|
* When lemmatizing non-noun, non-verb, non-adj words, output lower-case
|
2015-11-06 00:45:09 +11:00 |
Matthew Honnibal
|
dde9e1357c
|
* Add todo to morphology.lemmatize
|
2015-11-03 18:54:35 +11:00 |
Matthew Honnibal
|
833eb35c57
|
* Fix tag assignment in doc.from_array
|
2015-11-03 18:45:54 +11:00 |
Matthew Honnibal
|
5ca57bd859
|
* Ensure Morphology can be pickled, to address Issue #125.
|
2015-10-13 13:44:41 +11:00 |
Matthew Honnibal
|
278e12f7e8
|
* Add morphology symbols to morphology. May need to remove these as an enum.
|
2015-10-13 13:44:40 +11:00 |
Matthew Honnibal
|
74c0853471
|
* Rename ATTR_IDS to attrs.IDS. Rename ATTR_NAMES to attrs.NAMES. Rename UNIV_POS_IDS to parts_of_speech.IDS
|
2015-10-13 13:44:39 +11:00 |
Matthew Honnibal
|
2d9e5bf566
|
* Allow punctuation to be lemmatized
|
2015-10-09 19:02:42 +11:00 |
Matthew Honnibal
|
b3a70e6375
|
* Clean up unnecessary try/except block
|
2015-10-08 14:34:11 +11:00 |
Matthew Honnibal
|
85c3fec1d1
|
* Fix morphology loading
|
2015-09-10 14:52:23 +02:00 |
Matthew Honnibal
|
31ccf494e6
|
Merge branch 'develop' of https://github.com/honnibal/spaCy into develop
|
2015-09-09 14:33:38 +02:00 |
Matthew Honnibal
|
0b527fbdc8
|
* Set POS tag in morphology
|
2015-09-09 14:30:24 +02:00 |
Matthew Honnibal
|
2be3620333
|
* Save morphological analyses in a cache
|
2015-09-08 15:39:24 +02:00 |
Matthew Honnibal
|
9eae9837c4
|
* Fix morphology look up
|
2015-09-06 17:53:39 +02:00 |
Matthew Honnibal
|
534e3dda3c
|
* More work on language independent parsing
|
2015-08-28 03:44:54 +02:00 |
Matthew Honnibal
|
c2307fa9ee
|
* More work on language-generic parsing
|
2015-08-28 02:02:33 +02:00 |
Matthew Honnibal
|
86c4a8e3e2
|
* Work on new morphology organization
|
2015-08-27 23:11:51 +02:00 |
Matthew Honnibal
|
0af139e183
|
* Tagger training now working. Still need to test load/save of model. Morphology still broken.
|
2015-08-27 09:16:11 +02:00 |
Matthew Honnibal
|
378729f81a
|
* Hack Morphology class towards usability
|
2015-08-26 19:17:21 +02:00 |
Matthew Honnibal
|
3f1944d688
|
* Make PyPy work
|
2015-01-05 17:54:38 +11:00 |
Matthew Honnibal
|
b00bc01d8c
|
* All tests now passing for reorg
|
2014-12-23 13:18:59 +11:00 |
Matthew Honnibal
|
73f200436f
|
* Tests passing except for morphology/lemmatization stuff
|
2014-12-23 11:40:32 +11:00 |
Matthew Honnibal
|
cf8d26c3d2
|
* POS tagger training working after reorg
|
2014-12-22 08:54:47 +11:00 |
Matthew Honnibal
|
4c4aa2c5c9
|
* Work on train
|
2014-12-22 07:25:43 +11:00 |