Commit Graph

3732 Commits

Author SHA1 Message Date
Ines Montani 8e977cc71c Fix formatting 2016-12-08 13:56:17 +01:00
Ines Montani 0176b99004 Fix formatting 2016-12-08 12:48:02 +01:00
Ines Montani 877f09218b Add more custom rules for abbreviations 2016-12-08 12:47:01 +01:00
Ines Montani bfaa42636c Update language data for German 2016-12-08 12:01:09 +01:00
Ines Montani ec44bee321 Fix capitalization on morphological features 2016-12-08 12:00:54 +01:00
Ines Montani ce979553df Resolve conflict 2016-12-07 21:16:52 +01:00
Ines Montani 8350d65695 Change morphology and lemmatizer API. Take morphology features as object instead of keyword arguments 2016-12-07 21:12:49 +01:00
Ines Montani 52e7d634df Remove trailing whitespace 2016-12-07 21:12:19 +01:00
Ines Montani 0d07d7fc80 Apply emoticon exceptions to tokenizer 2016-12-07 21:11:59 +01:00
Ines Montani 71f0f34cb3 Fix formatting 2016-12-07 21:11:29 +01:00
Ines Montani 9413bcd9ee Declare encoding and unicode literals 2016-12-07 21:10:34 +01:00
Ines Montani a280ff2657 Fix __all__ 2016-12-07 21:10:12 +01:00
Ines Montani ba8721953c Add missing emoticons 2016-12-07 21:09:44 +01:00
Ines Montani 1285c4ba93 Update English language data 2016-12-07 20:33:28 +01:00
Ines Montani 4a1e206064 Remove old lang_data directory 2016-12-07 20:33:28 +01:00
Ines Montani 79dce0aabe Add emoticons 2016-12-07 20:33:28 +01:00
Ines Montani a662a95294 Add line breaks 2016-12-07 20:33:28 +01:00
Ines Montani 07f0efb102 Add test for tokenizer regular expressions 2016-12-07 20:33:28 +01:00
Ines Montani e0712d1b32 Reformat language data 2016-12-07 20:33:28 +01:00
Ines Montani 5ad5408242 Update README.rst 2016-12-03 11:55:22 +01:00
Ines Montani a5707f4d05 Update README.rst 2016-12-03 11:53:38 +01:00
Matthew Honnibal 0c0f4c965d Increment version 2016-12-03 11:16:52 +01:00
Matthew Honnibal 73288497d5 Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-12-02 11:06:06 +01:00
Matthew Honnibal f6e356aada Add (and test) Span.sentiment attribute. By default we average token.span, but can override with custom hook. Re Issue #667 2016-12-02 11:05:50 +01:00
Ines Montani 4b889855cd Merge pull request #666 from blarghmatey/patch-1. Fixed minor typo 2016-12-01 12:10:30 +01:00
Tobias Macey 1d768d6510 Fixed minor typo. The word `motto` was missing the second `t`. 2016-12-01 06:08:33 -05:00
Matthew Honnibal 296d33a4fc Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-11-26 12:36:18 +01:00
Matthew Honnibal 1f6c37c6f5 Fix create_tokenizer when nlp is None 2016-11-26 12:36:04 +01:00
Ines Montani 38f5ad4bfb Merge pull request #660 from jsmootiv/patch-2. Minor typos Fix 2016-11-26 12:23:48 +01:00
Jimi Smoot 8373115cbd Minor typos 2016-11-25 18:22:52 -08:00
Matthew Honnibal c7889492f9 Fix model saving error for Python 3 2016-11-25 18:04:30 -06:00
Matthew Honnibal 22189e60db Use unicode literals in train_ud 2016-11-25 17:45:45 -06:00
Matthew Honnibal bc0a202c9c Fix unicode problem in nonproj module 2016-11-25 17:29:17 -06:00
Matthew Honnibal da5f0cce36 Fix train_ud script, which trains models from the Universal Dependencies format. 2016-11-25 11:19:33 -06:00
Matthew Honnibal 6dd3b94fa6 Filter out deprecated attributes when reading special-case tokenization rules. 2016-11-25 09:57:18 -06:00
Matthew Honnibal e879c79b8c Merge branch 'master' of https://github.com/explosion/spaCy 2016-11-25 09:18:28 -06:00
Matthew Honnibal a335c6dcc2 Exclude morphs from deprecated token attributes for now 2016-11-25 16:17:32 +01:00
Matthew Honnibal f799a07f25 Merge branch 'master' of https://github.com/explosion/spaCy 2016-11-25 09:16:43 -06:00
Matthew Honnibal 159e8c46e1 Merge old training fixes with newer state 2016-11-25 09:16:36 -06:00
Matthew Honnibal 6c1b2c0c2e Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-11-25 16:15:08 +01:00
Matthew Honnibal 846e80f2f4 Exclude morphs from deprecated token attributes for now 2016-11-25 16:14:54 +01:00
Matthew Honnibal 664f2dd1c0 Allow dep to be None in scorer, for missing labels. 2016-11-25 09:02:49 -06:00
Matthew Honnibal 39341598bb Fix NER label calculation 2016-11-25 09:02:22 -06:00
Matthew Honnibal ca773a1f53 Tweak arc_eager n_gold to deal with negative costs, and improve error message. 2016-11-25 09:01:52 -06:00
Matthew Honnibal a2f55e7015 Pass cfg through loading, for training. 2016-11-25 09:01:20 -06:00
Matthew Honnibal 608d8f5421 Pass cfg through parser, and have is_valid default to 1, not 0 when resetting state 2016-11-25 09:00:21 -06:00
Matthew Honnibal cc7e607a8a Fix gold.pyx for 1.0 2016-11-25 08:57:59 -06:00
Matthew Honnibal 314bc8d34f Fix train script for 1.0 2016-11-25 08:57:37 -06:00
root 080d29e092 Fix train.py for 1.0 2016-11-25 08:55:33 -06:00
Ines Montani ada007cb73 Fix formatting for consistency 2016-11-25 15:53:40 +01:00