2030 Commits

Author SHA1 Message Date
Matthew Honnibal 8b8d048385 Merge pull request #135 from henningpeters/patch-1: remove compile warning noise 2015-10-10 01:40:15 +11:00
Matthew Honnibal d31c911f83 Merge pull request #136 from henningpeters/patch-2: cleanup 2015-10-10 01:40:00 +11:00
Henning Peters 876fc99c44 cleanup: looks like this file was accidentally added 2015-10-09 16:11:56 +02:00
Matthew Honnibal a3dfe2b901 * Increment data version 2015-10-09 13:26:17 +02:00
Matthew Honnibal af8d0a2a09 * Increment version 2015-10-09 12:42:41 +02:00
Matthew Honnibal 3bf50ab830 * Ensure the fabfile prebuild command installs pytest 2015-10-09 20:57:47 +11:00
Matthew Honnibal 599f739ddb * Fix smart quote lemma test 2015-10-09 20:51:28 +11:00
Matthew Honnibal 5682439d1e * Remove em dash test from test_lemmatizer, as em dashes are now handled in specials.json 2015-10-09 20:24:21 +11:00
Matthew Honnibal f35632e2e5 * Remove SBD print statement in train, after SBD evaluation was removed from Scorer 2015-10-09 11:08:58 +02:00
Matthew Honnibal 1f90502ce8 * Fix website/test_home for Python 3 2015-10-09 11:08:31 +02:00
Matthew Honnibal caff4638c9 * Fix website/test_api.py for Python 3 2015-10-09 11:08:12 +02:00
Matthew Honnibal a510858f5a * Pretty-print specials.json, and add the em dash 2015-10-09 11:07:45 +02:00
Matthew Honnibal 49600a44a8 * Fix trailing comma in lemma_rules.json 2015-10-09 11:06:57 +02:00
Matthew Honnibal 0e92e8574a * Fix pos tag in em-dash in specials 2015-10-09 11:06:37 +02:00
Matthew Honnibal d341443282 * Remove em-dash from lemma rules. Handle instead in specials. 2015-10-09 10:27:13 +02:00
Matthew Honnibal b6047afe4c * Fix punctuation lemma rules, to resolve Issue #130 2015-10-09 10:25:37 +02:00
Matthew Honnibal 393a13d1af * Add unicode em dash to specials.json, so that we can control what POS tag it gets. This way we can prevent sentence boundary detection errors, to address Issue #130. 2015-10-09 19:24:33 +11:00
Matthew Honnibal 1490feda29 * Make generate_specials pretty-print the specials.json file 2015-10-09 19:23:47 +11:00
Matthew Honnibal 1842a53e73 * Lemmatize smart quotes as plain quotes 2015-10-09 19:09:36 +11:00
Matthew Honnibal 2d9e5bf566 * Allow punctuation to be lemmatized 2015-10-09 19:02:42 +11:00
Matthew Honnibal 5332c0b697 * Add support for punctuation lemmatization, to handle unicode characters. This should help in addressing Issue #130 2015-10-09 18:54:40 +11:00
Matthew Honnibal b71ba2eed5 * Add tests for unicode puncuation character lemmatization 2015-10-09 18:43:14 +11:00
Henning Peters 0e13f18ea4 remove compile warning noise 2015-10-09 07:23:39 +02:00
Matthew Honnibal c5b2c4ead8 * Don't build old license page 2015-10-09 14:58:45 +11:00
Matthew Honnibal 4bae38128d * Remove license page from website in repo 2015-10-09 14:58:34 +11:00
Matthew Honnibal 00c1992503 * Mark tests that require models 2015-10-09 14:48:14 +11:00
Matthew Honnibal dea40cfec3 * Mark tests that require models 2015-10-09 14:37:48 +11:00
Matthew Honnibal 5031440c35 * Mark tests that require models 2015-10-09 14:29:28 +11:00
Matthew Honnibal 76936a3456 * Mark tests that require models 2015-10-09 14:19:07 +11:00
Matthew Honnibal 7b340912d4 * Mark tests that require models 2015-10-09 14:09:26 +11:00
Matthew Honnibal 20b8c3e281 * Mark tests that require models 2015-10-09 13:58:01 +11:00
Matthew Honnibal b125289f30 * Fix type declaration in asciied function 2015-10-09 13:46:57 +11:00
Matthew Honnibal 9ff288c7bb * Update tests, after removal of spacy.en.attrs 2015-10-09 13:37:25 +11:00
Matthew Honnibal c64fd472b0 * Fix travis.yml 2015-10-09 12:58:08 +11:00
Matthew Honnibal f2374ecfb6 Merge branch 'master' of ssh://github.com/honnibal/spaCy 2015-10-09 12:48:34 +11:00
Matthew Honnibal 5af4b62fe7 * Filter out phrases that consist of common, lower-case words. 2015-10-09 12:47:43 +11:00
Matthew Honnibal 4bbc8f45c6 * Fix multi word matcher 2015-10-09 02:02:37 +11:00
Matthew Honnibal 801d55a6d9 * Fix phrase matcher 2015-10-09 02:00:45 +11:00
Matthew Honnibal 7b23442543 Merge pull request #133 from pquentin/patch-2: Fix typo 2015-10-08 21:47:04 +11:00
Quentin Pradet 1a71706c05 Fix typo 2015-10-08 14:22:23 +04:00
Matthew Honnibal b3a70e6375 * Clean up unnecessary try/except block 2015-10-08 14:34:11 +11:00
Matthew Honnibal 4513bed175 * Avoid compiling unused files 2015-10-08 14:00:34 +11:00
Matthew Honnibal e3e8994368 * Patch italian tag map 2015-10-08 14:00:13 +11:00
Matthew Honnibal 2d68f75b6a * Fix identity tag map 2015-10-08 13:59:56 +11:00
Matthew Honnibal 5890682ed1 * Fix multi_word_matches script 2015-10-08 13:59:32 +11:00
Matthew Honnibal a83253b455 Merge pull request #129 from chrisdubois/patch-1: Fix size of allocation when creating a pattern 2015-10-08 12:04:41 +11:00
Matthew Honnibal 6ea1601e93 * Add script to train models off the UD treebanks. Note that the UD data is restricted to research purposes only, and should only be used to train models for academic experiments. 2015-10-08 12:01:08 +11:00
Chris DuBois e095faa785 Add contributor. 2015-10-07 17:55:46 -07:00
chrisdubois cc47b8ad6a Fix size of allocation when creating a pattern. Each pattern object currently contains two AttrValues rather than just one. 2015-10-07 10:32:55 -07:00
Matthew Honnibal b228a8f4a6 * Remove spacy/en/attrs 2015-10-06 16:20:46 +11:00