4942 Commits

Author SHA1 Message Date
Matthew Honnibal ab846256cf Merge pull request #966 from recognai/master
Prepare Spanish language for training models, including configuration, rich-UD tag map and tests
2017-04-07 16:12:29 +02:00
Matthew Honnibal 83dca920d4 Rename test #913 -> #957, comment
Make test for #957 reference correct bug. Add comment.

Previous commit closes #957.
2017-04-07 15:54:25 +02:00
Matthew Honnibal be204ed714 Merge branch 'master' of https://github.com/explosion/spaCy 2017-04-07 15:50:14 +02:00
Matthew Honnibal e7b1ee9efd Switch to regex module for URL identification
The URL detection regex was failing on input such as 0.1.2.3, as this
input triggered excessive back-tracking in the builtin re module.
The solution was to switch to the regex module, which behaves better.

Closes #913.
2017-04-07 15:47:36 +02:00
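The switch described in commit e7b1ee9efd can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the pattern and the `like_url` helper are hypothetical and deliberately simple, and the fallback to the builtin `re` module is an assumption added so the snippet runs even where the third-party `regex` package is not installed.

```python
try:
    # Third-party drop-in replacement for the builtin module, if installed.
    import regex as re_engine
except ImportError:
    # Fallback (an assumption for this sketch): the standard-library module.
    import re as re_engine

# Hypothetical pattern for illustration only; it is NOT the URL regex
# referenced in the commit. It avoids nested quantifiers, so it does not
# backtrack heavily on dotted numeric input such as "0.1.2.3".
_URL_PATTERN = re_engine.compile(
    r"^(?:https?://|www\.)[^\s/$.?#]\S*$", re_engine.IGNORECASE
)


def like_url(text):
    """Return True if `text` looks like a URL under the toy pattern above."""
    return _URL_PATTERN.match(text) is not None
```

Because `regex` mirrors the `re` API for calls such as `compile` and `match`, swapping the import is usually enough for code of this shape; whether it also cures a given backtracking problem depends on the pattern.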
Matthew Honnibal 5887383fc0 Add test for Issue #913: Hang from bad regex 2017-04-07 15:47:27 +02:00
Matthew Honnibal a001365c42 Require regex library 2017-04-07 15:43:34 +02:00
Matthew Honnibal a5538d93d0 Merge pull request #955 from kumaranvpl/fix_keras_parikh_entailment_bugs
Fix keras_parikh_entailment example bugs
2017-04-07 14:59:57 +02:00
Ines Montani 2a60597089 Update CONTRIBUTORS.md 2017-04-07 13:34:05 +02:00
ines 7ea1673072 Fix whitespace 2017-04-07 13:28:48 +02:00
ines 2f38c1d77f Add documentation for new convert and model commands 2017-04-07 13:27:55 +02:00
ines 255650dbc2 Add connlu2json converter from explosion/spacy-dev-resources/#11 2017-04-07 13:05:12 +02:00
ines 789ce8a45e Add convert command 2017-04-07 13:04:17 +02:00
ines 9952d3b08a Fix whitespace 2017-04-07 13:02:05 +02:00
ines 47ddce6eb7 Remove unused variable 2017-04-07 13:01:48 +02:00
ines 7dd134718a Merge branch 'master' into develop 2017-04-07 12:00:26 +02:00
ines dcf8ab0c47 Merge branch 'develop' 2017-04-07 12:00:09 +02:00
oeg b10bc1a177 Adds contributor agreement dvsrepo 2017-04-07 11:58:28 +02:00
ines f33c4cbae1 Add --no-cache-dir error to troubleshooting docs (see #958) 2017-04-07 10:22:18 +02:00
ines d6bbc3ffcd Fix formatting 2017-04-07 10:22:18 +02:00
ines 75f9b4c6e2 Fix whitespace 2017-04-07 10:22:18 +02:00
oeg c693d40791 feature(model): Add support for creating the Spanish model, including rich tagset, configuration, and basic tests 2017-04-06 18:48:45 +02:00
Matthew Honnibal 5e621b9862 Merge pull request #960 from recognai/master
Fixes typo in method calling Pseudoprojectivity method in create_pipeline method of BaseDefaults class
2017-04-06 17:57:27 +02:00
oeg 010293fb2f fix(typo): Fixes typo in method calling PseudoProjectivity.deprojectivize, failing with new train cli 2017-04-06 17:33:15 +02:00
Kumaran Rajendhiran 3f55d6afae Update README 2017-04-05 16:59:52 +05:30
Kumaran Rajendhiran 47d7137c83 Set max_length to 100 for demo and evaluate 2017-04-05 16:48:35 +05:30
Kumaran Rajendhiran 10e8dcdfdb Remove not needed parameters from function 2017-04-05 16:20:47 +05:30
ines 808cd6cf7f Add missing tags to verbs (resolves #948) 2017-04-03 18:12:52 +02:00
ines 2c36a61ec5 Add spacyr to libraries 2017-04-03 18:12:38 +02:00
Ines Montani 2de2195be8 Update CONTRIBUTORS.md 2017-04-01 10:39:42 +02:00
ines ad8bf1829f Import and combine Portuguese tokenizer exceptions (see #943) 2017-04-01 10:37:42 +02:00
Ines Montani f8b2d9c3b7 Merge pull request #943 from mamoit/master
Portuguese improvements
2017-04-01 10:32:00 +02:00
ines 3b667a24d4 Remove whitespace 2017-04-01 10:21:08 +02:00
ines e71a1f4bd0 Fix download commands in error messages (see #946) 2017-04-01 10:20:57 +02:00
ines 42382d5692 Fix download commands in error messages (see #946) 2017-04-01 10:19:32 +02:00
ines d4a59c254b Remove whitespace 2017-04-01 10:19:01 +02:00
Matthew Honnibal 51882ee2b8 Fix check for setting ent_id in merge 2017-03-31 19:32:01 +02:00
Miguel Almeida 4fde64c4ea Portuguese contractions and some abbreviations 2017-03-31 15:52:55 +01:00
Miguel Almeida 465b240bcb Review Portuguese stop words
Mainly to review typos and add missing masculines/feminines
2017-03-31 13:00:47 +01:00
Matthew Honnibal fc3900e5b2 Allow ent_id to be set in Token 2017-03-31 14:00:14 +02:00
Matthew Honnibal 9720103428 Improve attribute handling in doc.merge(). Still unsatisfying 2017-03-31 13:59:58 +02:00
Matthew Honnibal cfff4e0f61 Improve test 2017-03-31 13:59:32 +02:00
Matthew Honnibal 1bb7b4ca71 Add comment 2017-03-31 13:59:19 +02:00
Matthew Honnibal 725249c59a Add merge_phrase callback in matcher.pyx 2017-03-31 13:58:59 +02:00
Matthew Honnibal e854f28304 Add test for Issue #758
Issue #758 occurs when no actions are available for a single token
doc after merging.
2017-03-31 13:26:25 +02:00
Miguel Almeida c1d020b0a6 Remove "ista" from Portuguese stop words 2017-03-31 12:26:13 +01:00
Miguel Almeida 17a1e7a119 Add Portuguese numbers and ordinals 2017-03-31 12:21:01 +01:00
Matthew Honnibal 47a3ef06a6 Unhack deprojectivization, moving it into pipeline
Previously the deprojectivize() call was attached to the transition
system, and only called for German. Instead it should be a separate
process, called after the parser. This makes it available for any
language. Closes #898.
2017-03-31 12:31:50 +02:00
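The design change in commit 47a3ef06a6 (deprojectivization as its own step that runs after the parser, rather than being tied to one language's transition system) can be sketched in miniature. Everything below is hypothetical: `make_pipeline`, `toy_parser`, `toy_deprojectivize`, the dict-based document, and the `||` label decoration are stand-ins chosen for illustration, not the library's API. A real deprojectivization step would also reattach heads; this toy only strips the label decoration.

```python
def make_pipeline(*steps):
    """Build a callable that applies each step to the document in order."""
    def run(doc):
        for step in steps:
            step(doc)  # each step mutates the document in place
        return doc
    return run


def toy_parser(doc):
    # Hypothetical parser output: one dependency label is "decorated" with
    # extra information, here joined by an assumed "||" delimiter.
    doc["deps"] = ["nsubj||dobj", "ROOT"]


def toy_deprojectivize(doc):
    # Language-independent post-processing step: strip the decoration.
    doc["deps"] = [label.split("||")[0] for label in doc["deps"]]


# The post-processing step is simply the next pipeline entry after the parser,
# so any language's pipeline can include it.
pipeline = make_pipeline(toy_parser, toy_deprojectivize)
doc = pipeline({"words": ["She", "left"]})
```

Keeping the step outside the parser means it can be added to or omitted from a pipeline without touching the transition system.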
Ines Montani 8eafe80450 Update CONTRIBUTORS.md 2017-03-31 09:12:31 +02:00
Ines Montani 045a8e994d Merge pull request #942 from jreeter/master (resolves #934)
Issue #934 symlink should not convert paths as_posix under Windows.
2017-03-31 09:04:55 +02:00
Joshua Reeter 564daf6dec Issue #934 symlink should not convert paths as_posix under Windows. 2017-03-30 23:47:45 -05:00