Bùi Trung Chí
9af46b4f1b
Fix loading tokenizer with custom prefix search ( #2495 )
...
* Add contributor agreement
* Fix loading tokenizer with custom prefix search
2018-07-04 12:56:07 +02:00
Muhammad Irfan
f33c703066
Add Urdu Language Support ( #2430 )
...
* added Urdu language support.
* added Urdu language tests.
* modified conftest.py for Urdu language support.
* added spacy contributor agreement.
2018-06-22 11:14:03 +02:00
himkt
14d9007efd
fix wrong indexing ( #2416 )
...
* fix wrong indexing
* add agreement
2018-06-19 10:20:57 +02:00
Aliia E
428bae66b5
Add Tatar Language Support ( #2444 )
...
* add Tatar lang support
* add Tatar letters
* add Tatar tests
* sign contributor agreement
* sign contributor agreement [x]
* remove comments from Language class
* remove all template comments
2018-06-19 10:17:53 +02:00
Cory Hurst
446f5ec41b
Silent keyword in info function in init ( #2459 )
...
* Pass through "silent" kwarg to the wrapper in the spacy module init.
reference issue #2196
* contributor agreement
2018-06-18 12:24:21 +02:00
Daniel Ruf
d6d688914f
chore: cache dependencies ( #2418 )
...
* chore: cache dependencies
* chore: add CLA
2018-06-11 00:22:41 +02:00
himkt
1a568f2e08
fix wrong documentations ( #2423 )
2018-06-11 00:21:06 +02:00
Bohdan Moskalevskyi
d66292f767
fix UD data file extensions ( #2425 )
...
* fix UD data files extension
* add contributor agreement for msklvsk
2018-06-08 14:26:11 +02:00
Nour Shalabi
a169b79092
Additions to Arabic stop words. ( #2422 )
...
* Additions to Arabic stop words.
* Create nourshalabi.md
2018-06-08 02:33:23 +02:00
Maciej
c7d53348d7
Fix bug in CLI iob and ner converter ( #2392 ) ( fixes #2385 )
...
* issue_2385 add tests for iob_to_biluo converter function
* issue_2385 fix and modify iob_to_biluo function to accept either iob or biluo tags in cli.converter
* issue_2385 add test to fix b char bug
* add contributor agreement
* fill contributor agreement
2018-05-30 12:28:44 +02:00
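The behaviour described in the entry above (a converter that accepts either IOB or BILUO tags) can be sketched roughly as follows. This is an illustrative reimplementation written for this log, not the code from the commit; the function body and the error message are assumptions.

```python
def iob_to_biluo(tags):
    """Convert IOB tags to BILUO. Tags that are already BILUO pass through."""
    out = []
    for i, tag in enumerate(tags):
        # "O" and the BILUO-only prefixes need no conversion.
        if tag == "O" or tag[:2] in ("L-", "U-"):
            out.append(tag)
            continue
        prefix, label = tag[:2], tag[2:]
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues if the next tag is inside or ends the same label.
        continues = nxt in ("I-" + label, "L-" + label)
        if prefix == "B-":
            out.append(("B-" if continues else "U-") + label)
        elif prefix == "I-":
            out.append(("I-" if continues else "L-") + label)
        else:
            raise ValueError("Unexpected tag: {!r}".format(tag))
    return out


iob = ["B-PER", "I-PER", "O", "B-LOC"]
biluo = iob_to_biluo(iob)
print(biluo)
```

Running the function on its own output leaves the tags unchanged, which is the property the fix is after.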
ansgar-t
9732988951
escape html in displacy.render ( #2378 ) ( closes #2361 )
...
## Description
Fix for issue #2361 :
replace &, <, >, " with &amp;, &lt;, &gt;, &quot; before rendering the SVG
## Checklist
- [x] I have submitted the spaCy Contributor Agreement.
- [ ] I ran the tests, and all new and existing tests passed.
(As discussed in the comments to #2361 )
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
2018-05-28 18:36:41 +02:00
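The replacement described in the entry above can be sketched in a few lines of Python. The helper name `escape_html` is chosen here for illustration and is not taken from the commit.

```python
def escape_html(text):
    """Replace the four characters named in the fix with HTML entities."""
    # The ampersand goes first so the entities added below are not escaped again.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace('"', "&quot;")
    return text


print(escape_html('<a href="x">&</a>'))
```

The order of the replacements matters: escaping `&` last would turn `&lt;` into `&amp;lt;`.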
Samuel Pouyt
d85494bfae
Added agreement ( #2374 )
2018-05-26 18:19:08 +02:00
James Messinger
4515e96e90
Better formatting for `spacy train` CLI ( #2357 )
...
* Better formatting for `spacy train` CLI
Changed to use fixed-spaces rather than tabs to align table headers and data.
### Before:
```
Itn. P.Loss N.Loss UAS NER P. NER R. NER F. Tag % Token %
0 4618.857 2910.004 76.172 79.645 67.987 88.732 88.261 100.000 4436.9 6376.4
1 4671.972 3764.812 74.481 78.046 62.374 82.680 88.377 100.000 4672.2 6227.1
2 4742.756 3673.473 71.994 77.380 63.966 84.494 90.620 100.000 4298.0 5983.9
```
### After:
```
Itn.  Dep Loss  NER Loss  UAS     NER P.  NER R.  NER F.  Tag %   Token %  CPU WPS  GPU WPS
0     4618.857  2910.004  76.172  79.645  67.987  88.732  88.261  100.000  4436.9   6376.4
1     4671.972  3764.812  74.481  78.046  62.374  82.680  88.377  100.000  4672.2   6227.1
2     4742.756  3673.473  71.994  77.380  63.966  84.494  90.620  100.000  4298.0   5983.9
```
* Added contributor file
2018-05-25 13:08:45 +02:00
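One way to align headers and data with spaces rather than tabs, as the entry above describes, is to pad every cell to a fixed width. The sketch below is not the code from the commit; the column widths and the `format_row` helper are assumptions made for illustration, and only the first four columns are shown.

```python
HEADERS = ["Itn.", "Dep Loss", "NER Loss", "UAS"]
WIDTHS = [4, 8, 8, 6]  # assumed widths, wide enough for the sample values


def format_row(values, widths=WIDTHS):
    """Left-pad-free alignment: pad each cell with spaces to its column width."""
    return "  ".join("{:<{w}}".format(v, w=w) for v, w in zip(values, widths))


header_line = format_row(HEADERS)
data_line = format_row(["0", "4618.857", "2910.004", "76.172"])
print(header_line)
print(data_line)
```

Because each cell is padded to the same width in every row, a column starts at the same character offset regardless of the terminal's tab settings.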
Aristo Rinjuang
432ede04af
adding more words and rephrasing ( #2351 )
...
* adding more words and rephrasing
* adding a contributor
* tokenizer bugs solved
2018-05-24 11:40:57 +02:00
Shantam Raj
1a4682dd0b
Update _training.jade ( #2340 )
...
* Update _training.jade
Correcting grammar. Replacing "The" with "To".
* Create armsp.md
* Update armsp.md
2018-05-21 11:09:33 +02:00
Tahar Zanouda
00417794d3
Add Arabic language ( #2314 )
...
* added support for Arabic lang
* added Arabic language support
* updated conftest
2018-05-15 00:27:19 +02:00
vishnumenon
ae3719ece5
Fix the code for FACILITY entities ( #2324 )
...
* Fix the code for FACILITY entities
As far as I can tell, the default models all use "FAC" rather than "FACILITY"
* Added my Contributor Agreement
* Rename vishnumenon to vishnumenon.md
2018-05-12 15:19:17 +02:00
Jani Monoses
42b34832e4
Update Romanian stopword list ( #2316 )
...
* Contributor agreement for janimo
* Update Romanian stopword list
Include the correct spellings of all the words already in the repo
that are using cedillas (ş and ţ) instead of commas (ș and ț).
Add another unrelated spelling fix.
See https://github.com/stopwords-iso/stopwords-ro/pull/1 and
https://github.com/stopwords-iso/stopwords-ro/pull/2
2018-05-10 12:16:56 +02:00
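The cedilla and comma-below letters mentioned in the entry above are distinct Unicode code points, so a stop word list can be normalised with a simple character mapping. This is a minimal sketch; the mapping table and the function name are illustrative and not part of the commit.

```python
# Cedilla forms mapped to the comma-below forms used in correct Romanian spelling.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015e": "\u0218",  # S with cedilla -> S with comma below
    "\u015f": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021a",  # T with cedilla -> T with comma below
    "\u0163": "\u021b",  # t with cedilla -> t with comma below
})


def fix_romanian_diacritics(word):
    """Return the word with cedilla letters replaced by comma-below letters."""
    return word.translate(CEDILLA_TO_COMMA)


print(fix_romanian_diacritics("\u015fi"))
```

A stop word list that should match text in either spelling would keep both the original and the converted form.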
Lucas Abbade
18af53014f
Adding my contributor agreement ( #2315 )
...
* Create LRAbbade.md
* Update LRAbbade.md
2018-05-09 21:25:05 +02:00
mauryaland
5368ba028a
Update stop_words.py for French language ( #2310 )
...
* Add contraction forms of some common stopwords
All the stopwords added contain the apostrophe " ' " or " ’ ".
* Adds contributor agreement mauryaland
* Update mauryaland.md
2018-05-09 12:04:38 +02:00
ines
37facf9b4d
Add config for no-response [ci skip]
2018-05-07 22:04:54 +02:00
ines
a685fff875
Merge branch 'master' of https://github.com/explosion/spaCy
2018-05-07 18:58:57 +02:00
ines
e2241c797c
Add lock-threads configuration [ci skip]
2018-05-07 18:54:22 +02:00
B!
414f5270b3
B Cavello's signed Contributor Agreement v2 ( #2302 )
...
This time hopefully created in the right spot. (Sorry about that!)
2018-05-07 17:48:54 +02:00
ines
929a01139a
Order issue templates
2018-05-04 03:04:41 +02:00
Ines Montani
7f39c8896b
Update issue templates ( #2295 )
...
* Update issue templates
* Update templates
2018-05-04 03:02:26 +02:00
Douglas Knox
9b49a40f4e
Test and fix for Issue #2219 ( #2272 )
...
Test and fix for Issue #2219 : Token.similarity() failed if single letter
2018-05-03 18:40:46 +02:00
G.Pruvost
cc8e804648
#2211 - Support for ssl certs config on download command ( #2212 )
...
* Add support for SSL/Certs customization on download CLI
* Add a note on SSL options for the 'download' CLI in the README
* Add contributor agreement
2018-05-03 18:37:02 +02:00
Alex Villarreal
13d562e1a4
Fix code sample for Doc.set_extension ( #2282 )
...
* Fix code sample for `set_extension`
The previous sample code for `set_extension` fails the assertion at the end, because `city_getter` checked whether the whole document text matches any of the city names. Now it checks whether any of the city names is contained in the document text.
* Contributor agreement
2018-05-02 10:16:05 +02:00
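The difference between the two checks described in the entry above can be shown on plain strings. This sketch does not use spaCy; the function names and the city list are assumptions. In the documented example, a getter of this kind receives a `Doc` and is registered via `Doc.set_extension` with the `getter` argument.

```python
CITIES = ("Berlin", "London", "Paris")  # illustrative list


def whole_text_is_city(text):
    """Old behaviour: true only if the entire text equals a city name."""
    return text in CITIES


def contains_city(text):
    """Fixed behaviour: true if any city name occurs inside the text."""
    return any(city in text for city in CITIES)


sentence = "I like Berlin"
print(whole_text_is_city(sentence), contains_city(sentence))
```

With a full sentence as input, only the containment check succeeds, which is why the original assertion failed.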
Mr Roboto
6f5ccda19c
Addresses Issue #2228 - Deserialization fails when using tensor=False or sentiment=False ( #2230 )
...
* Fixes issue #2228
* Adds a new contributor
2018-05-01 13:40:22 +02:00
Shirish Kadam
d98a90440f
Added Adam project to spaCy Universe ( #2275 )
...
* Added 5hirish to contributors
* Added Adam Qas Project to spaCy Universe
* Remove $ from code example
2018-04-30 22:25:01 +02:00
Matt Upson
87cc6b3599
Add missing comma to NN example in docs ( #2255 )
...
Also add a completed contributor agreement.
2018-04-28 14:56:00 +02:00
Robin Linderborg
d01f503b54
Remove incorrect lemma lookup gäng->gänga ( #2252 )
...
* Remove incorrect lemma lookup gäng->gänga
In modern Swedish, "gäng" is mostly associated with "gang" or "group of people". The removed lemma lookup lemmatized it to the verb "thread".
* Add contrib agreement to correct directory
* Revert change to CONTRIBUTOR_AGREEMENT
2018-04-28 14:54:41 +02:00
Jens Dahl Møllerhøj
e5055e3cf6
Add Danish lemmatizer ( #2184 )
...
* add danish lemmatizer
* fill contributor agreement
2018-04-07 19:07:28 +02:00
ines
638068ec6c
Restore contributor agreement
2018-03-31 14:06:37 +02:00
Suraj Rajan
1cdbb7c97c
[2032] - Changed python set to cpp stl set ( #2170 )
...
Changed python set to cpp stl set #2032
## Description
Changed the Python set to a C++ STL set. The C++ STL set works better due to the logarithmic run time of its methods. Finding the minimum in the C++ set is done in constant time, as opposed to the worst-case linear run time of the Python set. Operations such as find, count, insert and delete are also done in either constant or logarithmic time, making the C++ set a better option for managing vectors.
Reference : http://www.cplusplus.com/reference/set/set/
### Types of change
Enhancement for `Vectors` for faster initialisation of word vectors (fastText)
2018-03-31 13:28:25 +02:00
Katrin Leinweber
6f84e32253
Formalise citation info ( #2167 )
...
* Create CITATION file
* Add Katrinleinweber contributor agreement
2018-03-30 10:34:14 +02:00
Viet Trung Tran
ea2af94cd9
Add support for Vietnamese in spaCy by leveraging Pyvi, an external Vietnamese tokenizer ( #2155 )
...
* support for Vietnamese
* Contributor Agreement for adding Vietnamese support on spaCy
2018-03-29 12:19:51 +02:00
ines
6173c4aaa6
Port over contributor agreements
2018-03-24 17:17:37 +01:00
Aaron Marquez
c7926f72eb
add contributor agreement for @enerrio
2018-02-15 12:43:04 -08:00
Claudiu-Vlad Ursache
cdd4b3d05c
Add contributor agreement for @ursachec
2018-02-13 20:49:42 +01:00
Johannes Dollinger
012e874d09
Add contributor agreement for emulbreh
2018-02-13 13:40:33 +01:00
Lyndon White
94ce43adf0
squashme
2018-02-09 23:19:11 +08:00
Lyndon White
5b1bc8d101
Sign contributors agreement
2018-02-09 23:14:29 +08:00
Pradeep Kumar Tippa
f1911ef73a
Added pktippa contributor agreement
2018-02-07 15:37:28 +05:30
sayf eddine hammemi
35272eade8
Accept contributor agreement.
2018-02-04 20:48:45 +01:00
Adam Binford
1a2c2f7d7f
Fixed auto linking after download and added simple test to check
2018-01-29 14:25:21 -05:00
Matthew Honnibal
cb7110c22e
Merge pull request #1882 from ohenrik/nb_lemma_and_tag_map
...
Add norwegian bokmål ('nb') lemmatizer and tag_map
2018-01-29 18:18:50 +01:00
Thomas Opsomer
f35895d81b
add contributor agreement
2018-01-28 20:12:05 +01:00
Ole Henrik Skogstrøm
bbc758526c
Added contributors agreement
2018-01-25 11:05:29 +01:00