8620 Commits

Author SHA1 Message Date
vishnumenon ae3719ece5 Fix the code for FACILITY entities (#2324)
* Fix the code for FACILITY entities

As far as I can tell, the default models all use "FAC" rather than "FACILITY"

* Added my Contributor Agreement

* Rename vishnumenon to vishnumenon.md
2018-05-12 15:19:17 +02:00
Jani Monoses 42b34832e4 Update Romanian stopword list (#2316)
* Contributor agreement for janimo

* Update Romanian stopword list

Include the correct spellings of all the words already in the repo
that are using cedillas (ş and ţ) instead of commas (ș and ț).

Add another unrelated spelling fix.

See https://github.com/stopwords-iso/stopwords-ro/pull/1 and
https://github.com/stopwords-iso/stopwords-ro/pull/2
2018-05-10 12:16:56 +02:00
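The cedilla-to-comma correction described in this commit can be illustrated with a small standard-library sketch. The names `CEDILLA_TO_COMMA`, `normalize_romanian` and `with_both_spellings` are hypothetical and are not part of spaCy or the stopwords-iso lists. The sample word is only an illustration.

```python
# Hypothetical helper, not part of spaCy: map the legacy cedilla letters
# to the comma-below letters used in correct Romanian spelling.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015f": "\u0219",  # ş -> ș
    "\u0163": "\u021b",  # ţ -> ț
    "\u015e": "\u0218",  # Ş -> Ș
    "\u0162": "\u021a",  # Ţ -> Ț
})


def normalize_romanian(word):
    """Return the word with cedilla letters replaced by comma-below letters."""
    return word.translate(CEDILLA_TO_COMMA)


def with_both_spellings(stopwords):
    """Return a set holding every word plus its comma-below spelling."""
    return set(stopwords) | {normalize_romanian(w) for w in stopwords}
```

Keeping both spellings, as the commit does, means text typed with either set of code points still matches the stopword list.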
Lucas Abbade 18af53014f Adding my contributor agreement (#2315)
* Create LRAbbade.md

* Update LRAbbade.md
2018-05-09 21:25:05 +02:00
Lucas Abbade be7fdc59d1 Update lex_attrs.py (#2307)
* Update lex_attrs.py

Fixed spelling mistakes of some numbers (according to Brazilian Portuguese).

* Update lex_attrs.py

As requested, I've included the correct spelling for both Brazilian Portuguese and Portuguese Portuguese.

I would advise, however, that the two be separated in the future. Brazilian Portuguese is a very different language from the original one: although most of the writing is unified, the way people talk in the two countries is radically different. Keeping both languages as one may lead to bigger issues in the future, especially when it comes to spell checking.
2018-05-09 20:49:31 +02:00
mauryaland 5368ba028a Update stop_words.py for French language (#2310)
* Add contraction forms of some common stopwords

All the stopwords added contain the apostrophe " ' " or " ’ ".

* Adds contributor agreement mauryaland

* Update mauryaland.md
2018-05-09 12:04:38 +02:00
ines 7a3599c21a Fix formatting and consistency 2018-05-07 23:02:11 +02:00
ines 37facf9b4d Add config for no-response [ci skip] 2018-05-07 22:04:54 +02:00
ines ac25bc4016 Add docs section on sentence segmentation [ci skip] 2018-05-07 21:25:20 +02:00
ines 14148cd147 Fix formatting and wording 2018-05-07 21:24:35 +02:00
ines f803da609f Add scattertext [ci skip] 2018-05-07 19:10:23 +02:00
ines a685fff875 Merge branch 'master' of https://github.com/explosion/spaCy 2018-05-07 18:58:57 +02:00
ines e2241c797c Add lock-threads configuration [ci skip] 2018-05-07 18:54:22 +02:00
B! 414f5270b3 B Cavello's signed Contributor Agreement v2 (#2302)
This time hopefully created in the right spot. (Sorry about that!)
2018-05-07 17:48:54 +02:00
Matt Upson 9a1d3b63fb Add missing default to .set_extension (#2297)
Failing to set a default, method, or getter results in a ValueError:

ValueError: [E083] Error setting extension: only one of `default`, `method`, or `getter` (plus optional `setter`) is allowed. Got: 0
2018-05-04 18:47:01 +02:00
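The rule quoted in the E083 message can be sketched in plain Python. This is a hypothetical re-creation for illustration only, not spaCy's implementation. The function name `check_extension_args` and the `_UNSET` sentinel are assumptions introduced here.

```python
# Hypothetical re-creation of the rule quoted in the error message above;
# this is NOT spaCy's actual code.
_UNSET = object()  # sentinel, so that default=None still counts as "given"


def check_extension_args(default=_UNSET, method=None, getter=None, setter=None):
    """Raise ValueError unless exactly one of default/method/getter is given.

    A setter is optional and does not count towards the total.
    """
    given = [default is not _UNSET, method is not None, getter is not None]
    count = sum(given)
    if count != 1:
        raise ValueError(
            "only one of `default`, `method`, or `getter` "
            "(plus optional `setter`) is allowed. Got: %d" % count
        )
    return count
```

Calling `check_extension_args()` with no arguments raises the error ending in "Got: 0", which matches the situation the commit fixes by supplying a default.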
ines 929a01139a Order issue templates 2018-05-04 03:04:41 +02:00
Ines Montani 7f39c8896b Update issue templates (#2295)
* Update issue templates

* Update templates
2018-05-04 03:02:26 +02:00
Douglas Knox 9b49a40f4e Test and fix for Issue #2219 (#2272)
Test and fix for Issue #2219: Token.similarity() failed on single-letter tokens
2018-05-03 18:40:46 +02:00
Paul O'Leary McCann bd72fbf09c Port Japanese mecab tokenizer from v1 (#2036)
* Port Japanese mecab tokenizer from v1

This brings the Mecab-based Japanese tokenization introduced in #1246 to
spaCy v2. There isn't a JapaneseTagger implementation yet, but POS tag
information from Mecab is stored in a token extension. A tag map is also
included.

As a reminder, Mecab is required because Universal Dependencies are
based on Unidic tags, and Janome doesn't support Unidic.

Things to check:

1. Is this the right way to use a token extension?

2. What's the right way to implement a JapaneseTagger? The approach in
 #1246 relied on `tag_from_strings` which is just gone now. I guess the
best thing is to just try training spaCy's default Tagger?

-POLM

* Add tagging/make_doc and tests
2018-05-03 18:38:26 +02:00
G.Pruvost cc8e804648 #2211 - Support for ssl certs config on download command (#2212)
* Add support for SSL/Certs customization on download CLI

* Add a note on SSL options for the 'download' CLI in the README

* Add contributor agreement
2018-05-03 18:37:02 +02:00
Jens Dahl Møllerhøj b9290397fb rename SP to _SP (#2289) 2018-05-03 18:33:49 +02:00
ines c9547b7b8b Update Juniper (see #2293) 2018-05-03 15:36:02 +02:00
Alex Villarreal 647f2544c5 Fix code sample for span.set_extension (#2286) 2018-05-03 00:39:22 +02:00
Alex Villarreal 13d562e1a4 Fix code sample for Doc.set_extension (#2282)
* Fix code sample for `set_extension`

The previous sample code for `set_extension` fails the assertion at the end, because `city_getter` checked whether the whole document text matches any of the city names. Now it checks whether any of the city names is contained in the document text.

* Contributor agreement
2018-05-02 10:16:05 +02:00
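The difference between the two checks described in this commit can be shown on plain strings. This is a minimal sketch that stands in for the document text. The city list and both function names are illustrative assumptions, not the code from the docs.

```python
# Illustrative city list; an assumption, not taken from the commit.
CITIES = ("New York", "Paris", "Berlin")


def city_getter_exact(text):
    # Behaviour described as broken: true only if the whole text is a city name.
    return text in CITIES


def city_getter_contains(text):
    # Behaviour described as the fix: true if any city name occurs in the text.
    return any(city in text for city in CITIES)
```

For a sentence such as "I like New York", the exact-match version returns False while the containment version returns True, which is why the original assertion failed.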
Mr Roboto 6f5ccda19c Addresses Issue #2228 - Deserialization fails when using tensor=False or sentiment=False (#2230)
* Fixes issue #2228

* Adds a new contributor
2018-05-01 13:40:22 +02:00
Shirish Kadam d98a90440f Added Adam project to spaCy Universe (#2275)
* Added 5hirish to contributors

* Added Adam Qas Project to spaCy Universe

* Remove $ from code example
2018-04-30 22:25:01 +02:00
ines 56e7faf16b Fix spacing 2018-04-30 22:24:40 +02:00
ines 6efb4cdf88 Use Juniper and tidy up 2018-04-30 18:48:35 +02:00
ines 45bb8d75a5 Fix overflow issues on small screens [ci skip] 2018-04-29 03:17:36 +02:00
Ines Montani 49cee4af92 💫 Interactive code examples, spaCy Universe and various docs improvements (#2274)
* Integrate Python kernel via Binder

* Add live model test for languages with examples

* Update docs and code examples

* Adjust margin (if not bootstrapped)

* Add binder version to global config

* Update terminal and executable code mixins

* Pass attributes through infobox and section

* Hide v-cloak

* Fix example

* Take out model comparison for now

* Add meta text for compat

* Remove chart.js dependency

* Tidy up and simplify JS and port big components over to Vue

* Remove chartjs example

* Add Twitter icon

* Add purple stylesheet option

* Add utility for hand cursor (special cases only)

* Add transition classes

* Add small option for section

* Add thumb object for small round thumbnail images

* Allow unset code block language via "none" value

(workaround to still allow unset language to default to DEFAULT_SYNTAX)

* Pass through attributes

* Add syntax highlighting definitions for Julia, R and Docker

* Add website icon

* Remove user survey from navigation

* Don't hide GitHub icon on small screens

* Make top navigation scrollable on small screens

* Remove old resources page and references to it

* Add Universe

* Add helper functions for better page URL and title

* Update site description

* Increment versions

* Update preview images

* Update mentions of resources

* Fix image

* Fix social images

* Fix problem with cover sizing and floats

* Add divider and move badges into heading

* Add docstrings

* Reference converting section

* Add section on converting word vectors

* Move converting section to custom section and fix formatting

* Remove old fastText example

* Move extensions content to own section

Keep weird ID to not break permalinks for now (we don't want to rewrite URLs if not absolutely necessary)

* Use better component example and add factories section

* Add note on larger model

* Use better example for non-vector

* Remove similarity in context section

Only works via small models with tensors so has always been kind of confusing

* Add note on init-model command

* Fix lightning tour examples and make executable if possible

* Add spacy train CLI section to train

* Fix formatting and add video

* Fix formatting

* Fix textcat example description (resolves #2246)

* Add dummy file to try to resolve conflict

* Delete dummy file

* Tidy up [ci skip]

* Ensure sufficient height of loading container

* Add loading animation to universe

* Update Thebelab build and use better startup message

* Fix asset versioning

* Fix typo [ci skip]

* Add note on project idea label
2018-04-29 02:06:46 +02:00
ines 3c80f69ff5 Return data in cli.info and add silent option (resolves #2196) 2018-04-29 01:59:44 +02:00
ines 1c6d77610c Add remove_extension method on Doc, Token and Span (closes #2242) 2018-04-28 23:33:09 +02:00
ines a512fa60ef Remove upcoming option from docs for now 2018-04-28 23:32:18 +02:00
ines abdb853ebf Simplify underscore tests 2018-04-28 23:30:33 +02:00
ines 6fb6371670 Add collapse_phrases option to displacy (closes #2266) 2018-04-28 23:06:50 +02:00
Matt Upson 87cc6b3599 Add missing comma to NN example in docs (#2255)
Also add a completed contributor agreement.
2018-04-28 14:56:00 +02:00
Robin Linderborg 1f9904ef12 fixes #2238 (#2241)
* Remove erroneous lemma lookup år > åra in Swedish

* Add contributors agreement

* Add contrib agreement to correct directory

* Revert change to CONTRIBUTOR_AGREEMENT
2018-04-28 14:55:22 +02:00
Robin Linderborg d01f503b54 Remove incorrect lemma lookup gäng->gänga (#2252)
* Remove incorrect lemma lookup gäng->gänga
In modern Swedish, "gäng" is mostly associated with "gang" or "group of people". The removed lemma lookup lemmatized it to the verb "thread".

* Add contrib agreement to correct directory

* Revert change to CONTRIBUTOR_AGREEMENT
2018-04-28 14:54:41 +02:00
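The effect of removing a lookup entry, as in the two Swedish commits above, can be sketched with a plain dictionary. This assumes a lookup lemmatizer that falls back to the word itself when no entry exists. The miniature table and the names `LOOKUP`, `lemmatize` and `without_entry` are hypothetical.

```python
# Hypothetical miniature lookup table; the real Swedish table is far larger.
LOOKUP = {"gäng": "gänga", "bilar": "bil"}


def lemmatize(word, table):
    """Return the table entry for the word, or the word itself if there is none."""
    return table.get(word, word)


def without_entry(table, word):
    """Return a copy of the table with one incorrect entry removed."""
    return {key: value for key, value in table.items() if key != word}
```

With the entry removed, `lemmatize("gäng", without_entry(LOOKUP, "gäng"))` returns "gäng" unchanged instead of the verb form, while other entries keep working.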
ines 4a3bea00c7 Update resources [ci skip] 2018-04-26 22:10:34 +02:00
ines 686225eadd Fix Spanish noun_chunks (resolves #2210)
Make sure 'NP' label is added to StringStore and move noun_bounds helper into a closure to allow reusing label sets
2018-04-18 18:44:01 -04:00
ines 9632595fb4 Use correct, non-deprecated merge syntax (resolves #2226) 2018-04-18 18:28:28 -04:00
Suraj Rajan 5957f15227 Fixed typos for #2222,#2223 (#2233) (closes #2222, closes #2223) 2018-04-18 14:55:26 -07:00
Pradeep Kumar Tippa df389e5b74 spacy-101 vocab doc giving valid variable names (#2236) 2018-04-18 14:54:26 -07:00
Ines Montani b4d35c7dfb Update README.rst 2018-04-10 23:05:49 +02:00
Matthew Honnibal 97851d2c4e Increment version to v2.0.12.dev0 2018-04-10 22:20:16 +02:00
Matthew Honnibal ed39c75a92 Merge branch 'master' of https://github.com/explosion/spaCy 2018-04-10 22:19:40 +02:00
Matthew Honnibal 3836199a83 Fix loading of models when custom vectors are added 2018-04-10 22:19:20 +02:00
ines 0299d5fac8 Update argument annotations and formatting 2018-04-10 21:45:11 +02:00
ines 49b1e48bf5 Fix syntax error 2018-04-10 21:44:59 +02:00
ines ce63f8997b Update init-model docs 2018-04-10 21:42:54 +02:00
ines 70052e46e9 Fix formatting [ci skip] 2018-04-10 21:42:46 +02:00