spaCy/.github/contributors
Connor Brinton 657af5f91f
🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167)
* 🚨 Ignore all existing Mypy errors

* 🏗 Add Mypy check to CI

* Add types-mock and types-requests as dev requirements

* Add additional type ignore directives

* Add types packages to dev-only list in reqs test

* Add types-dataclasses for python 3.6

* Add ignore to pretrain

* 🏷 Improve type annotation on `run_command` helper

The `run_command` helper previously declared that it returned an
`Optional[subprocess.CompletedProcess]`, but it isn't actually possible
for the function to return `None`. These changes modify the type
annotation of the `run_command` helper and remove all now-unnecessary
`# type: ignore` directives.
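
As a rough illustration of why the `Optional` was unnecessary, here is a minimal sketch. `run_command_sketch` is a hypothetical stand-in, not spaCy's actual helper, and its signature is simplified. Every code path either returns a `CompletedProcess` or exits, so `None` can never reach the caller:

```python
import subprocess
import sys
from typing import List


def run_command_sketch(command: List[str]) -> subprocess.CompletedProcess:
    """Hypothetical stand-in: return the completed process or exit."""
    try:
        # check=True raises CalledProcessError on a non-zero exit code
        return subprocess.run(command, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as err:
        # Exiting raises SystemExit, so this branch never returns None
        sys.exit(err.returncode)


result = run_command_sketch([sys.executable, "-c", "print('ok')"])
print(result.stdout.strip())
```

With the non-Optional return type, callers can access attributes such as `result.returncode` without a `None` check or a `# type: ignore`.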

* 🔧 Allow variable type redefinition in limited contexts

These changes modify how Mypy is configured to allow variables to have
their type automatically redefined under certain conditions. The Mypy
documentation contains the following example:

```python
def process(items: List[str]) -> None:
    # 'items' has type List[str]
    items = [item.split() for item in items]
    # 'items' now has type List[List[str]]
    ...
```

This configuration change is especially helpful in reducing the number
of `# type: ignore` directives needed to handle the common pattern of:
* Accepting a filepath as a string
* Overwriting the variable using `filepath = ensure_path(filepath)`

These changes enable redefinition and remove all `# type: ignore`
directives rendered redundant by this change.
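
The filepath pattern above can be sketched as follows. This assumes the setting involved is Mypy's `allow_redefinition` option; `ensure_path_sketch` and `file_stem` are hypothetical names used only for illustration, not spaCy's own helpers:

```python
from pathlib import Path
from typing import Union


def ensure_path_sketch(path: Union[str, Path]) -> Path:
    """Hypothetical helper: convert a string to a Path, pass a Path through."""
    return Path(path) if isinstance(path, str) else path


def file_stem(filepath: str) -> str:
    # 'filepath' has type str here
    filepath = ensure_path_sketch(filepath)
    # With redefinition allowed, Mypy now treats 'filepath' as a Path,
    # so no '# type: ignore' is needed on the line above
    return filepath.stem


print(file_stem("data/train.spacy"))
```

The redefinition happens in the same block and nesting level as the original definition, which is the condition under which Mypy permits it.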

* 🏷 Add type annotation to converters mapping

* 🚨 Fix Mypy error in convert CLI argument verification

* 🏷 Improve type annotation on `resolve_dot_names` helper

* 🏷 Add type annotations for `Vocab` attributes `strings` and `vectors`

* 🏷 Add type annotations for more `Vocab` attributes

* 🏷 Add loose type annotation for gold data compilation

* 🏷 Improve `_format_labels` type annotation

* 🏷 Fix `get_lang_class` type annotation

* 🏷 Loosen return type of `Language.evaluate`

* 🏷 Don't accept `Scorer` in `handle_scores_per_type`

* 🏷 Add `string_to_list` overloads

* 🏷 Fix non-Optional command-line options

* 🙈 Ignore redefinition of `wandb_logger` in `loggers.py`

* Install `typing_extensions` in Python 3.8+

The `typing_extensions` package states that it should be used when
"writing code that must be compatible with multiple Python versions".
Since spaCy needs to support multiple Python versions, `typing_extensions`
should be used when newer `typing` module members are required. One
example of this is `Literal`, which is available in `typing` starting
with Python 3.8.

Previously spaCy tried to import `Literal` from `typing`, falling back
to `typing_extensions` if the import failed. However, Mypy doesn't seem
to be able to understand what `Literal` means when the initial import
fails. Therefore, these changes modify how `compat` imports `Literal` by
always importing it from `typing_extensions`.

These changes also modify how `typing_extensions` is installed, so that
it is a requirement for all Python versions, including those greater
than or equal to 3.8.

* 🏷 Improve type annotation for `Language.pipe`

These changes add a missing overload variant to the type signature of
`Language.pipe`. Additionally, the type signature is enhanced to allow
type checkers to differentiate between the two overload variants based
on the `as_tuples` parameter.

Fixes #8772
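
A minimal sketch of how two overload variants can be told apart by a boolean flag typed with `Literal`. `pipe_sketch` is a hypothetical toy function that uppercases strings; it is not `Language.pipe` itself, and only the `as_tuples` parameter name is taken from spaCy:

```python
from typing import Iterable, Iterator, Literal, Tuple, TypeVar, overload

_Context = TypeVar("_Context")


@overload
def pipe_sketch(
    texts: Iterable[str], *, as_tuples: Literal[False] = ...
) -> Iterator[str]: ...


@overload
def pipe_sketch(
    texts: Iterable[Tuple[str, _Context]], *, as_tuples: Literal[True]
) -> Iterator[Tuple[str, _Context]]: ...


def pipe_sketch(texts, *, as_tuples=False):
    # The runtime implementation handles both variants
    if as_tuples:
        for text, context in texts:
            yield text.upper(), context
    else:
        for text in texts:
            yield text.upper()


plain = list(pipe_sketch(["a", "b"]))
paired = list(pipe_sketch([("a", 1), ("b", 2)], as_tuples=True))
print(plain, paired)
```

Because `as_tuples` is annotated as `Literal[False]` in one variant and `Literal[True]` in the other, a type checker can pick the matching return type from the call site alone.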

* Don't install `typing-extensions` in Python 3.8+

After more detailed analysis of how to implement Python version-specific
type annotations in spaCy, it has been determined that branching on a
comparison against `sys.version_info` can be statically analyzed by
Mypy well enough to enable us to conditionally use
`typing_extensions.Literal`. This means that we no longer need to
install `typing_extensions` for Python versions greater than or equal to
3.8! 🎉

These changes revert previous changes installing `typing-extensions`
regardless of Python version and modify how we import the `Literal` type
to ensure that Mypy treats it properly.
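
The version check described above can be sketched as follows. The `Mode` alias and `describe` function are hypothetical and exist only to show the imported `Literal` in use; the exact layout of spaCy's `compat` module may differ:

```python
import sys

# Mypy understands comparisons against sys.version_info and only
# analyzes the branch that matches the configured Python version
if sys.version_info >= (3, 8):
    from typing import Literal
else:
    from typing_extensions import Literal

Mode = Literal["train", "eval"]


def describe(mode: Mode) -> str:
    """Hypothetical function restricted to two literal string values."""
    return f"mode={mode}"


print(describe("train"))
```

By contrast, a `try`/`except ImportError` fallback is a runtime construct that Mypy does not resolve in the same way, which matches the problem reported in the earlier change.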

* resolve mypy errors for Strict pydantic types

* refactor code to avoid missing return statement

* fix types of convert CLI command

* avoid list-set confusion in debug_data

* fix typo and formatting

* small fixes to avoid type ignores

* fix types in profile CLI command and make it more efficient

* type fixes in projects CLI

* put one ignore back

* type fixes for render

* fix render types - the sequel

* fix BaseDefault in language definitions

* fix type of noun_chunks iterator - yields tuple instead of span

* fix types in language-specific modules

* 🏷 Expand accepted inputs of `get_string_id`

`get_string_id` accepts either a string (in which case it returns its 
ID) or an ID (in which case it immediately returns the ID). These 
changes extend the type annotation of `get_string_id` to indicate that 
it can accept either strings or IDs.
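
A minimal sketch of the widened annotation. `get_string_id_sketch` is a hypothetical pure-Python stand-in, and the SHA-256-based hash is a placeholder chosen for illustration, not the hash spaCy actually uses:

```python
import hashlib
from typing import Union


def get_string_id_sketch(key: Union[str, int]) -> int:
    """Return the ID for a string, or pass an existing ID straight through."""
    if isinstance(key, int):
        # Already an ID: return it unchanged
        return key
    # Placeholder hash (assumption): first 8 bytes of SHA-256 as an integer
    digest = hashlib.sha256(key.encode("utf8")).digest()
    return int.from_bytes(digest[:8], "little")


string_id = get_string_id_sketch("apple")
print(get_string_id_sketch(string_id) == string_id)
```

With the `Union[str, int]` annotation, callers that already hold an ID no longer trigger a Mypy error when passing it in.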

* 🏷 Handle override types in `combine_score_weights`

The `combine_score_weights` function allows users to pass an `overrides` 
mapping to override data extracted from the `weights` argument. Since it 
allows `Optional` dictionary values, the return value may also include 
`Optional` dictionary values.

These changes update the type annotations for `combine_score_weights` to 
reflect this fact.
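
A simplified sketch of how `Optional` values in `overrides` propagate into the return type. `combine_weights_sketch` is hypothetical: it only merges dictionaries and applies overrides, and it leaves out any other processing the real `combine_score_weights` performs:

```python
from typing import Dict, List, Optional


def combine_weights_sketch(
    weights: List[Dict[str, Optional[float]]],
    overrides: Optional[Dict[str, Optional[float]]] = None,
) -> Dict[str, Optional[float]]:
    """Merge weight dicts, then apply overrides (which may be None)."""
    combined: Dict[str, Optional[float]] = {}
    for weight_dict in weights:
        combined.update(weight_dict)
    # An override of None survives into the result, so the return
    # type must allow Optional values as well
    combined.update(overrides or {})
    return combined


merged = combine_weights_sketch([{"a": 1.0}, {"b": 0.5}], {"b": None})
print(merged)
```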

* 🏷 Fix tokenizer serialization method signatures in `DummyTokenizer`

* 🏷 Fix redefinition of `wandb_logger`

These changes fix the redefinition of `wandb_logger` by giving a 
separate name to each `WandbLogger` version. For 
backwards-compatibility, `spacy.train` still exports `wandb_logger_v3` 
as `wandb_logger` for now.
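
The renaming approach can be sketched with a toy registry. The `REGISTRY` dictionary, the `register` decorator, the `toy.` registry names and the `wandb_logger_v1` function are all hypothetical stand-ins; only `wandb_logger_v3` and the `wandb_logger` alias come from the description above:

```python
from typing import Callable, Dict

# Hypothetical registry mapping versioned names to logger factories
REGISTRY: Dict[str, Callable[[], str]] = {}


def register(name: str) -> Callable[[Callable[[], str]], Callable[[], str]]:
    def decorator(func: Callable[[], str]) -> Callable[[], str]:
        REGISTRY[name] = func
        return func

    return decorator


@register("toy.WandbLogger.v1")
def wandb_logger_v1() -> str:
    return "v1"


@register("toy.WandbLogger.v3")
def wandb_logger_v3() -> str:
    return "v3"


# Backwards-compatible alias: the old name points at the newest version
wandb_logger = wandb_logger_v3

print(wandb_logger())
```

Each function now has a unique module-level name, so Mypy no longer reports a redefinition, while code that uses the old name keeps working through the alias.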

* more fixes for typing in language

* type fixes in model definitions

* 🏷 Annotate `_RandomWords.probs` as `NDArray`

* 🏷 Annotate `tok2vec` layers to help Mypy

* 🐛 Fix `_RandomWords.probs` type annotations for Python 3.6

Also remove an import that I forgot to move to the top of the module 😅

* more fixes for matchers and other pipeline components

* quick fix for entity linker

* fixing types for spancat, textcat, etc

* bugfix for tok2vec

* type annotations for scorer

* add runtime_checkable for Protocol

* type and import fixes in tests

* mypy fixes for training utilities

* few fixes in util

* fix import

* 🐵 Remove unused `# type: ignore` directives

* 🏷 Annotate `Language._components`

* 🏷 Annotate `spacy.pipeline.Pipe`

* add doc as property to span.pyi

* small fixes and cleanup

* explicit type annotations instead of via comment

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: svlandeg <sofie.vanlandeghem@gmail.com>
Co-authored-by: svlandeg <svlandeg@github.com>
2021-10-14 15:21:40 +02:00
..
0x2b3bfa0.md Create 0x2b3bfa0.md (#6916) 2021-02-04 23:25:11 +01:00
5hirish.md Added Adam project to spaCy Universe (#2275) 2018-04-30 22:25:01 +02:00
ALSchwalm.md Fix bug where Vocab.prune_vector did not use 'batch_size' (#2977) 2018-11-28 19:49:33 +01:00
AMArostegui.md spaCy Universe: New project; SpacyDotNet (#6702) 2021-01-13 12:47:30 +11:00
AlJohri.md sign contributor agreement for AlJohri (#4839) [ci skip] 2019-12-29 14:17:28 +01:00
Arvindcheenu.md Added Tamil Example Sentences (#5583) 2020-06-13 15:56:26 +02:00
AyushExel.md W&B integration: Optional support for dataset and model checkpoint logging and versioning (#7429) 2021-04-01 19:36:23 +02:00
Azagh3l.md Create Azagh3l.md (#3836) 2019-06-11 10:58:32 +02:00
Baciccin.md Add Ligurian language 2020-03-19 21:37:01 -07:00
Bharat123rox.md Made changes suggested by @ines 2019-03-20 07:43:19 +05:30
BigstickCarpet.md Better formatting for `spacy train` CLI (#2357) 2018-05-25 13:08:45 +02:00
BramVanroy.md Documentation improvement regarding joblib and SO (#2867) 2018-10-24 15:19:17 +02:00
BreakBB.md Fix symlink creation to show error message on failure (#3589) (resolves #3307)) 2019-04-16 11:58:31 +02:00
Bri-Will.md Adds contributor agreement for Bri-Will 2017-12-11 14:38:37 -08:00
Brixjohn.md Added alpha support for Tagalog language (#3062) 2018-12-18 13:08:38 +01:00
Cinnamy.md Correcting lang/ru/examples.py (#2845) 2018-10-13 15:19:43 +02:00
DeNeutoy.md Allow vectors to be optional in init-model, more robust string counting (#3155) 2019-01-14 23:48:30 +01:00
DimaBryuhanov.md DimaBryuhanov.md (#2590) 2018-07-24 18:43:27 +02:00
Dobita21.md Create Dobita21.md (#3614) 2019-04-18 12:51:54 +02:00
DoomCoder.md Improved polish tokenizer and stop words. (#2974) 2019-02-08 14:27:21 +11:00
DuyguA.md added contributor agreement for DuyguA 2017-11-13 15:45:13 +01:00
EARL_GREYT.md fix typo in first token (#4327) 2019-09-27 14:49:36 +02:00
Eleni170.md Add support for Greek language (#2535) 2018-07-10 13:48:38 +02:00
EmilStenstrom.md Add abbreviations from UD_Swedish-Talbanken (#2613) 2018-08-07 13:53:17 +02:00
F0rge1cE.md Fix offset bug in loading pre-trained word2vec. (#3689) 2019-05-06 23:00:38 +02:00
FallakAsad.md Bugfix/issue 3968 (#3982) 2019-07-18 00:20:32 +02:00
GiorgioPorgio.md Port over contributor agreement from spacy-lookups-data [ci skip] 2019-10-25 13:06:10 +02:00
Gizzio.md Improved polish tokenizer and stop words. (#2974) 2019-02-08 14:27:21 +11:00
GuiGel.md Bugfix/fix entity ruler from disk (#4670) 2019-11-21 16:26:37 +01:00
Hazoom.md Improve speed of _merge method (#4300) 2019-09-18 21:34:34 +02:00
HiromuHota.md Tags are joined with a comma and padded with asterisks (#3491) 2019-03-28 16:17:31 +01:00
ICLRandD.md Add entry for Blackstone in universe.json (#4101) 2019-08-09 17:16:51 +02:00
IsaacHaze.md Adds contributor agreement IsaacHaze 2017-12-10 23:15:06 +01:00
JKhakpour.md Add Persian(Farsi) language support (#2797) 2018-10-13 15:31:49 +02:00
Jan-711.md Fix/Improve german stop words (#5024) 2020-02-17 18:59:22 +01:00
JannisTriesToCode.md Documentation Typo Fix (#5492) 2020-05-22 19:50:26 +02:00
Jette16.md Add universe test (#9278) 2021-10-11 11:08:46 +02:00
KKsharma99.md Adding MindMeld to Universe JSON (#6275) 2020-10-21 18:42:11 +02:00
KennethEnevoldsen.md added agreement 2021-07-13 10:11:02 +02:00
Kimahriman.md Fixed auto linking after download and added simple test to check 2018-01-29 14:25:21 -05:00
LRAbbade.md Adding my contributor agreement (#2315) 2018-05-09 21:25:05 +02:00
Loghijiaha.md Tamil language support (#3154) 2019-01-14 15:32:30 +01:00
MartinoMensio.md adding spacy-universal-sentence-encoder (#5534) 2020-06-08 20:26:30 +02:00
MateuszOlko.md Improved polish tokenizer and stop words. (#2974) 2019-02-08 14:27:21 +11:00
MathiasDesch.md Add spaCy Contributor Agreement 2017-11-09 11:56:47 +01:00
MiniLau.md Add is_sent_end token property (#5375) 2020-04-29 12:53:16 +02:00
MisterKeefe.md make idx available via to_array (#5030) 2020-02-22 14:13:06 +01:00
Mlawrence95.md [minor doc change] embedding vis. link is broken in `website/docs/usage/examples.md` (#5325) 2020-04-21 20:35:12 +02:00
NSchrading.md Re-add existing contributor agreements 2016-11-09 16:42:02 +01:00
NirantK.md Create NirantK.md (#3807) [ci skip] 2019-06-01 17:36:06 +02:00
Nuccy90.md Update morph_rules.py (#6102) 2020-10-06 15:14:47 +02:00
Olamyy.md Adding support for Yoruba Language (#4614) 2019-12-21 14:11:50 +01:00
Pavle992.md Stopwords for Serbian language. (#4078) 2019-08-05 10:22:27 +02:00
PeterGilles.md Initial commit: New language Luxembourgish (lb) (#4424) 2019-10-14 12:27:50 +02:00
PluieElectrique.md Reduce memory usage of Lookup's BloomFilter (#5606) 2020-06-26 14:09:10 +02:00
Poluglottos.md Fix typo 2019-03-16 13:45:46 +01:00
PolyglotOpenstreetmap.md Create PolyglotOpenstreetmap.md (#3198) 2019-01-26 14:02:54 +01:00
R1j1t.md update spacy universe with my project (#5497) 2020-05-25 11:30:23 +02:00
RvanNieuwpoort.md Signed Contributer Agreement by Rob van Nieuwpoort 2016-12-15 10:34:19 +01:00
SamEdwardes.md Updates to universe.json for spaCyTextBlob (#7647) 2021-04-04 20:17:57 +02:00
SamuelLKane.md fix(util): fix decaying function output (#3495) 2019-03-28 13:24:47 +01:00
Schibsted.png Add contributor agreement [ci skip] 2019-08-30 17:02:43 +02:00
Stannislav.md Change type of texts argument in pipe to iterable (#6186) 2020-10-02 21:00:11 +02:00
Tiljander.md Describing priority rules for overlapping matches (#5197) 2020-03-26 13:13:22 +01:00
YohannesDatasci.md Armenian language support (#5246) 2020-04-03 13:02:18 +02:00
ZeeD.md applying suggestion to avoid mypy errors (#8265) 2021-06-02 19:25:30 +10:00
aajanki.md Improvements to the Finnish language data (#4738) 2019-12-03 12:55:28 +01:00
aaronkub.md fixing regex matcher examples (#3708) (#3719) 2019-05-10 14:23:52 +02:00
aashishg.md Added numbers to ../lang/hi/lex_attrs.py (#2629) 2018-08-08 16:06:11 +02:00
abchapman93.md Add VA COVID-19 NLP project to spaCy Universe (#5777) 2020-07-19 13:35:31 +02:00
abhi18av.md Create abhi18av.md 2017-11-13 17:23:05 +05:30
adrianeboyd.md Update TIGER/German dependency relations in documentation (#3204) 2019-01-30 14:23:12 +01:00
adrienball.md Fix egg fragments in direct download (#3369) 2019-03-07 21:07:19 +01:00
ajrader.md Correction of default lemmatizer lookup in English (Issue # 4104) (#4110) 2019-08-15 11:39:10 +02:00
akki2825.md add kannada support (#3264) 2019-02-12 18:28:39 +01:00
akornilo.md Update gold corpus code to properly ingest a directory of jsonl… (#4067) 2019-08-02 09:58:51 +02:00
alexcombessie.md Remove questionable French stopwords (#6310) 2021-01-08 11:36:22 +11:00
alexvy86.md Fix code sample for Doc.set_extension (#2282) 2018-05-02 10:16:05 +02:00
aliiae.md Add Tatar Language Support (#2444) 2018-06-19 10:17:53 +02:00
alldefector.md Port over contributor agreements 2018-03-24 17:17:37 +01:00
alvaroabascar.md Fix issue 2396 (#3089) 2018-12-29 18:05:52 +01:00
alvations.md Create alvations.md (#3119) 2019-01-05 13:11:06 +01:00
ameyuuno.md added contributor agreement ameyuuno.md (#3925) 2019-07-09 10:09:52 +02:00
amitness.md Fix broken link to Dive Into Python 3 website (#3656) 2019-04-29 19:44:00 +02:00
amperinet.md add small fix for French lemmatizer (#3206) 2019-01-31 23:44:10 +01:00
aniruddha-adhikary.md update bengali token rules for hyphen and digits (#2731) 2018-09-05 21:49:00 +02:00
ansgar-t.md escape html in displacy.render (#2378) (closes #2361) 2018-05-28 18:36:41 +02:00
aongko.md Update Indonesian model (#2752) 2018-09-14 12:30:32 +02:00
aristorinjuang.md adding more words and rephrasing (#2351) 2018-05-24 11:40:57 +02:00
armsp.md Default code for Setting Entity annotations on the website errors (#7738) 2021-04-21 09:16:32 +02:00
aryaprabhudesai.md Create aryaprabhudesai.md (#2681) 2018-08-20 18:56:14 +02:00
askhogan.md Update example and sign contributor agreement (#3916) 2019-07-08 10:27:20 +02:00
avadhpatel.md Signed contributor agreement 2018-01-17 06:33:37 -06:00
avramandrei.md Added RONEC to spaCy Universe (#4151) 2019-08-20 14:46:07 +02:00
azarezade.md add contributors.md 2018-01-23 13:47:30 +03:30
b1uec0in.md Fix error when Korean text contains regexp special characters. (#4022) 2019-07-25 17:53:33 +02:00
bbieniek.md added contribution license 2021-08-19 21:45:18 +02:00
bdewilde.md Add contributor agreement 2017-11-20 11:28:31 -06:00
beatesi.md Updated wordforms for Norwegian lemmatizer (#3007) 2018-12-06 15:46:18 +01:00
bellabie.md Fix filename 2019-03-16 13:46:58 +01:00
bintay.md most_similar() return the k most similar vectors (#4364) 2019-10-03 14:09:44 +02:00
bittlingmayer.md Add Armenian sentence-final verchaket, Greek question mark and Arabic question mark to default punct (#5910) 2020-08-12 15:36:14 +02:00
bjascob.md Update Universe Website for pyInflect (#3641) 2019-04-26 13:17:36 +02:00
bodak.md Add hmrb to spaCy Universe (#8129) 2021-05-31 18:40:48 +10:00
boena.md Updates to Swedish Language (#3164) 2019-01-16 13:45:50 +01:00
borijang.md Include Macedonian language (#6230) 2020-10-15 15:55:01 +02:00
bratao.md spaCy v3 is not saving the best version in training loop (#6629) 2021-01-06 12:51:30 +11:00
broaddeep.md Support match alignments (#7321) 2021-04-08 18:10:14 +10:00
bryant1410.md Fix website docs for Vectors.from_glove (#3565) 2019-04-10 15:23:27 +02:00
bsweileh.md Update _training.md - Fix broken link on backpropagation (#7431) 2021-03-15 09:21:35 +01:00
btrungchi.md Fix loading tokenizer with custom prefix search (#2495) 2018-07-04 12:56:07 +02:00
calumcalder.md Port over contributor agreements 2018-03-24 17:17:37 +01:00
cbilgili.md Adds Canbey Bilgili's Contributor Agreement 2017-12-01 17:27:41 +03:00
cclauss.md Create cclauss.md 2017-11-20 14:57:30 +01:00
cedar101.md Korean support (#3901) 2019-07-09 22:23:16 +02:00
celikomer.md Signed agreement (#3577) 2019-04-11 11:31:27 +02:00
ceteri.md Submitting `PyTextRank` for inclusion in the spaCy uniVerse (#4942) 2020-01-28 11:37:54 +01:00
charlax.md Add charlax's contributor agreement (#2805) 2018-09-27 12:24:42 +02:00
chezou.md Upadate the document for Unidic link with latest version URL (#3022) 2018-12-07 17:24:48 +01:00
chopeen.md [Closes #5292] Fix typo in option name "--n-save_every" (#5293) 2020-04-11 23:35:01 +02:00
chrisdubois.md Re-add existing contributor agreements 2016-11-09 16:42:02 +01:00
cicorias.md fixes symbolic link on py3 and windows (#2949) 2018-11-24 15:34:23 +01:00
clarus.md Typo (#3865) 2019-06-20 10:31:19 +02:00
clippered.md issue #3012: add test (#3021) 2018-12-18 15:02:49 +01:00
connorbrinton.md 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
coryhurst.md Silent keyword in info function in init (#2459) 2018-06-18 12:24:21 +02:00
cristianasp.md Update stop_words.py in Portuguese (a,o,e) (#6345) 2021-01-08 11:35:38 +11:00
d99kris.md Rename d99kris to d99kris.md 2017-12-17 13:44:55 +01:00
danielhers.md Signed contributor agreement 2017-11-08 16:28:56 +02:00
danielkingai2.md Don't use numpy directly for similarity (#3362) 2019-03-06 22:58:38 +00:00
danielruf.md chore: cache dependencies (#2418) 2018-06-11 00:22:41 +02:00
danielvasic.md Added Multext-East V5 tagset for Croatian language (#6248) 2020-11-05 12:19:22 +01:00
dardoria.md Bulgarian tokenizer exceptions (#7114) 2021-02-19 19:19:19 +01:00
darindf.md Fix error (#2802) 2018-09-26 21:31:03 +02:00
delzac.md Reflect on usage doc that IS_SENT_START attribute exist (#6114) 2020-10-09 10:14:40 +02:00
demfier.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
demongolem.md Update tokenizer.md for construction example (#3790) 2019-06-16 14:32:56 +02:00
dhpollack.md fix typo in svg file 2020-03-05 17:04:33 +01:00
dhruvrnaik.md Fix Span.char_span bug (#6816) 2021-01-26 15:50:37 +08:00
doug-descombaz.md Port over contributor agreements 2018-03-24 17:17:37 +01:00
drndos.md Add Slovak language tools implementation (#4943) 2020-02-03 13:03:59 +01:00
dvsrepo.md Adds contributor agreement dvsrepo 2017-04-07 11:58:28 +02:00
elbaulp.md Changed learning rate by its param name. (#3855) 2019-06-20 10:29:20 +02:00
elben10 Fixes #5413 (#5315) 2020-04-16 13:29:02 +02:00
emulbreh.md Add contributor agreement for emulbreh 2018-02-13 13:40:33 +01:00
enerrio.md add contributor agreement for @enerrio 2018-02-15 12:43:04 -08:00
er-raoniz.md Fix example sentences in Hindi for grammatical errors (#4343) 2019-09-30 23:32:49 +02:00
erip.md Add initial Korean support (#4660) 2019-11-18 12:56:07 +01:00
estr4ng7d.md Marathi Language Support (#3767) 2019-05-24 14:29:42 +02:00
ezorita.md Add stub files for main cython classes (#8427) 2021-08-07 12:30:03 +02:00
filipecaixeta.md Add words to portuguese language _num_words (#2759) 2018-09-14 12:30:16 +02:00
fizban99.md Create fizban99.md (#3601) 2019-04-17 11:22:19 +02:00
florijanstamenkovic.md Fix Issue 6207 (#6208) 2020-10-09 10:14:40 +02:00
forest1988.md Avoid a SyntaxError in self-attentive-parser (#6428) 2020-11-22 21:59:37 +01:00
foufaster.md Create foufaster.md (#3179) 2019-01-21 15:45:54 +01:00
frascuchon.md Include universe spec for spacy-wordnet component (#2919) 2018-11-13 23:54:46 +01:00
free-variation.md Fixed spaCy+Keras example (#2763) 2018-09-15 13:06:39 +02:00
fsonntag.md Add contributer aggreement 2017-11-19 16:30:35 +01:00
fucking-signup.md Add contributor agreement 2018-01-08 03:08:57 +01:00
gandersen101.md Adding spaczz package to universe.json (#5717) 2020-07-07 20:55:24 +02:00
gavrieltal.md Initialize trues to 0.0 in training example (#3004) 2018-12-03 01:33:22 +01:00
giannisdaras.md Greek language optimizations (#2558) 2018-07-18 18:51:38 +02:00
graue70.md Fix typos in comments (#5904) 2020-08-12 15:35:25 +02:00
graus.md adds textpipe to universe (#3500) [ci skip] 2019-03-28 15:13:19 +01:00
greenriverrus.md Added contributor agreement 2017-11-26 22:14:08 +03:00
grivaz.md Introduces a bulk merge function, in order to solve issue #653 (#2696) 2018-09-10 16:41:42 +02:00
gtoffoli.md Added Italian POS-aware lemmatizer. (#8079) 2021-06-16 11:14:45 +02:00
guerda.md Update guerda.md 2020-03-24 10:42:30 +01:00
gustavengstrom.md Adding noun_chunks to the Swedish language model (sv) (#4422) 2019-10-21 12:57:06 +02:00
henry860916.md update response after calling add_pipe (#3661) 2019-05-01 12:02:18 +02:00
hertelm.md Website: fixed the token span in the text about the rule-based matching example (#5669) 2020-06-30 19:58:55 +02:00
himkt.md fix wrong indexing (#2416) 2018-06-19 10:20:57 +02:00
hiroshi-matsuda-rit.md fix a bug causing mis-alignments (#5560) 2020-06-08 15:49:34 +02:00
hlasse.md add textdescriptives to universe 2021-08-13 14:35:18 +02:00
holubvl3.md Create holubvl3 (#5845) 2020-07-30 17:40:31 +02:00
honnibal.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
howl-anderson.md Port over contributor agreements 2018-03-24 17:17:37 +01:00
hugovk.md CLA 2017-11-29 10:25:20 +02:00
iann0036.md Port over contributor agreements 2018-03-24 17:17:37 +01:00
idealley.md Added agrement (#2374) 2018-05-26 18:19:08 +02:00
idoshr.md Hebrew like num (#5952) 2020-08-24 14:30:05 +02:00
iechevarria.md Add n_process to Language.pipe documentation (#4842) [ci skip] 2019-12-29 14:23:33 +01:00
ilivans.md Add ilivans' contributor agreement 2020-05-14 15:59:06 +02:00
ines.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
intrafindBreno.md Create intrafindBreno.md (#3814) 2019-06-03 18:33:09 +02:00
isaric.md Issue #1107 - adds examples.py for Croatian language (#4143) 2019-08-18 23:04:41 +02:00
iurshina.md Fixes typos (#4843) 2019-12-29 14:24:13 +01:00
ivigamberdiev.md Update links and http -> https (#3532) 2019-04-02 17:36:22 +02:00
ivyleavedtoadflax.md Add missing comma to NN example in docs (#2255) 2018-04-28 14:56:00 +02:00
jabortell.md Add jabortell to the contributors (#6422) 2020-11-24 16:15:31 +01:00
jacopofar.md Visual C++ link updated (#2842) (closes #2841) [ci skip] 2018-10-12 14:59:45 +02:00
jacse.md Extend and fix Danish examples (#5227) 2020-04-02 10:42:35 +02:00
janimo.md Update Romanian stopword list (#2316) 2018-05-10 12:16:56 +02:00
jankrepl.md Add agreement 2021-03-09 10:57:32 +01:00
jarib.md Add three missing tags from the `nb` tag map (#3085) 2018-12-27 14:48:40 +01:00
jaydeepborkar.md Update stop_words.py and add name in contributors (#4325) 2019-09-27 11:57:27 +02:00
jbesomi.md Add texthero to universe.json (#5716) 2020-07-07 20:54:22 +02:00
jeannefukumaru.md fix typos in tag_map flagged by `python -m debug-data` (#3542) 2019-04-05 12:06:09 +02:00
jenojp.md Raise error if annotation dict in simple training style has unexpected keys #4074 (#4079) 2019-08-06 11:01:25 +02:00
jerbob92.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
jganseman.md Create jganseman.md 2021-01-26 11:02:31 +01:00
jgutix.md Update suffixes example (#5989) 2020-08-31 12:44:56 +02:00
jimregan.md CLA 2017-06-26 21:32:48 +01:00
jklaise.md Update load_lookups return type and docstring (#7907) 2021-04-27 09:13:39 +02:00
jmargeta.md Add contributor agreement for jmargeta 2020-10-16 00:38:42 +02:00
jmyerston.md Added ancient Greek language support (#8606) 2021-07-15 10:27:17 +02:00
johnhaley81.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
jonesmartins.md Add missing pronoums/determiners (#5569) 2020-06-10 18:47:04 +02:00
juliamakogon.md Ukrainian language added. Small fixes in Russian (#3241) 2019-02-07 21:05:11 +01:00
julien-talkair.md add spacy contributor agreement 2021-07-01 17:41:12 +02:00
juliensalinas.md Sign contributors agreement. 2021-05-14 11:00:27 +02:00
jumasheff.md Add contributor agreement 2021-01-25 00:34:12 +06:00
justindujardin.md Port over contributor agreements 2018-03-24 17:17:37 +01:00
kabirkhan.md Add optional `id` property to EntityRuler patterns (#3591) 2019-06-16 13:29:04 +02:00
katarkor.md changed tag_map, morph_rules, lemmatizer for Norwegian (#2565) 2018-07-19 19:38:24 +02:00
katrinleinweber.md Formalise citation info (#2167) 2018-03-30 10:34:14 +02:00
kbulygin.md Fix the first `nlp` call for `ja` (closes #2901) (#3065) 2018-12-18 15:01:06 +01:00
keshan.md Adding basic support for Sinhala language. (#2788) 2018-09-25 12:18:25 +02:00
keshav.md Spacy Cli info method causing backward compatibility issues (#6793) 2021-01-23 11:21:43 +01:00
kevinlu1248.md Create kevinlu1248.md 2020-05-19 20:25:45 -07:00
khellan.md Norwegian tweaks (#3894) 2019-07-08 10:28:47 +02:00
kimfalk.md agreeing to the contributor agreement. 2017-12-19 15:31:52 +01:00
knoxdw.md Test and fix for Issue #2219 (#2272) 2018-05-03 18:40:46 +02:00
koaning.md add "whatlies" to spaCy universe (#5252) 2020-04-06 11:29:30 +02:00
kognate.md Added support for serializing overwrite and ent_id_sep (#3918) 2019-07-08 17:28:28 +02:00
kororo.md Add ExcelCy into Universe list (#2572) 2018-07-19 19:28:33 +02:00
kowaalczyk.md Improved polish tokenizer and stop words. (#2974) 2019-02-08 14:27:21 +11:00
kwhumphreys.md add agreement 2018-01-03 13:00:14 -08:00
laszabine.md Amend documentation to Language.evaluate (#5319) 2020-04-16 20:00:18 +02:00
lauraBaakman.md Fix contributor agreement 2019-02-07 20:56:13 +01:00
ldorigo.md Submit contributor agreement (#3705) 2019-05-10 14:19:18 +02:00
leicmi.md Remove duplicated branch in if/else-if statement (#5234) 2020-04-02 14:47:42 +02:00
leomrocha.md contributor agreement signed (#5525) 2020-05-31 20:13:39 +02:00
leyendecker.md Fix on EntityRendered to support break lines (after last entity) (closes #5838) 2020-07-29 18:48:39 +02:00
lfiedler.md issue5230: added contributors agreement 2020-04-06 21:04:06 +02:00
ligser.md Fill contributer agreement 2017-11-11 11:39:31 +03:00
lizhe2004.md fix the wrong hash url in adding-languages.md file (#5810) 2020-07-25 13:13:38 +02:00
lorenanda.md add new Romanian stopwords (#6621) 2021-01-08 11:34:47 +11:00
louisguitton.md Add mlflow to spaCy universe (#5352) 2020-04-29 10:18:03 +02:00
luvogels.md Update luvogels.md 2017-04-27 10:42:07 +02:00
mabraham.md Tokenizer to_disk and from_disk now ensure paths (#5116) 2020-03-08 13:25:56 +01:00
magnusburton.md Initial commit for Swedish 2016-12-20 11:05:06 +01:00
mahnerak.md Create mahnerak.md (#5615) 2020-06-20 11:14:26 +02:00
mariosasko.md Add TakeLab/spacy-udpipe to Universe (#8698) 2021-07-16 11:15:52 +02:00
markulrich.md Use correct local parameter in example MyComponent (and added markulrich.md contributor file) 2017-11-22 15:59:08 -08:00
mauryaland.md Update stop_words.py for French language (#2310) 2018-05-09 12:04:38 +02:00
mbkupfer.md added contributor agreement for mbkupfer (#2738) 2018-09-10 11:32:03 +02:00
mdaudali.md Correct typo for AllenAI url on homepage (#4050) 2019-07-31 00:16:33 +02:00
mdcclv.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
mdda.md Create mdda.md 2017-12-18 18:09:27 +08:00
meghanabhange.md Project Idea : denomme | Multilingual Name Detection (#7845) 2021-04-22 08:48:17 +02:00
melanuria.pdf Add contributor agreement (see #1672) 2017-12-20 22:00:12 +01:00
merrcury.md Create merrcury.md 2020-03-10 15:11:07 +05:30
michael-k.md Add `!=3.4.*` to python_requires (#5344) 2020-04-27 22:02:09 +02:00
mihaigliga21.md adding Romanian tag_map (#4257) 2019-09-09 11:53:09 +02:00
mikeizbicki.md fix bug in Korean language, resulting in 100x speedup by reducing overhead of mecab (#5701) 2020-07-06 17:03:33 +02:00
mikelibg.md Removed space in docs + added contributor indo (#2909) 2018-11-08 14:18:25 +01:00
mirfan899.md Add Urdu Language Support (#2430) 2018-06-22 11:14:03 +02:00
miroli.md Remove incorrect lemma lookup gäng->gänga (#2252) 2018-04-28 14:54:41 +02:00
mmaybeno.md Agnostic vocab array fix (#4680) 2019-11-23 14:59:52 +01:00
mn3mos.md #2211 - Support for ssl certs config on download command (#2212) 2018-05-03 18:37:02 +02:00
mollerhoj.md Add Danish lemmatizer (#2184) 2018-04-07 19:07:28 +02:00
moreymat.md Support CUDA 10 (#3126) 2019-01-09 03:10:45 +01:00
mpszumowski.md Fix bug in CLI iob and ner converter (#2392) (fixes #2385) 2018-05-30 12:28:44 +02:00
mpuig.md Catalan Language Support (#2940) 2018-11-26 15:25:47 +01:00
mr-bjerre.md Fix link to user hooks in docs (#4778) 2019-12-06 19:17:12 +01:00
msklvsk.md fix UD data file extensions (#2425) 2018-06-08 14:26:11 +02:00
munozbravo.md Overwrites default getter for like_num in Spanish by adding _num_words and like_num to lex_attrs.py (#3810) (closes #3803)) 2019-06-02 12:22:57 +02:00
myavrum.md Create myavrum.md (#5612) 2020-06-19 18:34:27 +02:00
narayanacharya6.md Address missing config overrides post load of models (#8208) 2021-05-31 18:36:52 +10:00
neelkamath.md Add "spaCy Server" to spaCy Universe (#4553) 2019-10-30 13:20:46 +01:00
nikhilsaldanha.md Add kannada examples (#5162) 2020-03-29 13:54:42 +02:00
nipunsadvilkar.md Incorrect Token attribute ent_iob_ description (#3800) 2019-05-31 16:50:45 +02:00
njsmith.md When calling getoption() in conftest.py, pass a default option (#2709) 2018-09-03 09:57:52 +02:00
nlptown.md Improved Dutch language resources and Dutch lemmatization (#3409) 2019-04-03 14:13:26 +02:00
nourshalabi.md Additions to Arabic stop words. (#2422) 2018-06-08 02:33:23 +02:00
nsorros.md Add logger debug for project push and pull (#8860) 2021-08-02 18:13:53 +02:00
ohenrik.md Added contributors agreement 2018-01-25 11:05:29 +01:00
onlyanegg.md Fix for Issue 4665 - conllu2json (#4953) 2020-02-03 13:01:48 +01:00
ophelielacroix.md Add (noun chunks) syntax iterators for Danish (#6246) 2021-01-07 16:33:00 +11:00
oroszgy.md Accepted contributor agreement. 2016-12-26 22:37:02 +01:00
osori.md Very minor issues in Korean example sentences (#5446) 2020-05-17 13:43:34 +02:00
ottosulin.md Port over contributor agreements 2018-03-24 17:17:37 +01:00
oxinabox.md squashme 2018-02-09 23:19:11 +08:00
ozcankasal.md trilyon forgotten (#3083) 2018-12-27 14:44:23 +01:00
paoloq.md Matcher support for Span as well as Doc (#5113) 2020-04-15 13:51:33 +02:00
pberba.md Update `vocab.get_vector` docs to include features on Fasttext ngram (#4464) 2019-10-20 01:28:18 +02:00
pbnsilva.md Adds contributor agreement 2018-01-11 17:40:12 +01:00
peter-exos.md Run PhraseMatcher on Spans (#6918) 2021-02-10 23:43:32 +11:00
phiedulxp.md update lang/zh (#4103) 2019-08-12 10:37:48 +02:00
philipvollet.md Add projects to spaCy Universe (#9269) 2021-09-23 10:56:45 +02:00
phojnacki.md agreement of contributor, may I introduce a tiny pl languge contribution (#2799) 2018-09-27 12:25:22 +02:00
pickfire.md Add myself to contributors (#3575) 2019-04-11 11:31:04 +02:00
pinealan.md Fill in contributor agreement 2020-03-15 03:45:20 +08:00
pktippa.md Added pktippa contributor agreement 2018-02-07 15:37:28 +05:30
plison.md adding skweak to the SpaCy universe 2021-04-22 01:16:34 +02:00
pmbaumgartner.md contributor agreement 2019-07-14 20:46:06 -04:00
polm.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
prilopes.md Bugfix/dep matcher issue 4590 (#4601) 2019-11-07 12:01:06 +01:00
punitvara.md This PR adds Gujarati Language class along with (#5355) 2020-04-27 11:07:37 +02:00
pzelasko.md Less norm computations in token similarity (#2730) 2018-09-05 21:50:23 +02:00
questoph.md Fix basic language support for Luxembourgish (by adding punctuation.py) (#4648) 2019-11-15 16:16:47 +01:00
rafguns.md Add contributor agreement 2020-12-14 22:01:14 +01:00
rahul1990gupta.md Hindi: Adds tests for lexical attributes (norm and like_num) (#5829) 2020-10-07 10:23:32 +02:00
ramananbalakrishnan.md Support single value for attribute list in doc.to_array 2017-10-19 17:00:41 +05:30
rameshhpathak.md Add Nepali Language (#5622) 2020-06-22 10:25:46 +02:00
rasyidf.md Update Indonesian Example Phrases (#6124) 2020-09-23 14:02:26 +02:00
reneoctavio.md fix: Fix textcat labels to expect a Optional[Iterable[str]] instead of Optional[Dict] (#6911) 2021-02-04 23:37:13 +01:00
retnuh.md Update call to `mkdir()` to create the parents (#3139) 2019-01-11 03:02:18 +01:00
revuel.md Update universe.json (include PatternOmatic) (#6399) 2020-11-19 13:15:50 +01:00
richardliaw.md contribute (#5632) 2020-06-23 08:53:58 +02:00
richardpaulhudson.md Request to include Holmes in spaCy Universe (#3685) 2019-05-08 02:42:03 +02:00
robertsipek.md Fill contributor agreement by robertsipek (#6285) 2020-10-22 22:13:17 +02:00
rokasramas.md Lithuanian language support (#3895) 2019-07-08 10:25:22 +02:00
roshni-b.md updates for Bengali language (#3286) 2019-02-18 10:02:28 +01:00
ryanzhe.md biluo_tags_from_offsets throw exception for overlapping entities (#4021) 2019-08-15 18:13:32 +02:00
sabiqueqb.md Gh 5339 language class for malayalam (#5342) 2020-04-27 09:45:08 +02:00
sainathadapa.md Basic support for Telugu language (#2751) 2018-09-10 11:53:18 +02:00
sammous.md Updating description and code snippet spacy-lefff (#2623) 2018-08-02 17:25:27 +02:00
savkov.md Renamed the file 2018-01-11 17:49:29 +00:00
seanBE.md add return_matches and as_tuples back to Matcher.pipe (#4303) 2019-09-18 22:00:33 +02:00
sebastienharinck.md contrib: add contributor agreement for user sebastienharinck (#5316) 2020-04-16 11:32:09 +02:00
sevdimali.md Azerbaijani language added (#7911) 2021-04-28 14:42:02 +02:00
shigapov.md added spaCyOpenTapioca (#9181) 2021-09-11 13:16:51 +09:00
shuvanon.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
skrcode.md Restore contributor agreement 2018-03-31 14:06:37 +02:00
sloev.md add spacy_syllables to universe + sign contributor agreement 2020-03-13 18:09:42 +01:00
snsten.md Added support for Sanskrit language (#5956) 2020-08-25 10:56:29 +02:00
socool.md Update Thai tokenizer_exception list (#3529) 2019-04-03 09:13:36 +02:00
solarmist.md Mark Japanese documents as tagged. (#5803) 2020-07-23 08:57:01 +02:00
sorenlind.md Add contributor agreement. 2017-11-24 15:29:54 +01:00
suchow.md Re-add existing contributor agreements 2016-11-09 16:42:02 +01:00
svlandeg.md Fix small typo bug in French regexp + relevant unit test (#2980) 2018-11-29 20:16:13 +01:00
swfarnsworth.md Refactor dependencymatcher.pyx to use list comps and enumerate. (#8956) 2021-08-18 09:55:45 +02:00
tamuhey.md Fix iss4278 (#4279) 2019-09-12 10:44:49 +02:00
therealronnie.md Addresses Issue #2228 - Deserialization fails when using tensor=False or sentiment=False (#2230) 2018-05-01 13:40:22 +02:00
theudas.md Added Parameter to NEL to take n sentences into account (#5548) 2020-06-12 02:03:23 +02:00
thomasbird.md Add SCA for @thomasbird (#6576) 2020-12-15 20:59:47 +01:00
thomashacker.md Fix preservation of spacy package meta (#8663) 2021-07-12 11:18:52 +02:00
thomasopsomer.md add contributor agreement 2018-01-28 20:12:05 +01:00
thomasthiebaud.md Add spacy_fastlang to universe (#5271) 2020-04-15 13:50:46 +02:00
thoppe.md Added author information for NLPre (#5414) 2020-05-08 11:28:54 +02:00
tiangolo.md 📄 Add spaCy Contributor Agreement 2020-07-01 20:57:21 +02:00
tilusnet.md Create tilusnet.md (#5914) 2020-08-12 22:46:08 +02:00
tjkemp.md Enhancement/lang fi examples (#2547) 2018-07-15 09:50:27 +02:00
tmetzl.md Merge branch 'master' into develop [ci skip] 2019-03-11 12:23:24 +01:00
tokestermw.md added contributor agreement 2017-11-17 17:27:20 -08:00
tommilligan.md Limit to cupy-cuda v8, so as not to pull in v9 automatically. (#5194) 2020-03-29 13:52:08 +02:00
trungtv.md Add support for Vietnamese in spaCy by leveraging Pyvi, an external Vietnamese tokenizer (#2155) 2018-03-29 12:19:51 +02:00
tupui.md SCA tupui 2021-01-29 15:46:53 +01:00
tyburam.md Lex _attrs for polish language (#2750) 2018-09-10 11:53:57 +02:00
tzano.md Add Arabic language (#2314) 2018-05-15 00:27:19 +02:00
ujwal-narayan.md Enhancing Kannada language Resources (#3755) 2019-05-20 12:56:10 +02:00
umarbutler.md Fixed Typo in Warning (#5284) 2020-04-09 15:46:15 +02:00
ursachec.md Add contributor agreement for @ursachec 2018-02-13 20:49:42 +01:00
uwol.md added contributor agreement 2017-11-05 12:33:43 +01:00
veer-bains.md Fixed syntax error in lang/ko when using python 2 (#4082) (closes #4068) 2019-08-05 10:19:32 +02:00
vha14.md add oprd to the list of accepted deps for noun chunking (#6302) 2020-11-05 09:17:35 +01:00
vikaskyadav.md Create vikaskyadav.md (#2621) 2018-08-02 14:03:44 +02:00
vishnumenon.md Fix the code for FACILITIY entities (#2324) 2018-05-12 15:19:17 +02:00
vishnupriyavr.md Limiting noun_chunks for specific languages (#5396) 2020-05-14 12:58:06 +02:00
vondersam.md Swedish like_num (#5371) 2020-04-29 21:25:22 +02:00
vsolovyov.md Re-add existing contributor agreements 2016-11-09 16:42:02 +01:00
w4nderlust.md Added Ludwig among the projects (#3548) [ci skip] 2019-04-07 13:01:26 +02:00
wallinm1.md [finnish] Add contributor file 2017-02-04 13:54:10 +02:00
walterhenry.md User contributor agreement 2020-10-19 16:25:09 +02:00
wannaphongcom.md Update Thai tag map (#3480) 2019-03-25 16:53:26 +01:00
werew.md DependencyMatcher improvements (fix #6678) (#6744) 2021-01-22 11:20:08 +11:00
willismonroe.md Port over contributor agreements 2018-03-24 17:17:37 +01:00
willprice.md Improve random prefix generation in displaCy arcs (#3096) 2018-12-27 14:46:02 +01:00
wojtuch.md User correct variable name in the examples (#2664) 2018-08-13 22:21:24 +02:00
wxv.md Fix is_ascii documentation and create contributor file (#2988) 2018-11-30 15:57:58 +01:00
x-ji.md Fix venv command examples (#2560) [ci skip] 2018-07-18 10:31:24 +02:00
xadrianzetx.md Raise custom error in EntityLinker when KB is not set (#8442) 2021-06-25 23:04:00 +02:00
xssChauhan.md Change default output format from `jsonl` to `json` for cli convert (#3583) (closes #3523) 2019-04-12 11:31:23 +02:00
yanaiela.md Custom entity render (#4117) 2019-08-16 18:39:25 +02:00
yaph.md Create yaph.md so I can contribute (#3658) 2019-04-29 19:43:06 +02:00
yashpatadia.md Add test file for issue (#3625) and spacy contributor agreement 2019-07-11 14:53:14 +05:30
yohasebe.md Create yohasebe.md 2021-07-04 08:57:04 +09:00
yosiasz.md Add Amharic አማርኛ Language support (#6583) 2020-12-22 16:50:34 +01:00
yuukos.md Port over contributor agreements 2017-10-24 20:13:34 +02:00
zaibacu.md Website (Universe): An entry for rita-dsl (#6138) 2020-10-09 10:14:40 +02:00
zhuorulin.md Bugfix/fix wikidata train entity linker (#4509) 2019-10-24 12:52:59 +02:00
zqhZY.md add contributors.md 2017-12-28 18:04:52 +08:00
zqianem.md Fix typo in documentation (#4322) 2019-09-25 19:42:18 +02:00