spaCy

Commit Graph

Author	SHA1	Message	Date
Ines Montani	f0b30aedad	Make lemmatizers use initialize logic (#6182) * Make lemmatizer use initialize logic and tidy up * Fix typo * Raise for uninitialized tables	2020-10-02 15:42:36 +02:00
Ines Montani	df06f7a792	Update docs [ci skip]	2020-10-02 13:24:33 +02:00
Ines Montani	d2aa662ab2	Merge pull request #6179 from adrianeboyd/feature/token-morph-refactor-2 [ci skip]	2020-10-02 12:10:27 +02:00
Ines Montani	0f11c2150d	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2020-10-02 11:38:05 +02:00
Ines Montani	32cdc1c4f4	Update docs [ci skip]	2020-10-02 11:38:03 +02:00
Ines Montani	c41a4332e4	Add test for custom data augmentation	2020-10-02 11:37:56 +02:00
Ines Montani	6d8df081bd	Merge pull request #6180 from adrianeboyd/docs/minor-v3-2 [ci skip]	2020-10-02 11:37:25 +02:00
Ines Montani	3856048437	Merge pull request #6178 from explosion/feature/file-readers Integrate file readers via srsly, update orth_variants loading	2020-10-02 10:26:09 +02:00
Adriane Boyd	f83dfe62da	Fix test	2020-10-02 10:17:26 +02:00
Adriane Boyd	351f352cdc	Update Japanese docs and pin for sudachipy	2020-10-02 10:12:44 +02:00
Adriane Boyd	7670df04dd	Update Chinese usage docs	2020-10-02 10:09:03 +02:00
Adriane Boyd	3908fff899	Remove tag map sidebar	2020-10-02 09:07:55 +02:00
Adriane Boyd	fd09e6b140	Update docs for Token.morph / Token.set_morph	2020-10-02 09:05:15 +02:00
Adriane Boyd	65dfaa4f4b	Also accept MorphAnalysis in set_morph	2020-10-02 08:33:43 +02:00
Adriane Boyd	77e08c398f	Switch reset value for set_morph to None	2020-10-02 08:25:15 +02:00
Ines Montani	568768643e	Increment version [ci skip]	2020-10-02 01:50:13 +02:00
Ines Montani	01c1538c72	Integrate file readers	2020-10-02 01:36:06 +02:00
Ines Montani	af282ae732	Fix import	2020-10-02 01:12:34 +02:00
Ines Montani	e59ecb12c0	Auto-format	2020-10-02 01:12:30 +02:00
Ines Montani	6b94cee468	Fix docs [ci skip]	2020-10-02 01:11:19 +02:00
Matthew Honnibal	75a1569908	Merge	2020-10-01 23:07:53 +02:00
Matthew Honnibal	300e5a9928	Avoid relying on NORM in default v3 models (#6176) * Allow CharacterEmbed to specify feature * Default to LOWER in character embed * Update tok2vec * Use LOWER, not NORM	2020-10-01 23:05:55 +02:00
Ines Montani	50162b8726	Try to work around Sharp build issue [ci skip]	2020-10-01 22:27:45 +02:00
Ines Montani	5762876dcc	Update default config [ci skip]	2020-10-01 22:27:37 +02:00
Adriane Boyd	86c3ec9c2b	Refactor Token morph setting (#6175) * Refactor Token morph setting * Remove `Token.morph_` * Add `Token.set_morph()` * `0` resets `token.c.morph` to unset * Any other values are passed to `Morphology.add` * Add token.morph setter to set from MorphAnalysis	2020-10-01 22:21:46 +02:00
Matthew Honnibal	b854bca15c	Default to LOWER in character embed	2020-10-01 22:17:58 +02:00
Matthew Honnibal	684a77870b	Allow CharacterEmbed to specify feature	2020-10-01 22:17:26 +02:00
Ines Montani	da30701cd1	Increment version [ci skip]	2020-10-01 21:58:11 +02:00
Ines Montani	d48ddd6c9a	Remove default initialize lookups	2020-10-01 21:54:33 +02:00
Ines Montani	1700c8541e	Increment version [ci skip]	2020-10-01 17:57:16 +02:00
Ines Montani	b6b73a3ca8	Update docs [ci skip]	2020-10-01 17:45:29 +02:00
Ines Montani	f2627157c8	Update docs [ci skip]	2020-10-01 17:38:17 +02:00
Ines Montani	7f68f4bd92	Hide jsonl_loc on init vectors and tidy up [ci skip]	2020-10-01 16:44:17 +02:00
Adriane Boyd	27cbffff1b	Minor edit to CoNLL-U converter (#6172) This doesn't make a difference given how the `merged_morph` values override the `morph` values for all the final docs, but could have led to unexpected bugs in the future if the converter is modified.	2020-10-01 16:23:42 +02:00
Sofie Van Landeghem	a22215f427	Add FeatureExtractor from Thinc (#6170) * move featureextractor from Thinc * Update website/docs/api/architectures.md Co-authored-by: Ines Montani <ines@ines.io> * Update website/docs/api/architectures.md Co-authored-by: Ines Montani <ines@ines.io> Co-authored-by: Ines Montani <ines@ines.io>	2020-10-01 16:22:48 +02:00
Adriane Boyd	73538782a0	Switch Doc.__init__(ents=) to IOB tags (#6173) * Switch Doc.__init__(ents=) to IOB tags * Fix check for "-" * Allow "" or None as missing IOB tag	2020-10-01 16:22:18 +02:00
Adriane Boyd	df98d3ef9f	Update import from collections.abc (#6174)	2020-10-01 16:21:49 +02:00
Ines Montani	0a8a124a6e	Update docs [ci skip]	2020-10-01 12:15:53 +02:00
Ines Montani	44160cd52f	Tidy up [ci skip]	2020-10-01 10:41:19 +02:00
Ines Montani	381258b75b	Merge pull request #6165 from explosion/feature/update-tokenizers-initialize	2020-10-01 09:49:47 +02:00
Ines Montani	4b6afd3611	Remove English [initialize] default block for now to get tests to pass	2020-09-30 23:49:29 +02:00
Ines Montani	6f29f68f69	Update errors and make Tokenizer.initialize args less strict	2020-09-30 23:48:47 +02:00
Ines Montani	a103ab5f1a	Update augmenter lookups and docs	2020-09-30 23:03:47 +02:00
Matthew Honnibal	5128298964	Add missing augmenter	2020-09-30 20:18:45 +02:00
Matthew Honnibal	59294e91aa	Restore the 'jsonl' arg for init vectors The lexemes.jsonl file is still used in our English vectors, and it may be required by users as well. I think it's worth supporting the option.	2020-09-30 19:06:50 +02:00
Matthew Honnibal	c379a4274a	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2020-09-30 16:52:42 +02:00
Matthew Honnibal	e58dca3028	Add read_labels	2020-09-30 16:52:27 +02:00
Ines Montani	115481aca7	Update docs [ci skip]	2020-09-30 15:16:00 +02:00
Ines Montani	23c63eefaf	Tidy up env vars [ci skip]	2020-09-30 15:15:11 +02:00
Adriane Boyd	6b7bb32834	Refactor Chinese initialization	2020-09-30 11:46:45 +02:00

13450 Commits (All Branches)