spaCy

Commit Graph

Author	SHA1	Message	Date
Matthew Honnibal	7441fce7ba	Fix undefined variable in conllu script	2018-02-26 14:59:56 +01:00
Matthew Honnibal	f0478635df	Fix Japanese tokenizer flag	2018-02-26 10:32:12 +01:00
Matthew Honnibal	5faae803c6	Add option to not use Janome for Japanese tokenization	2018-02-26 09:39:46 +01:00
Matthew Honnibal	9b406181cd	Add Chinese.Defaults.use_jieba setting, for UD	2018-02-25 15:12:38 +01:00
Matthew Honnibal	9e960d24fc	Refactor conllu script, fix interface, generalize	2018-02-25 14:54:47 +01:00
Matthew Honnibal	551c93fe01	Shuffle data after each epoch. Improve script	2018-02-25 13:35:32 +01:00
Matthew Honnibal	bdb0174571	Update conllu training script	2018-02-25 13:12:39 +01:00
Matthew Honnibal	e09070eca7	Refactor conllu script	2018-02-25 12:50:29 +01:00
Matthew Honnibal	44e496a82e	Refactor conllu script	2018-02-25 12:48:22 +01:00
Matthew Honnibal	c388833ca6	Minibatch by number of tokens, support other vectors, refactor CoNLL printing	2018-02-25 10:38:06 +01:00
Matthew Honnibal	dd78ef066a	Unset data size limit in conll script	2018-02-24 18:14:57 +01:00
Matthew Honnibal	8adeea3746	Generalize conllu script. Now handling Chinese (maybe badly)	2018-02-24 16:04:27 +01:00
Matthew Honnibal	329b14c9e6	Clean up conllu script	2018-02-24 10:31:53 +01:00
Matthew Honnibal	5be092ee72	CONLLU scoring 80.9% UAS with no oracle segments	2018-02-23 23:49:17 +01:00
Matthew Honnibal	23236340f4	Update CoNLL script. Don't preset SBD. Set batch size to 8, avoid writing twice	2018-02-22 21:35:50 +01:00
Matthew Honnibal	a26e399f84	Update conllu script	2018-02-22 19:43:54 +01:00
Matthew Honnibal	001e2ec6d6	Refactor CoNLL training script	2018-02-22 16:00:34 +01:00
Matthew Honnibal	6a27a4f77c	Set accelerating batch size in CONLL train script	2018-02-21 21:02:41 +01:00
Matthew Honnibal	4dc0fc9954	Replace labels that didn't make freq cutoff	2018-02-21 15:59:22 +01:00
Matthew Honnibal	97164b1763	Fix conllu script	2018-02-21 14:46:54 +01:00
Matthew Honnibal	24fb2c246f	Add script to do conllu training	2018-02-21 13:53:59 +01:00
Matthew Honnibal	00557c5fdd	Add example of NER multitask objective	2018-01-21 19:46:37 +01:00
avinash	b379c9d7d3	typos corrected	2018-01-03 16:54:22 +05:30
mpuels	1e8147aec7	fix: Add missing period in train data	2017-12-13 10:51:05 +01:00
mpuels	ee4d6fdd40	Fix typo in comment	2017-12-09 13:14:57 +01:00
ines	726fb2d0b5	Use fewer iterations by default to avoid overfitting on blank model (resolves #1632)	2017-11-23 15:27:12 +01:00
ines	ec08996000	Add note on tags matching tokenization (see #1613)	2017-11-20 15:12:47 +01:00
ines	1a38575de3	Make example Python 2 compatible (see #1617)	2017-11-20 13:57:51 +01:00
ines	7d5afadf5e	Update vectors_loc description	2017-11-17 14:57:11 +01:00
ines	c57e05bec1	Make sure nr_dim is an int. In some languages (e.g. Dutch), the nr_dim is extracted as a byte string, causing an error down the line.	2017-11-17 14:56:27 +01:00
yogendrasoni	334ed433b2	rstrip line before rsplit. Loading English fastText gives an error because the line contains a newline at the end and rsplit splits it incorrectly.	2017-11-15 13:55:08 +05:30
Matthew Honnibal	f0e28e8ae5	Make fasttext reader accommodate whitespace	2017-11-12 12:07:13 +01:00
ines	f36fab39b0	Don't rename component in intent parser example (resolves #1551). Otherwise, the default saved model won't know that it's supposed to create spaCy's 'parser'.	2017-11-10 23:35:38 +01:00
Ines Montani	1a23a0f87e	Remove broken link (resolves #1541)	2017-11-10 12:28:39 +01:00
ines	3597a29c24	Update fastText vectors example (see #1525). Add option to specify language, and add note on "lang" being required to save out model	2017-11-09 14:54:39 +01:00
ines	33b84f4c39	Change clear_vectors to reset_vectors (resolves #1516)	2017-11-08 18:11:23 +01:00
ines	89bd40b821	Fix print statement in textcat training example (resolves #1515)	2017-11-08 17:17:40 +01:00
ines	a09c096d3c	Get docs ready for v2.0.0	2017-11-07 12:00:43 +01:00
ines	173b1551af	Update examples	2017-11-07 01:22:30 +01:00
ines	1b1c9105b4	Update example compatibility statements	2017-11-07 01:11:45 +01:00
ines	8fb48b9b91	Update and document new util functions	2017-11-07 00:22:43 +01:00
Matthew Honnibal	d7016d4050	Update intent parser example	2017-11-06 23:31:11 +01:00
ines	fe498b3d5e	Update training examples to use "simple style"	2017-11-06 23:14:04 +01:00
ines	c646365e2f	Port over changes and add note on compat (see #1445)	2017-11-06 13:58:34 +01:00
ines	2dca9e71a1	Add notes on catastrophic forgetting (see #1496)	2017-11-06 13:17:02 +01:00
Matthew Honnibal	717e8124fb	Update Keras sentiment analysis example	2017-11-05 17:11:00 +01:00
Matthew Honnibal	cfb83c231c	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-11-04 23:08:19 +01:00
Matthew Honnibal	ba0201de07	Update multiprocessing example	2017-11-04 23:07:57 +01:00
ines	70a9504560	Add inbetween print statement	2017-11-04 23:06:55 +01:00
Matthew Honnibal	e033162a1d	Update tagger training example	2017-11-01 21:49:08 +01:00

230 Commits