14 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Matthew Honnibal | 8c8f5c62c6 | Add LANG attribute to English and German | 2016-10-18 18:52:48 +02:00 |
| Matthew Honnibal | e56653f848 | Add language data for German | 2016-09-25 15:44:45 +02:00 |
| Matthew Honnibal | 7db956133e | Move tokenizer data for German into spacy.de.language_data | 2016-09-25 15:37:33 +02:00 |
| Matthew Honnibal | 95aaea0d3f | Refactor so that the tokenizer data is read from Python data, rather than from disk | 2016-09-25 14:49:53 +02:00 |
| Matthew Honnibal | fd65cf6cbb | Finish refactoring data loading | 2016-09-24 20:26:17 +02:00 |
| Wolfgang Seeker | 92bfbebeec | remove unnecessary imports | 2016-05-02 17:33:22 +02:00 |
| Wolfgang Seeker | 857454ffa0 | fix indentation -.- | 2016-05-02 17:10:41 +02:00 |
| Wolfgang Seeker | dae6bc05eb | define German dummy lemmatizer until morphology is done | 2016-05-02 16:04:53 +02:00 |
| Henning Peters | a7d7ea3afa | first idea for supporting multiple langs in download script | 2016-03-24 11:19:43 +01:00 |
| Wolfgang Seeker | 690c5acabf | adjust train.py to train both english and german models | 2016-03-03 15:21:00 +01:00 |
| Henning Peters | 9027cef3bc | access model via sputnik | 2015-12-07 06:01:28 +01:00 |
| Matthew Honnibal | 528e26a506 | * Add rule to ensure ordinals are preserved as single tokens | 2015-09-22 12:26:05 +10:00 |
| Matthew Honnibal | dbb48ce49e | * Delete extra wordnets | 2015-09-13 10:31:37 +10:00 |
| Matthew Honnibal | 2154a54f6b | * Add spacy.de | 2015-09-06 21:56:47 +02:00 |