Commit Graph

4528 Commits

Author SHA1 Message Date
Ines Montani ea6c85c67a Merge pull request #1566 from MathiasDesch/master (resolves #1248)
Add exceptions to tokenizer and norm
2017-11-13 19:05:22 +01:00
Matthew Honnibal 1b348389bb Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-13 18:18:48 +01:00
Matthew Honnibal ca73d0d8fe Cleanup states after beam parsing, explicitly 2017-11-13 18:18:26 +01:00
Matthew Honnibal 63ef9a2e73 Remove __dealloc__ from ParserBeam 2017-11-13 18:18:08 +01:00
Mathias Deschamps c0691b2ab4 Add tokenizer exceptions for ing verbs
Extend list of tokenizing exceptions introduced in 123810b
2017-11-13 17:46:05 +01:00
Mathias Deschamps 288298ead9 Add norm exception for ing verbs
Some ing verbs are sometimes written in or in'. Make the NORM form correct
2017-11-13 17:46:05 +01:00
Abhinav Sharma 59f5740ede improved upon the list of included stop_words 2017-11-13 17:13:49 +05:30
Matthew Honnibal 6e641f46d4 Create a preprocess function that gets bigrams 2017-11-12 00:43:41 +01:00
Matthew Honnibal c9251d79e3 Edit comment 2017-11-11 18:38:32 +01:00
Matthew Honnibal dd1678eab3 Edit comment 2017-11-11 18:37:08 +01:00
Roman Domrachev ee60a52ee7 Fix test imports and last batch cleanup 2017-11-11 11:32:16 +03:00
Roman Domrachev 4a6b094e09 Remove unused import 2017-11-11 03:13:05 +03:00
Roman Domrachev 3c600adf23 Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
ines ee97fd3cb4 Add regression test for #1547 2017-11-11 00:14:03 +01:00
ines 2df27db671 Add unicode declaration 2017-11-11 00:13:56 +01:00
ines 35653bef3a Add missing import (fixes #1546) 2017-11-10 19:05:18 +01:00
ines 4c5d2c80d5 Re-add python -m to commands, too brittle :( (see #1536) 2017-11-10 02:30:55 +01:00
ines 123810b6de Add "lovin'" to tokenizer exceptions (see #1248) 2017-11-09 17:09:30 +01:00
ines 1c218397f6 Ensure path in Doc.to_disk/from_disk (resolves ##1521)
Also add Doc serialization tests with both Path and string path options
2017-11-09 02:29:03 +01:00
Matthew Honnibal 49fd5a646f Set version for 2.0.2 release 2017-11-08 22:39:39 +01:00
Matthew Honnibal fba2dbddf7 Increment version 2017-11-08 22:19:08 +01:00
Matthew Honnibal a5ea0fdf5a Fix #1518: vocab.vectors.resize() didn't work 2017-11-08 22:18:37 +01:00
Matthew Honnibal de45702bbe Strip dev suffixes from version for compatibility check 2017-11-08 18:40:21 +01:00
Matthew Honnibal 51639214a1 Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-08 18:04:33 +01:00
Matthew Honnibal a2f980de4e Exclude .devN versioning from compatibility check 2017-11-08 18:03:52 +01:00
Daniel Hershcovich d7ae54ff44 Fix typo in message 2017-11-08 16:06:28 +02:00
Matthew Honnibal 4194bc5744 Xfail flakey serialization test 2017-11-08 13:55:13 +01:00
Matthew Honnibal d5537e5516 Work on Windows test failure 2017-11-08 13:25:18 +01:00
Matthew Honnibal c27c82d5f9 Fix serialization 2017-11-08 13:08:48 +01:00
Matthew Honnibal 1d5599cd28 Fix dtype 2017-11-08 12:18:32 +01:00
Matthew Honnibal fa7fdd0d9b Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-08 12:11:31 +01:00
Matthew Honnibal 072ff38a01 Try to fix python3.5 serialization 2017-11-08 12:10:49 +01:00
Ines Montani 3a0f34d567 Merge pull request #1509 from abhi18av/patch-1
Create examples.py for Hindi language
2017-11-08 11:37:19 +01:00
Ines Montani 42b241ccd0 Update language code in usage example in comment 2017-11-08 11:36:38 +01:00
Matthew Honnibal e262e8d942 Increment version to v2.0.2.dev0 2017-11-08 11:25:47 +01:00
Matthew Honnibal a8b592783b Make a dtype more specific, to fix a windows build 2017-11-08 11:24:35 +01:00
Abhinav Sharma 84edade82d Create examples.py
Populated the file with the translations of English example sentences
2017-11-08 13:23:08 +05:30
Matthew Honnibal d725aee4e2 Increment version to 2.0.1 2017-11-08 02:14:47 +01:00
Matthew Honnibal 8d6f68f1df Increment version 2017-11-08 01:12:34 +01:00
ines bcf42b8846 Fix typo 2017-11-08 01:06:37 +01:00
Matthew Honnibal bbd2a3dee1 Fix title in about.py 2017-11-07 14:02:58 +01:00
Matthew Honnibal 4efaf9306c Set version to spacy-nightly rc2 2017-11-07 13:27:26 +01:00
Matthew Honnibal bf1ec2965f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-07 13:20:29 +01:00
Matthew Honnibal 726f689da4 Fix missing import 2017-11-07 13:20:12 +01:00
ines 834f9c1aab Update about.py 2017-11-07 13:11:33 +01:00
ines a4662a31a9 Move model package templates to cli.package and update docs 2017-11-07 12:15:35 +01:00
ines a09c096d3c Get docs ready for v2.0.0 2017-11-07 12:00:43 +01:00
Matthew Honnibal 9a88e66103 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-07 02:00:06 +01:00
Matthew Honnibal 174abe4677 Increment to 2.0.0rc1 2017-11-07 01:59:46 +01:00
ines 42a0fbf291 Fix textcat simple train example 2017-11-07 01:25:54 +01:00
ines 8fb48b9b91 Update and document new util functions 2017-11-07 00:22:43 +01:00
Matthew Honnibal 1cab703bba Move minibatch function to util 2017-11-06 23:45:36 +01:00
ines 5f43953536 Move test 2017-11-06 23:14:10 +01:00
Matthew Honnibal dd90fe09f5 Remove extraneous label from textcat class 2017-11-06 22:09:02 +01:00
Matthew Honnibal 45e0617e61 Allow Language.update to take unicode text and dict objects 2017-11-06 22:07:38 +01:00
Matthew Honnibal 1831dbd065 Add test of simple textcat workflow 2017-11-06 22:04:29 +01:00
Matthew Honnibal ffb9101f3f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-06 19:20:41 +01:00
Matthew Honnibal 8fea512ac8 Don't set tensor in textcat 2017-11-06 19:20:14 +01:00
ines acb9bdb852 Fix PRON_LEMMA imports 2017-11-06 17:41:53 +01:00
Matthew Honnibal 7d46793dd7 Add PRON_LEMMA to spacy.symbols 2017-11-06 17:38:25 +01:00
Matthew Honnibal 2f7e9f390d Make test less flakey 2017-11-06 17:34:50 +01:00
Matthew Honnibal 407b08017e Make test less flakey 2017-11-06 17:31:40 +01:00
Matthew Honnibal 102f797933 Fix lemma ordering in test 2017-11-06 17:02:17 +01:00
Matthew Honnibal 75e1618ec3 Fix lemma clobbering 2017-11-06 16:56:19 +01:00
Matthew Honnibal 6fdffd7246 Merge pull request #1497 from explosion/feature/improve-optimizer-handling
💫 Improve optimizer handling
2017-11-06 16:41:15 +01:00
Matthew Honnibal 8e6795437b Set release=True 2017-11-06 16:39:32 +01:00
Matthew Honnibal 5c85bf3791 Fix missing import 2017-11-06 15:06:27 +01:00
Matthew Honnibal 25859dbb48 Return optimizer from begin_training, creating if necessary 2017-11-06 14:26:49 +01:00
Matthew Honnibal 465adfee94 Remove unused resume_training method, and pass optimizer through 2017-11-06 14:26:00 +01:00
Matthew Honnibal 13336a6197 Fix Adam import 2017-11-06 14:25:37 +01:00
Matthew Honnibal 2eb11d60f2 Add function create_default_optimizer to spacy._ml 2017-11-06 14:11:59 +01:00
Matthew Honnibal 31babe3c3f Fix non-clobbering lemmatization 2017-11-06 12:36:05 +01:00
Matthew Honnibal 63c6ae4191 Fix lemmatizer test 2017-11-06 11:57:06 +01:00
Matthew Honnibal a86a0181b5 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 22:19:10 +01:00
Matthew Honnibal 134d3b8143 Fix morphology 2017-11-05 22:18:22 +01:00
ines 08d1cf850a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 21:41:58 +01:00
ines baa231745c Fix Dutch tag map 2017-11-05 21:41:50 +01:00
Matthew Honnibal 46e62ad747 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 19:40:00 +01:00
Matthew Honnibal bb25cb0f76 Avoid clobbering preset lemmas 2017-11-05 19:39:38 +01:00
ines 507ecb67af Fix Spanish tag map 2017-11-05 19:23:34 +01:00
Matthew Honnibal 320008352b Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 18:46:15 +01:00
Matthew Honnibal 38109a0e4a Register SentenceSegmenter in Language.factories 2017-11-05 18:45:57 +01:00
ines 975e1042ff Fix Italian tag map 2017-11-05 18:34:09 +01:00
ines 6b2d6e4937 Fix Portuguese tag map 2017-11-05 18:31:00 +01:00
ines fa2687fded Fix Dutch tag map 2017-11-05 17:57:59 +01:00
ines fb8990d916 Fix Spanish tag map 2017-11-05 17:48:46 +01:00
ines 9d13288f73 Fix French tag map 2017-11-05 17:47:59 +01:00
ines 54579805c5 Fix French tag map 2017-11-05 17:44:05 +01:00
Matthew Honnibal 2b35bb76ad Fix tensorizer on GPU 2017-11-05 15:34:40 +01:00
Matthew Honnibal 6e5181bbaa Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 15:33:56 +01:00
Matthew Honnibal 6f438b17c1 Increment version to v2.0.0a19 2017-11-05 14:43:36 +01:00
Matthew Honnibal 225cc249c9 Pass string path to numpy, to fix #1479 2017-11-05 14:42:46 +01:00
Matthew Honnibal 00435d8f0c Add extra beam parsing test 2017-11-05 14:39:57 +01:00
Matthew Honnibal e777ea25bb Merge pull request #1492 from uwol/develop
TextCategorizer return parameter fix
2017-11-05 14:13:04 +01:00
Matthew Honnibal 0d4bd6414e Fix Italian tag map 2017-11-05 14:11:03 +01:00
ines ef597622a6 Add Portuguese tag map 2017-11-05 13:58:34 +01:00
ines 793c62dfda Add Dutch tag map 2017-11-05 13:48:07 +01:00
ines f7485a09c8 Fix Italian tag map 2017-11-05 13:12:58 +01:00
uwol a2162b8908 tensorizer return parameter fix 2017-11-05 12:25:10 +01:00
ines 0a27afbf86 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-04 23:32:52 +01:00