Commit Graph

4545 Commits

Author SHA1 Message Date
Matthew Honnibal 8d692771f6 Improve profiling 2017-11-15 13:51:25 +01:00
Matthew Honnibal b797dca977 Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-15 13:11:43 +01:00
ines c9d72de0fb Add dummy serialization methods for Japanese and missing lang getter (resolves #1557) 2017-11-15 12:44:02 +01:00
Matthew Honnibal d274d3a3b9 Let beam forward use minibatches 2017-11-15 00:51:42 +01:00
Matthew Honnibal 855872f872 Remove state hashing 2017-11-14 23:36:46 +01:00
Roman Domrachev 3e21680814 Use safer method to get string without hit 2017-11-14 22:58:46 +03:00
Roman Domrachev a33d5a068d Try to hold origin data instead of restore it 2017-11-14 22:40:03 +03:00
Roman Domrachev 91e2fa6561 Clean all caches 2017-11-14 21:15:04 +03:00
Roman Domrachev 4e378dc4a4 Remove all obsolete code and test only initial problem 2017-11-14 20:45:04 +03:00
Roman 47ce2347b0 Create test that fails when actual cleanup caused 2017-11-14 20:28:13 +03:00
Roman caae77f72d Update strings.pyx 2017-11-14 19:44:40 +03:00
Roman Domrachev 3d247d2bb8 Get back previous testcase 2017-11-14 18:01:37 +03:00
Roman Domrachev 870defa815 Swap keys in proper place
Remove unnecessary clear of the hits
2017-11-14 17:56:30 +03:00
Roman Domrachev 86ca434c93 Merge github.com:explosion/spaCy 2017-11-14 17:46:22 +03:00
Roman Domrachev a2745b0e84 StringStore now actually cleaned
Do not lose docs in ref tracking
2017-11-14 17:45:50 +03:00
Matthew Honnibal 2512ea9eeb Fix memory leak in beam parser 2017-11-14 02:11:40 +01:00
Matthew Honnibal 86ddf692a1 Fix bug in limit calculation on dev data 2017-11-14 01:37:10 +01:00
Ines Montani ea6c85c67a Merge pull request #1566 from MathiasDesch/master (resolves #1248)
Add exceptions to tokenizer and norm
2017-11-13 19:05:22 +01:00
Matthew Honnibal 1b348389bb Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-13 18:18:48 +01:00
Matthew Honnibal ca73d0d8fe Cleanup states after beam parsing, explicitly 2017-11-13 18:18:26 +01:00
Matthew Honnibal 63ef9a2e73 Remove __dealloc__ from ParserBeam 2017-11-13 18:18:08 +01:00
Mathias Deschamps c0691b2ab4 Add tokenizer exceptions for ing verbs
Extend list of tokenizing exceptions introduced in 123810b
2017-11-13 17:46:05 +01:00
Mathias Deschamps 288298ead9 Add norm exception for ing verbs
Some ing verbs are sometimes written in or in'. Make the NORM form correct
2017-11-13 17:46:05 +01:00
Abhinav Sharma 59f5740ede improved upon the list of included stop_words 2017-11-13 17:13:49 +05:30
Matthew Honnibal 6e641f46d4 Create a preprocess function that gets bigrams 2017-11-12 00:43:41 +01:00
Matthew Honnibal c9251d79e3 Edit comment 2017-11-11 18:38:32 +01:00
Matthew Honnibal dd1678eab3 Edit comment 2017-11-11 18:37:08 +01:00
Roman Domrachev ee60a52ee7 Fix test imports and last batch cleanup 2017-11-11 11:32:16 +03:00
Roman Domrachev 4a6b094e09 Remove unused import 2017-11-11 03:13:05 +03:00
Roman Domrachev 3c600adf23 Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
ines ee97fd3cb4 Add regression test for #1547 2017-11-11 00:14:03 +01:00
ines 2df27db671 Add unicode declaration 2017-11-11 00:13:56 +01:00
ines 35653bef3a Add missing import (fixes #1546) 2017-11-10 19:05:18 +01:00
ines 4c5d2c80d5 Re-add python -m to commands, too brittle :( (see #1536) 2017-11-10 02:30:55 +01:00
ines 123810b6de Add "lovin'" to tokenizer exceptions (see #1248) 2017-11-09 17:09:30 +01:00
ines 1c218397f6 Ensure path in Doc.to_disk/from_disk (resolves ##1521)
Also add Doc serialization tests with both Path and string path options
2017-11-09 02:29:03 +01:00
Matthew Honnibal 49fd5a646f Set version for 2.0.2 release 2017-11-08 22:39:39 +01:00
Matthew Honnibal fba2dbddf7 Increment version 2017-11-08 22:19:08 +01:00
Matthew Honnibal a5ea0fdf5a Fix #1518: vocab.vectors.resize() didn't work 2017-11-08 22:18:37 +01:00
Matthew Honnibal de45702bbe Strip dev suffixes from version for compatibility check 2017-11-08 18:40:21 +01:00
Matthew Honnibal 51639214a1 Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-08 18:04:33 +01:00
Matthew Honnibal a2f980de4e Exclude .devN versioning from compatibility check 2017-11-08 18:03:52 +01:00
Daniel Hershcovich d7ae54ff44 Fix typo in message 2017-11-08 16:06:28 +02:00
Matthew Honnibal 4194bc5744 Xfail flakey serialization test 2017-11-08 13:55:13 +01:00
Matthew Honnibal d5537e5516 Work on Windows test failure 2017-11-08 13:25:18 +01:00
Matthew Honnibal c27c82d5f9 Fix serialization 2017-11-08 13:08:48 +01:00
Matthew Honnibal 1d5599cd28 Fix dtype 2017-11-08 12:18:32 +01:00
Matthew Honnibal fa7fdd0d9b Merge branch 'master' of https://github.com/explosion/spaCy 2017-11-08 12:11:31 +01:00
Matthew Honnibal 072ff38a01 Try to fix python3.5 serialization 2017-11-08 12:10:49 +01:00
Ines Montani 3a0f34d567 Merge pull request #1509 from abhi18av/patch-1
Create examples.py for Hindi language
2017-11-08 11:37:19 +01:00
Ines Montani 42b241ccd0 Update language code in usage example in comment 2017-11-08 11:36:38 +01:00
Matthew Honnibal e262e8d942 Increment version to v2.0.2.dev0 2017-11-08 11:25:47 +01:00
Matthew Honnibal a8b592783b Make a dtype more specific, to fix a windows build 2017-11-08 11:24:35 +01:00
Abhinav Sharma 84edade82d Create examples.py
Populated the file with the translations of English example sentences
2017-11-08 13:23:08 +05:30
Matthew Honnibal d725aee4e2 Increment version to 2.0.1 2017-11-08 02:14:47 +01:00
Matthew Honnibal 8d6f68f1df Increment version 2017-11-08 01:12:34 +01:00
ines bcf42b8846 Fix typo 2017-11-08 01:06:37 +01:00
Matthew Honnibal bbd2a3dee1 Fix title in about.py 2017-11-07 14:02:58 +01:00
Matthew Honnibal 4efaf9306c Set version to spacy-nightly rc2 2017-11-07 13:27:26 +01:00
Matthew Honnibal bf1ec2965f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-07 13:20:29 +01:00
Matthew Honnibal 726f689da4 Fix missing import 2017-11-07 13:20:12 +01:00
ines 834f9c1aab Update about.py 2017-11-07 13:11:33 +01:00
ines a4662a31a9 Move model package templates to cli.package and update docs 2017-11-07 12:15:35 +01:00
ines a09c096d3c Get docs ready for v2.0.0 2017-11-07 12:00:43 +01:00
Matthew Honnibal 9a88e66103 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-07 02:00:06 +01:00
Matthew Honnibal 174abe4677 Increment to 2.0.0rc1 2017-11-07 01:59:46 +01:00
ines 42a0fbf291 Fix textcat simple train example 2017-11-07 01:25:54 +01:00
ines 8fb48b9b91 Update and document new util functions 2017-11-07 00:22:43 +01:00
Matthew Honnibal 1cab703bba Move minibatch function to util 2017-11-06 23:45:36 +01:00
ines 5f43953536 Move test 2017-11-06 23:14:10 +01:00
Matthew Honnibal dd90fe09f5 Remove extraneous label from textcat class 2017-11-06 22:09:02 +01:00
Matthew Honnibal 45e0617e61 Allow Language.update to take unicode text and dict objects 2017-11-06 22:07:38 +01:00
Matthew Honnibal 1831dbd065 Add test of simple textcat workflow 2017-11-06 22:04:29 +01:00
Matthew Honnibal ffb9101f3f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-06 19:20:41 +01:00
Matthew Honnibal 8fea512ac8 Don't set tensor in textcat 2017-11-06 19:20:14 +01:00
ines acb9bdb852 Fix PRON_LEMMA imports 2017-11-06 17:41:53 +01:00
Matthew Honnibal 7d46793dd7 Add PRON_LEMMA to spacy.symbols 2017-11-06 17:38:25 +01:00
Matthew Honnibal 2f7e9f390d Make test less flakey 2017-11-06 17:34:50 +01:00
Matthew Honnibal 407b08017e Make test less flakey 2017-11-06 17:31:40 +01:00
Matthew Honnibal 102f797933 Fix lemma ordering in test 2017-11-06 17:02:17 +01:00
Matthew Honnibal 75e1618ec3 Fix lemma clobbering 2017-11-06 16:56:19 +01:00
Matthew Honnibal 6fdffd7246 Merge pull request #1497 from explosion/feature/improve-optimizer-handling
💫 Improve optimizer handling
2017-11-06 16:41:15 +01:00
Matthew Honnibal 8e6795437b Set release=True 2017-11-06 16:39:32 +01:00
Matthew Honnibal 5c85bf3791 Fix missing import 2017-11-06 15:06:27 +01:00
Matthew Honnibal 25859dbb48 Return optimizer from begin_training, creating if necessary 2017-11-06 14:26:49 +01:00
Matthew Honnibal 465adfee94 Remove unused resume_training method, and pass optimizer through 2017-11-06 14:26:00 +01:00
Matthew Honnibal 13336a6197 Fix Adam import 2017-11-06 14:25:37 +01:00
Matthew Honnibal 2eb11d60f2 Add function create_default_optimizer to spacy._ml 2017-11-06 14:11:59 +01:00
Matthew Honnibal 31babe3c3f Fix non-clobbering lemmatization 2017-11-06 12:36:05 +01:00
Matthew Honnibal 63c6ae4191 Fix lemmatizer test 2017-11-06 11:57:06 +01:00
Matthew Honnibal a86a0181b5 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 22:19:10 +01:00
Matthew Honnibal 134d3b8143 Fix morphology 2017-11-05 22:18:22 +01:00
ines 08d1cf850a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 21:41:58 +01:00
ines baa231745c Fix Dutch tag map 2017-11-05 21:41:50 +01:00
Matthew Honnibal 46e62ad747 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 19:40:00 +01:00
Matthew Honnibal bb25cb0f76 Avoid clobbering preset lemmas 2017-11-05 19:39:38 +01:00
ines 507ecb67af Fix Spanish tag map 2017-11-05 19:23:34 +01:00
Matthew Honnibal 320008352b Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-05 18:46:15 +01:00
Matthew Honnibal 38109a0e4a Register SentenceSegmenter in Language.factories 2017-11-05 18:45:57 +01:00
ines 975e1042ff Fix Italian tag map 2017-11-05 18:34:09 +01:00