Commit Graph

12 Commits

Author SHA1 Message Date
Adriane Boyd 0a62098c5f Fix lemmatizer is_base_form for python2.7 (#5734)
* Fix lemmatizer init args for python2.7

* Move English is_base_form to a class method

* Skip test pickling PhraseMatcher for python2
2020-07-09 22:11:24 +02:00
adrianeboyd 40e65d6f63 Fix most_similar for vectors with unused rows (#5348)
* Fix most_similar for vectors with unused rows

Address issues related to the unused rows in the vector table and
`most_similar`:

* Update `most_similar()` to search only through rows that are in use
according to `key2row`.

* Raise an error when `most_similar(n=n)` is larger than the number of
vectors in the table.

* Set and restore `_unset` correctly when vectors are added or
deserialized so that new vectors are added in the correct row.

* Set data and keys to the same length in `Vocab.prune_vectors()` to
avoid spurious entries in `key2row`.

* Fix regression test using `most_similar`

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-05-19 16:41:26 +02:00
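
The first two points of 40e65d6f63 (search only rows that are in use, and reject an `n` larger than the number of vectors) can be illustrated with a minimal sketch. This is not the library's implementation: the function, its parameters, and the plain-list vector table are hypothetical, and only the names `most_similar` and `key2row` come from the commit message.

```python
import math


def most_similar(query, data, key2row, n=1):
    """Return the n keys whose rows are most similar to `query`.

    Hypothetical illustration only. `data` is a list of row vectors that
    may contain unused (e.g. all-zero) rows; `key2row` maps each key to
    the row it actually occupies.
    """
    # Reject requests for more neighbours than there are vectors in use.
    if n > len(key2row):
        raise ValueError(
            "n=%d is larger than the %d vectors in use" % (n, len(key2row))
        )

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        # Treat a zero-length vector as having no similarity to anything.
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    # Score only the rows referenced by key2row, never the unused ones.
    scored = [(cosine(query, data[row]), key) for key, row in key2row.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [key for _, key in scored[:n]]


# A table with four rows, of which only the first two are in use.
table = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
in_use = {"a": 0, "b": 1}
nearest = most_similar([1.0, 0.1], table, in_use, n=1)
```

Iterating over `key2row` rather than over every row of `data` is what keeps the padding rows out of the results in this sketch.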
Ines Montani 74b951fe61 Fix xpassing tests (#4657)
* Ignore internal warnings

* Un-xfail passing tests

* Skip instead of xfail
2019-11-16 20:20:53 +01:00
Ines Montani cfffdba7b1 Implement new API for {Phrase}Matcher.add (backwards-compatible) (#4522)
* Implement new API for {Phrase}Matcher.add (backwards-compatible)

* Update docs

* Also update DependencyMatcher.add

* Update internals

* Rewrite tests to use new API

* Add basic check for common mistake

Raise error with suggestion if user likely passed in a pattern instead of a list of patterns

* Fix typo [ci skip]
2019-10-25 22:21:08 +02:00
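
The "basic check for common mistake" in cfffdba7b1 can be sketched roughly as follows. The helper name, the error wording, and the assumption that a single pattern is a list of token dicts are illustrative and not taken from the commit itself.

```python
def validate_patterns(patterns):
    """Check that `patterns` is a list of patterns, not a single pattern.

    Hypothetical helper: assumes one pattern is a list of token dicts,
    so a list of patterns is a list of lists.
    """
    if not isinstance(patterns, list):
        raise ValueError("Expected a list of patterns")
    # A dict as the first element suggests a bare pattern was passed in.
    if patterns and isinstance(patterns[0], dict):
        raise ValueError(
            "Expected a list of patterns but got a single pattern. "
            "Did you mean to wrap it in a list, e.g. [pattern]?"
        )
    return patterns


single_pattern = [{"LOWER": "hello"}, {"LOWER": "world"}]
checked = validate_patterns([single_pattern])
```

Checking only the first element keeps the test cheap while still catching the most likely slip, which is passing `pattern` where `[pattern]` was intended.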
Ines Montani cc05d9dad6 Auto-format [ci skip] 2019-10-24 16:21:08 +02:00
Sofie Van Landeghem d5d55312b2 prevent division by zero in most_similar method (#4488) 2019-10-21 12:04:46 +02:00
Ines Montani 181c01f629 Tidy up and auto-format 2019-10-18 11:27:38 +02:00
Sofie Van Landeghem 9d3ce7cba2 Ensure training doesn't crash with empty batches (#4360)
* unit test for previously resolved unflatten issue

* prevent a batch of empty docs from causing problems
2019-10-02 12:50:47 +02:00
Ines Montani 3d8fd4b461 Revert #4334 2019-09-29 17:32:12 +02:00
Ines Montani c9cd516d96 Move tests out of package (#4334)
* Move tests out of package

* Fix typo
2019-09-28 18:05:00 +02:00
Matthew Honnibal 22250cf6b7 Make regression test less sensitive to tag-map stuff 2019-08-25 21:54:26 +02:00
Ines Montani 82045aac8a Merge regression tests 2019-07-10 12:49:18 +02:00