Matthew Honnibal
|
5c3c962038
|
* Add html to gazetteer
|
2015-08-06 16:34:51 +02:00 |
Matthew Honnibal
|
10d869d102
|
* Don't allow conjunction between NPs in base NP chunks
|
2015-08-06 16:31:53 +02:00 |
Matthew Honnibal
|
8b8df851ca
|
* Fix print statement in test_merge
|
2015-08-06 16:28:31 +02:00 |
Matthew Honnibal
|
383dfabd67
|
* Fix matcher setting of entities
|
2015-08-06 16:27:01 +02:00 |
Matthew Honnibal
|
91a94e152b
|
* Make initial gazetteer
|
2015-08-06 16:10:04 +02:00 |
Matthew Honnibal
|
2767979135
|
* Update matcher tests
|
2015-08-06 16:09:28 +02:00 |
Matthew Honnibal
|
59c3bf60a6
|
* Ensure entity recognizer doesn't over-write preset types
|
2015-08-06 16:09:08 +02:00 |
Matthew Honnibal
|
cd7d1682cd
|
* Fix loading of gazetteer.json file
|
2015-08-06 16:08:25 +02:00 |
Matthew Honnibal
|
9c667b7f15
|
* Set a value in attrs.pxd on the first flag, to reduce bugs
|
2015-08-06 16:08:04 +02:00 |
Matthew Honnibal
|
c263577424
|
* Fix lower attribute in lexeme.pxd
|
2015-08-06 16:07:41 +02:00 |
Matthew Honnibal
|
3ecacb9635
|
* Copy gazetteer file in init_model
|
2015-08-06 16:07:23 +02:00 |
Matthew Honnibal
|
faf75dfcb9
|
* Update matcher tests
|
2015-08-06 14:33:35 +02:00 |
Matthew Honnibal
|
5737115e1e
|
* Work on gazetteer matching
|
2015-08-06 14:33:21 +02:00 |
Matthew Honnibal
|
9c1724ecae
|
* Gazetteer stuff working, now need to wire up to API
|
2015-08-06 00:35:40 +02:00 |
Matthew Honnibal
|
47db3067a0
|
* Compile spacy.matcher
|
2015-08-05 23:48:11 +02:00 |
Matthew Honnibal
|
5bc0e83f9a
|
* Reimplement matching in Cython, instead of Python.
|
2015-08-05 01:05:54 +02:00 |
Matthew Honnibal
|
4c87a696b3
|
* Add draft dfa matcher, in Python. Passing tests.
|
2015-08-04 15:55:28 +02:00 |
Matthew Honnibal
|
eb7138c761
|
* Add attr relation in base NP detection
|
2015-08-01 00:34:40 +02:00 |
Matthew Honnibal
|
4988356cf0
|
* Fix dependency type bug from merged tokens
|
2015-08-01 00:33:24 +02:00 |
Matthew Honnibal
|
af84669306
|
* Add smart-quote possessive marker to tokenizer
|
2015-07-30 05:12:48 +02:00 |
Matthew Honnibal
|
78a9068319
|
* Fix spacy attr on merged tokens
|
2015-07-30 04:25:58 +02:00 |
Matthew Honnibal
|
430e2edb96
|
* Fix noun_chunks issue
|
2015-07-30 03:51:50 +02:00 |
Matthew Honnibal
|
9590968fc1
|
* Fix negative indices in Span
|
2015-07-30 02:30:24 +02:00 |
Matthew Honnibal
|
74d8cb3980
|
* Add noun_chunks iterator, and fix left/right child setting in Doc.merge
|
2015-07-30 02:29:49 +02:00 |
Matthew Honnibal
|
d153f18969
|
* Fix negative indices on spans
|
2015-07-29 22:36:03 +02:00 |
Matthew Honnibal
|
320836e346
|
* Move string description further down for token, and highlight that it includes trailing whitespace
|
2015-07-28 21:05:08 +02:00 |
Matthew Honnibal
|
d17a15ae66
|
* Add test to check parse is being deserialized properly
|
2015-07-28 21:04:00 +02:00 |
Matthew Honnibal
|
b5132bed7d
|
* Set left and right children when loading parse from byte string
|
2015-07-28 21:03:18 +02:00 |
Matthew Honnibal
|
6609fcf4b2
|
* Make mem and vocab python-visible in Doc
|
2015-07-28 20:46:59 +02:00 |
Matthew Honnibal
|
d42fe2e694
|
* Add unicode_literals to strings.pyx
|
2015-07-28 16:15:53 +02:00 |
Matthew Honnibal
|
bb910cff92
|
* Fix Python3 problem in align_raw
|
2015-07-28 16:06:53 +02:00 |
Matthew Honnibal
|
dcafb181b9
|
* Fix Python3 problem in align_raw
|
2015-07-28 15:52:10 +02:00 |
Matthew Honnibal
|
c609ea18f0
|
* Increment version in download script
|
2015-07-28 15:22:17 +02:00 |
Matthew Honnibal
|
9c4d0aae62
|
* Switch to better Python2/3 compatible unicode handling
|
2015-07-28 14:45:37 +02:00 |
Matthew Honnibal
|
7606d9936f
|
* Python3 correction for GoldParse
|
2015-07-28 14:44:53 +02:00 |
Matthew Honnibal
|
ddc1a5cfe5
|
* Fix training under python3
|
2015-07-28 14:09:30 +02:00 |
Matthew Honnibal
|
a8bbd7312c
|
* Hackishly patch long dependencies problem
|
2015-07-28 00:14:29 +02:00 |
Matthew Honnibal
|
bb583f7f09
|
* Hackishly patch long dependencies problem
|
2015-07-27 23:14:33 +02:00 |
Matthew Honnibal
|
b96bf9b8cc
|
Merge branch 'master' of ssh://github.com/honnibal/spaCy
|
2015-07-27 22:57:48 +02:00 |
Matthew Honnibal
|
aa7a964a4f
|
* Add a type declaration for doc.from_array
|
2015-07-27 22:57:22 +02:00 |
Matthew Honnibal
|
9034f8a1cf
|
* Update test_docs
|
2015-07-27 22:15:19 +02:00 |
Matthew Honnibal
|
25a8774f42
|
* Fix regression in packer
|
2015-07-27 21:53:38 +02:00 |
Matthew Honnibal
|
174ed1ad20
|
* Tighten the frequency filter in init_model
|
2015-07-27 21:44:51 +02:00 |
Matthew Honnibal
|
1601e488ee
|
* Fix bug in decoding non-ascii characters
|
2015-07-27 21:43:58 +02:00 |
Matthew Honnibal
|
6deb1e84b6
|
* Update serialization tests
|
2015-07-27 21:25:48 +02:00 |
Matthew Honnibal
|
6a95409cd2
|
* Fix type on bits
|
2015-07-27 21:16:49 +02:00 |
Matthew Honnibal
|
a296d72b54
|
* Fix en/attrs
|
2015-07-27 21:16:33 +02:00 |
Matthew Honnibal
|
45460f505c
|
* Fix data type on read32 in BitArray
|
2015-07-27 21:12:13 +02:00 |
Matthew Honnibal
|
3d43f49f69
|
* Revert previous change
|
2015-07-27 10:58:15 +02:00 |
Matthew Honnibal
|
6b586cdad4
|
* Change lexemes.bin format. Add a header specifying size of LexemeC and number of lexemes, and don't have the redundant orth information.
|
2015-07-27 08:31:51 +02:00 |