Matthew Honnibal
|
5ed8b2b98f
|
* Rename sic to orth
|
2015-01-23 02:08:25 +11:00 |
Matthew Honnibal
|
a27b23cc8f
|
* Have SBD return start/end indices
|
2015-01-22 22:24:44 +11:00 |
Matthew Honnibal
|
d460c28838
|
* Rename vec to repvec
|
2015-01-22 02:06:22 +11:00 |
Matthew Honnibal
|
8b9d913d97
|
* Rename vec to repvec
|
2015-01-22 02:05:58 +11:00 |
Matthew Honnibal
|
9cd0b6b3e9
|
* Various tweaks to Tokens class
|
2015-01-22 02:05:37 +11:00 |
Matthew Honnibal
|
5928d158ce
|
* Pass the string to Tokens
|
2015-01-22 02:04:58 +11:00 |
Matthew Honnibal
|
45264e356b
|
* Rename vec to repvec
|
2015-01-22 02:04:24 +11:00 |
Matthew Honnibal
|
5e63c606ad
|
* Rename vec to repvec
|
2015-01-22 02:03:54 +11:00 |
Matthew Honnibal
|
56e6cf0672
|
* Add _string attr to Tokens object
|
2015-01-21 18:57:09 +11:00 |
Matthew Honnibal
|
d6ac60e91c
|
* Bug fixes to sentences method, and improved vector transport for tokens
|
2015-01-21 18:56:32 +11:00 |
Matthew Honnibal
|
f2a229136c
|
* Fix data_dir=None argument to English class
|
2015-01-21 18:27:31 +11:00 |
Matthew Honnibal
|
ef49b8c179
|
* Add stop-word flag
|
2015-01-21 18:22:31 +11:00 |
Matthew Honnibal
|
6646bfc5df
|
* Add LOWER attr
|
2015-01-21 18:19:08 +11:00 |
Matthew Honnibal
|
f149259bf5
|
* Fix negative indices in tokens
|
2015-01-20 01:16:29 +11:00 |
Matthew Honnibal
|
b65b0c07bf
|
* Messily hook up vector in tokens
|
2015-01-19 19:59:55 +11:00 |
Matthew Honnibal
|
8ff5b8bd84
|
* Add attribute for POS scheme
|
2015-01-17 17:33:16 +11:00 |
Matthew Honnibal
|
6c7e44140b
|
* Work on word vectors, and other stuff
|
2015-01-17 16:21:17 +11:00 |
Matthew Honnibal
|
802867e96a
|
* Revise interface to Token. Strings now have attribute names like norm1_
|
2015-01-15 03:51:47 +11:00 |
Matthew Honnibal
|
7d3c40de7d
|
* Tests passing after refactor. API has obvious warts, particularly in Token and Lexeme
|
2015-01-15 00:33:16 +11:00 |
Matthew Honnibal
|
0930892fc1
|
* Tmp. Working on refactor. Compiles, must hook up lexical feats.
|
2015-01-14 00:03:48 +11:00 |
Matthew Honnibal
|
46da3d74d2
|
* Tmp. Refactoring, introducing a Lexeme PyObject.
|
2015-01-12 11:23:44 +11:00 |
Matthew Honnibal
|
ce2edd6312
|
* Tmp commit. Refactoring to create a Python Lexeme class.
|
2015-01-12 10:26:22 +11:00 |
Matthew Honnibal
|
aacaf1a0f0
|
* Fix parser
|
2015-01-08 01:19:23 +11:00 |
Matthew Honnibal
|
9a21127bf7
|
* Fix parser, which was importing the wrong model
|
2015-01-08 00:10:15 +11:00 |
Matthew Honnibal
|
6a3e39cdd1
|
* Add typedefs.pyx
|
2015-01-06 04:51:40 +11:00 |
Matthew Honnibal
|
a58920cc5e
|
* Import orth.word_shape as a C module
|
2015-01-06 03:18:22 +11:00 |
Matthew Honnibal
|
6b68f7ef75
|
* Finally get string types right for orth function
|
2015-01-06 03:17:39 +11:00 |
Matthew Honnibal
|
90c143bd85
|
* Fix orth import
|
2015-01-05 18:49:19 +11:00 |
Matthew Honnibal
|
7689dccd0f
|
* Remove unused import
|
2015-01-05 18:48:48 +11:00 |
Matthew Honnibal
|
3f1944d688
|
* Make PyPy work
|
2015-01-05 17:54:38 +11:00 |
Matthew Honnibal
|
a510d9f677
|
* Another assertion removed
|
2015-01-05 13:01:40 +11:00 |
Matthew Honnibal
|
2856946a66
|
* Remove assertion that doesn't work on Python 3
|
2015-01-05 12:51:16 +11:00 |
Matthew Honnibal
|
94034f1112
|
* Fix encoding in lemmatization
|
2015-01-05 11:54:29 +11:00 |
Matthew Honnibal
|
b132b3caa6
|
* Fix unicode error in lemmatizer
|
2015-01-05 11:53:54 +11:00 |
Matthew Honnibal
|
477e7fbffe
|
* Fix data reading for lemmatizer
|
2015-01-05 06:01:32 +11:00 |
Matthew Honnibal
|
58f75abaca
|
* Fix unicode error in orth
|
2015-01-05 05:53:08 +11:00 |
Matthew Honnibal
|
4e085d5166
|
* Fix lemmatizer for Python3
|
2015-01-05 05:51:26 +11:00 |
Matthew Honnibal
|
ae7c811fd1
|
* Use Exception instead of StandardError
|
2015-01-04 01:22:12 +11:00 |
Matthew Honnibal
|
0e4c2ba036
|
* Fix loading of special morph words
|
2015-01-03 23:13:00 +11:00 |
Matthew Honnibal
|
f5d41028b5
|
* Move around data files for test release
|
2015-01-03 01:59:22 +11:00 |
Matthew Honnibal
|
a24321b63a
|
* Add downloader
|
2015-01-02 21:44:41 +11:00 |
Matthew Honnibal
|
5d9a096e2f
|
* Some minor clean-up after HastyModel
|
2014-12-31 19:46:04 +11:00 |
Matthew Honnibal
|
aafaf58cbe
|
* Refactor _ml.Model, and finish implementing HastyModel. So far not worthwhile.
|
2014-12-31 19:40:59 +11:00 |
Matthew Honnibal
|
bcd038e7b6
|
* Implement HastyModel
|
2014-12-31 01:16:47 +11:00 |
Matthew Honnibal
|
1a075f77ff
|
* Don't over-ride pre-loaded POS tags, if set by special-cases
|
2014-12-30 23:26:32 +11:00 |
Matthew Honnibal
|
785c7ba76a
|
* Embed signature on attrs
|
2014-12-30 23:25:31 +11:00 |
Matthew Honnibal
|
30e5805656
|
* Lazy-load tagger and parser
|
2014-12-30 23:25:09 +11:00 |
Matthew Honnibal
|
9976aa976e
|
* Messily fix morphology and POS tags on special tokens.
|
2014-12-30 23:24:37 +11:00 |
Matthew Honnibal
|
c1ef3febee
|
* Embedsignature in tokens.pyx
|
2014-12-30 21:22:00 +11:00 |
Matthew Honnibal
|
aac5028b6e
|
* Move tagger to _ml
|
2014-12-30 21:21:38 +11:00 |