95 Commits

Author SHA1 Message Date
ines 1f9f867c70 Remove unused util function 2017-04-16 20:37:45 +02:00
ines ed7e19ad68 Remove unused import 2017-04-16 20:37:45 +02:00
ines 0084466a66 Remove unused utf8open util and replace os.path with ensure_path 2017-04-16 20:37:45 +02:00
ines d10bd0eaf9 Fix formatting 2017-04-16 13:42:34 +02:00
ines 31fa73293a Move read_json out to own util function 2017-04-16 13:03:28 +02:00
Matthew Honnibal e6ee7e130f Fix parse package meta 2017-04-15 13:38:53 +02:00
ines e1efd589c3 Fix json imports and use ujson 2017-04-15 12:13:34 +02:00
ines 956dc36785 Move functions to deprecated 2017-04-15 12:12:31 +02:00
ines c05ec4b89a Add compat functions and remove old workarounds
Add ensure_path util function to handle checking instance of path
2017-04-15 12:11:16 +02:00
ines d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines 75f9b4c6e2 Fix whitespace 2017-04-07 10:22:18 +02:00
ines fdec758113 Add is_windows and is_python2 utility functions 2017-03-25 14:04:02 +01:00
ines 3f20efe165 Merge branch 'develop'
# Conflicts:
#	spacy/util.py
2017-03-22 17:14:15 +01:00
Raphaël Bournhonesque f332bf05be Remove unused import statements 2017-03-21 21:08:54 +01:00
ines 5aea327a5b Add util function to get raw user input 2017-03-20 22:48:56 +01:00
ines a6c0361803 Handle raw_input vs input in Python 2 and 3 2017-03-20 22:48:32 +01:00
ines adbcac6591 Fix spacing 2017-03-20 22:48:21 +01:00
ines 0eafc0f2c6 Add util functions to print data as table or markdown list 2017-03-18 13:00:14 +01:00
Matthew Honnibal adb0b7e43b Fix loading when no package found 2017-03-16 18:30:23 -05:00
ines 3d484c3faf Don't print in parse_package_meta and accept on_error callback instead
TODO: log warning for missing meta data in spacy.link, as this affects
the Language class returned by spacy.load()
2017-03-16 20:34:50 +01:00
ines 5f3f04bd0a Add util function to load and parse package meta.json 2017-03-16 17:10:05 +01:00
ines 7f920c2f75 Don't break text in when rendering print_msg 2017-03-16 17:09:50 +01:00
ines 68c04fa897 Move sys_exit() function to util 2017-03-16 17:08:58 +01:00
ines 7b2eca36e4 Revert "Fix formatting and remove unused code"
This reverts commit d7898d586f.
2017-03-16 09:58:41 +01:00
ines f5d1a39a5b Add util functions for printing and wrapping messages 2017-03-15 17:35:57 +01:00
ines d7898d586f Fix formatting and remove unused code 2017-03-15 17:35:41 +01:00
ines 66c1f194f9 Use consistent unicode declarations 2017-03-12 13:07:28 +01:00
Matthew Honnibal 0f9b8a00a5 Unbreak data download 2017-01-09 23:40:26 +01:00
Matthew Honnibal d9a77ddf14 Return None for data path if it doesn't exist 2017-01-09 14:10:05 +01:00
Ines Montani de5aa92bc2 Handle deprecated tokenizer prefix data 2017-01-08 20:33:28 +01:00
Ines Montani 6a60a61086 Move update_exc to global language data utils 2016-12-17 12:29:02 +01:00
Ines Montani 66c7348cda Add update_exc util function 2016-12-08 13:58:12 +01:00
Ines Montani 8e977cc71c Fix formatting 2016-12-08 13:56:17 +01:00
Matthew Honnibal 6b8b05ef83 Specify that spacy.util is encoded in utf8 2016-11-02 19:58:00 +01:00
Matthew Honnibal 9efe568177 Add missing unicode_literals to spacy.util. I think this was messing up the tokenizer regex for non-ascii characters in Python 2. Re Issue #596 2016-11-02 12:31:34 +01:00
Matthew Honnibal 5e923b9bfa Return None in match_best_version if not path exists. 2016-10-15 14:47:29 +02:00
Matthew Honnibal ea23b64cc8 Refactor training, with new spacy.train module. Defaults still a little awkward. 2016-10-09 12:24:24 +02:00
Matthew Honnibal 95aaea0d3f Refactor so that the tokenizer data is read from Python data, rather than from disk 2016-09-25 14:49:53 +02:00
Matthew Honnibal 82b8cc5efb Whitespace 2016-09-24 22:17:01 +02:00
Matthew Honnibal f19af6cb2c Python 3 compatible basestring 2016-09-24 22:08:43 +02:00
Matthew Honnibal fd65cf6cbb Finish refactoring data loading 2016-09-24 20:26:17 +02:00
Matthew Honnibal 83e364188c Mostly finished loading refactoring. Design is in place, but doesn't work yet. 2016-09-24 15:42:01 +02:00
Daylen Yang 5405e7dd73 Fix get_lang_class parsing (take 2) 2016-05-16 16:40:31 -07:00
Matthew Honnibal b240104f40 Revert "Fix get_lang_class parsing" 2016-05-17 08:04:26 +10:00
Daylen Yang 1692c2df3c Fix get_lang_class parsing
We want the get_lang_class to return "en" for both "en" and "en_glove_cc_300_1m_vectors". Changed the split rule to "_" so that this happens.
2016-05-16 14:38:20 -07:00
Henning Peters ff690f76ba fix loading non-german models 2016-04-12 16:00:56 +02:00
Henning Peters c90d4a6f17 relative imports in __init__.py 2016-03-26 11:44:53 +01:00
Henning Peters b8f63071eb add lang registration facility 2016-03-25 18:54:45 +01:00
Henning Peters a7d7ea3afa first idea for supporting multiple langs in download script 2016-03-24 11:19:43 +01:00
Henning Peters eb7ae61b1c cleanup api 2016-03-08 12:59:18 +01:00