spaCy/spacy
Brixjohn 52f3c95004 Added alpha support for Tagalog language (#3062)
I have added alpha support for Tagalog, a language of the Philippines and the basis of the country's national language, Filipino. I based the format heavily on the EN and ES languages.

I have provided several words in the lemmatizer lookup table, added stop words from a source, translated the numeric words into their Tagalog counterparts, added some tokenizer exceptions, and kept the tag map the same as for English.

While the alpha language passed the preliminary testing that you provided, I think it needs more data to be useful for most cases.
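
To make the pieces described above concrete, here is a minimal sketch of what a stop word set, a lookup lemmatizer table, and a number-word check can look like in plain Python. It loosely follows the layout of the existing language data, but every entry and the `lemmatize` helper are hand-picked illustrations (assumptions), not the data added in this commit.

```python
# Illustrative sketch only: the entries below are hand-picked examples,
# not the actual Tagalog data from this commit.

# A few common Tagalog function words, stored as a set for fast membership tests.
STOP_WORDS = set("ang ng sa at mga".split())

# Lookup lemmatizer table: inflected form -> base form.
LOOKUP = {
    "kumain": "kain",
    "kumakain": "kain",
}

# Number words used by the like_num check (one to five).
_num_words = ["isa", "dalawa", "tatlo", "apat", "lima"]


def like_num(text):
    """Return True if the text looks like a number or a known number word."""
    text = text.replace(",", "").replace(".", "")
    if text.isdigit():
        return True
    if text.count("/") == 1:
        num, denom = text.split("/")
        if num.isdigit() and denom.isdigit():
            return True
    return text.lower() in _num_words


def lemmatize(word):
    """Hypothetical helper: look the word up, fall back to the word itself."""
    return LOOKUP.get(word.lower(), word)
```

With only a handful of entries, most words fall through the lookup unchanged, which is why more data is needed before the language is useful in practice.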

* Added alpha support for Tagalog language

* Edited contributor template

* Included SCA; Reverted templates

* Fixed SCA template

* Fixed changes in SCA template
2018-12-18 13:08:38 +01:00
cli Accept iob2 and allow generic whitespace (#2999) 2018-12-06 15:50:25 +01:00
data
displacy 💫 Create random IDs for SVGs to prevent ID clashes (#2927) 2018-11-15 11:40:10 +01:00
lang Added alpha support for Tagalog language (#3062) 2018-12-18 13:08:38 +01:00
syntax Fix out-of-bounds access in NER training 2018-10-27 00:46:30 +02:00
tests Don't run weird failing test for now 2018-11-30 16:13:40 +01:00
tokens Fix docstring for is_right_punct(). (#3044) 2018-12-14 10:11:11 +01:00
__init__.pxd
__init__.py Unhack prefer_gpu 2018-10-14 23:27:09 +02:00
__main__.py Don't pass CLI command name as dummy argument 2018-01-04 21:33:47 +01:00
_ml.py 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
about.py Set version to v2.0.18 2018-12-01 03:35:09 +01:00
attrs.pxd Fix LANG symbol 2018-02-17 18:10:50 +01:00
attrs.pyx missing PrepCase attribute 2018-02-18 14:46:12 +00:00
compat.py Fix formatting 2018-11-26 13:27:41 +01:00
errors.py raise error when setting overlapping entities as doc.ents (#2880) 2018-10-26 23:29:16 +02:00
glossary.py Add FAC to spacy.explain (resolves #2706) 2018-08-26 14:13:50 +02:00
gold.pxd Add support for sent_start to GoldParse 2017-08-25 20:03:14 -05:00
gold.pyx New Feature: display more detail when Error E067 (#2639) 2018-08-07 10:45:29 +02:00
language.py Allow input text of length up to max_length, inclusive (#2922) 2018-11-13 16:46:29 +01:00
lemmatizer.py If no rules are set, lemmatize by lookup 2017-12-06 12:12:11 +01:00
lexeme.pxd WIP on stringstore change. 27 failures 2017-05-28 14:06:40 +02:00
lexeme.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
matcher.pyx Set up dependency tree pattern matching skeleton (#2732) 2018-09-27 13:27:18 +02:00
morphology.pxd fix typo/missing here too 2018-02-18 14:38:27 +00:00
morphology.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
parts_of_speech.pxd
parts_of_speech.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
pipeline.pxd Fix names of pipeline components 2017-10-26 12:38:23 +02:00
pipeline.pyx Fix loading of models when custom vectors are added 2018-04-10 22:19:20 +02:00
scorer.py 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
strings.pxd Try to fix StringStore clean up (see #1506) 2017-11-11 03:11:27 +03:00
strings.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
structs.pxd Make TokenC.sent_start an int, to allow ternary value 2017-10-08 19:58:54 +02:00
symbols.pxd Fix inconsistencies in the symbols table 2018-02-18 13:51:31 +01:00
symbols.pyx Fix inconsistencies in the symbols table 2018-02-18 13:51:31 +01:00
tokenizer.pxd Disable tokenizer cache for special-cases. Fixes #1250 2017-10-24 16:08:05 +02:00
tokenizer.pyx Fix loading tokenizer with custom prefix search (#2495) 2018-07-04 12:56:07 +02:00
typedefs.pxd Work on changing StringStore to return hashes. 2017-05-28 12:36:27 +02:00
typedefs.pyx Tidy up rest 2017-10-27 21:07:59 +02:00
util.py Restore encoding arg on msgpack-numpy 2018-09-27 15:58:21 +02:00
vectors.pyx 💫 New system for error messages and warnings (#2163) 2018-04-03 15:50:31 +02:00
vocab.pxd Add Vocab.cfg attr, to hold stuff like oov probs 2017-10-30 16:08:50 +01:00
vocab.pyx Fix bug where Vocab.prune_vector did not use 'batch_size' (#2977) 2018-11-28 19:49:33 +01:00