Commit Graph

13 Commits

Author SHA1 Message Date
Ines Montani 5d28664fc5 Don't test Hungarian for numbers and hyphens for now
Reinvestigate behaviour of case affixes given reorganised tokenizer
patterns.
2017-01-08 20:45:40 +01:00
Ines Montani 038002d616 Reformat HU tokenizer tests and adapt to general style
Improve readability of test cases and add conftest.py with fixture
2017-01-05 18:06:44 +01:00
Gyorgy Orosz 45e045a87b Unicode/UTF8 compatibility for Python2 2016-12-24 00:21:00 +01:00
Gyorgy Orosz 72b61b6d03 Typo fix. 2016-12-24 00:10:29 +01:00
Gyorgy Orosz ab2f6ea46c Removed data files from tests. 2016-12-21 20:22:09 +01:00
Gyorgy Orosz 3d5306acb9 Added further testcases. 2016-12-20 23:49:35 +01:00
Gyorgy Orosz 23956e72ff Improved partial support for tokenizing Hungarian numbers 2016-12-20 23:36:59 +01:00
Gyorgy Orosz 6add156075 Refactored language data structure 2016-12-20 22:28:20 +01:00
Gyorgy Orosz c035928156 Partial Hungarian number tokenization is added. 2016-12-20 20:46:20 +01:00
Gyorgy Orosz 0cf2144d24 Adding partial hyphen and quote handling support. 2016-12-11 00:14:36 +01:00
Gyorgy Orosz 2051726fd3 Passing Hungarian abbrev tests. 2016-12-10 23:37:58 +01:00
Gyorgy Orosz 0289b8ceaa Additional abbreviation tests. 2016-12-08 12:17:44 +01:00
Gyorgy Orosz 5b00039955 First steps towards the Hungarian tokenizer code. 2016-12-07 23:07:43 +01:00