56 Commits

Author SHA1 Message Date
Yomguithereal 5627707c45 Better SNM unit tests 2018-07-31 16:08:47 +02:00
Yomguithereal 1f2c15a631 Adaptive sorted neighborhood 2018-07-31 16:07:54 +02:00
Yomguithereal a18c6cef7d Adding squeeze option to fingerprint 2018-07-31 12:04:43 +02:00
Yomguithereal eb877657fa Moving rusalka to phonetics 2018-07-24 16:51:25 +02:00
Yomguithereal 46bf43fcef Rusalka edge case 2018-07-19 17:29:45 +02:00
Yomguithereal d44621160b Adding a test 2018-07-19 17:25:57 +02:00
Yomguithereal 3590a6cd97 Limited levenshtein distance 2018-07-18 18:40:37 +02:00
Yomguithereal e7e56f60f5 Attempting to optimize levenshtein through python 2018-07-18 16:20:04 +02:00
Yomguithereal aee7da6c7c Playing with levenshtein distance 2018-07-18 14:52:17 +02:00
Yomguithereal 45019bc2b2 Adding fog.metrics.overlap_coefficient. cc @diegantobass 2018-07-18 11:27:32 +02:00
Yomguithereal 79566508eb Quickjoin now supports similarity metrics 2018-07-18 11:18:05 +02:00
Yomguithereal 5cd49a61b8 NN-Descent full 2018-07-18 11:03:29 +02:00
Yomguithereal ed52117234 Trying to use a vp_tree to speed up quickjoin: kinda fail 2018-07-16 15:44:08 +02:00
Yomguithereal 8abcaee486 NN-Descent unit test now correctly deterministic 2018-07-12 17:33:46 +02:00
Yomguithereal 7014dccf32 Drafting NN-Descent 2018-07-12 15:01:35 +02:00
Yomguithereal 9f13609ad7 Parallel quickjoin 2018-07-11 18:17:57 +02:00
Yomguithereal ef95edccd2 Drafting quickjoin 2018-07-11 17:58:25 +02:00
Yomguithereal ed84d42266 Sane defaults 2018-07-11 15:56:34 +02:00
Yomguithereal 71f6fb2919 Adding the skeleton key 2018-07-06 18:58:46 +02:00
Yomguithereal 13a8e1937d Fixing window semantics 2018-07-06 18:07:01 +02:00
Yomguithereal eff4399192 Adding SNM 2018-07-06 18:05:19 +02:00
Yomguithereal ed4c1f0834 Adding fog.key.omission 2018-07-06 17:30:26 +02:00
Yomguithereal 65881f081d Optimizing blocking clusterer 2018-07-06 16:41:00 +02:00
Yomguithereal 802f6ae19b Adding a blocking clusterer 2018-07-06 15:58:21 +02:00
Yomguithereal 13286fb50e Advances 2018-07-05 17:20:28 +02:00
Yomguithereal b9d4e11e25 Adding fog.phonetics.cologne 2018-07-04 17:06:43 +02:00
Yomguithereal 4e6dc205cd Parallel connected components 2018-07-02 21:23:29 +02:00
Yomguithereal 27441060d4 Adding jaccard_intersection_index 2018-07-02 17:28:44 +02:00
Yomguithereal 28fed6bba2 sparse_dotproduct -> sparse_dot_product 2018-06-27 16:57:58 +02:00
Yomguithereal 0ce60c9396 Fixing fingerprinting 2018-06-27 14:15:43 +02:00
Yomguithereal c96098c54f Adding cosine_similarity 2018-06-21 18:43:15 +02:00
Yomguithereal 95eb69d5db Improving rusalka 2018-06-21 16:26:50 +02:00
Yomguithereal 8d3553a724 Re-organizing minhash functions 2018-06-21 14:42:41 +02:00
Yomguithereal c211d443fb Testing numpy minhash 2018-06-21 14:21:08 +02:00
Yomguithereal 1402f449aa Implementing SuperMinHash 2018-06-20 18:07:23 +02:00
Yomguithereal 76dfc85a95 Better rusalka 2018-06-20 16:26:51 +02:00
Yomguithereal 2705c80898 Starting to work on rusalka 2018-06-20 15:47:43 +02:00
Yomguithereal 78c933765d Refining 2018-06-20 12:57:43 +02:00
Yomguithereal 12fde300cf Basic MinHash 2018-06-19 18:11:54 +02:00
Yomguithereal e65ae50033 Cleaning up 2018-06-19 18:03:39 +02:00
Yomguithereal bbe30ab9e8 Improvements 2018-06-15 14:27:19 +02:00
Yomguithereal add64e379b Starting to work on LSBMinHash 2018-06-13 17:09:33 +02:00
Yomguithereal d05cd76bf3 Faster diagonal chunks 2018-06-12 17:24:16 +02:00
Yomguithereal f1ddb40c5f Fingerprint key 2018-06-11 17:32:59 +02:00
Yomguithereal c5e216252d Fingerprinting 2018-06-11 17:20:46 +02:00
Yomguithereal fd3f87078d Improvements 2018-06-11 17:00:59 +02:00
Yomguithereal 6d229e0b85 Unit testing key collision 2018-06-08 19:24:49 +02:00
Yomguithereal 5cb76fc23b Unit testing vp_tree clustering 2018-06-08 19:12:43 +02:00
Yomguithereal 98d7ae35b4 Unit testing pairwise clustering 2018-06-08 19:07:02 +02:00
Yomguithereal a900f16a59 Adding ngrams tokenizer 2018-06-05 12:53:57 +02:00