Adding more to docs

Yomguithereal 2020-10-03 22:52:08 +02:00
parent 357e1de861
commit d5078b26da
2 changed files with 83 additions and 16 deletions

@@ -15,14 +15,45 @@ pip install fog
## Usage
* [Graph](#graph)
* [floatsam_sparsification](#floatsam_sparsification)
* [monopartite_projection](#monopartite_projection)
* [Metrics](#metrics)
-* [jaccard_similarity](#jaccard_similarity)
+* [cosine_similarity](#cosine_similarity)
* [sparse_cosine_similarity](#sparse_cosine_similarity)
* [sparse_dot_product](#sparse_dot_product)
* [jaccard_similarity](#jaccard_similarity)
* [weighted_jaccard_similarity](#weighted_jaccard_similarity)
* [overlap_coefficient](#overlap_coefficient)
### Graph
#### floatsam_sparsification
Function using an iterative algorithm to try and find the best weight
threshold to apply to trim the given graph's edges while keeping the
underlying community structures.
It works by iteratively increasing the threshold and stopping as soon as
a significant connected component starts to drift away from the principal
one.
This is basically a very naive gradient descent with a very naive cost
function but it works decently for typical cases.
*Arguments*
* **graph** *nx.Graph*: Graph to sparsify.
* **starting_treshold** *float*: Starting similarity threshold.
* **learning_rate** *?float* [`0.05`]: How much to increase the threshold
at each step of the algorithm.
* **max_drifter_size** *?int*: Max size of component to detach itself
from the principal one before stopping the algorithm. If not
provided it will default to the logarithm of the graph's total
number of nodes.
* **weight** *?str* [`weight wrt networkx conventions`]: Name of the weight attribute.
* **remove_edges** *?bool* [`False`]: Whether to remove the edges whose
weight is below the threshold found. Note that if `True`, this will
mutate the given graph.
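The search described above can be sketched in plain Python as follows. This is a minimal illustration and not fog's actual implementation: `component_sizes` and `sketch_threshold_search` are hypothetical helpers, the graph is given as plain node and edge lists instead of an `nx.Graph`, and the stopping rule (stop once the second-largest component is bigger than `max_drifter_size`) is an assumption about what a "significant" drifting component means.

```python
import math
from collections import defaultdict


def component_sizes(nodes, edges, threshold):
    """Sizes of connected components, largest first, keeping only
    edges whose weight is >= threshold."""
    adjacency = defaultdict(set)
    for u, v, w in edges:
        if w >= threshold:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen = set()
    sizes = []
    for node in nodes:
        if node in seen:
            continue
        # Depth-first traversal of the component containing `node`.
        seen.add(node)
        stack = [node]
        size = 0
        while stack:
            current = stack.pop()
            size += 1
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        sizes.append(size)

    return sorted(sizes, reverse=True)


def sketch_threshold_search(nodes, edges, starting_threshold,
                            learning_rate=0.05, max_drifter_size=None):
    """Hypothetical sketch: raise the threshold step by step and return the
    last value tried before a large component detached."""
    nodes = list(nodes)
    if max_drifter_size is None:
        # Default taken from the documentation above: log of the node count.
        max_drifter_size = math.log(max(len(nodes), 1))

    max_weight = max((w for _, _, w in edges), default=starting_threshold)
    best = None
    threshold = starting_threshold

    while threshold <= max_weight:
        sizes = component_sizes(nodes, edges, threshold)
        # Assumption: the second-largest component is the "drifter".
        if len(sizes) > 1 and sizes[1] > max_drifter_size:
            break
        best = threshold
        threshold += learning_rate

    return best
```

For instance, on two tightly connected triangles joined by a single weak edge, the search should stop just before the weak edge would be cut, since cutting it detaches a whole triangle from the principal component.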
#### monopartite_projection
Function computing a monopartite projection of the given bipartite graph.
@@ -56,20 +87,10 @@ bipartite and for better performance.
### Metrics
-#### jaccard_similarity
-Function computing the Jaccard similarity. That is to say the intersection
-of input sets divided by their union.
-Runs in O(n), n being the size of the smallest set.
-```python
-from fog.metrics import jaccard_similarity
-# Basic
-jaccard_similarity('context', 'contact')
->>> ~0.571
-```
+#### cosine_similarity
+Function computing the cosine similarity of the given sequences.
+Runs in O(n), n being the sum of A & B's sizes.
*Arguments*
* **A** *iterable*: First sequence.
@@ -94,6 +115,36 @@ sparse_cosine_similarity({'apple': 34, 'pear': 3}, {'pear': 1, 'orange': 1})
* **A** *Counter*: First weighted set.
* **B** *Counter*: Second weighted set.
#### sparse_dot_product
Function used to compute the dot product of sparse weighted sets represented
by Python dicts.
Runs in O(n), n being the size of the smallest set.
*Arguments*
* **A** *Counter*: First weighted set.
* **B** *Counter*: Second weighted set.
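As an illustration of why the cost depends only on the smaller set, here is a minimal sketch of a sparse dot product over dicts. The name `sketch_sparse_dot_product` is hypothetical and this is not necessarily how fog implements it: it iterates over the smaller dict and looks each key up in the larger one.

```python
def sketch_sparse_dot_product(A, B):
    """Dot product of two sparse vectors stored as {key: weight} dicts."""
    # Iterate over the smaller dict so the loop runs min(len(A), len(B)) times.
    if len(A) > len(B):
        A, B = B, A

    # Keys missing from either dict contribute zero, so only shared keys count.
    return sum(weight * B[key] for key, weight in A.items() if key in B)
```

With `{'apple': 34, 'pear': 3}` and `{'pear': 1, 'orange': 1}`, only `'pear'` is shared, so the sketch returns `3 * 1 = 3`.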
#### jaccard_similarity
Function computing the Jaccard similarity. That is to say the intersection
of input sets divided by their union.
Runs in O(n), n being the size of the smallest set.
```python
from fog.metrics import jaccard_similarity
# Basic
jaccard_similarity('context', 'contact')
>>> ~0.571
```
*Arguments*
* **A** *iterable*: First sequence.
* **B** *iterable*: Second sequence.
#### weighted_jaccard_similarity
Function computing the weighted Jaccard similarity.
@@ -110,3 +161,14 @@ weighted_jaccard_similarity({'apple': 34, 'pear': 3}, {'pear': 1, 'orange': 1})
*Arguments*
* **A** *Counter*: First weighted set.
* **B** *Counter*: Second weighted set.
#### overlap_coefficient
Function computing the overlap coefficient of the given sets, i.e. the size
of their intersection divided by the size of the smallest set.
Runs in O(n), n being the size of the smallest set.
*Arguments*
* **A** *iterable*: First sequence.
* **B** *iterable*: Second sequence.
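The definition above can be sketched as follows. This is an illustrative version only: `sketch_overlap_coefficient` is a hypothetical name, and returning `0.0` when either input is empty is an assumption, since the documentation does not say how fog handles that case.

```python
def sketch_overlap_coefficient(A, B):
    """Size of the intersection divided by the size of the smaller set."""
    A, B = set(A), set(B)

    # Assumption: an empty input yields 0.0 rather than raising an error.
    if not A or not B:
        return 0.0

    # Scan the smaller set and test membership in the larger one.
    if len(A) > len(B):
        A, B = B, A

    intersection = sum(1 for item in A if item in B)
    return intersection / len(A)
```

For `'context'` and `'contact'`, the character sets share `c`, `o`, `n` and `t`, and the smaller set (`contact`) has 5 distinct characters, so the sketch gives `4 / 5 = 0.8`.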

@@ -16,19 +16,24 @@ DOCS = [
    {
        'title': 'Graph',
        'fns': [
            graph.floatsam_sparsification,
            graph.monopartite_projection
        ]
    },
    {
        'title': 'Metrics',
        'fns': [
-            metrics.jaccard_similarity,
+            metrics.cosine_similarity,
            metrics.sparse_cosine_similarity,
-            metrics.weighted_jaccard_similarity
+            metrics.sparse_dot_product,
            metrics.jaccard_similarity,
            metrics.weighted_jaccard_similarity,
            metrics.overlap_coefficient
        ]
    }
]
with open('./README.template.md') as f:
    TEMPLATE = f.read()