Add docs for Vectors.most_similar [ci skip]

Ines Montani 2019-10-03 14:29:47 +02:00
parent 1db79a33cb
commit ce1d441de5
1 changed files with 23 additions and 0 deletions


@@ -303,6 +303,29 @@ vectors, they will be counted individually.
| ----------- | ---- | ------------------------------------ |
| **RETURNS** | int | The number of all keys in the table. |
## Vectors.most_similar {#most_similar tag="method"}
For each of the given vectors, find the `n` most similar entries in the table,
ranked by cosine similarity. Queries are given as vectors. Results are returned
as a `(keys, best_rows, scores)` tuple. If `queries` is large, the calculations
are performed in chunks to avoid consuming too much memory. You can set the
`batch_size` to control the speed/memory trade-off during the calculations.
> #### Example
>
> ```python
> import numpy
>
> queries = numpy.asarray([numpy.random.uniform(-1, 1, (300,))])
> most_similar = nlp.vectors.most_similar(queries, n=10)
> ```
| Name | Type | Description |
| ------------ | --------- | ------------------------------------------------------------------ |
| `queries` | `ndarray` | An array with one or more vectors. |
| `batch_size` | int | The batch size to use. Defaults to `1024`. |
| `n` | int | The number of entries to return for each query. Defaults to `1`. |
| `sort` | bool | Whether to sort the entries returned by score. Defaults to `True`. |
| **RETURNS** | tuple | The most similar entries as a `(keys, best_rows, scores)` tuple. |
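
To make the chunked cosine lookup described above more concrete, here is a minimal NumPy sketch of the general technique. It is not the library's implementation: the function name `most_similar_sketch` and the plain `table` array are hypothetical, it returns only row indices and scores (no keys), and it assumes that no vector has zero length and that `n` is at most the number of rows in `table`.

```python
import numpy


def most_similar_sketch(table, queries, n=1, batch_size=1024, sort=True):
    """Return (best_rows, scores) for the n rows of `table` closest to each query."""
    # Normalize to unit length so that a dot product equals cosine similarity.
    table_norm = table / numpy.linalg.norm(table, axis=1, keepdims=True)
    query_norm = queries / numpy.linalg.norm(queries, axis=1, keepdims=True)
    best_rows = numpy.zeros((queries.shape[0], n), dtype="int64")
    scores = numpy.zeros((queries.shape[0], n), dtype="float64")
    # Work through the queries in chunks so the similarity matrix stays small.
    for start in range(0, queries.shape[0], batch_size):
        end = start + batch_size
        sims = query_norm[start:end] @ table_norm.T
        # argpartition selects the top n per query without a full sort.
        top = numpy.argpartition(sims, -n, axis=1)[:, -n:]
        top_scores = numpy.take_along_axis(sims, top, axis=1)
        if sort:
            # Order each query's results from highest to lowest score.
            order = numpy.argsort(-top_scores, axis=1)
            top = numpy.take_along_axis(top, order, axis=1)
            top_scores = numpy.take_along_axis(top_scores, order, axis=1)
        best_rows[start:end] = top
        scores[start:end] = top_scores
    return best_rows, scores


table = numpy.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
queries = numpy.array([[2.0, 0.0], [0.0, 3.0]])
best_rows, scores = most_similar_sketch(table, queries, n=2, batch_size=1)
```

In this sketch, a smaller `batch_size` keeps the intermediate similarity matrix small at the cost of more loop iterations, and a larger one does the opposite.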
## Vectors.from_glove {#from_glove tag="method"}
Load [GloVe](https://nlp.stanford.edu/projects/glove/) vectors from a directory.