<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
2016-11-01 00:51:54 +00:00
# A Decomposable Attention Model for Natural Language Inference
2016-11-01 01:13:54 +00:00
**by Matthew Honnibal, [@honnibal](https://github.com/honnibal)**
2016-11-01 00:51:54 +00:00
This directory contains an implementation of entailment prediction model described
2016-11-01 01:13:54 +00:00
by [Parikh et al. (2016)](https://arxiv.org/pdf/1606.01933.pdf). The model is notable
for its competitive performance with very few parameters.
The model is implemented using [Keras](https://keras.io/) and [spaCy](https://spacy.io).
Keras is used to build and train the network, while spaCy is used to load
the [GloVe](http://nlp.stanford.edu/projects/glove/) vectors, perform the
feature extraction, and help you apply the model at run-time. The following
demo code shows how the entailment model can be used at runtime, once the
hook is installed to customise the `.similarity()` method of spaCy's `Doc`
and `Span` objects:
```python
def demo(model_dir):
    nlp = spacy.load('en', path=model_dir,
                     create_pipeline=create_similarity_pipeline)
    doc1 = nlp(u'Worst fries ever! Greasy and horrible...')
    doc2 = nlp(u'The milkshakes are good. The fries are bad.')
    print(doc1.similarity(doc2))
    sent1a, sent1b = doc1.sents
    print(sent1a.similarity(sent1b))
    print(sent1a.similarity(doc2))
    print(sent1b.similarity(doc2))
```
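
The hook mechanism itself is simple: a pipeline component can set entries in
`Doc.user_hooks` and `Doc.user_span_hooks`, and spaCy will route `.similarity()`
calls through whatever callable you register. Here's a minimal sketch of just
that mechanism, with a dummy scoring function standing in for the trained Keras
model; the names below are illustrative, not this example's actual code:

```python
import spacy

def dummy_entailment_score(text1, text2):
    # Stand-in for the trained model: a real component would featurize
    # both texts and return the network's prediction here.
    return 0.5

def install_similarity_hook(doc):
    # Any callable taking two arguments can back .similarity() this way.
    doc.user_hooks['similarity'] = dummy_entailment_score
    doc.user_span_hooks['similarity'] = dummy_entailment_score

def create_similarity_pipeline(nlp):
    # Run the standard components first, then install the hook on each Doc.
    return [nlp.tagger, nlp.entity, nlp.parser, install_similarity_hook]

nlp = spacy.load('en', create_pipeline=create_similarity_pipeline)
```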
I'm working on a blog post to explain Parikh et al.'s model in more detail.
I think it is a very interesting example of the attention mechanism, which
I didn't understand very well before working through this paper.
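
To give a feel for that attention step, here is a rough numpy sketch of the
soft alignment at the heart of the model: each word of one sentence gets a
softmax-weighted summary of the other sentence, based on pairwise affinities
between (projected) word vectors. The names and shapes are mine, the
feed-forward projection is omitted, and the paper's subsequent compare and
aggregate steps aren't shown:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(a, b):
    """Soft-align two sentences, given projected word vectors.

    a: (len_a, dim) premise vectors; b: (len_b, dim) hypothesis vectors.
    """
    scores = a.dot(b.T)                       # pairwise affinities e_ij
    beta = softmax(scores, axis=1).dot(b)     # summary of b for each word of a
    alpha = softmax(scores.T, axis=1).dot(a)  # summary of a for each word of b
    return beta, alpha

# Toy usage with random stand-ins for word vectors:
beta, alpha = attend(np.random.randn(5, 300), np.random.randn(7, 300))
print(beta.shape, alpha.shape)  # (5, 300) (7, 300)
```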
## How to run the example

### 1. Install spaCy and its English models (about 1GB of data)

```bash
pip install spacy
python -m spacy.en.download
```
This will give you spaCy's tokenization, tagging, NER and parsing models,
as well as the GloVe word vectors.
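
Once the download finishes, you can sanity-check that the models and vectors
are available (a quick check of my own, not part of this example):

```python
import spacy

nlp = spacy.load('en')
doc = nlp(u'London is a big city in the United Kingdom.')
print(doc[0].tag_)          # part-of-speech tag for 'London'
print(doc.ents)             # named entities
print(doc[0].vector.shape)  # dimensionality of the GloVe vector
```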
### 2. Install Keras
```bash
pip install keras
```
### 3. Get Keras working with your GPU
You're mostly on your own here. The only advice I can offer: if you're setting
up on AWS, try the AMI published by NVIDIA. With that image, getting everything
set up wasn't *too* painful.
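
One check worth doing before training: confirm which backend Keras picked up
and, for the Theano backend, which device it will compile for. A small
sanity-check snippet of my own, not part of this example:

```python
import keras.backend as K

print(K.backend())  # e.g. 'theano' or 'tensorflow'

if K.backend() == 'theano':
    import theano
    # Should report a GPU device if CUDA is configured correctly.
    print(theano.config.device)
```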
### 4. Test the Keras model

```bash
py.test keras_parikh_entailment/keras_decomposable_attention.py
```
This should tell you that two tests passed.
### 5. Download the Stanford Natural Language Inference data

Source: http://nlp.stanford.edu/projects/snli/
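
The corpus ships as JSON-lines files, one example per line, with the two
sentences under `sentence1` and `sentence2` and a `gold_label` of
`entailment`, `contradiction` or `neutral`; examples where annotators reached
no majority are labelled `-`. A minimal reader sketch (the integer encoding is
my own choice, not necessarily what this example uses):

```python
import json

LABELS = {'entailment': 0, 'contradiction': 1, 'neutral': 2}

def read_snli(path):
    # e.g. path = 'snli_1.0/snli_1.0_train.jsonl'
    with open(path) as file_:
        for line in file_:
            eg = json.loads(line)
            if eg['gold_label'] == '-':
                # No gold label was agreed on; standard practice is to skip.
                continue
            yield eg['sentence1'], eg['sentence2'], LABELS[eg['gold_label']]
```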
### 6. Train the model

```bash
python keras_parikh_entailment/ train <your_model_dir> <train_directory> <dev_directory>
```
Training takes about 300 epochs for full accuracy, and I haven't rerun the full
experiment since refactoring things to publish this example, so please let me
know if I've broken something. You should get to at least 85% accuracy on the
development data.
### 7. Evaluate the model (optional)

```bash
python keras_parikh_entailment/ evaluate <your_model_dir> <dev_directory>
```
### 8. Run the demo (optional)

```bash
python keras_parikh_entailment/ demo <your_model_dir>
```