docs: minor spelling tweaks (#5022)

This commit is contained in:
brett koonce 2020-12-08 15:27:43 -06:00 committed by GitHub
parent 6d2aeff26a
commit ddd3eda26f
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 14 additions and 14 deletions

View File

@ -21,7 +21,7 @@ To link up arbitrary hardware, implement your own Accelerator subclass
class MyAccelerator(Accelerator):
def __init__(self, trainer, cluster_environment=None):
super().__init__(trainer, cluster_environment)
self.nickname = 'my_accelator'
self.nickname = 'my_accelerator'
def setup(self):
# find local rank, etc, custom things to implement

View File

@ -324,13 +324,13 @@ that are included with NeMo:
- `Language Modeling (BERT Pretraining) <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/01_Pretrained_Language_Models_for_Downstream_Tasks.ipynb>`_
- `Question Answering <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/Question_Answering_Squad.ipynb>`_
- `Text Classification <https://github.com/NVIDIA/NeMo/tree/v1.0.0b1/examples/nlp/text_classification>`_ (including Sentiment Analysis)
- `Token Classifcation <https://github.com/NVIDIA/NeMo/tree/v1.0.0b1/examples/nlp/token_classification>`_ (including Named Entity Recognition)
- `Token Classification <https://github.com/NVIDIA/NeMo/tree/v1.0.0b1/examples/nlp/token_classification>`_ (including Named Entity Recognition)
- `Punctuation and Capitalization <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/Punctuation_and_Capitalization.ipynb>`_
Named Entity Recognition (NER)
------------------------------
NER (or more generally token classifcation) is the NLP task of detecting and classifying key information (entities) in text.
NER (or more generally token classification) is the NLP task of detecting and classifying key information (entities) in text.
This task is very popular in Healthcare and Finance. In finance, for example, it can be important to identify
geographical, geopolitical, organizational, persons, events, and natural phenomenon entities.
See this `NER notebook <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/Token_Classification_Named_Entity_Recognition.ipynb>`_
@ -435,7 +435,7 @@ Hydra makes every aspect of the NeMo model, including the PyTorch Lightning Trai
Tokenizers
----------
Tokenization is the process of converting natural langauge text into integer arrays
Tokenization is the process of converting natural language text into integer arrays
which can be used for machine learning.
For NLP tasks, tokenization is an essential part of data preprocessing.
NeMo supports all BERT-like model tokenizers from
@ -462,7 +462,7 @@ Much of the state-of-the-art in natural language processing is achieved
by fine-tuning pretrained language models on the downstream task.
With NeMo, you can either `pretrain <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/nlp/language_modeling/bert_pretraining.py>`_
a BERT model on your data or use a pretrained lanugage model from `HuggingFace Transformers <https://github.com/huggingface/transformers>`_
a BERT model on your data or use a pretrained language model from `HuggingFace Transformers <https://github.com/huggingface/transformers>`_
or `NVIDIA Megatron-LM <https://github.com/NVIDIA/Megatron-LM>`_.
To see the list of language models available in NeMo:

View File

@ -46,7 +46,7 @@ Example 1: Pretrained, prebuilt models
Example 2: Extend for faster research
-------------------------------------
Bolts are contributed with benchmarks and continuous-integration tests. This means
you can trust the implementations and use them to bootstrap your resarch much faster.
you can trust the implementations and use them to bootstrap your research much faster.
.. code-block:: python

View File

@ -10,7 +10,7 @@ Loggers
*******
Lightning supports the most popular logging frameworks (TensorBoard, Comet, etc...). TensorBoard is used by default,
but you can pass to the :class:`~pytorch_lightning.trainer.trainer.Trainer` any combintation of the following loggers.
but you can pass to the :class:`~pytorch_lightning.trainer.trainer.Trainer` any combination of the following loggers.
.. note::

View File

@ -102,7 +102,7 @@ method of the trainer. A typical example of this would look like
trainer.fit(model)
The figure produced by ``lr_finder.plot()`` should look something like the figure
below. It is recommended to not pick the learning rate that achives the lowest
below. It is recommended to not pick the learning rate that achieves the lowest
loss, but instead something in the middle of the sharpest downward slope (red point).
This is the point returned py ``lr_finder.suggestion()``.

View File

@ -17,7 +17,7 @@ common metric implementations.
The metrics API provides ``update()``, ``compute()``, ``reset()`` functions to the user. The metric base class inherits
``nn.Module`` which allows us to call ``metric(...)`` directly. The ``forward()`` method of the base ``Metric`` class
serves the dual purpose of calling ``update()`` on its input and simultanously returning the value of the metric over the
serves the dual purpose of calling ``update()`` on its input and simultaneously returning the value of the metric over the
provided input.
These metrics work with DDP in PyTorch and PyTorch Lightning by default. When ``.compute()`` is called in

View File

@ -224,7 +224,7 @@ The accelerator backend to use (previously known as distributed_backend).
- (```ddp```) is DistributedDataParallel (each gpu on each node trains, and syncs grads)
- (```ddp_cpu```) is DistributedDataParallel on CPU (same as `ddp`, but does not use GPUs.
Useful for multi-node CPU training or single-node debugging. Note that this will **not** give
a speedup on a single node, since Torch already makes effient use of multiple CPUs on a single
a speedup on a single node, since Torch already makes efficient use of multiple CPUs on a single
machine.)
- (```ddp2```) dp on node, ddp across nodes. Useful for things like increasing
the number of negative samples
@ -982,7 +982,7 @@ Number of processes to train with. Automatically set to the number of GPUs
when using ``accelerator="ddp"``. Set to a number greater than 1 when
using ``accelerator="ddp_cpu"`` to mimic distributed training on a
machine without GPUs. This is useful for debugging, but **will not** provide
any speedup, since single-process Torch already makes effient use of multiple
any speedup, since single-process Torch already makes efficient use of multiple
CPUs.
.. testcode::

View File

@ -110,11 +110,11 @@ The algorithm in short works by:
2. Iteratively until convergence or maximum number of tries `max_trials` (default 25) has been reached:
- Call `fit()` method of trainer. This evaluates `steps_per_trial` (default 3) number of
training steps. Each training step can trigger an OOM error if the tensors
(training batch, weights, gradients ect.) allocated during the steps have a
(training batch, weights, gradients, etc.) allocated during the steps have a
too large memory footprint.
- If an OOM error is encountered, decrease batch size else increase it.
How much the batch size is increased/decreased is determined by the choosen
stratrgy.
How much the batch size is increased/decreased is determined by the chosen
strategy.
3. The found batch size is saved to either `model.batch_size` or `model.hparams.batch_size`
4. Restore the initial state of model and trainer