Remove orphaned docs pages (#19555)
This commit is contained in:
parent d2ab93e8a9
commit 8549a932f7

@@ -1,14 +1,14 @@
 :orphan:

 Ecosystem CI
-============
+############

 `Ecosystem CI <https://github.com/Lightning-AI/ecosystem-ci>`_ automates issue discovery for your projects against Lightning nightly and releases.
 It is a lightweight repository that provides easy configuration of Continuous Integration running on CPUs and GPUs.
 Any user who wants to keep their project aligned with current and future Lightning releases can use the Ecosystem CI to configure their integrations.
 Read more: `Stay Ahead of Breaking Changes with the New Lightning Ecosystem CI <https://devblog.pytorchlightning.ai/stay-ahead-of-breaking-changes-with-the-new-lightning-ecosystem-ci-b7e1cf78a6c7>`_

---------------
+----

 ***********************
 Integrate a New Project

@@ -7,6 +7,7 @@

    ../generated/CONTRIBUTING.md
    ../generated/BECOMING_A_CORE_CONTRIBUTOR.md
    governance
+   ecosystem-ci
    ../versioning
    ../past_versions
    ../generated/CHANGELOG.md

@@ -35,7 +36,7 @@ Community
       :height: 100

    .. displayitem::
-      :header: How to Become a core contributor
+      :header: How to become a core contributor
       :description: Steps to be a core contributor
       :col_css: col-md-12
       :button_link: ../generated/BECOMING_A_CORE_CONTRIBUTOR.html

@@ -69,6 +70,13 @@ Community
       :button_link: ../generated/CHANGELOG.html
       :height: 100

+   .. displayitem::
+      :header: Ecosystem CI
+      :description: Automate issue discovery for your projects against Lightning nightly and releases
+      :col_css: col-md-12
+      :button_link: ecosystem-ci.html
+      :height: 100
+
 .. raw:: html

    </div>

@@ -1,817 +0,0 @@
:orphan:

#################
Conversational AI
#################

These ecosystems help with Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS).

----

****
NeMo
****

`NVIDIA NeMo <https://github.com/NVIDIA/NeMo>`_ is a toolkit for building new state-of-the-art
Conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR),
Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of
prebuilt modules that include everything needed to train on your data.
Every module can easily be customized, extended, and composed to create new Conversational AI
model architectures.

Conversational AI architectures are typically very large and require a lot of data and compute
for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node
mixed-precision training.

.. note:: Every NeMo model is a LightningModule that comes equipped with all supporting infrastructure for training and reproducibility.

----------

NeMo Models
===========

NeMo models contain everything needed to train and reproduce state-of-the-art Conversational AI
research and applications, including:

- neural network architectures
- datasets/data loaders
- data preprocessing/postprocessing
- data augmentors
- optimizers and schedulers
- tokenizers
- language models

NeMo uses `Hydra <https://hydra.cc/>`_ for configuring both NeMo models and the PyTorch Lightning Trainer.
Depending on the domain and application, many different AI libraries have to be configured
to build the application. Hydra makes it easy to bring all of these libraries together
so that each can be configured from .yaml files or the Hydra CLI.

.. note:: Every NeMo model has an example configuration file and a corresponding script that contains all configurations needed for training.
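
As an illustrative sketch of the CLI side, any value in those config files can be overridden straight from the command line (the script name is taken from the ASR example later on this page; ``model.optim.lr`` is an assumed config key):

.. code-block:: bash

    # override Trainer and model settings of an example script from the CLI
    python examples/asr/speech_to_text.py trainer.max_epochs=10 model.optim.lr=0.01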

The end result of using NeMo, PyTorch Lightning, and Hydra is that
NeMo models all have the same look and feel. This makes it easy to do Conversational AI research
across multiple domains. NeMo models are also fully compatible with the PyTorch ecosystem.

Installing NeMo
---------------

Before installing NeMo, please install Cython first.

.. code-block:: bash

    pip install Cython

For ASR and TTS models, also install these Linux utilities.

.. code-block:: bash

    apt-get update && apt-get install -y libsndfile1 ffmpeg

Installing the latest NeMo release is then a simple pip install.

.. code-block:: bash

    pip install nemo_toolkit[all]==1.0.0b1

To install the main branch from GitHub:

.. code-block:: bash

    python -m pip install git+https://github.com/NVIDIA/NeMo.git@main#egg=nemo_toolkit[all]

To install from a local clone of NeMo:

.. code-block:: bash

    ./reinstall.sh  # from the root of the cloned NeMo repository

For Docker users, the NeMo container is available on
`NGC <https://catalog.ngc.nvidia.com/orgs/nvidia/collections/nemotrainingframework>`_.

.. code-block:: bash

    docker pull nvcr.io/nvidia/nemo:v1.0.0b1

.. code-block:: bash

    docker run --runtime=nvidia -it --rm --shm-size=8g -p 8888:8888 -p 6006:6006 --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/nemo:v1.0.0b1

Experiment Manager
------------------

NeMo's Experiment Manager leverages PyTorch Lightning for model checkpointing,
TensorBoard logging, and Weights & Biases logging. The Experiment Manager is included by default
in all NeMo example scripts.

.. code-block:: python

    exp_manager(trainer, cfg.get("exp_manager", None))

It is configurable via .yaml with Hydra.

.. code-block:: yaml

    exp_manager:
        exp_dir: null  # where logs are saved; defaults to ./nemo_experiments
        name: *name  # reuses the model name defined elsewhere in the config via a YAML anchor
        create_tensorboard_logger: True
        create_checkpoint_callback: True

Optionally, launch TensorBoard to view the training results, stored in ./nemo_experiments by default.

.. code-block:: bash

    tensorboard --bind_all --logdir nemo_experiments

--------

Automatic Speech Recognition (ASR)
==================================

Everything needed to train convolutional ASR models is included with NeMo.
NeMo supports multiple speech recognition architectures, including Jasper and QuartzNet.
`NeMo Speech Models <https://ngc.nvidia.com/catalog/models/nvidia:nemospeechmodels>`_
can be trained from scratch on custom datasets, or fine-tuned from pre-trained checkpoints
trained on thousands of hours of audio that can be restored for immediate use.

Some typical ASR tasks are included with NeMo:

- `Audio transcription <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/asr/01_ASR_with_NeMo.ipynb>`_
- `Byte Pair/Word Piece Training <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/asr/speech_to_text_bpe.py>`_
- `Speech Commands <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/asr/03_Speech_Commands.ipynb>`_
- `Voice Activity Detection <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/asr/06_Voice_Activiy_Detection.ipynb>`_
- `Speaker Recognition <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/speaker_recognition/speaker_reco.py>`_

See this `ASR notebook <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/asr/01_ASR_with_NeMo.ipynb>`_
for a full tutorial on doing ASR with NeMo, PyTorch Lightning, and Hydra.

Specify ASR Model Configurations with YAML File
-----------------------------------------------

NeMo models and the PyTorch Lightning Trainer can be fully configured from .yaml files using Hydra.

See this `ASR config <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/asr/conf/config.yaml>`_
for the entire speech-to-text .yaml file.

.. code-block:: yaml

    # configure the PyTorch Lightning Trainer
    trainer:
        gpus: 0  # number of gpus
        max_epochs: 5
        max_steps: null  # computed at runtime if not set
        num_nodes: 1
        accelerator: ddp
    ...
    # configure the ASR model
    model:
        ...
        encoder:
            cls: nemo.collections.asr.modules.ConvASREncoder
            params:
                feat_in: *n_mels  # resolved via a YAML anchor defined earlier in the config
                activation: relu
                conv_mask: true

                jasper:
                    - filters: 128
                      repeat: 1
                      kernel: [11]
                      stride: [1]
                      dilation: [1]
                      dropout: *dropout
    ...
    # all other configuration: data, optimizer, preprocessor, etc.
    ...

Developing an ASR Model From Scratch
------------------------------------

`speech_to_text.py <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/asr/speech_to_text.py>`_

.. code-block:: python

    # hydra_runner calls hydra.main and is useful for multi-node experiments
    @hydra_runner(config_path="conf", config_name="config")
    def main(cfg):
        trainer = Trainer(**cfg.trainer)
        asr_model = EncDecCTCModel(cfg.model, trainer)
        trainer.fit(asr_model)

Hydra makes every aspect of the NeMo model,
including the PyTorch Lightning Trainer, customizable from the command line.

.. code-block:: bash

    python NeMo/examples/asr/speech_to_text.py --config-name=quartznet_15x5 \
        trainer.accelerator=gpu \
        trainer.devices=4 \
        trainer.max_epochs=128 \
        +trainer.precision=16 \
        model.train_ds.manifest_filepath=<PATH_TO_DATA>/librispeech-train-all.json \
        model.validation_ds.manifest_filepath=<PATH_TO_DATA>/librispeech-dev-other.json \
        model.train_ds.batch_size=64 \
        +model.validation_ds.num_workers=16 \
        +model.train_ds.num_workers=16

.. note:: Training NeMo ASR models can take days or weeks, so it is highly recommended to use multiple GPUs and multiple nodes with the PyTorch Lightning Trainer.

Using a State-of-the-Art Pre-trained ASR Model
----------------------------------------------

Transcribe audio with the QuartzNet model, pretrained on ~3300 hours of audio.

.. code-block:: python

    quartznet = EncDecCTCModel.from_pretrained("QuartzNet15x5Base-En")

    files = ["path/to/my.wav"]  # file duration should be less than 25 seconds

    for fname, transcription in zip(files, quartznet.transcribe(paths2audio_files=files)):
        print(f"Audio in {fname} was recognized as: {transcription}")

To see the available pretrained checkpoints:

.. code-block:: python

    EncDecCTCModel.list_available_models()

NeMo ASR Model Under the Hood
-----------------------------

Any aspect of ASR training or model architecture design can easily be customized
with PyTorch Lightning, since every NeMo model is a LightningModule.

.. code-block:: python

    class EncDecCTCModel(ASRModel):
        """Base class for encoder-decoder CTC-based models."""

        ...

        def forward(self, input_signal, input_signal_length):
            processed_signal, processed_signal_len = self.preprocessor(
                input_signal=input_signal,
                length=input_signal_length,
            )
            # Spec augment is not applied during evaluation/testing
            if self.spec_augmentation is not None and self.training:
                processed_signal = self.spec_augmentation(input_spec=processed_signal)
            encoded, encoded_len = self.encoder(audio_signal=processed_signal, length=processed_signal_len)
            log_probs = self.decoder(encoder_output=encoded)
            greedy_predictions = log_probs.argmax(dim=-1, keepdim=False)
            return log_probs, encoded_len, greedy_predictions

        # PTL-specific methods
        def training_step(self, batch, batch_nb):
            audio_signal, audio_signal_len, transcript, transcript_len = batch
            log_probs, encoded_len, predictions = self.forward(
                input_signal=audio_signal, input_signal_length=audio_signal_len
            )
            loss_value = self.loss(
                log_probs=log_probs, targets=transcript, input_lengths=encoded_len, target_lengths=transcript_len
            )
            wer_num, wer_denom = self._wer(predictions, transcript, transcript_len)
            self.log_dict(
                {
                    "train_loss": loss_value,
                    "training_batch_wer": wer_num / wer_denom,
                    "learning_rate": self._optimizer.param_groups[0]["lr"],
                }
            )
            return loss_value

Neural Types in NeMo ASR
------------------------

NeMo models and neural modules come with neural type checking.
Neural type checking is extremely useful when combining many different neural
network architectures for a production-grade application.

.. code-block:: python

    @property
    def input_types(self) -> Optional[Dict[str, NeuralType]]:
        if hasattr(self.preprocessor, "_sample_rate"):
            audio_eltype = AudioSignal(freq=self.preprocessor._sample_rate)
        else:
            audio_eltype = AudioSignal()
        return {
            "input_signal": NeuralType(("B", "T"), audio_eltype),
            "input_signal_length": NeuralType(tuple("B"), LengthsType()),
        }

    @property
    def output_types(self) -> Optional[Dict[str, NeuralType]]:
        return {
            "outputs": NeuralType(("B", "T", "D"), LogprobsType()),
            "encoded_lengths": NeuralType(tuple("B"), LengthsType()),
            "greedy_predictions": NeuralType(("B", "T"), LabelsType()),
        }

--------

Natural Language Processing (NLP)
=================================

Everything needed to fine-tune BERT-like language models for NLP tasks is included with NeMo.
`NeMo NLP Models <https://ngc.nvidia.com/catalog/models/nvidia:nemonlpmodels>`_
include `HuggingFace Transformers <https://github.com/huggingface/transformers>`_
and `NVIDIA Megatron-LM <https://github.com/NVIDIA/Megatron-LM>`_ BERT and Bio-Megatron models.
NeMo can also be used for pretraining BERT-based language models from HuggingFace.

Any of the HuggingFace encoders or Megatron-LM encoders can easily be used for the NLP tasks
that are included with NeMo:

- `GLUE Benchmark (All tasks) <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/GLUE_Benchmark.ipynb>`_
- `Intent Slot Classification <https://github.com/NVIDIA/NeMo/tree/v1.0.0b1/examples/nlp/intent_slot_classification>`_
- `Language Modeling (BERT Pretraining) <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/01_Pretrained_Language_Models_for_Downstream_Tasks.ipynb>`_
- `Question Answering <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/Question_Answering_Squad.ipynb>`_
- `Text Classification <https://github.com/NVIDIA/NeMo/tree/v1.0.0b1/examples/nlp/text_classification>`_ (including Sentiment Analysis)
- `Token Classification <https://github.com/NVIDIA/NeMo/tree/v1.0.0b1/examples/nlp/token_classification>`_ (including Named Entity Recognition)
- `Punctuation and Capitalization <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/Punctuation_and_Capitalization.ipynb>`_

Named Entity Recognition (NER)
------------------------------

NER (or, more generally, token classification) is the NLP task of detecting and classifying key information (entities) in text.
This task is very popular in healthcare and finance. In finance, for example, it can be important to identify
geographical, geopolitical, organizational, person, event, and natural-phenomenon entities.
See this `NER notebook <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/Token_Classification_Named_Entity_Recognition.ipynb>`_
for a full tutorial on doing NER with NeMo, PyTorch Lightning, and Hydra.

Specify NER Model Configurations with YAML File
-----------------------------------------------

.. note:: NeMo models and the PyTorch Lightning Trainer can be fully configured from .yaml files using Hydra.

See this `token classification config <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/nlp/token_classification/conf/token_classification_config.yaml>`_
for the entire NER (token classification) .yaml file.

.. code-block:: yaml

    # configure any argument of the PyTorch Lightning Trainer
    trainer:
        gpus: 1  # the number of gpus, 0 for CPU
        num_nodes: 1
        max_epochs: 5
        ...
    # configure any aspect of the token classification model here
    model:
        dataset:
            data_dir: ???  # /path/to/data
            class_balancing: null  # choose from [null, weighted_loss]; weighted_loss enables weighted class balancing of the loss and may be used for handling unbalanced classes
            max_seq_length: 128
            ...
        tokenizer:
            tokenizer_name: ${model.language_model.pretrained_model_name}  # or sentencepiece
            vocab_file: null  # path to vocab file
            ...
        # the language model can be from HuggingFace or Megatron-LM
        language_model:
            pretrained_model_name: bert-base-uncased
            lm_checkpoint: null
            ...
        # the classifier for the downstream task
        head:
            num_fc_layers: 2
            fc_dropout: 0.5
            activation: 'relu'
            ...
    # all other configuration: train/val/test data, optimizer, experiment manager, etc.
    ...

Developing an NER Model From Scratch
------------------------------------

`token_classification.py <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/nlp/token_classification/token_classification.py>`_

.. code-block:: python

    # hydra_runner calls hydra.main and is useful for multi-node experiments
    @hydra_runner(config_path="conf", config_name="token_classification_config")
    def main(cfg: DictConfig) -> None:
        trainer = L.Trainer(**cfg.trainer)
        model = TokenClassificationModel(cfg.model, trainer=trainer)
        trainer.fit(model)

After training, we can run inference with the saved NER model using PyTorch Lightning.

Inference from file:

.. code-block:: python

    use_gpu = cfg.trainer.gpus != 0
    trainer = L.Trainer(accelerator="gpu" if use_gpu else "cpu", devices=1)
    model.set_trainer(trainer)
    model.evaluate_from_file(
        text_file=os.path.join(cfg.model.dataset.data_dir, cfg.model.validation_ds.text_file),
        labels_file=os.path.join(cfg.model.dataset.data_dir, cfg.model.validation_ds.labels_file),
        output_dir=exp_dir,
        add_confusion_matrix=True,
        normalize_confusion_matrix=True,
    )

Or we can run inference on a few examples:

.. code-block:: python

    queries = ["we bought four shirts from the nvidia gear store in santa clara.", "Nvidia is a company in Santa Clara."]
    results = model.add_predictions(queries)

    for query, result in zip(queries, results):
        logging.info(f"Query : {query}")
        logging.info(f"Result: {result.strip()}\n")

Hydra makes every aspect of the NeMo model, including the PyTorch Lightning Trainer, customizable from the command line.

.. code-block:: bash

    python token_classification.py \
        model.language_model.pretrained_model_name=bert-base-cased \
        model.head.num_fc_layers=2 \
        model.dataset.data_dir=/path/to/my/data \
        trainer.max_epochs=5 \
        trainer.accelerator=gpu \
        trainer.devices=[0,1]

-----------

Tokenizers
----------

Tokenization is the process of converting natural language text into arrays of integer tokens
that can be used for machine learning.
For NLP tasks, tokenization is an essential part of data preprocessing.
NeMo supports all BERT-like model tokenizers from
`HuggingFace's AutoTokenizer <https://huggingface.co/transformers/model_doc/auto.html#autotokenizer>`_
and also supports `Google's SentencePieceTokenizer <https://github.com/google/sentencepiece>`_,
which can be trained on custom data.

To see the list of supported tokenizers:

.. code-block:: python

    from nemo.collections import nlp as nemo_nlp

    nemo_nlp.modules.get_tokenizer_list()

See this `tokenizer notebook <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/02_NLP_Tokenizers.ipynb>`_
for a full tutorial on using tokenizers in NeMo.
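
As a minimal sketch of what instantiating one of these tokenizers looks like (this assumes NeMo's ``get_tokenizer`` helper and the ``text_to_ids`` method of its tokenizer interface; check the notebook above for the exact API):

.. code-block:: python

    from nemo.collections import nlp as nemo_nlp

    # build a tokenizer by name; a supported HuggingFace AutoTokenizer name works here
    tokenizer = nemo_nlp.modules.get_tokenizer(tokenizer_name="bert-base-uncased")

    # text -> integer token ids, the representation NLP models consume
    ids = tokenizer.text_to_ids("NeMo and Lightning work well together.")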

Language Models
---------------

Language models are used to extract information from (tokenized) text.
Much of the state-of-the-art in natural language processing is achieved
by fine-tuning pretrained language models on the downstream task.

With NeMo, you can either `pretrain <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/nlp/language_modeling/bert_pretraining.py>`_
a BERT model on your data or use a pretrained language model from `HuggingFace Transformers <https://github.com/huggingface/transformers>`_
or `NVIDIA Megatron-LM <https://github.com/NVIDIA/Megatron-LM>`_.

To see the list of language models available in NeMo:

.. code-block:: python

    nemo_nlp.modules.get_pretrained_lm_models_list(include_external=True)

Easily switch between any language model in the above list by using ``.get_lm_model``.

.. code-block:: python

    nemo_nlp.modules.get_lm_model(pretrained_model_name="distilbert-base-uncased")

See this `language model notebook <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/nlp/01_Pretrained_Language_Models_for_Downstream_Tasks.ipynb>`_
for a full tutorial on using pretrained language models in NeMo.

Using a Pre-trained NER Model
-----------------------------

NeMo has pre-trained NER models that can be used
to get started with token classification right away.
Models are automatically downloaded from NGC,
cached locally to disk,
and loaded into GPU memory using the ``.from_pretrained`` method.

.. code-block:: python

    # load the pre-trained NER model
    pretrained_ner_model = TokenClassificationModel.from_pretrained(model_name="NERModel")

    # define the list of queries for inference
    queries = [
        "we bought four shirts from the nvidia gear store in santa clara.",
        "Nvidia is a company.",
        "The Adventures of Tom Sawyer by Mark Twain is an 1876 novel about a young boy growing "
        + "up along the Mississippi River.",
    ]
    results = pretrained_ner_model.add_predictions(queries)

    for query, result in zip(queries, results):
        print()
        print(f"Query : {query}")
        print(f"Result: {result.strip()}\n")

NeMo NER Model Under the Hood
-----------------------------

Any aspect of NLP training or model architecture design can easily be customized with PyTorch Lightning,
since every NeMo model is a LightningModule.

.. code-block:: python

    class TokenClassificationModel(ModelPT):
        """
        Token Classification Model with BERT, applicable for tasks such as Named Entity Recognition
        """

        ...

        def forward(self, input_ids, token_type_ids, attention_mask):
            hidden_states = self.bert_model(
                input_ids=input_ids, token_type_ids=token_type_ids, attention_mask=attention_mask
            )
            logits = self.classifier(hidden_states=hidden_states)
            return logits

        # PTL-specific methods
        def training_step(self, batch, batch_idx):
            """
            Lightning calls this inside the training loop with the data from the training dataloader
            passed in as `batch`.
            """
            input_ids, input_type_ids, input_mask, subtokens_mask, loss_mask, labels = batch
            logits = self(input_ids=input_ids, token_type_ids=input_type_ids, attention_mask=input_mask)

            loss = self.loss(logits=logits, labels=labels, loss_mask=loss_mask)
            self.log_dict({"train_loss": loss, "lr": self._optimizer.param_groups[0]["lr"]})
            return loss

        ...

Neural Types in NeMo NLP
------------------------

NeMo models and neural modules come with neural type checking.
Neural type checking is extremely useful when combining many different neural network architectures
for a production-grade application.

.. code-block:: python

    @property
    def input_types(self) -> Optional[Dict[str, NeuralType]]:
        return self.bert_model.input_types

    @property
    def output_types(self) -> Optional[Dict[str, NeuralType]]:
        return self.classifier.output_types

--------

Text-To-Speech (TTS)
====================

Everything needed to train TTS models and generate audio is included with NeMo.
`NeMo TTS Models <https://ngc.nvidia.com/catalog/models/nvidia:nemottsmodels>`_
can be trained from scratch on your own data, or pretrained models can be downloaded
automatically. NeMo currently supports a two-step inference procedure.
First, a model is used to generate a mel spectrogram from text.
Second, a model is used to generate audio from a mel spectrogram.

Mel Spectrogram Generators:

- `Tacotron 2 <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/tts/tacotron2.py>`_
- `Glow-TTS <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/tts/glow_tts.py>`_

Audio Generators:

- Griffin-Lim
- `WaveGlow <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/tts/waveglow.py>`_
- `SqueezeWave <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/tts/squeezewave.py>`_

Specify TTS Model Configurations with YAML File
-----------------------------------------------

.. note:: NeMo models and the PyTorch Lightning Trainer can be fully configured from .yaml files using Hydra.

`tts/conf/glow_tts.yaml <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/tts/conf/glow_tts.yaml>`_

.. code-block:: yaml

    # configure the PyTorch Lightning Trainer
    trainer:
        gpus: -1  # number of gpus (-1 means use all available)
        max_epochs: 350
        num_nodes: 1
        accelerator: ddp
        ...

    # configure the TTS model
    model:
        ...
        encoder:
            cls: nemo.collections.tts.modules.glow_tts.TextEncoder
            params:
                n_vocab: 148
                out_channels: *n_mels
                hidden_channels: 192
                filter_channels: 768
                filter_channels_dp: 256
                ...
    # all other configuration: data, optimizer, parser, preprocessor, etc.
    ...

Developing a TTS Model From Scratch
-----------------------------------

`tts/glow_tts.py <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/examples/tts/glow_tts.py>`_

.. code-block:: python

    # hydra_runner calls hydra.main and is useful for multi-node experiments
    @hydra_runner(config_path="conf", config_name="glow_tts")
    def main(cfg):
        trainer = L.Trainer(**cfg.trainer)
        model = GlowTTSModel(cfg=cfg.model, trainer=trainer)
        trainer.fit(model)

Hydra makes every aspect of the NeMo model, including the PyTorch Lightning Trainer, customizable from the command line.

.. code-block:: bash

    python NeMo/examples/tts/glow_tts.py \
        trainer.accelerator=gpu \
        trainer.devices=4 \
        trainer.max_epochs=400 \
        ...
        train_dataset=/path/to/train/data \
        validation_datasets=/path/to/val/data \
        model.train_ds.batch_size=64

.. note:: Training NeMo TTS models from scratch can take days or weeks, so it is highly recommended to use multiple GPUs and multiple nodes with the PyTorch Lightning Trainer.

Using a State-of-the-Art Pre-trained TTS Model
----------------------------------------------

Generate speech using models trained on `LJSpeech <https://keithito.com/LJ-Speech-Dataset/>`_,
around 24 hours of single-speaker data.

See this `TTS notebook <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/tutorials/tts/1_TTS_inference.ipynb>`_
for a full tutorial on generating speech with NeMo, PyTorch Lightning, and Hydra.

.. code-block:: python

    # load the pretrained spectrogram generator
    spec_gen = SpecModel.from_pretrained("GlowTTS-22050Hz").cuda()

    # load the pretrained vocoder
    vocoder = WaveGlowModel.from_pretrained("WaveGlow-22050Hz").cuda()


    def infer(spec_gen_model, vocoder_model, str_input):
        with torch.no_grad():
            parsed = spec_gen_model.parse(str_input)
            spectrogram = spec_gen_model.generate_spectrogram(tokens=parsed)
            audio = vocoder_model.convert_spectrogram_to_audio(spec=spectrogram)
        if isinstance(spectrogram, torch.Tensor):
            spectrogram = spectrogram.to("cpu").numpy()
        if len(spectrogram.shape) == 3:
            spectrogram = spectrogram[0]
        if isinstance(audio, torch.Tensor):
            audio = audio.to("cpu").numpy()
        return spectrogram, audio


    text_to_generate = input("Input what you want the model to say: ")
    spec, audio = infer(spec_gen, vocoder, text_to_generate)

To see the available pretrained checkpoints:

.. code-block:: python

    # spec generator
    GlowTTSModel.list_available_models()

    # vocoder
    WaveGlowModel.list_available_models()

NeMo TTS Model Under the Hood
-----------------------------

Any aspect of TTS training or model architecture design can easily
be customized with PyTorch Lightning, since every NeMo model is a LightningModule.

`glow_tts.py <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/nemo/collections/tts/models/glow_tts.py>`_

.. code-block:: python

    class GlowTTSModel(SpectrogramGenerator):
        """
        GlowTTS model used to generate spectrograms from text
        Consists of a text encoder and an invertible spectrogram decoder
        """

        ...

        # NeMo models come with neural type checking
        @typecheck(
            input_types={
                "x": NeuralType(("B", "T"), TokenIndex()),
                "x_lengths": NeuralType(tuple("B"), LengthsType()),
                "y": NeuralType(("B", "D", "T"), MelSpectrogramType(), optional=True),
                "y_lengths": NeuralType(tuple("B"), LengthsType(), optional=True),
                "gen": NeuralType(optional=True),
                "noise_scale": NeuralType(optional=True),
                "length_scale": NeuralType(optional=True),
            }
        )
        def forward(self, *, x, x_lengths, y=None, y_lengths=None, gen=False, noise_scale=0.3, length_scale=1.0):
            if gen:
                return self.glow_tts.generate_spect(
                    text=x, text_lengths=x_lengths, noise_scale=noise_scale, length_scale=length_scale
                )
            else:
                return self.glow_tts(text=x, text_lengths=x_lengths, spect=y, spect_lengths=y_lengths)

        ...

        def step(self, y, y_lengths, x, x_lengths):
            z, y_m, y_logs, logdet, logw, logw_, y_lengths, attn = self(
                x=x, x_lengths=x_lengths, y=y, y_lengths=y_lengths, gen=False
            )

            l_mle, l_length, logdet = self.loss(
                z=z,
                y_m=y_m,
                y_logs=y_logs,
                logdet=logdet,
                logw=logw,
                logw_=logw_,
                x_lengths=x_lengths,
                y_lengths=y_lengths,
            )

            loss = sum([l_mle, l_length])

            return l_mle, l_length, logdet, loss, attn

        # PTL-specific methods
        def training_step(self, batch, batch_idx):
            y, y_lengths, x, x_lengths = batch

            y, y_lengths = self.preprocessor(input_signal=y, length=y_lengths)

            l_mle, l_length, logdet, loss, _ = self.step(y, y_lengths, x, x_lengths)

            self.log_dict({"l_mle": l_mle, "l_length": l_length, "logdet": logdet}, prog_bar=True)
            return loss

        ...

Neural Types in NeMo TTS
------------------------

NeMo models and neural modules come with neural type checking.
Neural type checking is extremely useful when combining many different neural network architectures
for a production-grade application.

.. code-block:: python

    @typecheck(
        input_types={
            "x": NeuralType(("B", "T"), TokenIndex()),
            "x_lengths": NeuralType(tuple("B"), LengthsType()),
            "y": NeuralType(("B", "D", "T"), MelSpectrogramType(), optional=True),
            "y_lengths": NeuralType(tuple("B"), LengthsType(), optional=True),
            "gen": NeuralType(optional=True),
            "noise_scale": NeuralType(optional=True),
            "length_scale": NeuralType(optional=True),
        }
    )
    def forward(self, *, x, x_lengths, y=None, y_lengths=None, gen=False, noise_scale=0.3, length_scale=1.0):
        ...

--------

Learn More
==========

- Watch the `NVIDIA NeMo Intro Video <https://youtu.be/wBgpMf_KQVw>`_
- Watch the `PyTorch Lightning and NVIDIA NeMo Discussion Video <https://youtu.be/rFAX1-4DSr4>`_
- Visit the `NVIDIA NeMo Developer Website <https://developer.nvidia.com/nvidia-nemo>`_
- Read the `NVIDIA NeMo PyTorch Blog <https://medium.com/pytorch/nvidia-nemo-neural-modules-and-models-for-conversational-ai-d660480d9696>`_
- Download pre-trained `ASR <https://ngc.nvidia.com/catalog/models/nvidia:nemospeechmodels>`_, `NLP <https://ngc.nvidia.com/catalog/models/nvidia:nemonlpmodels>`_, and `TTS <https://ngc.nvidia.com/catalog/models/nvidia:nemospeechmodels>`_ models on `NVIDIA NGC <https://ngc.nvidia.com/>`_ to quickly get started with NeMo.
- Become an expert on building Conversational AI applications with our `tutorials <https://github.com/NVIDIA/NeMo#tutorials>`_ and `example scripts <https://github.com/NVIDIA/NeMo/tree/v1.0.0b1/examples>`_.
- See our `developer guide <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/>`_ for more information on core NeMo concepts, ASR/NLP/TTS collections, and the NeMo API.

.. note:: NeMo tutorial notebooks can be run on `Google Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_.

NVIDIA `NeMo <https://github.com/NVIDIA/NeMo>`_ is actively being developed on GitHub.
`Contributions <https://github.com/NVIDIA/NeMo/blob/v1.0.0b1/CONTRIBUTING.md>`_ are welcome!

@@ -1,92 +0,0 @@
:orphan:

Lightning Bolts
===============

`PyTorch Lightning Bolts <https://lightning-bolts.readthedocs.io/en/latest/>`_ is our official collection
of prebuilt models across many research domains.

.. code-block:: bash

    pip install lightning-bolts

In Bolts we have:

- A collection of pretrained state-of-the-art models.
- A collection of models designed to bootstrap your research.
- A collection of callbacks, transforms, and full datasets.
- All models work on CPUs, TPUs, GPUs, and 16-bit precision.

-----------------

Quality control
---------------

The Lightning community builds bolts and contributes them to Bolts.
The Lightning team guarantees that contributions are:

- Rigorously tested (CPUs, GPUs, TPUs).
- Rigorously documented.
- Standardized via PyTorch Lightning.
- Optimized for speed.
- Checked for correctness.

---------

Example 1: Pretrained, prebuilt models
--------------------------------------

.. code-block:: python

    from pl_bolts.models import VAE, GPT2, ImageGPT, PixelCNN
    from pl_bolts.models.self_supervised import AMDIM, CPCV2, SimCLR, MocoV2
    from pl_bolts.models import LinearRegression, LogisticRegression
    from pl_bolts.models.gans import GAN
    from pl_bolts.callbacks import PrintTableMetricsCallback
    from pl_bolts.datamodules import FashionMNISTDataModule, CIFAR10DataModule, ImagenetDataModule
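
As a minimal sketch of how these pieces fit together (assuming the ``pl_bolts`` API above; ``input_height=32`` matches the 32x32 CIFAR-10 images), a prebuilt model trains on a prebuilt datamodule like any LightningModule:

.. code-block:: python

    import lightning as L
    from pl_bolts.datamodules import CIFAR10DataModule
    from pl_bolts.models import VAE

    # prebuilt model + prebuilt datamodule, trained with the standard Trainer loop
    model = VAE(input_height=32)
    trainer = L.Trainer(max_epochs=1)
    trainer.fit(model, datamodule=CIFAR10DataModule())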

------------

Example 2: Extend for faster research
-------------------------------------

Bolts are contributed with benchmarks and continuous-integration tests. This means
you can trust the implementations and use them to bootstrap your research much faster.

.. code-block:: python

    from pl_bolts.models import ImageGPT
    from pl_bolts.models.self_supervised import SimCLR


    class VideoGPT(ImageGPT):
        def training_step(self, batch, batch_idx):
            x, y = batch
            x = _shape_input(x)  # user-defined helper that reshapes the video batch

            logits = self.gpt(x)
            simclr_features = self.simclr(x)  # assumes a SimCLR backbone attached in __init__

            # -----------------
            # do something new with GPT logits + simclr_features
            # -----------------

            loss = self.criterion(logits.view(-1, logits.size(-1)), x.view(-1).long())

            self.log("loss", loss)
            return loss

----------

Example 3: Callbacks
--------------------

We also have a collection of callbacks.

.. code-block:: python

    import lightning as L
    from pl_bolts.callbacks import PrintTableMetricsCallback

    trainer = L.Trainer(callbacks=[PrintTableMetricsCallback()])

    # loss│train_loss│val_loss│epoch
    # ──────────────────────────────
    # 2.2541470527648926│2.2541470527648926│2.2158432006835938│0

@@ -1,36 +0,0 @@
:orphan:

Community Examples
==================

- `Lightning Bolts: Deep Learning components for extending PyTorch Lightning <https://lightning.ai/docs/pytorch/latest/ecosystem/bolts.html>`_
- `Lightning Flash: Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes <https://github.com/Lightning-AI/lightning-flash>`_
- `Contextual Emotion Detection (DoubleDistilBert) <https://github.com/juliusberner/emotion_transformer>`_
- `Cotatron: Transcription-Guided Speech Encoder <https://github.com/mindslab-ai/cotatron>`_
- `FasterRCNN object detection + Hydra <https://github.com/Erlemar/wheat>`_
- `Image Inpainting using Partial Convolutions <https://github.com/ryanwongsa/Image-Inpainting>`_
- `MNIST on TPU <https://colab.research.google.com/drive/1-_LKx4HwAxl5M6xPJmqAAu444LTDQoa3#scrollTo=BHBz1_AnamN_>`_
- `NER (transformers, TPU) <https://colab.research.google.com/drive/1dBN-wwYUngLYVt985wGs_OKPlK_ANB9D>`_
- `NeuralTexture (CVPR) <https://github.com/henzler/neuraltexture>`_
- `Recurrent Attentive Neural Process <https://github.com/3springs/attentive-neural-processes>`_
- `Siamese Nets for One-shot Image Recognition <https://github.com/bhiziroglu/Siamese-Neural-Networks>`_
- `Speech Transformers <https://github.com/tongjinle123/speech-transformer-pytorch_lightning>`_
- `Transformers transfer learning (Huggingface) <https://colab.research.google.com/drive/1F_RNcHzTfFuQf-LeKvSlud6x7jXYkG31#scrollTo=yr7eaxkF-djf>`_
- `Transformers text classification <https://github.com/ricardorei/lightning-text-classification>`_
- `VAE Library of over 18+ VAE flavors <https://github.com/AntixK/PyTorch-VAE>`_
- `Transformers Question Answering (SQuAD) <https://github.com/tshrjn/Finetune-QA/>`_
- `Atlas: End-to-End 3D Scene Reconstruction from Posed Images <https://github.com/magicleap/atlas>`_
- `Self-Supervised Representation Learning (MoCo and BYOL) <https://github.com/untitled-ai/self_supervised>`_
- `PyTorch-Forecasting: Time series forecasting package <https://github.com/jdb78/pytorch-forecasting>`_
- `Transformers masked language modeling <https://github.com/yang-zhang/lightning-language-modeling>`_
- `PyTorch Geometric examples with PyTorch Lightning and Hydra <https://github.com/tchaton/lightning-geometric>`_
- `PyTorch Tabular: Deep learning with tabular data <https://github.com/manujosephv/pytorch_tabular>`_
- `Asteroid: An audio source separation toolkit for researchers <https://github.com/asteroid-team/asteroid>`_

PyTorch Ecosystem Examples
==========================

- `PyTorch Geometric: Deep learning on graphs and other irregular structures <https://github.com/rusty1s/pytorch_geometric/tree/master/examples/pytorch_lightning>`_
- `TorchIO, MONAI and Lightning for 3D medical image segmentation <https://colab.research.google.com/github/fepegar/torchio-notebooks/blob/main/notebooks/TorchIO_MONAI_lightning.pytorch.ipynb>`_

@@ -1,78 +0,0 @@
:orphan:

Lightning Flash
===============

`Lightning Flash <https://lightning-flash.readthedocs.io/en/stable/>`_ is a high-level deep learning framework for fast prototyping, baselining, fine-tuning, and solving deep learning problems.
Flash makes complex AI recipes for over 15 tasks across 7 data domains accessible to all.
It is built for beginners, with a simple API that requires very little deep learning background, and for data scientists, Kagglers, applied ML practitioners, and deep learning researchers who
want a quick way to get a deep learning baseline with the advanced features PyTorch Lightning offers.

.. code-block:: bash

    pip install lightning-flash

-----------------

*********************************
Using Lightning Flash in 3 Steps!
*********************************

1. Load your Data
-----------------

All data loading in Flash is performed via a ``from_*`` classmethod of a ``DataModule``.
Which ``DataModule`` to use and which ``from_*`` methods are available depends on the task you want to perform.
For example, for image segmentation where your data is stored in folders, you would use ``SemanticSegmentationData``'s `from_folders <https://lightning-flash.readthedocs.io/en/latest/reference/semantic_segmentation.html#from-folders>`_ method:

.. code-block:: python

    from flash.image import SemanticSegmentationData

    dm = SemanticSegmentationData.from_folders(
        train_folder="data/CameraRGB",
        train_target_folder="data/CameraSeg",
        val_split=0.1,
        image_size=(256, 256),
        num_classes=21,
    )

------------

2. Configure your Model
-----------------------

Our tasks come loaded with pre-trained backbones and (where applicable) heads.
You can view the available backbones to use with your task using `available_backbones <https://lightning-flash.readthedocs.io/en/latest/general/backbones.html>`_.
Once you've chosen, create the model:

.. code-block:: python

    from flash.image import SemanticSegmentation

    print(SemanticSegmentation.available_heads())
    # ['deeplabv3', 'deeplabv3plus', 'fpn', ..., 'unetplusplus']

    print(SemanticSegmentation.available_backbones("fpn"))
    # ['densenet121', ..., 'xception']  # + 113 models

    print(SemanticSegmentation.available_pretrained_weights("efficientnet-b0"))
    # ['imagenet', 'advprop']

    model = SemanticSegmentation(head="fpn", backbone="efficientnet-b0", pretrained="advprop", num_classes=dm.num_classes)

------------

3. Finetune!
------------

.. code-block:: python

    from flash import Trainer

    trainer = Trainer(max_epochs=3)
    trainer.finetune(model, datamodule=dm, strategy="freeze")
    trainer.save_checkpoint("semantic_segmentation_model.pt")

To learn more about Lightning Flash, please refer to the `Lightning Flash documentation <https://lightning-flash.readthedocs.io/en/latest/>`_.
@ -1,93 +0,0 @@
|
|||
:orphan:
|
||||
|
||||
TorchMetrics
|
||||
============
|
||||
|
||||
`TorchMetrics <https://torchmetrics.readthedocs.io>`_ is a collection of machine learning metrics for distributed,
|
||||
scalable PyTorch models and an easy-to-use API to create custom metrics. It has a collection of 60+ PyTorch metrics implementations and
|
||||
is rigorously tested for all edge cases.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
pip install torchmetrics
|
||||
|
||||
In TorchMetrics, we offer the following benefits:
|
||||
|
||||
- A standardized interface to increase reproducibility
|
||||
- Reduced Boilerplate
|
||||
- Distributed-training compatible
|
||||
- Rigorously tested
|
||||
- Automatic accumulation over batches
|
||||
- Automatic synchronization across multiple devices
|
||||
|
||||
-----------------
|
||||
|
||||
Example 1: Functional Metrics
|
||||
-----------------------------
|
||||
|
||||
Below is a simple example for calculating the accuracy using the functional interface:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
import torch
|
||||
import torchmetrics
|
||||
|
||||
# simulate a classification problem
|
||||
preds = torch.randn(10, 5).softmax(dim=-1)
|
||||
target = torch.randint(5, (10,))
|
||||
|
||||
acc = torchmetrics.functional.accuracy(preds, target)
|
||||
|
||||
------------
|
||||
|
||||
Example 2: Module Metrics
|
||||
-------------------------
|
||||
|
||||
The example below shows how to use the class-based interface:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
import torch
|
||||
import torchmetrics
|
||||
|
||||
# initialize metric
|
||||
metric = torchmetrics.Accuracy()
|
||||
|
||||
n_batches = 10
|
||||
for i in range(n_batches):
|
||||
# simulate a classification problem
|
||||
preds = torch.randn(10, 5).softmax(dim=-1)
|
||||
target = torch.randint(5, (10,))
|
||||
# metric on current batch
|
||||
acc = metric(preds, target)
|
||||
print(f"Accuracy on batch {i}: {acc}")
|
||||
|
||||
# metric on all batches using custom accumulation
|
||||
acc = metric.compute()
|
||||
print(f"Accuracy on all data: {acc}")
|
||||
|
||||
# Resetting internal state such that metric ready for new data
|
||||
metric.reset()
|
||||
|
||||
------------
|
||||
|
||||
Example 3: TorchMetrics with Lightning
|
||||
--------------------------------------
|
||||
|
||||
The example below shows how to use a metric in your :doc:`LightningModule <../common/lightning_module>`:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
class MyModel(LightningModule):
|
||||
def __init__(self):
|
||||
...
|
||||
self.accuracy = torchmetrics.Accuracy()
|
||||
|
||||
def training_step(self, batch, batch_idx):
|
||||
x, y = batch
|
||||
preds = self(x)
|
||||
...
|
||||
# log step metric
|
||||
self.accuracy(preds, y)
|
||||
self.log("train_acc_step", self.accuracy, on_epoch=True)
|
||||
...
|
|
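
The same machinery powers the "create custom metrics" API mentioned above: subclass ``Metric``, register states with ``add_state``, and implement ``update`` and ``compute``. A minimal sketch (a hypothetical mean-absolute-error metric; states registered this way are accumulated over batches and synchronized across devices automatically):

.. code-block:: python

    import torch
    import torchmetrics


    class MeanAbsoluteError(torchmetrics.Metric):
        def __init__(self):
            super().__init__()
            # running sums, reduced with `sum` across processes in distributed training
            self.add_state("error", default=torch.tensor(0.0), dist_reduce_fx="sum")
            self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

        def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
            self.error += torch.abs(preds - target).sum()
            self.total += target.numel()

        def compute(self) -> torch.Tensor:
            return self.error / self.total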

@@ -1,10 +1,5 @@
# Examples

Our most robust examples, showing all sorts of implementations,
can be found in our sister library [Lightning Bolts](https://lightning.ai/docs/pytorch/latest/ecosystem/bolts.html).

______________________________________________________________________

*Note that some examples may rely on new features that are only available in the development branch and may be incompatible with any releases.*
*If you see errors, consider switching to the version tag that matches your installed release and running the examples from there.*
*For example, if you're using `pytorch-lightning==1.6.4` in your environment and seeing issues, run the examples from the tag [1.6.4](https://github.com/Lightning-AI/lightning/tree/1.6.4/pl_examples).*