Refine remote fs doc (#11393)

This commit is contained in:
Rohit Gupta 2022-01-12 19:21:24 +05:30 committed by GitHub
parent f5bbc2cf17
commit 2d0dd1c445
3 changed files with 34 additions and 8 deletions

@@ -385,3 +385,13 @@ Custom Checkpoint IO Plugin
.. note::
Some ``TrainingTypePlugins``, such as ``DeepSpeedStrategy``, do not support custom ``CheckpointIO``, as their checkpointing logic is not modifiable.
-----------
***************************
Managing Remote Filesystems
***************************
Lightning supports saving and loading checkpoints from a variety of filesystems, including local filesystems and several cloud storage providers.
Check out the :ref:`Remote Filesystems <remote_fs>` document for more info.

@@ -1,16 +1,19 @@
.. _remote_fs:
##################
Remote Filesystems
##################
PyTorch Lightning enables working with data from a variety of filesystems, including local filesystems and several cloud storage providers such as
`S3 <https://aws.amazon.com/s3/>`_ on `AWS <https://aws.amazon.com/>`_, `GCS <https://cloud.google.com/storage>`_ on `Google Cloud <https://cloud.google.com/>`_,
or `ADL <https://azure.microsoft.com/solutions/data-lake/>`_ on `Azure <https://azure.microsoft.com/>`_.
This applies to saving and loading checkpoints, as well as to logging.
Working with different filesystems is accomplished by prefixing file paths with a protocol such as ``s3://`` when reading and writing data.
.. code-block:: python
# `default_root_dir` is the default path used for logs and checkpoints
trainer = Trainer(default_root_dir="s3://my_bucket/data/")
trainer.fit(model)
@@ -32,7 +35,7 @@ Additionally, you could also resume training with a checkpoint stored at a remote path:
trainer = Trainer(default_root_dir=tmpdir, max_steps=3)
trainer.fit(model, ckpt_path="s3://my_bucket/ckpts/classifier.ckpt")
PyTorch Lightning uses `fsspec <https://filesystem-spec.readthedocs.io/>`_ internally to handle all filesystem operations.
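As an illustrative sketch (not part of the original docs example), fsspec resolves the protocol prefix of a path to a concrete filesystem implementation. The snippet below uses fsspec's built-in ``memory://`` filesystem so it runs without cloud credentials; with the matching extras installed, ``s3://``, ``gcs://``, or ``adl://`` paths behave the same way.

.. code-block:: python

    # A minimal sketch of the protocol-prefix mechanism fsspec provides.
    # "memory://" is fsspec's built-in in-memory filesystem, chosen here so
    # the example runs without cloud credentials; cloud protocols work the
    # same way once the corresponding fsspec extras are installed.
    import fsspec

    # Write through a protocol-prefixed path ...
    with fsspec.open("memory://my_bucket/ckpts/latest.txt", "w") as f:
        f.write("epoch=3")

    # ... and read it back through the same path.
    with fsspec.open("memory://my_bucket/ckpts/latest.txt", "r") as f:
        print(f.read())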
The most common filesystems supported by Lightning are:

@@ -14,8 +14,9 @@
Logging
#######
*****************
Supported Loggers
*****************
The following are loggers we support:
@@ -101,6 +102,7 @@ Lightning offers automatic log functionalities for logging scalars, or manual logging for anything else.
Automatic Logging
=================
Use the :meth:`~pytorch_lightning.core.lightning.LightningModule.log`
method to log from anywhere in a :doc:`lightning module <../common/lightning_module>` and :doc:`callbacks <../extensions/callbacks>`.
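As a minimal sketch of what this looks like in practice (the module and metric names below are made up for illustration), calling ``self.log`` inside ``training_step`` sends the value to whatever logger the ``Trainer`` is configured with:

.. code-block:: python

    import torch
    from torch import nn
    import pytorch_lightning as pl


    class LitClassifier(pl.LightningModule):  # hypothetical example module
        def __init__(self):
            super().__init__()
            self.layer = nn.Linear(32, 2)

        def training_step(self, batch, batch_idx):
            x, y = batch
            loss = nn.functional.cross_entropy(self.layer(x), y)
            # logged automatically to the configured logger(s)
            self.log("train_loss", loss)
            return loss

        def configure_optimizers(self):
            return torch.optim.SGD(self.parameters(), lr=0.1)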
@@ -182,6 +184,7 @@ If your work requires to log in an unsupported method, please open an issue with a clear description of why it is blocking you.
Manual Logging Non-Scalar Artifacts
===================================
If you want to log anything that is not a scalar, like histograms, text, images, etc., you may need to use the logger object directly.
.. code-block:: python
@@ -388,3 +391,13 @@ in the `hparams tab <https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter.add_hparams>`_
self.log("hp/metric_2", some_scalar_2)
In the example, using ``"hp/"`` as a prefix allows the metrics to be grouped under "hp" in the TensorBoard scalars tab, where you can collapse them.
-----------
***************************
Managing Remote Filesystems
***************************
Lightning supports saving logs to a variety of filesystems, including local filesystems and several cloud storage providers.
Check out the :ref:`Remote Filesystems <remote_fs>` document for more info.