:orphan:
############
Path Storage
############
**Audience:** Users who want to share files between components.
The Lightning Storage system makes it easy to share files between LightningWorks, so you can run your app both locally and in the cloud without changing the code.
----
**********************
What is a Path Object?
**********************
The Path object is a reference to a specific file or directory from a LightningWork and can be used to transfer those files to another LightningWork (one way, from source to destination).
A good mental model for the Path object is a `relay race <https://en.wikipedia.org/wiki/Relay_race>`_.
To make a transfer, the receiver asks (e.g., when the Flow passes the Path object to the receiver)
for a copy of the files (the baton) owned by their producer (e.g., the LightningWork that created the files).
.. figure:: https://pl-flash-data.s3.amazonaws.com/assets_lightning/path2.png
*****************************************
How does the Path Object work internally?
*****************************************
To understand the internals of the Path object, first consider this question: how do you locate a specific file or folder within a distributed system made of multiple machines?
You need to know which machine holds the file or folder (in the cloud, the LightningWork name uniquely identifies its machine) and
then you need the local path of the file or folder on that machine.
Put simply, the Lightning Path augments the :class:`pathlib.Path` object by tracking which machine the file or folder is located on.
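As a rough illustration of this idea (not Lightning's actual implementation), the sketch below pairs a local path with the name of the component that owns it. The names ``LocatedPath``, ``origin`` and ``is_owned_by`` are hypothetical and chosen only for this example.

.. code-block:: python

    from dataclasses import dataclass
    from pathlib import PurePosixPath


    @dataclass(frozen=True)
    class LocatedPath:
        """Hypothetical stand-in: a local path plus the name of the Work that owns it."""

        origin: str  # name of the Work (and therefore the machine) holding the files
        local: PurePosixPath  # path of the file or folder on that machine

        def is_owned_by(self, work_name: str) -> bool:
            # The files are only directly readable on the machine that produced them
            return self.origin == work_name


    ckpt = LocatedPath(origin="root.source", local=PurePosixPath("outputs/checkpoints"))
    print(ckpt.is_owned_by("root.source"))  # True
    print(ckpt.is_owned_by("root.destination"))  # False

Both pieces of information are needed: the local path alone is ambiguous as soon as two Works write to the same relative location.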
----
*************************
When to use Path storage?
*************************
In the cloud, every :class:`~lightning_app.core.work.LightningWork` runs on a separate machine with its own filesystem.
This means files in one Work cannot be accessed directly from another, as they could be when running the app locally.
With Lightning Storage, this is easy: simply declare which files need to be shared and Lightning will take care of the rest.
.. raw:: html
<video width="100%" max-width="800px" controls autoplay src="https://pl-flash-data.s3.amazonaws.com/assets_lightning/docs/images/storage/path.m4v" />
----
***********************************
Tell Lightning where your files are
***********************************
Convert every filesystem path you want to share with other LightningWorks by adding ``lit://`` in front of it.
.. code-block:: python

    import os

    from lightning_app import LightningWork
    from lightning_app.storage import Path


    class SourceWork(LightningWork):
        def __init__(self):
            super().__init__()
            self.checkpoint_dir = None

        def run(self):
            # Normally you would do:
            # self.checkpoint_dir = "outputs/checkpoints"
            # os.makedirs("outputs/checkpoints")
            # ...

            # In Lightning, do:
            self.checkpoint_dir = "lit://outputs/checkpoints"
            os.makedirs(self.checkpoint_dir)
            ...
Under the hood, we convert this string to a :class:`~lightning.app.storage.path.Path` object, which is a drop-in replacement for :class:`pathlib.Path`, meaning it works with :mod:`os`, :mod:`os.path` and :mod:`pathlib` filesystem operations out of the box!
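The prefix handling can be pictured with the following sketch. It is an assumption made for illustration, not Lightning's actual code: the helper ``resolve_lit_path`` and the ``storage_root`` argument (including its default value) are hypothetical.

.. code-block:: python

    from pathlib import PurePosixPath

    LIT_PREFIX = "lit://"


    def resolve_lit_path(path: str, storage_root: str = "/hypothetical/storage/root") -> PurePosixPath:
        """Strip the ``lit://`` prefix and anchor the remainder under ``storage_root``."""
        if not path.startswith(LIT_PREFIX):
            # Plain paths are left untouched
            return PurePosixPath(path)
        return PurePosixPath(storage_root) / path[len(LIT_PREFIX):]


    print(resolve_lit_path("lit://outputs/checkpoints"))
    print(resolve_lit_path("outputs/logs"))

The point of the sketch is only that the prefix is a marker on an ordinary path string; everything after it is still a regular filesystem path.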
----
****************************
Access files in another Work
****************************
Accessing files from another LightningWork is as easy as handing the path over by reference.
For example, share a directory by passing it as an input to the run method of the destination work:
.. code-block:: python
    :emphasize-lines: 13

    from lightning_app import LightningFlow


    class Flow(LightningFlow):
        def __init__(self):
            super().__init__()
            self.source = SourceWork()
            self.destination = DestinationWork()

        def run(self):
            self.source.run()
            # Pass the Path reference from one work to another
            self.destination.run(self.source.checkpoint_dir)
When the destination Work starts, Lightning will automatically transfer the files to its filesystem (if they exist on the other end):
.. code-block:: python

    class DestinationWork(LightningWork):
        def run(self, checkpoint_dir):
            # The directory is now accessible inside this Work
            files = os.listdir(checkpoint_dir)
            ...
The automatic transfer only happens if the referenced files already exist in the originating LightningWork and it will overwrite any files that already exist locally.
In all other cases, you can trigger the transfer manually.
----
******************
Get files manually
******************
If you need to access files at a specific time or transfer them multiple times, use the ``.get()`` method:
.. code-block:: python

    def run(self, checkpoint_dir):
        ...
        # Make the directory available
        checkpoint_dir.get()

        # If the path already exists locally, you can force overwriting it
        checkpoint_dir.get(overwrite=True)

        files = os.listdir(checkpoint_dir)
        ...
Multiple calls to the ``.get()`` method will always result in file transfers, regardless of whether the files have changed or not.
If the path does not exist remotely, it will raise a ``FileNotFoundError``.
If you need to handle this case, the Path also offers a method to check if files exist remotely.
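The behavior described above can be mimicked on a single machine with the standard library. The function ``fake_get`` below is a hypothetical stand-in that copies a directory between two local folders; it is not the Lightning ``.get()`` implementation, and raising ``FileExistsError`` when ``overwrite`` is false is an assumption of this sketch.

.. code-block:: python

    import shutil
    import tempfile
    from pathlib import Path


    def fake_get(source: Path, destination: Path, overwrite: bool = False) -> None:
        """Copy ``source`` to ``destination``, mimicking a one-way transfer."""
        if not source.exists():
            # Mirrors the documented error when the path is missing at the origin
            raise FileNotFoundError(f"{source} does not exist at the origin")
        if destination.exists():
            if not overwrite:
                # Assumption of this sketch: refuse to clobber local files by default
                raise FileExistsError(f"{destination} already exists locally")
            shutil.rmtree(destination)
        # Every call copies everything again, whether or not the files changed
        shutil.copytree(source, destination)


    with tempfile.TemporaryDirectory() as tmp:
        origin = Path(tmp) / "origin" / "checkpoints"
        local = Path(tmp) / "local" / "checkpoints"
        origin.mkdir(parents=True)
        (origin / "checkpoint_1.ckpt").write_text("fake weights")

        fake_get(origin, local)
        fake_get(origin, local, overwrite=True)  # second transfer replaces the copy
        copied = sorted(p.name for p in local.iterdir())
        print(copied)

Because each call performs a full copy, repeated transfers of large directories are worth limiting to the moments when fresh files are actually needed.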
----
********************************
Check if a file or folder exists
********************************
You can check if a path exists locally or remotely in the source Work using the ``.exists_local()`` and ``.exists_remote()`` methods:
.. code-block:: python

    def run(self, checkpoint_dir):
        if checkpoint_dir.exists_remote():
            # Get the file only if it exists in the source Work
            checkpoint_dir.get()

        # OR

        if checkpoint_dir.exists_local():
            # Do something with the file if it exists locally
            files = os.listdir(checkpoint_dir)
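The two checks can also be combined into a small helper. This is a sketch only: ``fetch_if_available`` and ``StubPath`` are hypothetical names, and the stub merely imitates the three methods named above so that the snippet runs without a Lightning installation. Inside a real Work you would pass the Lightning Path instead.

.. code-block:: python

    class StubPath:
        """Hypothetical stand-in imitating ``exists_remote``, ``exists_local`` and ``get``."""

        def __init__(self, remote: bool, local: bool = False):
            self._remote = remote
            self._local = local

        def exists_remote(self) -> bool:
            return self._remote

        def exists_local(self) -> bool:
            return self._local

        def get(self, overwrite: bool = False) -> None:
            self._local = True  # pretend the files were copied over


    def fetch_if_available(path) -> bool:
        """Fetch the files when the source Work has them; report whether they are usable locally."""
        if path.exists_remote():
            # overwrite=True refreshes a copy that may already exist locally
            path.get(overwrite=True)
        return path.exists_local()


    print(fetch_if_available(StubPath(remote=True)))  # True
    print(fetch_if_available(StubPath(remote=False)))  # False

Checking ``exists_remote()`` first avoids the ``FileNotFoundError`` that ``.get()`` raises when the source Work has not produced the files yet.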
----
*************
Persist files
*************
If a LightningWork finishes or stops due to an interruption (e.g., due to insufficient credits), the filesystem and all files in it get deleted (unless running locally).
Lightning makes sure all Paths that are part of the state get stored and made accessible to the other Works that still need these files.
.. code-block:: python

    from lightning_app import LightningWork
    from lightning_app.storage import Path


    class Work(LightningWork):
        def __init__(self):
            super().__init__()
            # The files in this path will be saved as an artifact when the Work finishes
            self.checkpoint_dir = "lit://outputs/checkpoints"

            # The files in this path WON'T be saved because it is not declared as a Lightning Path
            self.log_dir = "outputs/logs"
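Which attributes qualify can be pictured with a simple filter. The sketch below is an assumption for illustration, not Lightning's artifact logic: ``select_persisted`` is a hypothetical helper that keeps only the attributes whose value starts with ``lit://``.

.. code-block:: python

    def select_persisted(state: dict) -> dict:
        """Return only the attributes declared with the ``lit://`` prefix."""
        return {
            name: value
            for name, value in state.items()
            if isinstance(value, str) and value.startswith("lit://")
        }


    state = {
        "checkpoint_dir": "lit://outputs/checkpoints",  # declared as a Lightning Path
        "log_dir": "outputs/logs",  # plain string, not tracked
    }
    print(select_persisted(state))  # {'checkpoint_dir': 'lit://outputs/checkpoints'}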
----
*********************************
Example: Share a model checkpoint
*********************************
A common workflow in ML is to use a checkpoint created by another component.
First, define a component that saves a checkpoint:
.. code:: python
    :emphasize-lines: 14-19

    from lightning_app import LightningFlow, LightningWork
    from lightning_app.storage import Path
    import torch
    import os


    class ModelTraining(LightningWork):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.checkpoint_dir = "lit://outputs/checkpoints"

        def run(self):
            # create the directory that was declared in __init__
            os.makedirs(self.checkpoint_dir, exist_ok=True)
            # make fake checkpoints
            checkpoint_1 = torch.tensor([0, 1, 2, 3, 4])
            checkpoint_2 = torch.tensor([0, 1, 2, 3, 4])
            torch.save(checkpoint_1, os.path.join(self.checkpoint_dir, "checkpoint_1.ckpt"))
            torch.save(checkpoint_2, os.path.join(self.checkpoint_dir, "checkpoint_2.ckpt"))
Next, define a component that needs the checkpoints:
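.. code:: python
    :emphasize-lines: 5, 7

    class ModelDeploy(LightningWork):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)

        def run(self, checkpoint_dir):
            # os.listdir returns bare file names, so join them with the directory
            ckpts = sorted(os.listdir(checkpoint_dir))
            checkpoint_1 = torch.load(os.path.join(checkpoint_dir, ckpts[0]))
            checkpoint_2 = torch.load(os.path.join(checkpoint_dir, ckpts[1]))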
Link both components via a parent component:
.. code:: python
    :emphasize-lines: 13

    from lightning_app import LightningApp


    class Flow(LightningFlow):
        def __init__(self):
            super().__init__()
            self.train = ModelTraining()
            self.deploy = ModelDeploy()

        def run(self):
            self.train.run()
            # pass the checkpoint path
            self.deploy.run(checkpoint_dir=self.train.checkpoint_dir)


    app = LightningApp(Flow())
----
**************************
Frequently Asked Questions
**************************
- **Q: Can files in a LightningWork be accessed inside the LightningFlow too?**
No, LightningFlow is intentionally designed not to perform filesystem operations and computations and is intended to exclusively orchestrate Flow and Work.
- **Q: Is it possible to reference any file using the Lightning lit:// path notation?**
Yes, but only files for which the app has write permissions can be copied from Work to Work (apps don't run with root privileges).
- **Q: Can I access the Lightning Storage in my UI (Streamlit, Web, ...)?**
This is currently not supported but will be in the future.
- **Q: Should I define my lit:// path in the __init__ or the run method?**
You can declare a Lightning path anywhere you'd like. However, the ``.get()`` and ``.exists_*()`` methods only work inside of the run method of a LightningWork.
- **Q: How often does Lightning synchronize the files between my Works?**
Lightning does not synchronize the files between Works. It only transfers the files once, when the Work's ``run`` method starts.
But you can call ``Path.get()`` as many times as you wish to transfer the latest files into the current Work.
- **Q: Does Lightning provide me direct access to the shared cloud folder?**
No, and this is on purpose. This restriction forces developers to build modular components that can be shared and integrated
into apps easily. This would be much harder to achieve if file paths in these components referenced a global shared storage.
----
.. raw:: html
<div class="display-card-container">
<div class="row">
.. displayitem::
:header: Learn about the Drive Object.
:description: Put, List and Get Files From a Shared Drive Disk.
:col_css: col-md-4
:button_link: drive.html
:height: 180
:tag: Basic
.. raw:: html
</div>
</div>