Update cloud docs (#8569)
* amp * amp * docs * add guides * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * amp * amp * docs * add guides * speed guides * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Delete ds.txt * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update conf.py * Update docs.txt * remove 16 bit * remove finetune from speed guide * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * speed * speed * speed * speed * speed * speed * speed * speed * speed * speed * speed * speed * remove early stopping from speed guide * remove early stopping from speed guide * remove early stopping from speed guide * fix label * fix sync * reviews * Update trainer.rst * Update trainer.rst * Update speed.rst * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * managing data * managing data * amp * amp * docs * sync * sync * amp * amp * add data guide * from review * Apply suggestions from code review Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Apply suggestions from code review Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * from review * from review * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add data guide * add data guide * add data guide * sync issues * from reviw * Update docs/source/guides/data.rst Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * add info if import fails * fix cross referencing * Add Datamodule motivation * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * grid docs * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update cloud_training.rst Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
parent 4b7f78e200
commit c7e5743d54

@@ -4,39 +4,45 @@
Cloud Training
##############

Lightning has a native solution for training on AWS/GCP at scale.
Go to `grid.ai <https://www.grid.ai/>`_ to create an account.

Lightning makes it easy to scale your training, without the boilerplate.
If you want to train your models on the cloud, without dealing with engineering infrastructure and servers, you can try `Grid.ai <https://www.grid.ai/>`_.

We've designed Grid to work seamlessly with Lightning, without needing to make ANY code changes.
To use Grid, replace ``python`` in your regular command with ``grid run``.

Developed by the creators of `PyTorch Lightning <https://www.pytorchlightning.ai/>`_, Grid is a platform that allows you to:

- **Scale your models to multiple GPUs and multiple nodes** instantly with interactive sessions
- **Run hyperparameter sweeps on hundreds of GPUs** in one command
- **Upload huge datasets** for availability at scale
- **Iterate faster and cheaper**: you only pay for what you need

****************
Training on Grid
****************

.. raw:: html

    <video width="50%" max-width="400px" controls
    poster="https://grid-docs.s3.us-east-2.amazonaws.com/grid.png"
    src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/grid.mp4"></video>

You can launch any Lightning model on Grid using the Grid `CLI <https://pypi.org/project/lightning-grid/>`_:

.. code-block:: bash

    # Regular command
    python my_model.py --learning_rate 1e-6 --layers 2 --gpus 4
    # Same script launched with grid run, here as a hyperparameter sweep
    grid run --instance_type v100 --gpus 4 my_model.py --gpus 4 --learning_rate 'uniform(1e-6, 1e-1, 20)' --layers '[2, 4, 8, 16]'
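
For commands like the ones above to work, the script only has to read ``--learning_rate``, ``--layers`` and ``--gpus`` from the command line. The following is a minimal sketch of such an entry point using only the standard library. The function names and default values are hypothetical, and the Lightning training code itself is left as a placeholder; it assumes each experiment receives concrete values for these flags.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags used in the commands above (defaults are hypothetical)."""
    parser = argparse.ArgumentParser(description="Hypothetical my_model.py entry point")
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--layers", type=int, default=2)
    parser.add_argument("--gpus", type=int, default=0)
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Placeholder: build the LightningModule and Trainer from `args` here.
    print(f"lr={args.learning_rate} layers={args.layers} gpus={args.gpus}")
    return args


# In a real script, call main() under `if __name__ == "__main__":`.
```

Because the hyperparameters arrive as ordinary command-line flags, the same script can be run locally with ``python`` or handed to a launcher unchanged.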

You can also start runs or interactive sessions from the `Grid platform <https://platform.grid.ai>`_, where you can upload datasets, view artifacts, logs, and costs, access TensorBoard, and more.

To use the ``grid run`` command:

.. code-block:: bash

    grid run --gpus 4 my_model.py --learning_rate 'uniform(1e-6, 1e-1, 20)' --layers '[2, 4, 8, 16]'
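
To make the size of a sweep like ``--learning_rate 'uniform(1e-6, 1e-1, 20)' --layers '[2, 4, 8, 16]'`` concrete, here is a rough sketch of how it could expand into individual experiments. This is not Grid's implementation: it assumes that ``uniform(low, high, n)`` draws ``n`` values uniformly at random and that every drawn value is combined with every list entry. The helper ``expand_sweep`` is hypothetical.

```python
import itertools
import random


def expand_sweep(lr_low, lr_high, n_samples, layer_choices, seed=0):
    """Return one (learning_rate, layers) pair per experiment."""
    rng = random.Random(seed)
    # Assumption: uniform(low, high, n) draws n values uniformly at random.
    learning_rates = [rng.uniform(lr_low, lr_high) for _ in range(n_samples)]
    # Every sampled learning rate is paired with every layer choice.
    return list(itertools.product(learning_rates, layer_choices))


experiments = expand_sweep(1e-6, 1e-1, 20, [2, 4, 8, 16])
gpus_per_experiment = 4
total_gpus = len(experiments) * gpus_per_experiment
print(len(experiments), total_gpus)  # prints "80 320"
```

Under these assumptions, 20 learning-rate values times 4 layer settings gives 80 experiments, and at 4 GPUs each that is 320 GPUs.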

The command above launches 80 experiments (20 learning-rate values × 4 layer settings), each running on 4 GPUs, for 320 GPUs in total, with zero changes to your code.

The ``uniform`` command is part of our new expressive syntax, which lets you construct hyperparameter combinations
from more than 20 distributions, lists, and so on. You can also configure all of this with YAML files, which
can be assembled dynamically at runtime.

**********
Learn More
**********

`Sign up for Grid <http://platform.grid.ai>`_ and receive free credits to get you started!

`Grid in 3 minutes <https://docs.grid.ai/#introduction>`_

***************
Grid Highlights
***************

* Run any public or private repository with Grid, or use an interactive session.
* Grid allocates all the machines and GPUs you need on demand, so you only pay for what you need when you need it.
* Grid handles all the other parts of developing and training at scale: artifacts, logs, metrics, etc.
* Grid works with the experiment manager of your choice, no code changes needed.
* Use Grid Datastores: high-performance, low-latency, versioned datasets.
* Attach Datastores to a Run so you don't have to keep downloading datasets.
* Use Grid Sessions for fast prototyping on a cloud machine of your choice.
* For more information, check the `Grid documentation <https://docs.grid.ai/>`_.

`Grid.ai Terms of Service <https://www.grid.ai/terms-of-service/>`_