.. _grid:

##############
Cloud Training
##############

Lightning has a native solution for training on AWS/GCP at scale.
Go to `grid.ai <https://www.grid.ai/>`_ to create an account.

We've designed Grid to work seamlessly with Lightning, without needing to make ANY code changes.

To use Grid, start from your regular ``python`` command:

.. code-block:: bash

    python my_model.py --learning_rate 1e-6 --layers 2 --gpus 4

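For context, the command above only requires that the script read its hyperparameters from command-line flags. The following is a minimal sketch of what such an entry point could look like, assuming ``my_model.py`` uses ``argparse``; the function names and default values are hypothetical and are not taken from Lightning or Grid.

.. code-block:: python

    import argparse


    def parse_args(argv=None):
        """Parse the three flags used in the example command."""
        parser = argparse.ArgumentParser(description="Hypothetical my_model.py")
        # Defaults are illustrative assumptions, not recommended values.
        parser.add_argument("--learning_rate", type=float, default=1e-3)
        parser.add_argument("--layers", type=int, default=2)
        parser.add_argument("--gpus", type=int, default=0)
        return parser.parse_args(argv)


    def main(argv=None):
        args = parse_args(argv)
        # A real script would build its model and trainer from `args` here.
        print(f"learning_rate={args.learning_rate} layers={args.layers} gpus={args.gpus}")
        return args

A real script would call ``main()`` under an ``if __name__ == "__main__":`` guard. Because every hyperparameter arrives as a flag, a launcher can vary the flag values without touching the script itself.
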
Then replace ``python`` with ``grid run``. Each hyperparameter flag can also take a search expression instead of a single value:

.. code-block:: bash

    grid run --gpus 4 my_model.py --learning_rate 'uniform(1e-6, 1e-1, 20)' --layers '[2, 4, 8, 16]'

The above command will launch 20 * 4 = 80 experiments, each running on 4 GPUs (320 GPUs in total!), with ZERO changes to
your code.

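The experiment count can be reproduced with a short Python sketch. This is not how Grid is implemented. It assumes that ``uniform(1e-6, 1e-1, 20)`` stands for 20 values drawn uniformly from that range and that the values for each flag are combined as a full Cartesian product; ``sample_uniform`` and ``expand_sweep`` are hypothetical helper names.

.. code-block:: python

    import itertools
    import random


    def sample_uniform(low, high, n, seed=0):
        """Draw n values uniformly from [low, high] (assumed semantics)."""
        rng = random.Random(seed)
        return [rng.uniform(low, high) for _ in range(n)]


    def expand_sweep(search_space):
        """Return one dict of hyperparameters per combination of values."""
        names = list(search_space)
        value_lists = [search_space[name] for name in names]
        return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


    search_space = {
        "learning_rate": sample_uniform(1e-6, 1e-1, 20),
        "layers": [2, 4, 8, 16],
    }
    experiments = expand_sweep(search_space)
    gpus_per_experiment = 4
    total_gpus = len(experiments) * gpus_per_experiment

    print(len(experiments), total_gpus)  # prints "80 320"

Adding a third flag with several values would multiply the count again, so it is worth checking the product before launching a sweep.
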
The ``uniform`` expression is part of our new expressive syntax, which lets you construct hyperparameter combinations
using more than 20 distributions, lists, etc. Of course, you can also configure all of this using YAML files, which
can be dynamically assembled at runtime.

***************
Grid Highlights
***************

* Run any public or private repository with Grid, or use an interactive session.
* Grid allocates all the machines and GPUs you need on demand, so you only pay for what you need when you need it.
* Grid handles all the other parts of developing and training at scale: artifacts, logs, metrics, etc.
* Grid works with the experiment manager of your choice, no code changes needed.
* Use Grid Datastores: high-performance, low-latency, versioned datasets.
* Attach Datastores to a Run so you don't have to keep downloading datasets.
* Use Grid Sessions for fast prototyping on a cloud machine of your choice.
* For more information, check the `grid documentation <https://docs.grid.ai/>`_.