1455 Commits

Author SHA1 Message Date
Ethan Harris bbf7848a5f
[App] Fix cluster logic (#15383) 2022-10-28 15:35:21 +01:00
Ethan Harris e9a6b83437
[App] Reduce import depths and add test (#15330)
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-10-28 13:57:35 +00:00
thomas chaton df4b705768
Add JustPy Frontend (#15002)
* update

* update

* update

* update

* changelog

* update

* update

* update

* update

* update

* update

* update

* update

* uipdate

* update

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-27 11:48:26 -04:00
kimpty d956a123bd
Update train_model_basic.rst (#15352) 2022-10-27 09:13:11 -04:00
Jirka Borovec 95ae393ca8
LAI: creating mirror package (#15105)
* placeholder

* mirror + prune

* makedir

* setup

* ci

* ci

* name

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ci clean

* empty

* py

* parallel

* doctest

* flake8

* ci

* typo

* replace

* clean

* Apply suggestions from code review

* re.sub

* fix UI path

* full replace

* ui path?

* replace

* updates

* regex

* ci

* fix

* ci

* path

* ci

* replace

* Update .actions/setup_tools.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* also convert lightning_lite tests for PL tests to adapt mocking paths

* fix app example test

* update logger propagation for PL tests

* update logger propagation for PL tests

* Apply suggestions from code review

* Revert "update logger propagation for PL tests"

This reverts commit c1a5e119c7.

* playwright

* py

* update import in tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* try edit import in overwrite

* debug code

* rev playwright

* Revert "try edit import in overwrite"

This reverts commit c02f766521.

* ci: adjust examples

* adjust examples cloud

* mock lightning_app

* Install assistant dependencies

* lightning

* setup

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Apply suggestions from code review

* disable cache

* move doctest to install

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* )

* echo ./

* ci

* lru

* revert disabling cache, prints

* ci

* prune ci jobs

* prune ci jobs

* training loop standalone tests

* add sys modules cleanup fixture

* make use of fixture

* revert standalone

* ci e2e

* fix imports in lightning

* fix imports of lightning in tests

* Revert "make use of fixture"

This reverts commit c15efdd205.

* Revert other commits for fixtures

* revert use of fixture

* py3.9

* fix mocking

* fix paths

* hack mocking

* docs

* Apply suggestions from code review

* rev suggestion

* Minor changes to the parametrizations

* Update checkgroup with the new and changed jobs

* include frontend dir

* cli

* fix imports and entry point

* Revert standalone

* rc1

* e2e on staging

* Revert "Revert standalone"

This reverts commit 9df96685b8.

* groups

* to

* ci: pt ver

* docker

* Apply suggestions from code review

* Copy over changes from previous commit to other groups

* Add back changes from bad merge

* Uppercase step name everywhere

* update

* ci

* ci: lai oldest

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
Co-authored-by: manskx <ahmed.mansy156@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2022-10-27 12:32:49 +02:00
Adrian Wälchli 0f9156374d
Mark internal Lite APIs as protected (#15307)
* mark internal lite apis as protected
* formatting
* docs update

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-10-26 12:51:50 +00:00
Raphael Randschau 13baad56e4
Add support for custom cloud compute configurations for Flows (#14831)
* use more recent lightning cloud launcher

* allow LightningApp to use custom cloud compute for flows

* feedback from adrian

* adjust other cloud tests

* update

* update

* update commens

* Update src/lightning_app/core/app.py

Co-authored-by: Sherin Thomas <sherin@grid.ai>

* Close profiler when `StopIteration` is raised (#14945)

* Find last checkpoints on restart (#14907)


Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Remove unused gcsfs dependency (#14962)

* Update hpu mixed precision link (#14974)

Signed-off-by: Jerome <janand@habana.ai>

* Bump version of fsspec (#14975)

fsspec verbump

* Fix TPU test CI (#14926)

* Fix TPU test CI

* +x first

* Lite first to uncovert errors faster

* Fixes

* One more

* Simplify XLALauncher wrapping to avoid pickle error

* debug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Debug commit successful. Trying local definitions

* Require tpu for mock test

* ValueError: The number of devices must be either 1 or 8, got 4 instead

* Fix mock test

* Simplify call, rely on defaults

* Skip OSError for now. Maybe upgrading will help

* Simplify launch tests, move some to lite

* Stricter typing

* RuntimeError: Accessing the XLA device before processes have spawned is not allowed.

* Revert "RuntimeError: Accessing the XLA device before processes have spawned is not allowed."

This reverts commit f65107ebf3.

* Alternative boring solution to the reverted commit

* Fix failing test on CUDA machine

* Workarounds

* Try latest mkl

* Revert "Try latest mkl"

This reverts commit d06813aa67.

* Wrong exception

* xfail

* Mypy

* Comment change

* Spawn launch refactor

* Accept that we cannot lazy init now

* Fix mypy and launch test failures

* The base dockerfile already includes mkl-2022.1.0 - what if we use it?

* try a different mkl version

* Revert mkl version changes

Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* Trainer: fix support for non-distributed PyTorch (#14971)

* Trainer: fix non-distributed use
* Update CHANGELOG

* fixes typing errors in rich_progress.py (#14963)

* revert default cloud compute rename

* allow LightningApp to use custom cloud compute for flows

* feedback from adrian

* update

* resolve merge with master conflict

* remove preemptible

* update CHANGELOG

* add basic flow cloud compute documentation

* fix docs build

* add missing symlink

* try to fix sphinx

* another attempt for docs

* fix new test

Signed-off-by: Jerome <janand@habana.ai>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Sherin Thomas <sherin@grid.ai>
Co-authored-by: Ziyad Sheebaelhamd <47150407+ziyadsheeba@users.noreply.github.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jerome Anand <88475913+jerome-habana@users.noreply.github.com>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Co-authored-by: DP <10988155+donlapark@users.noreply.github.com>
2022-10-25 11:29:15 -07:00
Carlos Mocholí 7b3de1215f
Remove examples and loggers from develop dependencies (#15282)
* Remove examples and loggers from develop dependencies

* remove more references

* Fix mypy

* Keep logger file for docs mocking

* Simpler fix

* Fix docs build

* Global testsetup

* Matching files

* Undo change

* loggers as info

* Clarify

* Update requirements/pytorch/loggers.info

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
2022-10-25 09:23:26 -04:00
fabio fumarola 1beef7620f
Update lightning_cli_advanced_2.rst (#15257)
Co-authored-by: Mauricio Villegas <mauricio_ville@yahoo.com>
2022-10-24 15:25:21 +00:00
Kaushik B 7354073e6e
App: Remove the unsupported params for CloudCompute (#14852) 2022-10-21 19:37:59 +00:00
Carlos Mocholí 375ab53861
Migrate TPU tests to GitHub actions (#14687)
* Migrate TPU tests to GitHub actions

* No working dir

* Keep _target

* Dont skip draft

* CHECK_SLEEP

* Not yet

* Remove recurrent cleanup script

* Set secrets

* a step cannot have both the `uses` and `run` keys

* Version $PYTHON_VER was not found in the local cache

* can't load package ... ($GOPATH not set)

* The `set-env` command is disabled

* Try updating go

* Match timeout

* simplify path

* More cleanup

* Install coverage. Unmark draft

* Update .github/workflows/ci-pytorch-test-tpu.yml

* DEBUG echo

* Revert "DEBUG echo"

This reverts commit 4011856e6e.

* More debug

* SSH

* Im stupid

* Remove always()

* Forgot some

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2022-10-21 20:01:39 +02:00
thomas chaton 6a72a15a62
[App] Automate missing requirements installation for CLI (#15198)
* update

* update

* update

* update

* update

* update

* update

* wording

Co-authored-by: Mansy <ahmed.mansy156@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Mansy <ahmed.mansy156@gmail.com>
2022-10-20 15:02:13 -04:00
Jirka Borovec 26f632cb10
switch LAI deployment branch (#15194)
* switch LAI deployment branch
* update links
2022-10-19 16:07:05 -04:00
Adrian Wälchli 045c2f5715
Efficient gradient accumulation in LightningLite (#14966)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-10-19 19:55:12 +00:00
Ethan Harris 4acb10f981
Add support for command descriptions (#15193) 2022-10-19 17:34:35 +01:00
Rohit Gupta 85ce43d1a3
Add docs for distributed inference (#15149) 2022-10-18 18:39:17 +00:00
Adrian Wälchli 7b185c7fc9
Update 8-bit optimizer docs (#15155) 2022-10-17 22:23:56 +02:00
Jirka Borovec 05d91c8e75
docs: temp drop S3 from index (#15099)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-10-13 17:59:02 +00:00
HELSON dd33528e00
[docs] Docs for ColossalaiStrategy (#15093) 2022-10-13 16:14:03 +00:00
Jerome Anand 672b5cbefe
Update obsolete URL in HPU docs (#15112) 2022-10-13 13:27:16 +02:00
Adrian Wälchli d2840a20bd
Update examples that require the run() method (#15096)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-10-12 14:32:05 +01:00
Adrian Wälchli b4b651c73a
Update docs regarding deprecation window (#15089) 2022-10-12 15:25:03 +02:00
Ray Schireman 0a5e75e8d1
Add `inference_mode` flag to Trainer (#15034)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-12 12:22:01 +00:00
Rohit Gupta ad1e06f2d4
Update tuner docs (#15087) 2022-10-12 08:55:56 +00:00
Stefano Borzì 1865300228
chore: add model as recommended parameter for validate() (#15086) 2022-10-12 10:46:03 +02:00
edenlightning 8715cd0346
secrets docs (#14951)
* secrets docs

* Update docs/source-app/glossary/secrets.rst

Co-authored-by: Yurij Mikhalevich <yurij@grid.ai>

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update secrets.rst

* links

Co-authored-by: Yurij Mikhalevich <yurij@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-10-11 10:09:26 -04:00
ver217 2fef6d9403
Add ColossalAI strategy (#14224)
Co-authored-by: HELSON <c2h214748@gmail.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: otaj <ota@lightning.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-10-11 13:59:09 +02:00
Adrian Wälchli 3183079204
Remove deprecated callback hooks (#14834)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: otaj <ota@lightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-10 15:46:28 +00:00
Krishna Kalyan 0086c7bfcd
Missing steps in run on your own machine docs (#15033) 2022-10-10 14:36:47 +00:00
Rohit Gupta ca3c4e7f07
Add tuner callback docs (#15030) 2022-10-08 18:21:27 +00:00
Amrutha dfc7886b24
docs: replacement of method type_as in docs to Tensor.to (#15027) 2022-10-08 10:04:15 +00:00
Rohit Gupta 7fed7a12c5
Add `LRFinder` callback (#13802)
* add BatchSizeFinderCallback callback
* enable fast_dev_run test
* keep tune and remove early_exit
* move exception to setup
* Apply suggestions from code review

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Laverne Henderson <laverne.henderson@coupa.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-10-05 13:15:38 +02:00
Carlos Mocholí 7ef87464dd
Refactor XLA and TPU checks across codebase (#14550) 2022-10-04 22:54:14 +00:00
Jerome Anand e62521caf1
Update hpu mixed precision link (#14974)
Signed-off-by: Jerome <janand@habana.ai>
2022-10-03 09:05:17 +02:00
Andres Algaba 3daa4c9cc0
Remove deprecated on_init_start_end (#14867)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
2022-09-30 15:11:38 +00:00
Adrian Wälchli c8059d4464
Update quick start guide with latest info (#14880)
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-09-29 20:54:20 +00:00
Adrian Wälchli ff3c5b7b9d
Docs section for SLURM troubleshooting (#14873)
Co-authored-by: Laverne Henderson <laverne.henderson@coupa.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-09-29 12:41:31 +00:00
Rohit Gupta d1a3a3ebf5
Add BatchSizeFinder callback (#11089)
* add BatchSizeFinderCallback callback

* temp rm from init

* skip with lr_finder tests

* restore loops and intergrate early exit

* enable fast_dev_run test

* add docs and tests

* keep tune and remove early_exit

* add more tests

* patch lr finder

* disable skip

* force_save and fix test

* mypy and circular import fix

* fix mypy

* fix

* updates

* rebase

* address reviews

* add more exceptions for unsupported functionalities

* move exception to setup

* chlog

* unit test

* address reviews

* Apply suggestions from code review

* update

* update

* mypy

* fix

* use it as a util func

* license

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* mypy

* mypy

* review

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* updates

* updates

* fix import

* Protect callback attrs

* don't reset val dataloader

* update test

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-09-27 08:54:37 -04:00
Andres Algaba 4fc8275cc3
Remove the deprecated `trainer.call_hook` (#14869) 2022-09-26 15:56:44 +02:00
jsr-p abb6049fa3
Update documentation for the basic skills tutorial level 2 on how to validate and test a model (#14874) 2022-09-24 10:34:06 +00:00
dconathan 633d14e67a
fixed comet -> mlflow typo in visualize/experiment_managers docs (#14843)
fixed comet -> mlflow typo

Co-authored-by: Devin Conathan <devin.conathan@libertymutual.com>
2022-09-24 00:13:28 +02:00
Adrian Wälchli dd2a1c5d29
Integrate Lite Precision into PL (#14798)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-09-22 14:51:04 +00:00
Laverne Henderson d1303cf628
Updated the structure and applied feedback (#14734) 2022-09-22 11:40:12 +02:00
Laverne Henderson da0ccb11a6
Updates links to components in the Gallery (#14807) 2022-09-21 22:22:05 +00:00
Mauricio Villegas 3064c28ce1
Added args parameter to LightningCLI to ease running from within Python (#14596)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-09-19 17:38:30 +00:00
Laverne Henderson 8c4e17f359
Removes the old HPO content (#14754)
* Removes the old HPO content

* Remove source-lit symlinks for HPO

* drop ref

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-09-19 09:25:57 -04:00
Gilad a5b0f8bd5c
Fix TQDMProgressBar usage in logging.rst (#14768) 2022-09-19 01:07:19 +02:00
Laverne Henderson 9ea4ab6b19
Update installation (#14732)
* Update installation

Updates to use python -m pip install -U lightning and adds troubleshooting note

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-09-15 15:30:32 -04:00
Yurij Mikhalevich 09f50b4295
Fix Google Tag Manager for the Lightning App docs (#14731)
- updates the Lightning App docs theme to the one without Pytorch Lightning docs Google Tag Manager hardcoded
- sets the GTM id in the conf.py for Lightning App docs
2022-09-15 18:35:16 +00:00
Akihiro Nitta 3c5e03e035
docs: Clarify versioning and API stability (#14549)
* mv releases to a standalone page

* Include release_policy in index

* Update policy

* mv releases to a standalone page

* Include release_policy in index

* Update policy

* Update title

* remove release_policy.rst

* Update versioning

* syntax

* simplify wording

* Include examples that don't follow X+2 rule

* syntax

* update

* consistency

* rm noninformative statement

* .

* Reduce redundancy in the deprecation process

* grammar?

* consistency

* Update docs/source-pytorch/versioning.rst

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-09-15 09:16:14 -04:00