312 Commits

Author SHA1 Message Date
Carlos Mocholí 12d6e44796
Grep for potential errors in standalone tests (#15341)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-11-05 04:29:38 +01:00
Adrian Wälchli dcfaa065ab
Improve the checkpoint upgrade utility script (#15333)
2022-11-04 21:41:32 +00:00
Yuxuan Lu ee8a57da0f
Fix usage of fs.listdir in CheckpointConnector (#15413)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
2022-11-04 20:21:52 +00:00
Adrian Wälchli 62d040c383
Fix ReduceOp type hint in ColossalAI strategy (#15535)
2022-11-04 19:34:34 +00:00
Adrian Wälchli 39c6ec9ce3
Only load global step when fitting (#15532)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-11-04 16:58:24 +00:00
Adrian Wälchli e52d6c5b35
Fix TensorBoardLogger's validation of example input when logging graph (#15323)
2022-11-02 21:10:15 +00:00
Adrian Wälchli 94f7d2319a
Introduce checkpoint migration (#15237)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-11-02 15:14:04 +00:00
Sitcebelly 94bed87a34
Implement freeze batchnorm with freezing track running stats (#15063)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-11-01 16:11:42 +00:00
Rohit Gupta 61ae35c378
Use sklearn in runif (#15426)
* Use sklearn in runif
* test by removing sklearn dep
* remove repeated code
* seed
2022-11-01 11:40:32 +00:00
Wouter Zwerink c287b5d668
neptune.init deprecation fix (#15393)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-10-31 11:10:44 -04:00
Adrian Wälchli 0f957b5a86
Fix DataLoader re-instantiation when attribute is array (#15409)
2022-10-31 16:09:29 +01:00
Rohit Gupta 773cb3e8c8
Fix skipped tests due to sklearn (#15311)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-10-31 13:58:34 +05:30
Adrian Wälchli 6b0d41cb8a
Fix issues when RichProgressBar disabled (#15376)
2022-10-29 00:52:35 +00:00
Carlos Mocholí 2fd1af0449
Deprecate `AllGatherGrad` (#15364)
2022-10-28 19:51:27 +00:00
Adrian Wälchli 5eafa52596
Fix resetting internal bars in RichProgressBar after each trainer stage (#15377)
2022-10-28 06:20:45 -04:00
Jirka Borovec 95ae393ca8
LAI: creating mirror package (#15105)
* placeholder

* mirror + prune

* makedir

* setup

* ci

* ci

* name

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ci clean

* empty

* py

* parallel

* doctest

* flake8

* ci

* typo

* replace

* clean

* Apply suggestions from code review

* re.sub

* fix UI path

* full replace

* ui path?

* replace

* updates

* regex

* ci

* fix

* ci

* path

* ci

* replace

* Update .actions/setup_tools.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* also convert lightning_lite tests for PL tests to adapt mocking paths

* fix app example test

* update logger propagation for PL tests

* update logger propagation for PL tests

* Apply suggestions from code review

* Revert "update logger propagation for PL tests"

This reverts commit c1a5e119c7.

* playwright

* py

* update import in tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* try edit import in overwrite

* debug code

* rev playwright

* Revert "try edit import in overwrite"

This reverts commit c02f766521.

* ci: adjust examples

* adjust examples cloud

* mock lightning_app

* Install assistant dependencies

* lightning

* setup

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Apply suggestions from code review

* disable cache

* move doctest to install

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* )

* echo ./

* ci

* lru

* revert disabling cache, prints

* ci

* prune ci jobs

* prune ci jobs

* training loop standalone tests

* add sys modules cleanup fixture

* make use of fixture

* revert standalone

* ci e2e

* fix imports in lightning

* fix imports of lightning in tests

* Revert "make use of fixture"

This reverts commit c15efdd205.

* Revert other commits for fixtures

* revert use of fixture

* py3.9

* fix mocking

* fix paths

* hack mocking

* docs

* Apply suggestions from code review

* rev suggestion

* Minor changes to the parametrizations

* Update checkgroup with the new and changed jobs

* include frontend dir

* cli

* fix imports and entry point

* Revert standalone

* rc1

* e2e on staging

* Revert "Revert standalone"

This reverts commit 9df96685b8.

* groups

* to

* ci: pt ver

* docker

* Apply suggestions from code review

* Copy over changes from previous commit to other groups

* Add back changes from bad merge

* Uppercase step name everywhere

* update

* ci

* ci: lai oldest

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
Co-authored-by: manskx <ahmed.mansy156@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2022-10-27 12:32:49 +02:00
Justus Schock 6ee1f6c4b7
New skip conditions for unpickle-patching tests (#15329)
* New running conditions for tests
* found one more mistake
2022-10-26 18:33:22 +02:00
Adrian Wälchli ac89d70d4a
Fix pickling issues with rich progress bar (#15319)
2022-10-26 15:25:11 +00:00
Adrian Wälchli 38a9e69543
Extend the detection of interactive mode (#15293)
* extend interactive mode detection
* update test names
* changelog
* test
2022-10-26 15:24:11 +00:00
Adrian Wälchli 0f9156374d
Mark internal Lite APIs as protected (#15307)
* mark internal lite apis as protected
* formatting
* docs update

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-10-26 12:51:50 +00:00
Justus Schock 629912298a
add cloudio pickle patching for unified package (#15309)
2022-10-26 10:44:55 +00:00
otaj 76e462a0be
Do not lose references of trainer in test (#15272)
* Fix reference error

* Skip flaky hanging test

* .

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-10-25 09:23:15 -04:00
Atharva Phatak 4322f53874
Update mypy version (#15161)
* update mypy version
* type-ignore-comments
* more mypy-fix
* import-fix
* Update Lite too
* simpler implementation for flatten dict
* Fix rich progress
* Simplify rich test
* True None

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-10-22 23:26:43 +02:00
Rohit Gupta 0a729f6da1
Avoid initializing optimizers during deepspeed evaluation (#14944)
2022-10-22 00:37:03 +05:30
Mauricio Villegas b556713eef
Fix LightningCLI parse_env and description in subcommands (#15138)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-10-21 16:50:18 +00:00
Carlos Mocholí 90e1a0ecf0
Single source for the mypy version (#15224)
2022-10-21 16:15:33 +09:00
Carlos Mocholí bf458701de
Avoid underscore suffix in filenames (#15189)
2022-10-20 07:39:19 -04:00
Adrian Wälchli 576757fd79
Validate SRUN variables when launching in SLURM (#15011)
2022-10-19 21:42:11 +00:00
Carlos Mocholí 24c26f7db2
Standardize Lite's filenames (#15058)
2022-10-19 14:09:41 +02:00
Carlos Mocholí 8e83bfafcd
Finishing touches to the graveyard (#15123)
2022-10-13 18:58:34 +00:00
HELSON dd33528e00
[docs] Docs for ColossalaiStrategy (#15093)
2022-10-13 16:14:03 +00:00
Rohit Gupta eb17dc9839
Deprecate tuning enum and trainer properties (#15100)
2022-10-13 13:29:50 +00:00
Ray Schireman 0a5e75e8d1
Add `inference_mode` flag to Trainer (#15034)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-12 12:22:01 +00:00
Adrian Wälchli aa12727ce2
Error messages for removed DataModule hooks (#15072)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-11 21:58:05 +00:00
Adrian Wälchli 16419f30cb
Error messages for removed Logger APIs (#15067)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-11 21:16:25 +00:00
Adrian Wälchli b1cc740fd6
Error messages for removed Trainer mixin methods (#15065)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-11 16:10:27 -04:00
Adrian Wälchli fe32b39dbc
Error messages for the remaining callback hooks (#15064)
2022-10-11 19:18:47 +00:00
Adrian Wälchli f1509537ec
Error messages for unsupported Trainer attributes (#15059)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2022-10-11 17:08:06 +00:00
Carlos Mocholí c1db77e691
Introduce the graveyard 🪦 (#15061)
2022-10-11 14:01:58 +00:00
Carlos Mocholí c739f6d2d9
Filter APEX future warning (#15078)
2022-10-11 13:04:48 +00:00
ver217 2fef6d9403
Add ColossalAI strategy (#14224)
Co-authored-by: HELSON <c2h214748@gmail.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: otaj <ota@lightning.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-10-11 13:59:09 +02:00
Carlos Mocholí 6f16e46bdb
Various test fixes (#15068)
2022-10-11 03:47:16 -04:00
Alessio Quercia 8a0514115c
Move the `_scan_checkpoints` utility function (#9312)
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-10-10 22:13:34 +00:00
Carlos Mocholí 713a2f79ee
Mark AMP test as flaky (#15055)
2022-10-10 23:26:13 +02:00
Carlos Mocholí c334b7766c
Remove old testing artifacts (#15052)
2022-10-10 17:34:18 +00:00
Adrian Wälchli 3183079204
Remove deprecated callback hooks (#14834)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: otaj <ota@lightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-10 15:46:28 +00:00
Carlos Mocholí d15bd1520e
[Lite] precision_plugin -> precision (#15001)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-10-10 15:00:32 +00:00
Carlos Mocholí 69fee71f22
Trim flaky amp test (#15051)
2022-10-10 13:49:37 +02:00
Max Ehrlich 5a3007cd6c
Support Slurm Autorequeue for Array Jobs (#15040)
Signed-off-by: Max Ehrlich <max.ehr@gmail.com>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-10-10 13:43:57 +02:00
Adrian Wälchli 8f90084059
Remove deprecated on_load/save_checkpoint behavior (#14835)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-10-10 11:08:13 +00:00