* Remove the deprecated device_stats_monitor_prefix_keys
* Added PR number to CHANGELOG.md
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
* Fix TPU test CI
* +x first
* Lite first to uncover errors faster
* Fixes
* One more
* Simplify XLALauncher wrapping to avoid pickle error
* debug
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Debug commit successful. Trying local definitions
* Require tpu for mock test
* ValueError: The number of devices must be either 1 or 8, got 4 instead
* Fix mock test
* Simplify call, rely on defaults
* Skip OSError for now. Maybe upgrading will help
* Simplify launch tests, move some to lite
* Stricter typing
* RuntimeError: Accessing the XLA device before processes have spawned is not allowed.
* Revert "RuntimeError: Accessing the XLA device before processes have spawned is not allowed."
This reverts commit f65107ebf3.
* Alternative boring solution to the reverted commit
* Fix failing test on CUDA machine
* Workarounds
* Try latest mkl
* Revert "Try latest mkl"
This reverts commit d06813aa67.
* Wrong exception
* xfail
* Mypy
* Comment change
* Spawn launch refactor
* Accept that we cannot lazy init now
* Fix mypy and launch test failures
* The base dockerfile already includes mkl-2022.1.0 - what if we use it?
* try a different mkl version
* Revert mkl version changes
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
* add BatchSizeFinderCallback callback
* temp rm from init
* skip with lr_finder tests
* restore loops and integrate early exit
* enable fast_dev_run test
* add docs and tests
* keep tune and remove early_exit
* add more tests
* patch lr finder
* disable skip
* force_save and fix test
* mypy and circular import fix
* fix mypy
* fix
* updates
* rebase
* address reviews
* add more exceptions for unsupported functionalities
* move exception to setup
* chlog
* unit test
* address reviews
* Apply suggestions from code review
* update
* update
* mypy
* fix
* use it as a util func
* license
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* mypy
* mypy
* review
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* updates
* updates
* fix import
* Protect callback attrs
* don't reset val dataloader
* update test
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Fix attribute error in SWA when running with Tuner
* changelog
* add better test
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Added functions to the WandbLogger to download and use artifacts without having to access the experiment object
* Updated CHANGELOG.md
* Added suggested changes
* Delete test_script
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
* add accelerator implementations to lite
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix imports
* rename registry argument
* fix test
* fix tests
* remove duplicated test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix tests
* deprecation
* deprecations
* flake8
* fixes
* add mps to runif
* fix tests
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove more
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* local import
* undo device stats :(
* fix import
* stupid typehints
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* more refactors :(
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
* rename init_device to setup_device
* remove unused import
* make uppercase to differentiate from class
* trick test after moving import locally
* add base classes and registry
* reg
* registry
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* tests
* update to other branches
* resolve todo(lite)
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add very basic unit tests
* fix name assignment
* Update src/lightning_lite/strategies/parallel.py
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
* remove deprecated property
* remove pre- and post backward for now
* protecting the registry utility function
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* remove unused import
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
* Add parameter to change the preemption signal
* Make the signal connector use the custom signal from SLURMEnvironment
Signed-off-by: Max Ehrlich <max.ehr@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>