Commit Graph

216 Commits

Author SHA1 Message Date
Adrian Wälchli 91e692c767
Rename the TPUSpawnStrategy to XLAStrategy (#16781)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-17 02:06:24 +00:00
Adrian Wälchli ad698f049b
Update Colossal AI docs and integration (#16778) 2023-02-16 16:14:24 +00:00
Justus Schock 47c69cd8eb
Remove DP (#16748)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-02-16 05:12:08 +00:00
Lightning Forever 41dd0d1f85
Remove the QuantizationAwareTraining callback (#16750) 2023-02-15 17:29:49 -05:00
Adrian Wälchli 83f4c83582
Replace ColossalAIStrategy with external implementation (#16757) 2023-02-15 15:11:52 +00:00
Adrian Wälchli c4074419b5
Remove the BaguaStrategy (#16746)
* remove bagua

* remove

* remove docker file entry
2023-02-14 08:58:58 -05:00
Adrian Wälchli 39020887d2
Remove Trainer's `track_grad_norm` argument (#16745) 2023-02-14 12:38:17 +00:00
Adrian Wälchli 99cb2cd056
Remove argparse utils (#16708)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-02-13 20:44:30 +00:00
Adrian Wälchli 67c09e3092
Separate the Gradient Accumulation Scheduler from Trainer (#16729)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-13 20:15:38 +00:00
Carlos Mocholí 457cd76d1a
Remove the unused `utilities.finite_checks` (#16682) 2023-02-09 21:11:05 +01:00
Adrian Wälchli 18106a8f95
Split train- and val progress into separate bars (#16695) 2023-02-09 19:43:50 +00:00
Adrian Wälchli 83296cc6cf
Update Fabric introduction (#16672)
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
2023-02-09 18:06:29 +00:00
Carlos Mocholí bf51844917
Remove memory-retaining epoch-end hooks (#16520) 2023-02-06 17:00:36 +00:00
Adrian Wälchli cd0eedb082
Set `find_unused_parameters=False` as the default (#16611) 2023-02-06 16:51:21 +01:00
Sebastian Raschka ce424d235f
Move fsdp_native to fine-tuning recommendation (#16630) 2023-02-05 15:09:46 +01:00
JiHoon Kim 65abdeea88
Fabric docs typo correction (#16635) 2023-02-05 01:09:58 +01:00
Adrian Wälchli 0f75dce8b4
Add MPI cluster environment (#16570)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-03 10:45:11 +00:00
Adrian Wälchli acb7ee223c
Ignore generated package files (#16605)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-02-02 09:24:07 +00:00
Carlos Mocholí ef2a6088ff
Drop support for PyTorch 1.10 (#16492)
* Drop support for PyTorch 1.10

* CHANGELOG

* READMEs

* mypy

* ls

* New poplar version

* Fixed tests

* links

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* skip azure badges

* Table

* Matching dockerfiles

* Drop unnecessary channels and packages

* Push nightly

* Undo unrelated changes

* Revert "Push nightly"

This reverts commit 9618f737c4.

---------

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-01 14:09:12 -05:00
Adrian Wälchli 01b152f169
Update docs for multiple optimizers in 2.0 (#16588) 2023-02-01 17:34:55 +00:00
Carlos Mocholí dc298f2340
Drop support for Python 3.7 (#16579)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-02-01 01:36:42 +00:00
Andrea Tupini d634846b5e
Minor formatting fix on model_parallel docs (#16565) 2023-01-30 12:40:03 -05:00
Adrian Wälchli 8aca46a192
Remove `using_lbfgs` argument from `optimizer_step` module hook (#16538) 2023-01-30 12:49:35 +00:00
Adrian Wälchli 1008f313e8
Remove `on_tpu` argument from `optimizer_step` module hook (#16537)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-01-30 13:17:20 +01:00
Jirka Borovec 879701f52f
ci: hotfix precommit/poetry/isort (#16549) 2023-01-30 11:07:52 +01:00
Adrian Wälchli bb7b8d601a
Fabric docs feedback 2/n (#16480) 2023-01-27 20:13:20 +01:00
Carlos Mocholí 226290cfc1
PyTorch 2.0 switched the `set_to_none` default (#16531)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-01-27 16:51:56 +00:00
Adrian Wälchli b216a114a7
Decouple Tuner from Trainer (#16462)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-01-27 15:08:40 +00:00
Carlos Mocholí d562319a61
Make the `FaultToleranceCheckpoint` callback opt-in (#16512) 2023-01-27 13:02:14 +00:00
belerico b5599e1320
Add reinforcement learning example for Fabric (#16506)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: Luca Antiga <luca@lightning.ai>
2023-01-27 11:28:25 +00:00
Carlos Mocholí 76cb048b29
Remove docs about automatic fault tolerance (#16500)
Remove docs about the experimental automatic fault tolerance
2023-01-26 19:47:40 +01:00
Adrian Wälchli c68cfd686e
Rename LiteMultiNode to FabricMultiNode (#16505) 2023-01-26 11:36:27 +00:00
Adrian Wälchli dfd8d80cb1
Multi-node documentation for Fabric (#16495)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-01-25 22:07:09 +00:00
Carlos Mocholí 486b4d5d9d
Remove old platform docs (#16499)
* Remove old platform docs

* More

* More
2023-01-25 16:16:51 +01:00
Carlos Mocholí d78cf99176
Remove the "native" suffix from the codebase (#16490) 2023-01-25 14:09:09 +00:00
Adrian Wälchli 8147e9b111
Grammar corrections for Fabric docs (#16494) 2023-01-25 11:45:09 +01:00
Adrian Wälchli c87bb71fa8
Add `Fabric.all_reduce` (#16459) 2023-01-24 22:35:00 +00:00
Carlos Mocholí 5891cdc940
Mark the loop classes as protected (#16445) 2023-01-23 16:30:13 +00:00
Carlos Mocholí 39b7cb80ca
Remove the FairScale integration (#16400)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-01-23 13:39:04 +00:00
Adrian Wälchli 3611fcd152
Update Fabric docs based on user feedback (#16460)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-01-23 14:28:20 +01:00
Carlos Mocholí d3de5c64d7
Remove the deprecated code in `pl.utilities.data` (#16440) 2023-01-20 01:03:55 +01:00
Adrian Wälchli 39acb81b9b
Fabric checkpointing 1/n: base implementation (#16434)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-01-19 20:40:12 +00:00
Carlos Mocholí 8f736372ed
Loop flattening: reduce base interface (#16429)
* Loop flattening: remove the default `.run()` implementation

* None return

* mypy

* Loop flattening: reduce base interface

* Fix

* Docs

* Bad merge

* Fix

* Fix
2023-01-19 18:39:36 +01:00
Carlos Mocholí da82d490f3
Remove the deprecated code in `pl.utilities.optimizer` (#16439) 2023-01-19 18:39:36 +01:00
Carlos Mocholí 0cf0e90e4a
Remove the deprecated code in `pl.utilities.cloud_io` (#16438) 2023-01-19 18:39:36 +01:00
Carlos Mocholí df795b45c0
Remove the deprecated code in `pl.utilities.seed` (#16422) 2023-01-19 18:39:36 +01:00
Carlos Mocholí f031f1e453
Remove the `HivemindStrategy` (#16407)
Remove the collaborative strategy
2023-01-19 18:39:36 +01:00
Carlos Mocholí 04b929c2af
Remove the deprecated code in `pl.utilities.apply_func` (#16413) 2023-01-19 18:39:36 +01:00
Carlos Mocholí 256199ff7c
Remove support for logging multiple metrics together (#16389) 2023-01-19 18:39:36 +01:00
Carlos Mocholí 46246c3336
Loop flattening: remove `.connect()` (#16384) 2023-01-19 18:39:36 +01:00