Commit Graph

8856 Commits

Author SHA1 Message Date
Yi Heng Lim 4444d0c37d
Fix support for passing -1 to `find_usable_cuda_devices` function (#16866)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-02-27 20:08:42 +00:00
Adrian Wälchli a54f37391f
Explain `configure_sharded_model` in ColossalAI docs (#16872) 2023-02-27 20:45:15 +01:00
Adrian Wälchli 07b89c87ee
Merge DDPStrategy and DDPSpawnStrategy in PL (#16809) 2023-02-27 14:43:23 -05:00
Adrian Wälchli e3efbaa7f6
Incorporate pytorch's fixes in device_count_nvml (#16795)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-27 18:07:55 +00:00
Adrian Wälchli 36c39ea702
Disable the IPU check group (#16886) 2023-02-27 17:51:59 +01:00
Carlos Mocholí 3ddf100671
Check for dataloader_idx presence in the hooks (#16837) 2023-02-27 17:46:46 +01:00
Adrian Wälchli e48613207a
Promote `Fabric.launch()` as the default experience in Fabric docs (#16878) 2023-02-27 08:19:54 -05:00
dependabot[bot] 6de4af9aa3
Bump Lightning-AI/utilities from 0.6.0 to 0.7.1 (#16879)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-27 13:16:58 +01:00
Adrian Wälchli 5bdf8a52bc
Remove redundant strategy property (#16811) 2023-02-26 00:11:51 +00:00
Darren Tuit cdf21a1305
Fix imports for lightning cli examples (#16871) 2023-02-26 00:09:34 +01:00
Aditya Kane df6e37da1c
Update torch_xla installation instructions in tpu_basic.rst (#16865) 2023-02-26 00:07:40 +01:00
Carlos Mocholí 6b9ddf00dd
Introduce `Trainer(barebones=True)` (#16854)
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2023-02-24 12:32:31 +01:00
Adrian Wälchli 462f1ee691
Fix amp ddp test in Fabric (#16862) 2023-02-23 19:05:30 -05:00
Carlos Mocholí 0130273eb5
Trainer: auto default (#16847) 2023-02-23 18:42:17 +01:00
Carlos Mocholí d486f94dd2
Fabric: auto default (#16842) 2023-02-23 13:45:27 +00:00
Adrian Wälchli bc9651364f
Update changelog after 1.9.3 and bump version for RC (#16833)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-23 13:37:24 +00:00
Liyang90 19139137c7
Fix for hanging issue on TPU Pod (#16844)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-23 14:27:00 +01:00
Carlos Mocholí 235e692259
Fabric: do `set_epoch` for `batch_sampler.sampler` (#16841) 2023-02-23 00:11:29 +00:00
Ethan Harris beced48904
[App] Add support for plugins to return actions (#16832) 2023-02-22 17:12:04 +00:00
Carlos Mocholí 62e3d5854f
Consume the prediction batch indices iteratively (#16826) 2023-02-22 17:03:08 +01:00
Justus Schock 598c2476cd
Remove implicit frontend testing from `testing.run_app_in_cloud` (#16741)
Co-authored-by: Ethan Harris <ethanwharris@gmail.com>
2023-02-22 14:48:10 +00:00
Carlos Mocholí 914effa04c
Rename `replace_sampler_ddp|replace_sampler` to `use_distributed_sampler` (#16829)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-02-22 14:07:02 +01:00
Carlos Mocholí 565d6111f3
Move `max_batches` definition to the Loops (#16820) 2023-02-22 14:01:34 +01:00
Ethan Harris f969411284
[App] Fix local app run with relative import (#16835) 2023-02-22 12:23:28 +00:00
Carlos Mocholí 2bd54e4602
Add more pytree tests (#16825)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-02-21 17:27:00 +01:00
Jirka Borovec 0009cde1db
fix make clean command (#16823) 2023-02-21 10:01:30 -05:00
Carlos Mocholí b30a43f783
Move the `CombinedLoader` to an utility file (#16819) 2023-02-20 18:06:35 +01:00
Carlos Mocholí d807c003a7
Always run standalone tests (#16705) 2023-02-20 14:58:44 +00:00
Carlos Mocholí 3b548c1104
Avoid instantiating CombinedDataset unnecessarily (#16805) 2023-02-20 15:39:08 +01:00
Adrian Wälchli 0e4ca7c286
Set accelerator through CLI only if set explicitly (#16818) 2023-02-20 13:45:06 +00:00
Adrian Wälchli 65e66814f8
Remove the `*_step_end` hooks (#16791)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-20 13:04:40 +00:00
Carlos Mocholí 83b88996cd
Move `_TrainingEpochLoop` (#16801) 2023-02-20 13:33:01 +01:00
Carlos Mocholí 365bf10936
Resolve FitLoop setter TODOs (#16803) 2023-02-20 13:32:36 +01:00
Carlos Mocholí 781768d2b2
Remove `Trainer(multiple_trainloader_mode)` in favor of `CombinedLoader(mode)` (#16800) 2023-02-20 13:32:06 +01:00
Adrian Wälchli 81b7c30291
Make DDP subprocess the default launcher for multi-device (#16780) 2023-02-20 11:20:50 +00:00
Mauricio Villegas 3a0519143a
Fix bug in lightning_cli_advanced_3.rst (#16792) 2023-02-20 12:12:17 +01:00
Sebastian Raschka a4f4b5372a
Add missing docs quote (#16797)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2023-02-20 11:09:55 +01:00
Adrian Wälchli 2844e9e246
Fix XLAEnvironment detection on TPU pod (#16806) 2023-02-20 11:01:06 +01:00
Justus Schock c7962a1619
Add back external colossalai test (#16817) 2023-02-20 09:46:40 +00:00
dependabot[bot] 60004eb468
Bump Lightning-AI/utilities from 0.4.1 to 0.6.0 (#16812)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-20 08:17:52 +01:00
Yurij Mikhalevich 6950a07eaa
[App] fix `lightning open` command & better redirects (#16794)
* fix(app): URLs, create run on app run

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-17 18:24:23 +00:00
Justus Schock 0fee28409b
Introduce new precision layout in PL (#16783) 2023-02-17 17:58:14 +01:00
Carlos Mocholí ec4f592ecf
Sequential `CombinedLoader` to flatten the eval and predict loops (#16726) 2023-02-17 17:37:11 +01:00
Adrian Wälchli ccd2a481d0
Update changelog after 1.9.2 release (#16777)
changelog

Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-02-17 08:52:03 -05:00
Ethan Harris 7f92d5c9d4
[App] Refactor plugins to be a standalone `LightningPlugin` (#16765) 2023-02-17 11:01:38 +00:00
Justus Schock ac5fa03385
Introduce new precision layout in fabric (#16767) 2023-02-17 10:41:18 +00:00
Ethan Harris 3a354acc61
[App] Reserve APP_SERVER_PORT in cloud port allocation (#16782)
Co-authored-by: thomas chaton <thomas@grid.ai>
2023-02-17 09:33:17 +00:00
Noha Alon 1a6331f88f
fix warning so the user has a clear next step (#16751) 2023-02-17 09:26:44 +02:00
Adrian Wälchli 91e692c767
Rename the TPUSpawnStrategy to XLAStrategy (#16781)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-17 02:06:24 +00:00
Ethan Harris 6e359dcc86
[App] Fix idle timeout e2e (#16786) 2023-02-17 01:52:46 +00:00