lightning

Commit Graph

Author	SHA1	Message	Date
ritsuki1227	6855f653bb	Set `MLFlowLogger` status to FAILED when training raises an error (#12292 ) Co-authored-by: Ritsuki Yamada <ritsuki.yamada@uzabase.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka <jirka.borovec@seznam.cz> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-09-20 07:43:32 -04:00
awaelchli	c0ff7a1b77	Add backward-compatibility for LightningLite in PL (#14735 )	2022-09-20 13:31:56 +02:00
awaelchli	e3e71670e6	Move src/pytorch_lightning/lite to src/lightning_lite (#14735 )	2022-09-20 13:31:56 +02:00
Carlos Mocholí	810643bca2	Surface Neptune installation problems to the user (#14715 )	2022-09-20 10:19:51 +00:00
Mauricio Villegas	3064c28ce1	Added args parameter to LightningCLI to ease running from within Python (#14596 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jirka <jirka.borovec@seznam.cz>	2022-09-19 17:38:30 +00:00
Carlos Mocholí	e9c571d39f	Move accelerator-specific parsing functions with their accelerators (#14753 ) Co-authored-by: awaelchli <aedu.waelchli@gmail.com>	2022-09-18 22:48:45 +00:00
Adrian Wälchli	4f9c7793e7	Fix TensorBoardLogger creating redundant experiment when finalizing (#14762 ) Co-authored-by: Kushashwa Ravi Shrimali <kushashwaravishrimali@gmail.com>	2022-09-18 16:27:15 -04:00
Adrian Wälchli	35c65b0287	Fix test suite when running on MPS-enabled hardware (#14708 )	2022-09-16 19:21:36 +00:00
Adrian Wälchli	47f0d336f1	Standalone Lite: Update LightningLite (#14726 )	2022-09-16 17:25:27 +00:00
Carlos Mocholí	8c01c89d74	Remove deprecated `NeptuneLogger` code (#14727 )	2022-09-16 16:26:15 +00:00
Adrian Wälchli	5bef75648e	Remove deprecated `torch_distributed_backend` logic (#14693 ) * Remove deprecated torch_distributed_backend logic * changelog * mention deprecated * imports Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka <jirka.borovec@seznam.cz>	2022-09-16 17:27:36 +02:00
Adrian Wälchli	619e76f22d	Remove silent behavior when `num_slurm_tasks` does not correspond to number of processes in Trainer (#14300 ) * simplify logic * remove hpc * update * add changelog * more tests * update test Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jirka <jirka.borovec@seznam.cz> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-09-16 11:00:09 +00:00
Carlos Mocholí	5ff78f0753	Use the setter in the children recursively (#14724 )	2022-09-15 13:58:12 +00:00
Adrian Wälchli	8b3d6d8feb	Add easy access to `state_dict` in Lite module wrapper (#14629 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-09-14 19:29:23 -04:00
Manan Goel	48e783dd0d	Added support for downloading wandb artifacts in the WandbLogger (#14551 ) * Added functions to the WandbLogger to download and use artifacts without having to access the experiment object * Updated CHANGLELOG.md * Added suggested changes * Delete test_script Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: awaelchli <aedu.waelchli@gmail.com>	2022-09-14 14:11:52 +00:00
Adrian Wälchli	6333caabb0	Standalone Lite: Strategy base classes and registry (#14662 ) * add accelerator implementations to lite * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix imports * rename registry argument * fix test * fix tests * remove duplicated test * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix tests * deprecation * deprecations * flake8 * fixes * add mps to runif * fix tests * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Apply suggestions from code review Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove more * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local import * undo device stats :( * fix import * stupid typehints * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more refactors :( * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix * rename init_device to setup_device * remove unused import * make uppercase to differentiate from class * trick test after moving import locally * add base classes and registry * reg * registry * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * tests * update to other branches * resolve todo(lite) * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add very basic unit tests * fix name assignment * Update src/lightning_lite/strategies/parallel.py Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com> * remove deprecated property * remove pre- and post backward for now * protecting the registry utility function * [pre-commit.ci] auto 
fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused import Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>	2022-09-14 09:15:21 -04:00
otaj	616304831a	Remove deprecated `BaseProfiler` and `AbstractProfiler` (#14404 ) Co-authored-by: Jirka <jirka.borovec@seznam.cz> Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>	2022-09-13 14:52:09 +00:00
Adrian Wälchli	19a1274093	Better error message when dataloader and datamodule is None (V2) (#14637 )	2022-09-13 12:26:03 +00:00
Adrian Wälchli	1ee3d1eb72	Avoid warning when cloning tensor in self.log (#14599 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-09-13 16:23:46 +05:30
Adrian Wälchli	4bd135a6f6	Remove deprecated `LoggerCollection` (#14283 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-09-12 21:46:46 +00:00
Max Ehrlich	e5998e6bf2	Make the SLURM Preemption/Timeout Signal Configurable (#14626 ) * Add parameter to change the preemption signal * Make the signal connector use the custom signal from SLURMEnvironment Signed-off-by: Max Ehrlich <max.ehr@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-09-12 19:24:35 +00:00
Adrian Wälchli	925edbca07	Remove the deprecated `weights_save_path` Trainer argument (#14424 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-09-12 19:02:38 +00:00
Mauricio Villegas	1680a76819	Removed from_argparse_args tests in test_cli.py (#14597 )	2022-09-12 18:25:29 +00:00
Adrian Wälchli	d013bcc5bf	Standalone Lite: Accelerators (#14578 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-09-12 16:00:14 +00:00
Carlos Mocholí	cf3428784f	Set `running_torchscript` recursively (#14657 ) * Set `running_torchscript` recursively * CHANGELOG	2022-09-12 14:39:40 +00:00
Carlos Mocholí	e859546b96	Integrate lightning_utilities `is_overridden` (#14620 )	2022-09-12 15:16:57 +02:00
awaelchli	cbbd148089	Add back-compatibility for checkpoint io plugins in pl/plugins/io (#14519 )	2022-09-12 08:28:46 -04:00
awaelchli	463439e624	Move checkpoint io plugins from pl/plugins/io to lite/plugins/io (#14519 )	2022-09-12 08:28:46 -04:00
Adrian Wälchli	024e7b8204	Standalone Lite: Cluster Environments (#14509 )	2022-09-12 12:20:08 +02:00
Vasilis Vryniotis	7e9e441843	Use TorchVision's Multi-weight Support and Model Registration API on Lightning (#14567 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-09-09 20:04:57 +00:00
Adrian Wälchli	95374440ce	Move device parser tests inside Lite (#14586 )	2022-09-07 21:22:46 +00:00
Adrian Wälchli	d2459df2ff	Standalone Lite: Remaining Utilities (#14492 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Laverne Henderson <laverne.henderson@coupa.com> Co-authored-by: Felonious-Spellfire <felonious.spellfire@gmail.com>	2022-09-07 15:25:23 +00:00
Carlos Mocholí	bcad90141a	Remove old test artifacts (#14574 )	2022-09-07 10:09:59 -04:00
Carlos Mocholí	8c4184c105	Integrate with `lightning_utilities.core.enums` (#14558 )	2022-09-07 15:14:14 +02:00
Carlos Mocholí	5216c51096	Integrate `lightning_utilities.core.rank_zero` (#14556 )	2022-09-07 09:21:48 +00:00
Carlos Mocholí	273a9ed8c1	Integrate `lightning_utilities.core.apply_func` (#14537 )	2022-09-06 13:52:54 +00:00
Carlos Mocholí	44216fdd69	Integrate `lightning_utilities.core.imports` (#14475 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-09-06 12:56:20 +00:00
Carlos Mocholí	8a4a3b6766	Mark the lite `DeviceDtypeModuleMixin` as protected (#14548 )	2022-09-06 14:17:15 +02:00
Rohit Gupta	8c6119fbce	Add auto wrapping support for `DDPFullyShardedStrategy` (#14383 )	2022-09-05 19:07:26 +00:00
awaelchli	7f148b2c47	Deprecate pl/utilities/apply_func (#14516 )	2022-09-05 20:30:42 +02:00
awaelchli	9fea2ed9d5	move pl/utilities/apply_func.py to pl/utilities/apply_func.py (#14516 )	2022-09-05 20:30:42 +02:00
awaelchli	cfea2be137	Deprecate pl/utilities/cloud_io.py (#14515 )	2022-09-05 18:30:31 +02:00
awaelchli	def6548596	move pl/utilities/cloud_io.py to lite/utilities/cloud_io.py (#14515 )	2022-09-05 18:30:31 +02:00
awaelchli	165427a506	Deprecate pl/utilities/xla_device (#14514 )	2022-09-05 17:36:02 +02:00
awaelchli	75d5a2d046	move pl/utilities/xla_device.py to lite/utilities/xla_device.py (#14514 )	2022-09-05 17:36:02 +02:00
awaelchli	c2879c20da	Deprecate pl/core/mixins/device_dtype_mixin and update imports (#14511 )	2022-09-05 16:31:00 +02:00
awaelchli	cefe2fa123	Move test_dtype_device_mixin to lite (#14511 )	2022-09-05 16:31:00 +02:00
Rohit Gupta	ce702fd40e	Squeeze tensor while logging (#14489 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-09-05 14:01:51 +00:00
Tianshu Wang	23f0e20209	Fixed `WandbLogger` `save_dir` is not set after creation (#12748 ) (#14326 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-09-05 10:12:43 +00:00
Roberto de Moura Estevão Filho	ed0164a3d2	Estimate stepping batches with max_steps if max_epochs is not set (#14317 ) Co-authored-by: Roberto Estevão <robertode@microsoft.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-09-05 09:05:21 +00:00
Carlos Mocholí	4235eff712	Use a standalone test symlink for Lite (#14502 )	2022-09-04 20:57:28 +02:00
Adrian Wälchli	291dc1b615	Standalone Lite CI setup (#14451 ) Co-authored-by: Jirka <jirka.borovec@seznam.cz> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-09-01 22:13:12 +00:00
Carlos Mocholí	e0c2c3e677	Clean up fairscale imports (#14476 )	2022-09-01 18:08:40 +02:00
Adrian Wälchli	28e18881a9	Mark stage argument in hooks as required (#14064 ) Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2022-09-01 15:47:40 +02:00
Rohit Gupta	e90ac769d6	Reset dataloaders on failure in tuner (#14372 )	2022-08-31 21:00:18 +00:00
Carlos Mocholí	2e3d85af84	Remove deprecated rank zero utilities (#14471 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-08-31 18:29:11 +00:00
Anner	626827c872	update rng state save/load test to also run on cuda gpu (#14396 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-08-31 16:36:35 +00:00
Carlos Mocholí	a1dd718781	Remove deprecated support for passing the warning category positionally (#14470 )	2022-08-31 17:34:56 +02:00
Carlos Mocholí	291267c3bf	Unify rank zero messaging utilities (#14116 )	2022-08-30 09:51:30 +00:00
ananthsub	d0d1818d50	Update `has_len_all_ranks` to use `Strategy.root_device` (#12144 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-08-29 20:23:34 +00:00
Carlos Mocholí	f202e84f4b	Remove the legacy `get_deprecated_arg_names` (#14415 )	2022-08-29 14:53:57 +02:00
Krishna Kalyan	1a3fe39571	Removed deprecated `Trainer.num_processes` property in favour of `Trainer.num_devices` (#14423 ) Co-authored-by: awaelchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-28 23:59:24 +02:00
Krishna Kalyan	5cbe1f48d2	Removed the deprecated `Trainer.data_parallel_device_ids` function in favour of `Trainer.device_ids` (#14422 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-28 18:07:00 +00:00
Krishna Kalyan	cea9a72d9d	Removed the deprecated the `trainer.lr_schedulers` (#14408 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-28 18:06:09 +00:00
otaj	1e04951206	Remove deprecated `TrainerCallbackHookMixin` (#14401 ) * remove deprecated callback hook * changelog	2022-08-28 10:56:37 +00:00
Rohit Gupta	f3574176e2	Change `trainer.should_stop` to not stop in between an epoch and run until `min_steps/min_epochs` only (#13890 )	2022-08-27 12:12:24 +00:00
Adrian Wälchli	250c06e406	Remove deprecated HPC model hooks (#14315 ) Co-authored-by: rohitgr7 <rohitgr1998@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-08-26 20:59:32 +00:00
Carlos Mocholí	3ba0f56b18	Remove support for the deprecated torchtext legacy (#14375 )	2022-08-26 20:01:51 +00:00
Tianshu Wang	8950613552	save checkpoints and profiler output to the first logger (#14325 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-08-26 17:23:54 +00:00
Carlos Mocholí	d4bcafad7a	Remove the deprecated loop output format (#14373 )	2022-08-26 16:56:56 +00:00
Justin Goheen	ed84d04bcf	Fix mypy errors attributed to `pytorch_lightning.core.datamodule` (#13693 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: rohitgr7 <rohitgr1998@gmail.com> Co-authored-by: otaj <ota@lightning.ai>	2022-08-26 16:26:26 +00:00
Adrian Wälchli	fafd254678	Fix device parser logic to avoid creating CUDA context (#14319 ) * let environment disable forking * add helper function and error messages * tests * changelog Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-08-26 15:41:38 +00:00
Björn Barz	0102d0d4d4	Fix restoring trainer after `lr_find()` (#14113 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-08-26 15:19:08 +00:00
Justin Goheen	94e567e6f0	Fix mypy errors attributed to `pytorch_lightning.trainer.connectors.data_connector.py` (#13806 ) Co-authored-by: rohitgr7 <rohitgr1998@gmail.com> Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>	2022-08-26 13:28:27 +00:00
Adrian Wälchli	e2221a0b3e	Raise an error when resuming training with Apex (#14341 )	2022-08-26 13:11:24 +00:00
Rohit Gupta	6d00f31f0c	Add auto wrapping for `DDPFullyShardedNativeStrategy` (#14252 )	2022-08-26 09:01:48 +00:00
Christian Schell	70deac2cd4	Reset epoch progress with batch size scaler (#13846 ) Co-authored-by: Christian Schell <christian.schell@uni-wuerzburg.de> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-26 14:12:00 +05:30
Adrian Wälchli	e67842dcba	Support sharded optimizer state dumping outside of sharded strategies (#14208 )	2022-08-26 07:58:21 +00:00
Justus Schock	a01e016fff	Remove mps config for test (#14379 ) * Remove mps config for test * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2022-08-26 02:47:37 -04:00
Anner	33a5ed9879	Add torch.cuda rng state to seed save/load (#14384 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-26 05:26:00 +00:00
Tanmoy	807435885e	Fix `LightningDataModule` hparams parsing (#12806 ) Co-authored-by: Akihiro Nitta <nitta@akihironitta.com> Co-authored-by: Jirka <jirka.borovec@seznam.cz> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-25 18:57:48 +00:00
Jirka Borovec	99ba95a38e	fix imports of collections.abc for py3.10 (#14345 ) fix collections.abc for py3.10 Co-authored-by: Sherin Thomas <sherin@grid.ai>	2022-08-23 11:52:58 -04:00
Carlos Mocholí	7a617ec90e	Add back support for logging in the gradient clipping hooks (#14298 ) * Add back support for logging in the gradient clipping hooks * Docs and CHANGELOG * Fix tests	2022-08-22 09:19:53 -04:00
Rohit Gupta	db1835a82c	Fix an issue to avoid the impact of sanity check on `reload_dataloaders_every_n_epochs` for validation (#13964 )	2022-08-21 23:55:03 +05:30
Kaushik B	a8c6e69b43	Fix wrong num padding for RichProgressBar (#14296 )	2022-08-19 09:40:44 +05:30
Rohit Gupta	d9c6090170	Deprecate `on_colab_kaggle` func (#14247 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-08-18 18:34:21 +00:00
Adrian Wälchli	326f7565b0	Forward extra keyword arguments in `LightningDataModule.from_datasets` (#14185 ) Co-authored-by: otaj <ota@lightning.ai> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-08-18 14:06:39 +00:00
Adrian Wälchli	7879628a3a	Fix access to logger attribute when multiple loggers are used (#14234 ) * Fix access to logger attribute when multiple loggers are used * add changelog	2022-08-18 08:55:08 -04:00
Rohit Gupta	e949362a6b	Enable `on_before_batch_transfer` for `DPStrategy` and `IPUAccelerator` (#14023 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-08-18 12:12:29 +00:00
Adrian Wälchli	2e59c49592	Update defaults for WandbLogger's run name and project name (#14145 )	2022-08-17 16:31:20 +00:00
otaj	44cdbcab04	Allowed setting attributes on `DataLoader` and `BatchSampler` when instantiated inside `*_dataloader` hooks (#14212 )	2022-08-17 11:42:54 -04:00
Rohit Gupta	48c23e5716	Use fsdp module to initialize precision scalar for fsdp native (#14092 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Laverne Henderson <laverne.henderson@coupa.com>	2022-08-13 07:52:06 +00:00
Rohit Gupta	c8e22b4572	Avoid raising the sampler warning if num_replicas=1 (#14097 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>	2022-08-12 08:44:21 +00:00
Adrian Wälchli	807f9d8c96	Replace unwrapping logic in strategies (#13738 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-12 08:24:04 +00:00
Rohit Gupta	6789a066b5	Avoid false positive warning about using `sync_dist` when using torchmetrics (#14143 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-08-12 12:52:24 +05:30
Rohit Gupta	2d9e00fab6	Profile batch transfer and gradient clipping hooks (#14069 )	2022-08-11 23:21:53 +00:00
Adrian Wälchli	56533368af	Remove DeepSpeed version restriction from Lite (#13967 )	2022-08-11 16:17:56 +00:00
Adrian Wälchli	3b18da3eaf	Fix saving hyperparameters in a composition where parent is not a LM or LDM (#14151 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-11 15:49:46 +00:00
Carlos Mocholí	3dc08b1ef5	Fix flaky test caused by weak reference (#14157 )	2022-08-11 09:33:19 +02:00
Adrian Wälchli	a7cebf2416	Fix entry point test for Python 3.10 (#14154 )	2022-08-11 01:32:32 +02:00
Adrian Wälchli	4008f9cd41	Convert subprocess test to standalone test (#14101 )	2022-08-10 17:15:12 -04:00
otaj	f132d44821	Fix a bug that caused spurious `AttributeError` when multiple `DataLoader` classes are imported (#14117 )	2022-08-10 16:09:50 +00:00
Carlos Mocholí	9b61b1c482	Remove duplicated test classes (#14122 ) Remove duplicated classes	2022-08-10 17:21:05 +02:00
Adrian Wälchli	dc8ff5ed26	Fix device placement when `.cuda()` called without specifying index (#14128 )	2022-08-10 05:23:20 -04:00
Adam Reeve	975a4fc2f1	Support checkpoint save and load with Stochastic Weight Averaging (#9938 ) Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com> Co-authored-by: Kushashwa Ravi Shrimali <kushashwaravishrimali@gmail.com> Co-authored-by: Jirka <jirka.borovec@seznam.cz> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-09 23:18:21 +00:00
Adrian Wälchli	06c255c5c1	Skip ddp fork tests on windows (#14121 )	2022-08-09 22:54:10 +00:00
Carlos Mocholí	d85085479d	Reset all results on epoch end (#14061 )	2022-08-09 23:01:11 +05:30
Rohit Gupta	ac369f5570	Fix incorrect `precision="mixed"` being used with `DeepSpeedStrategy` and `IPUStrategy` (#14041 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-08-09 21:25:23 +05:30
Anton Shevtsov	c55fe7105b	Prefix seed_everything log messages with rank info (#14031 ) Co-authored-by: Anton Shevtsov <aeshevtsov@avito.ru> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-08-09 15:40:30 +02:00
Adrian Wälchli	0cfc53d6b4	Fix regression on default value for `find_unused_parameters` (#14095 )	2022-08-09 13:56:02 +05:30
Carlos Mocholí	d072e4451a	Fix dtype inference during gradient norm computation (#14051 )	2022-08-08 11:35:06 +00:00
Carlos Mocholí	aaeff90254	Remove deprecated `DistributedType` and `DeviceType` enum classes (#14045 )	2022-08-08 10:07:54 +02:00
Rohit Gupta	b25275ccc2	Cast to fp16 before moving to device with deepspeed (#14000 )	2022-08-05 22:15:15 +00:00
Carlos Mocholí	91dd6a68fb	Remove meta device utilities in favor of torchdistx (#13868 )	2022-08-05 12:20:27 +00:00
Adrian Wälchli	3d5c3d24f9	Remove unused auto_collect_arguments class method (#14015 )	2022-08-05 08:49:00 +00:00
Rohit Gupta	a4e4cab7a6	Deprecate `amp_level` from `Trainer` (#13898 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-08-05 08:31:19 +00:00
Carlos Mocholí	b88b700745	Remove the deprecated DDP2 strategy (#14026 )	2022-08-04 20:27:35 +00:00
Rohit Gupta	f5bd6e6f5f	Cast only floating types with IPUs (#13983 )	2022-08-04 19:46:07 +00:00
Adrian Wälchli	ef0623ec64	Remove deprecated training type plugins (#14011 ) * Remove deprecated training type plugins * update changelog * DDP2Plugin * Update src/pytorch_lightning/CHANGELOG.md	2022-08-04 18:00:00 +02:00
Rohit Gupta	e78bf2044b	Raise an error if batch transfer hooks are overridden with IPUAccelerator (#13961 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-08-04 12:04:42 +00:00
Adam J. Stewart	d748dae548	Fix erroneous warning for unset `max_epochs` (#13262 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Akihiro Nitta <nitta@akihironitta.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-03 19:17:21 +00:00
Adrian Wälchli	e6a8283e9c	Organize accelerator tests (#13986 )	2022-08-03 13:49:55 +00:00
Adrian Wälchli	4ce97f37a2	Validate the model input of trainer methods (#13892 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-03 13:38:42 +00:00
Adrian Wälchli	ce025bf954	Lazy import check for hydra dependency (#13812 )	2022-08-03 04:27:16 -04:00
Jerome Anand	b3203d93d0	Added support for HPU device stats monitor (#13819 ) * Added support for HPU device stats monitor Signed-off-by: Jerome <janand@habana.ai> * Update changelog Signed-off-by: Jerome <janand@habana.ai> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Apply suggestions from code review Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com> * Update reference Signed-off-by: Jerome <janand@habana.ai> * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * fix alignment * add descriptions * Update hpu_intermediate.rst Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-08-02 13:31:31 +05:30
Adrian Wälchli	eb233ea12d	Snapshot selected globals and restore them in spawned process (#13921 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-08-01 22:21:46 +00:00
Rohit Gupta	0f6caffa57	Fix deepspeed default precision plugin `amp_level` to O2 (#13897 ) Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>	2022-07-29 20:36:51 +00:00
Adrian Wälchli	caaf35689c	Improvements to standalone scripts (#13840 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-07-28 23:33:22 +00:00
HMellor	07b39c257b	Cast on host instead of IPU when using `precision=16` (#13880 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-07-28 19:26:41 +00:00
Adrian Wälchli	25203d4c81	Organize model summary utilities (#13893 )	2022-07-28 19:23:29 +02:00
Carlos Mocholí	406cea7146	Support DeepSpeed <0.7.0 (#13859 ) Co-authored-by: awaelchli <aedu.waelchli@gmail.com>	2022-07-28 14:38:51 +00:00
Carlos Mocholí	1299e4f984	Run GPU tests with PyTorch 1.12 (#13716 ) Co-authored-by: Jirka <jirka.borovec@seznam.cz>	2022-07-28 19:37:57 +05:30
Carlos Mocholí	511875e567	Support DeepSpeed >=0.6.0, <0.6.5 (#13863 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-07-27 18:57:52 +02:00
Adrian Wälchli	fff62f0ae5	Fix TPU testing and collect all tests (#11098 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>	2022-07-27 15:40:40 +00:00
otaj	95f5f170f5	Allowed custom `BatchSampler`s when instantiated in `*_dataloader` hook (#13640 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-07-27 15:32:50 +00:00
Adrian Wälchli	2a24b906ac	Add batch size script argument for standalone tests (#13841 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2022-07-27 12:36:22 +00:00
otaj	4c7b9f0b11	Disallow batch sampler with multiple IPU devices (#13854 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-07-27 15:20:43 +05:30
Anton Shevtsov	41f45b475e	Check if the scheduler already has `reduce_on_plateau` (#13838 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-07-27 09:10:57 +00:00
Adrian Wälchli	c3911700d1	Fix error handling in learning rate finder (#13845 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-07-27 04:32:39 -04:00
Rohit Gupta	faf7ff57c0	Add support for async checkpointing (#13658 )	2022-07-26 21:13:19 +05:30
Adrian Wälchli	a8d7b4476c	Fix PyTorch spelling errors (#13774 ) * Fix PyTorch spelling errors * more	2022-07-25 12:51:16 -04:00
Justus Schock	227871982d	Merge different gpu backends with accelerator='gpu' (#13642 ) * Rename GPUAccelerator to CUDAAccelerator * Add back GPUAccelerator and deprecate it * Remove temporary registration * accelerator connector reroute * accelerator_connector tests * update enums * lite support + tests * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * typo * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move "gpu" support up before actual accelerator flag checks * Stupid arguments * fix tests * change exception type * fix registry test * pre-commit * CI: debug HPU flow (#13419) * Update the hpu-tests.yml to pull docker from vault * fire & sudo * habana-gaudi-hpus * Check the driver status on gaudi server (#13718) Co-authored-by: arao <arao@habana.ai> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Akarsha Rao <94624926+raoakarsha@users.noreply.github.com> * Update typing-extensions requirement from <4.2.1,>=4.0.0 to >=4.0.0,<4.3.1 in /requirements (#13529) Update typing-extensions requirement in /requirements Updates the requirements on [typing-extensions](https://github.com/python/typing_extensions) to permit the latest version. - [Release notes](https://github.com/python/typing_extensions/releases) - [Changelog](https://github.com/python/typing_extensions/blob/main/CHANGELOG.md) - [Commits](https://github.com/python/typing_extensions/compare/4.0.0...4.3.0) --- updated-dependencies: - dependency-name: typing-extensions dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * [pre-commit.ci] pre-commit suggestions (#13540) updates: - [github.com/psf/black: 22.3.0 → 22.6.0](https://github.com/psf/black/compare/22.3.0...22.6.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [FIX] Native FSDP precision + tests (#12985) * Simplify fetching's loader types (#13111) * Include app templates to the lightning and app packages (#13731) * Include app templates to the package Co-authored-by: mansy <mansy@lightning.ai> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Fix mypy typing errors in pytorch_lightning/callbacks/model_checkpoint.py (#13617) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Fix typos initialize in docs (#13557) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Fix main progress bar counter when `val_check_interval=int` and `check_val_every_n_epoch=None` (#12832) * Fix mypy errors attributed to `pytorch_lightning.loggers.tensorboard.py` (#13688) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Fix mypy errors attributed to `pytorch_lightning.loggers.mlflow` (#13691) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: otaj <6065855+otaj@users.noreply.github.com> * fix mypy errors for loggers/wandb.py (#13483) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Akihiro Nitta <nitta@akihironitta.com> * Fix gatekeeper minimum check (#13769) * changelog * changelog * fix order * move up again * add missing test Co-authored-by: rohitgr7 <rohitgr1998@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> 
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: arao <arao@habana.ai> Co-authored-by: Akarsha Rao <94624926+raoakarsha@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sean Naren <sean@grid.ai> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Mansy <ahmed.mansy156@gmail.com> Co-authored-by: mansy <mansy@lightning.ai> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Lee Jungwon <33821003+BongYang@users.noreply.github.com> Co-authored-by: Nathaniel D'Amours <88633026+NathanielDamours@users.noreply.github.com> Co-authored-by: Justin Goheen <26209687+JustinGoheen@users.noreply.github.com> Co-authored-by: otaj <6065855+otaj@users.noreply.github.com> Co-authored-by: Gautier Dagan <s2234411@ed.ac.uk> Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>	2022-07-25 14:46:45 +00:00
Mauricio Villegas	1b31039c58	Update LightningCLI test for new support in latest release of jsonargparse (#13805 )	2022-07-25 09:25:42 +00:00
Adrian Wälchli	81f149e9d4	Rename spawn-based launchers (#13743 )	2022-07-23 11:48:15 -04:00
Adrian Wälchli	fa886f2a58	Lazy import check for neptune dependency (#13477 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>	2022-07-23 14:06:26 +00:00
Adrian Wälchli	d24978baa3	Add ddp_notebook alias for ddp_fork (#13744 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-07-23 09:06:35 -04:00
Jinyoung Lim	ae9803137a	Add logging messages to notify when `FitLoop` stopping conditions are met (#9749 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-07-23 12:07:47 +00:00
Carlos Mocholí	4f53e7132f	Promote the CLI out of utilities (#13767 )	2022-07-23 12:07:29 +00:00
Adrian Wälchli	f6f06d4e42	Set default strategy to ddp_fork in interactive environments (#13746 )	2022-07-22 19:34:30 +00:00
Carlos Mocholí	9f51c07604	Support setting the trainer reference recursively for ensembles (#13638 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>	2022-07-22 19:58:46 +02:00

312 Commits