lightning

Commit Graph

Author	SHA1	Message	Date
Rohit Gupta	d9dfb2e920	fix tests (#10138)	2021-10-25 19:37:47 +00:00
Danielle Pintz	1f7bd6650c	Mark accelerator connector as protected (#10032)	2021-10-25 19:24:54 +00:00
jjenniferdai	6d79184ec5	Unify checkpoint load paths [redo #9693] (#10061)	2021-10-25 19:05:31 +00:00
Adrian Wälchli	76081fb846	Mark SLURM detection methods in `AcceleratorConnector` as protected (#10101) Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>	2021-10-25 17:52:15 +00:00
Carlos Mocholí	2ee3127661	Use `torch.autocast` (#10053)	2021-10-25 17:33:52 +00:00
Carlos Mocholí	b376799430	Minor fixes related to clipping (#10130) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-25 16:40:22 +00:00
manipopopo	cfb2d87765	Disable quantization aware training observers (#8540) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2021-10-25 15:46:09 +00:00
Adrian Wälchli	7eb2edf421	rename set_random_master_port (#10104) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-25 12:09:05 +00:00
Danielle Pintz	e94dcf6936	Mark `trainer.data_connector` as protected (#10031) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-25 12:29:09 +01:00
Carlos Mocholí	f95ba20012	Do not use the base version by default in `_compare_version` (#10051)	2021-10-25 16:41:32 +05:30
thomas chaton	ed9802643c	[CI] Comment flaky tests (#10084)	2021-10-25 10:31:06 +02:00
Kaushik B	c3614f1c07	Fix: skip importing DistributedOptimizer for Windows (#10071)	2021-10-21 21:01:56 +00:00
thomas chaton	454e93bace	Add support for init_meta_context, materialize_module (#9920)	2021-10-21 15:48:31 +01:00
jjenniferdai	2d9db211b5	Revert "Support serialized checkpoint loading (#9605)" (#10057) This reverts commit `f0e6f1b58a`.	2021-10-21 02:51:22 +02:00
Kaushik B	aa1540410f	Add XLACheckpointIO (#9972) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2021-10-21 02:39:16 +05:30
Rohit Gupta	1599c77d16	Fix `LearningRateMonitor` logging with multiple param groups optimizer with no scheduler (#10044)	2021-10-20 22:13:00 +05:30
Carlos Mocholí	6aeebf1bd3	Remove unnecessary dependency available checks (#10050)	2021-10-20 16:21:37 +00:00
Alessio Bonfiglio	2a2fa5a56a	Group all the logged gradients under the same sub-folder (#7756)	2021-10-20 15:48:36 +00:00
Kaushik B	56bc55db71	Update strategy flag in docs (#10000) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-10-20 21:02:53 +05:30
kingyiusuen	2ed92ecabb	Rerun flaky profiler tests on failure (#10035)	2021-10-20 18:57:04 +05:30
Carlos Mocholí	f0b3e0f4de	Default to `precision=bf16` on CPU when `precision=16` is passed (#10033)	2021-10-20 13:25:13 +00:00
Adrian Wälchli	2c16f1d6b9	remove dataloader patching on the LightningModule (#9764) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: rohitgr7 <rohitgr1998@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-10-20 15:23:20 +02:00
jjenniferdai	f0e6f1b58a	Support serialized checkpoint loading (#9605) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-10-20 09:38:35 +01:00
Carlos Mocholí	53c62f63e8	Constrain IPU precision choices (#10030)	2021-10-20 00:52:01 +00:00
Carlos Mocholí	ad8d6c83da	[CLI] Shorthand notation to instantiate datamodules (#10011) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-10-20 00:49:48 +00:00
Carlos Mocholí	e44921ee21	Fix `self.log(on_epoch=True, reduce_fx=sum)` on_batch_start (#9791)	2021-10-20 01:56:37 +02:00
Carlos Mocholí	d45897d522	Rename `TPUHalfPrecisionPlugin` to `TPUBf16PrecisionPlugin` (#10026) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-10-19 21:09:37 +00:00
Ning	0b68f2abf8	Remove `reset_train_val_dataloaders` from Trainer and move data reloading logic to loop (#9671) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>	2021-10-19 21:45:52 +02:00
Carlos Mocholí	e8beceb631	Add `TPUPrecisionPlugin` (#10020) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-10-19 17:48:57 +00:00
thomas chaton	1759403c8d	Add check for callable with datamodule len (#10003) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-19 14:51:08 +00:00
Rohit Gupta	0aa220b46b	Remove deprecated `distributed_backend` from `Trainer` (#10017) * rm distributed_backend from Trainer * unused * chlog * internal distributed_backend * Docstring Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2021-10-19 13:54:37 +00:00
Danielle Pintz	203737bfce	Don't raise DeprecationWarning for `LoggerConnector.gpus_metrics` (#9959)	2021-10-18 22:51:09 +00:00
Adrian Wälchli	a99b7440b5	Add unit tests for `pl.utilities.grads` (#9765) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-10-18 18:58:51 +05:30
Rohit Gupta	4dc32ad7db	Fix logic to check for spawn in worker_check (#9902) * fix * update tests * chlog * skip windows	2021-10-18 13:02:46 +00:00
Carlos Mocholí	3f355d0eb7	Remove manual tracking of optimizer steps (#9957)	2021-10-18 12:43:06 +00:00
Carlos Mocholí	0684e5295f	Remove deprecated `DataModule.dims` usage in tests (#9948)	2021-10-18 17:35:41 +05:30
Carlos Mocholí	c69a79c86f	Fix `self.log(on_epoch=True)` on_batch_start (#9780)	2021-10-18 14:02:16 +02:00
Elad Segal	8c76cf5ae1	reset val dataloader for binsearch (#9975)	2021-10-18 12:54:26 +02:00
Carlos Mocholí	01b304ec57	Update accelerator connector messages after the addition of strategy (#9937)	2021-10-18 01:10:48 +00:00
Carlos Mocholí	788f6864d9	Fix `LightningOptimizer` step and toggling logic (#9958)	2021-10-18 00:23:51 +00:00
ronif	7b4df7bf91	Fix issue with no-init dataclass fields in move_to_device (#9963) Co-authored-by: ronif <ronif@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-10-17 07:10:47 +00:00
Carlos Mocholí	e5dfdf34f9	Avoid deprecation warning after #9901 (#9951)	2021-10-16 17:36:25 +01:00
Kaushik B	5e8829b97d	(1/n) tests: Use strategy flag instead of accelerator for training strategies (#9931) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-10-16 20:40:25 +05:30
Carlos Mocholí	e973bcb76a	Use non-deprecated options in tests (#9949)	2021-10-15 16:58:07 -07:00
Carlos Mocholí	db4e770004	Validate the precision input earlier (#9763)	2021-10-15 17:30:00 +00:00
kingyiusuen	6429de8944	Add support for `len(datamodule)` (#9895) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-15 14:19:50 +02:00
Danielle Pintz	16213b1635	Deprecate `log_gpu_memory`, `gpu_metrics`, and util funcs in favor of `DeviceStatsMonitor` callback (#9921) Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-14 22:45:44 +02:00
Oliver Borchert	afbf703684	Single-process multi-node CPU training (#9603) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: thomas chaton <thomas@grid.ai>	2021-10-14 22:21:41 +02:00
Kaushik B	af4a8f1950	Refactor tests for TPU Accelerator (#9718) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-14 19:45:15 +00:00
Danielle Pintz	6feda08109	Deprecate `GPUStatsMonitor` and `XLAStatsMonitor` in favor of `DeviceStatsMonitor` (#9924) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-14 15:52:45 +00:00

2065 Commits