lightning

Commit Graph

Author	SHA1	Message	Date
Kaushik B	762af9505b	Add missing test for testing custom registered training plugin (#10225 )	2021-10-29 04:06:06 +00:00
Adrian Wälchli	6ed7a0c172	Fix sigterm signal handling (#10189 ) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-10-29 00:01:39 +00:00
thomas chaton	255e3edc98	resolve failing test (#10191 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-10-28 15:27:03 +00:00
Carlos Mocholí	03f01fb5ec	Fix gradient norm tracking and gradient clipping (#9287 ) * WIP * Progress * Undo test change * Fix plugin closure execution order * Update CHANGELOG * Fix manual optimization on AMP and skipping backward * Fix for deepspeed * Typo * Hook test for manual closure * Add skipping test with AMP * You are hideous, apex * Add deepspeed test * Update CHANGELOG * Fix for broken master * Add RunIf * FIXMEs * Rename * Fix grad norm * add a simple test * update test * update test * update test * fix merge conflicts * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Sea of changes * Undo change * Introduce TPUPrecisionPlugin * Undo changes * Undo changes * Resolve FIXME * Undo change * Undo change * Undo change * Fix FIXMEs * Fix FIXME * Correct value * Bad merge * Fix circular imports * WIP * Fixing clipping * Fixes * Bad merge * Move optimizer step and clipping into the `PrecisionPlugin` * Fix AMP * Update CHANGELOG * Fix tests * Underscore * Progress * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove pre_optimizer_step * Missed one * Progress * Progress * Fix test * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update FIXMEs * Fix test * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix test * DeepSpeed warning. mypy * Rename * Finish tests * Update CHANGELOG * Dumb fixes * accelerator=auto * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update on comments * Use ClassifModule Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-28 15:23:27 +00:00
Carlos Mocholí	5262b63dff	Pass the scaler as an input to `NativeMixedPrecisionPlugin` (#10055 ) Co-authored-by: thomas chaton <thomas@grid.ai>	2021-10-28 14:13:53 +00:00
Low Weng Fei	83d74bb385	Fix `reset_seed()` converting the `PL_SEED_WORKERS` environment variable `str` read to `bool` (#10099 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: tchaton <thomas@grid.ai>	2021-10-28 12:57:41 +00:00
Rohit Gupta	9af1dd7443	Deprecate `lr_sch_names` from `LearningRateMonitor` (#10066 )	2021-10-28 12:57:04 +00:00
Adam J. Stewart	b8ac17624d	Docs: fix mistakes in New Project docs (#10137 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-28 12:31:02 +00:00
Rohit Gupta	85eb17cde5	initialize poptorch_models based on trainer_fn (#10149 )	2021-10-28 11:59:52 +00:00
Kaushik B	d1985ebf96	Add Plugins Registry to docs (#10181 )	2021-10-28 16:44:08 +05:30
Adrian Wälchli	63015b5c87	Let `DDPSpawnPlugin.spawn` return a result from rank 0 (#10162 ) Co-authored-by: Kaushik B <kaushikbokka@gmail.com>	2021-10-28 11:39:13 +02:00
Adrian Wälchli	07b1b56d5c	Fix setting device when creating "inf" monitor value in `ModelCheckpoint` (#10118 ) Co-authored-by: thomas chaton <thomas@grid.ai>	2021-10-28 09:10:55 +00:00
Adrian Wälchli	afd1ae124e	Update deepspeed precision plugin for Lite (#10164 )	2021-10-28 08:33:56 +00:00
Carlos Mocholí	3a4e9970d6	Pin fairscale version (#10200 )	2021-10-27 23:24:17 +00:00
Carlos Mocholí	dbe1662dc3	Replace `_TORCH_GREATER_EQUAL_DEV_1_10` with `_TORCH_GREATER_EQUAL_1_10` (#10157 )	2021-10-27 13:38:39 +01:00
Adrian Wälchli	808edcdebf	update type (#10163 )	2021-10-27 11:16:09 +00:00
Kaushik B	c33df2639f	Set `dataset` attribute to `MpDeviceLoader` used in TPU Spawn (#10151 )	2021-10-27 01:23:01 +05:30
Adrian Wälchli	5ade197580	Update README page in pl_examples folder (#10114 ) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-26 17:38:56 +00:00
Adrian Wälchli	4a4a27db05	Update docutils package version in requirements.txt (#10158 )	2021-10-26 16:32:47 +00:00
Carlos Mocholí	48b6292cf0	Move optimizer step and clipping into the `PrecisionPlugin` (#10143 )	2021-10-26 17:26:26 +02:00
Rohit Gupta	93266e2c22	Avoid deprecated warnings from accelerator and checkpoint connector #10142	2021-10-26 14:10:30 +02:00
Carlos Mocholí	a0e45dc071	Some minor CI cleanup (#10088 )	2021-10-26 13:58:20 +02:00
Danielle Pintz	38090e47d7	Small code simplification in `training_epoch_loop.py` (#10146 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-10-26 13:22:36 +02:00
twsl	971281d27d	Make sure file and folder exists in Profiler (#10073 ) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-26 11:13:31 +00:00
Charlie_Tang	84ce1d095c	add 'sanity_checking' to datamodule 'on_after_batch_transfer' docs (#10067 ) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-26 11:12:57 +00:00
Danielle Pintz	a5235d5b01	Remove `model_connector.py` (#10111 )	2021-10-26 11:52:14 +02:00
Adrian Wälchli	871a96701a	Rename `master_params` to `main_params` (#10105 ) Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-26 11:17:32 +02:00
Rohit Gupta	34d5980df6	Raise `MisconfigurationException` if `trainer.eval` is missing required methods (#10016 )	2021-10-25 23:12:08 -07:00
Danielle Pintz	13d6d7bad1	Remove `optimizer_connector.py` (#10120 )	2021-10-26 00:52:43 +00:00
Adrian Wälchli	21a5867dad	Rename `ClusterEnvironment.creates_processes` (#10106 ) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-25 23:15:41 +00:00
Adrian Wälchli	f1623355bd	Add example table to loop docs (#10058 ) Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-25 22:42:15 +00:00
Rajat Goel	47e7a2860f	Fix Enums parsing in generated hparms yaml (#9170 ) Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-25 21:23:20 +00:00
Jirka Borovec	0e0247a4d4	docker Conda timeout (#10087 )	2021-10-25 20:56:47 +00:00
Eric Wiener	0e20119d24	Change default value of the `max_steps` Trainer argument from `None` to `-1` (#9460 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2021-10-25 20:21:33 +00:00
Rohit Gupta	d9dfb2e920	fix tests (#10138 )	2021-10-25 19:37:47 +00:00
Danielle Pintz	1f7bd6650c	Mark accelerator connector as protected (#10032 )	2021-10-25 19:24:54 +00:00
jjenniferdai	6d79184ec5	Unify checkpoint load paths [redo #9693 ] (#10061 )	2021-10-25 19:05:31 +00:00
Adrian Wälchli	76081fb846	Mark SLURM detection methods in `AcceleratorConnector` as protected (#10101 ) Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>	2021-10-25 17:52:15 +00:00
Carlos Mocholí	2ee3127661	Use `torch.autocast` (#10053 )	2021-10-25 17:33:52 +00:00
Carlos Mocholí	43c70ece17	Fix `optimizers` overloads typing annotation (#10069 )	2021-10-25 16:51:46 +00:00
Carlos Mocholí	b376799430	Minor fixes related to clipping (#10130 ) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2021-10-25 16:40:22 +00:00
Adrian Wälchli	d3e5a43546	Restrict setup methods to accept a single model (#10064 )	2021-10-25 16:32:57 +00:00
manipopopo	cfb2d87765	Disable quantization aware training observers (#8540 ) Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2021-10-25 15:46:09 +00:00
Adrian Wälchli	f8a7f3fde0	Add Yield loop example (#9983 ) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tchaton <thomas@grid.ai>	2021-10-25 14:26:36 +00:00
Adrian Wälchli	aff80477b7	Remove dead code in accelerator connector (#10100 ) * remove dead code in accelerator connector * remove slurm "fake_slurm_managing_tasks" dead code	2021-10-25 13:37:40 +00:00
Adrian Wälchli	7eb2edf421	rename set_random_master_port (#10104 ) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-25 12:09:05 +00:00
Kaushik B	64fc0d4257	Add method to TPUSpawn plugin to override how models are setup (#10039 )	2021-10-25 11:44:32 +00:00
Danielle Pintz	e94dcf6936	Mark `trainer.data_connector` as protected (#10031 ) Co-authored-by: tchaton <thomas@grid.ai>	2021-10-25 12:29:09 +01:00
Carlos Mocholí	f95ba20012	Do not use the base version by default in `_compare_version` (#10051 )	2021-10-25 16:41:32 +05:30
Adrian Wälchli	225989363b	update links in callback examples pointing to bolts (#10117 )	2021-10-25 10:27:14 +00:00

5916 Commits, All Branches
