lightning

Commit Graph

Author	SHA1	Message	Date
William Falcon	65b6a6a497	0.10.0 (#3965 )	2020-10-07 20:41:56 -04:00
William Falcon	6044cf9003	Fixes #3945 (#3947 )	2020-10-07 13:46:27 -04:00
edenlightning	27f536b2ce	[CI SKIP] Fix early stop docs (#3940 ) * Update early_stopping.rst * Update __init__.py * Update new-project.rst * Update early_stopping.rst * Update __init__.py * Update early_stopping.rst * Update __init__.py Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-10-07 13:01:50 -04:00
William Falcon	b922409624	clean and organize fit (#3938 ) * clean and organize fit * clean and organize fit * clean and organize fit * clean and organize fit * clean and organize fit	2020-10-07 11:04:10 -04:00
William Falcon	575e01be82	tests for multiple optimizers and dataloader combinations (#3937 ) * added tests for multiple optimizers and dataloaders * added tests for multiple optimizers and dataloaders * added tests for multiple optimizers and dataloaders	2020-10-07 10:13:57 -04:00
ananthsub	d3f40d6a9e	Update to_disk to use fsspec for remote file support (#3930 ) * Update supporters.py * Update CHANGELOG.md * Update supporters.py * Update supporters.py * Update supporters.py * Update supporters.py * Update supporters.py * Update supporters.py * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-10-07 07:28:23 -04:00
edenlightning	335bb75356	update docs on logging (#3916 ) * Update loggers.rst * Update loggers.rst * Update index.rst * Create logging.rst * Delete experiment_reporting.rst * Delete experiment_logging.rst * Update __init__.py	2020-10-06 18:53:39 -04:00
Jirka Borovec	064ae53d63	nb steps in early stop (#3909 ) * nb steps * if * skip * rev * seed * seed	2020-10-06 15:20:08 -04:00
Lezwon Castelino	69833dad5b	Added check to verify xla device is TPU (#3274 ) * tpu device check * replaced with xmp spawn * Revert "replaced with xmp spawn" This reverts commit 6835380f * replaced all instances of XLA_AVAILABLE * moved inner_f to global scope * made refactors * added changelog * added TPU_AVAILABLE variable * fix codefactor issues * removed form trainer and early stopping * add TORCHXLA_AVAILABLE check * added tests * refactoring * Update pytorch_lightning/utilities/xla_device_utils.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * updated function names * fixed bug * updated CHANGELOG.md * added todo * added type hints * isort and black Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-10-06 19:54:37 +02:00
William Falcon	2cf17a3718	Adds tests to make sure logging doesn't happen multiple times (#3899 ) * Makes sure logging doesn't ever happen from non-root zero * Makes sure logging doesn't ever happen from non-root zero * Makes sure logging doesn't ever happen from non-root zero * added bug report model * fix local model * fix local model * fix local model * fix local model	2020-10-06 12:43:51 -04:00
Teddy Koker	9600926619	Rename log_save_interval, row_log_interval (#3748 ) * Rename row_log_interval -> log_every_n_steps log_save_interval -> flush_logs_every_n_steps * Changelog * fixed title underline length * typo * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * pep8 + deprecation test * 'todo: remove in 1.1 comment' * 1.1 -> 0.11 * log * docs * depr API * add depr tests * note * miss Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>	2020-10-06 10:27:06 -04:00
edenlightning	2119184801	Fix docs for auto_lr_find (#3883 ) * Fix docs for auto_lr_find * change testcode to codeblock we are not showing a complete example here	2020-10-05 22:28:38 -04:00
William Falcon	b34c7add23	Fixes #3668 , #3887 as a bonus (#3888 ) * Fixes #3668, #3887 as a bonus * Fixes #3668, #3887 as a bonus	2020-10-05 21:30:41 -04:00
Nrupatunga	7d47ed178b	[Bug-Fix]:properties `current_epoch` and `global_step` between model and trainer same always (#3785 ) * make current_epoch and global_step to be same as trainer, after model restore. * remove assignment here * test * minor modification * Update pytorch_lightning/core/lightning.py type check, better clarity Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * Update pytorch_lightning/core/lightning.py type check, better clarity Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * comments for current_epoch and global_step properties * Update tests/models/test_restore.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * update comments according to the changes made * Update tests/models/test_restore.py * add current_epoch, global_step to jit ignore list * Add comments to CHANGELOG * Update CHANGELOG.md * Update tests/models/test_restore.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-10-05 11:10:40 -04:00
William Falcon	b014223f72	Fixes #2678 - enables training_step to return None (#3862 ) * Fixes #2678 - enables training_step to return None * Fixes #2678 - enables training_step to return None	2020-10-05 07:33:46 -04:00
William Falcon	d787208e76	Fixes #2792 (#3857 )	2020-10-04 23:25:02 -04:00
William Falcon	f58c760409	Fixes #2551 (#3858 )	2020-10-04 23:02:35 -04:00
William Falcon	d9656d166c	fixed model checkpoint frequency (#3852 ) * fixed model checkpoint frequency * fixed model checkpoint frequency * fixed model checkpoint frequency * fixed model checkpoint frequency * merged	2020-10-04 21:49:20 -04:00
William Falcon	c6df63a588	Fixes #2479 (#3856 )	2020-10-04 21:30:33 -04:00
William Falcon	00f0d19a61	fixes #3798 (#3849 ) * fix #3798 * added tbptt test for logging	2020-10-04 19:36:51 -04:00
Harshal Mittal	6723b924f8	docs/fix_typo (#3847 )	2020-10-04 17:10:49 -04:00
William Falcon	70e792344a	test selecting the correct backend. temp backends while slurm and TE are decoupled (#3848 ) * test selecting the correct backend. tem backends while slurm and TE are decoupled * test selecting the correct backend. tem backends while slurm and TE are decoupled	2020-10-04 15:44:50 -04:00
William Falcon	1aa9d39506	Eval epoch can now log independently (#3843 ) * ref: routed epoch outputs to logger * ref: routed epoch outputs to logger * ref: routed epoch outputs to logger * ref: routed epoch outputs to logger	2020-10-04 13:36:35 -04:00
Adrian Wälchli	1906867fd4	deprecation warning (#3844 )	2020-10-04 13:17:09 -04:00
William Falcon	2c21f7d7e2	ref: adding compute environments (2/n) (#3842 ) * ref: adding compute environments (2/n) * ref: adding compute environments (2/n) * ref: adding compute environments (2/n) * ref: adding compute environments (2/n)	2020-10-04 08:48:46 -04:00
William Falcon	1f8ff7c48c	ref: callback system and init ddp (1/n) (#3836 ) * refactored callback system and init ddp * refactored callback system and init ddp * refactored callback system and init ddp * refactored callback system and init ddp	2020-10-03 23:39:17 -04:00
ananthsub	b8a6408a11	Update trainer.py (#3834 )	2020-10-03 22:18:05 -04:00
William Falcon	66aef10239	verified epoch logging (#3830 ) * ref: fix epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging * verified epoch logging	2020-10-03 21:17:24 -04:00
William Falcon	d9bc95f83e	ref: bug fix with logging val epoch end + monitor (#3812 ) * ref: fix metric err * ref: fix metric err * ref: fix metric err * ref: merge * ref: merge * ref: merge * ref: merge * ref: decoupled ddp2 * ref: decoupled ddp2 * ref: decoupled ddp2 * ref: decoupled ddp2 * ref: decoupled ddp2 * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix * ref: clean up ddp before final fix	2020-10-03 12:33:29 -04:00
Brendan Fahy	b14c4d4c70	handle fsspec inconsistency in PyArrowHDFS (#3805 )	2020-10-02 22:35:42 -04:00
Jeff Yang	9942f3ebdf	Fix `on_train_batch_start` hook to end epoch early (#3700 ) * init * add test * changelog and docs * fix test * Apply suggestion from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-10-02 21:46:46 +02:00
Jirka Borovec	62eabdd535	revert backend types (#3788 ) * revert backend types * todo * todo	2020-10-02 06:18:44 -04:00
Akihiro Nitta	ebc1b23fa3	Use `raise .. from ..` to explicitly chain exceptions (#3750 ) * Fix exception chaining * names * Change exception names for consistency Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> * Change exception names for consistency Co-authored-by: Nicki Skafte <skaftenicki@gmail.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai> Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>	2020-10-01 21:45:44 +02:00
William Falcon	ac2b0f0f06	ref: continue #3733 (#3767 ) * ref: #3733 part 2 * ref: #3733 part 2	2020-10-01 09:25:33 -04:00
William Falcon	440f837f6d	ref: part a of #3733 (#3766 ) * ref: part a of #3733 * ref: part a of #3733	2020-10-01 08:15:23 -04:00
William Falcon	7c61fc7c27	ref: fixes logging for eval steps (#3763 ) * fixes logging for eval steps	2020-10-01 02:31:11 -04:00
William Falcon	a38d108a68	add dist lib to enable syncing anything across devices (#3762 ) * add dist lib to enable syncing anything across devices	2020-10-01 01:21:38 -04:00
Jirka Borovec	faa357648f	return simple docs to methods (#3645 ) * return simple docs to methods * sorting * imports * miss	2020-09-30 08:34:19 -04:00
Jirka Borovec	31a36f04df	define distributed as a type (#3740 ) * define type * miss * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * miss * warn Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-09-30 08:33:01 -04:00
Adrian Wälchli	9405c880af	log/save_interval based on global step (#3667 ) * log interval based on global step * test * test * test * test * pep * pep * added changelog * pep * merge * remove unused arg	2020-09-30 12:26:27 +02:00
William Falcon	b3be8022bd	tests for val step flow and logging (#3731 ) * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end * ref: test log dict * ref: test log dict * ref: test log dict * ref: test log dict	2020-09-29 22:12:56 -04:00
ananthsub	3dcf7130c5	Support checkpoint hooks on data module (#3563 ) * Split out changes from #3563 to make that PR easier to review. This formats the file according to the Black formatter * Store a reference to the trainer on the datamodule Fixes #3682 * Update data_connector.py * Update data_connector.py * Update test_datamodules.py * Split out changes from #3563 to make that PR easier to review. This formats the file according to the Black formatter * support checkpoint hooks for datamodule refactor on_{save/load}_checkpoint to a separate hook class that both the lightning module and data module inherit add spots in callback connector to call new datamodule hooks if available * hooks formatting * Update hooks.py * Update checkpoint_connector.py * Update lightning.py * update based on upstream/master checkout upstream/master * Update checkpoint_connector.py * add tests * undo format revert * Updated CHANGELOG.md * add checkpoint hooks * add Dict type * import CheckpointHooks	2020-09-29 19:51:44 +02:00
William Falcon	c14928a72a	ref: test val flow steps (#3723 ) * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end	2020-09-29 11:42:38 -04:00
William Falcon	f42ea303c9	ref: enable self.log for eval loop metrics (#3715 ) * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end * ref: test val epoch end	2020-09-29 02:00:28 -04:00
Rohit Gupta	783750547d	disable optimizers setup during testing (#3059 ) * disable configure_optimizers during testing * minor changes * hvd and ddp * fix precision during testing * fix ddp * fix amp * fix cpu * update dp * simplify optimizers * add test * codefactor * ref optimizer setup * chlog * suggestions * isort * rebased with master	2020-09-29 01:09:04 +02:00
William Falcon	cdd7266cd8	ref: enable self.log from val step (#3701 ) * .log in eval * ref * ref: enable self.log in val step	2020-09-28 10:49:07 -04:00
William Falcon	2ecaa2a8be	ref: (2/n) fix no log in epoch end (#3699 )	2020-09-28 08:25:44 -04:00
ananthsub	859ec92da5	Make Trainer.__test_using_best_weights use cloud_io's load to support more storage backends (#3694 ) * Split out changes from #3563 to make that PR easier to review. This formats the file according to the Black formatter * Store a reference to the trainer on the datamodule Fixes #3682 * Update data_connector.py * Update data_connector.py * Update test_datamodules.py * Support more storage backends in trainer.test using best weights Similar to #3692 * Update trainer.py * Update trainer.py use cloud_io load directly	2020-09-28 07:53:57 -04:00
William Falcon	ddd11075bd	[WIP] ref: deprecated results obj, added support for simpler comms (1/n) (#3681 ) * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * ref: deprecated results obj, added support for simpler comms. Decouples logging from loops * fix global step err * fix global step err * fix global step err * fix global step err * fix global step err * fix typing err * fix str * fix typing err	2020-09-27 23:19:46 -04:00
William Falcon	931995b55b	remove flake 8 (#3687 )	2020-09-27 20:40:02 -04:00

744 Commits