lightning

Commit Graph

Author	SHA1	Message	Date
thomas chaton	1f025789fc	[bugfix] Clean Validation Sanity Checking metrics (#8171 ) * resolve logging issue * update changelog * remove breakpoint * resolve bugs * remove pass	2021-06-28 13:49:56 -04:00
Adrian Wälchli	971908a1aa	Loop Refactor 1/N - Training Loop (#7871 ) Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tchaton <thomas@grid.ai> Co-authored-by: Justus Schock <justus.schock@rwth-aachen.de> Co-authored-by: Justus Schock <justus.schock@posteo.de> Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>	2021-06-15 12:55:06 +00:00
Carlos Mocholí	8c0ea92af2	`TrainerState` refactor [5/5] (#7173 ) * `TrainerState` refactor * flake8 * Update finished check * Test cleanup * Fix tests * Fixes * Reorder * flake8 * Update CHANGELOG * Better docs * Better docs * Remove default * Update tests * Bad merge	2021-05-04 12:50:56 +02:00
Adrian Wälchli	b780af51be	update test for resume_from_checkpoint on missing file (#7255 )	2021-05-04 09:16:34 +00:00
Vaibhav Balloli	ccd87cadfc	Changes resume_from_checkpoint warning to error (#7075 ) Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-04-28 15:03:29 +02:00
Jirka Borovec	aa7d3dc6cc	Fix `torchmetrics` compatibility (#7131 ) * get_num_classes * tmp * fix one test * fix deprecated tests * fix deprecate * pep8 * deprecate 0.3 * wip * wip * HaCK * brnch * brnch * format * Apply suggestions from code review * prune * rev * mltilabel * Apply suggestions from code review * master * rev * . Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>	2021-04-22 20:45:46 +00:00
Elia Cereda	d0596fac94	Refactor RunningStage usage in advance of implementing Trainer.validate() (#4945 ) * Update code Co-authored-by: EliaCereda * More property updates * Move properties. Introduce trainer._fitting * Use trainer.fitting * Fix reset dataloaders * Unused code * RunningStage.SANITY_CHECKING * Use setters * Fix bugs * Fix bugs * TrainerState.{FITTING,VALIDATING,TESTING,PREDICTING,TUNING} * Fix bugs * Fix bugs * Fix tests * Update CHANGELOG. Add deprecation warning. Fix tests * Unused imports * Optional trainer * More deprecation. More refactoring * Correct version * Use properties * Address comments * flake8 * Missed renamings * Typo * is -> == It is recommended to use for Enums since they are singletons, however, since the LightningEnum subclasses str, it's not a good idea in case a user sets the state/stage with a str * Also for tests * Typo * Address @tchaton's comments * PEP8 * Correct property * Update CHANGELOG * Apply suggestions from code review Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Remove called sanity check Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-03-06 12:40:19 +00:00
Jirka Borovec	b9cf1223b9	missing tests default_root_dir=tmpdir (#6314 ) * default_root_dir=tmpdir * miss	2021-03-04 19:23:12 +00:00
Jirka Borovec	0f9134e043	Refactor: skipif for Windows 2/n (#6268 ) * win * isort * flake8	2021-03-02 09:36:01 +00:00
Jirka Borovec	eb815000f6	Refactor: skipif for multi - gpus 1/n (#6266 ) * ngpus * gpu * isort * pt * flake8	2021-03-02 09:03:32 +01:00
Jirka Borovec	1c851b89e1	fixing miss-leading tested acc values (#5876 ) * fixing tested values * . * tests * yapf * softmax * hvd * rename * lr * duplicate * drop * classif * rm EvalModel * Revert "rm EvalModel" This reverts commit `6c3fb39ebe`. * update tests * fix * azure * azure * self * cpu * Apply suggestions from code review Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>	2021-02-23 22:08:46 +00:00
Adrian Wälchli	0456b4598f	mini refactor for _running_stage access (#5724 ) * running stage * circular import * running stage cleanup * fix unused import * fix running stage access * add return type * Revert "add return type" This reverts commit `65b0fe269c`. * try fix typing	2021-02-22 12:01:54 +01:00
Adrian Wälchli	02ac4b0b6a	Replace .get_model() with explicit .lightning_module (#6035 ) * rename get_model -> lightning_module * update references to get_model * pep8 * add proper deprecation * remove outdated _get_reference_model * fix cyclic import	2021-02-18 15:59:54 +01:00
Adrian Wälchli	4bdf2fe55f	remove executable bit on source files (#5929 ) * 644	2021-02-12 00:06:40 +01:00
Kaushik B	4857546c25	Fix: Failing test in data_modules(dp) (#5924 ) * Update test_datamodules.py * fix code format issue * fix test restore * fix code format issue	2021-02-11 17:32:46 +00:00
Rohit Gupta	8e9a026bc3	[tests/models] refactor with BoringModel (#5507 ) * update with BoringModel * update with BoringModel * step * try TPU * TPU * update tests * update tpu tests * self * fix * dp * update tests * ref * update tests * fix tpu tests * fix dp and run_prediction * dp * only dp * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2021-02-11 14:32:07 +00:00
Jirka Borovec	a0f7831278	fix miss-leading imports in tests (#5873 ) * fix imorts * .	2021-02-09 05:10:52 -05:00
Jirka Borovec	bd920b4102	Refactor simplify tests (#5861 ) * add new * restructure * yapf * move * fix	2021-02-08 11:52:02 +01:00
Jirka Borovec	4faaef7758	formatting tests: 4/n (#5846 ) * models * ckpt * core * log	2021-02-06 12:07:26 +01:00
Adrian Wälchli	9555043a29	Force ModelCheckpoint callback to run last (#5731 )	2021-02-03 16:40:57 -05:00
Adrian Wälchli	692f77b8a7	Refactor LightningDataParallel (#5670 ) * module * fix model access * scalar conversion * refactor * kwargs * auto unsqueeze * refactor code duplication * clean up * docs * update dp docs * changelog * generalize test * test * rename * warning cache * isort * unsqueezing test * device * device * scalar test * device * device * include coverage of overrides * clear * add deprecation test * docs * improve coverage * increase coverage * fix merge * extend test * rename base class * mention the predict method in docs * combine iteration over collection * remove override * move * line * Apply suggestions from code review * fix running stage * f401 * fix cyclic import Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2021-01-31 06:08:16 -05:00
chaton	3da28fd634	[feat] 1/2 Add trainer.predict (#5579 ) * start adding predict * add predict * resolve test * add predict * remove limit_predict * update * add test for predict * typo * update on comments * remove predict_step * update ddp_shareded * check ddp_sharded * resolve on comments * resolve isort * update dp * add test dp 1 gpu * made default forward * resolve path * resolve bug * update on comments * resolve doc * resolve bug * update * resolve bug * update on comments * resolve pep8 * update test doc * update on comments * solve special tests * resolve bug * resolve flake8 * Update pytorch_lightning/callbacks/progress.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * add predict to LightningModule * missing predict * typo * rename is_prediction to _predicting * add * update * update * update doc Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>	2021-01-27 11:38:14 -05:00
Jirka Borovec	53b0ae49b9	fix imports / isort / flake8	2021-01-26 14:57:34 +01:00
chaton	0435e23a64	deprecate enable_pl_optimizer as it is not restored properly (#5244 ) * update * clean test * still in progress * udpdate test * update * update * resolve flake * add test for zero_grad * update * works without accumulated_grad * update * update * resolve amp * revert back to True * update * clean tests * cleaned out * typo * update test * git repare bug * remove print * udpate * Fix formatting/optimizer imports * Refactor the test for cleanliness * Add vanilla model to the test, better var names * Fixed var names, let's clean up these mock tests * repare test * update test * resolve flake8 * add manual_optimization * update tests * resolve flake8 * add random accumulate_grad_batches * improve test * Update tests/trainer/optimization/test_parity_automatic_optimization.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/trainer/optimization/test_parity_automatic_optimization.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * update * clean tests * correct bug * Apply suggestions from code review * format * adress comments * update on comments * wip * typo * depreceate enable_pl_optimizer * resolve latest bugs * update * resolve merge * add comment * Update pytorch_lightning/core/lightning.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/deprecated_api/test_remove_1-3.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/connectors/optimizer_connector.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/trainer/optimization/test_parity_automatic_optimization.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * update on comments * update restore * add a property * remove setstate as not needed anymore * update test * provide optimizer to on_before_zero_grad * update on comments * update on comments * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update tests/trainer/optimization/test_parity_automatic_optimization.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update tests/trainer/optimization/test_parity_automatic_optimization.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Update tests/trainer/optimization/test_parity_automatic_optimization.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * mofidy import * update changelog * resolve flake8 * update * update * clean doc Co-authored-by: SeanNaren <sean@grid.ai> Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com> (cherry picked from commit `f2e99d617f`)	2021-01-26 14:29:46 +01:00
Jirka Borovec	059f4630c8	prune check on Trainer fit result (#5453 ) * prune check on Trainer fit result * flake8 * Apply suggestions from code review Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * . Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-01-11 19:36:48 -05:00
Gianluca Scarpellini	7464aca44e	test_cpu and test_gpu EvalModelTemplate deprecation (#4820 ) * test_cpu refactoring - BoringModel and checkpoints; test_gpu refactoring - BoringModelboring_model refactoring - validation, testing; Fix - run_prediction as dispatcher for testing BoringModel * Removed EvalModelTemplate import from test_cpu and test_gpu * Reverting unintended changes * Issues with checkpointing * Fixed tests for logging and checkpointing * Fix for dispatcher * test_cpu refactoring - BoringModel and checkpoints; test_gpu refactoring - BoringModelboring_model refactoring - validation, testing; Fix - run_prediction as dispatcher for testing BoringModel * Removed EvalModelTemplate import from test_cpu and test_gpu * Reverting unintended changes * Issues with checkpointing * Fixed tests for logging and checkpointing * Fix for dispatcher * Fixed acc check for stocasticity of seeds * Fixed according to @borda suggestions * Hparams for boring_model * Deprecated RuntimeParamChagneModelAssing (functionality is tested in RuntimeParamChangeModelSaving) * Reduced boring_model parameters to just in and out features, test_cpu modelsinherit BoringModel to specify additional parameters (e.g., optimizer) * Fix PEP8 * Update tests/base/develop_pipelines.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/base/boring_model.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/base/develop_pipelines.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/models/test_cpu.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/models/test_cpu.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Merged test_early_stopping with all_features; added TODO for self.log * Fixed test_all_features trainer options * Ready for review! * Update tests/models/test_cpu.py Thank you! :) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/models/test_cpu.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/models/test_cpu.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/models/test_cpu.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * Update tests/models/test_cpu.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * added optimizer_name, lr, and batch_size as hparams for save_hparameters() * Fixes for reducing PR size * Reverse test_hparams (removed DEPRECATED test for hparams direct assignment) * Changes for in_features * Fixed hparams * Fixed parameters for boring_model * Update tests/models/test_cpu.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update tests/models/test_cpu.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update tests/models/test_cpu.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * fix for pep8 * Fixed run_predction and TODO * fix min acc for darwin/windows without pl_opt * eval as DEFAULT run_prediction strategy * Updated val_dataloader for running_test_no_val Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2021-01-07 05:50:08 -05:00
tarepan	bb366232e7	Add non-existing resume_from_checkpoint acceptance for auto-resubmit (#4402 ) * Add empty resume_from_checkpoint acceptance #4366 * Fix general error catch with focused file check * Add fsspec HTTP extras Add fsspec's HTTPFileSystem support through http extras. pl has supported remote http file (e.g. #2925), so this commit do not add new functionality. * Fix potential too much logging in DDP * Add PR changelog * Add well-written argument explanation Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Fix DDP-compatible restore logging Notify from where the states are restored. This feature temporally deleted as a result of PR review. With succeeding review, added with DDP compatibility. * Fix utility import pathes * Refactor load step commentaries * Refactor hpc ckpt suffix acquisition * Refactor restore/hpc_load match * Refactor hpc load trial * Refactor checkpoint dir check * Refactor unneeded function nest * Refactor nested If * Refactor duplicated cache clear * Refactor attempt flow with if/elif * Fix pip8 * Refactor hook commentary Co-authored-by: chaton <thomas@grid.ai> * Fix pep8 * Refactor hpc load checkpoint path acquisition * Fix pip8 * Fix typo Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Fix typo Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Fix doc Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Refactor None Union type with Optional * Fix build-doc CI failure debuged in #5329 * Fix fsspec import during build-doc #5329 * Fix test epoch Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> * Fix test with latest test models * . Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com> Co-authored-by: Roger Shieh <sh.rog@protonmail.ch> (cherry picked from commit `b0051e8c03`)	2021-01-06 12:55:38 +01:00
Jirka Borovec	0f36525e8f	fix/enable - check F401 (#5201 ) * refactor - check F401 * missed * fix	2020-12-21 10:15:04 +01:00
Jirka Borovec	05f25f3a54	update usage of deprecated checkpoint_callback (#5006 ) * drop usage of deprecated checkpoint_callback * fix * fix	2020-12-09 14:14:34 -05:00
Jirka Borovec	53d7c9555c	drop usage of deprecated distributed_backend (#5009 ) Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>	2020-12-09 09:18:23 +01:00
chaton	c2e6e68c7e	optimizer clean up (#4658 ) * add LightningOptimizer * typo * add mock closure * typo * remove logic in optimizer_step * update * update * update * desactivate LightningOptimizer for hovorod * resolve flake * typo * check optimizer name * change name * added backward to LightningOptimizer * remove use_lightning_optimizer * move update * simplify init * resolve comments * resolve bug * update * update * resolve bugs * resolve flake8 * set state * work manual_optimizer_step * add doc * add enable_pl_optimizer * make optimizer_step * add make_optimizer_step * add examples * resolve test * add test_optimizer_return_options_enable_pl_optimizer * add enable_pl_optimizer=True * update * update tests * resolve bugs * update * set Trainer to False * update * resolve bugs * update * remove from doc * resolve bug * typo * update * set to True * simplification * typo * resolve horovod * unwrap horovod * remove Optimizer * resolve horovod * move logic to amp_backend * doesn't seem to be pickable * update * add again * resolve some bugs * cleanup * resolve bug with AMP * change __repr__ * round at -12 * udpate * update * update * remove from horovod * typo * add convert_to_lightning_optimizers in each accelerators * typo * forgot * forgot a convert_to_lightning_optimizers * update * update * update * increase coverage * update * resolve flake8 * update * remove useless code * resolve comments + add support for LightningOptimizer base class * resolve flake * check optimizer get wrapped back * resolve DDPSharded * reduce code * lightningoptimizer * Update pytorch_lightning/core/optimizer.py Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * Update pytorch_lightning/core/lightning.py * remove reference to step function * Apply suggestions from code review * update on comments * resolve * Update CHANGELOG.md * add back training_step in apex and native_amp * rename optimizer_step Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>	2020-12-01 00:09:46 +00:00
Carlos Mocholí	396a46f55f	Add current_score to ModelCheckpoint.on_save_checkpoint (#4721 ) * Add current_score to ModelCheckpoint.on_save_checkpoint * Update CHANGELOG [ci skip] * fix Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * fix2 * Add test for NaN * Fix failing tests * Simplify line * Add test docstrings Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2020-11-18 08:09:44 +00:00
Adrian Wälchli	9b7f01654a	Update old "module_arguments" and "hparams" references in docs (#4417 ) * replace module_arguments refernces * update hparams docs * add missing save_hyperparameters in example * deprecate instead of remove * Update docs/source/hyperparameters.rst Co-authored-by: chaton <thomas@grid.ai> * Update docs/source/hyperparameters.rst Co-authored-by: Teddy Koker <teddy.koker@gmail.com> Co-authored-by: chaton <thomas@grid.ai> Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>	2020-11-03 12:13:10 +01:00
Adrian Wälchli	d1234c592d	deprecate passing ModelCheckpoint instance to Trainer(checkpoint_callback=...) (#4336 ) * first attempt * update tests * support multiple * test bugfix * changelog * pep * pep * import order * import * improve test for resuming * test * update test * add references test Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * docstring suggestion deprecation Co-authored-by: Jeff Yang <ydcjeff@outlook.com> * paramref Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Jeff Yang <ydcjeff@outlook.com>	2020-10-30 04:47:37 +01:00
Rohit Gupta	4c7ebdc32b	Add dirpath and filename parameter in ModelCheckpoint (#4213 ) * Add dirpath and filename parameter in ModelCheckpoint * remove old function * chlog * codefactor * update tests * docs * fix doctest and added tests * pathlib dirpath * dep version and docs * try fix doctest * pep * suggestions Co-authored-by: carmocca <carlossmocholi@gmail.com> * suggestions * fix test * pep * trigger tests * Apply suggestions from code review Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> * suggestions * try fix windows test * add and update some tests * trigger tests * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-10-23 09:59:12 +05:30
William Falcon	09c2020a93	notices (#4118 )	2020-10-13 07:18:07 -04:00
Jirka Borovec	8873750cf0	remove deprecated early_stop_callback (#3982 )	2020-10-08 06:30:33 -04:00
Jean-Baptiste SCHIRATTI	cea5f1f538	Fix for `load_from_checkpoint` (#2776 ) * Fix. * Fix #2550: allow to load model from checkpoint if self.save_hyperparameters() was not called. * Fix? Cleaner way of not calling self.save_hyperparameters in EvalModelTemplate. * Fix? `_load_model_state` cleanup * Fix? * Fix #2550: allow to load model from checkpoint if self.save_hyperparameters() was not called. * Fix. * Fix? Cleaner way of not calling self.save_hyperparameters in EvalModelTemplate. * Fix? `_load_model_state` cleanup * Fixed side effect in `test_load_model_from_checkpoint_extra_args`. * Apply suggestions from code review * fix * try * fixed missing arg in evalmodel * fixed missing arg in evalmodel * fix * update * fix loading * add test * prune Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-10-05 12:44:23 -04:00
Nrupatunga	7d47ed178b	[Bug-Fix]:properties `current_epoch` and `global_step` between model and trainer same always (#3785 ) * make current_epoch and global_step to be same as trainer, after model restore. * remove assignment here * test * minor modification * Update pytorch_lightning/core/lightning.py type check, better clarity Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * Update pytorch_lightning/core/lightning.py type check, better clarity Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> * comments for current_epoch and global_step properties * Update tests/models/test_restore.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * update comments according to the changes made * Update tests/models/test_restore.py * add current_epoch, global_step to jit ignore list * Add comments to CHANGELOG * Update CHANGELOG.md * Update tests/models/test_restore.py Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-10-05 11:10:40 -04:00
William Falcon	d9bc95f83e	ref: bug fix with logging val epoch end + monitor (#3812 ) * ref: fix metric err * ref: merge * ref: decoupled ddp2 * ref: clean up ddp before final fix	2020-10-03 12:33:29 -04:00
William Falcon	d79bce1dff	enable None model checkpoint default (#3669 ) * enable None model checkpoint default	2020-09-26 23:14:04 -04:00
William Falcon	cd16aa9854	ref: checkpoint connector methods 4/n (#3474 ) * ref: checkpoint connector methods 4/n	2020-09-12 08:42:27 -04:00
Peter Yu	88886ace72	More robust way of collecting init argument names for LightningModules (#3066 ) When a LightningModule inherits from a class that implements `__new__()` such as `typing.Generic`, `inspect.signature(cls)` short-circuits and returns the signature of `__new__()` instead of `__init__()`. So, we need to be more specific and call inspection directly on the init function.	2020-08-20 07:19:11 -04:00
shijianjian	18d31a3b63	Added strict=False for load_from_checkpoint (#2819 ) * Added strict=False and hparams_file accepcts dict * Apply suggestions from code review Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * Type check fix * Added tests * Linting & test fix * Removed redundant code & test * Added strict=False and hparams_file accepcts dict * Apply suggestions from code review Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> * Type check fix * Added tests * Linting & test fix * Removed redundant code & test * Apply suggestions from code review * tests * tests * chlog * Update tests/models/test_restore.py Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * update test comments * Added docstring for the strict attribute * Added supplementary tests * Update saving.py * Apply suggestions from code review Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> * pep8, removed extra func Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>	2020-08-13 16:25:43 -04:00
Jirka Borovec	06e8910f06	pytorch 1.6 (#2745 ) * pt 1.6 * don't use the new zipfile serialization for now * quick flake8 fixes * remove unnecessary f * coalesce strings * remove comma * remove extra commas * Apply suggestions from code review Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com> * set _use_new_zipfile_serialization to False only for pytorch 1.6.0 * remove unnecessary comments * flake8 fixes * use pkg_resources instead of packaging * readme * format * version * chlog Co-authored-by: Peter Yu <peter@asapp.com> Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>	2020-07-31 11:18:32 +02:00
Jirka Borovec	590e7fb1fd	tests: add default_root_dir=tmpdir (#2392 ) * tests: add default_root_dir=tmpdir * remove duplicate tmpdir args * add missing fixture * test requires multi gpu * typo * resize Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2020-07-28 09:47:53 -04:00
William Falcon	aaa1553e35	tests for val loop flow (#2605 ) * add tests for single scalar return from training * fixing val step only	2020-07-14 14:20:45 -04:00
William Falcon	69cbb62774	Finish #2549 (#2557 ) * removed spawns for test_converters and verified tests Co-authored-by: Ananya Harsh Jha <ahj265@nyu.edu> Co-authored-by: zcain <zcain@google.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: Jirka <jirka@pytorchlightning.ai>	2020-07-08 20:33:48 -04:00
William Falcon	11069c8784	Fix ddp tests + .test() (#2512 ) * added base tests for tpu * fix deprecation warnings * Update pytorch_lightning/trainer/trainer.py Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com> * added
base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu * added base tests for tpu Co-authored-by: Jirka <jirka@pytorchlightning.ai> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>	2020-07-07 12:24:56 -04:00
Adrian Wälchli	25ee51bc57	Continue Jeremy's early stopping PR #1504 (#2391 ) * add state_dict for early stopping * move best attr after monitor_op defined * improve early stopping and model checkpoint callbacks * fix formatting * fix attr init order * clean up setting of default_root_dir attr * logger needs default root dir set first * reorg trainer init * remove direct references to checkpoint callback * more fixes * more bugfixes * run callbacks at epoch end * update tests to use on epoch end * PR cleanup * address failing tests * refactor for homogeneity * fix merge conflict * separate tests * tests for early stopping bug regressions * small fixes * revert model checkpoint change * typo fix * fix tests * update train loop * cannot pass an int as default_save_path * refactor log message * fix test case * appease the linter * fix some doctests * move config to callback * fixes from rebase * fixes from rebase * chlog * docs * reformat * formatting * fix * fix * fixes from rebase * add new test for patience * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/callbacks/test_early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix formatting * remove enable_early_stop attribute * add state_dict for early stopping * move best attr after monitor_op defined * improve early stopping and model checkpoint callbacks * fix formatting * fix attr init order * clean up setting of default_root_dir attr * logger needs default root dir set first * reorg trainer init * remove direct references to checkpoint callback * more fixes * more bugfixes * run callbacks at epoch end * update tests to use on epoch end * PR cleanup * address failing tests * refactor for homogeneity * fix merge conflict * separate tests * tests for early stopping bug regressions * 
small fixes * revert model checkpoint change * typo fix * fix tests * update train loop * fix test case * appease the linter * fix some doctests * move config to callback * fixes from rebase * fixes from rebase * chlog * docs * reformat * formatting * fix * fix * fixes from rebase * add new test for patience * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update pytorch_lightning/callbacks/model_checkpoint.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update tests/callbacks/test_early_stopping.py Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * fix formatting * remove enable_early_stop attribute * fix test with new epoch indexing * fix progress bar totals * fix off by one error (see #2289) epoch starts at 0 now * added missing imports * fix hpc_save folderpath * fix formatting * fix tests * small fixes from a rebase * fix * tmpdir * tmpdir * tmpdir * wandb * fix merge conflict * add back evaluation after training * test_resume_early_stopping_from_checkpoint TODO * undo the horovod check * update changelog * remove a duplicate test from merge error * try fix dp_resume test * add the logger fix from master * try remove default_root_dir * try mocking numpy * try import numpy in docs test * fix wandb test * pep 8 fix * skip if no amp * dont mock when doctesting * install extra * fix the resume ES test * undo conf.py changes * revert remove comet pickle from test * Update CHANGELOG.md Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> * Update weights_loading.rst * Update weights_loading.rst * Update weights_loading.rst * renamed flag * renamed flag * revert the None check in logger experiment name/version * add the old comments * _experiment * test chckpointing on DDP * skip the ddp test on windows * cloudpickle * renamed flag * renamed flag * parentheses for clarity * apply suggestion max epochs Co-authored-by: Jirka Borovec 
<Borda@users.noreply.github.com> Co-authored-by: Jeremy Jordan <jtjordan@ncsu.edu> Co-authored-by: Jirka <jirka@pytorchlightning.ai> Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com> Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-06-28 21:36:46 -04:00

72 Commits