Commit Graph

286 Commits

Author SHA1 Message Date
chaton 867eef0e4c
[HOTFIX] Logging for evaluation (#4684)
* resolve bugs

* add should_flush_logs

* remove should_flush

* should work

* update test

* use something else

* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py

* log mock_log_metrics.mock_calls

* typo

* don't use keys

* convert to list

* typo

* check kwargs

* resolve bug

* resolve flake8

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-15 10:41:33 -05:00
Carlos Mocholí 61394d543c
Improve skipping step tests (#4109)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-14 21:10:24 +00:00
Jirka Borovec e1955e3c89
isolate PL debugger in tests (#4643)
* isolate PL debugger in tests

* miss

Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-14 11:22:56 +00:00
ananthsub d096a2ea6d
Fix setup callback hook to pass LightningModule through (#4608)
* Fix setup callback hook

* Update CHANGELOG.md

* Update test_trainer.py

* Update test_trainer.py

* Update test_trainer.py

* fix chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-13 19:34:46 -05:00
chaton 4018237c30
[FEAT] Add lambda closure to manual_optimizer_step (#4618)
* added lambda_closure

* move to types

* add 2 new tests

* make example more complex

* add complex example to doc

* added more tests

* resolve doc

* typo

* update

* update tpu optimizer_step

* Apply suggestions from code review

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-12 19:22:06 +00:00
chaton 4a01fd048c
[FIX] Average Pbar Metrics (#4534)
* wip

* update

* normalize loss

* update test

* resolve bug

* update test and add TODO

* make sure it can be sync

* add TODO

* update sol
2020-11-12 15:59:01 +00:00
Sean Naren 33470ba605
Prevent crash if sync_dist=True on CPU (#4626)
* Added test/fix for sync_dist raising NotImplementedError

* Fixed comments/formatting

* Revert base class change, enforce sync tensors across accelerators, added GPU test
2020-11-11 22:04:05 +00:00
chaton 3d202f9ecc
[FEAT] Refactor logging 3/3 [v1] (#4552)
* wip

* wip check how many tests break

* wip

* resolve some bugs

* resolve more bugs

* resolve 2 bugs

* resolve

* temp fix

* update

* remove useless code

* remove result

* try to resolve bug

* update changelog

* formatting

* remove pl

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-11 17:05:24 +00:00
chaton 7e08b0d710
[bug-fix] DDP and automatic_optimization=False (#4485)
* resolve bug

* add self._running_manual_optim

* update

* update tests

* update lightning module

* resolve bug

* update tests

* update

* resolve pep8

* update

* replace by `ddp_spawn`

* temporary fix

* update

* update

* move update to training_loop

* make both ddp_spawn

* introduce `manual_optimizer_step`

* update changelog

* added changelog wrong place

* add force_optimizer_step

* update docstring for tests

* update optimizer_step

* update zero_grad

* resolve flake8

* move update into manual_optimizer_step

* add zero_grad

* remove zero_grad tests

* remove manual_backward in AMP, it doesn't help

* update

* loosen tests

* update

* update doc

* add TODO

* Removed unnecessary get model from native amp

* Remove try except with pytest raise

* Add seed, clean up imports, remove try catch to reproduce error

* update code

* update test

* revert back

* formatting

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-10 19:44:51 +00:00
Nicki Skafte 4f3160ba2e
Skip tuner algorithms on fast dev (#3903)
* skip on fast dev

* fix error

* changelog

* fix recursive issue

* combine tests

* pep8

* move logic to base funcs

* fix mistake

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* pep

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-10 00:34:42 +01:00
William Falcon 09a51697ed
Adds shortcut for path to log (#4573)
* added log_dir shortcut to trainer properties for writing logs

* added log_dir shortcut

* added log_dir shortcut

* added log_dir shortcut

* added log_dir shortcut

* added log_dir shortcut

* added log_dir shortcut

* added log_dir shortcut

* added log_dir shortcut
2020-11-08 12:16:22 -05:00
chaton 9c8701f2e2
[feat] Logging refactor 2/n - train (#4495)
* update logging

* solve more bugs

* replace Mapping by Dict

* update on comments

* resolve pep8

* Apply suggestions from code review

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update on comments

* typo

* update for coverage

* update test

* update

* Update tests/models/test_hooks.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* Update tests/models/test_hooks.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* update on comments

* remove deepcopy

* remove useless look for

* another small optim

* extra optim

* remove lastest optim, can be source of bug

* resolve bug

* add docstring

* optimize coverage

* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/logging_tests/test_distributed_logging.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/trainer/evaluation_loop.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/logging/test_logger_connector.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/logging_tests/test_train_loop_logging_1_0.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update on comments

* update

* update on comments

* update parity speed

* get it down to 0.65

* update

* 0.8 max_dif

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-11-05 22:27:04 +00:00
chaton 11dc5264cd
Bugfix/4449 dict attribute error (#4480)
* resolve a bug

* resolve a bug

* remove todo

* resolve more bugs

* update tests

* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* resolve pyright

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-04 19:35:07 +00:00
Adrian Wälchli 9b7f01654a
Update old "module_arguments" and "hparams" references in docs (#4417)
* replace module_arguments refernces

* update hparams docs

* add missing save_hyperparameters in example

* deprecate instead of remove

* Update docs/source/hyperparameters.rst

Co-authored-by: chaton <thomas@grid.ai>

* Update docs/source/hyperparameters.rst

Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-03 12:13:10 +01:00
Rohit Gupta 1396321b4d
Add fsspec to tuner (#4458)
* Add fsspec to tuner

* suggestions

* pathlib

* pep

* missed pep
2020-11-03 15:09:40 +05:30
Rohit Gupta 360b3d8844
Disable training when limit_train_batches=0 (#4371)
* Disable training when limit_train_batches=0

* chlog

* pep

* limit_train_batches

* BoringModel

Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-11-03 12:10:35 +05:30
chaton 958aa1aee7
[test] Accumulated gradient optimization tests (#4477)
* adding tests

* wip

* update

* Update tests/trainer/test_trainer.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-02 23:44:11 +00:00
chaton ac3f7393fd
[FEAT] logging refactors 1/n (#4439)
* introducing new logging object

* typo

* typo

* Update pytorch_lightning/trainer/logging.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Update pytorch_lightning/trainer/logging.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* update on comments

* update on comments

* add more doctstring

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* resolve on comments

* solve pyright

* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* update on comments

* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* update on comments

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-02 20:51:43 +00:00
Carlos Mocholí 66ade19d56
Rename conflicting test directories (#4451)
* logging -> logging_tests

* warnings -> warnings_tests

Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-11-03 00:03:37 +05:30
Jirka Borovec ef03c39ab7
Add step index in checkpoint name (#3807)
* true final value of global step

* ch check

* tests

* save each validation interval

* wip

* add test

* add test

* wip

* fix tests, revert old edits, fix merge conflicts, update doctests

* test + bugfix

* sort files

* format test

* suggestion by ananth

* added changelog

* naming

* docs

* example

* suggestion

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fix test

* pep

* pep

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2020-11-02 15:05:58 +01:00
Dusan Drevicky c50c225f05
feature: Allow str arguments in Trainer.profiler (#3656)
* allow trainer's profiler param to have a str value

* add tests

* update docs

* update exception message

* Update CHANGELOG

* fix pep8 issues

* cleanup test code

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Add deprecation warning if using bool for profiler

* Add deprecation tests and move deprecated tests

* Remove bool option to profiler from docs

* Deprecate bool args to profiler in CHANGELOG

* fixup! Add deprecation warning if using bool for profiler

* fixup! Add deprecation tests and move deprecated tests

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Implement suggestions, remove whitespace

* fixup! Implement suggestions, remove whitespace

* Allow bool, str (case insensitive), BaseProfiler

* Add info about bool deprecation to trainer

* fixup! Add info about bool deprecation to trainer

* Move deprecate todo to test_deprecated

* Test wrong profiler type, improve error message

* fixup! Test wrong profiler type, improve error message

* Update pytorch_lightning/trainer/connectors/profiler_connector.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Apply suggestions from code review

* Readd bool to profiler types, test cli profiler arg

* Remove extra whitespace in doc

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update deprecation versions

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-27 16:27:16 +05:30
ananthsub f6efb712ed
Skip replacing dataloader sampler if it's already a distributed sampler (#4273)
* Update data_loading.py

* Update data_loading.py

* add test + update flag description

* add to changelog

* Update test_dataloaders.py

* fix-pickle

* Update test_dataloaders.py

* Added missing reference calls

* Update tests/trainer/test_dataloaders.py

* Apply suggestions from code review

* Update data_loading.py

* Update test_dataloaders.py

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-23 17:34:07 +01:00
Rohit Gupta 4c7ebdc32b
Add dirpath and filename parameter in ModelCheckpoint (#4213)
* Add dirpath and filename parameter in ModelCheckpoint

* remove old function

* chlog

* codefactor

* update tests

* docs

* fix doctest and added tests

* pathlib dirpath

* dep version and docs

* try fix doctest

* pep

* suggestions
Co-authored-by: carmocca <carlossmocholi@gmail.com>

* suggestions

* fix test

* pep

* trigger tests

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* suggestions

* try fix windows test

* add and update some tests

* trigger tests

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-10-23 09:59:12 +05:30
Sean Naren 065cc94112
Fix bug comparing max_steps to global step which inits at 0 (#4278)
* Fix bug comparing max_steps to global step which inits at 0

* Added test to ensure accumulate grad batch works with max steps

* check fix with TODO test

* correct call counts

* Add check to ensure we've finished accumulation of this global step before exiting loop in conjuction with max steps

* Remove + 1 check in test as this was incorrect

* Update incorrect expected outputs in lr finder test

* Added brackets for clarity

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-22 13:58:59 +01:00
Mauricio Villegas 546476c704
Allow changing the logged step value in validation_step (#4130)
* Fix to bug identified in https://github.com/PyTorchLightning/pytorch-lightning/issues/4102

* update tests

* chlog

Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-10-22 03:03:07 +05:30
Carlos Mocholí 2549ca40e6
Clean up optimizer code (#3587)
* Update optimizer code

* Update CHANGELOG

* Fix tuple of one list case

* Update docs

* Fix pep issue

* Minor typo [skip-ci]

* Use minimal match

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-21 21:12:48 +02:00
Justus Schock 0ec4107697
Optimizer closure (#4190)
* closure for all optimizers

* rename hook and take care of alternating backwards

* add comment

* training_loop_fix

* closure whenever possible

* training_loop

* simple tests that count backward calls

* fix test to work with closure

* remove debugging statement

* better place

* check grads after backward

* start fixing manual optimization

* skip step when result returned by closure was None

* fix gradient clipping test to work with closure

* attribute dict result only for automatic optimization

* adjust backward calls in accelerator

* adjust where to call gradient clipping

* adjust backward calls in tests

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* pass kwargs to xla optimizer

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-21 19:34:29 +01:00
Carlos Mocholí e0f9799dbf
Add strict option to lr_scheduler dict (#3586)
* Add strict option to lr_scheduler dict

* Update docs

* Unnecessary "else" after "raise"

* Update CHANGELOG

* Fix rebase
2020-10-21 14:14:37 +02:00
Sean Naren c336881959
Added fix to ensure that custom logged metrics within test_epoch_end are appended to the result object even without step reduced metrics (#4251) 2020-10-20 18:33:18 +02:00
Jirka Borovec f37444fa3e
CI: add flake8 (#4239) 2020-10-19 21:20:17 +01:00
William Falcon 72f19768c8
remove duplicate metric vs step log for train loop (#4173)
* remove duplicate metric vs step log

* remove duplicate metric vs step log

* remove duplicate metric vs step log

* fix ddp index issue
2020-10-15 10:47:00 -04:00
William Falcon 45d05ff68d
Fixes #4141 (#4169)
* fix val epoch agg

* fix val agg metrics

* fix val agg metrics

* fix val agg metrics
2020-10-15 09:12:05 -04:00
William Falcon 09c2020a93
notices (#4118) 2020-10-13 07:18:07 -04:00
William Falcon 2d5a7f5e7d
Fixes #3276 (#4116) 2020-10-13 06:42:11 -04:00
William Falcon bf2067a609
enabled manual returns (#4089) 2020-10-12 10:06:17 -04:00
William Falcon 5b645d713e
Covv1 (#4072)
* temporary drop metrics tests while speeding them up

* cov

* cov

* docs
2020-10-11 10:21:53 -04:00
William Falcon a4b9221fc5
ref: decouple apex second attemp part n/n (#4065)
* ref: decouple apex second attemp part n/n

* ref: decouple apex second attemp part n/n
2020-10-10 22:04:50 -04:00
William Falcon dbfe2b6129
ref: decouple apex second attemp part 9/n (#4063)
* ref: decouple apex second attemp part 9/n

* ref: decouple apex second attemp part 9/n
2020-10-10 18:44:24 -04:00
William Falcon e3717ed36e
ref: decouple apex second attemp part n/n (#4062)
* ref: decouple apex second attemp part 8/n

* ref: decouple apex second attemp part 8/n

* ref: decouple apex second attemp part 8/n

* ref: decouple apex second attemp part 8/n
2020-10-10 17:25:45 -04:00
William Falcon 5ce9fc6bb3
ref: decouple apex second attemp part 7/n (#4061)
* ref: decouple apex second attemp part 7/n

* ref: decouple apex second attemp part 7/n

* ref: decouple apex second attemp part 7/n
2020-10-10 16:44:15 -04:00
William Falcon dca86c310e
ref: decouple apex second attemp part 6/n (#4060)
* ref: decouple apex second attemp part 6/n

* ref: decouple apex second attemp part 6/n
2020-10-10 15:28:25 -04:00
Rohit Gupta bdbf846029
Fix to print scaler value in progress bar (#4053)
* Fix to print scaler value in progress bar

* chlog

* Fix to print scaler value in progress bar

* Fix to print scaler value in progress bar
2020-10-10 12:20:11 -04:00
William Falcon 05e0b4e5a1
Revert "Remove limitation of batch scaler (#4006)" (#4040)
This reverts commit 7e756ca11f.
2020-10-09 21:03:23 -04:00
Jirka Borovec baf4f35027
add parsing OS env vars (#4022)
* add parsing OS env vars

* fix env

* Apply suggestions from code review

* overwrite init

* Apply suggestions from code review
2020-10-09 19:34:09 -04:00
Nicki Skafte 7e756ca11f
Remove limitation of batch scaler (#4006)
* working code

* add tests

* fix scaling

* move patch dataloader to utils

* renaming

* fix tests

* add changelog

* update docs

* pep8
2020-10-09 14:53:01 -04:00
William Falcon bfdea3ea28
Multi opts tests and clarification (#4016)
* ref: clean up opts docs

* ref: clean up opts docs
2020-10-08 22:55:59 -04:00
Jirka Borovec 8873750cf0
remove deprecated early_stop_callback (#3982) 2020-10-08 06:30:33 -04:00
William Falcon 1d3c7dc8d6
removed deprecated trainer flags (#3969)
* removed deprecated flags

* removed es callback flag
2020-10-07 23:46:21 -04:00
William Falcon aa95addff2
removed support for EvalResult and TrainResult (#3968) 2020-10-07 22:39:16 -04:00
William Falcon 4c0d063c86
outputs in __batch_end hooks (#3966)
* train_batch_end outputs

* added tests for the output hooks
2020-10-07 21:48:38 -04:00