chaton
867eef0e4c
[HOTFIX] Logging for evaluation ( #4684 )
* resolve bugs
* add should_flush_logs
* remove should_flush
* should work
* update test
* use something else
* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py
* log mock_log_metrics.mock_calls
* typo
* don't use keys
* convert to list
* typo
* check kwargs
* resolve bug
* resolve flake8
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-15 10:41:33 -05:00
Carlos Mocholí
61394d543c
Improve skipping step tests ( #4109 )
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-14 21:10:24 +00:00
Jirka Borovec
e1955e3c89
isolate PL debugger in tests ( #4643 )
* isolate PL debugger in tests
* miss
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-14 11:22:56 +00:00
ananthsub
d096a2ea6d
Fix setup callback hook to pass LightningModule through ( #4608 )
* Fix setup callback hook
* Update CHANGELOG.md
* Update test_trainer.py
* Update test_trainer.py
* Update test_trainer.py
* fix chlog
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-13 19:34:46 -05:00
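The fix above concerns the `setup` callback hook receiving the LightningModule. A minimal dependency-free sketch of what the hook signature implies (the real callback subclasses `pytorch_lightning.Callback`; the class and return value here are illustrative only):

```python
class PrintSetupCallback:
    """Sketch of a Lightning Callback's setup hook; the Lightning base
    class is omitted so the snippet stays dependency-free."""

    def setup(self, trainer, pl_module, stage):
        # after this fix, pl_module is the actual LightningModule
        # instance, passed through by the Trainer
        return f"setup[{stage}]: {type(pl_module).__name__}"
```

In real code the hook is called by the Trainer at the start of fit/test, not invoked by the user.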
chaton
4018237c30
[FEAT] Add lambda closure to manual_optimizer_step ( #4618 )
* added lambda_closure
* move to types
* add 2 new tests
* make example more complex
* add complex example to doc
* added more tests
* resolve doc
* typo
* update
* update tpu optimizer_step
* Apply suggestions from code review
* Update pytorch_lightning/core/lightning.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-12 19:22:06 +00:00
chaton
4a01fd048c
[FIX] Average Pbar Metrics ( #4534 )
* wip
* update
* normalize loss
* update test
* resolve bug
* update test and add TODO
* make sure it can be synced
* add TODO
* update sol
2020-11-12 15:59:01 +00:00
Sean Naren
33470ba605
Prevent crash if sync_dist=True on CPU ( #4626 )
* Added test/fix for sync_dist raising NotImplementedError
* Fixed comments/formatting
* Revert base class change, enforce sync tensors across accelerators, added GPU test
2020-11-11 22:04:05 +00:00
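The `sync_dist` flag referenced above belongs to the `self.log` API inside a LightningModule; a hedged fragment of the call site (the metric name and `compute_loss` helper are illustrative):

```python
# fragment from a LightningModule; self.log is provided by Lightning
def validation_step(self, batch, batch_idx):
    loss = self.compute_loss(batch)  # compute_loss is a hypothetical helper
    # sync_dist=True reduces the metric across processes; after this
    # fix it no longer raises NotImplementedError on CPU-only runs
    self.log("val_loss", loss, sync_dist=True)
```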
chaton
3d202f9ecc
[FEAT] Refactor logging 3/3 [v1] ( #4552 )
* wip
* wip check how many tests break
* wip
* resolve some bugs
* resolve more bugs
* resolve 2 bugs
* resolve
* temp fix
* update
* remove useless code
* remove result
* try to resolve bug
* update changelog
* formatting
* remove pl
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-11 17:05:24 +00:00
chaton
7e08b0d710
[bug-fix] DDP and automatic_optimization=False ( #4485 )
* resolve bug
* add self._running_manual_optim
* update
* update tests
* update lightning module
* resolve bug
* update tests
* update
* resolve pep8
* update
* replace by `ddp_spawn`
* temporary fix
* update
* update
* move update to training_loop
* make both ddp_spawn
* introduce `manual_optimizer_step`
* update changelog
* added changelog wrong place
* add force_optimizer_step
* update docstring for tests
* update optimizer_step
* update zero_grad
* resolve flake8
* move update into manual_optimizer_step
* add zero_grad
* remove zero_grad tests
* remove manual_backward in AMP, it doesn't help
* update
* loosen tests
* update
* update doc
* add TODO
* Removed unnecessary get model from native amp
* Remove try except with pytest raise
* Add seed, clean up imports, remove try catch to reproduce error
* update code
* update test
* revert back
* formatting
* Update pytorch_lightning/core/lightning.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-10 19:44:51 +00:00
Nicki Skafte
4f3160ba2e
Skip tuner algorithms on fast dev ( #3903 )
* skip on fast dev
* fix error
* changelog
* fix recursive issue
* combine tests
* pep8
* move logic to base funcs
* fix mistake
* Update pytorch_lightning/tuner/lr_finder.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* pep
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-10 00:34:42 +01:00
William Falcon
09a51697ed
Adds shortcut for path to log ( #4573 )
* added log_dir shortcut to trainer properties for writing logs
* added log_dir shortcut
* added log_dir shortcut
* added log_dir shortcut
* added log_dir shortcut
* added log_dir shortcut
* added log_dir shortcut
* added log_dir shortcut
* added log_dir shortcut
2020-11-08 12:16:22 -05:00
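The shortcut added here exposes the resolved logging directory as a Trainer property; a sketch of the intended usage (the example path in the comment is illustrative of the default logger layout):

```python
from pytorch_lightning import Trainer

trainer = Trainer()
# trainer.log_dir resolves to the active logger's save directory,
# e.g. "lightning_logs/version_0" with the default logger, so user
# code can write artifacts next to the logs without rebuilding paths
checkpoint_dir = trainer.log_dir
```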
chaton
9c8701f2e2
[feat] Logging refactor 2/n - train ( #4495 )
* update logging
* solve more bugs
* replace Mapping by Dict
* update on comments
* resolve pep8
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update on comments
* typo
* update for coverage
* update test
* update
* Update tests/models/test_hooks.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* Update tests/models/test_hooks.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* update on comments
* remove deepcopy
* remove useless look for
* another small optim
* extra optim
* remove latest optim, can be a source of bugs
* resolve bug
* add docstring
* optimize coverage
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/trainer/logging_tests/test_distributed_logging.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/trainer/evaluation_loop.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/trainer/logging/test_logger_connector.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/trainer/logging_tests/test_train_loop_logging_1_0.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update on comments
* update
* update on comments
* update parity speed
* get it down to 0.65
* update
* 0.8 max_dif
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-11-05 22:27:04 +00:00
chaton
11dc5264cd
Bugfix/4449 dict attribute error ( #4480 )
* resolve a bug
* resolve a bug
* remove todo
* resolve more bugs
* update tests
* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* resolve pyright
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-04 19:35:07 +00:00
Adrian Wälchli
9b7f01654a
Update old "module_arguments" and "hparams" references in docs ( #4417 )
* replace module_arguments references
* update hparams docs
* add missing save_hyperparameters in example
* deprecate instead of remove
* Update docs/source/hyperparameters.rst
Co-authored-by: chaton <thomas@grid.ai>
* Update docs/source/hyperparameters.rst
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-03 12:13:10 +01:00
Rohit Gupta
1396321b4d
Add fsspec to tuner ( #4458 )
* Add fsspec to tuner
* suggestions
* pathlib
* pep
* missed pep
2020-11-03 15:09:40 +05:30
Rohit Gupta
360b3d8844
Disable training when limit_train_batches=0 ( #4371 )
* Disable training when limit_train_batches=0
* chlog
* pep
* limit_train_batches
* BoringModel
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-11-03 12:10:35 +05:30
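A configuration fragment for the flag this commit fixes; with the fix, a zero value disables the training loop entirely instead of misbehaving:

```python
from pytorch_lightning import Trainer

# limit_train_batches=0 now skips training altogether, e.g. to run
# only validation/test against an existing checkpoint
trainer = Trainer(limit_train_batches=0)
```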
chaton
958aa1aee7
[test] Accumulated gradient optimization tests ( #4477 )
* adding tests
* wip
* update
* Update tests/trainer/test_trainer.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-02 23:44:11 +00:00
chaton
ac3f7393fd
[FEAT] logging refactors 1/n ( #4439 )
* introducing new logging object
* typo
* typo
* Update pytorch_lightning/trainer/logging.py
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* Update pytorch_lightning/trainer/logging.py
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* update on comments
* update on comments
* add more docstrings
* Update pytorch_lightning/core/lightning.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* resolve on comments
* solve pyright
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* update on comments
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* update on comments
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-02 20:51:43 +00:00
Carlos Mocholí
66ade19d56
Rename conflicting test directories ( #4451 )
* logging -> logging_tests
* warnings -> warnings_tests
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-11-03 00:03:37 +05:30
Jirka Borovec
ef03c39ab7
Add step index in checkpoint name ( #3807 )
* true final value of global step
* ch check
* tests
* save each validation interval
* wip
* add test
* add test
* wip
* fix tests, revert old edits, fix merge conflicts, update doctests
* test + bugfix
* sort files
* format test
* suggestion by ananth
* added changelog
* naming
* docs
* example
* suggestion
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* fix test
* pep
* pep
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2020-11-02 15:05:58 +01:00
Dusan Drevicky
c50c225f05
feature: Allow str arguments in Trainer.profiler ( #3656 )
* allow trainer's profiler param to have a str value
* add tests
* update docs
* update exception message
* Update CHANGELOG
* fix pep8 issues
* cleanup test code
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Add deprecation warning if using bool for profiler
* Add deprecation tests and move deprecated tests
* Remove bool option to profiler from docs
* Deprecate bool args to profiler in CHANGELOG
* fixup! Add deprecation warning if using bool for profiler
* fixup! Add deprecation tests and move deprecated tests
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Implement suggestions, remove whitespace
* fixup! Implement suggestions, remove whitespace
* Allow bool, str (case insensitive), BaseProfiler
* Add info about bool deprecation to trainer
* fixup! Add info about bool deprecation to trainer
* Move deprecate todo to test_deprecated
* Test wrong profiler type, improve error message
* fixup! Test wrong profiler type, improve error message
* Update pytorch_lightning/trainer/connectors/profiler_connector.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Apply suggestions from code review
* Readd bool to profiler types, test cli profiler arg
* Remove extra whitespace in doc
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update deprecation versions
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-27 16:27:16 +05:30
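The feature above lets `Trainer.profiler` accept a string naming a built-in profiler, in addition to a `BaseProfiler` instance or the now-deprecated bool; a config fragment (`"simple"` shown, case-insensitive per the commit messages):

```python
from pytorch_lightning import Trainer

# "simple" selects the built-in simple profiler; passing True/False
# is deprecated by this PR in favor of a str or a BaseProfiler instance
trainer = Trainer(profiler="simple")
```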
ananthsub
f6efb712ed
Skip replacing dataloader sampler if it's already a distributed sampler ( #4273 )
* Update data_loading.py
* Update data_loading.py
* add test + update flag description
* add to changelog
* Update test_dataloaders.py
* fix-pickle
* Update test_dataloaders.py
* Added missing reference calls
* Update tests/trainer/test_dataloaders.py
* Apply suggestions from code review
* Update data_loading.py
* Update test_dataloaders.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-23 17:34:07 +01:00
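A sketch of the dataloader configuration this commit handles: a DataLoader that already carries a DistributedSampler is now left untouched instead of having its sampler replaced (`my_dataset` is a hypothetical dataset object):

```python
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

sampler = DistributedSampler(my_dataset)  # my_dataset is hypothetical
loader = DataLoader(my_dataset, sampler=sampler, batch_size=32)
# with this fix, the Trainer detects the existing DistributedSampler
# and skips its automatic sampler replacement for this loader
```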
Rohit Gupta
4c7ebdc32b
Add dirpath and filename parameter in ModelCheckpoint ( #4213 )
* Add dirpath and filename parameter in ModelCheckpoint
* remove old function
* chlog
* codefactor
* update tests
* docs
* fix doctest and added tests
* pathlib dirpath
* dep version and docs
* try fix doctest
* pep
* suggestions
Co-authored-by: carmocca <carlossmocholi@gmail.com>
* suggestions
* fix test
* pep
* trigger tests
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* suggestions
* try fix windows test
* add and update some tests
* trigger tests
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-10-23 09:59:12 +05:30
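A config fragment for the two parameters this PR introduces; the `{epoch}`/`{step}` placeholders in `filename` follow the checkpoint-name formatting also touched in #3807:

```python
from pytorch_lightning.callbacks import ModelCheckpoint

# dirpath/filename replace the older combined filepath argument;
# placeholders in filename are filled in at save time
ckpt = ModelCheckpoint(dirpath="checkpoints/", filename="{epoch}-{step}")
```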
Sean Naren
065cc94112
Fix bug comparing max_steps to global step which inits at 0 ( #4278 )
* Fix bug comparing max_steps to global step which inits at 0
* Added test to ensure accumulate grad batch works with max steps
* check fix with TODO test
* correct call counts
* Add check to ensure we've finished accumulation of this global step before exiting loop in conjunction with max steps
* Remove + 1 check in test as this was incorrect
* Update incorrect expected outputs in lr finder test
* Added brackets for clarity
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-22 13:58:59 +01:00
Mauricio Villegas
546476c704
Allow changing the logged step value in validation_step ( #4130 )
* Fix to bug identified in https://github.com/PyTorchLightning/pytorch-lightning/issues/4102
* update tests
* chlog
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-10-22 03:03:07 +05:30
Carlos Mocholí
2549ca40e6
Clean up optimizer code ( #3587 )
* Update optimizer code
* Update CHANGELOG
* Fix tuple of one list case
* Update docs
* Fix pep issue
* Minor typo [skip-ci]
* Use minimal match
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-21 21:12:48 +02:00
Justus Schock
0ec4107697
Optimizer closure ( #4190 )
* closure for all optimizers
* rename hook and take care of alternating backwards
* add comment
* training_loop_fix
* closure whenever possible
* training_loop
* simple tests that count backward calls
* fix test to work with closure
* remove debugging statement
* better place
* check grads after backward
* start fixing manual optimization
* skip step when result returned by closure was None
* fix gradient clipping test to work with closure
* attribute dict result only for automatic optimization
* adjust backward calls in accelerator
* adjust where to call gradient clipping
* adjust backward calls in tests
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* pass kwargs to xla optimizer
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-21 19:34:29 +01:00
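The core idea of the refactor above: the forward/backward pass is wrapped in a closure handed to the optimizer step, and the step is skipped when the closure reports no result. A minimal library-free sketch of that control flow (the names are illustrative, not the Trainer's internals):

```python
def run_optimizer_step(step_fn, closure):
    """Run the training closure, then step only if it produced a loss."""
    loss = closure()       # forward + backward happen inside the closure
    if loss is not None:   # closure returned None -> skip the step
        step_fn()
    return loss

# toy demonstration: the first closure yields a loss, so the step runs;
# the second yields None, so the step is skipped
steps = []
loss = run_optimizer_step(lambda: steps.append("step"), lambda: 0.5)
skipped = run_optimizer_step(lambda: steps.append("step"), lambda: None)
```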
Carlos Mocholí
e0f9799dbf
Add strict option to lr_scheduler dict ( #3586 )
* Add strict option to lr_scheduler dict
* Update docs
* Unnecessary "else" after "raise"
* Update CHANGELOG
* Fix rebase
2020-10-21 14:14:37 +02:00
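The `strict` key added here lives in the lr scheduler configuration dict returned from `configure_optimizers`; a config fragment (the `scheduler` placeholder stands in for a real `torch.optim.lr_scheduler` instance):

```python
scheduler = None  # placeholder for a torch.optim.lr_scheduler instance

# returned from LightningModule.configure_optimizers
lr_scheduler_config = {
    "scheduler": scheduler,
    "monitor": "val_loss",  # metric the scheduler conditions on
    "strict": True,         # raise (rather than warn) if monitor is missing
}
```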
Sean Naren
c336881959
Added fix to ensure that custom logged metrics within test_epoch_end are appended to the result object even without step reduced metrics ( #4251 )
2020-10-20 18:33:18 +02:00
Jirka Borovec
f37444fa3e
CI: add flake8 ( #4239 )
2020-10-19 21:20:17 +01:00
William Falcon
72f19768c8
remove duplicate metric vs step log for train loop ( #4173 )
* remove duplicate metric vs step log
* remove duplicate metric vs step log
* remove duplicate metric vs step log
* fix ddp index issue
2020-10-15 10:47:00 -04:00
William Falcon
45d05ff68d
Fixes #4141 ( #4169 )
* fix val epoch agg
* fix val agg metrics
* fix val agg metrics
* fix val agg metrics
2020-10-15 09:12:05 -04:00
William Falcon
09c2020a93
notices ( #4118 )
2020-10-13 07:18:07 -04:00
William Falcon
2d5a7f5e7d
Fixes #3276 ( #4116 )
2020-10-13 06:42:11 -04:00
William Falcon
bf2067a609
enabled manual returns ( #4089 )
2020-10-12 10:06:17 -04:00
William Falcon
5b645d713e
Covv1 ( #4072 )
* temporary drop metrics tests while speeding them up
* cov
* cov
* docs
2020-10-11 10:21:53 -04:00
William Falcon
a4b9221fc5
ref: decouple apex second attempt part n/n ( #4065 )
* ref: decouple apex second attempt part n/n
* ref: decouple apex second attempt part n/n
2020-10-10 22:04:50 -04:00
William Falcon
dbfe2b6129
ref: decouple apex second attempt part 9/n ( #4063 )
* ref: decouple apex second attempt part 9/n
* ref: decouple apex second attempt part 9/n
2020-10-10 18:44:24 -04:00
William Falcon
e3717ed36e
ref: decouple apex second attempt part n/n ( #4062 )
* ref: decouple apex second attempt part 8/n
* ref: decouple apex second attempt part 8/n
* ref: decouple apex second attempt part 8/n
* ref: decouple apex second attempt part 8/n
2020-10-10 17:25:45 -04:00
William Falcon
5ce9fc6bb3
ref: decouple apex second attempt part 7/n ( #4061 )
* ref: decouple apex second attempt part 7/n
* ref: decouple apex second attempt part 7/n
* ref: decouple apex second attempt part 7/n
2020-10-10 16:44:15 -04:00
William Falcon
dca86c310e
ref: decouple apex second attempt part 6/n ( #4060 )
* ref: decouple apex second attempt part 6/n
* ref: decouple apex second attempt part 6/n
2020-10-10 15:28:25 -04:00
Rohit Gupta
bdbf846029
Fix to print scaler value in progress bar ( #4053 )
* Fix to print scaler value in progress bar
* chlog
* Fix to print scaler value in progress bar
* Fix to print scaler value in progress bar
2020-10-10 12:20:11 -04:00
William Falcon
05e0b4e5a1
Revert "Remove limitation of batch scaler ( #4006 )" ( #4040 )
This reverts commit 7e756ca11f.
2020-10-09 21:03:23 -04:00
Jirka Borovec
baf4f35027
add parsing OS env vars ( #4022 )
* add parsing OS env vars
* fix env
* Apply suggestions from code review
* overwrite init
* Apply suggestions from code review
2020-10-09 19:34:09 -04:00
Nicki Skafte
7e756ca11f
Remove limitation of batch scaler ( #4006 )
* working code
* add tests
* fix scaling
* move patch dataloader to utils
* renaming
* fix tests
* add changelog
* update docs
* pep8
2020-10-09 14:53:01 -04:00
William Falcon
bfdea3ea28
Multi opts tests and clarification ( #4016 )
* ref: clean up opts docs
* ref: clean up opts docs
2020-10-08 22:55:59 -04:00
Jirka Borovec
8873750cf0
remove deprecated early_stop_callback ( #3982 )
2020-10-08 06:30:33 -04:00
William Falcon
1d3c7dc8d6
removed deprecated trainer flags ( #3969 )
* removed deprecated flags
* removed es callback flag
2020-10-07 23:46:21 -04:00
William Falcon
aa95addff2
removed support for EvalResult and TrainResult ( #3968 )
2020-10-07 22:39:16 -04:00
William Falcon
4c0d063c86
outputs in __batch_end hooks ( #3966 )
* train_batch_end outputs
* added tests for the output hooks
2020-10-07 21:48:38 -04:00