Sean Naren
e7134a9135
Sharded Plugin 2/n: Allow ddp plugin to modify optimizer state saving ( #4675 )
...
* Allow ddp plugin to modify optimizer state saving
* Rely on the accelerator for optimizer states
* Ensure we init the accelerator for the saving function
* Better comment for optim state dump
* Revert "Ensure we init the accelerator for the saving function"
This reverts commit af65effa
* Added accelerator check to initialize tuner before saving model checkpoint
* Simplify comment
* Revert "Added accelerator check to initialize tuner before saving model checkpoint"
This reverts commit f9929c0c
* Return single optimizer state to reduce duplication
* Fixed docstring
* Fixed typing
* Fixed comment
* Added CHANGELOG.md
Co-authored-by: chaton <thomas@grid.ai>
2020-11-18 16:38:35 +00:00
Jirka Borovec
5ea383332d
update chlog after 1.0.7 release ( #4735 )
2020-11-18 12:26:41 +00:00
Carlos Mocholí
396a46f55f
Add current_score to ModelCheckpoint.on_save_checkpoint ( #4721 )
...
* Add current_score to ModelCheckpoint.on_save_checkpoint
* Update CHANGELOG
[ci skip]
* fix
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* fix2
* Add test for NaN
* Fix failing tests
* Simplify line
* Add test docstrings
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-11-18 08:09:44 +00:00
Nicki Skafte
51097669b9
[metrics] change default behaviour of state dict ( #4685 )
...
* fix state dict
* Update docs/source/metrics.rst
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* changelog
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-16 12:33:45 +00:00
Jirka Borovec
be60efb3cf
allow decorate model init with saving hparams ( #4662 )
...
* add tests
* use boring model
* parsing init
* chlog
* double decorate
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* bug
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-11-16 11:02:26 +01:00
ananthsub
d096a2ea6d
Fix setup callback hook to pass LightningModule through ( #4608 )
...
* Fix setup callback hook
* Update CHANGELOG.md
* Update test_trainer.py
* Update test_trainer.py
* Update test_trainer.py
* fix chlog
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-13 19:34:46 -05:00
Jirka Borovec
396a18eb78
update changelog after 1.0.6 ( #4624 )
...
* update changelog after 1.0.6
* fix formatting
2020-11-12 09:21:57 +01:00
chaton
3d202f9ecc
[FEAT] Refactor logging 3/3 [v1] ( #4552 )
...
* wip
* wip check how many tests break
* wip
* resolve some bugs
* resolve more bugs
* resolve 2 bugs
* resolve
* temp fix
* update
* remove useless code
* remove result
* try to resolve bug
* update changelog
* formatting
* remove pl
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-11 17:05:24 +00:00
chaton
7e08b0d710
[bug-fix] DDP and automatic_optimization=False ( #4485 )
...
* resolve bug
* add self._running_manual_optim
* update
* update tests
* update lightning module
* resolve bug
* update tests
* update
* resolve pep8
* update
* replace by `ddp_spawn`
* temporary fix
* update
* update
* move update to training_loop
* make both ddp_spawn
* introduce `manual_optimizer_step`
* update changelog
* added changelog wrong place
* add force_optimizer_step
* update docstring for tests
* update optimizer_step
* update zero_grad
* resolve flake8
* move update into manual_optimizer_step
* add zero_grad
* remove zero_grad tests
* remove manual_backward in AMP, it doesn't help
* update
* loosen tests
* update
* update doc
* add TODO
* Removed unnecessary get model from native amp
* Remove try except with pytest raise
* Add seed, clean up imports, remove try catch to reproduce error
* update code
* update test
* revert back
* formatting
* Update pytorch_lightning/core/lightning.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-10 19:44:51 +00:00
maxjeblick
343d19fa86
Find parameters which are specified in the LightningDataModule, only ( #4347 )
...
* search for attribute in datamodule if not found elsewhere
* add test for datamodule
* add lightning_getattr test for datamodule
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update CHANGELOG.md
* Update CHANGELOG.md
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-11-10 14:01:20 +01:00
Nicki Skafte
465ec752f8
Metric ddp bugfix ( #4482 )
...
* changes
* fix spelling
* small note
* trying to fix ddp test
* fix ddp
* fix for test
* suggestion
* CHANGELOG
* Update pytorch_lightning/metrics/metric.py
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Sean Naren <sean@grid.ai>
2020-11-10 09:16:31 +01:00
Nicki Skafte
4f3160ba2e
Skip tuner algorithms on fast dev ( #3903 )
...
* skip on fast dev
* fix error
* changelog
* fix recursive issue
* combine tests
* pep8
* move logic to base funcs
* fix mistake
* Update pytorch_lightning/tuner/lr_finder.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* pep
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-10 00:34:42 +01:00
tarepan
41c9bee4f0
Fix load disparity between normal and hpc ( #4526 )
...
* Add missing load functionality in hpc
* Add general file load for hpc
* Add mark in CHANGELOG
* Fix Typo Li**hg**tning
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Refactor line separation
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Fix entangled fixation commit
* Fix naming of restore_model_states
* Fix amp restore place
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-09 17:26:38 +00:00
Stef | ステフ
4a6721af25
Missing TorchScript trace's update ( #4586 )
...
Co-authored-by: stef-ubuntu <stef@webempath.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-09 15:01:13 +01:00
Travis Addair
51cc7a89ee
Horovod: fixed early stopping and added metrics aggregation ( #3775 )
...
* Fixed early stopping for Horovod
* Refactored to sync_dist_if_available
* Bump min Horovod version to support hvd.is_initialized
* Changelog
* Added back change for Horovod
* Removed redundant checks for initialization
* Implement metrics gathering for Horovod
* Added test for EvalResult
* Renamed ddp_sync_on_step -> dist_sync_on_step
* Added metric test for Horovod
* Added option pass callable allgather function to metric base class
* Added dist_sync_fn
* Fixed calls to private _sync_dist
* Fixed Horovod test
* Added sync_tensor to the distributed backend
* Skip Windows
* Insert test path
* Removed redundant import
* Updated drone
* Unset HOROVOD_GPU_ALLREDUCE
* Unset
* No cache dir
* No uninstall
* Unset variables
* Uninstall Horovod during initialization
* Replaced more references to ddp_sync_on_step
* Fixed imports
* Fixed attribute
* Added back default
* Lint
* Added back docstring
* Made gather_all_tensors default
* Added whitespace
* Update tests/models/test_horovod.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/metrics/metric.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update CHANGELOG.md
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-05 12:52:02 -05:00
Jirka Borovec
41c6a1307b
update changelog after 1.0.5 ( #4505 )
...
* update changelog
* update
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-04 21:56:47 +01:00
ananthsub
5d08559c03
Avoid torchscript export for Metric forward ( #4428 )
...
* Update metric.py
* add test
* Update CHANGELOG.md
* Update test_metric_lightning.py
* Update test_metric_lightning.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-03 23:02:02 +01:00
Rohit Gupta
360b3d8844
Disable training when limit_train_batches=0 ( #4371 )
...
* Disable training when limit_train_batches=0
* chlog
* pep
* limit_train_batches
* BoringModel
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-11-03 12:10:35 +05:30
Rohit Gupta
ad2556b669
Disable saving checkpoints if not trained ( #4372 )
...
* Disable saving checkpoints if not trained
* chlog
* update test
* fix
Co-authored-by: chaton <thomas@grid.ai>
2020-11-03 11:38:32 +05:30
Nicki Skafte
19187d38f9
[Metrics] Detach bugfix ( #4313 )
...
* detach on buffer
* doc update
* remove file
* changelog
* suggestions
* Update docs/source/metrics.rst
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
* fix for 4266
* Update docs/source/metrics.rst
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update CHANGELOG.md
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-11-02 19:44:49 +00:00
chaton
102fa9ee7d
[BUGFIX] AMP + Precision unscale grad ( #4441 )
...
* move unscale within Native plugin
* remove gradient tracking from lightning backward
* forgot trainer.fit
* typo
* update
* cleanup
* set to 1.6
* typo
* skip if below 1.6 strict
* update changelog
* remove useless code
* Update tests/plugins/test_amp_plugin.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* Update tests/plugins/test_amp_plugin.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* update changelog
* Update CHANGELOG.md
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-02 16:36:48 +00:00
Jirka Borovec
ef03c39ab7
Add step index in checkpoint name ( #3807 )
...
* true final value of global step
* ch check
* tests
* save each validation interval
* wip
* add test
* add test
* wip
* fix tests, revert old edits, fix merge conflicts, update doctests
* test + bugfix
* sort files
* format test
* suggestion by ananth
* added changelog
* naming
* docs
* example
* suggestion
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* fix test
* pep
* pep
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2020-11-02 15:05:58 +01:00
Lezwon Castelino
839813eb7b
timeout for tpu check ( #4340 )
...
* timeout for tpu check
* added tests
* updated CHANGELOG.md
* fixed windows tests
* Update pytorch_lightning/utilities/xla_device_utils.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* requested changes
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-11-01 01:04:25 +01:00
Dusan Drevicky
38bb4e2da0
[Metrics] Add multiclass auroc ( #4236 )
...
* Add functional multiclass AUROC metric
* Add multiclass_auroc tests
* fixup! Add functional multiclass AUROC metric
* fixup! fixup! Add functional multiclass AUROC metric
* Add multiclass_auroc doc reference
* Update CHANGELOG
* formatting
* Shorter error message regex match in tests
* Set num classes as pytest parameter
* formatting
* Update CHANGELOG
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-10-30 19:56:13 +01:00
Jeff Yang
0f584faa6b
PyTorch 1.7 Stable support ( #3821 )
...
* prepare for 1.7 support [ci skip]
* tpu [ci skip]
* test run 1.7
* all 1.7, needs to fix tests
* couple with torchvision
* windows try
* remove windows
* 1.7 is here
* on purpose fail [ci skip]
* return [ci skip]
* 1.7 docker
* back to normal [ci skip]
* change to some_val [ci skip]
* add seed [ci skip]
* 4 places [ci skip]
* fail on purpose [ci skip]
* verbose=True [ci skip]
* use filename to track
* use filename to track
* monitor epoch + changelog
* Update tests/checkpointing/test_model_checkpoint.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-30 15:42:14 +00:00
Jeff Yang
48e0b33d56
[Changelog] 1.0.4 ( #4440 )
...
* changelog 1.0.4
* changelog 1.0.4
2020-10-30 13:34:29 +00:00
Nicki Skafte
e0b856c105
[Metrics] Confusion matrix class interface ( #4348 )
...
* docs + precision + recall + f_beta + refactor
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
* rebase
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
* fixes
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
* added missing file
* docs
* docs
* extra import
* add confusion matrix
* add to docs
* add test
* pep8 + isort
* update tests
* move util function
* unify functional and class
* add to init
* remove old implementation
* update tests
* pep8
* add duplicate
* fix doctest
* Update pytorch_lightning/metrics/classification/confusion_matrix.py
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* changelog
* bullet point args
* bullet docs
* bullet docs
Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <55400948+s-rog@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-30 11:44:25 +01:00
Adrian Wälchli
d1234c592d
deprecate passing ModelCheckpoint instance to Trainer(checkpoint_callback=...) ( #4336 )
...
* first attempt
* update tests
* support multiple
* test bugfix
* changelog
* pep
* pep
* import order
* import
* improve test for resuming
* test
* update test
* add references test
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* docstring suggestion deprecation
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
* paramref
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-30 04:47:37 +01:00
Martin Hwang
b459fd26ac
fix: set `nb` to the total number of devices when `nb` is -1 ( #4209 )
...
* fix: set `nb` to the total number of devices when `nb` is -1.
Refs: #4207
* feat: add test code
1. test the combination of the `auto_select_gpus` and `gpus` options using Trainer
2. test `pick_multiple_gpus` function directly
Refs: #4207
* docs: modify contents in `Select GPU devices`
Refs: #4207
* refactor: reflect the result of review
Refs: #4207
* refactor: reflect the result of review
Refs: #4207
* Update CHANGELOG.md
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <55400948+s-rog@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-10-29 10:50:37 +01:00
Boris Dayma
ff41d80706
feat(wandb): log in sync with Trainer step ( #4405 )
...
* feat(wandb): log in sync with Trainer step
* docs: update CHANGELOG
* style(test_wandb): fix formatting
* parentheses
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-29 01:07:06 +05:30
Carlos Mocholí
00cc69aed7
Add "monitor" to saved ModelCheckpoints ( #4383 )
...
* Add key
* Remove unused variables
* Update CHANGELOG [skip ci]
* best_model_monitor -> monitor
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-28 15:21:08 +05:30
Dusan Drevicky
c50c225f05
feature: Allow str arguments in Trainer.profiler ( #3656 )
...
* allow trainer's profiler param to have a str value
* add tests
* update docs
* update exception message
* Update CHANGELOG
* fix pep8 issues
* cleanup test code
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Add deprecation warning if using bool for profiler
* Add deprecation tests and move deprecated tests
* Remove bool option to profiler from docs
* Deprecate bool args to profiler in CHANGELOG
* fixup! Add deprecation warning if using bool for profiler
* fixup! Add deprecation tests and move deprecated tests
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Implement suggestions, remove whitespace
* fixup! Implement suggestions, remove whitespace
* Allow bool, str (case insensitive), BaseProfiler
* Add info about bool deprecation to trainer
* fixup! Add info about bool deprecation to trainer
* Move deprecate todo to test_deprecated
* Test wrong profiler type, improve error message
* fixup! Test wrong profiler type, improve error message
* Update pytorch_lightning/trainer/connectors/profiler_connector.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Apply suggestions from code review
* Readd bool to profiler types, test cli profiler arg
* Remove extra whitespace in doc
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update deprecation versions
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-27 16:27:16 +05:30
Chenglu
8e3faa2da1
get help from docstring ( #4344 )
...
* Add getting help message from docstring
* Fix pep8 issue
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-10-26 23:38:58 +05:30
chaton
f07ee33db6
BUG - Wandb: Sanitize callable. ( #4320 )
...
* add _sanitize_callable_params
* add call on _val if callable
* clean code formatter
* resolve pep8
* default return function name
* resolve pep8
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update CHANGELOG.md
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-26 11:57:03 +00:00
Adrian Wälchli
376268f01e
Implement finalize for WandbLogger ( #4341 )
...
* wandb finish
* experiment
* upload at end of run
* changelog
* comment
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-10-26 11:22:09 +00:00
ananthsub
c8ccec7a02
Enable profilers to write to remote files with fsspec ( #4162 )
...
* Update profilers.py
Enable profilers to write to remote files with fsspec
* Update profilers.py
* Update CHANGELOG.md
* Update pytorch_lightning/profiler/profilers.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* formatting
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-25 18:21:42 +01:00
Dusan Drevicky
6ad299573f
[Metrics] Fix/4237 auc unstable reorder ( #4281 )
...
* Add deprecation warning for auc reorder
* Add test for deprecation warning for auc reorder
* Update CHANGELOG
* Add reorder deprecation warning to auc docstring
* Fix pep8 f-string error
* remove duplicate import
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-10-25 10:26:40 +01:00
Adrian Wälchli
28d45a26a3
Set correct device ids in DDP [wip] ( #4297 )
...
* repro
debug
c
d
dd
d
d
d
ads
d
d
d
f
rank
f
v
d
d
d
d
d
d
d
d
d
d
d
set
drop PL_DDP_PID
clean up
keep set gpus
revert
Revert "drop PL_DDP_PID"
This reverts commit 7d88cae469541ef19128f9c20919fd3a6f863039.
d
pid
gpus
clean up
clean up
misconfig?
misconfig
clean
clean
* fix pep
* changelog
* remove script
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-10-24 17:33:47 -04:00
Sean Naren
5641b266d5
Bug/4319 ddp checkpoint ( #4323 )
...
* Broadcast best model path to ensure we sync with main process + wait for main process to save
* Add barrier call to ensure all processes are in sync
* Added changelog commit
* Move sync of best model path/score to model checkpoint, keep barrier to ensure all processes complete
* Ensure we broadcast as tuple
* Add init check
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Removed model checkpoint code, added barrier to trainer to enforce we synchronize and wait for all processes to finish before completing training
* Add barrier within teardown call, removed horovod teardown to inherit from base accelerator
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2020-10-24 16:55:49 -04:00
ananthsub
f6efb712ed
Skip replacing dataloader sampler if it's already a distributed sampler ( #4273 )
...
* Update data_loading.py
* Update data_loading.py
* add test + update flag description
* add to changelog
* Update test_dataloaders.py
* fix-pickle
* Update test_dataloaders.py
* Added missing reference calls
* Update tests/trainer/test_dataloaders.py
* Apply suggestions from code review
* Update data_loading.py
* Update test_dataloaders.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-23 17:34:07 +01:00
Rohit Gupta
4c7ebdc32b
Add dirpath and filename parameter in ModelCheckpoint ( #4213 )
...
* Add dirpath and filename parameter in ModelCheckpoint
* remove old function
* chlog
* codefactor
* update tests
* docs
* fix doctest and added tests
* pathlib dirpath
* dep version and docs
* try fix doctest
* pep
* suggestions
Co-authored-by: carmocca <carlossmocholi@gmail.com>
* suggestions
* fix test
* pep
* trigger tests
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* suggestions
* try fix windows test
* add and update some tests
* trigger tests
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-10-23 09:59:12 +05:30
William Falcon
753362d0a4
enable ddp as a plugin ( #4285 )
...
* enable custom ddp plugin
* enable custom ddp plugin
Co-authored-by: chaton <thomas@grid.ai>
2020-10-22 05:15:51 -04:00
Mauricio Villegas
546476c704
Allow changing the logged step value in validation_step ( #4130 )
...
* Fix to bug identified in https://github.com/PyTorchLightning/pytorch-lightning/issues/4102
* update tests
* chlog
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-10-22 03:03:07 +05:30
Carlos Mocholí
2549ca40e6
Clean up optimizer code ( #3587 )
...
* Update optimizer code
* Update CHANGELOG
* Fix tuple of one list case
* Update docs
* Fix pep issue
* Minor typo [skip-ci]
* Use minimal match
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-21 21:12:48 +02:00
Carlos Mocholí
e0f9799dbf
Add strict option to lr_scheduler dict ( #3586 )
...
* Add strict option to lr_scheduler dict
* Update docs
* Unnecessary "else" after "raise"
* Update CHANGELOG
* Fix rebase
2020-10-21 14:14:37 +02:00
Nicki Skafte
3a38294d6d
New section for changelog [ci skip] ( #4279 )
...
* changelog
* Apply suggestions from code review
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-10-21 11:54:02 +02:00
Jirka Borovec
e0e402dbe6
Docs/changelog for 1.0.3 ( #4267 )
...
* formatting
* miss
* missing & ver++
* path
2020-10-21 00:53:10 +02:00
Yigit Ozen
fc23c1be74
Fix 1.0.0 changelog ( #4180 )
2020-10-15 20:36:54 +02:00
Jirka Borovec
8f2bba10e1
update chlog ( #4177 )
2020-10-15 11:42:10 -04:00
Jirka Borovec
4204ef7b53
Bugfix/4156 filter hparams for yaml - fsspec ( #4158 )
...
* add test
* fix
* sleepy boy
* chlog
* Apply suggestions from code review
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-10-15 16:53:42 +02:00