thomas chaton
1a6dcbd422
[bugfix] Resolve Kineto Profiler for Conda ( #7376 )
2021-05-05 11:54:16 +00:00
ananthsub
98670c83a9
Deprecate `truncated_bptt_steps` flag on Trainer in favor of same setting on the LightningModule ( #7323 )
...
* deprecate-tbptt-trainer
* Update CHANGELOG.md
* Update lightning.py
* test
* Update lightning.py
* Update training_loop.py
* Update training_loop.py
* Update lightning.py
* Update training_loop.py
* Update training_loop.py
* update docs
* Update accelerator.py
* Update accelerator.py
* more docs
* tweaks
* chlog
* comments
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-05-05 11:21:00 +01:00
Jirka Borovec
573a5a8a34
update building latest XLA 1.8 ( #7359 )
...
* wip
* XLA
* .
2021-05-05 10:01:03 +01:00
William Falcon
a4abb62482
Update README.md
2021-05-04 21:54:33 -05:00
Christfried Focke
763a9a9495
Fix Namespace loading in PyYAML 5.4.x ( #6673 )
...
* Fix Namespace loading in PyYAML 5.4.x
* Remove OmegaConf reference from PyYAML requirements
* Max allowed version for pyyaml
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-04 22:56:11 +00:00
Kaushik B
e21b7a62d7
Add ddp_find_unused_parameters_false to Registry ( #7224 )
2021-05-04 22:40:00 +00:00
Jirka Borovec
df579a842a
set min PT version for legacy ( #7358 )
2021-05-04 17:50:12 -04:00
Jirka Borovec
bac4656eca
fix readme badges ( #7354 )
...
* fix readme badges
* Apply suggestions from code review
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
2021-05-04 16:37:26 -04:00
Carlos Mocholí
374ff750f5
Pass `current_epoch`/`global_step` as monitor candidates [1/2] ( #7344 )
...
* Pass `current_epoch`/`global_step` as monitor candidates
* Formatting
* Fix deprecated test
* Update CHANGELOG
2021-05-04 16:05:40 -04:00
Jirka Borovec
bc06623ff0
temp suspend NVIDIA CI build ( #7350 )
...
* temp suspend NVIDIA CI build
* just skip
* todo
* if: false
2021-05-04 15:22:02 -04:00
Jirka Borovec
839b206164
add CI event published ( #7353 )
2021-05-04 14:32:16 -04:00
Louis Taylor
b64aea637c
CI: move azure-pipelines config to separate directory ( #7276 )
...
* CI: move azure pipelines to separate directory
This removes some extra clutter in the top level as we add more
pipelines.
* rename
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-04 10:50:16 -04:00
Ethan Harris
2a740ebe77
Fix support for dataloader with None batches ( #7342 )
...
* Fix Dataloader None batch
* Fix Dataloader None batch
* Update CHANGELOG.md
* Fix breaking test
* Address comments
2021-05-04 12:24:03 +00:00
ramonemiliani93
5db832f181
Fix auto scaling mode when calling tune method on trainer. ( #7321 )
...
* Add test for non-existing mode; the test should fail if anything other than `power` or `binsearch` is passed.
* Add newline.
* Apply fix
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update tests/tuner/test_scale_batch_size.py
* Update pytorch_lightning/tuner/batch_size_scaling.py
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-05-04 12:03:51 +00:00
ananthsub
69cf63e2fd
Update trainer.py ( #7340 )
2021-05-04 11:11:27 +00:00
Carlos Mocholí
8c0ea92af2
`TrainerState` refactor [5/5] ( #7173 )
...
* `TrainerState` refactor
* flake8
* Update finished check
* Test cleanup
* Fix tests
* Fixes
* Reorder
* flake8
* Update CHANGELOG
* Better docs
* Better docs
* Remove default
* Update tests
* Bad merge
2021-05-04 12:50:56 +02:00
Adrian Wälchli
a6aa1a0f82
make gpus=str in Trainer consistent with command line parsing of string ( #6388 )
...
* string gpu input
* update docs
* deprecation warning
* Revert "update docs"
This reverts commit c5f3893413.
* deprecation
* add changelog
* update parser
* update warning
* implement v1.5 behavior ahead of time
* formatting
* set accelerator in test to avoid different warning
* add warning
* remove todo warn
* Update pytorch_lightning/utilities/device_parser.py
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
* resolve flake8
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
2021-05-04 09:56:27 +00:00
Boris Dayma
2a20102321
fix(wandb): allow custom init args ( #6989 )
...
* feat(wandb): allow custom init args
* style: pep8
* fix: get dict args
* refactor: simplify init args
* test: test init args
* style: pep8
* docs: update CHANGELOG
* test: check default resume value
* fix: default value of anonymous
* fix: respect order of parameters
* feat: use look-up table for anonymous
* yapf formatting
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-04 09:45:36 +00:00
Hemil Desai
82c19e1444
Update LR schedulers only when their corresponding Optimizer is being… ( #4868 )
...
* Update LR schedulers only when their corresponding Optimizer is being used.
In the case when optimizer frequencies are specified,
the LR scheduler corresponding to a particular optimizer is updated
only when that optimizer is being used in the training loop or epoch.
* pep8speak fixes
* Fix failing tests
* Add docs
* PR Feedback
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* formatting fix
* PR Feedback - part 2
* More PR feedback
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Add typing imports
* Stronger tests and fixes related to that
* Add more tests plus PR feedback
* Make optimizer_freq_cumsum a cached property
`@cached_property` is only available from Python 3.8 onward, so it had to be done manually.
* Fix tests
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Avoid mutable defaults
* Parametrize lr scheduling tests
* PR feedback
* Apply suggestions from code review
* spell
* Apply suggestions from code review
* flake8
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-05-04 09:37:40 +00:00
Adrian Wälchli
b780af51be
update test for resume_from_checkpoint on missing file ( #7255 )
2021-05-04 09:16:34 +00:00
Louis Taylor
d413bab5ac
Add initial IPU CI job ( #7251 )
...
This adds an azure-pipelines job so we can verify the runners are
connected correctly. Since the IPU branch isn't merged, it won't yet
give any actual IPU test coverage.
2021-05-04 08:19:41 +00:00
Carlos Mocholí
3fdb61ac1b
Replace `_DataModuleWrapper` with `__new__` [1/2] ( #7289 )
...
* Remove `_DataModuleWrapper`
* Update pytorch_lightning/core/datamodule.py
* Update pytorch_lightning/core/datamodule.py
* Replace `__reduce__` with `__getstate__`
2021-05-04 08:00:24 +00:00
Leonard Lausen
597b309f2e
Fix `Trainer.plugins` type declaration ( #7288 )
...
* Fix trainer.plugins type declaration
* Don't ClusterEnvironment(Plugin)
* fix import error, yapf formatter
* Add test
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-04 08:42:57 +02:00
SpontaneousDuck
f135debb6a
Clarify logger flag ( #7190 )
...
* Clarify logger flag
Clarify behavior of boolean values on the logger flag for Trainer.
* Update docs/source/common/trainer.rst
* doc
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-05-04 00:21:28 +00:00
Daniel Mesejo-León
6da747e775
Deprecate `LightningModule.datamodule` reference in favor of the trainer one ( #6929 ) ( #7168 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-05-04 00:01:41 +00:00
Adrian Wälchli
3e8db4142b
add forgotten test in #7240 ( #7283 )
...
^
2021-05-03 23:56:30 +00:00
Carlos Mocholí
c6a171b776
Fix requirements/adjust_versions.py ( #7149 )
...
Co-authored-by: jirka <jirka.borovec@seznam.cz>
2021-05-04 01:06:28 +02:00
Kaushik B
6d7c6d6403
Update Accelerator Connector for Registry ( #7214 )
2021-05-03 21:03:21 +00:00
ananthsub
b7a444883c
Remove model.trainer call inside of dataloading mixin ( #7317 )
...
* Update data_loading.py
* Update data_loading.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-03 13:53:54 -07:00
Mauricio Villegas
78a6fd5588
Example and documentation for LightningCLI linking model and data arguments ( #7299 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-03 20:45:46 +00:00
Adrian Wälchli
bf1394a472
improve early stopping verbose logging ( #6811 )
2021-05-03 20:20:48 +00:00
ananthsub
393b252ef0
Update CODEOWNERS ( #7302 )
...
* Update CODEOWNERS
* @carmocca
* @borda
* Update CODEOWNERS
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2021-05-03 14:17:27 -05:00
ananthsub
14c552bb92
[bugfix] Fix dataloading for iterable datasets and limit_train_batches ( #7306 )
...
* bugfix-dataloading
* rm-logs
* Update CHANGELOG.md
* Update test_dataloaders.py
* Update test_dataloaders.py
* Update training_loop.py
* Update test_dataloaders.py
* Update CHANGELOG.md
* Update CHANGELOG.md
* Update test_dataloaders.py
* Update training_loop.py
* Update training_loop.py
* comments
* address comments
* more tests
* Update progress.py
* Update test_dataloaders.py
* Update test_dataloaders.py
* Update training_loop.py
* Update training_loop.py
* test ckpt fix?
* update again
2021-05-03 19:50:26 +01:00
Adrian Wälchli
7636d422fa
Update DeepSpeed version requirement in Dockerfile ( #7326 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-05-03 20:21:19 +02:00
ananthsub
39274273a4
Update accelerator.py ( #7318 )
2021-05-03 11:17:26 -04:00
Carlos Mocholí
badd0bba30
Move trainer functions ( #7295 )
2021-05-03 09:26:38 -04:00
Adrian Wälchli
e0c64f0ef6
Fix Adagrad optimizer not working with DDP/GPU ( #7277 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-05-03 03:57:17 +05:30
William Falcon
29357ba94e
Update README.md
2021-05-01 13:55:07 -04:00
Kaushik B
490cc57809
Device updates for TPU Pod ( #7243 )
2021-04-30 23:14:06 +05:30
thomas chaton
16d6c9828d
[bugfix] Apex never instantiated. ( #7274 )
...
* update
* update
* update apex
* update
* update
* update
* remove test.py
* update
* update
* update on comments
* update changelog
* update
* update
* typo
2021-04-30 13:16:28 -04:00
ananthsub
44fd01734c
Move grad_norm to a dedicated utilities file ( #7292 )
...
* rm-grad-norm-mixin
* Update grads.py
* Update CHANGELOG.md
* Apply suggestions from code review
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update docstrings
* Update __init__.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-04-30 09:19:22 -07:00
ananthsub
e407edba36
[fix] Attach train+val dataloaders to trainer in trainer loop ( #7207 )
...
* Update training_loop.py
* Update test_dataloaders.py
* changelog
* delay reload
* go back
* comments
* Update training_loop.py
* Update test_dataloaders.py
* Update tests/trainer/test_dataloaders.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-04-30 09:01:31 -07:00
thomas chaton
80b9ca0e38
[bugfix] Add reloading support using BaseFinetuning ( #7253 )
...
* update
* wip
* update
* update
* update
* update
* resolve bug
* update on comments
* update on comments
* update
* update
* formatting
* add comments
* update on comments
* update
* Update pytorch_lightning/callbacks/base.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* update
* update
* Typing and minor changes
* Refactor
* Fix deprecated test
* Broken commit
* Fix broken commit
* flake8
* Update CHANGELOG
* update on comments
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-04-30 11:14:43 -04:00
Carlos Mocholí
5af086ab9f
Attach data refactor and tuner bugs [4/n] ( #7258 )
...
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-04-30 13:54:58 +00:00
Adrian Wälchli
ea2287e723
update training type plugin docs regarding result caching ( #7261 )
...
* add docs
* typo
* update
2021-04-30 13:03:10 +00:00
Adrian Wälchli
b9b3fa371f
fix case where an IterableDataset doesn't produce a batch for an epoch ( #7294 )
...
* wip
* fix
* add test
* refactor + test
* rm
* formatting
* update changelog
* doc
* docstring
* remove unused import
* Update CHANGELOG.md
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-04-30 12:45:55 +00:00
ananthsub
969e857690
Rename `trainer._launch` to `trainer._run` ( #7265 )
...
* rename-run
* fix
2021-04-30 13:39:02 +01:00
Adrian Wälchli
8232de427a
fix save_hyperparameters(container) if container is empty ( #7268 )
...
* fix
* add tests
* changelog
* fix test
2021-04-30 13:38:42 +01:00
PythicCoder
8bffa4f0ca
Updated docs to fix typo and update grid status ( #7270 )
...
* Updated docs to fix typo and update grid status
* Update docs/source/starter/new-project.rst
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Update docs/source/starter/new-project.rst
* Update docs/source/starter/new-project.rst
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-04-30 12:45:17 +01:00
Kaushik B
ac92b57e2b
No need of warning when saved callback_states is None ( #7293 )
2021-04-30 10:48:53 +00:00