Jirka Borovec
e1955e3c89
isolate PL debugger in tests ( #4643 )
...
* isolate PL debugger in tests
* miss
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-14 11:22:56 +00:00
Justus Schock
e04e7c9ecc
Makes automatic optimization a model attribute ( #4602 )
...
* Makes automatic optimization a model attribute
* Update trainer.py
* remove setting property in model
* Update pytorch_lightning/core/lightning.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Update pytorch_lightning/trainer/trainer.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Update trainer.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-14 11:13:42 +06:30
Justus Schock
144a5c9913
Increase parity to match logging refactor ( #4651 )
...
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-14 10:33:30 +06:30
Espen Haugsdal
fa88905af0
Fix docs typo: train_batch => val_batch ( #4659 )
...
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-14 08:23:11 +06:30
ananthsub
d096a2ea6d
Fix setup callback hook to pass LightningModule through ( #4608 )
...
* Fix setup callback hook
* Update CHANGELOG.md
* Update test_trainer.py
* Update test_trainer.py
* Update test_trainer.py
* fix chlog
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-13 19:34:46 -05:00
Nathan Painchaud
2d78d9b84a
CI: Added isort import check for the code on pull-request ( #4242 )
...
* added isort CI job and updated isort config
* changed CI check output from files to full diff
* added isort pre-commit hook
* Added missing first party and restricted files affected by isort
* Applied isort to root-level, docs and benchmarks
* Apply suggestions from code review
Co-authored-by: Nathan Painchaud <nathanpainchaud@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-13 22:57:46 +01:00
Jeff Yang
baa8558cc0
logger docs and api docs ( #3950 )
...
* logger and api docs
* remove gpu_usage_logger, lr_logger
* update docstring
* fix wandb example
* remove step result
* charts
* add some charts info
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-13 20:35:54 +05:30
Jirka Borovec
7940ea5aaf
CI: TPU drop install horovod ( #4622 )
...
Co-authored-by: chaton <thomas@grid.ai>
2020-11-13 11:33:52 +01:00
chaton
4018237c30
[FEAT] Add lambda closure to manual_optimizer_step ( #4618 )
...
* added lambda_closure
* move to types
* add 2 new tests
* make example more complex
* add complex example to doc
* added more tests
* resolve doc
* typo
* update
* update tpu optimizer_step
* Apply suggestions from code review
* Update pytorch_lightning/core/lightning.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-12 19:22:06 +00:00
Sean Naren
bacabaebaf
Sharded Accelerator 1/n: Expose clip gradients to plugins via abstract class ( #4639 )
...
* Added abstract precision plugin to expose clip_gradients function, use within accelerator to clip gradients
* Exclude model from override, keep optimizer (needed for sharded clip gradients), add override for O2 support apex
* Fix doc
* Applied codereview changes
* Refactored clip function to encapsulate tpu changes with tpu accelerator. Default to standard clip function for vanilla torch
* Pass correct grad clip val
* Moved var to property
* Apply code review suggestions
2020-11-12 17:18:09 +00:00
chaton
4a01fd048c
[FIX] Average Pbar Metrics ( #4534 )
...
* wip
* update
* normalize loss
* update test
* resolve bug
* update test and add TODO
* make sure it can be sync
* add TODO
* update sol
2020-11-12 15:59:01 +00:00
Jirka Borovec
bd6c413829
Conda: PT 1.8 ( #3833 )
...
* PT 1.8
* unfreeze PT
* drop nightly from full
* add PT 1.8 to workflow
* readme table
* cuda
* skip cuda
* test 1.8
* unfreeze torch vision
Co-authored-by: ydcjeff <ydcjeff@outlook.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-12 15:03:43 +01:00
chaton
35f00df176
[FEAT] Add pytest section to Contribution how to ? ( #4633 )
...
* update contributing
* formatting
2020-11-12 11:48:54 +00:00
Jeff Yang
79fc92647c
[make] Create Makefile ( #4620 )
...
* [make] Create Makefile
* exclude makefile
* contributing info
* rm .run_local_test.sh
2020-11-12 09:25:31 +00:00
Jirka Borovec
396a18eb78
update changelog after 1.0.6 ( #4624 )
...
* update changelog after 1.0.6
* fix formatting
2020-11-12 09:21:57 +01:00
Marc Ferradou
bff99ee159
Small typo correction on CONTRIBUTING.md ( #4625 )
...
* Update CONTRIBUTING.md
Small typo correction.
* Update .github/CONTRIBUTING.md
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-11-12 08:59:33 +01:00
Sean Naren
33470ba605
Prevent crash if sync_dist=True on CPU ( #4626 )
...
* Added test/fix for sync_dist raising NotImplementedError
* Fixed comments/formatting
* Revert base class change, enforce sync tensors across accelerators, added GPU test
2020-11-11 22:04:05 +00:00
chaton
3d202f9ecc
[FEAT] Refactor logging 3/3 [v1] ( #4552 )
...
* wip
* wip check how many tests break
* wip
* resolve some bugs
* resolve more bugs
* resolve 2 bugs
* resolve
* temp fix
* update
* remove useless code
* remove result
* try to resolve bug
* update changelog
* formatting
* remove pl
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-11 17:05:24 +00:00
chaton
514cb22bd7
[Fix] Move log value to cpu. ( #4592 )
...
* move value to cpu to save memory
* update
* move to cpu
* try something
* update
* update
* add back out_dict.update({k: v})
* add move_metrics_to_cpu
* update
* Update pytorch_lightning/utilities/memory.py
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* resolve comments
* Update pytorch_lightning/core/step_result.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-10 21:13:41 +00:00
chaton
7e08b0d710
[bug-fix] DDP and automatic_optimization=False ( #4485 )
...
* resolve bug
* add self._running_manual_optim
* update
* update tests
* update lightning module
* resolve bug
* update tests
* update
* resolve pep8
* update
* replace by `ddp_spawn`
* temporary fix
* update
* update
* move update to training_loop
* make both ddp_spawn
* introduce `manual_optimizer_step`
* update changelog
* added changelog wrong place
* add force_optimizer_step
* update docstring for tests
* update optimizer_step
* update zero_grad
* resolve flake8
* move update into manual_optimizer_step
* add zero_grad
* remove zero_grad tests
* remove manual_backward in AMP, it doesn't help
* update
* loosen tests
* update
* update doc
* add TODO
* Removed unnecessary get model from native amp
* Remove try except with pytest raise
* Add seed, clean up imports, remove try catch to reproduce error
* update code
* update test
* revert back
* formatting
* Update pytorch_lightning/core/lightning.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-10 19:44:51 +00:00
Jirka Borovec
abf1d4b992
fix mock pkgs in docs ( #4591 )
...
* fix mock pkgs in docs
* sphinx
* CI
Co-authored-by: chaton <thomas@grid.ai>
2020-11-10 14:57:21 +01:00
maxjeblick
343d19fa86
Find parameters which are specified in the LightningDataModule, only ( #4347 )
...
* search for attribute in datamodule if not found elsewhere
* add test for datamodule
* add lightning_getattr test for datamodule
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update CHANGELOG.md
* Update CHANGELOG.md
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-11-10 14:01:20 +01:00
Diedre Carmo
470e2945fc
fix logged keys in mlflow logger ( #4412 )
...
* [#4411 ] fix gpu_log_memory with mlflow logger
* sanitize parenthesis instead of removing for all loggers
* apply regex for mlflow key sanitization
* replace ',' with '.' typo
* add single warning and test
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-10 17:20:25 +05:30
Roger Shieh
11415faade
[req] Set min version for skimage for tests ( #4598 )
...
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-10 17:16:37 +06:30
Kai Zhang
30ad3e2ad3
Replace a MisconfigurationException with warning in ModelCheckpoint callback ( #4560 )
...
* replace MisconfigurationException with warning
* update test
* check raising UserWarning
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-10 10:44:43 +01:00
Nicki Skafte
465ec752f8
Metric ddp bugfix ( #4482 )
...
* changes
* fix spelling
* small note
* trying to fix ddp test
* fix ddp
* fix for test
* suggestion
* CHANGELOG
* Update pytorch_lightning/metrics/metric.py
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Sean Naren <sean@grid.ai>
2020-11-10 09:16:31 +01:00
Nicki Skafte
4f3160ba2e
Skip tuner algorithms on fast dev ( #3903 )
...
* skip on fast dev
* fix error
* changelog
* fix recursive issue
* combine tests
* pep8
* move logic to base funcs
* fix mistake
* Update pytorch_lightning/tuner/lr_finder.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* pep
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-10 00:34:42 +01:00
tarepan
41c9bee4f0
Fix load disparity between normal and hpc ( #4526 )
...
* Add missing load functionality in hpc
* Add general file load for hpc
* Add mark in CHANGELOG
* Fix Typo Li**hg**tning
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Refactor line separation
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Fix entangled fixation commit
* Fix naming of restore_model_states
* Fix amp restore place
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-09 17:26:38 +00:00
Jeff Yang
23719e3c05
[dockers] install nvidia-dali-cudaXXX ( #4532 )
...
* [dockers] install nvidia-dali-cuda100
* Apply suggestions from code review
* build DALI
* build DALI
* build DALI
* dali from source
* dali from source
* use binaries
* qq
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-09 21:18:24 +06:30
Stef | ステフ
4a6721af25
Missing TorchScript trace's update ( #4586 )
...
Co-authored-by: stef-ubuntu <stef@webempath.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-09 15:01:13 +01:00
Akihiro Nitta
45a695969a
Fix docstring ( #4585 )
...
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-09 19:52:47 +06:30
Jan Beitner
e01190e919
Adding pytorch-forecasting to community examples ( #4575 )
...
PyTorch Forecasting is a new library that is designed for time series forecasting practitioners and researchers alike.
It is based on the awesome work on PyTorch Lightning. Thanks a lot for creating such an asset!
Have a look at the documentation for more information.
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-09 12:33:44 +01:00
Nicki Skafte
01a925d333
[Docs] Note on running metric in dp ( #4494 )
...
* note
* Update docs/source/metrics.rst
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-09 11:30:28 +01:00
William Falcon
ee35907170
Accelerator docs ( #4583 )
...
* accelerator docs
* accelerator docs
2020-11-08 17:24:41 -05:00
William Falcon
3ba48d3bc4
ref: unify slurm and TE under backendPlugin 5/n ( #4582 )
...
* ref: unify slurm and TE under backendPlugin 4/n
* ref: unify slurm and TE under backendPlugin 5/n
2020-11-08 16:20:19 -05:00
William Falcon
624f5b5938
ref: unify slurm and TE under backendPlugin 3/n ( #4581 )
2020-11-08 15:32:37 -05:00
William Falcon
bfaf014096
ref: unify slurm and TE under backendPlugin 2/n ( #4580 )
2020-11-08 15:07:16 -05:00
William Falcon
0f64f15f52
ref: unify slurm and TE under backendPlugin 1/n ( #4578 )
...
* ref: unify slurm and TE under backendPlugin
* ref: unify slurm and TE under backendPlugin
2020-11-08 14:28:55 -05:00
William Falcon
09a51697ed
Adds shortcut for path to log ( #4573 )
...
* added log_dir shortcut to trainer properties for writing logs
* added log_dir shortcut
2020-11-08 12:16:22 -05:00
William Falcon
f63fec9323
updated trainer docs ( #4571 )
2020-11-07 15:41:02 -05:00
William Falcon
e0bdf8124b
updated trainer docs ( #4570 )
2020-11-07 14:53:04 -05:00
William Falcon
bb356a73cb
added trainer api docs ( #4569 )
2020-11-07 14:18:45 -05:00
chaton
854c13673b
add congratulations at the end of our notebooks ( #4555 )
...
* add congratulations at the end of our notebooks
* update image
2020-11-07 12:05:29 +00:00
Indrayana Rustandi
6e5f232f5c
Add Dali MNIST example ( #3721 )
...
* add MNIST DALI example, update README.md
* Fix PEP8 warnings
* reformatted using black
* add mnist_dali to test_examples.py
* Add documentation as docstrings
* add nvidia-pyindex and nvidia-dali-cuda100
* replace nvidia-pyindex with --extra-index-url
* mark mnist_dali test as Linux and GPU only
* adjust CUDA docker and examples.txt, fix import error in test_examples.py
* adjust the GPU check
* Exit when DALI is not available
* remove requirements-examples.txt and DALI pip install
* Refactored example, moved to new logging api, added runtime check for test and dali script
* Patch to reflect the mnist example module
* add req.
* Apply suggestions from code review
* Removed requirement as it breaks CPU install, added note in README to install DALI
* add DALI to Drone
* test examples
* Apply suggestions from code review
* imports
* ABC
* cuda
* cuda
* pip DALI
* Move build into init function
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-06 14:53:46 +00:00
Jeff Yang
f3dfb98444
[ci] tag v1.4.1 for pypa/gh-action-pypi-publish ( #4548 )
2020-11-06 10:48:27 +00:00
cool425589
5e09fd31e9
show progressbar only on progress_rank 0 on ddp_slurm ( #4437 )
...
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-11-06 01:36:22 +01:00
chaton
9c8701f2e2
[feat] Logging refactor 2/n - train ( #4495 )
...
* update logging
* solve more bugs
* replace Mapping by Dict
* update on comments
* resolve pep8
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update on comments
* typo
* update for coverage
* update test
* update
* Update tests/models/test_hooks.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* Update tests/models/test_hooks.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* update on comments
* remove deepcopy
* remove useless look for
* another small optim
* extra optim
* remove latest optim, can be source of bug
* resolve bug
* add docstring
* optimize coverage
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/trainer/connectors/logger_connector/epoch_result_store.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/trainer/logging_tests/test_distributed_logging.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/trainer/evaluation_loop.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/trainer/logging/test_logger_connector.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/trainer/logging_tests/test_train_loop_logging_1_0.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update on comments
* update
* update on comments
* update parity speed
* get it down to 0.65
* update
* 0.8 max_dif
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-11-05 22:27:04 +00:00
Jirka Borovec
62ea4614f3
update PR template ( #4523 )
...
* update PR template
* Update .github/PULL_REQUEST_TEMPLATE.md
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
* Apply suggestions from code review
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
2020-11-05 22:05:27 +01:00
Travis Addair
51cc7a89ee
Horovod: fixed early stopping and added metrics aggregation ( #3775 )
...
* Fixed early stopping for Horovod
* Refactored to sync_dist_if_available
* Bump min Horovod version to support hvd.is_initialized
* Changelog
* Added back change for Horovod
* Removed redundant checks for initialization
* Implement metrics gathering for Horovod
* Added test for EvalResult
* Renamed ddp_sync_on_step -> dist_sync_on_step
* Added metric test for Horovod
* Added option pass callable allgather function to metric base class
* Added dist_sync_fn
* Fixed calls to private _sync_dist
* Fixed Horovod test
* Added sync_tensor to the distributed backend
* Skip Windows
* Insert test path
* Removed redundant import
* Updated drone
* Unset HOROVOD_GPU_ALLREDUCE
* Unset
* No cache dir
* No uninstall
* Unset variables
* Uninstall Horovod during initialization
* Replaced more references to ddp_sync_on_step
* Fixed imports
* Fixed attribute
* Added back default
* Lint
* Added back docstring
* Made gather_all_tensors default
* Added whitespace
* Update tests/models/test_horovod.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/metrics/metric.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update CHANGELOG.md
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-05 12:52:02 -05:00
Jeff Yang
e81707ba02
[dockers] use inline cache ( #4511 )
...
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-04 23:08:17 +01:00