1413 Commits

Author SHA1 Message Date
Rohit Gupta a642349228
Support limit_mode_batches (int) for infinite dataloader (#2840)
* Support limit_mode_batches(int) for infinite dataloader

* flake8

* revert and update

* add and update tests

* pep8

* chlog

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Add suggestions by @awaelchli

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* Apply suggestions from code review

* fix

* max

* check

* add and update tests

* max

* check

* check

* check

* chlog

* tests

* update exception message

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 13:02:36 +02:00
Nathan Hunt 234e2b590f
Use .comet.config file for CometLogger (#1913)
* Use .comet.config file or env var for API key.

* Make CometLogger API key changes backwards compatible.

* Fix line too long.

* Add documentation about loading from ~/.comet_config.

* Update required comet_ml version.

* Comet logger: allow offline experiments with config file.

This adds a new argument to the logger to control the online / offline mode explicitly so that if you give an API key and a save_dir (e.g. to control where checkpoints go while having ~/.comet.config) you can specify which mode you want.

* Make CometLogger API key changes backwards compatible.

* Comet logger: change online argument to be offline.

For consistency with other loggers.

* chlog

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 09:46:50 +02:00
Nima Sarang 793036d29c
Support returning python scalars in DP (#1935)
* Override the default gather method to support scalars

* add computing average of a list

* bug: change if to elif

* add some tests

* change style

* change documentation

* use apply_to_collection in DP gather

* use apply_to_collection in DP gather

* fix warning msg

* override gather method in DP

* add tests for python scalars

* add python scalars to docstring

* Update message

* override gather method in DP

* formatting

* chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-07 09:18:29 +02:00
Nicki Skafte 9a402461da
Bugfix: Lr finder and hparams compatibility (#2821)
* fix hparams lr finder bug

* add tests for new functions

* better tests

* fix codefactor

* fix styling

* fix tests

* fix codefactor

* Apply suggestions from code review

* modified hook

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-07 00:34:48 +02:00
xmotli02 767c44950c
Added basic file logger (#2721)
* Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* fixup! Added basic file logger #1803

* csv

* Apply suggestions from code review

* tests

* tests

* tests

* miss

* docs

Co-authored-by: xmotli02 <xmotli02@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-06 06:08:25 -04:00
Younghun Roh ac4a215071
Faster Accuracy metric (#2775)
* Faster classification stats

* Faster accuracy metric

* minor change on cls metric

* Add out-of-bound class clamping

* Add more tests and minor fixes

* Resolve code style warning

* Update for #2781

* hotfix

* Update pytorch_lightning/metrics/functional/classification.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update about conversation

* Add docstring on stat_scores_multiple_classes

Co-authored-by: Younghun Roh <yhunroh@mindslab.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-06 11:40:35 +02:00
William Falcon dd78be516a
Update __init__.py 2020-08-05 20:45:11 -04:00
Justus Schock fe29c53ab5
add ddp sync for logging in result step (#2822)
* add ddp sync for logging in result step

* pep8

* pep8

* make ddp tests run also on cpu (except windows)

* create class instance in ddp test

* revert automated formatting

* pep8
2020-08-05 20:42:09 -04:00
William Falcon b507c42c47
clarify batch hooks (#2842)
* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook

* modified hook
2020-08-05 20:01:30 -04:00
Ananya Harsh Jha a5f2b89ed0
updated sync bn (#2838)
* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* updated sync bn

* added ddp_spawn test

* updated test

* clean

* clean

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-06 01:12:11 +02:00
William Falcon 633cf76c68
Update __init__.py 2020-08-05 15:58:27 -04:00
William Falcon 5d0f0325d8
Revert "Support limit_mode_batches (int) for infinite dataloader" (#2839)
* Revert "Support limit_mode_batches (int) for infinite dataloader (#2787)"

This reverts commit de9c9f0864.

* Update training_tricks.py
2020-08-05 15:57:26 -04:00
Ruotian(RT) Luo bef27c58ed
save apex scaler states (#2828) 2020-08-05 13:43:50 -04:00
Ruotian(RT) Luo 6034d5e37d
fix apex gradient clipping (#2829) 2020-08-05 13:42:21 -04:00
Jeff Yang 5bbcb8db1f
Improve SSIM (#2833)
* make ssim fast

* remove padding

* pep8

* add comments for readability

* plus -> coef
2020-08-05 13:40:11 -04:00
William Falcon 2cbb1496d0
Update __init__.py 2020-08-05 13:37:11 -04:00
Ananya Harsh Jha e31c520c21
add support for sync_bn (#2801)
* initial commit for sync_bn

* updated changelog

* tests

* tests

* ddp tests hanging with script tests

* updated trainer

* updated params

* test

* passing tests

* passing tests

* passing tests

* passing tests

* tests

* removed apex

* doc

* doc

* doc

* doc

* docs

* tests

* tests

* tests
2020-08-05 13:29:05 -04:00
Rohit Gupta de9c9f0864
Support limit_mode_batches (int) for infinite dataloader (#2787)
* Support limit_mode_batches(int) for infinite dataloader

* flake8

* revert and update

* add and update tests

* pep8

* chlog

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Add suggestions by @awaelchli

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* Apply suggestions from code review

* fix

* max

* check

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-05 17:04:49 +00:00
Nicki Skafte b2a7d7580c
Docs for auto_select_gpu (#2836)
* added docs

* Update docs/source/multi_gpu.rst

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* testcode change to example

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-05 12:28:33 +00:00
s-rog 232b141cde
check if checkpoint_callback exists (#2832)
check checkpoint_callback before setting best_model_path
2020-08-05 06:32:14 -04:00
Nicki Skafte e3732789d7
Add remaining sklearn metrics (#2562)
* added balanced accuracy

* added dcg score

* added mean absolute error

* added mean squared error

* fix

* added mean squared log error

* add median absolute error and r2 score

* switch arguments

* added mean poisson deviance

* add mean gamma deviance and mean tweedie deviance

* fix styling

* added explained variance score

* added cohen kappa score

* added hamming, hinge, jaccard

* fix styling

* update sklearn requirement to newer version

* update requirement

* fix doctest

* fix tests

* added balanced accuracy

* added dcg score

* added mean absolute error

* added mean squared error

* fix

* added mean squared log error

* add median absolute error and r2 score

* switch arguments

* added mean poisson deviance

* add mean gamma deviance and mean tweedie deviance

* fix styling

* added explained variance score

* added cohen kappa score

* added hamming, hinge, jaccard

* fix styling

* update sklearn requirement to newer version

* fix doctest

* fix tests

* fix doctest

* fix failing docs

* fix test

* trying to fix errors

* Apply suggestions from code review

* format

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-05 11:32:53 +02:00
Justus Schock ad0f1194aa
Support Mean in DDP Sync (#2568)
* Update converters.py

* Update test_converters.py

* pep8

* pep8 tests

* Update test_datamodules.py

* Update test_converters.py

* Update converters.py

* Update test_datamodules.py

* Update test_converters.py

* Update test_converters.py

* fix tests

* fix ddp tests on windows

* chlog

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-08-04 18:32:20 +02:00
siahuat0727 38d6b2598e
Fix docs typo (#2778) 2020-08-04 17:46:35 +02:00
William Falcon a55c481d5d
Update __init__.py 2020-08-03 19:57:18 -04:00
Nathan Raw 1c4244e1ff
🐛 fix dm prepare_data call (#2811) 2020-08-03 19:39:01 -04:00
Rohit Gupta 6b9c548bab
docs update and follow up of #2789 (#2797)
* docs update and follow up of #2789

* pep8

* Update trainer.py

* Update trainer.py

Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-03 19:57:21 +00:00
Vlad Lialin ed8a01afb0
More clear docstring for `val_check_interval` (#2802)
* More clear docstring for `val_check_interval`

* Update trainer.py
2020-08-03 09:13:05 -04:00
siahuat0727 58ca33f194
Fix docs typo (#2803)
* Fix docs typo

* Fix docs typo
2020-08-03 14:55:17 +02:00
Santiago Castro 471f2b80af
Fix import deprecation warning (#2800) 2020-08-02 15:20:11 -04:00
William Falcon d85de32dcf
Update __init__.py 2020-08-02 08:18:44 -04:00
William Falcon a0c4365278
Gpu idx (#2796)
* ddp refactor
2020-08-02 08:13:31 -04:00
Jirka Borovec b01ad75700
missing chlogs (#2672)
* missing

* miss

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* miss

* note

* notes

* update CI testing with pip upgrade (#2380)

* try pt1.5

* cpu

* upgrade

* tpu

* user

* [blocked by #2380] freeze GPU PT 1.4 (#2780)

* freeze

* user

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-02 12:34:36 +02:00
bkhakshoor 96eb6ebacd
fix shell injection vulnerability in subprocess call (#2786) 2020-08-01 23:25:57 -04:00
pwwang c600ca65ae
Fix false num_classes warning in metrics (#2781)
* Fix num_classes warning

Put to_categorical before get_num_classes in metrics/functional/classification.py

* Update classification.py

Remove whitespaces in blank line.
2020-08-01 23:24:19 -04:00
Rohit Gupta 8baec1a191
Fix shuffle for distributed sampler (#2789)
* Fix shuffle for distributed sampler

* add test

* test

* chlog

* update test

* update test

* update test

* assertions via callback

* define callback outside for pickling

* skip ddp test on windows

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-01 23:22:57 -04:00
Iz Beltagy 38fce2ea68
fix selecting GPUs using CUDA_VISIBLE_DEVICES (#2739)
* fix https://github.com/PyTorchLightning/pytorch-lightning/issues/2407

* Update pytorch_lightning/trainer/distrib_data_parallel.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-01 23:21:15 -04:00
William Falcon 7da7d2e428
callback docs (#2794)
* added logging docs

* added logging docs

* added logging docs

* added logging docs
2020-08-01 22:56:34 -04:00
William Falcon 1d811d0d11
Resultdocs (#2793)
* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs
2020-08-01 22:31:56 -04:00
William Falcon eb66cae55d
Update __init__.py 2020-08-01 20:24:02 -04:00
Nathan Raw 036bcea499
Call DataModule hooks implicitly in trainer (#2755)
* call dm hooks in trainer implicitly

* update tests

* 📝 remove unused stage arg from dm docs

* update tests

* update tests

* 🚧 include stage in datamodule.setup

* 📝 docs

* 📝 docs

* added more dm tests

* added more dm tests

* 🐛 call dm.setup everywhere

* 🔥 pickle tests now implied by accelerator tests

* 🎨 set dm as attr of trainer

* 🐛 .

* 🚧 wip

* add can prepare test

* add can prepare test

* verified setup in fit

* fixed setup call

* fixed setup call

* fixed setup call

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-01 20:17:57 -04:00
Thomas Schaaf a6719f09f0
Bugfix/torchtext include lengths (#2689)
* Test using torchtext.data.Field with include_lengths=True/False

* Fix issue that Tensors in a Batch generated by torchtext with torchtext.data.Field configured as include_lengths=True

* Add description for fix of issue #2688

* changes to accommodate CodeFactor issues

* Another attempt to make last CodeFactor issue pass (it's a false alarm)

* temporarily disable test of test_grad_tracking to check if testing will pass

* reenable test in test_grad_norm

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Renamed get_torchtext_data_iterator to _get_torchtext_data_iterator as suggested by @borda

* Update pytorch_lightning/utilities/apply_func.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* adding tests more specific to batch_move_data_to_device with torchtext Batch

* added check that Tensors were moved to target device

* removed tests using RNN models to be moved into a separate PR

* fixing FLAKE8 errors that showed up after merge from master branch
	modified:   tests/base/datamodules.py
	modified:   tests/callbacks/test_model_checkpoint.py

* parameterized test to reduce code duplication

* Added check only if length tensor exist. Removed left over comments.

* rearranged device parameterization and added pytest.param

* Try to figure out why only one device is tested on Linux machines

* Testing on CPU and GPU devices (GPU test is skipped if no CUDA device is available).

* added test for TPU device (experimental)

* Adding test parameterization for TPU test (experimental)

* change import statement to limit what is imported for a TPU environment

* made test work with TPU

* Change to trigger CI

* Change to trigger CI

* uncommented TPU test to check CI

* reenabling TPU test

* small change to trigger CI build

* small change to trigger CI build

* small change to trigger CI build

* adding tests/utilities/test_apply_func_torchtext.py to CI TPU test

* try to make test not skipped on CI with TPU

* remove testing on TPU

* undo an accidental change to test_tpu.py (file should not have been touched)

* small change to trigger CI build

* small change to trigger CI build

* Update tests/utilities/test_apply_func_torchtext.py

* Revert to previous version

* Apply suggestions from code review

* Change to trigger CI

Co-authored-by: Thomas Schaaf <tschaaf@mmm.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Thomas Schaaf <tschaaf@cs.cmu.edu>
2020-07-31 07:53:08 -04:00
siahuat0727 78a07e5f2d
Fix doc typo (#2773) 2020-07-31 07:42:47 -04:00
Lezwon Castelino b7afac351b
Add onnx export (#2596)
* export model to onnx

* prepare data before exporting

* support for dataloaders and tensors

* added tests

* use example_input_array
add to changelog

* updated docstring

* added onnx inference tests

* temp commit

* removed schema valid test

* add onnxruntime to environment.yml

* moved onnxruntime to environment.yml pip

* add example in doc

* add lines between code block

* added PR to changelog

* is file check

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* remove *

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* infer example outputs

* added doctest for onnx

* fix windows tests

* moved eval within condition block

* self.forward to self

* added docs

* fixed docs error

* added to toctree

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-31 12:27:57 +02:00
Jirka Borovec 06e8910f06
pytorch 1.6 (#2745)
* pt 1.6

* don't use the new zipfile serialization for now

* quick flake8 fixes

* remove unnecessary f

* coalesce strings

* remove comma

* remove extra commas

* Apply suggestions from code review

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* set _use_new_zipfile_serialization to False only for pytorch 1.6.0

* remove unnecessary comments

* flake8 fixes

* use pkg_resources instead of packaging

* readme

* format

* version

* chlog

Co-authored-by: Peter Yu <peter@asapp.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
2020-07-31 11:18:32 +02:00
Jirka Borovec 949734489a
remove deprecated in v0.9 (#2760)
* remove deprecated in v0.9

* data_loader

* import

* hook

* args
2020-07-30 23:19:28 +02:00
Phil 2f0fb34496
Speed up gradient clipping and allow parameters on multiple devices. (#2767)
The speed up is achieved by:
- Moving the "where" out of the loop (and replacing with min for simplicity).
- Replacing manual sum and pow with torch.norm. Even though this results
  in unnecessary computation (computing pow(root)) this is still a lot
  faster.
- Preallocating the output gives a slight speed up.

Note that calling .to for all parameters results in a small speed
penalty (~4 ms in my case) but allows parameters on different devices.

Overall this reduces the time used for gradient clipping from 206ms to
74 ms for my model (Resnet50 + few additional vars, all vars on GPU).
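
The idea described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the PR: gradients are nested lists of floats rather than tensors, and the names `clip_grad_norm`, `grads`, `max_norm`, and `eps` are hypothetical. It shows the scale factor being computed once, outside the loop, with `min` capping it at 1.0, and the total norm being built from one norm per parameter.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Scale gradients in place so their global L2 norm is at most max_norm.

    `grads` is a list of flat lists of floats (one list per parameter).
    Returns the global norm measured before clipping.
    """
    # One L2 norm per parameter, then the norm of those norms (global norm).
    per_param = [math.sqrt(sum(g * g for g in grad)) for grad in grads]
    total_norm = math.sqrt(sum(n * n for n in per_param))

    # Compute the scale once, outside the loop; min() caps it at 1.0 so
    # gradients that are already small enough are left unchanged.
    clip_coef = min(max_norm / (total_norm + eps), 1.0)

    for grad in grads:
        for i in range(len(grad)):
            grad[i] *= clip_coef
    return total_norm


# Two hypothetical parameters whose combined gradient norm is 5.0.
grads = [[3.0, 0.0], [0.0, 4.0]]
total = clip_grad_norm(grads, max_norm=1.0)
```

After the call, `total` holds the norm measured before clipping and `grads` has been rescaled so that its global norm does not exceed `max_norm`.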
2020-07-30 11:53:24 -04:00
Tejasvi S Tomar 8ab5bcda3d
Misleading exception raised during batch scaling (#2223)
* Misleading exception raised during batch scaling

Use batch_size from `model.hparams.batch_size` instead of `model.batch_size`

* Improvements considering #1896

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-29 18:47:11 -04:00
Ethan Harris 458d3e210e
Add missing methods to logger collection (#2723)
* Add missing methods to logger collection

* Update CHANGELOG.md

* Fix errors after merge

* Fix codefactor issues

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-29 23:53:02 +02:00
Santiago Castro 17678229b4
Fix a deprecation warning (#2746) 2020-07-29 07:54:14 -04:00
siahuat0727 b9381c3258
Fix docs typo (#2747) 2020-07-29 07:11:49 -04:00