Commit Graph

2915 Commits

Author SHA1 Message Date
William Falcon d85de32dcf
Update __init__.py 2020-08-02 08:18:44 -04:00
Jirka Borovec 448be60701
update GPU to PT 1.5 (#2779)
* update gpu PT 1.6

* fix docker

* use PT 1.5

* Update tests/install_AMP.sh

Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>

Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>
2020-08-02 08:14:53 -04:00
William Falcon a0c4365278
Gpu idx (#2796)
* ddp refactor
2020-08-02 08:13:31 -04:00
Jirka Borovec b01ad75700
missing chlogs (#2672)
* missing

* miss

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* miss

* note

* notes

* update CI testing with pip upgrade (#2380)

* try pt1.5

* cpu

* upgrade

* tpu

* user

* [blocked by #2380] freeze GPU PT 1.4 (#2780)

* freeze

* user

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-02 12:34:36 +02:00
bkhakshoor 96eb6ebacd
fix shell injection vulnerability in subprocess call (#2786) 2020-08-01 23:25:57 -04:00
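The entry above does not say how the subprocess call was changed. As a general illustration only (not the actual patch from #2786), a common way to avoid shell injection is to pass the command as an argument list and leave `shell=False`. The helper name `run_echo` and the hostile input are hypothetical.

```python
import subprocess
import sys


def run_echo(user_text: str) -> str:
    """Echo user_text through a child Python process without involving a shell."""
    # Passing a list (and leaving shell=False, the default) hands each element
    # to the child as a literal argument, so shell metacharacters such as ';'
    # or '$(...)' are never interpreted.
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")


# Hypothetical hostile input: with shell=True and string formatting this could
# run a second command; here it is just text.
hostile = "hello; echo INJECTED"
print(run_echo(hostile))
```

The same input interpolated into a single command string and run with `shell=True` would be parsed by the shell, which is the pattern this kind of fix usually removes.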
pwwang c600ca65ae
Fix false num_classes warning in metrics (#2781)
* Fix num_classes warning

Put to_categorical before get_num_classes in metrics/functional/classification.py

* Update classification.py

Remove whitespaces in blank line.
2020-08-01 23:24:19 -04:00
Rohit Gupta 8baec1a191
Fix shuffle for distributed sampler (#2789)
* Fix shuffle for distributed sampler

* add test

* test

* chlog

* update test

* update test

* update test

* assertions via callback

* define callback outside for pickling

* skip ddp test on windows

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-01 23:22:57 -04:00
Iz Beltagy 38fce2ea68
fix selecting GPUs using CUDA_VISIBLE_DEVICES (#2739)
* fix https://github.com/PyTorchLightning/pytorch-lightning/issues/2407

* Update pytorch_lightning/trainer/distrib_data_parallel.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-01 23:21:15 -04:00
William Falcon 7da7d2e428
callback docs (#2794)
* added logging docs

* added logging docs

* added logging docs

* added logging docs
2020-08-01 22:56:34 -04:00
William Falcon 1d811d0d11
Resultdocs (#2793)
* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs

* added logging docs
2020-08-01 22:31:56 -04:00
William Falcon eb66cae55d
Update __init__.py 2020-08-01 20:24:02 -04:00
Nathan Raw 036bcea499
Call DataModule hooks implicitly in trainer (#2755)
* call dm hooks in trainer implicitly

* update tests

* 📝 remove unused stage arg from dm docs

* update tests

* update tests

* 🚧 include stage in datamodule.setup

* 📝 docs

* 📝 docs

* added more dm tests

* added more dm tests

* 🐛 call dm.setup everywhere

* 🔥 pickle tests now implied by accelerator tests

* 🎨 set dm as attr of trainer

* 🐛 .

* 🚧 wip

* add can prepare test

* add can prepare test

* verified setup in fit

* fixed setup call

* fixed setup call

* fixed setup call

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-01 20:17:57 -04:00
Jirka Borovec f9ccb0fd9b
update mergify (#2784) 2020-08-01 11:35:05 +02:00
Jirka Borovec 3772601cd6
update CI testing with pip upgrade (#2380)
* try pt1.5

* cpu

* upgrade

* tpu

* user

* [blocked by #2380] freeze GPU PT 1.4 (#2780)

* freeze

* user
2020-07-31 14:50:06 -04:00
Jirka Borovec bc7a08fbe0
test dockers & add AMP in pt-1.6 (#1584)
* exist images

* names

* images

* args

* pt 1.6 dev

* circleci

* update

* refactor

* build

* fix

* MKL
2020-07-31 08:23:13 -04:00
Thomas Schaaf a6719f09f0
Bugfix/torchtext include lengths (#2689)
* Test using torchtext.data.Field with include_lengths=True/False

* Fix issue that Tensors in a Batch generated by torchtext with torchtext.data.Field configured as include_lengths=True

* Add description for fix of issue #2688

* changes to accommodate CodeFactor issues

* Another attempt to make last CodeFactor issue pass (it's a false alarm)

* temporarily disable test of test_grad_tracking to check if testing will pass

* reenable test in test_grad_norm

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Renamed get_torchtext_data_iterator to _get_torchtext_data_iterator as suggested by @borda

* Update pytorch_lightning/utilities/apply_func.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* adding tests more specific to batch_move_data_to_device with torchtext Batch

* added check that Tensors were moved to target device

* removed tests using RNN models to be moved into a separate PR

* fixing FLAKE8 errors that showed up after merge from master branch
	modified:   tests/base/datamodules.py
	modified:   tests/callbacks/test_model_checkpoint.py

* parameterized test to reduce code duplication

* Added check only if length tensor exist. Removed left over comments.

* rearranged device parameterization and added pytest.param

* Try to figure out why only one device is tested on Linux machines

* Testing on CPU and GPU devices (GPU test is skipped if no CUDA device is available).

* added test for TPU device (experimental)

* Adding test parameterization for TPU test (experimental)

* change import statement to limit what is imported for a TPU environment

* made test work with TPU

* Change to trigger CI

* Change to trigger CI

* uncommented TPU test to check CI

* reenabling TPU test

* small change to trigger CI build

* small change to trigger CI build

* small change to trigger CI build

* adding tests/utilities/test_apply_func_torchtext.py to CI TPU test

* try to make test not skipped on CI with TPU

* remove testing on TPU

* undo an accidental change to test_tpu.py (file should not have been touched)

* small change to trigger CI build

* small change to trigger CI build

* Update tests/utilities/test_apply_func_torchtext.py

* Revert to previous version

* Apply suggestions from code review

* Change to trigger CI

Co-authored-by: Thomas Schaaf <tschaaf@mmm.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Thomas Schaaf <tschaaf@cs.cmu.edu>
2020-07-31 07:53:08 -04:00
Jirka Borovec b88fc43871
re-enable skipped tests (#2762)
* re-enable skipped

* timeout
2020-07-31 07:52:17 -04:00
siahuat0727 78a07e5f2d
Fix doc typo (#2773) 2020-07-31 07:42:47 -04:00
Jirka Borovec fcfdb4df13
conda speedup (#2546)
* conda speedup

* cache

* add pip cache

* suggestion

* cache

* cache

* req
2020-07-31 06:31:23 -04:00
Lezwon Castelino b7afac351b
Add onnx export (#2596)
* export model to onnx

* prepare data before exporting

* support for dataloaders and tensors

* added tests

* use example_input_array
add to changelog

* updated docstring

* added onnx inference tests

* temp commit

* removed schema valid test

* add onnxruntime to environment.yml

* moved onnxruntime to environment.yml pip

* add example in doc

* add lines between code block

* added PR to changelog

* is file check

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* remove *

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* infer example outputs

* added doctest for onnx

* fix windows tests

* moved eval within condition block

* self.forward to self

* added docs

* fixed docs error

* added to toctree

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-31 12:27:57 +02:00
Jirka Borovec 06e8910f06
pytorch 1.6 (#2745)
* pt 1.6

* don't use the new zipfile serialization for now

* quick flake8 fixes

* remove unnecessary f

* coalesce strings

* remove comma

* remove extra commas

* Apply suggestions from code review

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* set _use_new_zipfile_serialization to False only for pytorch 1.6.0

* remove unnecessary comments

* flake8 fixes

* use pkg_resources instead of packaging

* readme

* format

* version

* chlog

Co-authored-by: Peter Yu <peter@asapp.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
2020-07-31 11:18:32 +02:00
Jirka Borovec bc833fbf52
Horovod & py3.8 (#2764) 2020-07-30 23:39:07 +02:00
Jirka Borovec 949734489a
remove deprecated in v0.9 (#2760)
* remove deprecated in v0.9

* data_loader

* import

* hook

* args
2020-07-30 23:19:28 +02:00
Junbum Lee d18b9ef9d9
Fix typo on tpu.rst (#2759)
* Fix typo on tpu.rst

There're 3 ways :)

* Update docs/source/tpu.rst

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-07-30 18:11:18 +00:00
Phil 2f0fb34496
Speed up gradient clipping and allow parameters on multiple devices. (#2767)
The speed up is achieved by:
- Moving the "where" out of the loop (and replacing with min for simplicity).
- Replacing manual sum and pow with torch.norm. Even though this results
  in unnecessary computation (computing pow(root)) this is still a lot
  faster.
- Preallocating the output gives a slight speed up.

Note that calling .to for all parameters results in a small speed
penalty (~4 ms in my case) but allows parameters on different devices.

Overall this reduces the time used for gradient clipping from 206ms to
74 ms for my model (Resnet50 + few additional vars, all vars on GPU).
2020-07-30 11:53:24 -04:00
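The arithmetic described in the entry above can be sketched without PyTorch. This is a simplified, torch-free illustration under stated assumptions: gradients are plain lists of floats, the function name `clip_grad_norm` and the `eps` default are hypothetical, and the preallocation and `.to` device handling from the commit are not modelled. It shows only the "norm of per-parameter norms" and the single `min` computed outside the loop.

```python
import math
from typing import List


def clip_grad_norm(grads: List[List[float]], max_norm: float, eps: float = 1e-6) -> float:
    """Scale grads in place so their global L2 norm is at most max_norm.

    Returns the norm measured before clipping.
    """
    # One norm per parameter, then a norm of those norms, instead of
    # accumulating squared sums by hand.
    per_param = [math.sqrt(sum(g * g for g in grad)) for grad in grads]
    total_norm = math.sqrt(sum(n * n for n in per_param))
    # A single min(), computed once outside the loop, replaces a
    # per-parameter "where".
    clip_coef = min(max_norm / (total_norm + eps), 1.0)
    for grad in grads:
        for i in range(len(grad)):
            grad[i] *= clip_coef
    return total_norm


# Two hypothetical "parameters" whose gradients have global norm 5.
grads = [[3.0], [4.0]]
norm_before = clip_grad_norm(grads, max_norm=1.0)
```

When the measured norm is already below `max_norm`, the coefficient is 1.0 and the gradients are left unchanged.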
Tejasvi S Tomar 8ab5bcda3d
Misleading exception raised during batch scaling (#2223)
* Misleading exception raised during batch scaling

Use batch_size from `model.hparams.batch_size` instead of `model.batch_size`

* Improvements considering #1896

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-29 18:47:11 -04:00
Jirka Borovec 9edda9a41a
limit (#2756) 2020-07-29 18:35:49 -04:00
zcain117 eca7d0a6d3
Check CI_PULL_REQUEST and set GITHUB_REF accordingly. (#2741) 2020-07-29 18:35:32 -04:00
Ethan Harris 458d3e210e
Add missing methods to logger collection (#2723)
* Add missing methods to logger collection

* Update CHANGELOG.md

* Fix errors after merge

* Fix codefactor issues

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-29 23:53:02 +02:00
Santiago Castro 17678229b4
Fix a deprecation warning (#2746) 2020-07-29 07:54:14 -04:00
siahuat0727 b9381c3258
Fix docs typo (#2747) 2020-07-29 07:11:49 -04:00
Jeff Yang 63b92b7e63
fix: corrected attribute in *_dataloader in datamodule (#2748) 2020-07-29 07:11:13 -04:00
William Falcon 5b410681c3
Update __init__.py 2020-07-28 16:34:07 -04:00
Peter Yu b7f613ba6d
Correct CWD for ddp subprocesses when using Hydra (#2719)
* when hydra is enabled, set the cwd of subprocesses to the original cwd for ddp

* move imports up

* clean up imports
2020-07-28 16:33:28 -04:00
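The idea in the entry above, starting child processes in the original working directory even after the parent has changed directory, can be sketched with the standard library alone. This is an illustration under assumptions, not the code from #2719: the temporary directory stands in for Hydra's run directory, and `spawn_in_original_cwd` is a hypothetical helper. With Hydra itself, the original directory would typically come from `hydra.utils.get_original_cwd()`.

```python
import os
import subprocess
import sys
import tempfile


def spawn_in_original_cwd(original_cwd: str) -> str:
    """Start a child process in original_cwd and return the directory it reports."""
    cmd = [sys.executable, "-c", "import os; print(os.getcwd())"]
    # cwd= pins the child's working directory regardless of where the
    # parent process has chdir'ed to.
    out = subprocess.run(cmd, cwd=original_cwd, capture_output=True, text=True, check=True)
    return out.stdout.strip()


original = os.getcwd()
with tempfile.TemporaryDirectory() as run_dir:
    os.chdir(run_dir)  # stand-in for Hydra switching to its run directory
    try:
        child_cwd = spawn_in_original_cwd(original)
    finally:
        os.chdir(original)  # restore before the temporary directory is removed

print(os.path.samefile(child_cwd, original))
```

Without the `cwd=` argument the child would inherit the parent's current directory, so relative paths in the launched script would resolve against the run directory instead.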
Adrian Wälchli db9f11d179
truncate long version number in progress bar (#2594)
* truncate version number

* add docs and example

* extend docs

* docs

* docs

* changelog

* show last

* Update pytorch_lightning/core/lightning.py

* Update pytorch_lightning/core/lightning.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-28 16:32:34 -04:00
Iz Beltagy c047676fae
fix https://github.com/PyTorchLightning/pytorch-lightning/issues/2635 (#2738) 2020-07-28 16:29:46 -04:00
Stas Bekman 2bd39c66af
make the error message readable (#2729)
* make the error message readable

make the error message readable by adding spaces, fixing a typo "his -> this"

* cleanup

* Update pytorch_lightning/trainer/auto_mix_precision.py

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
2020-07-28 16:28:22 -04:00
Jirka Borovec 40337cce58
freeze PT 1.5 for Horovod issue (#2744)
* freeze pt 1.5

* torchtext

* Apply suggestions from code review

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* timeout

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
2020-07-28 15:52:23 -04:00
William Falcon bc9348f2c4
Update README.md 2020-07-28 11:43:47 -04:00
William Falcon f770a91864
Update README.md 2020-07-28 11:37:48 -04:00
William Falcon bfa31e7583
Add files via upload 2020-07-28 11:36:43 -04:00
William Falcon 37d1bc6a42
Update README.md 2020-07-28 10:30:32 -04:00
William Falcon fa16624cc2
Update README.md 2020-07-28 10:28:59 -04:00
Jirka Borovec 590e7fb1fd
tests: add default_root_dir=tmpdir (#2392)
* tests: add default_root_dir=tmpdir

* remove duplicate tmpdir args

* add missing fixture

* test requires multi gpu

* typo

* resize

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-07-28 09:47:53 -04:00
Jirka Borovec a3aebc1350
skip CircleCI config on master (#2732)
* circleci config

* circleci config

* circleci config

* circleci config
2020-07-28 06:34:01 -04:00
William Falcon 2e6e254c76
quick start docs (#2731)
* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests

* added tests
2020-07-27 23:53:19 -04:00
Jirka Borovec 0fe933e23d
fixing TPU tests (#2632)
* init

* rename

* tpu_core_idx

* idx 8

* idxs

* @pl_multi_process_test

* assert

* assert

* daemon

* no close

* import

* msg

* use_single_gpu

* dataset

* idx

* fix idx

* dataset

* format

* add pickable

* typo

* apex

* typo

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* docs

* typo

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* tests

* docs

* docs

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>

* docs

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-07-27 19:07:09 -04:00
Rohit Gupta 84c507c4df
Fix max_batches with fast_dev_run. (#2581)
* Fix fast_dev_run to run for all val_dataloaders

* fast_dev_run check

* changelog

* explicit

* limit_batches with fast_dev_run in init

* add test

* whitespace and comment fix

* comment and assertion

* added tests

* Fix fast_dev_run to run for all val_dataloaders

* fast_dev_run check

* changelog

* explicit

* limit_batches with fast_dev_run in init

* add test

* whitespace and comment fix

* comment and assertion

* added tests

* added tests

* added tests

* added tests

* update rtol

* Revert "update rtol"

This reverts commit 4320329540.

* added tests

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-07-27 17:56:55 -04:00
Adrian Wälchli 26afcaa30e
update conda packages (#2593) 2020-07-27 12:54:48 -04:00
Adrian Wälchli d03953260d
Fix weights_save_path when logger is used + simplify path handling + better docs (#2681)
* fix weights_save path and drop ckpt_path

* add tests

* unused import

* update docs

* changelog

* pep8

* fix horovod test

* make backward compatible

* perform same test for all loggers

* fix for when logger=False and weights_save_path is set

* update changelog

* update docs

* update tests

* do not set save dir dynamically

* remove duplicate test

* remove duplicated tests

* update tests

* update tests

* remove remaining ckpt_path references

* move defaults to init as suggested by @Borda

* test deprecation
2020-07-27 12:53:11 -04:00