2626 Commits

Author SHA1 Message Date
William Falcon 325852c6df
enabled no returns from eval (#2446)
* enabled no returns from eval

* fixed docs

* fixed docs

* fixed docs

* fixed docs

* fixed docs

* fixed docs

* fixed docs

* fixed docs

* fixed docs

* fixed docs

* fixed docs

* fixed docs
2020-07-01 07:38:00 -04:00
Llannelongue fa2233f56f
Corrected typo `python -m pip pre-commit install` (#2447) 2020-07-01 07:02:02 -04:00
Jirka Borovec ded8a56bb3
missing changes in chlog (#2430)
* missing

* miss
2020-06-30 22:45:50 -04:00
Jirka Borovec e268061614
Pure package & base tests (#2418)
* base tests

* pil

* wip

* wip

* wip

* ignore

* ignore

* win

* link

* win

* cpu

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-06-30 19:35:54 -04:00
Adrian Wälchli 145670f893
fix logging on rank 0 only (#2425)
* fix and test for ddp block logging rank > 0

* rename

* use the dummy logger

* dummy logger test

* set the logger in model

* decorator for rank zero experiment

* simplify check

* simplify

* fix problem with None in checkpoint path

* revert configure logger

* unused import

* offline

* try rank 0 decorator in checkpoint

* try fix test

* imgs

* add asserts to make sure log zero only saves checkpoints

* add asserts to make sure log zero only saves checkpoints

* add asserts to make sure log zero only saves checkpoints

* add asserts to make sure log zero only saves checkpoints

* add asserts to make sure log zero only saves checkpoints

* fix tpu tests

* fix tpu tests

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-30 18:09:16 -04:00
William Falcon 04e68f022f fix tpu tests 2020-06-30 17:20:35 -04:00
William Falcon fc26078e39 fix tpu tests 2020-06-30 17:20:18 -04:00
Oliver Neumann 1a54ed6ad9
Checking ipywidgets is installed for ensure tqdm working (#2417)
* Adding importing ipywidgets before importing tqdm.auto to make sure ipywidgets is installed.

* Updated CHANGELOG.md

* Updated ipywidgets importing checks to @awaelchli comments.

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-30 16:59:35 -04:00
William Falcon 309ed75c5d
added reduce ddp results on eval (#2434)
* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval

* added reduce ddp results on eval
2020-06-30 16:15:35 -04:00
William Falcon e8bb4165b7
Fix apex scaling with decoupled backward (#2433)
* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs

* fix outputs
2020-06-30 14:51:39 -04:00
Jirka Borovec d4a02e3bd8
tests: drop CircleCI (#2412)
* drop CircleCI

* add PT testing

* fix

* cpu

* conda

* conda

* req

* base

* conda

* conda

* conda

* conda

* conda

* conda

* conda

* name

* req

* info

* tests

* pt 1.6

* drop 1.6

* info
2020-06-30 10:56:05 -04:00
William Falcon a42a0e16dd
Fixes train outputs (#2428)
* fix outputs

* fix outputs
2020-06-30 10:03:49 -04:00
Jirka Borovec a75398530c
continue (#2416) 2020-06-29 21:00:52 +02:00
Jirka Borovec dec074c2e7
typo (#2415) 2020-06-29 07:36:56 -04:00
Jirka Borovec 02d6045cac
release (#2414) 2020-06-29 07:21:28 -04:00
William Falcon 33b92557f5
Update __init__.py 2020-06-29 06:59:35 -04:00
William Falcon 92d1e75b26 fix batch typo 2020-06-29 06:54:21 -04:00
William Falcon 593837e1da fix amp wrong call 2020-06-29 06:46:19 -04:00
Jirka Borovec 3ff695510e
missing changes (#2283)
* missing

* RC1

* RC1

* format
2020-06-29 06:34:19 -04:00
William Falcon 58f03f3076
Update README.md 2020-06-28 22:44:58 -04:00
William Falcon 8f07b77fc0
Update __init__.py 2020-06-28 22:08:51 -04:00
Adrian Wälchli 25ee51bc57
Continue Jeremy's early stopping PR #1504 (#2391)
* add state_dict for early stopping

* move best attr after monitor_op defined

* improve early stopping and model checkpoint callbacks

* fix formatting

* fix attr init order

* clean up setting of default_root_dir attr

* logger needs default root dir set first

* reorg trainer init

* remove direct references to checkpoint callback

* more fixes

* more bugfixes

* run callbacks at epoch end

* update tests to use on epoch end

* PR cleanup

* address failing tests

* refactor for homogeneity

* fix merge conflict

* separate tests

* tests for early stopping bug regressions

* small fixes

* revert model checkpoint change

* typo fix

* fix tests

* update train loop

* cannot pass an int as default_save_path

* refactor log message

* fix test case

* appease the linter

* fix some doctests

* move config to callback

* fixes from rebase

* fixes from rebase

* chlog

* docs

* reformat

* formatting

* fix

* fix

* fixes from rebase

* add new test for patience

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/callbacks/test_early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* fix formatting

* remove enable_early_stop attribute

* add state_dict for early stopping

* move best attr after monitor_op defined

* improve early stopping and model checkpoint callbacks

* fix formatting

* fix attr init order

* clean up setting of default_root_dir attr

* logger needs default root dir set first

* reorg trainer init

* remove direct references to checkpoint callback

* more fixes

* more bugfixes

* run callbacks at epoch end

* update tests to use on epoch end

* PR cleanup

* address failing tests

* refactor for homogeneity

* fix merge conflict

* separate tests

* tests for early stopping bug regressions

* small fixes

* revert model checkpoint change

* typo fix

* fix tests

* update train loop

* fix test case

* appease the linter

* fix some doctests

* move config to callback

* fixes from rebase

* fixes from rebase

* chlog

* docs

* reformat

* formatting

* fix

* fix

* fixes from rebase

* add new test for patience

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/callbacks/test_early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* fix formatting

* remove enable_early_stop attribute

* fix test with new epoch indexing

* fix progress bar totals

* fix off by one error (see #2289) epoch starts at 0 now

* added missing imports

* fix hpc_save folderpath

* fix formatting

* fix tests

* small fixes from a rebase

* fix

* tmpdir

* tmpdir

* tmpdir

* wandb

* fix merge conflict

* add back evaluation after training

* test_resume_early_stopping_from_checkpoint TODO

* undo the horovod check

* update changelog

* remove a duplicate test from merge error

* try fix dp_resume test

* add the logger fix from master

* try remove default_root_dir

* try mocking numpy

* try import numpy in docs test

* fix wandb test

* pep 8 fix

* skip if no amp

* dont mock when doctesting

* install extra

* fix the resume ES test

* undo conf.py changes

* revert remove comet pickle from test

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update weights_loading.rst

* Update weights_loading.rst

* Update weights_loading.rst

* renamed flag

* renamed flag

* revert the None check in logger experiment name/version

* add the old comments

* _experiment

* test checkpointing on DDP

* skip the ddp test on windows

* cloudpickle

* renamed flag

* renamed flag

* parentheses for clarity

* apply suggestion max epochs

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jeremy Jordan <jtjordan@ncsu.edu>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-28 21:36:46 -04:00
Jirka Borovec 1e16681693
fix loading with hparams (#2403)
* fix #2386

* extra test

* extra case

* extra test

* chlog

* fix test
2020-06-28 20:22:03 -04:00
Adrian Wälchli 058c500300
fix when torchtext not installed (#2402) 2020-06-28 20:03:51 -04:00
Jirka Borovec 861a73be12
fix loading past checkpoints (#2405)
* fix #2334

* chlog
2020-06-28 17:20:33 -04:00
William Falcon 66ffbaddf5
updates teardown to account for ddp (#2389)
* remove warnings

* remove warnings

* added doc lines

* added doc lines
2020-06-28 07:01:04 -04:00
Adrian Wälchli d910cc5200
docs: dont mock imports when running sphinx doctest (#2396)
* skip if no amp

* dont mock when doctesting

* install extra
2020-06-27 23:31:06 -04:00
Jirka Borovec 75f0a2062c
move torchtext as optional (#2395)
* torchtext

* Update pytorch_lightning/utilities/apply_func.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update apply_func.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-06-27 20:15:10 -04:00
Jirka Borovec 51711c265a
fix loading model with kwargs (#2387)
* test

* fix

* fix
2020-06-27 16:38:03 -04:00
Mateusz Pieniak e82d9cdb66
Support torchtext on a single GPU (#2379)
* Handle torchtext.data.Batch on GPU

* Update CHANGELOG.md

* Apply code review requests

* Correct the docs

* Change requirements
2020-06-27 16:36:45 -04:00
Jirka Borovec 73a78a13c7
CI: partial move from CircleCI (#2378)
* move from CircleCI

* req

* tex

* tex

* sudo

* extra

* recom

* pic

* dvipng
2020-06-27 16:25:33 -04:00
William Falcon 90f641af0d
fixes logger crash on ddp (#2388)
* remove warnings

* remove warnings

* remove warnings

* remove warnings

* remove warnings

* remove warnings

* remove warnings

* remove warnings

* remove warnings

* remove warnings
2020-06-27 15:08:22 -04:00
Jirka Borovec 41f5df18a4
move Trains logger to Bolts (#2384)
* move Trains logger

* chlog
2020-06-27 09:14:05 -04:00
Jirka Borovec 4e13e419ea
add CLI test for examples (#2285)
* cli examples

* ddp

* CI

* CI

* req

* tests

* skip DDP

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-27 09:13:29 -04:00
Jirka Borovec 6673fc9a0b
fix docker builds (#2383) 2020-06-27 08:49:19 -04:00
Jirka Borovec 2f739f5977
fix key typo (#2374) 2020-06-26 21:46:08 -04:00
Kshitij09 20d0f53896
Fix ModelCheckpoint example (#2321)
`save_top_k` should be an `int`, but the snippet under 'Saving and Loading Weights' in the docs showed `save_top_k=True`. Changed it to its default value (1) for consistency.

Signed-off-by: Kshitij Patil <kshitijpatil98@gmail.com>
2020-06-26 21:45:41 -04:00
Jirka Borovec 0be78d13aa
native amp (#2373)
* native amp

* typo

* imports

* apex
2020-06-26 21:45:13 -04:00
Jirka Borovec f1c96930b1
repair CI for Win (#2358)
* no cov

* no cov

* ReduceOp

* group

* reduce_op.sum

* Update sklearns.py

* formatting

* horovod

* Apply suggestions from code review

* horovod

* horovod

* horovod

* horovod

* ci

* print

* ci

* timeout

* timeout

* time

* fix

* distributed cpu

* pipes

* time

* cpu

* spawn

* spawn

* spawn

* tp

* separate

* os

* os

* npm

* Fix load_from_checkpoint() not working with URL on Windows

* Update CHANGELOG

* Update CHANGELOG.md

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* Apply suggestions from code review

* fix

* fix meta tags creating empty lines

* pyright

* node

* fix httpserver address

* drop tutils.default_trainer_options

* imports

* Better fix for load_from_checkpoint() not working with absolute path on Windows (#2294)

* Fix load_from_checkpoint() not working with URL on Windows

* Update CHANGELOG

* Update CHANGELOG.md

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* drop duplicate

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: airium <airium@outlook.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: AIRIUM <38249940+airium@users.noreply.github.com>
2020-06-26 21:38:25 -04:00
Jirka Borovec a5f45787ea
fix get dataloader size (#2375)
* get dataloader size

* pyright
2020-06-26 15:38:48 -04:00
Thomas Schaaf 7c0a3f4745
Bugfix/_has_len (#2307)
* deal with NotImplementedError raised by torchtext

* deal with NotImplementedError raised by torchtext

* Added tests for dataloader which raise NotImplementedError in __len__()

* Fixed some typos

* enabled tests for dataloader raising NotImplementedError in __len__ and corrected match string for raised exception

* deleted empty line for style compliance

* refactored CustomNotImplementedErrorDataloader to derive from CustomInfDataloader

* enabled reduced number of not_implemented_error dataloader test to reduce runtime for continuous integration

* reduced test number of not_implemented_error dataloader test further to reduce test time

* reduced test number of not_implemented_error dataloader test to one to reduce test time

* disabled all not_implemented_error dataloader test to see if test pass in time

* added __next__ with a reduced number (5) of elements after which CustomNotImplementedErrorDataloader stops to speedup test.

* enabling all not_implemented_error dataloader test

* added brief description of change and relation of torchtext

* CustomNotImplementedErrorDataloader reduced number of batches served to 2.

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Apply suggestions from code review

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Disable parallelism in dataloader

Suspect that it might cause pytest to hang more frequently

* added max_steps=None to Trainer in not_implemented_error dataloader tests

* rearranged not_implemented_error test in file to group them together

* disabled parallel data loading
Reason: testing if that stops the test framework from hanging.

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Thomas Schaaf <tschaaf@cs.cmu.edu>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-26 09:31:08 -04:00
William Falcon cbb2427f0d
changed apex level (#2362) 2020-06-25 18:54:32 -04:00
William Falcon 0a092f6683
making optimization steps for hooks (#2363)
* simplified optimizer step and zero grad overriding
2020-06-25 16:02:16 -04:00
William Falcon d22181714a
fix 2333 (#2360) 2020-06-25 11:10:17 -04:00
William Falcon f2710bb500
adds tensorboard hparams logging test (#2342)
* fixes hparam logging

* fixes hparam logging

* fixes hparam logging

* fixes hparam logging

* fixes hparam logging

* Apply suggestions from code review

* skipif

* rename

* Update test_tensorboard.py

* Update test_tensorboard.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-25 09:22:28 -04:00
William Falcon c275e1fc91
swaps lr sched order (#2356)
* swaps lr sched order

* Update optimizers.py

* added amdim encoder choice
2020-06-25 09:21:41 -04:00
davinnovation b6ab7ca121
[docs] add community example : pl + ms nni (#2340)
https://github.com/PyTorchLightning/pytorch-lightning/issues/2329
2020-06-24 23:13:49 -04:00
Adrian Wälchli 220bb6db57
remove wrong annotation (#2349) 2020-06-24 22:29:26 -04:00
Adrian Wälchli 9b2e60530f
Python logging level docs (#2348)
* docs about Python logging

* add link to Python logging docs
2020-06-24 22:29:01 -04:00
David Waterworth cc07dcae96
corrected example usage of save_hyperparameters from List[str] to separate str (#2353)
Co-authored-by: David Waterworth <david.waterworth@cim.io>
2020-06-24 22:28:38 -04:00