Commit Graph

2479 Commits

Author SHA1 Message Date
Charles 9045b6c599
Fix typo in contributing docs (#2076) 2020-06-12 13:11:08 +02:00
Jason Phang e965515443
Fix DataParallel typo (#2154) 2020-06-11 21:45:22 -04:00
Peter Yu 06cd849538
Allow loading checkpoints from urls (#1667)
* allow loading checkpoints from urls

* tmpdir_server fixture

* test cases for loading checkpoints from url

* dir => root_dir

* default map_location to None

* test case for resume_from_checkpoint

* changelog

* doc update

* monkeypatch TORCH_HOME to avoid caching

* Use a threading server with random ports so that it is easier to clean up

* test fixes

* pep8 fix

* ThreadingHTTPServer support in 3.6

* pep8 fix

* fix changelog

* separate tests for urls

* typo

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-11 17:12:48 -04:00
Justus Schock bd49b07fbb
Rework of Sklearn Metrics (#1327)
* Create utils.py

* Create __init__.py

* redo sklearn metrics

* add some more metrics

* add sklearn metrics

* Create __init__.py

* redo sklearn metrics

* New metric classes (#1326)

* Create metrics package

* Create metric.py

* Create utils.py

* Create __init__.py

* add tests for metric utils

* add docstrings for metrics utils

* add function to recursively apply other function to collection

* add tests for this function

* update test

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update metric name

* remove example docs

* fix tests

* add metric tests

* fix to tensor conversion

* fix apply to collection

* Update CHANGELOG.md

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove tests from init

* add missing type annotations

* rename utils to convertors

* Create metrics.rst

* Update index.rst

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* add doctest example

* rename file and fix imports

* added parametrized test

* replace lambda with inlined function

* rename apply_to_collection to apply_func

* Separated class description from init args

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* adjust random values

* suppress output when seeding

* remove gpu from doctest

* Add requested changes and add ellipsis for doctest

* forgot to push these files...

* add explicit check for dtype to convert to

* fix ddp tests

* remove explicit ddp destruction

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* add sklearn metrics

* start adding sklearn tests

* fix typo

* return x and y only for curves

* fix typo

* add missing tests for sklearn funcs

* imports

* __all__

* imports

* fix sklearn arguments

* fix imports

* update requirements

* Update CHANGELOG.md

* Update test_sklearn_metrics.py

* formatting

* format

* fix all warnings and formatting problems

* Update environment.yml

* Update requirements-extra.txt

* Update environment.yml

* Update requirements-extra.txt

* fix all warnings and formatting problems

* Update CHANGELOG.md

* docs

* inherit

* docs inherit.

* docs

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

* docs

* req

* min

* Apply suggestions from code review

Co-authored-by: Tullie Murrell <tulliemurrell@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Tullie Murrell <tulliemurrell@gmail.com>
2020-06-10 15:43:12 +02:00
Jirka Borovec 16a7326e52
test cloudpickle (#2105)
* cloudpickle

* ci tests
2020-06-09 16:51:30 -04:00
Jirka Borovec de15759f76
Docs/changelog (#2125)
* miss chlog

* docs

* miss

* formatting
2020-06-09 16:51:14 -04:00
Jirka Borovec 74ab9d034b
setup py 3.8 (#2135) 2020-06-09 16:50:59 -04:00
edenlightning 7245e48153
[docs] Add Cotratron to community examples (#2130)
* [docs] Add Cotratron to community examples

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-06-09 16:43:49 +02:00
William Falcon 49b2424e6e
Update README.md 2020-06-09 07:43:33 -04:00
William Falcon 3f71f0ce29
Update README.md 2020-06-09 07:30:38 -04:00
William Falcon db0c94e4a4
Update README.md 2020-06-09 07:30:10 -04:00
Pattarawat Chormai 3be557dc5b
document: fix callback signature (#2113) 2020-06-09 07:10:44 -04:00
Tushar Jain 8d3d471f03
Update README.md (#2117) 2020-06-09 07:09:43 -04:00
Udit Arora a1658ea63d
Add docs about example dependencies (#2122)
* Add torchvision and gym dependencies

* Add pl_examples/requirements.txt to the list of dependencies for running local tests
2020-06-09 07:09:03 -04:00
Tullie Murrell 6537642f6a
Remove explicit flush from tensorboard logger (#2126)
* Remove explicit flush from tensorboard logger

* Update changelog
2020-06-09 07:08:12 -04:00
William Falcon 3f28a8ef32
Update __init__.py 2020-06-08 19:28:50 -04:00
William Falcon 479ab49d03
temporarily fixes early stopping bug (#2119)
* fixes early stopping bug

* fixe docs

* added test
2020-06-08 19:28:26 -04:00
William Falcon 73a6a957fd fixe docs 2020-06-08 18:00:24 -04:00
William Falcon 3260e59b27
Adds back the slow spawn ddp implementation that people want (#2115)
* training batch clean up

* adding spawn
2020-06-08 17:55:25 -04:00
William Falcon 0bd7780adc
Fixes CPU and hanging GPU crash (#2118)
* training batch clean up
2020-06-08 16:30:20 -04:00
edenlightning 9e8716afe8
Update Readme with tunning overhead time (#2082) 2020-06-08 07:26:58 -04:00
Adrian Wälchli 1f95fb9af7
update readme with conda installation instruction (#2099)
* update readme with conda installation instruction

* fix team header

* bibtex spelling

* Update README.md

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-08 07:22:54 -04:00
Jirka Borovec d2967d9305
update hparams, allow OmegaConf (#2047)
* DictConf

* inits

* Apply suggestions from code review

Co-authored-by: Omry Yadan <omry@fb.com>

* wip

* atrib

* wip

* added hparams test

* wip

* Update test_hparams.py

* added hparams test

* pep8

* docs

* wip

* clean

* review @omry

* Update docs/source/hyperparameters.rst

Co-authored-by: Omry Yadan <omry@fb.com>

Co-authored-by: Omry Yadan <omry@fb.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-08 07:19:34 -04:00
Jirka Borovec c09317e68f
cleaning (#2030)
* cleaning

* optim imports

* fix

* typo

* on

* mergify
2020-06-04 11:25:07 -04:00
Wah Loon Keng 6e993c608b
correct trainer.fit production example (#2068)
trainer.fit uses the parameter `val_dataloaders` but in the documentation it is `val_dataloader`, which is invalid.
2020-06-04 11:24:12 -04:00
Daniel Li 1ad81570e6
Update the documentation of configure_optimizers() (#2071)
* Explain the default value for scheduler

Co-authored-by: Qinru Li <q4li@eng.ucsd.edu>
2020-06-04 11:23:44 -04:00
William Falcon d96df75d6a
testing new speed (#1587)
* fixed new amp bugs

* try exit

* larger dataset

* full mnist

* trainer

* assert

* .05

* .10, #4

* #5

* refactor

* abs diff

* speed

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-04 11:20:12 -04:00
Adrian Wälchli 4234992302
Fix local variables being collected into module_arguments dict (#2048)
* do not include local vars in auto collection

* add test

* add test for model with "self" renamed to "obj"

* skip decorator

* changelog

* update docs

* remove obsolete child collection

* generalize **args, **kwargs names

* docs

* also update varargs passed in

* Revert "also update varargs passed in"

This reverts commit 3d7a30dbee07a513ee13e1cc3e08ca5ccdb85734.

* update test
2020-06-04 08:35:50 -04:00
kumuji fd7814d287
Added black formater for the code with code-checker on pull (#1610)
* black

Added through black.toml; other options are hard so far

No caching for black github action

Moved from black.toml to pyproject.toml

Exclude not only yml but also yaml

Update pyproject.toml

Co-authored-by: Thomas Johansen <thomasjo@gmail.com>

Update .github/workflows/code-formatting-check.yml

mergify

Remove formatting check

Ignore E231 errors because of black formatting

Updated CONTRIBUTING to the master

* Update .github/workflows/code-formatting-check.yml

* Bump black to 19.10b0 version

* resolved incorrect merge of CONTRIBUTING,

Black skipping string normalization

* Minor fixes in CONTRIBUTING, two typos

* Update setup.cfg

* chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-03 18:23:14 +02:00
Jirka Borovec 5d93d57573
Tests/drop macos py38 (#2061)
* tests drop macOS py38

* ignore single test

* try freeze env

* drop

* drop skips

* imports

* fix
2020-06-03 08:38:56 -04:00
Jirka Borovec c438d0dd90
increase acc (#2039)
* increase acc

* try 0.45

* @pytest

* try .50

* duration

* pytest
2020-06-03 08:28:19 -04:00
Jirka Borovec b4eb6ef5a1
tests drop macOS py38 (#2054)
* tests drop macOS py38

* ignore single test

* try freeze env

* drop

* drop skips

* drop macOS py38

* imports
2020-06-03 06:48:20 -04:00
Adrian Wälchli 8211256c46
data transfer model hook (+ refactor) (#1756)
* refactor and added hook


variant a


variant b


add test


revert rename


add changelog


docs

* resolve merge duplication

* overridden typo

* fix test

* tpu id

* raise if TPU not available

* re-use apply_to_collection function for parsing collections

* comment

* make utility function available to user

* documentation

* move changelog entry to top

* fix tpu transfer call

* fix call

* remove hardcoded string

* improve test

* call model hook by default

* Apply suggestions from code review

* rename utility function

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 21:45:19 -04:00
Devashish Shankar ade3f36b7a
Raise an error when lightning replaces an existing sampler (#2020)
* Raise an error when lightning replaces an existing sampler

Currently, Trainer replaces the existing sampler with DistributedSampler
when running distributed training with `replace_sampler_ddp=True` (the default
behaviour). If a user has configured their own sampler, this can
lead to widely different results between distributed and
non-distributed training.

This PR fixes this by raising an error if the user has configured a sampler
and uses `replace_sampler_ddp=True`. The recommended behavior from now
on is to either remove the sampler or set `replace_sampler_ddp=False`.

* Fix tests

* Simpler fix

* Fix tests

* Make inner method protected

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 18:52:04 -04:00
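The sampler check described in #2020 could be sketched roughly as follows. This is a minimal, standalone illustration and not Lightning's actual code: `FakeLoader`, `check_sampler`, and `SamplerConflictError` are hypothetical names, and treating `sampler=None` as "no user-configured sampler" is an assumption made for the sketch.

```python
class SamplerConflictError(Exception):
    """Hypothetical error type; the commit message does not name the real exception."""


class FakeLoader:
    """Hypothetical stand-in for a data loader; sampler=None means no custom sampler."""

    def __init__(self, sampler=None):
        self.sampler = sampler


def check_sampler(loader, distributed, replace_sampler_ddp=True):
    """Return True if a distributed sampler would be injected; raise on a conflict."""
    if not (distributed and replace_sampler_ddp):
        # Nothing is replaced, so the user's loader is left untouched.
        return False
    if loader.sampler is not None:
        # The user configured a sampler that would otherwise be silently replaced.
        raise SamplerConflictError(
            "A sampler is already configured; remove it or set replace_sampler_ddp=False."
        )
    return True
```

Under these assumptions, a loader without a sampler passes the check, while a loader with a custom sampler either raises or must opt out via `replace_sampler_ddp=False`.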
Ivan Nazarov e85a646a41
Mistake in parameters' grad norm tracking (#2012)
* fix grad norm formula

* grad-norm tracker test

* fixed seed and explicit rtol in grad norm tracking test

* a docstring for grad-norms and forced cast to float of norm_type

* support for inf-norm

* renamed the grad norm test

* docs

* fixed language in docstring

* Apply suggestions from code review

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 18:51:09 -04:00
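For reference, the standard way to combine per-parameter gradient norms into a total p-norm, including the inf-norm case mentioned in #2012, can be sketched as below. The commit message does not spell out the corrected formula, so this is the textbook definition rather than the code from the PR; `grad_norms` and the plain-list gradient format are hypothetical.

```python
import math


def grad_norms(grads, norm_type=2.0):
    """Compute per-parameter p-norms and the total p-norm over all gradients.

    `grads` maps a parameter name to a flat list of gradient values
    (a hypothetical format chosen to keep the sketch dependency-free).
    """
    norm_type = float(norm_type)  # accepts ints and strings such as "inf"
    per_param = {}
    for name, values in grads.items():
        if math.isinf(norm_type):
            per_param[name] = max(abs(v) for v in values)
        else:
            per_param[name] = sum(abs(v) ** norm_type for v in values) ** (1.0 / norm_type)
    if math.isinf(norm_type):
        # The inf-norm of the concatenated gradients is the largest per-parameter inf-norm.
        total = max(per_param.values())
    else:
        # The total p-norm is the p-norm of the vector of per-parameter p-norms.
        total = sum(n ** norm_type for n in per_param.values()) ** (1.0 / norm_type)
    return per_param, total
```

The total is taken over the per-parameter norms raised to the power `norm_type`, which is equivalent to computing the norm of all gradient values concatenated into one vector.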
Adrian Wälchli a699003e67
Update/merge multi-gpu docs (#2021)
* merge multi-gpu docs

* extend slurm docs

* update links to elastic

* format docs and type hints in distrib parts

* reference multi-gpu/slurm in trainer args docs

* fix doctest

* typo

* doctest

* Apply suggestions from code review

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>

* wall time

* Update docs/source/slurm.rst

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>

* fix title

* update docs for weights summary

* update changelog

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>
2020-06-02 18:50:08 -04:00
Udit Arora 26b69917b4
Add Open MPI installation details for horovod (#2050) 2020-06-02 18:48:26 -04:00
Lezwon Castelino 943c4b20af
slow tpu train (#2033)
* use parallel loader

* Revert "use parallel loader"

This reverts commit ed6e7583

* select tpu id for pl

* condition if tpu_id is None

* added info to changelog

* Revert "condition if tpu_id is None"

This reverts commit 1fb6e586

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 18:48:05 -04:00
Rohit Gupta fa696ce512
fix bug_report template (#2052)
* fix bug_report template

* article
2020-06-02 18:47:21 -04:00
Jirka Borovec 69575204f2
notes on Bug fixing (#2053)
* import

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-06-02 18:47:03 -04:00
Boris Dayma 00f1ac11e6
fix(wandb): use same logger on multiple training loops (#2055)
* fix(wandb): use same logger on multiple training loops

New training loops reset step to 0 which would previously try to overwrite logs

fix #2015

* docs(changelog.md): add reference to PR 2055
2020-06-02 18:46:02 -04:00
Rohit Gupta 0914873bc2
Fix domain_template scripts (#2014)
* Fix domain_templates

* Fix type of fake labels

* type

* args
2020-06-01 11:38:52 -04:00
William Falcon 82a20296e3
Replaces ddp .spawn with subprocess (#2029)
* replace ddp spawn with subprocess

* hot fix
2020-06-01 11:00:32 -04:00
Jirka Borovec fd38f52e55
sooner CI testing (#2037) 2020-06-01 10:21:52 -04:00
William Falcon 0be530a427
Revert "Fixes EarlyStopping With Precision=16 (#1996)" (#2032)
This reverts commit bf39cb26c5.
2020-05-31 15:20:18 -04:00
authman bf39cb26c5
Fixes EarlyStopping With Precision=16 (#1996)
* Patch for issue 1815, which will allow EarlyStopping to work on precision=16

* Added a whitespace to the end of the line so CICD can rerun. No reason for the latest macos test to have been cancelled.

* Format.
2020-05-31 15:02:19 -04:00
Fabio Natanael Kepler 8b9b923ca8
Keep track of the best model's path saved by ModelCheckpoint (#1799)
* Add an additional attribute to ModelCheckpoint to keep track of the best model's path

Currently, only the best metric value is directly tracked. This new attribute will help in use cases where the trained model needs to be used or tracked right after training.

* Add small description and usage example to docs

* Fix PEP8 issues

* Fix doctest example

* Fix expected output in doctest

* Apply suggestions from code review

* Show example as code block instead of doctest

* Apply suggestions from code review

* Update CHANGELOG.md

* Rename `ModelCheckpoint.best` to `ModelCheckpoint.best_model_score`

Also rename `ModelCheckpoint.best_model` (added in this PR) to `ModelCheckpoint.best_model_path`, for consistency, and `kth_best_model` to `kth_best_model_path`.

* Update pytorch_lightning/trainer/training_io.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Add warning when loading checkpoint from an old version

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-31 08:47:13 -04:00
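The idea behind #1799, tracking the path of the best checkpoint alongside its score, might look roughly like this. Only the attribute names `best_model_score` and `best_model_path` come from the commit message; the `BestCheckpointTracker` class, its `update` method, and the "lower is better" comparison are assumptions for illustration and not the ModelCheckpoint implementation.

```python
class BestCheckpointTracker:
    """Hypothetical tracker that remembers the best score and where it was saved."""

    def __init__(self):
        self.best_model_score = None
        self.best_model_path = ""

    def update(self, score, path):
        """Record `path` if `score` improves on the best seen so far (lower is better)."""
        if self.best_model_score is None or score < self.best_model_score:
            self.best_model_score = score
            self.best_model_path = path
            return True
        return False
```

After training, a caller could read `best_model_path` directly instead of searching the checkpoint directory for the file with the best metric.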
Artem Lobantsev 55fdfe3845
Bugfix/fix gan example (#2019)
* 🐛 fixed fake example type assigning and hparams arg

* fixed GAN example to work with dp, ddp., ddp_cpu

* Update generative_adversarial_net.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-31 08:31:21 -04:00
William Falcon 0e37e8c4d2
hotfix to unblock hparams and OmniConf - removes auto_register_init_args by default (#2025)
* ogc install

* cleaned up tests

* hot fix
2020-05-31 08:29:51 -04:00
Jirka Borovec 9893681859
fix changelog (#1864)
* fix chlog

* test for #1729

* hist

* update

* Document use case of passing test dataloaders to Trainer.test() (#1992)

* Issue 1990 Doc patch.

* Codeblock directive.

* Update to reflect current state of pytorch-lightning

* Final grammar cleaning. I hope these commits are squashed.

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: authman <uapatira@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-31 00:48:05 -04:00