Commit Graph

157 Commits

Author SHA1 Message Date
Tullie Murrell 6537642f6a
Remove explicit flush from tensorboard logger (#2126)
* Remove explicit flush from tensorboard logger

* Update changelog
2020-06-09 07:08:12 -04:00
Jirka Borovec d2967d9305
update hparams, allow OmegaConf (#2047)
* DictConf

* inits

* Apply suggestions from code review

Co-authored-by: Omry Yadan <omry@fb.com>

* wip (×10)

* atrib

* wip (×3)

* added hparams test

* wip (×21)

* Update test_hparams.py

* added hparams test (×2)

* pep8 (×3)

* docs

* wip (×2)

* clean

* review @omry

* Update docs/source/hyperparameters.rst

Co-authored-by: Omry Yadan <omry@fb.com>

Co-authored-by: Omry Yadan <omry@fb.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
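
A minimal sketch of what this PR enables, assuming a hypothetical `LitModel`: `hparams` may now be an OmegaConf `DictConfig` rather than only a dict or Namespace.

```python
import pytorch_lightning as pl
from omegaconf import OmegaConf


class LitModel(pl.LightningModule):
    # hypothetical minimal model, for illustration only
    def __init__(self, conf):
        super().__init__()
        self.hparams = conf  # a DictConfig is now accepted here


conf = OmegaConf.create({"lr": 1e-3, "model": {"hidden_dim": 128}})
model = LitModel(conf)
print(model.hparams.model.hidden_dim)  # nested access works: 128
```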
2020-06-08 07:19:34 -04:00
Adrian Wälchli 4234992302
Fix local variables being collected into module_arguments dict (#2048)
* do not include local vars in auto collection

* add test

* add test for model with "self" renamed to "obj"

* skip decorator

* changelog

* changelog

* update docs

* remove obsolete child collection

* generalize **args, **kwargs names

* docs

* also update varargs passed in

* Revert "also update varargs passed in"

This reverts commit 3d7a30dbee07a513ee13e1cc3e08ca5ccdb85734.

* update test
2020-06-04 08:35:50 -04:00
kumuji fd7814d287
Added black formatter for the code with code-checker on pull (#1610)
* black

Added through black.toml; other options are hard so far

No caching for black github action

Moved from black.toml to pyproject.toml

Exclude not only yml but also yaml

Update pyproject.toml

Co-authored-by: Thomas Johansen <thomasjo@gmail.com>

Update .github/workflows/code-formatting-check.yml

mergify

Remove formatting check

Ignore E231 error because of black formatting

Updated CONTRIBUTING to the master

* Update .github/workflows/code-formatting-check.yml

* Bump black to 19.10b0 version

* resolved incorrect merge of CONTRIBUTING,

Black skipping string normalization

* Minor fixes in CONTRIBUTING, two typos

* Update setup.cfg

* chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-03 18:23:14 +02:00
Adrian Wälchli 8211256c46
data transfer model hook (+ refactor) (#1756)
* refactor and added hook


variant a


variant b


add test


revert rename


add changelog


docs

* resolve merge duplication

* overridden typo

* fix test

* tpu id

* raise if TPU not available

* re-use apply_to_collection function for parsing collections

* comment

* make utility function available to user

* documentation

* move changelog entry to top

* fix tpu transfer call

* fix call

* remove hardcoded string

* improve test

* call model hook by default

* Apply suggestions from code review

* rename utility function

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
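
A sketch of the new hook, assuming the `transfer_batch_to_device(batch, device)` signature this PR introduces; useful when a batch contains custom objects the default collection parsing cannot move.

```python
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def transfer_batch_to_device(self, batch, device):
        # hypothetical custom batch: move objects the default logic can't handle
        if isinstance(batch, dict) and "graph" in batch:
            batch["graph"] = batch["graph"].to(device)
            return batch
        # fall back to the default tensor/collection handling
        return super().transfer_batch_to_device(batch, device)
```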
2020-06-02 21:45:19 -04:00
Adrian Wälchli a699003e67
Update/merge multi-gpu docs (#2021)
* merge multi-gpu docs

* extend slurm docs

* update links to elastic

* format docs and type hints in distrib parts

* reference multi-gpu/slurm in trainer args docs

* fix doctest

* typo

* doctest

* Apply suggestions from code review

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>

* wall time

* Update docs/source/slurm.rst

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>

* fix title

* update docs for weights summary

* update changelog

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>
2020-06-02 18:50:08 -04:00
Lezwon Castelino 943c4b20af
slow tpu train (#2033)
* use parallel loader

* Revert "use parallel loader"

This reverts commit ed6e7583

* select tpu id for pl

* condition if tpu_id is None

* added info to changelog

* Revert "condition if tpu_id is None"

This reverts commit 1fb6e586

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 18:48:05 -04:00
Boris Dayma 00f1ac11e6
fix(wandb): use same logger on multiple training loops (#2055)
* fix(wandb): use same logger on multiple training loops

New training loops reset the step to 0, which would previously try to overwrite earlier logs

fix #2015

* docs(changelog.md): add reference to PR 2055
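
A toy sketch of the idea behind the fix (not the actual wandb logger code; names here are hypothetical): keep the step monotonically increasing so a fresh loop starting at step 0 appends to, rather than overwrites, earlier logs.

```python
class MonotonicStepLogger:
    def __init__(self):
        self.global_step = 0

    def log_metrics(self, metrics, step):
        # never move backwards, even if a new training loop resets its step to 0
        self.global_step = max(self.global_step, step)
        print(f"step={self.global_step}: {metrics}")
```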
2020-06-02 18:46:02 -04:00
Fabio Natanael Kepler 8b9b923ca8
Keep track of the best model's path saved by ModelCheckpoint (#1799)
* Add an additional attribute to ModelCheckpoint to keep track of the best model's path

Currently, only the best metric value is directly tracked. This new attribute will help in use cases where the trained model needs to be used or tracked right after training.

* Add small description and usage example to docs

* Fix PEP8 issues

* Fix doctest example

* Fix expected output in doctest

* Apply suggestions from code review

* Show example as code block instead of doctest

* Apply suggestions from code review

* Update CHANGELOG.md

* Rename `ModelCheckpoint.best` to `ModelCheckpoint.best_model_score`

Also rename `ModelCheckpoint.best_model` (added in this PR) to `ModelCheckpoint.best_model_path`, for consistency, and `kth_best_model` to `kth_best_model_path`.

* Update pytorch_lightning/trainer/training_io.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Add warning when loading checkpoint from an old version

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
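
A short usage sketch, using the attribute names settled on by the final rename in this PR:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint_callback = ModelCheckpoint(monitor="val_loss", mode="min")
trainer = Trainer(checkpoint_callback=checkpoint_callback)
# after trainer.fit(model):
print(checkpoint_callback.best_model_score)  # best monitored value
print(checkpoint_callback.best_model_path)   # path of the checkpoint that achieved it
```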
2020-05-31 08:47:13 -04:00
Jirka Borovec 9893681859
fix changelog (#1864)
* fix chlog

* test for #1729

* hist

* update

* Document use case of passing test dataloaders to Trainer.test() (#1992)

* Issue 1990 Doc patch.

* Codeblock directive.

* Update to reflect current state of pytorch-lightning

* Final grammar cleaning. I hope these commits are squashed.

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: authman <uapatira@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-31 00:48:05 -04:00
Justus Schock ceecf1cea9
Graceful shutdown on python interpreter exit (#1631)
* Graceful shutdown on python interpreter exit

* Update CHANGELOG.md

* Update training_loop.py

* Update training_loop.py

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* pep8, move to constant

* Update training_loop.py

* Update training_loop.py

* Update training_loop.py

* pep8, move to constant

* pep8

* timeout

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-29 16:20:04 +02:00
Jirka Borovec 8ee6d91d0e
code guideline (#1949)
* code rule

* Apply suggestions from code review

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>

* chlog

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-05-28 14:40:49 +00:00
Ivan Nazarov 7c19c373ac
LearningRateLogger in multi-scheduler setting (#1944)
* fixed undesired behaviour due to dict.fromkeys

* a test for log length consistency

* runtime-warn if no schedulers are configured

* chlog

* move

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
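
The `dict.fromkeys` pitfall behind this fix, sketched in isolation (hypothetical key names, not the actual logger code): a mutable default passed to `fromkeys` is shared by every key.

```python
# buggy: both keys reference the SAME list object
logs = dict.fromkeys(["lr-Adam", "lr-SGD"], [])
logs["lr-Adam"].append(0.1)
print(logs["lr-SGD"])  # [0.1] -- leaked into the other scheduler's history

# fixed: each key gets its own list
logs = {name: [] for name in ["lr-Adam", "lr-SGD"]}
```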
2020-05-27 22:44:46 -04:00
Mateusz Pieniak 3af4994d5a
Removing unnecessary early stopping calls (#1863)
* Removing unnecessary early stopping calls

* Update CHANGELOG.md

Co-authored-by: Mateusz Pieniak <mateusz.pieniak@evidenceprime.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-26 19:06:06 -04:00
Adrian Wälchli 34237cfcaf
handle unknown args passed to Trainer.from_argparse_args (#1932)
* filter valid args

* error on unknown manual args

* added test

* changelog

* update docs and doctest

* simplify

* doctest

* doctest

* doctest

* better test with mock check for init call

* fstring

* extend test

* skip test on 3.6 not working

Co-authored-by: William Falcon <waf2107@columbia.edu>
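
A minimal sketch of the pattern this PR hardens: extra, non-Trainer entries in the parsed namespace are filtered out instead of crashing the constructor.

```python
from argparse import ArgumentParser

from pytorch_lightning import Trainer

parser = ArgumentParser()
parser = Trainer.add_argparse_args(parser)
parser.add_argument("--my_custom_flag", type=int, default=0)  # not a Trainer arg

args = parser.parse_args([])
trainer = Trainer.from_argparse_args(args)  # unknown namespace entries are ignored
```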
2020-05-25 16:01:29 -04:00
Federico Baldassarre 65b4352930
early stopping checks on_validation_end (#1458)
* Fixes PyTorchLightning/pytorch-lightning#490

`EarlyStopping` should check the metric of interest `on_validation_end` rather than `on_epoch_end`.
In a normal scenario this does not cause a problem, but in combination with `check_val_every_n_epoch>1` in the `Trainer` it results in a warning or a `RuntimeError`, depending on `strict`.

* Highlighted that ES callback runs on val epochs in docstring

* Updated EarlyStopping in rst doc

* Update early_stopping.py

* Update early_stopping.rst

* Update early_stopping.rst

* Update early_stopping.rst

* Update early_stopping.rst

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update docs/source/early_stopping.rst

* fix doctest indentation warning

* Train loop calls early_stop.on_validation_end

* chlog

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
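
The combination described above, sketched as a Trainer config (API of this era, where `early_stop_callback` was still a Trainer argument):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor="val_loss", strict=True)
trainer = Trainer(
    early_stop_callback=early_stop,
    check_val_every_n_epoch=3,  # ES now checks on validation end, every 3rd epoch
)
```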
2020-05-25 17:33:00 +00:00
Adrian Wälchli 8ca8336ce5
protect progress bar callback (#1855)
* wip protected progress bar settings

* remove callback attr from LRfinder

* whitespace

* changelog
2020-05-25 07:49:23 -04:00
Lucas Vazquez 112dd5c4f6
Adds the option of saving the last model on checkpoint (#1908)
* saves model every epoch

* implement test for save_last

* Update CHANGELOG.md

* Update CHANGELOG.md

* changes test description

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>

Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
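
Usage sketch of the new option:

```python
from pytorch_lightning.callbacks import ModelCheckpoint

# besides the usual best-k checkpoints, always keep an up-to-date "last" checkpoint
checkpoint_callback = ModelCheckpoint(save_last=True)
```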
2020-05-25 07:47:44 -04:00
Nicki Skafte a34eb9e169
Fix logger bug and prepare data bug (#1933)
* tests, fix logger bug and prepare data bug

* add CHANGELOG.md

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-05-25 07:43:56 -04:00
William Falcon caa9c6760b
replace Hparams by init args (#1896)
* remove the need for hparams (×4)

* replace self.hparams (×32)

* fixed (×14)

* finished moco

* basic

* testing

* todo

* recurse

* hparams

* persist

* hparams

* chlog

* tests (×6)

* review

* saving

* tests (×3)

* docs

* finished moco

* hparams

* review

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* hparams

* overwrite

* transform (×4)

* cleaning (×2)

* tests

* examples (×3)

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* chp key

* tests

* Apply suggestions from code review

* class

* updated docs (×4)

* save

* wip

* fix

* flake8

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
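
A sketch of the replacement pattern, assuming the `save_hyperparameters()` API this line of work converged on (the exact helper name at this commit may differ):

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    # plain init args instead of a monolithic hparams Namespace
    def __init__(self, learning_rate=1e-3, hidden_dim=128):
        super().__init__()
        self.save_hyperparameters()  # collects init args for checkpointing

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.learning_rate)
```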
2020-05-24 18:59:08 -04:00
Nicki Skafte 8f6b7a2b4f
Fix user warning produced by apex + scheduler combination (#1873)
* fix user error produced by apex + scheduler combination

* add changelog

* added reinit to every configure_apex call

* fix styling

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-05-22 07:19:37 -04:00
Jirka Borovec d610f3bb53
set min PT 1.3 (#1917)
* set min PT 1.3

* circleCI

* mergify

* min

* chlog

* skip
2020-05-22 07:14:08 -04:00
Maxim Grechkin 98f7842970
Allow dataloaders without sampler field present (#1907)
* Allow dataloaders without sampler field present

Sometimes we have a custom dataloader that doesn't have a sampler; it's better to check that the field is there before reading it.

* chlog

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
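
The defensive check, sketched (illustrative, not the exact patch):

```python
def has_distributed_sampler(dataloader):
    # custom dataloaders may expose no `sampler` attribute at all,
    # so read it with getattr instead of assuming the field exists
    sampler = getattr(dataloader, "sampler", None)
    return sampler is not None and type(sampler).__name__ == "DistributedSampler"
```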
2020-05-20 20:57:12 +00:00
Justus Schock 9b629637b8
New metric classes (#1326) (#1877)
* New metric classes (#1326)

* Create metrics package

* Create metric.py

* Create utils.py

* Create __init__.py

* add tests for metric utils

* add docstrings for metrics utils

* add a function to recursively apply another function to a collection (see the sketch at the end of this message)

* add tests for this function

* update test

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update metric name

* remove example docs

* fix tests

* add metric tests

* fix to tensor conversion

* fix apply to collection

* Update CHANGELOG.md

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove tests from init

* add missing type annotations

* rename utils to convertors

* Create metrics.rst

* Update index.rst

* Update index.rst

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/metrics/metric.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/utilities/test_apply_to_collection.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/metrics/convertors.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* add doctest example

* rename file and fix imports

* added parametrized test

* replace lambda with inlined function

* rename apply_to_collection to apply_func

* Separated class description from init args

* Apply suggestions from code review

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* adjust random values

* suppress output when seeding

* remove gpu from doctest

* Add requested changes and add ellipsis for doctest

* forgot to push these files...

* add explicit check for dtype to convert to

* fix ddp tests

* remove explicit ddp destruction

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* move dtype device mixin to more general place

* refactor to general device dtype mixin

* add initial metric package description

* change default to none for mac os

* pep8

* fix import

* Update index.rst

* Update ci-testing.yml

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update CHANGELOG.md

* Update pytorch_lightning/metrics/converters.py

* readme

* Update metric.py

* Update pytorch_lightning/metrics/converters.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
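
A simplified sketch of the recursive-apply utility referenced above (the real one lives in `pytorch_lightning.utilities`; this version is illustrative only):

```python
from collections.abc import Mapping, Sequence


def apply_to_collection(data, dtype, function, *args, **kwargs):
    # apply `function` to every element of type `dtype` in a nested collection
    if isinstance(data, dtype):
        return function(data, *args, **kwargs)
    if isinstance(data, Mapping):
        return {k: apply_to_collection(v, dtype, function, *args, **kwargs)
                for k, v in data.items()}
    if isinstance(data, Sequence) and not isinstance(data, str):
        return type(data)(apply_to_collection(v, dtype, function, *args, **kwargs)
                          for v in data)
    return data


print(apply_to_collection({"a": 1, "b": [2, 3]}, int, float))  # {'a': 1.0, 'b': [2.0, 3.0]}
```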
2020-05-19 11:05:07 -04:00
Rohit Gupta ac76dfcf62
Remove NaNs from loss in LRFinder (#1862)
* Remove NaNs from loss in LRFinder

* np.isfinite

* chlog

* add test

* chlog

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
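
The core idea sketched with numpy: drop non-finite losses before computing the suggestion, so a diverged step can't poison the result.

```python
import numpy as np

losses = np.array([0.9, 0.7, 0.5, np.nan, np.inf])
finite_losses = losses[np.isfinite(losses)]
print(finite_losses)  # [0.9 0.7 0.5]
```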
2020-05-19 08:39:19 +02:00
Lezwon Castelino 7c7e50ca47
Allow user to select individual TPU core to train on (#1729)
* added tpu_id

added tpu_id to mixins

* train on individual tpu

* parallel loader if tpu_id is None

* removed progress_bar_refresh_rate

* chlog

* replaced num_tpu_cores with tpu_cores

* set tpu_id to None if int

* changed num_tpu_cores to tpu_cores in docs

* updated docs

* updated __init__.py
removed self.tpu_id for ParallelLoader

* Update pytorch_lightning/trainer/__init__.py

* check if tpu_cores is a list

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* xla device conditional

* num_tpu_cores deprecation

* removed duplicate warning

* fixed pep8 error

* Revert "removed duplicate warning"

This reverts commit 8adb0a9b

* deprecated api update

* fixed recursion error

* fixed tests

* fixed flake errors

* removed current_tpu_index

* Update CHANGELOG.md

* Update trainer.py

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
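
Usage sketch of the new flag that replaces `num_tpu_cores` (requires a TPU/XLA environment at runtime):

```python
from pytorch_lightning import Trainer

trainer = Trainer(tpu_cores=8)    # train on all 8 cores
trainer = Trainer(tpu_cores=[1])  # train on one specific core
```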
2020-05-17 16:30:54 -04:00
Fabio Natanael Kepler 8c4c7b105e
Fix `save_weights_only` flag in ModelCheckpoint (#1780)
* Add flag to `dump_checkpoint` for only including weights

`ModelCheckpoint` then passes `self.save_weights_only` to the save function.

* Fix tests and add changelog entry

* Add check and descriptive message when training state is restored from a weights only checkpoint

Also add a test for making sure `ModelCheckpoint.save_weights_only` works as expected.

* Fix weights-only test to properly match expected exception

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
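
Usage sketch: keep only model weights, not optimizer/trainer state.

```python
from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint_callback = ModelCheckpoint(save_weights_only=True)
# restoring full training state from such a checkpoint now fails with a
# descriptive message instead of failing obscurely
```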
2020-05-17 09:24:17 -04:00
Adrian Wälchli 769a459d27
remove extra kwargs from Trainer init (#1820)
* remove kwargs

* remove useless test

* rename unknown trainer flag

* trainer inheritance and test

* blank line

* test for unknown arg

* changelog
2020-05-17 09:14:54 -04:00
Jirka Borovec 692f302837
continue devel (#1793)
* miss

* miss

* miss

* update

* format
2020-05-17 08:30:45 -04:00
Jirka Borovec e95e1d71c7
release 0.7.6 (#1813)
* release 0.7.6rc2

* release 0.7.6

* include img

* smaller image

* missing

* miss

* miss

* miss

* up
2020-05-15 08:36:40 -04:00
Justus Schock c05077fae3
Enable non-blocking for gpu device transfer (#1843)
* Update distrib_parts.py

* Update CHANGELOG.md
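
The underlying PyTorch idiom, sketched: with pinned host memory, a non-blocking copy can overlap the transfer with subsequent CPU work.

```python
import torch

if torch.cuda.is_available():
    batch = torch.randn(32, 128).pin_memory()
    batch = batch.to("cuda", non_blocking=True)
```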
2020-05-14 17:56:40 -04:00
Nicki Skafte 663b90035c
Bugfix: accumulation and suggestion for learning rate finder (#1801)
* fix suggestion being too naive

* fix accumulation error and added new tests

* fix styling

* update CHANGELOG.md

* update based on review

* fix tests

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-13 14:40:44 -04:00
Ashwin Bharambe aefc5314bc
[ddp] Support multi-node distributed execution under torchelastic (#1811)
The changes are quite local and limited in nature -- viz., checking for
some indicator environment variables. We check for (SLURM_LOCALID,
NODE_RANK, GROUP_RANK) in order. If more than one is set, a warning is logged.

This patch also fixes a minor bug when comparing the `WORLD_SIZE`
environment variable, which can be a string.
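
A sketch of the detection order described above (illustrative, not the exact patch):

```python
import os
import warnings

rank_keys = ("SLURM_LOCALID", "NODE_RANK", "GROUP_RANK")
found = [k for k in rank_keys if k in os.environ]
if len(found) > 1:
    warnings.warn(f"multiple rank indicators set: {found}; using {found[0]}")
rank = int(os.environ[found[0]]) if found else 0

# the WORLD_SIZE fix: env values are strings, so cast before comparing
world_size = int(os.environ.get("WORLD_SIZE", "1"))
```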
2020-05-13 14:06:59 -04:00
So Uchida 22d7d03118
Replace meta_tags.csv with hparams.yaml (#1271)
* Add support for hierarchical dict

* Support nested Namespace

* Add docstring

* Migrate hparam flattening to each logger

* Modify URLs in CHANGELOG

* typo

* Simplify the conditional branch about Namespace

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* added examples section to docstring

* renamed _dict -> input_dict

* meta_tags.csv -> hparams.yaml

* code style fixes

* add pyyaml

* remove unused import

* create the member NAME_HPARAMS_FILE

* improve tests

* Update tensorboard.py

* pass the local test w/o the relevant parts of Horovod

* formatting

* update dependencies

* fix dependencies

* Apply suggestions from code review

* add savings

* warn

* docstrings

* tests

* Apply suggestions from code review

* saving

* Apply suggestions from code review

* use default

* remove logging

* typo fixes

* update docs

* update CHANGELOG

* clean imports

* add blank lines

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* back to namespace

* add docs

* test fix

* update dependencies

* add space

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-13 15:05:15 +02:00
William Falcon 35fe2efe27
added override for hparams in load_from_ckpt (#1797)
* added override for hparams in load_from_ckpt

* override hparams

* override hparams

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* update doctest

* typo

* chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-13 10:27:22 +02:00
Oliver Neumann 9059d21042
Missing profiler attribute in add_argparse_args() ArgumentParser (#1794)
* Fixed the typing annotation by adding a boolean type, so that the profiler flag is added to argparse.

* Updated CHANGELOG.md

* Updated git_init_arguments_and_types() to pass doctests.

* Added doctest example to add_argparse_parser()
2020-05-12 08:53:26 -04:00
kumuji 619f984c36
Option to provide seed to random generators to ensure reproducibility (#1572)
* Option to provide seed to random generators to ensure reproducibility

I added a small function in utilities which imports torch, numpy, and python
random, and sets the seed for all of these libraries to ensure reproducibility
of results.

* Apply recommendations from core contributors on seeding

1. Moved the seeding code to another file
2. Make deterministic as a parameter for trainer class
3. Add assertions for seeding numpy
4. Added warnings
5. torch.manual_seed should be enough for seeding torch

* Revert "Apply recommendations from core contributors on seeding"

This reverts commit a213c8e6882eec8a9e7408b9418926d2db7c5461.

* Revert "Revert "Apply recommendations from core contributors on seeding""

This reverts commit 59b2da53c62878de7aab0aa3feb3115e105eea06.

* Change in test, for correct seeding

* Allow seed equal to 0

* Allow seed to be uint32.max

* Added deterministic to benchmarks

* Cuda manual seed as in benchmark seeding

* Seeding should be done before model initialization

* cuda manual_seed is not necessary

* Fixing seed test_cpu_lbfgs

On some seeds it seems like lbfgs doesn't converge,
so I fixed the seed during testing.

* rebasing issue with old reproducibility.py

* Improved documentation and the ability to seed before initializing the Trainer
class

* Change in docs

* Removed seed from trainer, update for documentation

* Typo in the docs

* Added seed_everything to _all_

* Fixing old changes

* Model initialization should be earlier than the Trainer

* Update pytorch_lightning/trainer/__init__.py

From Example to testcode

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Fixing according to the contributors suggestions

* Moving horovod deterministic to Trainer class

* deterministic flag affects horovod docs update

* Improved static typing

* Added deterministic to test runners of horovod

It fails on some versions and is not very predictable

* static seeds for horovod tests

* Change for reset_seed function in tests

* Seeding horovod using reset_seed from tutils

* Update pytorch_lightning/trainer/__init__.py

* chlog

* Update trainer.py

* change "testcode" to "Example" in trainer init documentation

* Update pytorch_lightning/trainer/seed.py, first line in comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: William Falcon <waf2107@columbia.edu>
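
A sketch of what a `seed_everything`-style helper does, per the description above:

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int = 0) -> int:
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # per the notes above, cuda.manual_seed is not needed on top
    return seed
```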
2020-05-12 07:53:20 -04:00
Jirka Borovec 9d2df24d6b
RC & Docs/changelog (#1776)
* missing

* RC

* tol

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* test

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-11 21:57:53 -04:00
Fabio Natanael Kepler d120f97896
Fix saving native AMP scaler state (#1777)
Saving was introduced in #1561.
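
The underlying idiom, sketched: the native AMP `GradScaler` carries state that must round-trip through checkpoints (the checkpoint key shown is an assumption for illustration).

```python
import torch

scaler = torch.cuda.amp.GradScaler()  # requires PyTorch with native AMP
checkpoint = {"native_amp_scaling_state": scaler.state_dict()}  # hypothetical key
scaler.load_state_dict(checkpoint["native_amp_scaling_state"])
```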
2020-05-11 21:38:37 -04:00
Rohit Gupta d962ab5d89
Fix lr key name in case of param groups (#1719)
* Fix lr key name in case of param groups

* Add tests

* Update test and added configure_optimizers__param_groups

* Update CHANGELOG
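
A sketch of the scenario being fixed: one optimizer, several param groups, so the logger must emit a distinct lr key per group (the key format in the comment is an assumption for illustration).

```python
import torch
import torch.nn as nn

backbone, head = nn.Linear(10, 10), nn.Linear(10, 2)
optimizer = torch.optim.SGD(
    [
        {"params": backbone.parameters(), "lr": 1e-4},
        {"params": head.parameters(), "lr": 1e-2},
    ],
    momentum=0.9,
)
# each group now gets its own key, e.g. lr-SGD/pg1 and lr-SGD/pg2
```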
2020-05-10 17:05:34 -04:00
Nicki Skafte 4970927ec8
Feature: auto scale batch size (#1638)
* auto batch finder

* fix styling

* add description

* add different modes

* fix copy paste error

* better organised code

* fix styling

* add tests

* fix

* fix

* add some documentation

* added CHANGELOG.md

* some documentation

* update based on review

* Update trainer.py

* Update docs/source/training_tricks.rst

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/test_trainer_tricks.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/test_trainer_tricks.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* use EvalModelTemplate

* param tests

* rename

* wrap params

* rename function

* rename

* rename param

* fix

* abs

* rename

* refactor code

* add docs

* try

* arg

* loop

* except

* loop

* drop bool

* docs

* docs

* added check and test for passing dataloader to fit

* styling fix

* update based on review

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
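
Usage sketch of the feature with the modes mentioned above ('power' grows the batch size until OOM, 'binsearch' then refines; the model is assumed to expose a `batch_size` attribute):

```python
from pytorch_lightning import Trainer

trainer = Trainer(auto_scale_batch_size="binsearch")
# trainer.fit(model) first tunes model.batch_size, then trains with it
```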
2020-05-09 08:28:36 -04:00
Adrian Wälchli 25bbd059df
Also update progress_bar in training_epoch_end (#1724)
* update prog. bar metrics on train epoch end

* changelog

* wip test

* more thorough testing

* comments

* update docs

* move test

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-08 23:31:56 -04:00
Peter Yu 851866333c
Attach version_ to checkpoint path only if version is int (#1748) 2020-05-06 12:38:32 -04:00
Travis Addair f90afa29b8
Fix disabling progress bar on non-zero ranks using Horovod backend (#1709)
* Fix Horovod backend to disable progress bar on all ranks except 0

* Add join barriers

* Added changelog

* Make protected and add verbosity

* Refactor to disable progress bar callback in train

* Removed verbose setting

* Add cache check for Horovod

* Test run again

* Updated comment

* Always skip cache for Horovod

* Only reinstall when necessary

* Added separate step

* Fixed spacing

* Skip Python 3.8
2020-05-04 13:02:57 -04:00
Nicki Skafte e865b046b1
Bugfix/lr finder (#1676)
* fix early stopping bug

* allow val dataloader

* update CHANGELOG.md

* fix early stopping bug

* allow val dataloader

* update CHANGELOG.md

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-05-04 11:38:51 -04:00
Adrian Wälchli d28b145393
Update type hints for multiple dataloaders in .fit() and .test() (#1723)
* update typehints

* change log
2020-05-04 08:24:34 -04:00
Adrian Wälchli e6b34ef90d
[WIP] Reduction when batch size < num gpus (#1609)
* reduce if <= num_gpus

* add test with explanation

* chlog

* fix changelog

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-05-02 11:01:44 -04:00
Jean-Baptiste SCHIRATTI fafe5d63a7
Transfer learning example (#1564)
* Fine tuning example.

* Fix (in train method) + Borda's comments (added argparse + fixed docstrings).

* Updated CHANGELOG.md

* Fix + updated docstring.

* Fixes (awaelchli's comments) + docstrings.

* Fix train/val loss.

* Fix.
2020-05-02 09:08:46 -04:00
Oliver Neumann 152a2eb30c
wandb logger 'global_step' affects other logger (#1492)
* Removed unnecessary 'global_step' from wandb logger.

* Fixed wrong step implementation in wandb and missing metric skipping in logger base.

* simplified metric check in base logger

* Added Fix Description in CHANGELOG.md

* Updated wandb logger tests.

* update test, step=3

* Moved Fix Description in CHANGELOG.md to unreleased.

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-02 08:50:47 -04:00
Dmitry Lipin 210cd657dd
fix LightningTemplateModel (#1577)
* fix LightningTemplateModel

* update CHANGELOG.md

* update LightningTemplate

* update changelog

* update changelog

* loss fix
2020-05-02 08:41:37 -04:00