2355 Commits

Author SHA1 Message Date
Oliver Neumann 9059d21042
Missing profiler attribute in add_argparse_args() ArgumentParser (#1794)
* Fixed the typing annotation by adding the boolean type, so that the profiler flag is added to argparse.

* Updated CHANGELOG.md

* Updated get_init_arguments_and_types() to pass doctests.

* Added doctest example to add_argparse_args()
2020-05-12 08:53:26 -04:00
William Falcon c52382f547
Update README.md 2020-05-12 08:52:43 -04:00
William Falcon 8584df54e9
Update README.md 2020-05-12 08:52:11 -04:00
William Falcon a5c19ea784
Update README.md 2020-05-12 08:49:29 -04:00
William Falcon 423b82ea6c
Update README.md 2020-05-12 08:46:55 -04:00
William Falcon 39584d08ad
Update README.md 2020-05-12 08:46:22 -04:00
kumuji 619f984c36
Option to provide seed to random generators to ensure reproducibility (#1572)
* Option to provide seed to random generators to ensure reproducibility

I added a small function in utilities that imports torch, numpy, and
Python's random and sets the seed for all of these libraries to ensure
reproducibility of results.

* Apply recommendations from core contributors on seeding

1. Moved the seeding code to another file
2. Make deterministic as a parameter for trainer class
3. Add assertions for seeding numpy
4. Added warnings
5. torch.manual_seed should be enough for seeding torch

* Revert "Apply recommendations from core contributors on seeding"

This reverts commit a213c8e6882eec8a9e7408b9418926d2db7c5461.

* Revert "Revert "Apply recommendations from core contributors on seeding""

This reverts commit 59b2da53c62878de7aab0aa3feb3115e105eea06.

* Change in test, for correct seeding

* Allow seed equal to 0

* Allow seed to be uint32.max

* Added deterministic to benchmarks

* Cuda manual seed as in benchmark seeding

* Seeding should be done before model initialization

* cuda manual_seed is not necessary

* Fixing seed test_cpu_lbfgs

On some seeds lbfgs does not seem to converge,
so I fixed the seed during testing.

* rebasing issue with old reproducibility.py

* Improved documentation and ability to seed before initializing Train
class

* Change in docs

* Removed seed from trainer, update for documentation

* Typo in the docs

* Added seed_everything to __all__

* Fixing old changes

* Model initialization should be earlier than Trainer

* Update pytorch_lightning/trainer/__init__.py

From Example to testcode

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Fixing according to the contributors suggestions

* Moving horovod deterministic to Trainer class

* deterministic flag affects horovod docs update

* Improved static typing

* Added deterministic to test runners of horovod

It is failing on some versions, not very predictable

* static seeds for horovod tests

* Change for reset_seed function in tests

* Seeding horovod using reset_seed from tutils

* Update pytorch_lightning/trainer/__init__.py

* chlog

* Update trainer.py

* change "testcode" to "Example" in trainer init documentation

* Update pytorch_lightning/trainer/seed.py, first line in comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-12 07:53:20 -04:00
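
The seeding commit above describes one function that seeds Python's random, numpy, and torch, and that accepts seeds from 0 up to the uint32 maximum. A minimal sketch of that idea follows. The name `seed_all` and the constants are hypothetical and are not taken from the repository, and the optional imports are an assumption made so the sketch runs without numpy or torch installed.

```python
import random

# Assumed bounds, based on the "Allow seed equal to 0" and
# "Allow seed to be uint32.max" bullets in the commit message.
MIN_SEED = 0
MAX_SEED = 2**32 - 1


def seed_all(seed: int) -> int:
    """Hypothetical helper: seed every available random number generator."""
    if not MIN_SEED <= seed <= MAX_SEED:
        raise ValueError(f"seed must be in [{MIN_SEED}, {MAX_SEED}], got {seed}")
    random.seed(seed)
    try:
        import numpy as np  # optional: seeded only when installed
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional: seeded only when installed
        torch.manual_seed(seed)
    except ImportError:
        pass
    return seed


# Seeding twice with the same value reproduces the same draws.
seed_all(0)
first = [random.random() for _ in range(3)]
seed_all(0)
second = [random.random() for _ in range(3)]
```

As the commit notes, seeding should happen before the model is initialized, since weight initialization itself consumes random numbers.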
William Falcon 7af4505519
Update README.md 2020-05-12 07:50:23 -04:00
William Falcon a4fc4ffa6e
Update README.md 2020-05-12 07:49:17 -04:00
William Falcon 6216501455
Update README.md 2020-05-12 07:47:57 -04:00
William Falcon 6517d1cf5c
Add files via upload 2020-05-12 07:46:55 -04:00
Justus Schock 5f292390fd
Bug fix hparam logging with metrics (#1647)
* add metric logging

* Use pytorch built-in method

* Update tensorboard.py

* Update tensorboard.py
2020-05-12 07:25:12 -04:00
Jirka Borovec 35ac30e688
Fix build Docker releases (#1783)
* gh act - if

* gh act - if

* gh act - steps

* gh act - steps

* gh act - steps

* name

* name

* reorder

* docker

* timeout

* repo

* show

* show

* ver

* rc

* tag
2020-05-12 06:54:59 -04:00
William Falcon 10b16dbfab
made ddp the default if no backend specified with multiple GPUs (#1789)
* made ddp the default if no backend specified with multiple GPUs

* fix

* spawn

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-12 06:54:23 -04:00
Travis Addair acab068c74
Join Horovod workers at the end of trainer.fit() to prevent race conditions following training (#1786)
* Join Horovod workers at the end of trainer.fit() to prevent race conditions following training

* flake8

* flake8

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-12 09:15:25 +00:00
William Falcon 7b60d49432
fixed native amp + ddp (#1788)
* fixed native amp + ddp

* fixed native amp + ddp
2020-05-12 00:25:06 -04:00
Jeremy Jordan 1df0d2dc97
set logger level for package (#1718)
* move logging config to trainer class init

* alternate logging config
2020-05-12 00:14:35 -04:00
William Falcon 4b30ef6480
Device (#1790)
* added self.device

* added docs
2020-05-12 00:09:48 -04:00
Kevin Chen de1fdd8d3b
Removed test_dataloader call in check_testing_model_configuration (#1670)
* Removed test_dataloader call

* Check if test_dataloader is actually overridden

* Fixed method spelling

* Replaced lambdas

* Replaced None with super method

* Fixed testpass
2020-05-12 00:08:07 -04:00
William Falcon 5bb6b41b78
dataloaders with fast_dev_run (#1787)
* dataloaders with fast_dev_run

* dataloaders with fast_dev_run

* dataloaders with fast_dev_run

* fix

* pep 8
2020-05-11 23:32:44 -04:00
Jirka Borovec 9d2df24d6b
RC & Docs/changelog (#1776)
* missing

* RC

* tol

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* test

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-11 21:57:53 -04:00
Fabio Natanael Kepler d120f97896
Fix saving native AMP scaler state (#1777)
Saving was introduced in #1561.
2020-05-11 21:38:37 -04:00
William Falcon eeb411144f
enable fast_dev_run without a validation loop (#1779)
* fix val dataloader

* Update evaluation_loop.py
2020-05-11 11:30:22 -04:00
William Falcon 88c086bbd2
Update progress.py 2020-05-11 09:48:15 -04:00
William Falcon 15c11fc848
fixes no val loader 2020-05-11 09:47:33 -04:00
William Falcon d9bc8a978a
Update README.md 2020-05-10 17:09:09 -04:00
Rohit Gupta d962ab5d89
Fix lr key name in case of param groups (#1719)
* Fix lr key name in case of param groups

* Add tests

* Update test and added configure_optimizers__param_groups

* Update CHANGELOG
2020-05-10 17:05:34 -04:00
Justus Schock 7f64ad7a33
Fix Docker Pipeline (#1765)
* Update and rename docker_builds.yml to docker_nightly_builds.yml

* Update and rename docker_nightly_builds.yml to docker_builds.yml

* Update docker_builds.yml

* Update .github/workflows/docker_builds.yml

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-10 17:04:51 -04:00
Piotr Łusakowski 0cb6767465
Fix NeptuneLogger to work in ddp mode (#1753) 2020-05-10 13:19:18 -04:00
Alexander Kreuzer ee17c7c9c8
Fixed error message and test docstring (#1698)
training_dataloader -> train_dataloader

Co-authored-by: Alexander Kreuzer <alexander.kreuzer@sap.com>
2020-05-10 13:16:16 -04:00
Anthony Bisulco 76af84718a
Group argument wandb (#1760)
* group argument wandb

* formatting fix
2020-05-10 13:15:51 -04:00
Jirka Borovec 134eb61e1a
Tests: refactor cleanup (#1744)
* wip

* cleaning

* optim imports

* -

* default hparams

* fix restore

* fix imports
2020-05-10 13:15:28 -04:00
Nicki Skafte 4970927ec8
Feature: auto scale batch size (#1638)
* auto batch finder

* fix styling

* add description

* add different modes

* fix copy paste error

* better organised code

* fix styling

* add tests

* fix

* fix

* add some documentation

* added CHANGELOG.md

* some documentation

* update based on review

* Update trainer.py

* Update docs/source/training_tricks.rst

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/trainer/test_trainer_tricks.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/trainer/test_trainer_tricks.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* use EvalModelTemplate

* param tests

* rename

* wrap params

* rename function

* rename

* rename param

* fix

* abs

* rename

* refactor code

* add docs

* try

* arg

* loop

* except

* loop

* drop bool

* docs

* docs

* added check and test for passing dataloader to fit

* styling fix

* update based on review

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-09 08:28:36 -04:00
Adrian Wälchli 25bbd059df
Also update progress_bar in training_epoch_end (#1724)
* update prog. bar metrics on train epoch end

* changelog

* wip test

* more thorough testing

* comments

* update docs

* move test

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-08 23:31:56 -04:00
Yuri Brovman 3a642601e8
added warning for None dataloader (#1745)
* added warning for None dataloader

* fixed variable style

* updated warning message

* remove unused import

Co-authored-by: ybrovman <ybrovman@ebay.com>
2020-05-07 09:26:41 -04:00
Shunta Komatsu f656882942
Fix typo (#1750) 2020-05-07 09:25:54 -04:00
Pavel Grunt b9364f96b1
lr_finder: Fix typo in docstring (#1746) 2020-05-06 12:39:22 -04:00
Peter Yu 851866333c
Attach version_ to checkpoint path only if version is int (#1748) 2020-05-06 12:38:32 -04:00
Adrian Wälchli 0cb58fbb4c
Mock packages for RTD docs build (follow up to doctests) (#1739)
* mock all packages on RTD

* update
2020-05-05 16:48:45 -04:00
Yuri Brovman 35bbe178bd
fix _reset_eval_dataloader() for IterableDataset (#1560)
* removed if dl from _reset_eval_dataloader()

* changed to if dl != None to be more safe

* hints from pep8speaks

Co-authored-by: ybrovman <ybrovman@ebay.com>
2020-05-05 14:09:48 -04:00
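
The IterableDataset fix above replaces a bare `if dl` check with a comparison against None. One plausible reason, stated here as an assumption rather than as the author's explanation, is that Python falls back to `__len__` when an object defines no `__bool__`, and a loader over an iterable-style dataset may have no usable length. The class below is a hypothetical stand-in, not a real DataLoader.

```python
class UnsizedLoader:
    """Hypothetical stand-in for a loader over an iterable-style dataset."""

    def __iter__(self):
        return iter([1, 2, 3])

    def __len__(self):
        # Mimics a loader whose length cannot be determined.
        raise TypeError("this loader has no length")


dl = UnsizedLoader()

# Truth testing calls __len__ here because no __bool__ is defined,
# so the TypeError propagates out of the check.
try:
    bool(dl)
    truthiness_ok = True
except TypeError:
    truthiness_ok = False

# An identity check against None never touches __len__.
is_present = dl is not None
```

The sketch uses `is not None`, the idiomatic spelling of the `!= None` comparison mentioned in the commit.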
Jeremy Jordan fc7f5919b5
improve pickle tests for callbacks (#1717)
* improve pickle tests for callbacks

* set mode dict as a class attr
2020-05-05 14:08:54 -04:00
Adrian Wälchli 2b03d34931
complete test (#1705) 2020-05-05 14:08:15 -04:00
Tian Wang d6a0375974
Fixing logic (#1734) 2020-05-05 14:07:26 -04:00
Jirka Borovec 2a2f303ae9
Tests: refactor trainer dataloaders (#1690)
* refactor default model

* drop redundant seeds

* refactor dataloaders tests

* fix multiple

* fix conf

* flake8

* Apply suggestions from code review

Co-authored-by: William Falcon <waf2107@columbia.edu>

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-05 12:31:15 -04:00
Adrian Wälchli a6de1b8d75
doctest for .rst files (#1511)
* add doctest to circleci

* Revert "add doctest to circleci"

This reverts commit c45b34ea911a81f87989f6c3a832b1e8d8c471c6.

* Revert "Revert "add doctest to circleci""

This reverts commit 41fca97fdcfe1cf4f6bdb3bbba75d25fa3b11f70.

* doctest docs rst files

* Revert "doctest docs rst files"

This reverts commit b4a2e83e3da5ed1909de500ec14b6b614527c07f.

* doctest only rst

* doctest debugging.rst

* doctest apex

* doctest callbacks

* doctest early stopping

* doctest for child modules

* doctest experiment reporting

* indentation

* doctest fast training

* doctest for hyperparams

* doctests for lr_finder

* doctests multi-gpu

* more doctest

* make doctest drone

* fix label build error

* update fast training

* update invalid imports

* fix problem with int device count

* rebase stuff

* wip

* wip

* wip

* intro guide

* add missing code block

* circleci

* logger import for doctest

* test if doctest runs on drone

* fix mnist download

* also run install deps for building docs

* install cmake

* try sudo

* hide output

* try pip stuff

* try to mock horovod

* Tranfer -> Transfer

* add torchvision to extras

* revert pip stuff

* mlflow file location

* do not mock torch

* torchvision

* drone extra req.

* try higher sphinx version

* Revert "try higher sphinx version"

This reverts commit 490ac28e46d6fd52352640dfdf0d765befa56988.

* try coverage command

* try coverage command

* try undoc flag

* newline

* undo drone

* report coverage

* review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* remove torchvision from extras

* skip tests only if torchvision not available

* fix testoutput torchvision

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-04 22:16:54 -04:00
Adrian Wälchli 48e808c20e
Move generated RST files to subfolder (#1555)
* move generated files to subfolder

* remove if exists

* reformat argv

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update rebase

* rebase yml

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-05-04 16:53:06 -04:00
Jirka Borovec 043ae697c2
Tests: refactor callbacks (#1688)
* refactor default model

* drop redundant seeds

* path

* refactor callback tests

* update

* fix sch

* wip

* fix return

* review
2020-05-04 16:52:22 -04:00
Jirka Borovec 6d58fb1353
Tests: refactor trainer (#1728)
* lr

* optim

* wip

* wip

* fix mean

* flake8
2020-05-04 16:51:39 -04:00
Travis Addair f90afa29b8
Fix disabling progress bar on non-zero ranks using Horovod backend (#1709)
* Fix Horovod backend to disable progress bar on all ranks except 0

* Add join barriers

* Added changelog

* Make protected and add verbosity

* Refactor to disable progress bar callback in train

* Removed verbose setting

* Add cache check for Horovod

* Test run again

* Updated comment

* Always skip cache for Horovod

* Only reinstall when necessary

* Added separate step

* Fixed spacing

* Skip Python 3.8
2020-05-04 13:02:57 -04:00
Ryan Henderson 1a9f1c80a1
Fix example argument parser in docs (#1692)
[`parser.parse_known_args()`](https://docs.python.org/3.7/library/argparse.html#argparse.ArgumentParser.parse_known_args) actually returns a tuple of the Namespace of known args and a list of unknown args. We only want the former.
2020-05-04 11:40:50 -04:00
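
The last commit message points out that `parse_known_args()` returns a tuple rather than a bare Namespace. A small sketch of that behaviour, using only the standard library, is shown here. The `--max_epochs` and `--extra` flags are made up for illustration and do not come from the docs being fixed.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--max_epochs", type=int, default=1)  # hypothetical flag

# parse_known_args returns (Namespace, list_of_unrecognized_strings).
known, unknown = parser.parse_known_args(["--max_epochs", "5", "--extra", "x"])

# Only the Namespace carries the parsed values.
epochs = known.max_epochs
```

Unpacking the tuple, or indexing it with `[0]`, yields the Namespace; passing the whole tuple on as if it were the parsed arguments is the mistake the commit corrects.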