Commit Graph

2396 Commits

Author SHA1 Message Date
Jirka Borovec a153fe4c2a
fix codecov reports (#1867)
* fix codecov

* upgrade codecov

* upgrade codecov
2020-05-18 20:34:59 -04:00
Ashraful Islam e0a5aee3a3
fix progressbar postfix order (#1874) 2020-05-18 20:33:51 -04:00
Ashraful Islam 981169cacc
add warning for shuffling in test/val (#1865) 2020-05-18 09:53:02 -04:00
Lezwon Castelino 7c7e50ca47
Allow user to select individual TPU core to train on (#1729)
* added tpu_id

added tpu_id to mixins

* train on individual tpu

* parallel loader if tpu_id is None

* removed progress_bar_refresh_rate

* chlog

* replaced num_tpu_cores with tpu_cores

* set tpu_id to None if int

* changed num_tpu_cores to tpu_cores in docs

* updated docs

* updated __init__.py
removed self.tpu_id for ParallelLoader

* Update pytorch_lightning/trainer/__init__.py

* check if tpu_cores is a list

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* xla device conditional

* num_tpu_cores deprecation

* removed duplicate warning

* fixed pep8 error

* Revert "removed duplicate warning"

This reverts commit 8adb0a9b

* deprecated api update

* fixed recursion error

* fixed tests

* fixed flake errors

* removed current_tpu_index

* Update CHANGELOG.md

* Update trainer.py

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-17 16:30:54 -04:00
Victor Quach 1a797bdad5
add test for trainer.test() (#1858)
* fix trainer.test()

* Update trainer.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-17 16:30:20 -04:00
William Falcon d7f9c03663
Update README.md 2020-05-17 12:02:09 -04:00
Subodh Dahal 6dc381a806
Fixed the default value of auto_lr_find in docs (#1854)
* Fixed the default value of auto_lr_find in docs

* Update lr_finder.rst

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-17 10:59:46 -04:00
Andrey 76f905f902
Adding a new section to the docs: Example Lightning Project Structures (#1851)
* Adding a new section to the docs: Example Lightning Project Structures

* Update index.rst

* Update index.rst

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-17 10:05:11 -04:00
Fabio Natanael Kepler 8c4c7b105e
Fix `save_weights_only` flag in ModelCheckpoint (#1780)
* Add flag to `dump_checkpoint` for only including weights

`ModelCheckpoint` then passes `self.save_weights_only` to the save function.

* Fix tests and add changelog entry

* Add check and descriptive message when training state is restored from a weights only checkpoint

Also add a test for making sure `ModelCheckpoint.save_weights_only` works as expected.

* Fix weights-only test to properly match expected exception

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-17 09:24:17 -04:00
Adrian Wälchli 769a459d27
remove extra kwargs from Trainer init (#1820)
* remove kwargs

* remove useless test

* rename unknown trainer flag

* trainer inheritance and test

* blank line

* test for unknown arg

* changelog
2020-05-17 09:14:54 -04:00
Jirka Borovec 692f302837
continue devel (#1793)
* miss

* miss

* miss

* update

* format
2020-05-17 08:30:45 -04:00
Rohit Gupta 56d521a317
Fix test configuration check and testing (#1804)
* Fix test configuration check and testing

* Fix test configuration check and testing

* Remove check_testing_configuration during test

* Fix docstring

* fix function name

* remove conflicts
2020-05-17 08:22:44 -04:00
Adrian Wälchli 4cdebf9a64
remove obsolete self._device in Trainer (#1849)
* remove unused device attribute

* dtype

* move on_gpu to model
2020-05-17 08:20:51 -04:00
William Falcon b84b02400a
enable any dict and namespace in hparams (#1847) 2020-05-15 15:08:16 -04:00
Jirka Borovec e95e1d71c7
release 0.7.6 (#1813)
* release 0.7.6rc2

* release 0.7.6

* include img

* smaller image

* missing

* miss

* miss

* miss

* up
2020-05-15 08:36:40 -04:00
William Falcon c8c5d33208
Update __init__.py 2020-05-14 18:44:46 -04:00
Justus Schock c05077fae3
Enable non-blocking for gpu device transfer (#1843)
* Update distrib_parts.py

* Update CHANGELOG.md
2020-05-14 17:56:40 -04:00
Jirka Borovec bee0392c37
extend arg parser (#1842)
* extend arg parser

* flake8

* tests

* example

* fix test
2020-05-14 17:56:11 -04:00
Peter Yu a6f6edd07d
Update args, kwargs doc for load_from_checkpoint() (#1839) 2020-05-14 15:43:47 -04:00
Jirka Borovec 236c1378f9
docs dpp warn (#1835)
* add warn

* Apply suggestions from code review
2020-05-14 11:06:03 -04:00
Nicki Skafte 88f816ed06
dummy logger (#1836)
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-05-14 10:34:11 -04:00
Jirka Borovec 1c10560531
Fix failing docs (#1821)
* missing pkg

* update CI

* strict RTD

* strict RTD

* make

* missing

* ignore

* ignore

* mock

* typo
2020-05-14 08:25:06 -04:00
Nand Dalal cf2d32d0a6
fix bugs in semantic segmentation example (#1824)
* Update unet.py

* Update semantic_segmentation.py
2020-05-14 02:36:45 -04:00
William Falcon 1265b2fe02
Update __init__.py 2020-05-13 19:51:41 -04:00
William Falcon 53d9316a56
fixes ddp bugs (#1819)
* debug
2020-05-13 19:17:04 -04:00
William Falcon 648d516668
Use store_true for bool args (#1822)
*  Use store_true for bool args

* debug

Co-authored-by: Nate Raw <nxr9266@g.rit.edu>
2020-05-13 19:12:06 -04:00
Peter Yu e961f7e344
args should come after the last positional argument (#1807) 2020-05-13 17:29:54 -04:00
Tullie Murrell fddd618915
Add ElasticTraining documentation (#1818) 2020-05-13 17:23:53 -04:00
Ashwin Bharambe 0e71705a0a
[checkpoint logic] Fix bug which doesn't account for NoneType for `model.hparams` (#1817)
The intention of the code is to output a warning message when `hparams`
is null or not set. Instead the code now fatals when
`model.hparams = None`. Prevent that.
2020-05-13 17:14:11 -04:00
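The guard described in #1817 above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `hparams_for_checkpoint` and the warning text are assumptions, and only the behaviour "warn instead of failing when `model.hparams` is `None` or unset" is taken from the message.

```python
import warnings
from types import SimpleNamespace


def hparams_for_checkpoint(model):
    """Return hparams as a plain dict, or None (with a warning) when unset.

    Hypothetical helper: treats a missing attribute and an explicit None
    the same way, so neither case raises.
    """
    hparams = getattr(model, "hparams", None)
    if hparams is None:
        warnings.warn("model.hparams is not set; hyperparameters will not be saved")
        return None
    # Accept either a dict or a Namespace-like object.
    return dict(hparams) if isinstance(hparams, dict) else dict(vars(hparams))


# Usage: a stand-in model whose hparams were explicitly set to None.
model = SimpleNamespace(hparams=None)
with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    result = hparams_for_checkpoint(model)  # warns, does not raise
```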
William Falcon 12138ced7c
Update __init__.py 2020-05-13 14:42:50 -04:00
Nicki Skafte 663b90035c
Bugfix: accumulation and suggestion for learning rate finder (#1801)
* fix suggestion being too naive

* fix accumulation error and added new tests

* fix styling

* update CHANGELOG.md

* update based on review

* fix tests

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-13 14:40:44 -04:00
Ashwin Bharambe aefc5314bc
[ddp] Support multi-node distributed execution under torchelastic (#1811)
The changes are quite local and limited in nature -- viz., checking for
some indicator environment variables. We check for (SLURM_LOCALID,
NODE_RANK, GROUP_RANK) in order. If multiple are found set, a warning is
logged.

This patch also fixes a minor bug with comparing the `WORLD_SIZE`
environment variable. This can be a string type.
2020-05-13 14:06:59 -04:00
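The lookup order described in #1811 above could look something like the sketch below. The function names, the default rank of 0 when no variable is set, and the warning wording are assumptions made for illustration. Only the variable names, the order in which they are checked, the warning on multiple matches, and the fact that `WORLD_SIZE` may arrive as a string come from the commit message.

```python
import os
import warnings

# Checked in this order, as stated in the commit message.
RANK_VARS = ("SLURM_LOCALID", "NODE_RANK", "GROUP_RANK")


def determine_node_rank(env=None):
    """Return the node rank from the first indicator variable that is set."""
    env = os.environ if env is None else env
    found = [name for name in RANK_VARS if name in env]
    if not found:
        return 0  # assumed default for a single-node run
    if len(found) > 1:
        warnings.warn(f"Multiple rank variables set {found}; using {found[0]}")
    return int(env[found[0]])


def world_size(env=None):
    """Environment values are strings, so convert before comparing."""
    env = os.environ if env is None else env
    return int(env.get("WORLD_SIZE", 1))


# Usage with an explicit mapping instead of the real environment.
fake_env = {"NODE_RANK": "3", "WORLD_SIZE": "4"}
rank = determine_node_rank(fake_env)
size = world_size(fake_env)
```

Comparing the raw string `"4"` with the integer `4` is always false in Python, which is the kind of mismatch the `int(...)` conversion avoids.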
Deeksha Sharma b1d9656470
Update README.md (#1798)
* Update README.md

* Update README.md

committed suggestion

Co-authored-by: William Falcon <waf2107@columbia.edu>

* Update README.md

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Update README.md

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-05-13 12:22:12 -04:00
So Uchida 22d7d03118
Replace meta_tags.csv with hparams.yaml (#1271)
* Add support for hierarchical dict

* Support nested Namespace

* Add docstring

* Migrate hparam flattening to each logger

* Modify URLs in CHANGELOG

* typo

* Simplify the conditional branch about Namespace

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update CHANGELOG.md

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* added examples section to docstring

* renamed _dict -> input_dict

* meta_tags.csv -> hparams.yaml

* code style fixes

* add pyyaml

* remove unused import

* create the member NAME_HPARAMS_FILE

* improve tests

* Update tensorboard.py

* pass the local test w/o relavents of Horovod

* formatting

* update dependencies

* fix dependencies

* Apply suggestions from code review

* add savings

* warn

* docstrings

* tests

* Apply suggestions from code review

* saving

* Apply suggestions from code review

* use default

* remove logging

* typo fixes

* update docs

* update CHANGELOG

* clean imports

* add blank lines

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* back to namespace

* add docs

* test fix

* update dependencies

* add space

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-05-13 15:05:15 +02:00
William Falcon 35fe2efe27
added override for hparams in load_from_ckpt (#1797)
* added override for hparams in load_from_ckpt

* override hparams

* override hparams

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* update doctest

* typo

* chlog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-13 10:27:22 +02:00
Jirka Borovec 10ce1c0256
device property (#1791)
* device property

* add/copy properties

* inherit

* rename

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* dtype

* prop

* pt api

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-05-12 23:18:39 -04:00
Adrian Wälchli 8978794730
add missing flag (#1805) 2020-05-12 17:06:38 -04:00
William Falcon 5be1bc48e9
Update README.md 2020-05-12 11:43:53 -04:00
William Falcon d70d86985e
Update README.md 2020-05-12 11:43:01 -04:00
William Falcon 98cb7c2ce2
Update README.md 2020-05-12 08:59:23 -04:00
William Falcon 087bb34c68
Update README.md 2020-05-12 08:56:32 -04:00
Oliver Neumann 9059d21042
Missing profiler attribute in add_argparse_args() ArgumentParser (#1794)
* Fixed typing annotation by adding boolean type. After that Profiler flag will be added to argparse.

* Updated CHANGELOG.md

* Updated get_init_arguments_and_types() to pass doctests.

* Added doctest example to add_argparse_args()
2020-05-12 08:53:26 -04:00
William Falcon c52382f547
Update README.md 2020-05-12 08:52:43 -04:00
William Falcon 8584df54e9
Update README.md 2020-05-12 08:52:11 -04:00
William Falcon a5c19ea784
Update README.md 2020-05-12 08:49:29 -04:00
William Falcon 423b82ea6c
Update README.md 2020-05-12 08:46:55 -04:00
William Falcon 39584d08ad
Update README.md 2020-05-12 08:46:22 -04:00
kumuji 619f984c36
Option to provide seed to random generators to ensure reproducibility (#1572)
* Option to provide seed to random generators to ensure reproducibility

I added small function in utilities which imports torch, numpy, python
random and sets seed for all of the libraries to ensure reproducibility
of results.

* Apply recommendations from core contributors on seeding

1. Moved the seeding code to another file
2. Make deterministic as a parameter for trainer class
3. Add assertions for seeding numpy
4. Added warnings
5. torch.manual_seed should be enough for seeding torch

* Revert "Apply recommendations from core contributors on seeding"

This reverts commit a213c8e6882eec8a9e7408b9418926d2db7c5461.

* Revert "Revert "Apply recommendations from core contributors on seeding""

This reverts commit 59b2da53c62878de7aab0aa3feb3115e105eea06.

* Change in test, for correct seeding

* Allow seed equal to 0

* Allow seed to be uint32.max

* Added deterministic to benchmarks

* Cuda manual seed as in benchmark seeding

* Seeding should be done before model initialization

* cuda manual_seed is not necessary

* Fixing seed test_cpu_lbfgs

On some seeds seems like lbfgs doesn't converge.
So I fixed the seed during testing.

* rebasing issue with old reproducibility.py

* Improved documentation and ability to seed before initializing Train
class

* Change in docs

* Removed seed from trainer, update for documentation

* Typo in the docs

* Added seed_everything to `__all__`

* Fixing old changes

* Model initialization should be earlier than Trainer

* Update pytorch_lightning/trainer/__init__.py

From Example to testcode

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Fixing according to the contributors suggestions

* Moving horovod deterministic to Trainer class

* deterministic flag affects horovod docs update

* Improved static typing

* Added deterministic to test runners of horovod

It is failing on some versions, not very predictable

* static seeds for horovod tests

* Change for reset_seed function in tests

* Seeding horovod using reset_seed from tutils

* Update pytorch_lightning/trainer/__init__.py

* chlog

* Update trainer.py

* change "testcode" to "Example" in trainer init documentation

* Update pytorch_lightning/trainer/seed.py, first line in comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-12 07:53:20 -04:00
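The seeding utility described in #1572 above might be sketched as follows. This is not the code from the commit: the name `seed_all_sketch` is hypothetical, and making NumPy and PyTorch optional is a choice made here so the example runs without them. The accepted range (0 through the uint32 maximum) and the use of `torch.manual_seed` alone for PyTorch follow the commit notes.

```python
import random

MAX_SEED = 2**32 - 1  # uint32 maximum, the upper bound NumPy accepts


def seed_all_sketch(seed):
    """Seed Python's `random`, plus NumPy and PyTorch when they are installed."""
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be in [0, {MAX_SEED}], got {seed}")
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not available; skip it
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not available; skip it
    return seed


# Usage: seed before building the model, then before creating the trainer.
seed_all_sketch(0)
first = random.random()
seed_all_sketch(0)
second = random.random()  # same value as `first`
```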
William Falcon 7af4505519
Update README.md 2020-05-12 07:50:23 -04:00
William Falcon a4fc4ffa6e
Update README.md 2020-05-12 07:49:17 -04:00