Commit Graph

2880 Commits

Author SHA1 Message Date
William Falcon f952decc16
fix tb version (#2985) 2020-08-15 07:14:01 -04:00
William Falcon 62ddcfcfb1
Update __init__.py 2020-08-14 17:54:25 -04:00
Nathan Raw b9695237f1
Save test predictions on multiple GPUs (#2926)
* Save test predictions on multiple GPUs
2020-08-14 17:52:43 -04:00
Jirka Borovec 097757b450
add linked badge (#2983) 2020-08-14 16:34:47 -04:00
William Falcon e7794eb79a
Fixes #2407 (#2981)
* fix gpus index error
2020-08-14 16:22:48 -04:00
Jirka Borovec 9110ea5301
add docker badge (#2980) 2020-08-14 16:05:53 -04:00
Jirka Borovec 5bce06c050
nb. devices (#2973) 2020-08-14 11:37:21 +02:00
William Falcon 0c264689cb
Fixes #2942 (#2969)
* Fixes #2942

* doc fix
2020-08-13 21:54:57 -04:00
William Falcon 48f658fbb5
Fixes #2943 (#2970) 2020-08-13 21:44:55 -04:00
William Falcon 639a4cbd25
autoplay (#2968) 2020-08-13 19:06:55 -04:00
Nicki Skafte 6a051c887f
Add docs for GpuUsageLogger (#2945)
* add docs

* fix spelling
2020-08-13 18:58:14 -04:00
Lezwon Castelino cfd06a083b
Bugfix/2956 tpu distrib backend fix (#2959)
* override dist backend when using tpus

* added test

* updated doc string

* drop redundant info...

* more redundant info

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-08-13 18:57:23 -04:00
edenlightning 5e7ae348b4
Add labels to sphinx docs (#2964)
* Add label

* add ref

* add ref

* add ref

* add label

* add label

* add label

* add label

* Update fast_training.rst

* label

* label

* label

* label

* label

* label

* label

* label

* label

* label

* label

* Update performance.rst

* Update production_inference.rst

* Update profiler.rst

* Update results.rst

* Update sequences.rst

* Update single_gpu.rst

* Update slurm.rst

* Update test_set.rst

* Update tpu.rst

* Update trainer.rst

* Update training_tricks.rst

* Update transfer_learning.rst

* Update weights_loading.rst

* Update governance.rst

* Update hooks.rst

* Update bolts.rst

* Update child_modules.rst

* Update hyperparameters.rst

* Update transfer_learning.rst
2020-08-13 18:56:51 -04:00
William Falcon b7fc805dcf
pep 8 (#2967) 2020-08-13 18:56:02 -04:00
William Falcon 9a503de6af
Replace docs gifs with video snippets so users can play at their own speed (#2966)
* update docs
2020-08-13 18:52:47 -04:00
Jirka Borovec d4491bb14a
update PR template (#2965)
* template

* typo

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-08-14 00:13:23 +02:00
Jeff Yang 07c023c32f
fix(docs): docstring for amp_backend (#2960)
* fix(docs): docstring for amp_backend

* fix(docs): early_stop_checkpoint -> early_stop_callback

* docs

Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>
2020-08-13 23:25:56 +02:00
SiddhantRanade 88bfed371e
Fix enforce_datamodule_dataloader_override() for iterable datasets (#2957)
This function has the if statement `if (train_dataloader or val_dataloaders) and datamodule:`.

The issue is similar to that in https://github.com/PyTorchLightning/pytorch-lightning/pull/1560. `if dl` evaluates `bool(dl)`, and because the dataloader defines no `__bool__`, `bool()` falls back to `dataloader.__len__()`. For an IterableDataset, `dataloader.__len__` defers to `IterableDataset.__len__`, which is undefined, so the truthiness check fails.

The fix is also the same: replace `if dl` with `if dl is not None`.

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-13 23:06:17 +02:00
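The truthiness pitfall described in #2957 can be reproduced without PyTorch. The following is a minimal sketch, not the code from the PR: `IterableOnlyLoader`, `has_loader_truthy`, and `has_loader_explicit` are hypothetical names, and the stand-in class only mimics a loader whose `__len__` raises because the underlying dataset has no length.

```python
class IterableOnlyLoader:
    """Hypothetical stand-in for a loader over an iterable-style dataset."""

    def __iter__(self):
        return iter(range(3))

    def __len__(self):
        # Mimics a loader whose dataset defines no __len__.
        raise TypeError("length of an iterable-style dataset is undefined")


def has_loader_truthy(dl):
    # Problematic check: with no __bool__ defined, bool(dl) calls dl.__len__(),
    # which raises for the stand-in above.
    return True if dl else False


def has_loader_explicit(dl):
    # Check suggested in the commit message: only ask whether a loader was passed.
    return dl is not None


loader = IterableOnlyLoader()
print(has_loader_explicit(loader))  # identity check, never touches __len__
try:
    has_loader_truthy(loader)
except TypeError as err:
    print(f"truthiness check failed: {err}")
```

The explicit `is not None` comparison never consults `__len__`, so it behaves the same for sized and unsized loaders.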
shijianjian 53f855cdbf
Added strict=False for load_from_checkpoint (#2819)
* Added strict=False and hparams_file accepts dict

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Type check fix

* Added tests

* Linting & test fix

* Removed redundant code & test

* Added strict=False and hparams_file accepts dict

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Type check fix

* Added tests

* Linting & test fix

* Removed redundant code & test

* Apply suggestions from code review

* tests

* tests

* chlog

* Update tests/models/test_restore.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* update test comments

* Added docstring for the strict attribute

* Added supplementary tests

* Update saving.py

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* pep8, removed extra func

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>
2020-08-13 16:34:24 -04:00
shijianjian 18d31a3b63
Added strict=False for load_from_checkpoint (#2819)
* Added strict=False and hparams_file accepts dict

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Type check fix

* Added tests

* Linting & test fix

* Removed redundant code & test

* Added strict=False and hparams_file accepts dict

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Type check fix

* Added tests

* Linting & test fix

* Removed redundant code & test

* Apply suggestions from code review

* tests

* tests

* chlog

* Update tests/models/test_restore.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* update test comments

* Added docstring for the strict attribute

* Added supplementary tests

* Update saving.py

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* pep8, removed extra func

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananyahjha93 <ananya@pytorchlightning.ai>
2020-08-13 16:25:43 -04:00
William Falcon 2c935d048e
track batch size (#2954) 2020-08-13 12:40:54 -04:00
William Falcon 054ac94bd1
track batch size (#2950) 2020-08-13 11:51:37 -04:00
Jirka Borovec 4354690e55
add apex test (#2921)
* add apex test

* rename

* level

* events

* wrap

* evt

* miss

* apex

* apex

* apex

* apex

* apex

* apex

* Update tests/models/test_amp.py

Co-authored-by: William Falcon <waf2107@columbia.edu>

* notes

* notes

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-13 10:03:13 -04:00
William Falcon 6c5a0a172f
Resultd (#2947)
* updated docs
2020-08-13 09:58:05 -04:00
Jirka Borovec 519b97effd
Clean save (#2933)
* thr
deterministic=True

* clean

* clean

* Apply suggestions from code review

Co-authored-by: Vadym Stupakov <vadim.stupakov@gmail.com>

* Apply suggestions from code review

Co-authored-by: Vadym Stupakov <vadim.stupakov@gmail.com>
2020-08-13 07:26:33 -04:00
Jirka Borovec 665c1507f0
deterministic=True (#2944) 2020-08-13 06:29:27 -04:00
edenlightning 2c31beccfb
Add magicleap/atlas to community examples (#2937) 2020-08-12 16:05:15 -04:00
Gerardo Roa Dabike f6a3d8fd8d
GPU Usage Logger (#2932)
* GPU utilisation Callback

* GPU utilisation Callback

* Fixing style

* Fixing style

* Fixing CodeFactor: partial executable path

* Fix a misspelling in the Class name
2020-08-12 15:09:34 -04:00
Jirka Borovec fcf3c40172
update changelogs 2020-08-12 10:02:32 -04:00
Adrian Wälchli 411914bd2b
Fix hparams loading for model that accepts *args (#2911)
* fix hparams loading for model that accepts *args

* add test case

* changelog

* pep

* fix test

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-12 09:58:35 -04:00
Rosario Scalise f9d88f8088
Support **DictConfig hparam serialization (#2519)
* change to OmegaConf API

Co-authored-by: Omry Yadan <omry@fb.com>

* Swapped Container for OmegaConf sentinel; Limited ds copying

* Add Namespace check.

* Container removed. Pass local tests.

Co-authored-by: Omry Yadan <omry@fb.com>
2020-08-12 08:10:17 -04:00
William Falcon a46130cdc1
add weighted average to results obj (#2930)
* track batch size in result obj
2020-08-12 08:02:00 -04:00
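Why tracking batch size matters for a weighted average can be shown with a small sketch. This is an illustration under assumptions, not the implementation from #2930: `weighted_mean` is a hypothetical helper, and the sample values are made up for the example.

```python
def weighted_mean(values, batch_sizes):
    """Average per-batch metric values, weighting each by its batch size."""
    if len(values) != len(batch_sizes):
        raise ValueError("values and batch_sizes must have the same length")
    total = sum(batch_sizes)
    if total == 0:
        raise ValueError("total batch size must be positive")
    return sum(v * n for v, n in zip(values, batch_sizes)) / total


# Illustrative numbers: a full batch of 32 and a smaller final batch of 8.
per_batch_metric = [0.5, 1.0]
sizes = [32, 8]

plain = sum(per_batch_metric) / len(per_batch_metric)
weighted = weighted_mean(per_batch_metric, sizes)
print(plain, weighted)
```

With equal batch sizes the two averages coincide; they differ only when batches are uneven, for example a short last batch in an epoch.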
Nathan Raw 118bd14d16
Update CONTRIBUTING.md (#2927)
* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-08-12 12:59:36 +02:00
Phil e3528afae3
Move optimizer creation after device placement for ddp backends. (#2904) 2020-08-12 06:34:59 -04:00
Brendan Fahy 56396abe98
fix checkpointing to remote file paths (#2925) 2020-08-12 06:31:17 -04:00
William Falcon d13e5c9e53
document LightningModule better (#2920)
* updated docs
2020-08-11 19:39:43 -04:00
zcain117 580a5bd1df
Use kubectl to get logs from TPU CI instead of gcloud logging. (#2918)
* Use kubectl to get logs from TPU CI instead of gcloud logging.

* Update Github Action to read logs from kubectl rather than gcloud logging.
2020-08-11 19:30:56 -04:00
Adrian Wälchli 69d241c82e
Do not pass non_blocking=True if it does not support this argument (#2910)
* add docs

* non blocking only on tensor

* changelog

* add test case

* add test comment

* update changelog
2020-08-11 19:28:37 -04:00
Shiv Dhar 0097630a95
Fix typo (#2907)
Variable defined as `mnist_dm` but used as `mnist`. Change to use `mnist_dm`.
2020-08-11 08:39:16 +02:00
zcain117 35a3fd2f97
Add missing arg to docker build. (#2905) 2020-08-10 18:37:36 +00:00
William Falcon 28f79d9f7a
Mapkeys (#2900)
* added a map dict

* added a map dict
2020-08-09 18:50:39 -04:00
Brendan Fahy 97e6f35b34
fix missing return statement. Do not normalize remote paths (#2894)
* fix missing return statement. Do not normalize remote paths

* Update pytorch_lightning/utilities/cloud_io.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Add some documentation that we now support s3 and hdfs paths

* suggestion from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-09 22:38:43 +00:00
Jirka Borovec 5b53e40a92
filter drafts (#2897) 2020-08-09 17:28:00 -04:00
Adrian Wälchli 1ac507a255
constant root seed in reset_seed (tests) (#2895)
* fix root_seed in reset_seed

* seed value
2020-08-09 21:23:01 +00:00
ananda seelan 4d3dfd43e4
Minor doc fixes (#2893)
* Minor language fixes

* Typo fix
2020-08-09 15:00:08 -04:00
Caldera 6c18fd9a24
Update lr_logger.py (#2847)
* Update lr_logger.py

when logging learning_rate, we should offer a choice of logging interval, including 'step' and 'epoch'

* Update lr_logger.py

add some type annotations and docstrings

* Update lr_logger.py

fixed a bug where `on_train_batch_start()` is never triggered; use `on_batch_start()` instead. Added an `interval` argument so that learning rates can be recorded with respect to `global_step` or `current_epoch`.

* Update lr_logger.py

restore _extract_lr()

* suggestion

* Update lr_logger.py

modify _extract_lr(); it no longer needs the `interval` parameter.

* Update test_lr_logger.py

SkafteNicki's suggestion

* log_interval now supports `None`, `step`, `epoch`

* change `log_interval` to `logging_interval`

* Update test_lr_logger.py

* Update lr_logger.py

* put type checks into `on_train_start()`

* cleanup

* docstring typos

* minor changes from suggestions

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-08-09 16:30:43 +00:00
Jirka Borovec 0cfa05b703
missing chlogs (#2806)
* missing

* miss
2020-08-09 16:44:14 +02:00
William Falcon 38c018e4ba
Update __init__.py 2020-08-09 10:05:39 -04:00
William Falcon 0260a05a95
Update README.md 2020-08-09 07:11:45 -04:00
Uladzislau Sazanovich e9846dd758
Add tracking of basic states in Trainer [wip - to-be-merged after v0.9] (#2541)
* Add initial tracking of states in Trainer.

* Add INTERRUPTED state, improve tests, move state switching from callback to a trainer.

* Move part of a trainer state switching to a decorator.

* Add documentation.

* Fix docs, rename state enum, restore state to previous on exit if None, add tests for decorator only.

* Fix callback typing.

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-09 06:24:09 -04:00