Commit Graph

99 Commits

Author SHA1 Message Date
Justus Schock 0ae7f479d3
Update CHANGELOG.md 2020-04-27 09:47:00 +02:00
Jirka Borovec 80df5039f8
changelog (#1616)
* changelog

* warning

* pull

* typo

* typo
2020-04-26 16:11:22 -04:00
William Falcon 4755ded863
Clean up Argparse interface with trainer (#1606)
* fixed distutil parsing

* fixed distutil parsing

* Apply suggestions from code review

* log

* fixed distutil parsing

* fixed distutil parsing

* fixed distutil parsing

* fixed distutil parsing

* doctest

* fixed hparams section

* fixed hparams section

* fixed hparams section

* formatting

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-26 09:20:06 -04:00
William Falcon b620d86c54
disable val and test shuffling (#1600)
* disable val and test shuffling

* disable val and test shuffling

* disable val and test shuffling

* disable val and test shuffling

* log

* condition

* shuffle

* refactor

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-25 16:45:20 -04:00
Jirka Borovec 58a467dd68
model checkpoint on rank_zero_only & global rank state (#1408)
* try delete in async or DDP use-case

* changelog

* add model checkpoint rank

* simple delete

* flake8

* use global rank

* changelog

* fix review

* fix import

* proposal

* proposal

* proposal

* improve proposal (fix problems with method call self)

* cleaning

Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 17:21:00 -04:00
Jirka Borovec e0e67685d7
missing change (#1591) 2020-04-24 10:30:33 -04:00
Boris Dayma f3d139e90f
fix(wandb): allow use of sweeps (#1512)
* fix(wandb): allow use of sweeps

overwrite run config parameters due to precision error

fix #1290

* docs(wandb): update changelog

* test(wandb): update config test

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 10:29:24 -04:00
Adrian Wälchli 3e8f2d99a9
Progress bar callback (#1450)
* squash and rebase

sanity check hooks


sanity check callback hook finish


moved core progress bar functionality into callback


wip


remove duplicate merge


clean up


imports


docs


sanity check progress bar main


sanity


move callback calls


init progress bar callback


configuration and docs


changelog


rate decorator


pass process_position


disable on rank > 0


position index


is_enabled


remove decorator


refactor init tqdm bars


callback method ordering 


cannot reset when disabled


sequence -> list


default values


fix has no attr _time() 


move on_val_end to proper place


fix the pickle issue


update warning


properties


check for None


remove old comment


switch order


pull out non-tqdm functionality into base class


documentation for the base class


docs


fix refresh rate issue in validation


restrict type hint of trainer arg


more docs


update trainer docs


rst docs


fix lines too long


fix test


add missing type hints


fix typo


move docstring to __init__ solves doctest failures


remove doctest :(( can't fix the pickle error


fix example


simplify by saving trainer reference


fix docs errors


move docstring


initial value


multiple val checks per epoch


simpler handling of inf dataset sizes


update inf docs


renamed training_tqdm_dict


rename get_tqdm_dict


rename occurrences of tqdm


update changelog


fix doctest


fix formatting errors


added callback tests


progress bar on off test


more tests for progress bar


weird test fix?


add ignored property


disable default progress bar in LR finder


change enable/disable behavior


trying doctest in CI again


undo doctest pickle error


undo doctest pickle error :((


remove progress_bar_callback Trainer arg and fix tests


restore progress bar after auto lr find


update docs


fix rebase


fix wrong negation

* fix fast dev run total

* more thorough testing

* remove old args

* fix merge

* fix merge

* separate tests

* type hint total batches

* reduce if

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_disabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_enabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* rename enabled/disabled

* move deprecated api

* remove duplicated test from merge

* fix rename is_disabled

* newline

* test also testprogress for fast dev run

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-23 20:46:18 -04:00
Alexey Karnachev edb8d7a23c
Nested metrics dictionaries now can be passed to the loggers (#1582)
* now func merge_dicts works with nested dictionaries

* CHANGELOG.md upd
2020-04-23 17:32:36 -04:00
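
The nested-metrics change above can be pictured with a minimal sketch. This is not the library's `merge_dicts`; the function name `merge_nested_metrics`, the use of a plain mean, and the assumption that a key holds either dicts in every input or numbers in every input are all illustrative choices.

```python
from statistics import mean
from typing import Any, Dict, List


def merge_nested_metrics(dicts: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge metric dicts key by key, recursing into nested dicts.

    Hypothetical helper: leaf values are averaged; a key is assumed to map
    to a dict in every input that has it, or to a number in every input.
    """
    # Collect keys in first-seen order so the result is deterministic.
    keys: List[str] = []
    for d in dicts:
        for key in d:
            if key not in keys:
                keys.append(key)

    merged: Dict[str, Any] = {}
    for key in keys:
        values = [d[key] for d in dicts if key in d]
        if isinstance(values[0], dict):
            # Nested metrics: merge the sub-dictionaries recursively.
            merged[key] = merge_nested_metrics(values)
        else:
            # Leaf metrics: aggregate with a plain mean.
            merged[key] = mean(values)
    return merged


step_metrics = [
    {"loss": 1.0, "val": {"acc": 0.5}},
    {"loss": 3.0, "val": {"acc": 1.0}},
]
print(merge_nested_metrics(step_metrics))
```

A different aggregation (sum, max, last value) could be swapped in for `mean` without changing the recursion.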
Jirka Borovec 94e53444c6
fix changelog (#1583) 2020-04-23 16:57:37 -04:00
Ferdinand Schlatt 545b38ec5f
fix boolean argparse (#1571)
* fix boolean argparse #1570

* update change log
2020-04-23 11:44:18 -04:00
Lezwon Castelino 831842972f
check for kaggle env variable (#1568)
* check for kaggle env variable

* added changelog
2020-04-23 07:12:54 -04:00
Travis Addair 7024177f7d
Added Horovod distributed backend (#1529)
* Initial commit of Horovod distributed backend implementation

* Update distrib_data_parallel.py

* Update distrib_data_parallel.py

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fixed tests

* Added six

* tests

* Install tox for GitHub CI

* Retry tests

* Catch all exceptions

* Skip cache

* Remove tox

* Restore pip cache

* Remove the cache

* Restore pip cache

* Remove AMP

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-22 17:39:08 -04:00
Jirka Borovec bd168819f2
fix changelog (#1452)
* fix changelog

* formatting

* add ddp_cpu

* docs

* add another
2020-04-20 17:36:26 -04:00
Roshan Rao 0203938af8
Update learning rate on each backward pass instead of each forward pass. (#1477)
* change lr scheduler step interval to update every backwards pass instead of every forwards pass

* update CHANGELOG

* fix spacing

* Add TODO to lr schedule update

* remove trailing whitespace

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-20 08:03:52 -04:00
Adrian Wälchli 4fca994d0e
Fix callback default (horror bug!) (#1534)
* fix horror bug

* update changelog

* fix doctest

* line too long
2020-04-20 07:02:53 -04:00
areshytko d0c9472cb3
Add SLURM check in ddp_train() and init_ddp_connection() (#1387)
* slurm check in ddp_train and init_ddp_connection

* Remove code example in init_ddp_connection

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove blank line

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* improve for test coverage

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update changelog

* Default values and warnings for DDP env variables

* fix merge artifacts

* update localhost value

* change to NODE_RANK

Co-authored-by: Alexander Reshytko <areshytko@Alexanders-MacBook-Pro.local>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-19 17:08:19 -04:00
Justus Schock c71bd73acb
DDP sampler (#1513)
* Add explicit flag for ddp sampler replacement

* Add flag for sampler replacement in ddp

* Update data_loading.py

* Update CHANGELOG.md

* pep8 fixes

* pep8
2020-04-19 16:58:57 -04:00
Hengjian (Henry) Jia 3c6f856f23
Fix Mixing hparams and arguments in LightningModule (#1505)
* Attempt to fix #1468

* Remove the if statement, it doesn't actually make any difference

* Update docs

* Correct warnings I caused in the last commit

* Add to changelog

* Actually add to changelog

* Clarify documentation and examples

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-19 07:03:40 -04:00
Ir1dXD 9b31272cf0
feat: save checkpoint before deleting old ones (#1453)
* feat: save checkpoint before deleting old ones

* fix: make sure that the new model is not deleted

* changelog

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-16 16:40:51 +00:00
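
The ordering described in the commit above (write the new checkpoint first, then remove old ones, never removing the new file) can be sketched as follows. The names `save_then_prune` and `write_dummy` are hypothetical and only illustrate the idea; this is not the project's checkpoint callback.

```python
import os
import tempfile
from typing import Callable, Iterable


def save_then_prune(
    new_path: str,
    old_paths: Iterable[str],
    save_fn: Callable[[str], None],
) -> None:
    """Write the new checkpoint first, then delete the old ones."""
    # Saving first means a failed save never leaves us with no checkpoint.
    save_fn(new_path)
    new_abs = os.path.abspath(new_path)
    for path in old_paths:
        # Never delete the file that was just written.
        if os.path.abspath(path) == new_abs:
            continue
        if os.path.exists(path):
            os.remove(path)


def write_dummy(path: str) -> None:
    # Hypothetical stand-in for a real checkpoint writer.
    with open(path, "w") as handle:
        handle.write("checkpoint")


with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "epoch_1.ckpt")
    new = os.path.join(tmp, "epoch_2.ckpt")
    write_dummy(old)
    save_then_prune(new, [old, new], write_dummy)
    print(sorted(os.listdir(tmp)))
```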
Adrian Wälchli 3c549e8ae3
Call on_before_zero_grad model hook (#1493)
* call on_before_zero_grad

* update changelog

* add note about overriding both hooks

* added test

* move test_hooks.py to models folder
2020-04-16 12:01:41 -04:00
Boris Dayma 06e6eadfaf
feat(semseg): allow model customization (#1371)
* feat(semantic_segmentation): allow customization of unet

* feat(semseg): allow model customization

* style(semseg): format to PEP8

* fix(semseg): rename logger

* docs(changelog): updated semantic segmentation example

* suggestions

* suggestions

* flake8

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-16 12:00:24 -04:00
William Falcon 3431c62d41
Remove error when test dataloader used in test (#1495)
* remove error when test dataloader used in test

* remove error when test dataloader used in test

* remove error when test dataloader used in test

* remove error when test dataloader used in test

* remove error when test dataloader used in test

* remove error when test dataloader used in test

* fix lost model reference

* remove error when test dataloader used in test

* fix lost model reference

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* moved optimizer types

* added tests for warning

* fix lost model reference

* fix lost model reference

* added tests for warning

* added tests for warning

* refactoring

* refactoring

* fix imports

* refactoring

* fix imports

* refactoring

* fix tests

* fix mnist

* flake8

* review

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-15 22:16:40 -04:00
Jirka Borovec b3fe17ddeb
fix flushing loggers (#1459)
* flushing loggers

* flushing loggers

* flushing loggers

* flushing loggers

* changelog

* typo

* fix trains

* optimize imports

* add logger test all

* add logger test pickle

* flake8

* fix benchmark

* hanging loggers

* try

* del

* all

* cleaning
2020-04-14 20:32:33 -04:00
William Falcon c96c6a6b33
attempting to remove some speed issues (#1482)
* removed some .items

* added speed tests

* added speed tests

* Update benchmarks/test_rnn_parity.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update benchmarks/test_trainer_parity.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* fix lost model reference

* added speed tests

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-14 20:23:36 -04:00
Ethan Harris 8544b334e4
Replace automatic nan check with optional flag (#1475)
* Replace automatic nan check with optional flag

* Update CHANGELOG.md
2020-04-13 14:06:25 -04:00
Nicki Skafte 3f09b32df3
Learning Rate finder (#1347)
* initial structure

* rebase

* incorporate suggestions

* update CHANGELOG.md

* initial docs

* fixes based on reviews

* added trainer arg

* update docs

* added saving/restore of model state

* initial tests

* fix styling

* added more tests

* fix docs, backward compatibility and progressbar

* fix styling

* docs update

* updates based on review

* changed saving to standard functions

* consistent naming

* fix formatting

* improve docs, added support for nested fields, improve codecov

* update CHANGELOG.md

* Update lr_finder.rst

* Update pytorch_lightning/trainer/trainer.py

* Update trainer.py

* Update CHANGELOG.md

* Update path

* restoring

* test

* attribs

* docs

* doc typo

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-10 14:34:23 -04:00
Jirka Borovec dcda5194df
continues develop (#1419)
* continues develop

* changelog

* typo
2020-04-10 13:10:46 -04:00
Allard Hendriksen 7ac1580a31
Add automatic GPU choice to trainer (#1426)
* Add automatic GPU choice to trainer

This commit adds the `gpu_choice` parameter to Trainer. By default,
this parameter is set to 'manual' which causes no observable
difference in behavior.

When `gpu_choice` is set to "auto" and `gpus` is an int, then the
trainer will automatically allocate the first available GPU.
This is especially useful when GPUs are configured to be in "exclusive
mode", which means that only one process at a time can use them.

* Rename gpu_choice -> auto_select_gpus
2020-04-10 11:45:29 -04:00
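
The "first available GPU" behaviour described in the commit above can be illustrated with a small sketch. It assumes that trying to allocate on a busy device raises `RuntimeError`; the function `pick_first_free_gpu` and the `fake_allocate` probe are hypothetical and are not the Trainer's actual code.

```python
from typing import Callable, Iterable, Optional


def pick_first_free_gpu(
    candidates: Iterable[int],
    try_allocate: Callable[[int], None],
) -> Optional[int]:
    """Return the first device index on which try_allocate succeeds, else None."""
    for index in candidates:
        try:
            try_allocate(index)
        except RuntimeError:
            # Device is busy, e.g. held by another process in exclusive mode.
            continue
        return index
    return None


def fake_allocate(index: int) -> None:
    # Hypothetical stand-in for a real allocation probe:
    # pretend GPUs 0 and 1 are already taken.
    if index in (0, 1):
        raise RuntimeError(f"GPU {index} is busy")


print(pick_first_free_gpu(range(4), fake_allocate))
```

In a real setup the probe would attempt a small allocation on the device; that part is left out here so the sketch runs without any GPU.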
Rohit Gupta e79ae18cae
Add test_dataloaders to test method (#1434)
* Add test_dataloaders to test method

* Remove test_dataloaders from .fit()

* Fix code comment

* Fix tests

* Add test_dataloaders to test method (#1393)

* Fix failing tests

* Update docs (#1393)
2020-04-10 11:44:03 -04:00
Alexey Karnachev 4c34d16a34
Fixed configure optimizer from dict without "scheduler" key (#1443)
* `configure_optimizer` from dict with only "optimizer" key. bug fixed

* autopep8

* pep8speaks suggested fixes

* CHANGELOG.md upd
2020-04-10 11:43:06 -04:00
William Falcon 1f685c2882
fix pretty print (#1441)
* grid sample

* grid sample

* grid sample

* grid sample

* grid sample

* changelog

* version

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-10 08:43:22 -04:00
Jirka Borovec b2707c9b2e
fix returning returns (#1431)
* returns

* changelog
2020-04-09 15:01:08 -04:00
Jirka Borovec 17f58d2e11
add rank warning (#1428)
* add rank warning

* changelog

* use rank_zero_warn

* use trainer_init

* replace warnings

* fix test

* flake8

* docs

* changelog

* bug lol
2020-04-09 14:05:46 -04:00
Alexey Karnachev ddbf7de6dc
Added accumulation of loggers' metrics for the same steps (#1278)
* `add_argparse_args` method fixed (argument types added)

* autopep8 fixes

* --gpus=0 removed from test (for ci tests)

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

* test_with_accumulate_grad_batches added

* agg_and_log_metrics logic added to the base logger class

* small format fix

* agg metrics strategies removed (not to complicate stuff)

* agg metrics: handle zero step

* autopep8

* changelog upd

* flake fix

* metrics aggregators factored out, metrics_agg.py added + tests

* metrics agg default value added

* Update pytorch_lightning/loggers/metrics_agg.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* metrics aggregators factored out, metrics_agg.py added + tests

* metrics agg default value added

* Update pytorch_lightning/loggers/metrics_agg.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove .item which causes sync issues (#1254)

* remove .item which causes sync issues

* fixed gradient acc sched

* fixed gradient acc sched

* test_metrics_agg.py removed (all tested in docstrings), agg metrics refactored

* test_metrics_agg.py removed (all tested in docstrings), agg metrics refactored

* autopep8

* loggers base.py types fixed

* test

* test

* metrics aggregation for loggers: each key now has a specific function (or default one)

* metrics aggregation for loggers: each key now has a specific function (or default one)

* docstrings upd

* manual typehints removed from docstrings

* batch_size decreased for test `test_with_accumulate_grad_batches`

* extend running accum

* refactor

* fix tests

* fix tests

* allowed_types generator scoped

* trainer.py distutils was imported twice, fixed

* TensorRunningAccum refactored

* TensorRunningAccum added to change log (Changed)

* change log pull link added

Co-authored-by: Joe Davison <joe@huggingface.co>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-08 08:35:47 -04:00
Jirka Borovec b2ae57795f
add pypi user (#1401)
* add pypi user

* changelog

* changelog
2020-04-07 09:49:38 -04:00
Jirka Borovec c3b82f0170
update Docs/changelog (#1398)
* update docs/changelog

* fix

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-07 09:10:58 -04:00
areshytko 495ffbd028
Tensorboard logger check if lightning_logs directory exists (#1377)
* tensorboard logger version if root_dir not exist

* update changelog

* resolve comments

Co-authored-by: Alexander Reshytko <areshytko@Alexanders-MacBook-Pro.local>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-07 06:39:54 -04:00
Paweł Rzepiński b8ff9bc1d2
Fix unimplemented type() on TPU (#1396)
* Fix unimplemented type() on TPU

* Add changelog entry

* Add quotation marks
2020-04-06 20:29:55 -04:00
Roshan Rao 4ed3027309
Set precision=16 when use_amp is passed as True (#1145)
* Set precision=16 when use_amp is passed as True

* Update CHANGELOG.md

* add use_amp to deprecated API

* Update trainer.py

* Update trainer.py

* move the use_amp attribute to deprecated API

* move use_amp deprecation back to Trainer's __init__

* drop unused

* drop deprecated

* reorder imports

* typing

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-06 08:13:24 -04:00
Ethan Harris b18accc64c
Add warning for few workers (#1378)
* Add warning for few workers

* Fix style issue

* Update CHANGELOG.md

* Update test

* formatting

* formatting

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-05 11:07:16 -04:00
Jirka Borovec fdcf9cd518
add forgotten change logs (#1380)
* forgot change logs

* more missing

* more missing
2020-04-05 11:05:13 -04:00
William Falcon f1e11d8b38
model_checkpoint to save all models (#1359)
* model_checkpoint to save all models

* changelog

* raise if

Co-authored-by: jamesjjcondon <jamesjjcondon@gmail.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-05 15:56:26 +02:00
Jirka Borovec 22bedf9b57
simplify examples structure (#1247)
* simplify examples structure

* update changelog

* fix imports

* rename example

* rename scripts

* changelog
2020-04-03 17:57:34 -04:00
Justus Schock f6a86e8551
generalize reinstantiation of dataloader (#1346)
* generalize reinstantiation of dataloader

* fix condition

* add test

* update changelog

* fix changelog

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-03 17:55:08 -04:00
William Falcon e68ba1c836
added warnings to unimplemented methods (#1317)
* added warnings and removed default optimizer

* opt

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-03 15:06:51 -04:00
William Falcon dd5a05926c
Borisdayma: fix(wandb) - fix watch method (#1361)
* fix(wandb): fix watch method

* rebased

* Apply suggestions from code review

Co-authored-by: Boris Dayma <boris.dayma@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-03 15:02:38 -04:00
Jean-Baptiste SCHIRATTI e570d2e1ca
Doc fixes (#1362)
* Doc fixes from #1357 (awaelchli's comments) + changelog.

* Fix indentation.

* Add blank line to fix doc build?
2020-04-03 15:02:20 -04:00
Adrian Wälchli ebd9fc9530
Fix for incorrect run on the validation set with overwritten validation_epoch_end and test_end (#1353)
* reorder if clauses

* fix wrong method overload in test

* fix formatting

* update change_log

* fix line too long
2020-04-03 09:25:32 -04:00
Gerard Bentley f33b5a8d99
Simplify progress bar args (#1108)
* show progress bar dependent on refresh_rate

* test progress_bar_refresh control show bar

* remove show_progress_bar from other tests

* borda fixes

* flake8 fix

* changelog update prog bar refresh rate

* move show_progress_bar to deprecated 0.9 api

* rm show_progress_bar references, test deprecated

* Update pytorch_lightning/trainer/__init__.py

* fix test

* changelog

* minor CHANGELOG.md format

* Update pytorch_lightning/trainer/__init__.py

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Gerard Bentley <gbkh2015@mymail.pomona.edu>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-03 00:53:00 +02:00