Commit Graph

1057 Commits

Author SHA1 Message Date
Justus Schock cdbf2f4a37
Update tensorboard.py 2020-04-27 09:47:35 +02:00
Justus Schock e309b55b38
Update tensorboard.py 2020-04-27 09:44:26 +02:00
William Falcon acfb054103
Update __init__.py 2020-04-26 17:48:27 -04:00
William Falcon c0a517a553
Merge pull request #1623 from PyTorchLightning/hparams
Hparams
2020-04-26 17:47:20 -04:00
William Falcon 879d879985 fix hparams issue 2020-04-26 17:27:45 -04:00
Jirka Borovec 80df5039f8
changelog (#1616)
* changelog

* warning

* pull

* typo

* typo
2020-04-26 16:11:22 -04:00
William Falcon 9020cf91b5 fixed warning 2020-04-26 12:53:42 -04:00
William Falcon d290b818d0
Update __init__.py 2020-04-26 11:08:00 -04:00
William Falcon 4755ded863
Clean up Argparse interface with trainer (#1606)
* fixed distutil parsing

* fixed distutil parsing

* Apply suggestions from code review

* log

* fixed distutil parsing

* fixed distutil parsing

* fixed distutil parsing

* fixed distutil parsing

* doctest

* fixed hparams section

* fixed hparams section

* fixed hparams section

* formatting

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-26 09:20:06 -04:00
William Falcon 17bce62e5f
Update __init__.py 2020-04-25 19:04:39 -04:00
William Falcon b620d86c54
disable val and test shuffling (#1600)
* disable val and test shuffling

* disable val and test shuffling

* disable val and test shuffling

* disable val and test shuffling

* log

* condition

* shuffle

* refactor

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-25 16:45:20 -04:00
William Falcon 791ba91dec
slurm job id (#1605) 2020-04-25 16:01:15 -04:00
William Falcon 1e2c9eaf89 updated docs 2020-04-25 13:04:34 -04:00
William Falcon cbd088bd13
multi processing warnings (#1602)
* multi processing warnings

* multi processing warnings

* multi processing warnings

* multi processing warnings

* multi processing warnings

* multi processing warnings
2020-04-25 10:03:02 -04:00
William Falcon f531ab957b
Update __init__.py 2020-04-24 17:21:52 -04:00
Jirka Borovec 58a467dd68
model checkpoint on rank_zero_only & global rank state (#1408)
* try delete in async or DDP use-case

* changelog

* add model checkpoint rank

* simple delete

* flake8

* use global rank

* changelog

* fix review

* fix import

* proposal

* proposal

* proposal

* improve proposal (fix problems with method call self)

* cleaning

Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 17:21:00 -04:00
William Falcon d0faf97893
fixed dataset stuff + docs (#1599)
* Fixed dataset docs and disabled auto-sampler for iterable dataset
2020-04-24 16:51:26 -04:00
Jirka Borovec 570b2c7aeb
fix deprecated call (#1596)
* fix parity

* update deprecated call
2020-04-24 14:45:43 -04:00
William Falcon f07176da9b
Update __init__.py 2020-04-24 10:33:26 -04:00
Boris Dayma f3d139e90f
fix(wandb): allow use of sweeps (#1512)
* fix(wandb): allow use of sweeps

overwrite run config parameters due to precision error

fix #1290

* docs(wandb): update changelog

* test(wandb): update config test

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 10:29:24 -04:00
William Falcon cd15bfc3ce
fixed new amp bugs (#1593) 2020-04-24 09:29:39 -04:00
William Falcon 67d5f4dc39
Update __init__.py 2020-04-23 21:02:19 -04:00
William Falcon 890458fdbd
Fixes automatic parser bug (#1585)
* fixes gpu parsing

* fixes gpu parsing
2020-04-23 21:00:41 -04:00
Adrian Wälchli 3e8f2d99a9
Progress bar callback (#1450)
* squash and rebase

sanity check hooks

sanity check callback hook finish

moved core progress bar functionality into callback

wip

remove duplicate merge

clean up

imports

docs

sanity check progress bar main

sanity

move callback calls

init progress bar callback

configuration and docs

changelog

rate decorator

pass process_position

disable on rank > 0

position index

is_enabled

remove decorator

refactor init tqdm bars

callback method ordering

cannot reset when disabled

sequence -> list

default values

fix has no attr _time()

move on_val_end to proper place

fix the pickle issue

update warning

properties

check for None

remove old comment

switch order

pull out non-tqdm functionality into base class

documentation for the base class

docs

fix refresh rate issue in validation

restrict type hint of trainer arg

more docs

update trainer docs

rst docs

fix lines too long

fix test

add missing type hints

fix typo

move docstring to __init__ solves doctest failures

remove doctest :(( can't fix the pickle error

fix example

simplify by saving trainer reference

fix docs errors

move docstring

initial value

multiple val checks per epoch

simpler handling of inf dataset sizes

update inf docs

renamed training_tqdm_dict

rename get_tqdm_dict

rename occurrences of tqdm

update changelog

fix doctest

fix formatting errors

added callback tests

progress bar on off test

more tests for progress bar

weird test fix?

add ignored property

disable default progress bar in LR finder

change enable/disable behavior

trying doctest in CI again

undo doctest pickle error

undo doctest pickle error :((

remove progress_bar_callback Trainer arg and fix tests

restore progress bar after auto lr find

update docs

fix rebase

fix wrong negation

* fix fast dev run total

* more thorough testing

* remove old args

* fix merge

* fix merge

* separate tests

* type hint total batches

* reduce if

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_disabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_enabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* rename enabled/disabled

* move deprecated api

* remove duplicated test from merge

* fix rename is_disabled

* newline

* test also testprogress for fast dev run

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-23 20:46:18 -04:00
Guy Davidson fe2b6666e0
Fixing a small issue in trainer logging (#1563)
* The epoch was being logged to metrics, which isn't read, rather than to current_metrics.

* Updated the tests to account for the epoch arriving at the logger.
2020-04-23 17:52:41 -04:00
Jirka Borovec 7989ca844c
test deprecation warnings (#1470)
* check deprecation warnings

* extend warning test

* try

* unimport modules

* update
2020-04-23 17:34:47 -04:00
Alexey Karnachev edb8d7a23c
Nested metrics dictionaries now can be passed to the loggers (#1582)
* now func merge_dicts works with nested dictionaries

* CHANGELOG.md upd
2020-04-23 17:32:36 -04:00
William Falcon 5ab5084f7b
Update __init__.py 2020-04-23 15:32:40 -04:00
William Falcon 47629536e2
Amp2 (#1580)
* fixed new amp bugs

* fixed new amp bugs
2020-04-23 15:24:02 -04:00
William Falcon 68ca577919
why copy? (#1579) 2020-04-23 15:03:39 -04:00
William Falcon 29ebe92208
support for native amp (#1561)
* adding native amp support

* adding native amp support

* adding native amp support

* adding native amp support

* autocast

* autocast

* autocast

* autocast

* autocast

* autocast

* removed comments

* removed comments

* added state saving

* added state saving

* try install amp again

* added state saving

* drop Apex reinstall

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-23 14:47:08 -04:00
karlinjf 41b6cbb3ca
Don't copy the batch when training on a single gpu (#1576)
* fix

* whitespace

Co-authored-by: Josh Karlin <karlinjf@gmail.com>
2020-04-23 14:28:20 -04:00
Nicki Skafte e977d1cde5
Default value for ModelCheckpoint filepath (#1548)
* allow determine of filepath at runtime

* typing

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-04-23 11:50:58 -04:00
Ferdinand Schlatt 545b38ec5f
fix boolean argparse (#1571)
* fix boolean argparse #1570

* update change log
2020-04-23 11:44:18 -04:00
William Falcon 759557050a
Update __init__.py 2020-04-23 11:04:57 -04:00
Lezwon Castelino 831842972f
check for kaggle env variable (#1568)
* check for kaggle env variable

* added changelog
2020-04-23 07:12:54 -04:00
William Falcon 990fd22488
Update __init__.py 2020-04-22 20:16:04 -04:00
Travis Addair 7024177f7d
Added Horovod distributed backend (#1529)
* Initial commit of Horovod distributed backend implementation

* Update distrib_data_parallel.py

* Update distrib_data_parallel.py

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fixed tests

* Added six

* tests

* Install tox for GitHub CI

* Retry tests

* Catch all exceptions

* Skip cache

* Remove tox

* Restore pip cache

* Remove the cache

* Restore pip cache

* Remove AMP

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-22 17:39:08 -04:00
Jirka Borovec c1c6e3b6c9
default test logger (#1478)
* default test logger

* fix tests

* spawn

* try

* simplify tests

* simplify tests

* formatting

* loggers

* loggers

* revert to TestTube

* default

* default

* wraps

* world size

* optim imports
2020-04-21 20:33:10 -04:00
Kevin Chen bafdeca42f
Replace GPU device idx with current process index (#1541) 2020-04-21 14:29:15 -04:00
Justus Schock 29c7d2f195
Revert namespace package search to normal package search (#1545)
* Revert this

* typos

* version++

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-21 08:26:47 -04:00
Justus Schock 8035c10f37
Prepare Namespace package (#1543)
* Update __init__.py

* Update setup.py
2020-04-21 07:12:02 -04:00
Jirka Borovec bd168819f2
fix changelog (#1452)
* fix changelog

* formatting

* add ddp_cpu

* docs

* add another
2020-04-20 17:36:26 -04:00
Roshan Rao 0203938af8
Update learning rate on each backward pass instead of each forward pass. (#1477)
* change lr scheduler step interval to update every backwards pass instead of every forwards pass

* update CHANGELOG

* fix spacing

* Add TODO to lr schedule update

* remove trailing whitespace

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-20 08:03:52 -04:00
Adrian Wälchli 4fca994d0e
Fix callback default (horror bug!) (#1534)
* fix horror bug

* update changelog

* fix doctest

* line too long
2020-04-20 07:02:53 -04:00
William Falcon b0bf51f99f
Update __init__.py 2020-04-19 17:58:45 -04:00
areshytko d0c9472cb3
Add SLURM check in ddp_train() and init_ddp_connection() (#1387)
* slurm check in ddp_train and init_ddp_connection

* Remove code example in init_ddp_connection

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove blank line

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* improve for test coverage

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* update changelog

* Default values and warnings for DDP env variables

* fix merge artifacts

* update localhost value

* change to NODE_RANK

Co-authored-by: Alexander Reshytko <areshytko@Alexanders-MacBook-Pro.local>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-19 17:08:19 -04:00
Justus Schock c71bd73acb
DDP sampler (#1513)
* Add explicit flag for ddp sampler replacement

* Add flag for sampler replacement in ddp

* Update data_loading.py

* Update CHANGELOG.md

* pep8 fixes

* pep8
2020-04-19 16:58:57 -04:00
William Falcon ae2e14e3ed
fixed memory leak from opt return (#1528)
* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return
2020-04-19 16:41:54 -04:00
Hengjian (Henry) Jia 3c6f856f23
Fix Mixing hparams and arguments in LightningModule (#1505)
* Attempt to fix #1468

* Remove the if statement, it doesn't actually make any difference

* Update docs

* Correct warnings I caused in the last commit

* Add to changelog

* Actually add to changelog

* Clarify documentation and examples

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-19 07:03:40 -04:00