Commit Graph

22 Commits

Author SHA1 Message Date
Udit Arora 08573d0f7e
Fix some pyright member access errors in training module (#2121)
* Fix pyright member access errors in training module

* Fix Trainer instantiation error due to inheritance order

* Add GH workflow for pyright

* Fix more pyright errors in trainer module

* Add pyrightconfig and setup python environment in type-check workflow

* Exclude pyrightconfig.json

* suggestions

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-12 17:23:18 +02:00
Adrian Wälchli e6b34ef90d
[WIP] Reduction when batch size < num gpus (#1609)
* reduce if <= num_gpus

* add test with explanation

* chlog

* fix changelog

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-05-02 11:01:44 -04:00
William Falcon 4755ded863
Clean up Argparse interface with trainer (#1606)
* fixed distutil parsing

* fixed distutil parsing

* Apply suggestions from code review

* log

* fixed distutil parsing

* fixed distutil parsing

* fixed distutil parsing

* fixed distutil parsing

* doctest

* fixed hparams section

* fixed hparams section

* fixed hparams section

* formatting

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-26 09:20:06 -04:00
Jirka Borovec 58a467dd68
model checkpoint on rank_zero_only & global rank state (#1408)
* try delete in async or DDP use-case

* changelog

* add model checkpoint rank

* simple delete

* flake8

* use global rank

* changelog

* fix review

* fix import

* proposal

* proposal

* proposal

* improve proposal (fix problems with method call self)

* cleaning

Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 17:21:00 -04:00
Adrian Wälchli 3e8f2d99a9
Progress bar callback (#1450)
* squash and rebase

sanity check hooks


sanity check callback hook finish


moved core progress bar functionality into callback


wip


remove duplicate merge


clean up


imports


docs


sanity check progress bar main


sanity


move callback calls


init progress bar callback


configuration and docs


changelog


rate decorator


pass process_position


disable on rank > 0


position index


is_enabled


remove decorator


refactor init tqdm bars


callback method ordering 


cannot reset when disabled


sequence -> list


default values


fix has no attr _time() 


move on_val_end to proper place


fix the pickle issue


update warning


properties


check for None


remove old comment


switch order


pull out non-tqdm functionality into base class


documentation for the base class


docs


fix refresh rate issue in validation


restrict type hint of trainer arg


more docs


update trainer docs


rst docs


fix lines too long


fix test


add missing type hints


fix typo


move docstring to __init__ solves doctest failures


remove doctest :(( can't fix the pickle error


fix example


simplify by saving trainer reference


fix docs errors


move docstring


initial value


multiple val checks per epoch


simpler handling of inf dataset sizes


update inf docs


renamed training_tqdm_dict


rename get_tqdm_dict


rename occurrences of tqdm


update changelog


fix doctest


fix formatting errors


added callback tests


progress bar on off test


more tests for progress bar


weird test fix?


add ignored property


disable default progress bar in LR finder


change enable/disable behavior


trying doctest in CI again


undo doctest pickle error


undo doctest pickle error :((


remove progress_bar_callback Trainer arg and fix tests


restore progress bar after auto lr find


update docs


fix rebase


fix wrong negation

* fix fast dev run total

* more thorough testing

* remove old args

* fix merge

* fix merge

* separate tests

* type hint total batches

* reduce if

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_disabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_enabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* rename enabled/disabled

* move deprecated api

* remove duplicated test from merge

* fix rename is_disabled

* newline

* test also testprogress for fast dev run

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-23 20:46:18 -04:00
Guy Davidson fe2b6666e0
Fixing a small issue in trainer logging (#1563)
* The epoch was being logged to metrics, which isn't read, rather than to current_metrics.

* Updated the tests to account for the epoch arriving at the logger.
2020-04-23 17:52:41 -04:00
William Falcon ae2e14e3ed
fixed memory leak from opt return (#1528)
* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return

* fixed memory leak from opt return
2020-04-19 16:41:54 -04:00
William Falcon c96c6a6b33
attempting to remove some speed issues (#1482)
* removed some .items

* added speed tests

* added speed tests

* Update benchmarks/test_rnn_parity.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update benchmarks/test_trainer_parity.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* fix lost model reference

* added speed tests

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-14 20:23:36 -04:00
William Falcon b78c3d4da8
Fix weights path (#1445)
* renamed default path to actual root_dir

* added default weights path

* added default weights path

* added default weights path
2020-04-10 12:02:59 -04:00
Alexey Karnachev ddbf7de6dc
Added accumulation of loggers' metrics for the same steps (#1278)
* `add_argparse_args` method fixed (argument types added)

* autopep8 fixes

* --gpus=0 removed from test (for ci tests)

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Joe Davison <joe@huggingface.co>

* test_with_accumulate_grad_batches added

* agg_and_log_metrics logic added to the base logger class

* small format fix

* agg metrics strategies removed (not to complicate stuff)

* agg metrics: handle zero step

* autopep8

* changelog upd

* flake fix

* metrics aggregators factored out, metrics_agg.py added + tests

* metrics agg default value added

* Update pytorch_lightning/loggers/metrics_agg.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* metrics aggregators factored out, metrics_agg.py added + tests

* metrics agg default value added

* Update pytorch_lightning/loggers/metrics_agg.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* remove .item which causes sync issues (#1254)

* remove .item which causes sync issues

* fixed gradient acc sched

* fixed gradient acc sched

* test_metrics_agg.py removed (all tested in docstrings), agg metrics refactored

* test_metrics_agg.py removed (all tested in docstrings), agg metrics refactored

* autopep8

* loggers base.py types fixed

* test

* test

* metrics aggregation for loggers: each key now has a specific function (or default one)

* metrics aggregation for loggers: each key now has a specific function (or default one)

* docstrings upd

* manual typehints removed from docstrings

* batch_size decreased for test `test_with_accumulate_grad_batches`

* extend running accum

* refactor

* fix tests

* fix tests

* allowed_types generator scoped

* trainer.py distutils was imported twice, fixed

* TensorRunningAccum refactored

* TensorRunningAccum added to change log (Changed)

* change log pull link added

Co-authored-by: Joe Davison <joe@huggingface.co>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-08 08:35:47 -04:00
Jirka Borovec 514d182b7f
cleaning imports (#1032)
2020-03-12 12:41:37 -04:00
Jirka Borovec 7beed7cae6
Trainer cleanup (#934)
* Trainer cleanup

* update abstract

* remove ...

* remove __init__

* update mixin types

* update callbacks

* fix

* lower test acc
2020-02-27 16:21:14 -05:00
Ethan Harris a5f159b2c7
Add support for multiple loggers (#903)
* Add support for multiple loggers

* Fix PEP

* Cleanup

* Cleanup

* Add typing to loggers

* Update base.py

* Replace duck typing with isinstance check

* Update CHANGELOG.md

* Update comet experiment type, Switch to abstractmethod in logging.py

* Fix test

* Add passes to LightningLoggerBase

* Update experiment_logging.rst
2020-02-25 14:52:39 -05:00
Dmitry Lipin 06ca6428b6
Allow user to specify 'step' key while logging metrics (#808)
* allow to specify 'step' key

* add test

* docs to log_metrics

* fix test

* rename

* also rename
2020-02-15 23:35:23 -05:00
Adrian Wälchli 472f394788
Resolve some codefactor issues (#756)
* remove unnecessary pass statements

* use isinstance for type checks

* remove unnecessary else/elif after return

* remove unnecessary return statements

* move doc string to top

* merge isinstance calls

* remove unnecessary else/elif after raise

* use list comprehension

* do not use len without comparison

* add missing shebang

* revert isinstance check back to type

broke tests, because bool is actually subclass of int

* add missing period to doc string

* remove unnecessary pass statements

* use isinstance for type checks

* remove unnecessary else/elif after return

* remove unnecessary return statements

* move doc string to top

* merge isinstance calls

* remove unnecessary else/elif after raise

* use list comprehension

* do not use len without comparison

* add missing shebang

* revert isinstance check back to type

broke tests, because bool is actually subclass of int

* add missing period to doc string

* Fix default ckpt path when logger exists (#771)

* rename logging -> loggers (#767)

* move logging >> loggers

* add warning

* fix tests

* logging alias

* formatting

* formatting

* use isinstance for type checks

* revert isinstance check back to type

broke tests, because bool is actually subclass of int

* add more detail to tbptt example (#755)

* add more detail to tbptt example

* warn user about new arg in training_step

Co-authored-by: Vadim Bereznyuk <kuynzereb@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-02-01 18:44:05 -05:00
Jirka Borovec 76a1c67d87
rename logging -> loggers (#767)
* move logging >> loggers

* add warning

* fix tests

* logging alias

* formatting

* formatting
2020-02-01 15:47:58 -05:00
Vadim Bereznyuk 7deec2c14e
Move logger initialization (#750)
2020-01-26 09:42:57 -05:00
Elliot Waite b492e2b89e
Change nb to num in ABCs, comments, and tqdm logging (#613)
* Change nb to num in ABCs, comments, and tqdm logging

* Fix warnings text

* Make warnings one line

* Change num to number in comments
2019-12-09 04:40:26 -08:00
Jirka Borovec 5d00e62047
Fix logger, tensorboard (#610)
* fix logger tests

* fix missing flush

* fix tensorboard

* fix namespace

* fix flush

* fix add_hparams
2019-12-08 07:59:25 -08:00
ctlaltdefeat 58cc6e13b9
Update logging.py (#602)
2019-12-07 10:12:33 -05:00
Elliot Waite 1051c189e1
Simplify variables: step, epoch, max_epochs, min_epochs (#589)
2019-12-07 08:50:21 -05:00
Jirka Borovec 1d4b6be17b
rename trainer modules, drop `_mixin` (#571)
* rename trainer modules, drop _mixin

* fix imports
2019-12-04 11:39:14 -05:00