51 Commits

Author SHA1 Message Date
Boris Dayma 1e36cffbca
feat(wandb): support distributed modes (#11650)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-02-09 19:53:21 +01:00
NathanGodey 8a1b1eeef8
WandbLogger's log_image can use step argument (#11716)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-05 01:02:41 +00:00
Akash Kwatra 115a5d08e8
Decouple utilities from `LightningLoggerBase` (#11484)
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-02-02 23:29:01 +01:00
Boris Dayma 2db9ea3500
feat(wandb): support media logging (#9545)
2021-10-11 10:15:36 +01:00
Jirka Borovec 6e124e7207
CI: precommit - docformatter (#8584)
* CI: precommit - docformatter
* fix deprecated

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 12:49:09 +00:00
ananthsub 930b81f96c
Remove unused rank_zero_deprecation in WandB logger (#9034)
* Remove unused imports in WandB logger and corresponding test
2021-08-22 12:58:48 +01:00
Adrian Wälchli ad3f183bc3
convert warning cache usage to rank_zero_only in WandbLogger (#8764)
2021-08-20 10:39:25 +00:00
Adrian Wälchli 3ef8cd654d
Add warning when `wandb.run` already exists (#8714)
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-08-10 10:14:48 +02:00
Thien Tran 052aefc342
WandbLogger to log model topology by default (#8662)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-08-04 10:36:57 +00:00
Carlos Mocholí a64cc37394
Replace `yapf` with `black` (#7783)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 13:37:35 +02:00
Carlos Mocholí 733cdbb9ad
`every_n_val_epochs` -> `every_n_epochs` (#8383)
2021-07-13 01:20:20 +02:00
Carlos Mocholí 07d7c37a79
Remove magic monitor support for `ModelCheckpoint` (#8293)
2021-07-07 18:36:19 +01:00
Boris Dayma 9097347ea8
feat(wandb): log models as artifacts (#6231)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-27 20:15:02 +02:00
Boris Dayma 2a20102321
fix(wandb): allow custom init args (#6989)
* feat(wandb): allow custom init args

* style: pep8

* fix: get dict args

* refactor: simplify init args

* test: test init args

* style: pep8

* docs: update CHANGELOG

* test: check default resume value

* fix: default value of anonymous

* fix: respect order of parameters

* feat: use look-up table for anonymous

* yapf formatting

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-04 09:45:36 +00:00
Tharindu Hasthika c502e47abf
Fixed setting of _save_dir when run initiated externally (#7106)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-04-23 01:14:46 +00:00
Adrian Wälchli 9c9e2a0325
fix gpus default for Trainer.add_argparse_args (#6898)
2021-04-09 11:20:43 +02:00
Boris Dayma 40d5a9d6df
fix(wandb): prevent WandbLogger from dropping values (#5931)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-02-27 01:52:23 +00:00
Jirka Borovec a0f7831278
fix misleading imports in tests (#5873)
* fix imports

* .
2021-02-09 05:10:52 -05:00
Jirka Borovec 4faaef7758
formatting tests: 4/n (#5846)
* models

* ckpt

* core

* log
2021-02-06 12:07:26 +01:00
tchaton 77be6f6e24
resolve conflicts
resolve doc

boring commit

docs

torchvision

tpu

Update dockers/tpu-tests/tpu_test_cases.jsonnet

Update dockers/tpu-tests/tpu_test_cases.jsonnet
2021-02-05 21:43:10 +01:00
Kaushik B 5dfd62c09e
Disable training with zero num_training_batches when insufficient limit_train_batches (#5703)
* disable training when zero num_train_batches with limit_train_batches

* refactor train skip condition

* fix formatting issues

* fix formatting issues

* ref: test error msg

* fix tests for data loader calls

* fix train dataloader condition

* update limit_train_batches upper range in test comment

* remove model state check test

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:40:42 +01:00
Rohit Gupta 2abf4693bc
Fix log_dir property (#5537)
* fix and update tests

* update with ModelCheckpoint

* chlog

* wip wandb fix

* all fixed

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:40:42 +01:00
Boris Dayma f0fafa2be0
feat(wandb): add sync_step (#5351)
* docs(wandb): add details to args

* feat(wandb): no sync between trainer and W&B steps

* style: pep8

* tests(wandb): test sync_step

* docs(wandb): add references

* docs(wandb): fix typo

* feat(wandb): more explicit warning

* feat(wandb): order of args

* style: Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* style: long line

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2021-01-24 17:44:09 -05:00
Rohit Gupta d583d56169
[tests/loggers] refactor with BoringModel (#5440)
* use BoringModel

* use BoringModel

* use BoringModel

* trigger

* limit_batches

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-01-10 07:30:06 -05:00
Boris Dayma dcd29aef06
feat(wandb): offset logging step when resuming (#5050)
* feat(wandb): offset logging step when resuming

* feat(wandb): output warnings

* fix(wandb): allow step to be None

* test(wandb): update tests

* feat(wandb): display warning only once

* style: fix PEP issues

* tests(wandb): fix tests

* tests(wandb): improve test

* style: fix whitespace

* feat: improve warning

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* feat(wandb): use variable from class instance

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* tests(wandb): check warnings

* feat(wandb): use WarningCache

* tests(wandb): fix tests

* style: fix formatting

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-01-05 09:58:37 +01:00
Jirka Borovec 0f36525e8f
fix/enable - check F401 (#5201)
* refactor - check F401

* missed

* fix
2020-12-21 10:15:04 +01:00
Boris Dayma c586e5db77
feat(wandb): let wandb cli handle runs (#4648)
* feat(wandb): reinit handled by CLI

* fix: typo

* docs(wandb): improve formatting

* test(wandb): set wandb.run to None

* test(wandb): fix tests

* style: fix formatting

* docs(wandb): fix documentation

* Update code markup

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* docs(wandb): update CHANGELOG

* test(wandb): init called only when needed

* Update CHANGELOG.md

* try fix the test

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-11-24 01:31:28 +05:30
Jirka Borovec ef03c39ab7
Add step index in checkpoint name (#3807)
* true final value of global step

* ch check

* tests

* save each validation interval

* wip

* add test

* add test

* wip

* fix tests, revert old edits, fix merge conflicts, update doctests

* test + bugfix

* sort files

* format test

* suggestion by ananth

* added changelog

* naming

* docs

* example

* suggestion

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fix test

* pep

* pep

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2020-11-02 15:05:58 +01:00
chaton c2e72c3c86
[BUG-FIX] WandbLogger _sanitize_callable (#4422)
* fix

* resolve CodeFormatter

* Update pytorch_lightning/loggers/base.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-11-02 10:04:50 +01:00
Boris Dayma ff41d80706
feat(wandb): log in sync with Trainer step (#4405)
* feat(wandb): log in sync with Trainer step

* docs: update CHANGELOG

* style(test_wandb): fix formatting

* parentheses

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-29 01:07:06 +05:30
chaton f07ee33db6
BUG - Wandb: Sanitize callable. (#4320)
* add _sanitize_callable_params

* add call on _val if callable

* clean code formatter

* resolve pep8

* default return function name

* resolve pep8

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update CHANGELOG.md

Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-26 11:57:03 +00:00
William Falcon 09c2020a93
notices (#4118)
2020-10-13 07:18:07 -04:00
Adrian Wälchli d03953260d
Fix weights_save_path when logger is used + simplify path handling + better docs (#2681)
* fix weights_save path and drop ckpt_path

* add tests

* unused import

* update docs

* changelog

* pep8

* fix horovod test

* make backward compatible

* perform same test for all loggers

* fix for when logger=False and weights_save_path is set

* update changelog

* update docs

* update tests

* do not set save dir dynamically

* remove duplicate test

* remove duplicated tests

* update tests

* update tests

* remove remaining ckpt_path references

* move defaults to init as suggested by @Borda

* test deprecation
2020-07-27 12:53:11 -04:00
William Falcon f35337adba
Fixes .test() for ddp (#2570)
* enable none checkpoint
2020-07-09 18:36:36 -04:00
Adrian Wälchli f16b4cfc52
save_dir fix for MLflowLogger + save_dir tests for others (#2502)
* mlflow rework

* logger save_dir

* folder

* mlflow

* simplify

* fix test

* add a test for file dir contents

* new line

* changelog

* docs

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* test for comet logger

* improve mlflow checkpoint test

* prevent comet logger error on pytest exit

* test tensorboard save dir structure

* wandb save dir test

* skip test on windows

* add mlflow to pickle tests

* wandb

* code factor

* remove unused imports

* remove unused setter

* wandb mock

* wip mock

* wip mock

* wandb tests with mocking

* clean up

* clean up

* comments

* include wandblogger in test

* clean up

* missing argument

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-09 07:15:41 -04:00
Anthony Bisulco 899cd74044
flatten Wandb hyperparameters dict (#2459)
* wandb logging fix

* Changelog fix

* change test
2020-07-08 07:45:25 +02:00
Adrian Wälchli 25ee51bc57
Continue Jeremy's early stopping PR #1504 (#2391)
* add state_dict for early stopping

* move best attr after monitor_op defined

* improve early stopping and model checkpoint callbacks

* fix formatting

* fix attr init order

* clean up setting of default_root_dir attr

* logger needs default root dir set first

* reorg trainer init

* remove direct references to checkpoint callback

* more fixes

* more bugfixes

* run callbacks at epoch end

* update tests to use on epoch end

* PR cleanup

* address failing tests

* refactor for homogeneity

* fix merge conflict

* separate tests

* tests for early stopping bug regressions

* small fixes

* revert model checkpoint change

* typo fix

* fix tests

* update train loop

* cannot pass an int as default_save_path

* refactor log message

* fix test case

* appease the linter

* fix some doctests

* move config to callback

* fixes from rebase

* fixes from rebase

* chlog

* docs

* reformat

* formatting

* fix

* fix

* fixes from rebase

* add new test for patience

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/callbacks/test_early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* fix formatting

* remove enable_early_stop attribute

* add state_dict for early stopping

* move best attr after monitor_op defined

* improve early stopping and model checkpoint callbacks

* fix formatting

* fix attr init order

* clean up setting of default_root_dir attr

* logger needs default root dir set first

* reorg trainer init

* remove direct references to checkpoint callback

* more fixes

* more bugfixes

* run callbacks at epoch end

* update tests to use on epoch end

* PR cleanup

* address failing tests

* refactor for homogeneity

* fix merge conflict

* separate tests

* tests for early stopping bug regressions

* small fixes

* revert model checkpoint change

* typo fix

* fix tests

* update train loop

* fix test case

* appease the linter

* fix some doctests

* move config to callback

* fixes from rebase

* fixes from rebase

* chlog

* docs

* reformat

* formatting

* fix

* fix

* fixes from rebase

* add new test for patience

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/callbacks/model_checkpoint.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/callbacks/test_early_stopping.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* fix formatting

* remove enable_early_stop attribute

* fix test with new epoch indexing

* fix progress bar totals

* fix off by one error (see #2289) epoch starts at 0 now

* added missing imports

* fix hpc_save folderpath

* fix formatting

* fix tests

* small fixes from a rebase

* fix

* tmpdir

* tmpdir

* tmpdir

* wandb

* fix merge conflict

* add back evaluation after training

* test_resume_early_stopping_from_checkpoint TODO

* undo the horovod check

* update changelog

* remove a duplicate test from merge error

* try fix dp_resume test

* add the logger fix from master

* try remove default_root_dir

* try mocking numpy

* try import numpy in docs test

* fix wandb test

* pep 8 fix

* skip if no amp

* dont mock when doctesting

* install extra

* fix the resume ES test

* undo conf.py changes

* revert remove comet pickle from test

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update weights_loading.rst

* Update weights_loading.rst

* Update weights_loading.rst

* renamed flag

* renamed flag

* revert the None check in logger experiment name/version

* add the old comments

* _experiment

* test checkpointing on DDP

* skip the ddp test on windows

* cloudpickle

* renamed flag

* renamed flag

* parentheses for clarity

* apply suggestion max epochs

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jeremy Jordan <jtjordan@ncsu.edu>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-28 21:36:46 -04:00
Boris Dayma 00f1ac11e6
fix(wandb): use same logger on multiple training loops (#2055)
* fix(wandb): use same logger on multiple training loops

New training loops reset step to 0 which would previously try to overwrite logs

fix #2015

* docs(changelog.md): add reference to PR 2055
2020-06-02 18:46:02 -04:00
Jirka Borovec 0cd5e64701
Tests: refactor loggers (#1689)
* refactor default model

* drop redundant seeds

* path

* refactor loggers tests

* imports
2020-05-04 07:13:52 -04:00
Oliver Neumann 152a2eb30c
wandb logger 'global_step' affects other logger (#1492)
* Removed unnecessary 'global_step' from wandb logger.

* Fixed wrong step implementation in wandb and missing metric skipping in logger base.

* simplified metric check in base logger

* Added Fix Description in CHANGELOG.md

* Updated wandb logger tests.

* update test, step=3

* Moved Fix Description in CHANGELOG.md to unreleased.

* Update CHANGELOG.md

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-02 08:50:47 -04:00
Jirka Borovec f380027951
refactor default model (#1652)
* refactor default model

* drop redundant seeds

* formatting

* path

* formatting

* rename
2020-05-02 08:38:22 -04:00
Jirka Borovec 34bc149359
move unnecessary dict trainer_options (#1469)
* move unnecessary dict trainer_options

* fix tests

* fix tests

* formatting

* missing
2020-05-01 10:43:58 -04:00
Boris Dayma f3d139e90f
fix(wandb): allow use of sweeps (#1512)
* fix(wandb): allow use of sweeps

overwrite run config parameters due to precision error

fix #1290

* docs(wandb): update changelog

* test(wandb): update config test

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 10:29:24 -04:00
Jirka Borovec b3fe17ddeb
fix flushing loggers (#1459)
* flushing loggers

* flushing loggers

* flushing loggers

* flushing loggers

* changelog

* typo

* fix trains

* optimize imports

* add logger test all

* add logger test pickle

* flake8

* fix benchmark

* hanging loggers

* try

* del

* all

* cleaning
2020-04-14 20:32:33 -04:00
William Falcon 3c5530c29d
Wandb bug/wandb multi (#1360)
* Allow reinits in sub procs

* Dont create an experiment on pickle, name, or project

* Comments consistency

* Fix test

* Apply suggestions from code review

Co-authored-by: Chris Van Pelt <vanpelt@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-03 15:03:00 -04:00
William Falcon dd5a05926c
Borisdayma: fix(wandb) - fix watch method (#1361)
* fix(wandb): fix watch method

* rebased

* Apply suggestions from code review

Co-authored-by: Boris Dayma <boris.dayma@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-03 15:02:38 -04:00
Bilal Khan a707d4bea1
Replace Wandb callback's finalize with no-op (#1193)
* Replace Wandb callback's finalize with no-op

* Update pytorch_lightning/loggers/wandb.py

* Update wandb.py

* remove wandb logger's finalize and update tests

* update changelog

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-30 18:45:06 -04:00
Jirka Borovec 45d671a4a8
CI: split tests-examples (#990)
* CI: split tests-examples

* tests without template

* comment depends

* CircleCI typo

* add doctest

* update test req.

* CI tests

* setup macOS

* longer train

* lower pred acc

* fix model

* rename default model

* lower tests acc

* typo

* imports

* fix test optimizer

* update calls

* fix Win

* lower Drone image

* fix call

* pytorch image

* fix test

* add dev image

* add dev image

* update image

* drone volume

* lint

* update test notes

* rename tests/models >> tests/base

* group models

* conftest

* optim imports

* typos

* fix import

* fix tests

* install AMP

* tests

* fix import
2020-03-25 07:46:27 -04:00
Jirka Borovec 514d182b7f
cleaning imports (#1032)
2020-03-12 12:41:37 -04:00
Jirka Borovec e586ed4767
hparams as dict [blocked by 1041] (#1029)
* hparams as dict

* hparams as dict

* fixing

* fixing

* fixing

* fixing

* typing

* typing

* changelog

* update set hparams

* use setter

* simplify

* changelog

* imports

* pylint

* typing

* Update training_io.py

* Update training_io.py

* Update lightning.py

* Update test_trainer.py

* Update __init__.py

* Update base.py

* Update utils.py

* Update test_trainer.py

* Update training_io.py

* Update test_trainer.py

* Update test_trainer.py

* Update test_trainer.py

* Update test_trainer.py

* Update callback_config.py

* Update callback_config.py

* Update test_trainer.py

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-04 09:33:39 -05:00