Burhanuddin Rangwala
b576201a3d
Added doc strings to wandb logger ( #9109 )
2021-08-26 16:01:42 +01:00
ananthsub
930b81f96c
Remove unused rank_zero_deprecation in WandB logger ( #9034 )
...
* Remove unused imports in WandB logger and corresponding test
2021-08-22 12:58:48 +01:00
Adrian Wälchli
ad3f183bc3
convert warning cache usage to rank_zero_only in WandbLogger ( #8764 )
2021-08-20 10:39:25 +00:00
Carlos Mocholí
a1264a6850
Automatic string fixes ( #8886 )
2021-08-13 14:28:14 +00:00
Adrian Wälchli
3ef8cd654d
Add warning when `wandb.run` already exists ( #8714 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-08-10 10:14:48 +02:00
Adrian Wälchli
87093a3339
remove deprecated sync step argument from WandbLogger ( #8763 )
...
* remove deprecated sync step
* update chlog
2021-08-09 09:45:25 +02:00
Thien Tran
052aefc342
WandbLogger to log model topology by default ( #8662 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-08-04 10:36:57 +00:00
Carlos Mocholí
e63968ab88
Add `pyupgrade` to `pre-commit` ( #8557 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 14:38:12 +02:00
Carlos Mocholí
a64cc37394
Replace `yapf` with `black` ( #7783 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 13:37:35 +02:00
Kaushik B
f447839d16
Add `warning_cache.deprecation` and set warning stacklevel [1/2] ( #8005 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-06-18 11:50:24 +00:00
Boris Dayma
9097347ea8
feat(wandb): log models as artifacts ( #6231 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-27 20:15:02 +02:00
Boris Dayma
2a20102321
fix(wandb): allow custom init args ( #6989 )
...
* feat(wandb): allow custom init args
* style: pep8
* fix: get dict args
* refactor: simplify init args
* test: test init args
* style: pep8
* docs: update CHANGELOG
* test: check default resume value
* fix: default value of anonymous
* fix: respect order of parameters
* feat: use look-up table for anonymous
* yapf formatting
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-04 09:45:36 +00:00
Tharindu Hasthika
c502e47abf
Fixed setting of _save_dir when run initiated externally ( #7106 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-04-23 01:14:46 +00:00
Boris Dayma
40d5a9d6df
fix(wandb): prevent WandbLogger from dropping values ( #5931 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-02-27 01:52:23 +00:00
Kunal Mundada
4d96f19493
Document exceptions in loggers ( #6171 )
...
* Document exceptions in loggers
* minor formatting
* docstring changed in comet.py
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-02-25 21:08:32 +01:00
Eric Cousineau
4531b1c796
wandb: Fix example rendering for docs ( #5905 )
...
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-02-16 20:14:01 +01:00
Justus Schock
da6dbc8d1d
PoC: Accelerator refactor ( #5743 )
...
* restoring the result from subprocess
* fix queue.get() order for results
* add missing "block_backward_sync" context manager
* add missing "block_backward_sync" context manager
* fix sync_batchnorm
* fix supported gpu-ids for tuple
* fix clip gradients and inf recursion
* accelerator selection: added cluster_environment plugin
* fix torchelastic test
* fix reduce early stopping decision for DDP
* fix tests: callbacks, conversion to lightning optimizer
* fix lightning optimizer does not pickle
* fix setting benchmark and deterministic option
* fix slurm amp test
* fix prepare_data test and determine node_rank
* fix retrieving last path when testing
* remove obsolete plugin argument
* fix test: test_trainer_config
* fix torchscript tests
* fix trainer.model access
* move properties
* fix test_transfer_batch_hook
* fix auto_select_gpus
* fix omegaconf test
* fix test that needs to simulate slurm ddp
* add horovod plugin
* fix test with named arguments
* clean up whitespace
* fix datamodules test
* remove old accelerators
* fix naming
* move old plugins
* move to plugins
* create precision subpackage
* create training_type subpackage
* fix all new import errors
* fix wrong arguments order passed to test
* fix LR finder
* Added sharded training type and amp plugin
* Move clip grad to precision plugin
* Added sharded spawn, select accelerators based on distributed_backend + enable custom fp16 plugin automatically
* Fix import issue, attempting to fix tests
* Fix initial test
* Reflect hook logic from master, should wrap model after move to device
* Optional state consolidation, since master has optimizers not wrapped
* change attribute for instance test
* reset optimizers
optimizers are not used in main process, so state would be wrong.
* legacy
* imports in accel
* legacy2
* trainer imports
* fix import errors after rebase
* move hook to new setup location
* provide unwrapping logic
* fix trainer callback system
* added ddp2 implementation
* fix imports .legacy
* move plugins
* restore legacy
* drop test.py from root
* add tpu accelerator and plugins
* fixes
* fix lightning optimizer merge
* reset bugreportmodel
* unwrapping
* step routing forward
* model access
* unwrap
* opt
* integrate distrib_type
* sync changes
* sync
* fixes
* add forgotten generators
* add missing logic
* update
* import
* missed imports
* import fixes
* isort
* mv f
* changelog
* format
* move helper to parallel plugin
* d
* add world size
* clean up
* duplicate
* activate ddp_sharded and tpu
* set nvidia flags
* remove unused colab var
* use_tpu <-> on_tpu attrs
* make some ddp_cpu and clusterplugin tests pass
* Ref/accelerator connector (#5742 )
* final cleanup
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* connector cleanup
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* trainer cleanup
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* accelerator cleanup + missing logic in accelerator connector
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* add missing changes to callbacks
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* reflect accelerator changes to lightning module
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* clean cluster envs
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* cleanup plugins
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* add broadcasting
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* yapf
* remove plugin connector
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* plugins
* manual optimization
* update optimizer routing
* add rank to torchelastic
* fix memory mixed precision
* setstate on trainer for pickling in ddp spawn
* add predict method
* add back commented accelerator code
* adapt test for sync_batch_norm to new plugin
* fix deprecated tests
* fix ddp cpu choice when no num_processes are given
* yapf format
* skip a memory test that cannot pass anymore
* fix pickle error in spawn plugin
* x
* avoid
* x
* fix cyclic import in docs build
* add support for sharded
* update typing
* add sharded and sharded_spawn to distributed types
* make unwrap model default
* refactor LightningShardedDataParallel similar to LightningDistributedDataParallel
* update sharded spawn to reflect changes
* update sharded to reflect changes
* Merge 1.1.5 changes
* fix merge
* fix merge
* yapf isort
* fix merge
* yapf isort
* fix indentation in test
* copy over reinit scheduler implementation from dev1.2
* fix apex tracking calls with dev_debugger
* reduce diff to dev1.2, clean up
* fix trainer config test when gpus>0 and num_processes >0 and ddp_cpu
* sort plugin tests legacy/new
* fix error handling for amp on cpu
* fix merge
fix merge
fix merge
* [Feat] Resolve manual_backward (#5837 )
* resolve manual_backward
* resolve flake8
* update
* resolve for ddp_spawn
* resolve flake8
* resolve flake8
* resolve flake8
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
* fix tests/accelerator tests on cpu
* [BugFix] Resolve manual optimization (#5852 )
* resolve manual_optimization
* update
* update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
* Remove copy trainer parameters to happen earlier within the loop and add safe guard to get ref model (#5856 )
* resolve a bug
* Accelerator refactor sharded rpc (#5854 )
* rpc branch
* merge
* update handling of rpc
* make devices etc. Optional in RPC
* set devices etc. later if necessary
* remove devices from sequential
* make devices optional in rpc
* fix import
* uncomment everything
* fix cluster selection
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
* resolve bug
* fix assert in rpc test
* resolve a test
* fix docs compilation
* accelerator refactor - fix for sharded parity test (#5866 )
* fix memory issue with ddp_spawn
* x
x
x
x
x
x
x
x
x
* x
* Remove DDP2 as this does not apply
* Add missing pre optimizer hook to ensure lambda closure is called
* fix apex docstring
* [accelerator][BugFix] Resolve some test for 1 gpu (#5863 )
* update
* revert init
* resolve a bug
* update
* resolve flake8
* update
* update
* update
* revert init
* resolve a bug
* update
* resolve flake8
* update
* update
* update
* update
* update
* revert init
* resolve a bug
* update
* resolve flake8
* update
* update
* update
* revert init
* update
* resolve flake8
* update
* update
* update
* update
* update
* all_gather
* update
* make plugins work, add misconfig for RPC
* update
* update
* remove breaking test
* resolve some tests
* resolve flake8
* revert to ddp_spawn
Co-authored-by: root <root@ip-172-31-88-60.ec2.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Justus Schock <justus.schock@rwth-aachen.de>
* yapf isort
* resolve flake8
* fix apex doctests
* fix apex doctests 2
* resolve docs
* update drone
* clean env
* update
* update
* update
* update
* merge
* Fix RPC related tests, clean out old API, update for new accelerator API [skip ci] (#5881 )
* Fix RPC related tests, clean out old API, update for new accelerator API
* Move tests out of legacy folder, update paths and names
* Update test_remove_1-4.py
* Expose properties for tpu cores/gpus/num_gpus
* Add root GPU property
* Move properties to properties.py
* move tests that were previously in drone
* Fix root GPU property (#5908 )
* Move root GPU to property, remove horovod set as this is handled in horovod plugin, ensure we mock correctly to set GPU accelerator
* Add missing tests back
* fix best model path transfer when no checkpoint callback available
* Fix setup hook order [wip] (#5858 )
* Call trainer setup hook before accelerator setup
* Add test case
* add new test
* typo
* fix callback order in test
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* rename ddp sequential -> rpc sequential for special test
* revert
* fix stupid merge problem
* Use property in connector for sampler (#5913 )
* merge the import conflicts
* fix spawning of processes in slurm
* [wip] Fix some bugs for TPU [skip ci] (#5878 )
* fixed for single tpu
* fixed spawn
* fixed spawn
* update
* update
* wip
* resolve bugs
* resolve bug
* update on comment
* removed decorator
* resolve comments
* set to 4
* update
* update
* need cleaning
* update
* update
* update
* resolve flake8
* resolve bugs
* exclude broadcast
* resolve bugs
* change test
* update
* update
* skip if meet fails
* properly raise trace
* update
* add catch
* wrap test
* resolve typo
* update
* typo
Co-authored-by: Lezwon Castelino <lezwon@gmail.com>
Co-authored-by: Your Name <you@example.com>
* resolve some tests
* update
* fix imports
* update
* resolve flake8
* update azure pipeline
* skip a sharded test on cpu that requires a gpu
* resolve tpus
* resolve bug
* resolve flake8
* update
* update utils
* revert permission change on files
* suggestions from carlos
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* remove unrelated formatting changes
* remove incomplete comment
* Update pytorch_lightning/accelerators/__init__.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* remove unrelated formatting change
* add types
* warn 1.7 ddp manual backward only if ddp kwarg unset
* yapf + isort
* pep8 unused imports
* fix cyclic import in docs
* Apply suggestions from code review
* typer in accelerator.py
* typo
* Apply suggestions from code review
* formatting
* update on comments
* update typo
* Update pytorch_lightning/trainer/properties.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* update
* suggestion from code review
* suggestion from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: root <root@ip-172-31-88-60.ec2.internal>
Co-authored-by: Lezwon Castelino <lezwon@gmail.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-12 15:48:56 -05:00
Jirka Borovec
79d42d83e7
formatting 3/n: PL modules ( #5716 )
...
* cb
* log
* prof
* tune
* flake8
2021-02-08 14:28:38 -05:00
Rohit Gupta
2abf4693bc
Fix log_dir property ( #5537 )
...
* fix and update tests
* update with ModelCheckpoint
* chlog
* wip wandb fix
* all fixed
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:40:42 +01:00
Jirka Borovec
7b30133a82
flake8 & isort ( #5647 )
2021-01-25 14:31:38 -05:00
Boris Dayma
f0fafa2be0
feat(wandb): add sync_step ( #5351 )
...
* docs(wandb): add details to args
* feat(wandb): no sync between trainer and W&B steps
* style: pep8
* tests(wandb): test sync_step
* docs(wandb): add references
* docs(wandb): fix typo
* feat(wandb): more explicit warning
* feat(wandb): order of args
* style: Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* style: long line
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2021-01-24 17:44:09 -05:00
Justus Schock
ef7345dc4e
add possibility for nested loaders ( #5404 )
...
* add possibility for nested loaders
* pep8: newline
2021-01-24 07:32:02 -05:00
Arnaud Gelas
8629048659
Fix isort failures in loggers ( #5527 )
...
Remove from skipped module in pyproject.toml and fix failures on:
- pytorch_lightning/loggers/*.py
2021-01-15 22:53:56 +05:30
Jirka Borovec
9610ea817b
refactor imports of logger dependencies ( #4860 )
...
* refactor imports of logger dependencies
* fix
* fix
* fix
* name
* fix
* mocks
* fix tests
* fix mlflow
* fix test tube
* fix wandb import check
* whitespace
* name
* name
* hack
* hack
* rev
* fix
* update mlflow import check
* try without installing conda dep
* .
* .
* .
* .
* .
* .
* .
* .
* .
Co-authored-by: Adrian Wälchli <adrian.waelchli@inf.unibe.ch>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
(cherry picked from commit ec0fb7a3ec)
2021-01-06 15:16:06 +01:00
Jirka Borovec
74d0652164
flake8 ++
2021-01-05 09:58:37 +01:00
Boris Dayma
dcd29aef06
feat(wandb): offset logging step when resuming ( #5050 )
...
* feat(wandb): offset logging step when resuming
* feat(wandb): output warnings
* fix(wandb): allow step to be None
* test(wandb): update tests
* feat(wandb): display warning only once
* style: fix PEP issues
* tests(wandb): fix tests
* tests(wandb): improve test
* style: fix whitespace
* feat: improve warning
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* feat(wandb): use variable from class instance
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* tests(wandb): check warnings
* feat(wandb): use WarningCache
* tests(wandb): fix tests
* style: fix formatting
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-01-05 09:58:37 +01:00
Haswanth Aekula
ac996fb008
Fixed docs for WandbLogger ( #5128 )
...
Fixed a small bug with the `WandbLogger` docs.
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2021-01-05 09:58:37 +01:00
Jirka Borovec
0f36525e8f
fix/enable - check F401 ( #5201 )
...
* refactor - check F401
* missed
* fix
2020-12-21 10:15:04 +01:00
Rohit Gupta
ef762a0d2a
update logging docs and decorators ( #4431 )
...
* update logging docs
* experiment
* add decorators to base and csv logger methods
* fix
* doc fix
* update docs
* update docs
* Update pytorch_lightning/loggers/base.py
Co-authored-by: chaton <thomas@grid.ai>
2020-12-01 11:35:00 +05:30
Boris Dayma
c586e5db77
feat(wandb): let wandb cli handle runs ( #4648 )
...
* feat(wandb): reinit handled by CLI
* fix: typo
* docs(wandb): improve formatting
* test(wandb): set wandb.run to None
* test(wandb): fix tests
* style: fix formatting
* docs(wandb): fix documentation
* Update code markup
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* docs(wandb): update CHANGELOG
* test(wandb): init called only when needed
* Update CHANGELOG.md
* try fix the test
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-11-24 01:31:28 +05:30
Rohit Gupta
2d9d7e4daa
Add prefix argument in loggers ( #4557 )
...
* Add prefix parameter in loggers
* chlog
* pep
* patch test
* remove args, access via self
* try fix the test
* try fix the test
* try fix the test
* prefix test
* fix assert has calls
fix assert call
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-11-22 06:38:58 +01:00
Jeff Yang
baa8558cc0
logger docs and api docs ( #3950 )
...
* logger and api docs
* remove gpu_usage_logger, lr_logger
* update docstring
* fix wandb example
* remove step result
* charts
* add some charts info
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-13 20:35:54 +05:30
Boris Dayma
ff41d80706
feat(wandb): log in sync with Trainer step ( #4405 )
...
* feat(wandb): log in sync with Trainer step
* docs: update CHANGELOG
* style(test_wandb): fix formatting
* parentheses
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-29 01:07:06 +05:30
chaton
f07ee33db6
BUG - Wandb: Sanitize callable. ( #4320 )
...
* add _sanitize_callable_params
* add call on _val if callable
* clean code formatter
* resolve pep8
* default return function name
* resolve pep8
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update CHANGELOG.md
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-10-26 11:57:03 +00:00
Adrian Wälchli
376268f01e
Implement finalize for WandbLogger ( #4341 )
...
* wandb finish
* experiment
* upload at end of run
* changelog
* comment
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-10-26 11:22:09 +00:00
Adrian Wälchli
3ff5327e83
Mocking loggers (part 1, wandb) ( #3596 )
...
* mocking for wandb
* remove wandb import in amp test
* mock loggers in sphinx
* check tests
* Update extra.txt
* setup
* dev
* min
* revert
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-09-25 16:00:02 +02:00
Rohit Gupta
07b857769a
Allow kwargs in Wandb & Neptune + kwargs docstring ( #3475 )
...
* Allow kwargs in WandbLogger
* isort
* kwargs docstring
* typo
* kwargs for other loggers
* pep and isort
* formatting
* fix failing test
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-09-19 18:51:43 +02:00
William Falcon
f43028f3ae
added copyright notices ( #3062 )
2020-08-19 22:03:22 -04:00
Adrian Wälchli
f16b4cfc52
save_dir fix for MLflowLogger + save_dir tests for others ( #2502 )
...
* mlflow rework
* logger save_dir
* folder
* mlflow
* simplify
* fix test
* add a test for file dir contents
* new line
* changelog
* docs
* Update CHANGELOG.md
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* test for comet logger
* improve mlflow checkpoint test
* prevent comet logger error on pytest exit
* test tensorboard save dir structure
* wandb save dir test
* skip test on windows
* add mlflow to pickle tests
* wandb
* code factor
* remove unused imports
* remove unused setter
* wandb mock
* wip mock
* wip mock
* wandb tests with mocking
* clean up
* clean up
* comments
* include wandblogger in test
* clean up
* missing argument
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-07-09 07:15:41 -04:00
Anthony Bisulco
899cd74044
flatten Wandb hyperparameters dict ( #2459 )
...
* wandb logging fix
* Changelog fix
* change test
2020-07-08 07:45:25 +02:00
Adrian Wälchli
145670f893
fix logging on rank 0 only ( #2425 )
...
* fix and test for ddp block logging rank > 0
* rename
* use the dummy logger
* dummy logger test
* set the logger in model
* decorator for rank zero experiment
* simplify check
* simplify
* fix problem with None in checkpoint path
* revert configure logger
* unused import
* offline
* try rank 0 decorator in checkpoint
* try fix test
* imgs
* add asserts to make sure log zero only saves checkpoints
* add asserts to make sure log zero only saves checkpoints
* add asserts to make sure log zero only saves checkpoints
* add asserts to make sure log zero only saves checkpoints
* add asserts to make sure log zero only saves checkpoints
* fix tpu tests
* fix tpu tests
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-30 18:09:16 -04:00
Adrian Wälchli
25ee51bc57
Continue Jeremy's early stopping PR #1504 ( #2391 )
...
* add state_dict for early stopping
* move best attr after monitor_op defined
* improve early stopping and model checkpoint callbacks
* fix formatting
* fix attr init order
* clean up setting of default_root_dir attr
* logger needs default root dir set first
* reorg trainer init
* remove direct references to checkpoint callback
* more fixes
* more bugfixes
* run callbacks at epoch end
* update tests to use on epoch end
* PR cleanup
* address failing tests
* refactor for homogeneity
* fix merge conflict
* separate tests
* tests for early stopping bug regressions
* small fixes
* revert model checkpoint change
* typo fix
* fix tests
* update train loop
* cannot pass an int as default_save_path
* refactor log message
* fix test case
* appease the linter
* fix some doctests
* move config to callback
* fixes from rebase
* fixes from rebase
* chlog
* docs
* reformat
* formatting
* fix
* fix
* fixes from rebase
* add new test for patience
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/callbacks/test_early_stopping.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* fix formatting
* remove enable_early_stop attribute
* add state_dict for early stopping
* move best attr after monitor_op defined
* improve early stopping and model checkpoint callbacks
* fix formatting
* fix attr init order
* clean up setting of default_root_dir attr
* logger needs default root dir set first
* reorg trainer init
* remove direct references to checkpoint callback
* more fixes
* more bugfixes
* run callbacks at epoch end
* update tests to use on epoch end
* PR cleanup
* address failing tests
* refactor for homogeneity
* fix merge conflict
* separate tests
* tests for early stopping bug regressions
* small fixes
* revert model checkpoint change
* typo fix
* fix tests
* update train loop
* fix test case
* appease the linter
* fix some doctests
* move config to callback
* fixes from rebase
* fixes from rebase
* chlog
* docs
* reformat
* formatting
* fix
* fix
* fixes from rebase
* add new test for patience
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/callbacks/test_early_stopping.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* fix formatting
* remove enable_early_stop attribute
* fix test with new epoch indexing
* fix progress bar totals
* fix off by one error (see #2289 ) epoch starts at 0 now
* added missing imports
* fix hpc_save folderpath
* fix formatting
* fix tests
* small fixes from a rebase
* fix
* tmpdir
* tmpdir
* tmpdir
* wandb
* fix merge conflict
* add back evaluation after training
* test_resume_early_stopping_from_checkpoint TODO
* undo the horovod check
* update changelog
* remove a duplicate test from merge error
* try fix dp_resume test
* add the logger fix from master
* try remove default_root_dir
* try mocking numpy
* try import numpy in docs test
* fix wandb test
* pep 8 fix
* skip if no amp
* dont mock when doctesting
* install extra
* fix the resume ES test
* undo conf.py changes
* revert remove comet pickle from test
* Update CHANGELOG.md
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update weights_loading.rst
* Update weights_loading.rst
* Update weights_loading.rst
* renamed flag
* renamed flag
* revert the None check in logger experiment name/version
* add the old comments
* _experiment
* test checkpointing on DDP
* skip the ddp test on windows
* cloudpickle
* renamed flag
* renamed flag
* parentheses for clarity
* apply suggestion max epochs
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jeremy Jordan <jtjordan@ncsu.edu>
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-06-28 21:36:46 -04:00
Boris Dayma
00f1ac11e6
fix(wandb): use same logger on multiple training loops ( #2055 )
...
* fix(wandb): use same logger on multiple training loops
New training loops reset step to 0 which would previously try to overwrite logs
fix #2015
* docs(changelog.md): add reference to PR 2055
2020-06-02 18:46:02 -04:00
Justus Schock
6456247287
Re-Enable Import Errors ( #1938 )
...
* update logger imports
* pep8 fixes
* pep8
2020-05-25 07:31:35 -04:00
Anthony Bisulco
76af84718a
Group argument wandb ( #1760 )
...
* group argument wandb
* formatting fix
2020-05-10 13:15:51 -04:00
Oliver Neumann
152a2eb30c
wandb logger 'global_step' affects other logger ( #1492 )
...
* Removed unnecessary 'global_step' from wandb logger.
* Fixed wrong step implementation in wandb and missing metric skipping in logger base.
* simplified metric check in base logger
* Added Fix Description in CHANGELOG.md
* Updated wandb logger tests.
* update test, step=3
* Moved Fix Description in CHANGELOG.md to unreleased.
* Update CHANGELOG.md
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-05-02 08:50:47 -04:00
Jirka Borovec
58a467dd68
model checkpoint on rank_zero_only & global rank state ( #1408 )
...
* try delete in async or DDP us0-ecase
* changelog
* add model checkpoint rank
* simple delete
* flake8
* use global rank
* changelog
* fix review
* fix import
* proposal
* proposal
* proposal
* improve proposal (fix problems with method call self)
* cleaning
Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 17:21:00 -04:00
Boris Dayma
f3d139e90f
fix(wandb): allow use of sweeps ( #1512 )
...
* fix(wandb): allow use of sweeps
overwrite run config parameters due to precision error
fix #1290
* docs(wandb): update changelog
* test(wandb): update config test
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 10:29:24 -04:00
Adrian Wälchli
6e1d72d98a
Improved docs for Loggers ( #1484 )
...
* improve __init__
* improve logger base
* improve comet logger docs
* improved docs for mlflow
* improved neptune logger docs
* fix matplotlib import issue
* improve tensorboard docs
* improve docs for test tube
* improved trains logger docs
* improve wandb logger docs
* improved docs in experiment_logging.rst
* added MLflow to the list of loggers
* fix too long lines
* fix trains doctest
* fix neptune doctest
* fix mlflow doctest
* Apply suggestions from code review
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* Apply suggestions from code review
* fix whitespace
* try bypass mode for neptune (fix doctest api key error)
* try "test" as api key
* Revert "try "test" as api key"
This reverts commit fd77db26d551f08b4b4a12bb93cbd8f7a0814f29.
* try test as api key
* update neptune docs
* bump neptune minimal version
* revert unnecessary bypass code
* test if CI runs doctests in .rst files
* Revert "test if CI runs doctests in .rst files"
This reverts commit a45aeb460a8c4b7445a35dd7b49265f48d11c485.
* add doctest directive
* neptune demo links
* added tutorial link for W&B
* fix line too long
* fix merge error
* fix merge error
* add instructions how to install loggers
* add instructions how to install the loggers
* hide _abc_impl property from docs
* review Borda, 4 spaces
* indentation in example sections
* blank
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-16 12:04:12 -04:00
Jirka Borovec
b3fe17ddeb
fix flushing loggers ( #1459 )
...
* flushing loggers
* flushing loggers
* flushing loggers
* flushing loggers
* changelog
* typo
* fix trains
* optimize imports
* add logger test all
* add logger test pickle
* flake8
* fix benchmark
* hanging loggers
* try
* del
* all
* cleaning
2020-04-14 20:32:33 -04:00