* Basic wandb support
* refactor(wandb): remove unused variables and document logger
* docs(wandb): explain how to use WandbLogger
* test(wandb): add tests for WandbLogger
* feat(wandb): add save_dir
* fix(wandb): allow pickle of logger
* fix(wandb): save logs in custom directory
* test(wandb): test import
* docs(wandb): simplify docstring and use doctest
* test: increase number of epochs for satisfactory accuracy
* test(test_load_model_from_checkpoint): ensure we load last checkpoint
Co-authored-by: Chris Van Pelt <vanpelt@wandb.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
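A minimal sketch of using the WandbLogger added above. The import path and the argument names (`project`, `save_dir`, `offline`) are assumptions based on the commit messages and may differ between versions:

```python
# Hedged sketch: attach the W&B logger to a Trainer.
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import WandbLogger  # path may differ by version

wandb_logger = WandbLogger(
    project="my-project",    # hypothetical project name
    save_dir="logs/wandb",   # custom directory for the local wandb files
    offline=True,            # log locally without syncing to wandb.ai
)
trainer = Trainer(logger=wandb_logger, max_epochs=5)
# trainer.fit(model)  # model is any LightningModule
```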
* added neptune integration
* added tests for NeptuneLogger, added neptune to docs
* updated link to neptune support
* fixed docstrings, fixed try/except in tests, changed append_tags input
* fixed docstring line length
* bumped epoch nr in model restore tests
* added tags support for single strings
* fixed passing neptune token to backend
* fixed project name in offline mode
* added save_top_k=-1 to checkpoint callback
* reformatted initialization of neptune in online mode
* bumped epoch nr to 4 in test_load_model_from_checkpoint
* bumped epoch nr to 5
Co-authored-by: William Falcon <waf2107@columbia.edu>
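A minimal sketch of the NeptuneLogger described above. The import path, the constructor arguments, and the `append_tags` helper are taken from the commit messages and may differ between versions:

```python
# Hedged sketch: attach the Neptune logger to a Trainer.
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import NeptuneLogger  # path may differ by version

neptune_logger = NeptuneLogger(
    api_key="ANONYMOUS",               # token is passed through to the backend
    project_name="shared/my-project",  # hypothetical project
    offline_mode=False,                # offline mode skips the remote project
    tags=["lightning", "example"],
)
# Tags can also be appended later, as a list or a single string.
neptune_logger.append_tags("resnet")

trainer = Trainer(logger=neptune_logger, max_epochs=5)
# trainer.fit(model)
```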
* fix dangling gradients
Make sure only the gradients of the current optimizer's parameters are calculated in the training step.
* add note about multiple optimizer gradient update
* Update training_loop.py
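A sketch (assumed, not the actual `training_loop.py` change) of the idea behind the fix: with multiple optimizers, each `optimizer.step()` should only see gradients computed for its own parameters, so stale ("dangling") gradients from the other optimizer's backward pass must be cleared first.

```python
import torch
import torch.nn as nn

net_a = nn.Linear(4, 1)  # stand-in for e.g. a generator
net_b = nn.Linear(4, 1)  # stand-in for e.g. a discriminator
opt_a = torch.optim.SGD(net_a.parameters(), lr=0.1)
opt_b = torch.optim.SGD(net_b.parameters(), lr=0.1)

x = torch.randn(8, 4)

# Step for optimizer A: zero its grads, backprop only A's loss, step A.
opt_a.zero_grad()
loss_a = net_a(x).mean()
loss_a.backward()
opt_a.step()

# Step for optimizer B: zero B's grads so nothing left over from the A step
# (or from a shared graph) is accumulated into B's update.
opt_b.zero_grad()
loss_b = net_b(x).mean()
loss_b.backward()
opt_b.step()
```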
* Renamed `on_sanity_check_start` to `on_train_start` and added `on_train_end` to `ModelHooks`
* changed tests to use `on_train_start` instead of `on_sanity_check_start`
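A minimal sketch of the renamed hooks in a user model; the exact base-class layout (`ModelHooks`) is internal and may differ between versions:

```python
import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    def on_train_start(self):
        # runs once when real training begins (previously on_sanity_check_start)
        print("training is starting")

    def on_train_end(self):
        # new hook: runs once after training finishes
        print("training is done")
```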
* type: debug
Calculate the correct number of steps to run during the sanity check.
This fixes the bug when there are two or more validation dataloaders.
- Before: total=self.num_sanity_val_steps
- After: total=self.num_sanity_val_steps*len(self.get_val_dataloaders())
* type: refactor
Put total=... in the next line
* type: refactor
run flake8
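A sketch of the corrected sanity-check progress-bar length, assuming `tqdm` and the attribute names used in the commit message above (`num_sanity_val_steps`, the list of validation dataloaders):

```python
from tqdm import tqdm

num_sanity_val_steps = 5
val_dataloaders = [range(10), range(10)]  # two hypothetical validation loaders

# Before: total=num_sanity_val_steps undercounted with 2+ loaders.
# After: each loader contributes num_sanity_val_steps steps to the bar.
pbar = tqdm(
    desc="Validation sanity check",
    total=num_sanity_val_steps * len(val_dataloaders),
)
for loader in val_dataloaders:
    for i, _ in enumerate(loader):
        if i >= num_sanity_val_steps:
            break
        pbar.update(1)
pbar.close()
```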
* use print for INFO and lower levels in summarize()
* use logging.INFO instead of magic number
* bring logging.info back for other cases
* move logging config to __init__.py
* prepend the model summary with a newline
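A sketch of the logging behaviour described above, under the assumption that `summarize()` checks the logger's effective level and falls back to `logging.info` otherwise:

```python
import logging

logging.basicConfig(level=logging.INFO)  # configured once, e.g. in __init__.py
logger = logging.getLogger(__name__)

def summarize(summary_text: str) -> None:
    # Print the summary at INFO and lower (more verbose) levels so the table
    # keeps its multi-line layout; otherwise go through the logger.
    if logger.getEffectiveLevel() <= logging.INFO:   # named constant, no magic number
        print("\n" + summary_text)                   # newline prepended to the summary
    else:
        logger.info("\n" + summary_text)

summarize("Name | Type | Params\n----------------------")
```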
* feat: add reducelronplateau callback
* feat: use reducelronplateau callback in trainer
* feat: only on unsupported lr schedulers
* feat: last but not least, merge of master
* feat: merge master
* feat: support only one scheduler in reduceLrOnPlateauScheduler
* refactor: code style
* Update pt_callbacks.py
* Update trainer.py
* Update train_loop_mixin.py
* Update trainer.py
* Update train_loop_mixin.py
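A plain-PyTorch sketch of the scheduler the commits above wire into the Trainer: `ReduceLROnPlateau` steps on a monitored metric (e.g. a validation loss) rather than on the epoch index, which is why it needed special handling among the "unsupported" lr schedulers.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=2
)

for epoch in range(10):
    # ... training and validation steps would go here ...
    val_loss = 1.0 / (epoch + 1)   # stand-in for a real validation loss
    scheduler.step(val_loss)       # note: takes the metric, not the epoch
    print(epoch, optimizer.param_groups[0]["lr"])
```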
* Avoid race condition in creating checkpoint directories
In multi-GPU training, several processes run the code that creates checkpoint dirs. This fix avoids a probably rare situation (but it happened to me) where another process created a dir between the `exists` check and the `makedirs` call.
* Remove the now unneeded check for dir existence
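A sketch of the race-free directory creation: `os.makedirs` with `exist_ok=True` tolerates the directory being created concurrently by another process, so the separate existence check is no longer needed.

```python
import os

checkpoint_dir = "checkpoints/run_0"  # hypothetical path

# Before (racy across DDP processes):
#   if not os.path.exists(checkpoint_dir):
#       os.makedirs(checkpoint_dir)
# After:
os.makedirs(checkpoint_dir, exist_ok=True)
```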