lightning

Commit Graph

Author	SHA1	Message	Date
William Falcon	53b7644c15	fix docs path	2020-01-17 16:06:06 -05:00
William Falcon	bc67689068	clean v2 docs (#691 ) * updated gitignore * Update README.md * updated gitignore * updated links in ninja file * updated docs * Update README.md * Update README.md * finished callbacks * finished callbacks * finished callbacks * fixed left menu * added callbacks to menu * added direct links to docs * added direct links to docs * added direct links to docs * added direct links to docs * added direct links to docs * fixing TensorBoard (#687) * flake8 * fix typo * fix tensorboardlogger drop test_tube dependence * formatting * fix tensorboard & tests * upgrade Tensorboard * test formatting separately * try to fix JIT issue * add tests for 1.4 * added direct links to docs * updated gitignore * updated links in ninja file * updated docs * finished callbacks * finished callbacks * finished callbacks * fixed left menu * added callbacks to menu * added direct links to docs * added direct links to docs * added direct links to docs * added direct links to docs * added direct links to docs * added direct links to docs * finished rebase * making private members * making private members * making private members * working on trainer docs * working on trainer docs * working on trainer docs * working on trainer docs * working on trainer docs * working on trainer docs * set auto dp if no backend * working on trainer docs * working on trainer docs * working on trainer docs * working on trainer docs * working on trainer docs * working on trainer docs * working on trainer docs * working on trainer docs * fixed lightning import * cleared spaces * cleared spaces * cleared spaces * cleared spaces * cleared spaces * cleared spaces * cleared spaces * cleared spaces * cleared spaces * cleared spaces * finished lightning module * finished lightning module * finished lightning module * finished lightning module * added callbacks * added loggers * added loggers * added loggers * added loggers * added loggers * added loggers * added loggers * added loggers * set auto dp if no backend * added loggers * added loggers * added loggers * added loggers * added loggers * added loggers * flake 8 * flake 8 Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>	2020-01-17 06:03:31 -05:00
Boris Dayma	ec7fc97857	Feature: wandb logger (#627 ) * Basic wandb support * refactor(wandb): remove unused variables and document logger * docs(wandb): explain how to use WandbLogger * test(wandb): add tests for WandbLogger * feat(wandb): add save_dir * fix(wandb): allow pickle of logger * fix(wandb): save logs in custom directory * test(wandb): test import * docs(wandb): simplify docstring and use doctest * test: increase number of epochs for satisfactory accuracy * test(test_load_model_from_checkpoint): ensure we load last checkpoint Co-authored-by: Chris Van Pelt <vanpelt@wandb.com> Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-01-13 22:25:27 -05:00
Jakub	8dc8a8bfd3	Neptune integration (#648 ) * added neptune integration * added tests for NeptuneLogger, added neptune to docs * updated link to neptune support * fixed docstrings, fixed try/except in tests, changed append_tags input * fixed docstrings line lenght * bumped epoch nr in model restore tests * added tags support for single strings * fixed passing neptune token to backend * fixed project name in offline mode * added save_top_k=-1 to checkpoint callback * reformated initialization of neptune in online mode * bumped epoch nr to 4 in test_load_model_from_checkpoint * bumped epoch nr to 5 Co-authored-by: William Falcon <waf2107@columbia.edu>	2020-01-13 22:20:01 -05:00
William Falcon	15cb79923a	Add files via upload	2019-12-12 10:23:02 -08:00
Jirka Borovec	c374c4fb80	extend documentation (#569 ) * extend documentation * update index * fix list	2019-12-07 00:23:48 -05:00
Jirka Borovec	3a58937d8b	rename variables nb -> num (#567 ) * rename nb -> num * flake8 * batch_nb, epoch_nb, gpu_nb, split_nb * add _num deprecations	2019-12-04 06:57:10 -05:00
Jirka Borovec	d71556e7a1	Sphinx generated documentation (#521 ) * upgrade req. * move MkDocs * create Sphinx * init Sphinx * move md from MkDocs to Sphinx * CI: build docs * build Sphinx formatting move docs from MD to docstring in particular package/modules formatting add Sphinx ext. rename root_module to core drop implicit name "_logger" drop duplicate name "overwrite" fix imports use pytorch theme add sample link mapping try fix RTD build use forked template fix some docs warnings fix paths add deprecation warnings fix flake8 fix paths revert refactor revert MLFlowLogger * revert example import * update link * Update lightning_module_template.py	2019-11-28 12:48:55 -05:00
Ir1dXD	7324dd902b	change Checkpoint callback's `save_best_only` to `save_top_k` (#128 ) * docs: enable syntax highlight * feat: change Checkpoint callback's `save_best_only` to `save_top_k` fix #70 * docs: update docs for save_top_k * revert other files * style: lint for travis-ci * fix typo * make flake8 happy * update according to review * add tests * rename func to private * add doc on `save_top_k == 0` * make flake8 happy * update according to PR comments * change some f-strings * Update pt_callbacks.py * Update test_models.py * update options * create folders * Update test_models.py * change epoch num * support calling multiple times, add docs and tests * update docs * roll back changes in earlystopping * clean test files * make flake8 happy * fix epoch number * update tests about epoch numbers * clean debugging code * fix testing utils codes * fix testing utils codes * fix testing utils codes * fix testing utils codes * change save_dir to tests/tests according to previous lines * remove unused overwrite option * make flake8 happy * change var name as per review * make flake8 happy * update property name to work on master * elaborate in the docs * update docs as per review * revert previous commit accidentally pressed wrong button when solving conflicts	2019-11-19 15:43:34 -08:00
Jeffrey Ling	1af85f3038	Update methods.md (#507 )	2019-11-14 12:06:23 -05:00
rwesterman	d1b6b011c3	Comet fix (#481 ) * Fixing comet ml bug and adding functionality * Updating documents * Fixing code style issues in comet_logger * Changing comet_logger experiment to execute lazily * Adding tests for comet_logger and addressing comments from @Borda * Setting step_num to optional keyword argument in log_metrics() to comply to other loggers * Adding offline logging mode for comet_ml, updating tests and docs * Switching to MisconfigurationException	2019-11-11 23:00:31 -05:00
Tullie Murrell	a3f785dfca	Fix tbptt docs (#484 )	2019-11-08 21:21:36 -05:00
William Falcon	3e38005a61	Ddp2 fix (#448 ) * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * allow ddp and apex to be configured * allow ddp and apex to be configured * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * added eval and train for redundancy * added eval and train for redundancy * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * added training_end * allow ddp and apex to be configured * allow ddp and apex to be configured * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * bananas * added eval and train for redundancy * added eval and train for redundancy	2019-11-05 10:01:52 -05:00
Ir1dXD	5a9afb11cc	change print to logging (#457 ) * change print to logging * always use logging.info * use f-strings * update code style * set logging configs * remove unused code	2019-11-05 08:43:21 -05:00
s-rog	4e9fd95f79	packed sequence clarification in train_dataloader (#443 ) * packed sequence clarification in train_dataloader * moved changes to training loop * removed changes from required interface * added index entry	2019-11-03 05:26:27 -05:00
Pattarawat Chormai	1865de1ff8	[WIP] Fix wrong example paths in README.md (#444 ) * Fix wrong example paths * correct dataloading wrong condition in Readme	2019-11-01 07:55:37 -04:00
Tullie Murrell	248495b1d1	Add tbptt (#429 ) * Add truncated bptt * Fix rebase error * AutoPep8 * Address comments, incl default bptt_split impl * Add tbptt test * Add default split for lists/tuples * Add tbptt docs * Fix trainer spacing * Update RequiredTrainerInterface.md	2019-10-31 06:45:28 -04:00
Joel Wong	f6b8b175bb	Update Docs for current checkpointing behaviour (#445 ) Related issue #432 The old documentation suggested that the way to restore a training session is to use a test_tube Experiment. Trainer no longer takes an experiment as a parameter, so it seems the current way to restore a training session is to pass an experiment via a TestTubeLogger. Even if this is not the most elegant solution, updating the docs will at least point new users in the right direction.	2019-10-31 06:40:32 -04:00
William Falcon	d5ca464cc6	Back hook (#424 ) * Fixes #356 * Fixes #356 * Fixes #356 * Fixes #356 * Fixes #356 * Fixes #356	2019-10-24 07:56:56 -04:00
Vismantas	2aba70e228	parse_gpu_ids fix (#382 ) * Unit tests for num_gpu property as proxy for __parse_gpu_ids. * Refactoring __parse_gpu_ids * Moved the function outside the class as it is an utility function and did not depend on class in any way. * Added unit tests for it. * Mocked torch.cuda.device_count function in tests. This allows the tests to be run on machines that do not have gpus. * Fixed the parse_gpu_ids function to handle -1 case. Function now handles -1 the same way as it does for '-1'. * Unit tests for root_gpu added. Added backend as a parameter as currently depending on backend set or not, code fails with exception in certain circumstances, before giving a wrong answer. * Moved __set_root_gpu function out of the class. This function does not depend on the class and can be tested more easily this way. Also added unit tests for this function. They simply reuse data for the root_gpu property. * determine_root_gpu_device passes unit tests. * num_gpus passes unit tests. Also added a None test for this function. * parse_gpu_ids tests changed to reflect desired state after refactoring. Planning to refactor parse_gpu_ids to always return list of ints. This will simplify code that use output of this function. * * parse_gpu_ids always returns lists * parse_gpu_ids checks given ids against available ids * parse_gpu_ids raises exception for non existant ids * parse_gpu_ids returns None when no gpus are available * cleaned up determine_root_gpu_device * cleaned up num_gpus property * Updated unit tests to reflect changes in the functions * Flake8 fixes * Moved fixture code up before where it is used. * Updated documentation. * Changed tests to match the API: * gpus=-1 or gpus='-1' should use all available gpu devices * gpus=N * N=0: no gpus should be used. * N>0: N gpus should be used * gpus=list of ints or a comma separated string of numbers: Use the gpus indicated by the list or the string. * Fixed code to pass all the changed tests for parsing gpus param. * Refactoring parse_gpu_ids function. * flake8 fixes. * Updating documentation. * flake8 fixes. * flake8 fixes. * flake8 fixes * Update trainer.py * Update dp_mixin.py * Make reduce_distributed_output a stand alone function. Fix imports. Fix flake8. * Add comet_ml dependency to tests requirements.txt * Revert "Make reduce_distributed_output a stand alone function. Fix imports. Fix flake8." This reverts commit `eac0338` * Merge with master.	2019-10-23 05:05:09 -04:00
Nic Eggert	05cea3ff8b	Save / Load Hyperparameters with checkpoint (#415 ) * Save and load hparams from checkpoints * Update docs * Add warning when not saving hparams * Missing import * Update .run_local_tests.sh * Update lm_test_module_mixins.py * Update lightning_module_template.py	2019-10-23 04:48:24 -04:00
William Falcon	792ad00ff9	Fixed val interval (#405 ) * added fixed frequency val batch check * added fixed frequency val batch check * Finished IterableDataset support * flake8 * flake8 * flake8	2019-10-22 05:10:00 +03:00
Cristobal Eyzaguirre	ab6794406e	Logger consistency (#397 ) * added comet logger * bug fix in cases where comet was not imported before torch * fixed mlflow logger to be consistent with docs, updated cometLogger and cometLoggers docs + flake 8 compliance	2019-10-22 04:51:17 +03:00
William Falcon	58d52c25a1	Fixes #347 (#393 )	2019-10-19 00:51:48 +02:00
William Falcon	c1bbc2158f	Fixes #361 (#391 )	2019-10-19 00:39:45 +02:00
Luis	5cfff1e5c1	Fixed link to trainer.py github code (#386 )	2019-10-18 15:36:33 +02:00
William Falcon	3dfcef6994	Loss keys (#387 ) * any key in logs or progress bar is a candidate for callback metric * any key in logs or progress bar is a candidate for callback metric	2019-10-18 15:28:13 +02:00
Adrian Wälchli	6e3e740a7f	Param printing (#336 ) * print thousands as K, M, B, T, ... * add option to print top-level modules only * added doc string and added spacing * do not print summary if neither "full" nor "top" * updated docs showing summary print options * fix line length for travis	2019-10-08 15:30:06 -04:00
David Kossnick	c0bd203cff	Fix broken link in Examples Readme (#327 ) It now points to the current examples folder.	2019-10-08 07:39:54 -04:00
William Falcon	491100abdd	Docs (#315 ) * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up demos * cleaning up docs * cleaned up test_tube logger * cleaned up test_tube logger * cleaned up test_tube logger	2019-10-05 23:52:32 -04:00
William Falcon	eca0e7cff7	docs	2019-10-05 17:34:10 -04:00
William Falcon	49c7d54dba	readme	2019-10-05 17:18:26 -04:00
William Falcon	6cc3f1757f	decouple returns from each step (#307 ) * decoupled training metrics from logging metrics * decoupled validation metrics from log metrics * updated docs * updated docs * updated docs * Fixed test * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master * merged master	2019-10-05 13:35:20 -04:00
William Falcon	8f5a06bfb8	Gpu mem (#308 ) * Fixes #289 * Fixes #289 * added lbfgs support * Fixes #280 (#309) * added test seeds (#306) * added test seeds * added test seeds * updated docs * added lbfgs support (#310) * added lbfgs support * added lbfgs support * added lbfgs support * Fixes #280 (#309) * added test seeds (#306) * added test seeds * added test seeds * updated docs * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * Fixes #289 * Fixes #289 * merged master * merged master	2019-10-05 11:29:34 -04:00
William Falcon	75fd89106f	added lbfgs support (#310 ) * added lbfgs support * added lbfgs support * added lbfgs support * Fixes #280 (#309) * added test seeds (#306) * added test seeds * added test seeds * updated docs * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support * added lbfgs support	2019-10-05 11:10:21 -04:00
William Falcon	bf09060fef	Fixes #292 (#303 ) * early stopping callback is not default * added a default logger * added default checkpoint callback * added default checkpoint/loggers * added default checkpoint/loggers * updated docs * cleaned demos * cleaned demos * cleaned demos * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers * clean up docs around loggers	2019-10-04 19:48:57 -04:00
William Falcon	a578de511d	clean up docs around loggers (#304 )	2019-10-04 18:53:38 -04:00
William Falcon	32e74b8f36	Ddp2 (#261 ) * adds ddp2 option where on each node a single process uses all gpus * added ddp2 test * added ddp2 docs * Update Distributed training.md * delete ref to old update_training_log_metrics * delete ref to old update_training_log_metrics * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * banana pancakes * debug * debug * debug * debug * debug * debug * debug * debug * cheesecake	2019-10-04 15:07:54 -04:00
Wouter van Amsterdam	63c475c600	tiny spelling error (#295 )	2019-10-04 07:14:30 -04:00
Nic Eggert	614cb3c03b	Initialize loggers only once (#270 ) * Create underlying loggers lazily This avoids creating duplicate experiments or run in multi-node DDP. * Save hyperparameters automatically * Update docs for snapshotting hyperparams * Fix test tube * Fix test tube pickling	2019-10-02 11:10:40 -04:00
William Falcon	133d6b3ec1	updated docs	2019-10-01 06:38:10 -04:00
William Falcon	fbc2cfd513	updated docs	2019-10-01 06:29:12 -04:00
Nic Eggert	480eed5cb6	Enable any ML experiment tracking framework (#223 ) * Implement generic loggers for experiment tracking * Add tests for loggers * Get model tests passing * Test and fix logger pickling * Expand pickle test and fix bug * Missed exp -> logger conversion * Remove commented code * Add docstrings * Update logging docs * Add mlflow to test requirements * Make linter happy * Fix mlflow timestamp * Update Logging.md * Update test_models.py * Update test_models.py * Update test_models.py * Update properties.md * Fix tests * Line length	2019-09-27 12:05:29 -04:00
William Falcon	1d7ffd11da	delete ref to old update_training_log_metrics (#262 )	2019-09-26 17:53:15 -04:00
William Falcon	059b2fae29	Update Distributed training.md	2019-09-26 15:30:54 -04:00
William Falcon	cefcf4cd12	Update Distributed training.md	2019-09-26 15:27:34 -04:00
Adrian Wälchli	e713e2e1e0	fix typo in early stopping (#260 )	2019-09-26 15:04:57 -04:00
William Falcon	acb4ebea56	added docs for cluster grid search	2019-09-26 12:02:03 -04:00
William Falcon	97b6ebccc0	expanded apex install (#255 )	2019-09-26 09:36:03 -04:00
William Falcon	3337c0237b	Fixes #250 (#253 )	2019-09-26 09:13:00 -04:00


259 Commits