lightning

Commit Graph

Author	SHA1	Message	Date
Nic Eggert	614cb3c03b	Initialize loggers only once (#270) * Create underlying loggers lazily This avoids creating duplicate experiments or run in multi-node DDP. * Save hyperparameters automatically * Update docs for snapshotting hyperparams * Fix test tube * Fix test tube pickling	2019-10-02 11:10:40 -04:00
William Falcon	133d6b3ec1	updated docs	2019-10-01 06:38:10 -04:00
William Falcon	fbc2cfd513	updated docs	2019-10-01 06:29:12 -04:00
Nic Eggert	480eed5cb6	Enable any ML experiment tracking framework (#223) * Implement generic loggers for experiment tracking * Add tests for loggers * Get model tests passing * Test and fix logger pickling * Expand pickle test and fix bug * Missed exp -> logger conversion * Remove commented code * Add docstrings * Update logging docs * Add mlflow to test requirements * Make linter happy * Fix mlflow timestamp * Update Logging.md * Update test_models.py * Update properties.md * Fix tests * Line length	2019-09-27 12:05:29 -04:00
William Falcon	1d7ffd11da	delete ref to old update_training_log_metrics (#262)	2019-09-26 17:53:15 -04:00
William Falcon	059b2fae29	Update Distributed training.md	2019-09-26 15:30:54 -04:00
William Falcon	cefcf4cd12	Update Distributed training.md	2019-09-26 15:27:34 -04:00
Adrian Wälchli	e713e2e1e0	fix typo in early stopping (#260)	2019-09-26 15:04:57 -04:00
William Falcon	acb4ebea56	added docs for cluster grid search	2019-09-26 12:02:03 -04:00
William Falcon	97b6ebccc0	expanded apex install (#255)	2019-09-26 09:36:03 -04:00
William Falcon	3337c0237b	Fixes #250 (#253)	2019-09-26 09:13:00 -04:00
Alok Singh	b0a0a47a0b	Rename variables (#124) - data_batch → batch - batch_i → batch_idx - dataloader_i → dataloader_idx - tng → training - training_dataloader → train_dataloader - add_log_row_interval → row_log_interval - gradient_clip → gradient_clip_val - prog → progress - tqdm_dic → tqdm_dict	2019-09-25 19:05:06 -04:00
Cola	3d16a686b3	Add EarlyStop documentation (#245) * Update Training Loop.md * Update index.md * Update README.md * Update Training Loop.md	2019-09-25 14:52:40 -04:00
William Falcon	2a1bc22f42	updated docs	2019-09-17 09:57:16 -04:00
William Falcon	d3afc8acd5	updated docs	2019-09-17 09:53:31 -04:00
William Falcon	4c61d1f30a	updated docs	2019-09-16 11:07:16 -04:00
William Falcon	e1adbe80f9	updated docs	2019-09-16 11:04:40 -04:00
William Falcon	286625a02f	updated docs	2019-09-16 11:02:04 -04:00
William Falcon	b354988255	updated docs	2019-09-16 10:59:28 -04:00
William Falcon	10d190e045	Simplified gpu api. No NVIDIA flag managing by lightning for cluster (#213) * added nvidia flag set * added simple cluster template * sets correct backend for possible combinations of gpu inputs	2019-09-08 15:36:58 -04:00
William Falcon	3e74ea15d8	Fixes #120 (#210)	2019-09-06 14:27:24 -04:00
William Falcon	60633eaa32	Moves hpc auto-resubmit to trainer from test-tube (#207) * added slurm signal handler * added restore weight functions * set slurm signal handling inside process * added resubmit docs * fixed missing param * Update trainer.py * fixed missing param * debugging tests	2019-09-06 11:54:51 -04:00
Nic Eggert	1733dba735	Pass outputs from all dataloaders to test_end and validation_end (#203) * Pass outputs from all dataloaders to test_end and validation_end * Update tests * Update docs * Update trainer.py * Update test_models.py	2019-09-06 07:37:25 -04:00
Max Horn	dac41030d4	Allow to deactivate GPU memory logging in Trainer (#190) * Allow to deactivate GPU memory logging in Trainer Adds the flag `log_gpu_memory` to Trainer to deactivate logging of GPU memory utilization. On some servers logging the GPU memory usage can significantly slow down training. * Update Logging.md * Update trainer.py	2019-09-04 10:43:46 -04:00
William Falcon	c4ce347f3e	testing loop docs	2019-09-02 07:15:45 -04:00
William Falcon	9e6ce3b0d6	testing loop docs	2019-09-02 07:15:45 -04:00
William Falcon	a327596b79	add training loop docs	2019-09-02 07:15:45 -04:00
Verena Haunschmid	25d5b25792	Expectopatronum implement #89 (#182) * rename validate -> evaluate; implement test logic; allow multiple test_loaders * add test_step and test_end to LightningModule * add in_test_mode to pretraining to implement case 2 (test pretrained model) * fix code style issues * LightningTestModel: add optional second test set, implement test_step and test_end * implemented test for multiple test_dataloaders; fixed typo * add two test cases for #89 * add documentation for test_step, test_end; fix computation of loss in validation_step example * Update trainer.py * Added proper dp ddp routing calls for test mode * Update trainer.py * Update test_models.py * Update trainer.py * Update override_data_parallel.py * Update test_models.py * Update trainer.py * Update test_models.py * debug * Update trainer.py * Update override_data_parallel.py * Update debug.py * Update lm_test_module.py * Update test_models.py	2019-09-02 07:15:27 -04:00
Ir1dXD	c2247350bb	feat(val_sanity): enable skipping validation sanity (#176) * feat(val_sanity): enable skipping validation sanity when self.nb_sanity_val_steps is 0 * docs: elaborate on skipping	2019-08-28 06:41:31 -04:00
Ir1dXD	6eb6daa278	enable highlight (#170)	2019-08-27 07:09:46 -04:00
William Falcon	4104a0fc47	cleaned up progbar (#165) * cleaned up progbar * updated base files * flake 8	2019-08-23 21:23:27 -04:00
Sebastian Præsius	9fc66026f1	train = False in test_dataloader (#162) A small change to the CoolModel example. Now test_dataloader returns the MNIST test dataset.	2019-08-22 17:44:06 -04:00
sebftw	a7a14dadb6	F.cross_entropy(y_hat, y)(y_hat, y) typo. (#137) This seems to be a typo. Throws TypeError: 'Tensor' object is not callable.	2019-08-18 18:17:43 -04:00
sebftw	b2a49197e4	tensorboarX to tensorboardX (#136) * tensorboarX to tensorboardX * Update properties.md	2019-08-18 18:17:05 -04:00
Ir1dXD	48de39ed50	elaborate on the correlation between overfit_pct and xxx_percent_check (#132) * Update Training Loop.md * update docs and elaborate on the correlation	2019-08-17 10:23:25 -04:00
Ir1dXD	24a97956e4	fix typo in docs (#129) * fix typo * fix list	2019-08-17 07:48:33 -04:00
William Falcon	e60e002f17	updated docs	2019-08-16 17:14:31 -04:00
William Falcon	bdd86087e6	updated docs	2019-08-16 10:07:56 -04:00
William Falcon	50f0de094f	updated docs	2019-08-16 10:07:44 -04:00
William Falcon	bc401d0f59	updated docs	2019-08-16 10:02:28 -04:00
William Falcon	4b97319c2e	updated docs	2019-08-15 21:29:25 -04:00
William Falcon	0e92a9d7af	updated docs	2019-08-15 21:19:29 -04:00
William Falcon	44da88fd15	updated docs	2019-08-15 13:59:27 -04:00
William Falcon	3dea127edb	updated docs	2019-08-13 13:05:47 -04:00
William Falcon	d4b1ac94a0	updated docs	2019-08-13 13:03:39 -04:00
William Falcon	b89b7f0a8c	updated docs	2019-08-13 13:02:17 -04:00
William Falcon	699fbabda7	updated optimizer_step docs	2019-08-13 11:59:33 -04:00
William Falcon	fd845d41c0	updated optimizer_step docs	2019-08-13 11:57:02 -04:00
William Falcon	d7660d3c64	updated optimizer_step docs	2019-08-13 11:55:10 -04:00
William Falcon	7e38f1f246	updated optimizer_step docs	2019-08-13 11:54:19 -04:00


220 Commits