lightning

Commit Graph

Author	SHA1	Message	Date
William Falcon	9576dd28b2	added load on CPU first (#221 ) * added load on CPU first * added print logs * changed close order	2019-09-11 07:52:36 -04:00
William Falcon	90353ac54e	changed examples scripts	2019-09-11 07:05:15 -04:00
William Falcon	cf7dbf6d7c	changed examples scripts	2019-09-11 07:03:31 -04:00
William Falcon	30b25c8146	Sai prasanna master (#219 ) * Fix incorrect warning for DistributedSampler. Check whether `dataloader.sampler` is an instance of DistributedSampler instead of checking the `dataloader`. * Update trainer.py * merged	2019-09-09 11:36:24 -04:00
William Falcon	ac0111c196	Update multi_node_cluster_auto_slurm.py	2019-09-09 10:55:47 -04:00
William Falcon	cbc619afa1	Update multi_node_own_slurm_script.py	2019-09-09 10:54:43 -04:00
William Falcon	3393086cb6	Update multi_node_cluster_auto_slurm.py	2019-09-09 10:53:47 -04:00
William Falcon	506d5da68b	enable single gpu per node (#218 ) * enable single gpu per node	2019-09-09 07:37:20 -04:00
William Falcon	a6fe6f0917	Update README.md	2019-09-08 18:21:05 -04:00
William Falcon	8f289f9fa8	Update README.md	2019-09-08 18:19:00 -04:00
William Falcon	6c947f4e0d	Update README.md	2019-09-08 18:18:21 -04:00
William Falcon	396047ffa0	Updated distributed Demos (#215 ) * added simple cluster template * sets correct backend for possible combinations of gpu inputs * simple slurm example	2019-09-08 18:17:33 -04:00
William Falcon	83b756f77b	Update tox.ini	2019-09-08 15:46:30 -04:00
William Falcon	10d190e045	Simplified gpu api. No NVIDIA flag managing by lightning for cluster (#213 ) * added nvidia flag set * added simple cluster template * sets correct backend for possible combinations of gpu inputs	2019-09-08 15:36:58 -04:00
William Falcon	b3434943c7	Update multi_node_cluster_template.py	2019-09-07 10:31:20 -04:00
Alok Singh	81df2259ef	Make print_nan_grads print grad (#208 ) This seems more useful for debugging.	2019-09-07 01:08:09 -04:00
williamFalcon	9f9d38673e	fixed demo	2019-09-06 16:26:46 -07:00
William Falcon	0c7fbc7178	Weights path (#211 ) * added docs. removed options. added weights_save option * removed old restore * cleaned up save path * flake8	2019-09-06 17:01:03 -04:00
William Falcon	3e74ea15d8	Fixes #120 (#210 )	2019-09-06 14:27:24 -04:00
William Falcon	7099f8dbfb	split trainer mixins (#209 ) * split trainer mixins * Update multi_node_cluster_template.py * Update single_cpu_template.py * Update single_gpu_node_16bit_template.py * Update single_gpu_node_ddp_template.py * Update single_gpu_node_dp_template.py * Update trainer_cpu_template.py * Update trainer_io.py * split trainer mixins * Update multi_node_cluster_template.py * deconflicted	2019-09-06 14:11:07 -04:00
William Falcon	60633eaa32	Moves hpc auto-resubmit to trainer from test-tube (#207 ) * added slurm signal handler * added restore weight functions * set slurm signal handling inside process * added resubmit docs * fixed missing param * Update trainer.py * fixed missing param * debugging tests	2019-09-06 11:54:51 -04:00
Jirka Borovec	7ed928dfac	add PR template (#204 ) * add PR template * Update PULL_REQUEST_TEMPLATE.md	2019-09-06 10:12:06 -04:00
Nic Eggert	1733dba735	Pass outputs from all dataloaders to test_end and validation_end (#203 ) * Pass outputs from all dataloaders to test_end and validation_end * Update tests * Update docs * Update trainer.py * Update test_models.py	2019-09-06 07:37:25 -04:00
Jirka Borovec	447ed30716	extend pip install info (#194 ) * extend pip install info * Update README.md	2019-09-06 07:30:51 -04:00
William Falcon	7e0ac3149c	refactored init (#206 )	2019-09-06 00:29:38 -04:00
Thomas J Fan	bd50d9a2b4	DOC Adds reference to test-tube (#205 )	2019-09-05 21:13:49 -04:00
Jirka Borovec	5ef6fa5608	add osx to Travis (#202 ) * add CI macOS * add CI Windows * update CI * drop Win * update CI	2019-09-05 15:08:19 -04:00
Anton Konstantinov	34b824a9d3	Implement correct transfer to GPU for batches (#200 )	2019-09-05 07:13:06 -04:00
Thomas J Fan	62252cee58	STY Minor flake8 fix (#197 )	2019-09-04 17:46:56 -04:00
Max Horn	dac41030d4	Allow to deactivate GPU memory logging in Trainer (#190 ) * Allow to deactivate GPU memory logging in Trainer Adds the flag `log_gpu_memory` to Trainer to deactivate logging of GPU memory utilization. On some servers logging the GPU memory usage can significantly slow down training. * Update Logging.md * Update trainer.py	2019-09-04 10:43:46 -04:00
Verena Haunschmid	0872c32151	fix import in Tensorboard example (#193 )	2019-09-04 10:20:59 -04:00
Thomas J Fan	c766167773	DOC Minor import fix (#192 )	2019-09-04 06:17:54 -04:00
Nic Eggert	64688e1e15	Refactor test modules (#180 ) * Expectopatronum implement #89 (#182) * rename validate -> evaluate; implement test logic; allow multiple test_loaders * add test_step and test_end to LightningModule * add in_test_mode to pretraining to implement case 2 (test pretrained model) * fix code style issues * LightningTestModel: add optional second test set, implement test_step and test_end * implemented test for multiple test_dataloaders; fixed typo * add two test cases for #89 * add documentation for test_step, test_end; fix computation of loss in validation_step example * Update trainer.py * Added proper dp ddp routing calls for test mode * Update trainer.py * Update test_models.py * Update trainer.py * Update override_data_parallel.py * Update test_models.py * Update trainer.py * Update test_models.py * debug * Update trainer.py * Update override_data_parallel.py * Update debug.py * Update lm_test_module.py * Update test_models.py * release v0.4.8 * Update README.md * add training loop docs * testing loop docs * Convert __dataloader to _dataloader This will let inherited classes use it * Factor common test model setup into base class * Specialized test modules inherit from LightningTestModelBase * Fix __is_overriden so that it works with more complicated inheritance * Use mixins to add functionality to test models * Fix test with no val_dataloader * Remove unused imports * Get rid of wild card import * Update trainer.py * Update lm_test_module.py	2019-09-02 15:46:16 -04:00
William Falcon	c4ce347f3e	testing loop docs	2019-09-02 07:15:45 -04:00
William Falcon	8d6648e51d	Update README.md	2019-09-02 07:15:45 -04:00
William Falcon	9e6ce3b0d6	testing loop docs	2019-09-02 07:15:45 -04:00
William Falcon	a327596b79	add training loop docs	2019-09-02 07:15:45 -04:00
William Falcon	08a1ae8069	release v0.4.8	2019-09-02 07:15:45 -04:00
Verena Haunschmid	25d5b25792	Expectopatronum implement #89 (#182 ) * rename validate -> evaluate; implement test logic; allow multiple test_loaders * add test_step and test_end to LightningModule * add in_test_mode to pretraining to implement case 2 (test pretrained model) * fix code style issues * LightningTestModel: add optional second test set, implement test_step and test_end * implemented test for multiple test_dataloaders; fixed typo * add two test cases for #89 * add documentation for test_step, test_end; fix computation of loss in validation_step example * Update trainer.py * Added proper dp ddp routing calls for test mode * Update trainer.py * Update test_models.py * Update trainer.py * Update override_data_parallel.py * Update test_models.py * Update trainer.py * Update test_models.py * debug * Update trainer.py * Update override_data_parallel.py * Update debug.py * Update lm_test_module.py * Update test_models.py	2019-09-02 07:15:27 -04:00
Stanislav	73cf47112e	Gradient accumulation callback (#150 ) * Gradient accumulation callback * little test case * typo * import fix * method name fix * fix epochs indexing from 1 * better code style * code style fix v2 :/ * change interface * fix Trainre new api in tests * trainer api bug fix * new raising error, new update method * extentions tests * a little better tests * typo fix * flack8 better * using scheduler for int and dict * typo * firs epoch bug fix * test update * empty dict exception * floats check * codestyle fix * grad counting test * someday, i will install normal linter * add more checks * Update test_models.py	2019-08-30 10:56:14 -04:00
Ir1dXD	c2247350bb	feat(val_sanity): enable skipping validation sanity (#176 ) * feat(val_sanity): enable skipping validation sanity when self.nb_sanity_val_steps is 0 * docs: elaborate on skipping	2019-08-28 06:41:31 -04:00
William Falcon	67c314272b	Update setup.py (#174 )	2019-08-27 18:07:33 -04:00
Ir1dXD	da4c1e3409	docs: add repo_name in the upright corner (#171 )	2019-08-27 16:46:18 -04:00
Jirka Borovec	cd89b4ef43	move GH docs (#168 )	2019-08-27 07:10:26 -04:00
Ir1dXD	6eb6daa278	enable highlight (#170 )	2019-08-27 07:09:46 -04:00
William Falcon	c24599f5e5	release v	2019-08-24 08:13:54 -04:00
Ryan McCormick	b22e5918a9	fix python syntax in code blocks to be consistent (#166 ) A couple code blocks used "{.python}" instead of just "python" for the syntax highlighting, which doesn't render properly in GitHub markdown.	2019-08-23 21:24:18 -04:00
William Falcon	4104a0fc47	cleaned up progbar (#165 ) * cleaned up progbar * updated base files * flake 8	2019-08-23 21:23:27 -04:00
William Falcon	2ad9a9708b	Update README.md	2019-08-23 16:10:45 -04:00
William Falcon	ecce22f4de	Update README.md	2019-08-23 16:10:24 -04:00


1416 Commits
