Adrian Wälchli
6e3e740a7f
Param printing ( #336 )
...
* print thousands as K, M, B, T, ...
* add option to print top-level modules only
* added doc string and added spacing
* do not print summary if neither "full" nor "top"
* updated docs showing summary print options
* fix line length for travis
2019-10-08 15:30:06 -04:00
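The "print thousands as K, M, B, T" change in the Param printing commit above could be sketched roughly as follows. This is a minimal illustration, not the code from that commit: the function name `human_readable_count`, the one-decimal rounding, and the space before the suffix are all assumptions.

```python
def human_readable_count(number):
    """Abbreviate a non-negative integer with K, M, B, T suffixes.

    Hypothetical helper for illustration; the name and output format
    are assumptions, not taken from the commit itself.
    """
    labels = ["", "K", "M", "B", "T"]
    value = float(number)
    index = 0
    # Divide by 1000 until the value fits the largest applicable suffix.
    while value >= 1000 and index < len(labels) - 1:
        value /= 1000.0
        index += 1
    if index == 0:
        # Values below one thousand are printed unchanged.
        return str(int(number))
    return f"{value:.1f} {labels[index]}"


if __name__ == "__main__":
    for count in (123, 1234, 2_000_000, 3 * 10**9):
        print(count, "->", human_readable_count(count))
```

Counts beyond the trillions keep the T suffix in this sketch, since the list of labels stops there.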
William Falcon
07c5d22ae3
cleaning up demos ( #313 )
...
* cleaning up demos
2019-10-05 16:39:05 -04:00
William Falcon
6cc3f1757f
decouple returns from each step ( #307 )
...
* decoupled training metrics from logging metrics
* decoupled validation metrics from log metrics
* updated docs
* Fixed test
* merged master
2019-10-05 13:35:20 -04:00
William Falcon
8f5a06bfb8
Gpu mem ( #308 )
...
* Fixes #289
* added lbfgs support
* Fixes #280 (#309 )
* added test seeds (#306 )
* added test seeds
* updated docs
* added lbfgs support (#310 )
* merged master
2019-10-05 11:29:34 -04:00
William Falcon
967957e55c
added lbfgs support
2019-10-05 10:47:18 -04:00
William Falcon
bf09060fef
Fixes #292 ( #303 )
...
* early stopping callback is not default
* added a default logger
* added default checkpoint callback
* added default checkpoint/loggers
* updated docs
* cleaned demos
* clean up docs around loggers
2019-10-04 19:48:57 -04:00
William Falcon
a578de511d
clean up docs around loggers ( #304 )
2019-10-04 18:53:38 -04:00
William Falcon
a60a24d11b
disable auto gpu loading when restoring weights to avoid OOM ( #242 )
...
* Update root_module.py
* tests fix
2019-10-04 16:18:43 -04:00
William Falcon
73a7cf3c99
Mem crash ( #299 )
...
* fixes memory crash
2019-10-04 15:53:44 -04:00
Hendrik Schröter
36f0b5bbd0
Use getter instead of python property for the dataloaders ( #275 )
...
* Use getter instead of python property for the dataloaders
* Fix lint
* Update trainer.py
2019-10-04 15:35:02 -04:00
William Falcon
32e74b8f36
Ddp2 ( #261 )
...
* adds ddp2 option where on each node a single process uses all gpus
* added ddp2 test
* added ddp2 docs
* Update Distributed training.md
* delete ref to old update_training_log_metrics
* debug
* banana pancakes
* cheesecake
2019-10-04 15:07:54 -04:00
Alok Singh
b0a0a47a0b
Rename variables ( #124 )
...
- data_batch → batch
- batch_i → batch_idx
- dataloader_i → dataloader_idx
- tng → training
- training_dataloader → train_dataloader
- add_log_row_interval → row_log_interval
- gradient_clip → gradient_clip_val
- prog → progress
- tqdm_dic → tqdm_dict
2019-09-25 19:05:06 -04:00
William Falcon
55e7322747
Metrics load ( #228 )
...
* load from metrics defaults to CPU
2019-09-16 10:47:19 -04:00
William Falcon
7099f8dbfb
split trainer mixins ( #209 )
...
* split trainer mixins
* Update multi_node_cluster_template.py
* Update single_cpu_template.py
* Update single_gpu_node_16bit_template.py
* Update single_gpu_node_ddp_template.py
* Update single_gpu_node_dp_template.py
* Update trainer_cpu_template.py
* Update trainer_io.py
* deconflicted
2019-09-06 14:11:07 -04:00
William Falcon
60633eaa32
Moves hpc auto-resubmit to trainer from test-tube ( #207 )
...
* added slurm signal handler
* added restore weight functions
* set slurm signal handling inside process
* added resubmit docs
* fixed missing param
* Update trainer.py
* debugging tests
2019-09-06 11:54:51 -04:00
Verena Haunschmid
25d5b25792
Expectopatronum implement #89 ( #182 )
...
* rename validate -> evaluate; implement test logic; allow multiple test_loaders
* add test_step and test_end to LightningModule
* add in_test_mode to pretraining to implement case 2 (test pretrained model)
* fix code style issues
* LightningTestModel: add optional second test set, implement test_step and test_end
* implemented test for multiple test_dataloaders; fixed typo
* add two test cases for #89
* add documentation for test_step, test_end; fix computation of loss in validation_step example
* Update trainer.py
* Added proper dp ddp routing calls for test mode
* Update test_models.py
* Update override_data_parallel.py
* debug
* Update debug.py
* Update lm_test_module.py
2019-09-02 07:15:27 -04:00
William Falcon
4104a0fc47
cleaned up progbar ( #165 )
...
* cleaned up progbar
* updated base files
* flake 8
2019-08-23 21:23:27 -04:00
Sebastian Præsius
b31539f62e
Guard against AttributeError in dataloaders. ( #161 )
...
A solution for https://github.com/williamFalcon/pytorch-lightning/issues/142.
Since hasattr is implemented by calling getattr(object, name) and checking whether it raises an AttributeError, I replaced it with a single call to getattr.
See also https://stackoverflow.com/questions/24971061/python-hasattr-vs-getattr
2019-08-23 08:21:39 -04:00
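The hasattr-versus-getattr point in the commit above can be illustrated with a small sketch. The class `LazyLoader`, the helper names, and the sentinel `_MISSING` are hypothetical and are not taken from the pytorch-lightning code; the sketch only shows that a hasattr check followed by an attribute read evaluates a property twice, whereas one getattr call with a default evaluates it once.

```python
_MISSING = object()  # sentinel, so that a stored value of None is not mistaken for "absent"


class LazyLoader:
    """Hypothetical object whose attribute is computed by a property."""

    def __init__(self):
        self.calls = 0

    @property
    def loader(self):
        # Count how often the property body actually runs.
        self.calls += 1
        return [1, 2, 3]


def fetch_two_step(obj, name):
    # hasattr() runs the property once, getattr() runs it a second time.
    if hasattr(obj, name):
        return getattr(obj, name)
    return None


def fetch_one_step(obj, name):
    # A single getattr() with a default performs one lookup only.
    value = getattr(obj, name, _MISSING)
    return None if value is _MISSING else value


if __name__ == "__main__":
    a, b = LazyLoader(), LazyLoader()
    fetch_two_step(a, "loader")
    fetch_one_step(b, "loader")
    print(a.calls, b.calls)
```

Note that both variants treat an AttributeError raised inside the property as "attribute missing", so the single call reduces the work but does not by itself surface such errors.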
William Falcon
7f53e7bfb3
Val idx optional in validation_step ( #108 )
...
* made dataset_i only available with multiple datasets
* updated interface signature
* updated tests
2019-08-13 11:37:37 -04:00
William Falcon
905a2e5a12
allow user to control optimizer step for every optimizer
...
* added custom hook for user defined optimizer step
* refactored to allow multiple optimizers different training_step
* pep8
2019-08-13 09:32:45 -04:00
William Falcon
e5805bf8ff
val and test are optional now ( #95 )
...
* made validation step optional
* added no val model
* val_step can be implemented but not validation_end
* added no val end model
* added tests
* remove class
* updated docs
* updated test
* fix pep8
2019-08-11 10:01:57 -04:00
William Falcon
10e4b18452
made imports absolute
2019-08-07 10:14:59 -04:00
William Falcon
35f23bbc82
Merge pull request #55 from williamFalcon/continue
...
add training restore
2019-08-07 09:02:16 -04:00
William Falcon
cdbcbad352
added hook on_sanity_check_start
2019-08-07 07:51:55 -04:00
William Falcon
5c398d7a4e
removed bad hook call
2019-08-07 07:39:41 -04:00
William Falcon
a931ded310
removed bad hook call
2019-08-07 07:35:02 -04:00
William Falcon
95ec072d1e
removed bad hook call
2019-08-07 07:30:02 -04:00
William Falcon
d3f19c8321
added auto restore
2019-08-07 06:55:05 -04:00
Jiri BOROVEC
d9bfe964f9
update by flake8
2019-08-06 22:45:46 +02:00
Jiri BOROVEC
632d07b490
fix prints for py3.5
2019-08-06 22:45:46 +02:00
Jiri BOROVEC
c44966a8bf
apply PEP8
2019-08-06 22:45:27 +02:00
Jiri BOROVEC
469941a528
pkg relative imports
...
* split requirements.txt
* pytest verbose
2019-08-05 10:52:09 +02:00
William Falcon
019b4d16d0
formatting
2019-08-04 13:08:14 -05:00
William Falcon
f2ef367f7d
removing unused imports
2019-08-04 13:07:50 -05:00
William Falcon
ef6d5a412c
proc 0 only for save hpc. all procs for hpc load
2019-08-01 16:19:04 -04:00
williamFalcon
27660b8a96
running tests
2019-07-28 05:57:37 -07:00
williamFalcon
5db28899aa
merged
2019-07-28 05:39:25 -07:00
William Falcon
f5a01edfb8
added clean slurm save load test
2019-07-26 22:32:34 -04:00
William Falcon
f1de62671d
added clean slurm save load test
2019-07-26 22:32:27 -04:00
William Falcon
57edb08bd8
added clean slurm save load test
2019-07-26 22:28:09 -04:00
William Falcon
ffa7a0dbab
added clean slurm save load test
2019-07-26 22:26:55 -04:00
William Falcon
348223a702
fixed hpc save, load. cleaned api
2019-07-26 22:09:35 -04:00
William Falcon
64de447545
fixed hpc save, load. cleaned api
2019-07-26 22:07:02 -04:00
William Falcon
265411572f
fixed hpc save, load. cleaned api
2019-07-26 22:04:27 -04:00
William Falcon
4148c36abd
added model save load test
2019-07-26 21:55:01 -04:00
William Falcon
aacf1947ea
auto state-dict and remove the way the model is loaded during hpc
2019-07-26 21:38:06 -04:00
William Falcon
e2c7fa44b7
auto state-dict and remove the way the model is loaded during hpc
2019-07-26 21:37:06 -04:00
William Falcon
1a835969a6
added saving tests to cpu
2019-07-26 12:14:58 -04:00
Phuc Le
7d97e3e6e4
Support any lr_scheduler
2019-07-26 11:03:44 +07:00
William Falcon
7e728d97e7
removed save model logging
2019-07-25 14:36:22 -04:00