* hpc restore takes priority over non hpc weights
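For context, a minimal sketch of that priority rule: if an HPC auto-save exists, it wins over any ordinary checkpoint. The `hpc_ckpt_N.ckpt` naming scheme, the `'state_dict'` key, and the function name are assumptions for illustration, not Lightning's actual code.

```python
import os
import re

import torch


def restore_weights(model, hpc_dir=None, ckpt_path=None):
    """Prefer an HPC auto-saved checkpoint over ordinary weights."""
    hpc_ckpts = []
    if hpc_dir is not None and os.path.isdir(hpc_dir):
        hpc_ckpts = [f for f in os.listdir(hpc_dir)
                     if re.match(r'hpc_ckpt_\d+\.ckpt', f)]
    if hpc_ckpts:
        # HPC restore takes priority: resume from the highest-numbered auto-save.
        latest = max(hpc_ckpts, key=lambda f: int(re.search(r'\d+', f).group()))
        path = os.path.join(hpc_dir, latest)
    elif ckpt_path is not None:
        path = ckpt_path
    else:
        return model
    checkpoint = torch.load(path, map_location='cpu')
    model.load_state_dict(checkpoint['state_dict'])
    return model
```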
* Unit tests for num_gpu property as proxy for __parse_gpu_ids.
* Refactoring __parse_gpu_ids
* Moved the function outside the class, as it is
a utility function and does not depend on the class in any way.
* Added unit tests for it.
* Mocked torch.cuda.device_count function in tests.
This allows the tests to be run on machines that do not have gpus.
* Fixed the parse_gpu_ids function to handle -1 case.
Function now handles -1 the same way as it does for '-1'.
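A sketch of the mocking approach: patching `torch.cuda.device_count` lets the tests run on CPU-only machines. The import path for `parse_gpu_ids` is an assumption.

```python
from unittest import mock

import pytest

# Assumed location of the refactored utility; adjust to where it actually lives.
from pytorch_lightning.trainer.trainer import parse_gpu_ids


@pytest.mark.parametrize('gpus', [-1, '-1'])
def test_parse_gpu_ids_handles_minus_one(gpus):
    # Pretend 4 devices exist, so no real GPU is needed to run this test.
    with mock.patch('torch.cuda.device_count', return_value=4):
        assert parse_gpu_ids(gpus) == [0, 1, 2, 3]
```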
* Unit tests for root_gpu added.
Added backend as a parameter because, depending on whether a
backend is set, the code currently fails with an exception in
certain circumstances before it even gets to return a wrong answer.
* Moved the __set_root_gpu function out of the class.
This function does not depend on the class and can be tested
more easily this way.
Also added unit tests for it; they simply reuse the test
data from the root_gpu property.
* determine_root_gpu_device passes unit tests.
* num_gpus passes unit tests.
Also added a None test for this function.
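Inferring from the tests above, `determine_root_gpu_device` presumably reduces to picking the first id; a minimal sketch under that assumption:

```python
from typing import List, Optional


def determine_root_gpu_device(gpus: Optional[List[int]]) -> Optional[int]:
    """The root GPU is the first id in the list; None when no GPUs are used."""
    if not gpus:  # covers both None and the empty list
        return None
    return gpus[0]
```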
* parse_gpu_ids tests changed to reflect desired state after refactoring.
Planning to refactor parse_gpu_ids to always return a list of ints.
This will simplify code that uses the output of this function.
* parse_gpu_ids always returns lists (see the test sketch below)
* parse_gpu_ids checks the given ids against the available ids
* parse_gpu_ids raises an exception for non-existent ids
* parse_gpu_ids returns None when no gpus are available
* cleaned up determine_root_gpu_device
* cleaned up num_gpus property
* Updated unit tests to reflect changes in the functions
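A hedged test sketch of this contract; the import path and the exact exception type are assumptions:

```python
from unittest import mock

import pytest

from pytorch_lightning.trainer.trainer import parse_gpu_ids  # assumed path


def test_parse_gpu_ids_contract():
    with mock.patch('torch.cuda.device_count', return_value=2):
        assert parse_gpu_ids('0, 1') == [0, 1]   # always a list of ints
        with pytest.raises(Exception):           # exact exception type not pinned down here
            parse_gpu_ids([0, 5])                # gpu 5 does not exist
    with mock.patch('torch.cuda.device_count', return_value=0):
        assert parse_gpu_ids(1) is None          # no gpus available
```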
* Flake8 fixes
* Moved fixture code up before where it is used.
* Updated documentation.
* Changed tests to match the API (sketched below):
  * gpus=-1 or gpus='-1' should use all available gpu devices
  * gpus=N
    * N=0: no gpus should be used
    * N>0: N gpus should be used
  * gpus=list of ints or a comma-separated string of numbers:
    use the gpus indicated by the list or the string
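A sketch implementing the rules above together with the validation contract listed earlier. The function name matches the commits, but its placement and the use of `ValueError` are assumptions rather than the actual implementation:

```python
from typing import List, Optional, Union

import torch


def parse_gpu_ids(gpus: Union[int, str, List[int], None]) -> Optional[List[int]]:
    num_available = torch.cuda.device_count()
    if gpus is None or num_available == 0:
        return None  # nothing requested, or no gpus available
    if gpus == -1 or gpus == '-1':
        return list(range(num_available))   # use every available device
    if isinstance(gpus, int):
        if gpus == 0:
            return None                     # N=0: run on CPU
        gpu_ids = list(range(gpus))         # N>0: use the first N gpus
    elif isinstance(gpus, str):
        gpu_ids = [int(x) for x in gpus.split(',') if x.strip()]
    else:
        gpu_ids = list(gpus)                # already a list of ints
    for gpu_id in gpu_ids:                  # validate against available devices
        if gpu_id not in range(num_available):
            raise ValueError(f'gpu id {gpu_id} requested but only '
                             f'{num_available} gpus are available')
    return gpu_ids
```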
* Fixed code to pass all the changed tests for parsing gpus param.
* Refactoring parse_gpu_ids function.
* flake8 fixes.
* Updating documentation.
* flake8 fixes.
* Update trainer.py
* Update dp_mixin.py
* Make reduce_distributed_output a stand-alone function.
Fix imports.
Fix flake8.
* Add comet_ml dependency to tests requirements.txt
* Revert "Make reduce_distributed_output a stand alone function. Fix imports. Fix flake8."
This reverts commit eac0338
* Merge with master.
* moved dp, ddp outside of trainer
* added main mixins
* finished major mixin refactor
* flake8
* changes to test fx
* changes to seed for tests
* fix test
* no warnings always
* weights go into default logger folder
* ckpt callback in pretrain routine so exp already has version
* fixes non-Python types in callback metrics
* fixed fast dev run
* callbacks use all other keys in return dict
* remove os.exit from early stopping
* print thousands as K, M, B, T, ...
* add option to print top-level modules only
* added doc string and added spacing
* do not print summary if neither "full" nor "top"
* updated docs showing summary print options
* fix line length for travis
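A minimal sketch of the K, M, B, T formatting described above; the real helper may differ in name and rounding:

```python
PARAMETER_NUM_UNITS = [' ', 'K', 'M', 'B', 'T']


def get_human_readable_count(number: int) -> str:
    """Format a count in groups of thousands: 1200 -> '1.2 K', 3e9 -> '3.0 B'."""
    assert number >= 0
    num_digits = len(str(number)) if number > 0 else 1
    num_groups = min((num_digits - 1) // 3, len(PARAMETER_NUM_UNITS) - 1)
    shift = 3 * num_groups
    return f'{number / 10 ** shift:,.1f} {PARAMETER_NUM_UNITS[num_groups]}'
```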
* cleaning up demos
* cleaning up docs
* cleaned up test_tube logger
* added lbfgs support
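LBFGS needs a closure that re-evaluates the loss, so its step is wired differently from SGD/Adam. A hedged sketch of the idea; `training_step` returning a bare loss tensor is a simplification of the real interface:

```python
import torch


def do_optimizer_step(model, optimizer, batch, batch_idx):
    def closure():
        # LBFGS may re-evaluate the model several times per step,
        # so the loss computation must live inside a closure.
        optimizer.zero_grad()
        loss = model.training_step(batch, batch_idx)
        loss.backward()
        return loss

    if isinstance(optimizer, torch.optim.LBFGS):
        optimizer.step(closure)
    else:
        closure()
        optimizer.step()
```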
* Fixes #280 (#309)
* added test seeds (#306)
* updated docs
* early stopping callback is not default
* added a default logger
* added default checkpoint callback
* added default checkpoint/loggers
* updated docs
* cleaned demos
* clean up docs around loggers
* Create underlying loggers lazily
This avoids creating duplicate experiments or runs in multi-node DDP.
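A sketch of the lazy-creation pattern: the heavyweight experiment object is built on first access, so DDP ranks that never log never open a duplicate run. Names here are illustrative, not the actual logger API:

```python
class LazyLogger:
    """Illustrative only: defers creating the underlying run object."""

    def __init__(self, create_experiment_fn):
        # e.g. create_experiment_fn = lambda: test_tube.Experiment(...)
        self._create_experiment_fn = create_experiment_fn
        self._experiment = None

    @property
    def experiment(self):
        # Built on first access, so processes that never log
        # (e.g. non-zero DDP ranks) never open a duplicate run.
        if self._experiment is None:
            self._experiment = self._create_experiment_fn()
        return self._experiment
```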
* Save hyperparameters automatically
* Update docs for snapshotting hyperparams
* Fix test tube
* Fix test tube pickling
* always calls the lr scheduler with the epoch number (see the sketch below)
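Illustration of the call pattern; the explicit `epoch` argument matches the PyTorch API of this era and is deprecated in newer releases:

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

for epoch in range(30):
    optimizer.step()             # training work elided
    scheduler.step(epoch=epoch)  # pass the epoch number explicitly
```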
* added docs for cluster grid search
* undo test changes
* added load on CPU first
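The idea, sketched with placeholder names: map every tensor to CPU at load time so a checkpoint saved on GPU restores on machines with fewer (or no) GPUs, then move the model afterwards. The `'state_dict'` key is an assumption:

```python
import torch


def load_on_cpu_first(model: torch.nn.Module, ckpt_path: str,
                      device: str = 'cuda:0') -> torch.nn.Module:
    # Map every tensor to CPU storage at load time, regardless of
    # which device the checkpoint was written on.
    checkpoint = torch.load(ckpt_path, map_location=lambda storage, loc: storage)
    model.load_state_dict(checkpoint['state_dict'])
    return model.to(device)  # move to the target device afterwards
```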
* added print logs
* changed close order
* Fix incorrect warning for DistributedSampler.
Check whether `dataloader.sampler` is an instance of DistributedSampler instead of checking the `dataloader`.
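The corrected check, roughly:

```python
import warnings

from torch.utils.data.distributed import DistributedSampler


def warn_if_wrong_sampler(dataloader, use_ddp: bool):
    # Inspect the sampler attached to the dataloader,
    # not the dataloader object itself.
    if use_ddp and not isinstance(dataloader.sampler, DistributedSampler):
        warnings.warn('when using DDP, attach a DistributedSampler to your DataLoader')
```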
* Update trainer.py
* merged
* enable single gpu per node
* added nvidia flag set
* added simple cluster template
* sets correct backend for possible combinations of gpu inputs
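A plausible sketch of the selection rules; the trainer's actual decision table may differ:

```python
from typing import List, Optional


def set_distributed_backend(gpus: Optional[List[int]], num_nodes: int,
                            requested: Optional[str]) -> Optional[str]:
    if not gpus:
        return None              # CPU run: no distributed backend
    if num_nodes > 1:
        return 'ddp'             # multiple nodes always need DDP
    if len(gpus) == 1:
        return requested         # single GPU per node: honor request, if any
    return requested or 'dp'     # several GPUs, one node: default to DP
```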