William Falcon
29ebe92208
support for native amp ( #1561 )
* adding native amp support
* adding native amp support
* adding native amp support
* adding native amp support
* autocast
* autocast
* autocast
* autocast
* autocast
* autocast
* removed comments
* removed comments
* added state saving
* added state saving
* try install amp again
* added state saving
* drop Apex reinstall
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-23 14:47:08 -04:00
Jirka Borovec
0b22b64a10
Tests/docker ( #1573 )
* devel image
* try parallel
* new image
2020-04-23 12:52:59 -04:00
Travis Addair
7024177f7d
Added Horovod distributed backend ( #1529 )
* Initial commit of Horovod distributed backend implementation
* Update distrib_data_parallel.py
* Update distrib_data_parallel.py
* Update tests/models/test_horovod.py
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/models/test_horovod.py
Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
* Fixed tests
* Added six
* tests
* Install tox for GitHub CI
* Retry tests
* Catch all exceptions
* Skip cache
* Remove tox
* Restore pip cache
* Remove the cache
* Restore pip cache
* Remove AMP
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-22 17:39:08 -04:00
Jirka Borovec
724b787cd1
faster CI testing ( #1323 )
* MNIST digits
* increase test acc
* smaller parity
* drone builds
* increase GH action timeout
* drone format
* fix paths
* drone cache
* circle cache
* fix test
* lower nb epochs
* circleCI
* user orb
* fix test
* fix test
* circle cache
* circle cache
* circle cache
* comment caches
* benchmark batch size
* cache dataset
* smaller dataset
* smaller dataset
* fix nb samples
* batch size
* fix test
2020-04-02 12:28:44 -04:00
William Falcon
18d055a390
Parity test ( #1284 )
* adding test
* adding test
* added base parity model
* added base parity model
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* added parity test
* move parity to benchmark
* formatting
* fixed gradient acc sched
* move parity to benchmark
* formatting
* fixed gradient acc sched
* skip for CPU
* call last
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-03-30 18:16:32 -04:00
Jirka Borovec
61177cd1c8
system info ( #1234 )
* system info
* update big info
* test script
* update config
* rename script
* import path
2020-03-27 08:45:52 -04:00
Jirka Borovec
45d671a4a8
CI: split tests-examples ( #990 )
* CI: split tests-examples
* tests without template
* comment depends
* CircleCI typo
* add doctest
* update test req.
* CI tests
* setup macOS
* longer train
* lower pred acc
* fix model
* rename default model
* lower tests acc
* typo
* imports
* fix test optimizer
* update calls
* fix Win
* lower Drone image
* fix call
* pytorch image
* fix test
* add dev image
* add dev image
* update image
* drone volume
* lint
* update test notes
* rename tests/models >> tests/base
* group models
* conftest
* optim imports
* typos
* fix import
* fix tests
* install AMP
* tests
* fix import
2020-03-25 07:46:27 -04:00
Jirka Borovec
22a7264e9a
improve partial Codecov ( #1172 )
* ignore in setup
* show report
* abs imports
* abstract pass
* cover loggers
* doctest trains
* locals
* pass
* revert tensorboard
* use tensorboardX
* revert tensorboardX
* fix trains
* Add TrainsLogger.set_credentials (#1179 )
* Add TrainsLogger.set_credentials to control trains server configuration and authentication from code. Sync trains package version.
Fix CI Trains tests
* Add global TrainsLogger set_bypass_mode (#1187 )
* Add global TrainsLogger set_bypass_mode to skip all external communication
Co-authored-by: bmartinn <>
* rm some no-cov
Co-authored-by: Martin.B <51887611+bmartinn@users.noreply.github.com>
2020-03-19 09:14:29 -04:00
Jirka Borovec
f6a7a5278a
enable Codecov ( #1133 )
* update config
* try Drone cache
* drop Drone cache
* move import
* remove token
2020-03-14 13:01:57 -04:00
Jirka Borovec
5691ffb160
add Drone CI ( #1115 )
* add Drone config
* update Drone config
* add Drone config
* list GPUs
* add type
* native torch
* native torch
* fix image
* update
* SLURM_LOCALID
* add badge
* simple test
2020-03-11 15:39:59 -04:00