19 Commits

Author SHA1 Message Date
Jirka Borovec dcf6e4e310
remake nvidia docker (#6686)
* use latest

* remake

* examples
2021-03-29 09:39:06 +01:00
Carlos Mocholí 21fc5eb21e
Automatically find and run special tests (#6669) 2021-03-26 17:04:59 +00:00
Jirka Borovec 64d0fa4472
update coverage config (#6524)
* update coverage config

* parallel

* parallel

* Apply suggestions from code review

* Apply suggestions from code review

* paralel

* paralel

* paralel

* combine

* combine

* .

* ..

* ..

* ..

* rev

* cb

* cb

* drop

* drop

* .

* ..

* ...

* ...

* ...

* .
2021-03-23 23:05:04 +01:00
Jirka Borovec e62c7c7839
hotfix: mock examples (#6632)
* mock examples

* drop from GA
2021-03-22 16:49:01 +00:00
Jirka Borovec cb59039288
fixing examples (#6600)
* try Azure

* -e

* path
2021-03-20 18:58:59 +00:00
Jirka Borovec eb3ff413a9
CI: Azure publish results (#6514) 2021-03-15 14:38:40 +00:00
Jirka Borovec 85c8074bee
require: adjust versions (#6363)
* adjust versions

* release

* manifest

* pep8

* CI

* fix

* build
2021-03-06 14:34:54 +01:00
Jirka Borovec e84854264f
CI: fix examples - patch download MNIST (#6357)
* patch download

* CI

* isort

* extra
2021-03-05 16:50:21 +00:00
Jirka Borovec e038e747a0
hotfix for PT1.6 and torchtext (#6323)
* ci: azure reinstall torchtext

* move

* todos

* 0.6.0

* skip examples

* formatter

* skip

* todo

* Apply suggestions from code review
2021-03-04 17:48:17 +01:00
Jirka Borovec 6788dbabff
switch agents pool (#6270) 2021-03-01 22:14:55 +01:00
Jirka Borovec f2660acbf9
add sanity check on nb available GPUs (#6092) 2021-02-19 21:45:53 +00:00
Jirka Borovec e12c8a7254
add Azure tags trigger (#6066)
* add Azure tags trigger

* fix

* mnodes
2021-02-18 16:41:16 -05:00
Sean Naren 8440595b26
[CI] Move DeepSpeed into CUDA image, remove DeepSpeed install from azure (#6043)
* Move to CUDA image

* Remove deepspeed install as deepspeed now in the cuda image

* Remove path setting, as ninja should be in the container now
2021-02-17 18:51:31 -05:00
Sean Naren 7189d673f6
DeepSpeed Integration (#5954)
* Add initial deepspeed changes

* Address code review

* Move static method outside of function

* Fixes

* Add missing annotation

* Remove seed setting

* Doc changes

* Doc changes, add address reviews

* Fix docs

* Try fixing issue by moving to torch adam

* Clean up check

* Changes, better APIs!

* Add wrapper, swap to git install revision

* Add special test

* Add warning

* Address review

* Add better disclaimer

* Turn off ZeRO for testing due to compilation

* Add description on modifying parameters via the plugin

* Doc strings clear

* Small doc fixes

* Fix hash, reduce test

* Added CI change

* Move to azure pipeline

* Fix test name

* Add missing flag

* Remove sudo...

* Try conda instead

* Swap to conda base

* Try suggested install

* Apply suggestions from code review

* Apply suggestions from code review

* Revert "Apply suggestions from code review"

This reverts commit 41cca05a

* Revert "Apply suggestions from code review"

This reverts commit e06ec29e

* Remove setter

* Address most review

* Move out function, remove DeepSpeed from requirements

* Install deepspeed/mpi4py within container

* Use special tests, move to master commit for deepspeed

* Export path

* Force compile to happen first

* Remove!

* Debugging ninja

* Fix error in optimizer step logic

* Attempt to fix symbolic link

* Reverse to aid debugging

* Export path again

* Clean up mess

* var

* Revert "var"

This reverts commit 3450eaca

* Address review, add todo

* Add note about unsupported functionality

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-17 15:23:42 -05:00
Jirka Borovec c0ee1f19fc
fix install dtrun (#6025) 2021-02-17 11:43:51 +00:00
Jirka Borovec ba806c8ee0
enable testing DDP examples (#4995)
* enable testing DDP examples

* args

* ddp_spawn

* ddp as extra script

* path

# Conflicts:
#	.drone.yml

* install

* -u

* q
2021-02-15 15:36:13 +00:00
Nicki Skafte 979c879e45
drop DDP CLI test (#5938)
* fix tests

* =

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-12 17:42:32 +01:00
Jirka Borovec 373a31e63e
add azure timeout (#5907)
* add azure timeout

* rework
2021-02-10 20:21:20 +00:00
Jirka Borovec c2c82dad62
CI: Azure (#5882)
* add base Azure pipeline

* skip
2021-02-10 04:43:26 -05:00