Jirka Borovec
bddc6cd77a
pytest default color (#4703)
...
* pytest default color
* time
Co-authored-by: chaton <thomas@grid.ai>
2020-11-18 10:53:44 +00:00
Jirka Borovec
7940ea5aaf
CI: TPU drop install horovod (#4622)
...
Co-authored-by: chaton <thomas@grid.ai>
2020-11-13 11:33:52 +01:00
Jirka Borovec
bd6c413829
Conda: PT 1.8 (#3833)
...
* PT 1.8
* unfreeze PT
* drop nightly from full
* add PT 1.8 to workflow
* readme table
* cuda
* skip cuda
* test 1.8
* unfreeze torch vision
Co-authored-by: ydcjeff <ydcjeff@outlook.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-12 15:03:43 +01:00
Jeff Yang
23719e3c05
[dockers] install nvidia-dali-cudaXXX (#4532)
...
* [dockers] install nvidia-dali-cuda100
* Apply suggestions from code review
* build DALI
* build DALI
* build DALI
* dali from source
* dali from source
* use binaries
* qq
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-09 21:18:24 +06:30
Jeff Yang
1d594c5d0c
[docker] Lock cuda version (#4453)
...
* lock cuda version
* back to normal
2020-10-31 20:17:07 +06:30
Jeff Yang
0f584faa6b
PyTorch 1.7 Stable support (#3821)
...
* prepare for 1.7 support [ci skip]
* tpu [ci skip]
* test run 1.7
* all 1.7, needs to fix tests
* couple with torchvision
* windows try
* remove windows
* 1.7 is here
* on purpose fail [ci skip]
* return [ci skip]
* 1.7 docker
* back to normal [ci skip]
* change to some_val [ci skip]
* add seed [ci skip]
* 4 places [ci skip]
* fail on purpose [ci skip]
* verbose=True [ci skip]
* use filename to track
* use filename to track
* monitor epoch + changelog
* Update tests/checkpointing/test_model_checkpoint.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-10-30 15:42:14 +00:00
Jirka Borovec
ce8abd6255
Drone: use nightly build cuda docker images (#3658)
...
* upgrade PT version
* update docker
* docker
* try 1.5
* badge
* fix typo: dor -> for (#3918)
* prune
* prune
* env
* echo
* try
* notes
* env
* env
* env
* notes
* docker
* prune
* maintainer
* CI
* update
* just 1.5
* CI
* CI
* CI
* CI
* CI
* CI
* CI
* CI
* CI
* CI
* CI
* docker
* CI
* CI
* CI
* CI
* CI
* CI
* CI
* CI
* CI
* push
* try
* prune
* CI
* CI
* CI
* CI
Co-authored-by: Klyukin Valeriy <mr.clyukin@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-26 10:47:09 +00:00
Jeff Yang
d83c4e4d69
Cache docker builds (#3659)
...
* parent faa357648f
author ydcjeff <ydcjeff@outlook.com> 1601049378 +0630
committer ydcjeff <ydcjeff@outlook.com> 1601469495 +0630
cache docker builds
lock horovod at 0.19.5
done [ci skip] [CI SKIP]
use --cache-from [ci skip]
typo and horovod [ci skip]
exclude pt 1.3 py3.8 [ci skip]
conda no cache [ci skip]
fix
* revert
* align with master [ci skip]
* retry
* remove empty continuation lines
* add comment
* fix build-args
2020-10-25 18:46:10 +06:30
chaton
829d90b257
activated color in all pytest runs (#4254)
...
* activated color in all pytest runs
* Update .drone.yml
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-20 16:38:17 +02:00
Jirka Borovec
d3567c33a6
move base req. to root (#4219)
...
* move base req. to root
* check-manifest
* check-manifest
* manifest
* req
2020-10-18 20:40:18 +02:00
Jeff Yang
90929fa433
Fix apt repo issue for docker (#3823)
...
* fix docker repo issue
* docker
* docker
* docker
* no cudnn
* no cudnn
* try 16.04
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
2020-10-05 23:18:14 -04:00
Jirka Borovec
1160270882
fix path in CI for release & python version in all dockers & duplicated badges (#3765)
...
* typo
* path
* check
* trigger
* fix conda
* pip ver
* fix cuda
* fix XLA
* fix xla
* ci
* docker
* BIULD
* unBIULD
* update
* py 3.8
* apex
* apex
2020-10-02 05:26:21 -04:00
Jirka Borovec
ab508dae0c
run TPU tests with multiple versions (#3024)
...
* rename
* multi build
* multi build
* copy
* copy
* copy
* copy
* copy
* copy
* clean
* note
* docker
* formatting
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-09-30 08:36:02 -04:00
Jirka Borovec
a0968e4bdf
fix PT version in CUDA docker images (#3739)
...
* upgrade PT version
* update docker
* docker
* try 1.5
* fix docker versions
* old
* badge
2020-09-30 08:33:22 -04:00
Jirka Borovec
a94728c99b
spec Horovod version (#3661)
...
* spec Horovod version
* MAKEFLAGS="-j2"
* tests
* CI
* docker
* CI
* docker
2020-09-26 19:30:25 +02:00
Jirka Borovec
0784cf3ab4
dockers nightly (#3615)
...
* dockers nightly
* typo
* Apply suggestions from code review
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-09-25 15:58:01 +02:00
Jeff Yang
a2120130ed
Lightning docker image based on base-cuda (#3637)
...
* use lightning CI docker
* exclude py3.8 and torch1.3
* torch 1.7
* mergify
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-09-24 23:14:15 +02:00
Jirka Borovec
37a59be21b
build more docker configs (#3533)
...
* update build cases
* list
* matrix
* matrix
* builds
* docker
* -j1
* -q
* -q
* sep
* docker
* docker
* mergify
* -j1
* -j1
* horovod
* copy
2020-09-23 01:41:35 +02:00
Jeff Yang
8be79a9a96
stable, dev PyTorch in Dockerfile and conda gh actions (#3074)
...
* dockerfile and actions file
* dockerfile and actions file
* added pytorch conda cpu nightly
* added pytorch conda cpu nightly
* recopy base reqs
* gh action `include` torch nightly
* add pytorch nightly & conda gh badge
* rebase
* fix horovod
* proposal refactor
* Update .github/workflows/ci_pt-conda.yml
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update .github/workflows/ci_pt-conda.yml
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update
* update
* fix cmd
* filled &&
* fix
* add -y
* torchvision >0.7 allowed
* explicitly install torchvision
* use HOROVOD_GPU_OPERATIONS env variable
* CI
* skip 1.7
* table
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-09-17 20:30:39 +02:00
Jirka Borovec
cbc4f6f8a4
add CI for building dockers (#3383)
...
* rename
* fix badges
* add docker build
* mergify
* update
* env
* ci
* times
* CI
* name
* comment
2020-09-10 18:38:29 -04:00
Jirka Borovec
9f2b29a7cd
build XLA with py3.6 (#2863)
...
* build py3.6
* info
* conda
* update
* version
* version
* builds
* builds
* builds
* builds
* builds
2020-08-15 15:39:44 -04:00
Jirka Borovec
a6e7aa7796
allow using apex with any PT version (#2865)
...
* wip
* setup
* type
* name
* wip
* docs
* imports
* fix if
* fix if
* use_amp
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* fix tests
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* fix tests
* todos
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-08-08 11:07:32 +02:00
Jirka Borovec
448be60701
update GPU to PT 1.5 (#2779)
...
* update gpu PT 1.6
* fix docker
* use PT 1.5
* Update tests/install_AMP.sh
Co-authored-by: Nathan Raw <nxr9266@g.rit.edu>
2020-08-02 08:14:53 -04:00
Jirka Borovec
bc7a08fbe0
test dockers & add AMP in pt-1.6 (#1584)
...
* exist images
* names
* images
* args
* pt 1.6 dev
* circleci
* update
* refactor
* build
* fix
* MKL
2020-07-31 08:23:13 -04:00
zcain117
d0b8e850a4
integrate with CircleCI (#2486)
...
* add circleCI
* wip
* CircleCI setup that worked on my private repo. Use a working pytorch-lightning commit
* Fix the orb imports
* Update circleci header comment
* Try to pull the GITHUB_REF from the CI_PULL_REQUEST
* Use null instead of space for 'sed'
* Add TODO for codecov
* Remove echo of GKE_CLUSTER since it will be redacted by CircleCI.
* Try running codecov upload.
* Try using codecov orb
* Use pip install codecov
* Use codecov orb again since it should be approved
* dockers/tpu-tests/Dockerfile
* action
* suggestions
* drop suggestion
* suggestion
Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-23 12:13:10 -04:00
Jirka Borovec
fb85d493d0
use XLA base image for TPU testing (#2536)
...
* drop py3.6
* use base image
* typo
* skip extra
* drop cache
2020-07-07 07:05:17 -04:00
Jirka Borovec
977df6ed31
Docker: building XLA base image (#2494)
...
* refactor
* add TPU base
* wip
* builds
* typo
* extras
* simple
* unzip
* rename
2020-07-06 14:21:36 -04:00