Commit Graph

142 Commits

Author SHA1 Message Date
Jirka Borovec abf1d4b992
fix mock pkgs in docs (#4591)
* fix mock pkgs in docs

* sphinx

* CI

Co-authored-by: chaton <thomas@grid.ai>
2020-11-10 14:57:21 +01:00
Jeff Yang f3dfb98444
[ci] tag v1.4.1 for pypa/gh-action-pypi-publish (#4548) 2020-11-06 10:48:27 +00:00
Jeff Yang e81707ba02
[dockers] use inline cache (#4511)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-11-04 23:08:17 +01:00
Jirka Borovec 0d1365c442
release v1 (#4516) 2020-11-04 17:06:31 +00:00
Jirka Borovec fc78ffa622
extend release testing (#4506)
* extend release testing

* Drone

* also PR to release

* actions versions
2020-11-04 09:08:37 +00:00
Jeff Yang 1d594c5d0c
[docker] Lock cuda version (#4453)
* lock cuda version

* back to normal
2020-10-31 20:17:07 +06:30
Jeff Yang 0f584faa6b
PyTorch 1.7 Stable support (#3821)
* prepare for 1.7 support [ci skip]

* tpu [ci skip]

* test run 1.7

* all 1.7, needs to fix tests

* couple with torchvision

* windows try

* remove windows

* 1.7 is here

* on purpose fail [ci skip]

* return [ci skip]

* 1.7 docker

* back to normal [ci skip]

* change to some_val [ci skip]

* add seed [ci skip]

* 4 places [ci skip]

* fail on purpose [ci skip]

* verbose=True [ci skip]

* use filename to track

* use filename to track

* monitor epoch + changelog

* Update tests/checkpointing/test_model_checkpoint.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-10-30 15:42:14 +00:00
Jirka Borovec ce8abd6255
Drone: use nightly build cuda docker images (#3658)
* upgrade PT version

* update docker

* docker

* try 1.5

* badge

* fix typo: dor -> for (#3918)

* prune

* prune

* env

* echo

* try

* notes

* env

* env

* env

* notes

* docker

* prune

* maintainer

* CI

* update

* just 1.5

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* docker

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* push

* try

* prune

* CI

* CI

* CI

* CI

Co-authored-by: Klyukin Valeriy <mr.clyukin@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-26 10:47:09 +00:00
Jeff Yang d83c4e4d69
Cache docker builds (#3659)
* parent faa357648f
author ydcjeff <ydcjeff@outlook.com> 1601049378 +0630
committer ydcjeff <ydcjeff@outlook.com> 1601469495 +0630

cache docker builds

lock horovod at 0.19.5

done [ci skip] [CI SKIP]

use --cache-from [ci skip]

typo and horovod [ci skip]

exclude pt 1.3 py3.8 [ci skip]

conda no cache [ci skip]

fix

* revert

* align with master [ci skip]

* retry

* remove empty continuation lines

* add comment

* fix build-args
2020-10-25 18:46:10 +06:30
Jirka Borovec e0e402dbe6
Docs/changelog for 1.0.3 (#4267)
* formatting

* miss

* missing & ver++

* path
2020-10-21 00:53:10 +02:00
chaton 829d90b257
activated color in all pytest runs (#4254)
* activated color in all pytest runs

* Update .drone.yml

Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-20 16:38:17 +02:00
Jirka Borovec 9edef4023c
prune ignore (#4240)
* prune ignore

* try drop loggers
2020-10-20 13:32:39 +01:00
Jirka Borovec f37444fa3e
CI: add flake8 (#4239) 2020-10-19 21:20:17 +01:00
Jirka Borovec 7c4f80a1af
allow codecov upload to fail (#4221) 2020-10-19 09:28:17 +02:00
Jirka Borovec d3567c33a6
move base req. to root (#4219)
* move base req. to root

* check-manifest

* check-manifest

* manifest

* req
2020-10-18 20:40:18 +02:00
Jirka Borovec 05cb6fcc58
Update ci_dockers.yml (#3935) 2020-10-07 08:26:07 -04:00
Jirka Borovec 7f4a9b75f3
skip some docker builds (temporally pass) (#3913)
* skip some docker builds

* todos

* skip
2020-10-06 17:29:43 -04:00
Jirka Borovec 064ae53d63
nb steps in early stop (#3909)
* nb steps

* if

* skip

* rev

* seed

* seed
2020-10-06 15:20:08 -04:00
Jirka Borovec f55a9cf63a
fic CI parsing Horovod version (#3804) 2020-10-06 17:18:16 +02:00
Jeff Yang b76fc5bae5
use docker for conda CI (#3841)
* use docker in conda CI

* update env if needed

* update with pip

* remove setting pytorch
2020-10-04 13:18:20 -04:00
zcain117 0c12065efd
[TPU CI] Use timestamp+pythonVersion to form the docker image tag. (#3779)
* Use timestamp+pythonVersion to form the docker image tag.

* Remove temporary step to check new env var.
2020-10-02 16:22:47 +02:00
Jirka Borovec 1160270882
fix path in CI for release & python version in all dockers & duplicated badges (#3765)
* typo

* path

* check

* trigger

* fix conda

* pip ver

* fix cuda

* fix XLA

* fix xla

* ci

* docker

* BIULD

* unBIULD

* update

* py 3.8

* apex

* apex
2020-10-02 05:26:21 -04:00
Jirka Borovec a5f28ced13
nightly release to tests (#3718) 2020-09-30 08:37:52 -04:00
Jirka Borovec ab508dae0c
run TPU tests with multiple versions (#3024)
* rename

* multi build

* multi build

* copy

* copy

* copy

* copy

* copy

* copy

* clean

* note

* docker

* formatting

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-09-30 08:36:02 -04:00
Jirka Borovec a94728c99b
spec Horovod version (#3661)
* spec Horovod version

* MAKEFLAGS="-j2"

* tests

* CI

* docker

* CI

* docker
2020-09-26 19:30:25 +02:00
Jeff Yang 05e5f03fd7
Enable PyTorch 1.7 in conda CI (#3541)
* enable pt 1.7

* readme

* nightly diff version testing, will delete later

* nightly diff version testing, will delete later

* back to normal [ci skip]

* use __ignored_properties__

* define __ignored_properties__ in respective modules

* change log

* formatting

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-09-25 16:20:15 +02:00
Jirka Borovec 0784cf3ab4
dockers nightly (#3615)
* dockers nightly

* typo

* Apply suggestions from code review

Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-09-25 15:58:01 +02:00
Jirka Borovec a25cb300d8
fix building nightly (#3642) 2020-09-25 08:15:06 -04:00
Jeff Yang a2120130ed
Lightning docker image based on base-cuda (#3637)
* use lightning CI docker

* exclude py3.8 and torch1.3

* torch 1.7

* mergify

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-09-24 23:14:15 +02:00
Jirka Borovec aa52c930f4
test examples (#3643)
* test examples

* testing

* testing

* typo

* req

* exception

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-09-24 17:33:11 +02:00
Jirka Borovec 37a59be21b
build more docker configs (#3533)
* update build cases

* list

* matrix

* matrix

* builds

* docker

* -j1

* -q

* -q

* sep

* docker

* docker

* mergify

* -j1

* -j1

* horovod

* copy
2020-09-23 01:41:35 +02:00
Jirka Borovec 0284f7ab5a
nightly releases (#3552)
* nightly

* nightly

* ls
2020-09-19 18:28:34 -04:00
Jeff Yang 8be79a9a96
stable, dev PyTorch in Dockerfile and conda gh actions (#3074)
* dockerfile and actions file

* dockerfile and actions file

* added pytorch conda cpu nightly

* added pytorch conda cpu nightly

* recopy base reqs

* gh action `include` torch nightly

* add pytorch nightly & conda gh badge

* rebase

* fix horovod

* proposal refactor

* Update .github/workflows/ci_pt-conda.yml

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update .github/workflows/ci_pt-conda.yml

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update

* update

* fix cmd

* filled &&

* fix

* add -y

* torchvision >0.7 allowed

* explicitly install torchvision

* use HOROVOD_GPU_OPERATIONS env variable

* CI

* skip 1.7

* table

Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-09-17 20:30:39 +02:00
Jirka Borovec 7b64472ced
fix lib paths after Wandb 0.10 (#3520)
* try

* try

* drop 0.20

* drop 0.19.5

* -U

* Fixed Horovod in CI due to wandb==0.10.0 sys.path modifications (#3525)

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* format

* wb freeze

* types

Co-authored-by: Travis Addair <taddair@uber.com>
2020-09-17 08:37:49 -04:00
Jirka Borovec c64520e658
fix tensorboard version (#3132)
* tensorboard version

* WIP test tb hparams logs (#3040)

* optional

* req

* tensorboard>=2.2.0

* data

* data

* TB

Co-authored-by: Rosario Scalise <rosario@cs.washington.edu>
2020-09-15 23:48:48 +02:00
Jirka Borovec 61b31d94b4
build docs on master (#3492)
* build docs on master

* fomatting
2020-09-15 05:55:03 -04:00
Jirka Borovec cbc4f6f8a4
add CI for building dockers (#3383)
* rename

* fix badges

* add docker build

* mergify

* update

* env

* ci

* times

* CI

* name

* comment
2020-09-10 18:38:29 -04:00
Jirka Borovec cd40cb2fad
ignore types in files (#3409)
* ignore types in files

* CI timeout
2020-09-09 07:11:53 -04:00
Jirka Borovec 9f2b29a7cd
build XLA with py3.6 (#2863)
* build py3.6

* info

* conda

* update

* version

* version

* builds

* builds

* builds

* builds

* builds
2020-08-15 15:39:44 -04:00
Jirka Borovec 4354690e55
add apex test (#2921)
* add apex test

* rename

* level

* events

* wrap

* evt

* miss

* apex

* apex

* apex

* apex

* apex

* apex

* Update tests/models/test_amp.py

Co-authored-by: William Falcon <waf2107@columbia.edu>

* notes

* notes

Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-08-13 10:03:13 -04:00
zcain117 580a5bd1df
Use kubectl to get logs from TPU CI instead of gcloud logging. (#2918)
* Use kubectl to get logs from TPU CI instead of gcloud logging.

* Update Github Action to read logs from kubectl rather than gcloud logging.
2020-08-11 19:30:56 -04:00
Jirka Borovec 91b0d46cd5
do not fails all dockers (#2861) 2020-08-07 09:10:35 -04:00
Jirka Borovec ad956b5ed9
do not fails all dockers (#2860) 2020-08-07 14:14:22 +02:00
Jirka Borovec ea658e300c
Tests/install pkg (#2835)
* add install matrix

* nb tests

* win

* cfg

* torch

* link

* Update .github/workflows/install-pkg.yml

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* try

* try

* try

* try

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-08-07 03:08:23 -04:00
Jirka Borovec 3772601cd6
update CI testing with pip upgrade (#2380)
* try pt1.5

* cpu

* upgrade

* tpu

* user

* [blocked by #2380] freeze GPU PT 1.4 (#2780)

* freeze

* user
2020-07-31 14:50:06 -04:00
Jirka Borovec bc7a08fbe0
test dockers & add AMP in pt-1.6 (#1584)
* exist images

* names

* images

* args

* pt 1.6 dev

* circleci

* update

* refactor

* build

* fix

* MKL
2020-07-31 08:23:13 -04:00
Jirka Borovec b88fc43871
re-enable skipped tests (#2762)
* re-enable skipped

* timeout
2020-07-31 07:52:17 -04:00
Jirka Borovec fcfdb4df13
conda speedup (#2546)
* conda speedup

* cache

* add pip cache

* suggestion

* cache

* cache

* req
2020-07-31 06:31:23 -04:00
Jirka Borovec 06e8910f06
pytorch 1.6 (#2745)
* pt 1.6

* don't use the new zipfile serialization for now

* quick flake8 fixes

* remove unnecessary f

* coalesce strings

* remove comma

* remove extra commas

* Apply suggestions from code review

Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>

* set _use_new_zipfile_serialization to False only for pytorch 1.6.0

* remove unnecessary comments

* flake8 fixes

* use pkg_resources instead of packaging

* readme

* format

* version

* chlog

Co-authored-by: Peter Yu <peter@asapp.com>
Co-authored-by: Peter Yu <2057325+yukw777@users.noreply.github.com>
2020-07-31 11:18:32 +02:00
Jirka Borovec bc833fbf52
Horovod & py3.8 (#2764) 2020-07-30 23:39:07 +02:00