Commit Graph

16 Commits

Author SHA1 Message Date
Jirka Borovec 7e2e874d95
Refactor: legacy accelerators and plugins (#5645)
* tests: legacy

* legacy: accel

* legacy: plug

* fix imports

* mypy

* flake8
2021-01-26 20:04:36 -05:00
Jirka Borovec 9dd04028d5
tests for legacy checkpoints (#5223)
* wip

* generate

* clean

* tests

* copy

* download

* download

* download

* download

* download

* download

* download

* download

* download

* download

* download

* flake8

* extend

* aws

* extension

* pull

* pull

* pull

* pull

* pull

* pull

* pull

* try

* try

* try

* got it

* Apply suggestions from code review

(cherry picked from commit 72525f0a83)
2021-01-26 14:27:56 +01:00
Jirka Borovec 9be04c1c0b
try to update failing dockers (#5611)
2021-01-25 17:10:56 -05:00
Jirka Borovec 5119013c81
drop install FairScale for TPU (#5113)
* drop install FairScale for TPU

* typo

Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2021-01-05 09:58:37 +01:00
Lezwon Castelino 12cb9942a1
Tpu save (#4309)
* convert xla tensor to cpu before save

* move_to_cpu

* updated CHANGELOG.md

* added on_save to accelerators

* if accelerator is not None

* refactors

* change filename to run test

* run test_tpu_backend

* added xla_device_utils to tests

* added xla_device_utils to test

* removed tests

* Revert "added xla_device_utils to test"

This reverts commit 0c9316bb

* fixed pep

* increase timeout and print traceback

* lazy check tpu exists

* increased timeout
removed barrier for tpu during test
reduced epochs

* fixed torch_xla imports

* fix tests

* define xla utils

* fix test

* aval

* chlog

* docs

* aval

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-02 13:05:11 +00:00
Jirka Borovec 597dfa174c
build dockers XLA 1.7 (#4891)
* build XLA 1.7

* night XLA 1.7

* rename

* use 1.7

* tpu ver
2020-11-29 15:14:19 -04:00
Jirka Borovec bddc6cd77a
pytest default color (#4703)
* pytest default color

* time

Co-authored-by: chaton <thomas@grid.ai>
2020-11-18 10:53:44 +00:00
Jirka Borovec 7940ea5aaf
CI: TPU drop install horovod (#4622)
Co-authored-by: chaton <thomas@grid.ai>
2020-11-13 11:33:52 +01:00
Jirka Borovec ce8abd6255
Drone: use nightly build cuda docker images (#3658)
* upgrade PT version

* update docker

* docker

* try 1.5

* badge

* fix typo: dor -> for (#3918)

* prune

* prune

* env

* echo

* try

* notes

* env

* env

* env

* notes

* docker

* prune

* maintainer

* CI

* update

* just 1.5

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* docker

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* CI

* push

* try

* prune

* CI

* CI

* CI

* CI

Co-authored-by: Klyukin Valeriy <mr.clyukin@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-26 10:47:09 +00:00
chaton 829d90b257
activated color in all pytest runs (#4254)
* activated color in all pytest runs

* Update .drone.yml

Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-20 16:38:17 +02:00
Jirka Borovec d3567c33a6
move base req. to root (#4219)
* move base req. to root

* check-manifest

* check-manifest

* manifest

* req
2020-10-18 20:40:18 +02:00
Jirka Borovec 1160270882
fix path in CI for release & python version in all dockers & duplicated badges (#3765)
* typo

* path

* check

* trigger

* fix conda

* pip ver

* fix cuda

* fix XLA

* fix xla

* ci

* docker

* BIULD

* unBIULD

* update

* py 3.8

* apex

* apex
2020-10-02 05:26:21 -04:00
Jirka Borovec ab508dae0c
run TPU tests with multiple versions (#3024)
* rename

* multi build

* multi build

* copy

* copy

* copy

* copy

* copy

* copy

* clean

* note

* docker

* formatting

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-09-30 08:36:02 -04:00
zcain117 d0b8e850a4
integrate with CircleCI (#2486)
* add circleCI

* wip

* CircleCI setup that worked on my private repo. Use a working pytorch-lightning commit

* Fix the orb imports

* Update circleci header comment

* Try to pull the GITHUB_REF from the CI_PULL_REQUEST

* Use null instead of space for 'sed'

* Add TODO for codecov

* Remove echo of GKE_CLUSTER since it will be redacted by CircleCI.

* Try running codecov upload.

* Try using codecov orb

* Use pip install codecov

* Use codecov orb again since it should be approved

* dockers/tpu-tests/Dockerfile

* action

* suggestions

* drop suggestion

* suggestion

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-07-23 12:13:10 -04:00
Jirka Borovec fb85d493d0
use XLA base image for TPU testing (#2536)
* drop py3.6

* use base image

* typo

* skip extra

* drop cache
2020-07-07 07:05:17 -04:00
Jirka Borovec 977df6ed31
Docker: building XLA base image (#2494)
* refactor

* add TPU base

* wip

* builds

* typo

* extras

* simple

* unzip

* rename
2020-07-06 14:21:36 -04:00