Commit Graph

214 Commits

Author SHA1 Message Date
Adrian Wälchli 77eef8aff5
Update GPU CI and docker images for PyTorch 2.1 (#18719)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-10-06 08:12:37 -04:00
Adrian Wälchli d31ef1f7d3
Drop support for PyTorch 1.11 (#18691)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-10-04 20:30:44 +02:00
Jirka Borovec 358336268f
enable codespell for docs & fixing +TPU (#18629)
* precommit/codespell

* run

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* disable

* more fixing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Apply suggestions from code review

* more fixing

* json

* note

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-09-26 11:54:44 -04:00
Jirka Borovec 4265c11e8c
docker: CUDA with runtime (#17977) 2023-07-03 17:39:09 +02:00
Jirka Borovec e3193d624c
docker pip uses no-cache-dir (#17896) 2023-06-26 18:12:53 +01:00
Jirka Borovec cb17fe90ec
docs: adjust base image to ubuntu20.04 (#17846) 2023-06-16 17:02:43 +02:00
Jirka Borovec 1f670a5cbd
docker: NGC prune git (#17740) 2023-06-02 02:22:59 +02:00
Jirka Borovec 51b0e81105
replace local adjustment script with external (#17582) 2023-05-29 19:34:04 +00:00
Jirka Borovec 694edaa507
drop IPU docker for runner (#17583) 2023-05-09 18:30:21 +02:00
Eric Lam 989ddeaa32
feat: add LargeFileManager configuration to Jupyter in Dockerfile (#17553) 2023-05-04 13:35:44 +02:00
Jirka Borovec db9f095b0b
Replace IPU with external implementation (#17075)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-04-27 16:09:51 +00:00
Carlos Mocholí 1145c450b5
Remove devel.txt requirements file (#17466) 2023-04-25 07:21:21 +00:00
Carlos Mocholí 9627121da7
Simplify strategy installation in CI (#17347) 2023-04-25 01:09:57 +00:00
Carlos Mocholí 856b29fc72
[TPU] Replace GKE in CI with manual gcloud usage (#17362) 2023-04-14 12:47:31 +00:00
Carlos Mocholí b2717f6878
[TPU] Improve TPU workflow (#17237)
* Trigger TPU tests if [TPU] is in the PR title

* Remove TODO

* checkgroup

* DEBUG

* Update

* 1h timeout

* Update

* Update

* Update

* Update

* Remove DEBUG
2023-04-11 16:33:32 +02:00
Jirka Borovec e7ef8db57e
Replace HPU with external implementation (#17067)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-03-24 13:30:53 +01:00
Carlos Mocholí 2b6f0a15e1
Update CI to pull torch 2.0 stable (#17107) 2023-03-21 12:31:05 +00:00
Jirka Borovec 2f17d1b999
docker: fix Torch url (#16927) 2023-03-02 20:15:27 +01:00
Justus Schock b4e29e0c8f
PL: Test PyTorch 2.0 pre-release on CPU and CUDA (#16764)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-02-28 18:38:40 +01:00
Adrian Wälchli ad698f049b
Update Colossal AI docs and integration (#16778) 2023-02-16 16:14:24 +00:00
Adrian Wälchli 3b7f186a05
Update colossalai version in Dockerfile (#16766)
update docker
2023-02-15 14:20:13 -05:00
Adrian Wälchli 565d3bb8c6
CI: Update colossalai version (#16747)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-14 14:07:29 -05:00
Adrian Wälchli c4074419b5
Remove the BaguaStrategy (#16746)
* remove bagua

* remove

* remove docker file entry
2023-02-14 08:58:58 -05:00
Jirka Borovec 770b792925
copyright Lightning AI team (#16647)
* copyright Lightning AI team

* more...
2023-02-06 15:26:51 +01:00
Jerome Anand a2b9e8c4f6
Upgrade to HPU release 1.8.0 (#16621) 2023-02-03 10:33:57 +01:00
Carlos Mocholí 6f7276b67a
Fix TPU CI (#16613) 2023-02-02 20:01:53 +01:00
Jirka Borovec 377210d85d
tests: switch imports for fabric (#16592) 2023-02-01 20:34:38 +00:00
Carlos Mocholí ef2a6088ff
Drop support for PyTorch 1.10 (#16492)
* Drop support for PyTorch 1.10

* CHANGELOG

* READMEs

* mypy

* ls

* New poplar version

* Fixed tests

* links

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* skip azure badges

* Table

* Matching dockerfiles

* Drop unnecessary channels and packages

* Push nightly

* Undo unrelated changes

* Revert "Push nightly"

This reverts commit 9618f737c4.

---------

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-01 14:09:12 -05:00
Carlos Mocholí 854db60269
Only run TPU standalone tests (#16586) 2023-02-01 13:23:48 +01:00
Carlos Mocholí 59f2d4ce63
Install colossalai==0.1.12 in CI (#16587) 2023-02-01 04:57:22 +00:00
Carlos Mocholí dc298f2340
Drop support for Python 3.7 (#16579)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-02-01 01:36:42 +00:00
Carlos Mocholí ce99602d50
Remove horovod (#16150) 2023-01-19 18:39:36 +01:00
Carlos Mocholí f17cacb04b
Remove nvidia/apex (#16149) 2023-01-19 18:39:36 +01:00
Jirka Borovec 3828f527dd
hotfix build docs + docker (#16351) 2023-01-14 03:02:10 +00:00
Carlos Mocholí c18a0ec819
Remove untested NVIDIA Dali example (#16306) 2023-01-10 14:11:08 +01:00
Jirka Borovec 3326e65bb2
Update CI to CUDA 11.7.1 (#16123)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-01-04 22:44:23 +00:00
Carlos Mocholí 15ef52bc73
Rename LightningLite to Fabric (#16244)
* Rename LightningLite to Fabric

* Fix introspection test

* Fix deprecated Lite tests

* Undo accidental Horovod removal

* Fixes
2023-01-04 10:57:18 -05:00
Shashwat Agrawal 574a951601
docs: updated broken links (#16191)
Co-authored-by: Shashwat <shashwat>
Fixes https://github.com/Lightning-AI/lightning/issues/16186
2022-12-24 04:18:58 +01:00
Jirka Borovec 186b799a62
ci: adjust version in all requirements (#16100) 2022-12-22 06:21:52 +00:00
Jerome Anand 8475f85e16
Upgrade to HPU release 1.7.1 (#15956)
* Upgrade to HPU release 1.7.1
Update torch version check for hpu

Signed-off-by: Jerome <janand@habana.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-12-08 11:43:22 +00:00
Jirka Borovec 77006a20e6
CI: parameterize TPU tests (#15876)
* update
* param
* Apply suggestions from code review
2022-12-06 17:00:15 +00:00
Jirka Borovec 61ee3fabc3
PKG: distribute single semver (#15374)
* global
* distrib ver
* codeowners
* Apply suggestions from code review

Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-11-12 15:36:36 +00:00
Adrian Wälchli e87c11a592
Upgrade GPU CI to PyTorch 1.13 (#15583)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-11-12 14:58:37 +00:00
Carlos Mocholí a3edbec501
Delete unused TPU CI files (#15611)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2022-11-11 18:30:02 +00:00
Carlos Mocholí 6ba00af1e0
Drop PyTorch 1.9 support (#15347)
* Drop 1.9

* Everything else

* READMEs

* Missed some

* IPU skips

* Remove exception type

* Add back
2022-11-10 08:59:13 -05:00
Jerome Anand e79a69a9ee
Upgrade to HPU release 1.7.0 (#15616)
Signed-off-by: Jerome <janand@habana.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-11-10 10:47:17 +01:00
Jirka Borovec fb9dae8df3
ci: update install lite & cut pkg dependency (#14517)
* ci: update install lite

* try without lite in req file

* ci: install

* app

* init

* Revert "app"

This reverts commit f3f09e7888.

* ci: cpu

* ci: gpu

* pkg

* env

* bench

* trigger

* notes

* prune

* set version

* fix version

* git reset

* hpu, ipu

* adjust

* --hard

* git checkout

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* rc2

* L

* docs

* hpu

Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2022-10-31 20:50:51 +01:00
Carlos Mocholí 7f3e9de726
Fix TPU tests on master builds (#15349) 2022-10-31 15:58:02 +00:00
Jirka Borovec 95ae393ca8
LAI: creating mirror package (#15105)
* placeholder

* mirror + prune

* makedir

* setup

* ci

* ci

* name

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ci clean

* empty

* py

* parallel

* doctest

* flake8

* ci

* typo

* replace

* clean

* Apply suggestions from code review

* re.sub

* fix UI path

* full replace

* ui path?

* replace

* updates

* regex

* ci

* fix

* ci

* path

* ci

* replace

* Update .actions/setup_tools.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* also convert lightning_lite tests for PL tests to adapt mocking paths

* fix app example test

* update logger propagation for PL tests

* update logger propagation for PL tests

* Apply suggestions from code review

* Revert "update logger propagation for PL tests"

This reverts commit c1a5e119c7.

* playwright

* py

* update import in tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* try edit import in overwrite

* debug code

* rev playwright

* Revert "try edit import in overwrite"

This reverts commit c02f766521.

* ci: adjust examples

* adjust examples cloud

* mock lightning_app

* Install assistant dependencies

* lightning

* setup

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Apply suggestions from code review

* disable cache

* move doctest to install

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* )

* echo ./

* ci

* lru

* revert disabling cache, prints

* ci

* prune ci jobs

* prune ci jobs

* training loop standalone tests

* add sys modules cleanup fixture

* make use of fixture

* revert standalone

* ci e2e

* fix imports in lightning

* fix imports of lightning in tests

* Revert "make use of fixture"

This reverts commit c15efdd205.

* Revert other commits for fixtures

* revert use of fixture

* py3.9

* fix mocking

* fix paths

* hack mocking

* docs

* Apply suggestions from code review

* rev suggestion

* Minor changes to the parametrizations

* Update checkgroup with the new and changed jobs

* include frontend dir

* cli

* fix imports and entry point

* Revert standalone

* rc1

* e2e on staging

* Revert "Revert standalone"

This reverts commit 9df96685b8.

* groups

* to

* ci: pt ver

* docker

* Apply suggestions from code review

* Copy over changes from previous commit to other groups

* Add back changes from bad merge

* Uppercase step name everywhere

* update

* ci

* ci: lai oldest

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
Co-authored-by: manskx <ahmed.mansy156@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2022-10-27 12:32:49 +02:00
Carlos Mocholí 375ab53861
Migrate TPU tests to GitHub actions (#14687)
* Migrate TPU tests to GitHub actions

* No working dir

* Keep _target

* Dont skip draft

* CHECK_SLEEP

* Not yet

* Remove recurrent cleanup script

* Set secrets

* a step cannot have both the `uses` and `run` keys

* Version $PYTHON_VER was not found in the local cache

* can't load package ... ($GOPATH not set)

* The `set-env` command is disabled

* Try updating go

* Match timeout

* simplify path

* More cleanup

* Install coverage. Unmark draft

* Update .github/workflows/ci-pytorch-test-tpu.yml

* DEBUG echo

* Revert "DEBUG echo"

This reverts commit 4011856e6e.

* More debug

* SSH

* Im stupid

* Remove always()

* Forgot some

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2022-10-21 20:01:39 +02:00