91 Commits

Author SHA1 Message Date
Luca Antiga b7099851b7
Install setuptools via pip in base Docker image (#20487) 2024-12-10 15:40:40 +01:00
Luca Antiga caa9e1e594
Remove deprecated distutils (#20481)
* Remove deprecated distutils

* Fix format

* Fix package name

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-09 16:00:04 +01:00
awaelchli bf25167bbf
Add testing for PyTorch 2.4 (Trainer) (#20010) 2024-07-11 06:52:56 -04:00
Adrian Wälchli 49ed2b102b
Add PyTorch 2.3 to CI matrix (#19708) 2024-04-29 07:16:13 -04:00
Jirka Borovec 3bd133b107
CI: enable testing with coming PT 2.2 (#19289)
* ci: build dockers for PT 2.2
* py3.12
* --pre --extra-index-url
* typing-extensions
* bump jsonargparse
* install latest jsonargparse
* Add windows skips for Fabric
* convert to xfail
* add pytorch skips
* skip checkpoint consolidation test
* set max torch

---------

Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-01-26 16:42:09 +01:00
Adrian Wälchli 77eef8aff5
Update GPU CI and docker images for PyTorch 2.1 (#18719)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-10-06 08:12:37 -04:00
Jirka Borovec 4265c11e8c
docker: CUDA with runtime (#17977) 2023-07-03 17:39:09 +02:00
Jirka Borovec e3193d624c
docker pip uses no-cache-dir (#17896) 2023-06-26 18:12:53 +01:00
Jirka Borovec 51b0e81105
replace local adjustment script with external (#17582) 2023-05-29 19:34:04 +00:00
Carlos Mocholí 1145c450b5
Remove devel.txt requirements file (#17466) 2023-04-25 07:21:21 +00:00
Carlos Mocholí 9627121da7
Simplify strategy installation in CI (#17347) 2023-04-25 01:09:57 +00:00
Carlos Mocholí 2b6f0a15e1
Update CI to pull torch 2.0 stable (#17107) 2023-03-21 12:31:05 +00:00
Jirka Borovec 2f17d1b999
docker: fix Torch url (#16927) 2023-03-02 20:15:27 +01:00
Justus Schock b4e29e0c8f
PL: Test PyTorch 2.0 pre-release on CPU and CUDA (#16764)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-02-28 18:38:40 +01:00
Adrian Wälchli ad698f049b
Update Colossal AI docs and integration (#16778) 2023-02-16 16:14:24 +00:00
Adrian Wälchli 3b7f186a05
Update colossalai version in Dockerfile (#16766)
update docker
2023-02-15 14:20:13 -05:00
Adrian Wälchli 565d3bb8c6
CI: Update colossalai version (#16747)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-14 14:07:29 -05:00
Adrian Wälchli c4074419b5
Remove the BaguaStrategy (#16746)
* remove bagua

* remove

* remove docker file entry
2023-02-14 08:58:58 -05:00
Jirka Borovec 770b792925
copyright Lightning AI team (#16647)
* copyright Lightning AI team

* more...
2023-02-06 15:26:51 +01:00
Carlos Mocholí 59f2d4ce63
Install colossalai==0.1.12 in CI (#16587) 2023-02-01 04:57:22 +00:00
Carlos Mocholí ce99602d50
Remove horovod (#16150) 2023-01-19 18:39:36 +01:00
Carlos Mocholí f17cacb04b
Remove nvidia/apex (#16149) 2023-01-19 18:39:36 +01:00
Jirka Borovec 3828f527dd
hotfix build docs + docker (#16351) 2023-01-14 03:02:10 +00:00
Carlos Mocholí c18a0ec819
Remove untested NVIDIA Dali example (#16306) 2023-01-10 14:11:08 +01:00
Jirka Borovec 3326e65bb2
Update CI to CUDA 11.7.1 (#16123)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-01-04 22:44:23 +00:00
Jirka Borovec 186b799a62
ci: adjust version in all requirements (#16100) 2022-12-22 06:21:52 +00:00
Adrian Wälchli e87c11a592
Upgrade GPU CI to PyTorch 1.13 (#15583)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-11-12 14:58:37 +00:00
Carlos Mocholí 6ba00af1e0
Drop PyTorch 1.9 support (#15347)
* Drop 1.9

* Everything else

* READMEs

* Missed some

* IPU skips

* Remove exception type

* Add back
2022-11-10 08:59:13 -05:00
ver217 2fef6d9403
Add ColossalAI strategy (#14224)
Co-authored-by: HELSON <c2h214748@gmail.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: otaj <ota@lightning.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-10-11 13:59:09 +02:00
Rui Wang 40868f7f43
Add bagua support for CUDA 11.6 images (#14529)
* Add support for bagua-cuda116

* Remove bagua-cuda115 from installation

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-09-09 20:07:25 +00:00
otaj 1ae14ca754
[CI] fix horovod tests (#14382) 2022-08-25 17:30:06 +00:00
otaj bb634310e7
[CI] Bump CUDA in Docker images to 11.6.1 (#14348)
* bump cuda in docker images to 11.6.1

* PUSH TO HUB. REVERT THIS!

* conda forge for 11.6

* cuda 11.5

* revert conda changes

* 11.6 back again

* 11.6 back again, all of them

* maybe all passes now

* maybe all passes now

* final push

* Revert "PUSH TO HUB. REVERT THIS!"

This reverts commit 602bfce224.

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-08-23 12:10:52 -04:00
Carlos Mocholí 1299e4f984
Run GPU tests with PyTorch 1.12 (#13716)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-07-28 19:37:57 +05:30
Carlos Mocholí ad87d2cad0
Future 5/n: Move requirements (#13306)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-06-21 17:11:33 +02:00
Carlos Mocholí 0cf9d73d28
Drop PyTorch 1.8 support (#13155)
* Drop PyTorch 1.8 support

* Missed update

* Skip profiler test until supported

* Upgrade ipu dockerfile pytorch version

* Update XLA version
2022-06-14 20:46:44 -04:00
Jirka Borovec fab2ff35ad
CI: Azure - multiple configs (#12984)
* CI: Azure - multiple configs
* names
* benchmark
* Apply suggestions from code review

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-14 01:59:03 +00:00
Jirka Borovec fec9a09672
add freeze for development and full range for install (#12994)
* freeze versions

* unfreeze

* dependabot

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fix all req

* ...

* use base

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix refs

* Apply suggestions from code review

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* Apply suggestions from code review

* dockers

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-05-12 09:14:18 -04:00
Jirka Borovec 783ec43a85
parse strategies as own extras (#12975)
* parse strategies as own extras

* prune devel

* Update Makefile

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* revert parse_requirements

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-09 09:25:53 -04:00
Jirka Borovec 7ce948edb6
Unpin CUDA docker image for GPU CI (#12373)
* unpin CUDA docker image for GPU CI
* Apply suggestions from code review

Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Akihiro Nitta <akihiro@pytorchlightning.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-05-06 02:56:57 +00:00
Jirka Borovec bb51e2a55b
Merge pull request #12723 from PyTorchLightning/req/strategies
Separate strategies' requirements
2022-05-04 10:06:02 -04:00
Akihiro Nitta ecd135e939
Update nvidia gpg key to fix nightly docker builds (#12930)
* Update gpg key
* Use curl instead of wget
* Install key manually
2022-05-02 09:00:44 +02:00
Akihiro Nitta ace6a5827b
Update building docker images (#12837)
Co-authored-by: Akihiro Nitta <akihiro@pytorchlightning.ai>
2022-04-21 22:10:42 +00:00
Jirka Borovec f9b69ce5b0
CI: check docker requires (#12677)
* check docker requires
* ci update
* bagua
* conda
* cuda
2022-04-12 00:29:54 +09:00
Jirka Borovec fe940e195d
CI: update prune_pkgs (#12382) 2022-03-21 12:50:50 +00:00
four4fish 1eff3b53c1
Update fairscale version (#11567)
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-03-21 11:38:55 +00:00
Jirka Borovec efa870eebc
Docker: fix NCCL building Horovod (#12318)
* Horovod w. MPI
* nccl_built
* fix
2022-03-18 14:23:19 +00:00
Jirka Borovec 7ee690758c
CI: fix running PT 1.11 (#12304)
* fix fire
* horovod
* assistant
* cmake
* u20
* cuda
* -j2
* fix mypy

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-03-12 09:00:20 +00:00
Jirka Borovec 1144673cd9
CI: sanity check for req. pkgs (#11819)
* CI: sanity check for req. pkgs
* scripts
* rename
* gcsfs ?
* rich !
* install extra
* move
* set -e

Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-03-11 09:20:47 +00:00
Jirka Borovec 8577ef7bba
Skip horovod 0.24.0 only (#12248)
* try skip horovod 0.24.0 only
* HOROVOD_BUILD_CUDA_CC_LIST
* fix test

Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-03-10 16:01:08 +00:00
wangraying a0655611de
Add bagua installation in dockerfile (#11283)
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-02-24 15:17:31 +01:00