lightning/.github/workflows/ci_dockers.yml

name: Docker
# https://www.docker.com/blog/first-docker-github-action-is-here
# https://github.com/docker/build-push-action
# see: https://help.github.com/en/actions/reference/events-that-trigger-workflows
on: # Trigger the workflow on push or pull request, but only for the master and release branches
  push:
    branches: [master, "release/*"] # include release branches like release/1.0.x
  pull_request:
    branches: [master, "release/*"]
    paths:
      - "dockers/**"
      - "!dockers/README.md"
      - "requirements/*"
      - "requirements.txt"
      - "environment.yml"
      - ".github/workflows/*docker*.yml"
      - ".github/workflows/events-nightly.yml"
      - "setup.py"

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}-${{ github.head_ref }}
  cancel-in-progress: ${{ ! (github.ref == 'refs/heads/master' || startsWith(github.ref, 'refs/heads/release/')) }}
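# Illustration only: a sketch of how the two concurrency expressions above would
# resolve. The PR number and branch name below are hypothetical.
#   pull request #123 opened from branch "feature/x":
#     group              -> "Docker-refs/pull/123/merge-feature/x"
#     cancel-in-progress -> true   (a newer run for the same PR cancels the older one)
#   push to master:
#     group              -> "Docker-refs/heads/master-"   (github.head_ref is empty on push events)
#     cancel-in-progress -> false  (runs on master and release/* are never cancelled)
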
jobs:
  build-PL:
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        # the config used in '.azure-pipelines/gpu-tests.yml' since the Dockerfile uses the cuda image
        python_version: ["3.9"]
        pytorch_version: ["1.10", "1.11"]
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Build PL Docker
        # publish master/release
        uses: docker/build-push-action@v2
        with:
          build-args: |
            PYTHON_VERSION=${{ matrix.python_version }}
            PYTORCH_VERSION=${{ matrix.pytorch_version }}
          file: dockers/release/Dockerfile
          push: false
        timeout-minutes: 50

  build-XLA:
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        # the config used in '.circleci/config.yml'
        python_version: ["3.7"]
        xla_version: ["1.8"]
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Build XLA Docker
        # publish master/release
        uses: docker/build-push-action@v2
        with:
          build-args: |
            PYTHON_VERSION=${{ matrix.python_version }}
            XLA_VERSION=${{ matrix.xla_version }}
          file: dockers/base-xla/Dockerfile
          push: false
        timeout-minutes: 60

  build-CUDA:
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        include:
          # the config used in '.azure-pipelines/gpu-tests.yml'
          - {python_version: "3.7", pytorch_version: "1.8", cuda_version: "10.2", ubuntu_version: "18.04"}
          - {python_version: "3.7", pytorch_version: "1.10", cuda_version: "11.1", ubuntu_version: "20.04"}
          - {python_version: "3.7", pytorch_version: "1.11", cuda_version: "11.3.1", ubuntu_version: "20.04"}
          # latest (used in Tutorials)
          - {python_version: "3.8", pytorch_version: "1.8", cuda_version: "11.1", ubuntu_version: "20.04"}
          - {python_version: "3.8", pytorch_version: "1.9", cuda_version: "11.1", ubuntu_version: "20.04"}
          - {python_version: "3.9", pytorch_version: "1.10", cuda_version: "11.1", ubuntu_version: "20.04"}
          - {python_version: "3.9", pytorch_version: "1.11", cuda_version: "11.3.1", ubuntu_version: "20.04"}
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Build CUDA Docker
        # publish master/release
        uses: docker/build-push-action@v2
        with:
          build-args: |
            PYTHON_VERSION=${{ matrix.python_version }}
            PYTORCH_VERSION=${{ matrix.pytorch_version }}
            CUDA_VERSION=${{ matrix.cuda_version }}
            UBUNTU_VERSION=${{ matrix.ubuntu_version }}
          file: dockers/base-cuda/Dockerfile
          push: false
        timeout-minutes: 95

  build-Conda:
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        include:
          # see: https://pytorch.org/get-started/previous-versions/
          - {python_version: "3.8", pytorch_version: "1.8", cuda_version: "11.1"}
          - {python_version: "3.8", pytorch_version: "1.9", cuda_version: "11.1"}
          - {python_version: "3.8", pytorch_version: "1.10", cuda_version: "11.1"}
          - {python_version: "3.9", pytorch_version: "1.11", cuda_version: "11.3.1"}
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Build Conda Docker
        # publish master/release
        uses: docker/build-push-action@v2
        with:
          build-args: |
            PYTHON_VERSION=${{ matrix.python_version }}
            PYTORCH_VERSION=${{ matrix.pytorch_version }}
            CUDA_VERSION=${{ matrix.cuda_version }}
          file: dockers/base-conda/Dockerfile
          push: false
        timeout-minutes: 95

  build-ipu:
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        # the config used in 'dockers/ipu-ci-runner/Dockerfile'
        python_version: ["3.9"] # latest
        pytorch_version: ["1.9"]
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Build IPU Docker
        uses: docker/build-push-action@v2
        with:
          build-args: |
            PYTHON_VERSION=${{ matrix.python_version }}
            PYTORCH_VERSION=${{ matrix.pytorch_version }}
          file: dockers/base-ipu/Dockerfile
          push: false
          tags: pytorchlightning/pytorch_lightning:base-ipu-py${{ matrix.python_version }}-torch${{ matrix.pytorch_version }}
        timeout-minutes: 50
      - name: Build IPU CI runner Docker
        uses: docker/build-push-action@v2
        with:
          build-args: |
            PYTHON_VERSION=${{ matrix.python_version }}
            PYTORCH_VERSION=${{ matrix.pytorch_version }}
          file: dockers/ipu-ci-runner/Dockerfile
          push: false
        timeout-minutes: 60