16 Commits

Author SHA1 Message Date
Jirka Borovec edea0d4bc3
switch azure pool (#10266) 2021-11-01 11:42:11 +00:00
Carlos Mocholí 3a4e9970d6
Pin fairscale version (#10200) 2021-10-27 23:24:17 +00:00
Kaushik B 5e8829b97d
(1/n) tests: Use strategy flag instead of accelerator for training strategies (#9931)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-16 20:40:25 +05:30
Sean Naren 83acb8671d
Update DeepSpeed version, fix failing tests (#9898) 2021-10-11 22:35:33 +00:00
Carlos Mocholí 0dfc6a18bd
Call any trainer function from the `LightningCLI` (#7508) 2021-08-28 04:43:14 +00:00
Adrian Wälchli de22e40095
restrict deepspeed version in CI (#8951) 2021-08-17 14:02:27 +01:00
thomas chaton 9e61de2063
Torch Elastic DDP DeadLock bug fix (#8655)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-08-02 21:48:43 +02:00
thomas chaton 85bba06529
update (#8674) 2021-08-02 11:56:09 +02:00
Jirka Borovec 470842f5c8
CI: validate JSON & fix benchmark (#8567)
* CI: validate JSON

* as GHA

* PT1.8

* 32g

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-07-28 18:09:15 +02:00
Adrian Wälchli 96729fc45a
update links for collect_env_details.py script (#8436) 2021-07-19 11:26:09 +00:00
Carlos Mocholí ae1fd6a201
Unblock GPU CI (#8456)
* Debug

* Increase SHM size

* Debug

* Refactor MNIST imports

* Undo debugging

* Prints
2021-07-19 09:41:18 +02:00
Carlos Mocholí 4184d7e738
Refactor GPU examples tests (#8294) 2021-07-06 13:14:04 +01:00
Carlos Mocholí 2c43bfc5ef
GPU CI - run torch 1.8 (LTS) (#8116) 2021-06-24 16:56:43 +00:00
Sean Naren f7459f5328
DeepSpeed Infinity Update (#7234)
* Update configs to match latest API

* Ensure we move the entire model to device before configure optimizer is called

* Add missing param

* Expose parameters

* Update references, drop local rank as it's now inferred from the environment variable

* Fix ref

* Force install deepspeed 0.3.16

* Add guard for init

* Update pytorch_lightning/plugins/training_type/deepspeed.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Revert type checking

* Install master for CI for testing purposes

* Update CI

* Fix tests

* Add check

* Update versions

* Set precision

* Fix

* See if i can force upgrade

* Attempt to fix

* Drop

* Add changelog

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-14 16:38:28 +00:00
Carlos Mocholí e16d4fbdee
CI code cleaning (#7615) 2021-05-21 11:35:12 +00:00
Louis Taylor b64aea637c
CI: move azure-pipelines config to separate directory (#7276)
* CI: move azure pipelines to separate directory

This removes some extra clutter in the top level as we add more
pipelines.

* rename

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-04 10:50:16 -04:00