Commit Graph

52 Commits

Author SHA1 Message Date
William Falcon 5fd01b0e68
Finish Ananthsub patch 1 (enable prepare_data from correct processes). clarify local vs global rank (#2166)
* [trainer] Call prepare_data once per node in DDP/DDP2 training

* refactored DDP routes

* renamed proc_rank to local_rank (×12)

* spawn message (×3)

* fixes (×5)

* Update trainer.py

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2020-06-13 12:00:14 -04:00
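
For context, a minimal sketch of the local vs. global rank distinction this PR clarifies, using the environment-variable convention common to DDP launchers (the variable names and per-node GPU count here are illustrative, not Lightning's exact internals):

```python
import os

# LOCAL_RANK indexes processes within one node; the global rank indexes
# them across all nodes.
local_rank = int(os.environ.get("LOCAL_RANK", "0"))
node_rank = int(os.environ.get("NODE_RANK", "0"))
gpus_per_node = 8  # illustrative; normally taken from the Trainer config

global_rank = node_rank * gpus_per_node + local_rank

# prepare_data (e.g. dataset downloads) should run once per node, hence
# a local-rank-0 guard rather than a global-rank-0 guard.
if local_rank == 0:
    pass  # download data to node-local storage here
```
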
Udit Arora 08573d0f7e
Fix some pyright member access errors in training module (#2121)
* Fix pyright member access errors in training module

* Fix Trainer instantiation error due to inheritance order

* Add GH workflow for pyright

* Fix more pyright errors in trainer module

* Add pyrightconfig and setup python environment in type-check workflow

* Exclude pyrightconfig.json

* suggestions

Co-authored-by: Jirka <jirka@pytorchlightning.ai>
2020-06-12 17:23:18 +02:00
Adrian Wälchli 8211256c46
data transfer model hook (+ refactor) (#1756)
* refactor and added hook

variant a
variant b
add test
revert rename
add changelog
docs

* resolve merge duplication

* overridden typo

* fix test

* tpu id

* raise if TPU not available

* re-use apply_to_collection function for parsing collections

* comment

* make utility function available to user

* documentation

* move changelog entry to top

* fix tpu transfer call

* fix call

* remove hardcoded string

* improve test

* call model hook by default

* Apply suggestions from code review

* rename utility function

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-06-02 21:45:19 -04:00
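
A sketch of overriding the new hook (the hook name comes from this PR; the dict-with-custom-object batch is a made-up example):

```python
import pytorch_lightning as pl


class MyModule(pl.LightningModule):
    def transfer_batch_to_device(self, batch, device):
        if isinstance(batch, dict) and "graph" in batch:
            # custom object Lightning cannot traverse on its own
            batch["graph"] = batch["graph"].to(device)
            return batch
        # fall back to the default collection-aware transfer
        return super().transfer_batch_to_device(batch, device)
```
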
Adrian Wälchli a699003e67
Update/merge multi-gpu docs (#2021)
* merge multi-gpu docs

* extend slurm docs

* update links to elastic

* format docs and type hints in distrib parts

* reference multi-gpu/slurm in trainer args docs

* fix doctest

* typo

* doctest

* Apply suggestions from code review

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>

* wall time

* Update docs/source/slurm.rst

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>

* fix title

* update docs for weights summary

* update changelog

Co-authored-by: Lucas Vazquez <lucasgouvaz@gmail.com>
2020-06-02 18:50:08 -04:00
William Falcon 82a20296e3
Replaces ddp .spawn with subprocess (#2029)
* replace ddp spawn with subprocess (×55)

* hot fix (×59)
2020-06-01 11:00:32 -04:00
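
The idea of the change, as a standalone sketch rather than Lightning's actual launcher code: instead of `torch.multiprocessing.spawn`, each DDP worker is started as an independent interpreter via `subprocess`, avoiding spawn's pickling requirements:

```python
import os
import subprocess
import sys

NUM_GPUS = 2  # illustrative

if "LOCAL_RANK" not in os.environ:
    # Parent: re-launch this same script once per GPU as a fully
    # independent interpreter process.
    procs = [
        subprocess.Popen(
            [sys.executable] + sys.argv,
            env={**os.environ, "LOCAL_RANK": str(rank)},
        )
        for rank in range(NUM_GPUS)
    ]
    sys.exit(max(p.wait() for p in procs))

local_rank = int(os.environ["LOCAL_RANK"])
# ... each child initializes torch.distributed and trains here ...
```
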
Jirka Borovec 5e8c5abf63
fix default arg (#1927)
* fix default

* formatting errors

* update

* flake8
2020-05-26 19:04:42 -04:00
Jirka Borovec ca815698f5
Revert "Remove unused param tpu_core_idx (#1948)" (#1963)
This reverts commit d0ec11b9d6.
2020-05-26 19:02:51 -04:00
Rohit Gupta d0ec11b9d6
Remove unused param tpu_core_idx (#1948) 2020-05-25 16:04:53 -04:00
Nicki Skafte 8f6b7a2b4f
Fix user warning produced by apex + scheduler combination (#1873)
* fix user error produced by apex + scheduler combination

* add changelog

* added reinit to every configure_apex call

* fix styling

Co-authored-by: Nicki Skafte <nugginea@gmail.com>
2020-05-22 07:19:37 -04:00
Lezwon Castelino 7c7e50ca47
Allow user to select individual TPU core to train on (#1729)
* added tpu_id

added tpu_id to mixins

* train on individual tpu

* parallel loader if tpu_id is None

* removed progress_bar_refresh_rate

* chlog

* replaced num_tpu_cores with tpu_cores

* set tpu_id to None if int

* changed num_tpu_cores to tpu_cores in docs

* updated docs

* updated __init__.py
removed self.tpu_id for ParallelLoader

* Update pytorch_lightning/trainer/__init__.py

* check if tpu_cores is a list

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* xla device conditional

* num_tpu_cores deprecation

* removed duplicate warning

* fixed pep8 error

* Revert "removed duplicate warning"

This reverts commit 8adb0a9b

* deprecated api update

* fixed recursion error

* fixed tests

* fixed flake errors

* removed current_tpu_index

* Update CHANGELOG.md

* Update trainer.py

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-17 16:30:54 -04:00
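
Usage after this change, assuming a TPU-enabled environment (`tpu_cores` supersedes the deprecated `num_tpu_cores`):

```python
from pytorch_lightning import Trainer

trainer = Trainer(tpu_cores=8)    # train on all 8 TPU cores
trainer = Trainer(tpu_cores=[1])  # train only on the core with index 1
```
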
Adrian Wälchli 4cdebf9a64
remove obsolete self._device in Trainer (#1849)
* remove unused device attribute

* dtype

* move on_gpu to model
2020-05-17 08:20:51 -04:00
Justus Schock c05077fae3
Enable non-blocking for gpu device transfer (#1843)
* Update distrib_parts.py

* Update CHANGELOG.md
2020-05-14 17:56:40 -04:00
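
For reference, the underlying PyTorch mechanism: `non_blocking=True` lets host-to-device copies overlap with compute when the source tensor lives in pinned memory.

```python
import torch

if torch.cuda.is_available():
    x = torch.randn(64, 128).pin_memory()  # page-locked host memory
    y = x.to("cuda", non_blocking=True)    # asynchronous copy
```
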
William Falcon 53d9316a56
fixes ddp bugs (#1819)
* debug (×37)
2020-05-13 19:17:04 -04:00
Jirka Borovec 10ce1c0256
device property (#1791)
* device property

* add/copy properties

* inherit

* rename

* Apply suggestions from code review

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* dtype

* prop

* pt api

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2020-05-12 23:18:39 -04:00
Travis Addair acab068c74
Join Horovod workers at the end of trainer.fit() to prevent race conditions following training (#1786)
* Join Horovod workers at the end of trainer.fit() to prevent race conditions following training

* flake8

* flake8

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-12 09:15:25 +00:00
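
A sketch of the idea with the Horovod API (assuming a Horovod version that provides `hvd.join()`): a final join keeps fast workers from racing ahead of slower ones at the end of fit().

```python
import horovod.torch as hvd

hvd.init()
# ... training work happens here on every worker ...
hvd.join()  # barrier: wait for all workers before returning from fit()
```
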
William Falcon 4b30ef6480
Device (#1790)
* added self.device

* added docs
2020-05-12 00:09:48 -04:00
Travis Addair f90afa29b8
Fix disabling progress bar on non-zero ranks using Horovod backend (#1709)
* Fix Horovod backend to disable progress bar on all ranks except 0

* Add join barriers

* Added changelog

* Make protected and add verbosity

* Refactor to disable progress bar callback in train

* Removed verbose setting

* Add cache check for Horovod

* Test run again

* Updated comment

* Always skip cache for Horovod

* Only reinstall when necessary

* Added separate step

* Fixed spacing

* Skip Python 3.8
2020-05-04 13:02:57 -04:00
Travis Addair 2950f66983
Fix Horovod distributed backend to set the root_gpu property (#1669)
* params

* drop acc

* Fix Horovod distributed backend to set the root_gpu

* Fixed test

* Fixed tests

* Fixed lint

* Set root_gpu during initialization

* chlog

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2020-05-01 14:13:35 -04:00
Nathan Breitsch 3eac6cfd4f
Don't convert namedtuple to tuple (#1589)
* Don't convert namedtuple to tuple

* Test namedtuples sent to device correctly
2020-04-30 08:04:50 -04:00
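
The bug in miniature, as a standalone sketch: rebuilding a batch with a plain `tuple(...)` silently drops the namedtuple subclass, while rebuilding via the original type preserves it.

```python
from collections import namedtuple

Batch = namedtuple("Batch", ["x", "y"])
batch = Batch(x=1, y=2)

broken = tuple(v for v in batch)          # plain tuple: .x and .y are gone
fixed = type(batch)(*(v for v in batch))  # still a Batch namedtuple

assert not hasattr(broken, "x") and fixed.x == 1
```
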
Jirka Borovec 58a467dd68
model checkpoint on rank_zero_only & global rank state (#1408)
* try delete in async or DDP use-case

* changelog

* add model checkpoint rank

* simple delete

* flake8

* use global rank

* changelog

* fix review

* fix import

* proposal

* proposal

* proposal

* improve proposal (fix problems with method call self)

* cleaning

Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-04-24 17:21:00 -04:00
William Falcon 890458fdbd
Fixes automatic parser bug (#1585)
* fixes gpu parsing

* fixes gpu parsing
2020-04-23 21:00:41 -04:00
Adrian Wälchli 3e8f2d99a9
Progress bar callback (#1450)
* squash and rebase

sanity check hooks
sanity check callback hook finish
moved core progress bar functionality into callback
wip
remove duplicate merge
clean up
imports
docs
sanity check progress bar main
sanity
move callback calls
init progress bar callback
configuration and docs
changelog
rate decorator
pass process_position
disable on rank > 0
position index
is_enabled
remove decorator
refactor init tqdm bars
callback method ordering
cannot reset when disabled
sequence -> list
default values
fix has no attr _time()
move on_val_end to proper place
fix the pickle issue
update warning
properties
check for None
remove old comment
switch order
pull out non-tqdm functionality into base class
documentation for the base class
docs
fix refresh rate issue in validation
restrict type hint of trainer arg
more docs
update trainer docs
rst docs
fix lines too long
fix test
add missing type hints
fix typo
move docstring to __init__ solves doctest failures
remove doctest :(( can't fix the pickle error
fix example
simplify by saving trainer reference
fix docs errors
move docstring
initial value
multiple val checks per epoch
simpler handling of inf dataset sizes
update inf docs
renamed training_tqdm_dict
rename get_tqdm_dict
rename occurrences of tqdm
update changelog
fix doctest
fix formatting errors
added callback tests
progress bar on off test
more tests for progress bar
weird test fix?
add ignored property
disable default progress bar in LR finder
change enable/disable behavior
trying doctest in CI again
undo doctest pickle error
undo doctest pickle error :((
remove progress_bar_callback Trainer arg and fix tests
restore progress bar after auto lr find
update docs
fix rebase
fix wrong negation

* fix fast dev run total

* more thorough testing

* remove old args

* fix merge

* fix merge

* separate tests

* type hint total batches

* reduce if

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_disabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* is_enabled

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* rename enabled/disabled

* move deprecated api

* remove duplicated test from merge

* fix rename is_disabled

* newline

* test also testprogress for fast dev run

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-23 20:46:18 -04:00
William Falcon 29ebe92208
support for native amp (#1561)
* adding native amp support (×4)

* autocast (×6)

* removed comments

* removed comments

* added state saving

* added state saving

* try install amp again

* added state saving

* drop Apex reinstall

Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-23 14:47:08 -04:00
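
"Native" here means PyTorch's built-in `torch.cuda.amp` (stable from PyTorch 1.6; nightly at the time of this commit) rather than NVIDIA Apex. The canonical autocast/GradScaler pattern:

```python
import torch

model = torch.nn.Linear(10, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

data = torch.randn(4, 10, device="cuda")
target = torch.randn(4, 2, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():  # mixed-precision forward pass
    loss = torch.nn.functional.mse_loss(model(data), target)
scaler.scale(loss).backward()    # scale the loss to avoid fp16 underflow
scaler.step(optimizer)
scaler.update()
```
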
Travis Addair 7024177f7d
Added Horovod distributed backend (#1529)
* Initial commit of Horovod distributed backend implementation

* Update distrib_data_parallel.py

* Update distrib_data_parallel.py

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fixed tests

* Added six

* tests

* Install tox for GitHub CI

* Retry tests

* Catch all exceptions

* Skip cache

* Remove tox

* Restore pip cache

* Remove the cache

* Restore pip cache

* Remove AMP

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-22 17:39:08 -04:00
Krishna Penukonda a22a8142ac
Allow Trainer's `gpus` arg type to be subclass of currently accepted types (#1423)
* Fixed Trainer `gpus` arg type issue

Fixes #1388

* Disallow boolean gpus parameter

Co-Authored-By: Adrian Wälchli <aedu.waelchli@gmail.com>

* Fixed missing parenthesis

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2020-04-17 18:18:29 -04:00
Allard Hendriksen 7ac1580a31
Add automatic GPU choice to trainer (#1426)
* Add automatic GPU choice to trainer

This commit adds the `gpu_choice` parameter to Trainer. By default,
this parameter is set to 'manual', which causes no observable
difference in behavior.

When `gpu_choice` is set to "auto" and `gpus` is an int, then the
trainer will automatically allocate the first available GPU.
This is especially useful when GPUs are configured to be in "exclusive
mode", which means that only one process at a time can use them.

* Rename gpu_choice -> auto_select_gpus
2020-04-10 11:45:29 -04:00
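
Usage after the rename described above:

```python
from pytorch_lightning import Trainer

# Acquire the first 2 GPUs that are actually free -- useful when devices
# run in "exclusive mode" (one process per GPU).
trainer = Trainer(gpus=2, auto_select_gpus=True)
```
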
Jirka Borovec 17f58d2e11
add rank warning (#1428)
* add rank warning

* changelog

* use rank_zero_warn

* user trainer_init

* replace warnings

* fix test

* flake8

* docs

* changelog

* bug lol
2020-04-09 14:05:46 -04:00
Roshan Rao 4ed3027309
Set precision=16 when use_amp is passed as True (#1145)
* Set precision=16 when use_amp is passed as True

* Update CHANGELOG.md

* add use_amp to deprecated API

* Update trainer.py

* Update trainer.py

* move the use_amp attribute to deprecated API

* move use_amp deprecation back to Trainer's __init__

* drop unused

* drop deprecated

* reorder imports

* typing

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-06 08:13:24 -04:00
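
The migration in practice: `use_amp=True` is deprecated and now simply maps to the `precision` argument.

```python
from pytorch_lightning import Trainer

trainer = Trainer(precision=16)  # preferred over the deprecated use_amp=True
```
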
William Falcon 16f4cc9ff0
Shubhamagarwal92 master (#1349)
* SA: for #958: set torch cuda device when finding root

* SA: for #958: removing root gpu hack in trainer/evaluation_loop

* SA: setting torch cuda device

* comment line too long

* check if root gpu exists or available

* Incorporating suggestions on #1094

* since root gpu returns none instead of -1 for cpu

* undo changes

* fixed dp memory thing

Co-authored-by: Shubham Agarwal <shubhamagarwal92@gmail.com>
2020-04-03 17:56:19 -04:00
Gerard Bentley f33b5a8d99
Simplify progress bar args (#1108)
* show progress bar dependent on refresh_rate

* test progress_bar_refresh control show bar

* remove show_progress_bar from other tests

* borda fixes

* flake8 fix

* changelog update prog bar refresh rate

* move show_progress_bar to deprecated 0.9 api

* rm show_progress_bar references, test deprecated

* Update pytorch_lightning/trainer/__init__.py

* fix test

* changelog

* minor CHANGELOG.md format

* Update pytorch_lightning/trainer/__init__.py

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Gerard Bentley <gbkh2015@mymail.pomona.edu>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-03 00:53:00 +02:00
Ethan Harris 28242f02d1
Remove default optimizer, add None optimizer option (#1279)
* Add warning when using default optimizer

* Refactor optimizer tests to test_optimizers

* Remove default optimizer, add option to use no optimizer

* Update CHANGELOG.md

* Update pytorch_lightning/trainer/optimizers.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fix style

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-04-02 11:48:53 -04:00
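
After this change, `configure_optimizers` may return `None` to train without an optimizer; a minimal sketch:

```python
import pytorch_lightning as pl


class NoOptimModule(pl.LightningModule):
    def configure_optimizers(self):
        # None now means "train without an optimizer" instead of silently
        # falling back to a default one.
        return None
```
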
William Falcon 7de51f78ac
Sampler (#1318)
* sampler

* sampler

* sampler

* check for dataloader type

* check for dataloader type
2020-03-31 18:22:45 -04:00
Asaf Manor aca8c7e6f3
Optimizer Frequencies logic, and new configure_optimizers (#1269)
* init_optimizers accepts Dict, Sequence[Dict]
and returns optimizer_frequencies.
optimizer_frequencies was added as a member of Trainer.

* Optimizer frequencies logic implemented in training_loop.
Description added to configure_optimizers in LightningModule

* optimizer frequencies tests added to test_gpu

* Fixed formatting for merging PR #1269

* Apply suggestions from code review

* Apply suggestions from code review

Co-Authored-By: Asaf Manor <32155911+asafmanor@users.noreply.github.com>

* Update trainer.py

* Moving get_optimizers_iterable() outside.

* Update note

* Apply suggestions from code review

* formatting

* formatting

* Update CHANGELOG.md

* formatting

* Update CHANGELOG.md

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-31 16:41:24 +00:00
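
The new contract sketched below: `configure_optimizers` can return dicts carrying a `frequency` key, and Lightning alternates optimizer steps accordingly (the generator/discriminator attributes are assumed to be defined in `__init__`):

```python
import torch
import pytorch_lightning as pl


class GAN(pl.LightningModule):
    def configure_optimizers(self):
        dis_opt = torch.optim.Adam(self.discriminator.parameters(), lr=2e-4)
        gen_opt = torch.optim.Adam(self.generator.parameters(), lr=2e-4)
        # Two discriminator steps for every generator step, repeating.
        return (
            {"optimizer": dis_opt, "frequency": 2},
            {"optimizer": gen_opt, "frequency": 1},
        )
```
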
Jirka Borovec 6ddb03922a
Profiler summary (#1259)
* refactor and add types

* add Profiler summary

* fix imports

* Revert "refactor and add types"

This reverts commit b4c552fa

* changelog

* revert rename

* fix test

* mute verbose
2020-03-31 08:57:48 -04:00
Jirka Borovec 09167efdb5
Checkpointing interval (#1272)
* formatting

* formatting

* fix interval

* fix train loop

* fix test

* parametrize test

* Apply suggestions from code review

Co-Authored-By: Adrian Wälchli <adrian.waelchli@students.unibe.ch>

* fix calling

* flake8

* add types

Co-authored-by: Adrian Wälchli <adrian.waelchli@students.unibe.ch>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-03-30 18:37:02 -04:00
Adrian Wälchli 792962ecc9
CI: Force docs warnings to be raised as errors (+ fix all) (#1191)
* add argument to force warn

* fix automodule error

* fix permalink error

* fix indentation warning

* fix warning

* fix import warnings

* fix duplicate label warning

* fix bullet point indentation warning

* fix duplicate label warning

* fix "import not top level" warning

* line too long

* fix indentation

* fix bullet points indentation warning

* fix hooks warnings

* fix reference problem with excluded test_tube

* fix indentation in print

* change imports for trains logger

* remove pandas type annotation

* Update pytorch_lightning/core/lightning.py

* include bullet points inside note

* remove old quick start guide (unused)

* fix unused warning

* fix formatting

* fix duplicate label issue

* fix duplicate label warning (replaced by class ref)

* fix tick

* fix indentation warnings

* docstring ticks

* remove obsolete docstring typing

* Revert "remove old quick start guide (unused)"

This reverts commit d51bb40695.

* added old quick start guide to navigation

* remove unused  tutorials file

* ignore some modules that got deprecated and are not used anymore

* fix duplicate label warning

* move examples doc and exclude pl_examples from autodoc

* fix formatting for configure_optimizer

* fix no blank line warnings

* fix "see also" labels and add paramref extension

* fix more reference problems

* fix multi-gpu reference

* fix weird warning

* fix indentation and unrecognized characters in code block

* fix warning "... not included in toctree"

* fix PIL import error

* fix duplicate target "here" warning

* fix broken link

* revert accidentally moved pl_examples

* changelog

* stdout

* note some things to know

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-03-20 20:49:01 +01:00
Jirka Borovec 22a7264e9a
improve partial Codecov (#1172)
* ignore in setup

* show report

* abs imports

* abstract pass

* cover loggers

* doctest trains

* locals

* pass

* revert tensorboard

* use tensorboardX

* revert tensorboardX

* fix trains

* Add TrainsLogger.set_credentials (#1179)

* Add TrainsLogger.set_credentials to control trains server configuration and authentication from code. Sync trains package version.
Fix CI Trains tests

* Add global TrainsLogger set_bypass_mode (#1187)

* Add global TrainsLogger set_bypass_mode, which skips all external communication

Co-authored-by: bmartinn <>

* rm some no-cov

Co-authored-by: Martin.B <51887611+bmartinn@users.noreply.github.com>
2020-03-19 09:14:29 -04:00
Jacob Zhong 1a73fa0b03
change default logger to dedicated one (#1064)
Fix test
Fix format
Update pytorch_lightning/__init__.py
Separate imports
2020-03-17 18:44:00 -04:00
Jirka Borovec 514d182b7f
cleaning imports (#1032) 2020-03-12 12:41:37 -04:00
William Falcon 15e268d6df
Coverage (#1058)
* docs (×4)
2020-03-05 19:49:18 -05:00
William Falcon 17891653cd
handle keyboard interrupt for ddp .test() (#1019)
* updated checkpoint docs (×26)
2020-03-02 23:38:47 -05:00
Bilal Khan 29cbc9e723
Fix #997 (#1018) 2020-03-02 21:51:05 -05:00
William Falcon 6dae5698ef
fixes test issues on ddp (#1017)
* updated checkpoint docs (×17)
2020-03-02 21:50:38 -05:00
Jirka Borovec 7beed7cae6
Trainer cleanup (#934)
* Trainer cleanup

* update abstract

* remove ...

* remove __init__

* update mixin types

* update callbacks

* fix

* lower test acc
2020-02-27 16:21:14 -05:00
Hanbyul Kim 563e2ba2c6
resolving documentation warnings (#833)
* add more underline

* fix LightningModule import error

* remove unneeded blank line

* escape asterisk to fix inline emphasis warning

* add PULL_REQUEST_TEMPLATE.md

* add __init__.py and import imagenet_example

* fix duplicate label

* add noindex option to fix duplicate object warnings

* remove unexpected indent

* refer explicit LightningModule

* fix minor bug

* refer EarlyStopping explicitly

* restore exclude patterns

* change the way how to refer class

* remove unused import

* update badges & drop Travis/Appveyor (#826)

* drop Travis

* drop Appveyor

* update badges

* fix missing PyPI images & CI badges (#853)

* docs - anchor links (#848)

* docs - add links

* add desc.

* add Greeting action (#843)

* add Greeting action

* Update greetings.yml

Co-authored-by: William Falcon <waf2107@columbia.edu>

* add pep8speaks (#842)

* advanced profiler describe + cleaned up tests (#837)

* add py36 compatibility

* add test case to capture previous bug

* clean up tests

* clean up tests

* Update lightning_module_template.py

* Update lightning.py

* respond lint issues

* break long line

* break more lines

* checkout conflicting files from master

* shorten url

* checkout from upstream/master

* remove trailing whitespaces

* remove unused import LightningModule

* fix sphinx bot warnings

* Apply suggestions from code review

just to trigger CI

* Update .github/workflows/greetings.yml

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-02-27 16:07:51 -05:00
William Falcon 3562aa5aae fix tpu transfer bug 2 2020-02-17 17:47:16 -05:00
William Falcon 919a26fe41 fix tpu transfer bug 2020-02-17 17:46:46 -05:00
William Falcon d4a31f02e0
Enable TPU support (#868)
* added tpu docs

* added tpu flags

* add tpu docs + init training call

* amp (×4)

* optimizer step

* added auto data transfer to TPU (×24)

* fix test pkg create (#873)

* added auto data transfer to TPU (×3)

* added test return and print (×5)

* Update pytorch_lightning/trainer/trainer.py

Co-Authored-By: Luis Capelo <luiscape@gmail.com>

* Fix segmentation example (#876)

* removed torchvision model and added custom model

* minor fix

* Fixed relative imports issue

* Fix/typo (#880)

* Update greetings.yml

* Update greetings.yml

* Changelog (#869)

* Create CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update PULL_REQUEST_TEMPLATE.md

* Update PULL_REQUEST_TEMPLATE.md

* Add PR links to Version 0.6.0 in CHANGELOG.md

* Add PR links for Unreleased in CHANGELOG.md

* Update PULL_REQUEST_TEMPLATE.md

* Fixing Function Signatures (#871)

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Luis Capelo <luiscape@gmail.com>
Co-authored-by: Akshay Kulkarni <akshayk.vnit@gmail.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Shikhar Chauhan <xssChauhan@users.noreply.github.com>
2020-02-17 16:01:20 -05:00
Adrian Wälchli 472f394788
Resolve some codefactor issues (#756)
* remove unnecessary pass statements

* use isinstance for type checks

* remove unnecessary else/elif after return

* remove unnecessary return statements

* move doc string to top

* merge isinstance calls

* remove unnecessary else/elif after raise

* use list comprehension

* do not use len without comparison

* add missing shebang

* revert isinstance check back to type

broke tests, because bool is actually subclass of int

* add missing period to doc string

* Fix default ckpt path when logger exists (#771)

* rename logging -> loggers (#767)

* move logging >> loggers

* add warning

* fix tests

* logging alias

* formatting

* formatting

* use isinstance for type checks

* revert isinstance check back to type

broke tests, because bool is actually subclass of int

* add more detail to tbptt example (#755)

* add more detail to tbptt example

* warn user about new arg in training_step

Co-authored-by: Vadim Bereznyuk <kuynzereb@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jeremy Jordan <13970565+jeremyjordan@users.noreply.github.com>
2020-02-01 18:44:05 -05:00
Jirka Borovec ea59a99426 update org paths & convert logos (#685)
* fix typos

* update org paths

* update links from READMe to docs

* add svg logo

* add svg logo-text

* update logos

* testing temp paths

* prune links from readme

* optimize imports

* update logo

* update paths in README

* missing imports
2020-01-20 14:50:31 -05:00