"With the release of `pytorch-lightning` version 0.9.0, we have included a new class called `LightningDataModule` to help you decouple data related hooks from your `LightningModule`.\n",
"\n",
"This notebook will walk you through how to start using Datamodules.\n",
"The most up to date documentation on datamodules can be found [here](https://pytorch-lightning.readthedocs.io/en/latest/extensions/datamodules.html).\n",
"Below, we reuse a `LightningModule` from our hello world tutorial that classifies MNIST Handwritten Digits.\n",
"\n",
"Unfortunately, we have hardcoded dataset-specific items within the model, forever limiting it to working with MNIST Data. 😢\n",
"\n",
"This is fine if you don't plan on training/evaluating your model on different datasets. However, in many cases, this can become bothersome when you want to try out your architecture with different datasets."
"DataModules are a way of decoupling data-related hooks from the `LightningModule` so you can develop dataset agnostic models."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "eJeT5bW081wn"
},
"source": [
"## Defining The MNISTDataModule\n",
"\n",
"Let's go over each function in the class below and talk about what they're doing:\n",
"\n",
"1. ```__init__```\n",
" - Takes in a `data_dir` arg that points to where you have downloaded/wish to download the MNIST dataset.\n",
" - Defines a transform that will be applied across train, val, and test dataset splits.\n",
" - Defines default `self.dims`, which is a tuple returned from `datamodule.size()` that can help you initialize models.\n",
"\n",
"\n",
"2. ```prepare_data```\n",
" - This is where we can download the dataset. We point to our desired dataset and ask torchvision's `MNIST` dataset class to download if the dataset isn't found there.\n",
" - **Note we do not make any state assignments in this function** (i.e. `self.something = ...`)\n",
"\n",
"3. ```setup```\n",
" - Loads in data from file and prepares PyTorch tensor datasets for each split (train, val, test). \n",
" - Setup expects a 'stage' arg which is used to separate logic for 'fit' and 'test'.\n",
" - If you don't mind loading all your datasets at once, you can set up a condition to allow for both 'fit' related setup and 'test' related setup to run whenever `None` is passed to `stage`.\n",
" - **Note this runs across all GPUs and it *is* safe to make state assignments here**\n",
"\n",
"\n",
"4. ```x_dataloader```\n",
" - `train_dataloader()`, `val_dataloader()`, and `test_dataloader()` all return PyTorch `DataLoader` instances that are created by wrapping their respective datasets that we prepared in `setup()`"
" <h1> <strong> Congratulations - Time to Join the Community! </strong> </h1>\n",
"</code>\n",
"\n",
"Congratulations on completing this notebook tutorial! If you enjoyed this and would like to join the Lightning movement, you can do so in the following ways!\n",
"\n",
"### Star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) on GitHub\n",
"The easiest way to help our community is just by starring the GitHub repos! This helps raise awareness of the cool tools we're building.\n",
"\n",
"* Please, star [Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n",
"The best way to keep up to date on the latest advancements is to join our community! Make sure to introduce yourself and share your interests in `#general` channel\n",
"\n",
"### Interested by SOTA AI models ! Check out [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n",
"Bolts has a collection of state-of-the-art models, all implemented in [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) and can be easily integrated within your own projects.\n",
"\n",
"* Please, star [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts)\n",
"\n",
"### Contributions !\n",
"The best way to contribute to our community is to become a code contributor! At any time you can go to [Lightning](https://github.com/PyTorchLightning/pytorch-lightning) or [Bolt](https://github.com/PyTorchLightning/pytorch-lightning-bolts) GitHub Issues page and filter for \"good first issue\". \n",
"\n",
"* [Lightning good first issue](https://github.com/PyTorchLightning/pytorch-lightning/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n",
"* [Bolt good first issue](https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)\n",
"* You can also contribute your own notebooks with useful examples !\n",
"\n",
"### Great thanks from the entire Pytorch Lightning Team for your interest !\n",