proxy.py/tutorial/welcome.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Welcome\n",
    "\n",
    "## Background\n",
    "\n",
    "`proxy.py` was released on 20th August, 2013 as a single file HTTP proxy server implementation with no external dependencies.  See the [first commit](https://github.com/abhinavsingh/proxy.py/commit/75044a72d9c7b4b8910ba551006b801eafdf3c47) and [read introductory blog](https://abhinavsingh.com/proxy-py-a-lightweight-single-file-http-proxy-server-in-python/) to get an insight about why `proxy.py` was created.\n",
    "\n",
    "## Introduction\n",
    "\n",
    "Today, `proxy.py` has matured into a full blown networking library with focus on being lightweight, ability to deliver maximum performance while being extendible.  Unlike other Python servers, `proxy.py` doesn't need a `WSGI` or `UWSI` frontend, which then usually has to be placed behind a reverse proxy e.g. `Nginx` or `Apache`.  Of-course, `proxy.py` can be placed directly behind a load-balancer _(optionally capable of speaking HA proxy protocol)_.\n",
    "\n",
    "## Working with proxy.py\n",
    "\n",
    "To work with `proxy.py`, you must follow these critical concepts:\n",
    "\n",
    "1. Avoid using synchronous IO operations within your code\n",
    "\n",
    "    `proxy.py` is asynchronous in nature and by making a synchronous call in your plugin code, you may block the entire core event loop.  For asynchronous operations, you must tie into the `proxy.py` event loop using the provided plugin APIs.\n",
    "\n",
    "2. Plugin instances are NOT global\n",
    "\n",
    "    Plugin instances are created for every request.  Hence, your plugin code must be written to handle execution of a single request.  `proxy.py` will internally take care of concurrency for you.\n",
    "\n",
    "## The Concept Of Work\n",
    "\n",
    "`proxy.py` core is written with a high level concept of `work`.\n",
    "\n",
    "- A running instance can receive `work` from one or multiple `sources`\n",
    "  - Example, when `proxy.py` starts, an accepted client connection is a `work` coming from TCP socket `sources`\n",
    "- Handlers can be written to process various types of `work`\n",
    "  - Example, `HttpProtocolHandler` handles HTTP client connections `work`\n",
    "- A client connection can come from a variety of `sources`\n",
    "  - TCP sockets\n",
    "  - UDP sockets\n",
    "  - Unix sockets\n",
    "  - Raw sockets\n",
    "\n",
    "In fact, `work` can be any processing unit.  It doesn't have to be a client connection.  Example:\n",
    "\n",
    "- A file on disk can act as the `source` and each line in that file as the `work` definition\n",
    "- Imagine tailing a file on disk as `source` and processing each line as a separate `work` object\n",
    "- If you want, each line in the file can also be a URL to be scrapped or download\n",
    "- If you want, your `work` handlers can append new URLs _(discovered by scrapping previous URL entries)_ back in the file, creating an infinite feedback loop between the `work` processing core.\n",
    "\n",
    "And just like that we have created a web scraper!!!\n",
    "\n",
    "To extend this generic concept, now imagine a distributed queue as the `source` of our `work`, where each published message in the queue is our `work` payload.  Some examples of such `sources` can be:\n",
    "- A `Redis` channel\n",
    "- Google Cloud PubSub channel\n",
    "- Kafka queues\n",
    "\n",
    "And just like that we have created a distributed `work` executor!!!"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}