{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Welcome\n", "\n", "## Background\n", "\n", "`proxy.py` was released on 20th August, 2013 as a single-file HTTP proxy server implementation with no external dependencies. See the [first commit](https://github.com/abhinavsingh/proxy.py/commit/75044a72d9c7b4b8910ba551006b801eafdf3c47) and the [introductory blog post](https://abhinavsingh.com/proxy-py-a-lightweight-single-file-http-proxy-server-in-python/) for insight into why `proxy.py` was created.\n", "\n", "## Introduction\n", "\n", "Today, `proxy.py` has matured into a full-blown networking library focused on being lightweight and extensible while delivering maximum performance. Unlike other Python servers, `proxy.py` doesn't need a `WSGI` or `uWSGI` frontend, which then usually has to be placed behind a reverse proxy such as `Nginx` or `Apache`. Of course, `proxy.py` can be placed directly behind a load balancer _(optionally capable of speaking the HAProxy protocol)_.\n", "\n", "## Working with proxy.py\n", "\n", "To work with `proxy.py`, keep these two critical concepts in mind:\n", "\n", "1. Avoid using synchronous IO operations within your code\n", "\n", " `proxy.py` is asynchronous in nature, so a synchronous call in your plugin code may block the entire core event loop. For asynchronous operations, you must tie into the `proxy.py` event loop using the provided plugin APIs.\n", "\n", "2. Plugin instances are NOT global\n", "\n", " Plugin instances are created for every request. Hence, your plugin code must be written to handle execution of a single request. 
`proxy.py` will internally take care of concurrency for you.\n", "\n", "## The Concept Of Work\n", "\n", "The `proxy.py` core is written around a high-level concept of `work`.\n", "\n", "- A running instance can receive `work` from one or multiple `sources`\n", " - For example, when `proxy.py` starts, each accepted client connection is a `work` coming from a TCP socket `source`\n", "- Handlers can be written to process various types of `work`\n", " - For example, `HttpProtocolHandler` handles HTTP client connection `work`\n", "- A client connection can come from a variety of `sources`\n", " - TCP sockets\n", " - UDP sockets\n", " - Unix sockets\n", " - Raw sockets\n", "\n", "In fact, `work` can be any processing unit. It doesn't have to be a client connection. For example:\n", "\n", "- A file on disk can act as the `source` and each line in that file as the `work` definition\n", "- Imagine tailing a file on disk as the `source` and processing each line as a separate `work` object\n", "- If you want, each line in the file can also be a URL to be scraped or downloaded\n", "- If you want, your `work` handlers can append new URLs _(discovered by scraping previous URL entries)_ back to the file, creating an infinite feedback loop with the `work` processing core.\n", "\n", "And just like that we have created a web scraper!!!\n", "\n", "To extend this generic concept, now imagine a distributed queue as the `source` of our `work`, where each published message in the queue is our `work` payload. Some examples of such `sources` are:\n", "- A `Redis` channel\n", "- Google Cloud PubSub channel\n", "- Kafka queues\n", "\n", "And just like that we have created a distributed `work` executor!!!" ] } ], "metadata": { "language_info": { "name": "python" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }