{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Welcome\n",
"\n",
"## Background\n",
"\n",
"`proxy.py` was released on 20th August, 2013 as a single file HTTP proxy server implementation with no external dependencies. See the [first commit](https://github.com/abhinavsingh/proxy.py/commit/75044a72d9c7b4b8910ba551006b801eafdf3c47) and [read introductory blog](https://abhinavsingh.com/proxy-py-a-lightweight-single-file-http-proxy-server-in-python/) to get an insight about why `proxy.py` was created.\n",
"\n",
"## Introduction\n",
"\n",
"Today, `proxy.py` has matured into a full blown networking library with focus on being lightweight, ability to deliver maximum performance while being extendible. Unlike other Python servers, `proxy.py` doesn't need a `WSGI` or `UWSI` frontend, which then usually has to be placed behind a reverse proxy e.g. `Nginx` or `Apache`. Of-course, `proxy.py` can be placed directly behind a load-balancer _(optionally capable of speaking HA proxy protocol)_.\n",
"\n",
"## Working with proxy.py\n",
"\n",
"To work with `proxy.py`, you must follow these critical concepts:\n",
"\n",
"1. Avoid using synchronous IO operations within your code\n",
"\n",
 `proxy.py` is asynchronous in nature, so a synchronous call in your plugin code may block the entire core event loop. For asynchronous operations, you must tie into the `proxy.py` event loop using the provided plugin APIs.\n",
"\n",
"2. Plugin instances are NOT global\n",
"\n",
" Plugin instances are created for every request. Hence, your plugin code must be written to handle execution of a single request. `proxy.py` will internally take care of concurrency for you.\n",
"\n",
"## The Concept Of Work\n",
"\n",
"`proxy.py` core is written with a high level concept of `work`.\n",
"\n",
"- A running instance can receive `work` from one or multiple `sources`\n",
 - For example, once `proxy.py` starts, each accepted client connection is a unit of `work` coming from a TCP socket `source`\n",
"- Handlers can be written to process various types of `work`\n",
 - For example, `HttpProtocolHandler` handles `work` that arrives as HTTP client connections\n",
"- A client connection can come from a variety of `sources`\n",
" - TCP sockets\n",
" - UDP sockets\n",
" - Unix sockets\n",
" - Raw sockets\n",
"\n",
"In fact, `work` can be any processing unit. It doesn't have to be a client connection. Example:\n",
"\n",
"- A file on disk can act as the `source` and each line in that file as the `work` definition\n",
"- Imagine tailing a file on disk as `source` and processing each line as a separate `work` object\n",
"- If you want, each line in the file can also be a URL to be scrapped or download\n",
"- If you want, your `work` handlers can append new URLs _(discovered by scrapping previous URL entries)_ back in the file, creating an infinite feedback loop between the `work` processing core.\n",
"\n",
"And just like that we have created a web scraper!!!\n",
"\n",
"To extend this generic concept, now imagine a distributed queue as the `source` of our `work`, where each published message in the queue is our `work` payload. Some examples of such `sources` can be:\n",
"- A `Redis` channel\n",
"- Google Cloud PubSub channel\n",
"- Kafka queues\n",
"\n",
"And just like that we have created a distributed `work` executor!!!"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}