Expand "How mitmproxy works". Clean up some un-needed sections.

Aldo Cortesi 2013-03-10 17:09:40 +13:00
parent 6a9683719c
commit 5ceef16486
9 changed files with 166 additions and 1076 deletions


@ -50,7 +50,7 @@ Requirements
------------ ------------
* [Python](http://www.python.org) 2.7.x. * [Python](http://www.python.org) 2.7.x.
* [netlib](http://pypi.python.org/pypi/netlib) 0.2.2 or newer. * [netlib](http://pypi.python.org/pypi/netlib), version matching mitmproxy.
* [PyOpenSSL](http://pypi.python.org/pypi/pyOpenSSL) 0.13 or newer. * [PyOpenSSL](http://pypi.python.org/pypi/pyOpenSSL) 0.13 or newer.
* [pyasn1](http://pypi.python.org/pypi/pyasn1) 0.1.2 or newer. * [pyasn1](http://pypi.python.org/pypi/pyasn1) 0.1.2 or newer.
* [urwid](http://excess.org/urwid/) version 1.1 or newer. * [urwid](http://excess.org/urwid/) version 1.1 or newer.
@ -69,4 +69,3 @@ The following components are needed if you plan to hack on mitmproxy:
* The test suite uses the [nose](http://readthedocs.org/docs/nose/en/latest/) unit testing * The test suite uses the [nose](http://readthedocs.org/docs/nose/en/latest/) unit testing
framework and requires [pathod](http://pathod.org) and [flask](http://flask.pocoo.org/). framework and requires [pathod](http://pathod.org) and [flask](http://flask.pocoo.org/).
* Rendering the documentation requires [countershape](http://github.com/cortesi/countershape). * Rendering the documentation requires [countershape](http://github.com/cortesi/countershape).

File diff suppressed because one or more lines are too long


@ -1,3 +1,8 @@
body {
padding-top: 60px;
padding-bottom: 40px;
}
.terminal { .terminal {
color: #c0c0c0; color: #c0c0c0;
font-size: 1em; font-size: 1em;


@ -1,6 +1,6 @@
<div class="navbar"> <div class="navbar navbar-fixed-top">
<div class="navbar-inner"> <div class="navbar-inner">
<div class="container-fluid"> <div class="container">
<a class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse"> <a class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse">
<span class="icon-bar"></span> <span class="icon-bar"></span>
<span class="icon-bar"></span> <span class="icon-bar"></span>
@ -12,15 +12,14 @@
</div> </div>
</div> </div>
<div class="container-fluid"> <div class="container">
<div class="row-fluid"> <div class="row">
<div class="span3"> <div class="span3">
<div class="well sidebar-nav"> <div class="well sidebar-nav">
<ul class="nav nav-list"> <ul class="nav nav-list">
$!nav("index.html", this, state)!$ $!nav("index.html", this, state)!$
$!nav("install.html", this, state)!$ $!nav("install.html", this, state)!$
$!nav("howmitmproxy.html", this, state)!$ $!nav("howmitmproxy.html", this, state)!$
$!nav("faq.html", this, state)!$
<li class="nav-header">Tools</li> <li class="nav-header">Tools</li>
$!nav("mitmproxy.html", this, state)!$ $!nav("mitmproxy.html", this, state)!$
@ -68,13 +67,11 @@
</div> </div>
$!body!$ $!body!$
</div> </div>
</div>
</div><!--/row-->
<hr> <hr>
<footer> <footer>
<p>@!copyright!@</p> <p>@!copyright!@</p>
</footer> </footer>
</div>
</div><!--/.fluid-container-->


@ -1,42 +1,83 @@
<a href="http://github.com/cortesi/mitmproxy"><img style="position: absolute; top: 0; right: 0; border: 0;" src="https://d3nwyuy0nl342s.cloudfront.net/img/e6bef7a091f5f3138b8cd40bc3e114258dd68ddf/687474703a2f2f73332e616d617a6f6e6177732e636f6d2f6769746875622f726962626f6e732f666f726b6d655f72696768745f7265645f6161303030302e706e67" alt="Fork me on GitHub"></a> <div class="navbar navbar-fixed-top">
<div class="yui-t7" id="doc"> <div class="navbar-inner">
<div style="" id="hd"> <div class="container">
<div class="HorizontalNavBar"> <a class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse">
<ul> <span class="icon-bar"></span>
<li class="inactive"><a href="@!urlTo("/index.html")!@">home</a></li> <span class="icon-bar"></span>
<li class="active"><a href="@!urlTo("doc/index.html")!@">docs</a></li> <span class="icon-bar"></span>
<li class="inactive"><a href="@!urlTo("/about.html")!@">about</a></li> </a>
</ul> <a class="brand" href="@!urlTo("/index.html")!@">mitmproxy</a>
</div> <div class="nav">
<h1><a href="@!urlTo("/index.html")!@">mitmproxy</a> </h1> <ul class="nav">
<br> <li $!'class="active"' if this.match("/index.html", True) else ""!$> <a href="@!top!@/index.html">home</a> </li>
<p>an SSL-capable man-in-the-middle proxy</p> <li $!'class="active"' if this.under("/doc") else ""!$><a href="@!top!@/doc/index.html">docs</a></li>
</div> <li $!'class="active"' if this.under("/about.html") else ""!$><a href="@!top!@/about.html">about</a></li>
<div id="bd"> </ul>
<div id="yui-main"> </div>
<div style="" class="yui-b">
<!--(block nav)-->
<div id="nav">
<!--(block pb)-->
<a href="@!urlTo(previous)!@">prev</a>
<!--(end)-->
<!--(block nb)-->
<a href="@!urlTo(next)!@">next</a>
<!--(end)-->
$!pb if previous and not previous.parent.root else "prev"!$ |
<a href="@!urlTo('doc/index.html')!@">index</a> |
$!nb if next and not next.parent.root else "next"!$
</div>
<!--(end)-->
$!nav if this.title!="docs" else ""!$
$!title if this.title!="docs" else "<h1>mitmproxy 0.9 docs</h1>"!$
$!body!$
</div>
</div>
</div>
<div style="" id="ft">
<p>Copyright 2011 Aldo Cortesi</p>
</div> </div>
</div>
</div> </div>
$!ga!$ <div class="container">
<div class="row">
<div class="span3">
<div class="well sidebar-nav">
<ul class="nav nav-list">
$!nav("/doc/index.html", this, state)!$
$!nav("install.html", this, state)!$
$!nav("howmitmproxy.html", this, state)!$
<li class="nav-header">Tools</li>
$!nav("mitmproxy.html", this, state)!$
$!nav("mitmdump.html", this, state)!$
<li class="nav-header">Features</li>
$!nav("anticache.html", this, state)!$
$!nav("clientreplay.html", this, state)!$
$!nav("filters.html", this, state)!$
$!nav("proxyauth.html", this, state)!$
$!nav("replacements.html", this, state)!$
$!nav("serverreplay.html", this, state)!$
$!nav("setheaders.html", this, state)!$
$!nav("sticky.html", this, state)!$
$!nav("reverseproxy.html", this, state)!$
$!nav("upstreamcerts.html", this, state)!$
<li class="nav-header">SSL interception</li>
$!nav("ssl.html", this, state)!$
$!nav("certinstall/firefox.html", this, state)!$
$!nav("certinstall/osx.html", this, state)!$
$!nav("certinstall/windows7.html", this, state)!$
$!nav("certinstall/ios.html", this, state)!$
$!nav("certinstall/android.html", this, state)!$
<li class="nav-header">Transparent Proxying</li>
$!nav("transparent.html", this, state)!$
$!nav("transparent/linux.html", this, state)!$
$!nav("transparent/osx.html", this, state)!$
<li class="nav-header">Tutorials</li>
$!nav("tutorials/30second.html", this, state)!$
$!nav("tutorials/gamecenter.html", this, state)!$
<li class="nav-header">Scripting mitmproxy</li>
$!nav("scripting/inlinescripts.html", this, state)!$
$!nav("scripting/libmproxy.html", this, state)!$
</ul>
</div>
</div>
<div class="span9">
<div class="page-header">
<h1>@!this.title!@</h1>
</div>
$!body!$
</div>
</div>
<hr>
<footer>
<p>@!copyright!@</p>
</footer>
</div>


@ -1,19 +0,0 @@
## Any tips for running mitmproxy on OSX?
You can use the OSX <b>open</b> program to create a simple and effective
<b>~/.mailcap</b> file to view HTTP bodies:
<pre>
application/*; /usr/bin/open -Wn %s
audio/*; /usr/bin/open -Wn %s
image/*; /usr/bin/open -Wn %s
video/*; /usr/bin/open -Wn %s
</pre>
## I'd like to hack on mitmproxy. What should I work on?
There's a __todo__ file at the top of the source tree that outlines a variety
of tasks, from simple to complex. If you don't have your own itch, feel free to
scratch one of those!


@ -1,15 +1,11 @@
TODO:
- Clarify terminology: SSL vs TLS
Mitmproxy is an enormously flexible tool. Knowing exactly how the proxying Mitmproxy is an enormously flexible tool. Knowing exactly how the proxying
process works will help you deploy it more creatively, and let you understand process works will help you deploy it creatively, and allow you to understand
its fundamental assumptions and how to work around them. This document explains its fundamental assumptions and how to work around them. This document explains
mitmproxy's proxy mechanism by example, starting with the simplest explicit mitmproxy's proxy mechanism in detail, starting with the simplest unencrypted
proxy configuration, and working up to the most complicated interaction - explicit proxying, and working up to the most complicated interaction -
transparent proxying of SSL-protected traffic in the presence of SNI. transparent proxying of SSL-protected traffic[^ssl] in the presence of
[SNI](http://en.wikipedia.org/wiki/Server_Name_Indication).
<div class="page-header"> <div class="page-header">
@ -75,9 +71,11 @@ This is where mitmproxy's fundamental trick comes into play. The MITM in its
name stands for Man-In-The-Middle - a reference to the process we use to name stands for Man-In-The-Middle - a reference to the process we use to
intercept and interfere with these theoretically opaque data streams. The basic intercept and interfere with these theoretically opaque data streams. The basic
idea is to pretend to be the server to the client, and pretend to be the client idea is to pretend to be the server to the client, and pretend to be the client
to the server. The tricky part is that the Certificate Authority system is to the server, while we sit in the middle decoding traffic from both sides. The
tricky part is that the [Certificate
Authority](http://en.wikipedia.org/wiki/Certificate_authority) system is
designed to prevent exactly this attack, by allowing a trusted third-party to designed to prevent exactly this attack, by allowing a trusted third-party to
cryptographically sign a server's SSL certificates to verify that the certs are cryptographically sign a server's SSL certificates to verify that they are
legit. If this signature is from a non-trusted party, a secure client will legit. If this signature is from a non-trusted party, a secure client will
simply drop the connection and refuse to proceed. Despite the many shortcomings simply drop the connection and refuse to proceed. Despite the many shortcomings
of the CA system as it exists today, this is usually fatal to attempts to MITM of the CA system as it exists today, this is usually fatal to attempts to MITM
@ -86,7 +84,8 @@ an SSL connection for analysis.
Our answer to this conundrum is to become a trusted Certificate Authority Our answer to this conundrum is to become a trusted Certificate Authority
ourselves. Mitmproxy includes a full CA implementation that generates ourselves. Mitmproxy includes a full CA implementation that generates
interception certificates on the fly. To get the client to trust these interception certificates on the fly. To get the client to trust these
certificates, we register mitmproxy as a CA with the device manually. certificates, we [register mitmproxy as a trusted CA with the device
manually](@!urlTo("ssl.html")!@).
## Complication 1: What's the remote hostname? ## Complication 1: What's the remote hostname?
@ -103,25 +102,27 @@ Using the IP address is perfectly legitimate because it gives us enough
information to initiate the pipe, even though it doesn't reveal the remote information to initiate the pipe, even though it doesn't reveal the remote
hostname. hostname.
Mitmproxy has a cunning mechanism that smooths this over - upstream certificate Mitmproxy has a cunning mechanism that smooths this over - [upstream
sniffing. As soon as we see the CONNECT request, we pause the client part of certificate sniffing](@!urlTo("features/upstreamcerts.html")!@). As soon as we
the conversation, and initiate a simultaneous connection to the server. We see the CONNECT request, we pause the client part of the conversation, and
complete the SSL handshake with the server, and inspect the certificates it initiate a simultaneous connection to the server. We complete the SSL handshake
used. Now, we use the Common Name in the upstream SSL certificates to generate with the server, and inspect the certificates it used. Now, we use the Common
the dummy certificate for the client. Voila, we have the correct hostname to Name in the upstream SSL certificates to generate the dummy certificate for the
present to the client, even if it was never specified. client. Voila, we have the correct hostname to present to the client, even if
it was never specified.
## Complication 2: Subject Alternate Name ## Complication 2: Subject Alternative Name
Enter the next complication. Sometimes, the certificate Common Name is not, in Enter the next complication. Sometimes, the certificate Common Name is not, in
fact, the hostname that the client is connecting to. This is because of the fact, the hostname that the client is connecting to. This is because of the
optional Subject Alternate Name field in the SSL certificate that allows an optional [Subject Alternative
arbitrary number of alternate domains to be specified. If the expected domain Name](http://en.wikipedia.org/wiki/SubjectAltName) field in the SSL certificate
matches any of these, the client will proceed, even though the domain doesn't that allows an arbitrary number of alternative domains to be specified. If the
match the certificate Common Name. The answer here is simple: when we extract the expected domain matches any of these, the client will proceed, even though the
CN from the upstream cert, we also extract the SANs, and add them to the domain doesn't match the certificate Common Name. The answer here is simple:
generated dummy certificate. when we extract the CN from the upstream cert, we also extract the SANs, and add
them to the generated dummy certificate.
## Complication 3: Server Name Indication ## Complication 3: Server Name Indication
@ -130,9 +131,10 @@ One of the big limitations of conventional SSL is that each certificate
requires its own IP address. This means that you couldn't do virtual hosting requires its own IP address. This means that you couldn't do virtual hosting
where multiple domains with independent certificates share the same IP address. where multiple domains with independent certificates share the same IP address.
In a world with a rapidly shrinking IPv4 address pool this is a problem, and we In a world with a rapidly shrinking IPv4 address pool this is a problem, and we
have a solution in the form of the Server Name Indication extension to the SSL have a solution in the form of the [Server Name
and TLS protocols. This lets the client specify the remote server name at the Indication](http://en.wikipedia.org/wiki/Server_Name_Indication) extension to
start of the SSL handshake, which then lets the server select the right the SSL and TLS protocols. This lets the client specify the remote server name
at the start of the SSL handshake, which then lets the server select the right
certificate to complete the process. certificate to complete the process.
SNI breaks our upstream certificate sniffing process, because when we connect SNI breaks our upstream certificate sniffing process, because when we connect
@ -144,6 +146,15 @@ passed to us. Now we can pause the conversation, and initiate an upstream
connection using the correct SNI value, which then serves us the correct connection using the correct SNI value, which then serves us the correct
upstream certificate, from which we can extract the expected CN and SANs. upstream certificate, from which we can extract the expected CN and SANs.
There's another wrinkle here. Due to a limitation of the SSL library mitmproxy
uses, we can't detect that a connection _hasn't_ sent an SNI request until it's
too late for upstream certificate sniffing. In practice, we therefore make a
vanilla SSL connection upstream to sniff non-SNI certificates, and then discard
the connection if the client sends an SNI notification. If you're watching your
traffic with a packet sniffer, you'll see two connections to the server when an
SNI request is made, the first of which is immediately closed after the SSL
handshake. Luckily, this is almost never an issue in practice.
## Putting it all together ## Putting it all together
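The upstream certificate sniffing described in the hunks above (connect upstream with the client's SNI value, then read the CN and SANs from the server certificate) can be sketched with Python's standard `ssl` module. This is only an illustration under stated assumptions, not mitmproxy's implementation, which is built on PyOpenSSL; the function names `fetch_upstream_cert` and `names_from_cert` are hypothetical.

```python
import socket
import ssl


def fetch_upstream_cert(host, port=443, sni=None):
    """Connect upstream, complete the handshake, and return the peer cert dict.

    `sni` is the server name offered in the handshake; it defaults to `host`.
    The same value is also used by the ssl module for hostname verification.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=sni or host) as tls:
            return tls.getpeercert()


def names_from_cert(cert):
    """Extract (common_name, dns_sans) from a dict shaped like getpeercert()."""
    cn = None
    # "subject" is a tuple of RDNs, each a tuple of (key, value) pairs.
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                cn = value
    # "subjectAltName" is a tuple of (kind, value) pairs; keep DNS entries only.
    sans = [value for kind, value in cert.get("subjectAltName", ())
            if kind == "DNS"]
    return cn, sans
```

A proxy following this approach would pass the result of `names_from_cert(fetch_upstream_cert(ip, 443, sni=client_sni))` to whatever generates the dummy certificate for the client.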
@ -218,22 +229,28 @@ This makes transparent proxying ideal for those situations where you can't
change client behaviour - proxy-oblivious Android applications being a common change client behaviour - proxy-oblivious Android applications being a common
example. example.
To achieve this, we need to introduce two extra components. The first new To achieve this, we need to introduce two extra components. The first is a
component is a router that transparently redirects the TCP connection to the redirection mechanism that transparently reroutes a TCP connection destined for
proxy. Once the client has initiated the connection, it makes a vanilla HTTP a server on the Internet to a listening proxy server. This usually takes the
request, which might look something like this: form of a firewall on the same host as the proxy server -
[iptables](http://www.netfilter.org/) on Linux or
[pf](http://en.wikipedia.org/wiki/PF_(firewall)) on OSX. Once the client has
initiated the connection, it makes a vanilla HTTP request, which might look
something like this:
<pre>GET /index.html HTTP/1.1</pre> <pre>GET /index.html HTTP/1.1</pre>
Note that this request differs from the explicit proxy variation, in that it Note that this request differs from the explicit proxy variation, in that it
omits the scheme and hostname. How, then, do we know which upstream host to omits the scheme and hostname. How, then, do we know which upstream host to
forward the request to? The routing mechanism that has performed the forward the request to? The routing mechanism that has performed the
redirection keeps track of the original destination. Each different routing redirection keeps track of the original destination for us. Each routing
mechanism has its own ideosyncratic way of exposing this data, so this mechanism has a different way of exposing this data, so this introduces the
introduces the second component required for working transparent proxying: a second component required for working transparent proxying: a host module that
host module that knows how to retrieve the original destination address from knows how to retrieve the original destination address from the router. In
the router. Once we have this information, the process is fairly mitmproxy, this takes the form of a built-in set of
straight-forward. [modules](https://github.com/cortesi/mitmproxy/tree/master/libmproxy/platform)
that know how to talk to each platform's redirection mechanism. Once we have
this information, the process is fairly straight-forward.
<img src="transparent.png"/> <img src="transparent.png"/>
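As a rough sketch of the "host module" idea in the hunk above: on Linux with an iptables redirect, the original destination of a redirected IPv4 connection can be read from the accepted socket through the `SO_ORIGINAL_DST` socket option. The code below assumes IPv4 and Linux; the function names are hypothetical, and it is not presented as the code in `libmproxy/platform`.

```python
import socket
import struct

# Value from <linux/netfilter_ipv4.h>; the socket module does not export it.
SO_ORIGINAL_DST = 80


def parse_sockaddr_in(raw):
    """Decode a 16-byte IPv4 sockaddr_in structure into (ip, port)."""
    # Layout: 2-byte address family (skipped), 2-byte port and 4-byte address
    # in network byte order, then 8 bytes of padding.
    port, packed_ip = struct.unpack("!2xH4s8x", raw)
    return socket.inet_ntoa(packed_ip), port


def original_destination(client_sock):
    """Ask netfilter where a redirected client connection was originally headed.

    Linux-only: `client_sock` must be an accepted TCP socket whose connection
    was redirected to the proxy by the firewall.
    """
    raw = client_sock.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

Keeping the parsing separate from the `getsockopt` call means the decoding step can be checked on any platform, while only `original_destination` depends on the Linux redirection mechanism.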
@ -338,4 +355,4 @@ and cope with SNI.
</table> </table>
[^ssl]: I use "SSL" to refer to both SSL and TLS in the generic sense, unless otherwise specified.


@ -6,9 +6,15 @@ sys.path.insert(0, "..")
from libmproxy import filt from libmproxy import filt
MITMPROXY_SRC = "~/git/public/mitmproxy" MITMPROXY_SRC = "~/git/public/mitmproxy"
this.layout = countershape.Layout("_layout.html")
if ns.options.website:
this.layout = countershape.Layout("_websitelayout.html")
else:
this.layout = countershape.Layout("_layout.html")
ns.title = countershape.template.Template(None, "<h1>@!this.title!@</h1>")
this.titlePrefix = "mitmproxy 0.9 - " this.titlePrefix = "mitmproxy 0.9 - "
this.markup = markup.Markdown() this.markup = markup.Markdown(extras=["footnotes"])
ns.docMaintainer = "Aldo Cortesi" ns.docMaintainer = "Aldo Cortesi"
ns.docMaintainerEmail = "aldo@corte.si" ns.docMaintainerEmail = "aldo@corte.si"
@ -73,5 +79,4 @@ pages = [
Directory("tutorials"), Directory("tutorials"),
Page("transparent.html", "Overview"), Page("transparent.html", "Overview"),
Directory("transparent"), Directory("transparent"),
Page("faq.html", "FAQ"),
] ]

24
todo

@ -1,24 +0,0 @@
This is a loose collection of todo items, in case someone else wants to start
hacking on mitmproxy. Drop me a line (aldo@corte.si) if you want to tackle any
of these and need some pointers.
Targeted for 0.9:
- White-background colorscheme
- Extra content view modules: CSS indenter, Flash SWF info extractor
- Upstream proxy support.
- Follow mode to keep most recent flow in view
- Verbose view to show timestamps
- Search within requests/responses
- Transparent proxy support
- Ordering a-la mutt's "o" shortcut
Future:
- Add some "workspace" features to mitmproxy:
- Flow comments
- Copying/duplicating flows
- Ordering by time, size, etc. a-la-mutt (o keyboard shorcut is reserved for this)
- Support HTTP Digest authentication through the stickyauth option. We'll
have to save the server nonce, and recalculate the hashes for each request.
- Chunked encoding support for requests (we already support it for responses).
- A progress indicator for large files