Ansible Extension
=================
.. image:: images/ansible/cell_division.png
    :align: right

An experimental extension to `Ansible`_ is included that implements host
connections over Mitogen, replacing embedded shell invocations with pure-Python
equivalents invoked via highly efficient remote procedure calls tunnelled over
SSH. No changes are required to the target hosts.
The extension is not yet generally dependable; however, it already works well
enough for testing against real-world playbooks. `Bug reports`_ in this area
are very welcome: Ansible is a huge beast, and only significant testing will
prove the extension's soundness.
.. _Ansible: https://www.ansible.com/
.. _Bug reports: https://goo.gl/yLKZiJ
Overview
--------
You should **expect a 1.25x - 7x speedup** and a **CPU usage reduction of at
least 2x**, depending on network conditions, the specific modules executed, and
time spent by the target host already doing useful work. Mitogen cannot speed
up a module once it is executing; it can only ensure the module executes as
quickly as possible.
* **A single SSH connection is used for each target host**, in addition to one
sudo invocation per distinct user account. Subsequent playbook steps always
reuse the same connection. This is much better than SSH multiplexing combined
with pipelining, as significant state can be maintained in RAM between steps,
and the system logs aren't filled with spam from repeat SSH and sudo
invocations.
* **A single Python interpreter is used** per host and sudo account combination
for the duration of the run, avoiding the repeat cost of invoking multiple
interpreters and recompiling imports, saving 300-800 ms for every playbook
step.
* Remote interpreters reuse Mitogen's module import mechanism, caching uploaded
dependencies between steps at the host and user account level. As a
consequence, **bandwidth usage is consistently an order of magnitude lower**
compared to SSH pipelining, and around 5x fewer frames are required to
traverse the wire for a run to complete successfully.
* **No writes to the target host's filesystem occur**, unless explicitly
triggered by a playbook step. In all typical configurations, Ansible
repeatedly rewrites and extracts ZIP files to multiple temporary directories
on the target host. Since no temporary files are used, security issues
relating to those files in cross-account scenarios are entirely avoided.
Limitations
-----------
This is a proof of concept: issues below are exclusively due to code immaturity.
High Risk
~~~~~~~~~
* Connection establishment is single-threaded until more pressing issues are
solved. To evaluate performance, target only one host. Many hosts still work;
the first playbook step will simply run unnecessarily slowly.
* `Asynchronous Actions And Polling
<http://docs.ansible.com/ansible/latest/playbooks_async.html>`_ has received
minimal testing.
* Transfer of large (i.e. GB-sized) files using certain Ansible-internal APIs,
such as those triggered via the ``copy`` module, will cause corresponding
temporary memory and CPU spikes on both the host and target machines, due to
delivering the file as a single large message. If many machines are targeted
with a large file, the host machine could easily exhaust available RAM. This
will be fixed soon, as it is likely to be triggered by common playbook use
cases.
* Situations may exist where the playbook's execution conditions are not
respected; however, ``delegate_to``, ``connection: local``, ``become``,
``become_user``, and ``local_action`` have all been tested.
Low Risk
~~~~~~~~
* Only UNIX machines running Python 2.x are supported; Windows will come later.
* Only the ``sudo`` become method is available; however, adding new methods is
straightforward, and eventually at least ``su`` will be included.
* The only supported strategy is ``linear``, which is Ansible's default.
* In some cases ``remote_tmp`` may not be respected.
* Interaction with modules employing special action plugins is minimally
tested, except for the ``synchronize``, ``template`` and ``copy`` modules.
* For now only Python command modules work; however, almost all modules shipped
with Ansible are Python-based.
* Uncaptured standard output of remotely executing modules and shell commands
is logged to the console. This will be fixed in a later version.
* Ansible defaults to requiring pseudo TTYs for most SSH invocations, in order
to allow it to handle ``sudo`` with ``requiretty`` enabled, however it
disables pseudo TTYs for certain commands where standard input is required or
``sudo`` is not in use. Mitogen does not require this, as it can simply call
:py:func:`pty.openpty` from the SSH user account during ``sudo`` setup.
A major downside to Ansible's default is that stdout and stderr of any
resulting executed command are merged, with additional carriage return
characters synthesized in the output by the TTY layer. Neither of these
problems occurs with the Mitogen extension, which may break playbooks that
depend on the merged output.
A future version will emulate Ansible's behaviour, once it is clear precisely
what that behaviour is supposed to be. See `Ansible#14377`_ for related
discussion.
.. _Ansible#14377: https://github.com/ansible/ansible/issues/14377
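
The TTY side effects described above can be reproduced without Ansible or
Mitogen. The following is a minimal sketch, assuming a UNIX host; the helper
names ``run_with_pipes`` and ``run_with_pty`` are hypothetical and are not part
of either project. It runs the same child process once over plain pipes and
once over a pseudo TTY created with :py:func:`pty.openpty`.

.. code-block:: python

    import os
    import pty
    import subprocess
    import sys

    # Child program: one line on stdout, then one line on stderr.
    CHILD = (
        "import sys; "
        "sys.stdout.write('out\\n'); sys.stdout.flush(); "
        "sys.stderr.write('err\\n')"
    )


    def run_with_pipes():
        """Run the child with separate pipes: the two streams stay distinct."""
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        return proc.communicate()


    def run_with_pty():
        """Run the child with stdout and stderr attached to one pseudo TTY."""
        master_fd, slave_fd = pty.openpty()
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD],
            stdout=slave_fd,
            stderr=slave_fd,
        )
        os.close(slave_fd)  # Only the child keeps the slave end open.
        chunks = []
        while True:
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                break  # Linux reports EIO once the slave end is closed.
            if not data:
                break  # Other platforms report end-of-file instead.
            chunks.append(data)
        os.close(master_fd)
        proc.wait()
        return b"".join(chunks)

With pipes, the caller receives two separate byte strings. With the pseudo TTY,
both lines arrive on a single stream, and the TTY layer normally rewrites each
newline as a carriage return followed by a newline.
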
Behavioural Differences
-----------------------
* Ansible with SSH multiplexing enabled causes a string like ``Shared
connection to host closed`` to appear in ``stderr`` output of every executed
command. This never manifests with the Mitogen extension.
* Asynchronous jobs execute in a thread of the single target Python
interpreter. In future this will be replaced with subprocesses, as it's
likely some use cases spawn many asynchronous jobs.
Configuration
-------------
.. warning::

    Don't test the prototype in a live environment until this notice is removed.

1. Ensure the host machine is using Python 2.x for Ansible by verifying the
output of ``ansible --version``. Ensure the ``python`` command starts a
Python 2.x interpreter. If not, replace ``python`` with the correct
command in steps 2 and 3.
2. Run ``python -m pip install -U git+https://github.com/dw/mitogen.git`` **on the
host machine only**.
3. Run ``python -c 'import ansible_mitogen as a; print a.__path__'`` to print
the directory where the extension was installed.
4. Add ``strategy_plugins = /path/to/../ansible_mitogen/plugins/strategy`` using the
path from above to the ``[defaults]`` section of ``ansible.cfg``.
5. Add ``strategy = mitogen`` to the ``[defaults]`` section of ``ansible.cfg``.
6. Cross your fingers and try it out.
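
Steps 4 and 5 amount to adding two settings to ``ansible.cfg``. As a rough
sketch, the lines can be generated from the package directory printed in step
3. The helper name ``mitogen_defaults`` and the ``/opt/example`` path are
hypothetical and used only for illustration; substitute the path reported on
your own machine.

.. code-block:: python

    import os


    def mitogen_defaults(package_dir):
        """Return the ``[defaults]`` lines described in steps 4 and 5.

        ``package_dir`` is the ``ansible_mitogen`` directory printed in step 3.
        """
        strategy_dir = os.path.join(package_dir, "plugins", "strategy")
        return "\n".join([
            "[defaults]",
            "strategy_plugins = %s" % strategy_dir,
            "strategy = mitogen",
        ])


    # Hypothetical install location, shown only as an example.
    print(mitogen_defaults("/opt/example/ansible_mitogen"))

If ``ansible.cfg`` already has a ``[defaults]`` section, add only the two
setting lines to it instead of creating a second section.
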
Demo
----
Local VM connection
~~~~~~~~~~~~~~~~~~~
This demonstrates Mitogen vs. connection pipelining to a local VM, executing
the 100 simple repeated steps of ``run_hostname_100_times.yml`` from the
examples directory. Mitogen requires **43x less bandwidth and 4.25x less
time**.
.. image:: images/ansible/run_hostname_100_times.png
Kathmandu to Paris
~~~~~~~~~~~~~~~~~~
This is a full Django application playbook over a ~180ms link between Kathmandu
and Paris. Aside from large pauses where the host performs useful work, the
high latency of this link means Mitogen only manages a 1.7x speedup.
Many early roundtrips are due to inefficiencies in Mitogen's importer that will
be fixed over time; however, the majority, comprising at least 10 seconds, are
due to idling while the host's previous result and next command are in-flight
on the network.
The initial extension lays groundwork for exciting structural changes to the
execution model: a future version will tackle latency head-on by delegating
some control flow to the target host, melding the performance and scalability
benefits of pull-based operation with the management simplicity of push-based
operation.
.. image:: images/ansible/costapp.png
SSH Variables
-------------
This list will grow as more missing pieces are discovered.
* ansible_python_interpreter
* ansible_ssh_timeout
* ansible_host, ansible_ssh_host
* ansible_user, ansible_ssh_user
* ansible_port, ssh_port
* ansible_ssh_executable, ssh_executable
* ansible_ssh_private_key_file
* ansible_ssh_pass, ansible_password (default: assume passwordless)
* ssh_args, ssh_common_args, ssh_extra_args
Sudo Variables
--------------
* ansible_python_interpreter
* ansible_sudo_exe, ansible_become_exe
* ansible_sudo_user, ansible_become_user (default: root)
* ansible_sudo_pass, ansible_become_pass (default: assume passwordless)
* sudo_flags, become_flags
Debugging
---------
See :ref:`logging-env-vars` in the Getting Started guide for environment
variables that activate debug logging.
Implementation Notes
--------------------
Interpreter Reuse
~~~~~~~~~~~~~~~~~
The extension aggressively reuses the single target Python interpreter to
execute every module. While this works well, it violates an unwritten
assumption regarding Ansible modules, so a buggy module could cause a run to
fail, or unrelated modules could interact with each other due to bad hygiene.
Mitigations will be added as necessary if problems of this sort ever actually
manifest.
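
To illustrate the kind of interaction meant here, the sketch below runs two
hypothetical "modules" as plain functions in one interpreter. Neither function
is a real Ansible module, and ``DEMO_LEAKED_SETTING`` is an invented variable
name. The first function changes process-wide state and never restores it, so
the second observes the change.

.. code-block:: python

    import os


    def buggy_module():
        """Hypothetical module that leaks a process-wide environment change."""
        os.environ["DEMO_LEAKED_SETTING"] = "1"
        return {"changed": False}


    def unrelated_module():
        """Hypothetical module that runs later in the same interpreter."""
        return {"saw_leak": "DEMO_LEAKED_SETTING" in os.environ}


    def run_in_shared_interpreter():
        """Run both modules back to back, as a reused interpreter would."""
        os.environ.pop("DEMO_LEAKED_SETTING", None)  # Start from a clean state.
        buggy_module()
        result = unrelated_module()
        os.environ.pop("DEMO_LEAKED_SETTING", None)  # Clean up after the demo.
        return result

When each module runs in a fresh interpreter, the leaked variable disappears
with the process. With a reused interpreter it survives until something
removes it.
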
Patches
~~~~~~~
Three small patches are employed to hook into Ansible in desirable locations,
in order to override uses of shell, the module executor, and the mechanism for
selecting a connection plug-in. While it is hoped the patches can be avoided in
future, for interesting versions of Ansible deployed today this simply is not
possible, and so they continue to be required.
The patches are well defined and act conservatively, including by disabling
themselves when non-Mitogen connections are in use. Additional third party
plug-ins are unlikely to attempt similar patches, so the risk to an established
configuration should be minimal.
Flag Emulation
~~~~~~~~~~~~~~
Mitogen re-parses ``sudo_flags``, ``become_flags``, and ``ssh_flags`` using
option parsers extracted from ``sudo(1)`` and ``ssh(1)`` in order to emulate
their equivalent semantics. This allows:
* robust support for common ``ansible.cfg`` tricks without reconfiguration,
such as forwarding SSH agents across ``sudo`` invocations,
* reporting on conflicting flag combinations,
* reporting on unsupported flag combinations,
* avoiding opening the extension up to untestable scenarios where users can
insert arbitrary garbage between Mitogen and the components it integrates
with,
* precise emulation by an alternative implementation, for example if Mitogen
grew support for Paramiko.
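
The general idea of re-parsing a flag string can be sketched with the standard
library. This is an illustration only, not the parser Mitogen uses: the
function names are hypothetical, and only a handful of ``sudo(1)`` options are
modelled, in simplified form.

.. code-block:: python

    import argparse
    import shlex


    def make_sudo_parser():
        """Build a parser for a small, simplified subset of sudo(1) options."""
        parser = argparse.ArgumentParser(prog="sudo", add_help=False)
        parser.add_argument("-H", "--set-home", action="store_true")
        parser.add_argument("-S", "--stdin", action="store_true")
        parser.add_argument("-n", "--non-interactive", action="store_true")
        parser.add_argument("-E", "--preserve-env", action="store_true")
        parser.add_argument("-u", "--user", default="root")
        return parser


    def parse_sudo_flags(flags):
        """Parse a flag string, rejecting anything the parser does not know."""
        known, unknown = make_sudo_parser().parse_known_args(shlex.split(flags))
        if unknown:
            raise ValueError("unsupported sudo flags: %s" % " ".join(unknown))
        return known


    options = parse_sudo_flags("-H -S -n")

Because the flags are parsed into named options instead of being passed
through verbatim, unsupported input can be reported up front, and the same
parsed options could in principle drive a different implementation.
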