Commit Graph

334 Commits

Author SHA1 Message Date
David Wilson bac28bc5ca issue #76, #370: add fix for disconnect cleanup test
Simply listen to RouteMonitor's Context "disconnect" and forget
contexts according to RouteMonitor's rules, rather than duplicate them
(and screw it up).
2018-11-01 20:06:09 +00:00
David Wilson c148c869e6 issue #76, #370: add disconnect cleanup test 2018-11-01 20:04:18 +00:00
David Wilson 58c0e45661 issue #400: rework the monkeypatch. 2018-11-01 15:58:28 +00:00
David Wilson c9ecc82f85 issue #400: add logic to work around AWX callback bug. 2018-11-01 13:33:51 +00:00
David Wilson 6c71c5bfef issue #369: disable reset_connection on Ansible<2.5.6
https://github.com/ansible/ansible/issues/27520
2018-10-31 18:30:03 +00:00
David Wilson c4aec22a33 issue #369: fix one more _reset() reference. 2018-10-31 16:43:27 +00:00
David Wilson 6107ebdc0d issue #396: fix compatibility with Connection._reset(). 2018-10-31 16:43:27 +00:00
David Wilson d280bba02b issue #369: fix KeyError during new context start.
Update _via_by_context earlier; fixes:

    Traceback (most recent call last):
      File "/Users/dmw/src/mitogen/mitogen/service.py", line 519, in _on_service_call
        return invoker.invoke(method_name, kwargs, msg)
      File "/Users/dmw/src/mitogen/mitogen/service.py", line 253, in invoke
        response = self._invoke(method_name, kwargs, msg)
      File "/Users/dmw/src/mitogen/mitogen/service.py", line 239, in _invoke
        ret = method(**kwargs)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 454, in get
        reraise(*result)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 412, in _wait_or_start
        response = self._connect(key, spec, via=via)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 363, in _connect
        self._update_lru(context, spec, via)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 266, in _update_lru
        self._update_lru_unlocked(new_context, spec, via)
      File "/Users/dmw/src/mitogen/ansible_mitogen/services.py", line 253, in _update_lru_unlocked
        if self._refs_by_context[context] == 0:
    KeyError: Context(1008, u'ssh.localhost.sudo.mitogen__user3')
2018-10-31 16:43:27 +00:00
David Wilson 54445470e2 issue #409: add missing path config variables to several plugins
So every method can be redirected to a stub implementation.
2018-10-31 12:40:08 +00:00
David Wilson a68675af8b issue #409: fix reference error in kubectl.py. 2018-10-31 12:39:43 +00:00
David Wilson 05f9fb4dd8 issue #409: don't run kubectl test in <2.5. 2018-10-31 12:13:50 +00:00
David Wilson 18af1dfb51 ansible: kubectl_path argument appears in wrong connection method
Closes #409.
2018-10-31 00:48:11 +00:00
David Wilson 96f000c5ea ansible: tilde-expand SSH key before passing to SSH; closes #334. 2018-10-30 14:58:35 +00:00
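A minimal sketch of the tilde expansion described in 96f000c5ea, using `os.path.expanduser()` from the standard library. The helper name `build_ssh_args` and its arguments are hypothetical and are not taken from ansible_mitogen.

```python
import os


def build_ssh_args(hostname, identity_file=None):
    """Hypothetical helper: build an ssh argument list."""
    args = ["ssh"]
    if identity_file:
        # Expand a leading "~" or "~user" here instead of relying on a shell.
        args += ["-i", os.path.expanduser(identity_file)]
    args.append(hostname)
    return args
```

For example, `build_ssh_args("host1", "~/.ssh/id_rsa")` yields a key path rooted in the current user's home directory.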
David Wilson 536690760d issue #369: teach CallChain to reset the connection. 2018-10-29 19:41:03 +00:00
David Wilson 33412927f5 issue #369: refactor Connection to support reset()
Now the tests pass.
2018-10-29 19:34:50 +00:00
David Wilson 9b7c958e2e issue #369: refactor ContextService to support reset(). 2018-10-29 19:30:56 +00:00
David Wilson d0f5671887 ansible: split key_from_dict() out into free function. 2018-10-29 15:27:20 +00:00
David Wilson 0dc3f8accf ansible: fix another target.py format string. 2018-10-26 11:26:15 +01:00
David Wilson 7e04ee8af9 ansible: fix is_good_temp_dir() log format 2018-10-26 11:11:25 +01:00
David Wilson fba52a0edf issue #76: add API for ansible_mitogen to get route list
Earlier commit moved Stream.routes attribute into a private map
belonging to RouteMonitor, to make upgrades smoother. This adds a new
accessor method to RouteMonitor.
2018-10-24 13:02:46 +01:00
David Wilson 7fd9fb0014 issue #397: fix another case where stray tmpdirs can be left behind.
Newer Ansibles use atexit.register() to invoke cleanup, so we need to
run those registrations after each run.
2018-10-23 15:29:03 +01:00
David Wilson 1b17aa1d1a ansible: fix temp cleanup regression and add test; closes #397. 2018-10-23 14:42:44 +01:00
David Wilson 9d070541d9 ansible: try to create tempdir if missing.
Closes #358.
2018-10-02 21:06:00 +01:00
David Wilson 4c81eba599 Merge commit 'refs/pull/377/head' of github.com:dw/mitogen into dmw
(Pull #377)

Changes:
- additional_parameters -> extra_args
- Merge with kubectl changes from dmw branch
- Update docs
- Remove unused username class member
- Avoid mutable kubectl_args class member
- Use six.iteritems
2018-10-02 20:00:00 +01:00
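The "avoid mutable class member" item in the change list above refers to a common Python pitfall, sketched here with hypothetical class and attribute names (not the actual kubectl connection code): a list defined on the class is shared by every instance.

```python
class SharedArgs:
    # Pitfall: one list object shared by every instance.
    extra_args = []

    def add(self, arg):
        self.extra_args.append(arg)


class PerInstanceArgs:
    def __init__(self):
        # Each instance gets its own list.
        self.extra_args = []

    def add(self, arg):
        self.extra_args.append(arg)


a, b = SharedArgs(), SharedArgs()
a.add("--namespace=dev")
shared_leak = b.extra_args      # b sees the argument added through a

c, d = PerInstanceArgs(), PerInstanceArgs()
c.add("--namespace=dev")
isolated = d.extra_args         # d is unaffected
```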
David Wilson 7a00e1cc87 issue #360: missing locks around shutdown and LRU management. 2018-10-02 19:19:30 +01:00
David Wilson 498db57ec8 issue #360: ansible: missing lock around ContextService.put(). 2018-10-02 19:19:30 +01:00
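A minimal sketch of the kind of locking the two issue #360 commits describe, assuming reference counts are kept in a dictionary that several threads update. `RefCounts` is a hypothetical class, not the ContextService implementation; the point is only that `put()` must take the same lock as `get()`.

```python
import threading


class RefCounts:
    """Hypothetical reference counter keyed by context name."""

    def __init__(self):
        self._lock = threading.Lock()
        self._refs = {}

    def get(self, key):
        with self._lock:
            self._refs[key] = self._refs.get(key, 0) + 1

    def put(self, key):
        # The read-modify-write must happen under the same lock as get().
        with self._lock:
            self._refs[key] -= 1
            if self._refs[key] == 0:
                del self._refs[key]

    def count(self, key):
        with self._lock:
            return self._refs.get(key, 0)


def worker(refs, key, iterations):
    for _ in range(iterations):
        refs.get(key)
        refs.put(key)


refs = RefCounts()
refs.get("ctx")  # hold one reference for the duration of the run
threads = [
    threading.Thread(target=worker, args=(refs, "ctx", 1000))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
final_count = refs.count("ctx")
```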
David Wilson f8bf780e21 issue #362: Py3.x fixes. 2018-10-02 19:19:30 +01:00
David Wilson f8b6c774dd issue #362: cap max open files in children. 2018-10-02 19:19:30 +01:00
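One way to cap the open file limit, sketched with the standard `resource` module. The function name and the default of 512 are assumptions for illustration; the commit does not state the value it uses or where the cap is applied.

```python
import resource


def cap_open_files(limit=512):
    """Lower the soft RLIMIT_NOFILE to `limit` if it is currently higher."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft > limit:
        # Lowering the soft limit never requires privileges.
        resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

A function like this could be passed as `preexec_fn` to `subprocess.Popen` so that it runs only in the child.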
David Wilson 5521945bd2 ansible: temporary files take 5. 2018-10-02 19:19:30 +01:00
Yannig Perré 17548d1e49 [Enhancement] handle kubectl vars from Ansible connector.
This change allows the kubectl connector to support the same options as
Ansible's original connector.

The playbook sample comes with an example of a pod containing two containers
and checking that moving from one container to another, the version of Python
changes as expected.
2018-10-02 11:54:15 +02:00
Yannig Perré 6828926a36 Kubernetes connection support for mitogen. 2018-09-19 16:52:20 +02:00
David Wilson 638b196a45 ansible: fix put_file() for large temporary files.
Reverts 49736b3a: large file copies can't avoid the RTT.

The parent stack must be blocked while FileService progresses, as unlike
the small file path, it does not make a snapshot of the (possibly
temporary) file passed by the action plug-in. So we need to keep that
file alive while the service runs.

Add a new integration test and a new soak test to cover both.
2018-09-10 19:09:37 +01:00
David Wilson 9ff34afafe ansible: fix regression. 2018-09-10 02:28:29 +01:00
David Wilson 90f89f95fb ansible: fix exec_command() regression. 2018-09-10 01:31:15 +01:00
David Wilson 49736b3ae8 ansible: fix FileService call, and remove another roundtrip. 2018-09-10 00:31:16 +01:00
David Wilson e241081cae ansible: stop sharing target temp_dir in runner.
This cannot work with delegate_to, since delegate_to permits multiple
concurrent tasks to be executing on the same target.
2018-09-09 23:41:53 +01:00
David Wilson 43d9815f6d ansible: use CallChain everywhere.
This replaces the 'dump to logger' behaviour of pipelined calls from
before with a call chain that returns any exception on next synchronized
call.
2018-09-09 20:29:02 +01:00
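The behaviour described in 43d9815f6d, where a failure in a pipelined call is reported on the next synchronized call instead of being dumped to a logger, can be sketched locally as below. `DeferredErrorChain` is a hypothetical class that runs functions in-process; it is not Mitogen's CallChain, which issues calls to a remote context.

```python
class DeferredErrorChain:
    """Hypothetical sketch; not Mitogen's CallChain implementation."""

    def __init__(self):
        self._pending_error = None

    def call_no_reply(self, func, *args, **kwargs):
        """Fire-and-forget: remember the first failure instead of logging it."""
        if self._pending_error is not None:
            return  # chain already broken; skip later calls
        try:
            func(*args, **kwargs)
        except Exception as exc:
            self._pending_error = exc

    def call(self, func, *args, **kwargs):
        """Synchronized call: surface any earlier failure first."""
        if self._pending_error is not None:
            error, self._pending_error = self._pending_error, None
            raise error
        return func(*args, **kwargs)


def write_file(store, name, data):
    if data is None:
        raise ValueError("no data for %s" % name)
    store[name] = data


store = {}
chain = DeferredErrorChain()
chain.call_no_reply(write_file, store, "a.txt", b"ok")
chain.call_no_reply(write_file, store, "b.txt", None)  # failure is deferred

try:
    chain.call(len, store)
    surfaced = None
except ValueError as exc:
    surfaced = str(exc)
```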
David Wilson b254eb3399 ansible: fix non-action connection instantiation.
e.g. by synchronize module.
2018-09-08 22:35:42 +01:00
David Wilson 705d77a9be ansible: remove a bunch more aliasing from connection.py. 2018-09-08 22:31:19 +01:00
David Wilson 66142e7d75 ansible: fork isolated tasks from correct parent.
Closes #355.
2018-09-08 22:17:39 +01:00
David Wilson da8c6b45b0 ansible: remove task_vars aliasing from connection.py.
Crazy spam creep.
2018-09-08 21:59:17 +01:00
David Wilson 32751cd356 master: allow batching context switches for forward_modules()
-7 switches per task.
2018-09-08 20:53:11 +01:00
David Wilson 86942b6bf9 ansible: add explanatory exception
If disconnection occurs during a Connection.call(), return
AnsibleConnectionFailure.
2018-09-08 20:53:11 +01:00
David Wilson 3c6b72b452 ansible: gracefully return (and explain) ChannelError in ContextService.
When Ansible abnormally shuts down, the broker begins
force-disconnecting every context, including those for which connection
is currently in-progress.

When that happens, .call(init_child) throws ChannelError, and that needs
to be returned to the worker, assuming the worker still even exists.

This solution is incomplete: with sick nodes, it's also possible the
worker died naturally, and so the worker should perhaps respond by
retrying the connection.

Previously, the unhandled ChannelError would spam the console when e.g.
fork() began returning EAGAIN.
2018-09-08 20:53:11 +01:00
David Wilson e647adc62e ansible: copy GIL change from linear2 branch.
Reduces runtime by 25% given 100 25ms SSH targets:

    ANSIBLE_STRATEGY=mitogen \
    MITOGEN_POOL_SIZE=100 \
    /usr/bin/time -l ansible k3-x100 -m shell -a hostname

Before:
           39.56 real        35.29 user        17.24 sys
      59600896  maximum resident set size
       1784252  page reclaims
          9016  messages sent
         10382  messages received
         18774  voluntary context switches
        770070  involuntary context switches

After:
           29.79 real        22.10 user        11.77 sys
      59281408  maximum resident set size
       1725268  page reclaims
          8582  messages sent
          9959  messages received
         14582  voluntary context switches
         75280  involuntary context switches
2018-09-08 20:53:11 +01:00
David Wilson 2647f73501 ansible: bump UNIX listener default backlog, and set it to match forks.
The connection multiplexer can expect to not be scheduled at least until
each of the $forks worker processes has attempted a connection, so the
backlog must be able to hold every worker.
2018-09-08 20:53:11 +01:00
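A minimal sketch of sizing a UNIX listener backlog to the number of workers, as 2647f73501 describes. `make_listener`, `connect_all`, and the `minimum_backlog` parameter are hypothetical names; the real default is not stated in the commit. Note that the operating system may silently cap the backlog (for example via `somaxconn` on Linux).

```python
import os
import socket
import tempfile


def make_listener(path, forks, minimum_backlog=5):
    """Hypothetical helper: UNIX listener whose backlog covers every worker."""
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(max(forks, minimum_backlog))
    return listener


def connect_all(path, count, timeout=5.0):
    """Open `count` client connections before the listener accepts any."""
    clients = []
    for _ in range(count):
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.settimeout(timeout)
        client.connect(path)
        clients.append(client)
    return clients
```

With a backlog at least as large as the worker count, every worker's `connect()` can complete even though the listening process has not yet called `accept()`.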
David Wilson 8ab11f415f ansible: better support for diagnosing hangs
* Always enable the faulthandler module in the top-level process if it
  is available.
* Make MITOGEN_DUMP_THREAD_STACKS interval configurable, to better
  handle larger runs.
* Add docs subsection on diagnosing hangs.

Conflicts:
	ansible_mitogen/process.py
2018-09-08 20:53:11 +01:00
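The hang diagnostics listed in 8ab11f415f can be approximated with the standard library as sketched below. All three function names are hypothetical, and treating the MITOGEN_DUMP_THREAD_STACKS value as a number of seconds is an assumption; the commit only says the interval became configurable.

```python
import sys
import threading
import traceback


def enable_fault_handler():
    """Enable faulthandler if the module and a usable stderr exist."""
    try:
        import faulthandler
        faulthandler.enable()
        return True
    except (ImportError, AttributeError, ValueError, RuntimeError):
        return False


def interval_from_env(environ, name="MITOGEN_DUMP_THREAD_STACKS"):
    """Assumed parsing: a positive number of seconds, else disabled (None)."""
    try:
        interval = float(environ.get(name, ""))
    except ValueError:
        return None
    return interval if interval > 0 else None


def format_thread_stacks():
    """Return a text dump of every live thread's current stack."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):\n" % (ident, names.get(ident, "?")))
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)
```

A background thread could call `format_thread_stacks()` every `interval_from_env(os.environ)` seconds and write the result to a log.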
David Wilson 9792b8b54f ansible: use template-expanded delegate_to hostname in one more location. 2018-09-08 20:53:11 +01:00
David Wilson 90c2ed03d0 ansible: fix synchronize module
Broken by recent connection delegation fixes.
2018-08-20 15:43:56 +01:00
David Wilson 7458dfae85 ansible: avoid roundtrip for small file transfers.
Calls to connect.put_file() where the file is small enough to fit in a
single RPC proceed without waiting for an RPC response. If the write
fails, the target context will log an exception, and any subsequent step
depending on the written file will fail.

I verified every built-in action plugin for file transfer calls, and
they all depend on the transferred file in the following step, so this
should be safe.

Reduces template/copy actions to 2 RTT. Over a 250ms link,
loop-20-templates.yml runtime drops to 10 seconds, from 30 seconds with
v0.2.2 and from 123 seconds with vanilla Ansible with pipelining
enabled.
2018-08-20 15:18:03 +01:00
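As a back-of-envelope check on the figures in 7458dfae85, the sketch below assumes the playbook runs 20 template tasks (inferred from the file name) and that runtime is dominated by link latency. Both are assumptions; the per-task round trip counts for v0.2.2 and vanilla are derived here, not stated in the commit.

```python
def latency_bound_runtime(tasks, round_trips_per_task, rtt_seconds):
    """Runtime in seconds if every round trip costs one full RTT."""
    return tasks * round_trips_per_task * rtt_seconds


def implied_round_trips(total_seconds, tasks, rtt_seconds):
    """Round trips per task implied by a measured total runtime."""
    return total_seconds / (tasks * rtt_seconds)


TASKS = 20   # assumed from the name loop-20-templates.yml
RTT = 0.25   # 250ms link

after = latency_bound_runtime(TASKS, 2, RTT)         # 2 RTT per task
before_v022 = implied_round_trips(30, TASKS, RTT)    # from the 30 second figure
vanilla = implied_round_trips(123, TASKS, RTT)       # from the 123 second figure
```

Under these assumptions, 2 RTT per task gives the quoted 10 seconds, and the 30 second figure corresponds to roughly 6 round trips per task.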