mitogen/ansible_mitogen
David Wilson f78a5f08c6 issue #605: ansible: share a sem_t instead of a pthread_mutex_t
The previous version quite reliably caused worker deadlocks within 10
minutes of running:

    # 100 times:
    - import_playbook: integration/async/runner_one_job.yml
    # 100 times:
    - import_playbook: integration/module_utils/adjacent_to_playbook.yml

via .ci/soak/mitogen.sh with PLAYBOOK= set to the above playbook.

Attaching to the worker with gdb reveals it in an instruction
immediately following a futex() call, which likely returned EINTR due to
attaching gdb. Examining the pthread_mutex_t state reveals it to be
completely unlocked.

pthread_mutex_t on Linux should have zero trouble living in shmem, so
it's not clear how this deadlock is happening. Meanwhile POSIX
semaphores are explicitly designed for cross-process use and have a
completely different internal implementation, so try those instead. 1
hour of soaking reveals no deadlock.
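
As a rough illustration of the approach described above (not the code in
affinity.py), the sketch below places a process-shared POSIX semaphore
and a counter in an anonymous shared mapping through ctypes, then has
forked children increment the counter under the semaphore. It assumes
Linux, where sem_init() is reachable via ctypes.CDLL(None). The names
SemT, State, make_shared_state, next_value and demo are hypothetical,
and the 128-byte sem_t buffer is a deliberately oversized guess.

```python
import ctypes
import errno
import mmap
import os

# Assumption: Linux, where dlopen(NULL) exposes sem_init/sem_wait/sem_post.
libc = ctypes.CDLL(None, use_errno=True)


class SemT(ctypes.Structure):
    # Opaque storage, deliberately larger than any expected sizeof(sem_t).
    _fields_ = [("data", ctypes.c_uint8 * 128)]


class State(ctypes.Structure):
    _fields_ = [("lock", SemT), ("counter", ctypes.c_uint32)]


libc.sem_init.argtypes = [ctypes.POINTER(SemT), ctypes.c_int, ctypes.c_uint]
libc.sem_wait.argtypes = [ctypes.POINTER(SemT)]
libc.sem_post.argtypes = [ctypes.POINTER(SemT)]


def make_shared_state():
    # An anonymous mapping is MAP_SHARED by default on Unix, so children
    # created by fork() see the same bytes as the parent.
    buf = mmap.mmap(-1, mmap.PAGESIZE)
    state = State.from_buffer(buf)
    # pshared=1 makes the semaphore usable across processes; the initial
    # value of 1 makes it behave like a lock.
    if libc.sem_init(ctypes.byref(state.lock), 1, 1) != 0:
        raise OSError(ctypes.get_errno(), "sem_init failed")
    return state


def next_value(state):
    # Retry when a signal interrupts the wait (EINTR).
    while libc.sem_wait(ctypes.byref(state.lock)) != 0:
        err = ctypes.get_errno()
        if err != errno.EINTR:
            raise OSError(err, "sem_wait failed")
    try:
        value = state.counter
        state.counter = value + 1
        return value
    finally:
        libc.sem_post(ctypes.byref(state.lock))


def demo(children=4, increments=500):
    state = make_shared_state()
    pids = []
    for _ in range(children):
        pid = os.fork()
        if pid == 0:
            status = 1
            try:
                for _ in range(increments):
                    next_value(state)
                status = 0
            finally:
                os._exit(status)  # never fall back into the parent's code
        pids.append(pid)
    for pid in pids:
        os.waitpid(pid, 0)
    return int(state.counter)
```

If the semaphore serialises the increments correctly, demo() should
return children * increments.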

The reason for sharing a lock in memory at all is to avoid managing a
lockable temporary file on disk to contain our counter. That would mean
somehow communicating a reference to it into subprocesses (despite the
subprocess module closing inherited fds, etc), somehow deleting it
reliably at exit, and somehow avoiding concurrent Ansible runs stepping
on the same file. For now ctypes is still less pain.

A final possibility would be to abandon a shared counter and instead
pick a CPU based on the hash of e.g. the new child's process ID. That
would likely balance equally well, and might be worth exploring when
making this code work on BSD.
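
A minimal sketch of that last idea, assuming a CRC32 of the process ID
is an acceptable hash (the commit does not name one). The function name
pick_cpu is hypothetical. No shared state is needed, since each child
can compute its own CPU independently.

```python
import zlib


def pick_cpu(pid, cpus):
    """Map a process ID onto one of the given CPU numbers."""
    cpus = sorted(cpus)  # stable order regardless of input container
    if not cpus:
        raise ValueError("no CPUs to choose from")
    # Assumption: CRC32 is used only as a cheap, deterministic hash.
    digest = zlib.crc32(str(pid).encode("ascii")) & 0xFFFFFFFF
    return cpus[digest % len(cpus)]
```

On Linux with Python 3.3 or later, a child could then pin itself with
something like os.sched_setaffinity(0, {pick_cpu(os.getpid(),
os.sched_getaffinity(0))}). That call is not available on BSD, so
another mechanism would be needed there.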
2019-08-10 23:40:36 +00:00
compat ansible: create stub __init__.py for sdist. 2019-02-14 01:28:14 +00:00
plugins [linear2] update mitogen_get_stack for new _build_stack() return value 2019-07-29 16:30:01 +01:00
__init__.py ansible: restructure to avoid intermediate imports 2018-03-19 21:58:29 +05:45
affinity.py issue #605: ansible: share a sem_t instead of a pthread_mutex_t 2019-08-10 23:40:36 +00:00
connection.py issue #615: use FileService for target->controller file transfers 2019-08-10 00:37:17 +01:00
loaders.py Update copyright year everywhere. 2019-02-13 16:16:49 +00:00
logging.py issue #552: include process identity in log messages. 2019-02-25 17:25:09 +00:00
mixins.py [linear2]: merge fallout flagged by LGTM 2019-07-31 11:41:29 +01:00
module_finder.py module_finder: pass raw file to compile() 2019-07-23 16:04:44 +01:00
parsing.py ansible: remove cutpasted docstring 2019-08-02 04:05:28 +01:00
planner.py ansible: abstract worker process model. 2019-07-29 13:52:30 +01:00
process.py Split out and make readable more log messages across both packages 2019-08-04 14:41:47 +01:00
runner.py issue #600: /etc/environment may be non-ASCII in an unknown encoding 2019-08-01 12:12:18 +01:00
services.py [linear2] merge fallout: re-enable _send_module_forwards(). 2019-08-04 20:43:14 +01:00
strategy.py ansible: cleanup various docstrings 2019-08-04 12:14:48 +01:00
target.py issue #575: fix exception text rendering 2019-04-02 14:06:41 +01:00
transport_config.py Add buildah transport 2019-06-08 18:15:58 -05:00