Commit Graph

586 Commits

Author SHA1 Message Date
David Wilson 1777b8f42e ansible: use DeduplicatingService for ContextService; closes #162. 2018-03-23 09:30:41 +05:45
David Wilson f6b5d9f2f6 issue #162: implement mitogen.service.DeduplicatingService
This abstracts the pattern found in parent.ModuleForwarder and to a
lesser degree master.ModuleResponder. We can probably use it in those
contexts later.
2018-03-23 09:29:39 +05:45
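
The deduplication pattern named in the commit above can be illustrated with a minimal single-threaded sketch. It is not the mitogen.service.DeduplicatingService implementation: the class name Deduplicator, its methods, the produce_calls counter and the 'ctx-1' key are all hypothetical, and the caching of finished responses is an assumption made for the example.

```python
class Deduplicator:
    """Hypothetical sketch: run an expensive producer once per key and
    fan the result out to every caller that asked for that key."""

    def __init__(self, produce):
        # produce(key, done) starts the work and later calls done(value).
        self._produce = produce
        self._responses = {}   # key -> finished value (assumed cache)
        self._waiters = {}     # key -> reply callbacks awaiting the value
        self.produce_calls = 0

    def request(self, key, reply):
        if key in self._responses:
            reply(self._responses[key])        # already finished
        elif key in self._waiters:
            self._waiters[key].append(reply)   # in progress: just wait
        else:
            self._waiters[key] = [reply]       # first request starts work
            self.produce_calls += 1
            self._produce(key, lambda value: self._finish(key, value))

    def _finish(self, key, value):
        self._responses[key] = value
        for reply in self._waiters.pop(key):
            reply(value)


# Usage: completion is deferred by hand to simulate slow work.
pending = {}

def slow_produce(key, done):
    pending[key] = done

dedup = Deduplicator(slow_produce)
got = []
dedup.request('ctx-1', got.append)   # starts the work
dedup.request('ctx-1', got.append)   # joins the request in progress
pending['ctx-1']('result')           # the work completes
dedup.request('ctx-1', got.append)   # answered from the stored response
```

A real service would need locking around the two dictionaries, since requests may arrive on several threads.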
David Wilson 65fcef2374 core: mark every side O_CLOEXEC
Not sure why this wasn't done before, seems it should have always been
this way, and can't see any reason it wasn't. Without it, many fds are
leaked into at least .local() children. Closes #163.
2018-03-23 06:58:59 +05:45
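
As a rough illustration of the close-on-exec marking described in the commit above, the sketch below sets FD_CLOEXEC on both ends of a pipe using the standard fcntl module. The helper names set_cloexec and is_cloexec are hypothetical, not taken from mitogen.core. The descriptors are first made inheritable because Python 3.4 and later already create them non-inheritable.

```python
import fcntl
import os


def set_cloexec(fd):
    """Set FD_CLOEXEC so the descriptor is closed across exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def is_cloexec(fd):
    """Report whether FD_CLOEXEC is currently set on the descriptor."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


rfd, wfd = os.pipe()
# Simulate descriptors that would otherwise leak into exec()'d children.
os.set_inheritable(rfd, True)
os.set_inheritable(wfd, True)
was_cloexec = (is_cloexec(rfd), is_cloexec(wfd))

# Mark every side, as the commit title puts it.
for fd in (rfd, wfd):
    set_cloexec(fd)
now_cloexec = (is_cloexec(rfd), is_cloexec(wfd))

os.close(rfd)
os.close(wfd)
```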
David Wilson 2d7821b824 tests: test_stream_name: fix non-localhost Docker 2018-03-22 13:25:56 +05:45
David Wilson 08612d4ca2 issue #155: fix call_function_test regression
It's entirely unclear how test_aborted_on_local_context_disconnect ever
passed, but it was broken by the previous commit.
2018-03-22 12:57:05 +05:45
David Wilson 54ff1c90fa issue #155: add DEL_ROUTE, propagate ADD_ROUTE upwards
* IDs are allocated by the parent responsible for constructing a new
  child, using ALLOCATE_ID to the master as necessary to allocate new ID
  ranges.

* ADD_ROUTE is sent up the tree rather than down. This permits
  construction of the new context to complete concurrent to parent
  contexts learning about its existence. Since all streams are strictly
  ordered, it's not possible for any parent to observe messages from the
  new context prior to arrival of an ADD_ROUTE from the parent notifying
  of its existence.

  If the new context, for example, implements an Ansible async task, its
  parent can start executing that without waiting for any synchronous
  confirmation from any parent or the master.

* Since routes propagate up, it's no longer possible for a plain
  non-parent child to ever receive ADD_ROUTE, so that code can be moved
  out of core.py and into parent.py (-0.2kb compressed).

* Add a .routes attribute to parent.Stream, and respond to disconnection
  signal on the stream by propagating DEL_ROUTE for any ADD_ROUTE ever
  received from that stream.

* Centralize route management in a new parent.RouteMonitor class
2018-03-22 11:56:24 +05:45
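
The upward route propagation described in the commit above can be modelled with a toy sketch. It is not the parent.RouteMonitor code: the Node class and its methods are hypothetical, and plain method calls stand in for ADD_ROUTE and DEL_ROUTE messages travelling over ordered streams.

```python
class Node:
    """Hypothetical stand-in for a context that tracks which direct
    child each descendant is reachable through."""

    def __init__(self, context_id, parent=None):
        self.context_id = context_id
        self.parent = parent
        self.routes = {}   # target context ID -> direct child ID

    def add_route(self, target_id, via_id):
        # Record the route, then notify the next context up the tree.
        self.routes[target_id] = via_id
        if self.parent is not None:
            self.parent.add_route(target_id, self.context_id)

    def del_route(self, target_id):
        self.routes.pop(target_id, None)
        if self.parent is not None:
            self.parent.del_route(target_id)

    def on_child_disconnect(self, child_id):
        # Withdraw every route ever learned through the lost stream.
        lost = [t for t, via in self.routes.items() if via == child_id]
        for target_id in lost:
            self.del_route(target_id)


# Usage: master(0) -> a(1) -> b(2) -> c(3).
master = Node(0)
a = Node(1, parent=master)
master.add_route(1, 1)
b = Node(2, parent=a)
a.add_route(2, 2)
c = Node(3, parent=b)
b.add_route(3, 3)
routes_before = dict(master.routes)

a.on_child_disconnect(2)   # a loses its stream to b
```

After the disconnect, both a and master have forgotten contexts 2 and 3, while master still knows how to reach context 1.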
David Wilson aeeeb45ccb docs: farewell, glorious iframe! 2018-03-22 07:13:42 +05:45
David Wilson adfd827531 test.sh: enhancements
* Explicitly name every test to run, I have lots of unchecked in stuff
* Allow SIGINT to stop the process
2018-03-22 05:21:42 +05:45
David Wilson 23e279b617 tests: get import_test limping back to health. 2018-03-22 05:07:31 +05:45
David Wilson 469279d9ca master: refactor ThreadWatcher
In order to support a .remove() method, to prevent a minor but annoying
(log visible) memory leak while running the tests.
2018-03-22 05:04:59 +05:45
David Wilson e3209d1de0 core: log Broker's id in repr. 2018-03-22 05:04:35 +05:45
David Wilson f4ba66e3ee issue #155: allocate child IDs in batches of 1000.
Avoids a roundtrip for every fork.
2018-03-22 03:41:04 +05:45
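
A minimal sketch of batched allocation, assuming a hypothetical request_range callback that stands in for the ALLOCATE_ID roundtrip to the master. BatchIdAllocator, make_master and the roundtrips counter are illustrative names, not mitogen APIs.

```python
class BatchIdAllocator:
    """Hypothetical sketch: hand out IDs locally from a reserved range
    and ask the master for a new range only when it runs out."""

    def __init__(self, request_range, batch_size=1000):
        # request_range(n) returns (first, end) for n fresh IDs.
        self._request_range = request_range
        self._batch_size = batch_size
        self._next = 0
        self._end = 0
        self.roundtrips = 0

    def allocate(self):
        if self._next >= self._end:
            # Local range exhausted: one roundtrip reserves a new batch.
            self._next, self._end = self._request_range(self._batch_size)
            self.roundtrips += 1
        allocated = self._next
        self._next += 1
        return allocated


def make_master(start=1):
    """Build a fake master that reserves consecutive ID ranges."""
    counter = [start]

    def request_range(n):
        first = counter[0]
        counter[0] += n
        return first, first + n

    return request_range


allocator = BatchIdAllocator(make_master())
ids = [allocator.allocate() for _ in range(2500)]
```

With a batch size of 1000, allocating 2500 IDs in this sketch costs three roundtrips rather than 2500.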
David Wilson 0c77107041 issue #96: fail test.sh if any test fails 2018-03-21 11:09:14 +05:45
David Wilson 1ed86774b5 issue #156: document select exception 2018-03-21 09:23:54 +05:45
David Wilson 7ec02f9bb0 issue #156: ensure Latch state is cleaned up if select throws. 2018-03-21 09:22:29 +05:45
David Wilson 20f5d89dfa issue #156: fix several more races
* Don't need to sleep if queue>sleepers, can just pop the right queue
  element and return it.

* If queue>sleeping and waking==sleeping, no mechanism existed to ensure
  a thread newly added to sleeping would ever be woken. Above change
  fixes that.

* Cannot trust select() return value, scheduler might sleep us
  indefinitely while put() writes a byte.

* Sleeping threads didn't pop FIFO, they popped in whatever order
  scheduler woke them up. Must recover index and use it to pick the pop
  index.
2018-03-20 14:53:19 +05:45
David Wilson 526b0a514b issue #156: prevent Latch.close() triggering spurious wakeups 2018-03-20 13:14:51 +05:45
David Wilson 18e2977baf docs: annoying phrasing 2018-03-20 13:05:41 +05:45
David Wilson 2c22c41819 issue #156: don't decrement `waking` if we timed out rather than being woken. 2018-03-20 13:02:46 +05:45
David Wilson 07a8994ff5 issue #156: waking thread result dictionary with an integer. 2018-03-20 12:55:55 +05:45
David Wilson 001e0163fe issue #156: handle multiple _put() before wake of first sleeper
- If latch.get() is called and the queue is empty, a thread is put to
  sleep.

- If Latch.put() from another thread then appends an item to the queue and
  wakes the sleeping thread, and

- If a subsequent Latch.put() from the same or another thread manages to
  acquire `lock` before the sleeping thread is scheduled,

- The sleeping thread's wake socket would have multiple bytes written to
  it.

Therefore create a new _pending variable to track the only item assigned
to each thread (keyed by its write socket), and remove the socket from
`sleeping` from within put.
2018-03-20 12:22:25 +05:45
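
The fix described in the commit above can be approximated by the simplified sketch below. It is not the mitogen.core.Latch implementation: SketchLatch is a hypothetical class that omits timeouts, close() and the later fairness fixes. It keeps only the two ideas from the message, namely that put() removes the sleeper's socket from the sleeping list itself, and that the single item assigned to each sleeper is stored keyed by its write socket.

```python
import select
import socket
import threading


class SketchLatch:
    """Hypothetical, simplified latch: each sleeper receives exactly
    one wake byte and exactly one item."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queue = []
        self._sleeping = []   # write sockets of threads blocked in get()
        self._pending = {}    # write socket -> the one item assigned to it

    def put(self, obj):
        with self._lock:
            if self._sleeping:
                # Removing the socket here means a second put() cannot
                # pick the same sleeper and write it a second byte.
                wsock = self._sleeping.pop(0)
                self._pending[wsock] = obj
                wsock.send(b'\x7f')
            else:
                self._queue.append(obj)

    def get(self):
        with self._lock:
            if self._queue:
                return self._queue.pop(0)
            rsock, wsock = socket.socketpair()
            self._sleeping.append(wsock)
        try:
            select.select([rsock], [], [])   # sleep until put() wakes us
            with self._lock:
                return self._pending.pop(wsock)
        finally:
            rsock.close()
            wsock.close()


# Usage: two puts in quick succession around a single sleeping thread.
latch = SketchLatch()
results = []
sleeper = threading.Thread(target=lambda: results.append(latch.get()))
sleeper.start()
latch.put('a')
latch.put('b')
sleeper.join(5)
leftover = latch.get()
```

Whichever way the threads are scheduled, the sleeper ends up with the first item and the second remains queued for the next get().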
David Wilson 168a954d90 issue #156: prefix Latch private variables 2018-03-20 09:26:47 +05:45
David Wilson b5398bd17f issue #156: docs typo 2018-03-20 09:12:50 +05:45
David Wilson 512ff77a46 issue #156: prevent non-sleeping threads from starving sleeping threads.
See new docs
2018-03-20 09:12:45 +05:45
David Wilson 9e514240a1 issue #156: always enable microsecond logging 2018-03-20 09:12:38 +05:45
David Wilson c51eee3c7f issue #156: make Pool repr log thread too. 2018-03-20 02:19:51 +05:45
David Wilson c20c2587d9 issue #156: make Latch() repr match Pool() repr. 2018-03-20 02:18:33 +05:45
David Wilson 7f4b89b7bb issue #156: log worker thread crashes in mitogen.pool 2018-03-20 02:18:04 +05:45
David Wilson 6e368d37da issue #156: log queue size too 2018-03-20 02:13:54 +05:45
David Wilson 037b461c39 issue #156: yet more logging :( 2018-03-20 02:08:48 +05:45
David Wilson 653c73c8f0 issue #156: also log target of wakes 2018-03-19 23:50:36 +05:45
David Wilson 4d96d0c1af issue #156: fix duplicate -vvvv logging 2018-03-19 23:48:08 +05:45
David Wilson a5cc7cb43c issue #156: add extra debugging around Latch
Change from writing '\x00' to writing '\x7f', and verify that is the
byte that woke the sleeping thread. Add a bunch more IO logging.
2018-03-19 23:47:44 +05:45
David Wilson ac7a64dfa3 core: assign common expression to a variable. 2018-03-19 21:58:35 +05:45
David Wilson 148ce1d703 issue #155: increase context ID width to 32 bits
Needed to make large range allocations (1000 per ALLOCATE_ID roundtrip)
feasible.
2018-03-19 21:58:35 +05:45
David Wilson 3579b6806b issue #152: reproduction for second problem 2018-03-19 21:58:35 +05:45
David Wilson c183f06dfb issue #152: respect the Ansible-selected interpreter for local connections too. 2018-03-19 21:58:35 +05:45
David Wilson 89b0faae2f Workaround for global state in yum_repository module; closes #154. 2018-03-19 21:58:35 +05:45
David Wilson 305e024819 issue #154: import user's reproduction 2018-03-19 21:58:35 +05:45
David Wilson 071d9fbfb3 docs: tidy ansible docs. 2018-03-19 21:58:35 +05:45
David Wilson 4d8ccab2ca ansible: docstring fixes 2018-03-19 21:58:35 +05:45
David Wilson 2132c311b2 tests: mark some tests as skipped 2018-03-19 21:58:35 +05:45
David Wilson f241eac5ce parent: allow Python to determine its install prefix from argv[0]
Fixes support for virtualenv. Closes #152.
2018-03-19 21:58:35 +05:45
David Wilson 088fd76109 issue #152: import reproduction 2018-03-19 21:58:35 +05:45
David Wilson dec3af375a issue #144: ansible: increase default pool size to 16. 2018-03-19 21:58:35 +05:45
David Wilson 9cf889b846 issue #144: master: public/private Pool attributes. 2018-03-19 21:58:35 +05:45
David Wilson 19632473dc issue #144: ansible: use service.Pool with default size=1. 2018-03-19 21:58:35 +05:45
David Wilson fe900087a2 issue #144: service: working service.Pool object.
It knows how to dispatch messages from multiple receivers (associated
with multiple services) to multiple threads, where the service
implementation is invoked on the message.

It wakes a maximum of one thread per received message.

It knows how to shut down gracefully.

Implication: due to the latch use, there are 2 file descriptors burned
for every thread. We don't need interruptibility here, so in future, it
might be nice to allow swapping a different queueing primitive into
Select (maybe a subclass?) just for this case.
2018-03-19 21:58:35 +05:45
David Wilson 4f361be7e7 issue #144: teach Select() to close its latch
Causes all threads sleeping on the select to wake.
2018-03-19 21:58:35 +05:45
David Wilson 8aada2646c core: support throwing LatchError in every sleeping thread
This is to allow Select() to be used as a generic queueing primitive
that supports graceful shutdown.
2018-03-19 21:58:35 +05:45