### Odyssey architecture and internals

Odyssey depends heavily on two libraries that were originally created during its
development: Machinarium and Shapito.

#### Machinarium

Machinarium is used extensively to organize multi-threaded processing, cooperative multitasking
and network IO. All Odyssey threads run in the context of machinarium `machines`:
pthreads with coroutine schedulers placed on top of an `epoll(7)` event loop.

Odyssey does not directly use or create multitasking primitives such as OS threads and mutexes.
All synchronization is done using message passing and is handled transparently by machinarium.

Repository: [third\_party/machinarium](https://github.com/yandex/odyssey/tree/master/third_party/machinarium)

#### Shapito

Shapito provides resizable buffers (streams) and methods for constructing, reading and validating
PostgreSQL protocol requests. By design, all PostgreSQL-specific details should be provided by
the Shapito library.

Repository: [third\_party/shapito](https://github.com/yandex/odyssey/tree/master/third_party/shapito)

#### Core components

```
                                main()
                             .----------.
                             | instance |
                thread       '----------'
              .--------.                  .-------------.
              | system |                  | worker_pool |
              '--------'                  '-------------'
   .--------.            .---------.    .---------.     .---------.
   | router |            | servers |    | worker0 | ... | workerN |
   '--------'            '---------'    '---------'     '---------'
     .---------.  .------.                 thread          thread
     | console |  | cron |
     '---------'  '------'
```

#### Instance

Application entry point.

Handle initialization: read the configuration file and prepare loggers.
Run the system and worker\_pool threads.

[sources/instance.h](/sources/instance.h), [sources/instance.c](/sources/instance.c)

#### System

Start the router, cron and console subsystems.

Create one listen server for each resolved address. Each listen server runs inside its own coroutine.
A server coroutine mostly waits on `machine_accept()`.
On an incoming connection, a new client context is created and a notification message is sent to the next
worker using `workerpool_feed()`. At this point the client IO context is not yet attached to any `epoll(7)` context.

Handle signals using `machine_signal_wait()`. On `SIGHUP`: do a versioned config reload, add new databases
and obsolete old ones. On `SIGINT`, `SIGTERM`: call `exit(3)`. Other threads are blocked from receiving signals.

[sources/system.h](/sources/system.h), [sources/system.c](/sources/system.c)

#### Router

Handle client registration and routing requests. Do client-to-server attachment and detachment.
Enforce connection limits and client pool queueing. Handle the implicit `Cancel` client request, since access
to the server pool is required to match a client key.

The router works in a request-reply manner: a client (from a worker thread) sends a request message to
the router and waits for a reply. This could become a hot spot, but it is not an issue at the moment.

[sources/router.h](/sources/router.h), [sources/router.c](/sources/router.c)

#### Cron

Do periodic service tasks, such as idle server connection expiration and
database config obsoletion.

[sources/cron.h](/sources/cron.h), [sources/cron.c](/sources/cron.c)

#### Worker and worker pool

A worker thread (machinarium machine) waits on the incoming connection notification queue. On a new connection event,
it creates a new frontend coroutine and handles the client (frontend) lifecycle. Each worker thread can host
thousands of client coroutines.

The worker pool is responsible for maintaining a thread pool of workers. The threads are machinarium machines,
created using `machine_create()`.

[sources/worker.h](/sources/worker.h), [sources/worker.c](/sources/worker.c),
[sources/worker_pool.h](/sources/worker_pool.h), [sources/worker_pool.c](/sources/worker_pool.c)
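
The system thread hands each accepted client to the "next" worker. A minimal sketch of such a selection is shown below, assuming that "next" means plain round-robin over the pool. The structure and function names are hypothetical and are not taken from `worker_pool.c`; the actual message passing is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a worker pool; not Odyssey's real structure. */
typedef struct {
    size_t count;       /* number of workers in the pool */
    size_t round_robin; /* index of the worker that gets the next client */
} sketch_pool_t;

void sketch_pool_init(sketch_pool_t *pool, size_t count)
{
    assert(pool != NULL && count > 0);
    pool->count = count;
    pool->round_robin = 0;
}

/* Return the index of the worker that should receive the next client
 * notification, then advance to the following worker. */
size_t sketch_pool_next(sketch_pool_t *pool)
{
    assert(pool != NULL && pool->count > 0);
    size_t next = pool->round_robin;
    pool->round_robin = (pool->round_robin + 1) % pool->count;
    return next;
}
```

With three workers, successive calls yield the indexes 0, 1, 2 and then wrap around to 0, so new clients are spread evenly regardless of how long each one stays connected.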

#### Single worker mode

To reduce multi-thread communication overhead, Odyssey handles the single-worker case (`workers 1`)
differently.

Instead of creating a separate thread and coroutine for each worker, only one worker coroutine is created, inside the system thread. All message channels created with `machine_channel_create()` are marked as non-shared. This makes communication faster, because no expensive system calls are needed to wake up the event loop.

#### Client (frontend) lifecycle

The whole client logic is driven by a single `od_frontend()` function, which is a coroutine entry point.
There are 6 distinguishable stages in the client lifecycle.

[sources/frontend.h](/sources/frontend.h), [sources/frontend.c](/sources/frontend.c)

#### 1. Startup

Read the initial client request. This can be `SSLRequest`, `CancelRequest` or `StartupMessage`.
Handle the SSL/TLS handshake.
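
The three possible first requests can be told apart by the second 32-bit field of the packet. The sketch below classifies a fully received first packet using the request codes from the PostgreSQL protocol documentation. It is an illustration only: the function and type names are hypothetical, and in Odyssey this parsing is Shapito's job.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Request codes defined by the PostgreSQL frontend/backend protocol. */
#define SKETCH_SSL_REQUEST_CODE    80877103u /* (1234 << 16) | 5679 */
#define SKETCH_CANCEL_REQUEST_CODE 80877102u /* (1234 << 16) | 5678 */
#define SKETCH_PROTOCOL_MAJOR      3u        /* protocol 3.0 is 196608 */

/* Hypothetical names; not Shapito's API. */
typedef enum {
    SKETCH_SSL_REQUEST,
    SKETCH_CANCEL_REQUEST,
    SKETCH_STARTUP_MESSAGE,
    SKETCH_INVALID
} sketch_first_request_t;

/* Read a big-endian (network order) 32-bit integer. */
static uint32_t sketch_read_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Classify a complete first packet: Int32 length (including itself),
 * then an Int32 request code or protocol version. */
sketch_first_request_t sketch_classify(const uint8_t *buf, size_t size)
{
    assert(buf != NULL);
    if (size < 8)
        return SKETCH_INVALID;
    uint32_t length = sketch_read_u32(buf);
    uint32_t code = sketch_read_u32(buf + 4);
    if (length != size)
        return SKETCH_INVALID;
    if (code == SKETCH_SSL_REQUEST_CODE)
        return length == 8 ? SKETCH_SSL_REQUEST : SKETCH_INVALID;
    if (code == SKETCH_CANCEL_REQUEST_CODE) /* followed by pid and key */
        return length == 16 ? SKETCH_CANCEL_REQUEST : SKETCH_INVALID;
    if ((code >> 16) == SKETCH_PROTOCOL_MAJOR)
        return SKETCH_STARTUP_MESSAGE;
    return SKETCH_INVALID;
}
```

Anything classified as `SKETCH_INVALID` would correspond to a protocol violation at the startup stage.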

#### 2. Process Cancel request

In case of `CancelRequest`, call Router to handle it. Disconnect the client right away.

#### 3. Route client

Call Router. Use `Database` and `User` to match the client against a configured route. Router assigns
the matched route to the client. Each route object has a reference counter.
All routes are periodically garbage-collected.
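
Reference counting combined with a periodic collection pass can be sketched as follows. This assumes that a route becomes collectable once no client references it; the exact condition Odyssey uses is not stated here, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical route record; Odyssey's real route type is not shown here. */
typedef struct {
    int live; /* non-zero while the slot holds a route */
    int refs; /* number of clients currently assigned to the route */
} sketch_route_t;

/* Called when the router assigns the route to a client. */
void sketch_route_ref(sketch_route_t *route)
{
    assert(route != NULL && route->live);
    route->refs++;
}

/* Called when a client that used the route goes away. */
void sketch_route_unref(sketch_route_t *route)
{
    assert(route != NULL && route->live && route->refs > 0);
    route->refs--;
}

/* Periodic pass: drop every live route that no client references.
 * Returns the number of routes collected. */
size_t sketch_route_gc(sketch_route_t *routes, size_t count)
{
    assert(routes != NULL);
    size_t collected = 0;
    for (size_t i = 0; i < count; i++) {
        if (routes[i].live && routes[i].refs == 0) {
            routes[i].live = 0;
            collected++;
        }
    }
    return collected;
}
```

Deferring the release to a periodic pass keeps the hot path (assigning and releasing routes) down to a counter update.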

#### 4. Authenticate client

Send the client an authentication request, `AuthenticationMD5Password` or `AuthenticationCleartextPassword`, and
wait for the reply to compare passwords. On success, send `AuthenticationOk`.
#### 5. Process client requests

Depending on the storage type of the selected route, do `local` (console) or `remote` (remote PostgreSQL server) processing.

The following remote processing logic repeats until the client sends `Terminate`, or until
the client or server disconnects during the process:

* Read client request. Handle `Terminate`.
* If the client has no server attached, call Router to assign a server from the server pool. A new server connection is registered and
  initiated by the client coroutine (worker thread). If needed, discard previous server settings and configure the server using client parameters.
* Send client request to the server.
* Wait for server reply.
* Send reply to client.
* In case of `Transactional` pooling: if the transaction completes, call Router to detach the server from the client.
* Repeat.
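
The detach decision in the last step can be sketched with the transaction status byte that PostgreSQL sends in `ReadyForQuery`: `'I'` (idle), `'T'` (in a transaction block) or `'E'` (failed transaction block). The sketch assumes that "transaction completes" corresponds to the `'I'` status, and that a session-style pooling mode exists as the alternative; both the names and that second mode are assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical pooling modes; only `Transactional` is named in the text. */
typedef enum {
    SKETCH_POOL_SESSION,
    SKETCH_POOL_TRANSACTION
} sketch_pooling_t;

/* Decide whether the server can be detached from the client after a
 * ReadyForQuery message carrying the given transaction status byte.
 * Returns 1 to detach, 0 to keep the server attached. */
int sketch_should_detach(sketch_pooling_t pooling, char ready_status)
{
    /* The protocol defines exactly these three status bytes. */
    assert(ready_status == 'I' || ready_status == 'T' || ready_status == 'E');
    return pooling == SKETCH_POOL_TRANSACTION && ready_status == 'I';
}
```

A server that reports `'T'` or `'E'` stays attached, because the client still owns an open transaction on it.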

#### 6. Cleanup

If the server is not Ready (a query is still in progress), initiate the automatic `Cancel` procedure. If the server is Ready but was left in an active transaction,
initiate an automatic `Rollback`. Return the server to the server pool or disconnect it.
Free the client context.

#### Client error codes

In most scenarios PostgreSQL error messages (`ErrorResponse`) are copied to the client as-is. Yet there are some
cases when Odyssey has to provide its own error message and SQLCode to the client.

The function `od_frontend_error()` is used for formatting and sending an error message to the client.

| SQLCode | PostgreSQL Code Name | Stage |
| ------- | -------------------- | ----- |
| **08P01** | `PROTOCOL_VIOLATION` | Startup, TLS handshake, authentication |
| **0A000** | `FEATURE_NOT_SUPPORTED` | TLS handshake |
| **28000** | `INVALID_AUTHORIZATION_SPECIFICATION` | Authentication |
| **28P01** | `INVALID_PASSWORD` | Authentication |
| **58000** | `SYSTEM_ERROR` | Routing, System specific |
| **3D000** | `UNDEFINED_DATABASE` | Routing |
| **53300** | `TOO_MANY_CONNECTIONS` | Routing |
| **08006** | `CONNECTION_FAILURE` | Server-side error during connection or IO |

PostgreSQL-specific error codes can be found in `src/backend/utils/errcodes.txt`.
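
For reference, the table above can be expressed as a small lookup in C. This is only an illustration of the code-to-name mapping; the names are hypothetical and are not taken from the Odyssey sources or from the signature of `od_frontend_error()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the table: five-character SQLSTATE and its PostgreSQL name. */
typedef struct {
    const char *sqlstate;
    const char *name;
} sketch_error_t;

static const sketch_error_t sketch_errors[] = {
    {"08P01", "PROTOCOL_VIOLATION"},
    {"0A000", "FEATURE_NOT_SUPPORTED"},
    {"28000", "INVALID_AUTHORIZATION_SPECIFICATION"},
    {"28P01", "INVALID_PASSWORD"},
    {"58000", "SYSTEM_ERROR"},
    {"3D000", "UNDEFINED_DATABASE"},
    {"53300", "TOO_MANY_CONNECTIONS"},
    {"08006", "CONNECTION_FAILURE"},
};

/* Return the PostgreSQL code name for a SQLSTATE listed in the table,
 * or NULL if the code is not one of them. */
const char *sketch_error_name(const char *sqlstate)
{
    assert(sqlstate != NULL);
    size_t count = sizeof(sketch_errors) / sizeof(sketch_errors[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(sketch_errors[i].sqlstate, sqlstate) == 0)
            return sketch_errors[i].name;
    }
    return NULL;
}
```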