# Npgsql multiplexing

A high-performance database driver model

---

## whoami

* Shay Rojansky
* Engineer at Microsoft, part of the .NET data team
* Lead dev of Npgsql, PostgreSQL driver for .NET
* Linux guy
* Based in Berlin

Note: Started with Slackware around 1995

---

## Who is Npgsql?

* .NET open-source driver for PostgreSQL
* There's also an EF Core provider, but we'll focus on the low-level layer (ADO.NET).
* Bottom of the stack - beneath us is the .NET runtime
* Supposed to be a boring component!

---

## Performance

TechEmpower Round 20 (Feb 2021)

![TechEmpower round 20 composite scores](img/techempower.png)
... running on Linux with PostgreSQL
Note:

* Already old (Feb 2021)
* Does not include newer improvements for .NET 6
* Many of these are experiments, not full-fledged web frameworks etc.
* The first Fortunes MySQL entry is in 44th place

---

## What's this talk about?

* Multiplexing is an experimental perf feature released in November 2020 (Npgsql 5.0).
* Other database drivers do this, but not in the .NET world.
* Connection pooling, network buffering, coalescing, and a whole lot of low-level perf goodness.

---

# Let's begin!

---

# Traditional connection pooling

---

## Typical low-level DB code

```csharp
await using var connection = new NpgsqlConnection("...");
await connection.OpenAsync();
await using var command = new NpgsqlCommand("SELECT * FROM foo", connection);
await using var reader = await command.ExecuteReaderAsync();
// Consume results from the reader
```

---

## Connection pooling

* Opening a physical connection every time is slow.
  * Network roundtrip(s) (TLS!)
  * In PG, a process is spawned for each connection
* Like every perf problem in the universe, the answer is caching, or connection pooling

---

## Connection pooling flow

![Traditional connection sequence diagram](diagrams/traditional-sequence-diagram.svg)

---

## Where to pool?

* #1: In .NET: in-process pooling, in the driver.
* #2: Java: the pool is a separate standard API.
* #3: Out-of-process (pgbouncer, pgpool, odyssey).
* In-process is faster (zero network overhead) and easier to set up.
* External allows pooling across multiple app instances.

![Why not both](img/why-not-both.jpeg)

Note:

* The connection pool is the one component dealing with concurrency and shared mutable state.
* I used to think the Java pooling API model was great, but we'll see that it has drawbacks.

---

## Maximum connections

* The pool always has a max connection size, setting an upper bound on server resources.
* An attempt to open when the pool is saturated waits until someone else releases a connection (or until timeout).

![Open with wait](diagrams/open-wait.svg)

Note:

* Max Pool Size is 100 by default
* Very important to have a hard limit (server resources). Compare with memory object pooling.

---

# Let's evolve this!

---

## Empty trains

* Physical connections are exclusively "locked" for the duration of the command.
* We're sending out trains with only a single passenger each.

![Empty train](img/empty-train.png)

---

## Fill trains with passengers

* We can fill the same train with multiple unrelated passengers ...
* ... as long as they arrive around the same time.
* Benefits:
  * Less TCP packet fragmentation
  * Fewer I/O system calls (both client and server)

---

## Without multiplexing

![Pool without multiplexing](diagrams/pool-without-multiplexing.svg)

---

## With multiplexing
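From the application's point of view, the code doesn't change. A minimal sketch, assuming placeholder host and credentials - `Multiplexing=true` is the opt-in connection string flag introduced in Npgsql 5.0, and the rest is the same ADO.NET pattern from the earlier slide:

```csharp
// Minimal sketch - placeholder credentials; only the connection string changes.
var connString = "Host=localhost;Username=app;Password=secret;Database=app;Multiplexing=true";

await using var connection = new NpgsqlConnection(connString);
await connection.OpenAsync();   // with multiplexing, open is expected to be a cheap, logical operation
await using var command = new NpgsqlCommand("SELECT * FROM foo", connection);
await using var reader = await command.ExecuteReaderAsync();
// Consume results from the reader, as before
```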
---

## Busy trains

* The more commands we pack into the same write, the more we save on overhead.
* In TechEmpower Fortunes, we see around 23 commands per write.
* For this to work, lots of commands need to be submitted simultaneously (a sketch of that load pattern follows the perf numbers).
* Optimizes for the massive-load scenario

---

## Coalescing and Nagling

* This is a form of implicit batching, or *coalescing*.
* Nagle's algorithm: a TCP/IP optimization that coalesces small writes into a single packet.
* Nagling works on a single TCP socket; we combine writes from multiple query producers.
* Nagling compensates for poor application I/O practices (no buffering).
* Typically disabled (TCP_NODELAY), because it increases latency.

---

# Advantage #1:

## Coalescing

---

## Perf benefits
* Without multiplexing, the optimal RPS in TechEmpower Fortunes is **213,689**.
* With multiplexing: **454,254** (112% improvement).
* ... but it's not just about the coalescing :)
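---

## The coalescing load pattern (sketch)

Coalescing only pays off when many producers submit commands at around the same time. A minimal, hypothetical sketch of that load pattern (placeholder connection string); the write coalescing itself happens inside the driver - the application just produces many concurrent commands:

```csharp
using System.Linq;
using System.Threading.Tasks;
using Npgsql;

const string connString = "Host=localhost;Username=app;Password=secret;Database=app;Multiplexing=true";

// 100 independent "passengers": each one opens a pooled connection and runs a tiny query.
var producers = Enumerable.Range(0, 100).Select(async _ =>
{
    await using var conn = new NpgsqlConnection(connString);
    await conn.OpenAsync();
    await using var cmd = new NpgsqlCommand("SELECT 1", conn);
    return await cmd.ExecuteScalarAsync();
});

// Commands submitted around the same time can end up sharing socket writes ("trains").
await Task.WhenAll(producers);
```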
---

## Saturation

* Remember: in traditional pooling, we wait when saturated - all connections are locked.
* In multiplexing, we can do better: continue pushing commands to the database, even if all connections are busy!
* We call this "over-capacity mode".

---

## Under capacity

Use only idle connections as long as we can...

![Under capacity](diagrams/multiplexing-under-capacity.svg)

---

## Over capacity

... but keep pushing commands into busy connections

![Over capacity](diagrams/multiplexing-over-capacity.svg)

---

## Advantages

* Pipelining: get commands to the DB faster, even while it's still processing previous commands.
* As long as the DB is processing faster than we send, we're pipelining efficiently.
* If we send faster than the DB processes, eventually the TCP buffers fill up (TCP zero window).
* And we block, just like before.

Note: If we get to TCP zero window, all bets are off - the DB is totally overloaded.

---

# Advantage #2:

## Better handling of saturation

---

## How many DB connections should I have?

* The answer depends on server resources, especially the number of cores.
* Also: are queries going to be doing lots of I/O?
* If we increase the number of (in-use) connections too much, we start to cause useless context switching.

Note:

* Make it clear that TechEmpower Fortunes does very little I/O.

---

## Fewer connections

* Thanks to multiplexing over-capacity, we can lower the connection count and not starve client-side.
* Do more with less: far more efficient use of physical connections.
* This helps the DB by reducing context switching.
* Even at extremely low connection counts, we're still very efficient.
* The multiplexing pool acts as a sort of "backpressure" for connections.

---

## Multiplexing benefits

1. Coalescing (less overhead)
2. Over-capacity mode (saturation)
3. Reduce the number of connections needed (see the configuration sketch that follows)
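---

## Sizing the pool down (sketch)

A minimal sketch of the "do more with less" point, with hypothetical sizing numbers: combine multiplexing with a deliberately small pool and let over-capacity mode absorb load spikes. The right numbers depend on the server's cores and how much I/O the queries do, as discussed above.

```csharp
using Npgsql;

// Hypothetical sizing: far fewer physical connections than concurrent command producers.
var csb = new NpgsqlConnectionStringBuilder
{
    Host = "localhost",
    Username = "app",
    Password = "secret",
    Database = "app",
    Multiplexing = true,   // enable the multiplexing write path
    MaxPoolSize = 16,      // hard upper bound on physical connections
    MinPoolSize = 4
};

await using var conn = new NpgsqlConnection(csb.ConnectionString);
await conn.OpenAsync();
```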
---

# Showdown!

* Non-multiplexing
* Multiplexing
---

# We've talked about the good...

## ... now let's talk about the bad.

---

## Statelessness

* Cannot do anything that involves connection state...
* ... especially transactions.
* Can still exclusively rent out connections, as before (sketch after the conclusions).

Note:

* Other examples: temp tables, session variables...

---

## Head-of-line blocking

* We get query results from the DB in the same order we sent the queries, FIFO.
* Processing the results of a later query can only happen once the earlier results have been processed.
* A slow query (or slow client-side processing!) can block other, unrelated producers.

Note:

* Resultset buffering would help, but costs memory.

---

## Some conclusions

* Don't over-parallelize!
* Stop thinking about connections! We just have database command producers.
* Multiplexing relies on an in-process (and probably in-driver) pool.
* Remember: the perf numbers we've seen are TechEmpower benchmarks - massive concurrent load.
* This is all still a bit experimental!

Note:

* The in-process comment: .NET is (inadvertently) in a good place
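---

## Renting a connection exclusively (sketch)

As mentioned on the statelessness slide, stateful work such as a transaction still rents a connection exclusively, exactly as without multiplexing. A minimal sketch with placeholder credentials and table names:

```csharp
using Npgsql;

const string connString = "Host=localhost;Username=app;Password=secret;Database=app;Multiplexing=true";

await using var conn = new NpgsqlConnection(connString);
await conn.OpenAsync();

// Beginning a transaction binds a physical connection to this NpgsqlConnection
// for its duration, so the usual stateful patterns keep working.
await using var tx = await conn.BeginTransactionAsync();
await using var cmd = new NpgsqlCommand("UPDATE foo SET bar = 1", conn, tx);
await cmd.ExecuteNonQueryAsync();
await tx.CommitAsync();
```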
---

## The Npgsql team

A special thanks to Nikita Kazmin and Nino Floris for their considerable work on multiplexing, and also to Brar Piening!

---

# Thank you!

**Shay Rojansky**, @shayrojansky