
One problem with "read everything on the socket until it's empty" is an issue of fairness. A particularly greedy client could keep its connection filled with requests, and this approach would monopolize a server thread, potentially starving other clients/sockets.
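To make that concrete, the greedy pattern looks roughly like this on a non-blocking socket (purely a sketch; handle_bytes() is a made-up stand-in for whatever parses and dispatches requests):

    /* Hypothetical sketch of "read until the socket is empty": as long as
     * this client keeps its receive buffer topped up, the loop never
     * returns to the event loop, so other connections served by the same
     * thread sit and wait. */
    #include <errno.h>
    #include <stddef.h>
    #include <sys/socket.h>

    void handle_bytes(const char *buf, size_t len);   /* request parsing/dispatch, elsewhere */

    void drain_connection(int fd)                     /* fd is a non-blocking socket */
    {
        char buf[4096];
        for (;;) {
            ssize_t n = recv(fd, buf, sizeof(buf), 0);
            if (n > 0) {
                handle_bytes(buf, (size_t)n);
                continue;                             /* stay on this connection */
            }
            if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                break;                                /* finally empty; only now do other clients run */
            break;                                    /* peer closed, or a real error */
        }
    }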



Btw, that's not just theoretical. The connection manager in OpenLDAP slapd used to do exactly that - and then we found exactly that problem while benchmarking/soak testing. Now we always process only one request from a connection before moving on to the next connection. Not as efficient, but more fair.
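For contrast, the one-request-per-connection approach is essentially a round-robin pass. (A sketch only, not slapd's actual code; read_one_request() is hypothetical.)

    /* Each connection gets at most one request handled per pass, so no
     * single client can monopolize the thread. */
    #include <stddef.h>

    struct conn;                                /* per-connection state, defined elsewhere */
    int read_one_request(struct conn *c);       /* hypothetical: read and process at most one PDU */

    void service_pass(struct conn **conns, size_t nconns)
    {
        for (size_t i = 0; i < nconns; i++)
            read_one_request(conns[i]);         /* then move on, regardless of what's still queued */
    }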

There are always tradeoffs. Time sharing, multi-processing, and time-slicing are all inherently inefficient. Batch mode processing, where 100% of system resources are dedicated to a single job until it completes, is most efficient and gets jobs done quicker. We accept this efficiency tradeoff because we believe that interactive processing is more user-friendly.

So take the quest for efficiency with a grain of salt. Perfect efficiency may actually give a worse user experience.

On the flip side, we have LMDB, a read-optimized database engine which does no system calls or blocking library calls at all in the read codepath. It is about as close to perfectly efficient as one can get without resorting to hand-optimized machine language. Fairness inside LMDB isn't a concern, because calling apps are always going to trigger context switches on their own anyway.
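For a sense of what that looks like from the caller's side, a minimal read is roughly the following (made-up path and key, error handling omitted). The read transaction and the lookup itself just walk the memory map:

    #include <lmdb.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        MDB_env *env;
        MDB_txn *txn;
        MDB_dbi dbi;
        MDB_val key, data;

        mdb_env_create(&env);
        mdb_env_open(env, "./testdb", MDB_RDONLY, 0664);  /* maps the file; the syscalls happen here */

        mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);       /* read txn on the existing map */
        mdb_dbi_open(txn, NULL, 0, &dbi);
        key.mv_data = "some-key";
        key.mv_size = strlen("some-key");
        if (mdb_get(txn, dbi, &key, &data) == 0)          /* B-tree lookup straight out of the map */
            printf("%.*s\n", (int)data.mv_size, (char *)data.mv_data);
        mdb_txn_abort(txn);                               /* read txns are simply released */

        mdb_env_close(env);
        return 0;
    }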

The quest for efficiency has different constraints depending on what level of a system you're programming in.


1 req per connection at a time, while fair, seems a bit harsh. From a performance perspective it's not a bad idea to process up to n reqs in at most m time from a connection before moving on to the next, and thus amortize the cost of switching connections.


The issue of magic numbers like this n is discussed in the article. n=1 has the benefit of perfect fairness. All other values have to be tweaked for the platform.


The article also explicitly says you should amortize the cost of things like switching context, which switching connections is.

I never stated that n and m have to be magic numbers. They can (and should) be adaptive. n=1 does provide perfect fairness, but what if switching connections costs just as much as processing one request? Setting n = 2 then increases throughput by 33% (2 requests per 3 time units instead of 1 per 2), at the cost of a 50% longer wait until the last connection is handled. However, because the throughput is increased, any subsequent requests will be handled faster than with n = 1.

In reality you want a sane m value, the time allowed to cycle through all connections. I'm not sure exactly how to make this adaptive, but it's likely very dependent on the nature of the usage (http connections from a GUI, p2p network, or something else). As long as you cycle through all connections within m time and aren't saturating the connection, the algorithm can increase n to increase throughput.
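One possible shape for that (purely illustrative; the names and the adaptation rule are made up): time a full pass, process up to n requests per connection, and nudge n up or down depending on whether the pass fit within the budget m.

    #include <stddef.h>
    #include <time.h>

    struct conn;
    int try_process_one_request(struct conn *c);   /* hypothetical: 1 if a request was handled, 0 if none pending */

    static double now_seconds(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    void service_pass(struct conn **conns, size_t nconns, int *n, double m)
    {
        double start = now_seconds();

        for (size_t i = 0; i < nconns; i++)
            for (int k = 0; k < *n; k++)
                if (!try_process_one_request(conns[i]))
                    break;                         /* nothing more queued on this connection */

        double elapsed = now_seconds() - start;
        if (elapsed < m)
            (*n)++;                                /* budget to spare: amortize more per connection */
        else if (*n > 1)
            (*n)--;                                /* pass took too long: back toward fairness */
    }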


Not if you receive as much as possible at once (i.e., the size of the socket's receive buffer), and process the N PDUs at your leisure. (By "process at your leisure" I mean process the PDUs, of course, but also service other clients.)

While the server is "processing", the client can only send enough request data to fill the socket's receive buffer again (around 300K bytes on my machine over the loopback interface, and it can be adjusted if it's too large). When that buffer becomes full, TCP's flow control kicks in and prevents the client's TCP stack from sending any more data, which in turn causes the client's socket send buffer to fill up. At that point the client application can no longer send request data either (send() will block, or keep returning EWOULDBLOCK/EAGAIN if using non-blocking operations).

When you've eventually processed all of the PDUs, call recv() again, draining the next 300K bytes. Then the client will be able to start sending request data again.

This avoids the resource-hogging problem you described and keeps the user mode:kernel mode ratio as high as possible, which is what the article was talking about.
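In code, that pattern is roughly the following (framing and parsing details are made up; parse_pdus() stands in for whatever extracts complete PDUs and queues them):

    #include <errno.h>
    #include <stddef.h>
    #include <sys/socket.h>

    #define RCVBUF_CHUNK (300 * 1024)   /* roughly the receive-buffer size mentioned above */

    size_t parse_pdus(const char *buf, size_t len);   /* hypothetical: consume complete PDUs, return bytes used */

    /* Called once when the event loop reports fd as readable. */
    void on_readable(int fd)
    {
        static char buf[RCVBUF_CHUNK];

        ssize_t n = recv(fd, buf, sizeof(buf), 0);     /* drain (up to) the whole receive buffer once */
        if (n <= 0) {
            if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                return;                                /* spurious wakeup; nothing to do */
            return;                                    /* 0 or real error: caller should close the connection */
        }

        parse_pdus(buf, (size_t)n);   /* queue the PDUs and process them between serving other clients;
                                         while we work, TCP flow control holds this client back once
                                         our receive buffer fills up again */
    }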



