linux: use io_uring for network i/o #4044
From a library user's perspective:
👍 to this, I was bitten by not having this kind of API when I was experimenting with io_uring in the past. Also, something that moves libuv/leps#2 forward is a plus.
How would this work exactly? IIUC, IORING_OP_PROVIDE_BUFFERS also needs the number of buffers and their size to be defined. Would we need to specify at least one of those, or would that be an implementation detail?
lgtm.
Why the …?
Yes, it would be entirely up to libuv to decide how to chop up the allotment; I don't want too many io_uring eccentricities sneaking into the API because it also needs to work meaningfully on other platforms. That said, I'm leaning towards shelving buffer management for now and just implementing the simple thing first. The flags argument lets us add it later as an opt-in.
Cool stuff. I just wanted to mention two other projects in the same area, abstracting I/O over multiple platforms such as Linux with io_uring, Windows IOCP, and macOS. The first is libxev by Mitchell Hashimoto (https://github.com/mitchellh/libxev), a library for async I/O that adopts the io_uring style (perhaps good to study for how it works on other platforms?). The second is TigerBeetle, which is a database but keeps its I/O fairly neatly in one directory (https://github.com/tigerbeetle/tigerbeetle/tree/main/src/io), so it's quite easy to read. Both are written in Zig, though libxev exposes a C API. Perhaps one can draw some inspiration from their work :)
I'd opt for keeping the firehose approach for incoming connections/data. io_uring has a handy feature called multishot (https://man.archlinux.org/man/io_uring.7.en#IORING_CQE_F_MORE), applicable to accept and read. Multishot can be described as one submission (SQE) producing multiple completions (CQEs). This feature closely matches the handle-based API in libuv.
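For reference, a rough liburing sketch (not libuv code; it assumes liburing >= 2.2 and a kernel with multishot accept support, and the on_connection callback is made up for illustration) of how one armed SQE keeps delivering connections:

```c
#include <liburing.h>

extern void on_connection(int fd);  /* hypothetical per-connection callback */

/* One SQE produces a CQE per accepted connection; IORING_CQE_F_MORE stays
 * set for as long as the multishot request remains armed in the kernel. */
static void accept_loop(struct io_uring* ring, int listen_fd) {
  struct io_uring_sqe* sqe = io_uring_get_sqe(ring);
  io_uring_prep_multishot_accept(sqe, listen_fd, NULL, NULL, 0);
  io_uring_submit(ring);

  struct io_uring_cqe* cqe;
  while (io_uring_wait_cqe(ring, &cqe) == 0) {
    if (cqe->res >= 0)
      on_connection(cqe->res);      /* cqe->res is the accepted fd */
    int disarmed = !(cqe->flags & IORING_CQE_F_MORE);
    io_uring_cqe_seen(ring, cqe);
    if (disarmed)
      break;  /* kernel stopped the stream; resubmit to keep accepting */
  }
}
```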
If we switch reading from handle-based to request-based, the first dilemma that comes to mind is how we would submit read operations to io_uring: one by one (
Ignore this, please. I just spotted that libuv makes use of IORING_SETUP_SQPOLL, which exempts us from submitting operations (SQEs) via a syscall.
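For context, a minimal sketch of SQPOLL ring setup with liburing (not libuv's actual initialization code): with IORING_SETUP_SQPOLL a kernel thread polls the submission queue, so while that thread is awake io_uring_submit() just fills SQEs and bumps the tail without an io_uring_enter() syscall.

```c
#include <string.h>
#include <liburing.h>

/* Returns 0 on success, negative errno on failure. SQPOLL may require
 * elevated privileges or a sufficiently recent kernel. */
static int init_sqpoll_ring(struct io_uring* ring) {
  struct io_uring_params params;
  memset(&params, 0, sizeof(params));
  params.flags = IORING_SETUP_SQPOLL;
  params.sq_thread_idle = 2000;  /* ms of idle time before the poller sleeps */
  return io_uring_queue_init_params(256, ring, &params);
}
```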
This is an invitation to brainstorm. :-)
There is currently a mismatch between io_uring's and libuv's I/O model that stops libuv from achieving maximal performance if it were to use io_uring for network i/o, particularly when it comes to receiving incoming connections and packets.
https://github.com/axboe/liburing/wiki/io_uring-and-networking-in-2023 describes best practices that can be summarized as:
- keep multiple i/o requests enqueued (mismatch: libuv's "firehose" approach to incoming connections/data)
- let io_uring manage buffers (mismatch: libuv punts memory management to the user); a rough sketch of that mechanism follows below
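A minimal liburing sketch of the provided-buffers mechanism (not libuv code; the buffer count, size, and group id 0 are arbitrary choices for illustration):

```c
#include <liburing.h>

enum { BUF_COUNT = 64, BUF_SIZE = 4096 };
static char slab[BUF_COUNT * BUF_SIZE];

/* Hand a slab to the kernel as buffer group 0, then issue a recv with
 * IOSQE_BUFFER_SELECT so the kernel picks a free buffer and reports its
 * id in the completion. */
static void read_with_provided_buffers(struct io_uring* ring, int sock_fd) {
  struct io_uring_sqe* sqe = io_uring_get_sqe(ring);
  io_uring_prep_provide_buffers(sqe, slab, BUF_SIZE, BUF_COUNT,
                                /*bgid*/ 0, /*bid*/ 0);
  io_uring_submit(ring);

  sqe = io_uring_get_sqe(ring);
  io_uring_prep_recv(sqe, sock_fd, NULL, BUF_SIZE, 0);
  sqe->flags |= IOSQE_BUFFER_SELECT;  /* let the kernel choose from group 0 */
  sqe->buf_group = 0;
  io_uring_submit(ring);

  struct io_uring_cqe* cqe;
  if (io_uring_wait_cqe(ring, &cqe) == 0) {
    if (cqe->res >= 0 && (cqe->flags & IORING_CQE_F_BUFFER)) {
      unsigned bid = cqe->flags >> IORING_CQE_BUFFER_SHIFT;
      /* data is in slab[bid * BUF_SIZE] .. slab[bid * BUF_SIZE + cqe->res - 1];
       * re-provide the buffer once it has been consumed */
      (void) bid;
    }
    io_uring_cqe_seen(ring, cqe);
  }
}
```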
I've been thinking about what needs to change and this is what I came up with so far (a rough caller-side sketch follows the list):
- add request-based APIs for reading data and accepting connections, like
  int uv_stream_read(uv_read_t* req, uv_stream_t* handle, uv_buf_t* bufs, size_t nbufs, unsigned flags, uv_stream_read_cb cb)
  where uv_stream_read_cb is
  void (*)(uv_read_t* req, ssize_t nread, uv_buf_t* bufs)
- add a memory pool API that tells libuv it can allocate this much memory for buffers. Strawman proposal:
  uv_loop_configure(loop, UV_LOOP_SET_BUFFER_POOL_SIZE, 8<<20)
  with 0 meaning "pick a suitable default"
- introduce a flag that tells uv_stream_read() to ignore bufs and interpret nbufs as "bytes to read into buffer pool" (kind of ugly, suggestions welcome)
- introduce a new int uv_release_buffers(loop, bufs, nbufs) (maybe s/loop/handle/?) that tells libuv it's okay to reuse those buffer pool slices again. Alternatively: reclaim the memory automatically when uv_stream_read_cb returns, but that means users may have to copy, or that they hold on to the buffers longer than needed (inefficient if they queue up new requests in the callback)

Libuv-on-Windows is internally already request-based, so it should be relatively easy to adapt.
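To make the strawman concrete, here is a hypothetical caller-side sketch; uv_read_t, uv_stream_read, uv_release_buffers, UV_LOOP_SET_BUFFER_POOL_SIZE, the UV_READ_FROM_POOL flag, and the req->handle field are names taken from or extrapolated from the proposal above, not existing libuv API:

```c
#include <uv.h>

/* Hypothetical: none of these read-request symbols exist in libuv today. */
static void on_read(uv_read_t* req, ssize_t nread, uv_buf_t* bufs) {
  if (nread <= 0)
    return;                                        /* EOF or error */
  /* ... consume bufs[0..] here ... */
  uv_release_buffers(req->handle->loop, bufs, 1);  /* hand pool slices back */
  uv_stream_read(req, req->handle, NULL, 65536,    /* queue the next read */
                 UV_READ_FROM_POOL, on_read);
}

static int start_reading(uv_loop_t* loop, uv_stream_t* stream) {
  static uv_read_t req;
  /* let libuv allocate up to 8 MiB of pooled read buffers */
  uv_loop_configure(loop, UV_LOOP_SET_BUFFER_POOL_SIZE, 8 << 20);
  /* NULL bufs + flag: read up to 65536 bytes from libuv's buffer pool */
  return uv_stream_read(&req, stream, NULL, 65536, UV_READ_FROM_POOL, on_read);
}
```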
Suggestions for improvements very welcome!