NAME

IO::Uring - io_uring for Perl

VERSION

version 0.002

SYNOPSIS

my $ring = IO::Uring->new(32);
my $buffer = "\0" x 4096;
$ring->recv($fh, $buffer, MSG_WAITALL, 0, sub($res, $flags) { ... });
$ring->send($fh, $buffer, 0, 0, sub($res, $flags) { ... });
$ring->run_once while 1;

DESCRIPTION

This module is a low-level interface to io_uring, Linux's new asynchronous I/O interface, which drastically reduces the number of system calls needed to perform I/O. Unlike previous models such as epoll, it is based on a proactor model instead of a reactor model: you schedule asynchronous actions and are then notified by a callback when each action has completed.

Generally speaking, the methods of this class match a system call 1-to-1 (e.g. recv(2)), except that they have two additional arguments:

  1. The submission flags. In particular this allows you to chain actions.

  2. A callback. This callback receives two integer arguments: a result (on error typically a negative errno value), and the completion flags. This callback will be kept alive by this module; any other resources that need to be kept alive should be captured by it.

All event methods return an identifier that can be used with cancel.
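
As a minimal sketch of this calling convention, the following submits a nop, which takes only the two common trailing arguments, and runs the ring until its callback has fired. It assumes a kernel with io_uring support; the variable names are purely illustrative.

```perl
use strict;
use warnings;
use IO::Uring;

my $ring = IO::Uring->new(8);
my $done = 0;

# nop takes only the submission flags and the callback.
my $id = $ring->nop(0, sub {
    my ($res, $cqe_flags) = @_;
    # A negative result is typically a negated errno value.
    if ($res < 0) {
        local $! = -$res;
        warn "nop failed: $!\n";
    }
    $done = 1;
});

# Submit the request and process completions until the callback has run.
$ring->run_once until $done;
```

The returned $id is the identifier that could later be handed to cancel.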

Note: This is an early release and this module should still be regarded as experimental. Backwards compatibility is not yet guaranteed.

METHODS

new($queue_size)

Create a new uring object, with the given submission queue size.

run_once($min_events = 1)

Submit all pending requests, and process at least $min_events (but up to $queue_size) completed events.

probe()

This probes for which features are supported on this system. It returns a hash of feature-name to true/false. Generally speaking feature names map directly to method names but note that for filesystem operations you should check for the *at version (e.g. 'openat' not 'open').
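
A hedged sketch of a feature check follows. The text above does not say whether the hash comes back as a flat list or as a hash reference, so the example accepts both; that handling, and the particular feature names other than 'openat', are assumptions.

```perl
use strict;
use warnings;
use IO::Uring;

my $ring = IO::Uring->new(8);

# Assumption: probe may return either a flat key/value list or a single
# hash reference; normalise both shapes into %supported.
my @probed    = $ring->probe;
my %supported = (@probed == 1 && ref $probed[0] eq 'HASH')
    ? %{ $probed[0] }
    : @probed;

# Filesystem operations are looked up under their *at name.
for my $feature (qw(openat mkdirat unlinkat)) {
    printf "%-10s %s\n", $feature,
        $supported{$feature} ? 'supported' : 'not supported';
}
```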

accept($sock, $flags, $s_flags, $callback)

Accept a new socket from listening socket $sock.

bind($sock, $sockaddr, $s_flags, $callback)

Bind the socket $sock to $sockaddr.

cancel($identifier, $flags, $s_flags, $callback = undef)

Cancel a pending request. $identifier should usually be the value returned by a previous event method. $flags is usually 0, but may be IORING_ASYNC_CANCEL_ALL, IORING_ASYNC_CANCEL_FD or IORING_ASYNC_CANCEL_ANY. Note that unlike most event methods the $callback is allowed to be empty.
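
The following sketch cancels a pending poll by the identifier it returned. It assumes that both requests are queued before the first run_once, so the poll is already armed when the cancel is processed, and that a cancelled request completes with -ECANCELED as it does in io_uring itself. The cancel callback is included only for reporting.

```perl
use strict;
use warnings;
use Errno qw(ECANCELED);
use IO::Poll qw(POLLIN);
use IO::Uring;

pipe(my $reader, my $writer) or die "pipe: $!";

my $ring = IO::Uring->new(8);
my $poll_result;

# Nothing is ever written to the pipe, so this poll stays pending
# until it is cancelled.
my $id = $ring->poll($reader, POLLIN, 0, sub {
    my ($res, $cqe_flags) = @_;
    $poll_result = $res;
});

# $flags = 0: cancel the single request identified by $id.
$ring->cancel($id, 0, 0, sub {
    my ($res, $cqe_flags) = @_;
    warn "cancel returned $res\n" if $res < 0;
});

$ring->run_once until defined $poll_result;

print "poll was cancelled\n" if $poll_result == -ECANCELED;
```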

close($fh, $s_flags, $callback)

Close the filehandle $fh.

connect($sock, $sockaddr, $s_flags, $callback)

Connect socket $sock to address $sockaddr.

fallocate($fh, $offset, $length, $s_flags, $callback)

Allocate disk space in $fh for the range of $length bytes starting at $offset.

fsync($fh, $flags, $s_flags, $callback)

Synchronize a file's in-core state with its storage device. $flags may be 0 or IORING_FSYNC_DATASYNC.

ftruncate($fh, $length, $s_flags, $callback)

Truncate $fh to length $length.

listen($fh, $count)

Mark the socket referred to by $fh as a passive socket, that is, as a socket that will be used to accept incoming connection requests using accept(2). $count is the maximum backlog size for pending connections.

link($old_path, $new_path, $flags, $s_flags, $callback)

Link the file at $new_path to $old_path.

linkat($old_dir, $old_path, $new_dir, $new_path, $flags, $s_flags, $callback)

Link the file at $new_path in $new_dir (a directory handle) to $old_path in $old_dir.

link_timeout($timespec, $flags, $s_flags, $callback = undef)

Prepare a timeout request for linked submissions (using the IOSQE_IO_LINK/IOSQE_IO_HARDLINK submission flags). $timespec must refer to a Time::Spec object that must be kept alive until submission (usually through the callback). $flags is a bit set that may contain any of the following values: IORING_TIMEOUT_ABS, IORING_TIMEOUT_BOOTTIME, IORING_TIMEOUT_REALTIME, IORING_TIMEOUT_ETIME_SUCCESS, IORING_TIMEOUT_MULTISHOT.

Like cancel and timeout_remove, the $callback is optional.

mkdir($path, $mode, $s_flags, $callback)

Make a new directory at $path with mode $mode.

mkdirat($dirhandle, $path, $mode, $s_flags, $callback)

Make a new directory at $path under $dirhandle with mode $mode.

nop($s_flags, $callback)

This executes a no-op.

open($path, $flags, $mode, $s_flags, $callback)

Open a file at $path with $flags and $mode.

openat($dirhandle, $path, $flags, $mode, $s_flags, $callback)

Open a file at $path under $dirhandle with $flags and $mode.

poll($fh, $mask, $s_flags, $callback)

Poll the file handle $fh once. $mask can have the same values as synchronous poll (e.g. POLLIN, POLLOUT).

poll_multishot($fh, $mask, $s_flags, $callback)

Poll the file handle $fh and repeatedly call $callback whenever new data is available. $mask can have the same values as synchronous poll (e.g. POLLIN, POLLOUT).

shutdown($fh, $how, $s_flags, $callback)

Shut down a part of a connection, the same way the core builtin shutdown($fh, $how) does.

splice($fh_in, $off_in, $fh_out, $off_out, $nbytes, $flags, $s_flags, $callback)

Move data between two file handles without copying between kernel address space and user address space. It transfers up to $nbytes bytes of data from the file handle $fh_in to the file handle $fh_out, where one of the file handles must refer to a pipe.

For a file handle that refers to a pipe, the associated offset must be -1. For any other file handle, an offset that is set is used as the position in the file or block device at which the transfer starts.

$flags must currently be 0.

sync_file_range($fh, $length, $offset, $flags, $s_flags, $callback)

Synchronize the given range to disk. $flags must currently be 0.

read($fh, $buffer, $offset, $s_flags, $callback)

Equivalent to pread($fh, $buffer, $offset). The buffer must be preallocated to the desired size; the callback receives the number of bytes that were actually written to it. The buffer must be kept alive, typically by closing over it in the callback.
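
A minimal sketch of this buffer handling, under the assumption that the data is written into the preallocated scalar in place and that the first $res bytes are the valid ones. The temporary file and variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Uring;

# Prepare a small file to read back.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello, uring\n";
close $tmp or die "close: $!";
open my $fh, '<', $path or die "open: $!";

my $ring   = IO::Uring->new(8);
my $buffer = "\0" x 4096;    # preallocated to the maximum read size
my ($data, $error);

# The callback closes over $buffer, which keeps it alive until completion.
$ring->read($fh, $buffer, 0, 0, sub {
    my ($res, $cqe_flags) = @_;
    if ($res < 0) {
        $error = -$res;                      # errno value
    }
    else {
        $data = substr($buffer, 0, $res);    # only the first $res bytes are valid
    }
});

$ring->run_once until defined $data or defined $error;
print defined $data ? $data : "read failed with errno $error\n";
```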

recv($sock, $buffer, $flags, $s_flags, $callback)

Equivalent to recv($sock, $buffer, $flags). The buffer must be preallocated to the desired size; the callback receives the number of bytes that were actually written to it. The buffer must be kept alive, typically by closing over it in the callback.

rename($old_path, $new_path, $flags, $s_flags, $callback)

Rename the file at $old_path to $new_path.

renameat($old_dir, $old_path, $new_dir, $new_path, $flags, $s_flags, $callback)

Rename the file at $old_path in $old_dir (a directory handle) to $new_path in $new_dir.

send($sock, $buffer, $flags, $s_flags, $callback)

Equivalent to send($sock, $buffer, $flags). The buffer must be kept alive, typically by closing over it in the callback.
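
As an illustration, recv and send can be paired over a local socket pair. This is a sketch rather than a recommended pattern; the socket pair and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC MSG_WAITALL);
use IO::Uring;

socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

my $ring     = IO::Uring->new(8);
my $outgoing = "ping";
my $incoming = "\0" x length $outgoing;    # preallocated receive buffer
my $received;

# Both callbacks close over their buffers, keeping them alive.
$ring->recv($right, $incoming, MSG_WAITALL, 0, sub {
    my ($res, $cqe_flags) = @_;
    $received = $res < 0 ? '' : substr($incoming, 0, $res);
});

$ring->send($left, $outgoing, 0, 0, sub {
    my ($res, $cqe_flags) = @_;
    warn "send failed with result $res\n" if $res < 0;
    warn "short send\n" if $res >= 0 && $res < length $outgoing;
});

$ring->run_once until defined $received;
print "received: $received\n";
```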

sendto($sock, $buffer, $flags, $sockaddr, $s_flags, $callback)

Send a buffer to a specific address. The buffer and address must be kept alive, typically by closing over them in the callback.

socket($domain, $type, $protocol, $flags, $s_flags, $callback)

Create a new socket of the given $domain, $type and $protocol.

tee($fh_in, $fh_out, $nbytes, $flags, $callback)

Prepare a tee request. This will use as input the file handle $fh_in and as output the file handle $fh_out duplicating $nbytes bytes worth of data. $flags are modifier flags for the operation and must currently be 0.

timeout($timespec, $count, $flags, $s_flags, $callback)

Create a timeout. $timespec must refer to a Time::Spec object that must be kept alive through the callback. $count is the number of events that should be waited on; typically it would be 0. $flags is a bit set that may contain any of the following values: IORING_TIMEOUT_ABS, IORING_TIMEOUT_BOOTTIME, IORING_TIMEOUT_REALTIME, IORING_TIMEOUT_ETIME_SUCCESS, IORING_TIMEOUT_MULTISHOT.

timeout_remove($id, $flags, $s_flags, $callback = undef)

Remove a timeout identified by $id. $flags is currently unused and must be 0. Like cancel and link_timeout, the callback is optional.

timeout_update($id, $timespec, $flags, $s_flags, $callback)

Update the timer identified by $id. $timespec and $flags have the same meaning as in timeout.

unlink($path, $flags, $s_flags, $callback)

Remove a file or directory at $path with flags $flags.

unlinkat($dirhandle, $path, $flags, $s_flags, $callback)

Remove a file or directory at $path under $dirhandle with flags $flags.

waitid($id_type, $id, $info, $options, $flags, $s_flags, $callback)

Wait for another process. $id_type specifies the type of ID used and must be one of P_PID ($id is a PID), P_PGID ($id is a process group), P_PIDFD ($id is a PID fd) or P_ALL ($id is ignored, wait for any child). $info must be a Signal::Info object that must be kept alive through the callback; it will contain the result of the event. $options is a bitset of WEXITED, WSTOPPED, WCONTINUED and WNOWAIT; typically it would be WEXITED. $flags is currently unused and must be 0. When the callback is triggered the following entries of $info will be set: pid, uid, signo (will always be SIGCHLD), status and code (CLD_EXITED, CLD_KILLED).

write($fh, $buffer, $offset, $s_flags, $callback)

Equivalent to pwrite($fh, $buffer, $offset). The buffer must be kept alive, typically by closing over it in the callback.

FLAGS

The following flags are all optionally exported:

Submission flags

These flags are passed to all event methods, and affect how the submission is processed.

  • IOSQE_ASYNC

    Normal operation for io_uring is to try and issue an SQE as non-blocking first, and if that fails, execute it in an async manner. To support more efficient overlapped operation of requests that the application knows/assumes will always (or most of the time) block, the application can ask for an SQE to be issued async from the start. Note that this flag immediately causes the SQE to be offloaded to an async helper thread with no initial non-blocking attempt. This may be less efficient and should not be used liberally or without understanding the performance and efficiency tradeoffs.

  • IOSQE_IO_LINK

    When this flag is specified, the SQE forms a link with the next SQE in the submission ring. That next SQE will not be started before the previous request completes. This, in effect, forms a chain of SQEs, which can be arbitrarily long. The tail of the chain is denoted by the first SQE that does not have this flag set. Chains are not supported across submission boundaries. Even if the last SQE in a submission has this flag set, it will still terminate the current chain. This flag has no effect on previous SQE submissions, nor does it impact SQEs that are outside of the chain tail. This means that multiple chains can be executing in parallel, or chains and individual SQEs. Only members inside the chain are serialized. A chain of SQEs will be broken if any request in that chain ends in error.

  • IOSQE_IO_HARDLINK

    Like IOSQE_IO_LINK, except the links aren't severed if an error or unexpected result occurs.

  • IOSQE_IO_DRAIN

    When this flag is specified, the SQE will not be started before previously submitted SQEs have completed, and new SQEs will not be started before this one completes.
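
A sketch of a two-request chain built with IOSQE_IO_LINK: a write followed by an fsync that only starts once the write has completed. It relies on both requests being queued before run_once submits them, since chains do not span submissions. The temporary file and the bookkeeping variables are illustrative assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Uring qw(IOSQE_IO_LINK IORING_FSYNC_DATASYNC);

my ($fh, $path) = tempfile(UNLINK => 1);

my $ring    = IO::Uring->new(8);
my $payload = "chained write\n";
my $pending = 2;
my %result;

# IOSQE_IO_LINK ties this write to the next queued request.
$ring->write($fh, $payload, 0, IOSQE_IO_LINK, sub {
    my ($res, $cqe_flags) = @_;
    $result{write} = $res;    # closing over $payload keeps the buffer alive
    $pending--;
});

# No link flag here, so this fsync is the tail of the chain.
$ring->fsync($fh, IORING_FSYNC_DATASYNC, 0, sub {
    my ($res, $cqe_flags) = @_;
    $result{fsync} = $res;    # negative if the chain was broken
    $pending--;
});

# Both requests are submitted together by the first run_once.
$ring->run_once while $pending;

printf "write result: %d, fsync result: %d (payload length %d)\n",
    $result{write}, $result{fsync}, length $payload;
```

If the write ends in error, the chain is broken and the fsync callback still runs, but with a negative result.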

Completion flags

These are values set in the $flags arguments of the event callbacks. They include:

  • IORING_CQE_F_MORE

    If set, the application should expect more completions from the request. This is used for requests that can generate multiple completions, such as multi-shot requests, receive, or accept.

  • IORING_CQE_F_SOCK_NONEMPTY

    If set, upon receiving the data from the socket in the current request, the socket still had data left on completion of this request.

Event specific flags

cancel

  • IORING_ASYNC_CANCEL_ALL

    Cancel all requests that match the given criteria, rather than just canceling the first one found. Available since 5.19.

  • IORING_ASYNC_CANCEL_FD

    Match based on the file handle used in the original request rather than the user_data. Available since 5.19.

  • IORING_ASYNC_CANCEL_ANY

    Match any request in the ring, regardless of user_data or file handle. Can be used to cancel any pending request in the ring. Available since 5.19.

fsync

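The only allowed flag value for fsync:

  • IORING_FSYNC_DATASYNC

    If set, fsync behaves like fdatasync(2): file metadata is not flushed unless it is needed to retrieve the data.

link / linkat

  • AT_SYMLINK_FOLLOW

    If $old_path is a symbolic link, dereference it and link to its target rather than to the symbolic link itself.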

recv / send / sendto

  • IORING_RECVSEND_POLL_FIRST

    If set, io_uring will assume the socket is currently empty and that attempting to receive data will be unsuccessful. In that case, io_uring arms an internal poll and triggers the receive once the socket has data to be read. The initial receive attempt can be wasteful when the socket is expected to be empty; setting this flag bypasses it and goes straight to arming poll. Once poll indicates that data is ready to be received, the operation proceeds.

rename / renameat

  • RENAME_EXCHANGE

    Atomically exchange oldpath and newpath. Both pathnames must exist but may be of different types (e.g., one could be a non-empty directory and the other a symbolic link).

  • RENAME_NOREPLACE

    Don't overwrite newpath of the rename. Return an error if newpath already exists.

    RENAME_NOREPLACE can't be employed together with RENAME_EXCHANGE. RENAME_NOREPLACE requires support from the underlying filesystem.

timeout

  • IORING_TIMEOUT_ABS

    The value specified in $timespec is an absolute value rather than a relative one.

  • IORING_TIMEOUT_BOOTTIME

    The boottime clock source should be used.

  • IORING_TIMEOUT_REALTIME

    The realtime clock source should be used.

  • IORING_TIMEOUT_ETIME_SUCCESS

    Consider an expired timeout a success in terms of the posted completion. This means it will not sever dependent links, as a failed request normally would. The posted CQE result code will still contain -ETIME in the res value.

  • IORING_TIMEOUT_MULTISHOT

    The request will return multiple timeout completions. The completion flag IORING_CQE_F_MORE is set if more timeouts are expected. The value specified in $count is the number of repeats. A value of 0 means the timeout is indefinite and can only be stopped by a removal request. Available since the 6.4 kernel.

unlink / unlinkat

  • AT_REMOVEDIR

    If the AT_REMOVEDIR flag is specified, unlink / unlinkat performs the equivalent of rmdir(2) on $path.

waitid

waitid has various constants defined for it. The following values are defined for $id_type:

  • P_PID

    This indicates the identifier is a process identifier.

  • P_PGID

    This indicates the identifier is a process group identifier.

  • P_PIDFD

    This indicates the identifier is a pidfd.

  • P_ALL

    This indicates the identifier will be ignored and any child is waited upon.

The following constants are defined for the $options argument:

  • WEXITED

    Wait for children that have terminated.

  • WSTOPPED

    Wait for children that have been stopped by delivery of a signal.

  • WCONTINUED

    Wait for (previously stopped) children that have been resumed by delivery of SIGCONT.

  • WNOWAIT

    Leave the child in a waitable state; a later wait call can be used to again retrieve the child status information.

AUTHOR

Leon Timmermans <fawaka@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2025 by Leon Timmermans.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.