
Context

I'm writing code that emulates one aspect of pidfds on platforms that don't support them (old Linux, other Unix).

I'm doing this a) in order to test some pidfd-related code on very old platforms and b) as a personal challenge. Mostly as a personal challenge/for fun.

I'm explicitly not trying to re-implement all pidfd functionality; I don't actually care about processes, /proc, or PIDs at all. Instead, I'm trying to emulate just one aspect of pidfds: the fact that a pidfd's file descriptor has three behaviours:

  1. If opened in blocking mode, block on read.
  2. If opened in, or fcntl'd to, nonblocking mode, return EWOULDBLOCK on read.
  3. Explicitly return EINVAL on read.

That third part is the tricky part, and an unusual aspect of pidfds at this time (very little else intentionally returns EINVAL for read()).

Question

I want to induce that behaviour on some other kind of file descriptor (doesn't matter what kind). Specifically, I want a file descriptor that:

  • By default, obeys the usual O_NONBLOCK (or not) behaviour.
  • After I do something to it, all calls to read(2) that would not ordinarily return an error will instead return EINVAL, regardless of the parameters to read(2).

This turns out to be surprisingly tricky.

What I've tried

read(2)'s manual page for EINVAL says it's returned if:

fd is attached to an object which is unsuitable for reading; or the file was opened with the O_DIRECT flag, and either the address specified in buf, the value specified in count, or the file offset is not suitably aligned.

...or if an invalid buffer size is passed to read(2) on a timerfd.

Neither the timerfd case nor the O_DIRECT case satisfies my requirements, as they only return EINVAL if certain arguments are passed to read(2), and I want it to be returned in all non-erroring cases.

I've also tried signalfds (couldn't find a case that returned EINVAL on read), inotify FDs (same), and various permutations of forcibly close(2)d or shutdown(2) pipes, FIFOs, and anonymous sockets.

I'm not that well-versed in POSIX trivia, though, so it's entirely possible I missed something that allows a file descriptor type I've already experimented on to return EINVAL.

Bonus points if there's a solution that works on BSD/macOS, but really anything is better than nothing, even if it's Linux-specific or kernel-version-specific.

I've tried some of the other tricks on this question, but they largely generate error codes other than EINVAL.

  • Write your own pidfd_read() function that's normally just a wrapper over read() instead of directly using read()? That way you have control over what it returns when running tests.
    – Shawn
    Commented Jul 5 at 19:59
  • Unfortunately, the things I want to integrate with are libraries that take a file descriptor number, and I'd rather not resort to e.g. LD_PRELOAD hacks to shim in a new version of read(2) (nor can I rely on dynamic linking even if that were an option).
    – Zac B
    Commented Jul 5 at 20:16
  • Manpage for read(2) EINVAL also says "fd was created via a call to timerfd_create(2) and the wrong size buffer was given to read(); see timerfd_create(2) for further information." -- have you followed up there? Also, I think @Shawn was suggesting you simply write a wrapper for read, not dynamically load something already part of another library. That way you control under what circumstances your wrapper returns EINVAL given the parameters and whatever testing you wish to do to arrive at the circumstance you want to return EINVAL. Another good option. Commented Jul 5 at 23:02
  • Also, what is your use-case for opening with O_DIRECT? Check Use of O_DIRECT on Linux for various cases and concerns. Commented Jul 5 at 23:13
  • Ah sorry, I could have been clearer. I meant that I want to pass this file descriptor to APIs I don't control that take an fd number and call read(2) internally, so replacing read() with my own function is not really ideal. The timerfd_create(2) and O_DIRECT discussions are only relevant insofar as the manpage for read(2) mentions those systems as ways to make read() return EINVAL. Unfortunately, both timerfd and O_DIRECT's induced EINVAL behaviors depend on the arguments to read(), which doesn't satisfy my requirements of all read() calls returning EINVAL.
    – Zac B
    Commented Jul 6 at 0:05

2 Answers

0

The O_DIRECT case can actually satisfy your requirement, because the file offset isn't one of the arguments to read. If you do int fd = open("/bin/sh", O_RDONLY|O_DIRECT); lseek(fd, 1, SEEK_SET);, then read(fd, buf, count); should fail with EINVAL no matter what buf and count are.
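A minimal sketch of that idea follows. The helper name open_misaligned_direct is hypothetical, and several things here are assumptions: that the platform defines O_DIRECT at all, that the filesystem holding the file accepts O_DIRECT at open time, and that it enforces offset alignment on read. Where any of those does not hold, the read may not fail with EINVAL.

```c
#define _GNU_SOURCE /* glibc only exposes O_DIRECT when this is defined */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the answer): open `path` with O_DIRECT and
 * move the file offset to 1 so that it is no longer block-aligned.
 * Returns the descriptor, or -1 with errno set.
 */
int open_misaligned_direct(const char *path)
{
#ifdef O_DIRECT
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1; /* e.g. the filesystem rejects O_DIRECT */

    if (lseek(fd, 1, SEEK_SET) != 1) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
#else
    (void)path;
    errno = ENOTSUP; /* this platform does not define O_DIRECT */
    return -1;
#endif
}
```

A caller would then pass the returned descriptor to whatever calls read(fd, buf, count) and check for errno == EINVAL. Note that this only covers the always-EINVAL state; a regular file opened this way does not give the blocking or EWOULDBLOCK behaviours beforehand.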

-1

Call read() with a negative size, as in:

$ cat >so_$$.c
#include <stdio.h>
#include <unistd.h>

char buffer[100];

int main(void)
{
    /* -10 converts to a huge size_t (larger than SSIZE_MAX) */
    ssize_t bytes_read = read(0, buffer, (size_t)-10);
    if (bytes_read < 0) {
        perror("read");
    }
    return 0;
}
$ make so_$$
cc -O2 -pipe  so_43241.c  -o so_43241
$ ./so_$$
read: Invalid argument
$ uname -a
FreeBSD europa.lcssl.es 14.0-STABLE FreeBSD 14.0-STABLE #10 stable/14-n266056-70025e767f28: Wed Dec 27 11:41:52 EET 2023     [email protected]:/home/lcu/obj/home/usr/src/amd64.amd64/sys/GENERIC amd64
$ _
  • Is there a way to reserve a closed FD number so it doesn't get reused? If so, this will definitely work. As it is, when I tried this and let my tests run in the presence of other threads/parts of the program for a while, the tests started failing when other stuff reused the closed file descriptor number and reads started succeeding.
    – Zac B
    Commented Jul 8 at 12:23
  • No, that results in EBADF. Commented Jul 8 at 23:42
  • @JosephSible-ReinstateMonica, edited... Commented Jul 9 at 7:13
  • Now it results in EFAULT. Commented Jul 9 at 14:01
  • Nope, it resulted in EINVAL, if you passed a valid buffer, at least on my system Commented 19 hours ago
