userbinator 6 hours ago

I don't think this behaviour is "peculiar" as the author says it is; why does the error number matter if you know the call succeeded? GetLastError() on Windows works similarly, although with the additional weird caveat that (undocumentedly) some functions may set it to 0 on success.

The system call wrappers could all have explicitly set errno to 0 on success, but they didn't.

Because it's plainly unnecessary. It'd be a waste today, and even more so on a PDP-11 in the 1970s.
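
To make the "only look at errno once you know the call failed" point concrete, here is a minimal sketch. The helper name `try_open` is made up for illustration; the only real calls are open() and close().

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, or the errno value on failure.
 * errno is consulted only after open() has reported failure by returning -1. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return errno;   /* meaningful: the call just failed */
    close(fd);
    return 0;           /* success: whatever errno holds is stale, ignore it */
}
```

Even if errno was left at EACCES by something earlier, a successful try_open("/") still reports 0, because the return value is the only thing consulted on the success path.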

  • aa-jv an hour ago

    I agree with you, it's not as peculiar as one might think. (Disclaimer: been writing software for Unix since before POSIX...)

    This design choice reflects the POSIX philosophy of minimizing overhead and maximizing flexibility for system programming. Frequent calls to write(), for example, would be hindered by having to reset errno on every call and every check of write()'s return value, especially when a lot of write() calls are queued.

    Or .. a library function like fopen() might internally call multiple system calls (e.g., open(), stat(), etc.). If one of these calls sets errno but the overall fopen() operation succeeds, the library doesn’t need to clear errno. For instance, fopen() might try to open a file in one mode, fail with an errno of EACCES (permission denied), then retry in another mode and succeed. The final errno value is irrelevant since the call succeeded.

    This mechanism minimizes overhead by not requiring errno to be reset on success.

    It allows flexible, efficient implementations of library and system calls and encourages robust programming practices by mandating return-value checks before inspecting errno.

    It supports complex, retry-based logic in libraries without unnecessary state management - and it preserves potentially useful debugging information.

    You only care about errno when you know an actual error occurred. Until then, ignore it.

    This is similar to other systems-level things that can occur in such environments, for example when setting a hard Reset-Reason or Fail-Reason register in static/non-volatile memory somewhere, for later assessment.

    IMHO, the thing that's most peculiar about this is that folks these days think of it as weird/quaint, when in fact it makes a lot of sense if you think about it.

    • amelius 21 minutes ago

      Does Valgrind give a warning when you do check errno after a successful system call?

jezze 4 hours ago

I think it would have been better if they had designed it so that the error message from the kernel came in a separate register. That would mean you didn't have to use a signed int for the return value. The issue is that one register is now ambiguous: it holds either the thing you want or the error, but these are separate types. If you had them in separate registers, you would have the natural type of the thing you are interested in without having to convert it. This would, however, force you to check the value in the error register before using the value in the return register, but that makes more sense to me than the opposite.
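
Roughly what I mean, expressed at the C level rather than in registers. This is only a sketch: `struct io_result` and `checked_write` are names I made up, and the wrapper still sits on top of the ordinary write() call.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result type: the value keeps its natural unsigned type,
 * and the error travels in its own field (0 means no error). */
struct io_result {
    size_t count;
    int    err;
};

/* Hypothetical wrapper around write(2) that separates value and error. */
struct io_result checked_write(int fd, const void *buf, size_t len)
{
    struct io_result r = { 0, 0 };
    ssize_t n = write(fd, buf, len);
    if (n == -1)
        r.err = errno;          /* failure: report the error, count stays 0 */
    else
        r.count = (size_t)n;    /* success: no signed/unsigned juggling for the caller */
    return r;
}
```

The caller checks `err` first and only then uses `count`, which is the ordering described above.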

  • bhawks 2 hours ago

    A whole separate register?

    That is quite expensive. Obviously you need to physically add the register to the chip.

    After that the real work comes. You need to change your ISA to make the register addressable by machine code. The PDP-11 had 8 general-purpose registers, so it used 3 bits everywhere to address them. Now we sometimes need 4. Many opcodes work on 2 registers, so we need 8 out of 16 bits to address both where before we only needed 6. Also, the PDP-11 had a fixed 16-bit instruction encoding, so either we change it to 18-bit instructions or make more radical changes to the ISA.

    This quickly spirals into significant amounts of work versus encoding results and error values into the same register.

    Classic worse is better example.

    • dwattttt 2 hours ago

      > A whole separate register?

      There are quite a few registers (in all the ISAs I'm familiar with) that are defined as not preserved across calls; kernels already have to wipe them in order to avoid leaking kernel-specific data to userland, one of them could easily hold additional information.

      EDIT: additionally, it's been a long time since the register names we're familiar with in an ISA actually matched the physical registers in a chip.

rwmj 3 hours ago

By far the largest issue with errno is that we don't record where inside the kernel the error gets set (or "raised" if this was a non-C language). We had a real customer case recently where a write call was returning ENOSPC, even though the filesystem did not seem to have run out of space, and searching for the place where that error got raised was a multi-week journey.

In Linux it'd be tough to implement this because errors are usually raised as a side effect of returning some negative value, but also because you have code like:

  err = -EIO;
  ... nothing else sets err here ...
  return err;

But instrumenting every function that returns a negative int would be impossible (and wrong). And there are also cases where the error is saved in (eg) a bottom half and returned in the next available system call.

toast0 5 hours ago

Imho, this is an area where the limitations of C shine through.

Some kernels return error status as a CPU flag or otherwise separately from the returned value. But that's very hard to use in C, so the typical convention for a syscall wrapper is to return a non-negative number for success and -error for failure, but if negative numbers are valid as the return, you've got to do something else.
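
A sketch of that convention, assuming the Linux-style encoding where raw results in the range [-4095, -1] mean -errno and anything else is a successful result. The names `RAW_MAX_ERRNO` and `raw_to_libc` are invented for illustration; real libc wrappers do this in their own way.

```c
#include <errno.h>

/* Assumption: Linux-style raw results, where values in [-4095, -1]
 * encode -errno and everything else is a successful result. */
#define RAW_MAX_ERRNO 4095L   /* hypothetical name for the cutoff */

/* Hypothetical conversion from a raw kernel result to the libc convention. */
long raw_to_libc(long raw)
{
    if (raw < 0 && raw >= -RAW_MAX_ERRNO) {
        errno = (int)-raw;   /* store the error number on failure only */
        return -1;
    }
    return raw;              /* success: errno is deliberately left alone */
}
```

The range check is the "something else" for results that look negative but are valid: only a small window just below zero is reserved for errors, and anything outside it passes through unchanged.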

badc0ffee 6 hours ago

Something worth mentioning would have been those libc calls where the only way to tell if a return value of 0 is an error is to check errno. And of course, as the article says, errno is only set on error, so you need to set it to 0 before making that libc call.

I think strtol was one such function, but there were others.
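
From memory, the strtol pattern looks something like the sketch below. `parse_long` is a made-up helper name. As far as I recall, "no digits at all" is detected through the end pointer rather than errno, while out-of-range input sets errno to ERANGE and returns a clamped value, which is why errno has to be zeroed first.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 and stores the value on success, -1 otherwise. */
int parse_long(const char *s, long *out)
{
    char *end;
    errno = 0;                       /* clear first: strtol leaves errno alone on success */
    long v = strtol(s, &end, 10);
    if (end == s)
        return -1;                   /* no digits were consumed */
    if (errno == ERANGE)
        return -1;                   /* value was clamped to LONG_MAX or LONG_MIN */
    *out = v;
    return 0;
}
```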

  • ethan_smith 2 hours ago

    strtol() isn't actually such a case - it sets errno on range errors but returns 0 for valid input "0"; the classic examples are actually getpriority() and sched_yield() which can legitimately return -1 on success.

  • Asmod4n 6 hours ago

    getservbyname is also one of those functions: it returns NULL both on error and to signal that it couldn't find what you looked for.

tptacek 5 hours ago

Further gnarliness, hopefully long past relevance:

https://cr.yp.to/docs/connect.html

  • JdeBP an hour ago

    You might think so. (-:

    * https://github.com/jdebp/nosh/blob/trunk/source/socket_conne...

    kqueue() can apparently return the error right in the data of the kevent, but I'm still using poll() so cannot confirm; whilst I can confirm that kqueue/kevent is alas not as truly consistent as one might expect. (Someone recently tried to move FreeBSD devd to kqueue, and hit various problems of FreeBSD devices that are still, even in version 14, not yet kqueue-ready.)

amelius 4 days ago

Why didn't they mention threads?

  • bartvk 4 days ago

    Oh gosh, that's interesting. I bet that complicates using errno. Or is errno somehow copied into a local variable?

    • Vogtinator 4 days ago

      errno is in thread-local storage (TLS)

      • wahern 6 hours ago

        Notably, the POSIX Threads API itself (i.e. pthread_ routines) returns errors directly rather than through errno.

        • JdeBP 29 minutes ago

          It was a common design of new operating system APIs, not encumbered by PDP-11 compatibility, in the 1980s.

          * https://tty0.social/@JdeBP/114816928464571239

          Even some of the later augmentations in MS/PC/DR-DOS did things like return an error code in AX and the result in (say) CX instead of using AX and CF.

      • amelius 4 days ago

        Yes. It is too bad that they didn't use a similar solution for the current working directory. chdir() is process-wide, not thread-local :(

        • o11c 6 hours ago

          `openat` has basically solved that since 2.6.16 (which came out in 2006). There are still some uncommon APIs that have been slow to gain `at` variants, but there's usually a workaround (for example, `getxattrat` and family were only added in 6.13, this January, but can be implemented in terms of `openat` + `fgetxattr`).

          • hvenev 3 hours ago

            > can be implemented in terms of `openat` + `fgetxattr`

            Except for symlinks. `fgetxattr` requires a file opened for read or write, but symlinks can only be opened as `O_PATH`.