hey @zwol i saw this stackoverflow answer you gave explaining why opening a directory by inode would circumvent access control rules--do i understand correctly then that opening any file/directory within linux requires providing either:
(1) an absolute path
(2) a path relative to the process cwd
i'm doing a pretty normal directory traversal (multithreaded though) and i have always wanted some sort of API which doesn't require me to construct a full path string for each intermediate and terminal entry (having worked on build tools, i'm terrified of overrunning total path length). i'm finding name_to_handle_at()/open_by_handle_at() in linux--is that worth using, or are there any downsides i should be aware of?
as i look it up, it seems that maximum path length is filesystem-specific, and that ext4 has no maximum, while macos is the one that imposes path length limits. i still think it would be super great to have a standard API for "open this entry relative to an opened parent dir", and i would really like it if that could even avoid having to specify the entry's filename, ideally by providing some sort of token in the result of readdir() that could be opened into an fd
it just seems like a waste to have to go through path resolution again in order to access directory entries, when we already did the work in readdir() to get the relevant info about the entry we want to open next. while the inode itself is way too low-level for this, it seems like adding a field to dirent which can be opened without constructing a path string might be useful? am i missing some existing discussion about this?
cc @dalias but no need to respond either of you, this is mostly brainstorming
@zwol @dalias i'm just going to do the normal thing and open file descriptors by full paths (absolute or relative to cwd), since i'm looking to do something portable here (i'm rewriting the rust ignore crate). the rust stdlib read_dir() method just retains a ref-counted pointer to the parent directory's path string, so i'm just going to take advantage of that, but it feels like a silly workaround to have to resolve paths for every recursive entry
and also lol @zwol i LOVE the bit on your stackoverflow profile:
DO NOT CONTACT ME WITH ANY SORT OF JOB OFFER.
amazing legendary vibes especially the rest of it being steadfast against LLMs but that's a fantastic header is all
@hipsterelectron @zwol Did you miss openat()/the whole *at() family of functions added in POSIX 2008? I think these are exactly what you want. They operate on a pathname relative to a given open directory fd, rather than implicitly using the process-global current working directory (which behaves like a hidden file descriptor).
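A rough sketch of what an openat()-style traversal could look like, written in Python only because its os module exposes the same calls through the dir_fd parameter (os.open(name, flags, dir_fd=fd) is openat() underneath). The helper name walk_fd is made up for illustration, and this assumes a Unix platform where os.scandir() accepts a file descriptor. Only single-component names are ever handed to the kernel, so no full path string is built.

```python
import os

def walk_fd(dir_fd, depth=0):
    """Yield (depth, name, is_dir) for every entry below the open directory dir_fd.

    Hypothetical helper: each child is opened relative to its parent's fd,
    so the total path length never matters.
    """
    # Read the whole directory first so the scandir handle is closed
    # before we recurse (keeps the number of open fds proportional to depth).
    with os.scandir(dir_fd) as entries:
        found = sorted((e.name, e.is_dir(follow_symlinks=False)) for e in entries)
    for name, is_dir in found:
        yield depth, name, is_dir
        if is_dir:
            # openat(dir_fd, name, ...): resolved relative to the parent fd.
            child = os.open(
                name,
                os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                dir_fd=dir_fd,
            )
            try:
                yield from walk_fd(child, depth + 1)
            finally:
                os.close(child)
```

To use it, open the root once with os.open(root, os.O_RDONLY | os.O_DIRECTORY), iterate walk_fd(root_fd), and close the fd afterwards. The name still has to be passed for each entry, so this does not give the "token from readdir()" wished for above, but the kernel only resolves one path component per open.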
@hipsterelectron @zwol The whole name_to_handle_at/open_by_handle_at stuff you mentioned is NOT a good idea to use. It's probably only conditionally available (not on old kernels, not in hardened environments, maybe not even on current kernels at all except with right CONFIG_?) and is intended for implementing NFS/Samba type stuff, not for normal application usage.
@hipsterelectron @zwol In terms of the access rights it grants and how it maintains a reference, the cwd is pretty much exactly like a hidden fd.
@hipsterelectron @zwol And you can convert it to a real fd with open(".", ...) 😁
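A small sketch of that idea, assuming a Unix platform: open(".") turns the hidden cwd reference into an ordinary fd, which can later be handed to fchdir() to get back. The helper name with_cwd is hypothetical.

```python
import os

def with_cwd(path, fn):
    """Run fn() with the cwd temporarily set to path, then restore it.

    Hypothetical helper. The cwd is process-global, so this is not safe
    to call while other threads rely on relative paths.
    """
    # Pin the current directory as a real fd before moving away from it.
    saved = os.open(".", os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.chdir(path)
        return fn()
    finally:
        # fchdir() goes back via the fd, even if the original directory
        # was renamed in the meantime.
        os.fchdir(saved)
        os.close(saved)
```

Restoring through the fd rather than a saved path string is the point here: the fd keeps a reference to the directory itself, not to its name.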
@dalias the cwd being global felt unnerving to me but thinking of it as a resource like an fd (similar to stdio streams) makes me feel much more comfortable about it
@hipsterelectron I think it makes sense to think of it as a fd belonging to main, where code that doesn't intend to be implicitly using that state should be using the *at() functions, where the caller can give it an explicit dir fd relative to which it should work, or AT_FDCWD to mean "main's".
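One way to picture that convention, as a sketch rather than anything definitive: a function takes an optional directory fd, and "no fd given" means "use main's cwd". In Python the dir_fd=None default plays the role of AT_FDCWD; the function name read_entry is made up for illustration.

```python
import os

def read_entry(name, dir_fd=None):
    """Read the file `name` relative to dir_fd.

    Hypothetical helper. With dir_fd=None the lookup is relative to the
    process cwd, which corresponds to passing AT_FDCWD to openat() in C.
    """
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    # os.fdopen takes ownership of fd and closes it when the block exits.
    with os.fdopen(fd, "rb") as f:
        return f.read()
```

Library code written this way never touches the global cwd on its own; the caller decides whether it applies.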
@hipsterelectron @zwol Probably because Rust is allergic to POSIX and wants non-POSIX Windows runtimes to be first-class... 🙄
@hipsterelectron @zwol There's definitely a way to do this on Windows though. I recall it coming up during midipix development. They implemented it on top of NTAPI handle stuff.
@hipsterelectron Unfortunately there are two philosophies here:
1. POSIX is one of "many" types of platforms, special-cased like "all" the others.
2. POSIX is the agreed-upon standard for how software interfaces with the operating system, and on the ONE system that doesn't at least try to provide it, you patch things up as needed.
Unfortunately Rust seems to be in camp 1...
@dalias @hipsterelectron yeah that's exactly what I was going to bring up
The "by_handle" calls were designed for user space file servers (NFS, SMB) and I have the impression they are not very useful for anything else
@zwol @hipsterelectron Yep. From the man page:
"The caller must have the CAP_DAC_READ_SEARCH capability to invoke open_by_handle_at()."
IOW only root or near-equivalent can use these interfaces because they bypass the standard access controls.
So the only use case is "I want ZOMG-FAST NFS/Samba and I don't care if it probably introduces security bugs".
@lispi314 @hipsterelectron Sorry I can't be bothered to reply to repeated delete-and-redraft. Doesn't your instance have edit if needed??
@dalias @hipsterelectron while posix is common in the server space, it is not the main or even a supported API used by programs on most other platforms, and so it makes sense that the OS interface APIs a language provides model a common denominator across platforms (which, most importantly, is not posix)
When you pretend everything is posix, your OS APIs will be modeled after POSIX as well, hardcoding design decisions that are simply not supported on other oses in the same way, if at all.
For example, you might provide a call to get or set the current permissions of a file. on posix this is a simple wrapper around stat and chmod, but what about platforms that only let you mark a file as read only?
you can implement the "get the file mode" call, sure, but setting it will simply not behave like posix chmod, no matter how hard you try. this will introduce bugs in software that simply doesn’t know or hasn’t accounted for these differences
And this example isn’t taken out of nowhere — that would be how files work without ACLs in windows
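For a concrete feel of that mismatch, here is a minimal sketch using Python's os.chmod, which is documented to only set the read-only flag on Windows and ignore all other bits. The helper name roundtrip_mode is hypothetical.

```python
import os
import stat

def roundtrip_mode(path, mode):
    """Set permission bits on path and report what the OS says afterwards.

    Hypothetical helper. On POSIX the result normally equals `mode` for a
    regular file you own; on Windows only the read-only flag is honored,
    so the value read back can differ from what was requested.
    """
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)
```

Code that assumes roundtrip_mode(path, 0o640) always returns 0o640 is exactly the kind of software that quietly breaks on a platform with a different permission model.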