Windows NT synchronization primitives for Linux

By Jonathan Corbet
February 16, 2024

The futex mechanism provided by the kernel allows for the creation of efficient and flexible locking primitives in user space. Futexes work well for many applications, but not all. One of the exceptions, it seems, is that perennially difficult-to-support use case: Windows games. With this patch series, Elizabeth Figura seeks to provide the sort of locking that those games need, by way of a special-purpose virtual device.

The performance of a futex can be hard to beat when it is used as intended; in the uncontended case, there is no need for a system call at all to acquire one. Unsurprisingly, though, the Windows NT locking primitives were not designed with the objective of being easy to implement efficiently with futexes; as a result, there are certain operations supported by Windows that are not straightforward to implement on Linux. At the top of the list are operations requiring the simultaneous acquisition of multiple locks.

Applications written for Unix-like systems normally do not suffer from the lack of Windows-style locking primitives, but Windows applications that have been made to run on Linux often will. Until now, these applications have been supported in Wine by creating a special process to arbitrate access to locks. That solution can work, but it adds an interprocess-communication overhead to every locking operation, which hurts performance. The new device takes the place of that process, handling locking in the kernel without the communication overhead.

To use this feature, a process opens the new special file /dev/ntsync. Every open of that file creates a new instance that is distinct from all of the others, so the intended use case is a single process that shares an instance across multiple threads. Each instance provides a whole set of ioctl() operations (all described in this patch). The first step in using those operations will be to create the locks to be managed by the device; they come in three flavors:

A mutex is similar to the kernel equivalent; it is a lock that can be held by a single owner at a time. Locking calls can nest, though: once a thread has acquired a mutex it can do so again any number of times. Once all of the acquisition calls have been matched with release calls, the mutex is freed.
A semaphore is a counter, as one would expect. Every acquisition decrements the counter by one; as long as the counter is nonzero, the semaphore remains available.
An event is a condition variable; it has a boolean value, and threads can wait until it becomes true. If the event is marked for auto-reset, it will be reset to false as soon as a wait is satisfied, meaning that only one thread will see the event become true. Otherwise, an event, once set to true, stays that way until explicitly reset.
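The semantics of the three object types can be illustrated with a small user-space model. This is only a sketch of the behavior described above, not the driver's code: the structure and function names are hypothetical, an owner of zero is assumed to mean "unowned", and no blocking or thread safety is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; owner == 0 is assumed to mean "unowned". */
struct model_mutex { uint32_t owner; uint32_t count; };
struct model_sem   { uint32_t count; };
struct model_event { bool signaled; bool auto_reset; };

/* A mutex can be taken if it is free or already held by the caller. */
static bool model_mutex_try_acquire(struct model_mutex *m, uint32_t tid)
{
    if (m->count != 0 && m->owner != tid)
        return false;
    m->owner = tid;
    m->count++;
    return true;
}

/* Each release undoes one acquisition; the mutex is free at count zero. */
static bool model_mutex_release(struct model_mutex *m, uint32_t tid)
{
    if (m->count == 0 || m->owner != tid)
        return false;
    if (--m->count == 0)
        m->owner = 0;
    return true;
}

/* A semaphore is available as long as its counter is nonzero. */
static bool model_sem_try_acquire(struct model_sem *s)
{
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/* A satisfied wait clears an auto-reset event; others stay set. */
static bool model_event_try_wait(struct model_event *e)
{
    if (!e->signaled)
        return false;
    if (e->auto_reset)
        e->signaled = false;
    return true;
}
```

In this model, a thread that acquires a mutex twice must release it twice before another thread can take it, and a second wait on a set auto-reset event fails while a second wait on a manual-reset event succeeds.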

The NTSYNC_IOC_CREATE_MUTEX, NTSYNC_IOC_CREATE_SEM, and NTSYNC_IOC_CREATE_EVENT ioctl() calls can be used to create a mutex, semaphore, or event, respectively. On success, each of these operations returns a file descriptor that can be used to operate on the created object. The API is a bit different than one might expect, in that the file descriptor is not the return value from ioctl(); instead, it is stored in a structure passed by user space.

For example, to create a mutex, a thread starts with this structure:

   struct ntsync_mutex_args {
   	__u32 mutex;
   	__u32 owner;
   	__u32 count;
   };

On entry to the NTSYNC_IOC_CREATE_MUTEX call, the value of mutex is ignored. The owner field is set to the (application-defined) ID of the initial owner of the mutex, while count is set to the number of times the mutex has been acquired by that owner. To create a mutex that is not yet owned by anybody, both of those fields will simply be set to zero. On a successful return, the file descriptor corresponding to this mutex will be stored in the mutex field.
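A creation call along these lines might look like the following sketch. Several things here are assumptions rather than part of the patch: the structure is a local mirror using uint32_t in place of __u32, the helper name is hypothetical, and the ioctl() request number is passed in by the caller (it would normally come from the driver's header, which may not be installed). The check that owner and count are either both zero or both nonzero is likewise this sketch's guess at a sensible consistency rule.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Local mirror of the structure above (assumption: same layout). */
struct ntsync_mutex_args_sketch {
    uint32_t mutex;   /* ignored on entry; receives the new descriptor */
    uint32_t owner;   /* application-defined owner ID, 0 for none */
    uint32_t count;   /* recursion count held by that owner */
};

/*
 * Hypothetical helper: create a mutex on an open /dev/ntsync descriptor.
 * create_req stands in for NTSYNC_IOC_CREATE_MUTEX and is supplied by
 * the caller. Returns the new descriptor, or -1 with errno set.
 */
static int ntsync_create_mutex_sketch(int dev_fd, unsigned long create_req,
                                      uint32_t owner, uint32_t count)
{
    struct ntsync_mutex_args_sketch args = {
        .mutex = 0, .owner = owner, .count = count,
    };

    /* Assumed rule: an owned mutex needs a count, an unowned one has none. */
    if ((owner == 0) != (count == 0)) {
        errno = EINVAL;
        return -1;
    }

    /* The return value only signals success; the fd arrives in args. */
    if (ioctl(dev_fd, create_req, &args) < 0)
        return -1;
    return (int)args.mutex;
}
```

Passing zero for both owner and count would request an unowned mutex. The point to notice is that the descriptor is read back out of the structure rather than taken from the ioctl() return value.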

A number of operations are provided for manipulating these objects. For mutexes, NTSYNC_IOC_READ_MUTEX will return the current state of a mutex, while NTSYNC_IOC_MUTEX_UNLOCK will unlock a (currently locked) mutex. A slightly strange one is NTSYNC_IOC_KILL_OWNER, which doesn't actually kill anything; it takes a thread ID as an argument and, if that ID is the owner of the mutex, that mutex will be freed and marked as "abandoned". The next attempt to acquire the mutex will return an error status of EOWNERDEAD, but the acquisition will have succeeded anyway.
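The "abandoned" behavior is easier to see in a small model. The sketch below is a user-space illustration with hypothetical names; it assumes an owner of zero means "unowned" and uses EAGAIN to stand in for "the caller would have to wait", which is a choice of this sketch rather than anything the driver is known to return.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a mutex that remembers being abandoned. */
struct model_amutex {
    uint32_t owner;      /* 0 means unowned (assumption) */
    uint32_t count;
    bool abandoned;
};

/* Model of the "kill owner" step: frees the mutex only if tid holds it. */
static void model_kill_owner(struct model_amutex *m, uint32_t tid)
{
    if (m->count != 0 && m->owner == tid) {
        m->owner = 0;
        m->count = 0;
        m->abandoned = true;
    }
}

/*
 * Returns 0 for a clean acquisition, EOWNERDEAD when the mutex was
 * acquired but had been abandoned, and EAGAIN (sketch-only) when it
 * is held by somebody else.
 */
static int model_acquire(struct model_amutex *m, uint32_t tid)
{
    if (m->count != 0 && m->owner != tid)
        return EAGAIN;
    m->owner = tid;
    m->count++;
    if (m->abandoned) {
        m->abandoned = false;   /* reported once, to the next acquirer */
        return EOWNERDEAD;
    }
    return 0;
}
```

A caller seeing EOWNERDEAD in this model holds the mutex and must still release it; the status only warns that the state protected by the mutex may have been left inconsistent by the previous owner.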

For semaphores, NTSYNC_IOC_READ_SEM will read the current state, and NTSYNC_IOC_SEM_POST will add a given amount to the semaphore's count (perhaps releasing the semaphore). Events can be queried with NTSYNC_IOC_READ_EVENT and modified with NTSYNC_IOC_SET_EVENT, NTSYNC_IOC_RESET_EVENT, and NTSYNC_IOC_PULSE_EVENT. That last operation acts like an instantaneous set and reset of the event, allowing one or more waiting threads to proceed but never causing the event to appear to be set. The "pulse" operation is one of those that is hard to implement with futexes.
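The difference between setting and pulsing an event can be sketched with a model that tracks how many threads are blocked on the event. The names are hypothetical, and the rule that an auto-reset event releases a single waiter while a manual-reset event releases all of them is an assumption taken from the usual Windows behavior rather than from the patch text.

```c
#include <stdbool.h>

/* Hypothetical model of an event plus a count of blocked threads. */
struct model_pevent {
    bool signaled;
    bool auto_reset;
    unsigned waiters;
};

/* Release waiters: one for an auto-reset event, all of them otherwise. */
static unsigned model_wake(struct model_pevent *e)
{
    unsigned n = e->waiters;

    if (e->auto_reset && n > 1)
        n = 1;
    e->waiters -= n;
    return n;
}

/* Set: wake waiters; the event stays set unless an auto-reset wait used it. */
static unsigned model_set(struct model_pevent *e)
{
    unsigned n = model_wake(e);

    e->signaled = !(e->auto_reset && n > 0);
    return n;
}

/* Reset: simply clear the event. */
static void model_reset(struct model_pevent *e)
{
    e->signaled = false;
}

/* Pulse: wake waiters as a set would, but never leave the event set. */
static unsigned model_pulse(struct model_pevent *e)
{
    unsigned n = model_wake(e);

    e->signaled = false;
    return n;
}
```

After a pulse, a thread that arrives late finds the event unset even though earlier waiters were released; after a set on a manual-reset event, the late thread would proceed immediately.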

Actually acquiring a mutex or semaphore involves calling either NTSYNC_IOC_WAIT_ANY, which will return as soon as it is able to acquire any one of a list of mutexes and semaphores or one of the indicated events is set, or NTSYNC_IOC_WAIT_ALL, which will only return when it is able to atomically acquire all of the indicated resources. The latter operation will make an attempt whenever one of the resources is freed, but will only succeed if all of them happen to be available. It will not hold a partial set of resources while waiting for the rest, so it could be subject to starvation if the resources are heavily contended. Both wait operations include an optional timeout.
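The all-or-nothing character of the wait-all operation can be shown with a model restricted to semaphores. Everything here is a hypothetical sketch: the functions represent a single attempt rather than a blocking wait, the objects in the list are assumed to be distinct, and the model is single-threaded, so the atomicity that the kernel would have to enforce with locking comes for free.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model semaphore: available while count is nonzero. */
struct model_wsem { unsigned count; };

/*
 * One wait-any attempt: take the first available object and return
 * its index, or -1 if the caller would have to wait.
 */
static int model_wait_any(struct model_wsem *objs[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (objs[i]->count > 0) {
            objs[i]->count--;
            return (int)i;
        }
    }
    return -1;
}

/*
 * One wait-all attempt: check everything first, and only then take
 * everything. Nothing is held if any single object is unavailable.
 */
static bool model_wait_all(struct model_wsem *objs[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (objs[i]->count == 0)
            return false;
    for (size_t i = 0; i < n; i++)
        objs[i]->count--;
    return true;
}
```

Because a failed wait-all attempt leaves every counter untouched, other threads doing wait-any on the same objects can keep taking them one at a time, which is where the starvation risk comes from.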

The motivation behind this work becomes clear after a look at the benchmark results provided in the patch cover letter:

The gain in performance varies wildly depending on the application in question and the user's hardware. For some games NT synchronization is not a bottleneck and no change can be observed, but for others frame rate improvements of 50 to 150 percent are not atypical.

The question that has not been directly answered in the cover letter is whether the futex API could have been enhanced to provide the needed functionality without introducing an entirely new API. It would seem (though your editor, needless to say, has not tried to implement it) that the "pulse event" functionality would be relatively straightforward to add. Some aspects of the multi-resource wait operations were provided by the addition of futex_waitv() to the 5.16 kernel, but more work would clearly have to be done. It may well be that adding a standalone virtual device for this niche functionality is easier and less intrusive than trying to coerce futexes into doing the job.

The comments on the first version of the patch set were focused on the details of the API rather than whether a separate device was needed; they resulted in a number of changes leading to the API described here. Subsequent versions, the last of which was posted on February 14, have received relatively few comments so far. So, perhaps, the community is happy with this proposal in its current form, and Linux gamers can look forward to a 131% faster Lara Croft in the near future.

Index entries for this article
Kernel	Locking mechanisms
Kernel	Releases/6.10

Run time loading

Posted Feb 16, 2024 17:23 UTC (Fri) by nickodell (subscriber, #125165) [Link] (3 responses)

Is it possible to load this driver at run-time so that users who haven't used Wine since their last boot won't pay the cost of having this loaded? Or must it be compiled into the kernel?

Run time loading

Posted Feb 16, 2024 17:57 UTC (Fri) by dskoll (subscriber, #1630) [Link]

The patch to Kconfig would indicate it can be compiled as a module.

Run time loading

Posted Feb 16, 2024 22:06 UTC (Fri) by shironeko (subscriber, #159952) [Link]

it can be built as a module, as shown in the first patch

Run time loading

Posted Feb 17, 2024 9:07 UTC (Sat) by grawity (subscriber, #80596) [Link]

What is the cost of having this loaded? (Besides having the word "NT" on your system.)

Windows NT synchronization primitives for Linux

Posted Feb 16, 2024 19:23 UTC (Fri) by abatters (✭ supporter ✭, #6932) [Link] (9 responses)

> Futexes work well for many applications, but not all.

The timing of this made me laugh. A week ago I ran into this years-old bug for the first time:

pthread_cond_signal failed to wake up pthread_cond_wait due to a bug in undoing stealing

To be fair though it is a glibc bug not a futex bug.

Windows NT synchronization primitives for Linux

Posted Feb 17, 2024 7:16 UTC (Sat) by alonz (subscriber, #815) [Link] (8 responses)

But this does illustrate that synchronization primitives are hard.

Windows NT synchronization primitives for Linux

Posted Feb 17, 2024 8:50 UTC (Sat) by Sesse (subscriber, #53779) [Link] (7 responses)

The only thing that's harder is not using them! Lock-free programming is… subtle.

(I guess you could argue that implementing mutexes is a form of lock-free programming…)

Windows NT synchronization primitives for Linux

Posted Feb 17, 2024 9:20 UTC (Sat) by pbonzini (subscriber, #60935) [Link] (4 responses)

Indeed, if you can implement your new synchronization primitives on top of something else (which is what this ntsync work or even the futex system call do, using e.g. waitqueues) it's not that hard. But the futex code in glibc is much harder to write, much more so than the futex implementation in the kernel itself!

Windows NT synchronization primitives for Linux

Posted Feb 17, 2024 9:43 UTC (Sat) by itsmycpu (subscriber, #139639) [Link] (3 responses)

glibc code, for example for mutexes, is somewhat complex mostly because of the many options that it supports. Futexes without the userspace part are "just" a mechanism to wait atomically, not a complete synchronization primitive.

Windows NT synchronization primitives for Linux

Posted Feb 18, 2024 15:57 UTC (Sun) by tialaramex (subscriber, #21167) [Link] (2 responses)

Yeah, compare Rust's Mutex† which, on Linux, ultimately boils down to: https://github.com/rust-lang/rust/blob/master/library/std...

... with what they have to do if you only have POSIX threads: https://github.com/rust-lang/rust/blob/master/library/std...

Between "It's very stupid but POSIX technically allows this, so we need to cope" on one hand, and people just straight up not complying with POSIX anyway on the other I'd have thrown all of my toys out of the pram before writing the latter monster.

† Technically unlike the C mutex, a Rust Mutex<T> is a wrapper for a type T, the code I've linked isn't about that part, just the part with the actual mechanics of a mutex which are platform specific.

Windows NT synchronization primitives for Linux

Posted Feb 18, 2024 19:57 UTC (Sun) by intelfx (subscriber, #130118) [Link] (1 responses)

> with what they have to do if you only have POSIX threads: https://github.com/rust-lang/rust/blob/master/library/std...
>
> [...] I'd have thrown all of my toys out of the pram before writing the latter monster [...]

I'm not seeing anything criminal there. Certainly not a monster. WDYM?

Windows NT synchronization primitives for Linux

Posted Feb 19, 2024 10:31 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

For example when we want to lock the mutex, that can't fail in POSIX, it will just deadlock if we're recursively locking our own mutex, which in Rust's model is fine... too bad Sun didn't agree and so it fails instead.

_ = libc::pthread_mutex_lock(raw(self)); // Should be fine, but instead there's a whole mess

Windows NT synchronization primitives for Linux

Posted Feb 18, 2024 15:32 UTC (Sun) by tialaramex (subscriber, #21167) [Link] (1 responses)

I don't think so, lock freedom guarantees global progress. If our threads are executing and we're a lock free algorithm, we will get work done. It may be the least preferred work, it may get done more slowly than preferred, but some of the work we had gets done. I think mutexes can't promise that, you might spend all of your execution resources on the mechanics and get no work done.

And one step harder is wait freedom, a guarantee of local progress - if our threads are executing specific work will get done, if thread A is squawking a goose, that goose gets squawked, it may not get squawked quickly but it definitely gets squawked, whereas a lock free algorithm is allowed to leave thread A starving forever until some day squawking that goose is globally the only work left to do.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 1:26 UTC (Wed) by itsmycpu (subscriber, #139639) [Link]

Maybe that was just meant to say that implementing a mutex requires similar programming techniques (atomic operations, avoiding issues like race conditions) as lock-free programming.

Of course, there can be (and there are already some) higher level abstractions for lock-free programming as well.

Windows NT synchronization primitives for Linux

Posted Feb 16, 2024 22:28 UTC (Fri) by itsmycpu (subscriber, #139639) [Link]

> The comments on the first version of the patch set were focused on the details of the API rather than whether a separate device was needed; they resulted in a number of changes leading to the API described here. Subsequent versions, the last of which was posted on February 14, have received relatively few comments so far.

Specifically, there seems to be no comment reflecting an extensive (and independent) expertise in synchronization primitives. In this regard, it sounds like a "whatever" response.

Windows NT synchronization primitives for Linux

Posted Feb 18, 2024 2:19 UTC (Sun) by geofft (subscriber, #59789) [Link] (45 responses)

I was wondering how this compares to futex2 / futex_waitv, which was also targeting the use case of Windows synchronization primitives that are heavily used by games. The LPC presentation linked in the patchset covers basically this, especially starting around here re futex: https://youtu.be/NjU4nyWyhU8?t=626 Essentially, futex_waitv covers the "wait for one" operation, but not the "wait for all" or "pulse" operations.

I still don't follow why there can't just be more futex API for this, e.g. a flag to futex_waitv to wait for all instead of any. But it sounds like some of the motivation is to isolate the weird NT-compatibility APIs into their own place (this is being proposed as a driver, not a syscall, and there are tons of niche drivers in the kernel already) and lower the risk of messing up syscalls that other applications use.

(Also - why isn't the "pulse" operation just FUTEX_WAKE without actually writing to the futex location? I'm certainly missing something....)

Windows NT synchronization primitives for Linux

Posted Feb 18, 2024 16:16 UTC (Sun) by HenrikH (subscriber, #31152) [Link] (42 responses)

I think the main reason is that there exists no use case for the wait for all except for the single case of replicating WIN32 behaviour. Quite telling is also that (per the presentation) they have not found a single game or application in Windows that actually uses the wait for all flag, so this is just to get 100% compliance with the API and I don't think that changing something as fundamental as the futex would fly with the kernel devs with this in mind.

Windows NT synchronization primitives for Linux

Posted Feb 18, 2024 16:36 UTC (Sun) by mb (subscriber, #50428) [Link] (41 responses)

>they have not found a single game or application in Windows that actually use the wait for all flag

Ok, well. The correct thing to do would then be to panic/abort wine if an application uses the flag.
Implementation of the flag on Linux should then be based on whether these aborts do actually occur.

Windows NT synchronization primitives for Linux

Posted Feb 20, 2024 19:20 UTC (Tue) by HenrikH (subscriber, #31152) [Link] (40 responses)

Upstream WINE does not allow that, this is Collabora trying to find a solution that is both performant and will be accepted upstream.

Windows NT synchronization primitives for Linux

Posted Feb 20, 2024 20:11 UTC (Tue) by mb (subscriber, #50428) [Link] (39 responses)

Wine doesn't allow ignoring non-existent problems and demands fixing non-existent problems instead?
Great! They must have solved all real problems then.

>trying to find a solution that is both performant

If this option is never executed by any Windows application, I have an easy and very performant implementation:

abort();

Why not first prove that a problem exists and _then_ fix it?

Windows NT synchronization primitives for Linux

Posted Feb 20, 2024 23:08 UTC (Tue) by pizza (subscriber, #46) [Link] (5 responses)

> If this option is never executed by any Windows application, I have an easy and very performant implementation:

Didn't Figura include benchmarks [1] showing significant (double-to-triple-digit percentages [2]) improvements in popular games when these patches were used?

All sorts of shenanigans have been added to Linux for far lower gains than this; why should this be treated any differently?

[1] See the first message in the "patch series" link in the article
[2] between 21% and 678%

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 1:03 UTC (Wed) by itsmycpu (subscriber, #139639) [Link] (4 responses)

> Didn't Figura include benchmarks ...

The wait-for-all function is not relevant to any benchmarks shown.

Similar benchmarks show that other solutions (which do not require kernel patches, at least not additional ones) have similar performance or even marginally better, for the benchmarked games.

And, those don't exhaust the possibilities of userspace solutions in general, and their shortcomings are not inherent to non-kernel solutions in general.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 1:17 UTC (Wed) by pizza (subscriber, #46) [Link] (3 responses)

> Similar benchmarks show that other solutions (which do not require kernel patches, at least not additional ones) have similar performance or even marginally better, for the benchmarked games.

Citation, please?

Because I remember this bit of drama going back many, many, many years, and the only reason a kernel-based proposal has been worked on at all was that the userspace-based options weren't anywhere near performant enough.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 2:00 UTC (Wed) by itsmycpu (subscriber, #139639) [Link] (2 responses)

https://www.youtube.com/watch?v=NjU4nyWyhU8
At about 14:06.

In my view the best slide of that presentation is the one mentioning "fast user-space RPC" as a "half-baked idea", called an "interesting idea".

(Such concepts exist for a long time in all kind of variations. Some people associate the concept with "actors" in a multi-threaded context, or "asynchronous message queues". I'm using something that could go by that label (though so far within a single process, between threads) for many years, and it works very well, using lock-free queues. As a low-level implementation, execution time in a loop is a small single-digit number of nanoseconds.)

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 3:04 UTC (Wed) by pizza (subscriber, #46) [Link] (1 responses)

> In my view the best slide of that presentation is the one mentioning "fast user-space RPC" as a "half-baked idea", called an "interesting idea".

Ah, so this mythical other approach is just that; no implementation much less any benchmarks showing it to be just as good or better than the kernel-based approach that exists _today_.

> I'm using something that could go by that label (though so far within a single process, between threads) for many years, and it works very well, using lock-free queues

...Um, you do realize that Wine needs to synchronize between multiple independent heavyweight *processes* ?

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 3:23 UTC (Wed) by itsmycpu (subscriber, #139639) [Link]

> Ah, so this mythical other approach is just that; no implementation much less any benchmarks showing it to be just as good or better than the kernel-based approach that exists _today_.

There are several existing approaches that have the same performance, just the existing ones also have shortcomings which that approach would not have.

> ...Um, you do realize that Wine needs to synchronize between multiple independent heavyweight *processes* ?

Yes, as indicated I do, however I wonder what you mean with a "heavyweight" process?

Shared memory is as fast between processes (I measured it), and can be read-protected or write-protected for specific processes.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 9:52 UTC (Wed) by farnz (subscriber, #17727) [Link] (32 responses)

Wine already has a solution to this problem; it's just excessively slow because it depends on relatively complex user-space emulation of a kernel primitive, where the wineserver process does a lot of the work to emulate the Windows kernel behaviours atop the host OS. The goal of this work is to allow you to completely bypass wineserver and rely on the kernel doing all of the synchronization work for the Windows WaitForMultipleObjects family of API calls.

However, if you bypass wineserver, it can't then correctly emulate the wait for all behaviour, since it no longer has all the information it needs (it needs to know about all waiters on a given object to correctly emulate Windows behaviour, since Windows has some fairness between waiters that you need to emulate to get it right). Telling wineserver before and after each wait destroys the performance gain from not doing wineserver IPC, so that's off the table. And Wine doesn't want to regress on API support; while commercial applications don't use wait for all, Wine also wants to be able to run all in-house applications perfectly, and I know that such applications have existed (since my employer 20 years ago had one, written for NT 4.0).

So, you're asking Wine to make a choice:

Regress on API support, knowing that there are probably applications that the Wine developers don't have access to that depend on the old behaviour, and thus that they're quite likely to be faced with bug reports about the regression.
Refuse to accept a speed-up change (not a correctness change) until it correctly handles the same API surface that the existing Wine implementation handles.

Wine is choosing to not knowingly regress purely for the sake of performance, and asking the people who are trying to push a performance-only change to avoid regression. How they do that is up to them; they may find a way for wait for all operations to be handled by wineserver collaborating with the /dev/ntsync mechanism, or add it to /dev/ntsync fully.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 11:35 UTC (Wed) by mb (subscriber, #50428) [Link] (22 responses)

Improving the performance of some old and crusty proprietary application that might exist somewhere in private is not really a good justification for getting a new locking mechanism merged into the kernel.
Just live with the old performance then or change it to use faster locking.
Or use your own special purpose wine dkms driver.

I don't see why the mainline Linux kernel should care.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 11:50 UTC (Wed) by farnz (subscriber, #17727) [Link] (21 responses)

And therein lies the problem: if you merge this driver without resolving the "wait for all" problem, it won't get used by Wine, and you're at risk of needing two mechanisms that do the same thing with different UAPI (this one for older Wine forks, and one with the tweaks needed to resolve the "wait for all" problem for newer Wine versions).

So, a solution has to be found to the "wait for all" problem; can you make it possible for Wine to transparently fall back to the old methods if an application uses "wait for all", for example? Is there a trivial way for userspace to "reclaim" all waits that are in-kernel using this mechanism, and fall back to wineserver IPC? Can you come up with a simple way to implement "wait for all" that's low performance (after all, the existing method is low performance anyway)? Is there an easy way to detect that an application uses "wait for all" waits before it uses a "wait for some" wait, and thus disable this optimization?

And these are questions that need answering before /dev/ntsync merges, not after.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 12:09 UTC (Wed) by mb (subscriber, #50428) [Link] (20 responses)

My proposal was to not merge ntsync at all.
I have not yet seen a good justification for having it in mainline.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 12:24 UTC (Wed) by farnz (subscriber, #17727) [Link] (19 responses)

The justification for merging is that it's a significant performance improvement for applications running under Wine. And the justification for not putting the full functionality into futex is that the wait-for-all operation is only needed under Wine, and even then, only when you have applications using edges of the Windows API. Most of the time, pulse and wait-for-any are all that you need.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 12:37 UTC (Wed) by mb (subscriber, #50428) [Link] (18 responses)

So what's wrong with dkms? Why does this single-purpose driver have to be in mainline?

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 13:01 UTC (Wed) by farnz (subscriber, #17727) [Link] (1 responses)

The kernel maintainers want everything in mainline, not in DKMS. DKMS is meant for backports from a later mainline kernel, or for cases where legal issues prevent something being merged into mainline (e.g. licensing conditions).

You'd have to ask Greg K-H and others why they don't want a stable API or ABI for modules so that things can stay outside mainline forever.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 13:13 UTC (Wed) by johill (subscriber, #25196) [Link]

Perhaps hyperbolic, but taken to the extreme that argument could also mean no drivers, filesystems, etc. really need to be in the kernel since you could always compile extra modules out of tree, put them into the initramfs, and be done with it.

But yeah, that's not how Linux works? There might be whole architectures with fewer users than this feature would have ...

It's also tremendously impractical with modules signing, having to have compilers everywhere, etc.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 14:55 UTC (Wed) by pizza (subscriber, #46) [Link]

> So what's wrong with dkms? Why does this single-purpose driver have to be in mainline?

Because there's a non-trivial amount of users that would see a substantial improvement? As I mentioned earlier in this thread, all sorts of insanity is merged into Linux every cycle that only yields a low-single-percentage improvement on narrow (and more often than not, proprietary/internal) use cases; this would seem to be a no-brainer by those same general principles.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 17:54 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (14 responses)

> So what's wrong with dkms? Why does this single-purpose driver have to be in mainline?

Why not move out drivers for all of the one-off devices out of the kernel, then? DKMS significantly complicates the OS updates and secure boot.

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 9:59 UTC (Thu) by farnz (subscriber, #17727) [Link] (13 responses)

And just looking at the kernel, there's a huge number of single-purpose drivers that most users are never going to need - 100G Ethernet, Infiniband, FireWire, UIO, parallel SCSI, PCMCIA, amateur radio, SLIP, FPGA drivers, SGI system drivers, analogue TV tuners, multipath block I/O, old PC-style gameports, Industrial I/O (iio), obscure HID devices, CXL, ATM, PATA and probably more. The kernel's full of stuff that most people don't need - either because it's legacy (analogue TV tuners, ATM, PATA), or special case (IIO, UIO, amateur radio), or because it's too expensive for most of us (100G Ethernet, Infiniband, FPGA drivers).

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 19:04 UTC (Thu) by mb (subscriber, #50428) [Link] (12 responses)

That's what-about-ism. ntsync has nothing to do with analogue TV tuners and all the other stuff you mentioned.

But thanks to all for explaining the technical details and the background behind ntsync.

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 21:35 UTC (Thu) by farnz (subscriber, #17727) [Link] (11 responses)

But all of those other things are "single use drivers" for things that I'm never likely to use; I have more chance of benefiting from ntsync than I do from all of those other things.

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 22:09 UTC (Thu) by mb (subscriber, #50428) [Link] (10 responses)

Exactly. What about all those other unrelated things over here?

As I said: Thanks for explaining some technical details. But now you keep on saying things that are off topic.

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 22:25 UTC (Thu) by farnz (subscriber, #17727) [Link] (9 responses)

Because you said "So what's wrong with dkms? Why does this single-purpose driver have to be in mainline?" - but that applies to every last one of those things that are also single-purpose drivers in mainline.

I'd like you to explain what criteria make a single-purpose driver sensible to include in mainline (rather than being something that should be in dkms), such that all the single-purpose drivers in mainline that I'm never likely to use meet those criteria, but this doesn't.

Windows NT synchronization primitives for Linux

Posted Feb 23, 2024 6:26 UTC (Fri) by mb (subscriber, #50428) [Link] (8 responses)

Yes, but this is not what I meant.

With "single-purpose" I did _not_ mean that a driver would only drive a single type of hardware. That is _obvious_. That's the case for most drivers.
And I also was not referring to the number of users of this driver.
There are even drivers in the mainline where probably only a single instance of the hardware exists. Does it make sense? No, it doesn't. But that's a completely different thing. Doesn't have anything to do with how we discuss ntsync.

I was referring to the number of _applications_ that would use the driver. I'm sorry that I didn't make that clearer to begin with. That was my fault.
ntsync only serves a single _application_.

It is like the DCO driver for OpenVPN. Which is a DKMS. And I don't see anything wrong with that. Except for maybe that it breaks module signing.

And yes, I do know that we have more features in the kernel that only serve a single application. You don't need to bring that up. ;-)

Windows NT synchronization primitives for Linux

Posted Feb 23, 2024 10:57 UTC (Fri) by farnz (subscriber, #17727) [Link] (2 responses)

The OpenVPN DCO driver is out of tree not because it has a single use, but because the kernel has a guideline of "don't break userspace applications", and the OpenVPN DCO driver currently wants the freedom to make changes that will deliberately break userspace applications. This is a very good reason to not enter mainline - you don't want to be under the strictures that Linus applies.

Windows NT synchronization primitives for Linux

Posted Mar 6, 2024 19:18 UTC (Wed) by florianfainelli (subscriber, #61952) [Link] (1 responses)

And yet this driver is now attempting to enter mainline: https://lore.kernel.org/netdev/20240106215740.14770-1-ant...

The technical reasons for attempting to upstream should be simple and clear: it facilitates the distribution of your module and it gives you some amount of maintenance "for free".

Windows NT synchronization primitives for Linux

Posted Mar 7, 2024 11:36 UTC (Thu) by farnz (subscriber, #17727) [Link]

Yep. Now that they're confident that they don't want to deliberately break user-space, they're going for mainline, just like everything else, because there's good reasons to be in mainline. It's just that when you know you want to break mainline's rules (unstable userspace interfaces like OpenVPN DCO, wrong licence like NVidia's proprietary driver), you need to stay out until you're ready to keep to mainline's rules.

Windows NT synchronization primitives for Linux

Posted Feb 23, 2024 11:24 UTC (Fri) by timon (guest, #152974) [Link] (4 responses)

But why then do you think that the ntsync driver only serves a single application? Do you count anything running under Wine as the same application?

From what I understand from the article and its comments, there might be numerous (legacy?) applications that use those NT features and that one may want to run on a Linux kernel with Wine, or there might be not a single application ever using this NT feature with Wine.

That Wine wants to be able to offer the Windows APIs as complete, conformant and performant as possible seems like a laudable goal and a win for FOSS -- and I don't see why one would want to hinder their efforts by relegating their work to out-of-tree modules via DKMS, as long as there are no legal or severe technical hurdles.

Windows NT synchronization primitives for Linux

Posted Feb 23, 2024 17:47 UTC (Fri) by mb (subscriber, #50428) [Link] (3 responses)

Please see the root of this thread.
This is going way off topic to what I originally commented on.

The original assumption was that there were *zero* use cases. (which was *not* claimed by me)
Which has been clarified as being wrong.

The corrected assumption is that there are proprietary apps hidden somewhere that wine does not want to break.
Whether that warrants a kernel driver or not is a completely different question. IMO it doesn't, but feel free to have a different opinion. I'm Ok with that.

Windows NT synchronization primitives for Linux

Posted Feb 23, 2024 18:35 UTC (Fri) by farnz (subscriber, #17727) [Link] (2 responses)

Note that the thing that warrants a kernel driver is the wait-for-any functionality of the call; the challenge is that if the kernel driver only implements wait-for-any, and neither implements a way to "return" those waits to userspace early nor a wait-for-all option, then Wine has a regression and won't use this driver for wait-for-any, even though it works for that purpose.

The exact implementation of this is up to the people doing the work; do you put a wait-for-all mode in the kernel, even though it's rarely needed, or do you come up with a way for Wine to reclaim all the wait-for-anys sitting in the kernel and push all wait-for-anys (including the reclaimed ones) through the slow path, not using the kernel driver?

Both work, since we only have evidence that there are applications that do not use wait-for-all, but would benefit from a faster wait-for-any than can be implemented in userspace alone.
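To make the difference between the two wait modes concrete, here is a toy, single-threaded model. It is only an illustration: `Semaphore`, `try_wait_any` and `try_wait_all` are made-up names, not the ntsync or Win32 API, and real waits block and must be atomic across threads.

```python
class Semaphore:
    """Toy stand-in for an NT-style semaphore: it can be taken while count > 0."""

    def __init__(self, count):
        self.count = count


def try_wait_any(objs):
    """Take the first signaled object; return its index, or None if none is signaled."""
    for i, obj in enumerate(objs):
        if obj.count > 0:
            obj.count -= 1
            return i
    return None


def try_wait_all(objs):
    """Take every object, but only if all are signaled at once; otherwise take nothing."""
    if all(obj.count > 0 for obj in objs):
        for obj in objs:
            obj.count -= 1
        return True
    return False


a, b = Semaphore(1), Semaphore(0)
print(try_wait_all([a, b]))  # False: b is not signaled, so a is left untouched
print(try_wait_any([a, b]))  # 0: a is signaled and gets consumed
```

The all-or-nothing step in `try_wait_all` is what makes mixing the modes awkward: whoever performs it has to see every waiter on the same objects, including the wait-for-any ones.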

Windows NT synchronization primitives for Linux

Posted Feb 29, 2024 18:55 UTC (Thu) by raven667 (subscriber, #5198) [Link] (1 responses)

> the challenge is that if the kernel driver only implements wait-for-any, and neither implements a way to "return" those waits to userspace early nor a wait-for-all option, then Wine has a regression and won't use this driver for wait-for-any, even though it works for that purpose

I'm catching up, but isn't the way that MS solves these kinds of compat issues in the Windows world tedious baked-in lists of specific applications which need the differential behavior? A WINE config/CLI flag could indicate when emulated wait-for-all is needed, selecting the existing slow implementation, with the default being the fast kernel implementation. The fast path would abort with a sensible error if the app uses features which aren't implemented, so the user/admin can restart the app with the right compat option saved. Maybe add an option to upload those compat lists somewhere so the config could be distributed, if the existence of the app isn't sensitive data.

I know having things work correctly automatically is more awesome but sometimes just doing the dumb brute force thing is more effective and efficient than working through all the details, coordination and judgement to automate something. Technical debt isn't always bad as long as you are picky when you create it.
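A minimal sketch of the compat-option idea, purely as an illustration. The names `COMPAT_NEEDS_WAIT_ALL`, `pick_backend`, `wait` and the example executable are hypothetical; nothing here is an existing Wine option or API.

```python
class UnsupportedWaitMode(Exception):
    """Raised when an app on the fast path asks for a wait the fast path lacks."""


# Hypothetical compat list: executables known to need the slow, emulated wait-for-all.
COMPAT_NEEDS_WAIT_ALL = {"legacy_inhouse.exe"}


def pick_backend(exe_name, force_slow=False):
    """Return "slow" for listed apps or when the user forces it, otherwise "fast"."""
    if force_slow or exe_name in COMPAT_NEEDS_WAIT_ALL:
        return "slow"
    return "fast"


def wait(backend, exe_name, wait_all):
    """Dispatch a wait; refuse wait-for-all on the fast path with a clear message."""
    if wait_all and backend == "fast":
        raise UnsupportedWaitMode(
            f"{exe_name} used wait-for-all; restart it with the slow-path compat option"
        )
    return f"{backend}:{'all' if wait_all else 'any'}"
```

The point of the abort is that the failure is loud and tells the user which option to set, rather than the app misbehaving silently.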

Windows NT synchronization primitives for Linux

Posted Feb 29, 2024 19:20 UTC (Thu) by farnz (subscriber, #17727) [Link]

That would work, too, but it then needs the people proposing a Linux-only accelerator for wait-for-any behaviour to create the compat options + list, so that people can use it.

This is not an unsolvable problem; it's "just" that Wine isn't happy with a gain in performance for some applications at the expense of others now failing to work at all, and people will have to do the work so that this becomes a gain in performance for some applications (those that only use wait-for-any) but not others.

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 11:36 UTC (Wed) by Wol (subscriber, #4433) [Link] (8 responses)

Can they add an option (default off) to enable this performance gain, in the knowledge that it may trigger a regression for in-house apps?

After all, if they are in-house, there's always the option (provided they haven't lost the source) to work around the regression ...

Cheers,
Wol

Windows NT synchronization primitives for Linux

Posted Feb 21, 2024 11:46 UTC (Wed) by farnz (subscriber, #17727) [Link] (7 responses)

That's not Wine's way of working (although a Wine fork like Proton may do just that). Wine would like to remove configuration options that depend on you knowing details of what APIs your application uses, not add more, since you don't have to set options on Windows to say "this is a perfectly well-behaved Win32 app that should function on Windows NT 3.1 to Windows 11 without modification, but that gets a performance boost if you set this modification"; instead, Windows itself detects applications that can't have the performance boost, and disables it appropriately.

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 6:54 UTC (Thu) by calumapplepie (guest, #143655) [Link] (6 responses)

Can we do what windows itself does, then?
Grep through the binary of whatever wine is executing for wait_for_all calls, then disable the optimization if any are spotted?

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 7:41 UTC (Thu) by Wol (subscriber, #4433) [Link] (5 responses)

Even better, if wait_for_all_calls is part of wine (I get the impression it might be), on the first call it could itself disable optimisation.

Cheers,
Wol

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 9:52 UTC (Thu) by farnz (subscriber, #17727) [Link] (4 responses)

You can't do it on the first call, because to implement it properly, you need to prevent previous wait-for-any calls from using the kernel mechanism; time-travelling backwards like that is a technical challenge. And Windows handles it by having its kernel equivalent of /dev/ntsync handle wait-for-all as well as wait-for-any; it may have fast paths in there for wait-for-any, but it's all in-kernel.

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 15:05 UTC (Thu) by Wol (subscriber, #4433) [Link] (3 responses)

And a wait-for-any call will go direct from the app bypassing wine?

Otherwise you could presumably ref-count wait-for-any, and if the count was non-zero you'd have to divert new calls and wait for existing calls to go to zero. Messy, assuming it's even possible ...

Cheers,
Wol

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 15:21 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

You also need some mechanism to tell all the existing calls in the kernel that wineserver is taking over wait calls now, so they need to return to userspace and be retried (inside Wine) by the IPC mechanism instead of the kernel mechanism, otherwise you can't correctly implement the corner-cases of Win32 when you have both wait-for-any and wait-for-all referencing the same objects.

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 17:04 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

That was the point of my ref-counting - wine takes over, but cannot proceed until the ref count drops to zero. Almost certainly not nice ...

Cheers,
Wol

Windows NT synchronization primitives for Linux

Posted Feb 22, 2024 21:36 UTC (Thu) by farnz (subscriber, #17727) [Link]

Wine needs to take over before the ref count drops to zero, else you've regressed. To match Windows kernel behaviour, you either need the Linux kernel to support wait-for-all somehow, or you need the Linux kernel to hand over the existing wait-for-any waits to Wine, so that Wine can correctly emulate the corner cases of wait-for-all.
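As a rough sketch of the second option (handing existing waits over), here is a toy model. `WaitRouter` and its fields are hypothetical names, and this is not how Wine or the proposed driver is implemented; it only shows the ordering requirement.

```python
class WaitRouter:
    """Toy model: waits use a fast path until the first wait-for-all shows up."""

    def __init__(self):
        self.mode = "fast"
        self.fast_pending = []  # wait-for-any requests parked in the (modelled) kernel
        self.slow_pending = []  # requests handled by the (modelled) arbitrator process

    def wait_any(self, waiter):
        """Queue a wait-for-any on whichever path is currently active."""
        target = self.fast_pending if self.mode == "fast" else self.slow_pending
        target.append((waiter, "any"))

    def wait_all(self, waiter):
        """Queue a wait-for-all, first reclaiming anything parked on the fast path."""
        if self.mode == "fast":
            self._hand_over()
        self.slow_pending.append((waiter, "all"))

    def _hand_over(self):
        # Pull back every parked wait-for-any before the wait-for-all is queued,
        # so a single arbitrator sees all waiters on the shared objects.
        self.slow_pending.extend(self.fast_pending)
        self.fast_pending.clear()
        self.mode = "slow"
```

In this model the reclaim is a list move; the hard part in practice is that the parked waits are threads blocked in the kernel, which need some way to be told to return and retry.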

Windows NT synchronization primitives for Linux

Posted Feb 23, 2024 12:41 UTC (Fri) by nicklecompte (subscriber, #151334) [Link] (1 responses)

> (Also - why isn't the "pulse" operation just FUTEX_WAKE without actually writing to the futex location? I'm certainly missing something....)

I am actually more familiar with Windows events than futexes so excuse my heresy :) One problem I see is auto-reset events, where pulse specifically wakes one *and only one* of the waiting threads. But I don't think futexes have an easy way of guaranteeing the wakes happen atomically in that fashion: it seems like this would wake *at least* one waiting thread, absent a very clever workaround or modifying the kernel directly. Of course I could be missing something. But I suspect Windows kernel being written to atomically wake up a single waiting thread is probably expensive to emulate on Linux.

Windows NT synchronization primitives for Linux

Posted Feb 24, 2024 23:37 UTC (Sat) by itsmycpu (subscriber, #139639) [Link]

> I am actually more familiar with Windows events than futexes so excuse my heresy :) One problem I see is auto-reset events, where pulse specifically wakes one *and only one* of the waiting threads. But I don't think futexes have an easy way of guaranteeing the wakes happen atomically in that fashion: it seems like this would wake *at least* one waiting thread, absent a very clever workaround or modifying the kernel directly. Of course I could be missing something. But I suspect Windows kernel being written to atomically wake up a single waiting thread is probably expensive to emulate on Linux.

Futexes do have an easy way to wake exactly one thread (of course, otherwise how would you efficiently implement a lock using them).

Windows NT synchronization primitives for Linux

Posted Feb 20, 2024 17:37 UTC (Tue) by quotemstr (subscriber, #45331) [Link]

Why not just add system calls to implement these synchronization primitives directly? They're probably useful for programs in general, not just Windows ones.

Windows NT synchronization primitives for Linux

Posted May 28, 2024 16:54 UTC (Tue) by riking (subscriber, #95706) [Link]

"Wait for all with timeout and alert" feels like the core operation here that is currently impossible to perform on Linux without an arbitrator, and when you don't know if a program is going to use that operation you need to send all operations on all of these objects over the costly arbitrator connection.

Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds