There was a giant super-long GitHub issue about improving Rust std mutexes a few years back. Prior to that issue Rust was using something much worse, pthread_mutex_t. It explained the main reason why the standard library could not just adopt parking_lot mutexes:
> One of the problems with replacing std's lock implementations by parking_lot is that parking_lot allocates memory for its global hash table. A Rust program can define its own custom allocator, and such a custom allocator will likely use the standard library's locks, creating a cyclic dependency problem where you can't allocate memory without locking, but you can't lock without first allocating the hash table.
> After some discussion, the consensus was to provide the locks as the 'thinnest possible wrapper' around the native lock APIs as long as they are still small, efficient, and const constructible. This means SRW locks on Windows, and futex-based locks on Linux, some BSDs, and Wasm.
> This means that on platforms like Linux and Windows, the operating system will be responsible for managing the waiting queues of the locks, such that any kernel improvements and features like debugging facilities in this area are directly available for Rust programs.
> This means SRW locks on Windows, and futex-based locks on Linux, some BSDs, and Wasm.
Note that the SRW locks are gone, except on very old Windows. So today the Rust built-in std mutex for your platform is almost certainly a futex in all but name. On Windows it isn't called a futex, and from some angles it's better, but the same core ideas apply: we only ask the OS to do work when we're contended, there is no limited OS resource involved (other than memory), and our uncontended operations are as fast as they could ever be.
SRW locks were problematic because they're bulkier than a futex (though mostly when contended), and they had a subtle bug that Microsoft was in no visible hurry to fix, with no clarity for a long time on when a fix would land. That isn't a huge plus sign for an important intrinsic used in all the high performance software on a $$$ commercial OS...
Mara's work (which you linked) is probably more work, and more important, but it's not actually the most recent large reworking of Rust's Mutex implementation.
> Prior to that issue Rust was using something much worse, pthread_mutex_t
Presumably you're referring to this description, from the Github Issue:
> > On most platforms, these structures are currently wrappers around their pthread equivalent, such as pthread_mutex_t. These types are not movable, however, forcing us to wrap them in a Box, resulting in an allocation and indirection for our lock types. This also gets in the way of a const constructor for these types, which makes static locks more complicated than necessary.
pthread mutexes are const-constructible in a literal sense, just not in the sense Rust requires. In C you can initialize a pthread_mutex_t with the PTHREAD_MUTEX_INITIALIZER macro instead of calling pthread_mutex_init, and at least with glibc there's no subsequent allocation when using the lock. But Rust can't do in-place construction[1] (i.e. placement new in C++ parlance), which is why Rust needs to be able to "move" the mutex. Moving a mutex is otherwise nonsensical once the mutex is visible--it's the address of the mutex that the locking is built around.
The only thing you gain by not using pthread_mutex_t is a possibly smaller lock--pthread_mutex_t has to contain additional members to support robust, recursive, and error-checking mutexes, though altogether that's only 2 or 3 additional words because some are union'd. I guess you also gain the ability to implement locking--including condition variables, barriers, etc.--however you want, though then you can't share those through FFI.
[1] At least not without unsafe and some extra work, which presumably is a non-starter for a library type where you want to keep it all transparent.
I.e., if I pthread_mutex_init(&some_addr, ...), I cannot then copy the bits from some_addr to some_other_addr and then pthread_mutex_lock(&some_other_addr). Hence not movable.
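For contrast, the Rust side of the requirement fits in a couple of lines: since the futex rewrite, `Mutex::new` is a `const fn` (stable since Rust 1.63), so a static mutex needs no initializer macro, no lazy initialization, and no Box. A minimal sketch:

```rust
use std::sync::Mutex;

// `Mutex::new` is a `const fn`, so a static mutex needs no
// PTHREAD_MUTEX_INITIALIZER-style macro and no heap allocation.
static COUNTER: Mutex<u64> = Mutex::new(0);

fn main() {
    *COUNTER.lock().unwrap() += 1;
    assert_eq!(*COUNTER.lock().unwrap(), 1);
}
```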
> Moving a mutex is otherwise non-sensical once the mutex is visible
What does "visible" mean here? In Rust, in any circumstance where a move is possible, there are no other references to that object, hence it is safe to move.
Well, technically, if you only have a mutable borrow (it's not your object) then you can't move out of it unless you replace the value somehow. If you have two such borrows, you can swap them. If the type implements Default, you can take from one borrow, which replaces the value with its default. And if you have some other way to make a new value, you can swap it in for the one you've got a reference to. But if you can't make a new one and don't have one to spare, then too bad: no moving the one behind your reference.
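The options described above correspond directly to helpers in `std::mem`; a small illustration:

```rust
use std::mem;

fn main() {
    let mut a = String::from("first");
    let mut b = String::from("second");

    // Two mutable borrows: you can swap the values behind them.
    mem::swap(&mut a, &mut b);
    assert_eq!(a, "second");

    // If the type implements Default, you can take the value,
    // leaving the default behind.
    let taken = mem::take(&mut a);
    assert_eq!(taken, "second");
    assert_eq!(a, "");

    // Or replace it with any other value you can construct.
    let old = mem::replace(&mut b, String::from("new"));
    assert_eq!(old, "first");
    assert_eq!(b, "new");
}
```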
> What does "visible" mean here? In Rust, in any circumstance where a move is possible, there are no other references to that object, hence it is safe to move.
And other than during construction or initialization (of the mutex object, containing object, or related state), how common is it in Rust to pass a mutex by value? If you can pass it by value then the mutex isn't (can't be) protecting anything. I'm struggling to think of a scenario where you'd want to do this, or at least why the inability to do so is a meaningful impediment (outside construction/initialization, that is). I understand Rust is big on pass-by-value, but when the need for a mutex enters the fray, it's because you're sharing or about to share, and thus passing by reference.
Depends on the program, and it can be a very useful tool.
Rust has Mutex::get_mut(&mut self) which allows getting the inner &mut T without locking. Having a &mut Mutex<T> implies you can get &mut T without locks. Being able to treat Mutex<T> like any other value means you can use the whole suite of Rust's ownership tools to pass the value through your program.
Perhaps you temporarily move the Mutex into a shared data structure so it can be used on multiple threads, then take it back out later in a serial part of your program to get mutable access without locks. It's a lot easier to move Mutex<T> around than &mut Mutex<T> if you're going to then share it and un-share it.
Also, it's impossible to construct a Mutex without moving at least once, as Rust doesn't guarantee return value optimization. All moves in Rust are treated as memcpys that 'destroy' the old value. There's no way to even write `let v = Mutex::new(value)` without a move, so it's also a hard functional requirement.
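Putting those pieces together, the share-then-unshare pattern from upthread can be sketched like this (a toy example; the counter increments stand in for real work):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Serial phase: plain ownership, no locking needed.
    let mut m = Mutex::new(0u64);
    *m.get_mut().unwrap() += 1; // &mut Mutex<T> gives &mut T, lock-free

    // Shared phase: move the Mutex into an Arc and hand it to threads.
    let shared = Arc::new(m);
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let s = Arc::clone(&shared);
            thread::spawn(move || *s.lock().unwrap() += 1)
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }

    // Back to serial: take the Mutex out of the Arc again and read the
    // value without any locking.
    let mut m = Arc::try_unwrap(shared).unwrap();
    assert_eq!(*m.get_mut().unwrap(), 5);
}
```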
I’m actually thinking of the sheer size of pthread mutexes. They are giant. The issue says that they wanted something small, efficient, and const constructible. Pthread mutexes are too large for most applications doing fine-grained locking.
On a typical modern 64-bit Linux, for example, they're 40 bytes, i.e. 320 bits. So yeah, unnecessarily bulky.
On my Linux system today, Rust's Mutex<Option<CompactString>> is smaller than the pthread mutex type, whether it's locked with the text "pthread_mutex_t is awful" inside it or unlocked with explicitly no text (not an empty string). Either way it takes only 30-odd bytes, while pthread_mutex_t alone is 40 bytes.
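These sizes are easy to check. Here is a sketch using std's `String` as a stand-in, since CompactString is a third-party crate; exact numbers vary by platform and Rust version:

```rust
use std::mem::size_of;
use std::sync::Mutex;

fn main() {
    // On 64-bit Linux, the futex-based Mutex adds only a u32 state word
    // plus a poison flag to the payload, so even a Mutex wrapping a
    // 24-byte Option<String> stays close to 32 bytes -- under
    // pthread_mutex_t's 40.
    println!("Mutex<()>            : {} bytes", size_of::<Mutex<()>>());
    println!("Mutex<Option<String>>: {} bytes", size_of::<Mutex<Option<String>>>());
    assert!(size_of::<Mutex<()>>() < 40);
}
```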
On Windows the discrepancy is even bigger, their OS native mutex type is this sprawling 80 byte monster while their Mutex<Option<CompactString>> is, I believe, slightly smaller than on Linux even though it has the same features.
> On Windows the discrepancy is even bigger, their OS native mutex type is this sprawling 80 byte monster
I guess you are referring to CRITICAL_SECTION? SRWLock, which is the size of a pointer, was introduced in Windows Vista. Since Windows 8 you can use WaitOnAddress to build even smaller locks.
Yes, CRITICAL_SECTION is far too large. Mara asked some years ago whether SRWLock could guarantee what Rust actually needs for this purpose (the documentation at that time refused to clarify whether we can move it for example) and that's why her change was to SRWLock from CRITICAL_SECTION.
And yes, the newer change uses WaitOnAddress to provide the same API as the futex from the various Unix platforms. Raymond Chen's description of the differences is perhaps rather exaggerated, which isn't to say there's no difference, but it's well within what's practical for an adaptor layer.
Also, although the SRWLock itself is the same size as a pointer (thus 64 bits on a modern computer, compared to a 32-bit futex), there's a reason it's that size: it basically is a pointer, and in some cases it points at a data structure which we should reasonably count as part of the overhead.
The pointee is a large aligned object, which means the bottom bits of the pointer would always be zero, so SRWLock uses those for flag bits. It's a nice trick, but it means the size isn't really comparable to a futex's, though it's certainly cheaper than CRITICAL_SECTION.
I dunno, it seems to me that the standard mutex performs very well in all scenarios, and doesn't have any significant downsides, except for the hogging case, which could be fixed by assigning the non-hogging threads a higher priority.
Whereas parking_lot has a ton of problematic scenarios: after its spin phase times out, it yields the thread to the OS, which then has no idea it should wake the thread once the resource is unblocked.
It could be even argued that preventing starvation is outside the design scope of the Mutex as a construct, as it only guarantees mutual exclusion and that the highest priority waiting thread should get access to it.
The simplest solution is for `std::mutex` to provide a simple, efficient mutex which is a good choice for almost any program. And it does. Niche programs can pull in a crate.
I doubt `parking_lot` would have been broadly used—maybe wouldn't even have been written—if `std` had this implementation from the start.
What specifically in this comparison made you think that `parking_lot` is broadly needed? They had to work pretty hard to find a scenario in which `parking_lot` did much better in any performance metrics. And as I alluded to in another comment, `parking_lot::Mutex<InnerFoo>` doesn't have a size advantage over `std::mutex::Mutex<InnerFoo>` when `InnerFoo` has word alignment. That's the most common situation, I think.
If I were to make a wishlist of features for `std::mutex` to just have, it wouldn't be anything `parking_lot` offers. It'd be stuff like the lock contention monitoring that the (C++) `absl::Mutex` has. (And at least on some platforms you can do a decent job of monitoring this with `std::mutex` by monitoring the underlying futex activity.)
This. The standard library has a responsibility to provide an implementation that performs well enough in every possible use case, while trying to be generally as fast as possible.
My takeaway is that the documentation should make more explicit recommendations depending on the situation -- i.e., people writing custom allocators should use std mutexes; most libraries and applications that are OK with allocation should use parking_lot mutexes; embedded code or libraries that don't want to depend on allocation should use std mutexes. Or maybe parking_lot is almost useless unless you're doing very fine-grained locking. Something like that.
Author of the original WTF::ParkingLot here (what rust’s parking_lot is based on).
I’m surprised that this only compared to std on one platform (Linux).
The main benefit of parking lot is that it makes locks very small, which then encourages the use of fine grained locking. For example, in JavaScriptCore (ParkingLot’s first customer), we stuff a 2-bit lock into every object header - so if there is ever a need to do some locking for internal VM reasons on any object we can do that without increasing the size of the object
> The main benefit of parking lot is that it makes locks very small, which then encourages the use of fine grained locking. For example, in JavaScriptCore (ParkingLot’s first customer), we stuff a 2-bit lock into every object header - so if there is ever a need to do some locking for internal VM reasons on any object we can do that without increasing the size of the object
IMHO that's a very cool feature which is essentially wasted when using it as a `Mutex<InnerBlah>` because the mutex's size will get rounded up to the alignment of `InnerBlah`. And even when not doing that, afaict `parking_lot` doesn't expose a way to use the remaining six bits in `parking_lot::RawMutex`. I think the new std mutexes made the right choice to use a different design.
> I’m surprised that this only compared to std on one platform (Linux).
Can't speak for the author, but I suspect a lot of people really only care about performance under Linux. I write software that I often develop from a Mac but almost entirely deploy on Linux. (But speaking of Macs: std::mutex doesn't yet use futexes on macOS. Might happen soon. https://github.com/rust-lang/rust/pull/122408)
Hypothetically Rust could make `Mutex<InnerBlah>` work with just two bits in the same way it makes `Option<&T>` the same size as `&T`. Annotate `InnerBlah` with the information about which bits are available and let `Mutex` use them.
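The `Option<&T>` behavior referred to here is Rust's niche optimization, which is directly observable:

```rust
use std::mem::size_of;

fn main() {
    // &u64 can never be null, so Option<&u64> uses the null bit pattern
    // for None and stays pointer-sized: no discriminant byte is added.
    assert_eq!(size_of::<Option<&u64>>(), size_of::<&u64>());

    // A type with no spare bit patterns, like u64 itself, pays for the
    // discriminant plus alignment padding (16 bytes on a 64-bit target).
    assert_eq!(size_of::<Option<u64>>(), 16);
}
```

Extending this so user types could advertise "these two bits are free for a lock" is exactly the kind of annotation that doesn't exist today.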
There was talk of Rust allowing stride != alignment. [1] I think this would mean if say `InnerBlah` has size 15 and alignment 8, `parking_lot::Mutex<InnerBlah>` can be size 16 rather than the current 24. Same would be true for an `OuterBlah` the mutex is one field of. But I don't think it'll happen.
In principle, Rust could create something like std::num::NonZero and its corresponding sealed trait ZeroablePrimitive to mark that two bits are unused. But that doesn't exist yet as far as I know.
There are also currently the unstable rustc_layout_scalar_valid_range_start and rustc_layout_scalar_valid_range_end attributes (which are used in the definition of NonNull, etc.) which could be used for some bit patterns.
I do the same in my toy JVM (to implement the reentrant mutex+condition variable that every Java object has), except I've got a rare deadlock somewhere because, as it turns out, writing complicated low level concurrency primitives is kinda hard :p
How can a parking_lot lock be less than 1 byte? Does this use unsafe?
Rust in general doesn't support bit-level objects unless you cast things to [u8] and do some shifts and masking manually (that is, like C), which of course is wildly unsafe for data structures with safety invariants
I don’t know the details of the Rust port but I don’t imagine the part that involves the two bits to require unsafe, other than in the ways that any locking algorithm dances with unsafety in Rust (ownership relies on locking algorithms being correct)
This is very similar to how Java's object monitors are implemented. In OpenJDK, the markWord uses two bits to describe the state of an Object's monitor (see markWord.hpp:55). On contention, the monitor is said to become inflated, which basically means revving up a heavier lock and knowing how to find it.
I'm a bit disappointed though, I assumed that you had a way of only using 2 bits of an object's memory somehow, but it seems like the lock takes a full byte?
It’s just that if you use the WTF::Lock class, then you get a full byte, simply because the smallest possible size of a class instance in C++ is one byte.
But there’s a template mixin thing you can use to get it down to two bits (you tell the mixin which byte to steal the two bits from and which two bits).
I suspect the same situation holds in the Rust port.
I am very familiar with how Java does locks. This is different. Look at the ParkingLot/parking_lot API. It lets you do much more than just locks, and there’s no direct equivalent of what Java VMs call the inflated or fat lock. The closest thing is the on demand created queue keyed by address.
The idea is that six bits in the byte are free to use as you wish. Of course you'll need to implement operations on those six bits as CAS loops (which nonetheless allow for any arbitrary RMW operation) to avoid interfering with the mutex state.
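A sketch of what sharing a byte between a 2-bit lock and 6 bits of user data would look like. This is purely illustrative: as discussed elsewhere in the thread, `parking_lot::RawMutex` does not actually expose its spare bits, so the layout here is hypothetical.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

// Hypothetical layout: low 2 bits for lock state, high 6 bits for data.
const LOCK_MASK: u8 = 0b0000_0011;
const DATA_SHIFT: u32 = 2;

// Read-modify-write the 6 data bits without disturbing the lock bits,
// using a CAS loop so a concurrent locker can't be clobbered.
fn set_data(word: &AtomicU8, data: u8) {
    assert!(data < 64);
    word.fetch_update(Ordering::AcqRel, Ordering::Acquire, |old| {
        Some((old & LOCK_MASK) | (data << DATA_SHIFT))
    })
    .unwrap(); // the closure never returns None, so this cannot fail
}

fn get_data(word: &AtomicU8) -> u8 {
    word.load(Ordering::Acquire) >> DATA_SHIFT
}

fn main() {
    let word = AtomicU8::new(0b01); // pretend a lock bit is currently set
    set_data(&word, 42);
    assert_eq!(get_data(&word), 42);
    // The lock bits survive the data update untouched.
    assert_eq!(word.load(Ordering::Relaxed) & LOCK_MASK, 0b01);
}
```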
Unhelpful response. This cuongle.dev article does not answer nextaccountic's question, and neither do the webkit.org articles that describe the parking lot concept but not this Rust implementation. The correct answer appears to be that it's impossible: `parking_lot::RawMutex` has private storage that owns the entire byte and does not provide any accessor for the unused six bits.
(unless there's somewhere else in the crate that provides an accessor for this but that'd be a weird interface)
(or you just use transmute to "know" that it's one byte and which bits within the byte it actually cares about, but really don't do that)
(slightly more realistically, you could probably use the `parking_lot_core::park` portion of the implementation and build your own equivalent of `parking_lot::RawMutex` on top of it)
(or you send the `parking_lot` folks a PR to extend `parking_lot::RawMutex` with interface you want; it is open source after all)
> and neither do the webkit.org articles that describe the parking lot concept but not this Rust implementation
The WebKit post explicitly talks about how you just need two bits to describe the lock state.
> The correct answer appears to be that it's impossible: `parking_lot::RawMutex` has private storage that owns the entire byte and does not provide any accessor for the unused six bits.
Not impossible. One way to do this is to just use parking_lot directly.
In WebKit there’s a template mixin that lets you steal two bits for locking however you like. JavaScriptCore uses this to steal two bits from the indexing type byte (if I remember right)
> The WebKit post explicitly talks about how you just need two bits to describe the lock state.
It describes the algorithm but not how a caller of the Rust `parking_lot` crate could take advantage of this.
> Not impossible. One way to do this is to just use parking_lot directly.
By "just use parking_lot directly", I think you're talking about reimplementing the parking lot algorithm or using the C++ `WTF::ParkingLot` implementation? But not actually using the existing Rust crate called `parking_lot` described in the cuongle.dev article? That's confusingly put, and nextaccountic's question is certainly Rust-specific and likely expecting an answer relating to this particular crate. At the least, "does this use unsafe" would certainly be true with an implementation from scratch or when using FFI into C++.
I hear that this algorithm and the C++ implementation are your invention, and all due respect for that. I'm also hearing that you are not familiar with this Rust implementation. It does not offer the main benefit you're describing. `parking_lot::RawMutex` is a one-byte type; it's true that six bits within it are unused, but callers cannot take advantage of that. Worse, `parking_lot::Mutex<InnerFoo>` in practice is often a full word larger than `InnerFoo` due to alignment padding. As such, there's little benefit over a simpler futex-based approach.
> It describes the algorithm but not how a caller of the Rust `parking_lot` crate could take advantage of this.
Read the WebKit post.
> By "just use parking_lot directly", I think you're talking about reimplementing the parking lot algorithm or using the C++ `WTF::ParkingLot` implementation? But not actually using the existing Rust crate called `parking_lot` described in the cuongle.dev article?
No. nextaccountic's comment and the cuongle.dev article are both talking about Rust. The Rust `parking_lot` implementation only uses two bits within a byte, but it doesn't provide a way for anything else to use the remaining six.
pizlonator's comments mention both the (C++) WTF::ParkingLot and the Rust `parking_lot`, and they don't answer nextaccountic's question about the latter.
> nextaccountic is confused.
nextaccountic asked how this idea could be applied to this Rust implementation. That's a perfectly reasonable question. pizlonator didn't know the answer. That's perfectly reasonable too. Conscat suggested the article would be helpful; that was wrong.
This is one of the biggest design flaws in Rust's std, in my opinion.
Poisoning mutexes can have its use, but it's very rare in practice. Usually it's a huge misfeature that only introduces problems. More often than not panicking in a critical section is fine[1], but on the other hand poisoning a Mutex is a very convenient avenue for a denial-of-service attack, since a poisoned Mutex will just completely brick a given critical section.
I'm not saying such a project doesn't exist, but I don't think I've ever seen a project which does anything sensible with Mutex's `PoisonError` besides ignoring it. It's always either an `unwrap` (and we know how well that can go [2]), or the sensible thing, this ridiculous song-and-dance:
    let guard = match mutex.lock() {
        Ok(guard) => guard,
        Err(poisoned) => poisoned.into_inner(),
    };
Suffice to say, it's a pain.
So in a lot of projects when I need a mutex I just add `parking_lot`, because its performance is stellar, and it doesn't have the poisoning insanity to deal with.
[1] -- obviously it depends on a case-by-case basis, but if you're using such a low level primitive you should know what you're doing
> It's always either an `unwrap` (and we know how well that can go [2])
If a mutex has been poisoned, then something must have already panicked, likely in some other thread, so you're already in trouble at that point. It's fine to panic in a critical section if something's horribly wrong, the problem comes with blindly continuing after a panic in other threads that operate on the same data. In general, you're unlikely to know what that panic was, so you have no clue if the shared data might be incompletely modified or otherwise logically corrupted.
In general, unless I were being careful to maintain fault boundaries between threads or tasks (the archetypical example being an HTTP server handling independent requests), I'd want a panic in one thread to cascade into stopping the program as soon as possible. I wouldn't want to swallow it up and keep using the same data like nothing's wrong.
> so you have no clue if the shared data might be incompletely modified or otherwise logically corrupted.
One could make a panic-wrapper type if they cared; it's what the stdlib Mutex currently does:
MutexGuard checks whether it's panicking during drop using `std::thread::panicking()`, and if so, sets a bool on the Mutex. The next acquirer checks that bool and knows the state may be corrupted. No need to bake this into the Mutex itself.
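That mechanism is easy to observe end to end (a minimal demonstration):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let m = Arc::new(Mutex::new(vec![1, 2, 3]));

    // A thread panics while holding the lock; MutexGuard's Drop notices
    // (via std::thread::panicking()) and sets the poison flag.
    let m2 = Arc::clone(&m);
    let _ = thread::spawn(move || {
        let _guard = m2.lock().unwrap();
        panic!("boom");
    })
    .join();

    assert!(m.is_poisoned());

    // The next acquirer gets Err, but can still reach the (possibly
    // inconsistent) data via into_inner on the PoisonError.
    let data = m.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    assert_eq!(*data, vec![1, 2, 3]);
}
```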
My point is that "blindly continuing" is not a great default if you "don't care". If you continue, then you first have to be aware that a multithreaded program can and will continue after a panic in the first place (most people don't think about panics at all), and you also have to know the state of the data after every possible panic, if any. Overall, you have to be quite careful if you want to continue properly, without risking downstream bugs.
The design with a verbose ".lock().unwrap()" and no easy opt-out is unfortunate, but conceptually, I see poisoning as a perfectly acceptable default for people who don't spend all their time musing over panics and their possible causes and effects.
> If a mutex has been poisoned, then something must have already panicked, likely in some other thread, so you're already in trouble at that point.
I find that in the majority of cases you're essentially dealing with one of two cases:
1) Your critical sections are tiny and you know you can't panic, in which case dealing with poisoning is just useless busywork.
2) You use a Mutex to get around Rust's "shared xor mutable" requirement. That is, you just want to temporarily grab a mutable reference and modify an object, but you don't have any particular atomicity requirements. In this case panicking is no different from panicking on a single thread while modifying an object through a plain old `&mut`. Here too, dealing with poisoning is just useless busywork.
> I'd want a panic in one thread to cascade into stopping the program as soon as possible.
Sure, but you don't need mutex poisoning for this.
> 1) Your critical sections are tiny and you know you can't panic, in which case dealing with poisoning is just useless busywork.
Many people underestimate how many things can panic in corner cases. I've found quite a few unsafe functions in various crates that were unsound due to integer-overflow panics that the author hadn't noticed. Knowing for a fact that your operation cannot panic is the exception rather than the rule, and while it's unfortunate that the std Mutex doesn't accommodate non-poisoning mutexes, I see poisoning as a reasonable default.
(If Mutex::lock() unwrapped the error automatically, then very few people would even think about the "useless busywork" of the poison bit. For a similar example, the future types generated for async functions contain panic statements in case they are polled after completion, and no one complains about those.)
> 2) You use a Mutex to get around Rust's "shared xor mutable" requirement. That is, you just want to temporarily grab a mutable reference and modify an object, but you don't have any particular atomicity requirements.
Then I'd stick to a RefCell. Unless it's a static variable in a single-threaded program, in which case I usually just write some short wrapper functions if I find the manipulation too tedious.
We're currently working on separating poison from mutexes, such that the default mutexes won't have poisoning (no more `.lock().unwrap()`), and if you want poisoning you can use something like `Mutex<Poison<T>>`.
Speaking only for myself (though several other people have expressed the same sentiment), I wish we could get rid of unwinding. That would be a massive challenge to do while preserving capabilities people care about, such as the ability to handle panics in http request handlers without exiting. I think it would be possible, though.
That sounds really interesting, whether it is done in Rust, some Rust 2.0, or a successor or experimental language.
I do not know whether it is possible, though. If one does not unwind, what should actually happen instead? How would for instance partial computations, and resources on the stack, be handled? Some partial or constrained unwinding? I have not given it a lot of thought, though.
How do languages without exceptions handle it? How does C handle it? Error codes all the way? Maybe something with arenas or regions?
I do not have a good grasp on panics in Rust, but panics in Rust being able to either unwind or abort dependent on configuration, seems complex, and that design happened for historical reasons, from what I have read elsewhere.
Vague sketch: imagine if we had scoped panic hooks, unhooked via RAII. So, for use cases that today use unwinding for cleanup (e.g. "switch the terminal back out of curses mode"), you do that cleanup in a panic hook instead.
The hard use case to handle without unwinding is an HTTP server that wants to allow for panics in a request handler without panicking the entire process. Unwinding is a janky way to handle that, and creates issues in code that doesn't expect unwinding (e.g. half-modified states), and poisoning in particular seems likely to cascade and bring down other parts of the process anyway if some needed resource gets poisoned. But we need a reasonable alternative to propose for that use case, in order to seriously evaluate eliminating unwinding.
I am not sure that I understand what scoped panic hooks would or might look like. Are they maybe similar to something like try-catch-finally in Java? Would the language force the programmer to include them in certain cases somehow?
If a request handler for example has at some point in time 7 nested calls, in call no. 2 and call no. 6 have resources and partial computation that needs clean-up somehow and somewhere, and call no. 7 panics, I wonder what the code would look like in the different calls, and what would happen and when, and what the compiler would require, and what other relevant code would look like.
For the simple case, suppose that you're writing a TUI application that takes over the terminal. When it exits, even by panic, you want to clean up the terminal state so the user doesn't have to blindly type "reset".
Today, people sometimes do that by using `panic = "unwind"`, and writing a `catch_unwind` around their program, and using that to essentially implement a "finally" block. Or, they do it by having an RAII type that cleans up on `Drop`, and then they count on unwinding to ensure their `Drop` gets called even on panic. (That leaves aside the issue that something called from `Drop` is not allowed to fail or panic itself.) The question is, how would you do that without unwinding?
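The pattern being described today looks roughly like this (the terminal-restore call is a stand-in for real cleanup):

```rust
use std::panic;

// RAII guard: Drop runs during unwinding, so cleanup happens even on
// panic -- as long as panics unwind rather than abort.
struct TerminalGuard;

impl Drop for TerminalGuard {
    fn drop(&mut self) {
        // Stand-in for "switch the terminal back out of curses mode".
        println!("terminal restored");
    }
}

fn main() {
    let result = panic::catch_unwind(|| {
        let _guard = TerminalGuard;
        panic!("application bug"); // unwinding still runs _guard's Drop
    });
    assert!(result.is_err());
    // With panic = "abort" instead, neither the Drop nor catch_unwind
    // would run -- which is the gap an alternative mechanism would need
    // to fill.
}
```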
We have a panic hook mechanism, where on panic the standard library will call a user-supplied function. However, there is only one panic hook; if you set it, it replaces the old hook. If you have only one cleanup to do, that works fine. For more than one, you can follow the semantic of having your panic hook call the previous hook, but that does not allow unregistering hooks out of order; it only really works if you register a panic hook once for the whole program and never unregister it (e.g. "here's the hook for cleaning up tracing", "here's the hook for cleaning up the terminal state").
Suppose, instead, we had a mechanism that allowed registering arbitrary panic hooks, and unregistering them when no longer needed, in any order. Then, we could do RAII-style resource handling: you could have a `CursesTerminal` type, which is responsible for cleaning up the terminal, and it cleans up the terminal on `Drop` and on panic. To do the latter, it would register a panic hook, and deregister that hook on `Drop`.
With such a mechanism, panic hooks could replace anything that uses `catch_unwind` to do cleanup before going on to exit the program. That wouldn't fully solve the problem of doing cleanup and then swallowing the panic and continuing, but it'd be a useful component for that.
I have not given it much thought, but it would primarily be for the subset of Rust programs that do not need zero-cost abstractions as much, right? Since, even in the case of no panics, one would be paying at runtime for registering panic hooks, if I understand correctly.
I've used recovering from poisoned state in impl Drop in quite a few places.
In my case it's usually waiting for the GPU to finish some asynchronous work that's been spun up by CPU threads that may have panicked while holding the lock. This is necessary to avoid freeing resources that the GPU may still be using.
I usually guard this with `if !std::thread::panicking() { ... }`, so I don't end up waiting (possibly forever) if I'm already cleaning up after a panic.
Hi, I don't have public examples to share but I can give an explanation of a simple scenario.
I have a container of resources, e.g. textures. When the GPU wants to use them, CPU will lease them until a point of time in the future denoted by a value (u64) of a GPU timeline semaphore. The handle and value of the semaphore is added to a list guarded by a mutex. Then GPU work is kicked off and the GPU will increment semaphore to that value when done.
In the Drop implementation of the container, we need to wait until all semaphores reach their respective value before freeing resources, and do so even if some thread panicked while holding the lock guarding the list. This is where I use .unwrap_or_else to get the list from the poison value.
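In outline, that Drop looks something like this. All the names here are hypothetical, and a plain integer stands in for the GPU timeline-semaphore wait:

```rust
use std::sync::Mutex;

// Hypothetical lease record: the timeline-semaphore value that must be
// reached before the associated resource may be freed.
struct ResourcePool {
    pending: Mutex<Vec<u64>>,
}

impl Drop for ResourcePool {
    fn drop(&mut self) {
        // Recover the list even if a thread panicked while holding the
        // lock: a poisoned list is still the only record of in-flight
        // GPU work, so we must not skip the waits.
        let pending = self
            .pending
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        for value in pending.iter() {
            // Stand-in for waiting on the GPU timeline semaphore to
            // reach `value` before freeing the resource.
            let _ = value;
        }
    }
}

fn main() {
    let pool = ResourcePool {
        pending: Mutex::new(vec![1, 2]),
    };
    drop(pool); // Drop drains the lease list, poisoned or not
}
```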
It's not infeasible to try to catch any errors and propagate them when the lock is grabbed. But this is mostly for OOM and asserts that are not expected to fire. The ergonomics would be worse if the "lease" function would be fallible.
This said, I would not object to poisoning being made optional.
Oh, I don't think recovery from poison is why poisoning is good. The reason poisoning is good is that at the moment you've acquired a lock on a mutex, you should be able to assume that the invariants guarded by the mutex are upheld (and panic if not).
Mutex doesn't promise to uphold any more invariants than `&mut T` does. If the state can be corrupted by a panic while holding `&mut T`, I don't think there's any good reason to expect that obtaining it through `MutexGuard` should make any difference.
Panic propagation is typically handled much better at thread `join()` boundaries.
A panic in single-threaded, non-parallel code will either terminate the program or be recovered cleanly, so the potential for side effects to be silently observed in a way that breaks invariants is unique to Mutex<>. This is the reason for mutex poisoning.
I fail to see that there is any material difference. Whether you catch-unwind within a single thread or in a separate thread such that the panic can be resumed on join makes zero difference.
Heck, you can have Drop impls observing the state while unwinding.
A true panic-safe data structure requires serious thought, and mutex poisoning does nothing here - it is neither necessary nor sufficient.
This is a false dichotomy. Not every technique needs to work in all cases in order to be useful.
This seems analogous to arguing that because seat belts don't save the lives of all people involved in car crashes, and they're kind of annoying, then they shouldn't be factory-standard.
This is a case of a feature that is actively harmful for the things it tries to prevent, because it increases the risk in practice of panics "spreading" throughout a system, even after the programmer thought she had finished handling it, and because it gives a false impression of what kind of guarantee you actually have.
I understand what you mean, but what you're saying has not been true for me in practice. Mutexes absolutely are used to uphold invariants in a way that plain &mut T much less often is.
There's something to be said here about what I've sometimes called the cancellation blast radius. The issues with cancellation happen when the data corruption/invariant violation is externally visible (if the corrupt data is torn down, who cares.) Mutexes make data corruption externally visible very often.
In projects I've worked on, this just hasn't been the case. Mutexes, especially in Rust, can grant you a `&mut T` when what you have is `&Mutex<T>`, and that's it - failing to uphold invariants in the API surface of `T` is a bug whether or not it lives inside a mutex.
Lots of data structures need to care about panic-safety. Inserting a node in a tree must leave the tree in a valid state if allocating memory for the new node fails, for example. All of that is completely orthogonal to whether or not the data structure is also observable from multiple threads behind a mutex, and I would argue especially in the case of a mutex, whose purpose is to make an object usable from multiple threads as if they had ownership.
Acknowledging that panic safety is a real issue with data structures that mutex poisoning does not solve, I don't think we're going to agree on anything else here, unfortunately. We probably have entirely different experiences writing software -- mutex poisoning is very valuable in higher-level code.
That’s not surprising to me, but it’s not much of an argument for changing the default to be less safe. Most people want poisoning to propagate fatal errors and avoid reading corrupted data, not to recover from panics.
Edit: isn’t that an argument not to change the default? If people were recovering from poison a lot and that was painful, that’s one thing. But if people aren’t doing that, why is this a problem?
If the issue is that everyone has to write an extra unwrap, then a good step would be to make lock panic automatically in the 2027 edition, and add a lock_or_poison method for the current behavior. But I think removing poisoning altogether from the default mutex, such that it silently unlocks on panic, would be very bad. The silent-unlock behavior is terrible with async cancellations and terrible with panics.
You seem to keep making the implicit assumption that because people are using `unwrap()`, they must not care about the poisoning behavior. I really don't understand where this assumption is coming from. I explicitly want to propagate panics from contexts that hold locks to contexts that take locks. The way to write that is `lock().unwrap()`. I get that some people might write `lock().unwrap()` not because they care about propagating panics, but because they don't care either way and it's easy. But why are you assuming that that's most people?
I'm suggesting that the balance of pain to benefit is not working out enough to inflict it on everyone by default. I'm not suggesting it has no value, just not enough to be worth it.
Is that not because there is not much to do, and therefore people use .unwrap() — because crashing is actually quite sane?
Correctness trumps ergonomics, and the default should definitely be poisoning/panicking unless handled. There could definitely be an optional poison-eating mutex, but I argue the current Mutex does the right thing.
To the contrary, the projects I've been part of have had no end of issues related to being cancelled in the middle of a critical section [1]. I consider poisoning to be table stakes for a mutex.
Well, I mean, if you've made the unfortunate decision to hold a Mutex across await points...?
This is completely banned in all of my projects. I have a 100k+ LOC project running in production, that is heavily async and with pervasive usage of threads and mutexes, and I never had a problem, precisely because I never hold a mutex across an await point. Hell, I don't even use async mutexes - I just use normal synchronous parking lot mutexes (since I find the async ones somewhat pointless). I just never hold them across await points.
As I said in the article, we avoid Tokio mutexes entirely for the exact reason that being cancelled in the middle of a critical section is bad. In Rust, there are two sources of cancellations in the middle of a critical section: async cancellations and panics. Ergo, panicking in the middle of a critical section is also bad, and mutexes ought to detect that and mark their internal state as corrupted as a result.
> Ergo, panicking in the middle of a critical section is also bad, and mutexes ought to detect that and mark their internal state as corrupted as a result.
I fundamentally disagree with this. Panicking in the middle of an operation that is supposed to be atomic is bad. If it's not supposed to be atomic then it's totally fine, just as panicking when you hold a plain old `&mut` is fine. Not every use of a `Mutex` is protecting an atomic operation that depends on not being cancelled for its correctness, and even for those situations where you do it's a better idea to prove that a panic cannot happen (if possible) or gracefully handle the panic.
I really don't see a point of mutex poisoning in most cases. You can either safely panic while you're holding a mutex (because your code doesn't care about atomicity), or you simply write your code in such a way that it's still correct even if you panic (e.g. if you temporarily `.take()` something in your critical section then you write a wrapper which restores it on `Drop` in case of a panic). The only thing poisoning achieves is to accidentally give you denial-of-service CVEs, and is actively harmful when it comes to producing reliable software.
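The `.take()`-plus-restore pattern mentioned above could be sketched like this (the `Restore` guard and `with_taken` helper are illustrative, not a standard API):

```rust
use std::sync::Mutex;

// Guard that puts a taken value back on scope exit, including during unwind.
struct Restore<'a, T> {
    slot: &'a mut Option<T>,
    value: Option<T>,
}

impl<'a, T> Drop for Restore<'a, T> {
    fn drop(&mut self) {
        // Runs on both normal exit and panic unwinding: restore the value.
        *self.slot = self.value.take();
    }
}

fn with_taken(data: &Mutex<Option<String>>, f: impl FnOnce(&mut String)) {
    let mut guard = data.lock().unwrap_or_else(|p| p.into_inner());
    let value = guard.take();
    let mut restore = Restore { slot: &mut *guard, value };
    if let Some(v) = restore.value.as_mut() {
        f(v); // even if f panics, Restore::drop puts the value back
    }
}
```

With this, a panic inside the critical section leaves the guarded `Option` populated, so the state stays valid whether or not the mutex reports poisoning.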
I've written many production Rust services and programs over the years, both sync and async, and in my experience, by far the most common use of mutexes is to temporarily violate invariants that are otherwise upheld while the mutex is unlocked (which I think is what you mean by "atomic"). In some cases invariants can be restored, but in many cases they simply cannot.
Panicking while in the middle of a non-mutex-related &mut T is theoretically bad as well, but in my experience, &mut T temporary invariant violations don't happen nearly as often as corruption of mutex-guarded data.
I disagree, lock poisoning is a good way of improving correctness of concurrent code in case of fatal errors. As demonstrated by the benchmarks in this article, it's not very expensive for typical use cases.
In 99% of the cases where one thread has panic'd while holding a lock, you want to panic the thread that attempts to grab the lock. The contents of anything inside the lock is very much undefined and continuing will lead to unpredictable results. So most of the time you just want:
let guard = mutex.lock().expect("poisoned");
The last 1% is when you want to clean up something even if a panic has occured. This is usually in a impl Drop situation. It's not much more verbose either, just:
let guard = mutex.lock().unwrap_or_else(|poison| poison.into_inner());
What is painful is trying to propagate the poison value as an error using `?`. In that case you're probably better off using a match expression, because the usual `.into()` will not play nice with common error handling crates (thiserror, anyhow); otherwise you need to implement `From` manually for your error types and drop the contents of the poison error before propagating.
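One way around that, assuming a hypothetical `AppError` in place of a thiserror/anyhow error type, is to map the poison error away (dropping the guard it contains) before applying `?`:

```rust
use std::sync::Mutex;

// Hypothetical application error type.
#[derive(Debug, PartialEq)]
enum AppError {
    LockPoisoned,
}

fn read_balance(account: &Mutex<i64>) -> Result<i64, AppError> {
    // PoisonError owns the guard, so it isn't a plain error value; map it to
    // our own type (which drops the guard) so `?` can propagate it.
    let guard = account.lock().map_err(|_poisoned| AppError::LockPoisoned)?;
    Ok(*guard)
}
```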
This might be the case for long running server processes where you have n:m threading with long running threads and want to keep processing other requests even if one request fails. Although in that case you probably want (or your framework provides) some kind of robustness with `catch_unwind` that will log the errors, respond with HTTP 500 or whatever and then resume. Because that's needed to catch panics from non-mutex related code.
> poisoning a Mutex is a very convenient avenue for a denial-of-service attack, since a poisoned Mutex will just completely brick a given critical section.
There's a tension between making DoS hard and avoiding RCE vulnerabilities, since the way to avoid an unplanned/bad code state becoming an RCE vulnerability is to crash as quickly and thoroughly as possible when you get into that state.
I've dug into this topic in the past and my takeaway for this entire thing was “cool idea, but don't use it in practice”.
I.e. just unwrap the lock call's result. If a worker thread panics you should assume your application is done for. Some people even recommend setting panic=abort for release builds, in which case you won't even be able to catch those panics to begin with.
I mean, think about the actual use cases here. One of my threads just panicked. Does it make sense to continue running the application?
And if you answer yes, this is an error condition that can occur, then it shouldn't panic to begin with and should instead handle errors gracefully, leaving the mutex unpoisoned.
Questions for anyone who is an expert on poisoning in Rust:
Is it safe to ignore poisoned mutexes if and only if the relevant pieces of code are unwind-safe, similar to exception safety in C++? As in, if a panic happens, the relevant pieces of code handles the unwinding safely, thus data is not corrupted, and thus ignoring the poison is fine?
I will personally recommend that unless you are writing performance sensitive code*, don’t use mutexes at all because they are too low-level an abstraction. Use MPSC queues for example, or something like RCU. I find these abstractions much more developer friendly.
Channels and RCU can often be better for performance as well. I have run into scaling issues due to contention way too many times. Sometimes just because two things were in different parts of the same cache line.
Sharing as little mutable state as possible is often the best way to scale to a large number of CPU cores. RCU can help with that if you have data that is mostly read but rarely written. If your workload is balanced or even write heavy RCU is probably not a good option.
I have even had to redesign because I had too much contention on plain old atomic reference counters, almost 30 % of my runtime was reference counting of a small number of specific Arcs, and hardware performance counters pointed at cache line contention. I redesigned that code to avoid Arcs entirely which also allowed some additional optimisations, resulting in cutting approximately 40 % of my runtime in total.
So, each specific use case should be approached individually if you care about performance. And always profile and benchmark. If the code isn't performance critical, by all means do what you think is most maintainable. But measure, because you are probably wrong about what part of your code is the bottleneck unless you measure.
I have found out that mutex solutions are more maintainable and amendable without big redesigns compared with channels or RCU.
Consider a simple single-producer, single-consumer case. While one can use bounded channels to implement back-pressure, in practice when one wants to either drop messages or apply back-pressure based on message priority, any solution involving channels will lead to a pile of complex multi-channel solutions and select. With a mutex the change is a straightforward replacement of a queue by a priority queue plus an extra if inside the mutex.
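A sketch of that change, with a hypothetical `PriorityBuffer` that drops the lowest-priority message when full:

```rust
use std::collections::BinaryHeap;
use std::sync::Mutex;

// Mutex-guarded priority queue with a simple drop-lowest overflow policy.
struct PriorityBuffer {
    heap: Mutex<BinaryHeap<(u8, String)>>, // (priority, message)
    cap: usize,
}

impl PriorityBuffer {
    fn push(&self, priority: u8, msg: String) {
        let mut heap = self.heap.lock().unwrap();
        heap.push((priority, msg));
        if heap.len() > self.cap {
            // BinaryHeap is a max-heap; to drop the *lowest* priority item,
            // this sketch rebuilds via a sorted vec (fine for small caps).
            let mut items = std::mem::take(&mut *heap).into_sorted_vec();
            items.remove(0); // discard the lowest-priority entry
            *heap = items.into_iter().collect();
        }
    }

    fn pop(&self) -> Option<(u8, String)> {
        self.heap.lock().unwrap().pop()
    }
}
```

Swapping the overflow policy (block, drop-newest, drop-lowest) is a local edit inside `push`, which is the flexibility being argued for.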
A channel can be backed by a priority queue if you wish. It’s just an abstraction. The channel internally probably uses mutexes too; it’s just that it’s helpful not to see mutexes in application code.
Surely one can abstract a priority queue with mutexes into its own data structure. However, it will contain enough application-specific logic that the chance of reuse will be slim. So by directly working with mutexes one will have simpler code overall that can still be adapted more easily to changing requirements.
In general the problem with channels is that they are not flexible enough while being rather abstract. A better abstraction is message passing with one message queue per thread, as in Erlang. IMO it can cover more cases before one needs mutexes. But even with that, proper back-pressure and rate limiting are hard.
A mutex is a natural abstraction when there is exactly one of them. You have a bunch of tasks doing their own stuff, with shared mutable state behind the mutex. When you start thinking about using two mutexes, other abstractions often become more convenient.
Although there are owning mutexes for C++, the C++ standard library does not provide such a thing. So the std::mutex used in the example is not an owning mutex, and that example works and does what was described.
One reason not to provide the owning mutex in C++ is that it isn't able to deliver similar guarantees to Rust because its type system isn't strong enough. Rust won't let you accidentally keep a reference to the protected object after unlocking, C++ will for example.
I am not very familiar with C++'s API, but I believe that you are right that the C++ example in the article is incorrect, though for a different reason, namely that RAII is also supported in C++.
In C++, a class like std::lock_guard also provides "Automatic unlock". AFAICT, the article argues that only Rust's API provides that.
> In C++, a class like std::lock_guard also provides "Automatic unlock". AFAICT, the article argues that only Rust's API provides that.
The issue isn't automatic unlocking. From the article:
> The problem? Nothing stops you from accessing account without locking the mutex first. The compiler won’t catch this bug.
i.e., a C++ compiler will happily compile code that modifies `account` without taking the lock first. Your lock_guard example suffers from this same issue.
Nothing in the C++ stdlib provides an API that makes it impossible to access `account` without first taking the lock, and while you can write C++ classes that approximate the Rust API you can't quite reach the same level of robustness without external help.
That is a different topic from what I wrote about.
The article wrote:
> Automatic unlock: When you lock, you receive a guard. When the guard goes out of scope, it automatically unlocks. No manual cleanup needed.
And presented Rust as being different from C++ regarding that, and the C++ example was not idiomatic, since it did not use something like std::lock_guard.
I have not addressed the rest of your comment, since it is a different topic, sorry.
Fair point with respect to the separate topic. My apologies.
As for the automatic cleanup bit, perhaps the article is trying to focus purely on the mutex types themselves? Or maybe they included the "when you lock" bit to emphasize that you can't forget to unlock the mutex (i.e., no reliance on unenforced idioms). Hard to say given the brevity/nature of the section, and in the end I think it's not that much of a problem given the general topic of the blogpost.
It seems completely clear. He first gives unidiomatic C++ code, then next gives idiomatic Rust code, and differentiates the two based on the code snippets. It is a mistake on his part, and I do not see how it could reasonably be viewed otherwise. It is not a huge mistake, but it is still a clear mistake.
Perhaps it might help to clarify precisely what claim(s) you think are being made?
From my reading, the section (and the article in general, really) is specifically focusing on mutexes, so the observations the article makes are indeed accurate in that respect (i.e., C++'s std::mutex indeed does not have automatic unlocking; you need to use an external construct for that functionality). Now, if the article were talking about locking patterns more generally, I think your criticism would hold more weight, but I think the article is more narrowly focused than that.
For a bit of a more speculative read, I think it's not unreasonable to take the C++ code as a general demonstration of the mutex API "languages other than Rust" use rather than trying to be a more specific comparison of locking patterns in Rust and C++. Consider the preceding paragraph:
> In languages other than Rust, you typically declare a mutex separately from your data, then manually lock it before entering the critical section and unlock it afterward. Here’s how it looks in C++:
I don't think it's unreasonable to read the "it" in the final sentence as "that pattern"; i.e., "Here's what that pattern looks like when written in C++". The example code would be perfectly correct in that case - it shows a mutex declared separately from the data, and it shows that mutex being manually locked before entering the critical section and unlocked after.
There was a giant super-long GitHub issue about improving Rust std mutexes a few years back. Prior to that issue Rust was using something much worse, pthread_mutex_t. It explained the main reason why the standard library could not just adopt parking_lot mutexes:
From https://github.com/rust-lang/rust/issues/93740
> One of the problems with replacing std's lock implementations by parking_lot is that parking_lot allocates memory for its global hash table. A Rust program can define its own custom allocator, and such a custom allocator will likely use the standard library's locks, creating a cyclic dependency problem where you can't allocate memory without locking, but you can't lock without first allocating the hash table.
> After some discussion, the consensus was to providing the locks as 'thinnest possible wrapper' around the native lock APIs as long as they are still small, efficient, and const constructible. This means SRW locks on Windows, and futex-based locks on Linux, some BSDs, and Wasm.
> This means that on platforms like Linux and Windows, the operating system will be responsible for managing the waiting queues of the locks, such that any kernel improvements and features like debugging facilities in this area are directly available for Rust programs.
Note that the SRW locks are gone, except if you're on a very old Windows. So today the Rust built-in std mutex for your platform is almost certainly basically a futex. On Windows it is not called a futex, and from some angles it is better, but the same core ideas of the futex apply: we only ask the OS to do any work when we're contended, there is no OS-limited resource (other than memory), and our uncontended operations are as fast as they could ever be.
SRW locks were problematic because they're bulkier than a futex (though mostly when contended), and they have a subtle bug that for a long time it was unclear when Microsoft would get around to fixing, which isn't a huge plus sign for an important intrinsic used in all the high performance software on a $$$ commercial OS...
Mara's work (which you linked) is probably more work, and more important, but it's not actually the most recent large reworking of Rust's Mutex implementation.
> if it is on Windows it is not called a futex
What is it called?
WaitOnAddress or from Rust's point of view wait_on_address
> Prior to that issue Rust was using something much worse, pthread_mutex_t
Presumably you're referring to this description, from the Github Issue:
> > On most platforms, these structures are currently wrappers around their pthread equivalent, such as pthread_mutex_t. These types are not movable, however, forcing us to wrap them in a Box, resulting in an allocation and indirection for our lock types. This also gets in the way of a const constructor for these types, which makes static locks more complicated than necessary.
pthread mutexes are const-constructible in a literal sense, just not in the sense Rust requires. In C you can initialize a pthread_mutex_t with the PTHREAD_MUTEX_INITIALIZER initializer list instead of pthread_mutex_init, and at least with glibc there's no subsequent allocation when using the lock. But Rust can't do in-place construction[1] (i.e. placement new in C++ parlance), which is why Rust needs to be able to "move" the mutex. Moving a mutex is otherwise nonsensical once the mutex is visible: it's the address of the mutex that the locking is built around.
The only thing you gain by not using pthread_mutex_t is a possibly smaller lock: pthread_mutex_t has to contain additional members to support robust, recursive, and error-checking mutexes, though altogether that's only 2 or 3 additional words because some are union'd. I guess you also gain the ability to implement locking, including condition variables, barriers, etc., however you want, though then you can't share those through FFI.
[1] At least not without unsafe and some extra work, which presumably is a non-starter for a library type where you want to keep it all transparent.
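For comparison, this is what the const-constructible requirement buys in today's std: a static lock with no allocation and no lazy initialization (unlike the old boxed-pthread design).

```rust
use std::sync::Mutex;

// Mutex::new is a const fn in current Rust, so a static mutex can be
// initialized at compile time, with no runtime setup or heap allocation.
static COUNTER: Mutex<u64> = Mutex::new(0);

fn bump() -> u64 {
    let mut guard = COUNTER.lock().unwrap();
    *guard += 1;
    *guard
}
```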
> The effect of referring to a copy of the object when locking, unlocking, or destroying it is undefined.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V...
I.e., if I pthread_mutex_init(&some_addr, ...), I cannot then copy the bits from some_addr to some_other_addr and then pthread_mutex_lock(&some_other_addr). Hence not movable.
> Moving a mutex is otherwise non-sensical once the mutex is visible
What does "visible" mean here? In Rust, in any circumstance where a move is possible, there are no other references to that object, hence it is safe to move.
Well, technically if you only have a mutable borrow (it's not your object) then you can't move out of it unless you replace it somehow. If you have two such borrows you can swap them; if the type implements Default you can take from one borrow, which replaces it with its default; and if you have some other way to make one you can replace the one you've got a reference to with that one. But if you can't make a new one and don't have one to replace it with, then too bad: no moving the one you've got a reference to.
You're right and I edited my comment.
> What does "visible" mean here? In Rust, in any circumstance where a move is possible, there are no other references to that object, hence it is safe to move.
And other than during construction or initialization (of the mutex object, containing object, or related state), how common is it in Rust to pass a mutex by value? If you can pass it by value then the mutex isn't (can't be) protecting anything. I'm struggling to think of a scenario where you'd want to do this, or at least why the inability to do so is a meaningful impediment (outside construction/initialization, that is). I understand Rust is big on pass-by-value, but when the need for a mutex enters the fray, it's because you're sharing or about to share, and thus passing by reference.
Depends on the program, and it can be a very useful tool.
Rust has Mutex::get_mut(&mut self) which allows getting the inner &mut T without locking. Having a &mut Mutex<T> implies you can get &mut T without locks. Being able to treat Mutex<T> like any other value means you can use the whole suite of Rust's ownership tools to pass the value through your program.
Perhaps you temporarily move the Mutex into a shared data structure so it can be used on multiple threads, then take it back out later in a serial part of your program to get mutable access without locks. It's a lot easier to move Mutex<T> around than &mut Mutex<T> if you're going to then share it and un-share it.
Also, it's impossible to construct a Mutex without moving at least once, as Rust doesn't guarantee return value optimization. All moves in Rust are treated as memcpys that 'destroy' the old value. There's no way to even write 'let v = Mutex::new()' without a move, so it's also a hard functional requirement.
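The share-then-unshare flow described above can be sketched with scoped threads; the function name is illustrative:

```rust
use std::sync::Mutex;

// A Mutex is an ordinary movable value: share it by reference during a
// parallel phase, then use get_mut/into_inner for lock-free access once
// you have exclusive ownership again.
fn parallel_then_serial() -> i32 {
    let mut counter = Mutex::new(0);
    std::thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| *counter.lock().unwrap() += 1);
        }
    }); // all threads joined; we have exclusive access again
    *counter.get_mut().unwrap() += 100; // no lock taken here
    counter.into_inner().unwrap() // consume the mutex by value
}
```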
You can pass the mutex by value and it does continue to protect its value.
https://play.rust-lang.org/?version=stable&mode=debug&editio...
I’m actually thinking of the sheer size of pthread mutexes. They are giant. The issue says that they wanted something small, efficient, and const constructible. Pthread mutexes are too large for most applications doing fine-grained locking.
On a typical modern 64-bit Linux, for example, they're 40 bytes, i.e. 320 bits. So yeah, unnecessarily bulky.
On my Linux system today, Rust's Mutex<Option<CompactString>> is smaller than the pthread mutex type, whether it is locked and holds the text "pthread_mutex_t is awful" inside it or is unlocked with explicitly no text (not an empty string); either way it only takes 30-odd bytes, while the pthread_mutex_t is 40 bytes.
On Windows the discrepancy is even bigger: their OS-native mutex type is a sprawling 80-byte monster, while Rust's Mutex<Option<CompactString>> is, I believe, slightly smaller than on Linux even though it has the same features.
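These numbers are platform-dependent but easy to check from Rust; a quick sketch using plain std types rather than CompactString (the exact sizes printed vary by platform and std version, but on futex-based targets Mutex adds only a 4-byte futex word plus a 1-byte poison flag):

```rust
use std::mem::size_of;
use std::sync::Mutex;

fn report() {
    // On current x86-64 Linux this prints 8 / 8 / 16 respectively;
    // other platforms may differ.
    println!("Mutex<()>  : {} bytes", size_of::<Mutex<()>>());
    println!("Mutex<u32> : {} bytes", size_of::<Mutex<u32>>());
    println!("Mutex<u64> : {} bytes", size_of::<Mutex<u64>>());
}
```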
> On Windows the discrepancy is even bigger, their OS native mutex type is this sprawling 80 byte monster
I guess you are referring to CRITICAL_SECTION? SRWLock, which has the size of a pointer, was introduced in Windows Vista. Since Windows 8 you can use WaitOnAddress to build even smaller locks.
Yes, CRITICAL_SECTION is far too large. Mara asked some years ago whether SRWLock could guarantee what Rust actually needs for this purpose (the documentation at that time refused to clarify whether we can move it for example) and that's why her change was to SRWLock from CRITICAL_SECTION.
And yes, the newer change uses WaitOnAddress to provide the same API as the futex from the various Unix platforms. Raymond Chen's description of the differences is perhaps rather exaggerated, which isn't to say there's no difference, but it's well within what's practical for an adaptor layer.
Also although the SRWLock itself is the same size as a pointer (thus, 64 bits on a modern computer, compared to a 32-bit Futex) there's a reason it's the same size as a pointer - it actually is basically a pointer, and so in some cases it's pointing at a data structure which we should reasonably say is also part of the overhead.
The pointer is to a large aligned object, which means the bottom bits would be zero and so SRWLock uses those for flag bits. It's a nice trick but we should remember that it isn't really comparable to a Futex though it's certainly cheaper than CRITICAL_SECTION.
I dunno, it seems to me that the standard mutex performs very well in all scenarios and doesn't have any significant downsides, except for the hogging case, which could be fixed by assigning the non-hogging threads a higher priority.
Whereas parking_lot has a ton of problematic scenarios: after the spinlock times out it yields the thread to the OS, which has no idea that it should wake the thread back up once the resource is unblocked.
It could be even argued that preventing starvation is outside the design scope of the Mutex as a construct, as it only guarantees mutual exclusion and that the highest priority waiting thread should get access to it.
Seems like the simple solution to this problem would be to have both, no?
A simple native lock in the standard library along with a nicer implementation (also in the standard library) that depends on the simple lock?
The simplest solution is for `std::mutex` to provide a simple, efficient mutex which is a good choice for almost any program. And it does. Niche programs can pull in a crate.
I doubt `parking_lot` would have been broadly used—maybe wouldn't even have been written—if `std` had this implementation from the start.
What specifically in this comparison made you think that `parking_lot` is broadly needed? They had to work pretty hard to find a scenario in which `parking_lot` did much better in any performance metrics. And as I alluded to in another comment, `parking_lot::Mutex<InnerFoo>` doesn't have a size advantage over `std::mutex::Mutex<InnerFoo>` when `InnerFoo` has word alignment. That's the most common situation, I think.
If I were to make a wishlist of features for `std::mutex` to just have, it wouldn't be anything `parking_lot` offers. It'd be stuff like the lock contention monitoring that the (C++) `absl::Mutex` has. (And at least on some platforms you can do a decent job of monitoring this with `std::mutex` by monitoring the underlying futex activity.)
This. The standard library has a responsibility to provide an implementation that performs well enough in every possible use case, while trying to be generally as fast as possible.
My takeaway is that the documentation should make more explicit recommendations depending on the situation -- i.e., people writing custom allocators should use std mutexes; most libraries and applications that are OK with allocation should use parking_lot mutexes; embedded code or libraries that don't want to depend on allocation should use std mutexes. Or maybe parking_lot is almost useless unless you're doing very fine-grained locking. Something like that.
Author of the original WTF::ParkingLot here (what rust’s parking_lot is based on).
I’m surprised that this only compared to std on one platform (Linux).
The main benefit of parking lot is that it makes locks very small, which then encourages the use of fine grained locking. For example, in JavaScriptCore (ParkingLot’s first customer), we stuff a 2-bit lock into every object header - so if there is ever a need to do some locking for internal VM reasons on any object we can do that without increasing the size of the object
> The main benefit of parking lot is that it makes locks very small, which then encourages the use of fine grained locking. For example, in JavaScriptCore (ParkingLot’s first customer), we stuff a 2-bit lock into every object header - so if there is ever a need to do some locking for internal VM reasons on any object we can do that without increasing the size of the object
IMHO that's a very cool feature which is essentially wasted when using it as a `Mutex<InnerBlah>` because the mutex's size will get rounded up to the alignment of `InnerBlah`. And even when not doing that, afaict `parking_lot` doesn't expose a way to use the remaining six bits in `parking_lot::RawMutex`. I think the new std mutexes made the right choice to use a different design.
> I’m surprised that this only compared to std on one platform (Linux).
Can't speak for the author, but I suspect a lot of people really only care about performance under Linux. I write software that I often develop from a Mac but almost entirely deploy on Linux. (But speaking of Macs: std::mutex doesn't yet use futexes on macOS. Might happen soon. https://github.com/rust-lang/rust/pull/122408)
Hypothetically Rust could make `Mutex<InnerBlah>` work with just two bits in the same way it makes `Option<&T>` the same size as `&T`. Annotate `InnerBlah` with the information about which bits are available and let `Mutex` use them.
There was talk of Rust allowing stride != alignment. [1] I think this would mean if say `InnerBlah` has size 15 and alignment 8, `parking_lot::Mutex<InnerBlah>` can be size 16 rather than the current 24. Same would be true for an `OuterBlah` the mutex is one field of. But I don't think it'll happen.
[1] e.g. https://internals.rust-lang.org/t/pre-rfc-allow-array-stride...
References only have a single invalid value available as a niche (the all-zeros/null bit pattern), which Option makes use of for the null pointer optimization (https://doc.rust-lang.org/std/option/index.html#representati...).
In principle, Rust could create something like std::num::NonZero and its corresponding sealed trait ZeroablePrimitive to mark that two bits are unused. But that doesn't exist yet as far as I know.
There are also currently the unstable rustc_layout_scalar_valid_range_start and rustc_layout_scalar_valid_range_end attributes (which are used in the definition of NonNull, etc.) which could be used for some bit patterns.
Also aspirations to use pattern types for this sort of thing: https://github.com/rust-lang/rust/issues/135996
I think he meant 1 byte on the heap for the shared state, on the stack it's larger.
Which is fine since in Rust we almost always have the mutex in function scope as long as we're using it.
> I suspect a lot of people really only care about performance under Linux
Yeah this is true
I do the same in my toy JVM (to implement the reentrant mutex+condition variable that every Java object has), except I've got a rare deadlock somewhere because, as it turns out, writing complicated low level concurrency primitives is kinda hard :p
How can a parking_lot lock be less than 1 byte? Does this use unsafe?
Rust in general doesn't support bit-level objects unless you cast things to [u8] and do some shifts and masking manually (that is, like C), which of course is wildly unsafe for data structures with safety invariants
Original post: https://webkit.org/blog/6161/locking-in-webkit/
Post that mentions the two bit lock: https://webkit.org/blog/7122/introducing-riptide-webkits-ret...
I don’t know the details of the Rust port but I don’t imagine the part that involves the two bits to require unsafe, other than in the ways that any locking algorithm dances with unsafety in Rust (ownership relies on locking algorithms being correct)
This is very similar to how Java's object monitors are implemented. In OpenJDK, the markWord uses two bits to describe the state of an Object's monitor (see markWord.hpp:55). On contention, the monitor is said to become inflated, which basically means revving up a heavier lock and knowing how to find it.
I'm a bit disappointed though, I assumed that you had a way of only using 2 bits of an object's memory somehow, but it seems like the lock takes a full byte?
The lock takes two bits.
It’s just that if you use the WTF::Lock class, you get a full byte simply because the smallest possible size of a class instance in C++ is one byte.
But there’s a template mixin you can use to get it down to two bits (you tell the mixin which byte to steal the two bits from, and which two bits).
I suspect the same situation holds in the Rust port.
I am very familiar with how Java does locks. This is different. Look at the ParkingLot/parking_lot API. It lets you do much more than just locks, and there’s no direct equivalent of what Java VMs call the inflated or fat lock. The closest thing is the on demand created queue keyed by address.
The idea is that six bits in the byte are free to use as you wish. Of course you'll need to implement operations on those six bits as CAS loops (which nonetheless allow for any arbitrary RMW operation) to avoid interfering with the mutex state.
The lock uses two bits but still takes up a whole (atomic) byte
This article elaborates how it works.
Unhelpful response. This cuongle.dev article does not answer nextaccountic's question, and neither do the webkit.org articles that describe the parking lot concept but not this Rust implementation. The correct answer appears to be that it's impossible: `parking_lot::RawMutex` has private storage that owns the entire byte and does not provide any accessor for the unused six bits.
https://docs.rs/parking_lot/0.12.5/parking_lot/struct.RawMut...
(unless there's somewhere else in the crate that provides an accessor for this but that'd be a weird interface)
(or you just use transmute to "know" that it's one byte and which bits within the byte it actually cares about, but really don't do that)
(slightly more realistically, you could probably use the `parking_lot_core::park` portion of the implementation and build your own equivalent of `parking_lot::RawMutex` on top of it)
(or you send the `parking_lot` folks a PR to extend `parking_lot::RawMutex` with interface you want; it is open source after all)
> and neither do the webkit.org articles that describe the parking lot concept but not this Rust implementation
The WebKit post explicitly talks about how you just need two bits to describe the lock state.
> The correct answer appears to be that it's impossible: `parking_lot::RawMutex` has private storage that owns the entire byte and does not provide any accessor for the unused six bits.
Not impossible. One way to do this is to just use parking_lot directly.
In WebKit there’s a template mixin that lets you steal two bits for locking however you like. JavaScriptCore uses this to steal two bits from the indexing type byte (if I remember right)
> The WebKit post explicitly talks about how you just need two bits to describe the lock state.
It describes the algorithm but not how a caller of the Rust `parking_lot` crate could take advantage of this.
> Not impossible. One way to do this is to just use parking_lot directly.
By "just use parking_lot directly", I think you're talking about reimplementing the parking lot algorithm or using the C++ `WTF::ParkingLot` implementation? But not actually using the existing Rust crate called `parking_lot` described in the cuongle.dev article? That's confusingly put, and nextaccountic's question is certainly Rust-specific and likely expecting an answer relating to this particular crate. At the least, "does this use unsafe" would certainly be true with an implementation from scratch or when using FFI into C++.
I hear that this algorithm and the C++ implementation are your invention, and all due respect for that. I'm also hearing that you are not familiar with this Rust implementation. It does not offer the main benefit you're describing. `parking_lot::RawMutex` is a one-byte type; that six bits within it are unused is true but something callers can not take advantage of. Worse, `parking_lot::Mutex<InnerFoo>` in practice is often a full word larger than `InnerFoo` due to alignment padding. As such, there's little benefit over a simpler futex-based approach.
> It describes the algorithm but not how a caller of the Rust `parking_lot` crate could take advantage of this.
Read the WebKit post.
> By "just use parking_lot directly", I think you're talking about reimplementing the parking lot algorithm or using the C++ `WTF::ParkingLot` implementation? But not actually using the existing Rust crate called `parking_lot` described in the cuongle.dev article?
See https://docs.rs/parking_lot_core/latest/parking_lot_core/
That's my ParkingLot API. You can use it to implement many kinds of locks, including:
- Efficient ones that use a tristate, like the glibc lowlevellock, or what I call the Cascade lock. So, this doesn't even need two bits.
- The lock algorithm I prefer, which uses two bits.
- Lots of other algorithms. You can do very efficient condition variables, rwlocks, counting locks, etc.
You can do a lot of useful algorithms with fewer than 8 bits. You don't have to use the C++ ParkingLot. You don't have to implement parking_lot.
What you do have to do is RTFM
The two bit lock was specifically refering to the C++ WTF::ParkingLot (and the comment mentioning it explicitly said that). nextaccountic is confused.
No. nextaccountic's comment and the cuongle.dev article are both talking about Rust. The Rust `parking_lot` implementation only uses two bits within a byte, but it doesn't provide a way for anything else to use the remaining six.
pizlonator's comments mention both the (C++) WTF::ParkingLot and the Rust `parking_lot`, and they don't answer nextaccountic's question about the latter.
> nextaccountic is confused.
nextaccountic asked how this idea could be applied to this Rust implementation. That's a perfectly reasonable question. pizlonator didn't know the answer. That's perfectly reasonable too. Conscat suggested the article would be helpful; that was wrong.
nextaccountic replied to this original comment: https://news.ycombinator.com/item?id=46035698
Yes, nextaccountic's reply is confused about Rust vs C++ implementations. But the original mention was not talking about Rust.
The original webkit blog post about parking lot mutex implementation is a great read https://webkit.org/blog/6161/locking-in-webkit/
> Poisoning: Panic Safety in Mutexes
This is one of the biggest design flaws in Rust's std, in my opinion.
Mutex poisoning has its uses, but they're rare in practice. Usually it's a huge misfeature that only introduces problems. More often than not, panicking in a critical section is fine[1]; on the other hand, poisoning a Mutex is a very convenient avenue for a denial-of-service attack, since a poisoned Mutex will just completely brick a given critical section.
I'm not saying such a project doesn't exist, but I don't think I've ever seen a project which does anything sensible with Mutex's `Poisoned` error besides ignoring it. It's always either an `unwrap` (and we know how well that can go [2]), or doing the sensible thing with this ridiculous song-and-dance:
Suffice it to say, it's a pain. So in a lot of projects when I need a mutex I just add `parking_lot`, because its performance is stellar and it doesn't have the poisoning insanity to deal with.
[1] -- obviously it depends on a case-by-case basis, but if you're using such a low level primitive you should know what you're doing
[2] -- https://blog.cloudflare.com/18-november-2025-outage/#memory-...
> It's always either an `unwrap` (and we know how well that can go [2])
If a mutex has been poisoned, then something must have already panicked, likely in some other thread, so you're already in trouble at that point. It's fine to panic in a critical section if something's horribly wrong, the problem comes with blindly continuing after a panic in other threads that operate on the same data. In general, you're unlikely to know what that panic was, so you have no clue if the shared data might be incompletely modified or otherwise logically corrupted.
In general, unless I were being careful to maintain fault boundaries between threads or tasks (the archetypical example being an HTTP server handling independent requests), I'd want a panic in one thread to cascade into stopping the program as soon as possible. I wouldn't want to swallow it up and keep using the same data like nothing's wrong.
> so you have no clue if the shared data might be incompletely modified or otherwise logically corrupted.
One can make a panic-aware wrapper type if they care; it's what the stdlib Mutex currently does:
MutexGuard checks whether it is panicking during drop using `std::thread::panicking()`, and if so, sets a bool on the Mutex. The next acquirer checks that bool and knows the state may be corrupted. No need to bake this into the Mutex itself.
My point is that "blindly continuing" is not a great default if you "don't care". If you continue, then you first have to be aware that a multithreaded program can and will continue after a panic in the first place (most people don't think about panics at all), and you also have to know the state of the data after every possible panic, if any. Overall, you have to be quite careful if you want to continue properly, without risking downstream bugs.
The design with a verbose ".lock().unwrap()" and no easy opt-out is unfortunate, but conceptually, I see poisoning as a perfectly acceptable default for people who don't spend all their time musing over panics and their possible causes and effects.
> If a mutex has been poisoned, then something must have already panicked, likely in some other thread, so you're already in trouble at that point.
I find that in the majority of cases you're essentially dealing with one of two cases:
1) Your critical sections are tiny and you know you can't panic, in which case dealing with poisoning is just useless busywork.
2) You use a Mutex to get around Rust's "shared xor mutable" requirement. That is, you just want to temporarily grab a mutable reference and modify an object, but you don't have any particular atomicity requirements. In this case panicking is no different than if you would panic on a single thread while modifying an object through a plain old `&mut`. Here too dealing with poisoning is just useless busywork.
> I'd want a panic in one thread to cascade into stopping the program as soon as possible.
Sure, but you don't need mutex poisoning for this.
> 1) Your critical sections are tiny and you know you can't panic, in which case dealing with poisoning is just useless busywork.
Many people underestimate how many things can panic in corner cases. I've found quite a few unsafe functions in various crates that were unsound due to integer-overflow panics that the author hadn't noticed. Knowing for a fact that your operation cannot panic is the exception rather than the rule, and while it's unfortunate that the std Mutex doesn't accommodate non-poisoning mutexes, I see poisoning as a reasonable default.
(If Mutex::lock() unwrapped the error automatically, then very few people would even think about the "useless busywork" of the poison bit. For a similar example, the future types generated for async functions contain panic statements in case they are polled after completion, and no one complains about those.)
> 2) You use a Mutex to get around Rust's "shared xor mutable" requirement. That is, you just want to temporarily grab a mutable reference and modify an object, but you don't have any particular atomicity requirements.
Then I'd stick to a RefCell. Unless it's a static variable in a single-threaded program, in which case I usually just write some short wrapper functions if I find the manipulation too tedious.
We're currently working on separating poison from mutexes, such that the default mutexes won't have poisoning (no more `.lock().unwrap()`), and if you want poisoning you can use something like `Mutex<Poison<T>>`.
Yeah, I'm looking forward to it!
While we're at it, another thing that'd be nice to get rid of is `AssertUnwindSafe`, which I find even more pointless.
Speaking only for myself (though several other people have expressed the same sentiment), I wish we could get rid of unwinding. That would be a massive challenge to do while preserving capabilities people care about, such as the ability to handle panics in http request handlers without exiting. I think it would be possible, though.
That sounds really interesting, whether it is done in Rust, some Rust 2.0, or a successor or experimental language. I do not know whether it is possible, though. If one does not unwind, what should actually happen instead? How would for instance partial computations, and resources on the stack, be handled? Some partial or constrained unwinding? I have not given it a lot of thought, though. How do languages without exceptions handle it? How does C handle it? Error codes all the way? Maybe something with arenas or regions?
I do not have a good grasp on panics in Rust, but panics in Rust being able to either unwind or abort dependent on configuration, seems complex, and that design happened for historical reasons, from what I have read elsewhere.
Vague sketch: imagine if we had scoped panic hooks, unhooked via RAII. So, for use cases that today use unwinding for cleanup (e.g. "switch the terminal back out of curses mode"), you do that cleanup in a panic hook instead.
The hard use case to handle without unwinding is an HTTP server that wants to allow for panics in a request handler without panicking the entire process. Unwinding is a janky way to handle that, and creates issues in code that doesn't expect unwinding (e.g. half-modified states), and poisoning in particular seems likely to cascade and bring down other parts of the process anyway if some needed resource gets poisoned. But we need a reasonable alternative to propose for that use case, in order to seriously evaluate eliminating unwinding.
I am not sure that I understand what scoped panic hooks would or might look like. Are they maybe similar to something like try-catch-finally in Java? Would the language force the programmer to include them in certain cases somehow?
If a request handler for example has at some point in time 7 nested calls, in call no. 2 and call no. 6 have resources and partial computation that needs clean-up somehow and somewhere, and call no. 7 panics, I wonder what the code would look like in the different calls, and what would happen and when, and what the compiler would require, and what other relevant code would look like.
For the simple case, suppose that you're writing a TUI application that takes over the terminal. When it exits, even by panic, you want to clean up the terminal state so the user doesn't have to blindly type "reset".
Today, people sometimes do that by using `panic = "unwind"`, and writing a `catch_unwind` around their program, and using that to essentially implement a "finally" block. Or, they do it by having an RAII type that cleans up on `Drop`, and then they count on unwinding to ensure their `Drop` gets called even on panic. (That leaves aside the issue that something called from `Drop` is not allowed to fail or panic itself.) The question is, how would you do that without unwinding?
We have a panic hook mechanism, where on panic the standard library will call a user-supplied function. However, there is only one panic hook; if you set it, it replaces the old hook. If you have only one cleanup to do, that works fine. For more than one, you can follow the semantic of having your panic hook call the previous hook, but that does not allow unregistering hooks out of order; it only really works if you register a panic hook once for the whole program and never unregister it (e.g. "here's the hook for cleaning up tracing", "here's the hook for cleaning up the terminal state").
Suppose, instead, we had a mechanism that allowed registering arbitrary panic hooks, and unregistering them when no longer needed, in any order. Then, we could do RAII-style resource handling: you could have a `CursesTerminal` type, which is responsible for cleaning up the terminal, and it cleans up the terminal on `Drop` and on panic. To do the latter, it would register a panic hook, and deregister that hook on `Drop`.
With such a mechanism, panic hooks could replace anything that uses `catch_unwind` to do cleanup before going on to exit the program. That wouldn't fully solve the problem of doing cleanup and then swallowing the panic and continuing, but it'd be a useful component for that.
I have not given it much thought, but it would primarily be for the subset of Rust programs that do not need zero-cost abstractions as much, right? Since, even in the case of no panics, one would be paying at runtime for registering panic hooks, if I understand correctly.
(If unwinding goes away then, sure, mutex poisoning becomes moot.)
Excited to hear this.
I'm very disappointed at this. The path of least resistance ought to be the right thing to do.
In the entire history of the standard library, we have never once seen a single report of anyone attempting to recover from poison.
I've used recovering from poisoned state in impl Drop in quite a few places.
In my case it's usually waiting for the GPU to finish some asynchronous work that's been spun up by CPU threads that may have panicked while holding the lock. This is necessary to avoid freeing resources that the GPU may still be using.
I usually prefix this with `if !std::thread::panicking() {}`, so I don't end up waiting (possibly forever) if I'm already cleaning up after a panic.
Thank you for mentioning this; I'd be really interested in hearing more about this, and seeing some examples.
Hi, I don't have public examples to share but I can give an explanation of a simple scenario.
I have a container of resources, e.g. textures. When the GPU wants to use them, CPU will lease them until a point of time in the future denoted by a value (u64) of a GPU timeline semaphore. The handle and value of the semaphore is added to a list guarded by a mutex. Then GPU work is kicked off and the GPU will increment semaphore to that value when done.
In the Drop implementation of the container, we need to wait until all semaphores reach their respective value before freeing resources, and do so even if some thread panicked while holding the lock guarding the list. This is where I use .unwrap_or_else to get the list from the poison value.
It's not infeasible to try to catch any errors and propagate them when the lock is grabbed. But this is mostly for OOM and asserts that are not expected to fire. The ergonomics would be worse if the "lease" function would be fallible.
This said, I would not object to poisoning being made optional.
Oh, I don't think recovery from poison is why poisoning is good. The reason poisoning is good is that at the moment you've acquired a lock on a mutex, you should be able to assume that the invariants guarded by the mutex are upheld (and panic if not).
Mutex doesn't promise to uphold any more invariants than `&mut T` does. If the state can be corrupted by a panic while holding `&mut T`, I don't think there's any good reason to expect that obtaining it through `MutexGuard` should make any difference.
Panic propagation is typically handled much better at thread `join()` boundaries.
A panic in single-threaded, non-parallel code will either terminate the program or be recovered cleanly, so the potential for side effects to be silently observed in a way that breaks invariants is unique to Mutex<>. This is the reason for mutex poisoning.
I fail to see that there is any material difference. Whether you catch-unwind within a single thread or in a separate thread such that the panic can be resumed on join makes zero difference.
Heck, you can have Drop impls observing the state while unwinding.
A true panic-safe data structure requires serious thought, and mutex poisoning does nothing here - it is neither necessary nor sufficient.
This is a false dichotomy. Not every technique needs to work in all cases in order to be useful.
This seems analogous to arguing that because seat belts don't save the lives of all people involved in car crashes, and they're kind of annoying, then they shouldn't be factory-standard.
This is a case of a feature that is actively harmful for the things it tries to prevent, because it increases the risk in practice of panics "spreading" throughout a system, even after the programmer thought she had finished handling it, and because it gives a false impression what kind of guarantee you actually have.
This is exactly the problem. Poison is enough to be painful but not enough to fully solve the problem.
> Heck, you can have Drop impls observing the state while unwinding.
Yeah, this is really painful and regularly forgotten. And one reason it'd be nice to not have unwinding.
I understand what you mean, but what you're saying has not been true for me in practice. Mutexes absolutely are used to uphold invariants in a way that plain `&mut T` much less often is.
There's something to be said here about what I've sometimes called the cancellation blast radius. The issues with cancellation happen when the data corruption/invariant violation is externally visible (if the corrupt data is torn down, who cares.) Mutexes make data corruption externally visible very often.
In projects I've worked on, this just hasn't been the case. Mutexes, especially in Rust, can grant you a `&mut T` when what you have is `&Mutex<T>`, and that's it - failing to uphold invariants in the API surface of `T` is a bug whether or not it lives inside a mutex.
Lots of data structures need to care about panic-safety. Inserting a node in a tree must leave the tree in a valid state if allocating memory for the new node fails, for example. All of that is completely orthogonal to whether or not the data structure is also observable from multiple threads behind a mutex, and I would argue especially in the case of mutex, whose purpose it is to make an object usable from multiple threads as-if they had ownership.
Acknowledging that panic safety is a real issue with data structures that mutex poisoning does not solve, I don't think we're going to agree on anything else here, unfortunately. We probably have entirely different experiences writing software -- mutex poisoning is very valuable in higher-level code.
That’s not surprising to me, but it’s not much of an argument for changing the default to be less safe. Most people want poisoning to propagate fatal errors and avoid reading corrupted data, not to recover from panics.
Edit: isn’t that an argument not to change the default? If people were recovering from poison a lot and that was painful, that’s one thing. But if people aren’t doing that, why is this a problem?
Because right now everyone writes `.lock().unwrap()` everywhere without really thinking about it, and it just makes Mutex more painful to work with.
If the issue is that everyone has to write an extra unwrap, then a good step would be to make lock panic automatically in the 2027 edition, and add a lock_or_poison method for the current behavior. But I think removing poisoning altogether from the default mutex, such that it silently unlocks on panic, would be very bad. The silent-unlock behavior is terrible with async cancellations and terrible with panics.
You seem to keep making the implicit assumption that because people are using `unwrap()`, they must not care about the poisoning behavior. I really don't understand where this assumption is coming from. I explicitly want to propagate panics from contexts that hold locks to contexts that take locks. The way to write that is `lock().unwrap()`. I get that some people might write `lock().unwrap()` not because they care about propagating panics, but because they don't care either way and it's easy. But why are you assuming that that's most people?
https://news.ycombinator.com/item?id=46051602
I'm suggesting that the balance of pain to benefit is not working out enough to inflict it on everyone by default. I'm not suggesting it has no value, just not enough to be worth it.
Is that not because there is not much to do, and therefore people use .unwrap() — because crashing is actually quite sane?
Correctness trumps ergonomics, and the default should definitely be poisoning/panicking unless handled. There could definitely be an optional poison-eating mutex, but I argue the current Mutex does the right thing.
To the contrary, the projects I've been part of have had no end of issues related to being cancelled in the middle of a critical section [1]. I consider poisoning to be table stakes for a mutex.
[1] https://sunshowers.io/posts/cancelling-async-rust/#the-pain-...
Well, I mean, if you've made the unfortunate decision to hold a Mutex across await points...?
This is completely banned in all of my projects. I have a 100k+ LOC project running in production, that is heavily async and with pervasive usage of threads and mutexes, and I never had a problem, precisely because I never hold a mutex across an await point. Hell, I don't even use async mutexes - I just use normal synchronous parking lot mutexes (since I find the async ones somewhat pointless). I just never hold them across await points.
As I said in the article, we avoid Tokio mutexes entirely for the exact reason that being cancelled in the middle of a critical section is bad. In Rust, there are two sources of cancellations in the middle of a critical section: async cancellations and panics. Ergo, panicking in the middle of a critical section is also bad, and mutexes ought to detect that and mark their internal state as corrupted as a result.
> Ergo, panicking in the middle of a critical section is also bad, and mutexes ought to detect that and mark their internal state as corrupted as a result.
I fundamentally disagree with this. Panicking in the middle of an operation that is supposed to be atomic is bad. If it's not supposed to be atomic then it's totally fine, just as panicking when you hold a plain old `&mut` is fine. Not every use of a `Mutex` is protecting an atomic operation that depends on not being cancelled for its correctness, and even for those situations where you do it's a better idea to prove that a panic cannot happen (if possible) or gracefully handle the panic.
I really don't see a point of mutex poisoning in most cases. You can either safely panic while you're holding a mutex (because your code doesn't care about atomicity), or you simply write your code in such a way that it's still correct even if you panic (e.g. if you temporarily `.take()` something in your critical section then you write a wrapper which restores it on `Drop` in case of a panic). The only thing poisoning achieves is to accidentally give you denial-of-service CVEs, and is actively harmful when it comes to producing reliable software.
I've written many production Rust services and programs over the years, both sync and async, and in my experience—by far the most common use of mutexes is to temporarily violate invariants that are otherwise upheld while the mutex is unlocked (which I think is what you mean by "atomic"). In some cases invariants can be restored, but in many cases they simply cannot.
Panicking while in the middle of a non-mutex-related &mut T is theoretically bad as well, but in my experience, &mut T temporary invariant violations don't happen nearly as often as corruption of mutex-guarded data.
You might not think you need atomicity, but some function you call that takes in a `&mut T` might actually expect it
If you’re not looking to scope out an atomic section, why are you taking the lock?
Worth noting that this is not `std::mutex` or `parking_lot::mutex` as discussed in the article, but `tokio::sync::Mutex` in cancellable async code.
Correct yes, but my point was about cancellation in the middle of a critical section more generally.
I disagree, lock poisoning is a good way of improving correctness of concurrent code in case of fatal errors. As demonstrated by the benchmarks in this article, it's not very expensive for typical use cases.
In 99% of the cases where one thread has panic'd while holding a lock, you want to panic the thread that attempts to grab the lock. The contents of anything inside the lock is very much undefined and continuing will lead to unpredictable results. So most of the time you just want:
The last 1% is when you want to clean up something even if a panic has occurred. This is usually in an impl Drop situation. It's not much more verbose either.
What is painful is trying to propagate the poison value as an error using `?`. In that case you're probably better off using a match expression, because the usual `.into()` will not play nice with common error-handling crates (thiserror, anyhow); you'd need to implement `From` manually for the error types and drop the contents of the poison error before propagating.
This might be the case for long-running server processes where you have n:m threading with long-running threads and want to keep processing other requests even if one request fails. Although in that case you probably want (or your framework provides) some kind of robustness with `catch_unwind` that will log the errors, respond with HTTP 500 or whatever, and then resume, because that's needed to catch panics from non-mutex-related code.
> poisoning a Mutex is a very convenient avenue for a denial-of-service attack, since a poisoned Mutex will just completely brick a given critical section.
There's a tension between making DoS hard and avoiding RCE vulnerabilities, since the way to avoid an unplanned/bad code state becoming an RCE vulnerability is to crash as quickly and thoroughly as possible when you get into that state.
There are cases where it is useful.
I had a case where, if the mutex was poisoned, it was possible to reset the lock to a safe state (by writing a new value to the locked content).
Or you may want to drop some resource or restart some operation instead of panicking if it is poisoned.
But I agree that the default behavior should be that the user doesn't have to worry about it.
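The "reset to a safe state" recovery described above might be sketched like this (assuming a recent Rust, since `Mutex::clear_poison` was only stabilized in 1.77):

```rust
use std::sync::Mutex;
use std::thread;

fn main() {
    let lock = Mutex::new(0u32);

    // Simulate a thread that corrupts the value and panics mid-update.
    thread::scope(|s| {
        s.spawn(|| {
            let mut guard = lock.lock().unwrap();
            *guard = 999; // half-finished update
            panic!("boom"); // unwinds while holding the guard, poisoning the lock
        })
        .join()
        .ok(); // swallow the panic result so the scope itself doesn't re-panic
    });

    // Recover: take the guard out of the PoisonError and reset the value.
    if lock.is_poisoned() {
        let mut guard = lock.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
        *guard = 0; // write a known-good value
        drop(guard);
        lock.clear_poison(); // future lock() calls succeed normally again
    }

    assert_eq!(*lock.lock().unwrap(), 0);
}
```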
I've dug into this topic in the past and my takeaway for this entire thing was “cool idea, but don't use it in practice”. I.e. just unwrap the lock call's result. If a worker thread panics you should assume your application is done for. Some people even recommend setting panic=abort for release builds, in which case you won't even be able to catch those panics to begin with.
I mean, think about the actual use cases here. One of my threads just panicked. Does it make sense to continue running the application? And if you answer yes, this is an error condition that can occur, then it shouldn't panic to begin with and should instead handle errors gracefully, leaving the mutex unpoisoned.
Questions for anyone who is an expert on poisoning in Rust:
Is it safe to ignore poisoned mutexes if and only if the relevant pieces of code are unwind-safe, similar to exception safety in C++? As in, if a panic happens, the relevant pieces of code handle the unwinding safely, thus data is not corrupted, and thus ignoring the poison is fine?
FYI, Apple platforms have had futexes since iOS 17.4 and macOS 14.4: https://developer.apple.com/documentation/os/synchronization...
Unfortunately, you sacrifice the priority inversion avoidance you would otherwise get with os_unfair_lock.
My personal recommendation: unless you are writing performance-sensitive code*, don't use mutexes at all, because they are too low-level an abstraction. Use MPSC queues for example, or something like RCU. I find these abstractions much more developer friendly.
*: You may be, since you are using Rust.
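For instance, the MPSC-queue style mentioned above can be sketched with the standard library's channel: several producers send messages, and a single consumer owns the state outright, so no lock is ever taken around it.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    // Multiple producers send work items through the channel.
    for i in 0..4 {
        let tx = tx.clone();
        thread::spawn(move || tx.send(i).unwrap());
    }
    drop(tx); // close the original sender so the receiver iterator terminates

    // The consumer exclusively owns the accumulated state; no mutex needed.
    let total: i32 = rx.iter().sum();
    assert_eq!(total, 0 + 1 + 2 + 3);
}
```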
Channels and RCU can often be better for performance as well. I have run into scaling issues due to contention way too many times. Sometimes just because two things were in different parts of the same cache line.
Sharing as little mutable state as possible is often the best way to scale to a large number of CPU cores. RCU can help with that if you have data that is mostly read but rarely written. If your workload is balanced or even write heavy RCU is probably not a good option.
I have even had to redesign because I had too much contention on plain old atomic reference counters, almost 30 % of my runtime was reference counting of a small number of specific Arcs, and hardware performance counters pointed at cache line contention. I redesigned that code to avoid Arcs entirely which also allowed some additional optimisations, resulting in cutting approximately 40 % of my runtime in total.
So, each specific use case should be approached individually if you care about performance. And always profile and benchmark. If the code isn't performance critical, by all means do what you think is most maintainable. But measure, because you are probably wrong about what part of your code is the bottleneck unless you measure.
Very interesting, I want to learn more. How do you diagnose contention on atomic counters? And how do you diagnose cache line contention?
And how do you rewrite that code away from Arc-s?
I have found that mutex solutions are more maintainable and easier to amend without big redesigns compared with channels or RCU.
Consider a simple single-producer, single-consumer case. While one can use bounded channels to implement back-pressure, in practice, when one wants to either drop messages or apply back-pressure based on message priority, any solution involving channels will lead to a pile of complex multi-channel code and selects. With a mutex, the change is a straightforward replacement of a queue by a priority queue and an extra if inside the mutex.
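A rough sketch of that "priority queue plus an extra if inside the mutex" idea (the capacity and the drop-newest policy are arbitrary choices for illustration, not from the comment):

```rust
use std::collections::BinaryHeap;
use std::sync::Mutex;

const CAPACITY: usize = 4; // hypothetical bound that triggers back-pressure

// Producer side: the "extra if inside the mutex" — once the queue is
// full, drop the incoming message instead of blocking.
fn push(queue: &Mutex<BinaryHeap<u32>>, priority: u32) -> bool {
    let mut q = queue.lock().unwrap();
    if q.len() >= CAPACITY {
        return false; // message dropped; caller may retry or back off
    }
    q.push(priority);
    true
}

// Consumer side: always take the highest-priority message first.
fn pop(queue: &Mutex<BinaryHeap<u32>>) -> Option<u32> {
    queue.lock().unwrap().pop()
}

fn main() {
    let queue = Mutex::new(BinaryHeap::new());
    for p in [1, 5, 3, 2, 9] {
        push(&queue, p);
    }
    // 9 was dropped because the queue was already full; max of the rest is 5.
    assert_eq!(pop(&queue), Some(5));
}
```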
A channel can be backed by a priority queue if you wish. It’s just an abstraction. The channel internally probably uses mutexes too; it’s just that it’s helpful not to see mutexes in application code.
Surely one can abstract priority queue with mutexes into own data structure. However it will contain enough application-specific logic so a chance of reuse will be slim. So by directly working with mutexes one will have simpler code overall that can still be more easy to adapt to changing requirements.
In general the problem with channels is that they are not flexible enough while being rather abstract. A better abstraction is message passing with one message queue per thread, as in Erlang. IMO it can cover more cases before one needs mutexes. But even with that, proper back-pressure and rate limiting are hard.
A mutex is a natural abstraction when there is exactly one of them. You have a bunch of tasks doing their own stuff, with shared mutable state behind the mutex. When you start thinking about using two mutexes, other abstractions often become more convenient.
The C++ example given in the article is not correct.
In C++ a mutex can wrap the object being protected.
Although there are owning mutexes for C++, the C++ standard library does not provide such a thing. So the `std::mutex` used in the example is not an owning mutex, and that example works and does what was described.
One reason not to provide the owning mutex in C++ is that it isn't able to deliver similar guarantees to Rust because its type system isn't strong enough. Rust won't let you accidentally keep a reference to the protected object after unlocking, C++ will for example.
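The Rust guarantee mentioned here can be illustrated with a small sketch (hypothetical names; the commented-out line is the kind of code the borrow checker rejects):

```rust
use std::sync::Mutex;

fn main() {
    let account = Mutex::new(100u32); // the balance lives *inside* the mutex

    {
        let mut balance = account.lock().unwrap();
        *balance += 50;
        // `balance` is a guard borrowing from `account`; the compiler will not
        // let a reference derived from it outlive the guard, e.g.:
        // drop(balance); *balance += 1; // error: value used after move
    } // guard dropped here, mutex unlocked automatically

    assert_eq!(*account.lock().unwrap(), 150);
}
```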
I am not very familiar with C++'s API, but I believe that you are right that the C++ example in the article is incorrect, though for a different reason, namely that RAII is also supported in C++.
In C++, a class like std::lock_guard also provides "Automatic unlock". AFAICT, the article argues that only Rust's API provides that.
> In C++, a class like std::lock_guard also provides "Automatic unlock". AFAICT, the article argues that only Rust's API provides that.
The issue isn't automatic unlocking. From the article:
> The problem? Nothing stops you from accessing account without locking the mutex first. The compiler won’t catch this bug.
i.e., a C++ compiler will happily compile code that modifies `account` without taking the lock first. Your lock_guard example suffers from this same issue.
Nothing in the C++ stdlib provides an API that makes it impossible to access `account` without first taking the lock, and while you can write C++ classes that approximate the Rust API you can't quite reach the same level of robustness without external help.
That is a different topic from what I wrote about.
The article wrote:
> Automatic unlock: When you lock, you receive a guard. When the guard goes out of scope, it automatically unlocks. No manual cleanup needed.
And presented Rust as being different from C++ regarding that, and the C++ example was not idiomatic, since it did not use something like std::lock_guard.
I have not addressed the rest of your comment, since it is a different topic, sorry.
Fair point with respect to the separate topic. My apologies.
As for the automatic cleanup bit, perhaps the article is trying to focus purely on the mutex types themselves? Or maybe they included the "when you lock" bit to emphasize that you can't forget to unlock the mutex (i.e., no reliance on unenforced idioms). Hard to say given the brevity/nature of the section, and in the end I think it's not that much of a problem given the general topic of the blogpost.
When comparing languages, posting unidiomatic code, and then making claims based on that unidiomatic code, is generally not fair nor correct.
That's true if the claims hinge on said unidiomatic code. As I said, it's not clear to me that that is the case here.
It seems completely clear. He first gives unidiomatic C++ code, then gives idiomatic Rust code, and differentiates the two based on the code snippets. It is a mistake on his part, and I do not see how it could reasonably be viewed otherwise. It is not a huge mistake, but it is still a clear mistake.
Perhaps it might help to clarify precisely what claim(s) you think are being made?
From my reading, the section (and the article in general, really) is specifically focusing on mutexes, so the observations the article makes are indeed accurate in that respect (i.e., C++'s std::mutex indeed does not have automatic unlocking; you need to use an external construct for that functionality). Now, if the article were talking about locking patterns more generally, I think your criticism would hold more weight, but I think the article is more narrowly focused than that.
For a bit of a more speculative read, I think it's not unreasonable to take the C++ code as a general demonstration of the mutex API "languages other than Rust" use rather than trying to be a more specific comparison of locking patterns in Rust and C++. Consider the preceding paragraph:
> In languages other than Rust, you typically declare a mutex separately from your data, then manually lock it before entering the critical section and unlock it afterward. Here’s how it looks in C++:
I don't think it's unreasonable to read the "it" in the final sentence as "that pattern"; i.e., "Here's what that pattern looks like when written in C++". The example code would be perfectly correct in that case - it shows a mutex declared separately from the data, and it shows that mutex being manually locked before entering the critical section and unlocked after.
I am very sorry, but your arguments here are clearly terrible.
tl;dr: the implementation that is designed for fairness has lower standard deviation under contention, but otherwise performs slightly worse.
Nothing too surprising.
For the Cargo.toml, I get an error: invalid basic string, expected `"` at 10:11 in std/parking_lot_mutexes.