LWN's Latest Articles (No Paywall) About Linux Kernel

  • Patching until the COWs come home (part 2)

    Part 1 of this series described the copy-on-write (COW) mechanism used to avoid unnecessary copying of pages in memory, then went into the details of a bug in that mechanism that could result in the disclosure of sensitive data. A patch written by Linus Torvalds and merged for the 5.8 kernel appeared to fix that problem without unfortunate side effects elsewhere in the system. But COW is a complicated beast and surprises are not uncommon; this particular story was nowhere near as close to an end as had been thought.

    Torvalds's expectations quickly turned out to be overly optimistic. In August 2020, a bug was reported by Peter Xu; it affected userfaultfd(), which is a subsystem for handling page faults in a user-space process. This mechanism allows the handling process to (among other things) write-protect ranges of memory and be notified of attempts to write to that range. One use case for this feature is to prevent pages from being modified while the monitoring process writes their contents to secondary storage. That write can, however, result in a read-only get_user_pages() (GUP) call on the write-protected pages, which should be fine. Remember, though, that Torvalds's fix worked by changing read-only get_user_pages() calls to look like calls for write access; this was done to force the breaking of COW references on the pages in question. In the userfaultfd() case, that generates an unexpected write fault in the monitoring process, with the result that this process hangs.

    The initial version of Xu's fix went in the direction of more fine-grained rules for breaking COW by GUP, as had been anticipated in the original fix, and added some userfaultfd()-specific handling. But during the discussion, Torvalds instead proposed a completely different approach, which resulted in another patch set from Xu. These patches essentially revert Torvalds's change and abandon the approach of always breaking COW for GUP calls. Instead, do_wp_page(), which handles write faults to a write-protected page, is modified by commit 09854ba94c6a ("mm: do_wp_page() simplification") to more strictly check if the page is shared by multiple processes.

  • Lockless patterns: some final topics

    So far, this series has covered five common lockless patterns in the Linux kernel; those are probably the five that you will most likely encounter when working on Linux. Throughout this series, some details have been left out and some simplifications were made in the name of clarity. In this final installment, I will sort out some of these loose ends and try to answer what is arguably the most important question of all: when should you use the lockless patterns that have been described here?

    [...]
    ions are. In these cases, applying lockless techniques to the fast path can be valuable.

    For example, you could give each thread a queue of requests from other threads and manage them through single-consumer linked lists. Perhaps you can trigger the processing of requests using the cross-thread notification pattern from the article on full memory barriers. However, these techniques only make sense because the design of the whole system supports them. In other words, in a system that is designed to avoid scalability bottlenecks, common sub-problems tend to arise and can often be solved efficiently using the patterns that were presented here.

    When seeking to improve the scalability of a system with lockless techniques, it is also important to distinguish between lock-free and wait-free algorithms. Lock-free algorithms guarantee that the system as a whole will progress, but do not guarantee that each thread will progress; lock-free algorithms are rarely fair, and if the number of operations per second exceeds a certain threshold, some threads might end up failing so often that the result is a livelock. Wait-free algorithms additionally ensure per-thread progress. Usually this comes with a significant price in terms of complexity, though not always; for example, message passing and cross-thread notification are both wait-free.

    Looking at the Linux llist primitives, llist_add() is lock-free; on the consumer side, llist_del_first() is lock-free, while llist_del_all() is wait-free. Therefore, llist may not be a good choice if many producers are expected to contend on calls to llist_add(); and using llist_del_all() is likely better than llist_del_first() unless constant-time consumption is an absolute requirement. For some architectures, the instruction set does not allow read-modify-write operations to be written as wait-free code; if that is the case, llist_del_all() will only be lock-free (but still preferable, because it lets the consumer perform fewer accesses to the shared data structure).
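    The difference between the two progress guarantees can be seen in a minimal user-space sketch written with C11 atomics. This is not the kernel's llist code: the names struct node, struct stack_head, push(), pop_all(), and sum_list() are hypothetical stand-ins chosen for illustration. The producer side uses a compare-and-swap retry loop, in the spirit of llist_add(), while the consumer side detaches the whole list with a single atomic exchange, in the spirit of llist_del_all().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's llist types. */
struct node {
    struct node *next;
    int value;
};

struct stack_head {
    _Atomic(struct node *) first;
};

/*
 * Lock-free push: if another producer changes the head first, the
 * compare-and-swap fails, `old` is reloaded, and the loop retries.
 * Some thread always succeeds, but a given caller may retry many times.
 */
static void push(struct stack_head *h, struct node *n)
{
    assert(n != NULL);
    struct node *old = atomic_load_explicit(&h->first, memory_order_relaxed);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak_explicit(
                 &h->first, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/*
 * Detach the entire list with one atomic exchange. There is no retry
 * loop in the source; whether the operation is truly wait-free depends
 * on how the target architecture implements the exchange.
 */
static struct node *pop_all(struct stack_head *h)
{
    return atomic_exchange_explicit(&h->first, (struct node *)NULL,
                                    memory_order_acquire);
}

/* Walk a detached list; it is private to the consumer at this point. */
static int sum_list(const struct node *n)
{
    int total = 0;
    for (; n != NULL; n = n->next)
        total += n->value;
    return total;
}
```

    After pop_all() returns, the consumer owns every node it received and can walk them without further atomic operations. Nodes come back in last-in, first-out order, so a consumer that needs arrival order would have to reverse the detached list first.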

    In any case, the definitive way to check the performance characteristics of your code is to benchmark it. Intuition and knowledge of some well-known patterns can guide you in both the design and the implementation phase, but be ready to be proven wrong by the numbers.

  • GDB and io_uring

    A problem reported when attaching GDB to programs that use io_uring has led to a flurry of potential solutions, and one that was merged into Linux 5.12-rc5. The problem stemmed from a change made in the 5.12 merge window to how the threads used by io_uring were created, such that they became associated with the process using io_uring. Those "I/O threads" were treated specially in the kernel, but that led to the problem with GDB (and likely other ptrace()-using programs). The solution is to treat them like other threads because it turned out that trying to make them special caused more problems than it solved.

    Stefan Metzmacher reported the problem to the io-uring mailing list on March 20. He tried to attach GDB to the process of a program using io_uring, but the debugger went "into an endless loop because it can't attach to the io_threads". PF_IO_WORKER threads are used by io_uring for operations that might block; he followed up the bug report with two patch sets that would hide these threads in various ways. The idea behind hiding them is that if GDB cannot see the threads, it will not attempt to attach to them. Prior to 5.12, the threads existed but were not associated with the io_uring-using process, so GDB would not see them.

    It is, of course, less than desirable for developers to be unable to run a debugger on code that uses io_uring, especially since io_uring support in their application is likely to be relatively new, thus it may need more in the way of debugging. The maintainer of the io_uring subsystem, Jens Axboe, quickly stepped in to help Metzmacher solve the problem. Axboe posted a patch set that included a way to hide the PF_IO_WORKER threads, along with some tweaks to the signal handling for these threads; in particular, he removed the ability for them to receive signals at all.

More in Tux Machines

digiKam 7.7.0 is released

After three months of active maintenance and another bug triage, the digiKam team is proud to present version 7.7.0 of its open source digital photo manager. See below the list of most important features coming with this release.

Dilution and Misuse of the "Linux" Brand

Samsung, Red Hat to Work on Linux Drivers for Future Tech

The metaverse is expected to uproot system design as we know it, and Samsung is one of many hardware vendors re-imagining data center infrastructure in preparation for a parallel 3D world. Samsung is working on new memory technologies that provide faster bandwidth inside hardware for data to travel between CPUs, storage and other computing resources. The company also announced it was partnering with Red Hat to ensure these technologies have Linux compatibility.

today's howtos

  • How to install go1.19beta on Ubuntu 22.04 – NextGenTips

    In this tutorial, we are going to explore how to install Go on Ubuntu 22.04. Golang is an open-source programming language that is easy to learn and use. It has built-in concurrency and a robust standard library, and it is designed for building fast, reliable, and efficient software at scale. Its concurrency mechanisms make it easy to write programs that get the most out of multicore and networked machines, while its novel type system enables flexible and modular program construction. Go compiles quickly to machine code and has the convenience of garbage collection and the power of run-time reflection. In this guide, we are going to learn how to install golang 1.19beta on Ubuntu 22.04. Go 1.19beta1 is not yet released. Much of the documentation is still a work in progress.

  • molecule test: failed to connect to bus in systemd container - openQA bites

    Ansible Molecule is a project to help you test your ansible roles. I’m using molecule for automatically testing the ansible roles of geekoops.

  • How To Install MongoDB on AlmaLinux 9 - idroot

    In this tutorial, we will show you how to install MongoDB on AlmaLinux 9. For those of you who didn’t know, MongoDB is a high-performance, highly scalable document-oriented NoSQL database. Unlike SQL databases, where data is stored in rows and columns inside tables, MongoDB structures data in a JSON-like format inside records that are referred to as documents. The open-source nature of MongoDB makes it an ideal candidate for almost any database-related project. This article assumes you have at least basic knowledge of Linux, know how to use the shell, and most importantly, you host your site on your own VPS. The installation is quite simple and assumes you are running in the root account; if not, you may need to add ‘sudo‘ to the commands to get root privileges. I will show you the step-by-step installation of the MongoDB NoSQL database on AlmaLinux 9. You can follow the same instructions for CentOS and Rocky Linux.

  • An introduction (and how-to) to Plugin Loader for the Steam Deck. - Invidious
  • Self-host a Ghost Blog With Traefik

    Ghost is a very popular open-source content management system. It started as an alternative to WordPress and went on to become an alternative to Substack by focusing on memberships and newsletters. The creators of Ghost offer managed Pro hosting, but it may not fit everyone's budget. Alternatively, you can self-host it on your own cloud servers. On Linux Handbook, we already have a guide on deploying Ghost with Docker in a reverse proxy setup. Instead of an Nginx reverse proxy, you can also use Traefik with Docker. It is a popular open-source cloud-native application proxy, API gateway, edge router, and more. I use Traefik to secure my websites using an SSL certificate obtained from Let's Encrypt. Once deployed, Traefik can automatically manage your certificates and their renewals. In this tutorial, I'll share the necessary steps for deploying a Ghost blog with Docker and Traefik.