Language Selection

English French German Italian Portuguese Spanish

LWN's Latest Articles (No Paywall) About Linux Kernel

Filed under
Linux
  • Patching until the COWs come home (part 2)

    Part 1 of this series described the copy-on-write (COW) mechanism used to avoid unnecessary copying of pages in memory, then went into the details of a bug in that mechanism that could result in the disclosure of sensitive data. A patch written by Linus Torvalds and merged for the 5.8 kernel appeared to fix that problem without unfortunate side effects elsewhere in the system. But COW is a complicated beast and surprises are not uncommon; this particular story was nowhere near as close to an end as had been thought.

    Torvalds's expectations quickly turned out to be overly optimistic. In August 2020, a bug was reported by Peter Xu; it affected userfaultfd(), which is a subsystem for handling page faults in a user-space process. This mechanism allows the handling process to (among other things) write-protect ranges of memory and be notified of attempts to write to that range. One use case for this feature is to prevent pages from being modified while the monitoring process writes their contents to secondary storage. That write can, however, result in a read-only get_user_pages() (GUP) call on the write-protected pages, which should be fine. Remember, though, that Torvalds's fix worked by changing read-only get_user_pages() calls to look like calls for write access; this was done to force the breaking of COW references on the pages in question. In the userfaultfd() case, that generates an unexpected write fault in the monitoring process, with the result that this process hangs.

    The initial version of Xu's fix went in the direction of more fine-grained rules for breaking COW by GUP, as had been anticipated in the original fix, and added some userfaultfd()-specific handling. But during the discussion, Torvalds instead proposed a completely different approach, which resulted in another patch set from Xu. These patches essentially revert Torvalds's change and abandon the approach of always breaking COW for GUP calls. Instead, do_wp_page(), which handles write faults to a write-protected page, is modified by commit 09854ba94c6a ("mm: do_wp_page() simplification") to more strictly check if the page is shared by multiple processes.

  • Lockless patterns: some final topics

    So far, this series has covered five common lockless patterns in the Linux kernel; those are probably the five that you will most likely encounter when working on Linux. Throughout this series, some details have been left out and some simplifications were made in the name of clarity. In this final installment, I will sort out some of these loose ends and try to answer what is arguably the most important question of all: when should you use the lockless patterns that have been described here?

    [...]
    ions are. In these cases, applying lockless techniques to the fast path can be valuable.

    For example, you could give each thread a queue of requests from other threads and manage them through single-consumer linked lists. Perhaps you can trigger the processing of requests using the cross-thread notification pattern from the article on full memory barriers. However, these techniques only make sense because the design of the whole system supports them. In other words, in a system that is designed to avoid scalability bottlenecks, common sub-problems tend to arise and can often be solved efficiently using the patterns that were presented here.

    When seeking to improve the scalability of a system with lockless techniques, it is also important to distinguish between lock-free and wait-free algorithms. Lock-free algorithms guarantee that the system as a whole will progress, but do not guarantee that each thread will progress; lock-free algorithms are rarely fair, and if the number of operations per second exceeds a certain threshold, some threads might end up failing so often that the result is a livelock. Wait-free algorithms additionally ensure per-thread progress. Usually this comes with a significant price in terms of complexity, though not always; for example message passing and cross-thread notification are both wait-free.

    Looking at the Linux llist primitives, llist_add() is lock-free; on the consumer side, llist_del_first() is lock-free, while llist_del_all() is wait-free. Therefore, llist may not be a good choice if many producers are expected to contend on calls to llist_add(); and using llist_del_all() is likely better than llist_del_first() unless constant-time consumption is an absolute requirement. For some architectures, the instruction set does not allow read-modify-write operations to be written as wait-free code; if that is the case, llist_del_all() will only be lock-free (but still preferable, because it lets the consumer perform fewer accesses to the shared data structure).

    In any case, the definitive way to check the performance characteristics of your code is to benchmark it. Intuition and knowledge of some well-known patterns can guide you in both the design and the implementation phase, but be ready to be proven wrong by the numbers.

  • GDB and io_uring

    A problem reported when attaching GDB to programs that use io_uring has led to a flurry of potential solutions, and one that was merged into Linux 5.12-rc5. The problem stemmed from a change made in the 5.12 merge window to how the threads used by io_uring were created, such that they became associated with the process using io_uring. Those "I/O threads" were treated specially in the kernel, but that led to the problem with GDB (and likely other ptrace()-using programs). The solution is to treat them like other threads because it turned out that trying to make them special caused more problems than it solved.

    Stefan Metzmacher reported the problem to the io-uring mailing list on March 20. He tried to attach GDB to the process of a program using io_uring, but the debugger went "into an endless loop because it can't attach to the io_threads". PF_IO_WORKER threads are used by io_uring for operations that might block; he followed up the bug report with two patch sets that would hide these threads in various ways. The idea behind hiding them is that if GDB cannot see the threads, it will not attempt to attach to them. Prior to 5.12, the threads existed but were not associated with the io_uring-using process, so GDB would not see them.

    It is, of course, less than desirable for developers to be unable to run a debugger on code that uses io_uring, especially since io_uring support in their application is likely to be relatively new, thus it may need more in the way of debugging. The maintainer of the io_uring subsystem, Jens Axboe, quickly stepped in to help Metzmacher solve the problem. Axboe posted a patch set that included a way to hide the PF_IO_WORKER threads, along with some tweaks to the signal handling for these threads; in particular, he removed the ability for them to receive signals at all.

More in Tux Machines

KDE Frameworks 5.81 Released with KHamburgerMenu, Various Improvements

The biggest new feature in the KDE Frameworks 5.81 release is the implementation of a new, custom hamburger menu called KHamburgerMenu, which will be shown on QWidgets-based apps whenever the main menubar is hidden. The KDE Project plans to adopt the KHamburgerMenu for all KDE apps as it offers several advantages, including an alternative app menu in case you hide the default menubar by accident, more freedom when you want to take full advantage of the maximum vertical space, more compact design with only relevant menu items, as well as support for relocating, renaming, removing, or even changing its icon. Read more

today's leftovers

  • Radeon Vulkan Driver Adds Option Of Rendering Less For ~30% Greater Performance - Phoronix

    If your current Vulkan-based Radeon Linux gaming performance isn't cutting it and a new GPU is out of your budget or you have been unable to find a desired GPU upgrade in stock, the Mesa RADV driver has added an option likely of interest to you... Well, at least moving forward with this feature being limited to RDNA2 GPUs for now. RADV as Mesa's Radeon Vulkan driver has added an option to allow Variable Rate Shading (VRS) via an environment variable override. This RADV addition is inspired by the likes of NVIDIA DLSS for trading rendering quality for better performance but in its current form is a "baby step" before being comparable to DLSS quality and functionality.

  • Bas Nieuwenhuizen: A First Foray into Rendering Less

    In RADV we just added an option to speed up rendering by rendering less pixels. These kinds of techniques have become more common over the past decade with techniques such as checkerboarding, TAA based upscaling and recently DLSS. Fundamentally all they do is trading off rendering quality for rendering cost and many of them include some amount of postprocessing to try to change the curve of that tradeoff. Most notably DLSS has been widly successful at that to the point many people claim it is barely a quality regression. Of course increasing GPU performance by up to 50% or so with barely any quality regression seems like must have and I think it would be pretty cool if we could have the same improvements on Linux. I think it has the potential to be a game changer, making games playable on APUs or playing with really high resolution or framerates on desktops. [...] VRS is by far the easiest thing to make work in almost all games. Most alternatives like checkerboarding, TAA and DLSS need modified render target size, significant shader fixups, or even a proprietary integration with games. Making changes that deeply is getting more complicated the more advanced the game is. If we want to reduce render resolution (which would be a key thing in e.g. checkerboarding or DLSS) it is very hard to confidently tie all resolution dependent things together. For example a big cost for some modern games is raytracing, but the information flow to the main render targets can be very hard to track automatically and hence such a thing would require a lot of investigation or a bunch of per game customizations.

  • Dota 2 version 7.29 is out with the new Dawnbreaker melee hero

    Valve has put out a major upgrade for their popular free to play MOBA with Dota 2 getting Dawnbreaker. This brand new hero is focused on melee, with a low-skill entry level so it should be suitable for a lot of players. You can see a dedicated hero page for Dawnbreaker here. "Dawnbreaker shines in the heart of battle, happily crushing enemies with her celestial hammer and healing nearby allies. She revels in hurling her hammer through multiple foes and then converging with it in a blazing wake, always waiting to tap her true cosmic power to fly to the aid of her teammates — eager to rout her enemies on the battlefield no matter where they are."

  • Grape times ahead with the release of Wine 6.6 noting plenty of fixes

    No wine-ing about the puns please. Jokes aside, the tasty compatibility tech that is Wine has a new development release available today with Wine 6.6. For newer readers and Linux users here's a refresher - Wine is a compatibility layer built for operating systems like Linux, macOS and BSD. The idea is to allow other platforms to run games and applications only built and supported for Windows. It's also part of what makes up Steam Play Proton. Once a year or so, a new stable release is made.

  • Friday’s Fedora Facts: 2021-14

    Here’s your weekly Fedora report. Read what happened this week and what’s coming up. Your contributions are welcome (see the end of the post)! The Final freeze is underway. The F34 Final Go/No-Go meeting is Thursday. I have weekly office hours on Wednesdays in the morning and afternoon (US/Eastern time) in #fedora-meeting-1. Drop by if you have any questions or comments about the schedule, Changes, elections, or anything else. See the upcoming meetings for more information.

  • A developer goes to the Masters: Day 1 inside the digital ops center [Ed: IBM is OK with the word "Master" again, contrary to spin]
  • Rancher Platform Partner, Weka delivers Stateful Storage for Containers at Scale

    Containers rose to the mainstream primarily due to workload portability and immutability advantages. Kubernetes became the primary orchestration tool and was initially supporting stateless applications, commonly referred to as the cattle vs. pets approach. However, data-centric applications need stateful-ness while still leveraging the cattle vs. pet approach. Microservices, Containers, and Kubernetes are now moving mainstream as increasingly more stateful applications are adopting them.

  • SUSE for your agile data platform, featuring Microsoft SQL Server[Ed: SUSE is just a worthless proprietary software reseller for SAP and Microsoft (their salesperson from SAP signing anti-RMS petition makes perfect sense and proves us correct about SUSE's motivations)]
  • What's the point of open source without contributors? Turns out, there are several [Ed: Mac Asay isn't even using it himself, just lecturing others what to do while working for Jeff Bezos]
  • Am I FLoCed? A New Site to Test Google's Invasive Experiment

    FLoC is a terrible idea that should not be implemented. Google’s experimentation with FLoC is also deeply flawed . We hope that this site raises awareness about where the future of Chrome seems to be heading, and why it shouldn't.

    FLoC takes most of your browsing history in Chrome, and analyzes it to assign you to a category or “cohort.” This identification is then sent to any website you visit that requests it, in essence telling them what kind of person Google thinks you are. For the time being, this ID changes every week, hence leaking new information about you as your browsing habits change. You can read a more detailed explanation here .

    Because this ID changes, you will want to visit https://amifloced.org often to see those changes.

  • The Brave browser basics: what it does, how it differs from rivals

    Boutique browsers try to scratch out a living by finding a niche underserved by the usual suspects. Brave is one of those browsers.

    Brave has gotten more attention than most alternate browsers, partly because a co-founder was one of those who kick-started Mozilla's Firefox, partly because of its very unusual — some say parasitical — business model.

Devices/Embedded Hardware

  • 3.5-inch SBC features Comet Lake-S

    Aaeon’s 3.5-inch Linux-ready “GENE-CML5” SBC supplies an up to octa-core 10th Gen Core CPU plus up to 64GB DDR4, 2x SATA, 2x GbE, 2x USB 3.2 Gen2, DP, VGA, M.2 M-key, and PCIe x4. Aaeon has posted a preliminary product page for what appears to be the first 3.5-inch SBC built around Intel’s 10th Gen Comet Lake-S. In fact, this is one of the first Comet Lake SBCs of any kind, following a few early entries like Portwell’s WADE-8212 Mini-ITX board.

  • Play your retro console on a modern TV
  • Olimex RP2040-PICO-PC “computer” to feature RP2040-Py Raspberry Pi Pico compatible module

    We previously wrote it was possible to create a Raspberry Pi RP2040 board with HDMI using DVI and programmable IOs to output video up to 640×480 at 60 Hz with the microcontroller’s Cortex-M0+ cores clocked at 252 MHz. At the time, we also noted Olimex was working on such a board with RP2040-PICO-PC designed to create a small Raspberry Pi RP2040 computer with HDMI/DVI video output. The Bulgarian company has now manufactured the first prototype, but due to supply issues with the Raspberry Pi Pico board, they also designed their own RP2040-PICO module since they’ve got a reel of Raspberry Pi RP2040 microcontrollers.

  • Our most complex Open Source Hardware board made with KiCad – the octa core iMX8 Quad Max – Tukhla is completely routed and now on prototype production

    We started this project June-July 2020. Due to the Covid19 the development took 10 months although only 6 month of active work was done, due to lock downs, ill developers and so on troubles.

    Now the board is completely routed and has these features: [...]

Programming Leftovers

  • Open Source Software Leader the Eclipse Foundation Launches the Adoptium Working Group for Multi-Vendor Delivery of Java Runtimes for Enterprises
  • AWS's Shane Miller to head the newly created Rust Foundation

    Miller, who leads the Rust Platform team for AWS, has been a software engineer for almost 30 years. At AWS, Miller has been a leader in open-source strategic initiatives and software engineering and delivery. Miller's Rust Platform team includes Rust language and compiler maintainers and contributors and developers on the Tokio runtime for writing reliable asynchronous applications with Rust. Under Miller's leadership, the AWS Rust team is crafting optimizations and tools for the features that engineers will use to build and operate services which take full advantage of Rust's performance and safety.

  • Inkscape compiled in OpenEmbedded

    Cross-compiling can be a challenge with some packages, and some of the big ones, such as SeaMonkey, LibreOffice and Inkscape, I have compiled in a running EasyOS (with the "devx" SFS loaded). I have previously compiled LibreOffice in OE, see the Pyro series. But it was a lot of work.

  • Felix Häcker: New Shortwave release

    Ten months later, after 14.330 added and 8.634 deleted lines, Shortwave 2.0 is available! It sports new features, and comes with the well known improvements, and bugfixes as always. [...] Shortwave has always been designed to handle any screen size from the beginning. In version 2.0 we have been able to improve this even further. There is now a compact mini player for desktop screens. This still offers access to the most important functions in a tiny window.

  • 5 signs you're a groff programmer

    I first discovered Unix systems in the early 1990s, when I was an undergraduate at university. I liked it so much that I replaced the MS-DOS system on my home computer with the Linux operating system. One thing that Linux didn't have in the early to mid-1990s was a word processor. A standard office application on other desktop operating systems, a word processor lets you edit text easily. I often used a word processor on DOS to write my papers for class. I wouldn't find a Linux-native word processor until the late 1990s. Until then, word processing was one of the rare reasons I maintained dual-boot on my first computer, so I could occasionally boot back into DOS to write papers. Then I discovered that Linux provided kind of a word processor. GNU troff, better known as groff, is a modern implementation of a classic text processing system called troff, short for "typesetter roff," which is an improved version of the nroff system. And nroff was meant to be a new implementation of the original roff (which stood for "run off," as in to "run off" a document).