Language Selection

English French German Italian Portuguese Spanish

Kernel: LWN and Phoronix Articles About Latest Discussions and Linux Developments

Filed under
Linux
  • Filesystem UID mapping for user namespaces: yet another shiftfs

    The idea of an ID-shifting virtual filesystem that would remap user and group IDs before passing requests through to an underlying real filesystem has been around for a few years but has never made it into the mainline. Implementations have taken the form of shiftfs and shifting bind mounts. Now there is yet another approach to the problem under consideration; this one involves a theoretically simpler approach that makes almost no changes to the kernel's filesystem layer at all.

    ID-shifting filesystems are meant to be used with user namespaces, which have a number of interesting characteristics; one of those is that there is a mapping between user IDs within the namespace and those outside of it. Normally this mapping is set up so that processes can run as root within the namespace without giving them root access on the system as a whole. A user namespace could be configured so that ID zero inside maps to ID 10000 outside, for example; ranges of IDs can be set up in this way, so that ID 20 inside would be 10020 outside. User namespaces thus perform a type of ID shifting now.

    In systems where user namespaces are in use, it is common to set them up to use non-overlapping ranges of IDs as a way of providing isolation between containers. But often complete isolation is not desired. James Bottomley's motivation for creating shiftfs was to allow processes within a user namespace to have root access to a specific filesystem. With the current patch set, instead, author Christian Brauner describes a use case where multiple containers have access to a shared filesystem and need to be able to access that filesystem with the same user and group IDs. Either way, the point is to be able to set up a mapping for user and group IDs that differs from the mapping established in the namespace itself.

  • Keeping secrets in memfd areas

    Back in November 2019, Mike Rapoport made the case that there is too much address-space sharing in Linux systems. This sharing can be convenient and good for performance, but in an era of advanced attacks and hardware vulnerabilities it also facilitates security problems. At that time, he proposed a number of possible changes in general terms; he has now come back with a patch implementing a couple of address-space isolation options for the memfd mechanism. This work demonstrates the sort of features we may be seeing, but some of the hard work has been left for the future.
    Sharing of address spaces comes about in a number of ways. Linux has traditionally mapped the kernel's address space into every user-space process; doing so improves performance in a number of ways. This sharing was thought to be secure for years, since the mapping doesn't allow user space to actually access that memory. The Meltdown and Spectre hardware bugs, though, rendered this sharing insecure; thus kernel page-table isolation was merged to break that sharing.

    Another form of sharing takes place in the processor's memory caches; once again, hardware vulnerabilities can expose data cached in this shared area. Then there is the matter of the kernel's direct map: a large mapping (in kernel space) that contains all of physical memory. This mapping makes life easy for the kernel, but it also means that all user-space memory is shared with the kernel. In other words, an attacker with even a limited ability to run code in the kernel context may have easy access to all memory in the system. Once again, in an era of speculative-execution bugs, that is not necessarily a good thing.

  • Revisiting stable-kernel regressions

    Stable-kernel updates are, unsurprisingly, supposed to be stable; that is why the first of the rules for stable-kernel patches requires them to be "obviously correct and tested". Even so, for nearly as long as the kernel community has been producing stable update releases, said community has also been complaining about regressions that make their way into those releases. Back in 2016, LWN did some analysis that showed the presence of regressions in stable releases, though at a rate that many saw as being low enough. Since then, the volume of patches showing up in stable releases has grown considerably, so perhaps the time has come to see what the situation with regressions is with current stable kernels.
    As an example of the number of patches going into the stable kernel updates, consider that, as of 4.9.213, 15,648 patches have been added to the original 4.9 release — that is an entire development cycle worth of patches added to a "stable" kernel. Reviewing all of those to see whether each contains a regression is not practical, even for the maintainers of the stable updates. But there is an automated way to get a sense for how many of those stable-update patches bring regressions with them.

    The convention in the kernel community is to add a Fixes tag to any patch fixing a bug introduced by another patch; that tag includes the commit ID for the original, buggy patch. Since stable kernel releases are supposed to be limited to fixes, one would expect that almost every patch would carry such a tag. In the real world, about 40-60% of the commits to a stable series carry Fixes tags; the proportion appears to be increasing over time as the discipline of adding those tags improves.

  • Finer-grained kernel address-space layout randomization

    The idea behind kernel address-space layout randomization (KASLR) is to make it harder for attackers to find code and data of interest to use in their attacks by loading the kernel at a random location. But a single random offset is used for the placement of the kernel text, which presents a weakness: if the offset can be determined for anything within the kernel, the addresses of other parts of the kernel are readily calculable. A new "finer-grained" KASLR patch set seeks to remedy that weakness for the text section of the kernel by randomly reordering the functions within the kernel code at boot time.

  • Debian discusses how to handle 2038

    At this point, most of the kernel work to avoid the year-2038 apocalypse has been completed. Said apocalypse could occur when time counted in seconds since 1970 overflows a 32-bit signed value (i.e. time_t). Work in the GNU C Library (glibc) and other C libraries is well underway as well. But the "fun" is just beginning for distributions, especially those that support 32-bit architectures, as a recent Debian discussion reveals. One of the questions is: how much effort should be made to support 32-bit architectures as they fade from use and 2038 draws nearer?

    Steve McIntyre started the conversation with a post to the debian-devel mailing list. In it, he noted that Arnd Bergmann, who was copied on the email, had been doing a lot of the work on the kernel side of the problem, but that it is mostly a solved problem for the kernel at this point. McIntyre and Bergmann (not to mention Debian as a whole) are now interested in what is needed to update a complete Linux system, such as Debian, to work with a 64-bit time_t.

    McIntyre said that glibc has been working on an approach that splits the problem up based on the architecture targeted. Those that already have a 64-bit time_t will simply have a glibc that works with that ABI. Others that are transitioning from a 32-bit time_t to the new ABI will continue to use the 32-bit version by default in glibc. Applications on the latter architectures can request the 64-bit time_t support from glibc, but then they (and any other libraries they use) will only get the 64-bit versions of the ABI.

    One thing that glibc will not be doing is bumping its SONAME (major version, essentially); doing so would make it easier to distinguish versions with and without the 64-bit support for 32-bit architectures. The glibc developers do not consider the change to be an ABI break, because applications have to opt into the change. It would be difficult and messy for Debian to change the SONAME for glibc on its own.

  • UEFI Boot Support Published For RISC-V On Linux

    As we've been expecting to happen with the Linux EFI code being cleaned up before the introduction of a new architecture, the RISC-V patches have been posted for bringing up UEFI boot support.

    Western Digital's Atish Patra sent out the patch series on Tuesday for adding UEFI support for the RISC-V architecture. This initial UEFI Linux bring-up is for supporting boot time services while the UEFI runtime service support is still being worked on. This RISC-V UEFI support can work in conjunction with the U-Boot bootloader and depends upon other recent Linux kernel work around RISC-V's Supervisor Binary Interface (SBI).

  • Linux Kernel Seeing Patches For NVIDIA's Proprietary Tegra Partition Table

    As an obstacle for upstreaming some particularly older NVIDIA Tegra devices (namely those running Android) is that they have GPT entry at the wrong location or lacking at all for boot support. That missing or botched GPT support is because those older devices make use of a NVIDIA proprietary/closed-source table format. As such, support for this proprietary NVIDIA Tegra Partition Table is being worked on for the Linux kernel to provide better upstream kernel support on these consumer devices.

    NVIDIA Tegra devices primarily rely on a special partition table format for their internal storage while some also support traditional GPT partitions. Those devices with non-flakey GPT support can boot fine but TegraPT support is being worked on for handling the upstream Linux kernel with the other devices lacking GPT support or where it's at the wrong sector. This issue primarily plagues Tegra 2 and Tegra 3 era hardware like some Google Nexus products (e.g. Nexus 7) while fortunately newer Tegra devices properly support GPT.

  • Intel Continues Bring-Up Of New Gateway SoC Architecture On Linux, ComboPHY Driver

    Besides all the usual hardware enablement activities with the usual names by Intel's massive open-source team working on the Linux kernel, one of the more peculiar bring-ups recently has been around the "Intel Gateway SoC" with more work abound for Linux 5.7.

    The Intel Gateway SoC is a seemingly yet-to-be-released product for high-speed network packet processing. The Gateway SoC supports the Intel Gateway Datapath Architecture (GWDPA) and is designed for very fast and efficient network processing. Outside of Linux kernel patches we haven't seen many Intel "Gateway" references to date. Gateway appears to be (or based on) the Intel "Lightning Mountain" SoC we were first to notice and bring attention to last summer when patches began appearing for that previously unknown codename.

More in Tux Machines

Programming: Data Science, David Pena and GSoC

Events and Community: LF, Kubernetes and LibreOffice

  • The Linux Foundation’s First-Ever Virtual Open Source Summit

    The success of The Linux Foundation‘s first virtual summit may well have set the standard for new levels of open source participation. Summit masters closed the virtual doors of the four-day joint gathering on July 2. The event hosted the Open Source Summit + Embedded Linux Conference North America 2020 and ended with more than 4,000 registrants from 109 countries. The online attendance platform offered registrants a virtual experience that provided an immersive experience for event participants, according to The Linux Foundation (LF). That virtual attendance feeling was as close as possible to what they would have received at a face-to-face event, Kristin O’Connell, director of event marketing at The Linux Foundation, told LinuxInsider. One of the newcomers in technical trendsetting at this first virtual conference was the FinOps Foundation. The FinOps Foundation includes 1,500 individual members across the globe, representing more than 500 companies with more than US$1 billion in revenue each. In the same way that DevOps revolutionized development by breaking down silos and increasing agility, FinOps increases the business value of cloud by bringing together technology, business, and finance professionals with a new cultural set, knowledge skills and technical processes, LF maintained.

  • Kubernetes from cloud to edge: A US virtual event

    After the huge success of its latest virtual event on Kubernetes for the EMEA region, Canonical brings its expertise to the US with another educational discussion. On July 31st, Canonical’s panel of experts come together to share their insights from years of working with developers and enterprises on various Kubernetes use cases and interact with the audience live. For all the benefits it offers around container orchestration, Kubernetes remains a significantly complex platform to deploy and operate. As developers and enterprises alike turn to Kubernetes for more and more types of use cases, available Kubernetes solutions often fail to meet their exact needs. Canonical’s extensive Kubernetes portfolio is centered around Charmed Kubernetes and MicroK8s, designed to provide full flexibility from cloud to edge in order to facilitate efficient innovation and scaling.

  • Community Member Monday: Khairul Aizat Kamarudzzaman

    I’m from Kuala Lumpur, Malaysia. I started exploring and using FOSS way back in 2004 (Debian, Red Hat, the BSD family and sticking with Ubuntu till today) when I was studying at the university. From there I started exploring and contributing to Ubuntu, which you can find here, and I was accepted to be an Ubuntu Member, Kubuntu Membes (merits as Ubuntu contributor). Finally I was appointed to be part of the Asia Oceania Membership Board. A a FOSS community member, I work in a few IT companies in Malaysia and was one of the engineers working at Open Source Competency Centre (OSCC) under Malaysian Administrative Modernisation and Management Planning Unit (MAMPU). Now I own my own IT company, namely Informology Sdn Bhd. Not stopping there, I continue actively being a FOSS enthusiast in Malaysia. When the cloud computing era came, I was involved by leading Malaysia OpenStack User Group, and I’ve recently been exploring and leading the Endless OS user group for Malaysia.

Android Leftovers

Games: Embr, Darkest Dungeon: The Butcher's Circus, CONSCRIPT, Tilekit, OBS Studio

  • Gig economy firefighter parody 'Embr' has a new game mode

    Embr, a game that pokes fun at the rise of the gig economy by letting anyone tackle fires adds a whole new game mode. With my initial look at the game on Stadia I found it to be pretty amusing, although it does still feel a little basic. This update appears to be the start of Muse Games expanding on the promise of it, putting a new spin on it by combining the firefighting with food delivery.

  • Darkest Dungeon: The Butcher's Circus for Linux is getting really close

    Darkest Dungeon: The Butcher's Circus, the free DLC that adds in an online PvP mode will be coming to Linux really soon going by their latest update. This special PvP mode is entirely separate to the campaign, giving you access to a new set of heroes to fight with. While it's currently only available for Windows officially, Red Hook Studios have been working to iron out all the kinks that came with it before rolling it out to other platforms.

  • WW1 top-down survival horror 'CONSCRIPT' gets a Kickstarter extension

    Currently crowdfunding on Kickstarter, the promising looking top-down WW1 survival horror CONSCRIPT has been granted an extension to their campaign. Originally due to finish on July 16, they've now got until July 23 to hit the $30,000 Australian dollars (about £16,559) goal. "During the First World War, a lone French soldier must navigate twisted trenches, scavenge for limited supplies and solve complex puzzles - all whilst fighting for survival in the midst of mankind’s most brutal and horrifying conflict."

  • Tilekit looks like a great tilemap editor with auto tiling

    Some parts of developing games can end up quite tedious like manually placing down tiles for 2D games, and it seems Tilekit might be able to help with that. Tilekit is a tilemap editor, one that's centred around pattern-based auto tiling. It's quite clever and could end up being a nice time-saver. Tilekit works by using an input map of basic tiles and a set of user-defined rules to create the resultant output map from your tileset.

  • OBS Studio – Stream From Linux Distribution

    OBS Studio is the all-in-one tool for streaming your video in real-time. It is free to use, open-source, and supports all major platforms including Linux. In this article, I will walk you through a tour of OBS Studio. You will learn how easy it is to stream your video from your Linux computer.