Kernel: LWN and Phoronix Articles About the Latest Discussions and Linux Developments

  • Filesystem UID mapping for user namespaces: yet another shiftfs

    The idea of an ID-shifting virtual filesystem that would remap user and group IDs before passing requests through to an underlying real filesystem has been around for a few years but has never made it into the mainline. Implementations have taken the form of shiftfs and shifting bind mounts. Now there is yet another approach to the problem under consideration; this one involves a theoretically simpler approach that makes almost no changes to the kernel's filesystem layer at all.

    ID-shifting filesystems are meant to be used with user namespaces, which have a number of interesting characteristics; one of those is that there is a mapping between user IDs within the namespace and those outside of it. Normally this mapping is set up so that processes can run as root within the namespace without giving them root access on the system as a whole. A user namespace could be configured so that ID zero inside maps to ID 10000 outside, for example; ranges of IDs can be set up in this way, so that ID 20 inside would be 10020 outside. User namespaces thus perform a type of ID shifting now.
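
    The mapping semantics described above are just range arithmetic; a minimal Python sketch follows (the helper name and tuple format are illustrative, mirroring the three-column layout of /proc/&lt;pid&gt;/uid_map rather than any real API):

```python
def map_id(inside_id, ranges):
    """Translate an in-namespace ID to its outside value.

    Each range is (inside_start, outside_start, count), mirroring the
    three columns of /proc/<pid>/uid_map.
    """
    for inside_start, outside_start, count in ranges:
        if inside_start <= inside_id < inside_start + count:
            return outside_start + (inside_id - inside_start)
    raise ValueError("ID not mapped in this namespace")

# A namespace where ID 0 inside maps to 10000 outside, covering 1000 IDs:
uid_map = [(0, 10000, 1000)]
print(map_id(0, uid_map))   # root inside maps to 10000 outside
print(map_id(20, uid_map))  # 20 inside maps to 10020 outside
```

    An ID-shifting filesystem would apply a second, independent mapping of this form on top of the one configured for the namespace itself.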

    In systems where user namespaces are in use, it is common to set them up to use non-overlapping ranges of IDs as a way of providing isolation between containers. But often complete isolation is not desired. James Bottomley's motivation for creating shiftfs was to allow processes within a user namespace to have root access to a specific filesystem. With the current patch set, instead, author Christian Brauner describes a use case where multiple containers have access to a shared filesystem and need to be able to access that filesystem with the same user and group IDs. Either way, the point is to be able to set up a mapping for user and group IDs that differs from the mapping established in the namespace itself.

  • Keeping secrets in memfd areas

    Back in November 2019, Mike Rapoport made the case that there is too much address-space sharing in Linux systems. This sharing can be convenient and good for performance, but in an era of advanced attacks and hardware vulnerabilities it also facilitates security problems. At that time, he proposed a number of possible changes in general terms; he has now come back with a patch implementing a couple of address-space isolation options for the memfd mechanism. This work demonstrates the sort of features we may be seeing, but some of the hard work has been left for the future.
    Sharing of address spaces comes about in a number of ways. Linux has traditionally mapped the kernel's address space into every user-space process; doing so improves performance in a number of ways. This sharing was thought to be secure for years, since the mapping doesn't allow user space to actually access that memory. The Meltdown and Spectre hardware bugs, though, rendered this sharing insecure; thus kernel page-table isolation was merged to break that sharing.

    Another form of sharing takes place in the processor's memory caches; once again, hardware vulnerabilities can expose data cached in this shared area. Then there is the matter of the kernel's direct map: a large mapping (in kernel space) that contains all of physical memory. This mapping makes life easy for the kernel, but it also means that all user-space memory is shared with the kernel. In other words, an attacker with even a limited ability to run code in the kernel context may have easy access to all memory in the system. Once again, in an era of speculative-execution bugs, that is not necessarily a good thing.

  • Revisiting stable-kernel regressions

    Stable-kernel updates are, unsurprisingly, supposed to be stable; that is why the first of the rules for stable-kernel patches requires them to be "obviously correct and tested". Even so, for nearly as long as the kernel community has been producing stable update releases, said community has also been complaining about regressions that make their way into those releases. Back in 2016, LWN did some analysis that showed the presence of regressions in stable releases, though at a rate that many saw as being low enough. Since then, the volume of patches showing up in stable releases has grown considerably, so perhaps the time has come to see what the situation with regressions is with current stable kernels.
    As an example of the number of patches going into the stable kernel updates, consider that, as of 4.9.213, 15,648 patches have been added to the original 4.9 release — an entire development cycle's worth of patches added to a "stable" kernel. Reviewing all of those to see whether each contains a regression is not practical, even for the maintainers of the stable updates. But there is an automated way to get a sense for how many of those stable-update patches bring regressions with them.

    The convention in the kernel community is to add a Fixes tag to any patch fixing a bug introduced by another patch; that tag includes the commit ID for the original, buggy patch. Since stable kernel releases are supposed to be limited to fixes, one would expect that almost every patch would carry such a tag. In the real world, about 40-60% of the commits to a stable series carry Fixes tags; the proportion appears to be increasing over time as the discipline of adding those tags improves.
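
    Measuring that proportion is mechanical: extract the Fixes tags from commit messages and compute the fraction that carry one. A minimal sketch (the sample messages are invented; in practice the input would come from git log):

```python
import re

# A Fixes tag carries the abbreviated commit ID of the buggy patch.
FIXES_RE = re.compile(r"^Fixes:\s*[0-9a-f]{8,40}", re.MULTILINE | re.IGNORECASE)

def fixes_fraction(commit_messages):
    """Return the share of commit messages carrying a Fixes: tag."""
    tagged = sum(1 for msg in commit_messages if FIXES_RE.search(msg))
    return tagged / len(commit_messages)

# Invented sample messages for illustration:
msgs = [
    'foo: fix null deref\n\nFixes: deadbeef1234 ("foo: add feature")\n',
    "bar: backport quirk for new hardware\n",
]
print(fixes_fraction(msgs))  # 0.5
```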

  • Finer-grained kernel address-space layout randomization

    The idea behind kernel address-space layout randomization (KASLR) is to make it harder for attackers to find code and data of interest to use in their attacks by loading the kernel at a random location. But a single random offset is used for the placement of the kernel text, which presents a weakness: if the offset can be determined for anything within the kernel, the addresses of other parts of the kernel are readily calculable. A new "finer-grained" KASLR patch set seeks to remedy that weakness for the text section of the kernel by randomly reordering the functions within the kernel code at boot time.
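
    The difference between the two schemes can be shown with a toy model: under plain KASLR every function shifts by one shared offset, so leaking a single address pins down all the others, while finer-grained KASLR also shuffles the function order. This is a simplified illustration, not the actual patch logic:

```python
import random

functions = ["sys_open", "sys_read", "sys_write", "commit_creds"]

def plain_kaslr(base, size=0x100):
    """One shared offset: function i always sits at base + i * size."""
    return {name: base + i * size for i, name in enumerate(functions)}

def fine_grained_kaslr(base, size=0x100, rng=random):
    """Shuffle the function order at 'boot', then lay them out from base."""
    order = functions[:]
    rng.shuffle(order)
    return {name: base + i * size for i, name in enumerate(order)}

# With plain KASLR, leaking sys_open's address reveals commit_creds:
layout = plain_kaslr(base=random.randrange(0, 0x1000000, 0x1000))
leaked = layout["sys_open"]
print(layout["commit_creds"] == leaked + 3 * 0x100)  # True: fixed distance
```

    With the shuffled layout, the relative distance between any two functions changes on every boot, so one leaked address no longer gives an attacker the rest of the kernel text.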

  • Debian discusses how to handle 2038

    At this point, most of the kernel work to avoid the year-2038 apocalypse has been completed. Said apocalypse could occur when time counted in seconds since 1970 overflows a 32-bit signed value (i.e. time_t). Work in the GNU C Library (glibc) and other C libraries is well underway as well. But the "fun" is just beginning for distributions, especially those that support 32-bit architectures, as a recent Debian discussion reveals. One of the questions is: how much effort should be made to support 32-bit architectures as they fade from use and 2038 draws nearer?

    Steve McIntyre started the conversation with a post to the debian-devel mailing list. In it, he noted that Arnd Bergmann, who was copied on the email, had been doing a lot of the work on the kernel side of the problem, but that it is mostly a solved problem for the kernel at this point. McIntyre and Bergmann (not to mention Debian as a whole) are now interested in what is needed to update a complete Linux system, such as Debian, to work with a 64-bit time_t.

    McIntyre said that glibc has been working on an approach that splits the problem up based on the architecture targeted. Those that already have a 64-bit time_t will simply have a glibc that works with that ABI. Others that are transitioning from a 32-bit time_t to the new ABI will continue to use the 32-bit version by default in glibc. Applications on the latter architectures can request the 64-bit time_t support from glibc, but then they (and any other libraries they use) will only get the 64-bit versions of the ABI.

    One thing that glibc will not be doing is bumping its SONAME (major version, essentially); doing so would make it easier to distinguish versions with and without the 64-bit support for 32-bit architectures. The glibc developers do not consider the change to be an ABI break, because applications have to opt into the change. It would be difficult and messy for Debian to change the SONAME for glibc on its own.
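
    The overflow moment itself is simple arithmetic: a signed 32-bit time_t runs out 2**31 - 1 seconds after the 1970 epoch:

```python
from datetime import datetime, timezone

# The largest value a signed 32-bit time_t can hold:
max_32bit_time_t = 2**31 - 1

overflow = datetime.fromtimestamp(max_32bit_time_t, tz=timezone.utc)
print(overflow.isoformat())  # 2038-01-19T03:14:07+00:00
```

    One second later, an unconverted 32-bit time_t wraps to a large negative value, which naive code would interpret as a date in 1901.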

  • UEFI Boot Support Published For RISC-V On Linux

    As expected following the recent cleanup of the Linux EFI code in preparation for a new architecture, patches have now been posted to bring UEFI boot support to RISC-V.

    Western Digital's Atish Patra sent out the patch series on Tuesday adding UEFI support for the RISC-V architecture. This initial UEFI Linux bring-up covers boot-time services, while UEFI runtime-service support is still being worked on. The RISC-V UEFI support can work in conjunction with the U-Boot bootloader and depends upon other recent Linux kernel work around RISC-V's Supervisor Binary Interface (SBI).

  • Linux Kernel Seeing Patches For NVIDIA's Proprietary Tegra Partition Table

    An obstacle to upstreaming support for some older NVIDIA Tegra devices (namely those running Android) is that their GPT entry is either at the wrong location or missing entirely, preventing boot support. That missing or botched GPT support is because those older devices use a proprietary, closed-source NVIDIA table format. As such, support for this proprietary NVIDIA Tegra partition table is being worked on for the Linux kernel to provide better upstream kernel support on these consumer devices.

    NVIDIA Tegra devices primarily rely on a special partition-table format for their internal storage, while some also support traditional GPT partitions. Devices with working GPT support boot fine, but TegraPT support is being developed so that the upstream Linux kernel can handle the remaining devices where GPT support is missing or located at the wrong sector. This issue primarily plagues Tegra 2 and Tegra 3 era hardware, such as some Google Nexus products (e.g. the Nexus 7); fortunately, newer Tegra devices properly support GPT.

  • Intel Continues Bring-Up Of New Gateway SoC Architecture On Linux, ComboPHY Driver

    Besides all the usual hardware-enablement activity from Intel's massive open-source team working on the Linux kernel, one of the more peculiar recent bring-ups has centered on the "Intel Gateway SoC", with more work arriving for Linux 5.7.

    The Intel Gateway SoC is a seemingly yet-to-be-released product for high-speed network packet processing. The Gateway SoC supports the Intel Gateway Datapath Architecture (GWDPA) and is designed for very fast and efficient network processing. Outside of Linux kernel patches we haven't seen many Intel "Gateway" references to date. Gateway appears to be (or to be based on) the Intel "Lightning Mountain" SoC that we were first to notice and bring attention to last summer, when patches began appearing for that previously unknown codename.


SUSE Condemns US Government and Promotes SUSE Enterprise Storage 7

  • A time of reflection and standing together

    Like many of you, I have found the events occurring across the United States in response to the unconscionable killings of George Floyd, Ahmaud Arbery, and Breonna Taylor, amongst many others, to be profoundly tragic and painful. Personally, they have shaken me to my core and have left me in deep reflection. While I will never understand the struggle of millions of people around the world that have been subject to systemic oppression, I stand united against hate and discrimination. As these events continue to unfold across the United States, they have rightly grabbed the world’s attention, and as leader of a global company, SUSE cannot remain silent – we will not remain silent. We will not accept racism, discrimination, or harassment in any form at any time. We stand against the innocent lives lost.

  • Staying Out of Trouble with SUSE Enterprise Storage 7

    Do you ever wake up in the morning and think, “I wish there was somewhere that stored common troubleshooting problems for SUSE Enterprise Storage 7?”

Debian: SReview, Nageru, Clang build and More

  • Wouter Verhelst: SReview 0.6

    I had planned to release a new version of SReview, my online video review and transcoding system that I originally wrote for FOSDEM but that is used for DebConf, too, after it was set up and running properly for FOSDEM 2020. However, things got a bit busy (both in my personal life and in the world at large), so it fell a bit by the wayside.

    I've now been working on things a bit more, in preparation for an improved administrator's interface, and have started implementing a REST API to deal with talks etc. through HTTP calls. This seems to be coming along nicely, thanks to OpenAPI and the Mojolicious plugin for parsing that. I can now design the API nicely, and autogenerate client-side libraries to call it.

    While at it, because libmojolicious-plugin-openapi-perl isn't available in Debian 10 "buster", I moved the Docker containers over from stable to testing. This revealed that both bs1770gain and inkscape changed their command lines incompatibly, resulting in me having to work around those incompatibilities. The good news is that I managed to do so in a way that keeps running SReview on Debian 10 viable, provided one installs Mojolicious::Plugin::OpenAPI from CPAN rather than from a Debian package. Or installs a backport of that package, of course. Or, heck, uses the Docker containers in a Kubernetes environment or some such -- I'd love to see someone use that in production.

  • Nageru 2.0.0 released

    I've released version 2.0.0 of Nageru, my live video mixer. Obviously, version 2 of anything is a major milestone; in this case, it wasn't so much this specific release being so big, but the combined work that has gone on through the 1.x versions. (Also, if you go from 1.9.0 to 1.10.0, you can be pretty sure 2.0 is never coming!) There were several major features where I could probably have justified a 2.0 bump alone (e.g., the multichannel audio processing support, HTML5 graphics, slow motion through Futatabi, or the large reworking of the themes in 1.9.0), and now, it was time. Interestingly enough, despite growing by 40,000 lines or so since the 1.0.0 release four and a half years ago, the basic design has proved fairly robust; there are always things I would like to do different, but I'm fairly happy about how flexible and reliable things have turned out to be, even though my own use cases have shifted from simple conference video to complex sports productions.

  • Debian rebuild with clang 10 + some patches

    Instead of patching clang itself, I used a different approach this time: patching Debian tools or implementing some workaround to mitigate an issue.

  • Olivier Berger: Mixing NRELab’s Antidote and Eclipse Che on the same k8s cluster

    You may have heard of my search for Cloud solutions to run labs in an academic context, with a focus on free and open source solutions. You may read previous installments of this blog, or, for a shorter overview, check the presentation I recorded last week. I've become quite interested, in recent months, in two projects: NRELab's Antidote and Eclipse Che. Antidote is the software that powers NRELabs, a labs platform for learning network automation, which runs on top of Kubernetes (k8s). The interesting thing is that for each learner, there can be a dedicated k8s namespace with multiple virtual nodes running on a separate network. This can be used in the context of virtual classes/labs where our students will perform network labs in parallel on the same cluster.

  • Olivier Berger: Experimenting on distant labs and labs on the Cloud

    I mention tools like Guacamole, MeshCentral, NRELab’s Antidote, Eclipse Che and Labtainers, as well as k8s and Docker, as interesting tools that may allow us to continue teaching in labs while allowing more flexibility, distant learning, and hopefully improved quality.

  • Sylvain Beucler: Debian LTS and ELTS - May 2020

    Here is my transparent report for my work on the Debian Long Term Support (LTS) and Debian Extended Long Term Support (ELTS), which extend the security support for past Debian releases, as a paid contributor. In May, the monthly sponsored hours were split evenly among contributors depending on their max availability - I was assigned 17.25h for LTS (out of 30 max; all done) and 9.25h for ELTS (out of 20 max; all done).

Let’s Discover Xubuntu 20.04 With Xfce 4.14; A Review

One of the most gorgeous flavors of Ubuntu is Xubuntu, which ships by default with the Xfce desktop. Xfce is a very practical desktop environment that not only "just works", but is also beautiful in its own characteristic way. Xubuntu 20.04 is the first LTS release to ship with Xfce 4.14, making it also the first LTS to fully experience the power of GTK 3 after the port from GTK 2, which took around four years of continuous work. The amount of updates between Xubuntu 18.04 and 20.04 is huge. Today we'll take you on a tour of Xubuntu 20.04, covering its features and the bugs or issues you may face if you consider switching to it.

IBM/Red Hat: Security Enhanced Linux, Open Data Hub and More

  • SELinux Sees Nice Optimizations With Linux 5.8

    Security Enhanced Linux is seeing some nice optimizations with the in-development Linux 5.8 kernel. One of the SELinux optimizations in Linux 5.8 is a change to some of its internal data structures to improve performance. One notable area is the use of a hash table for SELinux role transitions: by storing role transitions in a hash table, on Fedora, where there are around 428 role transitions, the run time was cut by about 50% in testing with Stress-NG benchmarks.
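
    The optimization is the classic move from a linear scan to a hash lookup; here is a toy model of a role-transition table (the key fields echo the SELinux tuple of current role, type, and class, but the names and code are illustrative, not kernel code):

```python
# Linear list: each lookup walks all transitions, O(n) on average.
transitions = [
    (("sysadm_r", "login_exec_t", "process"), "local_login_r"),
    # ...Fedora's policy holds a few hundred such entries...
]

def lookup_linear(role, type_, cls):
    for key, new_role in transitions:
        if key == (role, type_, cls):
            return new_role
    return None

# Hash table: the same data keyed directly, O(1) average per lookup.
transition_table = dict(transitions)

def lookup_hashed(role, type_, cls):
    return transition_table.get((role, type_, cls))

print(lookup_hashed("sysadm_r", "login_exec_t", "process"))
```

    With a few hundred entries, consulting the table on every relevant permission check makes the difference between the two approaches measurable, which matches the roughly 50% run-time reduction reported above.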

  • [Red Hat] Edge investments, data navigators, and more industry trends

    As part of my role as a senior product marketing manager at an enterprise software company with an open source development model, I publish a regular update about open source community, market, and industry trends for product marketers, managers, and other influencers. Here are five of my and their favorite articles from that update.

  • Open Data Hub 0.6.1: Bug fix release to smooth out redesign regressions

    It is just a few short weeks since we released Open Data Hub (ODH) 0.6.0, bringing many changes to the underlying architecture and some new features. We found a few issues in this new version with the Kubeflow Operator and a few regressions that came in with the new JupyterHub updates. To make sure your experience with ODH 0.6 does not suffer because we wanted to release early, we offer a new (mostly) bugfix release: Open Data Hub 0.6.1.

  • Open Sourcing Red Hat Advanced Cluster Management for Kubernetes

    Recently, at Red Hat Summit Virtual Event, we announced Red Hat Advanced Cluster Management for Kubernetes, a new management solution designed to help organizations further extend and scale Red Hat OpenShift, the leading enterprise Kubernetes platform. This new product is based on technology that originated with IBM, and that technology was not fully open source. In accordance with Red Hat policy, we are in the process of opening the source code for this new product. This same open source technology will then also be used by IBM for its CloudPak for Multicloud Management. At Red Hat, we believe using an open development model helps create more secure, stable and innovative technologies. And the commitment to that open source model is what we have based our business model on. Even after joining forces with IBM, this commitment remains unchanged. We have worked more than 25 years to invest in open projects and technologies.

  • Role of APIs in an increasingly digital world

    COVID-19 has had a major impact on the world. It has affected the way we do business, where we work, how we provide services and how we communicate. We must find new ways to accomplish these pursuits, and application programming interfaces (APIs) can help. In a digital-driven world, applications have become fundamental to our economy and even our society – and these applications commonly need to communicate and integrate with a range of other applications and systems in order to perform their essential functions. APIs are one way to unlock the change.

  • RHEL 7.8 and the final update to container tools

    Before we get started with the updates for Red Hat Enterprise Linux 7.8, we recommend taking a serious look at moving to Red Hat Enterprise Linux 8. RHEL 7 is now in Maintenance Support and will no longer receive newer versions of container tools. Users who need access to the latest versions of Podman, Buildah, and Skopeo should move to RHEL 8, where the container-tools module is updated once a quarter. For those of you required to use containers on RHEL 7, this post will provide you with a strategic and technical update. Red Hat understands that many customers cannot upgrade immediately. So, similar to our update of container tools in RHEL 7.7, we have released one final update to the container tools provided in RHEL 7.8.

  • Advancing open source in telecom demands interoperability: How do we get there?

    More and more, open source technologies are gaining traction in the telecommunications industry as service providers reinvent their networks and push the boundaries with cloud-native networking functions and principles. But challenges remain, in particular around integration and interoperability of the many components that make up their infrastructures. This was a central theme at the Open Networking Summit Europe in Antwerp, Belgium, an event focused on the future of open source networking and aimed at enabling collaborative development and innovation across enterprises, service providers, and cloud providers. As digital service providers begin realizing value from open source platforms like OpenStack – including faster time to market, reduced costs, and improved reliability, scalability, and agility – they are in a better position to deliver the services their customers want: mobile 5G streaming video, audio, and more.

  • Exploring and modeling COVID data

    “If your prediction proves to be very good, then it’s probably too good to be true,” says IBM developer advocate and data scientist Damiaan Zwietering. Damiaan loves his profession, which he has been practicing for almost 25 years, and by now he has come across most of the pitfalls. He likes to share his knowledge and experience with others, from developers to people in the business, and therefore has a prominent role during the June 12, 2020, Code @ Think digital event. In two sessions, he’ll introduce anyone who wants to know more about data science to the world of COVID-19 data, and show where the opportunities and pitfalls lie.