Kernel and Graphics: Vulkan, NVIDIA Memory Compaction and Intel DRM Driver
-
vkBasalt CAS Vulkan Layer Adds FXAA Support
The open-source vkBasalt project is an independent effort implementing AMD's Radeon Image Sharpening / Contrast Adaptive Sharpening (CAS) technique as a Vulkan post-processing layer that can be used regardless of the (Vulkan-powered) game. With vkBasalt 0.1 now also comes the ability to apply FXAA.
Fast Approximate Anti-Aliasing (FXAA) is the latest feature of vkBasalt alongside contrast adaptive sharpening. For the v0.1 release, however, CAS and FXAA cannot both be enabled at the same time; enabling both together is on the project's TODO list for a future release. Like the existing CAS support, the anti-aliasing technique can be used with any Vulkan game, since it is implemented as a post-processing layer for this graphics API.
-
mm: Proactive compaction
For some applications we need to allocate almost all memory as hugepages. However, on a running system, higher-order allocations can fail if the memory is fragmented. The Linux kernel currently does on-demand compaction as we request more hugepages, but this style of compaction incurs very high latency. Experiments with one-time full memory compaction (followed by hugepage allocations) show that the kernel is able to restore a highly fragmented memory state to a fairly compacted state in under 1 second for a 32G system. Such data suggests that more proactive compaction can help us allocate a large fraction of memory as hugepages while keeping allocation latencies low.

For more proactive compaction, the approach taken here is to define a per-node tunable called 'hpage_compaction_effort' which dictates the bounds of external fragmentation for HPAGE_PMD_ORDER pages that kcompactd should try to maintain. The tunable is exposed through sysfs:

/sys/kernel/mm/compaction/node-n/hpage_compaction_effort

The value of this tunable is used to determine low and high thresholds for external fragmentation with respect to the HPAGE_PMD_ORDER order.

Note that the previous version of this patch [1] was found to introduce too many tunables (per-order extfrag_{low, high}), but this one reduces them to just one (per-node hpage_compaction_effort). Also, the new tunable is an opaque value instead of asking for specific bounds of "external fragmentation", which would have been difficult to estimate. The internal interpretation of this opaque value allows for future fine-tuning. Currently, we use a simple translation from this tunable to [low, high] extfrag thresholds (low = 100 - hpage_compaction_effort, high = low + 10%).

To periodically check per-node extfrag status, we reuse the per-node kcompactd threads, which are woken up every few milliseconds to do this check.
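The translation from the opaque tunable to concrete [low, high] thresholds described above can be sketched in a few lines of Python. This is a simplified model of the formula in the cover letter, not the kernel code itself; capping the high threshold at 100% is an assumption, made so the default matches the stated (low=100%, high=100%) behavior.

```python
def extfrag_thresholds(hpage_compaction_effort: int) -> tuple[int, int]:
    """Translate the opaque per-node tunable into [low, high] external
    fragmentation thresholds, per the patch description:
    low = 100 - effort, high = low + 10."""
    low = 100 - hpage_compaction_effort
    # Capping at 100% is an assumption here, consistent with the
    # documented default of effort=0 => (low=100%, high=100%).
    high = min(low + 10, 100)
    return low, high

# Default effort of 0 leaves kcompactd effectively passive.
print(extfrag_thresholds(0))   # (100, 100)
# A moderate effort of 30 yields a hysteresis band of [70, 80].
print(extfrag_thresholds(30))  # (70, 80)
```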
If any zone on its corresponding node has extfrag above the high threshold for the HPAGE_PMD_ORDER order, the thread starts compaction in the background until all zones are below the low extfrag level for this order. By default, the tunable is set to 0 (=> low=100%, high=100%).

This patch is largely based on ideas from Michal Hocko posted here: https://lore.kernel.org/linux-mm/20161230131412.GI13301@dhcp22.suse.cz/

* Performance data

System: x86_64, 32G RAM, 12 cores. I made a small driver that allocates as many hugepages as possible and measures allocation latency. The driver first tries to allocate a hugepage using GFP_TRANSHUGE_LIGHT and, if that fails, tries to allocate with `GFP_TRANSHUGE | __GFP_RETRY_MAYFAIL`. The driver stops when both methods fail for a hugepage allocation.

Before starting the driver, the system was fragmented by a userspace program that allocates all memory and then, for each 2M-aligned section, frees 3/4 of the base pages using munmap. The workload is mainly anonymous userspace pages, which are easy to move around. I intentionally avoided unmovable pages in this test to see how much latency we incur just by hitting the slow path for most allocations.
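The background behavior described above amounts to simple hysteresis: compaction starts once a zone's extfrag exceeds the high threshold and keeps running until it drops below the low one. A minimal model of that decision, with hypothetical names (this is not the kernel implementation):

```python
def proactive_compaction_step(extfrag: int, low: int, high: int,
                              compacting: bool) -> bool:
    """Decide whether kcompactd should be compacting on this wakeup.

    extfrag:    current external fragmentation (%) for HPAGE_PMD_ORDER
    low, high:  thresholds derived from hpage_compaction_effort
    compacting: whether background compaction is already in progress
    """
    if extfrag > high:
        return True       # crossed the high watermark: start compacting
    if extfrag < low:
        return False      # fell below the low watermark: stop
    return compacting     # inside the band: keep current state (hysteresis)

# With thresholds [70, 80]: rising fragmentation triggers compaction,
# which then runs until fragmentation falls back below 70%.
running = False
states = []
for frag in (60, 75, 85, 75, 65):
    running = proactive_compaction_step(frag, 70, 80, running)
    states.append(running)
print(states)  # [False, False, True, True, False]
```

The two-threshold band is what keeps kcompactd from oscillating: once woken, it works past the point that triggered it, down to the low mark, before going idle again.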
-
NVIDIA Engineer Continues Working On Proactive Memory Compaction For Linux
NVIDIA's Nitin Gupta continues working on proactive compaction for the Linux kernel's memory management code.
This proactive compaction is designed to avoid the high latency introduced right now when the Linux kernel does on-demand compaction when an application needs a lot of hugepages. With this proactive compaction, a large number of hugepages can be requested while avoiding high latencies.
-
Intel Submits Last Bits For Linux 5.5 DRM Driver - Includes More TGL/Gen12, Discrete Bit
Intel's open-source crew has submitted the last of their feature updates to their "i915" Direct Rendering Manager graphics driver for staging in DRM-Next ahead of the upcoming Linux 5.5 kernel cycle.
In recent weeks they have been bringing up much of their Tiger Lake / Gen12 graphics code as the dominant theme for the Linux 5.5 kernel. There has also been Jasper Lake support, Xe multi-GPU prep work, and their other routine code clean-ups and driver improvements. Out this morning is the last of their feature work targeting Linux 5.5.