Planet Debian - https://planet.debian.org/

Dirk Eddelbuettel: ttdo 0.0.3: New package

Friday 13th of September 2019 11:29:00 PM

A new package of mine arrived on CRAN yesterday, having been uploaded a few days prior on the weekend. It extends the most excellent (and very minimal / zero depends) unit testing package tinytest by Mark van der Loo with the very clever and well-done diffobj package by Brodie Gaslam. Mark also tweeted about it.

The package was written to address a fairly specific need. In teaching STAT 430 at Illinois, I am relying on the powerful PrairieLearn system (developed there) to provide tests, quizzes or homework. Alton and I have put together an autograder for R (which is work in progress, more on that maybe another day), and that uses this package to provide colorized differences between supplied and expected answers in case of an incorrect answer.

Now, the aspect of providing colorized diffs when tests do not evaluate to TRUE is both simple and general enough. As our approach works rather well, I decided to offer the package on CRAN as well. The small screenshot gives a simple idea, the README.md contains a larger screenshot.

The initial NEWS entries follow below.

Changes in ttdo version 0.0.3 (2019-09-08)
  • Added a simple demo to support initial CRAN upload.
Changes in ttdo version 0.0.2 (2019-08-31)
  • Updated defaults for format and mode to use the same options used by diffobj along with fallbacks.
Changes in ttdo version 0.0.1 (2019-08-26)
  • Initial version, with thanks to both Mark and Brodie.

Please use the GitHub repo and its issues for any questions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Norbert Preining: Gaming: Puzzle Agent

Friday 13th of September 2019 01:36:16 PM

Two lovely but short puzzle games, Puzzle Agent and Puzzle Agent II, follow agent Nelson Tethers in his quest to solve an obscure case in Scoggins, Minnesota: the eraser factory that delivers to the White House has stopped production – a dangerous situation for the US and the world. Tethers embarks on a wild journey.

Starting in his office, agent Tethers is used to office work solving puzzles, mostly inspired by chewing gum, until a strange encounter and a phone call kick him out into the wild.

The game is full of puzzles, most of them rather easy, some of them tricky. One can use spare chewing gum to get a hint in case one gets stuck. Chewing gum is rare in Scoggins, so agent Tethers needs to collect used gum from all kinds of surfaces.

Solved puzzles are sent in for evaluation, which also shows the huge amount of money a single FBI agent costs. After that, agent Tethers' performance is evaluated based on the number of hints (chewing gums) used and false submissions.

The rest consists of dialog trees to collect information and driving around the neighborhood of Scoggins. The game shines with its well-balanced set of puzzles and its quirky dialogs with the quirky people of Scoggins.

The game is beautifully drawn in cartoon style, far from the shiny ray-traced world, and this in particular adds a lot of charm to the game.

A simple but very enjoyable pair of games. Unfortunately there is not much replay value. Still, they are worth getting when they are on sale.

Jonas Meurer: debian lts report 2019.08

Thursday 12th of September 2019 01:13:11 PM
Debian LTS report for August 2019

This month I was allocated 10 hours. Unfortunately, I didn't find much time to work on LTS issues, so I only spent 0.5 hours on the task listed below. That means that I carry over 9.5 hours to September.

  • Triaged CVE-2019-13640/qbittorrent: After digging through the code, it became obvious that qbittorrent 3.1.10 in Debian Jessie is not affected by this vulnerability as the affected code is not present yet.

Ben Hutchings: Debian LTS work, August 2019

Thursday 12th of September 2019 12:45:17 PM

I was assigned 20 hours of work by Freexian's Debian LTS initiative and worked all those hours this month.

I prepared and, after review, released Linux 3.16.72, including various security and other fixes. I then rebased the Debian package onto that. I uploaded that with a small number of other fixes and issued DLA-1884-1. I also prepared and released Linux 3.16.73 with another small set of fixes.

I backported the latest security update for Linux 4.9 from stretch to jessie and issued DLA-1885-1 for that.

Thomas Lange: FAI.me service now supports backports for Debian 10 (buster)

Thursday 12th of September 2019 07:36:39 AM

The FAI.me service for creating customized installation and cloud images now supports a backports kernel for the stable release Debian 10 (aka buster). If you enable the backports option, you will currently get kernel 5.2. This will help you if you have newer hardware that is not supported by the default kernel 4.19. The backports option is also still available for images using the old Debian 9 (stretch) release.

The URL of the FAI.me service is

https://fai-project.org/FAIme/


Norbert Preining: TeX Services at texlive.info

Thursday 12th of September 2019 01:20:20 AM

I have been working over the last weeks to provide four more services for the TeX (Live) community: an archive of TeX Live's network installation directory tlnet, a git repository of CTAN, a mirror of the TeX Live historic archives, and a new tlpretest mirror. In addition to the services already provided on my server, this makes a considerable list, and I thought it would be a good idea to summarize all of the services.

Overview of the services

New services added recently are marked with an asterisk (*) at the end.

For the git services, anonymous checkouts are supported. If a developer wants to have push rights, please contact me.

tlnet archive

TeX Live is distributed via the CTAN network in CTAN/systems/texlive/tlnet. The packages there are updated on a daily basis according to updates on CTAN that make it into the TeX Live repository. This has created some problems for distributions requiring specific versions, as well as problems with rollbacks in case of buggy packages.

Starting with 2019/08/30, daily rsync backups of the tlnet directory are taken, and they are available at https://www.texlive.info/tlnet-archive/YYYY/MM/DD/tlnet.
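For example, a TeX Live installation could be pinned to one of these snapshots with tlmgr; the following is only an illustrative invocation of mine (the snapshot date is arbitrary), not part of the announcement itself:

tlmgr option repository https://www.texlive.info/tlnet-archive/2019/08/30/tlnet
tlmgr update --self --all

After that, updates come from the frozen 2019-08-30 state rather than the moving tlnet tip; switching back is a matter of pointing the repository option at a regular CTAN mirror again.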

CTAN git repository

The second big item is putting CTAN into a git repository. In a perfect world I would get a git commit for each single package update, but that would need a lot of collaboration with the CTAN team (and maybe this will happen in the future). For now there is one rsync of CTAN per day, committed after the sync.

Considering the total size of CTAN (currently around 40G), we decided to ignore file types that provide no useful information when put into git, mostly large binary files. The concrete list is tar, zip, pkg, cab, jar, dmg, rpm, deb, tgz, iso and exe, as well as files containing one of these extensions (that means a file foobar.iso.gz will be ignored, too). This keeps the size of the .git directory at a reasonable amount for now (a few GB).

We will see how the git repository grows over time, and whether we can support this on a long term time range.

While we exclude the above files from being recorded in the git repository, the actual CTAN directory is complete and contains all files, meaning that an rsync checkout contains everything.

Access to these services is provided as follows:

TeX Live historic archives

The TeX Live historic archives hierarchy contains various items of interest in TeX history, from individual files to entire systems. See the article by Ulrik Vieth at https://tug.org/TUGboat/tb29-1/tb91vieth.pdf for an overview.

We provide a mirror available via rsync://texlive.info/historic/.
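As an illustration (my own example invocation, not from the post), a local copy of the hierarchy could be pulled with something like:

rsync -av rsync://texlive.info/historic/ ./texlive-historic/

Note that the full hierarchy is large; syncing only a sub-directory of interest (if one knows the layout) keeps the download smaller.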

tlpretest mirror

During preparation of a new TeX Live release (the pretest phase) we are distributing preliminary builds via a few tlpretest mirrors. The current server will provide access to tlpretest, too:

TeX Live svn/git mirror

Since I prefer to work with git, and developing new features with git on separate branches is so much more convenient than working with subversion, I am running a git-svn mirror of the whole TeX Live subversion repository. This repo is updated every 15min with the latest changes. There are also git branches matching the subversion branches, and some dev/ branches where I am working on new features. The git repository carries, similar to the subversion one, the full history back to our switch from Perforce to Subversion in 2005. This repository is quite big, so don't do a casual checkout (checked-out size is currently close to 40Gb):

TeX Live contrib

The TeX Live Contrib repository is a companion to the core TeX Live (tlnet) distribution in much the same way as Debian’s non-free tree is a companion to the normal distribution. The goal is not to replace TeX Live: packages that could go into TeX Live itself should stay (or be added) there. The TeX Live Contrib is simply trying to fill in a gap in the current distribution system by providing ready made packages for software that is not distributed in TeX Live proper due to license reasons, support for non-free software, etc.:

TeX Live GnuPG

Starting with release 2016, TeX Live provides facilities to verify the authenticity of the TeX Live database using cryptographic signatures. For this to work, a working GnuPG program needs to be available, in particular either gpg (version 1) or gpg2 (version 2). To ease adoption of verification, this repository provides a TeX Live package tlgpg that ships GnuPG binaries for Windows and macOS (universal and x86_64). On other systems we expect GnuPG to be installed.

Supporting these services

We will try to keep these services up and running as long as server space, connectivity, and bandwidth allow. If you find them useful, I happily accept donations via PayPal or Patreon to support the server as well as my time and energy!

Benjamin Mako Hill: How Discord moderators build innovative solutions to problems of scale with the past as a guide

Wednesday 11th of September 2019 03:04:05 AM

Both this blog post and the paper it describes are collaborative work led by Charles Kiene with Jialun “Aaron” Jiang.

Introducing new technology into a work place is often disruptive, but what if your work was also completely mediated by technology? This is exactly the case for the teams of volunteer moderators who work to regulate content and protect online communities from harm. What happens when the social media platforms these communities rely on change completely? How do moderation teams overcome the challenges caused by new technological environments? How do they do so while managing a “brand new” community with tens of thousands of users?

For a new study that will be published in CSCW in November, we interviewed 14 moderators of 8 “subreddit” communities from the social media aggregation and discussion platform Reddit to answer these questions. We chose these communities because each had recently adopted the platform Discord to support real-time chat in their community. This expansion into Discord introduced a range of challenges—especially for the moderation teams of large communities.

We found that moderation teams of large communities improvised their own creative solutions to challenges they faced by building bots on top of Discord’s API. This was not too shocking given that APIs and bots are frequently cited as tools that allow innovation and experimentation when scaling up digital work. What did surprise us, however, was how important moderators’ past experiences were in guiding the way they used bots. In the largest communities that faced the biggest challenges, moderators relied on bots to reproduce the tools they had used on Reddit. The moderators would often go so far as to give their bots the names of moderator tools available on Reddit. Our findings suggest that support for user-driven innovation is important not only in that it allows users to explore new technological possibilities but also in that it allows users to mine their past experiences to introduce old systems into new environments.

What Challenges Emerged in Discord?

Discord’s text channels allow for more natural, in the moment conversations compared to Reddit. In Discord, this social aspect also made moderation work much more difficult. One moderator explained:

“It’s kind of rough because if you miss it, it’s really hard to go back to something that happened eight hours ago and the conversation moved on and be like ‘hey, don’t do that.’ ”

Moderators we spoke to found that the work of managing their communities was made even more difficult by their community’s size:

“On the day to day of running 65,000 people, it’s literally like running a small city…We have people that are actively online and chatting that are larger than a city…So it’s like, that’s a lot to actually keep track of and run and manage.”

The moderators of large communities repeatedly told us that the tools provided to moderators on Discord were insufficient. For example, they pointed out that tools like Discord’s Audit Log were inadequate for keeping track of the tens of thousands of members of their communities. Discord also lacks automated moderation tools like Reddit’s Automoderator and Modmail, leaving moderators on Discord with few tools to scale their work and manage communications with community members.

How Did Moderation Teams Overcome These Challenges?

The moderation teams we talked with adapted to these challenges through innovative uses of Discord’s API toolkit. Like many social media platforms, Discord offers a public API where users can develop apps that interact with the platform through a Discord “bot.” We found that these bots play a critical role in helping moderation teams manage Discord communities with large populations.

Guided by their experience with using tools like Automoderator on Reddit, moderators working on Discord built bots with similar functionality to solve the problems associated with scaled content and Discord’s fast-paced chat affordances. These bots would search for regular expressions and URLs that go against the community’s rules:

“It makes it so that rather than having to watch every single channel all of the time for this sort of thing or rely on users to tell us when someone is basically running amuck, posting derogatory terms and terrible things that Discord wouldn’t catch itself…so it makes it that we don’t have to watch every channel.”

Bots were also used to replace Discord’s Audit Log feature with what moderators referred to often as “Mod logs”—another term borrowed from Reddit. Moderators will send commands to a bot like “!warn username” to store information such as when a member of their community has been warned for breaking a rule and automatically store this information in a private text channel in Discord. This information helps organize information about community members, and it can be instantly recalled with another command to the bot to help inform future moderation actions against other community members.

Finally, moderators also used Discord’s API to develop bots that functioned virtually identically to Reddit’s Modmail tool. Moderators are limited in their availability to answer questions from members of their community, but tools like “Modmail” help moderation teams manage this problem by mediating communication with community members through a bot:

“So instead of having somebody DM a moderator specifically and then having to talk…indirectly with the team, a [text] channel is made for that specific question and everybody can see that and comment on that. And then whoever’s online responds to the community member through the bot, but everybody else is able to see what is being responded.”

The tools created with Discord’s API — customizable automated content moderation, Mod logs, and a Modmail system — all resembled moderation tools on Reddit. They even bear their names! Over and over, we found that moderation teams essentially created and used bots to transform aspects of Discord, like text channels, into Mod logs and Modmail that resemble the same tools they were using to moderate their communities on Reddit.

What Does This Mean for Online Communities?

We think that the experience of the moderators we interviewed points to a potentially important, overlooked source of value for groups navigating technological change: the potent combination of users’ past experience and their ability to redesign and reconfigure their technological environments. Our work suggests the value of innovation platforms like APIs and bots is not only that they allow the discovery of “new” things. It also flows from the fact that they allow the re-creation of the things that communities already know can solve their problems and that they already know how to use.

For more details, check out the full 23-page paper. The work will be presented in Austin, Texas at the ACM Conference on Computer-supported Cooperative Work and Social Computing (CSCW’19) in November 2019. The work was supported by the National Science Foundation (awards IIS-1617129 and IIS-1617468). If you have questions or comments about this study, contact Charles Kiene at ckiene [at] uw [dot] edu.

Markus Koschany: My Free Software Activities in August 2019

Tuesday 10th of September 2019 10:37:47 PM

Welcome to gambaru.de. Here is my monthly report that covers what I have been doing for Debian. If you’re interested in Java, Games and LTS topics, this might be interesting for you.

Debian Games

Debian Java

Misc
  • I fixed two minor CVE in binaryen, a compiler and toolchain infrastructure library for WebAssembly, by packaging the latest upstream release.
Debian LTS

This was my 42nd month as a paid contributor and I have been paid to work 21.75 hours on Debian LTS, a project started by Raphaël Hertzog. In that time I did the following:

  • From 12.08.2019 until 18.08.2019 and from 09.09.2019 until 10.09.2019 I was in charge of our LTS frontdesk. I investigated and triaged CVE in kde4libs, apache2, nodejs-mysql, pdfresurrect, nginx, mongodb, nova, radare2, flask, bundler, giflib, ansible, zabbix, salt, imapfilter, opensc and sqlite3.
  • DLA-1886-2. Issued a regression update for openjdk-7. The regression was caused by the removal of several classes in rt.jar by upstream. Since Debian never shipped the SunEC security provider, SSL connections based on elliptic curve algorithms could no longer be established. The problem was solved by building sunec.jar and its native library libsunec.so from source. An update of the nss source package was required too, which resolved a five-year-old bug (#750400).
  • DLA-1900-1. Issued a security update for apache2 fixing 2 CVE; three more CVE did not affect the version in Jessie.
  • DLA-1914-1. Issued a security update for icedtea-web fixing 3 CVE.
  • I have been working on a backport of opensc, a set of libraries and utilities to access smart cards that support cryptographic operations, from Stretch which will fix more than a dozen CVE.
ELTS

Extended Long Term Support (ELTS) is a project led by Freexian to further extend the lifetime of Debian releases. It is not an official Debian project but all Debian users benefit from it without cost. The current ELTS release is Debian 7 “Wheezy”. This was my fifteenth month and I have been assigned to work 15 hours on ELTS, of which I used 10.

  • I was in charge of our ELTS frontdesk from 26.08.2019 until 01.09.2019 and I triaged CVE in dovecot, libcommons-compress-java, clamav, ghostscript and gosa as end-of-life because security support for them has ended in Wheezy. There were no new issues for supported packages. All in all this was a rather unspectacular week.
  • ELA-156-1. Issued a security update for linux fixing 9 CVE.
  • ELA-154-2. Issued a regression update for openjdk-7 and nss because the removed classes in rt.jar caused the same issues in Wheezy too.

Thanks for reading and see you next time.

Erich Schubert: Altmetrics of a Retraction Notice

Tuesday 10th of September 2019 08:17:08 AM

As pointed out by RetractionWatch, AltMetrics even tracks the metrics of retraction notices.

This retraction notice has an AltMetric score of 9 as I write this, and it will grow with every mention on blogs (such as this one) and Twitter. Even worse, just one blog post and one tweet by Retraction Watch were enough to put the retraction notice “In the top 25% of all research outputs”.

In my opinion, this shows how unreliable these altmetrics are. They are based on the false assumption that Twitter and blogs are central to (or at least representative of) academic importance and attention. But given the very low usage rates of these media by academics, this does not appear to work well, except for a few high-profile papers.

Existing citation indexes, with all their drawbacks, may still be more useful.

Jonathan McDowell: Making xinput set-button-map permanent

Tuesday 10th of September 2019 07:11:23 AM

Since 2006 I’ve been buying a Logitech Trackman Marble (or, as Amazon calls it, a USB Marble Mouse) for both my home and work setups (they don’t die, I just seem to lose them somehow). It’s got a solid feel to it, helps me avoid RSI twinges and when I’m thinking I can take the ball out and play with it. It has 4 buttons, but I find the small one on the right inconvenient to use so I treat it as a 3 button device (the lack of scroll wheel functionality doesn’t generally annoy me). The problem is that the small leftmost button defaults to “Back” rather than “Middle button”. You can fix this with xinput:

xinput set-button-map "Logitech USB Trackball" 1 8 3 4 5 6 7 2 9

but remembering to do that every boot is annoying. I could put it in a script, but a better approach is to drop the following in /usr/share/X11/xorg.conf.d/50-marblemouse.conf (the fact it’s in /usr/share instead of /etc or ~ is what meant it took me so long to figure out how I’d done it on my laptop for my new machine):

Section "InputClass" Identifier "Marble Mouse" MatchProduct "Logitech USB Trackball" MatchIsPointer "on" MatchDevicePath "/dev/input/event*" Driver "evdev" Option "SendCoreEvents" "true" # Physical buttons come from the mouse as: # Big: 1 3 # Small: 8 9 # # This makes left small button (8) into the middle, and puts # scrolling on the right small button (9). # Option "Buttons" "9" Option "ButtonMapping" "1 8 3 4 5 6 7 2 9" Option "EmulateWheel" "true" Option "EmulateWheelButton" "9" EndSection

This post exists solely for the purpose of reminding future me how I did this on my Debian setup (given that it’s taken me way too long to figure out how I did it 2+ years ago) and apparently original credit goes to Ubuntu for their Logitech Marblemouse USB page.

Iain R. Learmonth: Spoofing commits to repositories on GitHub

Monday 9th of September 2019 08:17:00 PM

The following has already been reported to GitHub via HackerOne. Someone from GitHub has closed the report as “informative” but told me that it’s a known low-risk issue. As such, while they haven’t explicitly said so, I figure they don’t mind me blogging about it.

Check out this commit in torvalds’ linux.git on GitHub. In case this is fixed, here’s a screenshot of what I see when I look at this link:

How did this get past review? It didn’t. You can spoof commits in any repo on GitHub due to the way they handle forks of repositories internally. Instead of copying repositories when forks occur, the objects in the git repository are shared and only the refs are stored per-repository. (GitHub tell me that private repositories are handled differently to avoid private objects leaking out this way. I didn’t verify this but I have no reason to suspect it is not true.)

To reproduce this:

  1. Fork a repository
  2. Push a commit to your fork
  3. Put your commit ref on the end of:
https://github.com/[parent]/[repo]/commit/

That’s all there is to it. You can also add .diff or .patch to the end of the URL and those URLs work too, in the namespace of the parent.

The situation that worries me relates to distribution packaging. Debian has a policy that deltas to packages in the stable repository should be as small as possible, targeting fixes by backporting patches from newer releases.

If you get a bug report on your Debian package with a link to a commit on GitHub, you had better double check that this commit really did come from the upstream author and hasn’t been spoofed in this way. Even if it shows it was authored by the upstream’s GitHub account or email address, this still isn’t proof because this is easily spoofed in git too.

The best defence against being caught out by this is probably signed commits, but if the upstream is not doing that, you can clone the repository from GitHub and check to see that the commit is on a branch that exists in the upstream repository. If the commit is in another fork, the upstream repo won’t have a ref for a branch that contains that commit.
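As a rough sketch of that check (the repository URL and commit hash below are placeholders):

# clone the upstream repository itself, not the fork
git clone https://github.com/example/project.git
cd project

# list the branches that actually contain the commit in question
git branch -r --contains 0123456789abcdef0123456789abcdef01234567

# if git reports an unknown object, or prints no branches, the commit is not
# reachable from any upstream ref and most likely lives only in a fork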

Ben Hutchings: Distribution kernels at Linux Plumbers Conference 2019

Monday 9th of September 2019 01:32:56 PM

I'm attending the Linux Plumbers Conference in Lisbon from Monday to Wednesday this week. This morning I followed the "Distribution kernels" track, organised by Laura Abbott.

I took notes, included below, mostly with a view to what could be relevant to Debian. Other people took notes in Etherpad. There should also be video recordings available at some point.

Upstream 1st: Tools and workflows for multi kernel version juggling of short term fixes, long term support, board enablement and features with the upstream kernel

Speaker: Bruce Ashfield, working on Yocto at Xilinx.

Details: https://linuxplumbersconf.org/event/4/contributions/467/

Yocto's kernel build recipes need to support multiple active kernel versions (3+ supported streams), multiple architectures, and many different boards. Many patches are required for hardware and other feature support including -rt and aufs.

Goals for maintenance:

  • Changes w.r.t. upstream are visible as discrete patches, so rebased rather than merged
  • Common feature set and configuration
  • Different feature enablements
  • Use as few custom tools as possible

Other distributions have similar goals but very few tools in common. So there is a lot of duplicated effort.

Supporting developers, distro builds and end users is challenging. E.g. developers complained about Yocto having separate git repos for different kernel versions, as this led to them needing more disk space.

Yocto solution:

  • Config fragments, patch tracking repo, generated tree(s)
  • Branched repository with all patches applied
  • Custom change management tools
Using Yocto to build a distro and maintain a kernel tree

Speaker: Senthil Rajaram & Anatol Belski from Microsoft

Details: https://linuxplumbersconf.org/event/4/contributions/469/

Microsoft chose Yocto as build tool for maintaining Linux distros for different internal customers. Wanted to use a single kernel branch for different products but it was difficult to support all hardware this way.

Maintaining config fragments and sensible inheritance tree is difficult (?). It might be helpful to put config fragments upstream.

Laura Abbott said that the upstream kconfig system had some support for fragments now, and asked what sort of config fragments would be useful. There seemed to be consensus on adding fragments for specific applications and use cases like "what Docker needs".
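As a rough illustration of what such a use-case fragment could look like (the option names are real kernel symbols, but this particular fragment is my own sketch rather than anything agreed at the session):

# container.config -- illustrative fragment for "what Docker needs"
CONFIG_NAMESPACES=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_PIDS=y
CONFIG_MEMCG=y
CONFIG_VETH=y
CONFIG_BRIDGE=y
CONFIG_OVERLAY_FS=y

Such a fragment could then be merged on top of a base configuration with the kernel's own helper, e.g. scripts/kconfig/merge_config.sh .config container.config.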

Kernel build should be decoupled from image build, to reduce unnecessary rebuilding.

Initramfs is unpacked from cpio, which doesn't support SELinux. So they build an initramfs into the kernel, and add a separate initramfs containing a squashfs image which the initramfs code will switch to.

Making it easier for distros to package kernel source

Speaker: Don Zickus, working on RHEL at Red Hat.

Details: https://linuxplumbersconf.org/event/4/contributions/466/

Fedora/RHEL approach:

  • Makefile includes Makefile.distro
  • Other distro stuff goes under distro sub-directory (merge or copy)
  • Add targets like fedora-configs, fedora-srpm

Lots of discussion about whether config can be shared upstream, but no agreement on that.

Kyle McMartin(?): Everyone does the hierarchical config layout - like generic, x86, x86-64 - can we at least put this upstream?

Monitoring and Stabilizing the In-Kernel ABI

Speaker: Matthias Männich, working on Android kernel at Google.

Details: https://linuxplumbersconf.org/event/4/contributions/468/

Why does Android need it?

  • Decouple kernel vs module development
  • Provide single ABI/API for vendor modules
  • Reduce fragmentation (multiple kernel versions for same Android version; one kernel per device)

Project Treble made most of Android user-space independent of device. Now they want to make the kernel and in-tree modules independent too. For each kernel version and architecture there should be a single ABI. Currently they accept one ABI bump per year. Requires single kernel configuration and toolchain. (Vendors would still be allowed to change configuration so long as it didn't change ABI - presumably to enable additional drivers.)

ABI stability is scoped - i.e. they include/exclude which symbols need to be stable.

ABI is compared using libabigail, not genksyms. (Looks like they were using it for libraries already, so now using it for kernel too.)

Q: How can we ignore compatible struct extensions with libabigail?

A: (from Dodji Seketeli, main author) You can add specific "suppressions" for such additions.
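For illustration (my own sketch based on libabigail's suppression specifications; the struct and file names are made up), such a suppression plus a comparison run might look roughly like:

# abi.suppr: tolerate fields appended at the end of struct foo_dev
[suppress_type]
  name = foo_dev
  has_data_member_inserted_at = end

abidiff --suppressions abi.suppr build-old/vmlinux build-new/vmlinux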

KernelCI applied to distributions

Speaker: Guillaume Tucker from Collabora.

Details: https://linuxplumbersconf.org/event/4/contributions/470/

Can KernelCI be used to build distro kernels?

KernelCI currently builds arbitrary branch with in-tree defconfig or small config fragment.

Improvements needed:

  • Preparation steps to apply patches, generate config
  • Package result
  • Track OS image version that kernel should be installed in

Some in audience questioned whether building a package was necessary.

Possible further improvements:

  • Enable testing based on user-space changes
  • Product-oriented features, like running installer
Should KernelCI be used to build distro kernels?

Seems like a pretty close match. Adding support for different use-cases is healthy for KernelCI project. It will help distro kernels stay close to upstream, and distro vendors will then want to contribute to KernelCI.

Discussion

Someone pointed out that this is not only useful for distributions. Distro kernels are sometimes used in embedded systems, and the system builders also want to check for regressions on their specific hardware.

Q: (from Takashi Iwai) How long does testing typically take? SUSE's full automated tests take ~1 week.

A: A few hours to build, depending on system load, and up to 12 hours to complete boot tests.

Automatically testing distribution kernel packages

Speaker: Alice Ferrazzi of Gentoo.

Details: https://linuxplumbersconf.org/event/4/contributions/471/

Gentoo wants to provide safe, tested kernel packages. Currently testing gentoo-sources and derived packages. gentoo-sources combines upstream kernel source and "genpatches", which contains patches for bug fixes and target-specific features.

Testing multiple kernel configurations - allyesconfig, defconfig, other reasonable configurations. Building with different toolchains.

Tests are implemented using buildbot. Kernel is installed on top of a Gentoo image and then booted in QEMU.

Generalising for discussion:

  • Jenkins vs buildbot vs other
  • Beyond boot testing, like LTP and kselftest
  • LAVA integration
  • Supporting other configurations
  • Any other Gentoo or meta-distro topic

Don Zickus talked briefly about Red Hat's experience. They eventually settled on Gitlab CI for RHEL.

Some discussion of what test suites to run, and whether they are reliable. Varying opinions on LTP.

There is some useful scripting for different test suites at https://github.com/linaro/test-definitions.

Tim Bird talked about his experience testing with Fuego. A lot of the test definitions there aren't reusable. kselftest currently is hard to integrate because tests are supposed to follow TAP13 protocol for reporting but not all of them do!

Distros and Syzkaller - Why bother?

Speaker: George Kennedy, working on virtualisation at Oracle.

Details: https://linuxplumbersconf.org/event/4/contributions/473/

Which distros are using syzkaller? Apparently Google uses it for Android, ChromeOS, and internal kernels.

Oracle is using syzkaller as part of CI for Oracle Linux. "syz-manager" schedules jobs on dedicated servers. There is a cron job that automatically creates bug reports based on crashes triggered by syzkaller.

Google's syzbot currently runs syzkaller on GCE. Planning to also run on QEMU with a wider range of emulated devices.

How to make syzkaller part of distro release process? Need to rebuild the distro kernel with config changes to make syzkaller work better (KASAN, KCOV, etc.) and then install kernel in test VM image.
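The instrumentation options mentioned are standard kernel config symbols; a minimal fragment along these lines (my illustration, not taken from the talk) would be merged into the distro config before the rebuild:

# syzkaller.config -- coverage and memory-error instrumentation
CONFIG_KCOV=y
CONFIG_KCOV_INSTRUMENT_ALL=y
CONFIG_KASAN=y
CONFIG_KASAN_INLINE=y
CONFIG_DEBUG_INFO=y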

How to correlate crashes detected on distro kernel with those known and fixed upstream?

Example of benefit: syzkaller found regression in rds_sendmsg, fixed upstream and backported into the distro, but then regressed in Oracle Linux. It turned out that patches to upgrade rds had undone the fix.

syzkaller can generate test cases that fail to build on old kernel versions due to symbols missing from UAPI headers. How to avoid this?

Q: How often does this catch bugs in the distro kernel?

A: It doesn't often catch new bugs but does catch missing fixes and regressions.

Q: Is anyone checking the syzkaller test cases against backported fixes?

A: Yes [but it wasn't clear who or when]

Google has public database of reproducers for all the crashes found by syzbot.

Wish list:

  • Syzkaller repo tag indicating which version is suitable for a given kernel version's UAPI
  • tarball of syzbot reproducers

Other possible types of fuzzing (mostly concentrated on KVM):

  • They fuzz MSRs, control & debug regs with "nano-VM"
  • Missing QEMU and PCI fuzzing
  • Intel and AMD virtualisation work differently, and AMD may not be covered well
  • Missing support for other architectures than x86

Dirk Eddelbuettel: pinp 0.0.8: Bugfix

Sunday 8th of September 2019 09:11:00 PM

A new release of our pinp package is now on CRAN. pinp allows for snazzier one or two column Markdown-based pdf vignettes, and is now used by a few packages. A screenshot of the package vignette can be seen below. Additional screenshots are at the pinp page.

This release was spurred by one of those "CRAN package xyz" emails I received yesterday: processing of pinp-using vignettes was breaking at CRAN under the newest TeX Live release present on Debian testing as well as recent Fedora. The rticles package (which uses the PNAS style directly) apparently has a similar issue with PNAS.

Kurt was as usual extremely helpful in debugging, and we narrowed this down to an interaction with the newer versions of the titlesec LaTeX package. So for now we did two things: upgrade our code reusing the PNAS class to their newest version of the PNAS class (as suggested by Norbert, whom I also roped in), but also copy in an older version of titlesec.sty (plus a support file). In the meantime, we are also looking into titlesec directly as Javier offered help—all this was a really decent example of open source firing on all cylinders. It is refreshing.

Because of the move to a newer PNAS version (which seems to clearly help with the occasionally odd formatting of floating blocks near the document end) I may have trampled on earlier extension pull requests. I will reach out to the authors of the PRs to work towards a better process with cleaner diffs, a process I should probably have set up earlier.

The NEWS entry for this release follows.

Changes in pinp version 0.0.8 (2019-09-08)
  • Two erroneous 'Provides' were removed from the pinp class.

  • The upquote package is now used to use actual (non-fancy) quotes in verbatim mode (Dirk fixing #75)

  • The underlying PNAS style was updated to the most recent v1.44 version of 2018-05-06 to avoid issues with newer TeXLive (Dirk in #79 fixing #77 and #78)

  • The new PNAS code brings some changes, e.g. watermark is no longer an option, but typesetting of paragraphs seems greatly improved. We may have stomped on an existing behavior; if so, please file an issue.

  • However, it also conflicts with the current texlive version of titlesec, so for now we copy in titlesec.sty (and a support file) from a prior version, just like we do for pinp.cls and jss.bst.

Courtesy of CRANberries, there is a comparison to the previous release. More information is on the pinp page. For questions or comments use the issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

Shirish Agarwal: Depression, Harrapa, Space program

Sunday 8th of September 2019 08:35:13 PM

I was, and had been, depressed ever since the election results came out. Like many others, I was expecting that Congress would come into power, but it didn't. With that came one bad thing after another, whether on politics (Jammu and Kashmir, Assam), both of which from my POV are inhumane not just on citizenship but simply on a humane level. How people can behave like this with each other is beyond me. On the economic front, the less said the better. We are in the midst of a prolonged recession and don't see things turning out for the better any time soon. But as we have to come to terms with it and somehow live day-to-day, we are living. Because of the web, I came to know there are so many countries where similar things are happening right now, whether it is Britain (Brexit), South Africa or Brazil. In fact, the West Papua situation is similar in many ways to what happened in Kashmir. Of course each region has its own complexities, but it can safely be said that such events are happening all over. In every incident, in one way or another, 'The Other' is demonized. This has happened in all of the above incidents.

One question I have often asked and have had no clear answer to: if Germany had known that Israel would become as big and strong as it is now, would they have done what they did? Had they known that Einstein, a Jew, would go on to change the face of science? Would America have been as great without Einstein? I was flabbergasted when I saw 'The Red Sea Diving Resort', which is based on a real operation carried out by Mossad, as shared in the pictures after the movie.

Even amid such darkness, I do see some hope. One good thing has been the rise of independent media. While the mainstream media has become completely ridiculous and, instead of questioning the Government, is toeing its line, independent media is trying to do what mainstream media should have been doing all along. I won't say much more about this, otherwise the whole blog post would be about independent India only. Maybe some other day.

Andrew Cater: Chasing around installing CD images for Buster 10.1 ...

Sunday 8th of September 2019 06:42:45 PM
and having great fun, as ever, making a few mistakes and contributing mayhem and entropy to the CD release process. The Buster 10.1 point update has just been released, thanks to RattusRattus, Sledge, Isy and Schweer (amongst others).

Waiting on the Stretch point release to try it all over again… I'd much rather be in Cambridge, but hey, you can't have everything.

Debian GSoC Kotlin project blog: Beginning of the end.

Sunday 8th of September 2019 01:09:26 PM
Work done.

Hey all, since the last post we have come very far in packaging Kotlin 1.3.30. I am glad to announce that Kotlin 1.3.30's dependencies are completely packaged and only refining work on intellij-community-java (which is the source package of the IntelliJ-related jars that Kotlin depends on) and Kotlin itself remains.

I have roughly packaged Kotlin (the debian folder is pretty much done) and have pushed it here. The bootstrap package can also be found here.

The links to all the dependencies of Kotlin 1.3.30 can be found in my previous blog pages, but I'll list them here for the convenience of the reader.

1.->java-compatibility-1.0.1 -> https://github.com/JetBrains/intellij-deps-java-compatibility (DONE: here)
2.->jps-model -> https://github.com/JetBrains/intellij-community/tree/master/jps (DONE: here)
3.->intellij-core -> https://github.com/JetBrains/intellij-community/tree/183.5153 (DONE: here)
4.->streamex-0.6.7 -> https://github.com/amaembo/streamex/tree/streamex-0.6.7 (DONE: here)
5.->guava-25.1 -> https://github.com/google/guava/tree/v25.1 (DONE: Used guava-19 from libguava-java)
6.->lz4-java -> https://github.com/lz4/lz4-java/blob/1.3.0/build.xml(DONE:here)
7.->libjna-java & libjna-platform-java recompiled in jdk 8. -> https://salsa.debian.org/java-team/libjna-java (DONE : commit)
8.->liboro-java recompiled in jdk8 -> https://salsa.debian.org/java-team/liboro-java (DONE : commit)
9.->picocontainer-1.3 refining -> https://salsa.debian.org/java-team/libpicocontainer-1-java (DONE: here)
10.->platform-api -> https://github.com/JetBrains/intellij-community/tree/183.5153/platform (DONE: here)
11.->util -> https://github.com/JetBrains/intellij-community/tree/183.5153/platform (DONE: here)
12.->platform-impl -> https://github.com/JetBrains/intellij-community/tree/183.5153/platform (DONE: here)
13.->extensions -> https://github.com/JetBrains/intellij-community/tree/183.5153/platform (DONE: here)
14.->jengeleman:shadow:4.0.3 --> https://github.com/johnrengelman/shadow (DONE)
15.->trove4j 1.x -> https://github.com/JetBrains/intellij-deps-trove4j (DONE)
16.->proguard:6.0.3 in jdk8 (DONE: released as libproguard-java 6.0.3-2)
17.->io.javaslang:2.0.6 --> https://github.com/vavr-io/vavr/tree/javaslang-v2.0.6 (DONE)
18.->jline 3.0.3 --> https://github.com/jline/jline3/tree/jline-3.3.1 (DONE)
19.->protobuf-2.6.1 in jdk8 (DONE)
20.->com.jcabi:jcabi-aether:1.0 -> the file that requires this is commented out;can be seen here and here
21.->org.sonatype.aether:aether-api:1.13.1 -> the file that requires this is commented out;can be seen here and here

Important Notes.

It should be noted that at this point in time, 8th September 2019, the kotlin package only aims to package the jars generated by the ":dist" task of the Kotlin build scripts. This task builds the Kotlin home. So that's all we have; we don't have the kotlin-gradle-plugins or kotlinx or anything else that isn't part of the Kotlin home.

It should be noted that the kotlin bootstrap package has kotlin-gradle-plugin, kotlinx and kotlin-dsl jars. The eventual plan is to build kotlin-gradle-plugins and kotlinx from the Kotlin source itself and to build kotlin-dsl from the Gradle source using Kotlin as a dependency for Gradle. After we do that we can get rid of the kotlin bootstrap package.

It should also be noted that this kotlin package, as of 8th September 2019, may not be perfect and might contain a ton of bugs. This is for two reasons: partly because I have ignored some code that depended on jcabi-aether (mentioned above with a link to the commit), and mostly because the platform-api.jar and platform-impl.jar from intellij-community-idea are not the same as their upstream counterparts but contain only the minimal files required to make Kotlin compile without errors; I did this because they would otherwise have needed new dependencies to be packaged, and at this time it didn't look like it was worth it.

Work left to be done.

Now I believe most of the building blocks of packaging Kotlin are done and what's left is to remove this pesky bootstrap. I believe this can be counted as the completion of my GSoC (which officially ended on August 26). The tasks left are as follows:

Major Tasks.
  1. Make kotlin build using just openjdk-11-jdk; now it builds with openjdk-8-jdk and openjdk-11-jdk.
  2. Build kotlin-gradle-plugins.
  3. Build kotlinx.
  4. Build kotlindsl from gradle.
  5. Do 2,3 and 4 and make kotlin build without bootstrap.
Things that will help the kotlin effort.
  1. refine intellij-community-idea and do its copyrights file proper.
  2. import kotlin 1.3.30 into a new debian-java-maintainers repository.
  3. move kotlin changes(now maintained as git commits) to quilt patches. link to kotlin -> here.
  4. do kotlin's copyrights file.
  5. refine kotlin.
Authors Notes.

Hey guys, it's been a wonderful ride so far. I hope to keep doing this and maintain Kotlin in Debian. I am only a final year student and my career fair starts on October 17th, 2019, so I have to prepare for coding interviews and start searching for jobs. So until late November 2019 I'll only be taking on the smaller tasks and doing them. Please note that I won't be doing it as fast as I used to up until now, since I am going to be a little busy during this period. I hope I can land a job that lets me keep doing this :).

I would love to take this section to thank _hc, ebourg, andrewsh and seamlik for helping and mentoring me through all this.

So if any of you want to help please kindly take on any of these tasks.

!!NOTE: ping me if you want to build Kotlin on your system and are stuck!!

You can find me as m36 or m36[m] on #debian-mobile and #debian-java on OFTC.

I'll try to maintain this blog and post the major updates.

Dima Kogan: Are planes crashing any less than they used to?

Saturday 7th of September 2019 11:25:00 PM

Recently, I've been spending more of my hiking time looking for old plane crashes in the mountains. And I've been looking for data that helps me do that, for instance the last post. A question that came up in conversation is: "are crashes getting more rare?" And since I now have several datasets at my disposal, I can very easily come up with a crude answer.

The last post describes how to map the available NTSB reports describing aviation incidents. I was only using the post-1982 reports in that project, but here let's also look at the older reports. Today I can download both from their site:

$ wget https://app.ntsb.gov/avdata/Access/avall.zip
$ unzip avall.zip    # <------- Post 1982
$ wget https://app.ntsb.gov/avdata/PRE1982.zip
$ unzip PRE1982.zip  # <------- Pre 1982

I import the relevant parts of each of these into sqlite:

$ ( mdb-schema avall.mdb sqlite -T events; echo "BEGIN;"; mdb-export -I sqlite avall.mdb events; echo "COMMIT;"; ) | sqlite3 post1982.sqlite

$ ( mdb-schema PRE1982.MDB sqlite -T tblFirstHalf; echo "BEGIN;"; mdb-export -I sqlite PRE1982.MDB tblFirstHalf; echo "COMMIT;"; ) | sqlite3 pre1982.sqlite

And then I pull out the incident dates, and make a histogram:

$ cat <(sqlite3 pre1982.sqlite 'select DATE_OCCURRENCE from tblFirstHalf') \
      <(sqlite3 post1982.sqlite 'select ev_date from events') |
  perl -pe 's{^../../(..) .*}{$1 + (($1<40)? 2000: 1900)}e' |
  feedgnuplot --histo 0 --binwidth 1 --xmin 1960 --xlabel Year \
              --title 'NTSB-reported incident counts by year'

I guess by that metric everything is getting safer. This clearly just counts NTSB incidents, and I don't do any filtering by the severity of the incident (not all reports describe crashes), but close-enough. The NTSB only deals with civilian incidents in the USA, and only after the early 1960s, it looks like. Any info about the military?

At one point I went through "Historic Aircraft Wrecks of Los Angeles County" by G. Pat Macha, and listed all the incidents described in that book. The histogram of that dataset looks like this:

Aaand there're a few internet resources that list out significant incidents in Southern California. For instance:

I visualize that dataset:

$ < [abc].htm perl -nE '/^ \s* 19(\d\d) | \d\d \s*(?:\s|-|\/)\s* \d\d \s*(?:\s|-|\/)\s* (\d\d)[^\d]/x || next; $y = 1900+($1 or $2); say $y unless $y==1910' |
  feedgnuplot --histo 0 --binwidth 5

So what did we learn? I guess overall crashes are becoming more rare. And there was a glut of military incidents in the 1940s and 1950s in Southern California (not surprising given all the military bases and aircraft construction facilities here at that time). And by one metric there were lots of incidents in the late 1970s/early 1980s, but they were much more interesting to this "carcomm" person than they were to Pat Macha.

Andreas Metzler: exim update

Saturday 7th of September 2019 04:39:43 AM

Testing users might want to manually pull the latest (4.92.1-3) upload of Exim from sid instead of waiting for regular migration to testing. It fixes a nasty vulnerability.

Iustin Pop: Nationalpark Bike Marathon 2019

Friday 6th of September 2019 06:00:00 PM

This is a longer story… but I think it’s interesting, nonetheless.

The previous 6 months

Due to my long-running foot injury, my training was sporadic at the end of 2018, and by February I realised I would have to stop most if not all training in order to have a chance of recovery. I knew that meant no bike races this year, and I was fine with that. Well, I had to be.

The only compromise was that I wanted to do one race, the NBM short route, since that is really easy (both technique and endurance), and even untrained should be fine.

So my fitness (well, CTL) went down and down and down, and my weight didn’t improve either.

As April and May went by, my foot was getting better, but training on the indoor trainer was still causing problems and moving my recovery backwards, so easy times.

By June things were looking better - I was even able to do a couple of slow runs! July started even better, but trainer sessions were still a no-go. Only in early August could I reliably do a short trainer session without issues. The good side was that since around June I could bike to work and back without problems, but that's a short commute.

But, I felt confident I could do ~50Km on a bike with some uphill, so I registered for the race.

In August I could also restart trainer sessions, and to my pleasant surprise, even harder ones. So, I started preparing for the race in the last 2 weeks before it :( Well, better than nothing.

Overall, my CTL went from 62-65 in August 2018, to 6 (six!!) in early May, and started increasing in June, reaching a peak of 23 on the day before the race. That's almost three times lower… so my expectations for the race were low. Especially as the longest ride I did in these 6 months was about one hour long, whereas the race takes double that time.

The race week

Things were going quite well. I also started doing some of the Zwift Academy workouts, more precisely 1 and 2 which are at low power, and everything good.

On Wednesday however, I did workout number 3, which has two "as hard as possible" intervals - exactly the ones my foot can't yet do. It caused some pain, and some concern about the race.

Then we went to Scuol, and I didn't feel very well on Thursday as I was driving. I thought some exercise would help, so I went for a short run, which reminded me that I had made my foot worse the previous day, and left me even more concerned.

On Friday morning, instead of better, I felt terrible. All I wanted was to go back to bed and sleep the whole day, but I knew that would mean no race tomorrow. I thought maybe some light walking would be better for me than lying in bed… At least I didn't have a runny nose or a cough, but this definitely felt like a cold.

We went up with the gondola, walked ~2Km, got back down, and I was feeling at least no worse. All this time, I was overdressed and feeling cold, while everybody else was in t-shirts.

A bit of rest in the afternoon helped; I went and picked up my race number and felt better. After dinner, as I was preparing my stuff for the next morning, I started feeling a bit better about doing the race. "Did not start" was now out of the question, but whether it would be a DNF was not clear yet.

Race (day)

Thankfully the race doesn’t start early for this specific route, so the morning was relatively relaxed. But then of course I was a few minutes late, so I hurried on my bike to the train station, only to realise I was among the early people. Loading the bike, getting on the bus (the train station in Scuol is offline for repairs), a long bus ride to the start point, and then… 2 hours of waiting. And to think I thought I was 5 minutes late :)

I spent the two hours just reading email and browsing the internet (and posting a selfie on FB), and then finally it was on.

And I was surprised how “on” the race was from the first 100 meters. Despite repeated announcements in those two hours that the first 2-3 km do not matter since they’re through the S-chanf village, people started going very hard as soon as there was some space.

So I find myself going 40km/h (forty!!!) on a mountain bike on relatively flat gravel road. This sounds unbelievable, right? But the data says:

  • race started at 1’660m altitude
  • after the first 4.4km, I was at 1’650m, with a total altitude gain of 37m (and thus a total descent of 47m); thus, not flat-flat, but not downhill
  • over this 4.4km, my average speed was 32.5km/h, and that includes starting from zero speed, and in the block (average speed for the first minute was 20km/h)

While 32.5km/h on an MTB sounds cool, the sad part was that I knew this was unsustainable, both from the pure speed point of view, and from the heart rate point of view. I was already at 148bpm after 2½ minutes, but then at minute 6 it went over 160bpm and stayed that way. That is above my LTHR (estimated by various Garmin devices), so I was dipping into reserves. VeloViewer estimates power output here was 300-370W in these initial minutes, which is again above my FTP, so…

But, it was fun. Then at km 4.5 a bit of climb (800m long, 50m altitude, ~6.3%), after which it became mostly flow on gravel. And for the next hour, until the single long climb (towards Guarda), it was the best ride I had this year, and one of the best segments in races in general. Yes, there are a few short climbs here and there (e.g. a 10% one over ~700m, another ~11% one over 300m or so), but in general it’s a slowly descending route from ~1700m altitude to almost 1400m (plus add in another ~120m gained), so ~420m descent over ~22km. This means, despite the short climbs, average speed is still good - a bit more than 25km/h, which made this a very, very nice segment. No foot pain, no exertion, mean heart rate 152bpm, which is fine. Estimated power is a bit high (mean 231W, NP: 271W ← this is strange, too high); I’d really like to have a power meter on my MTB as well.

Then, after about an hour, the climb towards Guarda starts. It’s an easy climb for a fit person, but as I said I was worse for fitness this year, and also my weight was not good. Data for this segment:

  • it took me 33m:48s
  • 281m altitude gained
  • 4.7km length
  • mean HR: 145bpm
  • mean cadence: 75rpm

I remember stopping to drink once, and maybe another time to rest for about half a minute, but not sure. I stopped in total 33s during this half hour.

Then slowly descending on nice roads towards the next small climb to Ftan, then another short climb (thankfully, I was pretty dead at this point) of 780m distance, 7m36s including almost a minute stop, then down, another tiny climb, then down for the finish.

At the finish, knowing that there’s a final climb after you descend into Scuol itself and before the finish, I gathered all my reserves to do the climb standing. Alas, it was a bit longer than I thought; I think I managed to do 75-80% of it standing, but then sat down. Nevertheless, a good short climb:

  • 22m altitude over 245m distance, done in 1m02s
  • mean grade 8.8%, max grade 13.9%
  • mean HR 161bpm, max HR 167bpm which actually was my max for this race
  • mean speed 14.0km/h
  • estimated mean power 433W, NP: 499w; seems optimistic, but I’ll take it :)

Not bad, not bad. I was pretty happy about being able to push this hard, for an entire minute, at the end of the race. Yay for 1m power?

And obligatory picture, which also shows the grade pretty well:

Final climb! And beating my PR by ~9%

I don’t know how the photographer managed to do it, but having those other people in the picture makes it look much better :)

Comparison with last year

Let’s first look at official race results:

  • 2018: 2h11m37s
  • 2019: 2h22m13s

That’s 8% slower. Honestly, I thought I will do much worse, given my lack of training. Or does a 2.5× lower CTL only result in 8% time loss?

Honestly, I don’t think so. I think what saved me this year was that—since I couldn’t do bike rides—I did much more cross-training, as in core exercises. Nothing fancy, just planks, push-ups, yoga, etc., but it helped significantly. If my foot is fine and I can do both for next year, I’ll be in a much better position.

And this is why the sub-title of this post is “Fitness has many meanings”. I really need to diversify my training in general; I had been thinking about that in a somewhat theoretical way, but this race made the point quite practically.

If I look at the Strava data, it gives an even clearer picture:

  • on the hour-long flat segment I mentioned above, which I really loved, I got a PR, beating last year by 1 minute; Strava estimates 250W for this hour, which is what my FTP was last year;
  • on all the climbs, I was slower than last year, as expected, but on the longer climbs significantly so; and many times I was slower than even in 2016, when I did the next-longer route.

And I just realise that, of the 10½ minutes I lost this year, 6½ were on the Guarda climb :)

So yes, you can’t discount fitness, but leg fitness is not everything, and it seems Training Peaks can’t capture overall fitness.

At least I did beat my PR on the finishing climb (1m12s vs. 1m19s last year), because I had left aside those final reserves for it.

Next steps

Assuming I’m successful at dealing with my foot issue, and that early next year I can restart consistent training, I’m not concerned. I need to put in regular sessions, and I also need to put in long ones. The recipe for success here is clear; it all depends on willpower.

Oh, and losing ~10kg of fat wouldn’t be bad, like at all.

Reproducible Builds: Reproducible Builds in August 2019

Friday 6th of September 2019 11:33:37 AM

Welcome to the August 2019 report from the Reproducible Builds project!

In these monthly reports we outline the most important things that have happened in the world of Reproducible Builds and what we have been up to.

As a quick recap of our project, whilst anyone can inspect the source code of free software for malicious flaws, most software is distributed to end users or systems as precompiled binaries. The motivation behind the reproducible builds effort is to ensure that no changes have been introduced during these compilation processes. This is achieved by promising that identical results are always generated from a given source, thus allowing multiple third parties to come to a consensus on whether a build was changed or even compromised.
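
In concrete terms, “identical results” means that two independent builds of the same source produce bit-for-bit identical artifacts, something anyone can check with nothing more than a checksum. A tiny sketch of such a check (the directory names are illustrative only, not a real layout):

    # Compare two independently-produced build trees file by file via SHA-256.
    # "build-1" and "build-2" are illustrative names.
    import hashlib
    from pathlib import Path

    def tree_digests(root):
        root = Path(root)
        return {
            str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in sorted(root.rglob("*")) if p.is_file()
        }

    a, b = tree_digests("build-1"), tree_digests("build-2")
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            print("differs:", name)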

In August’s report, we cover:

  • Media coverage & events: Webmin, CCCamp, etc.
  • Distribution work: the first fully-reproducible package sets, an openSUSE update, etc.
  • Upstream news: libfaketime updates, gzip, ensuring good definitions, etc.
  • Software development: more work on diffoscope, new variations in our testing framework, etc.
  • Misc news: from our mailing list, etc.
  • Getting in touch: how to contribute, etc.

If you are interested in contributing to our project, please visit our Contribute page on our website.

Media coverage & events

A backdoor was found in Webmin, a popular web-based application used by sysadmins to remotely manage Unix-based systems. Whilst more details can be found on upstream’s dedicated exploit page, it appears that the build toolchain was compromised. Especially of note is that the exploit “did not show up in any Git diffs” and thus would not have been found via an audit of the source code. The backdoor would allow a remote attacker to execute arbitrary commands with superuser privileges on the machine running Webmin. Once a machine is compromised, an attacker could then use it to launch attacks on other systems managed through Webmin or indeed any other connected system. Techniques such as reproducible builds can help detect exactly these kinds of attacks that can lie dormant for years. (LWN comments)

Holger Levsen and Vagrant Cascadian presented a talk on Reproducible Builds titled There and Back Again, Reproducibly! at the 2019 edition of the Linux Developer Conference in São Paulo, Brazil.

LWN posted and hosted an interesting summary and discussion on Hardening the file utility for Debian. In July, Chris Lamb had cross-posted his reply to the “Re: file(1) now with seccomp support enabled” thread, originally started on the debian-devel mailing list. In this post, Chris refers to our strip-nondeterminism tool not being able to accommodate the additional security hardening in file(1), and to the changes made to the tool in order to fix this issue, which was causing a huge number of regressions in our testing framework.

The Chaos Communication Camp, an international, five-day open-air event for hackers that provides a relaxed atmosphere for the free exchange of technical, social, and political ideas, hosted its 2019 edition, where there were many discussions and meet-ups at least partly related to Reproducible Builds. These included the titular Reproducible Builds Meetup session, attended by around twenty-five people (half of them new to the project), as well as a session dedicated to all Arch Linux-related issues.

Distribution work

In Debian, the first “package sets” (ie. defined subsets of the entire archive) have become 100% reproducible, including the so-called “essential” set for the bullseye distribution on the amd64 and armhf architectures. This is thanks to work by Chris Lamb on bash, readline and other low-level libraries and tools. Perl still has issues on i386 and arm64, however.

Dmitry Shachnev filed a bug report against the debhelper utility that speaks to issues around using the date from the debian/changelog file as the source for the SOURCE_DATE_EPOCH environment variable, as this can lead to non-intuitive results when a package is automatically rebuilt via so-called binary (NB. not “source”) NMUs. A related issue was later filed against qtbase5-dev by Helmut Grohne, as this exact behaviour led to a problem with co-installability across architectures.
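
As background, the convention in Debian is to derive SOURCE_DATE_EPOCH from the date of the latest debian/changelog entry. A rough sketch of that mapping follows; the changelog trailer line is made up, and real tooling uses dpkg-parsechangelog rather than parsing by hand:

    # Rough sketch: map a debian/changelog trailer date to SOURCE_DATE_EPOCH.
    # The trailer line below is invented for illustration.
    import email.utils

    trailer = " -- Jane Maintainer <jane@example.org>  Sat, 31 Aug 2019 10:15:00 +0200"
    date_part = trailer.split(">", 1)[1].strip()
    epoch = int(email.utils.parsedate_to_datetime(date_part).timestamp())
    print(f"SOURCE_DATE_EPOCH={epoch}")

    # A binary NMU appends a *new* changelog entry with a new date, so this
    # value (and anything derived from it) shifts even though the source
    # package itself did not change.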

Lastly, 115 reviews of Debian packages were added, 45 were updated and 244 were removed this month, appreciably adding to our knowledge about identified issues. Many issue types were updated by Chris Lamb, including embeds_build_data_via_node_preamble, embeds_build_data_via_node_rollup, captures_build_path_in_beam_cma_cmt_files, captures_varying_number_of_build_path_directory_components (discussed later), timezone_specific_files_due_to_haskell_devscripts, etc.

Bernhard M. Wiedemann posted his monthly Reproducible Builds status update for the openSUSE distribution. New issues were found from enabling Link Time Optimization (LTO) in this distribution’s Tumbleweed branch. This affected, for example, nvme-cli, as well as perl-XML-Parser and pcc, which had packaging issues.

Upstream news

Software development

Upstream patches

The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. In August we wrote a large number of such patches, including:

diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. It is run countless times a day on our testing infrastructure and is essential for identifying fixes and causes of non-deterministic behaviour.

This month, Chris Lamb made the following changes:

  • Improvements:
    • Don’t fall back to an unhelpful raw hexdump when, for example, readelf(1) reports a minor issue in a section in an ELF binary. For example, when the .frames section is of the NOBITS type, its contents are apparently “unreliable” and thus readelf(1) returns 1. (#58, #931962)
    • Include either standard error or standard output (not just the latter) when an external command fails. []
  • Bug fixes:
    • Skip calls to unsquashfs when we are neither root nor running under fakeroot. (#63)
    • Ensure that all of our artificially-created subprocess.CalledProcessError instances have output instances that are bytes objects, not str. []
    • Correct a reference to parser.diff; diff in this context is a Python function in the module. []
    • Avoid a possible traceback caused by a str/bytes type confusion when handling the output of failing external commands. []
  • Testsuite improvements:

    • Test for 4.4 in the output of squashfs -version, even though the Debian package version is 1:4.3+git190823-1. []
    • Apply a patch from László Böszörményi to update the squashfs test output and additionally bump the required version for the test itself. (#62 & #935684)
    • Add the wabt Debian package to the test-dependencies so that we run the WebAssembly tests on our continuous integration platform, etc. []
  • Improve debugging:
    • Add the containing module name to the (eg.) “Using StaticLibFile for ...” debugging messages. []
    • Strip off the trailing “original size modulo 2^32 671” (etc.) from gzip compressed data, as this is just a symptom of the contents themselves changing and will be reflected elsewhere; see the short sketch after this list. (#61)
    • Avoid a lack of space between “... with return code 1” and “Standard output”. []
    • Improve debugging output when instantiating our Comparator object types. []
    • Add a literal “eg.” to the comment on stripping “original size modulo...” text to emphasise that the actual numbers are not fixed. []
  • Internal code improvements:
    • No need to parse the section group from the class name; we can pass it via keyword arguments to the type built-in. []
    • Add support to Difference.from_command_exc and friends to ignore specific returncodes from the called program and treat them as “no” difference. []
    • Simplify parsing of optional command_args argument to Difference.from_command_exc. []
    • Set long_description_content_type to text/x-rst to appease the PyPI.org linter. []
    • Reposition a comment regarding an exception within the indented block to match Python code convention. []
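
Regarding the gzip change above: that “original size modulo 2^32” text reflects gzip’s ISIZE trailer field, where (per RFC 1952) the last four bytes of a member store the uncompressed length modulo 2^32, little-endian. A small sketch reading it (the filename is illustrative):

    # Read gzip's ISIZE trailer field (RFC 1952): the last four bytes hold the
    # uncompressed size modulo 2^32, little-endian.
    import gzip
    import struct

    path = "payload.gz"
    with open(path, "wb") as f:
        f.write(gzip.compress(b"hello, reproducible world\n"))

    with open(path, "rb") as f:
        f.seek(-4, 2)                      # seek to the last four bytes
        (isize,) = struct.unpack("<I", f.read(4))

    # When the compressed contents change, this number usually changes too,
    # which is why it is treated as noise rather than a difference in itself.
    print("original size modulo 2^32:", isize)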

In addition, Mattia Rizzolo made the following changes:

  • Now that we install wabt, expect its tools to be available. []
  • Bump the Debian backport check. []

Lastly, Vagrant Cascadian updated diffoscope to versions 120, 121 and 122 in the GNU Guix distribution.

strip-nondeterminism

strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build. This month, Chris Lamb made the following changes.

  • Add support for enabling and disabling specific normalizers via the command line. (#10)
  • Drop accidentally-committed warning emitted on every fixture-based test. []
  • Reintroduce the .ar normalizer [] but disable it by default so that it can be enabled with --normalizers=+ar or similar. (#3)
  • In verbose mode, print the normalizers that strip-nondeterminism will apply. []

In addition, there was some movement on an issue, originally filed in 2016 by Andrew Ayer, in the Archive::Zip Perl module that strip-nondeterminism uses, regarding its lack of support for bzip compression.

Test framework

We operate a comprehensive Jenkins-based testing framework that powers tests.reproducible-builds.org.

This month Vagrant Cascadian suggested, and subsequently implemented, additionally testing with build directories of different string lengths (eg. /path/to/123 vs /path/to/123456); we also vary the number of directory components within this (eg. /path/to/dir vs. /path/to/parent/subdir). Curiously, whilst it was a priori believed that this was rather unlikely to yield differences, Chris Lamb has managed to identify approximately twenty packages that are affected by this issue. A toy illustration of the idea is sketched below.
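
The sketch below is not the actual jenkins.debian.net implementation, just an illustration of why varying both the length and the number of path components can expose packages that capture their build directory:

    # Toy illustration of the build-path variations described above; any
    # build that embeds its build directory will hash differently.
    import hashlib
    import os
    import random
    import string

    def random_dir(parts, length):
        comps = ("".join(random.choices(string.ascii_lowercase, k=length))
                 for _ in range(parts))
        return os.path.join("/build", *comps)

    def fake_build(build_dir):
        # a build that (wrongly) records its build directory in an artifact
        artifact = f"compiled in {build_dir}\npayload: 42\n".encode()
        return hashlib.sha256(artifact).hexdigest()

    first = fake_build(random_dir(parts=1, length=3))    # e.g. /build/abc
    second = fake_build(random_dir(parts=2, length=6))   # e.g. /build/qwerty/asdfgh
    print("reproducible" if first == second else "differs")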

It was also noticed that our testing of the Coreboot free software firmware has failed to build the toolchain since we switched to building on the Debian buster distribution. The last successful build was on August 7th, but all newer builds have failed.

In addition, the following code changes were performed in the last month:

  • Chris Lamb: Ensure that the size of the log for the second build in HTML pages was also correctly formatted (eg. “12KB” vs “12345”). []

  • Holger Levsen:

  • Mathieu Parent: Update the contact details for the Debian PHP Group. []

  • Mattia Rizzolo:

The usual node maintenance was performed by Holger Levsen [][] and Vagrant Cascadian [].

Misc news

There was yet more effort put into our website this month, including misc copyediting by Chris Lamb [], Mathieu Parent referencing his fix for php-pear [], and Vagrant Cascadian updating a link to his homepage [].

On our mailing list this month, Santiago Torres Arias started a “Setting up a MS-hosted rebuilder with in-toto metadata” thread regarding Microsoft’s interest in setting up a rebuilder for Debian packages, touching on issues of transparency logs and the integration of in-toto by the Secure Systems Lab at New York University. In addition, Lars Wirzenius continued the conversation regarding various questions about reproducible builds and their bearing on building a distributed continuous integration system.

Lastly, in a thread titled “Reproducible Builds technical introduction tutorial”, Jathan asked whether anyone had some “easy” Reproducible Builds tutorials in slides, video or written document format.

Getting in touch

If you are interested in contributing to the Reproducible Builds project, please visit the Contribute page on our website. You can also get in touch with us via:


This month’s report was written by Bernhard M. Wiedemann, Chris Lamb, Eli Schwartz, Holger Levsen, Jelle van der Waa, Mathieu Parent and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.

More in Tux Machines

Android Leftovers

Red Hat: Puff Pieces, OpenStack, OpenShift, CodeReady and More

  • Red Hat and SAS: Enabling enterprise intelligence across the hybrid cloud

    Every day 2.5 quintillion bytes of big data is created - this data comes from externally sourced websites, blog posts, tweets, sensors of various types and public data initiatives such as the human genome project, as well as audio and video recordings from smart devices/apps and the Internet of Things (IoT). Many businesses are learning how to look beyond just the data volume (storage requirements), velocity (port bandwidth) and variety (voice, video and data) of this data; they are learning how to use the data to make intelligent business decisions. Today, every organization, across geographies and industries, can innovate digitally, creating more customer value and differentiation while helping to level the competitive playing field. The ability to capture and analyze big data and apply context-based visibility and control into actionable information is what creates an intelligent enterprise. It entails using data to get real-time insights across the lines of business, which can then drive improved operations, innovation and new areas of growth, and deliver enhanced customer and end user experiences.

  • Working together to raise mental health awareness: How Red Hat observed World Mental Health Day

    Cultivating a diverse and inclusive workspace is an important part of Red Hat’s open culture. That’s why we work to create an environment where associates feel comfortable bringing their whole selves to work every single day. One way we achieve this mission is by making sure that Red Hatters who wish to share their mental health experiences are met with compassion and understanding, but most importantly, without stigma. It is estimated that one in four adults suffers from mental illness every year.

  • Introducing Red Hat OpenShift 4.2: Developers get an expanded and improved toolbox

    Today Red Hat announces Red Hat OpenShift 4.2 extending its commitment to simplifying and automating the cloud and empowering developers to innovate. Red Hat OpenShift 4, introduced in May, is the next generation of Red Hat’s trusted enterprise Kubernetes platform, reengineered to address the complexity of managing container-based applications in production systems. It is designed as a self-managing platform with automatic software updates and lifecycle management across hybrid cloud environments, built on the trusted foundation of Red Hat Enterprise Linux and Red Hat Enterprise Linux CoreOS. The Red Hat OpenShift 4.2 release focuses on tooling that is designed to deliver a developer-centric user experience. It also helps cluster administrators by easing the management of the platform and applications, with the availability of OpenShift migration tooling from 3.x to 4.x, as well as newly supported disconnected installs.

  • A look at the most exciting features in OpenStack Train

    With all eyes turning towards Shanghai, we’re getting ready for the next Open Infrastructure Summit in November with great excitement. But before we hit the road, I wanted to draw attention to the latest OpenStack upstream release. The Train release continues to showcase the community’s drive toward offering innovations in OpenStack. Red Hat has been part of developing more than 50 new features spanning Nova, Ironic, Cinder, TripleO and many more projects. But given all the technology goodies (you can see the release highlights here) that the Train release has to offer, you may be curious about the features that we at Red Hat believe are among the top capabilities that will benefit our telecommunications and enterprise customers and their use cases. Here’s an overview of the features we are most excited about in this release.

  • New developer tools in Red Hat OpenShift 4.2

    Today’s announcement of Red Hat OpenShift 4.2 represents a major release for developers working with OpenShift and Kubernetes. There is a new application development-focused user interface, new tools, and plugins for container builds, CI/CD pipelines, and serverless architecture.

  • Red Hat CodeReady Containers overview for Windows and macOS

    Red Hat CodeReady Containers 1.0 is now available with support for Red Hat OpenShift 4.2. CodeReady Containers is “OpenShift on your laptop,” the easiest way to get a local OpenShift environment running on your machine. You can get an overview of CodeReady Containers in the tech preview launch post. You can download CodeReady Containers from the product page.

  • Tour of the Developer Perspective in the Red Hat OpenShift 4.2 web console

    Of all of the new features of the Red Hat OpenShift 4.2 release, what I’ve been looking forward to the most are the developer-focused updates to the web console. If you’ve used OpenShift 4.1, then you’re probably already familiar with the updated Administrator Perspective, which is where you can manage workloads, storage, networking, cluster settings, and more. The addition of the new Developer Perspective aims to give developers an optimized experience with the features and workflows they’re most likely to need to be productive. Developers can focus on higher level abstractions like their application and components, and then drill down deeper to get to the OpenShift and Kubernetes resources that make up their application. Let’s take a tour of the Developer Perspective and explore some of the key features.

A Quick Look At EXT4 vs. ZFS Performance On Ubuntu 19.10 With An NVMe SSD

For those thinking of playing with Ubuntu 19.10's new experimental ZFS desktop install option in opting for using ZFS On Linux in place of EXT4 as the root file-system, here are some quick benchmarks looking at the out-of-the-box performance of ZFS/ZoL vs. EXT4 on Ubuntu 19.10 using a common NVMe solid-state drive. Given Canonical has brought ZFS support to its Ubiquity desktop installer as an easy-to-deploy option for running on this popular file-system, for this initial round of testing from Ubuntu 19.10 a lone NVMe SSD is being used (Corsair Force MP600) as opposed to doing any multi-disk setups, etc, where ZFS is more common due to its rich feature set. Clean installs of Ubuntu 19.10 were done both with EXT4 and ZFS while using the stock mount options / settings each time. The ZoL support in Ubuntu 19.10 is relying upon various back-ports from ZFS On Linux 0.8.2 and this imminent Linux distribution update is shipping with a 5.3-based kernel. Read more

Freespire 5.0 Linux OS Is Out with Linux Kernel 5.0, Based on Ubuntu 18.04.3 LTS

Based on the latest Ubuntu 18.04.3 LTS operating system, Freespire 5.0 is here to respond to users' accusations of a bloated system. Freespire doesn't aim to become bloatware, so Freespire 5.0 ships only with best-of-breed apps and packages and nothing else. Among these, we can mention the KDE Plasma 5.12.9 LTS desktop environment, Chromium 77 web browser, Calligra office suite, Amarok music player, DragonPlayer video player, KolourPaint paint software, Kpatience and DreamChess games, Ice 6.0.4 browser installer, as well as Synaptic Package Manager, Boot Repair, and Kamerka. Read more