Mining DistroWatch.com Logs (Part 2)
This article continues the analysis of DistroWatch.com's logs that I started one week ago. Last time, the data were prepared so that we could investigate the evolution, in time and space, of the popularity of GNU/Linux distributions. Pre-processing the logs in a different manner allows us to focus on other interesting questions. Although the extracted patterns will have the same "shape" as in last week's extraction, this time they will help us discover groups of distributions that fulfill similar purposes.
Instead of last week's ternary relation, this time we will end up mining a 4-ary relation. More precisely, this relation is a symmetric graph of distributions evolving in time and space. Take the red pill and welcome to the real world... of data-mining! When a visitor of DistroWatch.com (identified by her IP address, from which the country is inferred) visits, on the same day, pages related to different distributions, she is probably searching for a Free operating system to fulfill her specific needs.
Over the millions of visits in a semester, almost every pair of distribution pages has, on at least one day, been consulted by at least one visitor.
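The pre-processing described above can be sketched as follows. This is only a minimal illustration, not the code actually used for this analysis: the input format (tuples of IP address, country, day and distribution), the function name build_relation and the min_visitors threshold are all assumptions. The threshold is one possible way to cope with the fact that almost every pair of distributions has been associated at least once.

```python
from collections import Counter
from itertools import combinations


def build_relation(visits, min_visitors=1):
    """Build a symmetric 4-ary relation (distro, distro, day, country).

    `visits` is an iterable of (ip, country, day, distribution) tuples.
    This input format is hypothetical; real logs need parsing first.
    """
    # Group the distributions each visitor consulted on each day.
    pages = {}
    for ip, country, day, distro in visits:
        pages.setdefault((ip, country, day), set()).add(distro)

    # Count, per (pair, day, country), how many visitors made the association.
    counts = Counter()
    for (_ip, country, day), distros in pages.items():
        for a, b in combinations(sorted(distros), 2):
            counts[(a, b, day, country)] += 1

    # Keep sufficiently supported associations, in both orientations,
    # so that every (day, country) slice is a symmetric graph.
    relation = set()
    for (a, b, day, country), n in counts.items():
        if n >= min_visitors:
            relation.add((a, b, day, country))
            relation.add((b, a, day, country))
    return relation
```

Raising min_visitors discards the associations made by only a few visitors in a given country on a given day, which is one way to keep the relation sparse enough for the patterns to be meaningful.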
Let us start with the biggest community: the old mainstream general-purpose distributions. At the center of this community (again, remember that this does not relate to popularity but to the common purposes these distributions serve) are Slackware, Gentoo and Ubuntu. A bit further away (i.e., less strongly related to the other distributions of this group) are Fedora, openSUSE and Debian. At the border of this community are Yellow Dog, MEPIS, Mandriva, Vector, FreeBSD and Damn Small Linux.
When looking at the countries present in these patterns, it appears that visitors from some European countries are clearly the ones making these associations. The United Kingdom stands out by appearing in almost all of these patterns. Finland is also extremely present. Australia, Greece and Denmark are not far behind. Why would these European and Australian visitors focus more on mainstream distributions than others do? Maybe they are more conservative and keep tracking the evolution of these solid distributions instead of searching for more specialized ones.