New Content/Layout OK?

Varnish Proxy

Silly me, a poll would not work on the new server. I forgot that with the Varnish cache proxy at the front, almost all visitors arrive from the same IP address (the proxy's), which means that Drupal would allocate just one vote to all of them (except registered users who are currently logged in). With a Drupal upgrade we can perhaps find polling software that overcomes this.
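
To illustrate the problem, here is a minimal sketch of a one-vote-per-IP poll. It is a toy model, not Drupal's actual poll code; the `Poll` class and the addresses are made up for the example.

```python
class Poll:
    """Toy poll that allows one anonymous vote per source IP address."""

    def __init__(self):
        self.votes = {}  # source IP -> chosen option

    def vote(self, source_ip, option):
        """Record a vote; return False if this IP has already voted."""
        if source_ip in self.votes:
            return False
        self.votes[source_ip] = option
        return True


# Without a proxy, each visitor shows up with their own address.
direct = Poll()
direct.vote("203.0.113.5", "yes")
direct.vote("198.51.100.7", "no")

# Behind the proxy, every visitor appears to come from the proxy's address,
# so only the first vote is accepted.
proxied = Poll()
proxied.vote("127.0.0.1", "yes")
proxied.vote("127.0.0.1", "no")  # rejected: same source address
```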

rpaf

You must use mod_rpaf to fix this problem that Varnish introduces.
See e.g. https://www.varnish-cache.org/lists/pipermail/varnish-misc/2008-September/016470.html
mod_rpaf for EL6 64bit here: http://centos.alt.ru/repository/centos/6/x86_64/mod_rpaf-0.6-2.el6.x86_64.rpm

Proxy

Thanks, we will look into it. Currently, a lot of stuff other than the poll (e.g. view counts) is not compatible with Varnish, and that makes it look as though not many people visit and participate in the site.

For sheer stats you could use

For sheer stats you could use an external (i.e. not cached by varnish) service, such as Google Analytics or run your own Piwik.

Piwik

Google Analytics is spyware, but Piwik would be a possibility (Stallman recently told me that it's good). Can it be installed on a cache proxy? I'd have to gain access to it first. Either way, this would not facilitate per-post page request count. Susan had it set up with a module, but it's no longer working correctly. In turn, rating/sorting posts by popularity is no longer possible, and that's the real downside (the front page can no longer list popular items for today).

The problem is not just that IP addresses are not unique. Some requests are never seen by the CMS and Apache.

For the non-unique addresses

For the non-unique addresses look at mod_rpaf, it was made for exactly this situation.
Is this drupal6 or 7? With 6 varnish integration sucks from what I've seen.

See also
https://drupal.org/project/varnish
https://fourkitchens.atlassian.net/wiki/display/TECH/Configure+Varnish+3+for+Drupal+7

Agreed on Google Analytics. You can just install Piwik on the same host and tell Varnish either not to cache it or you can just set its virtualhost on a port other than 80 so it bypasses Varnish completely.

Varnish

Thanks for the pointers.

Yes, it's Drupal 6 and there are other issues that I am beginning to see, such as lack of updates from the RSS feeds around the page (I am currently investigating this, maybe it's related to a cron job or module config although I very much doubt the latter as I haven't changed configs).

Non-unique addresses could be bypassed as an issue even by writing random IP addresses, but that would enable easy poll rigging. I guess it's not essential for operation of the site, but it's a nice-to-have...

From Drupal.org: "This module provides integration between your Drupal site and the Varnish HTTP Accelerator, an advanced and very fast reverse-proxy system. Basically, Varnish handles serving static files and anonymous page-views for your site much faster and at higher volumes than Apache, in the neighborhood of 3000 requests per second."

I have had such issues with Varnish on top of WordPress and MediaWiki (pages served improperly from cache) and it all makes me wonder if removing Varnish altogether is the best way to proceed.

As for Piwik, I have never tried it before, so I will look into it.

I would keep Varnish on for

I would keep Varnish on for static files (css, js, jpeg etc) and to clean up HTTP traffic (Varnish will not forward incomplete or malformed HTTP requests to the backend, it should also be the front line against synfloods etc).

Here's a sample of what I use (test it first, I'm just beginning with Varnish myself)

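# Backend: Apache on port 8080 (Varnish 3 DNS director syntax).
director default dns {
    .list = {
        .port = "8080";
        .connect_timeout = 5s;
        .first_byte_timeout = 600s;
        .between_bytes_timeout = 600s;
        .max_connections = 10000;
        "172.16.1.53"/32;
    }
}

sub vcl_recv {
    # Always serve static files from the cache.
    if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
        # return (lookup) skips Varnish's built-in vcl_recv, which normally
        # adds X-Forwarded-For, so add it here. This must happen in
        # vcl_recv: in vcl_fetch the backend request has already been sent.
        if (req.restarts == 0) {
            if (req.http.X-Forwarded-For) {
                set req.http.X-Forwarded-For =
                    req.http.X-Forwarded-For + ", " + client.ip;
            } else {
                set req.http.X-Forwarded-For = client.ip;
            }
        }
        return (lookup);
    }
}

sub vcl_fetch {
    # Strip cookies from static files so they stay cacheable.
    if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
        unset beresp.http.set-cookie;
    }
}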

Then install mod_rpaf, make sure your Apache is listening on port 8080, and add this to /etc/httpd/conf.d/rpaf.conf:
LoadModule rpaf_module modules/mod_rpaf-2.0.so

RPAFenable On
RPAFproxy_ips 127.0.0.1 IPs_OF_THE_SERVER
RPAFsethostname On
RPAFheader X-Forwarded-For
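
For anyone wondering what this buys you: the general idea is to believe X-Forwarded-For only when the connection comes from a proxy you trust, and then take the nearest address in the header that is not one of your own proxies. The sketch below shows that idea in Python. It is not mod_rpaf's actual code, and `client_ip` and its parameters are names invented for the example.

```python
import ipaddress


def client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Return the address to treat as the real client.

    remote_addr:     peer address of the TCP connection
    forwarded_for:   raw X-Forwarded-For header value, or None
    trusted_proxies: addresses that are allowed to set the header
    """
    trusted = {ipaddress.ip_address(p) for p in trusted_proxies}
    current = ipaddress.ip_address(remote_addr)

    # Ignore the header entirely unless a trusted proxy sent the request.
    if current not in trusted or not forwarded_for:
        return str(current)

    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    # Walk from the most recent hop backwards, skipping our own proxies.
    for hop in reversed(hops):
        try:
            candidate = ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed entry: stop trusting the rest of the header
        current = candidate
        if candidate not in trusted:
            break
    return str(current)


# Request relayed by the local Varnish: the header is believed.
print(client_ip("127.0.0.1", "203.0.113.5", ["127.0.0.1"]))    # 203.0.113.5
# Request that did not come through a trusted proxy: the header is ignored.
print(client_ip("198.51.100.7", "1.2.3.4", ["127.0.0.1"]))     # 198.51.100.7
```

Walking the header from the right matters because a client can put anything it likes at the left end of X-Forwarded-For; only the entries appended by your own proxies are reliable.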

PS: looks like drupal is messing with my comments, here's a text version http://fpaste.org/74672/raw/

Thanks

Thanks, I will look at it and into it over the weekend.

RSS feeds

The Piwik demo looks impressive, I have just given them a word of endorsement.

I am still trying to resolve some other issues we've identified.

I think I found the source of the issue above (RSS feeds). It seems that outbound connections to external sites are refused by default, which helps explain why RSS feeds cannot be retrieved by the Drupal part of the site:


[root@tuxmachines ~]# wget lxer.com
--2014-02-05 04:34:37--  http://lxer.com/
Resolving lxer.com... 108.166.170.174
Connecting to lxer.com|108.166.170.174|:80... failed: Connection refused.
[root@tuxmachines ~]# wget linuxtoday.com
--2014-02-05 04:34:54--  http://linuxtoday.com/
Resolving linuxtoday.com... 70.42.23.121
Connecting to linuxtoday.com|70.42.23.121|:80... failed: Connection refused.
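
The same check can be scripted if wget is not at hand. This is only a rough sketch using Python's standard socket module; `can_connect` is a helper name made up for the example.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


# Example: can_connect("lxer.com", 80) returns False while outbound
# connections to port 80 are being refused, as in the wget output above.
```

A False result does not say why the connection failed, so wget or a look at the firewall rules is still needed to tell a refusal from a timeout or a DNS problem.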

Looks like a firewall issue

Looks like a firewall issue at first glance.

Firewall

Nux wrote:

Looks like a firewall issue at first glance.

Yes, it was a simple issue to tackle. It works now.

Pageview count and polls

I'll have a look and see if configuration can solve not just the polling issue but also pageview count. The site of this module is down and it seems like it may require configuration on the cache server too.
