Enough Keyword Searches. Just Answer My Question.

Search engines are so powerful. And they are so pathetically weak.

When it comes to digging up a specific name, date, phrase or price, search engines are unstoppable. The same is true for details from the previously concealed past. For better and worse, any information about any of us - true or false, flattering or compromising - that has ever appeared on a publicly available site is likely to be retrievable forever, or until we run out of electricity for the server farms. Carefree use of e-mail was once a sign of sophistication. Now to trust confidential information to e-mail is to be a rube. Despite the sneering term snail mail, plain old letters are the form of long-distance communication least likely to be intercepted, misdirected, forwarded, retrieved or otherwise inspected by someone you didn't have in mind.

Yet for anything but simple keyword queries, even the best search engines are surprisingly ineffective.

Recently, for example, I was trying to track the changes in California's spending on its schools. In the 1960's, when I was in public school there, the legend was that only Connecticut spent more per student than California did. Now, the legend is that only the likes of Louisiana and Mississippi spend less. Was either belief true? When I finally called an education expert on a Monday morning, she gave me the answer off the top of her head. (Answer: right in spirit, exaggerated in detail.) But that was only after I'd wasted what seemed like hours over the weekend with normal search tools. If it sounds easy, try using keyword searches to find consistent state-by-state data covering the last 40 years.

We live with these imperfections by trying to outguess the engines - what if I put "per capita spending by states" in quotation marks? - and by realizing that they're right for some jobs and wrong for others.
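To see why the quotation marks matter, here is a minimal Python sketch of the difference between a plain keyword match and a quoted-phrase match. It is an illustration only, not how any real engine is built; the function names and the two sample documents are invented for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_match(query, document):
    """True if every query word appears somewhere in the document."""
    doc_words = set(tokenize(document))
    return all(word in doc_words for word in tokenize(query))

def phrase_match(query, document):
    """True only if the query words appear consecutively, in order."""
    q, d = tokenize(query), tokenize(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

# Hypothetical documents, made up for this illustration.
docs = [
    "Per capita spending by states rose in the 1990s.",
    "States differ in spending; per capita income varies by region.",
]
query = "per capita spending by states"

for doc in docs:
    print(keyword_match(query, doc), phrase_match(query, doc))
```

Both documents contain every query word, so both pass the keyword test, but only the first contains the exact phrase. That is the kind of narrowing the quotation marks are meant to buy, and it still says nothing about whether the page holds the answer to the actual question.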

One branch of the federal government is desperate enough for a better search tool that its efforts could be a stimulus for fundamental long-term improvements. Last week, I spent a day at a workshop near Washington for the Aquaint project, whose work is unclassified but has gone virtually unnoticed in the news media. The name stands for "advanced question answering for intelligence," and it refers to a joint effort by the National Security Agency, the C.I.A. and other federal intelligence organizations. To computer scientists, "question answering," or Q.A., means a form of search that does not just match keywords but also scans, parses and "understands" vast quantities of information to respond to queries. An ideal Q.A. system would let me ask, "How has California's standing among states in per-student school funds changed since the 1960's?" - and it would draw from all relevant sources to find the right answer.

In the real Aquaint program, the questions are more likely to be, "Did any potential terrorist just buy an airplane ticket?" or "How strong is the new evidence of nuclear programs in Country X?" The presentations I saw, by scientists at universities and private companies, reported progress on seven approaches to the problem. (The new I.B.M. search technology discussed here last year is also part of the Aquaint project.)

There will be more to say later about this effort. On the bright side, apart from whatever the project does for national security, its innovations could eventually improve civilian search systems, much as the Pentagon's Arpanet eventually became the civilian Internet. Of course, the dark potential in ever more effective search-and-surveillance systems is also obvious.

For the moment, consider several here-and-now innovations that can improve on the standard Google-style list of search hits. Ask Jeeves recently introduced two features that enhance its long-established question-and-answer format. One tries to recast search terms into a question that can be answered on the Web; the other offers suggestions to broaden or narrow the search. Another site, a free version of what was once called GuruNet, combines conventional search results with questions and answers.

Two related sites, one the parent of the other, categorize the hits from each search, producing a kind of table of contents of results. Another site does something similar in a visual form; it is free online or $49 for a desktop version. And one bizarrely named but extremely useful site has become my favorite search portal, because it allows quick, easy comparisons of the results of the same search on virtually any major engine.
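The idea of comparing the same search across engines can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a description of how any of these sites works: the engine names, the URLs and both helper functions are hypothetical, and the ranked lists are hard-coded rather than fetched from a real service.

```python
from collections import defaultdict

def compare_results(results_by_engine):
    """Map each URL to the engines that returned it and its rank on each."""
    seen = defaultdict(dict)
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            seen[url][engine] = rank
    return dict(seen)

def shared_hits(results_by_engine):
    """URLs returned by every engine, sorted by average rank (best first)."""
    comparison = compare_results(results_by_engine)
    n = len(results_by_engine)
    shared = [u for u, ranks in comparison.items() if len(ranks) == n]
    return sorted(shared, key=lambda u: sum(comparison[u].values()) / n)

# Hypothetical ranked results for one query on two made-up engines.
results = {
    "engine_a": ["https://example.com/a", "https://example.com/b", "https://example.com/c"],
    "engine_b": ["https://example.com/b", "https://example.com/d", "https://example.com/a"],
}

for url in shared_hits(results):
    print(url, compare_results(results)[url])
```

Hits that several engines agree on float to the top, while the ones only a single engine returned stay visible in the full comparison for anyone who wants to dig further.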

