Enough Keyword Searches. Just Answer My Question.

Search engines are so powerful. And they are so pathetically weak.

When it comes to digging up a specific name, date, phrase or price, search engines are unstoppable. The same is true for details from the previously concealed past. For better and worse, any information about any of us - true or false, flattering or compromising - that has ever appeared on a publicly available site is likely to be retrievable forever, or until we run out of electricity for the server farms. Carefree use of e-mail was once a sign of sophistication. Now to trust confidential information to e-mail is to be a rube. Despite the sneering term snail mail, plain old letters are the form of long-distance communication least likely to be intercepted, misdirected, forwarded, retrieved or otherwise inspected by someone you didn't have in mind.

Yet for anything but simple keyword queries, even the best search engines are surprisingly ineffective.

Recently, for example, I was trying to track the changes in California's spending on its schools. In the 1960's, when I was in public school there, the legend was that only Connecticut spent more per student than California did. Now, the legend is that only the likes of Louisiana and Mississippi spend less. Was either belief true? When I finally called an education expert on a Monday morning, she gave me the answer off the top of her head. (Answer: right in spirit, exaggerated in detail.) But that was only after I'd wasted what seemed like hours over the weekend with normal search tools. If it sounds easy, try using keyword searches to find consistent state-by-state data covering the last 40 years.

We live with these imperfections by trying to outguess the engines - what if I put "per capita spending by states" in quotation marks? - and by realizing that they're right for some jobs and wrong for others.

One branch of the federal government is desperate enough for a better search tool that its efforts could be a stimulus for fundamental long-term improvements. Last week, I spent a day at a workshop near Washington for the Aquaint project, whose work is unclassified but has gone virtually unnoticed in the news media. The name stands for "advanced question answering for intelligence," and it refers to a joint effort by the National Security Agency, the C.I.A. and other federal intelligence organizations. To computer scientists, "question answering," or Q.A., means a form of search that does not just match keywords but also scans, parses and "understands" vast quantities of information to respond to queries. An ideal Q.A. system would let me ask, "How has California's standing among states in per-student school funds changed since the 1960's?" - and it would draw from all relevant sources to find the right answer.

In the real Aquaint program, the questions are more likely to be, "Did any potential terrorist just buy an airplane ticket?" or "How strong is the new evidence of nuclear programs in Country X?" The presentations I saw, by scientists at universities and private companies, reported progress on seven approaches to the problem. (The new I.B.M. search technology discussed here last year is also part of the Aquaint project.)

There will be more to say later about this effort. On the bright side, apart from whatever the project does for national security, its innovations could eventually improve civilian search systems, much as the Pentagon's Arpanet eventually became the civilian Internet. Of course, the dark potential in ever more effective search-and-surveillance systems is also obvious.

For the moment, consider several here-and-now innovations that can improve on the standard Google-style list of search hits. Ask Jeeves, whose site is Ask.com, recently introduced two features that enhance its long-established question-and-answer format. One tries to recast search terms into a question that can be answered on the Web; the other offers suggestions to broaden or narrow the search. Answers.com, a free version of what was once called GuruNet, combines conventional search results with questions and answers.

Two related sites, Clusty.com and its parent, Vivisimo.com, categorize the hits from each search, producing a kind of table of contents of results. Another site, Grokker.com, does something similar in a visual form; it is free online or $49 for a desktop version. And the bizarrely named but extremely useful MrSapo.com has become my favorite search portal, because it allows quick, easy comparisons of the results of the same search on virtually any major engine.

By JAMES FALLOWS.
