Keeper of Expired Web Pages Is Sued

The Internet Archive was created in 1996 as the institutional memory of the online world, storing snapshots of ever-changing Web sites and collecting other multimedia artifacts. Now the nonprofit archive is on the defensive in a legal case that represents a strange turn in the debate over copyrights in the digital age.

Beyond its utility for Internet historians, the Web page database, searchable with a form called the Wayback Machine, is also routinely used by intellectual property lawyers to help learn, for example, when and how a trademark might have been historically used or violated.

That is what brought the Philadelphia law firm of Harding Earley Follmer & Frailey to the Wayback Machine two years ago. The firm was defending Health Advocate, a company in suburban Philadelphia that helps patients resolve health care and insurance disputes, against a trademark action brought by a similarly named competitor.

In preparing the case, representatives of Harding Earley used the Wayback Machine to turn up old Web pages - some dating to 1999 - originally posted by the plaintiff, Healthcare Advocates of Philadelphia.

Last week Healthcare Advocates sued both the Harding Earley firm and the Internet Archive, saying the access to its old Web pages, stored in the Internet Archive's database, was unauthorized and illegal.

The lawsuit, filed in Federal District Court in Philadelphia, seeks unspecified damages for copyright infringement and violations of two federal laws: the Digital Millennium Copyright Act and the Computer Fraud and Abuse Act.

"The firm at issue professes to be expert in Internet law and intellectual property law," said Scott S. Christie, a lawyer at the Newark firm of McCarter & English, which is representing Healthcare Advocates. "You would think, of anyone, they would know better."

But John Earley, a member of the firm being sued, said he was not surprised by the action, because Healthcare Advocates had tried to add similar charges to its original suit against Health Advocate, but the judge denied the motion. Mr. Earley called the action baseless, adding: "It's a rather strange one, too, because Wayback is used every day in trademark law. It's a common tool."

The Internet Archive uses Web-crawling "bot" programs to make copies of publicly accessible sites on a periodic, automated basis. Those copies are then stored on the archive's servers for later recall using the Wayback Machine.

The archive's repository now has approximately one petabyte - roughly one million gigabytes - worth of historical Web site content, much of which would have been lost as Web site owners deleted, changed and otherwise updated their sites.

The suit contends, however, that representatives of Harding Earley should not have been able to view the old Healthcare Advocates Web pages - even though they now reside on the archive's servers - because the company, shortly after filing its suit against Health Advocate, had placed a text file on its own servers designed to tell the Wayback Machine to block public access to the historical versions of the site.

Under popular Web convention, such a file - known as robots.txt - dictates what parts of a site can be examined for indexing in search engines or storage in archives.

Most search engines program their Web crawlers to recognize a robots.txt file, and follow its commands. The Internet Archive goes a step further, allowing Web site administrators to use the robots.txt file to control the archiving of current content, as well as block access to any older versions already stored in the archive's database before a robots.txt file was put in place.
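As a rough illustration of how a cooperating crawler consults such a file, the sketch below uses the `urllib.robotparser` module from Python's standard library. The file contents, the `example.com` URLs, the `ExampleBot` name and the `is_allowed` helper are hypothetical, and `ia_archiver` is assumed here as the name the archive's crawler answers to; the article does not reproduce the file Healthcare Advocates actually used.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut one archive crawler out of the whole site,
# and keep every other robot out of /private/ only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a cooperating robot named user_agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "ExampleBot"):
        for url in ("http://example.com/", "http://example.com/private/page.html"):
            print(agent, url, is_allowed(agent, url))
```

The check runs inside the robot itself: nothing on the Web server enforces the answer, so the file works only for crawlers that choose to ask.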

But on at least two dates in July 2003, the suit states, Web logs at Healthcare Advocates indicated that someone at Harding Earley, using the Wayback Machine, made hundreds of rapid-fire requests for the old versions of the Web site. In most cases, the robots.txt file blocked the request. But in 92 instances, the suit states, it appears to have failed, allowing access to the archived pages.

In so doing, the suit claims, the law firm violated the Digital Millennium Copyright Act, which prohibits the circumventing of "technological measures" designed to protect copyrighted materials. The suit further contends that among other violations, the firm violated copyright by gathering, storing and transmitting the archived pages as part of the earlier trademark litigation.

The Internet Archive, meanwhile, is accused of breach of contract and fiduciary duty, negligence and other charges for failing to honor the robots.txt file and allowing the archived pages to be viewed.

Brewster Kahle, the director and a founder of the Internet Archive, was unavailable for comment, and no one at the archive was willing to talk about the case - although Beatrice Murch, Mr. Kahle's assistant and a development coordinator, said the organization had not yet been formally served with the suit.

Mr. Earley, the lawyer whose firm is named along with the archive, said, however, that no breach was ever made. "We wouldn't know how to, in effect, bypass a block," he said.

Even if they had, it is unclear that any laws would have been broken.

"First of all, robots.txt is a voluntary mechanism," said Martijn Koster, a Dutch software engineer and the author of a comprehensive tutorial on the robots.txt convention (robotstxt.org). "It is designed to let Web site owners communicate their wishes to cooperating robots. Robots can ignore robots.txt."

William F. Patry, an intellectual property lawyer with Thelen Reid & Priest in New York and a former Congressional copyright counsel, said that violations of the copyright act and other statutes would be extremely hard to prove in this case.

He said that the robots.txt file is part of an entirely voluntary system, and that no real contract exists between the nonprofit Internet Archive and any of the historical Web sites it preserves.

"The archive here, they were being the good guys," Mr. Patry said, referring to the archive's recognition of robots.txt commands. "They didn't have to do that."

Mr. Patry also noted that despite Healthcare Advocates' desire to prevent people from seeing its old pages now, the archived pages were once posted openly by the company. He asserted that gathering them as part of fending off a lawsuit fell well within the bounds of fair use.

Whatever the circumstances behind the access, Mr. Patry said, the sole result "is that information that they had formerly made publicly available didn't stay hidden."

By TOM ZELLER Jr.
The New York Times
