Google Books & WashU Libraries

On October 16, a U.S. federal appeals court concluded that creation of the world’s most comprehensive index of full-text books constituted fair use.

Commentaries abound, and it’s been several news cycles since the opinion’s release. But I wanted to share what I think the Google Books decision means for University Libraries, and connect this to a few legal developments more generally.

Short history: The Library Project began in 2004 (though Larry Page dreamt of “super librarians” before then). Research libraries shipped books to Mountain View to be scanned, OCR’d, indexed, and returned along with a digital copy. The compiled corpus is 20+ million books. The service offers full-text search, but you can only view snippets. Forms of text mining can be conducted through the Ngram research tool. No advertising is displayed to users. The company didn’t seek permission, though several proposed settlements were rejected along the way. The Authors Guild sued in 2005, kicking off ten years of litigation.

How is this relevant?

Copyright is central to core library functions. Fair use is essential to the academic enterprise. We’re involved in digitization activities as an organization, and we’re invested in opening new and constructive possibilities to engage in socially beneficial uses of our collections. So when a court issues a ruling on fair use—and the contested activities are in part similar to those in which we’re engaged—the reasoning applied provides practical guidance on structuring projects to maximize access and minimize risk.

Risk matters because copyright law provides statutory damages that let plaintiffs recover money ($750 to $30,000 per work infringed, and up to $150,000 per work for willful infringement) regardless of actual harm suffered. This is of more consequence to private institutions than public universities because statutory damages are unavailable in infringement suits brought against state employees (sovereign immunity).

When addressing copyright issues at WashU, we stress the importance of making well-reasoned decisions—documented ex ante—based on good faith application of the law. We do this not only because we’re a private university and want to present well in the unlikely event we’re in court, but also because there are statutory exceptions that specifically facilitate the creation of digital collections of copyrighted materials. Most relevantly:

  • §504(c)(2), which provides for remission of statutory damages for acts of infringement by employees of nonprofit educational institutions, libraries, or archives who had a reasonable belief that reproduction (though not necessarily other acts like distributing or performing publicly) of the work was fair use; and
  • §107 aka fair use, which allows for a limited privilege in those other than the owner of a copyright to use the copyrighted material in a reasonable manner without her consent.

As the Google Books opinion begins: “This copyright dispute tests the boundaries of fair use.” Though the four factors you are required to apply when evaluating fair use were codified into law in 1976, the doctrine is shaped by courts adjudicating disputes between discrete parties.

This is a strength and a weakness. On one hand it affords flexibility to adapt to changing circumstances and technology (trepidation about digitization has certainly softened since this litigation began). On the other hand, it informs the expression “hard cases make bad law.”

A tiny fraction of allegedly infringing activities leads to lawsuits. And a fraction of those are litigated to resolution. Even then, any decision is limited to the facts and parties of that case. The result is a dearth of controlling law and imperfect instruction for prospective users. This speaks to the importance of capturing local norms and conditions in “best practices” documents (like these).

Courts do seek to set rules: precedent applicable to factually similar situations in the future. The broad “rule” announced here is an affirmation of the fundamental means-end calculus on which our entire IP regime is based: that the law’s ultimate purpose is to enrich public knowledge, the creation of a public good by means of private enterprise. This is a generality, and may seem a bit nebulous. But it’s absolutely central to any organization that commits to “facilitate the creation, analysis and curation of knowledge and data” and “support and enrich users’ teaching, learning, and research.”

Here are conclusions on the case drawn by commentators of note:

Dan Cohen: “Authors Guild v. Google stands to make fair use much more muscular. Because many institutions want to avoid legal and financial risk, many possible uses that the courts would find fair—including a number of non-commercial, educational uses—are simply never attempted. A clearer fair-use principle, with stronger support from the courts, will make libraries and similar organizations more confident about pursuing forms of broader digital access.”

Kenny Crews: “Ultimately fair use is based on the factors in the statute, and the exercise is based on careful planning. On a smaller scale, [the decision] opens up some ability for universities and libraries to engage in socially beneficial uses of their collections, especially digitizing for preservation and protection of scarcer works.”

Kevin Smith: “[T]ransformation is an answer to the question of how a borrowing from a copyrighted work can be justified. The court, on behalf of a rights holder, asks a user ‘why did you do this?’ When the answer to that question is ‘because I wanted to make a new contribution to knowledge,’ that is a transformative purpose. And, by definition, it is a purpose that benefits the public, which justifies whatever minor loss a rights holder might suffer from the use. The second step in Judge Leval’s analysis, asking if the new use is a market substitute for the original, ensures that that loss is not so great as to outweigh the benefit. Thus we have a coherent analysis that recognizes the public purpose of copyright and still respects its chosen method for accomplishing that purpose.”

Pamela Samuelson: “[T]he appellate court’s decision establishes a precedent that provides a meaningful shield for individual scholars, as well as archives, libraries, and historical societies that serve scholarly research communities, when they undertake mass-digitization projects for similar purposes. Scholars and nonprofit institutions that service scholarly communities have mounds of materials they would like to digitize and make more accessible. Risks of copyright-infringement lawsuits have sometimes deterred socially valuable digitization efforts. Google’s win in the Authors Guild case reduces this risk significantly (especially since the court held that the guild cannot bring claims of copyright infringement against anyone except with regard to works whose copyrights it owns).”

Jane Ginsburg: “[T]here is a powerful argument that exploiting a work for its non-expressive information (bibliographic or bean-counting—how many times and in what works a given word or phrase appears) is not even prima-facie infringing, and that the digitization of lawfully–possessed copies (loaned from the University of Michigan library) to create a database that enables nonexpressive, but progress-of-knowledge-enhancing outputs must therefore be equally free. By contrast, the snippet views did convey limited amounts of expression, but the court repeatedly emphasized the very constrained and controlled, ‘fragmentary and scattered,’ ‘cumbersome, disjointed, and incomplete nature of the aggregation of snippets made available through snippet view.’ As a result, ‘at least as presently structured by Google, the snippet view does not reveal matter that offers the marketplace a significantly competing substitute for the copyrighted work.’ The court appears to be endeavoring to avoid slippery-slope expansion of the content or presentation of fair use-permissible snippets.”

Mark Seeley: “Bottom line is that although the decision could be read narrowly (specific and unusual fact circumstances, possible appeal, perhaps differences in approach among the circuits), there’s no question that this is a very influential jurist (Leval) and a very influential court, and therefore very impactful.  I think it makes it very difficult to articulate what ‘fair use’ actually means in practice.”

It’s worth noting that none of our federal courts in Missouri are required to follow decisions from other jurisdictions (the Supreme Court aside). But another circuit’s reasoning can be persuasive, particularly when it comes from an opinion that is:

  • recent,
  • written by a judge with a good reputation,
  • issued as part of a majority decision,
  • by a court of high appellate authority, and
  • from a jurisdiction known for expertise in the field.

Judge Pierre Leval, author of the opinion in Authors Guild v. Google, is an influential fair use scholar and the father of “transformativeness.” The Second Circuit Court of Appeals has a lengthy copyright docket, thanks to its seat in the world’s media capital, and its intellectual property decisions are frequently given deference (same for the Ninth Circuit with its Hollywood bailiwick).

So though we’re not Google (whew) and our projects aren’t like Google Books (we actually care about metadata), the decision lets us see more clearly how local initiatives that are educational, non-commercial, and of smaller scale fit within a framework for digitization activities now deemed legal elsewhere. It gives guidance on a framework for making responsible decisions based on good faith application of the law, and insight into strategies to safeguard rightsholders before setting policies, writing grants, or entering agreements.


Turning back to hard cases making bad law, it can further be said that infringement suits are rarely only about the contested activity. Google is without equal. There are well-justified reasons to be suspicious of its intentions (though Alphabet/Google lost interest in Books a while ago). The Authors Guild’s dogged (many say foolish, and certainly expensive) insistence on continuing the case in the face of fierce headwinds reflects a deeper grievance, in part based on principle. This wariness is arguably well-founded (“Don’t Be Evil” like “Don’t think of pink elephants” in ironic process theory?!). But to quote Christopher Hitchens: “[I]f your opponent thought he had identified your lowest possible motive, he was quite certain that he had isolated the only real one.”

I also don’t mean to overstate the importance of good faith. Fair use is a right codified by law, not something “earned by good works or clean morals.” [1] Copyright isn’t about virtue. Libraries continue to have latitude to digitize for preservation, but we’re hamstrung by the range of access to the materials that we’re legally authorized to provide. The Court in Google Books does make clear that you needn’t have a fully articulable research purpose at a project’s outset in order for your use to be fair. Any successful large-scale effort to allow public users to read substantial portions of digitized works, though, will almost certainly need to come from a centralized, federal program (à la Europeana or the Bookshelf project in Norway). Perhaps the perfect project for our next Librarian of Congress??

[1] NXIVM Corp. v. Ross Inst., 364 F.3d 471, 485 (2d Cir. 2004) (Jacobs, J., concurring).

About the author

Please contact Shannon Davis at: