The University of California’s secret agreement with Google for book digitization promises to improve access to parts of its library collections, but the contractual restrictions UC has accepted may enrich Google’s shareholders at public expense.
Digitizing the world’s books, films, video, sound recordings, maps, and other cultural artifacts could, to quote Internet Archive founder Brewster Kahle, provide “universal access to all human knowledge, within our lifetime.” So it’s troubling to see public institutions transfer cultural assets, accumulated with public funds, into private hands without disclosing the terms of the transaction.
Basic principles to govern mass digitization and safeguard the public interest have been developed by members of the American Library Association (forthcoming; see also http://litablog.org/?p=200) and by the Open Content Alliance. UC even signed on to the OCA principles (disclosure: I’ve worked for the OCA), which are designed to provide a baseline for digitization projects, in its scanning agreements with Yahoo and Microsoft. Transparency is a primary value for both the OCA and the ALA.
So problem one is that the terms of the UC / Google agreement are secret, and were arrived at with no public input. As an institution that receives state and federal funding, UC should expect and welcome public comment if its inventory is effectively being privatized. The president’s office says it expects that terms will only come out after it receives the equivalent of a FOIA request. Since when does it take a FOIA request to get information from the library?
But it isn’t just the public that is excluded; it’s the rest of the library community. Mass digitization is very complex (see Paul Courant’s brilliant new article in First Monday). Librarians must grapple with new and unfamiliar issues that can only be resolved through dialog with peers. Google appears to be doing all it can to prevent this from happening, imposing NDAs on libraries at the start of discussions about mass digitization. By isolating librarians from each other, Google dramatically strengthens its negotiating position, and UC negates the goal of academic openness.
The second problem is more complex. Mass digitization is expensive. Public institutions that wish to digitize their holdings usually need to partner with private firms to get the work done. As described in Marketing Culture in the Digital Age, funded by the Mellon Foundation and written by my colleague Peter Kaufman of Intelligent Television, commercial investment in digitization can be good for all concerned.
But private companies, at least profitable ones like Google, don’t work for free. So the public institutions need to pay for those services. Typically, they can’t pay in cash, so they pay in other ways, with labor, facilities, and some type of rights agreement. In other words, public use of and access to the digitized cultural works is usually limited in some way to benefit the private firm. This has to be done in the open.
The recent Smithsonian/Showtime agreement is a case in point that clearly shows what can go wrong in such a process. To recap, Showtime convinced the Smithsonian to sign a secret 170-page, 30-year agreement that gives Showtime control of the Smithsonian’s film and video archive. This particular saga has been widely covered elsewhere, but the roots of the catastrophe are in 1) secret negotiations, 2) exclusivity, and 3) length of term.
UC’s agreement is probably not explicitly exclusive. But as a practical matter, scanning doesn’t happen twice; libraries learned this when their material was microfilmed (as an aside, the microfilming was sometimes done badly, and to this day microfilm users suffer from those quality problems). This deal will be costly for UC in staff time and other resources, and the chances that another vendor will come through and duplicate the work are slim.
In the absence of the text of the agreement, it’s difficult to know what specific clauses may affect the ability of California citizens to read online the books now in their libraries. But there is a plausible nightmare scenario that UC needs to act now to prevent.
From the University of Michigan agreement (obtained only as a result of public records laws in Michigan, and despite Google’s best efforts) it is clear there will be restrictions on what UC can do with the digital scans. This is a critical issue. If this deal follows the pattern at Michigan, there will be limits determined by Google on how UC may share its digital holdings with other libraries.
If the scanning process is made efficient at all the universities now in Google’s orbit, a book already scanned at Harvard won’t be rescanned at Berkeley. So Berkeley may not receive a copy, and because of the restrictions on sharing its holdings, won’t have an easy time getting one from Harvard. The student of 2012 will have a choice: go to the complete digital library, owned by Google, or go to the partial digital library of his or her own university.
That extreme scenario may not come to pass, but there are many other questions about the Google / UC deal:
* What more might UC be able to do if its scanning project were funded by the legislature or foundations, rather than by Google?
* UC says the “digitized books will be searchable through Google Book Search.” Can anyone else build services that access this data? Or is it another case of “Google can crawl everyone else’s data, no one can crawl Google’s data”?
* What quality assurances will Google provide? How can we ensure this won’t be a repeat of the microfilm experience?
* Will UC have copies of the full, high-quality scans, or will certain information, such as image positioning data needed for searching, be kept by Google alone?
* What restrictions will be placed on UC’s use of those scans?
* What will be the different treatments for material in copyright, or orphaned, or in the public domain?
* Is it reasonable to ask the public to pay a second time (or watch ads) for material already purchased, simply because it’s now necessary to convert the format in which it is stored?
* Why haven’t the Regents appointed a panel of advisors on this matter?
Clearly, UC’s high-level goals are laudable. The Google people I’ve met believe in the company motto, “don’t be evil.” And it is not really in the public interest to side with the publishers who are the loudest voices now attacking Google, and a primary cause of all the secrecy. Yet by acquiescing to Google’s demands for secrecy, UC has compromised the public interest and set a dangerous precedent for the rest of the academic community.