Sunday, May 29, 2011

Google, Newspaper Archives, and the Business of Cultural Heritage

Google announced this month that it is ending its ambitious project to digitally archive newspapers. The project to scan the archives of the nation’s newspapers and make them available online as a searchable historical record was announced in 2008 with a level of hubris found only in online enterprises.

“Our objective is to bring all the world's historical newspaper information online,” said Adam Smith, director of product management at Google, announcing the project. Those lofty aims were echoed by Punit Soni, manager of the newspaper initiative: “As we work with more and more publishers, we'll move closer towards our goal of making those billions of pages of newsprint from around the world searchable, discoverable, and accessible online…. Over time, as we scan more articles and our index grows, we'll also start blending these archives into our main search results so that when you search, you'll be searching the full text of these newspapers as well.”

After scanning about 60 million pages and beginning to make them available as full-page shots (because the costs of disaggregating and indexing were too high and copyright clearances were difficult to obtain for older material), the company announced that it will quit scanning pages but continue offering the existing pages on its Google News Archive site. It said it would not invest any new effort to improve indexing or add tools to better search and manage the archive.

The project may have been well-intentioned, but it was not well thought out. It was a free service designed to use the search traffic at the site to raise revenue through advertising Google would put on the site. The scale of the project was enormous and required finding, scanning, and indexing thousands of daily and weekly newspapers, many no longer in existence. It would require a long-term commitment of funds, personnel, and server capacity to catalogue and scan the material and to provide and maintain search functions. The project ultimately incorporated only a fraction of the papers it had hoped to scan, did so spottily in many cases, and its usability was poor because it never mastered the problems of handling so much content. Worse yet, it discovered that history was not a money-making business.

The exit announcement is not a surprise. It is another sign that players in the virtual world are no longer deluding themselves that they are replacing the entire world and that the laws of economics and finance do not apply to them.

As laudable as the preservation of newspaper archives might be, expecting it to be completed and maintained by a commercial firm defied sense and historical experience. For centuries, the most important historical records, books, and art have been maintained in governmentally and charitably funded collections because commercial enterprises were unwilling either to bear the costs or to allow the large-scale efforts required to preserve, catalogue, index, and make available cultural heritage materials to distract them from their business activities.

Why would anyone expect Google to act otherwise?

As Google increasingly acts as a mature business, it will increasingly shed activities that were launched as goodwill gestures because the costs of their operation reduce the company’s financial performance and will diminish the value of its stock compared to other tech firms. Over time it will be harder for the firm to maintain the stance that it is not self-interested and is motivated only by the opportunity to improve the lives of the public by providing access to all the world’s information.

The tentacles of its operations that have reached out into so many fields will increasingly be pulled back if they do not yield financial results. And fears that Google will rule the world will diminish. Google, Microsoft, Amazon, and other big players of the digital world all have limits, just as did the handful of firms that once controlled steel, oil, and shipping through cartels. At some point even mammoth, wealthy companies do not have the resources and capabilities to keep expanding endlessly, and their performance declines, leading shareholders to rein them in and competitors to find opportunities.