Yahoo, Archive to scan books

Meantime, Katie Hafner in Monday's NY Times reports that a group of edu, org, com and gov organizations is announcing an Open Content system in the tradition of Open Source software.
An unusual alliance of corporations, nonprofit groups and universities plans to announce today an ambitious plan to digitize hundreds of thousands of books over the next several years and put them on the Internet, with the full text accessible to anyone.
The effort is being led by Yahoo, which appears to be taking direct aim at a similar project announced by its archrival, Google, whose own program to create searchable digital copies of entire collections at leading research libraries has run into a series of challenges since it was announced nine months ago.
The new project, called the Open Content Alliance, has the wide-ranging goal of digitizing historical works of fiction along with specialized technical papers. In addition to Yahoo, its members include the Internet Archive, the University of California, and the University of Toronto, as well as the National Archive in England and others.
The digitization of print materials has been a continual effort on the part of various research libraries for the last several years. But the potential power of the new collaboration lies in the collective ability of many institutions to compare and cross-reference materials, said Daniel Greenstein, librarian for the California Digital Library at the University of California.
"This is the kind of platform we've been looking for for a long time," said Dr. Greenstein. "Libraries digitize their stuff and put it up, but none of the libraries have comprehensive collections of everything. Now we can say: 'We have this particular edition of Mark Twain, but it's not as good as that one over there,' and we add it to the collection."
The Library of Congress, for instance, has one of the largest library collections in the world, but even that collection is incomplete. "It's all about gap-filling and collection development," said Dr. Greenstein. ...
In a departure from Google's approach, the Open Content Alliance will also make the books accessible to any search engine, including Google's. (Under Google's program, a digitized book would show up only through a Google search.) And by focusing at first on works that are in the public domain - such as thousands of volumes of early American fiction - the group is sidestepping the tricky question of copyright violation. ...
When it comes to copyrighted materials, the newly formed group appears to be taking a more cautious approach by seeking permission from copyright holders and by making works available through a Creative Commons license, whereby the copyright holder stipulates how a work can be used.
"Other projects talk about snippets," said Brewster Kahle, the founder of the Internet Archive, a nonprofit organization in San Francisco that is building a vast digital library. "We don't talk about snippets. We talk about books." ...
The new group is calling for others to join. And Mr. Kahle of the Internet Archive said he hoped to recruit Google.
"The thing I want to have happen out of all this is have Google join in," he said. "I know we're dealing with archcompetitors, but if there's room for these guys to bend, by the time my kid goes to college, we could have a library system that is just astonishing."
Congrats, Brewster and colleagues. An important step forward.
October 2, 2005 at 11:26 PM in Books