
Google launches Body Browser, language database

If you can't sell the books, you might as well catalog a few billion words or map the human body in 3D, right?

Google launched two major efforts on Thursday, both with a research and science angle. Google Body Browser showcases HTML 5 by presenting detailed, layered 3D views of the human body. PC Magazine calls it a "Google Earth-like experience for the human body." The company also launched its Ngram Viewer, which employs a variety of under-the-hood Google technologies to track word usage trends across all of the books scanned through its books project.

The Body Browser will only work on a select group of web browsers that support WebGL, a 3D rendering technology that draws into the HTML 5 canvas element within the browser. The beta of Chrome will get you there, as will the beta versions and nightly builds of Safari and Firefox. The Chrome beta is generally quite stable and is worth the download for a look at both the technology and the Body Browser.

The picture above comes from the nervous system layer with labels turned on. Zoom and pan controls are identical to those from Google Maps and Google Earth, and the system is both intuitive and graphically impressive. It's still in Google Labs at this point, so a variety of enhancements can be expected. Although many Labs projects die on the vine, showcases for HTML 5, the web standard that Google is pushing over Apple's native-app approach, are likely to see ongoing support.

While Google's book scanning project has been fraught with legal wrangling, the company, in cooperation with many universities, has scanned millions of books. Now, bypassing the legal controversies, Google and Harvard have created a tool to examine only the words in those books, organized by time and area of interest. According to Scientific American,

A team from Harvard has teamed up with Google to crack the spines of 5,195,769 digitized books that span five centuries of the printed word with the hopes of giving the humanities a more quantitative research tool.

Researchers published a paper in Science, entitled "Quantitative Analysis of Culture Using Millions of Digitized Books." The abstract, though somewhat technical, gives a fascinating picture of what the database can provide:

We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of "culturomics", focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. "Culturomics" extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.
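
For a rough feel of the kind of trend tracking the abstract describes, here is a minimal Python sketch. It is not Google's or Harvard's actual method. It assumes a corpus given as (year, text) pairs and reports, for each year, the share of that year's n-grams that match a phrase. The names ngrams, yearly_frequency, and toy_corpus are hypothetical, and the tokenizer is a plain whitespace split that ignores punctuation.

```python
from collections import Counter


def ngrams(text, n=1):
    """Yield lowercase n-grams (space-joined) from whitespace-split text."""
    tokens = text.lower().split()
    for i in range(len(tokens) - n + 1):
        yield " ".join(tokens[i:i + n])


def yearly_frequency(corpus, phrase):
    """Return {year: share of that year's n-grams equal to `phrase`}.

    `corpus` is an iterable of (year, text) pairs, a hypothetical
    stand-in for the scanned books.
    """
    target = " ".join(phrase.lower().split())
    n = len(target.split())
    hits = Counter()    # matches of the phrase per year
    totals = Counter()  # all n-grams seen per year
    for year, text in corpus:
        for gram in ngrams(text, n):
            totals[year] += 1
            if gram == target:
                hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Toy data, invented purely for illustration.
toy_corpus = [
    (1900, "the telephone rang and the telegraph clicked"),
    (1950, "the telephone rang and the television hummed"),
    (1950, "the television was on all evening"),
]

if __name__ == "__main__":
    print(yearly_frequency(toy_corpus, "television"))
```

Dividing by each year's total matters because far more books were printed in later years, so raw counts alone would make almost every word look like it was on the rise.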

Have to make use of all those scanned and digitized pages somehow while they sort out the copyright issues around making such a vast library available to the public, right?