Google Tool Measures Phrase Popularity Over Time

As Google is trying to gain traction selling books, a free software tool that helps scholars analyse what words and phrases were popular several centuries ago is wowing researchers and media.

Google on 16 December launched Google Books Ngram Viewer, a data visualisation tool that crawls 500 billion words culled from 5.2 million books published between 1500 and 2008 that Google has indexed in its cloud computing system.

Word graphs

Users can type up to five words or phrases into the tool to see a typical Google graph charting how often each was used, year by year, over the last several hundred years. The words come from books published in Chinese, English, French, German, Russian and Spanish.

Google in one example shows how the tool can compare instances of musical instruments in English literature from 1750 to 2008. Note how the drum and trumpet, in particular, seemed to trade places in popularity over the last two hundred years.
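
The year-by-year counting behind such a graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy corpus, the function name `yearly_frequency`, and the choice to divide each count by the total number of words published that year are hypothetical, and nothing here reflects Google's actual implementation or the format of its datasets.

```python
from collections import Counter, defaultdict


def yearly_frequency(corpus, terms):
    """Return {term: {year: share of that year's words}} for single-word terms.

    Assumption: `corpus` is an iterable of (year, text) pairs and words are
    split on whitespace only (no punctuation handling, no multi-word phrases).
    """
    totals = Counter()           # total words seen per year
    hits = defaultdict(Counter)  # hits[term][year] = number of occurrences
    wanted = {t.lower() for t in terms}
    for year, text in corpus:
        words = text.lower().split()
        totals[year] += len(words)
        for word in words:
            if word in wanted:
                hits[word][year] += 1
    # Normalise by yearly totals so years with more text do not dominate.
    return {
        term: {year: hits[term][year] / totals[year] for year in sorted(totals)}
        for term in sorted(wanted)
    }


# Hypothetical toy corpus: (publication year, text) pairs.
corpus = [
    (1800, "the trumpet sounded and the trumpet answered"),
    (1800, "a drum was heard"),
    (1900, "the drum and the drum again"),
]
result = yearly_frequency(corpus, ["drum", "trumpet"])
for term, by_year in result.items():
    print(term, by_year)
```

Plotting each term's yearly shares on one chart would give a comparison in the spirit of the drum and trumpet example, though on real data the counts would come from millions of books rather than three sentences.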

While any bystander with a computer may access the tool, it is largely geared to help scholars and researchers studying philosophy, pop culture, religion, politics, art and language to conduct their research. Google said it is also making the datasets supporting the Ngram Viewer freely downloadable so that scholars can replicate the work.

The datasets were used in a research project led by Harvard University’s Jean-Baptiste Michel and Erez Lieberman Aiden, along with several Googlers, said Jon Orwant, Google Books engineering manager.

“Their work provides several examples of how quantitative methods can provide insights into topics as diverse as the spread of innovations, the effects of youth and profession on fame, and trends in censorship,” Orwant said.

Unlike most Google Labs projects, Ngram Viewer has piqued media curiosity.

The New York Times and Wall Street Journal spotlighted it, while top tech blogs such as ReadWriteWeb dedicated not one but two positive posts to it.

15 million works

The datasets in Ngram Viewer cover only about one-third of the 15 million works Google has scanned since 2004 as part of its Google Books project.

High-octane math and algorithms aside, the work has been complicated by contentious battles over copyright, particularly related to orphan works, where rights holders are deceased or cannot be found.

The court system is still sussing out this matter, though Google this month went ahead and launched its eBookstore to compete with Amazon, Apple and Barnes & Noble in selling books online.

Clint Boulton eWEEK USA 2012. Ziff Davis Enterprise Inc. All Rights Reserved
