Google Takes Dataset Search Out Of Beta

Google has brought its dataset search tool out of the beta-testing phase, while adding new features.

Google Dataset Search was originally released in September 2018 to try to make datasets more accessible to researchers.

According to the search company, large amounts of such data are published online by organisations including universities, governments and labs, but they can be difficult to find via standard searches.

Along with the search tool, Google also released a set of open metadata tags, urging publishers to add them to pages containing datasets to make the information easier for search engines to index.
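As a rough illustration of what such tagging can look like, the sketch below builds a dataset description using the schema.org Dataset vocabulary, which Google's Dataset Search documentation is based on, and wraps it in the JSON-LD script tag a publisher would embed in a page. The function names, the example dataset and the example.org URL are hypothetical, chosen only for illustration; they are not taken from Google's announcement.

```python
import json


def dataset_jsonld(name, description, url, license_url=None, keywords=()):
    """Build a schema.org Dataset description as a plain dict (hypothetical helper)."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    # Optional properties are only included when supplied.
    if license_url:
        record["license"] = license_url
    if keywords:
        record["keywords"] = list(keywords)
    return record


def as_script_tag(record):
    """Wrap the record in the script tag used to embed JSON-LD in an HTML page."""
    body = json.dumps(record, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Illustrative values only; the dataset and URL below are made up.
example = dataset_jsonld(
    name="Penguin population counts",
    description="Annual counts of penguin colonies (illustrative example).",
    url="https://example.org/datasets/penguins",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["penguins", "population"],
)

print(as_script_tag(example))
```

A publisher would paste the printed tag into the HTML of the page that hosts the dataset, so that a crawler can read the description alongside the human-readable content.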

A Google data centre in Oklahoma. Image credit: Google

Metadata framework

Google’s tool has now indexed some 25 million datasets, in areas ranging from penguin populations to volcanic eruptions to medical data.

The information can be used for purposes such as testing hypotheses or training AI algorithms.

Casual users can also use Google’s dataset search to find information related to their interests, such as a list of the fastest skiers.

Google said hundreds of thousands of users have tried Dataset Search since its launch, and that the reaction from the scientific community was positive overall.

The journal Nature, for instance, has begun requiring that data sharing take place with the proper metadata, said Natasha Noy, research scientist at Google Research.

New search features include the ability to filter data by type, such as tables, images or text, as well as whether the data is free to use and the geographic area covered.

Data discovery

The search engine is now available to use on mobile devices and has expanded dataset descriptions.

The biggest areas currently indexed include geosciences, biology and agriculture, with the most common queries being “education”, “weather”, “cancer”, “crime”, “soccer”… and “dogs”.

The US is the leader in open government dataset publishing, making more than 2 million available online.

Noy said Google is planning to continue releasing further updates to the search engine now that the beta-testing period has ended.

The company said its ultimate goal is to “help foster an ecosystem” for publishing, discovering and using datasets.

Matthew Broersma

Matt Broersma is a long-standing tech freelance who has worked for Ziff-Davis, ZDNet and other leading publications.
