Meta Platforms has revealed the data source it used to train its Meta AI chatbot, which it unveiled at this week’s developer conference, ‘Meta Connect 2023’.

At the conference on Wednesday, CEO Mark Zuckerberg announced a new AI assistant; AI stickers; AI chatbots infused with celebrity personalities; the Meta Quest 3 mixed reality headset; and a new generation of Ray-Ban Meta smart glasses.

But now a senior executive at Meta has told Reuters in an interview that the organisation used people’s public Facebook and Instagram posts to train its new chatbot, dubbed “Meta AI”.


Public data

Meta AI is still in beta and can be used across Meta’s messaging platforms, and while it was trained on public Facebook and Instagram posts (both text and photos), the firm excluded private posts shared only with family and friends in an effort to respect consumers’ privacy.

This is according to Meta President of Global Affairs Nick Clegg, speaking on the sidelines of the company’s annual Connect conference this week.

Clegg also told Reuters that in addition to not using private posts, Meta did not use private chats on its messaging services as training data for the model, and took steps to filter private details from public datasets used for training.

“We’ve tried to exclude datasets that have a heavy preponderance of personal information,” Clegg told Reuters, adding that the “vast majority” of the data used by Meta for training was publicly available.

He cited LinkedIn as an example of a website whose content Meta deliberately chose not to use because of privacy concerns.

Clegg’s comments come as tech companies including Meta, OpenAI and Alphabet’s Google have been criticised for using information scraped from the internet without permission to train their AI models, which ingest massive amounts of data in order to summarise information and generate imagery.

Tech companies are currently weighing how to handle the private or copyrighted materials that are used in that process and that their AI systems may reproduce. This comes as many book authors have filed lawsuits against these firms, accusing them of copyright infringement.

Meta Connect 2023

The Meta AI chatbot was the most significant product among the company’s first consumer-facing AI tools, unveiled by CEO Mark Zuckerberg on Wednesday at Meta’s annual developer conference.

Meta made the assistant using a custom model based on the Llama 2 large language model that the company released for public commercial use in July, as well as a new model called Emu that generates images in response to text prompts.

Meta AI can give users real-time information and generate photorealistic images from text prompts in seconds to share with friends.

Users can also ask Meta AI questions in chat, whether “to settle arguments” or for anything else they want to know.

Meta AI will be able to generate text, audio and imagery and will have access to real-time information via a partnership with Microsoft’s Bing search engine.

Tom Jowitt

Tom Jowitt is a leading British tech freelancer and long-standing contributor to Silicon UK. He is also a bit of a Lord of the Rings nut...
