Meta Platforms has revealed the data source it used to train its Meta AI chatbot, which it unveiled at this week’s developer conference, ‘Meta Connect 2023’.

At the conference on Wednesday, CEO Mark Zuckerberg announced a new AI assistant; AI stickers; AI chatbots infused with celebrity personalities; the Meta Quest 3 mixed reality headset; and a new generation of Ray-Ban Meta smart glasses.

But now a senior executive at Meta has told Reuters in an interview that the organisation used people’s public Facebook and Instagram posts to train its new chatbot, dubbed “Meta AI”.

Public data

Meta AI is still in beta and can be used across Meta’s messaging platforms. While it was trained on public Facebook and Instagram posts (both text and photos), the firm excluded private posts shared only with family and friends in an effort to respect consumers’ privacy.

This is according to Meta President of Global Affairs Nick Clegg, speaking on the sidelines of the company’s annual Connect conference this week.

Clegg also told Reuters that, in addition to not using private posts, Meta did not use private chats on its messaging services as training data for the model, and took steps to filter private details from public datasets used for training.

“We’ve tried to exclude datasets that have a heavy preponderance of personal information,” Clegg told Reuters, adding that the “vast majority” of the data used by Meta for training was publicly available.

He cited LinkedIn as an example of a website whose content Meta deliberately chose not to use because of privacy concerns.

Clegg’s comments come as tech companies including Meta, OpenAI and Alphabet’s Google have been criticised for using information scraped from the internet without permission to train their AI models, which ingest massive amounts of data in order to summarise information and generate imagery.

Tech companies are currently weighing how to handle the private or copyrighted material that is used in that process and that their AI systems may reproduce. This comes as many book authors have filed lawsuits against these firms, accusing them of copyright infringement.

Meta Connect 2023

The Meta AI chatbot was the most significant of the company’s first consumer-facing AI tools, which CEO Mark Zuckerberg unveiled on Wednesday at Meta’s annual developer conference.

Meta made the assistant using a custom model based on the Llama 2 large language model that the company released for public commercial use in July, as well as a new model called Emu that generates images in response to text prompts.

Meta AI can give users real-time information and generate photorealistic images from text prompts in seconds to share with friends.

Users can also put questions to Meta AI in chat, whether “to settle arguments” or for any other purpose.

Meta AI will be able to generate text, audio and imagery and will have access to real-time information via a partnership with Microsoft’s Bing search engine.

Tom Jowitt
