Meta Admits AI Assistant Trained On User Posts

Executive admits it used people’s public Facebook and Instagram posts to train its new Meta AI virtual assistant

By Tom Jowitt, September 29, 2023, 3:39 pm


Meta Platforms has disclosed the data source it used to train its Meta AI chatbot, which it unveiled at this week’s developer conference, ‘Meta Connect 2023’.

At the conference on Wednesday, CEO Mark Zuckerberg announced a new AI assistant; AI stickers; AI chatbots infused with celebrity personalities; the Meta Quest 3 mixed reality headset; and a new generation of Ray-Ban Meta smart glasses.

But now a senior executive at Meta has told Reuters in an interview that the organisation used people’s public Facebook and Instagram posts to train its new chatbot, dubbed “Meta AI”.

Image credit: Meta

Public data

Meta AI is still in beta and can be used across Meta’s messaging platforms, and while it was trained on public Facebook and Instagram posts (both text and photos), the firm excluded private posts shared only with family and friends in an effort to respect consumers’ privacy.

This is according to Meta President of Global Affairs Nick Clegg, speaking on the sidelines of the company’s annual Connect conference this week.

Clegg also told Reuters that, in addition to not using private posts, Meta did not use private chats on its messaging services as training data for the model and took steps to filter private details from public datasets used for training.

“We’ve tried to exclude datasets that have a heavy preponderance of personal information,” Clegg told Reuters, adding that the “vast majority” of the data used by Meta for training was publicly available.

He cited LinkedIn as an example of a website whose content Meta deliberately chose not to use because of privacy concerns.

Clegg’s comments come as tech companies including Meta, OpenAI and Alphabet’s Google have been criticised for using information scraped from the internet without permission to train their AI models, which ingest massive amounts of data in order to summarise information and generate imagery.

Tech companies are currently weighing how to handle the private or copyrighted materials that are used in that process and that their AI systems may reproduce. This comes as many book authors have filed lawsuits against these firms, accusing them of copyright infringement.

Meta Connect 2023

The Meta AI chatbot was the most significant of the company’s first consumer-facing AI tools, which CEO Mark Zuckerberg unveiled on Wednesday at Meta’s annual developer conference.

Meta made the assistant using a custom model based on the Llama 2 large language model that the company released for public commercial use in July, as well as a new model called Emu that generates images in response to text prompts.

Meta AI can give users real-time information and generate photorealistic images from text prompts in seconds to share with friends.

But users can also put questions to Meta AI in chat, for example “to settle arguments”.

Meta AI will be able to generate text, audio and imagery and will have access to real-time information via a partnership with Microsoft’s Bing search engine.
