By Katherine Bindley
Training artificial-intelligence models demands massive amounts of fresh data. Mercor, a $10 billion startup that hires contractors to provide AI training feedback, is among those leading the high-stakes hunt.
Sometimes that quest for data leads to contentious territory.
The San Francisco startup, whose clients have included OpenAI, Anthropic and Meta, has been hit with at least seven class-action lawsuits in recent weeks following a third-party data breach that allegedly exposed contractor information ranging from recorded job interviews to facial biometric data and screenshots of workers' computers.
The suits offered a window into how Mercor allegedly acquires the data used to serve its customers.
A class-action suit filed Tuesday in Northern California alleged that Mercor accumulated applicant-vetting data, including background checks, and shared it with partners in breach of federal regulations.
According to plaintiffs, the company's practices include monitoring its contractors' computers and sharing that data with clients, using recorded candidate interviews to train AI models, and training client models on materials potentially owned by other companies.
"We strongly dispute the speculative claims in these lawsuits and look forward to presenting the facts at the appropriate time and place," Mercor said in a statement.
"We take the privacy of our customers, contractors, employees and those we interview very seriously, and we comply with all relevant laws and regulations," the statement continued, adding that the company acted promptly to remediate the data breach, and that the breach affected many other companies as well. "We are conducting a thorough investigation with leading third-party forensics experts and are communicating directly with affected stakeholder groups as we have findings," it said.
The Wall Street Journal previously reported that Mercor sought to buy prior work materials from people on LinkedIn who said they didn't own the rights to such work. Mercor has been offering to pay $100 each for contractors' personal-finance documents, such as spreadsheets and PowerPoint presentations, according to postings online. The company has also offered $100 for people's Google Maps histories.
Seeking out and handling so much data comes with complications: Because the breached data allegedly included screenshots of workers' computers, contractors are suing Mercor for exposing not only their own personal information but also information belonging to their other employers.
Meta has paused its work with Mercor and is investigating the incident, according to a company spokesman. (Meta's Mercor pause was earlier reported by Wired.) Anthropic declined to comment. OpenAI didn't respond to requests for comment.
In training the first generation of large language models, AI developers identified and extracted most of the world's major readily available data sources. Now companies have to get more specialized, said Shayne Longpre, an MIT Ph.D. candidate who researches AI.
"A lot of the data-acquisition strategies seem to be moving towards more specialist sources," he said, pointing to those who are "extremely knowledgeable and have executed complex tasks in finance, healthcare, law, the sciences."
Mercor hired 30,000 contractors in 2025. Its competitors include Handshake AI, Micro1 and Surge. Recently, LinkedIn started testing its own AI training marketplace. The testing was earlier reported by Business Insider. Handshake co-founder Garrett Lord recently posted to LinkedIn that his company was looking to purchase codebases, internal databases and more.
"We anonymize everything," he wrote. "The stuff that's not on the internet is what we need."
The way big AI labs work with Mercor and other intermediaries who use contractors can make responsibility for data provenance more ambiguous, Longpre said. Industrywide, he added, "There's an incentive right now to figure out the rules and regulations after, and to capture as much of the market in the short term first."
Real-world scenarios
Thitipun Srinarmwong, a plaintiff in the class-action suit filed Tuesday, alleged that project managers and reviewers at Mercor encouraged workers to use real data from their firms, so long as the source was redacted or slightly changed. When Srinarmwong wrote his submissions in a way that protected confidential information, reviewers criticized the work as too short and vague, the suit said.
David Bevvino-Berv, a Mercor contractor who previously worked at Goldman Sachs, alleges in the same suit that he saw financial models and prompts that he suspected came from workers sharing proprietary information from other companies. He also saw "pre-project metadata, hidden defined names, institutional data-terminal markers, real lender or counterparty names, irregular numeric precision, and other features that raised serious provenance questions," the suit said.
One contractor, a federal investigator, told the Journal that Mercor asks for "real-world scenarios" but doesn't ask for prior work or proprietary data that might belong to other companies. However, the investigator added, the company scrubs data to remove personal or business identifiers in case contractors aren't abiding by its instructions.
Brendan Foody, Mercor's chief executive, said last fall at the TechCrunch Disrupt conference that while contractors are given guidance not to use data or documents from other companies, "there are things that happen." He added, "That's doing everything that we can on our side."
The company said its job listings state that work doesn't involve access to confidential or proprietary information from any of the worker's employers, clients or institutions.
Jennifer King, a privacy expert with Stanford University's Institute for Human-Centered Artificial Intelligence, said that asking professionals to come up with real-world scenarios to train AI is tricky.
"Most people who do professional work don't have organic work that they've just created outside of their clients or professions," she said. "None of this stuff is just lying around."
AI is very effective at pattern matching, so even if companies scrub identifying details from materials fed into an AI model, the model might still draw inferences connecting the uploaded material to its original source, she added.
Interviewing, onboarding and working
Applicants and former workers said that when applying for contract jobs at Mercor, they sat for a recorded interview with an unseen AI proctor; often, no humans are involved. Once hired, workers are typically required to sign nondisclosure agreements.
Two contractors who worked for Mercor last year said they were tasked with comparing videos of job interviewees with their subsequent performances as contractors, to improve the company's own AI systems to spot talent.
A lawsuit filed in North Texas tied to the data breach alleged that interview recordings were shared with Mercor's clients. The company said it uses candidate interviews internally, not to train customer AI models.
In the suit filed Tuesday, plaintiffs said the company also collected background-check and other data and shared it with clients. Mercor said it doesn't share candidate background checks with its customers.
When they begin work for the company, Mercor workers are required to install software called Insightful, which takes computer screenshots.
Mercor contractors not involved in the suits described to the Journal an environment in which screenshots can be captured every minute.
Bevvino-Berv, the plaintiff who worked at Goldman Sachs, alleged that Insightful captured usage of his bank account, health-insurance portals and around 240 other applications. The suit also alleged that Bevvino-Berv wasn't "clearly informed" that Insightful would capture anything beyond his Mercor-related work.
One contractor who worked as project lead for Mercor for several months last year said he was able to view the computer screenshots of every person on his project.
The company said it informs workers that it may take screenshots of their work during billing hours, and that it explicitly tells workers to only use work-related applications while Insightful is active.
Write to Katherine Bindley at katie.bindley@wsj.com
(END) Dow Jones Newswires
April 22, 2026 20:21 ET (00:21 GMT)
Copyright (c) 2026 Dow Jones & Company, Inc.