Your AI hiring tool passed its audit. That doesn't mean it's fair

A Stanford study reveals how AI hiring tools can pass bias audits while still discriminating by race and role

Most companies using AI to screen job applicants have done their due diligence. They’ve asked the vendor whether the tool has been audited. They’ve got a report that says it passed. They’ve moved on.

New Stanford University research suggests that process may be giving employers a false sense of security. A study of 4 million job applications found that an AI hiring tool can pass a bias audit at the aggregate level while still systematically screening out Black and Asian candidates for specific roles. The audits most employers rely on, it turns out, don’t tell the whole story.

That finding arrives as a federal judge in San Francisco weighs whether Workday, one of the largest providers of AI-powered human resources software in the United States, can be held liable under California law for how its screening algorithms filter candidates on behalf of major employers. The case, Mobley v. Workday, is the first of its kind to broadly challenge the algorithmic decision-making underpinning AI hiring tools that now shape recruitment at most large companies across the country.

What aggregate audits miss

The 2026 Stanford study is one of the largest analyses of algorithmic hiring ever conducted. It followed 3.4 million people submitting 4 million job applications across 1,700 positions. Every application was screened by a single third-party vendor, a setup the researchers describe as “algorithmic monoculture,” reflecting the reality that a small number of vendors now supply hiring algorithms to a large share of U.S. employers. When one vendor’s tool carries bias, that bias can affect candidates across every employer using it.

Applying the Equal Employment Opportunity Commission’s (EEOC) “four-fifths rule” — the standard threshold used under U.S. employment discrimination law to identify adverse impact in hiring — researchers found that 26% of Black applicants and 15% of Asian applicants submitted applications to positions where the AI system discriminated against their racial group. If the tool had advanced those candidates at the same rate as the most-favored group, roughly 40,000 more applications would have moved forward.
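
To make the four-fifths rule concrete, here is a minimal Python sketch. It assumes the common reading of the rule: each group's selection rate is divided by the highest group's rate, and a ratio below 0.8 is flagged as possible adverse impact. The function name, group labels, and counts are hypothetical and are not taken from the study.

```python
def impact_ratios(selected, applied):
    """Return each group's selection rate divided by the highest group's rate."""
    rates = {group: selected[group] / applied[group] for group in applied}
    best = max(rates.values())
    if best == 0:
        # Nobody advanced, so there is no favored group to compare against.
        return {group: 1.0 for group in rates}
    return {group: rate / best for group, rate in rates.items()}

# Hypothetical counts for a single position (illustrative only).
applied = {"group_a": 200, "group_b": 100}
selected = {"group_a": 60, "group_b": 18}

ratios = impact_ratios(selected, applied)

# Flag any group whose ratio falls below the four-fifths threshold.
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
print(ratios)
print(flagged)  # ['group_b']
```

In this made-up example, group_a advances at 30% and group_b at 18%, so group_b's ratio is about 0.6 and falls under the 0.8 threshold.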

The critical finding wasn’t just that bias existed. It was that the bias was invisible at the aggregate level. Sarah Bana, co-author of the study and assistant professor at Chapman University in Orange, California, explained the mechanism.

“Earlier research reported aggregate numbers, averaged across all the positions a vendor screens for. We disaggregated and looked at each position separately. That’s the major difference,” she said.

“Imagine a model that over-selects one group for warehouse jobs and under-selects them for finance jobs. The averages would look balanced; the position-by-position picture would show real bias. That’s roughly the pattern we found.”
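
Bana's warehouse-and-finance scenario can be reproduced with a few lines of Python. The sketch below uses invented numbers, not the study's data, and the record layout and helper name are assumptions made for illustration. It computes the same four-fifths ratio twice: once on pooled totals and once per position.

```python
from collections import defaultdict

# Hypothetical records: (position, group, applicants, advanced). Not study data.
records = [
    ("warehouse", "group_a", 100, 50),
    ("warehouse", "group_b", 100, 30),
    ("finance", "group_a", 100, 30),
    ("finance", "group_b", 100, 50),
]

def ratios(counts):
    """counts: {group: (applicants, advanced)} -> {group: rate / highest rate}."""
    rates = {g: advanced / applicants for g, (applicants, advanced) in counts.items()}
    best = max(rates.values())
    return {g: rate / best for g, rate in rates.items()}

by_position = defaultdict(dict)
totals = defaultdict(lambda: [0, 0])
for position, group, applicants, advanced in records:
    by_position[position][group] = (applicants, advanced)
    totals[group][0] += applicants
    totals[group][1] += advanced

# Aggregate view: both groups advance 80 of 200, so every ratio is 1.0.
aggregate = ratios({g: tuple(t) for g, t in totals.items()})

# Position-level view: each position has one group at roughly 0.6, below 0.8.
per_position = {p: ratios(c) for p, c in by_position.items()}

print(aggregate)
print(per_position)
```

The pooled figures pass the threshold while both individual positions fail it, which is the masking effect the quote describes.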

The legal exposure sits with the employer

U.S. employment law evaluates adverse impact one position at a time, because that’s how employers actually make hiring decisions. A clean aggregate audit doesn’t reflect that reality, and the consequences are playing out in real time. Employers are already relying on these tools to make high-stakes filtering decisions at enormous volume, and the Workday case shows what that exposure can look like.

The lawsuit was originally filed in 2023 by Derek Mobley, a Black job seeker who claims he was passed over for more than 100 positions at companies using Workday’s software, citing discrimination based on race, age, and disability. The case has since expanded to include additional plaintiffs and broader claims under California’s Fair Employment and Housing Act (FEHA).

Critically, the presiding judge has already ruled that Workday can be treated as an employer under federal anti-discrimination law, because it performs screening functions its clients would otherwise carry out themselves. That finding is significant. But it doesn’t shift liability away from employers.

“It probably depends on the vendor, but I think it’s important to recognize that the legal exposure sits with the hiring firm, not the vendor,” Bana said. “Under New York City’s Local Law 144, and with the EEOC designating AI-based screening tools as an enforcement priority in its current Strategic Enforcement Plan, the employer is the regulated party. So your legal office should do their due diligence.”

Three steps organizations should take now

Bana offered three concrete recommendations for any organization currently using AI tools to screen candidates.

The first is to periodically advance a small random sample of candidates the algorithm would have rejected, then track how they perform downstream.

“Without observing how the applicants the algorithm rejects would have performed, you cannot validate whether the screening tool is filtering out the right people,” she said. “If the filtered-out applicants perform comparably, the screening tool is generating artificial scarcity. If [they perform] worse, the screening tool is producing genuine signal.”
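
A rough Python sketch of that first recommendation follows. Everything in it is an assumption made for illustration: the applicant IDs, the 2% sample size, the performance scores, and the function names. A real evaluation would also need a large enough sample and a proper statistical test before drawing conclusions.

```python
import random
from statistics import mean

def pick_holdout(rejected_ids, fraction, seed=0):
    """Randomly choose a share of algorithm-rejected applicants to advance anyway."""
    ids = sorted(rejected_ids)
    if not ids:
        return set()
    k = min(len(ids), max(1, round(len(ids) * fraction)))
    # A fixed seed makes the draw reproducible, so it can be audited later.
    return set(random.Random(seed).sample(ids, k))

def performance_gap(scores_selected, scores_holdout):
    """Mean downstream score of algorithm-selected hires minus the holdout's mean."""
    return mean(scores_selected) - mean(scores_holdout)

# Hypothetical pool of 1,000 rejected applicants; advance 2% of them at random.
rejected = [f"app-{i}" for i in range(1000)]
holdout = pick_holdout(rejected, fraction=0.02)
print(len(holdout))  # 20

# Hypothetical downstream ratings for each group once they have been observed.
gap = performance_gap([3.9, 4.1, 4.0], [3.8, 4.2, 4.0])
print(gap)
```

In Bana's terms, a gap near zero would point toward artificial scarcity, while a clearly positive gap would suggest the tool is producing genuine signal.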

The second is to require position-level disparity reporting as a contractual condition. Aggregate reports, as the Stanford research makes clear, can hide significant problems in specific roles.

“Require selection rates broken out by position and demographic group as part of your contract,” Bana said. “This is information that will let you know if you’re getting what you paid for: an unbiased screening tool.”

The third is to close the gap between screening data and performance data. Most organizations keep these in separate systems that don’t communicate.

“Performance data lives in one table and screening recommendations live in another, and the two don’t typically talk to each other,” Bana said. “The screening tool isn’t just a hiring tool. It shapes who ends up working at your firm.”
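
Linking the two tables can be as simple as a join on a shared applicant identifier. The Python sketch below assumes such an identifier exists in both systems, which may not be true in practice. The table contents and field names are hypothetical.

```python
# Hypothetical tables keyed by applicant ID; field names are illustrative.
screening = {
    "a1": {"position": "analyst", "recommended": True},
    "a2": {"position": "analyst", "recommended": False},
    "a3": {"position": "picker", "recommended": True},
}
performance = {
    "a1": {"rating": 4.2},
    "a2": {"rating": 4.0},
}

def link(screening, performance):
    """Inner join on applicant ID: keep only people present in both tables."""
    shared_ids = sorted(screening.keys() & performance.keys())
    return [
        {"applicant_id": aid, **screening[aid], **performance[aid]}
        for aid in shared_ids
    ]

linked = link(screening, performance)
for row in linked:
    print(row)
```

Here applicant a2 was not recommended by the tool but has a performance rating, the kind of record a random holdout would produce. Applicant a3 has no rating and drops out of the join.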

What this means going forward

The regulatory environment is still catching up to the technology.

“Everything feels quite voluntary right now,” Bana said.

That’s unlikely to remain true. The Workday case may be the first of its kind, but with other algorithmic bias lawsuits over hiring tools beginning to emerge, it probably won’t be the last. The algorithmic monoculture the Stanford researchers identified means that when one vendor’s tool gets it wrong, the consequences can extend well beyond one employer.

In the meantime, the question organizations need to ask isn’t simply whether their hiring tool has been audited. It’s what that audit actually examined, at what level of granularity, and whether those answers map to how the tool is being used across specific roles within their organization.

“Be vigilant,” Bana said. “Understand the filters that you are imposing and what they imply for the candidates you’re filtering in and out.”
