This insight was featured in the February 20th, 2025 edition of the AI Catalyst Pulse.
What happened: The AI company Anthropic has released what it calls the Anthropic Economic Index, “the first large-scale empirical measurement of which tasks are seeing AI use across the economy.” Anthropic researchers examined millions of conversations with their AI assistant, Claude, and mapped each one to job-related tasks defined by the U.S. Department of Labor.
Their findings show that while jobs across the economy are using conversational AI every day — with 36% of occupations using AI for at least a quarter of their tasks — healthcare lags significantly behind other sectors, including other regulated industries like financial services and law.
Why it matters: Anthropic’s goal with their “economic index” was to track AI adoption across industries. But by releasing their underlying data, they accidentally gave healthcare leaders something far more valuable: the best evidence to date of how healthcare workers are using AI behind their organizations' backs.
The key is that, for reasons related to Anthropic’s data agreements, the study excluded activity from business customers. As such, it largely shows how healthcare workers are using AI outside of official contexts – giving us what may be our best audit ever of so-called “shadow AI.”
Here’s what the AI Catalyst team found when we crunched Anthropic’s healthcare data.
What healthcare workers seem to be doing in AI's shadows
Before we dive in, an important note about limitations. To protect user privacy, Anthropic didn’t associate the conversations in their dataset with specific users. Instead, they used AI to group conversations into job-related tasks defined by the U.S. Department of Labor.
In other words, while Anthropic could determine that a given conversation related to a task typically conducted by a healthcare professional, they couldn’t say for sure that the user was, in fact, a healthcare worker. That said, Anthropic said the vast majority of conversations appeared work-related.
With that disclaimer in mind, we analyzed the healthcare-related queries in Anthropic’s data. Broadly speaking, they fell into three categories (a rough sketch of this kind of bucketing follows the list):
Backend administrative work (~44%): This is all the “behind-the-scenes” work required to make healthcare run. Think internal documentation, reporting, and workflow optimization.
Patient communication (~36%): Everything related to what we say to patients. Education materials, procedural explanations, and counseling content.
Clinical applications (~20%): The meat and potatoes of clinical care, with AI supporting diagnosis, test interpretation, and treatment planning.
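For readers who want to attempt a similar breakdown, here is a minimal sketch of the kind of bucketing we describe. It assumes a hypothetical CSV export (healthcare_tasks.csv) with task_name and conversation_share columns; the keyword lists and category labels are our own illustrations, not Anthropic’s taxonomy.

```python
# Illustrative sketch: bucket task-level AI usage shares into three broad
# categories. Assumes a hypothetical CSV ("healthcare_tasks.csv") with
# "task_name" and "conversation_share" columns; keyword lists are examples only.
import csv
from collections import defaultdict

CATEGORY_KEYWORDS = {
    "Backend administrative work": ["report", "record", "schedule", "document", "workflow"],
    "Patient communication": ["patient", "educat", "counsel", "explain", "instruct"],
    "Clinical applications": ["diagnos", "treatment", "laboratory", "imaging", "medication"],
}

def categorize(task_name: str) -> str:
    """Return the first category whose keywords appear in the task description."""
    lowered = task_name.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category  # first match wins; overlaps need manual review
    return "Other / unclassified"

totals = defaultdict(float)
with open("healthcare_tasks.csv", newline="") as f:
    for row in csv.DictReader(f):
        totals[categorize(row["task_name"])] += float(row["conversation_share"])

for category, share in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {share:.1%}")
```

In practice, keyword matching is only a first pass; ambiguous tasks need to be reviewed by hand before they are assigned to a bucket.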
Digging deeper: The most worrying uses of ‘shadow AI’ in healthcare
Some of these uses seem innocuous. But based on our analysis, about one-third of the queries in Anthropic’s data raise what we’d consider “red flags” – suggesting risks of patient harm, legal violations, or privacy concerns. To give a few examples:
“Interpret[ing] laboratory results and communicat[ing] findings to patients or physicians”
“Assess[ing] the identity, strength, or purity of medications”
“Interpret[ing] the outcomes of diagnostic imaging procedures”
“Develop[ing] individualized treatment plans for patients, considering patient preferences, clinical data, or the risks and benefits of therapies”
We hardly need to explain how troubling it is if healthcare professionals are truly using Claude – a consumer-facing, non-specialist AI chatbot – for these tasks at work without authorization. For one thing, AI makes mistakes that could threaten patient care. Just as worrying, if clinicians are sharing enough context for AI to respond meaningfully to these requests, they may be disclosing protected health information.
To be clear, this data doesn’t prove that any particular red-flag conversation involved a healthcare professional acting inappropriately – but because “shadow AI” by definition happens in unapproved contexts, it’s almost impossible to definitively measure such misuse.
Even so, we think it’s reasonable to say this data suggests that perhaps one-third of healthcare’s “shadow AI” use poses serious risks. That’s more than enough to be scary.
What AI use can you bring in from the ‘shadows’?
Still, there's a more optimistic view of the data: Perhaps two-thirds of healthcare’s “shadow AI” use is potentially acceptable, if done through proper channels. This includes tasks such as:
“Writ[ing] research reports and other publications to document and communicate research findings”
Reviewing reports “for spelling, grammar, clarity, consistency, and proper medical terminology”
“Prepar[ing] statistical reports, narrative reports, or graphic presentations of information, such as tumor registry data”
These patterns reveal what your healthcare workers truly want from AI: help reducing administrative burden without compromising clinical judgment. They're just currently getting that help through unauthorized channels.
For health system leaders, this data provides both a warning and a roadmap. Your staff will use AI — if not through official channels, then in the shadows. The question is how to bring the best AI uses into the light while shutting down the dangerous ones.
Questions to consider:
Which specific AI use cases from this data could our organization safely implement through official channels?
What immediate steps should we take to detect and prevent high-risk clinical applications, particularly those involving protected health information? (A simple, illustrative screen is sketched after these questions.)
How might we adapt our IT procurement processes to better accommodate the rapid pace of AI tool development while maintaining necessary safeguards?
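As a concrete starting point for the second question above, here is a minimal, purely illustrative sketch of the kind of screen an organization might apply to outbound prompts before they reach a consumer AI tool. The patterns, function names, and example text are placeholders we invented for illustration; real PHI detection belongs in a vetted data loss prevention tool backed by policy and clinical review, not a regex list.

```python
# Illustrative only: flag prompts that appear to contain protected health
# information (PHI) before they leave the network. The patterns below are
# placeholders; real PHI detection requires a vetted DLP solution.
import re

PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
    "Date of birth": re.compile(r"\bDOB[:\s]*\d{1,2}/\d{1,2}/\d{2,4}\b", re.IGNORECASE),
    "Phone number": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def phi_flags(prompt: str) -> list[str]:
    """Return the names of any PHI-like patterns found in the prompt."""
    return [name for name, pattern in PHI_PATTERNS.items() if pattern.search(prompt)]

if __name__ == "__main__":
    # Hypothetical example of a risky prompt a clinician might paste into a chatbot.
    example = "Pt MRN: 00482913, DOB: 04/12/1961 - interpret these lab results..."
    flags = phi_flags(example)
    if flags:
        print(f"Blocked: prompt appears to contain {', '.join(flags)}")
    else:
        print("No PHI-like patterns detected")
```

A screen like this catches only the most obvious identifiers; it is a conversation starter for your security and compliance teams, not a substitute for sanctioned, HIPAA-appropriate AI tooling.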