Trust in the Tooling: How to Help Preclinical Researchers Help Themselves in the Age of AI
Operation "Smile and Dial."
I’ve been hitting the phones recently, trying to connect with preclinical lab cores at the best research universities in North America. My goal: put ModernVivo Scholar into the hands of as many preclinical scientists as possible, for free, to help make their research more translatable.
Turns out giving stuff away for free is harder than I thought.
Lovingly, I’ve been calling my endeavor “Operation Smile & Dial.” For anyone reading this who’s ever made cold calls, you probably know how difficult the ‘smiling’ part can be. Some of the rejections are brutal. People will find ways to get out of conversations they don’t trust, from hanging up on me mid-sentence to lying about what their core does. My favorite rejection was from a mouse histology core who claimed in vivo experiments “don’t even ring a bell.” Uh, what?
Before you feel too bad for me, know that I signed up for this. I get it. People are busy. They’re skeptical of sleazy salesmen pushing flashy AI tools down their gullets. Most of all, they’re comfortable in their workflows, even if those flows are riddled with pain points and bottlenecks. Change is hard.
And so I’m writing this blog to unpack a question I haven’t been able to shake as easily as all those rejections:
How do we build trust with scientists who have every reason to be skeptical of AI tools, even when the utility of those tools has been validated by other expert researchers (ahem, like ModernVivo)?
I hope my insights can help you feel more confident adopting and advocating for genuinely useful new research tools. Thanks for reading.
Hallucinations in AI Wonderland: Down the Rabbit Hole.
Just last week, I connected with the director of an animal behavior core of a neurotechnology center at a research university somewhere in the Rockies. He picked up my call in the middle of a literature review. He was looking for preclinical papers, literally as we spoke.
As we got into it, he told me he’s been using ChatGPT and Claude to look for relevant papers and in vivo protocols, but was frustrated that these models will sometimes hallucinate, returning papers and journals that sound real but don’t actually exist.
What’s the point of a time-saving AI tool if you have to spend that time reality-checking its outputs? “Kinda pointless,” he told me.
I explained how ModernVivo is more deterministic than your run-of-the-mill LLM, because it operates on rule-based, semantic keyword matching. What you see is what you get. If it’s reported in the literature, our model will return language from that paper, verbatim. Not to mention, it’ll link you to the original text, 100% of the time, so you can see where the information is coming from.
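To make that concrete, here’s a toy sketch of deterministic, rule-based matching over a corpus. This is my own illustration, not ModernVivo’s actual code; the function name, data structure, and sample paper are all invented:

```python
def search_corpus(corpus, keywords):
    """Rule-based keyword matching: the same inputs always yield
    the same results, and every hit carries a link to its source."""
    hits = []
    for paper in corpus:
        for sentence in paper["sentences"]:
            lowered = sentence.lower()
            if all(k.lower() in lowered for k in keywords):
                hits.append({
                    "quote": sentence,        # verbatim text from the paper
                    "source": paper["doi"],   # always linked to the original
                })
    return hits

# Hypothetical one-paper corpus for illustration.
corpus = [
    {"doi": "10.0000/example.1",
     "sentences": ["Mice received 5 mg/kg cisplatin via intraperitoneal injection."]},
]
print(search_corpus(corpus, ["intraperitoneal", "cisplatin"]))
```

Nothing in that loop can invent a sentence or a DOI: every quote is copied from the corpus, and every citation is the paper’s own identifier.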
But I got to thinking. Are fickle, stochastic language models that return different responses each time I ask them the exact same question really the best we can do? Shouldn’t determinism be the default for high stakes scientific research?
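The difference is easy to see in miniature. Here’s a toy next-token picker (not a real LLM; the distribution is made up): sampling from a probability distribution can return something different on every call, while greedy, temperature-zero-style decoding always returns the same token:

```python
import random

# Toy next-token distribution for the prompt "The animal model was a ..."
probs = {"mouse": 0.5, "rat": 0.3, "zebrafish": 0.2}

def sample_token(rng):
    # Stochastic decoding: draw from the distribution, so repeated
    # calls with the same prompt can return different tokens.
    return rng.choices(list(probs), weights=list(probs.values()))[0]

def greedy_token():
    # Deterministic (greedy) decoding: always pick the single most
    # likely token, so the answer never changes between runs.
    return max(probs, key=probs.get)
```

(In practice even temperature-zero LLM APIs aren’t always bit-for-bit reproducible, but the contrast in principle is exactly this.)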
Turns out, many others have shared these thoughts.
The Case for Deterministic Models.
After a few hours of trying to wrap my head around LLM training, gradient descent, tokens, weights, temperatures, and how these concepts explain hallucinations, I found myself reading about deterministic models: namely, a “mathematical or statistical model in which the outcomes are precisely determined through known relationships among states and events, without any room for randomness in the process.”
The software engineers and programmers had the most to say about this. For instance, this article by an engineer at Capital One urges fellow programmers not to trust LLMs to consistently enforce coding policies on their own. Instead, he claims, it’s better to use LLMs to write deterministic enforcement tools (e.g. linters, tests, compiler constraints) that will catch violations reliably every time the code is built. He makes a sound argument that deterministic programs are best suited for repetitive policy enforcement.
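The pattern he describes, where the LLM writes the checker once and the checker then runs deterministically forever, can be as small as a few lines. Here’s a toy example (my sketch, not the article’s code) that flags a single made-up policy violation:

```python
import re

# A toy deterministic "linter": flag any line that calls print(),
# a stand-in for a team coding policy. Same input file, same
# violations, every single run -- no sampling involved.
RULE = re.compile(r"\bprint\(")

def lint(lines):
    """Return (line_number, text) for every line violating the rule."""
    return [(n, line.rstrip())
            for n, line in enumerate(lines, start=1)
            if RULE.search(line)]

for n, line in lint(["x = 1\n", "print(x)\n"]):
    print(f"line {n}: avoid print() in production code: {line}")
```

An LLM might flag that `print()` call nine runs out of ten; the regex flags it ten out of ten.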
This podcast with Daniel Escott of Formic AI does a great job explaining the value of deterministic, observable AI models. I would recommend you give it a listen. Daniel claims, “The capacity to trust [AI] technology, given its transformative power, is the single most influential aspect of how we’re going to engage with it moving forward.”
He goes on to describe a deterministic, reverse RAG framework, which affords deterministic citation linking between the model’s responses and the source material. With such a framework, “citations cannot be hallucinated,” he states confidently.
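Here’s roughly what deterministic citation linking looks like in miniature. This is my own sketch of the general idea, not Formic AI’s implementation, and the matching rule (exact substring) is deliberately simplistic: a claim is only emitted with a citation drawn from a retrieved passage’s own ID, so the model never gets the chance to fabricate a reference:

```python
def attach_citations(claims, passages):
    """Emit only claims that can be grounded in a retrieved passage,
    citing the passage's own identifier rather than model-generated
    text. Ungrounded claims are dropped, not decorated."""
    cited = []
    for claim in claims:
        for p in passages:
            if claim.lower() in p["text"].lower():
                cited.append({"claim": claim, "citation": p["id"]})
                break
    return cited

# Hypothetical retrieved passage and candidate claims.
passages = [{"id": "PMID:12345",
             "text": "Ketamine/xylazine was used for anesthesia in all cohorts."}]
claims = ["Ketamine/xylazine was used for anesthesia",
          "Mice were fasted overnight"]
print(attach_citations(claims, passages))
```

The second claim has no supporting passage, so it simply never appears with a citation. That’s the structural guarantee behind “citations cannot be hallucinated.”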
He points to industries like law, banking, and government, where “you can’t just rely on statistical probability. You have to have some level of assurance.” All of these industries, he explains, rely on “decades of…documentation,” information that must be retrieved as the first step in experts’ reasoning processes. It’s obvious to me that preclinical drug discovery is another one of these industries.
Remember my director friend from the Rockies? Daniel echoes his sentiment almost exactly: When model responses are so unreliable that users have to become “an expert forensic auditor spending more time derisking the tool than would have been spent doing the actual work” – well, that’s just dumb technology.
On the Value of Trust.
Here’s what I’ve learned about trust from Operation Smile & Dial.
- As much as some people hate a cold call, it’s led to way more meaningful interactions than my previous approach – blasting hundreds of cold emails to academics. In a fraction of the outbound, I’m able to gather many more insights about preclinical workflows. I think there’s something primal that switches in our brain when we hear the sound of another human’s voice come through over the phone. “I promise you, I’m not a robot,” it says.
- People will stop listening as soon as I mention AI. By now, it’s an empty buzzword. Instead, they trust me more when I show up with genuine curiosity about their lived experiences, seeking to learn why their work keeps them up at night. Sometimes, they just need to vent. I’m here for it. “Just call it technology,” an AI developer from Charles River Labs told me last week.
- I don’t assume that AI is the solution to preclinical scientists’ problems. People don’t trust me if they sense I’m reasoning from the false assumption that AI is the hammer to their nail. AI might solve their problem. It might not. And that’s cool, too.
- The truth is, humans will solve their own problems. I build trust with scientists by offering trust first. I trust in their expertise and judgment. AI tools are so often marketed in a way that unconsciously undermines the subject matter expert’s human reasoning. This feeds into people’s fear of AI job replacement. And when people are afraid, they can’t trust.
The ModernVivo Advantage.
ModernVivo doesn’t exist to replace in vivo scientists. It exists to unlock their potential, and to free up their time so they can do what humans do best: reason, judge, decide. That’s why the ModernVivo team is so adamantly against making sweeping study design and protocol recommendations. We’ll leave that to you, dear reader, the expert.
When people ask me why ModernVivo is better than Claude or OpenAI, I tell them two things:
- Greater breadth: Our model doesn’t just scan through a few dozen papers. It scans through MILLIONS of peer-reviewed research publications. There are valuable insights about in vivo study design locked in that body of knowledge. No human could ever sift through it manually. Nor should they. ModernVivo leaves no stone unturned.
- Greater depth: ModernVivo was built for in vivo scientists by in vivo scientists. Our model allows you to search the literature for hyper-specific experimental conditions, unique to preclinical therapeutics development. You can go deep in your study design by zooming in on conditions like route of administration, anesthesia type and dose, needle gauge, chow type, cell line(s) and so much more. No in vivo experimental condition is too trivial to matter. These conditions matter for high-quality, translatable research.
Only time will tell how AI tooling for preclinical research will evolve, and who will adopt which tools. I’m grateful to be part of this unfolding story. Thanks for reading.
