Beyond Edit Checks & SAS: How AI Elevates Data Quality in Clinical Trials

by
Sanjay Kunchakarra
September 5, 2025
Anyone who has been involved with clinical data review will recognize this cycle: build edit checks, wait for SAS listings, analyze monthly reports, and issue queries weeks after the data was first entered. And it works, sort of. These checks catch missing values, out-of-range numbers, and date inconsistencies. But they also leave behind a long tail of errors that only surface late in the trial, or worse, during database lock crunch time.

It’s time for every sponsor, CRO and data team to ask a harder question: what more can we do to make sure our data is truly clean, reliable, and inspection-ready?
It can be hard to separate signal from noise when it comes to AI, so below we dig into the specific elements of the data review process that AI assistants (agents), powered by large language models (LLMs), genuinely elevate.

Rule Building

One of the most painstaking parts of study startup is deciding which data quality checks to build. Today, data managers comb through study protocols and CRFs and write specifications by hand, a process that takes weeks and, unsurprisingly, still misses important relationships.

With LLMs, that process changes dramatically. AI assistants read these documents directly, understand the nuances of the study, and surface the most important data quality conditions to validate, guided by questions such as:

  • How can inclusion/exclusion criteria be validated?
  • What are the critical elements underlying important study endpoints?
  • What CRF forms depend on one another?

Reveal’s AI assistants base their suggestions on a library of checks seen across similar studies, fundamental principles of risk-based data quality, and the unique elements of the study at hand. Instead of starting from scratch, teams now begin with an automated set of study-specific rules that they refine.
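To make the workflow concrete, here is a minimal sketch of how protocol text could be turned into draft check specifications. The `ask_llm` callable, the JSON response shape, and the `CheckSpec` fields are all illustrative assumptions, not Reveal's actual interface; a stubbed model response stands in for a real LLM call.

```python
import json
from dataclasses import dataclass

@dataclass
class CheckSpec:
    name: str        # short identifier for the check
    forms: list      # CRF forms involved
    condition: str   # plain-language condition to validate

def draft_checks(protocol_excerpt: str, ask_llm) -> list:
    """Ask an LLM to propose study-specific checks as a JSON array.
    `ask_llm` is a placeholder for whatever model client is in use."""
    prompt = (
        "From the protocol excerpt below, list data quality checks as a JSON "
        "array of objects with keys 'name', 'forms', 'condition'.\n\n"
        + protocol_excerpt
    )
    raw = ask_llm(prompt)
    return [CheckSpec(**c) for c in json.loads(raw)]

# Stubbed model response, for illustration only
fake_response = json.dumps([
    {"name": "ie_age", "forms": ["DM"],
     "condition": "Age at screening is within 18-75 per inclusion criterion 1"}
])
checks = draft_checks("Inclusion: adults aged 18-75 ...", lambda _: fake_response)
```

In practice a human data manager would review and refine the drafted specs before any check goes live, which is the "automated set of rules that teams refine" described above.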

Free Text

Traditional checks live in the world of structured fields: numbers, dates, dropdowns. But much of trial data is messier: free text entries, names and comments.

LLMs change the game here. They can read and interpret this raw data, spotting discrepancies that edit checks can’t touch and that data managers would otherwise have to sort through manually:

  • A medication misspelled in the CM log but recognizable as a prohibited drug
  • An AE term description that doesn’t match its listed details
  • Lab comments that explain why a result is missing
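The first bullet above can be approximated even without an LLM, which helps illustrate what the model is doing at a higher level of sophistication. This sketch uses simple fuzzy string matching as a lightweight stand-in; the prohibited-drug list is invented for the example.

```python
from difflib import get_close_matches

# Hypothetical protocol-defined prohibited medications
PROHIBITED = ["warfarin", "methotrexate", "ketoconazole"]

def flag_prohibited(cm_entries, cutoff=0.85):
    """Flag CM log entries that closely match a prohibited drug,
    even when misspelled. Returns (original entry, matched drug) pairs."""
    flags = []
    for entry in cm_entries:
        drug = entry.lower().split()[0]  # drop dose/units, keep drug name
        match = get_close_matches(drug, PROHIBITED, n=1, cutoff=cutoff)
        if match:
            flags.append((entry, match[0]))
    return flags

flags = flag_prohibited(["Warfarrin 5mg", "aspirin 81mg"])
```

An LLM goes well beyond edit distance: it can recognize a trade name, an abbreviation, or a description of the drug in a comment field, none of which fuzzy matching would catch.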

Connecting the Dots Across Forms

The most complex checks in the data review process are cross-form checks (e.g., a high creatinine lab value, but no corresponding adverse event or medication record). Today, most cross-form checks like this are either implemented very lightly or handled manually because they are too complex to program deterministically with edit checks or SAS listings.

Reveal’s AI assistants, however, are well suited to bridging this gap. Not only can they identify the critical form relationships to evaluate, they also execute these checks continuously, drawing connections between disparate pieces of information to determine whether the data is consistent and logical.
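The creatinine example above can be sketched as a cross-form join. The field names, the 1.3 mg/dL upper limit of normal, and the renal AE terms are illustrative assumptions; a real implementation would use the study's lab reference ranges and MedDRA coding rather than a hard-coded term list.

```python
def missing_ae_for_high_creatinine(labs, aes, uln=1.3):
    """Cross-form check sketch: flag subjects whose creatinine exceeds
    the upper limit of normal but who have no renal adverse event recorded."""
    renal_terms = {"acute kidney injury", "renal impairment", "renal failure"}
    subjects_with_renal_ae = {
        ae["subject"] for ae in aes if ae["term"].lower() in renal_terms
    }
    return [
        lab["subject"] for lab in labs
        if lab["test"] == "creatinine"
        and lab["value"] > uln
        and lab["subject"] not in subjects_with_renal_ae
    ]

labs = [{"subject": "1001", "test": "creatinine", "value": 2.1},
        {"subject": "1002", "test": "creatinine", "value": 0.9}]
aes = [{"subject": "1003", "term": "Headache"}]
flagged = missing_ae_for_high_creatinine(labs, aes)
```

What makes these checks hard to program deterministically is the long tail of relationships like this one; an assistant that can reason over the forms' meaning can evaluate them without a bespoke rule for each pairing.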

Spotting What Doesn’t Fit

Static range checks can only flag values that fall outside predefined thresholds. But what about values that are technically “in range” but biologically implausible given the patient’s history? Or a site that seems to always report the same set of vital signs?

By combining traditional machine learning models with modern generative AI techniques, AI assistants can spot these anomalies in real time, asking whether a value really makes sense for this patient, at this time, and in this context.
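The "site that always reports the same vital signs" case is a good example of a pattern that static range checks never question, since every individual value is in range. A minimal frequency-based sketch, with invented field names and an arbitrary 50% repetition threshold:

```python
from collections import Counter

def repeated_vitals_sites(records, threshold=0.5, min_visits=4):
    """Sketch: flag sites where one identical (SBP, DBP, pulse) tuple
    accounts for more than `threshold` of all visits."""
    by_site = {}
    for r in records:
        by_site.setdefault(r["site"], []).append((r["sbp"], r["dbp"], r["pulse"]))
    flagged = []
    for site, vitals in by_site.items():
        top_count = Counter(vitals).most_common(1)[0][1]
        if len(vitals) >= min_visits and top_count / len(vitals) > threshold:
            flagged.append(site)
    return flagged

records = (
    # Site S01: suspiciously repetitive vitals
    [{"site": "S01", "sbp": 120, "dbp": 80, "pulse": 70}] * 4
    + [{"site": "S01", "sbp": 118, "dbp": 78, "pulse": 72}]
    # Site S02: plausible variation
    + [{"site": "S02", "sbp": s, "dbp": d, "pulse": p}
       for s, d, p in [(132, 84, 66), (121, 79, 74), (115, 75, 80), (127, 82, 69)]]
)
flagged = repeated_vitals_sites(records)
```

A production system would go further, e.g. using multivariate anomaly models over a patient's own history, but the principle is the same: look for values and patterns that are implausible in context rather than merely out of range.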

Why does all this matter?

The benefits are clear and tangible. By leveraging AI assistants for data cleaning on top of deterministic edit checks, sponsors and CROs can expect:

  • 30-40% more quality check coverage of critical data elements
  • Earlier detection of risks that would otherwise surface later
  • Cleaner interim analyses, faster database locks and smoother inspections
  • Unburdened clinical research teams, who spend less time on tedious manual investigation

The processes and tools built decades ago served their purpose, but they remain imperfect for today’s trials. AI provides a new opportunity for us to revamp these processes and close the gaps that prevent timeliness, quality and compliance in clinical research.
