Hear me out. The average phase 3 clinical trial today collects roughly 3.5 million data points (up from 1 million in 2012). As protocols become more complex, the burden on study teams to collect, clean and analyze data continues to grow, leading to mistakes, delays and rework.
In response, the International Council for Harmonisation (ICH), the global governing body on clinical trials, released official guidance on this issue earlier this year (ICH E6 R3). It requires teams to implement "risk-based quality management" within their processes. In simple terms, ICH wants teams to work smart: figure out which data matter most to a given study and focus their energy on validating that critical information. Don’t waste time and resources on details that don’t tell you much about how safe or effective a drug is.
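To make "risk-based" concrete, here is a minimal, hypothetical sketch (in Python) of what prioritizing checks by risk could look like: score each check by its impact on safety/efficacy conclusions and by how likely the underlying data are to be wrong, then work the highest-risk checks first. The scoring model and the example checks are illustrative assumptions, not anything prescribed by ICH E6 R3.

```python
# Hypothetical sketch of risk-based prioritization of data quality checks.
# The impact x likelihood scoring and the example checks are assumptions
# for illustration only; ICH E6 R3 does not prescribe a specific model.
from dataclasses import dataclass

@dataclass
class Check:
    name: str
    impact: int      # 1-5: effect on safety/efficacy conclusions if this data point is wrong
    likelihood: int  # 1-5: how likely the underlying data are to be entered incorrectly

    @property
    def risk_score(self) -> int:
        return self.impact * self.likelihood

checks = [
    Check("primary endpoint value is plausible", impact=5, likelihood=3),
    Check("adverse event onset date is complete", impact=5, likelihood=2),
    Check("visit comment field is non-empty", impact=1, likelihood=4),
]

# Focus effort on the highest-risk checks instead of running everything.
for check in sorted(checks, key=lambda c: c.risk_score, reverse=True):
    print(f"{check.risk_score:>2}  {check.name}")
```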
Clean, reliable data is the foundation of every trial.
An average trial might have 40 different case report forms x 20 fields per form, plus 5 external data feeds x 30 data points per feed. That gets us to about 1,000 unique data fields per study. For each field, you might want to check for completeness, value plausibility, alignment with the protocol, consistency with other data points, etc.
The universe of possible data quality checks starts creeping into the thousands.
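To make the arithmetic explicit, here is a quick back-of-the-envelope sketch in Python using the illustrative figures above; the list of check categories is an assumption for illustration, not an official taxonomy.

```python
# Back-of-the-envelope arithmetic for the size of the check universe.
# Form/feed counts are the illustrative figures from this article.

crf_count = 40          # case report forms in a typical study
fields_per_crf = 20     # data fields per form
feed_count = 5          # external data feeds
points_per_feed = 30    # data points per feed

check_types = [         # assumed categories of data quality checks
    "completeness",
    "value plausibility",
    "protocol alignment",
    "cross-field consistency",
    "timeliness",
]

unique_fields = crf_count * fields_per_crf + feed_count * points_per_feed
possible_checks = unique_fields * len(check_types)

print(f"Unique data fields: {unique_fields}")    # 950, i.e. about 1,000
print(f"Possible checks:    {possible_checks}")  # 4,750, creeping toward 5,000
```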
To comply with the ICH's recommendations and adopt a truly "risk-based" process, study teams now need to manually analyze 5,000+ possible data quality checks and decide which are the most important. Requirements meant to reduce unnecessary work have created a new bottleneck.
As a result, most study teams just stick with the status quo – they use templated libraries and apply whatever checks are executable today, rather than aligning with regulatory guidance and building a system that ensures the validity of their most important data.





