Clinical Reporting and Drug Discovery: An Overview of AI’s Impact in the Pandemic

Nicholas DeNittis

Nick DeNittis writes and edits AI industry trends and use-cases for Emerj's editorial and client content. Nick holds an MS in Management from Troy University and has earned several professional analytics certificates, including from the Wharton School.

The importance of AI in the pharmaceutical space has grown rapidly over the last few years, spurred by global demand for a COVID-19 vaccine at the pandemic’s apex. AI has proven helpful throughout the pandemic while also providing glimpses of the technology’s future promise.

According to a widely cited 2021 assessment on the impact of AI applications in the wake of COVID-19 from the academic journal Frontiers in Medicine, these technologies achieved “high performance in diagnosis, prognosis evaluation, epidemic prediction, and drug discovery for COVID-19.” 

The authors went on to conclude that AI “has the potential to significantly enhance existing medical and healthcare system efficiency during the COVID-19 pandemic.”

In this article, we provide a brief overview of how AI was employed in the pandemic’s wake to: 

  • Accelerate drug discovery and production
  • Streamline clinical reporting workflows
  • Accelerate data validation

Following a brief summary of the first two trends, we will take a closer look at a use case in which Pfizer worked with an AI vendor to clean up patient data and produce results from clinical trials of its COVID-19 vaccine more quickly. Finally, we will conclude with a look at the frontiers of AI’s potential impact on clinical trial outcomes.

But first, let us begin our overview by taking a closer look at the impact of AI capabilities on the drug discovery process.

The Future of Drug Discovery

The drug discovery process is extremely costly and time-consuming. According to a report by PhRMA:

  • The research-to-marketplace cycle takes at least ten years, with clinical trials taking an average of six to seven years.
  • The average cost to develop a successful drug is $2.6 billion.
  • The overall probability of clinical success (i.e., FDA approval) is estimated at less than 12%.

The time spent in the research-to-marketplace cycle for vaccines specifically is similar, though less costly: up to $500 million, according to the International Vaccine Institute. 

The pharmaceutical industry is under pressure to find newer and better ways of managing the drug discovery process, and answers are now more readily at hand. Large pharmaceutical companies, including Pfizer and Sanofi, are using AI and machine learning to make this process faster and more cost-effective.

In the case of Pfizer, the company succeeded in producing an FDA-authorized vaccine in just under one year, no small feat. According to a Pfizer press release and corroborated by media reporting, AI and machine learning were used to expedite the processing of a massive dataset during the development of its COVID-19 vaccine.

In a more recent case involving the development of an oral medication for COVID, Pfizer also claims to use AI to help screen millions of potential compounds designed to affect molecular drug targets.

In the case of Sanofi, the pharma giant claims on its website to use AI and machine learning to analyze anonymized data from about 450 million patients. The company recently issued a press release announcing a partnership with Exscientia, an AI drug design and development company. The deal is worth a potential $5.2 billion.

Streamlining the Reporting Process

Time is of the essence in the drug discovery process.

One of the more time-consuming elements of the drug discovery process is the preparation and generation of Clinical Study Reports (CSRs). According to a study published in Clinical Researcher, the average time to CSR completion is 109 days — about 3.6 months. 

This lengthy process not only uses up company resources but also keeps potentially useful drugs from moving forward to benefit patients.

CSRs are essential, but much of the labor goes into repetitive tasks that consume the valuable time of highly skilled medical professionals. AI can reduce the time taken up by CSRs, freeing medical writers for other work.

AI-enabled software tools can automate much of the CSR writing process. The amount of time saved by CSR automation solutions varies by vendor. One company claims to cut the time medical writers spend by 60-70%.

Another company, Narrativa, claims a 65% time reduction as well as an average 40% cost reduction for CSR writing.

Most user workflows of AI-enabled CSR solutions follow a similar pattern:

  • Template (output) selection and configuration
  • Uploading of source documents 
  • Reviewing the system-generated CSR
  • User editing of CSR
  • Finalization and approval of output document(s)

Automating CSR content with AI stands to simplify the review process by streamlining the generation of reliable, repeatable, quality text. Among the array of present AI applications, natural language generation (NLG) is widely used across finance and medical writing to facilitate a consistent writing style across the document.
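As a rough illustration of how template-driven text generation keeps wording consistent, the sketch below fills one sentence template from structured trial data. It is a minimal example, not any vendor's implementation: the template wording, the field names, and the sample numbers are all hypothetical.

```python
# Hypothetical per-arm summaries; a real system would extract these
# from the uploaded source documents.
ARMS = [
    {"arm": "Treatment", "n": 120, "events": 6},
    {"arm": "Placebo", "n": 118, "events": 19},
]

# One reusable sentence template guarantees identical phrasing for every arm.
TEMPLATE = (
    "In the {arm} arm, {events} of {n} participants ({rate:.1f}%) "
    "reported at least one adverse event."
)


def render_arm(row: dict) -> str:
    """Compute the event rate and fill the template for a single arm."""
    rate = 100 * row["events"] / row["n"]
    return TEMPLATE.format(rate=rate, **row)


def render_section(rows: list) -> str:
    """Render one sentence per arm, in the order given."""
    return "\n".join(render_arm(r) for r in rows)


if __name__ == "__main__":
    print(render_section(ARMS))
```

The medical writer's job in such a setup shifts from typing each sentence to reviewing the template once and checking the generated output, which matches the review and editing steps in the workflow above.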

The CSR process often requires information that is known only by a medical writer. It might be that only a specific person can explain the actual value and meaning of the data. 

In this case, the writer can program how the document will be read and how the facts fit together. NLP systems can be trained on the writing style and specific medical jargon required for a particular CSR.

A COVID Use Case: Accelerated Data Validation at Pfizer

After clinical trials, patient data must be “cleaned up” enough for scientists to accurately analyze the results. Building a tool to do this cleaning in-house would take a long time and could therefore delay the process further.

At the time, Pfizer was attempting to get its vaccine approved for emergency use by the FDA and could not afford any delays. The company decided to launch a competition to develop an AI-powered tool that could quickly manage and clean clinical data. 

A company called Saama Technologies won the competition with its Smart Data Query (SDQ) solution. 

Saama states in its case study [pdf] report that the SDQ platform accelerates data cleaning and ensures data quality by automating the query management process. The report further states that Saama’s solution leverages the following AI capabilities:

  • Machine learning to predict data discrepancies
  • Natural language processing to help detect adverse event data, check medical histories and case report forms for data consistency, and process more than 750,000 free-text sentences

Saama claims the resulting AI-augmented workflow proceeds as follows:

  1. Site investigators feed electronic case report forms (eCRFs) into their electronic data capture (EDC) system, integrated with Saama’s Smart Data Query (SDQ) platform.
  2. The SDQ platform reviews the data and provides data managers with predictions for their review.
  3. The EDC automatically generates queries as eCRFs, sorted by confidence intervals and highlighting any data discrepancies.
  4. Managers then review each discrepancy and proposed response via the SDQ interface, assigning queries in the “open state” to the eCRF already in the EDC or signing off (step 6, depicted in figure 1 below) on correct changes. 
  5. SDQ recognizes any query text that is already correct and uses that data to improve its algorithm. If the query text has any errors, reviewers can edit the response (step 7) before raising a query. SDQ applies the data from those corrections to its algorithm.
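The general shape of the workflow above (flag likely discrepancies, rank them by confidence, let a reviewer accept or edit each one, and keep the decision as feedback) can be sketched in a few lines. This is not Saama's implementation: the simple rules below stand in for a trained model, and every name, field, threshold, confidence value, and record is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Query:
    record_id: str
    field: str
    message: str
    confidence: float  # 0..1; higher means more likely a real discrepancy
    status: str = "proposed"


def propose_queries(records):
    """Rule-based stand-in for a learned discrepancy model."""
    queries = []
    for rec in records:
        # ISO-8601 date strings sort chronologically as plain strings.
        if rec["ae_onset"] < rec["first_dose"]:
            queries.append(Query(rec["id"], "ae_onset",
                                 "Adverse event onset precedes first dose.", 0.95))
        if not 30.0 <= rec["temp_c"] <= 43.0:
            queries.append(Query(rec["id"], "temp_c",
                                 "Body temperature outside plausible range.", 0.80))
    # Highest-confidence predictions go to the top of the review queue.
    return sorted(queries, key=lambda q: q.confidence, reverse=True)


def review(query, accept, edited_message=None):
    """Apply a reviewer decision and return a feedback record for later retraining."""
    if edited_message is not None:
        query.message = edited_message
    query.status = "open" if accept else "dismissed"
    return (query.record_id, query.field, accept, edited_message is not None)


# Hypothetical case report data with two planted errors.
RECORDS = [
    {"id": "P-001", "first_dose": "2020-08-01", "ae_onset": "2020-08-03", "temp_c": 37.1},
    {"id": "P-002", "first_dose": "2020-08-02", "ae_onset": "2020-07-30", "temp_c": 36.8},
    {"id": "P-003", "first_dose": "2020-08-05", "ae_onset": "2020-08-06", "temp_c": 371.0},
]

if __name__ == "__main__":
    for q in propose_queries(RECORDS):
        print(q.confidence, q.record_id, q.message)
```

In a production system the feedback tuples returned by `review` would be the training signal that lets the model improve over time, as step 5 describes.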

The report features a helpful illustration of the above workflow (figure 1 below):

(Figure 1. Source: Saama Technologies [pdf])

According to the case study report, the result was that the AI reduced the median number of calendar days to generate queries from 25.4 days to 1.7 days across all vaccine studies.
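To put the reported figures in relative terms, the short calculation below derives the percentage reduction and speed-up factor from the two medians quoted in the case study. The function name is ours, and the derived percentages are our own arithmetic rather than numbers stated by Saama.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return 100 * (before - after) / before


before_days, after_days = 25.4, 1.7  # medians reported in the case study

print(round(pct_reduction(before_days, after_days), 1))  # 93.3
print(round(before_days / after_days, 1))                # 14.9
```

In other words, the reported change corresponds to roughly a 93% reduction, or query generation that is about 15 times faster.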

Throughout the report, Saama also claims its algorithms can help link potential treatments to precise biological causes far more effectively and quickly than the trial-and-error approach that typifies the traditional drug discovery process.

In their view, a lake of information can now be reduced to a small pool of insightful data in a massively shortened timeline.

Looking to the Future: Expediting Trial Outcomes

Another potential benefit of this new technology is that trial outcomes could be obtained much sooner. Spurred by the COVID-19 outbreak, we are entering an era in which algorithms can analyze clinical data and estimate the journeys of simulated patients throughout a trial in order to predict its outcomes.

A recent article in the academic journal Nature reported that the German biotechnology company Evotec was able to shorten the discovery process from 4-5 years to 8 months using AI.

By disrupting traditional testing methods with these capabilities, pharmaceutical firms can potentially reduce the average trial cycle from years to mere hours. The AI platform, ‘Centaur Chemist’ from Exscientia, can purportedly “sort through and compare various properties of millions of potential small molecules … for 10 or 20 to synthesize, test, and optimize.”
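The idea of narrowing a very large candidate pool to a handful of compounds worth synthesizing can be illustrated with a simple filter-and-rank funnel. The sketch below is purely illustrative and makes no claim about how Centaur Chemist or any other platform works: the compound library is randomly generated, and the property names, scoring weights, and toxicity cutoff are all assumptions.

```python
import heapq
import random


def score(compound: dict) -> float:
    """Toy desirability score: reward predicted potency, penalize predicted toxicity."""
    return compound["potency"] - 2.0 * compound["toxicity"]


def shortlist(candidates, k=20, max_toxicity=0.3):
    """Drop candidates above the toxicity cutoff, then keep the k best by score."""
    viable = (c for c in candidates if c["toxicity"] <= max_toxicity)
    # heapq.nlargest streams the input, so the full library never sits in memory.
    return heapq.nlargest(k, viable, key=score)


def synthetic_library(n: int, seed: int = 0):
    """Yield n made-up compounds with random 'predicted' properties."""
    rng = random.Random(seed)
    for i in range(n):
        yield {"id": f"cmpd-{i}", "potency": rng.random(), "toxicity": rng.random()}


if __name__ == "__main__":
    for c in shortlist(synthetic_library(1_000_000), k=10):
        print(c["id"], round(score(c), 3))
```

In real drug design the scores would come from trained predictive models and many more properties would be weighed against each other, but the funnel structure of filtering, scoring, and keeping the top few is the same.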

Further investment and development can turn these AI-led methodologies into increasingly robust and trustworthy simulation tools for data-only clinical trials. One noteworthy example of this trend is Pfizer’s recently unveiled “Immune System GPS” model.
