This interview analysis is sponsored by Deloitte and was written, edited, and published in alignment with our Emerj sponsored content guidelines. Learn more about our thought leadership and content creation services on our Emerj Media Services page.
R&D teams across life sciences, agriculture, and materials science are under increasing pressure to deliver innovation — but often face a fundamental obstacle: asking questions their data isn’t prepared to answer.
According to a recent IDC survey conducted in partnership with Qlik, 89% of organizations have updated their data strategies to adopt generative AI (GenAI), yet only 26% have actually deployed solutions at scale, and just 12% report infrastructure robust enough to support autonomous decision-making. The disconnect between research ambition and data readiness remains a significant bottleneck.
The problem isn’t limited to infrastructure. Across the health sector, the so-called “10/90 gap” — where less than 10% of research funding targets diseases responsible for 90% of the global disease burden — reflects a widely cited and far deeper misalignment between research priorities and actionable data.
In both public and private domains, organizations struggle to define their hypotheses, interpret their findings, or connect siloed datasets across the R&D lifecycle. To address these challenges, Dr. Daniel Ferrante, AI Leader in R&D and Data Strategy at Deloitte, joined Emerj CEO Daniel Faggella on the ‘AI in Business’ podcast to discuss a new approach to scientific data: contextualizing it through domain-specific large language models (LLMs) and representation learning.
Drawing from his work on Deloitte’s Atlas AI framework, Ferrante outlines strategies for embedding internal data into domain knowledge, avoiding the pitfalls of rigid ontologies, and generating hypotheses through exploratory mapping rather than assumption-driven analysis.
This article examines three critical insights from Ferrante’s interview that provide actionable strategies for R&D and innovation leaders:
- Building data context before building AI models: Mapping scientific data onto learned representations from domain-specific LLMs enables organizations to identify where their information aligns with established knowledge and where key gaps may remain.
- Ontologies as a starting point, not a constraint: Teams should treat ontologies as flexible labeling tools rather than rigid, committee-built constraints — layering them into broader geometric and statistical models so that LLMs can bridge different naming systems and domains without being trapped by inconsistent labels.
- Start with exploratory cartography, not assumptions: Before jumping to analysis, teams should enable smarter hypothesis formation by using LLMs to map internal data against known domain models (e.g., chemistry or genomics) that visualize patterns, identify clusters, and discover latent structures.
Guest: Daniel Ferrante, Partner, AI Leader in R&D and Data Strategy at Deloitte
Expertise: AI strategy, Data Monetization, IP-driven POCs
Brief Recognition: Dr. Daniel Ferrante is a Partner and AI Leader in R&D and Data Strategy at Deloitte. Before joining Deloitte, he co-founded SLF Scientific, where he served as Chief Science & Data Officer. Dr. Ferrante received his Master’s and PhD in Theoretical Physics from Brown University.
Building Data Context Before Building AI Models
According to Dr. Daniel Ferrante, one of the most common missteps in enterprise AI adoption is assuming that internal data is ready for immediate model training. On the podcast, he explains that research and business leaders often ask questions their data cannot answer without the necessary contextual framework.
“When you ask a business or R&D question, you often don’t know if the data you have can answer it,” he notes.
Ferrante underscores that Deloitte’s Atlas AI approach starts not with modeling but with representation — embedding internal datasets into domain-specific LLM landscapes (e.g., chemistry, genomics):
“Because so much of R&D involves experimental unknowns, we should start by leveraging what LLMs have already learned in specific scientific domains. By mapping our internal data against that learned landscape, we can understand its position within broader domain knowledge. Once we begin asking questions of our data in that context, we can start identifying meaningful patterns — giving us the biological, chemical, or scientific perspective we didn’t have before.”
– Dr. Daniel Ferrante, AI Leader in R&D and Data Strategy at Deloitte
These learned spaces, he says, often resemble terrains of valleys and ridges, with each feature representing a distinct property such as solubility or toxicity.
This strategy, Ferrante argues, allows organizations to position their data within a broader scientific context before attempting hypothesis testing or downstream analytics.
“We’re not just generating data — we’re generating labels,” he states, emphasizing that mapping comes first, not assumptions. For leaders, this means that AI serves as a lens for understanding what is possible with current data rather than rushing into ill-posed modeling tasks.
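Ferrante does not describe Atlas AI’s internals on the podcast, but the positioning idea can be illustrated with a minimal, hypothetical sketch: assume every internal record and every piece of established domain knowledge has an embedding from a domain-specific LLM (mocked here with random vectors), then measure how far each internal record sits from known science to flag potential gaps.

```python
# Hedged sketch only: Atlas AI's implementation is not public, so this mocks
# the idea of "positioning" internal data within a learned domain landscape.
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for embeddings from a domain-specific LLM (e.g., a chemistry
# model). In practice these would come from a real encoder, not random draws.
reference = rng.normal(size=(100, 8))  # well-characterized domain knowledge
internal = rng.normal(size=(10, 8))    # an organization's own datasets

def nearest_reference_distance(points, refs):
    """Distance from each internal point to its closest reference point."""
    d = np.linalg.norm(refs[None, :, :] - points[:, None, :], axis=-1)
    return d.min(axis=1)

dist = nearest_reference_distance(internal, reference)

# Records far from any reference suggest gaps: regions the internal data
# occupies that established domain knowledge does not cover well.
gap_threshold = np.percentile(dist, 80)
gaps = np.where(dist > gap_threshold)[0]
print(f"{len(gaps)} of {len(internal)} internal records sit far from known science")
```

The point of the sketch is the ordering Ferrante describes: the distance map comes first, and only the questions the data can plausibly answer proceed to modeling.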
Ontologies as a Starting Point, Not a Constraint
Dr. Ferrante stresses in the interview that ontologies — while valuable for categorization — should not become architectural constraints that lock research teams into brittle naming conventions. He critiques what he calls “Frankenstein ontologies” assembled by committees, where companies often attempt to enforce a single vocabulary across diverse domains or research traditions.
Instead, Ferrante describes Atlas AI’s approach of embedding ontologies as soft labels within a broader statistical framework. Using soft labels allows for semantic similarity and flexible mapping, even when teams use different taxonomies:
“Ontologies are just structured labels — they’re useful, and we should use them wherever possible. But we shouldn’t trap ourselves within them.
In physics, we often make a problem larger in order to make it solvable, and the same principle applies here. Rather than trying to force-fit every data point into a rigid ontology, why not leverage language models that may have already learned the structure we’re trying to formalize? Why lock ourselves into Frankenstein ontologies when there’s a more flexible and scalable alternative?”
– Dr. Daniel Ferrante, AI Leader in R&D and Data Strategy at Deloitte
He argues that this method provides a more robust way to unify internal and external data without requiring standardization upfront. In Ferrante’s view, the role of ontologies should shift — from rigid taxonomies to tools for enhancing representation learning.
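As one way to picture the soft-label idea (an illustration, not Deloitte’s actual implementation), the sketch below embeds terms from two hypothetical team vocabularies and matches them by cosine similarity rather than exact string equality; the vectors are hand-picked stand-ins for real LLM embeddings.

```python
# Hedged sketch of "ontologies as soft labels": match differently named
# taxonomies by embedding similarity instead of enforcing one vocabulary.
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Two teams' terms for the same concepts (hypothetical toy embeddings).
team_a_terms = {"hepatotoxicity": np.array([0.9, 0.1, 0.0]),
                "solubility":     np.array([0.1, 0.9, 0.1])}
team_b_terms = {"liver toxicity":     np.array([0.88, 0.15, 0.05]),
                "aqueous solubility": np.array([0.12, 0.85, 0.20])}

# Bridge the taxonomies: each Team A label maps to its most similar
# Team B label, with no shared naming convention required upfront.
mapping = {name_a: max(team_b_terms, key=lambda n: cosine(vec_a, team_b_terms[n]))
           for name_a, vec_a in team_a_terms.items()}
print(mapping)
```

Because the match is a similarity score rather than a hard lookup, inconsistent labels degrade gracefully instead of breaking the pipeline — which is the flexibility Ferrante contrasts with “Frankenstein ontologies.”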
Start with Exploratory Cartography, Not Assumptions
Throughout the conversation, Dr. Ferrante challenges conventional AI workflows that begin with hypothesis formation or confirmatory analytics. Instead, he advocates for “cartography” — a term he uses to describe the exploratory process of mapping an organization’s internal data within a learned domain landscape.
In the interview, Ferrante explains how researchers using Atlas AI might project oncology drugs into a chemical representation space derived from publicly available databases, such as PubChem. In doing so, teams can detect whether certain compounds cluster by toxicity, binding affinity, or another property — even before building formal models. “It’s not about labeling more data,” Ferrante says. “It’s about seeing where that data sits in a landscape of known science.”
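Since Atlas AI itself is not public, the workflow can only be mocked, but the cartography step Ferrante describes might look like the sketch below: compounds get coordinates in a stand-in “chemical representation space” (random vectors rather than real PubChem-derived embeddings), and a minimal k-means pass checks whether they fall into clusters worth investigating before any formal modeling.

```python
# Hedged sketch of "cartography before analysis": look for clusters in an
# embedding space before building formal models. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical groups of compounds (e.g., differing toxicity profiles),
# placed in a mocked 2-D chemical representation space.
group_1 = rng.normal(loc=0.0, scale=0.3, size=(20, 2))
group_2 = rng.normal(loc=3.0, scale=0.3, size=(20, 2))
compounds = np.vstack([group_1, group_2])

def kmeans(points, k=2, iters=20, seed=0):
    """Minimal k-means: returns a cluster label for each point."""
    r = np.random.default_rng(seed)
    centers = points[r.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = np.linalg.norm(points[:, None] - centers[None], axis=-1).argmin(axis=1)
        # Recompute each center as the mean of its points (keep old center
        # if a cluster happens to be empty).
        centers = np.array([points[labels == i].mean(axis=0)
                            if np.any(labels == i) else centers[i]
                            for i in range(k)])
    return labels

labels = kmeans(compounds)
# Well-separated groups landing in distinct clusters hint at latent structure
# (e.g., a toxicity-linked cluster) worth turning into a hypothesis.
print("cluster sizes:", np.bincount(labels))
```

The output of a pass like this is not a model but a map: it tells the team which clusters exist and therefore which hypotheses are worth formalizing.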
Ferrante argues that this exploratory practice leads to better hypothesis generation and prevents wasted effort on questions the data can’t meaningfully address. For R&D leaders, the key takeaway is a shift in mindset: use LLMs not only for automation but also for revealing patterns that guide smarter experimentation and more informed investments in modeling.