Research · 10 min read

The Case for Domain-First AI: A Research Perspective

Teams Lab Research · 15 March 2025

The state of AI in industrial operations

General-purpose AI capabilities have advanced rapidly. Language models can reason about complex problems. Forecasting models have broken historical accuracy records on public benchmarks. Agent frameworks can automate multi-step workflows.

And yet, the failure rate of AI deployments in industrial operations — supply chain, manufacturing, trade compliance — remains stubbornly high. A 2024 survey of industrial AI deployments found that fewer than 30% were still in active use 18 months after go-live.

This is not a model quality problem. The models are good enough. It is an integration problem — specifically, a failure to integrate domain knowledge into the AI system design.

What we mean by domain-first

"Domain-first" is not a methodology in the traditional sense. It is a design principle that says: before you design the AI system, you must have a working model of the domain the system will operate in.

A domain model for supply chain includes:

The decision structure: who decides what, with what information, under what constraints
The data structure: what data exists, at what granularity, with what reliability and latency
The constraint structure: what operational constraints are non-negotiable, which are soft, and which are artefacts of current practice
The failure mode structure: how does the current system fail, and under what conditions

Without this model, AI system design is essentially guessing. With it, you can design for the specific decisions that matter, with the specific data that is available, under the specific constraints that bind.
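The four structures above can be written down as a small typed record, which makes gaps visible early. The following is a minimal sketch, not a Teams Lab artefact: every class name, field name, and example value is hypothetical and chosen only to mirror the list above.

```python
from dataclasses import dataclass, field
from enum import Enum


class ConstraintKind(Enum):
    HARD = "non-negotiable"
    SOFT = "soft"
    ARTEFACT = "artefact of current practice"


@dataclass
class Decision:
    name: str
    owner: str               # who decides
    inputs: list[str]        # with what information
    constraints: list[str]   # under what constraints (by name)


@dataclass
class DataSource:
    name: str
    granularity: str         # e.g. "SKU x week"
    reliability: str         # free-text note on known weaknesses
    latency_hours: float


@dataclass
class Constraint:
    name: str
    kind: ConstraintKind


@dataclass
class FailureMode:
    description: str         # how the current system fails
    conditions: str          # under what conditions


@dataclass
class DomainModel:
    decisions: list[Decision] = field(default_factory=list)
    data_sources: list[DataSource] = field(default_factory=list)
    constraints: list[Constraint] = field(default_factory=list)
    failure_modes: list[FailureMode] = field(default_factory=list)

    def undocumented_inputs(self) -> set[str]:
        """Inputs that decisions rely on but no documented data source provides."""
        documented = {source.name for source in self.data_sources}
        needed = {name for d in self.decisions for name in d.inputs}
        return needed - documented


# Hypothetical example values, for illustration only.
model = DomainModel(
    decisions=[
        Decision(
            name="weekly replenishment quantity",
            owner="regional planner",
            inputs=["demand_forecast", "supplier_lead_time"],
            constraints=["minimum order quantity"],
        )
    ],
    data_sources=[
        DataSource("demand_forecast", "SKU x week", "revised after promotions", 24.0)
    ],
    constraints=[Constraint("minimum order quantity", ConstraintKind.HARD)],
    failure_modes=[FailureMode("stock-outs", "supplier lead times slip")],
)

print(model.undocumented_inputs())  # {'supplier_lead_time'}
```

In this sketch the replenishment decision depends on a supplier lead time that no documented data source provides, which is the kind of gap a domain model is meant to surface before system design starts.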

Three research findings

Our research programme has produced three findings we consider robust enough to act on:

1. Explainability is a constraint, not a feature

In consumer AI applications, explainability is often treated as a nice-to-have. In industrial operations, it is a hard constraint.

Supply chain planners have institutional knowledge that exceeds what any model can capture. When a model's recommendation conflicts with their knowledge, they need to evaluate it — not just override it. Evaluation requires explainability.

Systems designed without planner-accessible explanations consistently see overrides that neither follow from the model's reasoning nor feed back into it. The model and the planner diverge, and the model becomes irrelevant.

2. The data quality problem is not solvable without domain knowledge

Every AI project discovers data quality problems. The standard response is to invest in data cleaning. This is necessary but not sufficient.

Data quality problems in industrial contexts are often structural — they reflect how data was captured, what business processes generated it, and what decisions it was designed to support. Cleaning the data without understanding the structure produces clean data with the same structural problems.

Domain knowledge is required to distinguish cleanable noise from structural bias that requires process change upstream.

3. Benchmark accuracy is the wrong success metric

The industry standard for AI deployment success is model accuracy on a held-out test set. This metric is nearly useless for predicting deployment success.

What predicts deployment success is decision accuracy improvement — the improvement in the quality of the actual decisions made by the people using the system. This requires measuring the pre-deployment decision quality as a baseline, which most projects do not do.

At Teams Lab, we measure decision quality before any AI system is deployed. This requires domain knowledge to define what a "good decision" looks like in the specific operational context.
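As a rough illustration of the metric, decision accuracy improvement can be computed as the change in the share of good decisions between a pre-deployment baseline and the post-deployment period. This is a simplified sketch under stated assumptions: the `DecisionRecord` type, the boolean `was_good` judgement, and the sample records are hypothetical, and a real rubric for a "good decision" would come from the domain and is not specified here.

```python
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    decision_id: str
    was_good: bool  # assumed: judged by domain experts against an agreed rubric


def good_decision_rate(records: list[DecisionRecord]) -> float:
    """Share of recorded decisions judged good."""
    if not records:
        raise ValueError("at least one decision record is required")
    return sum(r.was_good for r in records) / len(records)


def decision_accuracy_improvement(
    baseline: list[DecisionRecord], post_deployment: list[DecisionRecord]
) -> float:
    """Post-deployment good-decision rate minus the pre-deployment baseline."""
    return good_decision_rate(post_deployment) - good_decision_rate(baseline)


# Hypothetical records: 2 of 4 good before deployment, 3 of 4 good after.
baseline = [
    DecisionRecord("b1", True),
    DecisionRecord("b2", False),
    DecisionRecord("b3", True),
    DecisionRecord("b4", False),
]
post_deployment = [
    DecisionRecord("p1", True),
    DecisionRecord("p2", True),
    DecisionRecord("p3", False),
    DecisionRecord("p4", True),
]

improvement = decision_accuracy_improvement(baseline, post_deployment)
print(improvement)  # 0.25
```

Without the baseline list there is nothing to subtract from, which is why the measurement has to start before go-live.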

The practical implication

If you are planning an AI deployment in supply chain or trade compliance, the most important investment you can make is in domain knowledge documentation — not in model selection, data infrastructure, or MLOps tooling.

Start by writing down, in precise terms, what decisions you want to improve. Then document what information those decisions currently use, what information is available but not used, and what information does not exist but would be valuable.
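One lightweight way to record this exercise is a per-decision inventory in which the two gap categories fall out as set differences. The sketch below is illustrative only: the `DecisionInventory` type and all information names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionInventory:
    decision: str
    used: frozenset[str]       # information the decision currently uses
    available: frozenset[str]  # information that exists today
    valuable: frozenset[str]   # information that would improve the decision

    @property
    def available_but_unused(self) -> frozenset[str]:
        return self.available - self.used

    @property
    def missing_but_valuable(self) -> frozenset[str]:
        return self.valuable - self.available


# Hypothetical example for a single decision.
inventory = DecisionInventory(
    decision="expedite or hold an inbound shipment",
    used=frozenset({"open_orders", "on_hand_stock"}),
    available=frozenset({"open_orders", "on_hand_stock", "supplier_lead_time_history"}),
    valuable=frozenset({"supplier_lead_time_history", "port_congestion_signal"}),
)

print(sorted(inventory.available_but_unused))  # ['supplier_lead_time_history']
print(sorted(inventory.missing_but_valuable))  # ['port_congestion_signal']
```

Keeping the three sets separate makes it clear which improvements need only a change in how the decision is made and which need new data to be captured.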

This exercise typically takes 4 to 6 weeks and consumes significant time from senior operations staff. It is almost always the highest-ROI activity in the project.

This article draws on research conducted across 40+ AI deployment reviews and 12 original engagements. Teams Lab publishes research on AI in industrial operations quarterly.
