Manifesto, Probabl, 2026 Vol. 01

Enterprises deserve better data science.

The data science industry is at an inflection point. We are surrounded by high‑velocity innovation and new AI technologies that have the potential to create entirely new paradigms.

But as we push forward, we must acknowledge that the long‑term success of enterprise data science is held back by critical challenges. This is our case for a different way forward.

A manifesto by the scikit‑learn founders

The momentum is a testament to the talent of our industry. The reckoning is what we owe it.

We push forward with new tooling that promises to surpass human creativity. But this momentum comes with a long‑term debt: a technology‑first mindset that tempts business leaders to replace proven processes with AI tools that promise magic but ultimately deliver opacity.

Spiraling pay‑as‑you‑go costs hamper economies of scale. All‑in strategies generate vendor lock‑in. The science of data has been quietly traded for the spectacle of it. We can do better.

We suggest a different way forward, one that turns data science into the industrial‑grade practice it deserves to be, and one that empowers enterprises to own it.

§ 02, Five challenges

Five challenges holding back industrial‑grade data science.

Let’s be brutally honest with ourselves. The practice of industrial‑grade data science has not yet achieved its full potential. To secure its long‑term success, we must tackle the challenges we face with ambition.

CHALLENGE / 01 Methodology

The rising tide of technology‑first thinking

AI tooling tells enterprises that legacy applications and processes must be replaced because AI will surpass human creativity and productivity. When innovation in data science does not allow you to understand and reuse your existing experiments and models, it creates technical debt and amplifies costs.

CHALLENGE / 02 Economics

The pay‑as‑you‑go trap

On‑demand pricing has become the norm. Pay for compute. Pay for GPUs. Pay for tokens. Costs spiral out of control. Budget forecasting becomes impossible. When you give your suppliers open‑ended access to your bank account, your expansion generates their profits, not yours.

CHALLENGE / 03 Autonomy

All‑in strategies create lock‑in

Cloud‑only, GPU‑only, or AI‑only strategies sound modern and decisive. But they create strategic dependencies that contradict long‑term value creation. When you lose autonomy, you lose freedom of movement, and your infrastructure decisions become vendor lock‑in.

CHALLENGE / 04 Practice

Data science has not reached the industrial maturity it deserves

Machine learning models rarely make it to production. Experiments are lost when team members leave, and reproducibility remains an aspiration rather than a standard. Practitioners still reinvent wheels, lack shared quality standards, and operate without the engineering discipline that data science deserves.

CHALLENGE / 05 Science

Scientific thinking has been forgotten

Data science is not software engineering. It requires a different discipline. Adopting new technologies should not undermine peer review, explainability, and ultimately trust. Methodology matters. Statistical rigor matters. You should not rush to replace scientific discipline with automated tools that promise magic but deliver opacity.

Five problems. One pragmatic shift. One way forward.

§ 03, Another way forward

Bringing the science of data to the world.

To tackle these challenges, we must make a pragmatic shift in the practice of data science. At Probabl, we advocate firmly for an approach built on four principles.

§ 04, Foundations

The foundations of our approach.

Principle A

Transparency and explainability lead to ownership, trust, and impact.

When you can see and understand how your models work, you can improve and trust them. Trust enables confidence in your decisions and accountability in your results. Understanding drives business value and competitive advantage.

Ownership Trust Impact

Principle B

Composability leads to agility and independence.

We believe in agility and independence. By choosing tools that are modular and plug into your existing stack, you retain the freedom to adapt to change and choose the best tool for each specific use case. You control your destiny and pay the right price rather than being forced into a walled garden.

Modularity Agility Independence

Principle C

Reusability leads to economies of scale.

Innovation should not mean that your existing investments become obsolete. When past experiments and models are treated as building blocks, you can build on experience and create true long‑term value.

Reuse Build‑up Compounding

Principle D

Science first.

Data science was born from the scientific method: hypothesis, experimentation, measurement, and peer review. These foundations are precisely why data science creates value for enterprises. Start with the problem, not the tool. Validate before you deploy. Question before you trust. Methodology should drive tooling, not the other way around.

Hypothesis Measurement Peer review

§ 05, In closing

By returning to these principles, we move away from automated tools that promise magic and return to the rigor that business‑critical systems require.

Signed by the scikit‑learn founders
Probabl SAS, Paris, Open source, 2026