Every data leader has had the conversation. The cloud bill lands. Someone in finance flags the number. The first response is almost always the same: let's get FinOps in place.

So you do. You tag resources, build cost dashboards, set up alerts. Soon you have excellent visibility into exactly how much you're overspending. But nothing actually changes. FinOps has become the BI layer for your cloud costs. It tells you what's happening, but not how to fix it. You can stare at a cost dashboard all day. The jobs are still running.

The companies that actually solve this problem start with a different question: what are we getting for what we spend?

The 80/20 problem

On platforms that serve both automated pipelines and human users (analysts, data scientists and other teams), the cost split usually tells the story. In most environments, around 80% of spend goes to scheduled jobs and orchestration; the other 20% goes to interactive queries, BI tools and exploration.

Your platform is upside down.

That scheduled machine layer is essentially a license fee. Companies will negotiate CRM and HR licenses for months to save a few percent. But the "license cost" of a data platform is the compute consumed by daily jobs. And unlike with pure SaaS, you set that price yourself.

Pipelines keep the lights on, but the real value (the decisions, insights and models) comes from the human side. The goal is to flip the ratio: compress the machine layer to around 20% and let the freed capacity flow to users. Same budget. Dramatically more value.

A platform where 80% of spend maps to human activity justifies itself in every budget discussion.

And then the agents arrive

This matters even more now. AI agents are arriving on the data platform. Not as a roadmap slide, but as real workloads: agents that query, explore, analyze and execute. From a cost perspective, every agent is a power user that never sleeps.

If your machine layer still consumes 80% of your platform spend, costs explode the moment agents arrive. But if you've already compressed it to 20%, you've created the headroom. Humans and AI together at 80%, infrastructure at 20%.

The optimization you do today isn't just about efficiency. It's about creating capacity for what comes next, without asking for a bigger budget.

What optimization actually looks like

So how do you compress the machine layer? Below is a customer case on Databricks, a platform where optimization happens at three levels:

Level 1: Cluster configuration. Right-sizing compute, autoscaling policies, idle termination. Most teams eventually get here (see the sketch after this list).
Level 2: Query performance. Partitioning, pruning and eliminating redundant scans.
Level 3: Anti-patterns and concurrency. How data is ingested, moved and distributed across the platform's compute.
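
To make level one concrete, here is a minimal sketch of a right-sized cluster definition for the Databricks Clusters API, with autoscaling and idle termination. The node type, runtime version and worker counts are illustrative assumptions, not recommendations.

    # A right-sized cluster definition for the Databricks Clusters API (clusters/create).
    # Node type, runtime version and worker counts are illustrative assumptions.
    cluster_spec = {
        "cluster_name": "nightly-etl",        # hypothetical name
        "spark_version": "15.4.x-scala2.12",  # an LTS runtime
        "node_type_id": "Standard_DS3_v2",    # small node; size to the actual workload
        "autoscale": {
            "min_workers": 1,                 # scale down between bursts
            "max_workers": 4,                 # cap spend on spiky stages
        },
        "autotermination_minutes": 10,        # stop paying for idle compute
    }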

Most teams stop at level one, and so do most of the general guides and articles out there. But the real money sits in levels two and three; the sketch below shows the kind of fix that lives at level two.
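
A minimal PySpark sketch of partition and column pruning, assuming a hypothetical Delta table partitioned by event_date:

    from pyspark.sql import functions as F

    # `spark` is the ambient SparkSession in a Databricks notebook or job.
    # Table and column names are hypothetical.

    # Anti-pattern: scan the whole table, filter later in the BI layer.
    all_events = spark.read.table("sales.events")

    # Better: filter on the partition column so untouched partitions are skipped,
    # and keep only the columns downstream consumers actually use.
    recent = (
        spark.read.table("sales.events")
        .where(F.col("event_date") >= "2024-01-01")     # partition pruning
        .select("event_date", "customer_id", "amount")  # column pruning
    )

The same principle scales from a single query to an entire pipeline: the less data each run touches, the less compute you pay for.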

Here's what happened when all three levels were optimized. Originally: 6 data sources, ~300 tables, one full refresh per day, seven days a week. Runtime: ~2 hours 45 minutes. Estimated monthly cost: 30,000–50,000 NOK. Nothing was broken. Everything ran. It was built fast, but not for efficiency.
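
A typical level-three fix in a setup like this is replacing the daily full refresh with incremental upserts, so each run processes only what changed. A minimal Delta Lake sketch, with hypothetical table, key and watermark names:

    from delta.tables import DeltaTable
    from pyspark.sql import functions as F

    # In practice the watermark comes from job state; hardcoded here for illustration.
    last_run = "2024-06-01T00:00:00"

    # Full refresh rewrites every row, every day:
    #   spark.read.table("staging.orders").write.mode("overwrite").saveAsTable("dw.orders")

    # Incremental: upsert only the rows that changed since the last run.
    changes = spark.read.table("staging.orders").where(F.col("updated_at") >= last_run)

    (
        DeltaTable.forName(spark, "dw.orders")
        .alias("t")
        .merge(changes.alias("s"), "t.order_id = s.order_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )

Smaller runs also unlock higher frequency: once a run touches only changed rows, running it several times a day can cost less than one full refresh did.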

After optimization, and with new source tables added along the way, the platform now does significantly more work, at higher frequency, for a fraction of the cost.

Results

                             Before    After
Tables                       300       500+
Updates / week               7         42
Power BI refreshes / week    14        200
Runtime                      2h 45m    <30m
Monthly cost (NOK)           30–50K    <10K

More tables. More frequent updates. Faster runtime. A fraction of the cost.

"We used to track our data platform costs closely every month. Now I barely think about them. Our teams have more and better insights than ever – and with negligible cost."

Håkon Thaulow, Group CFO, NetNordic Group

This kind of bloat exists in almost every data platform that has been running for more than a year. Not because anyone made bad decisions, but because early platform work prioritizes dashboards first and optimization second.

Where to start

Two questions reveal most of the problem: Who decided that your pipelines are efficient? And what percentage of your platform spend goes to scheduled jobs versus the people who use the data?

If you don't know the answers, we can find them together.

Let's optimize