September 25, 2025 · 2 min read

Your pipelines can run fast, but if your data isn't trusted, nothing else matters

Data Engineering · Snowflake · Cloud Migration · Data Quality · ETL

Your dashboards can be fast. Your pipelines can run. But if your data isn't trusted, none of it matters.

When we migrated AT&T's Enterprise Data Warehouse (EDW) from Teradata to Snowflake, everything looked great at first. Tables moved. Pipelines ran. Queries came back faster. By every metric on the migration scorecard, we were winning.

Then the issues started.

The silent failures

Duplicate rows began creeping in. Unlike traditional relational databases, Snowflake accepts primary-key declarations but does not enforce them on standard tables, so uniqueness assumptions that held on Teradata quietly stopped holding.
Asynchronous pipelines started delaying critical real-time business-as-usual (BAU) processes, so time-sensitive work slipped.
Whole batch loads got stuck, triggering escalations across multiple teams.
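The duplicate problem is the easiest of the three to check for explicitly, because a load that repeats a key can still report success. What follows is a minimal sketch, not the check we ran in production: it assumes rows are available as dictionaries, and the column names `account_id` and `bill_date` are hypothetical stand-ins for whatever the business key is.

```python
from collections import Counter


def find_duplicate_keys(rows, key_columns):
    """Return {key_tuple: count} for every business key that appears more than once."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample data: the first two rows share the same business key.
rows = [
    {"account_id": 1, "bill_date": "2025-01-01", "amount": 40.0},
    {"account_id": 1, "bill_date": "2025-01-01", "amount": 40.0},
    {"account_id": 2, "bill_date": "2025-01-01", "amount": 15.5},
]

duplicates = find_duplicate_keys(rows, ["account_id", "bill_date"])
print(duplicates)  # {(1, '2025-01-01'): 2}
```

An empty result means the key is unique in that batch. Anything else is a reason to stop the load before the rows reach downstream consumers.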

The root cause wasn't really technical. It was a measurement problem. We were celebrating migration success metrics (records moved, runtimes improved) while ignoring the one thing that actually mattered downstream: data trust.

What we learned the hard way

Duplicates can silently break downstream systems. A small integrity gap upstream becomes a finance-visible error three hops later, long after the job reported success.
Real-time dependencies need resilience. One late table can stall an entire batch. If the schedule assumes everything arrives on time, it's not a schedule; it's a hope.
Migration isn't just about moving data. It's about re-engineering quality and reliability at scale. The new platform has different rules, and the old guarantees don't migrate for free.
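One way to picture the resilience point is a batch plan that checks which inputs have actually landed, rather than assuming they all have. The sketch below is illustrative only. The job and table names are hypothetical, and a real scheduler would also need retries, timeouts, and alerting.

```python
def plan_batch(jobs, arrived):
    """Split jobs into those that can run now and those blocked by late inputs.

    jobs: mapping of job name -> set of input table names (hypothetical names).
    arrived: set of table names that have already landed.
    """
    runnable = []
    blocked = {}
    for job, inputs in jobs.items():
        missing = set(inputs) - set(arrived)
        if missing:
            blocked[job] = sorted(missing)  # record exactly what is late
        else:
            runnable.append(job)
    return runnable, blocked


jobs = {
    "billing_summary": {"usage", "accounts"},
    "churn_report": {"accounts"},
    "network_kpis": {"cell_events"},
}
arrived = {"accounts", "cell_events"}  # "usage" is late

runnable, blocked = plan_batch(jobs, arrived)
print(runnable)  # ['churn_report', 'network_kpis']
print(blocked)   # {'billing_summary': ['usage']}
```

The late `usage` table holds back only the job that needs it. The other two jobs proceed, and the blocked entry names the missing input so an escalation can start with a specific table instead of a stuck batch.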

How trust came back

Once we built row-level validation, explicit duplicate handling, and resilient scheduling, the escalations dropped and confidence in the system returned. The fix wasn't a faster query engine; it was making correctness a first-class property of the pipeline instead of an assumption.
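To make "row-level validation" concrete, here is a rough sketch of reconciling a source extract against a target extract by row fingerprint. It is an illustration under stated assumptions, not the validation framework described above. It assumes both sides can be read as dictionaries with identically typed values, and the helper names `row_fingerprint` and `reconcile` are made up for this example.

```python
import hashlib
from collections import Counter


def row_fingerprint(row, columns):
    """Hash the chosen columns of a row into a stable hex digest."""
    payload = "\x1f".join(repr(row[col]) for col in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def reconcile(source_rows, target_rows, columns):
    """Count rows missing from the target and rows the target should not have."""
    src = Counter(row_fingerprint(r, columns) for r in source_rows)
    tgt = Counter(row_fingerprint(r, columns) for r in target_rows)
    return {
        "missing_in_target": sum((src - tgt).values()),
        "unexpected_in_target": sum((tgt - src).values()),
    }


columns = ["id", "amount"]
source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
# The target has the same row count, but row 1 is duplicated and row 2 is gone.
target = [{"id": 1, "amount": 10}, {"id": 1, "amount": 10}]

result = reconcile(source, target, columns)
print(result)  # {'missing_in_target': 1, 'unexpected_in_target': 1}
```

Both sides hold two rows here, so a "records moved" count would pass, while the fingerprint comparison reports one missing row and one unexpected duplicate. In practice, values would need normalizing (numeric precision, timestamps, trailing whitespace) before hashing, since the two platforms may represent the same value differently.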

The lesson has stuck with me ever since: in data engineering, moving fast is useless if your data isn't reliable. A migration that's "done" on the scorecard but untrusted by the people downstream isn't done at all.

So I'll throw the question back: what hidden issues have you seen derail a "successful" migration?

I'm Yash Agarwal, a Data Engineer II at Amdocs in Pune, India. I write about building reliable, large-scale data platforms — migrations, data quality, and the war stories in between. You can find more of my work on my portfolio or connect with me on LinkedIn.
