Writing

Blog

Notes on building reliable, large-scale data platforms: cloud migrations, pipelines, data quality, and the occasional war story.

June 23, 2026 · 3 min read

You can't check nationality in a millisecond: the access control trap behind frontier AI's safety recalls

A frontier model shipped on a Friday and was pulled three days later. The headline was national security, but underneath it is an access-control engineering problem I keep seeing: a permission model more granular than the infrastructure can enforce.

April 22, 2026 · 2 min read

Meta's real moat was never the benchmark. It's the data layer

Muse Spark took Meta from an 18 to a 52 on the intelligence index in nine months. But the number that matters isn't the benchmark; it's the quiet move from open weights to a proprietary model sitting on three billion users. A data engineer's read on where the advantage actually lives.

February 26, 2026 · 3 min read

The 'SaaS-apocalypse' wasn't a crash, it was a re-pricing of what software is worth

When new Claude workflow tools landed, markets erased roughly $285 billion from software stocks in a single session. No outage, no breach, just fear that AI agents are becoming full-stack replacements for entire workflows. A data engineer's take on why the future is fewer tools and more systems.

February 20, 2026 · 2 min read

AI's centre of gravity is shifting: India and the rise of real-world AI adoption

The India AI Impact Summit made one thing clear: the next phase of AI won't be defined only by model breakthroughs, but by where and how those models are deployed, governed, and scaled. A data engineer's view on why real-world constraints matter more than demos.

January 5, 2026 · 2 min read

Why LangGraph feels familiar to a data engineer: state, orchestration, and failure paths

I picked up LangGraph half out of curiosity and half because it became a project requirement. What struck me is how much it rhymes with data engineering: state over stateless calls, orchestration over linear chains, and designing for the failure paths.

December 18, 2025 · 2 min read

Databricks' new IDE quietly changes how data engineers build pipelines

Databricks introduced a dedicated IDE for data engineering, and it isn't just another UI update. It's a shift from experiment-first notebooks to engineer-first workflows: structured pipeline authoring, real lineage, and Git-native version control. Why that matters for anyone maintaining production pipelines.

December 10, 2025 · 2 min read

Turning 25: notes on being a work in progress

When I was younger, 25 felt like peak adulthood: career sorted, life direction locked in. The reality is somewhere between 'I think I know what I'm doing' and complete confusion. A few honest reflections on growth, timelines, and trusting the process.

November 20, 2025 · 2 min read

Databricks just made data governance feel like leverage, not compliance

Databricks' November release isn't only a feature drop; it reads like a direction shift for modern data platforms. External tables into Unity Catalog with lineage intact, cross-cloud sharing with SAP, attribute-based access control, and audit logs. Why this moves us from 'store and compute' to 'connect and understand.'

November 13, 2025 · 2 min read

Snowflake × SAP and the rise of the 'AI-READY' data fabric

Snowflake and SAP announced a collaboration to build a shared, AI-ready business data fabric: zero-copy sharing, semantic enrichment, and unified governance across clouds. Having migrated AT&T's warehouses from Teradata to Snowflake, I found the hardest problem was never performance. It was context.

November 6, 2025 · 2 min read

Why great data engineers think like product managers

When I started out, I thought being a good data engineer was all about clean pipelines and perfect schemas. Over time I realised the best ones don't just move data; they move decisions. They think in outcomes, not objects.

October 22, 2025 · 1 min read

Travel resets the cache

Just back from the Philippines: beaches, canyons, islands, and a lot of laughter. Somewhere in between chasing sunsets, I was reminded that the same principle applies to data systems and humans: clear the clutter, refresh your processes, reconnect to what actually matters.

September 25, 2025 · 2 min read

Your pipelines can run fast, but if your data isn't trusted, nothing else matters

When we migrated AT&T's enterprise data warehouse from Teradata to Snowflake, everything looked great: tables moved, pipelines ran, queries got faster. Then the silent failures started. The real lesson of that migration wasn't about speed: it was about data trust.

September 18, 2025 · 2 min read

What I've learned in 2 years as a data engineer (at 24)

When I started my career, I was convinced that mastering tools was everything: Python, SQL, Snowflake, Teradata, cloud platforms. After migrating 1,000+ tables and 45B+ records for AT&T and shipping a production GenAI pipeline, I learned the thing that actually future-proofs a career in data.

September 11, 2025 · 2 min read

GenAI won't replace data engineers, it'll empower us

There's been so much talk about AI taking over jobs. Here's what I've actually seen in my work: on a recent GenAI-heavy project, the AI handled modular coding while our team focused on orchestrating and scaling the workflow. The future isn't AI vs humans: it's AI + humans.

August 26, 2025 · 3 min read

How I drove billing data integrity incidents to ZERO with a 3-layer self-auditing system

A look at the self-auditing architecture I built on AT&T's billing platform at Amdocs, reconciling data across multiple cross-team handoffs and catching problems before they ever reach finance.