There’s a moment in most large technology organizations when someone looks at the monitoring and observability landscape and realizes it has become completely ungovernable. Different teams have adopted different tools. Data lives in different places. When something goes wrong across team boundaries — which is when observability matters most — there’s no single place to look. A MarTech organization we worked with had reached exactly that moment. More than ten different tools and approaches were in use across the enterprise. The decision was made to standardize on Datadog. The mandate was clear. The path to getting there was not.
A good decision with a cost problem
The Integrations team, still sending logs from their .NET applications to Azure Application Insights through Serilog, needed to migrate, with near real-time ingestion and support for 15,000 requests per second at peak. The challenge: there’s no out-of-the-box way to install the Datadog agent on Azure PaaS services like Azure Web Apps. So the team built a workaround: an Azure Event Hub to receive logs asynchronously, with a Datadog Forwarder Function App to push everything to Datadog in JSON format. It worked. And then more applications started using it.
As more applications adopted the solution, the Function App couldn’t keep up, and its cost became unsustainable. The operational overhead of maintaining it undermined the original goal, which was to make observability simpler, not more complicated.
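A rough estimate shows why a forwarder in the middle of the pipeline hits a ceiling. The sketch below is illustrative only: the 15,000 requests per second peak comes from the engagement, while the number of log events per request and the forwarder batch sizes are hypothetical values chosen to make the arithmetic concrete, not measurements.

```python
import math


def forwarder_invocations_per_second(requests_per_second, logs_per_request, batch_size):
    """Estimate how many forwarder invocations per second are needed at peak."""
    events_per_second = requests_per_second * logs_per_request
    # Each invocation drains at most one batch of events from the hub.
    return math.ceil(events_per_second / batch_size)


PEAK_RPS = 15_000        # peak request rate stated for the Integrations team
LOGS_PER_REQUEST = 3     # assumption: hypothetical average log events per request

# Assumption: two hypothetical batch sizes, to show how sensitive the load is.
for batch_size in (100, 10):
    rate = forwarder_invocations_per_second(PEAK_RPS, LOGS_PER_REQUEST, batch_size)
    print(f"batch size {batch_size}: {rate} invocations/second")
# prints:
# batch size 100: 450 invocations/second
# batch size 10: 4500 invocations/second
```

Whatever the real numbers were, the shape is the same: every application added to the pipeline raises the event rate, and the forwarder's work and bill rise with it.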
“Standardize on the right tool. But if your infrastructure isn’t ready to support it natively, you’re not solving the problem — you’re just moving it.” — Sirrus7
The fix: move the foundation, not the tool
The resolution wasn’t to find a better workaround. It was to eliminate the need for one. Applications moved off Azure Web Apps onto an Azure Managed Kubernetes cluster, where the Datadog agent could run natively and read logs directly, without the Event Hub or Function App in the middle. The cost problem and the scale problem went away together, because they were always the same problem: the hosting infrastructure wasn’t ready for what was being asked of it.
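On Kubernetes, the Datadog agent is typically deployed on each node and collects what containers write to standard output, so the application's job shrinks to emitting structured log lines. The team's applications used .NET and Serilog. What follows is only a Python sketch of the same idea, and the names `JsonLineFormatter` and `build_logger`, the logger name, and the JSON field names are hypothetical rather than taken from the engagement.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to stdout (or a given stream)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


# Hypothetical usage: the application only writes to stdout; a node-level
# agent is assumed to pick the lines up from the container's log output.
log = build_logger("integrations-api")
log.info("order %s synced", 42)
```

The design point is that nothing in the application knows about Datadog: there is no hub to publish to and no forwarder to scale, only a log stream that the platform already exposes.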
Architecture evolution
Phase 1 — Workaround: .NET Apps + Serilog → Azure Event Hub → Forwarder Function App → Datadog (Cost & scale failure)
Phase 2 — Right foundation: .NET Apps + Serilog → Azure Managed Kubernetes → Datadog Agent (native) → Datadog
What this engagement taught us
- Tooling mandates without infrastructure readiness create workarounds, not solutions. Standardizing on Datadog was correct. Doing it before the hosting environment could support native agent deployment forced an architecture that couldn’t scale.
- The observability problem and the infrastructure problem were the same problem. The Event Hub architecture revealed that Azure Web Apps weren’t the right long-term foundation. Kubernetes solved both issues simultaneously.
- Creative workarounds have a scale ceiling. A solution that works for one or two applications often breaks when extended to ten or twenty. Build with eventual scale in mind, even when starting small.
- Sequence matters as much as direction. Getting to the right destination in the wrong order costs significantly more than planning the path before you start walking it.
This story isn’t really about Datadog or Azure or Kubernetes. It’s about what happens when a well-intentioned modernization decision meets infrastructure that wasn’t ready. If something about this story is hitting a little close to home, we’d love to talk. The context window is always open.