What is What Just Happened (WJH)?

What Just Happened (WJH) is a software feature, or category of tooling, for automated, near-real-time root cause analysis in complex IT and network environments. It continuously monitors system telemetry (logs, metrics, and events) and correlates anomalies with recent changes and failures to identify what most likely caused a service degradation or outage. Its core value is replacing manual troubleshooting, in which engineers can spend hours sifting through dashboards, with a probable cause presented automatically as the incident unfolds. It is not merely an advanced alerting system: it is a diagnostic engine that interprets operational data to answer the eponymous question when an incident begins, often pointing to the faulty commit, configuration push, infrastructure failure, or security event responsible.

WJH typically works by ingesting a high-volume stream of structured and unstructured data from across the technology stack. Using a combination of deterministic rules, statistical baselining, and, increasingly, machine learning models, the system establishes what normal behavior looks like for the environment. When a significant deviation is detected, such as a spike in error rates or a drop in throughput, the tool searches for a probable cause. It examines the timeline of changes and state variations across interdependent components, filters out irrelevant noise, and surfaces the event most strongly correlated with the observed symptoms. Because correlation alone does not establish causation, the result is best read as a ranked hypothesis with supporting evidence rather than a verdict. For instance, in a cloud-native application, a WJH system might link a database latency spike to a specific automated scaling action or a recent deployment of a particular microservice, and present that linkage with its evidence to the operations team.
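
The detect-then-correlate flow described above can be illustrated with a minimal Python sketch. It is not how any particular WJH product works: the names ChangeEvent, find_anomaly, and rank_candidates, the z-score threshold, the lookback window, and all sample data are hypothetical, and ranking by recency stands in for the richer correlation a real system would apply.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class ChangeEvent:
    timestamp: float  # seconds since an arbitrary start (hypothetical data)
    description: str


def find_anomaly(samples, baseline_size=5, threshold=3.0):
    """Return the timestamp of the first sample whose z-score against the
    baseline window exceeds the threshold, or None if there is none."""
    baseline = [value for _, value in samples[:baseline_size]]
    mu, sigma = mean(baseline), stdev(baseline)
    for ts, value in samples[baseline_size:]:
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            return ts
    return None


def rank_candidates(anomaly_ts, changes, lookback=300.0):
    """Return changes made within `lookback` seconds before the anomaly,
    most recent first. Recency is a crude proxy for correlation."""
    recent = [c for c in changes if 0 <= anomaly_ts - c.timestamp <= lookback]
    return sorted(recent, key=lambda c: anomaly_ts - c.timestamp)


# Hypothetical error-rate samples as (timestamp, percent) pairs.
error_rate = [(0, 1.0), (60, 1.2), (120, 0.9), (180, 1.1), (240, 1.0),
              (300, 1.1), (360, 9.5), (420, 10.2)]

# Hypothetical change log for the same period.
changes = [
    ChangeEvent(100, "config push: cache TTL"),
    ChangeEvent(330, "deploy: checkout-service v2"),
    ChangeEvent(400, "autoscale: add 2 nodes"),
]

anomaly_ts = find_anomaly(error_rate)
candidates = rank_candidates(anomaly_ts, changes) if anomaly_ts is not None else []
for c in candidates:
    print(f"{anomaly_ts - c.timestamp:.0f}s before anomaly: {c.description}")
```

With this sample data the spike is detected at t=360, the deployment 30 seconds earlier ranks first, and the autoscaling event is excluded because it happened after the anomaly began. A production system would weigh many more signals, such as service dependencies and the blast radius of each change, before naming a probable cause.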

The practical implications of deploying WJH are substantial for organizational efficiency and system reliability. It can significantly reduce mean time to resolution (MTTR) for incidents, which limits business impact and frees engineers from firefighting to focus on higher-value work. By providing objective, data-driven attribution for failures, it also supports accountability and continuous improvement in development and operations workflows, turning incident post-mortems from speculative discussions into fact-based analyses. However, a WJH system is only as good as the telemetry it consumes: gaps in instrumentation can lead to incorrect or incomplete diagnoses. Adopting it is also a cultural shift, because teams must trust and act on automated findings, and that trust depends on the tool being transparent about how it reaches its conclusions.
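
For reference, MTTR is the average time from detection to resolution across a set of incidents. The short Python sketch below shows the arithmetic; the incident timestamps and the function name mttr_minutes are hypothetical and exist only to illustrate the calculation.

```python
from datetime import datetime

# Hypothetical incidents as (detected, resolved) pairs.
incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 11, 30)),  # 90 minutes
    (datetime(2024, 1, 5, 14, 0), datetime(2024, 1, 5, 14, 45)),  # 45 minutes
    (datetime(2024, 1, 9, 9, 0), datetime(2024, 1, 9, 9, 15)),    # 15 minutes
]


def mttr_minutes(incidents):
    """Mean time to resolution in minutes: total resolution time / incident count."""
    total_seconds = sum(
        (resolved - detected).total_seconds() for detected, resolved in incidents
    )
    return total_seconds / len(incidents) / 60


print(mttr_minutes(incidents))  # 50.0
```

Tracking this figure before and after adopting a tool is one way to check whether faster diagnosis actually shortens incidents.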

Ultimately, What Just Happened represents a maturation of observability practices, moving beyond the "three pillars" of logs, metrics, and traces toward synthesized intelligence. It is an important capability for managing the complexity and pace of change in modern distributed systems, where manual causal analysis is often too slow to be practical. It is not a silver bullet that eliminates operational complexity, but it acts as a force multiplier for site reliability engineering (SRE) and DevOps teams, turning raw data into actionable insight at the most critical moments of a system failure.