The OSSification of Observability

In the relentless march of technological progress, the realm of Observability has long held the promise of offering deep insights into complex systems. Yet, stagnation lingers. Once heralded as a breakthrough, the standardization of logging, metrics, and tracing feels more like ossification – a calcification of past approaches that fails to address the ever-evolving needs of modern software systems.

While mountains of data are generated, understanding and using that data for proactive problem-solving and performance optimization remains elusive.

Here we explore why the industry needs to move beyond legacy tools and embrace a more dynamic and adaptable approach to gleaning genuine value from the ever-growing ocean of decontextualized data being collected.

Stifling Standards

A significant portion of the stagnation in Observability can be attributed to the outsized influence vendors hold over standardization bodies. This influence creates a self-reinforcing cycle that favors the (re)use of existing techniques, technologies, and, not surprisingly, the vendors’ products. We’re rubber-stamping the past, ignoring the present.

Standardization, in theory, should drive innovation and interoperability.

However, when vendors wield undue influence, the standards themselves can become ossified, prioritizing backward compatibility with legacy tools over embracing new approaches that are a far better fit for addressing the complexity of distributed systems.

All this stifles the development of more efficient and effective methods for sensing and steering systems.

Furthermore, vendor-driven standards often lock users into an ecosystem or way of working, making switching to potentially more effective solutions from other vendors difficult and expensive. This lack of competition and diversity at the conceptual level hinders the overall progress of the observability space – an evolutionary dead-end.

Future Ready

For actual advancement, we need a more balanced approach to standardization, one that favors far more extensible and explorative solutions that can keep pace with the rapid evolution of technology. This would foster a more dynamic ecosystem where innovation is encouraged and users have the freedom to choose the tools that best suit their needs, instead of wielding the same hammer carried over from the Stone Age.

While OpenTelemetry (OTel) has brought some standardization to the observability space, its focus on pre-defined instrumentation types like metrics and traces hinders innovation. It doesn’t provide a flexible framework for building entirely new instrument types.
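For context, here is a minimal sketch of what instrumenting with the OTel Java metrics API looks like (scope and instrument names are illustrative): the Meter exposes builders only for the pre-defined instrument kinds – counters, up/down counters, gauges, and histograms – and offers no extension point for defining an entirely new kind of instrument.

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.metrics.DoubleHistogram;
import io.opentelemetry.api.metrics.LongCounter;
import io.opentelemetry.api.metrics.Meter;

public final class OTelSketch {
  public static void main(String[] args) {
    // Scope and instrument names below are illustrative only.
    Meter meter = GlobalOpenTelemetry.getMeter("checkout-service");

    // The Meter only offers builders for the pre-defined instrument kinds:
    // counters, up/down counters, gauges, and histograms.
    LongCounter requests = meter.counterBuilder("requests").build();
    DoubleHistogram latency = meter.histogramBuilder("latency_ms").build();

    requests.add(1);
    latency.record(42.0);
    // There is no builder here for introducing a new instrument type.
  }
}
```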

Humainary, a private research project, is furnishing a framework built around composability that empowers developers to craft new instruments by combining the outputs of existing ones in novel ways. This unlocks a new wave of innovation in Observability, allowing developers and site reliability engineers to move beyond the limitations of pre-defined tools and gain a deeper understanding of software systems.
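The sketch below is purely hypothetical – the Source and Compose types are invented for illustration and are not Humainary APIs – but it conveys the idea: two primitive instruments (an arrival counter and a service-time timer) are composed into an entirely new utilization instrument that emits its own derived signal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Hypothetical, not thread-safe, illustration only: a minimal "instrument"
// is just a source that emits values to subscribers.
class Source<T> {
  private final List<Consumer<T>> observers = new ArrayList<>();

  void subscribe(Consumer<T> observer) { observers.add(observer); }

  void emit(T value) { observers.forEach(o -> o.accept(value)); }
}

// A combinator that derives a brand-new instrument from two existing ones.
final class Compose {
  static <A, B, R> Source<R> combine(Source<A> left, Source<B> right,
                                     BiFunction<A, B, R> fn) {
    Source<R> derived = new Source<>();
    Object[] latest = new Object[2];
    left.subscribe(a -> { latest[0] = a; maybeEmit(derived, latest, fn); });
    right.subscribe(b -> { latest[1] = b; maybeEmit(derived, latest, fn); });
    return derived;
  }

  @SuppressWarnings("unchecked")
  private static <A, B, R> void maybeEmit(Source<R> out, Object[] latest,
                                          BiFunction<A, B, R> fn) {
    // Only emit once both inputs have produced at least one value.
    if (latest[0] != null && latest[1] != null) {
      out.emit(fn.apply((A) latest[0], (B) latest[1]));
    }
  }
}

public final class Composability {
  public static void main(String[] args) {
    Source<Long> arrivals = new Source<>();      // a counter-like instrument
    Source<Double> serviceTime = new Source<>(); // a timer-like instrument

    // A new instrument built by composition: utilization = arrival rate x service time.
    Source<Double> utilization =
        Compose.combine(arrivals, serviceTime, (n, t) -> n * t);

    utilization.subscribe(u -> System.out.println(u > 0.8 ? "SATURATING" : "OK"));

    arrivals.emit(100L);      // 100 requests per second (illustrative)
    serviceTime.emit(0.009);  // 9 ms per request -> utilization 0.9 -> SATURATING
  }
}
```

The point of the sketch is that the derived instrument is a first-class source like any other, so it can itself be composed further – something a closed set of counters, gauges, and histograms does not offer.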

Vendor Control

Observability vendors focus on ingesting and displaying data through dashboards, and their tooling is often designed around the existing instrumentation types like metrics and traces. By keeping the instrumentation set limited, vendors can exert more control over the data users collect. That is understandable from their current position of power, but it stifles innovation, especially as OTel is designed squarely to move the same data types from source to vendor sink.

When challenged, vendors will claim that such a framework adds unnecessary complexity for many users who simply want to get started with basic observability. Unfortunately, the starting point is invariably the stopping point, which is why so many Observability initiatives fail and will continue to do so.

There is a self-reinforcing loop here: the lack of composability limits the development of new instrument types, and in turn, the limited instrument set reinforces the dominance of dashboards and vendor-specific tools.

Instead of simply moving raw data, Observability tools could focus on intelligent data transformation and aggregation at the source. This would reduce network traffic and allow more contextual and semantically rich data – signs, signals, situations, and status – to be transmitted for further analysis.
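As a hypothetical illustration (the class, thresholds, and status levels below are invented, not any vendor's API), a source-side sensor could fold raw latency samples into a local assessment and transmit only changes in status rather than every sample:

```java
import java.util.function.Consumer;

// Hypothetical sketch of source-side aggregation: instead of shipping every
// raw latency sample over the pipeline, the source reduces samples into a
// coarse status and only transmits when that status changes.
public final class SourceSideStatus {

  enum Status { STABLE, DEGRADED, CRITICAL }

  private final double degradedMs;
  private final double criticalMs;
  private final Consumer<Status> transmit;   // the only thing leaving the process
  private double ewma = 0.0;                 // exponentially weighted moving average
  private Status last = Status.STABLE;

  SourceSideStatus(double degradedMs, double criticalMs, Consumer<Status> transmit) {
    this.degradedMs = degradedMs;
    this.criticalMs = criticalMs;
    this.transmit = transmit;
  }

  // Called in the flow of execution with each observed latency.
  void record(double latencyMs) {
    ewma = 0.9 * ewma + 0.1 * latencyMs;     // aggregate locally
    Status now = ewma >= criticalMs ? Status.CRITICAL
               : ewma >= degradedMs ? Status.DEGRADED
               : Status.STABLE;
    if (now != last) {                        // transmit only on change
      last = now;
      transmit.accept(now);
    }
  }

  public static void main(String[] args) {
    SourceSideStatus sensor =
        new SourceSideStatus(50, 200, s -> System.out.println("status -> " + s));
    for (double ms : new double[]{10, 12, 300, 400, 500, 500, 500, 500, 500}) {
      sensor.record(ms);
    }
  }
}
```

In this example run, nine raw samples reduce to two transmitted status changes, while the full detail remains available locally, in context, where it was produced.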

By moving away from a one-size-fits-all model focused exclusively on mindless transmission and remote processing, Observability can become more innovative, effective, and adaptable to the ever-evolving needs of modern systems.

Dashboard Dominance

Even though dashboards – or, more accurately, databoards – might occasionally appear visually distinct, the data they display often originates from the same limited set of instruments. This means that, regardless of the fancy lipstick-on-a-pig visualizations, the underlying insights remain constrained by the types of data being collected. The emphasis on visually appealing dashboards can distract from the core function of observability – gaining deep system understanding.

Flowery charts and graphs might look impressive to those with insufficient experience in systems monitoring and management, but if they don’t translate into actionable insights, they offer limited value.

Vendors might be competing on aesthetics and ease of use, but the underlying capability – the ability to gather rich and diverse data – remains stagnant. It is impossible not to notice the sameness of these products once they are put to use.

Rewinding Remoteness

Observability in its current form will not deliver deep systems insight. It cannot reconstruct a picture that is fit for purpose because vendors perform the processing remotely, outside the space and time where analysis and action need to be performed – within the flow of execution and information. There are real limits to the amount of raw data that can be moved along the pipeline, from one queue (buffer) to another, from source to sink.

Today, the only real processing performed at the source is the queuing of numerical data values into a buffer for dispatch to an agent or collector. Because of this, vast amounts of duplicative and decontextualized data are pushed along bloated, overflowing pipelines, resulting in the need to sample – invariably at the wrong time.

We need to rewind and chart a new course, one focused on having sensing and steering done primarily locally, allowing for more dynamic and adaptable behavior within the context. We need to stop positioning Observability as some separate, remote, passive observer and start treating it as an essential component of local control loops.
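A purely hypothetical sketch of what such a local control loop could look like (the class, thresholds, and figures are invented for illustration): a locally sensed latency signal directly steers the admission of new work, with simple hysteresis, inside the flow of execution rather than via a remote dashboard.

```java
// Hypothetical sketch: sensing and steering within a local control loop.
// None of this is a specific product API; it only illustrates Observability
// acting inside the flow of execution rather than as a remote, passive observer.
public final class LocalControlLoop {

  private double ewmaLatencyMs = 0.0;   // locally sensed signal
  private boolean shedding = false;     // locally steered behavior

  // Sense: fold each observation into local state within the request path.
  void sense(double latencyMs) {
    ewmaLatencyMs = 0.5 * ewmaLatencyMs + 0.5 * latencyMs;
  }

  // Steer: adapt behavior immediately, in context, with simple hysteresis.
  boolean admit() {
    if (!shedding && ewmaLatencyMs > 200) shedding = true;    // back off
    if (shedding && ewmaLatencyMs < 100) shedding = false;    // recover
    return !shedding;
  }

  public static void main(String[] args) {
    LocalControlLoop loop = new LocalControlLoop();
    double[] observed = {80, 120, 400, 600, 700, 90, 60, 50, 40, 40};
    for (double latency : observed) {
      loop.sense(latency);
      System.out.printf("latency=%.0fms admit=%b%n", latency, loop.admit());
    }
  }
}
```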