Humane Factors in Observability – Part 1

Blueprints

When we set out to create a toolkit that would allow the rapid design, development, and deployment of new instruments for measuring software service behaviors, several essential considerations were placed on the board.

This two-part series will discuss critical factors that weighed heavily in our rethinking of Observability and how they manifest in our toolkit under the headings: conceptualization, communication, coordination, collaboration, and cognition.

Conceptualization

A shared model representation of an environment’s significant structural and behavioral elements is essential for effective and efficient communication, coordination, and cooperation.

The substrates.io project achieves this by defining a set of concepts, along with their contracts, that we have found recurring across several instrumentation libraries we have designed since 2003.

Communication

Existing open-source observability libraries assume that the measurement data collected via instrumentation interfaces is immediately pushed outwards along a data pipeline toward an endpoint in the cloud.

In contrast, the Humainary toolkit is designed to allow data to be consumed and processed within the managed runtime space. This choice stems from our research background and ongoing interest in developing self-adaptive software service systems.

Considering how much of the collected data is noisy, it seems bizarre that we continue pumping data of little utility along costly pipelines instead of processing such events locally, turning sequences into signals and other higher semantic forms.
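To make "turning sequences into signals" concrete, here is a minimal sketch in Java of one way to reduce a stream of raw call outcomes to a status signal inside the process. It is an illustration only: the names `SignalSketch`, `Reducer`, and `Status`, and the sliding-window rule, are assumptions made for this example and are not part of the Humainary toolkit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: illustrative types only, NOT the Humainary/Substrates API.
final class SignalSketch {

  enum Status { OK, DEGRADED }

  /** Reduces a sequence of call outcomes to status signals, emitted only on change. */
  static final class Reducer {
    private final int window;        // number of recent outcomes considered
    private final int maxFailures;   // failures tolerated within the window
    private final Deque<Boolean> recent = new ArrayDeque<>();
    private final List<Status> signals = new ArrayList<>();
    private int failures;
    private Status status = Status.OK;

    Reducer(int window, int maxFailures) {
      this.window = window;
      this.maxFailures = maxFailures;
    }

    void accept(boolean success) {
      recent.addLast(success);
      if (!success) failures++;
      // Evict the oldest outcome once the window is full; undo its failure count.
      if (recent.size() > window && !recent.removeFirst()) failures--;
      Status next = failures > maxFailures ? Status.DEGRADED : Status.OK;
      if (next != status) {          // emit a signal only when the status changes
        status = next;
        signals.add(next);
      }
    }

    List<Status> signals() { return signals; }
  }

  public static void main(String[] args) {
    Reducer reducer = new Reducer(3, 1);
    boolean[] outcomes = { true, false, false, true, true };
    for (boolean outcome : outcomes) reducer.accept(outcome);
    // Five raw events collapse into two signals.
    System.out.println(reducer.signals()); // prints [DEGRADED, OK]
  }
}
```

Only the two status changes would need to leave the process, not the five underlying events.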

A toolkit conceived like Humainary brings sustainability back into the fold.

Others look to build out big fat pipes; we aim to distribute intelligence and, in doing so, scale naturally – both low and high.

Logging and its sister technology, tracing, were not designed for autonomic computing.

We need to revisit these technologies now that complexity is growing and accelerating beyond human cognitive and network capacities.

Coordination

Coordination within the Humainary toolkit starts with the interaction between the application (or service) and the instrumentation runtime via an event subscription mechanism.

The application (or service) layer can self-introspect on signals and states of various subsystems and components within its process and respond accordingly and adaptively.

Every instrument library within the Humainary toolkit offers an identical event subscription mechanism. Consumption of the collected measurement data can therefore begin at the source and then, when warranted, proceed outwards to coarser-grained compositions, like clusters and hubs, across spatial and temporal dimensions.
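The shape of such a subscription mechanism can be sketched in a few lines of Java. This is a simplified illustration under our own assumptions: `Instrument`, `Event`, and `Subscription` are hypothetical names chosen for the example and do not reproduce the actual Substrates interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical sketch: illustrative types only, NOT the Humainary/Substrates API.
final class SubscriptionSketch {

  /** A measurement event emitted by an instrument. */
  static final class Event {
    final String source;
    final String name;
    final long value;

    Event(String source, String name, long value) {
      this.source = source;
      this.name = name;
      this.value = value;
    }
  }

  /** Handle that lets a subscriber stop receiving events. */
  interface Subscription {
    void cancel();
  }

  /** Minimal in-process instrument that publishes events to local subscribers. */
  static final class Instrument {
    private final String name;
    private final List<Consumer<Event>> subscribers = new CopyOnWriteArrayList<>();

    Instrument(String name) { this.name = name; }

    Subscription subscribe(Consumer<Event> subscriber) {
      subscribers.add(subscriber);
      return () -> subscribers.remove(subscriber);
    }

    void emit(String event, long value) {
      Event e = new Event(name, event, value);
      for (Consumer<Event> subscriber : subscribers) subscriber.accept(e);
    }
  }

  public static void main(String[] args) {
    Instrument queue = new Instrument("queue");
    List<Event> outbound = new ArrayList<>();

    // Consume at the source; forward outward only when warranted.
    Subscription local = queue.subscribe(e -> {
      if (e.value > 100) outbound.add(e);
    });

    queue.emit("depth", 12);    // handled locally, not forwarded
    queue.emit("depth", 250);   // crosses the threshold, forwarded
    local.cancel();
    queue.emit("depth", 900);   // no subscriber left, nothing forwarded

    System.out.println("forwarded outward: " + outbound.size()); // prints forwarded outward: 1
  }
}
```

Because every instrument in this sketch exposes the same `subscribe` call, the same local consumer could be attached to any of them. That uniformity is the property described above.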

The toolkit makes it relatively simple to chain, compose, and coordinate instruments under different contexts and to fuse the resulting models locally, instead of pushing the collected data to the cloud over separate pipelines at incompatible time resolutions.

A measurement from one instrument can cause other instruments to be activated and deactivated dynamically and locally, further reducing noisy data and keeping costs low while ensuring coordination happens at a more appropriate scale and frame of reference.
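A minimal Java sketch of this kind of local activation follows. It assumes a cheap, always-on gauge whose readings switch a more detailed probe on and off. The `Gauge` and `Probe` types and the 200 ms threshold are hypothetical and chosen only for illustration; they are not taken from the toolkit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

// Hypothetical sketch: illustrative types only, NOT the Humainary/Substrates API.
final class ActivationSketch {

  /** Cheap, always-on instrument that notifies local listeners of each reading. */
  static final class Gauge {
    private final List<LongConsumer> listeners = new ArrayList<>();

    void onMeasure(LongConsumer listener) { listeners.add(listener); }

    void measure(long value) {
      for (LongConsumer listener : listeners) listener.accept(value);
    }
  }

  /** Detailed instrument that records samples only while it is active. */
  static final class Probe {
    private final List<Long> samples = new ArrayList<>();
    private boolean active;   // starts inactive, so it collects nothing

    void activate() { active = true; }
    void deactivate() { active = false; }
    void record(long sample) { if (active) samples.add(sample); }
    List<Long> samples() { return samples; }
  }

  public static void main(String[] args) {
    Gauge latency = new Gauge();
    Probe detail = new Probe();

    // The gauge's measurement decides, in-process, whether the probe should run.
    latency.onMeasure(millis -> {
      if (millis > 200) detail.activate(); else detail.deactivate();
    });

    latency.measure(50);   detail.record(1);  // probe inactive, sample dropped
    latency.measure(500);  detail.record(2);  // probe active, sample kept
    latency.measure(60);   detail.record(3);  // probe inactive again, sample dropped

    System.out.println(detail.samples()); // prints [2]
  }
}
```

Detailed data is gathered only while the cheaper measurement indicates it is worth the cost, and that decision never leaves the process.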

With Humainary, intelligence in local situational perception, comprehension, projection, and reaction can be automatic.

Software can then self-regulate at its own operating scale rather than depend on human attention and action, which improves resilience.