When we set out to create a toolkit for the rapid design, development, and deployment of new instruments for measuring the behavior of software services, several essential considerations shaped our thinking. We believe these are crucial to understanding what differentiates our vision and interface design approach from other offerings in the community, in particular OpenTelemetry, which we consider a stopgap measure that rubber-stamps past engineering endeavors that never delivered the much-touted operational improvements.
This two-part series discusses some of the key factors that weighed heavily in our rethinking of observability, and how they manifest in our toolkit, under five headings: conceptualization, communication, coordination, collaboration, and cognition.
A shared model representation of the significant structural and behavioral elements within an environment is essential for effective and efficient communication, coordination, and cooperation. The substrates.io project achieves this by defining a set of concepts, and their contracts, that we have found recurring across the several instrumentation libraries we have designed since 2003.
The list of concepts includes the Conduit. These concepts are today utilized and extended across instrumentation projects.
Existing open-source observability libraries assume that the measurement data collected via their instrumentation interfaces is immediately pushed outwards (when not dropped because of scaling issues) along some data pipeline towards an endpoint in the cloud. The result is the deployment of mindless computing components at the edge, and not only those concerned with observability data collection and transmission.
In contrast, the Humainary toolkit is designed to allow data to be consumed and processed within the managed runtime space. This choice stems from our research background and ongoing interest in developing self-adaptive software service systems.
Considering how much of the observability data collected is noise, it seems bizarre that we continue blindly pumping data of slight, if any, utility along costly pipelines instead of processing such events locally, turning sequences into signals and other higher semantic forms before transmission, if transmission is warranted at all. A toolkit conceived like Humainary brings sustainability back into the fold.
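To make the idea of turning sequences into signals concrete, here is a minimal sketch in Python (not the Humainary API; the class and names are hypothetical). Raw measurements are reduced in-process, and only state transitions, the actual signals, would ever be transmitted:

```python
from collections import deque

class SignalReducer:
    """Collapses a noisy sequence of raw measurements into occasional
    higher-level signals, emitting only on meaningful state changes."""

    def __init__(self, window=5, threshold=100.0):
        self.window = deque(maxlen=window)  # sliding window of recent values
        self.threshold = threshold
        self.state = "NORMAL"
        self.emitted = []  # signals that would actually go over the wire

    def record(self, value):
        self.window.append(value)
        avg = sum(self.window) / len(self.window)
        state = "DEGRADED" if avg > self.threshold else "NORMAL"
        if state != self.state:  # transmit only on a state transition
            self.state = state
            self.emitted.append(state)

reducer = SignalReducer(window=3, threshold=100.0)
for value in [10, 20, 150, 200, 250, 30, 20, 10]:
    reducer.record(value)
# Eight raw measurements collapse into two signals: a degradation
# and a recovery; everything else stays local.
```

The point of the sketch is the ratio: most raw events never leave the process, because locally they carry no new information.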
Others look to build out big, fat pipes; we aim to distribute intelligence and, in doing so, scale naturally, both down and up. Logging, and its sister technology tracing, were not designed for autonomic computing, an avenue we need to revisit in light of the growth and acceleration of complexity beyond human cognitive and network communication capacities.
Coordination within the Humainary toolkit starts with the interaction between the application (or service) and the instrumentation runtime. Via an event subscription mechanism, the application (or service) layer can introspect on the signals and states of the various subsystems and components within its own process and respond accordingly and adaptively.
Every instrument library within the Humainary toolkit offers a near-identical event subscription mechanism, allowing consumption of the collected measurement data to begin at the source and then, when warranted, proceed outwards to coarser-grained compositions, such as clusters and hubs, across spatial and temporal dimensions.
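The shape of such a subscription mechanism can be sketched as follows (again in Python, with hypothetical names, not the toolkit's actual interfaces). The application subscribes to its own instrument at the source, and a coarser-grained rollup is simply another chained subscriber:

```python
class Instrument:
    """Minimal instrument exposing an event subscription mechanism,
    so consumption of measurements begins at the source, in-process."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def emit(self, value):
        # Deliver each measurement to local subscribers first; any
        # outward transmission is just one more subscriber, added
        # only when warranted.
        for callback in self._subscribers:
            callback(self.name, value)

# The application layer introspects on its own signals...
seen = []
latency = Instrument("request.latency")
latency.subscribe(lambda name, value: seen.append((name, value)))

# ...and a coarser-grained composition (say, a cluster-level rollup)
# is chained onto the same instrument as another subscriber.
rollup = {"count": 0, "total": 0.0}
def aggregate(name, value):
    rollup["count"] += 1
    rollup["total"] += value
latency.subscribe(aggregate)

latency.emit(12.5)
latency.emit(40.0)
```

Because every instrument library exposes the same pattern, the same subscriber can be attached uniformly across instruments.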
The toolkit makes it relatively simple to chain, compose, and coordinate instruments under different contexts, and to fuse the resulting models locally, instead of pushing data collection to the cloud over pipelines running at various, and invariably incompatible, time resolutions.
A measurement from one instrument can cause other instruments to be dynamically activated or deactivated locally, further reducing noisy data and keeping costs optimally low, while ensuring that human-machine coordination happens at a more appropriate scale and frame of reference.
With the Humainary toolkit, intelligence in local situational perception, comprehension, projection, and reaction can be near-real-time and automatic. Machines can self-regulate at their own operating scale (computation speed) rather than at that of human attention and action, improving resilience in doing so.
Yes, instruments can and should be more than passive pipes pushing data oblivious to meaning.