Introducing B23 Dataflow Monitoring

Monitoring data flows is a key differentiator for organizations that operate production Artificial Intelligence ("AI") and Machine Learning ("ML") workflows and need accurate, reliable, and responsive outcomes. B23 is a pioneer in AIOps and in monitoring production dataflows at scale. This post highlights three points:

- Background of dataflow monitoring
- Why dataflow monitoring is so important for AIOps
- A detailed look at B23 Dataflow Monitoring for AIOps

Background of Dataflow Monitoring

On behalf of our customers, B23 manages and operates production, enterprise-scale data lakes as centralized stores for diverse types of data. The enterprise data lake serves diverse business purposes, often supporting workloads for AI/ML and business intelligence ("BI").

Data lakes are not static. They are often connected by a complex series of rivers and streams, some flowing into the lake from external sources and some leaving the lake to downstream consumers. We call these rivers and streams "dataflows." There are also dataflows within the data lake, where datasets are transformed or fused to suit customer needs and then written back to the same data lake environment. These internal flows include extract-transform-load ("ETL") pipelines and AI/ML training or inference pipelines.

A common data lake contains data with very different characteristics:

- Structured vs. unstructured
- Different serialization formats (comma-separated values, Parquet, JSON)
- Different naming conventions
- Different delivery cadences (continuous, bursty, periodic, ad hoc)

As data flows in and out of the data lake, it often crosses inter- and intra-organizational boundaries. A variety of legacy monitoring and alerting tools exist for infrastructure, networking, and applications. These tools monitor traditional "operations" data such as system availability, processing latency, network throughput, CPU utilization, and disk utilization.
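To make the delivery-cadence idea concrete, here is a minimal sketch of one kind of dataflow check: inferring a feed's historical cadence from past arrival timestamps and flagging it when the latest delivery is overdue. This is an illustrative example only, not B23's implementation; the function name, input shape, and tolerance factor are all assumptions.

```python
from datetime import datetime

def is_feed_late(arrival_times, now, tolerance=1.5):
    """Flag a dataflow whose newest delivery is overdue relative to its
    historical cadence. `arrival_times` is a sorted list of datetimes for
    past deliveries; `tolerance` is a hypothetical slack factor."""
    if len(arrival_times) < 2:
        return False  # not enough history to infer a cadence
    # The median gap between consecutive deliveries approximates the cadence
    # and is robust to the occasional bursty or delayed drop.
    gaps = sorted(
        (b - a).total_seconds()
        for a, b in zip(arrival_times, arrival_times[1:])
    )
    median_gap = gaps[len(gaps) // 2]
    overdue = (now - arrival_times[-1]).total_seconds()
    return overdue > tolerance * median_gap
```

A continuous feed with an hourly cadence would not be flagged 30 minutes after its last drop, but would be after two silent hours; a periodic or ad hoc feed would simply learn a longer median gap.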
The commercial cloud providers offer a...

Kubectl cleanup plugin: https://github.com/b23llc/kubectl-config-cleanup

At B23, much of our work involves spinning up new Kubernetes clusters on edge computing and storage devices to test our applied machine learning ("AML") workloads. Using our B23 Data Platform ("BDP"), we can quickly configure and connect to a new Kubernetes cluster on an edge device, on-premise, or in any number of public vendor clouds. This lets us iterate and develop our data engineering and AML workloads quickly and at low cost for our customers.

For most people who manage and operate multiple Kubernetes clusters, the number of configurations required to connect to those clusters can become overwhelming. In my case, I have 37 different context entries in my ~/.kube/config. Only 3 of those entries are persistent clusters, in this case for dev, staging, and prod. The rest were for ephemeral clusters that barely lasted a single work day. This workflow has become commonplace with the increased availability of managed Kubernetes services from cloud vendors like GCP, AWS, Azure, and DigitalOcean, to name a few. Commands like gcloud container clusters get-credentials and az aks get-credentials are really convenient for obtaining credentials for a newly launched cluster and connecting right away, but a cluster a day quickly turns into this:
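The core of the cleanup problem can be sketched as follows: given a kubeconfig parsed into a dict (e.g. with yaml.safe_load) and a keep-list of persistent context names, drop every other context along with the cluster and user entries no surviving context still references. This is a simplified illustration under those assumptions, not the plugin's actual implementation, which decides what to prune differently.

```python
def prune_kubeconfig(config, keep_contexts):
    """Return a copy of a parsed kubeconfig dict containing only the
    contexts named in `keep_contexts`, plus the cluster and user entries
    those contexts still reference. The keep-list is a hypothetical input."""
    kept = [c for c in config.get("contexts", []) if c["name"] in keep_contexts]
    # Collect the cluster/user names still referenced by a surviving context,
    # so orphaned credential and cluster entries are dropped as well.
    live_clusters = {c["context"]["cluster"] for c in kept}
    live_users = {c["context"]["user"] for c in kept}
    return {
        **config,
        "contexts": kept,
        "clusters": [c for c in config.get("clusters", [])
                     if c["name"] in live_clusters],
        "users": [u for u in config.get("users", [])
                  if u["name"] in live_users],
    }
```

Pruning clusters and users by reference rather than by name keeps shared credentials intact when two contexts point at the same user entry.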