Introducing B23 Dataflow Monitoring

Monitoring data flows is a key differentiator for organizations that operate production Artificial Intelligence (“AI”) and Machine Learning (“ML”) workflows and need accurate, reliable, and responsive outcomes. B23 is a pioneer in AIOps and in monitoring production dataflows at scale. This blog highlights three points:

- Background of Dataflow Monitoring
- Why Dataflow Monitoring is so important for AIOps
- Detail into B23 Dataflow Monitoring for AIOps

Background of Dataflow Monitoring

On behalf of our customers, B23 manages and operates production, enterprise-scale data lakes as centralized stores for diverse types of data. The enterprise data lake serves diverse business purposes, often supporting workloads for AI/ML and business intelligence (“BI”).

Data lakes are not static: they are often connected by a complex series of rivers and streams, some flowing into the lake from external sources and some leaving the lake to downstream consumers. We call these rivers and streams “dataflows.” There are also dataflows within the data lake, where datasets are transformed or fused to suit customer needs and then written back to the same data lake environment. These internal flows include extract-transform-load (“ETL”) pipelines and AI/ML training or inference pipelines.

A common data lake contains data that can have very different characteristics:

- Structured vs. unstructured
- Different serialization formats (comma-separated values, Parquet, JSON)
- Different naming conventions
- Different delivery cadences (continuous, bursty, periodic, ad hoc)

As data flows in and out of the data lake, it often crosses inter-organizational and intra-organizational boundaries. A variety of legacy monitoring and alerting tools exist for infrastructure, networking, and applications. These tools monitor traditional “operations” data such as system availability, processing latency, network throughput, CPU utilization, and disk utilization.
The commercial cloud providers offer a...
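
To make the contrast with infrastructure metrics concrete, here is a minimal sketch of one possible dataflow-level check: flagging a periodic flow as stale when its latest delivery is overdue. It is an illustration only, not B23's implementation; the names `check_freshness`, `DataflowStatus`, `vendor_feed`, and the `tolerance` parameter are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DataflowStatus:
    name: str
    last_arrival: datetime
    lag: timedelta
    stale: bool


def check_freshness(name, arrivals, expected_interval, now, tolerance=2.0):
    """Flag a periodic dataflow as stale when the most recent delivery is
    older than tolerance * expected_interval (hypothetical rule)."""
    if not arrivals:
        raise ValueError(f"no arrivals recorded for dataflow {name!r}")
    last = max(arrivals)
    lag = now - last
    return DataflowStatus(name, last, lag, lag > expected_interval * tolerance)


# Example: an hourly feed whose last delivery arrived five hours ago.
now = datetime(2020, 1, 1, 12, 0)
arrivals = [now - timedelta(hours=h) for h in (30, 6, 5)]
status = check_freshness("vendor_feed", arrivals, timedelta(hours=1), now)
print(status.name, status.lag, status.stale)
```

A check like this looks at the data's own arrival pattern, so a flow can be reported as unhealthy even when every host, network link, and application reports as available.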

Not All Kubernetes Services Are Equal. We Should Know.

Kubernetes promises the long-sought capability to fully abstract the underlying public cloud, private cloud, and edge infrastructure from the software applications, or workloads, that run on it. For B23, the value of Kubernetes is that the data engineering and applied machine learning workloads we have developed and operated over years of experience can be deployed in almost any environment that runs Kubernetes.

B23 supports and operates a variety of Kubernetes solutions, including “pure” Kubernetes that we deploy to any arbitrary set of supported server hosts. We also support public cloud managed Kubernetes services from Google, Amazon, Microsoft, and DigitalOcean. For private cloud Kubernetes solutions, we support integration with an existing Kubernetes system. Most recently, we support Rancher’s K3s for edge computing solutions (more on that exciting news in a later blog).

We’ve done Kubernetes the “hard way” from scratch, and we’ve done it the “easy way” using cloud managed Kubernetes, or at least we thought managed Kubernetes would be easy. In some cases, the “easy way” was just not so easy. That’s why the “conceptual value” of Kubernetes differs from its “actual value,” and how much depends heavily on your cloud service provider. Here are some of the high-level differences we have found in pursuit of our ultimate goal of infrastructure-agnostic workloads using Kubernetes.
They fall into the following categories:

- Default security features and versions vary by Kubernetes service provider
- Built-in support for Kubernetes auto-scaling capabilities is nonexistent or limited across service providers
- Some service providers require proprietary or provider-specific functionality, leading to vendor lock-in
- The workflow and lifecycle management of Kubernetes and hosted workloads vary in capability and complexity
- The SDK ecosystem for programmatically operating managed Kubernetes solutions varies greatly in maturity
- The...

B23 Highlighted at Jefferies’s BattleFin Conference

Having just returned from yet another extremely productive Jefferies’s BattleFin conference in Miami, our team was reflecting on several themes we observed in the financial services and hedge fund artificial intelligence (“AI”) market. With 470 attendees, the Jefferies BattleFin conference is the premier event to observe, hear, and meet with experts in alternative data and AI for hedge funds. The conference continues to grow in size and in the diversity of data providers and technology services relevant to the technology-driven investor. We were excited to participate in a panel discussion again this year on a topic in which we have a high degree of conviction and experience, and it was extremely productive to meet with so many new and familiar industry experts.

The themes from this year’s event include:

- Increased acceptance of Data-Engineering-as-a-Service using qualified third-party technology partners like B23
- Machine Learning at scale is becoming more tightly coupled to Public Cloud infrastructure
- More pragmatism around the amount of alpha that alternative data can provide by itself
- Challenges to adopting new, innovative ideas with so much turnover and cross-pollination

Data-Engineering-as-a-Service for Hedge Funds

An emerging theme that was very prevalent this year was the acceptance of outsourced data engineering, or Data-Engineering-as-a-Service. Many of the institutions we spoke with are quickly aligning themselves to this trend, which is consistent with what we observe in non-financial services verticals.

It was obvious that more and more institutions are pursuing a strategy of outsourcing the “undifferentiated heavy lifting” of data engineering so that they can focus on higher-order outcomes in quantitative investment analysis.
Funds are increasingly passing on building cloud-based data lakes themselves, or developing durable and performant extract, transform, and load (“ETL”) applications hosted on...

B23 Becomes Certified Google Cloud Partner

B23 is announcing today that it has successfully achieved multiple tiers of membership within Google Cloud Partner Platform. This partnership achievement is a result of meeting many technology and business criteria, including integration of our B23 Data Platform (“BDP”) with the Google Cloud Platform (“GCP”), specifically Google Kubernetes Engine (“GKE”). B23 has also obtained a multitude of Google technical certifications for our engineering and development team, including the Google Professional Cloud Architect and Google Professional Data Engineer certifications. These technology capabilities and certifications will further enhance B23’s data engineering and applied machine learning service capabilities when using GCP.

B23 is now a certified and approved GCP partner in the following areas:

Google Cloud Platform Services: This focus area allows B23 customers to leverage the collective intelligence of Google engineers by aligning our B23 Data Platform features to Google’s innovation roadmap, like cutting-edge machine learning and Google’s fast, secure infrastructure, which scales to fit our customers’ needs.

Google Cloud Partner Services: This focus area allows B23 customers additional access to Google’s expanded offering of innovative cloud-first solutions, including data analytics, machine learning, cloud migration, IoT, productivity apps, and more.

Google Cloud Trusted Reseller: This focus area will allow B23 to resell GCP services to our customers if required. B23 was fully vetted by Google as a trusted...

B23 Announces B23 Data Platform integration with the Google Kubernetes Engine (“GKE”)

Today we are excited to announce our B23 Data Platform integration with the Google Kubernetes Engine (“GKE”).

Kubernetes is an exciting technology that helps customers orchestrate complex containerized workloads using templatized configurations. In many ways, Kubernetes is an extension of the same automation and orchestration concepts we started developing with cloud-based virtual machines five years ago when we introduced the B23 Data Platform.

That’s why it made perfect sense to enhance our existing data platform offerings with multiple cloud-vendor Kubernetes services, which will extend our data engineering and applied machine learning workloads to even more environments.

B23 has been “productionizing” the difficult and non-differentiated data engineering activities for Fortune 20 companies, financial services firms, large cybersecurity companies, leading telecommunications providers, and many other firms for several years.

This integration will make it easy to serve more customers who prefer Google Cloud Platform (“GCP”) as they ask B23 to build, manage, and operate complex data pipelines.

This video shows a brief overview of how we have simplified the process of extending customer machine learning workloads onto GKE.

https://www.youtube.com/watch?v=cwiW0JEe8Lc&t

B23 provides managed data engineering and applied machine learning services for its customers so they can focus on extracting the business value of data, not on commoditized engineering. Building and operating durable data analysis infrastructure, and running algorithms at scale on a 24/7 basis, are challenges that most modern organizations face today. By partnering with B23, our customers’ business analysts, data scientists, and machine learning engineers are free to focus on their core competency: performing the data analysis that will be most impactful to their business.

The B23 Data Platform supports a variety of data-processing and analysis-centric software.
We support both open source software, as well...