Next Generation Cybersecurity Analytics – Part III, Why a Next Generation OpenSOC is Required
This is the final post in the three-part blog series Next Generation Security Analytics. You can find Part I and Part II in the B23 Blog section as well.
The initial release of OpenSOC was a pioneering event: it was the first open source, domain-specific, solution-oriented project that demonstrated the use of distributed processing applications like Hadoop, Kafka, Storm, and Elasticsearch (that was a mouthful!). For most people, even in the Big Data community, these solutions existed only inside commercial proprietary software tools, or within a few elite, technically adept companies like Yahoo!, Netflix, Amazon, Twitter, and LinkedIn. For a lot of people, OpenSOC helped rationalize the real-world use of Hadoop outside of word count! Credit should go to the original founders and to Cisco for sponsoring such an ambitious project and helping a large audience further understand the power of distributed processing systems.
In recent months, though, the official sponsorship of OpenSOC has ground to a halt. Viewing the commit history of the project itself, we and our customers found it apparent that the commitment to OpenSOC as an open source solution was not where it needed to be. Our pull requests were going unheeded, and as of the date of this post, the last accepted commit to opensoc-streaming, the big data processing component, was on April 4, 2015, almost 5 months ago.
OpenSOC UI Commit History
OpenSOC Streaming Commit History
We believe OpenSOC offers tremendous opportunity, and that there currently exist two major areas of improvement required to realize it.
The first is that the OpenSOC initiative requires transparency. The second is that OpenSOC needs a technical facelift to bring it to 2015 and beyond.
In the first case, transparency is necessary for organizations that wish to embrace OpenSOC in more than a hobby-shop manner. There is no release schedule, no roadmap, no technical discussion about incorporating new features, and seemingly no one at the helm to accept pull requests from the community. The primary mechanism for communication is the Support Forums. Our enterprise customers tell us there is a great risk that any work they do to improve OpenSOC may not align with future releases (if a future release ever happens, given the recent organizational changes).
In the second case, OpenSOC is in dire need of several architectural-level changes to bring it up to speed with capabilities present in 2015. The Big Data ecosystem is a fast-moving, always-iterating landscape, and keeping up with those changes can be a full-time job.
Architecturally, OpenSOC has remained roughly the same since its first prototype in September 2013, about two years ago. A lot has changed in two years. Most notably, we believe that Apache Spark is changing the way data is processed in a distributed manner. With HDFS already built in to OpenSOC as a data landing zone, Spark has a huge potential role to play in a next generation OpenSOC, including its ability to apply Machine Learning (“ML”) algorithms to network data, the ability to quickly develop applications in Python, Scala, SQL, or R to interrogate data, and possibly even the ability to process streaming data in near-real time. Spark is a figurative “no-brainer.”
There are other, less obvious changes that should occur, including replacing the current OpenSOC-UI component. It is a custom AngularJS application running on NodeJS, with more than 200 external dependencies, built on top of an old version of Kibana. This tightly coupled solution does not deploy well in its current incarnation (particularly on a private network without internet access!), likely will not upgrade well even if an upgrade is proposed, and is definitely not pluggable if and when a better UI solution emerges (such as Kibana 4). Operationally speaking, OpenSOC’s current user interface is primitive and not worthy of such an ambitious project.
Finally, Cisco’s own published collateral on OpenSOC, dated August 2014, describes “What OpenSOC is not…” and one bullet point is “…easy to install and get working quickly.” B23 has figured out how to get OpenSOC deployed quickly and effectively. We are working on even better deployment mechanisms that can help eliminate deployment expenses and get data scientists working on OpenSOC within a few days of installation.
We look forward to working with a community of developers who are as passionate about OpenSOC as we are. We are very interested in your feedback on this series of posts and your view on where OpenSOC needs to go to be relevant in the future.