Introducing B23r — An R package for Importing and Exporting Entire R Environments backed by Amazon’s Simple Storage Service (S3)

January 18th, 2017

The B23 Data Platform is free to use when launching R services on Amazon Web Services (“AWS”).

R is an analytics technology that has been slow to adapt to the Cloud. Other analytical solutions, such as Spark and Hadoop, have evolved over the past several years to include capabilities designed specifically for running those platforms in the Cloud. B23 has been enabling analytical and distributed processing applications in the Cloud for many years, and we are happy to announce a new, efficient capability for persisting R environments in the Amazon Cloud. It helps organizations keep cloud computing costs low while making it easier for R users to collaborate.

Adapting R for the Ephemeral Nature of Cloud Computing

As data gets bigger and computation more complex, freeing R from the constraints of running locally on a laptop is often a critical concern. The Cloud is the optimal computing platform for R for a variety of reasons, and our B23 Data Platform was designed to run R optimally in the Amazon Cloud. The high-level benefits of running R in the Cloud include:

- efficient processing of data that is already stored in the Cloud,
- the ability to choose the computational resources that best suit the requirements of an analysis (as opposed to a laptop with fixed computational resources),
- the ability to terminate compute and storage resources, and therefore stop incurring costs, when they are no longer needed, and
- a security framework for securely ingesting data directly into an R environment.

Unfortunately, once you start working in the Cloud, you will face challenges to working with R that aren’t well defined. In the four (4) easy steps below, we describe how to use the B23r package to help bridge these gaps.

Challenge

One of the biggest and most basic issues we’ve heard from users is how to save and restore your work when taking advantage of the ephemeral nature of the Cloud.
In 2016, Apache Zeppelin released a capability that allowed data science notebooks to persist...