Distributed R began in 2011 as a research project at HP Labs, started by Indrajit Roy, Shivaram Venkataraman, Alvin AuYoung, and Robert S. Schreiber.[3] It was open-sourced in 2014 under the GPLv2 license and is available on GitHub.
In February 2015, Distributed R reached its first stable release, version 1.0, which was accompanied by enterprise support from HP.[4]
Distributed R is a platform for implementing and executing distributed applications in R. Its goal is to extend R for distributed computing while retaining R's simplicity and look and feel. Distributed R consists of the following components:
HP Vertica provides tight integration between its database and the open-source Distributed R platform. HP Vertica 7.1 includes features that enable fast, parallel loading from the Vertica database into Distributed R. This parallel Vertica loader can be more than five times faster than traditional ODBC-based connectors. The Vertica database also supports in-database deployment of machine learning models: Distributed R users can call the distributed algorithms to create models, deploy them in the Vertica database, and use them for in-database scoring and prediction. Architectural details of the integration between the Vertica database and Distributed R are described in the SIGMOD 2015 paper.[5]
1. Venkataraman, Shivaram; Bodzsar, Erik; Roy, Indrajit; AuYoung, Alvin; Schreiber, Robert S. (2013). "Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices" (PDF). European Conference on Computer Systems (EuroSys). Archived from the original (PDF) on 2015-03-01. https://web.archive.org/web/20150301102733/http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Venkataraman.pdf
2. Gagliordi, Natalie. "HP adds scale to open-source R in latest big data platform". ZDNet. Retrieved 17 February 2015. https://www.zdnet.com/article/hp-adds-scale-to-open-source-r-in-latest-big-data-platform/
3. Venkataraman, Shivaram; Roy, Indrajit; AuYoung, Alvin; Schreiber, Robert S. (2012). "Using R for Iterative and Incremental Processing". Workshop on Hot Topics in Cloud Computing (HotCloud).
4. "HP Delivers Predictive Analytics at Big Data Scale". hp.com. 17 February 2015. Retrieved 17 February 2015. http://www8.hp.com/us/en/hp-news/press-release.html?id=1912830&pageTitle=HP-Delivers-Predictive-Analytics-at-Big-Data-Scale
5. Prasad, Shreya; Fard, Arash; Gupta, Vishrut; Martinez, Jorge; LeFevre, Jeff; Xu, Vincent; Hsu, Meichun; Roy, Indrajit (2015). "Enabling predictive analytics in Vertica: Fast data transfer, distributed model creation and in-database prediction". ACM SIGMOD International Conference on Management of Data.