Abstract
The R computing environment has become an important part of the statistical community and has fostered the development of over a thousand add-on packages, many representing state-of-the-art research in statistical methodology. Although it is relatively easy to develop functionality on top of the system, it is very difficult for developers to directly extend the core system itself: the language, the interpreter and the internal data structures. Yet the ability to easily introduce new core, first-class data structures that are customized and efficient is becoming essential in this era of large, complex data sets and innovative algorithms and data structures. While the community that might use such a facility to introduce new data types may be small, it is potentially very talented and important, and its work may lead to significant innovations that allow us to continue to leverage R in rich new ways for the next 5 years or more. I describe some of the difficulties that people encounter in extending the system and suggest that an object-oriented architecture for the internal implementation of R (or any system) would make such low-level internals extensible by package developers and not just the core development team. This would promote potentially rich experimentation that would allow us and others to approach new styles of computation in R, while preserving the existing community that adds so much value to the R environment. Specifically, transforming the R implementation from a representation-specific architecture to a C++ abstract/virtual interface-based architecture may be the least disruptive approach to the continued evolution of R; it would bring many advantages along with some technical challenges. Such an approach involves many technical details and a potential degradation in performance. Given the length of this paper, I do not explore these issues in great detail but introduce the basic concepts.
I do, however, refer to some technical aspects that are best understood with some knowledge of the implementation of R at the level of using the .Call() interface.
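To make the contrast between a representation-specific and an interface-based architecture concrete, here is a minimal C++ sketch. It is not R's internal API: the names NumericVector, DenseVector, SequenceVector and sum are hypothetical and chosen only for illustration. The sketch assumes that internal routines are written against an abstract class with virtual methods, so that a package could supply a new representation without changing those routines.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical abstract interface: code that consumes a numeric vector
// depends only on these virtual methods, never on the storage layout.
class NumericVector {
public:
    virtual ~NumericVector() = default;
    virtual std::size_t length() const = 0;
    virtual double get(std::size_t i) const = 0;
};

// Conventional representation: a contiguous block of doubles in memory.
class DenseVector : public NumericVector {
public:
    explicit DenseVector(std::vector<double> values)
        : values_(std::move(values)) {}
    std::size_t length() const override { return values_.size(); }
    double get(std::size_t i) const override { return values_[i]; }
private:
    std::vector<double> values_;
};

// Alternative representation that an extension might supply: an arithmetic
// sequence whose elements are computed on demand and never stored.
class SequenceVector : public NumericVector {
public:
    SequenceVector(double from, double by, std::size_t n)
        : from_(from), by_(by), n_(n) {}
    std::size_t length() const override { return n_; }
    double get(std::size_t i) const override {
        return from_ + by_ * static_cast<double>(i);
    }
private:
    double from_;
    double by_;
    std::size_t n_;
};

// An "internal" routine written once against the interface; it works
// unchanged for any representation that implements NumericVector.
double sum(const NumericVector& x) {
    double total = 0.0;
    for (std::size_t i = 0; i < x.length(); ++i) {
        total += x.get(i);  // one virtual call per element
    }
    return total;
}
```

In this sketch, sum accepts a DenseVector and a SequenceVector alike, and the latter represents a million-element sequence with three stored values. The virtual call per element in the loop is one plausible source of the performance degradation mentioned above.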
Cite this article
Temple Lang, D. A modest proposal: an approach to making the internal R system extensible. Comput Stat 24, 271–281 (2009). https://doi.org/10.1007/s00180-008-0127-7