Unit testing in R

gary

14 years ago

Commentary

R is a statistical programming language, with a strong focus on mathematical operations. When writing code that is math-heavy, unit testing becomes very appealing: while equations may look correct on paper, one minor error can ruin the output.

R programming also differs from CRUD or enterprise software in that R's in-memory data structures are often used much like a database. Used this way, the code is more self-contained, and integration tests that need specific data set up can be easier to manage than tests that depend on pre-configured data in a database (a requirement, for instance, when testing some parts of ETL scripts).

R has a unit testing package called RUnit, based on the JUnit 3.x APIs, which even includes code coverage. As in JUnit, test functions start with the word “test”, and there are setup/teardown functions (.setUp and .tearDown).
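As a rough sketch of what such a test file might look like (the file name, the `readings` fixture and the `celsius_to_fahrenheit` function are invented for illustration and are not part of the chords example):

```r
# tests/temperature.r (hypothetical test file)
library(RUnit)

# Hypothetical function under test.
celsius_to_fahrenheit <- function(celsius) celsius * 9 / 5 + 32

# In-memory fixture; it is rebuilt before every test.
readings <- NULL

# RUnit calls .setUp before each test function and .tearDown after it.
.setUp <- function() {
  readings <<- data.frame(station = c("a", "b"), celsius = c(0, 100))
}

.tearDown <- function() {
  readings <<- NULL
}

# Function names starting with "test" match the default "^test.+" pattern.
test.conversion <- function() {
  checkEqualsNumeric(c(32, 212), celsius_to_fahrenheit(readings$celsius))
}

test.fixture_has_two_rows <- function() {
  checkEquals(2, nrow(readings))
}
```

Saved in the directory the suite points at, a file like this should be picked up when the suite runs; it can also be run on its own with runTestFile.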

Unlike JUnit and NUnit, there is no IDE integration, nor is there a specialized tool: tests are simply run through the R REPL, which gives you some control at the expense of convenience. Unfortunately, there are no one-click installs with CI servers like Jenkins; if you wish to run tests automatically and track the results over time, you have to work out some command-line integration yourself.
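One possible shape for that command-line integration is a small script that runs the suite and turns the counts into a process exit status, which most CI servers treat as pass/fail. This is only a sketch: the script name, the `exit_status` and `run_and_report` helpers, and the convention of passing the test directory as an argument are all assumptions.

```r
# run_tests.R (hypothetical), invoked as: Rscript run_tests.R tests

# Map RUnit's error and failure counts to an exit status (0 = success).
exit_status <- function(n_errors, n_failures) {
  if (n_errors + n_failures > 0) 1L else 0L
}

# Run every test file in `dirs`, print the text report, return the status.
run_and_report <- function(dirs) {
  suite <- RUnit::defineTestSuite("example", dirs = dirs,
                                  testFileRegexp = "^.+\\.r$",
                                  testFuncRegexp = "^test.+")
  result <- RUnit::runTestSuite(suite)
  RUnit::printTextProtocol(result)
  counts <- RUnit::getErrors(result)  # list with nErr, nFail, nTestFunc, ...
  exit_status(counts$nErr, counts$nFail)
}

# Only run when a test directory is given on the command line.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 1) {
  quit(status = run_and_report(args[1]))
}
```

Tracking results over time would still need something on top, for example archiving the printed report from each build.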

Unit testing is not designed for fuzzy operations that have natural failures; for a machine learning exercise you would ideally track accuracy, recall, precision, and so on. This package may be a useful starting point for that, but it would need custom development on top to be really valuable for these types of problems.
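To make the idea concrete, here is a minimal sketch of computing those metrics in base R and wrapping a threshold check in an RUnit-style test. The labels, the `classification_metrics` helper and the 0.7 threshold are made up for illustration.

```r
# Hypothetical labels: 1 = positive class, 0 = negative class.
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

# Accuracy, precision and recall from the confusion-matrix counts.
classification_metrics <- function(actual, predicted) {
  tp <- sum(predicted == 1 & actual == 1)  # true positives
  fp <- sum(predicted == 1 & actual == 0)  # false positives
  fn <- sum(predicted == 0 & actual == 1)  # false negatives
  tn <- sum(predicted == 0 & actual == 0)  # true negatives
  c(accuracy  = (tp + tn) / length(actual),
    precision = tp / (tp + fp),
    recall    = tp / (tp + fn))
}

metrics <- classification_metrics(actual, predicted)

# A test can only say "good enough or not"; the threshold is arbitrary here.
test.model_quality <- function() {
  RUnit::checkTrue(all(metrics >= 0.7))
}
```

A pass/fail check like this discards the actual numbers, which is why tracking them over time needs custom work beyond what the package gives you.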

Implementation

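The following code defines and runs a test suite:

library(RUnit)
source('chords.r')
test.suite <- defineTestSuite("example", dirs = "tests", testFileRegexp = "^.+\\.r$", testFuncRegexp = "^test.+")
test.result <- runTestSuite(test.suite)

Printing test.result gives a summary like this: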

Number of test functions: 168
Number of errors: 2
Number of failures: 166

By default R does not print stack traces when there is an error, so the following may be helpful:

options(error=function() traceback(10))

RUnit provides a series of "check" functions for asserting on test results:

checkTrue(x > 0)
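A few of the other check functions, shown against a small made-up function (`mean_of` is hypothetical and only there to give the checks something to test):

```r
library(RUnit)

# Hypothetical function under test: the arithmetic mean.
mean_of <- function(x) {
  if (length(x) == 0) stop("x must not be empty")
  sum(x) / length(x)
}

test.mean_of <- function() {
  # Passes when the expression evaluates to TRUE.
  checkTrue(mean_of(c(1, 2, 3)) > 0)

  # Compares target and current values using all.equal semantics.
  checkEquals(2, mean_of(c(1, 2, 3)))

  # Numeric comparison with an explicit tolerance, for floating point results.
  checkEqualsNumeric(0.3, mean_of(c(0.1, 0.2, 0.6)), tolerance = 1e-8)

  # Passes only if evaluating the expression raises an error.
  checkException(mean_of(numeric(0)), silent = TRUE)
}
```

Each check stops the test function at the first problem, so the report shows one failure or error per test function.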

The results can be printed as text or HTML, or you can inspect the R object directly to build custom output.

printTextProtocol(test.result)

str(test.result)
List of 1
$ example:List of 8
  ..$ nTestFunc        : num 168
  ..$ nDeactivated     : int 0
  ..$ nErr             : num 2
  ..$ nFail            : num 166
  ..$ dirs             : chr "tests"
  ..$ testFileRegexp   : chr "^.+\\.r$"
  ..$ testFuncRegexp   : chr "^test.+"
  ..$ sourceFileResults:List of 1
  .. ..$ tests/wiki-chords.r:List of 168
  .. .. ..$ testA       :List of 4
  .. .. .. ..$ kind     : chr "failure"
  .. .. .. ..$ msg      : chr "Error in checkTrue(cmp) : Test not TRUE\n\n"
  .. .. .. ..$ checkNum : num 1
  .. .. .. ..$ traceBack: NULL
  .. .. ..$ testA7      :List of 4
  .. .. .. ..$ kind     : chr "failure"
  .. .. .. ..$ msg      : chr "Error in checkTrue(cmp) : Test not TRUE\n\n"
  .. .. .. ..$ checkNum : num 1
  .. .. .. ..$ traceBack: NULL
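For custom output, the nested list can be walked directly. The sketch below flattens the failures into a data frame; to stay self-contained it uses a hand-built stand-in that copies only the relevant part of the str() output above, and `failed_tests` is a hypothetical helper, not part of RUnit.

```r
# Stand-in for the object returned by runTestSuite(), reduced to two tests.
test.result <- list(
  example = list(
    sourceFileResults = list(
      "tests/wiki-chords.r" = list(
        testA  = list(kind = "failure",
                      msg = "Error in checkTrue(cmp) : Test not TRUE\n\n",
                      checkNum = 1, traceBack = NULL),
        testA7 = list(kind = "failure",
                      msg = "Error in checkTrue(cmp) : Test not TRUE\n\n",
                      checkNum = 1, traceBack = NULL)
      )
    )
  )
)

# Collect every test that did not succeed, one row per test function.
failed_tests <- function(result) {
  rows <- list()
  for (suite_name in names(result)) {
    files <- result[[suite_name]]$sourceFileResults
    for (file_name in names(files)) {
      for (test_name in names(files[[file_name]])) {
        outcome <- files[[file_name]][[test_name]]
        if (outcome$kind != "success") {
          rows[[length(rows) + 1]] <- data.frame(
            suite = suite_name, file = file_name, test = test_name,
            kind = outcome$kind, message = trimws(outcome$msg),
            stringsAsFactors = FALSE)
        }
      }
    }
  }
  do.call(rbind, rows)  # NULL when nothing failed
}

failures <- failed_tests(test.result)
```

From there the data frame can be written to CSV, filtered by file, or compared between runs.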

The full test suite for this example is on GitHub, as it's a bit long for a blog post.