SlideShare uma empresa Scribd logo
1 de 18
Baixar para ler offline
TestR
Generating unit tests for R internals
Purdue University
https://github.com/allr/testR
1
Roman Tsegelskyi, Jan Vitek
Motivation
R1 > source('~/GNU-Rs/R1/tests/arith-true.R')
.
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
Time elapsed: 0.428 0 0.426 0 0
Warning messages:
1: In log(-1) : NaNs produced
2: In gamma(0:-47) : NaNs produced
3: In digamma(x) : NaNs produced
4: In psigamma(x, 0) : NaNs produced
Motivation
• Ensuring correctness of builtin functions written in C
(More than 600)
• Automating this by generating test cases
• Generalize it to testing any R function
3
TestR
4
foo <- function (x) {
x * 2;
}
!
R1 > foo(2)
[4]
TestR
• A test is a call to a test function with arguments to
handle errors, warnings, etc.
test(id=0, code={
foo <- function (x) {
x * 2
}
foo(2)
}, o=4);
• Handles not only unit tests but also more complex test types
TestR (continued)
• Test cases can be generated from a template..
test(id=18,
1 + c(1, 2),
name = "foo[a=1,b=c(1,2),

c = "+"]",
o = c(2, 3)
)
test(name = "foo",
g(a, 1, 2, 3, 4),
g(b, c(1,2), c(2,3), 

c(3,4)),
g(c, "+","-"),
code={a %c% b}
)
6
Examples
expected <- eval(parse(text="TRUE"));
test(id=0, code={
argv <- eval(parse(text="list(c(-0.9, 1.0))"))
do.call(`is.atomic`, argv)
}, o=expected);
!
!
!
expected <- eval(parse(text="1+0i"));
test(id=0, code={
argv <- eval(parse(text="list(1, 0+0i)"));
do.call(`+`, argv)
}, o=expected);
TestGeninstrumented
GNU-R
R
GNU-R 

with Gcov
Capture 

files
test 

cases
bad TC log
+Test DB test
if newCov > oldCov add to DB
exec
gen
gen
foreach
new
Cov
Testcase Filter
Instrumented GNU-R
# identical
func: identical
type: I
args: list("closure", "S4", TRUE,

TRUE, TRUE, TRUE, FALSE)
retn: FALSE
#is.na
func: is.na
type: P
args: list(NA_integer_)
retn: TRUE
Instrumented GNU-R
func: function_name
!
type: P | I
!
args: list(s1, s2, … , sn)
| <arguments too long, ignored>
!
retv: string
| <return value too long, ignored>
| <error>
Dependent calls to builtins
foo <- function(){
file.create(‘file.1’)
file.create(‘file.2’)
file.append(‘file.1’, file.2’)
}
foo <- function(){
Tfile <- file("test1", “w+")
cat("abcndefn", file = Tfile)
readLines(Tfile)
}
expected <- eval(parse(text="NULL"));
test(id=0, code={
writeLines<- function (text, con = stdout(), sep = "n",
useBytes = FALSE)
{
if (is.character(con)) {
con <- file(con, "w")
on.exit(close(con))
}
.Internal(writeLines(text, con, sep, useBytes))
}
!
argv <- eval(parse(text="list(c("[476] "1986-02-12"
”1986-02-13”),”file.1”);
do.call(`writeLines`, argv);
}, o=expected);
Instrumented GNU-R
13
func: function_name
body: closure_code
args: list(string1, string2, ..., stringN) |
<arguments too long, ignored>
retv: string | 

<return value too long, ignored> | 

<error>
TestGen
• Process the capture file, generate all valid tests, log
invalid tests
• Run each test on trusted VM and validate the return
value
• Generates TestR output
14
Filtering
• Tests only added to Database if coverage increase
• For builtins only measure coverage of src/main, but can
be done for any folder in general
• Use gcov to measure coverage of C code

(nothing for R coverage yet)
Experimental results
• GNU R test suite gives 73% coverage in src/main
• Capturing builtin calls gave 45% coverage.
• Test suite has 3803 test cases out of 37M candidates.
• Capturing closures that contain primitive calls gives 58%
coverage and adds 892 tests
Errors in RVMs
• CXXR (C++ R) (University of Kent)
• 8 failed test cases compared to R-2.15.1
• 263 failed test cases compared to R-3.0.1
• Renjin (R on JVM)
• 621 failed test cases compared to R-3.0.1
• 12 NULL pointer exceptions and 15 class cast
exceptions
Conclusions
• An infrastructure for automatically generating test cases
from legacy R code
• Generate test suite covers 80% of GNU R test suite
covers, while shrinking size to 4695 tests
• Infrastructure finds bugs in R VM implementations
• Infrastructure can be used for creating test cases for any
functions in R packages

Mais conteúdo relacionado

Mais procurados

Akka.NET streams and reactive streams
Akka.NET streams and reactive streamsAkka.NET streams and reactive streams
Akka.NET streams and reactive streamsBartosz Sypytkowski
 
All you need to know about the JavaScript event loop
All you need to know about the JavaScript event loopAll you need to know about the JavaScript event loop
All you need to know about the JavaScript event loopSaša Tatar
 
Protocol handler in Gecko
Protocol handler in GeckoProtocol handler in Gecko
Protocol handler in GeckoChih-Hsuan Kuo
 
Go Profiling - John Graham-Cumming
Go Profiling - John Graham-Cumming Go Profiling - John Graham-Cumming
Go Profiling - John Graham-Cumming Cloudflare
 
Gevent what's the point
Gevent what's the pointGevent what's the point
Gevent what's the pointseanmcq
 
Python Unit Test
Python Unit TestPython Unit Test
Python Unit TestDavid Xie
 
Ricon/West 2013: Adventures with Riak Pipe
Ricon/West 2013: Adventures with Riak PipeRicon/West 2013: Adventures with Riak Pipe
Ricon/West 2013: Adventures with Riak PipeSusan Potter
 
Counter Wars (JEEConf 2016)
Counter Wars (JEEConf 2016)Counter Wars (JEEConf 2016)
Counter Wars (JEEConf 2016)Alexey Fyodorov
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Alexey Lesovsky
 
A CTF Hackers Toolbox
A CTF Hackers ToolboxA CTF Hackers Toolbox
A CTF Hackers ToolboxStefan
 
Performance is a feature! - Developer South Coast - part 2
Performance is a feature!  - Developer South Coast - part 2Performance is a feature!  - Developer South Coast - part 2
Performance is a feature! - Developer South Coast - part 2Matt Warren
 
RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]Igor Lozynskyi
 
Dynamo: Not Just For Datastores
Dynamo: Not Just For DatastoresDynamo: Not Just For Datastores
Dynamo: Not Just For DatastoresSusan Potter
 
TDOH x 台科 pwn課程
TDOH x 台科 pwn課程TDOH x 台科 pwn課程
TDOH x 台科 pwn課程Weber Tsai
 
Specializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackSpecializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackKernel TLV
 
LLVM Register Allocation (2nd Version)
LLVM Register Allocation (2nd Version)LLVM Register Allocation (2nd Version)
LLVM Register Allocation (2nd Version)Wang Hsiangkai
 

Mais procurados (20)

Akka.NET streams and reactive streams
Akka.NET streams and reactive streamsAkka.NET streams and reactive streams
Akka.NET streams and reactive streams
 
All you need to know about the JavaScript event loop
All you need to know about the JavaScript event loopAll you need to know about the JavaScript event loop
All you need to know about the JavaScript event loop
 
Protocol handler in Gecko
Protocol handler in GeckoProtocol handler in Gecko
Protocol handler in Gecko
 
Rust
RustRust
Rust
 
Go Profiling - John Graham-Cumming
Go Profiling - John Graham-Cumming Go Profiling - John Graham-Cumming
Go Profiling - John Graham-Cumming
 
Gevent what's the point
Gevent what's the pointGevent what's the point
Gevent what's the point
 
Python Unit Test
Python Unit TestPython Unit Test
Python Unit Test
 
Go memory
Go memoryGo memory
Go memory
 
Ricon/West 2013: Adventures with Riak Pipe
Ricon/West 2013: Adventures with Riak PipeRicon/West 2013: Adventures with Riak Pipe
Ricon/West 2013: Adventures with Riak Pipe
 
Counter Wars (JEEConf 2016)
Counter Wars (JEEConf 2016)Counter Wars (JEEConf 2016)
Counter Wars (JEEConf 2016)
 
Zone IDA Proc
Zone IDA ProcZone IDA Proc
Zone IDA Proc
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
 
A CTF Hackers Toolbox
A CTF Hackers ToolboxA CTF Hackers Toolbox
A CTF Hackers Toolbox
 
Performance is a feature! - Developer South Coast - part 2
Performance is a feature!  - Developer South Coast - part 2Performance is a feature!  - Developer South Coast - part 2
Performance is a feature! - Developer South Coast - part 2
 
RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]
 
Dynamo: Not Just For Datastores
Dynamo: Not Just For DatastoresDynamo: Not Just For Datastores
Dynamo: Not Just For Datastores
 
TDOH x 台科 pwn課程
TDOH x 台科 pwn課程TDOH x 台科 pwn課程
TDOH x 台科 pwn課程
 
Specializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackSpecializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network Stack
 
kii
kiikii
kii
 
LLVM Register Allocation (2nd Version)
LLVM Register Allocation (2nd Version)LLVM Register Allocation (2nd Version)
LLVM Register Allocation (2nd Version)
 

Semelhante a TestR: generating unit tests for R internals

Testing in Python: doctest and unittest (Updated)
Testing in Python: doctest and unittest (Updated)Testing in Python: doctest and unittest (Updated)
Testing in Python: doctest and unittest (Updated)Fariz Darari
 
(2) c sharp introduction_basics_part_i
(2) c sharp introduction_basics_part_i(2) c sharp introduction_basics_part_i
(2) c sharp introduction_basics_part_iNico Ludwig
 
Testing in Python: doctest and unittest
Testing in Python: doctest and unittestTesting in Python: doctest and unittest
Testing in Python: doctest and unittestFariz Darari
 
Tackling repetitive tasks with serial or parallel programming in R
Tackling repetitive tasks with serial or parallel programming in RTackling repetitive tasks with serial or parallel programming in R
Tackling repetitive tasks with serial or parallel programming in RLun-Hsien Chang
 
Unit testing en iOS @ MobileCon Galicia
Unit testing en iOS @ MobileCon GaliciaUnit testing en iOS @ MobileCon Galicia
Unit testing en iOS @ MobileCon GaliciaRobot Media
 
Save Write a program to implement Binary search using recursive algo.pdf
Save Write a program to implement Binary search using recursive algo.pdfSave Write a program to implement Binary search using recursive algo.pdf
Save Write a program to implement Binary search using recursive algo.pdfarihantmobileselepun
 
Using xUnit as a Swiss-Aarmy Testing Toolkit
Using xUnit as a Swiss-Aarmy Testing ToolkitUsing xUnit as a Swiss-Aarmy Testing Toolkit
Using xUnit as a Swiss-Aarmy Testing ToolkitChris Oldwood
 
Testing Code and Assuring Quality
Testing Code and Assuring QualityTesting Code and Assuring Quality
Testing Code and Assuring QualityKent Cowgill
 
Unit testing in iOS featuring OCUnit, GHUnit & OCMock
Unit testing in iOS featuring OCUnit, GHUnit & OCMockUnit testing in iOS featuring OCUnit, GHUnit & OCMock
Unit testing in iOS featuring OCUnit, GHUnit & OCMockRobot Media
 
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemSages
 
Asynchronous programming done right - Node.js
Asynchronous programming done right - Node.jsAsynchronous programming done right - Node.js
Asynchronous programming done right - Node.jsPiotr Pelczar
 
Native interfaces for R
Native interfaces for RNative interfaces for R
Native interfaces for RSeth Falcon
 
Beauty and the beast - Haskell on JVM
Beauty and the beast  - Haskell on JVMBeauty and the beast  - Haskell on JVM
Beauty and the beast - Haskell on JVMJarek Ratajski
 
Introduction to Scalding and Monoids
Introduction to Scalding and MonoidsIntroduction to Scalding and Monoids
Introduction to Scalding and MonoidsHugo Gävert
 
Scala @ TechMeetup Edinburgh
Scala @ TechMeetup EdinburghScala @ TechMeetup Edinburgh
Scala @ TechMeetup EdinburghStuart Roebuck
 
IIUG 2016 Gathering Informix data into R
IIUG 2016 Gathering Informix data into RIIUG 2016 Gathering Informix data into R
IIUG 2016 Gathering Informix data into RKevin Smith
 
Modern technologies in data science
Modern technologies in data science Modern technologies in data science
Modern technologies in data science Chucheng Hsieh
 

Semelhante a TestR: generating unit tests for R internals (20)

Testing in Python: doctest and unittest (Updated)
Testing in Python: doctest and unittest (Updated)Testing in Python: doctest and unittest (Updated)
Testing in Python: doctest and unittest (Updated)
 
(2) c sharp introduction_basics_part_i
(2) c sharp introduction_basics_part_i(2) c sharp introduction_basics_part_i
(2) c sharp introduction_basics_part_i
 
Testing in Python: doctest and unittest
Testing in Python: doctest and unittestTesting in Python: doctest and unittest
Testing in Python: doctest and unittest
 
Java Language fundamental
Java Language fundamentalJava Language fundamental
Java Language fundamental
 
Tackling repetitive tasks with serial or parallel programming in R
Tackling repetitive tasks with serial or parallel programming in RTackling repetitive tasks with serial or parallel programming in R
Tackling repetitive tasks with serial or parallel programming in R
 
Unit testing en iOS @ MobileCon Galicia
Unit testing en iOS @ MobileCon GaliciaUnit testing en iOS @ MobileCon Galicia
Unit testing en iOS @ MobileCon Galicia
 
Save Write a program to implement Binary search using recursive algo.pdf
Save Write a program to implement Binary search using recursive algo.pdfSave Write a program to implement Binary search using recursive algo.pdf
Save Write a program to implement Binary search using recursive algo.pdf
 
Using xUnit as a Swiss-Aarmy Testing Toolkit
Using xUnit as a Swiss-Aarmy Testing ToolkitUsing xUnit as a Swiss-Aarmy Testing Toolkit
Using xUnit as a Swiss-Aarmy Testing Toolkit
 
Testing Code and Assuring Quality
Testing Code and Assuring QualityTesting Code and Assuring Quality
Testing Code and Assuring Quality
 
Unit testing in iOS featuring OCUnit, GHUnit & OCMock
Unit testing in iOS featuring OCUnit, GHUnit & OCMockUnit testing in iOS featuring OCUnit, GHUnit & OCMock
Unit testing in iOS featuring OCUnit, GHUnit & OCMock
 
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
 
Asynchronous programming done right - Node.js
Asynchronous programming done right - Node.jsAsynchronous programming done right - Node.js
Asynchronous programming done right - Node.js
 
Native interfaces for R
Native interfaces for RNative interfaces for R
Native interfaces for R
 
Golang dot-testing-lite
Golang dot-testing-liteGolang dot-testing-lite
Golang dot-testing-lite
 
Beauty and the beast - Haskell on JVM
Beauty and the beast  - Haskell on JVMBeauty and the beast  - Haskell on JVM
Beauty and the beast - Haskell on JVM
 
Introduction to Scalding and Monoids
Introduction to Scalding and MonoidsIntroduction to Scalding and Monoids
Introduction to Scalding and Monoids
 
ssh.isdn.test
ssh.isdn.testssh.isdn.test
ssh.isdn.test
 
Scala @ TechMeetup Edinburgh
Scala @ TechMeetup EdinburghScala @ TechMeetup Edinburgh
Scala @ TechMeetup Edinburgh
 
IIUG 2016 Gathering Informix data into R
IIUG 2016 Gathering Informix data into RIIUG 2016 Gathering Informix data into R
IIUG 2016 Gathering Informix data into R
 
Modern technologies in data science
Modern technologies in data science Modern technologies in data science
Modern technologies in data science
 

Último

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 

Último (20)

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 

TestR: generating unit tests for R internals

  • 1. TestR Generating unit tests for R internals Purdue University https://github.com/allr/testR 1 Roman Tsegelskyi, Jan Vitek
  • 2. Motivation R1 > source('~/GNU-Rs/R1/tests/arith-true.R') . [1] TRUE [1] TRUE [1] TRUE [1] TRUE [1] TRUE [1] TRUE [1] TRUE [1] TRUE [1] TRUE Time elapsed: 0.428 0 0.426 0 0 Warning messages: 1: In log(-1) : NaNs produced 2: In gamma(0:-47) : NaNs produced 3: In digamma(x) : NaNs produced 4: In psigamma(x, 0) : NaNs produced
  • 3. Motivation • Ensuring correctness of builtin functions written in C (More than 600) • Automating this by generating test cases • Generalize it to testing any R function 3
  • 4. TestR 4 foo <- function (x) { x * 2; } ! R1 > foo(2) [4]
  • 5. TestR • A test is a call to a test function with arguments to handle errors, warnings, etc. test(id=0, code={ foo <- function (x) { x * 2 } foo(2) }, o=4); • Handles not only unit tests but also more complex test types
  • 6. TestR (continued) • Test cases can be generated from a template.. test(id=18, 1 + c(1, 2), name = "foo[a=1,b=c(1,2),
 c = "+"]", o = c(2, 3) ) test(name = "foo", g(a, 1, 2, 3, 4), g(b, c(1,2), c(2,3), 
 c(3,4)), g(c, "+","-"), code={a %c% b} ) 6
  • 7. Examples expected <- eval(parse(text="TRUE")); test(id=0, code={ argv <- eval(parse(text="list(c(-0.9, 1.0))")) do.call(`is.atomic`, argv) }, o=expected); ! ! ! expected <- eval(parse(text="1+0i")); test(id=0, code={ argv <- eval(parse(text="list(1, 0+0i)")); do.call(`+`, argv) }, o=expected);
  • 8. TestGeninstrumented GNU-R R GNU-R 
 with Gcov Capture 
 files test 
 cases bad TC log +Test DB test if newCov > oldCov add to DB exec gen gen foreach new Cov Testcase Filter
  • 9. Instrumented GNU-R # identical func: identical type: I args: list("closure", "S4", TRUE,
 TRUE, TRUE, TRUE, FALSE) retn: FALSE #is.na func: is.na type: P args: list(NA_integer_) retn: TRUE
  • 10. Instrumented GNU-R func: function_name ! type: P | I ! args: list(s1, s2, … , sn) | <arguments too long, ignored> ! retv: string | <return value too long, ignored> | <error>
  • 11. Dependent calls to builtins foo <- function(){ file.create(‘file.1’) file.create(‘file.2’) file.append(‘file.1’, file.2’) } foo <- function(){ Tfile <- file("test1", “w+") cat("abcndefn", file = Tfile) readLines(Tfile) }
  • 12. expected <- eval(parse(text="NULL")); test(id=0, code={ writeLines<- function (text, con = stdout(), sep = "n", useBytes = FALSE) { if (is.character(con)) { con <- file(con, "w") on.exit(close(con)) } .Internal(writeLines(text, con, sep, useBytes)) } ! argv <- eval(parse(text="list(c("[476] "1986-02-12" ”1986-02-13”),”file.1”); do.call(`writeLines`, argv); }, o=expected);
  • 13. Instrumented GNU-R 13 func: function_name body: closure_code args: list(string1, string2, ..., stringN) | <arguments too long, ignored> retv: string | 
 <return value too long, ignored> | 
 <error>
  • 14. TestGen • Process the capture file, generate all valid tests, log invalid tests • Run each test on trusted VM and validate the return value • Generates TestR output 14
  • 15. Filtering • Tests only added to Database if coverage increase • For builtins only measure coverage of src/main, but can be done for any folder in general • Use gcov to measure coverage of C code
 (nothing for R coverage yet)
  • 16. Experimental results • GNU R test suite gives 73% coverage in src/main • Capturing builtin calls gave 45% coverage. • Test suite has 3803 test cases out of 37M candidates. • Capturing closures that contain primitive calls gives 58% coverage and adds 892 tests
  • 17. Errors in RVMs • CXXR (C++ R) (University of Kent) • 8 failed test cases compared to R-2.15.1 • 263 failed test cases compared to R-3.0.1 • Renjin (R on JVM) • 621 failed test cases compared to R-3.0.1 • 12 NULL pointer exceptions and 15 class cast exceptions
  • 18. Conclusions • An infrastructure for automatically generating test cases from legacy R code • Generate test suite covers 80% of GNU R test suite covers, while shrinking size to 4695 tests • Infrastructure finds bugs in R VM implementations • Infrastructure can be used for creating test cases for any functions in R packages