Chapter 1 The Whole Game
This chapter will give you a quick and (I hope) complete overview of what you can do with hordes
.
This chapter is supposed to be an introduction and won’t go into details on all the functions.
Please refer to further chapters if you want to know more!
1.1 hordes_init
The library()
and mlibrary()
functions will be talking to RServe through node-rio.
You can either launch Rserve by hand, or from Node by calling hordes_init()
at the top of your script if you want to lauch it.
You can serve several instances of RServe, by calling hordes_init(port = XXX)
where XXX
is a port.
That also mean that you can open and call several instances of RServe using a Node load balancer.
1.2 library
The library
function from hordes
takes as input an installed R package and returns an object with all the functions from that package as methods.
**Note that every function returns a promise,
This means that:
- R fun output should be handled with
then/catch
orasync/await
- All function will return as an array of strings, unless you specify
capture_output = false
, then you rely onnode-rio
to do the conversion. This solution might make things faster if you’re concerned about very high scalability. - If you return a data format that is not to be treated as a string (JSON, number…)… you’ll have to transform the data to a JavaScript compatible format.
For example, here is how to run a lm
from Node.
// Here, we suppose you already have Rserve running in the background on port 6311
const {library} = require('hordes');
const stats = library(package = "stats");
stats.lm("Sepal.Length ~ Sepal.Width, data = iris")
.then((e) => console.log(e.join("\n")))
.catch((err) => console.error(err))
Will output to the console:
Call:
stats::lm(formula = Sepal.Length ~ Sepal.Width, data = iris)
Coefficients:
(Intercept) Sepal.Width
6.5262 -0.2234
The rest of this guide will use async/await
const { library } = require('hordes');
const stats = library("stats");
(async() => {
// Launching RServe from Node
await hordes_init();
try {
const a = await stats.lm("Sepal.Length ~ Sepal.Width, data = iris")
console.log(a)
} catch (e) {
console.log(e)
}
try {
const a = stats.lm("Sepal.Length ~ Sepal.Width, data = iris")
const b = stats.lm("Sepal.Length ~ Petal.Width, data = iris")
const ab = await Promise.all([a, b])
console.log(ab[0])
console.log(ab[1])
} catch (e) {
console.log(e)
}
})();
Call:
stats::lm(formula = Sepal.Length ~ Sepal.Width, data = iris)
Coefficients:
(Intercept) Sepal.Width
6.5262 -0.2234
Call:
stats::lm(formula = Sepal.Length ~ Sepal.Width, data = iris)
Coefficients:
(Intercept) Sepal.Width
6.5262 -0.2234
Call:
stats::lm(formula = Sepal.Length ~ Petal.Width, data = iris)
Coefficients:
(Intercept) Petal.Width
4.7776 0.8886
1.3 mlibrary
mlibrary
does the same job as library
except the functions are natively memoized.
Unless your data are changing between each function call, this is probably the mode you’d want to chose.
// Here, we suppose you already have Rserve running in the background on port 6311
const {library, mlibrary} = require('hordes');
const base = library("base");
const mbase = mlibrary("base");
(async () => {
try {
const a = await base.sample("1:100, 5")
console.log("a:", a)
const b = await base.sample("1:100, 5")
console.log("b:", b)
} catch(e){
console.log(e)
}
try {
const a = await mbase.sample("1:100, 5")
console.log("a:", a)
const b = await mbase.sample("1:100, 5")
console.log("b:", b)
} catch(e){
console.log(e)
}
}
)();
a: [1] 49 13 37 25 91
b: [1] 5 17 68 26 29
a: [1] 96 17 6 4 75
b: [1] 96 17 6 4 75
1.4 get_hash
When calling library()
or mlibrary()
, you can specify a hash, which can be compiled with get_hash
.
This hash is computed from the DESCRIPTION
of the package called.
That way, if ever the DESCRIPTION
file changes (version update, or stuff like that…), you can get alerted (app won’t launch).
Just ignore this param if you don’t care about that (but you should in a production setting).
'fdfe0166629045e6ae8f7ada9d9ca821742e8135efec62bc2226cf0811f44ef3'
Then if you call library()
with another hash, the app will fail.
throw new Error("Hash from DESCRIPTION doesn't match specified hash.")
var golem = library("golem", hash = 'e2167f289a708b2cd3b774dd9d041b9e4b6d75584b9421185eb8d80ca8af4d8a')
Object.keys(golem).length
104
1.5 waiter
You can launch an R process that streams data and wait for a specific output in the stdout.
The specificity of waiter
is that it doesn’t rely on node-rio
, but spawn a real R process, and reads the elements streamed on stdout.
The promise resolves with and {proc, raw_output}
: proc
is the process object created by Node, raw_output
is the output buffer, that can be turned to string with .toString()
.
A streaming process here is considered in a lose sense: what we mean here is anything that prints various elements to the console.
For example, when you create a new application using the {golem}
package, the app is ready once this last line is printed to the console.
This is exactly what waiter
does, it waits for this last line to be printed to the R stdout before resolving.
> golem::create_golem('pouet')
-- Checking package name -------------------------------------------------------
v Valid package name
-- Creating dir ----------------------------------------------------------------
v Created package directory
-- Copying package skeleton ----------------------------------------------------
v Copied app skeleton
-- Setting the default config --------------------------------------------------
v Configured app
-- Done ------------------------------------------------------------------------
A new golem named pouet was created at /private/tmp/pouet .
To continue working on your app, start editing the 01_start.R file.
const { waiter } = require("hordes")
const express = require('express');
const app = express();
app.get('/creategolem', async(req, res) => {
try {
await waiter("golem::create_golem('pouet')", {solve_on: "To continue working on your app"});
res.send("Created ")
} catch (e) {
console.log(e)
res.status(500).send("Error creating the golem project")
}
})
app.listen(2811, function() {
console.log('Example app listening on port 2811!')
})
1.6 Changing the process that runs R
By default, the R code is launched by RScript
, but you can specify another (for example if you need another version of R):
const { waiter } = require("hordes")
const express = require('express');
const app = express();
app.get('/creategolem', async(req, res) => {
try {
await waiter("golem::create_golem('pouet')", {solve_on: "To continue working on your app", process: '/usr/local/bin/RScript'});
res.send("Created ")
} catch (e) {
console.log(e)
res.status(500).send("Error creating the golem project")
}
})
app.listen(2811, function() {
console.log('Example app listening on port 2811!')
})