Chapter 1 The Whole Game

This chapter gives a quick and (I hope) complete overview of what you can do with hordes. It is meant as an introduction and won’t go into detail on every function; please refer to later chapters if you want to know more!

1.1 hordes_init

The library() and mlibrary() functions talk to Rserve through node-rio. You can either launch Rserve by hand, or launch it from Node by calling hordes_init() at the top of your script.

You can serve several instances of Rserve by calling hordes_init(port = XXX), where XXX is a port number. That also means that you can open and call several instances of Rserve using a Node load balancer.
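
To make the load-balancing idea concrete, here is a minimal round-robin sketch in plain JavaScript. It only decides which port the next request should go to; the roundRobin helper and the port numbers are hypothetical, and how each port is then handed to hordes is deliberately left out.

```javascript
// Hypothetical helper: cycles through a fixed list of Rserve ports.
function roundRobin(ports) {
    if (ports.length === 0) {
        throw new Error("At least one port is required");
    }
    let index = 0;
    return () => {
        const port = ports[index];
        // Wrap around to the first port after the last one.
        index = (index + 1) % ports.length;
        return port;
    };
}

// Assumed ports, one per Rserve instance started beforehand.
const nextPort = roundRobin([6311, 6312, 6313]);
console.log(nextPort(), nextPort(), nextPort(), nextPort());
```

Round-robin is the simplest policy; a real balancer might also track which instances are busy or unreachable.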

1.2 library

The library function from hordes takes as input an installed R package and returns an object with all the functions from that package as methods.

Note that every function returns a promise.

This means that:

  • The output of R functions should be handled with then/catch or async/await.
  • All functions return an array of strings, unless you specify capture_output = false, in which case you rely on node-rio to do the conversion. This option might make things faster if you’re concerned about very high scalability.
  • If you return data that should not be treated as a string (JSON, number…), you’ll have to convert it to a JavaScript-compatible format.
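
As a sketch of that last point, the helper below turns the printed form of an unnamed numeric vector (the default array-of-strings output) into a JavaScript array of numbers. parseNumericVector is a hypothetical name, not part of hordes, and it deliberately handles only this simple case.

```javascript
// Hypothetical helper, not part of hordes: parses lines such as
// "[1] 49 13 37 25 91" into an array of numbers.
function parseNumericVector(lines) {
    return lines
        // Drop the leading "[n]" index R prints at the start of each line.
        .map((line) => line.replace(/^\s*\[\d+\]\s*/, ""))
        // Split on whitespace and ignore empty tokens (for example blank lines).
        .flatMap((line) => line.split(/\s+/).filter((token) => token.length > 0))
        .map(Number);
}

console.log(parseNumericVector(["[1] 49 13 37 25 91"]));
```

For richer structures, serializing to JSON on the R side and calling JSON.parse on the Node side is likely to be more robust than parsing printed output.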

For example, here is how to fit a linear model with lm() from Node.

// Here, we suppose you already have Rserve running in the background on port 6311
const {library} = require('hordes');
const stats = library("stats");

stats.lm("Sepal.Length ~ Sepal.Width, data = iris")
    .then((e) => console.log(e.join("\n")))
    .catch((err) => console.error(err))

Will output to the console:

Call:
stats::lm(formula = Sepal.Length ~ Sepal.Width, data = iris)

Coefficients:
(Intercept)  Sepal.Width  
     6.5262      -0.2234  

The rest of this guide will use async/await:

const { library, hordes_init } = require('hordes');
const stats = library("stats");

(async() => {
    // Launching Rserve from Node
    await hordes_init();
    try {
        const a = await stats.lm("Sepal.Length ~ Sepal.Width, data = iris")
        console.log(a)
    } catch (e) {
        console.log(e)
    }

    try {
        const a = stats.lm("Sepal.Length ~ Sepal.Width, data = iris")
        const b = stats.lm("Sepal.Length ~ Petal.Width, data = iris")
        const ab = await Promise.all([a, b])
        console.log(ab[0])
        console.log(ab[1])
    } catch (e) {
        console.log(e)
    }
})();

Will output to the console:

Call:
stats::lm(formula = Sepal.Length ~ Sepal.Width, data = iris)

Coefficients:
(Intercept)  Sepal.Width  
     6.5262      -0.2234  



Call:
stats::lm(formula = Sepal.Length ~ Sepal.Width, data = iris)

Coefficients:
(Intercept)  Sepal.Width  
     6.5262      -0.2234  



Call:
stats::lm(formula = Sepal.Length ~ Petal.Width, data = iris)

Coefficients:
(Intercept)  Petal.Width  
     4.7776       0.8886 

1.3 mlibrary

mlibrary does the same job as library except the functions are natively memoized.

Unless your data change between function calls, this is probably the mode you’ll want to choose.

// Here, we suppose you already have Rserve running in the background on port 6311
const {library, mlibrary} = require('hordes');
const base = library("base");
const mbase = mlibrary("base");

(async () => {
    try {
        const a = await base.sample("1:100, 5")
        console.log("a:", a)
        const b = await base.sample("1:100, 5")
        console.log("b:", b)
    } catch (e) {
        console.log(e)
    }

    try {
        const a = await mbase.sample("1:100, 5")
        console.log("a:", a)
        const b = await mbase.sample("1:100, 5")
        console.log("b:", b)
    } catch (e) {
        console.log(e)
    }
})();

Will output to the console:

a: [1] 49 13 37 25 91

b: [1]  5 17 68 26 29

a: [1] 96 17  6  4 75

b: [1] 96 17  6  4 75
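
The difference between the two outputs above comes from memoization: the result of a call is stored and returned again when the same arguments come back. The following is an illustrative sketch of that idea for promise-returning functions; memoizeAsync and its cache key are assumptions made for the example, not the implementation used by mlibrary.

```javascript
// Illustrative only: caches the promise returned for each distinct argument list.
function memoizeAsync(fn) {
    const cache = new Map();
    return (...args) => {
        // Assumption: arguments are JSON-serializable, so this string is a usable key.
        const key = JSON.stringify(args);
        if (!cache.has(key)) {
            cache.set(key, fn(...args));
        }
        return cache.get(key);
    };
}

// Stand-in for a non-deterministic R call such as base.sample().
const randomDraw = async (max) => Math.floor(Math.random() * max);
const memoizedDraw = memoizeAsync(randomDraw);

(async () => {
    const first = await memoizedDraw(100);
    const second = await memoizedDraw(100);
    console.log(first === second); // true: the second call reuses the cached promise
})();
```

Note that this sketch also caches rejected promises, so a failed call would keep failing until the cache is cleared.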

1.4 get_hash

When calling library() or mlibrary(), you can specify a hash, which can be computed with get_hash. This hash is computed from the DESCRIPTION file of the package called. That way, if the DESCRIPTION file ever changes (a version update, for example), you are alerted because the app won’t launch. Just ignore this parameter if you don’t care about that (but you should in a production setting).

const { library, get_hash } = require('hordes');
console.log(get_hash("golem"))
'fdfe0166629045e6ae8f7ada9d9ca821742e8135efec62bc2226cf0811f44ef3'

Then if you call library() with another hash, the app will fail.

var golem = library("golem", "blabla")
            throw new Error("Hash from DESCRIPTION doesn't match specified hash.")
var golem = library("golem", 'e2167f289a708b2cd3b774dd9d041b9e4b6d75584b9421185eb8d80ca8af4d8a')
Object.keys(golem).length
104
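
The idea behind this check can be sketched with Node's built-in crypto module. The example assumes a SHA-256 digest of the DESCRIPTION text, which is consistent with the 64-character hexadecimal strings shown above but is not confirmed here; hashDescription and checkHash are hypothetical names, and the DESCRIPTION contents are made up for the demonstration.

```javascript
const crypto = require("crypto");

// Assumption: the digest is SHA-256 over the raw DESCRIPTION text.
function hashDescription(descriptionText) {
    return crypto.createHash("sha256").update(descriptionText).digest("hex");
}

// Throws when the expected hash no longer matches the current DESCRIPTION.
function checkHash(descriptionText, expectedHash) {
    if (hashDescription(descriptionText) !== expectedHash) {
        throw new Error("Hash from DESCRIPTION doesn't match specified hash.");
    }
}

// Made-up DESCRIPTION contents for two versions of a package.
const v1 = "Package: demo\nVersion: 0.1.0\n";
const v2 = "Package: demo\nVersion: 0.2.0\n";
const pinned = hashDescription(v1);

checkHash(v1, pinned); // passes silently
try {
    checkHash(v2, pinned); // the version bump changes the hash
} catch (e) {
    console.log(e.message);
}
```

Pinning the hash this way turns a silent package upgrade into an explicit failure at startup.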

1.5 waiter

You can launch an R process that streams data and wait for a specific output in the stdout. What sets waiter apart is that it doesn’t rely on node-rio, but spawns a real R process and reads the elements streamed on stdout.

The promise resolves with an object {proc, raw_output}: proc is the process object created by Node, and raw_output is the output buffer, which can be turned into a string with .toString().

A streaming process is understood here in a loose sense: it means anything that prints successive elements to the console. For example, when you create a new application using the {golem} package, the app is ready once the last line of the output below is printed to the console. This is exactly what waiter does: it waits for this last line to be printed to the R stdout before resolving.

> golem::create_golem('pouet')
-- Checking package name -------------------------------------------------------
v Valid package name
-- Creating dir ----------------------------------------------------------------
v Created package directory
-- Copying package skeleton ----------------------------------------------------
v Copied app skeleton
-- Setting the default config --------------------------------------------------
v Configured app
-- Done ------------------------------------------------------------------------
A new golem named pouet was created at /private/tmp/pouet .
To continue working on your app, start editing the 01_start.R file.

const { waiter } = require("hordes")
const express = require('express');
const app = express();

app.get('/creategolem', async(req, res) => {
    try {
        await waiter("golem::create_golem('pouet')", {solve_on: "To continue working on your app"});
        res.send("Created ")
    } catch (e) {
        console.log(e)
        res.status(500).send("Error creating the golem project")
    }
})

app.listen(2811, function() {
    console.log('Example app listening on port 2811!')
})

-> http://localhost:2811/creategolem
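
To illustrate the mechanism, here is a minimal sketch of the same pattern built on Node's child_process module: spawn a process, accumulate its stdout, and resolve once a given string has been printed. waitForOutput is a hypothetical helper, not the waiter implementation, and it runs a small Node script instead of R so that it can be tried without an R installation.

```javascript
const { spawn } = require("child_process");

// Hypothetical helper: resolves with { proc, raw_output } once `solveOn`
// appears on the child's stdout, and rejects if the child ends without it.
function waitForOutput(command, args, solveOn) {
    return new Promise((resolve, reject) => {
        const proc = spawn(command, args);
        let raw_output = Buffer.alloc(0);

        proc.stdout.on("data", (chunk) => {
            raw_output = Buffer.concat([raw_output, chunk]);
            if (raw_output.toString().includes(solveOn)) {
                // Settling an already settled promise is a no-op,
                // so later chunks are harmless.
                resolve({ proc, raw_output });
            }
        });

        proc.on("error", reject);
        proc.on("close", (code) => {
            reject(new Error(`Process closed (code ${code}) without printing "${solveOn}"`));
        });
    });
}

// Usage: a Node one-liner stands in for the R process.
waitForOutput(
    process.execPath,
    ["-e", "console.log('working'); console.log('All done')"],
    "All done"
)
    .then(({ raw_output }) => console.log(raw_output.toString()))
    .catch((err) => console.error(err));
```

Because the match is checked on the accumulated buffer rather than on each chunk, the target string is still found if it arrives split across two chunks.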

1.6 Changing the process that runs R

By default, the R code is launched by Rscript, but you can specify another executable (for example, if you need another version of R):

const { waiter } = require("hordes")
const express = require('express');
const app = express();

app.get('/creategolem', async(req, res) => {
    try {
        await waiter("golem::create_golem('pouet')", {solve_on: "To continue working on your app", process: '/usr/local/bin/Rscript'});
        res.send("Created ")
    } catch (e) {
        console.log(e)
        res.status(500).send("Error creating the golem project")
    }
})

app.listen(2811, function() {
    console.log('Example app listening on port 2811!')
})