It’s no news that Machine Learning has become a huge trend lately. Still, it’s not uncommon for people to avoid techniques and technologies based on Machine Learning because of its intimidating mathematical nature.

That’s not an excuse anymore, though. Today there are several frameworks and libraries that abstract away the complex details and provide a much simpler API.

Most of those libraries are Python packages, but we are going to explore a JavaScript package called Brain.js, which has been gaining a lot of traction lately. An alternative is Synaptic, which is more flexible and allows finer-grained control of your neural network architecture.

Brain.js

Brain.js makes it very easy to train and use neural networks, and therefore to build prediction systems. All you need is some data, which can be acquired by prompting the user or querying an API. This data can then be fed directly into a Brain.js neural network object and used for training.

To see how easy this process is, let’s look at an example. First, you need to install the Brain.js package:

$ npm install brain.js

Or with yarn:

$ yarn add brain.js

You can also use it directly in the browser, using this script.

Now, let’s take a look at the classic example of approximating the XOR function, extracted from the Brain.js documentation:

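// In Node.js, load the package first (in the browser, brain is a global once the script is loaded)
const brain = require('brain.js');

var net = new brain.NeuralNetwork();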

net.train([{input: [0, 0], output: [0]},
           {input: [0, 1], output: [1]},
           {input: [1, 0], output: [1]},
           {input: [1, 1], output: [0]}]);

var output = net.run([1, 0]);  // [0.987]

Basically, a neural network is a prediction technique. Of course, you might look at the above example and think that it is useless, especially because we already knew the result of the XOR operation for the input [1, 0] beforehand.

However, we can apply the exact same logic to predict, for example, which soccer team would win a match given their history. The neural network would take into account not only the matches between the two teams, but all the other matches as well. It’s very hard for a human to detect those nuances. You could also insert more data that you think would be useful, such as the crowd size per match.

Let’s build a simple soccer example. Our data is composed of the match (e.g. [0, 1] means a match between Team 0 and Team 1) and the result (e.g. [1] means that the second team in the input array won, and [0] means that the first team won). We could model a draw with, for example, a 0.5 value. Or we could use the ratio of goals to determine the output. You are free to test your hypotheses.

var data = [
    {input: [0, 1], output: [1]},  // Team 1 wins the match
    {input: [0, 2], output: [1]},  // Team 2 wins the match
    {input: [1, 2], output: [0]},  // Team 1 wins the match
    {input: [1, 3], output: [1]}   // Team 3 wins the match
];

var net = new brain.NeuralNetwork();

net.train(data);

// Who would win the match between Team 0 and Team 3?
net.run([0, 3]); // [0.9939018487930298] -> high chance of Team 3 winning
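
The alternative output encodings mentioned earlier (0.5 for a draw, or a ratio of goals) could be produced by a small helper. This is only a sketch: the matchToSample function and the { first, second, firstGoals, secondGoals } record shape are hypothetical names chosen for illustration and are not part of Brain.js.

```javascript
// Hypothetical match record: { first, second, firstGoals, secondGoals }.
// Output convention from the example above: 1 = second team won, 0 = first team won.
function matchToSample(match, mode = 'result') {
    const { first, second, firstGoals, secondGoals } = match;
    let value;

    if (mode === 'ratio') {
        const total = firstGoals + secondGoals;
        // Share of goals scored by the second team; a 0-0 game is treated as a draw.
        value = total === 0 ? 0.5 : secondGoals / total;
    } else if (firstGoals === secondGoals) {
        value = 0.5; // draw
    } else {
        value = secondGoals > firstGoals ? 1 : 0;
    }

    return { input: [first, second], output: [value] };
}

// Made-up results, for illustration only.
const matches = [
    { first: 0, second: 1, firstGoals: 1, secondGoals: 3 },
    { first: 1, second: 2, firstGoals: 2, secondGoals: 2 }
];

const samples = matches.map(m => matchToSample(m));
console.log(JSON.stringify(samples));
```

The resulting array has the same shape as the data variable above, so it could be passed to net.train in the same way.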

Remember, however, that if you are testing parameters, you must split your data into a training segment and a testing segment. You would use the training segment to train your model, and the testing segment to verify its accuracy on data it has not seen. It’s common to use 80% of the data to train the model, and 20% to verify it.
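
A minimal sketch of such a split, assuming samples shaped like the soccer data above with outputs of exactly 0 or 1. The helper names (shuffle, trainTestSplit, accuracy) are made up for this example and do not come from Brain.js.

```javascript
// Fisher-Yates shuffle on a copy, so the original array is left untouched.
function shuffle(items) {
    const copy = items.slice();
    for (let i = copy.length - 1; i > 0; i--) {
        const j = Math.floor(Math.random() * (i + 1));
        [copy[i], copy[j]] = [copy[j], copy[i]];
    }
    return copy;
}

// Split the data into a training segment and a testing segment (80/20 by default).
function trainTestSplit(data, trainRatio = 0.8) {
    const shuffled = shuffle(data);
    const cut = Math.floor(shuffled.length * trainRatio);
    return { train: shuffled.slice(0, cut), test: shuffled.slice(cut) };
}

// Fraction of test samples whose rounded prediction matches the expected output.
// `predict` is any function that maps an input array to an output array.
function accuracy(testData, predict) {
    if (testData.length === 0) return 0;
    const hits = testData.filter(sample =>
        Math.round(predict(sample.input)[0]) === sample.output[0]
    ).length;
    return hits / testData.length;
}
```

With a network trained on the train segment only, calling accuracy(test, input => net.run(input)) would then estimate how well it does on matches it has never seen.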

MNIST model

[Image: MNIST digits]

Let’s now build a more realistic model using the MNIST dataset. It consists of tens of thousands of images of handwritten digits. We will use the pre-built MNIST package for Node.js.

I’ve created a repository with the code for this example. You can grab it here. It consists of a small library that will train a model using a subset of the MNIST dataset, and then try to predict some handwritten digits.

src/train.js

This is our training file. It is responsible for providing some helper functions for training our model, and creating the Neural Network itself. When we use the mnist[digit].get() method, the mnist library will return a random instance of the digit we have passed between brackets.

const mnist = require('mnist');
const brain = require('brain.js');

const net = new brain.NeuralNetwork();

function random_digit() {
    const value = Math.floor(Math.random()*10);
    const value_str = String(value);

    const output = {};
    output[value_str] = 1;

    return {
        "input": mnist[value].get(),
        "output": output
    };
}

function train_rand(sample_size) {
    const data = [];

    for (let i = 0; i < sample_size; i++) {
        data.push(random_digit());
    }
    net.train(data);
}

function net_result(digit) {
    return brain.likely(digit, net);
}

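// browserify wraps this module in its own scope, so expose what the page needs as globals.
// mnist is exposed too, because index.html calls mnist.draw().
window.mnist = mnist;
window.random_digit = random_digit;
window.train_rand = train_rand;
window.net_result = net_result;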
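
To make the output format more concrete: random_digit labels each sample with an object such as { "3": 1 }, and net_result relies on brain.likely to pick the most probable label. The sketch below mimics that idea in plain JavaScript, under the assumption that the network returns one score per label. The oneHotLabel and mostLikely helpers are hypothetical and are not the library's implementation.

```javascript
// Build the label object used as training output, e.g. 3 -> { "3": 1 }.
function oneHotLabel(digit) {
    return { [String(digit)]: 1 };
}

// Return the key with the highest score from an output object.
function mostLikely(output) {
    let bestKey = null;
    let bestScore = -Infinity;

    for (const [key, score] of Object.entries(output)) {
        if (score > bestScore) {
            bestKey = key;
            bestScore = score;
        }
    }
    return bestKey;
}

// Made-up scores, for illustration only.
const exampleOutput = { '0': 0.02, '3': 0.91, '8': 0.11 };
console.log(mostLikely(exampleOutput)); // prints 3
```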

After we’re done with the train.js file, we can simply use browserify to create a self-contained script that can be loaded in the browser. I’ve included an npm script to help with that:

$ npm run build

index.html

From our HTML file, we can simply import our generated script. Here is how I am using the functions from the train.js file:

<script src="./dist/train.js"></script>
<script>
    next = function() {
        const context = document.getElementById("mnistCanvas").getContext("2d");
        const digit = random_digit();

        const inverted = digit.input.map(val => Number(!val));

        mnist.draw(inverted, context);

        const result = net_result(digit.input);
        const resultElement = document.getElementById("result");

        resultElement.innerHTML = result;
    };

    train_NN = function() {
        document.getElementById("train_button").disabled = true;
        const sample_size = document.getElementById("sample_size").value;

        train_rand(Number(sample_size));

        document.getElementById("train_button").disabled = false;
        document.getElementById("next_button").disabled = false;
    };
</script>

These two functions are callbacks for the two buttons in the page.

The next function updates a canvas on the page to show the current digit, then tries to guess which digit it is by calling the net_result function, and finally displays the guess on the screen. The train_NN function simply trains our model by calling train_rand with the number of MNIST samples the training process should use.
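
One detail about the drawing step: Number(!val) turns 0 into 1 and every non-zero value into 0, so it also flattens any shades of gray. If the pixel values are fractional rather than strictly 0 or 1, a gentler inversion could look like the sketch below. It assumes every value lies between 0 and 1, and invertPixels is a hypothetical helper name.

```javascript
// Invert grayscale pixels, assuming every value lies in the range [0, 1].
// Unlike Number(!val), intermediate shades are preserved.
function invertPixels(pixels) {
    return pixels.map(val => 1 - val);
}

console.log(invertPixels([0, 1, 0.25]));
```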

If you increase the sample size, the model will generally become more accurate. However, increasing it too much will make the training process take a long time; training with the whole dataset can take hours. Training can be accelerated by using a GPU. Brain.js supports GPU training with the NeuralNetworkGPU class, which uses gpu.js under the hood. However, this feature is not well supported yet.

Summary

We have seen how to import and use Brain.js in practice, and discussed some ways to model the input and output of our training data. We have also seen a practical example using a large dataset to train our model.

Of course, there are several downsides to using the current JS ecosystem for machine learning instead of Python, for example. However, it is a great way to show simpler results to the end user by training directly in the browser, instead of sending the data to a server to be trained.

There are also other tools for this task, such as TensorFlow.js and Synaptic. It’s also worth mentioning natural, a package for natural language processing that provides common algorithms such as tf-idf.