In this GettingStarted article, we will build a robot that answers questions in an IQ test, with the help of DeepLearning.scala.

Background

Suppose we are building a robot for answering questions in an IQ test like this one:

What is the next number in sequence:

3, 6, 9, ?

The answer is 12.

We prepared some questions and corresponding answers as INDArrays:

In [1]:
import $ivy.`org.nd4j::nd4s:0.8.0`
import $ivy.`org.nd4j:nd4j-native-platform:0.8.0`
import org.nd4j.linalg.api.ndarray.INDArray

val TrainingQuestions: INDArray = {
  import org.nd4s.Implicits._
  Array(
    Array(0, 1, 2),
    Array(4, 7, 10),
    Array(13, 15, 17)
  ).toNDArray
}

val ExpectedAnswers: INDArray = {
  import org.nd4s.Implicits._
  Array(
    Array(3),
    Array(13),
    Array(19)
  ).toNDArray
}
Out[1]:
import $ivy.$                     

import $ivy.$                                    

import org.nd4j.linalg.api.ndarray.INDArray


TrainingQuestions: org.nd4j.linalg.api.ndarray.INDArray = [[0.00, 1.00, 2.00],
 [4.00, 7.00, 10.00],
 [13.00, 15.00, 17.00]]
ExpectedAnswers: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 13.00, 19.00]

These samples will be used to train the robot. Each row of TrainingQuestions is an arithmetic progression, and the corresponding row of ExpectedAnswers is its next term.

In the rest of this article, we will build the robot in the following steps:

  1. Install DeepLearning.scala, which is the framework that helps us build the robot.
  2. Set up the configuration (also known as hyperparameters) of the robot.
  3. Build an untrained neural network of the robot.
  4. Train the neural network using the above samples.
  5. Test the robot to see whether it has learned how to answer this kind of question.

Install DeepLearning.scala

DeepLearning.scala is hosted on the Maven Central repository.

You can use magic imports in jupyter-scala or Ammonite-REPL to download DeepLearning.scala and its dependencies.

In [2]:
import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.0`
Out[2]:
import $ivy.$                                                      

If you use sbt, please add the following settings into your build.sbt:

// All DeepLearning.scala built-in plugins.
libraryDependencies += "com.thoughtworks.deeplearning" %% "plugins-builtins" % "latest.release"

// The native backend for nd4j.
libraryDependencies += "org.nd4j" % "nd4j-native-platform" % "0.8.0"

// Uncomment the following line to switch to the CUDA backend for nd4j.
// libraryDependencies += "org.nd4j" % "nd4j-cuda-8.0-platform" % "0.8.0"

// The magic import compiler plugin, which may be used to import DeepLearning.scala distributed in source format.
addCompilerPlugin("com.thoughtworks.import" %% "import" % "latest.release")

// The ThoughtWorks Each library, which provides the `monadic`/`each` syntax.
libraryDependencies += "com.thoughtworks.each" %% "each" % "latest.release"
addCompilerPlugin("org.scalamacros" % "paradise" % "2.1.0" cross CrossVersion.full)

fork := true

scalaVersion := "2.11.11"

Note that this example must run on Scala 2.11.11, because nd4s does not support Scala 2.12. Make sure your build.sbt does not contain a setting like scalaVersion := "2.12.x".

See Scaladex for instructions on installing DeepLearning.scala with other build tools.

Setup hyperparameters

Hyperparameters are global configurations for a neural network.

For this robot, we want to set its learning rate, which determines how fast the robot changes its internal weights.
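
Roughly speaking, each training iteration moves every weight a small step against the gradient of the loss, as in the following sketch (pseudocode for the general gradient-descent rule, not the actual plugin implementation):

weight = weight - learningRate * gradientOfLoss(weight)

A larger learning rate trains faster but may overshoot the optimum; a smaller one is more stable but needs more iterations.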

In DeepLearning.scala, hyperparameters can be introduced by plugins, which are small pieces of code that can be loaded from a URL.

In [4]:
interp.load(scala.io.Source.fromURL(new java.net.URL("https://gist.github.com/Atry/1fb0608c655e3233e68b27ba99515f16/raw/39ba06ee597839d618f2fcfe9526744c60f2f70a/FixedLearningRate.sc")).mkString)

By loading the hyperparameter plugin FixedLearningRate, we are able to create the context of the neural network with a learningRate parameter.

See FixedLearningRate's README for instructions for sbt projects.

In [5]:
import com.thoughtworks.deeplearning.plugins.Builtins
Out[5]:
import com.thoughtworks.deeplearning.plugins.Builtins

All DeepLearning.scala built-in features are also provided by plugins. Builtins is the plugin that contains all other DeepLearning.scala built-in plugins.

Now we create the context and set the learning rate to 0.003.

In [7]:
// `interp.load` is a workaround for https://github.com/lihaoyi/Ammonite/issues/649 and https://github.com/scala/bug/issues/10390
interp.load("""
  import scala.concurrent.ExecutionContext.Implicits.global
  import com.thoughtworks.feature.Factory
  val hyperparameters = Factory[Builtins with FixedLearningRate].newInstance(learningRate = 0.003)
""")

See Factory if you are wondering how those plugins are composed together.

The Builtins plugin contains some implicit values and views, which should be imported as follows:

In [8]:
import hyperparameters.implicits._
Out[8]:
import hyperparameters.implicits._

Build an untrained neural network of the robot

In DeepLearning.scala, a neural network is simply a function that references some weights, which are mutable variables that are automatically adjusted toward some goal during training.

For example, given x0, x1 and x2 as the input sequence passed to the robot, we can build a function that returns the answer as robotWeight0 * x0 + robotWeight1 * x1 + robotWeight2 * x2. By adjusting those weights during training, the result should become close to the expected answer.

In DeepLearning.scala, weights can be created as follows:

In [9]:
def initialValueOfRobotWeight: INDArray = {
  import org.nd4j.linalg.factory.Nd4j
  import org.nd4s.Implicits._
  Nd4j.randn(3, 1)
}

import hyperparameters.INDArrayWeight
val robotWeight = INDArrayWeight(initialValueOfRobotWeight)
Out[9]:
defined function initialValueOfRobotWeight
import hyperparameters.INDArrayWeight

robotWeight: Object with hyperparameters.INDArrayWeightApi with hyperparameters.WeightApi with hyperparameters.WeightApi = Weight[fullName=$sess.cmd8Wrapper.Helper.robotWeight]

In the above code, robotWeight is a weight whose value is an n-dimensional array (an INDArrayWeight), initialized with random values. Therefore, the formula robotWeight0 * x0 + robotWeight1 * x1 + robotWeight2 * x2 is equivalent to a matrix multiplication, written as a dot method call:

In [10]:
import hyperparameters.INDArrayLayer
def iqTestRobot(questions: INDArray): INDArrayLayer = {
  questions dot robotWeight
}
Out[10]:
import hyperparameters.INDArrayLayer

defined function iqTestRobot

Note that the dot method is a differentiable function provided by DeepLearning.scala. You can find other differentiable methods for n-dimensional arrays in the Scaladoc.

Unlike the functions in nd4s, all those differentiable functions accept an INDArray, an INDArrayWeight or an INDArrayLayer, and return a layer of the neural network, which can in turn be composed into another differentiable function call.
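
For example, because the layer returned by dot is itself differentiable, it can be fed straight into further differentiable operations. The following sketch shows a hypothetical biasedRobot (not used elsewhere in this article; it assumes the built-in arithmetic operators between layers and constants):

import hyperparameters.INDArrayLayer
// A hypothetical composition: the INDArrayLayer returned by `dot`
// is passed to another differentiable call, `+` with a constant.
def biasedRobot(questions: INDArray): INDArrayLayer = {
  (questions dot robotWeight) + 1.0
}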

Training the network

Loss function

In DeepLearning.scala, when we train a neural network, our goal should always be to minimize its return value.

For example, if iqTestRobot(TrainingQuestions).train were called repeatedly, the neural network would try to minimize questions dot robotWeight. robotWeight would keep decreasing in order to make questions dot robotWeight smaller, and iqTestRobot(TrainingQuestions).predict would return an INDArray of smaller and smaller numbers.

What if you expect iqTestRobot(TrainingQuestions).predict to return ExpectedAnswers?

You can create another neural network that evaluates how far the result of iqTestRobot is from your expectation. This new neural network is usually called a loss function.

In this article we will use square loss as the loss function:

In [11]:
import hyperparameters.DoubleLayer
def squareLoss(questions: INDArray, expectAnswer: INDArray): DoubleLayer = {
  val difference = iqTestRobot(questions) - expectAnswer
  (difference * difference).mean
}
Out[11]:
import hyperparameters.DoubleLayer

defined function squareLoss

When the loss function is trained continuously, its return value will get close to zero, and the result of iqTestRobot must then be close to the expected result.

Note that squareLoss accepts questions and expectAnswer as its parameters. The first parameter is the input data used to train the neural network, and the second is the expected output.

The squareLoss function is itself a neural network, internally using the layer returned by the iqTestRobot method.
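
To make concrete what squareLoss measures, the following sketch computes the same quantity with plain, non-differentiable ND4J calls, reading the current value of the weight through robotWeight.data (the names prediction, difference and manualLoss are just for illustration):

// mean of the squared differences between prediction and expectation
val prediction = TrainingQuestions mmul robotWeight.data
val difference = prediction sub ExpectedAnswers
val manualLoss = (difference mul difference).meanNumber.doubleValue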

Run the training task

As mentioned before, DoubleLayer has a train method, which returns a ThoughtWorks Future that performs one iteration of training.

Since we want to repeatedly train the neural network of the robot, we need to create another Future that performs many iterations of training.

In this article, we use ThoughtWorks Each to build such a Future:

In [12]:
import $ivy.`com.thoughtworks.each::each:3.3.1`
import $plugin.$ivy.`org.scalamacros:paradise_2.11.11:2.1.0`

import com.thoughtworks.each.Monadic._
import com.thoughtworks.future._
import scala.concurrent.Await
import scala.concurrent.duration.Duration
import scalaz.std.stream._
Out[12]:
import $ivy.$                                  

import $plugin.$                                            


import com.thoughtworks.each.Monadic._

import com.thoughtworks.future._

import scala.concurrent.Await

import scala.concurrent.duration.Duration

import scalaz.std.stream._
In [13]:
val TotalIterations = 500

@monadic[Future]
def train: Future[Stream[Double]] = {
  for (iteration <- (0 until TotalIterations).toStream) yield {
    squareLoss(TrainingQuestions, ExpectedAnswers).train.each
  }
}
Out[13]:
TotalIterations: Int = 500
defined function train

Then we can run the task to train the robot.

In [14]:
val lossByTime: Stream[Double] = Await.result(train.toScalaFuture, Duration.Inf)
Out[14]:
lossByTime: Stream[Double] = Stream(
  46.446085611979164,
  25.62597147623698,
  15.60748036702474,
  10.731187184651693,
  8.304683685302734,
  7.047036488850911,
  6.34877077738444,
  5.919837951660156,
  5.622165044148763,
  5.389952341715495,
  5.191807111104329,
...

Then we create a plot to show how the loss changed over the iterations.

In [15]:
import $ivy.`org.plotly-scala::plotly-jupyter-scala:0.3.2`

import plotly._
import plotly.element._
import plotly.layout._
import plotly.JupyterScala._

plotly.JupyterScala.init()
Out[15]:
import $ivy.$                                             


import plotly._

import plotly.element._

import plotly.layout._

import plotly.JupyterScala._

In [16]:
Scatter(lossByTime.indices, lossByTime).plot(title = "loss by time")
Out[16]:
res15: String = "plot-1447757220"

After these iterations, the loss should be close to zero.
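
You can check this against the recorded losses:

// The last recorded loss should be far smaller than the first one.
println(s"first loss: ${lossByTime.head}, last loss: ${lossByTime.last}")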

Test the trained robot

In [17]:
val TestQuestions: INDArray = {
  import org.nd4s.Implicits._
  Array(Array(3, 6, 9)).toNDArray
}
Out[17]:
TestQuestions: INDArray = [3.00, 6.00, 9.00]
In [18]:
Await.result(iqTestRobot(TestQuestions).predict.toScalaFuture, Duration.Inf)
Out[18]:
res17: INDArray = 12.00

The result should be close to 12.

You may also inspect the values of the weights in the trained neural network:

In [19]:
val weightData: INDArray = robotWeight.data
Out[19]:
weightData: INDArray = [-0.55, 0.11, 1.44]
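
These numbers may look arbitrary, but they satisfy the constraints of the task. For an arithmetic progression (a, a + d, a + 2d), the next term is a + 3d, so any weight vector (w0, w1, w2) with w0 + w1 + w2 = 1 and w1 + 2 * w2 = 3 answers every such question exactly. The trained weights above approximately satisfy both constraints: -0.55 + 0.11 + 1.44 ≈ 1.00 and 0.11 + 2 * 1.44 ≈ 2.99.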

Conclusion

In this article, we have created an IQ-test robot with the help of DeepLearning.scala.

The robot's model is a linear regression with square loss, consisting of an INDArrayWeight and some INDArrayLayers.

After many iterations of training, the robot finally learned the pattern of arithmetic progressions.

Download this tutorial

DeepLearning.scala is an open source deep-learning toolkit in Scala created by our colleagues at ThoughtWorks. We're excited about this project because it uses differentiable functional programming to create and compose neural networks; a developer simply writes code in Scala with static typing.