In this Getting Started article, we will build a robot that answers IQ test questions with the help of DeepLearning.scala.
Background
Suppose we are building a robot that answers IQ test questions like this one:
What is the next number in sequence:
3, 6, 9, ?
The answer is 12.
We prepared some questions and corresponding answers as INDArrays:
import $ivy.`org.nd4j::nd4s:0.8.0`
import $ivy.`org.nd4j:nd4j-native-platform:0.8.0`
import org.nd4j.linalg.api.ndarray.INDArray
val TrainingQuestions: INDArray = {
  import org.nd4s.Implicits._
  Array(
    Array(0, 1, 2),
    Array(4, 7, 10),
    Array(13, 15, 17)
  ).toNDArray
}
val ExpectedAnswers: INDArray = {
  import org.nd4s.Implicits._
  Array(
    Array(3),
    Array(13),
    Array(19)
  ).toNDArray
}
These samples will be used to train the robot.
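Each training row is an arithmetic progression, so the expected answer is the last term plus the common difference. The following is a minimal sketch in plain Scala (no nd4j) that checks the samples above against that rule. The object and function names are hypothetical and exist only for this illustration.

```scala
// Hypothetical helper for illustration; not part of DeepLearning.scala or nd4s.
object TrainingDataCheck {
  // The same samples as above, written as plain Scala collections.
  val questions: Seq[Seq[Int]] = Seq(Seq(0, 1, 2), Seq(4, 7, 10), Seq(13, 15, 17))
  val answers: Seq[Int] = Seq(3, 13, 19)

  // Next term of an arithmetic progression: last term plus the common difference.
  def nextTerm(sequence: Seq[Int]): Int = {
    val difference = sequence(1) - sequence(0)
    sequence.last + difference
  }

  def main(args: Array[String]): Unit =
    questions.zip(answers).foreach { case (question, answer) =>
      println(s"${question.mkString(", ")} -> ${nextTerm(question)} (expected $answer)")
    }
}
```

The robot is never told this rule. It only sees the numbers and has to discover weights that reproduce the answers.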
In the rest of this article, we will build the robot in the following steps:
- Install DeepLearning.scala, which is the framework that helps us build the robot.
- Set up the configuration (also known as hyperparameters) of the robot.
- Build an untrained neural network of the robot.
- Train the neural network using the above samples.
- Test the robot to see whether it has learned how to answer this kind of question.
Install DeepLearning.scala
DeepLearning.scala is hosted on the Maven Central repository.
You can use magic imports in jupyter-scala or Ammonite-REPL to download DeepLearning.scala and its dependencies.
import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.0`
If you use sbt, please add the following settings to your build.sbt:
// All DeepLearning.scala built-in plugins.
libraryDependencies += "com.thoughtworks.deeplearning" %% "plugins-builtins" % "latest.release"
// The native backend for nd4j.
libraryDependencies += "org.nd4j" % "nd4j-native-platform" % "0.8.0"
// Uncomment the following line to switch to the CUDA backend for nd4j.
// libraryDependencies += "org.nd4j" % "nd4j-cuda-8.0-platform" % "0.8.0"
// The magic import compiler plugin, which may be used to import DeepLearning.scala distributed in source format.
addCompilerPlugin("com.thoughtworks.import" %% "import" % "latest.release")
// The ThoughtWorks Each library, which provides the `monadic`/`each` syntax.
libraryDependencies += "com.thoughtworks.each" %% "each" % "latest.release"
addCompilerPlugin("org.scalamacros" % "paradise" % "2.1.0" cross CrossVersion.full)
fork := true
scalaVersion := "2.11.11"
Note that this example must run on Scala 2.11.11 because nd4s does not support Scala 2.12. Make sure there is no setting like scalaVersion := "2.12.x" in your build.sbt.
See Scaladex for instructions on installing DeepLearning.scala with other build tools.
Set up hyperparameters
Hyperparameters are global configurations for a neural network.
For this robot, we want to set its learning rate, which determines how fast the robot changes its internal weights.
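To make the role of the learning rate concrete, here is a minimal sketch of a single update step. It assumes plain gradient descent, where each weight moves against its gradient by an amount scaled by the learning rate. The names are hypothetical, and this is not DeepLearning.scala's internal code.

```scala
// Hypothetical illustration of what a fixed learning rate controls.
object LearningRateStep {
  // One gradient-descent update: move each weight against its gradient,
  // scaled by the learning rate.
  def updated(weights: Vector[Double], gradient: Vector[Double], learningRate: Double): Vector[Double] =
    weights.zip(gradient).map { case (w, g) => w - learningRate * g }

  def main(args: Array[String]): Unit = {
    val weights = Vector(1.0, -2.0, 0.5)
    val gradient = Vector(10.0, 10.0, 10.0)
    // A small learning rate moves the weights only slightly.
    println(updated(weights, gradient, learningRate = 0.003))
    // A learning rate 100 times larger moves them 100 times further.
    println(updated(weights, gradient, learningRate = 0.3))
  }
}
```

A rate that is too small makes training slow, while a rate that is too large can overshoot and prevent the loss from settling.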
In DeepLearning.scala, hyperparameters can be introduced by plugins, each of which is a small piece of code that can be loaded from a URL.
interp.load(scala.io.Source.fromURL(new java.net.URL("https://gist.github.com/Atry/1fb0608c655e3233e68b27ba99515f16/raw/39ba06ee597839d618f2fcfe9526744c60f2f70a/FixedLearningRate.sc")).mkString)
By loading the hyperparameter plugin FixedLearningRate, we are able to create the context of the neural network with a learningRate parameter.
See FixedLearningRate's README for instructions for sbt projects.
import com.thoughtworks.deeplearning.plugins.Builtins
Now we create the context and set the learning rate to 0.003.
// `interp.load` is a workaround for https://github.com/lihaoyi/Ammonite/issues/649 and https://github.com/scala/bug/issues/10390
interp.load("""
import scala.concurrent.ExecutionContext.Implicits.global
import com.thoughtworks.feature.Factory
val hyperparameters = Factory[Builtins with FixedLearningRate].newInstance(learningRate = 0.003)
""")
See Factory if you are wondering how those plugins are composed together.
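As a rough analogy only, the effect of composing plugins resembles mixing traits together in plain Scala and supplying their abstract members at instantiation. The sketch below uses invented traits (HypotheticalBuiltins, HypotheticalFixedLearningRate) and says nothing about how Factory or the real plugins are implemented.

```scala
// An analogy in plain Scala; every name here is hypothetical.
object PluginCompositionSketch {
  trait HypotheticalBuiltins {
    def describe: String = "built-in layers and weights"
  }

  trait HypotheticalFixedLearningRate {
    // Abstract member: it must be supplied when the mix-in is instantiated.
    def learningRate: Double
    def delta(gradient: Double): Double = gradient * learningRate
  }

  // Instantiate the mix-in by hand and fill in the abstract member.
  val hyperparameters: HypotheticalBuiltins with HypotheticalFixedLearningRate =
    new HypotheticalBuiltins with HypotheticalFixedLearningRate {
      val learningRate: Double = 0.003
    }

  def main(args: Array[String]): Unit = {
    println(hyperparameters.describe)
    println(hyperparameters.delta(2.0))
  }
}
```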
The Builtins plugin contains some implicit values and views, which should be imported as follows:
import hyperparameters.implicits._
Build an untrained neural network of the robot
In DeepLearning.scala, a neural network is simply a function that references some weights, which are mutable variables that are adjusted automatically during training according to some goal.
For example, given that x0, x1 and x2 are the input sequence passed to the robot, we can build a function that returns the answer as robotWeight0 * x0 + robotWeight1 * x1 + robotWeight2 * x2. By adjusting those weights during training, the result should become close to the expected answer.
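The weighted sum can be sketched in plain Scala as follows. The weights (0, -1, 2) are hand-picked for illustration: they encode the rule "next = 2 * x2 - x1", which is one exact solution for arithmetic progressions, and are not necessarily the values that training finds. The object and function names are hypothetical.

```scala
// Hypothetical illustration of the weighted-sum formula.
object WeightedSum {
  // Computes weight0 * x0 + weight1 * x1 + weight2 * x2 (for any matching length).
  def answer(weights: Seq[Double], xs: Seq[Double]): Double = {
    require(weights.length == xs.length, "one weight per input is required")
    weights.zip(xs).map { case (w, x) => w * x }.sum
  }

  def main(args: Array[String]): Unit = {
    val handPickedWeights = Seq(0.0, -1.0, 2.0)
    println(answer(handPickedWeights, Seq(3.0, 6.0, 9.0))) // prints 12.0
  }
}
```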
In DeepLearning.scala, weights can be created as follows:
def initialValueOfRobotWeight: INDArray = {
  import org.nd4j.linalg.factory.Nd4j
  import org.nd4s.Implicits._
  Nd4j.randn(3, 1)
}
import hyperparameters.INDArrayWeight
val robotWeight = INDArrayWeight(initialValueOfRobotWeight)
In the above code, robotWeight is a weight holding an n-dimensional array, namely an INDArrayWeight, initialized with random values. Therefore, the formula robotWeight0 * x0 + robotWeight1 * x1 + robotWeight2 * x2 is equivalent to a matrix multiplication, written as a dot method call:
import hyperparameters.INDArrayLayer
def iqTestRobot(questions: INDArray): INDArrayLayer = {
  questions dot robotWeight
}
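To see why a single dot call answers a whole batch of questions, here is a sketch of matrix multiplication in plain Scala, using nested Vectors instead of INDArrays. The BatchDot object is hypothetical, and the weights are hand-picked rather than trained.

```scala
// Hypothetical plain-Scala matrix product; the real code uses nd4j arrays.
object BatchDot {
  type Matrix = Vector[Vector[Double]]

  // Multiplies an (n x k) matrix by a (k x m) matrix, giving an (n x m) matrix.
  def dot(a: Matrix, b: Matrix): Matrix = {
    require(a.forall(_.length == b.length), "inner dimensions must match")
    val columns = b.transpose
    a.map(row => columns.map(column => row.zip(column).map { case (x, y) => x * y }.sum))
  }

  def main(args: Array[String]): Unit = {
    // Three questions (3 x 3) times one weight column (3 x 1) gives three answers (3 x 1).
    val questions: Matrix =
      Vector(Vector(0.0, 1.0, 2.0), Vector(4.0, 7.0, 10.0), Vector(13.0, 15.0, 17.0))
    val weights: Matrix = Vector(Vector(0.0), Vector(-1.0), Vector(2.0))
    println(dot(questions, weights))
  }
}
```

Each row of the result is the weighted sum for one question, so the shape of the output matches the shape of ExpectedAnswers.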
Note that the dot method is a differentiable function provided by DeepLearning.scala.
You can find other differentiable methods for n-dimensional arrays in the Scaladoc.
Unlike the functions in nd4s, all those differentiable functions accept an INDArray, an INDArrayWeight or an INDArrayLayer, and return a Layer of the neural network, which can be composed into another differentiable function call.
Training the network
Loss function
In DeepLearning.scala, when we train a neural network, our goal is always to minimize its return value.
For example, if iqTestRobot(TrainingQuestions).train is called repeatedly, the neural network will try to minimize questions dot robotWeight. robotWeight will become smaller and smaller in order to make questions dot robotWeight smaller, and iqTestRobot(TrainingQuestions).predict will return an INDArray of small numbers.
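The effect can be sketched in plain Scala for a single question. The sketch assumes that training applies plain gradient descent directly to the output. Under that assumption the gradient of the output with respect to the weights is the question itself, so every step lowers the output. All names are hypothetical.

```scala
// Hypothetical illustration: minimizing the raw output instead of a loss.
object MinimizeRawOutput {
  def output(weights: Vector[Double], question: Vector[Double]): Double =
    weights.zip(question).map { case (w, x) => w * x }.sum

  // The gradient of (weights . question) with respect to the weights is the question.
  def step(weights: Vector[Double], question: Vector[Double], learningRate: Double): Vector[Double] =
    weights.zip(question).map { case (w, x) => w - learningRate * x }

  // The output observed before each of `iterations` update steps.
  def outputs(initial: Vector[Double], question: Vector[Double],
              learningRate: Double, iterations: Int): Vector[Double] =
    Iterator.iterate(initial)(step(_, question, learningRate))
      .take(iterations)
      .map(output(_, question))
      .toVector

  def main(args: Array[String]): Unit =
    println(outputs(Vector(0.5, 0.5, 0.5), Vector(3.0, 6.0, 9.0), learningRate = 0.003, iterations = 5))
}
```

The printed values decrease at every step and never settle on a useful answer, because nothing in this objective mentions the expected answers.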
What if you expect iqTestRobot(TrainingQuestions).predict to return ExpectedAnswers?
You can create another neural network that evaluates how far the result of iqTestRobot is from your expectation. This new neural network is usually called a loss function.
In this article we will use square loss as the loss function:
import hyperparameters.DoubleLayer
def squareLoss(questions: INDArray, expectAnswer: INDArray): DoubleLayer = {
  val difference = iqTestRobot(questions) - expectAnswer
  (difference * difference).mean
}
When squareLoss is trained continuously, its return value will approach zero, and the result of iqTestRobot must approach the expected result at the same time.
Note that squareLoss accepts questions and expectAnswer as its parameters.
The first parameter is the input data used to train the neural network, and the second is the expected output.
The squareLoss function itself is a neural network, which internally uses the layer returned by the iqTestRobot method.
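The arithmetic behind the square loss can be sketched on plain numbers. The sketch below is not differentiable and does not use any DeepLearning.scala type. It only shows the value being computed, the mean of the squared differences. The object name is hypothetical.

```scala
// Hypothetical plain-number version of the square loss.
object SquareLossSketch {
  // Mean of the squared differences between predictions and expected answers.
  def squareLoss(predictions: Seq[Double], expected: Seq[Double]): Double = {
    require(predictions.nonEmpty && predictions.length == expected.length,
      "predictions and expected answers must be non-empty and of equal length")
    predictions.zip(expected).map { case (p, e) => (p - e) * (p - e) }.sum / predictions.length
  }

  def main(args: Array[String]): Unit = {
    // Perfect predictions give a loss of zero.
    println(squareLoss(Seq(3.0, 13.0, 19.0), Seq(3.0, 13.0, 19.0)))
    // Imperfect predictions give a positive loss.
    println(squareLoss(Seq(3.5, 13.0, 18.0), Seq(3.0, 13.0, 19.0)))
  }
}
```

Because every term is squared, the loss cannot go below zero, and it reaches zero only when every prediction matches its expected answer.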
Run the training task
As mentioned before, DoubleLayer has a train method. It returns a ThoughtWorks Future that performs one iteration of training.
Since we want to train the neural network of the robot repeatedly, we need to create another Future that performs many iterations of training.
In this article, we use ThoughtWorks Each to build such a Future:
import $ivy.`com.thoughtworks.each::each:3.3.1`
import $plugin.$ivy.`org.scalamacros:paradise_2.11.11:2.1.0`
import com.thoughtworks.each.Monadic._
import com.thoughtworks.future._
import scala.concurrent.Await
import scala.concurrent.duration.Duration
import scalaz.std.stream._
val TotalIterations = 500
@monadic[Future]
def train: Future[Stream[Double]] = {
  for (iteration <- (0 until TotalIterations).toStream) yield {
    squareLoss(TrainingQuestions, ExpectedAnswers).train.each
  }
}
Then we can run the task to train the robot.
val lossByTime: Stream[Double] = Await.result(train.toScalaFuture, Duration.Inf)
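For readers unfamiliar with the monadic/each syntax, the idea of running asynchronous steps one after another and collecting their results can be sketched with the standard library's scala.concurrent.Future. This deliberately swaps in a different Future type from the ThoughtWorks Future used above, and runSequentially is a hypothetical helper, not part of any library.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

// Hypothetical sketch using the standard library Future, not ThoughtWorks Future.
object SequentialFutures {
  implicit val executionContext: ExecutionContext = ExecutionContext.global

  // Runs step(0), step(1), ... one after another and collects the results in order.
  def runSequentially(total: Int)(step: Int => Future[Double]): Future[Vector[Double]] =
    (0 until total).foldLeft(Future.successful(Vector.empty[Double])) { (accumulated, iteration) =>
      // The next step starts only after all previous steps have completed.
      accumulated.flatMap(results => step(iteration).map(results :+ _))
    }

  def main(args: Array[String]): Unit = {
    // A stand-in for one training iteration that reports a made-up, shrinking loss.
    val fakeLosses = runSequentially(5)(iteration => Future(1.0 / (iteration + 1)))
    println(Await.result(fakeLosses, Duration.Inf))
  }
}
```

Running the steps strictly in sequence matters here because each training iteration depends on the weights left behind by the previous one.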
Then we create a plot to show how the loss changed during iterations.
import $ivy.`org.plotly-scala::plotly-jupyter-scala:0.3.2`
import plotly._
import plotly.element._
import plotly.layout._
import plotly.JupyterScala._
plotly.JupyterScala.init()
Scatter(lossByTime.indices, lossByTime).plot(title = "loss by time")
After these iterations, the loss should be close to zero.
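To show what such a training run computes, here is a sketch of the same model in plain Scala with hand-written gradients. It assumes plain gradient descent on the mean square loss with the same learning rate (0.003) and iteration count (500). Unlike the code above, it starts from all-zero weights so that the run is deterministic. It is an illustration, not DeepLearning.scala's implementation, and all names are hypothetical.

```scala
// Hypothetical plain-Scala linear regression trained by gradient descent.
object PlainGradientDescent {
  type Row = Vector[Double]

  val questions: Vector[Row] =
    Vector(Vector(0.0, 1.0, 2.0), Vector(4.0, 7.0, 10.0), Vector(13.0, 15.0, 17.0))
  val answers: Vector[Double] = Vector(3.0, 13.0, 19.0)

  def predict(weights: Row, question: Row): Double =
    weights.zip(question).map { case (w, x) => w * x }.sum

  // Mean square loss over all training samples.
  def loss(weights: Row): Double =
    questions.zip(answers).map { case (question, answer) =>
      val difference = predict(weights, question) - answer
      difference * difference
    }.sum / questions.length

  // Gradient of the mean square loss:
  // (2 / n) * sum over samples of (prediction - answer) * question.
  def gradient(weights: Row): Row = {
    val perSample = questions.zip(answers).map { case (question, answer) =>
      val difference = predict(weights, question) - answer
      question.map(_ * difference * 2.0 / questions.length)
    }
    perSample.transpose.map(_.sum)
  }

  // Returns the trained weights and the loss recorded before each iteration.
  def train(initial: Row, learningRate: Double, iterations: Int): (Row, Vector[Double]) =
    (0 until iterations).foldLeft((initial, Vector.empty[Double])) { case ((weights, losses), _) =>
      val next = weights.zip(gradient(weights)).map { case (w, g) => w - learningRate * g }
      (next, losses :+ loss(weights))
    }

  def main(args: Array[String]): Unit = {
    val (weights, losses) = train(Vector(0.0, 0.0, 0.0), learningRate = 0.003, iterations = 500)
    println(s"first loss: ${losses.head}, last loss: ${losses.last}")
    println(s"prediction for 3, 6, 9: ${predict(weights, Vector(3.0, 6.0, 9.0))}")
  }
}
```

Under these assumptions the recorded loss shrinks toward zero, and the prediction for 3, 6, 9 should come out close to 12.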
Test the trained robot
val TestQuestions: INDArray = {
  import org.nd4s.Implicits._
  Array(Array(3, 6, 9)).toNDArray
}
Await.result(iqTestRobot(TestQuestions).predict.toScalaFuture, Duration.Inf)
The result should be close to 12.
You can also inspect the values of the weights in the trained neural network:
val weightData: INDArray = robotWeight.data
Conclusion
In this article, we have created an IQ test robot with the help of DeepLearning.scala.
The model of the robot is linear regression with a square loss, which consists of some INDArrayWeights and INDArrayLayers.
After many iterations of training, the robot finally learned the pattern of arithmetic progressions.