Contributing to DeepLearning.scala

So, you like deep learning and functional programming, and decide that DeepLearning.scala might be your thing. Great! There are a lot of ways you can help, and here is a non-comprehensive guide on how you can make DeepLearning.scala even more awesome.

Helping fellow users

DeepLearning.scala is a young framework with a relatively small user base. Besides the scaladoc documentation, our website includes several tutorials that demonstrate how to use DeepLearning.scala for machine learning, but third-party tutorials are also welcome! As the developers of this framework, we'd love to hear from our users and get an idea of how they adapt the framework to their own specific tasks.

If you've evaluated DeepLearning.scala and authored a blog post about it, be it an introductory tutorial or a write-up of a real-world deep learning challenge, feel free to send us a link! Your feedback will be of great help to our fellow users.

You can also help by participating in discussions in our Gitter chatroom, or answering relevant questions on Stack Overflow.

Reporting issues

If you encounter problems when using DeepLearning.scala and believe something is wrong on our end, please file a bug report on the GitHub issue tracker. To help us track down the problem and deliver a fix, please include a minimal example that reproduces the bug in the description.

You may also file a feature request if you're missing some functionality available in other deep learning frameworks. Note that there's a high chance a feature request can be fulfilled simply by implementing a new plugin, which does not require any changes to the main repository.

Hacking DeepLearning.scala

At its core, DeepLearning.scala is simply an interface that declares support for automatic differentiation of arbitrary types. Most of DeepLearning.scala's functionality lives in its plugins. The project provides a hierarchy of built-in plugins, which provide support for common types like floating-point numbers and multi-dimensional arrays.

Users of DeepLearning.scala are encouraged to craft new plugins for their own specific tasks. Custom plugins can extend built-in ones and provide extra functionality such as logging, custom optimizers, or even custom neural network architectures. Here we give a brief introduction to the architecture of DeepLearning.scala and demonstrate the development of a simple plugin.

Prerequisites

DeepLearning.scala is based on several other Java/Scala libraries:

  • nd4j/nd4s for multi-dimensional arrays
  • scalaz for common abstractions of functional programming
  • feature.scala for creating dynamic mixins
  • RAII.scala for resource management
  • each for syntactic sugar to write imperative code

To implement a new plugin, one does not need to dive into implementation details of the above libraries, but reading their documentation will surely be useful.

Knowledge of deep learning is not strictly a prerequisite. However, you should know some basics of linear algebra and vector calculus, and understand how automatic differentiation works.
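
If you want a quick refresher on reverse-mode automatic differentiation, the toy sketch below (not part of DeepLearning.scala; every name in it is hypothetical) shows the core idea: the forward pass records how to push gradients back to each input, and the backward pass propagates a seed gradient through those records.

// Toy reverse-mode AD for scalar expressions, for illustration only.
final class Node(val value: Double) {
  var grad: Double = 0.0
  private var pushBack: List[Double => Unit] = Nil
  def onBackward(step: Double => Unit): Unit = pushBack ::= step
  def backward(delta: Double): Unit = {
    grad += delta
    pushBack.foreach(_(delta))
  }
}

// Differentiable multiplication: d(a*b)/da = b and d(a*b)/db = a.
def mul(a: Node, b: Node): Node = {
  val out = new Node(a.value * b.value)
  out.onBackward(delta => a.backward(delta * b.value))
  out.onBackward(delta => b.backward(delta * a.value))
  out
}

val x = new Node(3.0)
val y = new Node(4.0)
val z = mul(x, y)   // forward pass: z.value == 12.0
z.backward(1.0)     // backward pass, seeded with dz/dz == 1
assert(x.grad == 4.0 && y.grad == 3.0)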

Setting up development environment

The code of a DeepLearning.scala plugin need not be integrated into an sbt project; it can simply be put into an Ammonite script. It is also very handy to use jupyter-scala to develop a plugin interactively.

About the core interface

The core interface of DeepLearning.scala is DeepLearning. This trait defines a type class for data types that support automatic differentiation. To implement an instance, one needs to extend the trait and override the definitions of Data, Delta and forward. The train method can be invoked to perform a single round of training (the basic step of gradient descent), and the predict method can be invoked to perform a prediction.
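
To convey the shape of the type class, here is a heavily simplified, synchronous sketch. It is not the library's actual interface: only the names Data, Delta, forward, train and predict mirror the description above, and the real library wraps the forward pass in asynchronous, resource-managed types such as Do and Future.

// A simplified, hypothetical sketch of the DeepLearning type class.
trait SimplifiedDeepLearning[Differentiable] {
  type Data   // the value produced by the forward pass
  type Delta  // the gradient consumed by the backward pass

  // A tape pairs the forward value with a callback for back propagation.
  final case class Tape(data: Data, backward: Delta => Unit)

  def forward(differentiable: Differentiable): Tape

  // One round of training: run forward, then push the initial gradient back.
  def train(differentiable: Differentiable, initialDelta: Delta): Data = {
    val tape = forward(differentiable)
    tape.backward(initialDelta)
    tape.data
  }

  // Prediction: forward pass only, no gradients.
  def predict(differentiable: Differentiable): Data =
    forward(differentiable).data
}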

About the hierarchy of built-in plugins

The core interface can be extended by various plugins. Plugins provide different kinds of functionality, such as support for new differentiable data types, optimization, logging, etc. Typically, end users do not need to extend the core interface themselves; they only need to select the plugins that provide the required functionality, combine them and instantiate the combination like this:

In [ ]:
import $ivy.`com.thoughtworks.deeplearning::plugins-doublelayers:2.0.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-indarraylayers:2.0.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-indarrayweights:2.0.0`

import scala.concurrent.ExecutionContext.Implicits.global
import com.thoughtworks.feature.Factory
import com.thoughtworks.deeplearning.plugins.DoubleLayers
import com.thoughtworks.deeplearning.plugins.INDArrayLayers
import com.thoughtworks.deeplearning.plugins.INDArrayWeights
In [ ]:
interp.load("val hyperparameters = Factory[DoubleLayers with INDArrayLayers with INDArrayWeights].newInstance()")

hyperparameters is now in scope and provides functionality from DoubleLayers, INDArrayLayers and INDArrayWeights. One can use hyperparameters.INDArrayWeight to create a weight matrix, and hyperparameters.DoubleLayer / hyperparameters.INDArrayLayer to create a hidden layer or an output layer of the neural network. (Note that we're using Factory instead of Scala's built-in anonymous trait instantiation syntax, because Factory provides some extra niceties.)
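
For instance, a weight matrix can be created from the instance above like this (a minimal sketch; building layers from weights with the differentiable operators is shown at the end of this tutorial):

import org.nd4j.linalg.factory.Nd4j
import hyperparameters.INDArrayWeight

// A 4x4 weight matrix initialized with random values.
val weight = INDArrayWeight(Nd4j.randn(4, 4))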

There exists a hierarchy of built-in plugins, roughly structured as follows:

  • For end users: Builtins is available for use. It simply combines all the other traits and enables all built-in plugins (see the sketch after this list).
  • The elementary plugins include Layers, Training, Weights and Operators. Each plugin is structured as a trait equipped with abstract types (often with constraints specifying the APIs they support) and API traits.
  • Extending the elementary plugins, we get support for Float, Double and INDArray.
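
For end users who just want everything enabled, the all-in-one plugin can be instantiated directly, as an alternative to the three-plugin combination shown above (a sketch, assuming the implicit ExecutionContext imported earlier is in scope):

import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.0`

import scala.concurrent.ExecutionContext.Implicits.global
import com.thoughtworks.feature.Factory
import com.thoughtworks.deeplearning.plugins.Builtins

// A single context object that enables all built-in functionality.
interp.load("val hyperparameters = Factory[Builtins].newInstance()")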

The plugin hierarchy is designed with extensibility in mind. Although the hierarchy is complicated, end users only need to use Factory to create an instance of a plugin combination. The created instance often serves as the context for the whole machine learning task, and can be used to create new differentiable variables and launch forward/backward passes.

When creating a custom plugin, one typically starts by implementing a new trait which extends some built-in traits.

Creating a plugin for optimizing INDArrayWeights

Let's create a plugin for optimizing INDArrayWeights. The INDArrayWeights trait extends Weights, and its inner trait INDArrayOptimizerApi extends the inner trait OptimizerApi of Weights. OptimizerApi declares a delta method, which is invoked to calculate how much should be subtracted from the weights after back propagation produces the current gradient. We'll implement a simple plugin that uses a fixed learning rate to scale delta. The code is listed as follows:

In [ ]:
import $ivy.`org.nd4j::nd4s:0.8.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-indarrayweights:2.0.0`

import org.nd4j.linalg.api.ndarray.INDArray
import org.nd4s.Implicits._
import com.thoughtworks.deeplearning.plugins.INDArrayWeights

trait INDArrayLearningRate extends INDArrayWeights {
    // The fixed learning rate, supplied when the plugin is instantiated.
    val learningRate: Double

    // Refine the optimizer type so that the overridden API below is mixed in.
    override type INDArrayOptimizer <: INDArrayOptimizerApi with Optimizer

    trait INDArrayOptimizerApi extends super.INDArrayOptimizerApi { this: INDArrayOptimizer =>
        // Scale the raw gradient by the learning rate before it is subtracted from the weight.
        override def delta: INDArray = {
            super.delta * learningRate
        }
    }
}

This plugin can now be used to optimize an INDArrayWeight (don't forget to pass the learningRate parameter when instantiating, as shown in the sketch after the list below). As we can see, extending a built-in plugin is not that hard:

  • Use a trait to extend traits of existing plugins. This guarantees that your plugin can be freely mixed in and extended.
  • Add custom fields to preserve information required by this plugin.
  • Plugins typically have nested traits. To override a method of a nested trait, extend the inner trait (you need a self-type annotation to make it type-check).
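
As a concrete illustration, the plugin above can be combined with the built-in plugins and instantiated like this (a sketch; the learning rate value 0.001 is arbitrary):

import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.0`

import scala.concurrent.ExecutionContext.Implicits.global
import com.thoughtworks.feature.Factory
import com.thoughtworks.deeplearning.plugins.Builtins

// The learningRate field declared by INDArrayLearningRate is supplied as a named argument.
interp.load("val hyperparameters = Factory[INDArrayLearningRate with Builtins].newInstance(learningRate = 0.001)")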

Besides the "learning rate" example, we'll develop a second plugin that provides a different piece of functionality: logging.

Creating a plugin for dumping INDArrayWeights

We'll demonstrate another plugin here: adding logging functionality to INDArrayWeights. As we all know, training a neural network is a long and arduous process, often consuming many hours (or even days). What if the training process is interrupted and all the intermediate results are gone? Adding a dumper that serializes INDArrayWeights in case of disruption sounds like a good idea.

Below is a plugin that dumps INDArrayWeights each time the weights are updated. The logic is pretty simple: in INDArrayWeights, the weight is updated by the backward method, so we only need to extend INDArrayWeights and override backward. We first invoke backward of the super trait to perform the proper update, then we use Java object serialization to dump the data to a file. The user only needs to supply a dumpingPathPrefix when instantiating; the plugin keeps track of how many times the weight matrix has been serialized.

In [ ]:
import $ivy.`org.nd4j::nd4s:0.8.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-indarrayweights:2.0.0`

import java.io.{FileOutputStream, ObjectOutputStream}
import scalaz.syntax.all._
import org.nd4j.linalg.api.ndarray.INDArray
import com.thoughtworks.feature.{Factory, ImplicitApply, PartialApply}
import com.thoughtworks.raii.asynchronous._
import com.thoughtworks.deeplearning.plugins.INDArrayWeights

trait INDArrayDumping extends INDArrayWeights {
    
    // Directory under which the weight snapshots will be written.
    val dumpingPathPrefix: String

    // How many snapshots have been written so far; also used as the file name.
    private var currentDumped: Int = 0

    override type INDArrayWeight <: INDArrayWeightApi with Weight

    trait INDArrayWeightApi extends super.INDArrayWeightApi { this: INDArrayWeight =>
        override protected def backward[SubtypeOfOptimizer](
            originalDelta: INDArray)(
            implicit implicitApplyRest: ImplicitApply.Aux[PartiallyAppliedOptimizer, SubtypeOfOptimizer],
            asOptimizer: SubtypeOfOptimizer <:< OptimizerApi { type Delta <: INDArray }
        ): Do[Unit] = {
            // Let the built-in implementation update the weight first, then
            // serialize the updated matrix to a numbered file.
            super.backward(originalDelta).map { _ =>
                val os = new ObjectOutputStream(new FileOutputStream(dumpingPathPrefix + "/" + currentDumped.toString))
                try {
                    os.writeObject(data)
                } finally {
                    os.close()
                }
                currentDumped += 1
            }
        }
    }
}

To verify that our custom INDArrayDumping can indeed be used to create an INDArrayWeight that automatically performs the serialization while preserving the original backward behavior, we'll test a trivial example below. We create a random INDArrayWeight, form a DoubleLayer by summing it, and invoke its train method several times. We can observe that the resulting value steadily decreases, and that the intermediate weight matrices are indeed serialized to disk.

In [ ]:
import $ivy.`org.nd4j:nd4j-native-platform:0.8.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.0`

import scala.concurrent.Await
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global
import org.nd4j.linalg.factory.Nd4j
import com.thoughtworks.future._
import com.thoughtworks.deeplearning.plugins.Builtins
In [ ]:
interp.load("val hyperparameters = Factory[INDArrayDumping with Builtins].newInstance(dumpingPathPrefix=\"/Users/cshao/example\")")
In [ ]:
import hyperparameters.INDArrayWeight
import hyperparameters.DoubleLayer
import hyperparameters.implicits._
In [ ]:
val w = INDArrayWeight(Nd4j.randn(4,4))
In [ ]:
def o: DoubleLayer = w.sum
In [ ]:
Await.result(o.train.toScalaFuture, Duration.Inf) // Run this cell several times and watch the value steadily decrease.
// Make sure the directory given as dumpingPathPrefix exists before running.
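
If you later want to restore one of the dumped matrices, plain Java object deserialization works (a sketch; the path assumes the dumpingPathPrefix used above and the first snapshot, which is written to the file named 0):

import java.io.{FileInputStream, ObjectInputStream}
import org.nd4j.linalg.api.ndarray.INDArray

// Read back the first snapshot written by INDArrayDumping.
val in = new ObjectInputStream(new FileInputStream("/Users/cshao/example/0"))
val restored = try in.readObject().asInstanceOf[INDArray] finally in.close()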
