Contributing to DeepLearning.scala¶
So, you like deep learning and functional programming, and decide that DeepLearning.scala might be your thing. Great! There are a lot of ways you can help, and here is a non-comprehensive guide on how you can make DeepLearning.scala even more awesome.
Helping fellow users¶
DeepLearning.scala is a new framework with a relatively small user base. Besides the scaladoc documentation, our website includes several tutorials which demonstrate how to use DeepLearning.scala for machine learning, but third-party tutorials are also welcome! As the developers of this framework, we'd love to hear from our users and learn how they adapt the framework to their own specific tasks.
If you've evaluated DeepLearning.scala and authored a blog post about it, whether an introductory tutorial or a write-up of a real-world deep learning challenge, feel free to send us a link! Your feedback will be of great help to our fellow users.
You can also help by participating in discussions in our Gitter chatroom, or answering relevant questions on Stack Overflow.
Reporting issues¶
If you encounter problems when using DeepLearning.scala and believe something is wrong on our end, please file a bug report on the GitHub issue tracker. For us to track down the problem and deliver a fix, do include a minimal example that reproduces the bug in the description.
You may also file a feature request if you're missing some functionality found in other deep learning frameworks. Note that there's a high chance a feature request can be fulfilled simply by implementing a new plugin, which does not require changes to the main repository.
Hacking DeepLearning.scala¶
At its core, DeepLearning.scala is simply an interface which declares support for automatic differentiation of arbitrary types. Most of DeepLearning.scala's functionality lies in its plugins. The project provides a hierarchy of built-in plugins, which provide support for common types like floating-point numbers and multi-dimensional arrays.
Users of DeepLearning.scala are encouraged to craft new plugins for their own specific tasks. Custom plugins can extend built-in ones and provide extra functionality like logging, custom optimizers, or even custom neural network architectures. Here we give a brief introduction to the architecture of DeepLearning.scala and demonstrate the development of a simple plugin.
Prerequisites¶
DeepLearning.scala is based on several other Java/Scala libraries:
- nd4j/nd4s for multi-dimensional arrays
- scalaz for common abstractions of functional programming
- feature.scala for creating dynamic mixins
- RAII.scala for resource management
- each for syntactic sugar to write imperative code
To implement a new plugin, one does not need to dive into the implementation details of the above libraries, but reading their documentation will surely be useful.
Knowledge of deep learning is not strictly a prerequisite. However, you should know the basics of linear algebra and vector calculus, and how automatic differentiation works.
Setting up development environment¶
The code of a DeepLearning.scala plugin need not be integrated into an sbt project; it can be put into an Ammonite script. It is also very handy to use jupyter-scala to develop a plugin interactively.
About the core interface¶
The core interface of DeepLearning.scala is DeepLearning. This trait defines a type class for data types which support automatic differentiation. To implement an instance, one needs to extend the trait and override the definitions of Data, Delta and forward. The train method can be invoked to perform a single round of training (which is the basis of gradient descent). The predict method can be invoked to perform a prediction.
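As a rough orientation, here is a heavily simplified sketch of the idea behind the type class. The member names Data, Delta and forward mirror the real trait, but the trait name SimplifiedDeepLearning and the return type of forward are our own simplifications: the actual library wraps the forward pass in asynchronous, resource-managed Do values (see the scaladoc for the exact signatures).
// A minimal, self-contained sketch -- NOT the actual library trait.
trait SimplifiedDeepLearning[Differentiable] {
  type Data  // the value produced by the forward pass
  type Delta // the gradient consumed by the backward pass

  // forward computes the value and also returns a callback that
  // back-propagates a delta to every weight involved in the computation
  def forward(differentiable: Differentiable): (Data, Delta => Unit)
}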
About the hierarchy of built-in plugins¶
The core interface can be extended by various plugins. Plugins provide various kinds of functionality, like supporting new differentiable data types, optimizing, logging, etc. Typically, end users do not need to extend the core interface themselves; they need only select plugins that provide the required functionality, combine them and instantiate them like this:
import $ivy.`com.thoughtworks.deeplearning::plugins-doublelayers:2.0.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-indarraylayers:2.0.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-indarrayweights:2.0.0`
import scala.concurrent.ExecutionContext.Implicits.global
import com.thoughtworks.feature.Factory
import com.thoughtworks.deeplearning.plugins.DoubleLayers
import com.thoughtworks.deeplearning.plugins.INDArrayLayers
import com.thoughtworks.deeplearning.plugins.INDArrayWeights
interp.load("val hyperparameters = Factory[DoubleLayers with INDArrayLayers with INDArrayWeights].newInstance()")
hyperparameters is now in scope and provides functionality from DoubleLayers, INDArrayLayers and INDArrayWeights. One can use hyperparameters.INDArrayWeight to initialize a weight matrix, and hyperparameters.DoubleLayer / hyperparameters.INDArrayLayer to build a hidden layer or output layer of the neural network. (Note that we're using Factory instead of Scala's built-in anonymous trait instantiation syntax, because Factory provides some extra niceties.)
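For example, continuing the session above, one could create a weight matrix and a differentiable scalar built from it. The snippet below is a hypothetical illustration: Nd4j.randn(4, 4) is just an arbitrary initial value, the nd4j native backend must be on the classpath, and the sum operation relies on the implicit operators enabled by the mixed-in plugins.
import $ivy.`org.nd4j:nd4j-native-platform:0.8.0`
import org.nd4j.linalg.factory.Nd4j
import hyperparameters.INDArrayWeight
import hyperparameters.DoubleLayer
import hyperparameters.implicits._

// a randomly initialized 4x4 weight matrix
val w = INDArrayWeight(Nd4j.randn(4, 4))

// a differentiable scalar: the sum of all entries of w
def loss: DoubleLayer = w.sum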
There exists a hierarchy of built-in plugins. They are roughly structured as follows:
- For end users: Builtins is available for use. It simply combines all other traits and enables all built-in plugins.
- The elementary plugins include Layers, Training, Weights and Operators. Each plugin is structured as a trait equipped with abstract types (often with constraints specifying the APIs they support) and API traits.
- Extending the elementary plugins are the plugins that add support for Float / Double / INDArray.
The plugin hierarchy is designed with extensibility in mind. Although the hierarchy is complicated, end users need only use Factory to create a new instance of their chosen plugin combination. The created instance often serves as the context type for the whole machine learning task, and can be used to create new differentiable variables and launch forward/backward passes.
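For instance, an end user who simply wants all built-in plugins enabled could, in a jupyter-scala session, write something along the following lines (a minimal sketch, mirroring the Factory pattern used elsewhere in this document):
import $ivy.`org.nd4j:nd4j-native-platform:0.8.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.0`
import scala.concurrent.ExecutionContext.Implicits.global
import com.thoughtworks.feature.Factory
import com.thoughtworks.deeplearning.plugins.Builtins
// create one instance of the full built-in plugin stack
interp.load("val hyperparameters = Factory[Builtins].newInstance()")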
When creating a custom plugin, one typically starts by implementing a new trait which extends some built-in traits.
Creating a plugin for optimizing INDArrayWeights¶
Let's create a plugin for optimizing INDArrayWeights. The INDArrayWeights trait extends Weights, and its inner trait INDArrayOptimizerApi extends the inner trait OptimizerApi of Weights. OptimizerApi declares a delta method, which is invoked to calculate how much should be subtracted from the original weights after back propagation produces the current gradient. We'll implement a simple plugin which uses a fixed learning rate to calculate delta. The code is listed as follows:
import $ivy.`org.nd4j::nd4s:0.8.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-indarrayweights:2.0.0`
import org.nd4j.linalg.api.ndarray.INDArray
import org.nd4s.Implicits._
import com.thoughtworks.deeplearning.plugins.INDArrayWeights
trait INDArrayLearningRate extends INDArrayWeights {

  // the fixed learning rate, supplied when the plugin instance is created
  val learningRate: Double

  override type INDArrayOptimizer <: INDArrayOptimizerApi with Optimizer

  trait INDArrayOptimizerApi extends super.INDArrayOptimizerApi { this: INDArrayOptimizer =>
    // scale the raw gradient by the learning rate before it is subtracted from the weight
    override def delta: INDArray = {
      super.delta * learningRate
    }
  }
}
This plugin can now be used to optimize an INDArrayWeight (don't forget to pass the learningRate parameter when instantiating; a usage sketch follows the list below). As we can see, extending a built-in plugin is not that hard:
- Use a trait to extend a trait of existing plugins. This guarantees your plugin can be freely mixed and extended.
- Add custom fields to preserve information required by this plugin.
- Plugins typically have nested traits. To override a method of a nested trait, first extend the inner trait (you need to use a self type to make it type-check).
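As a hypothetical usage sketch (the learningRate value of 0.001 is arbitrary), the new plugin can be mixed into the built-in stack like this:
import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.0`
import scala.concurrent.ExecutionContext.Implicits.global
import com.thoughtworks.feature.Factory
import com.thoughtworks.deeplearning.plugins.Builtins
// the extra learningRate field declared by INDArrayLearningRate is supplied
// as a named argument to newInstance
interp.load("val hyperparameters = Factory[INDArrayLearningRate with Builtins].newInstance(learningRate = 0.001)")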
Besides the "learning rate" example, we'll now develop a second plugin that provides a different piece of functionality: logging.
Creating a plugin for dumping INDArrayWeights¶
We'll demonstrate another plugin here: adding logging functionality to INDArrayWeights. As we all know, training a neural network is a long and arduous process, consuming many hours (even days). What if the training process is interrupted, and all intermediate results are gone? Adding a dumper which serializes INDArrayWeights in case of such a disruption sounds like a good idea.
Below is a plugin which supports dumping INDArrayWeights each time the weights are updated. The logic is pretty simple: in INDArrayWeights, the weight is updated by the backward method, so we need only extend INDArrayWeights and override backward. We first invoke backward of the superclass to perform the proper update, then use Java object serialization (ObjectOutputStream) to dump the data to a file. The user needs only to supply a dumpingPathPrefix when instantiating, and the plugin keeps track of how many times the weight matrix has been serialized.
import $ivy.`org.nd4j::nd4s:0.8.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-indarrayweights:2.0.0`
import java.io.{FileOutputStream, ObjectOutputStream}
import scalaz.syntax.all._
import org.nd4j.linalg.api.ndarray.INDArray
import com.thoughtworks.feature.{Factory, ImplicitApply, PartialApply}
import com.thoughtworks.raii.asynchronous._
import com.thoughtworks.deeplearning.plugins.INDArrayWeights
trait INDArrayDumping extends INDArrayWeights {

  // directory into which serialized weight matrices are written
  val dumpingPathPrefix: String

  // counter used to give each dump a distinct file name
  private var currentDumped: Int = 0

  override type INDArrayWeight <: INDArrayWeightApi with Weight

  trait INDArrayWeightApi extends super.INDArrayWeightApi { this: INDArrayWeight =>
    override protected def backward[SubtypeOfOptimizer](
        originalDelta: INDArray)(
        implicit implicitApplyRest: ImplicitApply.Aux[PartiallyAppliedOptimizer, SubtypeOfOptimizer],
        asOptimizer: SubtypeOfOptimizer <:< OptimizerApi { type Delta <: INDArray }
    ): Do[Unit] = {
      // perform the normal weight update first, then serialize the updated data
      super.backward(originalDelta).map { _ =>
        val os = new ObjectOutputStream(new FileOutputStream(dumpingPathPrefix + "/" + currentDumped.toString))
        try {
          os.writeObject(data)
        } finally {
          os.close()
        }
        currentDumped += 1
      }
    }
  }
}
To prove that our custom INDArrayDumping can indeed be used to instantiate an INDArrayWeight which automatically performs the serialization while preserving the original backward behavior, we'll test a trivial example below. We build a DoubleLayer from a randomly initialized INDArrayWeight and invoke its train method multiple times. We can observe that the resulting value steadily decreases, and the intermediate weight matrices are indeed serialized to disk.
import $ivy.`org.nd4j:nd4j-native-platform:0.8.0`
import $ivy.`com.thoughtworks.deeplearning::plugins-builtins:2.0.0`
import scala.concurrent.Await
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global
import org.nd4j.linalg.factory.Nd4j
import com.thoughtworks.future._
import com.thoughtworks.deeplearning.plugins.Builtins
interp.load("val hyperparameters = Factory[INDArrayDumping with Builtins].newInstance(dumpingPathPrefix=\"/Users/cshao/example\")")
import hyperparameters.INDArrayWeight
import hyperparameters.DoubleLayer
import hyperparameters.implicits._
val w = INDArrayWeight(Nd4j.randn(4,4))
def o: DoubleLayer = w.sum
// Make sure the directory that dumpingPathPrefix points to exists before training.
Await.result(o.train.toScalaFuture, Duration.Inf) // run this line repeatedly and watch the value decrease
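For completeness, a dumped weight matrix can be loaded back with plain Java deserialization. The helper below is a hypothetical sketch; the function name and the cast to INDArray are our own additions, not part of the plugin.
import java.io.{FileInputStream, ObjectInputStream}
import org.nd4j.linalg.api.ndarray.INDArray

// load the n-th dumped weight matrix back from disk
def loadDumpedWeight(dumpingPathPrefix: String, n: Int): INDArray = {
  val is = new ObjectInputStream(new FileInputStream(dumpingPathPrefix + "/" + n.toString))
  try {
    is.readObject().asInstanceOf[INDArray]
  } finally {
    is.close()
  }
}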