NeuralNetwork

Basics

To build a neural network with HorseML, use the NetWork type.

HorseML.NeuralNetwork.NetWork - Type
NetWork(layers...)

Connect multiple layers to build a neural network. NetWork also supports indexing, and you can add layers later with the add_layer!() function (a sketch follows the example below).

Example

julia> N = NetWork(Dense(10=>5, relu), Dense(5=>1, relu))

julia> N[1]

Dense(IO:10=>5, σ:relu)
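
As a sketch of adding a layer after construction (the exact signature of add_layer!() is not shown on this page, so the call below assumes it takes the network and the new layer):

julia> add_layer!(N, Dense(1=>1, tanh));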
source
HorseML.NeuralNetwork.@epochs - Macro
@epochs n ex

This macro runs ex n times. It is mainly useful for training a NetWork. Even if there is output during execution, the progress bar won't disappear; it is always displayed on the bottom line of the output. When the process finishes, Complete! is displayed.

Note

Output produced during training is still shown, but in order to keep the progress bar displayed, it is printed collectively after each step. (This may be improved in the future.)

Warning

This macro may not work on Windows (because Windows locks files)! Use @simple_epochs instead!

Example

julia> function yes()
           println("yes")
           sleep(0.1)
       end
yes (generic function with 1 method)
julia> @epochs 10 yes()
yes
yes
yes
yes
yes
yes
yes
yes
yes
yes
  Complete!
source
HorseML.NeuralNetwork.@simple_epochs - Macro
@simple_epochs n ex

It's not much different from @epochs, but it doesn't have the ability to keep the progress bar displayed.

Example

julia> function yes()
           println("yes")
           sleep(0.1)
       end
yes (generic function with 1 method)
julia> @simple_epochs 10 yes()
yes
yes
yes
yes
yes
yes
yes
yes
yes
yes
  Complete!
source

Layers

HorseML.NeuralNetwork.Dense - Type
Dense(in=>out, σ; set_w = "Xavier", high_accuracy=false)

Create a traditional Dense layer, whose forward propagation is given by σ.(muladd(W, X, b)). The size of X must be (in) or (in, batch).

Parameters

  • in=>out: numbers of input and output units
  • σ: activation function
  • set_w: Xavier or He; decides the method used to initialize the parameters. You can also specify a function that takes the numbers of input and output units as arguments
  • high_accuracy: whether to calculate using Float64

Example

julia> D = Dense(5=>2, relu)
Dense(IO:5=>2, σ:relu)

julia> D(rand(Float64, 5)) |> size
(2,)
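
Since the input may also have size (in, batch), a batch of three samples gives one output column per sample; the size below follows from the docstring above and is meant as an illustrative sketch:

julia> D(rand(Float64, 5, 3)) |> size
(2, 3)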
source
HorseML.NeuralNetwork.Denseσ - Type
Denseσ(in=>out, σ; set_w = "Xavier", high_accuracy=false)

A Dense layer in which the parameters of the activation function are also learned. Apart from that, the implementation matches the Dense layer. It is assumed that the activation function is passed as a structure, and the parameters to be learned must be stored in its w field.

Example

julia> D = Denseσ(5=>2, relu)
Denseσ(IO:5=>2, σ:relu)

julia> D(rand(Float64, 5)) |> size
(2,)
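
As a hedged sketch of the w field requirement (LearnableLeakyReLU is a hypothetical activation defined here only for illustration, and it is assumed that Denseσ broadcasts the activation elementwise, just like Dense):

# The trainable parameter lives in the `w` field, as Denseσ expects.
mutable struct LearnableLeakyReLU
    w::Vector{Float64}   # w[1] is the learnable negative slope
end
LearnableLeakyReLU() = LearnableLeakyReLU([0.01])
(f::LearnableLeakyReLU)(x) = x > 0 ? x : f.w[1] * x

D = Denseσ(5=>2, LearnableLeakyReLU())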
source
HorseML.NeuralNetwork.Conv - Type
Conv(kernel, in=>out, σ; stride = 1, padding = 0, set_w = "Xavier")

This is the traditional convolution layer. kernel is a tuple of integers that specifies the kernel size; it must have one or two elements. in and out specify the numbers of input and output channels.

The input data must have dimensions WHCB (width, height, channel, batch). If you want to use data that only has dimensions WHC, you must add a batch dimension.

Parameters

  • stride: the stride of the convolution layer. An integer or a tuple of 2 elements
  • padding: the padding of the convolution layer. An integer or a tuple of 2 or 4 elements. If you specify KeepSize for this parameter, the input is padded so that the output has the same size as the input
  • set_w: Xavier or He; decides the method used to initialize the parameters. This parameter is the same as for Dense()

Example

julia> C = Conv((2, 2), 2=>2, relu)
Convolution(k:(2, 2), IO:2 => 2, σ:relu)

julia> C(rand(10, 10, 2, 5)) |> size
(9, 9, 2, 5)
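
A further sketch combining stride and padding; the size below assumes the usual output-size formula floor((W + 2*padding - kernel)/stride) + 1 and is indicative rather than verified against the implementation:

julia> C2 = Conv((3, 3), 2=>4, relu, stride = 2, padding = 1);

julia> C2(rand(10, 10, 2, 1)) |> size
(5, 5, 4, 1)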
Warning

When you specify KeepSize for padding, in some cases the output is one size smaller, because of how the padding is computed.

julia> C = Conv((2, 2), 2=>2, relu, padding = KeepSize)
Convolution(k:(2, 2), IO:2 => 2, σ:relu)

julia> C(rand(10, 10, 2, 5)) |> size
(9, 9, 2, 5)
source
HorseML.NeuralNetwork.Dropout - Type
Dropout(p)

This layer applies dropout to the input data, zeroing each element with probability p.

Example

julia> D = Dropout(0.25)
Dropout(0.25)

julia> D(rand(10))
10-element Array{Float64,1}:
 0.0
 0.3955865029078952
 0.8157710047424143
 1.0129613533211907
 0.8060508293474877
 1.1067504108970596
 0.1461289547292684
 0.0
 0.04581776023870532
 1.2794087133638332
source
HorseML.NeuralNetwork.Flatten - Type
Flatten()

This layer flattens image-shaped input into a vector.

Example

julia> F = Flatten()
Flatten(())

julia> F(rand(10, 10, 2, 5)) |> size
(1000,)
source
HorseML.NeuralNetwork.MaxPool - Type
MaxPool(k::NTuple; stride = k, padding = 0)

This is a layer for max pooling with kernel size k.

Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions after the N feature dimensions, where N = length(k).

The default stride is the same as kernel size k.

Example

julia> N = NetWork(Conv((2, 2), 5=>2, relu), MaxPool((2, 2)))
Layer1 : Convolution(k:(2, 2), IO:5=>2, σ:relu)
Layer2 : MaxPool(k:(2, 2), stride:(2, 2) padding:(0, 0, 0, 0))

julia> x = rand(Float64, 10, 10, 5, 5);

julia> N(x) |> size
(4, 4, 2, 5)
source
HorseML.NeuralNetwork.MeanPool - Type
MeanPool(k::NTuple; stride = k, padding = 0)

This is a layer for mean pooling with kernel size k.

Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions after the N feature dimensions, where N = length(k).

The default stride is the same as kernel size k.

Example

julia> N = NetWork(Conv((2, 2), 5=>2, relu), MeanPool((2, 2)))
Layer1 : Convolution(k:(2, 2), IO:5=>2, σ:relu)
Layer2 : MeanPool(k:(2, 2), stride:(2, 2) padding:(0, 0, 0, 0))

julia> x = rand(Float64, 10, 10, 5, 5);

julia> N(x) |> size
(4, 4, 2, 5)
source

Activations

HorseML.NeuralNetwork.σ - Function
σ(x)

Standard sigmoid activation function. Also, this function can be called with σ. This is the expression:

\[\sigma(x) = \frac{1}{1+e^{-x}}\]

source
HorseML.NeuralNetwork.hardσ - Function
hardsigmoid(x) = max(0, min(1, (x + 2.5) / 6))

Piecewise linear approximation of sigmoid. Also, this function can be called with hardσ. This is the expression:

\[hardsigmoid(x) = \left\{ \begin{array}{ll} 1 & (x \geq 3.5) \\ \frac{x + 2.5}{6} & (-2.5 \lt x \lt 3.5) \\ 0 & (x \leq -2.5) \end{array} \right.\]

source
HorseML.NeuralNetwork.hardtanh - Function
hardtanh(x)

Piecewise linear approximation of tanh. This is the expression:

\[hardtanh(x) = \left\{ \begin{array}{ll} 1 & (x \geq 1) \\ x & (-1 \lt x \lt 1) \\ -1 & (x \leq -1) \end{array} \right.\]

source
HorseML.NeuralNetwork.relu - Function
relu(x) = max(0, x)

relu is the Rectified Linear Unit. This is the expression:

\[relu(x) = \left\{ \begin{array}{ll} x & (x \geq 0) \\ 0 & (x \lt 0) \end{array} \right.\]

source
HorseML.NeuralNetwork.leakyrelu - Function
leakyrelu(x; α=0.01) = (x>0) ? x : α*x

Leaky Rectified Linear Unit. This is the expression:

\[leakyrelu(x) = \left\{ \begin{array}{ll} \alpha x & (x \lt 0) \\ x & (x \geq 0) \end{array} \right.\]
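
Evaluating the definition above directly (the values follow from α*x):

julia> leakyrelu(-2.0)          # default α = 0.01
-0.02

julia> leakyrelu(-2.0, α = 0.1)
-0.2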

source
HorseML.NeuralNetwork.rrelu - Type
rrelu(min, max)

Randomized Rectified Linear Unit. The expression is the same as leakyrelu, but α is a random number between min and max. Also, since this function is defined as a structure, use it as follows:

Dense(10=>5, rrelu(0.001, 0.1))
source
HorseML.NeuralNetwork.relu6 - Function
relu6(x)

Relu function with an upper limit of 6. This is the expression:

\[relu6(x) = \left\{ \begin{array}{ll} 6 & (x \gt 6) \\ x & (0 \leq x \leq 6) \\ 0 & (x \lt 0) \end{array} \right.\]

source
HorseML.NeuralNetwork.elu - Function
elu(x, α=1)

Exponential Linear Unit activation function. You can also specify the coefficient explicitly, e.g. elu(x, 1). This is the expression:

\[elu(x, α) = \left\{ \begin{array}{ll} x & (x \geq 0) \\ \alpha(e^x-1) & (x \lt 0) \end{array} \right.\]

source
HorseML.NeuralNetwork.gelu - Function
gelu(x)

Gaussian Error Linear Unit. This is the expression ($\phi$ is the cumulative distribution function of the standard normal distribution):

\[gelu(x) = x\phi(x)\]

However, in the implementation, it is calculated with the following expression.

\[\sigma(x) = \frac{1}{1+e^{-x}} \\ gelu(x) = x\sigma(1.702x)\]

source
HorseML.NeuralNetwork.swish - Function
swish(x; β=1)

The swish function. This is the expression:

\[\sigma(x) = \frac{1}{1+e^{-x}} \\ swish(x) = x\sigma(\beta x)\]

source
HorseML.NeuralNetwork.selu - Function
selu(x)

Scaled Exponential Linear Unit. This is the expression:

\[\lambda = 1.0507009873554804934193349852946 \\ \alpha = 1.6732632423543772848170429916717 \\ selu(x) = \lambda \left\{ \begin{array}{ll} x & (x \geq 0) \\ \alpha(e^x-1) & (x \lt 0) \end{array} \right.\]

source
HorseML.NeuralNetwork.celu - Function
celu(x; α=1)

Continuously Differentiable Exponential Linear Unit. This is the expression:

\[\alpha = 1 \\ celu(x) = \left\{ \begin{array}{ll} x & (x \geq 0) \\ \alpha(e^\frac{x}{\alpha}-1) & (x \lt 0) \end{array} \right.\]

source
HorseML.NeuralNetwork.logσ - Function
logσ(x)

Logarithmic sigmoid function. This is the expression:

\[\sigma(x) = \frac{1}{1+e^{-x}} \\ logsigmoid(x) = \log(\sigma(x))\]

source
HorseML.NeuralNetwork.mish - Function
mish(x) = x * tanh(softplus(x))

The mish function. This is the expression:

\[softplus(x) = \ln(1+e^x) \\ mish(x) = x\tanh(softplus(x))\]

source
HorseML.NeuralNetwork.softshrink - Function
softshrink(x; λ=0.5)

The soft shrinkage function. This is the expression:

\[\lambda=0.5 \\ softshrink(x) = \left\{ \begin{array}{ll} x-\lambda & (x \gt \lambda) \\ 0 & (-\lambda \leq x \leq \lambda) \\ x+\lambda & (x \lt -\lambda) \\ \end{array} \right.\]

source
HorseML.NeuralNetwork.trelu - Function
trelu(x; θ=1)

Threshold gated Rectified Linear Unit. This is the expression:

\[\theta = 1 \\ trelu(x) = \left\{ \begin{array}{ll} x & (x \gt \theta) \\ 0 & (x \leq \theta) \end{array} \right.\]

source
HorseML.NeuralNetwork.SQU - Function
SQU(x)

Shifted Quadratic Unit. SQU is a biologically inspired activation that enables single neurons to learn the XOR function. This is the expression:

\[SQU(x) = x^{2}+x\]

source

Optimizers

HorseML.NeuralNetwork.Momentum - Type
Momentum(η=0.01, α=0.9, velocity)

Momentum gradient descent optimizer with learning rate η and velocity parameter α.

Parameters

  • η : learning rate
  • α : velocity parameter
source
HorseML.NeuralNetwork.Adam - Type
Adam(η=0.01, β=(0.9, 0.99))

Gradient descent optimizer with adaptive moment estimation (Adam).

Parameters

  • η : learning rate
  • β : decay rates of the moment estimates
source
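
As a minimal construction sketch based only on the signatures above (it is assumed that the velocity argument of Momentum can be left at its default; the resulting optimizer object is then passed to the training routine):

julia> opt = Momentum(0.01, 0.9);   # learning rate η = 0.01, velocity parameter α = 0.9

julia> opt = Adam(0.001);           # β keeps its default (0.9, 0.99)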

GPU Support

HorseML.gpu - Function
gpu(model)

Transform the model so that it can be trained on the GPU. When called in an environment without a GPU, it does nothing and returns the original model.

Note

This function is included in the HorseML module and can only be used with using HorseML.

Example

julia> model = NetWork(Dense(10=>5, relu), Dense(5=>1, tanh)) |> gpu
Layer1 : Dense(IO:10 => 5, σ:relu)
Layer2 : Dense(IO:5 => 1, σ:tanh)

julia> model[1].w |> typeof
CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
source
HorseML.cpu - Function
cpu(model)

Put the model trained on the GPU back on the CPU.

Note

This function is included in the HorseML module and can only be used with using HorseML.

Example

julia> model_on_cpu = model |> cpu # `model` is the GPU model built in the gpu example above
Layer1 : Dense(IO:10 => 5, σ:relu)
Layer2 : Dense(IO:5 => 1, σ:tanh)

julia> model_on_cpu[1].w |> typeof
Matrix{Float32} (alias for Array{Float32, 2})
source