NeuralNetwork
Basics
To build a neural network with LearningHorse, use the NetWork type.
HorseML.NeuralNetwork.NetWork
— TypeNetWork(layers...)
Connect multiple layers to build a neural network. NetWork also supports indexing. You can also add layers later using the add_layer!() function.
Example
julia> N = NetWork(Dense(10=>5, relu), Dense(5=>1, relu))
julia> N[1]
Dense(IO:10=>5, σ:relu)
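For adding a layer afterwards with add_layer!() (mentioned above), a hypothetical call might look like the following; the add_layer!(network, layer) argument order is an assumption for illustration and is not confirmed by this page:
julia> add_layer!(N, Dense(1=>1, tanh))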
HorseML.NeuralNetwork.@epochs
— Macro@epochs n ex
This macro runs ex n times. It is mainly useful for training a NeuralNetwork. Even if there is output during the loop, the progress bar won't disappear! It is always displayed on the bottom line of the output. When the process finishes, Complete! is displayed.
Output produced during training is still displayed, but in order to keep the progress bar visible, it is printed collectively after each iteration. (This may be improved in the future.)
This macro may not work on Windows (because Windows locks files)! Use @simple_epochs instead!
Example
julia> function yes()
println("yes")
sleep(0.1)
end
yes (generic function with 1 method)
julia> @epochs 10 yes()
yes
yes
yes
yes
yes
yes
yes
yes
yes
yes
Complete!
HorseML.NeuralNetwork.@simple_epochs
— Macro@simple_epochs n ex
This is not much different from @epochs, but it does not keep the progress bar displayed.
Example
julia> function yes()
println("yes")
sleep(0.1)
end
yes (generic function with 1 method)
julia> @simple_epochs 10 yes()
yes
yes
yes
yes
yes
yes
yes
yes
yes
yes
Complete!
Layers
HorseML.NeuralNetwork.Dense
— TypeDense(in=>out, σ; set_w = "Xavier", high_accuracy=false)
Create a traditional Dense layer, whose forward propagation is given by: σ.(muladd(W, X, b)). The size of the input x must be (in) or (in, batch).
Parameters
- in=>out : the numbers of input and output units
- σ : the activation function
- set_w : Xavier or He, decides the method used to create the initial parameters. You can also specify a function which takes the numbers of I/O units as arguments
- high_accuracy : whether to calculate using Float64
Example
julia> D = Dense(5=>2, relu)
Dense(IO:5=>2, σ:relu)
julia> D(rand(Float64, 5)) |> size
(2,)
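Since the input may also have size (in, batch), a batched call should behave as sketched below; the (2, 3) output shape is inferred from the forward rule σ.(muladd(W, X, b)) rather than taken from the library's documentation:
julia> D(rand(Float64, 5, 3)) |> size
(2, 3)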
HorseML.NeuralNetwork.Denseσ
— TypeDenseσ(in=>out, σ; set_w = "Xavier", high_accuracy=false)
A Dense layer in which the parameters of the activation function can also be learned. The implementation otherwise matches the Dense layer. It is assumed that the activation function is passed as a structure, and the parameters to be learned must be in its w field.
Example
julia> D = Denseσ(5=>2, relu)
Denseσ(IO:5=>2, σ:relu)
julia> D(rand(Float64, 5)) |> size
(2,)
HorseML.NeuralNetwork.Conv
— TypeConv(kernel, in=>out, σ; stride = 1, padding = 0, set_w = "Xavier")
This is the traditional convolution layer. kernel is a tuple of integers that specifies the kernel size; it must have one or two elements. in and out specify the number of input and output channels.
The input data must have dimensions WHCB (width, height, channel, batch). If you want to use data with dimensions WHC, you must add a batch dimension.
Parameters
- stride : specify the stride of the convolution layer. This is an integer or a tuple of 2 elements
- padding : specify the padding of the convolution layer. This is an integer or a tuple of 2 or 4 elements. If you specify KeepSize for this parameter, the input is padded so that the output has the same size as the input
- set_w : Xavier or He, decides the method used to create the initial parameters. This parameter is the same as for Dense()
Example
julia> C = Conv((2, 2), 2=>2, relu)
Convolution(k:(2, 2), IO:2 => 2, σ:relu)
julia> C(rand(10, 10, 2, 5)) |> size
(9, 9, 2, 5)
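The output size in the example follows the usual convolution arithmetic: for input width W, kernel size K, padding P, and stride S, each spatial dimension of the output is
\[W_{out} = \left\lfloor \frac{W + 2P - K}{S} \right\rfloor + 1\]
With W = 10, K = 2, P = 0, S = 1 this gives 9, which matches the (9, 9, 2, 5) output above.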
When you specify KeepSize for padding, in some cases the output will be one size smaller than the input, because of the way the output size is computed:
julia> C = Conv((2, 2), 2=>2, relu, padding = KeepSize)
Convolution(k:(2, 2), IO:2 => 2, σ:relu)
julia> C(rand(10, 10, 2, 5)) |> size
(9, 9, 2, 5)
HorseML.NeuralNetwork.Dropout
— TypeDropout(p)
This layer applies dropout to the input data: each element of the input is set to zero with probability p.
Example
julia> D = Dropout(0.25)
Dropout(0.25)
julia> D(rand(10))
10-element Array{Float64,1}:
0.0
0.3955865029078952
0.8157710047424143
1.0129613533211907
0.8060508293474877
1.1067504108970596
0.1461289547292684
0.0
0.04581776023870532
1.2794087133638332
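Judging from the example output, the surviving values appear to be rescaled by 1/(1-p) (inverted dropout). Assuming that standard formulation, the rule applied to each element is:
\[y_i = \left\{ \begin{array}{ll} 0 & (\text{with probability } p) \\ \frac{x_i}{1-p} & (\text{otherwise}) \end{array} \right.\]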
HorseML.NeuralNetwork.Flatten
— TypeFlatten()
This layer flattens image data into a vector.
Example
julia> F = Flatten()
Flatten(())
julia> F(rand(10, 10, 2, 5)) |> size
(1000,)
HorseML.NeuralNetwork.MaxPool
— TypeMaxPool(k::NTuple; stride = k, padding = 0)
This is a layer for max pooling with kernel size k.
Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions after the N feature dimensions, where N = length(k).
The default stride is the same as the kernel size k.
Example
julia> N = NetWork(Conv((2, 2), 5=>2, relu), MaxPool((2, 2)))
Layer1 : Convolution(k:(2, 2), IO:5=>2, σ:relu)
Layer2 : MaxPool(k:(2, 2), stride:(2, 2) padding:(0, 0, 0, 0))
julia> x = rand(Float64, 10, 10, 5, 5); size(x)
(10, 10, 5, 5)
julia> N(x) |> size
(4, 4, 2, 5)
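The spatial sizes in this example follow the usual pooling arithmetic: the Conv layer reduces 10 to 9, and pooling with kernel size 2 and stride 2 then gives
\[\left\lfloor \frac{9 - 2}{2} \right\rfloor + 1 = 4\]
for each spatial dimension, while the channel and batch dimensions are unchanged.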
HorseML.NeuralNetwork.MeanPool
— TypeMeanPool(k::NTuple; stride = k, padding = 0)
This is a layer for mean pooling with kernel size k.
Expects as input an array with ndims(x) == N+2, i.e. channel and batch dimensions after the N feature dimensions, where N = length(k).
The default stride is the same as the kernel size k.
Example
julia> N = NetWork(Conv((2, 2), 5=>2, relu), MeanPool((2, 2)))
Layer1 : Convolution(k:(2, 2), IO:5=>2, σ:relu)
Layer2 : MeanPool(k:(2, 2), stride:(2, 2) padding:(0, 0, 0, 0))
julia> x = rand(Float64, 10, 10, 5, 5); size(x)
(10, 10, 5, 5)
julia> N(x) |> size
(4, 4, 2, 5)
Activations
HorseML.NeuralNetwork.σ
— Functionσ(x)
Standard sigmoid activation function. Also, this function can be called with σ
. This is the expression:
\[\sigma(x) = \frac{1}{1+e^{-x}}\]
HorseML.NeuralNetwork.hardσ
— Functionhardsigmoid(x) = max(0, min(1, (x + 2.5) / 6))
Piecewise linear approximation of sigmoid. Also, this function can be called with hardσ
. This is the expression:
\[hardsigmoid(x) = \left\{ \begin{array}{ll} 1 & (x \geq 3.5) \\ \frac{x + 2.5}{6} & (-2.5 \lt x \lt 3.5) \\ 0 & (x \leq -2.5) \end{array} \right.\]
HorseML.NeuralNetwork.hardtanh
— Functionhardtanh(x)
Piecewise linear approximation of tanh. This is the expression:
\[hardtanh(x) = \left\{ \begin{array}{ll} 1 & (x \geq 1) \\ x & (-1 \lt x \lt 1) \\ -1 & (x \leq -1) \end{array} \right.\]
HorseML.NeuralNetwork.relu
— Functionrelu(x) = max(0, x)
relu
is Rectified Linear Unit
. This is the expression:
\[relu(x) = \left\{ \begin{array}{ll} x & (x \geq 0) \\ 0 & (x \lt 0) \end{array} \right.\]
HorseML.NeuralNetwork.leakyrelu
— Functionleakyrelu(x; α=0.01) = (x>0) ? x : α*x
Leaky Rectified Linear Unit. This is the expression:
\[leakyrelu(x) = \left\{ \begin{array}{ll} \alpha x & (x \lt 0) \\ x & (x \geq 0) \end{array} \right.\]
HorseML.NeuralNetwork.rrelu
— Typerrelu(min, max)
Randomized Rectified Linear Unit. The expression is the same as leakyrelu, but α is a random number between min and max. Also, since this function is defined as a structure, use it as follows:
Dense(10=>5, rrelu(0.001, 0.1))
HorseML.NeuralNetwork.prelu
— Typeprelu(; α=0.01)
Parametric Rectified Linear Unit. The expression is the same as leakyrelu, but α is determined by learning. Also, when using this function, use Denseσ instead of Dense.
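Following the rrelu usage shown above, a hypothetical usage with the default α would look like this (an illustrative sketch, not an example from the library's documentation):
Denseσ(10=>5, prelu())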
HorseML.NeuralNetwork.relu6
— Functionrelu6(x)
Relu function with an upper limit of 6. This is the expression:
\[relu6(x) = \left\{ \begin{array}{ll} 6 & (x \gt 6) \\ x & (0 \leq x \leq 6) \\ 0 & (x \lt 0) \end{array} \right.\]
HorseML.NeuralNetwork.elu
— Functionelu(x, α=1)
Exponential Linear Unit activation function. You can also specify the coefficient explicitly, e.g. elu(x, 1). This is the expression:
\[elu(x, α) = \left\{ \begin{array}{ll} x & (x \geq 0) \\ \alpha(e^x-1) & (x \lt 0) \end{array} \right.\]
HorseML.NeuralNetwork.gelu
— Functiongelu(x)
Gaussian Error Linear Unit. This is the expression ($\phi$ is the cumulative distribution function of the standard normal distribution):
\[gelu(x) = x\phi(x)\]
However, in the implementation, it is calculated with the following expression.
\[\sigma(x) = \frac{1}{1+e^{-x}} \\ gelu(x) = x\sigma(1.702x)\]
HorseML.NeuralNetwork.swish
— Functionswish(x; β=1)
The swish function. This is the expression:
\[\sigma(x) = \frac{1}{1+e^{-x}} \\ swish(x) = x\sigma(\beta x)\]
HorseML.NeuralNetwork.selu
— Functionselu(x)
Scaled exponential linear units. This is the expression
\[\lambda = 1.0507009873554804934193349852946 \\ \alpha = 1.6732632423543772848170429916717 \\ selu(x) = \lambda \left\{ \begin{array}{ll} x & (x \geq 0) \\ \alpha(e^x-1) & (x \lt 0) \end{array} \right.\]
HorseML.NeuralNetwork.celu
— Functioncelu(x; α=1)
Continuously Differentiable Exponential Linear Unit. This is the expression:
\[\alpha = 1 \\ celu(x) = \left\{ \begin{array}{ll} x & (x \geq 0) \\ \alpha(e^\frac{x}{\alpha}-1) & (x \lt 0) \end{array} \right.\]
HorseML.NeuralNetwork.softplus
— Functionsoftplus(x) = log(1 + exp(x))
The softplus activation function. This is the expression:
\[softplus(x) = \ln(1+e^x)\]
HorseML.NeuralNetwork.softsign
— Functionsoftsign(x) = x / (1+abs(x))
The softsign activation function. This is the expression:
\[softsign(x) = \frac{x}{1+|x|}\]
HorseML.NeuralNetwork.logσ
— Functionlogσ(x)
Logarithmic sigmoid function. This is the expression:
\[\sigma(x) = \frac{1}{1+e^{-x}} \\ logsigmoid(x) = \log(\sigma(x))\]
HorseML.NeuralNetwork.logcosh
— Functionlogcosh(x)
Log-Cosh function. This is the expression:
\[logcosh(x) = \log(\cosh(x))\]
HorseML.NeuralNetwork.mish
— Functionmish(x) = x * tanh(softplus(x))
The mish function. This is the expression:
\[softplus(x) = \ln(1+e^x) \\ mish(x) = x\tanh(softplus(x))\]
HorseML.NeuralNetwork.tanhshrink
— Functiontanhshrink(x)
Shrink tanh function. This is the expression:
\[tanhshrink(x) = x-\tanh(x)\]
HorseML.NeuralNetwork.softshrink
— Functionsoftshrink(x; λ=0.5)
This is the expression:
\[\lambda=0.5 \\ softshrink(x) = \left\{ \begin{array}{ll} x-\lambda & (x \gt \lambda) \\ 0 & (-\lambda \leq x \leq \lambda) \\ x+\lambda & (x \lt -\lambda) \\ \end{array} \right.\]
HorseML.NeuralNetwork.trelu
— Functiontrelu(x; θ=1)
Threshold gated Rectified Linear Unit. This is the expression:
\[\theta = 1 \\ trelu(x) = \left\{ \begin{array}{ll} x & (x \gt \theta) \\ 0 & (x \leq \theta) \end{array} \right.\]
HorseML.NeuralNetwork.lisht
— Functionlisht(x)
Linearly Scaled Hyperbolic Tangent (LiSHT) activation function. This is the expression:
\[lisht(x) = x\tanh(x)\]
HorseML.NeuralNetwork.gaussian
— Functiongaussian(x)
The Gaussian function. This is the expression:
\[Gaussian(x) = e^{-x^{2}}\]
HorseML.NeuralNetwork.GCU
— FunctionGCU(x)
Growing Cosine Unit. This is the expression:
\[GCU(x) = x\cos(x)\]
HorseML.NeuralNetwork.SQU
— FunctionSQU(x)
Shifted Quadratic Unit. SQU is a biologically inspired activation that enables single neurons to learn the XOR function. This is the expression:
\[SQU(x) = x^{2}+x\]
HorseML.NeuralNetwork.NCU
— FunctionNCU(x)
Non-Monotonic Cubic Unit. This is the expression:
\[NCU(x) = x-x^{3}\]
HorseML.NeuralNetwork.SSU
— FunctionSSU(x)
Shifted Sinc Unit. This is the expression:
\[SSU(x) = \pi sinc(x - \pi)\]
HorseML.NeuralNetwork.DSU
— FunctionDSU(x)
Decaying Sine Unit. This is the expression:
\[DSU(x) = \frac{\pi}{2}(sinc(x-\pi)-sinc(x+\pi))\]
Optimizers
HorseML.NeuralNetwork.Descent
— TypeDescent(η=0.1)
Basic gradient descent optimizer with learning rate η
.
Parameters
- learning rate : η
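For reference, the standard gradient descent update that such an optimizer performs is sketched below (the usual rule, not necessarily the exact implementation):
\[w \leftarrow w - \eta \nabla_w L\]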
HorseML.NeuralNetwork.Momentum
— TypeMomentum(η=0.01, α=0.9, velocity)
Momentum gradient descent optimizer with learning rate η
and parameter of velocity α
.
Parameters
- learning rate : η
- parameter of velocity : α
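For reference, the usual momentum update rule is sketched below (the exact sign and decay conventions in the implementation may differ):
\[v \leftarrow \alpha v - \eta \nabla_w L \\ w \leftarrow w + v\]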
HorseML.NeuralNetwork.AdaGrad
— TypeAdaGrad(η = 0.01)
Gradient descent optimizer with learning rate attenuation.
Parameters
- η : initial learning rate
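For reference, the standard AdaGrad rule, which attenuates the learning rate per parameter, is sketched below (ε is a small constant for numerical stability; details may differ in the implementation):
\[h \leftarrow h + \nabla_w L \odot \nabla_w L \\ w \leftarrow w - \eta \frac{\nabla_w L}{\sqrt{h} + \epsilon}\]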
HorseML.NeuralNetwork.Adam
— TypeAdam(η=0.01, β=(0.9, 0.99))
Gradient descent adaptive moment estimation optimizer.
Parameters
- η : learning rate
- β : Decay of momentums
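For reference, the standard Adam update with decay rates β = (β₁, β₂) is sketched below (a sketch of the usual formulation; bias correction and ε handling may differ in the implementation):
\[m \leftarrow \beta_1 m + (1-\beta_1)\nabla_w L \\ v \leftarrow \beta_2 v + (1-\beta_2)(\nabla_w L)^{2} \\ w \leftarrow w - \eta \frac{m / (1-\beta_1^t)}{\sqrt{v / (1-\beta_2^t)} + \epsilon}\]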
GPU Support
HorseML.gpu
— Functiongpu(model)
Transform the model so that it can be trained on the GPU. When called in an environment without a GPU, it does nothing and returns the original model.
This function is included in the HorseML module and can only be used with using HorseML
.
Example
julia> model = NetWork(Dense(10=>5, relu), Dense(5=>1, tanh)) |> gpu
Layer1 : Dense(IO:10 => 5, σ:relu)
Layer2 : Dense(IO:5 => 1, σ:tanh)
julia> model[1].w |> typeof
CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
HorseML.cpu
— Functioncpu(model)
Put the model trained on the GPU back on the CPU.
This function is included in the HorseML module and can only be used with using HorseML
.
Example
julia> model_on_cpu = model |> cpu # `model` is the one built in the gpu() example above
Layer1 : Dense(IO:10 => 5, σ:relu)
Layer2 : Dense(IO:5 => 1, σ:tanh)
julia> model_on_cpu[1].w |> typeof
Matrix{Float32} (alias for Array{Float32, 2})