Deep Learning using Rectified Linear Units (ReLU)

TLDR

ReLU is typically used as an activation function in DNNs with Softmax as the classification function, but prior work has explored alternative classification functions, and this study adds to that line of research. The study introduces ReLU as a classification function in deep neural networks. The authors compute raw scores by multiplying the penultimate layer activation by weights, apply ReLU to threshold at zero, and then use argmax to produce class predictions.

Abstract

We introduce the use of rectified linear units (ReLU) as the classification function in a deep neural network (DNN). Conventionally, ReLU is used as an activation function in DNNs, with the Softmax function as the classification function. However, there have been several studies on using a classification function other than Softmax, and this study is an addition to those. We accomplish this by taking the activation of the penultimate layer $h_{n - 1}$ in a neural network and multiplying it by the weight parameters $\theta$ to get the raw scores $o_{i}$. Afterwards, we threshold the raw scores $o_{i}$ at $0$, i.e. $f(o_{i}) = \max(0, o_{i})$, where $f$ is the ReLU function. We provide class predictions $\hat{y}$ through the argmax function, i.e. $\hat{y} = \arg\max_{i} f(o_{i})$.
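The prediction step described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the function names, the example values, and the layout of `theta` (one row of weights per class) are assumptions made here, and no bias term is included because the abstract does not mention one.

```python
def raw_scores(h, theta):
    """Compute o_i = sum_j theta[i][j] * h[j] for each class i.

    Assumption: theta holds one row of weights per class.
    """
    return [sum(w * x for w, x in zip(row, h)) for row in theta]


def relu(scores):
    """Threshold each raw score at zero: f(o_i) = max(0, o_i)."""
    return [max(0.0, o) for o in scores]


def predict(h, theta):
    """Return the index of the largest ReLU-thresholded score (argmax)."""
    f = relu(raw_scores(h, theta))
    # max() returns the first index when several scores tie.
    return max(range(len(f)), key=f.__getitem__)


# Hypothetical penultimate-layer activation (3 features)
# and weights for 2 classes; values chosen only for illustration.
h_prev = [0.5, -1.0, 2.0]
theta = [
    [1.0, 0.0, -1.0],
    [0.5, 1.0, 1.0],
]

scores = raw_scores(h_prev, theta)   # [-1.5, 1.25]
y_hat = predict(h_prev, theta)       # 1
```

In this sketch, when every raw score is non-positive the thresholded scores are all zero and the argmax falls back to the first class. That tie-breaking rule is a property of this illustration, not something the abstract specifies.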
