ReLU function
Rectified linear unit activation function
import numpy as np

def ReLU(val):
    # Negative inputs become 0; non-negative inputs pass through unchanged
    return np.maximum(0, val)
Maps any input value below 0 to 0, and any input value of 0 or above to itself.
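For example, a quick illustrative call to the ReLU function defined above:

    x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
    ReLU(x)  # array([0., 0., 0., 1.5, 3.])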
Widely used because it is computationally cheap and, unlike sigmoid or tanh, does not saturate for positive inputs. Closely related to the Leaky ReLU function.
- ? What does saturation mean?
Saturation is when an activation function's output is pushed toward its extreme values, so its gradient becomes very small or zero; this contributes to vanishing gradients. A related failure mode is neuron "death": a neuron that always outputs zero has a zero gradient for every input and stops learning.
ReLU avoids saturation for positive inputs and is computationally efficient. To tackle the dying-ReLU problem, the Leaky ReLU function is used; it gives negative inputs a small non-zero slope instead of zeroing them out (see the sketch below).
y = \max(0, x)
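A minimal Leaky ReLU sketch in the same NumPy style, assuming a small slope parameter alpha (0.01 is a common choice, not something fixed by this note):

    import numpy as np

    def LeakyReLU(val, alpha=0.01):
        # Positive inputs pass through unchanged; negative inputs are scaled by
        # a small slope instead of being zeroed, so their gradient stays non-zero
        return np.where(val > 0, val, alpha * val)

Because the negative branch keeps a small gradient, neurons are less likely to get stuck permanently outputting zero.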