Constant initialisation
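A minimal sketch of what constant initialisation could look like, written as a stand-alone function so that it runs on its own. The name `init_parameters_constant`, the `value` argument and the returned dictionary are illustrative assumptions, not part of the original class. The check at the end illustrates the usual drawback: when all weights are equal, every hidden unit computes the same output.

```python
import numpy as np

def init_parameters_constant(n_x, n_h, n_y, value=0.0):
    """Fill every weight and bias with the same constant (hypothetical helper)."""
    return {
        "W1": np.full((n_x, n_h), value),
        "b1": np.full((1, n_h), value),
        "W2": np.full((n_h, n_y), value),
        "b2": np.full((1, n_y), value),
    }

params = init_parameters_constant(n_x=3, n_h=4, n_y=2, value=0.5)

# Forward pass through the hidden layer for 5 random samples with 3 features
X = np.random.default_rng(0).normal(size=(5, 3))
hidden = np.tanh(X @ params["W1"] + params["b1"])

# Every hidden column equals the first one, so the units are indistinguishable
symmetric = np.allclose(hidden, hidden[:, [0]])
print(symmetric)
```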
Random initialisation
def init_parameters_normal(self):
    # Requires: import numpy as np
    # Zero-mean Gaussian samples, scaled to a standard deviation of 0.1
    self.W1 = np.random.randn(self.n_x, self.n_h) * 0.1
    self.b1 = np.random.randn(1, self.n_h) * 0.1
    self.W2 = np.random.randn(self.n_h, self.n_y) * 0.1
    self.b2 = np.random.randn(1, self.n_y) * 0.1
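The method above relies on the attributes `n_x`, `n_h` and `n_y` being set on the object. A minimal sketch of how it might be used, assuming a hypothetical container class `TwoLayerNet` (the class name and constructor are not from the original):

```python
import numpy as np

class TwoLayerNet:
    """Hypothetical container holding the layer sizes used by the initialiser."""

    def __init__(self, n_x, n_h, n_y):
        self.n_x, self.n_h, self.n_y = n_x, n_h, n_y

    def init_parameters_normal(self):
        # Zero-mean Gaussian samples, scaled to a standard deviation of 0.1
        self.W1 = np.random.randn(self.n_x, self.n_h) * 0.1
        self.b1 = np.random.randn(1, self.n_h) * 0.1
        self.W2 = np.random.randn(self.n_h, self.n_y) * 0.1
        self.b2 = np.random.randn(1, self.n_y) * 0.1

np.random.seed(0)  # fixed seed so the run is repeatable
net = TwoLayerNet(n_x=200, n_h=100, n_y=10)
net.init_parameters_normal()

# The sample standard deviation of W1 should be close to the 0.1 scale factor
print(net.W1.shape, net.W1.std())
```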
Xavier initialisation
Based on a Gaussian distribution
- parameters initialised with zero mean
- variance scaled to the number of input and output units, Var(W) = 2 / (n_in + n_out):
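def init_parameters_xavier(self):
    # Requires: import numpy as np
    # Standard deviation (not variance) per layer: sqrt(2 / (fan_in + fan_out))
    std1 = np.sqrt(2.0 / (self.n_x + self.n_h))
    std2 = np.sqrt(2.0 / (self.n_h + self.n_y))
    self.W1 = np.random.randn(self.n_x, self.n_h) * std1
    self.b1 = np.random.randn(1, self.n_h) * std1
    self.W2 = np.random.randn(self.n_h, self.n_y) * std2
    self.b2 = np.random.randn(1, self.n_y) * std2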
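To see that the scaling gives the intended variance, the rule can be sketched as a stand-alone function and checked empirically. The function name `xavier_normal` and the `fan_in`/`fan_out` argument names are assumptions made for this illustration; the check compares the sample variance of one weight matrix with the target 2 / (fan_in + fan_out).

```python
import numpy as np

def xavier_normal(fan_in, fan_out, rng):
    """Sample a (fan_in, fan_out) matrix with variance 2 / (fan_in + fan_out)."""
    std = np.sqrt(2.0 / (fan_in + fan_out))  # standard deviation, not variance
    return rng.normal(loc=0.0, scale=std, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W = xavier_normal(fan_in=300, fan_out=200, rng=rng)

target_var = 2.0 / (300 + 200)
# The two printed numbers should agree to within sampling noise
print(W.var(), target_var)
```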
Other initialisations
- Glorot's (another name for Xavier initialisation)
- Orthogonal initialisation
- Variance scaling initialisation
- He initialisation - variance of 2 / n_in (n_in = number of inputs to the layer)
- LeCun initialisation - variance of 1 / n_in, useful for sigmoid and tanh activation
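A rough sketch of three of the schemes listed above, assuming their standard definitions: He initialisation with variance 2 / fan_in, LeCun initialisation with variance 1 / fan_in, and orthogonal initialisation built from the QR decomposition of a Gaussian matrix. The function names and signatures are hypothetical and chosen only for this example.

```python
import numpy as np

def he_normal(fan_in, fan_out, rng):
    """Zero-mean Gaussian weights with variance 2 / fan_in."""
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def lecun_normal(fan_in, fan_out, rng):
    """Zero-mean Gaussian weights with variance 1 / fan_in."""
    return rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(fan_in, fan_out))

def orthogonal(n_rows, n_cols, rng):
    """Matrix with orthonormal columns (or rows, if n_rows < n_cols)."""
    big, small = max(n_rows, n_cols), min(n_rows, n_cols)
    a = rng.normal(size=(big, small))
    q, r = np.linalg.qr(a)       # q: (big, small) with orthonormal columns
    q = q * np.sign(np.diag(r))  # fix column signs, independent of QR convention
    return q if n_rows >= n_cols else q.T

rng = np.random.default_rng(0)
W_he = he_normal(400, 300, rng)
W_lecun = lecun_normal(400, 300, rng)
W_orth = orthogonal(6, 4, rng)

print(W_he.var(), 2.0 / 400)     # sample variance vs. He target
print(W_lecun.var(), 1.0 / 400)  # sample variance vs. LeCun target
print(np.allclose(W_orth.T @ W_orth, np.eye(4)))  # columns are orthonormal
```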