Log-likelihood (cross-entropy) function

Defined as:

$$L = \frac{1}{N}\sum_{i=1}^{N}\Big[y_i \ln\big(s(ax_i + b)\big) + (1 - y_i)\ln\big(1 - s(ax_i + b)\big)\Big]$$

Used as the loss function for logistic regression

When the predicted probability p(x_i) = s(ax_i + b) is close to the ground truth value y_i, both terms of the loss function have values close to zero; when p(x_i) differs from y_i, the active term produces a large negative value
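For intuition, a quick numeric check of the first term for a sample with y_i = 1 (the probabilities below are made up for illustration):

import numpy as np

# For y_i = 1 only the y_i * ln(p) term is active
for p in (0.99, 0.9, 0.5, 0.1, 0.01):
	print(f"p = {p:.2f} -> ln(p) = {np.log(p):.3f}")
# p = 0.99 -> ln(p) = -0.010  (prediction close to y_i = 1: near zero)
# p = 0.01 -> ln(p) = -4.605  (prediction far from y_i = 1: large negative)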

Since L becomes more negative the further p(x_i) is from y_i, we would like to maximise it (bringing it closer to zero). By convention we prefer to minimise loss functions, so we multiply L by -1 and minimise the result instead:

$$a, b = \arg\min_{a,b}\left[-\frac{1}{N}\sum_{i=1}^{N}\Big[y_i \ln\big(s(ax_i + b)\big) + (1 - y_i)\ln\big(1 - s(ax_i + b)\big)\Big]\right]$$
import numpy as np

def log_likelihood_loss(a, b, x, y):
	pred = logistic_regression(x, a, b)  # predicted probabilities p(x_i)
	# Negative mean log-likelihood (cross-entropy); minimised during training
	return -1 * np.mean(y * np.log(pred) + (1 - y) * np.log(1 - pred))
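A minimal usage sketch, assuming logistic_regression is the sigmoid s applied to the linear model ax + b (the data and parameter values below are made up):

def logistic_regression(x, a, b):
	# s(a*x + b) = 1 / (1 + e^-(a*x + b))
	return 1.0 / (1.0 + np.exp(-(a * x + b)))

x = np.array([0.5, 1.5, 2.5, 3.5])
y = np.array([0, 0, 1, 1])
print(log_likelihood_loss(a=2.0, b=-4.0, x=x, y=y))   # parameters that fit the labels: small loss
print(log_likelihood_loss(a=-2.0, b=4.0, x=x, y=y))   # parameters that invert them: large loss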