Linear regression
The model assumes a linear relationship between inputs and outputs. For a single input feature it therefore has two trainable parameters, a slope and an intercept, both free to be chosen.
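In one common notation, the single-feature model can be written as below. The symbols are assumptions made here for illustration and are not taken from the original text: x is the input, w the slope, b the intercept, and \hat{y} the predicted output.

```latex
\hat{y} = w x + b
```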
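In this two-parameter form, the model is the equation of a line. A line is a 1D affine subspace (a linear subspace in the strict sense only when it passes through the origin), and therefore a hyperplane of the 2D ambient space, since a hyperplane has one dimension fewer than the space that contains it.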
The parameters are fitted by solving an optimisation problem: minimising a loss function over the training data.
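Assuming the loss is the usual mean squared error over n training pairs (x_i, y_i), which is the loss that leads to the normal equation, the optimisation problem can be sketched as follows. The symbols w, b and n are illustrative notation, not taken from the original text.

```latex
\min_{w,\, b} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - (w x_i + b) \bigr)^2
```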
Normal equation
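For reference, the standard closed-form least-squares solution is usually written as below. Here X is the matrix of feature variables (with a column of ones added for the intercept), y is the vector of targets, and \hat{\theta} is the parameter vector that stacks the intercept and the slope. The names y and \hat{\theta} are notation assumed here.

```latex
\hat{\theta} = (X^\top X)^{-1} X^\top y
```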
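Solving the normal equation becomes computationally expensive when the number of features in X is large, because inverting the matrix costs roughly cubic time in the number of features. It is impossible to apply directly when the matrix product of the feature variables, X^T X, is not invertible.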
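In these cases the parameters are instead found with gradient descent, which starts from an initial guess and repeatedly moves the parameters a small step in the direction of the negative gradient of the loss, so no matrix inversion is needed.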
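As a minimal sketch of the two approaches, assuming a mean squared error loss and a single feature, the following NumPy code fits the same synthetic data with the normal equation and with batch gradient descent. The data, the function names fit_normal_equation and fit_gradient_descent, the learning rate and the step count are all illustrative choices, not taken from the original text.

```python
import numpy as np

# Synthetic data (illustrative values): y = 3x + 2 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 2.0 + rng.normal(scale=0.1, size=200)

# Design matrix: a column of ones for the intercept, then the feature.
X = np.column_stack([np.ones_like(x), x])


def fit_normal_equation(X, y):
    """Solve (X^T X) theta = X^T y without forming an explicit inverse."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def fit_gradient_descent(X, y, lr=0.1, n_steps=2000):
    """Minimise the mean squared error with plain batch gradient descent."""
    theta = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(n_steps):
        residual = X @ theta - y
        grad = (2.0 / n) * (X.T @ residual)  # gradient of the MSE w.r.t. theta
        theta -= lr * grad
    return theta


theta_ne = fit_normal_equation(X, y)
theta_gd = fit_gradient_descent(X, y)
print("normal equation :", theta_ne)   # [intercept, slope]
print("gradient descent:", theta_gd)   # [intercept, slope]
```

On well-conditioned data such as this, the two estimates should agree closely. The sketch uses np.linalg.solve rather than an explicit inverse, which is a common numerical preference. With many features, the gradient descent loop avoids forming and solving the X^T X system altogether.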