$y_{i,k}$ is the ground-truth probability that sample $i$ belongs to class $k$.
For example, if sample $i$ is of class 2 and there are 10 classes, then $y_{i,2} = 1$ and $y_{i,k} = 0$ for every other class $k$.
$y_i$ is the one-hot vector for the sample with class 2, i.e., a binary vector in which exactly one element is "hot" (set to 1) while all other elements are "cold" (set to 0).
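
The one-hot encoding described above can be sketched in a few lines of Python. This is only an illustration: the function name `one_hot` is hypothetical, and the sketch assumes classes are numbered from 0 to 9, which the text does not state. With one-based labels the hot element would sit one position earlier.

```python
def one_hot(label, num_classes):
    """Return a one-hot list for `label`.

    Assumption: labels are zero-based, i.e. they run from 0 to num_classes - 1.
    """
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside [0, {num_classes - 1}]")
    # Exactly one position (the label's index) is set to 1; all others are 0.
    return [1 if k == label else 0 for k in range(num_classes)]


# A sample of class 2 with 10 classes in total.
y_i = one_hot(2, 10)
print(y_i)  # [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
```

Because exactly one entry is 1, the vector sums to 1 and can be read as a probability distribution that puts all of its mass on the true class.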