Softmax
- $y_i = e^{x_i} / \sum_j e^{x_j}$
- Easy way to turn arbitrary scores into positive numbers that sum to 1, so the outputs can be read as a probability distribution
- John Bridle recommends calling it softargmax (it is a smooth version of argmax, not of max)
- Used a lot for classification
- As a softmax output gets close to zero, its log goes to negative infinity (too large in magnitude to handle)
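- Better off using a module that computes the log of the softmax directly (log-softmax) to avoid numerical instability. Why: taking the log of a separately computed softmax can hit log(0) when an output underflows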
- With two inputs x and 0, softmax is the sigmoid function: $e^{x} / (e^{x} + e^{0}) = 1 / (1 + e^{-x})$ (the 1 comes from $e^{0} = 1$)
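
The points above can be checked with a small sketch in plain Python. This is an illustration, not a reference implementation: the function names `softmax`, `log_softmax`, and `sigmoid` and the example scores are assumptions made here, and the max-subtraction trick is a standard stabilization that the notes do not mention.

```python
import math

def softmax(xs):
    # Subtracting the max leaves the result unchanged mathematically
    # but keeps exp() from overflowing on large scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(xs):
    # log y_i = (x_i - m) - log(sum_j exp(x_j - m)), computed without
    # ever forming y_i, so a tiny y_i never reaches log().
    m = max(xs)
    log_total = math.log(sum(math.exp(x - m) for x in xs))
    return [(x - m) - log_total for x in xs]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical scores with a large gap between them.
scores = [0.0, 1000.0]
probs = softmax(scores)          # first probability underflows to 0.0
log_probs = log_softmax(scores)  # first log-probability stays finite

# Softmax over the two inputs (x, 0) matches sigmoid(x).
two_input = softmax([2.0, 0.0])[0]
```

Here `math.log(probs[0])` would raise a domain error because the probability has underflowed to zero, while `log_probs[0]` is simply `-1000.0`. That is the instability that computing the log inside the module avoids.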