Notes for Deep Learning Book

Chap 6: Deep Feedforward Networks

Sigmoid function

The unnormalized log probability is called $z$ (the book calls it a logit). I think it is the output of some node in the network before we turn it into real probabilities that sum to 1.

Section 6.2.2.2, Sigmoid Units for Bernoulli Output Distributions, is hard to follow, so I checked this reference from Stack Exchange (https://stats.stackexchange.com/questions/269575/motivating-sigmoid-output-units-in-neural-networks-starting-with-unnormalized-lo).

6.2.2.3 Softmax Units for Multinoulli Output Distributions

How to understand this?...
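One way I can make sense of both sections is to compute them by hand. The sketch below is my own illustration, not code from the book: it assumes $z$ is an unnormalized log probability, so that $\exp(yz)$ is the unnormalized probability of $y \in \{0, 1\}$, and dividing by the sum over both values of $y$ should give the sigmoid. Softmax does the same normalization for more than two classes. The function names (`sigmoid`, `softmax`, `bernoulli_from_logit`) are hypothetical helpers made up for this note.

```python
import math


def sigmoid(z):
    """Map a single unnormalized log probability z to P(y = 1)."""
    # Branch on the sign so that exp() is never called on a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bernoulli_from_logit(z):
    """Normalize exp(y * z) over y in {0, 1}; returns [P(y = 0), P(y = 1)]."""
    # Illustration only: exp(z) overflows for large z, unlike sigmoid() above.
    unnormalized = [math.exp(0 * z), math.exp(1 * z)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def softmax(zs):
    """Map a list of unnormalized log probabilities to probabilities summing to 1."""
    # Subtracting the max does not change the result but avoids overflow in exp().
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    z = 0.7
    print(bernoulli_from_logit(z)[1], sigmoid(z))  # the two values should agree
    print(softmax([0.0, z])[1])                    # two-class softmax gives the same value
    print(sum(softmax([1.0, 2.0, 3.0])))           # should be 1 up to rounding
```

Under these assumptions, the sigmoid is just the two-class case of softmax with one of the two values of $z$ fixed at 0.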