`log_softmax` in Dense layer + `from_logits=True` in cross-entropy loss

I have seen some code that uses `log_softmax` as the activation of the final Dense layer combined with `from_logits=True` in the cross-entropy loss, apparently to get a numerically stable softmax. How does this compare with using a linear activation in the Dense layer together with `from_logits=True` in the loss? Isn't there a duplicate "softmax" in the first case, since the cross-entropy loss already performs the softmax calculation when `from_logits=True`?

Hi @khteh, yes, you are correct: you should use a linear output together with `from_logits=True` in your loss. Using `log_softmax` as the activation is redundant and misuses an API designed for stable computation. Since `log_softmax` only subtracts a per-example constant (`logsumexp`) and softmax is shift-invariant, the extra step does not change the loss value; it just wastes computation. Thank you!
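The redundancy can be checked numerically. Below is a minimal NumPy sketch (the helper names `log_softmax` and `xent_from_logits` are illustrative, not part of any library): because `log_softmax(x) = x - logsumexp(x)` is a constant shift per example, feeding log-softmax outputs into a loss that expects raw logits produces the same loss value as feeding the logits directly.

```python
import numpy as np

def log_softmax(x):
    # Numerically stable log-softmax: x - logsumexp(x)
    x = x - np.max(x)
    return x - np.log(np.sum(np.exp(x)))

def xent_from_logits(logits, label):
    # Cross-entropy computed from raw logits, as a
    # from_logits=True loss does internally
    return -log_softmax(logits)[label]

x = np.array([2.0, 1.0, 0.1])  # raw Dense-layer outputs (logits)

loss_raw = xent_from_logits(x, 0)                # linear output
loss_redundant = xent_from_logits(log_softmax(x), 0)  # extra log_softmax

print(loss_raw, loss_redundant)  # identical up to float rounding
```

In Keras terms, the recommended setup is a final `Dense` layer with no activation (linear is the default) plus `tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)`; the loss then applies the stable log-softmax internally exactly once.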


I think softmax can help with many issues.

But I want to ask about WAP and how I can turn my websites into an application.