Section 3 [실습] PyTorch로 구현해보는 Loss Function의 Cross Entropy 구현 관련하여 질문이 있습니다.

Question

안녕하세요 선생님, batch_size = 16 n_class = 10 def generate_classification(batch_size=16, n_class=10): pred = torch.nn.Softmax()(torch.rand(batch_size, n_class)) ground_truth = torch.argmax(torch.rand(batch_size, n_class), dim=1) return pred, ground_truth def CE_loss(pred, label): loss = 0. exp_pred = torch.exp(pred) # 이 부분 관련 질문이 있습니다. for batch_i in range(len(pred)): for j in range(len(pred[0])): if j == label[batch_i]: print(pred[0], j) loss = loss + torch.log(exp_pred[batch_i][j] / torch.sum(exp_pred, axis=1)[batch_i]) return -loss / len(pred) CE loss를 구현하는 과정에서 exp_pred = torch.exp(pred) 행이 왜 필요한 것인지 궁금합니다! exp를 취해주는 이유는 모델의 출력값 logits에 exp를 적용해 각 클래스에 대한 예측값을 양수로 변환한다고 알고 있는데 generate_classification 위에서 이미 softmax를 취해서 확률분포로 변환해주기 때문에 음수 값은 나오지 않는데 왜 exp를 적용해주어야 하는지 모르겠어서 여쭤봅니다!

변정현 · Answer

안녕하세요 변정현입니다! 네 말씀해주신 것처럼 예측값이 이미 SoftMax 함수로 0~1 사이의 값과 합이 1로 normalize되어 있어서 CE_loss에 있는 SoftMax 함수 (Exponential 취한 후 Exponential의 합으로 나누는 것) 를 별도로 또 취할 필요가 없습니다! 제가 앞서서 정의한 예측값을 재활용하는 과정에서 혼선이 있었네요 ㅎㅎ 잘 발견해주셔서 감사합니다!