Deriving the final ROC AUC score

Question

Hello, and thank you very much for the great lectures. In the last cell of 3. Evaluation, you define the get_clf_eval function with ROC-AUC included, and the code ends there. If I add the following at the end of that cell (as in the other cells) and run it:

    thresholds = [0.4, 0.45, 0.50, 0.55, 0.60]
    pred_proba = lr_clf.predict_proba(X_test)
    get_eval_by_threshold(y_test, pred_proba[:,1].reshape(-1,1), thresholds)

I get the error message "Input contains NaN, infinity or a value too large for dtype('float64')." How can I resolve this?

nathan · Answer

Wow, thank you so much!

권 철민 · Answer

get_eval_by_threshold() also needs to be changed as shown below, so that it passes the predicted probabilities on to get_clf_eval(). The change to this function is explained in the lecture that follows.

    from sklearn.preprocessing import Binarizer

    def get_eval_by_threshold(y_test, pred_proba_c1, thresholds):
        # Iterate over the values in the thresholds list and run the evaluation for each.
        for custom_threshold in thresholds:
            binarizer = Binarizer(threshold=custom_threshold).fit(pred_proba_c1)
            custom_predict = binarizer.transform(pred_proba_c1)
            print('임곗값:', custom_threshold)  # '임곗값' means 'threshold'
            # Changed for roc_auc_score: pass pred_proba_c1 as the third argument.
            get_clf_eval(y_test, custom_predict, pred_proba_c1)

Thank you.
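To see the corrected call chain run end to end, here is a minimal, self-contained sketch. It is not the course notebook: the synthetic data from make_classification, the max_iter setting, the English print strings, the np.ravel / astype(int) flattening, and the returned AUC values are all assumptions added for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import Binarizer


def get_clf_eval(y_test, pred=None, pred_proba=None):
    confusion = confusion_matrix(y_test, pred)
    accuracy = accuracy_score(y_test, pred)
    precision = precision_score(y_test, pred)
    recall = recall_score(y_test, pred)
    f1 = f1_score(y_test, pred)
    # ROC-AUC is computed from probabilities, not from thresholded labels.
    roc_auc = roc_auc_score(y_test, np.ravel(pred_proba))
    print('Confusion matrix')
    print(confusion)
    print('Accuracy: {0:.4f}, Precision: {1:.4f}, Recall: {2:.4f}, '
          'F1: {3:.4f}, AUC: {4:.4f}'.format(accuracy, precision, recall, f1, roc_auc))
    return roc_auc  # assumption: returned only so the sketch can be checked


def get_eval_by_threshold(y_test, pred_proba_c1, thresholds):
    aucs = []
    for custom_threshold in thresholds:
        binarizer = Binarizer(threshold=custom_threshold).fit(pred_proba_c1)
        # Flatten to 1-D integer labels (assumption, to keep the metrics input simple).
        custom_predict = binarizer.transform(pred_proba_c1).ravel().astype(int)
        print('Threshold:', custom_threshold)
        # The third argument is what the failing version left out.
        aucs.append(get_clf_eval(y_test, custom_predict, pred_proba_c1))
    return aucs


# Synthetic stand-in for the course dataset (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
lr_clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

thresholds = [0.4, 0.45, 0.50, 0.55, 0.60]
pred_proba = lr_clf.predict_proba(X_test)
aucs = get_eval_by_threshold(y_test, pred_proba[:, 1].reshape(-1, 1), thresholds)
```

Accuracy, precision, recall and F1 change with the threshold, while the AUC printed for each threshold is the same value, because it depends only on the predicted probabilities.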

nathan · Answer

<<<Course code>>>

    # Deriving the final function
    def get_clf_eval(y_test, pred=None, pred_proba=None):
        confusion = confusion_matrix(y_test, pred)
        accuracy = accuracy_score(y_test, pred)
        precision = precision_score(y_test, pred)
        recall = recall_score(y_test, pred)
        f1 = f1_score(y_test, pred)
        # Add ROC-AUC
        roc_auc = roc_auc_score(y_test, pred_proba)
        print('오차 행렬')  # '오차 행렬' means 'confusion matrix'
        print(confusion)
        # Add ROC-AUC to the printed output (accuracy, precision, recall, F1, AUC)
        print('정확도: {0:.4f}, 정밀도: {1:.4f}, 재현율: {2:.4f},\
        F1: {3:.4f}, AUC:{4:.4f}'.format(accuracy, precision, recall, f1, roc_auc))

<<<Part I added>>>

    thresholds = [0.4, 0.45, 0.50, 0.55, 0.60]
    pred_proba = lr_clf.predict_proba(X_test)
    get_eval_by_threshold(y_test, pred_proba[:,1].reshape(-1,1), thresholds)

<<<Result and error message>>>

    임곗값: 0.4
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    in
         18 thresholds = [0.4, 0.45, 0.50, 0.55, 0.60]
         19 pred_proba = lr_clf.predict_proba(X_test)
    ---> 20 get_eval_by_threshold(y_test, pred_proba[:,1].reshape(-1,1), thresholds)

    in get_eval_by_threshold(y_test, pred_proba_c1, thresholds)
          8         custom_predict = binarizer.transform(pred_proba_c1)
          9         print('임곗값:', custom_threshold)
    ---> 10         get_clf_eval(y_test, custom_predict)
         11         print(' ')
         12

    in get_clf_eval(y_test, pred, pred_proba)
          8     f1 = f1_score(y_test, pred)
          9     # Add ROC-AUC
    ---> 10     roc_auc = roc_auc_score(y_test, pred_proba)
         11
         12     print('오차 행렬')

    ~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
         71                           FutureWarning)
         72         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
    ---> 73         return f(**kwargs)
         74     return inner_f
         75

    ~\anaconda3\lib\site-packages\sklearn\metrics\_ranking.py in roc_auc_score(y_true, y_score, average, sample_weight, max_fpr, multi_class, labels)
        370     y_type = type_of_target(y_true)
        371     y_true = check_array(y_true, ensure_2d=False, dtype=None)
    --> 372     y_score = check_array(y_score, ensure_2d=False)
        373
        374     if y_type == "multiclass" or (y_type == "binary" and

    ~\anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
         71                           FutureWarning)
         72         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
    ---> 73         return f(**kwargs)
         74     return inner_f
         75

    ~\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
        643
        644         if force_all_finite:
    --> 645             _assert_all_finite(array,
        646                                allow_nan=force_all_finite == 'allow-nan')
        647

    ~\anaconda3\lib\site-packages\sklearn\utils\validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
         95                 not allow_nan and not np.isfinite(X).all()):
         96         type_err = 'infinity' if allow_nan else 'NaN, infinity'
    ---> 97         raise ValueError(
         98                 msg_err.format
         99                 (type_err,

    ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
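Reading the traceback above, get_eval_by_threshold calls get_clf_eval(y_test, custom_predict) with no third argument, so pred_proba keeps its default of None by the time it reaches roc_auc_score. One way to surface this earlier is a small guard. The sketch below is only an illustration: checked_roc_auc is a hypothetical helper name, not part of the course code or of scikit-learn, and the toy labels and scores are made up.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def checked_roc_auc(y_true, pred_proba=None):
    """Hypothetical helper: fail early with a readable message."""
    if pred_proba is None:
        raise ValueError(
            "pred_proba is None: pass the positive-class probabilities, "
            "e.g. predict_proba(X_test)[:, 1]")
    scores = np.ravel(pred_proba)  # accept shape (n,) or (n, 1)
    if not np.isfinite(scores).all():
        raise ValueError("pred_proba contains NaN or infinity")
    return roc_auc_score(y_true, scores)


# Toy data (made up): two negatives, two positives.
y_true = np.array([0, 0, 1, 1])
scores = np.array([[0.1], [0.4], [0.35], [0.8]])
print(checked_roc_auc(y_true, scores))  # 3 of 4 positive/negative pairs are ordered correctly: 0.75

try:
    checked_roc_auc(y_true)  # pred_proba omitted, as in the failing call
except ValueError as err:
    print("Caught:", err)
```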

권 철민 · Answer

Hello, could you post the full error message here so that I can look into the error? Thank you.