Example problem, Task Type 2: error when predicting on the test data

Question

Hello! I ran into the following error while reviewing, so I'd like to ask about it.

import pandas as pd

X_train = pd.read_csv('X_train.csv', encoding='euc-kr')
y_train = pd.read_csv('y_train.csv')
X_test = pd.read_csv('X_test.csv', encoding='euc-kr')

print(X_train.shape, y_train.shape)
# X_train.head()
# y_train.head()

# X_train.info()  # X_train: missing values in 환불금액 (refund amount), two object columns
# y_train.info()

X_train = X_train.fillna(0)
X_train.isnull().sum()

X_train = X_train.drop(['cust_id'], axis=1)
cust_id = X_test.pop('cust_id')

# Label encoding
from sklearn.preprocessing import LabelEncoder

cols = X_train.select_dtypes(include='object').columns
cols
for col in cols:
    le = LabelEncoder()
    X_train[col] = le.fit_transform(X_train[col])
    X_test[col] = le.transform(X_test[col])

X_train.head()

# Split off validation data
from sklearn.model_selection import train_test_split

X_tr, X_val, y_tr, y_val = train_test_split(X_train,
                                            y_train['gender'],
                                            test_size=0.2,
                                            random_state=2022)
X_tr.shape, X_val.shape, y_tr.shape, y_val.shape

# Modeling: random forest
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

model = RandomForestClassifier(random_state=2022)
model.fit(X_tr, y_tr)
pred = model.predict_proba(X_val)

roc_auc_score(y_val, pred[:, 1])

pred = model.predict_proba(X_test)  # <-- the error occurs at this step
pred

This raises:

ValueError: Input X contains NaN. RandomForestClassifier does not accept missing values encoded as NaN natively.

I've gone over the code and can't see anything wrong with it. What could the problem be?

Answer

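Ah! It was because I never handled the missing values in X_test. I called fillna(0) on X_train only, so X_test still contained NaN when it reached predict_proba.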
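A minimal sketch of the fix described above, assuming the same fillna(0) strategy is appropriate for both frames. The toy DataFrames and the column names `total_amount` and `refund_amount` are hypothetical stand-ins for the course CSV files, not the actual data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data standing in for X_train.csv / y_train.csv / X_test.csv.
X_train = pd.DataFrame({
    "total_amount": [100, 250, 80, 300, 120, 60],
    "refund_amount": [np.nan, 20.0, np.nan, 50.0, 10.0, np.nan],
})
y_train = pd.Series([0, 1, 0, 1, 1, 0], name="gender")
X_test = pd.DataFrame({
    "total_amount": [90, 280],
    "refund_amount": [np.nan, 30.0],
})

# Apply the same missing-value handling to BOTH train and test.
X_train = X_train.fillna(0)
X_test = X_test.fillna(0)

# Check that no NaN remains in the test features before predicting.
nan_left = int(X_test.isnull().sum().sum())
print("NaN values left in X_test:", nan_left)

model = RandomForestClassifier(random_state=2022)
model.fit(X_train, y_train)

# With X_test filled, predict_proba no longer sees NaN input.
pred = model.predict_proba(X_test)
print(pred.shape)
```

Running `X_test.isnull().sum()` right after loading the data is a quick way to catch this kind of mismatch before the model is ever fitted.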

Answer

Keep it up :)
