Subfield: Certification (Data Science)
Status: Resolved
Question about the example problem, Task Type 2 (new version)
Posted 23.12.01 13:44 · Views: 242
As far as I know, roc_auc_score takes predicted probabilities (predict_proba),
so I would like to know what in the code below is causing the error.
Error location: pred = model.predict_proba(test)
이태경
Asker · 2023.12.01
import pandas as pd
train = pd.read_csv("data/customer_train.csv")
test = pd.read_csv("data/customer_test.csv")
# Preprocessing: fill missing refund amounts ('환불금액') with 0
train['환불금액'] = train['환불금액'].fillna(0)
test['환불금액'] = test['환불금액'].fillna(0)
# One-hot encode the object columns of train and test separately
cols = train.select_dtypes(include='object').columns
train = pd.get_dummies(train, columns = cols)
test = pd.get_dummies(test, columns = cols)
# Separate the target column ('성별', gender) from the features
target = train.pop('성별')
from sklearn.model_selection import train_test_split
X_tr, X_val, y_tr, y_val = train_test_split(train, target, test_size = 0.2, random_state = 2022)
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(random_state=2022)
model.fit(X_tr, y_tr)
# Validate with ROC AUC using the predicted probability of class 1
pred = model.predict_proba(X_val)
from sklearn.metrics import roc_auc_score
print(roc_auc_score(y_val, pred[:,1]))
# Predict on the test set (this is the line that raises the error)
pred = model.predict_proba(test)
submit = pd.DataFrame({
'pred': pred[:,1]
})
submit.to_csv('00000.csv', index=False)
이태경
Asker · 2023.12.01
Process started. (Please enter input values directly.)
> 0.6415146489773355
Makefile:6: recipe for target 'py3_run' failed
make: *** [py3_run] Error 1
Traceback (most recent call last):
  File "/goorm/Main.out", line 33, in <module>
    pred = model.predict_proba(test)
  File "/usr/local/lib/python3.9/dist-packages/sklearn/ensemble/_forest.py", line 674, in predict_proba
    X = self._validate_X_predict(X)
  File "/usr/local/lib/python3.9/dist-packages/sklearn/ensemble/_forest.py", line 422, in _validate_X_predict
    return self.estimators_[0]._validate_X_predict(X, check_input=True)
  File "/usr/local/lib/python3.9/dist-packages/sklearn/tree/_classes.py", line 407, in _validate_X_predict
    X = self._validate_data(X, dtype=DTYPE, accept_sparse="csr",
  File "/usr/local/lib/python3.9/dist-packages/sklearn/base.py", line 437, in _validate_data
    self._check_n_features(X, reset=reset)
  File "/usr/local/lib/python3.9/dist-packages/sklearn/base.py", line 365, in _check_n_features
    raise ValueError(
ValueError: X has 73 features, but DecisionTreeClassifier is expecting 74 features as input.
Process finished.
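The ValueError above says the model was fitted on 74 feature columns but the test matrix has only 73. One plausible cause, not confirmed by the output itself, is that pd.get_dummies was applied to train and test separately, so a category value that appears only in train gets a dummy column that test never receives. The following is a minimal sketch of that situation and one possible fix. The toy frames and the column names "amount" and "grade" are hypothetical and are not taken from the original dataset.

```python
import pandas as pd

# Hypothetical toy data: the category "C" appears only in the training set.
train = pd.DataFrame({"amount": [10, 20, 30], "grade": ["A", "B", "C"]})
test = pd.DataFrame({"amount": [15, 25], "grade": ["A", "B"]})

# Encoding each frame separately, as in the post, can yield different columns.
train_enc = pd.get_dummies(train, columns=["grade"])
test_enc = pd.get_dummies(test, columns=["grade"])

# Diagnose: which columns exist in one frame but not in the other?
missing_in_test = sorted(set(train_enc.columns) - set(test_enc.columns))
extra_in_test = sorted(set(test_enc.columns) - set(train_enc.columns))
print(missing_in_test, extra_in_test)

# One possible fix: align test to the training columns, in the same order,
# filling dummy columns that are absent from test with 0.
test_aligned = test_enc.reindex(columns=train_enc.columns, fill_value=0)
print(list(test_aligned.columns) == list(train_enc.columns))
```

If this is indeed the cause, the equivalent step in the posted script would be something like test = test.reindex(columns=train.columns, fill_value=0), placed after train.pop('성별') so that the target column is not included. Another common approach is to concatenate train and test before calling pd.get_dummies and split them again afterwards.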
1 Answer