Decision tree overfitting

from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
%matplotlib inline

plt.title("3 Class values with 2 Features Sample data creation")

# Generate sample classification data with 2 features (for 2D visualization) and 3 target classes.
X_features, y_labels = make_classification(n_features=2, n_redundant=0, n_informative=2,
                                           n_classes=3, n_clusters_per_class=1, random_state=0)

# Scatter-plot the 2 features on 2D axes; each class value is drawn in a different color.
plt.scatter(X_features[:, 0], X_features[:, 1], marker='o', c=y_labels, s=25, cmap='rainbow', edgecolor='k')

-------------------------------------

This code is from p.199 of the book, but it was not covered in class, so I am posting a question about it.

My question is about this line:

plt.scatter(X_features[:, 0], X_features[:, 1], marker='o', c=y_labels, s=25, cmap='rainbow', edgecolor='k')

1) Is X_features[:, 0] the predicted probability for class 0, and X_features[:, 1] the predicted probability for class 1?

I am asking because I am confusing this with the ndarray returned by predict_proba(), which we learned earlier.
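One way to check this directly is to compare the two arrays side by side. The following is only a minimal sketch: the DecisionTreeClassifier and the names dt_clf and proba are my own assumptions for illustration and are not part of the p.199 snippet. It prints the shape of X_features (the array returned by make_classification) next to the shape of the array returned by predict_proba().

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Same data generation as the snippet above: 2 input features, 3 classes.
X_features, y_labels = make_classification(n_features=2, n_redundant=0, n_informative=2,
                                           n_classes=3, n_clusters_per_class=1, random_state=0)

# X_features is the input feature matrix: one row per sample, one column per feature.
print(X_features.shape)   # (100, 2)
print(X_features[:3])     # raw feature values, not limited to the range 0..1

# Assumption for illustration: fit a decision tree so that predict_proba() can be called.
dt_clf = DecisionTreeClassifier(random_state=0)
dt_clf.fit(X_features, y_labels)

# predict_proba() returns one column per class, and each row sums to 1.
proba = dt_clf.predict_proba(X_features)
print(proba.shape)        # (100, 3)
print(np.allclose(proba.sum(axis=1), 1.0))
```

If this sketch is right, X_features[:, 0] and X_features[:, 1] are simply the first and second input features used as the x and y coordinates of the scatter plot, while the predict_proba() output is a separate array with one column per class.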

Thank you in advance for your answer.
