Question about Past Exam 4, Type 2

Question

Is this the right way to compute the macro F1-score?

```python
# ***** Evaluation (= cross-validation)
from sklearn.model_selection import cross_val_score
# Features only: the target column must not be part of X
score = cross_val_score(model, train.drop('Segmentation', axis=1), train['Segmentation'], scoring='f1_macro', cv=5)
print(score)
print(score.mean())
```

Also, I worked through the problem as shown below, but my predicted Segmentation values differ from the ones in the instructor's solution. Is that okay? Please check whether there are any problems in my solution.

```python
# Load libraries
import pandas as pd

# Load data
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

# ***** Inspect data
train.shape, test.shape
train.head(2)
test.head(2)  # 6 string (object) columns
# train.info()

# No missing values
train.isnull().sum()
test.isnull().sum()

# ***** Preprocessing
# No missing values to handle
# train and test are not concatenated
# Encoding
from sklearn.preprocessing import LabelEncoder
cols = train.select_dtypes(include='object').columns
for col in cols:
    le = LabelEncoder()
    train[col] = le.fit_transform(train[col])
    test[col] = le.transform(test[col])

# Drop ID (keep the test IDs for the submission file)
train = train.drop('ID', axis=1)
test_id = test.pop('ID')

# ***** Split
from sklearn.model_selection import train_test_split
X_tr, X_val, y_tr, y_val = train_test_split(
    train.drop('Segmentation', axis=1),
    train['Segmentation'],
    test_size=0.2,
    random_state=2022
)

# ***** Model: max_depth=5~7 / n_estimators=100~1000
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(random_state=0, max_depth=7, n_estimators=500)
model.fit(X_tr, y_tr)
pred = model.predict(X_val)

# ***** Evaluation (= cross-validation)
from sklearn.model_selection import cross_val_score
# Features only: the target column must not be part of X
score = cross_val_score(model, train.drop('Segmentation', axis=1), train['Segmentation'], scoring='f1_macro', cv=5)
print(score)
print(score.mean())

# ***** Predict
pred = model.predict(test)
pred
submit = pd.DataFrame({'ID': test_id, 'Segmentation': pred})
submit

# ***** Save
submit.to_csv('submission_csv', index=False)
pd.read_csv('submission_csv')
```

퇴근후딴짓 · Answer

Yes, I don't see any major problems. If you're a beginner, I'd recommend practicing with train_test_split rather than cross-validation. Good luck!
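For reference, the hold-out approach recommended above could look roughly like this. It is only a minimal sketch: the data is a synthetic 4-class set from `make_classification`, used as a stand-in for the exam's `train.csv` (an assumption), and the model settings are illustrative. The macro F1-score on the validation split comes from `f1_score(..., average='macro')`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic 4-class data standing in for the exam's train.csv (assumption).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5,
    n_classes=4, random_state=0,
)

# Hold out 20% of the rows for validation.
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.2, random_state=2022
)

model = RandomForestClassifier(random_state=0, max_depth=7, n_estimators=100)
model.fit(X_tr, y_tr)
pred = model.predict(X_val)

# Macro F1: the unweighted mean of the per-class F1 scores.
macro_f1 = f1_score(y_val, pred, average='macro')
print(macro_f1)
```

If you do use `cross_val_score`, pass the same features-only matrix as `X`. Leaving the target column inside the features leaks the answer and inflates the score.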