In classification prediction, can I check what the result values actually mean?

Question

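Hello. I have been working through these very useful lectures several times over, and I am deeply grateful. In the lecture notes for Practical Task Type 2 (Building the Basics), Lecture 7 (a model that classifies penguin Species), the final step "#11. Save file" submits the answer in this form:

    pd.DataFrame({'id': y_test.index, 'pred': pred3}).to_csv('003000000.csv', index=False)

I was curious about the actual contents of the result, so I ran

    print(pd.DataFrame({'id': y_test.index, 'pred': pred3}).head(10))

and got:

        id  pred
    0   57     0
    1  173     1
    2  213     1
    3   50     0
    4   25     0
    5  207     1
    6  166     1
    7  244     2
    8  234     2
    9   61     0

The classification result (pred3) is expressed as values from 0 to 2. If I have understood this correctly, is there a way to check which penguin species ('Adelie', 'Gentoo', 'Chinstrap') each number represents? Thank you.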

Jongdeok Heo · Answer

Thank you, teacher. I would like to find the cause of, and a fix for, an error message I get from label encoding in a classification problem.

    from sklearn.preprocessing import LabelEncoder
    X_label = ['sex', 'embarked', 'class', 'who', 'adult_male', 'deck', 'embark_town', 'alone']
    X_train[label] = X_train[label].apply(LabelEncoder().fit_transform)
    X_test[label] = X_test[label].apply(LabelEncoder().fit_transform)
    print(X_train.head())

[Error message]

    TypeError                                 Traceback (most recent call last)
    C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_label.py in _encode(values, uniques, encode, check_unknown)
        111         try:
    --> 112             res = _encode_python(values, uniques, encode)
        113         except TypeError:

    C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_label.py in _encode_python(values, uniques, encode)
         59     if uniques is None:
    ---> 60         uniques = sorted(set(values))
         61         uniques = np.array(uniques, dtype=values.dtype)

    TypeError: '<' not supported between instances of 'float' and 'str'

    During handling of the above exception, another exception occurred:

    TypeError                                 Traceback (most recent call last)
     in
          1 from sklearn.preprocessing import LabelEncoder
          2 X_label = ['sex', 'embarked', 'class', 'who', 'adult_male', 'deck', 'embark_town', 'alone']
    ----> 3 X_train[label] = X_train[label].apply(LabelEncoder().fit_transform)
          4 X_test[label] = X_test[label].apply(LabelEncoder().fit_transform)
          5 print(X_train.head())

    C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
       6876             kwds=kwds,
       6877         )
    -> 6878         return op.get_result()
       6879
       6880     def applymap(self, func) -> "DataFrame":

    C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py in get_result(self)
        184             return self.apply_raw()
        185
    --> 186         return self.apply_standard()
        187
        188     def apply_empty_result(self):

    C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
        311
        312         # compute the result using the series generator
    --> 313         results, res_index = self.apply_series_generator()
        314
        315         # wrap results

    C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
        339         else:
        340             for i, v in enumerate(series_gen):
    --> 341                 results[i] = self.f(v)
        342                 keys.append(v.name)
        343

    C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_label.py in fit_transform(self, y)
        250         """
        251         y = column_or_1d(y, warn=True)
    --> 252         self.classes_, y = _encode(y, encode=True)
        253         return y
        254

    C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_label.py in _encode(values, uniques, encode, check_unknown)
        112             res = _encode_python(values, uniques, encode)
        113         except TypeError:
    --> 114             raise TypeError("argument must be a string or number")
        115         return res
        116     else:

    TypeError: argument must be a string or number
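
A note on the traceback above: the inner error, '<' not supported between instances of 'float' and 'str', typically appears when a column mixes strings with missing values, since NaN is a float and the encoder in this scikit-learn version sorts the unique values. Below is a minimal sketch of one possible workaround, not the lecture's method. The toy frames, the 'missing' placeholder, and the helper as_clean_strings are all hypothetical. It also fits one encoder per column on train and test together so both get the same codes, which differs from the posted code.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data: 'deck' mixes strings with NaN (a float).
X_train = pd.DataFrame({'sex': ['male', 'female', 'male'],
                        'deck': ['C', np.nan, 'E']})
X_test = pd.DataFrame({'sex': ['female', 'male'],
                       'deck': [np.nan, 'C']})
label = ['sex', 'deck']  # columns to encode


def as_clean_strings(series):
    """Hypothetical helper: replace NaN with a placeholder and cast to str."""
    return series.astype(object).fillna('missing').astype(str)


for col in label:
    encoder = LabelEncoder()
    # Fit on train and test together so a value gets the same code in both.
    encoder.fit(pd.concat([as_clean_strings(X_train[col]),
                           as_clean_strings(X_test[col])]))
    X_train[col] = encoder.transform(as_clean_strings(X_train[col]))
    X_test[col] = encoder.transform(as_clean_strings(X_test[col]))

print(X_train)
print(X_test)
```

Whether a placeholder category is the right treatment for missing values depends on the data; dropping or imputing those rows before encoding is another option.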

Daegu Big Data Utilization Center (대구빅데이터활용센터) · Answer

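Hello. Here is code for checking which species each number represents. Alternatively, you can build a new DataFrame and check it there. If you add the code below to the label-encoding step from the lecture, you can view the mapping as a DataFrame.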
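
The code from this answer is not included in the thread as captured here. As a rough sketch of the idea, assuming the target column was encoded with scikit-learn's LabelEncoder (the lecture may have used a different method, in which case the codes can differ), the fitted encoder's classes_ attribute lists the original labels in code order and inverse_transform converts predictions back to names. The toy y_train and pred3 values below are hypothetical stand-ins for the lecture's variables.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the lecture's target column.
y_train = pd.Series(['Adelie', 'Gentoo', 'Chinstrap', 'Adelie', 'Gentoo'],
                    name='species')

encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y_train)

# classes_ holds the sorted labels; the label at position i was encoded as i.
mapping = pd.DataFrame({'code': range(len(encoder.classes_)),
                        'species': encoder.classes_})
print(mapping)

# Hypothetical predictions, decoded back to species names.
pred3 = np.array([0, 2, 1, 0])
decoded = encoder.inverse_transform(pred3)
print(pd.DataFrame({'pred': pred3, 'species': decoded}))
```

This only works if the same fitted encoder object is kept around, so it is worth assigning it to a variable rather than calling LabelEncoder().fit_transform inline.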