How to fix InvalidIndexError from pd.concat(df, ignore_index=True)

Question

Hello. The last part of the code below raises an error, and I have not been able to solve it even after searching.

# Combine the final data
df1 = pd.concat(df, ignore_index=True)

---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
Cell In[89], line 2
      1 # Combine the final data
----> 2 df1 = pd.concat(df, ignore_index=True)

File ~\anaconda3\Lib\site-packages\pandas\core\reshape\concat.py:393, in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    378     copy = False
    380 op = _Concatenator(
    381     objs,
    382     axis=axis,
   (...)
    390     sort=sort,
    391 )
--> 393 return op.get_result()

File ~\anaconda3\Lib\site-packages\pandas\core\reshape\concat.py:676, in _Concatenator.get_result(self)
    674 obj_labels = obj.axes[1 - ax]
    675 if not new_labels.equals(obj_labels):
--> 676     indexers[ax] = obj_labels.get_indexer(new_labels)
    678 mgrs_indexers.append((obj._mgr, indexers))
    680 new_data = concatenate_managers(
    681     mgrs_indexers, self.new_axes, concat_axis=self.bm_axis, copy=self.copy
    682 )

File ~\anaconda3\Lib\site-packages\pandas\core\indexes\base.py:3875, in Index.get_indexer(self, target, method, limit, tolerance)
   3872 self._check_indexing_method(method, limit, tolerance)
   3874 if not self._index_as_unique:
-> 3875     raise InvalidIndexError(self._requires_unique_msg)
   3877 if len(target) == 0:
   3878     return np.array([], dtype=np.intp)

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

============================

df = []
articleNos = ['2433459189','2433504511']

for articleNo in articleNos:
    ind_url = f'https://new.land.naver.com/api/articles/{articleNo}?complexNo='
    res = requests.get(ind_url, headers=headers)
    ind_dict = res.json()

    article_df = pd.Series(ind_dict['articleDetail']).to_frame().T
    # articleDetail_df = articleDetail_df[['articl/eNo','articleName','buildingTypeName','realestateTypeName', 'tradeTypeName', 'cityName','divisionName', 'sectionName', 'etcAddress', 'monthlyManagementCost', 'buildingName']]

    if 'articleFloor' in ind_dict.keys():
        articleFloor_df = pd.Series(ind_dict['articleFloor']).to_frame().T
        article_df = pd.concat(
            [
                article_df,
                articleFloor_df,
            ],
            axis=1
        )
    else:
        print(articleNo, '/', 'articleFloor')
    # articleFloor_df = articleFloor_df[['totalFloorCount','correspondingFloorCount']]

    if 'articlePrice' in ind_dict.keys():
        articlePrice = pd.Series(ind_dict['articlePrice']).to_frame().T
        article_df = pd.concat(
            [
                article_df,
                articlePrice,
            ],
            axis=1
        )
    else:
        print(articleNo, '/', 'articlePrice')
    # articlePrice_df = articlePrice_df[['dealPrice','allWarrantPrice','allRentPrice']]

    if 'articleRealtor' in ind_dict.keys():
        articleRealtor = pd.Series(ind_dict['articleRealtor']).to_frame().T
        article_df = pd.concat(
            [
                article_df,
                articleRealtor,
            ],
            axis=1
        )
    else:
        print(articleNo, '/', 'articleRealtor')
    # articleRealtor_df = articleRealtor_df[['realtorName','representativeName','cellPhoneNo','representativeTelNo']]

    if 'articleSpace' in ind_dict.keys():
        articleSpace = pd.Series(ind_dict['articleSpace']).to_frame().T
        article_df = pd.concat(
            [
                article_df,
                articleSpace,
            ],
            axis=1
        )
    else:
        print(articleNo, '/', 'articleSpace')
    # articleSpace_df = articleSpace_df[['supplySpace','exclusiveSpace']]

    # article_df = pd.concat(
    #     [
    #         articleDetail_df,
    #         articleFloor_df,
    #         articlePrice_df,
    #         articleRealtor_df,
    #         articleSpace_df,
    #     ],
    #     axis=1
    # )

    df.append(article_df)

# Combine the final data
df1 = pd.concat(df, ignore_index=True)

DeepingSauce · Answer

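Hello. First of all, sorry for the very late reply.

It looks like the individual article_df DataFrames that were appended to the df list contain duplicate index labels. ignore_index only means that duplicate indexes between the article_df frames that make up df are ignored, so it cannot fix the case where a single DataFrame has duplicate index labels within itself.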
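To make the point above concrete, here is a minimal sketch of one way such duplicates can arise and be removed. In the traceback, the failing call is obj_labels.get_indexer(new_labels) with obj_labels = obj.axes[1 - ax], which for a row-wise concat appears to be the column labels rather than the row index, so the sketch assumes the duplicates come from the axis=1 concat steps gluing together sections that share a field name. The field names ("price", "rentPrice", "floor") and the helper drop_duplicate_columns are hypothetical and are not taken from the real API response.

```python
import pandas as pd

# Hypothetical stand-ins for two API sections that share a field name ("price").
detail = pd.Series({"articleNo": "1", "price": 100}).to_frame().T
price = pd.Series({"price": 100, "rentPrice": 5}).to_frame().T

# The axis=1 concat keeps both "price" columns, so the column labels repeat.
article_a = pd.concat([detail, price], axis=1)
print(article_a.columns.is_unique)  # False

# A second article whose columns differ from the first one.
article_b = pd.Series({"articleNo": "2", "price": 200, "floor": 3}).to_frame().T

# The row-wise concat has to align columns, which needs unique labels.
try:
    pd.concat([article_a, article_b], ignore_index=True)
except pd.errors.InvalidIndexError as exc:
    print("concat failed:", exc)


def drop_duplicate_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep only the first column for each repeated label (hypothetical helper)."""
    return frame.loc[:, ~frame.columns.duplicated()]


frames = [drop_duplicate_columns(f) for f in (article_a, article_b)]
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Keeping the first occurrence silently discards the later column, so this is only appropriate when the repeated fields carry the same value. If they can differ, prefixing each section's columns (for example with DataFrame.add_prefix) before the axis=1 concat would preserve both.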

communityai8509 · Answer

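Hello, this is the Inflearn AI intern.

초칼라, I reviewed your question and the information provided in order to help with the InvalidIndexError raised by pd.concat(df, ignore_index=True).

Among the information available to me, the post below covers the situation most similar to yours:

to_csv() 한글 깨짐 (garbled Korean text in to_csv()): this post explains how to resolve Korean encoding problems when saving a DataFrame to a CSV file with the to_csv method.

However, your problem is that an InvalidIndexError is raised when combining DataFrames with concat, so that post does not directly help with this issue.

You were trying to combine several DataFrames with pd.concat, and this error usually appears when there is a problem with the index. In particular, the message InvalidIndexError: Reindexing only valid with uniquely valued Index objects can occur when one or more of the DataFrames being combined contains duplicate index values.

As a fix, it may help to check each DataFrame for possible duplicate index values and, if needed, reset the index with reset_index(). Please understand that it is difficult to give a direct solution from the information provided.

To help with your understanding, I recommend links to similar answers based on the instructor's previous answers. This is currently a beta feature, so the answer may not be fully satisfactory. 🙏 If you have further questions, please follow up here. The instructor will reply soon.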
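The check suggested above can be sketched roughly as follows. This is an illustrative example, not code from the thread: find_duplicate_labels is a hypothetical helper and the three small frames are made-up data. It reports repeated row labels and repeated column labels separately, because reset_index() only renumbers rows and leaves column labels untouched.

```python
import pandas as pd


def find_duplicate_labels(frames):
    """Map each frame's list position to its repeated row and column labels.

    Hypothetical helper: frames with no repeated labels are left out.
    """
    report = {}
    for pos, frame in enumerate(frames):
        dup_rows = frame.index[frame.index.duplicated()].unique().tolist()
        dup_cols = frame.columns[frame.columns.duplicated()].unique().tolist()
        if dup_rows or dup_cols:
            report[pos] = {"rows": dup_rows, "columns": dup_cols}
    return report


# Made-up frames: one clean, one with a repeated column, one with a repeated row label.
clean = pd.DataFrame({"a": [1], "b": [2]})
dup_cols = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "a"])
dup_rows = pd.DataFrame({"a": [1, 2]}, index=[0, 0])

report = find_duplicate_labels([clean, dup_cols, dup_rows])
print(report)

# reset_index(drop=True) replaces the row labels with 0..n-1.
fixed_rows = dup_rows.reset_index(drop=True)
print(fixed_rows.index.is_unique)  # True
```

Running the helper on the df list just before the final pd.concat call would show which article, if any, carries the repeated labels.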