Question about printing each output layer's shape in OpenCV YOLO!

Question

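Hello! This may be a trivial question, but I'm curious, so I'm asking!

When I write the single-image Object Detection code as individual cells, without wrapping it in a function, and print the output shapes, all three shapes are printed, one per output layer! (They are marked with the red box!)

But when I wrap the same code in a function, as shown below, and print the shapes, only (8112, 85), the shape of the last of the three output layers, is printed three times. Why does this happen? The Object Detection result itself is not wrong: it is identical to what I get when running the cells without the function. Only the printed shapes seem to be off. Why is that?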

Answer

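There is a typo.

In the for statement below, the loop variables should be idx, output, but they were written as idx, ouput. Change ouput to output and it will work (the inner loop iterates over ouput too, so rename it there as well).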

for idx, ouput in enumerate(cv_out):
    print('output shape:', output.shape)
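
The typo prints (8112, 85) three times instead of raising a NameError, and a likely explanation is that the notebook had already run the non-function cells. Those cells leave a global variable named output bound to the last output layer, and inside the function the misspelled loop never rebinds it, so Python falls back to that stale global on every iteration. The sketch below reproduces the effect with a hypothetical FakeArray stand-in instead of real OpenCV outputs. The first two shapes, (507, 85) and (2028, 85), are assumed values for a 416x416 YOLOv3 model; only (8112, 85) appears in the question.

```python
# Hypothetical stand-in for the arrays returned by cv_net.forward();
# only the .shape attribute matters for this demonstration.
class FakeArray:
    def __init__(self, shape):
        self.shape = shape

# Assumed shapes for the three output layers; only (8112, 85) is from the question.
cv_out = [FakeArray((507, 85)), FakeArray((2028, 85)), FakeArray((8112, 85))]

# Earlier notebook cell: a module-level loop leaves `output`
# bound to the LAST element after it finishes.
for idx, output in enumerate(cv_out):
    pass

def shapes_with_typo(outs):
    seen = []
    for idx, ouput in enumerate(outs):  # typo: the loop variable is `ouput`
        seen.append(output.shape)       # `output` resolves to the stale global
    return seen

def shapes_fixed(outs):
    seen = []
    for idx, output in enumerate(outs):  # local `output` shadows the global
        seen.append(output.shape)
    return seen

print(shapes_with_typo(cv_out))  # the last shape, repeated once per layer
print(shapes_fixed(cv_out))      # one distinct shape per layer
```

This would also explain why the detections are still correct: the inner loop reads from ouput, which does hold the right array for each layer, so only the print statement sees the stale value. In a fresh kernel where no global output exists, the same function would raise a NameError instead.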

Answer

Sorry for the late reply. I've attached the code below!

# Wrap single-image YOLO Object Detection in a function
import numpy as np
import time
import os

def get_detected_img(cv_net, img_array, conf_threshold, nms_threshold,
                     use_copied_img=True, is_print=True):
  # Results must be mapped back to the original image size!
  # -> In the array, rows mean height; the x coordinates returned by detection refer to width. Don't mix them up!
  height = img_array.shape[0]
  width = img_array.shape[1]
  draw_img = None
  if use_copied_img:
    draw_img = img_array.copy()
  else:
    draw_img = img_array

  # Get the 3 YOLO output layers
  layer_names = cv_net.getLayerNames()
  outlayer_names = [layer_names[i[0] - 1] for i in cv_net.getUnconnectedOutLayers()]
  #print('out layer names:', outlayer_names)
  # Feed the input image to the loaded YOLO model
  cv_net.setInput(cv2.dnn.blobFromImage(img_array, scalefactor=1/255.,
                                        size=(416, 416), swapRB=True, crop=False))
  # Run Object Detection, passing in the output layers! -> returns one output per layer passed in
  start = time.time()
  cv_out = cv_net.forward(outlayer_names)

  green, red = (0, 255, 0), (0, 0, 255)
  class_ids = []
  confidences = []
  boxes = []
  # print('type cv_out:', type(cv_out))
  # Loop over the 3 output layers one by one
  for idx, ouput in enumerate(cv_out):
    print('output shape:', output.shape)
    # Process the Object Detection results of each output layer
    for idx2, detection in enumerate(ouput):
      scores = detection[5:]  # per-class scores for the 80 classes
      class_id = np.argmax(scores)  # id of the highest-scoring class
      confidence = scores[class_id] # confidence score of the highest-scoring class

      if confidence > conf_threshold:
        # Convert the scaled coordinates (scaled center_x, center_y, width, height)
        center_x = int(detection[0] * width)
        center_y = int(detection[1] * height)
        o_width = int(detection[2] * width)
        o_height = int(detection[3] * height)
        # Top-left corner
        left = int(center_x - o_width/2)
        top = int(center_y - o_height/2)

        class_ids.append(class_id)
        confidences.append(float(confidence)) # store confidence as a plain float (not np.float)
        boxes.append([left, top, o_width, o_height])
  # Run NMS
  optimal_idx = cv2.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
  # Draw each of the bounding boxes kept by NMS
  if len(optimal_idx) > 0:
    for i in optimal_idx.flatten():
      class_id = class_ids[i]
      confidence = confidences[i]
      box = boxes[i]
      left = int(box[0])
      top = int(box[1])
      right = int(left + box[2])
      bottom = int(top + box[3])
      caption = f'{labels_to_names_seq[class_id]}: {confidence :.3f}'
      # Draw the box and add the caption
      cv2.rectangle(draw_img, (left, top), (right, bottom),
                    green, thickness=2)
      cv2.putText(draw_img, caption, (left, top-5), cv2.FONT_HERSHEY_COMPLEX,
                  0.4, red, 1)
  if is_print:
    print('Detection time:', time.time() - start, 'seconds')

  return draw_img

Answer

Hello. If you post the source code showing how you wrote the function, I'll take a look.

밑바닥개발자
