  • 9/28 - Analysis of the modified logic's results
    Dev log 2024. 9. 28. 06:28

    I just tried it out.

     

    The amount of computation doesn't seem large, but it takes noticeably longer than the original approach, possibly because multiprocessing isn't working well.

     

    I applied it only to the Front and Back cameras.

     

    It shows slightly better numbers (a quick delta check follows the two result blocks below).

     

    Results from my modified code

    mAP: 0.4622
    mATE: 0.5431
    mASE: 0.4506
    mAOE: 0.5717
    mAVE: 0.3964
    mAAE: 0.2871
    NDS: 0.5062
    Eval time: 5.2s
    
    Per-class results:
    Object Class    AP      ATE     ASE     AOE     AVE     AAE
    car     0.730   0.317   0.149   0.068   0.095   0.083
    truck   0.656   0.175   0.130   0.037   0.033   0.000
    bus     0.741   0.521   0.124   0.071   0.457   0.052
    trailer 0.000   1.000   1.000   1.000   1.000   1.000
    construction_vehicle    0.000   1.000   1.000   1.000   1.000   1.000
    pedestrian      0.639   0.452   0.257   0.336   0.216   0.162
    motorcycle      0.684   0.404   0.292   1.084   0.055   0.000
    bicycle 0.377   0.435   0.206   0.548   0.315   0.000
    traffic_cone    0.797   0.127   0.349   nan     nan     nan
    barrier 0.000   1.000   1.000   1.000   nan     nan

     

    Original approach

    mAP: 0.4603
    mATE: 0.5421
    mASE: 0.4522
    mAOE: 0.5639
    mAVE: 0.3982
    mAAE: 0.2864
    NDS: 0.5059
    Eval time: 5.0s
    
    Per-class results:
    Object Class    AP      ATE     ASE     AOE     AVE     AAE
    car     0.740   0.312   0.150   0.067   0.095   0.078
    truck   0.650   0.174   0.142   0.040   0.045   0.000
    bus     0.756   0.521   0.124   0.071   0.454   0.052
    trailer 0.000   1.000   1.000   1.000   1.000   1.000
    construction_vehicle    0.000   1.000   1.000   1.000   1.000   1.000
    pedestrian      0.639   0.454   0.255   0.333   0.217   0.161
    motorcycle      0.674   0.394   0.293   1.096   0.054   0.000
    bicycle 0.342   0.438   0.209   0.469   0.321   0.000
    traffic_cone    0.802   0.128   0.349   nan     nan     nan
    barrier 0.000   1.000   1.000   1.000   nan     nan
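
    To put "slightly better" in numbers, here is a tiny sketch that diffs the two summary blocks above (values copied verbatim from this post):

    # Summary deltas: modified minus baseline, from the two evaluation runs above.
    modified = {"mAP": 0.4622, "mATE": 0.5431, "mASE": 0.4506, "mAOE": 0.5717,
                "mAVE": 0.3964, "mAAE": 0.2871, "NDS": 0.5062}
    baseline = {"mAP": 0.4603, "mATE": 0.5421, "mASE": 0.4522, "mAOE": 0.5639,
                "mAVE": 0.3982, "mAAE": 0.2864, "NDS": 0.5059}

    for name in modified:
        print(f"{name}: {modified[name] - baseline[name]:+.4f}")
    # mAP +0.0019 and NDS +0.0003, while mATE/mAOE/mAAE get slightly worse.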

     

     

    • AP (Average Precision); mAP is the mean AP over all classes
    • ATE (Average Translation Error)
    • ASE (Average Scale Error)
    • AOE (Average Orientation Error)
    • AVE (Average Velocity Error)
    • AAE (Average Attribute Error): 1 minus the attribute classification accuracy (1 - acc)
    • NDS (nuScenes Detection Score)
      • A weighted sum of mAP, mATE, mASE, mAOE, mAVE, and mAAE
      • Each TP error is converted into a TP score
      • mAP gets a weight of 5 and each of the other metrics a weight of 1 (see the quick check after this list)
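
    As a sanity check, here is a minimal sketch that recomputes NDS from the summary numbers of the modified run above, assuming the standard nuScenes convention of converting each TP error into a TP score as 1 - min(1, error):

    # Recompute NDS for the modified run from the values quoted above.
    mAP = 0.4622
    tp_errors = {"mATE": 0.5431, "mASE": 0.4506, "mAOE": 0.5717,
                 "mAVE": 0.3964, "mAAE": 0.2871}

    # Each TP error is mapped to a TP score in [0, 1].
    tp_scores = [max(1.0 - min(1.0, err), 0.0) for err in tp_errors.values()]

    # mAP is weighted by 5, each TP score by 1, normalized by the total weight of 10.
    nds = (5 * mAP + sum(tp_scores)) / 10
    print(f"NDS = {nds:.4f}")  # prints 0.5062, matching the reported value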

     

    The metrics above are explained well at the link below.

    https://velog.io/@happy_quokka/%EB%94%A5%EB%9F%AC%EB%8B%9D-%ED%94%84%EB%A1%9C%EC%A0%9D%ED%8A%B8-4.-Center-Point-%EB%AA%A8%EB%8D%B8-%EC%82%AC%EC%9A%A9%ED%95%98%EC%97%AC-object-detection-%EC%88%98%ED%96%89%ED%95%98%EA%B8%B0-%EA%B2%B0%EA%B3%BC#tracking-%EA%B2%B0%EA%B3%BC

     


     

    Front-facing targets are extremely important, and the modified logic detects them well.

     

    After the logic modification

     

    Original logic

     

    python3.8/site-packages/mmdet/apis/test.py, function single_gpu_test

    # Imports this modified version relies on. The mmcv/mmdet ones already sit at the
    # top of mmdet/apis/test.py; the rest are needed by the added 2D/3D fusion logic.
    import copy
    import os.path as osp

    import cv2
    import mmcv
    import numpy as np
    import torch
    from mmcv.image import tensor2imgs
    from mmdet.core import encode_mask_results
    from PIL import Image
    from torchvision import transforms
    # box3d_to_corners is a project-specific helper (Sparse4D-style 3D-box-to-corners
    # conversion) and must be importable from the surrounding project.


    def single_gpu_test(model,
                        model_img,
                        data_loader,
                        show=False,
                        out_dir=None,
                        show_score_thr=0.3):
        model.eval()
        results = []
        dataset = data_loader.dataset
        PALETTE = getattr(dataset, 'PALETTE', None)
        prog_bar = mmcv.ProgressBar(len(dataset))
        model_img.eval()    
        # BGR to RGB conversion
        ID_COLOR_MAP = [
            (59, 59, 238), # vibrant blue
            (0, 255, 0),   # green
            (0, 0, 255),   # blue
            (255, 255, 0), # Yellow
            (0, 255, 255), # Cyan
            (255, 0, 255), # Magenta
            (255, 255, 255),# White
            (0, 127, 255),  # medium sky blue
            (71, 130, 255), # medium cornflower blue
            (127, 127, 0),  # Olive or dark yellow-green
        ]
        ID_COLOR_MAP_RGB = [(r, g, b) for (b, g, r) in ID_COLOR_MAP]    
        transform_img = transforms.Compose([transforms.ToTensor()])
        camera_sensors = ['CAM_FRONT', 'CAM_FRONT_RIGHT','CAM_FRONT_LEFT', 'CAM_BACK', 'CAM_BACK_LEFT','CAM_BACK_RIGHT']
        target_classes = [0,1,2,3,4,6,7,8,9,10,11,12,13,14,15,16,17]
        for i, data in enumerate(data_loader):
            
            with torch.no_grad():
                result = model(return_loss=False, rescale=True, **data)
    
            if False:  # gate for the added 2D/3D fusion logic; set to True to enable it
                raw_imgs = data['img'].data[0][0].permute(0, 2, 3, 1).cpu().numpy()            
                
    
                # raw_imgs = data2["img"][0].permute(0, 2, 3, 1).cpu().numpy()
                # Stitch the 6 camera images into a single canvas
                imgs = []
                mean = [123.675, 116.28 , 103.53 ] # np.array(cfg.img_norm_cfg["mean"])
                std = [58.395, 57.12 , 57.375] # np.array(cfg.img_norm_cfg["std"])
                img_norm_mean = mean#np.array(cfg.img_norm_cfg["mean"])
                img_norm_std = std#np.array(cfg.img_norm_cfg["std"])
                
                image = raw_imgs * img_norm_std + img_norm_mean
                
                for k in range(6):
    
                    img = image[k]
                    
                    if k > 2:
                        img = cv2.flip(img, 1)
                        
                    resized_img = cv2.resize(img,(1600, 900))
                    ubyte_img = resized_img.astype(np.uint8)
                    # rgb_img = cv2.cvtColor(ubyte_img, cv2.COLOR_BGR2RGB)
                    imgs.append(ubyte_img)        
                    
                images = np.concatenate(
                    [
                        np.concatenate([imgs[2], imgs[0], imgs[1]], axis=1),
                        np.concatenate([imgs[4], imgs[3], imgs[5]], axis=1),
                    ],
                    axis=0,
                ) 
    
    
                # Up to here is the raw model output.
                # result = result[i]["img_bbox"]
                vis_score_threshold = 0.3
                indices = [i for i, score in enumerate(result[0]["img_bbox"]["scores_3d"]) if score > 0.15]
                pred_bboxes_3d = result[0]["img_bbox"]["boxes_3d"][indices]#[result["scores_3d"] > vis_score_threshold]
                
                # Run the additional 2D detection DNN
                img_pil = Image.fromarray(images)
                image_tensor = transform_img(img_pil).unsqueeze(0)
                
                with torch.no_grad():
                    predictions = model_img(image_tensor)
    
                corners_3d = box3d_to_corners(pred_bboxes_3d)
                ratio_x = 1600/704
                ratio_y = 900/256   
                num_bbox = corners_3d.shape[0]
                pts_4d = np.concatenate(
                    [corners_3d.reshape(-1, 3), np.ones((num_bbox * 8, 1))], axis=-1
                )    
                imgfov_pts_2d_CAM = []
                for k, key in enumerate(camera_sensors):
                    lidar2img_rt = copy.deepcopy(data["projection_mat"][0][k]).reshape(4, 4)
                    if isinstance(lidar2img_rt, torch.Tensor):
                        lidar2img_rt = lidar2img_rt.cpu().numpy()
                    pts_2d = pts_4d @ lidar2img_rt.T
    
                    pts_2d[:, 2] = np.clip(pts_2d[:, 2], a_min=1e-5, a_max=1e5)
                    pts_2d[:, 0] /= pts_2d[:, 2]
                    pts_2d[:, 1] /= pts_2d[:, 2]
                    imgfov_pts_2d = pts_2d[..., :2].reshape(num_bbox, 8, 2)        
                    imgfov_pts_2d_CAM.append(imgfov_pts_2d)
    
                # Match the 2D detector results against the projected 3D boxes
                fastrnn_boxes = []
                show_score_thr=0.5
                for idx, (box, label, score) in enumerate(zip(predictions[0]['boxes'], predictions[0]['labels'], predictions[0]['scores'])):
                    if score > show_score_thr and label in target_classes:
                        x1, y1, x2, y2 = box
                            
                        box1 = [
                            (x1, y1),  # top-left
                            (x2, y1),  # top-right
                            (x1, y2),  # bottom-left
                            (x2, y2)   # bottom-right
                        ]
                        fastrnn_boxes.append([box1,score,label,False])
                
                # Project the 3D boxes into the stitched image and build matching 2D boxes
                h, w = 256,704
                sparse4d_boxes = []
                temp_camera_sensors = ['CAM_FRONT','CAM_BACK']
                for k, key in enumerate(temp_camera_sensors):    
                    if 'CAM_FRONT' == key:
                        offset_x = 1600
                        offset_y = 0
                    elif 'CAM_FRONT_RIGHT' == key:
                        offset_x = 3200
                        offset_y = 0
                    elif 'CAM_FRONT_LEFT' == key:
                        offset_x = 0
                        offset_y = 0
                    elif 'CAM_BACK' == key:
                        offset_x = 3200
                        offset_y = 900
                    elif 'CAM_BACK_LEFT' == key:
                        offset_x = 1600
                        offset_y = 900
                    elif 'CAM_BACK_RIGHT' == key:                              
                        offset_x = 4800
                        offset_y = 900
    
                    if 'BACK' in key:
                        sign = -1
                    else:
                        sign = 1        
                        
                    for ii in range(num_bbox):
                        corners = np.clip(imgfov_pts_2d_CAM[k][ii], -1e4, 1e5).astype(np.int32)
                        if sign > 0:
                            box2 = [
                                (int(corners[2][0]*ratio_x+offset_x), int(corners[2][1]*ratio_y+offset_y)),  # top-left
                                (int(corners[5][0]*ratio_x+offset_x), int(corners[5][1]*ratio_y+offset_y)),  # top-right
                                (int(corners[3][0]*ratio_x+offset_x), int(corners[3][1]*ratio_y+offset_y)),  # bottom-left
                                (int(corners[4][0]*ratio_x+offset_x), int(corners[4][1]*ratio_y+offset_y)),  # bottom-right
                            ]
                        else:
                            # The back-camera images were flipped horizontally above, so mirror x
                            box2 = [
                                (offset_x - int(corners[2][0]*ratio_x), int(corners[2][1]*ratio_y+offset_y)),  # top-left
                                (offset_x - int(corners[5][0]*ratio_x), int(corners[5][1]*ratio_y+offset_y)),  # top-right
                                (offset_x - int(corners[3][0]*ratio_x), int(corners[3][1]*ratio_y+offset_y)),  # bottom-left
                                (offset_x - int(corners[4][0]*ratio_x), int(corners[4][1]*ratio_y+offset_y)),  # bottom-right
                            ]
                        # Keep the projected box, its 3D detection score, and its index into `indices`
                        score_3d = float(result[0]["img_bbox"]["scores_3d"][indices[ii]])
                        sparse4d_boxes.append([box2, score_3d, ii, False])
                
                
                for idx1, fastrnn_box in enumerate(fastrnn_boxes):
                    box1,score,label,checked = fastrnn_box
                    if checked == False:
                        for idx2, sparse4d_box in enumerate(sparse4d_boxes):
                            box2,score2,label2,checked2 = sparse4d_box
                            if checked2 == False:
                                
                                # box1 = sparse4d_boxes[8][0]
                                # box2 = fastrnn_boxes[4][0]                   
                                
                                # Extract the top-left and bottom-right corners of each rectangle
                                (x1_min, y1_min), (x1_max, y1_max) = box1[0], box1[3]
                                (x2_min, y2_min), (x2_max, y2_max) = box2[0], box2[3]
    
                                # Corners of the intersection region
                                inter_x_min = max(x1_min, x2_min)  # intersection top-left x
                                inter_y_min = max(y1_min, y2_min)  # intersection top-left y
                                inter_x_max = min(x1_max, x2_max)  # intersection bottom-right x
                                inter_y_max = min(y1_max, y2_max)  # intersection bottom-right y
    
                                # Intersection area
                                inter_area = max(0, inter_x_max - inter_x_min) * max(0, inter_y_max - inter_y_min)
    
                                # Area of each rectangle
                                area_box1 = (x1_max - x1_min) * (y1_max - y1_min)
                                area_box2 = (x2_max - x2_min) * (y2_max - y2_min)
    
                                # Union area
                                union_area = area_box1 + area_box2 - inter_area
    
                                # IoU
                                if union_area > 0:
                                    iou = inter_area / union_area
                                else:
                                    iou = 0
    
                                # When the overlap exceeds the threshold, treat it as a match
                                threshold = 0.3  # require at least 30% IoU
                                if iou > threshold:
                                    fastrnn_boxes[idx1][3] = True
                                    sparse4d_boxes[idx2][3] = True
                                    iii = sparse4d_boxes[idx2][2]
                                    result[0]["img_bbox"]["scores_3d"][indices[iii]] = torch.tensor(0.5)  # overwrite the matched 3D box's confidence with 0.5
                                    break
                                    #pred_bboxes_3d = torch.cat((pred_bboxes_3d, pred_bboxes_outofspec_3d[buf_index].unsqueeze(0)), dim=0)
                                    #pred_bboxes_3d = result["boxes_3d"]#[result["scores_3d"] > vis_score_threshold]
    
            batch_size = len(result)
            if show or out_dir:
                if batch_size == 1 and isinstance(data['img'][0], torch.Tensor):
                    img_tensor = data['img'][0]
                else:
                    img_tensor = data['img'][0].data[0]
                img_metas = data['img_metas'][0].data[0]
                imgs = tensor2imgs(img_tensor, **img_metas[0]['img_norm_cfg'])
                assert len(imgs) == len(img_metas)
    
                for i, (img, img_meta) in enumerate(zip(imgs, img_metas)):
                    h, w, _ = img_meta['img_shape']
                    img_show = img[:h, :w, :]
    
                    ori_h, ori_w = img_meta['ori_shape'][:-1]
                    img_show = mmcv.imresize(img_show, (ori_w, ori_h))
    
                    if out_dir:
                        out_file = osp.join(out_dir, img_meta['ori_filename'])
                    else:
                        out_file = None
    
                    model.module.show_result(
                        img_show,
                        result[i],
                        bbox_color=PALETTE,
                        text_color=PALETTE,
                        mask_color=PALETTE,
                        show=show,
                        out_file=out_file,
                        score_thr=show_score_thr)
    
            # encode mask results
            if isinstance(result[0], tuple):
                result = [(bbox_results, encode_mask_results(mask_results))
                          for bbox_results, mask_results in result]
            # This logic is only used in panoptic segmentation test.
            elif isinstance(result[0], dict) and 'ins_results' in result[0]:
                for j in range(len(result)):
                    bbox_results, mask_results = result[j]['ins_results']
                    result[j]['ins_results'] = (bbox_results,
                                                encode_mask_results(mask_results))
    
            results.extend(result)
    
            for _ in range(batch_size):
                prog_bar.update()
        return results
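
    For context, here is a hypothetical sketch of how this modified single_gpu_test might be wired up. It is not the author's exact test script: the 3D model and data_loader come from the usual mmdet test entry point, and a torchvision Faster R-CNN is assumed as the extra 2D detector (model_img), which matches the predictions[0]['boxes'] / ['labels'] / ['scores'] access pattern and the COCO-style target_classes ids used above.

    # Hypothetical wiring of the modified single_gpu_test (assumptions noted above).
    import torchvision

    # Extra 2D detector; its output format matches how model_img is consumed in the code.
    model_img = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

    # `model` and `data_loader` are assumed to be built as in the standard mmdet
    # test script (build_detector + MMDataParallel, build_dataloader).
    outputs = single_gpu_test(model, model_img, data_loader,
                              show=False, out_dir=None, show_score_thr=0.3)
    # `outputs` is then passed to dataset.evaluate(...) to produce the tables above.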