8/11 - [26/10K] - Understanding lidar2img_rts (feat. coordinate transformation)

Dev Log

wandering developer 2024. 8. 11. 05:21

Target time	10000 : 00
Total time	26 : 20
Study time	1 : 20
Start time	05 : 08
End time	06 : 27

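Goal: understand the projection matrix

First, if you have the sensor2lidar values, the matrix can be computed as shown below.

Note that intrinsic is a 3x3 matrix, so intrinsic.shape[0] and intrinsic.shape[1] are both 3.

The lidar2img_rts variable seems to be used when drawing the model's outputs onto the image.

The model's outputs are 3D and the image is 2D, so this matrix is needed to go from one to the other.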

## nuscenes_3d_det_track_dataset.py

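# obtain lidar to image transformation matrix
# sensor2lidar maps camera -> LiDAR, so the inverse rotation maps LiDAR -> camera
lidar2cam_r = np.linalg.inv(cam_info["sensor2lidar_rotation"])
lidar2cam_t = (
    cam_info["sensor2lidar_translation"] @ lidar2cam_r.T
)
# filled in row-vector layout: rotation transposed, translation in the bottom row
lidar2cam_rt = np.eye(4)
lidar2cam_rt[:3, :3] = lidar2cam_r.T
lidar2cam_rt[3, :3] = -lidar2cam_t
intrinsic = copy.deepcopy(cam_info["cam_intrinsic"])
cam_intrinsic.append(intrinsic)
# pad the 3x3 intrinsic matrix to 4x4
viewpad = np.eye(4)
viewpad[: intrinsic.shape[0], : intrinsic.shape[1]] = intrinsic
# transpose back to column-vector layout, then apply the intrinsics
lidar2img_rt = viewpad @ lidar2cam_rt.T
lidar2img_rts.append(lidar2img_rt)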
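
To make the role of lidar2img_rt concrete, here is a minimal sketch that repeats the same steps and projects a single LiDAR point to pixel coordinates. The calibration values are made-up numbers for illustration (not nuScenes data), and `build_lidar2img` and `project` are hypothetical helper names that do not exist in the dataset code.

```python
import numpy as np

def build_lidar2img(sensor2lidar_rotation, sensor2lidar_translation, intrinsic):
    """Hypothetical helper: same steps as the dataset code above."""
    lidar2cam_r = np.linalg.inv(sensor2lidar_rotation)
    lidar2cam_t = sensor2lidar_translation @ lidar2cam_r.T
    lidar2cam_rt = np.eye(4)
    lidar2cam_rt[:3, :3] = lidar2cam_r.T
    lidar2cam_rt[3, :3] = -lidar2cam_t
    viewpad = np.eye(4)
    viewpad[: intrinsic.shape[0], : intrinsic.shape[1]] = intrinsic
    return viewpad @ lidar2cam_rt.T

def project(lidar2img, point_lidar):
    """Hypothetical helper: project one 3D LiDAR point to (u, v, depth)."""
    x, y, z, _ = lidar2img @ np.append(point_lidar, 1.0)
    return x / z, y / z, z

# Made-up calibration: camera axes aligned with the LiDAR axes (unrealistic,
# but easy to check by hand), camera origin at (0.5, 0, 0) in the LiDAR frame.
rotation = np.eye(3)
translation = np.array([0.5, 0.0, 0.0])
intrinsic = np.array([[1000.0, 0.0, 800.0],
                      [0.0, 1000.0, 450.0],
                      [0.0, 0.0, 1.0]])

lidar2img = build_lidar2img(rotation, translation, intrinsic)
u, v, depth = project(lidar2img, np.array([1.5, 1.0, 10.0]))
print(u, v, depth)
```

With these numbers the point sits at (1, 1, 10) in the camera frame, so the pinhole model gives u = 1000 * 1/10 + 800 and v = 1000 * 1/10 + 450.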

Knowledge 1

How to compute the translation vector for the inverse transform

The concept of a transformation matrix

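When transforming between coordinate frames in 3D space, there are two main components: rotation and translation. A transformation matrix T is usually written by combining a rotation matrix R and a translation vector t in the following form:

$$T = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix}$$

where: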

R is the rotation matrix, which rotates the coordinate frame.
t is the translation vector, which shifts the origin of the coordinate frame.
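
As a small illustration of combining R and t into one matrix, the sketch below packs them into a 4x4 array and applies it to a point. The helper name `make_transform` and all numbers are assumptions chosen for the example.

```python
import numpy as np

def make_transform(R, t):
    """Hypothetical helper: build the 4x4 matrix [[R, t], [0, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Made-up values: a 90-degree rotation about the z axis plus a shift.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T = make_transform(R, t)

p = np.array([1.0, 0.0, 0.0])
# Append 1 to get homogeneous coordinates, multiply, drop the last entry.
p_new = (T @ np.append(p, 1.0))[:3]
print(p_new)
```

Multiplying by T in homogeneous coordinates gives the same result as computing `R @ p + t` directly.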

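Role of the translation vector in a coordinate transform

When thinking about the transform between two coordinate frames, the translation vector t describes where the origin of one frame lies as seen from the other frame. For example, when transforming between LiDAR and camera: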

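LiDAR coordinate frame to camera coordinate frame:
- Apply the rotation matrix R to rotate the coordinates.
- Add the translation vector t to shift them.
Camera coordinate frame to LiDAR coordinate frame:
- Conversely, apply the transpose of the rotation matrix, $R^T$ (for a rotation matrix the transpose is the inverse).
- The translation also reverses direction, but it has to be rotated too: the translation of the inverse transform is $-R^T t$, which is the same as subtracting t first and then applying $R^T$.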
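
The inverse can be checked numerically. The sketch below uses made-up values: the forward transform is `p_cam = R @ p_lidar + t`, and the inverse is written as a single rotation plus translation, where the translation term is `-R.T @ t`. It also shows that using plain `-t` does not recover the original point once R is not the identity.

```python
import numpy as np

# Made-up LiDAR -> camera transform: 90 degrees about z, then a shift.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

p_lidar = np.array([4.0, 5.0, 6.0])
p_cam = R @ p_lidar + t           # forward: LiDAR -> camera

R_inv = R.T                       # inverse rotation
t_inv = -R.T @ t                  # inverse translation (rotated and negated)
p_back = R_inv @ p_cam + t_inv    # inverse: camera -> LiDAR

p_wrong = R.T @ p_cam - t         # negating t without rotating it

print(p_back, p_wrong)
```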

Knowledge 2

Answer to the question about the viewpad @ lidar2cam_rt.T operation:
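
One way to read this expression, sketched with made-up numbers: lidar2cam_rt is filled in row-vector layout (rotation transposed, translation in the bottom row), so on its own it fits `points @ lidar2cam_rt`. Transposing it gives the usual column-vector matrix with the translation in the last column, which can then be left-multiplied by viewpad. The names `rt_row` and `rt_col` are assumptions for this example.

```python
import numpy as np

# Made-up LiDAR -> camera rotation and translation.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.1, 0.2, 0.3])

# Row-vector layout: used as `p_row @ rt_row`.
rt_row = np.eye(4)
rt_row[:3, :3] = R.T
rt_row[3, :3] = t

# Column-vector layout: used as `rt_col @ p_col`.
rt_col = np.eye(4)
rt_col[:3, :3] = R
rt_col[:3, 3] = t

p = np.array([1.0, 2.0, 3.0, 1.0])  # homogeneous point
via_row = p @ rt_row
via_col = rt_col @ p
print(via_row, via_col)
```

Both layouts describe the same transform, and `rt_row.T` is exactly `rt_col`.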

Computing sensor2lidar_rotation

## Found in nuscenes_converter.py.

sd_rec = nusc.get("sample_data", sensor_token)
cs_record = nusc.get(
    "calibrated_sensor", sd_rec["calibrated_sensor_token"]
)
pose_record = nusc.get("ego_pose", sd_rec["ego_pose_token"])
data_path = str(nusc.get_sample_data_path(sd_rec["token"]))
if os.getcwd() in data_path:  # path from lyftdataset is absolute path
    data_path = data_path.split(f"{os.getcwd()}/")[-1]  # relative path
sweep = {
    "data_path": data_path,
    "type": sensor_type,
    "sample_data_token": sd_rec["token"],
    "sensor2ego_translation": cs_record["translation"],
    "sensor2ego_rotation": cs_record["rotation"],
    "ego2global_translation": pose_record["translation"],
    "ego2global_rotation": pose_record["rotation"],
    "timestamp": sd_rec["timestamp"],
}
l2e_r_s = sweep["sensor2ego_rotation"]
l2e_t_s = sweep["sensor2ego_translation"]
e2g_r_s = sweep["ego2global_rotation"]
e2g_t_s = sweep["ego2global_translation"]

# obtain the RT from sensor to Top LiDAR
# sweep->ego->global->ego'->lidar
l2e_r_s_mat = Quaternion(l2e_r_s).rotation_matrix
e2g_r_s_mat = Quaternion(e2g_r_s).rotation_matrix
R = (l2e_r_s_mat.T @ e2g_r_s_mat.T) @ (
    np.linalg.inv(e2g_r_mat).T @ np.linalg.inv(l2e_r_mat).T
)
T = (l2e_t_s @ e2g_r_s_mat.T + e2g_t_s) @ (
    np.linalg.inv(e2g_r_mat).T @ np.linalg.inv(l2e_r_mat).T
)
T -= (
    e2g_t @ (np.linalg.inv(e2g_r_mat).T @ np.linalg.inv(l2e_r_mat).T)
    + l2e_t @ np.linalg.inv(l2e_r_mat).T
)
sweep["sensor2lidar_rotation"] = R.T  # points @ R.T + T
sweep["sensor2lidar_translation"] = T

Written out as formulas, the code above looks like this:

R = (l2e_r_s_mat.T @ e2g_r_s_mat.T) @ ( np.linalg.inv(e2g_r_mat).T @ np.linalg.inv(l2e_r_mat).T )

T = (l2e_t_s @ e2g_r_s_mat.T + e2g_t_s) @ ( np.linalg.inv(e2g_r_mat).T @ np.linalg.inv(l2e_r_mat).T )

T -= ( e2g_t @ (np.linalg.inv(e2g_r_mat).T @ np.linalg.inv(l2e_r_mat).T) + l2e_t @ np.linalg.inv(l2e_r_mat).T )

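The formulas can be written as above, but they are hard to understand at a glance.

Reading through them slowly is enough.

In summary: the camera's coordinates are transformed into the LiDAR frame, and then the LiDAR-side translation terms are subtracted. That is the way to read it.

There is no direct path between the two sensors, though, so the transform goes camera --> ego --> global --> ego --> lidar.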
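
The chain can be checked against a plain composition of 4x4 matrices. The sketch below is an illustration under stated assumptions: rotations come from a hypothetical `rot_z` helper instead of `Quaternion(...).rotation_matrix`, `homo` is also a hypothetical helper, and every pose is a made-up number. The variable names mirror the converter code, where the `_s` suffix marks the sweep sensor (for example a camera) and the names without it belong to the top LiDAR.

```python
import numpy as np

def rot_z(deg):
    """Hypothetical helper: rotation matrix about the z axis."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def homo(R, t):
    """Hypothetical helper: 4x4 matrix [[R, t], [0, 1]]."""
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = t
    return M

# Made-up poses.
l2e_r_s_mat, l2e_t_s = rot_z(30), np.array([1.5, 0.0, 1.6])      # sensor -> ego
e2g_r_s_mat, e2g_t_s = rot_z(80), np.array([100.0, 50.0, 0.0])   # ego -> global (sensor time)
l2e_r_mat, l2e_t = rot_z(-90), np.array([0.9, 0.0, 1.8])         # lidar -> ego
e2g_r_mat, e2g_t = rot_z(78), np.array([99.0, 49.5, 0.0])        # ego -> global (lidar time)

# Same expressions as the converter code (row-vector convention).
to_lidar = np.linalg.inv(e2g_r_mat).T @ np.linalg.inv(l2e_r_mat).T
R = (l2e_r_s_mat.T @ e2g_r_s_mat.T) @ to_lidar
T = (l2e_t_s @ e2g_r_s_mat.T + e2g_t_s) @ to_lidar
T -= e2g_t @ to_lidar + l2e_t @ np.linalg.inv(l2e_r_mat).T
sensor2lidar_rotation = R.T
sensor2lidar_translation = T

# sensor -> ego -> global -> ego' -> lidar, composed right to left.
chain = (np.linalg.inv(homo(l2e_r_mat, l2e_t))
         @ np.linalg.inv(homo(e2g_r_mat, e2g_t))
         @ homo(e2g_r_s_mat, e2g_t_s)
         @ homo(l2e_r_s_mat, l2e_t_s))

print(np.allclose(chain[:3, :3], sensor2lidar_rotation))
print(np.allclose(chain[:3, 3], sensor2lidar_translation))
```

The first part of T moves the sensor origin through ego and global and rotates it into the LiDAR orientation; the `T -=` step removes the ego-to-global and LiDAR-to-ego offsets on the LiDAR side.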