TY - JOUR
T1 - Keypoint Encoding for Improved Feature Extraction From Compressed Video at Low Bitrates
AU - Chao, Jianshu
AU - Steinbach, Eckehard
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2016/1
Y1 - 2016/1
AB - In many mobile visual analysis applications, compressed video is transmitted over a communication network and analyzed by a server. Typical processing steps performed at the server include keypoint detection, descriptor calculation, and feature matching. Video compression has been shown to have an adverse effect on feature-matching performance. The negative impact of compression can be reduced by using the keypoints extracted from the uncompressed video to calculate descriptors from the compressed video. Based on this observation, we propose to provide these keypoints to the server as side information and to extract only the descriptors from the compressed video. First, we introduce four different frame types for keypoint encoding to address different types of changes in video content. These frame types represent a new scene, the same scene, a slowly changing scene, or a rapidly moving scene, and are determined by comparing features between successive video frames. Then, we propose Intra, Skip, and Inter modes of encoding the keypoints for different frame types. For example, keypoints for new scenes are encoded using the Intra mode, and keypoints for unchanged scenes are skipped. As a result, the bitrate of the side information related to keypoint encoding is significantly reduced. Finally, we present pairwise matching and image retrieval experiments conducted to evaluate the performance of the proposed approach using the Stanford mobile augmented reality dataset and 720p format videos. The results show that the proposed approach offers significantly improved feature matching and image retrieval performance at a given bitrate.
KW - Coding
KW - H.265/HEVC
KW - SIFT
KW - keypoints
KW - matching
KW - prediction
KW - retrieval
UR - http://www.scopus.com/inward/record.url?scp=84961700466&partnerID=8YFLogxK
U2 - 10.1109/TMM.2015.2502552
DO - 10.1109/TMM.2015.2502552
M3 - Article
AN - SCOPUS:84961700466
SN - 1520-9210
VL - 18
SP - 25
EP - 39
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
IS - 1
M1 - 7332927
ER -