An Efficient Motion Estimation Method for H.264-Based Video Transcoding with Arbitrary Spatial Resolution Conversion
Date
2007-07-19T15:31:53Z
Authors
Wang, Jiao
Publisher
University of Waterloo
Abstract
As wireless and wired network connectivity is rapidly expanding
and the number of network users is steadily increasing, it has become more
and more important to support universal access to multimedia
content over the whole network. A big challenge, however, is
the great diversity of network devices, from full-screen computers
to small smart phones. This leads to research on transcoding,
which involves efficiently reformatting compressed data from
its original high resolution to a desired spatial resolution
supported by the display device. In particular, there is
great momentum in the multimedia industry for H.264-based
transcoding as H.264 has been widely employed as a mandatory
player feature in applications ranging from television broadcast
to video for mobile devices.
While H.264 contains many new features for effective video
coding with excellent rate-distortion (RD) performance, a major issue
for transcoding H.264 compressed video from one spatial resolution
to another is the computational complexity. Specifically, the
bottleneck is motion-compensated prediction (MCP). MCP is the main
contributor to the excellent RD performance
of H.264 video compression, yet it is very time-consuming. In general,
a brute-force search is used to find the best motion vectors for MCP.
In the scenario of transcoding, however, an immediate idea for
improving the MCP efficiency for the re-encoding procedure is to
utilize the motion vectors in the original compressed stream.
Intuitively, motion in the high resolution scene is highly related
to that in the down-scaled scene.
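
To make the contrast between the two search strategies concrete, the following is a minimal sketch and not the method of the thesis. It assumes frames stored as lists of pixel rows, a sum-of-absolute-differences (SAD) cost, and a simple proportional scaling of the source motion vector; the names `sad`, `search`, and `scale_mv` are hypothetical and do not come from the thesis or from any H.264 codec.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). Frames are lists of pixel rows."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def search(cur, ref, bx, by, n, center, radius):
    """Test every displacement within `radius` of `center` and return the
    (dx, dy) with the lowest SAD, or None if no candidate fits the frame."""
    h, w = len(ref), len(ref[0])
    cx, cy = center
    best, best_cost = None, None
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            # Skip candidates whose reference block leaves the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > w or by + dy + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def scale_mv(mv, src_size, dst_size):
    """Scale a motion vector by the ratio of target to source resolution.
    Sizes are (width, height); the ratio need not be an integer.
    This proportional scaling is an assumption made for illustration."""
    return (round(mv[0] * dst_size[0] / src_size[0]),
            round(mv[1] * dst_size[1] / src_size[1]))
```

With a search radius of 4, a brute-force search around (0, 0) evaluates up to 81 candidates per block, whereas a radius-1 refinement around the scaled source vector evaluates at most 9. The saving depends on how well the scaled vector predicts the true motion, which is the premise stated above.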
In this thesis, we study homogeneous video transcoding from H.264
to H.264. Specifically, for video transcoding with arbitrary
spatial resolution conversion, we propose a motion vector estimation
algorithm based on a multiple linear regression model, which
systematically utilizes the motion information in the original scenes.
We also propose a practical solution for efficiently determining a
reference frame to take advantage of the multiple-reference-frame
feature of H.264. The performance of the algorithm was assessed
in an H.264 transcoder. Experimental results show that, compared
with a benchmark solution, the proposed method significantly reduces
the transcoding complexity without much degradation in video quality.
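
The abstract does not specify the regression model, so the following is only a generic illustration of multiple linear regression applied to motion vectors. It assumes, hypothetically, that one component of a target motion vector is predicted as an intercept plus a weighted sum of the corresponding components of source motion vectors, with weights fitted by ordinary least squares. The functions `fit_least_squares` and `predict` and the choice of features are assumptions, not the thesis's formulation.

```python
def fit_least_squares(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y,
    solved by Gauss-Jordan elimination with partial pivoting.
    Assumes the columns of X are linearly independent."""
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * t for row, t in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coeffs, features):
    """Evaluate the fitted linear model on one feature vector."""
    return sum(c * f for c, f in zip(coeffs, features))


# Hypothetical training data: each row is [1, mv_a, mv_b], where the
# leading 1 is the intercept term and mv_a, mv_b are horizontal components
# of two source motion vectors covering the target block.
X = [[1, 4, 0], [1, 0, 4], [1, 8, 4], [1, 4, 8], [1, -4, 4]]
# Target-resolution components, generated here from 0.5 + 0.25 * (mv_a + mv_b).
y = [1.5, 1.5, 3.5, 3.5, 0.5]

coeffs = fit_least_squares(X, y)
estimate = predict(coeffs, [1, 8, 8])
```

In a transcoder, an estimate of this kind would typically serve as the starting point for a small refinement search instead of a full search, which is where the reduction in complexity would come from.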
Keywords
H.264, video transcoding