Parallel implementation of sample adaptive offset filtering block for low-power HEVC chip
[Abstract] This thesis presents a highly parallelized, low-latency implementation of the Sample Adaptive Offset (SAO) filter, as part of a High Efficiency Video Coding (HEVC) chip under development for use in low-power environments. The SAO algorithm is detailed, and a variant suitable for parallel processing using offset processing blocks is analyzed. Further, the SAO block hardware architecture is discussed, including the pixel producer control module, the 16 parallel pixel processors and the storage modules used to perform SAO. After synthesis, the resulting SAO block is composed of about 36.5 kgates, with a 6 KByte SRAM. Preliminary results yield a low latency of one clock cycle on average (10 ns for a standard 100 MHz clock) per 16 samples processed. This translates to a best-case steady-state throughput of 200 MBytes per second, enough to output 1080p (1920x1080) video at 60 frames per second. Furthermore, this thesis presents the design and implementation of input/output data interfaces for an FPGA-based real-life demo of the aforementioned HEVC chip. Two separate interfaces are described for use on a Xilinx VC707 Evaluation Board, one based on the HDMI protocol and the other based on the SD Card protocol. In particular, the HDMI interface is used to display decoded HEVC video on an HD display at 1080p (1920x1080) resolution with a 60 Hz refresh rate. Meanwhile, the data input system built on top of the SD Card interface provides encoded bitstream data directly to the synthesized HEVC chip via the CABAC engine at rates of up to 1.5 MBytes per second. Finally, verification techniques for the FPGA real-life demo are presented, including the use of the on-board DDR3 RAM of the Xilinx VC707 Evaluation Board.
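The claim that 200 MBytes per second suffices for 1080p video at 60 frames per second can be checked with a short calculation. The sketch below is not taken from the thesis: it assumes 8-bit samples, 4:2:0 chroma subsampling (1.5 bytes per pixel on average) and 1 MByte = 10^6 bytes, and the function name `required_bandwidth` is hypothetical.

```python
# Rough bandwidth check for the 1080p60 claim in the abstract.
# Assumptions (not stated in the abstract): 8-bit samples, 4:2:0 chroma
# subsampling, and 1 MByte = 10**6 bytes.

WIDTH, HEIGHT, FPS = 1920, 1080, 60
BYTES_PER_PIXEL = 1.5            # 1 luma byte + 0.5 chroma bytes per pixel (4:2:0, 8-bit)
CLAIMED_THROUGHPUT = 200e6       # bytes per second, as quoted in the abstract


def required_bandwidth(width, height, fps, bytes_per_pixel):
    """Return the raw sample bandwidth in bytes per second."""
    return width * height * fps * bytes_per_pixel


needed = required_bandwidth(WIDTH, HEIGHT, FPS, BYTES_PER_PIXEL)
headroom = CLAIMED_THROUGHPUT / needed

print(f"required: {needed / 1e6:.1f} MBytes/s")
print(f"claimed:  {CLAIMED_THROUGHPUT / 1e6:.1f} MBytes/s")
print(f"headroom: {headroom:.2f}x")
```

Under these assumptions the required rate is about 186.6 MBytes per second, which the quoted 200 MBytes per second covers with a small margin. A different sample bit depth or chroma format would change the result.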
[Publication date] [Publishing institution] Massachusetts Institute of Technology
[Validity level] [Subject classification]
[Keywords] [Timeliness]