RE: Video Game Changer? 26 Jan 2020 14:06
Qualcomm 2020-01-20
LOW-DELAY BUFFERING MODEL IN VIDEO CODING 2020-01-20
[0050] Video applications that may make use of video encoder 20 and video decoder 30 may include local playback, streaming, broadcast/multicast and conversational applications. Conversational applications include video telephony and video conferencing. Conversational applications are also referred to as low-delay applications, in that such real-time applications are not tolerant to significant delay. For a good user experience, conversational applications require a relatively low end-to-end delay of the entire system, i.e., the delay between the time when a video frame is captured at a source device and the time when the video frame is displayed at a destination device. Typically, an acceptable end-to-end delay for conversational applications should be less than 400 ms. An end-to-end delay of around 150 ms is considered very good.
[0051] Each processing step of a conversational application may contribute to the overall end-to-end delay. Example delays from processing steps include capturing delay, pre-processing delay, encoding delay, transmission delay, reception buffering delay (for de-jittering), decoding delay, decoded picture output delay, post-processing delay, and display delay. Typically, the codec delay (encoding delay, decoding delay and decoded picture output delay) is targeted to be minimized in conversational applications. In particular, the coding structure should ensure that the pictures' decoding order and output order are identical such that the decoded picture output delay is equal to or close to zero.
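To make the delay budget in [0050] and [0051] concrete, here is a rough Python sketch that sums the listed processing-step delays and rates the total against the 400 ms and 150 ms figures. The per-stage numbers are invented for illustration and do not come from the patent; the function names are hypothetical, and treating "around 150 ms" as a hard cutoff at 150 ms is my own simplification.

```python
# Hypothetical per-stage delays in milliseconds (illustrative values only,
# not taken from the patent). The stage names follow paragraph [0051].
STAGE_DELAYS_MS = {
    "capture": 15,
    "pre_processing": 5,
    "encoding": 20,
    "transmission": 60,
    "reception_buffering": 40,  # de-jittering
    "decoding": 10,
    "decoded_picture_output": 0,  # zero when decoding order equals output order
    "post_processing": 5,
    "display": 15,
}

ACCEPTABLE_MS = 400  # [0050]: acceptable delay should be less than 400 ms
VERY_GOOD_MS = 150   # [0050]: "around 150 ms"; hard cutoff is an assumption


def end_to_end_delay_ms(stages):
    """Total delay from capture at the source to display at the destination."""
    return sum(stages.values())


def codec_delay_ms(stages):
    """Codec delay as defined in [0051]: encoding + decoding + picture output."""
    return (
        stages["encoding"]
        + stages["decoding"]
        + stages["decoded_picture_output"]
    )


def rate_delay(total_ms):
    """Classify a total delay against the thresholds quoted in [0050]."""
    if total_ms <= VERY_GOOD_MS:
        return "very good"
    if total_ms < ACCEPTABLE_MS:
        return "acceptable"
    return "too high"


if __name__ == "__main__":
    total = end_to_end_delay_ms(STAGE_DELAYS_MS)
    print(total, codec_delay_ms(STAGE_DELAYS_MS), rate_delay(total))
```

With these made-up numbers the codec accounts for only part of the total, which is why the patent singles out the codec delay as the piece the coding structure can actually minimize.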
[0056] In the AVC and HEVC HRD models, decoding or CPB removal is access unit (AU) based, and it is assumed that picture decoding is instantaneous (e.g., decoding process 104 in FIG. 2 is assumed to be instantaneous). An access unit is a set of network abstraction layer (NAL) units and contains one coded picture. In practical applications, if a conforming decoder strictly follows the decoding times signaled, e.g., in picture timing supplemental enhancement information (SEI) messages generated by video encoder 20, to start decoding of AUs, then the earliest possible time to output a particular decoded picture is equal to the decoding time of that particular picture (i.e., the time when a picture starts to be decoded) plus the time needed for decoding that particular picture. The time needed for decoding a picture in the real world cannot be equal to zero.
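The timing argument in [0056] boils down to one addition, sketched below in Python. The class and function names are hypothetical, the times are invented, and the idealized case assumes a decoded picture output delay of zero (decoding order equal to output order); none of this is the patent's own notation.

```python
from dataclasses import dataclass


@dataclass
class AccessUnit:
    """One coded picture with its signaled timing (hypothetical model)."""
    decoding_time_ms: float    # signaled decoding (CPB removal) time
    decode_duration_ms: float  # real-world time needed to decode the picture


def idealized_output_time_ms(au):
    """HRD view: decoding is instantaneous, so with zero output delay
    (an assumption here) the picture is available at its decoding time."""
    return au.decoding_time_ms


def earliest_real_output_time_ms(au):
    """Real decoder that starts at the signaled time: the picture cannot be
    output before decoding start plus the time spent decoding it."""
    return au.decoding_time_ms + au.decode_duration_ms


if __name__ == "__main__":
    # Illustrative stream at 25 pictures per second (40 ms apart).
    stream = [AccessUnit(0, 12), AccessUnit(40, 9), AccessUnit(80, 15)]
    for au in stream:
        gap = earliest_real_output_time_ms(au) - idealized_output_time_ms(au)
        print(au.decoding_time_ms, earliest_real_output_time_ms(au), gap)
```

The gap between the two functions is exactly the decode duration, which is the extra delay the instantaneous-decoding assumption hides.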
https://worldwide.espacenet.com/publicationDetails/description?CC=DK&NR=2936818T3&KC=T3&FT=D&ND=3&date=20200120&DB=&locale=en_EP#