FAQ about the Transcoding Technique

int 21h · 20th October 2003, 05:06

I've started this thread to answer common questions and dispel the most common misconceptions about transcoding. I'll add more answers as you add more questions.

Q. How does the transcoding process work?

A. To understand how transcoding works, we must first understand how encoding works. MPEG-2 encoding works by exploiting redundancy in two domains: the spatial domain and the temporal domain. First, the encoder performs intra-frame compression (i.e. divide the frame into macroblocks, perform a Discrete Cosine Transform on each 8x8 block within them, then quantize the results, which reduces their accuracy). This yields a set of coefficients describing the frame. Next, using inter-frame encoding, we compute motion vectors that describe how macroblocks move over time. Then, depending on what sort of frame we're encoding, we store the appropriate information in the bitstream. There are three types of frames: I-frames, which use only intra-frame compression and can be decoded without the help of any other frame, and P-frames and B-frames (predictive and bi-directional frames), which store only the motion vectors plus the remaining difference from the frames they reference. As you can imagine, computing the motion vectors is the most time-consuming step because of the many block comparisons involved. (**Please note this is a pretty simplified explanation. I encourage you to check out the MPEG-2 specs if you want many, many more details: http://autumn.idv.tw/ppt/138182.html **)
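
To make the intra-frame steps above a bit more concrete, here is a minimal Python sketch of a DCT followed by quantization on a single 8x8 block. It is only an illustration: the function names `dct_2d` and `quantize` and the single uniform step size are my own assumptions, and a real MPEG-2 encoder uses quantization matrices, a quantizer scale, and a fast DCT rather than this naive one.

```python
import math

N = 8  # the DCT is applied to 8x8 blocks


def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block with orthonormal scaling."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, step):
    """Uniform quantization (assumed for illustration): divide by step, round."""
    return [[int(round(value / step)) for value in row] for row in coeffs]


# A completely flat grey block: all of its energy lands in the DC coefficient.
flat = [[128] * N for _ in range(N)]
coeffs = dct_2d(flat)
levels = quantize(coeffs, 16)
print(levels[0][0])  # DC level; every other level rounds to 0
```

A flat block is the easy case: one non-zero number describes all 64 pixels. Detailed blocks spread energy over many coefficients, and that is where the quantization step decides how much gets thrown away.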

Transcoders work by using the data we already have to produce a slightly less accurate bitstream. Transcoding in the compressed domain, as done by DVD2One, DVDShrink, ReJig (Requant), and others, is done through requantization of the coefficients. Quantization is the process, mentioned above, of discarding some accuracy in order to reduce the number of bits the data uses. For instance, the quantization of an encoder (not MPEG-2) could be defined as sign(x) * (abs(x)/(2*quant)), using integer division, where x is the original number we're quantizing and quant is an integer specifying how much accuracy to discard (i.e. higher quant, lose more of x). By plugging some numbers into that equation, you can see that you lose accuracy in the number you plugged in. MPEG-2 compression works on the same principle. (Quantization is essential because DCT by itself is a nearly reversible process (you only lose some data to precision errors), so it saves very little on its own. Quantization is also what causes the 'blocking' and pixelation errors many people hate to see.) So, keeping the process of quantization in mind, we can reduce a stream's size by simply re-quantizing the coefficients of the bitstream with a coarser quantizer, and by keeping the old motion vectors, we skip the very computationally intensive portion of encoding.
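
Here is a small Python sketch of the example formula above, plus the requantization step a compressed-domain transcoder performs. Treat it as a toy: it uses the non-MPEG-2 formula from this post with integer division, and the `dequantize` and `requantize` helpers (and the simple multiply-back reconstruction) are my own assumptions, not how any particular transcoder is implemented.

```python
def quantize(x, quant):
    """sign(x) * (abs(x) / (2 * quant)) with integer division."""
    sign = (x > 0) - (x < 0)
    return sign * (abs(x) // (2 * quant))


def dequantize(level, quant):
    """Assumed inverse: scale the stored level back up by the step size."""
    return level * 2 * quant


def requantize(level, old_quant, new_quant):
    """Rebuild the approximate coefficient, then quantize it more coarsely."""
    return quantize(dequantize(level, old_quant), new_quant)


x = 100
level = quantize(x, 2)             # 25
coarser = requantize(level, 2, 4)  # 12
print(level, dequantize(level, 2))      # 25 100
print(coarser, dequantize(coarser, 4))  # 12 96  (accuracy lost)
print(quantize(-37, 4), dequantize(quantize(-37, 4), 4))  # -4 -32
```

Note that the requantizer never touches pixels or motion vectors. It only reads the stored levels, rescales them, and writes smaller numbers back, which is why this kind of transcoding is so much faster than a full re-encode.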

Q. Why don't any of the transcoders let me reduce resolution of my final output stream?

A. Because none of the transcoders are currently capable of handling the changes needed to the motion vectors (maybe InstantCopy has the facilities in the code, but the program itself certainly doesn't show the ability). Instead of simply changing the coefficients of the stream, you would also need to resample the motion vectors, and this process would introduce additional noise into the picture. (While I've never seen the output of such a process, I'd expect its quality to be poor.)
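
One way to see where that extra noise could come from: MPEG-2 motion vectors have half-pixel precision, so a vector scaled to a new resolution has to be rounded back onto the half-pel grid. The Python sketch below shows just that rounding loss. The function name `resample_vector` is hypothetical, and a real resolution-changing transcoder would also have to merge the vectors of several source macroblocks into one, which this sketch ignores.

```python
from fractions import Fraction


def resample_vector(mv, scale):
    """Scale a (dx, dy) vector given in half-pel units and round it back
    onto the half-pel grid. Returns (new_vector, rounding_error), with the
    error expressed in half-pel units of the scaled picture."""
    exact = [Fraction(component) * scale for component in mv]
    rounded = [round(value) for value in exact]
    error = [value - r for value, r in zip(exact, rounded)]
    return tuple(rounded), tuple(error)


half = Fraction(1, 2)  # halving the resolution in both directions
print(resample_vector((4, -6), half))  # scales exactly, error is zero
print(resample_vector((5, -3), half))  # both components need rounding
```

Every vector with an odd component ends up pointing at a slightly wrong place in the reference frame, and the stored difference data was computed for the original prediction, so the mismatch shows up as noise.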

Q. What is the Q-Level in Bitrate Viewer and how does it relate to my transcoded (or re-encoded) stream's quality?

A. This is the quantization level of the stream. I believe Bitrate Viewer looks at the quantization levels used and performs some statistical analysis to report an average. For a transcoded stream, this number should be slightly higher than the original stream's because of how the transcoding process works (increasing the quantization level to decrease size). In a re-encoded stream (e.g. CCE, TMPGEnc, etc.) it should be nearly the same, or maybe a little lower, because you're starting with lower data accuracy to begin with, so you can use a lower amount of compression to represent it (e.g. compressing a 10-bit number like 1011100111 to a 4-bit number takes more compression than compressing a 6- or 7-bit number to 4 bits).
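
The bit-count example in that last parenthesis can be checked with a few lines of Python. This is only the arithmetic of the analogy (dropping low-order bits), not how Bitrate Viewer or any encoder actually computes its Q-level, and `truncate_to_bits` is a name I made up for the illustration.

```python
def truncate_to_bits(value, target_bits):
    """Drop low-order bits until value fits in target_bits.
    Returns (truncated_value, number_of_bits_dropped)."""
    dropped = max(value.bit_length() - target_bits, 0)
    return value >> dropped, dropped


print(truncate_to_bits(0b1011100111, 4))  # 10-bit input: (11, 6)
print(truncate_to_bits(0b1011100, 4))     # 7-bit input:  (11, 3)
```

Both inputs end up as the same 4-bit value 1011, but the 10-bit one had to throw away six bits to get there while the 7-bit one only threw away three.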

Q. Do any of the transcoders handle interlaced input?

A. They should all handle interlaced input fine. No special handling is needed because processing is done at the macroblock level instead of the frame level. The structure of the original video stream itself is not being changed, just the amount of compression, so assuming the original was encoded correctly, the transcoded stream should also be encoded correctly.

Q. Is an MPEG-2 to MPEG-4 encoder possible? Practical? Coming?

A. This type of encoder is definitely possible. It may not be practical, though, due to the limitations of motion vector computation: as mentioned above, to change resolution, the motion vectors need to be changed. However, this is not to say it cannot be done; it's just that the quality will probably be far inferior to a straight re-encode (in general, format-changing transcoding introduces more error than a full decode->encode process).

