Old 20th October 2003, 05:06   #1  |  Link
int 21h
Still Laughing
 
 
Join Date: Oct 2001
Location: Around
Posts: 1,312
FAQ about the Transcoding Technique

I've sort of started this thread to answer many common questions and dispel most of the common misconceptions about transcoding... I'll add more answers as you add more questions

Q. How does the transcoding process work?

A. To understand how transcoding works, we must first understand how encoding works. MPEG-2 encoding works by exploiting redundancies in the data in two domains: the spatial domain and the temporal domain. First, the encoder performs intra-frame compression: it divides the frame into macroblocks, performs a Discrete Cosine Transform (DCT) on each block, and quantizes (reduces the accuracy of) the results. This yields a set of coefficients describing the frame. Next, using inter-frame encoding, we compute motion vectors to describe which macroblocks are changing over time, then, depending on what sort of frame we're encoding into, store the appropriate information in the bitstream. (There are 3 types of frames: I-frames, which use only intra-frame compression and can be decoded without any other frame's help, and P-frames and B-frames, meaning predictive and bi-directional frames, which store only the change in macroblocks.) As you can imagine, computation of the motion vectors is the most time-consuming part of the process because of the complicated comparisons being performed. (**Please note this is a pretty simplified explanation; I encourage you to check out the MPEG-2 specs if you want many, many more details.. http://autumn.idv.tw/ppt/138182.html **)

Transcoders work by reusing the data we already have to reproduce a slightly less accurate bitstream. Transcoding in the compressed domain, as done by DVD2One, DVDShrink, ReJig (Requant), and others, is done through requantization of the coefficients. Quantization is the process, mentioned above, of discarding a certain amount of accuracy in order to reduce the number of bits the data uses. For instance, the quantization step of an encoder (not MPEG-2) could be defined as sign(x) * (abs(x)/(2*quant)), where x is the original number we're quantizing, and quant is an integer specifying the amount of accuracy to discard (i.e. the higher the quant, the more of x you lose). By simply plugging some numbers into that equation, you can see that you lose accuracy in the plugged-in number. This is how compression works in MPEG-2. (Quantization is essential, because the DCT by itself is a nearly reversible process; you only lose some data to precision errors. Quantization is what causes the 'blocking' and pixelation artifacts many people hate to see.) So, keeping in mind the process of quantization, we can easily reduce a stream's size by simply requantizing the coefficients of the bitstream... and by keeping the old motion vectors, we don't have to redo the most computationally intensive portion of encoding.
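
Here's a quick toy sketch in Python of that formula, just to show the idea. I'm reading the division as integer division (that's what actually throws bits away); this is not the real MPEG-2 quantizer, which works on 8x8 blocks with weighting matrices.

Code:
def quantize(x: int, quant: int) -> int:
    """Discard accuracy: the larger quant is, the fewer distinct levels survive."""
    sign = 1 if x >= 0 else -1
    return sign * (abs(x) // (2 * quant))

def dequantize(q: int, quant: int) -> int:
    """Approximate reconstruction of the original value."""
    sign = 1 if q >= 0 else -1
    return sign * (abs(q) * 2 * quant)

x = 100  # a DCT coefficient before quantization
for quant in (1, 2, 4):
    q = quantize(x, quant)
    print(quant, q, dequantize(q, quant))
# quant=1: 100 -> 50 -> 100 (no loss for this value, but 50 needs fewer bits)
# quant=2: 100 -> 25 -> 100 (still exact here; 25 is smaller again)
# quant=4: 100 -> 12 -> 96  (accuracy is now visibly lost)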



Q. Why don't any of the transcoders let me reduce resolution of my final output stream?

A. Because none of the transcoders is currently capable of handling the changes needed to the motion vectors (maybe InstantCopy has the facilities in the code, but the program itself certainly doesn't expose the ability). Instead of simply changing the coefficients of the stream, you would also need to resample the motion vectors... and this process would introduce additional noise into the picture. (While I've never seen the product of such a process, I can certainly imagine its quality would not be very good.)
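
To illustrate with a hypothetical example (this is not code from any real transcoder): halving the resolution would mean halving every motion vector, and since MPEG-2 stores vectors as integers in half-pel units, the rescaling throws away real information.

Code:
def rescale_mv(mv_half_pel: int, scale: float = 0.5) -> int:
    # Vectors must stay integers, so odd values can no longer be represented
    # exactly after scaling -- one source of the extra noise mentioned above.
    return round(mv_half_pel * scale)

print(rescale_mv(7))  # 7 half-pels (3.5 pels) -> 4, i.e. 2.0 pels instead of 1.75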



Q. What is the Q-Level in Bitrate Viewer and how does it relate to my transcoded (or re-encoded) stream's quality?

A. This is the quantization level of the stream. I believe Bitrate Viewer looks at the quantization levels used and performs some statistical analysis to find the average. This number should be slightly higher than the original stream's because of how the transcoding process works (increasing the quantization level to decrease size). In a re-encoded stream (i.e. CCE, TMPGEnc, etc.) it should be nearly the same... maybe a little lower, because you're starting with lower data accuracy to begin with, so less compression is needed to represent it. (i.e. compressing a 10-bit number like 1011100111 down to 4 bits takes a higher amount of compression than compressing a 6- or 7-bit number down to 4 bits).
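
As a purely speculative sketch of that kind of statistic (I don't know Bitrate Viewer's actual internals), averaging the per-macroblock quantizer scales would behave like this:

Code:
def average_q(quantizer_scales: list[int]) -> float:
    # One hypothetical "Q-Level": the mean quantizer scale over all macroblocks.
    return sum(quantizer_scales) / len(quantizer_scales)

original   = [2, 2, 3, 2, 4]  # per-macroblock quantizer scales of the source
transcoded = [3, 3, 4, 3, 5]  # after requantization every scale is bumped up
print(average_q(original), average_q(transcoded))  # 2.6 vs 3.6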



Q. Do any of the transcoders handle interlaced input?

A. They should all handle interlaced input fine. No special handling is needed, because the processing is done at the macroblock level rather than the frame level... the structure of the original video stream itself is not being changed, just the amount of compression, so assuming the original was encoded correctly, the transcoded stream should also be encoded correctly.



Q. Is an MPEG-2 to MPEG-4 transcoder possible? Practical? Coming?

A. This type of transcoder is definitely possible. It may not be practical due to the limitations of motion vector computation, though; as mentioned above, to change resolution, the motion vectors need to be changed... however, this is not to say it cannot be done, it's just that the quality will probably be far inferior to a straight re-encode (in general, format-changing transcoding introduces more error than a full decode->encode process).
Old 25th October 2003, 21:36   #2  |  Link
Ollie W. Holmes
Registered User
 
Join Date: Feb 2002
Location: Los Angeles, CA
Posts: 81
Q: How close is the working draft to the final MPEG-2 ISO standard? I assume a person has to pay to get a printed or electronic copy of the standard, right? Also, the official DVD standard costs a fortune and requires signing a non-disclosure agreement. Is there any publicly available reverse-engineered documentation for this standard?

Q: Recently, I started getting into DV cameras, and wondered about their limitations. They don't appear to use motion vectors, but their bitrate is 3-4x greater than MPEG-2 as applied to DVDs. Yet DVDs can look several times better than DV cam footage. What's the explanation?

1. Film is better than video
2. DV cameras suck
3. MPEG-2 has higher quality than DV, bit for bit. MPEG-2 HD uses around the same bitrate as DV, and it totally blows DV out of the water.

And if #3 is true, then why doesn't the next generation of video cameras build in better encoding using whatever the best techniques are? JVC already has a 720p consumer cam on the market.

Q: PCs suck even at 480p. How many years are we away from doing HD on a computer, i.e., doing what we now do with DVD Decrypter, IfoEdit, IC, DVDXCopy, etc., but for 1080i and 720p HD-DVD? Somehow a hardware manufacturer has to get into the act when it comes to encoding, decoding, and transcoding, because Intel is close to reaching the limits of P4 technology.
Old 26th October 2003, 04:53   #3  |  Link
writersblock29
Registered User
 
Join Date: Mar 2003
Posts: 614
@Ollie W. Holmes

The reason DVDs look better than DV footage comes down to the resolution of the raw footage. All DV camcorders rely on the performance of the charge-coupled device (CCD), which is what converts the image you see *live* into pixel information. At this point in time, most DV camcorders capture interlaced video streams -- which nearly always suffer when compared to the image quality of progressive-scanned images. Most studio DVDs are created from film-stock masters -- and film has levels of resolution that pixel-based CCDs (at least in consumer-level camcorders) can only dream of. So your original footage is far crisper if it's film-based. Naturally, the higher the quality of your original footage, the higher the quality of any MPEG2 encoding of that footage.

The only way to improve the quality of MPEG2 streams created from DV footage is to improve the quality of your original footage. This means a better camcorder than most you'll find at BestBuy or Shopco. Using prosumer or professional 3-CCD camcorders (such as a Sony VX2000) is a good start toward that goal, since each CCD is dedicated to one of the three primary video colors and can offer far better separation than any single-CCD camcorder. These units are a bit pricey -- but if quality is the need, then you'll do quite well to check one of these out.

Even then, you'll be using interlaced streams (don't be fooled by the VX2000's claim of progressive-scan recording -- it'll only do this at 15 frames a second, which can look quite choppy). Panasonic makes a true progressive-scan 3-CCD cam... but I can't comment much on it, since I've never had the pleasure of introducing myself to one. My bet would be that the Panasonic's raw image would take MPEG2 encoding much better than the Sony's -- but I've gotten some very satisfactory results from MPEG2 VX2000 footage, provided the bitrate's fairly high.

What I'd recommend is taking a mini DV tape to any neighborhood electronics store that happens to have higher-end DV cameras, recording some stuff around the store, then popping the tape into your camcorder at home and MPEGing that. Your cam will play back that footage at the same resolution it was recorded at, even though your camera may not be capable of capturing such footage itself. This way, at least, you'll know if the results will make you happy enough to make the leap into bankruptcy... er, I mean, make the investment. Having shot a lot of video footage in my time, I'd bet you'll like what you see.

--Cheers!
Old 30th October 2003, 05:20   #4  |  Link
mpucoder
Moderator
 
Join Date: Oct 2001
Posts: 3,530
Quote:
Originally posted by Ollie W. Holmes
Q: How close to the final MPEG-2 ISO standard is the working draft?
The standard has undergone several changes since 1995. However, DVD froze the specs at the 1995 (E) revision.
Old 6th November 2003, 17:23   #5  |  Link
TEB
Registered User
 
Join Date: Feb 2003
Location: Palmcoast of Norway
Posts: 357
Hi. I'm curious about how CBR vs VBR works in the compressed domain. Is it possible to transrate (transcode in the compressed domain) from VBR to CBR at a given bitrate, instead of applying a global % reduction?

best regards teb
Old 7th November 2003, 01:33   #6  |  Link
writersblock29
Registered User
 
Join Date: Mar 2003
Posts: 614
@TEB

Sounds reasonable, since DVD2ONE has the option of either constant or variable reduction of a given stream... and DVD2ONE is a true transcoder. My understanding (which could be wrong) is that the constant setting removes the same amount of information from each frame, while the variable setting removes more information from some areas and less from others. I doubt that such an approach would be able to "turn" a VBR encode into a CBR one, since either setting would pretty much keep the properties of the original VBR stream -- minus some information, of course, in order to reduce the file size. All the same, though, I really don't see where it would be IMPOSSIBLE to convert a VBR stream into CBR.

Considering that you can usually wind up with a better-quality video stream, at a smaller file size, by using VBR... and that's the name of the game for us "make-it-fit-and-look-good-too" people... I seriously doubt such a tool will ever appear.
Old 7th November 2003, 05:31   #7  |  Link
int 21h
Still Laughing
 
 
Join Date: Oct 2001
Location: Around
Posts: 1,312
VBR vs CBR is more a matter of how the bits are allocated during encoding than of specific stream structure. The only difference in stream structure is that some specific components of the stream will have a varying size... but on the encoding side, it greatly affects how bits are allocated.

Sure, you could take a VBR stream and reduce the size across all frames in a way that makes it entirely constant-rate. However, that would not at all be the same as encoding the stream at a constant rate in the first place.
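
To illustrate the difference in allocation (with a made-up per-frame "complexity" score; a real encoder's rate control is far more sophisticated):

Code:
complexity = [1.0, 0.5, 3.0, 0.5]  # hypothetical difficulty of each frame
total_bits = 4000

# CBR: every frame gets the same budget; easy scenes waste bits, hard ones starve
cbr = [total_bits / len(complexity)] * len(complexity)

# VBR: the same total, but budgeted in proportion to how hard each frame is
vbr = [total_bits * c / sum(complexity) for c in complexity]

print(cbr)  # [1000.0, 1000.0, 1000.0, 1000.0]
print(vbr)  # [800.0, 400.0, 2400.0, 400.0]

Flattening a finished VBR stream to equal-size frames can't recreate that second allocation; the decision was already made at encode time.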
Old 10th January 2004, 03:04   #8  |  Link
mrbass
Moderator
 
Join Date: Oct 2001
Location: Las Vegas
Posts: 2,034
This is an excellent quote from dvdshrink himself, describing how Deep Analysis improves video transcoding within DVD Shrink.

Someone asked him the following:
Is the prime function of 'Deep Analysis' to enable the program to arrive at the target size without doing some last minute, panic-stricken squeezing, or does it improve video quality?
Quote:
Both! It is dual-purpose. I'll try to explain in English, not math:

Suppose I want to achieve 50% video compression. The easiest way to do this is to take each picture and squeeze it to half of its original size.

The problem is that not all the data in a picture can be compressed. An encoded picture consists of both motion vector data and DCT coefficient data. It doesn't really matter what they are. The important thing is, DVD Shrink can only compress the DCT data.

It so happens that the proportion of space devoted to motion vectors versus DCT data is different in each picture.

Suppose one picture consists of a 50-50 proportion of motion vectors to DCT data. To compress this picture to 50% of its original size, you'd have to remove all the DCT data! Needless to say, the result would look awful.

Suppose another picture consists of a 25-75 proportion of motion vectors to DCT data. To get 50% compression, you'd only have to remove 2/3 of the DCT data (still rather a lot, but hell, DVD Shrink sucks at 50% compression).

It would have been better to know in advance that the first picture could not be compressed much and the second picture could be compressed more. That way you could spread the compression evenly between the two pictures.

This is exactly the function of deep analysis, except on a bigger scale: it calculates the best distribution of compression over the entire 200,000 pictures of a movie. --dvdshrink
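
To make his example concrete, here's a rough Python sketch of that allocation idea (the numbers are invented, and this is not DVD Shrink's actual code):

Code:
# Each picture splits into motion-vector bits (untouchable) and DCT bits
# (compressible). Deep analysis spreads a global target over all pictures
# in proportion to how much compressible data each one has.
pictures = [
    {"mv": 50, "dct": 50},  # the 50-50 picture from the example above
    {"mv": 25, "dct": 75},  # the 25-75 picture
]

total = sum(p["mv"] + p["dct"] for p in pictures)
must_remove = total - 0.5 * total            # 50% overall compression
removable = sum(p["dct"] for p in pictures)  # only DCT data can go

for p in pictures:
    cut = must_remove * p["dct"] / removable  # this picture's share of the cut
    print(f"keep {p['mv'] + p['dct'] - cut:.0f} of {p['mv'] + p['dct']} bits "
          f"(cut {cut / p['dct']:.0%} of its DCT data)")
# Compressed in isolation, picture 1 would lose ALL of its DCT data;
# analyzed together, both pictures give up the same 80% share of theirs.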
__________________
www.mrbass.org DVDShrink | DVD2DVD | DVDFAB | Mac guides
Old 18th June 2004, 01:00   #9  |  Link
mrbass
Moderator
 
Join Date: Oct 2001
Location: Las Vegas
Posts: 2,034
Post by duartix, August 6, 2003

I'm sure InstantCopy (IC) is another story altogether. If you have followed that post you already know that IC doesn't use a fixed-rate transcoding algorithm. Besides some heavy-duty difference coding/recoding compensation, I'm sure it uses lumimasking techniques. Perhaps that is inherent to their difference coding, I don't know. What I do know is that in a dark scene from that movie, where DS is wasting about 3.5 Mb/s, IC takes about 1.5 Mb/s with no quality loss whatsoever.

Here is a quote from a Pinnacle employee on how IC works:
Quote:
Well, basically MPEG video is encoded in groups of pictures called GOPs. In every GOP there is a reference frame followed by several difference frames. While the reference frame is encoded as a full picture, the difference frames contain only the differences from the “last” frame. During encoding, every frame is “quantized” – this means that small, almost unnoticeable differences in the signal are removed. Both InstantCopy and competing programs change the quantization process. However, InstantCopy is the only program that takes the changes made into account for the following frames. This means that in addition to the “quantization”, each whole frame needs to be decoded two times and encoded one time, which is indeed very time consuming. However, if you only do the quantization, the picture quality gets worse with every frame until the next reference frame is decoded – which is the famous annoying “pumping”.
Here is how IC works, as "translated" by DVDShrink:
Quote:
The error correction is done by (simplified description):
1. decode the original frame
2. requantize (shrink) the original frame
3. decode the requantized frame
4. calculate the difference between the two decoded frames
5. add/subtract this difference to the next frame
6. loop for each frame in the movie.
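
A rough sketch of that loop in Python -- decode() and requantize() below are hypothetical stand-ins for a real MPEG-2 codec, not InstantCopy's actual code:

Code:
def decode(frame):
    return frame                 # stand-in: "decoding" just yields pixel values

def requantize(frame):
    return 4 * round(frame / 4)  # stand-in lossy shrink: snap to multiples of 4

def transcode_with_error_feedback(frames):
    carry = 0.0                               # step 5's correction, carried forward
    for frame in frames:
        original = decode(frame)              # 1. decode the original frame
        shrunk = requantize(frame + carry)    # 2. requantize (correction folded in)
        approx = decode(shrunk)               # 3. decode the requantized frame
        carry = original - approx             # 4. difference of the two decodes
        yield shrunk                          # 5./6. compensate next frame, loop

print(list(transcode_with_error_feedback([10.0, 11.0, 13.0, 14.0])))
# -> [8, 12, 12, 16]: each output stays within half a quantization step of
# the true value, instead of the error compounding from frame to frame.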
------------------------------------------------------

Post by dvdshrink June 17, 2004

That is very interesting information about InstantCopy.

To put the problem in perspective, I will describe the DVD Shrink compression algorithm (this turned out to be a very long explanation, my apologies to those already familiar with MPEG video).

Requantizing

The DVD Shrink transcode engine works by requantizing video data.

There are two kinds of data in an MPEG video stream:
1. Motion vectors
2. Pixel "residual" data (in the form of DCT coefficients)

Each decoded picture is formed by combining parts of previous decoded pictures (using the motion vectors) with new pixel information (residual data).

The basic idea is that since each decoded picture is very similar to the previous one, it can be fairly accurately described using pixels from the previously decoded picture, offset by some motion vectors to compensate for camera movement, or for the movement of objects in the scene.

The purpose of residual data is then to compensate for any errors in this process, since you are unlikely to get an exact match for the new picture using only the previous picture + motion vectors.

DVD Shrink achieves compression by removing some of the residual data. This process is called requantization. Selected DCT coefficients are scaled down (thus reducing the number of bits required to store them) and the corresponding scale value for these coefficients (the quantizer scale) is scaled up. The result is a less accurate description of the same residual data, which takes up less space. Note that motion vectors are left unmodified by this process.
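
A small sketch of that trade (illustrative only; the real requantizer works on whole 8x8 blocks with weighting matrices):

Code:
def requantize(coeff: int, q_old: int, q_new: int) -> int:
    # q_new > q_old: the stored coefficient shrinks, so it needs fewer bits,
    # but the rounding makes the reconstructed residual slightly less accurate.
    return round(coeff * q_old / q_new)

print(90 * 2, requantize(90, 2, 3) * 3)  # 180 vs 180: this value survives intact
print(91 * 2, requantize(91, 2, 3) * 3)  # 182 vs 183: this one picks up an error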

With the exception of CCE-based applications (which do a full re-encode of the video and recompute all motion vectors), all DVD compression software uses this same requantization algorithm. The difference between alternative programs is how they choose which residual data to requantize, and how they handle the subsequent errors or artifacts which appear in the video as a result of this process.

If you requantize all the residual data (DVD Shrink at maximum compression), then the resulting video will look bad - it will contain noticeable artifacts. If you have some compression % to play with (i.e. maximum compression is not required), then the compression software has a choice of which residual data to process. This choice can heavily influence quality. The goal is to select for compression the residual data which minimizes the resulting errors or artifacts.

Picture Types

When choosing residual data, it is important to consider that there are three types of picture in an MPEG stream. They are called I, P and B-pictures. A typical DVD video contains pictures in the following sequence:

I-B-B-P-B-B-P-B-B-P-B-B-P-B-B-I-B-B-P-B-B.... and so on.

I pictures
These contain residual data only. There are no motion vectors, so they do not depend on (or "reference") any previously decoded pictures. Essentially, they are like standalone JPEG images. They occur at a rate of 1 in every 15 pictures (they also occur at scene changes).

P pictures
These contain both motion vectors and residual data. The motion vectors always reference the previously decoded I or P picture. This is important, because it means that any error resulting from compression of the previous I or P picture will also be visible in this picture; furthermore, if an additional error is introduced by compression of this picture, the errors will accumulate and artifacts will become more noticeable. The accumulated error will also be visible in the next picture which references this picture, so things can get out of hand rather quickly. P-pictures occur 1 in every 3 pictures.

B pictures
Like P pictures, these also contain both motion vectors and residual data. The motion vectors always reference the last two decoded I or P pictures (data from the two pictures is averaged). The important characteristic of B-pictures is that no picture will ever reference a B-picture (only I and P pictures are referenced), which means that any error introduced by compression of this picture is a one-off occurrence, visible in this picture but not accumulated or carried forward into any other. Note also that B-pictures form the vast majority, occurring 2 in every 3 pictures.

If you are still following this, then you'll probably have figured out that when choosing residual data to compress, it makes sense to select the data in B-pictures first. This is because (a) they form the bulk of all pictures, and (b) any artifacts introduced by this compression will not be visible in any other picture.

This is what DVD Shrink does. If the resulting video size after compressing all B-pictures is still too large, it will try to distribute the remaining compression over the I and P pictures, and this is when artifacts start to become really noticeable, because errors introduced into I and P pictures will be visible and compounded in all following pictures (until the next I-picture). Note that at low compression ratios, DVD Shrink never needs to touch I and P pictures. The exact compression % where this becomes necessary depends largely on the DVD, or more accurately, on the video encoder software used to encode it.
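
Here is a rough sketch of that selection order (invented sizes, not the actual DVD Shrink code):

Code:
def choose_pictures(pictures, bits_to_remove):
    """pictures: list of (type, removable_bits); returns indices to compress."""
    chosen = []
    for wanted in ("B", "P", "I"):  # B-pictures first, I-pictures only if forced
        for i, (ptype, removable) in enumerate(pictures):
            if ptype == wanted and bits_to_remove > 0:
                chosen.append(i)
                bits_to_remove -= removable
    return chosen

gop = [("I", 10), ("B", 30), ("B", 30), ("P", 15), ("B", 30), ("B", 30)]
print(choose_pictures(gop, 100))  # [1, 2, 4, 5] -- the B-pictures alone suffice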

Problems with this approach

The first problem is that by applying maximum compression to B-pictures, although the resulting errors are not accumulated in any other picture, they are still heavily present in the B-picture itself. The resulting stream is:

I picture (original)
B picture (visible artifacts)
B picture (visible artifacts)
P picture (original)
B picture (visible artifacts)
etc...

The second problem is as mentioned before, when it becomes necessary to compress I and P pictures, errors are then accumulated from one picture to the next:

I picture (visible artifacts)
B picture (visible artifacts x2)
B picture (visible artifacts x2)
P picture (visible artifacts x2)
B picture (visible artifacts x3)
etc...

The problem is lessened somewhat because errors tend to be random in nature, so positive and negative pixel errors often cancel each other out when accumulated; however, this is not always the case, and it depends very much on the scene.

Possible Solutions

InstantCopy attempts to solve this problem by distributing compression evenly over all I, P and B pictures (I think), and then calculating the resulting error for each picture. The next picture in sequence is then adjusted to compensate for this error, so as to prevent error accumulation. This results in significantly increased processing time, because both the original and the compressed video must be decoded simultaneously in order to determine the error difference for each picture. Also, since the error difference is essentially random, does it not become more difficult to compress subsequent pictures, to which the error difference has been added? Does anybody know about this?

Another (faster) solution may be to distribute compression evenly over all I, P and B pictures, but in such a way that the compressed regions of each picture (from one I-picture to the next) do not overlap. This would also prevent accumulation of errors, but may be impossible to achieve at higher compression ratios. A modification could be to allow overlapping regions only twice between each I-picture, thus minimizing error accumulation.

Another solution may be to accept the errors, but try to keep them in parts of the video where they are less visible - e.g. less compression in scenes with high motion, more compression in dark scenes, or around the edges of the frame...

All ideas are welcome!
__________________
www.mrbass.org DVDShrink | DVD2DVD | DVDFAB | Mac guides
Old 18th June 2004, 01:33   #10  |  Link
Jester700
Registered User
 
Join Date: Jan 2003
Posts: 26
Quote:
Originally posted by Ollie W. Holmes

Q: Recently, I started getting into DV cameras, and wondered about their limitations. They don't appear to use motion vectors, but their bitrate is 3-4x greater than MPEG-2 as applied to DVDs. Yet DVDs can look several times better than DV cam footage. What's the explanation?

1. Film is better than video
2. DV cameras suck
3. MPEG-2 has higher quality than DV, bit for bit. MPEG-2 HD uses around the same bitrate as DV, and it totally blows DV out of the water.

And if #3 is true, then why doesn't the next generation of video cameras build in better encoding using whatever the best techniques are? JVC already has a 720p consumer cam on the market.
Re: the bitrate issue: DV frames are entire frames; there are no "B" or "P" frames. This takes a LOT higher bitrate, but gives you the flexibility of cutting on any frame when editing. To do this in MPEG you either cut on I-frames or re-encode the surrounding frames, with a big quality dropoff. Nobody edits MPEG, and most editing forums occasionally have a noob come in asking how to work with files from his MPEG cam. People reply "return that cam & get a DV one". As for that, check out the cams that DO record in MPEG; the standard-def ones are no better than a comparable DV cam.
Old 23rd June 2007, 10:50   #11  |  Link
MetalheadGautham
Registered User
 
 
Join Date: May 2007
Location: well... there is a door in front of my house
Posts: 99
I request the author of the first post to add more details, like which format-to-format transcodes result in no or minimal loss of quality, the use of an uncompressed intermediate file, etc.

1. vp3 to theora results in a lossless conversion; the only thing lost is size

2.
Quote:
Originally Posted by MetalheadGautham View Post
Now for a MAJOR doubt (please tell me if my conclusion is right):

Facts:

lame is open source

vorbis is open source

lame uses advanced psychoacoustics

vorbis also uses advanced psychoacoustics

lame came before vorbis, and is being developed to the limits of MP3 (which is being expanded; VBR, 512 kbps, etc. are realities)

vorbis must borrow a lot of the "removing unnecessary data" techniques from lame, as both are open-source projects (and really good ones too)

hence the main difference between files of the same QUALITY from vorbis and lame is the SIZE

also, vorbis performs much better at lower bitrates.

then shouldn't it be really easy to convert mp3 to vorbis without quality loss, while reducing size at the same time?
that post explains mp3 to vorbis too...
__________________
C:\> File not found. Fake it ?