Safe archival format
swiego
3rd December 2007, 17:13
Some time ago I undertook an effort to capture a large amount of video from tape. Some of it was sourced from media like Video8 and Hi8, others from SVHS, still others from assorted higher quality sources. My goal at the time was to capture in the highest quality format possible while I could still get my hands on the equipment, then sort out editing, etc., later.
I finished the capturing project and collected about 2TB in total worth of video spread across many PCs and hard drives. Then I ran out of time and let it all sit. Now, I realize I won't have time anytime soon to do anything useful with it, but I would like to consolidate it all down into a format that hopefully doesn't consume quite so much disk space. I could use some help here as I am somewhat intimidated by the options.
Most of my source footage is in one of two formats. About half of it is huffyuv lossless compression. Interlaced source, mostly Hi8/SVHS stuff from broadcast decks. The rest of it is in DV format which admittedly is quite a bit smaller.
Any suggestions? What I'd like is a format that,
- supports pretty easy cutting and splicing... I'm not going to do any fancy processing of this video anytime soon but I might cut chunks out and would like a format that's amenable to this. I think the two I'm using work.
- Lower storage requirement than huffyuv. DV would be okay--maybe I should just convert huffyuv to DV?
- Not too esoteric... in case I decide to just archive this off and get back to it a year from now, it would be nice if it were in a format that's reasonably well used...
Suggestions?
2Bdecided
3rd December 2007, 17:36
Consider
1. DV, or
2. high bitrate short GOP or I-frame only MPEG-2, or
3. Leave it as it is
The MPEG-2 option gives you the chance to use higher bitrates than regular DV, if you think it's necessary.
The "leave it as it is" option gives you the chance to spend a little money on the very cheap HDD storage that can be had these days. 2TB - that's about £300 (so maybe $300?). Copying the files is quicker than transcoding them, so maybe you get back some of the money in time saved?
Cheers,
David.
*.mp4 guy
3rd December 2007, 22:06
lagarith (a lossless codec) can often compress better than dv (a lossy codec). You should take a look at lagarith, and perhaps ffv1 (another lossless codec, relatively high compression), before you decide to use dv. If you decide to use a lossy codec, there really isn't any point in using a data rate higher than 10 Mbps for SD video, because around 15-30 Mbps is what you can manage using lossless codecs.
laserfan
3rd December 2007, 23:07
I might cut chunks out
You've put in a huge amount of time already capping all that material; if you're...
1. Going to get back to it in a year and/or
2. Going to want to fiddle with some of it in the meantime
...I would offload to 2TB of hard disk and put it in a closet w/desiccant bags. Why at this point expend still MORE effort to convert-the-converted into yet another format?
check
4th December 2007, 05:52
- supports pretty easy cutting and splicing... I'm not going to do any fancy processing of this video anytime soon but I might cut chunks out and would like a format that's amenable to this. I think the two I'm using work.
Any codec will do this if you will be re-encoding anyway as part of the process. If you want to easily extract data without re-encoding you are more limited. You will need to either use a codec that is intra compression only, or be satisfied with only being able to cleanly cut on certain frames (I frames).
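To make the cutting constraint concrete, here is a minimal sketch (not tied to any particular tool) of what "only being able to cut on I frames" means in practice. The keyframe lists and the 15-frame GOP length are made-up values for illustration; a real tool would read the keyframe positions from the file's index.

```python
from bisect import bisect_right

def snap_to_keyframe(keyframes, frame):
    """Return the last keyframe at or before `frame`.

    `keyframes` must be a sorted list of frame numbers that are I frames.
    A cut without re-encoding can only start on one of these.
    """
    i = bisect_right(keyframes, frame)
    if i == 0:
        raise ValueError("no keyframe at or before the requested frame")
    return keyframes[i - 1]

# Hypothetical stream with an I frame every 15 frames (a short GOP).
gop_keyframes = list(range(0, 300, 15))
# Hypothetical intra-only stream: every frame is an I frame.
intra_keyframes = list(range(300))

print(snap_to_keyframe(gop_keyframes, 100))    # cut moves back to frame 90
print(snap_to_keyframe(intra_keyframes, 100))  # cut stays at frame 100
```

With an intra-only codec (DV, huffyuv, lagarith, or I-frame-only MPEG) the requested cut point is always usable; with longer GOPs the cut drifts back to the previous I frame unless you re-encode around it.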
- Lower storage requirement than huffyuv. DV would be okay--maybe I should just convert huffyuv to DV?
Not much of a constraint. There are more efficient lossless codecs (see lagarith or others), and more or less any lossy codec will compress better.
- Not too esoteric... in case I decide to just archive this off and get back to it a year from now, it would be nice if it were in a format that's reasonably well used...
This says to me: MPEG.
If you do think it's worth re-encoding (see laserfan's post), I'd go with MPEG-4 AVC (aka h264). It supports both Real Lossless and so-close-to-lossless-it's-not-funny-but-significantly-smaller lossy encoding.
It can encode as I frames only if you want super cutting ability.
It will be around for longer than more or less any other codec (I suspect) because of more or less the same reasons MPEG-2 has.
It's also the most efficient codec (in terms of quality per bit) around today.
2Bdecided
4th December 2007, 12:29
lagarith (a lossless codec) can often compress better than dv (a lossy codec)
By "better", do you mean "smaller" or "higher quality"?
Higher quality - yes, probably.
Smaller - not here!
I'm guessing the files (which aren't DV already) are the Hi8/SVHS stuff - these may contain enough noise and jitter to stress any codec.
Cheers,
David.
*.mp4 guy
4th December 2007, 22:35
A lossless codec always has the same quality. By "better" I mean that every time I have recompressed dv to lagarith, I have gotten a size reduction. It is obviously possible for lagarith to create a file bigger than a dv-compressed file, but I've never seen it happen. Even so, there are more efficient codecs, such as ffv1, x264 lossless, and the king, yuvsoft lossless (or yuls), any of which can easily trump dv's compression ratio. My point about the relative gain of lossy vs. lossless at a given bitrate stands; as a corollary, dv is useless as an archival format unless it is also the original format.
2Bdecided
5th December 2007, 11:58
mp4 guy,
Either one of us is doing something wrong, or your content is very different from mine! I've tried native DV camcorder footage, and S-VHS footage converted by a DV capture card. Similar results with both, though noisier source = larger lossless file.
I'm using the Cedocida DV codec, YV12 DV colour (PAL here), yv12vfw.dll (not sure this is relevant), VirtualDubMod, Fast re-compress, Lagarith V1.3.14 YV12.
Typically it doubles file sizes. It's the same if I go via AVIsynth into Vdub. It's even worse (larger files) if I use YUY2 colour space (which would be inappropriate for PAL DV).
So how are you taking a lossy source, and getting lower bitrates for lossless encoding?
What do other people find when they try this?
Can anyone share a short DV clip which does encode to a lower bitrate via lossless?
Cheers,
David.
ariga
5th December 2007, 14:12
I agree with 2Bdecided. With subtle differences, I get the same results. Differences being - DivX in place of yv12vfw.dll and VirtualDub 1.7.4
*.mp4 guy
5th December 2007, 22:18
Well, The footage I was compressing was progressive NTSC, from a dv camcorder, so I would assume that that is where the differences are coming from.
Blue_MiSfit
5th December 2007, 22:38
I had always been concerned with colorspace conversions for lossless archival from DV. Since it's 4:1:1 in NTSC land, there's no way to
a) keep all the data
and
b) not go to YUY2, since that's technically not lossless (who cares about a little high quality upsampling though), and more seriously - a lot bigger!
Unless there is a lossless codec out there that can actually encode 4:1:1 as is!
~Misfit
2Bdecided
6th December 2007, 11:56
Well, The footage I was compressing was progressive NTSC, from a dv camcorder, so I would assume that that is where the differences are coming from.
It may be, but it may not be!
I just created some 720x480p30 content from a 320x240p30 source, so it's very soft, heavily denoised, and should be easy to compress (though there's plenty of movement).
I encoded it to DV: 74MB.
I transcoded that to Lagarith YUY2: 106MB.
I had to go via AVIsynth. (The colours were getting scrambled going out of VirtualDubMod directly - the UV channels were being duplicated (side by side - two copies of each!), rather than upsampled. That "beautiful" version gave me 103MB.)
So either your content is exceptional (little movement?), or you're doing something strange.
As Blue_MiSfit pointed out, as a minimum you have to double the chroma samples from 4:1:1 to 4:2:2 to put NTSC DV into Lagarith, which makes it even less likely that you'll hit a lower bitrate - and, without care, also means the "lossless" transcode could be (slightly) lossy.
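For anyone who wants to see the numbers behind the chroma doubling, here is a small sketch that counts samples per 720x480 frame for the three layouts mentioned in this thread. The helper name and the divisor convention are my own for illustration, not anything from a codec's API.

```python
def samples_per_frame(width, height, h_div, v_div):
    """Return (luma, chroma) sample counts for one frame.

    h_div and v_div are the horizontal and vertical chroma subsampling
    divisors. Both chroma planes (U and V) are counted.
    """
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)
    return luma, chroma

W, H = 720, 480  # NTSC DV frame size
layouts = {
    "4:1:1 (NTSC DV)": (4, 1),
    "4:2:0 (YV12)": (2, 2),
    "4:2:2 (YUY2)": (2, 1),
}
for name, (h_div, v_div) in layouts.items():
    luma, chroma = samples_per_frame(W, H, h_div, v_div)
    print(f"{name}: {luma} luma + {chroma} chroma = {luma + chroma} samples")
```

Going from 4:1:1 to 4:2:2 takes the chroma from 172800 to 345600 samples per frame, so the lossless codec is handed a third more raw data (691200 vs 518400 samples) before it compresses anything.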
I still believe that DV is a useful archive format, and that you are not going to hit DV's 25Mbps for lossless encoding of typical SD video - especially where the source is interlaced analogue (noisy?) video.
Cheers,
David.
swiego
10th December 2007, 20:29
I hadn't simply disappeared but was doing research and soaking all this in. First of all, thank you very much for the replies, they helped but also gave me some search terms to use for a little self-education.
Now I sort of understand that I can treat I-frames as "splicing points that don't require re-encoding", is this an accurate depiction? If so, then I gather that h264 with every frame an I-frame (I believe that's what one poster alluded to?) could give me my high compression, good quality and easier splicing and cutting options? If so, that sounds like the ideal format.
IF the above is correct then I guess the only question I would have left is how interlaced video fits into the above model... should I assume that converting to h264 will require deinterlacing? I'm pretty sure all of my recorded video is interlaced.
Some clarifications to my original post... I actually -did- use Lagarith not Huffyuv; I'd tried both and forgotten that I'd settled on Lagarith and DV for all the source material back when capturing. Still big files though. Also, it's actually getting closer to 3TB now that I'm adding -everything- up and to be honest, I'm just reluctant to buy lots of external hard drives to store stuff of this much capacity... I'd much rather re-encode to something a bit more reasonable in size before moving it all to external drives. I just bought a pair of one TB external hard drives (the ones with the new WD "green power" drives--they run amazingly cool for a single disk TB external drive) to store away two copies of this stuff. Right now I just have a single copy of it spread out across many 400GB and 500GB hard drives in several different computers including some in my office. It's a mess...
swiego
10th December 2007, 21:52
Oh!
I almost forgot... sort of an extension to my original question.
When I captured all this stuff to codecs like Lagarith, it was because the "state of the art" on encoding, restoration and the like seemed like such a moving target. Every day there's a new version of <codec> or <filter> that is yet another step up from what existed the previous day. Much of this footage is messy and could use some TLC but it's not an urgent matter. I might revisit this five years from now on my 32 core desktop PC with 24GB of RAM using filters and codecs a generation ahead of what we have today... or, at least, that's what I suspected could be a reality, in which case I wanted to keep raw, interlaced footage in the best quality I could get at on hand and not worry much about it.
SO... my question both then and now is a more abstract, "where is this all headed over the next five years?" Is it fair to say that chances are, five years from now, h264 is still going to be pretty darn good? And is video processing mostly going to remain as it is today... adjusting curves, dealing with side effects, etc? If so, then maybe I should just start packaging all this video into final form.
Or, are there other things on the horizon that, though perhaps not realistic today due to lack of compute power or immature algorithms, will represent a quantum leap ahead in terms of how we process, restore and encode video? If so then I'd much rather skip ahead a few years.
Dark Shikari
10th December 2007, 22:16
Or, are there other things on the horizon that, though perhaps not realistic today due to lack of compute power or immature algorithms, will represent a quantum leap ahead in terms of how we process, restore and encode video? If so then I'd much rather skip ahead a few years.
H.264 is going to be the standard for at least 5-10 years, most likely; any future standard will likely be a wavelet-based codec with OBMC, like Snow, which will be far too slow for processors of this or the next couple generations.
communist
11th December 2007, 08:41
Well, The footage I was compressing was progressive NTSC, from a dv camcorder, so I would assume that that is where the differences are coming from.
NTSC and PAL D1 contain the same amount of information.
720x576x25
720x480x30
The only explanation for why you got smaller files from converting to a lossless codec is that the content was very compressible. I've actually never seen it happen with material I converted.
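A quick check of the arithmetic behind the two frame formats above, as a sketch. It uses the nominal 30 fps from the figures above; the actual NTSC rate is about 29.97 fps, so the real NTSC number is roughly 0.1% lower.

```python
def pixels_per_second(width, height, fps):
    """Raw pixel throughput of a video format (frame size times frame rate)."""
    return width * height * fps

pal = pixels_per_second(720, 576, 25)
ntsc = pixels_per_second(720, 480, 30)  # nominal rate, see note above

print(f"PAL : {pal} pixels/s")
print(f"NTSC: {ntsc} pixels/s")
print("same raw pixel rate" if pal == ntsc else "different raw pixel rate")
```

Both come out at 10368000 pixels per second, so frame size and frame rate alone do not explain a difference in lossless file size.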
2Bdecided
11th December 2007, 12:59
1. Keep it interlaced. Even the best deinterlacers are not perfect, and all double the raw datarate!
2. While everything does seem to keep improving, there will come a day when people stop being interested in this stuff. Then it'll stop improving, and fade into obsolescence. This will happen (is happening) much sooner with the hardware than the software.
3. "raw source > very lossy encoder > clean up later" is fraught with problems. It's not going to help you later on to have compressed this footage, though if the losses are imperceptible you'll probably get away with it.
FWIW TV stations in the UK use 50Mbps and 37Mbps MPEG-2 for pre-broadcast storage and transmission. Final broadcasts are 2-5Mbps MPEG-2 and also analogue still. They don't go any lower for the pre-broadcast storage and transmission because their tests revealed that it could have a visible impact on the final signal.
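To put those bitrates in storage terms, here is a rough sketch that converts a video bitrate into gigabytes per hour and into hours per 2TB. It assumes decimal units (1 GB = 10^9 bytes, 2 TB = 2000 GB) and ignores audio and container overhead, so real files will be somewhat larger. The 25 Mbps figure is the DV video rate mentioned earlier in the thread.

```python
def gigabytes_per_hour(mbps):
    """Convert a bitrate in megabits per second to decimal GB per hour."""
    bits_per_hour = mbps * 1_000_000 * 3600
    return bits_per_hour / 8 / 1e9

DISK_GB = 2000  # 2 TB in decimal gigabytes (assumption)

for label, mbps in [("DV", 25), ("MPEG-2", 37), ("MPEG-2", 50)]:
    gb_h = gigabytes_per_hour(mbps)
    print(f"{label} at {mbps} Mbps: {gb_h:.2f} GB/hour, "
          f"about {DISK_GB / gb_h:.0f} hours per 2TB")
```

At 25 Mbps this works out to 11.25 GB per hour of video, and 50 Mbps is exactly double that.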
Cheers,
David.