Old 13th July 2007, 05:35   #21  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,392
Each pixel depends on its left, top, and topleft neighbors. I can't load consecutive pixels in an mmreg because that violates the left dependency, but I can load a diagonal stripe into an mmreg.

Pixels in their logical arrangement in a frame, with one register's worth highlighted:

How I'd store the temporary values in memory:

Cost: one extra transpose operation (the other transpose is free, since the bitstream parser has to store pixels one by one into an array anyway.)
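The dependency argument above can be sketched in Python. This is a minimal sketch, not akupenguin's implementation: it assumes a Huffyuv-style median predictor, treats out-of-frame neighbors as 0, and assumes the "diagonal stripe" is the anti-diagonal x + y = d (the original images are not preserved, so the exact layout is a guess). All function names are hypothetical.

```python
def median3(a, b, c):
    """Median of three integers."""
    return sorted((a, b, c))[1]


def predict(img, x, y):
    """Median predictor from the left, top and top-left neighbors.

    Out-of-frame neighbors are treated as 0 (an assumption of this sketch).
    """
    left = img[y][x - 1] if x > 0 else 0
    top = img[y - 1][x] if y > 0 else 0
    topleft = img[y - 1][x - 1] if x > 0 and y > 0 else 0
    return median3(left, top, (left + top - topleft) & 0xFF)


def encode(img):
    """Residuals: every pixel is already known, so predictions are independent."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] - predict(img, x, y)) & 0xFF for x in range(w)]
            for y in range(h)]


def decode_raster(res):
    """Reference decoder: raster order, strictly one pixel after another."""
    h, w = len(res), len(res[0])
    img = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            img[y][x] = (res[y][x] + predict(img, x, y)) & 0xFF
    return img


def decode_wavefront(res):
    """Decode along anti-diagonals x + y = d.

    The three neighbors of a pixel lie on diagonals d-1 and d-2, so pixels
    on the same diagonal never depend on each other.
    """
    h, w = len(res), len(res[0])
    img = [[0] * w for _ in range(h)]
    for d in range(h + w - 1):
        stripe = [(d - y, y) for y in range(max(0, d - w + 1), min(h, d + 1))]
        # All predictions are computed before any pixel of the stripe is
        # stored, which is what a single SIMD operation would do.
        preds = [predict(img, x, y) for x, y in stripe]
        for (x, y), p in zip(stripe, preds):
            img[y][x] = (res[y][x] + p) & 0xFF
    return img
```

Both decoders reconstruct the same frame; only the wavefront version exposes a batch of independent pixels per step, at the price of the diagonal memory layout discussed above.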

Last edited by akupenguin; 13th July 2007 at 05:43.
Old 13th July 2007, 08:55   #22  |  Link
MiroLx
Registered User
 
Join Date: Nov 2006
Posts: 51
Quote:
Originally Posted by akupenguin
I don't know if this applies to MLC, but entropy coding isn't the only reason huffyuv decoding is slower than encoding. There's also the pixel prediction algorithm: when encoding, all pixels are independent so they can be predicted in parallel with SIMD, whereas when decoding each pixel depends on the previous so they have to run in series with latency-bound scalar math.
I have some ideas about how to fix this, but they can't be implemented without changing the bitstream. Maybe some future version of ffvhuff.


Thanks for your explanation, akupenguin. I don't use SIMD in MLC (maybe in a future version). If I understand it right, you are able to use bytes for calculations, but numbers in MLC are bigger (32bit) in most cases, so the advantage of SIMD shrinks. I am also waiting for Barcelona (desktop version), but AMD seems to have some problems recently...
Old 13th July 2007, 10:11   #23  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,392
32bit numbers still get 4x parallelism from SSE*, just not as much as the 16x that bytes get.
BTW, how do you get 32bit numbers out of video with 8bit colordepth? I can't think of any process that would add so much dynamic range without impeding lossless compression.
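The 4x and 16x figures follow directly from dividing the register width by the element width. A small sketch, assuming 128-bit SSE registers (the helper name is hypothetical):

```python
def simd_lanes(register_bits, element_bits):
    """Number of elements of the given width handled per register operation."""
    if register_bits % element_bits:
        raise ValueError("element width must divide register width")
    return register_bits // element_bits


SSE_BITS = 128  # width of an XMM register

for width in (8, 16, 32):
    print(f"{width}-bit elements: {simd_lanes(SSE_BITS, width)}x")
```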
Old 13th July 2007, 11:48   #24  |  Link
MiroLx
Registered User
 
Join Date: Nov 2006
Posts: 51
Quote:
Originally Posted by akupenguin
32bit numbers still get 4x parallelism from SSE*, just not as much as the 16x that bytes get.
BTW, how do you get 32bit numbers out of video with 8bit colordepth? I can't think of any process that would add so much dynamic range without impeding lossless compression.

No, nothing like that. It's an index into the context model, which has to be big. So at the beginning there are 8bit color values and at the end 32bit index numbers. But you are right, 4x parallelism is still better than none.
Old 15th July 2007, 09:08   #25  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by akupenguin
That's because the LZ sliding window is a half-assed version of inter prediction. But LZ fails compared to real inter prediction because (a) it's line based instead of block based, so each motion vector has to be specified many times, and (b) it has no subpixel interpolation.

You're right about that: in fact, except in the case of actual constant image data between frames, LZMA fails to provide any benefit for inter compression at all. Therefore any LZMA codec would be intra-only like HuffYUV or Lagarith, as inter-frame compression with LZMA would basically be useless.

FFV1 would probably end up being better anyways.

And FFV1 outperforms PAQ8?? I've never tried it, but PAQ8's image compression has always impressed me; it nearly doubles the compression ratio of PNG.
Old 15th July 2007, 18:57   #26  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
PAQ8 uses linear predictive coding in a roundabout way, so I wouldn't be surprised if it did well ... still, if you want to go for unusably slow, MRP has it beat.
Old 16th July 2007, 07:14   #27  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,392
I think the main problem with PAQ8's image compression is that the linear prediction is only for determining contexts, it's not used as a filter. Thus, if PAQ8 predicts a pixel value to be 128 and it turns out to be 127, then every single bit is predicted wrong (or if it knows the prediction is inaccurate, then every bit is coded with near 50% probability). Whereas in FFV1 that would be a residual of -1, which still compresses well.
There's also context dilution: If FFV1 sees a neighborhood of (128,128,128) and the current pixel is 128, then that's also evidence for a neighborhood of (127,127,127) implying that pixel is 127. PAQ8 treats those as separate events, and doesn't infer the probability of one from observations of the other.
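Both points can be checked with a few lines of Python. This is only an illustration of the arithmetic, not of either codec: the helper names are hypothetical, and `difference_context` is a simplified stand-in for a context built from neighbor differences (FFV1 additionally quantizes such differences).

```python
def differing_bits(predicted, actual, width=8):
    """Count the bit positions where predicted and actual values disagree."""
    return bin((predicted ^ actual) & ((1 << width) - 1)).count("1")


def residual(predicted, actual):
    """Signed prediction error, as a residual coder sees it."""
    return actual - predicted


def difference_context(left, top, topleft):
    """Context made of neighbor differences only, so it ignores absolute level."""
    return (left - topleft, topleft - top)


# 128 = 10000000 and 127 = 01111111: a bitwise model mispredicts all 8 bits,
# while the residual is just -1.
print(format(128, "08b"), format(127, "08b"))
print(differing_bits(128, 127), residual(128, 127))

# Flat neighborhoods at different brightness share one context, so what is
# learned at level 128 carries over to level 127.
print(difference_context(128, 128, 128) == difference_context(127, 127, 127))
```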

Last edited by akupenguin; 16th July 2007 at 07:18.
Old 16th July 2007, 08:36   #28  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by MfA
PAQ8 uses linear predictive coding in a roundabout way, so I wouldn't be surprised if it did well ... still if you want to go for unusably slow MRP has it beat.

Yeah, even its image compression (with the special BMP context model, no context mixing) is 100KB/s on a fast machine... too slow to be useful for video.

But I think it's still faster than the MSU lossless codec...
Old 16th July 2007, 17:40   #29  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
For laughs I'll try a combination of MED prediction and QuickLZ, it should be fast enough for realtime ... dunno if the performance will be worth crap though.
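A rough sketch of that kind of pipeline, under loud assumptions: the MED predictor is the one from LOCO-I / JPEG-LS, out-of-frame neighbors are treated as 0, and zlib from the Python standard library stands in for QuickLZ (which has no standard Python binding). None of this is taken from MfA's code; all names are hypothetical.

```python
import zlib


def med_predict(a, b, c):
    """MED predictor: a = left, b = top, c = top-left."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def _neighbors(buf, i, x, y, width):
    """Left, top and top-left samples; 0 outside the frame (an assumption)."""
    a = buf[i - 1] if x > 0 else 0
    b = buf[i - width] if y > 0 else 0
    c = buf[i - width - 1] if x > 0 and y > 0 else 0
    return a, b, c


def med_residuals(plane, width, height):
    """Replace each 8-bit sample by its prediction error modulo 256."""
    out = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            i = y * width + x
            pred = med_predict(*_neighbors(plane, i, x, y, width))
            out[i] = (plane[i] - pred) & 0xFF
    return bytes(out)


def med_restore(res, width, height):
    """Inverse of med_residuals; must run in raster order."""
    plane = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            i = y * width + x
            pred = med_predict(*_neighbors(plane, i, x, y, width))
            plane[i] = (res[i] + pred) & 0xFF
    return bytes(plane)


def compress_plane(plane, width, height, level=1):
    """Predict, then hand the residuals to a general-purpose LZ compressor."""
    return zlib.compress(med_residuals(plane, width, height), level)


def decompress_plane(data, width, height):
    return med_restore(zlib.decompress(data), width, height)
```

The low zlib level is chosen only to mimic the speed-over-ratio tradeoff of a fast LZ coder; actual speed and ratio would have to be measured with the real library.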
Old 17th July 2007, 20:15   #30  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
Well, crap was just about right: it's even slightly worse than huffyuv on some really clean CGI anime I tried it with (best case for this kind of entropy coding) ... still, it might be of interest to someone (it's a cut down and cleaned up cedocida; fitting in a different type of compression requires only minimal changes now).

http://karton.student.utwente.nl/sc.0.2.zip
Old 18th July 2007, 04:04   #31  |  Link
MiroLx
Registered User
 
Join Date: Nov 2006
Posts: 51
Quote:
Originally Posted by MfA
Well crap was just about right, even slightly worse than huffyuv on some really clean CGI anime I tried it with (best case for this kind of entropy coding) ... still it might be of interest to someone (cut down and cleaned up cedocida, fitting in a different type of compression requires only minimal changes now).
http://karton.student.utwente.nl/sc.0.2.zip

And what was the speed compared to huffyuv?
Old 18th July 2007, 04:40   #32  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
Almost the same ... but both are probably mostly disk limited on my crappy system, so that doesn't mean much. Didn't spend too much time with it ... it's not meant as competition, it's just a toy.
Old 18th July 2007, 06:31   #33  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by MfA
Almost the same ... but both are probably mostly disk limited on my crappy system so that doesn't mean much. Didn't spend too much time with it ... it's not meant as competition, it's just a toy

Test it encoding to /dev/null then.
Old 18th July 2007, 09:18   #34  |  Link
bill_baroud
Registered User
 
Join Date: Feb 2002
Posts: 407
Quote:
Originally Posted by MfA
Well crap was just about right, even slightly worse than huffyuv on some really clean CGI anime I tried it with (best case for this kind of entropy coding) ... still it might be of interest to someone (cut down and cleaned up cedocida, fitting in a different type of compression requires only minimal changes now).
http://karton.student.utwente.nl/sc.0.2.zip

Great!
I was just looking around for the fastest lossless codec possible, and QuickLZ (along with LZO) looked like a good candidate...
Old 18th July 2007, 16:01   #35  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Just a thought: would it be beneficial to apply a transform to video going into a line-based compressor (like LZ, etc) to turn, say, 8x8 blocks into 64-long lines, so that the compressor acted more efficiently rather than scanning across lines of the video?
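The proposed reordering is easy to sketch. This is a minimal illustration, assuming a single 8-bit plane stored row by row with dimensions that are multiples of the block size; the function names are hypothetical.

```python
def blocks_to_lines(plane, width, height, n=8):
    """Reorder a plane so each n x n block becomes n*n consecutive samples."""
    if width % n or height % n:
        raise ValueError("sketch assumes dimensions are multiples of the block size")
    out = bytearray()
    for by in range(0, height, n):          # block rows
        for bx in range(0, width, n):       # blocks within a block row
            for y in range(by, by + n):     # rows inside one block
                start = y * width + bx
                out += plane[start:start + n]
    return bytes(out)


def lines_to_blocks(data, width, height, n=8):
    """Inverse of blocks_to_lines: put every block back in its place."""
    if width % n or height % n:
        raise ValueError("sketch assumes dimensions are multiples of the block size")
    plane = bytearray(width * height)
    pos = 0
    for by in range(0, height, n):
        for bx in range(0, width, n):
            for y in range(by, by + n):
                start = y * width + bx
                plane[start:start + n] = data[pos:pos + n]
                pos += n
    return bytes(plane)
```

The output of `blocks_to_lines` would be fed to the LZ compressor in place of the raster-order plane; whether that actually helps is exactly the open question here.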
Old 18th July 2007, 16:35   #36  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
I did consider it for a moment, but I am not sure it would be a clear win. If you are just encoding a straight edge between 2 colored planes, then you generally can't pick a block and overlap it with the present one to get a good match. When you just encode lines, though, you can always pick a section of the line above that overlaps with the edge on the present line.

Still, it's not a lot of trouble to try it ...
Old 18th July 2007, 17:19   #37  |  Link
MiroLx
Registered User
 
Join Date: Nov 2006
Posts: 51
Quote:
Originally Posted by Dark Shikari
Just a thought: would it be beneficial to apply a transform to video going into a line-based compressor (like LZ, etc) to turn, say, 8x8 blocks into 64-long lines, so that the compressor acted more efficiently rather than scanning across lines of the video?

I am afraid this would be problematic, because when searching for a block match you have to slide the block in pixel steps, and your model only allows block steps.
Old 18th July 2007, 17:22   #38  |  Link
mitsubishi
Registered User
 
Join Date: Sep 2006
Location: UK
Posts: 416
Thanks akupenguin and squid on the heads up on the one decode methods, I'll look into those more thoroughly later.

MfA, I tested your codec, not bad.

Code:
Codec        Enc. time   Size     Dec. time
MLC          186         1.22GB   304
FFDs huffy   43          2.12GB   92
Alpy         40          1.65GB   250
Lagarith     134         1.34GB   193
Huffyuv      45          2.56GB   104
sc.0.2       103         2.56GB   120


Through "read.avs"
Code:
AVIsource("w:\1.avi")
(which I use for the decode speed test and lossless verification) it is fine, but it is all blue-tinted when played back in MPC directly.

Last edited by mitsubishi; 18th July 2007 at 17:32.
Old 18th July 2007, 17:39   #39  |  Link
squid_80
Registered User
 
Join Date: Dec 2004
Location: Melbourne, AU
Posts: 1,963
Quote:
Originally Posted by mitsubishi
MfA, I tested your codec, not bad.

Are these tests using YUY2? Every time I compare original huffyuv with ffdshow using YUY2 (no adaptive tables), ffdshow ends up about 40% bigger. Using adaptive tables for both, the size difference is < 0.1% (as you'd expect). Am I just choosing bad material?
Old 18th July 2007, 17:59   #40  |  Link
mitsubishi
Registered User
 
Join Date: Sep 2006
Location: UK
Posts: 416
Everything is in the source's YV12, except for Huffyuv, which I had to convert back on reading it.

source.avs
Code:
DGDecode_mpeg2source("D:\test1.d2v")

read.avs
Code:
AVIsource("w:\1.avi")

verify.avs
Code:
# Compare the decoded clip against the original source with SSIM
a = import("source.avs")
b = import("read.avs")   #.converttoyv12() # needed for huffyuv

return ssim(a, b, "results.csv", "averageSSIM.txt")
I've only tested on one source so far; I intend to check an HD source later. This source is a clip from Doctor Who (DVB).
All times are GMT +1.