13th July 2007, 05:35 | #21 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
Each pixel depends on its left, top, and topleft neighbors. I can't load consecutive pixels in an mmreg because that violates the left dependency, but I can load a diagonal stripe into an mmreg.
Pixels in their logical arrangement in a frame, with one register's worth highlighted: [image not preserved]
How I'd store the temporary values in memory: [image not preserved]
Cost: one extra transpose operation (the other transpose is free, since the bitstream parser has to store pixels one by one into an array anyway).
Last edited by akupenguin; 13th July 2007 at 05:43. |
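To make the diagonal-stripe idea concrete, here is a rough sketch in Python. It is not huffyuv's or ffvhuff's actual bitstream or edge handling: the median predictor, the zero padding at the frame border, and all function names are assumptions made for illustration. The point is only that every pixel on one anti-diagonal (x + y constant) is predicted from earlier diagonals, so a whole stripe could be predicted in one step.

```python
def median3(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]


def predict(img, x, y):
    """Median prediction from the left, top and topleft neighbors.

    Neighbors outside the frame are treated as 0 (an assumption of this sketch).
    """
    left = img[y][x - 1] if x > 0 else 0
    top = img[y - 1][x] if y > 0 else 0
    topleft = img[y - 1][x - 1] if x > 0 and y > 0 else 0
    return median3(left, top, left + top - topleft)


def encode(frame):
    """Residuals modulo 256. Every pixel is known, so all are independent."""
    h, w = len(frame), len(frame[0])
    return [[(frame[y][x] - predict(frame, x, y)) & 0xFF for x in range(w)]
            for y in range(h)]


def decode_raster(residuals):
    """Serial decode: each pixel waits for its left neighbor."""
    h, w = len(residuals), len(residuals[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = (residuals[y][x] + predict(out, x, y)) & 0xFF
    return out


def decode_diagonal(residuals):
    """Wavefront decode: one anti-diagonal stripe at a time."""
    h, w = len(residuals), len(residuals[0])
    out = [[0] * w for _ in range(h)]
    for d in range(w + h - 1):
        ys = range(max(0, d - w + 1), min(h, d + 1))
        # All predictions for this stripe are computed before any pixel of the
        # stripe is written, which shows they do not depend on each other.
        preds = [predict(out, d - y, y) for y in ys]
        for y, p in zip(ys, preds):
            out[y][d - y] = (residuals[y][d - y] + p) & 0xFF
    return out
```

In real SIMD code the `preds` list would be one register's worth of pixels, and the stripe layout in memory is what costs the extra transpose mentioned above.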
13th July 2007, 08:55 | #22 | Link |
Registered User
Join Date: Nov 2006
Posts: 51
|
Quote:
I don't know if this applies to MLC, but entropy coding isn't the only reason huffyuv decoding is slower than encoding. There's also the pixel prediction algorithm: when encoding, all pixels are independent so they can be predicted in parallel with SIMD, whereas when decoding each pixel depends on the previous so they have to run in series with latency-bound scalar math.
I have some ideas about how to fix this, but they can't be implemented without changing the bitstream. Maybe some future version of ffvhuff.
Thanks for your explanation, akupenguin. I don't use SIMD in MLC (maybe in a future version). If I understand it right, you are able to use bytes for calculations, but the numbers in MLC are bigger (32bit) in most cases, so the advantage of SIMD shrinks. I am also waiting for Barcelona (desktop version), but AMD seems to have had some problems recently... |
13th July 2007, 10:11 | #23 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
32bit numbers still get 4x parallelism from SSE*, just not as much as the 16x that bytes get.
BTW, how do you get 32bit numbers out of video with 8bit colordepth? I can't think of any process that would add so much dynamic range without impeding lossless compression. |
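The 4x and 16x figures are just the 128-bit SSE register width divided by the element width. A tiny sketch of that arithmetic (the function name is made up for illustration):

```python
def simd_lanes(element_bits, register_bits=128):
    """Number of elements that fit in one SIMD register.

    128 bits is the width of an SSE (xmm) register; pass 64 for an MMX register.
    """
    if register_bits % element_bits:
        raise ValueError("element width must divide the register width")
    return register_bits // element_bits


print(simd_lanes(8))   # bytes in an xmm register
print(simd_lanes(32))  # 32bit values in an xmm register
```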
13th July 2007, 11:48 | #24 | Link | |
Registered User
Join Date: Nov 2006
Posts: 51
|
Quote:
|
|
15th July 2007, 09:08 | #25 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
FFV1 would probably end up being better anyways.
And FFV1 outperforms PAQ8?? I've never tried it, but PAQ8's image compression has always impressed me; it nearly doubles the compression ratio of PNG. |
|
16th July 2007, 07:14 | #27 | Link |
x264 developer
Join Date: Sep 2004
Posts: 2,392
|
I think the main problem with PAQ8's image compression is that the linear prediction is only for determining contexts, it's not used as a filter. Thus, if PAQ8 predicts a pixel value to be 128 and it turns out to be 127, then every single bit is predicted wrong (or if it knows the prediction is inaccurate, then every bit is coded with near 50% probability). Whereas in FFV1 that would be a residual of -1, which still compresses well.
There's also context dilution: If FFV1 sees a neighborhood of (128,128,128) and the current pixel is 128, then that's also evidence for a neighborhood of (127,127,127) implying that pixel is 127. PAQ8 treats those as separate events, and doesn't infer the probability of one from observations of the other. Last edited by akupenguin; 16th July 2007 at 07:18. |
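A small sketch of both points, in Python. The helper names are hypothetical, and `gradient_context` is only a simplified stand-in for a context built from neighbor differences; it is not FFV1's actual context quantization, and nothing here is PAQ8 code.

```python
def differing_bits(a, b):
    """How many bit positions differ between two 8-bit values."""
    return bin((a ^ b) & 0xFF).count("1")


def residual(actual, predicted):
    """What a prediction filter would hand to the entropy coder."""
    return actual - predicted


def gradient_context(left, top, topleft):
    """Simplified context built from neighbor differences, not absolute values."""
    return (left - topleft, topleft - top)


# Predicted 128, actual 127: 10000000 vs 01111111.
print(differing_bits(128, 127))  # 8, every bit of the prediction is wrong
print(residual(127, 128))        # -1, a small residual

# Flat neighborhoods at different brightness land in the same context,
# so statistics learned around 128 also apply around 127.
print(gradient_context(128, 128, 128) == gradient_context(127, 127, 127))  # True
```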
16th July 2007, 08:36 | #28 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
But I think it's still faster than the MSU lossless codec... |
|
17th July 2007, 20:15 | #30 | Link |
Registered User
Join Date: Mar 2002
Posts: 1,075
|
Well crap was just about right, even slightly worse than huffyuv on some really clean CGI anime I tried it with (best case for this kind of entropy coding) ... still it might be of interest to someone (cut down and cleaned up cedocida, fitting in a different type of compression requires only minimal changes now).
http://karton.student.utwente.nl/sc.0.2.zip |
18th July 2007, 04:04 | #31 | Link | |
Registered User
Join Date: Nov 2006
Posts: 51
|
Quote:
|
|
18th July 2007, 09:18 | #34 | Link | |
Registered User
Join Date: Feb 2002
Posts: 407
|
Quote:
I was just looking around for the fastest lossless codec possible and quickLZ (along with lzo) looked like good candidates... |
|
18th July 2007, 16:01 | #35 | Link |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Just a thought: would it be beneficial to apply a transform to video going into a line-based compressor (like LZ, etc) to turn, say, 8x8 blocks into 64-long lines, so that the compressor acted more efficiently rather than scanning across lines of the video?
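For anyone who wants to try it, the reordering could be sketched roughly like this in Python. The function names are made up, the frame is assumed to be a list of rows whose dimensions are multiples of the block size, and nothing here says whether it actually helps compression.

```python
def blocks_to_lines(frame, block=8):
    """Flatten each block x block tile into one line of block*block samples."""
    h, w = len(frame), len(frame[0])
    if h % block or w % block:
        raise ValueError("frame dimensions must be multiples of the block size")
    lines = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            lines.append([frame[by + dy][bx + dx]
                          for dy in range(block) for dx in range(block)])
    return lines


def lines_to_blocks(lines, width, block=8):
    """Inverse of blocks_to_lines, needed on the decoding side."""
    blocks_per_row = width // block
    height = (len(lines) // blocks_per_row) * block
    frame = [[0] * width for _ in range(height)]
    for i, line in enumerate(lines):
        by, bx = divmod(i, blocks_per_row)
        for j, value in enumerate(line):
            dy, dx = divmod(j, block)
            frame[by * block + dy][bx * block + dx] = value
    return frame
```

Feeding the concatenated lines and the plain raster order through the same LZ compressor would give a direct size comparison.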
|
18th July 2007, 16:35 | #36 | Link |
Registered User
Join Date: Mar 2002
Posts: 1,075
|
I did consider it for a moment, but I am not sure it would be a sure win. If you are just encoding a straight edge between 2 colored planes then you can't generally pick a block and overlap it with the present one to get a good match. When you just encode lines though then in the line above you can always pick a section which you can overlap with the edge on the present line.
Still, it's not a lot of trouble to try it ... |
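The line-overlap argument can be checked on a toy image. This is only a sketch with made-up names and a synthetic diagonal edge: in raster order, each row is the previous row shifted by one pixel, so an LZ matcher finds a long match at a fixed distance of width + 1.

```python
def diagonal_edge(width, height, lo=16, hi=235):
    """Two flat regions separated by a straight diagonal edge."""
    return [[lo if x < y else hi for x in range(width)] for y in range(height)]


def match_length(stream, pos, distance):
    """Length of the match between stream[pos:] and stream[pos - distance:]."""
    n = 0
    while pos + n < len(stream) and stream[pos + n] == stream[pos + n - distance]:
        n += 1
    return n


width, height = 16, 8
frame = diagonal_edge(width, height)
raster = [v for row in frame for v in row]

# Starting at the second pixel of row 3, look back width + 1 samples,
# i.e. at the first pixel of row 2.
print(match_length(raster, 3 * width + 1, width + 1))
```

With 8x8 blocks flattened into lines there is no single distance that lines the edge up like this, which is the doubt raised above.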
18th July 2007, 17:22 | #38 | Link |
Registered User
Join Date: Sep 2006
Location: UK
Posts: 416
|
Thanks akupenguin and squid on the heads up on the one decode methods, I'll look into those more thoroughly later.
MfA, I tested your codec, not bad.
Code:
Codec       Enc.Time  Size    Dec. time
MLC         186       1.22GB  304
FFDs huffy  43        2.12GB  92
Alpy        40        1.65GB  250
Lagarith    134       1.34GB  193
Huffyuv     45        2.56GB  104
sc.0.2      103       2.56GB  120
Through "read.avs"
Code:
AVIsource("w:\1.avi")
Last edited by mitsubishi; 18th July 2007 at 17:32. |
18th July 2007, 17:39 | #39 | Link |
Registered User
Join Date: Dec 2004
Location: Melbourne, AU
Posts: 1,963
|
Are these tests using YUY2? Every time I compare original huffyuv with ffdshow using YUY2 (no adaptive tables), ffdshow ends up about 40% bigger. Using adaptive for both, the size difference is < 0.1% (as you'd expect). Am I just choosing bad material?
|
18th July 2007, 17:59 | #40 | Link |
Registered User
Join Date: Sep 2006
Location: UK
Posts: 416
|
Everything is in the source's YV12, except for Huffyuv, which I had to convert back on reading it.
source.avs
Code:
DGDecode_mpeg2source("D:\test1.d2v")
Code:
AVIsource("w:\1.avi")
Code:
A=import("source.avs")
b=import("read.avs") #.converttoyv12() # needed for huffyuv
return ssim(a,b,"results.csv","averageSSIM.txt")
|