Psy RDO: Official testing thread (version 0.6 out!)

Dark Shikari
31st May 2008, 19:04
Psychovisually optimized rate-distortion optimization

Experimental Patch: Psy RD 0.6 (updated for r950) (http://pastebin.com/m645aae89)
Build: r937, version 0.6 (http://www.mediafire.com/?rhacap8ja1e)

Version 0.6 changes:

1. Psy trellis (adjustable separately on commandline) included and on by default.
2. Automatically adjust chroma QP offset to compensate for the higher quants psy RD/trellis result in.
3. Psy trellis allows trellis=1 now.

Version 0.3-0.5 changes:

1. Much faster, by caching half the SATDs that need to be done.
2. If psy RD is on and trellis 1 is enabled, warn the user and disable trellis.
3. Psy RD strength is now a decimal value with default 1.0. Don't touch it unless you have good reason.
4. Psy RD strength now automatically scales based on quantizer. This is done internally--the "strength" is a multiplier to this internal value.

How to use it:

It's on by default. Adjust the strength with --psy-rd. That's it.

How it works (simply): the human eye doesn't just want the image to look similar to the original, it wants the image to have similar complexity. Therefore, we would rather see a somewhat distorted but still detailed block than a non-distorted but completely blurred block. The result is a bias towards a detailed and/or grainy output image, a bit like xvid except that it's actual detail rather than ugly blocking.

How it works (full explanation): read the comment in the patch.
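
As a rough illustration of the simple explanation above (not the patch's actual code), the sketch below scores a candidate block by its squared error plus a penalty for any change in "complexity". Complexity is approximated here as the total absolute deviation from the block mean, which is an assumption made for brevity; the patch itself works with SATDs. The names `psy_cost`, `complexity` and `psy_strength` are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def complexity(block):
    """Crude stand-in for block energy: total absolute deviation from the mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


def psy_cost(source, recon, psy_strength=1.0):
    """Plain distortion plus a penalty for losing (or inventing) complexity."""
    return ssd(source, recon) + psy_strength * abs(complexity(source) - complexity(recon))


# A 4x4 "grainy" source block.
source = [[100, 110, 100, 110],
          [110, 100, 110, 100],
          [100, 110, 100, 110],
          [110, 100, 110, 100]]

# Candidate A: completely blurred to the mean value.
blurred = [[105] * 4 for _ in range(4)]

# Candidate B: still detailed, but the detail is somewhat distorted.
detailed = [[110, 100, 110, 100],
            [115, 95, 110, 100],
            [100, 110, 100, 110],
            [110, 100, 110, 100]]

# Plain SSD prefers the blur; the psy-weighted cost prefers the detailed block.
print(ssd(source, blurred), ssd(source, detailed))
print(psy_cost(source, blurred), psy_cost(source, detailed))
```

With these toy numbers the blurred block has the lower squared error, yet the detailed block has the lower psy-weighted cost, which is the bias described above.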

gav1577
31st May 2008, 19:21
If it isn't obvious already, this is going to destroy your PSNR and SSIM.


Great work, Dark Shikari. Is the above anything to worry about? I am a noob when it comes to x264.

Dark Shikari
31st May 2008, 19:22
Great work, Dark Shikari. Is the above anything to worry about? I am a noob when it comes to x264.

It only matters if you think that numbers are more important than how the video actually looks. :p

Sagittaire
31st May 2008, 19:28
Only 84 lines for the patch ... ?

gav1577
31st May 2008, 19:30
Ha ha, the picture is more important IMHO. Is there any cmd switch for this patch, or is it automatically activated with certain settings?
Thanks. Never mind, you just edited and answered my question, thanks :)

Dark Shikari
31st May 2008, 19:32
Ha ha, the picture is more important IMHO. Is there any cmd switch for this patch, or is it automatically activated with certain settings?
Thanks

It's on by default any time you're using RDO. As such, this is a patch for testing purposes and not something to apply to every modified build out there, since you can't turn it off ;)

Only 84 lines for the patch ... ?

Yup, and a large portion is comments ;)

As you can tell the code is heavily macroed to reduce duplication; unwrapped it'd be another dozen lines or two. But overall the patch isn't too complicated; I'm only changing how x264 views "distortion" in RDO.

guada2
31st May 2008, 19:44
"Dark is a photoshop fgox264 .
Very good job.

Some questions:

- at what bitrate is it useful?
- is it compatible with grain opt on quickly scene?

desta
31st May 2008, 19:58
Me and my "2pass encode" questions again.... Would there be a worthwhile benefit in enabling this for both passes, or would just the 2nd pass suffice?

Sagittaire
31st May 2008, 20:06
Me and my "2pass encode" questions again.... Would there be a worthwhile benefit in enabling this for both passes, or would just the 2nd pass suffice?

And as always ... no, because the first pass is just a statistical reference for second-pass rate control. A very fine first pass (CRF mode with the same settings and a close bitrate) will produce marginally better overall quality (it's just rate control optimisation).

desta
31st May 2008, 21:26
And as always ... no, because the first pass is just a statistical reference for second-pass rate control. A very fine first pass (CRF mode with the same settings and a close bitrate) will produce marginally better overall quality (it's just rate control optimisation).

It wasn't exactly a ridiculous question though either. We know AQ needs to be the same for both passes to work, and other settings need to be similar or the same to do their job properly. I have also had instances where using similar settings in both passes has made for a better encode, especially when using FGO, which this patch is to replace....

http://forum.doom9.org/showthread.php?p=1143777#post1143777

Adub
31st May 2008, 22:05
Damn DS!! You rule!! Keep the awesome quality patches coming!!

TheRyuu
31st May 2008, 22:12
- at what bitrate is it useful?
- is it compatible with grain opt on quickly scene?

1.) Any
2.) It is intended to replace fgo or at least 'should' replace it. (hope that answers your question since I do not understand it fully)

Gabriel_Bouvigne
31st May 2008, 23:03
Nitpicking: this is not really similar to "perceptual entropy", but quite similar to the "dropout prevention" used by several encoders (example: 3gpp HE-AACv2 reference encoder).

Dark Shikari
31st May 2008, 23:14
Nitpicking: this is not really similar to "perceptual entropy", but quite similar to the "dropout prevention" used by several encoders (example: 3gpp HE-AACv2 reference encoder).

Explain?

techouse
1st June 2008, 00:05
I <3 Dark

Gabriel_Bouvigne
1st June 2008, 00:09
PE is a kind of rough complexity estimation of the frame done at an early stage, and is usually done on a less granular scale. It is usually (PE is mainly a concept rather than a specific metric) not used within the RD/quantization stage. On the other hand, dropout prevention is something (usually enabled at low bitrates) that tries to preserve power per subband, even if the individual frequency bins are different, in order to avoid holes within the spectrum.

A simple dropout prevention scheme is described here:
http://www.3gpp.org/ftp/Specs/archive/26_series/26.403/26403-700.zip
section 5.6.1.1.2 "Avoidance of spectral holes"
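
To make the "holes within the spectrum" idea concrete, here is a minimal sketch that only detects subbands whose energy vanishes after quantization. It is not the 3GPP scheme referenced above; the toy quantizer, the band layout and all function names are assumptions made for illustration.

```python
def band_energies(coefs, band_edges):
    """Energy (sum of squares) of each subband; band_edges are slice boundaries."""
    return [sum(c * c for c in coefs[lo:hi])
            for lo, hi in zip(band_edges, band_edges[1:])]


def quantize(coefs, step):
    """Toy quantizer: truncate each coefficient toward zero to a multiple of step."""
    return [int(c / step) * step for c in coefs]


def spectral_holes(original, quantized, band_edges):
    """Indices of subbands that had energy before quantization but none after."""
    before = band_energies(original, band_edges)
    after = band_energies(quantized, band_edges)
    return [i for i, (b, a) in enumerate(zip(before, after)) if b > 0 and a == 0]


# Eight frequency bins split into three subbands: [0:2], [2:6], [6:8].
original = [40.0, -35.0, 6.0, 5.0, -7.0, 4.0, 20.0, -18.0]
band_edges = [0, 2, 6, 8]
quantized = quantize(original, 10)

# The middle band is made of small coefficients that all quantize to zero.
print(spectral_holes(original, quantized, band_edges))
```

A dropout prevention scheme would then spend a few bits in the flagged bands so that their power is roughly preserved, even if the individual bins differ from the original.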

techouse
1st June 2008, 00:33
http://techouse.project357.com/builds/x264_x86_r859_psy_rdo_techouse.7z

Source: x264 r859 GIT (git://git.videolan.org/x264.git)

Applied patches (current versions):

x264_2pass_vbv.9.diff

x264_fix_win_stdin.diff

x264.gaussian.cplxblur.01.diff

x264_hrd_pulldown.04_interlace.diff

x264_me-prepass_DeathTheSheep.diff

x264_progress.diff

x264_psy_rdo.diff (this patch is STILL IN TESTING but it pretty much replaces FGO; please test how it compares to FGO using my older build "x264_x86_r859_progress_techouse")

x264_rd-optimze_DeathTheSheep.diff


Please check http://forum.doom9.org/showthread.php?t=130364 and http://git.videolan.org/gitweb.cgi?p=x264.git;a=shortlog for more info

Compiled by techouse on June 1st 2008, 01:21:49 CEST with GCC-4.3.0 on Windows Vista Business SP-1 32-bit.

Commandline used: ./configure&&make

Platform: X86
System: MINGW
avis input: yes
mp4 output: yes
pthread: yes
gtk: no
debug: no
gprof: no
PIC: no
shared: no
visualize: no

IgorC
1st June 2008, 08:53
Some undesirable results:

With psy RDO
http://rapidshare.com/files/119246172/3xrdo_RDO.mp4.html

Without psy RDO
http://rapidshare.com/files/119247136/3_pass_NoRDO.mp4.html

Settings for 3 passes:
x264.exe --threads 3 --pass 3 --progress --stats "x264_stat.log" --qcomp 0.75 --bframes 3 --bime --weightb --subme 7 --keyint 500 --ref 16 --trellis 2 --mixed-refs --8x8dct --partitions all --b-rdo --direct auto --b-pyramid --bitrate 800 --no-fast-pskip --me umh --merange 16 --deblock -1:-1 --me-prepass -o 3xrdo.mp4 Ma.avs

If source is needed I can upload it later.

Dark Shikari
1st June 2008, 09:03
Some undesirable results:

With psy RDO
http://rapidshare.com/files/119246172/3xrdo_RDO.mp4.html

Without psy RDO
http://rapidshare.com/files/119247136/3_pass_NoRDO.mp4.html

Settings for 3 passes:
x264.exe --threads 3 --pass 3 --progress --stats "x264_stat.log" --qcomp 0.75 --bframes 3 --bime --weightb --subme 7 --keyint 500 --ref 16 --trellis 2 --mixed-refs --8x8dct --partitions all --b-rdo --direct auto --b-pyramid --bitrate 800 --no-fast-pskip --me umh --merange 16 --deblock -1:-1 --me-prepass -o 3xrdo.mp4 Ma.avs

If source is needed I can upload it later.

Undesirable? The psy-RDO version looks a hell of a lot better (overall) to me. There's a little bit of increased ringing but the overall detail retention is an order of magnitude better.

A random example:

http://i31.tinypic.com/3355ssw.png

http://i30.tinypic.com/35jhwyd.png

Notice the vastly decreased blocking on the chest and the increased graininess in the background--plus of course the obvious, the face.

IgorC
1st June 2008, 09:05
Too much ringing and artifacts for my eyes

Dark Shikari
1st June 2008, 09:11
Too much ringing and artifacts for my eyes

Well the non-psy-RDO version has far too much blurring for my eyes. The facial texture is completely flattened and there's blocking all over the place.

The slight ringing increase is because Psy-RD moved more bits to the background of the frame, forcing the quantizer up a little bit. If you don't like the bit distribution differential between the characters and the background, you could try lowering AQ strength, since both AQ and psy-RD are, in this case, moving around bits.

Sagittaire
1st June 2008, 09:29
Well, I ran a test on Casino Royale (a really grainy movie) and the old FGO produces a much better result for grain retention (--fgo 5 --trelli 2 vs --trelli 2). This is particularly true for very fine grain. I can provide a sample if you want.

Dark Shikari
1st June 2008, 09:36
Well, I ran a test on Casino Royale (a really grainy movie) and the old FGO produces a much better result for grain retention (--fgo 5 --trelli 2 vs --trelli 2). This is particularly true for very fine grain. I can provide a sample if you want.

Sure, toss a sample. There are two potential reasons why FGO could beat the current psy RD:

1. FGO uses an overlapped transform (2x2 SATD) while psy RD uses non-overlapped.

2. FGO uses a much smaller transform.

The former gets very slow for large transforms, so isn't very practical for this situation IMO. The latter issue is likely the same reason that FGO does worse than nothing on many sources; it's aimed towards a very specific kind of "detail" rather than complexity in general.

Anyways, upload the source, I'm interested.

Edit: Oh, another possible issue is that FGO lowers B-frame quantizers and adjusts RD thresholds, which this doesn't (yet). Thus comparisons could potentially get misleading when trying to compare just the RD metric alone.
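
To see why an overlapped transform "gets very slow for large transforms", the sketch below simply counts how many transforms are needed to cover a 16x16 macroblock and how many pixels they read in total. It assumes that "overlapped" means sliding the transform one pixel at a time, which is an illustrative guess and not something stated about FGO's implementation; the function names are hypothetical.

```python
def transform_positions(block_size, transform_size, overlapped):
    """Number of transform_size x transform_size transforms covering a square block."""
    step = 1 if overlapped else transform_size
    per_axis = (block_size - transform_size) // step + 1
    return per_axis * per_axis


def pixels_touched(block_size, transform_size, overlapped):
    """Total pixel reads: every transform reads transform_size^2 pixels."""
    return transform_positions(block_size, transform_size, overlapped) * transform_size ** 2


for size in (2, 4, 8):
    print(size,
          pixels_touched(16, size, overlapped=True),
          pixels_touched(16, size, overlapped=False))
```

Under these assumptions the non-overlapped cost stays at 256 pixel reads per macroblock regardless of transform size, while the overlapped cost grows from 900 reads for 2x2 to 5184 reads for 8x8.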

ToS_Maverick
1st June 2008, 09:47
How does Psy RDO work together with AQ, and especially with NO AQ?
How much does grain retention depend on AQ with this patch?

i will definitely test this in the next days, when i got some time!

Dark Shikari
1st June 2008, 09:53
How does Psy RDO work together with AQ, and especially with NO AQ?
How much does grain retention depend on AQ with this patch?

I will definitely test this in the next few days, when I have some time!

I haven't tested much with regard to AQ; I've just been leaving it on default.

I assume it's still probably a good idea to do so :p

Sagittaire
1st June 2008, 12:53
Anyways, upload the source, I'm interested.

Edit: Oh, another possible issue is that FGO lowers B-frame quantizers and adjusts RD thresholds, which this doesn't (yet). Thus comparisons could potentially get misleading when trying to compare just the RD metric alone.

Source is a lossless FFV1 file ...

Here are the samples:
- New psy mode
http://jfl1974.free.fr/upload/Sample_psy_01.mkv

- Old FGO mode
http://jfl1974.free.fr/upload/Sample_FGO_01.mkv


Yes, it's perhaps a B-frame problem, because in low-motion scenes grain retention is really good for FGO, while in high-motion scenes grain retention is really better for psy RDO. Anyway, for the HVS I think that grain retention in low-motion parts is much more important than grain retention in high-motion parts.

Here is my profile for FGO encoding ...


@REM -----------------------------------------------------------
@REM
@REM Profile BluRay 1080p23.976 extra high quality
@REM
@REM -----------------------------------------------------------


@REM Source file name (just put the source here)
set E_SRC=Lossless.avs

@REM Set of quality (quality goes here, 1-50)
set E_BR=22

@REM Set of max bitrate (max bitrate goes here)
set MAX_BR=20000

@REM Set of Buffer (buffer size goes here)
set BUF_BR=30000

@REM Set credit (first frame of the credits)
set CRE_FR=201560

@REM Set end credit (last frame of the credits)
set END_FR=207442



@REM Profile

x264.exe --threads auto --thread-input --keyint 24 --min-keyint 1 --crf %E_BR% --vbv-maxrate %MAX_BR% --vbv-bufsize %BUF_BR% --mvrange 511 --level 4.1 ^
--bframe 3 --b-pyramid --b-rdo --bime --weightb --ref 3 --mixed-refs --direct auto --deblock -2:-2 --ipratio 1.10 --pbratio 1.10 --partitions "all" --8x8dct ^
--me "umh" --subme 7 --trellis 2 --no-fast-pskip --no-dct-decimate --aud --nal-hrd --sar 1:1 --cqmfile Sagittaire.cfg --aq-strength 1.00 --aq-mode 2 --fgo 5 ^
--zone %CRE_FR%,%END_FR%,b=0.33 --progress -o 1080p_Q1.264 %E_SRC%

pause

sysKin
1st June 2008, 16:35
+#define DC_COEFS_MB(block,src,stride)\
+ dc_coefs[block][0] = h->pixf.sad[PIXEL_16x16]( zero, 0, src[0], stride ) >> 1;\
+ dc_coefs[block][1] = h->pixf.sad[PIXEL_8x8] ( zero, 0, src[1], stride ) >> 1;\
+ dc_coefs[block][2] = h->pixf.sad[PIXEL_8x8] ( zero, 0, src[2], stride ) >> 1;


Shouldn't that be stride/2 for chroma planes?

Dark Shikari
1st June 2008, 17:07
Shouldn't that be stride/2 for chroma planes?

No, because x264 uses constant stride for macroblock analysis.
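
A small sketch of what a constant stride implies for the macro quoted above: if an 8x8 chroma block lives in a scratch buffer whose rows are as far apart as the luma rows, then walking it with stride/2 reads the wrong memory. The stride value, the zero padding and the helper name are assumptions for illustration only, not x264's actual buffer layout.

```python
STRIDE = 32  # hypothetical row pitch shared by the luma and chroma scratch buffers


def sad_vs_zero(buf, offset, width, height, stride):
    """SAD of a width x height block against an all-zero block, i.e. the sum of its pixels."""
    return sum(abs(buf[offset + y * stride + x])
               for y in range(height) for x in range(width))


# An 8x8 chroma block of constant value 128 stored with the full STRIDE.
# The unused space between rows is zero so that mis-indexing becomes visible.
chroma = [0] * (STRIDE * 8)
for y in range(8):
    for x in range(8):
        chroma[y * STRIDE + x] = 128

# Same ">> 1" scaling as in the quoted macro.
correct = sad_vs_zero(chroma, 0, 8, 8, STRIDE) >> 1      # walks the real rows
halved = sad_vs_zero(chroma, 0, 8, 8, STRIDE // 2) >> 1  # odd rows land in the padding

print(correct, halved)
```

In this toy layout the half-stride walk sees only every other real row, so it reports half of the correct value.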

Dark Shikari
1st June 2008, 17:31
Source is a lossless FFV1 file ...

Here are the samples:
- New psy mode
http://jfl1974.free.fr/upload/Sample_psy_01.mkv

- Old FGO mode
http://jfl1974.free.fr/upload/Sample_FGO_01.mkv

I can hardly tell the difference between them, but psy looks slightly better to me in the flattest areas (textures on the characters, etc).

Your test is a bit screwed up because you didn't use the same video clip for both though, so I suspect ratecontrol allocated bits somewhat differently ;)

I think it's probably better to test psy RDO at low bitrates, because at high bitrates the differences are so slight that nobody is going to notice them anyways.

If there is a clear case that old FGO beats psy RDO, we want it to be very blatantly obvious so I can try to find the source of the issue.

MfA
1st June 2008, 18:04
Too much ringing and artifacts for my eyes
I don't particularly care for the overall effect either.

In theory the skin effect is nice ... but the problem with that is that apart from adding some texture it adds a spurious line to his neck. Structured artifacts are a complete no go IMO.

Dark Shikari
1st June 2008, 18:08
In theory the skin effect is nice ... but the problem with that is that apart from adding some texture it adds a spurious line to his neck. Structured artifacts are a complete no go IMO.

Line? If you're looking at what I'm looking at, that's an artifact of AQ, not psy-RD, and exists on both...

Yoshiyuki Blade
1st June 2008, 18:30
The quality gain on Neo's black shirt stands out the most in the comparison. I'm really eager to see the results in anime (gonna try various bitrates), but since I'm in Vietnam on vacation, I just have a lappy to test this out on ;_;. What can I expect on animated material where there's rarely any dark detail on dark objects, or noise?

I knew I'd find myself wanting to tinker with encodes even while overseas lol.

Adub
1st June 2008, 18:51
Truthfully, I rather prefer the second picture, honestly. The jaggies introduced along all of the edges are making my eyes bleed. If this can be fixed by lowering aq, then great! But frankly, I will take something over the jaggies. Sorry DS.

asdfsauce
1st June 2008, 18:53
Dear God, that rapidshare captcha is the worst! Why do people use that site? It's horrible.. (no offense)

Sorry, so um, the Psy RDO screen for those matrix clips is the top one? To me that looks the best. There may be more artifacts but it doesn't have that artificial airbrushed look that I dislike most about h.264. It's all about seeing the forest through the trees. :)

MfA
1st June 2008, 18:56
Line? If you're looking at what I'm looking at, that's an artifact of AQ, not psy-RD, and exists on both...
I'm talking about the line in the small light area at the bottom of his neck, I can't see it in the other picture ... he also seems to have grown a scar on his forehead.
http://i278.photobucket.com/albums/kk105/_MfA_/head.png

PS. I have got to be honest though ... I have to either zoom in or reduce my resolution to see it clearly :)

Dark Shikari
1st June 2008, 19:02
It appears we're now getting down into the true danger of psy optimizations... not everyone agrees on the concept of "better" :p

MfA
1st June 2008, 19:12
How about a settable cut off for the frequencies considered in either dimension? Excluding lower frequencies would probably help prevent structured artifacts.

Dark Shikari
1st June 2008, 19:14
How about a settable cut off for the frequencies considered in either dimension? Excluding lower frequencies would probably help prevent structured artifacts.

Excluding lower frequencies would completely ruin the effect though; in many of my test encodes there was an enormous benefit from the lower frequencies, especially in "300" at lower bitrates, but really in anything where the bitrate isn't high enough to flawlessly keep grain. I also suspect the dither-keeping effect depends strongly on lower frequencies.

I'd say the low frequencies are actually the primary reason that this beats FGO in many cases.

Structured artifacts aren't a large problem if they're small and don't stay between frames. Plus, trellis should usually handle getting rid of them anyways; at least it does in my experience.

Adub
1st June 2008, 19:33
New picture for people to compare based on IgorC's provided files.

Psy RDO is on top, "regular" is on bottom.
http://i34.photobucket.com/albums/d125/Merlin7777/rdovsnordo-1.png

The more noticeable part is around his figure.

Dark Shikari
1st June 2008, 19:39
The main issue just seems to be that psy RDO moves more bits to the background in order to keep the grain, losing bits elsewhere.

In almost all the cases I've tested, this results in a dramatic improvement (even in the areas that lose bits), but this is a particularly high-contrast situation, so it ends up ringing a bit more.

Of course, I prefer ringing to blurring, but that's just me.

MfA
1st June 2008, 19:43
Small moving objects as focal points in a scene seem pretty much worst case scenario for this method.

Dark Shikari
1st June 2008, 19:45
Small moving objects as focal points in a scene seem pretty much worst case scenario for this method.

Which is somewhat odd, because the research actually shows that moving objects should be worse quality than they currently are (e.g. motion-based adaptive quantization).

akupenguin
1st June 2008, 19:47
Which is somewhat odd, because the research actually shows that moving objects should be worse quality than they currently are (e.g. motion-based adaptive quantization).
Not if they're the focus. Then you track that one object and the rest of the frame counts as moving.

Dark Shikari
1st June 2008, 19:52
Not if they're the focus. Then you track that one object and the rest of the frame counts as moving.

Hmm, true. Of course, it's hard to make an algorithm that decides what is the "focus" of the scene... :p

Dark Shikari
1st June 2008, 20:24
New patch. Updates:

1. Ignores chroma. This speeds things up and didn't seem to change things visually at all. Even if we used chroma, it should probably be weighted a lot lower anyways.

2. Raises B-RDO threshold.

3. Now available as a commandline option (--rdcmp).

jethro
1st June 2008, 20:32
Nice discussion. The differences between what people consider to be better might be due to their displays (because i don't think our eyes differ this much). For example, in x264 encodes I don't see much ringing on my monitor but i definitely see blurring and ugly smearing of details (esp without VAQ). This new RD seems to make pictures more 'dirty' which is similar in effect to dct noise mentioned in xvid encodes. Of course, there are also preferences but I can't believe displays aren't a factor.


Truthfully, I rather prefer the second picture, honestly. The jaggies introduced along all of the edges are making my eyes bleed. If this can be fixed by lowering aq, then great! But frankly, I will take something over the jaggies. Sorry DS.
Yes, the jaggies come from VAQ. I noticed that default AQ 1.0 seems too strong for low resolutions as object edges get few bits. I often find myself using AQ 0.5-0.7 for something like 720x304. For HD res actually AQ 1.0 is fine and AQ 0.5 gets blocky.

EDIT: I'd like to add that I also prefer details over artifacts:)

Dark Shikari
1st June 2008, 20:54
Here's a really dramatic example of Psy RDO's effectiveness:

No Psy RDO (http://www.mediafire.com/?wtlmwi021zh)

Psy RDO (http://www.mediafire.com/?acuumi1sino)

Atak_Snajpera
1st June 2008, 21:12
Screenshots from frame 600

PSY ON
http://img519.imageshack.us/img519/3011/psygb2.th.png (http://img519.imageshack.us/my.php?image=psygb2.png)

PSY OFF
http://img204.imageshack.us/img204/7977/nopsytg8.th.png (http://img204.imageshack.us/my.php?image=nopsytg8.png)

IgorC
1st June 2008, 22:23
I think it also depends on resolution. My sample had DVD resolution, while the new psy RDO may be more optimized for HD. Artifacts look different for HD and DVD. And of course it depends on the display and the other conditions under which the samples are watched.

Sagittaire
1st June 2008, 22:42
Edit: Oh, another possible issue is that FGO lowers B-frame quantizers and adjusts RD thresholds, which this doesn't (yet). Thus comparisons could potentially get misleading when trying to compare just the RD metric alone.

After analysing the frame sizes, I think that it's simply a different complexity analysis. For a really grainy picture (with really fine grain), FGO mode will use far more bits than psy mode.

http://jfl1974.free.fr/upload/1080p_3.264_020491.jpg (http://jfl1974.free.fr/upload/1080p_3.264_020491.png) vs http://jfl1974.free.fr/upload/1080p_Q1.264_020491.jpg (http://jfl1974.free.fr/upload/1080p_Q1.264_020491.png) vs http://jfl1974.free.fr/upload/Casino_MCCLI_BD9.264_020489.jpg (http://jfl1974.free.fr/upload/Casino_MCCLI_BD9.264_020489.png)

PSY vs FGO vs Elecard

31 832 vs 64 267 vs 63 751 bits (all frames are P-frames)

And here FGO produces by far the best quality ...
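
For scale, the three frame sizes quoted above can be expressed relative to the psy-RDO frame. This is plain arithmetic on the posted numbers; the dictionary keys and the function name are just labels chosen for this snippet.

```python
# Frame sizes in bits, as posted for the same P-frame in the three encodes.
frame_bits = {"PSY": 31832, "FGO": 64267, "Elecard": 63751}


def relative_size(bits, baseline="PSY"):
    """Size of each frame as a multiple of the baseline frame's size."""
    return {name: value / bits[baseline] for name, value in bits.items()}


ratios = relative_size(frame_bits)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

The FGO and Elecard frames are each roughly twice the size of the psy frame, so this particular frame is not being compared at equal bit cost.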