Trying to achieve transparency with x264


Blue_MiSfit
7th May 2008, 19:45
Hey folks,

I've been trying to back up my copy of Hard Candy recently, and no matter what settings I use, I can't seem to get transparent output. In fact, I can't seem to get output that looks as good as Xvid CQ2, which is almost transparent, aside from some mucking up of noise / fine details.

The x264 encode overall looks denoised (big surprise), though it retains detail in complex scenes much better (especially pores on close-ups).

It also seems like my x264 encodes have ringing when I view them at 1920x1200 on my LCD, whereas the MPEG-2 original and the Xvid CQ2 backup both look clean.

It's an SD DVD, and I don't care about file size. It's an academic experiment to see if I can compress the video at all without losing any perceptible quality.

My script is just MPEG2Source, and the x264 command line is:

--crf 18.0 --level 5.1 --ref 6 --mixed-refs --no-fast-pskip --bframes 16 --b-pyramid --b-rdo --bime --weightb --direct auto
--filter -2,-2 --subme 7 --trellis 2 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --progress
--no-psnr --no-ssim --output "output" "input"


I can't submit any screenshots / samples yet. Just looking for vague suggestions at the moment.

I'm currently re-encoding using the FGO build. Maybe that will help!

Blue_MiSfit
7th May 2008, 19:56
Also, the current CRF18 encode is coming out to ~640 MB, which seems very small!

Dark Shikari
7th May 2008, 20:02
1. Try not using trellis=2 if you want to keep grain.
2. Try using lower deadzones.
3. Try using FGO.
4. If CRF18 isn't enough, lower the CRF.

Blue_MiSfit
7th May 2008, 21:28
Interesting. I was under the impression that trellis=2 would improve subjective quality at the cost of encoding time.

I have always used trellis, so would you suggest disabling it entirely for maximum quality (with no regard to speed or compression)?

I have never experimented with deadzones, and must admit I know nothing about them. I will look into this!

FGO encode running at a speedy rate on my 3GHz quad :D In fact it's probably done already. I will check ASAP!

CRF18 seemed to be the lower limit for transparency suggested by most. I've never gone lower! I will try this if all else fails.

I might even try a 2pass encode to match the size generated by Xvid using EQM-V3HR and its usual high quality settings. That would be an interesting comparison - especially with FGO.

Thanks for your time Dark Shikari. I will report back!

~MiSfit

Dark Shikari
7th May 2008, 21:32
Quote: "Interesting. I was under the impression that trellis=2 would improve subjective quality at the cost of encoding time. I have always used trellis, so would you suggest disabling it entirely for maximum quality (with no regard to speed or compression)?"

Trellis improves things purely RD-wise (PSNR). This is often not a good idea when you're trying to retain fine detail/grain perfectly. The general effect of trellis is to make edges sharper and cleaner, since that benefits PSNR the most. Of course, to do this, it'll take bits away from fine detail/grain.

Quote: "I have never experimented with deadzones, and must admit I know nothing about them. I will look into this!"

It's just a very simple quantization algorithm: it rounds up or down based on a bias value instead of using a complex algorithm like trellis.

Quote: "I might even try a 2pass encode to match the size generated by Xvid using EQM-V3HR and its usual high quality settings. That would be an interesting comparison - especially with FGO."

Sounds like a good idea... I suspect Xvid would get creamed ;)
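To make the "round up/down based on a bias value" idea concrete, here is a minimal sketch of a generic deadzone quantizer. It is an illustration only, not x264's actual code: the function name, the `rounding_offset` parameter and the sample numbers are assumptions made up for this example, and the mapping from x264's deadzone settings to an offset is deliberately left out.

```python
import math

def deadzone_quantize(coef, step, rounding_offset):
    """Quantize one transform coefficient with a fixed rounding offset.

    rounding_offset = 0.5 is plain round-to-nearest. Smaller offsets widen
    the dead zone around zero, so more small coefficients collapse to 0.
    (Hypothetical helper for illustration, not an x264 function.)
    """
    level = math.floor(abs(coef) / step + rounding_offset)
    if level == 0:
        return 0
    return level if coef > 0 else -level

# Illustrative values only: a faint, grain-like coefficient of 7
# quantized with a step size of 10 at three different offsets.
for offset in (0.5, 0.33, 0.17):
    print(offset, deadzone_quantize(7, 10, offset))
```

With these numbers the small coefficient survives at offsets 0.5 and 0.33 and is zeroed at 0.17, which is the sense in which a smaller dead zone (a larger rounding offset) keeps more faint detail such as grain, at the cost of more bits.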

Blue_MiSfit
7th May 2008, 21:36
So trellis can improve PSNR (which I couldn't care less about), and can improve the sharpness / clarity of edges, but does so at the risk of losing fine detail / grain.

Tricky! I thought it was one of those "always use me" features, like RDO or B-Frames. Oh well - learn something new every day!

I'll get those results posted - I'll have to dust off VirtualDub and get my Xvid updated (this is a project I have had on hold for a few months)!

~MiSfit

Blue_MiSfit
8th May 2008, 00:03
Wow.

Enabling FGO 10 really made things a lot better!

At CRF18 it increased the bitrate by about 43% (from roughly 842 to 1205 kbps), but the movie still came out under 1 GB, which is over 4:1 compression.

I can still of course tell a difference between still frames, but one isn't obviously "better" than the other.

It's better than the Xvid CQ2 encode, and smaller. That's about all I can ask!

Dark Shikari
8th May 2008, 00:07
Quote: "Wow. Enabling FGO 10 really made things a lot better!"

As a general warning, you can't necessarily claim FGO made things better unless you compare to a non-FGO encode at the same bitrate, since obviously the benefit could have really come from the extra bitrate instead of FGO ;)
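The same-bitrate point can be checked with a little arithmetic on the figures from the AVInaptic reports posted later in this thread. The sketch below is only a sanity check: the helper names are made up for this example, and it assumes 1 kbps = 1000 bit/s, which is the convention that reproduces the Xvid bitrate AVInaptic prints.

```python
def bitrate_kbps(stream_bytes, duration_s):
    """Average bitrate in kbps (1 kbps = 1000 bit/s) of a video stream."""
    return stream_bytes * 8 / duration_s / 1000

def percent_increase(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

# Xvid CQ2: stream size and play duration as reported by AVInaptic.
xvid_kbps = bitrate_kbps(1_029_490_326, 6250.077166)

# x264 bitrates as reported by AVInaptic (kbps).
crf18_kbps = 842.253626        # CRF 18, no FGO
crf18_fgo_kbps = 1204.573100   # CRF 18, FGO 10
twopass_kbps = 1204.128500     # 2pass, no FGO

fgo_gain = percent_increase(crf18_kbps, crf18_fgo_kbps)
mismatch = percent_increase(twopass_kbps, crf18_fgo_kbps)

print(f"Xvid CQ2:                    {xvid_kbps:.1f} kbps")
print(f"FGO vs. plain CRF18:         +{fgo_gain:.1f}% bitrate")
print(f"FGO vs. 2pass bitrate match: {mismatch:.3f}% apart")
```

The two CRF18 encodes differ substantially in bitrate, so they cannot isolate the effect of FGO, whereas the 2pass encode lands within a small fraction of a percent of the FGO encode and is the fair comparison.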

smok3
8th May 2008, 00:08
Blue_MiSfit: care to share the command line? /and what win compile did you use, if you are on win that is?/

Blue_MiSfit
8th May 2008, 00:14
@ Dark:

I recall our conversation a while back in the FGO thread :)

I'm currently making a 2pass non FGO encode to match the FGO CRF18 bitrate. That should settle things nicely.

@smok3:

I'm working on proper documentation and screenshots. It should all be up soon :)

x264 build for the non FGO version was r839 from MeGUI auto-update.

~MiSfit

Blue_MiSfit
8th May 2008, 00:32
Here are some more details (from AVInaptic)

Xvid - EQM-V3HR, CQ2


[ About file ]

Name: Hard - Xvid EQM-V3HR CQ2.avi
Date: 21/04/2008 08:52:35
Size: 1,033,195,288 bytes (985.332 MB)

[ Generic infos ]

Play duration: 01:44:10 (6250.077166 s)
Container type: AVI OpenDML
Number of streams: 1
Type of stream nr. 0: video
Audio streams: 0
JUNK: VirtualDub build 29393/release

[ Relevant data ]

Resolution: 720 x 480
Width: multiple of 16
Height: multiple of 32

[ Video track ]

FourCC: xvid/XVID
Resolution: 720 x 480
Frame aspect ratio: 3:2 = 1.5
Pixel aspect ratio: 40:33 = 1.212121
Display aspect ratio: 20:11 = 1.818181
Framerate: 23.976023 fps
Number of frames: 149852
Stream size: 1,029,490,326 bytes
Bitrate: 1317.731347 kbps
Qf: 0.159028
Key frames: 1144 (0; 300; 600; 634; 678; ... 149829)
Null frames: 0
Min key int: 1
Max key int: 300
Avg key int: 130.989510
Delay: 0 ms

[ About MPEG4 encoding ]

User data: DivX503b1393p
User data: XviD0043
Packed bitstream: Yes (*)
QPel: Yes (*)
GMC: No
Interlaced: No
Aspect ratio: 40:33 (16:9 NTSC pixel shape)
Quant type: MPEG custom (*)
Custom intra quant matrix:
8 10 10 10 11 11 13 15
10 10 10 10 11 12 14 16
10 10 11 11 13 14 16 18
10 10 11 13 15 17 19 23
11 11 13 15 19 22 26 29
11 12 14 17 22 28 34 41
13 14 16 19 26 34 44 55
15 16 18 23 29 41 55 72
Custom inter quant matrix:
15 15 15 15 16 17 19 22
15 15 15 15 16 18 20 23
15 15 16 17 19 20 23 27
15 15 17 19 22 25 29 33
16 16 19 22 28 32 38 43
17 18 20 25 32 41 50 60
19 20 23 29 38 50 66 81
22 23 27 33 43 60 81 106

[ Profile compliancy ]

Profile to check: MTK PAL 6000
Resolution: Ok
Framerate: 23.976023 <> 25
Warning: If you need a more complete report, then click on "DRF analysis"

This report was created by AVInaptic (18-11-2007) on 7 mag 2008, h 15:29:48


x264 - CRF 18, no FGO


[ About file ]

Name: Hard - CRF18.mp4
Date: 6/05/2008 23:27:25
Size: 659,673,618 bytes (629.114 MB)

[ Generic infos ]

Play duration: 01:44:10 (6250.076666 s)
Container type: MP4/MOV
Major brand: JVT AVC version 0
Compatible brands: ISO Base Media
Creation time: 6/05/2008 04:55:34 UTC
Modification time: 6/05/2008 04:55:34 UTC
Number of streams: 1
Type of stream nr. 1: video (avc1) {GPAC ISO Video Handler}
Audio streams: 0

[ Relevant data ]

Resolution: 720 x 480
Width: multiple of 16
Height: multiple of 32

[ Video track ]

Codec: avc1
Resolution: 720 x 480
Frame aspect ratio: 3:2 = 1.5
Pixel aspect ratio: 1:1 = 1
Display aspect ratio: 3:2 = 1.5
Framerate: 23.976023 fps
Number of frames: 149852
Bitrate: 842.253626 kbps

[ About H.264 encoding ]

User data: x264
User data: core 59 r839M 27c3071
User data: H.264/MPEG-4 AVC codec
User data: Copyleft 2003-2008
User data: http://www.videolan.org/x264.html
User data: cabac=1
User data: ref=6
User data: deblock=1:-2:-2
User data: analyse=0x3:0x113
User data: me=umh
User data: subme=7
User data: me-prepass=0
User data: brdo=1
User data: mixed_ref=1
User data: me_range=16
User data: chroma_me=1
User data: trellis=2
User data: 8x8dct=1
User data: cqm=0
User data: deadzone=21,11
User data: chroma_qp_offset=0
User data: threads=6
User data: nr=0
User data: decimate=1
User data: mbaff=0
User data: fgo=0
User data: bframes=16
User data: b_pyramid=1
User data: b_adapt=1
User data: b_bias=0
User data: direct=3
User data: wpredb=1
User data: bime=1
User data: keyint=250
User data: keyint_min=25
User data: scenecut=40(pre)
User data: rc=crf
User data: crf=18.0
User data: rceq='blurCplx^(1-qComp)'
User data: qcomp=1.00
User data: qpmin=10
User data: qpmax=51
User data: qpstep=4
User data: ip_ratio=1.40
User data: pb_ratio=1.30
User data: aq=2:1.00
SPS id: 0
Profile: High@L5.1
Num ref frames: 6
Aspect ratio: Square pixels
Chroma format idc: YUV 4:2:0
PPS id: 0 (SPS: 0)
Entropy coding type: CABAC
Weighted prediction: No
Weighted bipred idc: B slices - implicit weighted prediction
8x8dct: Yes

[ Profile compliancy ]

Profile to check: MTK PAL 6000
Resolution: Ok
Framerate: 23.976023 <> 25
Warning: If you need a more complete report, then click on "DRF analysis"

This report was created by AVInaptic (18-11-2007) on 7 mag 2008, h 15:29:44


x264 - CRF 18, with FGO


[ About file ]

Name: Hard - CRF18 FGO.mp4
Date: 7/05/2008 12:35:22
Size: 942,739,266 bytes (899.066 MB)

[ Generic infos ]

Play duration: 01:44:10 (6250.076666 s)
Container type: MP4/MOV
Major brand: JVT AVC version 0
Compatible brands: ISO Base Media
Creation time: 6/05/2008 17:50:52 UTC
Modification time: 6/05/2008 17:50:52 UTC
Number of streams: 1
Type of stream nr. 1: video (avc1) {GPAC ISO Video Handler}
Audio streams: 0

[ Relevant data ]

Resolution: 720 x 480
Width: multiple of 16
Height: multiple of 32

[ Video track ]

Codec: avc1
Resolution: 720 x 480
Frame aspect ratio: 3:2 = 1.5
Pixel aspect ratio: 1:1 = 1
Display aspect ratio: 3:2 = 1.5
Framerate: 23.976023 fps
Number of frames: 149852
Bitrate: 1204.573100 kbps

[ About H.264 encoding ]

User data: x264
User data: core 59 r826M 138601d
User data: H.264/MPEG-4 AVC codec
User data: Copyleft 2003-2008
User data: http://www.videolan.org/x264.html
User data: cabac=1
User data: ref=6
User data: deblock=1:-2:-2
User data: analyse=0x3:0x113
User data: me=umh
User data: subme=7
User data: brdo=1
User data: mixed_ref=1
User data: me_range=16
User data: chroma_me=1
User data: trellis=2
User data: 8x8dct=1
User data: cqm=0
User data: deadzone=21,11
User data: chroma_qp_offset=0
User data: threads=6
User data: nr=0
User data: decimate=1
User data: mbaff=0
User data: fgo=10
User data: bframes=16
User data: b_pyramid=1
User data: b_adapt=1
User data: b_bias=0
User data: direct=3
User data: wpredb=1
User data: bime=1
User data: keyint=250
User data: keyint_min=25
User data: scenecut=40(pre)
User data: rc=crf
User data: crf=18.0
User data: rceq='blurCplx^(1-qComp)'
User data: qcomp=1.00
User data: qpmin=10
User data: qpmax=51
User data: qpstep=4
User data: ip_ratio=1.40
User data: pb_ratio=1.15
User data: aq=2:1.00
SPS id: 0
Profile: High@L5.1
Num ref frames: 6
Aspect ratio: Square pixels
Chroma format idc: YUV 4:2:0
PPS id: 0 (SPS: 0)
Entropy coding type: CABAC
Weighted prediction: No
Weighted bipred idc: B slices - implicit weighted prediction
8x8dct: Yes

[ Profile compliancy ]

Profile to check: MTK PAL 6000
Resolution: Ok
Framerate: 23.976023 <> 25
Warning: If you need a more complete report, then click on "DRF analysis"

This report was created by AVInaptic (18-11-2007) on 7 mag 2008, h 15:29:38


x264 - 2pass, 1204kbit, no FGO


[ About file ]

Name: Hard - 2pass 1204 - no FGO.mp4
Date: 7/05/2008 19:08:58
Size: 942,391,853 bytes (898.735 MB)

[ Generic infos ]

Play duration: 01:44:10 (6250.076666 s)
Container type: MP4/MOV
Major brand: JVT AVC version 0
Compatible brands: ISO Base Media
Creation time: 7/05/2008 00:10:25 UTC
Modification time: 7/05/2008 00:10:25 UTC
Number of streams: 1
Type of stream nr. 1: video (avc1) {GPAC ISO Video Handler}
Audio streams: 0

[ Relevant data ]

Resolution: 720 x 480
Width: multiple of 16
Height: multiple of 32

[ Video track ]

Codec: avc1
Resolution: 720 x 480
Frame aspect ratio: 3:2 = 1.5
Pixel aspect ratio: 1:1 = 1
Display aspect ratio: 3:2 = 1.5
Framerate: 23.976023 fps
Number of frames: 149852
Bitrate: 1204.128500 kbps

[ About H.264 encoding ]

User data: x264
User data: core 59 r839M 27c3071
User data: H.264/MPEG-4 AVC codec
User data: Copyleft 2003-2008
User data: http://www.videolan.org/x264.html
User data: cabac=1
User data: ref=6
User data: deblock=1:-2:-2
User data: analyse=0x3:0x113
User data: me=umh
User data: subme=7
User data: me-prepass=0
User data: brdo=1
User data: mixed_ref=1
User data: me_range=16
User data: chroma_me=1
User data: trellis=2
User data: 8x8dct=1
User data: cqm=0
User data: deadzone=21,11
User data: chroma_qp_offset=0
User data: threads=6
User data: nr=0
User data: decimate=1
User data: mbaff=0
User data: fgo=0
User data: bframes=16
User data: b_pyramid=1
User data: b_adapt=1
User data: b_bias=0
User data: direct=3
User data: wpredb=1
User data: bime=1
User data: keyint=250
User data: keyint_min=25
User data: scenecut=40(pre)
User data: rc=2pass
User data: bitrate=1204
User data: ratetol=1.0
User data: rceq='blurCplx^(1-qComp)'
User data: qcomp=1.00
User data: qpmin=10
User data: qpmax=51
User data: qpstep=4
User data: cplxblur=20.0
User data: qblur=0.5
User data: ip_ratio=1.40
User data: pb_ratio=1.30
User data: aq=2:1.00
SPS id: 0
Profile: High@L5.1
Num ref frames: 6
Aspect ratio: Square pixels
Chroma format idc: YUV 4:2:0
PPS id: 0 (SPS: 0)
Entropy coding type: CABAC
Weighted prediction: No
Weighted bipred idc: B slices - implicit weighted prediction
8x8dct: Yes

[ Profile compliancy ]

Profile to check: MTK PAL 6000
Resolution: Ok
Framerate: 23.976023 <> 25
Warning: If you need a more complete report, then click on "DRF analysis"

This report was created by AVInaptic (18-11-2007) on 7 mag 2008, h 21:56:40
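Since the reports above are long, a small script can pull out what actually differs between two encodes. This is a rough sketch, not an AVInaptic feature: the function names are made up here, and only a short excerpt of the "User data" lines from the CRF18 non-FGO and CRF18 FGO reports is embedded to keep it brief.

```python
def parse_user_data(report_text):
    """Collect 'User data: key=value' lines from a report into a dict."""
    settings = {}
    for line in report_text.splitlines():
        line = line.strip()
        if not line.startswith("User data:"):
            continue
        payload = line[len("User data:"):].strip()
        if "=" not in payload:
            continue  # e.g. the 'core 59 r839M ...' build line
        key, value = payload.split("=", 1)
        settings[key] = value
    return settings

def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for every setting that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Excerpts copied from the two CRF18 reports above (not the full reports).
NO_FGO = """
User data: core 59 r839M 27c3071
User data: me-prepass=0
User data: trellis=2
User data: deadzone=21,11
User data: fgo=0
User data: crf=18.0
User data: ip_ratio=1.40
User data: pb_ratio=1.30
"""

WITH_FGO = """
User data: core 59 r826M 138601d
User data: trellis=2
User data: deadzone=21,11
User data: fgo=10
User data: crf=18.0
User data: ip_ratio=1.40
User data: pb_ratio=1.15
"""

changes = diff_settings(parse_user_data(NO_FGO), parse_user_data(WITH_FGO))
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```

On these excerpts the diff reports fgo, pb_ratio and me-prepass. In other words, besides FGO itself and the different build (r839M vs. r826M), the FGO encode also ran with a different pb_ratio, which is worth keeping in mind when comparing B-frames between the two.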

Avenger007
8th May 2008, 06:22
Quote: "Trellis improves things purely RD-wise (PSNR). This is often not a good idea when you're trying to retain fine detail/grain perfectly. The general effect of trellis is to make edges sharper and cleaner, since that benefits PSNR the most. Of course, to do this, it'll take bits away from fine detail/grain."

In MeGUI, the Trellis option comment says:

"Performs Trellis quantisation to increase efficiency. On Macroblock provides a good compromise between speed and efficiency. On all decisions reduces speed further."

What is meant by efficiency and how is it determined?

Dark Shikari
8th May 2008, 06:37
Quote: "In MeGUI, the Trellis option comment says [...] What is meant by efficiency and how is it determined?"

Efficiency means PSNR-at-a-given-bitrate, or more precisely, RD score.
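For readers unfamiliar with the two terms, here is a minimal sketch of PSNR and of a generic rate-distortion cost of the form J = D + lambda * R. It is a textbook-style illustration, not x264's internal metric code; the function names, the lambda values and the two candidate codings are hypothetical numbers chosen for the example.

```python
import math

def mse(original, encoded):
    """Mean squared error between two equally long lists of pixel values."""
    return sum((o - e) ** 2 for o, e in zip(original, encoded)) / len(original)

def psnr(original, encoded, peak=255):
    """Peak signal-to-noise ratio in dB for 8-bit samples by default."""
    err = mse(original, encoded)
    if err == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(peak ** 2 / err)

def rd_cost(distortion, bits, lam):
    """Generic rate-distortion cost: distortion plus lambda times rate."""
    return distortion + lam * bits

# Two hypothetical ways to code the same block:
#   "smooth" drops faint texture: more distortion, few bits.
#   "grainy" keeps it:            less distortion, many bits.
candidates = {"smooth": (100, 50), "grainy": (60, 120)}

for lam in (1.0, 0.2):
    best = min(candidates, key=lambda name: rd_cost(*candidates[name], lam))
    print(f"lambda={lam}: RD-optimal choice is {best}")
```

With the made-up numbers above, the larger lambda favours the cheaper, smoother coding and the smaller lambda favours the grainier one. That is the trade-off behind the remark that an RD-optimal decision can take bits away from fine detail.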

Avenger007
8th May 2008, 06:57
Quote: "Efficiency means PSNR-at-a-given-bitrate, or more precisely, RD score."

OK, I thought it was something like that. They could have just said that instead of vaguely saying "increase efficiency". :rolleyes:

Blue_MiSfit
8th May 2008, 07:05
Well, I finished the 2pass, no FGO version, and it came out as stated above (updated previous post).

The Average Q was 17.4.

Here's a look at a couple of scenes:

http://www.flickr.com/photos/blue_misfit/sets/72157604942405954/

http://farm4.static.flickr.com/3097/2475528518_4477808d88_o.png
http://farm3.static.flickr.com/2274/2474710973_0ca062916f_o.png

There are more on the Flickr page.

~MiSfit

smok3
8th May 2008, 10:49
just from staring at the screenshots i would say xvid still wins (the background is just too flat in all the x264 encodes). is FGO 10 the strongest setting for retaining grain?

Dark Shikari
8th May 2008, 11:13
Quote: "with just staring at the screenshots i would say xvid still wins"

If you look more carefully, you'll notice there's "detail" in xvid that isn't even in the source ;)

x264 doesn't try to add detail that isn't there in the first place; if that's what you want, there's AddGrain().

smok3
8th May 2008, 12:01
Dark Shikari, that would mean my eyes are really old, or you are biased :)

Dark Shikari
8th May 2008, 12:13
Quote: "Dark Shikari, that would mean my eyes are really old, or you are biased :)"

Perhaps you should actually look at the posted images before commenting? :p

http://img178.imageshack.us/img178/2406/differencezs2.png

Looks like pretty blatant oversharpening to me. Now, it might look better to you when sharpened, but the proper way to sharpen a video is using the appropriate Avisynth filter, not during the encoding process.

Most of the "detail" I see in the Xvid shots just looks like oversharpened crap from the CQM used. Of course if you used a similar CQM with x264 you could achieve the same effect, most likely.

I'd be curious how it looks in motion, especially with Xvid's tendency to completely destroy detail in B-frames.
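For anyone wanting to make a similar difference image themselves, one common approach is to subtract the encode from the source pixel by pixel, amplify the result and centre it on mid-gray. The sketch below shows that idea on plain nested lists of 8-bit luma values. It is an assumption about a typical method, not necessarily how the image linked above was produced, and the function name and default gain are made up for the example.

```python
def difference_image(frame_a, frame_b, gain=4, mid_gray=128):
    """Amplified per-pixel difference of two same-sized 8-bit frames.

    Pixels where the frames agree come out as mid_gray; brighter or darker
    pixels show where frame_a is above or below frame_b. Results are
    clamped to the 0..255 range.
    """
    out = []
    for row_a, row_b in zip(frame_a, frame_b):
        out.append([
            max(0, min(255, mid_gray + gain * (a - b)))
            for a, b in zip(row_a, row_b)
        ])
    return out

# Tiny made-up example: one row of source pixels against one row from an
# encode that overshoots on the second pixel, as sharpening halos would.
source = [[100, 100, 100]]
encode = [[100, 104, 100]]
print(difference_image(source, encode))
```

Oversharpening shows up in such an image as paired light and dark outlines along edges, whereas lost grain shows up as fine noise spread over flat areas.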

Shinigami-Sama
8th May 2008, 12:24
my monitor must be getting worse than I thought it was
I can't tell the difference between any of the images except the mpeg2 and xvid shots of the guy sweating, and the xvid one is very oversharpened looking

my god I want my other monitor back :(

smok3
8th May 2008, 12:25
yes, the screenshots shouldn't be crucial. about oversharpening: it's either that (to my eyes it is very slight) or overblurring with x264 (which to my eyes isn't just slight).

nm
8th May 2008, 12:54
It would be nice to see what it looks like with trellis=0 or 1 instead of 2.

CruNcher
8th May 2008, 14:24
Yeah, the Xvid one is definitely artificially sharpened (and it's not slight); look at the finger, for example. As Dark said, the most probable cause seems to be the quant matrix used.
And yeah, the overblurring in x264 should be less with trellis off, though lower deadzones are better handled carefully or you get oversharpening like with the Xvid one (ringing gets introduced then). On the other hand, it's not bad, especially for MPEG-2 transcoding: this oversharpening artificially brings back something much nearer to the original subjective sharpness level that got lost in the source->MPEG-2 quantisation. In all the tests I did, I found sharpening subjectively much better when done during encoding than when done outside the encoder.

smok3
8th May 2008, 14:57
i wonder how --nf and --fgo go together?

Audionut
8th May 2008, 15:52
The x264 screenshots are lighter than the source and xvid shots.

And the FGO encodes look best to me. A little more bitrate and it would be sweet.

CruNcher
8th May 2008, 21:18
Btw this is also very interesting
http://farm3.static.flickr.com/2274/2474710973_0ca062916f_o.png

Look at the facial details: they are definitely much better preserved by Xvid here compared to the blurred look and feel of x264, even with FGO, so I'm almost 100% sure trellis is the cause of that (all my own tests and suggestions to the devs about this matter point the same way), and it might even be that the x264 deblocker plays a small role here too. But yeah, a high-motion sample of this would be really nice, as this is especially the hard stuff for Xvid compared to H.264 in terms of picture stability :)
Though there's also the question of whether this was a B-frame vs. P-frame comparison, or both P-frames, or even I vs. B; all of this plays a role here. What I actually want to say is that without a subjective comparison it's useless to judge, especially the pros and cons at high motion vs. the detail lost by H.264 in this low-motion section. From these 2 stills alone it's really hard to say anything. The actual sizes of each frame would also be a good thing to know.

I think a much better testing methodology for this would be to restrict both to the exact same settings, so you also have the same frame types at the same times, without mixing in the bad, untuned adaptive RC stuff in x264.

A speed comparison would also be nice, because you put so many CPU cycles into the x264 encode that, based only on these stills, I would say for now it wasn't worth it if the Xvid encode was faster :P, just pure energy thrown out of the window for nothing on the H.264 one. But without seeing the whole encode, or at least the bad parts (as I said, especially a high-motion comparison), and the time taken for both encodes, it's hard to do anything but make assumptions based on the little data provided.

BTW it would be cool if you could also try out my old EDP build on this, using the Better profile :)

*.mp4 guy
8th May 2008, 22:47
Screenshots are almost completely useless for comparing video quality; they hide the smearing introduced by Xvid and the flickering introduced by x264, among many other failings. If you want good advice on reaching complete transparency, post a short VOB sample, along with the AviSynth script you use and your best x264 encode of that sample; that will give us a lot more to work with.

Dark Shikari
8th May 2008, 22:55
Quote: "Screenshots are almost completely useless for comparing video quality, they hide the smearing introduced by xvid, and the flickering introduced by X264, among many other failings."

Yup. There are a number of issues in both:

1. x264's tendency to blur B-frames.
2. Xvid's tendency to completely destroy B-frames due to its much higher pbratio value.
3. Xvid's tendency to turn grain into overlapping shifting blocks of ugliness.
4. The appearance of grain *while in motion*.

Plus, if you compare different frametypes between the two videos (e.g. I with P, B with P, etc) you're bound to get biased results.

And yes, I agree that here the use of trellis is probably counterproductive to grain retention.

Blue_MiSfit
8th May 2008, 23:42
Alright, well I'm pretty durn satisfied with the FGO+trellis encodes, but I will do another one, at CRF17 with no trellis. That should be just about perfect for me, and still be a manageable filesize.

~MiSfit