piping 10bit 444le to x265|x264 => artifacts in output


Selur
12th February 2017, 21:12
I get artefacts in the upper left corner when I feed x264 and x265 with yuv444p10le.

using x264 with 10bit input:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p10le -f yuv4mpegpipe - | x264-10bit --preset ultrafast --crf 18.00 --sar 1:1 --demuxer y4m --output-csp i444 --fps 24 --output "H:\Output\x264.264" -
-> output is broken

using x265 with 10bit input:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p10le -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 10 --y4m --crf 18.00 --output "H:\Output\x265.265"
-> output is broken

using vp9 with 10bit input (after the encoding I muxed the output to mkv):
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p10le -f yuv4mpegpipe - | vpxenc --codec=vp9 --passes=1 --pass=1 --end-usage=cq --cq-level=18 --target-bitrate=80000 --profile=3 --good --cpu-used=2 --min-q=0 --max-q=63 --undershoot-pct=0 --buf-sz=6 --buf-initial-sz=4 --buf-optimal-sz=5 --drop-frame=0 --resize-allowed=0 --kf-min-dist=0 --kf-max-dist=250 --auto-alt-ref=0 --noise-sensitivity=0 --sharpness=0 --static-thresh=0 --tile-columns=2 --tile-rows=1 --min-gf-interval=0 --max-gf-interval=0 --threads=16 --width=2400 --height=1350 --yv12 --color-space=unknown --target-level=255 --input-bit-depth=10 --bit-depth=10 -o "H:\Temp\20_44_51_8310_01.vp9" -
-> output is fine!

side note: vp9 is still way too slow to be usable on my machine (i7 4770k), at 0.5 fps :(

outputting y4m into a file using:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p10le -f yuv4mpegpipe h:\test.y4m
and playing the file back with ffplay the output looks fine.

using x264 or x265 with 16bit input:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p16le -f yuv4mpegpipe - | x264-10bit --preset ultrafast --crf 18.00 --sar 1:1 --demuxer y4m --output-csp i444 --fps 24 --output "H:\Output\x264_16bitInput.264" -
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p16le -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 10 --y4m --crf 18.00 --output "H:\Output\x265_16bitInput.265"
the output is fine.

Uploaded the input and the output to my Google Drive (https://drive.google.com/drive/folders/0B_WxUS1XGCPAU1poaXdzNW10Qm8?usp=sharing).

=> Is this a bug in ffmpeg or in x265 and x264? Did I make a mistake? At the moment I suspect this is a bug in x264 & x265, since both the y4m output file and the encode with vpxenc look fine.

I also tried using rawvideo instead of yuv4mpegpipe (then with the raw demuxer and specifying resolution, frame rate and input depth in x264 & x265), but the output is still broken.

Cu Selur

sneaker_ger
12th February 2017, 22:08
"x265.265" is 4:2:0 8 bit according to MediaInfo. Can you post verbose ffmpeg log and a screenshot of how it looks?

Selur
12th February 2017, 22:18
I must have uploaded the wrong file; over here MediaInfo reports:
General
Complete name : h:\Output\x265.265
Format : HEVC
Format/Info : High Efficiency Video Coding
File size : 26.0 MiB
Writing library : x265 2.2+36-9b975fec584a:[Windows][GCC 6.3.0][64 bit] 10bit
Encoding settings : cpuid=1173503 / frame-threads=3 / numa-pools=8 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=3 / input-res=2400x1350 / interlace=0 / total-frames=0 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=1 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=24 / keyint=250 / bframes=3 / b-adapt=0 / b-pyramid / bframe-bias=0 / rc-lookahead=5 / lookahead-slices=8 / scenecut=0 / no-intra-refresh / ctu=32 / min-cu-size=16 / no-rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=0 / dynamic-rd=0.00 / no-signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=2 / limit-refs=0 / no-limit-modes / me=0 / subme=0 / merange=57 / temporal-mvp / no-weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / no-sao / no-sao-non-deblock / rd=2 / early-skip / rskip / fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=0.00 / no-rd-refine / analysis-mode=0 / no-lossless / cbqpoffs=6 / crqpoffs=6 / rc=crf / crf=18.0 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / ipratio=1.40 / pbratio=1.30 / aq-mode=1 / aq-strength=0.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / sar=1 / overscan=0 / videoformat=5 / range=0 / colorprim=2 / transfer=2 / colormatrix=2 / chromaloc=0 / display-window=0 / max-cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / opt-qp-pps / opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / no-aq-motion / no-hdr

Video
Format : HEVC
Format/Info : High Efficiency Video Coding
Format profile : @L5@Main
Width : 2 400 pixels
Height : 1 350 pixels
Display aspect ratio : 16:9
Frame rate : 24.000 FPS
Color space : YUV
Chroma subsampling : 4:4:4
Bit depth : 10 bits
Writing library : x265 2.2+36-9b975fec584a:[Windows][GCC 6.3.0][64 bit] 10bit
Encoding settings : cpuid=1173503 / frame-threads=3 / numa-pools=8 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=3 / input-res=2400x1350 / interlace=0 / total-frames=0 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=1 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=24 / keyint=250 / bframes=3 / b-adapt=0 / b-pyramid / bframe-bias=0 / rc-lookahead=5 / lookahead-slices=8 / scenecut=0 / no-intra-refresh / ctu=32 / min-cu-size=16 / no-rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=0 / dynamic-rd=0.00 / no-signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=2 / limit-refs=0 / no-limit-modes / me=0 / subme=0 / merange=57 / temporal-mvp / no-weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / no-sao / no-sao-non-deblock / rd=2 / early-skip / rskip / fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / rdpenalty=0 / psy-rd=2.00 / psy-rdoq=0.00 / no-rd-refine / analysis-mode=0 / no-lossless / cbqpoffs=6 / crqpoffs=6 / rc=crf / crf=18.0 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=0 / ipratio=1.40 / pbratio=1.30 / aq-mode=1 / aq-strength=0.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / sar=1 / overscan=0 / videoformat=5 / range=0 / colorprim=2 / transfer=2 / colormatrix=2 / chromaloc=0 / display-window=0 / max-cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / opt-qp-pps / opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / no-aq-motion / no-hdr

-> uploaded the correct file
and here's a screenshot of the problem:
https://s28.postimg.org/6lyuywbhl/x265_265_snapshot_00_00_2017_02_12_22_16_12.png (https://postimg.org/image/6lyuywbhl/) (black stuff in the upper left corner)

sneaker_ger
12th February 2017, 23:24
Interestingly, the .y4m file plays fine if converted to RGB using LAV. But the same artifacts as in your screenshot appear when you let madVR do the conversions. :confused:

Selur
12th February 2017, 23:31
Playing the source with LAV + madVR also shows the artifacts.
ffplay plays the y4m (and the source) fine.

Cu Selur

Ps.: Really hoping one of the x265 or x264 devs can shed some light on this.

Jamaika
13th February 2017, 00:30
No problem ;)
ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=1920:1080:in_color_matrix=bt2020_ncl:in_range=full:out_color_matrix=bt2020_ncl:out_range=full,format=yuv444p10le -strict -1 - |
x265.exe --y4m --input-csp i422 --input-depth 10 --output-depth 10 --preset veryslow --crf 28 --fps 29.970 --keyint 60 --info --no-open-gop --no-hrd --high-tier --me sea --aq-mode 0 --no-cutree --ipratio 1.0 --pbratio 1.1
--qpstep 1 --no-sao --psy-rd 4.0 --psy-rdoq 10.0 --no-rskip --vbv-bufsize 40000 --vbv-maxrate 40000 --colormatrix bt2020nc --colorprim bt2020 --transfer bt2020-10 --limit-ref 0 --range full --output "output.h265" -

Selur
13th February 2017, 00:37
that (adjusted the resolution to the input resolution) gives me:
ffmpeg.exe -loglevel verbose -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt2020_ncl:in_range=full:out_color_x=bt2020_ncl:out_range=full,format=yuv444p10le -strict -1 - | x265.exe --y4m --input-csp i422 --input-depth 10 --output-depth 10 --preset veryslow --crf 28 --fps 29.970 --keyint 60 --info --no-open-gop --no-hrd --high-tier --me sea --aq-mode 0 --no-cutree --ipratio 1.0 --pbratio 1.1 --qpstep 1 --no-sao --psy-rd 4.0 --psy-rdoq 10.0 --no-rskip --vbv-bufsize 40000 --vbv-maxrate 40000 --colormatrix bt2020nc --colorprim bt2020 --transfer bt2020-10 --limit-ref 0 --range full --output "h:\Output\output.h265" -
ffmpeg version N-83486-g25d9cb4621 Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 6.3.0 (Rev1, Built by MSYS2 project)
configuration: --enable-avisynth --enable-gmp --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --disable-w32threads --enable-fontconfig --enable-frei0r --enable-gnutls --enable-libass --enable-libbluray --enable-libbs2b --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libwavpack --enable-libwebp --enable-libxavs --enable-libxvid --enable-libzimg --enable-libsnappy --enable-gpl --enable-nvenc --enable-version3 --enable-filter=frei0r --disable-debug
libavutil 55. 46.100 / 55. 46.100
libavcodec 57. 78.100 / 57. 78.100
libavformat 57. 66.102 / 57. 66.102
libavdevice 57. 2.100 / 57. 2.100
libavfilter 6. 73.100 / 6. 73.100
libswscale 4. 3.101 / 4. 3.101
libswresample 2. 4.100 / 2. 4.100
libpostproc 54. 2.100 / 54. 2.100
Routing option strict to both codec and muxer layer
[mxf @ 0000021ff767acc0] Dark key 06.0e.2b.34.01.01.01.02.03.01.02.10.01.00.00.00
Last message repeated 1 times
[mxf @ 0000021ff767acc0] Dark key 06.0e.2b.34.02.53.01.01.0d.01.01.01.01.01.23.00
[mxf @ 0000021ff767acc0] Dark key 06.0e.2b.34.01.01.01.02.03.01.02.10.01.00.00.00
Last message repeated 3 times
[mxf @ 0000021ff767acc0] Dark key 06.0e.2b.34.02.05.01.01.0d.01.02.01.01.11.01.00
[mxf @ 0000021ff767acc0] Dark key 06.0e.2b.34.01.01.01.02.03.01.02.10.01.00.00.00
Last message repeated 1 times
[mxf @ 0000021ff767acc0] dnxhd: Universal Label: 060e2b34.0401.010a.04010202.71240000
[mxf @ 0000021ff767acc0] none: Universal Label: 00000000.0000.0000.00000000.00000000
[dnxhd @ 0000021ff767b9e0] Profile cid 1270.
[dnxhd @ 0000021ff767b9e0] 2400x1350, 4:4:4 12 bits, MBAFF=0 ACT=1
[mxf @ 0000021ff767acc0] Stream #0: not enough frames to estimate rate; consider increasing probesize
Guessed Channel Layout for Input Stream #0.1 : stereo
Input #0, mxf, from 'F:\TestClips&Co\vc-3\original_vc3.mxf':
Metadata:
project_name : Untitled Project
uid : b0772c86-c911-7c4a-9500-58807f478d51
generation_uid : 19b73308-2cb5-4144-b000-a8c7d0832ad8
company_name : Blackmagic Design
product_name : DaVinci Resolve
product_version : 12.5.4
product_uid : 057cd849-178a-4b88-b4c7-825af8761b34
modification_date: 2017-01-25T15:49:31.000000Z
application_platform: libMXF (Win32)
material_package_umid: 0x060A2B340101010501010D43130000005888C90A015B0000060E2B347F7F2A80
material_package_name: 77
timecode : 01:00:00:00
Duration: 00:00:13.50, start: 0.000000, bitrate: 552431 kb/s
Stream #0:0: Video: dnxhd (DNXHR 444), 1 reference frame, yuv444p12le(bt709/unknown/unknown, progressive), 2400x1350, SAR 1:1 DAR 16:9, 24 tbr, 24 tbn, 24 tbc
Metadata:
file_package_umid: 0x060A2B340101010501010D43130000005888C90A015C0000060E2B347F7F2A80
file_package_name: 77
track_name : 77_v1
Stream #0:1: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
Metadata:
file_package_umid: 0x060A2B340101010501010D43130000005888C90A015C0000060E2B347F7F2A80
file_package_name: 77
track_name : 77_A01
[Parsed_scale_0 @ 0000021ff76d2e00] Option 'out_color_x' not found
[AVFilterGraph @ 0000021ff76d0c00] Error initializing filter 'scale' with args '2400:1350:in_color_matrix=bt2020_ncl:in_range=full:out_color_x=bt2020_ncl:out_range=full:flags=bicubic'
Error opening filters!
x265 [error]: unable to open input file <->
which doesn't really help,..

Selur
13th February 2017, 00:45
I'm also really confused why you use 'format=yuv444p10le' together with '--input-csp i422 --input-depth 10'; that just looks wrong to me. I have no clue how your call could produce anything usable.
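To illustrate why that combination looks inconsistent, here is a small Python sketch comparing the size of one planar frame under both descriptions. The helper `frame_bytes` is hypothetical and written only for this illustration; whether x265 actually prefers the flag over the y4m header is not established in this thread.

```python
def frame_bytes(width: int, height: int, csp: str, depth: int) -> int:
    """Bytes in one planar frame; samples deeper than 8 bits take 2 bytes each."""
    bytes_per_sample = 1 if depth <= 8 else 2
    half_w = (width + 1) // 2   # chroma width when subsampled horizontally
    half_h = (height + 1) // 2  # chroma height when subsampled vertically
    chroma_plane = {
        "i420": half_w * half_h,
        "i422": half_w * height,
        "i444": width * height,
    }[csp]
    # One luma plane plus two chroma planes.
    return (width * height + 2 * chroma_plane) * bytes_per_sample


print(frame_bytes(2400, 1350, "i444", 10))  # 19440000
print(frame_bytes(2400, 1350, "i422", 10))  # 12960000
```

A reader that trusted the i422 description would expect 12,960,000 bytes per frame while the stream delivers 19,440,000, so frame boundaries would drift apart immediately.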
-> going to bed (1am here).


Cu Selur

Ps.: added an issue to the bitbucket bugtracker of x265 and linked to this thread, see: https://bitbucket.org/multicoreware/x265/issues/322/artifacts-when-encoding-444-10bit-input.

MasterNobody
13th February 2017, 08:27
imho I would say it is an ffmpeg bug. It produces values outside of 0..1023 for your -pix_fmt yuv444p10le, and x264/x265 take them as-is (i.e. without clamping, keeping only the lower bits). Try -pix_fmt yuv444p16le instead. That one does not seem to produce out-of-range values.
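A minimal Python sketch of the effect described here, assuming the encoder simply keeps the low 10 bits of each stored 16-bit sample. The function name is illustrative and not taken from x264 or x265.

```python
def keep_low_bits(sample: int, depth: int = 10) -> int:
    """Interpret a stored sample by keeping only its lowest `depth` bits."""
    return sample & ((1 << depth) - 1)


# Values just above the 10-bit maximum (1023) wrap around to near zero.
for stored in (1020, 1023, 1024, 1030):
    print(stored, "->", keep_low_bits(stored))
```

Under that assumption a sample that should be near white (for example 1030) is read as 6, i.e. near black, which would match dark spots appearing in bright areas.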

Selur
13th February 2017, 08:35
Try with -pix_fmt yuv444p16le instead
Already tried (see "using x264 or x265 with 16bit input" in my first post): the output is fine. :)

imho I would say it is an ffmpeg bug. It produces values outside of 0..1023
Okay, could you post over at the ffmpeg bug tracker and report the issue? (I have no clue how to report this properly, i.e. how to show that the values are out of range in a way the ffmpeg devs will not dismiss outright.)
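One way to produce such evidence is to count the out-of-range samples in the y4m stream directly. The following Python sketch does that under stated assumptions: the stream is planar 4:4:4 with 16-bit little-endian samples (as with yuv444p10le here), and the function name `scan_y4m` is made up for this example.

```python
import array
import io
import struct
import sys


def scan_y4m(stream, depth=10):
    """Return (largest sample, number of samples above 2**depth - 1).

    Assumes planar 4:4:4 frames stored as 16-bit little-endian samples.
    """
    header = stream.readline()
    if not header.startswith(b"YUV4MPEG2"):
        raise ValueError("not a y4m stream")
    width = height = None
    for field in header.split()[1:]:
        if field.startswith(b"W"):
            width = int(field[1:])
        elif field.startswith(b"H"):
            height = int(field[1:])
    if width is None or height is None:
        raise ValueError("missing W or H in y4m header")

    frame_size = width * height * 3 * 2  # three full-size planes, 2 bytes per sample
    limit = (1 << depth) - 1
    largest = 0
    out_of_range = 0
    while True:
        marker = stream.readline()
        if not marker:
            break  # end of stream
        if not marker.startswith(b"FRAME"):
            raise ValueError("expected FRAME marker")
        data = stream.read(frame_size)
        if len(data) != frame_size:
            raise ValueError("truncated frame")
        samples = array.array("H")
        samples.frombytes(data)
        if sys.byteorder == "big":
            samples.byteswap()  # file data is little-endian
        largest = max(largest, max(samples))
        out_of_range += sum(1 for s in samples if s > limit)
    return largest, out_of_range


# Tiny in-memory demo: one 2x1 frame containing two out-of-range samples.
demo = (b"YUV4MPEG2 W2 H1 F24:1 Ip A1:1 C444p10\n"
        b"FRAME\n" + struct.pack("<6H", 0, 512, 1023, 1024, 1100, 64))
print(scan_y4m(io.BytesIO(demo)))  # (1100, 2)
```

For a real check the same function could be given `sys.stdin.buffer` or an opened .y4m file. A non-zero second value for a stream declared as 10-bit would be concrete data to attach to a bug report.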

Also strange that vpxenc encodes the source without a problem and ffplay properly decodes the source and the y4m output.

Cu Selur

Jamaika
13th February 2017, 08:56
My mistake. It's a problem in x264 & x265. It is strange to me that x264 doesn't support rawvideo. It worked once; perhaps it was disabled intentionally.

dipje
13th February 2017, 09:56
I think +1 for MasterNobody. And ignore the other posts :).

Seems like a rounding / precision bug somewhere in ffmpeg. Maybe use a videofilter that clamps/clips/limits if ffmpeg has something like that.

For sure it isn't an x265 or x264 problem, I use y4m 444p10 almost daily with both (prores444 back and forth) and no issues there. But I don't have to convert something to 444p10, it already is so ffmpeg doesn't touch it.

X264 supports y4m input no problem if you use '--demuxer y4m' like selur does. Maybe that's a patched x264 with libav support and such, but since that's the build 95% of people are using I call it standard :)

Jamaika
13th February 2017, 10:05
For sure it isn't an x265 or x264 problem, I use y4m 444p10 almost daily with both (prores444 back and forth) and no issues there. But I don't have to convert something to 444p10, it already is so ffmpeg doesn't touch it.
It's an x264 & x265 problem because we are using the x264 & x265 codecs. ;)


X264 supports y4m input no problem if you use '--demuxer y4m' like selur does. Maybe that's a patched x264 with libav support and such, but since that's the build 95% of people are using I call it standard :)
It's standard, but x264 (Selur's build) has the option --input-csp:
- valid csps for `raw' demuxer:
i420, yv12, nv12, nv21, i422, yv16, nv16, i444, yv24, bgr, bgra, rgb
- valid csps for `lavf' demuxer:
yuv420p, yuyv422, rgb24, bgr24, yuv422p,
yuv444p, yuv410p, yuv411p, gray, monow, monob,
pal8, yuvj420p, yuvj422p, yuvj444p, xvmcmc,
xvmcidct, uyvy422, uyyvyy411, bgr8, bgr4,
bgr4_byte, rgb8, rgb4, rgb4_byte, nv12, nv21,
argb, rgba, abgr, bgra, gray16be, gray16le,
yuv440p, yuvj440p, yuva420p, vdpau_h264,
vdpau_mpeg1, vdpau_mpeg2, vdpau_wmv3,
vdpau_vc1, rgb48be, rgb48le, rgb565be,
rgb565le, rgb555be, rgb555le, bgr565be,
bgr565le, bgr555be, bgr555le, vaapi_moco,
vaapi_idct, vaapi_vld, yuv420p16le,
yuv420p16be, yuv422p16le, yuv422p16be,
yuv444p16le, yuv444p16be, vdpau_mpeg4,
dxva2_vld, rgb444le, rgb444be, bgr444le,
bgr444be, ya8, bgr48be, bgr48le, yuv420p9be,
yuv420p9le, yuv420p10be, yuv420p10le,
yuv422p10be, yuv422p10le, yuv444p9be,
yuv444p9le, yuv444p10be, yuv444p10le,
yuv422p9be, yuv422p9le, vda_vld, gbrp, gbrp9be,
gbrp9le, gbrp10be, gbrp10le, gbrp16be,
gbrp16le, yuva422p, yuva444p, yuva420p9be,
yuva420p9le, yuva422p9be, yuva422p9le,
yuva444p9be, yuva444p9le, yuva420p10be,
yuva420p10le, yuva422p10be, yuva422p10le,
yuva444p10be, yuva444p10le, yuva420p16be,
yuva420p16le, yuva422p16be, yuva422p16le,
yuva444p16be, yuva444p16le, vdpau, xyz12le,
xyz12be, nv16, nv20le, nv20be, rgba64be,
rgba64le, bgra64be, bgra64le, yvyu422, vda,
ya16be, ya16le, gbrap, gbrap16be, gbrap16le,
qsv, mmal, d3d11va_vld, cuda, 0rgb, rgb0, 0bgr,
bgr0, yuv420p12be, yuv420p12le, yuv420p14be,
yuv420p14le, yuv422p12be, yuv422p12le,
yuv422p14be, yuv422p14le, yuv444p12be,
yuv444p12le, yuv444p14be, yuv444p14le,
gbrp12be, gbrp12le, gbrp14be, gbrp14le,
yuvj411p, bayer_bggr8, bayer_rggb8,
bayer_gbrg8, bayer_grbg8, bayer_bggr16le,
bayer_bggr16be, bayer_rggb16le, bayer_rggb16be,
bayer_gbrg16le, bayer_gbrg16be, bayer_grbg16le,
bayer_grbg16be, yuv440p10le, yuv440p10be,
yuv440p12le, yuv440p12be, ayuv64le, ayuv64be,
videotoolbox_vld, p010le, p010be, gbrap12be,
gbrap12le, gbrap10be, gbrap10le, mediacodec,
gray12be, gray12le, gray10be, gray10le, p016le,
p016be

The film is fine with rgb24.
ffmpeg.exe -i original_vc3.mxf -f rawvideo -an -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=rgb:out_range=full,format=rgb24 - |
x264.exe -v --demuxer raw --input-csp rgb --input-res 2400x1350 --output-csp i444 --threads 4 --preset veryslow --tune grain --crf 24 --fps 29.970 --keyint 60 --nal-hrd none --colormatrix bt709 --range pc
--output "x264_422p10le_crf28b.h264" -

I don't know how to use lavf?
Edit:
x264-10bit.exe -v --demuxer lavf --input-fmt mxf --input-csp yuv444p12le --input-depth 12 --input-range pc --input-res 2400x1350 --preset veryslow --crf 24 --fps 24.00 --keyint 48 --nal-hrd none
--colormatrix bt709 --range pc --output-csp i444 --output "x264_444p10le_crf24.h264" "original_vc3.mxf"
resize [error]: input colorspace vaapi_idct is not supported

Ma
13th February 2017, 10:11
To me it looks like an ffmpeg (dither) bug. If you pass unchanged video (i444p12) to x265, the result is OK, if you dither from 12bit to 10bit in x265 it is also OK.

See my command lines and output bit-rates:
f:\speed\2.2+35>ffmpeg -i ../original_vc3.mxf -v warning -strict -1 -pix_fmt yuv444p10 -f yuv4mpegpipe - | x265-10b --y4m - -pultrafast --crf 18 w1.hevc
[mxf @ 00000000004ba9e0] Stream #0: not enough frames to estimate rate; consider increasing probesize
Guessed Channel Layout for Input Stream #0.1 : stereo
[yuv4mpegpipe @ 00000000004c17c0] Warning: generating non standard YUV stream. Mjpegtools will not work.
y4m [info]: 2400x1350 fps 24/1 i444p10 sar 1:1 unknown frame count
raw [info]: output file: w1.hevc
x265 [info]: HEVC encoder version 2.2+35-fe2f2dd96f8c
x265 [info]: build info [Windows][MSVC 1910][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 4:4:4 10 profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(43 rows)
x265 [info]: Coding QT: max CU size, min CU size : 32 / 16
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : dia / 57 / 0 / 2
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 250 / 0 / 5.00
x265 [info]: Cb/Cr QP Offset : 6 / 6
x265 [info]: Lookahead / bframes / badapt : 5 / 3 / 0
x265 [info]: b-pyramid / weightp / weightb : 1 / 0 / 0
x265 [info]: References / ref-limit cu / depth : 1 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 0.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-18.0 / 0.60
x265 [info]: tools: rd=2 psy-rd=2.00 early-skip rskip tmvp fast-intra
x265 [info]: tools: strong-intra-smoothing lslices=8 deblock
x265 [info]: frame I: 2, Avg QP:19.94 kb/s: 100397.38
x265 [info]: frame P: 81, Avg QP:22.10 kb/s: 40446.32
x265 [info]: frame B: 241, Avg QP:25.33 kb/s: 7306.44
x265 [info]: consecutive B-frames: 2.4% 1.2% 0.0% 96.4%

encoded 324 frames in 36.97s (8.76 fps), 16166.05 kb/s, Avg QP:24.49

f:\speed\2.2+35>ffmpeg -i ../original_vc3.mxf -v warning -strict -1 -f yuv4mpegpipe - | x265-10b --y4m - -pultrafast --crf 18 w2.hevc
[mxf @ 000000000045a8c0] Stream #0: not enough frames to estimate rate; consider increasing probesize
Guessed Channel Layout for Input Stream #0.1 : stereo
[yuv4mpegpipe @ 0000000000461660] Warning: generating non standard YUV stream. Mjpegtools will not work.
y4m [info]: 2400x1350 fps 24/1 i444p12 sar 1:1 unknown frame count
raw [info]: output file: w2.hevc
x265 [info]: HEVC encoder version 2.2+35-fe2f2dd96f8c
x265 [info]: build info [Windows][MSVC 1910][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 4:4:4 10 profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(43 rows)
x265 [info]: Coding QT: max CU size, min CU size : 32 / 16
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : dia / 57 / 0 / 2
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 250 / 0 / 5.00
x265 [info]: Cb/Cr QP Offset : 6 / 6
x265 [info]: Lookahead / bframes / badapt : 5 / 3 / 0
x265 [info]: b-pyramid / weightp / weightb : 1 / 0 / 0
x265 [info]: References / ref-limit cu / depth : 1 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 0.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-18.0 / 0.60
x265 [info]: tools: rd=2 psy-rd=2.00 early-skip rskip tmvp fast-intra
x265 [info]: tools: strong-intra-smoothing lslices=8 deblock
x265 [info]: frame I: 2, Avg QP:19.91 kb/s: 98398.85
x265 [info]: frame P: 81, Avg QP:22.09 kb/s: 37125.22
x265 [info]: frame B: 241, Avg QP:25.33 kb/s: 4778.67
x265 [info]: consecutive B-frames: 2.4% 1.2% 0.0% 96.4%

encoded 324 frames in 36.89s (8.78 fps), 13443.21 kb/s, Avg QP:24.49

f:\speed\2.2+35>ffmpeg -i ../original_vc3.mxf -v warning -strict -1 -f yuv4mpegpipe - | x265-10b --y4m - --dither -pultrafast --crf 18 w3.hevc
[mxf @ 000000000052a8c0] Stream #0: not enough frames to estimate rate; consider increasing probesize
Guessed Channel Layout for Input Stream #0.1 : stereo
[yuv4mpegpipe @ 0000000000531660] Warning: generating non standard YUV stream. Mjpegtools will not work.
y4m [info]: 2400x1350 fps 24/1 i444p12 sar 1:1 unknown frame count
raw [info]: output file: w3.hevc
x265 [info]: HEVC encoder version 2.2+35-fe2f2dd96f8c
x265 [info]: build info [Windows][MSVC 1910][64 bit] 10bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 4:4:4 10 profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(43 rows)
x265 [info]: Coding QT: max CU size, min CU size : 32 / 16
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : dia / 57 / 0 / 2
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 250 / 0 / 5.00
x265 [info]: Cb/Cr QP Offset : 6 / 6
x265 [info]: Lookahead / bframes / badapt : 5 / 3 / 0
x265 [info]: b-pyramid / weightp / weightb : 1 / 0 / 0
x265 [info]: References / ref-limit cu / depth : 1 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 0.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-18.0 / 0.60
x265 [info]: tools: rd=2 psy-rd=2.00 early-skip rskip tmvp fast-intra
x265 [info]: tools: strong-intra-smoothing lslices=8 deblock
x265 [info]: frame I: 2, Avg QP:19.92 kb/s: 98397.02
x265 [info]: frame P: 81, Avg QP:22.09 kb/s: 37141.73
x265 [info]: frame B: 241, Avg QP:25.33 kb/s: 4786.77
x265 [info]: consecutive B-frames: 2.4% 1.2% 0.0% 96.4%

encoded 324 frames in 40.45s (8.01 fps), 13453.35 kb/s, Avg QP:24.49

Selur
13th February 2017, 10:21
@Jamaika: using
ffmpeg.exe -i original_vc3.mxf -f rawvideo -an -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=rgb:out_range=full,format=rgb24 - |
x264.exe -v --demuxer raw --input-csp rgb --input-res 2400x1350 --output-csp i444 --threads 4 --preset veryslow --tune grain --crf 24 --fps 29.970 --keyint 60 --nal-hrd none --colormatrix bt709 --range pc
--output "x264_422p10le_crf28b.h264" -
for me, the output also has the artifacts,...

@Ma: I agree not using ffmpeg to dither seems to work.

nevcairiel
13th February 2017, 10:23
Playing the source with LAV + madVR also shows the artifacts.

That would suggest to me that something in your source file is wonky, or the ffmpeg decoder is already the cause of the problem, not the conversion routines.
Not every conversion saturates, so one that does might hide a source problem, while one that doesn't might just flip the bits over and artifact.

Edit:
After downloading the file, I don't see artifacts playing the original source with LAV + madVR. I do see them if I manually force a downconversion to 10-bit, which would use the same logic as ffmpeg itself would.

Jamaika
13th February 2017, 10:25
If you pass unchanged video (i444p12) to x265, the result is OK, if you dither from 12bit to 10bit in x265 it is also OK.
For me it isn't OK.
ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,format=yuv444p10le -strict -1 - |
x265 --y4m - -pultrafast --crf 18 w1.hevc
There are stains in the frame with BT709; in the plain YUV they aren't there.
@Ma: I agree not using ffmpeg to dither seems to work.
It's incorrect use. Only for the test. I should use RGB48.

Selur
13th February 2017, 10:41
@nevcairiel:
I do see them if I manually force a downconversion to 10-bit, which would use the same logic as ffmpeg itself would.
Which probably is happening on my end since I only have 10bit displays. :)

@Jamaika: using:
ffmpeg.exe -loglevel verbose -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,format=yuv444p10le -strict -1 - | x265 --y4m - -pultrafast --crf 18 h:\Output\w1.265
the output is 8bit not 10bit so x265 does the dithering from 10bit to 8bit.


=> So the conclusion at the moment seems to be that this is indeed a bug inside lav/ffmpeg's dither routine when going from 12bit (possibly others) to 10bit, right? (And ffplay and vpxenc internally do some range restriction on the input values and thus avoid/fix the problem.)

Cu Selur

Jamaika
13th February 2017, 10:54
@Selur I use X265.exe as 10bit.;)
@Ma The differences in the colorpixels of the original and X265 yuv444p12 BT709 (upper left corner). Stains on the film can't be seen, but there are differences.
Screenshot with ImageComparer.
https://preview.ibb.co/kKh4MF/vhrfiew.png

Selur
13th February 2017, 11:00
@Jamaika:
I use X265.exe as 10bit.
using
ffmpeg.exe -loglevel verbose -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,format=yuv444p10le -strict -1 - | x265 --y4m --input-depth 10 --output-depth 10 - -pultrafast --crf 18 h:\Output\w1.265
I get 10bit output, but still have the artifacts. :(

Cu Selur

Ps.: Does anyone know whether there is an option which allows one to tell ffmpeg to properly limit the output to 0-1023?

Ma
13th February 2017, 11:41
@Jamaika: In my 3 examples the first (w1.hevc) is wrong (dithered by ffmpeg), second is OK (dropped least-significant 2 bits) and third is OK (dithered in x265).

w1.hevc is 16166.05 kb/s
w2.hevc is 13443.21 kb/s
w3.hevc is 13453.35 kb/s
so from the bitrates you can see that ffmpeg dithers the image heavily.
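The difference between dropping the two least-significant bits and a rounding step without saturation can be sketched in Python as follows. This is only an illustration of how a 12-bit to 10-bit conversion can overflow near the top of the range; it is not ffmpeg's actual dither code, and the function names are invented for the example.

```python
MAX_10BIT = 1023


def truncate(v12: int) -> int:
    """Drop the two least-significant bits (what w2.hevc is described as using)."""
    return v12 >> 2


def round_unclamped(v12: int) -> int:
    """Round to nearest by adding half a step, with no saturation."""
    return (v12 + 2) >> 2


def round_clamped(v12: int) -> int:
    """Round to nearest and saturate at the 10-bit maximum."""
    return min((v12 + 2) >> 2, MAX_10BIT)


for v12 in (4000, 4093, 4094, 4095):
    print(v12, truncate(v12), round_unclamped(v12), round_clamped(v12))
```

In this sketch, 12-bit inputs of 4094 and 4095 become 1024 with unclamped rounding, one step past the valid 10-bit range, while truncation and clamped rounding both stay at 1023.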

Jamaika
13th February 2017, 11:55
Here are two overlapping errors for yuv444p10le bt709: dithering and an artifact in the ffmpeg yuv output
http://i64.tinypic.com/fcpt2f.png
plus an enhancement-effect artifact in the x265 & x264 codecs, which together give the final result. It looks as if someone had been taking pictures on a cloudy day.

For yuv444p12le screenshot yuv is the same as for the original.:cool:

Selur
13th February 2017, 19:57
I'm more concerned about the dithering problem. :)

Ma
13th February 2017, 20:41
ffmpeg behaves incorrectly, and x265 & x264 don't validate their input. I've prepared a patch for x265 that validates y4m input for bit depths > 8 and < 16.

In this case ffmpeg gives values > 1023 (which is wrong). With this patch, x265 replaces all values > 1023 with 1023.
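The kind of input validation described here can be sketched in Python as a saturating pass over a buffer of 16-bit little-endian samples. This is not the x265 patch itself (which is not shown in the thread); `clamp_plane` is a hypothetical helper.

```python
import array
import struct
import sys


def clamp_plane(raw: bytes, depth: int = 10) -> bytes:
    """Replace every 16-bit little-endian sample above 2**depth - 1 with that maximum."""
    limit = (1 << depth) - 1
    samples = array.array("H")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # convert from little-endian file order
    clamped = array.array("H", (min(s, limit) for s in samples))
    if sys.byteorder == "big":
        clamped.byteswap()  # convert back to little-endian
    return clamped.tobytes()


raw = struct.pack("<4H", 0, 1023, 1024, 65535)
print(struct.unpack("<4H", clamp_plane(raw)))  # (0, 1023, 1023, 1023)
```

Saturating keeps an overshooting bright sample bright, whereas keeping only the low bits would turn it dark.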

richardpl
13th February 2017, 22:05
Ps.: Does anyone know whether there is an option which allows one to tell ffmpeg to properly limit the output to 0-1023?

There is a filter to limit it to the 0,64-960 range: -vf lutyuv.

Selur
13th February 2017, 22:33
64-960 would be the 10bit TV range. Reading the docs (https://ffmpeg.org/ffmpeg-filters.html#toc-lut_002c-lutrgb_002c-lutyuv) I'm totally confused about how to use that filter. :)
-> could you post the options needed to:
a. limit range to 64-960 ?
b. limit range to 0-1023 ?

Cu Selur

richardpl
13th February 2017, 22:40
a. You can't limit to 64-960: out-of-range values (ones higher than 1023) are currently mapped to 0. I will fix this nuisance too.
b. You can't limit to 0-1023, at least not for yuv420p10, but I will fix this nuisance shortly.

Selur
14th February 2017, 07:21
Okay. Thanks! Happy to hear this will be fixed. :) Looking forward to it. :)

Jamaika
14th February 2017, 09:58
Something begins to happen. Selur knows how to get things done.;)
https://github.com/Freecom/x264/commit/8157d27b9ead28926fd684c15b132a41dfbb3abc
https://github.com/Freecom/x264/commit/0b01811bde3b5a35a681905f4bc8d666557901f2
https://github.com/FFmpeg/FFmpeg/commit/aa234698e92fa856013a1391fac3526b06c4d533
https://github.com/FFmpeg/FFmpeg/commit/72864547f91f2864f75b2829d0c11317ef7b390b
https://github.com/Nevcairiel/LAVFilters/commit/0ea1c1b9d4438417513312c923f7fd2bd9cfe4c3
https://github.com/Nevcairiel/LAVFilters/commit/c9d551c5671bcb080bac2a28fbe58352aa5148bb
:thanks:

Selur
15th February 2017, 00:40
not totally sure whether this is also part of the current problem, but encoding to 422 causes x265 to crash:

x265 [info]: HEVC encoder version 2.2+36-9b975fec584a
x265 [info]: build info [Windows][GCC 6.3.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2

with:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\Test-AC3-5.1.avi" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv422p16le -f yuv4mpegpipe - | x265 --input - --output-depth 10 --y4m --profile main422-10 --limit-modes --no-open-gop --crf 18.00 --cbqpoffs -2 --crqpoffs -2 --psy-rd 2.50 --rdoq-level 2 --psy-rdoq 15.00 --aq-mode 2 --no-cutree --range limited --colormatrix bt470bg --output "H:\Temp\00_01_57_3410_01.265"
crashes x265 for me.
Same with:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\Test-AC3-5.1.avi" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv422p16le -f yuv4mpegpipe - | x265 --input - --output-depth 12 --y4m --profile main422-12 --limit-modes --no-open-gop --crf 18.00 --cbqpoffs -2 --crqpoffs -2 --psy-rd 2.50 --rdoq-level 2 --psy-rdoq 15.00 --aq-mode 2 --no-cutree --range limited --colormatrix bt470bg --output "H:\Temp\00_05_00_6210_01.265"
using yuv422p12le, yuv422p10le or yuv422p instead of yuv422p16le doesn't help either.

my x265 build was compiled using: https://github.com/jb-alvarado/media-autobuild_suite
I ran it on Windows 10 pro 64bit.

Cu Selur

Ma
15th February 2017, 09:57
not totally sure whether this is also part of the current problem, but encoding to 422 causes x265 to crash:

I've tested on your first sample file "original_vc3.mxf" and x265 works. It may be related to "Test-AC3-5.1.avi" -- could you upload this sample (or some part of it) to your google drive?

Selur
15th February 2017, 10:00
could you upload this sample (or some part of it) to your google drive?
It's already there (https://drive.google.com/open?id=0B_WxUS1XGCPAUTlILW54VThMTFU), uploaded it yesterday before making that post.

I've tested on your first sample file "original_vc3.mxf" and x265 works.
There the issue wasn't a crash but the artifacts. Or do you not get the artifacts?
(I just posted the 422 issue since I suspect it is related to the 444 issue,..)

Ma
15th February 2017, 10:22
Thanks for the sample -- confirmed, it crashes at frame 18.
The same command line with previous sample as source works OK (without crash).

Selur
15th February 2017, 10:31
The same command line with previous sample as source works OK (without crash).
Yes. That sample isn't 4:2:2 but 4:4:4, crash only occurs for 4:2:2 content.

Ma
15th February 2017, 16:54
The crash is caused by a divide by 0 in SAO.
A first attempt at a patch:
diff -r 912dd749bdb5 source/encoder/sao.cpp
--- a/source/encoder/sao.cpp Wed Feb 15 12:04:41 2017 +0530
+++ b/source/encoder/sao.cpp Wed Feb 15 16:50:53 2017 +0100
@@ -1234,7 +1234,7 @@
if (m_param->internalCsp == X265_CSP_I420)
qpCb = x265_clip3(m_param->rc.qpMin, m_param->rc.qpMax, (int)g_chromaScale[qp + slice->m_pps->chromaQpOffset[0]]);
else
- qpCb = X265_MIN(qp + slice->m_pps->chromaQpOffset[0], QP_MAX_SPEC);
+ qpCb = x265_clip3(0, QP_MAX_SPEC, qp + slice->m_pps->chromaQpOffset[0]);

lambda[0] = (int64_t)floor(256.0 * x265_lambda2_tab[qp]);
lambda[1] = (int64_t)floor(256.0 * x265_lambda2_tab[qpCb]); // Use Cb QP for SAO chroma

Could you test this patch?
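
To illustrate what the patch changes: X265_MIN only bounds qpCb from above, so a negative chroma QP offset (such as the --cbqpoffs -2 in the crashing command) can push qp + offset below 0, while x265_clip3 bounds both ends. A minimal Python sketch of that arithmetic only; QP_MAX_SPEC = 51 is an assumption, the helper names are hypothetical, and this is not x265's actual code:

```python
QP_MAX_SPEC = 51  # assumed value of the constant used in the diff


def clip3(lo, hi, value):
    """Clamp value into [lo, hi], in the spirit of x265_clip3(lo, hi, value)."""
    return max(lo, min(hi, value))


def qp_cb_old(qp, offset):
    # Old line: only the upper bound is enforced.
    return min(qp + offset, QP_MAX_SPEC)


def qp_cb_new(qp, offset):
    # Patched line: both bounds are enforced.
    return clip3(0, QP_MAX_SPEC, qp + offset)


# With qp = 1 and a chroma offset of -2 the old form goes negative,
# which in C would be an out-of-bounds index into x265_lambda2_tab.
print(qp_cb_old(1, -2))  # -1
print(qp_cb_new(1, -2))  # 0
```

The qp value of 1 is chosen only to trigger the case; which qp values occur in the real encode is not stated in the thread.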

Selur
15th February 2017, 17:00
Not really unless you can provide me with a windows binary.

Ma
15th February 2017, 17:14
Not really unless you can provide me with a windows binary.

www.msystem.waw.pl/x265/test-selur.7z

Selur
15th February 2017, 17:18
Thanks! That worked! I tried both calls that crashed x265 before and now they don't. :)

Ma
15th February 2017, 18:13
Thanks for testing.

About the ffmpeg dither bug and the x265 read function that changes values > 1023 to (close to) 0: there is a problem with speed. If we change the read function to validate 10-bit and 12-bit data for overflow, x265 will be a bit slower. In my tests the slow-down on your 2400x1350, 324-frame test file was 0.46s (read function with overflow validation). It is noticeable for the fastest presets.

We could optimize this validation (for speed) or add an option that turns the validation on/off. What is your opinion?
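
As a rough picture of the trade-off, here is a minimal Python sketch of overflow validation with a fast path for clean input. The function name is hypothetical, and clamping to the maximum is an assumption; the thread does not say what the patched read function does with out-of-range samples:

```python
def validate_samples(samples, bit_depth):
    """Return samples with values above the bit-depth maximum clamped.

    Hypothetical helper: the fast path returns the input untouched
    when every sample is already in range.
    """
    max_value = (1 << bit_depth) - 1
    if max(samples, default=0) <= max_value:
        return samples  # clean input: nothing to rewrite
    return [min(s, max_value) for s in samples]


row = [0, 512, 1023, 1024, 1030]  # the last two overflow 10 bits
print(validate_samples(row, 10))  # [0, 512, 1023, 1023, 1023]
```

Even the fast path has to look at every sample once, which is why some slow-down remains when the input is clean.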

Selur
15th February 2017, 18:16
If it's not much of a hassle, an additional option would be nice, since that would:
a. allow me to enable it until ffmpeg fixes the issue
b. allow folks who use older systems to fix the issue by enabling it
Encoding slower is better than realizing at the end of an xy-hour encode that the output is broken. :)

richardpl
15th February 2017, 19:41
I do not think that is needed, as the ffmpeg lut filter can now do clipping properly.

Selur
15th February 2017, 19:48
1. In reference to https://forum.doom9.org/showthread.php?p=1797180#post1797180 => how ?
2. this will also only help for systems where ffmpeg is relatively new ;)

richardpl
15th February 2017, 20:10
1. for limited range just use: -vf lutyuv=clipval:clipval:clipval for full range use -vf lutyuv=val:val:val

Selur
15th February 2017, 20:18
okay, so for

10bit: lutyuv=1023:1023:1023
12bit: lutyuv=4095:4095:4095
16bit: lutyuv=65535:65535:65535

are there limited range definitions for 10/12/16bit? If yes, what are the limits? :)

Cu Selur

Jamaika
15th February 2017, 20:55
I don't know how to do this. Could it be described more simply?
How do I write the ffmpeg commands correctly?
ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,lutyuv=4095:4095:4095,format=yuv444p12le -strict -1 111.yuv
ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,lutyuv=1023:1023:1023,format=yuv444p10le -strict -1 112.yuv
This is absurd.

richardpl
15th February 2017, 20:58
okay, so for

10bit: lutyuv=1023:1023:1023
12bit: lutyuv=4095:4095:4095
16bit: lutyuv=65535:65535:65535

are there limited range definitions for 10/12/16bit? If yes, what are the limits? :)

Cu Selur

No, you got it wrong; you need to copy and paste the values exactly as I gave them to you. clipval gives limited range, val gives full range.

If you need more explicit commands, they are even harder to grasp and involve functions like clip(val, 55, 128) to clip the pixel component value to the [55, 128] range.

See lutyuv documentation here: http://ffmpeg.org/ffmpeg-filters.html#lut_002c-lutrgb_002c-lutyuv.
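
On the question of limited-range limits for 10/12/16bit: by convention the 8-bit bounds 16-235 (luma) and 16-240 (chroma) are scaled by 2^(depth-8). A minimal Python sketch that computes those bounds; whether lutyuv's clipval clips to exactly these values at higher bit depths is an assumption here and not confirmed in this thread:

```python
def limited_range(bit_depth):
    """Conventional limited (studio) range bounds for a given bit depth."""
    shift = bit_depth - 8
    return {
        "luma": (16 << shift, 235 << shift),
        "chroma": (16 << shift, 240 << shift),
    }


for depth in (10, 12, 16):
    print(depth, limited_range(depth))
```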

Selur
15th February 2017, 20:59
a. 'lutyuv=val:val:val' needs to be placed behind the 'format=yuv444p12le' otherwise the input would be limited
b. scale=2400:1350 <> scale ; no need for the width&height if they are the same as the input
---
@richardpl: okay, I misunderstood that :)

richardpl
15th February 2017, 21:03
I don't know how to do this. Could it be described more simply?
How do I write the ffmpeg commands correctly?
ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,lutyuv=4095:4095:4095,format=yuv444p12le -strict -1 111.yuv
ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,lutyuv=1023:1023:1023,format=yuv444p10le -strict -1 112.yuv
This is absurd.

You need to use the lutyuv filter after the format filter, which instructs the scale filter to output the yuv444p10le format.

so for full range, copy and paste the -vf command exactly as written:

ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,format=yuv444p10le,lutyuv=val:val:val -strict -1 112.yuv

Also, for best results you should use the zscale filter instead of the scale filter.

Selur
15th February 2017, 21:21
Also, for best results you should use the zscale filter instead of the scale filter.
I normally avoid zscale because of:
The zscale filter forces the output display aspect ratio to be the same as the input, by changing the output sample aspect ratio.
since I normally use ffmpeg mainly as decoder and prefer to:
a. set the sample aspect ratio through the encoder
b. be able to use rawvideo instead of yuv4mpegpipe (without knowing the exact width&height of the output that isn't really possible)
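
Point b. comes down to rawvideo having no header: the consumer can only find frame boundaries if it already knows the width, height and pixel format. A minimal Python sketch of the frame-size arithmetic for planar 4:4:4 (the function name is hypothetical):

```python
def frame_bytes_444(width, height, bit_depth):
    """Bytes per frame for planar 4:4:4 video (three full-size planes)."""
    # Depths above 8 bits are stored in 16-bit words (e.g. yuv444p10le).
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    return width * height * 3 * bytes_per_sample


print(frame_bytes_444(2400, 1350, 10))  # 19440000
```

If the assumed width or height is off, every frame boundary after the first is wrong, which is why the exact output size has to be known in advance.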

Selur
15th February 2017, 21:27
@richardpl, just encoded (using ffmpeg version N-83524-g449ce456a6 and x265 version 2.3+2-912dd749bdb5):
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -vf scale,format=yuv444p12le,lutyuv=val:val:val -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 12 --y4m --crf 18.00 --output "H:\Temp\21_15_34_6110_02.265"
and the output still shows the artifacts, on the other hand, the 10bit approach:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -vf scale,format=yuv444p10le,lutyuv=val:val:val -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 10 --y4m --crf 18.00 --output "H:\Temp\21_23_46_2810_02.265"
looks fine.

Jamaika
15th February 2017, 22:37
You need to use the lutyuv filter after the format filter, which instructs the scale filter to output the yuv444p10le format.

so for full range, copy and paste the -vf command exactly as written:

ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,format=yuv444p10le,lutyuv=val:val:val -strict -1 112.yuv
I don't understand; I get only a pink picture throughout the movie.

Selur
15th February 2017, 22:58
ffmpeg.exe -loglevel verbose -i original_vc3.mxf -an -f yuv4mpegpipe -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=bt709:out_range=full,format=yuv444p10le,lutyuv=val:val:val -strict -1 112.yuv
either use '-f yuv4mpegpipe' and a file with a .y4m extension
OR
use '-f rawvideo' and a file with a .yuv extension;
mixing the two might cause confusion,...
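
The practical difference is that a yuv4mpegpipe stream begins with a text header carrying width, height, frame rate and so on, while rawvideo has none. A minimal Python sketch of reading such a header; the example header string is illustrative and was not captured from the files in this thread:

```python
def parse_y4m_header(line):
    """Parse a YUV4MPEG2 stream header line into a dict of tagged fields."""
    magic, *fields = line.split()
    if magic != "YUV4MPEG2":
        raise ValueError("not a YUV4MPEG2 header")
    info = {}
    for field in fields:
        tag, value = field[0], field[1:]
        if tag == "W":
            info["width"] = int(value)
        elif tag == "H":
            info["height"] = int(value)
        elif tag == "F":
            num, den = value.split(":")
            info["fps"] = (int(num), int(den))
        elif tag == "A":
            num, den = value.split(":")
            info["sar"] = (int(num), int(den))
        elif tag == "C":
            info["colorspace"] = value
    return info


# Illustrative header, not taken from the files in this thread.
example = "YUV4MPEG2 W2400 H1350 F24:1 Ip A1:1 C444p10\n"
print(parse_y4m_header(example))
```

A player that treats such a file as headerless raw video would interpret the header and frame markers as pixel data, which fits the kind of confusion described above.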

Ma
16th February 2017, 00:32
For now I've prepared a version that fixes both problems -- hangs & black dots. There is no new option (for now), but the code is optimized for speed in the case that the input is clean (0.2s speed drop).

In the archive there are 2 patches + a Win64 binary
www.msystem.waw.pl/x265/test-selur2.7z

Selur
16th February 2017, 09:07
Thanks!
Using that version I can confirm the 422 crashes are fixed.

regarding the black dots,..

10bit y4m -> 10bit output:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p10le -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 10 --y4m --crf 18.00 --output "H:\Output\x265_10-10bit_444.265"
-> no artifacts
12bit y4m -> 10bit output:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p12le -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 10 --y4m --crf 18.00 --output "H:\Output\x265_12-10bit_444.265"
-> no artifacts
16bit y4m -> 10bit output:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p16le -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 10 --y4m --crf 18.00 --output "H:\Output\x265_16-10bit_444.265"
-> no artifacts
16bit y4m -> 12bit output:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p16le -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 10 --y4m --crf 18.00 --output "H:\Output\x265_16-12bit_444.265"
12bit y4m -> 12bit output:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p12le -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 12 --y4m --crf 18.00 --output "H:\Output\x265_12-12bit_444.265"
-> artifacts
12bit y4m -> 12bit output using lutyuv=val:val:val:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -vf scale,format=yuv444p12le,lutyuv=val:val:val -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 12 --y4m --crf 18.00 --output "H:\Output\x265_12-12bit_lutyuv_val_444.265"
-> artifacts
12bit y4m -> 12bit output using lutyuv=clipval:clipval:clipval:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -vf scale,format=yuv444p12le,lutyuv=clipval:clipval:clipval -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 12 --y4m --crf 18.00 --output "H:\Output\x265_12-12bit_lutyuv_clipval_444.265"
-> no artifacts

problem seems nearly fixed.

Cu Selur

Jamaika
16th February 2017, 09:21
OK. There is no added value from val and clipval.:stupid:
I lost a little time testing the new codecs. For me this is a very interesting thread with some very complicated suggestions.
Firstly, adding lutyuv to yuv444p12le causes additional errors in the image, so one should not do that. Without lutyuv the output is the same as the original yuv444p12le.
Secondly, is it better to feed yuv444p12le or yuv444p10le to x265 for yuv444p10le output?
The results are interesting. For full range it is better with yuv444p12le, for limited range with yuv444p10le.
Thirdly, must I add lutyuv when taking a screenshot or not?
It should not be added.
Fourth, I tested the rawvideo rgb24 options in x264. In any other case, using rawvideo makes no sense. The problem is that the RGB and lavf functions are dead in x264, so I propose removing them so as not to deceive users.
ffmpeg.exe -i original_vc3.mxf -f rawvideo -an -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=rgb:out_range=full,format=rgb24,lutyuv=val:val:val -strict -1 - |
x264.exe -v --demuxer raw --input-csp rgb --input-range pc --input-res 2400x1350 --output-csp rgb --threads 4 --preset veryslow --tune grain --crf 18 --fps 24.000 --keyint 48 --nal-hrd none
--colormatrix bt709 --range pc --output "rgb24_crf18b.h264" -

@Selur: I wish you success in explaining the conversion in the forum for the program Hybrid, and not only there. ;)
I'm not adding pictures so as not to clutter the forum. Anyone can do a test.

Selur
16th February 2017, 09:50
"format=rgb24,lutyuv=val:val:val" -> looks like a bad idea, since lutyuv is meant for yuv color spaces not for RGB (lutrgb should be used then) :)

Jamaika
16th February 2017, 10:21
Maybe it's my provocation.
My mistake with RGB. I didn't know that I can't add colormatrix and range. In x265 these are informational only.
ffmpeg.exe -i original_vc3.mxf -f rawvideo -an -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=rgb:out_range=full,format=rgb24 - |
x264.exe -v --demuxer raw --input-csp rgb --input-res 2400x1350 --output-csp rgb --threads 4 --preset veryslow --tune grain --crf 18 --fps 24.000 --keyint 48 --nal-hrd none
--output "rgb24_crf18b.h264" -

Selur
16th February 2017, 10:38
If you use a rawvideo pipe use '-loglevel fatal' to avoid that unwanted output ends up inside the piped output which then will corrupt the video stream.
using:
ffmpeg -y -loglevel fatal -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -an -sn -vf scale=2400:1350:in_color_matrix=bt709:in_range=full:out_color_matrix=rgb:out_range=full,format=rgb24 -f rawvideo - |
x264 -v --demuxer raw --input-csp rgb --input-res 2400x1350 --output-csp rgb --preset ultrafast --tune grain --crf 18 --fps 24.000 --output "h:\Outputrgb24_crf18b.264" -
encoding works fine here.

nevcairiel
16th February 2017, 11:23
If you use a rawvideo pipe use '-loglevel fatal' to avoid that unwanted output ends up inside the piped output which then will corrupt the video stream.

That's not how that works; logging goes to stderr, the pipe uses stdout, so logging never goes into the pipe but just comes out on the console.
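
The separation of the two streams can be checked with a small Python sketch in which a child process stands in for ffmpeg (the payload and log strings are made up for illustration):

```python
import subprocess
import sys

# Child process stands in for ffmpeg: payload on stdout, log text on stderr.
child_code = (
    "import sys;"
    "sys.stdout.write('FRAMEDATA');"
    "sys.stderr.write('log line\\n')"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,  # what a '|' pipe would carry
    stderr=subprocess.PIPE,  # captured separately here; a shell pipe leaves it on the console
)

print(result.stdout)  # b'FRAMEDATA'
print(b"log line" in result.stderr)  # True
```

Only what the child writes to stdout reaches the consumer of the pipe; the log text stays on stderr regardless of the log level.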

Jamaika
16th February 2017, 11:27
That's not how that works; logging goes to stderr, the pipe uses stdout, so logging never goes into the pipe but just comes out on the console.
OK, loglevel fatal has an influence on lutyuv.
How do I use 'colormatrix' bt709 in x264? My colors are inverted from RGB to BGR.

Selur
16th February 2017, 11:29
That's not how that works; logging goes to stderr, the pipe uses stdout, so logging never goes into the pipe but just comes out on the console.
You are right, that is how it should work. :)

Ma
16th February 2017, 20:23
12bit y4m -> 12bit output:
[...]
-> artifacts

Strange. The source file is 12bit and ffmpeg decodes it properly, so there are no values above 4095. I can't reproduce this. In my test the output from x265s2 (which was in the test-selur2.7z archive, with input validation) is the same as from x265s1 (from the test-selur.7z archive, without input validation) and both are OK.
f:\speed\2.3+2>ffmpeg -y -loglevel fatal -threads 8 -i "..\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p12le -f yuv4mpegpipe - | x265s2 --preset ultrafast --input - --output-depth 12 --y4m --crf 18.00 --output w2.hevc
y4m [info]: 2400x1350 fps 24/1 i444p12 sar 1:1 unknown frame count
raw [info]: output file: w2.hevc
x265 [info]: HEVC encoder version 2.3+2-912dd749bdb5
x265 [info]: build info [Windows][GCC 6.3.0][64 bit] 12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 4:4:4 12 profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(43 rows)
x265 [info]: Coding QT: max CU size, min CU size : 32 / 16
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : dia / 57 / 0 / 2
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 250 / 0 / 5.00
x265 [info]: Cb/Cr QP Offset : 6 / 6
x265 [info]: Lookahead / bframes / badapt : 5 / 3 / 0
x265 [info]: b-pyramid / weightp / weightb : 1 / 0 / 0
x265 [info]: References / ref-limit cu / depth : 1 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 0.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-18.0 / 0.60
x265 [info]: tools: rd=2 psy-rd=2.00 early-skip rskip tmvp fast-intra
x265 [info]: tools: strong-intra-smoothing lslices=8 deblock
x265 [info]: frame I: 2, Avg QP:19.81 kb/s: 99486.43
x265 [info]: frame P: 81, Avg QP:22.06 kb/s: 37460.62
x265 [info]: frame B: 241, Avg QP:25.33 kb/s: 4761.86
x265 [info]: consecutive B-frames: 2.4% 1.2% 0.0% 96.4%

encoded 324 frames in 89.95s (3.60 fps), 13521.27 kb/s, Avg QP:24.48

f:\speed\2.3+2>ffmpeg -y -loglevel fatal -threads 8 -i "..\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p12le -f yuv4mpegpipe - | x265s1 --preset ultrafast --input - --output-depth 12 --y4m --crf 18.00 --output w1.hevc
y4m [info]: 2400x1350 fps 24/1 i444p12 sar 1:1 unknown frame count
raw [info]: output file: w1.hevc
x265 [info]: HEVC encoder version 2.3+2-912dd749bdb5
x265 [info]: build info [Windows][GCC 6.3.0][64 bit] 12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 4:4:4 12 profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(43 rows)
x265 [info]: Coding QT: max CU size, min CU size : 32 / 16
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : dia / 57 / 0 / 2
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 250 / 0 / 5.00
x265 [info]: Cb/Cr QP Offset : 6 / 6
x265 [info]: Lookahead / bframes / badapt : 5 / 3 / 0
x265 [info]: b-pyramid / weightp / weightb : 1 / 0 / 0
x265 [info]: References / ref-limit cu / depth : 1 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 0.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-18.0 / 0.60
x265 [info]: tools: rd=2 psy-rd=2.00 early-skip rskip tmvp fast-intra
x265 [info]: tools: strong-intra-smoothing lslices=8 deblock
x265 [info]: frame I: 2, Avg QP:19.81 kb/s: 99486.43
x265 [info]: frame P: 81, Avg QP:22.06 kb/s: 37460.62
x265 [info]: frame B: 241, Avg QP:25.33 kb/s: 4761.86
x265 [info]: consecutive B-frames: 2.4% 1.2% 0.0% 96.4%

encoded 324 frames in 90.04s (3.60 fps), 13521.27 kb/s, Avg QP:24.48

Selur
16th February 2017, 21:08
Copied the x265.exe you sent me into the folder again and repeated the call:
ffmpeg -y -loglevel fatal -threads 8 -i "F:\TestClips&Co\vc-3\original_vc3.mxf" -map 0:0 -an -sn -vsync 0 -strict -1 -pix_fmt yuv444p12le -f yuv4mpegpipe - | x265 --preset ultrafast --input - --output-depth 12 --y4m --crf 18.00 --output "H:\Output\x265_12-12bit_444.265"
y4m [info]: 2400x1350 fps 24/1 i444p12 sar 1:1 unknown frame count
raw [info]: output file: H:\Output\x265_12-12bit_444.265
x265 [info]: HEVC encoder version 2.3+2-912dd749bdb5
x265 [info]: build info [Windows][GCC 6.3.0][64 bit] 12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2
x265 [info]: Main 4:4:4 12 profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 8 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 3 / wpp(43 rows)
x265 [info]: Coding QT: max CU size, min CU size : 32 / 16
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : dia / 57 / 0 / 2
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 250 / 0 / 5.00
x265 [info]: Cb/Cr QP Offset : 6 / 6
x265 [info]: Lookahead / bframes / badapt : 5 / 3 / 0
x265 [info]: b-pyramid / weightp / weightb : 1 / 0 / 0
x265 [info]: References / ref-limit cu / depth : 1 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 0.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-18.0 / 0.60
x265 [info]: tools: rd=2 psy-rd=2.00 early-skip rskip tmvp fast-intra
x265 [info]: tools: strong-intra-smoothing lslices=8 deblock
x265 [info]: frame I: 2, Avg QP:19.81 kb/s: 99486.43
x265 [info]: frame P: 81, Avg QP:22.06 kb/s: 37460.62
x265 [info]: frame B: 241, Avg QP:25.33 kb/s: 4761.86
x265 [info]: consecutive B-frames: 2.4% 1.2% 0.0% 96.4%

encoded 324 frames in 26.15s (12.39 fps), 13521.27 kb/s, Avg QP:24.48
output shows artifacts here (uploaded the output to my GoogleDrive (https://drive.google.com/drive/folders/0B_WxUS1XGCPAU1poaXdzNW10Qm8?usp=sharing))

Ma
16th February 2017, 21:21
output shows artifacts here (uploaded the output to my GoogleDrive (https://drive.google.com/drive/folders/0B_WxUS1XGCPAU1poaXdzNW10Qm8?usp=sharing))

The file is (except for cpuid, frame-threads and numa-pools in the header) the same as mine. If I play this video I don't see any black dots.

Selur
16th February 2017, 21:27
Okay, then the problem is related to my MPC-HC setup and the conversion from 12bit to 10bit. (since my display only shows 10bit)
Thanks! (will probably have to wait for the next MPC-HC nightly until that works)

sneaker_ger
16th February 2017, 21:44
The dots appear when LAV outputs Y410. They aren't there when it outputs Y416. (using madvr)

Selur
16th February 2017, 21:51
Nice! You are right, enabling Y416 in LAV seems to fix the issue (madvr reports Y416 then :))
Thanks!

nevcairiel
16th February 2017, 21:55
Never disable any of the output formats from LAV Video, let the renderer do a conversion to whatever your display wants, faster and safer that way. :)

Selur
16th February 2017, 21:59
I'm not aware that I disabled any of the outputs, but since I haven't really looked into those settings for quite some time I might have just forgotten about it. ;)

Ma
11th March 2017, 21:07
I found the bug in ffmpeg. It is in the DITHER_COPY macro (or the dither_scale data or the dithers data):
https://github.com/FFmpeg/FFmpeg/blob/master/libswscale/swscale_unscaled.c#L1487
in line(s):
dst = (src + dither) * scale >> shift;

In Selur's example we have a 12-bit source and a 10-bit destination, which gives us in DITHER_COPY:
scale = 2047
shift = 13

First question -- how big can dither be while avoiding overflow in the operation:
(4095 + dither) * 2047 >> 13
we have
(4095+3)*2047>>13=1023
(4095+4)*2047>>13=1024
so we can add at most 3 (but in DITHER_COPY we add up to 15).

Second question -- how big can scale be while avoiding overflow in the operation:
(4095+15) * scale >> 13
we have
(4095+15)*2041>>13=1023
(4095+15)*2042>>13=1024
so if we want to add 15 we can multiply by at most 2041 (but in DITHER_COPY we multiply by 2047).
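
The two limits above can be re-derived by brute force. A minimal Python sketch using only the scale, shift and formula quoted in this post (the function and variable names are made up for illustration):

```python
SRC_MAX = 4095  # largest 12-bit sample
DST_MAX = 1023  # largest 10-bit sample
SHIFT = 13


def convert(src, dither, scale, shift=SHIFT):
    """dst = (src + dither) * scale >> shift, as in the quoted line."""
    return ((src + dither) * scale) >> shift


# Largest dither that still fits with scale = 2047.
max_dither = max(d for d in range(16) if convert(SRC_MAX, d, 2047) <= DST_MAX)
# Largest scale that still fits with dither = 15.
max_scale = max(s for s in range(1, 2048) if convert(SRC_MAX, 15, s) <= DST_MAX)

print(max_dither, max_scale)  # 3 2041
```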
-------------------------------
A first attempt at a fix:
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index ba3d688..bd4e4f2 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -1488,7 +1488,7 @@ static int packedCopyWrapper(SwsContext *c, const uint8_t *src[],
uint16_t scale= dither_scale[dst_depth-1][src_depth-1];\
int shift= src_depth-dst_depth + dither_scale[src_depth-2][dst_depth-1];\
for (i = 0; i < height; i++) {\
- const uint8_t *dither= dithers[src_depth-9][i&7];\
+ const uint8_t *dither= dithers[src_depth-dst_depth-1][i&7];\
for (j = 0; j < length-7; j+=8){\
dst[j+0] = dbswap((bswap(src[j+0]) + dither[0])*scale>>shift);\
dst[j+1] = dbswap((bswap(src[j+1]) + dither[1])*scale>>shift);\

Ma
12th March 2017, 17:24
I've checked all the numbers in the dither_scale table and most are OK. The wrong numbers are the ones for dithering from 16-bit to 15-bit (overflow in a signed multiply -- undefined behavior; currently an impossible conversion). [The numbers from 11-bit to 8-bit are OK; my mistake when copying them.]

I decided to simplify the dither_scale table and write the numbers in a form that lets you see at once that they are all optimal (and that is easy to update if you change the dithers table).

nevcairiel or richardpl -- please review this patch:
diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
index ba3d688..8bc9ba6 100644
--- a/libswscale/swscale_unscaled.c
+++ b/libswscale/swscale_unscaled.c
@@ -110,22 +110,19 @@ DECLARE_ALIGNED(8, static const uint8_t, dithers)[8][8][8]={
{ 112, 16,104, 8,118, 22,110, 14,},
}};

-static const uint16_t dither_scale[15][16]={
-{ 2, 3, 3, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,},
-{ 2, 3, 7, 7, 13, 13, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25,},
-{ 3, 3, 4, 15, 15, 29, 57, 57, 57, 113, 113, 113, 113, 113, 113, 113,},
-{ 3, 4, 4, 5, 31, 31, 61, 121, 241, 241, 241, 241, 481, 481, 481, 481,},
-{ 3, 4, 5, 5, 6, 63, 63, 125, 249, 497, 993, 993, 993, 993, 993, 1985,},
-{ 3, 5, 6, 6, 6, 7, 127, 127, 253, 505, 1009, 2017, 4033, 4033, 4033, 4033,},
-{ 3, 5, 6, 7, 7, 7, 8, 255, 255, 509, 1017, 2033, 4065, 8129,16257,16257,},
-{ 3, 5, 6, 8, 8, 8, 8, 9, 511, 511, 1021, 2041, 4081, 8161,16321,32641,},
-{ 3, 5, 7, 8, 9, 9, 9, 9, 10, 1023, 1023, 2045, 4089, 8177,16353,32705,},
-{ 3, 5, 7, 8, 10, 10, 10, 10, 10, 11, 2047, 2047, 4093, 8185,16369,32737,},
-{ 3, 5, 7, 8, 10, 11, 11, 11, 11, 11, 12, 4095, 4095, 8189,16377,32753,},
-{ 3, 5, 7, 9, 10, 12, 12, 12, 12, 12, 12, 13, 8191, 8191,16381,32761,},
-{ 3, 5, 7, 9, 10, 12, 13, 13, 13, 13, 13, 13, 14,16383,16383,32765,},
-{ 3, 5, 7, 9, 10, 12, 14, 14, 14, 14, 14, 14, 14, 15,32767,32767,},
-{ 3, 5, 7, 9, 11, 12, 14, 15, 15, 15, 15, 15, 15, 15, 16,65535,},
+/* Numbers 1, 3, 7, 15, 31, 63, 63, 126 in table dither_scale are from
+ * maximum values in dithers[0], dithers[1], ..., dithers[7] table.
+ * If you change dithers table please update these numbers.
+ */
+static const uint16_t dither_scale[8][8]={
+{ ((1<<24)-1)/(511+1), ((1<<25)-1)/(1023+3), ((1<<26)-1)/(2047+7), ((1<<27)-1)/(4095+15), ((1<<28)-1)/(8191+31), ((1<<29)-1)/(16383+63), ((1<<30)-1)/(32767+63), ((1u<<31)-1)/(65535+126),},
+{ 0, ((1<<25)-1)/(1023+1), ((1<<26)-1)/(2047+3), ((1<<27)-1)/(4095+ 7), ((1<<28)-1)/(8191+15), ((1<<29)-1)/(16383+31), ((1<<30)-1)/(32767+63), ((1u<<31)-1)/(65535+ 63),},
+{ 0, 0, ((1<<26)-1)/(2047+1), ((1<<27)-1)/(4095+ 3), ((1<<28)-1)/(8191+ 7), ((1<<29)-1)/(16383+15), ((1<<30)-1)/(32767+31), ((1u<<31)-1)/(65535+ 63),},
+{ 0, 0, 0, ((1<<27)-1)/(4095+ 1), ((1<<28)-1)/(8191+ 3), ((1<<29)-1)/(16383+ 7), ((1<<30)-1)/(32767+15), ((1u<<31)-1)/(65535+ 31),},
+{ 0, 0, 0, 0, ((1<<28)-1)/(8191+ 1), ((1<<29)-1)/(16383+ 3), ((1<<30)-1)/(32767+ 7), ((1u<<31)-1)/(65535+ 15),},
+{ 0, 0, 0, 0, 0, ((1<<29)-1)/(16383+ 1), ((1<<30)-1)/(32767+ 3), ((1u<<31)-1)/(65535+ 7),},
+{ 0, 0, 0, 0, 0, 0, ((1<<30)-1)/(32767+ 1), ((1u<<31)-1)/(65535+ 3),},
+{ 0, 0, 0, 0, 0, 0, 0, ((1u<<31)-1)/(65535+ 1),},
};


@@ -1485,10 +1482,10 @@ static int packedCopyWrapper(SwsContext *c, const uint8_t *src[],
}

#define DITHER_COPY(dst, dstStride, src, srcStride, bswap, dbswap)\
- uint16_t scale= dither_scale[dst_depth-1][src_depth-1];\
- int shift= src_depth-dst_depth + dither_scale[src_depth-2][dst_depth-1];\
+ uint16_t scale= dither_scale[dst_depth-8][src_depth-9];\
+ int shift= src_depth-dst_depth + 15;\
for (i = 0; i < height; i++) {\
- const uint8_t *dither= dithers[src_depth-9][i&7];\
+ const uint8_t *dither= dithers[src_depth-dst_depth-1][i&7];\
for (j = 0; j < length-7; j+=8){\
dst[j+0] = dbswap((bswap(src[j+0]) + dither[0])*scale>>shift);\
dst[j+1] = dbswap((bswap(src[j+1]) + dither[1])*scale>>shift);\
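
The claim that the new table entries cannot overflow can be checked numerically. A minimal Python sketch that rebuilds each entry from the formula visible in the patch and tests the worst case (largest sample plus largest dither) for every source/destination pair; the function names are hypothetical and only the arithmetic is mirrored, not the C types:

```python
# Largest entry of each dithers[] sub-table, as listed in the patch comment.
DITHER_MAX = [1, 3, 7, 15, 31, 63, 63, 126]


def scale_for(src_depth, dst_depth):
    """Table entry for a src -> dst conversion, following the patch's formula."""
    src_max = (1 << src_depth) - 1
    dither_max = DITHER_MAX[src_depth - dst_depth - 1]
    return ((1 << (src_depth + 15)) - 1) // (src_max + dither_max)


def worst_case(src_depth, dst_depth):
    """Output for the largest sample plus the largest dither value."""
    src_max = (1 << src_depth) - 1
    dither_max = DITHER_MAX[src_depth - dst_depth - 1]
    shift = src_depth - dst_depth + 15
    return ((src_max + dither_max) * scale_for(src_depth, dst_depth)) >> shift


for dst in range(8, 16):
    for src in range(dst + 1, 17):
        assert worst_case(src, dst) <= (1 << dst) - 1
        assert scale_for(src, dst) < (1 << 16)  # fits in uint16_t
print("no overflow for any 9..16 -> 8..15 bit pair")
```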

Ma
15th March 2017, 12:47
This DITHER_COPY approach in ffmpeg is mathematically wrong. It makes the image darker. After 200 iterations on Selur's sample movie of 12-bit to 10-bit, 10-bit to 8-bit and 8-bit to 12-bit:
for /L %%i in (1 1 200) do (
ffmpeg2 -i w2.y4m -v warning -strict -1 -pix_fmt yuv444p10 w2-10-%%i.y4m
ffmpeg2 -i w2-10-%%i.y4m -v warning -strict -1 -pix_fmt yuv444p w2-8-%%i.y4m
ffmpeg2 -i w2-8-%%i.y4m -y -v warning -strict -1 -pix_fmt yuv444p12 w2.y4m
)
the final movie is darker and greener -- www.msystem.waw.pl/x265/w200.mkv
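
One way to picture a downward drift is a simplified round trip using the 12-bit to 10-bit constants quoted earlier in the thread (scale = 2047, shift = 13). This Python sketch assumes a dither value of 0 at the pixel in question and a plain left shift for the up-conversion; both are assumptions, and it does not reproduce the three-step loop above:

```python
def down_12_to_10(value, dither=0, scale=2047, shift=13):
    """DITHER_COPY-style 12 -> 10 bit step with the constants quoted earlier."""
    return ((value + dither) * scale) >> shift


def up_10_to_12(value):
    """Assumed up-conversion: plain left shift by two bits."""
    return value << 2


sample = 512  # mid-grey 10-bit value
for _ in range(5):
    sample = down_12_to_10(up_10_to_12(sample))
print(sample)  # 507
```

Because the scale is slightly below the exact ratio, an already up-shifted value loses one step per round trip whenever the dither contribution is zero; this is at most one plausible contribution to the darkening, not a confirmed diagnosis.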

Ma
17th March 2017, 00:37
I've sent a patch to ffmpeg-devel that is mathematically balanced (but 0.2s slower on Selur's whole movie) -- https://patchwork.ffmpeg.org/patch/2941/

The 200th iteration with the new patch looks like this -- www.msystem.waw.pl/x265/n200.mkv (it is the same as the first iteration because the new patch does not change upshifted data, i.e. data with 0 in the bits to be removed).

Selur
17th March 2017, 05:53
Thanks! :D