View Full Version : gamut conversions through Avisynth ?
pitch.fr
10th July 2008, 17:48
hello world,
there's a new SAMSUNG projector, that's ISF(Imaging Science Foundation) certified and made under the supervision of Joe Kane(he's quite a video guru)
that's the SP-A800B(DLP 1080p), and as they say "Paramount, DreamWorks, ABC, and Universal Studios are among the professional users of this model."
http://www.projectorcentral.com/Samsung-SP-A800B.htm
it's one of the very few projectors(if not the only one) where you can change the gamut on the fly.
mostly, the SD movies are encoded through the REC601 matrix, and the HD movies through REC709.
but this is the theory...
in practice(and I know that sounds crazy), the mastering engineers still use this kind of monitoring displays :
http://www.soundandvisionmag.com/assets/image/2007/W49/1242007175956.jpg
the engineer is using a BVM CRT monitor(with SMPTE-C phosphors gamut)
this thing :
http://assets.sonybiz.net/products/BVM-A14F5M(img1).jpg
http://www.sony.co.uk/biz/view/ShowProduct.action?BIZ_SESSIONID=yndHL19ZmGTb7HJwGSbJyQRmYPJYqtv1GjHhgvYr6TzXZNjLbf26!-1724456250&product=BVM-A14F5M&pageType=Overview&category=BVM&site=biz_en_GB
so basically, SD and HD movies are not mastered on sRGB/HDTV gamuts as we've all been led to believe.
video monitors used in Europe :
SONY BVM-A14F5M (EBU phosphors)
SONY BVM-A20F1M (EBU phosphors)
SONY BVM-A32E1WM (EBU phosphors)
SONY BVM-L230 (LCD)
SONY BVM-L420 (LCD)
and in USA:
SONY BVM-A20F1U(SMPTE-C phosphors)
so basically, to see movies the way they were mastered, you'd need :
1)YCbCr decoding :
- Rec. ITU-R BT.601-5 for SD primaries (in theory only)
Y’ = 0.299R’ + 0.587G’ + 0.114B’ (601)
- Rec. ITU-R BT.709-4 for HD primaries (in theory only)
Y’ = 0.2126R’ + 0.7152G’ + 0.0722B’ (709)
2)display gamut :
- REC-601 SMPTE-C for SMPTE-C primaries (NTSC SD & HD, 90% of bluray discs as well, and some PAL DVD's)
- REC-601 EBU Tech. 3213 for EBU primaries (PAL/SECAM SD and european bluray discs)
- REC-709 HDTV (for US and EUR HDTV)
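To make the two luma equations above concrete, here is a minimal Python sketch. The names LUMA_WEIGHTS and luma are made up for this example; only the coefficients come from the post. It shows that the same R'G'B' value gets a different Y' depending on which matrix is assumed, which is why decoding with the wrong one shifts the picture.

```python
# Luma weights quoted in the post (Rec.601 and Rec.709).
LUMA_WEIGHTS = {
    "601": (0.299, 0.587, 0.114),
    "709": (0.2126, 0.7152, 0.0722),
}

def luma(r, g, b, standard):
    """Return Y' for gamma-corrected R'G'B' values in the range [0, 1]."""
    kr, kg, kb = LUMA_WEIGHTS[standard]
    return kr * r + kg * g + kb * b

# A pure green pixel is weighted noticeably differently by the two matrices.
pure_green = (0.0, 1.0, 0.0)
y601 = luma(*pure_green, "601")
y709 = luma(*pure_green, "709")
print("green luma, 601:", y601, " 709:", y709)
```

White comes out as Y' = 1 with either set of weights, so the difference only shows up on coloured content.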
that's what my DLP projector looks like, when calibrated in sRGB(the gamut is identical to the HDTV gamut) :
http://pix.nofrag.com/a/3/c/deff50ab2a4b5c063c43e15a16155.png
and here's a gamut comparison :
http://www.homecinema-fr.com/forum/download/file.php?id=54227
so is there any AVS guru that'd be interested to find a way to get 1:1 colors with the way movies are being mastered in the first place ?
of course it'd have to be 10bit if possible in order to avoid banding and stuff.
I know that sounds crazy, but that's the way it is.
this Samsung projector is the ultimate thing to watch movies "the way they were meant to be watched"(it supports sRGB/HDTV/EBU/SMPTE/etc.. gamuts).
surely enough, you could revert this colorspace hiccup in AVS ?
....and run it in realtime in ffdshow ? that'd be the shiznit :D
here's a list made by the french ISF CEO "Julien Berry", that lists which gamut has to be used for each Bluray movie :
http://www.wysios.com/jkp/gamut2.asp
from what he told me, here's what happens when using a display with sRGB/HDTV gamut :
for telecines that were made on an EBU monitor, only the green primary will be off.
for telecines that were made on an SMPTE-C monitor, the colors will be over-saturated.
when you read that most of the bluray's are made on SMPTE-C monitors, and that usually you end up with over-saturated colors on sRGB/HDTV displays.....that suddenly would make a lot of sense to fix the colors.....wouldn't it ?
here's a technical PDF about gamuts :
http://www.teksite.co.kr/osilo/down/Understandin%20color%20and%20gamut.pdf
and basically you can do this gamut conversion with a scaler :
http://www.google.com/search?hl=en&q=gamut+conversion&btnG=Search&lr=
its job is to convert gamuts, but that costs a hell of a lot of money(and usually only SDI input and DVI output).....so doing it in AVS in 10bit would be beyond words :)
some ppl have worked on it with PS scripts :
http://www.avsforum.com/avs-vb/showthread.php?t=912720
the idea would be to :
-do REC-601/REC-709 matrix conversions for SD/HD to get sRGB content
-do REC-601 SMPTE-C / REC-601 EBU Tech. 3213 > HDTV gamut conversions to get proper primaries on sRGB/HDTV displays :)
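The second step in the list above boils down to a 3x3 matrix built from chromaticity coordinates. Here is a rough Python sketch of that idea, not anybody's actual plugin code. The SMPTE-C, Rec.709 and D65 coordinates are the published values and are an assumption here (the thread does not list them), all helper names are invented, and the resulting matrix is only meaningful on linear RGB, not on gamma-corrected values.

```python
# Assumed published chromaticities (x, y); both gamuts use a D65 white point.
SMPTE_C = [(0.630, 0.340), (0.310, 0.595), (0.155, 0.070)]  # R, G, B
REC_709 = [(0.640, 0.330), (0.300, 0.600), (0.150, 0.060)]  # R, G, B
D65 = (0.3127, 0.3290)

def xy_to_xyz(x, y):
    """Chromaticity (x, y) -> XYZ with Y normalised to 1."""
    return [x / y, 1.0, (1.0 - x - y) / y]

def inverse(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def rgb_to_xyz(primaries, white):
    """Linear RGB -> XYZ matrix for a set of primaries and a white point."""
    cols = [xy_to_xyz(x, y) for x, y in primaries]
    m = [[cols[c][r] for c in range(3)] for r in range(3)]
    # Scale each primary so that RGB = (1, 1, 1) lands on the white point.
    scale = mat_vec(inverse(m), xy_to_xyz(*white))
    return [[m[r][c] * scale[c] for c in range(3)] for r in range(3)]

m_709 = rgb_to_xyz(REC_709, D65)
smptec_to_709 = mat_mul(inverse(m_709), rgb_to_xyz(SMPTE_C, D65))

for row in smptec_to_709:
    print(["%.4f" % v for v in row])
```

Swapping REC_709 for the primaries measured on a calibrated display would give the "SMPTE-C to my own display" version of the same matrix.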
pitch.fr
10th July 2008, 18:35
an even easier solution would be to input the primaries locations(after calibration) and input them in AVISYNTH :
http://www.avsforum.com/avs-vb/showthread.php?p=11726737#post11726737
is that doable ?
here are the primaries locations on my sRGB CRT :
R:[0.658,0.328]; G:[0.334,0.619]; B:[0.150,0.070]
EDIT : another option might be to use ColorMatrix.
but it doesn't support either "REC-601 EBU" or "REC-601 SMPTE-C" :(
anyhow here's a BT.709>RGB32 conversion(through ffdshow)
+ the same with ColorMatrix(mode="Rec.601->Rec.709") added upfront :
http://thumbnails8.imagebam.com/915/73e4979147374.gif (http://www.imagebam.com/image/73e4979147374)http://thumbnails8.imagebam.com/915/0b83a59147373.gif (http://www.imagebam.com/image/0b83a59147373)
http://thumbnails8.imagebam.com/915/7778be9147377.gif (http://www.imagebam.com/image/7778be9147377)http://thumbnails8.imagebam.com/915/36e09b9147376.gif (http://www.imagebam.com/image/36e09b9147376)
http://thumbnails8.imagebam.com/915/a0f7ef9147379.gif (http://www.imagebam.com/image/a0f7ef9147379)http://thumbnails8.imagebam.com/915/30d22c9147378.gif (http://www.imagebam.com/image/30d22c9147378)
I would say that it looks A LOT better on my sRGB/HDTV calibrated displays :)
this is a US bluray source btw, so most likely telecined in a "REC-601 SMPTE-C" gamut ;)
hopefully I've convinced a few people now :D
yesgrey
12th July 2008, 21:01
some ppl have worked on it with PS scripts....but they are confusing "decoding matrix" and "display gamut" :
http://www.avsforum.com/avs-vb/showthread.php?t=912720
I'm the one who started the AVSForum thread you refer to, and no, I am not confusing "decoding matrix" and "display gamut".
It seems to be you who is confusing "encoding matrix" and "monitoring color gamut". The authoring process being monitored with a different color gamut does not mean that the colors are wrong. They could be correcting the colors on their monitoring displays the same way I suggested in that thread.
The whole thread assumes the mastering process is monitored on displays with the same color gamut as the encoding matrix; what you are saying is that it's not done that way. In fact, if the above correction is not performed on the monitoring displays, the correction should be a little different, but the math is all there, we just need to adapt it to this situation. I will take a look and update the thread accordingly.
When I started the thread I started out using avisynth, but when I realized the correction must be done in linear color space, JohnAd suggested using pixel shaders and mplayerc.
It seems to be the better option for now; you have it already done and working, so why request it in avisynth?
pitch.fr
13th July 2008, 00:46
well many people speak about different ways to do it in this thread, sorry I didn't understand it this way :(
anyhow, great to have you here :)
tritical is improving ColorMatrix to allow this kind of CIExyz conversions on the fly in AVS/ffdshow :)
well, for one I only use Haali's Renderer, which is the de facto video renderer on PC to get smooth video....so these PS scripts are not usable.
and ColorMatrix will let you input the primaries coordinates you get from colorimetry applications, so it will offer tailor-made gamut conversions from SMPTE-C to your own calibrated display for 1:1 colors(no need for an expensive scaler anymore).
well the ISF CEO and many other professionals are showing that the only way to watch bluray with 1:1 colors is to do REC.709 conversion, and watch it on an SMPTE-C gamut if it was mastered in the US, and an EBU gamut if it was done in europe.
these guys are calibrating broadcasting studios with Minolta CS colorimeters all over the world, and they are pushing this new Samsung as the ultimate projector because it supports any kind of gamut you could possibly want :eek:
even PLANAR does that now with their newest projectors as well.
you can find above a list with all the bluray that are supposed to be watched on an SMPTE-C gamut.
OTOH, the only thing you need to use this kind of PS script is 2D convolution, that some AVS plugins can do.....only problem is that converting to RGB in AVS is single threaded and very slow compared to ffdshow, and you also get those pesky "chroma upsampling bugs" :
http://forum.doom9.org/showpost.php?p=1137196&postcount=1868
pitch.fr
13th July 2008, 17:35
alright, so this PS script is really nice :eek:
http://www.avsforum.com/avs-vb/showthread.php?t=912720
...too bad that doesn't work with HR :(
here's my 19" iiyama CRT calibrated CIE :
http://thumbnails9.imagebam.com/931/c3ad2b9303768.gif (http://www.imagebam.com/image/c3ad2b9303768)
here's my XLS script with the primaries coordinates measured with an Eye One Display 2 colorimeter in Color.HCFR :
http://thumbnails9.imagebam.com/931/fdb6989303770.gif (http://www.imagebam.com/image/fdb6989303770)
and here's the correction from SMPTE-C to my CRT gamut, with REC.709 RGB32HQ HD content from ffdshow in VMR9.
left is untouched, right is corrected to match my CRT gamut.
http://thumbnails9.imagebam.com/931/4ad07f9303793.gif (http://www.imagebam.com/image/4ad07f9303793)http://thumbnails9.imagebam.com/931/8bfb8d9303796.gif (http://www.imagebam.com/image/8bfb8d9303796)
http://thumbnails9.imagebam.com/931/76e0769303773.gif (http://www.imagebam.com/image/76e0769303773)http://thumbnails9.imagebam.com/931/01b8319303776.gif (http://www.imagebam.com/image/01b8319303776)
http://thumbnails9.imagebam.com/931/b562359303778.gif (http://www.imagebam.com/image/b562359303778)http://thumbnails9.imagebam.com/931/7e556c9303781.gif (http://www.imagebam.com/image/7e556c9303781)
http://thumbnails9.imagebam.com/931/25028c9303784.gif (http://www.imagebam.com/image/25028c9303784)http://thumbnails9.imagebam.com/931/10f5e19303785.gif (http://www.imagebam.com/image/10f5e19303785)
http://thumbnails9.imagebam.com/931/2aedfd9303788.gif (http://www.imagebam.com/image/2aedfd9303788)http://thumbnails9.imagebam.com/931/3d12c09303791.gif (http://www.imagebam.com/image/3d12c09303791)
I'd say it removes the green cast on faces(my biggest problem at this point!), and it gives much more natural colors :cool:
too bad my red level is now lower, and because it's 8bit/pixel this could possibly introduce banding.
that's also too bad that I've never managed to get good results with Reclock in VMR9......HR is way smoother, so this is not a viable solution for me at this point.
there are some AVS plugins that can do 2D convolution :
http://www.google.com/search?hl=en&q=avisynth+convolution&btnG=Search&lr=
but as I said earlier RGB32 conversion in AVS is slow as hell(not multithreaded and/or optimized), and full of "chroma upsampling bugs".......except if tritical can make miracles happen :)
otherwise maybe you could create an .icm file ?
or a .cal file to be opened with dispwin.exe(part of ARGYLLCMS www.argyllcms.com) ?
ATi's CLUT is 30bit AFAIK :)
ARGYLL's coder is very knowledgeable and always willing to help :)
it's still 8bit/pixel, but better get proper 8bit corrected colors than greenish movies :D
Wilbert
13th July 2008, 17:44
but as I said earlier RGB32 conversion in AVS is slow as hell, and full of "chroma upsampling bugs"
If you feed interlaced material, you need to use
ConvertToRGB32(interlaced=true)
pitch.fr
13th July 2008, 17:56
ok, thanks for the tip but I've never managed to get RGB32 content out of AVS w/o chroma bugs.
here's a good test sample :
http://forum.doom9.org/showpost.php?p=1137196&postcount=1868
only HR and ffdshow(with "high quality YV12 to RGB conversion" checked) manage to do bug-free conversions AFAIK...
and HR/ffdshow BT.709 matrices are also different.....ffdshow's is 1 notch greener, and 1 notch less blue and less red...
Wilbert
13th July 2008, 18:17
ok, thanks for the tip but I've never managed to get RGB32 content out of AVS w/o chroma bugs.
here's a good test sample :
http://forum.doom9.org/showpost.php?...postcount=1868
Ok. Post a script and a screenshot which shows these chroma upsampling bugs.
pitch.fr
13th July 2008, 19:10
Ok. Post a script and a screenshot which shows these chroma upsampling bugs.
http://img207.imageshack.us/img207/2374/blaud9.png
there are also some AVS commands to improve the chroma upsampling, but that didn't work either.
I'm not an AVS guru, so maybe I'm missing something :D
what is required is "progressive chroma upsampling", at this point this is not done :eek:
Didée
13th July 2008, 19:45
That's fine and all, but where's the bug you're talking about? There's nothing wrong in your pic. :)
When converting YUV to RGB, it's a matter of choice if you replicate the subsampled UV planes 1:1, or if you interpolate during resampling. The former is what Avisynth and ffdshow(default) are doing, the latter is what ffdshow's HQ=true does.
Both methods are valid, there is no "right" or "wrong".
Definitely, it is not a bug. A missing option, yes. But not a bug.
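The two choices described above (replicate the subsampled chroma, or interpolate it) can be sketched on a single row of chroma samples. This is only a toy illustration in Python with made-up function names and a simple midpoint average; it is not the kernel that Avisynth or ffdshow actually use.

```python
def upsample_replicate(chroma):
    """Double the horizontal chroma resolution by repeating each sample."""
    out = []
    for c in chroma:
        out += [c, c]
    return out

def upsample_linear(chroma):
    """Double the resolution, filling the gaps with the average of neighbours."""
    out = []
    for i, c in enumerate(chroma):
        nxt = chroma[min(i + 1, len(chroma) - 1)]  # repeat the edge sample
        out += [c, (c + nxt) / 2]
    return out

row = [16, 240]  # a hard chroma edge
print(upsample_replicate(row))  # the edge stays a hard step (blocky look)
print(upsample_linear(row))     # the edge gets an in-between value (smoother)
```

On a saturated red edge the replicated version is what shows up as blocky reds, while the interpolated version softens the step.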
pitch.fr
13th July 2008, 20:09
oh ok thanks for the clarification Didée...well everyone calls it "chroma upsampling bug" :
http://www.google.com/search?hl=en&q=%22chroma+upsampling+bug%22&btnG=Search&lr=
it basically makes the red blocky to hell...and I'm not watching HD to get blocky reds :D
so is there a way to overcome this issue in AVS ?
I think I read that when you checked "HQ" in ffdshow, it was using some YV12>YUY2 conversion script(from Avisynth actually).....I read it in the ffdshow thread :
http://forum.doom9.org/showpost.php?p=1106325&postcount=3250
anyhow ConvertToRGB32 can't be multithreaded, so this can't be used in realtime in ffdshow :(
tritical
13th July 2008, 21:16
pitch.fr how did you generate that image? Avisynth uses interpolation in its yv12->yuy2->rgb conversions, but the image for converttorgb32() looks more like replication was used. Btw, 'chroma upsampling bug' usually refers to using progressive upsampling for yv12->yuy2 when interlaced is needed or vice versa.
Maybe someone who knows for sure can comment on this, but I was under the assumption that the high quality rgb conversion in ffdshow actually used the yv12->yuy2->rgb conversion code from avisynth, and that the other colorspace conversions in ffdshow used xvid's conversion routines? I took a quick glance at ffdshow-tryout's subversion repository, and saw avisynth's/xvid's conversion code in there, but couldn't quickly see when each was called.
pitch.fr
13th July 2008, 21:24
hi tritical, did you get my PM ? any chance getting a reply please ? :D
well I was using Haali's Renderer in MPC, then I pressed the "print screen" key, zoomed to 300% in photoshop and did some cut/paste.
yes it seems that ffdshow is using some YV12>YUY2>RGB32 conversion when you tick the "HQ conversion" option, I dunno why it ends up like that with ConvertToRGB32 :(
anyhow, I'll try to watch a movie with EVR in MPC HC, Reclock and that PS script......but I don't think this will be as smooth as HR in 1920*1080 24Hz :(
tritical
13th July 2008, 21:47
Yeah, I'll get back to you, gonna go on a bike ride right now though. I quickly downloaded that file, and I get pretty much exactly the same image using ffdshow's hq rgb conversion versus using converttorgb24() in avisynth. image (http://bengal.missouri.edu/~kes25c/tt.bmp) (top is ffdshow default, second is ffdshow hq, third is converttorgb24()) (you might have to zoom in to see the differences). Are you sure that when you tested with converttorgb in avisynth that you didn't accidentally still have ffdshow outputting rgb when it decoded the video or something along those lines?
pitch.fr
13th July 2008, 22:06
I didn't use ConvertToRGB24, only ConvertToRGB32.....as even if I do RGB24 in AVS, then I will need to do convert to 32 again....which will mean more CPU time required :(
I've just tried again with ConvertToRGB24(matrix="rec709") + RGB24>RGB32 conversion in ffdshow(as neither VMR9/EVR or HR will accept RGB24 input) and I still get pixelated red.....are you sure you got this output from ConvertToRGB24 ?!
maybe it's ffdshow messing the whole stuff up, then ? even tho I kinda doubt it :(
yesgrey
13th July 2008, 23:14
The problem is that the color correction must be done in linear space. So, after you convert from YUV to R'G'B', you must remove the gamma to get linear RGB, apply the color correction, reapply the gamma and then output to your screen. This process has a big advantage: you can also perform gamma correction in the same operation, to apply a gamma to the signal that is better suited to your display.
This is not a 2D convolution, it is simply multiplying vectors by matrices. The problem is that it is slow, and the PS has the great advantage of being highly multithreaded, with all the clip math already implemented, so no cpu work.
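The chain described above (remove gamma, multiply by a matrix, reapply gamma) looks roughly like this per pixel. It is only a sketch in Python: the pure power-law gamma of 1/0.45 is an assumption, and the CORRECTION matrix is a made-up placeholder, not a real SMPTE-C correction.

```python
GAMMA = 1.0 / 0.45  # assumed pure power law, roughly 2.222

# Placeholder matrix for illustration only; a real one would be derived
# from the source primaries and the measured display primaries.
CORRECTION = [
    [0.94, 0.05, 0.01],
    [0.02, 0.96, 0.02],
    [0.00, 0.01, 0.99],
]

def clip(v):
    """Keep a value inside the displayable range [0, 1]."""
    return min(1.0, max(0.0, v))

def correct_pixel(rgb_prime):
    """Gamma-corrected R'G'B' in, gamma-corrected R'G'B' out."""
    linear = [v ** GAMMA for v in rgb_prime]                # remove gamma
    mixed = [sum(CORRECTION[r][c] * linear[c] for c in range(3))
             for r in range(3)]                             # 3x3 multiply
    return [clip(v) ** (1.0 / GAMMA) for v in mixed]        # reapply gamma

print(correct_pixel([1.0, 0.0, 0.0]))  # saturated red gets remapped
print(correct_pixel([0.5, 0.5, 0.5]))  # grey stays grey (rows sum to 1)
```

Using a different exponent when reapplying the gamma is how the same pass could also do the display gamma correction mentioned above.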
I use reclock and it is great. Tell me more about your problems, maybe I can help you.
When I created that thread, I asked Haali to add that PS functionality to HR, since he is already using PS for the YUV->RGB conversion. He told me he would look into it, but never replied to me again about the subject, so I think he did not approve of the idea... Let's keep using mplayerc and vmr9...
tritical
14th July 2008, 00:03
@pitch.fr
Something is wrong in your chain. Here is how I tested:
1.) ffdshow default. Use directshowsource("Bronz_s.mkv") in avisynth to open the mkv file using ffdshow, and set ffdshow to output RGB w/o hq conversion. Open in vdub.
2.) ffdshow hq. Same as method 1 setup, but check the hq rgb conversion box in ffdshow.
3.) avisynth conversion. Again use directshowsource, but make sure ffdshow is set to output YV12 (check that it is in vdub), and use converttorgb24 or converttorgb32 after directshowsource() in the avisynth script.
Using this, I get that 2 and 3 give almost identical output (image (http://bengal.missouri.edu/~kes25c/tt2.bmp)).
pitch.fr
14th July 2008, 00:08
Let's keep using mplayerc and vmr9...
yeah, except if tritical can make some highly multithreaded and optimized code, I don't see it working in realtime in ffdshow :(
anyhow, I've just run through all my test movies on my DLP projector.
I'd say the colors look way more realistic :eek:
the best example being an HD bollywood movie, where red clothes look too dark, dull and artificial and skin tones slightly too green.....and when I enable the tailor-made PS script, bam! it looks natural (again) :devil:
well my main issue with not using HR w/ Reclock is that it's simply not as smooth.....HR is just so smooth, that's quite amazing...
I can watch a 2H movie in 24fps@24Hz with very low jitter and no dropped frame.
EVR or VMR9 can't offer me this kind of super low jitter.
but I'll run more tests, considering I'm starting to enjoy getting 1:1 colors on a 2 meters wide projection screen :p
@pitch.fr
Something is wrong in your chain. Here is how I tested:
1.) ffdshow default. Use directshowsource("Bronz_s.mkv") in avisynth to open the mkv file using ffdshow, and set ffdshow to output RGB w/o hq conversion. Open in vdub.
2.) ffdshow hq. Same as method 1 setup, but check the hq rgb conversion box in ffdshow.
3.) avisynth conversion. Again use directshowsource, but make sure ffdshow is set to output YV12 (check that it is in vdub), and use converttorgb24 or converttorgb32 after directshowsource() in the avisynth script.
Using this, I get that 2 and 3 give almost identical output (image (http://bengal.missouri.edu/~kes25c/tt2.bmp)).
that's really weird ?! I'll run more tests tomorrow ?!
and could you do that gamut conversion in 10 bit in AVS(even if it's not officially supported yet) ?
yesgrey
14th July 2008, 01:03
yeah, except if tritical can make some highly multithreaded and optimized code, I don't see it working in realtime in ffdshow :(
It would be great if the gpu drivers could allow this. It would be so simple for them, but I don't believe they would do this kind of stuff for just a few guys...
I'm starting to enjoy getting 1:1 colors on a 2 meters wide projection screen :p
I understand you. My projector is a bit old (JVC M15), and the green is slightly off. It never bothered me. In fact, I never noticed it. After I started using mplayerc with the PS for correcting the colors, I started noticing and preferring the standard colors...:)
yesgrey
14th July 2008, 01:09
and could you do that gamut conversion in 10 bit in AVS(even if it's not officially supported yet) ?
For this the PS is better. I think it operates with more than 14 bit precision. But the biggest problem is that we don't have bit depths higher than 8 bit per color in the pc world. Due to this, in theory, you could get some banding with the color correction; in practice, I never noticed it.
pitch.fr
14th July 2008, 03:23
well even BD's are 8bit, there's nothing further on the market at this point AFAIK..
oh really 14bit ? I had no idea :)
well I've run some tests this evening, and EVR/VMR9 still can't compete with HR's smoothness...
maybe I'm willing to give up some smoothness to get perfect colors :rolleyes:
lucky you, your projector has a Xenon lamp so you get massive reds, due to my UHP lamp mine are orangey now :(
tritical
14th July 2008, 11:53
Whether avisynth is the right place or not, I created a filter to do the conversions. ddcc v1.2 (http://bengal.missouri.edu/~kes25c/ddcc.zip). It takes in rgb24 or rgb32, and does the following conversion:
gamma corrected RGB -> linear RGB -> CIE XYZ -> CIE XYZ -> linear RGB -> gamma corrected RGB
Chromaticity coordinates and transfer functions of the source and output are fully adjustable... some presets are built in. If the white points of the chromaticity coordinates differ, then a chromatic adaptation is done using the Bradford matrix. More details are in the readme. I tested some, but not enough to really be sure it is correct (I simply implemented the conversions as I understand them). For this to run real time on HD video you're gonna need a processor with SSE3 support (I didn't write SSE or SSE2 routines, so it's either SSE3 or the C routine), and probably more than 1 core (using only 1 core on my 2.8Ghz Q6600 doesn't quite make it). Hopefully someone can comment on whether the results are anywhere near correct :).
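For the "CIE XYZ -> CIE XYZ" step, a Bradford chromatic adaptation is commonly written as a 3x3 matrix built from the two white points. The Python sketch below shows that textbook construction. It is not taken from ddcc's source, the function names are invented, and the D65/D50 white points are just example inputs.

```python
# Published Bradford cone-response matrix.
BRADFORD = [
    [0.8951, 0.2664, -0.1614],
    [-0.7502, 1.7135, 0.0367],
    [0.0389, -0.0685, 1.0296],
]

def xy_to_xyz(x, y):
    """Chromaticity (x, y) -> XYZ with Y normalised to 1."""
    return [x / y, 1.0, (1.0 - x - y) / y]

def inverse(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def bradford_adaptation(src_white_xy, dst_white_xy):
    """Matrix mapping XYZ under the source white to XYZ under the destination white."""
    src_cone = mat_vec(BRADFORD, xy_to_xyz(*src_white_xy))
    dst_cone = mat_vec(BRADFORD, xy_to_xyz(*dst_white_xy))
    # Diagonal gain in cone space: ratio of destination to source response.
    gain = [[dst_cone[i] / src_cone[i] if i == j else 0.0 for j in range(3)]
            for i in range(3)]
    return mat_mul(inverse(BRADFORD), mat_mul(gain, BRADFORD))

D65 = (0.3127, 0.3290)  # example source white
D50 = (0.3457, 0.3585)  # example destination white
adapt = bradford_adaptation(D65, D50)
adapted_white = mat_vec(adapt, xy_to_xyz(*D65))
print(adapted_white)
```

When both gamuts share the same white point the matrix collapses to the identity, so the step only matters when the whites differ.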
yesgrey
14th July 2008, 12:06
tritical,
I will take a look and update my AVSForum thread if it's working. It's nice to have other options.:)
Have you thought about doing it using the GPU instead of the CPU? Since this is a highly parallelizable task the GPU could speed it up a lot!... The mplayerc Pixel Shader version works great without any cpu overhead...
Thanks for your work!
pitch.fr
14th July 2008, 12:49
cool thanks tritical, gonna give it a shot!
wow 32float, hope this will work in realtime :D
I've spoken to a friend of mine who's a PS script coder, and he said PS scripts also work in 32bit float :
http://msdn.microsoft.com/en-us/library/bb509646(VS.85).aspx
EDIT : ooh, your plugin is highly technical stuff :eek:
all I wanna do is map SMPTE-C to my DLP projector gamut, which is
Your Display xy
     Red     Green   Blue    White
x    0.656   0.339   0.151   0.311
y    0.329   0.611   0.068   0.328
z    0.015   0.050   0.781   0.361
maybe you could give some real life examples in the readme for the Avisynth challenged amongst us :D
well I can check very easily if the conversion is done properly, by doing a Color.HCFR calibration using the official test patterns DVD in MPC
I could even compare the MPC PS script and your plugin.
I'm gonna try to see if it works in realtime, but some little help on the syntax would be much appreciated :)
pitch.fr
14th July 2008, 13:54
well it's actually very usable in realtime with HR :eek:
man, 32bit gamut correction with HR and Reclock, this is too awesome :)
at this point I'm decoding h264 w/ ffdshow, then it enters the AVS filter in YV12, for 1) LSF 2) multithreaded ConvertToRGB32 conversion and 3) your colorimetry plugin.
pitch.fr
14th July 2008, 20:40
oh btw yesgrey3, tritical is using 6 digits after the decimal point for the xyz coeffs, may wanna update your script ?
for BT470-2 you have more digits after the decimal point than he does.......but you simply call it "NTSC" so I dunno if it's the same exact matrix.
and for "Offset" which is set to 24 in the XLS script, what's a good value ? what does it stand for ?
tritical
14th July 2008, 22:53
@yesgrey3
I'll look into it. Unfortunately, I don't have any experience using pixel shaders... so it may be unlikely.
@pitch.fr
Not sure if you figured this out already, but to map to a custom set of chromaticity coordinates you'll have to use the ofile parameter:
ddcc(ofile="file.txt")
file.txt would contain your values as follows:
0.151
0.068
0.781
0.339
0.611
0.050
0.656
0.329
0.015
0.311
0.328
0.361
You also need to specify the 5 values for the transfer function. You can just take one of the sets of values out of the readme, or to match the transfer function of the pixel shader code use the following five values:
1.0
0.0
0.45
0.0
0.0
I set the defaults of the filter to assume SMPTE-C input primaries and BT.709 input transfer function... so you'll probably also need to change the input transfer function depending on what values you use for the output transfer function.
Also, I put up v1.1... it fixes one small bug, and adds one new built in transfer function (gam_i/gam_o = 5) which matches the transfer function of the pixel shader code.
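For anyone who only has x and y readings from a colorimeter, the twelve chromaticity values listed above can be generated with a few lines of Python. This is just a sketch: the blue, green, red, white order is inferred from the numbers in the listing, z is computed as 1 - x - y, and the five transfer function values are left out because the post does not say where they go.

```python
# Measured chromaticities (x, y) from the projector table posted earlier.
measured = {
    "red": (0.656, 0.329),
    "green": (0.339, 0.611),
    "blue": (0.151, 0.068),
    "white": (0.311, 0.328),
}

def ofile_lines(xy_by_name):
    """Return the x, y, z lines in the order seen in the listing above."""
    lines = []
    # Order inferred from the posted values: blue, green, red, then white.
    for name in ("blue", "green", "red", "white"):
        x, y = xy_by_name[name]
        z = 1.0 - x - y
        lines += ["%.3f" % x, "%.3f" % y, "%.3f" % z]
    return lines

print("\n".join(ofile_lines(measured)))
```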
pitch.fr
14th July 2008, 23:18
great, thanks!
actually I've spent quite some time trying to find a way to run this thing in realtime in ffdshow :)
I've had to cut quite a lot of my audio enhancements in ffdshow audio(96KHz resample and stuff), because it sucks a lot more CPU to use your script + RGB32 conversion in the AVS filter, than to let ffdshow do the RGB32HQ conversion + using the PS script to fix the gamut :rolleyes:
what do you think is a good/best transfer function to choose ?
any theoretical improvement over the way the PS script works ?
and what do you guys think of that SAMSUNG pj that can auto-detect whether the YCbCr input is receiving BT.601 or BT.709 ?!
they've got to use some sort of auto-detection algorithm I guess ? or maybe they actually found a way to do that ?
AFAIK this is not possible :confused:
there is already a PS script parser for AVS :
http://www.google.com/search?hl=en&q=avisynth+pixel+shader&btnG=Search&lr=
maybe there is room for improvement ?
only the GPU works on these scripts, that's a good thing because I've always heard ppl saying that CPU's were pretty lousy DSP's
yesgrey
15th July 2008, 02:53
I'll look into it. Unfortunately, I don't have any experience using pixel shaders... so it may be unlikely.
I was not thinking exclusively of PS. Would CUDA be too complicated? Haven't you tried something with the gpu for your NNEDI?;)
tritical
15th July 2008, 03:54
@yesgrey3
A CUDA implementation would be easy to do, but I was thinking that it would be pretty limiting (in terms of supported video cards) compared to a pixel shader.
@pitch.fr
The best choice of transfer functions would be the one corresponding to what was originally used to produce the gamma corrected RGB, and the one corresponding to the transfer function of your display. Of course, finding those out may be impossible. The one in the pixel shader code is just a generic one with ~2.222 display gamma and no linear segment... it's a reasonable choice if the true one is not known. I have no idea how autodetection of BT.601 vs BT.709 would work.
pitch.fr
15th July 2008, 04:18
yeah, I prefer the PQ of ATi cards........please no nvidia proprietary stuff :D
so what do u think of the PS AVS plugin, maybe a good starting point ?
I'm not sure I understand what you call the "Transfer Function", but it's 4AM here :D
you mean that it's a wild guess at what the gamma curve was before they encoded their RGB master to YV12 ?
I know what the gamma curve looks like on my DLP pj, it starts at 2.3 and ends at 2.1
BUT the professional broadcasting equipment runs 2.5 gamma, that's the rule.......only consumer equipment is 2.22
do you know what I should change in the PS script generator to reflect this please ?
there's also many ways to compute gamma, and there's 2 formulas being used the most.
I've spoken about that with the HCFR color engineers, the most widely used formula seems to be this one :
y' = (y - black) / (white - black)
gamma = log (y') / log (x)
or y' = x ^gamma
which can also be written as follows
we agree that
Y = Ymin when V = 0
Y = Ymax when V = 1
gamma = log ((Y - Ymin) / (Ymax - Ymin)) / log (V)
for 0<V<1
if we wanna take the offset in account, we need to use the following formula :
Y = ((V+offset)/(1+offset))^gamma
if V = 1 then Y = 1
if V = 0 then Ymin = ( offset / (1 + offset)) ^ gamma
it's the one used by Colorfacts, X-Rite, Color.HCFR, Calman etc...
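Here are the two formulas above as a small Python sketch, with invented function names. V is the normalised signal level between 0 and 1, Y is the measured luminance, and the 0.05 offset in the example is arbitrary.

```python
import math

def gamma_at(v, y, y_min, y_max):
    """Point gamma at signal level v (0 < v < 1) from a luminance reading y."""
    return math.log((y - y_min) / (y_max - y_min)) / math.log(v)

def luminance_with_offset(v, offset, gamma):
    """Y = ((V + offset) / (1 + offset)) ^ gamma, the formula with offset."""
    return ((v + offset) / (1.0 + offset)) ** gamma

# A display following a pure 2.2 power law measures as gamma 2.2 everywhere.
print(gamma_at(0.5, 0.5 ** 2.2, 0.0, 1.0))

# With an offset, black is no longer at zero luminance.
print(luminance_with_offset(0.0, 0.05, 2.2))
print(luminance_with_offset(1.0, 0.05, 2.2))
```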
tritical
15th July 2008, 21:27
When I say 'transfer function' I just mean the function which maps linear RGB to gamma corrected RGB. The inverse of the transfer function maps gamma corrected RGB to linear RGB. In ddcc they are implemented as:
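transfer function (linear RGB -> gamma corrected RGB):
x = lv*C                           if C < thresh1
x = (1.0+av)*pow(C,pv)-av          otherwise
inverse (gamma corrected RGB -> linear RGB):
x = C/lv                           if C < thresh2
x = pow((C+av)/(1.0+av),1.0/pv)    otherwise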
Here, x is the output value, C is a linear or gamma corrected RGB value in the range [0,1], and lv, av, pv, thresh1, and thresh2 are the variables. This is the same as the function you describe, if thresh1 and thresh2 are set to 0 (i.e. no linear segment around 0), offset = av, and pv = 1.0/gamma. All of the presets that I put into ddcc (with the exception of the one to match the pixel shader code, and the srgb one which I got from http://en.wikipedia.org/wiki/SRGB) I took from the mpeg2 specs ('transfer_characteristics', Table 6-8).
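A direct Python transcription of the two piecewise functions described above may help the non-coders. The parameter names follow the post; the function names are invented. The BT.709 constants (4.5, 0.099, 0.45, 0.018, 0.081) are taken from the published standard and are an assumption here, since the thread does not list the preset values.

```python
def to_gamma(c, lv, av, pv, thresh1):
    """Transfer function: linear value C -> gamma corrected value."""
    if c < thresh1:
        return lv * c
    return (1.0 + av) * pow(c, pv) - av

def to_linear(c, lv, av, pv, thresh2):
    """Inverse transfer function: gamma corrected value C -> linear value."""
    if c < thresh2:
        return c / lv
    return pow((c + av) / (1.0 + av), 1.0 / pv)

# BT.709-style curve: linear segment near black, power law above it.
bt709 = dict(lv=4.5, av=0.099, pv=0.45)
encoded = to_gamma(0.5, thresh1=0.018, **bt709)
decoded = to_linear(encoded, thresh2=0.081, **bt709)
print(encoded, decoded)

# Pure power law (no linear segment, no offset), i.e. gamma = 1/0.45.
power = dict(lv=1.0, av=0.0, pv=0.45)
print(to_linear(0.5, thresh2=0.0, **power))
```

With both thresholds at 0 and av at 0 the inverse reduces to C^(1/pv), which is the plain gamma formula from the earlier post.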
pitch.fr
15th July 2008, 23:55
humm OK, I will read that slowly again I'm not quite a coder :D
BTW, what happens to off-gamut colors ?
white is my DLP pj, black is SMPTE-C :
http://pix.nofrag.com/a/3/c/deff50ab2a4b5c063c43e15a16155.png
what happens to the green colors that my pj can't output ? they're clipped ? :eek:
so they don't appear(banding?), or it still shows yellow instead like it does w/o correction ?
and are you sure this can't be done through the CLUT ?!
ARGYLLCMS has pretty advanced CLUT capabilities, and it's GPL software.
considering you know the formula part, why would that be impossible to generate CLUT entries ?
then we can "capture" it with powerstrip, and we're free from any resource hogging process.
of course I will test it in every possible way with my colorimeter :D
DESCRIPTOR "Argyll Device Calibration State"
ORIGINATOR "Argyll dispcal"
CREATED "Tue Jun 17 13:01:45 2008"
KEYWORD "DEVICE_CLASS"
DEVICE_CLASS "DISPLAY"
KEYWORD "DEVICE_TYPE"
KEYWORD "TARGET_WHITE_XYZ"
TARGET_WHITE_XYZ "47.889141 50.384253 54.837810"
KEYWORD "TARGET_GAMMA"
TARGET_GAMMA "2.200000"
KEYWORD "BLACK_POINT_CORRECTION"
BLACK_POINT_CORRECTION "1.000000"
KEYWORD "QUALITY"
QUALITY "high"
KEYWORD "RGB_I"
NUMBER_OF_FIELDS 4
BEGIN_DATA_FORMAT
RGB_I RGB_R RGB_G RGB_B
END_DATA_FORMAT
NUMBER_OF_SETS 256
BEGIN_DATA
0.0000 0.084613 0.0000 2.0434e-003
3.9216e-003 0.087834 0.0000 6.7763e-003
7.8431e-003 0.091075 1.0884e-003 0.011527
0.011765 0.094334 8.1334e-003 0.016295
0.015686 0.097613 0.015385 0.021081
0.019608 0.10091 0.022857 0.025885
0.023529 0.10423 0.030658 0.030707
0.027451 0.10757 0.038978 0.035546
0.031373 0.11092 0.047935 0.040408
0.035294 0.11430 0.057523 0.045302
0.039216 0.11771 0.067448 0.050240
0.043137 0.12117 0.077293 0.055230
0.047059 0.12466 0.086703 0.060272
0.050980 0.12820 0.095542 0.065369
0.054902 0.13178 0.10384 0.070522
0.058824 0.13540 0.11162 0.075722
0.062745 0.13906 0.11897 0.080956
0.066667 0.14275 0.12592 0.086205
0.070588 0.14647 0.13252 0.091446
0.074510 0.15019 0.13878 0.096661
0.078431 0.15392 0.14473 0.10184
0.082353 0.15765 0.15039 0.10697
0.086275 0.16136 0.15583 0.11205
0.090196 0.16507 0.16107 0.11710
0.094118 0.16877 0.16614 0.12210
0.098039 0.17246 0.17107 0.12706
0.10196 0.17616 0.17589 0.13197
0.10588 0.17986 0.18059 0.13685
0.10980 0.18357 0.18519 0.14168
0.11373 0.18728 0.18972 0.14646
0.11765 0.19100 0.19416 0.15121
0.12157 0.19471 0.19853 0.15589
0.12549 0.19843 0.20279 0.16054
0.12941 0.20214 0.20699 0.16514
and many more lines.......
yesgrey
16th July 2008, 09:27
BTW, what happens to off-gamut colors ?
what happens to the green colors that my pj can't output ? they're clipped ? :eek:
so they don't appear(banding?), or it still shows yellow instead like it does w/o correction ?
The final color gamut will be the intersection of the two triangles; a display cannot output colors outside its own color gamut. What this correction does is remap the "original" colors into the display gamut so they look the same. It does this by changing the RGB values sent to your display in such a way that the resulting colors are the same as the "original".
As I told you before, in theory banding could exist; in practice, I've never noticed it. As you can see, the colors that get clipped are the most saturated ones, which don't appear too often in a movie...
There are two kinds of possible banding, one avoidable and the other not:
1-colors outside the display gamut: there is nothing to do; due to the clipping, some colors will end up the same. Not many, and not very frequent in a movie.
2-colors inside the display gamut: if you perform the correction and have a display with >= 10 bits per color, you could avoid this kind of banding, but on the PC we are currently "limited" to 8 bits per color. Maybe there is a hardware solution...
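To make the second case concrete, here is a minimal Python sketch (not taken from ddcc or the PS script; the 0.938774 gain is only an illustrative coefficient below 1.0, and the function name is made up). It counts how many distinct output codes survive when an 8-bit ramp is scaled and then re-quantized to 8 or 10 bits:

```python
def distinct_levels(gain, out_bits):
    """Count distinct output codes for an 8-bit ramp scaled by `gain`."""
    out_max = (1 << out_bits) - 1
    codes = {round(v / 255.0 * gain * out_max) for v in range(256)}
    return len(codes)

# With 8-bit output some neighbouring input steps collapse onto the same code
# (possible banding); with 10-bit output every input step stays distinct.
print(distinct_levels(0.938774, 8))
print(distinct_levels(0.938774, 10))
```

This only models a single channel gain; the real correction mixes all three channels, but the rounding effect is the same.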
yesgrey
16th July 2008, 09:45
A CUDA implementation would be easy to do, but I was thinking that it would be pretty limiting (in terms of supported video cards) compared to a pixel shader.
Well, it would be a start... and many of the people who will use this probably have an Nvidia card.
Is the ATI "CUDA version" too complicated?
I understand that working with these new GPU/CPU APIs is not very appealing right now, because probably only one will survive, or even none of them (a Microsoft one maybe?...)
Also, we cannot forget that the existing alternative to this, using mplayerc with PS, works on all of them and gives pretty good results. I believe your solution is better in theory, because you are using the correct gamma functions, but in the end the result should be very similar...
I leave the decision in your hands (as it should be).:)
pitch.fr
16th July 2008, 11:01
As I told you before, in theory banding could exist; in practice, I've never noticed it. As you can see, the colors that get clipped are the most saturated ones, which don't appear too often in a movie...
There are two kinds of possible banding, one avoidable and the other not
ok, gracias for the explanation ;)
well CLUT is 30 bit, so no banding :)
I've spoken to Graeme Gill(the author of www.argyllcms.com) and he said this could be achieved through the CLUT.
then I could use it on top of Haali's Renderer.............this would be awesome for me coz this is the smoothest renderer ever :)
here's what he told me :
do you think that would be doable through the CLUT ?
Yes. It's been done many times. See the example code in the openEXR package and their exrdisplay utility. Rather than doing simple matrix arithmetic, you set up a texture to hold the CLUT, and have the fragment shader do a texture lookup for each pixel.
the issue is that the smoothest video renderer on PC(Haali's Renderer) doesn't support Pixel Shaders......and I was hoping to get SMPTE-C corrected colors with it, but I guess it's just a wild dream at this point...
The code here http://www.avsforum.com/avs-vb/showthread.php?t=912720 is already using a shader, it's just a matter of converting it to use a tex3D and loading the ICC device link into the texture, instead of the pow/mul etc. functions.
so are you saying that once the CLUT is built, it could work w/o further realtime pixel manipulations ?
Yes, it only needs initializing if you change the device link it represents.
The OpenEXR utility was written to preview motion picture effects in real time, that's why they use a shader program to do it.
if any of you two code gurus could do that, that'd save a helluvalot of CPU time :)
as I understand it, this should be possible to output an .ICC/.ICM file instead of a realtime correction through PS or AVS ? :p
maybe this PDF would help ?
http://www.inventoland.net/imaging/JEI/159.PDF
OTOH, I'm trying your AVS plugin in ffdshow....as nothing's as smooth as HR on my box with Reclock in 24/48Hz :(
I've copied this to C:\PJ_SMPTE.txt :
0.151
0.068
0.781
0.339
0.611
0.050
0.656
0.329
0.015
0.311
0.328
0.361
1.0
0.0
0.45
0.0
0.0
and I'm using :
MT("""ConvertToRGB32(matrix="rec709")""",4)
ddcc(chr_i=3,ofile="C:\PJ_SMPTE.txt",threads=4,opt=1)
all I get is a black screen, ddcc(chr_i=3,chr_o=0,threads=4,opt=1) works fine.
IanB
16th July 2008, 13:25
@Tritical,
One very minor issue, you min(max(X, 0.0f), 1.0f) in 3 places. The input gamma table, the result of the matrix multiply and again for the output gamma table. I would have expected only the output gamma table which generates the 8bit output pixels needed to be clamped, all the other cases could be free to have temporary head room excursions outside the nominal range.
i.e.
1.001*0.55 + 0.7*0.20 + 0.9*0.25 = 0.91555
1.000*0.55 + 0.7*0.20 + 0.9*0.25 = 0.91500
I guess it's a style thing, allow headroom on intermediate results or clamp rigorously at every stage.
tritical
16th July 2008, 15:25
@IanB
Well, only one of those three clamps is actually doing anything. The first clamp (on building the igamma table) shouldn't ever trigger since the inverse transfer function maps [0,1]->[0,1], and at that point it will only get [0,1] input. The second clamp, after the linear RGB -> XYZ -> linear RGB conversion, will trigger for out-of-gamut colors (i.e. a color in the input gamut is outside of the output gamut). I believe this is the right place to clamp because the final RGB values should be in the range [0,1], and the transfer function is defined as taking [0,1] input. The final clamp (on building the fgamma table) will again never trigger since it maps [0,1]->[0,1] and because of the previous clamp it will only be getting [0,1] input. I put those clamps in the gamma table creation mainly to remind myself that those values should be in the range [0,1], and since it is run only in the constructor the cost is negligible.
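As a rough illustration of where that middle clamp sits, here is a per-pixel sketch in Python. It is not ddcc's actual code: the function name, the pure power-law gamma and the matrix values are all assumptions made up for the example.

```python
def convert_pixel(rgb, matrix, gamma=1.0 / 0.45):
    """Gamma-decode, apply a 3x3 matrix, clamp, gamma-encode. Inputs in [0, 1]."""
    linear = [c ** gamma for c in rgb]            # inverse transfer function
    mixed = [sum(m * c for m, c in zip(row, linear)) for row in matrix]
    # Clamp here: out-of-gamut colours land outside [0, 1], and the transfer
    # function below is only defined for [0, 1] input.
    clamped = [min(max(v, 0.0), 1.0) for v in mixed]
    return [v ** (1.0 / gamma) for v in clamped]  # forward transfer function

# Illustrative matrix only (rows sum to 1 so white stays white).
M = [
    [1.10, -0.13, 0.03],
    [0.03, 0.97, 0.00],
    [0.01, 0.05, 0.94],
]
print(convert_pixel([1.0, 0.0, 0.0], M))  # red overshoots and is clamped to 1.0
print(convert_pixel([0.0, 1.0, 0.0], M))  # red undershoots and is clamped to 0.0
```

Without the clamp, the second call would try to raise a negative number to a fractional power, which is not a real value.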
@pitch.fr
Having the filter output a 3D LUT is possible, just need to know the format to use. Also, the dark screen when using the ofile parameter was due to a bug I introduced in v1.1 (a missing multiply if the white points differed causing chromatic adaptation to be used). I put up a fixed version.
pitch.fr
16th July 2008, 15:38
well, Graeme Gill's told me that an .ICC file would do the trick! possibly v4 ?
the LUT is 30 bit on ATi cards, dunno with nvidia ?
If you could generate an .ICC file, this would be really awesome........as the Avisynth support in ffdshow is too slow(or maybe it's ConvertToRGB32's fault), and this PS script isn't working with HR..
at this point, nothing could be better than a way to generate .ICC profiles
I'm actually very happy to hear that it's possible, because getting 1:1 colors with Haali's Renderer and Reclock would be just TOO AWESOME :eek:
OK I'm gonna try again with the new version
there is another slight detail, though.
a friend of mine believes that gamut color conversion can only be done on the original 16-235 signal........applying the gamut conversion on 0-255 expanded content would require different coeffs ?!
when we do REC.xxx conversions to RGB32 in ffdshow, luma goes from 16-235 to 0-255 and chroma from 16-240 to 1-255.....so are you still able to map the 0-15 and 241-255 colors :confused:
or should we output in full range and let your ICC do the PC>TV conversion ? or a custom range ?
also, here's a very interesting thread about these gamut stories :
http://www.avsforum.com/avs-vb/showpost.php?p=14071122&postcount=1
and another one on doom9 :
http://forum.doom9.org/showthread.php?t=132745
and a PDF about ICC corrections in the movie industry :
http://www.color.org/ICC_Chiba_07-06-19_PM_DMP_Float.pdf
thanks for your help tritical !
tritical
16th July 2008, 22:45
I don't know anything about ICC profiles... How exactly would that solve the problem (assuming you had one, how would you use it)? Also, is there a document which describes the format of icc profiles, i.e. how to create one?
a friend of mine believes that gamut color conversion can only be done on the original 16-235 signal........applying the gamut conversion on 0-255 expanded content would require different coeffs ?!
when we do REC.xxx conversions to RGB32 in ffdshow, luma goes from 16-235 to 0-255 and chroma from 16-240 to 1-255.....so are you still able to map the 0-15 and 241-255 colors
All of the equations in this filter work on values in the [0,1] range. The conversion to and from [0,1] currently assumes the min/max values of the RGB channels are 0 and 255. I could allow the user to specify different min/max values if desired (for instance if the RGB channels had a range of [16,235]), but usually RGB is [0,255]. The range question is usually related to whether YUV is [16,235]/[16,240] or [0,255]. Depending on which it is, you need different scalings of the YUV->RGB conversion coefficients.
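To show what "different scalings" means, here is a small Python sketch of an 8-bit Y'CbCr to full-range R'G'B' conversion with the Rec.709 luma coefficients. The function name and the `limited` switch are hypothetical, and the full-range branch assumes chroma is centred on 128 and scaled by 255; this is not how ffdshow or Avisynth necessarily implement it.

```python
KR, KB = 0.2126, 0.0722          # Rec.709 luma coefficients
KG = 1.0 - KR - KB

def ycbcr_to_rgb(y, cb, cr, limited=True):
    """Convert 8-bit Y'CbCr to 8-bit full-range R'G'B' (Rec.709 coefficients)."""
    if limited:                   # luma 16-235, chroma 16-240
        yn, pb, pr = (y - 16) / 219.0, (cb - 128) / 224.0, (cr - 128) / 224.0
    else:                         # luma and chroma both 0-255
        yn, pb, pr = y / 255.0, (cb - 128) / 255.0, (cr - 128) / 255.0
    r = yn + 2.0 * (1.0 - KR) * pr
    b = yn + 2.0 * (1.0 - KB) * pb
    g = (yn - KR * r - KB * b) / KG
    # Scale to 0-255 and clip anything that falls outside after rounding.
    return tuple(min(max(round(c * 255.0), 0), 255) for c in (r, g, b))

print(ycbcr_to_rgb(235, 128, 128))                 # limited-range white
print(ycbcr_to_rgb(235, 128, 128, limited=False))  # same code values read as full range
```

The same code values give different RGB depending on which range is assumed, which is why the range has to be settled before the gamut conversion sees the pixels.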
pitch.fr
16th July 2008, 22:54
well an ICC profile is the most universal solution by far.
people could compute the numbers on windows, then use the ICC profile on any OS and even hardware device :)
and it would enable gamut correction in any software/player(except PowerDVD because it only works in OVERLAY and this doesn't support ICC's).......but all the other players could benefit from this......even games, and there's no CPU time required, just like Pixel Shaders :)
I believe you can find all kind of white papers through google considering it's an open file format I think :
http://www.google.com/search?hl=en&q=ICC+specs&btnG=Search&lr=
this one maybe ?
http://www.color.org/ICC1v42_2006-05.pdf
I can ask Graeme Gill as well.
you can apply it within windows(for automatic correction) or with various apps, but I would personally use dispwin.exe(from the ARGYLLCMS package) with a batch for each of my displays(CRT/pj)
pitch.fr
17th July 2008, 14:31
I'm still a bit confused between LUT and profiles.
LUT is the final result of the profile, so you can't quite output LUT raw data.
here's what he told me
maybe he could simply output "raw lut data", that can be read with this software : http://www.exactscan.com/lutmanager/
You are confusing calibration and profiles.
The "lut data" is the calibration curves.
A device link is the result of linking two device ICC profiles, and is a clut, a 3D dimensional interpolation table.
Gerhard's code http://www.mail-archive.com/lcms-user@lists.sourceforge.net/msg02326.html , which uses lcms to link two device profiles and load it into a texture, is basically a complete example.
I can't help you much more than pointing you at that code. Graeme Gill.
I've asked him for simple specs to an .ICC file
ok, so using the settings from the previous page with DDCC 1.2, compared to these settings with the .XLS script(updated the script with your coeffs that have 5 digits after the decimal point) :
http://rapidshare.com/files/130373214/pj.xls.html
the correction is quite different actually.
this is the original test video :
http://rapidshare.com/files/130375581/rec709.mkv.html
http://thumbnails8.imagebam.com/950/b5e6859499794.gif (http://www.imagebam.com/image/b5e6859499794)
this is with the PS script, and SMPTE-C conversion :
http://thumbnails8.imagebam.com/950/3498ff9499795.gif (http://www.imagebam.com/image/3498ff9499795)
this is with ddcc 1.2, and SMPTE-C conversion :
http://thumbnails8.imagebam.com/950/3c2ca29499796.gif (http://www.imagebam.com/image/3c2ca29499796)
I'd say the gamma has been lowered way too much....clearly not 2.2 ?
or maybe we need to finetune the settings ? :D
EDIT : there's an ICC profile SDK btw :
http://www.littlecms.com/downloads.htm
interestingly enough, they have used that SDK to make a simple example, that is said to :
This command line program does compute colorspace conversion based on icc profiles. Additionally, it can show XYZ and Lab values of PCS, and up to 16 bits of precision (48, 64 bits per pixel). If you ever have been searching for a universal colorspace conversion utility, check this one!
http://www.littlecms.com/newutils.htm
but these things are going way over my head :scared:
tritical
18th July 2008, 02:21
The differences to the pixel shader are due to two things:
1.) ddcc and the ps code are using different input gamma functions. You need to set gam_i=5 in ddcc to have it use the same one as the ps code. So the script line would become:
ddcc(chr_i=3,gam_i=5,ofile="ofile.txt")
2.) When the white point coordinates differ between the source and destination, ddcc performs chromatic adaptation using the Bradford method. The pixel shader excel file does not. This will slightly change the resulting coefficients. (Other than this, the ps excel file and ddcc produce the same coefficients).
I don't think I'll mess with ICC profiles. However, the 3D lut method via a 3d texture on the gpu looks easy enough. I will try to implement it when I have the time.
pitch.fr
18th July 2008, 10:12
OK that sounds great, indeed they're identical now...tritical you're my hero :D
original :
http://thumbnails8.imagebam.com/956/b5e6859555408.gif (http://www.imagebam.com/image/b5e6859555408)
PS :
http://thumbnails8.imagebam.com/956/3498ff9555409.gif (http://www.imagebam.com/image/3498ff9555409)
AVS :
http://thumbnails8.imagebam.com/956/9a4b919555410.gif (http://www.imagebam.com/image/9a4b919555410)
I will take colorimeter measurements soon :)
too bad the ConvertToRGB32 sucks so much CPU......it's hardly usable in real time :(
pitch.fr
18th July 2008, 21:00
OK I've played around with Profile Maker 5.0
http://pix.nofrag.com/a/e/5/7812bec244bd2cba7acec57c666bdtt.jpg (http://pix.nofrag.com/a/e/5/7812bec244bd2cba7acec57c666bd.html)
here's what their help file says :
Mechanisms of the ICC Profile
There are a variety of mathematical mechanisms for color conversion that are used by the Color Management Module. The data required for this operation is contained in the profile.
There are two methods of defining a conversion:
1)Mathematical Functions (TRC: Tone Reproduction Curves and Matrix Models)
2)Conversion Tables (LUT: Look Up Table)
The ICC profiles of the different device categories work with a variety of mathematical models.
Using Matrix Functions and TRCs
For the quite simple conversions between purely additive color systems, ICC profiles use Matrix Functions. These models are sufficiently accurate and produce relatively small color profiles.
The advantage of functions is the compactness of the parameter sets, but unfortunately they are poorly suited to accurately describe complex mapping characteristics such as exist between color models that have different dimensions (RGB > LAB versus CMYK > LAB).
Monitor profiles work mostly with TRCs and 3x3 Matrix Operations.
Using LUTs
For more complex color systems, for example in the case of color transformations involving a change in dimension, and for the description of non-linear color spaces, Tables (LUTs) must be used.
Tables offer unlimited accuracy and an unlimited number of dimensions, but are also unlimited in terms of size.
As a rule, the direction of conversion from Scanner RGB to LAB is described in a scanner profile using three-dimensional LUTs. Normally, a separate table would have to be created for each Rendering Intent, for a total of four tables. However, since the Rendering Intents for Relative and Absolute colorimetric interpretation can be calculated from the same table, only three LUTs are required. In output profiles, both conversion directions are described using LUTs. In this case, three four-dimensional LUTs are created for the direction CMYK to LAB and three three-dimensional LUTs for the direction LAB to CMYK.
so clearly a LUT is more accurate than an ICC :)
and Powerstrip's coder had given me infos on how to input custom LUT's within his app :
LUTs are stored by PowerStrip in the registry, and read/written to SRAM in the GPU. Here's the text format for a Broadcast D65 LUT - it uses 8 bit values because it predates deeper depths, but PowerStrip has been able to read LUTs stored with 16 bit data since the Matrox Parhelia.
So you have here 768 bytes (or words) in the form R[0], R[1]...R[254],R[255],G[0]...G[255],B[0]...B[255].
"Broadcast 2.2 Gamma D65"=hex:00,00,00,01,01,02,02,02,03,03,04,04,04,05,05,06,\
07,08,09,0a,0b,0c,0d,0f,10,11,12,13,14,15,16,17,19,1a,1b,1c,1d,1f,20,21,22,\
23,24,26,27,28,29,2a,2b,2c,2d,2f,30,31,32,33,34,35,36,38,39,3a,3b,3c,3d,3e,\
3f,40,42,43,44,45,46,47,48,49,4a,4b,4c,4d,4f,50,51,52,53,54,55,56,57,58,59,\
5a,5b,5c,5e,5f,60,61,62,63,64,65,66,67,68,69,6a,6b,6c,6d,6e,6f,70,71,72,73,\
74,76,77,78,79,7a,7b,7c,7d,7e,7f,80,81,82,83,84,85,86,87,88,89,8a,8b,8c,8d,\
8e,8f,90,91,92,93,94,95,96,97,98,99,9a,9b,9c,9d,9e,9f,a0,a1,a2,a3,a4,a5,a6,\
a7,a8,a9,aa,ab,ad,ae,af,b0,b1,b2,b3,b4,b5,b6,b7,b8,b9,ba,bb,bc,bd,bd,be,bf,\
c0,c1,c2,c3,c4,c5,c6,c7,c8,c9,ca,cb,cc,cd,ce,cf,d0,d1,d2,d3,d4,d5,d6,d7,d8,\
d9,da,db,dc,dd,de,df,e0,e1,e2,e3,e4,e5,e6,e7,e8,e9,ea,eb,ec,ed,ee,ef,f0,f1,\
f2,f3,f4,f5,f6,f7,f8,f9,fa,fb,fb,fc,fd,fe,ff,00,00,00,01,01,02,02,02,03,03,\
04,04,04,05,05,06,07,08,09,0a,0b,0c,0d,0f,10,11,12,13,14,15,16,18,19,1a,1b,\
1c,1d,1f,20,21,22,23,24,26,27,28,29,2a,2b,2c,2e,2f,30,31,32,33,34,35,37,38,\
39,3a,3b,3c,3d,3e,3f,40,42,43,44,45,46,47,48,49,4a,4b,4c,4e,4f,50,51,52,53,\
54,55,56,57,58,59,5a,5b,5c,5e,5f,60,61,62,63,64,65,66,67,68,69,6a,6b,6c,6d,\
6e,6f,70,71,72,74,75,76,77,78,79,7a,7b,7c,7d,7e,7f,80,81,82,83,84,85,86,87,\
88,89,8a,8b,8c,8d,8e,8f,90,91,92,93,94,95,96,97,98,99,9a,9b,9c,9d,9e,9f,a0,\
a1,a3,a4,a5,a6,a7,a8,a9,aa,ab,ac,ad,ae,af,b0,b1,b2,b3,b4,b5,b6,b7,b8,b9,ba,\
bb,bc,bd,be,bf,c0,c1,c2,c3,c4,c5,c6,c7,c8,c9,ca,ca,cb,cc,cd,ce,cf,d0,d1,d2,\
d3,d4,d5,d6,d7,d8,d9,da,db,dc,dd,de,df,e0,e1,e2,e3,e4,e5,e6,e7,e8,e9,ea,eb,\
ec,ed,ee,ef,f0,f1,f2,f3,f4,f5,f6,f7,f8,f9,fa,fb,fc,fd,fe,ff,ff,00,00,00,01,\
01,02,02,02,03,03,04,04,04,05,05,06,07,08,09,0a,0b,0c,0d,0f,10,11,12,13,14,\
15,16,17,19,1a,1b,1c,1d,1e,20,21,22,23,24,25,27,28,29,2a,2b,2c,2d,2f,30,31,\
32,33,34,35,36,38,39,3a,3b,3c,3d,3e,3f,40,41,43,44,45,46,47,48,49,4a,4b,4c,\
4d,4e,50,51,52,53,54,55,56,57,58,59,5a,5b,5c,5d,5e,60,61,62,63,64,65,66,67,\
68,69,6a,6b,6c,6d,6e,6f,70,71,72,73,74,75,77,78,79,7a,7b,7c,7d,7e,7f,80,81,\
82,83,84,85,86,87,88,89,8a,8b,8c,8d,8e,8f,90,91,92,93,94,95,96,97,98,99,9a,\
9b,9c,9d,9e,9f,a0,a1,a2,a3,a4,a5,a6,a7,a8,a9,aa,ab,ac,ad,ae,af,b0,b1,b2,b3,\
b4,b5,b6,b7,b8,b9,ba,bb,bc,bd,be,bf,c0,c1,c2,c3,c4,c5,c6,c7,c8,c9,ca,cb,cc,\
cd,ce,cf,d0,d1,d2,d3,d4,d5,d6,d7,d8,d9,da,db,dc,dd,de,df,e0,e1,e2,e3,e4,e5,\
e6,e7,e8,e9,ea,eb,ec,ed,ee,ef,f0,f1,f2,f3,f4,f5,f6,f6,f7,f8,f9,fa,fb,fc,fd,\
fe,ff
maybe using your voodoo magic, you could offer a way to output powerstrip compatible LUT's :eek:
he has also told me that you can output 16bit LUT, which will improve the accuracy to 10 bit on ATi cards, dunno about nvidia
If you use 16-bit values, PowerStrip automatically adjusts down to the maximum bit depth of the DAC.
If you use 8-bit values, PowerStrip will make 8 bit correction.
the advantage of doing the conversion through the LUT is that it doesn't use any CPU time and it's compatible with any picture viewer/video renderer(except OVERLAY) w/o any requirement(like PS script compatibility, that only MPC and KMPlayer support at this point)
and the LUT can be ported to other OS such as Linux :)
thanks!
pitch.fr
19th July 2008, 13:53
OK I've read the AVS thread again.
it seems that people were confusing ICC and LUT(just like me, but the other way around)
an ICC only works in compatible applications, Adobe Gamma can force it though.....but only in window mode.
so the best solution is to output 16 bit LUT to powerstrip, which will render it in 10 bit on the Radeon.
the LUT is compatible with any application, except the OVERLAY video renderer.
here's the powerstrip coder's notes about importing 16bit pstrip-compatible LUT's to the registry(for instant use in pstrip in 10 bit) :
Here is the same Broadcast D65 LUT, but with 16 bit values instead of 8 bit.
"Broadcast 2.2 Gamma D65 (16 bit)"=hex:00,00,00,00,00,00,00,01,00,01,00,02,00,\
02,00,02,00,03,00,03,00,04,00,04,00,04,00,05,00,05,00,06,00,07,00,08,00,09,\
00,0a,00,0b,00,0c,00,0d,00,0f,00,10,00,11,00,12,00,13,00,14,00,15,00,16,00,\
17,00,19,00,1a,00,1b,00,1c,00,1d,00,1f,00,20,00,21,00,22,00,23,00,24,00,26,\
00,27,00,28,00,29,00,2a,00,2b,00,2c,00,2d,00,2f,00,30,00,31,00,32,00,33,00,\
34,00,35,00,36,00,38,00,39,00,3a,00,3b,00,3c,00,3d,00,3e,00,3f,00,40,00,42,\
00,43,00,44,00,45,00,46,00,47,00,48,00,49,00,4a,00,4b,00,4c,00,4d,00,4f,00,\
50,00,51,00,52,00,53,00,54,00,55,00,56,00,57,00,58,00,59,00,5a,00,5b,00,5c,\
00,5e,00,5f,00,60,00,61,00,62,00,63,00,64,00,65,00,66,00,67,00,68,00,69,00,\
6a,00,6b,00,6c,00,6d,00,6e,00,6f,00,70,00,71,00,72,00,73,00,74,00,76,00,77,\
00,78,00,79,00,7a,00,7b,00,7c,00,7d,00,7e,00,7f,00,80,00,81,00,82,00,83,00,\
84,00,85,00,86,00,87,00,88,00,89,00,8a,00,8b,00,8c,00,8d,00,8e,00,8f,00,90,\
00,91,00,92,00,93,00,94,00,95,00,96,00,97,00,98,00,99,00,9a,00,9b,00,9c,00,\
9d,00,9e,00,9f,00,a0,00,a1,00,a2,00,a3,00,a4,00,a5,00,a6,00,a7,00,a8,00,a9,\
00,aa,00,ab,00,ad,00,ae,00,af,00,b0,00,b1,00,b2,00,b3,00,b4,00,b5,00,b6,00,\
b7,00,b8,00,b9,00,ba,00,bb,00,bc,00,bd,00,bd,00,be,00,bf,00,c0,00,c1,00,c2,\
00,c3,00,c4,00,c5,00,c6,00,c7,00,c8,00,c9,00,ca,00,cb,00,cc,00,cd,00,ce,00,\
cf,00,d0,00,d1,00,d2,00,d3,00,d4,00,d5,00,d6,00,d7,00,d8,00,d9,00,da,00,db,\
00,dc,00,dd,00,de,00,df,00,e0,00,e1,00,e2,00,e3,00,e4,00,e5,00,e6,00,e7,00,\
e8,00,e9,00,ea,00,eb,00,ec,00,ed,00,ee,00,ef,00,f0,00,f1,00,f2,00,f3,00,f4,\
00,f5,00,f6,00,f7,00,f8,00,f9,00,fa,00,fb,00,fb,00,fc,00,fd,00,fe,00,ff,00,\
00,00,00,00,00,00,01,00,01,00,02,00,02,00,02,00,03,00,03,00,04,00,04,00,04,\
00,05,00,05,00,06,00,07,00,08,00,09,00,0a,00,0b,00,0c,00,0d,00,0f,00,10,00,\
11,00,12,00,13,00,14,00,15,00,16,00,18,00,19,00,1a,00,1b,00,1c,00,1d,00,1f,\
00,20,00,21,00,22,00,23,00,24,00,26,00,27,00,28,00,29,00,2a,00,2b,00,2c,00,\
2e,00,2f,00,30,00,31,00,32,00,33,00,34,00,35,00,37,00,38,00,39,00,3a,00,3b,\
00,3c,00,3d,00,3e,00,3f,00,40,00,42,00,43,00,44,00,45,00,46,00,47,00,48,00,\
49,00,4a,00,4b,00,4c,00,4e,00,4f,00,50,00,51,00,52,00,53,00,54,00,55,00,56,\
00,57,00,58,00,59,00,5a,00,5b,00,5c,00,5e,00,5f,00,60,00,61,00,62,00,63,00,\
64,00,65,00,66,00,67,00,68,00,69,00,6a,00,6b,00,6c,00,6d,00,6e,00,6f,00,70,\
00,71,00,72,00,74,00,75,00,76,00,77,00,78,00,79,00,7a,00,7b,00,7c,00,7d,00,\
7e,00,7f,00,80,00,81,00,82,00,83,00,84,00,85,00,86,00,87,00,88,00,89,00,8a,\
00,8b,00,8c,00,8d,00,8e,00,8f,00,90,00,91,00,92,00,93,00,94,00,95,00,96,00,\
97,00,98,00,99,00,9a,00,9b,00,9c,00,9d,00,9e,00,9f,00,a0,00,a1,00,a3,00,a4,\
00,a5,00,a6,00,a7,00,a8,00,a9,00,aa,00,ab,00,ac,00,ad,00,ae,00,af,00,b0,00,\
b1,00,b2,00,b3,00,b4,00,b5,00,b6,00,b7,00,b8,00,b9,00,ba,00,bb,00,bc,00,bd,\
00,be,00,bf,00,c0,00,c1,00,c2,00,c3,00,c4,00,c5,00,c6,00,c7,00,c8,00,c9,00,\
ca,00,ca,00,cb,00,cc,00,cd,00,ce,00,cf,00,d0,00,d1,00,d2,00,d3,00,d4,00,d5,\
00,d6,00,d7,00,d8,00,d9,00,da,00,db,00,dc,00,dd,00,de,00,df,00,e0,00,e1,00,\
e2,00,e3,00,e4,00,e5,00,e6,00,e7,00,e8,00,e9,00,ea,00,eb,00,ec,00,ed,00,ee,\
00,ef,00,f0,00,f1,00,f2,00,f3,00,f4,00,f5,00,f6,00,f7,00,f8,00,f9,00,fa,00,\
fb,00,fc,00,fd,00,fe,00,ff,00,ff,00,00,00,00,00,00,00,01,00,01,00,02,00,02,\
00,02,00,03,00,03,00,04,00,04,00,04,00,05,00,05,00,06,00,07,00,08,00,09,00,\
0a,00,0b,00,0c,00,0d,00,0f,00,10,00,11,00,12,00,13,00,14,00,15,00,16,00,17,\
00,19,00,1a,00,1b,00,1c,00,1d,00,1e,00,20,00,21,00,22,00,23,00,24,00,25,00,\
27,00,28,00,29,00,2a,00,2b,00,2c,00,2d,00,2f,00,30,00,31,00,32,00,33,00,34,\
00,35,00,36,00,38,00,39,00,3a,00,3b,00,3c,00,3d,00,3e,00,3f,00,40,00,41,00,\
43,00,44,00,45,00,46,00,47,00,48,00,49,00,4a,00,4b,00,4c,00,4d,00,4e,00,50,\
00,51,00,52,00,53,00,54,00,55,00,56,00,57,00,58,00,59,00,5a,00,5b,00,5c,00,\
5d,00,5e,00,60,00,61,00,62,00,63,00,64,00,65,00,66,00,67,00,68,00,69,00,6a,\
00,6b,00,6c,00,6d,00,6e,00,6f,00,70,00,71,00,72,00,73,00,74,00,75,00,77,00,\
78,00,79,00,7a,00,7b,00,7c,00,7d,00,7e,00,7f,00,80,00,81,00,82,00,83,00,84,\
00,85,00,86,00,87,00,88,00,89,00,8a,00,8b,00,8c,00,8d,00,8e,00,8f,00,90,00,\
91,00,92,00,93,00,94,00,95,00,96,00,97,00,98,00,99,00,9a,00,9b,00,9c,00,9d,\
00,9e,00,9f,00,a0,00,a1,00,a2,00,a3,00,a4,00,a5,00,a6,00,a7,00,a8,00,a9,00,\
aa,00,ab,00,ac,00,ad,00,ae,00,af,00,b0,00,b1,00,b2,00,b3,00,b4,00,b5,00,b6,\
00,b7,00,b8,00,b9,00,ba,00,bb,00,bc,00,bd,00,be,00,bf,00,c0,00,c1,00,c2,00,\
c3,00,c4,00,c5,00,c6,00,c7,00,c8,00,c9,00,ca,00,cb,00,cc,00,cd,00,ce,00,cf,\
00,d0,00,d1,00,d2,00,d3,00,d4,00,d5,00,d6,00,d7,00,d8,00,d9,00,da,00,db,00,\
dc,00,dd,00,de,00,df,00,e0,00,e1,00,e2,00,e3,00,e4,00,e5,00,e6,00,e7,00,e8,\
00,e9,00,ea,00,eb,00,ec,00,ed,00,ee,00,ef,00,f0,00,f1,00,f2,00,f3,00,f4,00,\
f5,00,f6,00,f6,00,f7,00,f8,00,f9,00,fa,00,fb,00,fc,00,fd,00,fe,00,ff
As you can see from the very last LUT entry above (0xFF00 for B[255]), when represented as text the MSB is in the 2nd byte of each couplet.
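Going only by the two listings and the notes quoted above, the text layout could be generated along these lines. This is a sketch: the function name is made up, it has not been tested against PowerStrip itself, and it leaves out the registry value name and the backslash line wrapping.

```python
def lut_to_hex(r, g, b, bits=8):
    """Serialise three 256-entry ramps as comma-separated hex bytes (R, then G, then B)."""
    out = []
    for channel in (r, g, b):
        if len(channel) != 256:
            raise ValueError("each channel needs exactly 256 entries")
        for v in channel:
            if bits == 8:
                out.append("%02x" % v)
            else:
                out.append("%02x" % (v & 0xFF))  # low byte first
                out.append("%02x" % (v >> 8))    # most significant byte second
    return "hex:" + ",".join(out)

ramp8 = list(range(256))                # identity ramp, 8-bit
ramp16 = [v << 8 for v in ramp8]        # same ramp with the value in the high byte
print(lut_to_hex(ramp8, ramp8, ramp8)[:40])
print(lut_to_hex(ramp16, ramp16, ramp16, bits=16)[-11:])  # ends with 00,ff (0xFF00)
```

The 16-bit ramp here mirrors the quoted example, where the 8-bit value simply moves into the high byte; a real 16-bit calibration would use the low byte as well.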
pitch.fr
20th July 2008, 20:21
OK I've made some colorimetry measurements with my Eye One Display 2 on my HC3100 as promised.
the black triangle is SMPTE RP 145 in all these graphs, and the ghost gamut is the original one.
I also give the Delta-E and the Distance from the reference in the CIExy space
the projector has been D65 calibrated with a Delta-E <3 before doing the tests(with the official test patterns DVD and Color.HCFR in MPC) :
http://pix.nofrag.com/b/2/6/c48e09b6e79d3305ce11b26fa27edtt.jpg (http://pix.nofrag.com/b/2/6/c48e09b6e79d3305ce11b26fa27ed.html)
that's the original gamut :
http://pix.nofrag.com/2/0/8/2b4fde6c873d93324cb13e538db76tt.jpg (http://pix.nofrag.com/2/0/8/2b4fde6c873d93324cb13e538db76.html)
http://pix.nofrag.com/d/a/7/993d1813bee28cdd09acd5f647378.png
now with the Pixel Shaders script :
http://pix.nofrag.com/4/1/0/6146664a9e7e23f8f006fc5e3875dtt.jpg (http://pix.nofrag.com/4/1/0/6146664a9e7e23f8f006fc5e3875d.html)
http://pix.nofrag.com/6/8/f/de5e32a36e2732028bbe1ab14e1f3.png
now with ddcc(chr_i=3,gam_i=5,ofile="C:\PJ.txt",threads=4,opt=1) :
http://pix.nofrag.com/2/c/c/aea54933ce35849d7bf9774522bf3tt.jpg (http://pix.nofrag.com/2/c/c/aea54933ce35849d7bf9774522bf3.html)
http://pix.nofrag.com/1/2/f/1b9c92223b2fef175a7d7fe61cb10.png
now with ddcc(chr_i=3,gam_i=2,ofile="C:\PJ.txt",threads=4,opt=1) :
http://pix.nofrag.com/a/f/a/7b96e6cba9e016e1c02893d743e26tt.jpg (http://pix.nofrag.com/a/f/a/7b96e6cba9e016e1c02893d743e26.html)
http://pix.nofrag.com/f/3/4/941c1cae7095aa14a0fed85b9d41d.png
PJ.txt contained :
0.151
0.068
0.782
0.338
0.611
0.052
0.653
0.330
0.002
0.312
0.328
0.360
1.0
0.0
0.45
0.0
0.0
it's really impressive how close the PS script managed to get to the refs in the CIExy space....I'm speechless :eek:
maybe some brush-up of the DDCC maths and/or the text file could improve the results ?
together with a way to output 16 bit Powerstrip-compatible LUT's please ? :D
TIA,
pitch.fr
22nd July 2008, 00:07
OK forget it, this can't be done through the LUT :
> I'm afraid there is a misunderstanding. The VCGT LUTs which can be
> loaded into the graphics card with ARGYLL are just one-dimensional
> LUTs. Such LUTs do not allow arbitrary color transformation, but only
> VERY LIMITED ones (like e.g. adjusting the gamma of the RGB channels).
> For the kind of color transformation you desire, you would need a
> three-dimensional LUT. I'm not aware of graphics cards which support
> that (except in conjunction with a pixel shader program), but I also
> don't want to rule out that such graphics cards might exist.
tritical, any idea what I could do to get the same accurate results that the PS script is giving ?
thanks,
yesgrey
22nd July 2008, 01:23
Now I understand what a LUT is. You can change it through the color calibration in the graphics card drivers; this was debated in the AVS Forum. The problem is that each component only depends on itself, and for this kind of correction each component should depend on all three components.
Example:
LUT
Rd = f(Rs)
Gd = f(Gs)
Bd = f(Bs)
What we need:
Rd = f(Rs,Gs,Bs)
Gd = f(Rs,Gs,Bs)
Bd = f(Rs,Gs,Bs)
It's the 3D LUT table you referred to...
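A tiny Python sketch of the difference, with made-up helper names and an illustrative matrix. For simplicity the matrix is applied directly to 8-bit values here, skipping the gamma steps a real correction needs:

```python
def apply_1d_luts(rgb, lut_r, lut_g, lut_b):
    """Per-channel LUT: each output component sees only its own input."""
    return (lut_r[rgb[0]], lut_g[rgb[1]], lut_b[rgb[2]])

def apply_matrix(rgb, m):
    """3x3 matrix: each output component depends on all three inputs."""
    return tuple(
        min(max(round(sum(k * c for k, c in zip(row, rgb))), 0), 255) for row in m
    )

identity_lut = list(range(256))
M = [                       # illustrative coefficients only
    [1.10, -0.13, 0.03],
    [0.03, 0.97, 0.00],
    [0.01, 0.05, 0.94],
]

black, red = (0, 0, 0), (255, 0, 0)
# Both colours have Gs = 0, so any 1D LUT must give them the same Gd ...
print(apply_1d_luts(black, identity_lut, identity_lut, identity_lut)[1],
      apply_1d_luts(red, identity_lut, identity_lut, identity_lut)[1])
# ... while the matrix gives red a small green component and black none.
print(apply_matrix(black, M)[1], apply_matrix(red, M)[1])
```

No choice of the three 1D tables can reproduce the second line, because Gd would have to differ for two inputs that share the same Gs.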
There is also a very simple and fast solution for your problem... Haali uses Pixel Shaders in his renderer for the YUV->RGB conversion, you could ask him to add this PS code to his renderer, in which we just need to specify our conversion matrix.
In fact, I already tried that some time ago without any luck. Give it a try; maybe if more people start requesting this from him, he will feel the need to do it.;)
tritical
22nd July 2008, 01:42
Your measurements aren't telling the whole story here for a number of reasons. First, you are ignoring white point differences between source/destination, and chromatic adaptation. As I said before, the only difference between the PS script and ddcc (assuming ddcc is set up to use the same gamma function assumed by the ps script, etc...) is that ddcc performs chromatic adaptation when the white points between the source and destination differ. Chromatic adaptation, with regards to the HVS, refers to the ability of the HVS to make an object appear the same (color wise) under a wide range of illuminants. What this means is that to a person different points in the CIE XYZ colorspace will appear to be the same color when viewed under different illuminants. When it is used in regards to colorspace conversions (specifically those based on CIE XYZ where the white point differs) it refers to trying to alter the XYZ tristimulus values such that to a person the new color under the new illuminant would look the same as the original color under the original illuminant. There are three main methods for doing chromatic adaptation: XYZ scaling, Von Kries, and Bradford. ddcc uses Bradford. To see how this affects the coefficients in the BGR->CIE->BGR transform, I had ddcc print them out. The first set shows the coefficients calculated when using the ddcc() line you specified. Each line after that modifies your white point to be progressively closer to the D65 illuminant, such that in the last case no chromatic adaptation is needed (in this case the PS script's coefficients match ddcc's to 7 or more decimal places).
0.312713,0.329016,0.358271 -> 0.312,0.328,0.360
1.109563,-0.129243,0.016116
0.028285,0.972018,-0.001302
0.008913,0.054374,0.938853
0.312713,0.329016,0.358271 -> 0.313,0.329,0.358
1.105802,-0.123381,0.019919
0.028030,0.971643,-0.000281
0.008032,0.052882,0.938751
0.312713,0.329016,0.358271 -> 0.3127,0.3290,0.3583
1.108651,-0.127532,0.018811
0.028307,0.971852,-0.000172
0.008176,0.053085,0.938775
0.312713,0.329016,0.358271 -> 0.312713,0.329016,0.358271
1.108620,-0.127480,0.018860
0.028306,0.971848,-0.000154
0.008164,0.053063,0.938774
While the coefficients aren't altered all that much (since the white point you specified was already pretty close to D65), it is definitely enough to account for the minor differences you measured between the PS script and ddcc.
Another thing to remember is that the assumed gamma function (in this case x^2.2222) is not an exact match to any real world device (let alone an exact match to both the source/destination). This will introduce error into the calculations.
Basically, while measurements are nice, the only way to truly know if the colors are accurate would be to see the original source on the original display device, and see if to you it looks the same as what you are seeing on your display device. Unfortunately this is unlikely to ever happen.
Anyways, to get back to the point, there isn't anything wrong with the math in ddcc.
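For anyone who wants to see what the Bradford step looks like in isolation, here is a self-contained Python sketch. It is not ddcc's source: the helper names are made up, and it only builds the XYZ-to-XYZ adaptation matrix between two white points, not the full BGR->CIE->BGR transform printed above. The Bradford cone-response matrix uses the commonly published values.

```python
# Bradford cone-response matrix (commonly published values).
BRADFORD = [
    [0.8951, 0.2664, -0.1614],
    [-0.7502, 1.7135, 0.0367],
    [0.0389, -0.0685, 1.0296],
]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def inv3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def white_xyz(x, y):
    """Chromaticity (x, y) to XYZ, normalised so that Y = 1."""
    return [x / y, 1.0, (1.0 - x - y) / y]

def bradford_adaptation(src_xy, dst_xy):
    """3x3 matrix adapting XYZ values from the source white to the destination white."""
    src = matvec(BRADFORD, white_xyz(*src_xy))
    dst = matvec(BRADFORD, white_xyz(*dst_xy))
    # Scale each cone response by the ratio of destination to source white.
    scale = [[dst[i] / src[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
    return matmul(inv3(BRADFORD), matmul(scale, BRADFORD))

D65 = (0.312713, 0.329016)   # source white used in the listings above
PJ_WHITE = (0.312, 0.328)    # white point from the projector text file
for row in bradford_adaptation(D65, PJ_WHITE):
    print(["%.6f" % v for v in row])
```

When both white points are equal the result is the identity matrix, which matches the last coefficient set above, where no adaptation is applied.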
pitch.fr
22nd July 2008, 02:11
@yesgrey3 : this is not going to happen. Haali won't support PS scripts in his renderer, he clearly said it several times in his thread.
you can only achieve 3D LUT's in ICC v4, but this only works in windowed mode with HR, so it's useless to us.
@tritical : wow, the mad scientist is at work again :D
a friend of mine was explaining to me that the CIE chart doesn't mean much because CIExy is a 3D space.....from the angle we look at the CIE chart from, there's an unlimited number of heights.
the PS script prefers to get closer to the source, even if the Delta-E gets worse.
your explanation is going a bit over my head, I will need to read it several times again :D
so there's no way to measure what DDCC does ?
it doesn't clip colors like the PS script does, it's smarter than that then ? :eek:
for SMPTE-C/EBU conversion on an HDTV'ish display, the DDCC settings and the text file settings seem "optimal" to you ? is it better to go gam_i=2 or 5 ?
Thanks,
yesgrey
22nd July 2008, 02:46
this is not going to happen. Haali won't support PS scripts in his renderer, he clearly said it several times in his thread
But he doesn't need to support PS scripts, he could just add this correction to his renderer, which should be a very simple task...
The other solution would be for tritical to write ddcc with some kind of GPU support. PS would be better because it will work with all GPUs. CUDA would be easier for him, but will only work with NVidia. I vote for the latter. (I have an nvidia card :D)
yesgrey
22nd July 2008, 02:59
tritical,
Where did you get your D65 white point coordinates? (0.312713,0.329016,0.358271)
In Rec. ITU-R BT.709-5 the coordinates are: (0.3127,0.3290,0.3583)
In the beginning I also used yours, but apparently they are wrong...
tritical
22nd July 2008, 04:25
@yesgrey3
If you go through all the calculations based on how D65 is defined, 0.312713,0.329016,0.358271 are closer to the true values. BT.709 just rounds to 4 decimal places. I guess the BT.709 chromaticity coordinates definition in ddcc should be changed to the truncated values, but I don't think the difference is enough to matter (i.e. release a new version just for that). Especially if you are simply going to use pow(x,0.45) as the transfer function, when you consider that BT.709 actually defines the transfer function as:
x = 4.5*C                            if C < 0.018
x = (1.0+0.099)*pow(C,0.45)-0.099    otherwise
Again, I think most of these differences are too small to worry about.
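As a quick way to compare the two, here is a small Python sketch of the piecewise BT.709 function quoted above next to the plain pow(x,0.45) approximation. The function names are just for illustration; the constants are the ones given in the post.

```python
def bt709_oetf(c):
    """Piecewise BT.709 transfer function as quoted above (linear c in [0, 1])."""
    if c < 0.018:
        return 4.5 * c                           # linear segment near black
    return (1.0 + 0.099) * c ** 0.45 - 0.099     # offset power segment

def pure_power(c):
    """The simplified pow(x, 0.45) transfer function."""
    return c ** 0.45

# Print both side by side; the two diverge most in the darkest values.
for c in (0.0, 0.01, 0.018, 0.1, 0.5, 1.0):
    print("%5.3f  piecewise=%.4f  pow=%.4f" % (c, bt709_oetf(c), pure_power(c)))
```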
@pitch.fr
Chromatic adaptation tries to adjust for changes in illuminant (between the source and destination) such that the image after conversion will appear to a person to look as much like the original as possible. Basically, CATs (chromatic adaptation transforms) are adjusting for a phenomenon of the HVS, and like psychovisual methods in codecs, CATs are primarily based on human viewer tests. Using a CAT means that the XYZ values of each pixel will be different than if you just did a straight conversion while ignoring the CA phenomenon. Most major photo/image editors like Photoshop use CATs during conversions, so I assume it is desirable (Photoshop uses the Bradford method).
On the gam_i=2 or 5 question, I'm not sure. SMPTE-C only defines chromaticity coordinates, not a transfer function.
pitch.fr
22nd July 2008, 04:47
But he doesn't need to support PS scripts, he could just add this correction to his renderer, which should be a very simple task...
The other solution would be for tritical to write ddcc with some kind of GPU support. PS would be better because it will work with all GPUs. CUDA would be easier for him, but will only work with NVidia. I vote for the latter. (I have an nvidia card :D)
oh yeah GPU support would be awesome, but no proprietary stuff please.
nvidia cards don't work with Powerstrip, so no spot-on 48Hz for you.....and their PQ, contrast- and sharpness-wise, is nothing like ATi's IMHO.
and what happens to the off-gamut green with the PS script in my case ? it clips to the most saturated green I can get ? strangely enough, I haven't seen any banding
On the gam_i=2 or 5 question, I'm not sure. SMPTE-C only defines chromaticity coordinates, not a transfer function.
well SMPTE-C is supposed to be watched in the SMPTE RP 145 gamut I think, so how does that look ? and what about EBU Tech. 3213 for EBU ?
gam_i=5 seems to offer more contrast, because the low-IRE gamma with gam_i=2 seems too bright.
what about the last 5 figures of the text file, do they look "optimal" to you ?
and does the DDCC algorithm offer improvements over the PS script ? or do they basically go different ways but achieve the same end result ? like off-gamut colors for instance ?
I'm getting a new CPU on thursday, hopefully a 4GHz C2D will help with real time use in ffdshow and HD content :D
yesgrey
22nd July 2008, 11:03
I guess the BT.709 chromaticity coordinates definition in ddcc should be changed to the truncated values, but I don't think the difference is enough to matter (i.e. release a new version just for that).
I agree. I did not mention the chromaticity coordinate difference to explain what is happening; I know the difference is too small, I was just noting it.
nvidia cards don't work with Powerstrip, so no spot-on 48Hz for you...
and what happens to the off-gamut green with the PS script in my case ? it clips to the most saturated green I can get ? strangely enough, I haven't seen any banding
I have spot-on 47.952Hz with my geforce 8600gt. See my thread in the powerstrip forum on how to achieve it. In fact, I also have spot-on 50.000Hz and 59.940Hz. I wrote a little program to calculate the resolutions to achieve this. Maybe when I get the time I will create a thread about it. I have to test the new powerstrip functionality to see if it overrides my little program...
Yes, your green and all the other off-gamut colors are clipped to the nearest most saturated colors you could get.
Remember that a movie is not full of highly saturated colors, and our eyes' color perception is also non-linear, so don't worry about that and simply enjoy your movies!:)
For me, the bigger concern would be the possibility of banding inside the color gamut when converting from one to another, but I have never noticed that either.
One of these days I will create a few test images to see if we could really notice the banding...;)
pitch.fr
22nd July 2008, 11:15
I can't find any posts from yesgrey3 on the pstrip forum, do you have the link please ?
well quite frankly I've tried 2 8600GT(MSI & PNY), they looked really bad compared to the Sapphire 2400Pro/2600XT on both my CRT and my HC3100 in DVI.
like some ugly EE even on the windows desktop, and very bad contrast.
many friends have also switched after I told them, IMO the ATI are far ahead as far as PQ is concerned.
a friend of mine tried the new "auto search" feature of pstrip on a 9600GT, it didn't work at all!
I personally prefer 48.000Hz to get the real 24fps cinema speed, I find 47.952 too slow :D
yesgrey
22nd July 2008, 15:25
Search for yesgrey33. I lost the password of the yesgrey3...
I prefer 47.952. If you set an exact refresh in your graphic card you don't need to resample the audio with reclock. I currently only use it to output bit perfect with my RME Fireface. For PAL movies, the pitch correction could be done lowering the audio card clock, without any resampling...
pitch.fr
22nd July 2008, 15:32
well I don't see any tool ?!
Use powerstrip timings with GeForce 8 custom resolutions
yesgrey33 6 5125 Fri Mar 21, 2008 3:58 am
Custom resolution feature request
yesgrey33 10 2570 Fri Sep 28, 2007 9:33 pm
well I don't mind resampling, I'm outputting analog anyway.
and pstrip has that new feature that does automatic searches until the graphic card says "48.0000000000000000000 Hz"
so together with Reclock set to "24fps", it's a winner....very low jitter in HR 8)
yesgrey
22nd July 2008, 17:28
I haven't posted the tool yet, and I believe the new pstrip function works very similar to mine, so maybe I will never post it.
When you resample the sound its quality will be worse than without resampling... even when you use the Excellent quality setting, and that one is very slow... and if you don't resample, you will have more cpu power for processing the color correction.
pitch.fr
22nd July 2008, 17:32
the auto-search feature doesn't work at all on the 9600GT.
a friend of mine was searching for 48.000000 but all he got was 59Hz :D
maybe he could try your app ?
yes you have a point, if Reclock doesn't suck 20% of my CPU time to do "Excellent" resampling, this can be put to good use with DDCC :)
I'll see how it goes with my new E7200 tomorrow :D
pitch.fr
22nd July 2008, 20:33
to get back on topic, would it be possible to fix the saturations of the display upfront in DDCC ?
that's a chart of my HC3100 saturations :
http://pix.nofrag.com/0/2/8/a990a92dce9da10f80a7f76c10e18.png
ideally, they should all be on the 0% line and at worst ±5%
you can see that at 75% the green primary is +12% and yellow +10%, which has a very bad effect on people's faces.
and cyan is falling down, which increases the red saturation.
very few displays let you set each color's saturation(the Sanyo pj are very good at that).
could you set an offset or something in DDCC for the primaries/secondaries saturations ? because it's a major point not to be missed to get accurate colors on the whole IRE scale.
and a friend of mine is asking again if the luminance is taken in account ? considering we've done a TV>PC levels conversion.
PS : here's another thread that says that HD is mastered with REC.709 matrix and SMPTE-C primaries :
http://www.avsforum.com/avs-vb/showthread.php?p=14072175#post14072175
actually there's a very good reason why all the US telecine studios use this exact SONY 20" 4:3 CRT :
http://catalogs.infocommiq.com/avcat/CTL901/index.cfm?mlc_id=203&mrc_id=901&prodid=398485
that's because it's built like a tank(2.5 gamma like all the broadcast equipment), and it already has less dynamic range than a telecine, but LCD and plasma can't match its native contrast and also have other major drawbacks...
I just don't understand why they don't convert gamut from SMPTE to HDTV when they do BT.601>BT.709 transcoding :o
pitch.fr
25th July 2008, 03:12
I think I've killed tritical with all my questions :eek: :D
anyhow, I just got my new CPU, and it lets me run DDCC in real time with all my audio enhancements and Haali's Renderer.....this is awesome :D
http://pix.nofrag.com/0/9/9/f76a843109dbd01e7bec409131558tt.jpg (http://pix.nofrag.com/0/9/9/f76a843109dbd01e7bec409131558.html)
what do you guys think of this TV>PC levels AVS script ?
http://forum.doom9.org/showthread.php?t=137479
it basically does a TV>PC levels conversion doing chroma smoothing, to avoid banding, which is very pleasing to the eye....
but OTOH it's messing with the gamma curve, which increases the contrast :)
here's a comparison(both get through my pj SMPTE-C PS script), top is with Ulevels(), bottom is with ffdshow at the RGB32 conversion :
http://thumbnails8.imagebam.com/991/b61b119905830.gif (http://www.imagebam.com/image/b61b119905830)http://thumbnails8.imagebam.com/991/9b2ef49905832.gif (http://www.imagebam.com/image/9b2ef49905832)http://thumbnails8.imagebam.com/991/3b5c889905834.gif (http://www.imagebam.com/image/3b5c889905834)
http://thumbnails8.imagebam.com/991/f1ffe19905831.gif (http://www.imagebam.com/image/f1ffe19905831)http://thumbnails8.imagebam.com/991/0b3b449905833.gif (http://www.imagebam.com/image/0b3b449905833)http://thumbnails8.imagebam.com/991/fa99c29905835.gif (http://www.imagebam.com/image/fa99c29905835)
there's an ongoing war on AVS around SMPTE-C reds being way too orangey, and I have to agree.
basically the CARS hero will never look as red in the movie as he is on the cover....so using this thing kinda fixes this problem :cool:
here's what I'm running in ffdshow at this point :
MT("LimitedSharpenFaster(ss_x=1.0,ss_y=1.0,strength=40)",4)
MT("""ULevels(preset="tv2pc")""",4)
MT("""ConvertToRGB32(matrix="PC.709")""",4)
ddcc(chr_i=3,gam_i=5,ofile="C:\PJ.txt",threads=4,opt=1)
pitch.fr
9th August 2008, 01:25
just wanted to say that DDCC works perfectly fine in real time in ffdshow with any 720p/2.35 1080p content on a G0 Q6600 o/c to 3.4Ghz :)
it's too slow for 1080p 1.78/1.85 movies, though(even when downscaled to 720p).......ffdshow is wasting too many CPU cycles on 15/20mbit h264
but hopefully the h264 Remoulade decoder will solve this issue, because it's got much less latency than ffdshow and is far more MT optimized :eek:
tritical
20th December 2008, 06:44
While working on a project requiring rgb->Lab/Luv/Lch conversions, for which I stole code from ddcc, I discovered a bug in ddcc's chromatic adaptation code (it's used when the white points of the source and destination differ). Specifically, it should have been multiplying by the transpose of a matrix when it was just using the matrix straight away. Fortunately, the errors caused by this were small. Anyways, I put up a fixed v1.3 at the usual location (http://bengal.missouri.edu/~kes25c/ddcc.zip). Also, I still don't have any plans to port this to the GPU.
yesgrey
24th December 2008, 02:40
tritical,
The correction matrix is always the same for each display. So, why not make a very simple thing like this:
Take this filter and divide it in two parts:
1-One small software tool that will create files with 3D LUTs for all the display modes we need.
2-The Avisynth filter then only has to load that file and output the correct RGB values by looking them up in the 3D LUTs. This will not use lots of memory and it will speed up the whole process. It's not very efficient to calculate each RGB value one by one, since for a given RGB input the corrected value will always be the same. This way, if we have the 3D LUTs in a file format, we could simply ask the ffdshow developers to add this to ffdshow, which would be very simple: just load the 3D LUTs and apply them after the YUV->RGB conversion, without any increase in CPU load, just some memory usage.
From my calculations, it would be: 3 3D LUTs: 3x256^3 = 48MB
For the current memory available in a PC, it's an insignificant amount...
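The 48MB figure can be checked with a few lines of Python. This is only the arithmetic from the line above, assuming 8 bits per channel on input and one output byte per channel.

```python
LEVELS = 256      # 8 bits per input channel
CHANNELS = 3      # one output byte each for R, G and B

entries = LEVELS ** 3                    # every possible input (r, g, b) triple
size_bytes = entries * CHANNELS          # total table size in bytes
size_mib = size_bytes / (1024 * 1024)    # same size in binary megabytes

print(entries, size_bytes, size_mib)
```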
What do you think?
Porting to the GPU is completely useless, I agree with you. This is a one-time calculation, so it could all be written in C; after that it's simply a matter of using the 3D LUTs...
tritical
24th December 2008, 11:40
yesgrey3, when I read your post my first thought was that accessing a 48MB LUT would be really slow on a typical desktop comp. Then I got curious about how slow it would be... Turns out it wasn't that slow at all (faster than the sse3 code path of ddcc on the dual core comp I'm on right now).
So I added a parameter to ddcc to output a 3D LUT ('lutfile'). The table is stored in binary format. The offset into the table in bytes is calculated as ((g<<16)+(b<<8)+r)*3, and at that location the new values are stored in b,g,r order (one byte each). I chose that arrangement because it made the assembly implementation fastest/easiest. I also added a new function to ddcc.dll called 'rgb3d'. It takes a LUT of the format I just described as input, and performs the 3D LUT operation. 'rgb3d' requires rgb24 input. The new dll is at the same place as before.
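To make the table layout concrete, here is a rough Python sketch of the offset calculation and lookup described above. Only the offset formula and the b,g,r storage order come from the post; the function names and the sample values are hypothetical, and this is not the actual ddcc/rgb3d code.

```python
def lut_offset(r, g, b):
    """Byte offset of the entry for input (r, g, b), per the formula above."""
    return ((g << 16) + (b << 8) + r) * 3

def lut_lookup(lut, r, g, b):
    """Return the corrected (r, g, b); entries are stored in b, g, r order."""
    off = lut_offset(r, g, b)
    new_b, new_g, new_r = lut[off], lut[off + 1], lut[off + 2]
    return new_r, new_g, new_b

# A zero-filled table of the full 3*256^3 size, with one made-up entry set:
# pure red (255, 0, 0) is mapped to a slightly desaturated red.
lut = bytearray(3 * 256 ** 3)
off = lut_offset(255, 0, 0)
lut[off:off + 3] = bytes((10, 20, 240))      # written as b, g, r

print(lut_lookup(lut, 255, 0, 0))
```

A real file written through the 'lutfile' parameter would presumably be read into the same kind of byte buffer, but that step is not shown here.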
yesgrey
24th December 2008, 11:53
tritical,
Thank you very much! It was a good Christmas gift!:)
I will take a look and test it and will let you know...
:thanks:
leeperry
24th December 2008, 12:50
goddamn, you know you're just a bunch of mad scientists right :D
ConvertToRGB32(matrix="rec709")
= 302 fps
ConvertToRGB32(matrix="rec709").AviShader("C:\effect.fx", "ColorCorrection")
= 86 fps
(using this script : http://www.avsforum.com/avs-vb/showthread.php?t=912720 )
ConvertToRGB32(matrix="rec709")
ddcc(chr_i=3,gam_i=2,ofile="C:\coeffs.txt",threads=4,opt=1)
= 96 fps
ConvertToRGB24(matrix="rec709")
= 283 fps
ConvertToRGB24(matrix="rec709")
rgb3d(lutfile="C:\lut.txt")
= 220 fps
ConvertToRGB24(matrix="rec709")
ConvertToRGB32()
= 196 fps
ConvertToRGB24(matrix="rec709")
rgb3d(lutfile="C:\lut.txt")
ConvertToRGB32()
= 170 fps
anyway me likes it A LOT, thanks a lot fellas & merry xmas http://forum-images.hardware.fr/images/perso/astrid72.gif
if it could work natively/output in RGB32 & support RAR format for the lut file(so I can copy them on a ramdisk), that would be great too...but it's already awesome as it is!
yesgrey
24th December 2008, 14:38
tritical,
It's great as it is, but I would like to make a suggestion... I think the name rgb3dlut would be a little more suggestive...
leeperry,
I was worried not to have heard anything from you yet...;-)
leeperry
24th December 2008, 14:49
I was worried not to have heard anything from you yet...;-)
well when it comes to Reclock/ffdshow/KMP/HR or 3D LUT's, I'm all ears :p
besides, I think it's good to have real world benchmarks to see how efficient the new code is :o
PS: the lut stuff seems to be more accurate than the realtime SSE3 mode...or at least the PNG's are slightly bigger than both the SSE3 mode & the PS script.
tritical
24th December 2008, 21:06
I changed the name to rgb3dlut, and added rgb32 support.
leeperry
24th December 2008, 21:19
awesome, thanks a bunch tritical!
ConvertToRGB32(matrix="rec709")
rgb3dlut(lutfile="C:\lut.txt")
= 216 fps
I've RAR'ed up the LUT files(1 mb a pop) on a ramdisk and I unrar them through a rar.exe batch, so everything's cool...damn I'm drunk, gonna get some more Champagne :D
yesgrey
25th December 2008, 14:55
PS: the lut stuff seems to be more accurate than the realtime SSE3 mode...
Read the readme.txt file.
Since the calculations don't need to be done in real time, no simplifications are used, so the values in the 3D LUT are as accurate as they can be!
yesgrey
25th December 2008, 15:02
tritical,
Thanks a lot for your work, and also for your curiosity in knowing how slow my suggestion would be...;)
:thanks:
leeperry
25th December 2008, 15:04
Read the readme.txt file.
Since the calculations don't need to be done in real time, no simplifications are used, so the values in the 3D LUT are as accurate as they can be!
right, I remembered I read something about that, but I couldn't find it in this thread :rolleyes:
yesgrey
25th December 2008, 22:10
tritical,
Why do you require all three chromaticity coordinates in the file? Isn't the rule x+y+z=1 always valid? Usually, colorimeters only give us the x,y coordinates, so we have to perform the calculation above to get the z coordinate...
Another question: in the readme, you don't say what the gamma is for SMPTE-C and EBU, only the chromaticity coordinates...
If you release a new version to remove the z coordinates, also take a look at this, from your post #54; maybe you still want to change it, even if it's not needed...
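For reference, filling in the missing z from a colorimeter's x,y reading is a one-liner. A small Python sketch, using the BT.709 white point values quoted earlier in the thread (the function name is just for illustration):

```python
def z_from_xy(x, y):
    """Third chromaticity coordinate, from the rule x + y + z = 1."""
    return 1.0 - x - y

# BT.709 white point as quoted earlier in the thread.
x, y = 0.3127, 0.3290
print("%.4f" % z_from_xy(x, y))
```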
If you go through all the calculations based on how D65 is defined, 0.312713,0.329016,0.358271 are closer to the true values. BT.709 just rounds to 4 decimal places. I guess the BT.709 chromaticity coordinates definition in ddcc should be changed to the truncated values, but I don't think the difference is enough to matter (i.e. release a new version just for that).
leeperry
26th December 2008, 02:21
colors look definitely even more true to life with that LUT stuff, quite a blast to watch BARAKA in BD in its native gamut this accurately :eek:
I tried gam_i=2,gam_o=5 as yesgrey suggested, but then the gamma curves were not dark enough...gam_i=5,gam_o=5 looks identical to the PS script gamma-wise and really good on my ±2.2 calibrated display :cool:
tritical
27th December 2008, 02:20
I require z in the files just because that's how I wrote it. Calculating z isn't that much extra work :p. I don't list gamma functions for SMPTE-C because it doesn't specify a specific gamma function. AFAIK, with SMPTE-C you should use NTSC gamma (BT.470-2 System M, 2.2 gamma) for NTSC stuff and PAL gamma (BT.470-2 System B,G, 2.8 gamma) for PAL stuff. EBU should usually use BT.470-2 System B,G gamma AFAIK. On the truncating of white point coefficients for BT.709, I'm not sure if the standard actually states truncated values or simply says the white point is D65 illuminant (in which case more accurate values would be fine). Either way, the difference wouldn't be noticeable. I might change it in the next version.
One thing I definitely want to change is the way out-of-gamut colors are handled when using lut creation. Right now, each channel (R,G,B) is simply capped to [0,1] (before gamma is applied). That method is used for speed. A better way would be to adjust all channels together. In the case of values < 0, subtract min(r,g,b,0) from all channels (i.e. add white). In the case of values > 1, scale all channels using the same value such that the maximum channel is at 1.
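The two approaches can be put side by side in a short Python sketch. This is a guess at how the proposed method might look, working on linear RGB before gamma; applying the shift first and the scale second is an assumption, and the function names are made up.

```python
def clip_per_channel(rgb):
    """Current behaviour as described: cap each channel to [0, 1] independently."""
    return tuple(min(1.0, max(0.0, c)) for c in rgb)

def fit_jointly(rgb):
    """Proposed behaviour: adjust all channels together.

    Negative values: subtract min(r, g, b, 0) from every channel (adds white).
    Values above 1: scale every channel by the same factor so the max is 1.
    """
    lo = min(min(rgb), 0.0)
    shifted = [c - lo for c in rgb]
    hi = max(shifted)
    if hi > 1.0:
        shifted = [c / hi for c in shifted]
    return tuple(shifted)

for rgb in [(-0.1, 0.5, 0.2), (1.2, 0.6, 0.3), (0.2, 0.4, 0.6)]:
    print(rgb, clip_per_channel(rgb), fit_jointly(rgb))
```

Per-channel clipping changes the ratio between channels (and so the hue), while the joint version keeps the channel differences or ratios intact at the cost of some saturation or brightness.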
leeperry
27th December 2008, 10:50
..that still wouldn't show off-gamut colors I guess ? :D
actually that'd be great to have an option to set off-gamut colors to blink or sumthing(for troubleshooting purposes), because from what yesgrey3 said the most saturated tints are hardly ever used...and I've never witnessed banding myself.
it would also be great if you could import a .chc file from Color.HCFR and use the primaries/secondaries saturations data to counterbalance it in the LUT :eek:
I was discussing it with JohnAd here, he said it would be possible :
http://www.avsforum.com/avs-vb/showthread.php?p=14594838#post14594838
:thanks:
yesgrey
27th December 2008, 13:02
Either way, the difference wouldn't be noticeable.
I agree with that, that's why I think it's preferable to use the truncated values. The standard says D65 and gives the x,y coordinates as 0.3127, 0.3290. I agree that the other values are more accurate, but using the truncated values avoids this question being repeated over and over... do as you prefer, for me it's not an issue.:)
If SMPTE-C and EBU do not specify any gamma curve, should we use option 5 but using the gamma values you suggested?
I think it would be a good idea to include in the readme that explanation about the gamma curves for SMPTE-C and EBU.
yesgrey
27th December 2008, 13:14
actually that'd be great to have an option to set off-gamut colors to blink or sumthing(for debug purposes), because from what yesgrey3 said the most saturated tints are hardly ever used...and I've never witnessed banding myself.
With current displays, off-gamut colors are almost not an issue, because their gamuts exceed the standard gamuts; maybe only ntsc could have some problems with the reds...
The in-gamut colors would be more problematic. Because the PC only works with 8 bits per component, the same as the video sources, some source colors will be shown as the same color after the gamut conversion (shrinking).
If we had 10 bit per component color depth in our PCs... some graphics cards already support it, but Windows doesn't.:(
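A very simplified way to see the effect, in Python: scale the 256 possible 8-bit codes of one channel by a gain below 1 and count how many distinct output codes survive. A real gamut conversion is a 3x3 matrix with gamma on both sides, so this 1D gain is only a stand-in for the "shrinking", and the gain values are arbitrary.

```python
def distinct_codes(gain):
    """Number of distinct 8-bit output codes after scaling all 256 inputs."""
    return len({int(v * gain + 0.5) for v in range(256)})

# With a gain below 1, several input codes land on the same output code.
for gain in (1.0, 0.95, 0.9):
    print(gain, distinct_codes(gain))
```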
tritical
27th December 2008, 20:42
@yesgrey3
For SMPTE-C I would use gam_i/o = 3 for NTSC and 4 for PAL, which is the same as using 5 but with 2.2 or 2.8 gamma respectively. I will add that to the readme.
@leeperry
it would also be great if you could import a .chc file from Color.HCFR and use the primaries/secondaries saturations data to counterbalance it in the LUT
In the saturation error graph you posted in the avsforum thread, how is saturation being computed?... From Luv colorspace or XYZ or HSV? How are those % errors computed? Is it computing the saturation of the desired color, the saturation of the color measured by the probe, and then computing the relative error? What values are in the .chc file (I can't download the file you posted since I don't have an account there). Given those values, how do you propose to modify the calculation?
leeperry
27th December 2008, 21:12
@leeperry
In the saturation error graph you posted in the avsforum thread, how is saturation being computed?... From Luv colorspace or XYZ or HSV? How are those % errors computed? Is it computing the saturation of the desired color, the saturation of the color measured by the probe, and then computing the relative error? What values are in the .chc file (I can't download the file you posted since I don't have an account there). Given those values, how do you propose to modify the calculation?
apparently JohnAd could answer you on that, as he managed to import the .chc data and try to fix it through the PS script.
but because it was a very simple 3D LUT in the PS script, it wasn't quite possible to fix it entirely.
with your LUT stuff, you should be able to take it in account in the conversion computing I would guess ?
all I can say right now is that there is an option in Color.HCFR to choose whether you want the saturations to be computed from the reference gamut or from the display native gamut.
I will ask your questions to one of the Color.HCFR coders, and I'll get back to you.
I've uploaded the CHC file here BTW :
http://www.yousendit.com/download/TTZuUWVuTWN6RStGa1E9PQ
you need Color.HCFR 2.01 :
http://www.homecinema-fr.com/colorimetre/release/Setup_v2_0_1.exe
For SMPTE-C I would use gam_i/o = 3 for NTSC and 4 for PAL, which is the same as using 5 but with 2.2 or 2.8 gamma respectively. I will add that to the readme.
well SMPTE-C is used on US/ASIAN DVD/BD, and EBU on EUR BD/DVD...you're saying we should use 3/3 for SMPTE-C stuff ? what about EBU stuff ?
what if you have values at the end of the ofile and still specify gam_o in the AVS call, is gam_o prefered ?
I've tried to put this at the end of my ofile(same values as gam_o=3) but it gave much darker gamma than gam_o=5 :
1
0.0
1.0/2.2
0.0
0.0
this is 0.45454545~ where gam_o=5 is 0.45...the difference I see seems much bigger than that, maybe the "/2.2" part has been ignored :confused:
I was under the impression than 5/5 worked in any given situation :rolleyes:
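To put a number on how much the exponent alone could matter, here is a small Python check comparing x^0.45 with x^(1/2.2) over all 8-bit levels. It says nothing about how ddcc parses the ofile, it only bounds the difference between the two exponents.

```python
def max_exponent_difference(exp_a, exp_b, levels=256):
    """Largest absolute difference between x**exp_a and x**exp_b on an 8-bit grid."""
    worst = 0.0
    for code in range(levels):
        x = code / (levels - 1)
        worst = max(worst, abs(x ** exp_a - x ** exp_b))
    return worst

diff = max_exponent_difference(0.45, 1.0 / 2.2)
print("max difference: %.5f (about %.2f 8-bit codes)" % (diff, diff * 255))
```

The difference stays around one 8-bit code at most, so a clearly darker picture would have to come from something other than 0.45 versus 0.4545.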
tetsuo55
27th December 2008, 22:04
With current displays, off-gamut colors are almost not an issue, because their gamuts exceed the standard gamuts; maybe only ntsc could have some problems with the reds...
The in-gamut colors would be more problematic. Because the PC only works with 8 bits per component, the same as the video sources, some source colors will be shown as the same color after the gamut conversion (shrinking).
If we had 10 bit per component color depth in our PCs... some graphics cards already support it, but Windows doesn't.:(
Windows 7 has full support for 10bit, it goes as far as 16bit actually.
I personally cannot wait because i have a 10bit videocard AND a 10bit display
I really hope Windows 7 is smart enough to detect all the different colorspaces and converts them accurately to 10bit BT.709
leeperry
27th December 2008, 22:06
i have a 10bit videocard
the TMDS encoder of your graphic card works in 3*8 bits, no soup for you :D
tetsuo55
27th December 2008, 22:22
the TMDS encoder of your graphic card works in 3*8 bits, no soup for you :D
Why do you think so?
According to all ATI documentation i read on the subject it has full support, also read some posts on random forums where people are using it.(incidentally with almost the same panel as i have)
Quoting Ati's PR:
-Full 30-bit display processing
-Spatial/temporal dithering provides 30-bit color quality on 24-bit and 18-bit displays
-Primary supports 18-, 24-, and 30-bit digital displays at all resolutions up to 1920x1200 (single-link DVI) or 2560x1600 (dual-link DVI)
-Secondary supports 18-, 24-, and 30-bit digital displays at all resolutions up to 1920x1200 (single-link DVI only)
leeperry
27th December 2008, 22:44
According to all ATI documentation i read on the subject it has full support, also read some posts on random forums where people are using it.(incidentally with almost the same panel as i have)
well we've discussed this a bit with Seb.26(an occasional ffdshow coder from HCFR), basically you can output 30 bits on VGA....considering it's analog and the LUT is 3*10 bits, you can even measure the LUT accuracy with ARGYLLCMS and a colorimeter.
DVI/HDMI 1.0 is 3*8 bits TMDS, and I don't see any >1.0 HDMI graphic card at this point...so these nice HDMI 1.3 inputs on top of the line displays are m00t at this point(even so considering there's no 30 bits source at this point, and there won't be before a long while).
so 30 bits/xvYCC would only be useful to the HTPC color freaks...like us :D
apparently DVI can only do 30 bits in dual link configuration :
http://techreport.com/forums/viewtopic.php?t=47715
but dual link DVI & HDMI 1.3 are not compatible AFAIK...what's your display that does 30 bits on DVI ?
tetsuo55
27th December 2008, 23:27
Interesting...
I have a Sony HDTV, its a 1080P/10bit(30bit) panel.
My current videocard is a HD2400pro.
The card seems to have a HDMI 1.2 port, which supports everything except TrueHD/DTS-MA bitstream and 10bit(30bit)
My card does support up to 10bit(30bit) but only with DVI
I guess you're right.
Currently you need a panel that supports 10bit over DVI, not HDMI(Many of these panels exist, but they are small and expensive as they are sold to the graphic industry)
I know there is no content available, but at least it would fix the fact that all our media centers are stuck in sRGB(which is the only colorspace windows understands)
yesgrey
27th December 2008, 23:41
Windows 7 has full support for 10bit, it goes as far as 16bit actually.
I was hoping for this...
In fact, when they started W7 development, I posted to their blog requesting that... I don't think they did it because of my request, but it's great to know that it's coming...
Maybe you cannot use it with the hdmi port, but with the vga port you can; I believe the dacs are 10 or 12bit...
leeperry
28th December 2008, 00:04
there's still no HDMI 1.3 soundcard that can do bitstream....so wait for a (long) while and HDMI 1.3 graphic cards will pop up, eventually :D
many pj work internally in 10/12 bits, meaning the gamma/colorimetry settings in the OSD won't create terrible banding
even if we get HDMI 1.3 graphic cards & 10 bits native windows 7, don't count on anyone but tritical, yesgrey3 & JohnAd to watch our back.
just like Reclock, ppl who care about butter smooth movies and proper colorspaces are not legions.
actually, there's this card from Asus that looked most promising "12-bit gamma correction" :
http://vr-zone.com/articles/ASUS_Combines_Splendid_HD_w_HD3850_MXM_Card/5753.html?doc=5753
too bad they didn't put HDMI 1.3 outputs :rolleyes:
tritical
28th December 2008, 00:57
Well, after reading the documentation of color.hcfr I know how they calculate saturation (euclidean distance from the white point in xyY space) and the saturation error percentages. However, I'm still not sure how you'd go about accurately correcting anything. All you have are measurements for a few points that fall along the 6 lines running from the white point to the 3 primary and 3 secondary colors in xyY colorspace.
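As a rough illustration of that saturation measure, here is a Python sketch that takes saturation as the euclidean distance from the white point in the xy plane and computes a relative error against a target point. The percentage formula and the "measured" point are assumptions made up for the example; Color.HCFR's exact calculation may differ.

```python
import math

WHITE = (0.3127, 0.3290)        # D65 white point (x, y)
GREEN_709 = (0.300, 0.600)      # Rec.709 green primary (x, y)

def saturation(xy, white=WHITE):
    """Euclidean distance from the white point in the xy plane."""
    return math.hypot(xy[0] - white[0], xy[1] - white[1])

def point_on_line(primary_xy, fraction, white=WHITE):
    """Target point at a given fraction of the way from white to a primary."""
    return (white[0] + fraction * (primary_xy[0] - white[0]),
            white[1] + fraction * (primary_xy[1] - white[1]))

def saturation_error_percent(measured_xy, target_xy, white=WHITE):
    """Relative saturation error in percent (assumed formula)."""
    target = saturation(target_xy, white)
    return 100.0 * (saturation(measured_xy, white) - target) / target

target = point_on_line(GREEN_709, 0.75)     # 75% saturated green
measured = (0.301, 0.570)                   # hypothetical probe reading
print("%.1f%%" % saturation_error_percent(measured, target))
```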
leeperry
28th December 2008, 01:08
actually Colorfacts doesn't measure these saturations at all.
all you get is gray levels & gamut measurements, so maybe it's not that important after all.
in my case cyan is going down the drain >70%, and other colors are oversaturated...if you tell me what data YOU'd need, I can try to get the Color.HCFR coders to provide you with them.
http://pix.nofrag.com/0/2/8/a990a92dce9da10f80a7f76c10e18.png
yesgrey
28th December 2008, 02:57
For SMPTE-C I would use gam_i/o = 3 for NTSC and 4 for PAL, which is the same as using 5 but with 2.2 or 2.8 gamma respectively. I will add that to the readme.
tritical,
This is my understanding of this whole formats thing...
For the output, whichever format we use, I think we should always use our display's coordinates, and gam_o=5 with a gamma value equal to our display's gamma... if our display has a gamma of 2.0, we should use 2.0.
For PAL we should use the BT.470-2 System B,G primaries and gamma curve, chr_i=2 and gam_i=4.
And from this (http://en.wikipedia.org/wiki/NTSC), it appears that for NTSC we should use the SMPTE C primaries and the SMPTE 170M gamma curve, chr_i=3 and gam_i=1.
What do you think of this?
yesgrey
28th December 2008, 03:14
leeperry,
could you post the same graph but without performing the color correction? Just with the original primaries and secondaries?
Thanks.
leeperry
28th December 2008, 03:27
leeperry,
could you post the same graph but without performing the color correction? Just with the original primaries and secondaries?
Thanks.
well that's the second part of the problem actually.
it's easy to measure them with the automatic test patterns built into Color.HCFR...but then it's done through simple GDI(no DS filter).
to do it through ddcc, you'd need to use the manual DVD test patterns.....and to get this chart you need to go through at least 24 of them, enough to turn anyone nuts.
so I've asked the Color.HCFR guys if they could somehow drive MPC by sending "next chapter" keystrokes, but well if these data can't be "processed" into ddcc's LUT then it's pointless.
tetsuo55
28th December 2008, 12:24
there's still no HDMI 1.3 soundcard that can do bitstream....so wait for a (long) while and HDMI 1.3 graphic cards will pop up, eventually :D
many pj work internally in 10/12 bits, meaning the gamma/colorimetry settings in the OSD won't create terrible banding
even if we get HDMI 1.3 graphic cards & 10 bits native windows 7, don't count on anyone but tritical, yesgrey3 & JohnAd to watch our back.
just like Reclock, ppl who care about butter smooth movies and proper colorspaces are not legions.
actually, there's this card from Asus that looked most promising "12-bit gamma correction" :
http://vr-zone.com/articles/ASUS_Combines_Splendid_HD_w_HD3850_MXM_Card/5753.html?doc=5753
too bad they didn't put HDMI 1.3 outputs :rolleyes:
Auzentech has an HDMI 1.3 soundcard: you connect the device or video card with HDMI 1.2 or lower to the HDMI input on the soundcard, and it converts the signal to HDMI 1.3 (this does not add any new information except for the audio).
leeperry
28th December 2008, 12:52
so they finally released it, and it does HD bitstream audio ?
I guess the fees to offer HDMI 1.3 must be very high, but sooner or later we'll get HDMI 1.3 graphic cards I guess.
maybe with the new nvidia cards early next year ? :)
yesgrey
7th January 2009, 00:14
tritical,
I had another idea to increase speed and lower the size of the 3D LUT file... Why not include in the 3D LUT also the YUV->RGB conversion? (I think YUY2 would be enough)
The LUT size would be: 3x220x254x254 ~ 40.6MB < 48MB.
And it is just a matter of adding the specification of the matrix we would like to use (BT.601 or BT.709) and the levels we want... (you already have part of that done in Colormatrix)
So the 3D LUT would become equivalent to:
Y'U'V'->R'G'B'->RGB->Color Correction->RGB->R'G'B'.
I think this would be a more powerful and interesting solution, because some people only need the YUV->RGB conversion with a 3D LUT... perhaps it would be faster than the ConvertToRGB32()...;)
If you don't have time let me know, I could try changing just the C part of the code for calculating the 3D LUT file...
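The chain proposed above (Y'U'V' -> R'G'B' -> RGB -> color correction -> RGB -> R'G'B') can be sketched for a single LUT entry as follows. This is only an illustration under stated assumptions: 8-bit 16-235 input, a plain power-law gamma of 2.2, and an identity matrix standing in for the real gamut correction. None of the names below come from ddcc.

```python
# Identity placeholder: a real table would use a 3x3 gamut-conversion matrix here.
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

def clamp01(x):
    return min(1.0, max(0.0, x))

def ycbcr_to_rgb_prime(y, cb, cr, kr, kb):
    """8-bit limited-range Y'CbCr -> normalised R'G'B' (may fall outside 0..1)."""
    kg = 1.0 - kr - kb
    yn = (y - 16) / 219.0
    pb = (cb - 128) / 224.0
    pr = (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - kr) * pr
    b = yn + 2.0 * (1.0 - kb) * pb
    g = (yn - kr * r - kb * b) / kg
    return r, g, b

def lut_entry(y, cb, cr, kr=0.2126, kb=0.0722, matrix=IDENTITY, gamma=2.2):
    """One table entry: decode, linearise, correct, re-encode to 8-bit R'G'B'."""
    rgb_prime = [clamp01(c) for c in ycbcr_to_rgb_prime(y, cb, cr, kr, kb)]
    linear = [c ** gamma for c in rgb_prime]                 # assumed pure power law
    corrected = [clamp01(sum(m * c for m, c in zip(row, linear))) for row in matrix]
    return tuple(round(255 * c ** (1.0 / gamma)) for c in corrected)

print(lut_entry(16, 128, 128), lut_entry(235, 128, 128))
```

Passing kr=0.299, kb=0.114 selects the BT.601 weights instead of the BT.709 defaults.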
tritical
10th January 2009, 07:59
I am working on the next version, current list of changes to make:
1.) don't require z in the input files
2.) change the way out-of-gamut colors are handled
3.) Add an option to ddcc to use lut (basically it will just programmatically invoke rgb3dlut after creating the lut file so you don't have to open the script twice).
@yesgrey3
Coding a yuy2 to rgb 3d lut filter wouldn't take much work (small modifications to rgb3dlut). I might include it in the next release of ddcc. If you could write a program to actually calculate the LUT files for use with it that would be great :).
@leeperry
the difference I see seems much bigger than that, maybe the "/2.2" part has been ignored
It will be... it just reads fp values (sscanf(buf,"%lf",&val)).
On the topic of saturation correction. Maybe you could ask JohnAd what he did, because I still don't know how you could make any corrections based on the data you have.
leeperry
10th January 2009, 11:02
Coding a yuy2 to rgb 3d lut filter wouldn't take much work (small modifications to rgb3dlut). I might include it in the next release of ddcc.
well mark0077 has been recently pointing out that the nvidia drivers in YV12(using software renderers) were offering better chroma upsample than even ConvertToRGB32() :
http://forum.doom9.org/showpost.php?p=1230065&postcount=27
I'm not sure how they do it, but could you please set up a very HQ chroma upsample scheme ? like spline36 ? apparently Convert() is using bicubic, and so does ffdshow....but the ugly ATi drivers would be using pointresize(from what Leak said) :rolleyes:
the test sample VOB is available here :
http://www.mediafire.com/?9g9ddlfzxhv
@leeperry
It will be... it just reads fp values (sscanf(buf,"%lf",&val)).
On the topic of saturation correction. Maybe you could ask JohnAd what he did, because I still don't know how you could make any corrections based on the data you have.
well the real issue will be to verify how it went, the HCFR coders are too busy with the next release to set up some sort of automatization with MPC and the DVD manual patterns....and going through 24 manual patterns is too cumbersome. nevertheless, I will ask JohnAd :)
if at some point, you could sorta give some hints for the gam_i/o formulas to use, that'd be really great.
atm I'm using 5/5 and it looks good to me on my ±2.2 calibrated display....maybe I should put 0.454545454545 instead of 0.45 ?
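To get a feel for how much 0.45 versus 1/2.2 could matter, here is a small sketch. It assumes the value is applied as a plain power-law exponent to a normalised signal and that the result is quantised to 8 bits; how ddcc actually uses the number is not stated in the thread.

```python
def max_code_difference(exp_a=0.45, exp_b=1.0 / 2.2, steps=4096):
    """Largest gap, in 8-bit code values, between two power-law encodings."""
    worst = 0.0
    for i in range(steps + 1):
        x = i / steps
        worst = max(worst, abs(255.0 * (x ** exp_a - x ** exp_b)))
    return worst

print(f"{max_code_difference():.2f} 8-bit code values")
```

Under these assumptions the two exponents stay within about one 8-bit step of each other, with the largest gap in the dark tones.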
tritical
10th January 2009, 20:27
What playback chain do I need to use to ensure that nvidia's drivers are doing the upsampling (I have a 9800gtx with 178.28 drivers)? Avisynth's color conversions operate as described here: http://avisynth.org/Sampling. The only problems with the descriptions on that page are that the rgb->yuy2 conversion doesn't use a [1 2 1] kernel, and the C code version of yuy2->rgb doesn't average to get every other chroma sample (see http://forum.doom9.org/archive/index.php/t-129316.html). ffdshow's hq rgb conversion uses avisynth's code (as in taken out of avisynth cvs) so it uses the same sampling. Any differences between avisynth's conversion and ffdshow's hq rgb conversion are probably due to coefficient scaling.
My plan for yuy2 to rgb lut is to use linear interpolation to create the chroma for the odd pixels (same method described on the avisynth sampling page).
leeperry
10th January 2009, 21:03
the test was conducted in YV12 with EVR on Vista SP1 :
http://forum.doom9.org/showthread.php?t=143818
apparently what also helped was Leak's progressively upsampling PS script(embedded in MPC HC) :
http://forum.doom9.org/showpost.php?p=1184975&postcount=32
that'd be more convenient if you could allow YV12 input instead of YUY2 if any possible, so ddcc gets the untouched video stream(no ConvertToYUY2() implied).
yesgrey
11th January 2009, 02:05
The LUT size would be: 3x220x254x254 ~ 40.6MB < 48MB.
This is wrong. Since we also want to convert values of Y<16 and Y>235, the correct size would be:
3x254x254x254 ~ 46.9MB.
In YUV, values 0 and 255 should be clipped. So I think it is better to just keep the same size as the RGB LUT, 3x256x256x256...
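A quick check of the table sizes discussed in this exchange, as a sketch. It assumes one byte per channel, three channels per entry, and "MB" meaning 2^20 bytes.

```python
MIB = 1024 * 1024

def lut_bytes(y_levels, u_levels, v_levels, channels=3):
    """Size in bytes of a 3D LUT with one byte per output channel."""
    return channels * y_levels * u_levels * v_levels

for dims in ((220, 254, 254), (254, 254, 254), (256, 256, 256)):
    print("x".join(map(str, dims)), round(lut_bytes(*dims) / MIB, 1), "MiB")
```

The 254-level variant comes out slightly under the full 256-level table, so keeping 256 levels per axis costs little and keeps the indexing simple.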
If you could write a program to actually calculate the LUT files for use with it that would great :).
Yes, I am thinking about this. I will use parts of your code, and I am also thinking about finding a way of performing the correction for the saturation problem.
Probably I will use Fortran - it's my preferred language -, are you ok with it?
yesgrey
11th January 2009, 02:14
that'd be more convenient if you could allow YV12 input instead of YUY2 if any possible
This is not possible.
A LUT maps an input value to an output value.
In YV12 you don't have all the input values, only half, so you cannot map it through the 3D LUT. You have to compute the values that are missing, so, you always have to convert to YUY2. The idea of including the YUV->RGB conversion is only to avoid the extra cpu load of YUV->RGB conversion, you cannot avoid the YV12->YUY2 conversion...
Of course the rgb3dlut function could support YV12 input, but then it will have to perform the YV12->YUY2 conversion internally before mapping it out via the 3D LUT, so there is no gain in avoiding ConvertToYUY2...;)
leeperry
11th January 2009, 02:20
you cannot avoid the YV12->YUY2 conversion
right :D
well still, seeing how inaccurate ConvertToRGB32() is, maybe the YV12>YUY2 conversion could also benefit from some higher quality upsample :o
tritical
11th January 2009, 05:00
In YUV values 0 and 255 should be clipped. So, I think the better is just keeping the same size of the RGB, 3x256x256x256...
I agree... it also makes the code simpler. My code makes the lookup in to the table as: ((V<<16)+(U<<8)+Y)*3, then the three bytes at that location are stored in b,g,r order.
Yes, I am thinking in this. I will use parts of your code, and am also thinking in finding a way of performing the correction for the saturations problem.
Probably I will use Fortran - is my prefered language -, are you ok with it?
Yep.
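The lookup described above can be sketched like this. The offset formula and the b,g,r byte order are taken from the post; the helper names and the in-memory bytearray are illustrative only.

```python
LUT_BYTES = 3 * 256 * 256 * 256   # full 8-bit table, 3 bytes per entry

def lut_offset(y, u, v):
    """Byte offset of the entry for one (Y, U, V) triple."""
    return ((v << 16) + (u << 8) + y) * 3

def lookup_rgb(lut, y, u, v):
    """Read one entry; bytes are stored in b, g, r order."""
    off = lut_offset(y, u, v)
    b, g, r = lut[off], lut[off + 1], lut[off + 2]
    return r, g, b

lut = bytearray(LUT_BYTES)
off = lut_offset(235, 128, 128)
lut[off:off + 3] = bytes((255, 255, 255))   # white stored as b, g, r
print(lookup_rgb(lut, 235, 128, 128))
```

With Y in the lowest bits, neighbouring luma values for the same chroma sit next to each other in memory.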
IanB
11th January 2009, 09:09
YV12 to RGB is possible in exactly the same way as the proposed YUY2 to RGB, you just have to pre-interpolate vertically as well.
As for out of gamut values, maintaining Hue is the all important factor. The eye is very sensitive to hue, especially skin tones (pink), but is quite ambivalent about saturation particularly at high saturations.
In YUV space, hue is analogous to the ratio U/V and saturation is analogous to the length of the UV vector, i.e. sqrt(U**2+V**2), so just clamp the offending U or V value at the gamut limit and scale the other value to maintain the original ratio of the preclamped values.
e.g. Using normalised values, conversion results in U=1.06, V=0.42. Clamp U at 1.0, set V=1.0*(0.42/1.06)=0.396
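The U/V example above, written out as a minimal sketch. The function name and the symmetric limit of 1.0 on normalised values are assumptions for illustration.

```python
def clamp_uv(u, v, limit=1.0):
    """Clamp the larger of |U|, |V| to the limit and keep the U/V ratio."""
    peak = max(abs(u), abs(v))
    if peak <= limit:
        return u, v
    return u / peak * limit, v / peak * limit

u, v = clamp_uv(1.06, 0.42)
print(round(u, 3), round(v, 3))
```

Because both components are scaled by the same factor, the direction of the UV vector (the hue) is unchanged and only its length shrinks.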
In RGB space hue is the ratio of the 2 dominant primaries after the value of the minority primary has been subtracted from each. These values are best corrected in linear (non-gamma) space.
e.g. Using normalised values, conversion results in R=0.72, G=1.07, B=0.42.
B is the current minority primary, R'=0.72-0.42=0.3 and G'=1.07-0.42=0.65
Maintain Hue by keeping the ratio G'/R'=0.65/0.3=2.167 constant.
Clamp G to 1.0, scale R''=R'*G''/G'=0.3*(1.0-0.42)/0.65=0.268
Thus adjusted values become R=R''+B=0.268+0.42=0.688, G=1.0, B=0.42
Apply Gamma correction and scale to output range.
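The RGB steps above can be sketched in a few lines, working on linear (non-gamma) values as the post recommends. The function name is made up, and the handling of the case where even the smallest channel exceeds the limit is an assumption added so the sketch cannot produce a negative scale.

```python
def dominant_pair_clamp(r, g, b, limit=1.0):
    """Pull the largest channel back to the limit while keeping the minority
    primary fixed and the ratio of the two dominant excesses constant."""
    peak = max(r, g, b)
    if peak <= limit:
        return r, g, b
    floor = min(r, g, b)
    if floor >= limit:
        # Assumed fallback: everything is over the limit, so output white.
        return limit, limit, limit
    scale = (limit - floor) / (peak - floor)
    return tuple(floor + (c - floor) * scale for c in (r, g, b))

print(tuple(round(c, 3) for c in dominant_pair_clamp(0.72, 1.07, 0.42)))
```

Scaling every channel's excess over the minority primary by the same factor leaves the minority primary untouched and preserves the G'/R' ratio from the worked example.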
tritical
11th January 2009, 10:07
I agree that maintaining hue while decreasing brightness and saturation is the best way to handle the case of values greater than 1.0, but I don't follow your math. You say that the ratio of G'/R' must be kept constant, and that ratio is equal to (G-B)/(R-B). Yet you only modify G and R. It's not possible for that ratio to remain the same without scaling all three values by the same factor (and it will stay constant if you do). I think the way to do it is simply to divide all three values by the maximum value... which in your example would give:
R=0.72/1.07, G=1.07/1.07, B=0.42/1.07
R=0.673, G=1.000, B=0.393
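The divide-by-maximum approach above is short enough to write out directly; this is a sketch with a hypothetical function name, not code from ddcc.

```python
def scale_all_clamp(r, g, b, limit=1.0):
    """Scale all three channels by the same factor so the largest hits the limit."""
    peak = max(r, g, b)
    if peak <= limit:
        return r, g, b
    k = limit / peak
    return r * k, g * k, b * k

print(tuple(round(c, 3) for c in scale_all_clamp(0.72, 1.07, 0.42)))
```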
yesgrey
11th January 2009, 16:19
YV12 to RGB is possible in exactly the same way as the proposed YUY2 to RGB, you just have to pre-interpolate vertically as well.
Yes, I mentioned that, but it's not possible to include the interpolation in the 3D LUT values. The interpolation should be done before mapping with the 3D LUT, it's the only way to know what to map from...;)
yesgrey
11th January 2009, 16:23
These values are best corrected in linear (non-gamma) space.
Well, these values only appear in the linear space, when we are performing the gamut conversion, so we are safe.:)
yesgrey
11th January 2009, 16:53
that ratio is equal to (G-B)/(R-B). Yet you only modify G and R. It's not possible for that ratio to remain the same without scaling all three values by the same factor...
It's possible. Look:
a) C = (G-B)/(R-B); you know R,G,B and calculate C
b) C = (G''-B)/(R''-B)
c) Set G''=1.0
d) R'' = (G''-B)/C + B; you know G'',B,C and calculate R''
I think this method is better because it has less effect on the values that do not need to be clipped.
tritical
11th January 2009, 19:56
You are right. I must have been up too late last night.
However, I still don't think it's clear cut that that method is better. In terms of HSL coordinates IanB's method is better... both methods produce the same H/S values, but IanB's method reduces L less. However, in terms of HSV coordinates my method is better... both methods produce the same H/V values, but IanB's method reduces S while mine does not.
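The HSL/HSV comparison above can be checked with Python's standard colorsys module. This sketch uses the two clamped results for the R=0.72, G=1.07, B=0.42 example; the dictionary keys are just labels chosen for this illustration.

```python
import colorsys

# Results of the two methods for R=0.72, G=1.07, B=0.42.
all_scaled = (0.72 / 1.07, 1.07 / 1.07, 0.42 / 1.07)
dominant_pair = (0.42 + 0.30 * (1.0 - 0.42) / 0.65, 1.0, 0.42)

def describe(rgb):
    """Hue plus the HSL and HSV saturation/lightness/value of an RGB triple."""
    h, l, s_hsl = colorsys.rgb_to_hls(*rgb)    # note: returns h, l, s
    _, s_hsv, v = colorsys.rgb_to_hsv(*rgb)
    return {"hue": h, "hsl_s": s_hsl, "hsl_l": l, "hsv_s": s_hsv, "hsv_v": v}

for name, rgb in (("all scaling", all_scaled), ("dominant pair", dominant_pair)):
    print(name, {k: round(val, 3) for k, val in describe(rgb).items()})
```

Both results share hue, HSL saturation and HSV value; they differ in HSL lightness (higher for dominant pair scaling) and HSV saturation (higher for all scaling).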
yesgrey
12th January 2009, 01:19
tritical,
Your method has the advantage of being much easier to code.
Since the hue is kept with both methods, and the HSL vs HSV is not conclusive, maybe it's better just doing the easiest...;)
IanB, what do you think?
IanB
12th January 2009, 02:00
Yes, both methods are compromises, the value is out of gamut, you must compensate it.
Tritical's all scaling method maintains hue and saturation at the expense of luminance.
My dominant pair scaling method maintains hue at the expense of both saturation and luminance, but to a lesser degree.
Another alternative is to prune saturation even harder and maintain luminance. i.e.
R=0.72, G=0.42, B=1.07.
Y=0.2126*0.72+0.7152*0.42+0.0722*1.07=0.531
All scaling
R=0.673, G=0.393, B=1.0
Y=0.496
Dominant pair scaling
R=0.688, G=0.42, B=1.0
Y=0.519
Plus minority primary scaling
R=0.695, G=0.437, B=1.0
Y=0.531
Each method has weaknesses. Probably a selection of methods might be needed, depending on how the colour is out of gamut. I chose to make the Blue channel over valued in this example because it was not possible to achieve the Y correction needed with the original Green over value example. Minority primary scaling can work best for excess Blue excursions and worst for excess Green excursions.
Likewise dominant pair scaling should not be used for low saturation examples, the eye is quite sensitive to changes in low saturation values. Think of the white point difference between 9300K, 6500K and 5000K: to the eye these are all very low saturation blues and reds.
And of course small changes in mid scale luminance for high saturation colours are more noticeable, making all scaling a less desirable choice.
We are probably being excessively picky here, but gamut correction is for the perfectionists anyway. Most of the population watch coloured TV and just don't care/don't know any better.
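The luma figures for the first two methods in this post can be re-derived with the BT.709 weights it uses. This is a sketch for the blue-excess example only; the third method is left out because its exact formula is not spelled out in the thread.

```python
def luma_709(r, g, b):
    """BT.709 luma weights, as used in the post."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

original = (0.72, 0.42, 1.07)

# All scaling: divide every channel by the maximum.
all_scaled = tuple(c / max(original) for c in original)

# Dominant pair scaling: keep the minority primary, shrink the excess over it.
floor = min(original)
scale = (1.0 - floor) / (max(original) - floor)
dominant_pair = tuple(floor + (c - floor) * scale for c in original)

for name, rgb in (("original", original),
                  ("all scaling", all_scaled),
                  ("dominant pair", dominant_pair)):
    print(name, round(luma_709(*rgb), 3))
```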
yesgrey
12th January 2009, 02:48
We are probably being excessively picky here, but gamut correction is for the perfectionist's anyway.
Well, with the 3D LUT method, implementing all of this will not increase the cpu load, and it's not too hard to code, so let's go be picky.:)
tritical
12th January 2009, 07:45
I'll admit that I'm one of those people who don't really care. When I watch stuff on my computer I never worry about colorimetry or gamut correction. I have a cheap 19 inch flat panel monitor :).
Anyways, initial speed tests show the yuy2->rgb lut method is much faster than calling converttorgb32().rgb3dlut(), but it isn't going to beat out just converttorgb32(). Some tests on my quad core Q6600 using 720x480 mpeg2 video decoded with dgdecode to yv12:
220fps converttorgb32()
76fps converttorgb32().ddcc(threads=1)
111fps converttorgb32().ddcc(threads=2)
143fps converttorgb32().ddcc(threads=4)
118fps converttorgb32().rgb3dlut(threads=1)
138fps converttorgb32().rgb3dlut(threads=2)
145fps converttorgb32().rgb3dlut(threads=4)
175fps converttoyuy2().rgb3dlut(threads=1)
200fps converttoyuy2().rgb3dlut(threads=2)
210fps converttoyuy2().rgb3dlut(threads=4)
These results seem a little strange... have to investigate.
leeperry
12th January 2009, 11:32
I'll admit that I'm one of those people who don't really care. When I watch stuff on my computer I never worry about colorimetry or gamut correction. I have a cheap 19 inch flat panel monitor :).
I believe you start to care when you use a projector, as these things have wide gamuts.....and a flashy picture on a big projection screen looks really ugly :o
nice speed improvement! with proper chroma upsampling at that...can't wait to try a new beta :thanks:
yesgrey
12th January 2009, 14:06
I'll admit that I'm one of those people who don't really care. When I watch stuff on my computer I never worry about colorimetry or gamut correction. I have a cheap 19 inch flat panel monitor :).
But have you tried it? Probably not, because I think that maybe you don't know the coordinates of your monitor primaries.
I also don't care much about it, because since I'm not a native English speaker, sometimes I have to use subtitles, and then I spend more time looking at the subtitles than at the movie colors...:D but when I disable the subtitles and listen to the audio directly, the difference is very noticeable and pleasant.:)
Anyways, initial speed tests show the yuy2->rgb lut method is much faster than calling converttorgb32().rgb3dlut(), but it isn't going to beat out just converttorgb32().
Well, if we could load the 3D LUT into the graphics card memory and perform the mapping through it, maybe it would beat converttorgb32() alone. Even as it is now it is very fast, but it's strange that the speed changes with the number of threads... it should be limited by the memory access speed, right?
tritical
13th January 2009, 09:25
I put up the new version. I thought the fps numbers were weird compared to tests I ran two weeks ago, but those numbers were on a dual core using xvid encoded input. For rgb3dlut speed increases going from 1 to 2 threads, but hardly at all from 2 to 4. I think it is probably because this quad core is actually two dual cores each with its own 4MB L2 cache.
leeperry
13th January 2009, 12:04
nice! trying it as we speak...it looks fast as hell :eek:
but because ddcc takes care of the RGB conversion, shouldn't it let us choose the YCbCr>RGB decoding matrix ?
Rec. ITU-R BT.601-5 => PAL / SECAM / NTSC (SD)
Rec. ITU-R BT.709-4 => HD
you can have a SD video w/ EBU gamut, or a HD video w/ SMPTE-C gamut as you know.
also letting us choose the input levels would be nice, as I'm sending 0-255 content to ddcc(or just assume 0-255 input and let ppl use colorYUV(levels="tv->pc"))
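For reference, a TV-to-PC range expansion of the kind mentioned above can be sketched as follows. This is a generic 8-bit illustration of the 16-235 / 16-240 to 0-255 mapping, not the exact arithmetic of any AviSynth filter.

```python
def clamp8(x):
    return max(0, min(255, x))

def expand_luma(y):
    """Map 16..235 luma onto 0..255."""
    return clamp8(round((y - 16) * 255 / 219))

def expand_chroma(c):
    """Map 16..240 chroma onto 0..255, keeping 128 as the neutral point."""
    return clamp8(round((c - 128) * 255 / 224 + 128))

print(expand_luma(16), expand_luma(235), expand_chroma(128))
```

If the source is already full range, applying this a second time would clip shadows and highlights, which is why the assumed input range has to match the LUT.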
and I've played around again w/ gam_i/o=3 or 5 but I always go back to 5/5.
yesgrey
13th January 2009, 12:50
well mark0077 has been recently pointing out that the nvidia drivers in YV12(using software renderers) were offering better chroma upsample than even ConvertToRGB32() :
http://forum.doom9.org/showpost.php?p=1230065&postcount=27
I'm not sure how they do it, but could you please set up a very HQ chroma upsample scheme ? like spline36 ?
Try using avisynth 2.6. With it you can select the resizer for the chroma upsampling, even spline64.
yesgrey
13th January 2009, 12:56
but because ddcc takes care of the RGB conversion, shouldn't it let us choose the YCbCr>RGB decoding matrix ?
Yes. I will write a little program, based on some of the ddcc code, for creating the 3D LUT files. I will consider this, and even the possibility of defining custom levels for the conversion, as is possible now with ffdshow. This way, I could use just rgb3dlut and control the yuv->rgb conversion myself.
I also want to add custom gamma correction, including the possibility of using the display gamma curve.
leeperry
13th January 2009, 13:11
Try using avisynth 2.6. With it you can select the resizer for the chroma upsampling, even spline64.
ORLY ? well MT 0.7 doesn't support Avisynth 2.58.
I know chromaresample="spline36" exists, but it's not used for chroma upsampling in RGB32 conversion from what IanB told me.
Yes. I will write a little program, based on some of ddcc code, for creating the 3D LUT files. I will consider this, and even the possibility of defining custom levels for the conversion, like it's possible now with ffdshow. This way, I could use just rgb3dlut and control myself the yuv->rgb conversion.
I also want to add custom gamma correction, including the possibility of using the display gamma curve.
oh OK, I thought tritical would simply add an option to choose REC601/709(and assume 0-255 input), and that everything would be fine :o
anyhow, I failed to understand how I could build the LUT in RGB32 in ddcc() and then use it in YUY2 in rgb3dlut()....so that would explain :D
so rgb3dlut() will assume 0-255 input for YUY2>RGB32 conversion then ?
I'm back to ddcc 1.5 for the time being..or maybe I'll rebuild the LUT's as off-gamut colors seem to be even better handled now :cool:
leeperry
14th January 2009, 22:08
OK so I've rebuilt the LUT's, my 2 displays(CRT/DLP pj) both carry off-gamut colors, and the new ddcc looks absolutely stunning :eek:
playing HD SMPTE-C movies yields very true to life colors....actually they've never looked so good :cool:
as soon as I started playing around w/ the PS script, I quickly realized that when you watch movies in their native gamut the contrast ratio increases accordingly....mainly coz colors are perfectly spot-on and not "polluted" anymore.
a friend of mine, who got me into that gamut craziness also told me the same thing....he's got the holy Samsung SP-A800B pj and runs a website about gamuts & stuff : www.hdsoir.com (sorry, french only :o )
@yesgrey : if you can get ddcc working equally accurately in YUY2, that sure will be a blast! :D
cyberbeing
23rd January 2009, 05:44
I've read the documentation, but I'm a bit confused about how to properly use this filter. I currently have two GDM-F520 CRT monitors calibrated and profiled with an Eye-One Pro spectrophotometer to D65 2.2 for photoediting/graphic design use.
Could someone give me a step-by-step of how to create the required txt file(s)?
Do I only need to create an ofile?
Why would I need an ifile?
Is lutfile automatically created by ddcc or do I have to create that as well?
It sounds like rgb3dlut is faster and therefore preferred to ddcc?
Once I have the required files, what would be an example script for bt.709 video? For bt.601 video?
How should I setup ffdshow's output panel?
yesgrey
23rd January 2009, 10:20
First you have to measure the chromaticity coordinates of your displays, in this case of both your crt's, you will need it for filling the data in the ofile.
Generally you don't need the ifile, only if you will use source material from a standard different from those already included in ddcc.
Currently the 3D lut file is created by ddcc when you specify a name for it when calling the function. Then, rgb3dlut uses that file.
I'm writing a little program, using some parts of ddcc, to create the 3D LUT files in an alternative way. These files would then be used by rgb3dlut.
Wait a few more days (I'm waiting for the weekend to finish it), I will post my program with instructions and scripts for the more usual setups.;)
cyberbeing
23rd January 2009, 11:23
First you have to measure the chromaticity coordinates of your displays
That is my main confusion. I'm unsure what chromaticity coordinates you are talking about.
What program do I use? If not easy to figure out, how would I measure what is needed?
What do I need to measure?
After measuring, where in the results would I find the chromaticity coordinates needed for the ofile?
yesgrey
23rd January 2009, 12:00
What program do I use? If not easy to figure out, how would I measure what is needed?
Well, you can look here (http://www.avsforum.com/avs-vb/showthread.php?t=912720) to understand what are the coordinates we need, and you can look here (http://www.homecinema-fr.com/colorimetre/index_en.php) for a free software to help you measuring the coordinates.
cyberbeing
24th January 2009, 00:43
OK, I think I figured it out, but ddcc is throwing an error:
ddcc: error reading from file (C:\Program Files\Avisynth 2.5\bt709.txt)
I was using the following script to try to generate the lut file:
dss2("F:\test.mkv",fps=23.976)
ConvertToRGB24()
ddcc(chr_i=0, gam_i=1, ofile="C:\Program Files\AviSynth 2.5\bt709.txt", lutfile="C:\Program Files\AviSynth 2.5\3D_LUT_BT709.txt", threads=0, opt=-1)
the bt709.txt contains:
http://img144.imageshack.us/img144/296/ofilegi8.png
What am I doing wrong?
yesgrey
24th January 2009, 00:53
What am I doing wrong?
If you are using ddcc v1.6, you should supply only the x and y coordinates. In your file you have specified x,y and z.
cyberbeing
24th January 2009, 02:44
Oh, I guess that explains the error. I was just basing my file off one of the old examples earlier in this thread, but I didn't catch it had changed. I removed the z value and it worked.
Now I've run into another problem. Using YUY2 input with rgb3dlut gives me messed up chroma (http://img144.imageshack.us/img144/5751/0000ur6.png):
ConvertToYUY2()
rgb3dlut(lutfile="C:\Program Files\AviSynth 2.5\3D_LUT_BT709.txt", threads=2)
If I use RGB input the colors seem correct:
ConvertToRGB32(matrix="Rec709")
rgb3dlut(lutfile="C:\Program Files\AviSynth 2.5\3D_LUT_BT709.txt", threads=2)
Am I missing something else, or in ddcc 1.6 are you unable to use YUY2 input with rgb3dlut?
yesgrey
24th January 2009, 10:40
Now I've run into another problem. Using YUY2 input with rgb3dlut gives me messed up chroma...
Am I missing something else, or in ddcc 1.6 are you unable to use YUY2 input with rgb3dlut?
Read this (http://forum.doom9.org/showthread.php?p=1236588#post1236588).
leeperry
24th January 2009, 11:06
Read this (http://forum.doom9.org/showthread.php?p=1236588#post1236588).
I was reading again that you would let us input the display gamma curve.
does that mean that we could input all the data from ColorHCFR ? 9 points for each primary color ? :cool:
http://pix.nofrag.com/2/f/a/731617f4652b055fc4d4498489ac4.png
and to get back on the orangey red problem in SMPTE-C, I'm often seeing this kind of UGLY red's :
http://pix.nofrag.com/d/e/c/18ccaea39e7817bbb9a670494f9a8tt.jpg (http://pix.nofrag.com/d/e/c/18ccaea39e7817bbb9a670494f9a8.html)
this is the Dumb & Dumber BD, mostly the damn cameras didn't sample the Ferrari red properly...the major flaw in SMPTE-C.
in oversaturated REC709 demos, red cars are actually red :
http://img141.imageshack.us/img141/7003/31bv.jpg
it's discussed here :
http://www.avsforum.com/avs-vb/showthread.php?t=1038602
and here :
http://www.google.com/search?hl=en&q=SMPTE-C+orangey+red&btnG=Search&lr=
EDIT: using this Ferrari colors panel, maybe it was a "Rosso Corsa" red after all....but still undersaturated :rolleyes:
http://www.jb330gt.com/color/1995FerrariBig.jpg
cyberbeing
24th January 2009, 11:24
Read this (http://forum.doom9.org/showthread.php?p=1236588#post1236588).
I don't see the answer to my question in what you linked.
It would be much easier if you could just give me simple answers to my questions instead of linking me all over the place... I've already previously read through everything you have linked me to so far before I even thought about asking for help... I need a clear answer beyond what has already been posted... I don't want to play 20 questions...
Can rgb3dlut be used with YUY2 input? The documentation suggests the answer is yes.
What is causing the problem I'm seeing with YUY2 input? The lutfile? Something else?
If the lutfile isn't the problem, how did tritical get it to work with YUY2?
If the lutfile is the problem, then I assume that you are working on an app for creating the needed lutfile for YUY2 input?
leeperry
24th January 2009, 11:55
If the lutfile is the problem, then I assume that you are working on an app for creating the needed lutfile for YUY2 input?
it'll be possible when the app will be ready, now just wait like all of us :o
yesgrey
24th January 2009, 14:00
It would be much easier if you could just give me simple answers to my questions instead of linking me all over the place...
Some answers are not simple, and it is boring to have to repeat long answers several times. Each long answer that I repeat means less time I can put into coding...;)
If the lutfile is the problem, then I assume that you are working on an app for creating the needed lutfile for YUY2 input?
Yes.
yesgrey
24th January 2009, 14:03
I was reading again that you would let us input the display gamma curve.
does that mean that we could input all the data from ColorHCFR ? 9 points for each primary color ?
Yes, I am thinking in it, but not in the first release. First it will be just the same as ddcc v1.6, but including YCbCr->RGB conversion.
leeperry
24th January 2009, 14:22
Yes, I am thinking in it
http://forum-images.hardware.fr/images/perso/nico54.gif
cyberbeing
24th January 2009, 23:54
Yes.
Thank you, that is all I needed to know.
The lutfile is the problem and without a properly created lutfile, it won't work with YUY2 input.
You are working on a solution to make that lutfile, and until it is ready I'll have to wait.
:thanks:
yesgrey
28th January 2009, 12:14
Here is a link (http://www.megaupload.com/?d=JAVPPV3Q) with 4 3D LUT files just for testing purposes, to keep you busy while I finish the first version of my little program...;)
The 3D LUTs are only for performing the YCbCr->RGB conversion. This way, we can test the speed and the quality of using rgb3dlut versus other options available for the conversion (like ConvertToRGB32, ffdshow, graphic cards drivers).
There are 4 files:
-BT.601 with 16-235 levels
-BT.601 with 0-255 levels
-BT.709 with 16-235 levels
-BT.709 with 0-255 levels
Don't forget that the speed will be exactly the same with the full options available for creating the 3D LUT files.
leeperry
28th January 2009, 12:18
awesome, thanks!
so from the readme :
itype -
For yuy2 input this sets how to compute the u/v values for the second y value in each yuyv set.
Possible settings:
0 - duplicate (use u/v of first y value)
1 - linear interpolation (average u/v of first y with u/v of first y in next yuyv set)
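The two itype settings quoted above can be illustrated on one row of chroma samples. This is a sketch of the idea only: the rounding of the average and the handling of the last pair (no following set, so the value is duplicated) are assumptions, not taken from the rgb3dlut source.

```python
def expand_chroma_row(chroma, itype=1):
    """chroma: list of (u, v), one per yuyv set -> list with one (u, v) per pixel.

    itype=0 duplicates the chroma of the first pixel in each set;
    itype=1 averages it with the chroma of the next set.
    """
    out = []
    for i, (u, v) in enumerate(chroma):
        out.append((u, v))                      # first y of the set
        if itype == 1 and i + 1 < len(chroma):
            nu, nv = chroma[i + 1]
            out.append(((u + nu + 1) >> 1, (v + nv + 1) >> 1))  # assumed rounding
        else:
            out.append((u, v))                  # duplicate (also at the row end)
    return out

row = [(100, 200), (110, 180)]
print(expand_chroma_row(row, itype=0))
print(expand_chroma_row(row, itype=1))
```

Once every pixel has its own u/v pair, each pixel can be passed through the 3D LUT on its own.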
ConvertToRGB32(matrix="rec601")
http://www.image-load.eu/out.php/t141613_convert32601.png (http://www.image-load.eu/out.php/i141613_convert32601.png)
ConvertToYUY2()
rgb3dlut(lutfile="C:\3dluts\3dlut_ycbcr_16-235.txt",itype=0)
http://www.image-load.eu/out.php/t141614_ddccitype0.png (http://www.image-load.eu/out.php/i141614_ddccitype0.png)
ConvertToYUY2()
rgb3dlut(lutfile="C:\3dluts\3dlut_ycbcr_16-235.txt",itype=1)
http://www.image-load.eu/out.php/t141615_ddccitype1.png (http://www.image-load.eu/out.php/i141615_ddccitype1.png)
major changes :eek:
the chroma looks a lot more accurate, not some smearing pixelating hollow anymore :eek:
PS: I didn't have the BT.709 LUT when I ran the tests, but this doesn't really matter.
yesgrey
28th January 2009, 12:22
YV12 to RGB is possible in exactly the same way as the proposed YUY2 to RGB, you just have to pre-interpolate vertically as well.
Yes, I referred that, but it's not possible to include the interpolation in the 3D LUT values. The interpolation should be done before mapping with the 3D LUT, it's the only way to know what to map from...;)
IanB,
Now that I am messing with YCbCr->RGB conversion, I realized that I hadn't understood exactly what you said in that quote. I thought that YUY2 had all the CbCr data for the corresponding Y values, but now I know it doesn't, it only has 1 CbCr sample for 2 Y samples.
Now I see that the YV12 input could also be used in rgb3dlut. I hope it's not too much work, and that tritical could find the time for doing it...;)
yesgrey
28th January 2009, 12:29
the chroma looks a lot more accurate, not some smearing pixelating hollow anymore :eek:
Yes, I agree that rgb3dlut YCbCr->RGB conversion looks better.
I have zoomed in to compare, and when the linear interpolation is used (itype=1), the color grading is exactly the same as from ConvertToRGB32, but it doesn't have the smearing around it...
This makes me think that if rgb3dlut would also accept YV12 input the result could be even better... maybe ConvertToYUY2 is also adding some smearing of its own...
leeperry
28th January 2009, 12:44
This makes me think that if rgb3dlut would also accept YV12 input the result could be even better... maybe ConvertToYUY2 is also adding some smearing of it's own...
exactly my words when you guys started that YCbCr thingie....tritical's algorithms are a lot more accurate than Convert(), so we should bypass it altogether if any possible :cool:
until your code is ready, I will now use rgb3dlut() twice in a row, once for RGB32 conversion and once for gamut conversion :D
EDIT: I might also need to lower my sharpening in ffdshow, it's too damn sharp now :eek:
mark0077
28th January 2009, 13:09
so there is no direct yv12 to rgb32 lut? Is one in production :)
yesgrey
28th January 2009, 13:39
so there is no direct yv12 to rgb32 lut? Is one in production :)
Neither the yv12 to rgb32 nor the yuy2 to rgb32 conversion can be performed by a 3D LUT alone. The 3D LUT converts only between YCbCr and RGB, and for that you must have the Y and CbCr values of each pixel.
In YUY2 you have 2 Y and 1 CbCr for 2 pixels, and in YV12 you have 4 Y and 1 CbCr for 4 pixels. The CbCr values for the other 1 or 3 pixels have to be calculated by rgb3dlut. Currently it already does this for YUY2; let's hope the same can happen for YV12...
So, what we need for YV12->RGB32 is not a new lut, but a new rgb3dlut version...;)
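To make the point above concrete, here is a minimal Python sketch (not rgb3dlut's actual code) of the step that has to happen before any 3D LUT lookup: the shared chroma of a YUY2 row is interpolated so that every pixel gets its own (Y, Cb, Cr) triplet. The function names are made up for illustration, and left-aligned chroma with plain linear interpolation is assumed.

```python
def upsample_422_left(chroma):
    """Linear 4:2:2 -> 4:4:4 for one chroma row, assuming left-aligned samples.

    Each stored sample sits on the even luma pixel; the odd pixel gets the
    rounded average of its two neighbours (the last sample is repeated).
    """
    out = []
    for i, c in enumerate(chroma):
        nxt = chroma[i + 1] if i + 1 < len(chroma) else c
        out.append(c)
        out.append((c + nxt + 1) // 2)
    return out


def yuy2_row_to_triplets(y_row, cb_row, cr_row):
    """Return one (Y, Cb, Cr) triplet per pixel, ready for a 3D LUT lookup."""
    cb_full = upsample_422_left(cb_row)
    cr_full = upsample_422_left(cr_row)
    return list(zip(y_row, cb_full, cr_full))


# 4 luma samples share only 2 Cb and 2 Cr samples in YUY2.
y = [16, 50, 100, 150]
cb = [128, 140]
cr = [128, 100]
triplets = yuy2_row_to_triplets(y, cb, cr)
```

For YV12 the same idea applies, with an extra vertical interpolation pass before this horizontal one.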
yesgrey
28th January 2009, 13:43
leeperry,
Could you also post a screenshot using ffdshow's conversion?
Thanks!
leeperry
28th January 2009, 14:45
here's ffdshow in RGB32/601 :
http://thumbnails3.imagebam.com/2490/fe9a1124894461.gif (http://www.imagebam.com/image/fe9a1124894461)
and in RGB32HQ/601 :
http://thumbnails9.imagebam.com/2490/6a0f6e24894469.gif (http://www.imagebam.com/image/6a0f6e24894469)
http://img132.imageshack.us/img132/1697/31370080kq2.png
the file sizes are pretty self-explanatory....the smearing red costs, and RGB32HQ=ConvertToRGB32()
also the black background is actually R0-G0-B0 in ddcc, but in Convert()(full range 601)/ffdshow(full range 601) it's R2-G0-B1 :confused:
the sample is still available here :
http://forum.doom9.org/showpost.php?p=1137196&postcount=1868
and here's some real world comparisons, top is Convert(709), bottom is ddcc(709) :
http://thumbnails16.imagebam.com/2490/f405bd24894487.gif (http://www.imagebam.com/image/f405bd24894487)http://thumbnails15.imagebam.com/2490/f137f324894509.gif (http://www.imagebam.com/image/f137f324894509)http://thumbnails10.imagebam.com/2490/054af024894527.gif (http://www.imagebam.com/image/054af024894527)http://thumbnails11.imagebam.com/2490/208cb824894559.gif (http://www.imagebam.com/image/208cb824894559)
http://thumbnails11.imagebam.com/2490/257a6b24894500.gif (http://www.imagebam.com/image/257a6b24894500)http://thumbnails2.imagebam.com/2490/922c4d24894517.gif (http://www.imagebam.com/image/922c4d24894517)http://thumbnails10.imagebam.com/2490/aa353524894543.gif (http://www.imagebam.com/image/aa353524894543)http://thumbnails14.imagebam.com/2490/635f1324894578.gif (http://www.imagebam.com/image/635f1324894578)
http://img262.imageshack.us/img262/4982/42859825xo0.png
the red shade is slightly different in ddcc, more saturated :confused:
yesgrey
28th January 2009, 15:38
as I told you in PM the 2 BT.709 LUT's are mixed up, the 0-255 is actually 16-235 ;)
The LUT names refer to the black and white levels in RGB.
The BT.709_0-255 is
Y:16-235 -> RGB: 0-255
and the BT.709_16-235 is
Y:16-235 -> RGB: 16-235
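As a rough illustration of the two LUT flavours, here is a Python sketch of a BT.709 Y'CbCr to R'G'B' conversion where only the RGB output range changes. This is a textbook formulation under the usual 8-bit assumptions (Y 16-235, Cb/Cr 16-240), not the code used to generate the LUT files; the function name and the full_range_rgb switch are hypothetical.

```python
KR, KB = 0.2126, 0.0722          # BT.709 luma coefficients
KG = 1.0 - KR - KB


def ycbcr709_to_rgb(y, cb, cr, full_range_rgb=True):
    """Convert one 8-bit limited-range Y'CbCr pixel to 8-bit R'G'B'."""
    yn = (y - 16) / 219.0          # 16..235 -> 0..1
    pb = (cb - 128) / 224.0        # 16..240 -> -0.5..0.5
    pr = (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - KR) * pr
    b = yn + 2.0 * (1.0 - KB) * pb
    g = (yn - KR * r - KB * b) / KG
    if full_range_rgb:
        scale, offset = 255.0, 0.0     # black -> 0, white -> 255
    else:
        scale, offset = 219.0, 16.0    # black -> 16, white -> 235
    out = []
    for v in (r, g, b):
        code = min(max(offset + scale * v, 0.0), 255.0)  # clip to 8 bits
        out.append(int(code + 0.5))                      # round to nearest
    return tuple(out)


black_full = ycbcr709_to_rgb(16, 128, 128, full_range_rgb=True)
black_tv = ycbcr709_to_rgb(16, 128, 128, full_range_rgb=False)
```

In both cases the input luma is 16-235; the "0-255" and "16-235" in the file names describe where black and white land in RGB.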
and RGB32HQ=ConvertToRGB32()
As it's supposed to be, the HQ mode uses Avisynth's code.
also the black background is actually R0-G0-B0 in ddcc, but in Convert()(full range 601)/ffdshow(full range 601) it's R2-G0-B1 :confused:
Probably some small error in the matrix coefficients...
leeperry
28th January 2009, 15:42
The LUT names refer to the black and white levels in RGB.
The BT.709_0-255 is
Y:16-235 -> RGB: 0-255
and the BT.709_16-235 is
Y:16-235 -> RGB: 16-235
well, I think you told me that PC/TV levels didn't matter for colorimetry matters...what I need is full range YUY2 to full range RGB32 conversion(in 601 & 709)
BTW I've definitely had to lower my sharpening in ffdshow one notch, that smearing red was tampering w/ the PQ apparently..
so here we go again :
Convert() 609 / ddcc() 609 / ffdshow 609 RGB32(not HQ, same as the ATi drivers in YV12/YUY2)
http://thumbnails12.imagebam.com/2491/2d15aa24903098.gif (http://www.imagebam.com/image/2d15aa24903098)http://thumbnails14.imagebam.com/2491/4a0ede24903105.gif (http://www.imagebam.com/image/4a0ede24903105)http://thumbnails11.imagebam.com/2491/dd153924903110.gif (http://www.imagebam.com/image/dd153924903110)
http://thumbnails15.imagebam.com/2491/c7b08524903114.gif (http://www.imagebam.com/image/c7b08524903114)http://thumbnails11.imagebam.com/2491/51812224903124.gif (http://www.imagebam.com/image/51812224903124)http://thumbnails14.imagebam.com/2491/ec623824903131.gif (http://www.imagebam.com/image/ec623824903131)
can't really notice less red blocking between Convert() and ddcc() on these screenshots, only that red is more saturated w/ ddcc.
mark0077
28th January 2009, 16:14
So how will the new ffdshow rgb32hq that is being developed compare to the final version of this? I assume by definition they should be exactly the same but will they?
leeperry
28th January 2009, 16:20
So how will the new ffdshow rgb32hq that is being developed compare to the final version of this?
you'll have to pray for some good soul to add tritical's chroma upsampling algorithm in ffdshow :o
BTW, if you care to run compares w/ the nvidia drivers...you're most welcome :cool:
mark0077
28th January 2009, 16:27
I will do the compare in about 2 hours when I'm home from work ;) will be interested to see which is better. Pretty sure this conversion to rgb32 in software is a great idea because of the difference between nvidia and ati, not to mention the diff between conversion with nvidia and different renderers (sometimes bad conversion with evr-cp).
I'll post the new pics up soon with my test pattern.
EDIT: Got the code working, no avisynth errors, but ffdshow seems to still do a conversion which means I can't test this :( Any workarounds...? ffdshow mustn't know that avisynth is outputting 32bit rgb?
Using ffdshow set to rgb32 output only, and full range and the following two sets of codes, but ffdshow is still doing some conversion
colorYUV(levels="tv->pc")
ConvertToYUY2()
rgb3dlut(lutfile="c:\3dlut_ycbcr_BT601_16-235.txt",itype=0)
and
colorYUV(levels="tv->pc")
ConvertToYUY2()
rgb3dlut(lutfile="c:\3dlut_ycbcr_BT601_0-255.txt",itype=0)
leeperry
28th January 2009, 23:42
colorYUV(levels="tv->pc")
ConvertToYUY2()
rgb3dlut(lutfile="c:\3dlut_ycbcr_BT601_16-235.txt",itype=0)
and
colorYUV(levels="tv->pc")
ConvertToYUY2()
rgb3dlut(lutfile="c:\3dlut_ycbcr_BT601_0-255.txt",itype=0)
you don't have to use colorYUV(), I do coz I got a longer script and I want GrainFactory3() to be processed on the full range video...
this SHOULD work :
ConvertToYUY2()
rgb3dlut(lutfile="c:\3dlut_ycbcr_BT601_0-255.txt",itype=1)
601 for SD, 709 for HD of course ;)
we spoke about it in PM, but if you only have RGB32 checked in ffdshow, ffdshow should bypass the Avisynth stream....I dunno what's up :confused:
instead of using Avisynth 2.5.8, you could try 2.5.7 + MT 0.7 on XP ? that's what I got....or as a last resort the alpha 2.60, which is already patched for MT.
try to set the RGB conversion of ffdshow to bogus levels, and see if it has any effect....this doesn't on my box :
http://www.image-load.eu/out.php/i142077_plop0.png
mark0077
29th January 2009, 00:48
Disabling ffdshows internal decoders fixed the problem, now avisynth seems to be doing the conversion...
Well I am kind of disappointed. Scaling up to 1920 x 1080, I tried
ffdshow rgb32hq and rgb3dlut with the following settings
ConvertToYUY2()
rgb3dlut(lutfile="c:\3dlut_ycbcr_BT601_0-255.txt",itype=1)
Both show the ugly scaling problem again at the edge of the letters in my test. The rgb3dlut screenshot with itype=1 is 2kb larger in file size than ffdshow's rgb32hq one. The Nvidia yv12 -> rgb32 is absolutely perfectly smooth again as usual.
DISCOVERY ! :P
Now I have a feeling why I am seeing these ugly artifacts on this test, and I may be wrong but I think my hunch might be right in this case. When I view my test as part of the entire DVD, it is the first title and the image is scaled to the correct aspect ratio, BUT when it's played alone, as a separate vob file, this same image is displayed much narrower.
I am wondering if this image I am using to test is originally in the aspect ratio
x pixels : y pixels
and when played as part of the DVD it is signalled to be scaled to a non x:y aspect ratio (as I think it should by just looking at the width of the fonts) to
(x*z) : y
I think this is what is happening, and this horizontal stretching in certain configurations is what is causing these weird little jaggy lines (all but nvidias own conversions). What can be done software wise to counteract this. I am trying to find a combination of ffdshow resize, before and after avisynth to see what nvidia might be doing, or what order they do things in.
Any input on this would be great, because I think this may be important. Could it be that the resizers that we / I use are better in yv12/yuy2, than rgb32? This is my hunch so far anyways.
yesgrey
29th January 2009, 01:15
but ffdshow seems to still do a conversion which means I can't test this :( Any workarounds...? ffdshow musn't know that avisynth is outputting 32bit rgb?
Are you using any filter after the avisynth filter in ffdshow?
Some of the filters require YV12 input, so, even if you are outputting RGB32 from the avisynth filter, ffdshow will convert back to YV12 to feed the filter, and then convert to RGB32 again.
leeperry
29th January 2009, 11:15
Now I have a feeling why I am seeing these ugly artifacts on this test, and I may be wrong but I think my hunch might be right in this case. When I view my test as part of the entire DVD, it is the first title and the image is scaled to the correct aspect ratio, BUT when it's played alone, as a separate vob file, this same image is displayed much narrower.
it's splitter/player/renderer dependent I guess.
I gave up on your test pattern coz it's SD, and needs to be upscaled....kinda ruins the point of checking how the chroma looks.
anyway here it is, in spline36 upscale to 1280*768, and zoomed at 400% :
http://www.image-load.eu/out.php/t142192_01.png (http://www.image-load.eu/out.php/i142192_01.png)
leeperry
29th January 2009, 11:33
@tritical : Haruhiko has implemented a new chroma upsampling algorithm in ffdshow :
http://forum.doom9.org/showpost.php?p=1243154&postcount=6366
the results look really good :
http://forum.doom9.org/showpost.php?p=1243161&postcount=6368
your input would be much appreciated :thanks:
tritical
29th January 2009, 13:28
Nice that it is 20% faster, but I don't see how the extra bit depth during the yv12->yuy2->yv24 conversions, using linear interpolation, could make any noticeable difference. During yv12->yuy2 you do 75/25 and 25/75 averaging. The most error you could incur while rounding to 8-bit result is 0.5 (only possible results are: 1.0,0.75,0.5,0.25,0.0, rounding .5 to 1.0 gives the largest error). During yuy2->yv24 you keep all of the u/v values you calculated during yv12->yuy2, and do 50/50 averaging to get the other half. Now if you happened to introduce .5 error into both pixel values during yv12->yuy2 you could end up with a difference of at most 1.0 in the new u/v values calculated during yuy2->yv24. A difference of +-1 in u/v is not noticeable, and is certainly not the cause behind the differences in mark0077's image.
The explanation above is also why I'm not keen to add yv12 support to rgb3dlut... I don't think it would be at all noticeably different than if you called converttoyuy2(). Certainly it won't be faster. The only way it would be visibly different is if it used a different interpolation method during yv12->yuy2 conversion, and, if a more complex upsampling method is used, I think it would be better to have it as a separate filter (so that yuy2 frames are fed to rgb3dlut). However, I did add a cubic interpolation mode to rgb3dlut for yuy2->yv24 conversion, but before I release it I'd like to do a blind test. If anyone has samples that would really show the difference between chroma upsampling algorithms please post them :thanks:.
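The rounding argument above can be checked by brute force. The Python sketch below models the two steps as described (75/25 averaging rounded to 8 bits, then 50/50 averaging rounded to 8 bits) and compares the result with the same interpolation done without intermediate rounding. It is a simplified model of the arithmetic only, not the code of any of the filters, and it looks at just one of the two symmetric 75/25 cases.

```python
from itertools import product


def two_step_8bit(a, b, c, d):
    """Interpolate with an 8-bit rounding after each step."""
    p = (3 * a + b + 2) // 4      # yv12 -> yuy2: 75/25 average, rounded
    q = (3 * c + d + 2) // 4
    return (p + q + 1) // 2       # yuy2 -> yv24: 50/50 average, rounded


def exact(a, b, c, d):
    """Same interpolation with no intermediate rounding at all."""
    return ((3 * a + b) / 4 + (3 * c + d) / 4) / 2


# The rounding error depends only on the low bits of the inputs, so a small
# value range is enough to hit the worst case.
max_err = max(abs(two_step_8bit(*v) - exact(*v))
              for v in product(range(8), repeat=4))
```

The worst case found this way is 1.0, which agrees with the bound given in the post.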
leeperry
29th January 2009, 13:50
A difference of +-1 in u/v is not noticeable, and is certainly not the cause behind the differences in mark0077's image.
it would appear that what you see in mark0077's compares is mostly due to bogus upscaling...so the results are to be taken with a grain of salt, as he has said himself here :
http://forum.doom9.org/showpost.php?p=1243013&postcount=156
If anyone has samples that would really show the difference between chroma upsampling algorithms please post them
it only shows on extreme test patterns, such as these red rolling end credits :
http://rapidshare.com/files/122925763/Bronz_s.mkv.html
if you zoom *a lot*, you can see that Haruhiko's new algorithm looks slightly better than ddcc :
http://forum.doom9.org/showpost.php?p=1243161&postcount=6368
but ddcc is already a major improvement over ConvertToRGB32() when zoomed....still it doesn't really show in real world practice :
http://forum.doom9.org/showpost.php?p=1242783&postcount=149
tritical
29th January 2009, 23:02
Well, on second thought I don't want to take the time for the test. Everyone can do their own tests and make up their own mind. ddcc v1.7 is on my site, changes:
+ added cubic interpolation option to rgb3dlut
+ added adobe 1998 chromaticity and linear gamma presets to ddcc
+ added yv12toyuy2 filter
yesgrey
29th January 2009, 23:35
tritical,
Once again...
:thanks:
leeperry
30th January 2009, 02:08
ddcc v1.7 is on my site, changes:
+ added cubic interpolation option to rgb3dlut
+ added adobe 1998 chromaticity and linear gamma presets to ddcc
+ added yv12toyuy2 filter
looking good, thanks!
so I ran a quick test on my o/c Q6600 :
yv12toyuy2(threads=4)
=750 fps
yv12toyuy2(itype=1,threads=4)
=1400 fps
ConvertToYUY2()
=1400 fps
is "itype 1" the same exact thing as what ConvertToYUY2() does?
and w/ rgb3dlut() :
yv12toyuy2(itype=1,threads=4)
rgb3dlut(lutfile="Y:\BT709_0-255.txt",itype=1,threads=4)
=550 fps
yv12toyuy2(itype=1,threads=4)
rgb3dlut(lutfile="Y:\BT709_0-255.txt",itype=2,threads=4)
=480 fps
the speed drop is hardly noticeable, my full script(w/ LSF+Grainfactory3) falls from 62 to 61.5 fps...I'll be doing visual comparisons tomorrow :p
madshi
30th January 2009, 10:14
Dear gamut experts. Unfortunately I'm not an expert in this area at all. But I've one question:
The Lumagen Radiance video processor offers gamut correction which seems to be comparable to what tritical's solution does. But there seems to be a limitation of the Radiance, which is described here:
http://www.avsforum.com/avs-vb/showpost.php?p=15033893&postcount=3101
http://www.avsforum.com/avs-vb/showpost.php?p=15045760&postcount=3111
http://www.avsforum.com/avs-vb/showpost.php?p=15126353&postcount=3130
Does the tritical gamut solution have the same limitation? If so, could it be improved to beat the Radiance? :D
yesgrey
30th January 2009, 11:03
Does the tritical gamut solution have the same limitation? If so, could it be improved to beat the Radiance? :D
Yes, currently it does, and probably it will remain like that.
Since I am creating a little program to create the 3D LUT files all the future developments should be done in this program, that will create the 3D LUT files to use with rgb3dlut. These are our (tritical and I) plans.
The final goal would be to create a color correction application with complete customization of the output to fit all display irregularities that we can measure. Just to name a few: gamma correction without any predefined function (using all the data points you measure from your display), and for each color channel; chromatic adaptation to several white points (color temperatures), because a display cannot maintain the same white point from 0 to 100 IRE, and usually the highest on/off contrast is not at 6500K; gamut correction at several levels, not only at one point as in other available solutions, with better handling of out-of-gamut colors. Etc...
With the 3D LUT, the main problem of adding all these corrections is solved: speed. It will always be the same speed, because it will be just mapping values. Now it is just a matter of coding the algorithms for performing all the corrections we want.;)
Just a final note: this will be free. The program will be released under GNU GPL.
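The "always the same speed" point can be sketched in a few lines of Python: whatever chain of corrections is baked into the table, applying it is one lookup plus interpolation per pixel. This is a generic 3D LUT with trilinear interpolation written for illustration only; build_lut, apply_lut and the 17-point grid are assumptions, and neither the file format nor the interpolation of rgb3dlut is reproduced here.

```python
def build_lut(transform, size=17):
    """Sample transform(r, g, b) on a size^3 grid covering 0..255."""
    step = 255.0 / (size - 1)
    return [[[transform(r * step, g * step, b * step)
              for b in range(size)]
             for g in range(size)]
            for r in range(size)]


def apply_lut(lut, rgb):
    """Trilinear interpolation between the 8 grid nodes around rgb."""
    size = len(lut)
    scale = (size - 1) / 255.0
    idx, frac = [], []
    for v in rgb:
        pos = v * scale
        i = min(int(pos), size - 2)    # keep i + 1 inside the table
        idx.append(i)
        frac.append(pos - i)
    out = [0.0, 0.0, 0.0]
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                w = ((frac[0] if dr else 1.0 - frac[0]) *
                     (frac[1] if dg else 1.0 - frac[1]) *
                     (frac[2] if db else 1.0 - frac[2]))
                node = lut[idx[0] + dr][idx[1] + dg][idx[2] + db]
                for ch in range(3):
                    out[ch] += w * node[ch]
    return tuple(out)


# Any correction chain can be baked in; this stand-in one is deliberately trivial.
demo_lut = build_lut(lambda r, g, b: (0.5 * r, g, 255.0 - b))
result = apply_lut(demo_lut, (100, 50, 25))
```

The cost of apply_lut does not depend on how complicated the function passed to build_lut is, only on the interpolation; the price is the table size and the interpolation error for corrections that are strongly non-linear between grid nodes.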
madshi
30th January 2009, 11:42
Yes, currently it does, and probably it will remain like that.
Since I am creating a little program to create the 3D LUT files all the future developments should be done in this program, that will create the 3D LUT files to use with rgb3dlut. These are our (tritical and I) plans.
The final goal would be to create the "ultimate" color correction application, a complete customization of the output to fit all display irregularities that we can measure. Just to name a few: gamma correction without any predefined function (using all the data points you measure from your display), and for each color channel; chromatic adaptation to several white points (color temperatures), because a display cannot maintain the same white point from 0 to 100 IRE, and usually the highest on/off contrast is not at 6500K; gamut correction at several levels, not only at one point as in other available solutions, with better handling of out-of-gamut colors. Etc...
With the 3D LUT, the main problem of adding all these corrections is solved: speed. It will always be the same speed, because it will be just mapping values. Now it is just a matter of coding the algorithms for performing all the corrections we want.;)
Just a final note: this will be free. The program will be released under GNU GPL.
Well, that sounds *really* good. I think the biggest problem may be on how to make all this potential functionality available in a way which is easy and intuitive to use? Ideally, I guess, your little program would contain a calibration "assistant" which would guide the consumer through a set of test screens (contained in and displayed by your program) and then ask for measurements of each of those test screens and then automatically do all necessary calculations based on those measurement results? If implemented in such a way, I guess even a calibration dummy (like me) would be able to get near to perfection, as long as good measurement hardware is used? Would that be "goodbye" to hiring ISF calibrators?
:)
yesgrey
30th January 2009, 12:10
Ideally, I guess, your little program would contain a calibration "assistant" which would guide the consumer through a set of test screens...
Well, I really haven't thought about it. Currently, my idea is just a simple console application (like eac3to;)) in which the user inputs a file with all the data, and the program outputs a 3D LUT file accordingly. Your idea is very good, and maybe it could end up as something like that, but I don't know if I will have the time and the skills for all of that... Let's just see how things progress... Currently there is already some free software for measuring and testing displays (HCFR, for example), maybe we can find a way to use it for collecting the data needed for creating the 3D LUT...
Would that be "goodbye" to hiring ISF calibrators?
To someone with a HTPC, yes. Maybe it's better keeping my identity secret...:D
madshi
30th January 2009, 12:25
Well, I really haven't thought about it. Currently, my idea is just a simple console application (like eac3to;)) in which the user inputs a file with all the data, and the program outputs a 3D LUT file accordingly. Your idea is very good, and maybe it could end up as something like that, but I don't know if I will have the time and the skills for all of that... Let's just see how things progress... Currently there is already some free software for measuring and testing displays (HCFR, for example), maybe we can find a way to use it for collecting the data needed for creating the 3D LUT...
I demand a perfect solution. NOW!!
Just joking. I was a little day dreaming, of course... :)
If you provided a command line tool which does all the dirty work which is necessary to realize all the fancy features you mentioned, then that would be a very awesome first step.
But it would be great, if your tool just needed a specific set of measurements and would do all the necessary calculations itself. Finding a way to provide those measurements comfortably should be easy enough to add for other people then (e.g. the HCFR guys). I'm just hoping that your tool won't require users to be calibration experts. Well, because I'm not... :o
To someone with a HTPC, yes. Maybe it's better keeping my identity secret...:D
:D
yesgrey
30th January 2009, 13:28
I'm just hoping that your tool won't require users to be calibration experts. Well, because I'm not...
It won't.:)
madshi
30th January 2009, 13:36
It won't.:)
Great! You might be responsible for holding me in the HTPC camp then. Was already thinking about going external media player. Now I have to reconsider... :rolleyes:
leeperry
30th January 2009, 13:44
To someone with a HTPC, yes. Maybe it's better keeping my identity secret
not quite, we're still a far cry from real ISF calibration.
not because of their software tools(Color.HCFR has actually more features than ColorFacts, like the saturation measurements which ColorFacts doesn't do at all)....but the real difference is the sensor.
home users will have an Eye One Display 2 colorimeter from X-Rite at best, which is as accurate as the Eye One Pro in *most* cases :
http://www.avsforum.com/avs-vb/showthread.php?p=9495885#post9495885 (oops pix have gone AWOL...)
it's almost as accurate as the i1pro on DLP projectors, but on plasma or SXRD...it's a whole different story.
ISF ppl use Minolta spectrophotometers, that are recalibrated every 6 months(mastering houses get their CRT's recalibrated on a weekly basis w/ these professional tools).
http://www.konicaminolta.com/instruments/products/display/index.html
when an i1d2 gives a ΔE of 3, god knows if it wouldn't be 1 or 15 w/ a true Minolta spectrophotometer...it'd be like having a SP-A800B pj, and calibrate it w/ a spyder 2 :D
yesgrey
30th January 2009, 14:01
You might be responsible for holding me in the HTPC camp then.
It's funny, because you, with eac3to, are also responsible for holding a lot of people in the HTPC camp...;)
leeperry
30th January 2009, 15:13
so here I am w/ comparisons :)
1)
colorYUV(levels="tv->pc")
yv12toyuy2(itype=1,threads=4)
rgb3dlut(lutfile="Y:\BT709_16-235.txt",itype=2,threads=4)
2)
colorYUV(levels="tv->pc")
yv12toyuy2(itype=2,threads=4)
rgb3dlut(lutfile="Y:\BT709_16-235.txt",itype=2,threads=4)
3)
colorYUV(levels="tv->pc")
convertToYUY2()
rgb3dlut(lutfile="Y:\BT709_16-235.txt",itype=2,threads=4)
4)
colorYUV(levels="tv->pc")
convertToYUY2()
rgb3dlut(lutfile="Y:\BT709_16-235.txt",itype=1,threads=4)
5)
colorYUV(levels="tv->pc")
+ ffdshow RGB32 709 full range
6)
colorYUV(levels="tv->pc")
+ ffdshow RGB32HQ 709 full range
7)
colorYUV(levels="tv->pc")
+ ffdshow RGB32HQ 709 full range / new test build from Haruhiko
http://www.image-load.eu/out.php/t142371_1.png (http://www.image-load.eu/out.php/i142371_1.png)http://www.image-load.eu/out.php/t142372_2.png (http://www.image-load.eu/out.php/i142372_2.png)http://www.image-load.eu/out.php/t142373_3.png (http://www.image-load.eu/out.php/i142373_3.png)
http://www.image-load.eu/out.php/t142374_4.png (http://www.image-load.eu/out.php/i142374_4.png)http://www.image-load.eu/out.php/t142375_5.png (http://www.image-load.eu/out.php/i142375_5.png)http://www.image-load.eu/out.php/t142376_6.png (http://www.image-load.eu/out.php/i142376_6.png)
http://www.image-load.eu/out.php/t142377_7.png (http://www.image-load.eu/out.php/i142377_7.png)
first comparison :
http://www.image-load.eu/out.php/t142378_orig.png (http://www.image-load.eu/out.php/i142378_orig.png)
zoomed at 400% :
http://www.image-load.eu/out.php/t142379_origx4.png (http://www.image-load.eu/out.php/i142379_origx4.png)
second comparison :
http://www.image-load.eu/out.php/t142381_2orig.png (http://www.image-load.eu/out.php/i142381_2orig.png)
zoomed at 400% :
http://www.image-load.eu/out.php/t142382_2origx4.png (http://www.image-load.eu/out.php/i142382_2origx4.png)
after careful examination, it seems to me that the sharpest and best looking results are w/ Haruhiko's new algorithm...any chance adding it in rgb3dlut please ? plus it works in 10 bits or so, so going yv12>rgb32 in more than 8 bits would avoid sloppy roundings and increase the LUT accuracy. Haruhiko talked about it here :
http://forum.doom9.org/showpost.php?p=1243635&postcount=6402
this is really über-nitpicking, though...feel free to throw rocks at me if you like http://forum-images.hardware.fr/images/perso/antp.gif
yesgrey
30th January 2009, 17:24
leeperry, how did you perform your tests? Are you using any sharpening? My tests show differences, but not as big as yours... I think the comparison must be done at the most raw level...
Here are my test results...
ffdshow new algorithm:
9365
yv12toyuy2 and rgb3dlut with itype=2, b=0, c=0.5
9365
Comparing both images, the ffdshow new algorithm has a slightly less sharp image, but the colors are more even (look at the capital J's).
There is also a strange thing... if you compare both images switching from one to another (I open both with Paint, and then switch using Alt+Tab), you can notice a little horizontal shift between them.
Maybe the slight loss of precision due to the yv12->yv24 conversion being performed in two steps with yv12toyuy2+rgb3dlut is responsible for that?
leeperry
30th January 2009, 17:46
nope, no sharpening.
remoulade(divx7) decoding > colorYUV() in HR/full range 32bits and that's it.
you can't trust VMR/EVR as they do postprocessing, HR does not.
I can't see your screenshots, maybe you could put them on imagebam?
anyway, we both agree that using Haruhiko's new code and doing the LUT thingie in 10 bits w/o any in-between 8 bits conversion could potentially yield more pleasing results...let's wait to know what tritical thinks about all this, and whether this would be technically achievable http://forum.slysoft.com/images/smilies/agreed.gif
leeperry
1st February 2009, 10:55
the funny thing is that if you save the 7 original screenshots to your PC, then quickly pass through them w/ the windows picture viewer....they all smear a bit, except the seventh.
in RGB32HQ stock, the background is R3/G1/B1 like all the other screenshots...but w/ the new experimental yv12 algorithm it's R2/G0/B0
this time I haven't messed up w/ the LUT's, and I used the exact same settings between stock ffdshow and the new Haruhiko's version...I only updated ffdshow :o
FoLLgoTT
1st February 2009, 11:25
in RGB32HQ stock, the background is R3/G1/B1 like all the other screenshots...but w/ the new experimental yv12 algorithm it's R2/G0/B0
The overall saturation is too high. Is this a basic problem of this algorithm or just a matter of implementation?
tritical
2nd February 2009, 06:42
There is also a strange thing... if you compare both images switching from one to another (I open both with Paint, and then switch using Alt+Tab), you can notice a little horizontal shift between them.
Maybe the slight loss of precision due to the yv12->yv24 conversion being performed in two steps with yv12toyuy2+rgb3dlut is responsible for that?
It's because the new ffdshow algorithm assumes centered chroma placement in yuy2, and not left aligned (which is the mpeg2 standard, and what rgb3dlut uses). As I explained before, the extra bits in the conversion during the 4:2:0 -> 4:4:4 interpolation steps will be absolutely unnoticeable (there are bigger differences between yv12toyuy2(itype=1) and converttoyuy2(), try to spot those :)). Now a change in assumed chroma position is a different matter. I have added chroma placement options to rgb3dlut/yv12toyuy2, which allow the user to select between all possible scenarios. In yuy2 there are only 2, centered or left aligned. In yv12 there are a few more, depending on interlaced vs non-interlaced and non-standard ways the conversion could be performed... which has been discussed in the past.
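The horizontal shift between the two screenshots follows directly from the assumed sample position. Below is a Python sketch of linear 4:2:2 -> 4:4:4 upsampling of one chroma row under each assumption; it illustrates the geometry only and is not the code of ffdshow or rgb3dlut (function names are made up, edges are handled by repeating the border sample).

```python
def upsample_left(c):
    """Chroma sample i is assumed to sit exactly on luma pixel 2i."""
    out = []
    for i, cur in enumerate(c):
        nxt = c[min(i + 1, len(c) - 1)]
        out += [cur, (cur + nxt + 1) // 2]
    return out


def upsample_centered(c):
    """Chroma sample i is assumed to sit halfway between luma pixels 2i and 2i+1."""
    out = []
    for i, cur in enumerate(c):
        prv = c[max(i - 1, 0)]
        nxt = c[min(i + 1, len(c) - 1)]
        # Both output pixels are 0.5 px from cur and 1.5 px from a neighbour,
        # hence the 75/25 weights.
        out += [(3 * cur + prv + 2) // 4, (3 * cur + nxt + 2) // 4]
    return out


edge = [100, 100, 200, 200]          # a chroma edge in the subsampled row
left = upsample_left(edge)           # [100, 100, 100, 150, 200, 200, 200, 200]
centered = upsample_centered(edge)   # [100, 100, 100, 125, 175, 200, 200, 200]
```

With the left-aligned assumption the edge crosses its midpoint value (150) at pixel 3; with the centered assumption it crosses between pixels 3 and 4, i.e. half a pixel further right. That is the kind of small shift visible when flipping between the two images.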
Jeremy Duncan
2nd February 2009, 07:01
It's because the new ffdshow algorithm assumes centered chroma placement in yuy2, and not left aligned (which is the mpeg2 standard, and what rgb3dlut uses). As I explained before, the extra bits in the conversion during the 4:2:0 -> 4:4:4 interpolation steps will be absolutely unnoticeable (there are bigger differences between yv12toyuy2(itype=1) and converttoyuy2(), try to spot those :)). Now a change in assumed chroma position is a different matter. I have added chroma placement options to rgb3dlut/yv12toyuy2, which allow the user to select between all possible scenarios. In yuy2 there are only 2, centered or left aligned. In yv12 there are a few more, depending on interlaced vs non-interlaced and non-standard ways the conversion could be performed... which has been discussed in the past.
- "left aligned (chroma) is the mpeg2 standard"
- "ffdshow uses centered aligned (chroma)"
- "you can spot a change in the assumed chroma"
My question to you, Tritical.
- Mpeg2 assumes left, and I use yv12.
- ffdshow output changes yv12 to rgb32 with (16-235 levels).
- Will the picture be degraded? because the yv12 being changed into rgb 32 is treated like it's centered chroma when in fact it's left chroma?
tritical
2nd February 2009, 07:18
My question to you, Tritical.
- Mpeg2 assumes left, and I use yv12.
- ffdshow output changes yv12 to rgb32 with (16-235 levels).
- Will the picture be degraded? because the yv12 being changed into rgb 32 is treated like it's centered chroma when in fact it's left chroma?
The thing is, just because the standard specifies left aligned 4:2:2 doesn't mean all encoders do that. Heck, avisynth's rgb->yuy2 conversion averages every two pixels... resulting in centered chroma (yet its yuy2->rgb conversion assumes left aligned, and averages to create the right pixel value, which will result in visible shifting if you chain enough conversions together). Xvid's old color conversion routines (circa 2003-2004, I have no idea if they are the same now) operated based on centered chroma. I assume there are commercial encoders that operate that way... so in these cases assuming centered chroma in the yuy2->rgb conversion will look better. That said... if you are watching video, and not zooming in on still frames, you probably wouldn't be able to notice the difference.
Also, when I mentioned yuy2 vs yv12 and number of positioning scenarios... for yuy2 I was talking only about horizontal positioning (what matters in the yuy2<->rgb conversion) and in yv12 I was talking only about vertical positioning (what matters in the yuy2<->yv12 conversions). Generally, progressive yuy2->yv12 vertical positioning is consistent. It's the interlaced yuy2->yv12 conversion that is more likely to result in non-standard chroma placement (for example: separating fields and then doing a progressive yv12->yuy2 conversion on each field results in non-standard placement). The nice thing about mpeg2 standard 4:2:0 interlaced vertical chroma positioning, is that if the content is actually progressive, but is downsampled using interlaced conversion, then you can use normal progressive upsampling on it because the two positionings are actually the same (the difference is how the values are created). If the positioning is non-standard though, then the progressive conversion's assumptions won't match up correctly.
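The "visible shifting if you chain enough conversions" remark can be modelled in Python. The sketch below only tracks chroma sample positions as described in the post: downsampling averages each pixel pair (centered), upsampling treats the sample as left aligned and averages to create the right-hand pixel. It is a simplified model with floating-point values and no colour matrix, not Avisynth's actual conversion code, and the function names are invented.

```python
def rgb_to_yuy2_chroma(row):
    """Downsample: average every two pixels (result is centered chroma)."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]


def yuy2_to_rgb_chroma(c):
    """Upsample: treat each sample as left aligned, average for the right pixel."""
    out = []
    for i, cur in enumerate(c):
        nxt = c[min(i + 1, len(c) - 1)]
        out += [cur, (cur + nxt) / 2]
    return out


def centroid(row):
    """Intensity-weighted mean position, used to measure the drift."""
    return sum(i * v for i, v in enumerate(row)) / sum(row)


row = [0.0] * 32
row[16] = row[17] = 1.0              # a small chroma blob centred at 16.5
centroids = [centroid(row)]
for _ in range(4):                   # four rgb -> yuy2 -> rgb round trips
    row = yuy2_to_rgb_chroma(rgb_to_yuy2_chroma(row))
    centroids.append(centroid(row))
# centroids is [16.5, 16.0, 15.5, 15.0, 14.5]
```

In this model each round trip moves the chroma half a pixel to the left (and blurs it a little), so a single conversion is hard to spot while a long chain is not.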
madshi
2nd February 2009, 09:09
It's because the new ffdshow algorithm assumes centered chroma placement in yuy2, and not left aligned (which is the mpeg2 standard, and what rgb3dlut uses).
Do you happen to know what the h264 and VC-1 specifications say about chroma placement?
tritical
2nd February 2009, 10:56
h.261,h.263,mpeg1 -> centered
mpeg2,mpeg4,h.264 -> left
Don't know about vc-1 for sure, but I would guess left.
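For anyone scripting per-codec profiles, the list above fits in a small lookup table. This Python sketch just restates the post; the dictionary and function names are made up, and VC-1 is deliberately left out because its placement is only a guess here.

```python
# Horizontal chroma placement per codec family, as listed in the post above.
CHROMA_PLACEMENT = {
    "h.261": "centered",
    "h.263": "centered",
    "mpeg1": "centered",
    "mpeg2": "left",
    "mpeg4": "left",
    "h.264": "left",
}


def chroma_placement(codec, default=None):
    """Return 'left' or 'centered', or default for codecs not in the list."""
    return CHROMA_PLACEMENT.get(codec.lower(), default)


placement_for_dvd = chroma_placement("MPEG2")
```

As noted elsewhere in the thread, the standard only says what encoders are supposed to do, so the table is a starting point rather than a guarantee for any particular file.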
madshi
2nd February 2009, 11:07
Pretty interesting - thanks!
leeperry
2nd February 2009, 11:30
I have added chroma placement options to rgb3dlut/yv12toyuy2
in ddcc 1.7? you mean itype? which one is left and which one is centered please?
there are bigger differences between yv12toyuy2(itype=1) and converttoyuy2(), try to spot those :)
haha, couldn't tell :D
I'm using yv12toyuy2(itype=1)/rgb3dlut(itype=2) and it looks good to me.
h.261,h.263,mpeg1 -> centered
mpeg2,mpeg4,h.264 -> left
Don't know about vc-1 for sure, but I would guess left.
ok, I'll try to make automatic profiles in ffdshow then :p
if you are watching video, and not zooming in on still frames, you probably wouldn't be able to notice the difference.
indeed, this is major nitpicking :devil:
anyway what I *can* see is that yesgrey3's REC601/709 matrixes are more accurate than ffdshow.
so we finally have very accurate decoding and conversion, this is too awesome :thanks:
leeperry
3rd February 2009, 03:33
I've got some h264/DTS MKV samples that make yv12toyuy2() crash ffdshow instantly (using HMS/divx7 decoder/sonic audio 4.2), whether with itype 1 or 2... itype 0 works fine, and so does ConvertToYUY2() :confused:
here's one of them :
http://www.megaupload.com/?d=4A4TRTNB
btw, look at the native gamut of the Epson TW5000 :
http://www.homecinema-fr.com/BE/TW5000/usine-cie.jpg
it's got a full CMS, BenQ/Epson are now offering it in their projectors...got to be a good sign ;)
tritical
4th February 2009, 21:58
in ddcc 1.7? you mean itype? which one is left and which one is centered please?
It's in 1.8, haven't released it yet. Still need to test it a little more, and need to check out your crashing report.
yesgrey
5th February 2009, 20:53
The first version of cr3dlut is finally ready. It is a program for creating a 3D LUT file to use with tritical's rgb3dlut. This first version is partially based on tritical's ddcc code, so the results should be very similar.
You can get both the program and source code at my web page. Here is the link:
http://yesgrey3.totalh.com/
It's not (yet) very user friendly, but I will improve the user-friendliness as I improve the program. Your comments and suggestions will be very important for that.;)
A special thanks to tritical for his rgb3dlut. Now I can implement and test several ideas I have about YCbCr->RGB conversion and color correction.
I would also like to thank leeperry for promptly testing our implementations, and for helping me correct some typos in the readme file.
leeperry
5th February 2009, 21:22
thanks yesgrey! I was busy doing stuff today, but now that I've RTFM several times I'll look into it tomorrow http://forum-images.hardware.fr/images/perso/d4buff.gif
Mug Funky
6th February 2009, 02:35
heywow!
3d luts in avisynth, 1 step closer to being quite practical :)
i'm just R'ingTFM now.
just thinking, probably the best balance of performance versus complexity might be in the way programs like lustre, scratch, resolve etc do it - a 16x16x16 LUT (in a text file) with processing and interpolation between these "cube points" happening on the GPU. i'm pretty confident most modern GPU's can handle it, and the advantage is you can go ape with the precision and it all happens at the output stage.
right now i'm just trying to figure out how to port the Arri DCI LUT for log-to-print emulation, so i can preview film scans in avisynth as they'd appear in a cinema.
[edit]
would it be advantageous to make the 3d lut file format such that it looks meaningful when loaded raw into an image editor? that way you could create a series of adjustments in gimp or photoshop or whatever, use them on a "flat" lut file, then save the result as a raw image that can be used as a lut in avisynth. does that make sense? this way i could drop the flat file into scratch, apply the arri lut, then render to a new pic that i can use.
canuckerfan
6th February 2009, 04:29
i feel a little lost in all this but would using yv12toyuy2() be advantageous in any way over avisynth's native converttoyuy2()?
tritical
6th February 2009, 05:31
@All
Put up ddcc v1.8 on my website... only change was adding 'cplace' parameter to yv12toyuy2 and rgb3dlut for specifying chroma placement.
@leeperry
The crash happened because I coded most of yv12toyuy2 for mod 4 height, but didn't put in the necessary error checking. Since itype=0 with progressive upsampling was the only combination that worked with mod 2 height, I decided to require all input be mod 4.
just thinking, probably the best balance of performance versus complexity might be in the way programs like lustre, scratch, resolve etc do it - a 16x16x16 LUT (in a text file) with processing and interpolation between these "cube points" happening on the GPU. i'm pretty confident most modern GPU's can handle it, and the advantage is you can go ape with the precision and it all happens at the output stage.
Well, I have no plans to port it to the gpu, but someone else might. In terms of programming for the cpu, 16x16x16 lut with interpolation would be more complicated. If the interpolation was something simple it might be faster, but the current method (with full 48MB table) is pretty quick on most recent computers. Working with 8-bit in -> 8-bit out you certainly can't get better precision than with the full table (precision of the mapping is limited by the program computing the table). Personally, I think it would be easier to write a separate program that converts other lut formats into the format rgb3dlut currently uses, and since that would be offline you could make it as complex as you want (interpolation methods, etc...). Plus, that way rgb3dlut is never the limiting factor, in terms of interpolation methods offered for smaller luts, etc...
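To make the trade-off concrete, here is a minimal Python sketch of the small-LUT approach: a 16x16x16 table with trilinear interpolation between the "cube points". It is not rgb3dlut's code; the function names, the node spacing (16 nodes at 0, 17, ..., 255) and the nested-list layout are assumptions chosen for illustration only. The two constants at the end just reproduce the table sizes being compared.

```python
def make_identity_lut(n=16):
    """Build an n*n*n LUT whose nodes map each RGB value to itself.
    Node k sits at k * 255/(n-1), so n=16 gives nodes at 0, 17, ..., 255."""
    step = 255.0 / (n - 1)
    return [[[(r * step, g * step, b * step) for b in range(n)]
             for g in range(n)] for r in range(n)]


def lookup_trilinear(lut, r, g, b):
    """Interpolate an 8-bit RGB triple between the 8 surrounding LUT nodes."""
    n = len(lut)
    scale = (n - 1) / 255.0

    def split(v):
        # Lower node index and fractional distance to the next node.
        pos = v * scale
        i = min(int(pos), n - 2)
        return i, pos - i

    (ri, rf), (gi, gf), (bi, bf) = split(r), split(g), split(b)
    out = [0.0, 0.0, 0.0]
    for dr, wr in ((0, 1 - rf), (1, rf)):
        for dg, wg in ((0, 1 - gf), (1, gf)):
            for db, wb in ((0, 1 - bf), (1, bf)):
                w = wr * wg * wb
                node = lut[ri + dr][gi + dg][bi + db]
                for c in range(3):
                    out[c] += w * node[c]
    return tuple(out)


# Size comparison for 8-bit output, 3 bytes per entry.
FULL_TABLE_BYTES = 256 ** 3 * 3    # the full table used by rgb3dlut (48 MB)
SMALL_TABLE_BYTES = 16 ** 3 * 3    # a 16x16x16 table
```

The full table needs one memory read per pixel, while the small table needs eight node reads plus the weighting arithmetic, which is the extra complexity on the cpu mentioned above.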
would it be advantageous to make the 3d lut file format such that it looks meaningful when loaded raw into an image editor? that way you could create a series of adjustments in gimp or photoshop or whatever, use them on a "flat" lut file, then save the result as a raw image that can be used as a lut in avisynth. does that make sense? this way i could drop the flat file into scratch, apply the arri lut, then render to a new pic that i can use.
What format are you thinking? Again, I think it would be easier just to have a separate program that converts from one format to another and leave rgb3dlut as it is.
i feel a little lost in all this but would using yv12toyuy2() be advantageous in any way over avisynth's native converttoyuy2()?
That question is like asking if bicubicresize() would be advantageous over bilinearresize(). Compared to converttoyuy2(), yv12toyuy2() simply offers more choices for interpolation function, and more choices for chroma placement. Are those differences large? Not really, people aren't that sensitive to chroma. As a test, I created an image in paint, and converted it to yv12 using converttoyv12(matrix="Rec709",interlaced=false). Then I converted it back to rgb using a number of methods: cpic-center.png (http://bengal.missouri.edu/~kes25c/cpic-center.png). For the images that are named "xxx,yyy", 'xxx' specifies the interpolation method for yv12->yuy2, and 'yyy' specifies the interpolation method for yuy2->yv24 (dup=duplicate,lin=linear,cub=cubic). Those images were converted using yv12toyuy2/rgb3dlut. Since avisynth's rgb->yuy2 conversion averages every two pixels, I used centered chroma placement in rgb3dlut. Here is the result using left aligned chroma placement for the yv12toyuy2/rgb3dlut images: cpic-left.png (http://bengal.missouri.edu/~kes25c/cpic-left.png). Are there differences among the conversions? Yep. Are they that big between linear and cubic? Not really.
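The placement point can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the yv12toyuy2/rgb3dlut code: it works on one row of a single channel, uses linear interpolation only, and the function names are made up. Averaging every two pixels produces chroma samples that sit midway between the pixels (centered), so upsampling with a left-aligned assumption shifts the result by half a pixel.

```python
def downsample_pairs(row):
    """Average each pair of pixels; each result sits midway between the pair."""
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row), 2)]


def upsample_linear(chroma, placement):
    """Linearly upsample chroma to twice the width.
    placement="left": chroma sample k is co-sited with pixel 2k.
    placement="center": chroma sample k sits at pixel position 2k + 0.5."""
    offset = 0.0 if placement == "left" else 0.5
    last = len(chroma) - 1
    out = []
    for x in range(2 * len(chroma)):
        # Position of pixel x measured in chroma-sample units, clamped to the row.
        p = min(max((x - offset) / 2.0, 0.0), float(last))
        i = min(int(p), last - 1) if last > 0 else 0
        f = p - i
        nxt = chroma[min(i + 1, last)]
        out.append(chroma[i] * (1 - f) + nxt * f)
    return out


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
sub = downsample_pairs(ramp)               # [5.0, 25.0, 45.0, 65.0]
centered = upsample_linear(sub, "center")  # interior matches the original ramp
left = upsample_linear(sub, "left")        # interior is shifted by half a pixel
```

On this ramp the centered assumption reproduces the interior pixels exactly, while the left-aligned assumption is off by half a pixel step, which is the kind of small difference visible in the comparison images.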
cyberbeing
6th February 2009, 06:29
I'm finding ddcc 1.8 nearly 30-50% slower than ddcc 1.7 which makes it unusable for me. This is a major problem considering that the 30-50% slowdown leaves me unable to view some 720p video in real-time on my AMD X2 computer with yv12toyuy2, rgb3dlut, and reclock. If just adding the chroma placement option is the only thing changed/fixed and is the entire cause of this slowdown, I'll just continue using 1.7.
Below are the options I use:
yv12toyuy2(itype=2, interlaced=false, threads=2, b=0.33, c=0.33)
rgb3dlut(itype=2, lutfile="C:\Program Files\AviSynth 2.5\bt709lut.txt", threads=2, b=0.33, c=0.33)
Edit: Oh and thank you yesgrey3 for the cr3dlut app, my initial impression is that it seems to work as designed.
canuckerfan
6th February 2009, 08:02
thanks for the explanation, tritical. the visuals helped:)
leeperry
6th February 2009, 08:30
@leeperry
The crash happened because I coded most of yv12toyuy2 for mod 4 height, but didn't put in the necessary error checking. Since itype=0 with progressive upsampling was the only combination that worked with mod 2 height, I decided to require all input be mod 4.
any chance you could make it fall back to itype=0 if it's not mod4? but well I'm not sure I could even see a diff between the itypes :o
anyway, I'll look into the new version! :thanks:
tritical
6th February 2009, 09:25
@cyberbeing
Is this faster: [removed]?
madshi
6th February 2009, 10:22
The first version of cr3dlut is finally ready. It is a program for creating a 3D LUT file to use with tritical's rgb3dlut. This first version is partially based on tritical's ddcc code, so the results should be very similar.
You can get both the program and source code at my web page. Here is the link:
http://yesgrey3.totalh.com/
It's not (yet) very user friendly, but I will improve the user-friendliness as I improve the program. Your comments and suggestions will be very important for that.;)
Thanks!
Some comments:
(1) There are some "*/" in the readme which probably aren't supposed to be there?
(2) The readme is very technical. I didn't understand half of it... :) Would it make sense to split the readme into two separate files: One for technical gurus, where all the funny details are explained. And one for every user where there are more explanations and less technical terms? I can't help much with double checking the technical stuff. But I could easily tell you which parts are difficult to understand for a noob like me. Some of the current text of the readme is only useful for programmers, but not for users, e.g. "The table is created such that the offset into the table" is not useful for users, but might be useful for programmers, I think. Such information could also be moved to a separate file.
(3) For some parameters you're listing a number of ITU/SMPTE/whatever specs. That's ok, but it doesn't help a noob like me at all. I'd need an explanation of what these specs are usually used for. E.g. for "Source_primaries" both options 1 and 3 contain the word "NTSC", so if I wanted to use the default source primaries for DVD NTSC discs, I wouldn't know whether to use 1 or 3. Also the word "ATSC" doesn't appear anywhere, so I wouldn't know which source primaries to use for ATSC, either. I think for each of these options the first line should explain what they are usually used for, and then maybe the ITU/SMPTE spec listings under that. Personally, I couldn't care less about ITU/SMPTE. If you ask me, I'd move the ITU/SMPTE numbers to the "technical readme" and remove them from the normal readme. I think for each option which requires different values depending on the source type, each of the following source types should be contained somewhere in one of the options: Blu-Ray/HD DVD, NTSC DVD, PAL DVD, NTSC SD broadcasts, ATSC HD broadcasts, PAL SD broadcasts, PAL HD broadcasts. So that we noob users know exactly which option is the right one to use.
(4) What does "GBR" stand for? I'd simply remove that and only write "RGB Input, no YCbCr->RGB conversion" there instead.
(5) I'd rename "YCbCr_Full_range" to "YCbCr_Input_Full_range" to make it clear that this is for *input*. It's clear enough if you think about it (after all output of 3dlut is always RGB, so YCbCr can only be the input and not the output), but adding "input" to the parameter name means you don't even need to use intelligent logic to find out that input is meant and not output.
(6) "RGB_BW" should be renamed to "RGB_Output_BW".
(7) Would it make sense to add an "RGB_Input_BW" option?
(8) "Chromatic_adaptation" definitely needs an explanation. I've no idea what this is good for and which value it should be set to under which circumstances.
(9) What should "Display_primaries" be set to for people who don't have this information about their display?
(10) Is there a direct correlation between different source types (e.g. PAL DVD, NTSC DVD, Blu-Ray, ...) and "Source_gamma"?
(11) What should "Display_gamma" be set to for people who don't have this information about their display?
(12) An explanation would be helpful about whether YCbCr and RGB data are usually linear or gamma corrected. E.g. are compressed sources usually gamma corrected (probably yes)? And if you transport YCbCr via HDMI, is that also gamma corrected or linear? And if you transport RGB via HDMI, is that also gamma corrected or linear? How about if you use HDMI 1.3 DeepColor. Is that still gamma corrected or linear?
(13) I'd like to have an option to output 16bit RGB instead of 8bit... ;)
(Of course my questions are not meant to be answered by you here in the forum. The intent of my questions is that they should be answered by the readme.)
cyberbeing
6th February 2009, 10:29
It seems like I underestimated before how slow the initial version of 1.8 was. Instead of 10-20% it was actually nearer to 30-50% slower than 1.7 when I retested and actually calculated the percentage instead of guessing. The new version you posted is much better but still slower than 1.7 by about 5-15%.
CPU graphs from left to right, |ddcc 1.7|original ddcc 1.8|new ddcc 1.8|
http://img17.imageshack.us/img17/6731/17st7.png http://img10.imageshack.us/img10/6216/old18iu9.png http://img15.imageshack.us/img15/9736/new18ew8.png
yesgrey
6th February 2009, 12:54
But I could easily tell you which parts are difficult to understand for a noob like me.
madshi,
Thank you very much for your excellent post! I need feedback like this, because it's the only way of improving things.
You should consider the actual readme as the technical readme.;)
I did not want to delay the release of the program much longer, so I only included the more technical material. To help the newbies, I've created 3 typical input files, for Blu-ray, dvd-pal and dvd-ntsc, but I know the instructions must be a lot more detailed and simpler.
As you know very well, only when people start using our software can we see what's good, what's bad, what's useful, what's useless...
(5) I'd rename "YCbCr_Full_range" to "YCbCr_Input_Full_range"
(6) "RGB_BW" should be renamed to "RGB_Output_BW".
I'm not so sure about this. I was thinking that it might be a good idea to allow the 3D LUT to work both ways:
YCbCr->RGB or RGB->YCbCr.
tritical, what do you think about it? Some people are reporting slightly better results when performing YCbCr->RGB with the 3D LUT; maybe RGB->YCbCr could also be more accurate with the 3D LUT?
I think it would be a good idea to start a new thread about rgb3dlut. There is now a new option available for Avisynth usage (using 3D LUTs), and the current thread name does not make that clear; some potential users could be missing it...
If you agree with using the 3dlut both ways, we have two options:
-rename rgb3dlut to a more generic name
-keep rgb3dlut as it is and create ycbcr3dlut or yuv3dlut (I prefer the former because it is the correct designation), etc. As you wish.
leeperry
6th February 2009, 14:40
It seems like I underestimated before how slow the initial version of 1.8 was.
on an o/c Q6600 :
ddcc 1.7
yv12toyuy2(itype=1,threads=4)
=1400 fps
rgb3dlut(lutfile="C:\BT709_16-235.txt",itype=2,threads=4)
=622 fps
ddcc 1.8
yv12toyuy2(itype=1,threads=4)
=1450 fps
rgb3dlut(lutfile="C:\BT709_16-235.txt",itype=2,threads=4)
=320 fps
madshi
6th February 2009, 16:55
You should consider the actual readme as the technical readme.;)
Ok, that readme is probably very good then... :)
I'm not so sure about this. I was thinking that it might be a good idea to allow the 3D LUT to work both ways:
YCbCr->RGB or RGB->YCbCr.
tritical, what do you think about it? Some people are reporting slightly better results when performing YCbCr->RGB with the 3D LUT; maybe RGB->YCbCr could also be more accurate with the 3D LUT?
I think it would be a good idea to start a new thread about rgb3dlut. There is now a new option available for Avisynth usage (using 3D LUTs), and the current thread name does not make that clear; some potential users could be missing it...
If you agree with using the 3dlut both ways, we have two options:
-rename rgb3dlut to a more generic name
-keep rgb3dlut as it is and create ycbcr3dlut or yuv3dlut (I prefer the former because it is the correct designation), etc. As you wish.
Well, if you go that way then you may also want to support YCbCr -> YCbCr. Also you may want to support RGB computer levels -> RGB video levels. In any case, if the options are not clear about whether they're supposed to affect input or output, there can be all kinds of misunderstandings. Just think about the ffdshow RGB controls which were backwards (are finally fixed) and about Haali's RGB levels control, which is also backwards. It's all because the options don't clearly say whether they're meant to control input or output levels. If you want to do RGB -> RGB then you probably have to offer RGB_BW options for both input and output. And if you want to support YCbCr -> YCbCr you probably have to offer full_range options for both input and output. So IMHO the options should be clearly separated for input and output.
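A small Python sketch shows why the direction has to be explicit. It assumes the usual 8-bit convention of 0-255 for computer levels and 16-235 for video levels; the function names are hypothetical and are not options of cr3dlut or rgb3dlut.

```python
def pc_to_video(v):
    """Map a computer-levels value (0-255) to video levels (16-235)."""
    return int(round(16 + v * 219.0 / 255.0))


def video_to_pc(v):
    """Map a video-levels value (16-235) to computer levels (0-255).
    Values below 16 or above 235 are clipped."""
    out = int(round((v - 16) * 255.0 / 219.0))
    return max(0, min(255, out))


# The two mappings are not interchangeable: applying the wrong one
# either crushes or washes out the picture.
black_white_video = (pc_to_video(0), pc_to_video(255))    # (16, 235)
black_white_pc = (video_to_pc(16), video_to_pc(235))      # (0, 255)
```

An option called only "RGB_BW" cannot tell the user which of these two mappings will be applied, whereas separate input and output settings can.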
yesgrey
6th February 2009, 17:09
So IMHO the options should be clearly separated for input and output.
Yes, you're right. I will change it for the next version.
Thanks
tritical
6th February 2009, 20:38
v1.9 should fix the speed issues... it was the compiler sucking at inlining some functions. Also, when I was explaining yv12toyuy2 vs converttoyuy2 I forgot to mention that converttoyuy2 is significantly faster (it has mmx/isse versions, whereas yv12toyuy2 is just written in c). So unless you really need the extra functionality of yv12toyuy2 I would recommend using converttoyuy2. I wrote yv12toyuy2 mainly for comparison purposes.
@yesgrey3
Adding rgb->yuy2, yuy2->yuy2, and yv12->yv12 support is a good idea. I would probably just rename rgb3dlut to 3dlut, and move it to its own dll at that point.
leeperry
6th February 2009, 20:46
v1.9 should fix the speed issues... it was the compiler sucking at inlining some functions.
indeed! on an o/c Q6600 :
ddcc 1.7
yv12toyuy2(itype=1,threads=4)
=1400 fps
rgb3dlut(lutfile="C:\BT709_16-235.txt",itype=2,threads=4)
=622 fps
ddcc 1.9
yv12toyuy2(itype=1,threads=4)
=1410 fps
rgb3dlut(lutfile="C:\BT709_16-235.txt",itype=2,threads=4)
=700 fps
OK now yv12toyuy2() doesn't crash if it's not mod4, it simply gives an error msg.
is there any potentially visible improvement using yv12toyuy2(itype=0) over ConvertToYUY2() ? or if you don't mind forcing it to use 0 if it's not mod4 :thanks:
yesgrey
6th February 2009, 23:48
Adding yuy2->yuy2, and yv12->yv12
What's the idea of this, changing the luma matrix coefficients and/or levels?
Please tell me when you are thinking of adding that, so I can set my priorities for cr3dlut development...;)
I would probably just rename rgb3dlut to 3dlut, and move it to its own dll at that point.
Yes, and perhaps that would also be the time to start a new thread about it...
Can a function name start with a number? I think that would be the best name, but I thought that it was not possible...
yesgrey
6th February 2009, 23:57
(13) I'd like to have an option to output 16bit RGB instead of 8bit... ;)
This would be very simple to add, but completely useless for now... You will need a video renderer that supports 16bit per component, and Avisynth also only supports 8bit per component. I've read something about Avisynth 3.0 supporting RGB45 (15bit per component), but I don't know if it's still being developed.
I also need to know the format of the 3D LUT. I could simply use the current format just changing the offset considering 16bit instead of 8bit, but I don't know if that would be the desired format...
I think you will have to wait a little more. For cr3dlut, it will be less than a day of work, but all the other stuff that we need could take a little bit longer...:(
leeperry
7th February 2009, 01:03
ah...yv12toyuy2(itype=0) used to work w/ non-mod4, but now it also gives an error msg....back to ConvertToYUY2() :o
madshi
7th February 2009, 12:46
This would be very simple to add, but completely useless for now... You will need a video renderer that supports 16bit per component, and Avisynth also only supports 8bit per component. I've read something about Avisynth 3.0 supporting RGB45 (15bit per component), but I don't know if it's still being developed.
I also need to know the format of the 3D LUT. I could simply use the current format just changing the offset considering 16bit instead of 8bit, but I don't know if that would be the desired format...
I think you will have to wait a little more. For cr3dlut, it will be less than a day of work, but all the other stuff that we need could take a little bit longer...:(
Here comes my suggestion:
The 3dlut files *have* to get a header. The header should contain at least the following information:
(1) signature, e.g. "3dlut"
(2) header size
(3) file format version number
(4) program which created this file (e.g. "cr3dlut")
(5) version of the program which created this file (Windows version information "a.b.c.d" = 4 words)
(6) input color space (RGB or YUY2 or something else)
(7) output color space (YUY2 or YCbCr or something else)
(8) input bitdepth (8bit or 16bit) - even if 16bit input doesn't seem to make sense right now
(9) output bitdepth (8bit or 16bit)
(10) detailed list of *every single* parameter used to create the file
(11) maybe some reserved fields for future use
The file format should never break. In the worst case a new "file format version number" could be used, e.g. if you added support for display primaries for different luminance levels or things like that. If you can create a file format which is already fit for all future extensions you plan to do that would be awesome, of course...
Let's just imagine someone decided to create a DirectShow filter doing gamut correction based on your 3dlut solution. That DirectShow filter should be able to find out whether a specific 3dlut file has the expected format. If it does not, the filter could just delete it and recreate it on the fly by using cr3dlut. The filter could offer the consumer a list of controls (e.g. display gamma and primaries) and further options like RGB output levels (video or computer) etc. The necessary 3dlut files could always be created on the fly before video playback is started. But such a logic would definitely require a clearly defined header for the 3dlut files. Finally, such a DirectShow filter could easily make use of 16bit RGB output, e.g. to feed a potential new Windows 7 16bit RGB renderer or to dither down the 16bit RGB 3dlut output to any desired RGB bitdepth. Saying that AviSynth doesn't support more than 8bit RGB yet and that there's no renderer for 16bit RGB yet feels a bit short sighted to me. Maybe Haali would be motivated to update his renderer to support Windows 7 16bit if there was a ready to use gamut correction solution outputting 16bit RGB? He won't add 16bit output if there's no argument for it, obviously. So *please* let's not play the chicken and egg game.
I think if you guys provided the necessary framework with exact specifications, that might increase motivation for other programmers to jump in and provide the missing pieces of the puzzle. Saying: "Maybe we will add this later" and "I don't know how the file format of 16bit output would look like" etc makes all your work feel like "it's not ready to be used by other programmers yet". That is likely to slow down adoption by other programmers...
IMHO the first thing you should do is create a file format for 3dlut. The 2nd thing to do would be to make cr3dlut create files in that format. That would be the minimum needed for other programmers to jump in and make use of it, I think.
yesgrey
7th February 2009, 14:52
Here comes my suggestion:
The 3dlut files *have* to get a header.
That's a very good suggestion, because with all the possible options that could be added to the 3dlut it would be very hard to know which will do what... I will start working on it.
The file format should never break...
If you can create a file format which is already fit for all future extensions you plan to do that would be awesome, of course...
For a 3dlut this is easy. The output will always be the same for the same output bit depths. For example, for the current 8bit version, the output will always be an array of 3x[256,256,256] 1 byte entries. What could change is just the way the output values are computed, nothing more.
So *please* let's not play the chicken and egg game.
Ok, I will put the egg.:D
I will add 16bit and possibly other output bit depths.
Let's hope it grows into a beautiful chicken...;)
@tritical,
I will start the specification of a header for the 3dlut files, based on madshi's suggestions. For the 3d lut files with 16bit output, are you ok with this format:
offset: ((v<<16)+(u<<8)+y)*3 2 bytes entries
offset: ((g<<16)+(b<<8)+r)*3 2 bytes entries
At that location, the associated rgb value should be stored in b, g, r order.
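Read literally, the proposal indexes the table in entries and each entry is two bytes wide. A minimal Python sketch of that addressing follows; the little-endian byte order and the function names are assumptions, since the post does not specify them.

```python
import struct


def entry_index(y, u, v):
    """Index (in entries, not bytes) of the first output component
    for input (y, u, v), following ((v<<16)+(u<<8)+y)*3."""
    return ((v << 16) + (u << 8) + y) * 3


def byte_offset(y, u, v, out_bits=16):
    """Byte offset into the data array for a given output bit depth."""
    return entry_index(y, u, v) * (out_bits // 8)


def read_bgr16(data, y, u, v):
    """Read the three 16-bit outputs stored in b, g, r order.
    Little-endian storage is an assumption, not part of the proposal."""
    return struct.unpack_from("<3H", data, byte_offset(y, u, v, 16))


# Tiny demonstration on a buffer holding only the first four table entries.
demo = bytearray(4 * 3 * 2)
struct.pack_into("<3H", demo, byte_offset(2, 0, 0), 1000, 2000, 3000)
b, g, r = read_bgr16(demo, 2, 0, 0)
```

The same indexing applies to the RGB-input variant with (r, b, g) in place of (y, u, v).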
Would it be better to consider only 8bit and 16bit output? It would be easier to code... what about performance?
With all the options we are considering for rgb3dlut, maybe it would also be better to start using some kind of compression for the lut files. If we need to keep several on our computers they will consume some hard disk space... any suggestion?
FoLLgoTT
7th February 2009, 18:37
There is a 10 bit per component mode in DirectShow called MEDIASUBTYPE_A2R10G10B10 (http://msdn.microsoft.com/en-us/library/dd407253(VS.85).aspx). Is it somehow possible to use that mode with VMR9 or other Renderers?
While searching some time ago I found nearly nothing about displaying graphics with more than 8 bit per component on the Windows platform.
yesgrey
7th February 2009, 21:31
There is a 10 bit per component mode in DirectShow called MEDIASUBTYPE_A2R10G10B10 (http://msdn.microsoft.com/en-us/library/dd407253(VS.85).aspx). Is it somehow possible to use that mode with VMR9 or other Renderers?
With VMR9 I don't think so, in the Video Mixing Renderer Subtypes (http://msdn.microsoft.com/en-us/library/dd407345(VS.85).aspx) it doesn't appear... With other renderers, I don't know.
While searching a time ago I found nearly nothing about displaying graphics with more than 8 bit per component on Windows platform.
Apparently only Windows7 will allow it.
I have read a user report in this forum that when using a 10bit lcd monitor with dvi connection the current windows version would show a 10bit per component graphic mode, but I don't know if it's true.
madshi
7th February 2009, 23:28
For a 3dlut this is easy. The output will always be the same for the same output bit depths. For example, for the current 8bit version, the output will always be an array of 3x[256,256,256] 1 byte entries. What could change is just the way the output values are computed, nothing more.
True for the data array. But how about the header? I would really want to have all the parameters in the header which were used to create the 3dlut data array. So if the parameter logic changes, the header might have to change, too. E.g. currently users can define only one set of primaries for the display, IIRC. Maybe some day you will allow primaries for different luminance levels. If you do that, the header may have to change, if you really want to put all the parameters in there which were used to create the 3dlut file.
Or maybe you could simply store the cr3dlut config text file in the header which was used to create the 3dlut file?
Ok, I will put the egg.:D
Great - thanks!!
I will add 16bit and possibly other output bit depths.
What other output bit depths would make sense? Maybe 8bit and 16bit is all we need? More than 16bit should be overkill. And any intermediate value between 8bit and 16bit could easily be calculated on the fly by the software (e.g. DirectShow filter) which does all the work. If I were to design a software (e.g. DirectShow filter) which "executes" gamut correction based on 3dlut I'd probably not even use 8bit 3dlut files at all. I'd only use 16bit 3dlut and then dither down to 8bit, if needed. I wouldn't really care much if that costs a few percent of performance...
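As a rough illustration of "dither down to 8bit", here is a Python sketch using simple one-dimensional error diffusion along a row. The choice of dithering method is an assumption made for the example (the post does not name one), and the function name is hypothetical.

```python
def dither_row_16_to_8(row16):
    """Reduce one row of 16-bit values to 8 bits, carrying the rounding
    error of each pixel over to the next one (1D error diffusion)."""
    out = []
    err = 0.0
    for v in row16:
        # 65535 / 255 = 257, so dividing by 257 maps 0..65535 onto 0..255.
        target = v / 257.0 + err
        q = max(0, min(255, int(round(target))))
        err = target - q
        out.append(q)
    return out


# A flat 16-bit level that falls between two 8-bit codes.
flat = [128 * 257 + 128] * 1000
dithered = dither_row_16_to_8(flat)
mean = sum(dithered) / len(dithered)
```

The output alternates between the two neighbouring 8-bit codes so that the average stays close to the 16-bit level, instead of every pixel being rounded to the same code.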
yesgrey
8th February 2009, 00:58
Or maybe you could simply store the cr3dlut config text file in the header which was used to create the 3dlut file?
Yes, I think this would be the best. If we open the 3dlut file in a text editor we can see the run parameter settings, and it will add only a small overhead...
What other output bit depths would make sense?
Well, for people who don't want to use dithering, it would be better having it in the display's native bit depth.
More than 16bit should be overkill.
16bit is already overkill...;)
yesgrey
8th February 2009, 01:55
Here is the first iteration for the definition of a file format for the 3D LUT:
struct
{
char sig[6];
int size, ver, biti, bito, cci, cco;
char pname[20];
int pver, sizerp, reserv1, reserv2;
} h3dlut;
/* 3D LUT file specification:
// Header
sig - File signature, must be: '3DLUT'
size - File header size in bytes
ver - File format version number
biti - Input bit depth per component (same for all three)
bito - Output bit depth per component (same for all three)
cci - Input color coding
0 - R'G'B'
> 0 - Y'CbCr - index to luma_matrix_coeffs
cco - Output color coding
0 - R'G'B'
> 0 - Y'CbCr - index to luma_matrix_coeffs
pname - Name of the program that created the file
pver - Version of the program that created the file
sizerp - Size in bytes of the array of char with a copy of the
run parameters settings used for creating the file
reserv1 - Reserved for future usage
reserv2 - Reserved for future usage
// Parameters Settings
sizerp bytes
// Data
3*((2^biti)^3)*bito/8 bytes
This is the general formula; in practice only the biti=8 version is usable
*/
Any suggestions/corrections are welcome.
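For anyone who wants to experiment with the layout, here is a Python sketch that writes and reads such a header with the struct module. Several things in it are assumptions and not part of the proposal: little-endian 32-bit ints, a packed layout with no padding (a C compiler would typically insert two padding bytes after sig[6] to align the following int, so a final spec should state this explicitly), ASCII text for the parameters, and format version 1. The function names are hypothetical.

```python
import struct

# sig[6], six ints, pname[20], four ints; "<" means little-endian, no padding.
HEADER_FMT = "<6s6i20s4i"
HEADER_SIZE = struct.calcsize(HEADER_FMT)


def pack_header(biti, bito, cci, cco, pname, pver, params_text):
    """Return header bytes followed by the run-parameter text."""
    params = params_text.encode("ascii")
    head = struct.pack(HEADER_FMT, b"3DLUT", HEADER_SIZE, 1,  # ver=1 is assumed
                       biti, bito, cci, cco,
                       pname.encode("ascii"), pver, len(params), 0, 0)
    return head + params


def unpack_header(blob):
    """Parse the header and the parameter text from the start of a file."""
    (sig, size, ver, biti, bito, cci, cco,
     pname, pver, sizerp, _r1, _r2) = struct.unpack_from(HEADER_FMT, blob, 0)
    if sig.rstrip(b"\0") != b"3DLUT":
        raise ValueError("not a 3DLUT file")
    params = blob[size:size + sizerp].decode("ascii")
    return {"ver": ver, "biti": biti, "bito": bito, "cci": cci, "cco": cco,
            "pname": pname.rstrip(b"\0").decode("ascii"), "pver": pver,
            "params": params, "data_offset": size + sizerp}


def data_size(biti, bito):
    """Size of the data section: 3*((2^biti)^3)*bito/8 bytes."""
    return 3 * ((2 ** biti) ** 3) * bito // 8
```

With these assumptions, data_size(8, 8) gives the 48MB table currently used by rgb3dlut, and data_size(8, 16) gives twice that for 16-bit output.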
@tritical,
If you agree, let me know so I can implement it and release the first version of the 3DLUT file format.
cyberbeing
8th February 2009, 09:10
Would there be any quality advantage of completely bypassing the video card LUT (i.e. leaving your 8bit display uncalibrated) and applying your calibrated gamma ramp instead through 3dlut before it does its other adjustments?
My limited understanding is, considering that most graphics cards only have a 8bit or 10bit LUT, doing the adjustment with 3dlut which is 16bit would result in less banding. Is this correct?
If so, could support for applying a calibrated gamma ramp (using something like the values exported from CalibrationTester) before the CMS compensations are done be added?
leeperry
8th February 2009, 10:56
Would there be any quality advantage of completely bypassing the video card LUT (i.e. leaving your 8bit display uncalibrated) and applying your calibrated gamma ramp instead through 3dlut before it does its other adjustments?
My limited understanding is, considering that most graphics cards only have a 8bit or 10bit LUT, doing the adjustment with 3dlut which is 16bit would result in less banding. Is this correct?
If so, could support for applying a calibrated gamma ramp (using something like the values exported from CalibrationTester) before the CMS compensations are done be added?
good point, even though I'm dubious about any banding improvement?!
ARGYLLCMS outputs this sort of 1D LUT for calibration :
NUMBER_OF_SETS 256
BEGIN_DATA
0.0000 0.083189 0.021229 0.023934
3.9216e-003 0.087305 0.047484 0.028051
7.8431e-003 0.091516 0.061404 0.032320
0.011765 0.095825 0.071292 0.036751
0.015686 0.10023 0.079941 0.041354
0.019608 0.10475 0.088256 0.046137
0.023529 0.10937 0.096318 0.051114
0.027451 0.11409 0.10400 0.056294
0.031373 0.11882 0.11121 0.061668
0.035294 0.12350 0.11782 0.067159
0.039216 0.12807 0.12387 0.072689
0.043137 0.13247 0.12948 0.078181
0.047059 0.13671 0.13473 0.083563
0.050980 0.14080 0.13966 0.088789
0.054902 0.14475 0.14437 0.093852
0.058824 0.14860 0.14889 0.098760
0.062745 0.15235 0.15323 0.10354
0.066667 0.15601 0.15747 0.10823
0.070588 0.15960 0.16162 0.11283
0.074510 0.16315 0.16566 0.11736
0.078431 0.16664 0.16961 0.12183
0.082353 0.17010 0.17345 0.12626
0.086275 0.17353 0.17722 0.13063
0.090196 0.17695 0.18092 0.13497
0.094118 0.18034 0.18457 0.13927
0.098039 0.18369 0.18818 0.14352
0.10196 0.18701 0.19175 0.14771
0.10588 0.19030 0.19534 0.15185
0.10980 0.19355 0.19889 0.15593
0.11373 0.19679 0.20245 0.15996
0.11765 0.20002 0.20604 0.16396
0.12157 0.20324 0.20968 0.16794
0.12549 0.20647 0.21337 0.17190
0.12941 0.20970 0.21707 0.17586
0.13333 0.21293 0.22077 0.17980
0.13725 0.21616 0.22444 0.18373
0.14118 0.21938 0.22807 0.18765
0.14510 0.22257 0.23167 0.19155
0.14902 0.22575 0.23524 0.19542
0.15294 0.22891 0.23872 0.19925
0.15686 0.23207 0.24214 0.20306
0.16078 0.23521 0.24550 0.20685
0.16471 0.23833 0.24883 0.21061
0.16863 0.24143 0.25216 0.21436
0.17255 0.24450 0.25548 0.21808
0.17647 0.24758 0.25880 0.22179
0.18039 0.25068 0.26216 0.22546
0.18431 0.25379 0.26554 0.22912
0.18824 0.25693 0.26891 0.23277
0.19216 0.26008 0.27230 0.23641
0.19608 0.26324 0.27569 0.24003
0.20000 0.26642 0.27910 0.24366
0.20392 0.26961 0.28251 0.24731
0.20784 0.27282 0.28593 0.25096
0.21176 0.27602 0.28935 0.25463
0.21569 0.27921 0.29280 0.25832
0.21961 0.28238 0.29625 0.26200
0.22353 0.28554 0.29964 0.26570
0.22745 0.28867 0.30297 0.26940
0.23137 0.29177 0.30625 0.27309
0.23529 0.29482 0.30950 0.27678
0.23922 0.29783 0.31273 0.28045
0.24314 0.30081 0.31594 0.28412
0.24706 0.30375 0.31910 0.28778
0.25098 0.30666 0.32227 0.29146
0.25490 0.30954 0.32544 0.29513
0.25882 0.31239 0.32864 0.29879
0.26275 0.31521 0.33185 0.30242
0.26667 0.31801 0.33510 0.30606
0.27059 0.32080 0.33837 0.30971
0.27451 0.32357 0.34171 0.31334
0.27843 0.32633 0.34507 0.31698
0.28235 0.32909 0.34844 0.32062
0.28627 0.33184 0.35180 0.32425
0.29020 0.33460 0.35515 0.32787
0.29412 0.33736 0.35850 0.33149
0.29804 0.34013 0.36182 0.33510
0.30196 0.34292 0.36512 0.33870
0.30588 0.34573 0.36840 0.34229
0.30980 0.34856 0.37163 0.34587
0.31373 0.35141 0.37483 0.34944
0.31765 0.35430 0.37801 0.35302
0.32157 0.35723 0.38120 0.35660
0.32549 0.36019 0.38439 0.36018
0.32941 0.36319 0.38759 0.36376
0.33333 0.36622 0.39079 0.36735
0.33725 0.36926 0.39400 0.37095
0.34118 0.37232 0.39727 0.37455
0.34510 0.37539 0.40062 0.37815
0.34902 0.37847 0.40404 0.38178
0.35294 0.38152 0.40742 0.38542
0.35686 0.38457 0.41073 0.38906
0.36078 0.38759 0.41400 0.39271
0.36471 0.39061 0.41727 0.39636
0.36863 0.39361 0.42054 0.40003
0.37255 0.39661 0.42379 0.40371
0.37647 0.39957 0.42703 0.40738
0.38039 0.40251 0.43029 0.41101
0.38431 0.40542 0.43356 0.41463
0.38824 0.40834 0.43686 0.41826
0.39216 0.41125 0.44017 0.42191
0.39608 0.41415 0.44352 0.42554
0.40000 0.41704 0.44689 0.42916
0.40392 0.41992 0.45025 0.43277
0.40784 0.42281 0.45363 0.43636
0.41176 0.42569 0.45704 0.43995
0.41569 0.42858 0.46047 0.44353
0.41961 0.43148 0.46390 0.44712
0.42353 0.43438 0.46733 0.45071
0.42745 0.43729 0.47076 0.45430
0.43137 0.44021 0.47416 0.45790
0.43529 0.44316 0.47755 0.46154
0.43922 0.44611 0.48092 0.46519
0.44314 0.44907 0.48427 0.46886
0.44706 0.45203 0.48760 0.47255
0.45098 0.45500 0.49092 0.47624
0.45490 0.45800 0.49422 0.47994
0.45882 0.46104 0.49752 0.48364
0.46275 0.46410 0.50088 0.48734
0.46667 0.46717 0.50426 0.49102
0.47059 0.47023 0.50763 0.49470
0.47451 0.47329 0.51097 0.49837
0.47843 0.47633 0.51429 0.50204
0.48235 0.47937 0.51760 0.50570
0.48627 0.48239 0.52091 0.50936
0.49020 0.48540 0.52423 0.51301
0.49412 0.48838 0.52756 0.51662
0.49804 0.49131 0.53090 0.52021
0.50196 0.49420 0.53424 0.52377
0.50588 0.49704 0.53761 0.52730
0.50980 0.49984 0.54101 0.53079
0.51373 0.50260 0.54441 0.53426
0.51765 0.50535 0.54780 0.53770
0.52157 0.50809 0.55119 0.54112
0.52549 0.51083 0.55459 0.54452
0.52941 0.51358 0.55797 0.54792
0.53333 0.51637 0.56136 0.55132
0.53725 0.51917 0.56478 0.55473
0.54118 0.52201 0.56821 0.55817
0.54510 0.52486 0.57156 0.56162
0.54902 0.52773 0.57486 0.56510
0.55294 0.53063 0.57815 0.56863
0.55686 0.53355 0.58143 0.57219
0.56078 0.53649 0.58470 0.57579
0.56471 0.53943 0.58796 0.57941
0.56863 0.54237 0.59122 0.58308
0.57255 0.54532 0.59449 0.58680
0.57647 0.54827 0.59776 0.59054
0.58039 0.55119 0.60104 0.59430
0.58431 0.55408 0.60432 0.59807
0.58824 0.55695 0.60763 0.60185
0.59216 0.55978 0.61094 0.60565
0.59608 0.56259 0.61424 0.60945
0.60000 0.56539 0.61753 0.61327
0.60392 0.56820 0.62085 0.61709
0.60784 0.57102 0.62419 0.62091
0.61176 0.57386 0.62754 0.62471
0.61569 0.57671 0.63089 0.62851
0.61961 0.57960 0.63422 0.63229
0.62353 0.58252 0.63753 0.63606
0.62745 0.58548 0.64084 0.63981
0.63137 0.58849 0.64420 0.64354
0.63529 0.59153 0.64767 0.64726
0.63922 0.59461 0.65129 0.65096
0.64314 0.59772 0.65505 0.65466
0.64706 0.60087 0.65889 0.65832
0.65098 0.60405 0.66261 0.66193
0.65490 0.60723 0.66619 0.66547
0.65882 0.61039 0.66964 0.66895
0.66275 0.61351 0.67301 0.67239
0.66667 0.61660 0.67637 0.67586
0.67059 0.61968 0.67973 0.67937
0.67451 0.62277 0.68309 0.68291
0.67843 0.62587 0.68643 0.68648
0.68235 0.62898 0.68974 0.69004
0.68627 0.63209 0.69302 0.69359
0.69020 0.63519 0.69628 0.69712
0.69412 0.63827 0.69951 0.70066
0.69804 0.64136 0.70272 0.70421
0.70196 0.64446 0.70594 0.70777
0.70588 0.64758 0.70915 0.71135
0.70980 0.65070 0.71237 0.71493
0.71373 0.65383 0.71561 0.71852
0.71765 0.65695 0.71883 0.72214
0.72157 0.66009 0.72207 0.72578
0.72549 0.66323 0.72529 0.72943
0.72941 0.66636 0.72850 0.73310
0.73333 0.66948 0.73172 0.73678
0.73725 0.67259 0.73496 0.74048
0.74118 0.67569 0.73824 0.74418
0.74510 0.67880 0.74155 0.74790
0.74902 0.68191 0.74488 0.75164
0.75294 0.68502 0.74823 0.75539
0.75686 0.68812 0.75159 0.75914
0.76078 0.69120 0.75499 0.76290
0.76471 0.69427 0.75840 0.76668
0.76863 0.69734 0.76185 0.77046
0.77255 0.70040 0.76535 0.77424
0.77647 0.70346 0.76888 0.77803
0.78039 0.70653 0.77243 0.78182
0.78431 0.70960 0.77603 0.78563
0.78824 0.71269 0.77971 0.78946
0.79216 0.71580 0.78351 0.79329
0.79608 0.71890 0.78727 0.79712
0.80000 0.72198 0.79095 0.80093
0.80392 0.72505 0.79458 0.80476
0.80784 0.72814 0.79821 0.80863
0.81176 0.73124 0.80183 0.81250
0.81569 0.73433 0.80544 0.81638
0.81961 0.73743 0.80903 0.82027
0.82353 0.74054 0.81259 0.82416
0.82745 0.74367 0.81613 0.82806
0.83137 0.74680 0.81968 0.83195
0.83529 0.74994 0.82325 0.83583
0.83922 0.75307 0.82684 0.83970
0.84314 0.75619 0.83043 0.84356
0.84706 0.75930 0.83406 0.84741
0.85098 0.76242 0.83776 0.85126
0.85490 0.76555 0.84142 0.85511
0.85882 0.76867 0.84502 0.85896
0.86275 0.77178 0.84862 0.86283
0.86667 0.77489 0.85219 0.86669
0.87059 0.77799 0.85573 0.87054
0.87451 0.78109 0.85927 0.87438
0.87843 0.78420 0.86280 0.87821
0.88235 0.78733 0.86635 0.88202
0.88627 0.79048 0.86992 0.88581
0.89020 0.79364 0.87344 0.88959
0.89412 0.79680 0.87696 0.89338
0.89804 0.79998 0.88047 0.89716
0.90196 0.80316 0.88399 0.90093
0.90588 0.80634 0.88752 0.90468
0.90980 0.80953 0.89105 0.90842
0.91373 0.81271 0.89456 0.91217
0.91765 0.81589 0.89809 0.91591
0.92157 0.81906 0.90164 0.91964
0.92549 0.82222 0.90518 0.92335
0.92941 0.82538 0.90873 0.92704
0.93333 0.82854 0.91228 0.93072
0.93725 0.83169 0.91582 0.93440
0.94118 0.83484 0.91938 0.93809
0.94510 0.83801 0.92293 0.94178
0.94902 0.84119 0.92647 0.94547
0.95294 0.84438 0.93002 0.94916
0.95686 0.84757 0.93356 0.95285
0.96078 0.85077 0.93710 0.95654
0.96471 0.85398 0.94065 0.96023
0.96863 0.85720 0.94419 0.96392
0.97255 0.86041 0.94773 0.96760
0.97647 0.86363 0.95128 0.97127
0.98039 0.86683 0.95483 0.97493
0.98431 0.87004 0.95838 0.97858
0.98824 0.87324 0.96194 0.98222
0.99216 0.87643 0.96550 0.98586
0.99608 0.87962 0.96907 0.98948
1.0000 0.88281 0.97263 0.99310
END_DATA
CAL
its author is all for GPL, and very helpful if needed.
but this graphic card 1D LUT is also used on the desktop, so you get D65 calibration in picture viewers and games....plus this is done in 10 bits, ARGYLLCMS has a tool to check the LUT accuracy.
considering the TMDS of the HDMI/DVI outputs will be encoded in RGB24 anyhow, and that the 3D LUT computing is done in 64bit floating point per component, I don't see it getting any more accurate and show *VISIBLE* improvement :o
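To put numbers on the 8bit vs 10bit LUT question, here is a small C sketch that parses rows of the table above and rounds them to integer LUT entries. The names `CalRow`, `parse_cal_row` and `quantize` are made up for illustration, and the code assumes each row is "input R G B" normalised to 0..1; it is not ArgyllCMS code.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the calibration table: input level followed by the
   R, G, B output levels, all normalised to 0.0 .. 1.0 (assumed layout). */
typedef struct { double in, r, g, b; } CalRow;

/* Parse one whitespace-separated row; returns 1 on success, 0 otherwise. */
int parse_cal_row(const char *line, CalRow *row)
{
    return sscanf(line, "%lf %lf %lf %lf",
                  &row->in, &row->r, &row->g, &row->b) == 4;
}

/* Round a normalised value to an integer LUT entry of the given bit depth. */
unsigned quantize(double v, unsigned bits)
{
    assert(bits >= 1 && bits <= 16);
    unsigned max = (1u << bits) - 1u;
    if (v <= 0.0) return 0;
    if (v >= 1.0) return max;
    return (unsigned)(v * max + 0.5);
}
```

Feeding it the rows for inputs 0.98824 and 0.99216, the red values 0.87324 and 0.87643 both round to 223 at 8 bits, but to 893 and 897 at 10 bits, so a 10bit table keeps steps that an 8bit table merges.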
yesgrey
8th February 2009, 12:13
Would there be any quality advantage of completely bypassing the video card LUT (i.e. leaving your 8bit display uncalibrated) and applying your calibrated gamma ramp instead through 3dlut before it does its other adjustments?
IMHO for video it would be like this:
-3D LUT 8bit and GC LUT 8bit: do all processing in 3D LUT
-3D LUT 8bit and GC LUT 10bit: do all processing in 3D LUT except the gamma correction to fit the display's gamma, which should be done in the GC LUT since it has higher accuracy
-3D LUT 16bit dithered to 8bit: do all processing in 3D LUT.
The last option is not available, and I don't know if it will ever be.
If so, could support for applying a calibrated gamma ramp (using something like the values exported from CalibrationTester) before the CMS compensations are done be added?
This is part of my plans, but in fact the gamma correction would be part of the CMS, as the last step. I'm already working on it, but I'm waiting for feedback about the 3DLUT file format specification.
yesgrey
8th February 2009, 12:20
but this graphic card 1D LUT is also used on the desktop, so you get D65 calibration in picture viewers and games....
Yes. The 3D LUT we are talking about can only be used for videos and photos, and only if you use a program that supports it (currently only an AviSynth script, directly or via ffdshow support). For the desktop, you are better off with the GC 1D LUT.
leeperry
8th February 2009, 12:24
Yes. The 3D LUT we are talking about can only be used for videos and photos, and only if you use a program that supports it (currently only an AviSynth script, directly or via ffdshow support). For the desktop, you are better off with the GC 1D LUT.
well we could use the GC 1D LUT for the desktop, and OVERLAY for videos(as it doesn't care for the GC LUT) but :
-I don't wanna use Overlay, HR is far smoother and free of video drivers sharpening blabla
-the GC LUT is 10 bits, it won't create more banding than needed by the calibration IMHO
BTW I'm being lazy as my gamut config works perfectly fine as it is, just that I have to use one YUY2 LUT for RGB conversion + one RGB LUT for gamut conversion....I will try to look into creating an all-in-one LUT w/ your app later today :p
cyberbeing
8th February 2009, 13:09
3D LUT 16bit dithered to 8bit: do all processing in 3D LUT.
The last option is not available, and I don't know if it will ever be.
That last option is what I was thinking of. What is the reason against implementing this? Would it be too slow? I was under the assumption that rgb3dlut already had this capability and all it would need was a specially created lut file to do this.
leeperry
8th February 2009, 13:12
That last option is what I was thinking of. What is the reason against implementing this? Would it be too slow?
the LUT would be HUGE, and the visible improvement nonexistent? it's already done in 64 bits float, and you're gonna have to wait forever before display drivers actually support 10 bits.
OTOH maybe 10 bits could be implemented for future-proofing's sake?
yesgrey
8th February 2009, 14:28
That last option is what I was thinking of. What is the reason against implementing this? Would it be too slow? I was under the assumption that rgb3dlut already had this capability and all it would need was a specially created lut file to do this.
I will add the 16 bit output 3D LUT very soon, so it will only depend on whether tritical wants to add dithering to 8bit in rgb3dlut. It will be slower than without dithering, but how much slower will essentially depend on the dither algorithm used... Floyd-Steinberg could be the best balance between quality and speed. From the dithering wiki-page (http://en.wikipedia.org/wiki/Dithering) the Stucki method looks the best to me, but probably it would be too slow and/or too complicated to add...
the LUT would be HUGE
The LUT would be HUGE for 16 bit input. For 8bit input/16 bit output it will be 96MB, still pretty usable...
it's already done in 64 bits float
That's the internal computing resolution, the output resolution will always be a limiting factor...;)
So it should be: "16 bit" better than "16bit dithered to 8bit" better than "8bit".
leeperry
8th February 2009, 14:40
the output resolution will always be a limiting factor
true, but all we got now is RGB24 TMDS anyhow.
maybe in a few years if we've not been naughty, we'll get 10 bits...so adding 10 bits/10 bits dither would be more useful IMHO.
of course we'll have to buy spanking new HDMI 1.3 displays.....personally I've never seen any banding using your algorithms in 8 bits, so I'm not too worried :cool:
and before we get >8 bits native video sources, it will take a LONG while.
cyberbeing
8th February 2009, 14:57
so it will only depend on whether tritical wants to add dithering to 8bit in rgb3dlut.
You heard the man, so tritical, any interest in adding 16bit-->8bit dithering in rgb3dlut?
the LUT would be HUGE
The LUT would be HUGE for 16 bit input. For 8bit input/16 bit output it will be 96MB, still pretty usable...
Photoshop does all processing with 16bit per component and 128bit floating point precision via lut tables. Out of curiosity how do you assume Photoshop does this without taking up massive diskspace or memory with 16bit and 32bit input images?
Is there some better alternative method which is being overlooked? Do you think Photoshop generates a partial lut table wherever needed, on the fly? Something else? Too slow for your uses?
leeperry
8th February 2009, 15:05
Photoshop does all processing with 16bit per component and 128bit floating point precision via lut tables.
I demand 1024bit accuracy :p j/k ;)
pshop uses ICC v4 files for softproofing AFAIK
cyberbeing
8th February 2009, 15:18
I demand 1024bit accuracy :p j/k ;)
Duotriguple Precision??? :scared:
pshop uses ICC v4 files for softproofing AFAIK
If that is the case, and it somehow makes the situation better, why aren't we using 16bit ICC profiles in ddcc?
tetsuo55
8th February 2009, 15:49
Awesome news.
Just want to add my view.
-The default Lut's should be 16bit per component, lower bits are optional(for slower systems)
-All internal processing should be 64bit or higher
-There should be a choice between dithering: none, fast, balanced, slow (increasing in quality over speed)
-The output bitdepth should be selectable: 6,8,10,12,16(I'm not sure if 6bit is possible and if it helps those 6bit lcd panels at all)
Once this all works we will need a proof of concept.
-A player that supports 16bit
-A renderer that supports 16bit
-A 16bit sample
Once we have all this in place we can start demanding support (the egg is in place)
-Write all the videocard developers that we want:
*10/12/16bit per component output support
*HDMI1.3C on their videocards
*We want updated drivers supporting this for current hardware
-Write all videocard reseller brands(like asus, msi) the same email.
-Write all display manufacturers the same letter:
*16 bit and lower per component input
*Native 16bit panels
*Driver updates for current 10/12 bit panels
-File support tickets with the oem companies that you cannot use this 16bit setup because the system they sold you does not support it :(
-Ask the oem companies for 16bit capable hardware and get disappointed when they don't sell it
We might have to push Microsoft too..
PS.
I know there is no commercial >8 bit content at this time. Maybe a big wave like the one above can spark more interest.
yesgrey
8th February 2009, 17:48
before we get >8 bits native video sources, it will take a LONG while.
For >8bit sources the 3DLUT will not be usable...
Input/output: 3D LUT size
8bit/8bit: 48MB
8bit/16bit: 96MB
9bit/9bit: 432MB
10bit/10bit: 3840MB
But since the sources should be kept at 8bit...
You heard the man, so tritical, any interest in adding 16bit-->8bit dithering in rgb3dlut?
It doesn't have to be tritical doing it... the source code is available, so anyone who wishes to can do it. But I think it would be better if tritical did it... let's hope he agrees and has the time for it.;)
Photoshop does all processing with 16bit per component and 128bit floating point precision via lut tables. Out of curiosity how do you assume Photoshop does this without taking up massive diskspace or memory with 16bit and 32bit input images?
Photoshop should be using the lut only for the gamma correction, and for that we only need 1D LUTs - each component only depends on itself.
The size formula for 3 1D LUT is: 3*(2^biti)*bito/8 bytes
So, for a 16bit in/16bit out the size is only: 384kB
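The two size formulas can be checked with a few lines of C. This is only a sketch: the function names are invented here, and the 3D formula assumes the bit-packed layout that the size table above implies (3 * 2^(3*biti) * bito / 8 bytes), which is what gives 432MB for 9bit/9bit.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes for one bit-packed 3D LUT: 2^(3*in_bits) entries,
   3 output components per entry, out_bits bits per component. */
uint64_t lut3d_size_bytes(unsigned in_bits, unsigned out_bits)
{
    assert(in_bits >= 1 && in_bits <= 16);
    assert(out_bits >= 1 && out_bits <= 16);
    return 3u * ((uint64_t)1 << (3 * in_bits)) * out_bits / 8;
}

/* Bytes for three independent 1D LUTs (one per component):
   3 * 2^in_bits * out_bits / 8, rounded down to whole bytes. */
uint64_t lut1d_x3_size_bytes(unsigned in_bits, unsigned out_bits)
{
    assert(in_bits >= 1 && in_bits <= 16);
    assert(out_bits >= 1 && out_bits <= 16);
    return 3u * ((uint64_t)1 << in_bits) * out_bits / 8;
}
```

Dividing the results by 1048576 reproduces the 48MB, 96MB, 432MB and 3840MB figures for the 3D case, and `lut1d_x3_size_bytes(16, 16)` comes out at 393216 bytes, i.e. the 384kB quoted above.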
why aren't we using 16bit ICC profiles in ddcc?
Because the ICC profiles should be used at the software player renderer level.
-All internal processing should be 64bit or higher
No need for higher than 64bit FP. I have tried 80bit FP and the results were exactly the same, only the computation was 15% slower.
I have also tried 32bit FP, but with that there was a slight difference against 64bit: some values differ by +/- 1. Nothing visually noticeable, but since we are performing the computation offline there is no reason to accept less accurate results.
-There should be a choice between dithering: none, fast, balanced, slow (increasing in quality over speed)
I think just a basic algorithm like Floyd-Steinberg or Sierra Lite would be good enough (I think the latter is preferable, since it gives the same visual quality as Floyd-Steinberg and is a little less CPU intensive). Even without dithering we aren't noticing any banding, so let's keep this simple.
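For reference, a minimal sketch of what 16bit to 8bit Floyd-Steinberg error diffusion could look like on a single component plane. This is not rgb3dlut code; the function name, the planar layout and the 16bit-to-8bit scaling by 257 (65535 / 255) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Dither a w*h single-component 16-bit plane to 8 bits with
   Floyd-Steinberg error diffusion (weights 7/16, 3/16, 5/16, 1/16).
   Returns 0 on success, -1 if the error buffers cannot be allocated. */
int dither_fs_16_to_8(const uint16_t *src, uint8_t *dst, int w, int h)
{
    assert(w > 0 && h > 0);
    /* Two rows of accumulated error, with one guard cell on each side. */
    int *cur = calloc((size_t)w + 2, sizeof *cur);
    int *next = calloc((size_t)w + 2, sizeof *next);
    if (!cur || !next) { free(cur); free(next); return -1; }

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int val = src[y * w + x] + cur[x + 1];
            int q = val <= 0 ? 0 : (val + 128) / 257;  /* round to 8 bits */
            if (q > 255) q = 255;
            dst[y * w + x] = (uint8_t)q;

            int err = val - q * 257;     /* error in 16-bit units */
            int e7 = err * 7 / 16, e3 = err * 3 / 16, e5 = err * 5 / 16;
            cur[x + 2]  += e7;                   /* right */
            next[x]     += e3;                   /* below left */
            next[x + 1] += e5;                   /* below */
            next[x + 2] += err - e7 - e3 - e5;   /* below right gets the rest */
        }
        /* The row below becomes the current row; clear the other buffer. */
        int *tmp = cur; cur = next; next = tmp;
        for (int x = 0; x < w + 2; x++) next[x] = 0;
    }
    free(cur);
    free(next);
    return 0;
}
```

A flat plane whose 16bit value sits halfway between two 8bit levels comes out as a mix of those two levels, while values that are exact multiples of 257 pass through unchanged. Sierra Lite would use the same loop with a smaller set of weights.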
I'm not sure if 6bit is possible and if it helps those 6bit lcd panels at all
It's possible, but it would be useless. The 6bit lcd panels perform their own dithering from an 8bit input for their native 6bit.
Once we have all this in place we can start demanding support (the egg is in place)
A massive attack!!!:D
leeperry
8th February 2009, 18:07
Photoshop should be using the lut only for the gamma correction, and for that we only need 1D LUTs - each component only depends on itself.
The size formula for 3 1D LUT is: 3*(2^biti)*bito/8 bytes
So, for a 16bit in/16bit out the size is only: 384kB
apparently you can have 3D LUT's within ICC v4 profiles.
I played around w/ them in X-Rite Profile Maker 5, too bad they only work in color managed apps(pshop, firefox, etc..)
pshop can do full gamut conversions w/ its softproofing options apparently.
but going ICC v4 in rgb3dlut would be pointless, as this is not even an open standard I think? not sure
madshi
8th February 2009, 19:27
sig - File signature, must be: '3DLUT'
'3DLUT' are only 5 chars. So maybe the definition should be "char sig[5]"?
int size, ver, biti, bito, cci, cco;
char pname[20];
int pver, sizerp, reserv1, reserv2;
I'd prefer much much MUCH longer names. Look at all the win32 structures. Your average field name is about 20-30 chars there...
Does "size" include the signature and the "size" field itself?
cci - Input color coding
0 - R'G'B'
> 0 - Y'Cb'Cr' - index to luma_matrix_coeffs
cco - Output color coding
0 - R'G'B'
> 0 - Y'Cb'Cr' - index to luma_matrix_coeffs
What does "luma_matrix_coeffs" mean? How about this?
COLOR_CODING_RGB = 0;
COLOR_CODING_YCbCr = 1;
COLOR_CODING_LUMA_MATRIX_COEFFS = 0x10000;
The matrix coeffs (whatever they mean) would then be "cci & 0xffff". And we'd have a lot of room for additional values between YCbCr and luma_matrix_coeffs for potential future use. But since I don't really know what is meant with matrix coeffs, my suggestion may very well be stupid. So take it with a pinch of salt, please...
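One possible reading of this suggestion in C, purely as a sketch: the constant values are the ones listed above, but the `ColorCoding` struct, the `decode_color_coding` helper and the rule that the flag bit selects "low 16 bits are a matrix index" are assumptions, not part of the proposed format.

```c
#include <assert.h>
#include <stdint.h>

enum {
    COLOR_CODING_RGB                = 0,
    COLOR_CODING_YCBCR              = 1,
    COLOR_CODING_LUMA_MATRIX_COEFFS = 0x10000  /* flag: low 16 bits = index */
};

typedef struct {
    int      has_matrix_index;  /* non-zero if the flag bit was set */
    unsigned matrix_index;      /* valid only if has_matrix_index */
    unsigned base_coding;       /* plain coding value when no flag is set */
} ColorCoding;

/* Split a cci/cco field into flag, matrix index and plain coding value. */
ColorCoding decode_color_coding(uint32_t field)
{
    ColorCoding c = {0, 0, 0};
    if (field & COLOR_CODING_LUMA_MATRIX_COEFFS) {
        c.has_matrix_index = 1;
        c.matrix_index = field & 0xffffu;
        c.base_coding = COLOR_CODING_YCBCR;  /* assumed: a matrix implies YCbCr */
    } else {
        assert(field < 0x10000u);  /* values below the flag are plain codings */
        c.base_coding = field;
    }
    return c;
}
```

With this layout everything between 2 and 0xffff stays free for future plain coding values, which is the room for extension mentioned above.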
pver - Version of the program that created the file
That should be a long int (64bit) to be win32 file version compatible.
Rest looks fine to me.
IMHO for video it would be like this:
-3D LUT 8bit and GC LUT 8bit: do all processing in 3D LUT
-3D LUT 8bit and GC LUT 10bit: do all processing in 3D LUT except the gamma correction to fit the display's gamma, which should be done in the GC LUT since it has higher accuracy
-3D LUT 16bit dithered to 8bit: do all processing in 3D LUT.
The last option is not available, and I don't know if it will ever be.
I think you're being too pessimistic... :)
This is part of my plans, but in fact the gamma correction would be part of the CMS, the last step.
That would be very nice!!
Floyd-Steinberg could be the best balance between quality and speed. From the dithering wiki-page (http://en.wikipedia.org/wiki/Dithering) the Stucki method looks the best to me, but probably it would be too slow and/or too complicated to add...
I think the differences between the colors are small enough with RGB24 so that Stucki won't have much of an advantage over Floyd-Steinberg. Actually I think maybe using random dithering (like in audio processing) would even be superior to using either Stucki or Floyd-Steinberg. Random dithering looks bad if the color differences are big. But I think with small color differences random dithering could play out to be surprisingly good. Well, I'm only speculating, of course...
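A sketch of the random dithering idea in C, for comparison with error diffusion. Everything here is illustrative: the function names are invented, the noise source is a plain xorshift32 generator chosen only so the example is deterministic, and the noise is uniform (audio-style dither often uses triangular noise instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny xorshift32 generator; the state must never be zero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Reduce n 16-bit samples to 8 bits by adding uniform noise of up to
   one 8-bit step (0..256 in 16-bit units) before truncating. */
void dither_random_16_to_8(const uint16_t *src, uint8_t *dst,
                           size_t n, uint32_t seed)
{
    assert(src != NULL && dst != NULL);
    uint32_t state = seed ? seed : 1u;
    for (size_t i = 0; i < n; i++) {
        uint32_t noise = xorshift32(&state) % 257u;      /* 0 .. 256 */
        dst[i] = (uint8_t)((src[i] + noise) / 257u);     /* at most 255 */
    }
}
```

Each output is one of the two 8bit levels around the 16bit value, picked with a probability proportional to how close the value is to each, so the average over an area approximates the 16bit level without any dependence on neighbouring pixels.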
So it should be: "16 bit" better than "16bit dithered to 8bit" better than "8bit".
Fully agreed. However, "16bit dithered to 8bit" should be a lot nearer to 16bit than to 8bit. That is, if there's any difference visible between 16bit and 8bit at all. Jury is still out on that, I have to admit...
tritical
8th February 2009, 22:50
I can add 16 bit to 8 bit support, but it will take a little while. rgb3dlut really needs to be rewritten so that the yuy2 -> packed 4:4:4 upsampling step is separate from the packed 4:4:4 -> RGB conversion via lut step. Right now 14 separate code paths exist in rgb3dlut (12 for yuy2 input w/ rgb output), and I'd have to add 16->8 support to all of them. Separating the upsampling out would leave only 4 code paths. I'm skeptical if 16-bit output with dithering to 8-bit, given 8-bit input, will make any real difference though (we're talking about changing r/g/b values by +-1).
My current plans are:
separate rgb3dlut and yv12toyuy2 from ddcc
rename rgb3dlut to 3dlut
add rgb->yuy2, yuy2->yuy2, and yv12->yv12 support
separate upsampling/downsampling steps inside 3dlut (4:2:2->4:4:4,4:4:4->4:2:2) from lut step
After that, I will add 16-bit lut output w/ dithering to 8-bit. I can't say how fast all of this will happen. I do have other stuff that I actually get paid to work on :D.
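To illustrate what separating the upsampling step from the lut step means, here is a minimal C sketch of a standalone YUY2 to packed 4:4:4 pass. It is not taken from rgb3dlut: the function name is invented, and chroma is simply repeated for both pixels of a pair, whereas a real implementation may well interpolate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand one YUY2 row (bytes Y0 U Y1 V per pixel pair) into packed
   Y,Cb,Cr triples. Chroma is repeated for both pixels of each pair.
   width is in pixels and must be even. */
void yuy2_row_to_packed444(const uint8_t *yuy2, uint8_t *out444, size_t width)
{
    assert(width % 2 == 0);
    for (size_t x = 0; x < width; x += 2) {
        const uint8_t *p = yuy2 + x * 2;   /* 2 bytes per pixel in YUY2 */
        uint8_t *q = out444 + x * 3;       /* 3 bytes per pixel in 4:4:4 */
        q[0] = p[0]; q[1] = p[1]; q[2] = p[3];   /* first pixel:  Y0 U V */
        q[3] = p[2]; q[4] = p[1]; q[5] = p[3];   /* second pixel: Y1 U V */
    }
}
```

Once every pixel has its own Y, Cb and Cr, the lookup step only ever sees one input layout, which is what cuts down the number of code paths.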
The proposed header looks ok to me, but it doesn't make much difference to rgb3dlut anyways... aside from some error checking and automatic parameter setting. I think the specific values of cci/cco should be listed though.
madshi
9th February 2009, 09:23
@tritical, that sounds awesome - thanks! It will be very interesting to see whether dithering does or does not make a visible difference. I guess we'll need some fancy test screens for that then with very smooth color and brightness gradations...
@yesgrey3 & tritical, here are some additional thoughts about bitdepth and array / file format:
input bitdepth
For unscaled progressive sources 8bit input should be good enough. But if deinterlacing and scaling are eventually used, a slightly higher input bitdepth may be worth a thought. Of course the needed memory size then grows astronomically, unfortunately. But:
(1) Deinterlacing and scaling are usually done in YCbCr and not in RGB, AFAIK.
(2) Brightness information is more important than color information.
(3) For the array access we need to convert to YCbCr 4:4:4, so we put too much weight on color information compared to brightness information.
So my thought is this: Would it make sense to (optionally) up the Y bitdepth to 10bit while leaving Cb and Cr at 8bit? That would increase needed memory by factor 4x, which is a lot, but still manageable, I think. Of course this input format would make sense only if there was a deinterlacing/scaling algorithm which outputs more than 8bit. Don't know if any such thing exists yet. But even if it doesn't exist yet, who knows what the future will bring. FWIW, all the good hardware video processing chips calculate at least in 10bit YCbCr internally when doing deinterlacing & scaling etc...
One minor problem is that the current file format draft wouldn't support a funny input format like this. So I'd suggest splitting the "input/output bitdepth" field into 3 fields, one for each input/output component. E.g. "int inputBitdepth [3]".
array pack format
Thinking about memory sizes, and how to pack the output data:
We could split the array into 3 parts, one for each output component. This would give the program loading the 3dlut the chance to load the array into memory in 3 separate chunks. That is a big advantage, especially if the array sizes get rather big. E.g. Y10Cb8Cr8 input and R16G16B16 output would require an array of 384MB. Now the memory consumption itself is still manageable, but the bigger problem is the limited address range in a 32bit process. There's only 4GB of memory address range available per 32bit process and the whole range gets fragmented already during creation and initialization of the process. Furthermore some of those 4GB are reserved to the OS. So it will be hard to find a memory address where you can put a continuous block of 384MB into. In my experience you can have good or bad luck allocating such a big continuous memory block. There's a chance that it might fail - especially inside of a typical DirectShow media player, which has loads of DirectShow filter dlls loaded. Chances would be much much higher for allocating 3 separate memory blocks à 128MB. So that's a really good argument for splitting the array into 3 chunks, one for each output component.
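A rough C sketch of the three-block idea, to show the sizes involved. The `SplitLut` type and the function names are hypothetical, and the output is assumed to be stored as whole bytes (8 or 16 bits per value).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    void  *plane[3];     /* one block per output component */
    size_t plane_bytes;  /* size of each block */
} SplitLut;

/* Bytes needed for the table of a single output component. */
uint64_t lut_plane_bytes(const unsigned in_bits[3], unsigned out_bits)
{
    unsigned total = in_bits[0] + in_bits[1] + in_bits[2];
    assert(total < 48);
    assert(out_bits == 8 || out_bits == 16);
    return ((uint64_t)1 << total) * (out_bits / 8);
}

/* Allocate three separate blocks; returns 0 on success, -1 on failure. */
int split_lut_alloc(SplitLut *lut, const unsigned in_bits[3], unsigned out_bits)
{
    uint64_t bytes = lut_plane_bytes(in_bits, out_bits);
    if (bytes > SIZE_MAX) return -1;
    lut->plane_bytes = (size_t)bytes;
    for (int i = 0; i < 3; i++) {
        lut->plane[i] = malloc(lut->plane_bytes);
        if (!lut->plane[i]) {
            while (i-- > 0) free(lut->plane[i]);  /* undo earlier blocks */
            return -1;
        }
    }
    return 0;
}

void split_lut_free(SplitLut *lut)
{
    for (int i = 0; i < 3; i++) free(lut->plane[i]);
}
```

For Y10Cb8Cr8 input and 16bit output, `lut_plane_bytes` gives 134217728 bytes (128MB) per component, so the process has to find three 128MB holes in its address space instead of a single 384MB one.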
yesgrey
11th February 2009, 01:09
'3DLUT' are only 5 chars. So maybe the definition should be "char sig[5]"?
I was counting with the '\0' character, but I will change to 5.
Does "size" include the signature and the "size" field itself?
Yes. size is the size of the header. signature and size are part of the header.
That is, if there's any difference visible between 16bit and 8bit at all. Jury is still out on that, I have to admit...
That's the big question... We are very happy playing with all this bits stuff, but in the end probably we will not notice any difference (I will for sure, because I will not admit that all this work was for nothing...:D)
My current plans are:
Thanks for the planning. I will adjust my planning to yours.
I guess we'll need some fancy test screens for that then with very smooth color and brightness gradations...
You could start creating some...;)
If you don't know how to put them in video just post the pics here and I will do it.
here are some additional thoughts about bitdepth...
Interesting thoughts, but I think that without seeing it we will never know...
IMHO all this >8bit processing is important, because it's the only way of preserving the full 8bit color depth of the source... If we perform all these computations using 8bit, in the end we will keep only 6 or 7 bits of the source color depth...:(
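The loss of levels is easy to demonstrate with a toy case. The C sketch below is not one of the real colour computations; it just applies a gain of 4/5 followed by a gain of 5/4 (which should cancel out) and counts how many of the 256 input levels are still distinct when the intermediate result is rounded to 8 or to 16 bits. The function name is made up.

```c
#include <assert.h>

/* Count distinct 8-bit results after a gain of 4/5 followed by 5/4,
   with the intermediate value rounded at mid_bits (8 or 16) precision. */
int surviving_levels(int mid_bits)
{
    assert(mid_bits == 8 || mid_bits == 16);
    int scale = (mid_bits == 16) ? 257 : 1;  /* 8-bit to working precision */
    unsigned char seen[256] = {0};
    int count = 0;
    for (int x = 0; x < 256; x++) {
        int v = x * scale;
        v = (v * 4 + 2) / 5;                 /* first step, rounded */
        v = (v * 5 + 2) / 4;                 /* second step, rounded */
        int out = (v + scale / 2) / scale;   /* back to 8 bits */
        if (out > 255) out = 255;
        if (!seen[out]) { seen[out] = 1; count++; }
    }
    return count;
}
```

`surviving_levels(8)` returns 205, i.e. a bit less than 8 bits of real depth after just two steps, while `surviving_levels(16)` returns all 256 levels.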
Would it make sense to (optionally) up the Y bitdepth to 10bit while leaving Cb and Cr at 8bit?
Why not? Since I've decided to play the chicken I will put one more egg...:D
We could split the array into 3 parts, one for each output component.
I think that would be a bad idea. I agree with the splitting, but not like that. When we access the 3D LUT, we always retrieve 3 values (r,g,b or y,cb,cr), so, these three values should be stored together. I think it would be better to split the 3D LUT file by ranges of Cr values.
For example, for YCbCr(10bit;8bit;8bit):
(Y:0-1023;Cb:0-255;Cr:0-63)+(Y:0-1023;Cb:0-255;Cr:64-127)+(Y:0-1023;Cb:0-255;Cr:128-191)+(Y:0-1023;Cb:0-255;Cr:192-255)
Let's see what tritical thinks about it...
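As a sketch of how a lookup could work with the four Cr ranges above, in C. The `ChunkPos` type and `locate_entry` are hypothetical, and the ordering inside a chunk (Y changing fastest, then Cb, then Cr) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned chunk;        /* which of the four Cr ranges (0..3) */
    size_t   byte_offset;  /* offset of the entry inside that chunk */
} ChunkPos;

/* Locate the entry for a Y10 Cb8 Cr8 input when the table is cut into
   four chunks of 64 Cr values each. bytes_per_entry covers all three
   output values, e.g. 6 for 16-bit output. */
ChunkPos locate_entry(unsigned y, unsigned cb, unsigned cr,
                      unsigned bytes_per_entry)
{
    assert(y < 1024 && cb < 256 && cr < 256);
    ChunkPos pos;
    pos.chunk = cr >> 6;              /* 0-63, 64-127, 128-191, 192-255 */
    unsigned local_cr = cr & 63u;     /* Cr position inside the chunk */
    size_t index = ((size_t)local_cr << 18) | ((size_t)cb << 10) | y;
    pos.byte_offset = index * bytes_per_entry;
    return pos;
}
```

Each chunk then holds 2^24 entries, which is 96MB at 6 bytes per entry, and the three output values of an entry stay next to each other in memory.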
yesgrey
11th February 2009, 01:27
Here is the second iteration of the definition of a file format for the 3D LUT (with names more in the Microsoft style...;)):
struct
{
    char signature[5];
    int headerSize, fileVersion;
    int inputBitDepth[3], outputBitDepth[3];
    int inputColorEncoding, outputColorEncoding;
    char programName[20];
    long int programVersion;
    int parametersSize;
    int reserved1, reserved2;
} h3dlut;
/* 3D LUT file specification:
// Header
signature - File signature; must be: '3DLUT'
headerSize - File header size in bytes
fileVersion - File format version number
inputBitDepth - Input bit depth per component (Y,Cb,Cr or G,B,R)
outputBitDepth - Output bit depth per component (Y,Cb,Cr or G,B,R)
inputColorEncoding - Input color encoding standard
- ...
outputColorEncoding - Output color encoding standard
- ...
programName - Name of the program that created the file
programVersion - Version of the program that created the file
parametersSize - Size in bytes of the array of char with a copy of the
run parameters settings used for creating the file
reserved1 - Reserved for future usage
reserved2 - Reserved for future usage
// Parameters Settings
parametersSize bytes
// Data
outputBitDepth=8:
offset = (cr<<(2*inputBitDepth[1])+cb<<(inputBitDepth[0])+y)*3 (YCbCr input)
offset = ( r<<(2*inputBitDepth[1])+ b<<(inputBitDepth[0])+g)*3 (RGB input)
size = 3*(2^inputBitDepth[0]*2^inputBitDepth[1]*2^inputBitDepth[2]) bytes
outputBitDepth<=16:
offset = (cr<<(2*inputBitDepth[1])+cb<<(inputBitDepth[0])+y)*6 (YCbCr input)
offset = ( r<<(2*inputBitDepth[1])+ b<<(inputBitDepth[0])+g)*6 (RGB input)
size = 3*(2^inputBitDepth[0]*2^inputBitDepth[1]*2^inputBitDepth[2])*2 bytes
(This is the general formula for the 3 3D LUT tables offset and size)
*/
I have not decided yet how to indicate the ColorEncoding values, I will update the specification later. I need a little more time to think about it.
I think that the offset navigation should be in bytes, so, when outputBitDepth > 8 it will assume always a 16bit output 3D LUT size. Maybe it's better only considering 8bit or 16bit output, or we will have to decide how to deal with the bits not used...
Comments/suggestions are welcome.
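For discussion, here is one way the data offset could be written out in C with every shift parenthesised and with per-component input depths. It is a sketch only, not the agreed formula: the function names are invented, the first component (Y or G) is assumed to change fastest, and the third component is shifted by inputBitDepth[0]+inputBitDepth[1], which equals 2*inputBitDepth[1] only when the first two depths match.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of the entry for input (c0, c1, c2), where c0 is Y or G,
   c1 is Cb or B and c2 is Cr or R. Output values are stored in whole
   bytes: 1 byte up to 8 bits, 2 bytes up to 16 bits. */
uint64_t lut_entry_offset(unsigned c0, unsigned c1, unsigned c2,
                          const unsigned in_bits[3], unsigned out_bits)
{
    assert(out_bits >= 1 && out_bits <= 16);
    assert(in_bits[0] <= 16 && in_bits[1] <= 16 && in_bits[2] <= 16);
    assert(c0 < (1u << in_bits[0]));
    assert(c1 < (1u << in_bits[1]));
    assert(c2 < (1u << in_bits[2]));
    unsigned bytes_per_value = (out_bits <= 8) ? 1 : 2;
    uint64_t index = ((uint64_t)c2 << (in_bits[0] + in_bits[1]))
                   | ((uint64_t)c1 << in_bits[0])
                   | c0;
    return index * 3u * bytes_per_value;
}

/* Total size of the data section in bytes. */
uint64_t lut_data_size(const unsigned in_bits[3], unsigned out_bits)
{
    assert(out_bits >= 1 && out_bits <= 16);
    unsigned total = in_bits[0] + in_bits[1] + in_bits[2];
    assert(total <= 48);
    return ((uint64_t)1 << total) * 3u * ((out_bits <= 8) ? 1u : 2u);
}
```

For 8bit input and output the last entry (255,255,255) starts at byte 50331645 of a 50331648 byte table, and for Y10Cb8Cr8 input with 16bit output the data section is 402653184 bytes (384MB).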
Mug Funky
11th February 2009, 07:10
you're gonna have to wait forever before display drivers actually support 10 bits.
http://h20331.www2.hp.com/hpsub/cache/596803-0-0-225-121.html
nvidia's higher end cards will do 10 (or 12) bit output via an SDI option (HDMI can carry it too, i believe).
also sony BVM monitors have been doing 10-bit for years (since 1992?). whether it achieves a noise floor below -48dB i'm not sure, but they certainly don't visibly band on gentle gradients.
madshi
11th February 2009, 09:43
You could start creating some...;)
If you don't know how to put them in video just post the pics here and I will do it.
The problem is that ideally a test screen should be created in YCbCr and not in RGB. E.g. a smooth gradient from lowest to highest brightness in YCbCr in one bit steps would be helpful. And I don't know how to create YCbCr pics.
Why not? Since I've decided to play the chicken I will put one more egg...:D
Great - thanks!!
I think that would be a bad idea. I agree with the splitting, but not like that. When we access the 3D LUT, we always retrieve 3 values (r,g,b or y,cb,cr), so, these three values should be stored together.
I changed my mind. Please ignore my splitting suggestion. Because I think it is not the responsibility of a file format to dictate how the data has to be stored in RAM. I think it's the task of the software which loads the 3dlut data to store the data in RAM in the best possible way. If the array needs to be split in order to be mappable into RAM, then let the software which loads the data do this work. I think you should just write one big continuous data array (just like you originally planned) and be done with it.
Here is the second iteration of the definition of a file format for the 3D LUT (with names more in the Microsoft style...;)):
struct
{
    char signature[5];
    int headerSize, fileVersion;
    int inputBitDepth[3], outputBitDepth[3];
    int inputColorEncoding, outputColorEncoding;
    char programName[20];
    long int programVersion;
    int parametersSize;
    int reserved1, reserved2;
} h3dlut;
/* 3D LUT file specification:
// Header
signature - File signature; must be: '3DLUT'
headerSize - File header size in bytes
fileVersion - File format version number
inputBitDepth - Input bit depth per component (Y,Cb,Cr or G,B,R)
outputBitDepth - Output bit depth per component (Y,Cb,Cr or G,B,R)
inputColorEncoding - Input color encoding standard
- ...
outputColorEncoding - Output color encoding standard
- ...
programName - Name of the program that created the file
programVersion - Version of the program that created the file
parametersSize - Size in bytes of the array of char with a copy of the
run parameters settings used for creating the file
reserved1 - Reserved for future usage
reserved2 - Reserved for future usage
Looks fine to me! Could you please add "char parameter data [?]" to the end of the "h3dlut" structure definition, just to make clear where the parameter data is stored?
// Parameters Settings
parametersSize bytes
// Data
outputBitDepth=8:
offset = (cr<<(2*inputBitDepth[1])+cb<<(inputBitDepth[0])+y)*3 (YCbCr input)
offset = ( r<<(2*inputBitDepth[1])+ b<<(inputBitDepth[0])+g)*3 (RGB input)
size = 3*(2^inputBitDepth[0]*2^inputBitDepth[1]*2^inputBitDepth[2]) bytes
outputBitDepth<=16:
offset = (cr<<(2*inputBitDepth[1])+cb<<(inputBitDepth[0])+y)*6 (YCbCr input)
offset = ( r<<(2*inputBitDepth[1])+ b<<(inputBitDepth[0])+g)*6 (RGB input)
size = 3*(2^inputBitDepth[0]*2^inputBitDepth[1]*2^inputBitDepth[2])*2 bytes
(This is the general formula for the 3 3D LUT tables offset and size)
*/
I think that the offset navigation should be in bytes, so when outputBitDepth > 8 it will always assume a 16bit output 3D LUT size. Maybe it's better to consider only 8bit or 16bit output, or we will have to decide how to deal with the unused bits...
Agreed, using only 8bit or 16bit byte packed output sounds just fine to me. If the software loading the file thinks it's better to use bit packing, it can rearrange the array to its liking.
But, I think the offset calculation is wrong. Shouldn't it be this way?
offset = (cr << (inputBitDepth[1] + inputBitDepth[0]) + cb << (inputBitDepth[0]) + y) * 3 (YCbCr input)
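Editor's note: when this formula is written as real C, each shift needs its own parentheses, because `+` binds tighter than `<<`. The following is a minimal sketch only; the helper name `lut_offset` and the reading of `depth[0]`, `depth[1]`, `depth[2]` as the Y, Cb and Cr bit depths are assumptions, not part of the draft.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the first of the 3 output values stored for
   one (y, cb, cr) input triple. depth[0] = Y bits, depth[1] = Cb bits,
   depth[2] = Cr bits (this mapping of indices is an assumption). */
static size_t lut_offset(const int depth[3], unsigned y, unsigned cb, unsigned cr)
{
    assert(y < (1u << depth[0]) && cb < (1u << depth[1]) && cr < (1u << depth[2]));
    /* Every shift is parenthesised: in C, a << n + b parses as a << (n + b). */
    size_t index = ((size_t)cr << (depth[1] + depth[0]))
                 + ((size_t)cb << depth[0])
                 + (size_t)y;
    return index * 3; /* three output components per entry */
}
```

With 8 bits per input component, one step in Y moves the offset by 3, one step in Cb by 3*256, and one step in Cr by 3*65536.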
leeperry
11th February 2009, 10:26
http://h20331.www2.hp.com/hpsub/cache/596803-0-0-225-121.html
nvidia's higher end cards will do 10 (or 12) bit output via an SDI option (HDMI can carry it too, i believe).
also sony BVM monitors have been doing 10-bit for years (since 1992?). whether it achieves a noise floor below -48dB i'm not sure, but they certainly don't visibly band on gentle gradients.
well, any video card can do 30bit on VGA, it's mostly 8 bit + 10bit LUT...I can use ARGYLLCMS to measure that the LUT is in 10bit.
we're stuck with RGB24 over HDMI/DVI coz the TMDS encoder works in RGB24 anyhow.
oh sure, there must be some high end cards that can do true 10bit on SDI/dual link DVI, but these are not consumer items AFAIK.
it would appear that HDMI 1.3 licence fees are so high that hardly any soundcard is compatible, let alone any graphics card.
honai
11th February 2009, 13:07
This might be a useful 1920x1080 pattern to detect banding/dithering issues:
http://web.comhem.se/zacabeb/repository/spectrum_rgb.png
madshi
11th February 2009, 15:03
I've just had an idea about how to make *any* input bitdepth work with a relatively small (e.g. 8bit input, 16bit output) 3dlut array:
Instead of rounding the input down to the native 3dlut bitdepth, we read the 8 nearest RGB output values from the 3dlut array. Then we average these 8 values down (weighted average) to interpolate the final RGB output value.
I'm thinking of the 3dlut array like a net of 3D cubes. Each value in the 3dlut array is one cube corner. Now if we have an input bitdepth higher than the native 3dlut array, the input usually doesn't fall directly on a 3dlut array cube corner. Instead it's somewhere in the middle of one of those cubes. So we read the surrounding cube corner values and interpolate the final output value, based on how far the input value is away from each of the 8 corners.
Does that make any sense to you? Of course my suggestion would result in a noticeable performance loss, because we'd have to read 8 array values instead of just one and we'd have to use some crazy formula to combine these 8 values into the final output value. But I think the result should be very near to perfect. It should *almost* be as good as having a 3dlut with a much higher native input bitdepth.
Thoughts?
Anyone willing to write a formula to calculate the final output value? Unfortunately my math sucks...
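Editor's note: the weighted average of the 8 cube corners described in this post is commonly known as trilinear interpolation. As a sketch of the formula, assume $f_r, f_g, f_b \in [0,1)$ are the fractional positions of the input inside the cube along each axis and $C_{ijk}$ is the 3dlut output stored at the corner selected by $i, j, k \in \{0,1\}$ (0 = lower neighbour, 1 = upper neighbour). These symbols are introduced here for illustration and do not come from the thread.

```latex
C_{\text{out}} = \sum_{i=0}^{1} \sum_{j=0}^{1} \sum_{k=0}^{1}
    w_i(f_r)\, w_j(f_g)\, w_k(f_b)\, C_{ijk},
\qquad
w_0(f) = 1 - f, \quad w_1(f) = f
```

The eight weights always sum to 1, and when the input falls exactly on a corner ($f_r = f_g = f_b = 0$) the result is that corner's stored value. The formula is applied separately to each of the three output components.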
tritical
12th February 2009, 01:05
I would like to change the lookup into the rgb lut to be:
(r<<16)+(g<<8)+b
Also, I am dropping the idea of adding yv12->yv12 support as it just doesn't make sense (internally it would have to convert to 4:4:4 somehow, so might as well require yuy2 input and let the user decide how to do the yv12->yuy2 conversion).
@madshi
Your idea is basically the same idea that is used with smaller luts, such as 16x16x16, applied to higher bitdepths... for input values that don't fall on the samples defined by the lut interpolate inside the cube. I think that for input sources with greater than 8-bits this is the way to go due to memory requirements. So stick with an 8-bit input, 16-bit output and interpolate.
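Editor's note: a minimal sketch of "interpolate inside the cube", assuming a LUT with `lut_bits` of input per axis, three 16-bit outputs per entry, and the `(r<<2n)+(g<<n)+b` ordering. The function name `lut_trilinear` and its parameters are hypothetical; this is not the code used by any of the tools discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: trilinear lookup in a LUT with lut_bits of input per
   axis and 3 x 16-bit output per entry, indexed as
   (r << 2*lut_bits) + (g << lut_bits) + b. Inputs carry in_bits >= lut_bits. */
static void lut_trilinear(const uint16_t *lut, int lut_bits, int in_bits,
                          unsigned r, unsigned g, unsigned b, uint16_t out[3])
{
    assert(in_bits >= lut_bits && in_bits <= 16);
    const int shift = in_bits - lut_bits;        /* extra input precision */
    const unsigned top = (1u << lut_bits) - 1;   /* last grid index */
    const unsigned in[3] = { r, g, b };
    unsigned lo[3], hi[3];
    double f[3];                                 /* position inside the cube */

    for (int a = 0; a < 3; a++) {
        lo[a] = in[a] >> shift;
        hi[a] = lo[a] < top ? lo[a] + 1 : top;   /* clamp at the upper edge */
        f[a] = (double)(in[a] - (lo[a] << shift)) / (double)(1u << shift);
    }

    double acc[3] = { 0.0, 0.0, 0.0 };
    for (int corner = 0; corner < 8; corner++) {
        /* Bits 2, 1, 0 of corner pick the low or high neighbour on r, g, b. */
        const unsigned ri = (corner & 4) ? hi[0] : lo[0];
        const unsigned gi = (corner & 2) ? hi[1] : lo[1];
        const unsigned bi = (corner & 1) ? hi[2] : lo[2];
        const double w = ((corner & 4) ? f[0] : 1.0 - f[0])
                       * ((corner & 2) ? f[1] : 1.0 - f[1])
                       * ((corner & 1) ? f[2] : 1.0 - f[2]);
        const size_t idx = ((((size_t)ri << lut_bits) + gi) << lut_bits) + bi;
        for (int c = 0; c < 3; c++)
            acc[c] += w * lut[idx * 3 + c];
    }
    for (int c = 0; c < 3; c++)
        out[c] = (uint16_t)(acc[c] + 0.5);       /* round to nearest */
}
```

One design point worth noting: with a power-of-two grid, inputs above the last grid point have no upper neighbour, so this sketch clamps them to the last entry. Small LUTs often use an odd number of points per axis (such as 17) to avoid that edge case.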
yesgrey
12th February 2009, 01:44
But, I think the offset calculation is wrong. Shouldn't it be this way?
offset = (cr << (inputBitDepth[1] + inputBitDepth[0]) + cb << (inputBitDepth[0]) + y) * 3 (YCbCr input)
Yes, you're right. I will correct it in iteration 3.:o
I would like to change the lookup into the rgb lut to be:
(r<<16)+(g<<8)+(b<<8)
Is this correct? or would it be: (r<<16)+(g<<8)+b
I have suggested: (r<<16)+(b<<8)+g
to keep the correlation Y->G; Cb->B; Cr->R.
as indicated in the ITU specifications, but for me it's ok to use it as you suggested... and what about the output format, do you also agree to always use 1 byte or 2 bytes?
tritical
12th February 2009, 02:04
Yep, I meant
(r<<16)+(g<<8)+b
and
(v<<16)+(u<<8)+y
So that lookups using rgb packed as: b,g,r in memory, and yuv packed as: y,u,v in memory form the lookup the same way in terms of reading/shifting from memory locations. I agree only 1 byte and 2 byte outputs.
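Editor's note: the point about memory order can be sketched as follows. With this ordering, the same three-byte read produces the index for both packed layouts. The helper name `lut_index_from_bytes` is hypothetical, and 8 bits per component is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build the LUT index from three consecutive bytes.
   For RGB stored as b,g,r in memory this yields (r<<16)+(g<<8)+b;
   for YUV stored as y,u,v it yields (v<<16)+(u<<8)+y. */
static size_t lut_index_from_bytes(const uint8_t *px)
{
    assert(px != NULL);
    return ((size_t)px[2] << 16) | ((size_t)px[1] << 8) | (size_t)px[0];
}
```

The caller does not need to know whether the bytes are b,g,r or y,u,v; only the LUT contents differ.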
madshi
12th February 2009, 09:43
Also, I am dropping the idea of adding yv12->yv12 support as it just doesn't make sense (internally it would have to convert to 4:4:4 somehow, so might as well require yuy2 input and let the user decide how to do the yv12->yuy2 conversion).
I'm wondering: As far as I understand yv12 is 4:2:0 and yuy2 is 4:2:2, right? Is there also an "official" fourcc for 4:4:4 YCbCr? And wouldn't it be faster and better to convert 4:2:0 directly to 4:4:4 instead of doing 4:2:0 -> 4:2:2 -> 4:4:4?
Your idea is basically the same idea that is used with smaller luts, such as 16x16x16, applied to higher bitdepths... for input values that don't fall on the samples defined by the lut interpolate inside the cube. I think that for input sources with greater than 8-bits this is the way to go due to memory requirements.
And there I thought I had a brand new idea... :o
Anyway, is there code available anywhere (or at least a formula) on how to "interpolate inside the cube"?
Thanks!
yesgrey
12th February 2009, 12:22
Is there also an "official" fourcc for 4:4:4 YCbCr?
YV24.
And wouldn't it be faster and better to convert 4:2:0 directly to 4:4:4 instead of doing 4:2:0 -> 4:2:2 -> 4:4:4?
This was discussed a few posts back. That was the reason tritical created the yv12toyuy2 filter, so we could test whether we could notice any quality difference, since ffdshow is performing yv12->yv24 with no intermediate yuy2 step.;)
My first tests showed that yv12->yv24 was slightly better, but I haven't yet tested the more recent versions of rgb3dlut and yv12toyuy2 with the new options for the chroma sampling position...
In the end, maybe the difference is only noticeable in still zoomed pictures, and nobody watches a movie that way...:D
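Editor's note: for readers unfamiliar with the formats, the sketch below shows the simplest possible one-step 4:2:0 to 4:4:4 chroma conversion, by sample replication. It is an illustration only: the function name is hypothetical, and real converters such as the ones discussed here interpolate and take the chroma sample position into account, which this sketch does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: upsample one 4:2:0 chroma plane (w/2 x h/2 samples)
   to 4:4:4 (w x h samples) by repeating each sample in a 2x2 block. */
static void chroma420_to_444_nearest(const uint8_t *src, int src_stride,
                                     uint8_t *dst, int dst_stride, int w, int h)
{
    assert(w % 2 == 0 && h % 2 == 0);
    for (int y = 0; y < h; y++) {
        const uint8_t *s = src + (size_t)(y / 2) * src_stride;
        uint8_t *d = dst + (size_t)y * dst_stride;
        for (int x = 0; x < w; x++)
            d[x] = s[x / 2];   /* each source sample covers two columns */
    }
}
```

The same routine would be called once for the Cb plane and once for the Cr plane; the luma plane is already full resolution.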
yesgrey
12th February 2009, 14:20
Here is the third iteration for the definition of a file format for the 3D LUT:
// 3D LUT file specification (fileVersion=1):
struct
{
char signature[5];
int headerSize, fileVersion;
int inputBitDepth[3], outputBitDepth[3];
int inputColorEncoding, outputColorEncoding;
char programName[20];
long int programVersion;
int parametersSize;
int reserved1, reserved2;
} h3dlut;
char parametersData[];
void lut3d[];
/*
// Header
signature - File signature; must be: '3DLUT'
headerSize - File header size in bytes
fileVersion - File format version number
inputBitDepth - Input bit depth per component (Y,Cb,Cr or G,B,R)
outputBitDepth - Output bit depth per component (Y,Cb,Cr or G,B,R)
inputColorEncoding - Input color encoding standard
- ...
outputColorEncoding - Output color encoding standard
- ...
programName - Name of the program that created the file
programVersion - Version of the program that created the file
parametersSize - Size in bytes of the array of char with a copy of the
run parameters settings used for creating the file
reserved1 - Reserved for future usage
reserved2 - Reserved for future usage
// Parameters Settings
parametersData - pointer to an array of char with size parametersSize
// 3D LUT Data
lut3d - pointer to an array that contains the 3dlut output values.
The type of the lut3d array entries is defined by the outputBitDepth values:
-If all outputBitDepth values are 8, it's an array of unsigned char.
-If any of outputBitDepth values is >8 and <=16, it's an array of unsigned short.
The offset and dimension of the array are calculated as:
offset = (cr<<(inputBitDepth[1]+inputBitDepth[0])+cb<<(inputBitDepth[0])+y)*3 (YCbCr input)
offset = ( r<<(inputBitDepth[1]+inputBitDepth[0])+ g<<(inputBitDepth[0])+b)*3 (RGB input)
dimension = 3*(2^inputBitDepth[0]*2^inputBitDepth[1]*2^inputBitDepth[2])
(This is the general formula for the 3 3D LUT tables offset and size)
*/
I have not decided yet how to indicate the ColorEncoding values, I will update the specification later. I need a little more time to think about it.
We could simplify it a little by using the same outputBitDepth for all output channels, but the way it is now is more future proof...
Comments/suggestions are welcome.
If you all agree with this I will start working on cr3dlut v2.0 to release the first "official" version of the 3DLUT file format.
madshi
12th February 2009, 15:13
YV24.
Ah, never heard of that one yet! I wonder why ffdshow doesn't allow YV24 output?
Here is the third iteration for the definition of a file format for the 3D LUT
Looks good to me. Two minor cosmetic things:
(1) "char parametersData[];" should be the last element of the h3dlut struct definition, so it should be inside the structure, not outside. At least that's how win32 structures like that are usually done.
(2) offset and dimension can be either "3*" or "6*", depending on output bitdepth.
Thanks!
honai
12th February 2009, 17:26
I wonder why ffdshow doesn't allow YV24 output?
Probably because no DirectShow video renderer exists for YV24 on the Windows platform.
yesgrey
12th February 2009, 21:08
(1) "char parametersData[];" should be the last element of the h3dlut struct definition
I did think about that, but did not want to add the pointer to the header, because it would be completely useless to store it. But you're right, 4 more bytes will do no harm.
(2) offset and dimension can be either "3*" or "6*", depending on output bitdepth.
No. The offset and dimension are always the same, what changes is the entries type. 'char' for 8bit output, and 'short' for 16 bit output.
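Editor's note: the distinction between the element count (always `3*...`) and the size in bytes (which depends on the entry type) can be sketched as below. Note that `2^n` in the draft means a power of two, not C's XOR operator. The helper names are hypothetical, and treating output depths below 8 as one-byte entries is an assumption the draft does not spell out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of array entries, i.e.
   3 * 2^depth[0] * 2^depth[1] * 2^depth[2]. */
static size_t lut_dimension(const int in_depth[3])
{
    const int total_bits = in_depth[0] + in_depth[1] + in_depth[2];
    assert(total_bits >= 0 && total_bits < (int)(8 * sizeof(size_t)) - 2);
    return (size_t)3 << total_bits;
}

/* Hypothetical helper: bytes per entry. Any output depth above 8 bits
   switches the whole array to 2-byte entries (unsigned short assumed to
   be 2 bytes, as in the draft). */
static size_t lut_entry_size(const int out_depth[3])
{
    for (int c = 0; c < 3; c++) {
        assert(out_depth[c] >= 1 && out_depth[c] <= 16);
        if (out_depth[c] > 8)
            return 2;
    }
    return 1;
}
```

For 8-bit input the dimension is 50331648 entries either way; the file data is about 48 MiB with 8-bit output and about 96 MiB with 16-bit output.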
madshi
12th February 2009, 21:16
I did think about that, but did not want to add the pointer to the header, because it would be completely useless to store it. But you're right, 4 more bytes will do no harm.
I thought the first character was stored right after "reserved2" without any pointers?
yesgrey
12th February 2009, 21:16
Here is the fourth iteration for the definition of a file format for the 3D LUT:
// 3D LUT file specification (fileVersion=1):
struct
{
char signature[5];
int headerSize, fileVersion;
int inputBitDepth[3], outputBitDepth[3];
int inputColorEncoding, outputColorEncoding;
char programName[20];
long int programVersion;
int parametersSize;
int reserved1, reserved2;
char parametersData[];
} h3dlut;
void *o3dlut;
/*
// Header
signature - File signature; must be: '3DLUT'
headerSize - File header size in bytes
fileVersion - File format version number
inputBitDepth - Input bit depth per component (Y,Cb,Cr or R,G,B)
outputBitDepth - Output bit depth per component (Y,Cb,Cr or R,G,B)
inputColorEncoding - Input color encoding standard
- ...
outputColorEncoding - Output color encoding standard
- ...
programName - Name of the program that created the file
programVersion - Version of the program that created the file
parametersSize - Size in bytes of the array of char with a copy of the
run parameters settings used for creating the file
reserved1 - Reserved for future usage
reserved2 - Reserved for future usage
parametersData - pointer to an array of char with size parametersSize
// Output data
o3dlut - pointer to an array that contains the 3dlut output values.
The type of the o3dlut array entries is defined by the outputBitDepth values:
-If all outputBitDepth values are 8, it's an array of unsigned char.
-If any of outputBitDepth values is >8 and <=16, it's an array of unsigned short.
The offset and dimension of the array are calculated as:
offset = (cr<<(inputBitDepth[1]+inputBitDepth[0])+cb<<(inputBitDepth[0])+y)*3 (YCbCr input)
offset = ( r<<(inputBitDepth[1]+inputBitDepth[0])+ g<<(inputBitDepth[0])+b)*3 (RGB input)
dimension = 3*(2^inputBitDepth[0]*2^inputBitDepth[1]*2^inputBitDepth[2])
(These are the general formulas for the 3D LUT offset and dimension
assuming: char = 1 byte; short = 2 byte; int = 4 byte; long int = 8 byte)
*/
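Editor's note: the assumption about type sizes in the draft points at a portability concern. If the struct is written to disk as-is, the file layout depends on the compiler's integer widths, on padding (on common compilers `signature[5]` is followed by padding bytes before `headerSize`), and on byte order. One way a reader could sidestep this is to parse the fields byte by byte. The sketch below assumes little-endian storage, which the draft does not specify, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The signature field is exactly 5 bytes, so it has no room for a
   terminating NUL and must be compared with memcmp, not strcmp. */
static int has_3dlut_signature(const unsigned char *file_bytes, size_t len)
{
    return len >= 5 && memcmp(file_bytes, "3DLUT", 5) == 0;
}

/* Hypothetical: read a 32-bit little-endian integer byte by byte, so the
   result does not depend on the host's int size, padding, or endianness. */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Where each field actually starts in the file would still have to be fixed by the specification; the draft as posted leaves that to the compiler.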
I have not decided yet how to indicate the ColorEncoding values, I will update the specification later. I need a little more time to think about it.
@tritical,
We could define it later, but how should we store <16bit values in the 'short'? MSB or LSB?
Comments/suggestions are welcome.
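Editor's note: the MSB/LSB question is about where a value narrower than 16 bits sits inside the 16-bit entry. The sketch below shows both options for comparison; the function names are hypothetical and the thread has not settled on either.

```c
#include <assert.h>
#include <stdint.h>

/* LSB-aligned: the value occupies the low bits and the top (16 - bits)
   bits are zero, so the stored number equals the raw code value. */
static uint16_t store_lsb_aligned(unsigned value, int bits)
{
    assert(bits >= 1 && bits <= 16 && value < (1u << bits));
    return (uint16_t)value;
}

/* MSB-aligned: the value is shifted up so that full scale sits near
   0xFFFF and the low (16 - bits) bits are zero. */
static uint16_t store_msb_aligned(unsigned value, int bits)
{
    assert(bits >= 1 && bits <= 16 && value < (1u << bits));
    return (uint16_t)(value << (16 - bits));
}
```

For a 10-bit maximum of 1023, LSB alignment stores 0x03FF and MSB alignment stores 0xFFC0. MSB alignment lets a reader treat every entry as roughly 16-bit without consulting outputBitDepth, while LSB alignment keeps the raw code values intact.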