Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion.

Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules.

 

Old 21st February 2018, 14:37   #49141  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
Quote:
Originally Posted by huhn View Post
Sorry, but the current 3D LUTs are not slow just for fun.

We are talking about changing a process that takes five minutes into one that takes at most a couple of seconds.

Sorry to say this, but web browsers don't do this with quality comparable to the 3D LUT in madVR, and most importantly, no one said they do it with a 3D LUT at all.
I don't know about web browsers, but software like Photoshop seems to be able to apply an ICC profile with no perceivable delay. And it would be pretty ludicrous to say that professional image editing software like Photoshop does this in subpar quality. In fact, people using Photoshop probably care even more about color accuracy than videophiles.

As for "3DLUTs are not just slow for fun", well, it might be that programs like ArgyllCMS's collink command are slow because there is no perceived need to make them fast: people use them offline and very infrequently, so no one cares a great deal about how long they take. I suspect it would be possible to do it much faster with little loss in quality if one were to specifically optimize for that. (Does anyone know how long LittleCMS takes to generate a transform?)

Quote:
Originally Posted by huhn View Post
Just to make this clear: what a web browser does with an ICC file when a video plays is just garbage in terms of quality.
I don't think web browsers use ICC profiles for video, but they do support them (to some extent) for still images. (Although they support them in a rather broken way, because they don't use the monitor ICC profile - they always use sRGB as the destination. But that's neither here nor there.)

Last edited by e-t172; 21st February 2018 at 14:47.
Old 21st February 2018, 14:41   #49142  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
First of all, madVR itself doesn't send any image out of the graphics card at all. It's a renderer: it just sends an image to the GPU driver, which has to deal with the rest.
So whether the image sent out of the GPU is 8-bit or 10-bit is purely a GPU driver matter.

The 10-bit setting has to be set up for each connected device separately, and it defaults to 8-bit, so that's most likely everything that's happening here.

And the Oppo needs 10-bit input support in the first place; the specs of this thing are not really clear...
Old 21st February 2018, 14:49   #49143  |  Link
Jong
Registered User
 
Join Date: Mar 2007
Location: London, UK
Posts: 576
Quote:
Originally Posted by huhn View Post
First of all, madVR itself doesn't send any image out of the graphics card at all. It's a renderer: it just sends an image to the GPU driver, which has to deal with the rest.
So whether the image sent out of the GPU is 8-bit or 10-bit is purely a GPU driver matter.

The 10-bit setting has to be set up for each connected device separately, and it defaults to 8-bit, so that's most likely everything that's happening here.

And the Oppo needs 10-bit input support in the first place; the specs of this thing are not really clear...
I guess this is aimed at me. Thanks for replying.

The Oppo definitely supports HDR10 on its HDMI input. It's used quite regularly for this and, indeed, I tested it myself using 4:2:2, when the Oppo reports the PC outputting 4K/60 4:2:2 12-bit, as expected.

In RGB mode the Oppo reports 4K/60 RGB 8-bit, which again is good and as expected, given the limitations of HDMI 2.0.

[S]What I don't understand is why, if madVR is as removed from the display as suggested, madVR reports in its HUD outputting 10-bit when connected directly (which seems to lead to the driver outputting an illegal video mode), yet madVR reports outputting 8-bit via the Oppo and all remains "legal". It seems madVR is more aware of the display chain than thought.[/S]

Edit: wait a minute, I get what you are saying. madVR probably realises it has a different "display" connected when the Oppo is in the chain, and that display has defaulted to 8-bit. Makes sense.

Last edited by Jong; 21st February 2018 at 14:55.
Old 21st February 2018, 14:56   #49144  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
Quote:
Originally Posted by e-t172 View Post
I don't know about web browsers, but software like Photoshop seems to be able to apply an ICC profile with no perceivable delay. And it would be pretty ludicrous to say that professional image editing software like Photoshop do this in subpar quality. In fact people using Photoshop probably care even more about color accuracy than videophiles.
OK, let's assume PS is as flawless as you make it out to be here.
Does PS have to color correct an image 60 times a second or even more?
Is the calculation PS uses done in a way that it can be applied to a totally different image at a reasonable speed?

Quote:
As for "3DLUTs are not just slow for fun", well, it might be that programs like ArgyllCMS's collink command are slow because there is no perceived need to make them fast: people use it offline and very infrequently, no one cares a great deal about how long it takes. I suspect it would be possible to do it much faster with little loss in quality if one were to specifically optimize for that.
I'm not going to blindly assume that it is so badly coded that it can be sped up by a magnitude of 1000, and most importantly, the result is ~100 MB.
A 3D LUT takes so long because it buys fast, high-quality color correction at the cost of a huge amount of space and a huge amount of processing power for the creation of the LUT itself.
Old 21st February 2018, 15:23   #49145  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
Quote:
Originally Posted by huhn View Post
OK, let's assume PS is as flawless as you make it out to be here.
Does PS have to color correct an image 60 times a second or even more?
Is the calculation PS uses done in a way that it can be applied to a totally different image at a reasonable speed?
If you can color correct a single image, then it de facto means you can trivially generate a 3DLUT. Just generate a single image containing all the source colors you care about, then color correct it. The resulting image is de facto your 3DLUT. If you have the transform to color correct a single image, then just generate a 3DLUT from that transform and then it's business as usual. Thus the "60 times a sec" that you're worried about is a non-issue. The hard part is generating the transform - once you have that, the rest is trivial.

(The reason it's so simple is that the color of a destination pixel is determined only by the color of the source pixel - there is no other input, which is the very reason you can use a LUT in the first place.)
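To make the idea concrete, here is a minimal Python sketch of that approach. `correct_pixel` is a hypothetical stand-in for the real transform (whatever Photoshop, LittleCMS or collink would apply); everything else is just the identity-grid trick described above:

```python
# A 3D LUT is just the color-corrected version of an "identity" image that
# contains one pixel per source grid point.
def make_identity_grid(n):
    """One (r, g, b) pixel per LUT grid point, channel values in [0, 1]."""
    step = 1.0 / (n - 1)
    return [(i * step, j * step, k * step)
            for i in range(n) for j in range(n) for k in range(n)]

def correct_pixel(rgb, gamma=1.1):
    """Hypothetical stand-in for the real per-pixel color transform."""
    return tuple(c ** gamma for c in rgb)

def build_3dlut(n, transform):
    # Color correcting the identity grid once yields the LUT itself;
    # from then on, playback only does lookups (plus interpolation).
    return [transform(px) for px in make_identity_grid(n)]

lut = build_3dlut(64, correct_pixel)  # 64^3 = 262,144 entries, computed once
```

The expensive part (the transform) runs once, up front; the per-frame work reduces to a table lookup.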

Quote:
Originally Posted by huhn View Post
I'm not going to blindly assume that it is so badly coded that it can be sped up by a magnitude of 1000, and most importantly, the result is ~100 MB.
A 3D LUT takes so long because it buys fast, high-quality color correction at the cost of a huge amount of space and a huge amount of processing power for the creation of the LUT itself.
Again, the only reason it's 100 MB is that madshi wrote the simplest possible 3DLUT code, without support for interpolation, so users are forced to generate full 256x256x256 3DLUTs, which are grossly overkill. If you can get away with using smaller 3DLUTs, just like all other color-managed software does, then I suspect it will be much faster. (A 64x64x64 16-bit 3DLUT is 1.5 MB.)
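Both size figures check out with quick back-of-the-envelope arithmetic (a trivial sketch, not madVR's actual storage format):

```python
def lut_size_bytes(grid, channels=3, bits=16):
    """Raw storage for a grid^3 LUT at the given channel count and bit depth."""
    return grid ** 3 * channels * (bits // 8)

full = lut_size_bytes(256)  # 100,663,296 bytes: the "~100 MB" figure
small = lut_size_bytes(64)  # 1,572,864 bytes: exactly 1.5 MiB
```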

Last edited by e-t172; 21st February 2018 at 15:38.
Old 21st February 2018, 15:28   #49146  |  Link
Ver Greeneyes
Registered User
 
Join Date: May 2012
Posts: 447
Quote:
Originally Posted by huhn View Post
OK, let's assume PS is as flawless as you make it out to be here.
Does PS have to color correct an image 60 times a second or even more?
Is the calculation PS uses done in a way that it can be applied to a totally different image at a reasonable speed?
I don't know about Photoshop, but like I said, Firefox does do this - it calculates a 3DLUT on the fly in a matter of milliseconds, then caches it. (The system isn't perfect, incidentally - various APIs interact in ways that aren't ideal, so I'm not sure it can reuse the generated 3DLUT for all images with the same profile - but that's not a fundamental problem.) Now, I imagine Firefox cuts corners to do this, and they replaced LCMS with the in-house qcms because LCMS wasn't fast enough - but if madVR had to spend, say, half a second generating a 3DLUT, then cached it for every video with a matching color space, I think that would be fine.
Old 21st February 2018, 15:43   #49147  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
Quote:
Originally Posted by e-t172 View Post
Again, the only reason why it's 100 MB is because madshi wrote the simplest possible 3DLUT code without support for interpolation, therefore users are forced to generate full 256x256x256 3DLUTs which are grossly overkill. If you can get away with using smaller 3DLUTs, just like all other color-managed software does, then I suspect it will be much faster. (A 64x64x64 16-bit 3DLUT is 1.5 MB.)
The reason the 3DLUT is so large is specifically to front-load the processing requirement and not have to do it for every single video frame. That's the entire purpose of such a LUT in the first place: solve the complex math once.

Photoshop, for example, isn't going to care if displaying a single image takes 100ms of color processing; it's not in any area a user is going to notice. Video playback does care, hence entirely different requirements.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 21st February 2018, 15:47   #49148  |  Link
huhn
Registered User
 
Join Date: Oct 2012
Posts: 7,903
Quote:
Originally Posted by e-t172 View Post
If you can color correct a single image, then it de facto means you can trivially generate a 3DLUT. It's super easy: just generate an image containing all the source colors you care about, then color correct it. The resulting image is de facto your 3DLUT. It's as simple as that. If you have the transform to color correct a single image, then just generate a 3DLUT from that transform and then it's business as usual. Thus the "60 times a sec" that you're worried about is a non-issue. The hard part is generating the transform - once you have that, the rest is trivial.

(The reason why it's so simple is because the color of a destination pixel is only determined by the color of the source pixel - there is no other input, which is the very reason why you can use a LUT in the first place.)
And who said that the image has all possible colors, and who said PS is saving the result for all possible colors?

Quote:
Again, the only reason why it's 100 MB is because madshi wrote the simplest possible 3DLUT code without support for interpolation, therefore users are forced to generate full 256x256x256 3DLUTs which are grossly overkill. If you can get away with using smaller 3DLUTs, just like all other color-managed software does, then I suspect it will be much faster. (A 64x64x64 16-bit 3DLUT is 1.5 MB.)
The 256³ LUT is interpolated from a 64³ one with default settings. So you want to make it faster by skipping the interpolation, which is suspected to be slow, only to do it later in real time?
Old 21st February 2018, 16:04   #49149  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
Quote:
Originally Posted by nevcairiel View Post
The reason the 3DLUT is so large is specifically to front-load the processing requirement and not have to do it for every single video frame. That's the entire purpose of such a LUT in the first place: solve the complex math once.
If your 3DLUT is smaller than full size (say, 64x64x64 instead of 256x256x256), then the only "complex math" you need to do in real time is interpolation between the 3DLUT points, such as basic trilinear or tricubic interpolation. That is laughably trivial for a GPU to do, and something that I'm sure madshi would be able to do in his sleep.

You don't even need a fancy interpolation algorithm for that - no one is going to notice the difference. (Keep in mind that the color response of any reasonable monitor is at least somewhat linear, so basic interpolation is highly likely to land very close to the correct point. In fact, you really don't want to get too fancy, because a response that's not smooth will result in banding artefacts - which is why the ArgyllCMS docs warn you against generating a transform that tries to be too precise.)
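For reference, the trilinear case really is just a handful of multiply-adds per pixel. A toy CPU version (madVR would do this in a shader; the names here are illustrative, not madVR code):

```python
import math

def trilinear_lookup(lut, n, rgb):
    """Sample an n*n*n LUT (nested lists of (R, G, B) floats) at a normalized color."""
    # Scale the color to grid coordinates and find the enclosing cell.
    pos = [min(max(c, 0.0), 1.0) * (n - 1) for c in rgb]
    base = [min(int(math.floor(p)), n - 2) for p in pos]
    frac = [p - b for p, b in zip(pos, base)]
    out = [0.0, 0.0, 0.0]
    # Blend the 8 surrounding grid points by their distance weights.
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                w = ((frac[0] if di else 1 - frac[0]) *
                     (frac[1] if dj else 1 - frac[1]) *
                     (frac[2] if dk else 1 - frac[2]))
                entry = lut[base[0] + di][base[1] + dj][base[2] + dk]
                for ch in range(3):
                    out[ch] += w * entry[ch]
    return out
```

With an identity LUT this reproduces the input exactly, which also makes it easy to sanity-check.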

Quote:
Originally Posted by huhn View Post
And who said that the image has all possible colors, and who said PS is saving the result for all possible colors?
It is trivial to generate an image that has all possible colors. In fact, a full 3DLUT is, itself, by definition, an image that has all possible colors. You just need to deal with that once (not for every frame), and then you're done. In practice though, you would use sampling (only generate an image with a subset of all possible colors, such as 64³) and then interpolate, as described above.

Quote:
Originally Posted by huhn View Post
The 256³ LUT is interpolated from a 64³ one with default settings. So you want to make it faster by skipping the interpolation, which is suspected to be slow, only to do it later in real time?
Huh? I didn't know that madVR could interpolate from a 64³ 3DLUT. I've always used a 256³ 100 MB 3DLUT (256x256x256 x 3 colors x 16-bit = ~100 MB). Maybe I've missed that option.

In any case, yes, I'm saying that 3DLUT interpolation can, and should, be done in real time. That's completely trivial and can be done extremely quickly. (Just like video upscaling, except you can get away with very basic interpolation.) Interpolating a large 3DLUT from a small one is neither hard nor expensive. It's generating the initial transform that's the hardest part; everything after that is peanuts.

Last edited by e-t172; 21st February 2018 at 16:16.
Old 21st February 2018, 16:16   #49150  |  Link
Ver Greeneyes
Registered User
 
Join Date: May 2012
Posts: 447
Quote:
Originally Posted by e-t172 View Post
Huh? I didn't know that madVR could interpolate from a 64³ 3DLUT. I've always used a 256³ 100 MB 3DLUT (256x256x256 x 3 colors x 16-bit = ~100 MB). Maybe I've missed that option.
I think what huhn is referring to here is the processing in collink: collink generates a 3DLUT at a lower resolution (determined by the quality setting), then interpolates it to 256x256x256 to produce a file compatible with madVR. But you can override the resolution using -r256 to make it produce a 256³ 3DLUT directly, without interpolation (which obviously takes a while).
Old 21st February 2018, 16:24   #49151  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
Ah, okay. But the ArgyllCMS docs explicitly state that you should not do that. I can't find where the exact rationale is explained but, IIRC, you don't want to generate too many points, because past a point of diminishing returns you're basically just optimizing for measurement error and are more likely to create banding and other aberrant behavior than to actually improve color accuracy. This leads to the counter-intuitive result that interpolating provides better results than trying to achieve maximum precision, because the resulting transform is smoother.
Old 21st February 2018, 16:42   #49152  |  Link
Ver Greeneyes
Registered User
 
Join Date: May 2012
Posts: 447
Sure. I mean, regardless of the fact that it's possible not to use interpolation, I don't think it really matters, since we aren't pushing for madVR to do that. I suppose there is some value in noting that collink is slow despite not producing a 256³ 3DLUT directly, but I think that's just because it does everything to a very high standard. Whatever the reason, there are counterexamples showing that the process doesn't have to be that slow (and there's no reason not to cache the result).
Old 21st February 2018, 16:48   #49153  |  Link
bkrieger
Registered User
 
Join Date: Jan 2016
Posts: 24
Quote:
Originally Posted by ryrynz View Post
Depends on your content. Personally I'd drop chroma or luma scaling back if you need the headroom to hit that sweet SSIM 2D; your decision if you want to stick AR on top of that, but personally I'd rather use that headroom for luma and chroma.

I am mainly watching 1080p movies.

Should I drop chroma upscaling to NGU AA medium, and change downscaling to SSIM 2D?

Also, under "if any more upscaling/downscaling is needed", I have it set to "let madVR decide". Should I change these to something else?

Also, under image upscaling, should I leave the quadrupling settings at "let madVR decide", as well as chroma set to normal?

Lastly, what do you mean by sticking AR on top? What setting is that, and what should it be changed to?

Thanks
Old 21st February 2018, 17:33   #49154  |  Link
fhoech
Registered User
 
fhoech's Avatar
 
Join Date: Nov 2010
Location: Stuttgart, Germany
Posts: 17
Quote:
Originally Posted by Asmodian View Post
The 3DLUT is a better solution, technically, than it would be if madVR used the full contents of the ICC profile.
Creating complex gamut mapping is not really a focus (and doesn't need to be) of applications using ICC profiles to display imagery, irrespective of whether they are video renderers/players/editors or still image viewers/editors. There is no inherent technical reason why one solution would be superior to the other (pre-computed link/3D LUT vs. dynamic linking of, in essence, also pre-computed cLUTs). The practical reason you can do more with a pre-computed link/3D LUT in terms of complex gamut mapping is that the ICC color management modules that do the actual color transform are by design "stupid" (same for programs applying 3D LUTs, as they don't have to concern themselves with anything other than a straightforward lookup through the cLUT and maybe interpolation) - all the complexity is in the profiles. That allows fast transform creation (linking) and application, and software authors can implement support more easily (even more easily if they just want to support pre-computed links).
But there's a difference between just creating a transform (linking two existing ICC profiles, for example) and inverting the "natural" profile (which for display devices means inverting the device RGB -> CIE values mapping) while doing complex gamut mapping and appearance modeling (what Argyll's collink does when used with the -G inverse forward lookup gamut mapping option). The latter is what takes up most of the time; transform creation (linking) alone is relatively trivial. You could also put all that complexity into the input and/or output profile creation instead of into the link creation (a less self-contained approach, though).

Quote:
Originally Posted by Asmodian View Post
Also what is in an ICC profile, beyond the simple 1D LUTs, varies wildly. Often they contain simple measurements that can only be used to do a rough conversion with little or no user control over how it is done. madVR's 256x256x256 3DLUT allows full control of the conversion with very fine grained corrections being possible.
The same is true for any 3D LUT, naturally, when it comes to the result. The source and destination color spaces to be linked can come from just a few measurements, from a synthetic color space defined by only primaries and a simple tone curve, or from thousands of measurements or complex synthetic cLUT-based color spaces.

Quote:
Originally Posted by Asmodian View Post
Wow, EVR uses an ICC profile in Windows 10 now?
That functionality has been around for well over half a decade now (I'm not sure if it's a feature of EVR or of MPC-HC).

Quote:
Originally Posted by Asmodian View Post
Unfortunately there are simply too many formats or types of ICC profiles and that tool doesn't support all of them. Proper handling of most ICC profiles is a big job.
Argyll supports all types of profiles, but only ICC v2.x, not ICC v4.

Quote:
Originally Posted by e-t172 View Post
I don't think web browsers use ICC profiles for video, but they do support it (to some extent) for still images. (Although they support it in kind of a retarded way because they don't use the monitor ICC profile - they always use sRGB as the destination. But that's neither here nor there.)
Yes, browsers that do color management only tend to color manage still images. IE 9+ (prior versions had no notion of color management whatsoever) and Edge use sRGB as the display profile, though (which is wrong and broken; they might as well use no color management at all). Firefox works correctly (but you really should set gfx.color_management.enablev4=true to get cLUT profile support), even though its multi-monitor support is lacking. Chrome has done color management for a few versions now, but it regularly seems to break in all kinds of interesting ways.

Quote:
Originally Posted by Ver Greeneyes View Post
Now I imagine Firefox cuts corners to do this, and they replaced LCMS with the in-house qcms because LCMS wasn't fast enough
If I recall correctly, the reason for the replacement was not speed but security concerns (which several people, me included, shook their heads at at the time, and still do).

Quote:
Originally Posted by nevcairiel View Post
The reason the 3DLUT is so large is specifically to front-load the processing requirement, and not have to do it for every single video frame. Thats the entire purpose of such a LUT in first place - solve complex math once.

Photoshop for example isn't going to care if displaying a single image takes 100ms of color processing, its not in any area a user is going to notice. Video playback does care, hence entirely different requirements.
A performant color management solution worth its salt will cache the transforms and re-use them if it detects the same source and destination profiles, and it will apply the created transforms using hardware acceleration via the GPU, e.g. via texture lookups/shaders.
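The caching part can be as simple as memoizing on the profile pair. A hypothetical sketch (the profile keys and the "link" step are placeholders, not any real CMM's API):

```python
from functools import lru_cache

BUILD_COUNT = {"n": 0}  # only here to show how rarely the link step runs

@lru_cache(maxsize=16)
def get_transform(src_profile, dst_profile):
    """Build (or fetch from cache) the transform for a (source, destination) pair."""
    BUILD_COUNT["n"] += 1
    # Placeholder for the expensive link / 3D LUT generation step.
    return "lut:{}->{}".format(src_profile, dst_profile)
```

Repeated playback of content in the same color space then never rebuilds the transform.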
__________________
DisplayCAL - Graphical front-end for Argyll CMS display calibration and characterization

Last edited by fhoech; 21st February 2018 at 18:26. Reason: Minor typo
Old 21st February 2018, 17:45   #49155  |  Link
Ver Greeneyes
Registered User
 
Join Date: May 2012
Posts: 447
Quote:
Originally Posted by fhoech View Post
If I recall correctly the reason for the replacement was not speed but security concerns (which several people, me included, shook their head at at the time, and still do).
Hmm, I don't recall. I do know they designed qcms to be fast... but they wrote the whole thing in C and gave it an upstream repository, which has resulted in it becoming basically unmaintained. The library works, but I wouldn't call it their finest moment. I wonder if anyone has considered rewriting it in Rust...
Old 21st February 2018, 17:56   #49156  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
Quote:
Originally Posted by fhoech View Post
A performant color management solution worth it's salt will cache the transforms and re-use them if it detects same source and destination profiles, and it will apply the created transforms using hardware acceleration via a GPU, e.g. via texture lookups/shaders.
In the case of madVR you could be even smarter and, on top of caching, also compute the transform asynchronously. The video starts playing with slightly wrong colors for maybe a few seconds while the transform is being computed, and as soon as it's done, the transform is swapped in and color correction is active for the rest of the playback session.
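That scheme is straightforward to express: render with a pass-through transform until a background thread has finished building the real one, then swap it in. A toy sketch (hypothetical `Player` class, not madVR's actual architecture):

```python
import threading

class Player:
    def __init__(self):
        self.transform = lambda rgb: rgb  # identity until the real LUT is ready
        self._worker = threading.Thread(target=self._build_transform)
        self._worker.start()

    def _build_transform(self):
        # Placeholder for the slow link / gamut-mapping step.
        corrected = lambda rgb: tuple(min(c * 1.05, 1.0) for c in rgb)
        self.transform = corrected  # single attribute write: safe to swap in

    def render_pixel(self, rgb):
        return self.transform(rgb)  # uses whichever transform is current
```

Frames rendered before the swap come out with the slightly wrong (uncorrected) colors; everything after uses the corrected transform, exactly as described above.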
Old 21st February 2018, 18:04   #49157  |  Link
fhoech
Registered User
 
fhoech's Avatar
 
Join Date: Nov 2010
Location: Stuttgart, Germany
Posts: 17
Quote:
Originally Posted by e-t172 View Post
It is trivial to generate an image that has all possible colors. In fact, a full 3DLUT is, itself, by definition, an image that has all possible colors. You just need to deal with that once (not for every frame), and then you're done. In practice though, you would use sampling (only generate an image with a subset of all possible colors, such as 64³) and then interpolate, as described above.
Exactly. You then push that (e.g.) 64³ image through your transform, and the resulting image is your 3D LUT. Cache it, use it. Over and over - 60, 120, 240, or any number of times per second that you need. GPUs are made for that; speed is not an issue unless you have texture sizes that no longer fit into the graphics card's memory.

Quote:
Originally Posted by e-t172 View Post
Huh? I didn't know that madVR could interpolate from a 64³ 3DLUT.
madVR can load eeColor 3D LUT files, which are 65³ by design.
__________________
DisplayCAL - Graphical front-end for Argyll CMS display calibration and characterization
Old 21st February 2018, 18:14   #49158  |  Link
fhoech
Registered User
 
fhoech's Avatar
 
Join Date: Nov 2010
Location: Stuttgart, Germany
Posts: 17
Quote:
Originally Posted by Ver Greeneyes View Post
I do know they designed qcms to be fast..
I have a feeling they achieved speed mostly by leaving out all the parts of a CMM that deal with complex transforms (i.e., cLUT profiles). Naturally, they could be fast when they didn't even do full color management.

Quote:
Originally Posted by e-t172 View Post
In the case of madVR you could be even smarter and, on top of caching, also compute the transform asynchronously. The video starts playing with slightly wrong colors for maybe a few seconds while the transform is being computed, and as soon as it's done, the transform is swapped in and color correction is active for the rest of the playback session.
Not even sure that would be needed. Transform creation using LCMS should take mere milliseconds on a moderately modern system (as it's only linking). I may be wrong, but I think most of the time is probably spent reading the files from disk, plus a couple more milliseconds to apply the transform to the input 'cLUT' image (the result of which will be the final 3D LUT).
__________________
DisplayCAL - Graphical front-end for Argyll CMS display calibration and characterization

Last edited by fhoech; 21st February 2018 at 18:21.
Old 21st February 2018, 18:32   #49159  |  Link
e-t172
Registered User
 
Join Date: Jan 2008
Posts: 589
I meant that in case someone wants to go through the full gamut mapping process (i.e. the equivalent of collink -G) on the fly. But as you said, it is debatable whether that's really all that useful in the first place.
Old 21st February 2018, 18:33   #49160  |  Link
leeperry
Kid for Today
 
Join Date: Aug 2004
Posts: 3,477
If all you need is gamut mapping, this script works like a charm in madVR: http://www.avsforum.com/forum/26-hom...lly-works.html

Feel free to use a Windows LUT on top if need be.

IME, on test patterns it outputs results identical to 3DLUTs with a perfectly calibrated TV, using CMUNDIS + Color.HCFR.

You can set up automatic PotPlayer profiles using different gamut mappings based on resolution and/or framerate; works like a champ, and you can switch them with a mouse click too