Old 4th September 2011, 10:29   #1  |  Link
PhrostByte
Grand Fruitioner
 
Join Date: Mar 2004
Location: Chicago, IL
Posts: 115
RFC: extensions for a high bit depth pipeline

So I've finally started on some code for writing high bit depth filters, allowing an entire script pipeline to work at high bit depth without constant up/down conversion between filters. I'm looking for comments and contributions, and gauging interest from developers of existing plugins. There's a lot of work to be done: many of the Avisynth filters would need to be rewritten, and it probably wouldn't be very useful until some of the more popular filters adopted it.

Some current design choices:
  1. Components can be 16-bit unsigned integer or 32-bit float.
  2. Linear and gamma-compressed RGB are separate colorspaces.
  3. All colorspaces are made planar, as I've found it simpler to work with. If someone has a good reason for packed formats, they can be added.
  4. Frames are packed into a square RGB32 frame. This keeps it compatible with the frame cache and audio filters.
  5. Rows are always aligned for multi-threading and AVX.
  6. It packs extra information into each frame, in theory allowing any of the following to change with each frame: colorspace, bit depth, dimensions, frame rate (VFR ftw).
  7. Allows per-frame data that is custom to filters.
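
A minimal sketch of how design choices 4 to 6 could fit together, assuming a 32-byte per-frame header and three full-resolution (4:4:4) planes. Every name here (FrameHeader, row_pitch, carrier_side, the magic field) is hypothetical and is not taken from the linked source; it only illustrates the arithmetic of fitting aligned planes into a square RGB32 carrier frame.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// All names below are hypothetical; this is not the layout used by the author's code.
enum class SampleType : std::uint32_t { UInt16 = 0, Float32 = 1 };
enum class ColorSpace : std::uint32_t { YUV = 0, GammaRGB = 1, LinearRGB = 2 };

// Small header stored at the start of every carrier frame, so that colorspace,
// bit depth, dimensions and frame rate can all change from frame to frame.
struct FrameHeader {
    std::uint32_t magic;        // marks the frame as carrying packed data
    ColorSpace    colorspace;
    SampleType    sample_type;
    std::uint32_t width;        // real picture width in samples
    std::uint32_t height;       // real picture height in rows
    std::uint32_t fps_num;      // per-frame rate, which is what permits VFR
    std::uint32_t fps_den;
    std::uint32_t reserved;     // pads the header to 32 bytes
};
static_assert(sizeof(FrameHeader) == 32, "header occupies exactly one aligned unit");

constexpr std::size_t kAlign = 32;  // width of an AVX register in bytes

constexpr std::size_t align_up(std::size_t n, std::size_t a) {
    return (n + a - 1) / a * a;
}

inline std::size_t bytes_per_sample(SampleType t) {
    return t == SampleType::UInt16 ? 2 : 4;
}

// Bytes per row of one plane, padded so each row starts on an aligned offset.
inline std::size_t row_pitch(const FrameHeader& h) {
    return align_up(h.width * bytes_per_sample(h.sample_type), kAlign);
}

// Header plus three full-resolution planes (4:4:4 is assumed for simplicity).
inline std::size_t payload_bytes(const FrameHeader& h) {
    return align_up(sizeof(FrameHeader), kAlign) + 3 * row_pitch(h) * h.height;
}

// Side length of the smallest square RGB32 frame (4 bytes per pixel)
// that can hold the whole payload.
inline std::uint32_t carrier_side(const FrameHeader& h) {
    assert(h.width > 0 && h.height > 0);
    const std::size_t pixels = (payload_bytes(h) + 3) / 4;
    auto side = static_cast<std::uint32_t>(std::sqrt(static_cast<double>(pixels)));
    while (static_cast<std::size_t>(side) * side < pixels) ++side;
    return side;
}
```

Under these assumptions a 1920x1080 float frame needs a 2495x2495 RGB32 carrier. Row offsets are only truly aligned if the carrier buffer itself starts on an aligned address, which this sketch does not check.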

Usage would look something like this:
Code:
FFVideoSource()
ConvertToHQ32()
...
ConvertToLQ()
And a filter something like this: http://svn.int64.org/viewvc/int64/hq...pp?view=markup
__________________
Lanczos4, Spline36, what!? Don't know how to pick a resizer? Take a look at my kernel visualizations.
Want a high-quality, gamma-aware resizer? Check out my ResampleHQ filter.

Last edited by PhrostByte; 5th September 2011 at 04:21.
Old 4th September 2011, 13:10   #2  |  Link
Hiritsuki
Novice of AVS
 
Join Date: Oct 2009
Posts: 156
Is this the same as the Dither filter's 8-bit to 16-bit conversion?
__________________
My PC

Last edited by Hiritsuki; 4th September 2011 at 13:15.
Old 4th September 2011, 20:47   #3  |  Link
PhrostByte
Grand Fruitioner
 
Join Date: Mar 2004
Location: Chicago, IL
Posts: 115
Quote:
Originally Posted by Hiritsuki View Post
Is this the same as the Dither filter's 8-bit to 16-bit conversion?
No. Think of it as a new avisynth.h for plugin developers, to provide a faster native solution.
Old 4th September 2011, 23:54   #4  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,843
What about importing 8bit+ sources, like v210?
Old 5th September 2011, 02:25   #5  |  Link
PhrostByte
Grand Fruitioner
 
Join Date: Mar 2004
Location: Chicago, IL
Posts: 115
Quote:
Originally Posted by kolak View Post
What about importing 8bit+ sources, like v210?
Source filters would need to be updated for it. Hopefully it would be easy to get such a change built into FFMS.

Last edited by PhrostByte; 5th September 2011 at 02:28.
Old 5th September 2011, 03:58   #6  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
Quote:
Originally Posted by PhrostByte View Post
Source filters would need to be updated for it. Hopefully it would be easy to get such a change built into FFMS.
I don't think I like it. Wouldn't it be better to just use Avisynth 2.6 instead?

Currently, the FFMS2 Avisynth plugin is a relatively thin wrapper around the FFMS2 API (with the exception of the pulldown flag manipulation). I don't think implementing what you want to do is particularly hard, but adding custom output mode hacks to the Avisynth plugin isn't very elegant.

I will still have to admit that this is a clever and potentially useful hack, though. If you implement the FFMS2 part, you can have a branch in the main repository if you want it, like the C-plugin has.
Old 5th September 2011, 03:59   #7  |  Link
PhrostByte
Grand Fruitioner
 
Join Date: Mar 2004
Location: Chicago, IL
Posts: 115
Okay, I've decided to move from the mvtools2 hack to including a small header in each frame. This makes it compatible with audio filters and adds VFR support.
Old 5th September 2011, 04:14   #8  |  Link
PhrostByte
Grand Fruitioner
 
Join Date: Mar 2004
Location: Chicago, IL
Posts: 115
Quote:
Originally Posted by TheFluff View Post
I don't think I like it. Wouldn't it be better to just use Avisynth 2.6 instead?
Avisynth development seems all but dead, and I suspect people will be using 2.5x for some time to come. A filter seems like the quickest way to get the idea into general use.

I know it's a bit hacky, but I've done my best to keep the developer API as clean as possible. Should Avisynth development ever kick back into gear and get a compatible feature set, it would not be difficult to move a filter to the official API.
Old 5th September 2011, 12:01   #9  |  Link
SEt
Registered User
 
Join Date: Aug 2007
Posts: 374
Very nice ideas (especially that you are willing to write it ^_^), but my vote is for diffing it against Avisynth 2.6 code, not for another hack-around. And it absolutely doesn't matter that there will be very limited support for it in internal filters. Avisynth 2.6 or even 3 development isn't going to revive itself without people willing to write something.

Another thing I don't like is names: ResampleHQ sounds reasonable, but ConvertToHQ32 is not. I suggest something like ConvertToFloatYUV444.

Packed colorspaces would be nice too - I don't remember a good way of using planar images with OpenCL/OpenGL.
Old 5th September 2011, 19:42   #10  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,867
This is exactly what I've wanted Avisynth to support! The per-frame metadata is very important. We have so many hacks, like stuffing the low bits with whether the frame was interlaced or not, and there is some information I want to access, like whether the frame was originally I, B, or P, which I need for my filter. And of course, when I presented my ideas it got shot down just like you. I'm all for getting something done *now*, because if you try to do it in some ideal way, it's just not going to get done. I really want to get workable high bit-depth support, which would make Avisynth usable for professional users. If this is a plugin you can make it for both versions, of course.

To start, I'd like to see a version of ffms that supports high bit depth and also metadata (like I, B, P as I mentioned), some functions to access these properties in script, and a way to convert to/from the usual formats, including the possibility of converting the script convention currently in use (stacked MSB/LSB) into this internal format. The first filters can do levels, gamma, dither, and IVTC; that would immediately solve the most pressing needs of professionals. We also need a way to export to an encoder.
Old 6th September 2011, 01:20   #11  |  Link
PhrostByte
Grand Fruitioner
 
Join Date: Mar 2004
Location: Chicago, IL
Posts: 115
Quote:
Originally Posted by SEt View Post
Very nice ideas (especially that you are willing to write it ^_^), but my vote is for diffing it against Avisynth 2.6 code, not for another hack-around. And it absolutely doesn't matter that there will be very limited support for it in internal filters. Avisynth 2.6 or even 3 development isn't going to revive itself without people willing to write something.

Another thing I don't like is names: ResampleHQ sounds reasonable, but ConvertToHQ32 is not. I suggest something like ConvertToFloatYUV444.

Packed colorspaces would be nice too - I don't remember a good way of using planar images with OpenCL/OpenGL.
I agree it'd be best to get support directly into 2.6, but I'm not up for maintaining Avisynth right now. Perhaps in the future.

I think you're right about the names too—ConvertToFloat32() and ConvertToUInt16() would work better. And OpenCL/OpenGL support is exactly the kind of reason I was looking for! I'll be sure to add support for packed colorspaces.

Quote:
Originally Posted by jmac698 View Post
This is exactly what I've wanted Avisynth to support! The per-frame metadata is very important. We have so many hacks, like stuffing the low bits with whether the frame was interlaced or not, and there is some information I want to access, like whether the frame was originally I, B, or P, which I need for my filter. And of course, when I presented my ideas it got shot down just like you. I'm all for getting something done *now*, because if you try to do it in some ideal way, it's just not going to get done. I really want to get workable high bit-depth support, which would make Avisynth usable for professional users. If this is a plugin you can make it for both versions, of course.

To start, I'd like to see a version of ffms that supports high bit depth and also metadata (like I, B, P as I mentioned), some functions to access these properties in script, and a way to convert to/from the usual formats, including the possibility of converting the script convention currently in use (stacked MSB/LSB) into this internal format. The first filters can do levels, gamma, dither, and IVTC; that would immediately solve the most pressing needs of professionals. We also need a way to export to an encoder.
I think we all want these features, we just need to be careful to avoid bikeshedding. We'll get there eventually!
Old 6th September 2011, 02:52   #12  |  Link
Daemon404
Registered User
 
Join Date: Mar 2005
Posts: 128
Quote:
Originally Posted by PhrostByte View Post
I agree it'd be best to get support directly into 2.6, but I'm not up for maintaining Avisynth right now. Perhaps in the future.
Yet another dev who has run scared from the Avisynth code base
Old 6th September 2011, 09:19   #13  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,867
I've found just the kind of comments you were looking for. Actually I brought this up in 2006 and not much has changed since then - except I actually wrote Deepcolor tools
Quote:
The 3.0 team have the infrastructure in place to do any bits/pixel but I haven't seen any 15 bit/pixel code.

MMX and SSE does not readily lend itself to operations on 16 bit data with the same willingness as with 8 bit data. It certainly is doable, it just moves the shrewdness/cleverness bar to the next level.

The other hurdle is that Avisynth 2.x is based on the VFW AVI interface, which as far as I know has no current hacks to present greater than 8 bit video data. So in theory if we did do any 15 bit/pixel implementation and even if someone wrote a HDCAM_SR_Source() or DigiBetaSource() or someone jazzed Mpeg2Source(), it would only be available internally, we have no way to serve substantially better data to other applications, the best to hope for would be max quality 8 bit RGB.

If you are aware of anybody's plans to do hi res video thru VFW this forum is the right place to bring it to attention.

I guess if some encoding app wanted to use the Avisynth env->Invoke(...) interface instead of VFW then it would work, but it would be specific to that app only.

In principle there is nothing to stop VFW from transferring 15 bit data, it just needs a standard, I guess a FourCC to be allocated, that everyone agrees upon and can code to.

All these formats have registered (by Apple) fourCC's.

But wow what a pain to unpack and repack!
From IanB, http://forum.doom9.org/showthread.php?t=118580

In the last bit where he talks about standards, he didn't realize there was v210. That's what he meant about registered by Apple. So yes, v210 is an official 'hack' for high bit-depth in VFW. I see nothing wrong with this. Our new convention of stacked 16-bit as (MSB, LSB) or your planar format could become an official FourCC as well, unless there's something existing that's similar.

I had some more thoughts on this, can't find it yet - but I listed all the meta data which would actually be useful per frame.
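
For readers unfamiliar with the stacked 16-bit convention mentioned above, here is a rough sketch of converting it to and from native 16-bit samples. It assumes the stacked plane is twice the picture height, with the most significant bytes in the top half and the least significant bytes in the bottom half. The function names unstack16 and stack16 are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed stacked layout: an 8-bit plane of width x (2 * height) bytes,
// top half = MSBs, bottom half = LSBs. Function names are hypothetical.
std::vector<std::uint16_t> unstack16(const std::uint8_t* src, std::size_t pitch,
                                     std::size_t width, std::size_t height) {
    assert(src != nullptr && pitch >= width);
    const std::uint8_t* msb = src;                   // first `height` rows
    const std::uint8_t* lsb = src + pitch * height;  // next `height` rows
    std::vector<std::uint16_t> out(width * height);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            out[y * width + x] = static_cast<std::uint16_t>(
                (msb[y * pitch + x] << 8) | lsb[y * pitch + x]);
        }
    }
    return out;
}

// Inverse operation: split native 16-bit samples back into the stacked layout.
void stack16(const std::vector<std::uint16_t>& in, std::uint8_t* dst,
             std::size_t pitch, std::size_t width, std::size_t height) {
    assert(dst != nullptr && pitch >= width && in.size() == width * height);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint16_t v = in[y * width + x];
            dst[y * pitch + x] = static_cast<std::uint8_t>(v >> 8);
            dst[(height + y) * pitch + x] = static_cast<std::uint8_t>(v & 0xFF);
        }
    }
}
```

The stacked form passes through unmodified 8-bit filters as an ordinary (if odd-looking) clip, which is why it works as a script-level convention, while the native form is what a filter would actually compute on.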

Last edited by jmac698; 6th September 2011 at 09:23.
Old 6th September 2011, 09:42   #14  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,867
And here's another good thread
Quote:
I had sort of assumed internally the data format would be dictated by what was best in MMX/SSE i.e. 16 bit signed int using only the 0 to 32k range for PC levels and approx 2000 to 30000 for TV levels.

I am hoping to at least add some infrastructure hooks in the 2.6 API change so when/if we get around to actually writing some 15 bit code the API won't have to be changed yet again. In theory the 2.6 engine should be able to route 15 bit channel data thru all the zero cost calls, like splice, trim and crop, and all the pixel processing calls throw a generic what's this pixel format exception.

The notes I have made are :- Reserve a pixel_type bit for 8/15 bit data, currently 0. Add a <Bytes/Bits>PerPixel<Channel or some better name> method that currently returns <1/8> and in the future return <2/15>. Define example RGB45 and YUV45 templates. Make sure all code correctly handles unknown pixel_type values.

In terms of pixel_type I am going to redefine the bits so for the new planar formats it describes the chroma subsampling, rather than the single bit per colour format we have in the interim.
http://forum.doom9.org/showthread.php?t=127779
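
The 15-bit ranges in the quote can be checked with a little arithmetic. This is only a sketch assuming a plain linear rescale from 0..255 to 0..32767 with rounding; the helper names are invented here and nothing like this is confirmed to exist in Avisynth 2.6.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: linear rescale between 8-bit (0..255) and the
// non-negative half of a signed 16-bit value (0..32767), with rounding.
inline std::int16_t scale_8_to_15(std::uint8_t v) {
    return static_cast<std::int16_t>((v * 32767 + 127) / 255);
}

inline std::uint8_t scale_15_to_8(std::int16_t v) {
    assert(v >= 0);  // only the 0..32767 range is meaningful here
    return static_cast<std::uint8_t>((v * 255 + 16383) / 32767);
}
```

With this mapping, 8-bit TV black (16) lands on 2056 and TV white (235) lands on 30197, which matches the "approx 2000 to 30000" figure, and PC levels span the full 0 to 32767. Keeping values non-negative inside a signed 16-bit type is what lets signed SIMD arithmetic be used without the top bit being misread as a sign.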

The discussion was about using signed or unsigned 16-bit ints. Someone from an audio background initially favored signed, but someone who had written image processing in 16-bit unsigned said that there was no problem with unsigned as far as MMX. Another person then pointed out that targeting SSE2 makes sense now, and that even floating point was fast these days.

A few people would definitely use Avisynth for color grading and denoising in Digital Intermediates, if we had high bit-depth. They suggested 10-bit with gamma, 16-bit linear, 10-bit log, and 32-bit float as some standards. I also discovered there is a rawsource plugin for 16-bit import that should be a lot more workable now that dither, avs2yuv etc. are available. People have access to up to 4k resolution, 16-bit files. This has been requested for over 5 years.
Old 6th September 2011, 09:45   #15  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,867
There's enough high bit stuff now that I should package a Digital Intermediates tools package...
Old 6th September 2011, 22:08   #16  |  Link
ajp_anton
Registered User
 
Join Date: Aug 2006
Location: Stockholm/Helsinki
Posts: 805
I would love to see
- ConvertTo8bit
- ConvertTo16bit
And while I don't fully understand the practical difference between packed and planar,
- ConvertToPlanar420
- ConvertToPlanar444
- ConvertToPacked422
etc...
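
On the practical difference between packed and planar: packed stores the channels of each pixel next to each other (RGBRGB...), while planar stores each channel as its own contiguous image (RRR... GGG... BBB...). A small sketch with 16-bit samples follows; the PlanarRGB type and function names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: one contiguous buffer per channel.
struct PlanarRGB {
    std::vector<std::uint16_t> r, g, b;
};

// Packed input is R0 G0 B0 R1 G1 B1 ...; each channel is pulled into its own plane.
inline PlanarRGB packed_to_planar(const std::vector<std::uint16_t>& packed) {
    assert(packed.size() % 3 == 0);
    const std::size_t n = packed.size() / 3;
    PlanarRGB out;
    out.r.resize(n);
    out.g.resize(n);
    out.b.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.r[i] = packed[3 * i + 0];
        out.g[i] = packed[3 * i + 1];
        out.b[i] = packed[3 * i + 2];
    }
    return out;
}

// Inverse: interleave the three planes back into one packed buffer.
inline std::vector<std::uint16_t> planar_to_packed(const PlanarRGB& p) {
    assert(p.r.size() == p.g.size() && p.g.size() == p.b.size());
    std::vector<std::uint16_t> packed(p.r.size() * 3);
    for (std::size_t i = 0; i < p.r.size(); ++i) {
        packed[3 * i + 0] = p.r[i];
        packed[3 * i + 1] = p.g[i];
        packed[3 * i + 2] = p.b[i];
    }
    return packed;
}
```

Planar tends to suit per-channel filters and SIMD loops, since a filter can walk one plane with a single stride. Packed tends to suit APIs that expect whole pixels at once, which is the OpenCL/OpenGL case raised earlier in the thread.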
Old 6th September 2011, 23:05   #17  |  Link
PhrostByte
Grand Fruitioner
 
Join Date: Mar 2004
Location: Chicago, IL
Posts: 115
Quote:
Originally Posted by jmac698 View Post
I've found just the kind of comments you were looking for. Actually I brought this up in 2006 and not much has changed since then - except I actually wrote Deepcolor tools

From IanB, http://forum.doom9.org/showthread.php?t=118580

In the last bit where he talks about standards, he didn't realize there was v210. That's what he meant about registered by Apple. So yes, v210 is an official 'hack' for high bit-depth in VFW. I see nothing wrong with this. Our new convention of stacked 16-bit as (MSB, LSB) or your planar format could become an official FourCC as well, unless there's something existing that's similar.

I had some more thoughts on this, can't find it yet - but I listed all the meta data which would actually be useful per frame.
Here I'm more worried about organizing things for optimal in-memory editing. I don't think it would make a good fourcc. I'll be interested to see your list of frame-specific metadata.

Quote:
Originally Posted by jmac698 View Post
And here's another good thread

http://forum.doom9.org/showthread.php?t=127779

The discussion was about using signed or unsigned 16-bit ints. Someone from an audio background initially favored signed, but someone who had written image processing in 16-bit unsigned said that there was no problem with unsigned as far as MMX. Another person then pointed out that targeting SSE2 makes sense now, and that even floating point was fast these days.

A few people would definitely use Avisynth for color grading and denoising in Digital Intermediates, if we had high bit-depth. They suggested 10-bit with gamma, 16-bit linear, 10-bit log, and 32-bit float as some standards. I also discovered there is a rawsource plugin for 16-bit import that should be a lot more workable now that dither, avs2yuv etc. are available. People have access to up to 4k resolution, 16-bit files. This has been requested for over 5 years.
I've never actually done 16-bit image processing before so I don't know what would be best. Floating point is simpler and wicked fast on modern CPUs, so other than memory bandwidth and file storage I've not seen a good use for integer formats.

Windows internally uses full range unsigned integers, so that's what I'm inclined to use for now. A signed HDR format does sound interesting, though.
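
To make the full-range unsigned integer versus float trade-off concrete, here is a small sketch of converting between the two, assuming 0..65535 maps linearly onto 0.0..1.0. The function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Full-range unsigned 16-bit to normalized float: 0 -> 0.0f, 65535 -> 1.0f.
inline float u16_to_float(std::uint16_t v) {
    return static_cast<float>(v) / 65535.0f;
}

// Normalized float back to 16-bit, clamping out-of-range input and rounding
// to nearest. The first test is written so that NaN also maps to 0.
inline std::uint16_t float_to_u16(float f) {
    if (!(f > 0.0f)) return 0;
    if (f >= 1.0f) return 65535;
    return static_cast<std::uint16_t>(f * 65535.0f + 0.5f);
}
```

Every 16-bit value survives a round trip through float, so float can be the working format without losing integer sources. The cost is the one already mentioned: four bytes per sample instead of two, which matters for memory bandwidth and storage. Float also keeps values outside 0..1 until the final clamp, which an unsigned integer format cannot represent at all.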

Thanks for all your posts!

Quote:
Originally Posted by ajp_anton View Post
I would love to see
- ConvertTo8bit
- ConvertTo16bit
And while I don't fully understand the practical difference between packed and planar,
- ConvertToPlanar420
- ConvertToPlanar444
- ConvertToPacked422
etc...
It will definitely have all of these.

Last edited by PhrostByte; 6th September 2011 at 23:10.
Old 7th September 2011, 00:45   #18  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,867
Well, I can't find my big list, though I did mention this in a few threads. I just think we need an open and easy to access set of data per frame. Some things per frame: caption characters, color standard, interlaced, frame type.
dgdecode saves the per-frame color matrix as hints in the lower bits of the picture. So there is one immediate use. I saw that Fizick wanted metadata; we could ask him what his needs are.
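
As a rough illustration of the "hints in the lower bits" technique, the sketch below hides a 32-bit value in the least significant bit of the first 32 samples of a row. The layout and names are hypothetical and are not claimed to match what dgdecode or any other existing filter actually writes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: bit i of the hint lives in the LSB of sample i.
constexpr std::size_t kHintBits = 32;

inline void write_hint(std::uint8_t* row, std::size_t width, std::uint32_t hint) {
    assert(row != nullptr && width >= kHintBits);
    for (std::size_t i = 0; i < kHintBits; ++i) {
        const std::uint8_t bit = static_cast<std::uint8_t>((hint >> i) & 1u);
        row[i] = static_cast<std::uint8_t>((row[i] & 0xFEu) | bit);
    }
}

inline std::uint32_t read_hint(const std::uint8_t* row, std::size_t width) {
    assert(row != nullptr && width >= kHintBits);
    std::uint32_t hint = 0;
    for (std::size_t i = 0; i < kHintBits; ++i) {
        hint |= static_cast<std::uint32_t>(row[i] & 1u) << i;
    }
    return hint;
}
```

Each touched sample changes by at most one code value, so the picture is barely affected, but any filter that modifies those pixels between writer and reader destroys the hint. That fragility is the argument for carrying metadata in a dedicated per-frame structure instead.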
Old 8th September 2011, 02:06   #19  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,867
Hi,
I'm still gathering requirements here, for future reference. I found this thread interesting:
http://forum.doom9.org/showthread.php?t=162459
The problem was that a VOB was changing resolution mid-stream, and it was reported that ffms does handle this. It's unknown how encoders handle it. So the proposal is that we need to store resolution per frame. And I know that audio can change from stereo to 5.1 (during commercials). Sure, some plugins that use temporal references won't be safe against resolution changes; perhaps frames should be padded to the largest possible size? Well, just throwing it out there.
Old 8th September 2011, 04:01   #20  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
Quote:
Originally Posted by jmac698 View Post
Hi,
I'm still gathering requirements here, for future reference. I found this thread interesting:
http://forum.doom9.org/showthread.php?t=162459
The problem was that a VOB was changing resolution mid-stream, and it was reported that ffms does handle this. It's unknown how encoders handle it. So the proposal is that we need to store resolution per frame. And I know that audio can change from stereo to 5.1 (during commercials). Sure, some plugins that use temporal references won't be safe against resolution changes; perhaps frames should be padded to the largest possible size? Well, just throwing it out there.
I don't think you've understood the implications of what you propose. While FFMS2 does support variable resolution input, the output from the Avisynth plugin is always fixed resolution. The biggest problem here is probably that even FFMS2 doesn't know what resolution a video frame will be until after it's been decoded, so in order to ensure that you allocate a frame big enough to fit all resolutions that will be encountered, you need to decode the entire stream first.

As for audio, FFMS2 doesn't support channel layout switches yet. It's in the works, though, and will probably be in 2.17.

Furthermore, while high bitdepth stuff is bad enough, we're now starting to get into the sort of hacks that aren't just ugly, but really hairy as well.
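
The allocation problem described above can be sketched as follows. The padded frame size is only known after the size of every decoded frame has been seen, so the list of sizes below stands in for a complete first pass over the stream. The Size type and both functions are hypothetical and are not part of the FFMS2 API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Size {
    std::uint32_t width;
    std::uint32_t height;
};

// Smallest canvas that fits every frame. The input must cover the whole
// stream, which is why a full decode pass is needed before allocating.
inline Size bounding_size(const std::vector<Size>& decoded_sizes) {
    Size box{0, 0};
    for (const Size& s : decoded_sizes) {
        box.width = std::max(box.width, s.width);
        box.height = std::max(box.height, s.height);
    }
    return box;
}

// Copy a tightly packed 8-bit plane into the top-left corner of a larger
// canvas, filling the remainder with `fill`.
inline std::vector<std::uint8_t> pad_plane(const std::vector<std::uint8_t>& src,
                                           Size from, Size to, std::uint8_t fill) {
    assert(src.size() == static_cast<std::size_t>(from.width) * from.height);
    assert(from.width <= to.width && from.height <= to.height);
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(to.width) * to.height, fill);
    for (std::size_t y = 0; y < from.height; ++y) {
        std::copy_n(src.data() + y * from.width, from.width,
                    dst.data() + y * to.width);
    }
    return dst;
}
```

Padding keeps temporal filters working on a constant frame size, but the real dimensions of each frame would still have to travel as per-frame metadata so that later filters and the encoder can tell picture from padding.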