Old 31st July 2014, 07:28   #41  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
NNEDI3 is quite great, IMHO. It does have disadvantages, too, though. E.g. in some image areas (trees, grass, leaves) it can produce fractal-like artifacts. That doesn't happen with Jinc. Also NNEDI3 is a lot slower than Jinc, when using madVR. I'm not sure how speed compares in AviSynth, though. Finally, NNEDI3 can only upscale by exactly 2.0x, while Jinc can up- and downscale by any factor you like. So although NNEDI3 is great, there's a place for Jinc, too, IMHO.
Old 31st July 2014, 18:03   #42  |  Link
zerowalker
Registered User
 
Join Date: Jul 2011
Posts: 1,121
Hmm, well, I guess I'd say the same: Lanczos looks similar, but has more artifacts.
Jinc looks less detailed, so it is indeed softer.

But I think it depends on the content: some things look better with a softer scale, while others need that sharp edge.
Old 31st July 2014, 18:33   #43  |  Link
innocenat
Registered User
 
Join Date: Dec 2011
Posts: 77
Quote:
Originally Posted by madshi View Post
Also NNEDI3 is a lot slower than Jinc, when using madVR. I'm not sure how speed compares in AviSynth, though.
In my test (on mobile Haswell, using the AVX2 path, which is ~25% faster than SSE3 on the same CPU), Jinc36 720p->1080p runs at ~20 fps. nnedi3 doubling followed by Spline64 to 1080p averages 3-5 fps with the prescreener on and <1 fps with it off. I can't test the OpenCL version because I can't force it to use my discrete GPU over the integrated one (I'm on an Optimus setup).

So yeah, right now performance is MUCH better. But tbh, given how the current code is optimized, I don't think it will be nearly as fast on the CPU once the anti-ringing filter is implemented, due to the branchy nature of that filter. I am looking into OpenCL right now, but no promises, since I don't have much free time nowadays.
__________________
AviSynth+
Old 1st August 2014, 14:13   #44  |  Link
Reel.Deel
Registered User
 
Join Date: Mar 2012
Location: Texas
Posts: 1,664
Hi innocenat,

Thanks for the update. I was wondering what the purpose of the version parameter is. When I set it to true, it gives this message:
Quote:
[Jinc Resizer] [7] Compiled Instruction Set: FMA3 AVX2 SSE3 SSE2 x86
Old 1st August 2014, 14:32   #45  |  Link
innocenat
Registered User
 
Join Date: Dec 2011
Posts: 77
It currently shows Jinc's internal CPU flag (the [7]) and the instruction sets the plugin was compiled with. At first I was not sure whether AVX2 would really be faster, so I made this mechanism to tell which version of the plugin you have. Jinc has its own internal CPU flag because Avisynth and Avisynth+ currently cannot detect FMA3 and AVX2. Granted, the parameter name is a misnomer.
__________________
AviSynth+
Old 1st August 2014, 14:36   #46  |  Link
Reel.Deel
Registered User
 
Join Date: Mar 2012
Location: Texas
Posts: 1,664
Thanks for the information. One more question, if you don't mind: what's the license for JincResize? Apache 2.0? The reason I'm asking is that I want to add JincResize to the wiki.
Old 1st August 2014, 14:44   #47  |  Link
innocenat
Registered User
 
Join Date: Dec 2011
Posts: 77
Put it as Apache 2.0 I guess.

The Jinc function calculations (JincFilter.cpp) are Apache 2.0 since they come from ImageMagick. The main resampling code I wrote (EWAResizer.h, FilteredEWAResize.cpp, etc.) is under the MIT license. But the combination (i.e. the project itself) is under Apache 2.0. I guess I should put a LICENSE file in the repository.

On a side note, you might encounter line artefacts with large upscaling factors. They can be fixed by increasing the quant_(x|y) option, depending on the direction of the line. I am still not sure whether this is a bug in my code or a limitation of the quantization. I think it's the former, but I can't pinpoint it yet.

The code on GitHub now actually supports downscaling, but I haven't thoroughly checked or tested it for correctness yet.
__________________
AviSynth+
Old 1st August 2014, 23:16   #48  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
I decided to play around with this a little which usually includes getting the code and compiling it myself. I used VC10/ICL13 with PGO to build the DLL.

Test script:
Code:
SetMemoryMax(1700)
LoadPlugin("JincResize.dll")

w = 1280
h = 720
colorbars(width = w, height = h, pixel_type = "yv12").killaudio().assumefps(24000, 1001)
trim(0,99)
fadeio(49)
trim(0,99)
v = last

a = v.Jinc36Resize(1920, 1080)
b = v.Jinc64Resize(1920, 1080)
c = v.Jinc144Resize(1920, 1080)
d = v.Jinc256Resize(1920, 1080)

return a++b++c++d
Results with innocenat's DLL:
Code:
Frames processed:               400 (0 - 399)
FPS (min | max | average):      3.965 | 18.09 | 7.212
CPU usage (average):            25%
Thread count:                   1
Physical Memory usage (peak):   1328 MB
Virtual Memory usage (peak):    1327 MB
Time (elapsed):                 000:00:55.463
Results with my DLL:
Code:
Frames processed:               400 (0 - 399)
FPS (min | max | average):      3.998 | 18.29 | 7.263
CPU usage (average):            25%
Thread count:                   1
Physical Memory usage (peak):   644 MB
Virtual Memory usage (peak):    643 MB
Time (elapsed):                 000:00:55.075
This was tested on an i5 2500K @ 4GHz (on XP, so AVX was not used).

The speed is more or less the same but the memory usage is less than half with the DLL I built, no idea why.

FYI, here is the makefile I used to build the DLL:
Code:
CPP=@icl.exe
CPP_FLAGS=/MT /EHa /W0 /O3 /Qipo /arch:IA32 /Qprof-use /D "NDEBUG" /nologo

LINK=@xilink.exe
LINK_FLAGS=/dll /nologo

JincResize.dll: JincFilter.obj AvisynthEntry.obj cpuid.obj FilteredEWAResize.obj
	$(LINK) $(LINK_FLAGS) JincFilter.obj AvisynthEntry.obj cpuid.obj FilteredEWAResize.obj /out:JincResize.dll

JincFilter.obj: JincFilter.cpp 
	$(CPP) $(CPP_FLAGS) JincFilter.cpp -c

AvisynthEntry.obj: AvisynthEntry.cpp 
	$(CPP) $(CPP_FLAGS) AvisynthEntry.cpp -c

cpuid.obj: cpuid.cpp 
	$(CPP) $(CPP_FLAGS) cpuid.cpp -c

FilteredEWAResize.obj: FilteredEWAResize.cpp 
	$(CPP) $(CPP_FLAGS) FilteredEWAResize.cpp -c
Old 2nd August 2014, 01:22   #49  |  Link
innocenat
Registered User
 
Join Date: Dec 2011
Posts: 77
Quote:
Originally Posted by Groucho2004 View Post
I decided to play around with this a little which usually includes getting the code and compiling it myself. I used VC10/ICL13 with PGO to build the DLL.
I use VC12/ICL14 right now. I am surprised it works on WinXP, since I did not select vc120_xp as the base platform, although it is statically compiled.

Quote:
Originally Posted by Groucho2004 View Post
This was tested on an i5 2500K @ 4GHz (on XP, so AVX was not used).
There is no AVX code anymore; it's AVX2 only now, so that path requires Haswell. I might check whether SSE2 integer pack/unpack combined with AVX processing is faster than pure SSE3, but I doubt it.

Quote:
Originally Posted by Groucho2004 View Post
The speed is more or less the same but the memory usage is less than half with the DLL I built, no idea why.
This commit is not in the release build yet.

EDIT: Also, FYI, my official builds are compiled with /arch:SSE, but the important functions are #pragma'd to specific instruction sets anyway (SSE at minimum).
__________________
AviSynth+

Last edited by innocenat; 2nd August 2014 at 07:14.
Old 2nd August 2014, 10:11   #50  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by innocenat View Post
This commit is not in the release build yet.
I see, that might explain the difference.

Quote:
Originally Posted by innocenat View Post
Also, FYI, my official builds are compiled with /arch:SSE, but the important functions are #pragma'd to specific instruction sets anyway (SSE at minimum).
Just checked the ICL13 documentation, "arch:IA32" is the same as "arch:SSE".

Last edited by Groucho2004; 2nd August 2014 at 16:27.
Old 12th September 2014, 19:02   #51  |  Link
NicolasRobidoux
Nicolas Robidoux
 
Join Date: Mar 2011
Location: Montreal Canada
Posts: 269
If you are looking for a method of minimizing haloing that does not rely on some sort of a limiter, you may want to check http://www.imagemagick.org/discourse...b4c5d6db824d9b and the linked discussions on the Luminous Landscape web site, possibly starting with the later posts.
Old 2nd August 2020, 17:50   #52  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,036
I finally found what I call '2D' processing. Unfortunately the current Jinc, being Lanczos-weighted, is not ideal for most high-quality work. It is bad for production/downsizing because it does not produce a 'conditioned' spectrum. It is also not enough for reference display work, because it has too few sinc taps and is additionally weighted with a Lanczos window, which with so few taps does not open up the full potential of the sinc kernel for correctly prepared input data. But, as far as I can see from quick testing, it does implement a correct '2D/round' resampler engine. So we need to add more practical kernels to this resampler: the same as used in SinPowResize for production work and the same as used in SincLin2Resize for reference display. And the current Jinc would be better renamed to something like JLanczosResize.

As for processing speed, it is strange to see the kernel samples calculated at runtime. At least for power-of-2 resizes, one can use a static pre-computed kernel and simply convolve 2D matrices using SIMD, which is faster. Before I knew what Jinc does, I made a simple C-based 2D convolution project for Sinc2D resizing with a static pre-computed kernel as an example. It is on GitHub:
https://github.com/DTL2020/Sinc2D-master
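
To illustrate the pre-computed kernel idea, here is a minimal Python sketch. It is not code from JincResize or from the Sinc2D project. The plain truncated jinc kernel, the pixel-centre alignment, the edge clamping and all function names are assumptions made for the example. For an integer factor the output pixels only ever land on scale*scale sub-pixel phases, so one weight table per phase can be built once and reused for every output pixel.

```python
import math


def bessel_j1(x, steps=400):
    """J1(x) via Bessel's integral (1/pi) * int_0^pi cos(t - x*sin(t)) dt, midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def jinc(r):
    """Normalised jinc, 2*J1(pi*r)/(pi*r), with jinc(0) = 1."""
    if r == 0.0:
        return 1.0
    x = math.pi * r
    return 2.0 * bessel_j1(x) / x


def build_phase_kernels(scale=2, radius=3):
    """Return {(py, px): table}, one normalised (2*radius x 2*radius) table per phase."""
    kernels = {}
    for py in range(scale):
        # fractional source offset of output rows with index % scale == py
        fy = ((py + 0.5) / scale - 0.5) % 1.0
        for px in range(scale):
            fx = ((px + 0.5) / scale - 0.5) % 1.0
            table = []
            for ty in range(-radius + 1, radius + 1):
                row = []
                for tx in range(-radius + 1, radius + 1):
                    r = math.hypot(tx - fx, ty - fy)
                    # assumption: plain jinc truncated to a disc, no window
                    row.append(jinc(r) if r < radius else 0.0)
                table.append(row)
            total = sum(sum(row) for row in table)
            kernels[(py, px)] = [[w / total for w in row] for row in table]
    return kernels


def upscale(img, scale=2, radius=3):
    """Single-pass 2D resample of a list-of-rows image using the static tables."""
    kernels = build_phase_kernels(scale, radius)  # computed once, not per pixel
    h, w = len(img), len(img[0])
    out = [[0.0] * (w * scale) for _ in range(h * scale)]
    for oy in range(h * scale):
        sy = math.floor((oy + 0.5) / scale - 0.5)
        for ox in range(w * scale):
            sx = math.floor((ox + 0.5) / scale - 0.5)
            table = kernels[(oy % scale, ox % scale)]
            acc = 0.0
            for j in range(2 * radius):
                yy = min(max(sy + j - radius + 1, 0), h - 1)  # clamp at edges
                for i in range(2 * radius):
                    xx = min(max(sx + i - radius + 1, 0), w - 1)
                    acc += table[j][i] * img[yy][xx]
            out[oy][ox] = acc
    return out
```

With scale=2 there are only four tables of 36 weights each, so the Bessel evaluations are paid once up front and the inner loop is a plain multiply-accumulate, which is the part a SIMD implementation would vectorise.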

Is the developer of the JincResize project still active?

Quote:
Originally Posted by madshi View Post
Jinc has to be slower than other linear resamplers because due to how it works (can't be split into 2 separate passes) it has to read and process more source pixels. E.g. compared to Lanczos3, Jinc3 has to read and process 4x as many source pixels.

One thing to keep in mind is that NNEDI3 can only do exact 2x enlargements while Jinc can handle any up- and downscale factor you want.
In practice, with very large input arrays, single-pass '2D' processing may even be faster than the V+H 1D passes plus two full in-memory 90-degree rotations that we have in the old AviSynth resampler.

For single-pass 2D we have to read taps*2 lines from memory for the convolution and stream 1 line out for writing. If taps*2*width fits into cache, things should run fast. For a typical taps value of around 20 and widths up to 10000, we have 10000*40 bytes (about 400 KB) to cache for planar 8-bit data.
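
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for the example, and the cache sizes used in the comparison are illustrative assumptions, not measurements of any particular CPU.

```python
def rows_working_set_bytes(taps, width, bytes_per_sample=1):
    """Bytes of source rows touched per output row: 2*taps rows of `width` samples."""
    return 2 * taps * width * bytes_per_sample


def fits_in_cache(taps, width, cache_bytes, bytes_per_sample=1):
    """True if the source rows needed for one output row fit in `cache_bytes`."""
    return rows_working_set_bytes(taps, width, bytes_per_sample) <= cache_bytes


if __name__ == "__main__":
    ws = rows_working_set_bytes(taps=20, width=10000)  # planar 8-bit
    print(ws)                                   # 400000 bytes
    print(fits_in_cache(20, 10000, 256 * 1024))   # hypothetical 256 KiB cache
    print(fits_in_cache(20, 10000, 1024 * 1024))  # hypothetical 1 MiB cache
```

For 16-bit planar data, bytes_per_sample=2 doubles the figure to 800000 bytes, so the same estimate has to be redone per pixel format.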

Quote:
Originally Posted by innocenat View Post
lolno. Jinc (and all lanczos-based resizer) has hallelujah ringing. But less aliasing, yes.
Jinc is a sinc-based filter and it will of course ring if fed with 'unconditioned'/illegal content. So it is of course not a general resize filter for content from unknown sources. But it is good for processing high-quality image data and for emulating a reference video monitor, since such monitors are strictly prohibited from masking any ringing or aliasing caused by illegal video data. So for badly conditioned content it is better to use some other simple resizer such as bilinear, bicubic, etc.

In a 'perfect world' it is the responsibility of the image data producer to supply video image data (compressed to a limited number of samples) that does not ring, does not alias, and has over/undershoot controlled to the producer's taste when it is 'decompressed' to analog image data with a sinc (pure, non-weighted sinc) filter. In practice that means upsizing to an infinite number of digital samples. With the upcoming 8K displays we can finally view Full HD 1080 sampled video data with up to 4x restoration, closer to its analog form, which is usually good enough. It simulates a 54 MHz DAC in a DVD player.

Last edited by DTL; 2nd August 2020 at 20:21.
Old 4th December 2020, 07:40   #53  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,036
Well, I see this thread has not been updated, but there has been a significant plugin update. It looks like it has finally been rewritten with a much more stable 2D resampler core and is now capable of downsizing.

Now I think that, using the current 2D resampler core, it would be good to add the whole common family of resizers, or at least downsizers suitable for production work (with Gibbs-neutralisation properties) such as a Gauss kernel and a SinPow kernel. Then we would at least have a complementary family of 'true-2D' resizers for downsampling and upsampling work, because the current Jinc resizer, like Sinc in 1D, is not suitable for downsizing (anti-aliasing/production) work without additional pre-filtering.

Which version of Visual Studio is required to build the current version 1.1.0? (https://github.com/Asd-g/AviSynth-Ji...ases/tag/1.1.0) I tried VS2013, but it looks like it uses an old standard library and cannot even find the std::cyl_bessel_j(1, x) function.
Old 4th December 2020, 08:10   #54  |  Link
kedautinh12
Registered User
 
Join Date: Jan 2018
Posts: 2,148
Visual studio ver 16
Old 4th December 2020, 08:19   #55  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,036
Quote:
Originally Posted by kedautinh12 View Post
Visual studio ver 16
Is that the one from the VS 2019 package? Is v16.8.1 good enough?
Old 4th December 2020, 08:24   #56  |  Link
kedautinh12
Registered User
 
Join Date: Jan 2018
Posts: 2,148
Yeah
Old 4th December 2020, 15:41   #57  |  Link
real.finder
Registered User
 
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,587
Quote:
Originally Posted by DTL View Post
Now I think that, using the current 2D resampler core, it would be good to add the whole common family of resizers, or at least downsizers suitable for production work (with Gibbs-neutralisation properties) such as a Gauss kernel and a SinPow kernel. Then we would at least have a complementary family of 'true-2D' resizers for downsampling and upsampling work, because the current Jinc resizer, like Sinc in 1D, is not suitable for downsizing (anti-aliasing/production) work without additional pre-filtering.
IIRC z.lib does its resizing as 2D: https://forum.doom9.org/showthread.php?t=173986
__________________
See My Avisynth Stuff
Old 4th December 2020, 17:59   #58  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,036
Quote:
Originally Posted by real.finder View Post
I looked at its description at http://avisynth.nl/index.php/Avsresize and did not find how to use a special '2D' mode. I do not think it is the default processing, because typical 2D processing is much slower than the typical 1D+1D (V+H) processing of 'fast video resizers', and it usually exposes a wide adjustment of the 'plane of processing' size because that greatly influences processing speed. With JincResize that is the tap parameter: it gives a tap*tap plane of processing, and speed decreases accordingly, maybe as the square of tap, or worse when the processing block does not fit into the caches.
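
A rough Python sketch of that scaling follows. It assumes a square window of side 2*tap for the 2D case and two 1D passes of 2*tap samples each for the separable case, and it ignores memory traffic and the intermediate image size. The function names are made up for the example.

```python
def macs_2d(tap):
    """Multiply-accumulates per output pixel for a square (2*tap x 2*tap) window."""
    side = 2 * tap
    return side * side


def macs_separable(tap):
    """Multiply-accumulates per output pixel for a horizontal plus a vertical 1D pass."""
    return 2 * (2 * tap)


if __name__ == "__main__":
    for tap in (3, 4, 6, 8):
        print(tap, macs_2d(tap), macs_separable(tap), macs_2d(tap) / macs_separable(tap))
```

Under these assumptions the 2D cost grows with the square of tap while the separable cost grows linearly, so the ratio between them is simply tap. The 2D counts for tap = 3, 4, 6 and 8 are 36, 64, 144 and 256, which seems to match the Jinc36/64/144/256 function names used earlier in the thread.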
Old 4th December 2020, 18:16   #59  |  Link
real.finder
Registered User
 
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,587
Quote:
Originally Posted by DTL View Post
I looked at its description at http://avisynth.nl/index.php/Avsresize and did not find how to use a special '2D' mode. I do not think it is the default processing, because typical 2D processing is much slower than the typical 1D+1D (V+H) processing of 'fast video resizers', and it usually exposes a wide adjustment of the 'plane of processing' size because that greatly influences processing speed. With JincResize that is the tap parameter: it gives a tap*tap plane of processing, and speed decreases accordingly, maybe as the square of tap, or worse when the processing block does not fit into the caches.
https://forum.doom9.org/showthread.p...90#post1784190 and here https://forum.doom9.org/showthread.p...ib#post1788992
__________________
See My Avisynth Stuff
Old 4th December 2020, 18:30   #60  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,297
Hmm... Maybe I'll try to MT this one... Maybe... From a quick look at the code, it doesn't seem so hard at first glance, but there are always tricky things possible...
Is it really better than the standard resizers or NNEDI3? Or just different?
__________________
My github.