Old 1st June 2021, 11:54   #1001  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Unfortunately I don't have time right now, but a quick test could be this:
Code:
static void Sobel_16(const unsigned char *psrc,unsigned char *pdst,const int32_t src_pitch, const int32_t dst_pitch,
	const int32_t src_height,int32_t dst_row_size, int32_t thresh,uint8_t bit_pixel)
{
  const int32_t i = (dst_row_size + 3) >> 2;
  const int32_t i0 = (dst_row_size + 3-2) >> 2;

  dst_row_size >>= 1;
  thresh <<= (bit_pixel-8);

  if (aWarpSharp_Enable_AVX)
  {
    for (int32_t y=0; y<src_height; y++)
    {
		uint16_t *dst=(uint16_t *)pdst;
		
		if (y==0) JPSDR_Sobel_16_AVX(psrc+2,pdst+2,src_pitch,y,src_height,i0,thresh);
		else
		{
			if (y==src_height-1) JPSDR_Sobel_16_AVX(psrc,pdst,src_pitch,y,src_height,i0,thresh);
			else JPSDR_Sobel_16_AVX(psrc,pdst,src_pitch,y,src_height,i,thresh);
		}
		dst[0]=dst[1];
		dst[dst_row_size-1]=dst[dst_row_size-2];

		psrc += src_pitch;
		pdst += dst_pitch;
    }
  }
  else
  {
    for (int32_t y=0; y<src_height; y++)
    {
		uint16_t *dst=(uint16_t *)pdst;

		if (y==0) JPSDR_Sobel_16_SSE2(psrc+2,pdst+2,src_pitch,y,src_height,i0,thresh);
		else
		{
			if (y==src_height-1) JPSDR_Sobel_16_SSE2(psrc,pdst,src_pitch,y,src_height,i0,thresh);
			else JPSDR_Sobel_16_SSE2(psrc,pdst,src_pitch,y,src_height,i,thresh);
		}
		dst[0]=dst[1];
		dst[dst_row_size-1]=dst[dst_row_size-2];

		psrc += src_pitch;
		pdst += dst_pitch;
    }
  }
}
You have to set threads=1 in the aWarpSharp2 call.

The
Code:
dst[0]=dst[1];
dst[dst_row_size-1]=dst[dst_row_size-2];
may fill the missing pixels.
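
A minimal sketch of that edge fill in isolation, assuming a row of 16-bit pixels whose interior has already been computed. The helper name replicate_row_edges is hypothetical and is not part of aWarpSharp2; width is counted in pixels, like dst_row_size after the ">>= 1" above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: copy the nearest computed neighbour into the
// first and last pixel of one row (the two pixels the filter skips).
inline void replicate_row_edges(std::uint16_t* row, int width) {
    assert(width >= 2);               // needs at least one neighbour per edge
    row[0] = row[1];                  // left edge takes its right neighbour
    row[width - 1] = row[width - 2];  // right edge takes its left neighbour
}
```

Interior pixels are left untouched; only the two border values change.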
__________________
My github.

Last edited by jpsdr; 1st June 2021 at 11:57.
Old 1st June 2021, 13:27   #1002  |  Link
GMJCZP
Registered User
 
Join Date: Apr 2010
Location: I have a statue in Hakodate, Japan
Posts: 744
Quote:
Originally Posted by GMJCZP View Post
Thanks pinterf.

With frames = 2 it improved noticeably but still doesn't even match Prefetch 1:

Code:
Log file created with:      AVSMeter 3.0.9.0 (x86)
Script file:                Prueba.avs
Command line switches:      -log

[OS/Hardware info]
Operating system:           Windows 7 (x86) Service Pack 1.0 (Build 7601)

CPU:                        Pentium(R) Dual-Core CPU E5800 @ 3.20GHz / Wolfdale (Core 2 Duo) 2M
                            MMX, SSE, SSE2, SSE3, SSSE3
                            2 physical cores / 2 logical cores


[Avisynth info]
VersionString:              AviSynth+ 3.7.0 (r3382, 3.7, i386)
VersionNumber:              2.60
File / Product version:     3.7.0.0 / 3.7.0.0
Interface Version:          8
Multi-threading support:    Yes
Avisynth.dll location:      C:\Windows\system32\avisynth.dll
Avisynth.dll time stamp:    2021-01-11, 20:46:40 (UTC)
PluginDir2_5 (HKLM, x86):   C:\Program Files\AviSynth+\plugins
PluginDir+   (HKLM, x86):   C:\Program Files\AviSynth+\plugins+


[Clip info]
Number of frames:                      162
Length (hh:mm:ss.ms):         00:00:06.757
Frame width:                           640
Frame height:                          480
Framerate:                          23.976 (24000/1001)
Colorspace:                      YUV420P12
Audio channels:                        n/a
Audio bits/sample:                     n/a
Audio sample rate:                     n/a
Audio samples:                         n/a


[Runtime info]
Frames processed:                   162 (0 - 161)
FPS (min | max | average):          0.971 | 107491 | 29.03
Process memory usage (max):         37 MiB
Thread count:                       8
CPU usage (average):                64.9%

Time (elapsed):                     00:00:05.581


[Script]

LWLibavVideoSource("Sample2.mp4")
AssumeFPS("ntsc_film")
Prefetch(2,frames=2)
I include the test with FFMS2:

Code:
Log file created with:      AVSMeter 3.0.9.0 (x86)
Script file:                Prueba.avs
Command line switches:      -log

[OS/Hardware info]
Operating system:           Windows 7 (x86) Service Pack 1.0 (Build 7601)

CPU:                        Pentium(R) Dual-Core CPU E5800 @ 3.20GHz / Wolfdale (Core 2 Duo) 2M
                            MMX, SSE, SSE2, SSE3, SSSE3
                            2 physical cores / 2 logical cores


[Avisynth info]
VersionString:              AviSynth+ 3.7.0 (r3382, 3.7, i386)
VersionNumber:              2.60
File / Product version:     3.7.0.0 / 3.7.0.0
Interface Version:          8
Multi-threading support:    Yes
Avisynth.dll location:      C:\Windows\system32\avisynth.dll
Avisynth.dll time stamp:    2021-01-11, 20:46:40 (UTC)
PluginDir2_5 (HKLM, x86):   C:\Program Files\AviSynth+\plugins
PluginDir+   (HKLM, x86):   C:\Program Files\AviSynth+\plugins+


[Clip info]
Number of frames:                      162
Length (hh:mm:ss.ms):         00:00:06.757
Frame width:                           640
Frame height:                          480
Framerate:                          23.976 (24000/1001)
Colorspace:                      YUV420P16
Audio channels:                        n/a
Audio bits/sample:                     n/a
Audio sample rate:                     n/a
Audio samples:                         n/a


[Runtime info]
Frames processed:                   162 (0 - 161)
FPS (min | max | average):          0.756 | 239787 | 5.697
Process memory usage (max):         35 MiB
Thread count:                       7
CPU usage (average):                81.5%

Time (elapsed):                     00:00:28.437


[Script]
FFVideoSource("Sample2.mp4")
AssumeFPS("ntsc_film")
Prefetch(2)
My Avs+ system is suffering from "mono-nucleosis".
I need your help.
__________________
By law and justice!

GMJCZP's Arsenal

Last edited by GMJCZP; 1st June 2021 at 13:29.
Old 1st June 2021, 16:26   #1003  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
@Wonkey,

Avisynth is a bit laxly specified: you can even have a variable with the same name as a function. It's not until
it tries to evaluate an expression that it can figure out what it is, and whether the result of that expression is
assigned to some variable or used in some way as an argument in another expression. It may not be used
for anything at all, e.g. just plonk a "123456" on a line by itself somewhere, with or without the double quotes.

I doubt whether the Avisynth script language could be properly described in Backus–Naur form [used in Kernighan & Ritchie,
"The C Programming Language", at the back of the book to describe Std/ISO C], or in those language-describing tools
derived from Unix "Lex" and the like.

Making "\" line continuation optional is just making it even more lax than it already is, and is bound to come with a bundle of trip wires; safer to forget any changes at this stage in the life of AVS.
I guess Ben implemented the language to the point that it could get the job done, and no further.

Backus–Naur form:- https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form
The C Programming Language:- https://en.wikipedia.org/wiki/The_C_...mming_Language
Lex:- https://en.wikipedia.org/wiki/Lex_(software)
Lexx [The bestest Sci-Fi ever created, weird mix of Canadian-German humour]:- https://en.wikipedia.org/wiki/Lexx

EDIT:
Quote:
plonk a "123456" on a line
Except the last line. [the entire script must evaluate to a clip]

EDIT: Fixed the Lex link, the trailing ")" of the url is wrongly appended as simple text after the remaining part of the link by the vBulletin url insertion thingy.
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???

Last edited by StainlessS; 1st June 2021 at 17:15.
Old 1st June 2021, 17:01   #1004  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
OK, I've tested the following change:
Code:
static void Sobel_8(const unsigned char *psrc,unsigned char *pdst,const int32_t src_pitch, const int32_t dst_pitch,
	const int32_t src_height,const int32_t dst_row_size, int32_t thresh)
{
  const int32_t i = (dst_row_size-2 + 3) >> 2;

  if (aWarpSharp_Enable_AVX)
  {
	for (int32_t y=0; y<src_height; y++)
    {
		JPSDR_Sobel_8_AVX(psrc+1,pdst+1,src_pitch,y,src_height,i,thresh);
		pdst[0] = pdst[1];
		pdst[dst_row_size-1] = pdst[dst_row_size-2];
		psrc += src_pitch;
		pdst += dst_pitch;
    }
  }
  else
  {
	for (int32_t y=0; y<src_height; y++)
    {
		JPSDR_Sobel_8_SSE2(psrc+1,pdst+1,src_pitch,y,src_height,i,thresh);
		pdst[0] = pdst[1];
		pdst[dst_row_size-1] = pdst[dst_row_size-2];
		psrc += src_pitch;
		pdst += dst_pitch;
    }
  }
}
Tested with:
Code:
a=AVISource("SP_IVTC.avi",False,"YV12").SetPlanarLegacyAlignment(True)
b=aWarpSharp2(a,chroma=3,threads=1)
c=aWarpSharp2(a,chroma=3,threads=0)

Subtract(b,c).Levels(127, 1, 129, 0, 255)
As I was hoping, it seems to produce the same result.
So can you test whether it still crashes with the previous change and the following change:
Code:
static void Sobel_16(const unsigned char *psrc,unsigned char *pdst,const int32_t src_pitch, const int32_t dst_pitch,
	const int32_t src_height,int32_t dst_row_size, int32_t thresh,uint8_t bit_pixel)
{
  const int32_t i = (dst_row_size-4 + 3) >> 2;

  dst_row_size >>= 1;
  thresh <<= (bit_pixel-8);

  if (aWarpSharp_Enable_AVX)
  {
    for (int32_t y=0; y<src_height; y++)
    {
		uint16_t *dst=(uint16_t *)pdst;

		JPSDR_Sobel_16_AVX(psrc+2,pdst+2,src_pitch,y,src_height,i,thresh);
		dst[0]=dst[1];
		dst[dst_row_size-1]=dst[dst_row_size-2];

		psrc += src_pitch;
		pdst += dst_pitch;
    }
  }
  else
  {
    for (int32_t y=0; y<src_height; y++)
    {
		uint16_t *dst=(uint16_t *)pdst;

		JPSDR_Sobel_16_SSE2(psrc+2,pdst+2,src_pitch,y,src_height,i,thresh);
		dst[0]=dst[1];
		dst[dst_row_size-1]=dst[dst_row_size-2];

		psrc += src_pitch;
		pdst += dst_pitch;
    }
  }
}
with the following:
Code:
aWarpSharp2(a,chroma=3,threads=1)
Edit
... I realised too late that I should have put this in my aWarpSharp thread...
__________________
My github.

Last edited by jpsdr; 1st June 2021 at 17:04.
Old 1st June 2021, 17:39   #1005  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
Quote:
Originally Posted by jpsdr View Post
OK, I've tested the following change:
...
with the following :
Code:
aWarpSharp2(a,chroma=3,threads=1)
Thanks, neither Sobel_8 nor Sobel_16 is working (I don't have AVX on this test machine), because there is a new crash, e.g. in JPSDR_Sobel_8_SSE2:
Code:
    movntdq XMMWORD ptr[rsi+rdi],xmm2
I think changing the destination start by +1 (8 bit) or +2 (16 bit) makes the store unaligned.

Unfortunately, one has to treat the leftmost and rightmost loads specially if we want to keep aligned access in the middle.
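
To make the alignment point concrete, here is a small sketch, assuming 16-byte SSE stores. It only does the address arithmetic: it checks whether a pointer sits on a 16-byte boundary and splits a row into an unaligned head, an aligned middle and an unaligned tail. The names is_aligned16, RowSplit and split_row are hypothetical and do not come from the aWarpSharp2 source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when p lies on a 16-byte boundary, which is
// what aligned and streaming SSE stores (movdqa, movntdq) require.
inline bool is_aligned16(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

struct RowSplit {
    std::size_t head;    // bytes before the first 16-byte boundary
    std::size_t middle;  // bytes that can be written with aligned stores
    std::size_t tail;    // bytes left after the last full 16-byte block
};

// Hypothetical helper: split `bytes` bytes starting at dst into the three
// regions that would need different store instructions.
inline RowSplit split_row(const void* dst, std::size_t bytes) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(dst);
    std::size_t head = (16 - (addr & 15u)) & 15u;  // 0 when already aligned
    if (head > bytes) head = bytes;
    const std::size_t rest = bytes - head;
    const std::size_t middle = rest & ~static_cast<std::size_t>(15);
    const std::size_t tail = rest - middle;
    assert(head + middle + tail == bytes);
    return RowSplit{head, middle, tail};
}
```

With a 16-byte aligned row, starting the destination at +1 or +2 bytes fails the is_aligned16 check, which is the situation described above; only the middle region stays safe for aligned or streaming stores.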
Old 1st June 2021, 20:03   #1006  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
Argh... I forgot this. For testing purposes right now, does replacing "movntdq" with "movdqu" make it work?
If yes, I will create a specific asm function with "movdqu" for the first and last line.

Euh... Why did Sobel_8 work for me on my Windows 7 x86 avs2.60? Ah... the picture was grey as expected in VDub, and I probably didn't notice an error message displayed in the bottom information line. Since there was no pop-up crash and the displayed picture was grey as expected, I didn't realise...

I'll redo the test.

Edit
Indeed, I didn't notice the error message in the bottom line of VDub...

Edit2
Replacing with "movdqu" solved it, but while it seems to produce the same result for the first pixel of the line, it does not seem to for the last pixel of the line.
Can you confirm that with "movdqu" there is also no crash with CUDA?
__________________
My github.

Last edited by jpsdr; 1st June 2021 at 20:16.
Old 1st June 2021, 20:33   #1007  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
I'll look into it tomorrow. This bug, however, is not specific to the CUDA-aware build; it was pure luck if it did not cause trouble. movntdq is the streaming (non-temporal) version of the mov instruction and requires aligned access. Replacing it with the unaligned store (movdqu) will surely solve the problem.
Old 1st June 2021, 22:19   #1008  |  Link
GMJCZP
Registered User
 
Join Date: Apr 2010
Location: I have a statue in Hakodate, Japan
Posts: 744
Friends, I am extremely worried. I made a great effort to buy my "new" card with a dual-core processor and I have not been able to take advantage of its two cores; I feel like I have a Celeron. If someone could shed some light on this, please do: I am trying to update and improve all the scripts that I have posted, but this situation is out of my hands.
__________________
By law and justice!

GMJCZP's Arsenal
Old 2nd June 2021, 10:46   #1009  |  Link
tormento
Acid fr0g
 
Join Date: May 2002
Location: Italy
Posts: 2,582
Let me dream about at least MVTools2 on CUDA.
__________________
@turment on Telegram
Old 2nd June 2021, 11:01   #1010  |  Link
Gavino
Avisynth language lover
 
Join Date: Dec 2007
Location: Spain
Posts: 3,431
Quote:
Originally Posted by StainlessS View Post
Making "\" line continuation optional, is just making it even more lax than it already is, and is bound to come with a bundle of trip wires, safer to forget any changes at this stage in the life of AVS.
In the AviSynth language, newline is a statement terminator (except when using the \ escape), but the parser can also recognise it has reached the end of a statement when the next symbol is not a valid continuation of what it already has. Therefore in most cases, the newline is not strictly necessary (though recommended for readability).
However, there are some cases where a newline is required (see this post and this one), so it is not possible to have the parser ignore them altogether.
Quote:
I doubt whether avisynth script language could be properly described in Backus–Naur
Well, we do have the formal Avisynth grammar in Extended Backus-Naur Form (EBNF) (contributed a long time ago by gzardakas and possibly slightly out-of-date).
__________________
GScript and GRunT - complex Avisynth scripting made easier
Old 2nd June 2021, 11:15   #1011  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
So true...
__________________
Me on GitHub
PC Specs: Ryzen 5950X, 64 GB RAM, RTX 2070
Old 2nd June 2021, 11:35   #1012  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
Quote:
Originally Posted by DJATOM View Post
So true...
Nekopanda extracted and rewrote some of the code that was needed for his project. Actually, the KTGMC filter is the beginning of what you are searching for.
https://github.com/pinterf/AviSynthC...C/MV.cpp#L5416
Old 2nd June 2021, 11:52   #1013  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
Interesting. To actually use that, do I have to build the Avs+ WIP, or will 3.7 work?
__________________
Me on GitHub
PC Specs: Ryzen 5950X, 64 GB RAM, RTX 2070
Old 2nd June 2021, 12:33   #1014  |  Link
kedautinh12
Registered User
 
Join Date: Jan 2018
Posts: 2,156
Quote:
Originally Posted by DJATOM View Post
Interesting. To actually use that, do I have to build the Avs+ WIP, or will 3.7 work?
Here is a pre-built x64 CUDA build:
https://drive.google.com/uc?export=d...Q4x49ZCt8W2jkS
Old 2nd June 2021, 12:39   #1015  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
I already did my own build; now I'm building the Boost libs, since that's the only dependency I haven't resolved yet.
__________________
Me on GitHub
PC Specs: Ryzen 5950X, 64 GB RAM, RTX 2070
Old 2nd June 2021, 13:15   #1016  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
Apparently it doesn't work.
Quote:
ClearAutoloadDirs()
AddAutoloadDir("C:\avsCuda\scripts")
AddAutoloadDir("C:\avsCuda\plugins")
DGSource("NCOP.dgi").onCPU()
KTGMC()
gives me
Quote:
Error: [KMasktoolFilterBase] CUDAフレームを入力してください ("Please input a CUDA frame") occurred while reading frame 0.
while
Quote:
KTGMC().onCUDA()
doesn't crash, but the process does nothing; it simply waits for something.
__________________
Me on GitHub
PC Specs: Ryzen 5950X, 64 GB RAM, RTX 2070
Old 2nd June 2021, 13:40   #1017  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
I don't remember now, have you built masktools as well? (cuda branch in my masktools2 repo)
Old 2nd June 2021, 14:50   #1018  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
I went through some steps of your build manual with minor changes (using CUDA 11.3 and setting the CUDA arch to 7.5 where it was set to a lower version; I also changed afxres.h to windows.h in nnedi3, since it doesn't build on the latest MSVC).
__________________
Me on GitHub
PC Specs: Ryzen 5950X, 64 GB RAM, RTX 2070
Old 2nd June 2021, 15:13   #1019  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
Ehh, that build manual is rather a 'log of my adventures on doing something I've never encountered before'. O.K., in a somewhat polished version, but I was happy that it worked as is for me. Anyway it can be a good start for someone who really wants to be involved in the project. (Probably not me, it requires weeks or months to have an active knowledge on it, though programming CUDA is a very interesting topic).
Old 2nd June 2021, 16:10   #1020  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
Quote:
Originally Posted by GMJCZP View Post
My Avs+ system is suffering from "mono-nucleosis".
I need your help.
The problem is that you are trying to use multithreading on a simple script where the source filter is MT_SERIALIZED (I just saw it in the source), so it cannot be requested in parallel.

Such filters cannot be called again for a new frame until the previous frame is ready. There is blocking; new requests go into a queue. Prefetch(2) starts threads that prefetch 4 frames in advance.

It is possible that the calls in this scenario are producing nonlinear frame access. In this case the somewhat slower execution is blocking the other frame requests, we are seeing a negative feedback.
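
A minimal sketch of this situation, assuming nothing about the real AviSynth+ internals: a source object guarded by a mutex (standing in for an MT_SERIALIZED filter) is asked for frames by several worker threads. The names SerializedSource and run_prefetch_sketch are hypothetical. The mutex guarantees that only one request is inside the source at a time, but the order in which frame numbers arrive depends on thread timing and is not guaranteed to be linear.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a source filter that must never be entered
// by two threads at once.
class SerializedSource {
public:
    int get_frame(int n) {
        std::lock_guard<std::mutex> lock(mutex_);  // other callers block here
        const int now = ++inside_;
        assert(now == 1);                // serialized: exactly one caller inside
        peak_ = std::max(peak_, now);
        order_.push_back(n);             // record the order requests were served
        --inside_;
        return n;
    }
    int peak() const { return peak_; }
    const std::vector<int>& order() const { return order_; }

private:
    std::mutex mutex_;
    int inside_ = 0;
    int peak_ = 0;
    std::vector<int> order_;
};

// Hypothetical driver: `threads` workers request frames 0..frames-1
// round-robin. Returns the peak number of callers seen inside the source.
inline int run_prefetch_sketch(int threads, int frames,
                               std::vector<int>* order_out = nullptr) {
    SerializedSource src;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&src, t, threads, frames] {
            for (int n = t; n < frames; n += threads) src.get_frame(n);
        });
    }
    for (auto& th : pool) th.join();
    if (order_out) *order_out = src.order();
    return src.peak();
}
```

The peak is always 1 however many workers are used, so the extra threads add no parallelism at the source. The recorded order can differ from 0, 1, 2, ... between runs, which is the kind of out-of-sequence access a linear decoder handles badly.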

This is not something which is easily debuggable. There is probably a 1/10000 sec difference in timing conditions which results in the first out-of-sequence access to LWLibavVideoSource. The debug build is quick, but the release build gets into this state after some ten frames. When I put a simple line in the source code around the delay (std::cout << frame_number, really not a time-consuming operation), which writes the actual frame number to the standard output, the problem disappears.
Pretty much the observer effect: the disturbance of an observed system by the act of observation.

I'd like to understand how it begins and how this chaotic internal state can be healed but it is not easy at all.