Go Back   Doom9's Forum > Capturing and Editing Video > Avisynth Usage

Old 24th May 2008, 03:31   #881  |  Link
Zep
Registered User
 
Join Date: Jul 2002
Posts: 587
Quote:
Originally Posted by TSchniede View Post
I just tried for several days to get the best performance (while maintaining correct output) out of MVTools. I even tried to multithread MVAnalyse with OpenMP. (It only got slower. I suppose the memory overhead and creating and closing threads several times for each frame is way too high.)

I can tell that the MT method can never produce the same output, even for very large overlaps. That's because MVAnalyse gets its best predictors (the locations where the search starts) from the surrounding blocks that have already been fully computed on this hierarchy plane. So splitting will always miss data.
That is why you need to overlap. The trouble is that if you overlap enough to catch all motion, you slow it down so much you might as well not use MT().
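
To put a rough number on that trade-off, here is a minimal Python sketch. It is not MT's actual code: it assumes the frame is cut into equal slices and that every interior cut is padded by `overlap` rows on each side. The function name `overlap_overhead` and that model are assumptions made only for illustration.

```python
def overlap_overhead(height, threads, overlap):
    """Fraction of extra rows processed when a frame of `height` rows is
    cut into `threads` slices and every interior cut is padded by
    `overlap` rows on each side (an assumed, simplified model)."""
    if threads < 1 or height <= 0 or overlap < 0:
        raise ValueError("height and threads must be positive, overlap >= 0")
    interior_cuts = threads - 1
    extra_rows = 2 * overlap * interior_cuts
    return extra_rows / height


if __name__ == "__main__":
    # 576-row PAL frame, 4 slices: small overlap versus a large one
    for ov in (4, 16, 64):
        print(ov, f"{overlap_overhead(576, 4, ov):.1%}")
```

Under these assumptions a 4-row overlap costs about 4% extra rows on a 576-row frame split four ways, while a 64-row overlap costs about 67%, which is the kind of slowdown described above.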
Old 24th May 2008, 18:28   #882  |  Link
Konrad Klar
Registered User
 
Join Date: Nov 2005
Location: Wałbrzych, Poland
Posts: 105
Code:
SetMTMode(2,0)

LoadPlugin("J:\Program Files\AviSynth 2.5\plugins\exinpaint.dll")
Logo1=ImageSource("K:\afd\WRHH1logoTR.bmp"). ConvertToYV12(matrix="pc.601")
Logo2=ImageSource("K:\afd\WRHH1logoTL.bmp"). ConvertToYV12(matrix="pc.601")

Video=AviSource("example.avi", audio=false)

A=Trim(Video,0,115)
Exinpaint(A,Logo1,color=$FF8080, Radius=100)
LanczosResize(720,540)
A=last

B=Trim(Video,116,0)
Exinpaint(B,Logo2,color=$FF8080, Radius=100)
LanczosResize(720,540)
B=last

AlignedSplice(A,B)


SetMTMode(5,0)
FFT3DGPU()

SetMTMode(2,0)
AddBorders(0,18,0,18)

Distributor()
I have noticed that HCEnc 0.23, when processing this script, uses only about 50% of the CPU (Core2Quad) most of the time. For now I have replaced:
Code:
SetMTMode(5,0)
FFT3DGPU()

SetMTMode(2,0)
by
Code:
FFT3DFilter()
and it gives a significant speedup on my current config (CPU usage is at 69-75%), but in the future I will replace my 8600GT with a faster graphics card.

My question is about FFT3DGPU. FFT3DGPU does not work in parallel threads, for obvious reasons. Which mode would be adequate for this filter? Is SetMTMode(5,0) good, or should it be replaced by SetMTMode(5,1)?
Any suggestions?
Old 2nd June 2008, 07:35   #883  |  Link
TSchniede
Registered User
 
Join Date: Aug 2006
Posts: 77
OK, now I have tried every possible variation I could come up with.
The input clip is a full-screen PAL capture in YUY2, which works flawlessly single-threaded and with up to #cores parallel processes.

Using this simple Script:
Code:
setMTmode(5,0)

LoadPlugin("d:\avs game\filter\mvtools192.dll")

AVISource("testclipYUY2.avi")
#FFmpegSource("testclipYUY2.avi")
#DirectShowSource("testclipYUY2.avi", pixel_type="YUY2").converttoyuy2()
#MTsource("""AVISource("testclipYUY2.avi")""",delta=1,threads=1)

setMTmode(2)
source = last  # the clip loaded above; the posted script used "source" without defining it

idxref=30
ol=4
ts=320

backward_vec2 = source.MVAnalyse(isb = true, delta = 2, overlap=ol,idx=idxref)
backward_vec1 = source.MVAnalyse(isb = true, delta = 1, overlap=ol,idx=idxref)
forward_vec1 = source.MVAnalyse(isb = false, delta = 1, overlap=ol,idx=idxref)
forward_vec2 = source.MVAnalyse(isb = false, delta = 2, overlap=ol,idx=idxref)

source.MVDegrain2(backward_vec1,forward_vec1,backward_vec2,forward_vec2,thSAD=ts,idx=idxref)
AVISource alone always crashed after a couple of frames with an out-of-bounds memory access, no matter which codec was used (huffyuv-2.1.1, huffyuv decoded by ffdshow, Lagarith). Uncompressed YUY2 is a bit more stable, as is AVISource called through MTsource.
AssumeFPS and RequestLinear (TIVTC) can reduce the number of bad frames.
DirectShowSource didn't produce obvious errors, but it always converted to RGB.
FFmpegSource worked without any errors (output binary identical to single thread).
Generated data (BlankClip or loops) based on the AVISource worked without noticeable errors.
MT 0.6/0.7 and the included avisynth.dlls produced virtually identical results. The .4 version crashed a bit less than the other two.
The hardware should not be the cause: I tried on my Q9300 and on my Athlon X2 3800+, both with a fresh WinXP SP2(3) 32-bit install. Both worked without any errors single-threaded (and multi-process, with 2 or 4 clips in parallel). Even MT() in an otherwise single-threaded script didn't produce the corrupted frames or crashes (I didn't check for minor errors yet). With SetMTMode set to anything other than 0, 5 or 6 for the whole script, crashes and corrupted frames occurred. A higher number of processors and a higher CPU load seemed to increase the failures. One strange fact puzzled me for some time: the longer, more complex scripts where the errors first occurred were more stable in every respect, so making the script simpler usually increased the errors. Simple scripts like Avisource("clip1.avi").reduceby2() produce errors too. MVDegrain2 by itself just produces nearly 100% CPU load and works without errors with at least some data sources; MVDegrain2 from mvtoolsMTcomp2.zip or Fizick's builds shows similar errors. Adding load to the CPU with prime95 torture tests gives nearly identical results on less CPU-intensive scripts.

So am I the only one with an unusable AVISource in scripts with SetMTMode 1-4 somewhere in the script?
I guess a critical section in the GetFrame method of AVISource would prevent this. The corrupted frames, if they don't crash the whole script, seem to be shifted (with garbage on top or bottom) or to have garbage luma with "missing" chroma, and the luma seems to belong to a frame several frames away. So either they refer to frames (memory) already freed, or to buffers not yet filled with the decompressed input.
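
The critical-section idea can be sketched in Python as follows. This is only an illustration of the locking pattern, not AVISource's code: `SharedBufferSource` is a hypothetical decoder that reuses one internal buffer, and `SerializedSource` and `fetch_frames` are likewise made-up names.

```python
import threading


class SerializedSource:
    """Wraps a frame source that is not thread-safe so that only one
    thread at a time can be inside get_frame()."""

    def __init__(self, source):
        self._source = source
        self._lock = threading.Lock()  # plays the role of the critical section

    def get_frame(self, n):
        with self._lock:
            return self._source.get_frame(n)


class SharedBufferSource:
    """Hypothetical decoder that reuses one internal buffer, so unguarded
    concurrent calls could hand back a half-written frame."""

    def __init__(self):
        self._buffer = [0] * 8

    def get_frame(self, n):
        for i in range(len(self._buffer)):
            self._buffer[i] = n  # "decode" frame n into the shared buffer
        return list(self._buffer)  # copy out while still holding the lock


def fetch_frames(source, frame_numbers, workers=4):
    """Request frames from several threads and collect them by number."""
    results = {}

    def work(chunk):
        for n in chunk:
            results[n] = source.get_frame(n)

    threads = [threading.Thread(target=work, args=(frame_numbers[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The cost of this design is that the source itself no longer runs in parallel, which is roughly what loading the source under SetMTMode(5) is meant to achieve in the scripts in this thread.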
__________________
GA-P35-DS3R, Core2Quad Q9300@3GHz, 4.0GB/800 MHz DDR2, 2x250GB SATA HD, Geforce 6800

Last edited by TSchniede; 2nd June 2008 at 08:31. Reason: copied in the wrong version of the script.
Old 3rd June 2008, 12:49   #884  |  Link
Graigddu
Registered User
 
Join Date: Nov 2007
Posts: 48
Can I ask whether the MT filter could be used to speed up a noise filter like FluxSmooth in a tools GUI such as AVS2DVD or FAVC while encoding with HCEnc, and how I would go about setting that up in a script?
Sorry, but I'm a newbie when it comes to AviSynth and haven't had much luck with it.

All help gratefully received.
Old 4th June 2008, 16:12   #885  |  Link
TSchniede
Registered User
 
Join Date: Aug 2006
Posts: 77
(Spatio-)temporal filters usually profit more from MT() than from SetMTMode. Which one is better, or which one works (if any at all), has to be tested. First make sure your system can profit from multithreading (at least HT or dual core). Then check that the filter works with MT() or SetMTMode() (mode 2 or 3 for most filters); for example, masktools-v2.0 works with MT() and SetMTMode(2).

If both your system and the filter work, then a script like the following would be what you seek:
Code:
#loadplugin("mt.dll") # or use the autoload feature (see installation on the first page)
Avisource("clip.avi")
MT("FluxSmoothT(7)")
#or MT("FluxSmoothST(7,7)",overlap=2) or use MTi on interlaced sources
Code:
SetMTMode(5)
#load your input here - for example FFMpegsource("clip.wmv")
SetMTMode(4) # 2 or 3 faster
FluxSmoothT(7)
Sorry, I don't know if fluxsmooth() works and my encoding rig is busy today.
Old 4th June 2008, 16:43   #886  |  Link
Graigddu
Registered User
 
Join Date: Nov 2007
Posts: 48
Thanks for the scripts, TSchniede. Yes, my computer is a dual core, so I'll have to give your scripts a try and see which works best with FluxSmooth, if any.

Would this script still work if other AviSynth filters such as resize or crop were run without MT first?
Old 5th June 2008, 12:06   #887  |  Link
TSchniede
Registered User
 
Join Date: Aug 2006
Posts: 77
Of course. If the simple script works (failures can be outright crashes, but most of the time they are only corrupted frames or "minor" pixel errors), other things can be added between the source filter and FluxSmooth. In the best case the output is identical to single-threaded operation.

Crop or resize can be used freely multithreaded with SetMTMode(2), but not with MT().

You could even use SetMTMode for everything except FluxSmooth.

If you use SetMTMode anywhere, then the first line has to be SetMTMode(5) or SetMTMode(5,2). Be careful with AVISource and SetMTMode!

This script should work:
Code:
SetMTMode(5)
FFMpegSource("input.clip")
SetMTMode(2)
crop(8,8,-8,-8).BilinearResize(1024,1024)
MT("FluxSmoothST()") #implicit change to mode 5 and back to 2
crop(16,16,-16,-16)
I don't know if you can really benefit from multithreading here, as all the filters involved are quite fast. A quick test over 10000 frames didn't produce errors, but at a CPU load below 50% that doesn't say much.
Old 5th June 2008, 12:22   #888  |  Link
Graigddu
Registered User
 
Join Date: Nov 2007
Posts: 48
I think I follow you, TSchniede.

The script I had was this:

# 4:3 encoding
AviSource("C:\Test Movies\test\test.avi", false)
ConvertToYUY2()
FadeIn(50)
Crop(0,16,720,544)
Lanczos4Resize(720,576,0.0,0.6)
loadplugin("mt.dll")
MT("FluxSmoothST(7,7)",overlap=2)

Would this actually work, or would SetMTMode for everything be a better option?

I'm really grateful for your help, as I'm totally lost when it comes to AviSynth scripts.
Old 5th June 2008, 13:44   #889  |  Link
TSchniede
Registered User
 
Join Date: Aug 2006
Posts: 77
If the script without MT() does what you intend, that should be correct.
It should even be faster if you use this:
Code:
SetMTMode(5)
# 4:3 encoding
loadplugin("mt.dll") # should be done outside of parallel region
AviSource("C:\Test Movies\test\test.avi", false)
SetMTMode(2)
ConvertToYUY2()
FadeIn(50)
Crop(0,16,720,544)
Lanczos4Resize(720,576,0.0,0.6)
#SetMTMode(5) #implicit done anyway on current version
MT("FluxSmoothST(7,7)",overlap=2)
One last hint: I can't use AVISource with SetMTMode (see post http://forum.doom9.org/showthread.ph...25#post1145725); the script without SetMTMode should work though.
Is the input clip RGB? Otherwise ConvertToYUY2() makes no sense; keeping it YV12 would be faster.
Old 6th June 2008, 11:41   #890  |  Link
Graigddu
Registered User
 
Join Date: Nov 2007
Posts: 48
Thanks for the help, TSchniede, much appreciated.

When I checked the GUI's AviSynth script, the source was actually loaded with DirectShowSource, so I ran this script and it worked:

Quote:
SetMTMode(5)
# 16:9 encoding
loadplugin("C:\Program Files\Avisynth 2.5\Plugins\MT.dll") # should be done outside of parallel region
DirectShowSource("C:\Test Movies\test\test.avi", false)
SetMTMode(2)
ConvertToYV12()
FadeIn(50)
Crop(0,16,720,544)
Lanczos4Resize(720,576,0.0,0.6)
#SetMTMode(5) #implicit done anyway on current version
MT("FluxSmoothST(12,12)",overlap=2)
It seems to work fine. I used a setting of 12 for both temporal and spatial in FluxSmooth, as the file was blocky in parts; this produced a 1st pass encode of around 58 mins, the normal time without filtering being about 30 mins. Sadly I was unable to see what the 2nd pass would have been, as I had to leave for work.
Old 7th June 2008, 09:49   #891  |  Link
buletti
Registered User
 
Join Date: Jun 2007
Posts: 42
AviSynth 2.5.8 support

Hi there,
Can we expect MT support for AviSynth 2.5.8 RC1? In my experience, AVS alphas and RCs stay around for quite some time, so even a pre-release update might be worth the effort...
Old 18th June 2008, 17:49   #892  |  Link
leeperry
Kid for Today
 
Join Date: Aug 2004
Posts: 3,477
Quote:
Originally Posted by buletti View Post
Hi there,
Can we expect MT support for AviSynth 2.5.8 RC1? In my experience, AVS alphas and RCs stay around for quite some time, so even a pre-release update might be worth the effort...
+1

This RC1 is mod2 for YV12 resizing instead of mod4.

Very useful for SD content with LSF.

Any chance of seeing an MT patch, please?
It's too slow for real-time use in ffdshow otherwise.
Old 10th July 2008, 18:34   #893  |  Link
Warpman
Registered User
 
Join Date: Oct 2005
Posts: 131
First: thank you for your great work, tsp.
Filters that were supposed to be dead slow are now fast as hell.

However, I encountered one strange behavior:
if I drag & drop a script (even version() will do) into MeGUI, it works 2-3 times and then stops working... If I do the same with unmodified AviSynth, it always works.

Maybe you have a clue what's going on...
Old 11th July 2008, 16:48   #894  |  Link
superuser
Registered User
 
Join Date: Sep 2006
Posts: 84
Quote:
Originally Posted by leeperry View Post
+1

This RC1 is mod2 for YV12 resizing instead of mod4.

Very useful for SD content with LSF.

Any chance of seeing an MT patch, please?
It's too slow for real-time use in ffdshow otherwise.
+2.

Without MT, many of the filters are pretty slow with 2.5.8. Anxiously looking forward to a release that supports the latest AviSynth release.

Thanks
Old 11th July 2008, 23:26   #895  |  Link
Dreassica
Registered User
 
Join Date: May 2002
Posts: 384
Does SetMTMode work properly with a lot of FreezeFrames, DuplicateFrames and DeleteFrames as well as Trims?
I ask this because I'm encoding with SetMTMode, but I get memory access errors a few seconds into the encode, with a d2v as source.
Old 12th July 2008, 11:35   #896  |  Link
Zep
Registered User
 
Join Date: Jul 2002
Posts: 587
Quote:
Originally Posted by Dreassica View Post
Does SetMTMode work properly with a lot of FreezeFrames, DuplicateFrames and DeleteFrames as well as Trims?
I ask this because I'm encoding with SetMTMode, but I get memory access errors a few seconds into the encode, with a d2v as source.
Some filters cannot be used at all. Some need SetMTMode(5).
Using MT eats up memory fast and can cause errors. Trims work fine, but make sure you trim right after the source call, in mode 5.
Old 14th July 2008, 08:45   #897  |  Link
halsboss
likes to tinker
 
Join Date: Jan 2004
Location: girt by sea
Posts: 635
Just purchased an Intel Q9450 Duo Quad4 and am looking forward to thrashing the living daylights out of it

New to MT and SetMTmode ... I gather
  • MT - splits frames vertically (or horizontally if specified) for each thread to process
  • SetMTmode - each thread takes an alternate frame to process (can't locate what the different modes do) edit: yes I can, http://avisynth.org/mediawiki/MT_modes_explained
  • It's good to use OVERLAP=2 to 8 to ensure motion detection works in, say, MVtools and spatio-temporal filters like Convolution3D and fft3dfilter ?
and from TSP http://forum.doom9.org/showpost.php?...&postcount=522 that LimitedSharpenFaster works in MT..
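
The difference between the two approaches in the list above can be pictured with a small Python sketch. It is a deliberately simplified model of that description (plain round-robin for frames, equal bands without overlap for slices), not the plugin's real scheduling; both function names are made up for illustration.

```python
def frames_per_thread(frame_count, threads):
    """SetMTMode-style view: whole frames handed out in turn
    (simplified round-robin; an assumption, not the real scheduler)."""
    return [list(range(t, frame_count, threads)) for t in range(threads)]


def slices_per_frame(size, threads):
    """MT()-style view: one frame cut into `threads` bands along one axis.
    Returns (start, end_exclusive) pairs; overlap is ignored here."""
    bounds = [size * i // threads for i in range(threads + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    print(frames_per_thread(6, 2))   # which frames each thread would take
    print(slices_per_frame(576, 4))  # which rows each thread would take
```

In the first view every thread needs complete frames, so temporal filters see whole pictures; in the second, each thread only sees its own band, which is why overlap matters for spatial and motion-based filters.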

However I am not sure if "resizing to a final size" all in one step works, like
Code:
MT("LimitedSharpenFaster(smode=4, dest_x=704, dest_y=576)",overlap=8,threads=4)
since I recall seeing somewhere that this was the way to resize in MT (substitute your resizer) -
Code:
MT("spline36resize(704, last.height())",splitvertical=false,overlap=8,threads=4)
MT("spline36resize(last.width(), 576)",splitvertical=true,overlap=8,threads=4)
Any suggestions ?

Last edited by halsboss; 15th July 2008 at 08:15. Reason: fix syntax
Old 15th July 2008, 04:22   #898  |  Link
halsboss
likes to tinker
 
Join Date: Jan 2004
Location: girt by sea
Posts: 635
Did a quick search but couldn't spot how MT's "threads=N" divides up the source for input into each thread and deals with Width/Height issues such as Width and Height both not evenly divided by N.

Does anyone know, eg does MT "smartly" add borders before and remove afterward to avoid the issue, or do I have to do those calcs and Addborders myself ?

Also, if N>2, how is MT's "splitvertical" parameter interpreted? I guess it means something like "striped slices" into N vertical or horizontal segments.

For MT's N>2, and thinking about optimal motion detection - given generally horizontal panning for my sources - does "splitvertical=false" ensure nice wide horizontal striped slices going into each thread (I know it's the default) ?

On a quad core, is threads=3 optimal ? ie 3 for MT and one for HC; a 3800 frame test with Conv3D/LimitedSharpenFaster/spline36resize indicates a small elapsed time saving in N=4 over N=3 (only a few secs) whereas N=1 and N=2 and N=3 seem to have larger savings of 13 or 14 secs gaps.

Last edited by halsboss; 15th July 2008 at 04:55.
Old 15th July 2008, 07:03   #899  |  Link
halsboss
likes to tinker
 
Join Date: Jan 2004
Location: girt by sea
Posts: 635
difference in encode times ?

I'm not sure why the following dramatic changes occur with HC023 encode times when using a function to encapsulate most of the filters - any suggestions ?

MT_Nthreads=2 38.2 fps
MT_Nthreads=3 46.2 fps
MT_Nthreads=4 49.8 fps
from
Code:
SetMemoryMax(256) 
WIDTH=704 
HEIGHT=576 
MT_Nthreads=2
MT_overlap=4
AviSource("G:\test\test-24Mb.avi", audio=false) 
AssumeFPS(25) 
global resizeWidth = 0 
global resizeHeight = 0 
global resizeBorderHalfHeight = 0 
bh = wCalcResize(LAST, WIDTH, HEIGHT) 
ConvertToYUY2(interlaced=FALSE)  
MT("Convolution3D(0, 32, 128, 32, 128, 10, 0)",threads=MT_Nthreads,overlap=MT_overlap,splitvertical=false)
MT("LimitedSharpenFaster(smode=4)",threads=MT_Nthreads,overlap=MT_overlap,splitvertical=false)
MT("spline36resize(resizeWidth,last.height())",threads=MT_Nthreads,overlap=MT_overlap,splitvertical=false)
MT("spline36resize(last.width(),resizeHeight)",threads=MT_Nthreads,overlap=MT_overlap,splitvertical=true)
Addborders(0, resizeBorderHalfHeight, 0, resizeBorderHalfHeight) 
Converttoyv12() 
SetPlanarLegacyAlignment(True)
MT_Nthreads=2 48.5 fps
MT_Nthreads=3 56.8 fps
MT_Nthreads=4 59.7 fps
from
Code:
SetMemoryMax(256) 
WIDTH=704 
HEIGHT=576 
MT_Nthreads=2
MT_overlap=4
AviSource("G:\test\test-24Mb.avi", audio=false) 
AssumeFPS(25) 
global resizeWidth = 0 
global resizeHeight = 0 
global resizeBorderHalfHeight = 0 
bh = wCalcResize(LAST, WIDTH, HEIGHT) 
ConvertToYUY2(interlaced=FALSE)  
Function wMulti(clip "inpclp") {
 zclp=inpclp.Convolution3D(0, 32, 128, 32, 128, 10, 0)
 zclp=zclp.LimitedSharpenFaster(smode=4)
 zclp=zclp.spline36resize(resizeWidth,zclp.height())
 return zclp
}
MT("wMulti(LAST)",threads=MT_Nthreads,overlap=MT_overlap,splitvertical=false)
MT("spline36resize(last.width(),resizeHeight)",threads=MT_Nthreads,overlap=MT_overlap,splitvertical=true)
# unfortunately the height resize must be done separately since "splitvertical=true" is not the same as the rest
Addborders(0, resizeBorderHalfHeight, 0, resizeBorderHalfHeight) 
Converttoyv12() 
SetPlanarLegacyAlignment(True)
PS single-thread equivalent = 32.7 fps
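
For scale, the figures above can be turned into speedup and per-thread efficiency with a few lines of Python. The fps values are copied from this post; the helper name `speedup_table` is just an illustrative choice.

```python
SINGLE_THREAD_FPS = 32.7  # "single-thread equivalent" figure from the post

# fps by thread count, copied from the two timing lists above
separate_mt_calls = {2: 38.2, 3: 46.2, 4: 49.8}
wrapped_in_function = {2: 48.5, 3: 56.8, 4: 59.7}


def speedup_table(fps_by_threads, baseline=SINGLE_THREAD_FPS):
    """Return {threads: (speedup, efficiency)} relative to the baseline."""
    table = {}
    for threads, fps in sorted(fps_by_threads.items()):
        speedup = fps / baseline
        table[threads] = (speedup, speedup / threads)
    return table


if __name__ == "__main__":
    for label, data in (("separate MT() calls", separate_mt_calls),
                        ("wMulti() wrapper", wrapped_in_function)):
        print(label)
        for threads, (speedup, eff) in speedup_table(data).items():
            print(f"  threads={threads}: {speedup:.2f}x, {eff:.0%} per-thread efficiency")
```

At four threads this works out to roughly 1.5x for the separate MT() calls and roughly 1.8x for the wrapped version, and in both cases the gain per added thread shrinks as the thread count goes up.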

Last edited by halsboss; 15th July 2008 at 13:19. Reason: Edit1 typo. Edit2 it might be HC needing Distributor() ?
Old 16th July 2008, 22:35   #900  |  Link
foxyshadis
ангел смерти
 
Join Date: Nov 2004
Location: Lost
Posts: 9,558
MT("wMulti(LAST)",threads=MT_Nthreads,overlap=MT_overlap,splitvertical=false)
is equivalent to
MT("Convolution3D(0, 32, 128, 32, 128, 10, 0)
LimitedSharpenFaster(smode=4)
spline36resize(resizeWidth,last.height())",threads=MT_Nthreads,overlap=MT_overlap,splitvertical=false)
not three separate MTs. The perf drop is from all the back and forth copies being made when you split up the filters.

Quote:
Did a quick search but couldn't spot how MT's "threads=N" divides up the source for input into each thread and deals with Width/Height issues such as Width and Height both not evenly divided by N.
It rounds to the nearest mod 2. One being a couple of lines shorter than another isn't really a noticeable perf difference.
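
One possible reading of that rounding, sketched in Python: place each cut at the exact fraction of the frame and snap it to the nearest even position. This is an assumption about what "rounds to the nearest mod 2" means, not MT's source code, and `split_mod2` is a hypothetical name.

```python
def split_mod2(size, threads):
    """Cut `size` rows (or columns) into `threads` slices whose interior
    boundaries are rounded to the nearest even number.
    Returns the list of slice sizes."""
    if threads < 1 or size < 2 * threads:
        raise ValueError("need at least 2 rows per slice")
    bounds = [0]
    for i in range(1, threads):
        exact = size * i / threads
        bounds.append(2 * round(exact / 2))  # nearest even boundary
    bounds.append(size)
    return [b - a for a, b in zip(bounds[:-1], bounds[1:])]


if __name__ == "__main__":
    print(split_mod2(576, 3))
    print(split_mod2(576, 5))
```

With this scheme the slices differ by at most a couple of lines, which matches the remark that the imbalance is not a noticeable performance difference.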
Quote:
Does anyone know, eg does MT "smartly" add borders before and remove afterward to avoid the issue, or do I have to do those calcs and Addborders myself ?
This is more like what overlap=X does. You're right about it only being needed for spatial filters and motion-compensated filters. Small amounts are enough for most filters, but a large overlap is needed for mvtools.
Quote:
On a quad core, is threads=3 optimal ?
Optimal is pretty much whatever works best in your workflow. For HC with your combination of filters, 3 is the way to go; with much heavier filtering or another encoder, 4 or 2 can be much better. With multiple instances of HC (i.e., HCEnc^N), fewer may be optimal.