Pat357
21st March 2015, 23:58
I noticed that when using lots of threads (12 or so) in VS, the output is no longer identical to the output produced with fewer threads.
I used:
vspipe -y -p -e 999 TDeintMod.vpy 1000frames_thr01_a.yuv
to output 1000 frames, and repeated this for different settings of the number of threads.
Here are the results:
658f6e7349f83ab0643596583ef464c6 *1000frames_thr01_a.yuv
658f6e7349f83ab0643596583ef464c6 *1000frames_thr02_a.yuv
658f6e7349f83ab0643596583ef464c6 *1000frames_thr04_a.yuv
658f6e7349f83ab0643596583ef464c6 *1000frames_thr06_a.yuv
8146826125ccc3c019d76b88707aff40 *1000frames_thr12_a.yuv
3c4745721c519a27e6a8b52e5950de82 *1000frames_thr12_b.yuv
658f6e7349f83ab0643596583ef464c6 *1000frames_thr12_c.yuv
3c4745721c519a27e6a8b52e5950de82 *1000frames_thr12_d.yuv
3ba74da4a1fd324a28ff5920b0849dd9 *1000frames_thr12_e.yuv
658f6e7349f83ab0643596583ef464c6 *1000frames_thr6_a.yuv
658f6e7349f83ab0643596583ef464c6 *1000frames_thr6_ffms220_2.yuv
The files named thr(x)_a were created with (x) threads configured in the script.
Notice that for threads=12 we have 5 results: one is correct (1000frames_thr12_c.yuv), and two are identical to each other (1000frames_thr12_b.yuv and 1000frames_thr12_d.yuv) but differ from the results for threads=1 to 6.
This should rule out random causes such as corruption from bad memory or a bad disk.
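For anyone who wants to repeat this comparison, here is a minimal sketch that hashes each output file and groups the file names by digest. It assumes the checksums above are MD5 (the 32-hex-digit format suggests so) and that the outputs sit in the current directory; the function names are made up for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(paths):
    """Map each digest to the sorted list of file names that produced it."""
    groups = defaultdict(list)
    for path in paths:
        groups[md5_of_file(path)].append(Path(path).name)
    return {digest: sorted(names) for digest, names in groups.items()}


if __name__ == "__main__":
    # Assumption: the vspipe outputs are in the current directory.
    outputs = sorted(Path(".").glob("1000frames_thr*.yuv"))
    for digest, names in group_by_digest(outputs).items():
        print(digest, len(names), ", ".join(names))
```

If every run were deterministic, this would print a single digest; more than one group means at least one run produced different output.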
I was very surprised by the differences in the output for 12 threads: the output is no longer always identical! :confused:
I normally don't use more than 8 threads on this system (4 cores + 4 HT) because it won't speed things up (more than 8 threads can even slow the process down).
I've done similar tests in the past but never encountered this issue: maybe it's specific to this script/clip combination?
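To narrow down whether specific frames are affected, a good and a bad output could be compared record by record. The sketch below is only an illustration: the function name is made up, and frame_bytes and header_bytes are assumptions that have to match the actual output (for 8-bit 4:2:0 a raw frame is width*height*3/2 bytes, and the -y option adds Y4M headers that also need to be accounted for).

```python
def differing_frames(path_a, path_b, frame_bytes, header_bytes=0):
    """Return the indices of fixed-size records that differ between two files.

    frame_bytes  -- size of one frame record in bytes (assumption, adjust to the clip)
    header_bytes -- bytes to skip at the start of both files (stream header)
    """
    different = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        a.seek(header_bytes)
        b.seek(header_bytes)
        index = 0
        while True:
            rec_a = a.read(frame_bytes)
            rec_b = b.read(frame_bytes)
            if not rec_a and not rec_b:
                break  # both files exhausted
            if rec_a != rec_b:
                different.append(index)
            index += 1
    return different
```

Running this on a thr06 output against each of the thr12 outputs would show whether the same frames differ every time or whether the affected frames move around between runs.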
Memory consumption is not an issue as even with 12 threads it's well below the limit.
The script I use is straight from the docs:
import vapoursynth as vs
import functools
core = vs.get_core(threads=6, accept_lowercase = True)
core.std.LoadPlugin(path=r"j:\Programs\ffms2-r940c64-avs_vsp\ffms2.dll")
core.std.LoadPlugin(path=r"c:\Program Files (x86)\VapourSynth\filters\vapoursynth-nnedi3-v4.0-win32\libnnedi3.dll")
core.std.LoadPlugin(r"c:\Program Files (x86)\VapourSynth\filters\TDeintMod-r5\Win32\TDeintMod.dll")
clip = core.ffms2.Source(r"j:\film\new_2015-02\WhiteQueen_Sample.h264", fpsnum=30000, fpsden=1001, threads=1)
def conditionalDeint(n, f, orig, deint):
    if f.props._Combed:
        return deint
    else:
        return orig
deint = core.tdm.TDeintMod(clip, order=1, field=1, mode=0, edeint=core.nnedi3.nnedi3(clip, field=1)) #order=1, field=1, mode=1
combProps = core.tdm.IsCombed(clip)
clip = core.std.FrameEval(clip, functools.partial(conditionalDeint, orig=clip, deint=deint), combProps)
clip.set_output()
System & environment:
Input clip is 1080i30 (fps=30000/1001), interlaced TFF, containing 1360 frames.
VapourSynth r26 32-bit
Windows 7 Pro X64
Core i7-4770 CPU with 16 GB RAM
System is not OC'd