View Full Version : Decoding multiple threads?


zerowalker
27th March 2015, 02:09
Just wondering if it's possible to decode a video on more than one thread. I don't mean the hackish ways like MT mode etc.
Decoders normally support multiple threads nowadays, but with AviSynth decoding seems to be locked to 1.

I want this because decoding is the bottleneck when I am encoding: it is the slowest step, so I would like to make it faster. (The good thing is that I can spend more time on encoding quality, I guess.)

I am talking about AviSynth 2.6.0 RC1.

Many Thanks

Mounir
27th March 2015, 04:04
decoding or encoding ?
for decoding there is DGDecNV (not free) see here: http://neuron2.net/dgdecnv/dgdecnv.html

zerowalker
27th March 2015, 05:00
Decoding, as it says;P

Should have mentioned, Codecs are MagicYUV, Lagarith etc, those types.

colours
27th March 2015, 07:12
Why would Avisynth have any control over how many threads a source filter wants to use?

Where do your lossless video files come from?

Have you tried FFV1, UtVideo, lossless H.264, etc.?

hello_hello
27th March 2015, 07:57
How did you determine Avisynth is "locked" to a single decoding thread? ffms2 doesn't specify a need for MT Avisynth to use more than one thread for decoding. At least no mention of it I could see.

https://ffmpegsource.googlecode.com/svn/trunk/doc/ffms2-avisynth.html
int threads = -1
The number of decoding threads to request from libavcodec. Setting it to less than or equal to zero means it defaults to the number of logical CPU's reported by Windows. Note that this setting might be completely ignored by libavcodec under a number of conditions; most commonly because a lot of decoders actually do not support multithreading.
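
As a rough illustration of the parameter quoted above, a script along these lines would ask ffms2 for four decoding threads. The plugin path and file name are placeholders and not taken from this thread, and as the documentation says, the decoder may simply ignore the request.

```avisynth
# Hypothetical paths; adjust them to your own setup.
LoadPlugin("C:\plugins\ffms2.dll")

# threads=4 requests four decoding threads from libavcodec.
# The documented default (-1) uses the number of logical CPUs.
FFVideoSource("C:\video\capture.avi", threads=4)
```

Setting threads=1 instead forces single-threaded decoding, which makes for an easy comparison.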

zerowalker
27th March 2015, 16:34
My only clue that it's single-threaded comes from using AviSource with a lossless codec (pretty sure I have tried UtVideo with the same results): CPU usage always maxes out at 25% on 4 cores.
So that's why I assume it's 1 thread. So AviSource is probably the culprit, I guess?
I haven't tried DirectShow, but I've had some bad experiences with it not being frame-accurate at times, etc.

Groucho2004
27th March 2015, 16:48
> My only clue to single thread is using Avisource with a lossless codec (pretty sure i have tried UtVideo and had same results). And it's always 25% Max (4 cores).
> So that's why my assumption is 1 thread. So it's probably the Avisource that's the culprit i guess?
Avisource with an UTVideo encoded file utilizes all the cores of my i5-2500K (~90% usage).

poisondeathray
27th March 2015, 16:49
How is decoding a bottleneck? What resolution are you using? You should get hundreds of fps with UT Video at 1080p on a typical computer.

creaothceann
27th March 2015, 19:02
> Haven't tried DirectShow, but had some bad experiences with it at times being non-frame-accurate etc.

Use DSS2.

zerowalker
27th March 2015, 20:53
hmm wait it might be avs4x264 that's the bottleneck here.

(I am doing simple 2D encodings, so it's super fast, pixelated low res stuff).
It's not a huge bottleneck, but yeah, it's fairly close with the x264 Medium preset.

hello_hello
27th March 2015, 21:06
> hmm wait it might be avs4x264 that's the bottleneck here.
>
> (I am doing simple 2D encodings, so it's super fast, pixelated low res stuff).

I've seen a couple of "why is MeGUI so slow?" type threads recently. Maybe at VideoHelp rather than doom9, but I did seriously wonder if avs4x264 was the cause.

I use XP so I can't test it myself, but if you uncheck the 64 bit x264 encoder setting in MeGUI's options I believe MeGUI will stop using avs4x264 (it's only needed for "32 bit Avisynth" -> "64 bit x264", the way I understand it).
And I guess I just made an assumption you're actually using MeGUI..... sorry.

I only let ffms2 use a single thread for decoding and it doesn't seem to slow things down much. CPU usage still sits between 90% and 100% most of the time (4 cores) unless I'm using slow Avisynth filters.

Sparktank
27th March 2015, 22:42
Taro_06 made an updated version of avs4x264, now called avs4x26x (as it supports x265).
http://tmod.nmm-hd.org/avs4x26x/

In case anyone wants to try that instead of the old avs4x264.

zerowalker
28th March 2015, 06:44
I am not using MeGUI but my own home-made "MeGUI". It's basically the same, just super simple, so it uses avs4x264 to pipe to x264, as I must use the 32-bit version of AviSynth (I could use 64-bit, but sadly I need the new AviSource track support).
I tried the new version, avs4x26x: same results.

It's not really a deal breaker. But I guess if it were possible to use 64-bit it might go faster, as 64-bit decoders tend to be better.

colours
28th March 2015, 06:54
Please post your full script and command line if you want any of us to help you rather than take stabs in the dark.

zerowalker
28th March 2015, 07:40
Not really much of a script, but here it is.

Remember, it's not that it's Slow. It's that it's limited to 1 Thread that bothers me a bit.

a=Avisource("Z:\VisualBoyAdvance-M_2015_03_26_14_02_00_585.avi",atrack=1).GetChannels(1,1).SyncAudio( 0.999979)
b=Avisource("Z:\VisualBoyAdvance-M_2015_03_26_14_02_00_585.avi",atrack=0).SyncAudio( 0.999979)
#~ a.MixAudio(b, clip1_factor=1, clip2_factor=0.04)
a
PointResize(800, 720, src_left=0.0, src_top=0.0, src_width=0.0, src_height=0.0)
ConvertToYV24(matrix="Rec709", interlaced=false)

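# Resample the audio by the factor "sync" while keeping the nominal sample rate,
# which scales the audio duration by "sync".
function SyncAudio(clip c, float sync)
{
    c
    ConvertAudioToFloat()
    AR = AudioRate()
    TR = Float(AR) * sync    # intermediate rate for the resample
    ResampleAudio(ContinuedNumerator(TR), ContinuedDenominator(TR))
    AssumeSampleRate(AR)     # relabel back to the original rate
}

# Stretch the audio so that its duration matches the video duration.
function SyncAudioToVideo(clip c)
{
    c
    ConvertAudioToFloat()
    AR = AudioRate()
    # AR * (video duration / audio duration)
    TR = Float(AR) * AR * FrameCount() / (AudioLengthF() * FrameRate())
    ResampleAudio(ContinuedNumerator(TR), ContinuedDenominator(TR))
    AssumeSampleRate(AR)
}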

Groucho2004
28th March 2015, 09:04
> It's that it's limited to 1 Thread that bothers me a bit.
How do you determine that?

I suggest the following:
1. Remove all audio processing from the script and run it through AVSMeter:
Avisource("Z:\VisualBoyAdvance-M_2015_03_26_14_02_00_585.avi")


2. Use the full script and also run it through AVSMeter.

Use the "-log" switch for AVSMeter and post the log files. If it's a very long clip you can limit the frame range with "-range=x,x".
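
An intermediate test between the two above might also help: keep the video filters but skip the audio entirely, so the cost of the video chain can be separated from the audio resampling. This is only a sketch; it assumes AviSource's audio parameter is used to ignore the audio stream, which is not part of the suggestion above.

```avisynth
# audio=false tells AviSource not to open the audio stream at all.
AviSource("Z:\VisualBoyAdvance-M_2015_03_26_14_02_00_585.avi", audio=false)

# Video chain taken from the script posted earlier in this thread.
PointResize(800, 720)
ConvertToYV24(matrix="Rec709", interlaced=false)
```

If this runs about as fast as the full script, the audio functions are not what is costing the time.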

Pat357
28th March 2015, 12:01
Unless you have used a very old codec to create those AVIs, the decoding should support more cores.

What codec is used to decode the AVI? What's the resolution?

zerowalker
28th March 2015, 22:13
It seems to be super fast with decoding only (even though it doesn't seem to use even 50% CPU, it runs at around 1000 fps).
The HDD didn't bottleneck, and at that speed I don't really care whether it's 1 thread or not, so decoding isn't the issue, I guess.

[General info]
Log file created with: AVSMeter 1.9.8.0 (x86)
Avisynth version: AviSynth 2.60, build:Jan 14 2015 [09:58:31] (2.6.0.5)


[Clip info]
Number of frames: 249948
Length (hh:mm:ss.ms): 01:09:25.800
Frame width: 640
Frame height: 576
Framerate: 60.000 (60/1)
Colorspace: RGB32
Audio channels: 2
Audio bits/sample: 2
Audio sample rate: 48000
Audio samples: 199948000


[Runtime info]
Frames processed: 10001 (0 - 10000)
FPS (min | max | average): 411.6 | 2063 | 1149
Memory usage (phys | virt): 33 | 29 MB
Thread count: 4
CPU usage (average): 31%
Time (elapsed): 00:00:08.706


[Script]
Avisource("Z:\VisualBoyAdvance-M_2015_03_27_16_36_24_407.avi")

[Performance data]
Frame interval  Frames/sec  Time/frame(ms)  CPU(%)  Threads  PhysMEM(MB)  VirtMEM(MB)
0-19              1414.005          0.7072       0        4           31           27
20-39             1229.132          0.8136       0        4           31           27
40-59              454.435          2.2005       0        4           31           27
60-79             1129.949          0.8850       0        4           31           27
80-99             1394.569          0.7171      28        4           33           29
100-119           1505.480          0.6642      28        4           33           29
120-139           1386.771          0.7211      28        4           33           29
140-159           1492.680          0.6699      28        4           33           29
160-179           1250.698          0.7996      28        4           33           29
180-199           1144.013          0.8741      28        4           33           29
200-219           1121.031          0.8920      28        4           33           29
220-239           1134.955          0.8811      32        4           33           29
240-259           1128.806          0.8859      32        4           33           29
260-279           1324.400          0.7551      32        4           33           29
280-299           1215.761          0.8225      32        4           33           29
300-319           1235.058          0.8097      32        4           33           29
320-339           1355.985          0.7375      32        4           33           29
340-359           1587.015          0.6301      32        4           33           29
360-379           1121.916          0.8913      25        4           33           29
380-399            966.981          1.0341      25        4           33           29
400-419            936.753          1.0675      25        4           33           29
420-439            911.505          1.0971      25        4           33           29
440-459           1005.669          0.9944      25        4           33           29
460-479           1306.263          0.7655      25        4           33           29
480-499           1157.317          0.8641      25        4           33           29
500-519            999.020          1.0010      25        4           33           29
520-539            706.702          1.4150      25        4           33           29
540-559           1003.679          0.9963      25        4           33           29
560-579            988.415          1.0117      25        4           33           29
580-599            957.875          1.0440      32        4           33           29
600-619           1236.035          0.8090      32        4           33           29
620-639           1524.458          0.6560      32        4           33           29
640-659           1043.036          0.9587      32        4           33           29
660-679            813.758          1.2289      32        4           33           29
680-699            863.593          1.1580      32        4           33           29
700-719            867.352          1.1529      31        4           33           29
720-739            411.591          2.4296      31        4           33           29
740-759            842.799          1.1865      31        4           33           29
760-779            864.128          1.1572      31        4           33           29
780-799            871.065          1.1480      21        4           33           29
800-819            867.410          1.1529      21        4           33           29


The codec is Lagarith in this case. (I did use MagicYUV first, but as the size is so small for 2D I see no benefit.)

EDIT: I checked, and PointResize really does cause quite a performance hit. I didn't think it would be that large; that explains it.

Groucho2004
28th March 2015, 22:44
> Checked, and PointResize really does have quite a performance hit, didn't think it would be that great, that explains it.
Really, pointresize? That should be very fast. What CPU is this running on?

zerowalker
28th March 2015, 23:38
> Really, pointresize? That should be very fast. What CPU is this running on?

FPS (min | max | average): 53.72 | 387.5 | 316.5
That's with PointResize.

I have an i5 760 @ 4 GHz.

Though this is easily solvable, as I can just record at the resolution I resize to.
I only used PointResize because I thought it would be faster than decoding a higher resolution.

Groucho2004
29th March 2015, 10:30
> FPS (min | max | average): 53.72 | 387.5 | 316.5
> That's with PointResize.
>
> I got i5 760 @4ghz
>
> Though this is easily solvable as i can just record in the resolution i resize to.
> I only used PointResize is i thought it would be faster compared to a higher resolution decoding.
The question is: why would you need more than 300 fps for frame serving? I assume that you feed the script to an encoder, which most likely processes the frames an order of magnitude slower. Also, the CPU cycles you are not using for frame serving can be put to use for encoding.

zerowalker
29th March 2015, 10:43
> The question is - why would you need more than 300 fps for frame serving? I assume that you feed the script to an encoder which most likely processes the frames at an order of magnitude slower. Also, the CPU cycles you are not using for frame serving can be put to use for encoding.

Basically, if I encode with x264 Medium, I still have some CPU left that could be used.

Then again, it doesn't make sense, because I never get above 160 fps (this was without PointResize, with my new 800x720 test). And if I use Very Fast, for example, the speed is the same.

Basically it's not that much of a problem. I would just like it to be faster; as I encode lossless with x264, the preset isn't really much of a concern.

(Oh right, I do convert to YV24 in AviSynth, as I can't get RGB input to work in x264; otherwise I would do the RGB -> I444 conversion there.)

Desbreko
29th March 2015, 14:39
Move PointResize to after ConvertToYV24 and your script will run much faster. AviSynth's resizers are way slower on RGB than on YUV for some reason.
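
A minimal sketch of that reordering, based on the video part of the script posted earlier (audio handling left out). Whether it is actually faster on a given machine is something to measure, for example with AVSMeter; nothing here has been benchmarked.

```avisynth
AviSource("Z:\VisualBoyAdvance-M_2015_03_26_14_02_00_585.avi")

# Convert first, so the resizer works on planar YUV instead of RGB32.
ConvertToYV24(matrix="Rec709", interlaced=false)

# The src_* arguments from the original call are omitted; they were all 0,
# which is the default.
PointResize(800, 720)
```

Since YV24 keeps chroma at full resolution, the output should look the same as resizing before the conversion.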

zerowalker
29th March 2015, 14:41
Well, I don't use PointResize anymore, as I can simply record at that resolution to begin with. I thought decoding would be slower than the resize, but I was wrong.
So the only thing I do is convert to YV24. So there's not really much that can be done to improve the speed ;P

Sparktank
29th March 2015, 14:51
Have you tried using ThreadRequest?
I've found it to be just as useful as SetMT.
http://forum.doom9.org/showthread.php?p=1404556#post1404556
downloads in first post are dead.
download link alive in this post: http://forum.doom9.org/showthread.php?p=1708643#post1708643

Though it requires some finesse and tedious experimenting at first.
It's a very short thread (2 pages) and the results have always been variable.
There doesn't seem to be a consistent analysis on the plugin and how it would work best for certain operations.

I usually do video and audio separate.
ConvertToYV24(matrix="Rec709", interlaced=false).ThreadRequest()

You could try it on the source file, too.
Avisource("Z:\VisualBoyAdvance-M_2015_03_27_16_36_24_407.avi").ThreadRequest()
ConvertToYV24(matrix="Rec709", interlaced=false).ThreadRequest()

I find adding ThreadRequest often uses up all 4 cores and goes anywhere from 97% to 100% for the whole encoding process.

zerowalker
30th March 2015, 11:39
I would prefer not to use that stuff, as it's pretty hackish most of the time.
But will do some testing later just in case.