Old 19th December 2008, 21:11   #5761  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
Quote:
Originally Posted by clsid View Post
@LoRd_MuldeR,
I never said that there won't be any speedup with >1 threads. I just said that testing with 1 thread gives a better picture that isn't clouded by the effect of a crappy MT implementation.
Okay. I will do another test later and I will explicitly enforce one single thread. Right now a capture is in progress...
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊
Old 19th December 2008, 21:12   #5762  |  Link
fastplayer
Registered User
 
Join Date: Nov 2006
Posts: 799
Quote:
Originally Posted by LoRd_MuldeR View Post
If that was the case, then multiple threads would run slower than one thread (in the non-MT version).
I think you misunderstood me. In this particular situation, the performance gains achieved by the FFmpeg guys are not enough to overcome the cost associated with spawning a 2nd, 3rd or 4th thread. Either that or the changes they made are just less SMP-friendly...
Old 19th December 2008, 21:22   #5763  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
Quote:
Originally Posted by fastplayer View Post
I think you misunderstood me. In this particular situation, the performance gains achieved by the FFmpeg guys are not enough to overcome the cost associated with spawning a 2nd, 3rd or 4th thread. Either that or the changes they made are just less SMP-friendly...
Sure. But if two threads already run faster than one thread, which is the case on my system (even with the non-MT version), then I can hardly imagine how further optimizations could make the "one thread" variant run faster while there is no noticeable speed-up for the "two threads" variant. I'd rather assume that these optimizations don't help my Core2 as much as other CPUs. But as I said before, I will do more tests later.

Last edited by LoRd_MuldeR; 19th December 2008 at 21:29.
Old 19th December 2008, 21:27   #5764  |  Link
fastplayer
Registered User
 
Join Date: Nov 2006
Posts: 799
Quote:
Originally Posted by clsid View Post
Generic and ICL10 builds are online. GCC build is no longer officially supported.
You mean GCC is not used anymore for compiling ffdshow.ax, correct?
Old 19th December 2008, 21:31   #5765  |  Link
yesgrey
Registered User
 
Join Date: Sep 2004
Posts: 1,295
Quote:
Originally Posted by leeperry View Post
when yesgrey3 built ICL10.1 versions of the Reclock resampler, some of them were much faster than others...I could ask him for the best settings.
The best settings are always application dependent...
For the resampler the best settings were to not use any Intel extensions...
Maybe with the new code the result is different, but I don't believe it.
Old 19th December 2008, 21:40   #5766  |  Link
tetsuo55
MPC-HC Project Manager
 
Join Date: Mar 2007
Posts: 2,317
Quote:
Originally Posted by clsid View Post
A few people have reported an issue with the x64 builds of ffdshow where ffdshow uses an unusually high amount of CPU cycles, resulting in bad playback (stuttering, frame drops). This issue has been present for a long time now, so it is not related to any recent changes.

Weird thing is that the CPU usage returns to normal when the OSD is enabled in ffdshow video decoder.

Does anyone have an idea what might cause this problem, and why the OSD makes a difference?

I myself am unable to reproduce the issue on a clean install of Vista x64.
Just based on the description, this seems to be the inverse of the bug I reported earlier: with the OSD enabled, ffdshow shows 100% CPU when combined with auto post-processing.

I think the OSD is broken in more ways than one, resulting in unexpected behaviour in several parts of ffdshow.
In the case you mention it helps; in the case I mentioned it hurts.

IMHO it should get a higher priority than it has at the moment.
Old 19th December 2008, 21:46   #5767  |  Link
fastplayer
Registered User
 
Join Date: Nov 2006
Posts: 799
Quote:
Originally Posted by LoRd_MuldeR View Post
I'd rather assume that these optimizations don't help my Core2 as much as other CPUs.
M. Niedermayer made quite a few H.264-related commits in the past few days and he's running a Merom CPU (castrated C2D):
http://lists.mplayerhq.hu/pipermail/...er/018415.html
Old 19th December 2008, 22:00   #5768  |  Link
tetsuo55
MPC-HC Project Manager
 
Join Date: Mar 2007
Posts: 2,317
Yeah, all those H.264 commits look great; all those 0.? speedups have to combine into a nice ?.? somewhere.

According to the changelog, the speedups were measured on a Pentium Dual. Also, a lot of unneeded checks and calculations were dropped.

This means that, on a per-sample and per-system basis, the increase in FPS can be pretty high (probably never more than 10%, though).

There also seems to be more work done on RealMedia 30 and 40; hopefully there will be fewer problems with it now, and I hope it will soon be able to fully replace RealPlayer.
Old 19th December 2008, 23:16   #5769  |  Link
Snowknight26
Registered User
 
Join Date: Aug 2007
Posts: 1,430
Quote:
Originally Posted by haruhiko_yamagata View Post
As for the interlacing flags, are you sure that the stream has correct flags?
I don't know how to find that out, but here is a sample of where it changes from progressive to interlaced and vice versa.
Old 20th December 2008, 00:10   #5770  |  Link
STaRGaZeR
4:2:0 hater
 
Join Date: Apr 2008
Posts: 1,302
Quote:
Originally Posted by Snowknight26 View Post
I don't know how to find that out, but here is a sample of where it changes from progressive to interlaced and vice versa.
Your sample is MBAFF. That means it's interlaced at the macroblock level. Within the same frame, some macroblocks may be coded as interlaced and others as progressive. In order to view this correctly the entire stream has to be deinterlaced, just like it is now.

BTW Tiesto rules
__________________
Specs, GTX970 - PLS 1440p@96Hz
Quote:
Originally Posted by Manao View Post
That way, you have xxxx[p|i]yyy, where xxxx is the vertical resolution, yyy is the temporal resolution, and 'i' says the image has been irremediably destroyed.
Old 20th December 2008, 00:46   #5771  |  Link
haruhiko_yamagata
Registered User
 
Join Date: Feb 2006
Location: Japan
Posts: 1,560
Quote:
Originally Posted by LoRd_MuldeR View Post
Some new numbers:

[...]

Can't say that the "normal" branch of ffdshow became significantly faster between Beta-5 (r2033) and Beta-6 (r2527) on my system...
Beta6 is 5 - 12% faster for me. Please make sure you get the new pre-beta6.
Quote:
Revision 16239 - Directory Listing
Modified Fri Dec 19 13:45:13 2008 UTC (10 hours, 15 minutes ago) by darkshikari

Port x264 deblocking code to libavcodec. This includes SSE2 luma deblocking code and both MMXEXT and SSE2 luma intra deblocking code for H.264 decoding. This assembly is available under --enable-gpl and speeds decoding of Cathedral by 7%.
__________________
[ Download ffdshow | Wiki ]

Last edited by haruhiko_yamagata; 20th December 2008 at 01:09.
Old 20th December 2008, 01:07   #5772  |  Link
Snowknight26
Registered User
 
Join Date: Aug 2007
Posts: 1,430
Quote:
Originally Posted by STaRGaZeR View Post
Your sample is MBAFF. That means it's interlaced at the macroblock level. Within the same frame, some macroblocks may be coded as interlaced and others as progressive. In order to view this correctly the entire stream has to be deinterlaced, just like it is now.
Since that's the case, when I enable the deinterlacer, it shouldn't be deinterlacing the progressive macroblocks... but it does anyway.
Old 20th December 2008, 01:11   #5773  |  Link
haruhiko_yamagata
Registered User
 
Join Date: Feb 2006
Location: Japan
Posts: 1,560
Quote:
Originally Posted by Snowknight26 View Post
when I enable the deinterlacer, it shouldn't be deinterlacing the progressive macroblocks
This is wrong. A macroblock encoded using the progressive algorithm may require deinterlacing.
Old 20th December 2008, 01:26   #5774  |  Link
fastplayer
Registered User
 
Join Date: Nov 2006
Posts: 799
Revision | 300-tlr2_h1080p.mov | terminatorsalvation-tlr2_h1080p.mov | Madagascar.avi
2033     | 44.7 | 43.2 | 198.2
2527     | 45.4 | 44.9 | 202.5
2527 ICL | 45.4 | 45.0 | 202.4

- All results in dfps
- First 2 trailers: 1920x800 H.264, 3rd one: 1280x720 DX50
- CPU: Athlon64 3500+
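
To put the table above in terms of the percentage gains discussed earlier in the thread, the relative change per sample can be worked out directly from the posted dfps values. This is only a sketch: the numbers are copied from the table, while the names `DFPS`, `percent_change` and `gains` are made up for illustration.

```python
# dfps figures copied from the table above (one value per sample, in table order).
SAMPLES = [
    "300-tlr2_h1080p.mov",
    "terminatorsalvation-tlr2_h1080p.mov",
    "Madagascar.avi",
]
DFPS = {
    "2033": [44.7, 43.2, 198.2],
    "2527": [45.4, 44.9, 202.5],
    "2527 ICL": [45.4, 45.0, 202.4],
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def gains(baseline, candidate):
    """Per-sample percentage change between two builds (hypothetical helper)."""
    return [percent_change(o, n) for o, n in zip(DFPS[baseline], DFPS[candidate])]


if __name__ == "__main__":
    for name, gain in zip(SAMPLES, gains("2033", "2527")):
        print(f"{name}: {gain:+.1f}%")
```

On these figures r2527 comes out roughly 1.6%, 3.9% and 2.2% ahead of r2033, with the ICL build essentially the same as the generic one.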
Old 20th December 2008, 01:29   #5775  |  Link
Snowknight26
Registered User
 
Join Date: Aug 2007
Posts: 1,430
Quote:
Originally Posted by haruhiko_yamagata View Post
This is wrong. A macroblock encoded using the progressive algorithm may require deinterlacing.
May require. If it doesn't require it, does it still get deinterlaced? Might I remind you of the screenshots I previously posted (#5721).

Last edited by Snowknight26; 20th December 2008 at 01:32.
Old 20th December 2008, 01:40   #5776  |  Link
haruhiko_yamagata
Registered User
 
Join Date: Feb 2006
Location: Japan
Posts: 1,560
Quote:
Originally Posted by Snowknight26 View Post
May require. If it doesn't require it, does it still get deinterlaced? Might I remind you of the screenshots I previously posted (#5721).
Anyway, the decoder flagged the frame correctly.
Whether to deinterlace or not is up to the deinterlacer. If you use a good deinterlacer, you will get satisfactory results.
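
The point that the decision belongs to the deinterlacer, not to how a macroblock was coded, can be illustrated with a content-based comb check. The sketch below is not ffdshow's code or any particular deinterlacer's algorithm; it is one simple heuristic, and the names `combed_fraction` and `needs_deinterlacing` as well as the default thresholds are assumptions chosen for illustration. A pixel counts as combed when it is clearly brighter or clearly darker than both the pixel above and the pixel below it.

```python
def combed_fraction(rows, threshold=100):
    """Fraction of interior pixels that stick out from both vertical neighbours.

    rows is a list of equally long lists of luma values (one list per line).
    The product (cur - above) * (cur - below) is large and positive only when
    the current line differs from both neighbouring lines in the same direction,
    which is what interlacing combs look like.
    """
    hits = 0
    total = 0
    for y in range(1, len(rows) - 1):
        for above, cur, below in zip(rows[y - 1], rows[y], rows[y + 1]):
            total += 1
            if (cur - above) * (cur - below) > threshold:
                hits += 1
    return hits / total if total else 0.0


def needs_deinterlacing(rows, threshold=100, min_fraction=0.25):
    """Hypothetical per-block decision based on picture content only."""
    return combed_fraction(rows, threshold) >= min_fraction


if __name__ == "__main__":
    smooth = [[10 * y] * 4 for y in range(6)]                    # vertical gradient
    combed = [[50] * 4 if y % 2 == 0 else [200] * 4 for y in range(6)]
    print(needs_deinterlacing(smooth), needs_deinterlacing(combed))
```

A check like this looks only at the decoded pixels, so a block that was coded as progressive can still be flagged if its two fields come from different moments in time.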
Old 20th December 2008, 02:27   #5777  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
Quote:
Originally Posted by haruhiko_yamagata View Post
Beta6 is 5 - 12% faster for me. Please make sure you get the new pre-beta6.
I use the latest version. Re-downloaded, just to be sure. File from 2008-12-19, 18:01.
This time I ran the comparison with only one single thread, so the multi-threading code can't have any impact.
However there still is no remarkable difference between Beta-5 and preBeta-6. See:

Code:
E:\HD\freedom EP1 sample.mkv, 1920x1080, High@L4.1

[ffdshow, rev2033, Beta-5, 2008-07-05, 1 thread]
User: 31s, kernel: 0s, total: 31s, real: 30s, fps: 19.8, dfps: 20.0
User: 30s, kernel: 0s, total: 30s, real: 30s, fps: 20.0, dfps: 20.0
User: 30s, kernel: 0s, total: 30s, real: 30s, fps: 20.0, dfps: 19.9

[ffdshow, rev2527, Pre-Beta 6, 2008-12-19, 1 thread]
User: 30s, kernel: 0s, total: 30s, real: 30s, fps: 20.0, dfps: 20.0
User: 30s, kernel: 0s, total: 30s, real: 30s, fps: 19.9, dfps: 20.0
User: 31s, kernel: 0s, total: 31s, real: 30s, fps: 19.7, dfps: 19.9

[ffdshow-MT, rev2525, 2008-12-20, 1 thread]
User: 30s, kernel: 0s, total: 30s, real: 30s, fps: 20.1, dfps: 20.0
User: 31s, kernel: 0s, total: 31s, real: 31s, fps: 19.6, dfps: 19.5
User: 31s, kernel: 0s, total: 31s, real: 31s, fps: 19.5, dfps: 19.5

[ffdshow-MT, rev2525, 2008-12-20, 4 threads]
User: 2s, kernel: 0s, total: 2s, real: 8s, fps: 244.9, dfps: 71.5
User: 2s, kernel: 0s, total: 2s, real: 8s, fps: 236.1, dfps: 71.3
User: 2s, kernel: 0s, total: 2s, real: 8s, fps: 244.9, dfps: 71.0

[CoreAVC, Version 1.8.5]
User: 0s, kernel: 0s, total: 0s, real: 7s, fps: 625.8, dfps: 83.2
User: 0s, kernel: 0s, total: 0s, real: 7s, fps: 773.0, dfps: 83.0
User: 1s, kernel: 0s, total: 1s, real: 7s, fps: 588.4, dfps: 82.3

[DivX H.264 Decoder, Beta-3]
User: 1s, kernel: 0s, total: 1s, real: 6s, fps: 499.0, dfps: 89.4
User: 1s, kernel: 0s, total: 1s, real: 6s, fps: 433.2, dfps: 89.0
User: 1s, kernel: 0s, total: 1s, real: 7s, fps: 486.7, dfps: 88.0

Another sample, just to be sure. But same result:

Code:
E:\HD\Crowd Run 2160p UHD CRF22 x264-CtrlHD.mkv

[ffdshow, rev2033, Beta-5, 2008-07-05, 1 thread]
User: 121s, kernel: 0s, total: 121s, real: 121s, fps: 4.1, dfps: 4.1
User: 121s, kernel: 0s, total: 121s, real: 121s, fps: 4.1, dfps: 4.1
User: 121s, kernel: 0s, total: 121s, real: 121s, fps: 4.1, dfps: 4.1

[ffdshow, rev2527, Pre-Beta 6, 2008-12-19, 1 thread]
User: 120s, kernel: 0s, total: 120s, real: 120s, fps: 4.1, dfps: 4.1
User: 120s, kernel: 0s, total: 120s, real: 120s, fps: 4.1, dfps: 4.1
User: 121s, kernel: 0s, total: 121s, real: 120s, fps: 4.1, dfps: 4.1

[ffdshow-MT, rev2525, 2008-12-20, 4 threads]
User: 8s, kernel: 0s, total: 8s, real: 35s, fps: 61.1, dfps: 14.2
User: 7s, kernel: 0s, total: 8s, real: 35s, fps: 62.4, dfps: 14.2
User: 7s, kernel: 0s, total: 7s, real: 35s, fps: 62.9, dfps: 14.2

[DivX H.264 Decoder, Beta-3]
User: 3s, kernel: 0s, total: 4s, real: 28s, fps: 121.7, dfps: 17.5
User: 4s, kernel: 0s, total: 4s, real: 28s, fps: 104.6, dfps: 17.5
User: 4s, kernel: 0s, total: 4s, real: 28s, fps: 118.5, dfps: 17.4
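
For a quick summary of the first sample's runs, the dfps values can be averaged per configuration and compared. This is a sketch under the assumption that dfps is the figure to compare; the values are copied from the runs above, and the names `RUNS`, `speedup` and `parallel_efficiency` are made up for illustration.

```python
from statistics import mean

# dfps values copied from the first sample's runs above (three runs each).
RUNS = {
    "ffdshow r2033 Beta-5, 1 thread": [20.0, 20.0, 19.9],
    "ffdshow r2527 Pre-Beta 6, 1 thread": [20.0, 20.0, 19.9],
    "ffdshow-MT r2525, 1 thread": [20.0, 19.5, 19.5],
    "ffdshow-MT r2525, 4 threads": [71.5, 71.3, 71.0],
}


def speedup(baseline, candidate):
    """Ratio of mean dfps: values above 1.0 mean the candidate is faster."""
    return mean(RUNS[candidate]) / mean(RUNS[baseline])


def parallel_efficiency(baseline, candidate, threads):
    """Speedup divided by thread count (1.0 would be perfect scaling)."""
    return speedup(baseline, candidate) / threads


if __name__ == "__main__":
    beta = speedup("ffdshow r2033 Beta-5, 1 thread",
                   "ffdshow r2527 Pre-Beta 6, 1 thread")
    mt = speedup("ffdshow-MT r2525, 1 thread", "ffdshow-MT r2525, 4 threads")
    print(f"Beta-5 -> Pre-Beta 6: x{beta:.2f}")
    print(f"MT 1 -> 4 threads:    x{mt:.2f}")
```

With these numbers the two single-threaded ffdshow builds average the same dfps, while ffdshow-MT goes from about 19.7 to about 71.3 dfps with four threads, a speedup of about 3.6x.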

Last edited by LoRd_MuldeR; 20th December 2008 at 02:55.
Old 20th December 2008, 06:24   #5778  |  Link
Snowknight26
Registered User
 
Join Date: Aug 2007
Posts: 1,430
Quote:
Originally Posted by haruhiko_yamagata View Post
Whether to deinterlace or not is up to the deinterlacer. If you use a good deinterlacer, you will get satisfactory results.
Which deinterlacer that comes with ffdshow would qualify as being 'good?'
Old 20th December 2008, 06:33   #5779  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by LoRd_MuldeR View Post
I use the latest version. Re-downloaded, just to be sure. File from 2008-12-19, 18:01.
This time I ran the comparison with only one single thread, so the multi-threading code can't have any impact.
However there still is no remarkable difference between Beta-5 and preBeta-6. See:
Are you sure the person who compiled your copy of ffdshow had yasm installed? Otherwise, none of the new assembly code will get used...

(it also requires --enable-gpl...)
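
Whoever builds ffdshow could rule out the first of these causes with a small pre-build check that the assembler is actually on PATH. This is only a sketch: `missing_tools` is a hypothetical helper, the only real call used is Python's `shutil.which`, and it does not check whether `--enable-gpl` was passed to configure.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in tools that cannot be found on PATH.

    The lookup function is a parameter so the check can be exercised
    without depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["yasm"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
        print("The new assembly code would not be built.")
    else:
        print("yasm found.")
```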
Old 20th December 2008, 07:29   #5780  |  Link
haruhiko_yamagata
Registered User
 
Join Date: Feb 2006
Location: Japan
Posts: 1,560
@LoRd_MuldeR
It may depend on the sample. Please try premiere-paff.ts or bbc-japan_1080p.mov.