Old 3rd May 2009, 20:53   #821  |  Link
yesgrey
Registered User
 
Join Date: Sep 2004
Posts: 1,295
Quote:
Originally Posted by madshi View Post
Which means that if you compare madVR 0.8 statistics to madVR 0.9, you have to subtract the madVR 0.9 "uploading textures" time from the average GPU rendering time.
The uploading textures time is ~4.6ms. The increase in total time is around 2-3ms, so after subtracting, v0.9 is about 2ms faster in total rendering time. Good work.
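
A quick sketch of that subtraction, assuming the ~4.6ms upload time and the 2-3ms total increase quoted above (the function name is only for illustration):

```python
def comparable_delta_ms(total_increase_ms, upload_ms):
    """Change in GPU time from 0.8 to 0.9 once the 0.9-only
    "uploading textures" time is taken out of the 0.9 total."""
    return total_increase_ms - upload_ms


UPLOAD_MS = 4.6  # "uploading textures" time shown by madVR 0.9

# The total went up by somewhere between 2 and 3 ms.
for increase in (2.0, 3.0):
    delta = comparable_delta_ms(increase, UPLOAD_MS)
    print(f"total +{increase:.1f}ms -> comparable change {delta:+.1f}ms")
```

Both ends of the range come out negative, so the comparable rendering time went down by roughly 2ms.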

Quote:
Originally Posted by madshi View Post
However, strangely enough, madVR 0.9 shows lower GPU rendering times for me even without doing this math...
What's your memory bandwidth?

I've run some overclocking tests. Increasing the memory clock gives very little improvement: GPU times drop by around 1% for a 30% memory clock increase. Increasing the core and shader clocks gives a real gain: about 10% lower times for a 10% clock increase. It seems my GPU is shader limited.
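
To put numbers on that, here is a small sketch that divides the time reduction by the clock increase. The "sensitivity" metric and the names are my own shorthand, not anything madVR reports:

```python
def scaling_sensitivity(time_reduction_pct, clock_increase_pct):
    """How much of a clock increase shows up as lower GPU time.
    Near 1.0: that clock is the limiting factor. Near 0.0: it is not."""
    return time_reduction_pct / clock_increase_pct


memory = scaling_sensitivity(1.0, 30.0)   # ~1% faster for +30% memory clock
core = scaling_sensitivity(10.0, 10.0)    # ~10% faster for +10% core/shader clock

bottleneck = "shader/core" if core > memory else "memory"
print(f"memory: {memory:.2f}, core/shader: {core:.2f} -> {bottleneck} limited")
```

The core/shader figure is about 1.0 and the memory figure is close to zero, which is what "shader limited" means here.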
Old 3rd May 2009, 21:24   #822  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Quote:
Originally Posted by Chumbo View Post
1920x1080 resolution
--- @30Hz - smoothest playback and no tearing, however, not completely smooth.
madVR is not optimized for smooth playback yet.

Quote:
Originally Posted by Chumbo View Post
--- @60Hz - Smooth like 30Hz but tearing exists

--- @24Hz - Not smooth and tearing. It's pretty much a mess, but that goes for all renderers. I don't know much about this stuff, but it doesn't seem to be a renderer issue?
Have you tried VMR or EVR fullscreen exclusive mode? That always got rid of any tearing for me. Unfortunately madVR does not support fullscreen exclusive mode yet.

Quote:
Originally Posted by nijiko View Post
madVR can not work with NVidia video decoder (MPEG2).
Can anybody confirm this problem?

Quote:
Originally Posted by yesgrey3 View Post
What's your memory bandwidth?
10.7 GPixels/s. 10.7 GTexels/s. 427 GFLOPS.

Quote:
Originally Posted by yesgrey3 View Post
I've run some overclocking tests. Increasing the memory clock gives very little improvement: GPU times drop by around 1% for a 30% memory clock increase. Increasing the core and shader clocks gives a real gain: about 10% lower times for a 10% clock increase. It seems my GPU is shader limited.
I guess it's time for you to stop using Nearest Neighbor scaling then...

Seriously, are these numbers with 1:1 display or with scaling active? I guess with 1:1 display memory bandwidth is not that much of a problem. But IIRC you had an 8600? That one really has low shader power. That may explain why you're shader limited...
Old 3rd May 2009, 21:28   #823  |  Link
Egh
Registered User
 
Join Date: Jun 2005
Posts: 630
Not bad. SD content (upscaled to 720p on playback) is now really low in CPU consumption. 704x400 AVC upscaled to fit on a 1280x1024 screen typically uses less than 10% CPU.

As for 720p AVC content, the difference is less impressive. Compared with Haali over the same file, mVR uses approx. 10% CPU extra for each scene (i.e. if a scene is 12% with Haali then it is about 20-25% with mVR; if Haali uses 20% CPU then mVR uses 30%).

However, the new totals for average time are actually higher than before. After doing the maths and subtracting the texture update time, though, it is somewhat better with version 0.9. The average timings are now as follows (3dluts disabled):
roughly 60% for update texture, 25% render, 15% resample (720p content no rescale)
roughly 30% for update texture, 45% render, 25% resample (720p content slight downscale to fit in the window)
roughly 20% for update texture, 30% render, 50% resample (704x400 upscaled to 720p)

It's interesting to check the times for the actual scaling. As for the absolute values, rendering takes roughly 1.5ms with no rescale and 5ms with rescale (720p). Update texture time is pretty much the same in both cases. Resample time is roughly three times larger with rescaling as well.
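
The percentages and the absolute render times can be cross-checked against each other. This sketch assumes the 5ms figure belongs to the "slight downscale" row, which the post does not state explicitly; the helper name is made up:

```python
def stage_times_ms(render_ms, shares):
    """Derive per-stage times from one known stage time (render)
    and the share of the total that each stage takes."""
    total = render_ms / shares["render"]
    return {stage: total * share for stage, share in shares.items()}


no_rescale = stage_times_ms(1.5, {"upload": 0.60, "render": 0.25, "resample": 0.15})
rescale = stage_times_ms(5.0, {"upload": 0.30, "render": 0.45, "resample": 0.25})

for name, times in (("no rescale", no_rescale), ("rescale", rescale)):
    parts = ", ".join(f"{stage} {ms:.1f}ms" for stage, ms in times.items())
    print(f"{name}: {parts}")

ratio = rescale["resample"] / no_rescale["resample"]
print(f"resample ratio: {ratio:.1f}x")
```

Under that assumption the upload time comes out at about 3.6ms vs 3.3ms and resample at roughly three times larger, so the numbers are consistent with each other.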
Old 3rd May 2009, 21:37   #824  |  Link
6233638
Registered User
 
Join Date: Apr 2009
Posts: 1,019
0.9 seems to be running significantly smoother here on my 9400 with 1080p content.

Previously, I was seeing rendering times in the 60-70ms region. Now, I can get that down to an average of 22ms if I use bilinear and no 3D LUT. (I realise that this removes a lot of the advantages of madVR, but I get almost smooth playback with this)

Max GPU times are still around 45ms at times though. (which is more than a frame at 23.976)

If I use madVR's chroma upsampling (I find softcubic 50 to look best) average rendering times are around 38ms with max GPU times around 60-65ms.

Frame queue is generally at 16/16, though it has dropped as low as 13. With 0.8 it was hovering around 1/2. (the higher the better, right?)


I'm a bit confused about the max GPU rendering time though. It seems to jump around a lot, whereas I thought the max would simply update for higher numbers and not lower ones, giving you the true peak value for a film.
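
For reference, a small sketch of the per-frame time budget at 23.976 fps compared with the figures above (the labels are just mine):

```python
def frame_budget_ms(fps):
    """Time available to render one frame at the given frame rate."""
    return 1000.0 / fps


budget = frame_budget_ms(23.976)
print(f"budget per frame: {budget:.1f}ms")

measurements = [
    ("bilinear, no 3D LUT, average", 22.0),
    ("bilinear, no 3D LUT, max", 45.0),
    ("madVR chroma upsampling, average", 38.0),
    ("madVR chroma upsampling, max", 60.0),
]
for label, ms in measurements:
    status = "within" if ms <= budget else "over"
    print(f"{label}: {ms:.0f}ms ({status} budget)")
```

The averages fit inside one frame period, but the max values do not, which matches the "almost smooth" impression.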
Old 3rd May 2009, 21:41   #825  |  Link
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,047
I agree that v0.9 runs significantly smoother for HD content.

Still, it renders at 60 fps although the video only has a framerate of 24 fps. My question: Isn't that a huge waste of processing power? And can it be avoided, like other renderers obviously do?

(Sorry if this was answered already. This is a really huge thread, so I may have missed something)
__________________
There was of course no way of knowing whether you were being watched at any given moment.
How often, or on what system, the Thought Police plugged in on any individual wire was guesswork.



Last edited by LoRd_MuldeR; 3rd May 2009 at 22:03.
Old 3rd May 2009, 21:53   #826  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Quote:
Originally Posted by Egh View Post
Not bad. SD content (upscaled to 720p on playback) is now really low in CPU consumption. 704x400 AVC upscaled to fit on a 1280x1024 screen typically uses less than 10% CPU.

As for 720p AVC content, the difference is less impressive. Compared with Haali over the same file, mVR uses approx. 10% CPU extra for each scene (i.e. if a scene is 12% with Haali then it is about 20-25% with mVR; if Haali uses 20% CPU then mVR uses 30%).
I think/hope that CPU consumption will go further down once I redesign the display logic for smooth playback.

Quote:
Originally Posted by Egh View Post
As for the absolute values, roughly 1.5ms rendering no rescale and 5ms rendering with rescale (720p). Update texture time is pretty much same in both cases. Resample is roughly three times larger with rescaling as well.
Sounds pretty good to me.

Quote:
Originally Posted by 6233638 View Post
0.9 seems to be running significantly smoother here on my 9400 with 1080p content.

Previously, I was seeing rendering times in the 60-70ms region. Now, I can get that down to an average of 22ms if I use bilinear and no 3D LUT. (I realise that this removes a lot of the advantages of madVR, but I get almost smooth playback with this)

Max GPU times are still around 45ms at times though. (which is more than a frame at 23.976)

If I use madVR's chroma upsampling (I find softcubic 50 to look best) average rendering times are around 38ms with max GPU times around 60-65ms.

Frame queue is generally at 16/16, though it has dropped as low as 13. With 0.8 it was hovering around 1/2. (the higher the better, right?)
The higher the better. You getting higher numbers is probably a consequence of the lowered CPU usage.

Quote:
Originally Posted by 6233638 View Post
I'm a bit confused about the max GPU rendering time though. It seems to jump around a lot, whereas I thought the max would simply update for higher numbers and not lower ones, giving you the true peak value for a film.
Statistics are currently always collected over 1s and then reset. The purpose of resetting the max numbers is to allow testing the effect of different settings on them.
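
To illustrate the idea, here is a minimal sketch of a statistic that is collected per 1s window and then reset. This is not madVR's actual code; the class and method names are hypothetical:

```python
class WindowedStats:
    """Collects render times and publishes avg/max once per window,
    then starts over, so the max can go down as well as up."""

    def __init__(self, window_s=1.0):
        self.window_s = window_s
        self._start = None
        self._samples = []
        self.last_avg = None
        self.last_max = None

    def add(self, timestamp_s, render_ms):
        if self._start is None:
            self._start = timestamp_s
        # Window elapsed: publish the finished window and reset.
        if timestamp_s - self._start >= self.window_s:
            self._publish()
            self._start = timestamp_s
        self._samples.append(render_ms)

    def _publish(self):
        if self._samples:
            self.last_avg = sum(self._samples) / len(self._samples)
            self.last_max = max(self._samples)
        self._samples = []


stats = WindowedStats()
for t, ms in [(0.0, 20.0), (0.5, 45.0), (1.0, 22.0), (1.5, 24.0), (2.0, 21.0)]:
    stats.add(t, ms)
print(stats.last_avg, stats.last_max)
```

The first window contains a 45ms spike, but the most recently published max only reflects the second window, which is why the displayed max jumps around instead of holding the peak for the whole film.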

Quote:
Originally Posted by LoRd_MuldeR View Post
I agree that v0.9 runs significantly smoother for HD content
Happy to hear that many of you guys notice smoother results with v0.9!

However, smooth playback is still not really implemented yet. That's still due for a future version. So you can expect further improvements. v0.9 just lowered CPU and GPU consumption a bit...

Quote:
Originally Posted by LoRd_MuldeR View Post
Still, it renders at 60 fps although the video only has a framerate of 24 fps. My question: Isn't that a huge waste of processing power?
It's a waste of processing power, yes. I think it's at least partially responsible for the higher CPU usage compared to other renderers. The render logic will change once I implement smooth motion playback (should be soon now).
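
A rough sketch of how much work is at stake, comparing "render on every refresh" with "render only when a new video frame is due". This is an illustration only, not the planned madVR render logic, and it uses 24 fps instead of 23.976 to keep the arithmetic in integers:

```python
def render_passes(refresh_hz, video_fps, seconds=1):
    """Return (passes when rendering every refresh,
               passes when rendering only new video frames)."""
    every_refresh = refresh_hz * seconds
    shown = set()
    for vsync in range(refresh_hz * seconds):
        # Index of the video frame that is due at this refresh.
        frame_index = (vsync * video_fps) // refresh_hz
        shown.add(frame_index)
    return every_refresh, len(shown)


always, on_change = render_passes(60, 24)
print(f"every refresh: {always} passes/s, only new frames: {on_change} passes/s")
```

At 60Hz with 24 fps content, 36 of the 60 passes per second redraw a frame that has already been rendered.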
Old 3rd May 2009, 21:59   #827  |  Link
Egh
Registered User
 
Join Date: Jun 2005
Posts: 630
Quote:
Originally Posted by LoRd_MuldeR View Post
Still, it renders at 60 fps although the video only has a framerate of 24 fps. My question: Isn't that a huge waste of processing power? And can it be avoided, like other renderers obviously do?

(Sorry if this was answered already. This is a really huge thread, so I may have missed something)
Madshi typically replies in jumbo monster posts, so it is possible to miss it. I think your question has been answered.

I also wonder about the GPU timings while paused.

Even if 720p video is paused, it still takes roughly the same average time. Why are the update and resample texture times any different from zero? I'd say only rendering time should be needed in such a case. What is even more confusing: even though the window is staying still here, max rendering times are still higher than average by the same roughly 50%. Why would some shader passes take considerably more time than average for the same frame? (i.e. resample texture passes take roughly double the average time).
If no rescaling is used, things are even more shocking: 720p with no rescale, when paused, uses <1ms for average resampling time and about 5-6ms for max resampling time.
Old 3rd May 2009, 22:01   #828  |  Link
6233638
Registered User
 
Join Date: Apr 2009
Posts: 1,019
Quote:
Originally Posted by madshi View Post
The higher the better. You getting higher numbers is probably a consequence of the lowered CPU usage.
Hmm, that's disappointing. I had been told a 2.5GHz Pentium Dual Core (which is almost the same performance as a Core2Duo) should have been plenty for even the highest bitrate VC1/AVC blu-ray content. I guess that's not the case. (from looking at cpu load, it seems like that was the problem)

I wish I had found out about madVR before building this HTPC a few weeks ago. What sort of CPU would I need to decode 40mbps AVC/VC1 content and run madVR smoothly? (I realise that you've not fully optimised it or implemented smooth playback features yet)

Is there a chance of madVR ever working with DXVA (I assume that's something the MPC-HC guys have to fix, rather than it being a madVR problem) or would the additional GPU load then end up being too much?
Old 3rd May 2009, 22:02   #829  |  Link
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,047
Quote:
Originally Posted by madshi View Post
It's a waste of processing power, yes. I think it's at least partially responsible for the higher CPU usage compared to other renderers. The render logic will change once I implement smooth motion playback (should be soon now).
That sounds like good news
Old 3rd May 2009, 22:07   #830  |  Link
racerxnet
Registered User
 
Join Date: Jul 2004
Location: ILLINIOS
Posts: 50
Quote:
And another thing. I forgot to say.
madVR can not work with NVidia video decoder (MPEG2).
I confirm it's in YV12 mode.

Works fine for me using the Nvidia audio and video decoders with YV12. Hope to see the smooth playback efforts soon, as stated.

MAK
Old 3rd May 2009, 22:10   #831  |  Link
yesgrey
Registered User
 
Join Date: Sep 2004
Posts: 1,295
Quote:
Originally Posted by madshi View Post
I guess it's time for you to stop using Nearest Neighbor scaling then...

Seriously, are these numbers with 1:1 display or with scaling active? I guess with 1:1 display memory bandwidth is not that much of a problem. But IIRC you had an 8600? That one really has low shader power. That may explain why you're shader limited...
I'm scaling BR to 1280x960, using Lanczos8 for luma and SoftCubic100 for chroma. I only used Lanczos8 for testing purposes, because it's the most GPU intensive of all. I know that using Lanczos8 for downsampling is a bad idea, because with bicubic I should get the same image quality... Even bilinear would be pretty close...
Old 3rd May 2009, 22:20   #832  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,137
Quote:
Originally Posted by Egh View Post
I also wonder about the GPU timings while paused.

Even if 720p video is paused, it still takes roughly the same average time. Why are the update and resample texture times any different from zero? I'd say only rendering time should be needed in such a case. What is even more confusing: even though the window is staying still here, max rendering times are still higher than average by the same roughly 50%. Why would some shader passes take considerably more time than average for the same frame?
The stats may be wrong in paused state, I'm not sure about that.

Quote:
Originally Posted by 6233638 View Post
Hmm, that's disappointing. I had been told a 2.5GHz Pentium Dual Core (which is almost the same performance as a Core2Duo) should have been plenty for even the highest bitrate VC1/AVC blu-ray content. I guess that's not the case. (from looking at cpu load, it seems like that was the problem)
Well, madVR did use (and still uses) more CPU than most other renderers. Hopefully I'll be able to get that down to "normal" levels sooner or later. Then maybe you'll be fine with VC1/AVC content. I can't tell you what kind of CPU you need for decoding. Maybe your HTPC was built to make use of DXVA? In that case obviously with madVR you're out of luck, because madVR does not support DXVA and never will.

Quote:
Originally Posted by 6233638 View Post
What sort of CPU should I need to decode 40mbps AVC/VC1 content and run madVR smoothly?
I've no idea. Generally my target is to make madVR run smoothly when VMR/EVR are smooth *without DXVA*, too. Provided that the GPU is fast enough to keep up with the rendering work...

Quote:
Originally Posted by 6233638 View Post
Is there a chance of madVR ever working with DXVA
No chance. CUDA works, though.

Quote:
Originally Posted by racerxnet View Post
Works fine for me using the Nvidia audio and video decoders with YV12.
Thanks. It seems that nijiko has more problems with madVR than anybody else. And, I don't know why, but most of his problems don't seem to be reproducible by other people.

@nijiko, at least that one weird 700x218 clip you sent me should work properly with v0.9 now. That was the one clip where I was able to reproduce a problem.

Quote:
Originally Posted by yesgrey3 View Post
I'm scaling BR to 1280x960, using Lanczos8 for luma and SoftCubic100 for chroma. I only used Lanczos8 for testing purposes, because it's the most GPU intensive of all.
And you are still shader limited when using Lanczos8? Well, then I think you really do need a new GPU... Lanczos8 is rather heavy on memory, while shader math for scaling is relatively low.

Quote:
Originally Posted by yesgrey3 View Post
I know that for downsampling using Lanczos8 is a bad idea, because with bicubic I should get the same image quality... Even bilinear would be pretty close...
While downscaling differences may be smaller compared to upscaling differences, I still think that Lanczos downscaling is not a bad idea at all. I once tried upscaling+downscaling an image. And using Lanczos+Lanczos produced the best results. Noticeably better than Lanczos+Bicubic. With Lanczos+Lanczos the scaled image was very near to the original. Using Lanczos+Bicubic the scaled image was noticeably softer compared to the original.
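
For anyone who wants to experiment with this, here is a minimal sketch of the standard Lanczos kernel and the normalised tap weights for one output sample. It is the textbook definition, not madVR's shader code, and how the parameter `a` maps to madVR's "Lanczos8" naming is an assumption I have not verified:

```python
import math


def lanczos(x, a):
    """Textbook Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def tap_weights(frac, a):
    """Normalised weights of the 2*a source samples around an output
    sample located `frac` (0 <= frac < 1) past a source sample."""
    offsets = range(-a + 1, a + 1)
    raw = [lanczos(frac - k, a) for k in offsets]
    total = sum(raw)
    return [w / total for w in raw]


weights = tap_weights(0.5, 3)
print(len(weights), [round(w, 4) for w in weights])
```

Each output sample reads 2*a source samples per axis, so a larger `a` means many more texture reads while the math per read stays small.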
Old 3rd May 2009, 22:38   #833  |  Link
Hypernova
Registered User
 
Join Date: Feb 2006
Posts: 293
Here is the same frame (from 0.8, will install 0.9 right after). Could you or anyone point out where I might have made a mistake with EVR CP (or point me to a place that does)? In Catalyst Avivo every setting is either "use application settings" or not enabled. I didn't change anything on the Color page either. I can try including mplayer's OpenGL renderer, would that help? (setting: vo=gl:yuv=0:rectangle=2:lscale=5:cscale=5)






Added Haali renderer shots. Now it's hard to see the difference, but it's still there. Again, I would be happy if anyone can help me improve the EVR CP result. I have to live with that until madVR gets subtitle support, which is still some time in the future.

Last edited by Hypernova; 3rd May 2009 at 23:05. Reason: Add Haali
Old 3rd May 2009, 22:41   #834  |  Link
6233638
Registered User
 
Join Date: Apr 2009
Posts: 1,019
Quote:
Originally Posted by madshi View Post
Well, madVR did use (and still uses) more CPU than most other renderers. Hopefully I'll be able to get that down to "normal" levels sooner or later. Then maybe you'll be fine with VC1/AVC content. I can't tell you what kind of CPU you need for decoding. Maybe your HTPC was built to make use of DXVA? In that case obviously with madVR you're out of luck, because madVR does not support DXVA and never will.
Unfortunately, I think the 2.5GHz recommendation was based on the fact that I was getting a motherboard that has a 9400 and therefore would do all the processing on-board.

I've never been one for overclocking really (I'd rather run at the rated speeds without risk) but rather than go out and replace most of the components in this computer two weeks after buying it, I'm now running at 3GHz with a 10% overclock on the GPU and it's eliminated almost all spikes to 100% on the CPU graph when playing back video. (though I need to test and see what the highest bitrate/most demanding film I have is)

I've just noticed now though that, even though the GPU times are ok, CPU is at 100% usage if I enable the 3D LUT.

Perhaps once CPU usage is lowered by rendering at 24/30fps rather than 60, it'll be able to cope properly.

I'll have to see just how much I can push this system and have it still run reliably.
Old 3rd May 2009, 22:45   #835  |  Link
Hypernova
Registered User
 
Join Date: Feb 2006
Posts: 293
Quick report on 0.9: The reduction in GPU render time is about 5ms, I think. (Bilinear/Bilinear, upscaling from 848x480 to 2560x1600) (I didn't take note of the CPU usage before, sorry). I've also attached VSync.dat here.
Attached Files
File Type: 7z vsync.7z (1.9 KB, 13 views)
__________________
Spec: Intel Core i5-3570K, 8g ram, Intel HD4000, Samsung U28D590 4k monitor+1080p Projector, Windows 10.

Last edited by Hypernova; 4th May 2009 at 08:02. Reason: Misunderstanding the GPU render time
Old 3rd May 2009, 23:13   #836  |  Link
tetsuo55
MPC-HC Project Manager
 
Join Date: Mar 2007
Posts: 2,317
Quote:
Originally Posted by 6233638 View Post
What sort of CPU should I need to decode 40mbps AVC/VC1 content and run madVR smoothly
The 40mbps AVC requires a 3GHz C2D for the toughest scenes.
Add madVR to the mix and we would currently need a 3.3GHz C2D.

Your Pentium Dual-Core doesn't even come close; you would need to OC it to between 3.8 and 4.2GHz to get similar performance.

Most AVCs don't have tough scenes and will work fine on a slower CPU. I'm personally not going to take any chances and have ordered an E8400 (3GHz).
Old 3rd May 2009, 23:24   #837  |  Link
nijiko
Hi-Fi Fans
 
Join Date: Dec 2008
Posts: 222
No output with NVidia.
See snap in pic named no_op.jpg.
This Video clip is HD!

But works fine in SD.
See snap in pic named tmp_op.jpg.
Attached Images
  

Last edited by nijiko; 3rd May 2009 at 23:30.
Old 3rd May 2009, 23:34   #838  |  Link
honai
Guest
 
Posts: n/a
Quote:
Originally Posted by tetsuo55 View Post
The 40mbps AVC requires a 3GHz C2D for the toughest scenes.
Add madVR to the mix and we would currently need a 3.3GHz C2D.

Your Pentium Dual-Core doesn't even come close; you would need to OC it to between 3.8 and 4.2GHz to get similar performance.

Most AVCs don't have tough scenes and will work fine on a slower CPU. I'm personally not going to take any chances and have ordered an E8400 (3GHz).
Not true. I had a Pentium Dual-Core (E5200) in my rig and swapped it for an E8400, but only to get additional headroom in post-processing (LSF etc). The E5200 played all content just fine, even Pirates of the Caribbean AVC at 40mbps, and never hit 100%.

I'm not saying that more CPU power is unnecessary, though.
Old 3rd May 2009, 23:58   #839  |  Link
6233638
Registered User
 
Join Date: Apr 2009
Posts: 1,019
Quote:
Originally Posted by tetsuo55 View Post
The 40mbps AVC requires a 3GHz C2D for the toughest scenes.
Add madVR to the mix and we would currently need a 3.3GHz C2D.

Your Pentium Dual-Core doesn't even come close; you would need to OC it to between 3.8 and 4.2GHz to get similar performance.

Most AVCs don't have tough scenes and will work fine on a slower CPU. I'm personally not going to take any chances and have ordered an E8400 (3GHz).
Thanks for the info. Everything I had read suggested that there was almost no difference between a Pentium Dual-Core and a Core2Duo of equivalent clockspeed for the majority of tasks. (Dual-Core not Pentium D, which is much slower)


Due to the 12.5x multiplier on this, it means I'm not having to push the fsb so hard to get fairly substantial overclocks.

I don't want to speak too soon, but it seems to be running stable at 3.7GHz, and not even that hot. (15℃ below operating limits under load with the stock HSF, which I'll upgrade if I decide to keep running it like this, or faster if it'll do it)
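
The FSB arithmetic behind that, as a quick sketch. It assumes the 12.5x multiplier stays fixed and that the stock speed is the 2.5GHz mentioned earlier; the FSB values are derived, not measured:

```python
MULTIPLIER = 12.5  # CPU multiplier mentioned above


def fsb_mhz(core_mhz, multiplier=MULTIPLIER):
    """Base FSB clock needed for a given core clock at a fixed multiplier."""
    return core_mhz / multiplier


STOCK_MHZ = 2500
for core in (2500, 3000, 3700):
    overclock_pct = (core / STOCK_MHZ - 1.0) * 100.0
    print(f"{core}MHz core -> {fsb_mhz(core):.0f}MHz FSB ({overclock_pct:+.0f}%)")
```

With a high multiplier, each extra MHz on the FSB adds 12.5MHz to the core clock, so even a large overclock only needs a moderate FSB increase.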

Quote:
Originally Posted by honai View Post
Not true. I had a Pentium Dual-Core (E5200) in my rig and swapped it for an E8400, but only to get additional headroom in post-processing (LSF etc). The E5200 played all content just fine, even Pirates of the Caribbean AVC at 40mbps, and never hit 100%.

I'm not saying that more CPU power is unnecessary, though.
Good to hear. I wonder if there's something wrong with my software setup then, as it seems to be underperforming.
Old 4th May 2009, 00:18   #840  |  Link
yesgrey
Registered User
 
Join Date: Sep 2004
Posts: 1,295
Quote:
Originally Posted by madshi View Post
And you are still shader limited when using Lanczos8? Well, then I think you really do need a new GPU...
Well, my rendering times without L8 are <20ms, so I think I will keep it a little longer, until madVR is a little more mature, and then I will see if I need to change it or not... The 4770 is very tempting, but my last experience with an ATI card (Radeon 9500) was not very good... The analog VGA output was low quality and the drivers were also not very good at handling two displays, so I'm a bit afraid of spending money on an ATI card again...

Quote:
Originally Posted by madshi View Post
I once tried upscaling+downscaling an image...
But that's different from just downscaling. I've performed some tests with Avisynth, because I'm backing up a BR movie to a lower resolution, and I have tried the opposite: downscaling+upscaling. The Avisynth user's guide says that for downscaling bilinear should give the same results as bicubic (or better), but that's not true; the image is softer. Between Bicubic, Spline64 and Lanczos, there was no visible difference.
This is only true for downsampling; upsampling is a different story, and there Lanczos shows all its quality...