Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion.

Old 15th April 2014, 21:25   #25981  |  Link
ThurstonX
Registered User
 
Join Date: Mar 2006
Posts: 58
AMD Interop Tests

Finally found some time to do a few quick tests. AMD Radeon R7 200 Series; passively cooled (SAPPHIRE Ultimate 100368USR Radeon R7 250 1GB 128-Bit GDDR5 PCI Express 3.0); Catalyst 14.2; Core i5-3470; 8 GB RAM
Display is an old Sharp Aquos LC-32GA5U running at native 1366x768 via DVI

tl;dr
v0.87.9 couldn't run without dropping frames; Test1 ran with Luma doubling forced at 16 neurons (32 was too much) and Jinc 3 AR; Test2 could only handle Lanczos 3 AR, so I vote for Test1.

Hope this helps, and thanks for the test builds. Definitely a step in the right direction!


I started with v0.87.9, trying to force NNEDI3 to double Luma resolution using 16 neurons. Plenty of dropped frames.

Settings
  • Chroma upscaling: Bicubic 75 (No AR)
  • Image doubling: use NNEDI3 to double Luma; always - if upscaling is needed; 16 neurons
  • Image upscaling: Jinc 3 AR
  • Image downscaling: Catmull-Rom scale in linear light
  • No Debanding
  • Smooth motion: Enable, only if there would be motion judder without it...
  • Dithering: Ordered; use colored noise; change dither for every frame
  • Trade quality for performance: first five items checked
  • Exclusive mode settings at default

With v0.87.9 I got the following:
Queues
  • Decoder: 13-16/16
  • Upload: 6-8/8
  • Deinterlace: 5-8/8
  • Render: 2-4/8
  • Present: 0-2/8
Tons of dropped frames

With Test1:
  • Decoder: 14-16/16
  • Upload: 7-8/8
  • Deinterlace: 6-8/8
  • Render: varied from 5-7/8; 6-7/8; 6-8/8
  • Present: varied from 4-5/8; 4-6/8
1 frame repeat every 3.63 secs
NO dropped frames

Source video (a VHS capture using an old Hauppauge card)

Format : MPEG-PS
File size : 8.90 GiB
Duration : 1h 40mn
Overall bit rate : 12.7 Mbps

Video
ID : 224 (0xE0)
Format : MPEG Video
Format version : Version 2
Format profile : Main@Main
Format settings, BVOP : Yes
Format settings, Matrix : Custom
Format settings, GOP : M=3, N=15
Duration : 1h 40mn
Bit rate : 12.0 Mbps
Width : 720 pixels
Height : 480 pixels
Display aspect ratio : 4:3
Frame rate : 29.970 fps
Standard : NTSC
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Interlaced
Scan order : Top Field First
Compression mode : Lossy
Bits/(Pixel*Frame) : 1.159
Time code of first frame : 00:00:00:00
Time code source : Group of pictures header
Stream size : 8.46 GiB (95%)

Audio
ID : 192 (0xC0)
Format : MPEG Audio
Format version : Version 1
Format profile : Layer 2
Duration : 1h 40mn
Bit rate mode : Constant
Bit rate : 384 Kbps
Channel(s) : 2 channels
Sampling rate : 48.0 KHz
Compression mode : Lossy
Delay relative to video : -111ms
Stream size : 275 MiB (3%)
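
As a side note, the Bits/(Pixel*Frame) figure in the video section above can be reproduced from the other fields. This is only a minimal sketch, assuming the value is simply the video bit rate divided by width x height x frame rate; the function name is made up for illustration and is not part of MediaInfo.

```python
def bits_per_pixel_frame(bit_rate_bps: float, width: int, height: int, fps: float) -> float:
    """Average number of bits spent on each pixel of each frame (hypothetical helper)."""
    return bit_rate_bps / (width * height * fps)


# Values copied from the report above: 12.0 Mbps, 720x480, 29.970 fps.
bpf = bits_per_pixel_frame(12.0e6, 720, 480, 29.970)
print(round(bpf, 3))  # 1.159
```

At roughly 1.16 bits per pixel per frame this is a generously encoded SD MPEG-2 stream, so the source bit rate itself is unlikely to be what stresses the renderer here.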


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Been playing an NTSC DVD (Peter Gabriel Live in Athens 1987). Same resolution, interlaced, but I got dropped frames using Jinc 3 AR. It's OK after switching to Lanczos 3 AR. I didn't check any other differences between the two sources.

2nd edit:
I think the difference is that the DVD is really 16:9, while the VHS capture is 4:3. That's my guess, anyway.

Last edited by ThurstonX; 16th April 2014 at 01:32. Reason: Additional Info
Old 15th April 2014, 21:56   #25982  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,406
Quote:
Originally Posted by seiyafan View Post
what if the video quality is high, like blu-ray? Would it still benefit more from debanding than dithering?
ED (error diffusion) is a large performance hit and ordered dither is quite good. If you have banding, debanding + OD is better than no debanding + ED. Debanding + no dither is not a reasonable option.

Last edited by Asmodian; 15th April 2014 at 22:00.
Old 15th April 2014, 22:00   #25983  |  Link
leeperry
Kid for Today
 
Join Date: Aug 2004
Posts: 3,477
I believe this was already mentioned, but 48/96/192-neuron NNEDI would be pretty cool for when you can't quite run 64/128 and still have unused cycles to spare. I still find NNEDI a bit too sharp for chroma, but sometimes I don't have enough horsepower to go to the next level, and I can't run 256-neuron luma for SD@1080p. I would welcome the opportunity to try 192 if technically doable.

It's even more true now that AMD boards have magically earned extra headroom.
Old 15th April 2014, 22:13   #25984  |  Link
Fullmetal Encoder
Registered User
 
Join Date: Jan 2011
Posts: 107
I'd love to test the new build, but I keep getting a "dxva processing failed" message from madVR. I don't have any of the DXVA options selected in the UI, and I'm using a Radeon 5850 on Windows 7 Pro. I would add that this is with ED 2 selected.

Last edited by Fullmetal Encoder; 15th April 2014 at 22:16.
Old 15th April 2014, 22:20   #25985  |  Link
kasper93
MPC-HC Developer
 
Join Date: May 2010
Location: Poland
Posts: 586
@Fullmetal Encoder: Make sure to disable all "enhancements" in the GPU driver, especially "dynamic contrast". There seems to be a bug in the drivers that breaks DXVA processing; for example, deinterlacing will fail to initialize when opening a file, yet you can re-enable it later.
Old 15th April 2014, 22:45   #25986  |  Link
QBhd
QB the Slayer
 
Join Date: Feb 2011
Location: Toronto
Posts: 697
Time for my report on the test builds.

System:
R9 270X factory OC to 1120/1400 (GPU/memory)
PCI-e 2.0 (GA-990FXA-UD7)
Windows 8.1

Target resolution:
1024x768 (rectangular pixels)

Source:
1280x720p24

Settings:
Chroma Upscaling - NNEDI3 x32
Image Upscaling - Jinc 3 AR
Luma Doubling - NNEDI3 x64
Chroma Doubling - NNEDI3 x32
Image Downscaling - Catmull-Rom AR LL
Debanding - med/med

With the previous release I could only do Ordered Dithering (Error Diffusion pushed it just over the limit of the GPU).

Testbuild1 - Still dropped frames with ED
Testbuild2 - NO dropped frames with ED

So my vote goes to Testbuild2. It allows me to go further than any build to date.

QB
Last edited by QBhd; 16th April 2014 at 01:44.
Old 16th April 2014, 00:06   #25987  |  Link
Fullmetal Encoder
Registered User
 
Join Date: Jan 2011
Posts: 107
Scaling from 720x480 to 1920x1200 with ED2 and NNEDI doubling at 32 neurons on luma using a Radeon 5850 I am getting:

- 52 frame drops/refresh with 87.9
- 40 frame drops/refresh with test 1
- 31 frame drops/refresh with test 2

I don't know why others with the 5850 are getting so much better performance though.

Last edited by Fullmetal Encoder; 16th April 2014 at 00:14.
Old 16th April 2014, 00:08   #25988  |  Link
Fullmetal Encoder
Registered User
 
Join Date: Jan 2011
Posts: 107
Quote:
Originally Posted by kasper93 View Post
@Fullmetal Encoder: Make sure to disable all "enhancements" in the GPU driver, especially "dynamic contrast". There seems to be a bug in the drivers that breaks DXVA processing; for example, deinterlacing will fail to initialize when opening a file, yet you can re-enable it later.
Thank you very much! All of those "enhancements" were on in CCC, although I'm not sure how they got turned on, since I turned them off long ago.
Old 16th April 2014, 00:50   #25989  |  Link
tickled_pink
Registered User
 
Join Date: Dec 2011
Posts: 12
Win7 x64, HD 7750 PCI-E 2.0x16

720x404@25fps with 64 neurons tested

Test build 1 uses slightly less (55% vs 57%) GPU and less graphics memory (~40 MB or 10%) than test build 2.

0.87.9 used ~60% GPU and similar amount of memory as test build 1.

Neither improved performance enough to allow more neurons but a 10% overall improvement is certainly welcome!
Old 16th April 2014, 01:23   #25990  |  Link
sajara
Registered User
 
Join Date: Jan 2013
Posts: 18
This test came as a bit of a shock, because I remember being unable to use NNEDI3 even with 16 neurons when it was first released, and I didn't bother to try again.

AMD 5730M, 650 MHz core / 800 MHz GDDR3 memory

H264 clip 720x304 -> 1366x768

87.9 - 16 Neurons ~86.7% / 32 Neurons - slideshow
Test 1 - 16 Neurons ~57.6% / 32 Neurons ~86.5%
Test 2 - 16 Neurons ~60% / 32 Neurons ~89.7%

Queues the same in test 1 and 2.

So again, I'm beyond words at the improvement, and just as amazed to be able to run 32 neurons.
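
To put the 16-neuron numbers above in perspective, here is a minimal sketch of the relative drop in reported GPU load versus 87.9. The helper name is hypothetical, and it assumes the load percentage is a fair proxy for the work done, which it may well not be.

```python
def relative_reduction(baseline_pct: float, new_pct: float) -> float:
    """Percentage by which new_pct is lower than baseline_pct (hypothetical helper)."""
    return (baseline_pct - new_pct) / baseline_pct * 100.0


# GPU load at 16 neurons, copied from the figures above.
load_16 = {"87.9": 86.7, "Test 1": 57.6, "Test 2": 60.0}

for build in ("Test 1", "Test 2"):
    drop = relative_reduction(load_16["87.9"], load_16[build])
    print(f"{build}: about {drop:.1f}% less GPU load than 87.9")
```

By that rough measure both test builds cut the load by around a third on this GPU, with Test 1 slightly ahead.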
Old 16th April 2014, 01:49   #25991  |  Link
ryrynz
Registered User
 
Join Date: Mar 2009
Posts: 3,646
Do the test builds improve anything on Nvidia hardware at all?
Old 16th April 2014, 06:40   #25992  |  Link
Procrastinating
Registered User
 
Join Date: Aug 2013
Posts: 71
Considering the previous responses, and the less meaningful low render times, I changed the survey defaults to double luma and added a default video, Tears of Steel. Remember that any data is good data, and this survey/spreadsheet will help not only madshi but also those interested in what the optimal media GPU for them might be.

To madshi in particular: I think it will be interesting to see how the improvements between versions show up across the various hardware configurations. I will probably try the AMD builds at some point.
Old 16th April 2014, 07:00   #25993  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,406
Quote:
Originally Posted by ryrynz View Post
Do the test builds improve anything on Nvidia hardware at all?
Yes, but only with SLI on. SLI is much better with the test builds, though still not as fast as without SLI.

madVR 87.9:
Quote:
Originally Posted by Asmodian View Post
1280x720p24 -> 2560x1440 @ 72Hz, Bicubic75 AR chroma, Bicubic75 AR image, NNEDI3 128 Luma doubling, No smooth motion, no debanding, Ordered Dither, 3DLUT calibration, Windowed Overlay.

SLI on 41.2ms
GPU0 81% @ 1097 MHz, 19% PCI-E, 7% memory controller
GPU1 18% @ 836 MHz, 16% PCI-E, 0% memory controller

SLI off 29.9ms
GPU0 67% @ 1097 MHz, 5% PCI-E, 7% memory controller
GPU1 00% @ 324 MHz, 0% PCI-E, 0% memory controller

GTX Titans, 3770K @ 4.6 GHz, Z77 chipset, each GPU is on PCI-E 3.0 x8, 32GB DDR3-2133CL9.

madVR interopTest1 & interopTest2 (the two are identical as far as I can tell):

SLI on
GPU0 73% @ 1097 MHz, 11% PCI-E, 7% memory controller
GPU1 09% @ 836 MHz, 7% PCI-E, 0% memory controller

SLI off
GPU0 67% @ 1097 MHz, 5% PCI-E, 7% memory controller
GPU1 00% @ 324 MHz, 0% PCI-E, 0% memory controller

I can also run my "720p24" profile with SLI on, which used to drop a lot of frames: Jinc3 chroma, Jinc3 image, NNEDI3 128 luma doubling, ED2, no debanding, no smooth motion.

I did recheck 87.9 immediately after these tests and it does perform as it did before so this isn't an accidental setting or system change.

Nvidia Driver 337.50

Quote:
Originally Posted by leeperry View Post
I believe this was already mentioned, but 48/96/192-neuron NNEDI would be pretty cool for when you can't quite run 64/128 and still have unused cycles to spare. I still find NNEDI a bit too sharp for chroma, but sometimes I don't have enough horsepower to go to the next level, and I can't run 256-neuron luma for SD@1080p. I would welcome the opportunity to try 192 if technically doable.

It's even more true now that AMD boards have magically earned extra headroom.
Sadly I don't think finer grained neuron settings are possible. From the NNEDI3 docs:

Quote:
nns -

Sets the number of neurons in the predictor neural network. Possible settings are
0, 1, 2, 3, and 4. 0 is fastest. 4 is slowest, but should give the best quality. This
is a quality vs speed option; however, differences are usually small. The difference
in speed will become larger as 'qual' is increased.

0 - 16
1 - 32
2 - 64
3 - 128
4 - 256

Default: 1 (int)
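
The table in the quoted docs follows a simple doubling pattern, which is why 48, 96 and 192 have no corresponding setting. A minimal sketch of that mapping; the function name is my own for illustration and is not part of NNEDI3 or madVR.

```python
def neurons_for_nns(nns: int) -> int:
    """Map the documented nns setting (0..4) to its neuron count (hypothetical helper)."""
    if nns not in range(5):
        raise ValueError("nns must be 0, 1, 2, 3, or 4")
    return 16 << nns  # 16, 32, 64, 128, 256


valid_counts = [neurons_for_nns(n) for n in range(5)]
print(valid_counts)  # [16, 32, 64, 128, 256]

# None of the in-between sizes requested earlier are reachable via nns.
for wanted in (48, 96, 192):
    print(wanted, wanted in valid_counts)
```

Presumably in-between sizes would require new trained weights rather than just another option, but that is a guess on my part and not something the docs state.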
Another impressive update madshi, and I don't even have an AMD GPU. Thanks again!

Last edited by Asmodian; 16th April 2014 at 08:51.
Old 16th April 2014, 08:34   #25994  |  Link
James Freeman
Registered User
 
Join Date: Sep 2013
Posts: 919
Quote:
Originally Posted by Asmodian View Post
Another impressive update madshi and I don't even have an AMD GPU. Thanks again!
I'm pretty sure you're wrong, unless madshi is a real living magician....
__________________
System: i7 3770K, GTX660, Win7 64bit, Panasonic ST60, Dell U2410.
Old 16th April 2014, 08:37   #25995  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,406
Huh? did you read my post?
Old 16th April 2014, 08:45   #25996  |  Link
James Freeman
Registered User
 
Join Date: Sep 2013
Posts: 919
Ohhhhhh.... I see.
There should be a comma there.

Like so:
Quote:
Another impressive update madshi, and I don't even have an AMD GPU. Thanks again!
Not like so (what I thought):
Quote:
Another impressive update, madshi and I don't even have an AMD GPU. Thanks again!
__________________
System: i7 3770K, GTX660, Win7 64bit, Panasonic ST60, Dell U2410.
Old 16th April 2014, 08:51   #25997  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,406
OH! haha yes, I never saw that reading.
Old 16th April 2014, 11:54   #25998  |  Link
Procrastinating
Registered User
 
Join Date: Aug 2013
Posts: 71
Alright, after testing the new test builds, I can confirm that on my HD6770 I go from:

Old: Many drops, render times ~46ms on 720p->1080p source, using luma 32 doubling
New interop 1: ~0.2 drops per second
New interop 2: ~ 1 drop per second.

The problem now is that I'm no longer seeing render times in the debug window (with the new builds).

I can conclude, however, that I am at least getting the fastest render times from test build 1, and the improvements are at least enough to prevent noticeable frame drops on a particular source now.

Last edited by Procrastinating; 16th April 2014 at 13:45.
Old 16th April 2014, 11:57   #25999  |  Link
romulous
Registered User
 
Join Date: Oct 2012
Posts: 179
Quote:
Originally Posted by Procrastinating View Post
The problem now, is that I'm no longer seeing render times in the debug window (with the new builds)!
Quoting from that same post in which madshi posted the download link (two lines under the link itself in fact):

Quote:
Originally Posted by madshi View Post
I've intentionally removed the rendering times from the OSD (only for these test builds, of course) because due to the way these 2 test builds work, judging them by looking at the rendering times would be misleading. So please judge these builds by testing which build allows you to use higher/more quality settings.
Old 16th April 2014, 12:06   #26000  |  Link
Procrastinating
Registered User
 
Join Date: Aug 2013
Posts: 71
My bad, but the conclusion stands at least.

That said, I wonder where the difference in results between the two builds lies, with some people reporting better results with one and some with the other. It doesn't appear to be related to overall architecture, so possibly clocks?

Last edited by Procrastinating; 16th April 2014 at 12:12.
All times are GMT +1.