Old 26th October 2019, 22:46   #1881  |  Link
Adonisds
Registered User
 
Join Date: Sep 2018
Posts: 13
Does YouTube keep the original of every video? Will every video receive an AV1 version, or just future ones?
Old 26th October 2019, 23:09   #1882  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,502
Quote:
Originally Posted by Adonisds View Post
Does YouTube keep the original of every video?
Yes.


Quote:
Originally Posted by Adonisds View Post
Will every video receive an AV1 version, or just future ones?
Possibly every video, but we can't say for sure. It would cost a lot of CPU time and money to convert all videos in all resolutions (plus SDR/HDR) to AV1. For the longest time YouTube did not encode all videos to VP9 either, so we don't really know how it will go for AV1.
Old 26th October 2019, 23:39   #1883  |  Link
Adonisds
Registered User
 
Join Date: Sep 2018
Posts: 13
Quote:
Originally Posted by sneaker_ger View Post
Yes.



Possibly every video, but we can't say for sure. It would cost a lot of CPU time and money to convert all videos in all resolutions (plus SDR/HDR) to AV1. For the longest time YouTube did not encode all videos to VP9 either, so we don't really know how it will go for AV1.
Thank you. Are all videos from before VP9 available in VP9?
Old 27th October 2019, 00:48   #1884  |  Link
vidschlub
Registered User
 
Join Date: May 2016
Posts: 20
Quote:
Originally Posted by soresu View Post
Meanwhile AV1 was only standardised last year; it takes time to get these new codecs shipshape for production purposes, let alone to reach significant maturity.
Due to the extended interest in AV1 from such a wide group of companies, could we expect to see it gain traction faster than 265 at least?
Old 27th October 2019, 00:57   #1885  |  Link
vidschlub
Registered User
 
Join Date: May 2016
Posts: 20
Quote:
Originally Posted by Kirakishou View Post
[Xrip][Nekopara][OVA_Extra][GB][1080P][AV1_10bit].mp4 .


As predicted, anime teams are always the first to adopt crazy new codecs as soon as possible. I love these guys, they're nuts <3
Old 27th October 2019, 11:44   #1886  |  Link
birdie
.
 
Join Date: Dec 2006
Posts: 147
Quote:
Originally Posted by Adonisds View Post
Thank you. Are all videos from before VP9 available in VP9?
No. Google encodes to VP9 only videos with more than N views, or N views per day, where N and N/day are yet to be determined.

Quote:
Originally Posted by vidschlub View Post
Due to the extended interest in AV1 from such a wide group of companies, could we expect to see it gain traction faster than 265 at least?
I really doubt that, considering that AV1 is up to two orders of magnitude more computationally expensive and older x86 CPUs cannot even decode Full HD 60fps videos encoded in it in real time.
Old 27th October 2019, 14:18   #1887  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 190
Quote:
Originally Posted by vidschlub View Post
Due to the extended interest in AV1 from such a wide group of companies, could we expect to see it gain traction faster than 265 at least?
Probably a lot depends on how you define traction in this situation.

For example, it wouldn't surprise me if the total stream watch time for AV1 is already above HEVC due to YouTube encoding low res versions of popular videos. Mozilla released some numbers that suggested AV1 Firefox video views on their nightly channel rapidly rose to 20% of all video plays just before YouTube paused their AV1 rollout. Would be interested to see where that's gone since.

Instagram already uses a software decoder for VP9 on Android (a version of libvpx, surprisingly), so switching to dav1d and AV1 for popular content isn't totally unbelievable even before hardware decoders are widespread. (I'm not sure AV1 would be much better than VP9 in terms of bitrate for their user-generated content at low bitrates, but if SVT and dav1d are sufficiently better than libvpx then upgrading would actually save them time and energy.)

I'd love to see a real-world comparison of watching something like Breaking Bad on a phone on a metered connection. What are realistic bitrates for these users? If you can get a 30% bitrate saving just from synthetic film grain, and AV1's tools appear to work better at lower bitrates, how much does the software decoding actually cost you? Once you factor in network savings, is it actually noticeable against the baseline of having the screen on? What about the download scenario? Is it worth the battery hit to see 5 more episodes on your monthly bandwidth allowance?
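
To put rough numbers on the bandwidth side of that question, here is a back-of-the-envelope sketch. The 30% saving is the figure mentioned above; the 5 GB allowance, the 1000 kbps baseline bitrate and the 45-minute episode length are purely hypothetical inputs, and the battery cost of software decoding is not modelled at all.

```python
def episodes_per_allowance(allowance_gb: float, bitrate_kbps: float, minutes: float) -> float:
    """How many episodes fit into a data allowance at a given average bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    bytes_per_episode = bytes_per_second * minutes * 60
    return allowance_gb * 1e9 / bytes_per_episode


# Hypothetical inputs, not measurements.
ALLOWANCE_GB = 5.0
BASELINE_KBPS = 1000.0
EPISODE_MINUTES = 45.0
SAVING = 0.30  # the 30% bitrate saving discussed above

baseline = episodes_per_allowance(ALLOWANCE_GB, BASELINE_KBPS, EPISODE_MINUTES)
with_saving = episodes_per_allowance(ALLOWANCE_GB, BASELINE_KBPS * (1 - SAVING), EPISODE_MINUTES)

print(f"baseline: {baseline:.1f} episodes, with saving: {with_saving:.1f} episodes")
```

Whatever the inputs, a 30% bitrate saving stretches the same allowance by a factor of 1/0.7, i.e. about 43% more viewing time.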

Last edited by dapperdan; 27th October 2019 at 14:23.
Old 27th October 2019, 15:58   #1888  |  Link
Mr_Khyron
Member
 
Join Date: Nov 2002
Posts: 120
https://code.videolan.org/videolan/d...releases#0.5.1
Quote:
This is a minor update of the 0.5.0 version of dav1d, the fast and small AV1 decoder, codename 'Asiatic Cheetah'.

0.5.1 brings improvements in speed for SSE2 CPUs (up to 50% speedup), and ARMv7 CPUs (up to 41% speedup).

It also fixes minor issues and minor speed improvements for other architectures.

http://download.opencontent.netflix....ix=AV1/Sparks/
Netflix posted new AV1 samples with and without film grain in 540p, 1080p and 2160p
Old 27th October 2019, 17:38   #1889  |  Link
IgorC
Registered User
 
Join Date: Apr 2004
Posts: 1,315
I wonder why the dav1d developers have dedicated time to optimizing for SSE2. Isn't SSSE3 already old enough? AMD caught up and implemented SSSE3 in 2011. Even the outdated Core 2 Duo has SSSE3.

Meanwhile, 10-bit decoding has literally zero optimizations so far.

P.S. A few years ago I tested a 10-year-old laptop with a Pentium T4200 (SSSE3), which now sits unused. It could barely play YouTube VP9 720p videos and still dropped some frames, let alone AV1 with its 3x complexity.
AV1 would actually be a downgrade for this kind of hardware (from VP9 720p to AV1 360/480p). And we're talking about a CPU with SSSE3.

Last edited by IgorC; 27th October 2019 at 18:58.
Old 27th October 2019, 19:02   #1890  |  Link
birdie
.
 
Join Date: Dec 2006
Posts: 147
Quote:
Originally Posted by Mr_Khyron View Post
https://code.videolan.org/videolan/d...releases#0.5.1



http://download.opencontent.netflix....ix=AV1/Sparks/
Netflix posted new AV1 samples with and without film grain in 540p, 1080p and 2160p
Neither mpv nor ffplay can open these *.obu files. Any ideas how one can play them?
Old 27th October 2019, 19:20   #1891  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,502
Mux to mkv using mkvmerge first.
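
For anyone scripting this, a minimal sketch of that remux step follows. It assumes mkvmerge (MKVToolNix) is installed and on the PATH; the file name `Sparks.obu` in the usage note is a hypothetical placeholder, not the actual name of any Netflix sample.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def mkvmerge_command(obu_path: str) -> list[str]:
    """Build the mkvmerge argument list that wraps a raw .obu file in Matroska."""
    src = Path(obu_path)
    dst = src.with_suffix(".mkv")
    # "-o <output> <input>" is mkvmerge's basic remux invocation.
    return ["mkvmerge", "-o", str(dst), str(src)]


def remux(obu_path: str) -> None:
    """Run mkvmerge if it is installed; otherwise only print the command."""
    cmd = mkvmerge_command(obu_path)
    if shutil.which("mkvmerge") is None:
        print("mkvmerge not found; would run:", " ".join(cmd))
        return
    subprocess.run(cmd, check=True)
```

Calling `remux("Sparks.obu")` should leave a `Sparks.mkv` next to the input, which can then be opened in mpv or ffplay.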
Old 27th October 2019, 19:44   #1892  |  Link
Nintendo Maniac 64
Registered User
 
Join Date: Nov 2009
Location: Northeast Ohio
Posts: 432
Quote:
Originally Posted by IgorC View Post
I wonder why the dav1d developers have dedicated time to optimizing for SSE2. Isn't SSSE3 already old enough? AMD caught up and implemented SSSE3 in 2011.
AMD's non-DDR4 processors with SSSE3 were...underwhelming to say the least (the only exception being their Atom-competitor chips such as the Jaguar cores used in consoles, but it certainly wasn't their absolute performance that made them exceptions, far from it in fact).

A good number of people saw no reason to upgrade from AMD CPUs that topped out at SSE3, whether that was a Phenom II (especially the 6-core models) or one of the first-gen "Llano" laptop APUs, all the more so because upgrading either one required a different motherboard (AM3+ and FM2). Heck, when the first-gen AM3+ FX CPU reviews landed, it was even common for people to upgrade to a Phenom II X6 instead!



Quote:
Originally Posted by IgorC View Post
P.S. A few years ago I tested a 10-year-old laptop with a Pentium T4200 (SSSE3), which now sits unused. It could barely play YouTube VP9 720p videos and still dropped some frames
VP9 decoding in browsers was woeful back then. I was able to run YouTube's 1080p30 VP9 encodes on a 2.0GHz first-gen Core 2 Duo (2MB L2 cache) and their 1080p60 VP9 encodes on a 2.4GHz second-gen Core 2 Duo (3MB L2 cache) if I ran the video stream through MPC-HC/LAVfilters, but the results were terrible in the browser. This was because the browsers at the time all used libvpx while MPC-HC/LAVfilters used ffvp9 (which for quite a while actually ran just as terribly if your CPU didn't support SSSE3 and/or you were using 32-bit MPC-HC/LAVfilters; this isn't the case anymore, however).

For reference your Pentium is the same exact architecture as a second-gen Core 2 Duo but has 1MB of L2 cache, so I would expect its IPC to be similar to a first-gen 2MB L2 Core 2 Duo.
__________________
____HTPC____  | __Desktop PC__
2.93GHz Xeon x3470 (4c/8t Nehalem) | 4.6GHz Pentium G3258 (2c/2t Haswell)
Radeon HD5870  | Intel iGPU      
2x2GB+2x1GB DDR3-1333 | 4x4GB DDR3-1600       

Win7 x64

Last edited by Nintendo Maniac 64; 28th October 2019 at 00:05.
Old 28th October 2019, 03:00   #1893  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 67
Quote:
Originally Posted by IgorC View Post
I wonder why dav1d developers have dedicated time to optimize for SSE2.
SSSE3 is done. There was a comment by Steve Robertson (Youtube) at Video@Scale this year that 10% of their userbase on x86 has no SSSE3. So we're trying to explore whether this is meaningful.

1080p on SSE2 is not our goal. The goal is to have a baseline support so ~5 years (or even earlier?) from now, AV1 can be the baseline, not H.264. We don't know for sure, but this may imply some basic need for SSE2 support. So we're exploring what is possible and how much work it'd be.
Old 28th October 2019, 10:29   #1894  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,712
Quote:
Originally Posted by Beelzebubu View Post
SSSE3 is done
According to the dav1d team, the latest version, 0.5.0, is extremely fast.
It is many times faster than libaom, even when using SSSE3.
Based on the benchmarks below, what really surprises me is that, depending on content and CPU implementation, SSSE3 code running on 128-bit registers can be as fast as AVX2 code running on 256-bit registers!
How is this even possible?
I'm starting to believe that your AVX2 assembly optimizations could be optimized further.
BTW, any plans for AVX-512 in the near future?
Is there any benefit to it?

https://i.postimg.cc/7h1nkFss/dav1d-0-5-x86-s.png
__________________
Win 10 x64 (18363.476) - Core i3-9100F - nVidia 1660 (441.41)
HEVC decoding benchmarks
H.264 DXVA Benchmarks for all
Old 28th October 2019, 10:43   #1895  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 9,859
dav1d's AVX2 is fine. If you want to properly compare SSSE3 vs AVX2, then you need to look at single-threaded benchmarks. Multi-threading is often limited in scaling, where such differences can "hide".
But you should also not expect twice the performance from AVX2, since once you optimize everything possible with SSSE3/AVX2, the remaining parts that cannot be optimized so easily will impact the performance the most.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders

Last edited by nevcairiel; 28th October 2019 at 10:45.
Old 28th October 2019, 11:17   #1896  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,712
OK, so I took a look at the single-threaded performance and I see a 20% gain for AVX2 compared to SSSE3.

It is really amazing that the remaining non-optimized parts of the algorithm can account for around 80% of the performance (!)

Does that mean that all these months of writing optimized AVX2 assembly really contribute only 20%?

I would really like to hear what the dav1d team or other developers of software AV1 decoders say about that.

Do we really have an 80% non-optimizable algorithm here?

Looks like another instance of the Pareto principle to me.

https://i.postimg.cc/3rt91v4z/1-0o-W...a9-BSb3-SQ.png
Old 28th October 2019, 12:55   #1897  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 67
Quote:
Originally Posted by NikosD View Post
OK, so I took a look at the single-threaded performance and I see a 20% gain for AVX2 compared to SSSE3.
On what system (chipset)?
Old 28th October 2019, 12:57   #1898  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,712
Quote:
Originally Posted by Beelzebubu View Post
On what system (chipset)?
I took it from "you"
http://www.jbkempf.com/blog/post/201...elease-fastest
Old 28th October 2019, 13:50   #1899  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 67
Quote:
Originally Posted by NikosD View Post
A few things are going on there:
  • YMM (e.g. AVX2) functions are never exactly 2x as fast as XMM (e.g. SSSE3) functions, even under theoretical conditions;
  • YMM upper-lane use will cause CPU downclocking (but not on modern AMD CPUs, I'm told);
  • certain code in SIMD functions does not (effectively) use the YMM upper lanes, usually because the block size is too small (width=4-8), but sometimes because we don't want function-pointer-call overhead (multisymbol coding);
  • and obviously, a lot of code is not SIMD'ed at all; it's 50%-50% between SIMD and non-SIMD at best.

Together, that means the speedup is well below half of half, so 20% is not entirely unreasonable. Sucks a bit, but you can't beat reality.
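
The arithmetic behind that conclusion is just Amdahl's law, and a small sketch makes it concrete. Only the 50% SIMD share comes from the list above; the per-function gains of 1.6x and 1.5x are illustrative assumptions, not measurements from dav1d.

```python
def overall_speedup(simd_fraction: float, simd_gain: float) -> float:
    """Amdahl's law: whole-program speedup when only `simd_fraction` of the
    runtime becomes `simd_gain` times faster and the rest is unchanged."""
    if not 0.0 <= simd_fraction <= 1.0:
        raise ValueError("simd_fraction must be between 0 and 1")
    if simd_gain <= 0.0:
        raise ValueError("simd_gain must be positive")
    return 1.0 / ((1.0 - simd_fraction) + simd_fraction / simd_gain)


# 50% of the time in SIMD code (the best case quoted above), with a
# theoretical 2x gain and two assumed, more realistic per-function gains.
for gain in (2.0, 1.6, 1.5):
    speedup = overall_speedup(0.5, gain)
    print(f"SIMD code {gain:.1f}x faster -> decoder {100 * (speedup - 1):.0f}% faster")
```

Even a perfect 2x on the SIMD half tops out at about 33% overall, and a 1.5x gain on that half lands at exactly the 20% seen in the single-threaded benchmark.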
Old 28th October 2019, 14:43   #1900  |  Link
soresu
Registered User
 
Join Date: May 2005
Location: Swansea, Wales, UK
Posts: 127
Quote:
Originally Posted by NikosD View Post
BTW, any plans for AVX-512 in the near future?
Is there any benefit to it?
There was a merge request/issue some time ago for adding some support for it (specifically mentioned as Ice Lake), but I don't think any actual optimisations have been committed to the master yet, going by my git commit RSS/Atom feed anyway.

I'm more interested in the GPGPU work that happened over the summer, another mirror repo had some further Vulkan work that seemed like bugfixes or 'piping' as it were.

Is there any chance of getting some bench figures on that work soon?