Old 23rd March 2022, 18:41   #141  |  Link
lvqcl
Registered User
 
Join Date: Aug 2015
Posts: 305
8-bit video: SSE4.1 vs AVX2 vs AVX-512 (on 8C/16T Rocket Lake) - https://code.videolan.org/videolan/d..._requests/1301

Last edited by lvqcl; 23rd March 2022 at 19:46.
Old 23rd March 2022, 18:56   #142  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 111
Quote:
Originally Posted by benwaggoner View Post
Do we know how much speedup AVX512 provided? We've not seen it to be particularly useful in encoder performance, so it'd be interesting if it helps more on the decode side.
dav1d uses the icelake subset (AWS: m6i/c6i, or: F, CD, VL, DQ, BW, IFMA, VBMI, VBMI2, VPOPCNTDQ, BITALG, VNNI, VPCLMULQDQ, GFNI, VAES), not the skylake subset (AWS: m5*/c5*, or: F, CD, VL, DQ, BW). AVX512 instructions generally perform much better on Icelake than on Skylake, and the wider instruction subset also allows for certain additional code optimizations.

Extreme example of the latter: 8-bit film grain is more than 3x as fast with AVX512 compared to AVX2.
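
To make the two subsets concrete, here is a minimal sketch that classifies a set of AVX-512 extension names into the tiers listed above. The extension names are copied from this post; the function name and the idea of passing flags as strings are assumptions made for illustration, and this is not how dav1d itself does CPU detection.

```python
# Extension names are taken from the post above; everything else is hypothetical.
SKYLAKE_SUBSET = frozenset({"F", "CD", "VL", "DQ", "BW"})
ICELAKE_SUBSET = SKYLAKE_SUBSET | frozenset({
    "IFMA", "VBMI", "VBMI2", "VPOPCNTDQ", "BITALG",
    "VNNI", "VPCLMULQDQ", "GFNI", "VAES",
})


def classify_avx512(flags):
    """Return which AVX-512 tier a set of extension names satisfies."""
    flags = frozenset(flags)
    if ICELAKE_SUBSET <= flags:
        return "icelake"
    if SKYLAKE_SUBSET <= flags:
        return "skylake"
    return "none"


print(classify_avx512(["F", "CD", "VL", "DQ", "BW"]))  # prints "skylake"
```

With a check like this, a CPU that only reports the five Skylake extensions would not qualify for the icelake tier and would presumably fall back to an AVX2 code path.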

Last edited by Beelzebubu; 23rd March 2022 at 19:04.
Old 23rd March 2022, 20:05   #143  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,823
Wow, those are some very impressive speedups with AVX512! The new instructions are making at least as much of a difference as the "AVX2, but 2x wider" instructions.

Of course, Icelake CPUs don't have that much market share yet, but these kinds of speedups are quite promising in the long term for software decoding.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 24th March 2022, 12:21   #144  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 111
Quote:
Originally Posted by benwaggoner View Post
Of course, Icelake CPUs don't have that much market share yet, but these kinds of speedups are quite promising in the long term for software decoding.
... and software encoding!
Old 15th February 2023, 19:53   #145  |  Link
Spyros
Registered User
 
Join Date: Jun 2019
Posts: 18
dav1d 1.1.0 'Arctic Peregrine Falcon'

dav1d 1.1.0 was released yesterday. (Tag)

Quote:
Changes for 1.1.0 'Arctic Peregrine Falcon':
-------------------------------------------

1.1.0 is an important release of dav1d, fixing numerous bugs, and adding SIMD
  • New function dav1d_get_frame_delay to query the decoder frame delay
  • Numerous fixes for strict conformity to the specs and samples
  • NEON and AVX-512 misc fixes and improvements
  • Partial AVX2 12bpc transform implementations
  • AVX-512 high bit-depth cdef_filter, loopfilter, itx
  • NEON z1/z3 optimization for 8bpc
  • SSSE3 z1 optimization for 8bpc

"From VideoLAN with love"
Old 3rd May 2023, 07:43   #146  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,135
Changes for 1.2.0 'Arctic Peregrine Falcon':
-------------------------------------------

- Improvements on attachments of props and T.35 entries on output pictures
- NEON z1/z3 high bit-depth optimizations and improvements for 8bpc
- SSSE3 z2/z3 8bpc and SSSE3 z1/z3 high bit-depth optimizations
- refmvs.save_tmvs optimizations in SSSE3/AVX2/AVX-512
- AVX-512 optimizations for high bit-depth itx (16x64, 32x64, 64x16, 64x32, 64x64)
- AVX2 optimizations for 12bpc for 16x32, 32x16, 32x32 itx
Old 4th June 2023, 21:57   #147  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,135
Changes for 1.2.1 'Arctic Peregrine Falcon':
-------------------------------------------

- Fix a threading race on task_thread.init_done
- NEON z2 8bpc and high bit-depth optimizations
- SSSE3 z2 high bit-depth optimizations
- Fix a desynced luma/chroma planes issue with Film Grain
- Reduce memory consumption
- Improve dav1d_parse_sequence_header() speed
- OBU: Improve header parsing and fix potential overflows
- OBU: Improve ITU-T T.35 parsing speed
- Misc buildsystems, CI and headers fixes
Old 5th October 2023, 20:54   #148  |  Link
Barough
Registered User
 
Join Date: Feb 2007
Location: Sweden
Posts: 495
Changes for 1.3.0 'Tundra Peregrine Falcon (Calidus)':
------------------------------------------------------

1.3.0 is a medium release of dav1d, focusing on new APIs and memory usage reduction.

- Reduce memory usage in numerous places
- ABI break in Dav1dSequenceHeader, Dav1dFrameHeader, Dav1dContentLightLevel structures
- new API function to check the API version: dav1d_version_api()
- Rewrite of the SGR functions for ARM64 to be faster
- NEON implementation of save_tmvs for ARM32 and ARM64
- x86 palette DSP for pal_idx_finish function
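
For anyone wondering what to do with the value from dav1d_version_api(): a minimal sketch, assuming the return value is packed as 0x00XXYYZZ (major, minor, patch), which is how I recall the public header describing it. The helper name and the sample value below are hypothetical, and nothing here calls the library.

```python
def unpack_api_version(packed):
    """Split a 0x00XXYYZZ-style integer into (major, minor, patch).

    The packing layout is an assumption; check dav1d's header before relying on it.
    """
    major = (packed >> 16) & 0xFF
    minor = (packed >> 8) & 0xFF
    patch = packed & 0xFF
    return major, minor, patch


# Hypothetical sample value, not the API version of any real release.
print(unpack_api_version(0x00010203))  # prints (1, 2, 3)
```

Since this release breaks the ABI of several structures, comparing the major component at startup seems like the obvious use.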
__________________
Do NOT re-post any of my Mediafire links. Download & re-host the content(s) if you want to share it somewhere else.
Old 5th October 2023, 21:02   #149  |  Link
Barough
Registered User
 
Join Date: Feb 2007
Location: Sweden
Posts: 495
dav1d v1.3.0-3-g47107e3
Built on October 05, 2023, GCC 13.2.0

https://code.videolan.org/videolan/dav1d

DL :
dav1d v1.3.0-3-g47107e3
__________________
Do NOT re-post any of my Mediafire links. Download & re-host the content(s) if you want to share it somewhere else.

Last edited by Barough; 5th October 2023 at 21:35.
Old 14th February 2024, 17:21   #150  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,135
Changes for 1.4.0 'Road Runner':
------------------------------------------------------

1.4.0 is a medium release of dav1d, focusing on new architecture support and optimizations

- AVX-512 optimizations for z1, z2, z3 in 8bit and high-bit depth
- New architecture supported: loongarch
- Loongarch optimizations for 8bit
- New architecture supported: RISC-V
- RISC-V optimizations for itx
- Misc improvements in threading and in reducing binary size
- Fix potential integer overflow with extremely large frame sizes
Old 14th February 2024, 20:19   #151  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,823
RISC-V is interesting. It's starting to go into a lot of embedded things. License free (unlike ARM) and a very elegant architecture.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 9th March 2024, 17:41   #152  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,135
v1.4.1 'Road Runner':
--------------------------------

- Optimizations for 6tap filters for NEON (ARM)
- More RISC-V optimizations for itx (4x8, 8x4, 4x16, 16x4, 8x16, 16x8)
- Reduction of binary size on ARM64, ARM32 and RISC-V
- Fix out-of-bounds read in 8bpc SSE2/SSSE3 wiener_filter
- Msac optimizations
Old 20th April 2024, 21:22   #153  |  Link
dapperdan
Registered User
 
Join Date: Aug 2009
Posts: 202
dav1d pushed as part of a Google update going out to Android 12+

https://twitter.com/videolan/status/1781025929659392360

Apps will still use the Google developed alternative libgav1 unless they opt in though.
Old 22nd April 2024, 16:28   #154  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 111
Quote:
Originally Posted by dapperdan View Post
dav1d pushed as part of a Google update going out to Android 12+

https://twitter.com/videolan/status/1781025929659392360

Apps will still use the Google developed alternative libgav1 unless they opt in though.
See also: https://www.linkedin.com/feed/update...5577493544960/

"Apps need to opt into dav1d to benefit for now yet soon it will become the default av1 software decoder. "
Old 24th April 2024, 20:02   #155  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,823
Quote:
Originally Posted by dapperdan View Post
dav1d pushed as part of a Google update going out to Android 12+

https://twitter.com/videolan/status/1781025929659392360

Apps will still use the Google developed alternative libgav1 unless they opt in though.
The odds of Apple shipping someone else's precompiled binary in any of their OSes are very low these days, for security, portability, and optimization reasons.

They may leverage dav1d source code, but with their own tweaks and compile.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 24th April 2024, 20:47   #156  |  Link
Ritsuka
Registered User
 
Join Date: Mar 2007
Posts: 98
Apple has been shipping dav1d for years. The arm64 version is compiled with pointer authentication codes enabled.
Old 24th April 2024, 21:27   #157  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,363
Quote:
Originally Posted by benwaggoner View Post
The odds of Apple shipping someone else's precompiled binary in any of their OSes are very low these days, for security, portability, and optimization reasons.

They may leverage dav1d source code, but with their own tweaks and compile.
Why would anyone with a serious distribution ever ship someone else's binary for an open-source project, instead of just compiling it for their target? Am I missing context for this comment?

Obviously they compile their own. As does Google for Android. And Microsoft for Windows.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 21st May 2024, 21:57   #158  |  Link
LigH
German doom9/Gleitz SuMo
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,867
dav1d 1.4.1-66-g3623543 (MSYS2; MinGW32 / MinGW64: GCC 14.1.0)
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid
Old 25th May 2024, 19:31   #159  |  Link
hajj_3
Registered User
 
Join Date: Mar 2004
Posts: 1,135
Changes for 1.4.2 'Road Runner':
--------------------------------

1.4.2 is a small release of dav1d, improving notably ARM, AVX-512 and PowerPC
- AVX2 optimizations for 8-tap and new variants for 6-tap
- AVX-512 optimizations for 8-tap and new variants for 6-tap
- Improve entropy decoding on ARM64
- New ARM64 optimizations for convolutions based on DotProd extension
- New ARM64 optimizations for convolutions based on i8mm extension
- New ARM64 optimizations for subpel and prep filters for i8mm
- Misc improvements on existing ARM64 optimizations, notably for put/prep
- New PowerPC9 optimizations for loopfilter
- Support for macOS kperf API for benchmarking
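
As a rough picture of why dot-product instructions suit convolutions: each 32-bit lane accumulates four 8-bit products at once, so an 8-tap filter becomes two such steps. The sketch below models a single lane in plain Python. The function names and sample values are made up for illustration, and it does not reflect how dav1d's assembly is actually organized. As far as I know, the mixed unsigned-pixel by signed-tap form is what the i8mm extension adds on top of DotProd.

```python
def dot4_accumulate(acc, a, b):
    """Model one 32-bit lane: acc plus the sum of four 8-bit products."""
    assert len(a) == len(b) == 4
    return acc + sum(x * y for x, y in zip(a, b))


def convolve_8tap(pixels, taps):
    """Compute an unrounded 8-tap filter sum as two 4-element dot products."""
    assert len(pixels) == len(taps) == 8
    acc = dot4_accumulate(0, pixels[:4], taps[:4])
    return dot4_accumulate(acc, pixels[4:], taps[4:])


# Illustrative inputs only; the taps are not taken from the AV1 specification.
pixels = [10, 20, 30, 40, 50, 60, 70, 80]
taps = [0, 6, -20, 78, 78, -20, 6, 0]
print(convolve_8tap(pixels, taps))  # prints 5760
```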
Old 26th May 2024, 14:01   #160  |  Link
Beelzebubu
Registered User
 
Join Date: Feb 2003
Location: New York, NY (USA)
Posts: 111
The 6-tap optimizations for AVX2/AVX-512 were inspired by an earlier patch set (provided by someone from Arm) doing the same on Arm platforms. On both Arm (already included in the previous release) and x86 (in this release), this can provide a ~10% overall performance improvement on affected sequences (particularly those encoded using faster encoder presets, which is what you'd find on YouTube etc.). Pretty spectacular at this stage of dav1d's life cycle.
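
The idea behind a 6-tap path can be sketched like this: when the two outer coefficients of an 8-tap kernel are zero, filtering only the inner six samples gives the same result with less work. The coefficients below are illustrative (they sum to 128) and are not claimed to come from the AV1 spec; the helper names and the 7-bit rounding shift are assumptions, and clipping is left out.

```python
def apply_taps(pixels, taps):
    """Dot product of a pixel window with filter taps, rounded and shifted by 7 bits."""
    assert len(pixels) == len(taps)
    acc = sum(p * t for p, t in zip(pixels, taps))
    return (acc + 64) >> 7


# Illustrative kernel whose outer taps are zero.
TAPS_8 = [0, 6, -20, 78, 78, -20, 6, 0]
TAPS_6 = TAPS_8[1:7]


def filter_8tap(window8):
    """Reference path: multiply all eight samples, including the zero taps."""
    return apply_taps(window8, TAPS_8)


def filter_6tap(window8):
    """Reduced path: skip the two outer samples, whose taps are zero."""
    return apply_taps(window8[1:7], TAPS_6)


window = [10, 20, 30, 40, 50, 60, 70, 80]
print(filter_8tap(window), filter_6tap(window))  # prints 45 45
```

The two paths agree for any window as long as the outer taps really are zero, so a decoder can pick the cheaper one per filter type.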