Old 23rd March 2018, 09:45   #1  |  Link
`Orum
Registered User
 
Join Date: Sep 2005
Posts: 166
DupStep, a duplicate frame detector & decimator (x64 only; alpha)

DupStep is a duplicate frame detector/decimator, currently in an "alpha" state. It was written from scratch by me, and is not based on any other similar filter, though it can emulate some. Although I normally don't release projects this early, I would like the community's help with regard to feedback, testing, and feature suggestions, so I'm making it available now.

For those that are wondering how it compares with other, similar filters, I've put up a spreadsheet comparison that I think is somewhat accurate. Please let me know if you spot any incorrect information or if you can fill in some of the blanks (??? cells) and I'll update it appropriately.

The readme is too long to fit within this post, so instead you can read it here.

Keep in mind that I'm actively developing it, and as it's still in alpha, many things are subject to big changes. Right now my priorities for upcoming releases are:
Code:
High priority:
- SSIM based metric support
- Reduction kernel instead of atomics
- Greatly improving the displayed information in the overlay (the 'show' parameter)
- Improve the dsstats.pl script to allow parameter input and computed stat output (currently only displays intermediate stats)

Medium priority:
- Cumulative error reduction
- Adding range-limiting support (Note: This is not the same as duplicate limiting!)
- RGB support
- VFR input

Low priority:
- 32-bit floating point color space support
- Dynamic threshold support
- Adding frame type support (see 'ptp' parameter) to other source filters (e.g. LSMASH, DSS2?)
Last but not least, the relevant links:
__________________
My filters: DupStep | PointSize

Last edited by `Orum; 23rd April 2018 at 17:37.
Old 23rd March 2018, 14:25   #2  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 170
Interesting. I wonder if it can be used to make VFR from QTGMC 60p output, since TDecimate doesn't support 60p sections in mode 5. Anyway, thanks for your efforts, I will check it later.
__________________
Me on GitHub | My Telegram
Old 2nd April 2018, 21:33   #3  |  Link
`Orum
Registered User
 
Join Date: Sep 2005
Posts: 166
v0.02

New release, with numerous bug fixes. Full changelog:
Code:
v0.02
----------------
Speed: OpenCL is now used for metric calculation, allowing for multi-threading on a CPU or using
  a GPU instead
Speed: All frames read during cache generation are now cached on the OpenCL device so they are
  only requested once from AviSynth
Added: New function "DS_dumpocl()" to list OpenCL platforms and devices with index
Param: Added parameters 'oclpi' and 'ocldi'
Param: Default 'cdepth' changed from 2 to 5
Param: Maximum value of 'cdepth' changed from 10 to 250
Param: dsstats.pl now features additional parameters; run "dsstats.pl -h" for information
Fixed: SSD metrics could be incorrect (lower than they should be) for 16-bit video
Fixed: Cache depth promotion was not working in some cases (affected ifmcm modes 1 & 2)
Fixed: Frame type cache was leaking a small amount of memory
Fixed: Cache was delivering incorrect frame deltas when within a radius of 'cdepth' of the final
  video frame, sometimes reading outside of buffer memory (undefined values)
Fixed: Cache had incorrect frame types when cdepth > 1
Other: Documentation updated regarding when 'ptp' should be disabled, and detailed new concerns
  regarding device memory usage during cache generation
Other: dsstats.pl now validates cache version
Other: Minor refactoring
It's also a little bit faster than the previous release when it comes to generating metrics:


Finally, if you're willing to test, please let me know if you get the same cache file data between the previous release and the current one, keeping the following in mind:
  • Set 'cdepth' to the same value on each release to make the comparison easier. Note that the default value of this parameter has changed.
  • Comparisons should be done with the dsstats.pl from v0.02 using metric-only dumping (-m). Redirect the output from each release to a csv file (e.g. "dsstats.pl -m -s v0.01.dsd > v0.01.csv") and see if they are identical for both releases.
  • Alternatively, if you don't want to install perl, either use a source filter other than FFVideoSource() or set cdepth=1 and compare the files directly (e.g. via a hashing algorithm or a bit-for-bit comparison)
  • 16-bit comparisons between v0.01 and v0.02 aren't valid, as v0.01 would generate incorrect metric results for 16-bit video in some circumstances. All other supported depths shouldn't have the problem.
I'm particularly interested to know if the results match between the two if you generate the cache on v0.02 using an AMD GPU. I have not tested DupStep with an AMD GPU as I don't have access to one at the moment.
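
For the "compare the files directly" option, a hash comparison can be sketched in Python as below. This is only an illustration and not part of DupStep: the function names `file_digest` and `same_contents` are made up for this example, and any hashing tool would do the same job.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_contents(path_a, path_b):
    """True if both files hash to the same digest (i.e. are bit-for-bit equal)."""
    return file_digest(path_a) == file_digest(path_b)


# Hypothetical usage, with one file produced by each release:
#   same_contents("v0.01.csv", "v0.02.csv")
```

The same function works on the CSV dumps or on the cache files themselves, since it only looks at raw bytes.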

Last edited by `Orum; 2nd April 2018 at 23:35.
Old 22nd April 2018, 13:03   #4  |  Link
`Orum
Registered User
 
Join Date: Sep 2005
Posts: 166
v0.03

Version 0.03 is released, the big changes being:
  • Completely dynamic cache depth, eliminating the 'cdepth' parameter as well as the hybrid 'ifmcm' modes. Now 'ifmcm' should almost always be set to '0' (the new default) which should give the best quality while being slower than the other modes only when changing parameters (and probably only slightly so at that, as metrics are now reused whenever possible).
  • The addition of block-based metrics, which are very useful in detecting slow fades of static images. SSBD (a block metric) is the new default metric.
  • New 'pco' parameter gives control of how planes are combined when using both luma and chroma.
  • Better default value for 'Cweight' parameter.
The full changelog for this release:
Code:
v0.03
----------------
Speed: Cache depth is now dynamic and generates/caches only what is needed for the current
  parameters/video
Speed: Cache data from file is always used when available and only recalculated when necessary
  (exception: changing block size recalculates all metrics)
Speed: OpenCL kernel should now be able to take advantage of SIMD instructions for platforms
  that support them
Speed: Avoids unnecessary memory copying when executing OpenCL kernels on a CPU
Added: New function "DS_cachefp()" to cache only frame priorities, allowing 'ptp' to be used
  even when (spatio)temporal filtering is used prior to metric generation
Param: Removed the 'cdepth' parameter as cache depth is dynamic on a per-frame basis now
Param: 'ifmcm' modes 1 and 2 removed and replaced by former modes 3 & 4
Param: Default mode for 'ifmcm' set to 0 (direct)
Param: Removed "raw" metrics (RSAD/RSSD)
Param: New 'metric' settings for block-based metrics
Param: Reordered the 'metric' settings to put squared metrics at lower values than their
  absolute counterparts
Param: Default 'metric' now set to SSBD (2)
Param: Automatic (negative) 'Cweight' uses new values based on empirical data, and no longer
  considers chroma subsampling factor as a factor
Param: New parameter 'pco' to set the plane combination operation
Param: Added 'blksize' parameter to set the block size for block metrics
Param: Added new option to prefer CPU to 'oclpi'; subtract 1 from old options if < -1 to get the
  equivalent new option index for them
Param: Default for 'oclpi' is now set to prefer CPU (-2)
Param: Reordered the parameters
Fixed: Potential overflow when calculating chroma metrics has been fixed
Fixed: A potential race condition that could lead to incorrect results was resolved
Fixed: Corrected a typo in dsstats.pl help (-h)
Fixed: Multiple known issues from v0.02 resolved
Other: Metrics are now all stored as floating point values, pre-normalized to resolution where
  appropriate (previously they were post-normalized, which could cause problems when resolution
  changed between cache generation and usage)
Other: Pixel metrics (SSPD/PAD) are no longer clamped after being scaled by chroma weighting,
  permitting more aggressive chroma weighting
Other: Version bump for cache file; there is no compatibility with caches from older versions
Other: OpenCL device memory usage reduced as it only ever needs to hold the two compared frames
Other: DS_dumpocl() now displays OpenCL version supported (both for platforms and devices)
Other: Significant refactoring

Last edited by `Orum; 22nd April 2018 at 13:32.
Old 1st May 2018, 15:57   #5  |  Link
leoenc
Registered User
 
Join Date: Mar 2007
Posts: 176
I'm getting an access violation exception when running the following script:

Code:
FFvideoSource("MVS000019427.mxf").ConvertToYUV420()
DupStep()
I tried with ocldi=1 but I'm getting:
"DupStep: Requested device index out of range" even though I see it listed when running DS_dumpocl():

Code:
AVSMeter 2.4.7 (x64) - Copyright (c) 2012-2017, Groucho2004
AviSynth+ 0.1 (r2664, MT, x86_64) (0.1.0.0)


oclpi: 0 - ver: OpenCL 1.2 CUDA 9.1.84 - NVIDIA CUDA
ocldi: 0 - ver: OpenCL 1.2 CUDA - GeForce GTX 980 Ti
ocldi: 1 - ver: OpenCL 1.2 CUDA - GeForce GTX 780

oclpi: 1 - ver: OpenCL 1.2  - Intel(R) OpenCL
ocldi: 0 - ver: OpenCL 1.2 (Build 63463) - Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Let me know if you need more information.
Old 2nd May 2018, 09:35   #6  |  Link
leoenc
Registered User
 
Join Date: Mar 2007
Posts: 176
Fixed by using DupStep(oclpi=-4).
Old 2nd May 2018, 09:58   #7  |  Link
leoenc
Registered User
 
Join Date: Mar 2007
Posts: 176
@`Orum,

What is the proper way to get a list of all the duplicated frames?
After running DupStep() and getting the dupstep.dsd file I ran:
perl dsstats.pl -s dupstep.dsd >stats.txt

Not sure how to decipher the data there to know which frames were considered duplicates.
Old 4th May 2018, 21:46   #8  |  Link
`Orum
Registered User
 
Join Date: Sep 2005
Posts: 166
Quote:
Originally Posted by leoenc
...
I tried with ocldi=1 but I'm getting:
"DupStep: Requested device index out of range" even though I see it listed when running DS_dumpocl():
...
Fixed by using DupStep(oclpi=-4).
Just a note on this, you should always set oclpi to a non-negative value when setting ocldi. While I mention this in the readme, I neglected to mention that it should be non-negative. I'll have an updated readme in the next release.


Specifically, in your case, it's because your Intel CPU does not support OpenCL atomics, while your NVIDIA GPU does. So just using oclpi=-4 (or -3) and not specifying ocldi is probably the best way to handle it (unless you really want it to run on your 780 instead of your 980, in which case I recommend setting both oclpi=0 and ocldi=1). The next release will use a reduction kernel instead of atomics and should run on Intel CPUs/GPUs without issue.
Quote:
Originally Posted by leoenc
@`Orum,

What is the proper way to get a list of all the duplicated frames?
After running DupStep() and getting the dupstep.dsd file I ran:
perl dsstats.pl -s dupstep.dsd >stats.txt

Not sure how to decipher the data there to know which frames were considered duplicates.
This is a bit complicated, because there are actually several ways to do this, but I'll focus on the simpler method. Before I get into that, I recommend you redirect the output to a ".csv" file instead of ".txt", and open the CSV file in a spreadsheet application, e.g. MS Excel or LibreOffice Calc. You may also want to dump only metrics (and not the priority cache) by using -m, e.g. "dsstats.pl -ms dupstep.dsd" (note: "-s dupstep.dsd" isn't necessary here, as that's the default file if unspecified). Lastly, keep in mind that the following assumes two things: you are using the current release, v0.03, and you are using ifmcm=0 (the default in v0.03).

Now, if you have not modified DupStep's parameters since the cache file was created, it's actually relatively simple to figure out. The first column is the frame number of the "earlier" or base frame in the comparison (i.e. the frame with the lower frame number), and the second column is the "delta", the forward distance from the base frame to the frame the metric is calculated against. If you add this delta to the base frame number, you'll get the number of the frame that the base frame is compared to. For example (remember right now we're only looking at the first two columns, and for the sake of formatting for this forum I'll be using CSV format):
  • 0,1 - This means frame 0 is being compared to frame 1 (0+1=1). This will always be the first metric of any file, as DupStep always begins by comparing the first frame (frame 0) to the second frame (frame 1)
  • 37,8 - Frame 37 is being compared to frame 45 (37+8=45)
  • x,y - Abstracting this, frame x is being compared to frame (x+y)
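
The arithmetic above can be written out as a tiny Python sketch. It is only an illustration: the function name `compared_frames` is made up here, and the check that the delta is at least 1 is an assumption based on it being described as a forward distance.

```python
def compared_frames(base, delta):
    """Return (base frame, frame it is compared against) for one cache row."""
    # Assumption: delta is a forward distance, so it should be at least 1.
    if delta < 1:
        raise ValueError("delta must be at least 1")
    return base, base + delta


# The two concrete rows from the examples above.
for base, delta in [(0, 1), (37, 8)]:
    first, second = compared_frames(base, delta)
    print(f"{base},{delta} - frame {first} is compared to frame {second}")
```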
So, how does this tell you what frames are being decimated? Well, in v0.03, DupStep only generates metrics as needed, which means it will keep increasing the delta until it exceeds the threshold. For example, consider the following output (again, only looking at the first two columns) from some of the test footage I run through DupStep:
Code:
frame,delta
0,1
1,1
1,2
3,1
3,2
5,1
5,2
7,1
7,2
9,1
9,2
11,1
Here, you'll see the first metric stored in the cache is 0,1, followed by 1,1. Any time the first number (the base frame number) changes, that means that the threshold was exceeded and the frame is not a duplicate. So, in this case, it means frame 1 was not a duplicate of frame 0.

However, look at the next two deltas: 1,1 followed by 1,2. This means that when comparing frame 1 to frame 2 (1+1), frame 2 was detected as a duplicate, because it then went on to calculate metrics between frame 1 and frame 3 (1+2).

Then it jumps to frame 3 as the base frame, because it discovered that frame 3 was not a duplicate of frame 1. So, in short, if you look at all the unique numbers of the base frames, you'll find almost all the frames that are non-duplicates, or in this case: frames 0, 1, 3, 5, 7, 9 and 11. The duplicate frames are almost everything else, or 2, 4, 6, 8, and 10 in this case. (Note, however, the frames returned by DupStep may not actually be those particular frame numbers if you have ptp=true (the default), as it tries to find the "highest quality" frame from a range of duplicates.)
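
The same reading of the first two columns can be sketched in Python. This is not part of DupStep or dsstats.pl; the function name `classify` is made up, and it assumes the parameters have not changed since the cache was built and that the dump has the frame and delta as its first two columns. The final frame of the video is not classified.

```python
import csv
import io

# The frame,delta columns from the example above.
SAMPLE = """frame,delta
0,1
1,1
1,2
3,1
3,2
5,1
5,2
7,1
7,2
9,1
9,2
11,1
"""


def classify(csv_text):
    """Return (non_duplicates, duplicates) derived from frame,delta rows.

    Only the first column is needed for this method; the delta and any
    metric columns after it are ignored.
    """
    bases = set()
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines and the header (anything not starting with a number).
        if row and row[0].strip().isdigit():
            bases.add(int(row[0]))
    bases = sorted(bases)
    # Frames strictly between two consecutive base frames were stepped over,
    # i.e. they were treated as duplicates of the earlier base frame.
    dups = [f for lo, hi in zip(bases, bases[1:]) for f in range(lo + 1, hi)]
    return bases, dups


non_dups, dups = classify(SAMPLE)
print("non-duplicates:", non_dups)
print("duplicates:", dups)
```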

Why almost, and what about frame 12? Well, this method will not tell you if the final frame is a duplicate or not, but it's much simpler as you don't need to do any math regarding the metrics. It does, however, require that you have not modified DupStep's parameters, as it does not delete metrics that are no longer used when you change the parameters, for performance reasons (otherwise they'd have to be recalculated if you reverted your change to the parameters).

If you're asking, "Well, how do I get the duplicates even if I change the parameters without deleting the cache, and how do I find out if the last frame is a duplicate?" the short answer is "You can, using the numbers in the columns following the frame/delta columns, but it's complicated." While the math itself is relatively simple, it depends on parameters including your threshold, and if you're using an automatic threshold (i.e. negative) like I recommend, the calculation of the threshold itself is based on numerous other parameters.

In short, I recommend just waiting for a later release, as adding in display of the metrics within the video (the "show" parameter) is one of my high priority items. It probably won't be available next release, as I am focusing on getting SSIM metrics and a reduction kernel for that, but I am planning on them being in v0.05. Also, at some point in the future (probably with v0.05) I'll be adding in the capability to input your DupStep() parameters into dsstats.pl and have it spit out more useful information when you do.

Edit: You can also set show=1 to see which frame DupStep is returning from the footage you're giving it, and this method will work even if you change the parameters. However, if you want to get the range of duplicates, it's best to set ptp=false as well so it will always return the first frame of the range, and then by moving forward/backward one frame in its output, you can calculate which frames are duplicates.

Last edited by `Orum; 6th May 2018 at 19:32.
Tags
dedup, deduplicate, duplicate, vfr