Perceptual fingerprinting to remove TV commercials?
ronnylov
14th January 2011, 16:02
Hi!
I wonder if it would be possible to use video and audio fingerprinting technology to detect and remove commercials from TV captures? I have read about the techniques YouTube uses to detect copyrighted music and video.
I would like to use the same type of technology to detect commercial breaks and remove them automatically from my recorded TV shows.
So I searched and found pHash. Do you know of any open source software that uses pHash or something similar to detect repeated sequences of video and audio (like repeated commercials)?
It would be great to analyze TV commercials and put a database of hash codes on the internet so that HTPC users could remove unwanted commercial breaks! It could also be used to detect the wanted parts: if we record a known movie, for example, we could detect the parts that do not belong to that movie (like commercial breaks).
The idea is that commercials are repeated many times, so after a while they could be stored in a "known commercials database", either locally or publicly on the internet. Sometimes commercial breaks start with identical "sponsor messages", which could be fingerprinted and searched for to detect the start of a break. I think analyzing both video and audio would be useful.
Here is a link to pHash: http://www.phash.org/
The bad part is that I am not a very good programmer but maybe this already exists?
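To make the "known commercials database" idea concrete, here is a minimal sketch in Python. It assumes each clip has already been reduced to a 64-bit perceptual hash by some other tool; the function names, the example hash values, and the distance threshold of 8 bits are all made up for illustration and are not part of pHash.

```python
# Hypothetical sketch: none of these names or values come from pHash.

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def find_known_commercial(fingerprint, database, max_distance=8):
    """Return the label of the closest database entry, or None if nothing is close.

    database maps a label to its 64-bit fingerprint.
    max_distance is an assumed tolerance, not a tuned value.
    """
    best_label, best_dist = None, max_distance + 1
    for label, known in database.items():
        dist = hamming(fingerprint, known)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Made-up fingerprints standing in for real hashes.
known_ads = {
    "sponsor_bumper": 0xF0F0F0F0F0F0F0F0,
    "car_ad": 0x123456789ABCDEF0,
}

# The same bumper after a lossy capture: three bits flipped.
captured = 0xF0F0F0F0F0F0F0F0 ^ 0b10000000001000000001
print(find_known_commercial(captured, known_ads))
```

A linear scan like this is only reasonable for a small local database; a shared public one would probably need some kind of index, which is outside this sketch.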
LoRd_MuldeR
14th January 2011, 16:41
Well, perceptual fingerprinting can be used to find "perceptually similar" videos or segments of videos.
But, even if we had a perceptual fingerprinting algorithm that is robust enough for what you are trying to do, you would still need a database of the hashes/fingerprints of ALL commercials in existence!
How do you create such database and how do you keep it up-to-date? New commercials are "released" every day, old ones usually disappear after a very short period...
Ghitulescu
14th January 2011, 17:15
There's no use: since TV stations live off commercials, they will fight every method of counteracting them.
ronnylov
14th January 2011, 18:57
I got a response from phash support because I sent them an email asking the same thing. They said that the current method of video fingerprinting is too slow and inaccurate for this to work in practice but I could maybe use the audio fingerprinting methods for this.
Some commercials use the same background music all the time and have been repeated for years so maybe if only one commercial could be detected in each commercial break it should be a little bit easier to find the breaks. Also, by fingerprinting the audio of released movies, it could be possible to mark parts as non-commercial. OK, they could use movie soundtracks inside the commercial breaks, but I still don't think a complete break would consist of movie soundtracks.
Well, I guess it could be combined with other methods to find a working algorithm. But maybe it is not easy to do...
Maybe detecting patterns of scene changes, sound levels, logos, letterboxing could help too.
I am playing with comskip and it is surprisingly accurate but not perfect. I would like something better.
LoRd_MuldeR
14th January 2011, 21:05
I got a response from phash support because I sent them an email asking the same thing. They said that the current method of video fingerprinting is too slow and inaccurate for this to work in practice but I could maybe use the audio fingerprinting methods for this.
Provided you have an audio fingerprinting algorithm that is both fast enough and accurate enough...
Some commercials use the same background music all the time and have been repeated for years so maybe if only one commercial could be detected in each commercial break it should be a little bit easier to find the breaks.
Well, if you can identify at least two commercials within the commercial break, you could assume that everything in between (up to some threshold) is commercial too.
Still, this all depends on having a database of the fingerprints of all (or at least the majority of) the commercials in use at the time, which I think is the big problem here :rolleyes:
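The "two hits bound a break" rule could be sketched like this in Python. It assumes some fingerprint matcher has already produced the timestamps (in seconds) at which known commercials were recognised; the function name and the 120-second gap threshold are guesses for illustration only.

```python
def breaks_from_hits(hit_times, max_gap=120.0):
    """Group timestamps of recognised commercials into break intervals.

    Consecutive hits no more than max_gap seconds apart are assumed to belong
    to the same break, so everything between them is treated as commercial.
    A lone hit yields no interval, because two hits are needed to bound a break.
    The interval ends at the last hit, not at the end of that commercial.
    """
    breaks = []
    hits = sorted(hit_times)
    if not hits:
        return breaks
    start = prev = hits[0]
    for t in hits[1:]:
        if t - prev > max_gap:
            # Gap too large: close the current group if it holds at least two hits.
            if prev > start:
                breaks.append((start, prev))
            start = t
        prev = t
    if prev > start:
        breaks.append((start, prev))
    return breaks


# Three hits in a first break, two in a second, one stray hit at the end.
print(breaks_from_hits([900, 960, 1050, 2400, 2500, 4000]))
```

The stray hit at 4000 s is dropped here, which shows the weakness: a break with only one known commercial in it is missed entirely.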
Also, by fingerprinting the audio of released movies, it could be possible to mark parts as non-commercial. OK, they could use movie soundtracks inside the commercial breaks, but I still don't think a complete break would consist of movie soundtracks.
Sure, embedding a specific "watermark" into all movies that are not commercials would make detecting commercials an easy task.
Only problem: The movie studios will never do that (what would be their benefit?) and the TV stations will (for obvious reasons) never allow that :p
Ghitulescu
16th January 2011, 10:54
Is this video of yours so badly "infected" that you cannot do it by hand? Most broadcasters I've seen in Europe (SAT) do employ some 15:5 algorithm (after each 15 min of useful programme come 5 min of garbage).
LoRd_MuldeR
16th January 2011, 14:35
Is this video of yours so badly "infected" that you cannot do it by hand? Most broadcasters I've seen in Europe (SAT) do employ some 15:5 algorithm (after each 15 min of useful programme come 5 min of garbage).
He is looking for an automatic solution, like AdblockPlus for your web-browser ;)
I think even if the TV stations did use a well known scheme, like "15:5 algorithm", this probably isn't accurate enough to blindly rely on.
(For example, late at night the frequency and duration of commercial breaks are usually reduced significantly)
Ghitulescu
16th January 2011, 15:48
I think differently:
Is this show available somewhere else in better quality? If not, then I set the EPG to record it.
And since it is so important, I wouldn't let software do this for me and throw away possibly useful parts (which may fit the characteristics of a commercial, like a parody of one), and supervising its work would require more or less the same amount of time.
So, in conclusion, I'll do it by hand.
mariush
16th January 2011, 15:59
In Romania where I live, it's regulated by law that you can't have more than 7-10 minutes of commercials every hour, or something like that. But the problem is that we have something like this:
10-20 minute show - 5-10s clip ads start - 1-5 minute ads - 5-10s promo start (promotes movies and shows they're going to air in the next days) - 1-5 minute promo - 5-10s promo end - [optional : 5-10s clip ads start - 1-2 minute ads - ads end]
and then we have sudden interruptions within a show for a 10-second clip that you can only tell is an ad because of a P in a circle in the corner of the screen ... those are annoying. Oh, and during news or some talk shows, about once or twice an hour, black bars covering about an eighth of the screen appear on the left and bottom with some promotional message (just a logo and text) that stays for about 10 seconds and then fades out.
Yes, I guess you could make a database of ads with a hash for each one, or I could just track the small clips at the beginning and end of ad sequences and promo messages, but it wouldn't be 100% reliable and there are just too many of them. I think we're not there yet; manual is still better and safer.
Lighto
16th January 2011, 16:07
In Romania where I live, it's regulated by law that you can't have more than 7-10 minutes of commercials every hour, or something like that. But the problem is that we have something like this:
10-20 minute show - 5-10s clip ads start - 1-5 minute ads - 5-10s promo start (promotes movies and shows they're going to air in the next days) - 1-5 minute promo - 5-10s promo end - [optional : 5-10s clip ads start - 1-2 minute ads - ads end]
and then we have sudden interruptions within a show for a 10-second clip that you can only tell is an ad because of a P in a circle in the corner of the screen ... those are annoying. Oh, and during news or some talk shows, about once or twice an hour, black bars covering about an eighth of the screen appear on the left and bottom with some promotional message (just a logo and text) that stays for about 10 seconds and then fades out.
Yes, I guess you could make a database of ads with a hash for each one, or I could just track the small clips at the beginning and end of ad sequences and promo messages, but it wouldn't be 100% reliable and there are just too many of them. I think we're not there yet; manual is still better and safer.
Where I come from, a 1-hour TV show consists of 20-30 min of advertisements in total. :(
Ghitulescu
16th January 2011, 16:16
Yes, I guess you could make a database of ads with a hash for each one, or I could just track the small clips at the beginning and end of ad sequences and promo messages, but it wouldn't be 100% reliable and there are just too many of them. I think we're not there yet; manual is still better and safer.
It's no good at all. Also because, unlike official sources, a TV recording does not begin at the same time, and even if it's DVB (which starts at a GOP boundary) it still lacks any kind of sync. If it was captured from analog, it's even worse: the GOPs are placed differently by the encoder, and some commercials would start in the middle of a GOP, which makes automatic cutting even more problematic than it already is.
The best solution, which works only for DVB sources, is to use ProjectX to cut the show into segments each time a new audio format is detected. It doesn't work with shows that blend the movie with the commercials.
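The idea of cutting wherever the audio format changes can be sketched as follows. This is not how ProjectX is implemented; it is only a Python illustration that assumes you have already obtained a time-ordered list of (timestamp, channel count) samples from some demuxer. The function name is hypothetical.

```python
def split_on_audio_change(samples):
    """Split a recording into runs of constant audio channel count.

    samples: time-ordered list of (timestamp_seconds, channel_count).
    Returns a list of (start, end, channels), where end is the timestamp
    of the last sample seen in that run.
    """
    segments = []
    for ts, channels in samples:
        if segments and segments[-1][2] == channels:
            # Same format as the current run: extend it.
            start, _, _ = segments[-1]
            segments[-1] = (start, ts, channels)
        else:
            # Format changed (or first sample): start a new run.
            segments.append((ts, ts, channels))
    return segments


# Assumed example: 5.1 show, a 2.0 break, then 5.1 again.
samples = [(0, 6), (600, 6), (900, 2), (1100, 2), (1200, 6), (1800, 6)]
for segment in split_on_audio_change(samples):
    print(segment)
```

Which runs are show and which are break still has to be decided afterwards, for example by assuming the 5.1 runs are the programme.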
Taurus
16th January 2011, 17:55
Where I come from, a 1-hour TV show consists of 20-30 min of advertisements in total. :(
Where are you living? In hell?
Sorry, off topic, couldn't resist.. :p
ronnylov
18th January 2011, 12:18
Well, I have played a little bit more with comskip and for at least one channel it is working perfectly. This channel always removes the channel logo during commercial breaks, which is a good indicator of what to keep and what to throw away. But it is difficult to understand all the settings and tune them for other channels. I have a feeling that the logo detection algorithm could be better (or maybe I am using the wrong settings).
comskip: http://www.kaashoek.com/comskip/
I also found the MSU TV commercial detector: http://compression.ru/video/tv_commercial_detector/index_en.html
But this one I have not tested yet.
Here in Sweden we have quite nice and strict commercial cuts without blending of the movie and things like that. If a lot of people use this kind of software to remove commercials, I guess the TV stations will start these ugly tricks here too... But there are rules for commercial breaks, so I guess they can't do whatever they like.
Here is some info on European advertising rules:
http://ec.europa.eu/avpolicy/reg/tvwf/advertising/shop/index_en.htm
EDIT: Found more links:
http://en.wikipedia.org/wiki/Television_advertisement
http://www.mythtv.org/wiki/Commercial_detection_in_the_UK
Ghitulescu
18th January 2011, 14:38
I think even if the TV stations did use a well known scheme, like "15:5 algorithm", this probably isn't accurate enough to blindly rely on.
I noticed this algorithm on series on DVD, too. Normally they use only cuts rather than transitions (blends), except at the points where the broadcaster should/may insert a commercial break, which are marked by a transition (fade to black).
LoRd_MuldeR
18th January 2011, 15:27
Well, they can insert a commercial break at those "blends", but they don't necessarily have to (and in television broadcasts you often see that they don't, at least in Germany).
Also they may insert breaks at arbitrary positions, even when there's no "blend" prepared there. So you can't rely on that.
But the bigger problem is: Even if we assume that there are parts of "fixed" (known ahead) duration between the breaks, we can't know how long the commercial break itself will be.
As the commercial break is composed of various short clips/trailers that are changed/recomposed every day, it will never be exactly 5 minutes (or whatever fixed duration)...
Ghitulescu
18th January 2011, 15:57
The trivial fact that most if not all VHS recorders have a maximal pause duration of 5 minutes was a compromise between tape wear and commercial break duration. I remember that in the mid-90s these commercial breaks, including the "lead in" and "lead out", started to run just a few seconds longer, just enough for the mechanics to disengage and thus for the first seconds to be lost (it takes some time to load and unload the tape). Just for the fun of spoiling the recording .....
clacker
18th January 2011, 20:47
and then we have sudden interruptions within a show for a 10-second clip that you can only tell is an ad because of a P in a circle in the corner of the screen ... those are annoying.
Can you:
make a copy of the stream
crop it down to the area where the P appears
set up a mask in the shape of the P
create a blank clip
overlay the cropped stream over the blank using the mask
and then use something like histogram to analyse it for a match?
Or is the P only for that certain type of advertisement?
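The crop-and-mask steps above can be sketched in plain Python rather than as an actual AviSynth script. It assumes a decoded frame is available as a 2D list of luma values, that the mark is brighter than its surroundings, and that a contrast of 40 luma levels is enough to count as a match. All names and thresholds are invented for illustration.

```python
def logo_present(frame, mask, x0, y0, min_contrast=40):
    """Check whether a bright logo shaped like mask sits at (x0, y0).

    frame: 2D list of luma values (0-255), indexed as frame[y][x].
    mask:  2D list of 0/1; 1 marks pixels that belong to the logo shape.
    The test compares the mean luma inside the shape with the mean luma of
    the remaining pixels in the cropped area, so a uniformly bright or
    uniformly dark picture does not count as a match.
    """
    inside, outside = [], []
    for dy, row in enumerate(mask):
        for dx, in_shape in enumerate(row):
            value = frame[y0 + dy][x0 + dx]
            (inside if in_shape else outside).append(value)
    if not inside or not outside:
        return False
    contrast = sum(inside) / len(inside) - sum(outside) / len(outside)
    return contrast >= min_contrast


# Tiny made-up example: a plus-shaped mark drawn onto a dark 5x4 frame.
mask = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]
frame = [[30] * 5 for _ in range(4)]
for dy, row in enumerate(mask):
    for dx, in_shape in enumerate(row):
        if in_shape:
            frame[1 + dy][1 + dx] = 220

print(logo_present(frame, mask, 1, 1))
```

A real mark is usually semi-transparent and sits on moving video, so averaging the result over several consecutive frames would probably be needed before trusting it.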
Ghitulescu
18th January 2011, 20:55
Why would one do this when there are AFAIK two logo removers?
clacker
18th January 2011, 21:28
Not to remove the logo, but to use the logo (the P in a circle) to determine if this is a commercial. Then once you know strip that section of video out.
Ghitulescu
19th January 2011, 18:37
Not to remove the logo, but to use the logo (the P in a circle) to determine if this is a commercial. Then once you know strip that section of video out.
Those with the P are not commercials in the sense we use. The P stands for Product identification or Product presentation (rough translation), it means that during the show, the actors would also present some products to the public. So it's part of the show.
ronnylov
20th January 2011, 10:01
Perhaps a logo remover could be used for more accurate logo detection? I saw something in the avisynth logotools about "nologo detection" saying that it is more accurate to apply a logo remover and detect where it fails than to try to detect a logo before removing it. A good frame-accurate "nologo detector" would be a useful tool for me.
Another thing is to detect letterboxing. If you record a letterboxed movie, the commercials may not be letterboxed.
Subtitles: If the show has subtitles, that could be an indicator that this part belongs to the show. The problem is that they don't talk all the time, so it can still be the show when there are no subtitles.
Audio: Switching from 2.0 AC-3 to 5.1 AC-3 could be an indicator of the beginning and end of the show.
Aspect ratio flag: Switching between 4:3 and 16:9 can be an indicator for cutting points, but nowadays not much content is 4:3 anymore.
Detecting scene changes to find cutting points.
I read somewhere that re-encoding the video to Xvid or maybe x264 with a long distance between keyframes and then analyzing the automatically inserted keyframes could be used to detect changes in keyframe patterns (I think this would be like an estimate of scene changes). The idea is that commercials have more scene changes than the show. But then an action scene may be mistaken for a commercial...
Motion detection: Parts with absolutely no motion may be sponsor signs, but the actual show may also contain still pictures, so this is probably not a very accurate method.
Detect closing credits. The end of a show (especially a movie) may have vertically scrolling text with the names of the actors. Cut at the end of the credits.
Any other ideas?
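As a rough illustration of the scene-change idea from the list above, here is a small Python sketch. It assumes the timestamps of scene cuts (or encoder-inserted keyframes) have already been extracted by some other tool; the function name, the 60-second window, and the threshold of 20 cuts per window are arbitrary guesses.

```python
def busy_windows(cut_times, duration, window=60.0, min_cuts=20):
    """Return (start, end) windows that contain at least min_cuts scene cuts.

    cut_times: timestamps in seconds of detected scene changes.
    duration:  total length of the recording in seconds.
    Windows are fixed and non-overlapping; the last one may be shorter.
    """
    flagged = []
    start = 0.0
    while start < duration:
        end = min(start + window, duration)
        cuts = sum(1 for t in cut_times if start <= t < end)
        if cuts >= min_cuts:
            flagged.append((start, end))
        start += window
    return flagged


# Made-up recording: a cut every 10 s, except every 2 s between 120 s and 180 s.
cut_times = (list(range(0, 120, 10))
             + list(range(120, 180, 2))
             + list(range(180, 240, 10)))
print(busy_windows(cut_times, duration=240))
```

As noted above, a fast action scene would be flagged just like a commercial, so this cue would have to be combined with others such as logo or letterbox detection.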
Ghitulescu
20th January 2011, 11:10
It's a lot of work to automatise something that isn't so hard to do manually.
Besides, depending on how the show was recorded, some of these features might be unavailable (e.g. when using a DVD recorder, all audio gets downmixed to AC-3 2.0 and all frames will have the exact DAR of the first recorded frame). So the solution is personalized, and it would require a similar amount of work to customize it for the next technical advance. Which are coming more and more often, so it seems...
As I said, if the show is important, why rely on unreliable algorithms? Just to find out later that several seconds of commercials are still in and parts of the show are irretrievably lost?
ronnylov
3rd February 2011, 14:53
I have been googling a bit and found that this has already been tested as a method of detecting commercials in TV broadcasts:
collecting fingerprints of known commercials and storing them in a database.
Here is an example:
http://www.ic.uff.br/iwssip2010/Proceedings/nav/papers/paper_126.pdf
Reading different papers, it seems that spatial block-based fingerprinting algorithms are fast enough to use on video.
Is there an easy and fast method to extract the DC components of the blocks in MPEG-2 and H.264 videos?
Or is it easier to decode the video and recalculate them?
There seems to be a lot of math and statistical analysis included in these methods, so it might be a little bit difficult for me.
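For the "decode and recalculate" route, a minimal Python sketch of a block-based fingerprint could look like this. The DC coefficient of a block's DCT is proportional to the block's mean value, so averaging blocks of a decoded luma frame gives the same kind of information without touching the bitstream. This is a generic illustration, not the method of the linked paper; the 8x8 grid and the median threshold are assumptions, and the frame is assumed to be a 2D list at least 8 pixels in each dimension.

```python
from statistics import median


def block_means(frame, grid=8):
    """Split a luma frame into grid x grid blocks and return each block's mean.

    frame: 2D list of luma values, indexed as frame[y][x].
    Means are returned row by row, left to right.
    """
    height, width = len(frame), len(frame[0])
    means = []
    for by in range(grid):
        y0, y1 = by * height // grid, (by + 1) * height // grid
        for bx in range(grid):
            x0, x1 = bx * width // grid, (bx + 1) * width // grid
            total = sum(sum(row[x0:x1]) for row in frame[y0:y1])
            means.append(total / ((y1 - y0) * (x1 - x0)))
    return means


def fingerprint(frame, grid=8):
    """Pack one bit per block: 1 if the block mean is above the median mean."""
    means = block_means(frame, grid)
    threshold = median(means)
    bits = 0
    for m in means:
        bits = (bits << 1) | (1 if m > threshold else 0)
    return bits


# Made-up 16x16 frame: dark left half, bright right half.
frame = [[50] * 8 + [200] * 8 for _ in range(16)]
# The same picture with a uniform brightness shift.
brighter = [[v + 20 for v in row] for row in frame]

print(hex(fingerprint(frame)))
print(fingerprint(frame) == fingerprint(brighter))
```

Because the bits depend only on how each block compares with the median, a uniform brightness change leaves the fingerprint unchanged, which is the kind of robustness needed for matching captures of the same commercial.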