View Full Version : Could be Digital Natural Motion implemented in XVID?


Lobuz
25th May 2003, 18:09
Philips is going to bring its "Digital Natural Motion" TV system to the PC for DVD and DivX playback. It should generate intermediate images so that the monitor's full refresh rate is used with the original 24-25 frames. News (www.whatvideotv.com/cgi-bin/displaynews.php?id=3949)
I thought that it could be possible to do something like that in the xvid decoder, so motion would be more fluent. Imagine 100fps. Even if it wouldn't have more data, the picture should be nicer.

Any comments, thoughts?

Regards
Lobuz

OUTPinged_
25th May 2003, 19:47
1. a lot of cpu time required
2. a lot of effort required
3. useless for FILM and anime, useless for cam/interlaced tv captures.
4. no one would notice the effect
5. false motion may produce visual artifacts

Let big and wealthy companies like Philips spend their money and R&D resources as they would like.

athos
25th May 2003, 20:53
My Sony TV has something like this, and compared to other TVs in the TV store it does look better; the same goes for comparing with this feature on/off at home. I suggested this for ffdshow, and a very simple form is implemented (interpolating two frames to produce an intermediate one). This is too simple to give any good effect, but I think it might be possible by using the motion vector information in the video stream. I have also suggested an algorithm for this (see this (http://forum.doom9.org/showthread.php?s=&threadid=36361) thread), but I am not familiar enough with mpeg-4 and directshow programming to implement it myself.

I still don't think we should just discard this idea. When the TV does this it has to somehow calculate the motion in the video, but we already have this information (in Xvid for example), so it should require much less cpu.

mf
25th May 2003, 21:39
I have tested dynapel slowmotion (yay, I got it to work in graphedit) on anime sources. A problem I ran into there, and also when I was manually interpolating frames in an experiment (manually means setting the motion vectors by hand), is that you can't interpolate motion coming in from outside the frame. Also, still objects in front of motion, or subtitles, impair the process greatly. I don't think anyone will come up with a solution for these problems in the near future.

vinouz
26th May 2003, 17:17
And the fact is that the motion-compensated macroblock, after being copied from its previous location in the last frame, then has a delta image applied to it. And even if you could halve the MV, you couldn't get the exact delta information corresponding to it.
The only way I could see would be taking the last MB and the new MB, blending them, and putting the result halfway from source to destination. And you'd have to complete the remaining image (as there will be 'holes' in the picture, since the MVs are usually not all parallel). And there would be problems with partially overlapping blocks. And I'm pretty sure it would look shitty.
Another way I thought of before would be to move the region surrounding the centre of each MB of the new frame halfway back (with a kind of 'pinching' effect: each pixel is moved by the linear interpolation of the MVs of the MBs surrounding it, and this would be done as the inverse function, i.e. each dest pixel maps back to a source pixel), move the region surrounding the centre of each MB of the last frame halfway forward, then blend the two results.
Yeah. Quite time consuming.
[edit : ah ! and what about this moving wall artifact we tried to avoid. I think it would be back, and stronger than ever ! ;) ]

Or a quick dirty little sse/mmx pblend, or, to be even softer and 4 times quicker, .... an interlaced frame between them. Yo !

In the end, I rather like this solution. Milan ?

sysKin
26th May 2003, 17:38
IMHO if a TV can do that, it's doable ;)

Although I admit I have no idea how :)

vinouz
27th May 2003, 13:11
Yes, and I think the TV does the simple blending.
(as it has no information about motion vectors)

But the problem is that it takes memory bandwidth. And that's simply where it takes too much time.
That's why copying every other line from the next frame at the halfway point (without a new buffer; maybe dirty, but it needs half the memory) could be a nice option to smooth the display a bit. (I'm not talking about smoothing the image, as you'll have understood. Period.)
[edit : one blend being on the odd lines, the second on the even, to give a little more of a TV look and feel]
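
The line-copying idea above can be sketched as follows. This is only an illustration under stated assumptions: frames are plain lists of rows, and `weave_in_place` and its `parity` argument are hypothetical names, not anything from ffdshow or xvid.

```python
def weave_in_place(display, next_frame, parity):
    """Overwrite every other row of `display` with rows from `next_frame`.

    parity=1 replaces the odd rows, parity=0 the even rows. No extra frame
    buffer is allocated; the displayed frame is modified directly.
    """
    if len(display) != len(next_frame):
        raise ValueError("frames must have the same height")
    for y in range(parity, len(display), 2):
        display[y] = list(next_frame[y])


# Halfway between two frames: odd rows already show the next frame.
shown = [[0, 0], [0, 0], [0, 0], [0, 0]]
upcoming = [[9, 9], [9, 9], [9, 9], [9, 9]]
weave_in_place(shown, upcoming, parity=1)
half_step = [row[:] for row in shown]
# At the next frame time the even rows follow, completing the new frame.
weave_in_place(shown, upcoming, parity=0)
```

Only half of the rows are touched per step, which is where the saving in memory traffic would come from.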

Defiler
27th May 2003, 14:27
ffdshow already does this, on the Deinterlacing tab. Heh.

vinouz
27th May 2003, 19:44
I'm not talking about deinterlacing but about interlacing. :)

athos
27th May 2003, 20:07
What Defiler is referring to is the "frame rate doubler" in the Deinterlace tab, which is the simple algorithm I suggested in the thread I linked to above. This method just interpolates the pixels in the intermediate frames using the corresponding pixels from the frame before and the frame after. It does not use motion vectors. This method is very "cheap" cpu-wise, but does not give very good results. I think to get any good results you need to take motion into the equation. I also suggested such an algorithm, partly based on the papers linked to in the thread mentioned above, but it is not specified in enough detail to implement, and I do not have enough knowledge to do so, at least not right now.
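
A minimal sketch of the plain blending described above, assuming 8-bit grayscale frames stored as lists of rows. `blend_frames` is a hypothetical name; this is not ffdshow's actual (MMX) code.

```python
def blend_frames(frame_a, frame_c):
    """Average two equally sized frames pixel by pixel (rounding up on .5)."""
    return [
        [(a + c + 1) // 2 for a, c in zip(row_a, row_c)]
        for row_a, row_c in zip(frame_a, frame_c)
    ]


# A bright dot moves from x=0 to x=4 between the two frames.
frame_a = [[200, 0, 0, 0, 0]]
frame_c = [[0, 0, 0, 0, 200]]
middle = blend_frames(frame_a, frame_c)
```

Here `middle` holds two half-bright dots at x=0 and x=4 rather than one dot at x=2, which is why blending without motion information gives ghosting instead of smoother motion.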

Defiler
27th May 2003, 20:31
It's a bit of a misnomer. Check out the "framerate doubler" option. Heh.
Edit: Beaten to the punch by the man himself.

Lobuz
27th May 2003, 23:10
There is small description of Philips's DNM: link1 (http://www.research.philips.com/InformationCenter/Global/FPressRelease.asp?lArticleId=2699&lNodeId) link2 (http://www.research.philips.com/InformationCenter/Global/FArticleDetail.asp?lArticleId=1994&lNodeId=938&channel=938&channelId=N938A1994) doc (http://www.research.philips.com/Assets/Downloadablefile/passw7_16-893.pdf)

It looks like the main problem is motion estimation. And it's possible even on raw video. In the xvid decoder we have the MVs, so it should be much easier to do.

The frame doubler in ffdshow just blends frames, and the result is the opposite. Maybe Nic could implement it in the next version of his great xvid decoder? :D

Regards
Lobuz

MfA
28th May 2003, 15:01
You could use encoded MVs in the motion estimation for motion compensated frame interpolation on display, and I assume this is what Philips will do, but that only makes the search easier computationally ... it will still require a lot of new code, a different ME routine which searches for natural motion and not for low bitrates for instance.

The encoded MVs will be correlated with motion, but they will hardly be a universally good match.

athos
28th May 2003, 20:29
I don't think it has to be that complicated. Motion estimation is already done in encoding, I am not talking about raw video here.

To calculate the color of pixel B(x+(mx/2), y+(my/2)):
interpolate the pixels A(x,y) and C(x+mx, y+my),
where
    A is one original frame,
    C is the next original frame,
    B is the generated intermediate frame between A and C,
    x is the horizontal position of a given pixel in A,
    y is the vertical position of a given pixel in A,
    mx is the horizontal motion of this given pixel,
        i.e. how many pixels to the right or left it will move for frame C
    my is the vertical motion of this given pixel,
        i.e. how many pixels up or down it will move for frame C

so the position of a pixel in B, will be half the way from A to where it will be in C, according to the MVs. The color of it will be interpolated (perhaps averaged) from the color of the pixel in A and the corresponding pixel in C.

Some adjustments have to be made for pixels that fall out of the frame (just drop), pixels that do not appear until C (copy from C), pixels that only appear in A (copy from A).
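
The scheme above could be sketched roughly like this. Several things are assumptions made only for the illustration: one motion vector per pixel (a real stream has one per block), half-vectors truncated toward zero, unfilled pixels in B treated as "not visible until C", and, when two pixels land on the same spot in B, the one with the larger motion wins. `interpolate_midframe` is a hypothetical name.

```python
def interpolate_midframe(frame_a, frame_c, vectors):
    """Build frame B halfway between A and C from per-pixel vectors (mx, my)."""
    height, width = len(frame_a), len(frame_a[0])
    frame_b = [[None] * width for _ in range(height)]
    # Motion magnitude of whoever wrote each B pixel (assumed collision rule).
    strength = [[-1] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            mx, my = vectors[y][x]
            bx, by = x + int(mx / 2), y + int(my / 2)
            if not (0 <= bx < width and 0 <= by < height):
                continue  # falls out of the frame: just drop
            cx, cy = x + mx, y + my
            if 0 <= cx < width and 0 <= cy < height:
                color = (frame_a[y][x] + frame_c[cy][cx] + 1) // 2
            else:
                color = frame_a[y][x]  # only appears in A: copy from A
            magnitude = abs(mx) + abs(my)
            if magnitude >= strength[by][bx]:
                frame_b[by][bx] = color
                strength[by][bx] = magnitude

    for y in range(height):
        for x in range(width):
            if frame_b[y][x] is None:
                frame_b[y][x] = frame_c[y][x]  # hole: copy from C
    return frame_b


# A bright dot moves 4 pixels to the right; everything else is static.
frame_a = [[200, 0, 0, 0, 0]]
frame_c = [[0, 0, 0, 0, 200]]
vectors = [[(4, 0), (0, 0), (0, 0), (0, 0), (0, 0)]]
frame_b = interpolate_midframe(frame_a, frame_c, vectors)
```

In this toy case the dot lands at x=2 in B at full brightness. The pixel at x=4 still comes out half-bright, because the static background pixel there is averaged with the dot that covers it in C, so occlusion needs more care than the three adjustments alone provide.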

I hope that I have managed to communicate what I'm thinking here.
If so, am I thinking right?

MfA
29th May 2003, 03:22
Motion vectors encoded in the bitstream do not necessarily correspond to true motion, and that can be a problem. A trivial example is the MVs for skipped blocks, which are assumed to be (0,0). I assume you can see the problem of using the (0,0) vector for MBs near the edge of evenly colored moving surfaces, which tend to provoke skipping, for motion compensated interpolation?

In general the motion vectors are chosen not to represent motion, but to minimize the total rate. Sometimes it is cheaper to encode an error in the DFD (Displaced Frame Difference) than via an accurate motion vector ...
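
A toy illustration of that point, assuming a 1-D "frame", a sum-of-absolute-differences (SAD) distortion measure, and a made-up bit cost for vectors. `vector_bits`, `pick_vector` and the weighting `lam` are hypothetical and not taken from xvid or any other encoder.

```python
def sad(a, b):
    """Sum of absolute differences between two pixel runs."""
    return sum(abs(p - q) for p, q in zip(a, b))


def vector_bits(d):
    """Hypothetical toy rate model: longer vectors cost more bits."""
    return 1 + 2 * abs(d)


def pick_vector(cur, ref, start, size, search_range, lam):
    """Pick the displacement minimizing SAD + lam * bits for one block."""
    block = cur[start:start + size]
    best = None
    for d in range(-search_range, search_range + 1):
        lo = start + d
        if lo < 0 or lo + size > len(ref):
            continue
        cost = sad(block, ref[lo:lo + size]) + lam * vector_bits(d)
        if best is None or cost < best[0]:
            best = (cost, d)
    return best[1]


# An evenly colored surface moves 2 pixels to the right (true vector: -2).
ref = [0, 0, 50, 50, 50, 50, 50, 50, 0, 0, 0, 0]
cur = [0, 0, 0, 0, 50, 50, 50, 50, 50, 50, 0, 0]
inner = pick_vector(cur, ref, start=5, size=3, search_range=3, lam=4)
edge = pick_vector(cur, ref, start=7, size=3, search_range=3, lam=4)
```

For the block inside the flat surface every nearby displacement predicts equally well, so the cheapest vector, 0, wins even though the surface moved. Only the block at the edge gets the true displacement of -2.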

Not having read up on existing implementations, I would go for a warping mesh motion compensation method. It is easier to deal with parts of the image which have semi-static components (black bars, subtitles, logos) that way than with block motion estimation: you simply insert some extra vertices and force their motion to 0. It will do its best to work around it without introducing too obvious artifacts. With block based motion models this is harder; the inherent asymmetry (between how block based motion models deal with the previous and the subsequent frame) also just feels wrong. Warping motion compensation doesn't allow any occlusion or gaps, which seems appropriate here.

I'd warp both the previous and subsequent frame using the mesh and 50:50 interpolated vertices, and use a weighted average of the warped pixels from both, with the weights determined by the surface of the triangles in the relevant frame (this is so that at moving edges the image which has more information, due to occlusion in the real motion, has a greater contribution).
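
The weighting part of this idea might look like the sketch below. It covers only the triangle-area weights and the 50:50 vertex positions, not the warping itself; all function names are hypothetical, and the equal split for two zero-area triangles is an added assumption.

```python
def triangle_area(p0, p1, p2):
    """Area of a triangle given three (x, y) vertices."""
    cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])
    return abs(cross) / 2.0


def midpoint_vertices(tri_prev, tri_next):
    """50:50 interpolated vertex positions for the in-between frame."""
    return [((a[0] + b[0]) / 2, (a[1] + b[1]) / 2) for a, b in zip(tri_prev, tri_next)]


def blend_weights(tri_prev, tri_next):
    """Weight each frame by how much surface the triangle covers in it."""
    area_prev = triangle_area(*tri_prev)
    area_next = triangle_area(*tri_next)
    total = area_prev + area_next
    if total == 0:
        return 0.5, 0.5  # assumption: degenerate in both frames, split evenly
    return area_prev / total, area_next / total


def blend_pixel(pix_prev, pix_next, tri_prev, tri_next):
    """Weighted average of the two warped pixel values for one triangle."""
    w_prev, w_next = blend_weights(tri_prev, tri_next)
    return w_prev * pix_prev + w_next * pix_next


# The triangle shrinks in the next frame, e.g. because it is being occluded.
tri_prev = [(0, 0), (4, 0), (0, 4)]
tri_next = [(0, 0), (2, 0), (0, 4)]
mid = midpoint_vertices(tri_prev, tri_next)
value = blend_pixel(90, 30, tri_prev, tri_next)
```

The previous frame, where the triangle is twice as large, contributes two thirds of the result, so `value` sits closer to 90 than a plain average would.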

Lobuz
1st September 2003, 19:36
As posted on the News page, there is a demo of Philips's Trimension software for Digital Natural Motion on PC. Although the effect in the demo is stunning, I'm not sure if it's the real DNM effect or just a slick presentation. There's no way to check it on a real video file.
It looks promising, but it's a technology that no one will give away for free.
So if it proves to be effective, a free OSS implementation would be appreciated by the whole community. The first step in ffdshow is done, but it's far from perfect.

Regards
Lobuz

MfA
1st September 2003, 21:31
Wow, that is pretty damn striking, it should work that well on pans in general ... they are relatively easy to motion compensate.

mf
1st September 2003, 22:20
The funny thing is, I tried FrameDbl on the test clip (trusting a linear pan would be simple enough to motion compensate), and I got terrible results! Pity that we can't tell them we have an open source alternative that works as well on pans :(.

athos
2nd September 2003, 09:47
The latest ffdshow builds have motion compensated frame rate doubling. Works quite well IMO, although some flickering is visible around edges.

mf
2nd September 2003, 11:23
Originally posted by athos
The latest ffdshow builds have motion compensated frame rate doubling. Works quite well IMO, although some flickering is visible around edges.
Does it work on the test samples (http://www.trimensiontech.com/index.php?page=downloads)?

duartix
2nd September 2003, 12:20
Does it work on the test samples?
Strange, it looks the same on or off.

MfA
2nd September 2003, 20:36
Is the "motion compensation" the same as in tomsmocomp? If so the range of motion is probably too great.

Lobuz
2nd September 2003, 21:00
I'm not sure if it works in the ffdshow from 23-05; maybe try the version from April or a newer Athos compile.

Regards
Lobuz

superdump
2nd September 2003, 21:01
Originally posted by mf
Does it work on the test samples (http://www.trimensiontech.com/index.php?page=downloads)?

I tried it briefly and in most cases it made the panning worse.

redeemer-dk
3rd September 2003, 22:04
100fps wouldn't help. The human eye can only perceive about 80fps at optimal light conditions. In theatres etc. it's about 20-30 fps, which is the reason the motion appears fluent.

athos
3rd September 2003, 23:33
Originally posted by redeemer-dk
100fps wouldn't help. The human eye can only perceive about 80fps at optimal light conditions. In theatres etc. it's about 20-30 fps, which is the reason the motion appears fluent.
Movies in theatres are 24 fps. But if you compare using motion compensation on a 100 Hz display to not using it, you too will notice the difference.

PowerMacG4
4th September 2003, 02:51
Originally posted by redeemer-dk
100fps wouldn't help. The human eye can only perceive about 80fps at optimal light conditions. In theatres etc. it's about 20-30 fps, which is the reason the motion appears fluent.

I can tell the difference between FILM and NTSC. ;-)

Joe Fenton
4th September 2003, 05:54
I have a paper on this from IEEE Transactions on Consumer Electronics. The problem with linear interpolation is that you can get erroneous displacements due to effects like moving picture elements which suddenly disappear. The thrust of the paper was to move from linear interpolation to n-frame (they used 4) polynomial approximation in windows. They would take a maximum size window and calculate the minimal average displacement given n frames. They would then recursively shrink the window until a minimum was achieved. In areas where the window would not achieve a minimum displacement (possibly due to in-scene motion or vanishing elements), they would simply copy the original content. The overall improvement compared to simple linear interpolation was pretty good.
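
The paper's actual method is not reproduced here. As a loose illustration of why more than two frames can help, the sketch below fits a polynomial through the position of one feature in 4 consecutive frames and compares its in-between estimate with the two-frame linear one. The windowed search is left out entirely, and `lagrange_eval` and the sample positions are made up for the example.

```python
def lagrange_eval(times, values, t):
    """Evaluate the polynomial through (times[i], values[i]) at time t."""
    total = 0.0
    for i, (ti, vi) in enumerate(zip(times, values)):
        term = vi
        for j, tj in enumerate(times):
            if j != i:
                term *= (t - tj) / (ti - tj)
        total += term
    return total


# Horizontal position of an accelerating object in 4 consecutive frames.
times = [0, 1, 2, 3]
positions = [0, 1, 4, 9]

# In-between frame at t = 1.5, estimated two ways.
linear_guess = (positions[1] + positions[2]) / 2
poly_guess = lagrange_eval(times, positions, 1.5)
```

The object really follows x = t*t, so its true position at t = 1.5 is 2.25. The 4-frame polynomial recovers that, while the 2-frame linear estimate gives 2.5.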

Of course this was all in hardware with no budget on memory.

athos
4th September 2003, 11:08
This sounds very interesting. It would probably not be possible to do this in real time for decoding using current generation CPUs, but for encoding it might work. The current algorithms that are implemented in ffdshow are very efficient, but the results are not perfect. The simplest frame rate doubling algorithm simply interpolates two frames to create a third. This is very cheap, as milan has used some mmx-optimized assembly code for the interpolation.

mf
4th September 2003, 12:14
Originally posted by redeemer-dk
100fps wouldn't help. The human eye can only perceive about 80fps at optimal light conditions. In theatres etc. it's about 20-30 fps, which is the reason the motion appears fluent.
It's just about overkill. "The human eye" is a general thing. It might vary between persons and the best thing you can do is overdo it slightly so you're absolutely SURE it will be 100% fluid. Besides, high framerates generate a "natural motion blur" effect which looks much more real than calculated motion blur (3D rendering) or camera motion blur (from the shutter time). Ever played Unreal Tournament at 189Hz (the max my monitor would do) ? I have. It looks nice :D. (Needed to set resolution at 320x240 though for the rendering fps to be as high as the monitor refresh rate)
Besides, 25 * 4 = 100. Try to make "about 80fps" from 25 (or 24). And then there's NTSC land, where 30 * 4 = 120. It's just easier to interpolate *2 = 50 *2 = 100, than being fussy just because the human eye can only perceive "about 80".

MfA
4th September 2003, 16:52
Originally posted by athos
Probably it would not be possible to do this in real-time for decoding using current generation cpu's

Philips proposes to do just that.

Joe, it works well enough for Philips ... they have been using it, and motion compensated deinterlacing, for half a decade in consumer products.

fyo
4th September 2003, 22:25
100fps wouldn't help. The human eye can only perceive about 80fps at optimal light conditions. In theatres etc. it's about 20-30 fps, which is the reason the motion appears fluent.

This is completely bogus, with no basis in science. It's very easy to see the difference between 80fps and, say, twice that.

The subject is fairly complicated, but the notion that "the eye can only perceive xxx fps" (where xxx is on the order of 100fps) is nothing more than an urban legend.

You might have a hard time finding a display device capable of showing more than that (refresh rate has to be high enough and decay time of e.g. the phosphor in a CRT has to be short enough), but that's about it. Even with a standard CRT monitor (which has a non-zero decay time for its phosphor), 100fps is not completely fluent. You can see this in your ability to detect that the monitor is still flickering slightly. Imagine how horrible the flicker would be if the light emission as a function of time was a true delta function!

The problem of finding the maximum "perceivable" frame rate is not trivial, but you might want to start with the typical transmission times from light incidence on the retina to reception in the visual cortex. Even here all is not trivial. The different types of receptors in the eye (primarily rods and cones) have different properties. Specifically, some receptors react very rapidly, with a very short transmission time, but a very low level of information content (basically "no colors"). Roughly, you could say that the receptors in the peripheral regions of the retina are capable of seeing much higher frame rates, but are more or less color blind.

This is also why it is easier to see screen flickering using your peripheral vision, rather than staring directly at your monitor.

Sincerely,

fyo

trbarry
6th September 2003, 16:16
The funny thing is, I tried FrameDbl on the test clip (trusting a linear pan would be simple enough to motion compensate), and I got terrible results!

If you mean my filter, that was just a first experiment and the motion compensation is really pretty stupid. It only catches motion of up to one pixel / frame in any direction. And even with faster machines the algorithm used would start to cause artifacts if I tried to extend that.
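
To show what a one pixel / frame limit means in practice, here is a generic block search restricted to displacements of -1, 0 or +1 in each direction. It is not the FrameDbl or TomsMoComp code; `search_one_pixel`, `make_frame` and `shift_right` are hypothetical helpers made up for the example.

```python
def block_sad(cur, ref, x, y, dx, dy, size):
    """SAD between a block in cur and the displaced block in ref."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[y + j][x + i] - ref[y + j + dy][x + i + dx])
    return total


def search_one_pixel(cur, ref, x, y, size):
    """Best (cost, dx, dy) with dx and dy limited to -1, 0 or +1."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if x + dx < 0 or x + dx + size > width:
                continue
            if y + dy < 0 or y + dy + size > height:
                continue
            cost = block_sad(cur, ref, x, y, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def make_frame(width, height):
    """Synthetic test pattern in which every small shift looks different."""
    return [[x * x + 10 * y for x in range(width)] for y in range(height)]


def shift_right(frame, n):
    """Simulate a pan: move the content n pixels right, padding with 0."""
    return [[0] * n + row[:len(row) - n] for row in frame]


ref = make_frame(10, 6)
slow_pan = search_one_pixel(shift_right(ref, 1), ref, x=4, y=2, size=2)
fast_pan = search_one_pixel(shift_right(ref, 3), ref, x=4, y=2, size=2)
```

With a 1-pixel pan the search finds an exact match (cost 0) at dx=-1. With a 3-pixel pan the true displacement is outside the search window, so the best candidate has a non-zero cost and points the wrong way, which is the kind of false motion that shows up as artifacts.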

But I do believe that motion compensated frame doubling in the display will be very necessary in the future. It's too bad we have many decades of legacy movies shot at 24 FPS, but that's a fact.

Most of those movies were shot with slower shutter speeds and other attempts to cause more motion blur that hides jerkiness. But with brighter crisp HDTV type displays the motion may still not be very smooth looking and increasingly we have the processing power to do something about that and display nicer at higher frame rates.

This will be especially needed as folks start using 24/30 progressive video cameras like the nifty new JVC HD cams. They tend to amplify the problems somewhat since our current interlaced schemes also tend to blend motion a bit and hide the problem.

I'm still just in the very beginning design stages of a truly workable version of FrameDbl based upon early TomsMoComp code. Don't know if I'll ever get there.

- Tom

MfA
6th September 2003, 17:11
Personally I think feature based motion estimation (like here (http://iss.bu.edu/jkonrad/Publications/local/cpapers/Kard03ivcp.pdf)) would work well for True Motion estimation/interpolation (the mesh based interpolation could additionally assign weight according to the size of triangles in each frame, so the frame which is more likely to have good info will contribute more).

Given the patent thing and Philips trying to license it for software right now I think it would be better to stay as far as possible from the algorithms Philips uses (http://www.ics.ele.tue.nl/~dehaan/publications.html).