Posted by Seedmanc (Registered User, Russia), 23rd March 2016, 19:56
Transposing a 2D videostream into a 3D object (frame count as a depth)

I once had a strange idea: what if you could represent an average video stream, which has width, height and duration, as a box with those same dimensions, its z-axis running along the video's length? Such a "box" would contain all of the video's frames. Imagine that when you watch a video, you are looking at this box from its front side, moving deeper into it as you progress along the time axis.


With me so far?

Now, what if we rotated the box 90 degrees around its width or height axis? The video's length would become its new height or width respectively, and vice versa. You would be looking at a collection of every frame's side pixel columns, and playing the video would move you along the original frames' pixel rows (sorry if my wording is confusing).
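Outside Avisynth, the same rotation can be sketched in a few lines of NumPy (just an illustration of the idea, not part of the actual script): treat the video as a (time, height, width) array, and rotating the box around its height axis becomes a single axis transpose.

```python
import numpy as np

# Toy "video": 8 frames, each 4 pixels tall and 6 wide (grayscale).
video = np.arange(8 * 4 * 6).reshape(8, 4, 6)   # axes: (time, height, width)

# Rotate the box 90 degrees around its height axis:
# time becomes width and width becomes time.
rotated = video.transpose(2, 1, 0)              # axes: (width, height, time)

# Output frame i is the i-th pixel column of every input frame.
print(rotated.shape)   # (6, 4, 8): 6 frames, height 4, width 8
```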


To implement this I wrote the following script (Avisynth+):

Code:
AviSource("C:\source.avi")
converttorgb32

original = last
# Blank canvas: one output frame per input pixel column, one pixel column per input frame.
last.blankclip(length = original.width, width = original.framecount, height = original.height)

scriptclip("""
    # Each output frame gathers one pixel column from every input frame.
    for (i = 0, original.framecount - 1) {
        selected = original.freezeframe(0, original.framecount - 1, i)
        layer(selected.crop(original.width - 1 - current_frame, 0, 1, 0), x = i)
    }
""")
(this one transposes video length to width).

You can see what the video looks like after this transformation here: https://www.youtube.com/watch?v=Ut00lonRqVk (left to right corresponds to beginning to end of the original).
Here's also the same method used to transpose length to height: https://www.youtube.com/watch?v=-K5b2BKI0ew
Compare to the original video: https://www.youtube.com/watch?v=-YKuxZgDamM (mine was trimmed to speed up processing).

The catch, though, is that despite the simplicity of the script it runs extremely slowly (originally at 0.2 fps for a 672x384 video, later sped up to ~1.5-3 fps by replacing Overlay with Layer and switching from Avisynth to AVS+). Speed is also inversely proportional to the original video's length, because to construct one frame of the output the script has to access every frame of the input.
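To put a rough number on that scaling (a back-of-the-envelope model, not measured data): each of the width-many output frames touches every input frame once, so the total number of frame fetches grows with the product of the two.

```python
# Rough cost model for the ScriptClip approach: building each of the
# `width` output frames reads every one of the `input_frames` source frames.
def frame_accesses(width, input_frames):
    return width * input_frames

# For a 672-pixel-wide clip like the one above, a 1000-frame source means
# 672 output frames x 1000 reads each:
print(frame_accesses(672, 1000))   # 672000
```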

Interestingly, this conversion is reversible: applying the transformation again brings back the original video (reversed and mirrored, but those operations are lossless as well).
This opens up possibilities such as applying a spatial-only filter in the temporal domain and vice versa. I already tried upsampling the FPS with a neural-network upscaler, as well as upscaling with MFlowFPS. The results weren't all that impressive, although in the first case it was curious to notice parts of the frame moving independently of each other.
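The round trip can be checked with the same kind of NumPy sketch (the axis names and the mirroring direction here are my reconstruction of the script's behaviour, not taken from it):

```python
import numpy as np

rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(10, 4, 6))    # (time, height, width)

def length_to_width(v):
    # Swap the time and width axes; the flip models the script reading
    # pixel columns right to left (width - 1 - current_frame).
    return v.transpose(2, 1, 0)[::-1]

twice = length_to_width(length_to_width(video))

# Applying the transform twice returns the original, played backwards
# and mirrored horizontally; undoing those two flips recovers it exactly.
assert (twice == video[::-1, :, ::-1]).all()
assert (twice[::-1, :, ::-1] == video).all()
```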

Another point worth making is how swapping a temporal dimension for a spatial one lets you see the whole timeline of the video at once, even if only for one column of pixels at a time. To remove this limitation we need to add a dimension instead of replacing one with another - to go 3D.

I took the already width-to-length-transposed video and processed it with another script, overlaying each frame with a diagonal shift to achieve a 3D look.
Code:
AviSource("D:\width-transposed.avi").converttorgb32.trim(0, 639)

a = last
# Canvas large enough for the diagonal offsets; output length matches the source.
a.blankclip(a.framecount, a.width + 640, a.height + 640/2, "RGB32", color = $888888)
scriptclip("""
    for (i = 0, 639) {
        # Shift frame i diagonally and fade it the further back it sits.
        frozen = a.freezeframe(0, a.framecount - 1, i).crop(current_frame, 0, 0, 0)
        layer(frozen, x = i, y = i / 2, level = round(256.0 * (640 - i) / 640))
    }
""")
Now the video actually looks like a 3D box, with the original playing on its left side, while you can see all the frames at once: https://www.youtube.com/watch?v=2vUihdaMbwk
Well, you could, if it weren't for the fact that you can't see through an object, so I added some transparency to its front edge.
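As a rough model of what this second script does (a NumPy sketch with toy sizes; the real script uses 640-frame offsets and Avisynth's layer for blending), each frame is pasted onto a grey canvas with a diagonal offset and a fade toward the back:

```python
import numpy as np

T, H, W = 16, 8, 12                        # toy sizes instead of 640 frames
frames = np.linspace(0.0, 1.0, T * H * W).reshape(T, H, W)

# Grey canvas enlarged to fit the diagonal offsets (like the $888888 blankclip).
canvas = np.full((H + T // 2, W + T), 0.5)

for i in range(T):
    level = (T - i) / T                    # front frames most opaque, back fades
    x, y = i, i // 2                       # diagonal shift, as in the script
    region = canvas[y:y + H, x:x + W]
    canvas[y:y + H, x:x + W] = level * frames[i] + (1 - level) * region

print(canvas.shape)   # (16, 28)
```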

Why am I writing all this? To share my experience and the discoveries made thanks to Avisynth, and to give my take on the concept of observing N-dimensional objects from an (N+1)-dimensional space. Remember the 4D tesseract in Interstellar, showing 3D space at all points in time at once? Here we see all instances of a 2D video by going 3D.

Since this is more of a "share thoughts" topic than a "get help" one, I'm not asking for anything, but perhaps you can share some cool unconventional things you've done with video in Avisynth too, or give ideas on how to speed up such transformations.

Last edited by Seedmanc; 23rd March 2016 at 20:01.