New Video Codec: BergWave
bergi
11th July 2004, 20:53
Some time ago I released some binaries of my wavelet-based video codec called 'BergWave'. Now you can also get the source code under the terms of the GPL. Perhaps some developers have ideas for the future of the codec. I'm happy to add your code or implement your ideas myself.
You can find the source code and a VfW binary at my homepage: http://www.bergos.org/projects/bergwave/
At the moment I guess this would be mostly a replacement for MJPEG? How fast is it?
Sirber
11th July 2004, 22:16
What are the specs?
bergi
16th July 2004, 12:03
At the moment I guess this would be mostly a replacement for MJPEG? How fast is it?
At the moment I think it's too slow, but with an MMX- or SSE-based wavelet transform it would be fast enough for realtime encoding.
I have created a readme file based on the XviD readme file. It also contains some information about the algorithms and future plans. I will include this file in the next release of my codec. The next release will come soon, but there is a small bug left (I think it's a memory leak) that crashes all VfW applications :mad: .
1) Introduction:
----------------
BergWave is a wavelet based video de-/encoding solution.
The BergWave package currently consists of three parts:
- libbw: the main video de-/encoding library
- test: a simple example program
- vfw: video for windows interface
2) Licensing:
-------------
The BergWave package is licensed under the terms of the GPL.
3) Info:
--------
BergWave works in the YV12 colorspace. The color conversion functions
were taken from the XviD project, an open source MPEG-4 video
codec (www.xvid.org). At the moment BergWave doesn't support delta
frames, but that's on the todo list. To compress the YV12 image
BergWave uses a 5/3 wavelet transform with run-length encoding. The
values are coded with Rice coding. Every frame has n objects
(wavelet image, motion vectors). The end of a frame is marked with a
null object.
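For readers who have not met Rice coding before, here is a minimal sketch of the idea in C. It is not taken from the BergWave sources: the `BitIO` buffer, the signed-to-unsigned folding and the parameter `k` are assumptions made for illustration, and the readme does not say how BergWave chooses them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit buffer; the real BergWave bitstream layout is not
 * described in the readme. buf must be zero-initialised and large enough. */
typedef struct {
    uint8_t *buf;
    size_t   pos;   /* position in bits */
} BitIO;

static void put_bit(BitIO *b, int bit)
{
    if (bit)
        b->buf[b->pos >> 3] |= (uint8_t)(0x80u >> (b->pos & 7));
    b->pos++;
}

static int get_bit(BitIO *b)
{
    int bit = (b->buf[b->pos >> 3] >> (7 - (b->pos & 7))) & 1;
    b->pos++;
    return bit;
}

/* Fold signed values onto unsigned ones: 0,-1,1,-2,2 -> 0,1,2,3,4 */
static uint32_t fold(int32_t v)
{
    return v >= 0 ? (uint32_t)v << 1
                  : (((uint32_t)(-(v + 1))) << 1) | 1u;
}

static int32_t unfold(uint32_t u)
{
    return (u & 1u) ? -(int32_t)(u >> 1) - 1 : (int32_t)(u >> 1);
}

/* Rice code with parameter k: quotient in unary (ones, then a zero),
 * remainder in k plain bits. */
static void rice_put(BitIO *b, int32_t v, unsigned k)
{
    uint32_t u = fold(v);
    uint32_t q = u >> k;
    while (q--)
        put_bit(b, 1);
    put_bit(b, 0);
    for (unsigned i = k; i-- > 0; )
        put_bit(b, (int)((u >> i) & 1u));
}

static int32_t rice_get(BitIO *b, unsigned k)
{
    uint32_t q = 0, r = 0;
    while (get_bit(b))
        q++;
    for (unsigned i = 0; i < k; i++)
        r = (r << 1) | (uint32_t)get_bit(b);
    return unfold((q << k) | r);
}
```

Small values get short codes and large values get long ones, which suits wavelet coefficients that cluster around zero. A real coder would pick `k` per subband or adapt it while coding.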
Example:
FRAME_HEADER
  - type = FRAME_TYPE_I
  + OBJECT_HEADER
    - type = OBJECT_TYPE_WAVELET
    - flags = OBJECT_FLAG_WAVELET_RLE
  + OBJECT_HEADER
    - type = OBJECT_TYPE_NULL
    - flags = OBJECT_FLAG_NULL
(not supported at the moment)
FRAME_HEADER
  - type = FRAME_TYPE_P
  + OBJECT_HEADER
    - type = OBJECT_TYPE_MVECTORS
    - flags = OBJECT_FLAG_MVECTORS_BLOCK
  + OBJECT_HEADER
    - type = OBJECT_TYPE_HADAMARD
    - flags = OBJECT_FLAG_NULL
  + OBJECT_HEADER
    - type = OBJECT_TYPE_NULL
    - flags = OBJECT_FLAG_NULL
4) Todo:
--------
- delta frames
- motion vectors (block and perhaps mesh based)
- perhaps other transforms (integer dct, hadamard, ...)
- mmx and sse based wavelet transform
- bitrate control
- better vfw interface
Specs look OK!
Though I would kill for native 4:2:2 support and DirectShow support.
Can it be supported in container formats other than AVI, because of file size and interlacing support?
And will you have a lossless mode?
teb
708145
16th July 2004, 18:09
Originally posted by bergi
At the moment I think it's too slow, but with an MMX- or SSE-based wavelet transform it would be fast enough for realtime encoding.
4) Todo:
--------
- delta frames
- motion vectors (block and perhaps mesh based)
- perhaps other transforms (integer dct, hadamard, ...)
- mmx and sse based wavelet transform
- bitrate control
- better vfw interface
Nice codec. I have a few questions, though:
o Is quant 1 lossless? 5/3 is lossless in principle, so it would make a good but slow HuffYUV replacement.
o Is it normal that I get visually annoying artifacts at ~10 Mbit already?
Other remarks:
o The codec only crashed for me on completion of the invoking application.
o You could use the XviD ME and bitrate control.
o AFAIK mesh-based ME is (far) superior to block-based ME for wavelets.
I intend to do some experiments on 3D-DCT and I think I just found the framework to begin with. :)
Until better,
T0B1A5
bergi
17th July 2004, 21:03
Though I would kill for native 4:2:2 support and DirectShow support.
Can it be supported in container formats other than AVI, because of file size and interlacing support?
And will you have a lossless mode?
I think DirectShow support is no problem, but first I want a working version :) . The codec has an easy API, so it shouldn't be a problem to add a new interface: just write another layer like the VfW one. If you want to encode into other container formats, use the DirectShow interface when it's ready.
I don't like it, but I will add interlacing support, because people who use lossless mode (yes, there is a lossless mode) will need it.
o Is quant 1 lossless? 5/3 is lossless in principle, so it would make a good but slow HuffYUV replacement.
o Is it normal that I get visually annoying artifacts at ~10 Mbit already?
Other remarks:
o The codec only crashed for me on completion of the invoking application.
o You could use the XviD ME and bitrate control.
o AFAIK mesh-based ME is (far) superior to block-based ME for wavelets.
I intend to do some experiments on 3D-DCT and I think I just found the framework to begin with.
It should be lossless if you set quant=1 and deadz=0. In my test I found a small difference, but that was with an RGB source. Next week I will add a dump-data function.
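To illustrate why a 5/3 transform can be lossless "in principle", here is a minimal 1D sketch of the reversible integer lifting steps, in the form known from JPEG 2000's reversible 5/3 filter. Whether BergWave's transform matches this exactly is an assumption. The function names and the restriction to even lengths are only for illustration.

```c
/* Floor division for a positive divisor (C's '/' truncates toward zero). */
static int floor_div(int a, int b)
{
    int q = a / b;
    if (a % b < 0)
        q--;
    return q;
}

/* Forward 5/3 lifting on x[0..n), n even and >= 2.
 * lo and hi each receive n/2 values. Borders use symmetric extension. */
static void fwd53(const int *x, int n, int *lo, int *hi)
{
    int half = n / 2;
    /* predict: high band = odd sample minus average of even neighbours */
    for (int i = 0; i < half; i++) {
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        hi[i] = x[2 * i + 1] - floor_div(x[2 * i] + right, 2);
    }
    /* update: low band = even sample plus a quarter of neighbouring highs */
    for (int i = 0; i < half; i++) {
        int prev = (i > 0) ? hi[i - 1] : hi[0];
        lo[i] = x[2 * i] + floor_div(prev + hi[i] + 2, 4);
    }
}

/* Inverse: undo the update step first, then the predict step. */
static void inv53(const int *lo, const int *hi, int n, int *x)
{
    int half = n / 2;
    for (int i = 0; i < half; i++) {
        int prev = (i > 0) ? hi[i - 1] : hi[0];
        x[2 * i] = lo[i] - floor_div(prev + hi[i] + 2, 4);
    }
    for (int i = 0; i < half; i++) {
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        x[2 * i + 1] = hi[i] + floor_div(x[2 * i] + right, 2);
    }
}
```

Each lifting step only adds an integer that the decoder can recompute from data it already has, so the inverse is exact as long as nothing is quantized in between. A 2D transform would apply this along rows and then columns.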
10 Mbit? What source do you use? With my 640x3?? test sources I never reached 10 Mbit. Perhaps that's also a bug :mad: .
XviD is a good source for ideas and source code :) .
First I will add block-based ME; I don't know any algorithms for mesh-based ME. Does anyone have good PDFs about mesh-based ME?
With a 3D transform you have the old B-frames problem: frame in, frame out doesn't work :mad: .
708145
17th July 2004, 22:26
Originally posted by bergi
It should be lossless if you set quant=1 and deadz=0. In my test I found a small difference, but that was with an RGB source. Next week I will add a dump-data function.
10 Mbit? What source do you use? With my 640x3?? test sources I never reached 10 Mbit. Perhaps that's also a bug :mad: .
XviD is a good source for ideas and source code :) .
First I will add block-based ME; I don't know any algorithms for mesh-based ME. Does anyone have good PDFs about mesh-based ME?
With a 3D transform you have the old B-frames problem: frame in, frame out doesn't work :mad: .
I'm looking forward to the lossless mode since (quant 1) really outperforms (in terms of size) HuffYUV here.
I deleted the initial file, but reproduced this behaviour with a quant 5 encode at approx. 8 Mbit @720x576 (no crop).
There are some flickering "blocks" which are greatly exaggerated compared to the source.
Source is 27MB and result 5MB. If you are interested, I'll upload them somewhere.
The encoded version is here: http://www.ra.informatik.uni-stuttgart.de/~bergmats/phenonemon_huff_wl5.avi
http://www.utdallas.edu/~aria/mcl/motion.html might contain useful info for mesh based ME.
I will need a DShow interface to make the 3D transform work well. But the initial goal is to find out if it is useful at all.
Until better,
Tobias
bergi
19th July 2004, 12:32
I deleted the initial file, but reproduced this behaviour with a quant 5 encode at approx. 8 Mbit @720x576 (no crop).
There are some flickering "blocks" which are greatly exaggerated compared to the source.
Source is 27MB and result 5MB. If you are interested, I'll upload them somewhere.
This is a "feature" of my bad quant selection. The quant selection currently works like this:
    for (i = 0; i < levels; i++)
        quant = (quant + 1) / 2;   /* halve per level, rounding up */
I think I have to add a quant matrix to the API and GUI. At the moment the matrix would look like this:
Level  HL  HH  LH
1      16  16  16
2       8   8   8
3       4   4   4
4       2   2   2
A better matrix for your example could look like this:
Level  HL  HH  LH
1      16  16  16
2       6   6   6
3       2   2   2
4       1   1   1
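A rough sketch of how such a per-level, per-subband quant matrix could be filled and applied. Everything here is hypothetical (the names `fill_quant_halving`, `quantize`, `dequantize` and the band order are not from the BergWave API). It reads the current rule as "level 1 gets the base quant, each further level gets half, rounded up", which reproduces the first table for a base quant of 16.

```c
enum { BAND_HL, BAND_HH, BAND_LH, BAND_COUNT };

/* Fill q[level][band] with the halving rule from the loop above:
 * level 1 uses the base quant, every further level (quant + 1) / 2. */
static void fill_quant_halving(int base, int levels, int q[][BAND_COUNT])
{
    int quant = base;
    for (int l = 0; l < levels; l++) {
        for (int b = 0; b < BAND_COUNT; b++)
            q[l][b] = quant;
        quant = (quant + 1) / 2;
    }
}

/* Plain uniform quantizer (truncating toward zero); a dead zone
 * parameter like "deadz" is left out of this sketch. */
static int quantize(int coef, int q)
{
    return coef / q;
}

static int dequantize(int level, int q)
{
    return level * q;
}
```

The hand-tuned matrix from the second table would simply be a constant `int q[4][BAND_COUNT]` with the rows 16, 6, 2, 1 instead of a computed one, and an API or GUI could expose the individual entries.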
I will add the matrix code to the API and GUI in the next few days, but I haven't found the bug that closes applications after playback. Perhaps someone wants to help me?! :)
PatchWorKs
27th September 2005, 21:03
When the smoke did clear, many thousands
were dead. There was much blood and gore.
Their bodies lay broken and scattered
across the battlefield like brown leaves
blown by the wind. Manowar - Warriors Prayer
DeathTheSheep
28th September 2005, 00:07
Poetry? Reminds me of... me. :D
Poor codec... it had so much potential... :(
Sirber
28th September 2005, 00:47
what do you mean?
DeathTheSheep
28th September 2005, 22:49
Part 1: Poetry? Reminds me of... me. :D Remember how I used to write random poetry in that test thread? Ah, the good old days... *sigh*
Part 2: Poor codec... it had so much potential... :(
I guess I was referring to all wavelets in particular. Me looooves wavelets... We need some wavelet development-- many video compression "philosophers" have pointed to theories that wavelets can far outperform even AVC if developed to the extent that XviD is today. "FAR" outperform.
YAY. Maybe if OGG tarkin is picked up... or rududu again... or snow (which I believe came the farthest....*sigh*)
Well, maybe in the near future, eh? ;)
Sirber
29th September 2005, 18:21
My question was about part 2 :)
Most people are focused on AVC right now, but it's good to have alternatives :D
k0r0n4
30th September 2005, 10:54
I guess I was referring to all wavelets in particular. Me looooves wavelets... We need some wavelet development-- many video compression "philosophers" have pointed to theories that wavelets can far outperform even AVC if developed to the extent that XviD is today. "FAR" outperform.
YAY. Maybe if OGG tarkin is picked up... or rududu again... or snow (which I believe came the farthest....*sigh*)
Well, maybe in the near future, eh? ;)
You bring up a good point... there has never been a wavelet codec that has undergone a *major* amount of development. It sure is a shame, since I have been very impressed with snow's results on certain footage.
Anyways, if bergi continues to develop it, I'd say this has just as much potential as any other codec; it just needs lots of hard (and usually quite frustrating) work. He definitely gets my applause though for what he's made so far :)
:thanks:
Caroliano
30th September 2005, 20:06
I guess I was referring to all wavelets in particular. Me looooves wavelets... We need some wavelet development-- many video compression "philosophers" have pointed to theories that wavelets can far outperform even AVC if developed to the extent that XviD is today. "FAR" outperform.
YAY. Maybe if OGG tarkin is picked up... or rududu again... or snow (which I believe came the farthest....*sigh*)
Well, maybe in the near future, eh?
Well, the only wavelet codec being developed right now is Dirac, at least among the open source ones. It doesn't seem to have reached snow's stage yet, but it's close. There is a thread named "Dirac VFW codec" here, and it also has its own webpage and forums.
DeathTheSheep
4th October 2005, 02:20
Mmm, good to see some active progress. With a forum and a big popular following, it's certainly plausible that Dirac may develop into a particularly good codec. But if what happened to snow is anything to go by, maybe there is a certain "level" of wavelet development at which coding just becomes too complicated, like a journey through uncharted waters, not knowing what to expect or how to get there.
Still, my kudos to all developers of wavelets in this Block-solid MPEG4 dominated world.
Caroliano
4th October 2005, 02:51
The BBC is backing them. They plan to use a patent-free codec for their broadcasts in the future. On top of that, it is an open source AND well documented (something rare) project, so I think it will survive and, at least, turn into a good patent-free codec alternative.
They also have a nice page: http://dirac.sourceforge.net/index.html
bergi
7th October 2005, 20:55
Hi!
Nice to see some people still know my codec.
You want to know why the latest version was released over a year ago? OK, first, I now work in a different department at my company, and this means less spare time. Also, there is a new video codec (snow) which just needs some standardization. Because of that other codec, which also has a bigger team (I'm the only developer of BergWave!), I started some other projects. Some are just for fun and I don't expect a final version of them in the near future (~10 years ;) ), for example my OS. The development part of another project (PiMail) I have already given away. At the moment, when I have some spare time, I'm working on a DVB application for SkyStar2 TV cards, because many of the features I want to use on this card aren't supported by other DVB applications.
If there had been more feedback about my codec in the past, and if some other developers had joined the codec team (if this means a new name for it, that's OK with me ;) ), I would have gone on, but these things haven't happened.
I would really like to go on developing my video codec, because I still have so many ideas, but alone I don't think it's possible at the moment :(
PatchWorKs
8th October 2005, 12:30
OK, first, I now work in a different department at my company, and this means less spare time. [...] Because of that other codec, which also has a bigger team (I'm the only developer of BergWave!), I started some other projects.
So, why don't you release the source code under an open source license to make your codec immortal?
Rise, know the strength that you feel.
Hold in your heart but never reveal.
You were called by the Gods, their powers to wield.
Manowar - Secret of Steel
Kopernikus
8th October 2005, 13:07
Some time ago I released some binaries of my wavelet-based video codec called 'BergWave'. Now you can also get the source code under the terms of the GPL. Perhaps some developers have ideas for the future of the codec. I'm happy to add your code or implement your ideas myself.
You can find the source code and a VfW binary at my homepage: http://www.bergos.org/projects/bergwave/
He did already!
PatchWorKs
10th October 2005, 10:20
doh!
Sorry ! :goodpost:
k0r0n4
13th October 2005, 09:14
While everyone is getting excited about MPEG-4 AVC, I'm far more excited about the recent development of wavelets. I hope bergi keeps working on his (cus you can never have too many codecs ^_^), and I'd really like to see the other ones come along as well, as he mentioned (Snow and Dirac).
The real question is, why isn't there a greater focus on wavelets? I mean, they avoid the nightmarish blocking problems with XviD (though I have to admit, H.264 does a much better job than XviD/DivX when it comes to blocking). I know codecs are extremely complex, and I'd imagine wavelets are especially complex as far as codecs go. Still, they've been shown to be rather impressive in the past, even in the early stages (not that any of them are really "mature" yet).
...seems like an untapped market to me :devil:
Koepi
13th October 2005, 09:37
The only difference between a wavelet codec and a DCT-based one is the curve used for transforming the image material into the frequency domain. So you could make XviD a wavelet codec by replacing the DCT with a wavelet function. It would still be that "horrible blocky"... *evil grin*
I think there's some "overhype" taking place in some people's mind ;)
Cheers
Koepi
P.S.: Don't get me wrong. I like wavelet-based codecs as well, e.g. Edouardos 9/7 wavelet for realtime capturing!
Kopernikus
13th October 2005, 11:33
The only difference between a wavelet codec and a DCT-based one is the curve used for transforming the image material into the frequency domain. So you could make XviD a wavelet codec by replacing the DCT with a wavelet function. It would still be that "horrible blocky"... *evil grin*
With wavelets you transform the whole picture, while with DCT you transform every 8x8 block on its own. So I don't think it's so easy to turn XviD into a wavelet coder...
I think wavelets are a very promising field, and there are many possibilities, much research to do and much progress to expect.
But block-based DCT transform coding isn't dead yet, and wavelets aren't automatically better.
eMotionEstimation
13th October 2005, 12:43
With wavelets you transform the whole picture, while with DCT you transform every 8x8 block on its own. So I don't think it's so easy to turn XviD into a wavelet coder...
XviD does the DCT on 8x8 blocks. But that doesn't mean a DCT can't be done on the whole picture. You can even do a wavelet transform on an 8x8 block, too. ;)
708145
13th October 2005, 13:59
I remember that prunedtree has mentioned lapped DCT which does minimize blocking.
BTW, snow uses something similar for wavelets... the inter blocks "fade out" at the borders.
Until better,
Tobias
akupenguin
13th October 2005, 19:48
Snow uses lapped motion compensation, which is not the same thing.
bergi
14th October 2005, 09:36
I already ran some tests in the past. Here are the results:
Blocking and wavelet codecs:
This happens on keyframes if a high quant value is selected with a Haar transform (which I don't use for this reason), or with block-based motion compensation (I have not tested lapped motion compensation).
Pure wavelet codec:
Delta frames produce a lot of data in the high frequencies, which isn't one of the wavelet's strengths. An integer DCT with adaptive block size selection does a better job in this case. I must also say that I have only tried full-pel and half-pel with an algorithm which was certainly not the best.
Things I would like to test:
Mesh-based motion compensation:
Perhaps the wavelet transform already gives good information for edge detection. I'm really astonished that nobody has ever tried mesh-based motion compensation in some experimental codec.
Different bit encodings for DCT:
I use Rice for the wavelets, but perhaps (surely) there are better encodings for DCT.
Delta frames with reference to the frame before the last keyframe:
I think for dialogs with only 2 camera angles (perhaps also more) it would perform very well. This would also mean some keyframes can't be marked as keyframes for the container. This must of course be a feature which can be deactivated.
Some other things which I don't remember at the moment.
Things I will do next:
Replace my wavelet transform with maven's transform:
My transform has a bug which I tried to fix for a very long time, but I haven't found it until now. Maven's transform is very fast now; still, the quant selection can't be used for a video codec, so I still have to use my little trick.
Clean up the old delta frame code:
I have already done some tests with delta frames, which means this code already exists, but it was only for testing.
pest
26th October 2005, 03:11
@Bergi
for lossy just use the 9/7 transform
the lifting-phase can be largely optimized
don't use vlc-codes, use a bitplane coder
or a simpler but faster variant like the pacc-encoding
where you encode a significance map and after
that only the relevant coeffs and signs
this saves about 10-20%
good luck!
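The "significance map first, then only the relevant coefficients and signs" idea could be sketched like this. This is not the PACC encoding itself (its details are not given in the thread), only the general split into three streams. The function names and the choice to store magnitude minus one are assumptions for illustration.

```c
#include <stddef.h>

/* Split coefficients into three streams:
 *   sig[i] : 1 if coef[i] != 0 (one entry per coefficient)
 *   mag[k] : |coef| - 1 for each significant coefficient, in order
 *   neg[k] : 1 if that coefficient is negative
 * Returns the number of significant coefficients.
 * Assumes no coefficient equals INT_MIN. */
static size_t sig_split(const int *coef, size_t n,
                        unsigned char *sig, unsigned *mag,
                        unsigned char *neg)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        int v = coef[i];
        sig[i] = (unsigned char)(v != 0);
        if (v != 0) {
            unsigned a = v < 0 ? 0u - (unsigned)v : (unsigned)v;
            mag[k] = a - 1u;   /* the map already tells us a >= 1 */
            neg[k] = (unsigned char)(v < 0);
            k++;
        }
    }
    return k;
}

/* Rebuild the coefficients from the three streams. */
static void sig_merge(const unsigned char *sig, size_t n,
                      const unsigned *mag, const unsigned char *neg,
                      int *coef)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        if (!sig[i]) {
            coef[i] = 0;
            continue;
        }
        int a = (int)(mag[k] + 1u);
        coef[i] = neg[k] ? -a : a;
        k++;
    }
}
```

The gain then comes from coding each stream with a model that fits it: the map is mostly zeros in the high bands and compresses well with run-length or context coding, while the magnitudes no longer need to spend bits on the zero case.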
pest
26th October 2005, 14:16
for the high-freq problems on difference-frames
an adaptive wavelet structure helps a lot.
you don't expand the tree with a mallat algorithm
but every node in the tree. you then select
the children with the lowest cost, based on
entropy or something else. i've seen results
of wavelet-coders better than h.264 but they
mostly use a combination of 9/7 spatial and
5/3 temporal. hope this helps a bit.
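A toy 1D sketch of the "expand every node, keep the cheapest children" idea. The assumptions are loud ones: an integer Haar step stands in for the real filter, the sum of absolute values stands in for "entropy or something else", and the cost of signalling the split flags is ignored. All names are hypothetical.

```c
#include <stdlib.h>

/* Cost proxy: sum of absolute values. */
static long l1_cost(const int *x, int n)
{
    long c = 0;
    for (int i = 0; i < n; i++)
        c += labs((long)x[i]);
    return c;
}

/* One integer Haar split in lifting form (n even).
 * Reversible, because b = lo - hi / 2 and a = hi + b. */
static void haar_split(const int *x, int n, int *lo, int *hi)
{
    for (int i = 0; i < n / 2; i++) {
        int a = x[2 * i], b = x[2 * i + 1];
        hi[i] = a - b;
        lo[i] = b + hi[i] / 2;
    }
}

/* Lowest cost for band x[0..n) with at most `depth` further splits.
 * Unlike a Mallat tree, the high band may be split again as well.
 * flags receives one split decision per visited node in pre-order;
 * it must have room for 2^(depth+1) - 1 entries, *nflags starts at 0. */
static long best_basis(const int *x, int n, int depth,
                       unsigned char *flags, int *nflags)
{
    long keep = l1_cost(x, n);
    int idx = (*nflags)++;
    flags[idx] = 0;
    if (depth == 0 || n < 2 || (n & 1))
        return keep;

    int half = n / 2;
    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (!tmp)
        return keep;               /* out of memory: do not split */
    haar_split(x, n, tmp, tmp + half);
    long split = best_basis(tmp, half, depth - 1, flags, nflags)
               + best_basis(tmp + half, half, depth - 1, flags, nflags);
    free(tmp);

    if (split < keep) {
        flags[idx] = 1;
        return split;
    }
    *nflags = idx + 1;             /* discard the children's flags */
    return keep;
}
```

For the alternating input `{8, -8, 8, -8}` a single split does not help (cost stays 32), but allowing the high band to be split once more brings the cost down to 16. That is the kind of high-frequency content described for difference frames.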
Tommy Carrot
26th October 2005, 16:04
Pest, what is your opinion about the new approaches (bandelet, curvelet, ridgelet, etc.) to improving wavelet compression? I've read some papers about them, and although I don't understand much of their mathematical background, they seem to be a big improvement over the traditional 2D wavelet transform, especially in edge representation. What do you think, will these methods be usable in video compression, or are they limited to image compression?
Kopernikus
26th October 2005, 17:26
Do you have links to the papers? That sounds interesting
Hellworm
26th October 2005, 23:04
I have some ideas about the motion-compensation:
The problems with a mesh-based motion model are that it is very complicated to program and really very slow in use. Therefore I thought about a motion compensation that is not block-based but pixel-based (or better, "flat object" based), which would also be a starting point for a real mesh-based ME.
The idea is that one could generate a normal motion map, but with MVs for every single pixel, and then compress this map by dividing it into different objects, where every object gets 3 or 4 MVs (move, rotate, zoom, perspective; from there we can later start recognizing meshes), calculated from the MVs of the single pixels. When the objects are correctly recognized, this motion map is perhaps even smaller in size than a normal block-based one.
To calculate the motion map for every pixel, we just need a normal block-based motion estimation and then decide for every pixel which MV of the surrounding blocks it gets.
The problem I have run into so far while experimenting with this idea is that a normal ME often, mainly on noisy sources, does not find the right MVs, so the pixel motion map also gets kind of "noisy" and it would be impossible to recognize objects. But my test code was also rather simple, and I will try some better methods (than pure brute force) for the ME.
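A minimal sketch of the per-pixel step described above, assuming the block vectors already exist. The `MV` type, the function names, the 3x3 block neighbourhood and the single-pixel absolute difference as the decision criterion are all assumptions for illustration, not the poster's test code.

```c
#include <stdlib.h>

typedef struct { int x, y; } MV;

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Absolute difference between a pixel and its motion-compensated
 * counterpart in the reference frame (coordinates clamped to the frame). */
static int pix_err(const unsigned char *cur, const unsigned char *ref,
                   int w, int h, int x, int y, MV mv)
{
    int rx = clampi(x + mv.x, 0, w - 1);
    int ry = clampi(y + mv.y, 0, h - 1);
    return abs((int)cur[y * w + x] - (int)ref[ry * w + rx]);
}

/* cur, ref: w*h luma planes; w and h are assumed to be multiples of bs.
 * block_mv: (w/bs)*(h/bs) vectors from a block-based ME.
 * pixel_mv: w*h output, one vector per pixel. */
static void per_pixel_mv(const unsigned char *cur, const unsigned char *ref,
                         int w, int h, int bs,
                         const MV *block_mv, MV *pixel_mv)
{
    int bw = w / bs, bh = h / bs;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int bx = x / bs, by = y / bs;
            /* start with the pixel's own block so it wins ties */
            MV best = block_mv[by * bw + bx];
            int best_err = pix_err(cur, ref, w, h, x, y, best);
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int nx = clampi(bx + dx, 0, bw - 1);
                    int ny = clampi(by + dy, 0, bh - 1);
                    MV c = block_mv[ny * bw + nx];
                    int err = pix_err(cur, ref, w, h, x, y, c);
                    if (err < best_err) {
                        best_err = err;
                        best = c;
                    }
                }
            }
            pixel_mv[y * w + x] = best;
        }
    }
}
```

Deciding on a single pixel difference is exactly what makes the resulting map "noisy" on noisy sources. Comparing a small window around each pixel, or smoothing the map afterwards, would be the obvious next things to try.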
pest
27th October 2005, 00:10
@Tommy Carrot
the new approaches seem promising, at the cost of higher complexity
the improvements pointed out in the papers are
mostly based on psnr-based rate-distortion...
for video you have to keep things simple
and the wavelet transform itself does not really provide
a relevant improvement over 8x8 dct
the coding stage is the really important part
Tommy Carrot
27th October 2005, 02:01
It's not simply PSNR improvements; these experiments are trying to fix the biggest problem of 2D wavelets, poor edge representation. For example, here is a comparison of a curved wavelet transform against the normal WT. The bitrate and the coding stage are supposed to be identical in both cases; only the transform stage is different. Note how much sharper the edges are.
image removed
A better transform can represent the image in fewer coefficients, so the coding efficiency can be improved in this way too. I'm just not sure if these methods can be effectively used in the motion compensation part of video compression.
Kopernikus: as always, Google is your friend. ;)
k0r0n4
27th October 2005, 04:54
Tommy Carrot:
Very interesting comparison. Do you have an image of the source footage for that frame as well?
Tommy Carrot
27th October 2005, 13:33
Well, I took the pictures from this paper (http://futurevideo.epfl.ch/internal/docs/f0002.pdf). It also contains the source image, and some interesting ideas. :)
MfA
28th October 2005, 01:34
That's one hell of a nasty transform.
Leak
28th October 2005, 09:34
And it's probably not really suited for a video codec, seeing as these work best when running in realtime... ;)
Tommy Carrot
29th October 2005, 14:20
Haha, and this transform was even one of the simpler solutions for this problem; there are far "nastier" ones out there. :D Anyway, I take it that in your opinion this is not a viable way to improve video compression.
bergi
16th November 2005, 15:55
for lossy just use the 9/7 transform
the lifting-phase can be largely optimized
In my tests the 9/7 was better for very low bitrates, but for DVD backups (which is my first target) the 5/3 is better in my opinion.
don't use vlc-codes, use a bitplane coder
or a simpler but faster variant like the pacc-encoding
where you encode a significance map and after
that only the relevant coeffs and signs
this saves about 10-20%
I already had bitplane encoding, don't know why I removed it :confused:
Perhaps it was the speed, but it should perform better for lossy compression. I could include all of these encodings. As I wrote in the readme, I want something like binary XML for selecting different algorithms and encodings.
for the high-freq problems on difference-frames
an adaptive wavelet structure helps a lot.
you don't expand the tree with a mallat algorithm
but every node in the tree. you then select
the children with the lowest cost, based on
entropy or something else. i've seen results
of wavelet-coders better than h.264 but they
mostly use a combination of 9/7 spatial and
5/3 temporal. hope this helps a bit.
Do you have some papers? Sounds interesting.
pest
18th November 2005, 17:00
Just search for anything related to "wavelet packets"