22nd September 2020, 08:42 | #1 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
DeOldify and AI Colorization: my experience
Hi there,
I was looking at DeOldify https://github.com/jantic/DeOldify as an AI model for colorizing grayscale pictures. The idea was to take a video, export a sequence of .tiff frames via FFmpeg, run them through DeOldify and then put them back together via FFmpeg. Unfortunately, it seems to work only on Linux and the guide covers Ubuntu, so I had to fire up my Fedora box and pray for it to work. DeOldify is a series of models trained by several people to colorize pictures automatically.

This is what I got with the 1948 Winter Olympics (grayscale remastering on the left, AI-colorized output on the right). Of course, I have no idea whether the colors are actually historically correct, but some of them are clearly wrong, like these: in the very last example, it didn't recognize the faces, probably due to the low definition of the original picture (my heavy filtering probably didn't help either // SpotLess, MDegrain, DFTTest //), and I'm pretty sure it thought they were flowers in a field instead of humans.

All in all, I don't think it's really ready to be used for something like this, especially 'cause it's such an important event, so I'm gonna stick with grayscale. Human-made colorizations are probably the best, but I've never done one and from what I've seen it takes a lot of time (even when object-tracking tools like Mocha etc. come to help).

So... what do you think about DeOldify and AI colorization algorithms in general? Do you use them? Do you have one you prefer over the others? Let me know.

p.s. images have been scaled down to fit on Doom9

Last edited by FranceBB; 9th November 2020 at 00:42. |
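For anyone curious, the extract/colorize/reassemble pipeline described above could be scripted roughly like this. This is only a sketch: the helpers just build the FFmpeg command lines (frame naming, fps and codec are my assumptions), and the DeOldify call names are taken from its README, so treat them as assumptions too.

```python
from pathlib import Path

def extract_cmd(video: str, frames_dir: str, fps: int = 25) -> list:
    """FFmpeg command to dump a video to a numbered .tiff sequence."""
    return ["ffmpeg", "-i", video, "-vf", "fps=%d" % fps,
            str(Path(frames_dir) / "frame_%06d.tiff")]

def reassemble_cmd(frames_dir: str, output: str, fps: int = 25) -> list:
    """FFmpeg command to rebuild a video from the colorized frames."""
    return ["ffmpeg", "-framerate", str(fps),
            "-i", str(Path(frames_dir) / "frame_%06d.tiff"),
            "-c:v", "libx264", "-pix_fmt", "yuv420p", output]

# The DeOldify step in between would look roughly like this (names from
# the DeOldify README; not verified here):
#
#   from deoldify.visualize import get_image_colorizer
#   colorizer = get_image_colorizer(artistic=True)
#   for tiff in sorted(Path("frames").glob("*.tiff")):
#       colorizer.plot_transformed_image(str(tiff), render_factor=35)
```

You would run the two commands with `subprocess.run(extract_cmd(...))` and `subprocess.run(reassemble_cmd(...))` around the colorization loop.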
8th November 2020, 09:14 | #2 | Link |
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 7,277
|
Don't have any use for it myself atm., but I stumbled over https://github.com/ericsujw/InstColorization and it looks promising; in their paper they compare against DeOldify. Since it reminded me of this post, I thought I'd post it here.
Cu Selur |
8th November 2020, 15:00 | #3 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Quote:
[EDIT: i.e., KillerSpots() ~= SpotLess(RadT=1) ::: both based on the same Didee idea]. EDIT: Perhaps you use KillerSpots first to get better vectors on very noisy material, or something; if so, I suggest trying SpotLess(..., bBlur=0.6) to prefilter for better vectors and omitting the KillerSpots() step.
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? Last edited by StainlessS; 8th November 2020 at 17:03. |
|
9th November 2020, 00:41 | #4 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
Ah, nice one, Selur.
I should try to process the very same video with it and compare it to DeOldify to see the results... when I have time... if I have time... (T_T) As for SpotLess, Stainly, yeah, I meant that I was pre-filtering the images, but I wasn't using all those filters one after the other. I was using KillerSpots initially, then I substituted it with SpotLess, so don't worry, I'm not using them together xD But yeah, I should really edit that so other people don't get confused about this. |
3rd December 2020, 18:22 | #6 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
Quote:
Converting videos by processing them frame by frame like I'm doing doesn't take into account the correlation between frames and may lead to unpleasant results; still, with no Avisynth alternative available, I guess it's better than nothing... Anyway, it seems it's still way too early to let neural networks put chroma on videos for us. If I have time next week, I might show you the final result after I colorized the whole reel frame by frame (well, actually using Mocha when it did detect and follow the objects), trying to be as historically accurate as possible by continuously asking the archivist and my other colleagues who handle those things.
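One cheap mitigation for that missing frame-to-frame correlation (my own sketch, not something DeOldify does) would be to temporally smooth the chroma the per-frame colorizer produces, so its color guesses flicker less from frame to frame:

```python
import numpy as np

def smooth_chroma(chroma_frames, radius=2):
    """Sliding-window temporal average over chroma planes.

    chroma_frames: array of shape (num_frames, H, W, 2), e.g. the U/V
    planes produced by a per-frame colorizer. Luma is left untouched,
    so detail is preserved and only the color guesses are stabilized.
    """
    frames = np.asarray(chroma_frames, dtype=np.float64)
    n = len(frames)
    out = np.empty_like(frames)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out[i] = frames[lo:hi].mean(axis=0)
    return out
```

This is a blunt instrument (it will smear chroma across cuts unless you split per scene), but it illustrates why temporal-aware models beat naive frame-by-frame processing.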
|
4th December 2020, 02:21 | #7 | Link |
Registered User
Join Date: Sep 2005
Posts: 130
|
Yeah, the MyHeritage version of DeOldify is pretty good, but I'm more impressed with its Photo Enhancer (which I found out later is licensed from Remini, and it uses AI). Does anyone know of other implementations of such an "AI Photo Enhancer"?
|
10th December 2020, 17:19 | #8 | Link |
Registered User
Join Date: Oct 2001
Posts: 454
|
Hi FranceBB et al,
In my findings, such "AI stuff" depends on two things: the actual techniques used behind it (super resolution by subsampling several adjacent frames, for example) and the model being trained and then used. Video and image upscaling stuff shares a lot "under the hood" with techniques like DeOldify. When playing around with several ESRGAN models, it's obvious that the content you wish to process and the model used have to fit. The gaming community trains a lot of models for upscaling content from older games; these often have very rough 8-bit raster images or very low resolution intro videos (Indeo codec - yeah..), and with the proper model a lot can be done to "enhance" them.

Video algorithms can take into account that the same image content is often present in several frames. A slowly moving background in a movie scene is a good example where motion estimation and subsampling can easily recover "lost details" from several frames and put them back into the video (independent of any neural network model; PhotoAcute does this, and many astronomical software packages can too)...

As the models progress and the algorithms advance, more and more is possible... We discussed merging algorithms like deepfakes with video enhancement (the "AI" knows faces, finds them in the video and replaces the bad low resolution versions with higher resolution faces from a database: good for "one-actor scenes with Nicolas Cage", bad for crowd scenes). If algorithms like DeOldify learn more about how to recognize objects, people etc. in scenes, they will be able to guess more accurately: "Hey, this is a German soldier from WW2, he should have a grey-ish coat, not a green one"... But this will take quite some time, years.... So in my opinion, for now it's only "nice to have and play with"; no "AI colourization or video enhancement" is good enough yet for unattended, fully automatic, perfect results...

There are cases where the results turn out phenomenal; in others it's just garbage. So a lot of manual tweaking and attention is necessary at the moment, and some footage just isn't ready to be dealt with (and maybe never will be..). I've been playing around with stuff like that for quite some time (btw - Selur, thanks for implementing all the crazy upscale stuff....) and in recent tests I moved my focus from "old VHS footage" (sometimes just too hard to deal with.... I'll keep the original captures, look for a possibility to RF-capture VHS and revisit the topic later when I manage to train my own models..) to "older movies".

Stanley Kubrick loved to use "natural lighting" in many scenes, adding no extra light while filming; a lot of film grain in many scenes of his movies is the result. One could argue this is an artistic choice; I would think the director would have loved to have less noise if the camera and film technology had been able to deliver it... So I tossed some BD footage into some AI stuff and was happy to see that in some cases it does a wonderful job of "cleaning up", much better than any degrain algo or commercial denoisers like NeatVideo etc... Some colour grading was eliminated too, details got clearer... Overall, a very nice result. BUT: in some cases, small details in objects were misinterpreted by the AI and people suddenly had patterns on clothing where there should be none in reality.

This is a little off-topic to the "colouring" topic you started, but since many principles behind the discussed techniques are the same, and some "video enhancement/upscale" stuff deals with colours in some way too, I thought I'd mention it. A project I would love to attempt (maybe someone wants to join) is training my own model for specific use cases.... For example, I have some HD footage of DS9, which could be one training pair for the series: a specially trained DS9 model, so to say. Use this on something like VDSR... |
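The multi-frame detail/noise recovery mentioned above can be illustrated with a toy example: averaging several noisy captures of the same (already aligned) content reduces the noise roughly with the square root of the frame count. Real tools add motion estimation and alignment on top; this sketch assumes perfectly aligned frames and synthetic data.

```python
import numpy as np

rng = np.random.default_rng(0)

# A synthetic "true" frame plus several noisy captures of it, standing in
# for a static/slowly moving background seen across consecutive frames.
true_frame = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy = [true_frame + rng.normal(0, 0.1, true_frame.shape) for _ in range(16)]

def stack(frames):
    """Temporal average: the crude core of multi-frame noise reduction."""
    return np.mean(frames, axis=0)

single_err = np.abs(noisy[0] - true_frame).mean()
stacked_err = np.abs(stack(noisy) - true_frame).mean()
# With 16 frames, residual noise drops by roughly a factor of 4 (sqrt(16)).
```

This is exactly why astronomical stacking software and motion-compensated denoisers can pull detail out of footage that no single-frame filter could.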
10th December 2020, 17:23 | #9 | Link | |
Registered User
Join Date: Oct 2001
Posts: 454
|
Quote:
Most stuff is developed in the Linux world, but a lot of it can be used and accessed from Windows - or Google Colab, if you want. A commercial product would be Topaz Gigapixel. A free-to-use thingy for Windows using ESRGAN would be https://github.com/ptrsuder/IEU.Winforms. Dig a little and you will find a lot of models to "plug" into Image Enhancer (search "model zoo esrgan") for different use cases (trees, landscapes, game content of all sorts, JPEG compression reduction, faces, black & white content, maps, etc...). In fact, there is quite a lot out there to play with... |
|
11th December 2020, 11:30 | #10 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
Quote:
As to the things you explained, I guess the artifacts come from the fact that motion estimation can't really be perfect and is very source/scene dependent. So yeah, as you said, when it works well it can produce very good results, better than conventional methods, but when it doesn't, it produces undesired side effects and therefore can't be trusted blindly. |
|
11th December 2020, 12:32 | #11 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
Just so you can see the final result: this is what we've got after months of work, manually denoising, degraining, removing spots and scratches, stabilizing, and then finally using object tracking via Mocha to manually colorize the whole thing. About the colorization: when we were not sure, we left things as grey-ish as possible without making it too obvious (without people noticing it), but we're pretty confident we got many things right, like the flags. As for dresses and other things, there's been a lot of historical research, and not only that: for people like Barbara Ann Scott, we had to track her down to find the color of her hair (she was blonde). Same goes for Thomas Hedvin Byberg, Ken Bartholomew, Robert Fitzgerald and many, many others whom we had to track down. I mean, the recording is from the '40s, so...
I hope John Meyer is gonna be proud, 'cause we certainly are proud of the results: |
13th December 2020, 20:15 | #12 | Link |
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,496
|
The snow, particularly in the first shot, looks a little unnatural to me - the shadows should be a bit bluer (lit as they are by the sky), or perhaps a bit less green. But other than that minor niggle, the colourisation looks fantastic.
|