22nd September 2020, 08:42 | #1 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
DeOldify and AI Colorization: my experience
Hi there,
I was looking at DeOldify https://github.com/jantic/DeOldify as an AI model for colorizing grayscale pictures. The idea was to take a video, export a sequence of .tiff frames via FFmpeg, run them through DeOldify and then put them back together via FFmpeg. Unfortunately, it seems to work only on Linux and the guide covers Ubuntu, so I had to fire up my Fedora box and pray for it to work. DeOldify is a series of models trained by several people to colorize pictures automatically.

This is what I got with the 1948 Winter Olympics (grayscale remastering on the left, AI-colorized output on the right). Of course, I have no idea whether the colors are actually historically correct, but some of them are clearly wrong, like these: in the very last example, it didn't recognize the faces, probably due to the low definition of the original picture (my heavy filtering probably didn't help either // SpotLess, MDegrain, DFTTest //), and I'm pretty sure it thought they were flowers in a field instead of humans.

All in all, I don't think it's really ready to be used for something like this, especially 'cause it's such an important event, so I'm gonna stick with grayscale. Human-made colorizations are probably the best, but I've never done one and from what I've seen it takes a lot of time (even when object-tracking tools like Mocha etc. come to help).

So... what do you think about DeOldify and AI colorization algorithms in general? Do you use them? Do you have one you prefer over the others? Let me know.

p.s. images have been scaled down to fit on Doom9

Last edited by FranceBB; 9th November 2020 at 00:42. |
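For anyone curious, the extract/colorize/reassemble pipeline described above could be scripted roughly like this. This is only a sketch: the helpers just build the FFmpeg command lines (frame naming, fps and codec are my assumptions), and the DeOldify call names are taken from its README, so treat them as assumptions too.

```python
from pathlib import Path

def extract_cmd(video: str, frames_dir: str, fps: int = 25) -> list:
    """FFmpeg command to dump a video to a numbered .tiff sequence."""
    return ["ffmpeg", "-i", video, "-vf", "fps=%d" % fps,
            str(Path(frames_dir) / "frame_%06d.tiff")]

def reassemble_cmd(frames_dir: str, output: str, fps: int = 25) -> list:
    """FFmpeg command to rebuild a video from the colorized frames."""
    return ["ffmpeg", "-framerate", str(fps),
            "-i", str(Path(frames_dir) / "frame_%06d.tiff"),
            "-c:v", "libx264", "-pix_fmt", "yuv420p", output]

# The DeOldify step in between would look roughly like this (names from
# the DeOldify README; not verified here):
#
#   from deoldify.visualize import get_image_colorizer
#   colorizer = get_image_colorizer(artistic=True)
#   for tiff in sorted(Path("frames").glob("*.tiff")):
#       colorizer.plot_transformed_image(str(tiff), render_factor=35)
```

You would run the two commands with `subprocess.run(extract_cmd(...))` and `subprocess.run(reassemble_cmd(...))` around the colorization loop.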
8th November 2020, 09:14 | #2 | Link |
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 7,277
|
Don't have any use for it myself atm., but I stumbled over https://github.com/ericsujw/InstColorization and it looks promising; in their paper they compare against DeOldify. Since it reminded me of this post, I thought I'd post it here.
Cu Selur |
8th November 2020, 15:00 | #3 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Quote:
[EDIT: i.e., KillerSpots() ~= SpotLess(RadT=1) ::: both based on the same Didee idea]. EDIT: Perhaps you use KillerSpots first to get better vectors on very noisy material, or something; if so, I suggest trying SpotLess(..., bBlur=0.6) to prefilter for better vectors and omitting the KillerSpots() step.
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? Last edited by StainlessS; 8th November 2020 at 17:03. |
|
9th November 2020, 00:41 | #4 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
Ah, nice one, Selur.
I should try to process the very same video with it and compare it to DeOldify to see the results... when I have time... if I have time... (T_T) As for SpotLess, Stainly, yeah, I meant that I was pre-filtering the images, but I wasn't using all those filters one after the other. I was using KillerSpots initially, then I substituted it with SpotLess, so don't worry, I'm not using them together xD But yeah, I should really edit that so other people don't get confused about this. |
3rd December 2020, 18:22 | #6 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
Quote:
Converting videos by processing them frame by frame like I'm doing doesn't take into account the correlation between frames and may lead to unpleasant results; still, with no Avisynth alternative available, I guess it's better than nothing... Anyway, it seems it's still way too early to let neural networks put chroma on videos for us. If I have time next week, I might show you the final result after I colorized the whole reel frame by frame (well, actually using Mocha when it did detect and follow the objects), trying to be as historically accurate as possible by continuously asking the archivist and my other colleagues who handle those things.
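One cheap mitigation for that missing frame-to-frame correlation (my own sketch, not something DeOldify does) would be to temporally smooth the chroma the per-frame colorizer produces, so its color guesses flicker less from frame to frame:

```python
import numpy as np

def smooth_chroma(chroma_frames, radius=2):
    """Sliding-window temporal average over chroma planes.

    chroma_frames: array of shape (num_frames, H, W, 2), e.g. the U/V
    planes produced by a per-frame colorizer. Luma is left untouched,
    so detail is preserved and only the color guesses are stabilized.
    """
    frames = np.asarray(chroma_frames, dtype=np.float64)
    n = len(frames)
    out = np.empty_like(frames)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out[i] = frames[lo:hi].mean(axis=0)
    return out
```

This is a blunt instrument (it will smear chroma across cuts unless you split per scene), but it illustrates why temporal-aware models beat naive frame-by-frame processing.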
|
4th December 2020, 02:21 | #7 | Link |
Registered User
Join Date: Sep 2005
Posts: 130
|
Yeah, the MyHeritage version of DeOldify is pretty good, but I'm more impressed with its Photo Enhancer (which I found out later is licensed from Remini, and it uses AI). Does anyone know of other implementations of such an "AI Photo Enhancer"?
|
10th December 2020, 17:19 | #8 | Link |
Registered User
Join Date: Oct 2001
Posts: 454
|
Hi FranceBB et al,
In my findings, such "AI stuff" depends on two things: the actual techniques used behind it (super resolution by subsampling several adjacent frames, for example) and the model being trained and then used. Video and image upscaling stuff shares a lot "under the hood" with techniques like DeOldify. When playing around with several ESRGAN models, it's obvious that the content you wish to process and the model used have to fit. The gaming community trains a lot of models for upscaling content from older games; these often have very rough 8-bit raster images or very low resolution intro videos (Indeo codec - yeah..), and with the proper model a lot can be done to "enhance" them.

Video algorithms can take into account that the same image content is often present in several frames. A slowly moving background in a movie scene is a good example where motion estimation and subsampling can easily recover "lost details" from several frames and put them back into the video (independent of any neural network model; PhotoAcute does this, and many astronomical software packages can too)...

As the models progress and the algorithms advance, more and more is possible... We discussed merging algorithms like deepfakes with video enhancement (the "AI" knows faces, finds them in the video and replaces the bad low resolution versions with higher resolution faces from a database: good for "one-actor scenes with Nicolas Cage", bad for crowd scenes). If algorithms like DeOldify learn more about how to recognize objects, people etc. in scenes, they will be able to guess more accurately: "Hey, this is a German soldier from WW2, he should have a grey-ish coat, not a green one"... But this will take quite some time, years.... So in my opinion, for now it's only "nice to have and play with"; no "AI colourization or video enhancement" is good enough yet for unattended, fully automatic, perfect results...

There are cases where the results turn out phenomenal; in others it's just garbage. So a lot of manual tweaking and attention is necessary at the moment, and some footage just isn't ready to be dealt with (and maybe never will be..). I've been playing around with stuff like that for quite some time (btw - Selur, thanks for implementing all the crazy upscale stuff....) and in recent tests I moved my focus from "old VHS footage" (sometimes just too hard to deal with.... I'll keep the original captures, look for a possibility to RF-capture VHS and revisit the topic later when I manage to train my own models..) to "older movies".

Stanley Kubrick loved to use "natural lighting" in many scenes, adding no extra light while filming; a lot of film grain in many scenes of his movies is the result. One could argue this is an artistic choice; I would think the director would have loved to have less noise if the camera and film technology had been able to deliver it... So I tossed some BD footage into some AI stuff and was happy to see that in some cases it does a wonderful job of "cleaning up", much better than any degrain algo or commercial denoisers like NeatVideo etc... Some colour grading was eliminated too, details got clearer... Overall, a very nice result. BUT: in some cases, small details in objects were misinterpreted by the AI and people suddenly had patterns on clothing where there should be none in reality.

This is a little off-topic to the "colouring" topic you started, but since many principles behind the discussed techniques are the same, and some "video enhancement/upscale" stuff deals with colours in some way too, I thought I'd mention it. A project I would love to attempt (maybe someone wants to join) is training my own model for specific use cases.... For example, I have some HD footage of DS9, which could be one training pair for the series: a specially trained DS9 model, so to say. Use this on something like VDSR... |
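The multi-frame detail/noise recovery mentioned above can be illustrated with a toy example: averaging several noisy captures of the same (already aligned) content reduces the noise roughly with the square root of the frame count. Real tools add motion estimation and alignment on top; this sketch assumes perfectly aligned frames and synthetic data.

```python
import numpy as np

rng = np.random.default_rng(0)

# A synthetic "true" frame plus several noisy captures of it, standing in
# for a static/slowly moving background seen across consecutive frames.
true_frame = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy = [true_frame + rng.normal(0, 0.1, true_frame.shape) for _ in range(16)]

def stack(frames):
    """Temporal average: the crude core of multi-frame noise reduction."""
    return np.mean(frames, axis=0)

single_err = np.abs(noisy[0] - true_frame).mean()
stacked_err = np.abs(stack(noisy) - true_frame).mean()
# With 16 frames, residual noise drops by roughly a factor of 4 (sqrt(16)).
```

This is exactly why astronomical stacking software and motion-compensated denoisers can pull detail out of footage that no single-frame filter could.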
10th December 2020, 17:23 | #9 | Link | |
Registered User
Join Date: Oct 2001
Posts: 454
|
Quote:
Most stuff is developed in the Linux world, but a lot of it can be used and accessed from Windows - or Google Colab, if you want. A commercial product would be Topaz Gigapixel. A free-to-use thingy for Windows using ESRGAN would be https://github.com/ptrsuder/IEU.Winforms. Dig a little and you will find a lot of models to "plug" into Image Enhancer (search "model zoo esrgan") for different use cases (trees, landscapes, game content of all sorts, JPEG compression reduction, faces, black & white content, maps, etc...). In fact, there is quite a lot out there to play with... |
|
11th December 2020, 11:30 | #10 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
Quote:
As to the things you explained, I guess the artifacts come from the fact that motion estimation can't really be perfect and is very source/scene dependent. So yeah, as you said, when it works well it can produce very good results, better than conventional methods, but when it doesn't, it produces undesired side effects and therefore can't be trusted blindly. |
|
11th December 2020, 12:32 | #11 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,903
|
Just so you can see the final result: this is what we've got after months of work, manually denoising, degraining, removing spots and scratches, stabilizing, and then finally using object tracking via Mocha to manually colorize the whole thing. About the colorization: when we were not sure, we left things as grey-ish as possible without making it too obvious (without people noticing it), but we're pretty confident we got many things right, like the flags. As for dresses and other things, there's been a lot of historical research, and not only that: for people like Barbara Ann Scott, we had to track her down to find the color of her hair (she was blonde). Same goes for Thomas Hedvin Byberg, Ken Bartholomew, Robert Fitzgerald and many, many others whom we had to track down. I mean, the recording is from the '40s, so...
I hope John Meyer is gonna be proud, 'cause we certainly are proud of the results: |
13th December 2020, 20:15 | #12 | Link |
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,496
|
The snow, particularly in the first shot, looks a little unnatural to me - the shadows should be a bit bluer (lit as they are by the sky), or perhaps a bit less green. But other than that minor niggle, the colourisation looks fantastic.
|