View Full Version : Square vs. non-square issues


TlatoSMD
22nd April 2010, 17:16
I'm still kinda confused about square vs. non-square PAR matters and I feel that my encodings are suffering from it especially if using containers/codecs and/or tools that don't seem to process non-square formats too well, so I wonder if any of you could clear up some things for me.

Now, what I do know so far is that back during the 1920s and 1930s when TV was developed, technical, mechanical, and electronic standards weren't up to modern precision yet, and that's why we're still struggling today with archaic nonsense such as interlacing and non-square PARs. Hence, the visual *DISPLAY* ratio of a TV set might be 4:3, while any video's actually encoded *PIXEL* resolution ratio is closer to 5:4 (at least in PAL), because J. L. Baird, Paul Loewe, Manfred von Ardenne and other such pioneers didn't have the means yet to create precise 1:1 square pixels, just as they didn't have any proper frequency transformers and cathode rays/phosphor displays yet that would have facilitated them to work progressive instead of interlaced.

Working mostly in a PAL environment, what I'd like to know is the analogue and digital reasons as for:

a.) Why and when is the horizontal resolution of a 576-line PAL video 704, 720, or 768 pixels?

b.) So the display ratio of a traditional TV set or comp monitor is 4:3, and how, exactly, does 5:4 figure into this, especially in regard to a horizontal resolution of 704, 720, and 768 pixels? In its article on PAR, Wikipedia instead mentions a ratio of 59:54; how does that fit into things? What are the exact *PIXEL* resolution ratios for each of those resolutions (704, 720, 768)?

c.) How do all these things work for 16:9 PAL, square and non-square? What are the PARs here? What I do know is Premiere Pro 2.0 obviously puts out its 16:9 PAL DV-AVIs at a resolution of either 1024x576 or anamorphic 720x576.

Sharc
22nd April 2010, 19:34
That's a long story .... and you find tons of posts in this forum on the subject.
A short answer to c) 16:9 PAL anamorphic DVD (mpeg-2):
You have the choice between 2 PARs. Which one is 'correct' depends on how the DVD has been authored, and only the studio could tell. You won't find it on the disc.
The 2 options are
1024:702 = 512:351 = 1.4587 (ITU-R BT.601)
1024:720 = 64:45 = 1.4222..
Fortunately the 2 options are quite close, so you don't really have to worry much.
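
For anyone who wants to check those two figures, here is a minimal sketch of the arithmetic. It assumes that a 16:9 picture with 576 lines is 1024 square pixels wide, and that either 702 or 720 stored columns are stretched to that width; the function name is made up for illustration.

```python
from fractions import Fraction

# Width of a 16:9 picture that is 576 lines tall, measured in square pixels.
SQUARE_WIDTH_16_9 = 576 * 16 // 9  # 1024


def par_for_active_width(active_width):
    """PAR when `active_width` stored columns fill the whole 16:9 picture."""
    return Fraction(SQUARE_WIDTH_16_9, active_width)


for width in (702, 720):
    par = par_for_active_width(width)
    print(f"{width} columns -> PAR {par.numerator}:{par.denominator} = {float(par):.4f}")
```

The two results differ by about 2.5%, which is why the choice rarely matters much in practice.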

See also:
http://forum.doom9.org/showthread.php?p=826896#post826896
http://lipas.uwasa.fi/~f76998/video/conversion/#conversion_table

Midzuki
22nd April 2010, 19:35
Suggested "homework": :)

http://www.mir.com/DMG/aspect.html

because J. L. Baird, Paul Loewe, Manfred von Ardenne and other such pioneers didn't have the means yet to create precise 1:1 square pixels, just as they didn't have any proper frequency transformers and cathode rays/phosphor displays yet that would have facilitated them to work progressive instead of interlaced.

There are no pixels in analog video.

Interlaced was chosen because of the limited bandwidth for broadcasting, which had to do with certain "legal" definitions. Of course analog TV could have chosen progressive instead of interlaced, however such choice would imply reducing-by-2 the quantity of available broadcasting channels (which wouldn't have been so bad, considering that at least 95% of all that's been broadcast since 1930 is 100% garbage :devil: ).

TlatoSMD
22nd April 2010, 20:32
There are no pixels in analog video.

Pixels have been around for as long as rasterized scanning which is a basic requirement for faxing, and that's been around since the 1840s (originally developed and used to scan original signatures and transmit them by wire, probably for legal and business purposes), see http://en.wikipedia.org/wiki/Fax#History. The first raster-scanned, electronically transmitted (still) images were created by Alexander Bain around 1845, and by the 1860s, his system was further refined into a commercial faxing service by Giovanni Caselli. The existence of pixels and rasterized scanning of electronically transmitted images as far back as the 1800s is about as little known as the fact that the first permanent three-color photograph still in existence today was created by James Clerk Maxwell in 1861.

Moreover, Nipkow's patent for his mechanical rotating disk televisor in the 1880s spoke verbatim of a "pixel raster" ("Bildpunktraster") which the scanned moving images were resolved into. It's Nipkow's disk design that Baird would use in the 1920s to create his mechanical television. What we do know about Baird's resolutions and that of the comparable 1930s rotating-drum Scophony system is that they achieved up to about 400 lines.

Ever since the 1920s, TV resolution was precisely defined in terms of horizontal lines, and I kinda doubt that analogue cameras and displays have some kind of "infinite" vertical resolution. I'd be very willing to bet that analogue TV pixels are way bigger than the size of each single electron fired at the phosphor layer, as at least with CRT TVs, you can put a magnifier to the screen and see the shadow mask's precisely defined RGB dots placed in triangles, like this:

http://upload.wikimedia.org/wikipedia/commons/thumb/9/9f/CRT_screen._closeup.jpg/150px-CRT_screen._closeup.jpg

It's called a triad: http://en.wikipedia.org/wiki/Triad_(computers), and each triad equals one pixel in resolution.

That's a long story .... and you find tons of posts in this forum on the subject.
A short answer to c) 16:9 PAL anamorphic DVD (mpeg-2):
You have the choice between 2 PARs. Which one is 'correct' depends on how the DVD has been authored, and only the studio could tell. You won't find it on the disc.
The 2 options are
1024:702 = 512:351 = 1.4587 (ITU-R BT.601)
1024:720 = 64:45 = 1.4222..
Fortunately the 2 options are quite close, so you don't really have to worry much.

Actually, I'm not coming from an authored DVD. Most of the time, I use stuff that

a.) I've captured from 4:3 (often letterboxed) VHS,

b.) that's not on a standard physical medium to begin with (websites, file-sharing, YouTube...),

c.) stuff that I shot myself in DV.

Any of these I put through tools such as PPro, VDub, MEncoder, NeroVision, and Encore, through various codecs and containers (only some of which support anamorphism and non-square pixels, which BTW are two related, but not identical issues, as one refers to display ratio and the other to PAR), and especially when resizing or cleaning in VDub, I seem to lose any non-square information, and I guess also when outputting captured VHS material from PPro.

Especially for VDub purposes such as resizing, it would be great to know definite DISPLAY ratios such as 4:3, 5:4, 16:11 etc. (not PARs such as 1.xxxx), also because VDub seems to enforce square pixels during encoding which can't be corrected later, such as when authoring. For example, let's take a resolution of 720x576, which at square pixels would be a ratio of exactly 5:4.
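
A rough sketch of the resize arithmetic in question: given a stored frame size and a PAR, compute the square-pixel frame size to resize to. The function name and the rounding to even widths are illustrative assumptions, and the 12:11 and 16:11 PARs used in the examples are just commonly quoted PAL values, not something any particular tool reports.

```python
from fractions import Fraction


def square_pixel_size(width, height, par, mod=2):
    """Return (width, height) for a square-pixel resize that keeps the height.

    `par` is the pixel aspect ratio of the source (pixel width : pixel height).
    The new width is rounded to the nearest multiple of `mod`.
    """
    exact_width = width * Fraction(par)
    new_width = int(round(exact_width / mod)) * mod
    return new_width, height


# Assumed PAL PARs: 12:11 for 4:3 material, 16:11 for 16:9 material.
print(square_pixel_size(704, 576, Fraction(12, 11)))  # 4:3 source, 704 wide
print(square_pixel_size(720, 576, Fraction(12, 11)))  # 4:3 source, 720 wide
print(square_pixel_size(704, 576, Fraction(16, 11)))  # 16:9 source, 704 wide
```

With a PAR of exactly 1 the function returns the input size unchanged, which is the 5:4 case for 720x576 mentioned above.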

Midzuki
22nd April 2010, 21:00
http://en.wikipedia.org/wiki/Optical_resolution

http://en.wikipedia.org/wiki/Kell_factor

Sharc
22nd April 2010, 21:07
@TlatoSMD
Yes, but rasterizing, scan lines etc. are technical (artificial) means to quantize or 'digitize' information which is natively analogue.
It is therefore correct to say that 'pixels' are not existent in analogue video, they are a result of the attempts to quantize analogue information.
That does not mean that an analogue signal will represent an infinite resolution, unless it is given an infinite spectral bandwidth (which is impractical) and not affected by noise.

TlatoSMD
22nd April 2010, 21:17
@TlatoSMD
Yes, but rasterizing, scan lines etc. are technical (artificial) means to quantize or 'digitize' information which is natively analogue.
It is therefore correct to say that 'pixels' are not existent in analogue video, they are a result of the attempts to quantize analogue information.

In other words, any analogue format is not analogue but digital because it requires scanning and sampling? Are you trying to say that only nature is analogue and any technological representation of it is sampled, i. e. digital?

http://en.wikipedia.org/wiki/Optical_resolution

What I gather from that is that the mere number of pixels can be misleading if the size for the individual pixel is not defined? Is that what it says? I guess it doesn't matter much here, it's no surprise that if your CRT TV is bigger, so will be the pixels, or in other words, your lines will be higher.

http://en.wikipedia.org/wiki/Kell_factor

Oh, that seems to give a pretty precise pixel resolution of analogue PAL by the Kell factor of 0.7 which looks like the analogue equivalent to PAR. Accordingly, (4/3)×0.7×576 = 537.6 vertical lines, which would rate analogue PAL at a pixel resolution of 537x576...I guess I'll need to read up on your other homework above as to why we ended up with 7xx vertical pixels then.
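
The arithmetic of that estimate, spelled out as a small sketch. It only reproduces the numbers in the post; treating the Kell factor as a direct converter from line count to pixel count is the post's own assumption, not an established rule.

```python
KELL_FACTOR = 0.7        # rule-of-thumb value quoted in the post
ACTIVE_LINES = 576       # visible PAL lines
DISPLAY_ASPECT = 4 / 3   # 4:3 display

# Effective vertical detail after applying the Kell factor.
effective_vertical = KELL_FACTOR * ACTIVE_LINES

# The same detail density spread across the wider horizontal axis.
estimated_horizontal = DISPLAY_ASPECT * effective_vertical

print(f"effective vertical detail: {effective_vertical:.1f}")
print(f"estimated horizontal detail: {estimated_horizontal:.1f}")
```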

TlatoSMD
22nd April 2010, 23:17
http://www.mir.com/DMG/aspect.html

Oy...*tries to wrap his head around that*

So...768x576 is the full resolution of a PAL signal captured in square pixels, and 702x576 or 704x576 is the same information captured in non-square pixels? Whatever did 720 originate from, then? That source even says that's why any standard digital PAL signal of 720x576 pixels must be cropped before it can be displayed on a TV.

And, again, DISPLAY ratios such as 59:54 and 118:81 appear. I guess these are what I'm after (such as when resizing obscure web-originated sources into square pixel destinations), but what do they derive from? All I know is 720x576 is exactly 5:4 when square.

http://forum.doom9.org/showthread.php?p=826896#post826896

Okay, this now has ratios such as 12:11 and 16:11. What do THEY originate from?
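
One way to see where these fractions can come from, as a hedged sketch. It assumes that 59:54 is the ratio between a 14.75 MHz square-pixel PAL sampling rate and the 13.5 MHz BT.601 rate (the 14.75 MHz figure is not stated anywhere in this thread), and that 12:11 and 16:11 come from fitting 768 or 1024 square pixels into 704 stored columns.

```python
from fractions import Fraction

BT601_RATE_MHZ = Fraction(27, 2)        # 13.5 MHz
PAL_SQUARE_RATE_MHZ = Fraction(59, 4)   # 14.75 MHz, assumed square-pixel PAL rate

# Sampling more slowly than the square-pixel rate makes each pixel wider.
itu_par_4_3 = PAL_SQUARE_RATE_MHZ / BT601_RATE_MHZ
itu_par_16_9 = itu_par_4_3 * Fraction(4, 3)

# Simplified values: the full picture is assumed to fill exactly 704 columns.
simple_par_4_3 = Fraction(768, 704)
simple_par_16_9 = Fraction(1024, 704)

for name, par in [
    ("ITU 4:3", itu_par_4_3),
    ("ITU 16:9", itu_par_16_9),
    ("704-based 4:3", simple_par_4_3),
    ("704-based 16:9", simple_par_16_9),
]:
    print(f"{name}: {par.numerator}:{par.denominator} = {float(par):.4f}")
```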

This thread is confirming my notion that we ought to leave stuff such as non-square PARs and interlacing behind as fast as possible...*sighs* but until then, I'll need to get a grasp on these issues.

Oh, and as for Brother John's question about videos actually in 704x576, I've seen them when working for Euro1, the company behind the small music video channel YavidoClips, available on cable and satellite. It was the format our Austrian content provider (I think the name was Moro or something like that) sent us the music videos as, 704x576 MPEG-2 (along with some 768x576), with no pillarboxing as the display ratio was 4:3. I don't remember whether we kept that resolution when sending our material down to ASTRA in Munich.

In any case, that stuff looked like it had been kept for 20 years on self-recorded VHS stored on active (i. e. heavily magnetic) speakers and then captured via composite cable (either Belling-Lee or even just cinch) directly to MPEG-2, with lots of cross color, cross luminance, and MPEG blocking artifacts. In fact, that was why they hired me as an intern for half a year in the first place, in order to make that crap conform to broadcasting standards.

But anyway, with that other thread I got to http://www.bbc.co.uk/commissioning/tvbranding/picturesize.shtml, where the BBC explains to me that the difference between 704 and 720 is due to non-image information required for "digital processing"...whatever that is. mpucoder on this forum here says that "720 is the lowest multiple of 16 greater than 710", the latter being the horizontal resolution of analogue NTSC. It's interesting to know though that 480 for NTSC was chosen to have an equal sampling rate for capturing both analogue NTSC and PAL to digital ("720x480x30 = 720x576x25").

Also, that thread is telling me that a difference of 4 pixels between ITU/RCC and DVD/MPEG is negligible, so we may get that outta the equation and might simplify a few things here. Basically, one is the recommended standard for capturing analogue video, and the other is the de-facto standard of digital encoding, right?

Hrmmm...http://lipas.uwasa.fi/~f76998/video/conversion/ sez that "720 pixels are sampled to allow for little deviation from the ideal timing values for blanking and active line length in analog signal." Timing values? Sounds like something that could rather be fixed by something as simple as what was a waiting loop in BASIC. Either that, or in case your "wobbly old home video tape recorder" is way off-synch simply use the blanking and line length signals for TBC of "wobbly" analogue recording and/or playback to extract proper sampling frequency values only to be used during the capturing process but not while encoding the digital file resulting from it. I bet you could save quite a bit of space that way.

Then again..."Last but not least, 720 pixels are sampled because a common sampling rate (13.5 MHz) and amount of samples per line (720) makes it easier for the hardware manufactures to design multi-standard digital video equipment." Could that be related to the fact that 480 was chosen for an equal sampling rate of NTSC and PAL? Does 720, at the end of the day, go back to the simple convenience of using the same sampling rate of 13.5MHz for both formats when digitizing? Now, THAT'd be a simple reason I could accept.
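
A quick check of the "same sampling rate" idea, as a sketch. The active-picture product quoted above does come out equal for both systems. The totals per line including blanking (858 samples for 525-line systems, 864 for 625-line systems) are BT.601 figures that are not mentioned in this thread, so treat them as an added assumption.

```python
from fractions import Fraction

NTSC_FRAME_RATE = Fraction(30000, 1001)  # 30/1.001 Hz
PAL_FRAME_RATE = Fraction(25)

# Active picture only, using the rounded 30 Hz from the quoted rule of thumb.
active_ntsc = 720 * 480 * 30
active_pal = 720 * 576 * 25

# Whole lines including blanking (assumed BT.601 totals per line).
total_ntsc_hz = 858 * 525 * NTSC_FRAME_RATE
total_pal_hz = 864 * 625 * PAL_FRAME_RATE

print(active_ntsc, active_pal)
print(float(total_ntsc_hz), float(total_pal_hz))
```

Both totals land on the same 13.5 MHz, which fits the idea that one common clock can serve both systems.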

Soooo...I guess my first question is answered:

*768 is an analogue PAL signal captured square (= each analogue pixel represented by several digital pixels),
*702 is the same captured non-square (= how much information of original, analogue non-square pixels is stored within an analogue PAL signal),
*704 is the nearest equivalent to 702 divisible by 8 or 16 to meet codec requirements,
*and 720 is 704 plus pillarboxing of 8 pixels on either side, due to requiring the same sampling frequency for digitizing both NTSC and PAL ("720x480x30 = 720x576x25"), pretty much like the 4:x:x convention in chroma-subsampling is due to the very first practical ColorNTSC system back in the 1940s which used 4:1:1. Additional rumored benefits for 720 arise by somehow matching up "in timing" with analogue non-visual information such as blanking etc. which is the difference between 576 visual vs. 625 actual lines.

But I guess by now I need to add another question to the list in my initial post: How do we get from the Kell factor of 0.7 (which would indicate 538 vertical lines) to 7xx? 538:7xx is certainly not the Nyquist factor of 1:2. Is it the need for same sampling frequency in NTSC and PAL? Digital codec requirements of divisibility by 8 or 16? Or, wait...could it be that 538x576 is square somehow, and TV never had non-square pixels to begin with? x.x

Emulgator
24th April 2010, 14:24
As I tried to get closer behind PAR I decided to finally follow
http://lurkertech.com/lg/
http://lipas.uwasa.fi/~f76998/video/conversion/
as you did and found these quite consistent sources.

My best guess at genesis of video parameters:
As video engineers had to find a way transforming moving pictures from film to analog video, then digital video:

0.:Analog video is interlaced by birth since 1930.
Interlacing was developed end of the 1920s in Germany at Telefunken by Fritz Schröter.
Interlacing was patented 1930 as „Verfahren zur Abtastung von Fernsehbildern" (DRP-Patent Nr. 574085).

http://de.wikipedia.org/wiki/Zeilensprungverfahren

Why? Interlacing yielded good results on any ancient slow-decay CRT and kept this advantage even on modern fast-decaying CRT
getting the best picture out of a limited bandwidth by distributing temporal and spatial resolution on two fields.
(Later attempts had been made in the US by splitting into even more fields, but dropped in favour of NTSC, falling back to two fields)

Any scanning video projection will draw advantage from interlacing.
Any full-frame projection will have to introduce something similar, bob up, calculate new motion-based frames
("100Hz/120Hz, later 200/240Hz technique")
to avoid noticeable flicker when presenting genuine 24p,25p,30/1.001p footage.
Or suffer from deinterlacing as most of today's TFTs still do.
If stored progressive, animations need very elaborate motion blur techniques
to look good at these lower framerates.
The Pixar movies discs give an interesting example of how painstakingly motion blurred animation was developed.
Camera-shot movies may keep shutter speeds in the range of frame duration to keep motion blurred.

From 48p,50p,60/1.001p on we should experience an impression of steady motion
even on less motion-blurred footage, and interlacing may finally die out.


A.: Timing requirements (given by framerate, born from film stock and inherited as fieldrate by analog video) decide in first instance.
B.: Vertical resolution (digital lines, inherited by analog video's line syncs) follow,
C.: Horizontal resolution (columns, these are introduced by digitizing) may use what is left by the first two restrictions.

The Kell factor only describes the luxury you want to have regarding signal fidelity after reconstruction.
Only a quality model, a rule of thumb, no fixed structure would need to follow.

The sampling theorem I learnt back in 1985 still assumed plainly mathematically
that you only have to sample at twice the frequency you intend to restore.
Give fs = double fn and you have...maybe nothing.

This approach is only just sufficient for an eternally constant signal which has to be strictly monochrome.
Like a sine wave that started in the beginning of the universe, stays forever, and is sampled for a long period.

And this is way too poor for our daily real life signals like sound, optical impressions and the like.

Just imagine you start to sample a sine wave exactly at the nodes.
No matter how big the amplitude is and you still get: 0
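
That thought experiment can be run directly. A minimal sketch, with an arbitrary 1 kHz test tone and a made-up helper name: sampling at exactly twice the signal frequency gives either nothing or the full peaks, depending only on the phase.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, phase, count=8, amplitude=1.0):
    """Sample amplitude * sin(2*pi*f*t + phase) at t = n / sample_rate."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase)
        for n in range(count)
    ]


# fs is exactly 2 * f in both cases; only the starting phase differs.
at_nodes = sample_sine(1000.0, 2000.0, phase=0.0, amplitude=5.0)
at_peaks = sample_sine(1000.0, 2000.0, phase=math.pi / 2, amplitude=5.0)

print([round(s, 6) for s in at_nodes])
print([round(s, 6) for s in at_peaks])
```

The first list is zero to within floating-point rounding no matter what amplitude is chosen, while the second alternates between the positive and negative peak.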

CD-DA: fs=44.1kHz is a bit poor to reconstruct a 20kHz signal, just 220.5% of fn.
but we cannot judge distortions there anyway.

DVD: fs=13.5MHz represent 270% of fn, a bit better.
Our eyes can judge high (spatial) frequency details better than our ears can do with high (temporal) frequency details.

My rule of thumb: Give fs= triple (300%) fn and you get an impression, even with complex signals.

A.: Sample rate vs. field rate

With D1 using ITU-Rec601 a common video sampling rate of 13.5MHz for both NTSC and PAL was established
as sufficient to carry video information of roughly 5 MHz bandwidth,
yielding the same pixel rate, (so amount of information) on two quite incompatible time bases.
Other sampling rates, differing for NTSC vs. PAL, yielding square pixels do exist, but not in consumer area.

In any professional Video ADC
both system timings, NTSC and PAL have to be accomplished.
In any consumer Video DAC (like DVD-Video) device the same two system timings (sometimes only one) have to be accomplished.

How? A single 27.000.000 Hz quartz oscillator supplies the master clock.
This clock is then used as time base, directly or multiplied by 2 or 4 for ADC, DAC, any digital pre-and post-filtering.
27MHz / 54MHz / 108MHz Video DAC Oversampling/Antialiasing etc.pp.

27MHz. Divided by 2 we yield the infamous 13.5MHz sampling frequency.
27MHz. Divided by 300 we yield 90000 Hz. These are the SCR ticks for DVD Video.
DVD-Video: 90kHz. Now only 2 more dividers are used:
NTSC: A divider of 3003 gives 90000/3003 = 30/1.001 Hz ≈ 29.97002997 Hz NTSC framerate.
PAL: A divider of 3600 gives 90000/3600 = 25 Hz framerate.
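
The divider chain above as a small sketch with exact fractions, so the NTSC rate stays exactly 30000/1001 instead of a rounded decimal. The variable names are made up for illustration.

```python
from fractions import Fraction

MASTER_CLOCK_HZ = 27_000_000

sampling_rate_hz = Fraction(MASTER_CLOCK_HZ, 2)   # 13.5 MHz sampling clock
tick_rate_hz = Fraction(MASTER_CLOCK_HZ, 300)     # 90 kHz tick rate

ntsc_frame_rate = tick_rate_hz / 3003
pal_frame_rate = tick_rate_hz / 3600

print(f"sampling rate: {float(sampling_rate_hz) / 1e6} MHz")
print(f"tick rate: {float(tick_rate_hz)} Hz")
print(f"NTSC: {ntsc_frame_rate} = {float(ntsc_frame_rate):.8f} Hz")
print(f"PAL: {pal_frame_rate} Hz")
```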

B.: Line count

The number of picture lines in analog interlaced video (where picture information sets in and ends in the middle of the "timing lines")
had to be preserved as well as possible, but of course had to be rounded/truncated.

A single NTSC frame carries picture information for the duration of 485 line syncs.
Picture information sets in and out at half line position,
so 486 digital lines will sample the whole content.

The picture information of a single full analog NTSC video line lasts for 710.85 sampling cycles.
The digital sample count containing picture information would be 711.
720 columns cover 711 and allow for an additional timing misalignment equivalent to 9 columns to be still captured.

A single PAL frame carries picture information for the duration of 575.75 line syncs.
Picture information sets in at half line position and ends at full,
so 576 digital lines will sample the whole content.

The picture information of a single full analog PAL video line lasts for 702 sampling cycles.
The digital sample count containing picture information would be 702.
720 columns cover 702 and allow for an additional timing misalignment equivalent to 18 columns to be still captured.
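
A sketch of how those sample counts follow from the 13.5 MHz clock. The active line durations used here (52 µs for PAL, roughly 52.6556 µs for NTSC) are assumptions chosen to match the 702 and 710.85 figures above; they are not stated in this post.

```python
import math

SAMPLING_RATE_MHZ = 13.5
FULL_WIDTH = 720

# Assumed active line durations in microseconds.
ACTIVE_LINE_US = {"NTSC": 52.6556, "PAL": 52.0}


def samples_per_active_line(duration_us):
    """Number of sampling cycles that fit into the active part of one line."""
    return duration_us * SAMPLING_RATE_MHZ


for system, duration_us in ACTIVE_LINE_US.items():
    samples = samples_per_active_line(duration_us)
    needed = math.ceil(round(samples, 2))   # whole columns needed
    spare = FULL_WIDTH - needed             # margin for timing misalignment
    print(f"{system}: {samples:.2f} samples, {needed} columns, {spare} spare")
```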

NTSC: Later for DV and DVD mod16 cropping had to be applied to the 486 sampled NTSC lines,
clipping NTSC-DV and NTSC-DVD to 480 to fit digital compression purposes.

C.: Columns:

Now my best guesses:
FullD1 (720 columns) are mod16, the next mod16 to crop to (Broadcast D1) is 704.

720 columns should be captured from film stock or analog tapes for both systems,
but not completely filled with picture information.

A 4:3 picture was expected to be inside 704 columns.

A total of 9 (raw 711 NTSC on 720 FullD1 NTSC),
18 (raw 702 PAL on 720 FullD1 PAL),
16 (any 720 FullD1 to 704 Broadcast D1)
pillar pixels at the sides (symmetric or not) were expected to be accommodated
due to any possible misalignment in analog storage and/or analog part of A/D conversion.
(Flying spot scanner, line sync issues etc.)

These pillars should be fixed-cropped to 704 after the intermediate file was ready for final rendering.
The final distribution, so broadcast format was intended to be 704 wide.

A production and presentation aspect ratio of 4:3 was assumed to be distributed as 704 wide.
Any other aspect ratio (16:9) should be anamorphically compressed/expanded to use the same width.
But since the 720 were used for final distribution as well,
we now have to cope with a multitude of PARs and their misinterpretations.

(I still sometimes wonder why the "academy AR" of 1.38:1 is only close to 4:3,
find the 1.3636 in 4:3*720/704, wonder about 1.0926 vs.1.0666, yawn, stretch my pixels and go to bed)

Sharc
24th April 2010, 15:24
Those who can read and understand German may want to enjoy reading the sections about pixels, PAR, anamorphic encoding, BT.601 etc. in Brother John's Encodingwissen (http://encodingwissen.de/video/anamorph-quelle.html). A very nice and easy to read treatment of the subject.

edzieba
26th April 2010, 13:13
[...] and each triad equals one pixel in resolution.

If you mean 'each triad equals one pixel of input resolution', this isn't strictly true. This is something I hadn't figured out until not too long ago, is rarely explicitly mentioned anywhere, and for anyone who grew up with digital video and discrete-pixel LCD displays, is rather counter-intuitive and non-obvious: each input pixel is spread over several triads. The number of triads it is spread over can vary (which is why a CRT display does not require native-resolution input to produce a sharp image), and rarely (if ever) will a CRT have beam-scanning electronics accurate enough to address individual triads (what could be thought of as 'native resolution').