Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
14th October 2021, 04:04 | #1 | Link |
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Preparing for numPy reshape
Though it might not seem this is VS and python oriented, it is.
What is the 'pix_fmt' of raw frames in an ffmpeg filter complex? Is it 60 unpadded bits formatted as yuv420, like the quads in macroblocks [note 1]?... like this: Code:
<-byte-> yyyyyyyy yy yyyyyy yyyy yyyy yyyyyy yy yyyyyyyy uuuuuuuu uu vvvvvv vvvv etc. like this: Code:
<-byte-> yyyyyyyy yy uuuuuu uuuu vvvv vvvvvv etc. Or does the 'p' in 'yuv...p10' NOT mean "packed"? Are the raw frames in filter complexes RGB or YUV? Is there even such a thing as rgb420p10? I need to ask these questions in order to configure ndarrayFrame.reshape() in this python script: Code:
import numpy

rawvideoFrame = procPipeIn.stdout.read(bytesPerFrame)
# numpy.fromstring is deprecated; frombuffer reads the raw bytes directly
ndarrayFrame = numpy.frombuffer(rawvideoFrame, dtype='uint8')
# reshape rows first: (rowsPerFrame, pixelsPerRow, bytesPerPixel)
ndarrayFrame = ndarrayFrame.reshape((rowsPerFrame, pixelsPerRow, bytesPerPixel))
Code:
[note 1] Sample-quad diagram (ASCII art flattened in extraction): each format is drawn as a 2x2 quad of samples, conceived as stacked color planes. RGB: full-size R, G, and B planes. YCbCr420: a full-size Y plane over quarter-size Cb and Cr planes. YCbCr422: a full-size Y plane over half-size Cb and Cr planes. YCbCr444: full-size Y, Cb, and Cr planes. --Mark. |
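To make the question concrete: the 'p' in ffmpeg's yuv420p10le means planar, and to my understanding 10-bit samples are carried one per little-endian 16-bit word rather than bit-packed across bytes. A sketch of the per-frame byte arithmetic (the helper name is mine, not an ffmpeg API):

```python
# Bytes per raw frame for common planar pixel formats. Assumption
# (hypothetical helper, not part of ffmpeg): samples above 8 bits are
# stored one per 16-bit little-endian word, never bit-packed.

def bytes_per_frame(width, height, subsampling, bit_depth):
    """subsampling: (h_div, v_div) for the chroma planes, e.g. (2, 2) for 4:2:0."""
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    luma = width * height
    chroma = (width // subsampling[0]) * (height // subsampling[1])
    return (luma + 2 * chroma) * bytes_per_sample

print(bytes_per_frame(720, 480, (2, 2), 8))    # yuv420p:     518400
print(bytes_per_frame(720, 480, (2, 2), 10))   # yuv420p10le: 1036800
print(bytes_per_frame(720, 480, (1, 1), 16))   # yuv444p16le: 2073600
```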
14th October 2021, 07:18 | #2 | Link |
Registered User
Join Date: May 2011
Posts: 321
|
btw, you can get ffmpeg raw output into vapoursynth.
If wanting to get a numpy image where planes have different sizes, plane-by-plane size data is needed: read each plane (not the whole frame), shove each plane into numpy, then stack those three planes into an image array. Not sure what you want; all processing using numpy usually uses rgb, and rgb has the same plane size for each array. So if having subsampled stdout, that'd be weird to get into a numpy array, because one plane has a different shape than the other two. Is that still what you want? iirc, .reshape(-1, h, w, 3) should get all frames at once, but the planes need to have the same size (yuv444 or rgb). If it is for viewing, raw data should be rgb -> numpy. Last edited by _Al_; 14th October 2021 at 07:23. |
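A minimal sketch of the plane-by-plane read _Al_ describes, with the ffmpeg pipe simulated by an in-memory buffer (the frame size and the read_plane helper are my own, for illustration):

```python
import io
import numpy as np

# Read a subsampled yuv420p frame plane by plane rather than reshaping
# the whole frame at once. The pipe is simulated with BytesIO here; in
# the real script it would be procPipeIn.stdout.

w, h = 8, 4                       # tiny hypothetical yuv420p frame
y_size, c_size = w * h, (w // 2) * (h // 2)
pipe = io.BytesIO(bytes(range(y_size)) + b'\x80' * c_size + b'\x80' * c_size)

def read_plane(f, count, shape):
    # one plane's worth of bytes -> 2-D array (rows first)
    return np.frombuffer(f.read(count), dtype=np.uint8).reshape(shape)

y = read_plane(pipe, y_size, (h, w))
u = read_plane(pipe, c_size, (h // 2, w // 2))
v = read_plane(pipe, c_size, (h // 2, w // 2))
print(y.shape, u.shape, v.shape)   # (4, 8) (2, 4) (2, 4)
```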
14th October 2021, 15:58 | #3 | Link | ||
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Yes, that's what I'm doing now. In the future I need to put VS right in the middle of a filter graph, so I'm thinking of breaking the filter graph in half: preprocess and postprocess.
Preprocess in ffmpeg: open the source video, do TB & PTS fix-ups and whatnot.
MV interpolate in VS: pipe the raw frames from ffmpeg to VS to do motion vector interpolation, then pipe the frames back to ffmpeg via subprocess.Popen() -- I think.
Postprocess in ffmpeg & mkvmerge: mux in audio, subs, chapters and whatnot, and then output.
I would do everything in VS if I knew how, but I'm lost in python -- totally a python newbie who is largely ignorant of standard python architecture and methods. From what I've found, the python documentation seems to concentrate on syntax and functions. But for what I'm trying to do, I need architecture methods, such as '<object>.list_functions()', which are unnamed in the documentation I've found. You know that learning a new language is like climbing a mountain: most of the time is spent looking at maps of the mountain and planning a route, not getting gear together. :-) I'm kind of bogged down looking for the maps. Quote:
Quote:
I guess my best approach would be to force ffmpeg to convert the input to raw rgb444p16le, only, so that all color planes are the same size and all pixels land on byte boundaries. What do you say? Or perhaps there's another way that involves less processing and memory? Thanks, Al. --Mark. |
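For what it's worth, ffmpeg's name for packed 16-bit-per-component RGB is rgb48le ("rgb444p16le" is not an ffmpeg format name). Assuming that format, every pixel lands on byte boundaries and a single reshape covers the whole frame; a sketch with the pipe simulated:

```python
import io
import numpy as np

# Packed 16-bit RGB (ffmpeg's rgb48le): 3 components x 2 bytes per
# pixel, so one frombuffer + reshape handles the whole frame. The pipe
# is simulated here; dimensions are fabricated for illustration.

w, h = 4, 2
frame_bytes = w * h * 3 * 2
pipe = io.BytesIO(b'\x00' * frame_bytes)

raw = pipe.read(frame_bytes)
frame = np.frombuffer(raw, dtype='<u2').reshape((h, w, 3))  # rows, cols, RGB
print(frame.shape)   # (2, 4, 3)
```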
14th October 2021, 16:45 | #4 | Link | |||||||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
Some filters have limitations on pixel formats; sometimes ffmpeg will "sneakily" auto-inject some conversion. You can insert -vf showinfo to see the current pixel format at that node, or point, in the filter graph. Quote:
Quote:
Quote:
You can use ffmpeg -pix_fmts to list ffmpeg's supported pixel formats. Quote:
Note there are no TB or PTS issues in vapoursynth from DVD or BD sources (they are CFR encoded only, although the content can be VFR) - when you use a frame accurate source filter. None of the flaky wandering timestamp jitter issues that ffmpeg is prone to. A buggy-timestamp DVD or BD caused by ripping with makemkv using a 1ms timebase can be interpreted as CFR, with perfect timestamps and perfect timebase, if you use a frame accurate source filter. What is important in these scenarios is "frame accuracy" - you can assume any framerate - and that will fix any timestamp issues if there were any.
It would probably be simpler to do all the preprocessing steps in vapoursynth, since you're using it for the interpolation part anyways. Then continue encoding / muxing with ffmpeg/mkvmerge.
I find a faster way to learn is from examples. State what you're trying to do in basic terms - and there are probably examples posted already. Quote:
If you send yuv420p8 data, it reads yuv420p8 data. If you add some RGB-only filter, but don't specify the preconversion, it will auto insert a conversion (or sometimes throw an error) Quote:
|
14th October 2021, 18:47 | #5 | Link | |||||||||||||||
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
@poisondeathray, you are... a prince! In one reply, you have cleared up so much!
Quote:
Quote:
Quote:
Do you happen to know why it's called 'planar'? Is that simply to indicate that color is treated as separate (and separable) components, or is it more profound, like having to do with the structure of macroblocks (e.g. the differences in how pixels are stored in frame v. field structure)? -- I'm quite familiar with the various species of macroblocks. Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
For progressive and soft telecine (what I call 23.9fps[24pps]): Force TB & PTSs to 24fps[24pps], MV interpolate to 120fps[120pps], resync audio, subs & chaps to the resulting 1/120 TB, encode & mux.
For hard telecine (what I call 23.9fps[2-3[24pps]], for example): Add detelecine.
For so-called NTSC (what I call 29.9fps[59.9sps]): separatefields, separately MV interpolate each field stream to 120fps, bob the first field stream for 2 frames while delaying the second field stream by 2 frames, and finally weave them together to get a perfect run of 100% progressive frames.
Mixed 'NTSC'+hard telecine presents a problem: I've not found a way to flag combed frames, frame-by-frame. I need that flag in order to switch between my 29.9fps[2-3[24pps]]-to-120fps[120pps] method and my 29.9fps[59.9sps]-to-120fps[120pps] method.
The overall objective is a one-click solution to take anything on a DVD or BD disc that's been professionally mastered, probe it to ascertain its properties, and then transcode it to HEVC/MKV. What I don't believe is that professionals make so many mistakes as to make that goal unattainable -- after all, anything a DVD/BD player can play can't be that messed up. Quote:
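The rate arithmetic behind the pipelines above can be checked exactly with Python fractions (a sketch; the figures are the standard NTSC rates, not taken from any particular disc):

```python
from fractions import Fraction

# Exact rate arithmetic for the proposed pipelines, using fractions so
# the NTSC rates stay exact.

film_ntsc = Fraction(24000, 1001)          # "23.9fps[24pps]"
target    = Fraction(120)                  # 120fps[120pps] goal

# Forcing TB & PTSs from 24000/1001 to exact 24 speeds playback up by:
ratio = Fraction(24) / film_ntsc
print(ratio)                               # 1001/1000
print(float(3600 - 3600 / ratio))          # ~3.596 seconds saved per hour

# 24pps -> 120pps is then a clean 1-to-5 interpolation:
print(target / 24)                         # 5
```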
Quote:
|
14th October 2021, 19:42 | #6 | Link | ||||||||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
Packed vs. planar has to do with the way uncompressed data is stored and organized. If you look at fourcc.org, it lists some common formats and orientation, with some diagrams https://www.fourcc.org/yuv.php Quote:
eg. "YV12", "NV12", and "IYUV" are all 8bit 4:2:0, but they store/arrange the data differently. "NV12" has the U,V planes interleaved, while "IYUV" has the plane order reversed compared to "YV12". https://www.fourcc.org/pixel-format/yuv-yv12/ https://www.fourcc.org/pixel-format/yuv-nv12/ https://www.fourcc.org/pixel-format/yuv-i420/ Quote:
For DVD/BD - the encoding is CFR, 100% always. But the content can be VFR. It's very common - what I mean is you can have mixed cadence with field and frame repeats, e.g. 59.94 fields/s interlaced content sections - where there is motion in every field - mixed with 29.97p sections, 23.976p sections, 14.985p sections, or other frame rates. For example, a "slow motion" section of a 23.976p film might have 14.985p frames as duplicates (for effectively 1/2 speed during that sequence). Quote:
The problem with avisynth is that it's natively CFR. VFR is more difficult to output (it's possible, with decimation and output of timestamps to keep sync). Quote:
I was just suggesting doing most of it in 1 program, because going back and forth adds complexity and overhead (slower) Quote:
If you want to "speedup" 24000/1001 to 24/1, you can use core.std.AssumeFPS(clip, fpsnum=24, fpsden=1). It's the same frame count; just the framerate and timestamps are adjusted. There are 24000/1001 vs. 24/1 variants on BD, but rarely are BD's telecined (almost always native progressive); "film" DVD will always be 24000/1001. The NTSC interlaced content case is slightly problematic because of the even/odd field offset (spatially displaced 1 pixel). If you do it the way you propose, you will get an up/down flutter motion per frame pair. Usually a smart double-rate deinterlacer would be used, like QTGMC, to output 59.94p, then interpolate to something else if desired. Quote:
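The even/odd field offset poisondeathray mentions can be seen by slicing a frame into its two fields: the field streams sample rows one line apart, which is the source of the up/down flutter. A toy numpy sketch (the tiny luma plane is fabricated for illustration):

```python
import numpy as np

# Separate a frame into its two fields, the way separatefields does.
# The two field streams are vertically offset by one source row.

frame = np.arange(8 * 4, dtype=np.uint8).reshape(8, 4)  # 8 rows x 4 cols

top    = frame[0::2]   # rows 0, 2, 4, 6
bottom = frame[1::2]   # rows 1, 3, 5, 7
print(top.shape, bottom.shape)            # (4, 4) (4, 4)
print(int(top[0, 0]), int(bottom[0, 0]))  # 0 4 -> one full row apart
```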
Combed frames can be flagged (but comb detection is not necessarily 100% accurate - there are different types of "comb" patterns), but I don't know of a good way to process mixed VFR cadence with interpolation properly and automatically. Quote:
I have a feeling you haven't encountered "problem" DVD's, such as some low budget DVD's, multi-format converted sources, some anime DVD's. These have many layers of problems on top of what you describe - and are what many posts in the avisynth forum deal with. (There is no one-click solution for those situations; they need human eyes and custom script solutions.) |
17th October 2021, 01:32 | #7 | Link | |
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Quote:
I know you're very knowledgeable and have given video much thought. May I share my views? That you mix frame rate and picture rate does not surprise me. The MPEG engineers do the same. If there's one small contribution I can make, I would like it to be my nomenclature to separate frame rate and picture rate. Let me give examples and see what you think of it.
'24pps' denotes 24 pictures per second.
'23.9fps' denotes 24000 frames per 1001 seconds.
'23.9fps[24pps]' denotes cinema-to-video that runs slow by 1 part in 1000 -- running time is +3.6 seconds per hour. (The brackets essentially mean 'contained'.)
'72fps[24pps]' denotes cinema that's essentially triple-shuttered.
'2-3[24pps]' denotes 2-3 pull-down cinema that yields the equivalent of 30pps.
'29.9fps[2-3[24pps]]' denotes hard telecine of 2-3 pull-down cinema that runs slow by 3.6 seconds per hour.
'3-2[24pps]', '2-2-2-4[24pps]', etc. denote other pull-downs.
'59.9sps' denotes NTSC. '50sps' denotes PAL.
'29.9fps[59.9sps]' & '25fps[50sps]' denote interlaced digital-NTSC & -PAL.
'120fps[59.9sps]' denotes interlaced digital-NTSC that's doubled and runs fast by 3.596403.. seconds per hour.
'120fps[120pps]' denotes cinema that's been 1-to-5 interpolated to 120pps and put in frames on a 1-to-1 basis.
Using this notation, I haven't encountered any video situation that can't be compactly characterized, with one exception: the DVD of the movie "PASSION FISH". That DVD feature has sequences of 2 combed frames alternating with an odd number of progressive frames. The number of progressive frames is between 5 and 71, is always odd, and the alternation has no discernible repetition pattern.
You cite an example: "59.94 fields/s interlaced content sections - where this is motion in every field, with 29.97p sections, 23.976p sections, or with 14.985p sections".
If they are characterized by '29.9fps[59.9sps]', '29.9fps[29.9pps]', '29.9fps[24pps]' and '29.9fps[14.9pps]', then they are all contained in 29.9fps frames and therefore form a CFR stream. Is that what you intended? PS: In the same way that a TS contains PSs (i.e. frames), a notation that has frames containing pictures is, I think, a useful and consistent extension. Last edited by markfilipak; 17th October 2021 at 02:09. |
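For readers who want to experiment with the notation, here is a hypothetical parser sketch; the grammar is my guess at the rules given above, not a spec markfilipak published:

```python
import re

# Guessed grammar for the proposed notation: parse strings such as
# '29.9fps[2-3[24pps]]' into (frame rate, cadence, inner rate, unit).
# PATTERN and parse() are illustrative names, not from the thread.

PATTERN = re.compile(r'^([\d.]+)fps\[(?:([\d-]+)\[)?([\d.]+)(pps|sps)\]?\]?$')

def parse(notation):
    m = PATTERN.match(notation)
    if not m:
        raise ValueError(notation)
    fps, cadence, rate, unit = m.groups()
    return float(fps), cadence, float(rate), unit

print(parse('23.9fps[24pps]'))        # (23.9, None, 24.0, 'pps')
print(parse('29.9fps[2-3[24pps]]'))   # (29.9, '2-3', 24.0, 'pps')
print(parse('29.9fps[59.9sps]'))      # (29.9, None, 59.9, 'sps')
```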
|
17th October 2021, 02:46 | #8 | Link | ||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
I'm just distinguishing between the content frame rate vs. encoded frame rate or field rate. The frame rate is 29.97 , and the field rate is 59.94 for all interlaced encoded NTSC DVD's for the encoded stream. But that is not necessarily reflective of what the content frame rate truly is. Quote:
What people have been using for years are descriptions like "29.97i" for interlaced content, "29.97p in 29.97i", "23.976p in 29.97i", "14.985p in 29.97i" (most people would abbreviate as "15p in 29.97i"), and pN for progressive native, such as 23.976pN, 24pN, 29.97pN. How would you distinguish between a video 14.985p in 29.97p (encoded progressively) vs. 14.985p in 29.97i (encoded interlaced as fields)? Or is that what sps is for? You never fully explained what the sps letters stand for. So would it be "29.9fps[14.9pps]" vs "59.9sps[14.9pps]"? And 14.985p in 59.94p would be "59.94fps[14.9pps]"? Last edited by poisondeathray; 17th October 2021 at 03:24. |
17th October 2021, 05:18 | #9 | Link | |||||
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Quote:
Quote:
"14.985p in 29.97i (encoded interlaced as fields)?" Well, if I understand correctly, that would be 14.985 pictures per second, deinterlaced (so, 29.970sps), then interlaced and encoded at 29.970 frames per second -- though that doesn't make much sense to me. So I guess that'd be '29.970fps[1-1[14.985pps]]'.
My problem is with the meaning of "14.985p in 29.97i". I'm probably misunderstanding because, you see, I have a problem understanding what most folks mean by the word "interlace". For example, you wrote "encoded interlaced as fields". That, to me, is a contradictory statement -- an encoding is either frame-based (interlaced) or field-based (not interlaced). Since field encoding is not interlaced, "encoded interlaced as fields" confuses me.
By "interlaced", most folks mean 2 temporal scans woven together to form (combed) pictures, but in the macroblocks the picture data is field-based, not interlaced -- the interlace occurs in the decoder, not in the stream, and the metadata are instructions to the decoder, not a statement of what the macroblock format is. So what most people call "interlaced" video is actually not interlaced. The MPEG engineers solve this problem by calling such streams "interleaved", not "interlaced" -- in fact, when you read the various MPEG specs, you won't find the word "interlace" anywhere. Thus, an interlaced video means that the decoder is required to interlace the fields (i.e. the scans), not that the stream consists of interlaced data. In other words, an interlaced video is actually non-interlaced. Confusing, isn't it?
The key to my understanding is that the metadata -- 'progressive_sequence', 'picture_structure', 'top_field_first', 'repeat_first_field', 'progressive_frame' -- are all instructions to decoders. However, most folks interpret them as stream data states instead. So I've learned that when someone refers to "interlaced video" they mean a video that needs to be interlaced.
I've likewise learned that when most folks refer to "deinterlace" (as, for example, a deinterlace filter), they mean a process that deinterlaces as part of the process, not as an input or an output format. I even have a problem with the word "filter", because many of the processes that are called filters do absolutely no filtering (i.e. no separating, no sorting, no routing -- even a process that weaves is called a "filter"). They are not filters but, instead, are processes. But I've learned that the word "filter" in video can mean almost anything, usually denoting a position in a processing pipeline (such as 'filter_complex'), not what I learned in electrical engineering courses. I tried to discuss such stuff in the ffmpeg-user mailing list and was attacked as stupid and knowing nothing. Quote:
I use "scans" instead of "fields" because "scan" can be abbreviated as "s" whereas "field" presents an obvious problem. Also, I think the word "scan" is more descriptive of the camera (television) used to make pictures. Quote:
Quote:
PS: Just a moment. Upon rereading I see you wrote "59.9sps[14.9pps]". What do you mean by that?
PPS: You're (naturally) asking me about a lot of things I haven't considered in my workflows. For example, to be consistent, 59.9fps[14.9pps] would be 14.9pps framed at 59.9fps, i.e. sped up by 4x. 59.9fps[8[14.9pps]] would be 14.9pps with 8 fields per picture contained in 4 frames, i.e. 14.9pps simply quadruple-shuttered -- '8' specifies a type of telecine, in the same way that '2-3' is telecine, except that '8' doesn't change the cadence; it just uses the 8 fields (1 field-pair is original + 3 field-pairs are copies) to fill out the 4 frames. So, no speed-up, just 4x shutter. Last edited by markfilipak; 17th October 2021 at 05:44. |
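The decoder-side weave described above -- two field scans interleaved row-by-row into one frame -- can be sketched in a few lines of numpy (field contents are fabricated for illustration):

```python
import numpy as np

# "Interlace" as the decoder-side weave the post describes: two field
# scans stored separately are woven row-by-row into one frame.

h2, w = 3, 4
top    = np.full((h2, w), 1, dtype=np.uint8)   # earlier scan
bottom = np.full((h2, w), 2, dtype=np.uint8)   # later scan

frame = np.empty((h2 * 2, w), dtype=np.uint8)
frame[0::2] = top      # even rows from the top field
frame[1::2] = bottom   # odd rows from the bottom field
print(frame[:, 0].tolist())   # [1, 2, 1, 2, 1, 2]
```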
17th October 2021, 06:03 | #10 | Link |
Registered User
Join Date: May 2011
Posts: 321
|
Oh man, just hop on the wagon and use the terminology that is used in the neighborhood -- there is a reason why. In DVD and broadcast, if interlaced, it is all about fields. Everyone knows what the delivery fps is and what the actual fps is, which is most of the time "dug out" from there. Not sure why you suddenly decided to attack the terminology. That leads absolutely nowhere.
I'd be worried about that 24fps to 120fps. It is just insane. You are going to create tons of artifacts that do not belong in the video, and you will be storing them. |
17th October 2021, 06:38 | #11 | Link |
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
@poisondeathray
I tried to include (attach?) a '.jpg' showing 29.9fps[2-1-2-5[24pps]] from an "ALL ABOUT EVE" BD bonus feature (i.e. 00390.m2ts, frames 1018-1029). It may be approved and show up in a future post to this thread. Also, I have a '.jpg' showing 29.9fps[1-4-4-4..[24pps]+2-3[24pps]] from the title screens of "28 DAYS" (i.e. VTS_02_1.VOB). It's a mix of a badly edited background (the 1-4-4-4..[24pps] part) with overlaid text (the 2-3[24pps], i.e. normally telecined, part that apparently was added later). If the "ALL ABOUT EVE" post shows up here, I'll post the "28 DAYS" jpeg. Both of them are terribly hard to describe, but using the notation they are precisely and unambiguously characterized. They provide excellent examples of the power of the notational system. |
17th October 2021, 06:48 | #13 | Link | ||
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Quote:
Good to hear from you. You know that, if you don't like what I write -- totally understandable -- you're free to ignore it, eh? Quote:
|
17th October 2021, 06:54 | #14 | Link | |
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Quote:
I can be patient. Also, I like to have the jpegs in the thread. That's more convenient, and it makes the posting look so cool. ...Like I know what I'm 'talking' about. |
|
17th October 2021, 16:27 | #15 | Link | |||||||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
If you removed the duplicates you'd have the original, and timestamps would show a delta of ~66.733ms for that section. Different sections might have different "pps" - hence the usefulness of "VFR" and timestamps that Paul alluded to (? or Elon... WTF). Timestamp VFR means you can have unique frames and content that runs at the original rate (or each frame is displayed for the proper amount of time). You don't need wasteful frame or field repeats - that archaic system was imposed on us by broadcast engineers to comply with NTSC. Quote:
This is 14.985p original content encoded for NTSC DVD. Hard telecined, if you will, so there would be frame duplicate pairs if you were examining frames. Nothing else is done to the content. The encoding type is interlaced (in MPEG2, alternate scan would be used) instead of progressive (in MPEG2, zig-zag scan; but progressive would be soft telecined with repeat field flags for DVD compatibility). The metadata "flagging" will be different: one will be interlaced, the other progressive. The flagging and metadata have potential implications for other programs and how the stream is handled. (And there is MBAFF too, but we will avoid that for the moment.) Quote:
You're correct - it is field encoding, frame encoding, or mixed field and frame macroblocks (MBAFF). "Interlace", "deinterlace", and "filter" can mean different things to different people. Everyone might be on a different page. People and official organizations make up and change terms too. For example, "PAR" or pixel aspect ratio in MPEG2 is now called "SAR" or sample aspect ratio in MPEG4 terminology. It's all there in the ITU specs. There is no PAR anymore in modern formats. Or "29.97i" and "25i" is the original official notation that many organizations like broadcasters and the EBU use, but Sony, Adobe, and a bunch of other companies now call it "59.94i" and "50i." Maybe their marketing team thought a higher number would sell more cameras. You'll never get everyone to agree on nomenclature; just do the best you can to describe whatever it is. Quote:
For example 720p59.94 broadcast channels. The encoded stream is 59.94p. The content frame rate (pps if you want) in a given section might be 14.985fps consisting of 4x frame repeats. If you decimated the duplicates, you would have the original 14.985fps Quote:
Quote:
I don't like the term "sped up", because that implies a speed change (it is a speed change, but with duplicates - someone might interpret that differently; it's a point that might cause confusion). 14.985p native content (unique frames) appears the same as 14.985p in 59.94p (4x frame repeats) on a 59.94Hz display. The former has 4x fewer frames and is more efficient in terms of encoding. |
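The ~66.733ms delta mentioned earlier in this post follows directly from treating 14.985p as 59.94p with 4x repeats; a quick check with exact fractions:

```python
from fractions import Fraction

# 14.985p content carried in 59.94p as 4x repeats: after decimating the
# duplicates, successive timestamps differ by ~66.733 ms.

encoded = Fraction(60000, 1001)    # 59.94 frames per second
content = encoded / 4              # 14.985 unique pictures per second
delta_ms = 1000 / content          # display time per unique picture
print(content)                     # 15000/1001
print(float(delta_ms))             # 66.733...
```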
18th October 2021, 00:40 | #16 | Link | ||||
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Quote:
Bear with me, this is at the core of our miscommunication regarding the notation. Trust me, it's simpler than you think. 14.985pps framed by the camera would be 14.985fps[14.985pps]. 14.985pps with 4x repeats would be 59.940fps[8[14.985pps]] -- pictures have 2-to-8 field telecine, so they would run at 1x but are actually 4x shuttered. 14.985pps with 4x speed up would be 59.940fps[14.985pps] -- pictures would run fast by 4x (running time would be 1/4). Quote:
Give me more use cases and I'll give you more examples of the notation. To cite one of the most common use cases: 29.970fps[2-3[24pps]] is cinema (24pps) that's 2-3 telecined (8-to-10 field telecine, so 30 telecined pictures per second) inside 29.970fps (telecined picture rate = frame rate, so no speed up). Well, actually 2-3[24pps] runs slow by 29.970/30 (a 'feature' that the notation exposes unambiguously). If I write this conversion: 23.976fps[24pps] --> 29.970fps[2-3[24pps]], isn't that an easier, more compact, and more precise way to characterize 2-3 telecine that runs x/1.001 slow than by using words to explain it?
Another example, this time interpolating to a higher picture rate: 24pps --> 120pps is 1-to-5 picture interpolation. 24fps[24pps] --> 120fps[120pps] is the same interpolation, but contained in frames. 24fps[120pps] would be x/5 slow motion.
You see, when you write "24p", I can't tell whether you mean 24fps (a frame rate) or 24pps (a picture rate). But if I write "24fps", you know that's a frame rate. If I write "24pps", you know that's a picture rate. If I write "24fps[24pps]", you know that's 24 pictures per second in frames running at 24 frames per second, and you can conclude that the pictures are shown at normal rate. See? Simple, eh? Restricting communication to just "p" and "i" for everything leads to frame v. picture confusion that breaks understanding. Let me state it another way so you understand: when you write "24p", you know what you mean (by its context), but I don't know what you mean because I don't yet understand the context -- the context is the thing you're trying to explain; otherwise, no one would misunderstand anything, eh? Using terms that rely on context when trying to explain the context (the use case) leads to frustration for both of us. Quote:
By "a 29.97i stream", I think you mean 29.970 frames per second, right? Or do you actually mean 29.970 fields per second? ...At this point, I really don't know.
1 - If you mean 29.970 frames per second, then that's 29.970fps[??????pps].
2 - But if you mean 29.970 fields (i.e. scans) per second, then that's ??????fps[29.970sps].
And by "14.985 fps content", I think you mean 14.985 frames per second, right? Or do you actually mean "content" (meaning: pictures/scans)? ...At this point, I really don't know.
A - If you really do mean 14.985 fps, then that's simply 14.985fps[??????pps].
B - But if you really do mean content, then that's either ??????fps[14.985pps] or ??????fps[14.985sps].
And to be clear, by "picture" I mean 720x480, for example, and by "scan" I mean 720x240, for example. (Now, technically, if the original frame's picture is progressive and has been separated into fields (deinterlaced, if you wish), then the result is actually half-pictures, not scans, but I'm going to ignore that detail for the time being because what you wrote includes the letter "i", which leads me to believe you're referring to scans. Okay?)
So when attempting to understand what you wrote, "29.97i stream, with 14.985 fps content", I'm presented with 4 possibilities:
1A - The union of 29.970fps[??????pps] and 14.985fps[??????pps], or
1B - The union of 29.970fps[??????pps] and ??????fps[14.985sps], or
2A - The union of ??????fps[29.970sps] and ??????fps[14.985sps], or
2B - The union of ??????fps[29.970sps] and ??????fps[14.985sps].
Well, I can immediately toss out 2A and 2B because they are the same, and they both cite 'sps' with no 'fps', and those 'sps's conflict. And I can toss out 1A because a video can't be both 29.970fps and 14.985fps at the same time. That leaves me with this:
1B - The union of 29.970fps[??????pps] and ??????fps[14.985sps].
At this point I think you refer to a video made by a field-scan camera (maybe a TV camera or camcorder) producing digital-NTSC frames. Am I right?
In that case, then the notation is 29.970fps[14.985sps]. If that is the case, and since the normal frame rate for 14.985sps would be 7.4925fps, then I'd say that you're referring to an interlaced video taken at 14.985sps that is sped up by 4x. How'd I do? |
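As a worked example of the '29.970fps[2-3[24pps]]' case discussed in this exchange, the 2-3 pulldown field accounting can be tallied in a few lines (a sketch, not anyone's production code):

```python
# 2-3 pulldown accounting: every pair of film pictures becomes 2+3 = 5
# fields, so 24 pictures -> 60 fields -> 30 frames, then the whole
# stream runs at the NTSC 1000/1001 rate.

pictures = 24
pattern = [2, 3]                          # fields emitted per picture
fields = sum(pattern[i % 2] for i in range(pictures))
frames = fields // 2
print(fields, frames)                     # 60 30
ntsc_frames = frames * 1000 / 1001        # the "runs slow" factor
print(round(ntsc_frames, 3))              # 29.97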
18th October 2021, 02:19 | #17 | Link | ||||||||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
Quote:
There is definitely merit to the notation - but trust me - you're going to confuse many people. If I were you, I would probably write a guide or "FAQ" with common examples - so people can understand what you mean. The rest of the world is probably fine with the "15p in 60p" or "23.976p in 29.97i" style notation. It's easier and it already works. People don't like change. Like PAR to SAR, 29.97i to 59.94i. Video people on forums generally know what it means, because they deal with processing of various video. Quote:
Quote:
Quote:
But there is nothing wrong with explaining in different ways, or different words - I welcome it (but others might not as you've seen on some forums) Quote:
"in 29.97i" means the stream is encoded as fields at 59.94 fields/second. "in 29.97p" means the stream was encoded as frames at 29.97 frames/s "in 59.94p" means the stream was encoded as frames at 59.94 frames/s "i" means field encoding, "p" means frame encoding. 14.985p in 29.97i can only mean 1 thing to people here (and most that work with video) 14.985p in 59.94p can only mean 1 thing to people here (and most that work with video) Quote:
Quote:
|
20th October 2021, 03:34 | #18 | Link | |
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Quote:
PAR, "picture aspect ratio" -- The MPEG folks never use the acronym "PAR", but they do call the macroblock data "picture". SAR, "sample aspect ratio" -- That's what the MPEG folks call it. Now, bear with me... In the case of PAR & SAR, it doesn't really matter. DAR (display AR) = PAR (picture AR) x SAR (sample AR), or DAR (display AR) = PAR (pixel AR) x SAR (storage AR). The equation is the same in either case. But I've seen this: DAR (data AR), and that is wrong, but the people who write it defend it. That's not really the same type of issue as with fps vs. pps/sps, is it? 29.97i might be interpreted as 29.97fps with scan interlaced fields, or 59.94i might be interpreted as 59.94 scans per second in 29.97fps frames (but not necessarily: they could be in 119.8fps frames or in 10fps frames). Big difference there. 29.97fps[59.94sps] is unambiguous. It specifies both frame rate and original scan rate, and does so in a way that can't be misinterpreted, eh? Last edited by markfilipak; 20th October 2021 at 03:36. |
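Whichever expansion of the acronyms one prefers, the identity itself is easy to sanity-check. For example, a 16:9 NTSC DVD (assuming the MPEG-2 32/27 sample aspect ratio for the full 720-sample width):

```python
from fractions import Fraction

# Display AR = stored frame's width:height ratio x per-sample AR.
# Worked for a 16:9 NTSC DVD (assumed 32/27 sample AR, full 720 width).

width, height = 720, 480
frame_ar   = Fraction(width, height)     # 3/2
sample_ar  = Fraction(32, 27)            # MPEG-2 16:9 NTSC sample AR
display_ar = frame_ar * sample_ar
print(display_ar)                        # 16/9
```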
|
20th October 2021, 04:35 | #19 | Link | ||
Registered User
Join Date: Sep 2007
Posts: 5,346
|
Quote:
DAR = FAR x SAR Quote:
Yes, it is a better notation style, no argument here. The problem is there are 2 people in the world who know what it means. It's a bit confusing at first. I know now, so I can help translate what you say, or help translate what other people say to you. In your mind it's unambiguous, but you're assuming someone took time to learn it. That notation might be interpreted as 29.97fps content in 59.94 fields/s, or does it mean actual interlaced 59.94 fields/s content with a framerate of 29.97? It's quite easy to misinterpret the first time you see it. The simpler your "readme" or "faqs" are, with examples, the higher the chance that someone might take time to figure out what you're trying to say. It might end up like "betamax" vs. "vhs", where betamax was technically superior but wasn't as popular. People here (or on other video related forums) tend to use the "content" description as your "pps", or just describe it in words. Last edited by poisondeathray; 20th October 2021 at 04:44. |
20th October 2021, 04:51 | #20 | Link | |
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 277
|
Well, thank you poisondeathray. That means a lot to me.
Quote:
Actually, I've seen other folks using the notation in the last few weeks. If it's going to catch on, better that it catch on slowly. We both prefer showing by examples over pedantic explanations. Give me some simple and some difficult use cases in words and I'll try to 'translate' them, eh? That will help me. And other people who read this will catch on. People are pretty smart, especially people here. PS: With the possible exception of implementing finite state machines in software, anything that requires much explanation isn't worth a sh*t. Last edited by markfilipak; 20th October 2021 at 04:55. |
|