Inspecting a pixel or a row of pixels at a specific coordinate


groucho86
20th November 2020, 18:56
What's the most streamlined approach to checking the RGB values of a pixel at a predetermined XY coordinate?

I could apply a crop, but I'm wondering if there's a better way.

Thank you!

feisty2
20th November 2020, 19:28
you could probably write a filter for that, really simple stuff, just a few lines of code

auto ArrayToText = [](auto x) {
    auto Text = ""s;
    for (auto y : x)
        Text += std::to_string(y) + " ";
    return Text;
};

LogMessage(ArrayToText(std::array{ SourceFrame[0][y][x], SourceFrame[1][y][x], SourceFrame[2][y][x] }));

groucho86
20th November 2020, 19:35
Thanks feisty2, I should have specified I'd like to do this in python.

feisty2
20th November 2020, 19:39
there's get_read_array() for VideoFrame in python that provides the same functionality, but it would probably be very slow if you do this very often

ChaosKing
20th November 2020, 22:49
Not exactly what you wanted, but there is a "block based filter" for vs https://github.com/kewenyu/BlockEvaluatedFilter-for-Vapoursynth

_Al_
20th November 2020, 23:25
As was said, using Python is fine for reading one pixel value at a time (from the mouse position on screen, etc.) or a few. But if you have whole segments and want to do something with them, you are better off using numpy. You are using RGB, as you said, so it is ready for that. (That is, if there is no DLL or vs function you could use instead, which would be much faster.)

For any vapoursynth format you can use this function:

import vapoursynth as vs

def get_pixel_value(f, position):
    if not isinstance(f, vs.VideoFrame):
        raise TypeError('get_pixel_value(): first argument needs to be vs.VideoFrame type')

    if not isinstance(position, (tuple, list)):
        raise TypeError('get_pixel_value(): second argument needs to be a pixel position, it must be a tuple with integers, example: (245,405)')

    x, y = position

    try:
        planes = [f.get_read_array(i) for i in range(f.format.num_planes)]
    except:
        pass

    p0, p1, p2 = (None, None, None)

    if f.format.name == 'CompatYUY2':    # interleaved COMPATYUY2, two pixels share U and V
        try:                              # values seem to be in 2-byte packs: YU, YV, ...
            pack = planes[0][y][x]
            p0 = pack & 0xFF              # Y is the low byte
            p1 = (pack >> 8) & 0xFF       # U or V is the high byte
            if x % 2 == 0:                # got YU pack, V is in the next pack
                pack = planes[0][y][x+1]
                p2 = (pack >> 8) & 0xFF
            else:                         # got YV pack, U is in the previous pack
                p2 = p1
                pack = planes[0][y][x-1]
                p1 = (pack >> 8) & 0xFF
        except:
            p0, p1, p2 = (None, None, None)

    elif f.format.name == 'CompatBGR32':  # interleaved COMPATBGR32, 1 pixel = BGRA = 4-byte pack
        try:
            pack = planes[0][f.height-1 - y][x]   # COMPATBGR32 is vertically flipped
            p2 = pack & 0xFF              # B
            p1 = (pack >> 8) & 0xFF       # G
            p0 = (pack >> 16) & 0xFF      # R
        except:
            p0, p1, p2 = (None, None, None)

    else:                                 # planar videos
        try: p0 = planes[0][y][x]
        except: p0 = None
        ys = y >> f.format.subsampling_h  # chroma planes are reduced if subsampling
        xs = x >> f.format.subsampling_w
        try: p1 = planes[1][ys,xs]
        except: p1 = None
        try: p2 = planes[2][ys,xs]
        except: p2 = None

    return p0, p1, p2
I took it out of view.py, which I am overhauling a bit now, adding zoom on the mouse wheel and a bunch of practical stuff. I am also making get_pixel_value() a static function so it can be called from anywhere, like I posted.
You can call that function like this, for example for an RGB clip:
R,G,B = get_pixel_value(RGB_clip.get_frame(25645), (100,200))
or a YUV clip:
Y,U,V = get_pixel_value(YUV_clip.get_frame(25645), (100,200))
or a gray clip:
n=25645
f = GRAY_clip.get_frame(n)
Y,_,_ = get_pixel_value(f, (100,200))

or from within ModifyFrame(), which might be useful too, where f is already there for you for each frame:
Y,U,V = get_pixel_value(f, (100,200))
You can loop over positions to read a row, etc.
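
Since the original question also asked about a row of pixels, here is a minimal sketch of that loop. It assumes only that a plane can be indexed as plane[y][x]; the helper name get_row_values and the nested list standing in for real frame data are made up for illustration and are not part of VapourSynth.

```python
def get_row_values(plane, y, x_start=0, x_end=None):
    """Return the values of row y from x_start (inclusive) to x_end (exclusive).

    `plane` is any 2D array indexed as plane[y][x]. Hypothetical helper,
    not a VapourSynth function.
    """
    row = plane[y]
    if x_end is None:
        x_end = len(row)
    return [row[x] for x in range(x_start, x_end)]


# Stand-in plane with 3 rows and 4 columns (not real frame data).
plane = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
]

print(get_row_values(plane, 1))        # the whole second row
print(get_row_values(plane, 2, 1, 3))  # columns 1 and 2 of the third row
```

As noted in this thread, a per-pixel Python loop like this is only reasonable for a handful of values.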

get_pixel_value() always returns what it can, as a tuple. If the clip is gray only, it returns (value, None, None).
But beware, it's Python, so it is slow if you try to get lots of pixels from a frame.
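
The masking and shifting in the CompatBGR32 branch of the function above can be tried out on plain integers. This is only a sketch that follows the byte order the function assumes (B in the lowest byte, then G, then R); the name unpack_bgr32 is invented for the example.

```python
def unpack_bgr32(pack):
    """Split one packed 32-bit pixel into (R, G, B).

    Assumes B sits in the lowest byte, G in the next and R in the third,
    as in the CompatBGR32 branch above. Hypothetical helper name.
    """
    b = pack & 0xFF            # keep only the lowest 8 bits
    g = (pack >> 8) & 0xFF     # drop 8 bits, then keep the lowest 8
    r = (pack >> 16) & 0xFF    # drop 16 bits, then keep the lowest 8
    return r, g, b


# Build a test value with R=0x11, G=0x22, B=0x33 and take it apart again.
pack = (0x11 << 16) | (0x22 << 8) | 0x33
print(unpack_bgr32(pack))
```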

feisty2
20th November 2020, 23:41
Well, you don’t have to try everything...

_Al_
20th November 2020, 23:50
... it's Python :-) I think trying the heck out of everything is OK: if there is a problem with one plane, the following plane still gives a reading, etc.

Also, I might add, it mimics human logic: if things go wrong, do this or that, or better, keep whatever you have so far and get out of the block.
For example, when creating a list from a frame that is grayscale, the first plane is loaded but not the second or third, and it does not error. There is no need to check what kind of video is in the frame. Python is not fast, so rather than the five lines of C or C++ that check everything first, I go straight to try: with only the minuscule overhead of one try:
try:
    planes = [f.get_read_array(i) for i in range(f.format.num_planes)]
except:
    pass
so if frame f is just one gray plane, it creates a legit planes list with only the Y plane and then leaves the try/except block

OK, BAD example :-) In the case of a GRAY clip, f.format.num_planes is just 1, so no exception is thrown. So you are right, I guess there is no point in trying anything there. At the moment I cannot remember any other reason why I put it in. I will check the following code to see if some other try: could go away too.

groucho86
21st November 2020, 02:02
Wow, very comprehensive. Will take a beat to digest your code sample... Thanks _Al_!!

_Al_
21st November 2020, 02:19
Those MemoryView 'array' objects in vapoursynth use the same coordinates as numpy, so getting a value by coordinates will feel familiar if you come across numpy array calculations.

it uses y rows first, then x,
so if there is an array of a plane called plane, then accessing value on 3rd row and 10th column would be: value = plane[3][10], or plane[3,10]

EDITED: my mistake, sorry. Indexing obviously starts from zero, so the 3rd row and 10th column would be: plane[2][9].
Top, left pixel in array would be located as: plane[0][0]

numpy and vs both put the y coordinate first, then x. It is kind of swapped, because a pixel position is usually written as (x,y).
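
A small sketch of that row-first indexing, with a nested list standing in for a plane. The helper value_at is a made-up name; it just shows the swap from an (x, y) position to [y][x] indexing.

```python
# Stand-in plane: 3 rows (y) by 4 columns (x); each value encodes its
# own position as y*10 + x so the lookup is easy to check by eye.
plane = [[y * 10 + x for x in range(4)] for y in range(3)]


def value_at(plane, position):
    """Look up a value from an (x, y) position; note the swap to [y][x]."""
    x, y = position
    return plane[y][x]


top_left = plane[0][0]                           # first row, first column
third_row_second_col = value_at(plane, (1, 2))   # x=1, y=2
print(top_left, third_row_second_col)
```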

A video has three planes, so instead of one plane, all three can be kept in one list variable and used as planes[0], planes[1] and planes[2].

in python you can generate list in one line:
planes = [f.get_read_array(i) for i in range(f.format.num_planes)]

this is the same as this:
planes = []
for i in range(f.format.num_planes):
    planes.append(f.get_read_array(i))

Those one-liners have their advantages: they create a list, or a generator instead if you use parentheses () in place of the square brackets []. Python is cool.
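
To illustrate the difference between the two bracket styles, here is a short sketch with plain numbers instead of planes. The variable names are only for the example.

```python
values = [3, 7, 11]

# Square brackets: the whole list is built right away and can be
# indexed and reused as often as needed.
doubled_list = [v * 2 for v in values]

# Parentheses: a generator that produces items lazily, one at a time,
# and can only be walked through once.
doubled_gen = (v * 2 for v in values)

first = next(doubled_gen)       # consumes the first item
remaining = list(doubled_gen)   # collects whatever is left
print(doubled_list, first, remaining)
```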

Also,
xs = x >> f.format.subsampling_w
is a very cool operation that I started to use after seeing code from the big boys. It scales a luma coordinate down to the subsampled chroma plane. Instead of fishing out what the subsampling is, working out what number to divide by and then using clip.width/divisor, you can use subsampling_w and subsampling_h directly; they are zero if there is no subsampling, 1 if the plane is halved, etc. A bit shift is applied to the binary value, where something>>1 is basically division by two (something//2). It is cool because shifting discards the last bit, whether it is 1 or 0. If the value is 5, binary 101, then 5>>1 gives 2, which is binary 10. So 4>>1 and 5>>1 give the same subsampled coordinate. In video, the pixels at luma coordinates x=4 and x=5 share the same chroma value at chroma coordinate x=2 if subsampling_w = 1.
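
The shift can be checked with plain integers. A minimal sketch, assuming subsampling values of 0 (none) or 1 (halved) as described above; chroma_coords is a hypothetical helper name, not something from VapourSynth.

```python
def chroma_coords(x, y, subsampling_w, subsampling_h):
    """Map a luma position to the matching position in a chroma plane.

    Shifting right by n is the same as integer division by 2**n.
    Hypothetical helper for illustration only.
    """
    return x >> subsampling_w, y >> subsampling_h


# With both axes halved (as in 4:2:0), neighbouring luma pixels land on
# the same chroma sample.
print(chroma_coords(4, 6, 1, 1))
print(chroma_coords(5, 7, 1, 1))

# With no subsampling the coordinates are unchanged.
print(chroma_coords(5, 7, 0, 0))
```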