after I made the plugin
mt I discovered the way avisynth handles VideoFrame is thread unsafe. The problem lies in how the list of VideoFrames are handle in VideoFrame:: operator new(unsigned). Here the first VideoFrame with a refcount of 0 is returned to the VideoFrame constructor. The problem is that the refcount is increased when it reaches the PVideoFrame object so in the mean time another thread could have grapped it. I solved this in mt by wrapping the functions that creates a new videoframe(by calling new VideoFrame) so only one thread at a time can access these function. The following functions calls new VideoFrame:
cache::GetFrame,
ScriptEnvironment::NewVideoFrame(int row_size, int height, int align),
ScriptEnvironment::NewPlanarVideoFrame(int width, int height, int align, bool U_first), (both is used by ScriptEnvironment::NewVideoFrame(const VideoInfo& vi, int align) and ScriptEnvironment::MakeWritable(PVideoFrame* pvf)),
VideoFrame::Subframe .
I suggest the following fixes to avoid it.
Code:
void* VideoFrame::operator new(unsigned) {
// CriticalSection
//optional EnterCriticalSection(&csVideoFrame);
for (LinkedVideoFrame* i = g_VideoFrame_recycle_bin; i; i = i->next)
if (i->vf.refcount == 0)
{
InterlockedIncrement((long*)&i->vf.refcount);
if(i->vf.refcount==1)
return &i->vf;
else
InterlockedDecrement((long*)&i->vf.refcount);
}
//LeaveCriticalSection(&csVideoFrame);
LinkedVideoFrame* result = (LinkedVideoFrame*)::operator new(sizeof(LinkedVideoFrame));
result->next = g_VideoFrame_recycle_bin;
g_VideoFrame_recycle_bin = result;
return &result->vf;
}
for even greater threadsafety CriticalSection could be used (csVideoFrameBuffer should of course be initialized somewhere else)
Code:
VideoFrame::VideoFrame(VideoFrameBuffer* _vfb, int _offset, int _pitch, int _row_size, int _height)
: refcount(1), vfb(_vfb), offset(_offset), pitch(_pitch), row_size(_row_size), height(_height),offsetU(_offset),offsetV(_offset),pitchUV(0) // PitchUV=0 so this doesn't take up additional space
{
InterlockedIncrement(&vfb->refcount);
}
VideoFrame::VideoFrame(VideoFrameBuffer* _vfb, int _offset, int _pitch, int _row_size, int _height, int _offsetU, int _offsetV, int _pitchUV)
: refcount(1), vfb(_vfb), offset(_offset), pitch(_pitch), row_size(_row_size), height(_height),offsetU(_offsetU),offsetV(_offsetV),pitchUV(_pitchUV)
{
InterlockedIncrement(&vfb->refcount);
}
the refcount is just initialized to 1 to avoid the VideoFrame to be recruted by another thread.
Code:
VideoFrame* VideoFrame::Subframe(int rel_offset, int new_pitch, int new_row_size, int new_height) const {
VideoFrame* Retval=new VideoFrame(vfb, offset+rel_offset, new_pitch, new_row_size, new_height);
InterlockedDecrement((long*)&Retval->refcount
return Retval;
}
VideoFrame* VideoFrame::Subframe(int rel_offset, int new_pitch, int new_row_size, int new_height, int rel_offsetU, int rel_offsetV, int new_pitchUV) const {
VideoFrame* Retval=new VideoFrame(vfb, offset+rel_offset, new_pitch, new_row_size, new_height, rel_offsetU+offsetU, rel_offsetV+offsetV, new_pitchUV);
InterlockedDecrement((long*)&Retval->refcount);
return Retval;
}
It is neccesary to decrease the refcount because it is not posible to change avisynth.h without breaking existing filters, this introduce a small window where another thread could grap the VideoFrame (The ideal solution would be to have VideoFrame::Release decrement the refcount to 0 when it hits 1.)
Code:
PVideoFrame __stdcall ScriptEnvironment::NewVideoFrame(const VideoInfo& vi, int align) {
// Check requested pixel_type:
switch (vi.pixel_type) {
case VideoInfo::CS_BGR24:
case VideoInfo::CS_BGR32:
case VideoInfo::CS_YUY2:
case VideoInfo::CS_YV12:
case VideoInfo::CS_I420:
break;
default:
ThrowError("Filter Error: Filter attempted to create VideoFrame with invalid pixel_type.");
}
// If align is negative, it will be forced, if not it may be made bigger
if (vi.IsPlanar()) { // Planar requires different math ;)
if (align>=0) {
align = max(align,FRAME_ALIGN);
}
if ((vi.height&1)||(vi.width&1))
ThrowError("Filter Error: Attempted to request an YV12 frame that wasn't mod2 in width and height!");
PVideoFrame retval=ScriptEnvironment::NewPlanarVideoFrame(vi.width, vi.height, align, !vi.IsVPlaneFirst()); // If planar, maybe swap U&V
InterlockedDecrement((long*)&retval->refcount);
return retval;
} else {
if ((vi.width&1)&&(vi.IsYUY2()))
ThrowError("Filter Error: Attempted to request an YUY2 frame that wasn't mod2 in width.");
if (align<0) {
align *= -1;
} else {
align = max(align,FRAME_ALIGN);
}
PVideoFrame retval=ScriptEnvironment::NewVideoFrame(vi.RowSize(), vi.height, align);
InterlockedDecrement((long*)&retval->refcount);
return retval;
}
}
same story again.
Code:
bool ScriptEnvironment::MakeWritable(PVideoFrame* pvf) {
const PVideoFrame& vf = *pvf;
// If the frame is already writable, do nothing.
if (vf->IsWritable()) {
return false;
}
// Otherwise, allocate a new frame (using NewVideoFrame) and
// copy the data into it. Then modify the passed PVideoFrame
// to point to the new buffer.
const int row_size = vf->GetRowSize();
const int height = vf->GetHeight();
PVideoFrame dst;
if (vf->GetPitch(PLANAR_U)) { // we have no videoinfo, so we can only assume that it is Planar
dst = NewPlanarVideoFrame(row_size, height, FRAME_ALIGN,false); // Always V first on internal images
InterlockedDecrement((long*)&dst->refcount);
} else {
dst=NewVideoFrame(row_size, height, FRAME_ALIGN);
InterlockedDecrement((long*)&dst->refcount);
//LeaveCriticalSection(&csVideoFrame);
}
BitBlt(dst->GetWritePtr(), dst->GetPitch(), vf->GetReadPtr(), vf->GetPitch(), row_size, height);
// Blit More planes (pitch, rowsize and height should be 0, if none is present)
BitBlt(dst->GetWritePtr(PLANAR_V), dst->GetPitch(PLANAR_V), vf->GetReadPtr(PLANAR_V), vf->GetPitch(PLANAR_V), vf->GetRowSize(PLANAR_V), vf->GetHeight(PLANAR_V));
BitBlt(dst->GetWritePtr(PLANAR_U), dst->GetPitch(PLANAR_U), vf->GetReadPtr(PLANAR_U), vf->GetPitch(PLANAR_U), vf->GetRowSize(PLANAR_U), vf->GetHeight(PLANAR_U));
*pvf = dst;
return true;
}
Code:
PVideoFrame __stdcall Cache::GetFrame(int n, IScriptEnvironment* env)
{
n = min(vi.num_frames-1, max(0,n)); // Inserted to avoid requests beyond framerange.
__asm {emms} // Protection from rogue filter authors
if (h_policy == CACHE_NOTHING) { // don't want a cache. Typically filters that only ever seek forward.
__asm mov ebx,ebx // Hack! prevent compiler from trusting ebx contents across call
PVideoFrame result = childGetFrame(n, env);
// if (result->vfb) env->ManageCache(MC_ReturnVideoFrameBuffer, result->vfb); // return vfb to vfb pool for immediate reuse
return result;
}
if (h_policy == CACHE_RANGE) { // for filters that bash a span of frames. Typically temporal filters.
PVideoFrame result;
bool foundframe = false;
for (int i = 0; i<h_total_frames; i++) {
if (h_video_frames[i]->status & CACHE_ST_USED) {
// Check if we already have the frame
if (h_video_frames[i]->frame_number == n) {
result = new VideoFrame(h_video_frames[i]->vfb, h_video_frames[i]->offset, h_video_frames[i]->pitch,
h_video_frames[i]->row_size, h_video_frames[i]->height,
h_video_frames[i]->offsetU, h_video_frames[i]->offsetV,
h_video_frames[i]->pitchUV);
foundframe = true;
}
// Check if it is out of scope
else if (abs(h_video_frames[i]->frame_number-n)>=h_total_frames) {
h_video_frames[i]->status |= CACHE_ST_DELETEME;
if (!(h_video_frames[i]->status & CACHE_ST_HAS_BEEN_RELEASED)) { // Has this framebuffer been released?
UnlockVFB(h_video_frames[i]); // We can now release this vfb.
h_video_frames[i]->status |= CACHE_ST_HAS_BEEN_RELEASED;
}
}
}
} // for (int i
if (foundframe) { // Frame was found - build a copy and return a (dumb) pointer to it.
VideoFrame* copy = new VideoFrame(result->vfb, result->offset, result->pitch, result->row_size,
result->height, result->offsetU, result->offsetV, result->pitchUV);
_RPT2(0, "Cache2:%x: using cached copy of frame %d\n", this, n);
InterlockedDecrement((long*)&result->refcount);
InterlockedDecrement((long*)©->refcount);
return copy;
}
else {
__asm mov ebx,ebx // Hack! prevent compiler from trusting ebx contents across call
result = childGetFrame(n, env); // Should be checking the rest of cache
// Find a place to store it.
for (i=0 ;; i++) {
if (i == h_total_frames)
#ifdef _DEBUG
env->ThrowError("Cache2:%x: Internal cache error! Report this!", this);
#else
return result; // Should never happen
#endif
if (h_video_frames[i]->status & CACHE_ST_DELETEME) // Frame can be deleted
break;
if (!(h_video_frames[i]->status & CACHE_ST_USED)) // Frame has not yet been used.
break;
}
_RPT2(0, "Cache2:%x: Miss! Now locking frame frame %d in memory\n", this, n);
}
if (h_video_frames[i]->status & CACHE_ST_USED) ReturnVideoFrameBuffer(h_video_frames[i], env); // return old vfb to vfb pool for early reuse
// Store it
h_video_frames[i]->vfb = result->vfb;
h_video_frames[i]->sequence_number = result->vfb->GetSequenceNumber();
h_video_frames[i]->offset = result->offset;
h_video_frames[i]->offsetU = result->offsetU;
h_video_frames[i]->offsetV = result->offsetV;
h_video_frames[i]->pitch = result->pitch;
h_video_frames[i]->pitchUV = result->pitchUV;
h_video_frames[i]->row_size = result->row_size;
h_video_frames[i]->height = result->height;
h_video_frames[i]->frame_number = n;
h_video_frames[i]->faults = 0;
// Keep this vfb to ourselves!
// 1. This prevents theft by Plan B:
// 2. This make the vfb readonly i.e. MakeWriteable() returns a copy
LockVFB(h_video_frames[i]);
h_video_frames[i]->status = CACHE_ST_USED;
return result;
}
Code:
VideoFrame* Cache::BuildVideoFrame(CachedVideoFrame *i, int n)
{
Relink(&video_frames, i, video_frames.next); // move the matching cache entry to the front of the list
VideoFrame* result = new VideoFrame(i->vfb, i->offset, i->pitch, i->row_size, i->height, i->offsetU, i->offsetV, i->pitchUV);
// If we have asked for a same stale frame twice, leave frames locked.
if ((fault_rate <= 160) || (fault_rate == 190)) { // Reissued frames are not subject to locking as readily
UnlockVFB(i);
_RPT2(0, "Cache:%x: using cached copy of frame %d\n", this, n);
}
else {
_RPT3(0, "Cache:%x: lock vfb %x, cached frame %d\n", this, i->vfb, n);
}
InterlockedDecrement((long*)&result->refcount);
return result;
}
This is the cache from the current cvs. I'm not total familiar with out put I think that could do.
Also the linked list with VideoFrameBuffer 's can be accessed by two threads at the same time but it seems like it is only ScriptEnvironment::GetFrameBuffer2 that handles the list so putting a EnterCriticalSection/LeaveCriticalSection would be an easy solution.
[edit] that might not be enough because the VideoFrameBuffer is returned with a refcount of 0. So something like the VideoFrame hack above might be needed[/edit]
All this should allow two different threads to work on different clips/videoframes at the same time like mt does it. Because only the scripteviroment interface is exposed user filters can easy overwrite it to handle the GetVar SetVar.[Edit]now the code compiles[/edit]