
View Full Version : fmtconv: resize, bitdepth and colorspace conversions



cretindesalpes
13th May 2015, 14:42
djcj: AvstpFinder should be built only on Windows (please refer to fmtconv/doc/fmtconv.html#compiling). For the other files, what are the errors?

djcj
13th May 2015, 16:57
I got a lot of errors like these: error: ‘__m256’ does not name a type

But compiling the files that refused to build with -mavx2 instead of -msse2 solved it for me.

Here's my modified Makefile.am:
warningflags = -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers
includeflags = -I$(srcdir)/src
commonflags = -O3 $(MFLAGS) $(warningflags) $(includeflags)
AM_CXXFLAGS = -std=c++11 $(commonflags)

lib_LTLIBRARIES = libfmtconv.la

libfmtconv_la_SOURCES = \
src/fmtc/Bitdepth.cpp \
src/fmtc/ConvStep.cpp \
src/fmtcl/BitBltConv.cpp \
src/fmtcl/ChromaPlacement.cpp \
src/fmtcl/CoefArrInt.cpp \
src/fmtcl/ContFirBlackman.cpp \
src/fmtcl/ContFirBlackmanMinLobe.cpp \
src/fmtcl/ContFirCubic.cpp \
src/fmtcl/ContFirFromDiscrete.cpp \
src/fmtcl/ContFirGauss.cpp \
src/fmtcl/ContFirInterface.cpp \
src/fmtcl/ContFirLanczos.cpp \
src/fmtcl/ContFirLinear.cpp \
src/fmtcl/ContFirRect.cpp \
src/fmtcl/ContFirSinc.cpp \
src/fmtcl/ContFirSnh.cpp \
src/fmtcl/ContFirSpline16.cpp \
src/fmtcl/ContFirSpline36.cpp \
src/fmtcl/ContFirSpline64.cpp \
src/fmtcl/ContFirSpline.cpp \
src/fmtcl/DiscreteFirCustom.cpp \
src/fmtcl/DiscreteFirInterface.cpp \
src/fmtcl/ErrDifBuf.cpp \
src/fmtcl/ErrDifBufFactory.cpp \
src/fmtcl/KernelData.cpp \
src/fmtcl/ResampleSpecPlane.cpp \
src/fmtcl/ResizeData.cpp \
src/fmtcl/ResizeDataFactory.cpp \
src/fmtc/Matrix2020CL.cpp \
src/fmtc/Matrix.cpp \
src/fmtc/NativeToStack16.cpp \
src/fmtc/Stack16ToNative.cpp \
src/fstb/CpuId.cpp \
src/fstb/fnc.cpp \
src/fstb/ToolsSse2.cpp \
src/vsutl/FilterBase.cpp \
src/vsutl/fnc.cpp \
src/vsutl/PlaneProcCbInterface.cpp \
src/vsutl/PlaneProcessor.cpp

#if WINDOWS
#libfmtconv_la_SOURCES += src/AvstpFinder.cpp src/AvstpWrapper.cpp
#endif

libfmtconv_la_LDFLAGS = -no-undefined -avoid-version $(PLUGINLDFLAGS)


noinst_LTLIBRARIES = libavx2.la

libavx2_la_SOURCES = \
src/fmtc/Convert.cpp \
src/fmtc/Matrix_avx.cpp \
src/fmtc/Resample.cpp \
src/fmtc/Transfer.cpp \
src/fmtc/Transfer_avx2.cpp \
src/fmtcl/BitBltConv_avx2.cpp \
src/fmtcl/FilterResize.cpp \
src/fmtcl/Scaler.cpp \
src/fmtcl/Scaler_avx2.cpp \
src/fstb/ToolsAvx2.cpp \
src/main.cpp

libavx2_la_CXXFLAGS = $(AM_CXXFLAGS) -mavx2


libfmtconv_la_LIBADD = libavx2.la


edit:

Since the AvstpFinder stuff is only meant for Windows, a quick patch that avoids undefined references on Unix systems:
diff --git a/fmtconv/src/fmtcl/FilterResize.cpp b/fmtconv/src/fmtcl/FilterResize.cpp
index e5b66a7..fc07af4 100644
--- a/fmtconv/src/fmtcl/FilterResize.cpp
+++ b/fmtconv/src/fmtcl/FilterResize.cpp
@@ -61,8 +61,12 @@ namespace fmtcl


FilterResize::FilterResize (const ResampleSpecPlane &spec, ContFirInterface &kernel_fnc_h, ContFirInterface &kernel_fnc_v, bool norm_flag, double norm_val_h, double norm_val_v, double gain, SplFmt src_type, int src_res, SplFmt dst_type, int dst_res, bool int_flag, bool sse2_flag, bool avx2_flag)
+#if defined(__USE_AVSTP)
: _avstp (AvstpWrapper::use_instance ())
, _task_rsz_pool ()
+#else
+: _task_rsz_pool ()
+#endif
/*, _src_size ()
, _dst_size ()
, _win_pos ()
@@ -517,7 +521,9 @@ void FilterResize::process_plane_normal (uint8_t *dst_msb_ptr, uint8_t *dst_lsb_
assert (stride_dst > 0);
assert (stride_src > 0);

+#if defined(__USE_AVSTP)
avstp_TaskDispatcher * task_dispatcher_ptr = _avstp.create_dispatcher ();
+#endif

TaskRszGlobal trg;
trg._this_ptr = this;
@@ -612,19 +618,25 @@ void FilterResize::process_plane_normal (uint8_t *dst_msb_ptr, uint8_t *dst_lsb_
tr._work_dst [d] = work_dst [d];
}

+#if defined(__USE_AVSTP)
_avstp.enqueue_task (
task_dispatcher_ptr,
&redirect_task_resize,
tr_cell_ptr
);
+#endif
} // for Dir_H
} // for Dir_V

+#if defined(__USE_AVSTP)
_avstp.wait_completion (task_dispatcher_ptr);
+#endif

// Done
+#if defined(__USE_AVSTP)
_avstp.destroy_dispatcher (task_dispatcher_ptr);
task_dispatcher_ptr = 0;
+#endif
}


diff --git a/fmtconv/src/fmtcl/FilterResize.h b/fmtconv/src/fmtcl/FilterResize.h
index f49e280..c9a0e2f 100644
--- a/fmtconv/src/fmtcl/FilterResize.h
+++ b/fmtconv/src/fmtcl/FilterResize.h
@@ -42,6 +42,10 @@ http://sam.zoy.org/wtfpl/COPYING for more details.

#include <cstdint>

+#if defined(__CYGWIN32__) || defined(__CYGWIN__) || defined(__MINGW32__) || defined(__MINGW64__) || defined(_MSC_VER) || \
+defined(__WIN32) || defined(__WIN32__) || defined(_WIN32) || defined(__WIN64) || defined(__WIN64__) || defined(_WIN64)
+# define __USE_AVSTP
+#endif


namespace fmtcl
@@ -158,7 +162,9 @@ private:

static void redirect_task_resize (avstp_TaskDispatcher *dispatcher_ptr, void *data_ptr);

+#if defined(__USE_AVSTP)
AvstpWrapper & _avstp;
+#endif
conc::CellPool <TaskRsz>
_task_rsz_pool;

jeremy33
15th May 2015, 17:04
Hello,

Jackoneill and me found a bug : http://forum.doom9.org/showthread.php?p=1722246#post1722246

We found that if you use something like sx=-0.5, sy=-0.5 (eg. clip = c.fmtc.resample(clip, kernel="spline36", sx=-0.5, sy=-0.5)) the image is really bad :
http://img11.hostingpics.net/thumbs/mini_464396screenshot.jpg (http://www.hostingpics.net/viewer.php?id=464396screenshot.jpg)

jackoneill
15th May 2015, 17:18
Hello,

Jackoneill and me found a bug : http://forum.doom9.org/showthread.php?p=1722246#post1722246

We found that if you use something like sx=-0.5, sy=-0.5 (eg. clip = c.fmtc.resample(clip, kernel="spline36", sx=-0.5, sy=-0.5)) the image is really bad :
http://img11.hostingpics.net/thumbs/mini_464396screenshot.jpg (http://www.hostingpics.net/viewer.php?id=464396screenshot.jpg)

It does not happen with r12 (or r8) on a Core 2 Duo.

jeremy33
15th May 2015, 17:33
It does not happen with r12 (or r8) on a Core 2 Duo.
Thanks for the info. I built it myself and it works!
False alarm ;)

jackoneill
15th May 2015, 21:20
Not sure if related to any crashes or garbled output, but I found these out-of-bounds reads:

==13632== Thread 4:
==13632== Invalid read of size 8
==13632== at 0xD64BBC1: _mm_set_epi64x (emmintrin.h:583)
==13632== by 0xD64BBC1: _mm_set_epi64 (emmintrin.h:589)
==13632== by 0xD64BBC1: _mm_loadl_epi64 (emmintrin.h:700)
==13632== by 0xD64BBC1: load_8_16l (ToolsSse2.hpp:93)
==13632== by 0xD64BBC1: read_i16 (ProxyRwSse.hpp:70)
==13632== by 0xD64BBC1: void fmtcl::BitBltConv::bitblt_ixx_to_x16_sse<fmtcl::ProxyRwSse<(fmtcl::SplFmt)1>, fmtcl::ProxyRwSse<(fmtcl::SplFmt)3>, 16, 8>(fmtcl::ProxyRwSse<(fmtcl::SplFmt)1>::Ptr::Type, int, fmtcl::ProxyRwSse<(fmtcl::SplFmt)3>::PtrConst::Type, int, int, int) (BitBltConv.cpp:808)
==13632== by 0xD6541C2: void fmtcl::FilterResize::process_tile_transpose<unsigned short, (fmtcl::SplFmt)1>(fmtcl::FilterResize::TaskRsz const&, fmtcl::FilterResize::TaskRszGlobal const&, fmtcl::ResizeData&, int*, int, fmtcl::FilterResize::Dir&, int&, int*) (FilterResize.cpp:1002)
==13632== by 0xD652C37: fmtcl::FilterResize::process_tile(conc::LockFreeCell<fmtcl::FilterResize::TaskRsz>&) (FilterResize.cpp:691)
==13632== by 0xD5DD75F: AvstpWrapper::fallback_enqueue_task_ptr(avstp::TaskDispatcher*, void (*)(avstp::TaskDispatcher*, void*), void*) (AvstpWrapper.cpp:272)
==13632== by 0xD653186: fmtcl::FilterResize::process_plane_normal(unsigned char*, unsigned char*, unsigned char const*, unsigned char const*, int, int) (FilterResize.cpp:619)
==13632== by 0xD688035: fmtc::Resample::process_plane_proc(VSFrameRef&, int, int, void*, VSFrameContext&, VSCore&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode> const&) (Resample.cpp:842)
==13632== by 0xD6881DA: fmtc::Resample::do_process_plane(VSFrameRef&, int, int, void*, VSFrameContext&, VSCore&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode> const&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode> const&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode> const&) (Resample.cpp:622)
==13632== by 0xD698A93: vsutl::PlaneProcessor::process_frame(VSFrameRef&, int, void*, VSFrameContext&, VSCore&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode>, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode>, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode>) (PlaneProcessor.cpp:283)
==13632== by 0xD687633: fmtc::Resample::get_frame(int, int, void*&, VSFrameContext&, VSCore&) (Resample.cpp:490)
==13632== by 0x6ED8827: VSNode::getFrameInternal(int, int, VSFrameContext&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6EE6E08: VSThreadPool::runTasks(VSThreadPool*, std::atomic<bool>&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x50F4DEF: execute_native_thread_routine (in /usr/lib/libstdc++.so.6.0.20)
==13632== Address 0x199c6f7d is 307,197 bytes inside a block of size 307,200 alloc'd
==13632== at 0x4C2C526: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13632== by 0x4C2C641: posix_memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13632== by 0x6ED7BBD: VSPlaneData::VSPlaneData(unsigned long, MemoryUse&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6ED9955: VSFrame::VSFrame(VSFormat const*, int, int, VSFrame const*, VSCore*) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6ED9C56: VSCore::newVideoFrame(VSFormat const*, int, int, VSFrame const*) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6ED05EC: newVideoFrame(VSFormat const*, int, int, VSFrameRef const*, VSCore*) (in /usr/lib/libvapoursynth.so)
==13632== by 0xD93B664: ??? (in /usr/lib/libffms2.so.3.0.0)
==13632== by 0x6ED8827: VSNode::getFrameInternal(int, int, VSFrameContext&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6EE6E08: VSThreadPool::runTasks(VSThreadPool*, std::atomic<bool>&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x50F4DEF: execute_native_thread_routine (in /usr/lib/libstdc++.so.6.0.20)
==13632== by 0x60A2373: start_thread (in /usr/lib/libpthread-2.21.so)
==13632== by 0x564727C: clone (in /usr/lib/libc-2.21.so)
==13632==
==13632== Invalid read of size 8
==13632== at 0xD65FE82: _mm_set_epi64x (emmintrin.h:583)
==13632== by 0xD65FE82: _mm_set_epi64 (emmintrin.h:589)
==13632== by 0xD65FE82: _mm_loadl_epi64 (emmintrin.h:700)
==13632== by 0xD65FE82: load_8_16l (ToolsSse2.hpp:93)
==13632== by 0xD65FE82: read (ProxyRwSse.hpp:118)
==13632== by 0xD65FE82: process_vect_int_sse2<fmtcl::ProxyRwSse<(fmtcl::SplFmt)1>, 16, fmtcl::ProxyRwSse<(fmtcl::SplFmt)3>, 8> (Scaler.cpp:756)
==13632== by 0xD65FE82: void fmtcl::Scaler::process_plane_int_sse2<fmtcl::ProxyRwSse<(fmtcl::SplFmt)1>, 16, fmtcl::ProxyRwSse<(fmtcl::SplFmt)3>, 8>(fmtcl::ProxyRwSse<(fmtcl::SplFmt)1>::Ptr::Type, fmtcl::ProxyRwSse<(fmtcl::SplFmt)3>::PtrConst::Type, int, int, int, int, int) const (Scaler.cpp:688)
==13632== by 0xD64F836: fmtcl::FilterResize::process_tile_resize(fmtcl::FilterResize::TaskRsz const&, fmtcl::FilterResize::TaskRszGlobal const&, fmtcl::ResizeData&, int*, int, fmtcl::FilterResize::Dir&, int&, int*) (FilterResize.cpp:901)
==13632== by 0xD652CC1: fmtcl::FilterResize::process_tile(conc::LockFreeCell<fmtcl::FilterResize::TaskRsz>&) (FilterResize.cpp:683)
==13632== by 0xD5DD75F: AvstpWrapper::fallback_enqueue_task_ptr(avstp::TaskDispatcher*, void (*)(avstp::TaskDispatcher*, void*), void*) (AvstpWrapper.cpp:272)
==13632== by 0xD653186: fmtcl::FilterResize::process_plane_normal(unsigned char*, unsigned char*, unsigned char const*, unsigned char const*, int, int) (FilterResize.cpp:619)
==13632== by 0xD688035: fmtc::Resample::process_plane_proc(VSFrameRef&, int, int, void*, VSFrameContext&, VSCore&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode> const&) (Resample.cpp:842)
==13632== by 0xD6881DA: fmtc::Resample::do_process_plane(VSFrameRef&, int, int, void*, VSFrameContext&, VSCore&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode> const&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode> const&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode> const&) (Resample.cpp:622)
==13632== by 0xD698A93: vsutl::PlaneProcessor::process_frame(VSFrameRef&, int, void*, VSFrameContext&, VSCore&, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode>, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode>, vsutl::ObjRefSPtr<VSNodeRef, &VSAPI::cloneNodeRef, &VSAPI::freeNode>) (PlaneProcessor.cpp:283)
==13632== by 0xD687633: fmtc::Resample::get_frame(int, int, void*&, VSFrameContext&, VSCore&) (Resample.cpp:490)
==13632== by 0x6ED8827: VSNode::getFrameInternal(int, int, VSFrameContext&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6EE6E08: VSThreadPool::runTasks(VSThreadPool*, std::atomic<bool>&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x50F4DEF: execute_native_thread_routine (in /usr/lib/libstdc++.so.6.0.20)
==13632== Address 0x1b1204bc is 691,196 bytes inside a block of size 691,200 alloc'd
==13632== at 0x4C2C526: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13632== by 0x4C2C641: posix_memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13632== by 0x6ED7BBD: VSPlaneData::VSPlaneData(unsigned long, MemoryUse&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6ED8DDB: VSFrame::VSFrame(VSFormat const*, int, int, VSFrame const* const*, int const*, VSFrame const*, VSCore*) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6ED8FFD: VSCore::newVideoFrame(VSFormat const*, int, int, VSFrame const* const*, int const*, VSFrame const*) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6ED0372: newVideoFrame2(VSFormat const*, int, int, VSFrameRef const**, int const*, VSFrameRef const*, VSCore*) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6EB6FBC: makeDiffGetFrame (in /usr/lib/libvapoursynth.so)
==13632== by 0x6ED8827: VSNode::getFrameInternal(int, int, VSFrameContext&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x6EE6E08: VSThreadPool::runTasks(VSThreadPool*, std::atomic<bool>&) (in /usr/lib/libvapoursynth.so)
==13632== by 0x50F4DEF: execute_native_thread_routine (in /usr/lib/libstdc++.so.6.0.20)
==13632== by 0x60A2373: start_thread (in /usr/lib/libpthread-2.21.so)
==13632== by 0x564727C: clone (in /usr/lib/libc-2.21.so)


This is fmtconv r12 running in 64 bit Arch Linux.

cretindesalpes
16th May 2015, 07:44
Thanks for the bug report. Do you know what is the input video format and the resize parameters?

jackoneill
16th May 2015, 09:39
Thanks for the bug report. Do you know what is the input video format and the resize parameters?

For the first one:

clip = VideoNode
Format: YUV420P8
Width: 640
Height: 480
Num Frames: 100
FPS Num: 30003
FPS Den: 1001
Flags: None
w = 960
h = 720
sx = 0.0
sy = 0.0
sw = 0.0
sh = 0.0
kernel = spline36
taps = 4
a1 = None
a2 = None
css = None
planes = [3, 3, 3]
cplace = mpeg2
cplaces = None
cplaced = None
interlaced = 2
interlacedd = 2
flt = False

For the second one:

clip = VideoNode
Format: Gray8
Width: 960
Height: 720
Num Frames: 100
FPS Num: 30003
FPS Den: 1001
Flags: None
w = 640
h = 480
sx = 0.0
sy = 0.0
sw = 0.0
sh = 0.0
kernel = spline36
taps = 4
a1 = None
a2 = None
css = None
planes = [3, 3, 3]
cplace = mpeg2
cplaces = None
cplaced = None
interlaced = 2
interlacedd = 2
flt = False

cretindesalpes
16th May 2015, 11:58
Hmmm… It seems that the allocated space corresponds exactly to the frame size. I’m just checking the VS source code and I see this VS_FRAME_GUARD which is not defined by default. Actually I always thought there were a few bytes left after the frame data to make SIMD code simpler (same as in Avisynth). But it seems that VS_FRAME_GUARD exists only for debugging purposes and strict frame bounds should be enforced. Sigh… so consider all the fmtconv releases broken at the moment.
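
The two Valgrind reports above can be checked by hand. The following is a minimal sketch in Python that uses only the numbers printed in the traces; the helper name overrun_bytes is made up for illustration. It shows that an 8-byte load starting a few bytes before the end of an allocation reads past it, and that both block sizes match one unpadded 8-bit plane of the clips described in the thread.

```python
def overrun_bytes(offset, read_size, block_size):
    """Return how many bytes of a read fall past the end of the block."""
    return max(0, offset + read_size - block_size)

# First report: 8-byte read at offset 307,197 in a 307,200-byte block.
# 307,200 = 640 * 480, which matches an 8-bit 640x480 plane with no padding.
first = overrun_bytes(307197, 8, 640 * 480)

# Second report: 8-byte read at offset 691,196 in a 691,200-byte block.
# 691,200 = 960 * 720, which matches an 8-bit 960x720 plane with no padding.
second = overrun_bytes(691196, 8, 960 * 720)

print(first, second)
```

With no guard bytes after the plane, any SIMD load wider than the pixels remaining in the last row has to be handled separately, which is the situation described in the post above.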

djcj
18th May 2015, 07:37
Just want to report that a build with debug level 2 or higher fails for me. Building with -g1 is no problem.
g++ -c -O2 -fPIC -Wall -Wextra -Wno-unused-parameter -Wno-unused-result -Isrc -g -Wno-reorder -std=c++11 -mavx2 -msse2 -march=native -pipe -I/usr/local/include/vapoursynth -I/usr/include/vapoursynth -o src/fmtc/Transfer.o src/fmtc/Transfer.cpp
src/fmtc/Transfer.cpp: In static member function ‘static void fmtc::Transfer::MapperLog::find_index(const fmtc::Transfer::FloatIntMix*, __m128i&, __m128&)’:
src/fmtc/Transfer.cpp:791:24: warning: unused variable ‘val_max’ [-Wunused-variable]
static const float val_max = float (int64_t (1) << LOGLUT_MAX_L2);
^
src/fmtc/Transfer.cpp: At global scope:
src/fmtc/Transfer.cpp:1547:1: error: ‘static void fmtc::Transfer::MapperLog::find_index(const fmtc::Transfer::FloatIntMix*, __m256i&, __m256&)’ conflicts with a previous declaration
} // namespace fmtc
^
src/fmtc/Transfer.cpp:780:6: note: previous declaration ‘static void fmtc::Transfer::MapperLog::find_index(const fmtc::Transfer::FloatIntMix*, __m128i&, __m128&)’
void Transfer::MapperLog::find_index (const FloatIntMix val_arr [4], __m128i &index, __m128 &frac)
^
In file included from src/fmtc/Transfer.cpp:31:0:
src/fmtc/Transfer.h:321:18: note: -fabi-version=6 (or =0) avoids this error with a change in mangling
find_index (const FloatIntMix val_arr [8], __m256i &index, __m256 &frac);
^
src/fmtc/Transfer.cpp:1547:1: error: ‘static void fmtc::Transfer::MapperLin::find_index(const fmtc::Transfer::FloatIntMix*, __m256i&, __m256&)’ conflicts with a previous declaration
} // namespace fmtc
^
src/fmtc/Transfer.cpp:689:6: note: previous declaration ‘static void fmtc::Transfer::MapperLin::find_index(const fmtc::Transfer::FloatIntMix*, __m128i&, __m128&)’
void Transfer::MapperLin::find_index (const FloatIntMix val_arr [4], __m128i &index, __m128 &frac)
^
In file included from src/fmtc/Transfer.cpp:31:0:
src/fmtc/Transfer.h:301:18: note: -fabi-version=6 (or =0) avoids this error with a change in mangling
find_index (const FloatIntMix val_arr [8], __m256i &index, __m256 &frac);
^
I'm using g++ 4.8.2 64 bit on Ubuntu 14.04.

jackoneill
18th May 2015, 10:24
-fabi-version=6 does take care of those errors.

cretindesalpes
18th May 2015, 18:18
fmtconv r13 (http://forum.doom9.org/showthread.php?t=166504):
matrix: optimized the SSE2 and AVX2 paths for integer data.
Added cpuopt to some functions, to manually limit the instruction set optimizations.
Added build files for the unix-like systems, thanks to jackoneill.
Fixed a buffer overflow bug in the SSE2 and AVX2 code of bitdepth and resample, thanks to jackoneill for reporting it.
Removed the int16tofloat and floattoint16 temporary functions.

sl1pkn07
18th May 2015, 18:23
Added build files for the unix-like systems, thanks to jackoneill.


missing

cretindesalpes
18th May 2015, 18:33
Oh right, fixed.

sl1pkn07
18th May 2015, 18:57
tnx bro!

metyo
19th May 2015, 19:03
funny

this module went wrong

vspipe: symbol lookup error: /usr/lib/vapoursynth/libfmtconv.so: undefined symbol: __sync_val_compare_and_swap_16

ArchLinux 64bit

metyo

sl1pkn07
19th May 2015, 19:37
mmm

try commenting out line 16 in the PKGBUILD

greetings

jackoneill
19th May 2015, 19:57
funny

this module went wrong

vspipe: symbol lookup error: /usr/lib/vapoursynth/libfmtconv.so: undefined symbol: __sync_val_compare_and_swap_16

ArchLinux 64bit

metyo

"-march=native" may fix that. Other values must work as well, but I don't know which. (Obviously you shouldn't use "native" if you're making a binary for other people.)

metyo
19th May 2015, 20:11
hi
you mean to disable the

rm -fr src/VapourSynth.h --- line ?

i used the vapoursynth-plugin-fmtconv r13-1 cos it was the more fresh..

info - i reinstalled the 20131130.ce9e577-1-x86_64.pkg.tar.xz, and it works as it did

sl1pkn07
19th May 2015, 20:20
please paste the script

edit: try to use the non-git package

metyo
19th May 2015, 20:37
the fmtconv is a mission for me always

the script is


import vapoursynth as vs
core = vs.get_core()

c = core.std.BlankClip(width=100, height=200, format=vs.YUV420P16, color=[64, 64, 64])
c = core.fmtc.resample(c, w=200, h=200)
c.set_output()


so i think i don't miss too much :)

i use havsfunc - QTMGC , so the fmtc is a dependency.
But don't spend too much of your free time on it, the previous versions were working fine...

edit

sorry the same
Csomagok (Hungarian for "packages" :)) (1) vapoursynth-plugin-fmtconv-r13-1

sl1pkn07
19th May 2015, 20:53
works for me with r11.6.f796ebd-1 and with non-git r13 with VS-git r27.7.ge00d0de-1

cretindesalpes
19th May 2015, 21:57
"-march=native" may fix that. Other values must work as well, but I don't know which. (Obviously you shouldn't use "native" if you're making a binary for other people.)
Actually, on x64 architectures, fmtconv needs the CMPXCHG16B instruction, which is lacking on early AMD CPUs (http://en.wikipedia.org/wiki/X86-64#Older_implementations). So you need to specify a more recent arch. For example -march=nocona should do the job. Maybe there are other ways to specify this constraint to GCC?

EDIT: just found the -mcx16 (https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/i386-and-x86-64-Options.html#i386-and-x86-64-Options) option which should be more appropriate.
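
Since -mcx16 only tells the compiler that CMPXCHG16B may be used, the CPU that runs the library still has to support it. A minimal sketch, assuming a Linux system where /proc/cpuinfo lists the feature as cx16 on its flags line; the function name has_cx16 is made up for illustration:

```python
def has_cx16(cpuinfo_text):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists cx16."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "cx16" in value.split():
            return True
    return False

# Sample text in the /proc/cpuinfo layout; not taken from a real machine.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 cx16 lahf_lm\n"
print(has_cx16(sample))
```

On a real system the check would be has_cx16(open("/proc/cpuinfo").read()).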

jackoneill
20th May 2015, 07:28
EDIT: just found the -mcx16 (https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/i386-and-x86-64-Options.html#i386-and-x86-64-Options) option which should be more appropriate.

That works!

lo1t3yu
20th May 2015, 07:46
Compiled fmtconv doesn't work :)
Used fmtconv: R13 from git: https://github.com/EleonoreMizo/fmtconv
Thanks to this makefile and using -mavx2 instead of -msse2, fmtconv compiled and linked successfully. Thanks to sl1pkn07! But when I use fmtconv, vspipe fails with this error:
vspipe: symbol lookup error: /usr/lib/vapoursynth/libfmtconv.so: undefined symbol: _ZN5vsutl6CpuOptC1ERNS_10FilterBaseERK5VSMapRS3_PKc

cretindesalpes
20th May 2015, 08:39
Unfortunately I cannot help with Linux builds, as I don’t have access to a Linux system at the moment. Anyway you should use -mavx2 only on *_avx2 files, as specified in the doc, unless you have an AVX2-capable CPU and don’t plan to share the library.

lo1t3yu
20th May 2015, 14:13
There are different branches and forks of this plugin. What's the difference between them?

https://github.com/EleonoreMizo/fmtconv
https://github.com/vapoursynth/fmtconv
https://github.com/dubhater/fmtconv

Myrsloik
20th May 2015, 14:21
https://github.com/EleonoreMizo/fmtconv is the official one now.

lo1t3yu
20th May 2015, 14:58
Thanks to all, it works.
Full Makefile.am: https://github.com/dubhater/fmtconv/blob/master/build/unix/Makefile.am
You also need to use the -march=native flag.

cretindesalpes
20th May 2015, 21:36
fmtconv r14 (http://forum.doom9.org/showthread.php?t=166504):
matrix: fixed a bug introducing wrong offsets in custom matrix coefficients, thanks to mawen1250 for the report.

metyo
21st May 2015, 18:08
It works for me too.

Compiled from non git version Archlinux 64 bit

feisty2
22nd May 2015, 06:40
http://i.imgur.com/R9g6SVM.png
the following function gets me a weird line stuck in the middle (or lower middle? whatever) of the image, boxed out with a red rectangle

import vapoursynth as vs

def cblur16 (src):
    core = vs.get_core ()
    w = src.width
    h = src.height
    blur = core.fmtc.resample (src, w*4, h*4, kernel="cubic", a1=1, a2=0, fulls=True, fulld=True)
    sharp = core.fmtc.resample (src, w*4, h*4, kernel="cubic", a1=-1, a2=0, fulls=True, fulld=True)
    dif = core.std.MakeDiff (blur, sharp)
    dif = core.fmtc.resample (dif, w, h, kernel="gauss", a1=100, fulls=True, fulld=True)
    clip = core.std.MergeDiff (src, dif)
    return clip

cretindesalpes
22nd May 2015, 19:57
feisty2: indeed. I think it’s fixed now.

fmtconv r15 (http://forum.doom9.org/showthread.php?t=166504):
resample and bitdepth: fixed a bug creating dark lines or weird patterns. Was introduced in r13 while trying to fix the buffer overflow problem. Thanks to feisty2 for spotting it.
resample: fixed the non-SIMD code path, causing crashes.

mawen1250
7th June 2015, 05:04
With core.fmtc.matrix(src, mat="709", col_fam=vs.RGB), I've got slightly brighter output if input is YUV444P16 (compared with YUV444PS input for fmtc.matrix and z.Colorspace).

EDIT:
More tests reveal that it's a problem related to vsedit when previewing floating point frames. When I manually convert output depth to 8bit, all the results are visually the same.

jeremy33
21st June 2015, 08:56
Hello,

I had a problem with fmtconv from the djcj PPA (https://launchpad.net/~djcj/+archive/ubuntu/vapoursynth) as you can see here:
http://forum.doom9.org/showthread.php?t=165771&page=81

We found that it was compiled incorrectly. So djcj now uses the original Makefiles from "https://github.com/EleonoreMizo/fmtconv/tree/master/build/unix" to build fmtconv on the PPA but there is a problem.

The compilation works for Ubuntu 15.04 and the problem we had with fmtconv is gone but the compilation doesn't work with Ubuntu 14.04. Djcj opened an issue on github :

https://github.com/EleonoreMizo/fmtconv/issues/3

I hope that can be fixed.

Thank you

cretindesalpes
21st June 2015, 14:53
jeremy33: It looks like the errors come from the declarations using AVX types in compilation units limited to SSE2. Because they are only declarations, they have no consequence on the generated code and GCC 4.9+ accepts them, but I agree it’s not very clean. I’ll see if I can refactor the whole AVX business.

jeremy33
21st June 2015, 22:08
Great, thank you very much !

cretindesalpes
22nd June 2015, 19:33
I did the modification, pushed in the git repository. I am unable to run the unix build, so if someone could test it, any help is appreciated.

Are_
22nd June 2015, 20:29
Tested, it builds without "march" with everything.

sl1pkn07
22nd June 2015, 20:30
builds for me with gcc5 (non-AVX machine, -march=native)

cretindesalpes
22nd June 2015, 21:46
Thanks for your report.

jeremy33
23rd June 2015, 00:22
I did the modification, pushed in the git repository. I am unable to run the unix build, so if someone could test it, any help is appreciated.
Now it builds with g++-4.8, g++-4.9, clang++-3.5 and clang++-3.6 without problem.

Thank you very much !

mawen1250
30th June 2015, 18:34
I've got a problem with fmtc.resample:
When an array is specified for sx/sy, only the first one in the array is actually used.

cretindesalpes
1st July 2015, 14:57
Right. It should be fixed now. Thanks for the report.

fmtconv r16 (http://forum.doom9.org/showthread.php?t=166504):
bitdepth: added support for 11-bit and 14-bit integer input.
bitdepth: Fixed a slight plane inconsistency when dithering grey multi-plane pictures using an error diffusion algorithm.
matrix2020cl: added SSE2 optimisations for the floating point path.
resample: sx, sy, sw and sh parameters passed as arrays are now correctly taken into account.
transfer: added the blacklvl parameter.

mawen1250
1st July 2015, 17:54
Thanks for the update! It works correctly now.

feisty2
4th July 2015, 15:16
source
http://i.imgur.com/hxMRuOt.png

clp = xxx
clp = core.fmtc.bitdepth(clp, fulls=False, fulld=True, bits=32, flt=True)
clp = core.fmtc.transfer(clp, transs="709", transd="linear", fulls=True, fulld=True, cont=3)
clp = core.fmtc.transfer(clp, transs="linear", transd="709", fulls=True, fulld=True, cont=1/3)
clp.set_output ()

http://i.imgur.com/8w07oNl.png
looks like nothing ever happened, good


clp = xxx
clp = core.fmtc.bitdepth(clp, fulls=False, fulld=True, bits=32, flt=True)
clp = core.fmtc.transfer(clp, transs="709", transd="linear", fulls=True, fulld=True, cont=3)
clp = core.fmtc.bitdepth(clp, fulls=True, fulld=True, bits=16)
clp = core.fmtc.bitdepth(clp, fulls=True, fulld=True, bits=32)
clp = core.fmtc.transfer(clp, transs="linear", transd="709", fulls=True, fulld=True, cont=1/3)
clp.set_output ()

http://i.imgur.com/Znwhy9r.png
what the *beep*? looks totally wrong and miles away kind of different from the source clip, is it like, a bug, or more of, a failure of 16bits?
guess it's the latter one, 16bits failing, imho, told ya, 16bits sucks, 32bits rules!

EDIT: I'll keep doing "Placebo" (scripting sort of thing) if it's just a bug and wait it to be fixed
or, well, I'll try to find a way to stuff float support to nnedi3 and mvtools if it's 16bits sucking ass.

cretindesalpes
5th July 2015, 16:20
feisty2: After the x3 contrast operation, the luminance channel contains values > 1. They just get clipped when converting to integer data and cannot be restored to their original value when applying the 1/3 contrast. Nothing wrong here; keep working with floating point data when building such a processing graph.
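
The clipping can be reproduced with plain numbers. The following is a minimal sketch, not fmtconv's code: it assumes the contrast acts as a simple multiplication on a full-range value in [0, 1], and the helper names are made up for illustration.

```python
def to_uint16(x):
    """Quantize a full-range float to 16 bits, clipping values outside [0, 1]."""
    return min(65535, max(0, round(x * 65535)))

def from_uint16(v):
    """Convert a 16-bit full-range integer back to a float in [0, 1]."""
    return v / 65535

def round_trip(x, gain=3.0, via_int=True):
    """Apply the gain, optionally pass through 16-bit integer, then undo the gain."""
    y = x * gain
    if via_int:
        y = from_uint16(to_uint16(y))
    return y / gain

dark = round_trip(0.2)                        # 0.6 fits in [0, 1] and survives
bright = round_trip(0.8)                      # 2.4 is clipped to 1.0 before the 1/3
bright_float = round_trip(0.8, via_int=False) # no clipping in floating point
print(dark, bright, bright_float)
```

Every input above 1/3 comes back as 1/3 after the integer step, while the floating-point path returns the original value.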

cretindesalpes
8th July 2015, 18:12
fmtconv r17 (http://forum.doom9.org/showthread.php?t=166504):
bitdepth: added “Void and cluster” dithering method and its patsize parameter.
bitdepth: added floating point implementation for the Ostromoukhov dithering.
bitdepth: added SSE2 optimizations for halftone modes (0, 1 and 8).
bitdepth: fixed incorrect conversion from float to 8-bit integer using the “fast” modes with SSE2 instruction set.

jose1711
24th July 2015, 21:22
i am still getting the following error:
$ LANG=C LC_ALL=C vspipe test.vpy stream.y4m --y4m & mpv stream.y4m
[1] 9942
Playing: stream.y4m
vspipe: ./../../src/conc/Interlocked.hpp:183: static int64_t conc::Interlocked::cas(volatile int64_t&, int64_t, int64_t): Assertion `is_ptr_aligned_nz (&dest)' failed.
Failed to recognize file format.

the problem goes away as soon as i remove this line from test.vpy:

ret = haf.QTGMC( ret, Preset='Slow', TFF=True)


does it mean that this feature is limited to 64-bit processors/OSes only? i am on arch linux i686. thank you, jose

cretindesalpes
25th July 2015, 12:06
jose1711: I fixed something in the git repository, but at this point I’m not sure it fixes everything. Please check it and tell me if it works better, or differently.
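
For context on the assertion quoted above: the body of is_ptr_aligned_nz is not shown in this thread, so the following is only a sketch under the assumption that it checks that the address is a multiple of the operand size. The function name is_aligned and the sample addresses are made up for illustration. On 32-bit x86, a 64-bit integer can end up on a 4-byte boundary, which would fail an 8-byte check.

```python
def is_aligned(address, alignment):
    """Return True if address is a multiple of alignment (a power of two)."""
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return address & (alignment - 1) == 0

# 0x1004 is fine for a 4-byte boundary but not for an 8-byte one.
on_4 = is_aligned(0x1004, 4)
on_8 = is_aligned(0x1004, 8)
next_8 = is_aligned(0x1008, 8)
print(on_4, on_8, next_8)
```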