Sunday, November 25, 2018

Star Symphony in Chepelare - Poetic CGI Music Video | Звездна симфония в Чепеларе

An Unreal Star Storm watched from the forests of the Rhodope Mountains, Bulgaria

The premiere of my new music video - a poetic and artistic production with beautiful 2D visual effects, produced using computer vision for automatic compositing, mask generation, object removal, etc. It was edited and rendered using my in-house software "Twenkid FX Studio".

Watch in darkness and on a big screen in 1920x1080!

The Eagle from "Star Symphony in Chepelare". Camera operator: Todor Arnaudov

Short version (9:39 min, 4 musical pieces)

See more info and the long version on the Twenkid Studio blog.

Thanks for watching and please share the videos if you like them!

Since this is a "Research" blog, let me say something technical.

Some of the technologies used:

Custom GUI NLE video editor: C++, Win32 (yes), a custom Win32 wrapper, VFW (yea-a-h), Direct3D9 (ahm), HLSL

"Twenkid FX Studio" is an endless "prototype" in which I've invested too little time and which I should have redesigned a long time ago. Using Win32 sounds a bit insane, but I chose it because, when I started, there were issues with using MFC, the other "default" simple Windows library, which also has a bad reputation - I was using Visual Studio Express.

Sure, there were free GUI class libraries, but I preferred a smaller code base that did not depend on additional huge third-party libraries* such as wxWidgets (which I considered, and maybe I was wrong not to develop with it).

Qt had some issues with the license - I didn't want my system to be GPL, and the fee for their other license was unreasonable. Maybe I could have tried GTK, but it's also bloated, with a lot of dependencies and verbose method calls, similarly to wxWidgets. So, apart from being multiplatform, I don't know whether it would be "simpler" to work with than Win32 or my own Win32 wrapper.

Furthermore, at the time when I started, FFmpeg and the other Linux video libraries seemed undocumented/inaccessible, while I found a Windows one, although a bit outdated - VFW (Video for Windows). DirectShow was the more appropriate choice, but it seemed to me that it had a more complex interface and harder access to the raw bitmap, so I decided to use VFW and not delve too deep. Maybe I was wrong here again and should have spent some more time on DirectShow.

(*Regarding huge code bases with too many fragmented modules - for the same reason I don't like Boost with its 999999 tiny little files, most of which are not used.)

So I started with simple Win32, and I also developed simple wrapper classes for some controls. I didn't care that it didn't look "beautiful" or "modern"; the look-and-feel of the buttons was not important.

One reasonable design choice would have been to develop the GUI in C#, with an interface to the core processing through pipes, sockets or memory mapping (file mapping in Windows); it's still an option. It would go with a "standardized" interface to the core editor, so that it could be controlled from all kinds of external GUIs. I did something like that with my speech synthesizer "Toshko 2.070", but only for simple input, not a full API to its internals.
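To make the idea of a "standardized" interface a bit more concrete, here is a minimal sketch of a GUI talking to a core over a socket with newline-delimited JSON commands. It is only an illustration under my own assumptions: the command names (`add_clip`, `list_clips`), the reply fields and the file name are hypothetical and are not part of Twenkid FX, and both sides run in one Python process just to keep the example self-contained.

```python
import json
import socket
import threading


def handle_command(state, command):
    """Hypothetical core-side dispatcher: apply one command, return a reply."""
    name = command.get("cmd")
    if name == "add_clip":
        state["clips"].append(command["path"])
        return {"ok": True, "clip_count": len(state["clips"])}
    if name == "list_clips":
        return {"ok": True, "clips": list(state["clips"])}
    return {"ok": False, "error": "unknown command: %r" % (name,)}


def serve(conn, state):
    """Core side: answer newline-delimited JSON commands until the peer closes."""
    with conn, conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
        for line in stream:
            reply = handle_command(state, json.loads(line))
            stream.write(json.dumps(reply) + "\n")
            stream.flush()


def send(stream, command):
    """GUI side: send one command and wait for its one-line reply."""
    stream.write(json.dumps(command) + "\n")
    stream.flush()
    return json.loads(stream.readline())


def demo():
    # A connected socket pair stands in for a real pipe or TCP connection.
    gui_sock, core_sock = socket.socketpair()
    state = {"clips": []}
    worker = threading.Thread(target=serve, args=(core_sock, state))
    worker.start()
    with gui_sock, gui_sock.makefile("rw", encoding="utf-8", newline="\n") as stream:
        replies = [
            send(stream, {"cmd": "add_clip", "path": "eagle.avi"}),
            send(stream, {"cmd": "list_clips"}),
            send(stream, {"cmd": "explode"}),  # not a known command
        ]
    worker.join()
    return replies


if __name__ == "__main__":
    for reply in demo():
        print(reply)
```

With a text protocol like this, the GUI could be written in C#, Lua or anything else that can open a socket or a pipe; the core only sees lines of JSON.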

Another possibility is Lua and/or automatic generation of the GUI from the specifications.


Historically, there were years with zero or only a few lines of code added to the project, and unfortunately the editor's GUI is still underdeveloped and ugly for casual users, which prevents it from being released for external usage beyond my "in-house" needs.

It's pretty fast for some tasks, though. For example, Twenkid FX loads the long version of the "Star Symphony", including alternative disabled video segments and overlays, 200 full HD video files in total, for the first time in a fresh session in about 1.5-2 seconds (6-7 seconds in an earlier test run) from a laptop's mechanical HDD and an external HDD. Maybe that's the total seek time for so many files.
If the project is then closed and re-opened, it loads and is ready in just 2 seconds.

(It seems that the slower test run was done with heavily loaded RAM, and the page file was slowing it down.)

The GUI has to be improved, though, and possibly rewritten in a multiplatform way to escape the Windows dependency. I've been thinking about that from time to time, but it requires enough focus to start.

Perhaps it would be based on FFmpeg, OpenCV and OpenGL, maybe using multiple programming languages (Python and C++, maybe others), with a custom GUI written on top of OpenGL and OpenCV or some light GUI or gaming library, unless I change my mind and continue with Windows and DirectX 11-12.

Also it's supposed to start utilizing some form of AI already, of course. Finally...


Custom VFX system and effects for the movie:
* Python, OpenCV with Python, Numpy; a little C++ and OpenCV in C++ for some retouch work of already rendered video segments during the final stage of the editing.

I started with Python because I had a prototype for simple reviewing and cutting, besides my main GUI NLE editor. Of course I was assuming that it would be easier to experiment with OpenCV, even though I knew it'd be slower, and initially I didn't know how far I'd go with the visual effects.

I could have used C++ without much trouble, since I also had experience and experiments with OpenCV in C++, such as applying computer vision processing to pictures and video frames, traversing pixels and changing them during playback, etc. The heavier editing system, Twenkid FX (C++/Direct3D9), was also an option, especially through HLSL shaders, which are the only simple way to add new effects to it. However, it needs a general and sophisticated plug-in subsystem, which is still lacking.

So I took the Python road this time.

It got too slow for some operations; then some tricks with NumPy fancy indexing sped up one of the early effects 60 times, from about 30 seconds per frame to about 0.5 seconds per frame. However, it remained slow for complex effects, sometimes taking several seconds per frame.
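For readers who haven't met this kind of trick, here is a small sketch of the general idea, not the actual effect from the movie: the same per-pixel recoloring written once as a Python loop and once with NumPy index arrays. The function names, the brightness rule and the threshold are made up for the illustration, and how much faster the indexed version runs will depend on the effect and the machine.

```python
import numpy as np


def recolor_loop(frame, threshold, color):
    """Slow version: visit every pixel from Python."""
    out = frame.copy()
    height, width = frame.shape[:2]
    for y in range(height):
        for x in range(width):
            # Brightness here is simply the mean of the three channels.
            if frame[y, x].mean() > threshold:
                out[y, x] = color
    return out


def recolor_indexed(frame, threshold, color):
    """Fast version: select all matching pixels at once with index arrays."""
    out = frame.copy()
    mask = frame.mean(axis=2) > threshold  # (height, width) array of booleans
    ys, xs = np.nonzero(mask)              # coordinates of the bright pixels
    out[ys, xs] = color                    # one assignment, no Python loop
    return out


if __name__ == "__main__":
    # A small random image stands in for a video frame (height, width, BGR).
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(60, 80, 3), dtype=np.uint8)
    slow = recolor_loop(frame, 128, (0, 0, 255))
    fast = recolor_indexed(frame, 128, (0, 0, 255))
    print("same result:", np.array_equal(slow, fast))
```

Both functions do the same work; the difference is only that the second one lets NumPy iterate over the pixels in compiled code instead of the Python interpreter.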

Ironically, the slow speed sometimes was "right", allowing real-time adjustments during rendering, virtual camera operating for Pan & Scan sequences etc., without slowing down the playback or stepping manually frame-by-frame.

Of course, I would have been better off working with C++ and GLSL and/or HLSL shaders from the start.

CogAlg Prize

Nevertheless, that design decision, wrong as it was performance-wise, and the involvement with Python and NumPy led me to check out the CogAlg* project, then eventually to contribute to the debugging of the stuck frame_dblobs function and to win a prize.

Python is a bad choice for "non-neuromorphic deep learning" for computer vision at the low level of the system, though, since that level is expected to require a zillion operations before it starts to produce meaningful output. Besides, CogAlg's code is getting progressively less readable.

This is another story, though.

* B.K. is the creator of the "Cognitive Algorithm" project, but I recall that I was the first to call it by the shorthand "CogAlg", in an e-mail a few years ago, and he adopted it.

Keywords: Computer Graphics, Computer Vision, Film, Filmmaking, Twenkid FX, Twenkid Studio, Analysis, Art, Programming, CogAlg, Cognitive Algorithm, Sport, Acting, About Tosh, AGI, Animation, Видеообработка, Изкуство, Компютърна графика, Кино, Визуални ефекти, Компютърно зрение, Познавателен алгоритм, КогАлг, УИР, Универсален изкуствен разум, Анимация, ...