2020-12-30

coding itch: imgui and VStool mk II


A few days ago I got a coding itch and started a totally new version of the tool I use to edit and add characters to the corpus of my twitter/mastodon bot "Capcom VS Everyone" (aka VSbot), this time using the imgui ("immediate mode GUI") library. There are real benefits to VSbot I want out of this, but it was also an ideal excuse to finally learn imgui - specifically, its Pythonic wrapper, pyimgui.


I heard about the concept of immediate mode GUI years ago but had never actually built anything real with it. I have a basic understanding of its benefits now, as well as some questions and issues with it that I'll need to explore further to fully understand. Most prominently, the use of return values to handle interactions seems limiting - sure, an expression defining a widget can return True if it has just been clicked, but which mouse button clicked it? What if you want two different things to happen when the widget is clicked vs released (which may or may not happen while the widget is still hovered)? You could have every expression that defines a new widget return a "state" struct with all the information one could want about what is happening to the widget, but that no longer resembles the "if widget: do what is supposed to happen when the widget is clicked" simplicity with which the idea is initially presented. Which seems fine - the power of immediate mode GUI lies, as far as I can tell, less in that idiom and more in centralizing the definition of all widgets in a straightforward, imperative way, eg a single flat function, instead of breaking everything up into small pieces, eg classes in some OOP hierarchy, which makes execution much harder to follow. Much as the benefit of the old "immediate mode" OpenGL was that you could describe drawing a shape as a series of instructions to the API - apparently, Jessica Mak's "Everyday Shooter" mixed GL calls directly into game logic! - rather than building buffers of data to ship off to the GPU. There are definitely pitfalls to building stuff this way, but it seems like a useful counterpoint to the "retained mode" status quo of building GUIs.
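The basic shape of that pattern can be sketched in plain Python - note this is a schematic illustration of the idea, not pyimgui's actual API; the widget function and the way clicks are fed in are invented for the example:

```python
# Schematic illustration of the immediate mode pattern: widgets are
# plain function calls made fresh every frame, their return value
# reports what just happened to them, and all application state lives
# in ordinary variables rather than in widget objects.

def button(label, clicked_labels):
    """Stand-in for an imgui-style widget call: 'draws' the button and
    returns True if it was clicked this frame. clicked_labels simulates
    the click input a real backend would supply."""
    return label in clicked_labels

def draw_ui(state, clicked_labels):
    """One flat, imperative function defines the whole UI each frame -
    the centralization benefit described above."""
    if button("Add Character", clicked_labels):
        state["characters"].append("New Character")
    if button("Clear", clicked_labels):
        state["characters"].clear()
    return state

state = {"characters": []}
# frame 1: user clicks "Add Character"; frame 2: nothing clicked
draw_ui(state, clicked_labels={"Add Character"})
draw_ui(state, clicked_labels=set())
print(state["characters"])  # -> ['New Character']
```

The whole UI is re-declared every frame, so there is no separate widget tree to keep in sync with application state - which is exactly what makes the single flat function possible.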


Also, and more specific to imgui's implementation, text input fields don't seem to have a concept of "focus", ie which of 3 fields the caret is currently typing in. Working around this and a few similar limitations involves creating lightweight versions of more traditional UI concepts, eg tracking "the currently hovered widget" and handling input events specially. It's the kind of paradigm where purity matters, I think.
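A "lightweight version of a traditional UI concept" of the kind described here might look something like the following - a hypothetical sketch of app-side focus tracking, with all names invented for the example:

```python
# Hypothetical sketch of tracking text-field "focus" in application
# code: the app remembers which field is focused and routes keystrokes
# to it, since the widgets themselves don't track this.

class FieldFocus:
    def __init__(self, field_names):
        self.fields = {name: "" for name in field_names}
        # default focus to the first field, if any
        self.focused = field_names[0] if field_names else None

    def click(self, field_name):
        # clicking a field makes it the focus target
        if field_name in self.fields:
            self.focused = field_name

    def type_char(self, char):
        # keystrokes go only to the currently focused field
        if self.focused is not None:
            self.fields[self.focused] += char

focus = FieldFocus(["name", "franchise", "portrait_url"])
focus.click("franchise")
focus.type_char("S")
focus.type_char("F")
print(focus.fields["franchise"])  # -> 'SF'
```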


So yeah, those are some relatively naive thoughts on immediate mode GUI. It's possible I have tons of newbie mistakes and misconceptions here, but imgui is so darn useful that I would like to build many things with it in the future, so over time I'll climb the learning curve and learn how to make it sing. You can see my commit history for the past few days here:


https://heptapod.host/jp-lebreton/vsbot/-/commits/branch/default


A bit of history of the VSbot project in case it's interesting. I had a dream ~15 years ago about a joke fighting game arcade machine titled Capcom VS Everyone. It had the standard pre-match character selection grid but it just kept scrolling infinitely in all directions, just every character you could possibly think of (the header image linked below is a recreation of the gag). Years later, after I'd built a few twitter bots and seen Darius Kazemi's "Alt Universe Prompts", I realized it would make a pretty good bot, throwing together videogame, film, cartoon, comic etc characters into a blender.


I knew that most of the fun of the concept would be visual, like you're seeing an actual pre-match "Ryu VS M. Bison" style match-up screen from a fictional fighting game. My first stab at the idea, circa 2015, was to try to auto-generate these images from web image searches of characters drawn from text lists, the way Alt Universe Prompts appeared to. I got as far as generating match-up images with character art drawn from web searches, but the range of image search results was too wide to get anything that looked like an arcade fighter character portrait - specifically, the problem was how to crop a random image of a character so that their face and other character design qualities were clearly visible. I went so far as to throw OpenCV's face detection at it to try and get crops focused on characters' faces, but there's no ML model in existence that can recognize faces as diverse as C-3PO, Count Chocula, Kermit the Frog, Steven Universe, etc etc. And some characters don't even have faces! So I abandoned this approach and let it sit for a couple years.
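For what it's worth, the cropping arithmetic itself is the easy half of that pipeline. Assuming a face rectangle in the (x, y, w, h) form that OpenCV's CascadeClassifier.detectMultiScale returns, a face-centered square crop could be computed like this - function name and scale factor are hypothetical, and of course this does nothing about detection failing on C-3PO:

```python
# Given a detected face rectangle (x, y, w, h), compute a square
# portrait crop centered on the face, 'scale' times the face size,
# clamped to stay inside the image bounds.

def portrait_crop(img_w, img_h, face, scale=3.0):
    """Return (left, top, right, bottom) of a square crop around the face."""
    x, y, w, h = face
    # crop is a multiple of the face size, but never bigger than the image
    size = min(int(max(w, h) * scale), img_w, img_h)
    # center the crop on the center of the face rectangle
    cx, cy = x + w // 2, y + h // 2
    # clamp so the crop doesn't spill past the image edges
    left = min(max(cx - size // 2, 0), img_w - size)
    top = min(max(cy - size // 2, 0), img_h - size)
    return left, top, left + size, top + size

# face found at (400, 100), sized 100x100, in a 1920x1080 image
print(portrait_crop(1920, 1080, (400, 100, 100, 100)))  # -> (300, 0, 600, 300)
```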


When I picked the idea back up, I created this ridiculous header image in GIMP:


http://vectorpoem.com/images/vsbot_header.jpg


The process of manually selecting, cropping, and arranging the ~150 images for that convinced me that hand-building the corpus was the way to get a visual of the desired quality. But would it be feasible to do this for the several thousand (and growing!) characters in the corpus? It seemed like the kind of thing where the right tool could turn the process of entering a new character, and getting a good portrait for them, from a few minutes into 10-15 seconds or so. So I hacked one together in PyGame, since I already knew it well. And it worked! I was able to get a massive number of characters into the corpus in a few days of spare-time work. But the PyGame tool was kinda creaky - I was just winging lots of basic UI concepts - and while the code isn't a mess per se, it's been difficult to maintain and extend. It takes too much code to do simple things, which invariably means you do less with it.


The new imgui version of the tool is less code, already has more features, and is way easier to work with and extend. And now I know a cool, powerful UI toolkit. So that feels like a relatively worthwhile way to spend a couple holiday afternoons.


As far as new features go, I want to add a search window that can be used to quickly survey and filter the whole corpus, and a tag editor that lets you add metadata to each character which can then be read by the bot's match-up generator - so you can have a match-up between, for example, a team of cartoon dogs VS a team of time travelers.

