Whew, time to look at every `int` variable we ever declared! The best
moment to do this would have been a year ago, but well, better late
than never. No need to communicate that in comments anymore.
These shouldn't be used for widths, heights, or sprite-space
coordinates. Maybe we'll cover that another time; this commit is
already large enough.
Part of P0111, funded by [Anonymous] and Blue Bolt.
"Oh wait, actually, I *do* want to blit 8 pixels at a time in some
cases" :zunpet:
First time in a long while that the VRAM access macros couldn't do the
job, because code generation wants *both* pointer arithmetic *and*
subscripts within the same expression.
But *come on*, just blit the 16 pixels for the byte-aligned case!
Part of P0107, funded by Yanga.
Leading to slight complications in TH02's Music Room and shot type
selection menus. Thought about leaving those in C for a while, but I
still think it's worth it for the consistency we get with the VRAM
offset functions. Also, we'll have similar code for the main menus of
later games, and I surely won't be using C++ when starting out with
these.
Part of P0105, funded by Yanga.
Well, if master.lib insists on defining constructors and therefore makes
its `Point` type unusable in C++ mode… This new type might replace some
of the typically separate X and Y function parameters, and reduce our
overall dependence on <master.h>.
… yeah, too bad that you can't simply #undef and then re-#define
`__cplusplus`, but what can you do.
Part of P0105, funded by Yanga.
Ideally, the future sprite compiler should automatically pre-shift such
sprites, and correctly place the shifted variants in memory, by merely
parsing the C header. On disk, you'd then only have a .BMP with each
individual cel at x=0.
And that's why we need macros and consistent naming: To express these
semantics, without having to duplicate the sprite declaration in some
other format. sSPARKS[8][8][8] wouldn't help anyone 😛
Now, we could go even further there by defining a separate type
(`preshifted_dots8_t`), and maybe get rid of the _W macro by replacing
it with a method on that type. However,
• that would be inconsistent, since we'll need the _H macro anyway, for
both the actual rendering code and the sprite compiler
• we couldn't directly call such a method on a 2D or 3D array, and would
have to go down to a single element to do so (`sSPARKS[0][0][0].w()`)
• making it a static method instead duplicates the type all over the
code
• and any variables of that type would no longer be scalar-type values
that can be stored in registers, requiring weird workarounds in those
places. As we've already seen with subpixels.
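To make the pre-shifting idea above a bit more concrete, here is a minimal
sketch of what such a sprite compiler step could do for an 8-pixel-wide cel.
Everything in it is an assumption for illustration: the `dots8_t` / `dots16_t`
typedefs are stand-ins, `preshift_row()` and `preshift_sprite()` are
hypothetical names, and the 16-bit result is treated as a logical value with
the leftmost pixel in the most significant bit, ignoring the byte order in
which it would actually be placed in memory.

```cpp
#include <cstdint>

typedef uint8_t dots8_t;   // 8 horizontal 1bpp pixels (assumed stand-in)
typedef uint16_t dots16_t; // 16 horizontal 1bpp pixels (assumed stand-in)

const int PRESHIFT = 8; // One variant for each start X position within a byte

// Hypothetical helper: places an 8-pixel row at the left edge of a 16-pixel
// row, then moves it [shift] pixels to the right. Leftmost pixel = MSB.
inline dots16_t preshift_row(dots8_t row, int shift) {
    return static_cast<dots16_t>((row << 8) >> shift);
}

// Hypothetical helper: generates all 8 shifted variants of a cel that is
// stored at x=0, as out[shift][y].
template <int H> void preshift_sprite(
    dots16_t (&out)[PRESHIFT][H], const dots8_t (&cel)[H]
) {
    for(int shift = 0; shift < PRESHIFT; shift++) {
        for(int y = 0; y < H; y++) {
            out[shift][y] = preshift_row(cel[y], shift);
        }
    }
}
```

With that, rendering code could pick `out[x & 7]` and blit it at the
byte-aligned position without shifting anything at run time.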
Part of P0085, funded by -Tom-.
Which actually does inline… in C++, because Turbo C++ doesn't support
the `inline` keyword in C mode. So much for the superiority of that
language, even in 1994…
Part of P0083, funded by Yanga.
They do make everything so much simpler, after all. Especially now that
th01/formats/grz.cpp should compile as a freestanding translation unit,
as part of the .GRZ viewer.
Part of P0082, funded by Ember2528.
I tried `brge` for the latter, but that had *the* most horrible
ergonomics, and I misspelled it as `bgre` 100% of the time I typed it
manually. Turns out that `dots` is also consistent with master.lib's
naming scheme, leaving `planar` to *actually* refer to types storing
multiple planes worth of pixels. These types are showing up more and
more, and deserve something better than their previous long-winded and
misleading name.
Part of P0081, funded by Ember2528.
Allowing us to then retrieve it using a function call with no run-time
cost, although we do have to be careful with the types here.
Also, is that another solution to decompilation puzzles that involve
types of number literals?
Part of P0080, funded by Ember2528 and Splashman.
Yes, when clipping the start and end points to the screen area, ZUN
uses an integer division to calculate the line slopes, rather than a
floating-point one. Doesn't seem like it actually causes any incorrect
lines to be drawn, though; that case is only hit in the Mima boss
fight, which draws a few lines with a bottom coordinate of 400 rather
than 399. It *might* also restore the wrong pixels at parts of the
YuugenMagan fight, causing weird flickering, but seriously, that's an
issue everywhere you look in this game.
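As a rough illustration of the difference, and not ZUN's actual code: the
sketch below clips a line endpoint to a given Y coordinate, once with a
truncating integer slope and once with a floating-point one. The function
names and the choice of expressing the slope as dx/dy are assumptions made
for this example.

```cpp
// Hypothetical: X position of the line (x0, y0)-(x1, y1) at [y_clip], using
// an integer slope. The division truncates, so any slope with a magnitude
// below 1 collapses to 0.
inline int clip_x_int(int x0, int y0, int x1, int y1, int y_clip) {
    const int slope = ((x1 - x0) / (y1 - y0));
    return (x0 + (slope * (y_clip - y0)));
}

// The same calculation with a floating-point slope, truncating only the
// final result.
inline int clip_x_float(int x0, int y0, int x1, int y1, int y_clip) {
    const double slope = (static_cast<double>(x1 - x0) / (y1 - y0));
    return (x0 + static_cast<int>(slope * (y_clip - y0)));
}
```

For a line from (0, 0) to (100, 400) that is clipped to y=399, the integer
version ends up at x=0, while the floating-point version ends up at x=99.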
Part of P0069, funded by [Anonymous] and Yanga.
Templates would have been nicer, but as soon as you add just one
non-immediate parameter, Turbo C++ generates a useless store to a new
local variable, ruining the generated code.
Part of P0069, funded by [Anonymous] and Yanga.
Right, PC-98 hardware only supports 4 bits per RGB component, for a
total of 4,096 possible colors. The 8-bit RGB color values we've been
seeing throughout the later games are a master.lib extension, to allow
for more toning precision. Which TH01, with all its NIH syndrome,
doesn't use.
And yup, that means templates in the most basic header files… Since
that would have meant renaming *everything* to compile as C++, I simply
made these types exclusive to C++ code, thcrap style.
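A minimal sketch of how such a templated color type could look. The names
(`RGB`, `RGB4`, `RGB8`, `to_hardware()`) and the layout are assumptions for
illustration, and dropping the lower 4 bits is just the simplest possible
mapping from the 8-bit range, not necessarily what master.lib does.

```cpp
#include <cstdint>

// Hypothetical color type, parametrized over the component range.
template <class ComponentType, int Range> struct RGB {
    ComponentType r, g, b;

    // Number of distinct colors that can be expressed with this range.
    static long color_count() {
        return (static_cast<long>(Range) * Range * Range);
    }
};

typedef RGB<uint8_t, 16> RGB4;  // 4 bits per component, what hardware supports
typedef RGB<uint8_t, 256> RGB8; // 8 bits per component, for toning precision

// Assumed mapping: keeps the upper 4 bits of every component.
inline RGB4 to_hardware(const RGB8& c) {
    RGB4 ret = {
        static_cast<uint8_t>(c.r >> 4),
        static_cast<uint8_t>(c.g >> 4),
        static_cast<uint8_t>(c.b >> 4),
    };
    return ret;
}
```

`RGB4::color_count()` then yields the 4,096 colors mentioned above.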
Part of P0066, funded by Keyblade Wiedling Neko and Splashman.
*Finally*. We already used `(unsigned) int` in quite a few places where
we actually want a 16-bit value, which was bound to annoy future port
developers.
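A small sketch of why this matters for ports, assuming a hypothetical counter
that relies on wrapping around at 16 bits. Both function names are made up
for this example.

```cpp
#include <cstdint>

// Hypothetical counter with an explicit 16-bit type. Wraps from 0xFFFF to 0
// on every platform, regardless of the width of `int`.
inline uint16_t counter_next_fixed(uint16_t counter) {
    return static_cast<uint16_t>(counter + 1);
}

// The same logic with a plain `unsigned int`. Only wraps at 16 bits if `int`
// happens to be 16 bits wide, as it is on the original compiler. On a
// platform with a wider `int`, 0xFFFF is followed by 0x10000 instead.
inline unsigned int counter_next_plain(unsigned int counter) {
    return (counter + 1);
}
```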
The pascal calling convention for TH03's input mode functions actually
sort of matters, since we have this nice function pointer type that
expects pascal.
Going with the classic pointer-in-typedef approach here, because the
syntax you'd otherwise have to use is terribly inconsistent. It'd be
farfunc_t *near near_ptr_to_far_func;
but
nearfunc_t near *near_ptr_to_near_func;
And that'd hopefully be the last change to ReC98.h for a long time!
Those glacial compile times if every .c file is affected… Really
stands out if your build system is otherwise perfect.
Part of P0030, funded by zorg.
Only one code segment left in both OP and FUUIN! its-happening.gif
Yeah, that commit is way larger than I'm comfortable with, but none of these
functions is particularly large or difficult to decompile (with the exception
of graph_putsa_fx(), which I actually did weeks ago), and OP and MAIN have
their own unique functions in between the shared ones, so…
So apparently, TH01 isn't double-buffered in the usual sense, and instead uses
the second hardware framebuffer (page 1) exclusively to keep the background
image and any non-animated sprites, including the cards. Then, in order to
limit flickering when animating the bullet, character and boss sprites on top
of that (or just to limit the number of VRAM accesses, who knows), ZUN goes to
great lengths and tries to make sure to only copy back the pixels that were
modified on page 0 in the last frame.
(Which doesn't work that well though. When you play the game, you still notice
tons of flickering whenever sprites overlap.)
And by "great lengths", I mean "having a separate counterpart function for
each shape and sprite animated which recalculates and copies back the same
pixels from page 1 to page 0", because that's what the new functions here
lead me to believe. Both of them are only called in one place: the wave
function on the second half of Elis' entrance animation, and the horizontal
masked line function for Reimu's X attack animations.
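The general idea of restoring pixels from the background page can be sketched
as follows. This is a heavily simplified model and not the game's code: it
assumes a single bitplane per page, a byte-aligned rectangle, and plain
arrays in place of real VRAM. `Page` and `unblit()` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

const int ROW_SIZE = (640 / 8); // Bytes per row of a 640-pixel-wide bitplane
const int RES_Y = 400;

// Simplified stand-in for one hardware page, reduced to a single bitplane.
struct Page {
    uint8_t dots[RES_Y][ROW_SIZE];
};

// Hypothetical: restores a byte-aligned rectangle on the displayed page
// ([front], page 0) from the background page ([back], page 1).
inline void unblit(
    Page& front, const Page& back, int vram_x, int top, int w_bytes, int h
) {
    for(int y = top; y < (top + h); y++) {
        std::memcpy(&front.dots[y][vram_x], &back.dots[y][vram_x], w_bytes);
    }
}
```

A shape-specific counterpart function like the ones in this commit would
recalculate the exact pixels the shape covered, rather than restoring a
whole bounding rectangle like this sketch does.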
This function raises one of those essential questions about the eventual ports
we'd like to do. I'll explain everything more thoroughly here, since people
who might complain about the ports not being faithful enough need to
understand this.
----
The original plan was to aim for "100% frame-perfect" ports and advertise them as
such. However, the PC-98 is not a console with fixed specs. As the name
implies, it's a computer architecture, and a plethora of different, more and
more powerful PC-98 models were released during its lifespan. Even if we only
consider the subset of products that fulfills the minimum requirements to run
the PC-98 Touhou games, that's still a sizable number of systems.
Therefore, the only true definition of a *frame* can be "everything that is
drawn between two Vsync wait calls". Such a *frame* may contain certain
expensive function calls, and certain systems may run these functions slower
than the developer expected, thus effectively leading to more *frames* than
the developer explicitly specified.
This is one of those functions.
Here, we have a scaling function that appears to be written deliberately to
run very slowly, which ends up creating the rolling effect you see in the route
selection and the high score and continue screens of TH01. However, that
doesn't change the fact that the function is still CPU-bound, and neither
waits for Vsync nor is iteratively called by something that does. The faster
your CPU, the faster the rolling effect gets… until ultimately, it's faster
than one frame and therefore vanishes altogether. Mind you, this is true on
both emulators and real hardware. The final PC-98 model, the Ra43, had a CPU
clocked at 433 MHz, and it may have even been instant there.
If you use a more optimized algorithm, it also runs faster on the same CPU (I
tried this, and it worked beautifully)… you get the idea.
Still, it may very well be that this algorithm was not a deliberate choice and
simply resulted from a lack of experience, especially since this was ZUN's
first game.
That leaves us with two approaches to porting functions like these:
1) Look at the recommended system requirements ZUN specified, configure the
PC-98 emulator accordingly, measure how much of the work is done in each
frame, then rewrite the function to be bound to that specific frame rate…
2) …or just continue using a CPU-bound algorithm, which will pretty much
complete instantly on any modern system.
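In code, the difference between the two could look roughly like this. This
is only a sketch under assumptions: `RowEffect` is a hypothetical stand-in
for the scaling function's state, `process_row()` merely counts instead of
doing real work, and any `rows_per_frame` value would have to come from the
measurement described in 1).

```cpp
// Hypothetical state of an effect that processes one row at a time.
struct RowEffect {
    int rows_total;
    int rows_done;

    void process_row() { rows_done++; } // Stand-in for the actual work
    bool done() const { return (rows_done >= rows_total); }
};

// Approach 1): Frame-bound. Does a fixed amount of work, and is meant to be
// called once per frame, after waiting for Vsync. Returns true once the
// effect has finished.
inline bool run_one_frame(RowEffect& fx, int rows_per_frame) {
    for(int i = 0; ((i < rows_per_frame) && !fx.done()); i++) {
        fx.process_row();
    }
    return fx.done();
}

// Approach 2): CPU-bound. Does all of the work in one go, as fast as the
// system allows.
inline void run_to_completion(RowEffect& fx) {
    while(!fx.done()) {
        fx.process_row();
    }
}
```

With 400 rows and an assumed budget of 16 rows per frame, approach 1) takes
25 frames, while approach 2) takes however long the CPU needs.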
I'd argue that 2) is actually the more "faithful" approach. It will run faster
than the typical clock speeds people emulate the games at, and maybe draw a
bit of criticism because of that, but it seems a lot more rational than the
approximation provided by 1). Not to mention that it's undeniably easier to
implement, and hey, a faster game feels a lot better than a slower one, right?
… Oh well, maybe we'll still encounter some kind of CPU-bound animation that
is so essential to the experience that we do want to lock it to a certain
frame rate…