Turns out that the "new idea" for handling these data slices won't work
out in reality. At least the amount of necessary manual `extern`
declarations isn't soul-crushingly large, so this way isn't bad either,
after all!
Part of P0197, funded by Yanga and Ember2528.
Reason: Too much micro-optimization using 32-bit registers, which
aren't supported by Turbo C++'s inline assembler. It's also just
another variation on a common function we've decompiled time and time
again.
Part of P0192, funded by [Anonymous], nrook, and -Tom-.
Reason: Saving SI and DI on the stack way too late. Just because ZUN
absolutely *had* to move the clipping condition before these two PUSH
instructions… Was it really necessary to save a total of 4 instructions
for an unlikely worst case in a function that's maybe called like 10-20
times per frame *at worst*?
Part of P0192, funded by [Anonymous], nrook, and -Tom-.
Second PC-98 Touhou boss completely decompiled, 29 to go! But meh,
ZUN's original code did in fact force the three leaf pattern sprites
into separate 1-sprite sheets…
Completes P0181, funded by Ember2528.
The one where Sariel's second form shoots sparks towards the top of the
playfield, which then turn into leaf-like sprites that sway towards the
bottom, killing Reimu on contact.
And wow, what a finish! A weird "decimal subpixel" type, hardcoded
sprites, and effectively unused non-hardcoded sprites. Too bad that it
also ruins the nice `dot_rect_t(w, h)` parameter abstraction for
grcg_put_8x8_mono()…
Completes P0180, funded by Yanga.
I've long moved to a convention of putting every .OBJ compiled from ZUN
code into the subdirectory of the game that introduced it. These four
are the last remaining inconsistencies from earlier in development.
Part of P0162, funded by Ember2528.
Reason: That switch statement. How should we even?
Well, the code *is* fairly good. After looking very deep into it, and
spending 35% of that function on blank lines (for logical grouping) and
explanatory comments, that is…
Part of P0152, funded by -Tom- and [Anonymous].
Boom! Decompilable after all. And look what that made us finally point
out: In all 4 games that use this function, its return value is
undefined if BGM is inactive. (That is, if the user disabled it, or if
no FM sound board is installed.)
Part of P0148, funded by [Anonymous].
Second previously undecompilable translation unit, second creative
workaround for the workaround. We can't compile snd_se_play() with -WX,
as that function needs a stack frame, and it's also illegal to disable
-WX in the middle of a translation unit. But since we only need word
alignment in front of snd_se_reset() *and* that function is identical
in all 4 games, it makes sense to move it to its own translation unit.
And then you notice that the TH02/TH03 and TH04/TH05 versions of the
other two functions are basically identical. The small differences can
easily be moved out to inline functions, leaving us with a single
implementation file for all 4 games. Nice!
Part of P0137, funded by [Anonymous].
Now actually decompilable with the discovery of -WX… even though it
now requires additional workarounds for the drawbacks of the -WX
workaround.
Part of P0137, funded by [Anonymous].
Reason: Self-modifying. -.-
Also, why no GRCG? Would have allowed blitting via REP MOVSD… Might as
well optimize all the way if you're going the ASM route to begin with.
Part of P0136, funded by [Anonymous].
Reason: Self-modifying. -.-
The TH05 version *might* be decompilable into a mess. Don't have time
for that right now, though.
Part of P0136, funded by [Anonymous].
Reasons:
• piano_fm_part_put_raw(): SI register referenced and not saved on
the stack
• piano_current_note_from(): Would be decompilable… into a mess.
Not worth adding a separate translation unit just for it.
• piano_part_keys_put_raw(): DI register saved before the SI register
• piano_pressed_key_put(): DI register referenced and not immediately
saved on the stack
• piano_label_put_raw(): SI and DI registered referenced and not saved
on the stack
• grcg_setcolor_direct_seg1_raw(): Let's procrastinate this one until
we have to reference all of these instances in C land.
And we could have even emitted that PIANO_KEY_PRESSED_TOP pixel data
into the code segment, by using `#pragma option -z` to give identical
names to both the code and the data segment. At least we can decompile
the first two functions here.
Part of P0135, funded by [Anonymous].
Reason: Pascal calling convention with function parameters but no stack
frame. Theoretically we can __emit__() everything inside this function,
but there's no way we can get a `RETN 8` this way. Oh, and it also
accesses SI and DI without backing them up to the stack.
And thanks to TLINK apparently not reporting fixup overflows when
segments are small enough (?), it took quite a while to get that CALL
correct and not weirdly offset by 32 bytes. 😕
Part of P0134, funded by [Anonymous].
Well, it *would* have been decompilable, but that ridiculous placement
of the nullptr assignment would have forced the entire function call to
be spelled out in inline ASM, verbatim. No amount of comma operator
trickery would have generated the same instructions either. And for a
function this small and obvious in what its decompilation *should* be,
it really defeated the purpose of adding a separate translation unit…
Part of P0134, funded by [Anonymous].
Reason: Wants to be word-aligned, and the previous version in OP.EXE,
game_exit(), is not, despite having an even length :(
Oh well, at least I'm confident enough about it by now to document it.
And out of all decompilations to be thrown away, this is a pretty
dispensable one.
Part of P0133, funded by [Anonymous].
Reason: Manual "tail call optimization" of input_reset_sense(), with
execution falling through to input_sense() immediately below.
Part of P0133, funded by [Anonymous].
Undecompilable again. The loading functions have these *_noalpha()
variants that simply set a global variable and fall through to the
regular functions, while cdg_free() has its first `PUSH DI` instruction
after the first expression we'd be decompiling. cdg_free_all() *could*
be decompiled… but would also require _FLAGS trickery, and it's simply
not worth starting a translation unit for one such small function.
Part of P0127, funded by [Anonymous].
Nooooo, gotta throw away that decompilation for the stupidest of
reasons :( Turns out that a function may also be "undecompilable" if
the original code layout places it at a word-aligned address, but the
last byte of the previous function in just one of the original binaries
(TH03's MAIN.EXE, in this case) also lies at a word-aligned address.
There's simply no way to enforce per-function word alignment in Turbo
C++ alone. You *could* fake it with `#pragma codestring`, but of course
that won't work for functions that are part of the SHARED segment, and
where the alignment previously would have been correct. Conditionally
emitting that codestring would work, but then we'd also have to compile
that translation unit at least twice.
Now, I could have created a dummy .ASM file that just contains a single
zero-length but word-aligned SHARED segment, which could be placed
anywhere on the link command line where word alignment is needed… but
the decompilation of this function was a mess anyway, and probably
helped nobody.
Part of P0127, funded by [Anonymous].
First ZUN bug in sprite preshifting! One wrongly shifted pixel means
that we can't use the auto-preshift feature of our sprite converter -.-
Also, why did these even have to be hardcoded sprites to begin with.
These dot patterns could have been easily generated procedurally… but
even *that* wouldn't have been necessary, given that there's this nice
function called, uh, graph_r_line_patterned()? Which could have
rendered all of the lasers in the upcoming class and more?
Part of P0122, funded by Yanga.
Some of the unused interleave masks are not that straightforward, so it
makes sense to have all of them as a bitmap. I'm positive that this
sort of thing could have been EGC-accelerated… although, simply
writing better C would probably already go a long way.
Part of P0121, funded by Yanga.
In which ZUN accidentally the GRCG rather than the EGC in what should
have been (?) the unblitting function. Which then ends up actually
blitting yet another randomly background-masked version of the same
sprite on top of the old one. And after just a few frames, you get
those fully filled red diamonds you don't see in the sprite sheet.
Then again, if the 16w×h rectangle unblitting function is all you
have, and you can't be bothered to actually learn the EGC, this *is*
the better option 🎺
Completes P0120, funded by Yanga.
And with that, we finally dumped every single PC-98 Touhou binary!
Since it'd be overkill to merge bmp2arr into the re-baseline branch
though, we also have to start out with the raw image bytes here.
Part of P0117, funded by [Anonymous].
Needlessly linked with TCC rather than TLINK, adding almost 4 KB of
completely unnecessary libc startup code.
Or maybe not, since ZUN doesn't free the allocated memory himself, but
relies on libc to do that?
Part of P0117, funded by [Anonymous].
On the surface, Version1.02 of the `INTvector set program` seems to
be largely the same as Version1.01, just with fancier instructions,
some redundancy removed, and some slightly different wording in the
playful messages… or is there more to it? Stay tuned!
Part of P0117, funded by [Anonymous].
Yup, it's finally the right time to properly rebuild ZUN.COM. While
all of these small binaries would still need some RE attention, putting
in the few minutes to make them position-independent right now is
definitely worth it. Adding them to the PI calculation on the website
would take much longer 😅
Part of P0117, funded by [Anonymous].
Yeah, why *were* we assembling them in the 16-bit part before?!
Possible reasons:
• In a time before Tup, it made no actual difference whether these
little files were assembled in the 32-bit or 16-bit part. Now it sort
of does, since we've temporarily given up on minimal rebuilds in the
16-bit part.
• Emphasizing the temporary nature of the 32-bit part by deliberately
moving everything to the 16-bit part as early as possible?
• It all started with the ZUN.COM ASM code, which doesn't include any
other files, and can therefore be perfectly tracked by a Makefile.
Which *was* superior than the exclusive dumb batch file we had in the
past. And then I've simply cargo-culted all new .ASM translation
units into the 16-bit part well.
Oh, and another positive side effect of temporarily not using 16-bit
TASM: The build process now also runs on Windows 95.
Part of P0113, funded by Lmocinemod.
This Tupfile does in fact date back to January 2017, and I've been
using it myself ever since. Time to finally deliver it on master!
Part of P0001, funded by GhostPhanom.