Mostly centered around the HUD, popup, overlay, boss, and player shot
functions we're about to reference in the upcoming decompilations.
Part of P0186, funded by [Anonymous] and Blue Bolt.
If the macro itself is local to a function, these can work in certain
scenarios, but never for global ones.
Part of P0186, funded by [Anonymous] and Blue Bolt.
The single underscore version is actually slightly more supported among
the compilers I've seen so far. Also added the exact list now.
Part of P0183, funded by Yanga and [Anonymous].
Aha! TH05 actually loads every single rendered dialog image
individually before rendering it, either from the EMS area or disk.
That's one way to save memory, I guess?
Part of P0169, funded by Blue Bolt.
Including the pointless DOS I/O variation in TH05's MAIN.EXE.
I'm slowly running out of characters to remove from the first segment
name in that file, though…
Part of P0148, funded by [Anonymous].
It shouldn't need a comment to communicate that this function does in
fact not load all values from the .CFG file that are part of the
resident structure, but only loads and sets the global pointer to that
structure.
Part of P0148, funded by [Anonymous].
Not exclusively used for the boss entrance animations, even though its
data is declared in that general vicinity. It's also used for all bombs
in TH04, and Reimu's and Yuuka's bomb in TH05.
Part of P0147, funded by -Tom- and Ember2528.
We really wouldn't have wanted to start writing inline ASM in the
middle of a conditional expression just to get that janky `CMP AX, 0`
instead of TCC's sensible `OR AX, AX` optimization.
Part of P0146, funded by -Tom- and Ember2528.
This gets rid of a couple of per-entity sprite bitplane types, makes
sprite declarations easier to read by putting width and height next to
each other… and points out a number of array dimension mistakes -.-
Even in places where we can't use it.
Part of P0138, funded by [Anonymous] and Blue Bolt.
Might look uglier, but has the advantage of not generating an empty
segment with the default name… *and* the default padding, which will
really come in handy with the following breakthrough.
Part of P0137, funded by [Anonymous].
Whoops, turns out that the build has been broken on TASM32 version 5.3
(the one in the DevKit) ever since 7897bf1. In contrast to version 5.0
(which I use for my development), 5.3 actually defines 32-bit segments
if you specify a .386 CPU before using .MODEL.
That might have been the reason for the .286 workaround all along?
Turns out there's the USE16 modifier, which makes this much more
explicit than switching CPUs.
eeb4e7e changed the final C translation unit that used this header to
C++, and we got some more helpful inline functions upcoming.
Part of P0136, funded by [Anonymous].
Reason: Pascal calling convention with function parameters but no stack
frame. Theoretically we can __emit__() everything inside this function,
but there's no way we can get a `RETN 8` this way. Oh, and it also
accesses SI and DI without backing them up to the stack.
And thanks to TLINK apparently not reporting fixup overflows when
segments are small enough (?), it took quite a while to get that CALL
correct and not weirdly offset by 32 bytes. 😕
Part of P0134, funded by [Anonymous].
That assembly is *worse* than what you would have gotten out of your
1994 C++ compiler with the 386 code generation switch!
Part of P0134, funded by [Anonymous].
> assigning to the DI register immediately before a CALL
Yeah, no amount of comma operator trickery can get *that* out of this
compiler. Also, these TH05 .PI functions are the only place in PC-98
Touhou with a `IMUL DI, imm8` instruction, which is impossible to get
out of Turbo C++'s built-in assembler.
Well, at least the `if` branches decompile somewhat nicely.
Part of P0134, funded by [Anonymous].
… especially because this one *is* actually undecompilable. Reason:
Base pointer assignment to BX, before saving the SI register on the
stack.
Part of P0134, funded by [Anonymous].
Well, it *would* have been decompilable, but that ridiculous placement
of the nullptr assignment would have forced the entire function call to
be spelled out in inline ASM, verbatim. No amount of comma operator
trickery would have generated the same instructions either. And for a
function this small and obvious in what its decompilation *should* be,
it really defeated the purpose of adding a separate translation unit…
Part of P0134, funded by [Anonymous].
Turns out that ARG RETURNS is only really necessary in DEFCONV
functions, which are explicitly declared to use either the C or PASCAL
calling convention. In functions without such a declaration, ARG by
itself works just fine, and won't emit any instructions on its own.
The parameter lists for PASCAL functions still have to be reversed in
that case, though… oh well, let's just comment these cases to hopefully
reduce the confusion.
Part of P0134, funded by [Anonymous].
`cPtrSize` is simply the wrong constant for calculating parameter
offsets on the stack, because it corresponds to the memory model's
default distance, not the function's distance. Luckily, ARG has a
RETURNS clause, and if you declare all parameters in there, ARG won't
emit that pesky and unnecessary `ENTER 0, 0` instruction. Big discovery
right there!
Sadly, ARG is unusable for ZUN's silly functions that keep the base
pointer in BX. TASM declares the resulting equates as `[BP+offset]`,
and it's apparently impossible to only get `offset` out of such an
equate later.
So, rather than staying with numbers, let's reimplement ARG for these
functions instead. This way, we can even abstract away the stack clear
size for the `RET` instructions.
It's a bit rough around the edges though, forcing you to explicitly
specify the function distance, and to pass the parameters in reverse
order compared to the C declaration (thankfully, all of these use the
PASCAL calling convention). It also doesn't work with more complex
types yet. But certainly better than numbers.
Part of P0134, funded by [Anonymous].
DOS is not the same thing as the underlying CPU, after all. A separate
file not only indicates to future port authors which parts of the code
are x86-specific, but it also speeds up build times…
… in theory, because removing 677 lines from 49 files each doesn't seem
to speed up the build as much as I had hoped? But apparently my whole
system mysteriously got faster in the meantime, and I was getting 22-23
seconds for the entire repo even before this commit. Good enough.
Part of P0134, funded by [Anonymous].
And this is how you make code less undecompilable by improving your
pointless micro-optimizations to use more registers instead of
self-modifying code. Worth it if only to get rid of the branches in
TH04's undecompilable ASM implementation.
Part of P0133, funded by [Anonymous].
The change of pi_free() from a macro to a function in TH05 doesn't
require a complete redefinition.
Part of P0124, funded by [Anonymous] and Blue Bolt.
The TH04 one might have the same function structure, but the only thing
that's actually identical in both games is the picture darkening loop.
Part of P0119, funded by [Anonymous] and -Tom-.
Wow, this is the first time we're about to call any of these from C
land in ≥TH03? Found no built-in way to just uppercase an identifier
in TASM, so apparently we have to spell out the names in both lower-
and uppercase.
So, let's go back to regular, non-macro PUBLIC / PROC / ENDP code
wherever we can – for all functions introduced in ≥TH03, and for
everything that takes no parameters. It's simply not worth the
trouble.
Part of P0114, funded by Lmocinemod.
…Wow. A 32-element lookup table for the very computationally expensive
operation of (i * 320), needlessly limiting the amount of unique 384×80
tile sections in a stage to 32… and then TH05 further "optimizes" this
lookup by pre-multiplying all section IDs in the .STD file with the
element size of that table, to save a grand total of 1 x86 instruction.
Part of P0112, funded by [Anonymous] and Blue Bolt.
Whew, time to look at every `int` variable we ever declared! The best
moment to do this would have been a year ago, but well, better late
than never. No need to communicate that in comments anymore.
These shouldn't be used for widths, heights, or sprite-space
coordinates. Maybe we'll cover that another time, this commit is
already large enough.
Part of P0111, funded by [Anonymous] and Blue Bolt.
TH04's z_super_roll_put_tiny_32x32(), which this is based on, had
checks for not writing empty rows to VRAM. Why would ZUN remove them
here?
Completes P0086, funded by [Anonymous] and Blue Bolt.
And with all possible .COM executables decompiled, this set of changes
reaches an acceptable scope, allowing us to *finally*…
Part of P0077, funded by Splashman and -Tom-.
The supposedly low-hanging fruit that almost every outside contributor
wanted to grab a lot earlier, but (of course) always just for a single
game… Comprehensively covering all of them has only started to make
sense recently 😛
Also, yes, the variable with the uppercase .CFG filename has itself a
lowercase name and vice versa…
Part of P0077, funded by Splashman and -Tom-.
Might as well cover this one while I've got this type of function in my
head. This one is used for the animated star backgrounds during Stage 2
and the Shinki battle.
Completes P0073, funded by [Anonymous] and -Tom-.
Which finally allows us to use the PLANE_SIZE macro in ASM land. Yeah,
(ROW_SIZE * RES_Y) has finally got old.
Part of P0073, funded by [Anonymous] and -Tom-.
WTF. Fine, let's have separate, micro-optimized ASM implementations for
decoding and encoding inr MAIN.EXE, but still, all those minute
difference between OP.EXE and MAINE.EXE…
This is as far as anyone should reasonably go before decompilation;
things will get really ugly with the loading functions once the file
name is involved as well…
Completes P0063, funded by -Tom-.
As used for the title screen fade-in effect. Another function that
apparently was deliberately written to run not that fast, by blitting
each row individually to the 400th VRAM row just so that it can then
turn on the EGC, perform the *actual* masked blit to the VRAM
destination, and then turn the EGC off before moving to the next now.
The same effect could have entirely been accomplished by copying
graph_pack_put_8() and applying the mask there; it's not like ZUN
didn't know how to modify master.lib…
(See also 44ad3eb4) --Nmlgc