This gets rid of a couple of per-entity sprite bitplane types, makes
sprite declarations easier to read by putting width and height next to
each other… and points out a number of array dimension mistakes -.-
Even in places where we can't use it.
Part of P0138, funded by [Anonymous] and Blue Bolt.
Sure, we can't use them everywhere, but it's really nice to get rid of
that casting madness – and any explicit references to x86 memory
segmentation – wherever we can.
Part of P0138, funded by [Anonymous] and Blue Bolt.
Segment alignment forces us to do all of those at once… but now, we've
not only caught up with the segment split point in TH04's OP.EXE and
MAINE.EXE, but also decompiled all instances of DEFCONV functions!
Part of P0138, funded by [Anonymous] and Blue Bolt.
Second previously undecompilable translation unit, second creative
workaround for the workaround. We can't compile snd_se_play() with -WX,
as that function needs a stack frame, and it's also illegal to disable
-WX in the middle of a translation unit. But since we only need word
alignment in front of snd_se_reset() *and* that function is identical
in all 4 games, it makes sense to move it to its own translation unit.
And then you notice that the TH02/TH03 and TH04/TH05 versions of the
other two functions are basically identical. The small differences can
easily be moved out to inline functions, leaving us with a single
implementation file for all 4 games. Nice!
Part of P0137, funded by [Anonymous].
Might look uglier, but has the advantage of not generating an empty
segment with the default name… *and* the default padding, which will
really come in handy with the following breakthrough.
Part of P0137, funded by [Anonymous].
Whoops, turns out that the build has been broken on TASM32 version 5.3
(the one in the DevKit) ever since 7897bf1. In contrast to version 5.0
(which I use for my development), 5.3 actually defines 32-bit segments
if you specify a .386 CPU before using .MODEL.
That might have been the reason for the .286 workaround all along?
Turns out there's the USE16 modifier, which makes this much more
explicit than switching CPUs.
Reason: Self-modifying. -.-
Also, why no GRCG? Would have allowed blitting via REP MOVSD… Might as
well optimize all the way if you're going the ASM route to begin with.
Part of P0136, funded by [Anonymous].
Eh, REP MOVSD is used too inconsistently across the games to justify
replacing these macros with an `inline` function. We can still use a
custom one here to make the register usage a bit more explicit, though.
Part of P0136, funded by [Anonymous].
Reason: Self-modifying. -.-
The TH05 version *might* be decompilable into a mess. Don't have time
for that right now, though.
Part of P0136, funded by [Anonymous].
eeb4e7e changed the final C translation unit that used this header to
C++, and we've got some more helpful inline functions coming up.
Part of P0136, funded by [Anonymous].
Almost undecompilable, until you remember that you can inhibit the
optimization of keeping a function parameter in a register by having an
`inline` function take a reference. Yet another function that wouldn't
have decompiled that nicely if we had restricted ourselves to C…
Part of P0136, funded by [Anonymous].
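
A minimal source-level sketch of that trick, for illustration only.
`clamp_max()` and `clamped_sum()` are hypothetical names, not functions
from the games. The point is merely the shape: a by-value parameter is
modified through an `inline` helper that takes a reference. Whether
this keeps the parameter out of a register depends entirely on the
compiler; a modern optimizer will most likely see right through it.

```cpp
#include <cassert>

// Hypothetical helper. Taking `value` by reference means the caller's
// variable has to be addressable at the source level.
inline void clamp_max(int& value, int max)
{
    if(value > max) {
        value = max;
    }
}

// Hypothetical caller: `a` and `b` are by-value parameters that get
// modified through the reference-taking inline helper.
int clamped_sum(int a, int b, int max)
{
    assert(max >= 0);
    clamp_max(a, max);
    clamp_max(b, max);
    return (a + b);
}
```

Writing `if(a > max) a = max;` directly in `clamped_sum()` is the same
thing semantically; only the generated code may differ.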
Not that it really fits there either, but I've been trying to keep the
th0?/ directories free from any actual code. They should only contain
the distinct translation units within the original three .EXE binaries,
`#include`ing files from subdirectories, along with maybe game-specific
`#pragma`s, but contain no code on their own. Port authors would simply
ignore those, and link everything from the subdirectories into one
binary. That approach has seemed to make the most sense for all of this
so far.
Part of P0135, funded by [Anonymous].
Allowing us to consistently mirror the declaration in pc98.inc
without adding a planar.inc file. 😛 And points us to two more
dots8_t* arrays that should have used the Planar<> template.
Part of P0135, funded by [Anonymous].
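
For readers who haven't seen the idea before, here is a rough sketch of
what a Planar<> template could look like. This is not the declaration
from the repo: the member names B, R, G, E, the `PLANE_COUNT` constant,
and the `combined_mask()` helper are all assumptions made up for this
example.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint8_t dots8_t; // 8 horizontal pixels of a single bitplane

enum { PLANE_COUNT = 4 }; // assumption: four bitplanes

// One value of type T per bitplane, addressable by name or by index.
template <class T> struct Planar {
    T B, R, G, E; // assumed plane names

    T& operator [](int plane) {
        assert(plane >= 0 && plane < PLANE_COUNT);
        switch(plane) {
        case 0:  return B;
        case 1:  return R;
        case 2:  return G;
        default: return E;
        }
    }
};

// Hypothetical helper: ORs together the byte at `offset` in all planes.
// Takes a Planar<dots8_t *> where a bare dots8_t* array would be the
// less descriptive alternative.
inline dots8_t combined_mask(Planar<dots8_t *>& planes, int offset)
{
    dots8_t ret = 0;
    for(int i = 0; i < PLANE_COUNT; i++) {
        ret |= planes[i][offset];
    }
    return ret;
}
```

Compared to `dots8_t *planes[4]`, the template spells out what the four
elements mean, which is exactly what makes forgotten cases stand out.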
A decompilation of ZUN-written ASM that was almost worth it, for once!
Too bad that those aren't the <string.h> intrinsics that the
Wolfenstein 3D disassembly hinted at, though.
Part of P0135, funded by [Anonymous].
Turns out that ARG RETURNS is only really necessary in DEFCONV
functions, which are explicitly declared to use either the C or PASCAL
calling convention. In functions without such a declaration, ARG by
itself works just fine, and won't emit any instructions on its own.
The parameter lists for PASCAL functions still have to be reversed in
that case, though… oh well, let's just comment these cases to hopefully
reduce the confusion.
Part of P0134, funded by [Anonymous].
`cPtrSize` is simply the wrong constant for calculating parameter
offsets on the stack, because it corresponds to the memory model's
default distance, not the function's distance. Luckily, ARG has a
RETURNS clause, and if you declare all parameters in there, ARG won't
emit that pesky and unnecessary `ENTER 0, 0` instruction. Big discovery
right there!
Sadly, ARG is unusable for ZUN's silly functions that keep the base
pointer in BX. TASM declares the resulting equates as `[BP+offset]`,
and it's apparently impossible to only get `offset` out of such an
equate later.
So, rather than staying with numbers, let's reimplement ARG for these
functions instead. This way, we can even abstract away the stack clear
size for the `RET` instructions.
It's a bit rough around the edges though, forcing you to explicitly
specify the function distance, and to pass the parameters in reverse
order compared to the C declaration (thankfully, all of these use the
PASCAL calling convention). It also doesn't work with more complex
types yet. But certainly better than numbers.
Part of P0134, funded by [Anonymous].
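
The arithmetic such a reimplementation has to do can be sketched in
C++. This is not the TASM macro itself, just the underlying
calculation; `Distance`, `Frame`, and `layout_pascal()` are names
invented for this sketch. It assumes 16-bit real-mode code, where a
near return address takes 2 bytes and a far one takes 4, and where
every parameter occupies a multiple of 2 bytes on the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size of the return address on the stack, in bytes.
enum Distance { NEAR_FUNC = 2, FAR_FUNC = 4 };

struct Frame {
    std::vector<unsigned> offsets; // per parameter, in the order passed in
    unsigned ret_bytes;            // operand for the final `RET n`
};

// `sizes_reversed`: stack size of each parameter, with the last
// parameter of the C declaration first. With the PASCAL convention,
// that one is pushed last and sits right above the return address.
// `saved_bp_bytes`: 2 for a regular `PUSH BP / MOV BP, SP` frame, 0 if
// the function only copies SP into another register without pushing.
Frame layout_pascal(
    Distance dist,
    unsigned saved_bp_bytes,
    const std::vector<unsigned>& sizes_reversed
)
{
    Frame f;
    const unsigned first = (saved_bp_bytes + dist);
    unsigned offset = first;
    for(std::size_t i = 0; i < sizes_reversed.size(); i++) {
        assert((sizes_reversed[i] % 2) == 0);
        f.offsets.push_back(offset);
        offset += sizes_reversed[i];
    }
    // PASCAL functions remove their own parameters from the stack.
    f.ret_bytes = (offset - first);
    return f;
}
```

For a far function that takes two words and a far pointer and doesn't
push BP, `layout_pascal(FAR_FUNC, 0, {2, 2, 4})` yields the offsets 4,
6, and 8, and a stack clear size of 8 bytes.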
DOS is not the same thing as the underlying CPU, after all. A separate
file not only indicates to future port authors which parts of the code
are x86-specific, but it also speeds up build times…
… in theory, because removing 677 lines from 49 files each doesn't seem
to speed up the build as much as I had hoped? But apparently my whole
system mysteriously got faster in the meantime, and I was getting 22-23
seconds for the entire repo even before this commit. Good enough.
Part of P0134, funded by [Anonymous].
Getting us completely macro-free there… even though it did require a
separate version of those functions if the ID is a pointer.
Part of P0134, funded by [Anonymous].
Turns out the inlining behavior of `const` variables at global scope
that we've been relying on lately is actually exclusive to C++ mode…
once again!
Part of P0133, funded by [Anonymous].
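
The language-level difference behind this can be shown with a small
C++ snippet. The names are hypothetical and not taken from the repo. In
C++, a `const int` at namespace scope that is initialized with a
constant has internal linkage and can itself be used as a constant
expression, for example as an array size. In C, the same variable is
not a constant expression, so the array declaration below would not
compile there.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant. Valid as an array size in C++, but not in C.
const int PLANE_COUNT = 4;

int plane_offsets[PLANE_COUNT];

inline std::size_t plane_offset_count()
{
    return (sizeof(plane_offsets) / sizeof(plane_offsets[0]));
}

// Bounds-checked accessor for the array above.
inline int& plane_offset(int plane)
{
    assert(plane >= 0 && plane < PLANE_COUNT);
    return plane_offsets[plane];
}
```

In C, the usual replacements are `#define` or an `enum` constant.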
Reason: Manual "tail call optimization" of input_reset_sense(), with
execution falling through to input_sense() immediately below.
Part of P0133, funded by [Anonymous].
Umm… but this can't be in the same translation unit as frame_delay(),
because OP.EXE has cdg_put_nocolors() in between, which means we'd have
to compile it twice.
What probably happened there: ZUN originally wrote this in C when
frame_delay() was still next to it, then generated ASM from it,
tinkered with that, and ultimately only linked that ASM into the final
game, with the NOPCALL still in there. That might very well be the one
temporary NOPCALL workaround we can never get rid of…
Oh well, at least we got lucky with the padding, and can keep the
cdg_put_nocolors() decompilation from the last commit.
Part of P0133, funded by [Anonymous].
And this is how you make code less undecompilable by improving your
pointless micro-optimizations to use more registers instead of
self-modifying code. Worth it if only to get rid of the branches in
TH04's undecompilable ASM implementation.
Part of P0133, funded by [Anonymous].
Undecompilable again. The loading functions have these *_noalpha()
variants that simply set a global variable and fall through to the
regular functions, while cdg_free() has its first `PUSH DI` instruction
after the first expression we'd be decompiling. cdg_free_all() *could*
be decompiled… but would also require _FLAGS trickery, and it's simply
not worth starting a translation unit for one such small function.
Part of P0127, funded by [Anonymous].
Actually fairly average, as far as unreasonable decompilations are
concerned. No `goto`, at least! Another place that would benefit from
EGC raster op documentation, though.
Also, got one more padding byte in TH05's MAINE.EXE correct. 🙂
Part of P0126, funded by [Anonymous] and Blue Bolt.
Rather than preferring either the Microsoft/Watcom `(in|out)pw?` style,
or the Borland `(in|out)portb?` style, master.lib had to introduce its
own `(OUT|IN)P[BW]` naming scheme… Insert obligatory xkcd standards
comic.
Part of P0126, funded by [Anonymous] and Blue Bolt.
Yup, no trick there. If the selection moves to the other character, the
original background behind the raised top and left edges has to be
blitted back to VRAM, which means that it also has to be stored
somewhere. TH04 backs up exactly the two 256×8 and 8×244 strips behind
Reimu and Marisa, requiring 2 KB of heap memory, whereas TH05 simply
gave up, and backs up the entire 640×400 screen, totalling 128 KB.
Part of P0125, funded by [Anonymous].
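
The two numbers can be double-checked with some quick arithmetic. This
sketch assumes four bitplanes at 1 bit per pixel, and reads the TH04
case as one 256×8 strip plus one 8×244 strip. The function names are
made up for this example, and "KB" is taken to mean 1000 bytes.

```cpp
#include <cassert>

const long PLANE_COUNT = 4; // assumption: four bitplanes, 1 bit per pixel

// Bytes needed to back up a w×h pixel rectangle on all bitplanes.
// `w` has to be a multiple of 8, since each byte holds 8 pixels.
inline long planar_bytes(long w, long h)
{
    assert((w % 8) == 0);
    return ((w / 8) * h * PLANE_COUNT);
}

// TH04: top edge strip (1024 bytes) + left edge strip (976 bytes).
inline long th04_backup_bytes()
{
    return (planar_bytes(256, 8) + planar_bytes(8, 244));
}

// TH05: the entire screen, at 80 bytes per row and plane.
inline long th05_backup_bytes()
{
    return planar_bytes(640, 400);
}
```

Under these assumptions, TH04 comes out at 2,000 bytes and TH05 at
128,000 bytes, a factor of 64.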