Eh, REP MOVSD is used too inconsistently across the games to justify
replacing these macros with an `inline` function. Still can use a
custom one here to make the register usage a bit more explicit, though.
Part of P0136, funded by [Anonymous].
Reason: Self-modifying. -.-
The TH05 version *might* be decompilable into a mess. Don't have time
for that right now, though.
Part of P0136, funded by [Anonymous].
Almost undecompilable, until you remember that you can inhibit the
optimization of keeping a function parameter in a register by having an
`inline` function take a reference. Yet another function that wouldn't
have decompiled that nicely if we had restricted us to C…
Part of P0136, funded by [Anonymous].
If anything, these decompilations of barely decompilable functions can
at least indicate that I tried. 🤷 Nice __fastcall abuse though, which
allows us to least formalize *some* of the implicit state passed
between those functions.
Completes P0135, funded by [Anonymous].
Reasons:
• piano_fm_part_put_raw(): SI register referenced and not saved on
the stack
• piano_current_note_from(): Would be decompilable… into a mess.
Not worth adding a separate translation unit just for it.
• piano_part_keys_put_raw(): DI register saved before the SI register
• piano_pressed_key_put(): DI register referenced and not immediately
saved on the stack
• piano_label_put_raw(): SI and DI registered referenced and not saved
on the stack
• grcg_setcolor_direct_seg1_raw(): Let's procrastinate this one until
we have to reference all of these instances in C land.
And we could have even emitted that PIANO_KEY_PRESSED_TOP pixel data
into the code segment, by using `#pragma option -z` to give identical
names to both the code and the data segment. At least we can decompile
the first two functions here.
Part of P0135, funded by [Anonymous].
A decompilation of ZUN-written ASM that was almost worth it, for once!
Too bad that those aren't the <string.h> intrinsics that the
Wolfenstein 3D disassembly hinted at, though.
Part of P0135, funded by [Anonymous].
Reason: Pascal calling convention with function parameters but no stack
frame. Theoretically we can __emit__() everything inside this function,
but there's no way we can get a `RETN 8` this way. Oh, and it also
accesses SI and DI without backing them up to the stack.
And thanks to TLINK apparently not reporting fixup overflows when
segments are small enough (?), it took quite a while to get that CALL
correct and not weirdly offset by 32 bytes. 😕
Part of P0134, funded by [Anonymous].
> try to debug weird linker error
> find performance secret sauce
Removes roughly 1 second from the build time of the full repo!
Part of P0134, funded by [Anonymous].
> assigning to the DI register immediately before a CALL
Yeah, no amount of comma operator trickery can get *that* out of this
compiler. Also, these TH05 .PI functions are the only place in PC-98
Touhou with a `IMUL DI, imm8` instruction, which is impossible to get
out of Turbo C++'s built-in assembler.
Well, at least the `if` branches decompile somewhat nicely.
Part of P0134, funded by [Anonymous].
Well, it *would* have been decompilable, but that ridiculous placement
of the nullptr assignment would have forced the entire function call to
be spelled out in inline ASM, verbatim. No amount of comma operator
trickery would have generated the same instructions either. And for a
function this small and obvious in what its decompilation *should* be,
it really defeated the purpose of adding a separate translation unit…
Part of P0134, funded by [Anonymous].
Getting us completely macro-free there… even though it did require a
separate version of those functions if the ID is a pointer.
Part of P0134, funded by [Anonymous].
Reason: Wants to be word-aligned, and the previous version in OP.EXE,
game_exit(), is not, despite having an even length :(
Oh well, at least I'm confident enough about it by now to document it.
And out of all decompilations to be thrown away, this is a pretty
dispensable one.
Part of P0133, funded by [Anonymous].
My repo will never include all of master.lib anyway, as explained in
https://rec98.nmlgc.net/blog/2020-11-03.
That gets me a median build time of 27 seconds for the entirety of
ReC98 on my system, just as estimated in that same blog post. And we're
far from done with #include optimizations…
Part of P0133, funded by [Anonymous].
This might have almost been the case that justified making master.hpp
compatible with C translation units… but even this one would like to
use the template functions from th01/math/vector.hpp.
Part of P0133, funded by [Anonymous].
Since I'd still prefer to not use include guards *and* to have that
standalone .GRZ viewer, these would have actually caused somewhat of
an issue with the upcoming final stretch of the master.hpp transition.
Part of P0133, funded by [Anonymous].
Reason: Manual "tail call optimization" of input_reset_sense(), with
execution falling through to input_sense() immediately below.
Part of P0133, funded by [Anonymous].
Umm… but this can't be in the same translation unit as frame_delay(),
because OP.EXE has cdg_put_nocolors() inbetween, which means we'd have
to compile it twice.
What probably happened there: ZUN originally wrote this in C when
frame_delay() was still next to it, then generated ASM from it,
tinkered with that, and ultimately only linked that ASM into the final
game, with the NOPCALL still in there. That might very well be the one
temporary NOPCALL workaround we can never get rid of…
Oh well, at least we got lucky with the padding, and can keep the
cdg_put_nocolors() decompilation from the last commit.
Part of P0133, funded by [Anonymous].
And this is how you make code less undecompilable by improving your
pointless micro-optimizations to use more registers instead of
self-modifying code. Worth it if only to get rid of the branches in
TH04's undecompilable ASM implementation.
Part of P0133, funded by [Anonymous].
How was this game even *built*, originally, if it uses *both* a common
shared set of library functions *and* obviously copy-pasted and
separately compiled versions of some of these functions?
Part of P0132, funded by [Anonymous].