About time I finally developed this piece of tech. Towards TH05, this
segment got more and more undecompilable ASM functions mixed inbetween
C ones. Which means that pretty much all of the current ASM land
`#include`s in that segment will have to become translation units. And
we *really* don't want an additional layer of numbered, per-binary
translation units that just `#include` maybe one or two functions.
Also yeah, no _TEXT suffix, to drive home the point that this is a
"library" segment, and not really "owned" by any one file.
Part of P0113, funded by Lmocinemod.
Right, *_TRAM_W refers to 8-pixel halfwidth characters, and *_KANJI_W
to 16-pixel full-width characters… which also include gaiji, which are
what the entire HUD is made out of.
Part of P0112, funded by [Anonymous] and Blue Bolt.
… and this one, while I'm at it. I've been using pretty much every
possible type for VRAM offset variables, depending on my mood that day,
since signedness apparently never matters for those.
Except that it does. And so, just like with most of our high-level
types, we also have to account for ZUN's little signedness
inconsistencies here. Oh well, at least it's now only one of two types,
and there's no need to choose between `int` or `unsigned int` or
`short` or `unsigned short` or `int16_t` or `uint16_t` or `size_t` or…
Part of P0111, funded by [Anonymous] and Blue Bolt.
Whew, time to look at every `int` variable we ever declared! The best
moment to do this would have been a year ago, but well, better late
than never. No need to communicate that in comments anymore.
These shouldn't be used for widths, heights, or sprite-space
coordinates. Maybe we'll cover that another time, this commit is
already large enough.
Part of P0111, funded by [Anonymous] and Blue Bolt.
One fewer magic number. And one more deliberate dependency on a
PC-98-specific hardware constant, to further drive home just how
unportable these games are, even once decompilation will be complete.
Part of P0105, funded by Yanga.
Leading to slight complications in TH02's Music Room and shot type
selection menus. Thought about leaving those in C for a while, but I
still think it's worth it for the consistency we get with the VRAM
offset functions. Also, we'll have similar code for the main menus of
later games, and I'll surely won't be using C++ when starting out with
these.
Part of P0105, funded by Yanga.
Or, in more relevant news: That's the function that forced TH01's
pellet sprites to be defined in C land. First sprite to make that jump.
Part of P0102, funded by Yanga.
A future sprite converter (documented in #8) could then convert these
to C or ASM arrays.
(Except for the piano sprites for TH05's Music Room, which are stored
and used in such a compressed way that it defeats the purpose of
storing them as bitmaps.
Which works in both Borland C++, Open Watcom, and Visual C++.
Not that we're about to port any of the games to these compilers, just
something I noticed while evaluating 32-bit compilers for ReC98's own
32-bit pipeline tools. Modders might want to look into that though,
since 100% position independence also makes it easier to change
compilers.
So even TH01 wasn't 100% C++ after all. Turns out that this function
was the only instance in all of REIIDEN.EXE where ReC98 previously had
different encodings for identical x86 instructions.
Part of P0096, funded by Ember2528.
tiles_invalidate_around() must have had the hardest function signature
to figure out in all of PC-98 Touhou… because it seems impossible to
handle all three ways of passing parameters. No way around separately
declaring it in every translation unit then, with the parameter list
expected by that segment's code generation.
Part of P0089, funded by [Anonymous] and Blue Bolt.
Last one of the shared entity types! The TH05 version of the .STD enemy
VM would now be ready for decompilation in one single future push.
Completes P0088, funded by -Tom-.
Adding op/, main/, and end/ directories does nicely cover a great
majority of the "not really further classifiable slices" implied in
d56bd45.
Part of P0086, funded by [Anonymous] and Blue Bolt.
Again, 11 necessary workarounds, vs. forcing byte aligment in at least
18 places, and that number would have significantly grown in the
future.
Part of P0085, funded by -Tom-.
5 enums where code generation wants an `int`, vs. 11 cases where using
the minimum size is exactly the right default. So it's way more
idiomatic to force those 5 to 16 bits via a dummy element… except that
we can't give it a single, consistent name, because you can't redeclare
the same element in a different enum later.
Oh well, let's have this ugly naming convention instead, which makes it
totally clear that the force element not, in fact, a valid value of
that enum.
Part of P0085, funded by -Tom-.
… which allows a split into first rendering the top part of every
pellet, then the bottom part. This way, the game only needs two
grcg_setcolor() calls for any number of pellets.
Part of P0085, funded by -Tom-.
Ideally, the future sprite compiler should automatically pre-shift such
sprites, and correctly place the shifted variants in memory, by merely
parsing the C header. On disk, you'd then only have a .BMP with each
individual cel at x=0.
And that's why we need macros and consistent naming: To express these
semantics, without having to duplicate the sprite declaration in some
other format. sSPARKS[8][8][8] wouldn't help anyone 😛
Now, we could go even further there by defining a separate type
(`preshifted_dots8_t`), and maybe get rid of the _W macro by replacing
it with a method on that type. However,
• that would be inconsistent, since we'll need the _H macro anyway, for
both the actual rendering code and the sprite compiler
• we couldn't directly call such a method on a 2D or 3D array, and have
to go down to a single element to do so (`sSPARKS[0][0][0].w()`)
• making it a static method instead duplicates the type all over the
code
• and any variables of that type would no longer be scalar-type values
that can be stored in registers, requiring weird workarounds in those
places. As we've already seen with subpixels.
Part of P0085, funded by -Tom-.
I tried `brge` for the latter, but that had *the* most horrible
ergonomics, and I misspelled it as `bgre` 100% of the times I typed it
manually. Turns out that `dots` is also consistent with master.lib's
naming scheme, leaving `planar` to *actually* refer to types storing
multiple planes worth of pixels. These types are showing up more and
more, and deserve something better than their previous long-winded and
misleading name.
Part of P0081, funded by Ember2528.
And with all possible .COM executables decompiled, this set of changes
reaches an acceptable scope, allowing us to *finally*…
Part of P0077, funded by Splashman and -Tom-.
And of course, TH05 ruins the consistency once again. Sure, the added
file error handling is nice, but we also have changes in the playful
messages (lol), and now need a third distinct optimization barrier
(🤦)… But as it turns out, inlined calls to empty functions work as
well. They also seem closer to what ZUN might have actually written
there, given that their function body could have been removed by the
preprocessor, similar to the logging functions in the Windows Touhou
games. (With the difference that the latter infamously *aren't*
inlined…)
Part of P0077, funded by Splashman and -Tom-.
Huh, C++ wants its `char`s to be unsigned in order to *not* sign-extend
them to 16 bits for comparison against ASCII literals?!
Anyway, that completes TH03's ZUN.COM, with bascially no new C code.
Part of P0077, funded by Splashman and -Tom-.
The supposedly low-hanging fruit that almost every outside contributor
wanted to grab a lot earlier, but (of course) always just for a single
game… Comprehensively covering all of them has only started to make
sense recently 😛
Also, yes, the variable with the uppercase .CFG filename has itself a
lowercase name and vice versa…
Part of P0077, funded by Splashman and -Tom-.
Oh wow, caff4fe introduced wrong bytes into RES_KSO.COM by confusing
[cfg_bombs] with [credit_lives] -.- Which I didn't find out back then
because all the RES_*.COM binaries still had some different instruction
encodings anyway and I just didn't care enough to base my diff of those
files on the wrong encoding versions to notice the bug…
Whoops.
Part of P0077, funded by Splashman and -Tom-.
4½ years after aa56a7c, it turns out that the correct decompilation
involves… no-ops generated by assigning variables that just happen to
be in registers to themselves?! Which does get optimized out, but only
after TCC folded identical tail code in all branches of a function,
thus effectively functioning as an optimization barrier.
The initial attempt used register pseudovariables, but this definitely
is the best possible way this could work – portable, and doesn't
unnecessarily shred the code into tiny inlined functions pieces. The
mindblowing thing here is that ZUN could have actually written this to
have additional, albeit unnecessary, lines to place breakpoints on. But
that means he must have chosen those two local variables in SI and DI
completely by chance… 🤯
The best thing though? ~#pragma inline is gone~
Part of P0076, funded by [Anonymous] and -Tom-.
Necessary to make string literals from the first one end up at their
correct positions in the data segment even after the upcoming
deduplication…
Part of P0076, funded by [Anonymous] and -Tom-.
Oh hey, guarding declarations with complicated types via #ifdef limits
the header files we additionally have to #include!
Part of P0076, funded by [Anonymous] and -Tom-.
Since they're determined by the order of sprites in a .BFT file,
they're best auto-generated by an enum as much as possible.
Part of P0074, funded by Myles.
Which finally allows us to use the PLANE_SIZE macro in ASM land. Yeah,
(ROW_SIZE * RES_Y) has finally got old.
Part of P0073, funded by [Anonymous] and -Tom-.
Templates would have been nicer, but as soon as you add just one
non-immediate parameter, Turbo C++ generates a useless store to a new
local variable, ruining the generated code.
Part of P0069, funded by [Anonymous] and Yanga.
I did consider not doing this, because "well, can't anyone who's
*actually* interested just diff the TH01 and TH02 implementations to
figure out the differences themselves", but that duplication ended up
feeling too filthy after all.
And hey, it's a nice excuse to update TH02's version to current naming
standards! 😛
Part of P0068, funded by Yanga.
The TH04/TH05 BGM/SE mode setup is a good example for code where
different structure field offsets will vanish completely upon reverse-
engineering. If we continued to use the per-game ID string as the
variable name, we'd only have another game-specific "difference" there.
Part of P0065, funded by Touhou Patch Center.
Funny how the actual scores are stored as little-endian gaiji strings
in the bold font, yet never actually used as such.
Part of P0063, funded by -Tom-.
So many things named `score_*`, so many things named `hiscore_*`…
Let's go with `scoredat_*`, which clearly indicates that this stuff is
saved into a file, while still being only 8 characters.
Part of P0063, funded by -Tom-.
At least wherever Turbo C++ and master.lib want us to use a
non-pointer, since both use uint16_t for segment values throughout
their APIs instead of the more sensible void __seg*. Maybe, integer
arithmetic on segment values was widely considered more important than
dereferencing?
*Finally*. We already used `(unsigned) int` in quite a few places where
we actually want a 16-bit value, which was bound to annoy future port
developers.
Not applying this leak to TH03 since it would have more than one
`key_det` variable, resulting in names that are as much fanfiction as
the current ones…
Yup, function parameters that can clearly be identified as coordinates
are by far the fastest way to raise the calculated position
independence percentage. Kinda makes it sound like useless work, which
I'm only doing because it's dictated by some counting algorithm on a
website, but decompilation will want to un-hex all of these values
anyway. We're merely doing that right now, across all games.
Part of P0058, funded by -Tom-.
That should make this convoluted copypasta a bit easier to read. And
sure, I could have done something about the loop as well, but
SHOT_FUNC_INIT already hides enough control flow behind a macro…
Part of P0037, funded by zorg.
With both 16- and 32-bit build parts soon having full dependency
tracking, having more small includes wins out over having fewer, larger
ones – and also, over having to fix tons of macro conflicts that stem
from most .inc files assuming the context of the big .asm files.
Case in point, including ReC98.inc doesn't work right now without
defining a .MODEL, which is counter-productive for ASM compilation
units.
Part of P0035, funded by zorg.
The TH02 version is a piece of cake…
… but TH04 starts turning it into this un-decompilable piece of
unnecessarily micro-optimized ZUN code. Couldn't have chosen anything
better for the first separate ASM translation unit.
Aside from now having to convert names of exported *variables* to
uppercase for visibility in ASM translation units, the most notable
lesson in this was the one about avoiding fixup overflows. From the
Borland C++ Version 4.0 User's Guide:
"In an assembly language program, a fixup overflow frequently
occurs if you have declared an external variable within a
segment definition, but this variable actually exists in a
different segment."
Can't be restated often enough.
Completes P0032, funded by zorg.
Rule of thumb going forward: Everything that emits data is .asm,
everything that doesn't is .inc.
(Let's hope that th01_reiiden_2.inc won't exist for that much longer!)
Part of P0032, funded by zorg.
Many thanks to http://bytepointer.com/tasm/index.htm for providing a
better searchable resource for TASM's default `LEA imm16` → `MOV imm16`
optimization, which we initially had to hack around here.
Funded by -Tom-.
Yes, you're reading that correctly. If the cursor is at 255, reading a
16-bit value will fill the upper 8 bits with the neighboring cursor
value, which always is 0xFF.
Funded by -Tom-.
Which is the last step on the way to completely position-independent
code, with no random hex numbers that should have been data pointers,
but weren't automatically turned into data pointers by IDA because
they're only ever addressed in the indirect fashion of
mov bx, [bp-array_index]
mov ax, [bx+0D00h] ; 0D00h is obviously an array of some sort
Removing all of these makes it practicable to add or delete code without
breaking the game in the process. Basic "modding", so to speak.
Automatically catching all possible cases where this happens actually
amounts to emulating the entire game, and *even then*, we're not
guaranteed that the *size* of the array just falls out as a byproduct
of this emulation and the tons of heuristics I would have thrown on top
of that. ZUN hates proper bounds checking and the correct size of each
array may simply never be implied anywhere.
So, rather than going through all that trouble of that (and hell, I
haven't even finished *parsing* this nasty MASM assembly format), and
since nothing really has happened in this project for almost two years,
I chose to just turn this into a text manipulation issue and figure out
the rest manually. Yeah, quick and dirty, and it probably won't scale if
I ever end up doing the same for PC-98 Policenauts, but it'd better work
at least for the rest of PC-98 Touhou.
Trying to do one of those per day from now on. Probably won't make it
due to the reverse-engineering effort required for the big main
executables of each game, but it'd sure be cool if I did.
This, hands down, has been the single worst stretch of decompilation so far.
Three extremely difficult functions that each still required inline assembly.
And no, this didn't even work out with any of the optimization features in
Borland C++ that aren't included in Turbo C++.
Oh, right, these functions can have parameters. So, let's turn snd_kaja_func()
into a macro that combines the function number and the parameter into the AX
value for the driver.
And renaming them all to the short filenames they will be decompiled to for
consistency. These functions aren't really immediately hardware-related, as
we've established earlier in the decompilation.
Only one code segment left in both OP and FUUIN! its-happening.gif
Yeah, that commit is way larger than I'm comfortable with, but none of these
functions is particularly large or difficult to decompile (with the exception
of graph_putsa_fx(), which I actually did weeks ago), and OP and MAIN have
their own unique functions in between the shared ones, so…
So yeah, after ignoring this issue for a week, we indeed have no choice but to
decompile these functions into this horrible mess of C and inline assembly.
And you know what? Since the compiled result still matches with ZUN's binary,
it's entirely possible that this *was* the original format this code was
written in! Seriously, how intoxicated do you have to be to write (or rather,
slur) code like this?
Keeping these functions entirely in assembly would have surely been better.
However, it would have made linking practically impossible, especially for the
later games which still need them in the current assembly slice format.
MAIN.EXE shares most of the code in this segment, but I can't remove it from
there right now due to the weird ordering of the data segments in that
executable…
And yes, once again, those three seemingly random type casts in here are
*necessary* to build a bit-perfect binary.
Small detour into MAINE.EXE because it has all the juicy algorithms that will
explain the remaining unknown members of the highscore data structure, and
there's this one code segment here we need to get out of the way first.
Oh, OK, so this is what the PC-98 GRCG is all about. You call grcg_setcolor(),
and that puts the PC-98 hardware in some sort of "monochromatic mode". Then,
you just write your pixels into any *single* one of the 4 VRAM bitplanes. This
causes the hardware to automatically write to *all* bitplanes in such a way
that the final palette index for each of the 8, 16, or 32 pixels you just wrote
a 1 value to will actually end up to match the color you set earlier.
Don't forget to call grcg_off() at the end though, or you can't draw any
non-monochromatic graphics, heh.
Yes, all of it. Including the bouncing polygons, of course. And since it's
placed at the end of ZUN's code inside the executable, the code's already
position-independent and fully hackable.
Don't really understand the other games yet because they start introducing
joystick support and TH03 has multiplayer and then there are these master.lib
modifications that don't really make any sense to me, especially when you add
that TH04 seemingly does not read js_stat *at all*, yet still works just fine
with a gamepad and... urgh.
Also covering the two variations for blitting only every second row or
blitting only a 320x200 quarter, as seen in the endings.
So yeah, there's indeed nothing wrong with piread.cpp. TH03 just uses that
separate function that only blits every second row of an image, and indeed
always loads the entire image as it would appear in a PNG conversion. Here's
what happens if you display these images using the non-interlacing function:
https://www.dropbox.com/s/885krj09d9l0890/th03%20PI%20no%20interlace.png
With TH03 changing the calling convention for most of the code from __cdecl to
__pascal, I've been getting more and more confused about this myself. So,
let's settle on the following consistent syntax for function calls:
* C where the calling convention is actually __cdecl and where TASM's emitted
__cdecl code matches the original binary
* PASCAL where the calling convention is actually __pascal
* STDCALL where the calling convention is actually __cdecl, but where
the caller either defers stack cleanup (summing up the stack size of
multiple functions, then cleaning it all in a single "add sp" instruction)
or where the stack is cleared in a different way (e.g. "pop cx").
Unfortunately though, when using the ARG directive to automatically generate
an appropriate RET instruction for the given calling convention, TASM always
emits ENTER and LEAVE instructions even when no local variables are declared,
which greatly limits the number of functions where we can use that syntax. -.-
Note how it's only one *mode* in TH02/TH03, but two *modes* in TH04/TH05,
since you can't select between FM and Beep sound effect modes in TH02/TH03 (or
even disable sounds altogether). Might be a bit confusing, but it seemed
appropriate enough to distinguish the two functions.
Well, the naming.
Even though only TH02 actually uses MIDI (and thus, the MMD driver), every
game since then contains interrupt instructions for both functions. We could
just name it "pmd", since it seems like that's what came first - the AH
numbers of the 6 functions that make up MMD's interrupt API are identical to
those of the equivalent functions in PMD, even including gaps in the numbering
for PMD functions that don't have an equivalent in MIDI. However, except for
the FM sound effect handling and the key display in TH05's Music Room, these 6
functions are all the games actually use. Also, we already distinguish between
PMD and MMD in the driver check functions, and it might be confusing to only
imply PMD from now on?
So, "kaja" it is, collectively referring to the shared aspects of both
drivers.
For 32-bit immediate values, PUSH by itself is enough. For everything else,
PUSHD works in both TASM and JWasm.
Also, could it be...? Could we actually move to JWasm without breaking the
build in TASM at all?
... and then I end up copying modified versions into the individual game
subdirectories after all, because the changes between games were simply too
drastic. (That's also why I'm counting pfopen() itself twice.)
Only one slice left now, and then we're done with reduction!
Yup, packfiles finally proved that we really have a different set of changes
to master.lib in every game. Also, there are bound to be more of these game-
specific small changes to otherwise identical code in ZUN's own code.
And hey, no need to define that value in the build scripts anymore.
(I've also considered just copying modified versions into the individual game
subdirectories, but it's not too nice to expect people to diff them in order
to actually understand why these copies exist and where the changes actually
are.)