And with all possible .COM executables decompiled, this set of changes
reaches an acceptable scope, allowing us to *finally*…
Part of P0077, funded by Splashman and -Tom-.
The TH04/TH05 BGM/SE mode setup is a good example for code where
different structure field offsets will vanish completely upon reverse-
engineering. If we continued to use the per-game ID string as the
variable name, we'd only have another game-specific "difference" there.
Part of P0065, funded by Touhou Patch Center.
Rule of thumb going forward: Everything that emits data is .asm,
everything that doesn't is .inc.
(Let's hope that th01_reiiden_2.inc won't exist for that much longer!)
Part of P0032, funded by zorg.
Oh, right, these functions can have parameters. So, let's turn snd_kaja_func()
into a macro that combines the function number and the parameter into the AX
value for the driver.
And renaming them all to the short filenames they will be decompiled to for
consistency. These functions aren't really immediately hardware-related, as
we've established earlier in the decompilation.
Only one code segment left in both OP and FUUIN! its-happening.gif
Yeah, that commit is way larger than I'm comfortable with, but none of these
functions is particularly large or difficult to decompile (with the exception
of graph_putsa_fx(), which I actually did weeks ago), and OP and MAIN have
their own unique functions in between the shared ones, so…
So yeah, after ignoring this issue for a week, we indeed have no choice but to
decompile these functions into this horrible mess of C and inline assembly.
And you know what? Since the compiled result still matches with ZUN's binary,
it's entirely possible that this *was* the original format this code was
written in! Seriously, how intoxicated do you have to be to write (or rather,
slur) code like this?
Keeping these functions entirely in assembly would have surely been better.
However, it would have made linking practically impossible, especially for the
later games which still need them in the current assembly slice format.
Oh, OK, so this is what the PC-98 GRCG is all about. You call grcg_setcolor(),
and that puts the PC-98 hardware in some sort of "monochromatic mode". Then,
you just write your pixels into any *single* one of the 4 VRAM bitplanes. This
causes the hardware to automatically write to *all* bitplanes in such a way
that the final palette index for each of the 8, 16, or 32 pixels you just wrote
a 1 value to will actually end up to match the color you set earlier.
Don't forget to call grcg_off() at the end though, or you can't draw any
non-monochromatic graphics, heh.
Yes, all of it. Including the bouncing polygons, of course. And since it's
placed at the end of ZUN's code inside the executable, the code's already
position-independent and fully hackable.
Don't really understand the other games yet because they start introducing
joystick support and TH03 has multiplayer and then there are these master.lib
modifications that don't really make any sense to me, especially when you add
that TH04 seemingly does not read js_stat *at all*, yet still works just fine
with a gamepad and... urgh.
Well, duh, of course, we *can* do this in order to allow decompilation to be
started at the end (not the beginning) of any segment. In fact, if we hadn't
done this, we would have had to start by moving _TEXT out to libraries....
This took long enough, so we're not covering the COM files right now. Like, I
can't even tell how you're supposed to work around the forced word alignment
for the _TEXT segment. Guess we'll just have to decompile all of these in one
go, just like we did with ZUNSOFT.COM.
Also, it really seems as if we're merely trading one ugly workaround for
another in our quest for identical binaries.
I've looked at every openly available piece of PC-98 documentation, and there
don't seem to be any official names for the individual planes. The closest
thing I could find was the description at
http://island.geocities.jp/cklouch/column/pc98bas/pc98disphw2.htm
explaining that they represent the blue, red, green, and brightness component
when using the default PC-98 palette. However, these planes correspond to
nothing else but the 4 individual bits of the final index into the color
palette, and you can assign any color to every single palette slot. Therefore,
it's merely a convention that your own palettes don't have to follow (and in
Touhou, they don't).
Nevertheless, there doesn't seem to be an alternative, and the Neko Project II
source code uses the same B/R/G/E convention, so I'll go with that as well.
Turns out we're not quite done with reduction yet, as there still are a bunch
of macros in master.h that #define PC-98-specific hardware constants and I/O
ports.
Also covering the two variations for blitting only every second row or
blitting only a 320x200 quarter, as seen in the endings.
So yeah, there's indeed nothing wrong with piread.cpp. TH03 just uses that
separate function that only blits every second row of an image, and indeed
always loads the entire image as it would appear in a PNG conversion. Here's
what happens if you display these images using the non-interlacing function:
https://www.dropbox.com/s/885krj09d9l0890/th03%20PI%20no%20interlace.png
With TH03 changing the calling convention for most of the code from __cdecl to
__pascal, I've been getting more and more confused about this myself. So,
let's settle on the following consistent syntax for function calls:
* C where the calling convention is actually __cdecl and where TASM's emitted
__cdecl code matches the original binary
* PASCAL where the calling convention is actually __pascal
* STDCALL where the calling convention is actually __cdecl, but where
the caller either defers stack cleanup (summing up the stack size of
multiple functions, then cleaning it all in a single "add sp" instruction)
or where the stack is cleared in a different way (e.g. "pop cx").
Unfortunately though, when using the ARG directive to automatically generate
an appropriate RET instruction for the given calling convention, TASM always
emits ENTER and LEAVE instructions even when no local variables are declared,
which greatly limits the number of functions where we can use that syntax. -.-
Note how it's only one *mode* in TH02/TH03, but two *modes* in TH04/TH05,
since you can't select between FM and Beep sound effect modes in TH02/TH03 (or
even disable sounds altogether). Might be a bit confusing, but it seemed
appropriate enough to distinguish the two functions.
Well, the naming.
Even though only TH02 actually uses MIDI (and thus, the MMD driver), every
game since then contains interrupt instructions for both functions. We could
just name it "pmd", since it seems like that's what came first - the AH
numbers of the 6 functions that make up MMD's interrupt API are identical to
those of the equivalent functions in PMD, even including gaps in the numbering
for PMD functions that don't have an equivalent in MIDI. However, except for
the FM sound effect handling and the key display in TH05's Music Room, these 6
functions are all the games actually use. Also, we already distinguish between
PMD and MMD in the driver check functions, and it might be confusing to only
imply PMD from now on?
So, "kaja" it is, collectively referring to the shared aspects of both
drivers.
For 32-bit immediate values, PUSH by itself is enough. For everything else,
PUSHD works in both TASM and JWasm.
Also, could it be...? Could we actually move to JWasm without breaking the
build in TASM at all?
... and then I end up copying modified versions into the individual game
subdirectories after all, because the changes between games were simply too
drastic. (That's also why I'm counting pfopen() itself twice.)
Only one slice left now, and then we're done with reduction!
Yup, packfiles finally proved that we really have a different set of changes
to master.lib in every game. Also, there are bound to be more of these game-
specific small changes to otherwise identical code in ZUN's own code.
And hey, no need to define that value in the build scripts anymore.
(I've also considered just copying modified versions into the individual game
subdirectories, but it's not too nice to expect people to diff them in order
to actually understand why these copies exist and where the changes actually
are.)
> introduce a new macro to halve the lines of a far function pointer
assignment, hoping that this commit will end up deleting more lines than it
adds, because TH03 has lots of those
> oh wait, these games mainly use near function pointers
> unearth even more new functions in the process
Seriously, how many more functions are still hidden in this codebase? And all
that just because IDA was not smart enough to begin with.
Since DSEG is *supposed* to be aligned to a paragraph boundary, this should
have never mattered. However, JWlink seems to not obey this segment alignment
in some layer of the linking process.
This now guarantees identical builds with TLINK, ALINK and JWlink.
So that's the - admittedly rather weird - solution to the problem that has
been plaguing this project ever since the beginning of the reduction step.
Without any 32-bit dummy segments in the compiled object files, more linkers
will be able to build this project, one of them being JWlink
(http://sourceforge.net/projects/jwlink/).
Still can't rename dseg to _DATA though, as TASM stupidly refuses to accept
any ALIGN directives above a segment's alignment attribute value. TH01's
floating-point data slices already require larger alignments, and we're very
likely to have even more of those in the future.
Also, we're finally defining the Borland C++ model symbols directly in the
code, rather than in my unpublished build batch files. :)