Also something that any future ReC project should be doing right at the
start. Finally, it made sense to do it here as well, because…
Part of P0084, funded by Yanga.
And with all possible .COM executables decompiled, this set of changes
reaches an acceptable scope, allowing us to *finally*…
Part of P0077, funded by Splashman and -Tom-.
The supposedly low-hanging fruit that almost every outside contributor
wanted to grab a lot earlier, but (of course) always just for a single
game… Comprehensively covering all of them has only started to make
sense recently 😛
Also, yes, the variable with the uppercase .CFG filename has itself a
lowercase name and vice versa…
Part of P0077, funded by Splashman and -Tom-.
The TH04/TH05 BGM/SE mode setup is a good example for code where
different structure field offsets will vanish completely upon reverse-
engineering. If we continued to use the per-game ID string as the
variable name, we'd only have another game-specific "difference" there.
Part of P0065, funded by Touhou Patch Center.
WTF. Fine, let's have separate, micro-optimized ASM implementations for
decoding and encoding inr MAIN.EXE, but still, all those minute
difference between OP.EXE and MAINE.EXE…
This is as far as anyone should reasonably go before decompilation;
things will get really ugly with the loading functions once the file
name is involved as well…
Completes P0063, funded by -Tom-.
Funny how the actual scores are stored as little-endian gaiji strings
in the bold font, yet never actually used as such.
Part of P0063, funded by -Tom-.
Waiting with the `fx` parameter in TH01's calls for the decompilation
of this game's version of this function…
Part of P0062, funded by Touhou Patch Center.
As used for the title screen fade-in effect. Another function that
apparently was deliberately written to run not that fast, by blitting
each row individually to the 400th VRAM row just so that it can then
turn on the EGC, perform the *actual* masked blit to the VRAM
destination, and then turn the EGC off before moving to the next now.
The same effect could have entirely been accomplished by copying
graph_pack_put_8() and applying the mask there; it's not like ZUN
didn't know how to modify master.lib…
(See also 44ad3eb4) --Nmlgc
"bgimage" really was the best name I could come up with here, given
what it's used for. Everything else sounded way too ambiguous. Even
something like "mempage", in the meaning of "a third memory-backed
graphics page"… --Nmlgc
The pascal calling convention for TH03's input mode functions actually
sort of matters, since we have this nice function pointer type that
expects pascal.
Not applying this leak to TH03 since it would have more than one
`key_det` variable, resulting in names that are as much fanfiction as
the current ones…
Raw, uninteresting position independence work. Or maybe not, given that
this was one of the few things that also apply to TH01, and reveal just
how chaotically this game was coded. And so we've got three ways that
ZUN stored regular 2D points: Regularly (X first, Y second), Y first
and X second, and multiple points stored in a structure of arrays…
Completes P0059, funded by [Anonymous] and -Tom-.
Yup, function parameters that can clearly be identified as coordinates
are by far the fastest way to raise the calculated position
independence percentage. Kinda makes it sound like useless work, which
I'm only doing because it's dictated by some counting algorithm on a
website, but decompilation will want to un-hex all of these values
anyway. We're merely doing that right now, across all games.
Part of P0058, funded by -Tom-.
Rule of thumb going forward: Everything that emits data is .asm,
everything that doesn't is .inc.
(Let's hope that th01_reiiden_2.inc won't exist for that much longer!)
Part of P0032, funded by zorg.
No leading underscore for functions with Pascal calling convention, but
we do have one for all variables, because it's not worth it to put
keywords in front of everything for no reason.
Seemed to have forgotten this rule in 2017?
Part of P0030, funded by zorg.
…unlike these two identical functions, which hopefully become clearer
after more reverse-engineering has been done around them. It's not like
they're directly related to CDG slot data anyway, they just apply the
TH05 version of the dissolution effect in the staff roll on top of the
already displayed image.
And that concludes all CDG/CD2-related code!
Funded by DTM.
With TH05 definitely being the Galaxy Brain version of this function.
You'll see once I get to push the C decompilation for that one…
Anyway, that covers all shared input functions of TH02-TH05!
Funded by zorg.
In which ZUN actively refuses help from master.lib and rips out
everything that isn't immediately related to reading the one joystick
port of the PC-9801-86 sound board.
Maybe there actually was a good reason for that?
Funded by -Tom-.
Look at that TH05 vector2_at_opt function. What the hell, the caller is
supposed to set up the stack frame for the function? How do you even get
a compiler to do this (and no, I haven't found a compiler switch)? No
way around writing a separate "optimizer" as part of the compilation
pipeline, it seems.
OK, let's not identify the arrays in a file-based fashion just yet, and
first reduce all shared ZUN code that uses arrays. Less stressful, we'll
have to do this anyway, and I just can't resist the urge to immediately
reverse-engineer everything I find.
Oh, right, these functions can have parameters. So, let's turn snd_kaja_func()
into a macro that combines the function number and the parameter into the AX
value for the driver.
And renaming them all to the short filenames they will be decompiled to for
consistency. These functions aren't really immediately hardware-related, as
we've established earlier in the decompilation.
Well, duh, of course, we *can* do this in order to allow decompilation to be
started at the end (not the beginning) of any segment. In fact, if we hadn't
done this, we would have had to start by moving _TEXT out to libraries....
After spending a few hours on correctly decompiling ZUN's bulky custom text
renderer used in TH02 and TH03, it unfortunately turned out that TLINK doesn't
actually give us the fine-grained control over segment ordering we'd like to
have in a project like this, and that we can't slot code from one object file
in between segments from another object file. This means that yes, we really
have to decompile the functions in the order they appear in the executables,
starting on either end.
So, have a boring janitorial commit instead.
This took long enough, so we're not covering the COM files right now. Like, I
can't even tell how you're supposed to work around the forced word alignment
for the _TEXT segment. Guess we'll just have to decompile all of these in one
go, just like we did with ZUNSOFT.COM.
Also, it really seems as if we're merely trading one ugly workaround for
another in our quest for identical binaries.
I've looked at every openly available piece of PC-98 documentation, and there
don't seem to be any official names for the individual planes. The closest
thing I could find was the description at
http://island.geocities.jp/cklouch/column/pc98bas/pc98disphw2.htm
explaining that they represent the blue, red, green, and brightness component
when using the default PC-98 palette. However, these planes correspond to
nothing else but the 4 individual bits of the final index into the color
palette, and you can assign any color to every single palette slot. Therefore,
it's merely a convention that your own palettes don't have to follow (and in
Touhou, they don't).
Nevertheless, there doesn't seem to be an alternative, and the Neko Project II
source code uses the same B/R/G/E convention, so I'll go with that as well.
Turns out we're not quite done with reduction yet, as there still are a bunch
of macros in master.h that #define PC-98-specific hardware constants and I/O
ports.
Also covering the two variations for blitting only every second row or
blitting only a 320x200 quarter, as seen in the endings.
So yeah, there's indeed nothing wrong with piread.cpp. TH03 just uses that
separate function that only blits every second row of an image, and indeed
always loads the entire image as it would appear in a PNG conversion. Here's
what happens if you display these images using the non-interlacing function:
https://www.dropbox.com/s/885krj09d9l0890/th03%20PI%20no%20interlace.png
With TH03 changing the calling convention for most of the code from __cdecl to
__pascal, I've been getting more and more confused about this myself. So,
let's settle on the following consistent syntax for function calls:
* C where the calling convention is actually __cdecl and where TASM's emitted
__cdecl code matches the original binary
* PASCAL where the calling convention is actually __pascal
* STDCALL where the calling convention is actually __cdecl, but where
the caller either defers stack cleanup (summing up the stack size of
multiple functions, then cleaning it all in a single "add sp" instruction)
or where the stack is cleared in a different way (e.g. "pop cx").
Unfortunately though, when using the ARG directive to automatically generate
an appropriate RET instruction for the given calling convention, TASM always
emits ENTER and LEAVE instructions even when no local variables are declared,
which greatly limits the number of functions where we can use that syntax. -.-
Note how it's only one *mode* in TH02/TH03, but two *modes* in TH04/TH05,
since you can't select between FM and Beep sound effect modes in TH02/TH03 (or
even disable sounds altogether). Might be a bit confusing, but it seemed
appropriate enough to distinguish the two functions.
Well, the naming.
Even though only TH02 actually uses MIDI (and thus, the MMD driver), every
game since then contains interrupt instructions for both functions. We could
just name it "pmd", since it seems like that's what came first - the AH
numbers of the 6 functions that make up MMD's interrupt API are identical to
those of the equivalent functions in PMD, even including gaps in the numbering
for PMD functions that don't have an equivalent in MIDI. However, except for
the FM sound effect handling and the key display in TH05's Music Room, these 6
functions are all the games actually use. Also, we already distinguish between
PMD and MMD in the driver check functions, and it might be confusing to only
imply PMD from now on?
So, "kaja" it is, collectively referring to the shared aspects of both
drivers.
For 32-bit immediate values, PUSH by itself is enough. For everything else,
PUSHD works in both TASM and JWasm.
Also, could it be...? Could we actually move to JWasm without breaking the
build in TASM at all?
... and then I end up copying modified versions into the individual game
subdirectories after all, because the changes between games were simply too
drastic. (That's also why I'm counting pfopen() itself twice.)
Only one slice left now, and then we're done with reduction!
Yup, packfiles finally proved that we really have a different set of changes
to master.lib in every game. Also, there are bound to be more of these game-
specific small changes to otherwise identical code in ZUN's own code.
And hey, no need to define that value in the build scripts anymore.
(I've also considered just copying modified versions into the individual game
subdirectories, but it's not too nice to expect people to diff them in order
to actually understand why these copies exist and where the changes actually
are.)
So that's the - admittedly rather weird - solution to the problem that has
been plaguing this project ever since the beginning of the reduction step.
Without any 32-bit dummy segments in the compiled object files, more linkers
will be able to build this project, one of them being JWlink
(http://sourceforge.net/projects/jwlink/).
Still can't rename dseg to _DATA though, as TASM stupidly refuses to accept
any ALIGN directives above a segment's alignment attribute value. TH01's
floating-point data slices already require larger alignments, and we're very
likely to have even more of those in the future.
Also, we're finally defining the Borland C++ model symbols directly in the
code, rather than in my unpublished build batch files. :)
Mostly moving spurious null bytes, which are actually supposed to denote
alignment, into their associated slices, but also prettying up some of the
very first slices.
Well, we have to start reducing this mess somewhere. The actual reduced
initialization code I've been preparing still fails to compile, and the data
is shared with a number of other components anyway, so...
Especially annoying if that happens in the middle of a Shift-JIS multi-byte
sequence, like in those two instances in TH02's OP.EXE. Also, making up for
the lack of string analysis during the dumping process of TH05's MAIN.EXE.
Which challenges a lot about what we thought to know about Amusement Makers'
modifications to master.lib, due to the fact that TH02 contains the modified
version of this function, but the original of draw_trapezoid...
And I haven't even begun to research how this removal of conditional branches
could have a positive effect on the game, especially since it's only called
before exiting anyway.
OK, *that's* the last piece of C++ crud shared across all main executables.
According to the object in the library file though, it seems to include one
more dword named
__DestructorCountPtr
in the BSS segment. Neither games nor the runtime itself seem to use it, and
as a consequence, it doesn't even seem to be included in the games' BSS
segments, given that they all end with the symbols of xx.cpp...
Neither is this one. Also, interesting how IDA didn't identify the function in
one third of the cases.
[Binary change] Order of 2 relocations in TH03's MAINL.EXE, TH04's MAIN.EXE
and MAINE.EXE, and TH05's MAINE.EXE.
OK, looks like we got all of the C++ crap out of the way... e~xcept for
another function in TH01's REIIDEN.EXE, of course.
[Binary change] Order of 2 relocations in TH01's FUUIN.EXE.
God, this C++ stuff really is a crappy mess. Even had to manually adjust the
alignments at the end of the the TEXTC segment - and no, the ALIGN directive
remains an inadequate tool random bytes, even more so because TASM's
implementation just pads the space with random bytes. But hey, nice to finally
see some reduction outside of seg000.
[Binary change]
* Order of 3 relocations in all of TH04 and TH05's OP.EXE
* Order of 6 relocations in TH03's OP.EXE and MAIN.EXE, and TH05's MAIN.EXE
and MAINE.EXE
* Order of 9 relocations in all of TH01, TH02's OP.EXE and MAINE.EXE, and
TH03's MAINL.EXE
* Order of 11 relocations in TH02's MAINE.EXE
[Binary change]
* Order of 2 relocations in all executables of TH02, TH03, TH04 and TH05
* Order of 4 relocations in TH01's FUUIN.EXE
* Inserts a new relocation into TH01's REIIDEN.EXE
Yup. 50 functions in a single module, totalling 12,633 bytes, used in all 15
game executables, and no references to any of that in the remaining game code.
[Binary change]
* Order of 3 relocations in all of THO3, TH04 and TH05, TH02's MAIN.EXE and
MAINE.EXE, and TH01's OP.EXE and FUUIN.EXE
* Order of 2 relocations in TH02's OP.EXE and TH01's REIIDEN.EXE
* Inserts a new relocation into TH03's MAIN.EXE