MAIN.EXE shares most of the code in this segment, but I can't remove it from
there right now due to the weird ordering of the data segments in that
executable…
And yes, once again, those three seemingly random type casts in here are
*necessary* to build a bit-perfect binary.
Small detour into MAINE.EXE because it has all the juicy algorithms that will
explain the remaining unknown members of the highscore data structure, and
there's this one code segment here we need to get out of the way first.
The same function appears unused in TH02's MAINE.EXE. Separate commit because
this was painful enough and we can link the C version into FUUIN.EXE right
now.
Oh, OK, so this is what the PC-98 GRCG is all about. You call grcg_setcolor(),
and that puts the PC-98 hardware in some sort of "monochromatic mode". Then,
you just write your pixels into any *single* one of the 4 VRAM bitplanes. This
causes the hardware to automatically write to *all* bitplanes in such a way
that the final palette index for each of the 8, 16, or 32 pixels you just wrote
a 1 value to will actually end up to match the color you set earlier.
Don't forget to call grcg_off() at the end though, or you can't draw any
non-monochromatic graphics, heh.
Yes, all of it. Including the bouncing polygons, of course. And since it's
placed at the end of ZUN's code inside the executable, the code's already
position-independent and fully hackable.
Don't really understand the other games yet because they start introducing
joystick support and TH03 has multiplayer and then there are these master.lib
modifications that don't really make any sense to me, especially when you add
that TH04 seemingly does not read js_stat *at all*, yet still works just fine
with a gamepad and... urgh.
Well, duh, of course, we *can* do this in order to allow decompilation to be
started at the end (not the beginning) of any segment. In fact, if we hadn't
done this, we would have had to start by moving _TEXT out to libraries....
Well, that became unbearable pretty quickly. Not sure whether I'm doing all
this Makefile business right, but this looks pretty nice.
It doesn't really help much at this point though because the 32-bit part is
still entirely separate and forces everything to rebuild all the time, but at
least it aborts on C compiler errors.
After spending a few hours on correctly decompiling ZUN's bulky custom text
renderer used in TH02 and TH03, it unfortunately turned out that TLINK doesn't
actually give us the fine-grained control over segment ordering we'd like to
have in a project like this, and that we can't slot code from one object file
in between segments from another object file. This means that yes, we really
have to decompile the functions in the order they appear in the executables,
starting on either end.
So, have a boring janitorial commit instead.
This took long enough, so we're not covering the COM files right now. Like, I
can't even tell how you're supposed to work around the forced word alignment
for the _TEXT segment. Guess we'll just have to decompile all of these in one
go, just like we did with ZUNSOFT.COM.
Also, it really seems as if we're merely trading one ugly workaround for
another in our quest for identical binaries.
Yup, we'll be linking against the original binary blob for the time being.
Don't worry though, we will (and in fact, have to) recompile the libraries
from source, separately for each game, as part of the build process in the
future, but we'll get to that once we've decompiled some of the non-TH01 code.
So yeah, that'll be our build environment - just plain batch files calling the
Borland command-line assembler, linker, and eventually C compiler. These are
the exact tools that ZUN used as well. There certainly are other assemblers,
compilers and linkers that could compile this code into 16-bit DOS
executables; Open Watcom is the only free one I know, and the master.lib
manual also mentions C compilers by Microsoft and Symantec. However, I favor
having one clear build path for a single toolchain that will, with the correct
command-line switches for each game, create builds that are bit-perfect to
ZUN's original ones over the possibility of cross-platform builds and the
maintenance nightmare they add.
So, Borland-only it is.
(Also, no Makefile, due to our messy build setup. I think I still prefer this
solution though, as we can have these really nice error messages that double
as build instructions without any dependencies on installed software.)
I kinda wanted to wait with this until I've brought REIIDEN.EXE down to at
least 65,536 lines as well, but that's not going to happen anytime soon, and
this split has annoyed me enough by now...
I've looked at every openly available piece of PC-98 documentation, and there
don't seem to be any official names for the individual planes. The closest
thing I could find was the description at
http://island.geocities.jp/cklouch/column/pc98bas/pc98disphw2.htm
explaining that they represent the blue, red, green, and brightness component
when using the default PC-98 palette. However, these planes correspond to
nothing else but the 4 individual bits of the final index into the color
palette, and you can assign any color to every single palette slot. Therefore,
it's merely a convention that your own palettes don't have to follow (and in
Touhou, they don't).
Nevertheless, there doesn't seem to be an alternative, and the Neko Project II
source code uses the same B/R/G/E convention, so I'll go with that as well.
Yup, the code for the first ZUN Soft logo is now completely position-
independent and ready to be decompiled.
(Also, TIL that the PC-98 GRCG has hardware support for double-buffering
through page flipping. Heh, at least one feature that makes it a viable system
for games...)
Turns out we're not quite done with reduction yet, as there still are a bunch
of macros in master.h that #define PC-98-specific hardware constants and I/O
ports.
Also covering the two variations for blitting only every second row or
blitting only a 320x200 quarter, as seen in the endings.
So yeah, there's indeed nothing wrong with piread.cpp. TH03 just uses that
separate function that only blits every second row of an image, and indeed
always loads the entire image as it would appear in a PNG conversion. Here's
what happens if you display these images using the non-interlacing function:
https://www.dropbox.com/s/885krj09d9l0890/th03%20PI%20no%20interlace.png
With TH03 changing the calling convention for most of the code from __cdecl to
__pascal, I've been getting more and more confused about this myself. So,
let's settle on the following consistent syntax for function calls:
* C where the calling convention is actually __cdecl and where TASM's emitted
__cdecl code matches the original binary
* PASCAL where the calling convention is actually __pascal
* STDCALL where the calling convention is actually __cdecl, but where
the caller either defers stack cleanup (summing up the stack size of
multiple functions, then cleaning it all in a single "add sp" instruction)
or where the stack is cleared in a different way (e.g. "pop cx").
Unfortunately though, when using the ARG directive to automatically generate
an appropriate RET instruction for the given calling convention, TASM always
emits ENTER and LEAVE instructions even when no local variables are declared,
which greatly limits the number of functions where we can use that syntax. -.-
AKA "the source of the infamous STOP message".
This is pretty much irreducible assembly code, so it may very well be that we
don't even touch this file ever again, but at least it completes our build.
This executable is embedded into all 4 versions of ZUN.COM. It was written by
KAJA, not ZUN, so we don't care about anything in there - not that it would
matter for porting anyway. We only need that binary to be able to create
bit-perfect rebuilds of ZUN.COM in the future.
Once again, TH05 demonstrates that it's not a mere copy of TH04 by introducing
another set of code changes. This time, the configuration structure is
initialized with the default values in this executable, not in OP.EXE.
The code doesn't give away the original filename in this game, so I'll follow
the pattern of naming these after the ID of the game's resident configuration
structure.
TH03 doesn't prepare the initial high score list (instead leaving that to
MAINL.EXE), and the config file creation is identical to the one in TH02.
2 functions, surrounded by 88.8% of library code. Way to go.
From what I can tell, this program does exactly three things:
• preparing the initial high score list
• writing default settings to HUUMA.CFG
• and allocating the game's resident configuration structure and writing its
segment address to bytes 6-7 of HUUMA.CFG
All that results in a COM file of 6.84 KiB, 83% of which is library code.
That's why C was once seen as a bloated high-level language as well.
Yep, we'll be needing some of those smaller executables embedded into ZUN.COM
after all in order to fully understand what's going on with things like that
persistent configuration structure used in each game, for example.
For now, I'll be keeping every one of these executables separately, for a
number of reasons:
• I can't get IDA to segment the code in a way that would reconstruct the
layout of the individual executables, since it unfortunately requires
segments to be aligned on paragraph boundaries...
• This, in turn, means that IDA can't apply FLIRT signatures, making
identification of the Borland C++ functions a bit harder. Probably not that
big of a deal at this point anymore, but still.
• There are bound to be multiple copies of Borland C++ and master.lib
functions in these. We are still using the "slice model", meaning that *all*
functions in an executable are part of the same namespace. Creating copies of
some source files just to allow a second instance of that function is not
too pretty.
• Lastly, we don't actually need to reproduce all executables. For example,
TH02's version of ZUNSOFT.COM is bit-identical to TH01's.
Hence, adding a separate build step to wrap these smaller executables back
into a bit-perfect version of ZUN.COM at a later point is a much better
option. (And it would be even better if we could track down the program used
to wrap those in the first place!)