Commit Graph

219 Commits

Author SHA1 Message Date
nmlgc 6c4852f789 [Position independence] False positives in master.lib GRCG function calls
Yup, function parameters that can clearly be identified as coordinates
are by far the fastest way to raise the calculated position
independence percentage. Kinda makes it sound like useless work, which
I'm only doing because it's dictated by some counting algorithm on a
website, but decompilation will want to un-hex all of these values
anyway. We're merely doing that right now, across all games.

Part of P0058, funded by -Tom-.
2019-11-14 00:51:48 +01:00
nmlgc f07089017f [Maintenance] Rename the extension of game-specific ASM includes to .inc
Rule of thumb going forward: Everything that emits data is .asm,
everything that doesn't is .inc.
(Let's hope that th01_reiiden_2.inc won't exist for that much longer!)

Part of P0032, funded by zorg.
2019-09-21 13:03:56 +02:00
nmlgc 35ef90f4d1 [Reduction] Page flipping
Funded by -Tom-.
2018-12-30 00:16:18 +01:00
nmlgc 7a309919aa [Reverse-engineering] [th02/maine] *Actually* identify all array references -.-
And already, the script begins to crumble, reminding me of what a
terrible idea it actually was. Like, if you did it for real, you'd get
so many false positives that the script stops being useful, since
every raw number above 0x90 (the size of the _DATA segment of the
Borland C++ DOS startup code) can potentially be a memory reference.

I do think that the script now covers the sweet spot between full-blown
emulation and shallow parsing though, so going to do at least a few
more files.
2017-01-05 23:54:17 +01:00
nmlgc 915f780e73 [Reverse-engineering] [th02/maine] Identify all remaining global arrays
Which is the last step on the way to completely position-independent
code, with no random hex numbers that should have been data pointers,
but weren't automatically turned into data pointers by IDA because
they're only ever addressed in the indirect fashion of

	mov bx, [bp-array_index]
	mov ax, [bx+0D00h] ; 0D00h is obviously an array of some sort

Removing all of these makes it practicable to add or delete code without
breaking the game in the process. Basic "modding", so to speak.

Automatically catching all possible cases where this happens actually
amounts to emulating the entire game, and *even then*, we're not
guaranteed that the *size* of the array just falls out as a byproduct
of this emulation and the tons of heuristics I would have thrown on top
of that. ZUN hates proper bounds checking and the correct size of each
array may simply never be implied anywhere.

So, rather than going through all that trouble of that (and hell, I
haven't even finished *parsing* this nasty MASM assembly format), and
since nothing really has happened in this project for almost two years,
I chose to just turn this into a text manipulation issue and figure out
the rest manually. Yeah, quick and dirty, and it probably won't scale if
I ever end up doing the same for PC-98 Policenauts, but it'd better work
at least for the rest of PC-98 Touhou.

Trying to do one of those per day from now on. Probably won't make it
due to the reverse-engineering effort required for the big main
executables of each game, but it'd sure be cool if I did.
2017-01-04 20:52:21 +01:00
nmlgc 58e1e142a5 [Maintenance] clip[bss].asm actually covers 16 bytes, not 8 -.- 2017-01-03 21:42:14 +01:00
nmlgc 43001161e3 [Maintenance] Fix any whitespace issues in our own code 2015-09-07 15:44:48 +02:00
nmlgc c5f53d9cf1 [Maintenance] Rename snd_kaja_func() to snd_kaja_interrupt()
Oh, right, these functions can have parameters. So, let's turn snd_kaja_func()
into a macro that combines the function number and the parameter into the AX
value for the driver.
2015-03-15 23:51:11 +01:00
nmlgc de491f225d [Maintenance] Move the sound driver function slices from hardware/ to snd/
And renaming them all to the short filenames they will be decompiled to for
consistency. These functions aren't really immediately hardware-related, as
we've established earlier in the decompilation.
2015-03-15 23:01:31 +01:00
nmlgc 92979e8f31 [C decompilation] [th02] Code segment #2 of all three executables
Only one code segment left in both OP and FUUIN! its-happening.gif

Yeah, that commit is way larger than I'm comfortable with, but none of these
functions is particularly large or difficult to decompile (with the exception
of graph_putsa_fx(), which I actually did weeks ago), and OP and MAIN have
their own unique functions in between the shared ones, so…
2015-03-14 23:25:50 +01:00
nmlgc a3ae0095f0 [C decompilation] [th02] PI display 2015-03-04 04:28:16 +01:00
nmlgc ed0437f80e [C decompilation] [th02] First set of sound driver calls 2015-03-04 02:47:22 +01:00
nmlgc a8384c925f [C decompilation] [th02/maine] HUUMA.CFG loading 2015-03-03 07:40:29 +01:00
nmlgc 87b1fb9e14 [C decompilation] [th02/maine] High score screen
MAIN.EXE shares most of the code in this segment, but I can't remove it from
there right now due to the weird ordering of the data segments in that
executable…

And yes, once again, those three seemingly random type casts in here are
*necessary* to build a bit-perfect binary.
2015-03-02 06:30:06 +01:00
nmlgc d058666929 [C decompilation] [th02/maine] Rotating rectangle animation
Small detour into MAINE.EXE because it has all the juicy algorithms that will
explain the remaining unknown members of the highscore data structure, and
there's this one code segment here we need to get out of the way first.
2015-02-28 22:37:40 +01:00
nmlgc a7235304ed Make the VRAM plane constants available to C 2015-02-24 22:16:31 +01:00
nmlgc ad9d6f97eb [Reverse-engineering] [th02] MIKOConfig structure 2015-02-23 23:48:03 +01:00
nmlgc 22332a71fa Make all sound functions and variables available to C 2015-02-23 18:28:38 +01:00
nmlgc 46eb3792cf Move frame_delay into the hardware/ subdirectory 2015-02-23 10:29:12 +01:00
nmlgc f0be7dadf4 [Reverse-engineering] [th02] Keyboard input
Don't really understand the other games yet because they start introducing
joystick support and TH03 has multiplayer and then there are these master.lib
modifications that don't really make any sense to me, especially when you add
that TH04 seemingly does not read js_stat *at all*, yet still works just fine
with a gamepad and... urgh.
2015-02-22 22:33:07 +01:00
nmlgc 6d8ff6b72e Make previously reduced ZUN functions available to C 2015-02-21 14:12:22 +01:00
nmlgc 145ecaaa54 Rename all code segments to names that Turbo C++ would generate
Well, duh, of course, we *can* do this in order to allow decompilation to be
started at the end (not the beginning) of any segment. In fact, if we hadn't
done this, we would have had to start by moving _TEXT out to libraries....
2015-02-21 12:47:24 +01:00
nmlgc c2a8c221f2 Let Turbo C++ link in the Borland C/C++ runtime for the main EXE files
This took long enough, so we're not covering the COM files right now. Like, I
can't even tell how you're supposed to work around the forced word alignment
for the _TEXT segment. Guess we'll just have to decompile all of these in one
go, just like we did with ZUNSOFT.COM.

Also, it really seems as if we're merely trading one ugly workaround for
another in our quest for identical binaries.
2015-02-19 10:22:00 +01:00
nmlgc 2d5d38426f Finally use standard segment names everywhere
And I guess we just have to ignore and disable that segment alignment warning
for TH01. It's not like this changes anything in the binary.
2015-02-18 14:04:43 +01:00
nmlgc 07519a7238 [Reverse-engineering] 32-bit VRAM plane pointers
I've looked at every openly available piece of PC-98 documentation, and there
don't seem to be any official names for the individual planes. The closest
thing I could find was the description at

	http://island.geocities.jp/cklouch/column/pc98bas/pc98disphw2.htm

explaining that they represent the blue, red, green, and brightness component
when using the default PC-98 palette. However, these planes correspond to
nothing else but the 4 individual bits of the final index into the color
palette, and you can assign any color to every single palette slot. Therefore,
it's merely a convention that your own palettes don't have to follow (and in
Touhou, they don't).

Nevertheless, there doesn't seem to be an alternative, and the Neko Project II
source code uses the same B/R/G/E convention, so I'll go with that as well.
2015-02-10 23:43:34 +01:00
nmlgc 44146c4749 [Reduction] GRCG modes 2015-01-12 22:48:13 +01:00
nmlgc f0ab47fd18 [Reduction] Hardware text colors and effects
Turns out we're not quite done with reduction yet, as there still are a bunch
of macros in master.h that #define PC-98-specific hardware constants and I/O
ports.
2014-12-20 22:36:38 +01:00
nmlgc a07e5fad42 [Reverse-engineering] Slot-based PI display
Also covering the two variations for blitting only every second row or
blitting only a 320x200 quarter, as seen in the endings.

So yeah, there's indeed nothing wrong with piread.cpp. TH03 just uses that
separate function that only blits every second row of an image, and indeed
always loads the entire image as it would appear in a PNG conversion. Here's
what happens if you display these images using the non-interlacing function:
https://www.dropbox.com/s/885krj09d9l0890/th03%20PI%20no%20interlace.png
2014-12-18 14:36:43 +01:00
nmlgc bead27b781 Use TASM calling convention syntax for previously identified ZUN functions
With TH03 changing the calling convention for most of the code from __cdecl to
__pascal, I've been getting more and more confused about this myself. So,
let's settle on the following consistent syntax for function calls:

* C where the calling convention is actually __cdecl and where TASM's emitted
  __cdecl code matches the original binary
* PASCAL where the calling convention is actually __pascal
* STDCALL where the calling convention is actually __cdecl, but where
  the caller either defers stack cleanup (summing up the stack size of
  multiple functions, then cleaning it all in a single "add sp" instruction)
  or where the stack is cleared in a different way (e.g. "pop cx").

Unfortunately though, when using the ARG directive to automatically generate
an appropriate RET instruction for the given calling convention, TASM always
emits ENTER and LEAVE instructions even when no local variables are declared,
which greatly limits the number of functions where we can use that syntax. -.-
2014-12-16 05:53:56 +01:00
nmlgc 46b2d67143 [Reverse-engineering] Music and sound effect loader 2014-11-30 00:18:40 +01:00
nmlgc 08db7d6392 [Reverse-engineering] Sound mode determination
Note how it's only one *mode* in TH02/TH03, but two *modes* in TH04/TH05,
since you can't select between FM and Beep sound effect modes in TH02/TH03 (or
even disable sounds altogether). Might be a bit confusing, but it seemed
appropriate enough to distinguish the two functions.
2014-11-29 00:56:26 +01:00
nmlgc 181d2920af [Reverse-engineering] Symbols for PMD and MMD API calls 2014-11-27 19:35:54 +01:00
nmlgc de25d6de3e [Reverse-engineering] PMD and MMD function call wrapper
Well, the naming.

Even though only TH02 actually uses MIDI (and thus, the MMD driver), every
game since then contains interrupt instructions for both functions. We could
just name it "pmd", since it seems like that's what came first - the AH
numbers of the 6 functions that make up MMD's interrupt API are identical to
those of the equivalent functions in PMD, even including gaps in the numbering
for PMD functions that don't have an equivalent in MIDI. However, except for
the FM sound effect handling and the key display in TH05's Music Room, these 6
functions are all the games actually use. Also, we already distinguish between
PMD and MMD in the driver check functions, and it might be confusing to only
imply PMD from now on?

So, "kaja" it is, collectively referring to the shared aspects of both
drivers.
2014-11-26 21:21:57 +01:00
nmlgc 98de0abfab [Reverse-engineering] Sound driver and hardware checks 2014-11-24 22:36:57 +01:00
nmlgc f40819b0e5 [Reverse-engineering] frame_delay 2014-11-23 22:32:26 +01:00
nmlgc 510a3a5070 [Reverse-engineering] pi_slot_palette_apply 2014-11-22 09:29:09 +01:00
nmlgc b532a96c7e [JWasm move] Avoid "push large"
For 32-bit immediate values, PUSH by itself is enough. For everything else,
PUSHD works in both TASM and JWasm.

Also, could it be...? Could we actually move to JWasm without breaking the
build in TASM at all?
2014-11-19 12:09:22 +01:00
nmlgc f54b85577d [Reverse-engineering] Slot-based PI file loading and freeing 2014-11-18 17:56:13 +01:00
nmlgc b4361e8487 [Reduction] #700-704: pfopen
... and then I end up copying modified versions into the individual game
subdirectories after all, because the changes between games were simply too
drastic. (That's also why I'm counting pfopen() itself twice.)

Only one slice left now, and then we're done with reduction!
2014-11-17 04:54:40 +01:00
nmlgc 62d4593842 [Reduction] #697-699: Packfile interrupt hooking 2014-11-16 04:08:46 +01:00
nmlgc f303222ffc Replace MASTERMOD with a per-game constant
Yup, packfiles finally proved that we really have a different set of changes
to master.lib in every game. Also, there are bound to be more of these game-
specific small changes to otherwise identical code in ZUN's own code.

And hey, no need to define that value in the build scripts anymore.

(I've also considered just copying modified versions into the individual game
subdirectories, but it's not too nice to expect people to diff them in order
to actually understand why these copies exist and where the changes actually
are.)
2014-11-15 02:03:41 +01:00
nmlgc 13b10ef589 [Reduction] #683: access (the one that *actually* has no underscore) 2014-11-09 11:58:33 +01:00
nmlgc 3a1c2fd679 Move the stack segment into its own slice
Saves 141 lines, and we'll need to ASSUME it in the upcoming floating-point
slices.
2014-11-02 19:44:02 +01:00
nmlgc 4ac17ac2a5 Trick TASM into not creating 32-bit default segments
So that's the - admittedly rather weird - solution to the problem that has
been plaguing this project ever since the beginning of the reduction step.
Without any 32-bit dummy segments in the compiled object files, more linkers
will be able to build this project, one of them being JWlink
(http://sourceforge.net/projects/jwlink/).

Still can't rename dseg to _DATA though, as TASM stupidly refuses to accept
any ALIGN directives above a segment's alignment attribute value. TH01's
floating-point data slices already require larger alignments, and we're very
likely to have even more of those in the future.

Also, we're finally defining the Borland C++ model symbols directly in the
code, rather than in my unpublished build batch files. :)
2014-10-31 08:17:54 +01:00
nmlgc 696d7f9476 Identify the missing BSS slice of xxv.cpp
sigdata.c doesn't specify any alignment, so this is the only position that
makes sense.
2014-10-29 05:41:43 +01:00
nmlgc 340c8a792a General cleanup
Mostly moving spurious null bytes, which are actually supposed to denote
alignment, into their associated slices, but also prettying up some of the
very first slices.
2014-10-20 17:20:04 +02:00
nmlgc 1c72d7e242 [Reduction] #548: Floating-point emulation data
Well, we have to start reducing this mess somewhere. The actual reduced
initialization code I've been preparing still fails to compile, and the data
is shared with a number of other components anyway, so...
2014-10-19 23:37:46 +02:00
nmlgc 658ed9e72b Move "Abnormal program termination" to its own slice
That was the very first function reduced, before I came up with the data slice
model in 59688e23fc.
2014-10-12 18:37:58 +02:00
nmlgc 4625339af1 Identify all remaining nopcalls 2014-10-07 06:32:20 +02:00
nmlgc eace57b1a2 Wrap all code segments into their own group
Necessary to keep the original segment ordering with ALINK, our new linker.
2014-09-22 22:19:29 +02:00