Commit Graph

713 Commits

Author SHA1 Message Date
nmlgc 436f1c5722 [C decompilation] [th01] MDRV2 calls
Still missing mdrv2_resident() though, which we currently can't slot in there
due to that string constant constructor syntax. :/
2015-02-21 20:48:58 +01:00
nmlgc ed8d0e28f5 [C decompilation] [th02/op] Title screen flashing animation 2015-02-21 14:16:27 +01:00
nmlgc 6d8ff6b72e Make previously reduced ZUN functions available to C 2015-02-21 14:12:22 +01:00
nmlgc 145ecaaa54 Rename all code segments to names that Turbo C++ would generate
Well, duh, of course, we *can* do this in order to allow decompilation to be
started at the end (not the beginning) of any segment. In fact, if we hadn't
done this, we would have had to start by moving _TEXT out to libraries....
2015-02-21 12:47:24 +01:00
nmlgc 7836363019 Use a Makefile for the 16-bit part of the build process
Well, that became unbearable pretty quickly. Not sure whether I'm doing all
this Makefile business right, but this looks pretty nice.

It doesn't really help much at this point though because the 32-bit part is
still entirely separate and forces everything to rebuild all the time, but at
least it aborts on C compiler errors.
2015-02-21 11:28:56 +01:00
nmlgc ffd8bb9013 Clean up the last remaining misanalyzed procedure boundaries
After spending a few hours on correctly decompiling ZUN's bulky custom text
renderer used in TH02 and TH03, it unfortunately turned out that TLINK doesn't
actually give us the fine-grained control over segment ordering we'd like to
have in a project like this, and that we can't slot code from one object file
in between segments from another object file. This means that yes, we really
have to decompile the functions in the order they appear in the executables,
starting on either end.

So, have a boring janitorial commit instead.
2015-02-20 22:44:09 +01:00
nmlgc c2a8c221f2 Let Turbo C++ link in the Borland C/C++ runtime for the main EXE files
This took long enough, so we're not covering the COM files right now. Like, I
can't even tell how you're supposed to work around the forced word alignment
for the _TEXT segment. Guess we'll just have to decompile all of these in one
go, just like we did with ZUNSOFT.COM.

Also, it really seems as if we're merely trading one ugly workaround for
another in our quest for identical binaries.
2015-02-19 10:22:00 +01:00
nmlgc 2d5d38426f Finally use standard segment names everywhere
And I guess we just have to ignore and disable that segment alignment warning
for TH01. It's not like this changes anything in the binary.
2015-02-18 14:04:43 +01:00
nmlgc f861b0a5c3 [C decompilation] ZUNSOFT.COM, all of it
And, of course, it recompiles into the exact binary ZUN shipped in 1997.
Success! This project is so going to happen now.
2015-02-17 13:18:14 +01:00
nmlgc cc219ff2b4 Add MASTERS.LIB and MASTER.H from the original distribution
Yup, we'll be linking against the original binary blob for the time being.
Don't worry though, we will (and in fact, have to) recompile the libraries
from source, separately for each game, as part of the build process in the
future, but we'll get to that once we've decompiled some of the non-TH01 code.
2015-02-16 23:10:47 +01:00
nmlgc ff94dce594 Add build batch files and some documentation about the build process
So yeah, that'll be our build environment - just plain batch files calling the
Borland command-line assembler, linker, and eventually C compiler. These are
the exact tools that ZUN used as well. There certainly are other assemblers,
compilers and linkers that could compile this code into 16-bit DOS
executables; Open Watcom is the only free one I know of, and the master.lib
manual also mentions C compilers by Microsoft and Symantec. However, I favor
having one clear build path for a single toolchain that will, with the correct
command-line switches for each game, create builds that are bit-perfect to
ZUN's original ones over the possibility of cross-platform builds and the
maintenance nightmare they add.
So, Borland-only it is.

(Also, no Makefile, due to our messy build setup. I think I still prefer this
solution though, as we can have these really nice error messages that double
as build instructions without any dependencies on installed software.)
2015-02-15 23:32:32 +01:00
nmlgc 5268241a06 Merge the second halves of TH04's and TH05's MAIN.EXE back into the main files
I kinda wanted to hold off on this until I'd brought REIIDEN.EXE down to at
least 65,536 lines as well, but that's not going to happen anytime soon, and
this split has annoyed me enough by now...
2015-02-14 18:58:03 +01:00
nmlgc 2cac434455 [Reverse-engineering] Sound effect playback 2015-02-13 12:56:51 +01:00
nmlgc cc1a2987c4 Forgot the TH05 VRAM planes data file -.- 2015-02-12 21:33:22 +01:00
nmlgc 07519a7238 [Reverse-engineering] 32-bit VRAM plane pointers
I've looked at every openly available piece of PC-98 documentation, and there
don't seem to be any official names for the individual planes. The closest
thing I could find was the description at

	http://island.geocities.jp/cklouch/column/pc98bas/pc98disphw2.htm

explaining that they represent the blue, red, green, and brightness components
when using the default PC-98 palette. However, these planes correspond to
nothing else but the 4 individual bits of the final index into the color
palette, and you can assign any color to every single palette slot. Therefore,
it's merely a convention that your own palettes don't have to follow (and in
Touhou, they don't).

Nevertheless, there doesn't seem to be an alternative, and the Neko Project II
source code uses the same B/R/G/E convention, so I'll go with that as well.
2015-02-10 23:43:34 +01:00
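
To illustrate the point that the four planes are nothing more than the four bits of a palette index, here is a minimal C sketch. It is not taken from the repository: the function name, the 640-pixel row width, the bit assignment (B = bit 0 up to E = bit 3) and the MSB-first pixel order are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Plane order follows the B/R/G/E convention from the commit message.
 * Mapping B to bit 0 and E to bit 3 is an assumption of this sketch. */
enum { PLANE_B, PLANE_R, PLANE_G, PLANE_E, PLANE_COUNT };

/* Assumed layout: 640 pixels per row, 1 bit per pixel in each plane. */
enum { ROW_PIXELS = 640, ROW_BYTES = ROW_PIXELS / 8 };

/* Returns the 4-bit palette index of the pixel at (x, y) by collecting
 * one bit from each of the four planes. */
static unsigned palette_index_at(
    const uint8_t *const planes[PLANE_COUNT], unsigned x, unsigned y
)
{
    assert(x < ROW_PIXELS);
    unsigned offset = (y * ROW_BYTES) + (x / 8);
    unsigned mask = 0x80u >> (x % 8); /* assumed: leftmost pixel in the MSB */
    unsigned index = 0;
    for (int p = 0; p < PLANE_COUNT; p++) {
        if (planes[p][offset] & mask) {
            index |= 1u << p;
        }
    }
    return index;
}
```

The resulting index only selects a palette slot. Which color that slot holds is up to the palette, which is why the B/R/G/E names are merely a convention.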
nmlgc 60f6ecec84 [Reverse-engineering] [th01/zunsoft] Identify all global variables
Yup, the code for the first ZUN Soft logo is now completely position-
independent and ready to be decompiled.

(Also, TIL that the PC-98 GRCG has hardware support for double-buffering
through page flipping. Heh, at least one feature that makes it a viable system
for games...)
2015-01-13 18:10:24 +01:00
nmlgc 44146c4749 [Reduction] GRCG modes 2015-01-12 22:48:13 +01:00
nmlgc 0b89233e48 [Reverse-engineering] Music Room comment loading 2014-12-24 21:39:34 +01:00
nmlgc f0ab47fd18 [Reduction] Hardware text colors and effects
Turns out we're not quite done with reduction yet, as there still are a bunch
of macros in master.h that #define PC-98-specific hardware constants and I/O
ports.
2014-12-20 22:36:38 +01:00
nmlgc 04ab24d669 [th01] Undo the floating-point hacks 2014-12-19 06:11:42 +01:00
nmlgc a07e5fad42 [Reverse-engineering] Slot-based PI display
Also covering the two variations for blitting only every second row or
blitting only a 320x200 quarter, as seen in the endings.

So yeah, there's indeed nothing wrong with piread.cpp. TH03 just uses that
separate function that only blits every second row of an image, and indeed
always loads the entire image as it would appear in a PNG conversion. Here's
what happens if you display these images using the non-interlacing function:
https://www.dropbox.com/s/885krj09d9l0890/th03%20PI%20no%20interlace.png
2014-12-18 14:36:43 +01:00
nmlgc 721aa18de8 [Reduction] #709: graph_pack_put_8_noclip
Yeah, it's really just a copy of that function with 3 instructions deleted.
2014-12-17 13:04:21 +01:00
nmlgc bead27b781 Use TASM calling convention syntax for previously identified ZUN functions
With TH03 changing the calling convention for most of the code from __cdecl to
__pascal, I've been getting more and more confused about this myself. So,
let's settle on the following consistent syntax for function calls:

* C where the calling convention is actually __cdecl and where TASM's emitted
  __cdecl code matches the original binary
* PASCAL where the calling convention is actually __pascal
* STDCALL where the calling convention is actually __cdecl, but where
  the caller either defers stack cleanup (summing up the stack size of
  multiple functions, then cleaning it all in a single "add sp" instruction)
  or where the stack is cleared in a different way (e.g. "pop cx").

Unfortunately though, when using the ARG directive to automatically generate
an appropriate RET instruction for the given calling convention, TASM always
emits ENTER and LEAVE instructions even when no local variables are declared,
which greatly limits the number of functions where we can use that syntax. -.-
2014-12-16 05:53:56 +01:00
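
The difference between the three cases above comes down to who restores the stack pointer, and when. The following C sketch models only that aspect. The stack_model_t type and all function names are hypothetical, every argument is assumed to be one 16-bit word, and the return address pushed by CALL and popped by RET is left out because it cancels out.

```c
#include <assert.h>

enum { WORD_SIZE = 2 }; /* 16-bit stack slots */

/* Hypothetical, minimal model of a 16-bit stack pointer. */
typedef struct {
    unsigned sp;
} stack_model_t;

/* Caller side: pushing `argc` word-sized arguments lowers SP. */
static void push_args(stack_model_t *s, unsigned argc)
{
    assert(s->sp >= argc * WORD_SIZE);
    s->sp -= argc * WORD_SIZE;
}

/* __pascal: the callee removes its own arguments via "ret n". */
static void call_pascal(stack_model_t *s, unsigned argc)
{
    push_args(s, argc);
    s->sp += argc * WORD_SIZE; /* performed by the callee's RET n */
}

/* __cdecl: the callee returns with a plain RET, so the arguments stay on
 * the stack. Returns the number of bytes the caller still has to remove. */
static unsigned call_cdecl(stack_model_t *s, unsigned argc)
{
    push_args(s, argc);
    return argc * WORD_SIZE;
}

/* Caller-side cleanup, equivalent to a single "add sp, bytes". */
static void add_sp(stack_model_t *s, unsigned bytes)
{
    s->sp += bytes;
}
```

In this model, the deferred cleanup described for STDCALL corresponds to summing the return values of several call_cdecl() calls and passing the total to one add_sp() call.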
nmlgc 3b497286a5 [th03/zunsp] Initial state
This one contains 4 instructions (on lines 462, 528, 534 and 1309,
respectively) for which TASM chooses different opcodes than the original
compiler.
2014-12-13 23:58:08 +01:00
nmlgc fe90349a53 [th02/zuninit] Initial state
AKA "the source of the infamous STOP message".

This is pretty much irreducible assembly code, so it may very well be that we
don't even touch this file ever again, but at least it completes our build.
2014-12-11 19:48:34 +01:00
nmlgc 1be1cebc21 Update the Readme file to match the recent ZUN.COM developments
No point in having lame excuses for not dumping it when we've actually started
doing so.
2014-12-09 22:38:34 +01:00
nmlgc 2851587997 Add the FM hardware checker tool (ONGCHK.COM) from the PMD driver
This executable is embedded into all 4 versions of ZUN.COM. It was written by
KAJA, not ZUN, so we don't care about anything in there - not that it would
matter for porting anyway. We only need that binary to be able to create
bit-perfect rebuilds of ZUN.COM in the future.
2014-12-09 21:15:55 +01:00
nmlgc f541d98be8 [th05/res_kso] Reduce all known library functions
Once again, TH05 demonstrates that it's not a mere copy of TH04 by introducing
another set of code changes. This time, the configuration structure is
initialized with the default values in this executable, not in OP.EXE.
2014-12-08 19:08:05 +01:00
nmlgc 1e554af1b5 [th05/res_kso] Initial state
The code doesn't give away the original filename in this game, so I'll follow
the pattern of naming these after the ID of the game's resident configuration
structure.
2014-12-08 18:34:01 +01:00
nmlgc df4c07b728 [th04/res_huma] Reduce all known library functions
Different API for file handling, but the ratio of ZUN code to library code is
still exactly the same as in TH03.
2014-12-07 22:33:54 +01:00
nmlgc e7947192b5 [th04/res_huma] Initial state
Yep, in TH04 and TH05, ZUN.COM is a COM wrapper inside a COM wrapper.
And yep, "HUMA", despite this being TH04?
2014-12-07 21:47:09 +01:00
nmlgc 5ecc65f43f [th03/res_yume] Reduce all known library functions
TH03 doesn't prepare the initial high score list (instead leaving that to
MAINL.EXE), and the config file creation is identical to the one in TH02.
2 functions, surrounded by 88.8% of library code. Way to go.
2014-12-06 18:53:30 +01:00
nmlgc 511739861a [th03/res_yume] Initial state
AKA "ZUN -5".
2014-12-06 17:10:25 +01:00
nmlgc bfa3829003 [Reduction] #705-708: Remaining third-party functions in TH02's ZUN_RES.COM 2014-12-01 05:16:41 +01:00
nmlgc 3f1c1eba6d [th02/zun_res] Reduce all known library functions
From what I can tell, this program does exactly three things:
• preparing the initial high score list
• writing default settings to HUUMA.CFG
• and allocating the game's resident configuration structure and writing its
  segment address to bytes 6-7 of HUUMA.CFG

All that results in a COM file of 6.84 KiB, 83% of which is library code.
That's why C was once seen as a bloated high-level language as well.
2014-12-01 03:25:30 +01:00
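
As a rough illustration of the third step, this C sketch stores a 16-bit segment address in bytes 6-7 of an in-memory copy of the config file. Only the offset comes from the commit message. The function names, the buffer handling and the little-endian byte order (the natural one on x86) are assumptions, and this is not ZUN's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset taken from the commit message: bytes 6-7 of HUUMA.CFG. */
enum { CFG_RESIDENT_SEG_OFFSET = 6 };

/* Stores the segment address of the resident structure in the config
 * buffer, low byte first (assumed little-endian, as is usual on x86). */
static void cfg_set_resident_seg(uint8_t *cfg, size_t cfg_size, uint16_t seg)
{
    assert(cfg_size >= (size_t)CFG_RESIDENT_SEG_OFFSET + 2);
    cfg[CFG_RESIDENT_SEG_OFFSET + 0] = (uint8_t)(seg & 0xFF);
    cfg[CFG_RESIDENT_SEG_OFFSET + 1] = (uint8_t)(seg >> 8);
}

/* Reads the segment address back, as a later executable might do. */
static uint16_t cfg_get_resident_seg(const uint8_t *cfg, size_t cfg_size)
{
    assert(cfg_size >= (size_t)CFG_RESIDENT_SEG_OFFSET + 2);
    return (uint16_t)(
        cfg[CFG_RESIDENT_SEG_OFFSET + 0] |
        (cfg[CFG_RESIDENT_SEG_OFFSET + 1] << 8)
    );
}
```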
nmlgc 7d81d76f6a [th02/zun_res] Initial state
Yep, we'll be needing some of those smaller executables embedded into ZUN.COM
after all in order to fully understand what's going on with things like that
persistent configuration structure used in each game, for example.

For now, I'll be keeping every one of these executables separately, for a
number of reasons:
• I can't get IDA to segment the code in a way that would reconstruct the
  layout of the individual executables, since it unfortunately requires
  segments to be aligned on paragraph boundaries...
• This, in turn, means that IDA can't apply FLIRT signatures, making
  identification of the Borland C++ functions a bit harder. Probably not that
  big of a deal at this point anymore, but still.
• There are bound to be multiple copies of Borland C++ and master.lib
  functions in these. We are still using the "slice model", meaning that *all*
  functions in an executable are part of the same namespace. Creating copies of
  some source files just to allow a second instance of that function is not
  too pretty.
• Lastly, we don't actually need to reproduce all executables. For example,
  TH02's version of ZUNSOFT.COM is bit-identical to TH01's.

Hence, adding a separate build step to wrap these smaller executables back
into a bit-perfect version of ZUN.COM at a later point is a much better
option. (And it would be even better if we could track down the program used
to wrap those in the first place!)
2014-12-01 00:46:10 +01:00
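
For context on the paragraph alignment mentioned in the first bullet point: in x86 real mode, a segment value addresses memory in units of 16 bytes (one paragraph), so only addresses divisible by 16 can be the start of a segment. A small C sketch of that arithmetic follows, with hypothetical function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { PARAGRAPH_SIZE = 16 };

/* Real-mode address translation: segment * 16 + offset. */
static uint32_t linear_address(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg * PARAGRAPH_SIZE) + off;
}

/* True if the linear address lies on a paragraph boundary. */
static bool is_paragraph_aligned(uint32_t linear)
{
    return (linear % PARAGRAPH_SIZE) == 0;
}

/* Converts a paragraph-aligned linear address into its segment value. */
static uint16_t segment_for(uint32_t linear)
{
    assert(is_paragraph_aligned(linear));
    assert(linear < 0x100000); /* 1 MiB real-mode address space */
    return (uint16_t)(linear / PARAGRAPH_SIZE);
}
```

An executable embedded at an offset that is not a multiple of 16 has no segment value that points exactly at its first byte, which matches the limitation described above.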
nmlgc 46b2d67143 [Reverse-engineering] Music and sound effect loader 2014-11-30 00:18:40 +01:00
nmlgc 08db7d6392 [Reverse-engineering] Sound mode determination
Note how it's only one *mode* in TH02/TH03, but two *modes* in TH04/TH05,
since you can't select between FM and Beep sound effect modes in TH02/TH03 (or
even disable sounds altogether). Might be a bit confusing, but it seemed
appropriate enough to distinguish the two functions.
2014-11-29 00:56:26 +01:00
nmlgc 181d2920af [Reverse-engineering] Symbols for PMD and MMD API calls 2014-11-27 19:35:54 +01:00
nmlgc de25d6de3e [Reverse-engineering] PMD and MMD function call wrapper
Well, the naming.

Even though only TH02 actually uses MIDI (and thus, the MMD driver), every
game since then contains interrupt instructions for both functions. We could
just name it "pmd", since it seems like that's what came first - the AH
numbers of the 6 functions that make up MMD's interrupt API are identical to
those of the equivalent functions in PMD, even including gaps in the numbering
for PMD functions that don't have an equivalent in MIDI. However, except for
the FM sound effect handling and the key display in TH05's Music Room, these 6
functions are all the games actually use. Also, we already distinguish between
PMD and MMD in the driver check functions, and it might be confusing to only
imply PMD from now on?

So, "kaja" it is, collectively referring to the shared aspects of both
drivers.
2014-11-26 21:21:57 +01:00
nmlgc 2da9a458ab [Reverse-engineering] snd_delay_until_volume
Only really used to delay during fade-outs, though.
2014-11-25 21:21:17 +01:00
nmlgc 98de0abfab [Reverse-engineering] Sound driver and hardware checks 2014-11-24 22:36:57 +01:00
nmlgc f40819b0e5 [Reverse-engineering] frame_delay 2014-11-23 22:32:26 +01:00
nmlgc 510a3a5070 [Reverse-engineering] pi_slot_palette_apply 2014-11-22 09:29:09 +01:00
nmlgc 5ad97a08ea [JWasm move] Fix the remaining small issues to get through the first pass
Thanks to the LOCALS directive, we do need to break compatibility with TASM at
one point after all. This is the rest we can reasonably change to get at least
through JWasm's first pass without errors while maintaining compatibility with
TASM.

Includes:
* the OPTION syntax to switch in and out of floating-point emulation mode
* REP CMPSB → REPE CMPSB
* Hacks for two 80-byte short jumps
* lack of support for floating-point stupidity ♥
as well as other issues that I covered in previous commits and overlooked in
some files.
2014-11-21 11:24:47 +01:00
nmlgc 2279e82167 [JWasm move] Use unique global names for local labels where it matters
From the TASM manual:
"NEAR labels defined with the colon directive (:) are considered block-scoped
if they are located inside a procedure, and you've selected a language
interfacing convention with the MODEL statement. However, these symbols are
not truly block-scoped; they can't be defined as anything other than a near
label elsewhere in the program."

MASM's own local label syntax - declaring labels using @@ and then jumping to
the next and previous @@ using @F and @B - is obviously too limiting for any
longer function, and is not even supported by TASM unless we switch it to MASM
mode completely.

While this is indeed ugly, it only affected 16 files, which is way less than
what we would get in a TASM build without LOCALS. In comparison to having a
modern, cross-platform assembler, that really is a small price to pay.
2014-11-21 08:40:41 +01:00
nmlgc 5e35cfb1af [JWasm move] Fix improper structure declarations
Really, Borland? You considered it necessary to add directives for object-
oriented programming (in Assembly!) and convenience features like bitfield
records or PUSHSTATE/POPSTATE, yet you never came up with the actually
*helpful* idea of just adding a simple basic pointer data type that depends
on the current memory model's data size?
Like, something like DP... oh wait, that's already taken, as an alias for
DF, the 48-bit 80386 far pointer type.

And this, exactly, is the problem with assemblers. The language itself is
undefined beyond the instructions themselves, but it's obviously very
uncomfortable to program anything with just that, so your assembler needs to
add custom directives on top of that, and of course everyone has different
ideas of the features and use cases that should (and should not) be covered by
syntax. (I'm looking especially at you, NASM.)

And then one of those developers sells their compiler division to a different
company, which then subsequently discontinues all products without ever
releasing the source code, trapping their nice extensions in a single
executable for a single platform that is not even legally available anymore.

tl;dr: http://xkcd.com/927/
2014-11-20 04:55:57 +01:00
nmlgc b532a96c7e [JWasm move] Avoid "push large"
For 32-bit immediate values, PUSH by itself is enough. For everything else,
PUSHD works in both TASM and JWasm.

Also, could it be...? Could we actually move to JWasm without breaking the
build in TASM at all?
2014-11-19 12:09:22 +01:00
nmlgc 877804c739 [JWasm move] Prefixes must be on the same line as the modified instruction 2014-11-19 07:31:59 +01:00
nmlgc e551d590bd [JWasm move] Fix the interrupt vector declarations in c0[data].asm
The leniency! It hurts!
2014-11-19 07:26:12 +01:00