Introduction
Well - here it is: this one...
This one's a tad rough - many elements, multiple files, but very extensive and probably the most valuable of all the file formats in this game to crack. A little bit of encryption, a bit of compression, some character encoding; they have it all.
This is more than likely going to be a multi-part series, as there are still some things to clear up, so let's get to it!!!
Raw File
Standard LH5 compressed file; roughly 80k. 0x08 with a 3 byte size, big endian; you know the drill by now.
The size of this one starts with the normal big endian two bytes, but then the third byte gets shifted left by 0xF and OR'd onto the result, meaning:
0x8868 OR (0x05 << 0xF, which is 0x28000) = 0x28868, or 165992 bytes.
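The size arithmetic above can be sketched in a few lines of Python. This is only a sketch: the function name `decode_size` is made up for illustration, and it assumes the three size bytes arrive in the order described (two big endian bytes first, then the byte that gets shifted).

```python
def decode_size(size_bytes: bytes) -> int:
    """Decode the 3 byte size field that follows the 0x08 marker.

    Assumption: bytes 0-1 are a big endian 16 bit value and byte 2 is
    shifted left by 0xF and OR'd in, as in the worked example.
    """
    if len(size_bytes) != 3:
        raise ValueError("expected exactly 3 size bytes")
    low16 = int.from_bytes(size_bytes[:2], "big")  # the normal big endian two bytes
    high = size_bytes[2] << 0xF                     # third byte, shifted left by 15
    return low16 | high


print(hex(decode_size(bytes([0x88, 0x68, 0x05]))))  # 0x28868
```

Feeding it the bytes from the example (0x88, 0x68, 0x05) gives 0x28868, which matches the 165992 bytes worked out by hand.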
For Dreamcast: All compressed cards are a single format (0x08 header with a 3 byte size).
For Playstation 2: Most are like Dreamcast, but some cards (like hidden ones such as Carbunfly) start with 0x09... more on that later.
An extracted card header looks like this:
An extracted 0x09 compressed card looks like this:
Whoa - well... something's wrong here. It could be that the developers didn't want these cards showing up somewhere, it could be they were privy to tools to dump the cards - hard to say.
At first the card looks corrupt, but we can see the start of the word 'Ca' in one part, and it looks like it still retains some form...
Assuming that this has to be decrypted in memory, and that the scheme has to be simple enough to run quickly, I started looking online for Carbunfly's values, because it appears that the short (16 bit) values are still in place (you can see the header size still looks like a 16 bit value).
Referring back to the 'corrupted' data, let's say that our ST value (40), which is 0x28, is SUPPOSED to actually BE 0x28. It's currently 0xFFDC. What's 0x28 - 0xFFDC? WELL - in 16 bit arithmetic it's 0x4C!!!
We've run into some kind of chain cipher algorithm! Because we're 'scientists', let's actually confirm this; the next value should be 0x28 as well. What's 0x28 - 0x00? It's 0x28!!!
One more! The next one is 100 for G (0x64) what's 0x64 - 0x3C? 0x28!!!!! YESSIR!!!
OK! Excitement aside, let's write a decryptor!
The Decrypt Algorithm
For every 16 bit value (two bytes) after the first:
- Add the 16 bit value behind it (already decrypted) to it, wrapping around at 16 bits.
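Here is a minimal sketch of that rule in Python. A few things in it are assumptions rather than confirmed facts: the name `decrypt_words` is invented, the words are read as little endian by default (pass `"big"` if that turns out to be wrong), the "16 bits behind" is taken to be the already decrypted value (which is what the ST/HP/G checks above suggest), and the sample words are illustrative values pieced together from the walkthrough, not a real dump.

```python
import struct


def decrypt_words(data: bytes, byteorder: str = "little") -> bytes:
    """Undo the chain cipher on a buffer of 16 bit words.

    The first word is left alone; every later word has the previously
    decrypted word added to it, wrapping at 16 bits.
    """
    if len(data) % 2:
        raise ValueError("data length must be a multiple of 2")
    prefix = "<" if byteorder == "little" else ">"
    fmt = f"{prefix}{len(data) // 2}H"
    words = list(struct.unpack(fmt, data))
    for i in range(1, len(words)):
        words[i] = (words[i] + words[i - 1]) & 0xFFFF  # running 16 bit sum
    return struct.pack(fmt, *words)


# Illustrative sample: header size, ST, HP, G as they might look encrypted.
encrypted = struct.pack("<4H", 0x004C, 0xFFDC, 0x0000, 0x003C)
plain = struct.unpack("<4H", decrypt_words(encrypted))
print([hex(w) for w in plain])  # ['0x4c', '0x28', '0x28', '0x64']
```

The sample reproduces the hand checks: 0xFFDC turns back into 0x28, the 0x0000 after it becomes 0x28, and 0x3C becomes 0x64.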
On to everything else!
Card File Structure
As a breakdown, the card file structure is as follows:
[Sprite file - if creature]
[Animation Data - if creature]
This area has a number of card-related metadata such as:
- Size of the pre-text header in bytes
- ST (which equates to the card's attack rating)
- HP (well...hp)
- MHP (Max HP)
- G (Cost to use card)
- Type (Neutral, Earth, Air, Fire, Water, Spell, Weapon, Armor, etc.)
- Land restriction (can't use with types above)
- Artist ID who drew the card (somewhere)
- Item restriction (can't use with armor, spell, scroll, etc.)
- Other values (Will come back to these - most are like, extra cost to use, etc.)
- Card ID in the set
- Offset to Card Graphic
- Offset to Sprite
- Offset to Animation Data
- Title of card (always at offset 0x34)
- Description Page 1 (Normally with a repeated title header)
- Description Page 2
- Description Page 3
Only a couple of notes here:
The devs used special non-printable byte values in the text to stand for in-game icons (elements, weapons and so on) that get displayed instead of text. This makes Python throw a shitfit; I replace them with special characters so I can deal with them at a later time.
The Dreamcast strings are all Shift-JIS... Python's default JSON module doesn't like them - partially because of that, and partially out of laziness, my JSON that I'll talk about later on isn't pretty-printed.
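As a rough illustration of the two notes above, the sketch below swaps non-printable bytes for a visible placeholder before decoding the string as Shift-JIS. The `{icon:XX}` placeholder, the function name `decode_card_text`, and the guess that the icon codes are all below 0x20 are my own assumptions for the example; the real codes may differ.

```python
import json


def decode_card_text(raw: bytes) -> str:
    """Replace icon bytes with placeholders, then decode as Shift-JIS.

    Assumption: icon codes are control bytes below 0x20. Shift-JIS trail
    bytes are never that low, so replacing them byte by byte is safe.
    """
    out = bytearray()
    for b in raw:
        if b < 0x20 and b not in (0x0A, 0x0D):  # keep line breaks as they are
            out += f"{{icon:{b:02X}}}".encode("ascii")
        else:
            out.append(b)
    return out.decode("shift_jis", errors="replace")


# Hypothetical title: one icon byte followed by Shift-JIS text.
sample = b"\x03" + "カード".encode("shift_jis")
title = decode_card_text(sample)
print(json.dumps({"title": title}, ensure_ascii=False))
```

Passing `ensure_ascii=False` to `json.dumps` keeps the Japanese text readable in the output instead of turning it into `\u` escapes.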
The sprite is a basic LH5 compressed file; the two consoles differ greatly, however.
The Dreamcast uses an 8 bit texture that has been twiddled (the pixels are stored in a
rearranged order so they load faster into the GPU). It also has no color data embedded;
an external palette is used, and I have yet to find it (more on this in part 2). Basically, they're a real mess to extract (lots of math to flip stuff around, etc.)
The Playstation 2 version is 8 bit as well, but is indexed and uses a CLUT (also note that the PS2 re-adjusts the width of the sprite):
Basically, a Color Lookup Table is a palette of all the colors in an image. Instead
of storing color values, each pixel need only store a 1 byte index of the color it
requires at that spot (meaning the palette can hold up to 256 colors, whether each
entry is a 16 bit or a 32 bit (ARGB) color).
(lifted from wikipedia)
Reconstructing this image is fairly easy, then. We:
- Read all the 16 bit colors in the palette, convert them to 32 bit RGB values
- Go through each pixel, find which number out of 256 it points to
- Draw a pixel at [x,y] on a new image with that color.
What if you don't have enough colors to fill the palette? Well, it just repeats the
colors you do have until the palette memory is 1024 bytes in size (LOL).
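The three steps above can be sketched like this. It is not the code from my tool, just an illustration: the function names are made up, and the bit layout (alpha in the top bit, then 5 bits each of red, green and blue) is an assumption that may need the red and blue fields swapped for real PS2 data.

```python
from typing import List, Tuple

Color = Tuple[int, int, int, int]


def argb1555_to_rgba(value: int) -> Color:
    """Expand one 16 bit palette entry to 8 bits per channel (R, G, B, A)."""
    alpha = 255 if value & 0x8000 else 0
    r5 = (value >> 10) & 0x1F
    g5 = (value >> 5) & 0x1F
    b5 = value & 0x1F

    def expand(c: int) -> int:
        return (c << 3) | (c >> 2)  # scale 0..31 up to 0..255

    return (expand(r5), expand(g5), expand(b5), alpha)


def apply_clut(indices: bytes, palette: List[int], width: int) -> List[List[Color]]:
    """Turn 1 byte palette indices into rows of RGBA pixels."""
    colors = [argb1555_to_rgba(entry) for entry in palette]
    rows = []
    for start in range(0, len(indices), width):
        rows.append([colors[i] for i in indices[start:start + width]])
    return rows


# A 2x2 checkerboard using a two entry palette (black and white).
image = apply_clut(bytes([0, 1, 1, 0]), [0x8000, 0xFFFF], width=2)
print(image[0])  # [(0, 0, 0, 255), (255, 255, 255, 255)]
```

The rows of RGBA tuples can then be handed to whatever image library you like to write out a PNG.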
Animation Format
This one is interesting - I haven't fully figured out how it works yet. It looks like a series of values that specify each frame's upper-right coord, the width and height, and some projection value. The first two bytes are definitely the size of the data, however.
The card graphic, unlike the sprite, isn't compressed. It's actually 256x320 ARGB1555 in the Playstation 2 version (as basically everything is) and RGB565 with a BGR channel swap in the Dreamcast version; so swap those channels or you'll get blue when you want red!
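For the Dreamcast side, the channel swap described above could look roughly like the sketch below. The function names are invented, the 16 bit pixels are assumed to be little endian, and the "swap" is read as "the top 5 bits are really blue and the bottom 5 bits are really red"; treat all of that as a guess to verify against a real card.

```python
import struct
from typing import List, Tuple


def dreamcast_565_to_rgb(value: int) -> Tuple[int, int, int]:
    """Decode one 16 bit 5-6-5 pixel with red and blue swapped."""
    hi5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    lo5 = value & 0x1F
    red = (lo5 << 3) | (lo5 >> 2)    # low field holds red after the swap
    green = (g6 << 2) | (g6 >> 4)
    blue = (hi5 << 3) | (hi5 >> 2)   # high field holds blue after the swap
    return (red, green, blue)


def decode_card_graphic(data: bytes, width: int = 256, height: int = 320) -> List[List[Tuple[int, int, int]]]:
    """Decode an uncompressed width x height block of 16 bit pixels into rows."""
    if len(data) != width * height * 2:
        raise ValueError("unexpected card graphic size")
    words = struct.unpack(f"<{width * height}H", data)  # assumed little endian
    return [
        [dreamcast_565_to_rgb(w) for w in words[y * width:(y + 1) * width]]
        for y in range(height)
    ]


print(dreamcast_565_to_rgb(0x001F))  # (255, 0, 0)
```

Skip the swap and that same 0x001F pixel comes out pure blue instead of pure red, which is exactly the mix-up to watch for.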
The end result is something like:
Oprah Moment
For fun, I wrote a WIP tool that dumps all the card files, decrypts the PS2 encrypted cards, and writes the results like so:
- Metadata -> JSON file
- Sprite -> PNG
- Animation -> Bin file
- Card Graphic -> PNG
I've also zipped them below for those interested.
Stay tuned for Part 2.