The marching little byte in Eggerland Mystery
This blog post is also available as a YouTube video:
In the last blog post, we looked at how the level data was stored in Eggerland Mystery. This time, we'll take a different approach. Instead of having a clear goal, and figuring out how something was exactly implemented, we'll take a look at something interesting that can be seen in the memory view, while we're playing the game, and try to figure out what this memory represents.
Take a look at the video below, to see the marching byte in action:
The game seems to be filling up this part of memory, starting at address 0xD2D6
, mostly with the
value 0x0F
. It neatly counts up from zero to fifteen, then progresses to the next byte, when the
'limit' of 0x0F
has been reached.
Table of contents:
- Taking a step back
- Looking at the code
- Figuring out 0xD2D0 and 0xD008
- Another look at the marching byte
- Bringing it all together
- Some more details
Taking a step back
If you remember the last blog post, the
first thing we did was figure out where the round number was stored. While using the cheat finder, we ended up
with two addresses potentially containing the current round number. We went ahead with the first entry in the
list, because when we changed it, the round number in-game immediately changed as well, but now let's look at
the second result we encountered. It had the address 0xD2D4
. Looking at this memory region again,
we can see that this is very close to the data we are investigating. This is how this region looks when we
boot up the game, in the main menu.
D2D0: FF FF 00 00 56 FF 12 7F 73 5C 38 54 78 55 33 14...
= Round number (presumably) at 0xD2D4
= The zone at 0xD2D6 that gets overwritten by the marching byte
This presumed round number byte being in close proximity of the data we are interested in could be a coincidence, but let's indulge a bit further, and start the game in round 1. That same memory now looks like this, after the round is loaded, but before it is started and the (internal) timer starts to tick down.
D2D0: 00 00 D6 D2 01 00 01 7F 73 5C 38 54 78 55 33 14...
= Round number at 0xD2D4, now changed to 01
= The zone at 0xD2D6, with the first byte changed to 01
We notice that the round number at 0xD2D4
did indeed change from 0x56
to
0x01
, seeing as we are now in round one. The value 0x56
at startup isn't a surprise
either, it's 86 in decimal, and we recognize that as the round number for the demo mode, which triggers if we
wait a few moments in the main menu.
Do you see something else that's peculiar in this area of the memory? The bytes at 0xD2D2
and
0xD2D3
changed from 00 00
to D6 D2
!
D2D0: 00 00 D6 D2 01 00 01 7F 73 5C 38 54 78 55 33 14...
If we interpret those two bytes as a 16-bit pointer, we end up with 0xD2D6
(we have to flip them
around because of endianness). 0xD2D6
happens to exactly point to the first byte of the marching
byte memory area we are investigating. Now take a look at the video of the marching byte again, keeping a close
eye on that pointer in particular.
Did you notice that the pointer gets updated each time the marching byte jumps to the next position? It clearly points to the "current position" of our marching byte.
Before we even look at the assembly, we are already aware of three memory locations:
0xD2D6
: a region of memory in which the marching byte gets written0xD2D4
: the current round number, most likely related to what we're doing0xD2D2/0xD2D3
: a pointer to where in the marching byte region we currently are
Looking at the code
The first thing you would do is, of course, put a write watchpoint on that first byte of memory at
0xD2D6
, and see if we can deduce something from the code that writes here.
Immediately, even before the first round is loaded, our watchpoint triggers at address 0x4271
,
where the hard-coded value of 0x01
is written in the first byte. Seeing as we haven't started
to
play the round itself yet, and there's a hard-coded value being written, we can assume this is some sort of
initialization.
0x4271 36 01 ld (hl),#01
The HL register does of course contain 0xD2D6
, pointing to the first byte of the area in which
our
marching
byte walks.
After continuing execution, the watchpoint doesn't trigger again, but we haven't started playing the round yet. So let's do that. I press the down arrow, to start the gameplay, and we break on our watchpoint immediately. The line that triggers the watchpoint is this:
0x447d 3a 19 c8 ld a,(#c819)
0x4480 a7 and a
0x4481 c0 ret nz
0x4482 3a 60 d2 ld a,(#d260)
0x4485 a7 and a
0x4486 c0 ret nz
0x4487 2a d2 d2 ld hl,(#d2d2)
0x448a 11 00 00 ld de,#0000
0x448d 3a d0 d2 ld a,(#d2d0)
0x4490 a7 and a
0x4491 28 02 jr z,#4495
0x4493 1e 80 ld e,#80
0x4495 3a 08 d0 ld a,(#d008)
0x4498 87 add a,a
0x4499 87 add a,a
0x449a 87 add a,a
0x449b 87 add a,a
0x449c 57 ld d,a
0x449d 3a d1 d2 ld a,(#d2d1)
0x44a0 a7 and a
0x44a1 20 0b jr nz,#44ae
0x44a3 b2 or d
0x44a4 b3 or e
0x44a5 c8 ret z
0x44a6 3c inc a
0x44a7 77 ld (hl),a
0x44a8 3e 01 ld a,#01
0x44aa 32 d1 d2 ld (#d2d1),a
0x44ad c9 ret
0x44ae 7b ld a,e
0x44af a7 and a
0x44b0 28 08 jr z,#44ba
0x44b2 23 inc hl
0x44b3 b2 or d
0x44b4 3c inc a
0x44b5 77 ld (hl),a
0x44b6 22 d2 d2 ld (#d2d2),hl
0x44b9 c9 ret
0x44ba 7e ld a,(hl)
0x44bb e6 70 and #70
0x44bd ba cp d
0x44be 28 07 jr z,#44c7
0x44c0 23 inc hl
0x44c1 14 inc d
0x44c2 72 ld (hl),d
0x44c3 22 d2 d2 ld (#d2d2),hl
0x44c6 c9 ret
0x44c7 7e ld a,(hl)
0x44c8 4f ld c,a
0x44c9 e6 0f and #0f
0x44cb fe 0f cp #0f
0x44cd 28 f1 jr z,#44c0
0x44cf 3c inc a
0x44d0 47 ld b,a
0x44d1 79 ld a,c
0x44d2 e6 f0 and #f0
0x44d4 b0 or b
0x44d5 77 ld (hl),a
0x44d6 c9 ret
0x44d7 16 00 ld d,#00
0x44d9 59 ld e,c
0x44da 77 ld (hl),a
0x44db 19 add hl,de
0x44dc 10 fc djnz #44da
0x44de c9 ret
This is another quite elaborate function, with multiple returns, and lots of unknown variables/memory locations. Let's look at the start. We immediately start with two checks for unknown variables.
0x447d 3a 19 c8 ld a,(#c819)
0x4480 a7 and a
0x4481 c0 ret nz
0x4482 3a 60 d2 ld a,(#d260)
0x4485 a7 and a
0x4486 c0 ret nz
If they are non-zero, we bail out of this function early. Considering we're breaking deeper inside this function, we can assume these values are currently zero.
Now, for some more interesting stuff:
0x4487 2a d2 d2 ld hl,(#d2d2) ; 0xD2D2 = Current position in marching byte area
0x448a 11 00 00 ld de,#0000
0x448d 3a d0 d2 ld a,(#d2d0)
0x4490 a7 and a
0x4491 28 02 jr z,#4495
0x4493 1e 80 ld e,#80
0x4495 3a 08 d0 ld a,(#d008)
0x4498 87 add a,a
0x4499 87 add a,a
0x449a 87 add a,a
0x449b 87 add a,a
0x449c 57 ld d,a
We first load the 16-bit value at 0xD2D2
into the HL
register. We already established
that 0xD2D2/0xD2D3
contains a pointer to the current position in our marching byte area. We need
this pointer in HL
to write our calculated byte to that area.
We also initialize the DE
register to 0x0000. If we look a bit further in the code, we can see we
are writing stuff to the D
and E
registers, so it makes sense we are using these to
calculate some value, which will ultimately end up in the marching byte area.
The three next lines test the value of address 0xD2D0
for zero, and if it's not zero, we put
0x80
in the E
register, otherwise it remains zero.
Now, we've arrived at address 0x4495
, where we load the value of 0xD008
into the
A
register, another yet unknown memory location. This value gets doubled four times, and loaded
into the D
register. Doubling a value shifts its bits one position to the left. If you do that four
times in an 8-bit number, then you've shifted the bottom nibble to the top nibble. For example,
0x07
doubled four times becomes 0x70
.
Before we go any further in this big function, let's convert the code above to pseudocode and then figure out
what type of information 0xD2D0
and 0xD008
actually hold.
bool b_C819 = false;
bool b_D260 = false;
if ( b_C819 || b_D260 ) {
return;
}
uint8_t* current_position_in_marching_byte_area; // *(0xD2D2)
uint8_t d = 0;
uint8_t e = 0;
bool b_D2D0 = ?;
if ( b_D2D0 ) {
e = 0x80;
}
uint8_t u8_D008 = ?;
d = u8_D008 << 4;
To make sense of this, we need to figure out what the question marks mean.
Figuring out 0xD2D0 and 0xD008
There's no magic bullet when it comes to figuring out what a completely unknown byte in memory does. Sure, you can put a read or write watchpoint on the location, and start reading assembly, but that can sometimes get you multiple layers deep in code, surrounded by many other unknown memory locations and many calls to other functions. An alternative is to just keep the memory view window open, and simply try different things in game. Once you see the value change in memory, figure out what you were doing at that exact moment in time, be it shooting an enemy, walking over the exit, picking up a diamond framer, ...
First, let's see what happens to 0xD008
(third row, last column) when I move Lolo around in the
game.
The value at 0xD008
seems to change whenever I change Lolo's direction, representing which key I
hold down. It switches quite quickly between the value and 00
, but that's not really important,
probably some sort of debouncing mechanism. The important part is the values we see for each direction:
0
: standing still1
: moving UP3
: moving RIGHT5
: moving DOWN7
: moving LEFT
We can therefore conclude that 0xD008
contains the movement direction.
With the same way of experimenting, we see that 0xD2D0
becomes 01
when the spacebar
key is pressed down, though it is also debounced in a similar way.
The relevant pseudocode now looks like this:
uint8_t e = (spacebar_held) ? 0x80 : 0;
uint8_t d = (movement_direction) << 4;
Now that we know we are dealing with the spacebar status and movement, we'll take another look at the marching byte, this time while moving Lolo around in the level.
Another look at the marching byte
It looks like there are two conditions for the marching byte to skip to the next position! One, we've already
seen, when the bottom nibble reaches 15, or 0x0F
. But now there's a second reason: whenever we
change direction, the byte skips to the next position as well.
We know from looking at the assembly that we take the movement_direction (and spacebar status) and move that in the top nibble. The bottom nibble we can visually see incrementing, it looks like some sort of timer or counter. If we look at the little video above, we ended up with:
35 75 35 52 12
53 13 53 14 74
We can interpret this as:
- Move RIGHT (3) for 5 ticks
- Move LEFT (7) for 5 ticks
- Move RIGHT (3) for 5 ticks
- Move DOWN (5) for 2 ticks
- Move UP (1) for 2 ticks
- ...
These actions seem to correspond exactly with what Lolo is doing in the video. The question remains: why are we storing the timings of Lolo's movements in memory?
Bringing it all together
The answer to this remaining question is quite obvious, if you know the game a little bit. There's a big hint when the game boots up. Remember, when we started round 1, there was already data there that was overwritten as soon as we started moving. That data is consistently there, it's not random uninitialized memory.
Also, the round number at 0xD2D4
contains 0x56
or 86 decimal at startup.
When we put a read memory watchpoint on our marching byte array, we find that it only triggers in one exact spot: when we let the demo play. That is why it initializes to 86 at startup, and why there is consistent movement data already there, it's simply the solution to round 86.
The demo mode, which triggers when you wait a bit in the main menu, doesn't always play round 86 for you, it actually replays the last round you played yourself! And that is a definitive answer as to why we have to record all Lolo's movements, this area is the demo replay movement array.
Some more details
After investigating a little bit, I found out about a couple more memory locations, such as the two conditions at the start of the big function that had to be zero/false for the demo recording function to progress.
0x447d 3a 19 c8 ld a,(#c819)
0x4480 a7 and a
0x4481 c0 ret nz
0x4482 3a 60 d2 ld a,(#d260)
0x4485 a7 and a
0x4486 c0 ret nz
0xC819
gets set to 0x01
whenever the demo is playing. This makes sense, because you
don't want to record the movements during replay of the demo.
0xD260
gets set to 0x01
whenever the level timer runs down. Even though the level
timer is only shown on-screen when you play in mode B and you have to race against the clock, it is still
ticking down in gamemode A, in the background. The timer is initialized to 150 seconds, and when that reaches
zero, the game flips 0xD260
to 1, effectively stopping the recording of the demo movements.
The game engine seems to update at five ticks per seconds, so 150 seconds times 5 ticks makes for 750 potential
memory locations needed to store for the demo replay, in case you trigger a byte progression every tick. While
this is technically possible, it seems a bit unrealistic. If you're fast enough, you can keep pressing the
spacebar, so the demo replay area fills up with 0x81
. The 8 representing the space, and 1
representing a single tick. Of course, each spacebar press needs to be recorded separately as you might shoot
two bullets one right after the other, so each spacebar press gets a single entry in the array.
It is still very hard and nearly impossible for a regular player to fill up that array with data. Even still, it looks like they even reserved more than 750 bytes. The next byte that seems to be used after the demo replay array is 800 positions out. This looks too round to be a coincidence, and I don't see anything referencing the last 50 bytes elsewhere in the code either. This looks like a bit of a waste of memory, which could have been used for other stuff.
In any case, what prevents a buffer overflow here is the level timer. If you mess with that timer while the game is running, it will happily overwrite whatever comes after this array.
Finally, one thing that attracts attention in the assembly of the big function is line 0x44BB
0x44bb e6 70 and #70
If you remember the last blog post, about how the level data was stored in the game, we saw a lot of information being crammed into one byte. Multiple groups of bits representing separate pieces of information. Of course, we already know that we are doing the same here, we separated the top and bottom nibble, with the bottom nibble containing tick count and the top containing the movement direction.
But then, if we look at the code, the spacebar and direction testing were handled separately.
uint8_t e = (spacebar_held) ? 0x80 : 0;
uint8_t d = (movement_direction) << 4;
Spacebar and movement are OR
'ed together, so 0x80
or 1000 0000
in binary,
wasn't chosen at random. It simply sets the top bit of the byte (or the top nibble).
The movement data ranges from zero to seven, which perfectly fits inside three bits. That's why in line
0x44BB
we do an AND
with 0x70
, which is in binary 0111 0000
,
to see if the movement has changed and we need to start a new byte.
The exact storage of a demo replay byte is then, bitwise from left to right:
- 1 bit: spacebar pressed or not
- 3 bits: movement direction: 0, 1, 3, 5 or 7
- 4 bits: amount of ticks: 1 to 15
If you want to watch me do this type of reverse engineering live, follow me on twitch.tv/zappatic, or check out my YouTube channel.