
pitfalls – how to dumb

This page will basically document all the errors I made, some of which might happen to a lot of people by making false assumptions from the docs, some may just be me being dumb af. So, either way, I hope these entries help and/or give you some joy.

Forgetting the PBR

So, the root of many bugs and quite a few hours of debugging turned out to be forgetting to use the PBR (program bank register) when fetching bytes in opcodes or addressing modes.

For example:

u32 ADDR_getAbsoluteIndirectLong() {
	regs.PC += 2;
	u8 lo = BUS_readFromMem(regs.PC - 1);
	u8 hi = BUS_readFromMem(regs.PC);
	u16 adr = (hi << 8) | lo;
	u8 i_lo = BUS_readFromMem(adr);
	u8 i_hi = BUS_readFromMem(adr + 1);
	regs.setProgramBankRegister(BUS_readFromMem(adr + 2));
	return (regs.PB << 16) | (i_hi << 8) | i_lo;
}

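Looks pretty fine at first glance, and will probably pass a lot of tests as well, but will ultimately break your neck, because the reads of lo and hi are not using the PBR. They always fetch from bank zero, which only works as long as the code itself is running in bank zero.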

The correct way would be:

u32 ADDR_getAbsoluteIndirectLong() {
	regs.PC += 2;
	// The operand bytes live in the program bank, so prefix the PC with the PBR.
	u8 lo = BUS_readFromMem((regs.PB << 16) | (regs.PC - 1));
	u8 hi = BUS_readFromMem((regs.PB << 16) | regs.PC);
	u16 adr = (hi << 8) | lo;
	// The 24-bit pointer itself is read from bank zero.
	u8 i_lo = BUS_readFromMem(adr);
	u8 i_hi = BUS_readFromMem(adr + 1);
	regs.setProgramBankRegister(BUS_readFromMem(adr + 2));
	return (regs.PB << 16) | (i_hi << 8) | i_lo;
}

Debugging something like this can take quite a bit of time, since you’re probably only experiencing weird behaviour way after this has been called, and you will have to backtrack quite a bit.
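
One way to make this kind of bug harder to write is to route every operand fetch through a single helper that always prefixes the PC with the PBR. This is only a minimal sketch: Regs, memory, busRead and fetchOperand are hypothetical names made up for illustration, and a flat 16 MiB array stands in for the real bus.

```cpp
#include <cstdint>
#include <vector>

using u8 = std::uint8_t;
using u16 = std::uint16_t;
using u32 = std::uint32_t;

// Hypothetical stand-in for the emulator's register file.
struct Regs {
	u8 PB = 0;   // program bank register
	u16 PC = 0;  // program counter, 16 bit
};

// Hypothetical flat 24-bit address space instead of a real bus.
static std::vector<u8> memory(1 << 24, 0);

u8 busRead(u32 address) {
	return memory[address & 0xFFFFFF];
}

// Reads the byte at PB:(PC + offset). The sum is truncated to 16 bit,
// so it wraps inside the program bank instead of spilling into the next one.
u8 fetchOperand(const Regs& regs, u16 offset) {
	u16 pc = static_cast<u16>(regs.PC + offset);
	return busRead((static_cast<u32>(regs.PB) << 16) | pc);
}
```

With something like this in place, an addressing mode never touches regs.PC and the bus directly, so there is only one spot where the bank can be forgotten.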

PBR-modifying instructions

Another pitfall that can be really nasty in execution, and that a lot of tests won’t detect, is instructions that modify the PBR where you didn’t expect it.

The instructions that have this behavior are the following:

JSL 0x22 - Jump to Subroutine - Absolute Long
JML 0x5C - Jump - Absolute Long
JMP 0xDC - Jump - Indirect Long
RTL 0x6B - Return from Subroutine Long

Especially if you’re coming from developing other emulators, you might already have had an idea about these instructions and implemented them pretty easily. But you probably haven’t read a certain paragraph in the official 500-page doc:

New instructions and addressing modes were added to let you transfer control between banks: jump
absolute long (jump to a specified 24-bit address), jump indirect long (the operand is an absolute address in
bank zero pointing to a 24-bit address to which control is transferred), jump to subroutine long (to a specified
24-bit address, with the current program counter and program bank register pushed onto the stack first), and a
corresponding return from subroutine long, which re-loads the bank register as well as the program counter.
(The addressing modes are among those listed in Table 4.3, the instructions in Table 4.4.)
These instructions that specify a complete 24-bit address to go to, along with native mode’s software
interrupt and return from interrupt instructions, are the only ones that modify the value in the program bank

Meaning, this will often work, but is wrong:

u8 JMP_IND_LONG(u32(*f)(), u8 cycles) {
	u32 adr = f();
	regs.PC = adr;
	return cycles;
}

You will have to do something like this:

u8 JMP_IND_LONG(u32(*f)(), u8 cycles) {
	u32 adr = f();
	// The low 16 bits go into the PC, the top byte into the PBR.
	regs.PC = adr & 0xffff;
	regs.setProgramBankRegister((adr >> 16) & 0xff);
	return cycles;
}
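
To show why the broken version survives so many tests, here is a small standalone sketch. Cpu, jumpLong and jumpLongBroken are hypothetical names for illustration only; the point is just the split of a 24-bit target into bank and program counter.

```cpp
#include <cstdint>

// Hypothetical, stripped-down CPU state.
struct Cpu {
	std::uint8_t PB = 0;    // program bank register
	std::uint16_t PC = 0;   // program counter
};

// Loads both halves of a 24-bit jump target.
void jumpLong(Cpu& cpu, std::uint32_t target) {
	cpu.PB = static_cast<std::uint8_t>((target >> 16) & 0xFF);
	cpu.PC = static_cast<std::uint16_t>(target & 0xFFFF);
}

// What the broken version effectively does: the bank is left untouched.
void jumpLongBroken(Cpu& cpu, std::uint32_t target) {
	cpu.PC = static_cast<std::uint16_t>(target & 0xFFFF);
}
```

As long as the jump target is in the bank the code is already running in, both functions leave the CPU in the same state, so a test that never crosses a bank can't tell them apart. Only a jump into a different bank exposes the difference.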

Super Mario World – Vertical blue line

One of the weird graphical issues I encountered when I got Super Mario World running for the first few times was a blue vertical line straight through the middle of the screen.

As you may be aware, SMW uses the window in the beginning for the zoom effect, and I knew that this particular blue was the fixed color.

After digging around a bit, I saw the mistake:

const bool in_W1 = window.W1_LEFT <= RENDER_X && RENDER_X <= window.W1_RIGHT;
const bool in_W2 = window.W2_LEFT <= RENDER_X && RENDER_X <= window.W2_RIGHT;

Here I missed that when W1_LEFT is bigger than W1_RIGHT, the whole screen gets drawn. It’s an essential piece of the pixel pipeline, and I had simply left it out. So when I changed the code to:

const bool in_W1 = (window.W1_LEFT <= RENDER_X && RENDER_X <= window.W1_RIGHT) && window.W1_RIGHT > window.W1_LEFT;
const bool in_W2 = (window.W2_LEFT <= RENDER_X && RENDER_X <= window.W2_RIGHT) && window.W2_RIGHT > window.W2_LEFT;

the blue line was gone. Yay.

Super Mario World – Broken Zoom

So, as already mentioned in the last pitfall, there is the zoom animation in the SMW intro. This time, the zoom animation itself was broken.

Ignore the color for the moment, it’s actually supposed to be black. But, a little bit above the grass you can see the edges on both sides.

I won’t go into technical detail here, because I can’t really show anything, but in the end it turned out that I wasn’t correctly syncing CPU and PPU, so the table data for the HDMAs that create the circle was read at the wrong times. The tables are supposed to be loaded in the area where you can see the wooden border, but in my case the new tables were read mid-circle, so the circle HDMAs got the values for the next frame way too early, which created the split in the circle.

Super Mario World – Falling Koopaling

Okay, this one really got to me. I was basically booting SMW, and was about to fix my sprites (you will see them flipped wrong and in wrong colors, just ignore that), when I noticed this weird behavior.

If you look closely (and are somewhat familiar with the SMW intro), the koopaling on the slope doesn’t slide down the slope, but falls through it! Usually Mario would jump off of him and continue to the next koopas. But since the koopaling is missing, Mario can’t jump off of it, therefore messing up the entire rest of the intro as well (it’s just a timed input of buttons after all).

So, where even to begin to debug something like this? In the end it took me around 8 hours to finally find the culprit. Luckily, there are a few disassemblies of SMW out there that helped in understanding the game a bit. I still had to wait for the sprite to appear, check where it reads its positions from, and find out when these positions get changed, and then why. It was real torture: hundreds of notes, thousands of restarts and breakpoints. The final bugfix was minuscule (and dumb), but fixing something is still rewarding:

My Absolute Long Indexed X addressing mode looked like this:

return (BUS_readFromMem((regs.PB << 16) | regs.PC) << 16) | (BUS_readFromMem((regs.PB << 16) | regs.PC - 1) << 8) | BUS_readFromMem((regs.PB << 16) | regs.PC - 2) + regs.getX();

Looks right at first glance, but it is not.

return ((BUS_readFromMem((regs.PB << 16) | regs.PC) << 16) | (BUS_readFromMem((regs.PB << 16) | regs.PC - 1) << 8) | BUS_readFromMem((regs.PB << 16) | regs.PC - 2)) + regs.getX();

This is the necessary solution. Yes, 2 additional brackets. The X register is supposed to be added to the final 24-bit value, but in the first version it gets added to BUS_readFromMem((regs.PB << 16) | regs.PC - 2) first, because addition takes precedence over bitwise OR.
This would often lead to valid results, but not when the sum of the low byte and X reaches into bits that are already set in the bytes ORed in front of it.

After finally figuring this out, adding the 2 brackets fixed it. Hooray!
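
To make the precedence problem concrete, here is a reduced sketch with the bus reads replaced by plain parameters. The function names and the example values are made up for illustration; only the expression shape matches the code above.

```cpp
#include <cstdint>

// Missing brackets: x is added to the low byte before the ORs happen,
// because + binds tighter than |.
std::uint32_t longIndexedBroken(std::uint8_t bank, std::uint8_t hi,
                                std::uint8_t lo, std::uint16_t x) {
	return (bank << 16) | (hi << 8) | lo + x;
}

// With brackets: the 24-bit base address is assembled first, then x is added.
std::uint32_t longIndexedFixed(std::uint8_t bank, std::uint8_t hi,
                               std::uint8_t lo, std::uint16_t x) {
	return ((bank << 16) | (hi << 8) | lo) + x;
}
```

For a base of 0x7E0010 and X = 0x20 both versions give 0x7E0030, because the sum stays inside the low byte. For a base of 0x7E0180 and X = 0x80 the low byte plus X is 0x100, which collides with the bit already set by the middle byte: the broken version returns 0x7E0100, the fixed one 0x7E0200.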

Super Mario World – Map Glitches

After finally fixing the falling koopaling I was hoping for a somewhat playable game at that point, but I was disappointed yet again. The world map would glitch out on loading.

The code was running into areas it wasn’t supposed to, and my stack pointer was going crazy. I found out that some tables that are being read for the map were filled with the wrong data. After more digging around (I don’t want to bore you with how debugging works in every single one of these entries), I found out that my BRL instruction was broken: I had it set the PC too far. Fixing that only got me one additional frame, though.

So, I had another bug in it:

u8 BRL(u32(*f)(), u8 cycles) {
	u8 pb = regs.PB;
	u32 adr = (pb << 16) | f();
	i8 offset = (BUS_readFromMem((pb << 16) | adr + 1) << 8) | BUS_readFromMem((pb << 16) | adr);
	regs.PC += offset;
	return cycles;
}

If you look closely, you can spot it. A 16-bit value sadly won’t fit into an 8-bit datatype, so of course I had to change the i8 into i16. After that, the map would finally load properly (ignore the pixel-pipeline issues with the black background, it had something to do with transparency not working properly).
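
The truncation is easy to reproduce in isolation. The following is just a sketch with made-up helper names (brlOffset, brlOffsetTruncated, applyBranch), taking the two operand bytes as parameters instead of reading them from the bus.

```cpp
#include <cstdint>

// Assembles the signed 16-bit BRL displacement from its two operand bytes.
std::int16_t brlOffset(std::uint8_t lo, std::uint8_t hi) {
	return static_cast<std::int16_t>((hi << 8) | lo);
}

// The same value squeezed into 8 bits, like in the broken version:
// the high byte is simply thrown away.
std::int8_t brlOffsetTruncated(std::uint8_t lo, std::uint8_t hi) {
	return static_cast<std::int8_t>((hi << 8) | lo);
}

// Adds a displacement to the PC, wrapping at 16 bit.
std::uint16_t applyBranch(std::uint16_t pc, std::int16_t offset) {
	return static_cast<std::uint16_t>(pc + offset);
}
```

With the operand bytes 0x34 and 0x01 the real displacement is 0x0134, but the 8-bit version only keeps 0x34, so the branch lands 0x100 bytes short. Small displacements between -128 and 127 come out the same in both versions, which is probably why a bug like this can hide for so long.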

