Re-writing after fatal cock up

At the end of last week I had a fatal lapse of concentration during a coding session. I have these from time to time as I have trouble with short term memory when I’m working on a problem if I get distracted I can seriously lose the thread of my idea. Normally this is ok and I manage to get back onto my idea eventually (I sometimes have to re-discover an idea to do this). Unfortunately this happened while I was in the middle of a major bit of refactoring of my code base. I had the code broken and far away from being rebuilt, but when I came back to it I couldn’t remember what I was in the middle of doing. I tried to reconstruct some of the code but got myself into a bit of a pickle.

The code was terminally naffing up the stack and I spent quite a bit of time trying to work out what I had done, but I never picked the idea thread up again. On Sunday I decided that it was probably gonna be more productive to junk practically all of the code, about 4KB worth and take my core idea for refactoring forward from scratch.

I’ve spent some time the last 3 days and now have a much better working codebase and have myself in a much better position.

saving memory

One of the “joys” of assembly programming in restricted environments is trying to make sure that your stack and your variables and data structures do not collide. Despite the way we always teach the abstract stack mechanism (as a series of locations with the index increasing as we add more items)., stacks managed by the cpu grow downwards and are usually placed at the very top of user RAM.

As we add things to the stack the stack moves nearer to our data. At some point we may have a meeting of stack and our data, this most often is fatal as the stack contains return address from subroutine calls and ISR calls.

If we overwrite a return address on the stack with some game data the control unit will retrieve an incorrect return address when it executes an RTS instruction and happily continue execution from that point. This could be some random bit of code or even an area of data. The control unit on the cpu doesn’t know the context of memory (this is a Von Neumann machine where data and instructions occupy the same memory and are stored in the same format) so will decode whatever byte of data it retrieved from the address in the PC and execute it. Its best to make sure the stack doesn’t encroach on user data.

This is exactly the problem I had today, so I’ve been on the initial memory saving hunt without having to do another major refactor (although that may still have to happen as I add more stories for the stunt man).

We can reduce the depth of the stack by making more use of jumps and macros, but this is not always desirable if a subroutine is to be used many times in different contexts. One way around this and a technique I fall back on when I’m using the improved speed of the stack for pushing data around on Z80 hardware is to supply a return address in an index register and jump to the routine I want to execute which ends with an indexed jump back like this in 6809  jmp [,x].

see my Pac-Man pixel scrolling code for an example of what you can do on that hardware.

packing data

Another thing you can do to save memory is to pack multiple data items into a single byte.

When we work in high level languages, we usually use native word lengths (32 or 64 bit) for data items even if all we want is a boolean value. This is very wasteful and completely acceptable most of the time as we have quite a large amount of RAM to play with on modern machines. We don’t have that luxury on old school hardware, Pac-Man has about 900 useable bytes of RAM and the Vectrex has about 1K of usable RAM.

Techniques that are easy and perfomant on a PC are usually impossible to implement on old hardware, that’s part of the fun of coding on this sort of cool restricted hardware.

Anyway back to packing, I had part of my object data structure that looked like this:

base_vis        equ  0             ; 1 byte - 1 means draw, 0 means don't draw 
base_size       equ base_vis + 1   ; scale factor for sprite 
base_x          equ base_size + 1  ; 2 bytes $xxxx 
base_x_screen   equ base_x + 1     ; offset to screenwidth position information 
base_y          equ base_x + 2     ; 2 bytes $xxxx 
base_y_screen   equ base_y + 1     ; offset to screenheight position information 
base_image      equ base_y + 2     ; 2 bytes $xxxx addr of image data 
base_len        equ base_image + 2 ; base_image + 2 ; length of a base sprite

In this structure I was using a byte for visiblity and the following one for the scale of an object. The visiblity only really requires a single bit to state whether an object is visible or not, so I looked to see if I could combine this with the scale byte. If I choose to keep the scale to 127 or less that frees up the MSB or Sign bit of the byte to be used to hold visiblity. The knock on of this is, if I set the MSB I can use this to indicate that the object should not be drawn, it doesn’t matter that looking at the byte would add 128 to the scale as we are not going to draw the object anyway. If we clear the MSB then we can use this to indicate that the object can be drawn, the bonus is when this is zero it has no effect on the 7 bits representing the scale, it’s a win win.

Testing for not to draw the object can be performed easily with some code like this, which just test checks for the sign of the last operation:

 lda line_vis,x            ; get visibliity 
 bmi draw_skip_not_visible

Making an object visible or invisible can be performed by using an AND mask and an OR MASK as follows:

; sets sprite at X to be not drawn
; sets msb
              lda spr_vis,x 
              ora #$80 
              sta spr_vis,x 
; sets sprite at X to be drawn
; clears msb
VISIBLE_X     macro 
              lda spr_vis,x 
              anda #$7f 
              sta spr_vis,x 

this saves a byte per object (in my current format), giving me a little breathing room and hopefully stopping the stack coming a knocking. Here’s the current format:

base_vis      equ 0              ; msb 0 means visible, 1 means invisible
base_size     equ base_vis + 0   ; scale factor for sprite 
base_x        equ base_size + 1  ; 2 bytes $xxxx 
base_x_screen equ base_x + 1     ; offset to screenwidth position information 
base_y        equ base_x + 2     ; 2 bytes $xxxx 
base_y_screen equ base_y + 1     ; offset to screenheight position information 
base_image    equ base_y + 2     ; 2 bytes $xxxx addr of image data 
base_len      equ base_image + 2 ; base_image + 2 ; length of a base sprite

I’ve left the base_vis and base_size equates pointing at the same offset as sometimes I’m looking at the byte as visibility and sometimes as scale, it makes my code more self documenting (readable).

More complex packing systems can be used, but performance is always a consideration, when accessing and manipulating the data. It all depends on what you are trying to do.

Test Overlay print

I decided to have  go at printing a test overlay, this is a multi step process and not a professional one.

1) print overlay design onto acetate sheet, using a colour inkjet and transparency sheet

2) Cut acetate sheet down, making sure that the overlay is smaller than an actual real overlay

3) Laminate the acetate sheet, this stiffens it up enough to stand up (could double laminate)

4) hand paint a white mask on the back (took a couple of coats of white primer)

first coat pretty thin everywhere


view from front


5 Tidy up mask edges with scalpel

Before tidying up it looked like this


back after tidy up (it’s not perfect but good enough for testing position of elements)


front after tidy up


Overlay on the Vectrex


Can now start tweaking lengths and positions.