At the end of last week I had a fatal lapse of concentration during a coding session. I have these from time to time, as I have trouble with short-term memory: when I’m working on a problem and get distracted, I can seriously lose the thread of my idea. Normally this is OK and I manage to get back onto my idea eventually (I sometimes have to re-discover the idea to do this). Unfortunately, this time it happened while I was in the middle of a major bit of refactoring of my code base. I had the code broken and far away from being rebuilt, and when I came back to it I couldn’t remember what I was in the middle of doing. I tried to reconstruct some of the code but got myself into a bit of a pickle.
The code was terminally naffing up the stack and I spent quite a bit of time trying to work out what I had done, but I never picked the idea thread up again. On Sunday I decided that it was probably gonna be more productive to junk practically all of the code, about 4KB worth, and take my core idea for refactoring forward from scratch.
I’ve spent some time on it over the last 3 days and now have a much better working codebase, and I’m in a much better position.
One of the “joys” of assembly programming in restricted environments is trying to make sure that your stack and your variables and data structures do not collide. Despite the way we always teach the abstract stack mechanism (as a series of locations with the index increasing as we add more items), stacks managed by the CPU grow downwards and are usually placed at the very top of user RAM.
As we add things to the stack, the stack moves nearer to our data. At some point the stack and our data may meet. This is most often fatal, as the stack contains return addresses from subroutine calls and interrupt service routines (ISRs).
If we overwrite a return address on the stack with some game data, the control unit will retrieve an incorrect return address when it executes an RTS instruction and happily continue execution from that point. This could be some random bit of code or even an area of data. The control unit on the CPU doesn’t know the context of memory (this is a Von Neumann machine, where data and instructions occupy the same memory and are stored in the same format), so it will decode whatever byte it fetches from the address in the PC and execute it. It’s best to make sure the stack doesn’t encroach on user data.
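To make the collision concrete, here is a small C model of it. It is only a sketch: the 16-byte RAM, the `DATA_END` boundary and all of the names are made up for illustration and have nothing to do with the real Vectrex memory map. The one detail taken from the 6809 is that a pushed return address sits in memory high byte first.

```c
#include <stdint.h>
#include <stdbool.h>

#define RAM_SIZE 16   /* hypothetical tiny RAM, for illustration only */
#define DATA_END 10   /* variables occupy ram[0] .. ram[DATA_END - 1] */

typedef struct {
    uint8_t  ram[RAM_SIZE];
    uint16_t sp;      /* index of the last byte pushed; starts one past the top of RAM */
} Machine;

void machine_init(Machine *m) {
    for (int i = 0; i < RAM_SIZE; i++) {
        m->ram[i] = 0;
    }
    m->sp = RAM_SIZE;
}

/* Push a 16-bit return address the way a subroutine call does: the stack
   pointer moves DOWN by two and the address is stored high byte first.
   Returns true if the push landed inside the data area (a collision). */
bool push_return_address(Machine *m, uint16_t addr) {
    if (m->sp < 2) {
        return true;  /* no RAM left at all; nothing is written in this model */
    }
    m->sp -= 2;
    m->ram[m->sp]     = (uint8_t)(addr >> 8);
    m->ram[m->sp + 1] = (uint8_t)(addr & 0xFF);
    return m->sp < DATA_END;
}

/* Pull a return address the way RTS does: whatever two bytes are at the
   stack pointer are used, with no idea whether they are still an address. */
uint16_t pull_return_address(Machine *m) {
    uint16_t addr = (uint16_t)((m->ram[m->sp] << 8) | m->ram[m->sp + 1]);
    m->sp += 2;
    return addr;
}
```

With these numbers, the fourth nested call pushes its return address into `ram[8]` and `ram[9]`, which belong to the data area. If the game then writes two bytes of its own there, `pull_return_address` hands back those bytes as if they were an address, which is the "happily continue execution" failure described above.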
This is exactly the problem I had today, so I’ve been hunting for some initial memory savings that don’t need another major refactor (although that may still have to happen as I add more stories for the stunt man).
We can reduce the depth of the stack by making more use of jumps and macros, but this is not always desirable if a subroutine is to be used many times in different contexts. One way around this, and a technique I fall back on when I’m using the speed of the stack for pushing data around on Z80 hardware, is to supply a return address in an index register and jump to the routine I want to execute, which ends with an indexed jump back. On the 6809 that is jmp ,x when X itself holds the return address, or jmp [,x] when X points at a location that holds it.
See my Pac-Man pixel scrolling code for an example of what you can do on that hardware.
Another thing you can do to save memory is to pack multiple data items into a single byte.
When we work in high-level languages, we usually use native word lengths (32 or 64 bit) for data items even if all we want is a boolean value. This is very wasteful, but completely acceptable most of the time as we have quite a large amount of RAM to play with on modern machines. We don’t have that luxury on old school hardware: Pac-Man has about 900 usable bytes of RAM and the Vectrex has about 1K of usable RAM.
Techniques that are easy and performant on a PC are usually impossible to implement on old hardware. That’s part of the fun of coding on this sort of cool restricted hardware.
Anyway, back to packing. I had part of my object data structure that looked like this:
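As a rough illustration of the difference, here is a C sketch comparing eight yes/no flags stored one per 32-bit word with the same eight flags packed into a single byte. The type and function names are hypothetical, picked just for this example.

```c
#include <stdint.h>
#include <stdbool.h>

/* Eight yes/no flags stored the "modern" way: one 32-bit word each. */
typedef struct {
    uint32_t flag[8];
} WordFlags;                  /* 8 * 4 = 32 bytes */

/* The same eight flags packed into one byte, one bit per flag. */
typedef uint8_t PackedFlags;  /* 1 byte */

/* Set or clear flag number `bit` (0..7) in a packed byte. */
void packed_set(PackedFlags *f, unsigned bit, bool on) {
    uint8_t mask = (uint8_t)(1u << bit);
    if (on) {
        *f |= mask;
    } else {
        *f &= (uint8_t)~mask;
    }
}

/* Read flag number `bit` (0..7) from a packed byte. */
bool packed_get(PackedFlags f, unsigned bit) {
    return ((f >> bit) & 1u) != 0;
}
```

On a machine with under 1K of RAM, 32 bytes against 1 byte for the same information is a serious saving. The price is the extra masking work every time a flag is read or changed.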
base_vis      equ 0              ; 1 byte - 1 means draw, 0 means don't draw
base_size     equ base_vis + 1   ; scale factor for sprite
base_x        equ base_size + 1  ; 2 bytes $xxxx
base_x_screen equ base_x + 1     ; offset to screenwidth position information
base_y        equ base_x + 2     ; 2 bytes $xxxx
base_y_screen equ base_y + 1     ; offset to screenheight position information
base_image    equ base_y + 2     ; 2 bytes $xxxx addr of image data
base_len      equ base_image + 2 ; length of a base sprite
In this structure I was using one byte for visibility and the following one for the scale of an object. Visibility only really requires a single bit to state whether an object is visible or not, so I looked to see if I could combine it with the scale byte. If I choose to keep the scale to 127 or less, that frees up the MSB (the sign bit) of the byte to hold visibility. The knock-on of this is that if I set the MSB, I can use it to indicate that the object should not be drawn. It doesn’t matter that reading the whole byte would add 128 to the scale, as we are not going to draw the object anyway. If I clear the MSB, that indicates that the object can be drawn, and the bonus is that a zero MSB has no effect on the 7 bits representing the scale. It’s a win-win.
Testing whether to skip drawing an object is easy with code like this, which just checks the sign flag set by the load:
    lda line_vis,x ; get visibility
    bmi draw_skip_not_visible
Making an object invisible or visible can be performed by using an OR mask and an AND mask respectively, as follows:
;========================================
; sets sprite at X to be not drawn
; sets msb
;========================================
NOT_VISIBLE_X macro
    lda spr_vis,x
    ora #$80
    sta spr_vis,x
    endm

;========================================
; sets sprite at X to be drawn
; clears msb
;========================================
VISIBLE_X macro
    lda spr_vis,x
    anda #$7f
    sta spr_vis,x
    endm
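For anyone who reads C more easily than 6809 assembly, the same flag-in-the-MSB idea can be modelled like this. It is a sketch rather than a translation of my actual code: the function and constant names are invented for the example, and the only things carried over are the $80 and $7f masks and the rule that a set MSB means "don't draw".

```c
#include <stdint.h>
#include <stdbool.h>

#define VIS_MASK   0x80u  /* msb: 1 = not drawn, 0 = drawn */
#define SCALE_MASK 0x7Fu  /* low 7 bits: scale, 0..127 */

/* Same effect as the ora #$80 in NOT_VISIBLE_X: set the msb, keep the scale. */
uint8_t set_not_visible(uint8_t b) {
    return (uint8_t)(b | VIS_MASK);
}

/* Same effect as the anda #$7f in VISIBLE_X: clear the msb, keep the scale. */
uint8_t set_visible(uint8_t b) {
    return (uint8_t)(b & SCALE_MASK);
}

/* Tests the same bit that bmi looks at after the load (the sign bit). */
bool is_hidden(uint8_t b) {
    return (b & VIS_MASK) != 0;
}

/* The scale with the visibility bit stripped off. */
uint8_t scale_of(uint8_t b) {
    return (uint8_t)(b & SCALE_MASK);
}
```

For example, a byte holding a scale of $40 stays $40 while the object is visible and becomes $C0 when it is hidden. `scale_of` returns $40 in both cases, so hiding and showing an object never loses its scale.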
Packing the flag into the scale byte saves a byte per object (in my current format), giving me a little breathing room and hopefully stopping the stack coming a knocking. Here’s the current format:
base_vis      equ 0              ; msb 0 means visible, 1 means invisible
base_size     equ base_vis + 0   ; scale factor for sprite
base_x        equ base_size + 1  ; 2 bytes $xxxx
base_x_screen equ base_x + 1     ; offset to screenwidth position information
base_y        equ base_x + 2     ; 2 bytes $xxxx
base_y_screen equ base_y + 1     ; offset to screenheight position information
base_image    equ base_y + 2     ; 2 bytes $xxxx addr of image data
base_len      equ base_image + 2 ; length of a base sprite
I’ve left the base_vis and base_size equates pointing at the same offset, as sometimes I’m looking at the byte as visibility and sometimes as scale. It makes my code more self-documenting (readable).
More complex packing systems can be used, but performance is always a consideration when accessing and manipulating the data. It all depends on what you are trying to do.
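As an example of what a more complex scheme might look like, here is a C sketch that squeezes three fields into one byte. The layout is entirely hypothetical and is not something my code uses: bit 7 is the hidden flag, bits 6 to 4 hold an animation frame (0..7) and bits 3 to 0 hold a scale step (0..15).

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical layout, for illustration only:
   bit 7     hidden flag (1 = not drawn)
   bits 6-4  animation frame, 0..7
   bits 3-0  scale step, 0..15 */
#define HIDDEN_MASK 0x80u
#define FRAME_MASK  0x70u
#define FRAME_SHIFT 4
#define STEP_MASK   0x0Fu

/* Build a packed byte; out-of-range frame and step values are truncated. */
uint8_t pack_object(bool hidden, uint8_t frame, uint8_t step) {
    return (uint8_t)((hidden ? HIDDEN_MASK : 0u)
                     | ((frame & 0x07u) << FRAME_SHIFT)
                     | (step & STEP_MASK));
}

/* Reading the frame needs a mask AND a shift. */
uint8_t frame_of(uint8_t b) {
    return (uint8_t)((b & FRAME_MASK) >> FRAME_SHIFT);
}

/* Reading the step only needs a mask. */
uint8_t step_of(uint8_t b) {
    return (uint8_t)(b & STEP_MASK);
}

/* Changing one field means clearing its bits and OR-ing in the new value. */
uint8_t with_frame(uint8_t b, uint8_t frame) {
    return (uint8_t)((b & (uint8_t)~FRAME_MASK)
                     | ((frame & 0x07u) << FRAME_SHIFT));
}
```

This is where the performance cost shows up. The 6809 shifts one bit per instruction, so the shift by four in `frame_of` is four shift instructions on top of the mask, every time the field is read. Whether that is worth the saved bytes depends on how often the field is touched inside the drawing loop.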