Compiling Lynx homebrew projects

As a follow-up to my post on compiling the sources for CC65 here are some instructions to compile homebrew projects without a graphical user interface or Integrated Development Environment (IDE) like Visual Studio or CodeBlocks. I will show you how to do some more setup of your Windows installation to be able to compile homebrew projects.

Environment variables

CC65 needs a couple of environment variables for all tools to work correctly. You can read more about this in part 2 of the Lynx programming tutorial, but for your convenience here is the lowdown:

CA65_INC: C:\Program Files\CC65\asminc
CC65_INC: C:\Program Files\CC65\include
CC65_HOME: C:\Program Files\CC65
LD65_CFG: C:\Program Files\CC65\cfg
LD65_LIB: C:\Program Files\CC65\lib
LD65_OBJ: C:\Program Files\CC65\obj

and your PATH variable should contain the MinGW msys bin path and the CC65 base directory. Part 2 contains some instructions to create the permanent environment variables.

@SET PATH=%CC65_HOME%\bin;C:\MinGW\msys\1.0\bin;%PATH%
@if exist %CC65_HOME%\wbin (
    @set PATH=%CC65_HOME%\wbin;%PATH%
@SET CA65_INC=%CC65_HOME%\asminc
@SET CC65_INC=%CC65_HOME%\include
@SET LD65_CFG=%CC65_HOME%\cfg
@SET LD65_LIB=%CC65_HOME%\lib
@SET LD65_OBJ=%CC65_HOME%\obj

Below is a batch script that can do the same for you. Examine the script, then after you have verified it is benign copy and paste it to a text file. Save the file as cc65vars.bat in your CC65 installation directory (probably C:\Program Files\CC65).

Then create a shortcut by right-clicking the cc65vars.bat file and choose Send To > Desktop (Create shortcut) from the context menu. Go to your desktop and right-click the shortcut. Choose Properties and edit the Target field. Enter this value:

%comspec% /k “%PROGRAMFILES%\cc65\cc65vars.bat”

Change the ‘Start in’ field to %PROGRAMFILES%\cc65\ and fill in ‘Open CC65 Tools Command Prompt’ in the Comment field. The cogwheel icon of the shortcut should change to the familiar black command prompt. Give the shortcut a fancy name like ‘CC65 Tools Command Prompt’.


Now double-click the shortcut and a command prompt should open that looks like this:


That’s it. All set to start compiling your sources.

An example: “Shaken, not stirred” by Karri Kaksonen

You can try your installation by compiling the sources of a homebrew game called “Shaken, not stirred”. As before, create a clone of Karri’s Git repository for Shaken by using this clone url: Use either Git for Windows tooling or Visual Studio (or whatever else you might have for that).

After cloning the Shaken repo, start the command prompt we created before and navigate to the root of the repo. Run make.exe and everything should run smooth when the setup is okay.


You might notice in the output that the final stage of zipping the cart.lnx file didn’t happen correctly. I’ll figure out where to get the zip.exe tool to have this succeed as well. Frankly, I do not bother zipping the files that are a maximum of 512 kB.

image  image

Feel free to contact me here or at the AtariAge Lynx Programming forum if you run into problems you cannot fix.

Have fun. Let me know of your particular details, tips or tricks so I can include them here and at the tutorial.

Posted in Tutorial | Leave a comment

Compiling CC65 suite and Lynx libraries

Here is a quick guide to reproduce the steps needed to compile the CC65 assembler, compiler and linker suite, plus the corresponding console specific libraries.


The first step is to compile the C++ sources to create the tooling in the CC65 suite. This consists of the assembler ca65.exe, compiler cc65.exe and the cl65.exe plus additional executables. You can find important documentation at the CC65 website.

We will start by creating a clone of the Github repository for the CC65 source code. The base of the CC65 repository can be found at The documentation, wiki and mailarchive is located there. We need the source code at You can use command-line tooling to create the clone, but I will show this using Visual Studio 2013 Community Edition that is available as a free download from the Microsoft website.

Start VS2013 and go to the Team Explorer tab. Click Connect at the top and choose Clone under Local Git Repositories.


Fill in the https clone URL: and initiate the clone by clicking the Clone button.

image image

The cc65 repository is shown with the red icon. Double-click it and the view changes to this particular repo. You can see the cc65.sln solution file. Open the solution by double-clicking it and watch while it opens.

image image

Once the solution is loaded you can build it by simply pressing Ctrl+Shift+B or choose Build Solution from the Build menu.


After that you will have a clean build of CC65 using the latest CC65 sources in the bin folder.


Lynx libraries

The next step is to compile the console specific .lib libraries that you need to create images (or roms) for your console. In my case, that is the Lynx, but the steps are similar for the others.

The sources for the libraries are located under cc65\libsrc. There are subdirectories there for each of the consoles. At the location there is also a Makefile. We need to have the make.exe tool to compile these sources. For that I used MinGW, because the Unix Utilities does not have a modern enough version of make.exe.

MinGW is a “minimalist development environment for native Windows applications”. It has a lot of . The automated installer is a very handy setup program for the command-line challenged like myself. Download and run the setup from here.

image imageimage

You can select whatever package you think you need, but at a minimum you want the msys-base package. That contains the make.exe tool we need.


Click the checkbox and choose ‘Mark for Installation’. Select the Installation menu and click Apply Changes.

image image

When all packages are applied succesfully, you can click Close and dismiss the dialog.


With a default installation location at C:\MinGW the make.exe tool will be under C:\MinGW\msys\1.0\bin. Open a Command Prompt windows at the libsrc folder. Add the bin folder to your PATH environment variable, so you can just type make.exe.

SET PATH=C:\MinGW\msys\1.0\bin;%PATH%

Run the following command from the prompt:


and magic should happen. It will compile all library sources to .lib files, including our lynx.lib. Depending on the speed of your computer this might take somewhere from 1 to 15 minutes to compile.

image image

Look in the cc65\lib folder and find the freshly compiled libraries all there, including lynx.lib.

Final thoughts

Why is this useful? With these instructions you can reproduce creating the tools you need to program Atari Lynx. But it also allows you the sources and recompile the CC65 library for your favorite console. For example, if you want to slim down the lynx.lib and leave out unneeded functionality to increase available memory, you need to modify the library and recompile it.

Posted in Tutorial | Leave a comment

Map of Tiny Lynx Adventure

This has been on my list for a long while: a map for Tiny Lynx Adventure. So, I finally wrote a little program to create a map of the world. It is a spoiler, but might come in handy should you get stuck in the game.

You can see the forest, river, white and black castle, plus the desert area. Some more spoiler details:

  • Order is flippers, white key, crown/star, black key, then bad guy
  • The items are always in the same areas.
    Flippers in forest, key to white castle on the shores of the river, crown and weapon in white castle, black key in desert and finally the bad guy in the black castle.
  • The spawn points of the items are chosen randomly from 8 possible locations per item. Memorize them all to speed up your searching.


In case you are interested, this is the meat of the program.

var rgbValues = new byte[bytes];
for (int index = 0; index < Width * Height; index++)
  int row = index >> 10;
  int column = (index % 128) >> 3;
  int screen = world_map[row, column];
  if (screen == 0xff) continue;
  int data = screen_data[screen, ((index % 8) >> 1) + 
(((index / (8 * 16)) % 8) * 4)]; rgbValues[index] = (
byte)(index % 2 == 0 ? data >> 4 : data & 0x0f); }

It uses the world_map 2-dimensional array (16×16) containing the screens (8×8 tiles) and the screen_data 2D array (92×32) that has the individual tiles per screen. screen_data holds 4-bit nibbles for each 16 tile types, packed together per two in a byte. They are ordered in a sequential fashion giving 32 bytes for each screen. There are 92 different screens altogether, some of which are reused. A screen with value 0xff means there is no world (the large open areas from the map).

If you are interested in more details, let me know.

Posted in Games, Homebrew | Leave a comment

Programming tutorial: Part 18–Files

Atari Lynx programming tutorial series:

In part 16 we investigated the way cartridges work at a reasonably low level and how the CC65 libraries help out by providing cartridge read functions from a start position and onwards. In this part we will take this one step further and introduce the file system that Epyx introduced and is still used in most cartridges. Also, we will see how such files are created and put into the ROM image. That will bring us right back to segments and memory from part 15.

All kinds of images

So far we haven’t really looked at how ROM images are organized. There are quite a few types of images available. Most of these images are related to the development kit that was used to create them.

Development kit File extension Description
Epyx .bin Uploadable file for Pinky or Howard
Epyx .rom ROM image with unencrypted header
BLL .o (or .obj) Uploadable file, not to be confused with the CC65 object files (also .o extension)
CC65 .lyx Headerless ROM image
CC65 .lnx .lyx file with Handy header

Two of these image types are more alike than you might expect. Even coming from different development kits they use the same directory and files structures. The Epyx .rom, the .lyx images (and therefore also the CC65 .lnx images) all have a directory that lists one or more files for the game to load from cart into RAM memory. The Epyx development kit introduced this as a (partially optional) means of partitioning content on a cartridge. You could choose not to use it, but you must have at least one file entry to make a bootable game on real Lynx hardware with the original encrypted headers. If you create your own boot loader in the encrypted header of a cartridge you could dispose file entries and the directory all together. As it turned out, the mechanism of having a directory structure proved to be really useful. Useful and relatively easy, as most segmented games have used it ever since. The functions to load files from the cartridge have been put to the test extensively and work like a charm.

Directory and files

The contents or binary image of a cartridge typically have at least three distinct areas:

  1. Encrypted header
  2. Directory structure
  3. Individual files containing code or data

Each of these areas consist of one or more items. Here’s a graphical view of a cartridge that you could find in an arbitrary, generic retail cartridge for the Lynx.


The three areas are marked in green, orange and grey. The header with its encrypted two stage loader is a requirement for games that need to run on actual hardware. There are variations to this theme (it does not have to be a two-stage loader per se), but the original retail releases all had them. The directory holds the file entries, which in essence are offset pointers to the “files” on the cartridge. These files follow the directory and consist of any number of up to 256 files. Typical for the Epyx two stage loader is that the first file holds (must hold) the title screen that is shown during the booting of the Lynx and game. The second one is the main executable. After that there can be any number of files holding code, data, music, graphics, whatever.

Here is a zoomed in picture of the directory and file entries pointing to the various files:


The file entries contain the information to find the start of the particular bytes on the cartridges, expressed in terms of cartridge page and offset in the page. Three bytes are used for that: one byte for the page number, as there are at most 256 pages in a cartridge, then two bytes for the offset, because the pagesize can be as much as 2048 (sometimes 4096) bytes.

The other things that a file entry has are the length of the file in number of bytes and the load address in memory. That’s another two bytes for the length (no use to have files longer than $10000 because that is the maximum RAM memory size, assuming you do not intend to stream from the cartridge) and two for the load address in the $FFFF address space. That’s 7 bytes, plus an 8th byte that is used as a reserved flag with value 0x00. It is not actually used in the Epyx style of file entries, although the BLL style file entries use it to mark it as an executable (0x88) or normal data file (0x00).

The structure of a file entry in bytes look like this (it’s the first file entry for APB in case you want to know):


You can see the offset, length and load address in LSB, MSB format. This particular file entry of APB is used to load the title screen from the cartridge at the first (zero) cartridge page with a 666 (0x29A) bytes offset. The title screen has a length of 3291 (0xCDB) bytes and is loaded at 0x8000 in RAM memory.

The next file entry in the directory is located right after the first and usually points to the file directly after the first file. Some calculations are in order. If the first entry points to page 0, 0x29A offset for a file 3291 bytes long, then it occupies bytes 666 to 666+3291-1 = 3956. The next file should be start at 3957 (0xF75). APB is a 256 KB cartridge, so each page is 1024 bytes. Taking the ROUND(3957 / 1024) = 3 means that it starts in the third page of the cartridge. The 3 pages take 3072 bytes, which means that location 3957 has an offset of 3957-3072 = 885 or 0x375 in hexadecimal. How did we do? Let’s check if we calculated that correctly. This is the next file entry of APB for the resident startup code:


Refer back to the previous picture and you should see that it indeed shows that the second file is located in page 3, 0x375. It is 0x1B4 bytes long and loaded at 0x0400 (which is typical as it is right behind the loader code from 0x200 to 0x3FF). You could repeat the process to calculate the next file location from there and keep continuing until you reach the end of the directory and the start location of the first file.

Note that it doesn’t always work out like this with files being contiguously located on the cartridge. You might decide that a file needs to start at a page boundary, meaning an offset of zero. This is much more efficient for loading, as you do not have to do dummy reads to advance the counter. The cartridge will end up with some unused space (call it fragmentation) caused by gaps from the end of a file to the start of the next page. It’s a trade-off between compactness/space and efficiency of loading files.

Loading files

All right, so far we know that each file entry is 8 bytes in size. You can find a particular file entry by skipping the 8-byte sets. For example, to read file number 3 (zero-based numbering) you would skip 3*8=24 bytes from the start of the directory, bypassing file number 0, 1 and 2, and reading the 8 byte tuple for the 4th entry. From the bytes the cartridge’s shift register for the page number is set, the dummy reads to advance the ripple counter are done and you start copying the bytes from every read to RCART0 to the destination RAM address until you have copied enough bytes for the length specified.

In CC65 all logic to do the preparations (shift register and ripple counter) have been taken and captured in the assembler code from lynx-cart.s. It holds the functions to skip bytes, read a single and a number of bytes, plus page (block) selection. To load individual files as specified inside a directory by a file entry you need the lynx_load function from load.s. This functions is available by using the include file lynx.h. I couldn’t imagine a scenario where you haven’t already included the file, so you should be good to go.

/*                     Accessing the cart                        */

void __fastcall__ lynx_load(int fileno);
/* Load a file into ram. The first entry is fileno=0. */

void __fastcall__ lynx_exec(int fileno);
/* Load a file into ram and execute it. */

The lynx_load functions is really simple to use: pass the file number and the file’s bytes will be read into RAM memory as specified in the file entry. Typically you will use define statements to give the symbolic, readable names to the file numbers. More on that later.

Laying out your memory

Before we can start creating a directory structure we need to know how CL65 as a linker will create an image and how our “files” are laid out in that big binary blob. You see, CL65 does not know nor care about files. It views your code and resources in terms of segments that are loaded into memory. You might want to refresh your knowledge of that by rereading the tutorial part on “Memory and Segments” before continuing.

Of the 64KB memory that the Lynx has a smaller number is available after the video buffers, C stack have been assigned.


If you have only a little code and data that will all fit in the remaining memory, you will probably not bother with files. But just imagine the challenges that arise when you have more code and data than fit into memory at one point in time. You will need to plan what will be in memory when. From that you will create a memory plan that shows the various memory areas that exist.

The special areas, such as zeropage, stacks and the videobuffers, are required for any Lynx game to function. This is regardless of the development kit you are using, except maybe for the C-stack. From here on these are not shown anymore, even though they might be located else in the memory space than in the initial pictures. Their location is fixed throughout the lifetime and state of the game and usually at the bottom and top of the 64K memory range, like shown in the picture.

Looking at an example

The example game we will use is a game that starts with an intro, consisting of a title screen and some music. Pressing A will start the game, where you will battle some nasty aliens, listening to different, exciting music. When you loose all your lives in the game, you will see a game over screen that plays some sad music. After this you will return to the intro screen and this loop starts over. The core of your program is always be in memory and in charge of loading and running the other parts of the program. There is also a resident part related to the C-runtime, because we are using C and CC65.

Perhaps a good way to design your memory areas is to start from the building blocks of your game. You pick your building blocks (intro, main game, outro, music, et cetera) and determine their required memory size. Next you need to place these blocks in the available memory space. If the sizes required by the blocks are small enough you can create a single memory area that accommodates all blocks at the same time. You would have the intro, outro, game and all three music pieces in memory simultaneously. Lucky you.


In this case, you create one big memory area that spans from $0200 to $BE38 and place all code and data in there. You could read it as one big blob from cartridge and execute it. If you have come this far in the tutorial series and have already created a game without having to use the lynx_load function, then you have done this already.

Here is the MEMORY definition from the lynxcart.cfg file that shows a single RAM section. The name for the area that holds the C-runtime and required startup code and data must be named RAM. We saw this before when we talked about memory and segments.

  ZP:    file = "", define = yes, start = $0000, size = $0100;
  HEADER:file = %O, start = $0000, size = $0040;
  BOOT:	  file = %O, start = $0200, size = __STARTOFDIRECTORY__;
  DIR:   file = %O, start = $0000, size = 5*8;
  RAM:   file = %O, define = yes, 
start = $0200, size = __VIDEO__ - __CSTACK__ - $0200;

However, if you find that it does not fit, you will have to time-share the memory space in a smart way. That means creating areas where the “occupants” are not in their assigned “appartment” at the same time.

It is like a puzzle: what should be together in memory when? The resident part will always be in memory and never be unloaded or replaced. The functional parts however will occasionally be in memory and not necessarily at the same time. That’s a good thing, because when the total size of these part exceeds the available memory it wouldn’t be possible anyway.

The following pictures shows possible layouts of memory at three points in time for our game that has an intro, actual game and a game over outro screen, all with different music to play:


Let’s look at the three layouts in a bit more detail:

  • Intro
    The top layout shows the intro in memory (yellow), next to the intro music and the core part plus some additional resident stuff (all blue) that the C runtime needs.
  • Main game
    During gameplay the intro is replaced with the game code and resources. There will be different music, but the area in memory is the same. It’s just that other music will be loaded.
  • Game over
    Essentially the same as the intro, except with other code and resources, and different music.

For now the important parts are those that vary, and they are represented as the yellow and dark blue areas. The resident block is not free to use to your liking, but reserved. Notice how the amount of code and data loaded into the area before music differs. The intro requires less space, but the main part of the game uses the full amount of memory available in that area.

The core is responsible for loading the intro, game and outro. It should also load the correct music data. Because the music will always be playing regardless of the state the game is in, the code to play the music must always be present. That means that either the core or music area should hold the code to play the music. Placing it in the core is more efficient (placing it in each music area means it has to be loaded each time and must be present on the cartridge three times as well). The yellow areas are self-supporting parts of the game that have all code and data inside their respective memory blocks.

When designing your memory layout you need to be aware that there is a difference in the area of memory and the code and data that is loaded there at a particular point in time. Looking at our scenario you might be tempted to think that there is 1 memory area for the yellow blocks, from $0200 to around $7000, which we will name YELLOW for the sake of argument. But, in actuality there are three memory areas and YELLOW is not relevant. The memory areas are INTRO, MAINGAME and OUTRO and they just happen to be positioned on top of each other.

You could create a flexible layout that gives the music as much space as is available after the yellow area. That’s a valid option in which case INTRO would be $0200-$5000, MAINGAME from $0200-$7000 and OUTRO from $0200-$6000. Instead our approach chose to keep the memory sizes fixed from $0200-$7000 across the three game states, because the layout does not require the empty space left by INTRO and OUTRO. The same approach is taken for MUSIC. It is actually three areas (MUSIC1, MUSIC2 and MUSIC3) when the music loaded at each time is different. They can be the same size even though the exact size of music pieces 1, 2 and 3 are unlikely to be exactly the same.


We will compute the yellow blocks from the start of available memory $0200 and place the music right after it. The required RAM area needs to be located at the end of available memory right before the C-stack. Therefore, it is calculated from $BE38 backwards.

We will need some symbols to make things readble, flexible and maintainable. The bolded items are of interest for now. They indicate the sizes of our various blocks.

  __STACKSIZE__: type = weak, value = $0800;
  __STARTOFDIRECTORY__: type = weak, value = $00CB;
  __BLOCKSIZE__: type = weak, value = 2048;
  __EXEHDR__:    type = import;
  __BOOTLDR__:   type = import;
  __RESIDENT__: type = weak, value = $1000;
  __GAMESIZE__: type = weak, value = $3000;
  __MUSICSIZE__: type = weak, value = $1000;
  __VIDEO__: type = weak, value = $BE38;

With these symbols we can calculate the memory requirements for each of our blocks. When you create a MEMORY area

  ZP:      file = "", define = yes, start = $0000, size = $0100;
  HEADER:  file = %O, start = $0000, size = $0040;
  BOOT:	    file = %O, start = $0200, size = __STARTOFDIRECTORY__;
  DIR:     file = %O, start = $0000, size = 5*8;
  RAM:     file = %O, define = yes, size = __RESIDENT__,
start = __VIDEO__ - __STACKSIZE__ – __RESIDENT__; INTRO: file = %O, define = yes, start = $0200, size = __GAMESIZE__; MAIN: file = %O, define = yes, start = $0200, size = __GAMESIZE__; GAMEOVER:file = %O, define = yes, start = $0200, size = __GAMESIZE__; MUSIC1: file = %O, define = yes, start = $0200 + __GAMESIZE__,
size = __MUSICSIZE__; MUSIC2: file = %O, define = yes, start = $0200 + __GAMESIZE__,

size = __MUSICSIZE__;
MUSIC3: file = %O, define = yes, start = $0200 + __GAMESIZE__,
size = __MUSICSIZE__;


From segments to memory to files

After laying out the memory requirements and areas we essentially have a few buckets where we need to put our code and data in. At this point we return to where we left of talking about memory and segments. By assigning code and data to the various segments, we can fill the buckets with all types of segments CODE, DATA and RODATA. BSS segments are never placed on in binary files, because it is all uninitialized memory anyway and nothing more than a memory range.

The first part should look familiar:

  EXEHDR:   load = HEADER, type = ro;
  BOOTLDR:  load = BOOT,   type = ro;
  DIRECTORY:load = DIR,    type = ro;
  STARTUP:  load = RAM,    type = ro,  define = yes;
  LOWCODE:  load = RAM,    type = ro,                optional = yes;
  INIT:     load = RAM,    type = ro,  define = yes, optional = yes;
  CODE:     load = RAM,    type = ro,  define = yes;
  RODATA:   load = RAM,    type = ro,  define = yes;
  DATA:     load = RAM,    type = rw,  define = yes;
  BSS:      load = RAM,    type = bss, define = yes;
  ZEROPAGE: load = ZP,     type = zp;
  EXTZP:    load = ZP,     type = zp,                optional = yes;
  APPZP:    load = ZP,     type = zp,                optional = yes;

It is the usual definition of the required segments that go into the standard memory areas. For the game we’ve been discussing some additional segments are required.

  # Intro
  INTRO_CODE: load = INTRO, type = ro, define = yes;
  INTRO_RODATA: load = INTRO, type = ro, define = yes;
  INTRO_DATA: load = INTRO, type = rw, define = yes;
  INTRO_BSS: load = INTRO, type = bss, optional = yes;
  # Outtro
  OUTRO_CODE: load = OUTRO, type = ro, define = yes;
  OUTRO_RODATA: load = OUTRO, type = ro, define = yes;
  OUTRO_DATA: load = OUTRO, type = rw, define = yes;
  OUTRO_BSS: load = OUTRO, type = bss, optional = yes;
  # Main game 
MAIN_CODE: load = MAIN, type = ro, define = yes;
MAIN_RODATA: load = MAIN, type = ro, define = yes;
MAIN_DATA: load = MAIN, type = rw, define = yes;
MAIN_BSS: load = MAIN, type = bss, optional = yes;
# Music MUSIC1_RODATA: load = MUSIC1, type = ro, define = yes; MUSIC2_RODATA: load = MUSIC2, type = ro, define = yes; MUSIC3_RODATA: load = MUSIC3, type = ro, define = yes; }

It is actually not that special. All code and data with the INTRO_ prefixed segments are assigned to go into the INTRO memory area. The same is true for the MAIN_ and OUTRO_ segments. You might notice how the MUSIC1, 2 and 3 segments only have a RODATA segment defined, as the music itself is just read-only data, not changing data, nor code.

You can safely assume that the linker will place the segments in a memory area in a certain order. How it is laid out is not really relevant. As long as you make sure you have loaded the code and data into the area before you start using it by calling the code or referencing the data.

What is more important is that the linker will create the binary file that holds all code and data in the order that the memory areas have been defined in the MEMORY section. The linker will emit the binary image with “files” for each of the areas that have a file=%O in their definition.

HEADER: file = %O, start = $0000, size = $0040;

This attribute will make the linker emit the contents of that memory area in the final file. You can see how it is referring to %O for each area, effectively appending the contents to the same, single output file. This file will be called whatever you have set as your target in the make file.

$target = tutorial-files.lnx
objects = lynx-160-102-16.o lynx-stdjoy.o \
$(target) : $(objects)
	$(CL) -t $(SYS) -o $@ $(objects) lynx.lib

The bolded item show how the $target is passed into the linker statement that will look like this when expanded:

cl65.exe –t LYNX –o tutorial-files.lnx lynx-160-102-16.o lynx-stdjoy.o tutorial.o lynx.lib

Long story short: the target is the file that is called %O in the memory area. In it all memory areas are written in the order in which they are declared. The order of the segments per area is not relevant.

Entries in a directory

The final thing that keeps us from having a fully functional file system is a directory. The purpose of the directory is to list the “files” and their respective positions within the binary image. In the end, the virtual file system of an Atari Lynx cartridge is nothing more than make believe. When we are able to compute the file entries like we saw at the beginning of this part, we are good to go.

The directory.asm file builds no code, just data representing the file entries. Here is the skeleton of the file, where details have been omitted for now.

.include ""

; More imports
.segment	"DIRECTORY"
; File entries go here

The directory is created in a segment called DIRECTORY. It will hold the 8 byte entries that indicate where the files are located in the binary image on the cartridge.

You can see how start and end address symbols (__DIRECTORY_START__  and __DIRECTORY_END__) are declared these are used to compute the length of the directory itself. The first entry is our RAM area where the C-runtime and other resident code and data are located.


; Entry 0 - Resident executable (RAM)
	.byte	<blocka
	.word	off0 & (__BLOCKSIZE__ - 1)
	.byte	$88
	.word	__RAM_START__
	.word	len0

First, the offset of the RAM area off0 is computed, by taking the start location of the directory and adding the length of the directory to it. The page number is computed from the blocks already taken up. Next the length of the current area is calculated by adding the startup and initialization segment sizes, plus the code, data and read-only data segments. All of these were defined to go into the RAM area. Finally the file entry is added as raw bytes with the .byte and .word statements. Take a look at the picture above to see that it creates the right data. Normally you do not need to change this first entry at all. It is special in that it has multiple parts and requires the total size to be computed from them.

The next file entries are more of the same. The help in creating them easily a macro is included in the default directory.asm file.

.macro entry old_off, old_len, new_off, new_block, new_len, new_size, new_addr
	.byte	<new_block
	.word	(new_off & (__BLOCKSIZE__ - 1))
	.byte	$88
	.word	new_addr
	.word	new_len

You use the macro entry by feeding it the offset (which has the page number and offset) and length of the previous (old) area, plus the size and load address of the current (new) area.


This is what a call to the macro looks like

entry off0, len0, off1, block1, len1,__INTRO_CODE_SIZE__+

The variables off0, len0 are the offset and length from the RAM area. off1, block1 and len1 are variables that are passed without meaning values. They will get a value after the macro has executed. Two of these values (new_off and new_len) will be used to feed into another macro call to create the next entry.


You will also add the new size and address. This repeats over and over again until all entries have been created.

entry off1, len1, off2, block2, len2, __OUTRO_CODE_SIZE__+

One thing we didn’t cover so far is where all these double underscore pre- and postfixed values come from. They are imported at the top of the file and come from values that the linker has emitted. Remember how the linker would create _SIZE__ and _LOAD__ postfixed values for each MEMORY and SEGMENTS declared area and segment that has the define=yes attribute set. From these values you can calculate the file entries, by simply adding the sizes to the old offsets. It’s like creating a long line of the memory areas, one after the other.

Typically you will use the _LOAD__ of the first segment in an area. The order of the segments is not really relevant, except that you need to take the first one listed per area. Like the areas themselves, the segments are laid out sequentially in a particular area. This implies that the first segment is located at the beginning. The various sizes you add together come from all the segments that are located in the area. There could be more than just one of each CODE, DATA and RODATA. It all depends on how you assigned segments to areas.

You need to adjust the lynxcart.cfg file to accommodate the space required in the directory. This is as simple as specifying the number of file entries in the multiplication (6 file entries in the example below):

  DIR:    file = %O,               start = $0000, size = 6*8;

After that you are good to go.

Putting it all together

Let’s do a quick recap of what is needed to create a files and corresponding directory.

  1. Create your memory areas based on the sizes they will need.
  2. Make a layout of the areas in memory and moments in time.
  3. Determine the start locations and finalize the definitions of the areas in the lynxcart.cfg
  4. Create segments that correspond to the areas and list them in lynxcart.cfg
  5. Write code and make resources and put these into the correct segments
  6. Create a directory entry that lists all memory areas, starting with the RAM area.
  7. In each part of your code, be mindful of the expectations for segments that need to be loaded. Call lynx_load before accessing functions, variables or resources in a segment.

To help out with the loading, do yourself a favor and define symbolic names to represent the file numbers. The file start at number 0, which is the RAM area and resident code. It is unlikely that you will ever have to load this yourself. After that the files

#define FILE_INTRO 1
#define FILE_OUTRO 2

Here is a typical piece of code that shows how to use the lynx_load and file numbers.


In the code fragment, which could be inside your void main() static entry point routine, the show_intro function is located in the INTRO area. It needs to be loaded before it can be called. Hence the call to lynx_load, passing in the FILE_INTRO symbol. Having the names and number of files decoupled will be very useful when you need to reorder files in the binary image. You can change the file numbers in one place and will not have to hunt your code to check where you used the particular file number that has changed.

At this point it is worth mentioning that the first file (RAM) is only there if you use the mini-bootloader. That loader does not require a startup sprite. For games that use the Epyx bootloader, you would have seen a first file pointing to sprite data, and the second file to the resident RAM file.


You might be lucky and get it all to work the first time. If you did not manage to do so, or want some more internal look at what has been done and generated, you need some more information. That’s where the map file comes in handy. The map file shows a lot of things for your program/game, including the segment locations and what is located where.

To generate a map file an additional argument is needed in the call to the CL65.exe linker.

$(CL) -t $(SYS) -m -C lynxcart.cfg -o $@ $(objects) lynx.lib

By adding –m and passing a filename the linker will emit a map file ( in this case) that holds valuable info. Here is an excerpt:

Segment list:
Name                   Start     End    Size  Align
DIRECTORY             000000  000027  000028  00001
EXEHDR                000000  00003F  000040  00001
ZEROPAGE              000000  000019  00001A  00001
EXTZP                 00001A  000032  000019  00001
BOOTLDR               000200  0002CA  0000CB  00001
INTRO_CODE            000200  000246  000047  00001
OUTRO_CODE            000200  00023D  00003E  00001
OUTRO_RODATA          00023E  00025C  00001F  00001
INTRO_RODATA          000247  000263  00001D  00001
INTRO_BSS             000264  000264  000001  00001
STARTUP               003200  00327C  00007D  00001
INIT                  00327D  0032AB  00002F  00001
CODE                  0032AC  0046E0  001435  00001
RODATA                0046E1  004818  000138  00001
DATA                  004819  00491A  000102  00001
BSS                   00491B  004A32  000118  00001

The segment list above shows the start and end addresses, plus the sizes of the segments that are created. It is not the complete list, but you can notice how the OUTRO_BSS is missing. Apparently nothing was created in the BSS segment for the OUTRO and there was no need to emit it.

Exports list:
_FileBlockByte         00002F RLZ    _FileBlockOffset       000027 RLZ
_FileCurrBlock         00002E RLZ    _FileDestAddr          00002A RLZ
_FileDestPtr           000031 RLZ    _FileEntry             000026 RLZ
_FileFileLen           00002C RLZ    _FileStartBlock        000026 RLZ
__BLOCKSIZE__          000800 REA    __BOOTLDR__            000001 REA
__BSS_RUN__            00491B RLA    __BSS_SIZE__           000118 REA
__CODE_SIZE__          001435 REA    __CONSTRUCTOR_COUNT__  000000 REA
__CONSTRUCTOR_TABLE__  0032AC RLA    __DATA_SIZE__          000102 REA
__EXEHDR__             000001 REA    __INIT_SIZE__          00002F REA
__INTRO_CODE_LOAD__    000200 RLA    __INTRO_CODE_SIZE__    000047 REA
__INTRO_DATA_SIZE__    000000 REA    __INTRO_RODATA_SIZE__  00001D REA
__OUTRO_CODE_LOAD__    000200 RLA    __OUTRO_CODE_SIZE__    00003E REA
__OUTRO_DATA_SIZE__    000000 REA    __OUTRO_RODATA_SIZE__  00001F REA
__RAM_SIZE__           008638 REA    __RAM_START__          003200 RLA
__RODATA_SIZE__        000138 REA    __STACKSIZE__          000800 REA

The bolded items will look familiar by now. Inspecting these values help find any overflow errors that the linker might report, or troubleshoot directory issues. A detailed look at how to track these errors is for another time.

Next time

We’ve looked at how to design your memory and create segments, files and the entries for your directory. With this you can start building beyond the 64KB limit that you otherwise have. Next time we will look at encryption of headers, or maybe input. Who knows. Till next time.

Posted in Tutorial | Leave a comment

Programming tutorial: Part 17–Interrupts

Atari Lynx programming tutorial series:

In part 13 we covered UART and serial communication. Then in part 14 we had a look at the timers inside of the Lynx. Both parts referred to interrupts as an important bit of functionality. Now it is time to dive deeper into interrupts and use their power to take your Lynx games to the next level.

Before we dig in deep into the Atari Lynx’s interrupts, you should have a good understanding of how interrupts function at the processor level. This will be a detailed overview, one that holds true for any 6502 processor, not just Mikey.

Backgrounder on 6502 family processors’ interrupts

There is an excellent write-up on 6502 interrupts by Gareth Wilson over at I suggest you read this when you want to know the nitty gritty details. I will provide a higher level overview of what is important. Gareth’s article fills in the gaps and the deeper details.

During normal operation the 6502 executes instructions by evaluating the opcodes at the current program counter (PC). It fetches the instruction located there and spends a couple of processor cycles performing the work. The PC is updated to point to the next instruction. Usually this is the next instruction in memory, but it can be somewhere else in case of branching (e.g. BEQ, BMI) or jumping (JMP, JSR). This mode of operation simply follows the flow of your code.

Getting interrupted

Normal operation can be interrupted by special events that occur. In most cases this is the hardware telling the processor that something important has happened, such as input that is available (keyboard, serial IO), or a timer that has expired and wants you to give you a chance to handle that. These special events are appropriately called interrupts and they trigger a specific sequence of action by the processor. The 6502 has two kinds of interrupts:

  1. IRQ: Interrupt ReQuests, normal interrupts
  2. NMI: Non-Maskable Interrupts, more important interrupts than the IRQ interrupts.

Both IRQ and NMI are essentially the same interrupts, except for an important distinction: normal interrupts can be “ignored” (also called masked), while NMI interrupts cannot be masked. This means that you can specify you do not want the IRQ interrupts to actually interrupt you, for example if you are in a critical piece of code execution, while NMI can never be suppressed.

The processor has an interrupt pin for IRQ and NMI that are (optionally) connected to the hardware that can signal an interrupt. The most critical hardware will use the NMI pin, while other hardware uses the IRQ pin.

Both the IRQ and NMI signal are high by default and will trigger when it goes low. These lines can be edged or level sensitive. Usually IRQ lines are level sensitive and will keep firing as long as it stays low. The NMI line on the other hand is edge sensitive in most cases and only triggers on a falling edge to avoid having it triggered over and over again. That would be bad since these cannot be ignored, so it will cause havoc.

The 6502 interrupt sequence

Whenever an interrupt occurs, be it an IRQ or NMI, the 6502 executes a sequence to stop the current execution of code, and render control for the handling of the interrupt to an interrupt service routine (ISR). The ISR location is determined from what is called a vector. Essentially a vector is a memory location where the processor can find the jump address for the reset, IRQ or NMI routine. There are three vectors in the 6502:

Vector Description Address
NMI Vector to NMI ISR $FFFA (low byte) and $FFFB (high byte)
Reset Vector to address of reset routine $FFFC (low byte) and $FFFD (high byte)
IRQ Vector to IRQ interrupt service routine $FFFE (low byte) and $FFFF (high byte)

The picture below (from interrupt tutorial at shows what happens at the processor level per clock cycle.


The current instruction that was executing is finished. As soon as that is done, the current program counter (address of next instruction) is pushed onto the stack. First the high byte, then the low byte. Next, the status register (processor status or PS) is also pushed onto the stack. Finally the IRQ or NMI vector is fetched from their respective addresses and the processor will continue execution at the vector addresses.

You can think of the interrupt and the vectors as JSR to the addresses specified at $FFFE and $FFFA. Something like JSR ($FFFE) for IRQ and JSR ($FFFA) for NMI. It is not exactly the same, because the PS is also placed onto the stack and the exact PC value is somewhat different, plus you return from a JSR with an RTS, but with a RTI (ReTurn from Interrupt) for a IRQ or NMI. Other than that the two are comparable to a certain extent.

An “Hello World” Interrupt Service Routine

A really simple interrupt service routine might look like this:

F000  INC $1337
F003  RTI

It could have been as simple as RTI, but that would have been essentially a stubbed out handler to would return as soon as it is called without actually doing anything. Useful only when you want to have an empty ISR. Instead the example above shows a counter at address $1337 is increased every time that the ISR is executed.

You need to put the address of the ISR ($F000 in this case) in the IRQ interrupt vector during startup, or an interrupt will jump off into unknown byteland. The wiring-up boils down to a bit of code like this:

LDA #$00
STA $FFFE ; or STZ $FFFE for short
LDA #$F0
STZ $1337 ; Initialize the data register

which puts the bytes $00 and $F0 at the low and high byte of the IRQ vector at $FFFE and $FFFF respectively. We will talk about the SEI and CLI instruction in a moment.

Writing an interruptor in CC65

The CC65 compiler allows you to create an interrupt service routine through assembler code. There is some special syntax required to wire your code to be called when an interrupt fires. Here is a sample that implements the simple handler

.interruptor _handler
.proc   _handler: near
.segment "CODE"
	inc $1337

But wait, what’s this? There is no RTI at the end and a mysterious CLC instruction. The reason is that interruptor handlers are wired together by the CA65 assembler. Each interruptor address is stored in an table. Whenever an IRQ occurs every handler in the table is called in order of priority. Each handler can indicate whether the other handlers still need to be called. It is the carry flag that conveys this intention. When the carry flag is cleared, the calling of other handlers should continue. If set, the handler tells the runtime that it has hcompletely handled and cleared the interrupts, and calling the others is not needed anymore.

A priority is specified as follows:

.interruptor _vbl, 15

The number indicates the priority. A higher value gives the handler a higher priority. The default value is 7. You can read some more at the CC65 documentation wiki on the .interruptor control command.

There is always a VBL handler created if you use the TGI library. TGI uses a handler to perform a swap of the video buffers at the right time, so no screen tearing occurs. Screen tearing would happen when the swap is performed midway during the drawing of the current buffer. The VBL interrupt is an excellent moment to do it, hence the choice for a VBL handler.

Here are a couple of strategies for building your handlers:

  • Strategy 1: Create a big handler that checks for each and every interrupt source. This would keep the handler table small and give a single point to have your own interrupt handling logic.
  • Strategy 2: A handler per interrupt type. It will give small and concise handlers that are easy to maintain. It implies a little more overhead of multiple jumps, but it is disputable if that is noticeable or significant.
  • Strategy 3: Override the TGI handler by your own. Given that you would need to recompile the TGI library to alter its VBL handler, you could specify your handler with a higher priority and return with a SEC call before the RTS.

Don’t interrupt me

There are occassions when you do not want interrupts to occurs. These are some typical moments when it is inconvenient to be disturbed:

  1. An ISR is already executing
    Once an ISR is executing it can be impractical to have a new IRQ come in and trigger a new ISR from the current ISR. That would make it a bit like the movie Inception, where dreams occur within dreams within dreams within dreams… You get the picture.
  2. You are manipulating the vector address values
    When you have changed either the low or high byte but not the other, there is a very brief moment where the vector address is invalid. Should an interrupt request come in at that particular time, it will probably lead to unwanted and unexpected behavior.
  3. Bootstrapping or initialization code is running
    At this time things may not have been properly setup for the program and data registers to start executing ISR code.

The 6502 processor has a bit flag in the processor status called I (for Interrupt Disabled) that determines whether an IRQ is acknowledged or not. NMI interrupt requests are unaffected by the bit, because they are unmaskable and cannot be suppressed or ignored.

When the I bit is set, no IRQ requests are responded to. You can influence the bit with two instructions:

SEI    ; Set Interrupt Disable flag: masks IRQ interrupts
CLI    ; Clear Interrupt Disable flag: listen to IRQs again.

By default new IRQ interrupts are ignored during an ISR. So, you do not have to call SEI at the start of your ISR code, be it IRQ or NMI triggered.

Remember that NMI interrupts are always acknowledged, whether the I flag is set or not. However, when an NMI or IRQ interrupt service routine is executing, you can choose to call CLI and let new IRQ requests come through, should they occur.

You might want to clear the source of the interrupt before calling CLI to accept new IRQs or RTI to return from an interrupt. Since the IRQ line is level-sensitive it is important to note that when the level is still low, a new IRQ will immediately fire. Luckily, the Lynx has edge-sensitive IRQs, so you don’t have to take that into account, except for UART interrupts as these are level sensitive. We will talk about this later.

Interrupt sources in the Lynx

The Lynx has 8 distinct hardware sources that trigger input. The hardware is always a timer. And, since the Lynx has 8 timers, an interrupt can come from each (and all) of those sources.

A quick recap of the timers that the Mikey holds:

Timer # Description Relation to interrupt
0 Horizontal blank (HBLANK or HBL). Fires when the end of a “scanline” has been reached.
1 General purpose timer 1  
2 Vertical blank (VBLANK or VBL) Fires interrupt after all lines on a screen have been drawn. Useful for doing work that is screen critical (such as the moment of swapping screen buffers).
3 General purpose timer 3  
4 UART RX or TX related Doesn’t fire at timer expiration, but rather at the moment when data has arrived in receive buffer or when transmit buffer is empty.
5 General purpose timer 5  
6 General purpose timer 6  
7 General purpose timer 7  

Each of these timers have an Enable Interrupt bit in their static control register A. Only when this bit is 1 (enabled) will the interrup fire at the moment of timer expiration. In code this would look something like this:

MIKEY.timer1.control = ENABLE_INTERRUPT | 0x1E;
MIKEY.timer1.reload = 255;
MIKEY.timer1.count = 255;

The one exception here is the UART timer #4. This timer’s interrupt does not fire at the timer expiration. The timer’s purpose is to generate the baud rate for UART and it will expire at a steady pace to transfer the single bits of data, plus some extra such as the start, stop and parity bit. Lots of expirations that do not really matter. The relevant moment to fire an interrupt for UART is when data has arrived in the receive buffer, or when there is no more data to be sent (if the transmit buffer runs empty). For that, you need to set the TX and RX Interrupt Enable flags (TXINTEN and RXINTEN) to 1 for enabled.


This enables both receive and transmit interrupts, besides the normal settings for enabling even parity while resetting any errors and switching the UART to open collector.

In summary, the timers will generate IRQs when they are configured to do so. The timers will always run, no matter what type of code is executing. Once expired they will generate an interrupt, but this will only cause the call of the ISR through the IRQ vector when the I flag of the processor status register is not set.

Inspecting the sources

When an IRQ occurs it is often necessary to determine the source of the interrupt. It could be any one of the 8 timer sources or a combination of them. Each of the timers has a interrupt flag associated with it. Each and every interrupt flag that is set will cause the IRQ signal to be low and raises an interrupt.

The 8 bits of the interrupts flag would fit nicely into a byte, right? Mikey has two special interrupt related hardware registers for that very byte. These are INTRST ($FD80) and INTSET ($FD81).


Their purpose is to allow you to expect and manipulate the sources of the interrupts by looking at the bytes of the value located in each of them. They both hold the same set of bits when you read from either address. The value for INTSET or INTRST has the bits from the interrupt flags in this order: timer 0 at bit 0 up to timer 7 at bit 7. Writing to INTSET and INTRST is a totally different thing.


The INTRST will set interrupt flags to zero when written to. It will set the flags for the bits that are present in the (mask) value you write. It leaves the other bits unaffected.


INTSET will push the values written into it to the interrupt flags. It provides an easy way to reset them all by writing a zero to it (just like writing $FF to INTRST would). On the other hand writing a non-zero value will cause an interrupt flag (or flags) to be set, effectively causing an IRQ indirectly.

The best practice is to read from the INTSET at the beginning of your ISR code. It will get you the bits for the expired timers and serial interrupt. After you have nearly finished your ISR you can write the value from INTSET to INTRST causing those interrupts to be reset. If a new interrupt occurred during the execution of your ISR, the respective bit or bits are unaffected. When the ISR returns to normal code, there is still a bit set in the interrupt flags and a new IRQ will occur. That is probably intented, because you missed a new interrupt and want to handle that as well.

Reading INTSET and writing that to INTRST is usually a good approach.

However, the UART triggered interrupts are level-sensitive, so they will keep triggering unless you clear the source explicitly. Here is an abstract from the Epyx development kit’s documentation on UART and ComLynx:

7. Unusual interrupt condition.
Well, we did screw something up after all. Both the transmit and receive interrupts are ‘level’ sensitive, rather than ‘edge’ sensitive. This means that an interrupt will be continuously generated as long as it is enabled and its UART buffer is ready. As a result, the software must disable the interrupt prior to clearing it.

Another example: HBL and VBL interrupts

A more complete example is one where we do some effects based on horizontal and vertical blank (HBL and VBL) interrupts. The goal is to change the color of the black pixel each scan line, which creates the banded effect on screen. It is as simple as increasing the red value at $FDB0 (BLUERED0). The difficult part is that this has to be done for every scanline.

By now we know that the HBL occurs when timer 0 expires. It has its interrupt enabled by the boot rom initialization. That part is covered. This is the interruptor we need to include in our code:

.interruptor _hbl
.include ""
.export _hblcount
	.byte   $00
.proc   _hbl: near
.segment "CODE"
	beq done  inc RBCOLMAP+0
	inc _hblcount

The bolded statements are of most interest. Taking it from the top, an interruptor is declared to point to the handler routine called _hbl. There is also an exported variable called _hblcount that serves as a counter for the total number of HBL interrupts. The CODE segment loads the interrupt flags from INTSET and checks whether the flag for timer 0 (the HBL timer). If so, this handler is called for an HBL IRQ and it can continue by increasing the red value (note that it will also increase the blue value every 16th HBL) and the HBL counter. If not, the two increase operations are skipped. Finally, we clear the carry flag to indicate that other handlers should still execute.


The other handler for vertical blanks (VBL) interrupts is fairly similar. It does a check for timer 2 instead of 2 and will

  • Increase a frame counter variable
  • Reset the Red/Blue value to zero, so we always start the new frame with the same value
  • Store the last HBL count, so we can see how many HBL interrupts fire per frame.
.interruptor _vbl
.include ""
.export _framecount
.export _lasthblcount
.import _hblcount
	.byte  $00
	.byte	$00
.proc   _vbl: near
.segment "CODE"
	and #TIMER2_INTERRUPT  ; Check for VBL timer
	beq done
	inc _framecount   ; 
	stz RBCOLMAP+0    ; Reset Red/Blue value to create steady image
	lda _hblcount
	sta _lasthblcount
	stz _hblcount

When you look at the screenshot above you can see a couple of remarkable things. First, there are 105 horizontal blanks. This is in accordance with the documented 3 scanlines of blank time every frame. Plus, you can see that the HBL for the 3 invisible lines are right after a VBL interrupt. The first band is 3 pixels smaller than the others, which can only happen if the HBL counter was already running 3 lines before the first visible line is drawn. This observation was already shared by TailChao at the AtariAge forum.

Next time

This time we dove into interrupts for the 6502 processors and looked at some Lynx console specific details.

In the meantime, some additional reading on interrupts in the Lynx is available in the Epyx documentation:

Posted in Tutorial | Leave a comment

Epyx Development Kit: part 2–Pinky and Mandy

Working with Pinky and Mandy

Let’s skip a lot of things you need to do to create your first Lynx binary, be it a game or another type of program and pick up at the point where you want to run your program on real hardware. In the early days of Lynx development there was no emulator, so you could only see and test your code running on an actual device. Nowadays it is trivial to create an encrypted ROM, put it on a FlashCard and run it on a normal Lynx. In 1989 Lynx developers only had either Pinky/Mandy or Howard/Howdy. The focus is on Pinky and Mandy although most holds true for Howard and Howdy.

Pinky can function in one of two ways. It can be a FlashCard of sorts, where you can upload your code in the device and have it act as a cartridge to Mandy. Additionally, it can be a passthrough communication device that facilitates in a live (remote) debugging session between the Amiga computer and the Mandy console. A set of jumper switches allowed you to change the mode of Pinky and configure its memory size and use of the EPROM with the Pinky bootloader.


For a regular debug session with Pinky and Mandy you would connect Pinky to the Amiga with parallel port and Mandy using the propriatery flatbed cable. Next, you would start the Amiga and run the ManDebug debugger program. Here’s a small bit of what is happening under the covers when you boot a Amiga machine that has been modified by the Epyx SDK.

image image

Essentially it assigns drive letters (symbolic names) to the two important SDK folders 6502 and HANDY. It also adds the HANDY drive to the search path, so you can run the SDK tooling from everywhere. Finally, the ManDebug program is launched separately from the Shell, keeping it free to do other things.

When ManDebug launches it presents a console application that looks somewhat like this screenshot in the WinUAE emulator:


image image image

Even though it is named HanDebug at the top, it actually is ManDebug. You can see so when you look at the greyed out tabs for Trace and ROM, plus the greyed out button that reads Bus Monitor. That is the missing functionality in Pinky when compared to Howard. (Thanks James Jacobs for the hint to change the Preferences to 80 column Text Mode)

The pictures below show what ManDebug looks like running on my Commodore Amiga 2000 computer, plus what Mandy shows on the screen after booting (a line indicative of the loader program placed in middle of video memory).

WP_002599 WP_002512 (1)

Booting Mandy

The Mandy console uses a normal Atari Lynx power supply with +9V DC. The power adapter powers both the Mandy and Pinky device and it needs the full 9 Volts to do so. Mandy switches on like a normal Lynx and immediately turns on Pinky as well. The boot process depends on the jumper settings. Assume that it set for the debugging scenario, meaning that it will use the Pinky EPROM as the first content on the cartridge.


It is encrypted in the EPROM and follows the normal decryption process once loaded by Mandy after booting and loading the “cartridge”. You can find the decrypted contents at $0200 like you would for regular (commercial) cartridges.

The Pinky EPROM will load a second stage loader that expects block of bytes to be uploaded from the Amiga via Pinky to Mandy.

0200     A2 66         LDX #66
0202     BD 1B 02      LDA 021B,X     ; Copy second stage loader to $3000
0205     9D FF 2F      STA 2FFF,X
0208     CA            DEX
0209     D0 F7         BNE 0202

020B     A9 08         LDA #08
020D     8D F9 FF      STA MAPCTL     ; set vectors for RAM
0210     A9 00         LDA #00
0212     8D FA FF      STA CPU_NMI+LO ; set NMI to point to code
0215     A9 30         LDA #30
0217     8D FB FF      STA CPU_NMI+HI
021A     80 FE         BRA 021A       ; sit here waiting for NMI

This part will copy the second stage loader from $021B (directly after the first part) to $3000. This address also serves as the NMI vector, i.e. the address that will be called when an NMI occurs. Finally, the first stage will stay in an endless loop and wait until a NMI interrupt occurs. Harry Dodgson explained this to be a “hardware lockout”. Indeed, the contents of the Pinky loader is general purpose and circumvents the need to have an encrypted header for the ROM itself and bypasses any checksumming on the contents of what is loaded. Essentially, it is a perfect troyan horse to get your homebrew code into a Lynx. The dependency on a NMI hardware signal makes it impossible to use this without a Lynx console that has been modified to allow an NMI falling edge to occur on the respective pin on Mikey. So, it is a lockout of the ROM using hardware.

Uploading the monitor program

The next step involves pressing the NMI button on the front of the Pinky device. It will trigger an NMI, and Mikey will jump to the NMI vector at $3000. That piece of code is going to wait for the following bytes to arrive through the parallel port:

Load address (2 bytes): LO, HI
Length (2 bytes in 2’s complement): LO, HI
Actual bytes of file

The first four bytes indicate the load address and length. The loader reads the data and copies it to Mandy RAM memory at the load address and finally execute the loaded code.

For debugging, it is necessary to first upload a file called monitor.bin. You click the Bootstrap button and a dialog opens that lets you specify the file.


The monitor.bin file has code that will communicate with the Amiga by using Pinky as a “dynamic” cartridge. It will use different cartridge pages to indicate which parallel port line it want to read from or write to. It will use the data lines of the port for data transfer and the input and output control lines for negotiating the conversation.

Mandy will run the monitor program after having loaded it at $F900-$FF00. The monitor will intialize itself by setting the IRQ and NMI vectors to point to its own two handler routines. It performs a handshake ritual to initiate the communication with Pinky whereby the connection with the Amiga is established. The title of ManDebug will change from “Parallel Port is DOWN” to “Parallel Port is ACTIVE”. Then the monitor sits idle waiting for the first incoming command from ManDebug.

Communication between ManDebug and Mandy

All electronics aside the communication between the Amiga and Mandy consists of sending commands from the Amiga to Mandy and receiving answers in the opposite direction.

It is always ManDebug that initiates a command, but the monitor needs to be in control. It can come into control in one of two ways:

  1. Pressing the NMI button on Pinky
    This will create a NMI signal that causes the NMI handler in the monitor to execute, because the NMI vector is set to point there. At that stage the monitor takes over control and performs its handshake ritual.
  2. A BRK instruction in code is encountered
    Whenever a BRK is executed by the 65SC02 processor, it will perform its normal IRQ routine and jump into the IRQ handler. But, for a BRK instruction the B flag is set in the processor status register before it is pushed onto the stack. It allows the handler to inspect whether a normal IRQ from the timers came in or a software IRQ from a BRK instruction.
    The IRQ handler is also inside the monitor and will check the presence of the B flag (for the BRK). If present, the NMI handler will be called. Otherwise, the normal IRQ jump table will be used to jump into the respective IRQ handlers that were (potentially) registered for the 8 timer IRQs of Mikey.

Debugging with ManDebug

The topic of debugging deserves a chapter of its own, as a lot can be done. ManDebug offers the following functionality during a debug session:

  • Inspect and change internals of Mandy
    This includes the registers A, X, Y, the processor status, stack pointer and current program counter address and the current RAM memory (entire range from $0000 to $FFFF).
  • Watch memory variables
    The variables you want to inspect are single byte or double byte values located in memory. The current value is shown for the variable and update whenever it changes, provided the monitor is in control again.
  • Set breakpoints in the code
    A breakpoint will cause the monitor to be in control again, so you can inspect the state of Mandy, alter it if desired and resume execution.
  • Step through the code instruction by instruction
    It is possible to step into JSR routines or skip over them.
  • View memory structures
    When there is a area of memory that has a specific layout (such as a Sprite Control Block (SCB), you can declare the structure of the memory and view the memory in a window that is specifically designed to show the structure.
  • Resume execution
    This will restore the pre-interrupt state and resume execution of the program. It means that the monitor releases control and gives control back to the program again (at the risk of not regaining it by a breakpoint).
  • Fill a memory range
    A single constant byte will be used to fill a part of memory. It can be used to wipe a piece of memory using all $FF for example.
  • Watch the Bus for special circumstances (Howard only)


  • Upload a memory range into ManDebug
    The specified range of current RAM memory values in Mandy is sent from Mandy to the Amiga and is stored in a file with a name and format you have chosen.
  • Download a file into Mandy
    This will send the contents of a file from the Amiga to Mandy.

image image

Sending commands

The ManDebug debugger can send its commands provided the monitor is listening for incoming commands. The handling is performed inside command loops. The main loop looks a little bit like this:

  1. Wait for command byte
  2. Check byte
    0x00: Done
    0x01: Download/Receive (from ManDebug to Mandy)
    0x02: Upload/Send (from Mandy to ManDebug)
    0x03: Continue
    0x04: Slave request (?)
    0x05: Go
  3. Repeat from top

The Download and Upload commands are other loops, with subcommands. We’ll discuss those shortly. “Continue” will restore the pre-NMI or IRQ values for the stack, A, X and Y plus the processor status register and jump back to the previous address before the interrupt. “Go” resets the IRQ jump table and then do a “Continue”. I have not figured out what the “Slave request” is supposed to do, but I assume it puts Mandy in control to initiate requests to ManDebug.

A lot of the available main loop commands are revealed in an include file (from the Epyx SDK) called monitor.i. It shows the following constant declarations:

NOP_REQUEST         .EQU    0
GO_REQUEST          .EQU    5

The last two commands are not available in ManDebug (unfortunately), because of the missing hardware.

The “Download” and “Upload” loop have the following commands used by ManDebug:

END_OF_FILE    .EQU    $00  * Done and return to main loop
ORIGIN         .EQU    $01   * Set load address
DATA           .EQU    $02  * Transfer data (max 256 bytes)
RUN_ADDRESS    .EQU    $03  * Set run address
REGISTER       .EQU    $10  * Send/receive registers (A, X, Y, SP, PS and PC)
FILL_MEM       .EQU    $11  * Fill memory range with specified (single) value
LARGE_DATA     .EQU    $12  * Transfer data (max 65536 bytes)

Only commands 0, 1, 2, and 10 are used for Upload. Download uses all of them.

Using a combination of sending such commands, the functionality of ManDebug is implemented. For example, to download a file into Mandy, the debugger will send a DOWNLOAD_REQUEST first, then an ORIGIN command byte to indicate the load address, followed by LARGE_DATA (accompanied by the length and the actual data). It returns to the main loop again by END_OF_FILE.

With a logic analyzer you can actually see the bits going across the parallel port.


The picture above shows the data lines for a simple command sent from the Amiga to Mandy (through Pinky).

Breakpoints and stepping

Analysis of the various functionality showed that breakpoints and the stepping are actually made possible by replacing the instruction at a specific address with a BRK command. A pretty nifty trick that makes use of the IRQ handler to call the NMI.

For breakpoints the original instruction is remembered and replaced by BRK. When it the original instruction is restored. Stepping involves replacing the next instruction that will execute with a BRK. This can be determined easily if you know how the instructions behave. This is deterministic, even for branch instructions if you know the status flags. Fortunately, you can transfer the current state of the processor status as part of the whole set of 65SC02 registers mentioned earlier. When control returns after a step, the original instruction is restored and the next instruction is replaced.

It can happen that when a step is performed the next instruction is never reached. In that case ManDebug will report that control did not return. Your only option to regain control is by hitting a user-specified breakpoint or pressing the NMI button on Pinky.

Posted in Hardware | 2 Comments

Epyx Development Kit: Part 1–Contents

Contents of the Epyx development kit

The Epyx development kit for Handy consists of a number of items ranging from hardware to reference materials and software:

Mandy and Pinky

Mandy is a slightly modified, fully functional Lynx I that has the Reset and NMI pins connected and has a special “cartridge”. The cartridge is essentially a connector to Pinky. Pinky is a custom electronics board that sits between the Commodore Amiga and Mandy to facilitate the communication and optionally hold ROM images of your Lynx programs. It has two buttons to trigger the reset and NMI signal to Mandy.

WP_20140426_023_thumb[2] WP_20140426_009_thumb[2]
WP_20140426_002_thumb[2] WP_20140426_013_thumb

These pictures are from my own development kit. In clockwise order starting at the top left they show

  • Pinky and Mandy with a blue parallel cable and the special cable to Mandy
  • Special cartridge in Mandy holding the cable
  • Inside of Pinky
  • Outside of Pinky with the blue parallel cable and two buttons (Reset and NMI).

You can see some additional pictures here at the Handheld Museum.

Howard and Howdy

This hardware set was more expensive than Pinky and Mandy, but offered some extra functionality. The set has two pieces of hardware called Howard and Howdy. Where Mandy is a full Lynx, Howdy has no processor and memory of its own. Instead, the guts of the Lynx exists in Howard, a PC case with a huge motherboard.

post-5140-0-40247600-1354856273 post-5140-0-62583600-1354856263 post-532-129051274036 post-532-129051271498

It holds the 65SC02 processor and lots of RAM and additional logic to offer functionality for bus monitoring and tracing of your code. Howard is connected to Howdy for display, sound and input. In turn, the Amiga is connected to Howard. The last two pictures (from this thread at the AtariAge forums) show the Howdy console connected to Howard.

The Epyx kit did not include the Commodore Amiga 2000 that was needed. You could have any Amiga machine, but the 2000 model was recommended because of its harddisk and memory.

Reference manual

A binder full with hundreds of pages detailing the internals of the Handy hardware, the use of the Amiga software and Epyx SDK for developing Lynx programs. When updates to the SDK were made, addendums where issued that you could place in the binder.

post-27403-0-52763900-1393252317 post-27403-0-96812000-1393252337


The software accompanying the the development kit is provided on a set of 8 disks that contain the Amiga software, source code and samples you need to develop Lynx programs.

The SDK’s 3.5″ floppy disks restore a Quarterback backup set to the system partition. You needed to have Quarterback software to use the disk. This is the way it works for the 1.6 revision of the SDK. Older sets may have worked in a different way. The backup sets were created using QB 4.2, but version 5.0 is also capable of restoring the set. The restore would add custom files for the Workbench 1.3 operating system under C2, replace some its system files in the C folder and place the SDK tools and source code (actual sources, include files, macros and sample code) in two folders called 6502 and HANDY.


The development software contained the compiler, sound and rom creation tools, and the source code for building Lynx programs. Additionally it had Amiga tools that made working with the Amiga as a development machine a little easier (e.g. faster fonts and a better text editor).

Posted in Hardware | 2 Comments

Lexis easter egg

The game Lexis published by Songbird has a really neat easter egg. You can play a game of Galaxian whenever you feel like it. Here’s how to access the easter egg:

Galaxian in Lexis6

Go to the Table of Contents screen and press Left, Right, Left, Right, Up, Down, Option 1 and finally Option 2. After you have done that, start a regular game of Pages. It may seem that the game simply starts, but you get an easy finish, by completing the word “SCIENTOLOGY” with the missing T. Receive compliments and enter your name in the highscore table.

Galaxian in Lexis3 Galaxian in Lexis4 Galaxian in Lexis5

You should now have a game screen for a good game of Galaxian. A fine example of an easter egg that offers more gameplay.

Galaxian in Lexis Galaxian in Lexis2


Posted in Games | Leave a comment

Programming tutorial: Part 16–Cartridges

Atari Lynx programming tutorial series:

In part 15 we discussed the memory and segments and how those are related. Before we can go into the details of loading segments into memory, we need some background on the cartridges that the Lynx uses for storage of binary information. This part we will look at the internals of cartridges and how to do raw reads from it.

Of ROM and RAM

Before we get started with the internals, it is worth pointing out a few pecularities of the Lynx. In previous parts we touched on this, but now a refresher and some details are badly needed.

You see, the Lynx only has RAM. 64 KB of it. Read part 12 and 15 to find out how this is organized. Other systems have less RAM (sometimes) and use part of their address space to look “into” ROM cartridges. These systems have the luxury of memory mapped swappable ROM. For example, the Atari VCS 2600 only has 128 bytes (!) of RAM, while there is around 4KB of address space to read from ROM “memory” of the inserted cartridge.

99 fgames were developed for the device. They were supplied as thin card-style cartridges with a prominent edge to make them easier to remove
Photo: Alex Kidman

No such luck for the Lynx. It only has RAM and will need to read its code and other binaries into RAM from the peripherial device called the cartridge. The cartridge can be viewed as a read-only harddisk of some sort. Like a PC the Lynx will have to read data from the cartridge and store it in memory.

A side note: at one point in time Atari had the idea to read games from tape. There is still reference of the tape and some hardware addresses like MAGRDY0 ($FD84) that are directly related. The timers 1, 3, 5 and 7 were also meant to be used for signalling the baud rate of the tape device.

It might seem that this is sort of limiting and that we took the short straw with the Lynx. Nothing is further from the truth. The setup allows us to use a lot of RAM in any way we like. We are not tied to certain memory ranges that we must use. Additionally, we can have cartridges that are much larger than the available RAM. The Atari Lynx cartridges come in different sizes. The common ones are 128, 256 or 512 KB, although smaller and larger variations can and do exist. We get to choose how and when to load data from the cartridge and where to store it. Heck, you can even stream live from the cartridge as some libraries have already demonstrated. HandyMusic can play sound effects in PCM format straight from the cartridge. How nifty is that?

Physical structure of the cartridge

Even though the sizes vary, all cartridges have something in common: they have the same (maximum) number of 256 blocks. For each cartridge every block contains a fixed number of bytes. Two simple formulas give the total cartridge size from the block size of a cartridge and vice versa:

TOTALSIZE = 256 * BLOCKSIZE             – or –

This tabel helps find the right sizes:

Cartridge size (KB) # Blocks Blocksize (bytes) Pins
64 256 256  A0-A7
128 256 512 A9-A8
256 256 1024 A0-A9
512 256 2048 A0-A10
1024 256 4096 A0-A10+?

The italic red ones indicate uncommon cartridges. No commercial cartridges with 64 KB and 1 MB have been released during the Atari age.

To give you a visual impression of the cartridges and their sizes, you can take a look at the picture below. It depicts the blocks and their sizes.


No matter how you look at the cartridges, their behavior is always that of a stream of bytes starting somewhere within the cartridge’s binary image and continuing for as long as you are reading bytes.

Close connections of console and cart

The Lynx and the inserted cartridge are connected to each other through a large flat connector that sits inside the Lynx console.

image image image
Pictures from

The connector passes the pins to a couple of signals and pieces of electronics on the Lynx motherboard:

  1. VCC +5V
  2. 74HC4040 (12-stage binary ripple counter; generates cartridge addresses A0-A10)
  3. 74HC164 (8-bit serial-in, parallel-out shift register; generates cartridge addresses A12-A19)
  4. Data lines (8-bit lines that go on the data bus)
  5. Ground
  6. Auxiliary Data Input/Output (aka AUDIN, not to be confused with Audio IN)

As a programmer you must know about the ripple counter and the shift register. These two pieces of hardware together build the cartridge address you are reading the data from.


It works like this: the shift register builds the high part of the cartridge’s address. It can target 256 different values, that correspond to the 256 blocks (or pages) of the cartridge. The lower part of the address is created from the ripple counter. That counter will start counting at value 0 and auto-increment after every read from the cartridge.

Different sized cartridges have different wirings from the A0 to A20 lines. More precisely, smaller cartridges have not all pins from A0 to A10 connected. They will only wire from A0 up to whatever they need. A 64KB cartridge needs to be able to address 65536 bytes which requires 16 bits. It is sufficient to connect the wires A0 to A7.


Look back at the table above and find out what the pins for each cartridge size are.

The data lines that are present will hold the byte value from the cartridge. Reading it will pulse the CE0/ line on the cartridge, advancing bank 0 to the next byte in the ripple counter. And so on.

The AUDIN pin is used heavily on custom cartridges as follows:

  1. An extra address line for 1 MB cartridges, giving a virtual A20 line for large enough EEPROMS.
  2. A bank-switching bit that allows switching between more than one bank (two usually) to increase the maximum number of data in the cartridge to 1 MB as well.
  3. Enable bit for EEPROM carrying cartridges, such as Lynxman’s Flashcard. By setting this bit high and using special address lines, you can write to the EEPROM, effectively saving (limited amounts of) data to the cartridge. The EEPROM is a separate chip and has around 128-512 bytes of storage.

Shifting and rippling

The block and position selector need to be prepared to read the intended data from the cartridge. Each requires a different approach. The block selection is performed by bit-shifting the right block number into the shift register. The position within the block is prepared by performing strobes, which in turn requires dummy reading, so the ripple counter (automatically) increases to the right position in the block.

The shift register is like a conveyor belt of bits. You need to place a bit on the belt, advance the belt one position and place the next bit. After performing this 8 times you are certain to have the right block selected. There are two registers involved in performing this bit shifting: $FD8B (IODAT) and $FD87 (SYSCTL1).


IODAT is a hardware register that has a bit for the data to be placed into the shift register. Bit 1 of this address is the Cart Address Data output bit. IODAT is a weird register, in that it can be read from and written to, but the individual bits provide either input or output access, so will only make sense when used appropriately You can control what direction (input or output) it has by setting it in the IODIR ($FD8A) register. It is sort of similar to the way the MAPCTL register determines whether to use RAM or hardware registers based on the bits you set. You need to set the direction for bit 1 of IODAT to 1 for output. After that, by setting that same bit 1 of IODAT you determine what is the next bit on the shift register for the cartridge address. The shift register is advanced by strobing bit 0 (called CartAddressStrobe) of SYSCTL1. A strobe means that the value of the bit changes from 0 to 1 and back to zero again. Although the shifter will except the data at the rise of the strobe bit, it must be set back to 0, as the high level of the bit is used to reset the ripple counter.

Here’s the general flow:

  1. Turn on cartridge power
  2. Set directions on IODIR
  3. Set bit 1 of IODAT to value of current address part (in order from A19 to A12)
  4. Strobe bit 0 of SYSCTL1 (write 1 then 0, assuming you start with a zero value)
  5. Repeat from 3 until all 8 bits have been set.

In assembler code this looks like the following:

	lda __iodat
	and #$fc
	ora #2
	lda _FileCurrBlock
	inc _FileCurrBlock
	bra @2
@0:	bcc @1
	stx IODAT
@1:	inx
	stx SYSCTL1
@2:	stx SYSCTL1
	sty IODAT
	bne @0
	lda __iodat
	sta IODAT
	stz _FileBlockByte
	lda #<($100-(>__BLOCKSIZE__))
	sta _FileBlockByte+1

The code above is from the CC65 implementation for selecting a block, actually. It shows a couple of things when we forget about the details and the optimizations.

  • Notice the two calls to SYSCTL1 for strobing bit 0 to advance the shift register.
  • The accumulator holds the block number to select. The rotation of the accumulator is moving the next (highest) bit in the carry flag. The first call to IODAT stores the value 1 in the CartAddressData bit, if that carry flag was set. The second call is used to always put a 0 in as the default, allowing the first call to be skipped for a zero bit.
  • At the end of the routine the remaing number of bytes in the block is set. We need this to determine the block edge transitions later on.
  • The __iodat value is used to get the current value of the IODAT register. Two shadow variables are declared to hold the values that where written to the registers IODAT and IODIR, conveniently called __iodat and __iodir (with double underscores). The shadow values are needed, because we can never read the values back from the registers, but might want to inspect them later on.

For completeness sake it is worth mentioning that in the startup code of any CC65 compiled program the IODAT and IODIR get initialized. They are set to $1B and $1A respectively. You can find the fragment for the shadow variables in crt0.s:

ldx     #$1b
stx     __iodat
dex;  $1A
stx     __iodir

The initialization of the actual registers IODAT and IODIR is done using a longer list of initialization values for Mikey. This is also in crt0.s (around line 40):

MikeyInitReg:  .byte $00, …, $50, $8a, $8b, $8c, $92, $93
MikeyInitData: .byte $9e, …, $ff, $1a, $1b, $04, $0d, $29

After having prepared the shift register it is a matter of reading data from the $FCB2 (RCART0) address. The $FCB3 (RCART1) address is used for reading from the second bank that might be present. Usually there is only one bank (bank0) on a cartridge. When reading from RCART0 the strobe CART0/ is used and it will advance the ripple counter to the next value, essentially autoincrementing the cart’s current address.

Reading from cartridges

Let’s assume that the data we want to read from the cartridge is located somewhere like this:


The picture shows the data starting in the middle of the second block and continuing into the fifthblock. The way to read this data is to advance the high part of the cart address to the second block, then dummy read until the starting point of the data is reached. A thing to remember is that the blocks have a certain size. This is relevant for two reasons.

  1. You need it in calculations of the desired block if all you have is the consecutive byte number. The ripple counter is automatically set to zero by changing the high part of the cartridge address (because this requires using the strobe for the bit shifter which resets the counter). This also means that you need to do the correct number of dummy reads, also determined by the blocksize.
  2. When crossing the boundary of a block you need to increase the block number to read the right data. If you do not do that, you will read data from the same block again. That’s why you need to keep track of how many bytes are left in the current block. When it reaches zero you must increase the block number by shifting in 8 bits again.

We could do this using assembler, but that has been done. There is also a higher level abstraction from C. The methods lseek and read will do what we want, …. sort of. This is what the functions look like (from the unistd.h include file):

int __fastcall__ read(int fd, void* buf, unsigned count);
off_t __fastcall__ lseek(int fd, off_t offset, int whence);

The lseek method takes three parameters, of which only one is relevant. The file descriptor fd is always 1 for the Lynx, and for whence only SEEK_SET is supported. That leaves the offset you want to have into the cartridge. The offset is passed using a type off_t, but it is in fact an long integer that can hold the large zero-based offset from the beginning of the cartridge. In a 2MB cartridge this might actually be 2^21-1 = 2097151.

lseek will set the shift register to the correct value and advance the ripple counter to the start of your data, both depending on your block size. It supports the 512, 1024 and 2048 byte block sizes.

off_t offset = 0;
lseek(1, offset, SEEK_SET);

You might need to calculate the offset from the block number and offset in the block. That’s kind of silly, because the lseek implementation does the reverse. It’s just the way lseek is defined.

You need to include the headers stdio.h (for SEEK_SET constant), unistd.h (for the function prototypes of lseek and read) and sys\types.h (for the off_t type).

The actual reading is performed by calling read.

unsigned char buffer[256];
read(1, &buffer, 256);

You need to have a buffer that is going to hold the data read from the cartridge. The example above shows a 256 byte buffer and reads a 256 bytes sized chunk from the cartridge into it.

Here is an example of how to read the contents of your cartridge and dump it to the screen in multiple pages:

void dump_cartridge()
  off_t offset = 0;
  unsigned char buffer[80];
  unsigned char byte;
  char text[4];
  unsigned char page = 0, index = 0, x = 0, y = 0;

// Advance current address to start of cartridge lseek(1, offset, SEEK_SET); do { // Read 80 bytes for one page into buffer read(1, &buffer, 80); index = 0; tgi_clear(); // Draw all values for (y = 0; y < 10; y++) { for (x = 0; x < 8; x++) { itoa(buffer[index++], text, 16); tgi_outtextxy(x * 20, y * 10, text); } } tgi_updatedisplay(); wait_joystick(); } while (++page < 3); }

This example uses a loop that will read 80 bytes for a page into the buffer, then display them in hexadecimal value 8 byte per line for a total of 10 lines. Pressing any normal button on the Lynx advances the current page.


You should try this: open the LNX file using the Binary Editor (right-click your tutorial-cartridge.lnx file, then select Open With…) and compare the contents you see with what is displayed. Mind you, you have to skip 64 bytes of the LNX files. We’ll explain later why that is.

The example is included in the sample source code. It can easily be expanded to allow you to select the current block number and start from there.

Next time

This part showed how the cartridge system works at a low level and with the CC65 methods lseek and read. The Lynx cartridge can also use a simple file system with a directory and file entries. The next part will we look at how files can be read, and how this relates to the segments we saw in memory segments. Till then.

Posted in Tutorial | 4 Comments

Programming tutorial: Part 15–Memory and segments

Atari Lynx programming tutorial series:

In part 12 we covered memory in the Lynx for the first time. By now you may have run into memory limitations while building your Lynx games. Admitted, 64KB of RAM is not really much, especially considering that considerable amounts of the memory are required for the Lynx hardware.

Atari Lynx memory layout

The layout of the Lynx’s memory varies over time. At startup, there is nothing really loaded and the memory layout resembles this:

The green areas that are required by the hardware meaning both Suzy and Mikey. The Mikey 65SC02 requires zero page memory and a stack at $0000 and $0100 respectively. It’s yours to use, but only through zero page addressing and variables, plus by Push and Pull instructions that manipulate the stack. You cannot use these 512 bytes for any other purpose. This is the main reason that most programs get loaded at $0200, which is exactly after the stack’s memory. We already saw that $FC00 to $FCFF contains memory mapped registers for Suzy, and $FD00 to $FDFF likewise for Mikey. $FE00 to $FFF7 is the boot ROM area that is used for booting the Lynx.

After initialization of the Lynx hardware and the C and TGI libraries from the CC65 toolset the memory looks like this:

As you can see in orange, there are three new memory areas:

  • C stack
    The C library uses a stack of its own. This stack is usually 2KB large and is needed for more complex pieces of code (e.g. recursive functions).
  • Video buffers
    The TGI library initializes the video driver and will automatically allocate two buffer for video and do double buffering to avoid screen tearing and other weird effects during updates. One buffer requires 160*102 = 16320 pixels. Since each pixel can hold 16 colors and requires only 4 bits to hold the pen index, the actual number of bytes is 8160 (or $1FE0 in hex). With two required buffers, that’s quite a lot of memory.

Excessive direct access

One thing may have struck you as odd: how come we can use the area from $FC00 to $FDFF where Suzy and Mikey’s hardware mapped registers are. Wouldn’t there be a conflict between the registers and the RAM address space? Would we have to do a lot of memory mapping tricks to make it work? Luckily, we do not have to reserve that area and no mapping of the memory is required. That would be way too complicated. No, the good thing is that the LCD panel gets its data directly from RAM memory… always. This feature is called DMA for Direct Memory Access. So, the video display will always read RAM, no matter where the video buffers are located.

That’s why it is better to overlay it on top of an area that is otherwise less (easily) useable. Hence, the Suzy and Mikey address spaces. We can simply leave it at the regular hardware space, so we can access the special registers. The RAM access is not needed, unless we want to draw directly into the video buffers. That is pretty unlikely.

The same will hold true for the collision buffer, should you want to use that. It will take another 8160 bytes and can be located anywhere. You probably want to lay it right before the first video buffer. That’s where TGI will place it if you use the tgi_setcollisionbuffer(1); call.

With the C-stack and the video buffers in place you are around 18 KB poorer in memory, 26 KB for a collision buffer as well. The bottom line is that in most cases (no collision detection) you can spend your memory from $0200 to $0B838.

Configure my memory

Let’s take a look at how this translates to the CC65 suite. The programs and games we write consist of C and assembler code. The cc65.exe compiles the C code and generates assembler code from it. The assembler code (from C and your own) gets assembled by ca65.exe. We end up with a couple of object modules that need to be tied together by the linker. The object modules do not have exact addresses for memory just yet. It uses placeholders to be flexible in the actual allocation in memory. The linker ld65.exe performs the connection of the modules and the choice of final memory locations based on the configuration of your memory areas as indicated in a configuration file.

Each specific area in memory has a few characteristics:

  • Start address
    The area is located from a specific address up in memory space. As an example, take the video buffer that has its start address at $C038.
  • Area size
    Each area is of a particular size. Sticking with the same example, the video buffers are both 8160 bytes in size.
  • Type of memory
    Some memory areas are read or write or read/write. In the Lynx we mostly deal with read/write memory, because everything memory is located in the 64KB of RAM. Other systems have memory mapped cartridges that are ROM, i.e. read-only.

The linker uses configuration files to tie the individual parts of your program or game together. A configuration file holds information on the memory area and segments. The ld65 linker has built-in configurations for each of the known targets. The lynx has 4 built-in configurations:

  1. lynx
    The default configuration that will have the MEMORY section like above. It adds a small boot loader and a required directory, plus a LNX header so it can run in Handy. The ROM image without the LNX header can be burned to an EEPROM or Flashcard and will produce a working cartridge.
  2. lynx-bll
    This configuration creates a BLL header to the output file, so it can be uploaded via a PC to ComLynx cable using any of the cartridges that allow BLL uploads (e.g. SIMIS and Championship Rally).
  3. lynx-coll
    Essentially the same configuration as the default one. It claims an additional $1FE0 of memory for the collision buffer, before the first video buffer.
  4. lynx-uploader
    This configuration adds a special uploader area right before the first video buffer. The useable RAM area is reduced by a full $100 (supposedly because of alignment?). I believe this configuration file does not function correctly.

Shown below is a fragment of the default configuration that is used by the linker ld65.exe for the lynx target in case you did not specify your own configuration.

  ZP:     file = “”, define = yes, start = $0000, size = $0100;
  HEADER: file = %O,               start = $0000, size = $0040;
  BOOT:   file = %O, start = $0200, size = __STARTOFDIRECTORY__;
  DIR:    file = %O,               start = $0000, size = 8;
  RAM:    file = %O, define = yes, start = $0200,
          size = $BE38 – __STACKSIZE__;

You can see what a particular configuration is by running:

ld65.exe –dump-config lynx

where the bold item is the configuration name. The source file for the default lynx configuration is called lynx.cfg. You can find it in your CC65 folder under the source code for ld65, presumably C:\Program Files\CC65\src\ld65\cfg. This configuration is compiled as part of ld65.exe, so changing the file has no effect. The other three configuration files are located in the same folder.

Focus on the bold items in the MEMORY section for now. You should be able to recognize some of the numbers. Zero page (ZP) runs from $0000 to $00FF, for a total size of $0100. The user available RAM area starts at $0200 as we saw earlier and runs until $B837. The size is computed as follows:

= $C038-$0200-STACKSIZE = $BE38 – $0800 = $B638 (46648 bytes)

The default configuration file does not give you the origin of the numbers, just the correct (resulting) ones. The constants come from the symbols section of the same configuration file:

  __STACKSIZE__: type = weak, value = $0800; # 2k stack
  __STARTOFDIRECTORY__: type = weak, value = $00CB;
  __BLOCKSIZE__: type = weak, value = 1024; # cart block size
  __EXEHDR__:    type = import;
  __BOOTLDR__:   type = import;
  __DEFDIR__:    type = import;

The stack size and start of directory are defined constants in the symbols section. These values can be used to define your memory areas and make them more flexible and less hardcoded. You could change the C stack size and make it bigger or smaller for your needs. All it takes is adjusting the value attribute of the __STACKSIZE__ symbol.

There are a couple of things in the memory and symbols section that do not make sense right now. We will get to them in time. For now, suffice to say that HEADER, BOOT and DIR are areas for respectively the Handy emulator’s LNX file header, the encrypted boot loader on the cartridge and the directory with file entries on the cartridge.

Define it for me

Notice how some of the memory areas use an attribute called define with a value of yes and no. Each memory area that has a define=yes will make the linker emit two values. For an area called AREA51 it will emit __AREA51_START__ and __AREA51_SIZE__ corresponding to the start address and size of the memory.

Other pieces of code may rely on these values to allow for a flexible layout of memory and the code that is tied to the memory layout. An example is the implementation of the C stack that depends on the location and size of the RAM memory area. We already saw that the __STACKSIZE__ is a constant in the symbols. But the implementation also relies on the final physical location. It uses the value of __RAM_START__ to indicate the start address. Later, the linker will emit this value because a memory area called RAM is defined. When linking together, all puzzle pieces fit together.

In a while you will see how the linker emitted values for these defines on memory areas can be very useful. For one, they allow you to create the file entries in the directory structure that each larger game cartridge will have.

Dividing in segments

The source code items you create get compiled, assembled and assigned to the memory areas. There’s another dimension to all this. The source code consists of elements such as executable code, variables and static data, that have a different behavior and memory requirements. Similar elements are combined and group together into memory segments of a certain type. In general this allows for the protection of memory and programs residing in it. There are four segment types in C source code:

  1. Code
    Regular executable code. Normally, this cannot be altered and it resides in a read-only memory segment. If code is self-modifying it cannot reside in this segment.
  2. Data
    Refers to data that can be altered. This data comes from the global and static variables that you declare and initialize with values. These values are combined in the data segment. They can be found in the object module as binary values that get copied by the loader at the memory location of the variables, initializing them to the values you gave them. After that they can be altered, because the memory is for variables (after all).
  3. Read-only data
    Data, but this is not meant to be altered. It is reference data from constant valued variables (marked as const). Some examples of read-only data are binary data for images and music, and text strings containing messages.
  4. Bss (Block Start by Symbol)
    This is data that is uninitialized. It will have a zero or null value. There is no need for the object module to contain this data, just the location in memory. A simple routine can initialize the values, because it will be zero anyway.

That’s a lot of theory, so a real example with code might illustrate this a bit. Take the following code sample:

unsigned char a;
char b = 42;
const char text[] = “Hello, World!”;

void example()
  int x, y = 1337;

The compiler will generate the following assembler code for this (showing relevant fragments):

.segment    “DATA”
_b :
  .byte  $2A

.segment    “RODATA”
_text :
  .byte  $48, $65, $6C, $6C, $6F, $2C, $20, $57, $6F, $72, $6C, $64, $21, $00

.segment    “BSS”
_a : .res    1, $00

; ———————————————————– –
; void __near__ example(void)
; ———————————————————– –

.segment    “CODE”
.proc    _example : near

.segment    “BSS”
L0035 :
  .res    2, $00
L0036 :
  .res    2, $00

.segment    “CODE”
; int x, y = 1337;
  ldx     #$05
  lda     #$39
  sta     L0036
  stx     L0036 + 1
; }


Hopefully you can make some sense of the transition from C to 6502 assembler. Notice how the segments for the various types of code and variables is declared. It uses the .segment keyword combined with the quoted segment name. Since b was initialized it is placed in the data segment with its initialization value. Likewise, a was not initialized and can reside in the bss segment. The constant value for text is listed as the hex ASCII values in the read-only data segment. The code segment is used for the implementation of example.

What may come as a surprise is that the initializer for y inside the example function is placed in the bss segment just like x. The reason is that y needs to be initialized every call to example(), so it is not sufficient to have the value 1337 in the data segment. Instead it is placed in the method itself and y is simply placed in bss, to save size in the binary image for the object module.

Choosing your segments

The names for the segments we just saw might seem arbitrary, but nothing is further from the truth. You chose them when you compiled your C source code. You don’t remember? Well, that’s because we never really discussed this. I will take you back to one of the first tutorials where we looked at the MAKE files for our projects. Here’s an excerpt from the lynxcc65.mak file:


  –rodata-name $(RODATA_SEGMENT) \
  –bss-name $(BSS_SEGMENT) \
  –data-name $(DATA_SEGMENT)

# Rule for making a *.o file out of a *.c file
  $(CC) -o $(*).s $(SEGMENTS) $(CFLAGS) $<
  $(AS) -o $@ $(AFLAGS) $(*).s

The MAKE file defined some macros for the 4 segments and gave them the names of CODE, DATA, RODATA and BSS. It might have been anything you liked, although changing this will force some adjustments in other places as well. The inference rule for .o files for object modules from C source code shows that the cc65.exe compiler takes the arguments –code-name, –rodata-name, –bss-name and –data-name to define the segments names used in the compilation to assembler code. This will make the compiler emit the .segment “DATA” and similar pieces of assembler code we saw in the earlier fragment.

Every time you call the compiler cc65 you are free to pass different segment names. This allows you to choose your segment names for all C files that are compiled by that single command. As the SEGMENTS macro is like a global variable and the inference rule will apply to all times the rule is triggered, it is a bit fairer to say that it applies to every C file that is affected by your make file and thus your entire project as it currently is organized.

If you want a more fine grained control over the segments you have a few options:

1. Create more MAKE files

Each MAKE file will hold its own SEGMENTS redefinition. The lynxcc65.mak file is included by every MAKE file, and defines the SEGMENTS macro first. If you add your own (re)definition in your MAKE file (say fonts.mak), it will overrule the previous definition with your new one. A separate MAKE file can be triggered by calling:

cd fonts && $(MAKE) $* /f fonts.mak

assuming you have placed the items build by the fonts.mak MAKE file into a relative subfolder fonts.
The fonts.mak file should hold a new definition like this:

  –rodata-name FONTS_RODATA \
  —bss-name FONTS_BSS \
  —data-name FONTS_DATA

2. Include #pragma statements in your C source code

The cc65.exe compiler recognizes the following #pragma statements: Adding this at the top of your C file will make sure that all code inside that C file is compiled into the specified segment names. It could be used midway through the code, but that would mean that some code gets compiled into the default SEGMENTS defined segment names, and the rest in the #pragma ones.

#pragma data-name (“FONTS_DATA”)
#pragma rodata-name (“FONTS_RODATA”)
#pragma code-name (“FONTS_CODE”)
#pragma bss-name (“FONTS_BSS”)

It is even possible to push the current (old) name for a segment onto a sort of stack with the push keyword. It seems unlikely that you will need this control any time soon.

So far we discussed how you can control the segments for C code. In case you are writing assembler code yourself, you will need to specify the segments for the various types of code and variables yourself. You will use the .segment keyword, just like in the compiler generated assembler code, to do so. As a matter of fact, you were already using it without knowing it.

# Rule fore making a *.o file out of a *.bmp file
  $(SPRPCK) -t6 -p2 $<
  $(ECHO).global _$(*B) > $*.s
  $(ECHO).segment “$(RODATA_SEGMENT)” >> $*.s
  $(ECHO) _$(*B) : .incbin “$*.spr” >> $*.s
  $(AS) -t lynx -o $@ $(AFLAGS) $*.s
  $(RM) $*.s
  $(RM) $*.pal
  $(RM) $*.spr

When a bitmap file was used to create the read-only SCB data for a sprite, it used an inference rule that generates a new assembler file containing the line

.segment “RODATA”

or whatever the read-only data segment is called by the RODATA_SEGMENT macro at that point in time. For example, when we did the robots.bmp file this gave the following robots.s assembler file:

.global _robot
.segment “RODATA”
  .incbin “robot.spr”

It might require a different inference rule or redefinition of RODATA_SEGMENT to place your sprite data in an other segment.

Segments and areas

At this point you are probably wondering what all these segments and memory areas are all about. And maybe even how the two are related like I hinted at when I mentioned another dimension to memory. Get ready for it, here it comes.

Individual segments are assigned to a memory area. As memory areas can hold various types of memory, like read/write for RAM or read-only for ROM, certain segments should go into compatible memory areas. E.g., code segments can come from ROM, but data should always be assigned to RAM, as it requires read/write memory.

Typically (for the Lynx) related segments are assigned to the same memory area. The Lynx only has RAM memory to work with. Admitted, the cartridges are like ROM, but it is accessed as a sequential stream that needs to be copied into RAM before it can be used.

The linker can work its magic for each of the segments that are assigned to memory areas. It can allow segments to be assigned to overlapping memory areas. This way we can have code and data in the same memory space at different times. By loading the required code and data at the appropriate time it will enable us to fit more code into our already constrained memory space.

The linker configuration file has a section for the segments and their mapping to memory areas. It tells the linker what type of code or data is in each segment and where to load the segment into memory. Here is a fragment from the default lynx configuration:

  EXEHDR: load = HEADER, type = ro;
  BOOTLDR: load = BOOT, type = ro;
  DIRECTORY:load = DIR, type = ro;
  STARTUP: load = RAM, type = ro, define = yes;
  LOWCODE: load = RAM, type = ro, optional = yes;
  INIT: load = RAM, type = ro, define = yes, optional = yes;
  CODE: load = RAM, type = ro, define = yes;
  RODATA: load = RAM, type = ro, define = yes;
  DATA: load = RAM, type = rw, define = yes;
  BSS: load = RAM, type = bss, define = yes;
  ZEROPAGE: load = ZP, type = zp;

  EXTZP: load = ZP, type = zp, optional = yes;
  APPZP: load = ZP, type = zp, optional = yes;

The bolded items are the segments we have encountered so far. CODE and RODATA are read-only segments as indicated by the type=ro attribute. DATA is read-write and BSS is of type bss, like you would expect. Each of these segments gets loaded into the RAM memory area. That much makes sense, as the RAM segment is currently the only user memory,

There are a few other segments (ZEROPAGE, EXTZP, APPZP) defined that are used by the compiler for zero page variables. The segments at the top EXEHDR, BOOTLDR and DIRECTORY are for creating a binary image that you can run in Handy. The STARTUP, LOWCODE and INIT segments are for the C runtime to put stuff that needs to be in potentially special areas. Consider these a given for now.

Also, notice the fact that some segments are marked as optional, where others have the define=yes attribute. The former means that the segment might not actually be there and some optimizations can be done. The latter will make the linker emit values for the section, similar to what it does for define=yes in memory areas. For a segment named FONTS_DATA the linker creates values __FONTS_DATA_SIZE__ and for FONTS_CODE two values __FONTS_CODE_LOAD__ and __FONTS_CODE_SIZE__. The values will come in useful at a later time.

Some rules of engagement

You might think about renaming some of the areas and segments. Be careful though, because some things simply need to be present and named according to presets. As an example, the C runtime library depends on the RAM memory area to be present. It assumes that the C stack is located directly after the RAM area. It uses the generated values for __RAM_START__ and __RAM_SIZE__.

Next time

This was a pretty deep and theoretical part in the tutorial. It covered a lot of ground that was more computer science related and less specific for the Lynx. Nevertheless, it was necessary to tackle this, because a lot of other Lynx and CC65 specifics are related to it either directly or indirectly. Next time we will continue our investigation of segments and look into loading code and data into memory from cartridges. Till then.

Posted in Tutorial | Leave a comment