WARAJEVO FILE FORMATS


Only the native Warajevo formats are described here; other file formats which are internally converted to native Warajevo formats (such as SNA, SLT, TZX etc.) are not described in this place.

WARAJEVO TAP FILES


TAP files (tapes) in the native Warajevo format have the following structure:

At the beginning of the file there are four bytes which contain the pointer to the first block. Then follow four bytes with the pointer to the last block. The next four bytes contain #FFFFFFFF, which is characteristic of the native Warajevo format. So, an empty tape has the format:

#04 #00 #00 #00 #00 #00 #00 #00 #FF #FF #FF #FF
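The 12-byte header described above can be checked with a few lines of code. This is only a minimal sketch: the function name `read_tap_header` is hypothetical, and the little-endian byte order of the pointers is an assumption inferred from the empty-tape example (#04 #00 #00 #00 read as 4).

```python
import struct

# The empty tape shown above: first-block pointer, last-block pointer, marker.
EMPTY_TAPE = bytes.fromhex("04000000" "00000000" "FFFFFFFF")

def read_tap_header(data: bytes):
    """Return (first_block, last_block) for a native tape, or None otherwise."""
    if len(data) < 12:
        return None
    # Assumption: all three 4-byte values are stored little-endian.
    first, last, marker = struct.unpack_from("<III", data, 0)
    if marker != 0xFFFFFFFF:
        return None  # bytes 9..12 are not all #FF, so not the native format
    return first, last
```

For the empty tape, the first-block pointer is 4 and the last-block pointer is 0.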

The sequence #00 #00 #00 #00 #FF #FF #FF #FF is, in fact, an EOF (end of file) marker. Every block contains the following:

This is, in fact, a doubly linked list. The native Warajevo format doesn't keep the control bytes (parity bytes) of the blocks; the emulator calculates them from the flag byte and the data bytes.
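As an illustration of how a parity byte can be recomputed from the flag byte and the data bytes, here is a small sketch. It assumes the standard ZX Spectrum tape convention (XOR of the flag byte and all data bytes); the text above does not state the formula, so treat this as an assumption. The name `tape_parity` is hypothetical.

```python
from functools import reduce

def tape_parity(flag: int, data: bytes) -> int:
    """XOR of the flag byte and every data byte (standard Spectrum convention)."""
    return reduce(lambda acc, b: acc ^ b, data, flag & 0xFF)
```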

If the block size is 65535, it is a compressed block. Its structure is a bit different and looks like:

Signatures are data which are important for the imploding algorithm used in the Warajevo emulator. When decompressing, this algorithm either copies bytes from the source file, or goes back a few bytes and copies some bytes from the destination file. The algorithm is described later on this page.

If the block size is 65534, it is a block which contains tone record samples. Its structure is like:

If bytes 9, 10, 11 and 12 of a TAP file are not equal to #FF, the TAP file is not in the native Warajevo TAP format. In this case the emulator assumes Lunter's TAP format, which is simply a linear sequence of blocks which contain the following:

The explanation of the compressed data bytes is rather complex. We used a format similar to the one used in PKLITE, but unlike PKLITE, where signature bytes are mixed with data bytes, we divided them into two parts, for easier debugging. So, the compressed data starts with the signature bytes, and the actual data bytes follow immediately after the signature bytes. Recall the basic elements of the Imploding (LZ77) algorithm. It depends on copying byte sequences which have already occurred. For example, the sequence

#3D #18 #2E #42 #3D #18 #2E #15 #42 #3D #19

will be encoded as:

#3D #18 #2E #42
<Return for 4 bytes and copy 3 bytes>
#15
<Return for 5 bytes and copy 2 bytes>
#19

The different archivers (PKZIP, LHARC, ARJ etc.) differ in the way they encode this special 'Return for...' code. As already mentioned, the Warajevo compressed format has two parts: signatures and data. In our example the signatures will be coded as (binary):

00001001 010100xx (xx - not used, so don't care)

while data bytes will be:

#3D #18 #2E #42 #04 #15 #05 #19

The signatures are bits which describe what to do with the data bytes. When the signature bit is 0, the corresponding data byte is a simple data byte (it may be simply copied from the input to the output file). When the signature bit is 1, this is the code for returning. In our example, the four zeros in the signatures mean that four bytes (#3D, #18, #2E, #42) can be simply copied to the output buffer. The next bit is 1. This means "Return for xxxx bytes and copy yyyy". Now, we will explain this case in detail.

The value of yyyy (the size of the string to be copied) is embedded in the signatures if yyyy is less than 10, or in both the signatures and the data bytes if yyyy is greater than or equal to 10 (the principle used is, in fact, Huffman statistical coding). So, the size yyyy depends on the next 2-4 signature bits:

Bits   Size (yyyy)
010    2
00     3
100    4
101    5
011    >= 10
1100   6
1101   7
1110   8
1111   9

If the size yyyy is greater than or equal to 10 (code 011 in the signatures), the next data byte contains yyyy-10. That means the maximal string size is 265 (255+10).

The next data byte determines the lower byte of the distance of the string to be copied (i.e. the lower byte of xxxx). If yyyy=2, the higher byte is always zero (so, for this size the distance can be at most 255). If yyyy differs from 2, the next 1-6 signature bits determine the higher byte:

Bits     Higher byte of xxxx
1        0
0000     1
0001     2
00100    3
00101    4
00110    5
00111    6
01bbbb   7+bbbb
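The decoding rules described so far can be sketched as a small decompressor. This is an illustration under stated assumptions, not the emulator's actual code: signature bits are assumed to be read from the most significant bit of each byte first (as the binary example suggests), the decompressed size `out_size` is assumed to be known by the caller, and all names (`BitReader`, `read_length`, `read_high`, `decompress`) are hypothetical.

```python
class BitReader:
    """Reads signature bits, most significant bit of each byte first (assumed)."""
    def __init__(self, data: bytes):
        self.data, self.pos = data, 0

    def bit(self) -> int:
        byte = self.data[self.pos >> 3]
        value = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return value

    def bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            value = (value << 1) | self.bit()
        return value

def read_length(sig: BitReader, src) -> int:
    # 00 -> 3, 010 -> 2, 011 -> 10 + next data byte, 100 -> 4, 101 -> 5, 11bb -> 6 + bb
    if sig.bit() == 0:
        if sig.bit() == 0:
            return 3
        return 2 if sig.bit() == 0 else 10 + next(src)
    if sig.bit() == 0:
        return 4 + sig.bit()
    return 6 + sig.bits(2)

def read_high(sig: BitReader) -> int:
    # 1 -> 0, 000b -> 1 + b, 001bb -> 3 + bb, 01bbbb -> 7 + bbbb
    if sig.bit() == 1:
        return 0
    if sig.bit() == 1:
        return 7 + sig.bits(4)
    if sig.bit() == 0:
        return 1 + sig.bit()
    return 3 + sig.bits(2)

def decompress(signatures: bytes, data: bytes, out_size: int) -> bytes:
    sig, src, out = BitReader(signatures), iter(data), bytearray()
    while len(out) < out_size:
        if sig.bit() == 0:              # plain data byte
            out.append(next(src))
            continue
        length = read_length(sig, src)  # yyyy
        low = next(src)                 # lower byte of xxxx
        high = 0 if length == 2 else read_high(sig)
        start = len(out) - ((high << 8) | low)
        for i in range(length):         # byte by byte, so overlapping copies work
            out.append(out[start + i])
    return bytes(out)
```

Feeding it the example from the text (signatures 00001001 010100xx with the unused bits set to 0, i.e. #09 #50, and the eight data bytes) reproduces the original 11-byte sequence.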


Obviously, the higher byte is at most 23, so the maximal distance is 6143 bytes. To understand the algorithm, experiment with some compressed ASCII text. Complex? Yes, it is. Samir spent more than 30 days developing the algorithm, analysing some archivers and optimizing the compression speed (it is still slow, but acceptable, while the decompression speed is very high). He worked mostly on paper, because it was during the hardest days of summer 1993, without electric power, water and food (at that time we were losing 1 kg weekly), when only a miracle saved Sarajevo from falling. At that time we could not leave the army building, and while we waited for new battle tasks, Samir developed the compression algorithm...


Z80 FILES


The Warajevo Z80 format is compatible with the format of the emulator 'Z80' written by Gerton Lunter. When loading, the emulator is compatible with releases up to and including 3.03, but Warajevo always saves snapshot files in the format of release 2.0 of Lunter's emulator. This format is as follows (be aware of some Timex specifics):

Byte  Length  Description
0     1       A register
1     1       F register
2     2       BC register pair (LSB, i.e. C, first)
4     2       HL register pair
6     2       Always 0 (the emulator creates snapshot files like release 2.0 of Lunter's 'Z80' emulator)
8     2       Stack pointer (SP)
10    1       Interrupt register (I)
11    1       Refresh register (R), bit 7 is not significant!
12    1       Various flags:
              Bit D0: bit 7 of the R-register
              Bits D1-D3: border color
              Bit D4: always 0
              Bit D5: always 1 (i.e. blocks are always compressed)
              Bits D6-D7: not used
13    2       DE register pair
15    2       BC' register pair
17    2       DE' register pair
19    2       HL' register pair
21    1       A' register
22    1       F' register
23    2       IY register (again LSB first)
25    2       IX register
27    1       Interrupt flip-flop IFF1 (0 = DI, otherwise EI)
28    1       IFF2 (not particularly important...)
29    1       Interrupt mode (0, 1 or 2)
30    2       Length of additional header block (contains 23)
32    2       Program counter (PC)
34    1       Hardware mode: 0 = Spectrum 48K, 1 = Spectrum 48K with ZX Interface 1, 3 = Spectrum 128K, 4 = Spectrum 128K with ZX Interface 1, 128 = Timex Sinclair 2068
35    1       In the 128 version, contains the last OUT to 32765; in the Timex version, contains the last OUT to port 244
36    1       Contains 255 if the Interface 1 ROM is paged in, except in the Timex version; in the Timex version, contains the last OUT to port 255
37    1       Some flags:
              Bit D0: always 1 (i.e. active R register emulation)
              Bit D1: always 1 (i.e. always full LDIR emulation)
38    1       Last OUT to port 65533 (AY register number, 128 version) or to port 245 (Timex version)
39    16      Contents of the AY chip registers (128 or Timex version)
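The header table above maps directly onto a fixed 55-byte structure (30 + 2 + 23 bytes). The following sketch unpacks it; the field names and the function `parse_z80_header` are hypothetical, and all 16-bit values are assumed to be LSB first, as the table states for BC and IY.

```python
import struct

# One format character per table row: B = 1 byte, H = 2 bytes (LSB first).
HEADER = struct.Struct("<BBHHHHBBBHHHHBBHHBBBHHBBBBB16s")
FIELDS = ("a f bc hl zero sp i r flags de bc_ de_ hl_ a_ f_ iy ix "
          "iff1 iff2 im extra_len pc hw out1 out2 flags2 ay_reg ay").split()

def parse_z80_header(data: bytes) -> dict:
    h = dict(zip(FIELDS, HEADER.unpack_from(data, 0)))
    # Bit 7 of R is stored in bit D0 of the flags byte at offset 12.
    h["r"] = (h["r"] & 0x7F) | ((h["flags"] & 1) << 7)
    # Border color lives in bits D1-D3 of the same flags byte.
    h["border"] = (h["flags"] >> 1) & 7
    return h
```

The memory blocks start right after this header, i.e. at offset 32 plus the value stored at offset 30.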


Hereafter a number of memory blocks follow, each containing the compressed data of a 16K block. The structure of a memory block is:

Byte  Length  Description
0     2       Length of data (without this 3-byte header)
2     1       Page number of block
3     ?       Compressed data
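Walking through the memory blocks is then a matter of reading the 3-byte block header and skipping the given number of data bytes. A minimal sketch, assuming the length is stored LSB first and that the caller supplies the offset of the first block (55 for the header described above); `iter_memory_blocks` is a hypothetical name.

```python
import struct

def iter_memory_blocks(snapshot: bytes, offset: int = 55):
    """Yield (page_number, compressed_data) for each memory block."""
    while offset + 3 <= len(snapshot):
        length, page = struct.unpack_from("<HB", snapshot, offset)
        start = offset + 3
        if start + length > len(snapshot):
            break  # truncated block: stop rather than return partial data
        yield page, snapshot[start:start + length]
        offset = start + length
```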


The pages are numbered, depending on the hardware mode, in the following way:

Page  In 48 or Timex version   In 128 version
0     Standard 48 ROM          Standard 128 ROM
1     Interface 1 shadow ROM   Interface 1 shadow ROM
2     -                        Derby 128 ROM
3     -                        RAM, page 0 (49152-65535)
4     RAM, 32768-49151         RAM, page 1 (49152-65535)
5     RAM, 49152-65535         RAM, page 2 (49152-65535 or 32768-49151)
6     -                        RAM, page 3 (49152-65535)
7     -                        RAM, page 4 (49152-65535)
8     RAM, 16384-32767         RAM, page 5 (49152-65535 or 16384-32767)
9     -                        RAM, page 6 (49152-65535)
10    -                        RAM, page 7 (49152-65535)


In the 48 version, pages 4, 5 and 8 are saved. In the 128 version, all pages from 3 to 10 are saved. Pages are saved in numerical order. In the Timex version of the emulator, the contents of any memory expansions (DOCK etc.) are not kept in snapshot files, but in separate DCK files, for the sake of flexibility.

The compression method is very simple: it replaces repetitions of at least five equal bytes by a 4-byte code #ED #ED #xx #yy, which stands for 'byte #yy repeated #xx times'. Only sequences of length at least 5 are coded. The only exception is sequences consisting of #ED's; if they are encountered, even two #ED's are encoded into #ED #ED #02 #ED. Finally, every byte directly following a single #ED is not taken into a block, for example #ED #00 #00 #00 #00 #00 #00 is not encoded into #ED #ED #ED #06 #00 but into #ED #00 #ED #ED #05 #00.
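The rules of this run-length scheme can be sketched as a pair of functions. This is an illustration only (the names `z80_compress` and `z80_decompress` are hypothetical); runs longer than 255 bytes are simply split, which is an assumption the text does not cover.

```python
def z80_decompress(data: bytes) -> bytes:
    out, i = bytearray(), 0
    while i < len(data):
        if data[i] == 0xED and i + 3 < len(data) and data[i + 1] == 0xED:
            count, value = data[i + 2], data[i + 3]
            out += bytes([value]) * count   # #ED #ED #xx #yy: #yy repeated #xx times
            i += 4
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def z80_compress(data: bytes) -> bytes:
    out, i, n = bytearray(), 0, len(data)
    while i < n:
        b, run = data[i], 1
        while i + run < n and data[i + run] == b and run < 255:
            run += 1
        if run >= 5 or (b == 0xED and run >= 2):
            out += bytes([0xED, 0xED, run, b])
            i += run
        else:
            out.append(b)
            i += 1
            if b == 0xED and i < n:
                # The byte directly following a single #ED is never
                # taken into a run; it is copied as it is.
                out.append(data[i])
                i += 1
    return bytes(out)
```

With this sketch, #ED followed by six zeros compresses to #ED #00 #ED #ED #05 #00, as in the example above.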

Important: because we want to be compatible with Lunter's snapshot format, we do not put into the snapshot file some things that are important for the 'edge recognizer', ZX Printer emulation, and RS232 emulation. So if you create a snapshot file (by pressing the F7 key) while loading a program with an unusual loading routine, or while writing to the ZX Printer, the program will not always continue correctly when you load the snapshot file back into the emulator. We save snapshot files exactly at the moment when the Z80 interrupt starts (i.e. after the 'vertical retrace' signal), to eliminate problems with keeping timing data in a snapshot file.


MDR FILES


MDR files consist of a number of sectors (from 10 to 254), where every sector is 543 bytes long. If the file length is not divisible by 543, the extra bytes are simply ignored. In Lunter's emulator, the number of sectors is always 254, and at the end of the file there is always one extra byte. If you create an MDR file from the environment of the Warajevo emulator, the created file will always have one extra byte, so if you select a length of 254 sectors, the file format will be compatible with Lunter's format (the same format is used by the emulator Spectator for the QL by Carlo Delhez and by XZX for XWindows by Erik Kunze and Des Herriott). In Lunter's emulator this extra byte determines whether the cartridge is write-protected, but in our emulator this byte has no meaning. Our emulator determines the write-protect status using the read-only attribute of the MDR file.
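Splitting an MDR image into sectors follows directly from the rule above: whole 543-byte sectors are used and any trailing bytes are ignored. A minimal sketch (the name `split_sectors` is hypothetical):

```python
SECTOR_SIZE = 543

def split_sectors(mdr: bytes) -> list:
    """Return the list of complete 543-byte sectors; extra bytes are ignored."""
    count = len(mdr) // SECTOR_SIZE
    return [mdr[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE] for i in range(count)]
```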

On the real cartridge, the structure of one sector is as follows. After a small gap, 12 bytes follow (10 zeros and 2 #FF bytes) which represent the lead-in signal, then 15 bytes of the sector header, the last byte of which is the checksum (modulo 255, not 256). After this follow a new gap, a new lead-in signal, and then a 528-byte data block. The first 15 bytes of the data block are the block descriptor (with a structure similar to the sector header), the next 512 bytes are the actual data, and finally the last byte is the data checksum (also modulo 255). The lead-in signal is used only for synchronization (with real hardware), so we do not keep it in the MDR file. That's why the sector length is 15+528=543 bytes.

The physical structure of the sector is not important for the emulator itself (it is important only for the ROM routines), because the emulator only buffers values sent to port 231 (or read from port 231) and updates the buffer at the right time (see chapter 7.3.4.). The buffer alternately changes its maximal length from 15 to 528 bytes and vice versa, because on a real microdrive we always have the sequence header/block/header/block etc. However, the environment has commands for viewing and changing a whole sector area (see chapter 3.4.2), so a description of the meaning of all bytes in a sector may be useful:

Byte  Length  Name     Description
0     1       HDFLAG   Header flag byte; bit D0 is 1 to mark a header, other bits are not used
1     1       HDNUMB   Logical number of the sector
2     2       HDEMPTY  Not used
4     10      HDNAME   Cartridge name (i.e. the name given by FORMAT), padded with blanks
14    1       HDCHK    Header checksum modulo 255 (checksum of the first 14 bytes)
15    1       RECFLG   Data block flag byte:
                       Bit D0: Always 0, to mark a data block
                       Bit D1: EOF bit, set if this sector is the last sector assigned to some file
                       Bit D2: Clear if the block belongs to a file created using PRINT#
                       Bits D3-D7: Not used (always 0)
16    1       RECNUM   Determines which part of the file lies in this sector (counting starts from 0)
17    2       RECLEN   Actual data length in the block (≤512); if the block is not an EOF block, RECLEN must be 512; an unallocated sector has RECLEN=0 with the EOF bit clear, and a bad sector also has RECLEN=0, but with the EOF bit set
19    10      RECNAM   Name of the file allocated in the sector (padded with blanks)
29    1       DESCHK   Checksum of the descriptor (i.e. of the preceding 14 bytes) modulo 255
30    512     DATA     Actual data
542   1       DCHK     Checksum of the actual data (of the whole 512 bytes, even if RECLEN<512) modulo 255
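The sector layout above can be unpacked and its three checksums verified as in the following sketch. Two assumptions are made here: "checksum modulo 255" is read as the plain sum of the bytes taken modulo 255, and RECLEN is taken to be stored LSB first. The names `checksum255` and `parse_sector` are hypothetical.

```python
def checksum255(block: bytes) -> int:
    # Assumption: the checksum is the sum of all bytes, modulo 255.
    return sum(block) % 255

def parse_sector(sector: bytes) -> dict:
    return {
        "is_header": bool(sector[0] & 1),                  # HDFLAG bit D0
        "number": sector[1],                               # HDNUMB
        "cartridge": sector[4:14].decode("ascii", "replace").rstrip(),
        "header_ok": sector[14] == checksum255(sector[0:14]),
        "eof": bool(sector[15] & 2),                       # RECFLG bit D1
        "print_file": not (sector[15] & 4),                # RECFLG bit D2 clear
        "recnum": sector[16],
        "reclen": int.from_bytes(sector[17:19], "little"),
        "file": sector[19:29].decode("ascii", "replace").rstrip(),
        "desc_ok": sector[29] == checksum255(sector[15:29]),
        "data": sector[30:542],
        "data_ok": sector[542] == checksum255(sector[30:542]),
    }
```

A sector whose `header_ok` is false is the kind of sector the emulator treats as a gap.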


The first 9 bytes of the actual data in the sector which has RECNUM=0, for files created using the SAVE* command rather than PRINT# (so-called non-PRINT files), contain additional information about the file:

Byte Length Description
0 1 File type (0-BASIC, 1/2-DATA, 3-CODE)
1 2 Number of bytes into the file
3 2 Start address of the file
5 2 Length of program zone (only for BASIC)
7 2 Autorun line (only for BASIC)
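These nine bytes can be unpacked as follows. A small sketch, assuming the 16-bit values are stored LSB first; `parse_file_header` and the dictionary keys are hypothetical names.

```python
import struct

FILE_TYPES = {0: "BASIC", 1: "DATA", 2: "DATA", 3: "CODE"}

def parse_file_header(data: bytes) -> dict:
    ftype, length, start, prog_len, autorun = struct.unpack_from("<BHHHH", data, 0)
    return {
        "type": FILE_TYPES.get(ftype, "unknown"),
        "length": length,
        "start": start,
        "program_length": prog_len,   # meaningful only for BASIC
        "autorun": autorun,           # meaningful only for BASIC
    }
```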


The emulator treats every sector with a wrong header checksum as a gap. Also, during emulation of the IN 239 instruction (reading of the status register), an artificial gap is "added" between sectors, and between the header and the data block.


DCK FILES


DCK files keep information about the memory content of various Timex memory expansions, and about which chunks of extra memory are RAM chunks and which are ROM chunks. Such files have a relatively simple format. A DCK file begins with a nine-byte header. The first byte is the bank ID, with the following meaning:

0: DOCK bank (the most frequent variant)
1-253: Reserved for expansions which allow more than three 64 Kb banks (not implemented at this moment)
254: EXROM bank (using this ID you may insert RAM or ROM chunks into EXROM bank, such hardware units exist on real Timex Sinclair)
255: HOME bank (mainly useless, HOME content is typically stored in a Z80 file); however, using this bank ID you may replace content of Timex HOME ROM, or turn Timex HOME ROM into RAM


This numbering of banks follows the convention used in various routines from the Timex ROM.

After the first byte, the following eight bytes correspond to the eight 8K chunks in the bank. The organization of each byte is as follows:

bit D0: 0 = read-only chunk, 1 = read/write chunk
bit D1: 0 = memory image for corresponding chunk is not present in DCK file, 1 = memory image is present in DCK file
bits D2-D7: reserved (all zeros)


To be more clear, these bytes will have the following values (as follows from bits D0 and D1):

0: read-only chunk, memory image not present in the DCK file
1: read/write chunk, memory image not present in the DCK file (empty RAM)
2: read-only chunk, memory image present in the DCK file
3: read/write chunk, memory image present in the DCK file

After the header, a pure image of each present chunk is stored in the DCK file. Some examples will help in understanding this organization. A 16 Kb long LROS program needs the header 0,2,2,0,0,0,0,0,0 in front of the pure binary image of the program. A 24 Kb long AROS program needs the header 255,0,0,0,0,2,2,2,0 in front of its binary image to become a valid DCK file. A 64 Kb DOCK RAM disc cartridge (64K of empty RAM) may be described by a DCK file only 9 bytes long, with the content 0,1,1,1,1,1,1,1,1. A 32 Kb EXROM RAM disc cartridge mapped at address 32768 may also be described by a 9-byte DCK file, with the content 254,0,0,0,0,1,1,1,1. If you put the 9-byte header 255,2,2,0,0,0,0,0,0 in front of the binary image of the standard ZX Spectrum ROM, you will get a DCK file which replaces the Timex HOME ROM with the ordinary Spectrum ROM (i.e. you will get a Timex Sinclair 2048). Finally, if you put the header 255,3,3,0,0,0,0,0,0 in front of the binary image of the Timex HOME ROM, you will allow writing into the HOME ROM!

That's all if only one bank is stored in the DCK file. Otherwise, after the memory image, a new 9-byte header for the next bank follows, and so on.
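Reading a DCK file according to this description can be sketched as a simple loop over 9-byte headers, each followed by one 8K image per chunk whose bit D1 is set. The names `parse_dck` and `CHUNK` are hypothetical, and no error handling for truncated images is attempted.

```python
CHUNK = 8192  # each of the eight chunks in a bank is 8K long

def parse_dck(data: bytes) -> list:
    """Return a list of (bank_id, chunks); each chunk is a dict."""
    banks, pos = [], 0
    while pos + 9 <= len(data):
        bank_id, flags = data[pos], data[pos + 1:pos + 9]
        pos += 9
        chunks = []
        for f in flags:
            image = None
            if f & 2:                       # bit D1: image present in the file
                image = data[pos:pos + CHUNK]
                pos += CHUNK
            chunks.append({"writable": bool(f & 1), "image": image})
        banks.append((bank_id, chunks))
    return banks
```

For the 16 Kb LROS example, this yields one bank with ID 0 whose first two chunks carry read-only images and whose remaining six chunks are absent.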


NETWORK FILES


The organization of the net files is very simple. They are 260 bytes long (or more, but excess bytes will not be used), with the following structure:

Byte Length Description
0 2 Package ID (used for fast checking whether content of the Net file is changed)
2 2 Reserved; not yet used
4 256 Content of the package
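A net file can be read with a few lines of code. This sketch assumes the package ID is stored LSB first, which the table does not state; `parse_net_file` is a hypothetical name.

```python
def parse_net_file(data: bytes):
    """Return (package_id, content); bytes beyond the first 260 are ignored."""
    if len(data) < 260:
        raise ValueError("net file must be at least 260 bytes long")
    package_id = int.from_bytes(data[0:2], "little")  # assumed byte order
    return package_id, data[4:260]
```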


PALETTE FILES


The palette file WARAJEVO.PAL is a simple 48-byte binary file, in which the first three bytes correspond to the R, G and B values for color 0, the next three bytes correspond to RGB for color 1, etc. The first 24 bytes are related to BRIGHT 0 colors, and the next 24 bytes are related to BRIGHT 1 colors. All values are in the range 0-63. Why is this file binary instead of ASCII? Because it is very tedious to read an ASCII file format from pure assembler...
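Reading the palette is straightforward, as this sketch shows. The helper `to_8bit`, which rescales a 0-63 value to the 0-255 range, is an addition for illustration and is not part of the file format; both function names are hypothetical.

```python
def parse_palette(data: bytes):
    """Return (normal, bright): two lists of eight (R, G, B) tuples, values 0-63."""
    if len(data) < 48:
        raise ValueError("palette file must be 48 bytes long")
    colors = [tuple(data[i:i + 3]) for i in range(0, 48, 3)]
    return colors[:8], colors[8:]

def to_8bit(value: int) -> int:
    # Illustrative only: rescale a 6-bit component (0-63) to 0-255.
    return value * 255 // 63
```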


DATABASE FILES


We use the classic DBF database format. Only the meaning of the fields is explained here. The fields may have strange names, because the names are acronyms or words in the Bosnian language.

File: SOFTWARE.DBF

Field Name  Description                        Type  Width  Dcp
DODATNI     For future expansion               AN    10
GODISTE     Year of production                 AN    4
KOMPJUTER   Computer type                      AN    4
MATERJEZIK  Messages language                  AN    4
OPIS        Program description (descriptor)   MO    10
POZICIJA    Block position in the tape         NU    3      0
PRIORITET   Marking sign                       AN    1
PROBLEMI    Problems (coded)                   NU    5      0
PROIZVODJ   Producer                           AN    4
PUNOIME     Full name of the program           AN    30
SIMULOPC    Startup options                    AN    50
SKRACENO    Short name                         AN    10
TRAKA       Tape (or snapshot etc.) name       AN    10
VRSTA       Category of the program            AN    4


Record length: 149

The fields MATERJEZIK, PROIZVODJ and VRSTA don't contain the full text, but only an internal code (to save space). The full text which corresponds to the code is stored in the file ADDITION.DBF. In the field TRAKA, the last 2 characters represent an identification code which allows you to have several files with the same name if they are in different directories.

File: ADDITION.DBF

Field Name Description Type Width Dcp
REDPODAT See below AN 60


The first letter of REDPODAT is P, V, J, T, S, M or O, for producer, category, language, tape file, snapshot file, cartridge file, or other-type file respectively. If it is P, V or J, then four letters for a code follow, and then 55 letters for the corresponding name. In other cases, a file name with a file descriptor follows.
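Splitting a REDPODAT value according to these rules might look like the sketch below. The function name, the dictionary keys and the sample values used with it are hypothetical; for the T, S, M and O cases the remainder of the field is returned unsplit, since its inner layout is not detailed here.

```python
KINDS = {
    "P": "producer", "V": "category", "J": "language",
    "T": "tape file", "S": "snapshot file",
    "M": "cartridge file", "O": "other-type file",
}

def parse_redpodat(field: str) -> dict:
    kind = KINDS[field[0]]
    if field[0] in "PVJ":
        # 1 letter + 4-letter code + 55-letter name = 60 characters
        return {"kind": kind, "code": field[1:5], "name": field[5:60].rstrip()}
    return {"kind": kind, "file": field[1:60].rstrip()}
```

For instance, a made-up record "PDEMO" followed by "Example Producer" padded to 55 characters gives the code "DEMO" and the name "Example Producer".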


