Scan Range Counts Update + Release Format Notes

4 March 2002

Scan Range Counts Update

We did not think there was any point in improving this anyway, but we also noticed that the long scanning time occurs only if the gap track does not start at the disk index position - much rarer than previously estimated. Otherwise it is super fast to detect. If we did decide to reduce the processing time for this case, it would be a hack. We don’t like hacks.

Just for information: the scan range functionality is very complex to develop and to make “fast”, since it must consider values that are already ranged, not to mention that the analyser can search forward and backward from reference points (for found & filtered syncs) in order to properly recognize data. It is easier on a fixed amount of data (this was implemented quite easily) than on an undefined amount, as that depends on what is actually being read.

Release Format Notes

As has been previously said, the release format has not been defined yet. What is defined is the dump format - the raw disk images that are submitted by contributors using our dumping tool. The release format is obviously the form that games will be available in, and the form that emulators and our re-mastering tools will support.

However, it is not a big piece of work. We thought it more important to see exactly what is needed by first looking at the hundreds of dumps we have. Now we are in a much better position to define a release format, but of course, there is still more work to do.

Much of the internal support for the release format is already done anyway. We are thinking along the lines of something super easy, like a code + data area repeated several times. The code can be RAW or MFM encoded and the data area may contain repeated values (using an RLE scheme), so gap tracks can be perfectly compressed to 2 values, instead of being stored “as-is”.

RLE (Run Length Encoding) is a “lossless” compression: it results in exactly the same data after decompression. Say you have 100 bytes all containing the value 33. Instead of storing 100 bytes, you can just store the number of repeats (100) and then the value (33), and thus it only takes 2 bytes. This huge run of repeating values is similar to what gap tracks “look like”. Obviously, for continuously varying data, RLE could at worst double the space required, but it will work well for gap tracks.
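
To make the count-then-value idea concrete, here is a minimal sketch in Python. It is only an illustration, not the release format (which is not defined yet): the function names, the one-byte repeat count and the resulting limit of 255 repeats per pair are all assumptions made for this example.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs.

    Assumption for this sketch: the count is stored in one byte,
    so a run longer than 255 is split into several pairs.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Extend the run while the next byte repeats and the count still fits.
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        out.append(run)
        out.append(value)
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (count, value) byte pairs back into the original data."""
    if len(encoded) % 2 != 0:
        raise ValueError("encoded data must consist of (count, value) pairs")
    out = bytearray()
    for i in range(0, len(encoded), 2):
        count, value = encoded[i], encoded[i + 1]
        out.extend(bytes([value]) * count)
    return bytes(out)


gap = bytes([33]) * 100          # 100 bytes, all containing the value 33
packed = rle_encode(gap)         # 2 bytes: the count (100), then the value (33)
varying = bytes(range(10))       # no repeats at all
packed_varying = rle_encode(varying)  # one pair per byte: twice the size
```

With this simple scheme the 100 repeated bytes shrink to 2, while the 10 non-repeating bytes grow to 20, which is the “at worst double” case. Decoding either result gives back exactly the original bytes.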