radix40 encoding
I was inspired to design an original(?) text encoding for tiny embedded
computers. It is, however, similar to DEC RADIX 50 from 1965. (That's 50₈=40₁₀).
Since 40³ = 64000 < 65536, it is possible to store 3 symbols in each 16-bit word.
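To make the capacity claim concrete, here is a minimal sketch of plain positional base-40 packing. The names `pack3` and `unpack3` are made up for illustration, and this is the conventional layout, not necessarily the one used in b40.py. Note that unpacking this way needs division and remainder.

```python
def pack3(a, b, c):
    """Pack three symbol codes (each 0..39) into one integer, base 40."""
    assert all(0 <= s < 40 for s in (a, b, c))
    return (a * 40 + b) * 40 + c

def unpack3(word):
    """Conventional unpacking: peel off digits with division/remainder."""
    word, c = divmod(word, 40)
    a, b = divmod(word, 40)
    return a, b, c

# The largest possible value still fits in 16 bits.
assert pack3(39, 39, 39) == 40**3 - 1 == 63999
assert 63999 < 2**16
```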
In radix 40 you get the 26 basic alphabetic characters, 10 digits, and 4 additional symbols. I chose:
- End of string
- Space (ASCII 32)
- Exclamation point (ASCII 33)
- Double quote (ASCII 34)
The choice of 3 characters that are adjacent in ASCII saved code size on the decoder; initially I thought maybe "-" and "." would be useful choices.
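As a rough illustration of why adjacent ASCII symbols help, here is a sketch of a code-to-character mapping. The numbering (0 = end, 1..3 = punctuation, 4..29 = letters, 30..39 = digits) is an assumption for this example only; the post does not say which order the real decoder uses. The point is that one range check and one added constant cover all three punctuation symbols.

```python
def code_to_char(code):
    """Map a radix-40 symbol code to a character (hypothetical ordering)."""
    if code == 0:
        return None                          # end of string
    if code < 4:
        return chr(code + 31)                # 1..3 -> ASCII 32..34: space ! "
    if code < 30:
        return chr(code - 4 + ord("A"))      # 4..29 -> A..Z
    return chr(code - 30 + ord("0"))         # 30..39 -> 0..9
```

Had "-" (ASCII 45) and "." (ASCII 46) been chosen instead, space would no longer be adjacent to them and would need its own special case.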
Unlike RADIX 50, the encoding is arranged so that no division or remainder operation is needed. Instead, at each step of decoding, a 24-bit temporary value is multiplied by 40 and the top byte gives the output code. In the assembler version, the multiplication is coded as (x<-x*8; tmp<-x; x<-x*4; x<-x+tmp) since the MC6800 has no multiply instruction.
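One way such a division-free scheme can work is to treat each 16-bit word as a fixed-point fraction whose base-40 digits are the three symbols. The sketch below is my own reconstruction under that assumption; `encode3`, `times40` and `decode3` are hypothetical names, and the actual b40.py and b40.asm may differ in detail. The encoder rounds n·65536/64000 up, which guarantees that three successive multiply-by-40 steps push the right digits into the top byte.

```python
def encode3(a, b, c):
    """Encode three codes (0..39) as a 16-bit fixed-point fraction."""
    n = (a * 40 + b) * 40 + c
    # ceil(n * 65536 / 64000), using 65536/64000 == 128/125
    return (n * 128 + 124) // 125

def times40(x):
    """Multiply by 40 using only shifts and one add, as in the post."""
    x <<= 3          # x * 8
    tmp = x
    x <<= 2          # x * 32
    return x + tmp   # x * 40

def decode3(word):
    """Recover three codes with no division or remainder."""
    out = []
    for _ in range(3):
        t = times40(word)    # fits in 24 bits since word < 65536
        out.append(t >> 16)  # top byte is the next symbol code
        word = t & 0xFFFF    # low 16 bits carry on to the next step
    return tuple(out)

# Example: codes (1, 2, 3) survive the round trip.
assert decode3(encode3(1, 2, 3)) == (1, 2, 3)
```

The division in `encode3` only happens on the host that prepares the data; the embedded decoder never needs it.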
Here is the not-quite-matching Python encoder:
(View epvenhla/b40.py on codeberg.org)
And the decoder/test program in m6800 assembly:
(View epvenhla/b40.asm on codeberg.org)
The implementation costs 90 bytes of code and 6 bytes of zero-page (which can be used for other purposes when the routine is not running). I estimate you'd need somewhat above 320 characters of text in your program for it to be a benefit.
The m6800 decoder can over-read the data by 1 byte, which seldom poses a problem in such environments.
By the way, this post debuts improved code embeds from forgejo. On my end, I just have to write [junk epvenhla/b40.py [lang python]] and the blog renderer does the rest!
Website Copyright © 2004-2024 Jeff Epler