A STR is an unsigned long int. One special STR value, NOSTR, represents the zero-length string. No other collapsing of identical-valued strings is done.

The low 24 bits of a STR are an index into a table, allowing a little over 16 million (2^24) strings. This table is kept in 4K-entry segments for the sake of storage efficiency. Each entry corresponds to one STRDESC, but if the string is short enough (strlen(str) < sizeof(STRDESC)), the bytes of the string are overlaid on the STRDESC. When a string is overlaid this way, the trailing \0 is included, because there is then no explicit length field available. A vector of bits is kept to flag which entries are overlaid.

A non-overlaid string has .u.taken filled in, with .u.taken.core being meaningful only when .incore is set. .u.taken.usestamp is a last-use serial number, used to move strings around to improve locality in the string file; .u.taken.fileoff is an offset in bytes into the string file itself. For these strings the trailing \0 need not be stored explicitly; that's why there's a length field.

STRDESCs not currently in use are chained together through the .u.free.link field, rooted at freedescs; they have their .free bit set and hold their STR values in .u.free.id. No other STRDESC has .free set. Running out of STRDESC room is a fatal error; I don't expect this to be a problem.

Space in the string file is allocated in 4-byte units. The file length is known; free spaces in the file are kept in FREEFILE structures. When the array of FREEFILE structures fills up, the file is compacted, moving all free space to the end and coalescing it into a single block.
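
The arithmetic described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the actual implementation: the macro and function names (STR_INDEX_MASK, SEG_ENTRIES, str_index, str_segment, str_slot, fits_overlaid, file_alloc_size) are all hypothetical, the STRDESC size is passed in as a parameter because its layout is not given here, and nothing is assumed about what the bits above the low 24 hold.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned long STR;            /* a STR is an unsigned long int */

/* Hypothetical names; only the numbers come from the description. */
#define STR_INDEX_BITS 24
#define STR_INDEX_MASK ((1UL << STR_INDEX_BITS) - 1UL)  /* low 24 bits */
#define SEG_ENTRIES    4096UL                            /* 4K entries per segment */

/* Table index carried in the low 24 bits of a STR. */
static unsigned long str_index(STR s)
{
    return s & STR_INDEX_MASK;
}

/* Which 4K-entry segment of the table the index falls in. */
static unsigned long str_segment(STR s)
{
    return str_index(s) / SEG_ENTRIES;
}

/* Position of the entry within its segment. */
static unsigned long str_slot(STR s)
{
    return str_index(s) % SEG_ENTRIES;
}

/*
 * A string can be overlaid on its descriptor when
 * strlen(str) < sizeof(STRDESC), which leaves room for the trailing \0.
 * desc_size stands in for sizeof(STRDESC).
 */
static int fits_overlaid(const char *str, size_t desc_size)
{
    return strlen(str) < desc_size;
}

/* Round a byte count up to the 4-byte allocation unit of the string file. */
static unsigned long file_alloc_size(unsigned long nbytes)
{
    return (nbytes + 3UL) & ~3UL;
}
```

With 4096 entries per segment, a full 24-bit index space spans 4096 segments, so a segment is found with a divide (or a shift by 12) and the entry within it with a remainder (or a mask).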