This document is my own description of the S24 interface, written based
on documentation I am not prepared to distribute verbatim.  The
documentation appears to have been converted badly between formats and
may be missing fragments; I can only guess what may be missing.  The
description below assumes the CPU is a SPARC, which is not unreasonable
since the S24 exists only for the SS5.  Something marked NOTDOC here
means that the documentation I have describes it, but it does not seem
to me to be worth describing; usually, these are facilities that are of
use only to manufacturing test equipment, or things which must be set
to match the electronics surrounding some chip (and which thus must
always have particular values for the S24 to work).

If you're looking to see what sort of hardware acceleration the S24
has, you want to read the sections on the STIP and BLIT spaces.  Search
for "STIP SPACE" and "BLIT SPACE".

The S24 address space consists of several pieces, which populate the
card's SBus address space rather sparsely.  The holes between the
pieces read as undefined data and ignore writes; writes to read-only
addresses are accepted but ignored.  Some of these holes are apparently
intended to allow the existing regions to grow for a possible
higher-resolution variant of the card.

This table is a brief description of the various pieces.  Offset is the
offset from the card's base.  Name is the name used for this piece
elsewhere in this document.  Space is the amount of space reserved;
that is, it is the distance to the next piece.  Actual is the amount of
space that actually has something at it.  Type describes how the
contents are addressed; it can be:

1/4	Every fourth byte exists; the other 3 don't.  ("Actual" is
	computed as if the nonexistent three-quarters actually existed;
	that is, it is ending address minus starting address.)
4	32-bit accesses only
1248	1-byte, 2-byte, 4-byte, or 8-byte accesses all work.
8	64-bit accesses only

Some entries have - in some fields.  The documentation is not clear
what these are; I suspect they represent unassigned holes in the
address space map.  "Resv'd" fields are documented as reserved, though
for what is not clear.  User indicates which portions of the system the
space is intended to be used by: r = ROM (OBP), w = window system, u =
user code.  (Of course, this intent can be violated by kernel code if it
feels like it.)

Offset		Name	Space	Actual	Type	User
0x0000000	PROM	2M	128K	1/4	r
0x0200000	DAC	256K	32	1/4	rw
0x0240000	DHC	256K	32	1/4	r
0x0280000	ALT	512K	128	1/4	r
0x0301000[*]	THC	512K	4K	4	r
0x0381000[*]	-	512K	-	-	-
0x0400000	-	2M	-	-	-
0x0600000	-	1M	-	-	-
0x0701000[*]	TEC	512K	4K	4	rw
0x0781000[*]	-	512K	-	-	rw[sic]
0x0800000	DFB8	8M	1M	1248	rwu
0x1000000	Resv'd	8M
0x1800000	Resv'd	8M
0x2000000	DFB24	16M	4M	1248	rwu
0x3000000	-	16M	-	-	-
0x4000000	STIP	32M	8M	8	rwu
0x6000000	BLIT	32M	8M	8	rwu
0x8000000	-	32M	-	-	-
0xa000000	RDFB32	16M	4M	1248	rw
0xb000000	-	16M	-	-	rw[sic]
0xc000000	RSTIP	32M	8M	8	rw
0xe000000	RBLIT	32M	8M	8	rw

[*] - the table shows these addresses as given here, though that is
inconsistent with the way the sizes fit together.  Some code I have
seen adds 0x1000 to the ROM-mapped addresses for the THC and TEC; in my
experience this must not be done with the S24.

Normally, you will not care about these addresses even if you are
writing kernel code for the S24, because the OBP will have mapped the
various pieces for you.  There are thirteen pieces the PROM maps; the
documentation I have - not part of the paper doc set - indicates that
the correspondence between ROM-mapped register set array indices and
spaces is:

	Index	Space
	0	DFB8
	1	DFB24
	2	STIP
	3	BLIT
	4	RDFB32
	5	RSTIP
	6	RBLIT
	7	TEC
	8	CMAP (must be "DAC" above, by exclusion)
	9	THC
	10	ROM
	11	DHC
	12	ALT
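
For kernel code, the index table above can be written down as an
enumeration.  This is only a sketch; the identifier names are made up
here for illustration and do not come from any documentation or
existing driver.

```c
/* Hypothetical names for the ROM-mapped register set indices. */
enum s24_reg_index {
    S24_REG_DFB8   = 0,
    S24_REG_DFB24  = 1,
    S24_REG_STIP   = 2,
    S24_REG_BLIT   = 3,
    S24_REG_RDFB32 = 4,
    S24_REG_RSTIP  = 5,
    S24_REG_RBLIT  = 6,
    S24_REG_TEC    = 7,
    S24_REG_CMAP   = 8,   /* called DAC elsewhere in this document */
    S24_REG_THC    = 9,
    S24_REG_ROM    = 10,
    S24_REG_DHC    = 11,
    S24_REG_ALT    = 12,
    S24_REG_COUNT         /* thirteen register sets in all */
};

/* Printable names, indexed by enum s24_reg_index. */
static const char *const s24_reg_names[S24_REG_COUNT] = {
    "DFB8", "DFB24", "STIP", "BLIT", "RDFB32", "RSTIP", "RBLIT",
    "TEC", "CMAP", "THC", "ROM", "DHC", "ALT"
};
```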

Brief descriptions of the various spaces:

PROM	OpenBoot PROM code
DAC	RAMDAC (colour lookup tables, primarily)
DHC	DAC hardware configuration registers
ALT	Pixel clock synthesizer
THC	Timing and hardware configuration
TEC	Timing & cursor
DFB8	Dumb 8bpp framebuffer
DFB24	Dumb 24bpp framebuffer (each pixel is 32 bits)
STIP	24bpp stipple space
BLIT	24bpp blit space
RDFB32	Raw 26bpp dumb framebuffer (each pixel is 32 bits)
RSTIP	26bpp stipple space
RBLIT	26bpp blit space

The hardware actually has 26 bits per pixel.  The high two bits
determine the interpretation of the other 24:

00 xxxxxxxx xxxxxxxx iiiiiiii
	8bpp PseudoColor mode.  iiiiiiii is an 8-bit value which
	indexes into the three colour lookup tables to determine an RGB
	triple for this pixel.  The xxxx bits are ignored.

01 bbbbbbbb gggggggg rrrrrrrr
	24bpp DirectColor mode.  rrrrrrrr, gggggggg, and bbbbbbbb are
	three 8-bit values which index into the corresponding colour
	lookup tables to determine an RGB triple for this pixel.

10 bbbbbbbb gggggggg rrrrrrrr
	24bpp gamma-corrected TrueColor mode.  rrrrrrrr, gggggggg, and
	bbbbbbbb are mapped through built-in read-only gamma-correction
	lookup tables to determine an RGB triple for this pixel.  (It
	doesn't say what the gamma exponent is; presumably it's some
	kind of compromise among what Sun's various monitors need.)

11 bbbbbbbb gggggggg rrrrrrrr
	24bpp non-gamma-corrected TrueColor mode.  rrrrrrrr, gggggggg,
	and bbbbbbbb directly give an RGB triple for this pixel.
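
The four layouts above differ only in the two mode bits, so packing a
26-bit pixel can be sketched in C as follows.  The names are invented
for illustration; the bit positions are simply read off the layouts
above (mode in bits 25-24, then blue, green, red).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two high pixel bits. */
enum s24_pixel_mode {
    S24_MODE_PSEUDO8  = 0,  /* 00: 8bpp PseudoColor */
    S24_MODE_DIRECT24 = 1,  /* 01: 24bpp DirectColor */
    S24_MODE_GAMMA24  = 2,  /* 10: gamma-corrected TrueColor */
    S24_MODE_LINEAR24 = 3   /* 11: non-gamma-corrected TrueColor */
};

/* Build a 26-bit pixel value from a mode and an RGB triple. */
static uint32_t s24_pack_pixel(enum s24_pixel_mode mode,
                               uint8_t r, uint8_t g, uint8_t b)
{
    assert((unsigned)mode <= 3u);
    return ((uint32_t)mode << 24) | ((uint32_t)b << 16) |
           ((uint32_t)g << 8) | (uint32_t)r;
}

/* PseudoColor pixel: only the low 8 bits (the index) are significant. */
static uint32_t s24_pack_index(uint8_t index)
{
    return s24_pack_pixel(S24_MODE_PSEUDO8, index, 0, 0);
}
```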

PROM SPACE
----------

PROM space simply provides the OpenBoot code for the S24.  Doing a
32-bit read from address X<<2 returns the Xth byte of the ROM;
different pieces of the documentation disagree over whether the data
appears in the high 8 bits or the low 8 bits of the value read.  The
entire space is read-only.  The actual data content, of course, is only
a quarter what the amount of address space occupied would imply, since
24 of each 32 bits do not actually exist.
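
A sketch of reading one ROM byte, in C.  Since the documentation
disagrees with itself about which byte lane carries the data, the lane
is left as a parameter here rather than guessed at; the function and
type names are hypothetical, and prom is assumed to point at PROM space
already mapped by someone else.

```c
#include <stddef.h>
#include <stdint.h>

/* Which byte of the 32-bit word holds the ROM data is not settled. */
enum s24_prom_lane { S24_PROM_HIGH_BYTE, S24_PROM_LOW_BYTE };

/* Return ROM byte number x.  Viewing PROM space as an array of 32-bit
 * words, word x sits at byte address x<<2. */
static uint8_t s24_prom_byte(const volatile uint32_t *prom, size_t x,
                             enum s24_prom_lane lane)
{
    uint32_t word = prom[x];
    return (uint8_t)(lane == S24_PROM_HIGH_BYTE ? word >> 24
                                                : word & 0xffu);
}
```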

DAC SPACE
---------

The RAMDAC chip is an AT&T ATT20C567.

DAC space holds the colour lookup tables and some associated control
registers.  There are five registers, three of which appear twice:
	Address		Register
	0x0200000	Address register (copy 1)
	0x0200004	Pixel lookup table (copy 1)
	0x0200008	Control register 1
	0x020000c	Overlay lookup table (copy 1)
	0x0200010	Address register (copy 2)
	0x0200014	Pixel lookup table (copy 2)
	0x0200018	Control register 2
	0x020001c	Overlay lookup table (copy 2)
The lookup tables and control registers are all heavily multiplexed;
each register above (except the address register itself) actually
accesses one of many registers, depending on what value the address
register holds.  All values appear in the high byte of the 32-bit value
read or written.  To access a register, its address needs to be written
to the address register, then the appropriate indirect register needs
to be read or written.
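
The select-then-access sequence for the registers behind control
register 1 might look like this in C.  This is a sketch only: the
struct layout is my reading of the first four entries of the address
table above, the names are invented, and it assumes the address value
travels in the high byte like all other values.

```c
#include <stdint.h>

/* First copy of the DAC registers, at offsets 0x0, 0x4, 0x8, 0xc. */
struct s24_dac {
    volatile uint32_t addr;    /* address register */
    volatile uint32_t pixlut;  /* pixel lookup table */
    volatile uint32_t ctl1;    /* control register 1 */
    volatile uint32_t ovllut;  /* overlay lookup table */
};

/* Write one of the indirect registers behind control register 1. */
static void s24_dac_ctl1_write(struct s24_dac *dac, uint8_t reg, uint8_t val)
{
    dac->addr = (uint32_t)reg << 24;  /* select the indirect register */
    dac->ctl1 = (uint32_t)val << 24;  /* data goes in the high byte */
}

/* Read one of the indirect registers behind control register 1. */
static uint8_t s24_dac_ctl1_read(struct s24_dac *dac, uint8_t reg)
{
    dac->addr = (uint32_t)reg << 24;
    return (uint8_t)(dac->ctl1 >> 24);
}
```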

Pixel lookup tables consist of three 256-byte tables, one each for red,
green, and blue.  To access a lookup table entry, first set the address
register to the pixel index desired, then perform three consecutive
accesses to the lookup table register; they access the red, green, and
blue table values for that index.  When accessing pixel lookup tables,
the address register increments automatically after the blue access, so
accessing consecutive indices does not require repeatedly writing the
address register.  There is a note that the address register must be
written to 0 after writing pixel lookup tables or the RAMDAC will
malfunction in some unspecified way.
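
Loading a run of pixel lookup table entries, following the sequence
just described, can be sketched in C like this.  The names are
hypothetical; addr and lut are assumed to point at the address register
and the pixel lookup table register of the same copy, and values are
placed in the high byte as noted earlier.

```c
#include <stddef.h>
#include <stdint.h>

struct s24_rgb { uint8_t r, g, b; };

/* Load n consecutive pixel lookup table entries, starting at index
 * first, relying on the auto-increment after each blue access. */
static void s24_load_pixel_lut(volatile uint32_t *addr,
                               volatile uint32_t *lut,
                               uint8_t first,
                               const struct s24_rgb *e, size_t n)
{
    size_t i;

    *addr = (uint32_t)first << 24;
    for (i = 0; i < n; i++) {
        *lut = (uint32_t)e[i].r << 24;
        *lut = (uint32_t)e[i].g << 24;
        *lut = (uint32_t)e[i].b << 24;  /* address increments after this */
    }
    *addr = 0;  /* per the note: leave the address register at 0 */
}
```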

Overlay lookup tables are just like pixel lookup tables except the
tables have only 4 entries instead of 256, so the valid addresses run
from 0 to 3 instead of 0 to 255; also, it's not clear whether the
automatic address increment applies here - the doc is silent.

Control register 1 provides access to six registers.  These registers
and the addresses that access them are:
	Address		Register
	00		ID
	01		Revision
	04		Read mask
	05		Blink mask
	06		Control 0
	07		Test 0
(addresses 2 and 3 are documented as reserved, returning 0 on read;
addresses 8 and above are not documented at all).  These registers are:

ID
	This read-only register identifies the RAMDAC type.  For the
	'567, it reads as 0x54.

Revision
	This read-only register identifies the RAMDAC revision.  It may
	show either of two values:
	0x67	'567 mode
	0x46	'467 mode
	The documentation is unclear on the difference between these;
	perhaps the 20C567 doc is more informative?

Read mask
	The read mask is ANDed with all pixel values before performing
	table lookups.  It's fairly clear this applies to 8bpp
	PseudoColor and 24bpp DirectColor modes; it's not clear whether
	it affects either of the TrueColor modes.  The same mask is
	used for all three primaries in 24bpp mode.

Blink mask
	The blink mask is used in all lookup modes, apparently
	including gamma-corrected TrueColor mode.  Normally this
	register is all 0; when a bit is set here, that bitplane in all
	pixels blinks at a rate selected by control 0.  (It's not clear
	whether the blinking is between pixel and pixel&~mask or
	between pixel and pixel^mask, though descriptions of the
	control 0 register imply the former.  The former is also more
	likely to be useful, it seems to me.)

Control 0
	This register contains various control bits, supposedly set to
	0x30 by reset, though I suspect this value should be 0x60.
	Some descriptions here speak of the odd and even overlay
	inputs; it's not clear exactly what these mean.
	0x80	Reserved, MBZ
	0x40	RAM/Overlay select
		If 0, all pixel values are ignored and only the overlay
			inputs (= the cursor, for the S24) are used.
			Overlays blink to overlay colour 0.
		If 1, when the overlay bits are 00, pixel data is
			displayed instead of overlay colour 0.  Overlay
			lookup entry [0] is ignored, and overlays blink
			to pixel data.
	0x30	Blink rate select (X/Y = X frames on, Y frames off)
		0x00	16/48
		0x10	16/16
		0x20	32/32
		0x30	64/64
	0x08	Overlay 1 blink enable.  When 1, enables the odd
		overlay inputs blinking to 0.
	0x04	Overlay 0 blink enable.  When 1, enables the even
		overlay inputs blinking to 0.
	0x02	Overlay 1 display enable.  ANDed with the odd overlay
		inputs before processing.
	0x01	Overlay 0 display enable.  ANDed with the even overlay
		inputs before processing.
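
	The bit assignments above, written as C definitions.  The macro
	names are invented for illustration, and the helper merely ORs
	together one plausible combination (pixel data visible, both
	overlay inputs enabled, no overlay blinking); it is not a value
	taken from any documentation.

```c
#include <stdint.h>

#define S24_CTL0_RAM_SELECT   0x40u  /* 1 = pixels show where overlay is 00 */
#define S24_CTL0_BLINK_MASK   0x30u  /* blink rate select field */
#define S24_CTL0_BLINK_16_48  0x00u
#define S24_CTL0_BLINK_16_16  0x10u
#define S24_CTL0_BLINK_32_32  0x20u
#define S24_CTL0_BLINK_64_64  0x30u
#define S24_CTL0_OVL1_BLINK   0x08u
#define S24_CTL0_OVL0_BLINK   0x04u
#define S24_CTL0_OVL1_ENABLE  0x02u
#define S24_CTL0_OVL0_ENABLE  0x01u

/* Illustrative value: pixel data plus both overlay planes, no blink
 * enables, with the given blink rate field. */
static uint8_t s24_ctl0_value(uint8_t blink_rate)
{
    return (uint8_t)(S24_CTL0_RAM_SELECT |
                     (blink_rate & S24_CTL0_BLINK_MASK) |
                     S24_CTL0_OVL1_ENABLE | S24_CTL0_OVL0_ENABLE);
}
```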

Test 0
	This register appears to be designed for snooping the RAMDAC
	inputs.  It's not clear it has any real use except during board
	test, but the doc describes it, though somewhat confusingly.
	(The 0x07 bits speak of the high or low nibble being _written_,
	even though the rest of the description appears to describe
	_reading_ the DAC inputs.  Maybe it permits overriding them?)
	0xf0	DAC input data.  These bits show current input bits to
		the red, green, or blue DAC.  The other bits control
		which nibble and which DAC.
	0x08	Nibble select.  If 0, selects the high nibble to drive
		the 0xf0 bits; if 1, the low.
	0x04	Blue DAC select.  Selects the blue DAC inputs.
	0x02	Green DAC select.  Selects the green DAC inputs.
	0x01	Red DAC select.  Selects the red DAC inputs.
	It's not clear what happens if other than exactly one of the
	low three bits is set.

Control register 2 provides access to eight registers.  These registers
and the addresses that access them are:
	Address		Register
	08		Control 1
	09		Control 2
	0b		Test 1
	0c		Red signature analysis
	0d		Green signature analysis
	0e		Blue signature analysis
	10		Control 3
	11		Control 4
(addresses a and f are documented as reserved, not further described;
addresses below 8 or above 11 are not documented at all).  In the
following, "RWI" means the bits are read/write, but they are not
connected to anything and are ignored; "SBZ" means they may be written
with any value, but 0 should be used for forward compatibility.  The
registers are:

Control 1
	0xc0	RWI ("formerly the modulo counters", whatever that means)
	0x3f	SBZ

Control 2
	0x80	Reserved, RWI
	0x40	Blanking pedestal enable.  When 0, black level equals
		blank level.  When 1, there is a 7.5% blank pedestal.
	0x30	Reserved, RWI
	0x08	SYOUT source.  SYOUT generated from SYNC (when 0) or
		BLANK (when 1).
	0x04	SYOUT enable.  SYOUT enabled when 1, disabled when 0.
	0x02	SYOUT/SYNC select.  If 0, the SYOUT pin is driven as
		selected by 0x08 bit; if 1, driven by SYNC.
	0x01	Test type.  0 = signature analysis; 1 = data strobe.

Test 1
	0xf0	Reserved, RWI
	0x08	DAC output compare.  RO, indicating whether one or more
		of R, G, or B outputs exceeds 340mV; used to detect
		presence of a monitor.  1 = voltage less than
		reference, 0 = more than reference.
	0x07	Reserved, RWI, SBZ ("formerly, pixel select for
		signature analysis").

Red signature analysis
Green signature analysis
Blue signature analysis
	These registers either operate as LFSRs driven by DAC inputs
	(if Control 2, Test type, is 0) or a snapshot of DAC inputs (if
	1).  Used for manufacturing tests.  NOTDOC.

Control 3
	0x80	Mode enable.  0 = '467 compat mode, emulating "an 8:1
		multiplexed device", whatever that is; 1 = enhanced
		mode; "enhanced true color visuals and pixel controls
		bits are enabled".
	0x40	Reserved, SBZ
	0x20	Pixel interface timing.  0 = '458/'467 compatible LOAD
		sync; device loading is as for '458 or '467 devices.
		1 = '567 compatible LOAD sync; SCLK output enabled.
	0x10	Sleep enable.  0 = normal operation; 1 = sleep mode.
		In sleep mode, the DAC is turned off and palette RAM is
		powered down (but retains data).  Control registers are
		still accessible.
	0x08	Override pixel.  0 = decode high two bits of pixel to
		control switching; 1 = bits 0x07 control switching.
	0x04	Bypass.  0 = pass data through mapping tables (RAM
		lookup tables or gamma ROM); 1 = bypass mapping.
	0x02	Index.  0 = TrueColor/DirectColor; 1 = PseudoColor (red
		data replicated to index green and blue tables too).
	0x01	Gamma.  This controls which table feeds the bypass
		multiplexor.  0 = use color RAM; 1 = use gamma ROM.

Control 4
	0x80	PLL enable.  NOTDOC.
	0x40	Test enable.  0 = normal operation, TEST1 and TEST2
		pins tristated; 1 = test mode, drives test data.
	0x30	BPP select.  I don't understand this register; it looks
		like an alternative to the two high pixel bits, but
		that makes no sense.  NOTDOC.
	0x0c	Reserved, RWI
	0x03	Active pixel pin count.  NOTDOC.

DHC SPACE
---------
This appears to be identical to DAC space; if there is a difference,
I'm not sure what it is.

ALT SPACE
---------
"Pixel clock synthesizer".  NOTDOC.

THC SPACE
---------
The THC and TEC share many registers.  This is the THC view of them.
Most of them are NOTDOC; the ones that are documented are marked *.

Address		Register
0x0301000	Config
0x0301080	Sensebus
0x0301090	Delay
0x0301094	Strap
0x030109c	Current line counter
0x03010a0	Horizontal sync start
0x03010a4	Horizontal sync end
0x03010a8	Horizontal display start
0x03010ac	Horizontal sync end during vsync
0x03010b0	Horizontal display end
0x03010c0	Vertical sync start
0x03010c0	Vertical display start
	(yes, the previous two are shown as being at the same address)
0x03010c4	Vertical sync end
0x03010c8	Vertical display start
0x03010cc	Vertical display end
0x0301200	Blit RAM pixels 0 and 1
0x0301208	Blit RAM pixels 2 and 3
0x0301210	Blit RAM pixels 4 and 5
...
0x0301270	Blit RAM pixels 28 and 29
0x0301278	Blit RAM pixels 30 and 31
0x0301818	THC misc
0x03018fc *	Cursor address
0x0301900 *	Cursor A00
0x0301904 *	Cursor A01
...
0x0301978 *	Cursor A30
0x030197c *	Cursor A31
0x0301980 *	Cursor B00
0x0301984 *	Cursor B01
...
0x03019f8 *	Cursor B30
0x03019fc *	Cursor B31

Documentation on the documented registers appears below, in the TEC
space section.

TEC SPACE
---------
The TEC and THC share many registers.  This is the TEC view of them.
Most of them are NOTDOC; the ones that are documented are marked *.

Address		Register
0x0701000	Config
0x0701080	Sensebus
0x0701090	Delay
0x0701094	Strap
0x0701098	THC misc
0x070109c	Current line counter
0x07010a0	Horizontal sync start
0x07010a4	Horizontal sync end
0x07010a8	Horizontal display start
0x07010ac	Horizontal sync end during vsync
0x07010b0	Horizontal display end
0x07010c0	Vertical sync end
0x07010c4	Vertical sync end
	(yes, the previous two are shown with the same name)
0x07010c8	Vertical display start
0x07010cc	Vertical display end
0x07018fc *	Cursor address
0x0701900 *	Cursor A00
0x0701904 *	Cursor A01
...
0x0701978 *	Cursor A30
0x070197c *	Cursor A31
0x0701980 *	Cursor B00
0x0701984 *	Cursor B01
...
0x07019f8 *	Cursor B30
0x07019fc *	Cursor B31

The cursor consists of two 32x32 bitplanes, allowing four combinations:

A bit	B bit	Effect
0	0	Transparent; pixel data shows through
0	1	Overlay colour 1
1	0	Overlay colour 2
1	1	Overlay colour 3

The data for these two bitplanes is stored in the Cursor [AB]{00-31}
registers; the 00 register is the top line, the 31 register the bottom
line; on each line, the MSB is the left end and the LSB the right.  The
cursor location is controlled by the cursor address register, with the
X coordinate in the upper 16 bits and the Y coordinate in the lower 16
bits.  Each half is a signed 16-bit number and may be negative (which
places the cursor origin off the left or top edge, useful if it's off
the edge by less than its width or height).  The hardware samples the
cursor address register only during vertical blanking, to prevent
partial cursors from being displayed during motion; partially-changed
cursors _can_ occur while the shape is being changed.
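
A sketch in C of building the cursor register values described above.
The names are hypothetical.  The image is taken as 32*32 bytes in
row-major order, each holding an overlay colour number 0-3 (0 meaning
transparent); per the table, the A plane is the 2s bit of that number
and the B plane the 1s bit.

```c
#include <stdint.h>

/* Pack signed X and Y into a cursor address register value:
 * X in the upper 16 bits, Y in the lower 16. */
static uint32_t s24_cursor_addr(int16_t x, int16_t y)
{
    return ((uint32_t)(uint16_t)x << 16) | (uint32_t)(uint16_t)y;
}

/* Split a 32x32 image of colour numbers into the A and B planes.
 * a[0]/b[0] are the top line; the MSB of each word is the left end. */
static void s24_cursor_planes(const uint8_t *img,
                              uint32_t a[32], uint32_t b[32])
{
    int row, col;

    for (row = 0; row < 32; row++) {
        uint32_t aw = 0, bw = 0;
        for (col = 0; col < 32; col++) {
            uint32_t bit = UINT32_C(0x80000000) >> col;
            uint8_t c = img[row * 32 + col];
            if (c & 2) aw |= bit;
            if (c & 1) bw |= bit;
        }
        a[row] = aw;
        b[row] = bw;
    }
}
```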

DFB8 SPACE
----------

DFB8 space presents a dumb memory-mapped framebuffer interface which
accesses only the low 8 bits of each pixel, with the other 18 bits
inaccessible.  Each pixel corresponds to one byte.  This space can be
accessed with 1-, 2-, 4-, and 8-byte reads and writes.

DFB24 SPACE
-----------

DFB24 space presents a dumb memory-mapped framebuffer interface which
accesses the low 24 bits of each pixel, with the high two bits
inaccessible.  Each pixel occupies 32 bits of space; the 24 bits of
data are in the low 24 bits, with the high 8 bits ignored on write and
zero on read.  This space is accessible with 4- and 8-byte reads and
writes; the documentation is unclear on exactly what happens with
one-byte or two-byte access.  One table and some language indicate
that small accesses simply access only parts of pixels, but there is a
notable lack of diagrams for 1-byte and 2-byte accesses (4-byte and
8-byte diagrams are present).

STIP SPACE
----------

STIP space provides a stateless stipple capability.  A write to STIP
space carries two pieces of information: the address written to and the
data written.  The address must be 8-byte-aligned (must have its low
three bits zero); the next 20 bits (the 0x007ffff8 bits) provide a
location in the form of (Y*1152)+X (which is <<3 to make it a 64-bit
address).  The design permits this address to be anywhere in the
framebuffer, but the S24 requires that it be 32-pixel-aligned (that X
be a multiple of 32).  The data written (always 64 bits) provides 32
write enable bits plus a 24-bit pixel value:
	0xf000000000000000	ROP code.  Not supported; write 0011 in
				these bits for forward compatibility.
	0x0f00000000000000	MBZ.
	0x00ff000000000000	Blue component of pixel.
	0x0000ff0000000000	Green component of pixel.
	0x000000ff00000000	Red component of pixel.
	0x00000000ffffffff	32-bit write enable mask.
The 0x0000000080000000 bit corresponds to the pixel at the (X,Y)
location given by the address; the 0x0000000000000001 bit corresponds
to the pixel at (X+31,Y).  A 1 bit in the write mask writes that pixel;
a 0 bit leaves it unmodified.
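
Computing the offset and data word for a STIP write can be sketched in
C as below.  The names are invented for illustration, and the 1152
pitch is just the constant from the (Y*1152)+X formula above.  The
offset is relative to the start of STIP space; the caller is assumed
to perform the actual 64-bit store.

```c
#include <assert.h>
#include <stdint.h>

#define S24_PITCH 1152u  /* pixels per line, from (Y*1152)+X */

/* Byte offset into STIP space for the 32-pixel group at (x, y). */
static uint32_t s24_stip_offset(uint32_t x, uint32_t y)
{
    uint32_t loc = y * S24_PITCH + x;

    assert(x % 32u == 0 && x < S24_PITCH);  /* S24 alignment rule */
    assert(loc <= 0xfffffu);                /* must fit in 20 bits */
    return loc << 3;
}

/* Write-enable mask covering pixels lo..hi (inclusive, 0 = leftmost)
 * within a 32-pixel group; the MSB is the leftmost pixel. */
static uint32_t s24_stip_mask(unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi <= 31u);
    return (UINT32_C(0xffffffff) >> lo) & (UINT32_C(0xffffffff) << (31u - hi));
}

/* 64-bit data word: ROP field 0011, MBZ field 0, colour, mask. */
static uint64_t s24_stip_data(uint8_t r, uint8_t g, uint8_t b, uint32_t mask)
{
    return (UINT64_C(0x3) << 60) |
           ((uint64_t)b << 48) | ((uint64_t)g << 40) |
           ((uint64_t)r << 32) | (uint64_t)mask;
}
```

Filling a span then amounts to one such write per 32-pixel group, with
partial masks at the two ends and 0xffffffff in between.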

BLIT SPACE
----------

BLIT space, despite the name, does not provide real blitter capability,
though it is designed to allow it in a potential future version.  A
write to BLIT space carries two pieces of information: the address
written to and the data written.  A blit needs a source location, a
destination location, a width, a height, and a ROP.  The write address
must be 8-byte-aligned and provides the destination location,
(Y*1152)+X, in its 0x007ffff8 bits; the data written (always 64 bits)
provides the source location, (Y*1152)+X, in its 0x00000000000fffff
bits.  In the S24 implementation, the height is always 1 and the ROP is
always "copy source to dest".  The width can be from 1 to 32, and is
carried in the 0x000000001f000000 bits of the data written; the width
is one more than the value in those bits.  All other bits are reserved
for expansion (to larger framebuffers, for allowing variable values for
height or ROP, or for permitting larger widths).  The low 24 bits of
each pixel are copied; the high two bits are unchanged.
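
The corresponding sketch in C for a BLIT write.  As with everything
else here the names are made up; the offset is relative to the start
of BLIT space, the 1152 pitch comes from the (Y*1152)+X formula, and
the actual 64-bit store is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define S24_PITCH 1152u  /* pixels per line, from (Y*1152)+X */

/* Byte offset into BLIT space selecting the destination (x, y). */
static uint32_t s24_blit_offset(uint32_t dst_x, uint32_t dst_y)
{
    uint32_t loc = dst_y * S24_PITCH + dst_x;

    assert(loc <= 0xfffffu);  /* must fit in the 0x007ffff8 bits */
    return loc << 3;
}

/* 64-bit data word: source location in the low 20 bits, width-1 in
 * the 0x1f000000 bits, everything else left zero. */
static uint64_t s24_blit_data(uint32_t src_x, uint32_t src_y, unsigned width)
{
    uint32_t src = src_y * S24_PITCH + src_x;

    assert(width >= 1u && width <= 32u);
    assert(src <= 0xfffffu);
    return ((uint64_t)(width - 1u) << 24) | (uint64_t)src;
}
```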

RDFB32 SPACE
------------

RDFB32 space is just like DFB24 space except that all 26 bits of each
pixel are accessible, with only the high 6 bits read-as-zero
ignore-on-write.

RSTIP SPACE
-----------

RSTIP space is just like STIP space except that the high two bits of
the pixels are accessible; they appear in the 0x0300000000000000 bits
of the data value written.
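
In C, the RSTIP data word might be built as below.  This is a sketch
under the assumption that everything else, including the 0011 ROP
field, is exactly as for STIP; the function name is hypothetical, and
mode is the two-bit value that selects how the pixel is interpreted.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit RSTIP data word: ROP field 0011, the two high pixel bits in
 * the 0x0300000000000000 position, then colour and write mask. */
static uint64_t s24_rstip_data(unsigned mode,
                               uint8_t r, uint8_t g, uint8_t b,
                               uint32_t mask)
{
    assert(mode <= 3u);
    return (UINT64_C(0x3) << 60) | ((uint64_t)mode << 56) |
           ((uint64_t)b << 48) | ((uint64_t)g << 40) |
           ((uint64_t)r << 32) | (uint64_t)mask;
}
```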

RBLIT SPACE
-----------

RBLIT space is just like BLIT space except that the high two bits of
the pixels are copied along with the low 24.

As far as I can tell the {,R}{STIP,BLIT} spaces constitute all the
acceleration the S24 has.  For window fills, STIP/RSTIP space is a
factor of 4 faster than writing to DFB8 and a factor of 16 faster than
writing to DFB24/RDFB32 (assuming of course that bus transaction speed
is the limiting factor), since one 64-bit write covers 32 pixels
instead of 8 or 2.  The BLIT/RBLIT spaces permit copying 32 pixels
with one 64-bit write, as compared to four 64-bit reads and four
64-bit writes to copy 32 pixels through DFB8 space, or 16 reads and 16
writes through DFB24/RDFB32 space (under the same assumption).  Not a
whole screaming lot of acceleration, but significantly better than
nothing - though I haven't yet built an X server that uses them.

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B