PowerVR

The NEC PowerVR chip consists of a true-colour RAMDAC and a hardware 3D engine based on a Tile Accelerator. The Tile Accelerator divides the scene into 32 by 32 pixel tiles, which can be rendered individually. Each tile is rendered into an internal 32 by 32 pixel frame buffer in register memory before it is copied to the main frame buffer. As rendering is done to the internal frame buffer, the fill rate is very high. Also, no texel data is actually fetched from texture VRAM until the tile is copied to the frame buffer, which means that the texture fill rate is not affected by overpainting at all.

3D engine principle overview

The following diagram shows the principle by which the hardware 3D engine works:

[ta.gif]

There are two stages, which can be run in parallel (provided you have dual sets of buffers, of course).

During the binning stage, the Tile Accelerator is fed graphic primitives (either using DMA or directly by the CPU using the Store Queues or direct writes), which it compiles to an internal format. While doing this, it registers in which tiles each primitive might be visible by putting it in one or more tile bins. (If it's not visible in any tile, it can be completely clipped of course.)

During the rendering stage, the ISP/TSP reads the lists created by the Tile Accelerator and, for each tile, renders the primitives visible in that tile into its internal frame buffer before writing it out to the right place in the VRAM frame buffer, where the RAMDAC can display it.

For a double buffering strategy that allows you to run both stages simultaneously (but for different frames, i.e. binning frame N+1 while rendering frame N), you need double sets of buffers for the display list and the tile bins, as well as double frame buffers to keep rendering artifacts from being visible on the screen.
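The double buffering scheme described above can be sketched as a small planning function. This is only an illustration, not code from any SDK: the `frame_plan` structure and the function names are made up for this example, buffer sets are numbered 1 and 2, and 0 is used to mean "stage idle". It assumes that a frame is rendered into the frame buffer with the same number as the tile bin set it was binned into.

```c
/* Hypothetical helper: which buffers each stage uses during frame period n. */
struct frame_plan {
    int bin_tb;     /* tile bin set the Tile Accelerator bins into   */
    int render_tb;  /* tile bin set the ISP/TSP renders from, 0=idle */
    int render_fb;  /* frame buffer the ISP/TSP renders to, 0=idle   */
    int display_fb; /* frame buffer the RAMDAC displays, 0=none yet  */
};

/* Buffer set (1 or 2) that belongs to frame k, alternating every frame. */
static int buffer_for_frame(int k)
{
    return ((k - 1) & 1) + 1;
}

/* During period n (1-based) frame n is binned, frame n-1 is rendered
   and frame n-2 is displayed. */
struct frame_plan plan_frame(int n)
{
    struct frame_plan p;
    p.bin_tb     = buffer_for_frame(n);
    p.render_tb  = (n >= 2) ? buffer_for_frame(n - 1) : 0;
    p.render_fb  = p.render_tb;
    p.display_fb = (n >= 3) ? buffer_for_frame(n - 2) : 0;
    return p;
}
```

Calling `plan_frame(3)` gives bin set 1, render from set 2 into frame buffer 2, and display of frame buffer 1, which reflects the two frame latency between binning and display.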
The following diagram shows which tile bin set (TB) and frame buffer (FB) to use to avoid conflict:

Frame #   Bin to TB #   Render from TB #   Render to FB #   Display FB #
   1           1               -                  -               -
   2           2               1                  1               -
   3           1               2                  2               1
   4           2               1                  1               2
   5           1               2                  2               1
   6           2               1                  1               2
  etc.

As you can see, there is a two frame latency, i.e. frame 1 will not be visible on screen until frame 3 is being generated.

Video memory

There are 8 megabytes of video memory, located in memory area 1 (see the memory map). This memory is organized as two banks of 32×1Mbit each, and depending on the value of address bit 24 they can be accessed either sequentially as 32 bit memory or in parallel as 64 bit memory. In both cases, you get 8 megabytes of contiguous address space, but the correspondence of address to memory cell is slightly different, as this figure shows:

32 bit interface                  64 bit interface

0xA57FFFFC                        0xA47FFFF8
   ...       Bank 2                  ...       Bank 1      Bank 2
0xA5400000                        0xA4000000
             0 ... 31                          0 ... 31    32 ... 63
0xA53FFFFC
   ...       Bank 1
0xA5000000
             0 ... 31

So, the bytes 0xA4000000-0xA4000003 correspond to 0xA5000000-0xA5000003, 0xA4000004-0xA4000007 to 0xA5400000-0xA5400003, 0xA4000008-0xA400000B to 0xA5000004-0xA5000007, and so on.

Both interfaces can handle writes of 16 bits and up; 8-bit writes are not possible. It is possible to read words of any length, including 8-bit, though.

In the following register descriptions, an address specification using the 32 bit interface will be referred to as a 32 bit address, and an address specification using the 64 bit interface as a 64 bit address. This should not be mistaken for the width of the actual address, since both types of addresses are really 23 bits wide.

RAMDAC Registers

The addresses given here are to the P2 area, as the registers should of course be accessed without cache. The register descriptions are partly based on research done by bITmASTER and maiwe.
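Since the register descriptions distinguish between 32 bit and 64 bit addresses, it can be handy to convert between the two. The sketch below works on 23-bit offsets into video memory (i.e. with the 0xA4000000 or 0xA5000000 base removed) and is derived purely from the byte correspondence given in the Video memory section; the function and macro names are invented for this example.

```c
#include <stdint.h>

#define VRAM_BANK_SIZE 0x400000u /* each of the two banks is 4 megabytes */
#define VRAM_MASK      0x7FFFFFu /* both address types are 23 bits wide  */

/* Offset in the 64 bit interface -> offset in the 32 bit interface. */
uint32_t vram64_to_vram32(uint32_t off64)
{
    uint32_t bank, within;
    off64 &= VRAM_MASK;
    bank   = (off64 >> 2) & 1u;                  /* low or high half of the 64 bit word */
    within = ((off64 >> 3) << 2) | (off64 & 3u); /* byte offset inside that bank */
    return bank * VRAM_BANK_SIZE + within;
}

/* Offset in the 32 bit interface -> offset in the 64 bit interface. */
uint32_t vram32_to_vram64(uint32_t off32)
{
    uint32_t bank, within;
    off32 &= VRAM_MASK;
    bank   = off32 >> 22;                        /* 0 = bank 1, 1 = bank 2 */
    within = off32 & (VRAM_BANK_SIZE - 1u);
    return ((within >> 2) << 3) | (bank << 2) | (within & 3u);
}
```

For example, `vram64_to_vram32(0x000004)` yields 0x400000, matching the statement that 0xA4000004 corresponds to 0xA5400000.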
A05F8040 - Border colour RGB

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
-----------?----------- ----------Red---------- ---------Green--------- ---------Blue----------

This register sets the solid colour displayed around the main display area.

A05F8044 - Display mode

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
-- -- -- -- -- -- -- --  C -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -COL- SD DE

C - Clock double enable
    Setting this bit doubles the pixel clock, giving a scan rate suitable for VGA monitors.

COL - Colour mode select
    Selects the frame buffer pixel colour mode (all colour modes are little endian, e.g. in RGB888 the blue byte comes first).

    Value   Mode     Bytes/pixel
     0 0    RGB555   2
     0 1    RGB565   2
     1 0    RGB888   3
     1 1    RGB888   4

SD - Scan Double enable
    Setting this bit makes each scan line be sent twice, allowing low resolutions in VGA mode.

DE - Display Enable
    This bit must be set for any graphics to be displayed. If it is set to zero, only the border colour will be visible.

A05F8050 - Video memory base offset 1

This sets the address in the video RAM of the first pixel displayed (top left). Address 0 means the first byte of video RAM bank 1 (usually accessed as A5000000 from the CPU). The address must be longword aligned. This register is used for noninterlaced screens and the long field of interlaced screens.

A05F8054 - Video memory base offset 2

Same as A05F8050, but used for the short field of interlaced screens.

A05F805C - Display size and modulo

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
--?-- -----------Modulo------------ ---------Lines/field--------- ------Pixel data / line------

This register determines how much pixel data to display each field, and the modulo between each line of data.

Modulo
    The number of 32-bit words to skip between each line, plus 1. I.e.
    a value of 1 means the lines are stored immediately after each other in memory.

Lines per field
    How many lines of pixels to fetch and display each field, minus 1. Since this is per field and not per frame, it should be set to half the total vertical resolution (minus 1) in interlaced mode.

Pixel data per line
    The number of 32-bit words of pixel data to fetch and display each line, minus 1. If you want X pixels per line, and each pixel is Y bytes, X*Y/4-1 is the correct value to write.

A05F80CC - Raster event position

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------- -------------Top------------- ----------------- -----------Bottom------------

This register defines two raster lines on the screen. When the raster beam passes either of them, a raster event is generated (which optionally causes an interrupt). The raster line for the "Top" raster event is typically set just above the display area, and the raster line for the "Bottom" raster event just below the display area.

A05F80D0 - Video encapsulation

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
-------------------------------------------------------------------- VO -BC-- --  I ----- HP VP

VO - Video Output enable
    Set to 1 to enable video output.

I - Interlace
    Set to 1 to enable interlaced video.

BC - Broadcast standard
    Used to select the type of colour sync for composite video.

    Value   Standard
     0 0    NTSC
     0 1    PAL
     1 0    PAL-M (?)
     1 1    PAL-N (?)

HP - H-sync polarity
    Set to 1 for positive H-sync, 0 for negative H-sync.

VP - V-sync polarity
    Set to 1 for positive V-sync, 0 for negative V-sync.

A05F80D4 - Border horizontal range

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------- ------------Start------------ ----------------- ------------Stop-------------

This register selects the horizontal range in which the border colour is displayed.
Left and right of this range, the border is displayed as black.

Start
    The number of pixels from the horizontal sync where border display starts.

Stop
    The number of pixels from the horizontal sync where border display ends.

A05F80D8 - Full video size

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------- ----------Vertical----------- ----------------- ---------Horizontal----------

This register selects the total number of lines and "pixels" (including lace) between each retrace. The horizontal and vertical refresh rates are determined by this register and the pixel clock. For 50Hz (PAL), set V=624 H=863. For 60Hz (NTSC/VGA), set V=524 H=857. (Halve the V value for non-interlaced PAL/NTSC screens.)

A05F80DC - Border vertical range

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------- ------------Start------------ ----------------- ------------Stop-------------

This register selects the vertical range in which the border colour is displayed. Above and below this range, the border is displayed as black.

Start
    The number of scanlines from the vertical sync where border display starts.

Stop
    The number of scanlines from the vertical sync where border display ends.

A05F80E8 - Additional video settings

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------------------- --------N-------- -------------------- LR -----------------------

Misc additional video settings.

N
    Unknown. Set to 22.

LR
    Low-res; setting this bit makes each pixel be output twice, effectively giving a 320 pixel horizontal resolution.

A05F80EC - Display horizontal position

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------------------------------------------------------- -----Horizontal position-----

This register sets the distance from the horizontal sync to where pixel display starts.
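As a rough illustration of how the Full video size register (A05F80D8) and the pixel clock together determine the refresh rates, the following sketch computes the line and vertical rates. Two things here are assumptions that the text above does not state: that the H and V fields hold the totals minus 1 (so 863 means 864 "pixels" per line), and that the undoubled pixel clock is 13.5 MHz. With those assumptions the suggested PAL and NTSC values come out at the familiar rates, but treat the numbers as a plausibility check rather than documented behaviour. All function names are invented for this example.

```c
#include <stdint.h>

/* Pack the V and H fields (bits 25-16 and 9-0) of A05F80D8. */
uint32_t full_video_size_reg(unsigned v, unsigned h)
{
    return ((uint32_t)(v & 0x3FFu) << 16) | (uint32_t)(h & 0x3FFu);
}

/* Lines scanned per second.
   ASSUMPTION: the H field is the total number of "pixels" per line minus 1. */
double line_rate_hz(unsigned h, double pixel_clock_hz)
{
    return pixel_clock_hz / (double)(h + 1u);
}

/* Complete passes over all V lines per second.
   ASSUMPTION: the V field is the total number of lines minus 1.
   For an interlaced screen one pass covers two fields, so the field
   rate is twice this value. */
double vertical_rate_hz(unsigned v, unsigned h, double pixel_clock_hz)
{
    return line_rate_hz(h, pixel_clock_hz) / (double)(v + 1u);
}
```

With an assumed 13.5 MHz clock, `vertical_rate_hz(624, 863, 13.5e6)` gives 25 passes per second, i.e. 50 fields per second for interlaced PAL, and `vertical_rate_hz(524, 857, 13.5e6)` gives roughly 29.97.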
A05F80F0 - Display vertical position

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------- -------Vertical pos 2-------- ----------------- -------Vertical pos 1--------

This register sets the distance (in scanlines) from the vertical sync to where pixel display starts.

Vertical pos 1
    This value is used for noninterlaced screens and the long fields of interlaced screens.

Vertical pos 2
    This value is used for the short fields of interlaced screens.

Tile Accelerator Registers

The addresses given here are to the P2 area, as the registers should of course be accessed without cache.

A05F8124 - Tile Bin base output address

This sets the address of the Tile Bin array to which the Tile Accelerator should perform its binning. 64 bytes of memory per tile will be used at the video memory address pointed out by this register.

A05F8128 - Display list base output address

This sets the address of the compiled Display list buffer to which the Tile Accelerator should output the processed primitives. The amount of memory needed depends on how large the scene is.

A05F813C - Tile Bin array size

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------------------------- -Vertical-- ----------------------------- ---Horizontal----

The size of the Tile Bin array in rows and columns.

Vertical
    How many tiles high the Tile Bin array is, minus 1. Each tile is 32 pixels high.

Horizontal
    How many tiles wide the Tile Bin array is, minus 1. Each tile is 32 pixels wide.

ISP/TSP Registers

The addresses given here are to the P2 area, as the registers should of course be accessed without cache.

A05F8020 - Display list base input address

The address of the compiled Display list created by the Tile Accelerator which contains the primitives for the scene.

A05F802C - Tile Bin header input address

The address of a structure describing the location and clipping(?)
of each tile on the screen, as well as pointers to the respective Tile Bin buffers. This structure has to be created before any rendering can be done, but can be reused in subsequent renders using the same set of Tile Bin buffers.

A05F804C - Render output modulo

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
-------------------------------------------------------------------- ----------Modulo----------

The modulo of the frame buffer to which rendering is to take place, in bytes / 8.

A05F8048 - Render output pixel format

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------------- ---------------------TH -----------------------------------  D --COL---

The pixel format of the frame buffer to which rendering is to take place.

TH - Alpha threshold
    Set this to control the alpha threshold level when output colour mode is ARGB1555.

D - Dither enable
    Setting this bit enables dithering in highcolour modes.

COL - Colour mode select
    Selects the frame buffer pixel colour mode (all colour modes are little endian, e.g. in RGB888 the blue byte comes first).

    Value   Mode       Bytes/pixel
    0 0 0   RGB555     2
    0 0 1   RGB565     2
    0 1 0   ARGB4444   2
    0 1 1   ARGB1555   2
    1 0 1   RGB888     4
    1 1 0   ARGB888    4   <- should be ARGB8888?

A05F8060 - Render output address

The address of the frame buffer to which rendering is to take place. The coordinates for the individual tiles will be added as an offset to this base address.

A05F8108 - Texture palette colour mode

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
----------------------------------------------------------------------------------------- -COL-

The format of the entries in the palettes used by CLUT mode textures.

COL - Colour mode select

    Value   Mode       Bytes/entry
     0 0    ARGB1555   2
     0 1    RGB565     2
     1 0    ARGB4444   2
     1 1    ARGB8888   4

Note that each palette entry always occupies 4 bytes of address space, even if only two bytes are used.
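To tie a few of the register layouts above together, here is a minimal sketch that computes the values for the Display mode (A05F8044), Display size and modulo (A05F805C) and Render output modulo (A05F804C) registers from the bit positions and formulas given in their descriptions. The function names and parameter choices are made up for this example, and the code only calculates values; actually storing them would be an uncached 32-bit write to the respective P2 address.

```c
#include <stdint.h>

/* A05F8044: C in bit 23, COL in bits 3-2, SD in bit 1, DE in bit 0. */
uint32_t display_mode_reg(unsigned clock_double, unsigned col,
                          unsigned scan_double, unsigned display_enable)
{
    return ((uint32_t)(clock_double & 1u) << 23) |
           ((uint32_t)(col & 3u) << 2) |
           ((uint32_t)(scan_double & 1u) << 1) |
           (uint32_t)(display_enable & 1u);
}

/* A05F805C: modulo in bits 29-20, lines/field in bits 19-10,
   pixel data per line in bits 9-0. skip_words is the number of
   32-bit words to skip between lines (0 for tightly packed lines). */
uint32_t display_size_reg(unsigned width_px, unsigned bytes_per_px,
                          unsigned lines_per_field, unsigned skip_words)
{
    uint32_t words  = (uint32_t)width_px * bytes_per_px / 4u - 1u; /* X*Y/4-1 */
    uint32_t lines  = (uint32_t)lines_per_field - 1u;
    uint32_t modulo = (uint32_t)skip_words + 1u;
    return ((modulo & 0x3FFu) << 20) | ((lines & 0x3FFu) << 10) | (words & 0x3FFu);
}

/* A05F804C: frame buffer modulo in bytes / 8, bits 8-0. */
uint32_t render_modulo_reg(unsigned width_px, unsigned bytes_per_px)
{
    return ((uint32_t)width_px * bytes_per_px / 8u) & 0x1FFu;
}
```

For a 640x480 RGB565 VGA screen with tightly packed lines, this gives `display_mode_reg(1, 1, 0, 1)` = 0x00800005, `display_size_reg(640, 2, 480, 0)` = 0x00177D3F and `render_modulo_reg(640, 2)` = 160.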
The above is based on pvr.html; the below is based on video.c from tatest.

a05f6884 /* Disable all interrupt events 1 */
a05f6900 /* Clear all pending int events 1 */
a05f6908 /* Clear all pending int events 2 */
a05f6930 /* Disable all interrupt events 2; Re-enable some events 1 */
a05f6938 /* Disable all interrupt events 3; Re-enable some events 2 */
a05f7814 /* More interrupt control stuff (so it seems) 1 */
a05f7834 /* More interrupt control stuff (so it seems) 2 */
a05f7854 /* More interrupt control stuff (so it seems) 3 */
a05f7874 /* More interrupt control stuff (so it seems) 4 */
a05f78bc /* More interrupt control stuff (so it seems) 5 */
a05f8008 /* TA out of reset; TA reset */
a05f8030 /* M */
a05f8040 /* border color */
a05f8044 /* pixel mode (vb+0x11) */
a05f8048 /* alpha config */
a05f804c /* display align (640*2)/8 */
a05f805c /* Size modulo and display lines (vb+0x17) */
a05f8068 /* (X resolution - 1) << 16; pixel clipping x */
a05f806c /* (Y resolution - 1) << 16; pixel clipping y */
a05f8074 /* cheap shadow */
a05f8078 /* polygon culling (1.0f) */
a05f807c /* M */
a05f8080 /* M */
a05f8084 /* M */
a05f8098 /* M */
a05f80a0 /* M */
a05f80a8 /* M (Unknown magic value) */
a05f80b0 /* Fog table color */
a05f80b4 /* Fog vertex color */
a05f80b8 /* fog density */
a05f80bc /* color clamp max */
a05f80c0 /* color clamp min */
a05f80c8 /* set to same as border H in 80d4 */
a05f80cc /* M */
a05f80d0 /* interlace flags */
a05f80d4 /* horizontal border */
a05f80d8 /* M */
a05f80dc /* vertical position */
a05f80e0 /* sync control */
a05f80e4 /* stride width */
a05f80e8 /* screen control */
a05f80ec /* horizontal position */
a05f80f0 /* vertical border */
a05f80f4 /* anti-aliasing */
a05f8108 /* 32bit palette */
a05f8110 /* M */
a05f8118 /* M */