GPU/External Registers: Difference between revisions
Reword TextureCopy |
PPF rewrites + add T2L stuff |
||
(4 intermediate revisions by the same user not shown) | |||
Line 140: | Line 140: | ||
Memory fills are used to initialize buffers in memory with a given value, similar to memset. A memory fill is triggered by setting bit0 in the control register. Doing so aborts any running memory fills on that filling unit. Upon completion, the hardware unsets bit0 and sets bit1 and fires interrupt PSC0. | Memory fills are used to initialize buffers in memory with a given value, similar to memset. A memory fill is triggered by setting bit0 in the control register. Doing so aborts any running memory fills on that filling unit. Upon completion, the hardware unsets bit0 and sets bit1 and fires interrupt PSC0. | ||
The addresses must be part of VRAM. | |||
These registers are used by [[GSP Shared Memory#GX SetMemoryFill|GX SetMemoryFill]]. | These registers are used by [[GSP Shared Memory#GX SetMemoryFill|GX SetMemoryFill]]. | ||
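The control-register behaviour described above can be sketched as a small decoder. This is only an illustration: the constant and function names are hypothetical, and the code works on a plain integer standing in for a value read from the control register, not on real hardware.

```python
# Hypothetical names; only the bit positions come from the text above.
FILL_START = 1 << 0   # written by software to trigger a fill; cleared by hardware when done
FILL_DONE = 1 << 1    # set by hardware on completion

def start_fill(control_value: int) -> int:
    """Return the value to write back in order to trigger a fill."""
    return control_value | FILL_START

def fill_state(control_value: int) -> str:
    """Classify a control-register value as 'running', 'finished' or 'idle'."""
    if control_value & FILL_START:
        return "running"
    if control_value & FILL_DONE:
        return "finished"
    return "idle"
```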
Line 473: | Line 475: | ||
|- | |- | ||
| 0x1EF00C08 | | 0x1EF00C08 | ||
| DisplayTransfer output width (bits 0-15) and height (bits 16-31) | | DisplayTransfer output width (bits 0-15) and height (bits 16-31) | ||
|- | |- | ||
| 0x1EF00C0C | | 0x1EF00C0C | ||
| DisplayTransfer input width and height | | DisplayTransfer input width and height | ||
|- | |- | ||
| 0x1EF00C10 | | 0x1EF00C10 | ||
| Transfer flags | | Transfer flags | ||
|- | |- | ||
| 0x1EF00C14 | | 0x1EF00C14 | ||
| GSP | | ?, GSP writes value 0 here prior to writing to 0x1EF00C18 for DisplayTransfer | ||
|- | |- | ||
| 0x1EF00C18 | | 0x1EF00C18 | ||
| Setting bit0 starts the transfer | | Setting bit0 starts the transfer; upon completion, bit0 is unset and bit8 is set | ||
|- | |- | ||
| 0x1EF00C1C | | 0x1EF00C1C | ||
Line 491: | Line 493: | ||
|- | |- | ||
| 0x1EF00C20 | | 0x1EF00C20 | ||
| TextureCopy total amount of data to copy, in bytes | | TextureCopy total amount of data to copy, in bytes | ||
|- | |- | ||
| 0x1EF00C24 | | 0x1EF00C24 | ||
| TextureCopy input line width (bits 0-15) and gap (bits 16-31), in 16 byte units | | TextureCopy input line width (bits 0-15) and gap (bits 16-31), in 16 byte units | ||
|- | |- | ||
| 0x1EF00C28 | | 0x1EF00C28 | ||
| TextureCopy output line width and gap | | TextureCopy output line width and gap | ||
|} | |} | ||
Transfer flags: | |||
{| class="wikitable" border="1" | {| class="wikitable" border="1" | ||
! Bit | ! Bit | ||
Line 508: | Line 509: | ||
|- | |- | ||
| 0 | | 0 | ||
| When set, the framebuffer data is flipped vertically | | When set, the framebuffer data is flipped vertically | ||
|- | |- | ||
| 1 | | 1 | ||
| | | Linear->tiled mode (overrides tiled->linear mode) | ||
|- | |- | ||
| 2 | | 2 | ||
| This bit is required when the output width is less than the input width for the hardware to properly crop the lines, otherwise the output will be mis-aligned | | This bit is required when the output width is less than the input width for the hardware to properly crop the lines, otherwise the output will be mis-aligned | ||
|- | |- | ||
| 3 | | 3 | ||
| | | TextureCopy mode (overrides all other modes) | ||
|- | |- | ||
| 4 | | 4 | ||
Line 523: | Line 524: | ||
|- | |- | ||
| 5 | | 5 | ||
| | | Tiled->tiled mode (overrides tiled->linear, linear->tiled modes) | ||
|- | |- | ||
| 7-6 | | 7-6 | ||
Line 529: | Line 530: | ||
|- | |- | ||
| 10-8 | | 10-8 | ||
| Input | | Input [[GPU/External_Registers#Framebuffer_color_formats|color format]] | ||
|- | |- | ||
| 11 | | 11 | ||
Line 535: | Line 536: | ||
|- | |- | ||
| 14-12 | | 14-12 | ||
| Output | | Output color format | ||
|- | |- | ||
| 15 | | 15 | ||
Line 541: | Line 542: | ||
|- | |- | ||
| 16 | | 16 | ||
| Use 32x32 block tiling mode, instead of the usual 8x8 one | | Use 32x32 block tiling mode, instead of the usual 8x8 one (output dimensions must be multiples of 32, even if cropping with bit 2 set above) | ||
|- | |- | ||
| 17-23 | | 17-23 | ||
Line 547: | Line 548: | ||
|- | |- | ||
| 24-25 | | 24-25 | ||
| Scale down the input image using a box filter | | Scale down the input image using a box filter (0 = No downscale, 1 = 2x1 downscale, 2 = 2x2 downscale, 3 = invalid) | ||
|- | |- | ||
| 31-26 | | 31-26 | ||
| Not writable | | Not writable | ||
|} | |||
These registers are used by [[GSP_Shared_Memory#Commands|GSP]] for DisplayTransfer and TextureCopy. TextureCopy registers are only used in TextureCopy mode; likewise, DisplayTransfer registers are only used when TextureCopy mode is not set. By default, DisplayTransfer will work in tiled->linear mode. | |||
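As a rough illustration of the bit layout in the tables above, the following sketch packs a dimension register and a transfer flags word. The function and parameter names are hypothetical, and the color formats are passed as raw 3-bit values because their numeric encoding is documented elsewhere. It assumes the input dimension register uses the same width/height split as the output one.

```python
def pack_dimensions(width: int, height: int) -> int:
    """Width in bits 0-15, height in bits 16-31."""
    if not (0 <= width <= 0xFFFF and 0 <= height <= 0xFFFF):
        raise ValueError("width and height must each fit in 16 bits")
    return (height << 16) | width

def make_transfer_flags(in_format: int, out_format: int, *,
                        flip_vertical: bool = False,
                        linear_to_tiled: bool = False,
                        crop: bool = False,
                        texture_copy: bool = False,
                        tiled_to_tiled: bool = False,
                        block_32x32: bool = False,
                        downscale: int = 0) -> int:
    """Build the transfer flags word from the bit table above."""
    if not (0 <= in_format <= 7 and 0 <= out_format <= 7):
        raise ValueError("color formats are 3-bit fields")
    if downscale not in (0, 1, 2):   # 3 is listed as invalid
        raise ValueError("downscale must be 0 (none), 1 (2x1) or 2 (2x2)")
    flags = 0
    flags |= int(flip_vertical) << 0
    flags |= int(linear_to_tiled) << 1
    flags |= int(crop) << 2
    flags |= int(texture_copy) << 3
    flags |= int(tiled_to_tiled) << 5
    flags |= in_format << 8
    flags |= out_format << 12
    flags |= int(block_32x32) << 16
    flags |= downscale << 24
    return flags
```

For instance, `make_transfer_flags(0, 1)` leaves every mode bit clear, which per the text above corresponds to the default tiled->linear DisplayTransfer.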
=== Tiled to linear === | |||
Unswizzles the input buffer. This is usually used to transfer GPU framebuffer data onto LCD framebuffers. The following constraints apply: | |||
* Output dimensions must not be bigger than input ones. | |||
* Width dimensions must be >= 64. | |||
* Height dimensions must be >= 16. | |||
* Width dimensions are required to be aligned to 16 bytes when doing RGB8 transfers. | |||
** Otherwise they are required to be aligned to 8 bytes. | |||
* If downscale is used, input and output dimensions should be the same, and width/2 must also follow alignment constraints. | |||
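The constraints above can be turned into a pre-flight check. The sketch below is one possible reading, not a definitive rule set: it assumes the alignment requirement applies to the width value itself, that the RGB8 rule applies when either side of the transfer is RGB8, and that "dimensions" covers both input and output. All names are hypothetical.

```python
def check_tiled_to_linear(in_w: int, in_h: int, out_w: int, out_h: int, *,
                          rgb8: bool = False, downscale: bool = False) -> list:
    """Return a list of violated constraints (empty if none were found)."""
    problems = []
    if out_w > in_w or out_h > in_h:
        problems.append("output dimensions exceed input dimensions")
    if min(in_w, out_w) < 64:
        problems.append("width must be >= 64")
    if min(in_h, out_h) < 16:
        problems.append("height must be >= 16")
    align = 16 if rgb8 else 8          # assumed to apply to the width value
    if in_w % align or out_w % align:
        problems.append("width must be aligned to %d" % align)
    if downscale:
        if (in_w, in_h) != (out_w, out_h):
            problems.append("downscale expects equal input and output dimensions")
        if (in_w // 2) % align:
            problems.append("width/2 must be aligned to %d" % align)
    return problems
```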
Format conversion results: | |||
{| class="wikitable" border="1" | |||
! Conversion | |||
! Result | |||
|- | |||
| RGBA8 -> RGBA8 | |||
| style="background: lightgreen" | Has interrupt, correct output | |||
|- | |||
| RGBA8 -> RGB8 | |||
| style="background: lightgreen" | Has interrupt, correct output | |||
|- | |||
| RGBA8 -> RGB565 | |||
| style="background: lightgreen" | Has interrupt, correct output | |||
|- | |||
| RGBA8 -> RGB5A1 | |||
| style="background: lightgreen" | Has interrupt, correct output | |||
|- | |||
| RGBA8 -> RGBA4 | |||
| style="background: lightgreen" | Has interrupt, correct output | |||
|- | |||
| RGB8 -> RGBA8 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGB8 -> RGB8 | |||
| style="background: lightgreen" | Has interrupt, correct output | |||
|- | |||
| RGB8 -> RGB565 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGB8 -> RGB5A1 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGB8 -> RGBA4 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGB565 -> RGBA8 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGB565 -> RGB8 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGB565 -> RGB565 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|- | |||
| RGB565 -> RGB5A1 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|- | |||
| RGB565 -> RGBA4 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|- | |||
| RGB5A1 -> RGBA8 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGB5A1 -> RGB8 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGB5A1 -> RGB565 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|- | |||
| RGB5A1 -> RGB5A1 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|- | |||
| RGB5A1 -> RGBA4 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|- | |||
| RGBA4 -> RGBA8 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGBA4 -> RGB8 | |||
| style="background: salmon" | No interrupt | |||
|- | |||
| RGBA4 -> RGB565 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|- | |||
| RGBA4 -> RGB5A1 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|- | |||
| RGBA4 -> RGBA4 | |||
| style="background: yellow" | Has interrupt, output not tested | |||
|} | |} | ||
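The results in the table appear to follow a pattern based on pixel size: an interrupt was observed when the input format is 32-bit, or when the input and output formats have the same size per pixel. The sketch below encodes that observation. It is a summary of the listed results only, not a documented hardware rule, and the function name is hypothetical.

```python
# Bytes per pixel for each format named in the table.
BYTES_PER_PIXEL = {"RGBA8": 4, "RGB8": 3, "RGB565": 2, "RGB5A1": 2, "RGBA4": 2}

def conversion_fires_interrupt(src: str, dst: str) -> bool:
    """True if the table above lists an interrupt for src -> dst."""
    src_size = BYTES_PER_PIXEL[src]
    dst_size = BYTES_PER_PIXEL[dst]
    return src_size == 4 or src_size == dst_size

# Collect the combinations for which the table reports no interrupt.
no_interrupt = sorted(
    (s, d)
    for s in BYTES_PER_PIXEL
    for d in BYTES_PER_PIXEL
    if not conversion_fires_interrupt(s, d)
)
```

Note that for the 16-bit input formats the table records an interrupt but the output was not verified, so a `True` result here does not imply correct output.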
Line 562: | Line 658: | ||
line width = (16 * 24) >> 4 = 24 | line width = (16 * 24) >> 4 = 24 | ||
gap = line width | gap = line width | ||
size = | size = 16 * 32 * 3 = 1536 | ||
By correctly calculating the input and output gap sizes it is possible to use this functionality to copy arbitrary sub-rectangles between differently-sized framebuffers or textures, which is one of its main uses over a regular no-conversion DisplayTransfer. When copying tiled textures/framebuffers it's important to remember that the contents of a tile are laid out sequentially in memory, and so this should be taken into account when calculating the transfer parameters. | By correctly calculating the input and output gap sizes it is possible to use this functionality to copy arbitrary sub-rectangles between differently-sized framebuffers or textures, which is one of its main uses over a regular no-conversion DisplayTransfer. When copying tiled textures/framebuffers it's important to remember that the contents of a tile are laid out sequentially in memory, and so this should be taken into account when calculating the transfer parameters. |
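A minimal sketch of the sub-rectangle calculation for a linear buffer follows. It assumes that the gap is the number of 16-byte units skipped after each copied line, so that line width plus gap equals the buffer pitch; the function and parameter names are hypothetical. Tiled buffers need the tile layout taken into account, as noted above, and are not handled here.

```python
def texture_copy_params(row_bytes: int, rows: int, pitch_bytes: int) -> dict:
    """Compute TextureCopy size and the packed line width/gap register value.

    row_bytes:   bytes copied per line of the rectangle
    rows:        number of lines to copy
    pitch_bytes: bytes between the starts of consecutive lines in the buffer
    """
    if row_bytes % 16 or pitch_bytes % 16:
        raise ValueError("line width and pitch must be multiples of 16 bytes")
    if pitch_bytes < row_bytes:
        raise ValueError("pitch cannot be smaller than the copied line")
    line_width = row_bytes >> 4                 # in 16-byte units
    gap = (pitch_bytes - row_bytes) >> 4        # in 16-byte units (assumed meaning)
    if line_width > 0xFFFF or gap > 0xFFFF:
        raise ValueError("line width and gap must each fit in 16 bits")
    return {
        "size": row_bytes * rows,               # total bytes to copy
        "line_width": line_width,
        "gap": gap,
        "reg": (gap << 16) | line_width,        # bits 0-15 width, bits 16-31 gap
    }
```

The same function would be called once with the input buffer's pitch and once with the output buffer's pitch, since the two sides have separate line width/gap registers.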