GPU/External Registers: Difference between revisions

Kynex7510
Reword TextureCopy
Kynex7510
PPF rewrites + add T2L stuff
 
(4 intermediate revisions by the same user not shown)
Line 140: Line 140:


Memory fills are used to initialize buffers in memory with a given value, similar to memset. A memory fill is triggered by setting bit0 in the control register. Doing so aborts any running memory fills on that filling unit. Upon completion, the hardware unsets bit0 and sets bit1 and fires interrupt PSC0.
Memory fills are used to initialize buffers in memory with a given value, similar to memset. A memory fill is triggered by setting bit0 in the control register. Doing so aborts any running memory fills on that filling unit. Upon completion, the hardware unsets bit0 and sets bit1 and fires interrupt PSC0.
The addresses must be part of VRAM.


These registers are used by [[GSP Shared Memory#GX SetMemoryFill|GX SetMemoryFill]].
These registers are used by [[GSP Shared Memory#GX SetMemoryFill|GX SetMemoryFill]].
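
The control-bit behaviour described above can be sketched as a few pure helpers operating on a value read from the control register. The macro and function names here are hypothetical and only illustrate the bit0/bit1 semantics; no register addresses are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as described in the text; names are illustrative only. */
#define MEMFILL_CTRL_START    (1u << 0) /* set by software to trigger a fill */
#define MEMFILL_CTRL_FINISHED (1u << 1) /* set by hardware upon completion   */

/* Value to write back in order to trigger a fill, keeping the other bits. */
static inline uint32_t memfill_start_value(uint32_t control)
{
    return control | MEMFILL_CTRL_START;
}

/* A fill is in progress while bit0 is still set. */
static inline bool memfill_is_busy(uint32_t control)
{
    return (control & MEMFILL_CTRL_START) != 0;
}

/* A fill has completed once bit0 is cleared and bit1 is set. */
static inline bool memfill_is_finished(uint32_t control)
{
    return (control & MEMFILL_CTRL_START) == 0 &&
           (control & MEMFILL_CTRL_FINISHED) != 0;
}
```

In practice completion would normally be observed through the PSC0 interrupt rather than by polling; the helpers only decode a value that has already been read.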
Line 473: Line 475:
|-
|-
| 0x1EF00C08
| 0x1EF00C08
| DisplayTransfer output width (bits 0-15) and height (bits 16-31).
| DisplayTransfer output width (bits 0-15) and height (bits 16-31)
|-
|-
| 0x1EF00C0C
| 0x1EF00C0C
| DisplayTransfer input width and height.
| DisplayTransfer input width and height
|-
|-
| 0x1EF00C10
| 0x1EF00C10
| Transfer flags. (See below)
| Transfer flags
|-
|-
| 0x1EF00C14
| 0x1EF00C14
| GSP module writes value 0 here prior to writing to 0x1EF00C18, for cmd3.
| ?, GSP writes value 0 here prior to writing to 0x1EF00C18 for DisplayTransfer
|-
|-
| 0x1EF00C18
| 0x1EF00C18
|  Setting bit0 starts the transfer. Upon completion, bit0 is unset and bit8 is set.
|  Setting bit0 starts the transfer; upon completion, bit0 is unset and bit8 is set
|-
|-
| 0x1EF00C1C
| 0x1EF00C1C
Line 491: Line 493:
|-
|-
| 0x1EF00C20
| 0x1EF00C20
| TextureCopy total amount of data to copy, in bytes.
| TextureCopy total amount of data to copy, in bytes
|-
|-
| 0x1EF00C24
| 0x1EF00C24
| TextureCopy input line width (bits 0-15) and gap (bits 16-31), in 16 byte units.
| TextureCopy input line width (bits 0-15) and gap (bits 16-31), in 16 byte units
|-
|-
| 0x1EF00C28
| 0x1EF00C28
| TextureCopy output line width and gap.
| TextureCopy output line width and gap
|}
|}


These registers are used by [[GSP_Shared_Memory|GX command]] 3 and 4. For cmd4, *0x1EF00C18 |= 1 is used instead of just writing value 1. The DisplayTransfer registers are only used if bit 3 of the flags is unset and ignored otherwise. The TextureCopy registers are likewise only used if bit 3 is set, and ignored otherwise.
Transfer flags:


==== Flags Register - 0x1EF00C10 ====
{| class="wikitable" border="1"
{| class="wikitable" border="1"
!  Bit
!  Bit
Line 508: Line 509:
|-
|-
| 0
| 0
| When set, the framebuffer data is flipped vertically.
| When set, the framebuffer data is flipped vertically
|-
|-
| 1
| 1
| When set, the input framebuffer is treated as linear and converted to tiled in the output, converts tiled->linear when unset.
| Linear->tiled mode (overrides tiled->linear mode)
|-
|-
| 2
| 2
| This bit is required when the output width is less than the input width for the hardware to properly crop the lines, otherwise the output will be mis-aligned.
| This bit is required when the output width is less than the input width for the hardware to properly crop the lines, otherwise the output will be mis-aligned
|-
|-
| 3
| 3
| Uses a TextureCopy mode transfer. See below for details.
| TextureCopy mode (overrides all other modes)
|-
|-
| 4
| 4
Line 523: Line 524:
|-
|-
| 5
| 5
| Don't perform tiled-linear conversion. Incompatible with bit 1, so only tiled-tiled transfers can be done, not linear-linear.
| Tiled->tiled mode (overrides tiled->linear, linear->tiled modes)
|-
|-
| 7-6
| 7-6
Line 529: Line 530:
|-
|-
| 10-8
| 10-8
| Input framebuffer color format, value0 and value1 are the same as the [[GPU Registers#Framebuffer_color_formats|LCD Source Framebuffer Formats]] (usually zero)
| Input [[GPU/External_Registers#Framebuffer_color_formats|color format]]
|-
|-
| 11
| 11
Line 535: Line 536:
|-
|-
| 14-12
| 14-12
| Output framebuffer color format
| Output color format
|-
|-
| 15
| 15
Line 541: Line 542:
|-
|-
| 16
| 16
| Use 32x32 block tiling mode, instead of the usual 8x8 one. Output dimensions must be multiples of 32, even if cropping with bit 2 set above.
| Use 32x32 block tiling mode, instead of the usual 8x8 one (output dimensions must be multiples of 32, even if cropping with bit 2 set above)
|-
|-
| 17-23
| 17-23
Line 547: Line 548:
|-
|-
| 24-25
| 24-25
| Scale down the input image using a box filter. 0 = No downscale, 1 = 2x1 downscale. 2 = 2x2 downscale, 3 = invalid
| Scale down the input image using a box filter (0 = No downscale, 1 = 2x1 downscale, 2 = 2x2 downscale, 3 = invalid)
|-
|-
| 31-26
| 31-26
| Not writable
| Not writable
|}
These registers are used by [[GSP_Shared_Memory#Commands|GSP]] for DisplayTransfer and TextureCopy. TextureCopy registers are only used in TextureCopy mode; likewise, DisplayTransfer registers are only used when TextureCopy mode is not set. By default, DisplayTransfer will work in tiled->linear mode.
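
As a minimal sketch, the flags word and the packed width/height registers could be assembled as follows. The bit positions come from the tables above; the macro and function names are hypothetical and not taken from any existing library.

```c
#include <stdint.h>

/* Single-bit flags from the Flags Register table (names are illustrative). */
#define XFER_FLIP_VERTICAL   (1u << 0)
#define XFER_LINEAR_TO_TILED (1u << 1)
#define XFER_CROP            (1u << 2)
#define XFER_TEXTURE_COPY    (1u << 3)
#define XFER_TILED_TO_TILED  (1u << 5)
#define XFER_BLOCK_32X32     (1u << 16)

/* Combine mode bits with the input format (bits 10-8), the output format
 * (bits 14-12) and the downscale mode (bits 25-24). */
static inline uint32_t xfer_flags(uint32_t mode_bits, uint32_t in_fmt,
                                  uint32_t out_fmt, uint32_t downscale)
{
    return mode_bits
         | ((in_fmt    & 0x7u) << 8)
         | ((out_fmt   & 0x7u) << 12)
         | ((downscale & 0x3u) << 24);
}

/* Pack a DisplayTransfer dimension register: width in bits 0-15,
 * height in bits 16-31. */
static inline uint32_t xfer_dim(uint32_t width, uint32_t height)
{
    return (width & 0xFFFFu) | ((height & 0xFFFFu) << 16);
}
```

For example, `xfer_flags(XFER_FLIP_VERTICAL | XFER_CROP, 0, 2, 1)` yields `0x01002005`. Passing no mode bits leaves the transfer in the default tiled->linear mode.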
=== Tiled to linear ===
Unswizzles the input buffer. This is usually used for transferring GPU framebuffer data onto LCD framebuffers. The following constraints apply:
* Output dimensions must not be bigger than input ones.
* Width dimensions must be >= 64.
* Height dimensions must be >= 16.
* Width dimensions are required to be aligned to 16 bytes when doing RGB8 transfers.
** Otherwise they are required to be aligned to 8 bytes.
* If downscale is used, input and output dimensions should be the same, and width/2 must also follow alignment constraints.
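
The constraints above can be collected into a single check, sketched below. Several points are assumptions rather than documented behaviour: the alignment is applied to the width value of both input and output, a single `uses_rgb8` flag stands in for "doing RGB8 transfers", and all type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t width;
    uint32_t height;
} t2l_size;

/* Returns true when the dimensions satisfy the listed tiled->linear
 * constraints. `downscale` uses the flags encoding (0 = none, 1 = 2x1,
 * 2 = 2x2, 3 = invalid). */
static inline bool t2l_dims_valid(t2l_size in, t2l_size out,
                                  bool uses_rgb8, unsigned downscale)
{
    const uint32_t align = uses_rgb8 ? 16u : 8u;

    /* Output must not be bigger than input. */
    if (out.width > in.width || out.height > in.height) return false;
    /* Minimum dimensions. */
    if (in.width < 64 || out.width < 64) return false;
    if (in.height < 16 || out.height < 16) return false;
    /* Width alignment (assumed to apply to both sides). */
    if (in.width % align != 0 || out.width % align != 0) return false;

    if (downscale != 0) {
        if (downscale > 2) return false;
        /* With downscale, input and output dimensions should match ... */
        if (in.width != out.width || in.height != out.height) return false;
        /* ... and width/2 must follow the alignment constraint too. */
        if ((in.width / 2) % align != 0) return false;
    }
    return true;
}
```

With these rules a 240x400 RGB8 transfer passes without downscale but fails with 2x1 downscale, because 120 is not a multiple of 16.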
Format conversion results:
{| class="wikitable" border="1"
!  Conversion
!  Result
|-
| RGBA8 -> RGBA8
| style="background: lightgreen" | Has interrupt, correct output
|-
| RGBA8 -> RGB8
| style="background: lightgreen" | Has interrupt, correct output
|-
| RGBA8 -> RGB565
| style="background: lightgreen" | Has interrupt, correct output
|-
| RGBA8 -> RGB5A1
| style="background: lightgreen" | Has interrupt, correct output
|-
| RGBA8 -> RGBA4
| style="background: lightgreen" | Has interrupt, correct output
|-
| RGB8 -> RGBA8
| style="background: salmon" | No interrupt
|-
| RGB8 -> RGB8
| style="background: lightgreen" | Has interrupt, correct output
|-
| RGB8 -> RGB565
| style="background: salmon" | No interrupt
|-
| RGB8 -> RGB5A1
| style="background: salmon" | No interrupt
|-
| RGB8 -> RGBA4
| style="background: salmon" | No interrupt
|-
| RGB565 -> RGBA8
| style="background: salmon" | No interrupt
|-
| RGB565 -> RGB8
| style="background: salmon" | No interrupt
|-
| RGB565 -> RGB565
| style="background: yellow" | Has interrupt, output not tested
|-
| RGB565 -> RGB5A1
| style="background: yellow" | Has interrupt, output not tested
|-
| RGB565 -> RGBA4
| style="background: yellow" | Has interrupt, output not tested
|-
| RGB5A1 -> RGBA8
| style="background: salmon" | No interrupt
|-
| RGB5A1 -> RGB8
| style="background: salmon" | No interrupt
|-
| RGB5A1 -> RGB565
| style="background: yellow" | Has interrupt, output not tested
|-
| RGB5A1 -> RGB5A1
| style="background: yellow" | Has interrupt, output not tested
|-
| RGB5A1 -> RGBA4
| style="background: yellow" | Has interrupt, output not tested
|-
| RGBA4 -> RGBA8
| style="background: salmon" | No interrupt
|-
| RGBA4 -> RGB8
| style="background: salmon" | No interrupt
|-
| RGBA4 -> RGB565
| style="background: yellow" | Has interrupt, output not tested
|-
| RGBA4 -> RGB5A1
| style="background: yellow" | Has interrupt, output not tested
|-
| RGBA4 -> RGBA4
| style="background: yellow" | Has interrupt, output not tested
|}
|}
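
The results table follows a simple pattern, which the sketch below encodes as a lookup. The enum is a hypothetical label set for illustration and is not claimed to match the register encoding of the color formats; the function only restates the observations in the table.

```c
/* Illustrative format labels; ordering is NOT claimed to match hardware. */
typedef enum {
    FMT_RGBA8,
    FMT_RGB8,
    FMT_RGB565,
    FMT_RGB5A1,
    FMT_RGBA4
} xfer_fmt;

typedef enum {
    T2L_NO_INTERRUPT,   /* transfer never signals completion */
    T2L_OUTPUT_UNTESTED,/* interrupt fires, output not verified */
    T2L_CORRECT         /* interrupt fires, output verified */
} t2l_result;

/* Restates the tiled->linear conversion table as a function. */
static inline t2l_result t2l_conversion_result(xfer_fmt in, xfer_fmt out)
{
    if (in == FMT_RGBA8)
        return T2L_CORRECT;            /* RGBA8 converts to every format */
    if (in == FMT_RGB8)
        return out == FMT_RGB8 ? T2L_CORRECT : T2L_NO_INTERRUPT;
    /* 16-bit inputs: only 16-bit outputs raise the interrupt. */
    if (out == FMT_RGB565 || out == FMT_RGB5A1 || out == FMT_RGBA4)
        return T2L_OUTPUT_UNTESTED;
    return T2L_NO_INTERRUPT;
}
```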


Line 562: Line 658:
  line width = (16 * 24) >> 4 = 24
  line width = (16 * 24) >> 4 = 24
  gap = line width
  gap = line width
  size = (16 * 32 * 24) >> 4 = 768
  size = 16 * 32 * 3 = 1536


By correctly calculating the input and output gap sizes it is possible to use this functionality to copy arbitrary sub-rectangles between differently-sized framebuffers or textures, which is one of its main uses over a regular no-conversion DisplayTransfer. When copying tiled textures/framebuffers it's important to remember that the contents of a tile are laid out sequentially in memory, and so this should be taken into account when calculating the transfer parameters.
By correctly calculating the input and output gap sizes it is possible to use this functionality to copy arbitrary sub-rectangles between differently-sized framebuffers or textures, which is one of its main uses over a regular no-conversion DisplayTransfer. When copying tiled textures/framebuffers it's important to remember that the contents of a tile are laid out sequentially in memory, and so this should be taken into account when calculating the transfer parameters.
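
As a rough sketch of the sub-rectangle case for a linear buffer, the line width and gap for one side of the copy could be derived from the byte length of a rectangle row and the byte stride of the surrounding buffer. This assumes the gap is the number of bytes skipped after each line, which is an interpretation rather than something stated above; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Packs a TextureCopy line width (bits 0-15) and gap (bits 16-31), both in
 * 16 byte units, from the bytes copied per row and the full row stride.
 * Returns false if the values cannot be represented. */
static inline bool texcopy_line_params(uint32_t row_bytes,
                                       uint32_t stride_bytes,
                                       uint32_t *reg_value)
{
    if (row_bytes == 0 || row_bytes > stride_bytes) return false;
    /* Both quantities are expressed in 16 byte units. */
    if (row_bytes % 16 != 0 || stride_bytes % 16 != 0) return false;

    uint32_t width = row_bytes >> 4;
    uint32_t gap   = (stride_bytes - row_bytes) >> 4; /* assumed meaning */
    if (width > 0xFFFFu || gap > 0xFFFFu) return false;

    *reg_value = width | (gap << 16);
    return true;
}
```

For instance, copying 64 bytes per row out of a buffer with a 256 byte stride gives a width of 4 and a gap of 12, packed as `0x000C0004`. Tiled sources need different arithmetic, since each tile is stored contiguously.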