GPU/External Registers

Revision as of 08:39, 7 September 2015 by Yuriks (talk | contribs) (→‎Flags Register - 0x1EF00C10: More info about the 32x32 block mode)

This page describes the address range accessible from the ARM11, used to configure the basic GPU functionality. For information about the internal registers used for 3D rendering, see GPU/Internal Registers.

Map

User VA PA Length Name Comments
0x1EF00004 0x10400004 4 ?
0x1EF00010 0x10400010 16 Memory Fill1 "PSC0" GX command 2
0x1EF00020 0x10400020 16 Memory Fill2 "PSC1" GX command 2
0x1EF00030 0x10400030 4 ?
0x1EF00034 0x10400034 4 GPU Busy Bit31 = cmd-list busy, bit27 = PSC0 busy, bit26 = PSC1 busy.
0x1EF00050 0x10400050 4 ? Writes 0x22221200 on GPU init.
0x1EF00054 0x10400054 4 ? Writes 0xFF2 on GPU init.
0x1EF00400 0x10400400 0x100 Framebuffer Setup "PDC0" (top screen)
0x1EF00500 0x10400500 0x100 Framebuffer Setup "PDC1" (bottom)
0x1EF00C00 0x10400C00 ? Transfer Engine "DMA"
0x1EF01000 0x10401000 0x4 ? Writes 0 on GPU init and before the Command List is used
0x1EF01080 0x10401080 0x4 ? Writes 0x12345678 on GPU init.
0x1EF010C0 0x104010C0 0x4 ? Writes 0xFFFFFFF0 on GPU init.
0x1EF010D0 0x104010D0 0x4 ? Writes 1 on GPU init.
0x1EF014?? 0x104014?? 0x14 "PPF" ?
0x1EF018E0 0x104018E0 0x14 Command List "P3D"
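
The address columns in the map differ by a constant offset, and the GPU Busy register packs three status bits. A minimal sketch of both, assuming nothing beyond the values in the table; the macro and function names are made up for illustration and are not part of any SDK:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bases taken from the map: user VA 0x1EF00000 corresponds to PA 0x10400000. */
#define GPU_REGS_VA_BASE 0x1EF00000u
#define GPU_REGS_PA_BASE 0x10400000u

/* Bits of the GPU Busy register (0x1EF00034), as listed in the map. */
#define GPU_BUSY_CMDLIST (1u << 31)
#define GPU_BUSY_PSC0    (1u << 27)
#define GPU_BUSY_PSC1    (1u << 26)

/* Convert a user virtual register address to its physical address. */
static inline uint32_t gpu_reg_va_to_pa(uint32_t va)
{
    assert(va >= GPU_REGS_VA_BASE);
    return va - GPU_REGS_VA_BASE + GPU_REGS_PA_BASE;
}

/* True if any of the three documented busy bits is set. */
static inline bool gpu_is_busy(uint32_t busy_reg)
{
    return (busy_reg & (GPU_BUSY_CMDLIST | GPU_BUSY_PSC0 | GPU_BUSY_PSC1)) != 0;
}
```

The helpers only do arithmetic on values; reading the actual register is left to the caller.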

Memory Fill

User VA Description
0x1EF000X0 Buffer start physaddr >> 3
0x1EF000X4 Buffer end physaddr >> 3
0x1EF000X8 Fill value
0x1EF000XC Control. bit0: start/busy, bit1: finished, bit8-9: fill-width (0=16bit, 1=3=24bit, 2=32bit)

Memory fills initialize buffers in memory with a given value, similar to memset. A fill is triggered by setting bit0 in the control register; doing so aborts any fill already running on that filling unit. Upon completion, the hardware clears bit0, sets bit1, and fires interrupt PSC0.

These registers are used by GX SetMemoryFill.
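
The register layout above can be sketched as a small helper that computes the four values to write for one fill unit. This is an illustration only: the struct, enum and function names are hypothetical, and the 8-byte alignment check is inferred from the ">> 3" in the table rather than stated by it.

```c
#include <assert.h>
#include <stdint.h>

/* Register image for one fill unit (offsets 0x0, 0x4, 0x8, 0xC). */
typedef struct {
    uint32_t start;   /* buffer start physaddr >> 3 */
    uint32_t end;     /* buffer end physaddr >> 3 */
    uint32_t value;   /* fill value */
    uint32_t control; /* bit0: start/busy, bit1: finished, bits 8-9: fill width */
} psc_fill_regs;

/* Fill-width field values from the table (3 behaves like 1). */
enum psc_fill_width {
    PSC_FILL_16BIT = 0,
    PSC_FILL_24BIT = 1,
    PSC_FILL_32BIT = 2
};

#define PSC_CTRL_START    (1u << 0)
#define PSC_CTRL_FINISHED (1u << 1)

/* Compute the values to write in order to start a fill. */
static inline psc_fill_regs psc_make_fill(uint32_t start_pa, uint32_t end_pa,
                                          uint32_t value, enum psc_fill_width width)
{
    /* Assumption: addresses are stored shifted right by 3, so the low 3 bits must be zero. */
    assert((start_pa & 7u) == 0 && (end_pa & 7u) == 0);
    assert(start_pa < end_pa);

    psc_fill_regs r;
    r.start = start_pa >> 3;
    r.end = end_pa >> 3;
    r.value = value;
    r.control = ((uint32_t)width << 8) | PSC_CTRL_START;
    return r;
}

/* A fill is complete once bit0 has been cleared and bit1 set by the hardware. */
static inline int psc_fill_done(uint32_t control)
{
    return (control & PSC_CTRL_START) == 0 && (control & PSC_CTRL_FINISHED) != 0;
}
```

A caller would write the four fields to 0x1EF000X0..0x1EF000XC, with the control register last, then poll it with psc_fill_done() or wait for the interrupt.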

LCD Source Framebuffer Setup

Offset Length Name Comments
0x5C 4 Framebuffer width & height Lower 16 bits: width, upper 16 bits: height
0x68 4 Framebuffer A first address For top screen, this is the left eye 3D framebuffer.
0x6C 4 Framebuffer A second address For top screen, this is the left eye 3D framebuffer.
0x70 4 Framebuffer format Bit0-15: framebuffer format, bit16-31: unknown
0x78 4 Framebuffer select Bit0: which framebuffer to display, bit1-7: unknown
0x90 4 Framebuffer stride Distance in bytes between the start of two framebuffer rows (must be a multiple of 8).
0x94 4 Framebuffer B first address For top screen, this is the right eye 3D framebuffer. Unused for bottom screen.
0x98 4 Framebuffer B second address For top screen, this is the right eye 3D framebuffer. Unused for bottom screen.
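
As an illustration of the width/height packing at offset 0x5C and the stride constraint at offset 0x90, here is a short sketch. The helper names are hypothetical; the layout is taken directly from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Pack width (lower 16 bits) and height (upper 16 bits) for offset 0x5C. */
static inline uint32_t pdc_pack_dimensions(uint32_t width, uint32_t height)
{
    assert(width <= 0xFFFFu && height <= 0xFFFFu);
    return (height << 16) | width;
}

static inline uint32_t pdc_width(uint32_t reg)  { return reg & 0xFFFFu; }
static inline uint32_t pdc_height(uint32_t reg) { return reg >> 16; }

/* The stride at offset 0x90 must be a multiple of 8 bytes. */
static inline int pdc_stride_valid(uint32_t stride_bytes)
{
    return (stride_bytes & 7u) == 0;
}
```

Applied to the top-screen init value 0x19000F0 listed for 0x1EF0045C further down this page, pdc_width() yields 240 and pdc_height() yields 400.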

Framebuffer format

Bit Description
2-0 Color format
3 ?
4 Unused?
5 Enable parallax barrier (i.e. 3D).
6 1 = main screen, 0 = sub screen. However if bit5 is set, this bit is cleared.
7 ?
9-8 Value 1 = unknown: get rid of rainbow strip on top of screen, 3 = unknown: black screen.
15-10 Unused?

The GSP module only allows LCD stereoscopy to be enabled when bit5=1 and bit6=0 here. Whenever the GSP module updates this register, it automatically disables stereoscopy if those bits are not set this way.
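
The bit rules above can be sketched as follows. The names are hypothetical, only the documented bits (2-0, 5 and 6) are modelled, and the stereo check mirrors the GSP condition described in the text rather than any actual GSP code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PDC_FMT_COLOR_MASK  0x7u
#define PDC_FMT_PARALLAX    (1u << 5)
#define PDC_FMT_MAIN_SCREEN (1u << 6)

/* Build the low bits of the format register from the documented fields. */
static inline uint32_t pdc_format_encode(unsigned color_format, bool parallax,
                                         bool main_screen)
{
    assert(color_format <= 4); /* only values 0-4 are listed as color formats */
    uint32_t reg = color_format;
    if (parallax)
        reg |= PDC_FMT_PARALLAX; /* bit6 is cleared whenever bit5 is set */
    else if (main_screen)
        reg |= PDC_FMT_MAIN_SCREEN;
    return reg;
}

/* Condition under which stereoscopy may be enabled: bit5=1 and bit6=0. */
static inline bool pdc_format_stereo_allowed(uint32_t reg)
{
    return (reg & PDC_FMT_PARALLAX) != 0 && (reg & PDC_FMT_MAIN_SCREEN) == 0;
}

/* Extract the color format field (bits 2-0). */
static inline unsigned pdc_format_color(uint32_t reg)
{
    return reg & PDC_FMT_COLOR_MASK;
}
```

For example, the top-screen init value 0x80340 listed for 0x1EF00470 has bit6 set and bit5 clear, so pdc_format_stereo_allowed() returns false for it.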

Framebuffer color formats

Value Description
0 GL_RGBA8_OES
1 GL_RGB8_OES
2 GL_RGB565_OES
3 GL_RGB5_A1_OES
4 GL_RGBA4_OES

Color components are laid out in reverse byte order, with the most significant bits used first (i.e. non-24-bit pixels are stored as little-endian values). For instance, a raw data stream of two GL_RGB565_OES pixels looks like GGGBBBBB RRRRRGGG GGGBBBBB RRRRRGGG.
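
The byte layout described above can be made concrete with a short sketch. The enum and function names are hypothetical, and the bytes-per-pixel values are derived from the format names rather than stated on this page.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the color format table. */
enum gpu_color_format {
    GPU_RGBA8   = 0,
    GPU_RGB8    = 1,
    GPU_RGB565  = 2,
    GPU_RGB5_A1 = 3,
    GPU_RGBA4   = 4
};

/* Bytes per pixel implied by each format name; 0 for unknown values. */
static inline unsigned gpu_bytes_per_pixel(enum gpu_color_format fmt)
{
    switch (fmt) {
    case GPU_RGBA8:   return 4;
    case GPU_RGB8:    return 3;
    case GPU_RGB565:
    case GPU_RGB5_A1:
    case GPU_RGBA4:   return 2;
    }
    return 0;
}

/* Store one RGB565 pixel as a little-endian 16-bit value. */
static inline void gpu_store_rgb565(uint8_t out[2], unsigned r5, unsigned g6, unsigned b5)
{
    assert(r5 < 32 && g6 < 64 && b5 < 32);
    uint16_t px = (uint16_t)((r5 << 11) | (g6 << 5) | b5);
    out[0] = (uint8_t)(px & 0xFFu); /* GGGBBBBB */
    out[1] = (uint8_t)(px >> 8);    /* RRRRRGGG */
}
```

A fully red pixel (r5 = 31) is therefore stored as the bytes 0x00 0xF8.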

Transfer Engine

Register address Description
0x1EF00C00 Input physical address >> 3
0x1EF00C04 Output physical address >> 3
0x1EF00C08 DisplayTransfer output width (bits 0-15) and height (bits 16-31).
0x1EF00C0C DisplayTransfer input width and height.
0x1EF00C10 Transfer flags. (See below)
0x1EF00C14 GSP module writes value 0 here prior to writing to 0x1EF00C18, for cmd3.
0x1EF00C18 Setting bit0 starts the transfer. Upon completion, bit0 is unset and bit8 is set.
0x1EF00C20 TextureCopy total amount of data to copy, in bytes.
0x1EF00C24 TextureCopy input line width (bits 0-15) and gap (bits 16-31), in bytes.
0x1EF00C28 TextureCopy output line width and gap.

These registers are used by GX commands 3 and 4. For cmd4, *0x1EF00C18 |= 1 is used instead of writing the value 1 directly. The DisplayTransfer registers are used only when bit 3 of the flags is unset; the TextureCopy registers are used only when bit 3 is set. In each case the other set of registers is ignored.
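
A minimal sketch of the values a DisplayTransfer would need, based on the table above. The struct and function names are hypothetical. It also assumes the input dimension register uses the same width/height packing as the output one, which the table does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Register image for a DisplayTransfer (0x1EF00C00 .. 0x1EF00C10). */
typedef struct {
    uint32_t input_addr;  /* 0x1EF00C00: input physical address >> 3 */
    uint32_t output_addr; /* 0x1EF00C04: output physical address >> 3 */
    uint32_t output_dim;  /* 0x1EF00C08: width bits 0-15, height bits 16-31 */
    uint32_t input_dim;   /* 0x1EF00C0C: assumed to use the same packing */
    uint32_t flags;       /* 0x1EF00C10 */
} display_transfer_regs;

static inline uint32_t transfer_pack_dim(uint32_t width, uint32_t height)
{
    assert(width <= 0xFFFFu && height <= 0xFFFFu);
    return (height << 16) | width;
}

static inline display_transfer_regs display_transfer_make(
    uint32_t in_pa, uint32_t out_pa,
    uint32_t in_w, uint32_t in_h,
    uint32_t out_w, uint32_t out_h,
    uint32_t flags)
{
    /* The DisplayTransfer registers are only used when flag bit 3 is unset. */
    assert((flags & (1u << 3)) == 0);
    /* Assumption: addresses are stored >> 3, so the low 3 bits must be zero. */
    assert((in_pa & 7u) == 0 && (out_pa & 7u) == 0);

    display_transfer_regs r;
    r.input_addr = in_pa >> 3;
    r.output_addr = out_pa >> 3;
    r.output_dim = transfer_pack_dim(out_w, out_h);
    r.input_dim = transfer_pack_dim(in_w, in_h);
    r.flags = flags;
    return r;
}
```

After writing these values, the transfer itself is started by setting bit0 of 0x1EF00C18, and completion is signalled by bit8 of the same register.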

Flags Register - 0x1EF00C10

Bit Description
0 When set, the framebuffer data is flipped vertically.
1 When set, the input framebuffer is treated as linear and converted to tiled in the output. When unset, the conversion is tiled->linear.
2 This bit is required when the output width is less than the input width for the hardware to properly crop the lines, otherwise the output will be mis-aligned.
3 Uses a TextureCopy mode transfer. See below for details.
4 Not writable
5 Don't perform tiled-linear conversion. Incompatible with bit 1, so only tiled-tiled transfers can be done, not linear-linear.
7-6 Not writable
10-8 Input framebuffer color format, value0 and value1 are the same as the LCD Source Framebuffer Formats (usually zero)
11 Not writable
14-12 Output framebuffer color format
15 Not writable
16 Use 32x32 block tiling mode, instead of the usual 8x8 one. Output dimensions must be multiples of 32, even if cropping with bit 2 set above.
23-17 Not writable
25-24 Scale down the input image using a box filter. 0 = no downscale, 1 = 2x1 downscale, 2 = 2x2 downscale, 3 = invalid
31-26 Not writable
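
The flag bits in this table can be combined as in the following sketch. The macro and function names are made up for illustration. The asserts encode only the constraints stated in the table: bit 5 is incompatible with bit 1, and scale value 3 is invalid.

```c
#include <assert.h>
#include <stdint.h>

#define TRANSFER_FLIP_VERT       (1u << 0)
#define TRANSFER_LINEAR_TO_TILED (1u << 1)
#define TRANSFER_CROP            (1u << 2)
#define TRANSFER_TEXTURE_COPY    (1u << 3)
#define TRANSFER_NO_CONVERSION   (1u << 5)
#define TRANSFER_TILE_32X32      (1u << 16)

/* Combine option bits with the format fields (bits 10-8, 14-12) and scale (bits 25-24). */
static inline uint32_t transfer_make_flags(uint32_t options, unsigned in_fmt,
                                           unsigned out_fmt, unsigned scale)
{
    assert(in_fmt <= 7 && out_fmt <= 7); /* both are 3-bit fields */
    assert(scale <= 2);                  /* 3 is listed as invalid */
    /* Bit 5 (no tiled-linear conversion) is incompatible with bit 1. */
    assert(!((options & TRANSFER_NO_CONVERSION) && (options & TRANSFER_LINEAR_TO_TILED)));

    return options
         | ((uint32_t)in_fmt << 8)
         | ((uint32_t)out_fmt << 12)
         | ((uint32_t)scale << 24);
}
```

For instance, a cropped transfer from format 0 to format 1 with no scaling gives transfer_make_flags(TRANSFER_CROP, 0, 1, 0), i.e. 0x1004.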

TextureCopy

When bit 3 of the flags register is set, the hardware performs a TextureCopy-mode transfer. In this mode, all other bits of the flags register (except for bit 2, which still needs to be set correctly) and the regular dimension registers are ignored, and no format conversions are done. Instead, it performs a raw data copy from the source to the destination, but with a configurable gap between lines. The total number of bytes to copy is specified in the size register, and the hardware loops reading lines from the input and writing them to the output until this amount is copied. The "gap" specified in the input/output dimension register is the number of bytes to skip after each "width" bytes of the input/output, and is NOT counted towards the total size of the transfer.

By correctly calculating the input and output gap sizes it is possible to use this functionality to copy arbitrary sub-rectangles between differently-sized framebuffers or textures, which is one of its main uses over a regular no-conversion DisplayTransfer. When copying tiled textures/framebuffers it's important to remember that the contents of a tile are laid out sequentially in memory, and so this should be taken into account when calculating the transfer parameters.

Specifying invalid/junk values for the TextureCopy dimensions can result in the GPU hanging while attempting to process this TextureCopy.
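
To illustrate the width/gap arithmetic for a sub-rectangle copy, here is a sketch for linear (non-tiled) buffers, together with a software model of the copy loop described above. All names are hypothetical. The units follow the register table on this page ("in bytes"), and the model reflects the textual description only, not verified hardware behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t total_bytes; /* 0x1EF00C20 */
    uint32_t input_dim;   /* 0x1EF00C24: width bits 0-15, gap bits 16-31 */
    uint32_t output_dim;  /* 0x1EF00C28: same layout */
} texcopy_regs;

static inline uint32_t texcopy_pack(uint32_t width_bytes, uint32_t gap_bytes)
{
    /* Reject junk values; the text warns that they can hang the GPU. */
    assert(width_bytes > 0 && width_bytes <= 0xFFFFu && gap_bytes <= 0xFFFFu);
    return (gap_bytes << 16) | width_bytes;
}

/* Copy a rect_w x rect_h pixel rectangle between linear buffers of different widths. */
static inline texcopy_regs texcopy_subrect(uint32_t src_w, uint32_t dst_w,
                                           uint32_t rect_w, uint32_t rect_h,
                                           uint32_t bpp)
{
    assert(rect_w <= src_w && rect_w <= dst_w);
    texcopy_regs r;
    r.total_bytes = rect_w * rect_h * bpp; /* gaps are not counted */
    r.input_dim = texcopy_pack(rect_w * bpp, (src_w - rect_w) * bpp);
    r.output_dim = texcopy_pack(rect_w * bpp, (dst_w - rect_w) * bpp);
    return r;
}

/* Software model: copy bytes, skipping the gap after each full line on either side. */
static inline void texcopy_model(const uint8_t *src, uint8_t *dst, texcopy_regs r)
{
    uint32_t in_w = r.input_dim & 0xFFFFu, in_gap = r.input_dim >> 16;
    uint32_t out_w = r.output_dim & 0xFFFFu, out_gap = r.output_dim >> 16;
    size_t si = 0, di = 0;
    uint32_t in_col = 0, out_col = 0;

    for (uint32_t copied = 0; copied < r.total_bytes; copied++) {
        dst[di++] = src[si++];
        if (++in_col == in_w)   { in_col = 0;  si += in_gap; }
        if (++out_col == out_w) { out_col = 0; di += out_gap; }
    }
}
```

The input and output addresses passed to the transfer would point at the first pixel of the rectangle in each buffer. For tiled data the same arithmetic has to be done in whole tiles, since a tile's contents are sequential in memory.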

Command List

Register address Description
0x1EF018E0 Buffer size in bytes >> 3
0x1EF018E8 Buffer physical address >> 3
0x1EF018F0 Setting bit0 to 1 starts execution of the GPU command list. Upon completion, bit0 seems to be reset to 0.

These 3 registers are used by GX command 1, which submits a buffer of GPU commands for processing.
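
A small sketch of the values written to these three registers for a given command buffer. The struct and function names are hypothetical, and the alignment asserts are inferred from the ">> 3" in the table.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t size;    /* 0x1EF018E0: buffer size in bytes >> 3 */
    uint32_t addr;    /* 0x1EF018E8: buffer physical address >> 3 */
    uint32_t control; /* 0x1EF018F0: bit0 starts execution */
} p3d_cmdlist_regs;

static inline p3d_cmdlist_regs p3d_cmdlist_make(uint32_t buf_pa, uint32_t size_bytes)
{
    /* Assumption: both values are stored >> 3, so the low 3 bits must be zero. */
    assert((buf_pa & 7u) == 0 && (size_bytes & 7u) == 0);
    assert(size_bytes > 0);

    p3d_cmdlist_regs r;
    r.size = size_bytes >> 3;
    r.addr = buf_pa >> 3;
    r.control = 1u;
    return r;
}
```

The control value would be written last, after the size and address registers have been set up.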

Framebuffers

These LCD framebuffers normally contain the last rendered frames from the GPU. The framebuffers are drawn from left to right instead of top to bottom, so the beginning of the framebuffer is drawn starting at the left side of the screen.

Both of the 3D screen's left/right framebuffers are displayed regardless of the 3D slider's state; however, when the 3D slider is set to "off", the 3D effect is disabled. Normally, when the slider is set to "off", the left/right framebuffer addresses are set to the same physical address. When the 3D effect is disabled and the left/right framebuffers are set to separate addresses, the LCD seems to alternate between displaying the left and right framebuffer each frame.
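
One reading of the left-to-right drawing order is that each row of the framebuffer in memory corresponds to one screen column, with consecutive memory rows moving rightwards across the screen. Under that assumption, the byte offset of a pixel could be computed as below. The function name is hypothetical, and the direction in which pixels run within a column is not stated on this page, so it is left as a plain index.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of a pixel, assuming each memory row is one screen column.
 * column:          screen column, counted from the left edge
 * index_in_column: position of the pixel within that column's memory row
 * stride:          value of the framebuffer stride register, in bytes
 * bpp:             bytes per pixel of the framebuffer format
 */
static inline uint32_t fb_byte_offset(uint32_t column, uint32_t index_in_column,
                                      uint32_t stride, uint32_t bpp)
{
    assert(bpp > 0 && (index_in_column + 1) * bpp <= stride);
    return column * stride + index_in_column * bpp;
}
```

For a top-screen buffer with 240 pixels per memory row at 3 bytes per pixel (stride 720), the pixel at column 1, index 0 would start at byte 720.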

Init Values from nngxInitialize for Top Screen

  • 0x1EF00400 = 0x1C2
  • 0x1EF00404 = 0xD1
  • 0x1EF00408 = 0x1C1
  • 0x1EF0040C = 0x1C1
  • 0x1EF00410 = 0
  • 0x1EF00414 = 0xCF
  • 0x1EF00418 = 0xD1
  • 0x1EF0041C = 0x1C501C1
  • 0x1EF00420 = 0x10000
  • 0x1EF00424 = 0x19D
  • 0x1EF00428 = 2
  • 0x1EF0042C = 0x1C2
  • 0x1EF00430 = 0x1C2
  • 0x1EF00434 = 0x1C2
  • 0x1EF00438 = 1
  • 0x1EF0043C = 2
  • 0x1EF00440 = 0x1960192
  • 0x1EF00444 = 0
  • 0x1EF00448 = 0
  • 0x1EF0045C = 0x19000F0
  • 0x1EF00460 = 0x1C100D1
  • 0x1EF00464 = 0x1920002
  • 0x1EF00470 = 0x80340
  • 0x1EF0049C = 0

More Init Values from nngxInitialize for Top Screen

  • 0x1EF00468 = 0x18300000, later changed by GSP module when updating state, framebuffer
  • 0x1EF0046C = 0x18300000, later changed by GSP module when updating state, framebuffer
  • 0x1EF00494 = 0x18300000
  • 0x1EF00498 = 0x18300000
  • 0x1EF00478 = 1, doesn't stay 1, read as 0
  • 0x1EF00474 = 0x10501