GPU/Commands

From 3dbrew
Revision as of 03:12, 10 November 2014 by Smea (Command IDs)

This page describes the structure of the buffer submitted via the registers at 0x1EF018E0 (or equivalently via GX command 1). This buffer is used for GPU commands including functionality equivalent to OpenGL commands.

Overview

Each command is at least 8 bytes wide. The first word is the command parameter and the second word constitutes the command header. Optionally, more parameter words may follow (potentially including a padding word to align commands to multiples of 8 bytes).

In the simplest case, a command is exactly 8 bytes wide. You can think of such a command as writing the parameter word to an internal register (the index of which is given in the command header). The more general case where more than one parameter word is given is equivalent to multiple simple commands (one for each parameter word). If consecutive writing mode is enabled in the command header, the current command index will be incremented after each parameter write. Otherwise, the parameters will be consecutively written to the same register.

For example, the sequence "0xAAAAAAAA 0x802F011C 0xBBBBBBBB 0xCCCCCCCC" is equivalent to a call to commands 0xF011C with parameter 0xAAAAAAAA, 0xF011D with parameter 0xBBBBBBBB and 0xF011E with parameter 0xCCCCCCCC. If consecutive writing mode were disabled, the command would be equivalent to three consecutive calls to 0xF011C (once with parameter 0xAAAAAAAA, once with 0xBBBBBBBB, and finally with 0xCCCCCCCC).
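
The expansion described above can be sketched in Python. This is a minimal illustration, not GSP or GPU code: the function name `expand_commands` and the representation of the buffer as a list of 32-bit words are assumptions made for the example, and the parameter mask is only reported, not applied.

```python
def expand_commands(words):
    """Expand a command buffer (list of u32 words) into (command ID, mask, value) writes."""
    writes = []
    i = 0
    while i + 1 < len(words):
        param, header = words[i], words[i + 1]
        cmd_id = header & 0xFFFF            # bits 15-0: command ID
        mask = (header >> 16) & 0xF         # bits 19-16: parameter mask
        extra = (header >> 20) & 0x7FF      # bits 30-20: number of extra parameters
        consecutive = (header >> 31) & 1    # bit 31: consecutive writing mode
        params = [param] + list(words[i + 2:i + 2 + extra])
        for n, value in enumerate(params):
            target = cmd_id + n if consecutive else cmd_id
            writes.append((target, mask, value))
        i += 2 + extra
        if i % 2:                           # skip the padding word (8-byte alignment)
            i += 1
    return writes


for cmd_id, mask, value in expand_commands(
        [0xAAAAAAAA, 0x802F011C, 0xBBBBBBBB, 0xCCCCCCCC]):
    print(hex(cmd_id), hex(mask), hex(value))
```

The padding check assumes the buffer itself starts on an 8-byte boundary, so an odd word index always means a padding word follows.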

Invalid GPU command parameters, including NaN floats, can cause the GPU to hang, which in turn causes the GSP module to hang as well.

Command Header

Bit Description
15-0 Command ID
19-16 Parameter mask
30-20 Number of extra parameters (may be zero)
31 Consecutive writing mode
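
As a rough sketch, a header word can be assembled from the four fields in the table above. The helper name `make_header` and its keyword defaults are hypothetical, chosen only for this example.

```python
def make_header(cmd_id, extra_params=0, mask=0xF, consecutive=False):
    """Pack the command header fields into a single 32-bit word."""
    if not 0 <= cmd_id <= 0xFFFF:
        raise ValueError("command ID must fit in bits 15-0")
    if not 0 <= mask <= 0xF:
        raise ValueError("parameter mask must fit in bits 19-16")
    if not 0 <= extra_params <= 0x7FF:
        raise ValueError("extra parameter count must fit in bits 30-20")
    return (int(bool(consecutive)) << 31) | (extra_params << 20) | (mask << 16) | cmd_id


# Header for command 0x011C, two extra parameters, full mask, consecutive mode.
print(hex(make_header(0x011C, extra_params=2, consecutive=True)))
```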

Parameter masking

Using a value other than 0xF, parts of a word in internal GPU memory can be updated without touching the other bits of it. For example, setting bit 16 to zero indicates that the least significant byte of the parameter will not be overwritten, setting bit 17 to zero indicates that the parameter's second LSB will not be overwritten, etc. This means that for instance commands 0x00010107 and 0x00020107 refer to the same thing but write different parts of the parameter.
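
The effect of the mask can be modelled as a byte-wise merge of the old register contents with the parameter. This is only a sketch of the behaviour described above; `apply_masked_write` is a made-up name and `mask` is the 4-bit field from header bits 19-16.

```python
def apply_masked_write(old, param, mask):
    """Return the new 32-bit register value after a masked parameter write."""
    byte_mask = 0
    for byte in range(4):
        if mask & (1 << byte):              # mask bit n enables byte n (0 = least significant)
            byte_mask |= 0xFF << (8 * byte)
    return (old & ~byte_mask & 0xFFFFFFFF) | (param & byte_mask)


# Mask 0x1 (as in header 0x00010107) only updates the least significant byte.
print(hex(apply_masked_write(0x11223344, 0xAABBCCDD, 0x1)))
```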

Command IDs

CommandID Parameter Description
0x0010 Value is 0x12345678 This command is always the last command in the buffer.
0x0110 Value 0x1 This command comes immediately before command 0x0010. It is also used elsewhere to begin rendering of mesh(es).
0x0111 Value 0x1 This command comes immediately before command 0x0110; however, command 0x0110 doesn't always follow this command.
0x0040 u32. Valid values are 0x1 and 0x2; values 0x0 and 0x3 have the same effect as value 0x2. Only bits 1-0 are used. Value 2 = GL_FRONT/GL_CW or GL_BACK/GL_CCW. Value 1 = GL_FRONT/GL_CCW or GL_BACK/GL_CW.
0x0041 float24 VIEWPORT_WIDTH. See command set 0x0041.
0x0042 float32 VIEWPORT_WIDTH_INV. See command set 0x0041.
0x0043 float24 VIEWPORT_HEIGHT. See command set 0x0041.
0x0044 float32 VIEWPORT_HEIGHT_INV. See command set 0x0041.
0x004D See command set 0x004D.
0x0065 Scissor test. See command set 0x0065.
0x0068 u32 VIEWPORT Y/X. See command set 0x0041.
0x006D See command set 0x004D.
0x006E u32 See command set 0x0111.
0x006F u32 See command set 0x006F.
0x0080 u32 See command set 0x0080.
0x0081 This is used to set the current texture info used for rendering, see command set 0x0081.
0x008E u32 color type This command sets the texture color type, see command set 0x0081.
0x0091 This sets current texture info, see command 0x0091.
0x0099 This sets current texture info, see command 0x0099.
0x00C3 val<<24 Val is usually 0xFF or 0x00, however 0x00-0xFF is valid as well. This is alpha-blending related?
0x00CB val<<24 Val is usually 0xFF or 0x00, however 0x00-0xFF is valid as well. This is alpha-blending related?
0x00C0 See command set 0x00C0.
0x00C4 See command set 0x00C0.
0x00C8 See command set 0x00C0.
0x00CC See command set 0x00C0.
0x00D0 See command set 0x00C0.
0x00D4 See command set 0x00C0.
0x00D8 See command set 0x00C0.
0x00DC See command set 0x00C0.
0x00F0 See command set 0x00C0.
0x00F4 See command set 0x00C0.
0x00F8 See command set 0x00C0.
0x00FC See command set 0x00C0.
0x00E0 Normally value zero. Unknown, fragment related?
0x00E0 See command set 0x00E0.
0x00E1 See command set 0x00E0.
0x00E6 Value zero See command set 0x00E6.
0x00E8 See command set 0x00E6.
0x0100 u32, value is 0x00E40100 See command set 0x0100.
0x0100 0x00E40000 | val. See command set 0x0100.
0x0101 u32 See command set 0x0100.
0x0102 u32 See command set 0x0100.
0x0103 See command set 0x0100.
0x0104 u32 glAlphaFunc()
0x0105 u32 Stencil test settings
0x0106 u32 Stencil replacement operators
0x0107 See command set 0x0107.
0x0116 u32 DEPTHBUFFER FORMAT. See command set 0x0111.
0x0117 u32 COLORBUFFER FORMAT/PIXEL. See command set 0x0111.
0x011C Physical address>>3 DEPTHBUFFER ADDRESS. See command set 0x0111.
0x011D Physical address>>3 COLORBUFFER ADDRESS. See command set 0x0111.
0x011E u32 COLORBUFFER HEIGHT/WIDTH. See command set 0x0111.
0x0112 ?
0x01C5 u32 dmp_FragmentLightSource set ID and enable/disable. See command 0x01C8.
0x01C8 u32 Used to send dmp_FragmentLightSource parameters. See command 0x01C8.
0x0200 See command set 0x0200.
0x0126 See command set 0x0107.
0x0227 u32 This specifies the address of an array containing vertex array indices, and the data-type of the indices, used for rendering primitives. See command set glDrawElements().
0x0228 u32 total elements in the array to use for rendering. See command set glDrawElements().
0x0232 See command set 0x0200.
0x025E u32, val<<8. This sets the GL rendering mode, see command set 0x0200.
0x02B0 u32, value is 0x7FFF0000 | val. Texture related?
0x02BB See command set 0x0200.
0x02BA 0x7FFF0000 | entrypoint offset Sets the entrypoint offset for the shader program
0x02C0 0x80000000 | Type This is used immediately before command 0xXXXF02C1. This type field controls the command parameter buffer type. This command can also be used to send over (float24 only ?) data directly, without using 0xXXXF02C1. In that case, the first parameter is still Type but with bit 31 not set; the actual data follows.
0x02C1 First word in the first entry A list of entries follow this command.
0x02CB Value 0x0 ? This is used immediately before command 0xXXXF02CC. It is used to indicate that shader program data will follow.
0x02CC First word of shader program data chunk. This command is used to transfer shader program data (as the parameter data). It can be called multiple times in a row if the shader program is too big to fit into a single call.
0x02BF Value 0x1 ? This is used immediately after a set of command 0xXXXF02CC. It is used to indicate that shader program data transfer is complete.
0x02D5 Value 0x0 ? This is used immediately before command 0xXXXF02D6.
0x02D6 First entry. This is used to send over the shader program operand descriptor table.
0x004F Number of shader output attributes Sets number of shader output attributes
0x0050 First entry This command is used to set up shader output registers. The n-th word-long entry is a map of the (n*2)-th output register's components. Each byte of each entry corresponds to where a component is mapped. Value 0x1F indicates that the corresponding component is unused.

Command Sets

glDrawElements()

See GPU GL Arrays.

glClear() / glClearColor()

The GPU does not have dedicated commands for clearing the color buffer; applications therefore implement color buffer clearing by rendering a quad. Applications normally store the vertex and color arrays for this quad in the GSP application heap.

Command 0x0081

This sets current texture info, see GPU textures.

Command 0x0065

Command Index CommandID Parameter Description
0 0x0065 Scissor test enable 0 = disabled, 1 = inverted (pixels within the scissor box are excluded), 2 = disabled, 3 = normal (pixels outside of the scissor box are excluded)
1 0x0066 Scissor box X/Y Bit 0-15: X, bit 16-31: Y
2 0x0067 Scissor box width/height Bit 0-15: width-1, bit 16-31: height-1
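
A minimal sketch of how the three scissor parameters could be packed, following the bit layout in the table above. `scissor_params` is a hypothetical helper that returns (command ID, parameter) pairs; it does not build headers.

```python
def scissor_params(mode, x, y, width, height):
    """Return (command ID, parameter) pairs for command set 0x0065."""
    if mode not in (0, 1, 2, 3):
        raise ValueError("mode must be 0-3 (3 = normal, 1 = inverted)")
    box_xy = ((y & 0xFFFF) << 16) | (x & 0xFFFF)                        # bit 0-15: X, bit 16-31: Y
    box_wh = (((height - 1) & 0xFFFF) << 16) | ((width - 1) & 0xFFFF)   # width-1 / height-1
    return [(0x0065, mode), (0x0066, box_xy), (0x0067, box_wh)]


for cmd_id, param in scissor_params(3, 0, 0, 240, 400):
    print(hex(cmd_id), hex(param))
```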

Command 0x006F

Command Index CommandID Parameter Description
0 0x006F Typically only bit8-10 are used. Bit8 enables texture coordinate output for texture unit 0, bit9 enables texcoords for texture unit 1, and bit10 enables texcoords for texture unit 2.

Command 0x0080

Command Index CommandID Parameter Description
0 0x0080 0x11000 | val, where only bits 2-0 are used in val. bit0-2 enables/disables texture units 0-2 respectively

Note that bit0-2 in this command only enable texture processing. For texturing to work fully, the corresponding texture coordinate outputs must be enabled as well via command 0x006F.
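
The pairing of the two commands can be sketched as below. This assumes that texture unit 2 maps to bit 10 of command 0x006F, in line with the stated "bit8-10" range; `texture_unit_params` is a hypothetical helper.

```python
def texture_unit_params(units):
    """Return {command ID: parameter} enabling the given texture units (0-2)."""
    bits = 0
    for unit in units:
        if unit not in (0, 1, 2):
            raise ValueError("texture unit must be 0, 1 or 2")
        bits |= 1 << unit
    return {
        0x0080: 0x11000 | bits,   # bit0-2: enable texture processing per unit
        0x006F: bits << 8,        # bit8-10: enable texture coordinate output per unit (assumed)
    }


print({hex(k): hex(v) for k, v in texture_unit_params([0, 2]).items()})
```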

Command 0x00C0

Command Index CommandID Parameter Description
0 SlotCmdID
1 SlotCmdID + 4

This is used for glTexEnv(), for the slot indicated by the command id. There's a total of 6 slots, where each slot corresponds to the following u16 command ids: 0xC0, 0xC8, 0xD0, 0xD8, 0xF0, 0xF8.

Command 0x00E0

Command Index CommandID Parameter Description
0 0x00E0 5 | val<<16, where val is 0 or 1. Val0 = enable, val1 = disable.
1 0x00E1 This specifies a color.

This is usually used immediately after command set glDrawElements(). This is used to specify a color used for blending?

Command 0x00E6

Command Index CommandID Parameter Description
0 0x00E6 Value 0 ?
1 0x00E8

This is usually the last command set used for rendering a mesh, when command set 0x00E0 was used. This command set is used immediately after command set 0x00E0.

Command 0x0100

Command Index CommandID Parameter Description
0 0x0100 Value 0x00E40100 Controls color compositing
1 0x0101 0x01010000 when disabled Alphablending equations and factors
2 0x0103 This is set to zero when the command 0x0101 parameter is value 0x01010000. Constant color for alphablending

This is fragment related?

Command 0x004D

Command Index CommandID Parameter Description
0 0x004D glDepthRange()
1 0x006D 0 = unknown, 1 = unknown. Value zero causes the mesh to not be rendered.

Command 0x0041

Command Index CommandID Parameter Description
0 0x0041 float This corresponds to the framebuffer width.
1 0x0043 float This parameter value is calculated the same way as the command 0x0041 parameter, except the framebuffer height is used instead.
2 0x0042 float This corresponds to the framebuffer width.
3 0x0044 float This parameter value is calculated the same way as the command 0x0042 parameter, except the framebuffer height is used instead.
4 0x0068 u32 This sets the X/Y coordinates used for glViewport().

This command set initializes the projection matrix. It is used twice when beginning rendering for each screen. The framebuffer width used here for the main screen is 240; with stereoscopy enabled, the width is 480 the second time this command set is used.

Command 0x0111

Command Index CommandID Parameter Description
0 0x0111 Value 1
1 0x0110 Value 1
2 0x0117 Bits15-0 = unk, 31-16 = unk. Unknown, normally the input parameter is value 0x2.
3 0x011D Physical address>>3 This initializes the framebuffer address used for rendering. This framebuffer is used as the input framebuffer with GX commands 3 and 4. This command is used immediately after command 0x0117.
4 0x0116 ?
5 0x011C Physical address>>3 Unknown, normally this address is located in VRAM.
6 0x011E (((h-1)&0xFFF)<<12)|(w&0xFFF) This sets the width and height of the framebuffer used for rendering, corresponding to the width/height arguments of glViewport(); x/y are specified by command 0x0068.
7 0x006E Same input parameter value as command 0x011E.

This command set is normally used after the two 0x0041 command sets.

Command 0x0107

Command Index CommandID Parameter Description
0 0x0107
1 0x0126 type<<24

This command set is used for disabling the alpha-blending info set by command set 0x0107? The GL AlphaFunction used here is normally GL_ALWAYS.

Parameter format for command 0x0107

Bit Description
0 0 = disable GL_DEPTH_TEST, 1 = enable GL_DEPTH_TEST
3-1 Unused?
7-4 Depth test function
8 Enable color writing for red component
9 Enable color writing for green component
10 Enable color writing for blue component
11 Enable color writing for alpha component
12 Enable depth writing (doesn't affect stencil writing)
31-13 Unused
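
The bit layout above could be packed as follows. This is a sketch under stated assumptions: `depth_color_mask_param` is a hypothetical helper, and the depth test function is assumed to use the same numbering as the alpha function table (e.g. 4 = GL_LESS), which the stencil section of this page suggests but does not state outright.

```python
def depth_color_mask_param(depth_test, depth_func,
                           red=True, green=True, blue=True, alpha=True,
                           depth_write=True):
    """Pack the parameter word for command 0x0107."""
    if not 0 <= depth_func <= 0xF:
        raise ValueError("depth test function must fit in bits 7-4")
    return (int(bool(depth_test))           # bit 0: GL_DEPTH_TEST enable
            | (depth_func << 4)             # bits 7-4: depth test function
            | (int(bool(red)) << 8)         # bits 8-11: color write enables
            | (int(bool(green)) << 9)
            | (int(bool(blue)) << 10)
            | (int(bool(alpha)) << 11)
            | (int(bool(depth_write)) << 12))  # bit 12: depth write enable


print(hex(depth_color_mask_param(True, 4)))
```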

Alpha function values

Value GL AlphaFunction
0 GL_NEVER
1 GL_ALWAYS
2 GL_EQUAL
3 GL_NOTEQUAL
4 GL_LESS
5 GL_LEQUAL
6 GL_GREATER
7 GL_GEQUAL

Alpha types for command 0x0126

Type GL AlphaFunction
0 GL_NEVER
1 GL_ALWAYS
2 GL_GREATER/GL_GEQUAL
3 The remaining GL alpha functions.

Parameter value format for command 0x0104

Bit Description
0 0 = disable GL_ALPHA_TEST, 1 = enable GL_ALPHA_TEST
3-1 Unused?
7-4 Alpha function
15-8 u8 ref, range is 0-255
31-16 Unused?

This is glAlphaFunc().
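
For illustration, the 0x0104 parameter and the matching 0x0126 type value can be derived from a GL alpha function name using the tables above. Both helper names are hypothetical.

```python
ALPHA_FUNCS = ["GL_NEVER", "GL_ALWAYS", "GL_EQUAL", "GL_NOTEQUAL",
               "GL_LESS", "GL_LEQUAL", "GL_GREATER", "GL_GEQUAL"]


def alpha_test_param(enable, func, ref):
    """Pack the parameter word for command 0x0104."""
    if not 0 <= ref <= 255:
        raise ValueError("ref must be in the range 0-255")
    return int(bool(enable)) | (ALPHA_FUNCS.index(func) << 4) | (ref << 8)


def alpha_type_param(func):
    """Return the type<<24 parameter for command 0x0126."""
    if func not in ALPHA_FUNCS:
        raise ValueError("unknown alpha function")
    if func == "GL_NEVER":
        alpha_type = 0
    elif func == "GL_ALWAYS":
        alpha_type = 1
    elif func in ("GL_GREATER", "GL_GEQUAL"):
        alpha_type = 2
    else:
        alpha_type = 3                      # the remaining GL alpha functions
    return alpha_type << 24


print(hex(alpha_test_param(True, "GL_GREATER", 0x80)), hex(alpha_type_param("GL_GREATER")))
```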

Parameter value format for command 0x011E

Bit Description
11-0 Framebuffer/viewport width
23-12 Framebuffer/viewport height - 1
24 Must be set
31-25 Unused?

This specifies the width/height for glViewport(). Normally the framebuffer width and height are set to the same dimensions used with GX commands 3 and 4.
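
A small sketch of packing and unpacking this parameter, including bit 24, which the table marks as required. The function names are made up for the example.

```python
def framebuffer_dim_param(width, height):
    """Pack the parameter word for command 0x011E."""
    if not (0 <= width <= 0xFFF and 1 <= height <= 0x1000):
        raise ValueError("width/height out of range for the 12-bit fields")
    return (1 << 24) | ((height - 1) << 12) | width   # bit 24 must be set


def framebuffer_dim_unpack(param):
    """Return (width, height) from a command 0x011E parameter word."""
    return param & 0xFFF, ((param >> 12) & 0xFFF) + 1


print(hex(framebuffer_dim_param(240, 400)))
```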

Parameter value format for command 0x0068

Bit Description
15-0 X
31-16 Y

This specifies the X/Y coordinates for glViewport().

Parameter structure for command 0x00C0

Index Word Description
0 Value 0xFFF0FFF / 0x0
1 Value 0x0
2 Value 0x0
3 Value 0xFFFFFFFF
4 Value 0x0

This individual command is used instead of the 0x80XF00C0 command set when none of the associated rendering parameters for this slot are set.

Parameter structure for command 0x00C0

Index Word Description
0 Param0
1 Param1
2 Param2

See command set 0x80XF00C0.

Param0 format for command 0x00C0

Bit Description
3-0 See the values below. (Field0 index0)
7-4 See the values below. (Field0 index1)
11-8 See the values below. (Field0 index2)
15-12 Unused
19-16 See the values below. (Field1 index0)
23-20 See the values below. (Field1 index1)
27-24 See the values below. (Field1 index2)
31-28 Unused

Param0 values for command 0x00C0

Value GL type
0x0 GL_PRIMARY_COLOR
0x1 ?
0x2 ?
0x3 GL_TEXTURE0
0x4 GL_TEXTURE1
0x5 GL_TEXTURE2
0x6 GL_TEXTURE3
0xC-0x7 GL_PRIMARY_COLOR
0xD ?
0xE GL_CONSTANT
0xF GL_PREVIOUS
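
As an illustration of the Param0 layout, the two three-entry fields can be packed from the 4-bit values in the table above. `texenv_param0` is a hypothetical name; the page does not say what Field0 and Field1 represent, so the sketch stays neutral on that.

```python
GL_PRIMARY_COLOR = 0x0
GL_TEXTURE0 = 0x3
GL_CONSTANT = 0xE
GL_PREVIOUS = 0xF


def texenv_param0(field0, field1):
    """Pack two lists of three 4-bit values into Param0 for command 0x00C0."""
    if len(field0) != 3 or len(field1) != 3:
        raise ValueError("each field needs exactly three values")
    word = 0
    for index, value in enumerate(field0):
        word |= (value & 0xF) << (4 * index)         # bits 3-0, 7-4, 11-8
    for index, value in enumerate(field1):
        word |= (value & 0xF) << (16 + 4 * index)    # bits 19-16, 23-20, 27-24
    return word


print(hex(texenv_param0([GL_TEXTURE0, GL_PRIMARY_COLOR, GL_PREVIOUS],
                        [GL_TEXTURE0, GL_PRIMARY_COLOR, GL_PREVIOUS])))
```

With every entry set to GL_PREVIOUS the result is 0x0FFF0FFF, which appears to match the 0xFFF0FFF value listed for an unused slot.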

Param1 format for command 0x00C0

Bit Description
3-0 See the field0 values below. (Index0)
7-4 See the field0 values below. (Index1)
11-8 See the field0 values below. (Index2)
15-12 See the field1 values below. (Index0)
19-16 See the field1 values below. (Index1)
23-20 See the field1 values below. (Index2)
31-24 Unused

This specifies the pname for glTexEnv().

Param1 field0 values for command 0x00C0

Value GL type
0x0 GL_SRC_COLOR
0x1 GL_ONE_MINUS_SRC_COLOR
0x2 GL_SRC_ALPHA
0x3 GL_ONE_MINUS_SRC_ALPHA
0x4 GL_SRC0_RGB
0x5 ?
0x6 GL_SRC_COLOR
0x7 GL_SRC_COLOR
0x8 GL_SRC1_RGB
0x9 ?
0xA GL_SRC_COLOR
0xB GL_SRC_COLOR
0xC GL_SRC2_RGB
0xD ?

Param1 field1 values for command 0x00C0

Value GL type
0x0 GL_SRC_ALPHA
0x1 GL_ONE_MINUS_SRC_ALPHA
0x2 GL_SRC0_RGB
0x3 ?
0x4 GL_SRC1_RGB
0x5 ?
0x6 GL_SRC2_RGB
0x7 ?

Param2 format for command 0x00C0

Bit Description
15-0 See below field0 values.
31-16 See below field1 values.

This is used to specify the param for glTexEnv(..., ..., param).

Param2 field0 values for command 0x00C0

Value GL type
0x0 GL_REPLACE
0x1 GL_MODULATE
0x2 GL_ADD
0x3 GL_ADD_SIGNED
0x4 GL_INTERPOLATE
0x5 GL_SUBTRACT
0x6 GL_DOT3_RGB
0x7 GL_DOT3_RGBA
0x8 ?
0x9 ?

Param2 field1 values for command 0x00C0

Value GL type
0x0 GL_REPLACE
0x1 GL_MODULATE
0x2 GL_ADD
0x3 GL_ADD_SIGNED
0x4 GL_INTERPOLATE
0x5 GL_SUBTRACT
0x6 GL_REPLACE
0x7 GL_DOT3_RGB
0x8 ?
0x9 ?

Parameter value format for command 0x00C4

Bit Description
15-0 Valid values: 0=unknown, 1=unknown, 2=unknown.
31-16 Same format as bits15-0.

See command set 0x80XF00C0.

Parameter value format for command 0x00E1

Bit Description
7-0 Red component
15-8 Green component
23-16 Blue component
31-24 Unused

Parameter value format for command 0x0100

This command controls color compositing. It is typically used right after commands 0x0101 or 0x0102 to select the appropriate blending mode.

Alphablending and color logic op can't be used together. Attempting to issue commands 0x0101 and 0x0102 at the same time can freeze the GPU.

For blending to work correctly, color buffer reading needs to be enabled (see command set 0x0112). Otherwise zero values will be used as destination color/alpha.

Bit Description
0 Weird mode (see below)
1 When set, nothing is drawn to the color, depth and stencil buffers. This bit can cause a noisy picture when used with bit 0 (this seems to also cause the depth buffer's endianness to be reversed, and forces stencil values to 0xFF).
8 Selects blending mode. 0 = color logic op, 1 = alphablending
23-16 Unknown, typically set to 0xE4. No observed effect when changing this.
25-24 0 = normal, 1-3 = apply dithering (3 = 0% source)

When "weird mode" is enabled, the source color/alpha values are ignored. Instead, each 16-bit value in the destination color buffer is converted according to its bits 14-8, as follows:

* if bits 14-8 are between 0x00 and 0x03, the value is replaced with 0x0000
* if bits 14-8 are between 0x7D and 0x7F, the value is replaced with 0x7FFF
* in all other cases, the value is left unchanged
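
The conversion rule in the list above can be written out directly. This is only a model of the described behaviour for a single 16-bit destination value; `weird_mode_convert` is a made-up name.

```python
def weird_mode_convert(value):
    """Apply the "weird mode" conversion to one 16-bit color buffer value."""
    field = (value >> 8) & 0x7F             # bits 14-8
    if field <= 0x03:
        return 0x0000
    if field >= 0x7D:
        return 0x7FFF
    return value


for sample in (0x0312, 0x0400, 0x7D00, 0x1234):
    print(hex(sample), "->", hex(weird_mode_convert(sample)))
```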

Parameter value format for command 0x0101

This command controls alphablending. To disable alphablending, the value is set to 0x01010000.

Bit Description
7-0 Color blend equation
15-8 Alpha blend equation
19-16 Color source factor
23-20 Color destination factor
27-24 Alpha source factor
31-28 Alpha destination factor

Blend equation values:

Value Description
0 GL_FUNC_ADD
1 GL_FUNC_SUBTRACT
2 GL_FUNC_REVERSE_SUBTRACT
3 GL_MIN
4 GL_MAX

Source/destination factor values:

Value Description
0 GL_ZERO
1 GL_ONE
2 GL_SRC_COLOR
3 GL_ONE_MINUS_SRC_COLOR
4 GL_DST_COLOR
5 GL_ONE_MINUS_DST_COLOR
6 GL_SRC_ALPHA
7 GL_ONE_MINUS_SRC_ALPHA
8 GL_DST_ALPHA
9 GL_ONE_MINUS_DST_ALPHA
10 GL_CONSTANT_COLOR
11 GL_ONE_MINUS_CONSTANT_COLOR
12 GL_CONSTANT_ALPHA
13 GL_ONE_MINUS_CONSTANT_ALPHA
14 GL_SRC_ALPHA_SATURATE
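
The parameter word can be assembled from the two value tables above. The sketch below uses a hypothetical `blend_param` helper; as a sanity check, GL_FUNC_ADD with GL_ONE/GL_ZERO factors reproduces the 0x01010000 "disabled" value.

```python
BLEND_EQUATIONS = ["GL_FUNC_ADD", "GL_FUNC_SUBTRACT", "GL_FUNC_REVERSE_SUBTRACT",
                   "GL_MIN", "GL_MAX"]
BLEND_FACTORS = ["GL_ZERO", "GL_ONE", "GL_SRC_COLOR", "GL_ONE_MINUS_SRC_COLOR",
                 "GL_DST_COLOR", "GL_ONE_MINUS_DST_COLOR", "GL_SRC_ALPHA",
                 "GL_ONE_MINUS_SRC_ALPHA", "GL_DST_ALPHA", "GL_ONE_MINUS_DST_ALPHA",
                 "GL_CONSTANT_COLOR", "GL_ONE_MINUS_CONSTANT_COLOR",
                 "GL_CONSTANT_ALPHA", "GL_ONE_MINUS_CONSTANT_ALPHA",
                 "GL_SRC_ALPHA_SATURATE"]


def blend_param(color_eq, alpha_eq, color_src, color_dst, alpha_src, alpha_dst):
    """Pack the parameter word for command 0x0101 from GL enum names."""
    eq = BLEND_EQUATIONS.index
    factor = BLEND_FACTORS.index
    return (eq(color_eq)                    # bits 7-0
            | (eq(alpha_eq) << 8)           # bits 15-8
            | (factor(color_src) << 16)     # bits 19-16
            | (factor(color_dst) << 20)     # bits 23-20
            | (factor(alpha_src) << 24)     # bits 27-24
            | (factor(alpha_dst) << 28))    # bits 31-28


print(hex(blend_param("GL_FUNC_ADD", "GL_FUNC_ADD",
                      "GL_ONE", "GL_ZERO", "GL_ONE", "GL_ZERO")))
```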

Parameter value format for command 0x0102

This command controls color logic op.

Bit Description
3-0 Logic operation

Logic operation values:

Value Description
0 GL_CLEAR
1 GL_AND
2 GL_AND_REVERSE
3 GL_COPY
4 GL_SET
5 GL_COPY_INVERTED
6 GL_NOOP
7 GL_INVERT
8 GL_NAND
9 GL_OR
10 GL_NOR
11 GL_XOR
12 GL_EQUIV
13 GL_AND_INVERTED
14 GL_OR_REVERSE
15 GL_OR_INVERTED

Parameter value format for command 0x0105

This command controls stencil testing.

Bit Description
0 Enable stencil test
7-4 Stencil test function (values same as for alpha and depth tests)
15-8 Replacement value, used as specified by command 0x0106
23-16 Reference value for the stencil test. Note that the test does "reference FUNC value".
31-24 Mask for the stencil test.
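
A minimal packing sketch for this layout; `stencil_test_param` is a hypothetical helper and `func` is the numeric test function value.

```python
def stencil_test_param(enable, func, replacement, reference, mask):
    """Pack the parameter word for command 0x0105."""
    for name, value in (("replacement", replacement), ("reference", reference), ("mask", mask)):
        if not 0 <= value <= 0xFF:
            raise ValueError(name + " must be an 8-bit value")
    if not 0 <= func <= 0xF:
        raise ValueError("func must fit in bits 7-4")
    return (int(bool(enable))               # bit 0: enable stencil test
            | (func << 4)                   # bits 7-4: test function
            | (replacement << 8)            # bits 15-8: replacement value
            | (reference << 16)             # bits 23-16: reference value
            | (mask << 24))                 # bits 31-24: test mask


print(hex(stencil_test_param(True, 1, 0x00, 0x01, 0xFF)))
```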

Parameter value format for command 0x0106

This command controls stencil buffer replacement.

Bit Description
2-0 Action when the stencil test fails
6-4 Action when the stencil test passes but the depth test fails
10-8 Action when both stencil test and depth test pass

Action values:

Value Final stencil value
0 destination
1 destination & ~source
2 same as 1
3 Weird operation.
4 Weird operation. TODO: find out what it is exactly.
5 destination ^ source
6 Another weird operation.
7 same as 4

'destination' is the value present in the stencil buffer, 'source' is the replacement value specified in command 0x0105.
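
The sketch below packs the three action fields and models only the actions whose result is documented above (0, 1, 2 and 5); the "weird" operations are left unimplemented on purpose. Both function names are hypothetical.

```python
def stencil_op_param(stencil_fail, depth_fail, both_pass):
    """Pack the parameter word for command 0x0106 from three action values (0-7)."""
    for action in (stencil_fail, depth_fail, both_pass):
        if not 0 <= action <= 7:
            raise ValueError("action must be in the range 0-7")
    return stencil_fail | (depth_fail << 4) | (both_pass << 8)


def apply_stencil_action(action, destination, source):
    """Model the final 8-bit stencil value for the documented actions only."""
    if action == 0:
        return destination
    if action in (1, 2):                    # 2 is listed as "same as 1"
        return destination & ~source & 0xFF
    if action == 5:
        return (destination ^ source) & 0xFF
    raise NotImplementedError("action %d is not documented precisely" % action)


print(hex(stencil_op_param(0, 0, 5)), hex(apply_stencil_action(5, 0xFF, 0x0F)))
```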

Parameter structure for command 0x004D

Index Word Description
0 float far
1 float near

This is glDepthRange().

Parameter structure for command 0x00E8

Index Word Description
0x7D-0x00 Usually value 0x00FFE000.
0x7E Usually value 0x00FFFEE6?
0x7F Usually value 0x00DCD919?

Parameter structure for command 0x0112

Index Word Description
0 Setting bits 3-0 to a nonzero value allows the GPU to read from the color buffer.
1 Setting bits 3-0 to a nonzero value allows the GPU to write to the color buffer.
2 Setting bits 1-0 to a nonzero value allows the GPU to read from the depth/stencil buffer.
3 Setting bits 1-0 to a nonzero value allows the GPU to write to the depth/stencil buffer.

Entries for command 0x02C1

Index Word Description
0 float, the GPU handles this as the 4th word.
1 float, the GPU handles this as the 3rd word.
2 float, the GPU handles this as the 2nd word.
3 float, the GPU handles this as the 1st word.
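
The reversed word order can be sketched as follows. This assumes the entry words are 32-bit IEEE floats, which the table does not state explicitly; `entry_words` is a made-up helper taking the four values in the order the GPU sees them.

```python
import struct


def entry_words(gpu_order_floats):
    """Return the four raw u32 words of one 0x02C1 entry, in command order."""
    if len(gpu_order_floats) != 4:
        raise ValueError("an entry has exactly four words")
    raw_order = reversed(gpu_order_floats)   # raw word 0 is the GPU's 4th word
    return [struct.unpack("<I", struct.pack("<f", value))[0] for value in raw_order]


print([hex(word) for word in entry_words([1.0, 2.0, 3.0, 4.0])])
```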

The below entry structure info is in the raw order used for the command, not the order used by the GPU.

Color Entry

Index Word Description
0 float Red component
1 float Blue component
2 float Green component
3 float Alpha

Lighting Color Entry

Index Word Description
0 float Alpha
1 float Blue component
2 float Green component
3 float Red component

Types for command 0x02C0

The 0x02C0/0x02C1 command pair is actually used as a generic way to set uniforms, regardless of what they represent. 0x02C0's parameter represents the ID of the destination GPU register (0x0 is c0, 0x1 is c1, etc.). As such, the meaning of the data being sent over is entirely dependent on the shader currently in use. The values below may be "default" values used by Nintendo's OpenGL implementation.

Value Entries per chunk Description
0x00 4 This specifies 16 floats for a 4x4 matrix, used for glLoadMatrix() for the projection matrix.
0x04 4 This specifies a 4x4 matrix, used for glLoadMatrix() for the model-view matrix. This is usually an identity matrix.
0x08 2 Sets the color.
0x0A 4 Specifies a 4x4 matrix, used for glLoadMatrix() for the texture matrix. (Index0)
0x0E 3 Specifies a 4x3 texture matrix. (Index1)
0x11 3 Specifies a 4x3 texture matrix. (Index2)
0x14 <=30 Used to specify a 4xN matrix, where N is the total command 0x02C1 entries. This is glMultMatrix() for the model-view matrix, except the input matrix is 4xN instead of 4x4.
0x4C 4 This specifies a 4x4 float matrix.
0x50, 0x53, and 0x56 1 This specifies the GL_LIGHT0-2 color for GL_AMBIENT?
0x51, 0x54, and 0x57 1 This specifies the GL_LIGHT0-2 color for GL_DIFFUSE?
0x52, 0x55, and 0x58 1 This specifies the GL_LIGHT0-2 color for GL_SPECULAR?
0x59 1 Unknown, the entry data is floats converted from s32s. Usually each entry word is zeros.
0x5A 2 Color related?
0x5C 1 ?

The matrices for types 0x00 and 0x04 use row-major order, instead of column-major order.