GPU/Commands
Revision as of 22:55, 14 March 2014
This page describes the structure of the buffer for GX command 1 with the registers at 0x1EF018E0. This buffer is used for GPU commands, including OpenGL commands; each 8-byte entry in the buffer is a command. Cmd+0 is the command parameter, and cmd+4 is the command header.
Invalid GPU command parameters, including NaN floats, can cause the GPU to hang, which in turn causes the GSP module to hang as well.
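Since NaN parameters are reported to hang the GPU, a tool that builds command buffers could check float parameters before submitting them. A minimal Python sketch, assuming the parameter word is an IEEE 754 float32 (float24 parameters are not covered); the function name is hypothetical:

```python
import math
import struct

def is_nan_word(word: int) -> bool:
    """Return True if a 32-bit word, reinterpreted as a float32, is NaN.

    Assumption: the parameter is a standard IEEE 754 single-precision float.
    """
    (value,) = struct.unpack("<f", struct.pack("<I", word & 0xFFFFFFFF))
    return math.isnan(value)
```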
Command Header
| Bit | Description | 
|---|---|
| 19-0 | Command ID | 
| 30-20 | Total words following the command, if any. | 
| 31 | ? | 
The first word in the parameter data structure is the command parameter value; the rest of the data structure comes from the data following the command. When needed for 8-byte alignment of the following command, the word after the last data structure word is padding.
Bit 31 may be linked to command grouping.
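As a rough illustration of the layout described above, the following Python sketch decodes a header and walks a buffer. The function names are hypothetical, little-endian word order is assumed, and bit 31 is only extracted, not interpreted.

```python
import struct

def decode_header(header: int) -> dict:
    """Split a 32-bit command header into the fields from the table above."""
    return {
        "command_id": header & 0xFFFFF,         # bits 19-0
        "extra_words": (header >> 20) & 0x7FF,  # bits 30-20
        "bit31": (header >> 31) & 0x1,          # bit 31, meaning unknown
    }

def entry_size_words(header: int) -> int:
    """Words occupied by one command, including any alignment padding."""
    words = 2 + decode_header(header)["extra_words"]  # parameter + header + data
    return words + (words & 1)  # round up to an even count (8-byte alignment)

def walk(buf: bytes):
    """Yield (command_id, parameter_words) for each command in the buffer."""
    words = struct.unpack(f"<{len(buf) // 4}I", buf)  # assumed little-endian
    pos = 0
    while pos + 1 < len(words):
        param, header = words[pos], words[pos + 1]  # cmd+0, cmd+4
        fields = decode_header(header)
        extra = words[pos + 2 : pos + 2 + fields["extra_words"]]
        yield fields["command_id"], [param, *extra]
        pos += entry_size_words(header)
```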
Parameter masking
It appears that bits 16-19 in the command header may not be part of the command ID but are in fact a parameter mask: bit 16 would indicate that the least significant byte (LSB) of the parameter will be written, bit 17 that the parameter's second least significant byte will be written, and so on. This would mean that, for instance, commands 0x00020107 and 0x00010107 refer to the same thing but write different parts of the parameter.
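If the masking hypothesis holds, a masked write could be modeled as below. This is only a sketch of the hypothesis: the function names are hypothetical, and it assumes each mask bit selects one byte of the 32-bit parameter.

```python
def split_masked_header(header: int):
    """Split the 20-bit command ID field into (16-bit ID, 4-bit mask)."""
    command_id = header & 0xFFFF      # bits 15-0
    byte_mask = (header >> 16) & 0xF  # bits 19-16, hypothesized byte mask
    return command_id, byte_mask

def apply_masked_write(old: int, param: int, byte_mask: int) -> int:
    """Overwrite only the bytes of `old` that are selected by `byte_mask`."""
    result = old
    for byte in range(4):
        if byte_mask & (1 << byte):
            field = 0xFF << (8 * byte)
            result = (result & ~field) | (param & field)
    return result & 0xFFFFFFFF
```

Under this model, 0x00020107 and 0x00010107 both target ID 0x0107 and differ only in which byte they update.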
Command grouping
It appears that in certain circumstances it is possible to group multiple command calls into a single one, given that those commands' IDs are contiguous. For example, a call with command header 0x802F011C (3 parameters total) seems to be equivalent to a call to commands 0xF011C with parameter 0, 0xF011D with parameter 1 and 0xF011E with parameter 2.
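The apparent equivalence can be sketched as a small expansion routine. The function name is hypothetical, and the behavior is only the observed one described above, not a confirmed hardware rule.

```python
def expand_group(header: int, words: list) -> list:
    """Expand a grouped command into (command_id, parameter) pairs.

    `words` is the command parameter followed by the extra data words.
    """
    base_id = header & 0xFFFFF               # bits 19-0
    count = ((header >> 20) & 0x7FF) + 1     # extra words plus the parameter
    if len(words) != count:
        raise ValueError("word count does not match the header")
    return [(base_id + i, word) for i, word in enumerate(words)]
```

For the example above, `expand_group(0x802F011C, [0, 1, 2])` yields the pairs for 0xF011C, 0xF011D and 0xF011E.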
Commands
| CommandID | Parameter | Description | 
|---|---|---|
| 0x000F0010 | Value is 0x12345678 | This command is always the last command in the buffer. | 
| 0x000F0110 | Value 0x1 | This command is used immediately before CmdID 0x000F0010. It is also used elsewhere for beginning rendering of mesh(es). | 
| 0x000F0111 | Value 0x1 | This command is used immediately before CmdID 0x000F0110; however, CmdID 0x000F0110 doesn't always follow this command. | 
| 0x000F0040 | u32, valid values are 0x1 and 0x2, values 0x0 and 0x3 have the same effect as value 0x2. Only bits 1-0 are used. | Value 2 = GL_FRONT/GL_CW or GL_BACK/GL_CCW. Value 1 = GL_FRONT/GL_CCW or GL_BACK/GL_CW. | 
| 0x000F0041 | float24 | VIEWPORT_WIDTH. See command set 0x000F0041. | 
| 0x000F0042 | float32 | VIEWPORT_WIDTH_INV. See command set 0x000F0041. | 
| 0x000F0043 | float24 | VIEWPORT_HEIGHT. See command set 0x000F0041. | 
| 0x000F0044 | float32 | VIEWPORT_HEIGHT_INV. See command set 0x000F0041. | 
| 0x801F004D | See command set 0x801F004D. | |
| 0x000F0068 | u32 | VIEWPORT Y/X. See command set 0x000F0041. | 
| 0x000F006D | See command set 0x801F004D. | |
| 0x000F006E | u32 | See command set 0x000F0111. | 
| 0x00010080 | u32 | See command set 0x00030080. | 
| 0x00030080 | u32 | See command set 0x00030080. | 
| 0x00040080 | u32 | See command set 0x00030080. | 
| 0x809F0081 | This is used to set the current texture info used for rendering, see command set 0x809F0081. | |
| 0x000F008E | u32 color type | This command sets the texture color type, see command set 0x809F0081. | 
| 0x805F0091 | This sets current texture info, see command 0x805F0091. | |
| 0x805F0099 | This sets current texture info, see command 0x805F0099. | |
| 0x800F00C3 | val<<24 | Val is usually 0xFF or 0x00, however 0x00-0xFF is valid as well. This is alpha-blending related? | 
| 0x800F00CB | val<<24 | Val is usually 0xFF or 0x00, however 0x00-0xFF is valid as well. This is alpha-blending related? | 
| 0x80XF00C0 | See command set 0x80XF00C0. | |
| 0x800F00C4 | See command set 0x80XF00C0. | |
| 0x80XF00C8 | See command set 0x80XF00C0. | |
| 0x800F00CC | See command set 0x80XF00C0. | |
| 0x80XF00D0 | See command set 0x80XF00C0. | |
| 0x800F00D4 | See command set 0x80XF00C0. | |
| 0x80XF00D8 | See command set 0x80XF00C0. | |
| 0x800F00DC | See command set 0x80XF00C0. | |
| 0x80XF00F0 | See command set 0x80XF00C0. | |
| 0x800F00F4 | See command set 0x80XF00C0. | |
| 0x80XF00F8 | See command set 0x80XF00C0. | |
| 0x800F00FC | See command set 0x80XF00C0. | |
| 0x000100E0 | Normally value zero. | Unknown, fragment related? | 
| 0x000500E0 | See command set 0x000500E0. | |
| 0x000F00E1 | See command set 0x000500E0. | |
| 0x000F00E6 | Value zero | See command set 0x000F00E6. | 
| 0x000F00E8 | See command set 0x000F00E6. | |
| 0x00020100 | u32, value is 0x00E40100 | See command set 0x00020100. | 
| 0x000D0100 | 0x00E40000 \| val. | Val0 = unknown, val1 = unknown, val3 = unknown. The default val used here is 0. | 
| 0x000F0101 | u32 | See command set 0x00020100. | 
| 0x000F0103 | See command set 0x00020100. | |
| 0x000F0104 | u32 | glAlphaFunc() | 
| 0x00010107 | See command set CmdID 0x00010107. | |
| 0x00020107 | See command set CmdID 0x00010107. | |
| 0x00030107 | See command set CmdID 0x00030107. | |
| 0x000F0116 | u32 | DEPTHBUFFER FORMAT. See command set 0x000F0111. | 
| 0x000F0117 | u32 | COLORBUFFER FORMAT/PIXEL. See command set 0x000F0111. | 
| 0x000F011C | Physical address>>3 | DEPTHBUFFER ADDRESS. See command set 0x000F0111. | 
| 0x000F011D | Physical address>>3 | COLORBUFFER ADDRESS. See command set 0x000F0111. | 
| 0x000F011E | u32 | COLORBUFFER HEIGHT/WIDTH. See command set 0x000F0111. | 
| 0x803F0112 | ? | |
| 0x826F0200 | See command set 0x826F0200. | |
| 0x00080126 | See command set CmdID 0x00030107. | |
| 0x000F0227 | u32 | This specifies the address of an array containing vertex array indices, and the data-type of the indices, used for rendering primitives. See command set glDrawElements(). | 
| 0x000F0228 | u32 total elements in the array to use for rendering. | See command set glDrawElements(). | 
| 0x803F0232 | See command set 0x826F0200. | |
| 0x0002025E | u32, val<<8. | This sets the GL rendering mode, see command set 0x826F0200. | 
| 0x000F02B0 | u32, value is 0x7FFF0000 \| val. | Texture related? | 
| 0x801F02BB | See command set 0x826F0200. | |
| 0x000F02BA | 0x7FFF0000 \| entrypoint offset | Sets the entrypoint offset for the shader program. | 
| 0x000F02C0 | 0x80000000 \| Type | This is used immediately before CmdID 0xXXXF02C1. This type field controls the command parameter buffer type. This command can also be used to send over (float24 only?) data directly, without using 0xXXXF02C1. In that case, the first parameter is still Type but with bit 31 not set; the actual data follows. | 
| 0xXXXF02C1 | First word in the first entry | A list of entries follow this command. | 
| 0x000F02CB | Value 0x0 ? | This is used immediately before CmdID 0xXXXF02CC. It is used to indicate that shader program data will follow. | 
| 0xXXXF02CC | First word of shader program data chunk. | This command is used to transfer shader program data (as the parameter data). It can be called multiple times in a row if the shader program is too big to fit into a single call. | 
| 0x000F02BF | Value 0x1 ? | This is used immediately after a set of CmdID 0xXXXF02CC. It is used to indicate that shader program data transfer is complete. | 
| 0x000F02D5 | Value 0x0 ? | This is used immediately before CmdID 0xXXXF02D6. | 
| 0xXXXF02D6 | First entry. | This is used to send over the shader program operand descriptor table. | 
| 0x000F0050 | First entry | This command is used to setup shader output registers. The n-th word-long entry is a map of the (n*2)-th output register's components. Each byte of each entry corresponds to where a component is mapped. Value 0x1F indicates that the corresponding component is unused. | 
Command Sets
glDrawElements()
See GPU GL Arrays.
glClear() / glClearColor()
The GPU does not have dedicated commands for clearing the color buffer, therefore applications implement color buffer clearing by rendering a quad. Applications normally store this vertex and color array in the GSP application heap.
CmdID 0x809F0081
This sets current texture info, see GPU textures.
CmdID 0x00030080
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x00030080 | 0x11000 \| val, where only bits 2-1 are used in val. | ? | 
| 0 | 0x00040080 | Same value as CmdID 0x00030080. | ? | 
| 0 | 0x00010080 | Same value as CmdID 0x00030080. | ? | 
CmdID 0x80XF00C0
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x802F0000 \| SlotCmdID | | |
| 1 | 0x800F0000 \| SlotCmdID + 4 | | |
This is used for glTexEnv(), for the slot indicated by the CmdID. There's a total of 6 slots, where each slot corresponds to the following u16 CmdIDs: 0xC0, 0xC8, 0xD0, 0xD8, 0xF0, 0xF8.
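The two headers for a given slot follow directly from the table and slot list above. A short sketch, with a hypothetical function name and a zero-based slot index chosen for illustration:

```python
# u16 CmdIDs of the six glTexEnv() slots, as listed above.
SLOT_CMD_IDS = [0xC0, 0xC8, 0xD0, 0xD8, 0xF0, 0xF8]

def texenv_headers(slot: int):
    """Return the two command headers of the command set for one slot."""
    slot_id = SLOT_CMD_IDS[slot]
    first = 0x802F0000 | slot_id         # command index 0
    second = 0x800F0000 | (slot_id + 4)  # command index 1
    return first, second
```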
CmdID 0x000500E0
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x000500E0 | 5 \| val<<16, where val is 0 or 1. | Val0 = enable, val1 = disable. | 
| 1 | 0x000F00E1 | This specifies a color. | 
This is usually used immediately after command set glDrawElements(). This is used to specify a color used for blending?
CmdID 0x000F00E6
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x000F00E6 | Value 0 | ? | 
| 1 | 0x000F00E8 | 
This is usually the last command set used for rendering a mesh, when command set 0x000500E0 was used. This command set is used immediately after command set 0x000500E0.
CmdID 0x00020100
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x00020100 | Value 0x00E40100 | ? | 
| 1 | 0x000F0101 | 0x01010000 when disabled? | ? | 
| 2 | 0x000F0103 | This is set to zero when the Cmd 0x000F0101 parameter is value 0x01010000. | ? | 
This is fragment related?
CmdID 0x801F004D
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x801F004D | glDepthRange() | |
| 1 | 0x000F006D | 0 = unknown, 1 = unknown. | Value zero causes the mesh to not be rendered. | 
CmdID 0x000F0041
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x000F0041 | float | This corresponds to the framebuffer width. | 
| 1 | 0x000F0043 | float | This parameter value is calculated the same way as the CmdID 0x000F0041 parameter, except the framebuffer height is used instead. | 
| 2 | 0x000F0042 | float | This corresponds to the framebuffer width. | 
| 3 | 0x000F0044 | float | This parameter value is calculated the same way as the CmdID 0x000F0042 parameter, except the framebuffer height is used instead. | 
| 4 | 0x000F0068 | u32 | This sets the X/Y coordinates used for glViewport(). | 
This command set initializes the projection matrix. This command set is used twice when beginning rendering for each screen. The framebuffer width used here for the main screen is 240; however, with stereoscopy enabled it is 480 the second time this command set is used.
CmdID 0x000F0111
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x000F0111 | Value 1 | |
| 1 | 0x000F0110 | Value 1 | |
| 2 | 0x000F0117 | Bits15-0 = unk, 31-16 = unk. | Unknown, normally the input parameter is value 0x2. | 
| 3 | 0x000F011D | Physical address>>3 | This initializes the framebuffer address used for rendering. This framebuffer is used as the input framebuffer with GX commands 3 and 4. This command is used immediately after CmdID 0x000F0117. | 
| 4 | 0x000F0116 | ? | |
| 5 | 0x000F011C | Physical address>>3 | Unknown, normally this address is located in VRAM. | 
| 6 | 0x000F011E | (((h-1)&0xFFF)<<12)\|(w&0xFFF) | This sets the width and height for the framebuffer used for rendering. Therefore this is glViewport(); x/y are specified by CmdID 0x000F0068. | 
| 7 | 0x000F006E | Same input parameter value as CmdID 0x000F011E. | 
This command set is normally used after the two 0x000F0041 command sets.
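The address parameters of CmdIDs 0x000F011C and 0x000F011D are given as "Physical address>>3". A minimal sketch of that encoding follows; the function name is hypothetical, and rejecting unaligned addresses is an assumption inferred from the shift, not something this page confirms.

```python
def encode_buffer_address(physical_address: int) -> int:
    """Encode a physical address as the parameter value (address >> 3).

    Assumption: the address must be 8-byte aligned, since the low three
    bits cannot be represented after the shift.
    """
    if physical_address & 0x7:
        raise ValueError("address is not 8-byte aligned")
    return physical_address >> 3
```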
CmdID 0x00030107
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x00030107 | ||
| 1 | 0x00080126 | type<<24 | 
This command set is used for disabling the alpha-blending info set by command set 0x00010107? The GL AlphaFunction used here is normally GL_ALWAYS.
CmdID 0x00010107
| Command Index | CommandID | Parameter | Description | 
|---|---|---|---|
| 0 | 0x00010107 | Same format as CmdID 0x00030107. | |
| 1 | 0x00080126 | type<<24 | |
| 0 | 0x00020107 | Same value as CmdID 0x00010107. | 
Parameter format for CmdIDs 0x00030107, 0x00020107, and 0x00010107
| Bit | Description | 
|---|---|
| 0 | 0 = disable GL_DEPTH_TEST, 1 = enable GL_DEPTH_TEST | 
| 3-1 | Unused? | 
| 7-4 | Alpha function | 
| 11-8 | Color to blend with? | 
| 12 | ? | 
| 31-13 | Unused | 
Alpha function values
| Value | GL AlphaFunction | 
|---|---|
| 0 | GL_NEVER | 
| 1 | GL_ALWAYS | 
| 2 | GL_EQUAL | 
| 3 | GL_NOTEQUAL | 
| 4 | GL_LESS | 
| 5 | GL_LEQUAL | 
| 6 | GL_GREATER | 
| 7 | GL_GEQUAL | 
Alpha types for CmdID 0x00080126
| Type | GL AlphaFunction | 
|---|---|
| 0 | GL_NEVER | 
| 1 | GL_ALWAYS | 
| 2 | GL_GREATER/GL_GEQUAL | 
| 3 | The remaining GL alpha functions. | 
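The type table above can be read as a simple mapping from the GL alpha function to the `type<<24` parameter of CmdID 0x00080126. A sketch of that reading, with a hypothetical function name:

```python
# The eight GL alpha functions listed in the "Alpha function values" table.
GL_ALPHA_FUNCTIONS = {
    "GL_NEVER", "GL_ALWAYS", "GL_EQUAL", "GL_NOTEQUAL",
    "GL_LESS", "GL_LEQUAL", "GL_GREATER", "GL_GEQUAL",
}

def alpha_type_param(func: str) -> int:
    """Return the type<<24 parameter for CmdID 0x00080126."""
    if func not in GL_ALPHA_FUNCTIONS:
        raise ValueError(f"unknown alpha function: {func}")
    if func == "GL_NEVER":
        alpha_type = 0
    elif func == "GL_ALWAYS":
        alpha_type = 1
    elif func in ("GL_GREATER", "GL_GEQUAL"):
        alpha_type = 2
    else:
        alpha_type = 3  # the remaining GL alpha functions
    return alpha_type << 24
```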
Parameter value format for CmdID 0x000F0104
| Bit | Description | 
|---|---|
| 0 | 0 = disable GL_ALPHA_TEST, 1 = enable GL_ALPHA_TEST | 
| 3-1 | Unused? | 
| 7-4 | Alpha function | 
| 15-8 | u8 ref, range is 0-255 | 
| 31-16 | Unused? | 
This is glAlphaFunc().
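A sketch of packing this parameter is shown below. It assumes the 4-bit alpha function field uses the same encoding as the "Alpha function values" table earlier on this page; the function name is hypothetical.

```python
# Assumed encoding, copied from the "Alpha function values" table.
ALPHA_FUNCS = {
    "GL_NEVER": 0, "GL_ALWAYS": 1, "GL_EQUAL": 2, "GL_NOTEQUAL": 3,
    "GL_LESS": 4, "GL_LEQUAL": 5, "GL_GREATER": 6, "GL_GEQUAL": 7,
}

def pack_alpha_test(enable: bool, func: str, ref: int) -> int:
    """Pack the CmdID 0x000F0104 parameter from its three fields."""
    if not 0 <= ref <= 255:
        raise ValueError("ref must be in the range 0-255")
    value = int(enable)             # bit 0: GL_ALPHA_TEST enable
    value |= ALPHA_FUNCS[func] << 4  # bits 7-4: alpha function
    value |= ref << 8                # bits 15-8: u8 reference value
    return value
```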
Parameter value format for CmdID 0x000F011E
| Bit | Description | 
|---|---|
| 11-0 | Framebuffer/viewport width | 
| 23-12 | Framebuffer/viewport height - 1 | 
| 24 | Must be set | 
| 31-25 | Unused? | 
This specifies the width/height for glViewport(). Normally the framebuffer width and height is set to the same dimensions used with GX command 3 and 4.
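The bit layout above can be packed and unpacked as follows. The function names are hypothetical; note that the formula given in command set 0x000F0111 omits bit 24, which this sketch sets because the table marks it as required.

```python
def pack_framebuffer_dims(width: int, height: int) -> int:
    """Pack width and height into the CmdID 0x000F011E parameter."""
    value = width & 0xFFF                  # bits 11-0: width
    value |= ((height - 1) & 0xFFF) << 12  # bits 23-12: height - 1
    value |= 1 << 24                       # bit 24: must be set
    return value

def unpack_framebuffer_dims(value: int):
    """Recover (width, height) from a CmdID 0x000F011E parameter."""
    width = value & 0xFFF
    height = ((value >> 12) & 0xFFF) + 1
    return width, height
```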
Parameter value format for CmdID 0x000F0068
| Bit | Description | 
|---|---|
| 15-0 | X | 
| 31-16 | Y | 
This specifies the X/Y coordinates for glViewport().
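For completeness, packing this parameter is a two-field shift. A sketch with a hypothetical function name; whether the fields are signed is not stated here, so the values are treated as unsigned 16-bit.

```python
def pack_viewport_xy(x: int, y: int) -> int:
    """Pack the glViewport() X/Y coordinates for CmdID 0x000F0068."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)  # Y in bits 31-16, X in bits 15-0
```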
Parameter structure for CmdID 0x804F00C0
| Index Word | Description | 
|---|---|
| 0 | Value 0xFFF0FFF / 0x0 | 
| 1 | Value 0x0 | 
| 2 | Value 0x0 | 
| 3 | Value 0xFFFFFFFF | 
| 4 | Value 0x0 | 
This individual command is used instead of the 0x80XF00C0 command set when none of the associated rendering parameters for this slot are set.
Parameter structure for CmdID 0x802F00C0
| Index Word | Description | 
|---|---|
| 0 | Param0 | 
| 1 | Param1 | 
| 2 | Param2 | 
See command set 0x80XF00C0.
Param0 format for CmdID 0x802F00C0
| Bit | Description | 
|---|---|
| 3-0 | See below values.(Field0 index0) | 
| 7-4 | See below values.(Field0 index1) | 
| 11-8 | See below values.(Field0 index2) | 
| 15-12 | Unused | 
| 19-16 | See below values.(Field1 index0) | 
| 23-20 | See below values.(Field1 index1) | 
| 27-24 | See below values.(Field1 index2) | 
| 31-28 | Unused | 
Param0 values for CmdID 0x802F00C0
| Value | GL type | 
|---|---|
| 0x0 | GL_PRIMARY_COLOR | 
| 0x1 | ? | 
| 0x2 | ? | 
| 0x3 | GL_TEXTURE0 | 
| 0x4 | GL_TEXTURE1 | 
| 0x5 | GL_TEXTURE2 | 
| 0x6 | GL_TEXTURE3 | 
| 0xC-0x7 | GL_PRIMARY_COLOR | 
| 0xD | ? | 
| 0xE | GL_CONSTANT | 
| 0xF | GL_PREVIOUS | 
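Combining the Param0 bit layout with the value table gives a packing routine like the sketch below. Only the values with a known GL name are included, and the function name is hypothetical.

```python
# Known Param0 values from the table above (unknown values omitted).
SOURCES = {
    "GL_PRIMARY_COLOR": 0x0,
    "GL_TEXTURE0": 0x3,
    "GL_TEXTURE1": 0x4,
    "GL_TEXTURE2": 0x5,
    "GL_TEXTURE3": 0x6,
    "GL_CONSTANT": 0xE,
    "GL_PREVIOUS": 0xF,
}

def pack_param0(field0, field1) -> int:
    """Pack three Field0 and three Field1 sources into Param0."""
    if len(field0) != 3 or len(field1) != 3:
        raise ValueError("each field takes exactly three sources")
    value = 0
    for index, name in enumerate(field0):
        value |= SOURCES[name] << (4 * index)       # bits 11-0
    for index, name in enumerate(field1):
        value |= SOURCES[name] << (16 + 4 * index)  # bits 27-16
    return value
```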
Param1 format for CmdID 0x802F00C0
| Bit | Description | 
|---|---|
| 3-0 | See below values for field0.(Index0) | 
| 7-4 | See below values for field0.(Index1) | 
| 11-8 | See below values for field0.(Index2) | 
| 15-12 | See below values for field1.(Index0) | 
| 19-16 | See below values for field1.(Index1) | 
| 23-20 | See below values for field1.(Index2) | 
| 31-24 | Unused | 
This specifies the pname for glTexEnv().
Param1 field0 values for CmdID 0x802F00C0
| Value | GL type | 
|---|---|
| 0x0 | GL_SRC_COLOR | 
| 0x1 | GL_ONE_MINUS_SRC_COLOR | 
| 0x2 | GL_SRC_ALPHA | 
| 0x3 | GL_ONE_MINUS_SRC_ALPHA | 
| 0x4 | GL_SRC0_RGB | 
| 0x5 | ? | 
| 0x6 | GL_SRC_COLOR | 
| 0x7 | GL_SRC_COLOR | 
| 0x8 | GL_SRC1_RGB | 
| 0x9 | ? | 
| 0xA | GL_SRC_COLOR | 
| 0xB | GL_SRC_COLOR | 
| 0xC | GL_SRC2_RGB | 
| 0xD | ? | 
Param1 field1 values for CmdID 0x802F00C0
| Value | GL type | 
|---|---|
| 0x0 | GL_SRC_ALPHA | 
| 0x1 | GL_ONE_MINUS_SRC_ALPHA | 
| 0x2 | GL_SRC0_RGB | 
| 0x3 | ? | 
| 0x4 | GL_SRC1_RGB | 
| 0x5 | ? | 
| 0x6 | GL_SRC2_RGB | 
| 0x7 | ? | 
Param2 format for CmdID 0x802F00C0
| Bit | Description | 
|---|---|
| 15-0 | See below field0 values. | 
| 31-16 | See below field1 values. | 
This is used to specify the param for glTexEnv(..., ..., param).
Param2 field0 values for CmdID 0x802F00C0
| Value | GL type | 
|---|---|
| 0x0 | GL_REPLACE | 
| 0x1 | GL_MODULATE | 
| 0x2 | GL_ADD | 
| 0x3 | GL_ADD_SIGNED | 
| 0x4 | GL_INTERPOLATE | 
| 0x5 | GL_SUBTRACT | 
| 0x6 | GL_DOT3_RGB | 
| 0x7 | GL_DOT3_RGBA | 
| 0x8 | ? | 
| 0x9 | ? | 
Param2 field1 values for CmdID 0x802F00C0
| Value | GL type | 
|---|---|
| 0x0 | GL_REPLACE | 
| 0x1 | GL_MODULATE | 
| 0x2 | GL_ADD | 
| 0x3 | GL_ADD_SIGNED | 
| 0x4 | GL_INTERPOLATE | 
| 0x5 | GL_SUBTRACT | 
| 0x6 | GL_REPLACE | 
| 0x7 | GL_DOT3_RGB | 
| 0x8 | ? | 
| 0x9 | ? | 
Parameter value format for CmdID 0x800F00C4
| Bit | Description | 
|---|---|
| 15-0 | Valid values: 0=unknown, 1=unknown, 2=unknown. | 
| 31-16 | Same format as bits15-0. | 
See command set 0x80XF00C0.
Parameter value format for CmdID 0x000F00E1
| Bit | Description | 
|---|---|
| 7-0 | Red component | 
| 15-8 | Green component | 
| 23-16 | Blue component | 
| 31-24 | Unused | 
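A color for this parameter could be packed as in the following sketch (hypothetical function name; bits 31-24 are left at zero since they are listed as unused).

```python
def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pack 8-bit color components into the CmdID 0x000F00E1 parameter."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("components must be in the range 0-255")
    return (blue << 16) | (green << 8) | red  # red occupies bits 7-0
```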
Parameter value format for CmdID 0x000F0101
| Bit | Description | 
|---|---|
| 7-0 | ? | 
| 15-8 | ? | 
| 19-16 | ? | 
| 23-20 | ? | 
| 27-24 | ? | 
| 31-28 | ? | 
Parameter structure for CmdID 0x801F004D
| Index Word | Description | 
|---|---|
| 0 | float far | 
| 1 | float near | 
This is glDepthRange().
Parameter structure for CmdID 0x000F00E8
| Index Word | Description | 
|---|---|
| 0x7D-0x00 | Usually value 0x00FFE000. | 
| 0x7E | Usually value 0x00FFFEE6? | 
| 0x7F | Usually value 0x00DCD919? | 
Parameter structure for CmdID 0x803F0112
| Index Word | Description | 
|---|---|
| 0 | 0x0 = unknown, 0xF = unknown. Only bits 3-0 are used.(Values 0x1-0xF all have the same effect) | 
| 1 | 0x0 = unknown, 0xF = unknown. Only bits 3-0 are used. | 
| 2 | 0x0 = unknown, 0x2 = unknown. Only bits 1-0 are used.(Values 0x1-0x3 all have the same effect) | 
| 3 | 0x0 = unknown, 0x2 = unknown. Only bits 1-0 are used.(Values 0x1-0x3 all have the same effect) | 
Entries for CmdID 0xXXXF02C1
| Index Word | Description | 
|---|---|
| 0 | float, the GPU handles this as the 4th word. | 
| 1 | float, the GPU handles this as the 3rd word. | 
| 2 | float, the GPU handles this as the 2nd word. | 
| 3 | float, the GPU handles this as the 1st word. | 
The below entry structure info is in the raw order used for the command, not the order used by the GPU.
Color Entry
| Index Word | Description | 
|---|---|
| 0 | float Red component | 
| 1 | float Blue component | 
| 2 | float Green component | 
| 3 | float Alpha | 
Lighting Color Entry
| Index Word | Description | 
|---|---|
| 0 | float Alpha | 
| 1 | float Blue component | 
| 2 | float Green component | 
| 3 | float Red component | 
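The relationship between the order the GPU uses and the raw order in the command is a plain reversal of the four words of an entry. A sketch with a hypothetical function name:

```python
def to_raw_order(gpu_words):
    """Reverse a 4-word entry from GPU order (1st..4th) into raw command order."""
    if len(gpu_words) != 4:
        raise ValueError("an entry has exactly four words")
    return list(reversed(gpu_words))
```

Applied to a GPU-order (red, green, blue, alpha) vector, this gives the raw order alpha, blue, green, red, which matches the Lighting Color Entry table above.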
Types for CmdID 0x000F02C0
The 0x000F02C0/0x000F02C1 command pair is actually used as a generic way to set uniforms, regardless of what they represent. 0x000F02C0's parameter represents the ID of the destination GPU register (0x0 is c0, 0x1 is c1, etc.). As such, the meaning of the data being sent over is entirely dependent on the shader currently in use. The values below may be "default" values used by Nintendo's OpenGL implementation.
| Value | Entries per chunk | Description | 
|---|---|---|
| 0x00 | 4 | This specifies 16 floats for a 4x4 matrix, used for glLoadMatrix() for the projection matrix. | 
| 0x04 | 4 | This specifies a 4x4 matrix, used for glLoadMatrix() for the model-view matrix. This is usually an identity matrix. | 
| 0x08 | 2 | Sets the color. | 
| 0x0A | 4 | Specifies a 4x4 matrix, used for glLoadMatrix() for the texture matrix.(Index0) | 
| 0x0E | 3 | Specifies a 4x3 texture matrix.(Index1) | 
| 0x11 | 3 | Specifies a 4x3 texture matrix.(Index2) | 
| 0x14 | <=30 | Used to specify a 4xN matrix, where N is the total CmdID 0xXXXF02C1 entries. This is glMultMatrix() for the model-view matrix, except the input matrix is 4xN instead of 4x4. | 
| 0x4C | 4 | This specifies a 4x4 float matrix. | 
| 0x50, 0x53, and 0x56 | 1 | This specifies the GL_LIGHT0-2 color for GL_AMBIENT? | 
| 0x51, 0x54, and 0x57 | 1 | This specifies the GL_LIGHT0-2 color for GL_DIFFUSE? | 
| 0x52, 0x55, and 0x58 | 1 | This specifies the GL_LIGHT0-2 color for GL_SPECULAR? | 
| 0x59 | 1 | Unknown, the entry data is floats converted from s32s. Usually each entry word is zeros. | 
| 0x5A | 2 | Color related? | 
| 0x5C | 1 | ? | 
The matrices for types 0x00 and 0x04 use row-major order, instead of column-major order.
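Standard OpenGL stores matrices in column-major order, so a matrix taken from desktop-style GL code would need transposing before being sent with these types. A minimal sketch of that conversion for a flat list of 16 floats (hypothetical function name; it does not handle the per-entry word order):

```python
def column_major_to_row_major(matrix):
    """Transpose a flat 4x4 matrix from column-major to row-major order."""
    if len(matrix) != 16:
        raise ValueError("expected 16 elements")
    # Element (row, col) is at col*4 + row in column-major order
    # and at row*4 + col in row-major order.
    return [matrix[col * 4 + row] for row in range(4) for col in range(4)]
```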