Changes

704 bytes added ,  11:38, 2 October 2023
Clarify LITP instruction
Line 349: Line 349:  
|  1u
 
|  1u
 
|  LITP
 
|  LITP
Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100
+
Partial lighting computation, may be used in conjunction with EX2, LG2, etc to compute the vertex lighting coefficients. See the [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx Microsoft] and [https://registry.khronos.org/OpenGL/extensions/ARB/ARB_vertex_program.txt ARB] docs for more information on how to implement the full lit function; DST = {max(src.x, 0), max(min(src.y, 127.9961), -127.9961), 0, max(src.w, 0)} and it sets the cmp.x and cmp.y flags based on if the respective src.x and src.w components are >= 0.
 
|-
 
|-
 
|  0x08
 
|  0x08
Line 663: Line 663:  
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then instead SRC1+a0.y = c11 will be used for the instruction. It is only possible to use address registers on constant registers, attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).
 
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then instead SRC1+a0.y = c11 will be used for the instruction. It is only possible to use address registers on constant registers, attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).
   −
a0.x and a0.y are set manually through the MOVA instruction by rounding a float value to integer precision. Hence, they may take negative values.
+
a0.x and a0.y are set manually through the MOVA instruction by rounding a float value to integer precision. Hence, they may take negative values. The way out-of-bounds values behave when reading uniforms is as follows:
 +
* If the offset is out of byte bounds (less than -128 or greater than 127), the offset is not applied (treated as 0).
 +
* The offset is added to the constant register index and masked by 0x7F.
 +
* If the resulting index is greater than 95, the result is (1, 1, 1, 1).
 +
* Otherwise, the result is the value at the indexed constant register.
    
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.
 
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.
Line 775: Line 779:  
|  float
 
|  float
 
|  Read only
 
|  Read only
|  Application
+
|  Application/Vertex-stream
 
|  Floating-point Constant registers.
 
|  Floating-point Constant registers.
 
|-
 
|-
Line 809: Line 813:  
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.
 
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.
   −
Output attribute registers hold the data to be passed to the later GPU stages and are write-only. Each of the output attribute register components is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.
+
Output registers hold the data to be passed to the later GPU stages and are write-only. Each of the output register is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.
It appears that writing twice to a component of an output register that was written to before can cause problems (e.g. GPU hangs).
+
Keep in mind that writing to the same output register/component more than once appears appears to cause problems (e.g. GPU hangs).
    
Temporary registers can be used for intermediate calculations and can be both read and written.
 
Temporary registers can be used for intermediate calculations and can be both read and written.
Line 827: Line 831:  
!  Description
 
!  Description
 
|-
 
|-
|  0x0-0x6
+
|  0x0-0xF
|  o0-o6
+
|  o0-o15
 
|  Output registers.
 
|  Output registers.
 
|-
 
|-
Line 840: Line 844:  
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
|-
 
|-
SRC1 raw value
+
SRC raw value
 
!  Register name
 
!  Register name
 
!  Description
 
!  Description
 
|-
 
|-
|  0x0-0x7
+
|  0x0-0xF
|  v0-v7
+
|  v0-v15
 
|  Input attribute registers.
 
|  Input attribute registers.
 
|-
 
|-
35

edits