GSP Shared Memory: Difference between revisions

Kynex7510 (talk | contribs)
m Document PPF behaviour
Kynex7510 (talk | contribs)
m Fix misinformation
(9 intermediate revisions by the same user not shown)
Line 34: Line 34:
GSP fills the interrupt list, then triggers the event set with [[GSPGPU:RegisterInterruptRelayQueue|RegisterInterruptRelayQueue]] for the specified process(es).
GSP fills the interrupt list, then triggers the event set with [[GSPGPU:RegisterInterruptRelayQueue|RegisterInterruptRelayQueue]] for the specified process(es).


PDC interrupts are sent to all processes; other interrupts are only sent to the process with GPU rights.
PDC interrupts are sent to all processes; other interrupts are only sent to the process with rendering rights.
 
When issuing a [[GSP_Shared_Memory#Trigger_Memory_Fill|Memory Fill]] command with both buffers set, GSP will only dispatch PSC0.


= Framebuffer Info =
= Framebuffer Info =
Line 100: Line 102:
The command queue is located at sharedMemBase + 0x800 + (clientID * 0x200). It consists of a header followed by at most 15 command entries. Each command entry is of size 0x20 and has a header followed by command-specific parameters.
The command queue is located at sharedMemBase + 0x800 + (clientID * 0x200). It consists of a header followed by at most 15 command entries. Each command entry is of size 0x20 and has a header followed by command-specific parameters.


After adding a command, [[GSPGPU:TriggerCmdReqQueue|TriggerCmdReqQueue]] must be used to trigger GSP processing when the total commands field is value 1.
After adding a command, [[GSPGPU:TriggerCmdReqQueue|TriggerCmdReqQueue]] must be used to start command processing (official code does so when the total commands field is 1).
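
The layout described above can be sketched as offset arithmetic. This is only an illustration: the helper names are hypothetical, and the 0x20-byte queue header size is an assumption inferred from the 0x200 stride (0x20 + 15 * 0x20 = 0x200), not something stated on this page.

```c
#include <stddef.h>
#include <stdint.h>

/* Values taken from the text above. */
#define GSP_CMD_QUEUE_BASE   0x800u /* offset of client 0's queue in shared memory */
#define GSP_CMD_QUEUE_STRIDE 0x200u /* bytes reserved per client */
#define GSP_CMD_ENTRY_SIZE   0x20u  /* size of one command entry */
#define GSP_CMD_MAX_ENTRIES  15u    /* entries per queue */

/* ASSUMPTION: the queue header fills the first 0x20 bytes, so that the
 * header plus 15 entries exactly cover the 0x200-byte stride. */
#define GSP_CMD_QUEUE_HEADER_SIZE 0x20u

/* Offset of a client's command queue, relative to sharedMemBase. */
static inline size_t gsp_cmd_queue_offset(unsigned client_id)
{
    return GSP_CMD_QUEUE_BASE + (size_t)client_id * GSP_CMD_QUEUE_STRIDE;
}

/* Offset of one command entry, relative to sharedMemBase.
 * The slot index wraps at 15 entries. */
static inline size_t gsp_cmd_entry_offset(unsigned client_id, unsigned slot)
{
    return gsp_cmd_queue_offset(client_id)
         + GSP_CMD_QUEUE_HEADER_SIZE
         + (size_t)(slot % GSP_CMD_MAX_ENTRIES) * GSP_CMD_ENTRY_SIZE;
}
```

Under these assumptions, client 1's queue starts at offset 0xA00 and its first command entry at 0xA20.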


== Command Queue Header ==
== Command Queue Header ==
Line 116: Line 118:
|-
|-
| 2
| 2
| Flags (bit0 = completed?, bit7 = fatal error)
| Status (bit0 = halted, bit7 = fatal error)
|-
|-
| 3
| 3
| ? (bit0 = set flags.bit0)
| When bit0 is set, further processing of commands is halted until the client resets the flag and calls [[GSPGPU:TriggerCmdReqQueue|TriggerCmdReqQueue]]
|-
|-
| 4
| 7-4
| Result code for the last GX command which failed
| Result code for the last command which failed
|}
|}
GSP checks status.bit0 to decide whether to stop handling further commands. However, the check is done by equality, which means it always fails if status.bit7 is also set (and thus further commands are still processed). This bug prevents the halting logic from working properly, but it can be worked around by keeping bit0 of word3 set, as that forces halting on each iteration.
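
The difference between an equality check and a bit test can be modelled as follows. This is not GSP's actual code, only a sketch of the behaviour described above; the function and macro names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_HALTED 0x01u /* status.bit0 */
#define STATUS_FATAL  0x80u /* status.bit7 */

/* Models the described behaviour: the status value is compared for
 * equality, so any other set bit makes the check fail. */
static bool halted_by_equality(uint8_t status)
{
    return status == STATUS_HALTED;
}

/* What a bit test would look like instead (for comparison only). */
static bool halted_by_mask(uint8_t status)
{
    return (status & STATUS_HALTED) != 0;
}
```

With status = 0x81 (halted and fatal error), the equality model reports "not halted" while the mask test reports "halted", which matches the bug described above.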


== Command Header ==
== Command Header ==
Line 136: Line 140:
|-
|-
| 1
| 1
| ?
| Unused
|-
|-
| 2
| 2
| ? (bit0 = set queue.flags.bit0 after processing)
| When bit0 is set, GSP stops processing further commands (can be used for packing together sets of commands)
|-
|-
| 3
| 3
Line 147: Line 151:
== Commands ==
== Commands ==


Addresses specified in parameters are virtual addresses. For applications these are normally located in GSP memory, while for other processes they are located in VRAM.
Addresses specified in parameters are virtual addresses. Depending on the command, there might be constraints on the accepted parameters. In general, some commands require parameters to be aligned, and addresses are expected to be on [[Memory_Management#Memory_Mapping|linear]], [[Memory_layout#0x1F000000_.28New_3DS_only.29|QTM]] or VRAM memory.
 
Address and size parameters except for command 0 and command 5 must be 8-byte aligned.


=== Trigger DMA Request ===
=== Trigger DMA Request ===
Line 159: Line 161:
|-
|-
| 0
| 0
| u8 CommandID is 0x00
| Command header (ID = 0x00)
|-
|-
| 1
| 1
Line 177: Line 179:
|}
|}


This command is normally used to DMA data from the application GSP [[Memory_layout|heap]] to VRAM. When flushing is enabled and the source buffer is not located within VRAM, svcFlushProcessDataCache is used to flush the source buffer.
This command issues a [[Corelink_DMA_Engines|DMA request]] as the process with [[GSPGPU:AcquireRight|rendering rights]]. When the destination address is within VRAM, GSP places itself as the destination process: this makes it possible to transfer data to VRAM without needing it listed in the destination process's [[NCCH/Extended_Header#ARM11_Kernel_Capabilities|exheader mappings]]. Otherwise, both source and destination of the DMA request are the process with rendering rights.
 
The source buffer must be mapped as readable in the source process, while the destination address must be mapped as writable in the destination process, otherwise GSP calls [[SVC|svcBreak]]. When flushing is enabled and the source address is above VRAM, svcFlushProcessDataCache is used to flush the source buffer.
 
At least one process must have acquired rendering rights; otherwise the command does nothing.


=== Trigger Command List Processing ===
=== Trigger Command List Processing ===
Line 187: Line 193:
|-
|-
| 0
| 0
| u8 CommandID is 0x01
| Command header (ID = 0x01)
|-
|-
| 1
| 1
Line 205: Line 211:
|}
|}


This command converts the specified address to a physical address, then writes the physical address and size to the [[GPU]] registers at 0x1EF018E0. This buffer contains [[GPU/Internal_Registers|GPU commands]]. When flushing is enabled, svcFlushProcessDataCache is used to flush the buffer.
This command sets the [[GPU/External_Registers#Command_List|Command List registers]], and optionally updates gas additive blend results after command processing has ended.
 
No error checking is performed on the parameters. Address and size should both be aligned to 8 bytes, and the address should be in linear, QTM or VRAM memory, otherwise PA 0 is used. When flushing is enabled, [[SVC|svcFlushProcessDataCache]] is used to flush the buffer on the process that has acquired rendering rights.
 
At least one process must have acquired rendering rights; otherwise the command does nothing.


=== Trigger Memory Fill ===
=== Trigger Memory Fill ===
Line 215: Line 225:
|-
|-
| 0
| 0
| u8 CommandID is 0x02
| Command header (ID = 0x02)
|-
|-
| 1
| 1
| Buf0 start address (0 = don't fill anything)
| Buffer 0 start address
|-
|-
| 2
| 2
| Buf0 value
| Buffer 0 value
|-
|-
| 3
| 3
| Buf0 end address
| Buffer 0 end address
|-
|-
| 4
| 4
| Buf1 start address (0 = don't fill anything)
| Buffer 1 start address
|-
|-
| 5
| 5
| Buf1 value
| Buffer 1 value
|-
|-
| 6
| 6
| Buf1 end address
| Buffer 1 end address
|-
|-
| 7
| 7
Line 239: Line 249:
|}
|}


This command converts the specified addresses to physical addresses, then writes these addresses and the specified parameters to the [[GPU]] registers at 0x1EF00010 and 0x1EF00020. Doing so fills the specified buffers with the associated 4-byte value. This is used to clear GPU framebuffers.
This command sets the [[GPU/External_Registers#Memory_Fill|Memory Fill registers]].
The associated buffer address must not be <= to the main buffer address, thus the associated buffer address must not be zero as well. When the bufX address is zero, processing for the bufX parameters is skipped.


The values of Control0 and Control1 give information about the type of memory fill. See [[GPU/External_Registers#Memory Fill|here]] for more information about memory fill parameters.
Addresses should be aligned to 8 bytes and must be in linear, QTM or VRAM memory, otherwise error 0xE0E02BF5 (GSP_INVALID_ADDRESS) is returned. The start address for a buffer must be below its end address, otherwise the same error is returned. If the start address for a buffer is 0, that buffer is skipped; otherwise, its corresponding PSC unit is used for the fill operation.
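
The per-buffer checks described above can be sketched as follows. This is a model under stated assumptions, not GSP's implementation: all names are hypothetical, the memory-region test is left to a caller-supplied predicate because the accepted ranges are not listed here, and applying that predicate to the end address as well is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define GSP_INVALID_ADDRESS 0xE0E02BF5u /* error code quoted in the text */

/* Hypothetical predicate: true if the address is in linear, QTM or VRAM. */
typedef bool (*addr_ok_fn)(uint32_t va);

typedef enum { FILL_SKIP, FILL_OK, FILL_ERROR } fill_check;

/* Checks one buffer of a Memory Fill command.
 * FILL_ERROR corresponds to GSP_INVALID_ADDRESS. */
static fill_check check_fill_buffer(uint32_t start, uint32_t end,
                                    addr_ok_fn in_gpu_memory)
{
    if (start == 0)
        return FILL_SKIP;  /* start address 0: buffer is skipped */
    if (!in_gpu_memory(start) || !in_gpu_memory(end))
        return FILL_ERROR; /* ASSUMPTION: both addresses are checked */
    if (start >= end)
        return FILL_ERROR; /* start must be below end */
    return FILL_OK;
}

/* Toy stand-in predicate for illustration only; NOT a real memory map. */
static bool toy_in_gpu_memory(uint32_t va)
{
    return va >= 0x1000u && va <= 0x2000u;
}
```

The 8-byte alignment is described as something addresses "should" satisfy, so it is deliberately not enforced in this sketch.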


=== Trigger Display Transfer ===
=== Trigger Display Transfer ===
Line 252: Line 261:
|-
|-
| 0
| 0
| u8 CommandID is 0x03
| Command header (ID = 0x03)
|-
|-
| 1
| 1
| Input framebuffer address
| Source address
|-
|-
| 2
| 2
| Output framebuffer address
| Destination address
|-
|-
| 3
| 3
| Input framebuffer [[GPU|dimensions]]
| Source dimensions
|-
|-
| 4
| 4
| Output framebuffer dimensions
| Output dimensions
|-
|-
| 5
| 5
| [[GPU|Flags]], for applications this is 0x1001000 for the main screen, and 0x1000 for the sub screen.
| Flags
|-
|-
| 7-6
| 7-6
Line 273: Line 282:
|}
|}


This command converts the specified addresses to physical addresses, then writes these physical addresses and parameters to the [[GPU]] registers at 0x1EF00C00. This GPU command copies the already rendered framebuffer data from the input GPU framebuffer address to the specified output LCD framebuffer. The input framebuffer is normally located in VRAM.
This command sets the [[GPU/External_Registers#Transfer_Engine|Display Transfer registers]].


The GPU color buffer is stored in the same Z-curve (tiled) format as textures. By default, SetDisplayTransfer converts the given buffer from the tiled format to a linear format adapted to the LCD framebuffers.
No error checking is performed on the parameters. Addresses should be aligned to 8 bytes and should be in linear, QTM or VRAM memory, otherwise PA 0 is used.
 
Display transfers are performed asynchronously, so after requesting a display transfer you should wait for the PPF interrupt to fire before reading the output data.
 
The minimum supported dimension for output is 64x64; anything lower will hang the engine.


=== Trigger Texture Copy ===
=== Trigger Texture Copy ===
Line 289: Line 294:
|-
|-
| 0
| 0
| u8 CommandID is 0x04
| Command header (ID = 0x04)
|-
|-
| 1
| 1
| Input buffer address.
| Source address
|-
|-
| 2
| 2
| Output buffer address.
| Destination address
|-
|-
| 3
| 3
| Total bytes to copy, not including gaps.
| Size
|-
|-
| 4
| 4
| Bits 0-15: Size of input line, in bytes. Bits 16-31: Gap between input lines, in bytes.
| Line width <nowiki>|</nowiki> (gap << 16)
|-
|-
| 5
| 5
| Same as 4, but for the output.
| Same as above, for the destination
|-
|-
| 6
| 6
| Flags, corresponding to the [[GPU/External_Registers#Transfer_Engine|Transfer Engine flags]]. However, for TextureCopy commands, bit 3 is always set, bit 2 is set if any output dimension is smaller than the input, and other bits are always 0.
| Flags
|-
|-
| 7
| 7
Line 313: Line 318:
|}
|}


This command is similar to cmd3. It also triggers the [[GPU/External_Registers#Transfer_Engine|GPU Transfer Engine]], but setting the TextureCopy parameters.
This command sets the [[GPU/External_Registers#TextureCopy|Texture Copy registers]]. Note that GSP doesn't enforce bit3 of the flags to be set.
 
No error checking is performed on the parameters. Addresses and size should be aligned to 8 bytes, and the addresses should be in linear, QTM or VRAM memory, otherwise PA 0 is used.
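
The packing used by parameters 4 and 5 of the table above (line width in the low half, gap in the high half) can be written as a small helper. The function name is hypothetical, and the units of both fields are whatever the Texture Copy registers expect.

```c
#include <stdint.h>

/* Packs a Texture Copy line descriptor: line width | (gap << 16). */
static inline uint32_t texcopy_pack_line(uint16_t line_width, uint16_t gap)
{
    return (uint32_t)line_width | ((uint32_t)gap << 16);
}
```

For example, a line width of 0x40 with a gap of 0x10 packs to 0x00100040.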


=== Flush Cache Regions ===
=== Flush Cache Regions ===
Line 323: Line 330:
|-
|-
| 0
| 0
| u8 CommandID is 0x05
| Command header (ID = 0x05)
|-
|-
| 1
| 1
| Buf0 address
| Buffer 0 address
|-
|-
| 2
| 2
| Buf0 size
| Buffer 0 size
|-
|-
| 3
| 3
| Buf1 address
| Buffer 1 address
|-
|-
| 4
| 4
| Buf1 size
| Buffer 1 size
|-
|-
| 5
| 5
| Buf2 address
| Buffer 2 address
|-
|-
| 6
| 6
| Buf2 size
| Buffer 2 size
|-
|-
| 7
| 7
Line 347: Line 354:
|}
|}


The application buffer addresses specified in the parameters are used with [[SVC|svcFlushProcessDataCache]]. The input buf0 size must not be zero. When buf1 size is zero, svcFlushProcessDataCache() for buf1 and buf2 are skipped. When buf2 size is zero, svcFlushProcessDataCache() for buf2 is skipped.
This command calls svcFlushProcessDataCache for each buffer on the process that has acquired rendering rights.
 
If any call fails, its error is returned; if any buffer has size 0, the buffer is skipped. In both cases, subsequent buffers are not processed.
 
At least one process must have acquired rendering rights; otherwise the error 0xD8202A06 (GSP_NO_RIGHT) is returned.
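
The buffer handling described for this command can be modelled as a loop that stops at the first failing call or the first zero-sized buffer. This is a sketch, not GSP's code: the types and names are hypothetical, the flush callback is a simplified stand-in for svcFlushProcessDataCache (the process argument is omitted), failure is assumed to be a negative result, and returning success when stopping on a zero-sized buffer is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t result_t;

/* Simplified stand-in for svcFlushProcessDataCache. */
typedef result_t (*flush_fn)(uint32_t addr, uint32_t size);

typedef struct {
    uint32_t addr;
    uint32_t size;
} flush_buf;

/* Flushes up to three buffers in order, stopping early as described. */
static result_t flush_regions(const flush_buf bufs[3], flush_fn flush)
{
    for (size_t i = 0; i < 3; i++) {
        if (bufs[i].size == 0)
            break;         /* size 0: this and later buffers are not processed */
        result_t rc = flush(bufs[i].addr, bufs[i].size);
        if (rc < 0)
            return rc;     /* ASSUMPTION: negative results indicate failure */
    }
    return 0;              /* ASSUMPTION: success when nothing failed */
}

/* Counting stub used only to illustrate the control flow. */
static unsigned stub_calls;
static result_t stub_flush(uint32_t addr, uint32_t size)
{
    (void)addr;
    (void)size;
    stub_calls++;
    return 0;
}
```

With buffer 1 set to size 0, only buffer 0 reaches the flush callback, even if buffer 2 has a non-zero size.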