Changes

7,195 bytes added ,  02:59, 25 December 2016
Line 1: Line 1: −
This page is a work in progress. Put everything related to multi-threading here, threads, synchronization, multi-core support, etc.
+
This page documents all kernel functionality for managing multiple processes and threads as well as handling synchronization between them.
   −
The Nintendo 3DS offers support for threading through use of [[SVC]] calls.
+
= Processes =
 +
 
 +
Each process is given an array of [[NCCH/Extended_Header#ARM11_Kernel_Capabilities|kernel capability descriptors]] upon creation (see CreateProcess). Official software forwards the descriptors specified in the [[NCCH#Extended_Header|NCCH exheader]].
 +
 
 +
A process can only use SVCs that are enabled in its kernel capability descriptors. The ARM11 kernel SVC handler enforces this by checking the syscall access control mask stored on the SVC-mode stack. If the SVC isn't enabled, a kernelpanic() is triggered. Each process has a separate SVC-mode stack; this stack and the syscall access mask stored on it are initialized when the process is started. Applications normally only have access to SVCs <=0x3D, and not all SVCs <=0x3D are accessible to them. Most of the SVCs that are accessible to applications are never actually used by them.
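The access check described above can be pictured as a bitmask lookup. The following is a minimal sketch, not the kernel's code: the mask layout (one bit per SVC number, packed into 32-bit words), the total of 128 SVC numbers, and the names svc_mask_t, svc_enable and svc_is_enabled are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: one bit per SVC number, 128 SVC numbers in total. */
#define SVC_COUNT 128u

typedef struct {
    uint32_t bits[SVC_COUNT / 32u];
} svc_mask_t;

/* Mark one SVC number as enabled in the mask (out-of-range numbers are ignored). */
static void svc_enable(svc_mask_t *mask, unsigned svc)
{
    if (svc < SVC_COUNT)
        mask->bits[svc / 32u] |= (uint32_t)1 << (svc % 32u);
}

/* Return true if the SVC number is enabled; out-of-range numbers are never enabled. */
static bool svc_is_enabled(const svc_mask_t *mask, unsigned svc)
{
    if (svc >= SVC_COUNT)
        return false;
    return ((mask->bits[svc / 32u] >> (svc % 32u)) & 1u) != 0;
}
```

In this sketch, a handler would call svc_is_enabled with the requested SVC number and refuse the call when it returns false.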
 +
 
 +
Each process has a separate handle-table, the size of which is stored in the kernel capability descriptor. The handles in a handle-table can't be used in the context of other processes, since those handles don't exist in other handle-tables.
 +
 
 +
0xFFFF8001 is a handle alias for the current process.
 +
 
 +
Calling svcBreak on retail will only terminate the process which called this SVC.
 +
 
 +
== Usage ==
 +
 
 +
=== CreateCodeSet ===
 +
(behavior unconfirmed)
 +
 
 +
Allocates memory for a process according to the given CodeSetInfo contents and copies the segment data from the given memory locations to the allocated memory.
 +
 
 +
=== CreateProcess ===
 +
(behavior unconfirmed)
 +
 
 +
Sets up a process using the segments managed by the given CodeSet handle.
 +
 
 +
This system call furthermore processes the [[NCCH/Extended_Header#ARM11_Kernel_Capabilities|kernel capabilities]] from the [[NCCH/Extended_Header|ExHeader]], hence setting up virtual address mappings, CPU clock frequency/L2 cache configuration, and other things.
 +
 
 +
=== Run ===
 +
(behavior unconfirmed)
 +
 
 +
Sets up the main process thread and appends it to the scheduler queue.
 +
 
 +
The argc, argv, and envp fields from the given StartupInfo structure are ignored.
 +
 
 +
== struct CodeSetInfo ==
 +
All addresses are virtual addresses in the address space of the process to be created.
 +
All sizes are given in units of 0x1000-byte pages.
 +
 
 +
{| class="wikitable" border="1"
 +
!  Type
 +
!  Field
 +
|-
 +
| u8[8]
 +
| Codeset Name
 +
|-
 +
| u16
 +
| Unknown, this is written to field 0x5A of KCodeSet
 +
|-
 +
| u16
 +
| Unknown/padding
 +
|-
 +
| u32
 +
| Unknown/padding
 +
|-
 +
| u32
 +
| .text addr
 +
|-
 +
| u32
 +
| .text size
 +
|-
 +
| u32
 +
| .rodata start
 +
|-
 +
| u32
 +
| .rodata size
 +
|-
 +
| u32
 +
| RW addr (.data + .bss)
 +
|-
 +
| u32
 +
| RW size (.data + .bss)
 +
|-
 +
| u32
 +
| Total .text pages
 +
|-
 +
| u32
 +
| Total .rodata pages
 +
|-
 +
| u32
 +
| Total RW pages (.data + .bss)
 +
|-
 +
| u32
 +
| Unknown/padding
 +
|-
 +
| u8[8]
 +
| Program ID
 +
|}
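The table above can be transcribed into a C structure. This is a sketch under stated assumptions: the field names are made up for illustration, only the types and their order come from the table, and the total size of 0x40 bytes follows from summing the listed fields with no hidden padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Field names are illustrative; types and order follow the table. */
typedef struct {
    uint8_t  name[8];            /* Codeset name */
    uint16_t unk_08;             /* Unknown, written to field 0x5A of KCodeSet */
    uint16_t unk_0a;             /* Unknown/padding */
    uint32_t unk_0c;             /* Unknown/padding */
    uint32_t text_addr;          /* .text addr */
    uint32_t text_size;          /* .text size, in pages */
    uint32_t rodata_addr;        /* .rodata start */
    uint32_t rodata_size;        /* .rodata size, in pages */
    uint32_t rw_addr;            /* RW addr (.data + .bss) */
    uint32_t rw_size;            /* RW size (.data + .bss), in pages */
    uint32_t text_total_pages;   /* Total .text pages */
    uint32_t rodata_total_pages; /* Total .rodata pages */
    uint32_t rw_total_pages;     /* Total RW pages (.data + .bss) */
    uint32_t unk_34;             /* Unknown/padding */
    uint8_t  program_id[8];      /* Program ID */
} CodeSetInfo;

/* Compile-time checks that the layout matches the summed field sizes. */
_Static_assert(sizeof(CodeSetInfo) == 0x40, "CodeSetInfo should be 0x40 bytes");
_Static_assert(offsetof(CodeSetInfo, text_addr) == 0x10, "unexpected .text addr offset");

/* Convert a size given in 0x1000-byte pages to bytes. */
static uint32_t pages_to_bytes(uint32_t pages)
{
    return pages * 0x1000u;
}
```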
    
= Threads =
Line 13: Line 98:
Lower priority values give the thread higher priority. For userland apps, priorities between 0x18 and 0x3F are allowed. The priority of the app's main thread seems to be 0x30.
   −
The thread scheduler is cooperative, therefore if a thread takes up all the CPU time (for example if it enters an endless loop), all the other threads that run on the same CPU core won't get a chance to run. The main way of yielding another thread is using an address arbiter.
+
The [[Glossary#appcore|appcore]] thread scheduler primarily uses a cooperative design: if a thread takes up all the CPU time (for example by entering an endless loop), the other threads that run on the same CPU core won't get a chance to run. The main way of yielding to another thread is using an address arbiter. In certain cases, execution of the current task may be preempted regardless, for instance when a thread was waiting on svcSendSyncRequest to return.
 +
 
 +
0xFFFF8000 is a handle alias for the currently active thread.
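The two handle aliases mentioned on this page can be written down as constants. This is only a sketch: the values come from the text, while the Handle typedef, the macro names and the helper handle_is_alias are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Handle;

/* Alias values as given in the text; the macro names are illustrative. */
#define CUR_THREAD_HANDLE  ((Handle)0xFFFF8000u)
#define CUR_PROCESS_HANDLE ((Handle)0xFFFF8001u)

/* Return true if the handle is one of the two aliases rather than a handle-table entry. */
static bool handle_is_alias(Handle h)
{
    return h == CUR_THREAD_HANDLE || h == CUR_PROCESS_HANDLE;
}
```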
    
== Usage ==
Line 20: Line 107:
'''svc''' : 0x08
   −
'''Definition'''
+
'''Signature'''
 
  Result CreateThread(Handle* thread, func entrypoint, u32 arg, u32 stacktop, s32 threadpriority, s32 processorid);
Line 34: Line 121:
'''Details'''
The processorid parameter specifies which processor the thread can run on. Non-negative values correspond to a specific CPU. (e.g. 0 for the Appcore and 1 for the Syscore on Old3DS) Special value -1 means all CPUs, and -2 means the default CPU for the process (Read from the [[NCCH/Extended Header|Exheader]], usually 0 for applications, 1 for system services). Games usually create threads using -2.
     −
With the Old3DS kernel, the s32 processorid must be <=2.
+
Creates a new thread in the current process which will begin execution at the given entrypoint. The SP CPU register will be initialized to stacktop, while r0 will be initialized to the given arg.
   −
With the New3DS kernel: processorid must be <= <total cores(MPCore "SCU Configuration Register" CPU number value + 1)>. When processorid==0x2 and the process is not an APPLICATION mem-region process, exheader kernel-flags bitmask 0x2000 must be set otherwise error 0xD9001BEA is returned. When processorid==0x3 and the process is not an APPLICATION mem-region process, error 0xD9001BEA is returned. These are the only restriction checks done by the kernel for processorid.
+
The addresses passed as Entrypoint_Param and StackTop are normally the same, but they may be chosen arbitrarily. For the main thread (created in svcRun), Entrypoint_Param is 0.
   −
The thread priority value must be in the following range: 0x0..0x3F.
+
The stacktop should be aligned to 0x8 bytes; if it is not, the ARM11 kernel clears the low 3 bits of the stacktop address.
   −
The stacktop must be aligned to 0x8-bytes, otherwise when not aligned to 0x8-bytes the ARM11 kernel clears the low 3-bits of the stacktop address.
+
The processorid parameter specifies which processor the thread can run on. Non-negative values correspond to a specific CPU (e.g. 0 for the Appcore and 1 for the Syscore on Old3DS). The special value -1 means all CPUs, and -2 means the default CPU for the process (read from the [[NCCH/Extended Header|Exheader]]; usually 0 for applications, 1 for system services). Games usually create threads using -2.
 +
 
 +
The thread priority value must be in the range 0x0..0x3F. Otherwise, error 0xE0E01BFD is returned.
   −
The input address used for Entrypoint_Param and StackTop are normally the same, however these can be arbitrary. For the main thread the Entrypoint_Param is value 0.
+
With the Old3DS kernel, the s32 processorid must be <=2 (for the processorid validation check in the kernel). With the New3DS kernel, the processorid validation check requires processorid to be less than or equal to <total cores(MPCore "SCU Configuration Register" CPU number value + 1)>, and a number of additional constraints apply: When processorid==0x2 and the process is not a BASE mem-region process, exheader kernel-flags bitmask 0x2000 must be set (otherwise error 0xD9001BEA is returned). When processorid==0x3 and the process is not a BASE mem-region process, error 0xD9001BEA is returned. These are the only restriction checks done by the kernel for processorid.
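Two of the checks described above, the priority range and the stack alignment, can be sketched in a few lines. This is an illustration rather than the kernel's implementation: the helper names and the Result typedef are made up, and treating 0 as the success value is an assumption. Only the range 0x0..0x3F, the error value 0xE0E01BFD and the clearing of the low 3 bits come from the text.

```c
#include <stdint.h>

typedef uint32_t Result;

#define RESULT_OK                 ((Result)0)           /* assumed success value */
#define ERR_PRIORITY_OUT_OF_RANGE ((Result)0xE0E01BFDu) /* value from the text; name is illustrative */

/* Accept priorities in 0x0..0x3F, otherwise report the documented error. */
static Result check_thread_priority(int32_t priority)
{
    if (priority < 0 || priority > 0x3F)
        return ERR_PRIORITY_OUT_OF_RANGE;
    return RESULT_OK;
}

/* Clear the low 3 bits so the stack top is 8-byte aligned. */
static uint32_t align_stacktop(uint32_t stacktop)
{
    return stacktop & ~(uint32_t)7;
}
```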
    
=== ExitThread ===
'''svc''' : 0x09
   −
'''Definition'''
+
'''Signature'''
 
  void ExitThread(void);
Line 55: Line 143:
'''svc''' : 0x0A
   −
'''Definition'''
+
'''Signature'''
 
  void SleepThread(s64 nanoseconds);
Line 61: Line 149:
'''svc''' : 0x0B
   −
'''Definition'''
+
'''Signature'''
 
  Result GetThreadPriority(s32* priority, Handle thread);
Line 77: Line 165:
'''svc''' : 0x0C
   −
'''Definition'''
+
'''Signature'''
 
  Result SetThreadPriority(Handle thread, s32 priority);
Line 83: Line 171:
'''svc''' : 0x34
   −
'''Definition'''
+
'''Signature'''
 
  Result OpenThread(Handle* thread, Handle process, u32 threadId);
Line 89: Line 177:
'''svc''' : 0x36
   −
'''Definition'''
+
'''Signature'''
 
  Result GetProcessIdOfThread(u32* processId, Handle thread);
Line 95: Line 183:
'''svc''' : 0x37
   −
'''Definition'''
+
'''Signature'''
 
  Result GetThreadId(u32* threadId, Handle thread);
 +
 +
'''Details'''
 +
It seems that only the thread itself or one of its parents can get the ID. Calling this on the handle of a sibling or parent seems to always yield the ID 0.
    
=== GetThreadInfo ===
'''svc''' : 0x2C
   −
'''Definition'''
+
'''Signature'''
 
  Result GetThreadInfo(s64* out, Handle thread, ThreadInfoType type);
−
{| class="wikitable" border="1"
−
!  ThreadInfoType value
−
!  Description
−
|-
−
| ?
−
| ?
−
|}
+
'''Details'''
+
This request always returns an error when called; it only checks whether the handle refers to a thread.
+
Hence, it returns 0xD8E007ED (BAD_ENUM) if the handle is a thread handle, and 0xD8E007F7 (BAD_HANDLE) if it isn't.
      
=== GetThreadContext ===
'''svc''' : 0x3B
   −
'''Definition'''
+
'''Signature'''
 
  Result GetThreadContext(ThreadContext* context, Handle thread);
Line 122: Line 209:
== Core affinity ==
 +
 +
The cores are numbered from 0 to 1 on Old3DS and from 0 to 3 on New3DS.
    
=== GetThreadAffinityMask ===
'''svc''' : 0x0D
   −
'''Definition'''
+
'''Signature'''
 
  Result GetThreadAffinityMask(u8* affinitymask, Handle thread, s32 processorcount);
Line 132: Line 221:
'''svc''' : 0x0E
   −
'''Definition'''
+
'''Signature'''
 
  Result SetThreadAffinityMask(Handle thread, u8* affinitymask, s32 processorcount);
Line 138: Line 227:
'''svc''' : 0x0F
   −
'''Definition'''
+
'''Signature'''
 
  Result GetThreadIdealProcessor(s32* processorid, Handle thread);

=== SetThreadIdealProcessor ===
'''svc''' : 0x10
 +
 +
=== APT:SetApplicationCpuTimeLimit ===
 +
 +
See [[APT:SetApplicationCpuTimeLimit]].
 +
 +
You are not able to use the system core (core1) by default. You first have to assign the amount of system core time that your application may use.
 +
The value is a percentage: the higher it is, the more of the system core is available to your application.
 +
 +
For example if you set this value to 25%, it means that your application will be able to use 25% of the system core at most, even if you never issue system calls.
 +
 +
If you set the value to a non-zero value, you will not be able to set it back to 0%.
 +
Keep in mind that if your application depends heavily on the system, setting a high value for your application might yield poorer performance than setting a low value.
 +
 +
=== APT:GetApplicationCpuTimeLimit ===
 +
 +
See [[APT:GetApplicationCpuTimeLimit]].
    
== Debug ==
Line 155: Line 260:
= Synchronization =
 +
 +
Synchronization can be performed via WaitSynchronization on any handles deriving from [[KSynchronizationObject]]. The semantic meaning of the call depends on the particular object type referred to by the given handle:
 +
 +
* KClientPort: Wakes if max sessions not reached (free session available)
 +
* KClientSession: Always false?
 +
* KDebug: Waits until a debug event is signaled (the user should then use svcGetProcessDebugEvent to get the debug event info)
 +
* KDmaObject: ???
 +
* KEvent: Waits until the event is signaled
 +
* KMutex: Acquires a lock on the mutex (blocks until this succeeds)
 +
* KProcess: Waits until the process exits/is terminated
 +
* KSemaphore: This consumes a value from the semaphore count, if possible, otherwise continues to wait
 +
* KServerPort: Waits for a new client connection, upon which svcAcceptSession is ready to be called
 +
* KServerSession: Waits for an IPC command to be submitted to the server process
 +
* KThread: Waits until the thread terminates
 +
* KTimer: Wakes when timer activates (this also clears the timer if it is oneshot)
    
Most synchronization systems seem to have both a "normal" and a "light-weight" version.
   −
== Mutex (normal) ==
+
== Mutex ==
    
For Kernel implementation details, see [[KMutex]]
Line 168: Line 288:
=== ReleaseMutex ===
   −
== Ciritical Section (light-weight mutex) ==
+
== Semaphore ==
 +
 
 +
== Event ==
   −
== CriticalSection::Initialize ==
+
== Address Arbiters ==
   −
Same thread ownership as a mutex ?
+
Address arbiters are a low-level primitive for implementing synchronization based on a counter stored at some user-specified virtual memory address. They are used to put the current thread to sleep until the counter is signaled. Both waiting and signaling are performed through ArbitrateAddress.
   −
=== CriticalSection::Enter ===
+
Address arbiters are implemented by [[KAddressArbiter]].
   −
=== CriticalSection::Leave ===
+
===CreateAddressArbiter===
 +
Result CreateAddressArbiter(Handle* arbiter)
   −
== Semaphore ==
+
Creates an address arbiter handle for use with ArbitrateAddress.
   −
== Light Semaphore ? ==
+
=== ArbitrateAddress ===
 +
Result ArbitrateAddress(Handle arbiter, u32 addr, ArbitrationType type, s32 value, s64 nanoseconds)
   −
Does it exist ?
+
If <code>type</code> is SIGNAL, the ArbitrateAddress call will resume up to <code>value</code> of the threads waiting on <code>addr</code> using an arbiter, starting with the highest-priority threads. If <code>value</code> is negative, all of these threads are released. <code>nanoseconds</code> remains unused in this mode.
   −
== Event ==
+
The other modes are used to (conditionally) put the current thread to sleep based on the memory word at virtual address <code>addr</code> until another thread signals that address using ArbitrateAddress with the <code>type</code> SIGNAL. WAIT_IF_LESS_THAN will put the current thread to sleep if that word is smaller than <code>value</code>. DECREMENT_AND_WAIT_IF_LESS_THAN will furthermore decrement the memory value before the comparison. WAIT_IF_LESS_THAN_TIMEOUT and DECREMENT_AND_WAIT_IF_LESS_THAN_TIMEOUT will do the same as their counterparts, but will have thread execution resume if <code>nanoseconds</code> nanoseconds pass without <code>addr</code> being signaled.
   −
== Light Event ==
+
=== enum ArbitrationType ===
 +
{| class="wikitable" border="1"
 +
!  Address arbitration type
 +
!  Value
 +
|-
 +
| SIGNAL
 +
| 0
 +
|-
 +
| WAIT_IF_LESS_THAN
 +
| 1
 +
|-
 +
| DECREMENT_AND_WAIT_IF_LESS_THAN
 +
| 2
 +
|-
 +
| WAIT_IF_LESS_THAN_TIMEOUT
 +
| 3
 +
|-
 +
| DECREMENT_AND_WAIT_IF_LESS_THAN_TIMEOUT
 +
| 4
 +
|}
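The arbitration rules described above can be modelled in plain C. This is a sketch of the decision logic only, not kernel code and not an SVC wrapper: the enumerator prefix and the helper names arbiter_would_sleep and arbiter_wake_count are illustrative, the memory word is assumed to be compared as a signed 32-bit value, and timeouts are not modelled. The enum values and the decrement-before-compare behaviour are taken from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values from the ArbitrationType table; the ARBITRATION_ prefix is illustrative. */
typedef enum {
    ARBITRATION_SIGNAL = 0,
    ARBITRATION_WAIT_IF_LESS_THAN = 1,
    ARBITRATION_DECREMENT_AND_WAIT_IF_LESS_THAN = 2,
    ARBITRATION_WAIT_IF_LESS_THAN_TIMEOUT = 3,
    ARBITRATION_DECREMENT_AND_WAIT_IF_LESS_THAN_TIMEOUT = 4
} ArbitrationType;

/* Decide whether the calling thread would be put to sleep.
 * The DECREMENT modes modify *word before the comparison, as the text describes. */
static bool arbiter_would_sleep(ArbitrationType type, int32_t *word, int32_t value)
{
    switch (type) {
    case ARBITRATION_DECREMENT_AND_WAIT_IF_LESS_THAN:
    case ARBITRATION_DECREMENT_AND_WAIT_IF_LESS_THAN_TIMEOUT:
        *word -= 1;
        return *word < value;
    case ARBITRATION_WAIT_IF_LESS_THAN:
    case ARBITRATION_WAIT_IF_LESS_THAN_TIMEOUT:
        return *word < value;
    default:
        return false; /* SIGNAL never puts the caller to sleep */
    }
}

/* SIGNAL mode: number of waiting threads that get resumed.
 * A negative value releases all of them. */
static int32_t arbiter_wake_count(int32_t waiting, int32_t value)
{
    if (value < 0 || value > waiting)
        return waiting;
    return value;
}
```

In this model, a lock built on top of an arbiter would keep its counter at <code>addr</code>, call the wait modes when the counter indicates contention, and use SIGNAL with a <code>value</code> of 1 to wake a single waiter.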