Bugs: Difference between revisions
Revision as of 20:24, 25 July 2023
The PS Vita has bugs. Some bugs can lead to Vulnerabilities. Others lead to nothing useful (yet) but can serve as examples of what not to do.
Exploitable bugs
See Vulnerabilities.
Non-exploitable bugs
Kernel
Syscall table collision between modules
Because there is a delay between the moment a syscall slot is allocated and the moment it is written to, collisions can occur. For example, assume there is one empty syscall slot left. Two modules that each export syscalls are loaded, and both are assigned the final free slot. A user library is then loaded that imports from the first module, and afterwards from the second module. At this point, the function pointer exported by the first module is replaced with the one from the second.
It is unlikely that this would lead to a security vulnerability, but it could cause system instability. However, if the system has that many syscalls loaded (let us assume more than 3000), it may already be in an unstable state.
Kernel heap pointer leak in sceKernelGetLibraryInfoByNID
Discovered on 2019-12-17 by Princess of Sleeping.
SceKernelModulemgr#sceKernelGetLibraryInfoByNID leaks a kernel heap pointer, but it is probably not useful for kernel exploitation.
SceKernelLibraryInfo.libname is a pointer to kernel memory. See SceKernelModulemgr#Types.
PoC code:
SceUID modids[0x80];
SceSize num = 0x80;
SceKernelLibraryInfo libinfo;
libinfo.size = sizeof(libinfo);

sceKernelGetModuleList(~0, modids, (int *)&num);
sceKernelGetLibraryInfoByNID(modids[num - 2], 0xCAE9ACE6, &libinfo);
sceClibPrintf("LEAKED KERNEL HEAP POINTER !!! ---> 0x%X <--- !!!\n", libinfo.libname);
Not fixed as of FW 3.600.011.
SceIofilemgr misses internal NULL pointer checks
SceIofilemgr's syscall wrappers perform various sanity checks on usermode arguments, but some of the internal functions that the syscalls call do not perform proper checks.
For example, you can simply trigger a Kernel DABT by running the following code:
sceIoDevctl(NULL, 0, NULL, 0, NULL, 0);
Confirmed in FW 2.10. FWs >=3.60 have proper checks.
sceAppMgrDestroyAppByAppId triggers kernel panic
Triggering a usermode exception immediately after calling sceAppMgrDestroyAppByAppId causes ?SceKernelThreadMgr? to get confused and trigger a kernel exception.
sceKernelCreateThread in thumb mode
SceKernelThreadMgr#sceKernelCreateThreadForUser checks the memory attributes to verify that the entry point is executable. In Thumb mode, however, the function pointer always has bit 0 set, so if the entry point lies in the last 4 bytes of a memory page, the following check fails and the function returns 0x80020006:
res = sceKernelIsEqualAccessibleRangeProcBySWForDriver(pid, memory_attr, entry, 4);
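One plausible reading of this failure, sketched below, is that the Thumb bit shifts the checked 4-byte range by one byte, so it ends one byte into the following page. The 4 KiB page size, both helper names and the idea of clearing bit 0 before the check are assumptions made for illustration; none of them are taken from the firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u /* assumption: 4 KiB pages */

/* Hypothetical helper: true if [start, start + size) touches more than one page. */
static bool range_crosses_page(uint32_t start, uint32_t size)
{
    return (start / PAGE_SIZE) != ((start + size - 1u) / PAGE_SIZE);
}

/* Hypothetical helper: clear the Thumb bit to get the real instruction address. */
static uint32_t strip_thumb_bit(uint32_t entry)
{
    return entry & ~1u;
}

/* Example values: a Thumb entry point located in the last 4 bytes of a page. */
void thumb_entry_demo(void)
{
    uint32_t entry = 0x81000FFCu | 1u; /* Thumb function pointer */

    /* With bit 0 set, the 4-byte range spills into the next page. */
    assert(range_crosses_page(entry, 4u));

    /* With bit 0 cleared, the same range stays inside one page. */
    assert(!range_crosses_page(strip_thumb_bit(entry), 4u));
}
```

If this reading is right, a caller can avoid the error by not placing a Thumb entry point in the last 4 bytes of a page.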
sceNetRecvfromForDriver 0xC0022005 error on kernel call
This happens because the internal function always sets the is_user flag in its parameter, so passing a kernel memory pointer as data to SceNetPs#sceNetRecvfromForDriver results in an error in SceSysmem#sceKernelCopyToUserDomainForKernel or SceSysmem#sceKernelCopyToUserTextDomainForKernel.
// Offsets are for FW 3.60

// Patch by function hook
SceUID target = -1;
tai_hook_ref_t FUN_8100d5a8_ref;

int FUN_8100d5a8_patch(void *a1, void *a2, void *a3, int a4, void *a5, void *a6)
{
    if (target == sceKernelGetThreadIdForDriver())
        *(int *)(a3 + 5 * 4) = 1; // 0:user 1:kernel 2~:kpanic

    return TAI_CONTINUE(int, FUN_8100d5a8_ref, a1, a2, a3, a4, a5, a6);
}

// Patch by code injection (recommended)
int patch_netrecv_0xC0022005(void)
{
    /*
     * 810067b2 c0 ef 10 00  vmov.i32 d16,#0            -> DD F8 30 C0  ldr ip, [sp, #0x30]
     * 810067b6 19 68        ldr r1, [r3]
     * 810067b8 a2 60        str r2, [r4, #8]
     * 810067ba da f8 0c 30  ldr.w r3, [sl, #0xc]
     * 810067be c4 e9 07 55  strd r5, r5, [r4,#0x1c]    -> C4 E9 07 C5  strd ip, r5, [r4,#0x1c]
     * 810067c2 a5 61        str r5, [r4, #0x18]
     * 810067c4 e3 60        str r3, [r4, #0xc]
     * 810067c6 61 62        str r1, [r4, #0x24]
     * 810067c8 c4 ed 04 0b  vstr.64 d16, [r4,#0x10]    -> C4 E9 04 55  strd r5, r5, [r4,#0x10]
     */
    SceUID module_id;
    void *patch_point;
    char inst[0x20];

    module_id = sceKernelSearchModuleByNameForDriver("SceNetPs");
    module_get_offset(0x10005, module_id, 0, 0x67b2, (uintptr_t *)&patch_point);

    memcpy(inst, patch_point, 0x1E);
    memcpy(&(inst[0x0]), (const char[4]){0xDD, 0xF8, 0x30, 0xC0}, 4);
    memcpy(&(inst[0xC]), (const char[4]){0xC4, 0xE9, 0x07, 0xC5}, 4);
    memcpy(&(inst[0x16]), (const char[4]){0xC4, 0xE9, 0x04, 0x55}, 4);

    taiInjectDataForKernel(0x10005, module_id, 0, 0x67B2, inst, 0x1E);

    return 0;
}
Illegal alignment check of kernel allocator
Discovered on 2021-08-30 by Princess of Sleeping.
For example, if 0x880 (which is not a power of two) is passed as the alignment argument of kernel malloc, the function does not reject it and will not return NULL.
This affects at least SceNetPs malloc and system malloc internal/external.
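For comparison, here is a minimal sketch of the kind of check a caller could apply before invoking these allocators. It assumes that only power-of-two alignments are meant to be legal, which is an assumption: the rule the firmware intends is not documented here. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: accept only non-zero power-of-two alignments. */
static bool is_valid_alignment(size_t alignment)
{
    /* A power of two has exactly one bit set, so x & (x - 1) is zero. */
    return alignment != 0 && (alignment & (alignment - 1)) == 0;
}

void alignment_demo(void)
{
    assert(is_valid_alignment(0x80));
    assert(is_valid_alignment(0x800));
    assert(!is_valid_alignment(0x880)); /* two bits set: 0x800 | 0x80 */
    assert(!is_valid_alignment(0));
}
```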
Ignored sceGUIDGetNameCore error propagation
Discovered on 2022-03-10 by Princess of Sleeping.
sceGUIDGetNameCore, which is called internally by SceSysmem#sceGUIDGetNameForDriver or SceSysmem#sceGUIDGetName2ForDriver, always returns 0 even if an error occurs in the function.
void unsafe_calling_example_1(void)
{
    int res;
    const char *name;

    // Use some tricks to reach sceGUIDGetNameCore with an invalid guid.
    res = sceGUIDGetName((invalid_guid | 1) & ~0xC0000000, &name);

    // res is always 0, even if the call failed internally.
    // sceGUIDGetNameCore initializes name with NULL, but if the internal check
    // fails too early, name is not initialized and its value is undefined.
}

void unsafe_calling_example_2(void)
{
    const char *name;

    // Use some tricks to reach sceGUIDGetNameCore with an invalid guid.
    name = sceGUIDGetName2((invalid_guid | 1) & ~0xC0000000);

    // The internal result is always 0, even if the call failed internally.
    // sceGUIDGetNameCore initializes name with NULL, but if the internal check
    // fails too early, name is not initialized and its value is undefined.
    // If sceGUIDGetNameCore failed internally, the value of name is
    // *(uint32_t *)(unsafe_calling_example_2_current_sp - 0x10)
}

void safe_calling_example_1(void)
{
    int res;
    const char *name = NULL; // Initialize with NULL in advance

    // Use some tricks to reach sceGUIDGetNameCore with an invalid guid.
    res = sceGUIDGetName((invalid_guid | 1) & ~0xC0000000, &name);

    // res is always 0, even if the call failed internally.
    if (NULL == name) {
        sceKernelPrintf("Failed %s\n", "sceGUIDGetName");
    }
}

void safe_calling_example_2(void)
{
    int res;
    const char *name;

    // Add a guid validity check
    res = some_guid_valid_check(invalid_guid);
    if (res < 0)
        return; // If the guid is invalid, do not call sceGUIDGetName2.

    // Use some tricks to reach sceGUIDGetNameCore.
    name = sceGUIDGetName2((invalid_guid | 1) & ~0xC0000000);

    // name is never NULL here.
}
Incomplete register restore on intr handler
Discovered on 2023-03-08 by Princess of Sleeping.
Confirmed on FW 1.810.
In the example below, an interrupt occurs while rw_data is being loaded. The interrupt handler services it, but when it returns it does not fully restore the DACR: the value is restored from the ThreadCB instead.
The value stored in the ThreadCB is the kernel's default client setting, 0x55550000. So when the interrupt ends and the code tries to write to rx_data, a DABT occurs.
int resolve_something(void *rx_data, void *rw_data){

    SceUInt32 dacr = sceKernelGetDACR();
    sceKernelSetDACR(dacr | 0xFFFF0000);

    // An interrupt occurs here; the handler's register restore sets DACR to 0x55550000.
    int write_data = *(int *)rw_data;

    // Triggers a DABT because RX memory is not writable.
    *(int *)rx_data = write_data;

    sceKernelSetDACR(dacr);

    return 0;
}
Disabling interrupts around the access fixes this:
int resolve_something(void *rx_data, void *rw_data){

    asm volatile ("cpsid aif\n");

    SceUInt32 dacr = sceKernelGetDACR();
    sceKernelSetDACR(dacr | 0xFFFF0000);

    int write_data = *(int *)rw_data;
    *(int *)rx_data = write_data;

    sceKernelSetDACR(dacr);

    asm volatile ("cpsie aif\n");

    return 0;
}
DACR corruption due to sceKernelIsAccessibleRangeProc
If a non-zero pid argument is passed to sceKernelIsAccessibleRangeProc, it switches to the target process's MMU mapping but does not restore DACR correctly when it finishes.
Simplified code:
int sceKernelIsAccessibleRangeProc(int pid, int perm, const void *addr, int size){
    if(pid != 0){
        SceUInt32 dacr = sceKernelGetDACR();
        set_process_mmu(pid);
        sceKernelIsAccessibleRangeProc_core(perm, addr, size);
        sceKernelSetDACR(dacr & 0x55555555);
    }else{
        // ...
    }
}
Be careful on a Development Kit: the callback of sceKernelPrintf calls sceKernelIsAccessibleRangeProc (SceSysmem::sceKernelPrintf -> SceDeci4pSTtyp::handler -> sceKernelIsAccessibleRangeProc for the n/s formats).
Also, if something crashes when writing to RX memory after setting DACR to 0xFFFFFFFF, suspect this bug. This is not the only function that handles MMU mapping in this way.
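The effect of the `dacr & 0x55555555` restore can be worked out with plain bit arithmetic. The sketch below relies on the ARMv7 DACR layout (16 domains, 2 bits each, 0b01 = client, 0b11 = manager). The helper names are hypothetical, and the starting value 0xFFFF0000 is simply the kernel default 0x55550000 with the upper domains raised to manager, as in the earlier example.

```c
#include <assert.h>
#include <stdint.h>

enum { DOMAIN_NO_ACCESS = 0, DOMAIN_CLIENT = 1, DOMAIN_MANAGER = 3 };

/* Extract the 2-bit access field of one of the 16 DACR domains. */
static uint32_t domain_access(uint32_t dacr, unsigned domain)
{
    return (dacr >> (domain * 2u)) & 3u;
}

/* Models the restore in the simplified code above (hypothetical name). */
static uint32_t buggy_restore(uint32_t saved_dacr)
{
    return saved_dacr & 0x55555555u;
}

void dacr_demo(void)
{
    uint32_t caller_dacr = 0x55550000u | 0xFFFF0000u; /* manager on domains 8..15 */
    uint32_t after = buggy_restore(caller_dacr);

    assert(domain_access(caller_dacr, 15) == DOMAIN_MANAGER);
    /* Every manager field (0b11) is silently downgraded to client (0b01). */
    assert(after == 0x55550000u);
    assert(domain_access(after, 15) == DOMAIN_CLIENT);
}
```

In client mode the page table permissions are enforced again, which is consistent with the write to RX memory faulting after the call.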
Limited buffer size in dbginfo handler for sceKernelPrintf*
The handler properly converts dbginfo such as 0:0xAAAAAAAA55555555(something_func:335):0xA5A5A5A5(file.c):Hi\n and outputs it to the tty. However, its buffer size is limited, so if the function name or file name is too long, the conversion is cut off and incorrect output is written to the tty.
Wrong range control in vnode lock/unlock
If a thread tries to lock a vnode while another thread holds the lock on it, vp->waiter is incremented. However, waiter is a 32-bit member, while the lock function tries to increment it over a 64-bit range.
R_ARM_CALL/R_ARM_JUMP24 relocations not performed properly
Discovered on 2023-06-17 by CreepNT.
There is a bug in the SceKernelModulemgr routine that handles relocation types R_ARM_CALL (28) and R_ARM_JUMP24 (29):
// S, A and P correspond to the relocation variables detailed in the "ELF for the Arm® Architecture" document.
int displacement = (A - P) + S;
unsigned opcode = read_opcode_from_address(P);

if ((opcode & 0xF0000000) == 0x0) { // <- bug here
    // write bit 1 of displacement in 'H' bit of BLX
    opcode = (opcode & 0xFEFFFFFF) | ((displacement & 0x2) << 23);
}

// write displacement in imm24 (bottom 2 bits not needed due to code alignment)
opcode = (opcode & 0xFF000000) | ((displacement >> 2) & 0xFFFFFF);
write_opcode_to_address(P, opcode);
The if-gated code is supposed to handle the special case of the BLX instruction, which has an additional bit (H) of storage for the offset to the target function (because ARM code is 4-byte aligned but Thumb code is 2-byte aligned). The BLX instruction should be identified by its cond=0xF, but this code checks for cond=0x0 (EQ) instead.
This bug thus causes all relocated BLEQ instructions to turn into BEQ instructions. Fortunately, this has no consequence because the instructions are equivalent.
However, and most importantly, it also results in some BLX instructions not being properly relocated (as H is not set/cleared when it should be). One of three scenarios happens when a BLX is "relocated":
- H has the correct value: everything goes fine
- H is set but should be clear: BLX will skip the first instruction of the target function
- H is clear but should be set: BLX will jump to the instruction right before the target function's start
When an improperly relocated BLX is executed, the program may end up crashing (e.g. UNDEFINED abort), behave unexpectedly (function doesn't actually run, argument is corrupted, etc.) or appear to work properly, depending on the exact situation.
This bug exists since at least firmware 0.920 and has never been fixed. It is plausible it exists since an earlier (or even the first) revision of the OS.
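The difference between the shipped check and the intended one can be reproduced on plain integers. The sketch below is a host-side model, not the firmware routine: `relocate_branch` and its `fixed` switch are hypothetical names, and the opcodes are standard ARM encodings (0xFA000000 is a BLX with a zero offset, 0x0B000000 is a BLEQ with a zero offset).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the relocation step described above.
 * fixed == 0 reproduces the shipped check (cond == 0x0),
 * fixed != 0 uses the check BLX actually needs (cond == 0xF).
 */
static uint32_t relocate_branch(uint32_t opcode, int32_t displacement, int fixed)
{
    uint32_t disp = (uint32_t)displacement;
    uint32_t blx_cond = fixed ? 0xFu : 0x0u;

    if ((opcode >> 28) == blx_cond) {
        /* Copy bit 1 of the displacement into bit 24, the H bit of BLX. */
        opcode = (opcode & 0xFEFFFFFFu) | ((disp & 0x2u) << 23);
    }

    /* Store the word-granular displacement in imm24. */
    return (opcode & 0xFF000000u) | ((disp >> 2) & 0x00FFFFFFu);
}

void relocation_demo(void)
{
    /* BLX to a target 6 bytes away: H must be set. */
    assert(relocate_branch(0xFA000000u, 6, 1) == 0xFB000001u);
    /* The shipped check skips BLX, so H keeps its old (clear) value. */
    assert(relocate_branch(0xFA000000u, 6, 0) == 0xFA000001u);

    /* BLEQ to a target 8 bytes away: the shipped check clears bit 24 (BEQ). */
    assert(relocate_branch(0x0B000000u, 8, 0) == 0x0A000002u);
    assert(relocate_branch(0x0B000000u, 8, 1) == 0x0B000002u);
}
```

With the shipped check, the H bit of a BLX is left at whatever value the linker wrote, which matches the three scenarios listed above.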
Kernel Boot Loader
Out of range access in SKBL
Discovered on 2022-01-20 by Princess of Sleeping.
To decode the ARZL-encoded TrustZone SceSysmem, SKBL maps Compati SRAM (PA 0x1C000000) to a TrustZone VA with a size of 2 MiB. It then calls SKBL#sceArlzDecode with an improper maximum destination size (0x1000000 rather than the 0x200000 that was mapped). As a result, if glitches during decoding make the output exceed 2 MiB, the size check still passes and memory outside the range of the device is accessed, which can trigger a Data Abort exception.
Moreover, even if SKBL#sceArlzDecode returns an error code, the return value is passed unchecked as an argument to SKBL#sceArlzArmFilter, so accesses over up to 0x80560201 bytes will occur.
if (sceKernelCpuId() == 0) {
    sceKernelMMUMapSections(*(void **)(param_1 + 0x60), 0x1061D007, 0xC, 0x1C000000, 0x200000 /* mapping size */, 0x1C000000);
    res = sceArlzDecode(0x1C000000 /* dst */, 0x1000000 /* dst max size */, &ARZL_encoded_SceSysmem[4] /* src */, NULL);
    size = sceArlzArmFilter(0x1C000000, res, 0);
    g_Tzs_SceSysmem_start_address = 0x1C000000;
    g_Tzs_SceSysmem_end_address = 0x1C000000 + size;
}
For now this remains just a bug: no glitching has been attempted, and a Data Abort exception is not useful.
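The two missing checks can be illustrated with a small host-side model. Everything named below is hypothetical, and the code assumes that SCE error codes are the values with the top bit set (as 0x80560201 is). It only shows which sizes slip through, not how SKBL is actually structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAPPING_SIZE   0x200000u  /* size mapped by sceKernelMMUMapSections */
#define PASSED_DST_MAX 0x1000000u /* size actually given to sceArlzDecode */

/* Models the decoder's bound check against whatever limit it was given. */
static bool write_passes_check(uint32_t end_offset, uint32_t dst_max)
{
    return end_offset <= dst_max;
}

/* Assumption: error codes have the top bit set. */
static bool is_error_code(uint32_t res)
{
    return (res & 0x80000000u) != 0;
}

/* Hypothetical guard: only forward sizes that fit inside the mapping. */
static uint32_t checked_filter_size(uint32_t res)
{
    if (is_error_code(res) || res > MAPPING_SIZE)
        return 0;
    return res;
}

void skbl_size_demo(void)
{
    uint32_t end = MAPPING_SIZE + 4u; /* write ending just past the mapping */

    assert(write_passes_check(end, PASSED_DST_MAX)); /* accepted as shipped */
    assert(!write_passes_check(end, MAPPING_SIZE));  /* rejected with the real size */

    assert(checked_filter_size(0x80560201u) == 0u);  /* error code is not a size */
    assert(checked_filter_size(0x1234u) == 0x1234u);
}
```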
Incorrect mapping size specified to MapASLR
Discovered on 2023-07-25 by CreepNT. This bug also affects NSKBL.
Since System Software version ?, SKBL and NSKBL randomize the virtual address of objects allocated during boot that remain mapped after KBL ends. To achieve this, the MapASLR routine uses the ASLR seed from KBL Param and the size of each mapping. Using an internal bitmap that keeps track of previously allocated virtual memory pages, it finds a virtual address aligned to vsize such that enough pages to fit vsize bytes are free after it, then marks the whole range as allocated. However, the first call to MapASLR (for SceKernelL2PageTable000) is performed with vsize=0x1000 instead of vsize=0x2000. This results in an improper update of the bitmap: some virtual memory that should be considered allocated remains marked as free. This bug will usually not result in any noticeable behavior because all other allocations are performed properly. In addition, all other vsize values are >= 0x2000, so they are more strictly aligned, which reduces the risk of overlap. However, due to the random nature of this algorithm, certain ASLR seeds might cause a kernel panic (in SKBL) due to two allocations overlapping (this should be caught later during Sysmem start, as the Memblock objects created to back these mappings should conflict).
Probably present since ASLR was introduced in SKBL & NSKBL. Not fixed as of System Software 3.74.
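The bitmap side effect can be modelled in a few lines. The structure below is purely illustrative: it assumes one bit per 4 KiB page and a 64-page window, and it does not reflect the real MapASLR data structures or its seed-based address selection. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u /* assumption: 4 KiB pages */
#define NUM_PAGES 64u     /* assumption: tiny window for illustration */

typedef struct {
    uint64_t bits; /* bit n set = page n is allocated */
} PageBitmap;

/* Mark enough pages to hold vsize bytes, starting at first_page. */
static void mark_allocated(PageBitmap *bm, uint32_t first_page, uint32_t vsize)
{
    uint32_t count = (vsize + PAGE_SIZE - 1u) / PAGE_SIZE;

    for (uint32_t i = 0; i < count && first_page + i < NUM_PAGES; i++)
        bm->bits |= (uint64_t)1 << (first_page + i);
}

static bool is_free(const PageBitmap *bm, uint32_t page)
{
    return ((bm->bits >> page) & 1u) == 0;
}

void map_aslr_demo(void)
{
    PageBitmap bm = { 0 };

    /* The object needs 0x2000 bytes, but only 0x1000 is declared. */
    mark_allocated(&bm, 4u, 0x1000u);
    assert(!is_free(&bm, 4u));
    assert(is_free(&bm, 5u)); /* second page is in use but still looks free */

    /* Declaring the real size marks both pages. */
    mark_allocated(&bm, 4u, 0x2000u);
    assert(!is_free(&bm, 5u));
}
```

In this model, a later request could be handed page 5 while the first object still occupies it, which is the overlap the text warns about.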
Non-Secure Kernel Boot Loader (NSKBL)
Null dereference in the NSKBL kernel panic handler
(2021/06/19 by Princess of Sleeping) The kernel panic handler accesses the SceSysroot pointer, but since that pointer is set to NULL during early boot, NULL access to SceSysroot occurs.
CelesteBlue: If I understood correctly, this means that as long as NSKBL is running, a non-secure Kernel panic from any cause will end up in a DABT exception at NSKBL level.
CreepNT: The global SceSysroot pointer is initialized during sceKernelSysrootStart (soon after the MMU is brought up) and any panic after this point will not DABT.
However, if a panic happens before that point while the MMU is disabled, no DABT will occur since 0 is a valid (physical) address (but since bogus "Sysroot" data will be read, the system may e.g. PABT if bogus data is interpreted as a function pointer).
This only leaves a tiny window, during which the MMU is enabled but sceKernelSysrootStart has not been executed, where a panic will cause a DABT (but since there is basically no code in that window that can panic, a DABT should never happen because of this bug).
Present in FW 3.600.011, 3.650.011.
Out-of-bounds write in sceKernelSysrootStart
(2023/04/25 by CreepNT)
In old firmwares, during the execution of the sceKernelSysrootStart function, 0x80 bytes are allocated from the Sysroot heap then divided into 4 blocks of 0x20 bytes each. Each CPU then loads the address of its block into the TPIDRPRW register.
Later on during this same function, the sceKernelTlsKernelSet subroutine is called. However, it expects TPIDRPRW to hold a pointer to a ThreadCB (Thread object), which is much larger than 0x20 bytes, and does the following: *(uint32_t*)(TPIDRPRW + 0x74) = 0;. This results in an out-of-bounds write at offsets 0x74, 0x94, 0xB4 and 0xD4 of the Sysroot heap for CPU0, CPU1, CPU2 and CPU3 respectively.
However, this bug has no real consequence (and was probably never noticed) because the data at these offsets is:
- 0x74: inside the TPIDRPRW block, which is unused
- 0x94: inside a 0x28 bytes allocation which has not been written to yet
- 0xB4: inside heap padding (all allocations are rounded up to 32/64B boundary)
- 0xD4: offset 0x14 inside the SceKblParam structure, which is unused
Exists since at least System Software version 0.920.050. Fixed in System Software 0.990: due to a rework of the kernel TLS system, sceKernelTlsKernelSet was removed, so the invalid call is no longer performed.
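The four offsets quoted above follow directly from the block layout. The sketch below recomputes them, assuming (as the listed offsets imply) that the 0x80-byte allocation sits at offset 0 of the Sysroot heap. The function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TPIDRPRW_BLOCK_SIZE 0x20u /* per-CPU block size */
#define TPIDRPRW_ALLOC_SIZE 0x80u /* 4 blocks in total */
#define TLS_WRITE_OFFSET    0x74u /* offset written by sceKernelTlsKernelSet */

/* Heap offset hit by *(uint32_t *)(TPIDRPRW + 0x74) = 0 on a given CPU. */
static uint32_t oob_write_offset(unsigned cpu)
{
    return cpu * TPIDRPRW_BLOCK_SIZE + TLS_WRITE_OFFSET;
}

/* True if the write still lands inside the 0x80-byte allocation. */
static bool lands_inside_allocation(unsigned cpu)
{
    return oob_write_offset(cpu) < TPIDRPRW_ALLOC_SIZE;
}

void sysroot_oob_demo(void)
{
    assert(oob_write_offset(0) == 0x74u);
    assert(oob_write_offset(1) == 0x94u);
    assert(oob_write_offset(2) == 0xB4u);
    assert(oob_write_offset(3) == 0xD4u);

    /* Only CPU0's write stays within the TPIDRPRW allocation itself. */
    assert(lands_inside_allocation(0));
    assert(!lands_inside_allocation(1));
}
```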
Incorrect mapping size specified to MapASLR
See the description in SKBL.
Shell
Unvalidated IPMI arguments lead to DoS
(2023/05/31 by CreepNT, reported by M Ibrahim) At least one IPMI server (SceDownload) does not validate the number of IPMI::DataInfo (input) or IPMI::BufferInfo (output) arguments received before using them. This leads to garbage being used as pointers and dereferenced, in turn crashing the SceShell process due to a Data Abort Exception.
It might be possible to use this as a vector for a data-only attack on Shell.
Present in firmware 3.60.