From 8398baf359288ec6a08197949aa02a1e202610ff Mon Sep 17 00:00:00 2001
From: Ishan09811 <156402647+Ishan09811@users.noreply.github.com>
Date: Tue, 23 Jul 2024 23:59:15 +0530
Subject: [PATCH] Implement Jit32 (#9)

* Introduce Jit32 and JitCore32 objects

* Initialize JIT when launching 32bit executables

* Introduce kernel objects for 32bit processes

This commit introduces two new kernel thread types, `KNceThread` and `KJit32Thread`.
`KNceThread`s behave like the previous kernel thread object by setting up thread state and jumping into guest code.
`KJit32Thread`s need to run guest code on a `JitCore32` object, so they perform the necessary state setup and then also set up the JIT core for executing guest code. A loop was introduced because JIT execution might return when halted, either for an SVC or for preemption. In those cases the thread needs to wait to be scheduled before executing again.

The process object has also been updated to be able to create 32bit threads when running 32bit processes.
Additionally, NCE's ThreadContext has been removed from DeviceState, since a thread is no longer necessarily an NCE thread, and IPC code has been changed to retrieve the TLS region from the thread object.

* Introduce a preemption handler for scheduling with JIT

Scheduler initialization has been delayed until process information is available, as it needs to differentiate between 32bit and 64bit processes.

* Support initializing VMM for 32bit address spaces

* Implement GetThreadContext3 SVC for 32bit processes

* Introduce a thread local pointer to the current guest thread

This also gives easier access to the current guest process structure via the thread structure, just like any kernel does for its internal structures.

* Add a signal handler for JIT threads

* Implement coprocessor 15 accesses

* Implement exclusive memory writes and exclusive monitor

* Enable JIT fastmem

* Enable more JIT optimizations and log exceptions

* Fix incorrect logging call in QueryMemory

* Translate guest virtual addresses on direct accesses from SVCs

* Perform TLS page address translation for direct accesses

This allows the IPC code to work without modifications since `KThread::tlsRegion` now stores a host address that can be accessed directly.

* Add Dynarmic as a submodule

* Revert "Perform TLS page address translation for direct accesses"

This reverts commit 2e25b3f7e4f0687b038fa949648c74e3393da006.

* Revert "Translate guest virtual addresses on direct accesses from SVCs"

This reverts commit 7bec4e0902e6dbb6f06a2efac53a1e2127f44068.

* Add an option to change CPU backend

* Fix

---------

Co-authored-by: lynxnb
---
 .gitmodules | 3 +
 app/CMakeLists.txt | 11 +-
 app/libraries/dynarmic | 1 +
 app/src/main/cpp/emu_jni.cpp | 4 +
 app/src/main/cpp/skyline/common.cpp | 2 -
 app/src/main/cpp/skyline/common.h | 5 +-
 app/src/main/cpp/skyline/jit/coproc_15.cpp | 126 +++++++++
 app/src/main/cpp/skyline/jit/coproc_15.h | 34 +++
 app/src/main/cpp/skyline/jit/exception.h | 33 +++
 app/src/main/cpp/skyline/jit/halt_reason.h | 53 ++++
 app/src/main/cpp/skyline/jit/jit32.cpp | 70 +++++
 app/src/main/cpp/skyline/jit/jit32.h | 37 +++
 app/src/main/cpp/skyline/jit/jit_core_32.cpp | 252 ++++++++++++++++++
 app/src/main/cpp/skyline/jit/jit_core_32.h | 181 +++++++++++++
 .../main/cpp/skyline/jit/thread_context32.h | 53 ++++
 app/src/main/cpp/skyline/kernel/ipc.cpp | 4 +-
 app/src/main/cpp/skyline/kernel/memory.cpp | 40 ++-
 app/src/main/cpp/skyline/kernel/memory.h | 2 +-
 app/src/main/cpp/skyline/kernel/scheduler.cpp | 24 +-
 app/src/main/cpp/skyline/kernel/scheduler.h | 8 +-
 app/src/main/cpp/skyline/kernel/svc.cpp | 42 ++-
 .../cpp/skyline/kernel/types/KProcess.cpp | 8 +-
 app/src/main/cpp/skyline/kernel/types/KProcess.h | 18 +-
.../main/cpp/skyline/kernel/types/KThread.cpp | 231 +++++++++------- .../main/cpp/skyline/kernel/types/KThread.h | 80 +++++- app/src/main/cpp/skyline/loader/nro.cpp | 1 + app/src/main/cpp/skyline/loader/nso.cpp | 1 + app/src/main/cpp/skyline/nce.cpp | 4 +- app/src/main/cpp/skyline/os.cpp | 14 +- app/src/main/cpp/skyline/os.h | 2 + .../java/emu/skyline/EmulationActivity.kt | 9 + .../emu/skyline/settings/EmulationSettings.kt | 3 + app/src/main/res/values/array.xml | 5 + app/src/main/res/values/strings.xml | 3 + .../main/res/xml/emulation_preferences.xml | 10 + 35 files changed, 1228 insertions(+), 146 deletions(-) create mode 160000 app/libraries/dynarmic create mode 100644 app/src/main/cpp/skyline/jit/coproc_15.cpp create mode 100644 app/src/main/cpp/skyline/jit/coproc_15.h create mode 100644 app/src/main/cpp/skyline/jit/exception.h create mode 100644 app/src/main/cpp/skyline/jit/halt_reason.h create mode 100644 app/src/main/cpp/skyline/jit/jit32.cpp create mode 100644 app/src/main/cpp/skyline/jit/jit32.h create mode 100644 app/src/main/cpp/skyline/jit/jit_core_32.cpp create mode 100644 app/src/main/cpp/skyline/jit/jit_core_32.h create mode 100644 app/src/main/cpp/skyline/jit/thread_context32.h diff --git a/.gitmodules b/.gitmodules index 687a2a5d..263069b8 100644 --- a/.gitmodules +++ b/.gitmodules @@ -58,3 +58,6 @@ [submodule "app/libraries/audio-core"] path = app/libraries/audio-core url = https://github.com/ishan09811/audio-core +[submodule "app/libraries/dynarmic"] + path = app/libraries/dynarmic + url = https://github.com/strato-emu/dynarmic.git diff --git a/app/CMakeLists.txt b/app/CMakeLists.txt index 3669a92b..a339e71e 100644 --- a/app/CMakeLists.txt +++ b/app/CMakeLists.txt @@ -130,6 +130,12 @@ add_subdirectory("libraries/audio-core") include_directories(SYSTEM "libraries/audio-core/include") target_link_libraries_system(audio_core Boost::intrusive Boost::container range-v3) +# Dynarmic +set(DYNARMIC_FRONTENDS "A32" CACHE STRING "Enabled Dynarmic 
frontends" FORCE) +set(DYNARMIC_INSTALL OFF CACHE BOOL "Skip Dynarmic installation" FORCE) +set(DYNARMIC_USE_PRECOMPILED_HEADERS OFF CACHE BOOL "Use precompiled headers for Dynarmic" FORCE) +add_subdirectory("libraries/dynarmic") + # Skyline add_library(skyline SHARED ${source_DIR}/driver_jni.cpp @@ -143,6 +149,9 @@ add_library(skyline SHARED ${source_DIR}/skyline/common/trace.cpp ${source_DIR}/skyline/common/trap_manager.cpp ${source_DIR}/skyline/logger/logger.cpp + ${source_DIR}/skyline/jit/coproc_15.cpp + ${source_DIR}/skyline/jit/jit_core_32.cpp + ${source_DIR}/skyline/jit/jit32.cpp ${source_DIR}/skyline/nce/guest.S ${source_DIR}/skyline/nce.cpp ${source_DIR}/skyline/jvm.cpp @@ -397,5 +406,5 @@ target_include_directories(skyline PRIVATE ${source_DIR}/skyline) # target_precompile_headers(skyline PRIVATE ${source_DIR}/skyline/common.h) # PCH will currently break Intellisense target_compile_options(skyline PRIVATE -Wall -Wno-unknown-attributes -Wno-c++20-extensions -Wno-c++17-extensions -Wno-c99-designator -Wno-reorder -Wno-missing-braces -Wno-unused-variable -Wno-unused-private-field -Wno-dangling-else -Wconversion -fsigned-bitfields) -target_link_libraries(skyline PRIVATE shader_recompiler audio_core) +target_link_libraries(skyline PRIVATE shader_recompiler audio_core dynarmic) target_link_libraries_system(skyline android perfetto fmt lz4_static tzcode vkma mbedcrypto opus Boost::intrusive Boost::container Boost::preprocessor Boost::regex range-v3 adrenotools tsl::robin_map) diff --git a/app/libraries/dynarmic b/app/libraries/dynarmic new file mode 160000 index 00000000..93cf00ad --- /dev/null +++ b/app/libraries/dynarmic @@ -0,0 +1 @@ +Subproject commit 93cf00adc20260d2a476998d57ac44d552d7a5d2 diff --git a/app/src/main/cpp/emu_jni.cpp b/app/src/main/cpp/emu_jni.cpp index e70692b3..a5628f0f 100644 --- a/app/src/main/cpp/emu_jni.cpp +++ b/app/src/main/cpp/emu_jni.cpp @@ -248,3 +248,7 @@ extern "C" JNIEXPORT void JNICALL 
Java_emu_skyline_settings_NativeSettings_updat extern "C" JNIEXPORT void JNICALL Java_emu_skyline_EmulationActivity_enableDynamicResolution(JNIEnv *env, jobject obj, jboolean enable) { skyline::soc::gm20b::engine::enableDynamicResolution(enable); } + +extern "C" JNIEXPORT void JNICALL Java_emu_skyline_EmulationActivity_enableJit(JNIEnv *env, jobject obj, jboolean enable) { + skyline::kernel::isJitEnabled = enable; +} diff --git a/app/src/main/cpp/skyline/common.cpp b/app/src/main/cpp/skyline/common.cpp index 76ba2276..3a91126e 100644 --- a/app/src/main/cpp/skyline/common.cpp +++ b/app/src/main/cpp/skyline/common.cpp @@ -16,8 +16,6 @@ namespace skyline { gpu = std::make_shared(*this); soc = std::make_shared(*this); audio = std::make_shared(*this); - nce = std::make_shared(*this); - scheduler = std::make_shared(*this); input = std::make_shared(*this); } diff --git a/app/src/main/cpp/skyline/common.h b/app/src/main/cpp/skyline/common.h index b670e01a..83ddd7c1 100644 --- a/app/src/main/cpp/skyline/common.h +++ b/app/src/main/cpp/skyline/common.h @@ -28,6 +28,9 @@ namespace skyline { class NCE; struct ThreadContext; } + namespace jit { + class Jit32; + } class JvmManager; namespace gpu { class GPU; @@ -66,9 +69,9 @@ namespace skyline { std::shared_ptr settings; std::shared_ptr loader; std::shared_ptr nce; + std::shared_ptr jit32; std::shared_ptr process{}; static thread_local inline std::shared_ptr thread{}; //!< The KThread of the thread which accesses this object - static thread_local inline nce::ThreadContext *ctx{}; //!< The context of the guest thread for the corresponding host thread std::shared_ptr gpu; std::shared_ptr soc; std::shared_ptr audio; diff --git a/app/src/main/cpp/skyline/jit/coproc_15.cpp b/app/src/main/cpp/skyline/jit/coproc_15.cpp new file mode 100644 index 00000000..a02eda16 --- /dev/null +++ b/app/src/main/cpp/skyline/jit/coproc_15.cpp @@ -0,0 +1,126 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2024 Strato Team and 
Contributors (https://github.com/strato-emu/) +// Copyright © 2017 Citra Emulator Project + +#include "coproc_15.h" +#include +#include + +template<> +struct fmt::formatter { + constexpr auto parse(format_parse_context &ctx) { + return ctx.begin(); + } + + template + auto format(const Dynarmic::A32::CoprocReg &reg, FormatContext &ctx) { + return fmt::format_to(ctx.out(), "cp{}", static_cast(reg)); + } +}; + +namespace skyline::jit { + using Callback = Dynarmic::A32::Coprocessor::Callback; + using CallbackOrAccessOneWord = Dynarmic::A32::Coprocessor::CallbackOrAccessOneWord; + using CallbackOrAccessTwoWords = Dynarmic::A32::Coprocessor::CallbackOrAccessTwoWords; + + static u32 dummy_value; + + std::optional Coprocessor15::CompileInternalOperation(bool two, unsigned opc1, CoprocReg CRd, CoprocReg CRn, CoprocReg CRm, unsigned opc2) { + LOGE("CP15: cdp{} p15, {}, {}, {}, {}, {}", two ? "2" : "", opc1, CRd, CRn, CRm, opc2); + return std::nullopt; + } + + CallbackOrAccessOneWord Coprocessor15::CompileSendOneWord(bool two, unsigned opc1, CoprocReg CRn, CoprocReg CRm, unsigned opc2) { + if (!two && CRn == CoprocReg::C7 && opc1 == 0 && CRm == CoprocReg::C5 && opc2 == 4) { + // CP15_FLUSH_PREFETCH_BUFFER + // This is a dummy write, we ignore the value written here + return &dummy_value; + } + + if (!two && CRn == CoprocReg::C7 && opc1 == 0 && CRm == CoprocReg::C10) { + switch (opc2) { + case 4: + // CP15_DATA_SYNC_BARRIER + return Callback{ + [](void *, std::uint32_t, std::uint32_t) -> std::uint64_t { + asm volatile("dsb sy\n\t" : : : "memory"); + return 0; + }, + std::nullopt, + }; + case 5: + // CP15_DATA_MEMORY_BARRIER + return Callback{ + [](void *, std::uint32_t, std::uint32_t) -> std::uint64_t { + asm volatile("dmb sy\n\t" : : : "memory"); + return 0; + }, + std::nullopt, + }; + default: + break; + } + } + + if (!two && CRn == CoprocReg::C13 && opc1 == 0 && CRm == CoprocReg::C0 && opc2 == 2) { + // CP15_THREAD_URW + return &tpidrurw; + } + + LOGE("CP15: mcr{} p15, {}, , 
{}, {}, {}", two ? "2" : "", opc1, CRn, CRm, opc2); + return {}; + } + + CallbackOrAccessTwoWords Coprocessor15::CompileSendTwoWords(bool two, unsigned opc, CoprocReg CRm) { + LOGE("CP15: mcrr{} p15, {}, , , {}", two ? "2" : "", opc, CRm); + return {}; + } + + CallbackOrAccessOneWord Coprocessor15::CompileGetOneWord(bool two, unsigned opc1, CoprocReg CRn, CoprocReg CRm, unsigned opc2) { + if (!two && CRn == CoprocReg::C13 && opc1 == 0 && CRm == CoprocReg::C0) { + switch (opc2) { + case 2: + // CP15_THREAD_URW + return &tpidrurw; + case 3: + // CP15_THREAD_URO + return &tpidruro; + default: + break; + } + } + + LOGE("CP15: mrc{} p15, {}, , {}, {}, {}", two ? "2" : "", opc1, CRn, CRm, opc2); + return {}; + } + + CallbackOrAccessTwoWords Coprocessor15::CompileGetTwoWords(bool two, unsigned opc, CoprocReg CRm) { + if (!two && opc == 0 && CRm == CoprocReg::C14) { + // CNTPCT + return Callback{[](void *arg, u32, u32) -> u64 { + return util::GetTimeTicks(); + }, std::nullopt}; + } + + LOGE("CP15: mrrc{} p15, {}, , , {}", two ? "2" : "", opc, CRm); + return {}; + } + + std::optional Coprocessor15::CompileLoadWords(bool two, bool long_transfer, CoprocReg CRd, std::optional option) { + if (option) + LOGE("CP15: ldc{}{} p15, {}, [...], {}", two ? "2" : "", long_transfer ? "l" : "", CRd, *option); + else + LOGE("CP15: ldc{}{} p15, {}, [...]", two ? "2" : "", long_transfer ? "l" : "", CRd); + + return std::nullopt; + } + + std::optional Coprocessor15::CompileStoreWords(bool two, bool long_transfer, CoprocReg CRd, std::optional option) { + if (option) + LOGE("CP15: stc{}{} p15, {}, [...], {}", two ? "2" : "", long_transfer ? "l" : "", CRd, *option); + else + LOGE("CP15: stc{}{} p15, {}, [...]", two ? "2" : "", long_transfer ? 
"l" : "", CRd); + + return std::nullopt; + } +} diff --git a/app/src/main/cpp/skyline/jit/coproc_15.h b/app/src/main/cpp/skyline/jit/coproc_15.h new file mode 100644 index 00000000..58184a03 --- /dev/null +++ b/app/src/main/cpp/skyline/jit/coproc_15.h @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2024 Strato Team and Contributors (https://github.com/strato-emu/) + +#pragma once + +#include +#include +#include + +namespace skyline::jit { + class Coprocessor15 final : public Dynarmic::A32::Coprocessor { + public: + using CoprocReg = Dynarmic::A32::CoprocReg; + + Coprocessor15() = default; + + std::optional CompileInternalOperation(bool two, unsigned opc1, CoprocReg CRd, CoprocReg CRn, CoprocReg CRm, unsigned opc2) override; + + CallbackOrAccessOneWord CompileSendOneWord(bool two, unsigned opc1, CoprocReg CRn, CoprocReg CRm, unsigned opc2) override; + + CallbackOrAccessTwoWords CompileSendTwoWords(bool two, unsigned opc, CoprocReg CRm) override; + + CallbackOrAccessOneWord CompileGetOneWord(bool two, unsigned opc1, CoprocReg CRn, CoprocReg CRm, unsigned opc2) override; + + CallbackOrAccessTwoWords CompileGetTwoWords(bool two, unsigned opc, CoprocReg CRm) override; + + std::optional CompileLoadWords(bool two, bool long_transfer, CoprocReg CRd, std::optional option) override; + + std::optional CompileStoreWords(bool two, bool long_transfer, CoprocReg CRd, std::optional option) override; + + u32 tpidrurw = 0; //!< Thread ID Register User and Privileged R/W accessible (equivalent to aarch64 TPIDR_EL0) + u32 tpidruro = 0; //!< Thread ID Register User read-only and Privileged R/W accessible (equivalent to aarch64 TPIDRRO_EL0) + }; +} diff --git a/app/src/main/cpp/skyline/jit/exception.h b/app/src/main/cpp/skyline/jit/exception.h new file mode 100644 index 00000000..73bcae31 --- /dev/null +++ b/app/src/main/cpp/skyline/jit/exception.h @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2024 Strato Team and 
Contributors (https://github.com/strato-emu/) + +#pragma once + +#include +#include + +namespace skyline::jit { + inline std::string to_string(Dynarmic::A32::Exception e) { + #define CASE(x) case Dynarmic::A32::Exception::x: return #x + + switch (e) { + CASE(UndefinedInstruction); + CASE(UnpredictableInstruction); + CASE(DecodeError); + CASE(SendEvent); + CASE(SendEventLocal); + CASE(WaitForInterrupt); + CASE(WaitForEvent); + CASE(Yield); + CASE(Breakpoint); + CASE(PreloadData); + CASE(PreloadDataWithIntentToWrite); + CASE(PreloadInstruction); + CASE(NoExecuteFault); + default: + return "Unknown"; + } + + #undef CASE + } +} diff --git a/app/src/main/cpp/skyline/jit/halt_reason.h b/app/src/main/cpp/skyline/jit/halt_reason.h new file mode 100644 index 00000000..b11ad5a8 --- /dev/null +++ b/app/src/main/cpp/skyline/jit/halt_reason.h @@ -0,0 +1,53 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2023 Strato Team and Contributors (https://github.com/strato-emu/) + +#pragma once + +#include +#include +#include + +namespace skyline::jit { + namespace { + using DynarmicHaltReasonType = std::underlying_type_t; + } + + /** + * @brief The reason that the JIT has halted + * @note The binary representation of this enum's values must match Dynarmic::HaltReason one + */ + enum class HaltReason : DynarmicHaltReasonType { + Step = static_cast(Dynarmic::HaltReason::Step), + CacheInvalidation = static_cast(Dynarmic::HaltReason::CacheInvalidation), + MemoryAbort = static_cast(Dynarmic::HaltReason::MemoryAbort), + Svc = static_cast(Dynarmic::HaltReason::UserDefined1), + Preempted = static_cast(Dynarmic::HaltReason::UserDefined2) + }; + + inline std::string to_string(HaltReason hr) { + #define CASE(x) case HaltReason::x: return #x + + switch (hr) { + CASE(Step); + CASE(CacheInvalidation); + CASE(MemoryAbort); + CASE(Svc); + CASE(Preempted); + default: + return "Unknown"; + } + + #undef CASE + } + + inline std::string to_string(Dynarmic::HaltReason dhr) { + return 
to_string(static_cast(dhr)); + } + + /** + * @brief Converts a HaltReason to a Dynarmic::HaltReason + */ + inline Dynarmic::HaltReason ToDynarmicHaltReason(HaltReason hr) { + return static_cast(hr); + } +} diff --git a/app/src/main/cpp/skyline/jit/jit32.cpp b/app/src/main/cpp/skyline/jit/jit32.cpp new file mode 100644 index 00000000..4526b724 --- /dev/null +++ b/app/src/main/cpp/skyline/jit/jit32.cpp @@ -0,0 +1,70 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2023 Strato Team and Contributors (https://github.com/strato-emu/) + +#include "jit32.h" +#include +#include +#include +#include +#include + +namespace skyline::jit { + static std::array MakeJitCores(const DeviceState &state, Dynarmic::ExclusiveMonitor &monitor) { + // Set the signal handler before creating the JIT cores to ensure proper chaining with the Dynarmic handler which is set during construction + signal::SetHostSignalHandler({SIGINT, SIGILL, SIGTRAP, SIGBUS, SIGFPE, SIGSEGV}, Jit32::SignalHandler); + + return {JitCore32(state, monitor, 0), + JitCore32(state, monitor, 1), + JitCore32(state, monitor, 2), + JitCore32(state, monitor, 3)}; + } + + Jit32::Jit32(DeviceState &state) + : state{state}, + monitor{CoreCount}, + cores{MakeJitCores(state, monitor)} {} + + JitCore32 &Jit32::GetCore(u32 coreId) { + return cores[coreId]; + } + + void Jit32::SignalHandler(int signal, siginfo *info, ucontext *ctx) { + if (signal == SIGSEGV) + // Handle any accesses that may be from a trapped region + if (TrapManager::TrapHandler(reinterpret_cast(info->si_addr), true)) + return; + + auto &mctx{ctx->uc_mcontext}; + auto thread{kernel::this_thread}; + bool isGuest{thread->jit != nullptr}; // Whether the signal happened while running guest code + + if (isGuest) { + if (signal != SIGINT) { + signal::StackFrame topFrame{.lr = reinterpret_cast(ctx->uc_mcontext.pc), .next = reinterpret_cast(ctx->uc_mcontext.regs[29])}; + // TODO: this might give garbage stack frames and/or crash + std::string 
trace{thread->process.state.loader->GetStackTrace(&topFrame)}; + + std::string cpuContext; + if (mctx.fault_address) + cpuContext += fmt::format("\n Fault Address: 0x{:X}", mctx.fault_address); + if (mctx.sp) + cpuContext += fmt::format("\n Stack Pointer: 0x{:X}", mctx.sp); + for (size_t index{}; index < (sizeof(mcontext_t::regs) / sizeof(u64)); index += 2) + cpuContext += fmt::format("\n X{:<2}: 0x{:<16X} X{:<2}: 0x{:X}", index, mctx.regs[index], index + 1, mctx.regs[index + 1]); + + LOGE("Thread #{} has crashed due to signal: {}\nStack Trace:{} \nCPU Context:{}", thread->id, strsignal(signal), trace, cpuContext); + + if (thread->id) { + signal::BlockSignal({SIGINT}); + thread->process.Kill(false); + } + } + + mctx.pc = reinterpret_cast(&std::longjmp); + mctx.regs[0] = reinterpret_cast(thread->originalCtx); + mctx.regs[1] = true; + } else { + signal::ExceptionalSignalHandler(signal, info, ctx); // Delegate throwing a host exception to the exceptional signal handler + } + } +} diff --git a/app/src/main/cpp/skyline/jit/jit32.h b/app/src/main/cpp/skyline/jit/jit32.h new file mode 100644 index 00000000..dfdecca5 --- /dev/null +++ b/app/src/main/cpp/skyline/jit/jit32.h @@ -0,0 +1,37 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2023 Strato Team and Contributors (https://github.com/strato-emu/) + +#pragma once + +#include +#include +#include "common.h" +#include "jit_core_32.h" + +namespace skyline::jit { + constexpr auto CoreCount{4}; + + /** + * @brief The JIT for the 32-bit ARM CPU + */ + class Jit32 { + public: + explicit Jit32(DeviceState &state); + + /** + * @brief Gets the JIT core for the specified core ID + */ + JitCore32 &GetCore(u32 coreId); + + /** + * @brief Handles any signals in the JIT threads + */ + static void SignalHandler(int signal, siginfo *info, ucontext *ctx); + + private: + DeviceState &state; + + Dynarmic::ExclusiveMonitor monitor; + std::array cores; + }; +} diff --git a/app/src/main/cpp/skyline/jit/jit_core_32.cpp 
b/app/src/main/cpp/skyline/jit/jit_core_32.cpp new file mode 100644 index 00000000..7668f6a5 --- /dev/null +++ b/app/src/main/cpp/skyline/jit/jit_core_32.cpp @@ -0,0 +1,252 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2023 Strato Team and Contributors (https://github.com/strato-emu/) + +#include +#include +#include +#include "jit_core_32.h" +#include "exception.h" + +namespace skyline::jit { + JitCore32::JitCore32(const DeviceState &state, Dynarmic::ExclusiveMonitor &monitor, u32 coreId) + : state{state}, monitor{monitor}, coreId{coreId}, jit{MakeDynarmicJit()} {} + + Dynarmic::A32::Jit JitCore32::MakeDynarmicJit() { + coproc15 = std::make_shared(); + + Dynarmic::A32::UserConfig config; + + config.callbacks = this; + config.processor_id = coreId; + config.global_monitor = &monitor; + + config.coprocessors[15] = coproc15; + + // Enable "safe" unsafe optimizations + config.optimizations |= Dynarmic::OptimizationFlag::Unsafe_UnfuseFMA; + config.optimizations |= Dynarmic::OptimizationFlag::Unsafe_IgnoreStandardFPCRValue; + config.optimizations |= Dynarmic::OptimizationFlag::Unsafe_InaccurateNaN; + config.optimizations |= Dynarmic::OptimizationFlag::Unsafe_IgnoreGlobalMonitor; + config.unsafe_optimizations = true; + + config.fastmem_pointer = state.process->memory.base.data(); + config.fastmem_exclusive_access = true; + + config.define_unpredictable_behaviour = true; + + config.wall_clock_cntpct = true; + config.enable_cycle_counting = false; + + return Dynarmic::A32::Jit{config}; + } + + void JitCore32::Run() { + auto haltReason{static_cast(jit.Run())}; + ClearHalt(haltReason); + + switch (haltReason) { + case HaltReason::Svc: + SvcHandler(lastSwi); + break; + + case HaltReason::Preempted: + state.thread->isPreempted = false; + return; + + default: + LOGE("JIT halted: {}", to_string(haltReason)); + break; + } + } + + void JitCore32::HaltExecution(HaltReason hr) { + jit.HaltExecution(ToDynarmicHaltReason(hr)); + } + + void 
JitCore32::ClearHalt(HaltReason hr) { + jit.ClearHalt(ToDynarmicHaltReason(hr)); + } + + void JitCore32::SaveContext(ThreadContext32 &context) { + context.gpr = jit.Regs(); + context.fpr = jit.ExtRegs(); + context.cpsr = jit.Cpsr(); + context.fpscr = jit.Fpscr(); + } + + void JitCore32::RestoreContext(const ThreadContext32 &context) { + jit.Regs() = context.gpr; + jit.ExtRegs() = context.fpr; + jit.SetCpsr(context.cpsr); + jit.SetFpscr(context.fpscr); + } + + kernel::svc::SvcContext JitCore32::MakeSvcContext() { + kernel::svc::SvcContext ctx{}; + const auto &jitRegs{jit.Regs()}; + + for (size_t i = 0; i < ctx.regs.size(); i++) + ctx.regs[i] = static_cast(jitRegs[i]); + + return ctx; + } + + void JitCore32::ApplySvcContext(const kernel::svc::SvcContext &svcCtx) { + auto &jitRegs{jit.Regs()}; + + for (size_t i = 0; i < svcCtx.regs.size(); i++) + jitRegs[i] = static_cast(svcCtx.regs[i]); + } + + void JitCore32::SetThreadPointer(u32 threadPtr) { + coproc15->tpidruro = threadPtr; + } + + void JitCore32::SetTlsPointer(u32 tlsPtr) { + coproc15->tpidrurw = tlsPtr; + } + + u32 JitCore32::GetPC() { + return jit.Regs()[15]; + } + + void JitCore32::SetPC(u32 pc) { + jit.Regs()[15] = pc; + } + + u32 JitCore32::GetSP() { + return jit.Regs()[13]; + } + + void JitCore32::SetSP(u32 sp) { + jit.Regs()[13] = sp; + } + + u32 JitCore32::GetRegister(u32 reg) { + return jit.Regs()[reg]; + } + + void JitCore32::SetRegister(u32 reg, u32 value) { + jit.Regs()[reg] = value; + } + + void JitCore32::SvcHandler(u32 swi) { + auto svc{kernel::svc::SvcTable[swi]}; + if (svc) [[likely]] { + TRACE_EVENT("kernel", perfetto::StaticString{svc.name}); + auto svcContext = MakeSvcContext(); + (svc.function)(state, svcContext); + ApplySvcContext(svcContext); + } else { + throw exception("Unimplemented SVC 0x{:X}", swi); + } + } + + template + __attribute__((__always_inline__)) T ReadUnaligned(u8 *ptr) { + T value; + std::memcpy(&value, ptr, sizeof(T)); + return value; + } + + template + 
__attribute__((__always_inline__)) void WriteUnaligned(u8 *ptr, T value) { + std::memcpy(ptr, &value, sizeof(T)); + } + + template + __attribute__((__always_inline__)) T JitCore32::MemoryRead(u32 vaddr) { + // The number of bits needed to encode the size of T minus 1 + constexpr u32 bits = std::bit_width(sizeof(T)) - 1; + // Compute the mask to have "bits" number of 1s (e.g. 0b111 for 3 bits) + constexpr u32 mask{(1 << bits) - 1}; + + if ((vaddr & mask) == 0) // Aligned access + return state.process->memory.base.cast()[vaddr >> bits]; + else + return ReadUnaligned(state.process->memory.base.data() + vaddr); + } + + template + __attribute__((__always_inline__)) void JitCore32::MemoryWrite(u32 vaddr, T value) { + // The number of bits needed to encode the size of T minus 1 + constexpr u32 bits = std::bit_width(sizeof(T)) - 1; + // Compute the mask to have "bits" number of 1s (e.g. 0b111 for 3 bits) + constexpr u32 mask{(1 << bits) - 1}; + + if ((vaddr & mask) == 0) // Aligned access + state.process->memory.base.cast()[vaddr >> bits] = value; + else + WriteUnaligned(state.process->memory.base.data() + vaddr, value); + } + + template + __attribute__((__always_inline__)) bool JitCore32::MemoryWriteExclusive(u32 vaddr, T value, T expected) { + auto ptr = reinterpret_cast(state.process->memory.base.data() + vaddr); + // Sync built-ins should handle unaligned accesses + return __sync_bool_compare_and_swap(ptr, expected, value); + } + + u8 JitCore32::MemoryRead8(u32 vaddr) { + return MemoryRead(vaddr); + } + + u16 JitCore32::MemoryRead16(u32 vaddr) { + return MemoryRead(vaddr); + } + + u32 JitCore32::MemoryRead32(u32 vaddr) { + return MemoryRead(vaddr); + } + + u64 JitCore32::MemoryRead64(u32 vaddr) { + return MemoryRead(vaddr); + } + + void JitCore32::MemoryWrite8(u32 vaddr, u8 value) { + MemoryWrite(vaddr, value); + } + + void JitCore32::MemoryWrite16(u32 vaddr, u16 value) { + MemoryWrite(vaddr, value); + } + + void JitCore32::MemoryWrite32(u32 vaddr, u32 value) { + 
MemoryWrite(vaddr, value); + } + + void JitCore32::MemoryWrite64(u32 vaddr, u64 value) { + MemoryWrite(vaddr, value); + } + + bool JitCore32::MemoryWriteExclusive8(u32 vaddr, std::uint8_t value, std::uint8_t expected) { + return MemoryWriteExclusive(vaddr, value, expected); + } + + bool JitCore32::MemoryWriteExclusive16(u32 vaddr, std::uint16_t value, std::uint16_t expected) { + return MemoryWriteExclusive(vaddr, value, expected); + } + + bool JitCore32::MemoryWriteExclusive32(u32 vaddr, std::uint32_t value, std::uint32_t expected) { + return MemoryWriteExclusive(vaddr, value, expected); + } + + bool JitCore32::MemoryWriteExclusive64(u32 vaddr, std::uint64_t value, std::uint64_t expected) { + return MemoryWriteExclusive(vaddr, value, expected); + } + + void JitCore32::InterpreterFallback(u32 pc, size_t numInstructions) { + LOGE("Interpreter fallback at 0x{:X} for {} instructions is not supported", pc, numInstructions); + state.process->Kill(false, true); + } + + void JitCore32::CallSVC(u32 swi) { + lastSwi = swi; + HaltExecution(HaltReason::Svc); + } + + void JitCore32::ExceptionRaised(u32 pc, Dynarmic::A32::Exception exception) { + LOGE("Exception raised at 0x{:X}: {}", pc, to_string(exception)); + state.process->Kill(false, true); + } +} diff --git a/app/src/main/cpp/skyline/jit/jit_core_32.h b/app/src/main/cpp/skyline/jit/jit_core_32.h new file mode 100644 index 00000000..06b85397 --- /dev/null +++ b/app/src/main/cpp/skyline/jit/jit_core_32.h @@ -0,0 +1,181 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2023 Strato Team and Contributors (https://github.com/strato-emu/) + +#pragma once + +#include +#include +#include +#include "coproc_15.h" +#include "thread_context32.h" +#include "halt_reason.h" + +namespace skyline::jit { + /** + * @brief A wrapper around a Dynarmic 32-bit JIT object with additional state and functionality, representing a single core of the emulated CPU + */ + class JitCore32 final : public Dynarmic::A32::UserCallbacks { + 
private: + const DeviceState &state; + Dynarmic::ExclusiveMonitor &monitor; + u32 coreId; + u32 lastSwi{}; + + std::shared_ptr coproc15; + Dynarmic::A32::Jit jit; + + /** + * @brief Creates a new Dynarmic 32-bit JIT instance + * @note This is only called once in the member initializer list because the Dynarmic JIT class is not default constructible + * @return A new Dynarmic 32-bit JIT instance + */ + Dynarmic::A32::Jit MakeDynarmicJit(); + + public: + JitCore32(const DeviceState &state, Dynarmic::ExclusiveMonitor &monitor, u32 coreId); + + /** + * @brief Runs the JIT + * @note This function returns once the JIT halts, e.g. after handling an SVC or when the thread is preempted + */ + void Run(); + + /** + * @brief Stops execution by setting the given halt flag + */ + void HaltExecution(HaltReason hr); + + /** + * @brief Clears a previously set halt flag + */ + void ClearHalt(HaltReason hr); + + /** + * @brief Saves the current state of the JIT to the given context. + */ + void SaveContext(ThreadContext32 &context); + + /** + * @brief Restores the state of the JIT from the given context. 
+ */ + void RestoreContext(const ThreadContext32 &context); + + /** + * @brief Constructs an SvcContext from the current state of the JIT + */ + kernel::svc::SvcContext MakeSvcContext(); + + /** + * @brief Applies the given SvcContext to the current state of the JIT + */ + void ApplySvcContext(const kernel::svc::SvcContext &context); + + /** + * @brief Sets the Thread Pointer register to the specified value + * @details Thread pointer is stored in TPIDRURO + */ + void SetThreadPointer(u32 threadPtr); + + /** + * @brief Sets the Thread Local Storage Pointer register to the specified value + * @details TLS is stored in TPIDRURW + */ + void SetTlsPointer(u32 tlsPtr); + + /** + * @brief Gets the Program Counter + */ + u32 GetPC(); + + /** + * @brief Sets the Program Counter to the specified value + */ + void SetPC(u32 pc); + + /** + * @brief Gets the Stack Pointer + */ + u32 GetSP(); + + /** + * @brief Sets the Stack Pointer to the specified value + */ + void SetSP(u32 sp); + + /** + * @brief Gets the specified register value + */ + u32 GetRegister(u32 reg); + + /** + * @brief Sets the specified register to the given value + */ + void SetRegister(u32 reg, u32 value); + + /** + * @brief Handles an SVC call from the JIT + * @param swi The SVC number + */ + void SvcHandler(u32 swi); + + // Dynarmic callbacks + public: + // @fmt:off + u8 MemoryRead8(u32 vaddr) override; + u16 MemoryRead16(u32 vaddr) override; + u32 MemoryRead32(u32 vaddr) override; + u64 MemoryRead64(u32 vaddr) override; + + void MemoryWrite8(u32 vaddr, u8 value) override; + void MemoryWrite16(u32 vaddr, u16 value) override; + void MemoryWrite32(u32 vaddr, u32 value) override; + void MemoryWrite64(u32 vaddr, u64 value) override; + + bool MemoryWriteExclusive8(u32 vaddr, std::uint8_t value, std::uint8_t expected) override; + bool MemoryWriteExclusive16(u32 vaddr, std::uint16_t value, std::uint16_t expected) override; + bool MemoryWriteExclusive32(u32 vaddr, std::uint32_t value, std::uint32_t expected) 
override; + bool MemoryWriteExclusive64(u32 vaddr, std::uint64_t value, std::uint64_t expected) override; + // @fmt:on + + void InterpreterFallback(u32 pc, size_t numInstructions) override; + + void CallSVC(u32 swi) override; + + void ExceptionRaised(u32 pc, Dynarmic::A32::Exception exception) override; + + // Cycle counting callbacks are unused + void AddTicks(u64 ticks) override {} + + u64 GetTicksRemaining() override { return 0; } + + private: + /** + * @brief Reads the memory at the given virtual address + * @tparam T The type of the value to read + * @param vaddr The virtual address to read from + * @return The value read from memory + */ + template<typename T> + T MemoryRead(u32 vaddr); + + /** + * @brief Writes the given value to the memory at the given virtual address + * @tparam T The type of the value to write + * @param vaddr The virtual address to write to + * @param value The value to write to memory + */ + template<typename T> + void MemoryWrite(u32 vaddr, T value); + + /** + * @brief Writes the given value to the memory at the given virtual address if the previous value matches the expected value + * @tparam T The type of the value to write + * @param vaddr The virtual address to write to + * @param value The value to write to memory + * @param expected The expected value to compare against + * @return True if the value was written, false otherwise + */ + template<typename T> + bool MemoryWriteExclusive(u32 vaddr, T value, T expected); + }; +} diff --git a/app/src/main/cpp/skyline/jit/thread_context32.h b/app/src/main/cpp/skyline/jit/thread_context32.h new file mode 100644 index 00000000..ad3d18ab --- /dev/null +++ b/app/src/main/cpp/skyline/jit/thread_context32.h @@ -0,0 +1,53 @@ +// SPDX-License-Identifier: GPL-3.0-or-later +// Copyright © 2023 Strato Team and Contributors (https://github.com/strato-emu/) + +#pragma once + +#include +#include + +namespace skyline::jit { + class JitCore32; // Forward declaration + + /** + * @brief The register context of a thread running in 32-bit
mode + */ + struct ThreadContext32 { + union { + std::array gpr{}; //!< General purpose registers + struct { + std::array _pad_; + u32 sp; + u32 lr; + u32 pc; + }; + }; + u32 cpsr{0}; //!< Current program status register + u32 pad{0}; + union { + std::array fpr{}; //!< Floating point and vector registers + std::array fpr_d; //!< Floating point and vector registers as double words + }; + u32 fpscr{0}; //!< Floating point status and control register + u32 tpidr{0}; //!< Thread ID register + }; + static_assert(sizeof(ThreadContext32) == 0x150, "ThreadContext32 should be 336 bytes in size to match HOS"); + + /** + * @brief Creates a new SvcContext from the given JIT32 thread context + */ + inline kernel::svc::SvcContext MakeSvcContext(const ThreadContext32 &threadctx) { + kernel::svc::SvcContext ctx; + for (size_t i = 0; i < ctx.regs.size(); i++) + ctx.regs[i] = static_cast(threadctx.gpr[i]); + return ctx; + } + + /** + * @brief Applies changes from the given SvcContext to the JIT32 thread context + */ + inline void ApplySvcContext(const kernel::svc::SvcContext &svcCtx, ThreadContext32 &threadctx) { + for (size_t i = 0; i < svcCtx.regs.size(); i++) + threadctx.gpr[i] = static_cast(svcCtx.regs[i]); + } +} diff --git a/app/src/main/cpp/skyline/kernel/ipc.cpp b/app/src/main/cpp/skyline/kernel/ipc.cpp index 9242cc09..0eef6bd6 100644 --- a/app/src/main/cpp/skyline/kernel/ipc.cpp +++ b/app/src/main/cpp/skyline/kernel/ipc.cpp @@ -6,7 +6,7 @@ namespace skyline::kernel::ipc { IpcRequest::IpcRequest(bool isDomain, const DeviceState &state) : isDomain(isDomain) { - auto tls{state.ctx->tpidrroEl0}; + auto tls{state.thread->tlsRegion}; u8 *pointer{tls}; header = reinterpret_cast(pointer); @@ -136,7 +136,7 @@ namespace skyline::kernel::ipc { IpcResponse::IpcResponse(const DeviceState &state) : state(state) {} void IpcResponse::WriteResponse(bool isDomain, bool isTipc) { - auto tls{state.ctx->tpidrroEl0}; + auto tls{state.thread->tlsRegion}; u8 *pointer{tls}; memset(tls, 0, 
constant::TlsIpcSize); diff --git a/app/src/main/cpp/skyline/kernel/memory.cpp b/app/src/main/cpp/skyline/kernel/memory.cpp index 9d51692d..8b3440d8 100644 --- a/app/src/main/cpp/skyline/kernel/memory.cpp +++ b/app/src/main/cpp/skyline/kernel/memory.cpp @@ -196,6 +196,15 @@ namespace skyline::kernel { namespace { constexpr size_t RegionAlignment{1ULL << 21}; //!< The minimum alignment of a HOS memory region + namespace AS32bit { + constexpr size_t CodeRegionStart{0x200000}; //!< The start address of the code/stack region (2MiB) + constexpr size_t CodeRegionSize{0x3fe00000}; //!< The size of the code/stack region (1GiB - 2MiB) + constexpr size_t AliasRegionSize{0x40000000}; //!< The size of the alias region (1GiB) + constexpr size_t HeapRegionSize{0x40000000}; //!< The size of the heap region (1GiB) + + constexpr size_t TotalSize{1ULL << 32}; + } + namespace AS36bit { constexpr size_t CodeRegionStart{0x8000000}; //!< The start address of the code region (128MiB) constexpr size_t CodeRegionSize{0x78000000}; //!< The size of the code region (2GiB - 128MiB) @@ -224,8 +233,12 @@ namespace skyline::kernel { size_t baseSize{}, maxAddress{}; switch (type) { case memory::AddressSpaceType::AddressSpace32Bit: - case memory::AddressSpaceType::AddressSpace32BitNoReserved: - throw exception("32-bit address spaces are not supported"); + case memory::AddressSpaceType::AddressSpace32BitNoReserved: { + addressSpace = span{reinterpret_cast(0), 1ULL << 32}; + baseSize = AS32bit::TotalSize; + maxAddress = std::numeric_limits::max(); // No limit on address space placement + break; + } case memory::AddressSpaceType::AddressSpace36Bit: { addressSpace = span{reinterpret_cast(0), (1ULL << 36)}; @@ -251,6 +264,13 @@ namespace skyline::kernel { base = AllocateMappedRange(baseSize, RegionAlignment, KgslReservedRegionSize, maxAddress, false); switch (type) { + case memory::AddressSpaceType::AddressSpace32Bit: + case memory::AddressSpaceType::AddressSpace32BitNoReserved: { + code = 
MemoryRegion{span{base.data() + AS32bit::CodeRegionStart, AS32bit::CodeRegionSize}, reinterpret_cast(base.data())}; + guestOffset = reinterpret_cast(base.data()); + break; + } + case memory::AddressSpaceType::AddressSpace36Bit: { code = codeBase36Bit = AllocateMappedRange(AS36bit::CodeRegionSize, RegionAlignment, AS36bit::CodeRegionStart, KgslReservedRegionSize, false); @@ -286,6 +306,22 @@ namespace skyline::kernel { throw exception("Non-aligned code region was used to initialize regions: {} - {}", fmt::ptr(codeRegion.data()), fmt::ptr(codeRegion.end().base())); switch (addressSpaceType) { + case memory::AddressSpaceType::AddressSpace32Bit: { + stack = code; // stack is shared with code on 32-bit + tlsIo = stack; // TLS/IO is shared with stack on 32-bit + alias = MemoryRegion{span{stack.host.end().base(), AS32bit::AliasRegionSize}, guestOffset}; + heap = MemoryRegion{span{alias.host.end().base(), AS32bit::HeapRegionSize}, guestOffset}; + break; + } + + case memory::AddressSpaceType::AddressSpace32BitNoReserved: { + stack = code; // stack is shared with code on 32-bit + tlsIo = stack; // TLS/IO is shared with stack on 32-bit + heap = MemoryRegion{span{stack.host.end().base(), AS32bit::HeapRegionSize * 2}, guestOffset}; + alias = MemoryRegion{span{heap.host.end().base(), 0}, guestOffset}; + break; + } + case memory::AddressSpaceType::AddressSpace36Bit: { // As a workaround if we can't place the code region at the base of the AS we mark it as inaccessible heap so rtld doesn't crash if (codeBase36Bit.data() != reinterpret_cast(AS36bit::CodeRegionStart)) { diff --git a/app/src/main/cpp/skyline/kernel/memory.h b/app/src/main/cpp/skyline/kernel/memory.h index ca21204d..d087c002 100644 --- a/app/src/main/cpp/skyline/kernel/memory.h +++ b/app/src/main/cpp/skyline/kernel/memory.h @@ -268,7 +268,7 @@ namespace skyline { memory::AddressSpaceType addressSpaceType{}; span addressSpace{}; //!< The entire address space span codeBase36Bit{}; //!< A mapping in the lower 36 bits of 
the address space for mapping code and stack on 36-bit guests - span base{}; //!< The application-accessible address space (for 39-bit guests) or the heap/alias address space (for 36-bit guests) + span base{}; //!< The application-accessible address space (for 39-bit and 32-bit guests) or the heap/alias address space (for 36-bit guests) MemoryRegion code{}; MemoryRegion alias{}; MemoryRegion heap{}; diff --git a/app/src/main/cpp/skyline/kernel/scheduler.cpp b/app/src/main/cpp/skyline/kernel/scheduler.cpp index 2f6b44dc..f577013b 100644 --- a/app/src/main/cpp/skyline/kernel/scheduler.cpp +++ b/app/src/main/cpp/skyline/kernel/scheduler.cpp @@ -4,7 +4,10 @@ #include #include #include +#include +#include #include "types/KThread.h" +#include "types/KProcess.h" #include "scheduler.h" namespace skyline::kernel { @@ -12,8 +15,12 @@ namespace skyline::kernel { Scheduler::Scheduler(const DeviceState &state) : state(state) { // Don't restart syscalls: we want futexes to fail and their predicates rechecked - signal::SetGuestSignalHandler({Scheduler::YieldSignal, Scheduler::PreemptionSignal}, Scheduler::GuestSignalHandler, false); - signal::SetHostSignalHandler({Scheduler::YieldSignal, Scheduler::PreemptionSignal}, Scheduler::HostSignalHandler, false); + if (state.process->is64bit()) { + signal::SetGuestSignalHandler({Scheduler::YieldSignal, Scheduler::PreemptionSignal}, Scheduler::GuestSignalHandler, false); + signal::SetHostSignalHandler({Scheduler::YieldSignal, Scheduler::PreemptionSignal}, Scheduler::HostSignalHandler, false); + } else { + signal::SetHostSignalHandler({Scheduler::YieldSignal, Scheduler::PreemptionSignal}, Scheduler::JitSignalHandler, false); + } } void Scheduler::GuestSignalHandler(int signal, siginfo *info, ucontext *ctx, void **tls) { @@ -23,7 +30,7 @@ namespace skyline::kernel { const auto &state{*reinterpret_cast(*tls)->state}; if (signal == PreemptionSignal) state.thread->isPreempted = false; - state.scheduler->Rotate(false); + 
state.scheduler->Rotate(); YieldPending = false; state.scheduler->WaitSchedule(); } @@ -34,6 +41,15 @@ namespace skyline::kernel { YieldPending = true; } + void Scheduler::JitSignalHandler(int signal, siginfo *info, ucontext *ctx) { + if (kernel::this_thread->jit) + // The thread is running on a JIT core, preempt it + kernel::this_thread->jit->HaltExecution(jit::HaltReason::Preempted); + else + // The thread is running host code + YieldPending = true; + } + Scheduler::CoreContext &Scheduler::GetOptimalCoreForThread(const std::shared_ptr &thread) { auto *currentCore{&cores.at(thread->coreId)}; @@ -232,7 +248,7 @@ namespace skyline::kernel { } } - void Scheduler::Rotate(bool cooperative) { + void Scheduler::Rotate() { auto &thread{state.thread}; auto &core{cores.at(thread->coreId)}; diff --git a/app/src/main/cpp/skyline/kernel/scheduler.h b/app/src/main/cpp/skyline/kernel/scheduler.h index 611a421b..eb1a0b06 100644 --- a/app/src/main/cpp/skyline/kernel/scheduler.h +++ b/app/src/main/cpp/skyline/kernel/scheduler.h @@ -88,6 +88,11 @@ namespace skyline { */ static void HostSignalHandler(int signal, siginfo *info, ucontext *ctx); + /** + * @brief A signal handler for scheduling guest threads running through the JIT + */ + static void JitSignalHandler(int signal, siginfo *info, ucontext *ctx); + /** * @brief Checks all cores and determines the core where the supplied thread should be scheduled the earliest * @note 'KThread::coreMigrationMutex' **must** be locked by the calling thread prior to calling this @@ -118,9 +123,8 @@ namespace skyline { /** * @brief Rotates the calling thread's resident core queue, if it's at the front of it - * @param cooperative If this was triggered by a cooperative yield as opposed to a preemptive one */ - void Rotate(bool cooperative = true); + void Rotate(); /** * @brief Removes the calling thread from its resident core queue diff --git a/app/src/main/cpp/skyline/kernel/svc.cpp b/app/src/main/cpp/skyline/kernel/svc.cpp index 
ea0eb531..5a5da106 100644 --- a/app/src/main/cpp/skyline/kernel/svc.cpp +++ b/app/src/main/cpp/skyline/kernel/svc.cpp @@ -225,7 +225,7 @@ namespace skyline::kernel::svc { .ipcRefCount = 0, }; - fmt::format("Address: {}, Region Start: 0x{:X}, Size: 0x{:X}, Type: 0x{:X}, Attributes: 0x{:X}, Permissions: {}", fmt::ptr(address), memInfo.address, memInfo.size, memInfo.type, memInfo.attributes, chunk->second.permission); + LOGD("Address: {}, Region Start: 0x{:X}, Size: 0x{:X}, Type: 0x{:X}, Attributes: 0x{:X}, Permissions: {}", fmt::ptr(address), memInfo.address, memInfo.size, memInfo.type, memInfo.attributes, chunk->second.permission); } else { u64 addressSpaceEnd{reinterpret_cast(state.process->memory.addressSpace.end().base())}; @@ -1218,17 +1218,41 @@ namespace skyline::kernel::svc { auto &context{*reinterpret_cast(ctx.x0)}; context = {}; // Zero-initialize the contents of the context as not all fields are set - auto &targetContext{thread->ctx}; - for (size_t i{}; i < targetContext.gpr.regs.size(); i++) - context.gpr[i] = targetContext.gpr.regs[i]; + if (state.process->is64bit()) { + auto &targetContext{dynamic_cast(thread.get())->ctx}; + for (size_t i{}; i < targetContext.gpr.regs.size(); i++) + context.gpr[i] = targetContext.gpr.regs[i]; - for (size_t i{}; i < targetContext.fpr.regs.size(); i++) - context.vreg[i] = targetContext.fpr.regs[i]; + for (size_t i{}; i < targetContext.fpr.regs.size(); i++) + context.vreg[i] = targetContext.fpr.regs[i]; - context.fpcr = targetContext.fpr.fpcr; - context.fpsr = targetContext.fpr.fpsr; + context.fpcr = targetContext.fpr.fpcr; + context.fpsr = targetContext.fpr.fpsr; - context.tpidr = reinterpret_cast(targetContext.tpidrEl0); + context.tpidr = reinterpret_cast(targetContext.tpidrEl0); + } else { // 32 bit + constexpr u32 El0Aarch32PsrMask = 0xFE0FFE20; + // https://developer.arm.com/documentation/ddi0601/2023-12/AArch32-Registers/FPSCR--Floating-Point-Status-and-Control-Register + constexpr u32 FpsrMask = 0xF800009F; // 
[31:27], [7], [4:0] + constexpr u32 FpcrMask = 0x07FF9F00; // [26:15], [12:8] + + auto &targetContext{dynamic_cast(thread.get())->ctx}; + + context.pc = targetContext.pc; + context.pstate = targetContext.cpsr & El0Aarch32PsrMask; + + for (size_t i{}; i < targetContext.gpr.size() - 1; i++) + context.gpr[i] = targetContext.gpr[i]; + + // TODO: Check if this is correct + for (size_t i{}; i < targetContext.fpr.size(); i++) { + context.vreg[i] = targetContext.fpr_d[i]; + } + + context.fpsr = targetContext.fpscr & FpsrMask; + context.fpcr = targetContext.fpscr & FpcrMask; + context.tpidr = targetContext.tpidr; + } // Note: We don't write the whole context as we only store the parts required according to the ARMv8 ABI for syscall handling LOGD("Written partial context for thread #{}", thread->id); diff --git a/app/src/main/cpp/skyline/kernel/types/KProcess.cpp b/app/src/main/cpp/skyline/kernel/types/KProcess.cpp index fbe559cd..2ed43224 100644 --- a/app/src/main/cpp/skyline/kernel/types/KProcess.cpp +++ b/app/src/main/cpp/skyline/kernel/types/KProcess.cpp @@ -120,7 +120,13 @@ namespace skyline::kernel::type { mainThreadStack = span(pageCandidate, state.process->npdm.meta.mainThreadStackSize); } size_t tid{threads.size() + 1}; //!< The first thread is HOS-1 rather than HOS-0, this is to match the HOS kernel's behaviour - auto thread{NewHandle(this, tid, entry, argument, stackTop, priority ? *priority : state.process->npdm.meta.mainThreadPriority, idealCore ? *idealCore : state.process->npdm.meta.idealCore).item}; + + auto thread{[&]() -> std::shared_ptr { + if (is64bit()) + return NewHandle(std::ref(*this), tid, entry, argument, stackTop, priority ? *priority : state.process->npdm.meta.mainThreadPriority, idealCore ? *idealCore : state.process->npdm.meta.idealCore).item; + else + return NewHandle(std::ref(*this), tid, entry, argument, stackTop, priority ? *priority : state.process->npdm.meta.mainThreadPriority, idealCore ? 
*idealCore : state.process->npdm.meta.idealCore).item; + }()}; threads.push_back(thread); return thread; } diff --git a/app/src/main/cpp/skyline/kernel/types/KProcess.h b/app/src/main/cpp/skyline/kernel/types/KProcess.h index 30bace70..4d6b11af 100644 --- a/app/src/main/cpp/skyline/kernel/types/KProcess.h +++ b/app/src/main/cpp/skyline/kernel/types/KProcess.h @@ -70,6 +70,10 @@ namespace skyline { ~KProcess(); + bool is64bit() { + return npdm.meta.flags.is64Bit; + } + /** * @brief Kill the main thread/all running threads in the process in a graceful manner * @param join Return after the main thread has joined rather than instantly @@ -117,7 +121,7 @@ namespace skyline { std::unique_lock lock(handleMutex); std::shared_ptr item; - if constexpr (std::is_same()) + if constexpr (std::is_base_of()) item = std::make_shared(state, constant::BaseHandleIndex + handles.size(), args...); else item = std::make_shared(state, args...); @@ -142,23 +146,23 @@ namespace skyline { std::shared_lock lock(handleMutex); KType objectType; - if constexpr(std::is_same()) { + if constexpr (std::is_same()) { constexpr KHandle threadSelf{0xFFFF8000}; // The handle used by threads to refer to themselves if (handle == threadSelf) return state.thread; objectType = KType::KThread; - } else if constexpr(std::is_same()) { + } else if constexpr (std::is_same()) { constexpr KHandle processSelf{0xFFFF8001}; // The handle used by threads in a process to refer to the process if (handle == processSelf) return state.process; objectType = KType::KProcess; - } else if constexpr(std::is_same()) { + } else if constexpr (std::is_same()) { objectType = KType::KSharedMemory; - } else if constexpr(std::is_same()) { + } else if constexpr (std::is_same()) { objectType = KType::KTransferMemory; - } else if constexpr(std::is_same()) { + } else if constexpr (std::is_same()) { objectType = KType::KSession; - } else if constexpr(std::is_same()) { + } else if constexpr (std::is_same()) { objectType = KType::KEvent; } else 
{ throw exception("KProcess::GetHandle couldn't determine object type"); diff --git a/app/src/main/cpp/skyline/kernel/types/KThread.cpp b/app/src/main/cpp/skyline/kernel/types/KThread.cpp index 5c33454f..82f8a83c 100644 --- a/app/src/main/cpp/skyline/kernel/types/KThread.cpp +++ b/app/src/main/cpp/skyline/kernel/types/KThread.cpp @@ -6,14 +6,15 @@ #include #include #include +#include "jit/jit32.h" #include #include "KProcess.h" #include "KThread.h" namespace skyline::kernel::type { - KThread::KThread(const DeviceState &state, KHandle handle, KProcess *parent, size_t id, void *entry, u64 argument, void *stackTop, i8 priority, u8 idealCore) + KThread::KThread(const DeviceState &state, KHandle handle, KProcess &process, size_t id, void *entry, u64 argument, void *stackTop, i8 priority, u8 idealCore) : handle(handle), - parent(parent), + process(process), id(id), entry(entry), entryArgument(argument), @@ -24,6 +25,7 @@ namespace skyline::kernel::type { coreId(idealCore), KSyncObject(state, KType::KThread) { affinityMask.set(coreId); + this_thread = this; } KThread::~KThread() { @@ -34,7 +36,7 @@ namespace skyline::kernel::type { timer_delete(preemptionTimer); } - void KThread::StartThread() { + void KThread::ThreadEntrypoint() { pthread = pthread_self(); std::array threadName{}; if (int result{pthread_getname_np(pthread, threadName.data(), threadName.size())}) @@ -44,11 +46,9 @@ namespace skyline::kernel::type { LOGW("Failed to set the thread name: {}", strerror(result)); AsyncLogger::UpdateTag(); - if (!ctx.tpidrroEl0) - ctx.tpidrroEl0 = parent->AllocateTlsSlot(); + if (!tlsRegion) + tlsRegion = process.AllocateTlsSlot(); - ctx.state = &state; - state.ctx = &ctx; state.thread = shared_from_this(); if (setjmp(originalCtx)) { // Returns 1 if it's returning from guest, 0 otherwise @@ -63,6 +63,7 @@ namespace skyline::kernel::type { Signal(); + // Restore the previous thread name if any if (threadName[0] != 'H' || threadName[1] != 'O' || threadName[2] != 'S' || 
threadName[3] != '-') { if (int result{pthread_setname_np(pthread, threadName.data())}) LOGW("Failed to set the thread name: {}", strerror(result)); @@ -80,6 +81,9 @@ namespace skyline::kernel::type { if (timer_create(CLOCK_THREAD_CPUTIME_ID, &event, &preemptionTimer)) throw exception("timer_create has failed with '{}'", strerror(errno)); + // Initialize execution-mode-specific stuff + Init(); + { std::scoped_lock lock{statusMutex}; ready = true; @@ -89,95 +93,19 @@ namespace skyline::kernel::type { try { if (!Scheduler::YieldPending) state.scheduler->WaitSchedule(); - while (Scheduler::YieldPending) { - // If there is a yield pending on us after thread creation - state.scheduler->Rotate(); - Scheduler::YieldPending = false; - state.scheduler->WaitSchedule(); + + while (!killed) { + while (Scheduler::YieldPending) [[unlikely]] { + // If there is a yield pending on us after thread creation + state.scheduler->Rotate(); + Scheduler::YieldPending = false; + state.scheduler->WaitSchedule(); + } + + TRACE_EVENT("guest", "Guest"); + // Run the guest code + Run(); } - - TRACE_EVENT_BEGIN("guest", "Guest"); - - asm volatile( - "MRS X0, TPIDR_EL0\n\t" - "MSR TPIDR_EL0, %x0\n\t" // Set TLS to ThreadContext - "STR X0, [%x0, #0x2A0]\n\t" // Write ThreadContext::hostTpidrEl0 - "MOV X0, SP\n\t" - "STR X0, [%x0, #0x2A8]\n\t" // Write ThreadContext::hostSp - "MOV SP, %x1\n\t" // Replace SP with guest stack - "MOV LR, %x2\n\t" // Store entry in Link Register so it's jumped to on return - "MOV X0, %x3\n\t" // Store the argument in X0 - "MOV X1, %x4\n\t" // Store the thread handle in X1, NCA applications require this - "MOV X2, XZR\n\t" // Zero out other GP and SIMD registers, not doing this will break applications - "MOV X3, XZR\n\t" - "MOV X4, XZR\n\t" - "MOV X5, XZR\n\t" - "MOV X6, XZR\n\t" - "MOV X7, XZR\n\t" - "MOV X8, XZR\n\t" - "MOV X9, XZR\n\t" - "MOV X10, XZR\n\t" - "MOV X11, XZR\n\t" - "MOV X12, XZR\n\t" - "MOV X13, XZR\n\t" - "MOV X14, XZR\n\t" - "MOV X15, XZR\n\t" - "MOV 
X16, XZR\n\t" - "MOV X17, XZR\n\t" - "MOV X18, XZR\n\t" - "MOV X19, XZR\n\t" - "MOV X20, XZR\n\t" - "MOV X21, XZR\n\t" - "MOV X22, XZR\n\t" - "MOV X23, XZR\n\t" - "MOV X24, XZR\n\t" - "MOV X25, XZR\n\t" - "MOV X26, XZR\n\t" - "MOV X27, XZR\n\t" - "MOV X28, XZR\n\t" - "MOV X29, XZR\n\t" - "MSR FPSR, XZR\n\t" - "MSR FPCR, XZR\n\t" - "MSR NZCV, XZR\n\t" - "DUP V0.16B, WZR\n\t" - "DUP V1.16B, WZR\n\t" - "DUP V2.16B, WZR\n\t" - "DUP V3.16B, WZR\n\t" - "DUP V4.16B, WZR\n\t" - "DUP V5.16B, WZR\n\t" - "DUP V6.16B, WZR\n\t" - "DUP V7.16B, WZR\n\t" - "DUP V8.16B, WZR\n\t" - "DUP V9.16B, WZR\n\t" - "DUP V10.16B, WZR\n\t" - "DUP V11.16B, WZR\n\t" - "DUP V12.16B, WZR\n\t" - "DUP V13.16B, WZR\n\t" - "DUP V14.16B, WZR\n\t" - "DUP V15.16B, WZR\n\t" - "DUP V16.16B, WZR\n\t" - "DUP V17.16B, WZR\n\t" - "DUP V18.16B, WZR\n\t" - "DUP V19.16B, WZR\n\t" - "DUP V20.16B, WZR\n\t" - "DUP V21.16B, WZR\n\t" - "DUP V22.16B, WZR\n\t" - "DUP V23.16B, WZR\n\t" - "DUP V24.16B, WZR\n\t" - "DUP V25.16B, WZR\n\t" - "DUP V26.16B, WZR\n\t" - "DUP V27.16B, WZR\n\t" - "DUP V28.16B, WZR\n\t" - "DUP V29.16B, WZR\n\t" - "DUP V30.16B, WZR\n\t" - "DUP V31.16B, WZR\n\t" - "RET" - : - : "r"(&ctx), "r"(stackTop), "r"(entry), "r"(entryArgument), "r"(handle) - : "x0", "x1", "lr" - ); - - __builtin_unreachable(); } catch (const std::exception &e) { LOGE("{}", e.what()); if (id) { @@ -214,9 +142,9 @@ namespace skyline::kernel::type { statusCondition.notify_all(); if (self) { lock.unlock(); - StartThread(); + ThreadEntrypoint(); } else { - thread = std::thread(&KThread::StartThread, this); + thread = std::thread(&KThread::ThreadEntrypoint, this); } } } @@ -333,4 +261,113 @@ namespace skyline::kernel::type { } } } + + void KNceThread::Init() { + ctx.tpidrroEl0 = tlsRegion; + ctx.state = &state; + } + + void KNceThread::Run() { + asm volatile( + "MRS X0, TPIDR_EL0\n\t" // Retrieve current (host) TLS + "MSR TPIDR_EL0, %x0\n\t" // Set TLS to ThreadContext + "STR X0, [%x0, #0x2A0]\n\t" // Write host TLS to 
ThreadContext::hostTpidrEl0 + "MOV X0, SP\n\t" // Load the current (host) stack pointer + "STR X0, [%x0, #0x2A8]\n\t" // Write host SP to ThreadContext::hostSp + "MOV SP, %x1\n\t" // Replace SP with guest stack + "MOV LR, %x2\n\t" // Store entry in Link Register so it's jumped to on return + "MOV X0, %x3\n\t" // Store the argument in X0 + "MOV X1, %x4\n\t" // Store the thread handle in X1, NCA applications require this + "MOV X2, XZR\n\t" // Zero out other GP and SIMD registers, not doing this will break applications + "MOV X3, XZR\n\t" + "MOV X4, XZR\n\t" + "MOV X5, XZR\n\t" + "MOV X6, XZR\n\t" + "MOV X7, XZR\n\t" + "MOV X8, XZR\n\t" + "MOV X9, XZR\n\t" + "MOV X10, XZR\n\t" + "MOV X11, XZR\n\t" + "MOV X12, XZR\n\t" + "MOV X13, XZR\n\t" + "MOV X14, XZR\n\t" + "MOV X15, XZR\n\t" + "MOV X16, XZR\n\t" + "MOV X17, XZR\n\t" + "MOV X18, XZR\n\t" + "MOV X19, XZR\n\t" + "MOV X20, XZR\n\t" + "MOV X21, XZR\n\t" + "MOV X22, XZR\n\t" + "MOV X23, XZR\n\t" + "MOV X24, XZR\n\t" + "MOV X25, XZR\n\t" + "MOV X26, XZR\n\t" + "MOV X27, XZR\n\t" + "MOV X28, XZR\n\t" + "MOV X29, XZR\n\t" + "MSR FPSR, XZR\n\t" + "MSR FPCR, XZR\n\t" + "MSR NZCV, XZR\n\t" + "DUP V0.16B, WZR\n\t" + "DUP V1.16B, WZR\n\t" + "DUP V2.16B, WZR\n\t" + "DUP V3.16B, WZR\n\t" + "DUP V4.16B, WZR\n\t" + "DUP V5.16B, WZR\n\t" + "DUP V6.16B, WZR\n\t" + "DUP V7.16B, WZR\n\t" + "DUP V8.16B, WZR\n\t" + "DUP V9.16B, WZR\n\t" + "DUP V10.16B, WZR\n\t" + "DUP V11.16B, WZR\n\t" + "DUP V12.16B, WZR\n\t" + "DUP V13.16B, WZR\n\t" + "DUP V14.16B, WZR\n\t" + "DUP V15.16B, WZR\n\t" + "DUP V16.16B, WZR\n\t" + "DUP V17.16B, WZR\n\t" + "DUP V18.16B, WZR\n\t" + "DUP V19.16B, WZR\n\t" + "DUP V20.16B, WZR\n\t" + "DUP V21.16B, WZR\n\t" + "DUP V22.16B, WZR\n\t" + "DUP V23.16B, WZR\n\t" + "DUP V24.16B, WZR\n\t" + "DUP V25.16B, WZR\n\t" + "DUP V26.16B, WZR\n\t" + "DUP V27.16B, WZR\n\t" + "DUP V28.16B, WZR\n\t" + "DUP V29.16B, WZR\n\t" + "DUP V30.16B, WZR\n\t" + "DUP V31.16B, WZR\n\t" + "RET" + : + : "r"(&ctx), "r"(stackTop), "r"(entry), 
"r"(entryArgument), "r"(handle) + : "x0", "x1", "lr" + ); + + __builtin_unreachable(); + } + + void KJit32Thread::Init() { + ctx.gpr[0] = static_cast(entryArgument); + ctx.gpr[1] = handle; + + ctx.sp = static_cast(reinterpret_cast(stackTop)); + ctx.pc = static_cast(reinterpret_cast(entry)); + } + + void KJit32Thread::Run() { + jit = &state.jit32->GetCore(coreId); + + jit->RestoreContext(ctx); + jit->SetThreadPointer(static_cast(handle)); // Probably unused by guest, set to the thread handle just in case + jit->SetTlsPointer(static_cast(reinterpret_cast(tlsRegion))); + + jit->Run(); + + jit->SaveContext(ctx); + jit = nullptr; + } } diff --git a/app/src/main/cpp/skyline/kernel/types/KThread.h b/app/src/main/cpp/skyline/kernel/types/KThread.h index bc0f2d76..20c3402a 100644 --- a/app/src/main/cpp/skyline/kernel/types/KThread.h +++ b/app/src/main/cpp/skyline/kernel/types/KThread.h @@ -5,46 +5,61 @@ #include #include +#include #include #include #include #include "KSyncObject.h" #include "KSharedMemory.h" -namespace skyline { - namespace kernel::type { +namespace skyline::kernel { + thread_local inline type::KThread *this_thread{nullptr}; //!< The guest KThread for the current host thread + + namespace type { /** * @brief KThread manages a single thread of execution which is responsible for running guest code and kernel code which is invoked by the guest */ class KThread : public KSyncObject, public std::enable_shared_from_this { + public: + KProcess &process; //!< The process this thread belongs to + private: - KProcess *parent; std::thread thread; //!< If this KThread is backed by a host thread then this'll hold it pthread_t pthread{}; //!< The pthread_t for the host thread running this guest thread timer_t preemptionTimer{}; //!< A kernel timer used for preemption interrupts /** - * @brief Entry function any guest threads, sets up necessary context and jumps into guest code from the calling thread - * @note This function also serves as the entry point for host threads 
created in StartThread + * @brief Entry function of any guest threads, sets up necessary context and jumps into guest code from the calling thread */ - void StartThread(); + void ThreadEntrypoint(); + + protected: + /** + * @brief Initializes the thread's execution-mode-specific host object + */ + virtual void Init() = 0; + + /** + * @brief Runs this thread's guest code + */ + virtual void Run() = 0; public: std::mutex statusMutex; //!< Synchronizes all thread state changes (running/ready/killed) std::condition_variable statusCondition; //!< Signalled on the status of the thread changing bool running{false}; //!< If the host thread that corresponds to this thread is running, this doesn't reflect guest scheduling changes - bool ready{false}; //!< If this thread is ready to recieve signals or not + bool ready{false}; //!< If this thread is ready to receive signals or not bool killed{false}; //!< If this thread was previously running and has been killed KHandle handle; size_t id; //!< Index of thread in parent process's KThread vector - nce::ThreadContext ctx{}; //!< The context of the guest thread during the last SVC - jmp_buf originalCtx; //!< The context of the host thread prior to jumping into guest code + jmp_buf originalCtx{}; //!< The context of the host thread prior to jumping into guest code void *entry; //!< A function pointer to the thread's entry u64 entryArgument; //!< An argument to provide with to the thread entry function void *stackTop; //!< The top of the guest's stack, this is set to the initial guest stack pointer + u8 *tlsRegion{}; //!< The TLS region for this thread AdaptiveSingleWaiterConditionVariable scheduleCondition; //!< Signalled to wake the thread when it's scheduled or its resident core changes std::atomic basePriority; //!< The priority of the thread for the scheduler without any priority-inheritance @@ -63,11 +78,11 @@ namespace skyline { bool forceYield{}; //!< If the thread has been forcefully yielded by another thread 
RecursiveSpinLock waiterMutex; //!< Synchronizes operations on mutation of the waiter members - u32 *waitMutex; //!< The key of the mutex which this thread is waiting on - KHandle waitTag; //!< The handle of the thread which requested the mutex lock + u32 *waitMutex{}; //!< The key of the mutex which this thread is waiting on + KHandle waitTag{}; //!< The handle of the thread which requested the mutex lock std::shared_ptr waitThread; //!< The thread which this thread is waiting on std::list> waiters; //!< A queue of threads waiting on this thread sorted by priority - void *waitConditionVariable; //!< The condition variable which this thread is waiting on + void *waitConditionVariable{}; //!< The condition variable which this thread is waiting on bool waitSignalled{}; //!< If the conditional variable has been signalled already Result waitResult; //!< The result of the wait operation @@ -78,9 +93,11 @@ namespace skyline { bool isPaused{false}; //!< If the thread is currently paused and not runnable bool insertThreadOnResume{false}; //!< If the thread should be inserted into the scheduler when it resumes (used for pausing threads during sleep/sync) - KThread(const DeviceState &state, KHandle handle, KProcess *parent, size_t id, void *entry, u64 argument, void *stackTop, i8 priority, u8 idealCore); + jit::JitCore32 *jit{nullptr}; //!< The JIT core this thread is running on, or nullptr if it's not currently running - ~KThread(); + KThread(const DeviceState &state, KHandle handle, KProcess &process, size_t id, void *entry, u64 argument, void *stackTop, i8 priority, u8 idealCore); + + ~KThread() override; /** * @param self If the calling thread should jump directly into guest code or if a new thread should be created for it @@ -123,5 +140,40 @@ namespace skyline { return priority < it->priority; } }; + + class KNceThread : public KThread { + /** + * @brief Initializes the thread object for execution in NCE mode + */ + void Init() override; + + /** + * @brief Entry 
function for any guest thread when running in NCE mode + * @note This function does not return as it jumps into guest code + */ + void Run() override; + + public: + nce::ThreadContext ctx{}; //!< The context of the guest thread during the last SVC + + using KThread::KThread; + }; + + class KJit32Thread : public KThread { + /** + * @brief Initializes the thread object for execution in 32-bit JIT mode + */ + void Init() override; + + /** + * @brief Entry function for any guest thread when running in 32-bit JIT mode + */ + void Run() override; + + public: + jit::ThreadContext32 ctx{}; //!< The context of the guest thread + + using KThread::KThread; + }; } } diff --git a/app/src/main/cpp/skyline/loader/nro.cpp b/app/src/main/cpp/skyline/loader/nro.cpp index aaa79b03..088adf4e 100644 --- a/app/src/main/cpp/skyline/loader/nro.cpp +++ b/app/src/main/cpp/skyline/loader/nro.cpp @@ -63,6 +63,7 @@ namespace skyline::loader { executable.dynstr = {header.dynstr.offset, header.dynstr.size}; } + state.process->npdm.meta.flags.is64Bit = true; state.process->memory.InitializeVmm(memory::AddressSpaceType::AddressSpace39Bit); auto applicationName{nacp ? nacp->GetApplicationName(nacp->GetFirstSupportedTitleLanguage()) : ""}; auto loadInfo{LoadExecutable(process, state, executable, 0, applicationName.empty() ? 
"main.nro" : applicationName + ".nro")};
diff --git a/app/src/main/cpp/skyline/loader/nso.cpp b/app/src/main/cpp/skyline/loader/nso.cpp
index 6e555c82..a447bc84 100644
--- a/app/src/main/cpp/skyline/loader/nso.cpp
+++ b/app/src/main/cpp/skyline/loader/nso.cpp
@@ -63,6 +63,7 @@ namespace skyline::loader {
     }

     void *NsoLoader::LoadProcessData(const std::shared_ptr &process, const DeviceState &state) {
+        state.process->npdm.meta.flags.is64Bit = true;
         state.process->memory.InitializeVmm(memory::AddressSpaceType::AddressSpace39Bit);
         auto loadInfo{LoadNso(this, backing, process, state)};
         state.process->memory.InitializeRegions(span{loadInfo.base, loadInfo.size});
diff --git a/app/src/main/cpp/skyline/nce.cpp b/app/src/main/cpp/skyline/nce.cpp
index 3d42cf13..8943379d 100644
--- a/app/src/main/cpp/skyline/nce.cpp
+++ b/app/src/main/cpp/skyline/nce.cpp
@@ -36,7 +36,7 @@ namespace skyline::nce {
         }

         while (kernel::Scheduler::YieldPending) [[unlikely]] {
-            state.scheduler->Rotate(false);
+            state.scheduler->Rotate();
             kernel::Scheduler::YieldPending = false;
             state.scheduler->WaitSchedule();
         }
@@ -113,7 +113,7 @@ namespace skyline::nce {
         }, hookedSymbol.hook);

         while (kernel::Scheduler::YieldPending) [[unlikely]] {
-            state.scheduler->Rotate(false);
+            state.scheduler->Rotate();
             kernel::Scheduler::YieldPending = false;
             state.scheduler->WaitSchedule();
         }
diff --git a/app/src/main/cpp/skyline/os.cpp b/app/src/main/cpp/skyline/os.cpp
index 68b1d11d..1c68a7cd 100644
--- a/app/src/main/cpp/skyline/os.cpp
+++ b/app/src/main/cpp/skyline/os.cpp
@@ -3,6 +3,7 @@

 #include "gpu.h"
 #include "nce.h"
+#include
 #include "nce/guest.h"
 #include "kernel/types/KProcess.h"
 #include "vfs/os_backing.h"
@@ -29,7 +30,9 @@ namespace skyline::kernel {
           assetFileSystem(std::move(assetFileSystem)),
           state(this, jvmManager, settings),
           serviceManager(state) {}
-
+
+    bool isJitEnabled = false;
+
     void OS::Execute(int romFd, loader::RomFormat romType) {
         auto romFile{std::make_shared(romFd)};
         auto
keyStore{std::make_shared(privateAppFilesPath + "keys/")};
@@ -67,6 +70,15 @@ namespace skyline::kernel {
             LOGINF(R"(Starting "{}" ({}) v{} by "{}")", name, nacp->GetSaveDataOwnerId(), nacp->GetApplicationVersion(), publisher);
         }

+        // Scheduler retrieves information from the NPDM of the process so it needs to be initialized after the process is created
+        state.scheduler = std::make_shared(state);
+
+        if (!isJitEnabled) {
+            state.nce = std::make_shared(state);
+        } else { // 32-bit
+            state.jit32 = std::make_shared(state);
+        }
+
         process->InitializeHeapTls();
         auto thread{process->CreateThread(entry)};
         if (thread) {
diff --git a/app/src/main/cpp/skyline/os.h b/app/src/main/cpp/skyline/os.h
index b46684d4..94aea066 100644
--- a/app/src/main/cpp/skyline/os.h
+++ b/app/src/main/cpp/skyline/os.h
@@ -12,6 +12,8 @@ namespace skyline::kernel {
     /**
      * @brief The OS class manages the interaction between the various Skyline components
      */
+    extern bool isJitEnabled;
+
     class OS {
       public:
         std::string nativeLibraryPath; //!< The full path to the app's native library directory
diff --git a/app/src/main/java/emu/skyline/EmulationActivity.kt b/app/src/main/java/emu/skyline/EmulationActivity.kt
index 1500d077..cd9d762e 100644
--- a/app/src/main/java/emu/skyline/EmulationActivity.kt
+++ b/app/src/main/java/emu/skyline/EmulationActivity.kt
@@ -197,6 +197,8 @@ class EmulationActivity : AppCompatActivity(), SurfaceHolder.Callback, View.OnTo

     private external fun enableDynamicResolution(enable: Boolean)

+    private external fun enableJit(enable: Boolean)
+
     /**
      * @see [InputHandler.initializeControllers]
      */
@@ -347,6 +349,13 @@ class EmulationActivity : AppCompatActivity(), SurfaceHolder.Callback, View.OnTo
             }
         )

+        enableJit(
+            when (emulationSettings.cpuBackend) {
+                0 -> false
+                else -> true
+            }
+        )
+
         lifecycleScope.launch(Dispatchers.Main) {
             lifecycle.repeatOnLifecycle(Lifecycle.State.STARTED) {
                 WindowInfoTracker.getOrCreate(this@EmulationActivity)
diff --git
a/app/src/main/java/emu/skyline/settings/EmulationSettings.kt b/app/src/main/java/emu/skyline/settings/EmulationSettings.kt
index b69d9a7d..8437ea44 100644
--- a/app/src/main/java/emu/skyline/settings/EmulationSettings.kt
+++ b/app/src/main/java/emu/skyline/settings/EmulationSettings.kt
@@ -43,6 +43,9 @@ class EmulationSettings private constructor(context : Context, prefName : String
     var enableFoldableLayout by sharedPreferences(context, false, prefName = prefName)
     var showPauseButton by sharedPreferences(context, false, prefName = prefName)

+    // CPU
+    var cpuBackend by sharedPreferences(context, 0, prefName = prefName)
+
     // GPU
     var gpuDriver by sharedPreferences(context, SYSTEM_GPU_DRIVER, prefName = prefName)
     var forceTripleBuffering by sharedPreferences(context, true, prefName = prefName)
diff --git a/app/src/main/res/values/array.xml b/app/src/main/res/values/array.xml
index a2f28561..3f6ebe0a 100644
--- a/app/src/main/res/values/array.xml
+++ b/app/src/main/res/values/array.xml
@@ -87,6 +87,11 @@
 16:10
 Device Aspect Ratio (Stretch to fit)
+
+ Native code execution (NCE)
+ Dynarmic
+
+
 Auto
 Landscape
diff --git a/app/src/main/res/values/strings.xml b/app/src/main/res/values/strings.xml
index 224b60ae..c35fa273 100644
--- a/app/src/main/res/values/strings.xml
+++ b/app/src/main/res/values/strings.xml
@@ -130,6 +130,9 @@
 Disable Audio Output
 Audio output is disabled
 Audio output is enabled
+
+ CPU
+ CPU backend
 GPU
 GPU Driver Configuration
diff --git a/app/src/main/res/xml/emulation_preferences.xml b/app/src/main/res/xml/emulation_preferences.xml
index b2d9062f..47804596 100644
--- a/app/src/main/res/xml/emulation_preferences.xml
+++ b/app/src/main/res/xml/emulation_preferences.xml
@@ -96,6 +96,16 @@
 app:key="is_audio_output_disabled"
 app:title="@string/disable_audio_output" />
+
+
+