File | Last commit | Last updated
i#5352: Fix Windows performance regression (#5353) Fixes a failure to add indirect branch targets to the lookup table on Windows, a regression due to commit 638fb10 (PR #4886). Adds a regression test. Since small-enough pure performance tests on test VMs are difficult to keep non-flaky, we instead check the count of cache exits as a proxy for performance. We turn the total exit count into a release-build statistic, paying the cost of that one counter. (This stat should be useful for other purposes as well.) We add a workload with several separate calls (to avoid trace capture of one target) in a loop to the api.detach test as a convenient test for programmatically grabbing the stats and checking the resulting count. Manually tested on the Botan library program from the original reported problem: the big slowdown disappears with this fix. Fixes #5352 | 4 years ago
i#95 Linux detach: detaching via dr_app_stop_and_cleanup() Modeled on Peter Goodman's code at https://codereview.appspot.com/13314047 but with many changes, including: + not including any of the re-takeover changes: global var resets, etc. + using a futex instead of sleeping and checking a volatile bool. This does retain the duplication of Windows detach code from the patch. That will be addressed in an upcoming CL. Adds a new app API routine dr_app_stop_and_cleanup() which triggers a detach that can assume it's on an app thread and app stack. Implements detach cleanup for Linux by splitting the portions that must be executed by each thread (mainly segment restoration) from the rest of the cleanup that the detaching thread can perform. The final go-native step is accomplished via sigreturn using a generalized dynamorio_futex_wake_and_exit/dynamorio_semaphore_signal_all routine, renamed to dynamorio_condvar_wake_and_jmp and taking in a target routine to jump to after the wake. This target is dynamorio_sigreturn for detach. Adds two app-detach tests. Review-URL: https://codereview.appspot.com/302710043 | 9 years ago
i#5651: Switch DR's suspend signal to SIGILL with sigqueueinfo (#5652) Switches the signal DR uses to suspend threads from SIGSTKFLT, which is not supported in gdb, to SIGILL, which is shared with nudges. SIGILL is unlikely to be blocked or to be sent asynchronously. We now use sigqueueinfo (where supported) to set a flag to distinguish a nudge from a suspend signal, and to set certain siginfo fields that libc does not set, further increasing confidence that there will be no confusion with the application's use of signals. Since QEMU crashes if we send SIGILL, we make SUSPEND_SIGNAL a variable instead of a constant and set it back to SIGSTKFLT when -xarch_root is set and live with the downsides there. Fixes #5651 | 3 years ago
i#1921 native sig: Deliver signals to unmanaged threads (#4636) When a signal arrives in a completely unmanaged thread with no dcontext, typically because DR is detaching, we now deliver the signal if the application has a handler for it. This requires adding support for no dcontext to several parts of the frame setup code even beyond what was added in PR #4603 for temporarily-native threads. We have to save the app's handler when we detach a thread so we know where to send a native signal. Full support is complex when we're cleaning up and have no dynamic storage, so we use a single global handler per signal. We detect whether multiple handlers are in operation in this single DR instance (quite rare: only custom non-pthread clones have this behavior) and in that case we abort like before on a native signal. Adds ATOMIC_READ_1BYTE() to complement the existing atomic operations for a cleaner read of the new multi-handler flag. Delivering the frame often overlaps with DR's frame and even DR's stack usage while delivering, if the app has no sigaltstack. We add logic to detect this overlap and avoid clobbering the stack memory. Alarm signals are still dropped, since they can arrive mid-thread-init when it is even harder to deliver. Adds a new test api.detach_signal which creates 10 threads that all sit in a loop sending 4 different alternating signals (SIGSEGV, SIGBUS, SIGURG, SIGALRM) while the main thread attaches and then detaches. When run in debug build, many many signals arrive in post-detach threads, since detach takes a while to do debug cleanup, exercising the new code. Adds a new RSTAT for native signals so we can identify when this happens in release build. Exports the stat to the API and uses it to check that at least some signals were delivered natively in the new test. Removes the fatal error on a signal arriving with no dcontext.
But, non-ignore default signal actions when no handler is present are still not fully handled, along with multi-sighand-processes as mentioned, and the fatal error remains in those cases. For default actions, since the process is going to terminate anyway, the only shortcoming of this is whether a core is generated and whether the proper process exit code is raised. Issue: #1921 | 5 years ago
i#2876 clang-format: massive reformat of every file (#3093) Massive clang-format of every C or C++ source file (*.h, *.c, *.cpp), except for: + third_party/ + ext/drsyms/libelftc/include/ + ext/drsyms/demangle.cc This is a single, large commit by design to present a single history disruption point and bring the code base into a consistent format. clang-format version 6.0 was used. Tweaks the clang-format rules to indent 4 after pre-processor hashes, to allow single-line case labels, and align trailing comments. Disables the pp_indent vera style check as we changed the indent rule and clang-format now covers it. Leaves the other checks, even though some are redundant (they found clang-format errors). Moves C++ token defines from globals_shared.h and globals.h to a new header, core/lib/c_defines.h, to avoid a clang-format parsing error where it fails to identify the include guard. Adds genapi.pl removal of extra indentation inside API_EXPORT_ONLY and CLIENT_INTERFACE regions. Adds exclusions around large regions we don't want to format, in these files: + core/win32/syscallx.h + core/arch/x86/decode_table.c + core/arch/arm/table_a32_pred.c + core/arch/arm/table_a32_unpred.c + core/arch/arm/table_t32_16.c + core/arch/arm/table_t32_16_it.c + core/arch/arm/table_t32_base.c + core/arch/arm/table_t32_coproc.c + core/arch/arm/table_encode.c Adds smaller exclusions to work around clang-format bugs: + Several missing break-after-return-type + Several >90-char lines + Misc scattered issues, all listed in #2876 Issue: #2876 | 7 years ago
i#2600: fix race between detach and new thread (#2602) Adds a counter uninit_thread_count which is incremented in UNIX just prior to a new thread being created. Synchall will keep looping while this counter is positive, ensuring we avoid races with threads created but not yet initialized during a detach synchall which then use DR resources after detach, causing crashes. Leaves the counter at 0 on Windows where we can't distinguish external vs internal thread creation, but Windows does have the init_apc_go_native flag which eliminates some races. Adds a test. To more reliably hit the race (about half the time) I locally added a 50ms sleep at the top of dynamo_thread_init(). Fixes #2600 | 8 years ago
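The counter scheme described in the i#2600 message can be sketched roughly as follows. This is a minimal illustration, not DynamoRIO's code: only the name uninit_thread_count comes from the commit message, while the helper names and the point at which the counter is decremented are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Count of threads that have been created but have not finished initializing.
 * The name is taken from the commit message; everything else is hypothetical. */
static atomic_int uninit_thread_count;

/* Called by the creating thread just before the new thread is created. */
static void
pre_thread_create(void)
{
    atomic_fetch_add(&uninit_thread_count, 1);
}

/* Assumed to be called by the new thread once its initialization is complete. */
static void
thread_init_done(void)
{
    int prev = atomic_fetch_sub(&uninit_thread_count, 1);
    assert(prev > 0); /* A decrement must pair with an earlier increment. */
}

/* The synch-all loop keeps retrying while any thread is still uninitialized. */
static bool
synchall_must_retry(void)
{
    return atomic_load(&uninit_thread_count) > 0;
}
```

A detaching thread would poll synchall_must_retry() in its synchronization loop and proceed only once it returns false.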
i#3427 detach flakiness: Add two derivatives of the api.detach_spawn test. (#3492) The remaining items in #3427 were non-issues, see comment in #3427. Instead, this patch is adding two tests. Both are derivatives of api.detach_spawn. api.detach_spawn_quick_exit: Adds start/stop_and_cleanup re-attach to api.detach_spawn. It only spawns one thread. The thread is synch'd such that it exits while running native. This case is not supported by DynamoRIO's dr_app_stop w/o detach, but must work w/ dr_app_stop_and_cleanup(). api.detach_spawn_stress: Adds start/stop_and_cleanup re-attach to api.detach_spawn. Increase the number of spawned parent and child threads to 25 each, spawning a total number of 25 + 25 * 25 = 650 threads at a time, but limit the re-spawn loop. Like api.detach_spawn, this test is currently flaky in 32-bit, and very rarely flaky in 64-bit. #3427 covers the 64-bit failure. I have added a comment to #2694. Issues: #3427, #2694 | 7 years ago
i#5786: Add precise clean call mangling identification (#5791) Adds new labels delimiting clean call sequences. Converts into a translation record flag when storing translations. Uses the new labels and flag to precisely identify clean call mangling, replacing the previous scheme which incorrectly thought mangled tool pc-relative was a clean call, resulting in incorrect translations and crashes. Adds a test case to api.detach_state by adding a client (by converting it to use static DR) which inserts a pc-relative load. This reproduces the crash on detach, and is fixed with this fix. The added instrumentation caused periodic detach failures which were solved by setting the translation and adding a restore-state event: i#4232 covers trying to improve the situation. Adds a new instr_t.offset field. Stops using instr_t.note to hold encoding offsets for pc-relative operands. Adds a new field instr_t.offset which is used for this purpose. This leaves note values in place across encodings, which is needed for new clean call marking labels and also simplifies rseq handling code. This instr_t field is a compatibility break and we bump the version and OLDEST_COMPATIBLE_VERSION here to 990. Updates dr_get_note docs. Augments logging of xl8 info with new flag info. Reduces DR_NOTE_FIRST_RESERVED to give DR more reserved labels. This is another compatibility break, while at it. Fixes several issues hit in tests that happened to trigger on the heap bucket size and other changes: + Fixes a rank order violation at loglevel 5: xref #1649 + Writes real xstate_bv into signal frame when setting the xstate context to avoid lazy AVX restore problems. + Tweaks the thread_churn test to work around non-linearities. Issue: #5786, #4232 Fixes #5786 | 3 years ago
i#4698 detach_state: Port signal + GPR-cache tests to A64 (#4708) Enables api.detach_state on AArch64 with some initial sub-tests enabled: + Enables the cross-platform signal state tests on AArch64. + Ports the assembly to push and pop all the general-purpose registers, fill them with sentinel values, and test them on a detach from the code cache. As part of this, adds some general assembly support: + @P@ expands to # in cpp2asm. + Adds REG_SP and JUMP_NOT_EQUAL cross-platform asm_defines. + Adds a missing dep on detach_state_shared.h for the asm code. Still left to do for future work: + Check SIMD registers. + Check flags. + Check stack. + Detach from DR rather than the cache. Issue: #4698 | 5 years ago
i#3044 AArch64 SVE codec: change LDR/STR and PRF to use byte offsets (#6230) For the current decode/encode functions of: LDR <Zt>, [<Xn|SP>{, #<imm>, MUL VL}] LDR <Pt>, [<Xn|SP>{, #<imm>, MUL VL}] STR <Zt>, [<Xn|SP>{, #<imm>, MUL VL}] STR <Pt>, [<Xn|SP>{, #<imm>, MUL VL}] PRFB <prfop>, <Pg>, [<Xn|SP>{, #<imm>, MUL VL}] PRFH <prfop>, <Pg>, [<Xn|SP>{, #<imm>, MUL VL}] PRFW <prfop>, <Pg>, [<Xn|SP>{, #<imm>, MUL VL}] PRFD <prfop>, <Pg>, [<Xn|SP>{, #<imm>, MUL VL}] Vector indexing is used in the memory operand at the IR level. However the IR must always refer to the address in terms of the base register value plus a byte offset displacement. This patch changes the decode/encode functions for these instructions to expect byte offsets at the IR level, converting to vector length offsets within the codec. Issues #3044, #5365 | 2 years ago
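The conversion between an IR-level byte displacement and an encoded "MUL VL" immediate, as the i#3044 message describes, might look something like the sketch below. It is an illustration under stated assumptions rather than the codec's actual code: the function names, the reg_bytes parameter (the size in bytes of the register being transferred) and the imm_bits parameter (the width of the signed immediate field) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Encoder direction (hypothetical helper): converts a byte displacement into
 * a signed immediate counted in units of reg_bytes. Fails if the displacement
 * is not a whole multiple of reg_bytes or does not fit in imm_bits bits. */
static bool
byte_disp_to_mul_vl(int64_t byte_disp, int64_t reg_bytes, int imm_bits,
                    int32_t *imm_out)
{
    assert(reg_bytes > 0 && imm_bits > 0 && imm_bits < 32);
    int64_t min = -((int64_t)1 << (imm_bits - 1));
    int64_t max = ((int64_t)1 << (imm_bits - 1)) - 1;
    if (byte_disp % reg_bytes != 0)
        return false;
    int64_t imm = byte_disp / reg_bytes;
    if (imm < min || imm > max)
        return false;
    *imm_out = (int32_t)imm;
    return true;
}

/* Decoder direction (hypothetical helper): scales the encoded immediate back
 * up to the byte displacement stored in the IR. */
static int64_t
mul_vl_to_byte_disp(int32_t imm, int64_t reg_bytes)
{
    return (int64_t)imm * reg_bytes;
}
```

With a 256-bit vector length (32 bytes) and a 9-bit immediate, a byte displacement of 64 would encode as an immediate of 2, while 48 would be rejected because it is not a multiple of the register size.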
i#3044 AArch64 SVE2 codec: Add misc predicate instructions (#6191) This patch adds the appropriate macros, tests and codec entries to encode and decode the following variants: MATCH <Pd>.<Ts>, <Pg>/Z, <Zn>.<Ts>, <Zm>.<Ts> NMATCH <Pd>.<Ts>, <Pg>/Z, <Zn>.<Ts>, <Zm>.<Ts> URECPE <Zd>.S, <Pg>/M, <Zn>.S URSQRTE <Zd>.S, <Pg>/M, <Zn>.S WHILEGE <Pd>.<Ts>, <R><n>, <R><m> WHILEGT <Pd>.<Ts>, <R><n>, <R><m> WHILEHI <Pd>.<Ts>, <R><n>, <R><m> WHILEHS <Pd>.<Ts>, <R><n>, <R><m> WHILERW <Pd>.<Ts>, <Xn>, <Xm> WHILEWR <Pd>.<Ts>, <Xn>, <Xm> Issue: #3044 | 2 years ago
i#2626 Finish AArch64 encoder/decoder: DC CVA[D]P (#5959) This patch adds the appropriate macros, tests and codec entries to encode the following variants: DC CVAP, <Xt> DC CVADP, <Xt> Issue: #2626 | 3 years ago
i#2626 Finish AArch64 encoder/decoder: FJCVTZS (#5948) This patch adds the appropriate macros, tests and codec entries to encode the following variants: FJCVTZS <Wd>, <Dn> Issue: #2626 | 3 years ago
i#2626: Make dis-a64*.txt test order stable (#5923) Tests were previously sorted by mnemonic only, so instructions with multiple variants could change order when the file is added to and re-sorted. The order should now be stable which should cut back on code review noise. This also reorders the test files to match the new order used by the script. Issue: #2626 | 3 years ago
i#3044 AArch64 SVE codec: Data-processing and DC memory tagged (#6216) This patch adds the appropriate macros, tests and codec entries to encode the following variants: GMI <Xd>, <Xn|SP>, <Xm> IRG <Xd|SP>, <Xn|SP>{, <Xm>} SUBP <Xd>, <Xn|SP>, <Xm|SP> SUBPS <Xd>, <Xn|SP>, <Xm|SP> ADDG <Xd|SP>, <Xn|SP>, #<imm1>, #<imm2> SUBG <Xd|SP>, <Xn|SP>, #<imm1>, #<imm2> DC GVA, <Xt> DC GZVA, <Xt> Issue #3044 | 2 years ago
i#5365: Add AArch64 SVE support to the core (part 1) (#5835) This patch adds Arm AArch64 Scalable Vector Extension (SVE) support to the core including related changes to the codec, IR and relevant clients. SVE and SVE2 are major extensions to Arm's 64 bit architecture. Developers and users should reference the relevant documentation at developer.arm.com, (currently https://developer.arm.com/Architectures/Scalable%20Vector%20Extensions). The architecture allows hardware implementations to support vector lengths from 128 to 2048 bits. This patch supports up to 512 bits due to DynamoRIO's stack size limitation. There is currently no stock SVE hardware with vector lengths greater than 512 bits. The vector length is determined by get_processor_specific_info() at runtime on startup and is available by calling proc_get_vector_length(). For Z registers, reg_get_size() will return the vector size implemented by the hardware rather than OPSZ_SCALABLE. There will be follow up patches for: - SVE scatter/gather emulation - Full SVE signal context support - Complete SVE support in sample clients and drcachesim tracer. Issues: #5365, #3044 --------- Co-authored-by: Cam Mannett <camden.mannett@arm.com> | 2 years ago
i#1569 AArch64: Add dis-a64 pre-commit test. In dis-a64.txt there is a list of encodings and corresponding disassemblies. This is checked by dis-a64.c, while dis-a64.pl is a script to facilitate generating dis-a64.txt. Review-URL: https://codereview.appspot.com/300540043 | 9 years ago
i#2440 AArch64 is missing XINST_CREATE and INSTR_CREATE macros (#2808) This patch adds some missing macros, adds macro tests and fixes ldar encodings. | 8 years ago
i#2626 Finish AArch64 encoder/decoder: XPAC* (#5943) This patch adds the appropriate macros, tests and codec entries to encode the following variants: XPACD <Xd> XPACI <Xd> XPACLRI Issues: #2626, #5623 | 3 years ago
i#1686 decoder testing: add api.disA32 and api.disT32 binary decoding tests Adds a new helper app api.dis-create which takes a large random binary and pulls out a subset of it that contains the head of the file and then caps the count of each opcode in order to create a manageable pre-commit-suite test. It also appends a few hardcoded entries for rare opcodes to ensure coverage. Adds two new tests, api.disA32 and api.disT32, which use the result of the api.dis-create helper and disassemble their own input binaries. The expected output for now is the DR format (already compared for register usage with Capstone). Eventually we want to finish -syntax_arm and use that as the format to check all the operands as well. Splits up the tobuild_api() routine to support this. Review-URL: https://codereview.appspot.com/231760044 | 10 years ago
i#5441 Fix Arm's AESE and AESD instructions (#5457) The instruction now decodes to two source operands, in the order used in the instruction. The source operand previously at index 0 is now at index 1. AArch32 fix by Al Grant. Issue: #5441 | 4 years ago
i#4117: Add missing dr_standalone_init() and _exit() calls (#4118) The filed bug is about raw2trace_directory using the DR API without initializing DR, but I expanded this to clean up the whole code base: Adds dr_standalone_init() and _exit() to raw2trace_directory_t. This fixes an assert when calling dr_open_file() due to uninitialized state. Adds {dr_,}standalone_exit() calls to all cases where _init() is called, except where we can't do that (drdecode heap usage). For the C++ view and opcode_mix tools, uses a first-field destructor to ensure dr_standalone_exit() is called after any DR heap used by other fields is freed. Fixes several bugs where IR cleanup calls were missing in the api.ir test, revealed by dr_standalone_exit's unfreed-memory check. Fixes #4117 | 6 years ago
Add attribution for dis-udis86-randtest.raw test file (#3583) Adds missing attribution for the dis-udis86-randtest.raw binary test file which came from the Udis86 project. | 7 years ago
initial open source import from formerly proprietary VMware sources SVN-Revision: 1 | 17 years ago
i#6003: Only accept FS/GS segment prefixes on X64. (#6056) On 64 bit the legacy CS/SS/DS/ES prefixes are ignored. Account for that so instructions decode correctly. Fixes #6003 | 3 years ago
i#1312 AVX512 support: add partial EVEX support to decode_sizeof (#3270) This patch adds partial AVX512 functionality by partially supporting fast decode of EVEX-encoded instruction lengths. The patch supports decode_sizeof for instructions with EVEX prefix but has only been manually tested with a limited set of instructions. Full decode of EVEX instructions is not supported. Issue: #1312 | 7 years ago
Add instruction category interface and implement for AArch64 (#6237) Adds a new instruction category bit type for combining to describe instruction operations. Adds a new category field to instr_t, which breaks 32-bit compatibility (it fits into a padding slot for 64-bit). Adds instr_get_category() and instr_set_category(). Implements categorization for AArch64 instruction mix. Each opcode is set to categories from the following list: - Integer arithmetic operations - Floating-Point arithmetic operations - Loads - Stores - Branches - Operations with vector registers (SIMD) - Other types of instructions Calculation of category is based on C4.1.67 of ARM V8 Architecture reference manual * https://developer.arm.com/documentation/ddi0487/ Other platform categories are left for future work. Categories for synthetic instructions are not supported. This is a step toward drmemtrace categories (#6238) and may help with arithmetic intensity and other calculations (#5994). Co-authored-by: Derek Bruening <bruening@google.com> Issue: #6238, #5994 | 2 years ago
i#1569 AArch64: Add api.drdecode test for AArch64. Review-URL: https://codereview.appspot.com/308400043 | 9 years ago
i#4180: Add non-heap-using instruction structure (#4183) Adds instr_noalloc_t, a wrapper around instr_t which contains space for all possible operands and a temporary encoding buffer. Adds support in the core instruction routines for using these internal fields and never using external heap, enabling this structure to be used for decoding in a signal handler. A regular instr_t pointer is obtained from an instr_noalloc_t using instr_from_noalloc(). The encoding buffer, due to reachability issues, cannot be re-relativized, making this structure inefficient for encoding. It is targeted toward decoding only. Adds a sanity test. Fixes #4180 | 6 years ago
i#975 static DR: refactor STATIC_LIBRARY to a side-by-side target This is a big refactoring of the STATIC_LIBRARY top-level CMake configuration and separate build into instead using a side-by-side dynamorio_static target. Removes STATIC_LIBRARY from all CMake code. It is no longer a separate build choice. Adds a dynamorio_static static library build of DR that is always built. Disables data section protection for dynamorio_static. Adds a -static option to drrun to set the env var for auto-takeover of an app using dynamorio_static. Adds a new CMake utility configure_DynamoRIO_static() to make it easy to link an app with dynamorio_static. Renames the existing static_* tests to drdecode_* to make it clear what they are testing. Adds a new test static_explicit.c to test explicit use of the start/stop API with dynamorio_static. DR still does not look for clients without runtime options being set: that will be added in future work. Review-URL: https://codereview.appspot.com/309260043 | 9 years ago
i#1034: Eliminate signed char assumptions (#5349) Updates pieces of the code that used "char" and assumed it was signed to use "sbyte" instead. Adds "-funsigned-char" to the base compiler flags to really ensure the code base is free of assumptions and to further ensure it is interoperable with other toolchains that use "-funsigned-char". Adds a note to the style guide. Tweaks the drdecode_x86 test to fail without the fixes here in the x86 decoder. Fixes #1034 | 4 years ago
i#5383 mac a64, part 4: Enable tests to build (#6182) Enables BUILD_TESTS to compile for M1: a combination of fixing and disabling tests. Many issues came from assembler and pre-processor differences. Issue: #5383 | 2 years ago
i#3098: fix shared ibt table races (#3103) Fixes two races with shared ibt tables: + Adding a new table entry must write the start_pc before the tag. This is accomplished with a new ENTRY_SET_TO_ENTRY hashtablex.h optional specifier. For ARM #2502 a new MEMORY_STORE_BARRIER macro is added. + Resizing a table must not clear the tags in the old table to avoid losing the tag on the target_delete ibl path. Adds a test api.ibl-stress which uses the DR IR to synthetically construct thousands of basic blocks with indirect branches between them. To make the test work, relaxes several is-on-stack checks to support pre-building basic blocks (#2463) from generated code or other locations not known prior to starting the application. Issue: #3098, #2502, #2463 Fixes #3098 | 7 years ago
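The first race in the i#3098 message concerns store ordering: a concurrent reader that matches on the tag must never see a stale start_pc. A minimal sketch of that ordering using C11 atomics is shown below. It is not DynamoRIO's implementation, which uses its own macros and generated lookup code; the ibt_entry_t layout and both helper names are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical table entry: a tag of 0 means the slot is empty. */
typedef struct {
    _Atomic uintptr_t tag;
    uintptr_t start_pc;
} ibt_entry_t;

/* Writer: fill in start_pc first, then publish the tag with release ordering
 * so that the start_pc store cannot become visible after the tag store. */
static void
entry_publish(ibt_entry_t *e, uintptr_t tag, uintptr_t start_pc)
{
    assert(tag != 0);
    e->start_pc = start_pc;
    atomic_store_explicit(&e->tag, tag, memory_order_release);
}

/* Reader: match on the tag with acquire ordering before reading start_pc.
 * Returns 0 when the entry does not hold the requested tag. */
static uintptr_t
entry_lookup(ibt_entry_t *e, uintptr_t tag)
{
    if (atomic_load_explicit(&e->tag, memory_order_acquire) == tag)
        return e->start_pc;
    return 0;
}
```

Writing the fields in the opposite order would let a reader match the tag and then jump to whatever start_pc the slot held before.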
i#3044 AArch64 SVE codec: Update old AND, BIC, EOR and ORR encodings (#6012) These instructions were implemented before the IR was fully defined for SVE and were thus missing the predicate mode and the vector element sizes. This patch updates the appropriate macros, tests and codec entries to encode the following variants: AND <Zdn>.<Ts>, <Pg>/M, <Zdn>.<Ts>, <Zm>.<Ts> BIC <Zdn>.<Ts>, <Pg>/M, <Zdn>.<Ts>, <Zm>.<Ts> EOR <Zdn>.<Ts>, <Pg>/M, <Zdn>.<Ts>, <Zm>.<Ts> ORR <Zdn>.<Ts>, <Pg>/M, <Zdn>.<Ts>, <Zm>.<Ts> issue: #3044 | 3 years ago
i#3044 AArch64 SVE codec: Add memory tagged load/stores (#6215) This patch adds the appropriate macros, tests and codec entries to decode and encode the following instructions: LDG <Xt>, [<Xn|SP>, #<simm>] ST2G <Xt>, [<Xn|SP>], #<simm> ST2G <Xt>, [<Xn|SP>, #<simm>]! ST2G <Xt>, [<Xn|SP>, #<simm>] STG <Xt>, [<Xn|SP>], #<simm> STG <Xt>, [<Xn|SP>, #<simm>]! STG <Xt>, [<Xn|SP>, #<simm>] STZ2G <Xt>, [<Xn|SP>], #<simm> STZ2G <Xt>, [<Xn|SP>, #<simm>]! STZ2G <Xt>, [<Xn|SP>, #<simm>] STZG <Xt>, [<Xn|SP>], #<simm> STZG <Xt>, [<Xn|SP>, #<simm>]! STZG <Xt>, [<Xn|SP>, #<simm>] STGP <Xt>, <Xt2>, [<Xn|SP>], #<simm> STGP <Xt>, <Xt2>, [<Xn|SP>, #<simm>]! STGP <Xt>, <Xt2>, [<Xn|SP>, #<simm>] Issue #3044 Co-authored-by: Joshua Warburton <joshua.warburton@arm.com> | 2 years ago
i#5365: Add AArch64 SVE support to the core (part 1) (#5835) This patch adds Arm AArch64 Scalable Vector Extension (SVE) support to the core, including related changes to the codec, IR and relevant clients. SVE and SVE2 are major extensions to Arm's 64-bit architecture. Developers and users should reference the relevant documentation at developer.arm.com (currently https://developer.arm.com/Architectures/Scalable%20Vector%20Extensions). The architecture allows hardware implementations to support vector lengths from 128 to 2048 bits. This patch supports up to 512 bits due to DynamoRIO's stack size limitation. There is currently no stock SVE hardware with vector lengths greater than 512 bits. The vector length is determined by get_processor_specific_info() at runtime on startup and is available by calling proc_get_vector_length(). For Z registers, reg_get_size() will return the vector size implemented by the hardware rather than OPSZ_SCALABLE. There will be follow-up patches for: SVE scatter/gather emulation; full SVE signal context support; complete SVE support in sample clients and the drcachesim tracer. Issues: #5365, #3044 Co-authored-by: Cam Mannett <camden.mannett@arm.com> (2 years ago)
i#3044: Split out SVE instruction test files (#5680) This patch creates a new file for SVE disassembly tests and another ir_aarch64 test file. Also included is a small bug fix for the disassembly test sorter that could remove a test if the file did not end in a newline. Issue: #3044 (3 years ago)
i#2626 Finish AArch64 encoder/decoder: Preparations for v8.6 and SVE2 (#5876) Updates the build and test system backends to support v8.6 and SVE2. Adds v8.3 to the codec checking scripts, as it seems to have been missed. Issue: #2626 (3 years ago)
i#2626: AArch64 v8.1 codec: add SQRDMLSH (#5563) Completes implementation of signed saturating rounding doubling multiply arithmetic instructions SQRDMLAH and SQRDMULH. Co-authored-by: Joshua Warburton <joshua.warburton@arm.com> Issue: #2626 (3 years ago)
i#2626: AArch64 v8.2 codec: add FP16 FCVT instructions (#5578) This patch adds half-precision floating-point conversion instructions introduced in v8.2 as part of the FEAT_FP16 feature. These are an extension of existing v8.0 floating-point conversion instructions handled in codec_v80.txt. FCVTAS <Vd>.<T>, <Vn>.<T>; FCVTNS <Vd>.<T>, <Vn>.<T>; FCVTAS <Hd>, <Hn>; FCVTNS <Hd>, <Hn>; FCVTAS <Wd>, <Hn>; FCVTNS <Wd>, <Hn>; FCVTAS <Xd>, <Hn>; FCVTNS <Xd>, <Hn>; FCVTAU <Vd>.<T>, <Vn>.<T>; FCVTPS <Vd>.<T>, <Vn>.<T>; FCVTAU <Hd>, <Hn>; FCVTPS <Hd>, <Hn>; FCVTAU <Wd>, <Hn>; FCVTPS <Wd>, <Hn>; FCVTAU <Xd>, <Hn>; FCVTPS <Xd>, <Hn>; FCVTMS <Vd>.<T>, <Vn>.<T>; FCVTPU <Vd>.<T>, <Vn>.<T>; FCVTMS <Hd>, <Hn>; FCVTPU <Hd>, <Hn>; FCVTMS <Wd>, <Hn>; FCVTPU <Wd>, <Hn>; FCVTMS <Xd>, <Hn>; FCVTPU <Xd>, <Hn>. Issue: #2626 (3 years ago)
i#2626 Finish AArch64 encoder/decoder: SIMD v8.3 Complex number (#5886) This patch adds the appropriate macros, tests and codec entries to encode the following variants: FCADD <Vd>.<Ts>, <Vn>.<Ts>, <Vm>.<Ts>, <imm>; FCMLA <Vd>.<Ts>, <Vn>.<Ts>, <Vm>.<Ts>, <imm>; FCMLA <Vd>.<Ts>, <Vn>.<Ts>, <Vm>.<Tb>[<imm1>], <imm2>. Issue: #2626 (3 years ago)
i#2626 Finish AArch64 encoder/decoder: ARMv8.4-RCPC (#5920) This patch adds the appropriate macros, tests and codec entries to encode the following variants: LDAPUR <Wt>, [<Xn|SP>{, #<simm>}]; LDAPUR <Xt>, [<Xn|SP>{, #<simm>}]; LDAPURB <Wt>, [<Xn|SP>{, #<simm>}]; LDAPURSB <Wt>, [<Xn|SP>{, #<simm>}]; LDAPURSB <Xt>, [<Xn|SP>{, #<simm>}]; LDAPURH <Wt>, [<Xn|SP>{, #<simm>}]; LDAPURSH <Wt>, [<Xn|SP>{, #<simm>}]; LDAPURSH <Xt>, [<Xn|SP>{, #<simm>}]; LDAPURSW <Xt>, [<Xn|SP>{, #<simm>}]; STLUR <Wt>, [<Xn|SP>{, #<simm>}]; STLUR <Xt>, [<Xn|SP>{, #<simm>}]; STLURB <Wt>, [<Xn|SP>{, #<simm>}]; STLURH <Wt>, [<Xn|SP>{, #<simm>}]. Issue: #2626 (3 years ago)
i#5302: Fix a32 assert on rel-addr opnd disassembly (#5306) Adds correct ifdef ARM handling of REL_ADDR_kind operands when disassembling, and in legacy loadtoconst code. Adds a test to api.ir (confirmed it fails without this fix). Fixes other latent issues in api.ir on arm, which unfortunately does not run on QEMU yet and so is not run automatically in today's CI. Tested manually on an arm system. The test also includes a regression test for #5304. Issue: #5302, #5304 Fixes #5302 (4 years ago)
i#3544 RV64: Fix AUIPC uimm sign-extension issue (#6306) The "u" in uimm means U-type, not unsigned; the immediate is a signed integer. This patch fixes this issue and adds a test case for it. Issue: #3544 (2 years ago)
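To illustrate the sign-extension point in the entry above, here is a minimal sketch of how a U-type immediate can be read as a signed value. It is not DynamoRIO's decoder: the function names `utype_imm20_signed`, `auipc_result` and `auipc_self_check` are hypothetical, and the only facts assumed are the RISC-V encoding (the immediate occupies bits 31:12) and the AUIPC semantics (the immediate, shifted left by 12 and sign-extended, is added to the pc).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the 20-bit U-type immediate field
 * (bits 31:12 of the raw instruction word) as a signed value. */
int32_t
utype_imm20_signed(uint32_t raw)
{
    uint32_t field = raw >> 12; /* The 20-bit immediate field. */
    /* Bit 19 of the field is the sign bit. */
    if (field & 0x80000u)
        return (int32_t)field - 0x100000;
    return (int32_t)field;
}

/* Hypothetical helper: value AUIPC writes to rd on RV64,
 * i.e. pc + sign_extend(imm20 << 12). */
uint64_t
auipc_result(uint64_t pc, uint32_t raw)
{
    int64_t offset = (int64_t)utype_imm20_signed(raw) * 4096;
    return pc + (uint64_t)offset; /* Wraps modulo 2^64. */
}

/* Small self-check: 0xFFFFF297 encodes auipc x5 with an all-ones
 * immediate field, which must be read as -1, not 0xFFFFF. */
void
auipc_self_check(void)
{
    assert(utype_imm20_signed(0xFFFFF297u) == -1);
    assert(auipc_result(0x10000u, 0xFFFFF297u) == 0xF000u);
    assert(auipc_result(0x10000u, 0x00001297u) == 0x11000u);
}
```

Treating the field as unsigned would instead produce a target nearly 4 GiB above the pc for the first case.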
i#3544 RV64: Include implicit operands in instr_t (#6304) The RISC-V compressed extension (RVC) contains some instructions that have implicit operands, e.g.: C.JR expands to jalr x0, 0(rs1); C.JALR expands to jalr x1, 0(rs1). This patch puts these implicit operands into instr_t. (2 years ago)
i#1556: Mark x86 rep instructions as predicated. (#6284) Issue: #1556 (2 years ago)
i#5320: Fix error handling in x86 decode_modrm (#5324) Fix the following error reported during fuzzing: core/ir/x86/decode.c:1758:36: runtime error: division of -2147483648 by -1 cannot be represented in type 'int'; in decode_modrm core/ir/x86/decode.c:1758:36; in decode_operand core/ir/x86/decode.c; in decode_common core/ir/x86/decode.c:2574:18; in decode core/ir/x86/decode.c:2695:12. Adds a test case to api.ir. Co-authored-by: Derek Bruening <bruening@google.com> Fixes #5320 (4 years ago)
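The class of error in the sanitizer report above (INT_MIN divided by -1 overflows `int`, which is undefined behavior in C) can be guarded against as sketched below. This is only an illustration of the general technique; the helper name `checked_div` is hypothetical, and the source does not say how the actual fix in decode_modrm was written.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: divides num by den only when the result is
 * representable in int. Returns false (leaving *out untouched) for
 * division by zero and for INT_MIN / -1. */
bool
checked_div(int num, int den, int *out)
{
    assert(out != NULL);
    if (den == 0)
        return false;
    /* INT_MIN / -1 would be INT_MAX + 1, which does not fit in int. */
    if (num == INT_MIN && den == -1)
        return false;
    *out = num / den;
    return true;
}
```

A decoder fed arbitrary bytes can then treat a `false` return as an invalid encoding rather than performing the division.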
i#5979 decode: Rename ud2a/ud2b to ud2/ud1 and decode ud1 operands (#6087) The opcodes DR calls ud2a and ud2b are more commonly known (and documented by Intel) as ud2 and ud1, respectively. Rename them, preserving the old operand constants and instruction creation macros through preprocessor defines. Additionally, ud1 actually takes two operands. Fix the decode table for them. Fixes #5979 (3 years ago)
i#3581 SHA/MPX/PT support: Add support for Intel PT instructions. (#3797) Adds the Intel PT instruction ptwrite. Adds tests for above. The opcode has been checked against llvm-mc and binutils/gas/objdump. Fixes #3581 (6 years ago)
i#4755: Add EVEX encodings of missing vmovq variants. (#4756) These instruction variants, which do not have an EVEX.W=0 vmovd equivalent, were missed when EVEX instructions were added. The XMM/mem forms of these instructions are redundant with existing encodings and can't be tested via the normal encoding API. Fixes #4755 (5 years ago)
i#1312 AVX-512 support: Add all v(p)gather|scatter opcodes. (#3698) Adds the evex-promoted opcodes vpgatherdd, vpgatherdq, vpgatherqd, vpgatherqq, vgatherdps, vgatherdpd, vgatherqps, vgatherqpd. Adds the new AVX-512 opcodes vpscatterdd, vpscatterdq, vpscatterqd, vpscatterqq, vpscatterdps, vpscatterdpd, vpscatterqps, vpscatterqpd, vgatherpf0dpd, vgatherpf0dps, vgatherpf0qpd, vgatherpf0qps, vgatherpf1dpd, vgatherpf1dps, vgatherpf1qpd, vgatherpf1qps, vscatterpf0dpd, vscatterpf0dps, vscatterpf0qpd, vscatterpf0qps, vscatterpf1dpd, vscatterpf1dps, vscatterpf1qpd, vscatterpf1qps. Adds an internal binary compatible flag index_reg_is_zmm to opnd_t. Adds two new decode table flags, vsiby and vsibz, in order to constrain the VSIB index register's size, instead of adding new types TYPE_VSIB. VSIB's size is already used for the element's size. Splits the VSIBX|Y|Z macros into 6|15|31 in order to test different index registers. Adds support to encoder/decoder for AVX-512 extended VSIB index registers. Adds tests for above. Adds to existing AVX2 gather/scatter tests. Opcodes have been checked against llvm-mc, binutils/gas/objdump and capstone. Please note that Vol. 1 5.19 INTEL® ADVANCED VECTOR EXTENSIONS 512 (INTEL® AVX-512) mentions the instructions vscatterdd/dq/qd/qq, but the instructions are not mentioned further in the instruction tables. This is likely a typo for vpscatterdd/dq/qd/qq. This patch concludes the AVX-512 encoder/decoder extensions. This is preliminary and further missed instructions may be added, if any. Further testing will be enabled. Also, support for compressed displacements and VSIB address computation hasn't been added yet. Please note that the DynamoRIO runtime does not fully support AVX-512 processes just yet. One binutils/objdump bug has been filed: https://sourceware.org/bugzilla/show_bug.cgi?id=24719 Issue: #1312 (6 years ago)
i#1312 AVX-512 support: Add AVX-512 VEX encoded scalar opmask instructions. (#3558) Adds the AVX-512 VEX encoded scalar opmask instructions and extends the existing VEX tables. Unfortunately there are no Appendix A/B style documented Intel manuals available for AVX-512 in the same format as the existing x86 instruction sets. Therefore, the introduced type name 'K' is invented by us, based on the Intel AVX-512 instruction descriptions. Makes necessary modifications in the decoder to support VEX encoded AVX-512 instructions, specifically decoding and encoding more cases of the vex W opcode extension bit. Adds support to the encoder for new VEX encodings. Fixes a bug to encode the primary opcode byte in 3-byte VEX versions. Adds support for prefix_ext indirection after vex_ext in the decoder. Adds new operand types TYPE_K_REG, TYPE_K_MODRM, TYPE_K_MODRM_R, TYPE_K_VEX, TYPE_K_EVEX. Adds all INSTR_CREATE macros for the new instructions. Adds all test macros to x86 api.ir test. Adds raw encodings decoder test to x86 common.decode test. Fixes existing disassemble test. Issue: #1312 (7 years ago)
Expand sizes of non-contiguous or offset half-width operands (#6054) Encoding the unpack-and-interleave instructions and vmovddup with the half-width operands is not in general correct. While the *number* of bytes taken from the operand is correct, those bytes are not in general taken in a contiguous sequence starting from offset zero within the operand. A tool using this operand information will capture the wrong instruction inputs. movddup, unpcklps, and unpcklpd actually do degenerate to true half-width operands, as they start at offset zero and only use the first eight bytes. The other opcodes, including the AVX versions, however, use bytes that are offset from zero, that are non-contiguous, or both. Their representation is changed to be full-width. For prior art see e.g. movsldup/movshdup, which also take a non-contiguous set of bytes and are encoded with full-width operands. (3 years ago)
i#1312 AVX-512 support: Add vshuf*, vpinsr*, vpextr*, vpavg*, vpabs*, vpalignr, v(p)broadcast* opcodes. (#3686) Adds the evex-promoted opcodes vpinsrw, vpextrw, vpextrb, vpextrd, vpextrq, vpavgb, vpavgw, vpabsb, vpabsw, vpabsd, vpsadbw, vpalignr, vbroadcastss, vbroadcastsd, vpbroadcastb, vpbroadcastw, vpbroadcastd, and vpbroadcastq. Adds the new AVX-512 opcodes vshuff32x4, vshuff64x2, vshufi32x4, vshufi64x2, vpinsrb, vpinsrd, vpinsrq, vpabsq, vbroadcastf32x2, vbroadcastf32x4, vbroadcastf64x2, vbroadcastf32x8, vbroadcastf64x4, vbroadcasti32x2, vbroadcasti32x4, vbroadcasti64x2, vbroadcasti32x8, and vbroadcasti64x4. Please note that vpbroadcastb, vpbroadcastw, vpbroadcastd, vpbroadcastq support an additional general register version in AVX-512. Adds the new naturally AVX-512 extended type codes Vfs, Vfd, Hfs, Hfd, Wfs, Wfd. Adds missing - but not functionally necessary - 'evex' flag to a few AVX-512 decode table entries. Removes a superfluous decode table entry that had been added in 2a732a8. Adds the new sizes OPSZ_4_of_32_evex64 and OPSZ_8_of_32_evex64. Adds tests for above as well as for some existing additional AVX derivatives of some of the opcodes above. Opcodes have been checked against llvm-mc, binutils/gas/objdump and capstone. Two bugs on binutils/objdump have been filed: https://sourceware.org/bugzilla/show_bug.cgi?id=24691 https://sourceware.org/bugzilla/show_bug.cgi?id=24700 Issue: #1312 (6 years ago)
i#6136: Add AVX-512 VPOPCNTDQ support. (#6156) Adds the new AVX512_VPOPCNTDQ instructions, which were present on certain Xeon Phis but broadly introduced in Ice Lake. Fixes #6136 (2 years ago)
i#1312 AVX-512 support: Add missing instr_create macro for vpinsrq. (#3816) The macro was missing both for AVX and AVX-512. The existing AVX roundtrip test was incorrectly using the vpinsrd version. Adds tests for AVX-512. Issue: #1312 (6 years ago)
i#4610: Fix VS2017 OOM by splitting avx512 test files (#4611) Splits off about half of suite/tests/api/ir_x86_4args_avx512_evex_mask_B.h into a new _C file to avoid an out-of-memory in VS2017 on Github Actions. The recent AVX-512 broadcast additions for #4534 in PR #4577 seem to have pushed it over the edge. Issue: #4534, #4610, #4131, #4549 Fixes #4610 (5 years ago)
i#5696: Fix EVEX-prefixed vsqrtss/vsqrtsd merge operand sizes (#5696) The EVEX-prefixed vsqrtss/vsqrtsd entries should take three-quarter and half width merge operands respectively, like their VEX-prefixed counterparts. This was an oversight when I wrote #4739. (3 years ago)
i#4534: Implement AVX-512 broadcasts (#4577) To make room for broadcast forms of AVX-512 instructions, modify the evex_W_ext table to vary on EVEX.b as well. All slots are filled with invalid entries. Because all EVEX-prefixed instructions have at least two byte opcodes, we can steal the OPCODE_TWOBYTES bit to indicate the value of EVEX.b. (Presumably something similar could be done for VEX-prefixed instructions if it were ever desired.) Largely mechanical changes: instructions that accept broadcast operands get a new entry in their EVEX.b=1 slot with the appropriate memory operand. OP_CONTD decoder table entries now need to support a next entry. Note that some of these instructions previously had incorrect operand sizes. That's also fixed now. No INSTR_CREATE macro was present for some instructions and thus there were no tests. Tests are added, and errors in the decode table for vpsllvw (paralleling vpsrlvw) and vpsllvd are fixed. With this I believe broadcasts are implemented for every instruction DynamoRIO currently supports. There are additional broadcasts on instructions that DR does not yet support still outstanding. Fixes #4534. vpermb is distinguished from vpermw by EVEX.W. Correct the decode table entries, rejigger everything else around the new entries, and add API tests for the new instruction. Fixes #4539. (5 years ago)
i#3978: Fix decode_cti bug on particular (e)vex byte sequences (#3979) Fixes a bug in decode_cti() with a (e)vex prefix with its 2nd or 3rd byte also looking like the first byte of a (e)vex prefix where the instruction is considered invalid, crashing the application. Adds testing of decode_cti() to api.ir by exporting decode_cti() in DEBUG and BUILD_TESTS builds. Adds a test case that triggers the observed bug without the fix. Fixes #3978 (6 years ago)
i#4117: Add missing dr_standalone_init() and _exit() calls (#4118) The filed bug is about raw2trace_directory using the DR API without initializing DR, but I expanded this to clean up the whole code base: Adds dr_standalone_init() and _exit() to raw2trace_directory_t. This fixes an assert when calling dr_open_file() due to uninitialized state. Adds {dr_,}standalone_exit() calls to all cases where _init() is called, except where we can't do that (drdecode heap usage). For the C++ view and opcode_mix tools, uses a first-field destructor to ensure dr_standalone_exit() is called after any DR heap used by other fields is freed. Fixes several bugs where IR cleanup calls were missing in the api.ir test, revealed by dr_standalone_exit's unfreed-memory check. Fixes #4117 (6 years ago)
i#1702 CC in IT-blocks: handle cbrs in IT-blocks (#2537) Previously, if a cbr was inserted in the middle of an IT-block we would fail to maintain the property that cbr's may only terminate an IT-block. We now maintain this property. This allows us to predicate clean calls by emitting a conditional branch, even in IT-blocks, if the auto-predication field we add here is set. Adds instrlist_{set,get}_auto_predicate() methods: instrlist_set_auto_predicate() maintains that all meta instructions inserted into the instrlist will be predicated correctly; modifies dr_insert_clean_call() to emit correct assembly in the presence of auto predication. Note that this does not fix i#1723 until we have added auto predication to relevant clients and tools. Fixes #1702 (8 years ago)
i#3044 AArch64 SVE codec: add AND, BIC, EOR, ORN, ORR (imm) (#5789) This patch adds the appropriate macros, tests and codec entries to encode the following variants: AND <Zdn>.<T>, <Zdn>.<T>, #<const>; BIC <Zdn>.<T>, <Zdn>.<T>, #<const>; EOR <Zdn>.<T>, <Zdn>.<T>, #<const>; ORN <Zdn>.<T>, <Zdn>.<T>, #<const>; ORR <Zdn>.<T>, <Zdn>.<T>, #<const>. Issue: #3044 (3 years ago)
i#2626: Make dis-a64*.txt test order stable (#5923) Tests were previously sorted by mnemonic only, so instructions with multiple variants could change order when the file is added to and re-sorted. The order should now be stable, which should cut back on code review noise. This also reorders the test files to match the new order used by the script. Issue: #2626 (3 years ago)
i#2039 trace trim, part 1: Add post-attach and pre-detach events (#5653) Adds two new DR events: post-attach and pre-detach. These help tools align instrumentation with points where all threads are under DR control, avoiding uneven thread execution during the incremental staggered attach and detach processes. Adds some sanity tests that the events are called where they should be, for both internal and external attach. Adding them to drmemtrace will be done separately. Issue: #2039 (3 years ago)
i#1312 AVX-512 support: AVX-512 compiled client handling, get_xmm_vals() support. (#3751) Compile time detection has been added for whether a client was compiled with AVX-512. In this case, and if DynamoRIO is not deployed via "earliest" inject, we initialize the lazy AVX-512 code detection to true, in order to prevent the client from clobbering potential AVX-512 application state. Adds two new tests to check for the initialized state of AVX-512 lazy code detection. Provides support for zmm context in get_xmm_vals(). Provides support to dump zmm SIMD mcontext state. No tests for this have been added as this is used for log files. Issue: #1312 (6 years ago)
Share duplicate CHECK macros in client-interface tests (#5363) Moves the CHECK macro defined in quite a few client-interface tests into client_tools.h for sharing. (4 years ago)
i#4640: Pass static "client" faults to the app (#4661) A fault in the executable that has a statically-linked client and DR is currently reported as a client crash, even though we have no way to distinguish client code from app code in such a situation. We change that here to invoke execute_native_handler(), which solves problems where the application has a fault handler in place and a tool or other mechanism that generates faults during DR init or other points. Updates the has-dcontext-but-during-init case from PR #4662 (issue #1921) to use the stored app sigstack and not query the kernel and find DR's sigstack. The api.static_crash exercises this path and we updated its template here, but it uses -unsafe_crash_process to trigger it. I tried to reproduce an incoming signal trigger by making a static version of api.detach_signal but it requires a fault-generating allocator replacement (hit by droption and other places that use the system allocator) or something similar, which is non-trivial to replicate in a small test. I did test on the original proprietary application. Fixes #4640 (5 years ago)
i#2661: Clear options on detach (#3120) Resets the dynamo_options to their default values on detach and standalone_exit to ensure that values from any previous attaches do not get carried over to the current attach. Comes with a unit test which fails without these changes. Fixes #2661 Co-authored-by: Derek Bruening <bruening@google.com> (4 years ago)
i#2157 re-attach: add best-effort re-attach support (#2182) While there are many globals that are not reset, we take a best-effort approach here and reset the ones that matter, allowing us to re-attach with a memtrace client and static DR. Portions of this were initially based on https://codereview.appspot.com/13314047/ from Peter Goodman. For DO_ONCE, an incremented counter is used. For numerous other globals, at exit time if doing_detach is set we reset them to NULL. For locks, we simply reset count_times_acquired but leave the deleted field as it does not matter much (part of the whole theme here: this may never be rock-solid for unusual option combinations but our goal is supporting the common case). For drreg, with the new multi-init feature we have to zero the options at exit time. Adds several tests by adding loops around existing start/stop tests, but stops short on burst_threads until i#2175 is fixed. Review-URL: https://codereview.appspot.com/318520043 (9 years ago)
i#3387 DR lib bounds: use linker vars if available in LINUX+STATIC_LIBRARY. (#3487) Introduce the ability to set the dynamorio_so_start and dynamorio_so_end linker variables even when being compiled as a STATIC_LIBRARY. If not specified, the old functionality is used. Includes tests which verify that, for some specific way an application can mix up their maps files, the application passes or fails depending on whether the linker variables are provided. This should be a no-op for Windows, Mac, and non-static DR. Issue: #3387 (7 years ago)
i#3538 flaky maps mixup: mark static_maps_mixup_novars as FLAKY (#3587) Currently this just causes some CIs to ignore failures in this test. xref i#2204 for having ctest actually re-run these tests. Issue: #3538 (7 years ago)
i#975 static DR: add dr_standalone_exit() (#3708) Adds dr_standalone_exit() with support for subsequent DR use, including full attach. This is a better solution than setup;detach support added in de99d45 for the use case of performing DR decode/encode with DR statically linked (which precludes using drdecodelib), followed by a separate use of DR for instrumentation. Updates the test of this use case in api.static_noclient. Issue: #975 (6 years ago)
i#975 static DR: client existence and loading Implements the proposed model for dynamorio_static: it allows any part of the app to act as the client, but also looks for and invokes a dr_init or dr_client_main routine, supporting multiple methods of including a client in the application. Switches to accessing the client lib via the system loader when it's the same as the application on Windows (the code for Linux was added in the past). Replaces checks for !IS_INTERNAL_STRING_OPTION_EMPTY(client_lib) with a new macro CLIENTS_EXIST(), which is set to true for STATIC_LIBRARY. Sets -code_api to true for STATIC_LIBRARY by default, and sets its dependences via moving that code into options.c. For STATIC_LIBRARY, "loads" the app itself as the client, and makes the absence of an init routine non-fatal. Updates the static_startstop test and adds two new tests: static_noclient and static_noinit. Review-URL: https://codereview.appspot.com/302630043 (9 years ago)
i#2876 clang-format: massive reformat of every file (#3093) Massive clang-format of every C or C++ source file (*.h, *.c, *.cpp), except for: + third_party/ + ext/drsysm/libelftc/include/ + ext/drsyms/demangle.cc This is a single, large commit by design to present a single history disruption point and bring the code base into a consistent format. clang-format version 6.0 was used. Tweaks the clang-format rules to indent 4 after pre-processor hashes, to allow single-line case labels, and align trailing comments. Disables the pp_indent vera style check as we changed the indent rule and clang-format now covers it. Leaves the other checks, even though some are redundant (they found clang-format errors). Moves C++ token defines from globals_shared.h and globals.h to a new header, core/lib/c_defines.h, to avoid a clang-format parsing error where it fails to identify the include guard. Adds genapi.pl removal of extra indentation inside API_EXPORT_ONLY and CLIENT_INTERFACE regions. Adds exclusions around large regions we don't want to format, in these files: + core/win32/syscallx.h + core/arch/x86/decode_table.c + core/arch/arm/table_a32_pred.c + core/arch/arm/table_a32_unpred.c + core/arch/arm/table_t32_16.c + core/arch/arm/table_t32_16_it.c + core/arch/arm/table_t32_base.c + core/arch/arm/table_t32_coproc.c + core/arch/arm/table_encode.c Adds smaller exclusions to work around clang-format bugs: + Several missing break-after-return-type + Several >90-char lines + Misc scattered issues, all listed in #2876 Issue: #2876 (7 years ago)
i#297 client sideline thread: make static_sideline less flaky (#2377) Added more synchronization in static_sideline test to make it less flaky. Fixed a build error for vps-release-external build. (9 years ago)
i#2311 sigmask: Reroute blocked signal if unblocked thread exists (#5472) Adds a solution to the problem of the real signal mask having a signal unblocked in a thread where the app has it blocked, while another app thread has it unblocked, when the kernel delivers a group-wide signal. To avoid checking all threads on every blocked signal, which requires heavyweight locks that can't be acquired from our signal handler, we maintain a count of unmasked threads for each signal number. This is shared across the handler group and is updated using atomic operations. On arrival of a signal blocked by the app, we first detect whether it is whole-group-directed, by examining the si_code values. If it is, and if the unmasked thread count is > 0, we want to re-route it: but we can't acquire the locks here, so we mark it unblocked, unlink the interrupted fragment (if any), and head back through dispatch with the signals_pending flag set. Once in the pending-signal code from dispatch, we grab the thread_initexit_lock and walk the threads, looking for a thread with this signal unblocked. For synchronization with other threads' masks, we can't have a mutex used on every access as we need lock-free reads from our signal handler. Instead, we have a mutex used only on writes (only done by the owning thread) and on reads from other threads (just this rerouting code). Once we find a target thread, we send it a new signal and drop the current one. This will properly interrupt any syscall. This should be an asynchronous signal, so a new later copy should be fine. We handle sigqueue's extra value by using SYS_rt_tgsigqueueinfo where available. Adds a new test linux.sigmask which tests both kill() and sigqueue(). Enables previously-disabled alarms in the api.static_signal to further test the fix. Future improvements include keeping the actual mask matching the app's mask, if DR and clients don't care about that signal: which will require an API addition. Issue: #2311 (4 years ago)
i#3348 sym conflicts: Rename one-word global symbols, part 2 (#3414) Renames more single-word global symbols to help reduce the chance of name conflicts: + s/initstack/d_r_initstack/ + s/loginst/d_r_loginst/ + s/logopnd/d_r_logopnd/ + s/logtrace/d_r_logtrace/ + s/mangle/d_r_mangle/ + s/mangle/d_r_mangle/ + s/MD5Init/d_r_md5_init/ + s/MD5Final/d_r_md5_final/ + s/MD5Update/d_r_md5_update/ + s/notify/d_r_notify/ + s/stackdump/d_r_stackdump/ + s/stats/d_r_stats/ Issue: #3348 (7 years ago)
i#3348 sym conflicts: support not hiding symbols in static DR lib (#3411) Adds a new dynamorio_static_nohide library which is only built when tests are built. It is used for sanity checks on global symbol names for better interoperability with toolchains where objcopy --localize-hidden is a pain point. Renames core/CMake_globalize_pic_thunks.cmake to core/CMake_finalize_static_lib.cmake to better reflect the multiple steps it is taking beyond just the thunks. Adds a new test api.static_symbols of linking dynamorio_static_nohide. Adds a new post-build step that runs a new script CMake_symbol_check.cmake which looks for likely-to-conflict symbols in dynamorio_static_nohide. Issue: #3348 (7 years ago)
i#4131: Set up Windows Github Actions (#4586) Adds three separate jobs on Windows: 32-bit debug with tests, 64-bit debug with tests, and a job that creates 3 builds without tests: release-32, release-64, and vps-32-debug. Adds corresponding flags to the script layers to accomplish this build splitting. We target a VS2017 Win10-1607 image to match what we had on Appveyor. Modifies the runsuite_wrapper.pl script to *not* fork, since there is no Cygwin perl available by default. Instead, we tee to a file. Updates checks for Cygwin to apply to native Windows perl as well. Includes attempts to improve two tests: + Add error code printing for drx-test + Generalize api.symtest golden output to handle extra operators found Other failing tests were either already on the ignore list or we add them here. The long list of tests ignored for #4058 is culled to remove tests never observed to fail on GA. Removes APPVEYOR_PULL_REQUEST_NUMBER-driven build skips now that we have a separate build job. Sets a CI_TRIGGER env var and uses it to shrink 32-bit tests in the same manner as before for Appveyor PR runs. This is based on the similar work for Dr. Memory: https://github.com/DynamoRIO/drmemory/pull/2326 Issue: #4131 (5 years ago)
i#5786: Add precise clean call mangling identification (#5791) Adds new labels delimiting clean call sequences. Converts into a translation record flag when storing translations. Uses the new labels and flag to precisely identify clean call mangling, replacing the previous scheme which incorrectly thought mangled tool pc-relative was a clean call, resulting in incorrect translations and crashes. Adds a test case to api.detach_state by adding a client (by converting it to use static DR) which inserts a pc-relative load. This reproduces the crash on detach, and is fixed with this fix. The added instrumentation caused periodic detach failures which were solved by setting the translation and adding a restore-state event: i#4232 covers trying to improve the situation. Adds a new instr_t.offset field. Stops using instr_t.note to hold encoding offsets for pc-relative operands. Adds a new field instr_t.offset which is used for this purpose. This leaves note values in place across encodings, which is needed for new clean call marking labels and also simplifies rseq handling code. This instr_t field is a compatibility break and we bump the version and OLDEST_COMPATIBLE_VERSION here to 990. Updates dr_get_note docs. Augments logging of xl8 info with new flag info. Reduces DR_NOTE_FIRST_RESERVED to give DR more reserved labels. This is another compatibility break, while at it. Fixes several issues hit in tests that happened to trigger on the heap bucket size and other changes: + Fixes a rank order violation at loglevel 5: xref #1649 + Writes real xstate_bv into signal frame when setting the xstate context to avoid lazy AVX restore problems. + Tweaks the thread_churn test to work around non-linearities. Issue: #5786, #4232 Fixes #5786 (3 years ago)