| File | Last commit | Last updated |
|---|---|---|
| | [libc] Add float type and flag for nearest_integer to enable SSE4.2. Add float type and flag for nearest integer to automatically test with and without SSE4.2 flag. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D129916 | 3 years ago |
| | [libc][math] Fix broken compilation due to `__builtin_inf/nan` functions. | 3 years ago |
| | [libc] Add float type and flag for nearest_integer to enable SSE4.2. Add float type and flag for nearest integer to automatically test with and without SSE4.2 flag. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D129916 | 3 years ago |
| | [libc] apply new lint rules This patch applies the lint rules described in the previous patch. There was also a significant amount of effort put into manually fixing things, since all of the templated functions, or structs defined in /spec, were not updated and had to be handled manually. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D114302 | 4 years ago |
| | [libc] Implement sinf function that is correctly rounded to all rounding modes. Implement sinf function that is correctly rounded to all rounding modes. - We use a simple range reduction for `pi/16 < \|x\|` : Let `k = round(x / pi)` and `y = (x/pi) - k`. So k is an integer and `-0.5 <= y <= 0.5`. Then `sin(x) = sin(y*pi + k*pi) = (-1)^(k & 1) * sin(y*pi) ~ (-1)^(k & 1) * y * P(y^2)` where `y*P(y^2)` is a degree-15 minimax polynomial generated by Sollya with: `> P = fpminimax(sin(x*pi)/x, [\|0, 2, 4, 6, 8, 10, 12, 14\|], [\|D...\|], [0, 0.5]);` - Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master) on Ryzen 1700: Before this patch (not correctly rounded): `` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.892 System LIBC reciprocal throughput : 25.559 LIBC reciprocal throughput : 29.381 `` After this patch (correctly rounded): `` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.896 System LIBC reciprocal throughput : 25.740 LIBC reciprocal throughput : 27.872 LIBC reciprocal throughput : 20.012 (with -msse4.2 flag) LIBC reciprocal throughput : 14.244 (with -mfma flag) `` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D123154 | 3 years ago |
| | [libc] apply new lint rules This patch applies the lint rules described in the previous patch. There was also a significant amount of effort put into manually fixing things, since all of the templated functions, or structs defined in /spec, were not updated and had to be handled manually. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D114302 | 4 years ago |
| | [libc] Initial support for darwin-aarch64. Add initial support for darwin-aarch64 (macOS M1). Some differences compared to linux-aarch64: - math.h defined math_errhandling by the compiler builtin `__math_errhandling()` but Apple Clang 13.0.0 on M1 does not support `__math_errhandling()` builtin as a macro function or a constexpr function. - math.h defines UNDERFLOW and OVERFLOW macros. - Besides 5 usual floating point exceptions: FE_INEXACT, FE_UNDERFLOW, FE_OVERFLOW, FE_DIVBYZERO, and FE_INVALID, fenv.h also has another floating point exception: FE_FLUSHTOZERO. The corresponding trap for FE_FLUSHTOZERO in the control register is at the different location compared to the status register. - FE_FLUSHTOZERO exception flag cannot be raised with the default CPU floating point operation mode. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D120914 | 4 years ago |
| | [libc] Implement double precision FMA for targets without FMA instructions. Implement double precision FMA (Fused Multiply-Add) for targets without FMA instructions using `__uint128_t` to store the intermediate results. Reviewed By: michaelrj, sivachandra Differential Revision: https://reviews.llvm.org/D124495 | 3 years ago |
| | [libc][nfc] update get_explicit_mantissa The get_explicit_mantissa function returns the mantissa of an FPBits floating point value with the implicit leading 1, if appropriate. This function existed previously, but did not handle non-normal numbers properly. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D129241 | 3 years ago |
| | [libc] Add Uint128 type as a fallback when `__uint128_t` is not available. Also, the unused specializations of `__int128_t` have been removed. Differential Revision: https://reviews.llvm.org/D128304 | 3 years ago |
| | [libc][math] fmod/fmodf implementation. This is a implementation of find remainder fmod function from standard libm. The underline algorithm is developed by myself, but probably it was first invented before. Some features of the implementation: 1. The code is written on more-or-less modern C++. 2. One general implementation for both float and double precision numbers. 3. Spitted platform/architecture dependent and independent code and tests. 4. Tests covers 100% of the code for both float and double numbers. Tests cases with NaN/Inf etc is copied from glibc. 5. The new implementation in general 2-4 times faster for “regular” x,y values. It can be 20 times faster for x/y huge value, but can also be 2 times slower for double denormalized range (according to perf tests provided). 6. Two different implementation of division loop are provided. In some platforms division can be very time consuming operation. Depend on platform it can be 3-10 times slower than multiplication. Performance tests: The test is based on core-math project (https://gitlab.inria.fr/core-math/core-math). By Tue Ly suggestion I took hypot function and use it as template for fmod. Preserving all test cases. `./check.sh <--special\|--worst> fmodf` passed. `CORE_MATH_PERF_MODE=rdtsc ./perf.sh fmodf` results are GNU libc version: 2.35 GNU libc release: stable 21.166 <-- FPU 51.031 <-- current glibc 37.659 <-- this fmod version. | 3 years ago |
| | [libc] Replace type punning with bit_cast Although type punning is defined for union in C, it is UB in C++. This patch introduces a bit_cast function to convert between types in a safe way. This is necessary to get llvm-libc compile with GCC. This patch is extracted from D119002. Differential Revision: https://reviews.llvm.org/D119145 | 4 years ago |
| | [libc] Initial support for darwin-aarch64. Add initial support for darwin-aarch64 (macOS M1). Some differences compared to linux-aarch64: - math.h defined math_errhandling by the compiler builtin `__math_errhandling()` but Apple Clang 13.0.0 on M1 does not support `__math_errhandling()` builtin as a macro function or a constexpr function. - math.h defines UNDERFLOW and OVERFLOW macros. - Besides 5 usual floating point exceptions: FE_INEXACT, FE_UNDERFLOW, FE_OVERFLOW, FE_DIVBYZERO, and FE_INVALID, fenv.h also has another floating point exception: FE_FLUSHTOZERO. The corresponding trap for FE_FLUSHTOZERO in the control register is at the different location compared to the status register. - FE_FLUSHTOZERO exception flag cannot be raised with the default CPU floating point operation mode. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D120914 | 4 years ago |
| | [libc] apply new lint rules This patch applies the lint rules described in the previous patch. There was also a significant amount of effort put into manually fixing things, since all of the templated functions, or structs defined in /spec, were not updated and had to be handled manually. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D114302 | 4 years ago |
| | [libc] Support 32-bit ARM platform tests Set LONG_DOUBLE_IS_DOUBLE, add ifdefs for 128-bit integer types Differential Revision: https://reviews.llvm.org/D124204 | 4 years ago |
| | [libc] Automatically add -mfma flag for architectures supporting FMA. Detect if the architecture supports FMA instructions and if the targets depend on fma. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D123615 | 3 years ago |
| | [libc] add uint128 implementation Some platforms don't support proper 128 bit integers, but some algorithms use them, such as any that use long doubles. This patch modifies the existing UInt class to support the necessary operators. This does not put this new class into use, that will be in followup patches. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D124959 | 4 years ago |
| | [libc][math] Fix broken compilation due to `__builtin_inf/nan` functions. | 3 years ago |
| | [libc] Implement sinf function that is correctly rounded to all rounding modes. Implement sinf function that is correctly rounded to all rounding modes. - We use a simple range reduction for `pi/16 < \|x\|` : Let `k = round(x / pi)` and `y = (x/pi) - k`. So k is an integer and `-0.5 <= y <= 0.5`. Then `sin(x) = sin(y*pi + k*pi) = (-1)^(k & 1) * sin(y*pi) ~ (-1)^(k & 1) * y * P(y^2)` where `y*P(y^2)` is a degree-15 minimax polynomial generated by Sollya with: `> P = fpminimax(sin(x*pi)/x, [\|0, 2, 4, 6, 8, 10, 12, 14\|], [\|D...\|], [0, 0.5]);` - Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master) on Ryzen 1700: Before this patch (not correctly rounded): `` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.892 System LIBC reciprocal throughput : 25.559 LIBC reciprocal throughput : 29.381 `` After this patch (correctly rounded): `` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.896 System LIBC reciprocal throughput : 25.740 LIBC reciprocal throughput : 27.872 LIBC reciprocal throughput : 20.012 (with -msse4.2 flag) LIBC reciprocal throughput : 14.244 (with -mfma flag) `` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D123154 | 3 years ago |
| | [libc] Fix nested namespace issues with multiply_add.h. The FMA header was included inside namespaces in multiply_add.h. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D123539 | 4 years ago |
| | [libc] Add float type and flag for nearest_integer to enable SSE4.2. Add float type and flag for nearest integer to automatically test with and without SSE4.2 flag. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D129916 | 3 years ago |
| | [libc] Refactor sqrt implementations and add tests for generic sqrt implementations. Re-apply https://reviews.llvm.org/D118173 with fix for aarch64. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D118433 | 4 years ago |
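
The range reduction described in the `sinf` commit message above can be illustrated with a minimal sketch. This is not the LLVM libc implementation: the function name `sin_by_pi_reduction` is hypothetical, it works in `double` rather than `float`, and `std::sin(y * pi)` stands in for the Sollya-generated polynomial `y * P(y^2)`, whose coefficients are not given in the listing. It only shows how `k` and `y` are derived and how the parity of `k` fixes the sign.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper illustrating the reduction from the commit message:
//   k = round(x / pi), y = x / pi - k, so -0.5 <= y <= 0.5, and
//   sin(x) = (-1)^(k & 1) * sin(y * pi).
// Assumption: |x| is small enough that k fits in a 64-bit integer.
double sin_by_pi_reduction(double x) {
  const double pi = 3.14159265358979323846;
  const double scaled = x / pi;
  const double k = std::round(scaled);  // nearest integer to x / pi
  const double y = scaled - k;          // reduced argument in [-0.5, 0.5]

  // Stand-in for the minimax polynomial y * P(y^2) used by the real patch.
  const double core = std::sin(y * pi);

  // An odd k flips the sign, since sin(t + k*pi) = (-1)^k * sin(t).
  const bool k_is_odd = (static_cast<std::int64_t>(k) & 1) != 0;
  return k_is_odd ? -core : core;
}
```

For moderate inputs the result should agree with `std::sin(x)` to within a few units of rounding error. The real patch gets its correct rounding from the polynomial evaluation and from handling `x / pi` more carefully than this sketch does.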