| File | Last commit | Last updated |
|---|---|---|
[MLIR] Add attributes no_unwind and will_return to the LLVMIR dialect (#98921) And testing. These are being added to be used in the GPU to LLVM SPV pass. --------- Co-authored-by: Victor Perez <victor.perez@codeplay.com> | 1 year ago | |
[MLIR][LLVM] Remove typed pointer remnants from target tests (#71210) This commit removes all LLVM dialect typed pointers from the target tests. Typed pointers have been deprecated for a while now and it's planned to soon remove them from the LLVM dialect. Related PSA: https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502 | 2 years ago | |
[mlir][Pass] Include anchor op in -pass-pipeline In D134622 the printed form of a pass manager is changed to include the name of the op that the pass manager is anchored on. This updates the -pass-pipeline argument format to include the anchor op as well, so that the printed form of a pipeline can be directly passed to -pass-pipeline. In most cases this requires updating -pass-pipeline='pipeline' to -pass-pipeline='builtin.module(pipeline)'. This also fixes an outdated assert that prevented running a PassManager anchored on 'any'. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D134900 | 3 years ago | |
[mlir][ArmNeon] Adds Arm Neon SMMLA, UMMLA, and USMMLA Intrinsics (#80511) This adds the SMMLA, UMMLA, and USMMLA intrinsics to the Neon dialect, bringing it in line with the SVE dialect. These ops enable matrix multiply-accumulate instructions with two 2x8 matrix inputs of respective signage into a 2x2 32-bit integer accumulator. This is equivalent to performing an 8-way dot product per destination element. Op details: https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=mmla | 2 years ago | |
[mlir][ArmSME] Add arm_sme.intr.cnts(b|h|w|d) intrinsics (#77319) This adds MLIR versions of the Arm streaming vector length intrinsics. These allow reading the streaming vector length regardless of the streaming mode. | 2 years ago | |
[mlir][ArmSME] Support 2-way widening outer products (#78975) This patch introduces support for 2-way widening outer products. This enables the fusion of 2 'arm_sme.outerproduct' operations that are chained via the accumulator into a 2-way widening outer product operation. Changes: - Add 'llvm.aarch64.sme.[us]mop[as].za32' intrinsics for 2-way variants. These map to instruction variants added in SME2 and use different intrinsics. Intrinsics are already implemented for widening variants from SME1. - Adds the following operations: - fmopa_2way, fmops_2way - smopa_2way, smops_2way - umopa_2way, umops_2way - Implements conversions for the above ops to intrinsics in ArmSMEToLLVM. - Adds a pass 'arm-sme-outer-product-fusion' that fuses 'arm_sme.outerproduct' operations. For a detailed description of these operations see the 'arm_sme.fmopa_2way' description. The reason for introducing many operations rather than one is the signed/unsigned variants can't be distinguished with types (e.g., ui16, si16) since 'arith.extui' and 'arith.extsi' only support signless integers. A single operation would require this information and an attribute (for example) for the sign doesn't feel right if floating-point types are also supported where this wouldn't apply. Furthermore, the SME FP8 extensions (FEAT_SME_F8F16, FEAT_SME_F8F32) introduce FMOPA 2-way (FP8 to FP16) and 4-way (FP8 to FP32) variants but no subtract variant. Whilst these are not supported in this patch, it felt simpler to have separate ops for add/subtract given this. | 2 years ago | |
[mlir][ArmSVE] Add arm_sve.psel operation (#95764) This adds a new operation for the SME/SVE2.1 psel instruction. This allows selecting a predicate based on a bit within another predicate, essentially allowing for 2-D predication. Informally, the semantics are: `%pd = arm_sve.psel %p1, %p2[%index] : vector<[4]xi1>, vector<[8]xi1>` => `if p2[index % num_elements(p2)] == 1: pd = p1 : type(p1) else: pd = all-false : type(p1)` | 1 year ago | |
[mlir][llvm] Fix access group translation (#83257) This commit fixes the translation of access group metadata to LLVM IR. Previously, it did not use a temporary metadata node to model the placeholder of the self-referencing access group nodes. This is dangerous since the translation may produce a metadata list with a null entry that is later replaced with a self-reference. At the same time, the debug info translation, for example, may create the same uniqued node, which then suddenly references the access group metadata once the self-reference is set. The commit avoids such breakages. | 2 years ago | |
[mlir][llvm] Translate alias scopes lazily Change the LLVM dialect to LLVM IR translation to convert the alias scope attributes lazily to LLVM IR metadata. Previously, the alias scopes have been translated upfront walking the alias scopes of operations that implement the AliasAnalysisOpInterface. As a result, the translation of a module that contains only a noalias scope intrinsic failed, since its alias scope attribute has not been translated due to the intrinsic not implementing AliasAnalysisOpInterface. Reviewed By: zero9178 Differential Revision: https://reviews.llvm.org/D159187 | 2 years ago | |
[mlir][llvm] Add comdat attribute to functions This revision adds comdat support to functions. Additionally, it ensures only comdats that have uses are imported/exported and only non-empty global comdat operations are created. Reviewed By: Dinistro Differential Revision: https://reviews.llvm.org/D153739 | 2 years ago | |
[MLIR] Add index bitwidth to the DataLayout (#85927) When importing from LLVM IR the data layout of all pointer types contains an index bitwidth that should be used for index computations. This revision adds a getter to the DataLayout that provides access to the already stored bitwidth. The function returns an optional since only pointer-like types have an index bitwidth. Querying the bitwidth of a non-pointer type returns std::nullopt. The new function works for the built-in Index type and, using a type interface, for the LLVMPointerType. | 2 years ago | |
[MLIR] Enable converting dialect attributes on external functions This patch modifies the MLIR-to-LLVMIR translation pass to enable dialect attributes attached to external functions being processed by the corresponding dialect's translation interface via amendOperation(). Differential Revision: https://reviews.llvm.org/D156988 | 2 years ago | |
[MLIR][LLVM] Add ftz and fuse FP ops related function attribute support (#97812) Adds denormal-fp-math-f32, denormal-fp-math, fp-contract to llvmFuncOp attributes. denormal-fp-math-f32 and denormal-fp-math can enable ftz, that is, flushing denormals to zero. fp-contract can enable FMA fusion, such as mul + add -> fma | 1 year ago | |
[MLIR] Add support for frame pointers in MLIR (#72145) Add support for frame pointers in MLIR. --------- Co-authored-by: Markus Böck <markus.boeck02@gmail.com> Co-authored-by: Christian Ulmann <christianulmann@gmail.com> | 2 years ago | |
[MLIR] Enable GPU Dialect to SYCL runtime integration (#71430) GPU Dialect lowering to SYCL runtime is driven by spirv.target_env attached to gpu.module. As a result of this, spirv.target_env remains as an input to LLVMIR Translation. A SPIRVToLLVMIRTranslation without any actual translation is added to avoid an unregistered error in mlir-cpu-runner. SelectObjectAttr.cpp is updated to 1) Pass binary size argument to getModuleLoadFn 2) Pass parameter count to getKernelLaunchFn This change does not impact CUDA and ROCM usage since both mlir_cuda_runtime and mlir_rocm_runtime are already updated to accept and ignore the extra arguments. | 2 years ago | |
[mlir-translate] Support parsing operations other than 'builtin.module' as top-level This adds a '--no-implicit-module' option, which disables the insertion of a top-level 'builtin.module' during parsing. The translation APIs are also updated to take/return 'Operation*' instead of 'ModuleOp', to allow other operation types to be used. To simplify translations which are restricted to specific operation types, 'TranslateFromMLIRRegistration' has an overload which performs the necessary cast and error checking. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D134237 | 3 years ago | |
[mlir][debug] Allow global with local scope. (#98358) Currently, the only global variables allowed in debug info are those with a compile unit or module scope. However, global variables with function scope are used in debug info to represent certain types of variables. One example is static local variables in C, where the variable is represented as a global variable with its parent function as its scope. See the code and debug metadata generated by clang below; note that the scope of the DIGlobalVariable is a DISubprogram. `int test() { static int a = 0; return a++; }` `!0 = !DIGlobalVariableExpression(var: !1...)` `!1 = !DIGlobalVariable(name: "a", scope: !2 ...)` `!2 = !DISubprogram(name: "test" ...)` `!7 = !DICompileUnit(... globals: !8...)` `!8 = !{!0}` A similar issue exists in Fortran, where global variables with function scope are used to represent local variables with the save attribute. This PR allows such variables during module translation. | 1 year ago | |
Reapply "[Intrinsics][PreISelInstrinsicLowering] llvm.memcpy.inline length no longer needs to be constant (#98281)" This reverts commit ac4b6b662630cd4d3bf6929f2b39ea203c0054a1. A test change was missing for mlir/test/Target/LLVMIR/llvmir-intrinsics.mlir in the initial commit. | 1 year ago | |
[mlir][vector] Propagate scalability to gather/scatter ptrs vector (#97584) In convert-vector-to-llvm the first operand (vector of pointers holding all memory addresses to read) to the masked.gather (and scatter) intrinsic has a fixed vector type. This may result in intrinsics where the scalable flag has been dropped: %0 = llvm.intr.masked.gather %1, %2, %3 {alignment = 4 : i32} : (!llvm.vec<4 x ptr>, vector<[4]xi1>, vector<[4]xi32>) -> vector<[4]xi32> Fortunately the operand is overloaded on the result type so we end up with the correct IR when lowering to LLVM, but this is still incorrect. This patch fixes it by propagating scalability. | 1 year ago | |
[mlir] Skip invalid test on big endian platform (s390x) (#80246) The buildbot test running on the s390x platform has been failing since [this time](https://lab.llvm.org/buildbot/#/builders/199/builds/31136). This is because the test depends on the endianness of the platform: it expects a format that is invalid on the big-endian platform (s390x). We can simply skip it. See: https://discourse.llvm.org/t/mlir-s390x-linux-failure/76695 | 2 years ago | |
[MLIR][LLVM] Remove typed pointer remnants from target tests (#71210) This commit removes all LLVM dialect typed pointers from the target tests. Typed pointers have been deprecated for a while now and it's planned to soon remove them from the LLVM dialect. Related PSA: https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502 | 2 years ago | |
[MLIR] Add f8E4M3 IEEE 754 type (#97118) This PR adds the f8E4M3 type to MLIR. The f8E4M3 type follows IEEE 754 conventions. f8E4M3 (IEEE 754) - Exponent bias: 7 - Maximum stored exponent value: 14 (binary 1110) - Maximum unbiased exponent value: 14 - 7 = 7 - Minimum stored exponent value: 1 (binary 0001) - Minimum unbiased exponent value: 1 − 7 = −6 - Precision specifies the total number of bits used for the significand (mantissa), including the implicit leading integer bit = 3 + 1 = 4 - Follows IEEE 754 conventions for representation of special values - Has positive and negative zero - Has positive and negative infinity - Has NaNs Additional details: - Max exp (unbiased): 7 - Min exp (unbiased): -6 - Infinities (+/-): S.1111.000 - Zeros (+/-): S.0000.000 - NaNs: S.1111.{001, 010, 011, 100, 101, 110, 111} - Max normal number: S.1110.111 = +/-2^(7) x (1 + 0.875) = +/-240 - Min normal number: S.0001.000 = +/-2^(-6) - Max subnormal number: S.0000.111 = +/-2^(-6) x 0.875 = +/-2^(-9) x 7 - Min subnormal number: S.0000.001 = +/-2^(-6) x 0.125 = +/-2^(-9) Related PRs: - [PR-97179](https://github.com/llvm/llvm-project/pull/97179) [APFloat] Add support for f8E4M3 IEEE 754 type | 1 year ago | |
[MLIR][LLVM] Add distinct identifier to DICompileUnit attribute (#77070) This commit adds a distinct attribute parameter to the DICompileUnit to enable the modeling of distinctness. LLVM requires DICompileUnits to be distinct and there are cases where one gets two equivalent compilation units but LLVM still requires them to be differentiated. We observed such cases for combinations of LTO and inline functions. This patch also changes the DIScopeForLLVMFuncOp pass to a module pass, to ensure that only one distinct DICompileUnit is created, instead of one for each function. | 2 years ago | |
[mlir][LLVM] Add nsw and nuw flags (#74508) The implementation of these is modeled after the existing fastmath flags for floating point arithmetic. | 2 years ago | |
Revert "[mlir][NVVM] Disallow results on kernel functions (#96399)" (#97074) NVVM IR itself doesn't place any restriction that a function annotated as nvvm.kernel actually has no results, so this is a mismatch at the NVVMDialect level and NVVMIR. The GPU dialect might enforce that kernel functions have no results, but it doesn't make sense to propagate this constraint downstream. Reverts llvm/llvm-project#96399 | 1 year ago | |
[mlir][test] Fix filecheck annotation typos (#92897) Moved fixes for mlir from https://github.com/llvm/llvm-project/pull/91854, plus a few additional ones in the second commit. --------- Co-authored-by: klensy <nightouser@gmail.com> | 2 years ago | |
[MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393) This patch proposes the renaming of certain OpenMP dialect operations with the goal of improving readability and following a uniform naming convention for MLIR operations and associated classes. In particular, the following operations are renamed: - omp.map_info -> omp.map.info - omp.target_update_data -> omp.target_update - omp.ordered_region -> omp.ordered.region - omp.cancellationpoint -> omp.cancellation_point - omp.bounds -> omp.map.bounds - omp.reduction.declare -> omp.declare_reduction Also, the following MLIR operation classes have been renamed: - omp::TaskLoopOp -> omp::TaskloopOp - omp::TaskGroupOp -> omp::TaskgroupOp - omp::DataBoundsOp -> omp::MapBoundsOp - omp::DataOp -> omp::TargetDataOp - omp::EnterDataOp -> omp::TargetEnterDataOp - omp::ExitDataOp -> omp::TargetExitDataOp - omp::UpdateDataOp -> omp::TargetUpdateOp - omp::ReductionDeclareOp -> omp::DeclareReductionOp - omp::WsLoopOp -> omp::WsloopOp | 2 years ago | |
[MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393) This patch proposes the renaming of certain OpenMP dialect operations with the goal of improving readability and following a uniform naming convention for MLIR operations and associated classes. In particular, the following operations are renamed: - omp.map_info -> omp.map.info - omp.target_update_data -> omp.target_update - omp.ordered_region -> omp.ordered.region - omp.cancellationpoint -> omp.cancellation_point - omp.bounds -> omp.map.bounds - omp.reduction.declare -> omp.declare_reduction Also, the following MLIR operation classes have been renamed: - omp::TaskLoopOp -> omp::TaskloopOp - omp::TaskGroupOp -> omp::TaskgroupOp - omp::DataBoundsOp -> omp::MapBoundsOp - omp::DataOp -> omp::TargetDataOp - omp::EnterDataOp -> omp::TargetEnterDataOp - omp::ExitDataOp -> omp::TargetExitDataOp - omp::UpdateDataOp -> omp::TargetUpdateOp - omp::ReductionDeclareOp -> omp::DeclareReductionOp - omp::WsLoopOp -> omp::WsloopOp | 2 years ago | |
[llvm-project] Fix typo "seperate" (#95373) | 2 years ago | |
[ConstantFold] Remove notional over-indexing fold (#93697) The data-layout independent constant folding currently has some rather gnarly code for canonicalizing GEP indices to reduce "notional overindexing", and then infers inbounds based on that canonicalization. Now that we canonicalize to i8 GEPs, this canonicalization is essentially useless, as we'll discard it as soon as the GEP hits the data-layout aware constant folder anyway. As such, I'd like to remove this code entirely. This shouldn't have any impact on optimization capabilities. | 2 years ago | |
[MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393) This patch proposes the renaming of certain OpenMP dialect operations with the goal of improving readability and following a uniform naming convention for MLIR operations and associated classes. In particular, the following operations are renamed: - omp.map_info -> omp.map.info - omp.target_update_data -> omp.target_update - omp.ordered_region -> omp.ordered.region - omp.cancellationpoint -> omp.cancellation_point - omp.bounds -> omp.map.bounds - omp.reduction.declare -> omp.declare_reduction Also, the following MLIR operation classes have been renamed: - omp::TaskLoopOp -> omp::TaskloopOp - omp::TaskGroupOp -> omp::TaskgroupOp - omp::DataBoundsOp -> omp::MapBoundsOp - omp::DataOp -> omp::TargetDataOp - omp::EnterDataOp -> omp::TargetEnterDataOp - omp::ExitDataOp -> omp::TargetExitDataOp - omp::UpdateDataOp -> omp::TargetUpdateOp - omp::ReductionDeclareOp -> omp::DeclareReductionOp - omp::WsLoopOp -> omp::WsloopOp | 2 years ago | |
[Offload][NFC] Remove omp_ prefix from offloading entries (#88071) Summary: These entries are generic for offloading with the new driver now. Having the omp prefix was a historical artifact and is confusing when used for CUDA. This patch just renames them for now; future patches will rework the binary format to make it more common. | 2 years ago | |
[OMPIRBuilder] - Handle dependencies in createTarget (#93977) This patch handles dependencies specified by the depend clause on an OpenMP target construct. It does this much the same way clang does it by materializing an OpenMP task that is tagged with the dependencies. The following functions are relevant to this patch - 1) createTarget - This function itself is largely unchanged except that it now accepts a vector of DependData objects that it simply forwards to emitTargetCall 2) emitTargetCall - This function has changed now to check if an outer target-task needs to be materialized (i.e if target construct has nowait or has depend clause). If yes, it calls emitTargetTask to do all the heavy lifting for creating and dispatching the task. 3) emitTargetTask - Bulk of the change is here. See the large comment explaining what it does at the beginning of this function | 1 year ago | |
[ConstantFold] Remove notional over-indexing fold (#93697) The data-layout independent constant folding currently has some rather gnarly code for canonicalizing GEP indices to reduce "notional overindexing", and then infers inbounds based on that canonicalization. Now that we canonicalize to i8 GEPs, this canonicalization is essentially useless, as we'll discard it as soon as the GEP hits the data-layout aware constant folder anyway. As such, I'd like to remove this code entirely. This shouldn't have any impact on optimization capabilities. | 2 years ago | |
[Flang][OpenMP][MLIR] Fix common block mapping for regular and declare target link (#91829) This PR attempts to fix common block mapping for regular mapping of these types as well as when they have been marked as "declare target link". This PR should allow correct mapping of both the members of a common block and the full common block via its block symbol. The main changes were some adjustments to the Fortran OpenMP lowering to HLFIR/FIR, the lowering of the LLVM+OpenMP dialect to LLVM-IR and adjustments to the way we handle target kernel map argument rebinding inside of the OMPIRBuilder. For the Fortran OpenMP lowering there were two changes: one prevents the implicit capture of common block members when the common block symbol itself has been marked, and the other creates intermediate member accesses inside of the target region to be used in place of those external to the target region, which prevents external usages from breaking the IsolatedFromAbove pact. In the latter case, there was an adjustment to the size calculation for types to better handle cases where we pass an array as the type of a map (as opposed to the bounds and the type of the element), which occurs in the case of common blocks. There is also some adjustment to how handleDeclareTargetMapVar handles renaming of declare target symbols in the module to the reference pointer: it now only applies to those within the kernel that is currently being generated, and we also perform a modification to replace constants with instructions as necessary, as we cannot replace these with our reference pointer (non-constants and constants do not mix nicely). In the case of the OpenMPIRBuilder some changes were made to defer global symbol rebinding to kernel arguments until all other arguments have been rebound. This makes sure we do not replace uses that may refer to the global (e.g. a GEP) but are themselves actually a separate argument that needs to be bound. Currently "declare target to" still needs some work, but this may be the case for all types in conjunction with "declare target to" at the moment. | 1 year ago | |
[MLIR][OpenMP] Clause-based OpenMP operation definition (#92523) This patch updates OpenMP_Op definitions to be based on the new set of OpenMP_Clause definitions, and to take advantage of clause-based automatically-generated argument lists, descriptions, assembly format and class declarations. There are also changes introduced to the clause operands structures to match the current set of tablegen clause definitions. These two are very closely linked and should be kept in sync. It would probably be a good idea to try generating clause operands structures from the tablegen OpenMP_Clause definitions in the future. As a result of this change, arguments for some operations have been reordered. This patch also addresses this by updating affected operation build calls and unit tests. Some other updates to tests related to the order of arguments in the resulting assembly format and others due to certain previous inconsistencies in the printing/parsing of clauses are addressed. The printer and parser functions for the map clause are updated, so that they are able to handle map clauses linked to entry block arguments as well as those which aren't. This PR causes a build failure in the flang subproject. This is addressed by the next PR in the stack. | 1 year ago | |
[OpenMP][MLIR] Extend record member map support for omp dialect to LLVM-IR This patch seeks to refactor slightly and extend the current record type map support that was put in place for Fortran's descriptor types to handle explicit member mapping for record types at a single level of depth (the case of explicit mapping of nested record types is currently unsupported). This patch seeks to support this by extending the OpenMPToLLVMIRTranslation phase to more generally support record types, building on the prior groundwork in the Fortran allocatables/pointers patch. It now supports different kinds of record type mapping, in this case full record type mapping and then explicit member mapping in which there is a special case for certain types when mapped individually to not require any parent map link in the kernel argument structure. To facilitate this required: * The movement of the setting of the map flag type "ptr_and_obj" to respective frontends, now supporting it as a possible flag that can be read and printed in mlir form. Some minor changes to declare target map type setting were necessary for this. * The addition of a member index array operand, which tracks the position of the member in the parent, required for calculating the appropriate size to offload to the target, alongside the parent's offload pointer (always the first member currently being mapped). * A partial mapping attribute operand, to indicate if the entire record type is being mapped or just member components, aiding the ability to lower record types in the different manners that are possible. * Refactoring bounds calculation for record types and general arrays to one location (as well as load/store generation prior to assigning to the kernel argument structure), as a side effect enter/exit/update/data mapping should now be more correct and fully support bounds mapping, previously this would have only worked for target. Pull Request: https://github.com/llvm/llvm-project/pull/82852 | 2 years ago | |
[MLIR][OpenMP] Clause-based OpenMP operation definition (#92523) This patch updates OpenMP_Op definitions to be based on the new set of OpenMP_Clause definitions, and to take advantage of clause-based automatically-generated argument lists, descriptions, assembly format and class declarations. There are also changes introduced to the clause operands structures to match the current set of tablegen clause definitions. These two are very closely linked and should be kept in sync. It would probably be a good idea to try generating clause operands structures from the tablegen OpenMP_Clause definitions in the future. As a result of this change, arguments for some operations have been reordered. This patch also addresses this by updating affected operation build calls and unit tests. Some other updates to tests related to the order of arguments in the resulting assembly format and others due to certain previous inconsistencies in the printing/parsing of clauses are addressed. The printer and parser functions for the map clause are updated, so that they are able to handle map clauses linked to entry block arguments as well as those which aren't. This PR causes a build failure in the flang subproject. This is addressed by the next PR in the stack. | 1 year ago | |
[MLIR][OpenMP] Clause-based OpenMP operation definition (#92523) This patch updates OpenMP_Op definitions to be based on the new set of OpenMP_Clause definitions, and to take advantage of clause-based automatically-generated argument lists, descriptions, assembly format and class declarations. There are also changes introduced to the clause operands structures to match the current set of tablegen clause definitions. These two are very closely linked and should be kept in sync. It would probably be a good idea to try generating clause operands structures from the tablegen OpenMP_Clause definitions in the future. As a result of this change, arguments for some operations have been reordered. This patch also addresses this by updating affected operation build calls and unit tests. Some other updates to tests related to the order of arguments in the resulting assembly format and others due to certain previous inconsistencies in the printing/parsing of clauses are addressed. The printer and parser functions for the map clause are updated, so that they are able to handle map clauses linked to entry block arguments as well as those which aren't. This PR causes a build failure in the flang subproject. This is addressed by the next PR in the stack. | 1 year ago | |
[MLIR][OpenMP] Update omp.wsloop translation to LLVM IR (4/5) (#89214) This patch introduces minimal changes to the MLIR to LLVM IR translation of omp.wsloop to support the loop wrapper approach. There is omp.loop_nest related translation code that should be extracted and shared among all loop operations (e.g. omp.simd). This would possibly also help in the addition of support for compound constructs later on. This first approach is only intended to keep things running after the transition to loop wrappers and not to add support for other use cases enabled by that transition. This PR on its own will not pass premerge tests. All patches in the stack are needed before it can be compiled and passes tests. | 2 years ago | |
[OpenMP][MLIR] Extend record member map support for omp dialect to LLVM-IR This patch seeks to refactor slightly and extend the current record type map support that was put in place for Fortran's descriptor types to handle explicit member mapping for record types at a single level of depth (the case of explicit mapping of nested record types is currently unsupported). This patch seeks to support this by extending the OpenMPToLLVMIRTranslation phase to more generally support record types, building on the prior groundwork in the Fortran allocatables/pointers patch. It now supports different kinds of record type mapping, in this case full record type mapping and then explicit member mapping in which there is a special case for certain types when mapped individually to not require any parent map link in the kernel argument structure. To facilitate this required: * The movement of the setting of the map flag type "ptr_and_obj" to respective frontends, now supporting it as a possible flag that can be read and printed in mlir form. Some minor changes to declare target map type setting were necessary for this. * The addition of a member index array operand, which tracks the position of the member in the parent, required for calculating the appropriate size to offload to the target, alongside the parent's offload pointer (always the first member currently being mapped). * A partial mapping attribute operand, to indicate if the entire record type is being mapped or just member components, aiding the ability to lower record types in the different manners that are possible. * Refactoring bounds calculation for record types and general arrays to one location (as well as load/store generation prior to assigning to the kernel argument structure), as a side effect enter/exit/update/data mapping should now be more correct and fully support bounds mapping, previously this would have only worked for target. Pull Request: https://github.com/llvm/llvm-project/pull/82852 | 2 years ago | |
[MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393) This patch proposes the renaming of certain OpenMP dialect operations with the goal of improving readability and following a uniform naming convention for MLIR operations and associated classes. In particular, the following operations are renamed: - omp.map_info -> omp.map.info - omp.target_update_data -> omp.target_update - omp.ordered_region -> omp.ordered.region - omp.cancellationpoint -> omp.cancellation_point - omp.bounds -> omp.map.bounds - omp.reduction.declare -> omp.declare_reduction Also, the following MLIR operation classes have been renamed: - omp::TaskLoopOp -> omp::TaskloopOp - omp::TaskGroupOp -> omp::TaskgroupOp - omp::DataBoundsOp -> omp::MapBoundsOp - omp::DataOp -> omp::TargetDataOp - omp::EnterDataOp -> omp::TargetEnterDataOp - omp::ExitDataOp -> omp::TargetExitDataOp - omp::UpdateDataOp -> omp::TargetUpdateOp - omp::ReductionDeclareOp -> omp::DeclareReductionOp - omp::WsLoopOp -> omp::WsloopOp | 2 years ago | |
[MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393) This patch proposes the renaming of certain OpenMP dialect operations with the goal of improving readability and following a uniform naming convention for MLIR operations and associated classes. In particular, the following operations are renamed: - omp.map_info -> omp.map.info - omp.target_update_data -> omp.target_update - omp.ordered_region -> omp.ordered.region - omp.cancellationpoint -> omp.cancellation_point - omp.bounds -> omp.map.bounds - omp.reduction.declare -> omp.declare_reduction Also, the following MLIR operation classes have been renamed: - omp::TaskLoopOp -> omp::TaskloopOp - omp::TaskGroupOp -> omp::TaskgroupOp - omp::DataBoundsOp -> omp::MapBoundsOp - omp::DataOp -> omp::TargetDataOp - omp::EnterDataOp -> omp::TargetEnterDataOp - omp::ExitDataOp -> omp::TargetExitDataOp - omp::UpdateDataOp -> omp::TargetUpdateOp - omp::ReductionDeclareOp -> omp::DeclareReductionOp - omp::WsLoopOp -> omp::WsloopOp | 2 years ago | |
[MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393) This patch proposes the renaming of certain OpenMP dialect operations with the goal of improving readability and following a uniform naming convention for MLIR operations and associated classes. In particular, the following operations are renamed: - omp.map_info -> omp.map.info - omp.target_update_data -> omp.target_update - omp.ordered_region -> omp.ordered.region - omp.cancellationpoint -> omp.cancellation_point - omp.bounds -> omp.map.bounds - omp.reduction.declare -> omp.declare_reduction Also, the following MLIR operation classes have been renamed: - omp::TaskLoopOp -> omp::TaskloopOp - omp::TaskGroupOp -> omp::TaskgroupOp - omp::DataBoundsOp -> omp::MapBoundsOp - omp::DataOp -> omp::TargetDataOp - omp::EnterDataOp -> omp::TargetEnterDataOp - omp::ExitDataOp -> omp::TargetExitDataOp - omp::UpdateDataOp -> omp::TargetUpdateOp - omp::ReductionDeclareOp -> omp::DeclareReductionOp - omp::WsLoopOp -> omp::WsloopOp | 2 years ago | |
[Flang][MLIR][OpenMP] Use function-attached target attributes for OpenMP lowering (#78291) This patch removes the omp.target module attribute, since the information it held on the target CPU and features is available through the fir.target_cpu and fir.target_features module attributes. Target outlining during the MLIR to LLVM IR translation stage is updated, so that these attributes, at that point available as llvm.func attributes, are passed along to the newly created function. | 2 years ago | |
[MLIR][OpenMP] Skip host omp ops when compiling for the target device (#85239) This patch separates the lowering dispatch for host and target devices. For the target device, if the current operation is not a top-level operation (e.g. omp.target) or is inside a target device code region it will be ignored, since it belongs to the host code. This is an alternative approach to #84611; the new test in this PR was taken from there. | 2 years ago | |
[MLIR][OpenMP] Skip host omp ops when compiling for the target device (#85239) This patch separates the lowering dispatch for host and target devices. For the target device, if the current operation is not a top-level operation (e.g. omp.target) or is inside a target device code region it will be ignored, since it belongs to the host code. This is an alternative approach to #84611; the new test in this PR was taken from there. | 2 years ago | |
[MLIR][OpenMP] Update omp.wsloop translation to LLVM IR (4/5) (#89214) This patch introduces minimal changes to the MLIR to LLVM IR translation of omp.wsloop to support the loop wrapper approach. There is omp.loop_nest related translation code that should be extracted and shared among all loop operations (e.g. omp.simd). This would possibly also help in the addition of support for compound constructs later on. This first approach is only intended to keep things running after the transition to loop wrappers and not to add support for other use cases enabled by that transition. This PR on its own will not pass premerge tests. All patches in the stack are needed before it can be compiled and pass tests. | 2 years ago | |
[MLIR][OpenMP] Update omp.wsloop translation to LLVM IR (4/5) (#89214) This patch introduces minimal changes to the MLIR to LLVM IR translation of omp.wsloop to support the loop wrapper approach. There is omp.loop_nest related translation code that should be extracted and shared among all loop operations (e.g. omp.simd). This would possibly also help in the addition of support for compound constructs later on. This first approach is only intended to keep things running after the transition to loop wrappers and not to add support for other use cases enabled by that transition. This PR on its own will not pass premerge tests. All patches in the stack are needed before it can be compiled and pass tests. | 2 years ago | |
[mlir][OpenMP] - Honor dependencies in code-generation of the if clause in omp.task correctly (#90891) This patch fixes the code generation of the if clause, specifically when the condition evaluates to false and when the task directive has the depend clause on it. When the if clause of a task construct evaluates to false, then the task is an undeferred task. This undeferred task still has to honor dependencies. Previously, the OpenMPIRBuilder didn't honor dependencies. This patch fixes that. Fixes https://github.com/llvm/llvm-project/issues/90869 | 2 years ago | |
Revert "[mlir][llvm] Fix elem type passing into getelementptr (#68136)" This reverts commit fe283a1ff74fdfa46dd2bedf60e544d747d3416e. Bot is broken. | 2 years ago | |
[MLIR][LLVM] Remove typed pointer remnants from target tests (#71210) This commit removes all LLVM dialect typed pointers from the target tests. Typed pointers have been deprecated for a while now and it's planned to soon remove them from the LLVM dialect. Related PSA: https://discourse.llvm.org/t/psa-removal-of-typed-pointers-from-the-llvm-dialect/74502 | 2 years ago | |
[MLIR][OpenMP] Update omp.wsloop translation to LLVM IR (4/5) (#89214) This patch introduces minimal changes to the MLIR to LLVM IR translation of omp.wsloop to support the loop wrapper approach. There is omp.loop_nest related translation code that should be extracted and shared among all loop operations (e.g. omp.simd). This would possibly also help in the addition of support for compound constructs later on. This first approach is only intended to keep things running after the transition to loop wrappers and not to add support for other use cases enabled by that transition. This PR on its own will not pass premerge tests. All patches in the stack are needed before it can be compiled and pass tests. | 2 years ago | |
[MLIR][OpenMP] Extend omp.private materialization support: firstprivate (#82164) Extends current support for delayed privatization during translation to LLVM IR. This adds support for one-block firstprivate omp.private ops. | 2 years ago | |
[OpenMP][MLIR] Use opaque pointers in OpenMP translation tests 1/2 (#70057) First half of the tests switching to opaque pointers. Rest of the tests are omptarget-*.mlir and a specific test for typed pointers. Patch created as requested in https://github.com/llvm/llvm-project/pull/69772 | 2 years ago | |
[OpenMP]Support for lowering masked op (#98401) | 1 year ago | |
[MLIR][OpenMP] Update omp.wsloop translation to LLVM IR (4/5) (#89214) This patch introduces minimal changes to the MLIR to LLVM IR translation of omp.wsloop to support the loop wrapper approach. There is omp.loop_nest related translation code that should be extracted and shared among all loop operations (e.g. omp.simd). This would possibly also help in the addition of support for compound constructs later on. This first approach is only intended to keep things running after the transition to loop wrappers and not to add support for other use cases enabled by that transition. This PR on its own will not pass premerge tests. All patches in the stack are needed before it can be compiled and pass tests. | 2 years ago | |
[MLIR][OpenMP] Extend omp.private materialization support: dealloc (#90841) Extends current support for delayed privatization during translation to LLVM IR. This adds support for materializing the dealloc region in omp.private ops when this region contains clean-up/deallocation logic that needs to be executed at the end of the parallel region. This changes the OMPIRBuilder slightly to execute the finalization callback **after** the privatization callback. This allows us to collect information about privatized variables on the MLIR and LLVM sides so that we can properly emit deallocation logic. | 2 years ago | |
[Flang]Fix for changed code at the end of AllocaIP. (#92430) Some of the OpenMP code can change the instruction pointed at by the insertion point. This leads to an assert in the compiler about BB->getParent() and IP->getParent() not matching. The fix is to rebuild the insertion point from the block, rather than use builder.restoreIP. Also, move some of the alloca generation, rather than skipping back and forth between insert points (and ensure all the allocas are done before their users are created). A simple test, mainly to ensure the minimal reproducer doesn't fail to compile in the future, is also added. | 1 year ago | |
[mlir][LLVMIR][OpenMP] fix dominance for reduction init block (#96052) It was incorrect to set the insertion point to the init block after inlining the initialization region because the code generated in the init block depends upon the value yielded from the init region. When there were multiple reduction initialization regions each with multiple blocks, this could lead to the initialization region being inlined after the init block which depends upon it. Moving the insertion point to before inlining the initialization block turned up further issues around the handling of the terminator for the initialization block, which are also fixed here. This fixes a bug in #92430 (but the affected code couldn't compile before #92430 anyway). | 1 year ago | |
[OpenMP][LLVM] Fix access to reduction args of omp.parallel. (#96426) Fix for Fujitsu test suite test: 0275_0032.f90. The MLIR to LLVM translation logic assumed that reduction arguments to an omp.parallel op are always the last set of arguments to the op. However, this is a wrong assumption since private args come afterward. | 1 year ago | |
[mlir][OpenMP] Lower REDUCTION clause for SECTIONS construct (#97859) This shares code with WsloopOp (the changes to Wsloop should be NFC). OpenMPIRBuilder basically implements SECTIONS as a wsloop over a case statement with each SECTION as a case for a particular loopiv value. Unfortunately it proved very difficult to share code between these and ParallelOp. ParallelOp does quite a few things differently (doing more work inside of the bodygen callback and laying out blocks differently). Aligning reduction implementations for wsloop and parallel will probably involve functional changes to both, so I won't attempt that in this commit. | 1 year ago | |
[flang][MLIR][OpenMP] make reduction by-ref toggled per variable (#92244) Fixes #88935. Toggling reduction by-ref broke when multiple reduction clauses were used. Decisions made for the by-ref status for later clauses could then invalidate decisions for earlier clauses. For example, with reduction(+:scalar,scalar2) reduction(+:array), the first clause would choose by-value reduction and generate by-value reduction regions, but then after this the second clause would force by-ref to support the array argument. But by the time the second clause is processed, the first clause has already had the wrong kind of reduction regions generated. This is solved by toggling whether a variable should be reduced by reference per variable. In the above example, this allows only array to be reduced by ref. | 2 years ago | |
[MLIR][OpenMP] NFC: Uniformize OpenMP ops names (#85393) This patch proposes the renaming of certain OpenMP dialect operations with the goal of improving readability and following a uniform naming convention for MLIR operations and associated classes. In particular, the following operations are renamed: omp.map_info -> omp.map.info; omp.target_update_data -> omp.target_update; omp.ordered_region -> omp.ordered.region; omp.cancellationpoint -> omp.cancellation_point; omp.bounds -> omp.map.bounds; omp.reduction.declare -> omp.declare_reduction. Also, the following MLIR operation classes have been renamed: omp::TaskLoopOp -> omp::TaskloopOp; omp::TaskGroupOp -> omp::TaskgroupOp; omp::DataBoundsOp -> omp::MapBoundsOp; omp::DataOp -> omp::TargetDataOp; omp::EnterDataOp -> omp::TargetEnterDataOp; omp::ExitDataOp -> omp::TargetExitDataOp; omp::UpdateDataOp -> omp::TargetUpdateOp; omp::ReductionDeclareOp -> omp::DeclareReductionOp; omp::WsLoopOp -> omp::WsloopOp. | 2 years ago | |
[mlir][LLVMIR][OpenMP] fix dominance for reduction init block (#96052) It was incorrect to set the insertion point to the init block after inlining the initialization region because the code generated in the init block depends upon the value yielded from the init region. When there were multiple reduction initialization regions each with multiple blocks, this could lead to the initialization region being inlined after the init block which depends upon it. Moving the insertion point to before inlining the initialization block turned up further issues around the handling of the terminator for the initialization block, which are also fixed here. This fixes a bug in #92430 (but the affected code couldn't compile before #92430 anyway). | 1 year ago | |
[mlir][OpenMP] Lower REDUCTION clause for SECTIONS construct (#97859) This shares code with WsloopOp (the changes to Wsloop should be NFC). OpenMPIRBuilder basically implements SECTIONS as a wsloop over a case statement with each SECTION as a case for a particular loopiv value. Unfortunately it proved very difficult to share code between these and ParallelOp. ParallelOp does quite a few things differently (doing more work inside of the bodygen callback and laying out blocks differently). Aligning reduction implementations for wsloop and parallel will probably involve functional changes to both, so I won't attempt that in this commit. | 1 year ago | |
[MLIR][OpenMP] Update omp.wsloop translation to LLVM IR (4/5) (#89214) This patch introduces minimal changes to the MLIR to LLVM IR translation of omp.wsloop to support the loop wrapper approach. There is omp.loop_nest related translation code that should be extracted and shared among all loop operations (e.g. omp.simd). This would possibly also help in the addition of support for compound constructs later on. This first approach is only intended to keep things running after the transition to loop wrappers and not to add support for other use cases enabled by that transition. This PR on its own will not pass premerge tests. All patches in the stack are needed before it can be compiled and pass tests. | 2 years ago | |
[MLIR][OpenMP] Skip host omp ops when compiling for the target device (#85239) This patch separates the lowering dispatch for host and target devices. For the target device, if the current operation is not a top-level operation (e.g. omp.target) or is inside a target device code region it will be ignored, since it belongs to the host code. This is an alternative approach to #84611; the new test in this PR was taken from there. | 2 years ago | |
[OpenMPIRBuilder][MLIR] Pass target-cpu and target-features to outlined functions (#80283) This patch adds support for forwarding the target-cpu and target-features attributes to functions outlined in the OpenMPIRBuilder. This, in turn, results in the addition of these attributes for functions created during the translation of the omp.parallel, omp.task and omp.teams operations, and for the omp.wsloop operation when doing codegen for an OpenMP target device. | 2 years ago | |
[flang][MLIR][OpenMP] make reduction by-ref toggled per variable (#92244) Fixes #88935. Toggling reduction by-ref broke when multiple reduction clauses were used. Decisions made for the by-ref status for later clauses could then invalidate decisions for earlier clauses. For example, with reduction(+:scalar,scalar2) reduction(+:array), the first clause would choose by-value reduction and generate by-value reduction regions, but then after this the second clause would force by-ref to support the array argument. But by the time the second clause is processed, the first clause has already had the wrong kind of reduction regions generated. This is solved by toggling whether a variable should be reduced by reference per variable. In the above example, this allows only array to be reduced by ref. | 2 years ago | |
[mlir][ROCDL] Swap range metadata to range attribute (#94853) Swap out range metadata to range attribute for calls to be able to deprecate range metadata on calls in the future. | 2 years ago | |
[MLIR][LLVM] Add explicit target_cpu attribute to llvm.func (#78287) This patch adds the target_cpu attribute to llvm.func MLIR operations and updates the translation to/from LLVM IR to match "target-cpu" function attributes. | 2 years ago | |
[mlir][llvm] Use zeroinitializer for TargetExtType (#66510) Use the recently introduced llvm.mlir.zero operation for values with LLVM target extension type. Replaces the previous workaround that uses a single zero-valued integer attribute constant operation. Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com> | 2 years ago | |
[mlir][llvm] Add llvm.target_features features attribute (#71510) This patch adds a target_features (TargetFeaturesAttr) to the LLVM dialect to allow setting and querying the features in use on a function. The motivation for this comes from the Arm SME dialect where we would like a convenient way to check what variants of an operation are available based on the CPU features. Intended usage: The target_features attribute is populated manually or by a pass: `func.func @example() attributes { target_features = #llvm.target_features<["+sme", "+sve", "+sme-f64f64"]> } { // ... }`. Then within a later rewrite the attribute can be checked, and used to make lowering decisions: `auto targetFeatures = LLVM::TargetFeaturesAttr::featuresAt(op);` finds the "target_features" attribute on the parent FunctionOpInterface, and `if (!targetFeatures.contains("+sme-f64f64")) return failure();` checks a feature (`contains` returns false if targetFeatures is null or the feature is not in the list). For now, this is rather simple: it just checks if the exact feature is in the list, though it could be extended with implied features using information from LLVM. | 2 years ago | |
[mlir] support dialect attribute translation to LLVM IR (#75309) Extend the amendOperation mechanism for translating dialect attributes attached to operations from another dialect when translating MLIR to LLVM IR. Previously, this mechanism would have no knowledge of the LLVM IR instructions created for the given operation, making it impossible for it to perform local modifications such as attaching operation-level metadata. Collect instructions inserted by the LLVM IR builder and pass them to amendOperation. | 2 years ago | |
| 1 year ago | ||
[MLIR][VCIX] Support VCIX intrinsics in LLVMIR dialect (#75875) The changeset extends LLVMIR intrinsics with VCIX intrinsics. The VCIX intrinsics allow MLIR users to interact with RISC-V co-processors that are compatible with the XSfvcp extension. Source: https://www.sifive.com/document-file/sifive-vector-coprocessor-interface-vcix-software | 2 years ago | |
[MLIR][VCIX] Support VCIX intrinsics in LLVMIR dialect (#75875) The changeset extends LLVMIR intrinsics with VCIX intrinsics. The VCIX intrinsics allow MLIR users to interact with RISC-V co-processors that are compatible with the XSfvcp extension. Source: https://www.sifive.com/document-file/sifive-vector-coprocessor-interface-vcix-software | 2 years ago | |
[mlir][Pass] Include anchor op in -pass-pipeline In D134622 the printed form of a pass manager is changed to include the name of the op that the pass manager is anchored on. This updates the -pass-pipeline argument format to include the anchor op as well, so that the printed form of a pipeline can be directly passed to -pass-pipeline. In most cases this requires updating -pass-pipeline='pipeline' to -pass-pipeline='builtin.module(pipeline)'. This also fixes an outdated assert that prevented running a PassManager anchored on 'any'. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D134900 | 3 years ago | |
Add support for MLIR to llvm vscale attribute (#67012) The vscale_range is used for scalable vector functionality in Arm Scalable Vector Extension to select the size of vector operation (and I think RISCV has something similar). This patch adds the base support for the vscale_range attribute to the LLVM::FuncOp, and the marshalling for translation to LLVM-IR and import from LLVM-IR to LLVM dialect. This attribute is intended to be used at higher level MLIR, specified either by command-line options to the compiler or using compiler directives (e.g. pragmas or function attributes in the source code) to indicate the desired range. | 2 years ago | |
[mlir] X86Vector: Add AVX Rsqrt Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D99818 | 5 years ago |