Commit Graph

289 Commits

Author SHA1 Message Date
Yaxun (Sam) Liu cac4e2fe25 [CUDA][HIP] Fix gpu.used.external
Rename gpu.used.external as __clang_gpu_used_external as ptxas does not
allow . in global variable name.

Fixes: https://github.com/llvm/llvm-project/issues/54934

Reviewed by: Joseph Huber, Artem Belevich

Differential Revision: https://reviews.llvm.org/D123946
2022-04-18 23:10:31 -04:00
Yaxun (Sam) Liu 0424b5115c [CUDA][HIP] Fix host used external kernel in archive
For -fgpu-rdc, a host function may call an external kernel
which is defined in an archive of bitcode. Since this external
kernel is only referenced in host function, the device
bitcode does not contain reference to this external
kernel, then the linker will not try to resolve this external
kernel in the archive.

To fix this issue, host-used external kernels and device
variables are tracked. A global array containing pointers
to these external kernels and variables is emitted which
serves as an artificial references to the external kernels
and variables used by host.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D123441
2022-04-13 10:47:16 -04:00
Fangrui Song 63fbc77121 [Driver][test] Remove unused/obsoleted REQUIRES: clang-driver
It (introduced by 556d713c70) appears to be
related to the removed dragonegg project. In addition, the feature was a bit
misnamed and may lur users to unnecessarily use it.
2022-04-12 13:29:46 -07:00
Yaxun (Sam) Liu 4ea1d43509 [CUDA][HIP] Externalize kernels in anonymous name space
kernels in anonymous name space needs to have unique name
to avoid duplicate symbols.

Fixes: https://github.com/llvm/llvm-project/issues/54560

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D123353
2022-04-10 21:56:28 -04:00
Jonas Hahnfeld e4903d8be3 [CUDA/HIP] Remove argument from module ctor/dtor signatures
In theory, constructors can take arguments when called via .init_array
where at least glibc passes in (argc, argv, envp). This isn't used in
the generated code and if it was, the first argument should be an
integer, not a pointer. For destructors registered via atexit, the
function should never take an argument.

Differential Revision: https://reviews.llvm.org/D123370
2022-04-09 12:34:41 +02:00
Nikita Popov b16a3b4f3b [Clang] Add -no-opaque-pointers to more tests (NFC)
This adds the flag to more tests that were not caught by the
mass-migration in 532dc62b90.
2022-04-07 12:53:29 +02:00
Nikita Popov 532dc62b90 [OpaquePtrs][Clang] Add -no-opaque-pointers to tests (NFC)
This adds -no-opaque-pointers to clang tests whose output will
change when opaque pointers are enabled by default. This is
intended to be part of the migration approach described in
https://discourse.llvm.org/t/enabling-opaque-pointers-by-default/61322/9.

The patch has been produced by replacing %clang_cc1 with
%clang_cc1 -no-opaque-pointers for tests that fail with opaque
pointers enabled. Worth noting that this doesn't cover all tests,
there's a remaining ~40 tests not using %clang_cc1 that will need
a followup change.

Differential Revision: https://reviews.llvm.org/D123115
2022-04-07 12:09:47 +02:00
Nikita Popov 8a72391f60 [IR] Require intrinsic struct return type to be anonymous
This is an alternative to D122376. Rather than working around the
problem, this patch requires that struct return types in intrinsics
are anonymous/literal and adds auto-upgrade code to convert
existing uses of intrinsics with named struct types.

This ensures that the mapping between intrinsic name and
intrinsic function type is actually bijective, as it is supposed
to be.

This also fixes https://github.com/llvm/llvm-project/issues/37891.

Differential Revision: https://reviews.llvm.org/D122471
2022-03-30 09:51:24 +02:00
Yaxun (Sam) Liu d41445113b [CUDA][HIP] Fix hostness check with -fopenmp
CUDA/HIP determines whether a function can be called based on
the device/host attributes of callee and caller. Clang assumes the
caller is CurContext. This is correct in most cases, however, it is
not correct in OpenMP parallel region when CUDA/HIP program
is compiled with -fopenmp. This causes incorrect overloading
resolution and missed diagnostics.

To get the correct caller, clang needs to chase the parent chain
of DeclContext starting from CurContext until a function decl
or a lambda decl is reached. Sema API is adapted to achieve that
and used to determine the caller in hostness check.

Reviewed by: Artem Belevich, Richard Smith

Differential Revision: https://reviews.llvm.org/D121765
2022-03-24 15:19:47 -04:00
Daniil Kovalev 828b63c309 [NVPTX] Enhance vectorization of ld.param & st.param
Since function parameters and return values are passed via param space, we
can force special alignment for values hold in it which will add vectorization
options. This change may be done if the function has private or internal
linkage. Special alignment is forced during 2 phases.

1) Instruction selection lowering. Here we use special alignment for function
   prototypes (changing both own return value and parameters alignment), call
   lowering (changing both callee's return value and parameters alignment).

2) IR pass nvptx-lower-args. Here we change alignment of byval parameters that
   belong to param space (or are casted to it). We only handle cases when all
   uses of such parameters are loads from it. For such loads, we can change the
   alignment according to special type alignment and the load offset. Then,
   load-store-vectorizer IR pass will perform vectorization where alignment
   allows it.

Special alignment calculated as maximum from default ABI type alignment and
alignment 16. Alignment 16 is chosen because it's the maximum size of
vectorized ld.param & st.param.

Before specifying such special alignment, we should check if it is a multiple
of the alignment that the type already has. For example, if a value has an
enforced alignment of 64, default ABI alignment of 4 and special alignment
of 16, we should preserve 64.

This patch will be followed by a refactoring patch that removes duplicating
code in handling byval and non-byval arguments.

Differential Revision: https://reviews.llvm.org/D120129
2022-03-24 12:36:52 +03:00
Daniil Kovalev a034878564 Revert "[NVPTX] Enhance vectorization of ld.param & st.param"
This reverts commit f854434f0f.

Placed URL to wrong differential revision in commit message.
2022-03-24 12:32:06 +03:00
Daniil Kovalev f854434f0f [NVPTX] Enhance vectorization of ld.param & st.param
Since function parameters and return values are passed via param space, we
can force special alignment for values hold in it which will add vectorization
options. This change may be done if the function has private or internal
linkage. Special alignment is forced during 2 phases.

1) Instruction selection lowering. Here we use special alignment for function
   prototypes (changing both own return value and parameters alignment), call
   lowering (changing both callee's return value and parameters alignment).

2) IR pass nvptx-lower-args. Here we change alignment of byval parameters that
   belong to param space (or are casted to it). We only handle cases when all
   uses of such parameters are loads from it. For such loads, we can change the
   alignment according to special type alignment and the load offset. Then,
   load-store-vectorizer IR pass will perform vectorization where alignment
   allows it.

Special alignment calculated as maximum from default ABI type alignment and
alignment 16. Alignment 16 is chosen because it's the maximum size of
vectorized ld.param & st.param.

Before specifying such special alignment, we should check if it is a multiple
of the alignment that the type already has. For example, if a value has an
enforced alignment of 64, default ABI alignment of 4 and special alignment
of 16, we should preserve 64.

This patch will be followed by a refactoring patch that removes duplicating
code in handling byval and non-byval arguments.

Differential Revision: https://reviews.llvm.org/D121549
2022-03-24 12:25:36 +03:00
Changpeng Fang dd5895cc39 AMDGPU: Use the implicit kernargs for code object version 5
Summary:
  Specifically, for trap handling, for targets that do not support getDoorbellID,
we load the queue_ptr from the implicit kernarg, and move queue_ptr to s[0:1].
To get aperture bases when targets do not have aperture registers, we load
private_base or shared_base directly from the implicit kernarg. In clang, we use
implicitarg_ptr + offsets to implement __builtin_amdgcn_workgroup_size_{xyz}.

Reviewers: arsenm, sameerds, yaxunl

Differential Revision: https://reviews.llvm.org/D120265
2022-03-17 14:12:36 -07:00
Stanislav Mekhanoshin 9eabea3968 [AMDGPU] Set noclobber metadata on loads instead of cast to constant
A load via pointer cast to constant will return true from
pointsToConstantMemory which is not necessarily so.

Fixes: SWDEV-326463

Differential Revision: https://reviews.llvm.org/D121172
2022-03-07 23:13:02 -08:00
Yaxun (Sam) Liu 9d899d8f01 [HIP] Support `-fgpu-default-stream`
Introduce -fgpu-default-stream={legacy|per-thread} option to
support per-thread default stream for HIP runtime.

When -fgpu-default-stream=per-thread, HIP kernels are
launched through hipLaunchKernel_spt instead of
hipLaunchKernel. Also HIP_API_PER_THREAD_DEFAULT_STREAM=1
is defined by the preprocessor to enable other per-thread stream
API's.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D120298
2022-02-23 22:28:29 -05:00
Stanislav Mekhanoshin b0aa1946df [AMDGPU] Promote recursive loads from kernel argument to constant
Not clobbered pointer load chains are promoted to global now. That
is possible to promote these loads itself into constant address
space. Loaded pointers still need to point to global because we
need to be able to store into that pointer and because an actual
load from it may occur after a clobber.

Differential Revision: https://reviews.llvm.org/D119886
2022-02-17 11:07:03 -08:00
Sameer Sahasrabuddhe d8f99bb6e0 [AMDGPU] replace hostcall module flag with function attribute
The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216
2022-02-11 22:51:56 +05:30
Yaxun (Sam) Liu 1d97cb1f6e [HIP] Emit amdgpu_code_object_version module flag
code object version determines ABI, therefore should not be mixed.

This patch emits amdgpu_code_object_version module flag in LLVM IR
based on code object version (default 4).

The amdgpu_code_object_version value is code object version times 100.

LLVM IR with different amdgpu_code_object_version module flag cannot
be linked.

The -cc1 option -mcode-object-version=none is for ROCm device library use
only, which supports multiple ABI.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D119026
2022-02-08 21:58:40 -05:00
Yaxun (Sam) Liu 8428c75da1 [CUDA][HIP] Do not treat host var address as constant in device compilation
Currently clang treats host var address as constant in device compilation,
which causes const vars initialized with host var address promoted to
device variables incorrectly and results in undefined symbols.

This patch fixes that.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D118153

Fixes: SWDEV-309881

Change-Id: I0a69357063c6f8539ef259c96c250d04615f4473
2022-01-28 16:04:52 -05:00
Florian Hahn 67aa314bce
[IRGen] Do not overwrite existing attributes in CGCall.
When adding new attributes, existing attributes are dropped. While
this appears to be a longstanding issue, this was highlighted by D105169
which dropped a lot of attributes due to adding the new noundef
attribute.

Ahmed Bougacha (@ab) tracked down the issue and provided the fix in
CGCall.cpp. I bundled it up and updated the tests.
2022-01-20 13:45:19 +00:00
Yaxun (Sam) Liu 85c2bd2a0e Prevent adding module flag amdgpu_hostcall multiple times
HIP program with printf call fails to compile with -fsanitize=address
option, because of appending module flag - amdgpu_hostcall twice, one
for printf and one for sanitize option. This patch fixes that issue.

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Roman Lebedev

Differential Revision: https://reviews.llvm.org/D116216
2022-01-19 12:52:33 -05:00
hyeongyu kim 1b1c8d83d3 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169
2022-01-16 18:54:17 +09:00
Matt Arsenault 33315ef321 clang/AMDGPU: Don't set implicit arg attribute to default size
Since 2959e082e1, we conservatively
assume all inputs are enabled by default. This isn't the best
interface for controlling these anyway, since it's not granular and
only allows trimming the last fields.
2022-01-14 18:43:30 -05:00
Yaxun (Sam) Liu 3b172f60c6 [HIP] Fix -fgpu-rdc for Windows
This patch fixes issues for -fgpu-rdc for Windows MSVC
toolchain:

Fix COFF specific section flags and remove section types
in llvm-mc input file for Windows.

Escape fatbin path in llvm-mc input file.

Add -triple option to llvm-mc.

Put __hip_gpubin_handle in comdat when it has linkonce_odr
linkage.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D115039
2021-12-06 16:42:23 -05:00
Anshil Gandhi df0560ca00 [HIP] Add atomic load, atomic store and atomic cmpxchng_weak builtin support in HIP-clang
Introduce `__hip_atomic_load`, `__hip_atomic_store` and `__hip_atomic_compare_exchange_weak`
builtins in HIP.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D114553
2021-11-29 12:07:13 -07:00
Yaxun (Sam) Liu 38211bbab1 [HIP] Fix device stub name for Windows
This is a follow up of https://reviews.llvm.org/D68578
where device stub name is changed for Itanium
mangling but not Microsoft mangling.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D113491
2021-11-23 12:03:49 -05:00
Yaxun (Sam) Liu e13246a2ec [HIP] Add HIP scope atomic operations
Add an AtomicScopeModel for HIP and support for OpenCL builtins
that are missing in HIP.

Patch by: Michael Liao

Revised by: Anshil Ghandi

Reviewed by: Yaxun Liu

Differential Revision: https://reviews.llvm.org/D113925
2021-11-23 10:13:37 -05:00
Yaxun (Sam) Liu 4b3881e9f3 Emit hidden hostcall argument for sanitized kernels
this patch - https://reviews.llvm.org/D110337 changes the way how hostcall
hidden argument is emitted for printf, but the sanitized kernels also use
hostcall buffer to report a error for invalid memory access, which is not
handled by the above patch and it leads to vdi runtime error:

Device::callbackQueue aborting with error : HSA_STATUS_ERROR_MEMORY_FAULT:
Agent attempted to access an inaccessible address. code: 0x2b

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Matt Arsenault

Differential Revision: https://reviews.llvm.org/D112820
2021-11-10 17:05:57 -05:00
Yaxun (Sam) Liu 80072fde61 [CUDA][HIP] Allow comdat for kernels
Two identical instantiations of a template function can be emitted by two TU's
with linkonce_odr linkage without causing duplicate symbols in linker. MSVC
also requires these symbols be in comdat sections. Linux does not require
the symbols in comdat sections to be merged by linker but by default
clang puts them in comdat sections.

If a template kernel is instantiated identically in two TU's. MSVC requires
that them to be in comdat sections, otherwise MSVC linker will diagnose them as
duplicate symbols. However, currently clang does not put instantiated template
kernels in comdat sections, which causes link error for MSVC.

This patch allows putting instantiated template kernels into comdat sections.

Reviewed by: Artem Belevich, Reid Kleckner

Differential Revision: https://reviews.llvm.org/D112492
2021-11-10 16:42:23 -05:00
hsmahesha 3b9a85d10a [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas
at the start of the entry block, which in turn would aid better code transformation/optimization.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D110257
2021-11-10 08:45:21 +05:30
hyeongyu kim fd9b099906 Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit aacfbb953e.

Revert "Fix lit test failures in CodeGenCoroutines"

This reverts commit 63fff0f5bf.
2021-11-09 02:15:55 +09:00
hyeongyukim aacfbb953e [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169

[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)

This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:

(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.

(2) The remaining tests are updated manually.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D108453

Resolve lit failures in clang after 8ca4b3e's land

Fix lit test failures in clang-ppc* and clang-x64-windows-msvc

Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc

Fix internal_clone(aarch64) inline assembly
2021-11-06 19:19:22 +09:00
Juneyoung Lee 89ad2822af Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit 7584ef766a.
2021-11-06 15:39:19 +09:00
Juneyoung Lee 7584ef766a [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions.
I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.

Test updates are made as a separate patch: D108453

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D105169
2021-11-06 15:36:42 +09:00
Anshil Gandhi 0567f03331 [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols
By default clang emits complete contructors as alias of base constructors if they are the same.
The backend is supposed to emit symbols for the alias, otherwise it causes undefined symbols.
@yaxunl observed that this issue is related to the llvm options `-amdgpu-early-inline-all=true`
and `-amdgpu-function-calls=false`. This issue is resolved by only inlining global values
with internal linkage. The `getCalleeFunction()` in AMDGPUResourceUsageAnalysis also had
to be extended to support aliases to functions. inline-calls.ll was corrected appropriately.

Reviewed By: yaxunl, #amdgpu

Differential Revision: https://reviews.llvm.org/D109707
2021-10-18 16:53:15 -06:00
Juneyoung Lee f193bcc701 Revert D105169 due to the two-stage failure in ASAN
This reverts the following commits:
37ca7a795b
9aa6c72b92
705387c507
8ca4b3ef19
80dba72a66
2021-10-18 23:52:46 +09:00
Juneyoung Lee 8ca4b3ef19 [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169.
Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:

(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.

(2) The remaining tests are updated manually.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D108453
2021-10-16 12:01:41 +09:00
Anshil Gandhi 1830ec94ac Revert "[HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols"
This reverts commit 03375a3fb3.
2021-10-15 16:16:18 -06:00
Anshil Gandhi f92db6d3ff [HIP] Relax conditions for address space cast in builtin args
Allow (implicit) address space casting between LLVM-equivalent
target address spaces.

Reviewed By: yaxunl, tra

Differential Revision: https://reviews.llvm.org/D111734
2021-10-15 15:35:52 -06:00
Anshil Gandhi 53fc5100e0 Revert "[HIP] Relax conditions for address space cast in builtin args"
This reverts commit 3b48e1170d.
2021-10-15 14:42:28 -06:00
Anshil Gandhi 3b48e1170d [HIP] Relax conditions for address space cast in builtin args
Allow (implicit) address space casting between LLVM-equivalent
target address spaces.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D111734
2021-10-15 14:06:47 -06:00
Anshil Gandhi 03375a3fb3 [HIP] [AlwaysInliner] Disable AlwaysInliner to eliminate undefined symbols
By default clang emits complete contructors as alias of base constructors if they are the same.
The backend is supposed to emit symbols for the alias, otherwise it causes undefined symbols.
@yaxunl observed that this issue is related to the llvm options `-amdgpu-early-inline-all=true`
and `-amdgpu-function-calls=false`. This issue is resolved by only inlining global values
with internal linkage. The `getCalleeFunction()` in AMDGPUResourceUsageAnalysis also had
to be extended to support aliases to functions. inline-calls.ll was corrected appropriately.

Reviewed By: yaxunl, #amdgpu

Differential Revision: https://reviews.llvm.org/D109707
2021-10-15 11:39:15 -06:00
hsmahesha 393581d8a5 [CFE][Codegen] Update auto-generated check lines for few GPU lit tests
which is essentially required as a pre-commit for https://reviews.llvm.org/D110257.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D110676
2021-10-07 09:05:39 +05:30
Yaxun (Sam) Liu c4afb5f81b [HIP] Fix linking of asanrt.bc
HIP currently uses -mlink-builtin-bitcode to link all bitcode libraries, which
changes the linkage of functions to be internal once they are linked in. This
works for common bitcode libraries since these functions are not intended
to be exposed for external callers.

However, the functions in the sanitizer bitcode library is intended to be
called by instructions generated by the sanitizer pass. If their linkage is
changed to internal, their parameters may be altered by optimizations before
the sanitizer pass, which renders them unusable by the sanitizer pass.

To fix this issue, HIP toolchain links the sanitizer bitcode library with
-mlink-bitcode-file, which does not change the linkage.

A struct BitCodeLibraryInfo is introduced in ToolChain as a generic
approach to pass the bitcode library information between ToolChain and Tool.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D110304
2021-09-27 13:25:46 -04:00
Jonas Hahnfeld ea08c4cd1c [CUDA] Fix static device variables with -fgpu-rdc
NVPTX does not allow dots in the identifier, so ptxas errors out with
   fatal   : Parsing error near '.static': syntax error
because it parses .static as a directive. Avoid this problem by using
two underscores, similar to what OpenMP does for outlined functions.

Differential Revision: https://reviews.llvm.org/D108456
2021-08-25 09:31:22 +02:00
Anshil Gandhi 7063ac1afa [HIP] Allow target addr space in target builtins
This patch allows target specific addr space in target builtins for HIP. It inserts implicit addr
space cast for non-generic pointer to generic pointer in general, and inserts implicit addr
space cast for generic to non-generic for target builtin arguments only.

It is NFC for non-HIP languages.

Differential Revision: https://reviews.llvm.org/D102405
2021-08-19 23:51:58 -06:00
Anshil Gandhi f5d5f17d3a Revert "[HIP] Allow target addr space in target builtins"
This reverts commit a35008955f.
2021-08-18 21:38:42 -06:00
Anshil Gandhi f22ba51873 [Remarks] Emit optimization remarks for atomics generating CAS loop
Implements ORE in AtomicExpand pass to report atomics generating a
compare and swap loop.

Differential Revision: https://reviews.llvm.org/D106891
2021-08-16 14:56:01 -06:00
Dávid Bolvanský 49de6070a2 Revert "[Remarks] Emit optimization remarks for atomics generating CAS loop"
This reverts commit 435785214f. Still same compile time issues for -O0 -g, eg. +1.3% for sqlite3.
2021-08-15 11:44:13 +02:00
Anshil Gandhi 435785214f [Remarks] Emit optimization remarks for atomics generating CAS loop
Implements ORE in AtomicExpand pass to report atomics generating
a compare and swap loop.

Differential Revision: https://reviews.llvm.org/D106891
2021-08-14 23:37:23 -06:00