Commit Graph

43897 Commits

Author SHA1 Message Date
Alexey Lapshin fb244ffb9f [dsymutil][DWARFLinker][NFC] make AddressManager not depending on the order of checks for relocations.
The current dsymutil implementations of hasLiveMemoryLocation()/hasLiveAddressRange()
and applyValidRelocs() assume that calls are made in a certain order
(from the first DIEs to the last). A multi-threaded implementation might call these methods
in a different order (it might process compilation units in an order other than their physical
layout), so we remove the restriction that searching for relocations must be done
in ascending order. This change does not introduce a noticeable performance degradation.
The testing results for clang binary:

golden-dsymutil/dsymutil  23787992
clang MD5: 5efa8fd9355ebf81b65f24db5375caa2
elapsed time=91sec

build-Release/bin/dsymutil 23855616
clang MD5: 5efa8fd9355ebf81b65f24db5375caa2
elapsed time=91sec
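
As a rough illustration only (this is not code from the patch), an order-independent lookup can be sketched as a fresh binary search over relocations sorted by offset, instead of a cursor that only moves forward. `Reloc` and `hasRelocInRange` are hypothetical names invented for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a relocation record; not dsymutil's real type.
struct Reloc {
  uint64_t Offset;
};

// Returns true if any relocation lies in [Start, End). Relocs must be sorted
// by Offset. Every query searches from scratch, so queries may arrive in any
// order, e.g. from threads that process compile units out of order.
inline bool hasRelocInRange(const std::vector<Reloc> &Relocs, uint64_t Start,
                            uint64_t End) {
  assert(Start <= End && "inverted range");
  auto It = std::lower_bound(
      Relocs.begin(), Relocs.end(), Start,
      [](const Reloc &R, uint64_t Off) { return R.Offset < Off; });
  return It != Relocs.end() && It->Offset < End;
}
```

A query for a high offset followed by a query for a low offset gives the same answers as the reverse order, which is the property the multi-threaded linker needs.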

Differential Revision: https://reviews.llvm.org/D93106
2021-01-31 16:34:10 +03:00
Kazu Hirata 627b5bda11 [llvm] Add missing header guards (NFC)
Identified with llvm-header-guard.
2021-01-30 09:53:42 -08:00
Florian Hahn 7a6a2cc81a [LTO] Add option to enable NewPM with LTOCodeGenerator.
This patch adds an option to enable the new pass manager in
LTOCodeGenerator. It also updates a few tests with legacy PM specific
tests, which started failing after 6a59f05606 when
LLVM_ENABLE_NEW_PASS_MANAGER=true.
2021-01-30 11:54:20 +00:00
Florian Hahn 6a59f05606 [LTO] Use lto::backend for code generation.
This patch updates LTOCodeGenerator to use the utilities provided by
LTOBackend to run middle-end optimizations and backend code generation.

This is a first step towards unifying the code used by libLTO's C API
and the newer, C++ interface (see PR41541).

The immediate motivation is to allow using the new pass manager when
doing LTO using libLTO's C API, which is used on Darwin, among others.

With the changes, there are no codegen/stats differences when building
MultiSource/SPEC2000/SPEC2006 on Darwin X86 with LTO, compared
to without the patch.

Reviewed By: steven_wu

Differential Revision: https://reviews.llvm.org/D94487
2021-01-30 10:09:55 +00:00
Kazu Hirata 7728cc003a [llvm] Use append_range (NFC) 2021-01-29 23:23:34 -08:00
Nathan Hawes 719f778441 [VFS] Combine VFSFromYamlDirIterImpl and OverlayFSDirIterImpl into a single implementation (NFC)
As a fixme notes, both of these directory iterator implementations are
conceptually similar and duplicate the functionality of returning and uniquing
entries across two or more directories. This patch combines them into a single
class 'CombiningDirIterImpl'.
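
The shared idea, returning entries from several directories while uniquing them by name, can be sketched roughly as follows. This is a simplified model over plain string lists, not the VFS iterator interface; `combineListings` is a hypothetical name, and letting the earliest directory win on duplicates is an assumption of the sketch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch: merge several directory listings into one, keeping the
// first occurrence of each entry name and skipping later duplicates.
inline std::vector<std::string>
combineListings(const std::vector<std::vector<std::string>> &Dirs) {
  std::set<std::string> Seen;
  std::vector<std::string> Result;
  for (const auto &Dir : Dirs)
    for (const auto &Name : Dir)
      if (Seen.insert(Name).second) // true only the first time Name is seen
        Result.push_back(Name);
  return Result;
}
```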

This also drops the 'Redirecting' prefix from RedirectingDirEntry and
RedirectingFileEntry to save horizontal space. There's no loss of clarity as
they already have to be prefixed with 'RedirectingFileSystem::' whenever
they're referenced anyway.

rdar://problem/72485443
Differential Revision: https://reviews.llvm.org/D94857
2021-01-30 11:10:10 +10:00
Roman Lebedev c2534a7097 [ShadowStackGCLowering] Preserve Dominator Tree, if available
This doesn't help avoid any Dominator Tree recalculations just yet,
there's one more pass to go..
2021-01-30 01:14:51 +03:00
Christopher Tetreault 49a6502cd5 [SVE] delete VectorType::getNumElements()
The previously agreed-upon deprecation period for
VectorType::getNumElements() has passed. This patch removes this method
and completes the refactor proposed in the RFC:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html

Reviewed By: david-arm, rjmccall

Differential Revision: https://reviews.llvm.org/D95570
2021-01-29 13:46:54 -08:00
Jay Foad 5cf6412a27 [GlobalISel] Fix modifying a G_OR without notifying the observer
Remove the call to setFlags in favour of creating the instruction with
the correct flags in the first place, so we don't have to explicitly
notify the observer.

Differential Revision: https://reviews.llvm.org/D95681
2021-01-29 16:32:24 +00:00
Florian Hahn f3a710cade [LTO] Update splitCodeGen to take a reference to the module. (NFC)
splitCodeGen does not need to take ownership of the module, as it
currently clones the original module for each split operation.

There is a ~4-year-old FIXME to change that, but until this is
addressed, the function can just take a reference to the module.

This makes the transition of LTOCodeGenerator to use LTOBackend a bit
easier, because under some circumstances, LTOCodeGenerator needs to
write the original module back after codegen.
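
The ownership point can be sketched with stand-in types. `Module` and `splitByCloning` below are hypothetical and only model the shape of the change: because every partition works on a clone, the function can take a reference and the caller keeps the original.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a module; not llvm::Module.
struct Module {
  std::string Name;
};

// Takes a reference rather than ownership. Each partition is built from a
// copy, so the caller's module is left intact and can still be written back.
inline std::vector<std::unique_ptr<Module>> splitByCloning(const Module &M,
                                                           unsigned Parts) {
  std::vector<std::unique_ptr<Module>> Out;
  for (unsigned I = 0; I < Parts; ++I)
    Out.push_back(
        std::make_unique<Module>(Module{M.Name + "." + std::to_string(I)}));
  return Out;
}
```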

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D95222
2021-01-29 11:53:11 +00:00
Kazu Hirata 046cfb8565 [llvm] Forward-declare formatted_raw_ostream (NFC)
Various *TargetStreamer.h need formatted_raw_ostream but rely on a
forward declaration of formatted_raw_ostream in MCStreamer.h.  This
patch adds forward declarations right in *TargetStreamer.h.

While we are at it, this patch removes the one in MCStreamer.h, where
it is unnecessary.
2021-01-28 22:21:13 -08:00
Christudasan Devadasan 892e4567e1 Support a list of CostPerUse values
This patch allows targets to define multiple cost
values for each register so that the cost model
can be more flexible and better used during the
register allocation as per the target requirements.

For AMDGPU the VGPR allocation will be more efficient
if the register cost can be associated dynamically
based on the calling convention.
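
A minimal model of "a list of cost values per register" might look like the sketch below. `RegisterCosts`, `costPerUse` and the fallback to 0 for a missing entry are assumptions made for illustration; they are not the interface this patch adds.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: each register carries a list of costs, and an index
// (for example one chosen from the calling convention) selects which applies.
struct RegisterCosts {
  std::vector<std::vector<uint8_t>> CostsPerReg; // CostsPerReg[Reg][Index]

  uint8_t costPerUse(unsigned Reg, unsigned Index) const {
    assert(Reg < CostsPerReg.size() && "unknown register");
    const std::vector<uint8_t> &Costs = CostsPerReg[Reg];
    // Assumption of this sketch: a register without an entry at Index costs 0.
    return Index < Costs.size() ? Costs[Index] : 0;
  }
};
```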

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D86836
2021-01-29 10:14:52 +05:30
Duncan P. N. Exon Smith 17c584551d ADT: Add SFINAE to the generic IntrusiveRefCntPtr constructors
Add an `enable_if` to the generic `IntrusiveRefCntPtr` constructors so
that std::is_convertible gives an honest answer when the underlying
pointers cannot be converted. Added `static_assert`s to the test suite
to verify.

Also combine generic constructors from `IntrusiveRefCntPtr<X>&&` and
`const IntrusiveRefCntPtr<X>&`. At first glance this appears to be an
infinite loop, but the real copy/move constructors are spelled out
separately above. Added a unit test to verify.
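
The effect of the `enable_if` can be shown on a stripped-down pointer wrapper. `Ptr` below is a hypothetical minimal class with no reference counting, not `IntrusiveRefCntPtr` itself; it only demonstrates how constraining the converting constructor makes `std::is_convertible` answer honestly.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical minimal wrapper; stands in for a smart pointer template.
template <typename T> class Ptr {
  T *Obj = nullptr;

public:
  Ptr() = default;
  explicit Ptr(T *P) : Obj(P) {}

  // Converting constructor: participates in overload resolution only when
  // X* is implicitly convertible to T*.
  template <typename X,
            std::enable_if_t<std::is_convertible<X *, T *>::value, bool> = true>
  Ptr(const Ptr<X> &Other) : Obj(Other.get()) {}

  T *get() const { return Obj; }
};

struct Base {};
struct Derived : Base {};
struct Unrelated {};

static_assert(std::is_convertible<Ptr<Derived>, Ptr<Base>>::value,
              "derived-to-base conversion is allowed");
static_assert(!std::is_convertible<Ptr<Base>, Ptr<Derived>>::value,
              "base-to-derived conversion is rejected");
static_assert(!std::is_convertible<Ptr<Unrelated>, Ptr<Base>>::value,
              "unrelated types are rejected");
```

Without the `enable_if`, the trait would report all three conversions as possible, and the bad ones would only fail later inside the constructor body.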

Differential Revision: https://reviews.llvm.org/D95498
2021-01-28 15:07:27 -08:00
Jessica Paquette daffab1985 Recommit "[GlobalISel] Walk through hints in getDefIgnoringCopies et al"
Recommit of 4580acf675

`Opc = DefMI->getOpcode()` was in the wrong place.
2021-01-28 14:43:00 -08:00
Jessica Paquette dcb5b5f1f2 Revert "[GlobalISel] Walk through hints in getDefIgnoringCopies et al"
This reverts commit 4580acf675.

Reverting while looking into some test failures.
2021-01-28 14:37:57 -08:00
Jessica Paquette 4580acf675 [GlobalISel] Walk through hints in getDefIgnoringCopies et al
Treat hint instructions like G_ASSERT_ZEXT like COPY instructions in helpers
which walk through copies.

This ensures that instructions like G_ASSERT_ZEXT won't impact any optimizations
that rely on these helpers.

Differential Revision: https://reviews.llvm.org/D95577
2021-01-28 14:27:00 -08:00
Cassie Jones f22f4557a7 [GlobalISel] Implement widenScalar for carry-in add/sub
These are widened to a wider UADDE/USUBE, with the overflow value
unused, and with the same synthesis of a new overflow value as for the
O operations.
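
In scalar terms the widening works roughly like this sketch, which emulates an 8-bit add with carry-in using 32-bit arithmetic. The function and struct names are hypothetical; the carry-out is synthesized by comparing the wide sum with its truncated and re-extended value, and the wide operation's own carry is ignored.

```cpp
#include <cassert>
#include <cstdint>

struct AddResult {
  uint8_t Sum;
  bool CarryOut;
};

// Emulates a narrow (8-bit) unsigned add with carry-in using a wider type.
inline AddResult addWithCarryWidened(uint8_t A, uint8_t B, bool CarryIn) {
  uint32_t WideA = A; // zero-extend the operands
  uint32_t WideB = B;
  // The wide add cannot overflow 32 bits here, so its own carry is unused.
  uint32_t WideSum = WideA + WideB + (CarryIn ? 1u : 0u);
  uint8_t Narrow = static_cast<uint8_t>(WideSum); // truncate the result
  // New overflow value: set iff the wide sum does not fit in 8 bits.
  bool CarryOut = static_cast<uint32_t>(Narrow) != WideSum;
  return {Narrow, CarryOut};
}
```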

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D95326
2021-01-28 17:06:24 -05:00
Jessica Paquette 24261729a4 [GlobalISel] Add G_ASSERT_ZEXT
This adds a generic opcode which communicates that a type has already been
zero-extended from a narrower type.

This is intended to be similar to AssertZext in SelectionDAG.

For example,

```
%x_was_extended:_(s64) = G_ASSERT_ZEXT %x, 16
```

Signifies that the top 48 bits of %x are known to be 0.

This is useful in cases like this:

```
define i1 @zeroext_param(i8 zeroext %x) {
  %cmp = icmp ult i8 %x, -20
  ret i1 %cmp
}
```

In AArch64, `%x` must use a 32-bit register, which is then truncated to an 8-bit
value.

If we know that `%x` is already zeroed out in the relevant high bits, we can
avoid the truncate.

Currently, in GISel, this looks like this:

```
_zeroext_param:
  and w8, w0, #0xff ; We don't actually need this!
  cmp w8, #236
  cset w0, lo
  ret
```

While SDAG does not produce the truncation, since it knows that it's
unnecessary:

```
_zeroext_param:
  cmp w0, #236
  cset w0, lo
  ret
```

This patch

- Adds G_ASSERT_ZEXT
- Adds MIRBuilder support for it
- Adds MachineVerifier support for it
- Documents it

It also puts G_ASSERT_ZEXT into its own class of "hint instruction." (There
should be a G_ASSERT_SEXT in the future, maybe a G_ASSERT_ALIGN as well.)

This allows us to skip over hints in the legalizer etc. These can then later
be selected like COPY instructions or removed.
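
Why the `and` is unnecessary can be checked with a small scalar model of the example above (`icmp ult i8 %x, -20`, i.e. unsigned less-than 236). The function names are hypothetical; the sketch just compares the masked and unmasked forms over every value that is zero-extended from 8 bits.

```cpp
#include <cassert>
#include <cstdint>

// The comparison with the truncation: mask to 8 bits, then compare.
inline bool cmpWithMask(uint32_t W0) { return (W0 & 0xffu) < 236u; }

// The comparison once the high bits are known to be zero: no mask.
inline bool cmpWithoutMask(uint32_t W0) { return W0 < 236u; }

// True if both forms agree on every value zero-extended from 8 bits.
inline bool maskIsRedundantForZext8() {
  for (uint32_t W0 = 0; W0 <= 0xffu; ++W0)
    if (cmpWithMask(W0) != cmpWithoutMask(W0))
      return false;
  return true;
}
```

The two forms differ as soon as a high bit is set (for instance `0x100`), which is why the hint has to be carried explicitly rather than assumed.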

Differential Revision: https://reviews.llvm.org/D95564
2021-01-28 13:58:37 -08:00
Greg Clayton f8122d3532 Add the ability to extract the unwind rows from DWARF Call Frame Information.
This patch adds the ability to evaluate the state machine for CIE and FDE unwind objects and produce a UnwindTable with all UnwindRow objects needed to unwind registers. It will also dump the UnwindTable for each CIE and FDE when dumping DWARF .debug_frame or .eh_frame sections in llvm-dwarfdump or llvm-objdump. This allows users to see what the unwind rows actually look like for a given CIE or FDE instead of just seeing a list of opcodes.

This patch adds new classes: UnwindLocation, RegisterLocations, UnwindRow, and UnwindTable.

UnwindLocation is a class that describes how to unwind a register or Call Frame Address (CFA).

RegisterLocations is a class that tracks registers and their UnwindLocations. It gets populated when parsing the DWARF call frame instruction opcodes for an unwind row. The registers are mapped from their register numbers to the UnwindLocation in a map.

UnwindRow contains the result of evaluating a row of DWARF call frame instructions for the CIE, or a row from an FDE. The CIE can produce a set of initial instructions that each FDE that points to that CIE will use as the seed for the state machine when parsing FDE opcodes. An UnwindRow for a CIE will not have a valid address, while an UnwindRow for an FDE will have a valid address.

The UnwindTable is a class that contains a sorted (by address) vector of UnwindRow objects and is the result of parsing all opcodes in a CIE, or FDE. Parsing a CIE should produce a UnwindTable with a single row. Parsing a FDE will produce a UnwindTable with one or more UnwindRow objects where all UnwindRow objects have valid addresses. The rows in the UnwindTable will be sorted from lowest Address to highest after parsing the state machine, or an error will be returned if the table isn't sorted. To parse a UnwindTable clients can use the following methods:

    static Expected<UnwindTable> UnwindTable::create(const CIE *Cie);
    static Expected<UnwindTable> UnwindTable::create(const FDE *Fde);

A valid table will be returned if the DWARF call frame instruction opcodes have no encoding errors. There are a few things that can go wrong during the evaluation of the state machine and these create functions will catch and return them.
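
To give a feel for how opcodes turn into rows, here is a heavily simplified sketch of such a state machine. It models only two operations (advance the location, redefine the CFA offset) and uses hypothetical types `Op`, `Row` and `buildRows`; it is not the UnwindTable implementation and does no error checking.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class OpKind { AdvanceLoc, DefCfaOffset };

struct Op {
  OpKind Kind;
  uint64_t Value;
};

struct Row {
  uint64_t Address;
  uint64_t CfaOffset;
};

// Evaluates Ops starting from Seed (the state a CIE would provide).
// AdvanceLoc finishes the current row and starts a new one at a higher
// address; DefCfaOffset edits the row being built.
inline std::vector<Row> buildRows(Row Seed, const std::vector<Op> &Ops) {
  std::vector<Row> Rows;
  Row Current = Seed;
  for (const Op &O : Ops) {
    switch (O.Kind) {
    case OpKind::AdvanceLoc:
      Rows.push_back(Current);
      Current.Address += O.Value; // deltas are unsigned, so rows stay sorted
      break;
    case OpKind::DefCfaOffset:
      Current.CfaOffset = O.Value;
      break;
    }
  }
  Rows.push_back(Current); // the last row is emitted at the end
  return Rows;
}
```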

Differential Revision: https://reviews.llvm.org/D89845
2021-01-28 13:39:17 -08:00
Reid Kleckner bacf9cf2c5 Revert "[PDB] Defer relocating .debug$S until commit time and parallelize it"
This reverts commit 1a9bd5b813.

I suspect that this patch may have caused https://crbug.com/1171438.
2021-01-28 13:17:27 -08:00
Thomas Lively 4b68b64dcc [WebAssembly] Prototype i8x16 to i32x4 widening instructions
As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the
opcodes used in V8:
https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h

Differential Revision: https://reviews.llvm.org/D95557
2021-01-28 10:59:32 -08:00
David Blaikie 4318028cd2 DebugInfo: Add a DWARF FORM extension for addrx+offset references to reduce relocations
This is an alternative to the use of complex DWARF expressions for
addresses - shaving off a few extra bytes of expression overhead.
2021-01-28 10:20:02 -08:00
Simon Pilgrim b06ccc7446 [APFloat] Remove orphan ilogb(DoubleAPFloat) declaration. NFCI. 2021-01-28 15:18:25 +00:00
Simon Pilgrim 5169627c14 [APFloat] scalbn - pass DoubleAPFloat arg as const-ref. NFCI.
Avoid unnecessary copy and fix clang-tidy warning.
2021-01-28 15:18:24 +00:00
Stefan Gränitz b9ff5da0c8 [Orc] Remove unused header from TPC server
The header would include OrcJIT headers in OrcTargetProcess, which is not desired. All common declarations should be in OrcShared.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D95606
2021-01-28 14:16:49 +01:00
Simon Pilgrim 7396f720f9 [DebugInfo] Remove some unused includes. NFCI.
Mainly removing a lot of <vector> includes from files that don't explicitly use std::vector
2021-01-28 11:21:35 +00:00
Georgii Rymar 68195b15a3 [yaml2obj] - Allow empty SectionHeaderTable definitions.
Currently we don't allow the following definition:

```
Sections:
  - Type: SectionHeaderTable
  - Name: .foo
    Type: SHT_PROGBITS
```

We report an error: "SectionHeaderTable can't be empty. Use 'NoHeaders' key to drop the section header table".

It was implemented this way earlier, when `SectionHeaderTable`
was a dedicated key outside of the `Sections` list and we did not
allow selecting where the table is written.

It now makes sense to allow it, because a user might
want to place the default section header table at an arbitrary position,
e.g. before other sections. In this case it is inconvenient and error-prone
to require specifying all sections:

```
Sections:
  - Type: SectionHeaderTable
    Sections:
      - Name: .foo
      - Name: .strtab
      - Name: .shstrtab
  - Name: .foo
    Type: SHT_PROGBITS
```

This patch allows empty SectionHeaderTable definitions.

Differential revision: https://reviews.llvm.org/D95341
2021-01-28 10:51:52 +03:00
Kazu Hirata f82b5a647e [DebugInfo] Forward-declare PDBFile (NFC)
NativeEnumInjectedSources.h needs PDBFile but relies on a
forward declaration of PDBFile in InjectedSourceStream.h.
This patch adds a forward declaration right in
NativeEnumInjectedSources.h.

While we are at it, this patch removes the one in
InjectedSourceStream.h, where it is unnecessary.
2021-01-27 23:25:38 -08:00
Hongtao Yu 7e99bddfea [CSSPGO] Support of CS profiles in extended binary format.
This change brings up support of context-sensitive profiles in the format of extended binary. Existing sample profile reader/writer/merger code is being tweaked to reflect the fact of bracketed input contexts, like (`[...]`). The paired brackets are also needed in extbinary profiles because we don't yet have an otherwise good way to tell calling contexts apart from regular function names since the context delimiter `@` can somehow serve as a part of the C++ mangled names.

Reviewed By: wmi, wenlei

Differential Revision: https://reviews.llvm.org/D95547
2021-01-27 21:29:46 -08:00
Fangrui Song 6612c2bb68 [llvm-c] Move LLVMX86_AMXTypeKind & LLVMPoisonValueValueKind to the bottom to avoid value changes compared with LLVM<=11
Fixes PR48905
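
The underlying problem is generic to C enums: adding an enumerator in the middle renumbers everything after it, while appending leaves existing values alone. The enumerator names below are hypothetical placeholders, not the llvm-c ones.

```cpp
#include <cassert>

namespace old_release {
enum Kind { VoidKind, FloatKind, TokenKind };
}

namespace inserted_in_middle {
enum Kind { VoidKind, NewKind, FloatKind, TokenKind };
}

namespace appended_at_bottom {
enum Kind { VoidKind, FloatKind, TokenKind, NewKind };
}

// Inserting in the middle shifts the values clients were compiled against.
static_assert(static_cast<int>(inserted_in_middle::FloatKind) !=
                  static_cast<int>(old_release::FloatKind),
              "middle insertion changes existing values");

// Appending keeps every pre-existing value stable.
static_assert(static_cast<int>(appended_at_bottom::FloatKind) ==
                  static_cast<int>(old_release::FloatKind),
              "appending preserves existing values");
static_assert(static_cast<int>(appended_at_bottom::TokenKind) ==
                  static_cast<int>(old_release::TokenKind),
              "appending preserves existing values");
```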
2021-01-27 16:28:04 -08:00
Teresa Johnson 1487747e99 [LTO] Prevent devirtualization for symbols dynamically exported
Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.

Differential Revision: https://reviews.llvm.org/D91583
2021-01-27 15:54:13 -08:00
James Y Knight 9c7aeaebb3 Itanium Mangling: Mangle `__alignof__` differently than `alignof`.
The two operations have acted differently since Clang 8, but were
unfortunately mangled the same. The new mangling uses new "vendor
extended expression" syntax proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/112

GCC had the same mangling problem, https://gcc.gnu.org/PR88115, and
will hopefully be switching to the same mangling as implemented here.

Additionally, fix the mangling of `__uuidof` to use the new extension
syntax, instead of its previous nonstandard special-case.

Adjusts the demangler accordingly.

Differential Revision: https://reviews.llvm.org/D93922
2021-01-27 16:46:51 -05:00
Varun Gandhi 44f792966e [Demangle] Support demangling Swift calling convention in MS demangler.
Previously, Clang was able to mangle the Swift calling
convention but 'MicrosoftDemangle.cpp' was not able to demangle it.

Reviewed By: compnerd, rnk

Differential Revision: https://reviews.llvm.org/D95053
2021-01-27 13:24:54 -08:00
Sanjay Patel ab93c18c12 [LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes)
I am trying to untangle the fast-math-flags propagation logic
in the vectorizers (see a6f022127 for SLP).

The loop vectorizer has a mix of checking FP function attributes,
IR-level FMF, and just wrong assumptions.

I am trying to avoid regressions while fixing this, and I think
the IR-level logic is good enough for that, but it's hard to say
for sure. This would be the 1st step in the clean-up.

The existing test that I changed to include 'fast' actually shows
a miscompile: the function only had the equivalent of nnan, but we
created new instructions that had fast (all FMF set). This is
similar to the example in https://llvm.org/PR35538

Differential Revision: https://reviews.llvm.org/D95452
2021-01-27 14:17:11 -05:00
Fangrui Song 54fb3ca96e [ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags
Imported functions and variables get the visibility from the module supplying the
definition.  However, non-imported definitions do not get the expected
visibility: for ELF, the most constraining visibility among all modules; for
Mach-O, the visibility of the prevailing definition.

This patch

* adds visibility bits to GlobalValueSummary::GVFlags
* computes the result visibility and propagates it to all definitions

Protected/hidden can imply dso_local which can enable some optimizations (this
is stronger than GVFlags::DSOLocal because the implied dso_local can be
leveraged for ELF -shared while default visibility dso_local has to be cleared
for ELF -shared).

Note: we don't have summaries for declarations, so for ELF if a declaration has
the most constraining visibility, the result visibility may not be that one.
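
For the ELF rule, "most constraining visibility among all modules" amounts to taking a maximum over a constraint ordering. The sketch below uses its own hypothetical `Visibility` enum, numbered by how constraining each level is; this numbering is an assumption of the sketch and is not the encoding LLVM uses.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical ordering for this sketch: larger means more constraining.
enum class Visibility { Default = 0, Protected = 1, Hidden = 2 };

// Returns the most constraining visibility seen across the given definitions.
// Declarations have no summaries, so they simply never appear in PerModule.
inline Visibility mostConstraining(const std::vector<Visibility> &PerModule) {
  Visibility Result = Visibility::Default;
  for (Visibility V : PerModule)
    Result = std::max(Result, V);
  return Result;
}
```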

Differential Revision: https://reviews.llvm.org/D92900
2021-01-27 10:43:51 -08:00
Craig Topper 0b50fa9945 [FaultMaps][llvm-objdump] Move FaultMapParser to Object/. Remove CodeGen dependency from llvm-objdump
FaultMapParser lived in CodeGen and was forcing llvm-objdump to
link CodeGen and everything CodeGen depends on.

This was previously attempted in r240364 to fix a link failure.
The CodeGen dependency was independently added to fix the same
link failure, and that ended up being kept.

Removing the dependency seems like the correct layering for
llvm-objdump.

Reviewed By: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D95414
2021-01-27 10:39:59 -08:00
Valentin Clement f30c523660 [flang][openacc] Allow multiple wait clauses
The kernels loop and enter data constructs had an overly restrictive constraint on the wait clause.
The wait clause is allowed multiple times, not only once. This patch fixes the problem.

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D95469
2021-01-27 13:18:46 -05:00
Florian Hahn 28410d17f5 [LoopUtils] Pass SCEVExpander instead of SE to addRuntimeChecks.
This gives the user control over which expander to use, which in turn
allows the user to decide what to do with the expanded instructions.

Used in D75980.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D94295
2021-01-27 17:36:19 +00:00
Valentin Clement b65896ef8b [flang][openacc] Fix clause restriction for exit data directive
Restrictions on clauses for the EXIT DATA directive were not fully correct.
This patch fixes the situation. The async, if and finalize clauses are allowed
only once.

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D95470
2021-01-27 10:07:19 -05:00
Valentin Clement 5e09a02527 [flang][openacc] Fix clause restriction for host_data directive
Restrictions on clauses for the HOST_DATA directive were not fully correct.
This patch fixes the situation. The if and if_present clauses are allowed
only once.

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D95473
2021-01-27 10:06:33 -05:00
Kazu Hirata 6bde085366 [AMDGPU] Forward-declare TargetRegisterClass (NFC)
AMDGPUInstructionSelector.h needs TargetRegisterClass but relies on a
forward declaration of TargetRegisterClass in InstructionSelector.h.
This patch adds a forward declaration right in
AMDGPUInstructionSelector.h.

While we are at it, this patch removes the one in
InstructionSelector.h, where it is unnecessary.
2021-01-26 20:00:16 -08:00
Petr Hosek bb9eb19829 Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 17:13:34 -08:00
Duncan P. N. Exon Smith 2f721476d1 Frontend: Simplify handling of non-seeking streams in CompilerInstance, NFC
Add a new `raw_pwrite_ostream` variant, `buffer_unique_ostream`, which
is like `buffer_ostream` but with unique ownership of the stream it's
wrapping. Use this in CompilerInstance to simplify the ownership of
non-seeking output streams, avoiding logic sprawled around to deal with
them specially.
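
The ownership idea can be sketched with standard streams. `BufferedOwningStream` is a hypothetical analogue built on `std::ostream`, not the `raw_ostream` class added here: it owns the wrapped stream, collects output in memory, and forwards everything in one write when it is destroyed.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical analogue: buffers all writes and flushes them to an owned,
// possibly non-seekable, stream on destruction.
class BufferedOwningStream {
  std::unique_ptr<std::ostream> Out; // uniquely owned destination
  std::ostringstream Buffer;         // in-memory, seekable staging area

public:
  explicit BufferedOwningStream(std::unique_ptr<std::ostream> O)
      : Out(std::move(O)) {
    assert(Out && "destination stream must not be null");
  }

  // Runs before the members are destroyed, so Out is still alive here.
  ~BufferedOwningStream() { *Out << Buffer.str(); }

  std::ostream &stream() { return Buffer; }
};
```

Because the wrapper owns its destination, callers do not need a separate list of streams to keep alive and clean up.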

This also simplifies future work to encapsulate output files in a
different class.

Differential Revision: https://reviews.llvm.org/D93260
2021-01-26 15:20:43 -08:00
Haowei Wu 15313f64be [llvm-elfabi] Support ELF file that lacks .gnu.hash section
Before this change, when reading an ELF file, elfabi determined the number of
entries in .dynsym by reading the .gnu.hash section. This change makes
elfabi read section headers directly first, which allows elfabi
to work on ELF files that do not have .gnu.hash sections.

Differential Revision: https://reviews.llvm.org/D93362
2021-01-26 12:31:52 -08:00
Fangrui Song 34b60d8a56 Add -fbinutils-version= to gate ELF features on the specified binutils version
There are two use cases.

Assembler
We have accrued some code gated on MCAsmInfo::useIntegratedAssembler().  Some
features are supported by the latest GNU as, but we have to use
MCAsmInfo::useIntegratedAssembler() because the newer versions have not been widely
adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26).

Linker
We want to use features supported only by LLD or very new GNU ld, or don't want
to work around older GNU ld. We currently can't represent that "we don't care
about old GNU ld".  You can find such workarounds in a few other places, e.g.
Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp
AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276),
R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969)

Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001;
GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before it is available).
This feature allows garbage collecting some unused sections (e.g. fragmented .gcc_except_table).

This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc.
It changes one codegen place in SHF_MERGE to demonstrate its usage.
`-fbinutils-version=2.35` means the produced object file does not care about GNU
ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced
assembly can be consumed by GNU as>=2.35, but older versions may not work.

`-fbinutils-version=none` means that we can use all ELF features, regardless of
GNU as/ld support.

Both clang and llc need `parseBinutilsVersion`. Such command line parsing is
usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen),
however, ClangCodeGen does not depend on LLVMCodeGen. So I add
`parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget).
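
To make the option's semantics concrete, a parser along these lines would be enough. This is a hypothetical sketch using standard library types, not the `parseBinutilsVersion` added by the patch; in particular, treating malformed input as version 0.0 and modelling `none` as the maximum version are assumptions of the sketch.

```cpp
#include <cassert>
#include <climits>
#include <string>
#include <utility>

// Hypothetical sketch: parse "MAJOR.MINOR" into a pair, with "none" meaning
// "no compatibility constraint" (modelled as the largest possible version).
inline std::pair<int, int> parseVersionSketch(const std::string &S) {
  if (S == "none")
    return {INT_MAX, INT_MAX};
  std::pair<int, int> V{0, 0};
  int *Part = &V.first;
  for (char C : S) {
    if (C == '.' && Part == &V.first)
      Part = &V.second; // switch from major to minor at the first dot
    else if (C >= '0' && C <= '9')
      *Part = *Part * 10 + (C - '0');
    else
      return {0, 0}; // assumption: malformed input means "oldest"
  }
  return V;
}

// True if a feature introduced in Major.Minor may be used.
inline bool mayUseFeature(const std::pair<int, int> &Have, int Major,
                          int Minor) {
  return Have >= std::make_pair(Major, Minor);
}
```

With this model, a feature that needs 2.35 is enabled for `2.35` and `none`, and stays disabled for `2.26`.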

Differential Revision: https://reviews.llvm.org/D85474
2021-01-26 12:28:23 -08:00
Petr Hosek 1e634f3952 Revert "Support for instrumenting only selected files or functions"
This reverts commit 4edf35f11a because
the test fails on Windows bots.
2021-01-26 12:25:28 -08:00
Petr Hosek 4edf35f11a Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 11:11:39 -08:00
Simon Pilgrim f82cff31d3 [AMDGPU] HSAMD::fromString - replace std::string arg with StringRef. NFCI.
Removes an unnecessary chain of StringRef -> std::string -> StringRef conversions
2021-01-26 16:09:39 +00:00
Sander de Smalen b9417c3616 [CostModel] Handle CTLZ and CTTZ in getTypeBasedIntrinsicInstrCost
Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D95355
2021-01-26 14:37:51 +00:00
Sebastian Neubauer b36370d153 [AMDGPU] Add IntrWillReturn to three intrinsics
None of these can terminate a wave or lane.
With these, all intrinsics are IntrWillReturn except those that change
exec or can terminate the wave.

Not marking intrinsics as WillReturn may prevent optimizations in the
future: https://lists.llvm.org/pipermail/llvm-dev/2021-January/148047.html

Differential Revision: https://reviews.llvm.org/D95436
2021-01-26 15:33:15 +01:00