Commit Graph

389032 Commits

Author SHA1 Message Date
Fangrui Song 37561ba89b -fno-semantic-interposition: Don't set dso_local on GlobalVariable
`clang -fpic -fno-semantic-interposition` may set dso_local on variables for -fpic.

GCC folks distinguish 'address interposition' from 'semantic interposition':
'disabling semantic interposition' can optimize function calls but
cannot change variable references to use local aliases
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100483).

This patch removes dso_local for variables in
`clang -fpic -fno-semantic-interposition` mode so that the built shared objects can
work with copy relocations. Building llvm-project itself with
-fno-semantic-interposition (D102453) should now be safe with trunk Clang.

Example:
```
// a.c
int var;
int *addr() { return var; }

// old: cannot be interposed
movslq  .Lvar$local(%rip), %rax
// new: can be interposed
movq    var@GOTPCREL(%rip), %rax
movslq  (%rax), %rax
```

The local alias lowering for `GlobalVariable`s is kept in case there is a
future option allowing local aliases.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D102583
2021-05-19 16:08:28 -07:00
Richard Smith 2f8ac0758b PR50402: Use proper constant evaluation rules for checking constraint
satisfaction.

Previously we used the rules for constant folding in a non-constant
context, meaning that we'd incorrectly accept foldable non-constant
expressions and that std::is_constant_evaluated() would evaluate to
false.
2021-05-19 16:02:53 -07:00
LLVM GN Syncbot f2c97605a0 [gn build] Port 4bf69fb52b 2021-05-19 22:27:27 +00:00
Sam Clegg 356b85edd7 [lld][WebAssembly] Fix for string tail merging and -r/--relocatable
Ensure that both SyntheticMergedChunk and all MergeInfoChunks that it
comprises are assigned the correct output section.  Without this we
would crash when outputting relocations in --relocatable mode.

Fixes: https://github.com/emscripten-core/emscripten/issues/14220

Differential Revision: https://reviews.llvm.org/D102806
2021-05-19 15:25:58 -07:00
Ahmed Bougacha c9dbaa4c86 [docs] Describe reporting security issues on the chromium tracker.
To track security issues, we're starting with the chromium bug tracker
(using the llvm project there).

We considered using Github Security Advisories.  However, they are
currently intended as a way for project owners to publicize their
security advisories, and aren't well-suited to reporting issues.

This also moves the issue-reporting paragraph to the beginning of the
document, in part to make it more discoverable, in part to allow the
anchor-linking to actually display the paragraph at the top of the page.

Note that this doesn't update the concrete list of security-sensitive
areas, which is still an open item.  When we do, we may want to move the
list of security-sensitive areas next to the issue-reporting paragraph
as well, as it seems like relevant information needed in the reporting
process.

Finally, when describing the discussion medium, this splits the topics
discussed into two: the concrete security issues, discussed in the
issue tracker, and the logistics of the group, in our mailing list,
as patches on public lists, and in the monthly sync-up call.

While there, add a SECURITY.md page linking to the relevant paragraph.

Differential Revision: https://reviews.llvm.org/D100873
2021-05-19 15:21:50 -07:00
Jon Roelofs 4bf69fb52b [Remarks] Add analysis remarks for memset/memcpy/memmove lengths
Differential revision: https://reviews.llvm.org/D102452
2021-05-19 15:09:18 -07:00
Ryan Prichard 65d0264ba2 [MC][ARM] Reject Thumb "ror rX, #0"
The ROR instruction can only handle immediates between 1 and 31. The
would-be encoding for ROR #0 is actually the RRX instruction.
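
The range rule above can be sketched as a small check. This is only an illustration: `isValidRorImmediate` and `rorImm5Field` are hypothetical names, not the functions the assembler actually uses.

```cpp
#include <cassert>

// Hypothetical helper (not LLVM's API): a ROR immediate is only valid in the
// range 1..31, because a zero shift amount with the ROR shift type is the
// encoding of RRX.
bool isValidRorImmediate(long imm) { return imm >= 1 && imm <= 31; }

// Hypothetical helper: returns the 5-bit shift amount for a valid immediate.
// Callers are expected to reject invalid values (including #0) beforehand.
unsigned rorImm5Field(long imm) {
  assert(isValidRorImmediate(imm) && "ROR immediate must be in 1..31");
  return static_cast<unsigned>(imm);
}
```

A parser built this way would diagnose `#0` up front instead of silently producing the RRX encoding.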

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D102455
2021-05-19 15:05:39 -07:00
Petr Hosek 757a851a2c [CMake] Don't LTO optimize targets that aren't part of any distribution
When using distributions, targets that aren't included in any
distribution don't need to be as optimized as targets that are
included since those targets are typically only used for tests.

We might consider avoiding LTO for these targets altogether, see
https://lists.llvm.org/pipermail/llvm-dev/2021-April/149843.html

Differential Revision: https://reviews.llvm.org/D102732
2021-05-19 15:02:11 -07:00
hasheddan 0316f3e649 [mlir][docs] Fix minor typos in vector dialect docs
Updates a minor typo in vector dialect documentation.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D101203
2021-05-19 14:20:28 -07:00
Martin Storsjö 688b917b4b Revert "[Driver] Delete -mimplicit-it="
This reverts commit 2919222d80.

That commit broke backwards compatibility. Additionally, the
replacement, -Wa,-mimplicit-it, isn't yet supported by any stable
release of Clang.

See D102812 for a fix for the error cases when callers specify both
-mimplicit-it and -Wa,-mimplicit-it.
2021-05-20 00:17:50 +03:00
Vitaly Buka 09a8372726 [NFC][tsan] clang-format the test 2021-05-19 14:03:50 -07:00
Aart Bik bf9ef3efaa [mlir][sparse] skip sparsification for unannotated (or unhandled) cases
Skip the sparsification pass for Linalg ops without annotated tensors
(or cases that are not properly handled yet).

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102787
2021-05-19 13:49:28 -07:00
River Riddle 3b7f8daed4 [mlir] Properly align StorageUniquer::BaseStorage to fix 32 bit build
We allow stealing up to 3 bits of pointers to BaseStorage, so we need to make sure that we align by at least 8.
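
A minimal sketch of why 8-byte alignment is needed, assuming a stand-in type: `Storage`, `packPointer`, `unpackPointer` and `unpackTag` are hypothetical and are not the MLIR classes touched by this commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a storage base class. alignas(8) guarantees the
// three low address bits are zero, even on 32-bit targets where the natural
// alignment of this struct would only be 4.
struct alignas(8) Storage {
  int payload = 0;
};

constexpr std::uintptr_t kTagMask = 0x7;  // the three stolen low bits

// Store a small tag in the low bits of an aligned pointer.
std::uintptr_t packPointer(Storage *ptr, unsigned tag) {
  auto raw = reinterpret_cast<std::uintptr_t>(ptr);
  assert((raw & kTagMask) == 0 && "pointer is not 8-byte aligned");
  assert(tag <= kTagMask && "tag does not fit in three bits");
  return raw | tag;
}

// Recover the original pointer by masking the tag bits away.
Storage *unpackPointer(std::uintptr_t packed) {
  return reinterpret_cast<Storage *>(packed & ~kTagMask);
}

// Recover the tag.
unsigned unpackTag(std::uintptr_t packed) {
  return static_cast<unsigned>(packed & kTagMask);
}
```

Without the `alignas(8)`, a 4-byte-aligned object could have bit 2 set in its address, and packing a tag would corrupt the pointer.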
2021-05-19 13:46:43 -07:00
Marius Brehler f878e1af9f [mlir] Harmonize TOSA include guards
Reviewed By: sjarus

Differential Revision: https://reviews.llvm.org/D102802
2021-05-19 22:45:06 +02:00
Sean Silva 35454268cf [mlir][CAPI] Expose [u]int8 DenseElementsAttr.
Also, fix a small typo where the "unsigned" splat variants were not
being created with an unsigned type.

Differential Revision: https://reviews.llvm.org/D102797
2021-05-19 13:41:44 -07:00
wlei 6539a80bc9 [CSSPGO] Avoid deleting probe instruction in FoldValueComparisonIntoPredecessors
This change tries to fix a place that is missing a call to `moveAndDanglePseudoProbes`. FoldValueComparisonIntoPredecessors folds the BB into its predecessors and then marks the BB unreachable. However, the original logic from the BB is still alive, so deleting the probe would mislead the SampleLoader into marking it as a zero-count sample.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D102721
2021-05-19 13:39:05 -07:00
Richard Smith d38057f3ec Treat implicit deduction guides as being equivalent to their
corresponding constructor for access checking purposes.
2021-05-19 13:31:53 -07:00
Lang Hames 1dfa47910a [ORC-RT] Add ORC runtime error and expected types.
These will be used for error propagation and handling in the ORC runtime.

The implementations of these types are cut-down versions of the error
support in llvm/Support/Error.h. Most advice on llvm::Error and llvm::Expected
(e.g. from the LLVM Programmer's manual) applies equally to __orc_rt::Error
and __orc_rt::Expected. The primary difference is the mechanism for testing
and handling error types: The ORC runtime uses a new 'error_cast' operation
to replace the handleErrors family of functions. See error_cast comments in
error.h.
2021-05-19 13:31:25 -07:00
Lang Hames ef6e1213b1 [ORC] Add a CPU getter to JITTargetMachineBuilder. 2021-05-19 13:31:25 -07:00
Raphael Isemann 30a5ddaef3 Revert "[lldb] Fix UB in half2float and add some more tests."
This reverts commit 4b074b49be.

Some of the new tests are failing on Debian.
2021-05-19 22:06:53 +02:00
Marius Brehler 745ddd27ea [mlir] Add include guard to TOSA tblgen passes
Reviewed By: sjarus, stellaraccident

Differential Revision: https://reviews.llvm.org/D102800
2021-05-19 22:02:31 +02:00
River Riddle 3b43226032 [Reland] [mlir] Speed up Lexer::getEncodedSourceLocation
Reland Note: This was accidentally reverted in 80d981eda6, but is an important improvement even outside of the driving motivator in D102567.

We currently use SourceMgr::getLineAndColumn to get the line and column for an SMLoc, but this includes a call to StringRef::find_last_of that ends up dominating compile time. In D102567, we start creating locations from the input file for block arguments, which resulted in an extreme performance regression for modules with very large numbers of block arguments. This revision switches to just using a pointer offset from the beginning of the line to calculate the column (all MLIR files are simple ASCII), resulting in a compile time reduction from 4700 seconds (1 hour and 18 minutes) to 8 seconds.
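
The pointer-offset idea can be sketched as follows. The helper names are hypothetical, and `findLineStart` (a backwards scan) is only an assumption made so the example is self-contained; it is not necessarily how the lexer locates the start of the line.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: 1-based column of `loc`, given a pointer to the first
// character of its line. Assumes one byte per column (plain ASCII input).
unsigned columnFromLineStart(const char *lineStart, const char *loc) {
  assert(lineStart <= loc && "location precedes the start of its line");
  return static_cast<unsigned>(loc - lineStart) + 1;
}

// Hypothetical helper: walk backwards from `loc` to the character after the
// previous newline, or to the start of the buffer.
const char *findLineStart(std::string_view buffer, const char *loc) {
  const char *p = loc;
  while (p != buffer.data() && p[-1] != '\n')
    --p;
  return p;
}
```

The column computation itself is a single subtraction, so its cost does not depend on the size of the buffer.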
2021-05-19 12:57:18 -07:00
Arthur Eubanks 0bebda17be [OpaquePtr] Make atomicrmw work with opaque pointers
FullTy is only necessary when we need to figure out what type an
instruction works with given a pointer's pointee type. However, we just
end up using the value operand's type, so FullTy isn't necessary.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102788
2021-05-19 12:49:28 -07:00
Arthur Eubanks 1b25fce404 [OpaquePtr] Make cmpxchg work with opaque pointers
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102745
2021-05-19 12:44:10 -07:00
Reid Kleckner 12dd8df38b [PDB] Do not record PGO or coverage public symbols
These symbols are long, and they tend to cause the PDB file size to
overflow. They are generally not necessary when debugging problems in
user code.

This change reduces the size of chrome.dll.pdb with coverage from
6,937,108,480 bytes to 4,690,210,816 bytes.

Differential Revision: https://reviews.llvm.org/D102719
2021-05-19 12:41:31 -07:00
Arthur Eubanks 28b9771472 [OpaquePtr] Make GEPs work with opaque pointers
No verifier changes needed, the verifier currently doesn't check that
the pointer operand's pointee type matches the GEP type. There is a
similar check in GetElementPtrInst::Create() though.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102744
2021-05-19 12:39:37 -07:00
Raphael Isemann 4b074b49be [lldb] Fix UB in half2float and add some more tests.
The added DumpDataExtractorTest uncovered that this is lshifting a negative
integer which upsets ubsan and breaks the sanitizer bot. This patch just
changes the variable we shift to be unsigned and adds a bunch of tests to make
sure this function does what it promises.
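
As an illustration of the kind of change described (not the actual half2float code), converting to an unsigned type before shifting avoids the undefined behaviour. `shiftIntoHighBits` is a hypothetical function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: move a 16-bit pattern towards the high bits of a
// 32-bit word. Left-shifting a negative signed integer is undefined behaviour
// before C++20, so the value is converted to unsigned first.
std::uint32_t shiftIntoHighBits(std::int16_t value, unsigned amount) {
  assert(amount < 32 && "shift amount must be smaller than the bit width");
  auto bits = static_cast<std::uint32_t>(static_cast<std::uint16_t>(value));
  return bits << amount;
}
```

Unsigned shifts are well defined: bits shifted past the top are simply discarded.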
2021-05-19 21:37:10 +02:00
Alex Lorenz 50be48b0f3 [clang][ObjC] Allow different availability annotation on a method
when implementing an optional protocol requirement

When an Objective-C method implements an optional protocol requirement,
allow the method to use a newer introduced or older obsoleted
availability version than what's specified on the method in the protocol
itself. This allows SDK adopters to adopt an optional method from a
protocol later than when the method is introduced in the protocol. The users
that call an optional method on an object that conforms to this protocol
are supposed to check whether the object implements the method or not,
so the lack of an appropriate `if (@available)` check for a new OS version
is not a cause for concern, as another runtime check is already required.

Differential Revision: https://reviews.llvm.org/D102459
2021-05-19 12:13:57 -07:00
Joseph Huber 2db182ff8d [Diagnostics] Allow emitting analysis and missed remarks on functions
Summary:
Currently, only `OptimizationRemark` can be emitted using a Function.
Add constructors to allow this for `OptimizationRemarkAnalysis` and
`OptimizationRemarkMissed` as well.

Reviewed By: jdoerfert, thegameg

Differential Revision: https://reviews.llvm.org/D102784
2021-05-19 15:10:20 -04:00
Sanjay Patel 9b59a61cfc [x86] add tests for fma folds with fast-math-flags; NFC
Part of prep work for D90901
2021-05-19 14:28:57 -04:00
Sanjay Patel f12f9beb04 [x86] propagate FMF from x86-specific intrinsic nodes to others during combining
This is another FMF gap exposed by D90901, but I don't see a way
to show the difference in a regression test as with:
f66ba4c
6025663

We will see an asm difference if we add a test as part of D90901.
2021-05-19 14:25:09 -04:00
Nico Weber fd09a764eb [lld/mac] Remove dead declaration 2021-05-19 14:18:03 -04:00
Christopher Di Bella d8fad66149 [libcxx][ranges] adds concept `sized_range` and cleans up `ranges::size`
* adds `sized_range` and conformance tests
* moves `disable_sized_range` into namespace `std::ranges`
* removes explicit type parameter

Implements part of P0896 'The One Ranges Proposal'.

Differential Revision: https://reviews.llvm.org/D102434
2021-05-19 18:16:45 +00:00
Andrea Di Biagio 9acabe8b6f [MCA] Unbreak the buildbots by passing flag -mcpu=generic to the new test added by commit e5d59db469.
This should unbreak buildbot clang-ppc64le-linux-lnt.
2021-05-19 19:12:33 +01:00
Christopher Di Bella 0f80365722 [libcxx][iterator][nfc] acquires lock for working on [range.iter.ops]
Differential Revision: https://reviews.llvm.org/D101845
2021-05-19 18:05:33 +00:00
Sanjay Patel 333c968d40 [x86] update fma test with deprecated intrinsics; NFC
Similar to 8854b27 -

All of the CHECK lines should be identical to before,
but without any of the x86-specific calls that were
replaced with generic FMA long ago.

The file still has value because it shows a miscompile
as demonstrated in D90901, but we probably need to
add tests with FMF to make that explicit without
losing coverage.
2021-05-19 13:52:08 -04:00
Stephen Neuendorffer 29a50c5864 [MLIR] Update Vector To LLVM conversion to be aware of assume_alignment
vector.transfer_read and vector.transfer_write operations are converted
to llvm intrinsics with specific alignment information, however there
doesn't seem to be a way in llvm to take information from llvm.assume
intrinsics and change this alignment information.  In any
event, due to the structure of the llvm.assume intrinsic, applying
this information at the llvm level is more cumbersome.  Instead, let's
generate the masked vector load and store intrinsics with the right
alignment information from MLIR in the first place.  Since
we're bothering to do this, let's just emit the proper alignment for
loads, stores, scatter, and gather ops too.

Differential Revision: https://reviews.llvm.org/D100444
2021-05-19 10:50:48 -07:00
Pirama Arumuga Nainar e4274cfe06 [CoverageMapping] Handle gaps in counter IDs for source-based coverage
For source-based coverage, the frontend sets the counter IDs and the
constraints on counter IDs are not defined.  For example, the Rust frontend
until recently had a reserved counter #0
(https://github.com/rust-lang/rust/pull/83774).  Rust coverage
instrumentation also creates counters on edges in addition to basic
blocks.  Some functions may have more counters than regions.

This breaks an assumption in CoverageMapping.cpp where the number of
counters in a function is assumed to be bounded by the number of
regions:
  Counts.assign(Record.MappingRegions.size(), 0);

This assumption causes CounterMappingContext::evaluate() to fail since
there are not enough counter values created in the above call to
`Counts.assign`.  Consequently, some uncovered functions are not
reported in coverage reports.

This change walks a Function's CoverageMappingRecord to find the maximum
counter ID, and uses it to initialize the counter array when instrprof
records are missing for a function in sparse profiles.
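
A simplified sketch of the sizing change, assuming every region refers to a plain counter ID. `Region`, `makeZeroCounts` and `lookupCount` are hypothetical names, and real mapping records are more involved than this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified mapping region: it only names a counter ID.
struct Region {
  unsigned counterID;
};

// Size the zero-filled counter array by the largest counter ID seen, not by
// the number of regions, so gaps in the ID space are tolerated.
std::vector<std::uint64_t> makeZeroCounts(const std::vector<Region> &regions) {
  if (regions.empty())
    return {};
  unsigned maxID = 0;
  for (const Region &r : regions)
    maxID = std::max(maxID, r.counterID);
  return std::vector<std::uint64_t>(maxID + 1, 0);
}

// Hypothetical lookup that would go out of bounds with the old sizing.
std::uint64_t lookupCount(const std::vector<std::uint64_t> &counts,
                          unsigned counterID) {
  assert(counterID < counts.size() && "counter ID out of range");
  return counts[counterID];
}
```

With three regions using IDs 1, 3 and 5, sizing by region count would give three slots and ID 5 would not fit; sizing by the maximum ID gives six.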

Differential Revision: https://reviews.llvm.org/D101780
2021-05-19 10:46:38 -07:00
Roman Lebedev 40fb4eeff9 [NFCI][Local] TryToSimplifyUncondBranchFromEmptyBlock(): use DeleteDeadBlocks() 2021-05-19 20:38:30 +03:00
Roman Lebedev c60ca9856c [NFCI][Local] MergeBlockIntoPredecessor(): use DeleteDeadBlocks() 2021-05-19 20:38:30 +03:00
Roman Lebedev b0bb2149b3 [NFCI][Local] removeUnreachableBlocks(): use DeleteDeadBlocks() 2021-05-19 20:38:30 +03:00
Patrick Holland e5d59db469 [MCA] llvm-mca MCTargetStreamer segfault fix
In order to create the code regions for llvm-mca to analyze, llvm-mca creates an
AsmCodeRegionGenerator and calls AsmCodeRegionGenerator::parseCodeRegions().
Within this function, both an MCAsmParser and MCTargetAsmParser are created so
that MCAsmParser::Run() can be used to create the code regions for us.

These parser classes were created for llvm-mc so they are designed to emit code
with an MCStreamer and MCTargetStreamer that are expected to be set up and passed
into the MCAsmParser constructor. Because llvm-mca doesn’t want to emit any
code, an MCStreamerWrapper class gets created instead and passed into the
MCAsmParser constructor. This wrapper inherits from MCStreamer and overrides
many of the emit methods to just do nothing. The exception is the
emitInstruction() method which calls Regions.addInstruction(Inst).

This works well and allows llvm-mca to utilize llvm-mc’s MCAsmParser to build
our code regions, however there are a few directives which rely on the
MCTargetStreamer. llvm-mc assumes that the MCStreamer that gets passed into the
MCAsmParser’s constructor has a valid pointer to an MCTargetStreamer. Because
llvm-mca doesn’t set up an MCTargetStreamer, when the parser encounters one of
those directives, a segfault will occur.

In x86, each one of these 7 directives will cause this segfault if they exist in
the input assembly to llvm-mca:

.cv_fpo_proc
.cv_fpo_setframe
.cv_fpo_pushreg
.cv_fpo_stackalloc
.cv_fpo_stackalign
.cv_fpo_endprologue
.cv_fpo_endproc

I haven’t looked at other targets, but I wouldn’t be surprised if some of the
other ones also have certain directives which could result in this same
segfault.

My proposed solution is to simply initialize an MCTargetStreamer after we
initialize the MCStreamerWrapper. The MCTargetStreamer requires an ostream
object, but we don’t actually want any of these directives to be emitted
anywhere, so I use an ostream created with the nulls() function. Since this
needs to happen after the MCStreamerWrapper has been initialized, it needs to
happen within the AsmCodeRegionGenerator::parseCodeRegions() function. The
MCTargetStreamer also needs an MCInstPrinter which is easiest to initialize
within the main() function of llvm-mca. So this MCInstPrinter gets constructed
within main() then passed into the parseCodeRegions() function as a parameter.
(If you feel like it would be appropriate and possible to create the
MCInstPrinter within the parseCodeRegions() function, then feel free to modify
my solution. That would stop us from having to pass it into the function and
would limit its scope / lifetime.)

My solution stops the segfault from happening and still passes all of the
current (expected) llvm-mca tests. I also added a new test for x86 that checks
for this segfault on an input that includes one of the .cv_fpo directives (this
test fails without my solution, but passes with it).

As far as I can tell, all of the functions that I modified are only called from
within llvm-mca so there shouldn’t be any worries about breaking other tools.

Differential Revision: https://reviews.llvm.org/D102709
2021-05-19 18:36:10 +01:00
Philip Reames 449d14ebd2 Do actual DCE in LoopUnroll (try 4)
Turns out simplifyLoopIVs sometimes returns a non-dead instruction in its DeadInsts out param.  I had done a bit of NFC cleanup which was only NFC if simplifyLoopIVs obeyed its documentation.  I'm simply dropping that part of the change.

Commit message from try 3:

Recommitting after fixing a bug found post commit. Amusingly, try 1 had been correct, and by reverting to incorporate last minute review feedback, I introduced the bug. Oops. :)

Original commit message:

The problem was that recursively deleting an instruction can delete instructions beyond the current iterator (via a dead phi), thus invalidating iteration. Test case added in LoopUnroll/dce.ll to cover this case.

LoopUnroll does a limited DCE pass after unrolling, but if you have a chain of dead instructions, it only deletes the last one. Improve the code to recursively delete all trivially dead instructions.
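
A toy sketch of recursive deletion with a worklist, assuming a made-up IR: `Inst` and `deleteDeadRecursively` are hypothetical and much simpler than LLVM's instruction classes. Marking entries as erased instead of removing them from the container is one way to keep outer iteration valid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction: operands are indices of other instructions in the vector.
struct Inst {
  std::vector<std::size_t> operands;
  unsigned useCount = 0;
  bool hasSideEffects = false;
  bool erased = false;
};

// An instruction is trivially dead when nothing uses it and it has no side
// effects.
bool isTriviallyDead(const Inst &inst) {
  return !inst.erased && inst.useCount == 0 && !inst.hasSideEffects;
}

// Erase the seed instructions and, transitively, any operands that become
// dead as a result. Returns the number of instructions erased.
std::size_t deleteDeadRecursively(std::vector<Inst> &insts,
                                  std::vector<std::size_t> worklist) {
  std::size_t erasedCount = 0;
  while (!worklist.empty()) {
    std::size_t idx = worklist.back();
    worklist.pop_back();
    assert(idx < insts.size() && "worklist entry out of range");
    Inst &inst = insts[idx];
    if (!isTriviallyDead(inst))
      continue;
    inst.erased = true;  // mark only; the container is not resized
    ++erasedCount;
    for (std::size_t op : inst.operands) {
      assert(insts[op].useCount > 0 && "use count underflow");
      if (--insts[op].useCount == 0)
        worklist.push_back(op);  // operand may now be dead as well
    }
  }
  return erasedCount;
}
```

Given a chain where instruction 2 uses 1 and 1 uses 0, seeding the worklist with 2 erases all three, whereas a single non-recursive pass would only erase the last one.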

Differential Revision: https://reviews.llvm.org/D102511
2021-05-19 10:25:31 -07:00
Frederik Gossen 76b8754d1b Revert "Reapply "[clang][deps] Support inferred modules""
This reverts commit c98833cdaa.
The test `ClangScanDeps/modules-inferred-explicit-build.m` creates files
in the current directory.
2021-05-19 19:19:37 +02:00
Sanjay Patel f66ba4cfa7 [x86] propagate FMF from x86-specific intrinsic nodes to others during lowering
This is another fast-math-flags failure exposed by D90901.
2021-05-19 13:11:15 -04:00
Sanjay Patel 25207d5f81 [x86] add test check lines to demonstrate FMF propagation failure; NFC 2021-05-19 13:11:15 -04:00
Nikita Popov b661a55a25 [ScalarEvolution] Remove unused ExitLimit::hasOperand() method (NFC)
We only use BackedgeTakenInfo::hasOperand().
2021-05-19 18:42:14 +02:00
Vedant Kumar 7014a10161 [profile] Skip mmap() if there are no counters
If there are no counters, an mmap() of the counters section would fail
due to the size argument being too small (EINVAL).
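
The guard can be sketched like this on a POSIX system. `mapCounters` is a hypothetical wrapper, not the profile runtime's function, and the protection and mapping flags are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical wrapper: map `sizeInBytes` bytes of `fd`, or return nullptr
// without calling mmap() at all when there is nothing to map. POSIX specifies
// that mmap() fails with EINVAL when the length is zero.
void *mapCounters(int fd, std::size_t sizeInBytes) {
  if (sizeInBytes == 0)
    return nullptr;  // no counters: skip the mapping entirely
  assert(fd >= 0 && "a valid file descriptor is required");
  void *addr =
      mmap(nullptr, sizeInBytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  return addr == MAP_FAILED ? nullptr : addr;
}
```

Callers then need to treat a null result for an empty counters section as a normal case and not as an error.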

rdar://78175925

Differential Revision: https://reviews.llvm.org/D102735
2021-05-19 09:31:40 -07:00
Jessica Paquette 84ae1cf8ed Recommit "[GlobalISel] Simplify G_ICMP to true/false when the result is known"
Add missing REQUIRES line to
prelegalizer-combiner-icmp-to-true-false-known-bits.
2021-05-19 09:29:19 -07:00
Hongtao Yu 4ca6e37b98 [CSSPGO] Overwrite branch weight annotated in previous pass.
Sample profile loader can be run in both LTO prelink and postlink. Currently the counts annotation in postlink doesn't fully overwrite what's done in prelink. I'm adding a switch (`-overwrite-existing-weights=1`) to enable a full overwrite, which includes:

1. Clear old metadata for calls when their parent block has a zero count. This could be caused by prelink code duplication.

2. Clear indirect call metadata if somehow all the remaining targets have a sum of zero count.

3. Overwrite branch weight for basic blocks.

With a CS profile, I was seeing #1 and #2 help reduce code size by preventing post-sample ICP and the CGSCC inliner from working on obsolete metadata, which comes from a partial global inlining in prelink.  It's not expected to work well for the non-CS case with a less-accurate post-inline count quality.

It's worth calling out that some prelink optimizations can damage counts quality in an irreversible way. One example is the loop rotate optimization. Due to the lack of an exact loop entry count (profiling can only give loop iteration count and loop exit count), moving one iteration out of the loop body leaves the remaining iteration count unknown. We had to turn off prelink loop rotate to achieve a better postlink counts quality. An even better postlink counts quality can be achieved by turning off prelink CGSCC inlining, which is not context-sensitive.

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D102537
2021-05-19 09:12:24 -07:00