Commit Graph

370151 Commits

Author SHA1 Message Date
Arthur Eubanks e5766f25c6 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-26 20:24:04 -07:00
Arthur Eubanks 4af5ba1726 Revert "[AlwaysInliner] Pass callee AAResults to InlineFunction()"
This reverts commit 504fbec7a6.

Test failure.
2020-10-26 20:23:38 -07:00
Bing1 Yu 2c08f1b4b6 [CostModel][X86] teach TTI calculate cost of chain of vector inserts/extracts more precisely and correctly:In each 128-lane, if there is at least one index is demanded and not all indices are demanded...
In each 128-lane, if there is at least one index is demanded and not all
indices are demanded and this 128-lane is not the first 128-lane of the
legalized-vector, then this 128-lane needs a extracti128;
If in each 128-lane, there is at least one index is demanded, this 128-lane
needs a inserti128.

The following cases will help you build a better understanding:
Assume we insert several elements into a v8i32 vector in avx2,
Case#1: inserting into 1th index needs vpinsrd + inserti128
Case#2: inserting into 5th index needs extracti128 + vpinsrd +
inserti128
Case#3: inserting into 4,5,6,7 index needs 4*vpinsrd + inserti128.

Reviewed By: pengfei, RKSimon

Differential Revision: https://reviews.llvm.org/D89767
2020-10-27 11:21:13 +08:00
Arthur Eubanks 504fbec7a6 [AlwaysInliner] Pass callee AAResults to InlineFunction()
Test copied from noalias-calls.ll with small changes.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89609
2020-10-26 20:10:09 -07:00
Arthur Eubanks a87f618ea7 [PlaceSafepoints] Pin tests to legacy PM
This pass isn't used in tree and can be ported to the NPM later on if desired.

Differential Revision: https://reviews.llvm.org/D90189
2020-10-26 20:07:37 -07:00
Arthur Eubanks 3dd1c72458 Port -objc-arc-expand to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90182
2020-10-26 20:05:10 -07:00
Arthur Eubanks 90c0b0d3d6 Port -objc-arc-apelim to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90181
2020-10-26 20:01:46 -07:00
River Riddle eda450bb27 [mlir][SymbolTable] Use Identifier instead of StringRef when looking up symbol name attributes
Using an Identifier is much more efficient for attribute lookups because it uses pointer comparison as opposed to string comparison.

Differential Revision: https://reviews.llvm.org/D89660
2020-10-26 19:40:19 -07:00
River Riddle 67f52f35d6 [mlir][StorageUniquer] Refactor parametric storage to use sharded dense sets
This revisions implements sharding in the storage of parametric instances to decrease lock contention by sharding out the allocator/mutex/etc. to use for a specific storage instance based on the hash key. This is a somewhat common approach to reducing lock contention on data structures, and is used by the concurrent hashmaps provided by folly/java/etc. For several compilations tested, this removed all/most lock contention from profiles and reduced compile time by several seconds.

Differential Revision: https://reviews.llvm.org/D89659
2020-10-26 19:40:19 -07:00
Shilei Tian e20d64c3d9 [Clang][OpenMP] Fixed an issue of segment fault when using target nowait
The implementation of target nowait just wraps the target region into a task. The essential four parameters (base ptr, ptr, size, mapper) are taken as firstprivate such that they will be copied to the private location. When there is no user-defined mapper, the mapper variable will be nullptr. However, it will be still copied to the corresponding place. Therefore, a memcpy will be generated and the source pointer will be nullptr, causing a segmentation fault. The root cause is when calling `emitOffloadingArraysArgument`, the last argument `Options` has a field about whether it requires a task. It only takes depend clause into account. In this patch, the nowait clause is also included.

There're two things that will be done in another patches:
1. target data nowait has not been supported yet. D90099 added the support.
2. When there is no mapper, the mapper array can be nullptr no matter whether it requires outer task or not. It can avoid an unnecessary data copy. This is an optimization that is covered in D90101.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89844
2020-10-26 22:33:22 -04:00
Chen Zheng 00e573cadb [LSR] fix typo in comments and rename for a new added hook. 2020-10-26 22:29:22 -04:00
Duncan P. N. Exon Smith f057e6dc5e SourceManager: clang-format the SrcMgr namespace, NFC 2020-10-26 21:58:52 -04:00
Duncan P. N. Exon Smith ebb4ea1d53 IR: Simplify two loops walking ConstantDataSequential, NFC
Follow-up to b2b7cf39d5.

Differential Revision: https://reviews.llvm.org/D90198
2020-10-26 21:55:48 -04:00
Craig Topper 470d2d1d28 Update email addresses in CODE_OWNERS. 2020-10-26 18:51:04 -07:00
Chandler Carruth aaf7ffd4e1 Teach `-fsanitize=fuzzer` to respect `-static` and `-static-libstdc++` when adding C++ standard libraries.
Summary:
Makes linking the sanitizers follow the same logic as the rest of the
driver with respect to the static linking strategy for the C++ standard
library.

Subscribers: mcrosier, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80488
2020-10-27 01:36:54 +00:00
Carl Ritson 7a880ab388 [AMDGPU] Move WQM Pass after MI Scheduler
Exec mask manipulation inserted by SIWholeQuadMode barriers to
instruction scheduling.  Move the entire pass after the machine
instruction scheduler and make changes so pass is correct for
non-SSA operation.  These changes should leave the pass still
usable pre-scheduler, although tests have be updated to reflect
post-scheduler results.

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D88081
2020-10-27 10:25:53 +09:00
TaWeiTu 0efbfa38ae [NPM] Port -slsr to NPM
`-separate-const-offset-from-gep` has not yet be ported, so some tests are not updated.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D90149
2020-10-27 09:21:40 +08:00
Duncan P. N. Exon Smith 52821f6a71 IR: Add a comment at missing std::make_unique calls from b2b7cf39d5, NFC 2020-10-26 21:18:34 -04:00
River Riddle 3fffffa882 [mlir][Pattern] Add a new FrozenRewritePatternList class
This class represents a rewrite pattern list that has been frozen, and thus immutable. This replaces the uses of OwningRewritePatternList in pattern driver related API, such as dialect conversion. When PDL becomes more prevalent, this API will allow for optimizing a set of patterns once without the need to do this per run of a pass.

Differential Revision: https://reviews.llvm.org/D89104
2020-10-26 18:01:06 -07:00
River Riddle b6eb26fd0e [mlir][NFC] Move around the code related to PatternRewriting to improve layering
There are several pieces of pattern rewriting infra in IR/ that really shouldn't be there. This revision moves those pieces to a better location such that they are easier to evolve in the future(e.g. with PDL). More concretely this revision does the following:

* Create a Transforms/GreedyPatternRewriteDriver.h and move the apply*andFold methods there.
The definitions for these methods are already in Transforms/ so it doesn't make sense for the declarations to be in IR.

* Create a new lib/Rewrite library and move PatternApplicator there.
This new library will be focused on applying rewrites, and will also include compiling rewrites with PDL.

Differential Revision: https://reviews.llvm.org/D89103
2020-10-26 18:01:06 -07:00
River Riddle b99bd77162 [mlir][Pattern] Refactor the Pattern class into a "metadata only" class
The Pattern class was originally intended to be used for solely matching operations, but that use never materialized. All of the pattern infrastructure uses RewritePattern, and the infrastructure for pure matching(Matchers.h) is implemented inline. This means that this class isn't a useful abstraction at the moment, so this revision refactors it to solely encapsulate the "metadata" of a pattern. The metadata includes the various state describing a pattern; benefit, root operation, etc. The API on PatternApplicator is updated to now operate on `Pattern`s as nothing special from `RewritePattern` is necessary.

This refactoring is also necessary for the upcoming use of PDL patterns alongside C++ rewrite patterns.

Differential Revision: https://reviews.llvm.org/D86258
2020-10-26 18:01:06 -07:00
River Riddle 8a1ca2cd34 [mlir] Add a conversion pass between PDL and the PDL Interpreter Dialect
The conversion between PDL and the interpreter is split into several different parts.
** The Matcher:

The matching section of all incoming pdl.pattern operations is converted into a predicate tree and merged. Each pattern is first converted into an ordered list of predicates starting from the root operation. A predicate is composed of three distinct parts:
* Position
  - A position refers to a specific location on the input DAG, i.e. an
    existing MLIR entity being matched. These can be attributes, operands,
    operations, results, and types. Each position also defines a relation to
    its parent. For example, the operand `[0] -> 1` has a parent operation
    position `[0]` (the root).
* Question
  - A question refers to a query on a specific positional value. For
  example, an operation name question checks the name of an operation
  position.
* Answer
  - An answer is the expected result of a question. For example, when
  matching an operation with the name "foo.op". The question would be an
  operation name question, with an expected answer of "foo.op".

After the predicate lists have been created and ordered(based on occurrence of common predicates and other factors), they are formed into a tree of nodes that represent the branching flow of a pattern match. This structure allows for efficient construction and merging of the input patterns. There are currently only 4 simple nodes in the tree:
* ExitNode: Represents the termination of a match
* SuccessNode: Represents a successful match of a specific pattern
* BoolNode/SwitchNode: Branch to a specific child node based on the expected answer to a predicate question.

Once the matcher tree has been generated, this tree is walked to generate the corresponding interpreter operations.

 ** The Rewriter:
The rewriter portion of a pattern is generated in a very straightforward manor, similarly to lowerings in other dialects. Each PDL operation that may exist within a rewrite has a mapping into the interpreter dialect. The code for the rewriter is generated within a FuncOp, that is invoked by the interpreter on a successful pattern match. Referenced values defined in the matcher become inputs the generated rewriter function.

An example lowering is shown below:

```mlir
// The following high level PDL pattern:
pdl.pattern : benefit(1) {
  %resultType = pdl.type
  %inputOperand = pdl.input
  %root, %results = pdl.operation "foo.op"(%inputOperand) -> %resultType
  pdl.rewrite %root {
    pdl.replace %root with (%inputOperand)
  }
}

// is lowered to the following:
module {
  // The matcher function takes the root operation as an input.
  func @matcher(%arg0: !pdl.operation) {
    pdl_interp.check_operation_name of %arg0 is "foo.op" -> ^bb2, ^bb1
  ^bb1:
    pdl_interp.return
  ^bb2:
    pdl_interp.check_operand_count of %arg0 is 1 -> ^bb3, ^bb1
  ^bb3:
    pdl_interp.check_result_count of %arg0 is 1 -> ^bb4, ^bb1
  ^bb4:
    %0 = pdl_interp.get_operand 0 of %arg0
    pdl_interp.is_not_null %0 : !pdl.value -> ^bb5, ^bb1
  ^bb5:
    %1 = pdl_interp.get_result 0 of %arg0
    pdl_interp.is_not_null %1 : !pdl.value -> ^bb6, ^bb1
  ^bb6:
    // This operation corresponds to a successful pattern match.
    pdl_interp.record_match @rewriters::@rewriter(%0, %arg0 : !pdl.value, !pdl.operation) : benefit(1), loc([%arg0]), root("foo.op") -> ^bb1
  }
  module @rewriters {
    // The inputs to the rewriter from the matcher are passed as arguments.
    func @rewriter(%arg0: !pdl.value, %arg1: !pdl.operation) {
      pdl_interp.replace %arg1 with(%arg0)
      pdl_interp.return
    }
  }
}
```

Differential Revision: https://reviews.llvm.org/D84580
2020-10-26 18:01:06 -07:00
Duncan P. N. Exon Smith aab50af8c1 SourceManager: Use the same fake SLocEntry whenever it fails to load
Instead of putting a fake `SLocEntry` at `LoadedSLocEntryTable[Index]`
when it fails to load in `SourceManager::loadSLocEntry`, allocate a fake
one. Unless someone is sniffing the address of the returned `SLocEntry`
(doubtful), this won't be a functionality change. Note that
`SLocEntryLoaded[Index]` wasn't being set to `true` either before or
after this change so no accessor is every going to look at
`LoadedSLocEntryTable[Index]`.

As a side effect, drop the `mutable` from `LoadedSLocEntryTable`.

Differential Revision: https://reviews.llvm.org/D89748
2020-10-26 20:56:28 -04:00
Gaurav Jain 17cdba61d4 [NFC] Use [MC]Register in RegAllocPBQP & RegisterCoalescer
Differential Revision: https://reviews.llvm.org/D90008
2020-10-26 17:13:32 -07:00
Zequan Wu 779deb9750 [lldb][NativePDB] fix test load-pdb.cpp 2020-10-26 17:12:51 -07:00
Nathan James b698ad00cb
[clang][NFC] Rearrange Comment Token and Lexer fields to reduce padding
Rearrange the fields to reduce the size of the classes

Reviewed By: gribozavr2

Differential Revision: https://reviews.llvm.org/D90127
2020-10-27 00:03:43 +00:00
Richard Smith a5c7b46862 Fix checking for C++98 ICEs in C++11-and-later mode to not consider use
of a reference to be acceptable.
2020-10-26 16:59:48 -07:00
Amy Kwan 803cc3aff2 [PowerPC] Implement Set Boolean Condition Instructions
This patch implements the set boolean condition instructions introduced in
POWER10.

The set boolean condition instructions (set[n]bc[r]) are used during
the following situations:
- sign/zero/any extending i1 to an i32 or i64,
- reg+reg, reg+imm or floating point comparisons being sign/zero extended to i32 or i64,
- spilling CR bits (using the setnbc instruction)

Differential Revision: https://reviews.llvm.org/D87705
2020-10-26 18:42:51 -05:00
Vedant Kumar a77a739abc [profile] Suppress spurious 'expected profile to require unlock' warning
In %c (continuous sync) mode, avoid attempting to unlock an
already-unlocked profile.

The profile is only locked when profile merging is enabled.
2020-10-26 16:25:08 -07:00
Adrian Prantl 5b3bf8b453 [DebugInfo] Expose Fortran array debug info attributes through DIBuilder.
The support of a few debug info attributes specifically for Fortran
arrays have been added to LLVM recently, but there's no way to take
advantage of them through DIBuilder. This patch extends
DIBuilder::createArrayType to enable the settings of those attributes.

Patch by Chih-Ping Chen!

Differential Revision: https://reviews.llvm.org/D89817
2020-10-26 16:23:36 -07:00
MaheshRavishankar 78f37b74da [mlir][Linalg] Miscalleneous enhancements to cover more fusion cases.
Adds support for
- Dropping unit dimension loops for indexed_generic ops.
- Folding consecutive folding (or expanding) reshapes when the result
  (or src) is a scalar.
- Fixes to indexed_generic -> generic fusion when zero-dim tensors are
  involved.

Differential Revision: https://reviews.llvm.org/D90118
2020-10-26 16:17:24 -07:00
Rahman Lavaee 0b2f4cdf2b Explicitly check for entry basic block, rather than relying on MachineBasicBlock::pred_empty.
Sometimes in unoptimized code, we have dangling unreachable basic blocks with no predecessors. Basic block sections should be emitted for those as well. Without this patch, the included test fails with a fatal error in `AsmPrinter::emitBasicBlockEnd`.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D89423
2020-10-26 16:15:56 -07:00
Stanislav Mekhanoshin d176e13ca5 Fixed release build after D89170 2020-10-26 16:00:57 -07:00
Stephen Neuendorffer 745c1671b1 [mlir] Document 'ParentOneOf' with the HasParent trait
Differential Revision: https://reviews.llvm.org/D90197
2020-10-26 15:56:03 -07:00
Vedant Kumar 905f874c44 [cmake] Add LLVM_UBSAN_FLAGS, to allow overriding UBSan flags
Allow overriding the default set of flags used to enable UBSan when
building llvm.

This can be used to test new checks or opt out of certain checks.

Differential Revision: https://reviews.llvm.org/D89439
2020-10-26 15:48:19 -07:00
Duncan P. N. Exon Smith b2b7cf39d5 IR: Clarify ownership of ConstantDataSequentials, NFC
Change `ConstantDataSequential::Next` to a
`unique_ptr<ConstantDataSequential>` and update `CDSConstants` to a
`StringMap<unique_ptr<ConstantDataSequential>>`, making the ownership
more obvious.

Differential Revision: https://reviews.llvm.org/D90083
2020-10-26 18:47:25 -04:00
Ulysse Beaugnon db4863ffd1 [MLIR] Fix AttributeInterface declaration.
Substitues `Type` by `Attribute` in the declaration of AttributeInterface. It
looks like the code was written by copy-pasting the definition of TypeInterface,
but the substitution of Type by Attribute was missing at some places.

Reviewed By: rriddle, ftynse

Differential Revision: https://reviews.llvm.org/D90138
2020-10-26 23:41:39 +01:00
Amy Huang 515973222e [CodeView] Emit static data members as S_CONSTANTs.
We used to only emit static const data members in CodeView as
S_CONSTANTS when they were used; this patch makes it so they are always emitted.

I changed CodeViewDebug.cpp to find the static const members from the
class debug info instead of creating DIGlobalVariables in the IR
whenever a static const data member is used.

Bug: https://bugs.llvm.org/show_bug.cgi?id=47580

Differential Revision: https://reviews.llvm.org/D89072
2020-10-26 15:30:35 -07:00
Quentin Colombet 78a7941e5c [TargetRegisterInfo] Fix a couple of typos in the comments
Spotted by Nicolas Guillemot <nguillemot@apple.com>.

Thanks Nicolas!

NFC
2020-10-26 15:19:38 -07:00
Alex Zinenko 03e6f40cdb [mlir] Do not print back 0 alignment in LLVM dialect 'alloca' op
The alignment attribute in the 'alloca' op treats the '0' value as 'unset'.
When parsing the custom form of the 'alloca' op, ignore the alignment attribute
with if its value is '0' instead of actually creating it and producing a
slightly different textually yet equivalent semantically form in the output.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D90179
2020-10-26 23:19:20 +01:00
Jan Kratochvil 7611c5bb42 [nfc] [lldb] Refactor DWARFUnit::GetDIE
Reduce indentation of the code by early returns for failed code paths.
2020-10-26 23:18:19 +01:00
Puyan Lotfi 4f98eaf655 [NFC] Fixing comment heading for MachineStableHash.h.
Wrong filename and description.
2020-10-26 18:07:26 -04:00
Louis Dionne d1afe2e25c [libc++] Remove the reliance of several <random> tests on <iostream> 2020-10-26 18:02:01 -04:00
Lei Zhang f52b4a65f0 [mlir] NFC: properly align IR in comments
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D90164
2020-10-26 17:58:00 -04:00
Stanislav Mekhanoshin 038d884a50 [AMDGPU] Use flat scratch instructions where available
The support is disabled by default. So far there is instruction
selection, spilling, and frame elimination. It also changes SP
from unswizzled to swizzled as used by flat scratch instructions,
so it cannot be mixed with MUBUF stack access.

At the very least missing:

- GlobalISel;
- Some optimizations in frame elimination in between vector
  and scalar ALU;
- It shall finally allow to always materialize frame index
  as an SGPR, but that is not implemented and frame elimination
  cannot handle it yet;
- Unaligned and/or multidword flat scratch shall work, but it
  is legalized now for MUBUF;
- Operand folding cannot optimize FI like with MUBUF yet;
- It will need scaling the value of the SP/FP in the DWARF
  expression to recover the unswizzled scratch address;

Differential Revision: https://reviews.llvm.org/D89170
2020-10-26 14:40:42 -07:00
Kiran Chandramohan c551ba0e90 Run test only if X86 target is available
This fixes failures in AArch64 buildbots by running the
clang/test/CodeGen/X86/att-inline-asm-prefix.c only when the X86
target is available.
2020-10-26 21:28:59 +00:00
Sriraman Tallam ad1b9daa4b Prepend "__uniq" to symbol names hash with -funique-internal-linkage-names.
Prepend the module name hash with a fixed string ".__uniq." which helps tools
that consume sampled profiles and attribute it to functions to understand
that this symbol belongs to a unique internal linkage type symbol.

Symbols with suffixes can result from various optimizations in the compiler.
Function Multiversioning, function splitting, parameter constant propogation,
unique internal linkage names.

External tools like sampled profile aggregators combine profiles from multiple
runs of a binary. They use various heuristics with symbols that have suffixes
to try and attribute the profile to the right function instance. For instance
multi-versioned symbols like foo.avx, foo.sse4.2, etc even though different
should be attributed to the same source function if a single function is
versioned, using attribute target_clones (supported in GCC but yet to land in
LLVM). Similarly, functions that are split (split part having a .cold suffix)
could have profiles for both the original and split symbols but would be
aggregated and attributed to the original function that was split.

Unique internal linkage functions however have different source instances and
the aggregator must not put them together but attribute it to the appropriate
function instance. To be sure that we are dealing with a symbol of a unique
internal linkage function, we would like to prepend the hash with a known
string ".__uniq." which these tools can check to understand the suffix type.

Differential Revision: https://reviews.llvm.org/D89617
2020-10-26 14:24:28 -07:00
Martin Storsjö df6d2e8ab1 [libunwind] Add -Wno-dll-attribute-on-redeclaration when building for windows
It's not worth trying to fix these warnings within libunwind, instead
silence them.

Differential Revision: https://reviews.llvm.org/D90075
2020-10-26 23:23:01 +02:00
Xiangling Liao 357715ce97 [NFC] Remove max_align.c LIT testcase
Since we fixed the definition of `SuitableAlign`[https://reviews.llvm.org/D88659],
`max_align_t` and `__BIGGEST_ALIGNMENT__` are not necessarily the same always.

The original testcase was added here: https://reviews.llvm.org/D59048

Differential Revision: https://reviews.llvm.org/D90187
2020-10-26 17:14:30 -04:00
Sriraman Tallam 9aa7a721ce Test to check backtraces with machine function splitting.
clang supports option -fsplit-machine-functions and this test checks if the
backtraces are sane when functions are split.

With -fsplit-machine-functions, a function with profiles can get split into 2
parts, the original function containing hot code and a cold part as determined
by the profile info and the cold cutoff threshold.. The cold part gets the
".cold" suffix to disambiguate its symbol from the hot part and can be placed
arbitrarily in the address space.

This test checks if the back-trace looks correct when the cold part is executed.

Differential Revision: https://reviews.llvm.org/D90081
2020-10-26 14:08:42 -07:00