Commit Graph

366676 Commits

Author SHA1 Message Date
Peter Collingbourne 7bd75b6301 scudo: Add an API for disabling memory initialization per-thread.
Here "memory initialization" refers to zero- or pattern-init on
non-MTE hardware, or (where possible to avoid) memory tagging on MTE
hardware. With shared TSD the per-thread memory initialization state
is stored in bit 0 of the TLS slot, similar to PointerIntPair in LLVM.

Differential Revision: https://reviews.llvm.org/D87739
2020-09-18 12:04:27 -07:00
Simon Pilgrim 4ebd30722a [X86][AVX] lowerBuildVectorAsBroadcast - improve BROADCASTM lowering on non-VLX targets
Broadcast to a ZMM type then extract the low subvector.
2020-09-18 19:52:02 +01:00
Arthur Eubanks 2b1cb6d54a [test][TSan] Fix tests under NPM
Under NPM, the TSan passes are split into a module and function pass. A
couple tests were testing for inserted module constructors, which is
only part of the module pass.
2020-09-18 11:37:55 -07:00
Huihui Zhang 9ad6049736 [InstCombine][SVE] Skip scalable type for InstCombiner::getFlippedStrictnessPredicateAndConstant.
We cannot iterate on scalable vector, the number of elements is unknown at compile-time.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D87918
2020-09-18 11:26:36 -07:00
David Blaikie 82af17cde8 Linewrap & remove some dead typedefs from previous commit
Cleanup for 51a505340d
2020-09-18 11:22:37 -07:00
David Blaikie 51a505340d DebugInfo: Simplify line table parsing to take all the units together, rather than CUs and TUs separately 2020-09-18 11:18:23 -07:00
James Y Knight f7a53d82c0 PR47468: Fix findPHICopyInsertPoint, so that copies aren't incorrectly inserted after an INLINEASM_BR.
findPHICopyInsertPoint special cases placement in a block with a
callbr or invoke in it. In that case, we must ensure that the copy is
placed before the INLINEASM_BR or call instruction, if the register is
defined prior to that instruction, because it may jump out of the
block.

Previously, the code placed it immediately after the last def _or
use_. This is wrong, if the use is the instruction which may jump.  We
could correctly place it immediately after the last def (ignoring
uses), but that is non-optimal for register pressure.

Instead, place the copy after the last def, or before the
call/inlineasm_br, whichever is later.

Differential Revision: https://reviews.llvm.org/D87865
2020-09-18 14:14:04 -04:00
Simon Pilgrim ecba9d793e [X86][AVX] Add missing non AVX512VL broadcastm test coverage 2020-09-18 19:11:29 +01:00
Matt Arsenault 3105d0f84b CodeGen: Move split block utility to MachineBasicBlock
AMDGPU needs this in several places, so consolidate them here.
2020-09-18 14:05:18 -04:00
Matt Arsenault c8757ff3aa RegAllocFast: Rewrite and improve
This rewrites big parts of the fast register allocator. The basic
strategy of doing block-local allocation hasn't changed but I tweaked
several details:

Track register state on register units instead of physical
registers. This simplifies and speeds up handling of register aliases.
Process basic blocks in reverse order: Definitions are known to end
register livetimes when walking backwards (contrary when walking
forward then uses may or may not be a kill so we need heuristics).

Check register mask operands (calls) instead of conservatively
assuming everything is clobbered.  Enhance heuristics to detect
killing uses: In case of a small number of defs/uses check if they are
all in the same basic block and if so the last one is a killing use.
Enhance heuristic for copy-coalescing through hinting: We check the
first k defs of a register for COPYs rather than relying on there just
being a single definition.  When testing this on the full llvm
test-suite including SPEC externals I measured:

average 5.1% reduction in code size for X86, 4.9% reduction in code on
aarch64. (ranging between 0% and 20% depending on the test) 0.5%
faster compiletime (some analysis suggests the pass is slightly slower
than before, but we more than make up for it because later passes are
faster with the reduced instruction count)

Also adds a few testcases that were broken without this patch, in
particular bug 47278.

Patch mostly by Matthias Braun
2020-09-18 14:05:18 -04:00
Matt Arsenault 870fd53e4f Reapply "RegAllocFast: Record internal state based on register units"
The regressions this caused should be fixed when
https://reviews.llvm.org/D52010 is applied.

This reverts commit a21387c654.
2020-09-18 14:05:18 -04:00
Zequan Wu 91aed9bf97 [CodeGen] emit CG profile for COFF object file
I forgot to add emission of CG profile for COFF object file, when adding the support (https://reviews.llvm.org/D81775)

Differential Revision: https://reviews.llvm.org/D87811
2020-09-18 10:57:54 -07:00
Arthur Eubanks d419e34c4d [test][HWAsan] Fix kernel-inline.ll under NPM 2020-09-18 10:56:08 -07:00
David Blaikie e0802fe016 DebugInfo: Tidy up initializing multi-section contributions in DWARFContext 2020-09-18 10:54:43 -07:00
Raul Tambre a1aa330b20 [Sema] Handle objc_super special lookup when checking builtin compatibility
objc_super is special and needs LookupPredefedObjCSuperType() called before performing builtin type comparisons.
This fixes an error when compiling macOS headers. A test is added.

Differential Revision: https://reviews.llvm.org/D87917
2020-09-18 20:51:55 +03:00
Arthur Eubanks 06fe76cc4f [ASan][NewPM] Fix byref-args.ll under NPM 2020-09-18 10:50:53 -07:00
peter klausler 01def7f7c3 [flang] Rework preprocessing of stringification
Hew more closely to the C17 standard; perform macro replacement
of arguments to function-like macros unless they're being stringified
or pasted.  Test with a model "assert" macro idiom that exposed
the problem.

Differential Revision: https://reviews.llvm.org/D87650
2020-09-18 10:45:57 -07:00
Matt Arsenault 0576f436e5 AMDGPU: Don't sometimes allow instructions before lowered si_end_cf
Since 6524a7a2b9, this would sometimes
not emit the or to exec at the beginning of the block, where it really
has to be. If there is an instruction that defines one of the source
operands, split the block and turn the si_end_cf into a terminator.

This avoids regressions when regalloc fast is switched to inserting
reloads at the beginning of the block, instead of spills at the end of
the block.

In a future change, this should always split the block.
2020-09-18 13:43:01 -04:00
Amara Emerson 615695de27 [AArch64][GlobalISel] Make <8 x s8> of G_BUILD_VECTOR legal. 2020-09-18 10:32:33 -07:00
Utkarsh Saxena 9b6765e784 [clangd] Add Random Forest runtime for code completion.
Summary:
[WIP]
- Proposes a json format for representing Random Forest model.
- Proposes a way to test the generated runtime using a test model.

TODO:
- Add generated source code snippet for easier review.
- Fix unused label warning.
- Figure out required using declarations for CATEGORICAL columns from Features.json.
- Necessary Google3 internal modifications for blaze before landing.
- Add documentation for format of the model.
- Document more.

Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83814
2020-09-18 19:25:56 +02:00
Sean Silva 7c44651360 [mlir][shape] Extend shape.cstr_require with a message.
I realized when using this that one can't get very good error messages
without an additional message attribute.

Differential Revision: https://reviews.llvm.org/D87875
2020-09-18 10:21:10 -07:00
mydeveloperday a16e4a63ae [clang-format] NFC ensure the clang-format tests remain clang-formatted 2020-09-18 18:16:02 +01:00
mydeveloperday 2e7add812e [clang-format] Add a option for the position of Java static import
Some Java style guides and IDEs group Java static imports after
 non-static imports. This patch allows clang-format to control
 the location of static imports.

Patch by: @bc-lee

Reviewed By: MyDeveloperDay, JakeMerdichAMD

Differential Revision: https://reviews.llvm.org/D87201
2020-09-18 18:12:21 +01:00
JonChesterfield a9be2b5cb2 [libomptarget] Disable build of amdgpu plugin as it doesn't build with rocm. 2020-09-18 18:10:27 +01:00
Francis Visoiu Mistrih 0345d88de6 [NFC][ScheduleDAG] Remove unused EntrySU SUnit
EntrySU doesn't seem to be used at all when building the ScheduleDAG.

Differential Revision: https://reviews.llvm.org/D87867
2020-09-18 09:50:47 -07:00
Jianzhou Zhao cab6f5b2ab Use one more byte to silence a warning from Vistual C++ 2020-09-18 16:42:38 +00:00
Jon Roelofs 51c5add854 Extending Baremetal toolchain's support for the rtlib option.
Differential Revision: https://reviews.llvm.org/D87164

Patch by Manuel Carrasco!
2020-09-18 09:19:37 -07:00
Andy Ly 3c2e2df8d0 [MLIR][ODS] Add constBuilderCall for TypeArrayAttr
constBuilderCall was not defined for TypeArrayAttr, resulting in tblgen not emitting the correct code when TypeArrayAttr is used with a default valued attribute.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D87907
2020-09-18 16:14:41 +00:00
Simon Pilgrim ceadd98c2f [X86][AVX] lowerBuildVectorAsBroadcast - improve i64 BROADCASTM lowering on 32-bit targets
We already handle the the cases where we have a 'zero extended splat' build vector (a, 0, 0, 0, a, 0, 0, 0, ...) but were missing the case where the 'a' scalar was zero-extended as well - such as i64 -> vXi64 splat cases on 32-bit targets.
2020-09-18 16:59:57 +01:00
Matt Morehouse 23bab1eb43 [DFSan] Add strpbrk wrapper.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87849
2020-09-18 08:54:14 -07:00
ergawy 7b61b19275 [MLIR][SPIRV] Create new ctx for deserialization in roundtrips.
Roundtripping SPIR-V modules used the same MLIRContext object for both
ways of the trip. This resulted in deserialization using a context
object already containing Types constructed during serialization.
This commit rectifies that by creating a new MLIRContext during
deserialization.

Reviewed By: mravishankar, antiagainst

Differential Revision: https://reviews.llvm.org/D87692
2020-09-18 11:53:51 -04:00
Valentin Clement 88a1d402d6 [mlir][openacc] Add missing operands for acc.data operation
Add missing operands to represent copyin with readonly modifier, copyout with zero modifier
and create with zero modifier.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D87874
2020-09-18 11:52:24 -04:00
Valentin Clement 22dde1f92f [mlir][openacc] Support Index and AnyInteger in loop op
Following patch D87712, this patch switch AnyInteger for operands gangNum, gangStatic,
workerNum, vectoreLength and tileOperands to Index and AnyInteger.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D87848
2020-09-18 11:37:49 -04:00
Simon Pilgrim 81dce71acf [X86][AVX] Add missing i686 broadcastm test coverage 2020-09-18 16:10:24 +01:00
Simon Pilgrim d967aaa8fa [DAG] BuildVectorSDNode::getSplatValue - pull out repeated getNumOperands() calls. NFCI. 2020-09-18 16:10:23 +01:00
David Tenty 5d1f8395be [AIX] Enable large code model when building with clang 2020-09-18 11:03:22 -04:00
Adam Czachorowski c894bfd1f5 [clangd] Add option for disabling AddUsing tweak on some namespaces.
For style guides forbid "using" declarations for namespaces like "std".
With this new config option, AddUsing can be selectively disabled on
those.

Differential Revision: https://reviews.llvm.org/D87775
2020-09-18 16:46:09 +02:00
Hanhan Wang 1909b6ac0d [mlir][StandardToSPIRV] Handle vector of i1 case for lowering zexti to SPIR-V.
Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D87887
2020-09-18 07:07:22 -07:00
Sanjay Patel 3f100e64b4 [InstSimplify] fix fmin/fmax miscompile for partial undef vectors (PR47567)
It would also be correct to return the variable operand in these cases,
but eliminating a variable use is probably better for optimization.
2020-09-18 10:05:44 -04:00
Matt Arsenault 751a6c5760 IR: Move denormal mode parsing from MachineFunction to Function
This was just inspecting the IR to begin with, and is useful to check
in some places in the IR.
2020-09-18 09:55:47 -04:00
Matt Arsenault 05c02eda45 emacs: Add nofree and willreturn to list of attributes 2020-09-18 09:48:33 -04:00
Matt Arsenault 27df165270 Revert "[amdgpu] Lower SGPR-to-VGPR copy in the final phase of ISel."
This reverts commit c3492a1aa1.

I think this is the wrong strategy and wrong place to do this
transform anyway. Also reverts follow up commit
7d593d0d69.
2020-09-18 09:48:33 -04:00
Alexey Bataev 455ca0ebb6 [SLP] Allow reordering of vectorization trees with reused instructions.
If some leaves have the same instructions to be vectorized, we may
incorrectly evaluate the best order for the root node (it is built for the
vector of instructions without repeated instructions and, thus, has less
elements than the root node). In this case we just can not try to reorder
the tree + we may calculate the wrong number of nodes that requre the
same reordering.
For example, if the root node is \<a+b, a+c, a+d, f+e\>, then the leaves
are \<a, a, a, f\> and \<b, c, d, e\>. When we try to vectorize the first
leaf, it will be shrink to \<a, b\>. If instructions in this leaf should
be reordered, the best order will be \<1, 0\>. We need to extend this
order for the root node. For the root node this order should look like
\<3, 0, 1, 2\>. This patch allows extension of the orders of the nodes
with the reused instructions.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D45263
2020-09-18 09:34:59 -04:00
Mirko Brkusanin ae36c02ad0 [AMDGPU] Set DS alignment requirements to be more strict
Alignment requirements for ds_read/write_b96/b128 for gfx9 and onward are
now the same as for other GCN subtargets. This way we can avoid any
unintentional use of these instructions on systems that do not support dword
alignment and instead require natural alignment.
This also makes 'SH_MEM_CONFIG.alignment_mode == STRICT' the default.

Differential Revision: https://reviews.llvm.org/D87821
2020-09-18 15:26:24 +02:00
Sanjay Patel 6690de098e [InstSimplify] add another test for NaN propagation; NFC 2020-09-18 09:20:26 -04:00
Daniel Kiss 22b615a965 [libunwind] Support for leaf function unwinding.
Unwinding leaf function is useful in cases when the backtrace finds a
leaf function for example when it caused a signal.
This patch also add the support for the DW_CFA_undefined because it marks
the end of the frames.

Ryan Prichard provided code for the tests.

Reviewed By: #libunwind, mstorsjo

Differential Revision: https://reviews.llvm.org/D83573

Reland with limit the test to the x86_64-linux target.
2020-09-18 15:09:42 +02:00
Xing GUO 2d35092cd2 [DWARFYAML] Make the include_directories, file_names and opcodes fields of the line table optional.
This patch makes the include_directories, file_names and opcodes fields
of the line table optional. This helps us simplify some tests.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D87878
2020-09-18 20:21:11 +08:00
Xing GUO a761e81e22 [DWARFYAML][test] Use 'CHECK-NEXT:' to make checkers stricter. NFC.
This patch makes checkers stricter so that we are able to avoid
some potential problems earlier.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D87876
2020-09-18 20:19:33 +08:00
David Greene 7c8bb409f3 [UpdateCCTestChecks] Include generated functions if asked
Add the --include-generated-funcs option to update_cc_test_checks.py so that any
functions created by the compiler that don't exist in the source will also be
checked.

We need to maintain the output order of generated function checks so that
CHECK-LABEL works properly.  To do so, maintain a list of functions output for
each prefix in the order they are output.  Use this list to output checks for
generated functions in the proper order.

Differential Revision: https://reviews.llvm.org/D83004
2020-09-18 06:34:59 -05:00
Max Kazantsev 09a3737384 [Test] Missing range check removal opportunity 2020-09-18 17:55:23 +07:00