This reverts commit b86c16ad8c97dadc1f529da72a5bb74e9eaed344.
This is being reverted because I forgot to write a useful
commit message, so I'm going to resubmit it with an actual
commit message.
llvm-svn: 344358
if the function has globalized variables and called in context of
target/teams/distribute regions, it does not need to globalize 32
copies of the same variables for memory coalescing, it is enough to
have just one copy, because there is parallel region.
Patch does this by adding call for `__kmpc_parallel_level` function and
checking its return value. If the code sees that the parallel level is
0, then only one variable is allocated, not 32.
llvm-svn: 344356
Generalize SelectionDAGLegalize's CTLZ expansion to handle vectors - lets VectorLegalizer::ExpandCTLZ to just pass the expansion on instead of repeating the same codegen.
llvm-svn: 344349
Pull out repeated byte sum stage for popcount of vector elements > 8bits.
This allows us to simplify the LUT/BITMATH popcnt code to always assume vXi8 vectors, and also improves avx512bitalg codegen which only has access to vpopcntb/vpopcntw.
llvm-svn: 344348
The current BitPermutationSelector generates a code to build a value by tracking two types of bits: ConstZero and Variable.
ConstZero means a bit we need to mask off and Variable is a bit we copy from an input value.
This patch add third type of bits VariableKnownToBeZero caused by AssertZext node or zero-extending load node.
VariableKnownToBeZero means a bit comes from an input value, but it is known to be already zero. So we do not need to mask them.
VariableKnownToBeZero enhances flexibility to group bits, since we can avoid redundant masking for these bits.
This patch also renames "HasZero" to "NeedMask" since now we may skip masking even when we have zeros (of type VariableKnownToBeZero).
Differential Revision: https://reviews.llvm.org/D48025
llvm-svn: 344347
Summary:
Otherwise, at least on Mac, the linker eliminates unused symbols which
causes libFuzzer to error out due to a mismatch of the sizes of coverage tables.
Issue in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=892167
Reviewers: morehouse, kcc, george.karpenkov
Reviewed By: morehouse
Subscribers: kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D53113
llvm-svn: 344345
New option added to these three checks to be able to silence false positives on
types that are intentionally passed by value or copied. Such types are e.g.
intrusive reference counting pointer types like llvm::IntrusiveRefCntPtr. The
new option is named WhiteListTypes and can contain a semicolon-separated list of
names of these types. Regular expressions are allowed. Default is empty.
Differential Revision: https://reviews.llvm.org/D52727
llvm-svn: 344340
The documentation stated "Access to explicit operands of the
instruction." This is misleading, as it also lists implicit operands.
Patch by Philip Ginsbach.
Differential Revision: https://reviews.llvm.org/D35481
llvm-svn: 344338
Fixes PR32160 by reducing the size of PSHUFB if we only use one of the lanes.
This approach can probably be generalized to handle any target shuffle (and any subvector index) but we have no test coverage at the moment.
llvm-svn: 344336
This patch adds the ability to identify instructions that are "move elimination
candidates". It also allows scheduling models to describe processor register
files that allow move elimination.
A move elimination candidate is an instruction that can be eliminated at
register renaming stage.
Each subtarget can specify which instructions are move elimination candidates
with the help of tablegen class "IsOptimizableRegisterMove" (see
llvm/Target/TargetInstrPredicate.td).
For example, on X86, BtVer2 allows both GPR and MMX/SSE moves to be eliminated.
The definition of 'IsOptimizableRegisterMove' for BtVer2 looks like this:
```
def : IsOptimizableRegisterMove<[
InstructionEquivalenceClass<[
// GPR variants.
MOV32rr, MOV64rr,
// MMX variants.
MMX_MOVQ64rr,
// SSE variants.
MOVAPSrr, MOVUPSrr,
MOVAPDrr, MOVUPDrr,
MOVDQArr, MOVDQUrr,
// AVX variants.
VMOVAPSrr, VMOVUPSrr,
VMOVAPDrr, VMOVUPDrr,
VMOVDQArr, VMOVDQUrr
], CheckNot<CheckSameRegOperand<0, 1>> >
]>;
```
Definitions of IsOptimizableRegisterMove from processor models of a same
Target are processed by the SubtargetEmitter to auto-generate a target-specific
override for each of the following predicate methods:
```
bool TargetSubtargetInfo::isOptimizableRegisterMove(const MachineInstr *MI)
const;
bool MCInstrAnalysis::isOptimizableRegisterMove(const MCInst &MI, unsigned
CPUID) const;
```
By default, those methods return false (i.e. conservatively assume that there
are no move elimination candidates).
Tablegen class RegisterFile has been extended with the following information:
- The set of register classes that allow move elimination.
- Maxium number of moves that can be eliminated every cycle.
- Whether move elimination is restricted to moves from registers that are
known to be zero.
This patch is structured in three part:
A first part (which is mostly boilerplate) adds the new
'isOptimizableRegisterMove' target hooks, and extends existing register file
descriptors in MC by introducing new fields to describe properties related to
move elimination.
A second part, uses the new tablegen constructs to describe move elimination in
the BtVer2 scheduling model.
A third part, teaches llm-mca how to query the new 'isOptimizableRegisterMove'
hook to mark instructions that are candidates for move elimination. It also
teaches class RegisterFile how to describe constraints on move elimination at
PRF granularity.
llvm-mca tests for btver2 show differences before/after this patch.
Differential Revision: https://reviews.llvm.org/D53134
llvm-svn: 344334
This is a follow-up patch to r342541. After further investigations, only
48bits VMA size can be supported. As this is enforced in function
InitializePlatformEarly from lib/rt1/tsan_platform_linux.cc, the access
to the global variable vmaSize variable + switch can be removed. This
also addresses a comment from https://reviews.llvm.org/D52167.
vmaSize of 39 or 42bits are not compatible with a Go program memory
layout as the Go heap will not fit in the shadow memory area.
Patch by: Fangming Fang <Fangming.Fang@arm.com>
llvm-svn: 344329
LLDB does not support this DWARF5 form atm.
At least gcc emits it in some cases when doing optimization
for abbreviations.
As far I can tell, clang does not support it yet, though
the rest LLVM code already knows about it.
The patch adds the support.
Differential revision: https://reviews.llvm.org/D52689
llvm-svn: 344328
Failure was discovered upon running
projects/compiler-rt/test/builtins/Unit/divtc3_test.c
in a stage2 compiler build.
When compiling projects/compiler-rt/lib/builtins/divtc3.c,
a call to fmaxl within the divtc3 implementation had its
return values read from registers $2 and $3 instead of $f0 and $f2.
Include fmaxl in the list of long double emulation routines
to have its return value correctly interpreted as f128.
Almost exact issue here: https://reviews.llvm.org/D17760
Differential Revision: https://reviews.llvm.org/D52649
llvm-svn: 344326
`config.asan_dynamic` should actually be `True` because dylibs are the
only supported form of the ASan runtime on Apple platforms.
Reviewers: kubamracek, george.karpenkov, samsonov
Subscribers: srhines, mgorny, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D53183
llvm-svn: 344324
Summary:
This change adds support for the GNU --target flag, which sets both --input-target and --output-target.
GNU objcopy doesn't do any checking for whether both --target and --{input,output}-target are used, and so it allows both, e.g. "--target A --output-target B" is equivalent to "--input-target A --output-target B" since the later command line flag would override earlier ones. This may be error prone, so I chose to implement it as an error if both are used. I'm not sure if anyone is actually using both.
Reviewers: jakehehrlich, jhenderson, alexshap
Reviewed By: jakehehrlich, alexshap
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53029
llvm-svn: 344321
I want to add another pattern here that includes scalar_to_vector,
so this makes that patch smaller. I was hoping to remove the
hasOneUse() check because it shouldn't be necessary for common
codegen, but an AMDGPU test has a comment suggesting that the
extra check makes things better on one of those targets.
llvm-svn: 344320
It originally triggered a stepping problem in the debugger, which could
be fixed by adjusting CodeGen/LexicalScopes.cpp however it seems we prefer
the previous behavior anyway.
See the discussion for details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181008/593833.html
This reverts commit r343880.
This reverts commit r343874.
llvm-svn: 344318
DIV/REM by constants should always be expanded into mul/shift/etc.
patterns. Unfortunately the ConstantHoisting pass runs too early at a
point where the pattern isn't expanded yet. However after
ConstantHoisting hoisted some immediate the result may not expand
anymore. Also the hoisting typically doesn't make sense because it
operates on immediates that will change completely during the expansion.
Report DIV/REM as TCC_Free so ConstantHoisting will not touch them.
Differential Revision: https://reviews.llvm.org/D53174
llvm-svn: 344315
Summary:
We have two copies of createPrivateGlobalForString (in asan and in esan).
This change merges them into one. NFC
Reviewers: vitalybuka
Reviewed By: vitalybuka
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53178
llvm-svn: 344314
For now, tresting the cast as a no-op, and disregarding the case where
the output becomes null due to the type mismatch.
rdar://45174557
Differential Revision: https://reviews.llvm.org/D53156
llvm-svn: 344311
Summary:
Instruction with 0 in fence field being disassembled as fence , iorw.
Printing "unknown" to match GAS behavior.
This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer
for the RISC-V assembly language.
Reviewers: asb
Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, jfb, PkmX, jocewei, asb
Differential Revision: https://reviews.llvm.org/D51828
llvm-svn: 344309