Commit Graph

270100 Commits

Author SHA1 Message Date
Craig Topper cc255bcd77 [InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions
Summary:
If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear.

Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits.

Reviewers: spatel, majnemer, davide

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36944

llvm-svn: 311343
2017-08-21 16:04:11 +00:00
Craig Topper 8078dd2984 [X86] When selecting sse_load_f32/f64 pattern, make sure there's only one use of every node all the way back to the root of the match
Summary: With masked operations, its possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions.

Reviewers: RKSimon, spatel, zvi, igorb

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36938

llvm-svn: 311342
2017-08-21 16:04:04 +00:00
Xinliang David Li d2838fc4b9 Revert 311208, 311209
llvm-svn: 311341
2017-08-21 16:00:38 +00:00
Sanjay Patel 707f786cc5 revert r311333: [LibCallSimplifier] try harder to fold memcmp with constant arguments
We're getting lots of compile-timeout bot failures like:
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux

llvm-svn: 311340
2017-08-21 15:16:25 +00:00
Sanjay Patel 0707434ce8 [InstCombine] add vector tests; NFC
llvm-svn: 311339
2017-08-21 15:11:39 +00:00
Zachary Turner d1de2f4f5e [llvm-pdbutil] Add support for dumping detailed module stats.
This adds support for dumping a summary of module symbols
and CodeView debug chunks.  This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size.  Then
at the end it prints the totals for the entire file.

Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.

llvm-svn: 311338
2017-08-21 14:53:25 +00:00
Sanjay Patel 48c67c9965 [InstCombine] regenerate test checks; NFC
llvm-svn: 311337
2017-08-21 14:34:06 +00:00
Tobias Grosser 0dd42512ff [ZoneAlgorithm] Move computeScalarReachingDefinition to c++
llvm-svn: 311336
2017-08-21 14:19:40 +00:00
James Henderson d447a7a128 [ELF] Remove dependency on hexdump from lit test
hexdump is not part of the GNU coreutils, and so is not required to be able to
build and test LLVM, according to the documentation. This change removes the
dependency on hexdump from a lit test.

Reviewers: grimar

Differential Revision: https://reviews.llvm.org/D36958

llvm-svn: 311335
2017-08-21 14:11:08 +00:00
Simon Atanasyan 29706f24e2 [mips] Remove checking of the redundant condition. NFC
llvm-svn: 311334
2017-08-21 14:08:29 +00:00
Sanjay Patel 7756edfa93 [LibCallSimplifier] try harder to fold memcmp with constant arguments
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data, 
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion 
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're 
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to 
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922

llvm-svn: 311333
2017-08-21 13:55:49 +00:00
Stefan Pintilie 9495f33e45 [PowerPC] Check if the pre-increment PHI Node already exists
Preparations to use the per-increment are sometimes done in the target
independent pass Loop Strength Reduction. We try to detect them in the PowerPC
specific pass so that they are not done twice and so that we do not add PHIs
that are not required.

Differential Revision: https://reviews.llvm.org/D36736

llvm-svn: 311332
2017-08-21 13:36:18 +00:00
Siddharth Bhat 0a198dc18a [ManagedMemoryRewrite] hide debug output behing DEBUG(...). [NFC]
llvm-svn: 311331
2017-08-21 12:51:57 +00:00
Ilya Biryukov f315000613 Fixed a crash on replaying Preamble's PP conditional stack.
Summary:
The crash occurs when the first token after a preamble is a macro
expansion.
Fixed by moving replayPreambleConditionalStack from Parser into
Preprocessor. It is now called right after the predefines file is
processed.

Reviewers: erikjv, bkramer, klimek, yvvan

Reviewed By: bkramer

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D36872

llvm-svn: 311330
2017-08-21 12:03:08 +00:00
Siddharth Bhat 7bc77e87c8 [ScopInfo] Add option to treat all function parameters as dereferencible.
Dragonegg generates most function parameters as pointers to the actual
parameters. However, it does not mark these parameters with the
dereferencable attribute.

Polly is conservative when it comes to invariant load
hoisting, thus we add runtime checks to invariant load hoisted pointers
when we do not know that pointers are dereferencable. This is correct behaviour,
but is a performance penalty.

Add a flag that allows all pointer parameters to be dereferencable. That
way, polly can speculatively load-hoist paramters to functions without
runtime checks.

Differential Revision: https://reviews.llvm.org/D36461

llvm-svn: 311329
2017-08-21 11:57:04 +00:00
Siddharth Bhat 7b9f5ca27e [PPCGCodeGeneration] Enable `polly-codegen-perf-monitoring` for PPCGCodegen.
This feature was not enabled for `PPCGCodeGeneration`. Now that this is
enabled, we can benchmark Scops that have been optimised with
`-polly-codegen-ppcg` with the `-polly-codegen-perf-monitoring` option.

Differential Revision: https://reviews.llvm.org/D36934

llvm-svn: 311328
2017-08-21 11:44:01 +00:00
Igor Breger 685889cf9b [GlobalISel][X86] Support G_BRCOND operation.
Summary: Support G_BRCOND operation. For now don't try to fold cmp/trunc instructions.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34754

llvm-svn: 311327
2017-08-21 10:51:54 +00:00
Oliver Stannard 9bd18aa7d8 [AsmParser] Recommit: Hash is not a comment on some targets
Re-committing after r311325 fixed an unintentional use of '#' comments in
clang.

The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.

Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.

Differential Revision: https://reviews.llvm.org/D36405

llvm-svn: 311326
2017-08-21 09:58:37 +00:00
Oliver Stannard 7f18864473 [ObjC] Use consistent comment style in inline asm
The comment markers accepted by the assembler vary between different targets,
but '//' is always accepted, so we should use that for consistency.

Differential revision: https://reviews.llvm.org/D36666

llvm-svn: 311325
2017-08-21 09:54:46 +00:00
Tobias Grosser b09bd74da8 [GPGPU] Add llvm.powi to the libdevice supported functions
These intrinsics are used in COSMO.

llvm-svn: 311324
2017-08-21 09:52:08 +00:00
Igor Breger 03c2208d5f [GlobalISel][X86] InstructionSelector, for now use fallback path for LOAD_STACK_GUARD and PHI nodes.
llvm-svn: 311323
2017-08-21 09:17:28 +00:00
Tobias Grosser 5170b6627a [GPGPU] Add log / logf to the libdevice supported functions
These two functions are used in COSMO

llvm-svn: 311322
2017-08-21 09:00:31 +00:00
Igor Breger 1b5e3d3e28 [GlobalISel][X86] LowerCall, for now don't handel ByValue function arguments.
llvm-svn: 311321
2017-08-21 08:59:59 +00:00
Michael Zuckerman bdb6673151 [InterLeaved] Adding lit test for future work interleaved load strid 3
llvm-svn: 311320
2017-08-21 08:56:39 +00:00
Sam Parker ffccda6303 [ARM][AArch64] Cortex-A75 and Cortex-A55 tests
Add frontend tests for Cortex-A75 and Cortex-A55, Arm's latest
big.LITTLE A-class cores. They implement the ARMv8.2-A architecture,
including the cryptography and RAS extensions, plus the optional dot
product extension. They also implement the RCpc AArch64 extension
from ARMv8.3-A.

Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75

Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55

Differential Revision: https://reviews.llvm.org/D36731

llvm-svn: 311319
2017-08-21 08:52:45 +00:00
Chandler Carruth 98c51cbee1 [x86] Teach the "generic" x86 CPU to avoid patterns that are slow on
widely used processors.

This occured to me when I saw that we were generating 'inc' and 'dec'
when for Haswell and newer we shouldn't. However, there were a few "X is
slow" things that we should probably just set.

I've avoided any of the "X is fast" features because most of those would
be pretty serious regressions on processors where X isn't actually fast.
The slow things are likely to be negligible costs on processors where
these aren't slow and a significant win when they are slow.

In retrospect this seems somewhat obvious. Not sure why we didn't do
this a long time ago.

Differential Revision: https://reviews.llvm.org/D36947

llvm-svn: 311318
2017-08-21 08:45:22 +00:00
Chandler Carruth 63dd5e0ef6 [x86] Handle more cases where we can re-use an atomic operation's flags
rather than doing a separate comparison.

This both saves an explicit comparision and avoids the use of `xadd`
which introduces register constraints and other challenges to the
generated code.

The motivating case is from atomic reference counts where `1` is the
sentinel rather than `0` for whatever reason. This can and should be
lowered efficiently on x86 by just using a different flag, however the
x86 code only handled the `0` case.

There remains some further opportunities here that are currently hidden
due to canonicalization. I've included test cases that show these and
FIXMEs. However, I don't at the moment have any production use cases and
they seem substantially harder to address.

Differential Revision: https://reviews.llvm.org/D36945

llvm-svn: 311317
2017-08-21 08:45:19 +00:00
Sam Parker b252ffd2cc [ARM][AArch64] Cortex-A75 and Cortex-A55 support
This patch introduces support for Cortex-A75 and Cortex-A55, Arm's
latest big.LITTLE A-class cores. They implement the ARMv8.2-A
architecture, including the cryptography and RAS extensions, plus
the optional dot product extension. They also implement the RCpc
AArch64 extension from ARMv8.3-A.

Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75

Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55

Differential Revision: https://reviews.llvm.org/D36667

llvm-svn: 311316
2017-08-21 08:43:06 +00:00
George Rimar f7ef2a13f6 [ELF] - Recommit "[ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions."
With fix: explicitly specify ouput format for hexdump tool call.

Original commit message:

[ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions.

Previously last 4 bytes of executable loads
were not filled with trap instructions,
patch fixes this bug.

Differential revision: https://reviews.llvm.org/D36262

llvm-svn: 311315
2017-08-21 08:31:14 +00:00
George Rimar 09a6945b48 [ELF] - Revert r311310 "[ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions."
It broke BB:
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/11792/steps/test_lld/logs/stdio

llvm-svn: 311314
2017-08-21 08:13:45 +00:00
George Rimar 8902bb8e62 [ELF] - Enable threading in many-sections.s testcase. NFC.
This is PR32942, previously threading was disabled
because slowed down this testcase a lot.
It was fixed in r311312.

llvm-svn: 311313
2017-08-21 08:10:35 +00:00
George Rimar d7305ef06c [Support/Parallel] - Do not use a task group for a very small task.
parallel_for_each_n splits a given task into small pieces of tasks and then
passes them to background threads managed by a thread pool to process them
in parallel. TaskGroup then waits for all tasks to be done, which is done by
TaskGroup's destructor.

In the previous code, all tasks were passed to background threads, and the
main thread just waited for them to finish their jobs. This patch changes
the logic so that the main thread processes a task just like other
worker threads instead of just waiting for workers.

This patch improves the performance of parallel_for_each_n for a task which
is too small that we do not split it into multiple tasks. Previously, such task
was submitted to another thread and the main thread waited for its completion.
That involves multiple inter-thread synchronization which is not cheap for
small tasks. Now, such task is processed by the main thread, so no inter-thread
communication is necessary.

Differential revision: https://reviews.llvm.org/D36607

llvm-svn: 311312
2017-08-21 08:00:54 +00:00
George Rimar 5d0ea70ad5 [ELF] - Do not segfault when doing logical and/or operations on symbols that have no output sections.
Previously we would crash on samples from testcase,
because were trying to access zero pointer to output section.

Differential revision: https://reviews.llvm.org/D36145

llvm-svn: 311311
2017-08-21 07:57:12 +00:00
George Rimar c7392cbe9a [ELF] - Do not forget to fill last bytes of PT_LOADs with trap instructions.
Previously last 4 bytes of executable loads
were not filled with trap instructions,
patch fixes this bug.

Differential revision: https://reviews.llvm.org/D36262

llvm-svn: 311310
2017-08-21 07:51:21 +00:00
Coby Tayree c54c5cbe67 [X86] Allow xacquire/xrelease prefixes
Allow those prefixes on assembly code
Differential Revision: https://reviews.llvm.org/D36845

llvm-svn: 311309
2017-08-21 07:50:15 +00:00
Craig Topper d6f4be97e6 [AVX-512] Don't change which instructions we use for unmasked subvector broadcasts when AVX512DQ is enabled.
There's no functional difference between the AVX512DQ instructions if we're not masking.

This change unifies test checks and removes extra isel entries. Similar was done for subvector insert and extracts recently.

llvm-svn: 311308
2017-08-21 05:29:02 +00:00
Craig Topper 485cca1ecb [AVX512] Add 128->256 vbroadcastf64x2/vbroadcasti64x2 instructions to the EVEX->VEX table.
llvm-svn: 311307
2017-08-21 05:03:28 +00:00
Dean Michael Berris c5caf3e9c6 [XRay][tools] Support new kinds of instrumentation map entries
Summary:
When extracting the instrumentation map from a binary, we should be able
to recognize the new kinds of instrumentation sleds we've been emitting
with the compiler using -fxray-instrument. This change adds a test for
all the kinds of sleds we currently support (sans the tail-call sled,
which is a bit harder to force in a simple prebuilt input).

Reviewers: kpw, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36819

llvm-svn: 311305
2017-08-21 00:14:06 +00:00
Chandler Carruth bd6dc14230 Revert r311077: [LV] Using VPlan ...
This causes LLVM to assert fail on PPC64 and crash / infloop in other
cases. Filed http://llvm.org/PR34248 with reproducer attached.

llvm-svn: 311304
2017-08-20 23:17:11 +00:00
Craig Topper a152903c1b [InstCombine] Add a test case for a weakness in canEvaluateZExtd. NFC
llvm-svn: 311303
2017-08-20 21:38:28 +00:00
Michael Kruse d091bf8d8e [MatMul] Make MatMul detection independent of internal isl representations.
The pattern recognition for MatMul is restrictive.

The number of "disjuncts" in the isl_map containing constraint
information was previously required to be 1
(as per isl_*_coalesce - which should ideally produce a domain map with
a single disjunct, but does not under some circumstances).

This was changed and made more flexible.

Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in>

Differential Revision: https://reviews.llvm.org/D36460

llvm-svn: 311302
2017-08-20 21:31:11 +00:00
Johannes Altmanninger d6491f2c4a Allow thiscall attribute in test/Tooling/clang-diff-ast.cpp
llvm-svn: 311301
2017-08-20 20:13:33 +00:00
Craig Topper d63b33f9c4 [AVX512] Add a test to check what happens when a load is referenced by two different masked scalar intrinsics with the same op inputs, but different masking node.
We're missing some single use checks in the sse_load_f32/f64 handling that cause us to replicate the load.

llvm-svn: 311300
2017-08-20 19:47:00 +00:00
Kuba Mracek 6734671dda Fix archive-update.test after r311296.
llvm-svn: 311299
2017-08-20 18:31:30 +00:00
Kuba Mracek b17fd11e09 Remove "%T" from ASan Darwin tests.
llvm-svn: 311298
2017-08-20 18:31:00 +00:00
Craig Topper 702097dafc [AVX-512] Use a scalar load pattern for FPCLASSSS/FPCLASSSD patterns.
llvm-svn: 311297
2017-08-20 18:30:24 +00:00
Kuba Mracek 2c0bca49b1 Remove uses of "%T" from test/Object/archive-* tests.
llvm-svn: 311296
2017-08-20 18:18:44 +00:00
Benjamin Kramer 806ae44012 [NVPTX] Reduce copypasta.
No functionality change intended.

llvm-svn: 311295
2017-08-20 17:30:32 +00:00
Kuba Mracek d3f3fae32d Get rid of even more "%T" expansions, see <https://reviews.llvm.org/D35396>.
llvm-svn: 311294
2017-08-20 17:05:22 +00:00
Kuba Mracek 5c393d2565 Get rid of some more "%T" expansions, see <https://reviews.llvm.org/D35396>.
llvm-svn: 311293
2017-08-20 17:00:08 +00:00