Commit Graph

153205 Commits

Author SHA1 Message Date
Zachary Turner d1de2f4f5e [llvm-pdbutil] Add support for dumping detailed module stats.
This adds support for dumping a summary of module symbols
and CodeView debug chunks.  This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size.  Then
at the end it prints the totals for the entire file.

Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.

llvm-svn: 311338
2017-08-21 14:53:25 +00:00
Sanjay Patel 48c67c9965 [InstCombine] regenerate test checks; NFC
llvm-svn: 311337
2017-08-21 14:34:06 +00:00
Sanjay Patel 7756edfa93 [LibCallSimplifier] try harder to fold memcmp with constant arguments
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data, 
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion 
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're 
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to 
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922

llvm-svn: 311333
2017-08-21 13:55:49 +00:00
Stefan Pintilie 9495f33e45 [PowerPC] Check if the pre-increment PHI Node already exists
Preparations to use the per-increment are sometimes done in the target
independent pass Loop Strength Reduction. We try to detect them in the PowerPC
specific pass so that they are not done twice and so that we do not add PHIs
that are not required.

Differential Revision: https://reviews.llvm.org/D36736

llvm-svn: 311332
2017-08-21 13:36:18 +00:00
Igor Breger 685889cf9b [GlobalISel][X86] Support G_BRCOND operation.
Summary: Support G_BRCOND operation. For now don't try to fold cmp/trunc instructions.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34754

llvm-svn: 311327
2017-08-21 10:51:54 +00:00
Oliver Stannard 9bd18aa7d8 [AsmParser] Recommit: Hash is not a comment on some targets
Re-committing after r311325 fixed an unintentional use of '#' comments in
clang.

The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.

Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.

Differential Revision: https://reviews.llvm.org/D36405

llvm-svn: 311326
2017-08-21 09:58:37 +00:00
Igor Breger 03c2208d5f [GlobalISel][X86] InstructionSelector, for now use fallback path for LOAD_STACK_GUARD and PHI nodes.
llvm-svn: 311323
2017-08-21 09:17:28 +00:00
Igor Breger 1b5e3d3e28 [GlobalISel][X86] LowerCall, for now don't handel ByValue function arguments.
llvm-svn: 311321
2017-08-21 08:59:59 +00:00
Michael Zuckerman bdb6673151 [InterLeaved] Adding lit test for future work interleaved load strid 3
llvm-svn: 311320
2017-08-21 08:56:39 +00:00
Chandler Carruth 98c51cbee1 [x86] Teach the "generic" x86 CPU to avoid patterns that are slow on
widely used processors.

This occured to me when I saw that we were generating 'inc' and 'dec'
when for Haswell and newer we shouldn't. However, there were a few "X is
slow" things that we should probably just set.

I've avoided any of the "X is fast" features because most of those would
be pretty serious regressions on processors where X isn't actually fast.
The slow things are likely to be negligible costs on processors where
these aren't slow and a significant win when they are slow.

In retrospect this seems somewhat obvious. Not sure why we didn't do
this a long time ago.

Differential Revision: https://reviews.llvm.org/D36947

llvm-svn: 311318
2017-08-21 08:45:22 +00:00
Chandler Carruth 63dd5e0ef6 [x86] Handle more cases where we can re-use an atomic operation's flags
rather than doing a separate comparison.

This both saves an explicit comparision and avoids the use of `xadd`
which introduces register constraints and other challenges to the
generated code.

The motivating case is from atomic reference counts where `1` is the
sentinel rather than `0` for whatever reason. This can and should be
lowered efficiently on x86 by just using a different flag, however the
x86 code only handled the `0` case.

There remains some further opportunities here that are currently hidden
due to canonicalization. I've included test cases that show these and
FIXMEs. However, I don't at the moment have any production use cases and
they seem substantially harder to address.

Differential Revision: https://reviews.llvm.org/D36945

llvm-svn: 311317
2017-08-21 08:45:19 +00:00
Sam Parker b252ffd2cc [ARM][AArch64] Cortex-A75 and Cortex-A55 support
This patch introduces support for Cortex-A75 and Cortex-A55, Arm's
latest big.LITTLE A-class cores. They implement the ARMv8.2-A
architecture, including the cryptography and RAS extensions, plus
the optional dot product extension. They also implement the RCpc
AArch64 extension from ARMv8.3-A.

Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75

Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55

Differential Revision: https://reviews.llvm.org/D36667

llvm-svn: 311316
2017-08-21 08:43:06 +00:00
George Rimar d7305ef06c [Support/Parallel] - Do not use a task group for a very small task.
parallel_for_each_n splits a given task into small pieces of tasks and then
passes them to background threads managed by a thread pool to process them
in parallel. TaskGroup then waits for all tasks to be done, which is done by
TaskGroup's destructor.

In the previous code, all tasks were passed to background threads, and the
main thread just waited for them to finish their jobs. This patch changes
the logic so that the main thread processes a task just like other
worker threads instead of just waiting for workers.

This patch improves the performance of parallel_for_each_n for a task which
is too small that we do not split it into multiple tasks. Previously, such task
was submitted to another thread and the main thread waited for its completion.
That involves multiple inter-thread synchronization which is not cheap for
small tasks. Now, such task is processed by the main thread, so no inter-thread
communication is necessary.

Differential revision: https://reviews.llvm.org/D36607

llvm-svn: 311312
2017-08-21 08:00:54 +00:00
Coby Tayree c54c5cbe67 [X86] Allow xacquire/xrelease prefixes
Allow those prefixes on assembly code
Differential Revision: https://reviews.llvm.org/D36845

llvm-svn: 311309
2017-08-21 07:50:15 +00:00
Craig Topper d6f4be97e6 [AVX-512] Don't change which instructions we use for unmasked subvector broadcasts when AVX512DQ is enabled.
There's no functional difference between the AVX512DQ instructions if we're not masking.

This change unifies test checks and removes extra isel entries. Similar was done for subvector insert and extracts recently.

llvm-svn: 311308
2017-08-21 05:29:02 +00:00
Craig Topper 485cca1ecb [AVX512] Add 128->256 vbroadcastf64x2/vbroadcasti64x2 instructions to the EVEX->VEX table.
llvm-svn: 311307
2017-08-21 05:03:28 +00:00
Dean Michael Berris c5caf3e9c6 [XRay][tools] Support new kinds of instrumentation map entries
Summary:
When extracting the instrumentation map from a binary, we should be able
to recognize the new kinds of instrumentation sleds we've been emitting
with the compiler using -fxray-instrument. This change adds a test for
all the kinds of sleds we currently support (sans the tail-call sled,
which is a bit harder to force in a simple prebuilt input).

Reviewers: kpw, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36819

llvm-svn: 311305
2017-08-21 00:14:06 +00:00
Chandler Carruth bd6dc14230 Revert r311077: [LV] Using VPlan ...
This causes LLVM to assert fail on PPC64 and crash / infloop in other
cases. Filed http://llvm.org/PR34248 with reproducer attached.

llvm-svn: 311304
2017-08-20 23:17:11 +00:00
Craig Topper a152903c1b [InstCombine] Add a test case for a weakness in canEvaluateZExtd. NFC
llvm-svn: 311303
2017-08-20 21:38:28 +00:00
Craig Topper d63b33f9c4 [AVX512] Add a test to check what happens when a load is referenced by two different masked scalar intrinsics with the same op inputs, but different masking node.
We're missing some single use checks in the sse_load_f32/f64 handling that cause us to replicate the load.

llvm-svn: 311300
2017-08-20 19:47:00 +00:00
Kuba Mracek 6734671dda Fix archive-update.test after r311296.
llvm-svn: 311299
2017-08-20 18:31:30 +00:00
Craig Topper 702097dafc [AVX-512] Use a scalar load pattern for FPCLASSSS/FPCLASSSD patterns.
llvm-svn: 311297
2017-08-20 18:30:24 +00:00
Kuba Mracek 2c0bca49b1 Remove uses of "%T" from test/Object/archive-* tests.
llvm-svn: 311296
2017-08-20 18:18:44 +00:00
Benjamin Kramer 806ae44012 [NVPTX] Reduce copypasta.
No functionality change intended.

llvm-svn: 311295
2017-08-20 17:30:32 +00:00
Kuba Mracek d3f3fae32d Get rid of even more "%T" expansions, see <https://reviews.llvm.org/D35396>.
llvm-svn: 311294
2017-08-20 17:05:22 +00:00
Kuba Mracek 5c393d2565 Get rid of some more "%T" expansions, see <https://reviews.llvm.org/D35396>.
llvm-svn: 311293
2017-08-20 17:00:08 +00:00
Benjamin Kramer 760e00b0bc [MachO] Use Twines more efficiently.
llvm-svn: 311291
2017-08-20 15:13:39 +00:00
Benjamin Kramer 2e5be849cc [Mem2Reg] Modernize code a bit.
No functionality change intended.

llvm-svn: 311290
2017-08-20 14:34:44 +00:00
Benjamin Kramer 49a49fe816 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Benjamin Kramer df8c2628ac [dlltool] Make memory buffer ownership less weird.
There's no reason to destroy them in a global destructor.

llvm-svn: 311287
2017-08-20 13:03:32 +00:00
Elena Demikhovsky f58f838495 Changed basic cost of store operation on X86
Store operation takes 2 UOps on X86 processors. The exact cost calculation affects several optimization passes including loop unroling.
This change compensates performance degradation caused by https://reviews.llvm.org/D34458 and shows improvements on some benchmarks.

Differential Revision: https://reviews.llvm.org/D35888

llvm-svn: 311285
2017-08-20 12:34:29 +00:00
Aditya Kumar a525fffd07 [Loop Vectorize] Added a separate metadata
Added a separate metadata to indicate when the loop
has already been vectorized instead of setting width and count to 1.

Patch written by Divya Shanmughan and Aditya Kumar

Differential Revision: https://reviews.llvm.org/D36220

llvm-svn: 311281
2017-08-20 10:32:41 +00:00
Igor Breger 88a3d5c855 [GlobalISel][X86] Support call ABI.
Summary: Support call ABI. For now only Linux C and X86_64_SysV calling conventions supported. Variadic function not supported.

Reviewers: zvi, guyblank, oren_ben_simhon

Reviewed By: oren_ben_simhon

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34602

llvm-svn: 311279
2017-08-20 09:25:22 +00:00
Igor Breger b3a860a5e8 [GlobalISel][X86] Support asimetric copy from/to GPR physical register.
Usually this case generated by ABI lowering, it requare to performe trancate/anyext.

llvm-svn: 311278
2017-08-20 07:14:40 +00:00
Alex Bradbury 21c8adf50b [RISCV] Trivial whitespace fix in RISCVInstPrinter
llvm-svn: 311277
2017-08-20 06:58:43 +00:00
Alex Bradbury e45186d43f [RISCV] Fix two abuses of llvm_unreachable
Replace with report_fatal_error.

llvm-svn: 311276
2017-08-20 06:57:27 +00:00
Alex Bradbury dd83484ab4 [RISCV] Set HasRelocationAddend for RISCVELFObjectWriter
llvm-svn: 311275
2017-08-20 06:55:14 +00:00
Sam Elliott 7fe0aaa140 Revert "Emit only A Single Opt Remark When Inlining"
Reverting due to clang build failure

llvm-svn: 311274
2017-08-20 06:55:10 +00:00
Sam Elliott 785dd75369 Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

llvm-svn: 311273
2017-08-20 06:43:34 +00:00
Igor Breger d88dfd32f8 [GlobalIsel] Fix undefined behavior if Action not set (release), it aslo crashing in debug mode.
Differential Revision: https://reviews.llvm.org/D34978

llvm-svn: 311272
2017-08-20 06:26:22 +00:00
Sam Elliott b0c9753691 Keep Optimization Remark Yaml in NewPM
Summary:
The New Pass Manager infrastructure was forgetting to keep around the optimization remark yaml file that the compiler might have been producing. This meant setting the option to '-' for stdout worked, but setting it to a filename didn't give file output (presumably it was deleted because compilation didn't explicitly keep it). This change just ensures that the file is kept if compilation succeeds.

So far I have updated one of the optimization remark output tests to add a version with the new pass manager. It is my intention for this patch to also include changes to all tests that use `-opt-remark-output=` but I wanted to get the code patch ready for review while I was making all those changes.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33951

Reviewers: anemet, chandlerc

Reviewed By: anemet, chandlerc

Subscribers: javed.absar, chandlerc, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D36906

llvm-svn: 311271
2017-08-20 01:30:45 +00:00
Chandler Carruth 9ef881efab [x86] Fix an even stranger corner case where we have multiple levels of
cmov self-refrencing.

Pointed out by Amjad Aboud in code review, test case minorly simplified
from the one he posted.

llvm-svn: 311267
2017-08-19 23:35:50 +00:00
Craig Topper 4de6f583da [X86] Merge all of the vecload and alignedload predicates into single predicates.
We can load the memory VT and check for natural alignment. This also adds a new preferNonTemporalLoad helper that checks the correct subtarget feature based on the load size.

This shrinks the isel table by at least 5000 bytes by allowing more reordering and combining to occur.

llvm-svn: 311266
2017-08-19 23:21:22 +00:00
Craig Topper afa69eecbb [X86] Converge alignedstore/alignedstore256/alignedstore512 to a single predicate.
We can read the memoryVT and get its store size directly from the SDNode to check its alignment.

llvm-svn: 311265
2017-08-19 23:21:21 +00:00
Craig Topper a0319bb434 [AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation.
llvm-svn: 311263
2017-08-19 22:02:02 +00:00
Victor Leschuk ee3292c5e4 Set init value for ScalarEvolution::BackedgeTakenInfo::MaxOrZero
Otherwise it can be used uninitialized in move ctor.

llvm-svn: 311262
2017-08-19 21:05:08 +00:00
Martin Storsjo d606d2a643 [ARM] Factorize the calculation of WhichResult in isV*Mask. NFC.
Differential Revision: https://reviews.llvm.org/D36930

llvm-svn: 311260
2017-08-19 20:26:51 +00:00
Martin Storsjo 91522ffa12 [ARM] Check the right order for halves of VZIP/VUZP if both parts are used
This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.

This fixes PR33921.

Differential Revision: https://reviews.llvm.org/D36899

llvm-svn: 311258
2017-08-19 19:47:48 +00:00
Teresa Johnson b225ad05af Fix bot failures by requiring x86 target
The tests added in r311254 require a target triple since they are
running through code generation. Fix bot failures by requiring
an x86 target.

llvm-svn: 311257
2017-08-19 19:15:04 +00:00
Konstantin Zhuravlyov 89377c440c AMDGPU/NFC: Reorder functions in SIMemoryLegalizer:
- Move *load* functions before *atomic* functions
  - Move *store* functions before *atomic* functions

llvm-svn: 311256
2017-08-19 18:44:27 +00:00