Commit Graph

152599 Commits

Author SHA1 Message Date
Alina Sbirlea 967e7966fc Allow None as a MemoryLocation to getModRefInfo
Summary:
Adding part of the changes in D30369 (needed to make progress):
Current patch updates AliasAnalysis and MemoryLocation, but does _not_ clean up MemorySSA.

Original summary from D30369, by dberlin:
Currently, we have instructions which affect memory but have no memory
location. If you call, for example, MemoryLocation::get on a fence,
it asserts. This means things specifically have to avoid that. It
also means we end up with a copy of each API, one taking a memory
location, one not.

This starts to fix that.

We add MemoryLocation::getOrNone as a new call, and reimplement the
old asserting version in terms of it.

We make MemoryLocation optional in the (Instruction, MemoryLocation)
version of getModRefInfo, and kill the old one argument version in
favor of passing None (it had one caller). Now both can handle fences
because you can just use MemoryLocation::getOrNone on an instruction
and it will return a correct answer.

We use all this to clean up part of MemorySSA that had to handle this difference.

Note that literally every actual getModRefInfo interface we have could be made private and replaced with:

getModRefInfo(Instruction, Optional<MemoryLocation>)
and
getModRefInfo(Instruction, Optional<MemoryLocation>, Instruction, Optional<MemoryLocation>)

and delegating to the right ones, if we wanted to.

I have not attempted to do this yet.

Reviewers: dberlin, davide, dblaikie

Subscribers: sanjoy, hfinkel, chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D35441

llvm-svn: 309641
2017-08-01 00:28:29 +00:00
Craig Topper 410d252f5b [AVX-512] Add unmasked subvector inserts and extract to the execution domain tables.
llvm-svn: 309632
2017-07-31 22:07:29 +00:00
David Blaikie 038e28a5a7 DebugInfo: Put range base specifier entry functionality behind a flag
Chromium's gold build seems to have trouble with this (gold produces
errors) - not sure if it's gold that's not coping with the valid
representation, or a bug in the implementation in LLVM, etc.

llvm-svn: 309630
2017-07-31 21:48:42 +00:00
Craig Topper 7dd26785e7 [AVX512] Add a common prefix to avx512-insert-extract.ll so we can reduce the number of check lines on some test cases.
This was pointed out during the review for D313804.

llvm-svn: 309629
2017-07-31 21:20:06 +00:00
Reid Kleckner 2de471dca9 [codeview] Ignore DBG_VALUEs when choosing a BB start source loc
When the first instruction of a basic block has no location (consider a
LEA materializing the address of an alloca for a call), we want to start
the line table for the block with the first valid source location in the
block.  We need to ignore DBG_VALUE instructions during this scan to get
decent line tables.

llvm-svn: 309628
2017-07-31 21:03:08 +00:00
Sanjay Patel dac0ab272c [InstCombine] allow mask hoisting transform for vector types
llvm-svn: 309627
2017-07-31 21:01:53 +00:00
Craig Topper 043c44efff [AVX-512] Use AVX512 as test check prefix instead of AVX3. NFC
llvm-svn: 309625
2017-07-31 20:58:06 +00:00
Adrian Prantl 56619fa7e9 Debug Info: Also check the DWARF output in assembler-only test cases
This will prevent me from introducing a regression in my next commit.

llvm-svn: 309623
2017-07-31 20:48:52 +00:00
Peter Collingbourne bcd204b478 Update phi nodes in LowerTypeTests control flow simplification
D33925 added a control flow simplification for -O2 --lto-O0 builds that
manually splits blocks and reassigns conditional branches but does not
correctly update phi nodes. If the else case being branched to had
incoming phi nodes the control-flow simplification would leave phi nodes
in that BB with an unhandled predecessor.

Patch by Vlad Tsyrklevich!

Differential Revision: https://reviews.llvm.org/D36012

llvm-svn: 309621
2017-07-31 20:43:07 +00:00
Kostya Serebryany b2a1eba2f5 [libFuzzer] implement __sanitizer_cov_pcs_init and add pc-table to build flags for one test (for now)
llvm-svn: 309615
2017-07-31 20:20:59 +00:00
Konstantin Belochapka b77d0a5cf1 [X86][MMX] Added custom lowering action for MMX SELECT (PR30418)
Fix for pr30418 - error in backend: Cannot select: t17: x86mmx = select_cc t2, Constant:i64<0>, t7, t8, seteq:ch
Differential Revision: https://reviews.llvm.org/D34661

llvm-svn: 309614
2017-07-31 20:11:49 +00:00
Sanjay Patel 904801597e [InstCombine] add tests for mask hoisting; NFC
The scalar transforms exist with no test coverage. The vector equivalents are missing.

llvm-svn: 309612
2017-07-31 20:02:04 +00:00
Kostya Serebryany bfc83fa8d7 [sanitizer-coverage] don't instrument available_externally functions
llvm-svn: 309611
2017-07-31 20:00:22 +00:00
Kostya Serebryany bb6f079a45 [sanitizer-coverage] ensure minimal alignment for coverage counters and guards
llvm-svn: 309610
2017-07-31 19:49:45 +00:00
Zachary Turner 8d927b6bf9 [lld/pdb] Add an empty globals stream.
We don't write any actual symbols to this stream yet, but for
now we just create the stream and hook it up to the appropriate
places and give it a valid header.

Differential Revision: https://reviews.llvm.org/D35290

llvm-svn: 309608
2017-07-31 19:36:08 +00:00
Davide Italiano e4c2782cba [SLPVectorizer] Unbreak the build with -Werror.
GCC was complaining about `&&` within `||` without explicit
parentheses. NFCI.

llvm-svn: 309606
2017-07-31 19:14:19 +00:00
Craig Topper 317a51e886 [X86][InstCombine] Add some simplifications for BZHI intrinsics
This intrinsic clears the upper bits starting at a specified index. If the index is a constant we can do some simplifications.

This could be in InstSimplify, but we don't handle any target specific intrinsics there today.

Differential Revision: https://reviews.llvm.org/D36069

llvm-svn: 309604
2017-07-31 18:52:15 +00:00
Craig Topper 8324003818 [X86][InstCombine] Add basic simplification support for BEXTR/BEXTRI intrinsics.
This patch adds simplification support for the BEXTR/BEXTRI intrinsics to match gcc. This only supports cases that fold to 0 or can be fully constant folded. Theoretically we could support converting to AND if the shift part is unused or to only a shift if the mask doesn't modify any bits after an equivalent shl. gcc doesn't do these transformations either.

I put this in InstCombine, but it could be done in InstSimplify. It would be the first target specific intrinsic in InstSimplify.

Differential Revision: https://reviews.llvm.org/D36063

llvm-svn: 309603
2017-07-31 18:52:13 +00:00
Reid Kleckner 1b4e9ae384 [lit] Avoid copying llvm/utils/lit/tests/Inputs with lit site configs
Summary:
This is an alternative solution to running the lit test suite on bots
without polluting the source directory. Each input test suite gets an
auto-generated site config in the build directory that points back to
the test input source directory.

This adds some cmake comlexity, but now we don't need to remove and
re-copy the test input directory before every test.

Reviewers: delcypher, modocache

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D36026

llvm-svn: 309602
2017-07-31 18:45:44 +00:00
Quentin Colombet f22c578d67 [llc][NFC] Update message in assert.
llvm-svn: 309600
2017-07-31 18:31:04 +00:00
Quentin Colombet 15f6ffbf7c [TargetPassConfig] Feature generic options to setup start/stop-after/before
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.

In particular, just invoking addPassesToEmitFile will set the proper
pipeline without additional effort (modulo parsing a .mir file if the
start-before/after options are used.

NFC.

Differential Revision: https://reviews.llvm.org/D30913

llvm-svn: 309599
2017-07-31 18:24:07 +00:00
Sanjay Patel fea731a4aa [CGP] use subtract or subtract-of-cmps for result of memcmp expansion
As noted in the code comment, transforming this in the other direction might require 
a separate transform here in CGP given the block-at-a-time DAG constraint.

Besides that theoretical motivation, there are 2 practical motivations for the 
subtract-of-cmps form:

1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). 
   There is discussion about canonicalizing IR to the select form 
   ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), 
   so we probably need to add DAG transforms for those patterns anyway, but this improves the 
   memcmp output without waiting for that step.

2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert
   that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided
   if we choose to enable that.

Differential Revision: https://reviews.llvm.org/D34904

llvm-svn: 309597
2017-07-31 18:08:24 +00:00
Spyridoula Gravani 70d35e102e [DWARF] Added verification check for tags in accelerator tables. This patch verifies that the atom tag is actually the same with the tag of the DIE that we retrieve from the table.
Differential Revision: https://reviews.llvm.org/D35963

llvm-svn: 309596
2017-07-31 18:01:16 +00:00
David Majnemer 91c6330c96 [IPSCCP] Guard a user of getInitializer with hasDefinitiveInitializer
We are not allowed to reason about an initializer value without first
consulting hasDefinitiveInitializer.

llvm-svn: 309594
2017-07-31 17:47:07 +00:00
Craig Topper cb0e74975a [AVX-512] Remove patterns that select vmovdqu8/16 for unmasked loads. Prefer vmovdqa64/vmovdqu64 instead.
These were taking priority over the aligned load instructions since there is no vmovda8/16. I don't think there is really a difference between aligned and unaligned on newer cpus so I don't think it matters which instructions we use.

But with this change we reduce the size of the isel table a little and we allow the aligned information to pass through to the evex->vec pass and produce the same output has avx/avx2 in some cases.

I also generally dislike patterns rooted in a bitcast which these were.

Differential Revision: https://reviews.llvm.org/D35977

llvm-svn: 309589
2017-07-31 17:35:44 +00:00
Simon Pilgrim 7b89ab5887 Strip trailing whitespace. NFCI.
llvm-svn: 309584
2017-07-31 17:09:27 +00:00
Simon Pilgrim 77bdbc197e Fix typo in comment.
llvm-svn: 309583
2017-07-31 17:06:55 +00:00
Aditya Nandakumar 02c602e18c [GISel]: Support Widening G_ICMP's destination operand.
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737

Reviewed by Tim

llvm-svn: 309579
2017-07-31 17:00:16 +00:00
Amaury Sechet 4358c5217d Do not recombine FMA when that is not needed.
Summary: As per title. This creates useless recombines.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33848

llvm-svn: 309578
2017-07-31 16:56:25 +00:00
Florian Hahn a3ad61d874 Exclude more unused functions from release build.
llvm-svn: 309576
2017-07-31 16:44:28 +00:00
Florian Hahn 1b09a37c07 Extend ifndef to printDebugLoc.
GCC7 did not warn about that, but Clang does.

llvm-svn: 309573
2017-07-31 16:29:00 +00:00
Florian Hahn 94b8a87c8e Extend ifdefs to more unused helper functions.
This fixes a buildbot failure with -Werror introduced by r309553

llvm-svn: 309572
2017-07-31 16:11:43 +00:00
Benjamin Kramer d7b1e5af0a [DebugInfo] Don't overwrite DWARFUnit fields if the CU DIE doesn't have them.
DIEs are lazily deserialized so it's possible that the DWO CU is created
before the DIE is parsed. DWO shares .debug_addr and .debug_ranges with the
object file so overwriting the offset with 0 will make the CU unusable.

No test case because I couldn't get clang to emit a non-zero range base.

llvm-svn: 309570
2017-07-31 15:32:39 +00:00
Don Hinton 1e4f60275f [docker] Fix unmatched quote problem in here-document on older versions of bash
Summary:
When outputing usage, emit here-document directly instead of
saving in a variable first -- avoids problem with bash 3.2.57 where an
unmatched ' in the here-document results in the following error:

./build_docker_image.sh: line 135: unexpected EOF while looking for matching `''

bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin16)

Differential Revision: https://reviews.llvm.org/D36064

llvm-svn: 309568
2017-07-31 15:18:57 +00:00
Alexey Bataev 0ab22bb991 [SLP] Initial rework for min/max horizontal reduction vectorization, NFC.
Summary: All getReductionCost() functions are renamed to getArithmeticReductionCost() + added basic infrastructure to handle non-binary reduction operations.

Reviewers: spatel, mzolotukhin, Ayal, mkuper, gilr, hfinkel

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D29402

llvm-svn: 309566
2017-07-31 14:36:05 +00:00
Simon Pilgrim 11ea6fdcd7 [X86] Extending a test cases for LEA factorization.
Submitted on the behalf of Jatin Bhateja

Differential Revision: https://reviews.llvm.org/D36048

llvm-svn: 309565
2017-07-31 14:23:28 +00:00
Alexey Bataev 3e9b3eb91d [Cost] Rename getReductionCost() to getArithmeticReductionCost(), NFC.
llvm-svn: 309563
2017-07-31 14:19:32 +00:00
Simon Dardis 8930b4a049 [SelectionDAG][mips] Fix PR33883
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.

This resolves PR33883.

Thanks to Alex Crichton for reporting the issue!

Reviewers: zoran.jovanovic, atanasyan

Differential Revision: https://reviews.llvm.org/D35765

llvm-svn: 309561
2017-07-31 14:06:58 +00:00
Ayal Zaks e841b214b1 [LV] Avoid redundant operations manipulating masks
The Loop Vectorizer generates redundant operations when manipulating masks:
AND with true, OR with false, compare equal to true. Instead of relying on
a subsequent pass to clean them up, this patch avoids generating them.

Use null (no-mask) to represent all-one full masks, instead of a constant
all-one vector, following the convention of masked gathers and scatters.

Preparing for a follow-up VPlan patch in which these mask manipulating
operations are modeled using recipes.

Differential Revision: https://reviews.llvm.org/D35725

llvm-svn: 309558
2017-07-31 13:21:42 +00:00
Martin Storsjo 4a5764e3ca [llvm-dlltool] Write correct weak externals
Previously, the created object files for the import library were broken.
Write the symbol table before the string table. Simplify the code by
using a separate variable Prefix instead of duplicating a few lines.

Also update the coff-weak-exports to actually check that the generated
weak symbols can be found as intended.

Differential Revision: https://reviews.llvm.org/D36065

llvm-svn: 309555
2017-07-31 11:18:41 +00:00
Florian Hahn 6b3216aad8 Guard print() functions only used by dump() functions.
Summary:
Since  r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.

Reviewers: MatzeB

Reviewed By: MatzeB

Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D35949

llvm-svn: 309553
2017-07-31 10:07:49 +00:00
NAKAMURA Takumi b4b4c0ae17 [Modules] llvm-config: Exclude CMAKE_CFG_INTDIR. It isn't used in headers.
This is part of https://reviews.llvm.org/D35559

llvm-svn: 309552
2017-07-31 10:07:13 +00:00
George Rimar e36d7a6d68 [Support/GlobPattern] - Do not crash when pattern has characters with int value < 0.
Found it during work on LLD, it would crash on following 
linker script:

SECTIONS { .foo : { *("*®") } }
That happens because ® has int value -82. And chars are used as
array index in code, and are signed by default.

Differential revision: https://reviews.llvm.org/D35891

llvm-svn: 309549
2017-07-31 09:26:50 +00:00
Florian Hahn 4284049dcc [LoopInterchange] Do not interchange loops with function calls.
Summary:
Without any information about the called function, we cannot be sure
that it is safe to interchange loops which contain function calls. For
example there could be dependences that prevent interchanging between
accesses in the called function and the loops. Even functions without any
parameters could cause problems, as they could access memory using
global pointers.

For now, I think it is only safe to interchange loops with calls marked
as readnone.

With this patch, the LLVM test suite passes with `-O3 -mllvm
-enable-loopinterchange` and LoopInterchangeProfitability::isProfitable
returning true for all loops. check-llvm and check-clang also pass when
bootstrapped in a similar fashion, although only 3 loops got
interchanged.

Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper

Reviewed By: mcrosier

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35489

llvm-svn: 309547
2017-07-31 09:00:52 +00:00
Guy Blank b169d56dc3 [X86][AVX512] Add masked MOVS[S|D] patterns
Added patterns to recognize AND 1 on the mask of a scalar masked
move is not needed since only the lower bit is relevant for the
instruction.

Differential Revision:
https://reviews.llvm.org/D35897

llvm-svn: 309546
2017-07-31 08:26:14 +00:00
Mohammad Shahid 126e91ab09 [SLP]: Add test to resurrect the jumbled load patch. This test has multiple uses
of memory loads by different user

Change-Id: I40b5ba8b810265440f3e55efca77c4b41ca98fa4
llvm-svn: 309544
2017-07-31 07:40:54 +00:00
Hiroshi Inoue 5703fe37ab [PowerPC] Change method names; NFC
Changed method names based on the discussion in https://reviews.llvm.org/D34986:
getInt64 -> selectI64Imm,
getInt64Count -> selectI64ImmInstrCount.

llvm-svn: 309541
2017-07-31 06:27:09 +00:00
Craig Topper 97e9fa7954 [X86] Add pattern to use bzhi for 64-bit 'and' with a mask when there is a load involved.
We already had a pattern without load, but with a load we were falling back to a regular 'and' due to pattern complexity priority.

llvm-svn: 309535
2017-07-31 05:55:54 +00:00
NAKAMURA Takumi d19b960a13 gold/CMakeLists.txt: Prune (-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64).
They are handled in HandleLLVMOptions.cmake for -m32.
They are not required for -m64.

llvm-svn: 309532
2017-07-31 00:39:22 +00:00
NAKAMURA Takumi ebe04da06d Prune trailing linefeed at eof.
llvm-svn: 309531
2017-07-31 00:39:19 +00:00
David Blaikie 4dd663752d DebugInfo: Fix r309526, ensure resetting base address selection entries are used
Missed the resetting base address selections when going from a base
address version to zero base address for non-base-addressed entries.

llvm-svn: 309529
2017-07-31 00:18:24 +00:00
David Blaikie 89c81a0b91 DebugInfo: Use base address selection entries in debug_ranges to reduce relocations
(from comments in the test)
Group ranges in a range list that apply to the same section and use a base
address selection entry to reduce the number of relocations to one reloc per
section per range list. DWARF5 debug_rnglist will be more efficient than this
in terms of relocations, but it's still better than one reloc per entry in a
range list.

This is an object/executable size tradeoff - shrinking objects, but growing
the linked executable. In one large binary tested, total object size (not just
debug info) shrank by 16%, entirely relocation entries. Linked executable
grew by 4%. This was with compressed debug info in the objects, uncompressed
in the linked executable. Without compression in the objects, the win would be
smaller (the growth of debug_ranges itself would be more significant).

llvm-svn: 309526
2017-07-30 22:10:00 +00:00
Saleem Abdulrasool 16a2f5ac8e test: add an additional cfi_return_column test
Ensure that we still coalesce identical CIEs across FDEs even with
cfi_return_column alterations.

llvm-svn: 309525
2017-07-30 21:30:54 +00:00
Saleem Abdulrasool 86495756df test: make the test clearer (NFC)
Use `llvm-objdump -dwarf=frames` to dump the .eh_frame to validate the
output textually rather than compare the binary output.  This makes it
easier to see what is being checked.  NFC.

llvm-svn: 309524
2017-07-30 21:30:53 +00:00
Lama Saba 880e8788b1 NFC: spell correction.
On behalf of jbhateja

Differential Revision: https://reviews.llvm.org/D35885

llvm-svn: 309521
2017-07-30 20:12:17 +00:00
Tobias Grosser efeb1a61f0 Fix typo in comment
llvm-svn: 309519
2017-07-30 18:01:16 +00:00
David Blaikie e0c47f9dd8 llvm-symbolizer/print_context.c test: Make debug info path independent
llvm-svn: 309518
2017-07-30 17:26:34 +00:00
David Blaikie 58cee4cdfd llvm-symbolizer: Make test portable using an explicit object file rather than the host compiler
llvm-svn: 309517
2017-07-30 17:16:32 +00:00
David Blaikie 80ee892dfc Make test robust to changes in prefix/avoid hardcoded line numbers
llvm-svn: 309516
2017-07-30 16:05:26 +00:00
Dylan McKay 6b5e5c38cb Revert "[AVR] Mark a failing symbolizer test as XFAIL"
This reverts commit 83a0e876349adb646ba858eb177b22b0b4bfc59a.

llvm-svn: 309515
2017-07-30 15:38:07 +00:00
David Blaikie a62f1cb1fa DebugInfo: Fix for CU index usage in 309507
Not sure quite how I failed so clearly to test this, but anyway.

llvm-svn: 309514
2017-07-30 15:15:58 +00:00
Dylan McKay 9c626282da [AVR] Mark a failing symbolizer test as XFAIL
llvm-svn: 309512
2017-07-30 14:55:11 +00:00
Michael Zuckerman 11148120d0 Expanding the test case for vf8 for stride 4 interleaved.
llvm-svn: 309511
2017-07-30 11:54:57 +00:00
Coby Tayree 48d67cdbb4 [x86][inline-asm][ms-compat] legalize the use of "jc/jz short <op>"
MS ignores the keyword "short" when used after a jc/jz instruction, LLVM ought to do the same.
Test: D35893

Differential Revision: https://reviews.llvm.org/D35892

llvm-svn: 309509
2017-07-30 11:12:47 +00:00
David Blaikie ebac0b9c62 DebugInfo: Use DWP cu_index to speed up symbolizing (as intended)
I was a bit lazy when I first implemented this & skipped the index
lookup - obviously for large files this becomes pretty crucial, so here
we go, do the index lookup. Speeds up large DWP symbolizing by... lots.
(20m -> 20s, actually, maybe more in a release build (that was a release
build without index lookup, compared to a debug/non-release build with
the index usage))

llvm-svn: 309507
2017-07-30 08:12:07 +00:00
David Blaikie 4b4c47c406 DebugInfo: Group member variable along with the rest
Committed in r309498 I didn't spot where the rest of the private members
were in DWARFContext at the time - group them up again.

llvm-svn: 309506
2017-07-30 08:12:05 +00:00
Craig Topper 951f0ca104 [X86] Add addsub intrinsics to the intrinsic lowering table so we have a single set of isel patterns.
llvm-svn: 309502
2017-07-30 06:02:59 +00:00
Dehao Chen 95f003003d Refactor the build{Module|Function}SimplificationPipeline to expose optimization phase.
Summary: This is in preparation of https://reviews.llvm.org/D36052

Reviewers: chandlerc, davidxl, tejohnson

Reviewed By: chandlerc

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D36053

llvm-svn: 309500
2017-07-30 04:55:39 +00:00
David Blaikie e5adb68e04 DebugInfo: Provide option for explicitly specifying the name of the DWP file
If you've archived the DWP file somewhere it's probably useful to be
able to just tell llvm-symbolizer where it is when you're symbolizing
stack traces from the binary.

This only provides a mechanism for specifying a single DWP file, good if
you're symbolizing a program with a single DWP file, but it's likely if
the program is dynamically linked that you might have a DWP for each
dynamic library - in which case this feature won't help (at least as
it's surfaced in llvm-symbolizer for now) - in theory it could be
extended to specify a collection of DWP files that could all be
consulted for split CU hash resolution.

llvm-svn: 309498
2017-07-30 01:34:08 +00:00
Sam Elliott 67b0e589d0 Migrate PGOMemOptSizeOpt to use new OptimizationRemarkEmitter Pass
Summary:
Fixes PR33790.

This patch still needs a yaml-style test, which I shall write tomorrow

Reviewers: anemet

Reviewed By: anemet

Subscribers: anemet, llvm-commits

Differential Revision: https://reviews.llvm.org/D35981

llvm-svn: 309497
2017-07-30 00:35:33 +00:00
Florian Hahn f63a5e91db [AArch64] Tie source and destination operands for AESMC/AESIMC.
Summary:
Most CPUs implementing AES fusion require instruction pairs of the form
    AESE Vn, _
    AESMC Vn, Vn
and
    AESD Vn, _
    AESIMC Vn, Vn

The constraint is added to AES(I)MC instructions which use the result of
an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which
constraint source and destination registers to be the same.

A nice side effect of this change is that now all possible pairs are
scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll
test case.

I had to update aes_load_store. The version I added initially was very
reduced and with the new constraint, AESE/AESMC could not be scheduled
back-to-back. I updated the test to be more realistic and still expose
the same scheduling problem as the initial test case.

Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga

Reviewed By: t.p.northover, evandro

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35299

llvm-svn: 309495
2017-07-29 20:35:28 +00:00
Florian Hahn 2f86e3d494 [AArch64] Use 8 bytes as preferred function alignment on Cortex-A53.
Summary:
This change gives a 0.25% speedup on execution time, a 0.82% improvement
in benchmark scores and a 0.20% increase in binary size on a Cortex-A53.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite and a range of proprietary suites.

Reviewers: t.p.northover, aadg, silviu.baranga, mcrosier, rengolin

Reviewed By: rengolin

Subscribers: grimar, davide, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35568

llvm-svn: 309494
2017-07-29 20:04:54 +00:00
Saleem Abdulrasool 29199f5260 MC: simplify internal function call parameter
Rather than passing along most of the parameters, pass a reference to
the MCDWARFrameInfo instead.  This makes it easier to pass additional
information about the frame to the checks.  We need to keep the extra
constructor for the Key around to allow the construction of the null and
tombstone keys.  NFC.

llvm-svn: 309493
2017-07-29 20:03:02 +00:00
Saleem Abdulrasool fc85067f30 MC: account for the return column in the CIE key
If the return column is different, we cannot coalesce the CIE across the
FDEs.  Add that to the key calculation.  This ensures that we emit a
separate CIE.

llvm-svn: 309492
2017-07-29 20:03:00 +00:00
Hiroshi Inoue 40d01f3cb5 Fix test failure without X86 backend
move test/Transforms/SimplifyCFG/disable-lookup-table.ll into test/Transforms/SimplifyCFG/X86/disable-lookup-table.ll to avoid test failure when X86 backend is not enabled

llvm-svn: 309487
2017-07-29 15:03:31 +00:00
Simon Pilgrim 718cb0ea62 [SelectionDAG][X86] CombineBT - more aggressively determine demanded bits
This patch is in 2 parts:

1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT.

2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match.

Differential Revision: https://reviews.llvm.org/D35896

llvm-svn: 309486
2017-07-29 14:50:25 +00:00
Tobias Grosser 670a5d88a3 [tests] Do not emity binary bitcode to stdout in RegionInfo tests
llvm-svn: 309485
2017-07-29 09:58:43 +00:00
Michal Gorny 4ba0c1dc4a [OCaml] Pass -D/-UNDEBUG through to ocamlc
Detect [/-][DU]NDEBUG in CMAKE_C_FLAGS* and pass them through to ocamlc.
This is necessary because their value might affect visibility of dump
functions in LLVM and ocamlc uses its own compiler and flags by default.

Differential Revision: https://reviews.llvm.org/D35898

llvm-svn: 309483
2017-07-29 08:10:24 +00:00
Dehao Chen 4194ebff8d Update the test to make windows bot pass.
llvm-svn: 309482
2017-07-29 07:01:25 +00:00
Michal Gorny cc2eccdf50 [OCaml] Install dynamic libraries in 'stubdirs' directory
Install the OCaml dynamic libraries in the 'stubdirs' directory rather
than the llvm subdirectory in order to fix running executables created
by ocamlc. Otherwise, the executables fail to run being unable to locate
the libraries (unless the LLVM directory is explicitly added to
LD_LIBRARY_PATH).

The staging directories are not altered since they work for our
development setup anyway, and installing into two directories would
unnecessarily make the code more complex.

Differential Revision: https://reviews.llvm.org/D35995

llvm-svn: 309481
2017-07-29 06:46:45 +00:00
Sanjoy Das b5a968f62d [SCEV] Change an early exit to an assert; NFC
llvm-svn: 309480
2017-07-29 05:32:47 +00:00
Dehao Chen 246254b97d update the test file that was omitted in r309478.
llvm-svn: 309479
2017-07-29 04:11:20 +00:00
Dehao Chen ce0842ce9c Refine the PGOOpt and SamplePGOSupport handling.
Summary:
Now that SamplePGOSupport is part of PGOOpt, there are several places that need tweaking:
1. AddDiscriminator pass should *not* be invoked at ThinLTOBackend (as it's already invoked in the PreLink phase)
2. addPGOInstrPasses should only be invoked when either ProfileGenFile or ProfileUseFile is non-empty.
3. SampleProfileLoaderPass should only be invoked when SampleProfileFile is non-empty.
4. PGOIndirectCallPromotion should only be invoked in ProfileUse phase, or in ThinLTOBackend of SamplePGO.

Reviewers: chandlerc, tejohnson, davidxl

Reviewed By: chandlerc

Subscribers: sanjoy, mehdi_amini, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D36040

llvm-svn: 309478
2017-07-29 04:10:24 +00:00
Tom Stellard 503fd446ad AMDGPU: Remove deadcode from AMDGPUInstPrinter
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D36034

llvm-svn: 309477
2017-07-29 03:56:53 +00:00
Tom Stellard 5c50cdf0e8 AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files
Summary: This is only used by R600.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D35926

llvm-svn: 309476
2017-07-29 03:44:07 +00:00
Jessica Paquette d87f54493d [MachineOutliner] NFC: Change IsTailCall to a call class + frame class
This commit

- Removes IsTailCall and replaces it with a target-defined unsigned
- Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall
- Adds a call class + frame class classification to OutlinedFunction and Candidate respectively

This accomplishes a couple things.

Firstly, we don't need the notion of *tail call* in the general outlining algorithm.

Secondly, we now can have different "outlining classes" for each candidate within a set of candidates.
This will make it easy to add new ways to outline sequences for certain targets and dynamically choose
an appropriate cost model for a sequence depending on the context that that sequence lives in.

Ultimately, this should get us closer to being able to do something like, say avoid saving the link
register when outlining AArch64 instructions.

llvm-svn: 309475
2017-07-29 02:55:46 +00:00
NAKAMURA Takumi 15b5c8b206 lit::shtest-format.py: Make write-bad-encoding.py py3-aware.
Traceback (most recent call last):
    File "llvm/utils/lit/tests/Inputs/shtest-format/external_shell/write-bad-encoding.py", line 5, in <module>
      sys.stdout.write(b"a line with bad encoding: \xc2.")

sys.stdout.write doesn't accept bytes but sys.stdout.buffer.write accepts.

llvm-svn: 309473
2017-07-29 02:52:56 +00:00
Matt Arsenault 9608a2891d AMDGPU: Make areMemAccessesTriviallyDisjoint more aware of segment flat
Checking the encoding is insufficient since now there can
be global or scratch instructions.

llvm-svn: 309472
2017-07-29 01:26:21 +00:00
Matt Arsenault dc8f5cc39c AMDGPU: Teach isLegalAddressingMode about global_* instructions
Also refine the flat check to respect flat-for-global feature,
and constant fallback should check global handling, not
specifically MUBUF.

llvm-svn: 309471
2017-07-29 01:12:31 +00:00
Matt Arsenault 4e309b0861 AMDGPU: Start selecting global instructions
llvm-svn: 309470
2017-07-29 01:03:53 +00:00
Eugene Zelenko 4d060b71cc [Hexagon] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 309469
2017-07-29 00:56:56 +00:00
Alexander Shaposhnikov e574034f28 [llvm] Update MachOObjectFile::exports interface
This diff removes the second argument of the method MachOObjectFile::exports.
In all in-tree uses this argument is equal to "this" and 
without this argument the interface seems to be cleaner.

Test plan: make check-all

llvm-svn: 309462
2017-07-29 00:30:45 +00:00
Eli Friedman b6fc8dc5e5 Fix update_llc_test_checks.py ARM parsing
When I tried running the script, the ARM regex parser could not parse
my code. It failed because the .Lfunc_end line has a comment at the
end of it, so this commit removes the newline at the end of the regex.

Patch by Joel Galenson!

Differential Revision: https://reviews.llvm.org/D35641

llvm-svn: 309457
2017-07-28 23:58:24 +00:00
Tobias Edler von Koch 819d84032e [LTO] llvm-lto2: Add option to load sample profile
Summary:
This exposes LTO's Conf.SampleProfile as a command line option
(-lto-sample-profile-file) for testing via the llvm-lto2 utility.

Reviewers: pcc, danielcdh

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D36030

llvm-svn: 309456
2017-07-28 23:43:22 +00:00
Adrian Prantl 359846fe1c Remove the unused offset field from LiveDebugValues (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309455
2017-07-28 23:25:51 +00:00
Adrian Prantl b2811e50f9 Remove the unused offset field from LiveDebugVariables (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309451
2017-07-28 23:06:50 +00:00
Adrian Prantl 8b9bb534a1 Remove the unused offset from DBG_VALUE (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309450
2017-07-28 23:00:45 +00:00
Adrian Prantl d92ac5a259 Remove the unused DBG_VALUE offset parameter from GlobalISel (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309449
2017-07-28 22:46:20 +00:00
Adrian Prantl eb5d69ba81 Update the Go bindings for r309426 (remove offset from llvm.dbg.value)
llvm-svn: 309448
2017-07-28 22:44:44 +00:00
Farhana Aleen 7ccc9fd0e2 Added tests for i8 interleaved-load-pattern of stride=4, VF=(8, 16, 32).
llvm-svn: 309447
2017-07-28 22:43:34 +00:00
Adrian Prantl 1b63dc54b0 Remove the unused DBG_VALUE offset parameter from RegAllocFast (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309446
2017-07-28 22:36:55 +00:00
Sumanth Gundapaneni 6af104e418 Add documentation for the attribute "no-jump-tables"
llvm-svn: 309445
2017-07-28 22:26:22 +00:00
Sumanth Gundapaneni 8d50a50e98 [SimplifyCFG] Make the no-jump-tables attribute also disable switch lookup tables
Differential Revision: https://reviews.llvm.org/D35579

llvm-svn: 309444
2017-07-28 22:25:40 +00:00
Kostya Serebryany f14996b962 [libFuzzer] improve support for inline-8bit-counters (make it more correct and faster)
llvm-svn: 309443
2017-07-28 22:00:56 +00:00
Krzysztof Parzyszek cfd8806b2b [Hexagon] Formatting changes, NFC
llvm-svn: 309442
2017-07-28 21:52:21 +00:00
Easwaran Raman 51b809bf2f [Inliner] Do not apply any bonus for cold callsites.
Summary:
Inlining threshold is increased by application of bonuses when the
callee has a single reachable basic block or is rich in vector
instructions. Similarly, inlining cost is reduced by applying a large
bonus when the last call to a static function is considered for
inlining. This patch disables the application of these bonuses when the
callsite or the callee is cold. The intention here is to prevent a large
cold callsite from being inlined to a non-cold caller that could prevent
the caller from being inlined. This is especially important when the
cold callsite is a last call to a static since the associated bonus is
very high.

Reviewers: chandlerc, davidxl

Subscribers: danielcdh, llvm-commits

Differential Revision: https://reviews.llvm.org/D35823

llvm-svn: 309441
2017-07-28 21:47:36 +00:00
Adrian Prantl a617576bb1 Remove the unused dbg.value offset from SelectionDAG (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309436
2017-07-28 21:27:35 +00:00
Reid Kleckner 67de34897c [lit] Use a %{python} substitution to avoid relying on python being on PATH
llvm-svn: 309434
2017-07-28 21:13:47 +00:00
Reid Kleckner 7895d2fcab [lit] Remove stale test inputs before running check-lit
This should fix googletest-format test failures on the clang modules
buildbots, which have a stale copy of the OneTest script in the build
directory.

llvm-svn: 309432
2017-07-28 21:00:57 +00:00
Adrian Prantl 1b842dad3e Reword sentence in LangRef
llvm-svn: 309431
2017-07-28 20:44:29 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Alexey Bataev e109655c90 [SLP] Allow vectorization of the instruction from the same basic blocks only, NFC.
Summary:
After some changes in SLP vectorizer we missed some additional checks to
limit the instructions for vectorization. We should not perform analysis
of the instructions if the parent of instruction is not the same as the
parent of the first instruction in the tree or it was analyzed already.

Subscribers: mzolotukhin

Differential Revision: https://reviews.llvm.org/D34881

llvm-svn: 309425
2017-07-28 20:11:16 +00:00
Reid Kleckner 9be82c3169 Fix conditional tail call branch folding when both edges are the same
The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:

BB#1: derived from LLVM BB %entry
    Live Ins: %EFLAGS
    Predecessors according to CFG: BB#0
        JE_1 <BB#5>, %EFLAGS<imp-use,kill>
        JMP_1 <BB#5>

BB#5: derived from LLVM BB %sw.epilog
    Predecessors according to CFG: BB#1
        TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...

We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.

This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.

Fixes PR33980

llvm-svn: 309422
2017-07-28 19:48:40 +00:00
Matt Arsenault da9ab148f3 AMDGPU: Look through a bitcast user of an out argument
This allows handling of a lot more of the interesting
cases in Blender. Most of the large functions unlikely
to be inlined have this pattern.

This is a special case for what clang emits for OpenCL 3
element vectors. Annoyingly, these are emitted as
<3 x elt>* pointers, but accessed as <4 x elt>* operations.
This also needs to handle cases where a struct containing
a single vector is used.

llvm-svn: 309419
2017-07-28 19:06:16 +00:00
Chad Rosier 2f49803c1f [Value Tracking] Refactor icmp comparison logic into helper. NFC.
llvm-svn: 309417
2017-07-28 18:47:43 +00:00
Matt Arsenault c06574ffc0 AMDGPU: Add pass to replace out arguments
It is better to return arguments directly in registers
if we are making a call rather than introducing expensive
stack usage. In one of sample compile from one of
Blender's many kernel variants, this fires on about
~20 different functions. Future improvements may be to
recognize simple cases where the pointer is indexing a small
array. This also fails when the store to the out argument
is in a separate block from the return, which happens in
a few of the Blender functions. This should also probably
be using MemorySSA which might help with that.

I'm not sure this is correct as a FunctionPass, but
MemoryDependenceAnalysis seems to not work with
a ModulePass.

I'm also not sure where it should run.I think it should
run  before DeadArgumentElimination, so maybe either
EP_CGSCCOptimizerLate or EP_ScalarOptimizerLate.

llvm-svn: 309416
2017-07-28 18:40:05 +00:00
Hiroshi Yamauchi 1b179bc5ff [LVI] Constant-propagate a zero extension of the switch condition value through case edges
Summary:
LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges.

But we have seen a case where a zero-extended value of the switch condition is used past case edges for which the constant propagation doesn't occur.

This patch adds a small logic to handle such a case in getEdgeValueLocal().

This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of the constant propagation causes longer live ranges and more spill code than necessary.

With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%.




Reviewers: wmi, dberlin, sanjoy

Reviewed By: sanjoy

Subscribers: davide, davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D34822

llvm-svn: 309415
2017-07-28 18:35:25 +00:00
Tim Northover a7f583e33b GlobalISel: map 128-bit values to an FPR by default.
Eventually we may want to allow a pair of GPRs but absolutely nothing in the
entire world is ready for that yet.

llvm-svn: 309404
2017-07-28 17:11:01 +00:00
Reid Kleckner 125c74bc56 [lit] Dump some FileCheck inputs to try to debug some failing tests
llvm-svn: 309400
2017-07-28 16:24:18 +00:00
Reid Kleckner 432914bba0 [lit] Fix shtest-format external_shell failures
When using win32 cmd.exe, turn off command echoing at the beginning of
the script (@echo off).

Replace a bash shell script with a python script for the
fail_with_bad_encoding test.

llvm-svn: 309399
2017-07-28 16:13:02 +00:00
Matt Arsenault 9166ce86e8 AMDGPU: Annotate implicitarg.ptr usage
We need to pass something to functions for this to work.
It isn't derivable just from the kernarg segment pointer
because the implicit arguments are placed after the
kernel arguments.

Also fixes missing test for the intrinsic.

llvm-svn: 309398
2017-07-28 15:52:08 +00:00
Wei Mi 55c05e14af [GVN] Recommit the patch "Add phi-translate support in scalarpre"
Recommit after workaround the bug PR31652.

Three bugs fixed in previous recommits: The first one is to use CurrentBlock
instead of PREInstr's Parent as param of performScalarPREInsertion because
the Parent of a clone instruction may be uninitialized. The second one is stop
PRE when CurrentBlock to its predecessor is a backedge and an operand of CurInst
is defined inside of CurrentBlock. The same value defined inside of loop in last
iteration can not be regarded as available. The third one is an out-of-bound
array access in a flipped if guard.

Right now scalarpre doesn't have phi-translate support, so it will miss some
simple pre opportunities. Like the following testcase, current scalarpre cannot
recognize the last "a * b" is fully redundent because a and b used by the last
"a * b" expr are both defined by phis.

long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();

void foo(long a, long b, long c, long d) {

  g1 = a * b;
  if (__builtin_expect(g2 > 3, 0)) {
    a = c;
    b = d;
    g2 = a * b;
  }
  g3 = a * b;      // fully redundant.

}

The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.

Differential Revision: https://reviews.llvm.org/D32252

llvm-svn: 309397
2017-07-28 15:47:25 +00:00
Chris Bieneman 5726589209 [CMake] NFC. Add intrinsics_gen target to CMake Exports
By creating a dummy of this target in LLVMConfig.cmake, projects that can build against out-of-tree LLVM can freely depend on the target without needing to have conditionals for if LLVM is in-tree or out-of-tree.

llvm-svn: 309389
2017-07-28 15:33:35 +00:00
Chad Rosier e42b44b87d [ValueTracking] Remove a number of unused arguments. NFC.
llvm-svn: 309385
2017-07-28 14:39:06 +00:00
Joel Jones 08e88e8df7 [AArch64] Standardize suffixes for LSE Atomics mnemonics (NFCI)
This NFC changeset standardizes the suffixes used for LSE Atomics
instructions.

It changes the existing suffixes - 'b', 'h', 's', 'd' - to the existing
standard 'B', 'H', 'W' and 'X'.

This changeset is the result of the code review discussion for D35319.

Patch by: steleman

Differential Revision: https://reviews.llvm.org/D35927

llvm-svn: 309384
2017-07-28 14:09:24 +00:00
Strahinja Petrovic 25e9e1b866 [ARM] Add the option to directly access TLS pointer
This patch enables choice for accessing thread local
storage pointer (like '-mtp' in gcc).

Differential Revision: https://reviews.llvm.org/D34408

llvm-svn: 309381
2017-07-28 12:54:57 +00:00
Simon Pilgrim 1ff3da7273 [X86] Add test case for PR33290
llvm-svn: 309375
2017-07-28 09:43:52 +00:00
Simon Pilgrim 88d3bed351 [X86][AVX] Cleanup shuffle combine tests - remove old prefixes.
llvm-svn: 309374
2017-07-28 09:41:55 +00:00
Peter Smith 5804364f7a [ARM] Add test to check pcs of ARM ABI runtime floating point helpers
The ARM Runtime ABI document (IHI0043) defines the AEABI floating point
helper functions in section 4.1.2 The floating-point helper functions.
The functions listed in this section must always use the base AAPCS calling
convention.

This test generates calls to all the helper functions that llvm supports
and checks that the base AAPCS calling convention has been used. We test
the equivalent of -mfloat-abi=soft, -mfloat-abi=softfp, -mfloat-abi=hardfp
with an FPU that supports single and double precision, and one that only
supports double precision.

Differential Revision: https://reviews.llvm.org/D35904

llvm-svn: 309371
2017-07-28 09:21:00 +00:00
Max Kazantsev fa4969539a [SCEV] Do not visit nodes twice in containsConstantSomewhere
This patch reworks the function that searches constants in Add and Mul SCEV expression
chains so that now it does not visit a node more than once, and also renames this function
for better correspondence between its implementation and semantics.

Differential Revision: https://reviews.llvm.org/D35931

llvm-svn: 309367
2017-07-28 06:42:15 +00:00
Jessica Paquette 4602c3437c [MachineOutliner] NFC: Comment tidying
The comment on describing the suffix tree had some pruning
stuff that was out of date in it.

Also fixed some typos.

llvm-svn: 309365
2017-07-28 05:59:30 +00:00
Michal Gorny a3ad81d255 Revert rL309320 - "[OCaml] Respect CMAKE_C_FLAGS for OCaml C files"
This causes buildbot breakage for systems where OCaml files are built
with a different compiler.

llvm-svn: 309364
2017-07-28 04:29:20 +00:00
Saleem Abdulrasool 61d81ec754 test: require x86 backend
Ensure that the target is registered before using it.  Should fix the
hexagon Bots.

llvm-svn: 309363
2017-07-28 04:15:35 +00:00
Saleem Abdulrasool a219b3d8d1 MC: add support for cfi_return_column
This adds support for the CFI pseudo-op return_column.  This specifies
the frame table column which contains the return address.

Addresses PR33953!

llvm-svn: 309360
2017-07-28 03:39:19 +00:00
Saleem Abdulrasool b3c70c09e3 MC: clang-format enumeration (NFC)
This was hard to insert elements into.  clang-format it so that it is
easier.  NFC.

llvm-svn: 309359
2017-07-28 03:39:18 +00:00
Sanjoy Das 843ab57457 Revert "[SCEV] Cache results of computeExitLimit"
This reverts commit r309080.  The patch needs to clear out the
ScalarEvolution::ExitLimits cache in forgetMemoizedResults.

I've replied on the commit thread for the patch with more details.

llvm-svn: 309357
2017-07-28 03:25:07 +00:00
Jessica Paquette 809d708b8a [MachineOutliner] NFC: Split up getOutliningBenefit
This is some more cleanup in preparation for some actual
functional changes. This splits getOutliningBenefit into
two cost functions: getOutliningCallOverhead and
getOutliningFrameOverhead. These functions return the
number of instructions that would be required to call
a specific function and the number of instructions
that would be required to construct a frame for a
specific funtion. The actual outlining benefit logic
is moved into the outliner, which calls these functions.

The goal of refactoring getOutliningBenefit is to:

- Get us closer to getting rid of the IsTailCall flag

- Further split up "target-specific" things and
"general algorithm" things

llvm-svn: 309356
2017-07-28 03:21:58 +00:00
Davide Italiano 75a001ba78 [JumpThreading] Stop falsely preserving LazyValueInfo.
JumpThreading claims to preserve LVI, but it doesn't preserve
the analyses which LVI holds a reference to (e.g. the Dominator).
In the current pass manager infrastructure, after JT runs, the
PM frees these analyses (including DominatorTree) but preserves
LVI.

CorrelatedValuePropagation runs immediately after and queries
a corrupted domtree, causing weird miscompiles.

This commit disables the preservation of LVI for the time being.
Eventually, we should either move LVI to a proper dependency
tracking mechanism (i.e. an analyses shouldn't hold references
to other analyses and compute them on demand if needed), or
we should teach all the passes preserving LVI to preserve the
analyses LVI depends on.

The new pass manager has a mechanism to invalidate LVI in case
one of the analyses it depends on becomes invalid, so this problem
shouldn't exist (at least not in this immediate form), but handling
of analyses holding references is still a very delicate subject.

Fixes PR33917 (and rustc).

llvm-svn: 309355
2017-07-28 03:10:43 +00:00
David Blaikie 89daf77a11 DebugInfo: Consider a CU containing only local imported entities to be 'empty'
This can come up in ThinLTO & wastes space & makes degenerate IR.

As per the added FIXME, ultimately, local imported entities should hang
off the function and that way the imported entity list on the CU can be
tested for emptiness like all the other CU lists.

(function-attached local imported entities are probably also the best
path forward for fixing how imported entities are handled both in
cross-module use (currently, while ThinLTO preserves the imported
entities, they would not get used at the imported inlined location -
only in the abstract origin that appears in the partial CU created by
the import (which isn't emitted under Fission due to cross-CU
limitations there)) and to reduce the number of points where imported
entities are emitted (they're currently emitted into every inlined
instance, concrete instance, and abstract origin - they should only go
in teh abstract origin if there is one, otherwise in the concrete
instance - but this requires lots of delayed handling and wiring up,
same as abstract variables & subprograms))

llvm-svn: 309354
2017-07-28 03:06:25 +00:00
Davide Italiano 01cb947abb [JumpThreading] Add an option to dump LazyValueInfo after the run.
Differential Revision:  https://reviews.llvm.org/D35973

llvm-svn: 309353
2017-07-28 02:57:43 +00:00
Matthias Braun c618a466f1 ARMFrameLowering: Only set ExtraCSSpill for actually unused registers.
The code assumed that unclobbered/unspilled callee saved registers are
unused in the function. This is not true for callee saved registers that are
also used to pass parameters such as swiftself.

rdar://33401922

llvm-svn: 309350
2017-07-28 01:36:32 +00:00
Reid Kleckner 4ca8d21ef3 [lit] Port googletest lit tests to Windows
Summary:
The technique of directly calling subprocess.Popen on a python script
doesn't work on Windows. The executable path of the command must refer
to a valid win32 executable.

Instead, rename all the python scripts masquerading as gtest executables
to have .py extensions, so we can easily detect then and call the python
executable for them. Do this on Linux as well as Windows for
consistency.

The test suite directory names also come out in lower-case on Windows.
We can consider removing that in a later patch. This change just updates
the FileCheck lines to match on Windows.

Fixes PR33933

Reviewers: modocache, mgorny

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35909

llvm-svn: 309347
2017-07-28 01:05:55 +00:00
Dehao Chen e70a472bad Changing the default MaxNumPromotions from 2 to 3.
Summary: In performance tuning, we see performance benefits when enlarge the maximum num promotion targets to 3. This is safe as soon as we have total percentage threshold properly setup (https://reviews.llvm.org/D35962)

Reviewers: davidxl, tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D35966

llvm-svn: 309346
2017-07-28 01:03:10 +00:00
Dehao Chen f4240b5b91 Separate the ICP total threshold and remaining threshold.
Summary: In the current implementation, isPromotionProfitable only checks if the call count to a direct target is no less than a certain percentage threshold of the remaining call counts that have not been promoted. This causes code size problems when the target count is small but greater than a large portion of remaining counts. E.g. target1 takes 99.9%, while target2 takes 0.1%. Both targets will be promoted and inlined, makes the function size too large, which potentially prevents it from further inlining into its callers. This patch adds another percentage threshold against the total indirect call count. If the target count needs to be no less than both thresholds in order to be promoted speculatively.

Reviewers: davidxl, tejohnson

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D35962

llvm-svn: 309345
2017-07-28 01:02:54 +00:00
Dehao Chen 8260d66556 Increase the ImportHotMultiplier to 10.0
Summary: The original 3.0 hot mupltiplier is too small, and would prevent hot callsites from being inline. This patch increases the hot multilier to 10.0

Reviewers: davidxl, tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D35969

llvm-svn: 309344
2017-07-28 01:02:34 +00:00
Reid Kleckner 07a5d4372e [X86] Fix latent bug in sibcall eligibility logic
The X86 tail call eligibility logic was correct when it was written, but
the addition of inalloca and argument copy elision broke its
assumptions. It was assuming that fixed stack objects were immutable.

Currently, we aim to emit a tail call if no arguments have to be
re-arranged in memory. This code would trace the outgoing argument
values back to check if they are loads from an incoming stack object.
If the stack argument is immutable, then we won't need to store it back
to the stack when we tail call.

Fortunately, stack objects track their mutability, so we can just make
the obvious check to fix the bug.

This was http://crbug.com/749826

llvm-svn: 309343
2017-07-28 00:58:35 +00:00
Kostya Serebryany 063b652096 [sanitizer-coverage] rename sanitizer-coverage-create-pc-table into sanitizer-coverage-pc-table and add plumbing for a clang flag
llvm-svn: 309337
2017-07-28 00:09:29 +00:00
Adrian Prantl 8f4b353ee1 Remove unused function from AArch64 backend (NFC)
llvm-svn: 309336
2017-07-27 23:52:06 +00:00
Kostya Serebryany b75d002f15 [sanitizer-coverage] add a feature sanitizer-coverage-create-pc-table=1 (works with trace-pc-guard and inline-8bit-counters) that adds a static table of instrumented PCs to be used at run-time
llvm-svn: 309335
2017-07-27 23:36:49 +00:00
Jessica Paquette 78681be2a4 [MachineOutliner] Cleanup: move findCandidates out of suffix tree
Doing some cleanup in preparation for some functional changes.
This commit moves findCandidates out of the suffix tree and into the
MachineOutliner class. This is much easier to follow, and removes
the burden of candidate choice from the suffix tree.

It also adds a couple FIXMEs and simplifies building outlined function
names.

llvm-svn: 309334
2017-07-27 23:24:43 +00:00
Reid Kleckner dd853e5406 [llvm-pdbutil] Clean up ExitOnError usage to add ": " to our errors
The banner parameter is supposed to end in a separator, like ": ".
Otherwise, we get ugly errors like:

Error while reading publics streamNative error: blah blah

llvm-svn: 309332
2017-07-27 23:13:18 +00:00
Reid Kleckner ef443296a4 [PDB] Initialize the std::array<ulittle32_t> used for the gsi bitmap
With ASan, we would write about 512 bytes of malloc fill value to the
PDB, with some random bits ORed in here and there. Dumping the PDB would
always fail reliably.

llvm-svn: 309331
2017-07-27 23:13:05 +00:00
Davide Italiano 1a26f24f35 [ConstantFolder] Don't try to fold gep when the idx is a vector.
The code in ConstantFoldGetElementPtr() assumes integers, and
therefore it crashes trying to get the integer bidwith of a vector
type (in this case <4 x i32>. I just changed the code to prevent
the folding in case of vectors and I didn't bother to generalize
as this doesn't seem to me something that really happens in
practice, but I'm willing to change the patch if you think
it's worth it.
This is hard to trigger from -instsimplify or -instcombine
only as the second instruction is dead, so the test uses loop-unroll.

Differential Revision:  https://reviews.llvm.org/D35956

llvm-svn: 309330
2017-07-27 22:20:44 +00:00
Ahmed Bougacha c890993726 [X86] Don't lie about legality to TLI's demanded bits.
Like r309323, X86 had a typo where it passed the wrong flags to TLO.

Found by inspection; I haven't been able to tickle this into having
observable behavior.  I don't think it does, given that X86 doesn't have
custom demanded bits logic, and the generic logic doesn't have a lot of
exposure to illegal constructs.

llvm-svn: 309325
2017-07-27 21:28:59 +00:00
Ahmed Bougacha 52cecb1f27 [AArch64] Remove outdated comment. NFC.
There hasn't been a ternary since r231987.

llvm-svn: 309324
2017-07-27 21:27:58 +00:00
Ahmed Bougacha 87807c5a86 [AArch64] Fix legality info passed to demanded bits for TBI opt.
The (seldom-used) TBI-aware optimization had a typo lying dormant since
it was first introduced, in r252573:  when asking for demanded bits, it
told TLI that it was running after legalize, where the opposite was
true.

This is an important piece of information, that the demanded bits
analysis uses to make assumptions about the node.  r301019 added such an
assumption, which was broken by the TBI combine.

Instead, pass the correct flags to TLO.

llvm-svn: 309323
2017-07-27 21:27:25 +00:00
Michal Gorny 83012b4721 [OCaml] Fix undefined reference to LLVMDumpType() with NDEBUG
Account for the possibility of LLVMDumpType() not being available with
NDEBUG in the OCaml bindings. If it is not built into LLVM, make
the dump function raise an exception.

Since rL293359, the dump functions are built only if either NDEBUG is
not defined, or LLVM_ENABLE_DUMP is defined. As a result, if the dump
functions are not built in LLVM, the dynamic OCaml libraries fail to
load due to undefined LLVMDumpType symbol.

Differential Revision: https://reviews.llvm.org/D35899

llvm-svn: 309321
2017-07-27 21:13:25 +00:00
Michal Gorny 3073ff9fd3 [OCaml] Respect CMAKE_C_FLAGS for OCaml C files
Pass the values of CMAKE_C_FLAGS and CMAKE_C_FLAGS_${CMAKE_BUILD_TYPE}
as -ccopt to ocamlc. This enforces the specific flags used for the LLVM
build to be used for OCaml bindings as well, notably -O and -march
flags.

This also solves the issue of the user being unable to force specific
flags for OCaml bindings builds. Gentoo needs this to enforce -DNDEBUG
consistently between the LLVM build and the split OCaml bindings build.

Differential Revision: https://reviews.llvm.org/D35898

llvm-svn: 309320
2017-07-27 21:13:19 +00:00
Eric Beckmann d8bac0fa3f Add test to reject merging of empty manifest.
Reviewers: ruiu, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35954

llvm-svn: 309317
2017-07-27 19:58:12 +00:00
Florian Hahn e3583bdf91 [ARM] Add use-misched feature, to enable the MachineScheduler.
Summary:
This change makes it easier to experiment with the MachineScheduler in
the ARM backend and also makes it very explicit which CPUs use the
MachineScheduler (currently only swift and cyclone).



Reviewers: MatzeB, t.p.northover, javed.absar

Reviewed By: MatzeB

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35935

llvm-svn: 309316
2017-07-27 19:56:44 +00:00
Dinar Temirbulatov 636ac1b6da Change prefix in vector-shuffle-combining-avx.patch to reduce test size.
llvm-svn: 309315
2017-07-27 19:47:35 +00:00
whitequark 9e8197ac8f [MergeFunctions] Remove alias support.
The alias support was dead code since 2011. It was last touched
in r124182, where it was reintroduced after being removed
in r110434, and since then it was gated behind a HasGlobalAliases
flag that was permanently stuck as `false`.

It is also broken. I'm not sure if it bitrotted or was just broken
in the first place because it appears to have never been tested,
but the following IR results in a crash:

    define internal i32 @a(i32 %a, i32 %b) unnamed_addr {
      %c = add i32 %a, %b
      %d = xor i32 %a, %c
      ret i32 %c
    }

    define internal i32 @b(i32 %a, i32 %b) unnamed_addr {
      %c = add i32 %a, %b
      %d = xor i32 %a, %c
      ret i32 %c
    }

It seems safe to remove buggy untested code that no one cared about
for seven years.

Differential Revision: https://reviews.llvm.org/D34802

llvm-svn: 309313
2017-07-27 19:36:13 +00:00
Brian Gesiak 8c44b033c1 [lit] Fix TestRunner unit test on Windows
Summary:
Normally Python converts all newline characters, Windows or Unix,
to Unix newlines when opening a file. However, lit opens files in
binary mode, which does not perform this conversion. As a result,
trailing Windows newlines are not stripped from test input, which
caused a failure in the TestRunner unit test:

```
FAIL: test_custom (__main__.TestIntegratedTestKeywordParser)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\bgesiak\src\llvm\llvm\utils\lit\tests\unit\TestRunner.py", line 109, in test_custom
    self.assertItemsEqual(value, ['a', 'b', 'c'])
AssertionError: Element counts were not equal:
First has 1, Second has 0: 'c\r'
First has 0, Second has 1:  'c'
```

Fix the discrepancy in behavior across the two platforms by
manually stripping Windows newlines before yielding each line in
the test file.

Reviewers: echristo, beanz, ddunbar, delcypher, rnk

Reviewed By: rnk

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D27746

llvm-svn: 309312
2017-07-27 19:27:10 +00:00
Brian Gesiak f0850e3ddb Un-revert "Teach the CMake build system to run lit's test suite. These can be run"
Summary:
Depends on https://reviews.llvm.org/D35879.

This reverts rL257268, which in turn was a revert of rL257221.
https://reviews.llvm.org/D35879 marks the tests in the lit test suite
that fail on Windows as XFAIL, which should allow these tests to pass
on Windows-based buildbots.

Reviewers: delcypher, beanz, mgorny, jroelofs, rnk

Reviewed By: mgorny

Subscribers: rnk, ddunbar, george.karpenkov, llvm-commits

Differential Revision: https://reviews.llvm.org/D35880

llvm-svn: 309310
2017-07-27 19:18:35 +00:00
Davide Italiano 82c7d3768d [FunctionImport] Prefer isa<> to dyn_cast<> as the value is not used.
This change makes GCC7 happy again.

llvm-svn: 309305
2017-07-27 18:38:09 +00:00
Hiroshi Yamauchi 60855214c2 [InstCombine] Simplify pointer difference subtractions (GEP-GEP) where GEPs have other uses and one non-constant index
Summary:
Pointer difference simplifications currently happen only if input GEPs don't have other uses or their indexes are all constants, to avoid duplicating indexing arithmetic.

This patch enables cases with exactly one non-constant index among input GEPs to happen where there is no duplicated arithmetic or code size increase even if input GEPs have other uses.

For example, this patch allows "(&A[42][i]-&A[42][0])" --> "i", which didn't happen previously, if the input GEP(s) have other uses.

Reviewers: sanjoy, bkramer

Reviewed By: sanjoy

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D35499

llvm-svn: 309304
2017-07-27 18:27:11 +00:00
Reid Kleckner eacdf04fdd [PDB] Write public symbol records and the publics hash table
Summary:
MSVC link.exe records all external symbol names in the publics stream.
It provides similar functionality to an ELF .symtab.

Reviewers: zturner, ruiu

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35871

llvm-svn: 309303
2017-07-27 18:25:59 +00:00
Simon Pilgrim ac84850ea6 [SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960)
Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them.

llvm-svn: 309302
2017-07-27 18:15:54 +00:00
Dinar Temirbulatov aead31a36f [X86] SET0 to use XMM registers where possible PR26018 PR32862
Differential Revision: https://reviews.llvm.org/D35839

llvm-svn: 309298
2017-07-27 17:47:01 +00:00
Adam Nemet a67dfe3b04 Relax the matching in these tests
Looks like the template arguments are displayed differently depending on the
host compiler(?).  E.g.:

InnerAnalysisManagerProxy<CGSCCAnalysisManager
InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::LazyCallGraph::SCC, ...

Fix fallout after r309294

llvm-svn: 309297
2017-07-27 17:45:02 +00:00
Adam Nemet 0d8b5d6f69 [ICP] Migrate to OptimizationRemarkEmitter
This is a module pass so for the old PM, we can't use ORE, the function
analysis pass.  Instead ORE is created on the fly.

A few notes:

- isPromotionLegal is folded in the caller since we want to emit the Function
in the remark but we can only do that if the symbol table look-up succeeded.

- There was good test coverage for remarks in this pass.

- promoteIndirectCall uses ORE conditionally since it's also used from
SampleProfile which does not use ORE yet.

Fixes PR33792.

Differential Revision: https://reviews.llvm.org/D35929

llvm-svn: 309294
2017-07-27 16:54:15 +00:00
Adam Nemet 6374331a8c [OptRemark] Allow streaming of 64-bit integers
llvm-svn: 309293
2017-07-27 16:54:13 +00:00
Brian Gesiak d256538b3a [lit] Fix order of checks in shtest-shell.py test
Summary:
An expectation in `utils/lit/tests/Inputs/shtest-shell/redirects.txt`
expects that first a string printed to stdout is seen, and then a
string printed to stderr. Add `flush()` calls to ensure that stdout is
printed before stderr, as expected.

Reviewers: rnk, mgorny, jroelofs

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35947

llvm-svn: 309292
2017-07-27 16:50:40 +00:00
Daniel Neilson 2574d7cbf6 All libcalls should be considered to be GC-leaf functions.
Summary:
It is possible for some passes to materialize a call to a libcall (ex: ldexp, exp2, etc),
but these passes will not mark the call as a gc-leaf-function. All libcalls are
actually gc-leaf-functions, so we change llvm::callsGCLeafFunction() to tell us that
available libcalls are equivalent to gc-leaf-function calls.

Reviewers: sanjoy, anna, reames

Reviewed By: anna

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35840

llvm-svn: 309291
2017-07-27 16:49:39 +00:00
Florian Hahn 67ddd1d08f [TargetParser] Use enum classes for various ARM kind enums.
Summary:
Using c++11 enum classes ensures that only valid enum values are used
for ArchKind, ProfileKind, VersionKind and ISAKind. This removes the
need for checks that the provided values map to a proper enum value,
allows us to get rid of AK_LAST and prevents comparing values from
different enums. It also removes a bunch of static_cast
from unsigned to enum values and vice versa, at the cost of introducing
static casts to access AArch64ARCHNames and ARMARCHNames by ArchKind.

FPUKind and ArchExtKind are the only remaining old-style enum in
TargetParser.h. I think it's beneficial to keep ArchExtKind as old-style
enum, but FPUKind can be converted too, but this patch is quite big, so
could do this in a follow-up patch. I could also split this patch up a
bit, if people would prefer that.

Reviewers: rengolin, javed.absar, chandlerc, rovka

Reviewed By: rovka

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35882

llvm-svn: 309287
2017-07-27 16:27:56 +00:00
Alexey Bataev 07b96e8e96 [SLP] Outline code for the check that instruction users are part of
vectorization tree, NFC.

llvm-svn: 309284
2017-07-27 15:48:44 +00:00
Simon Pilgrim 64a795c5f5 [SelectionDAG] Avoid repeated calls to getNumOperands in for loop. NFCI.
llvm-svn: 309283
2017-07-27 15:42:21 +00:00
David Blaikie 72c0b1cc1b Fix assert from r309278
llvm-svn: 309281
2017-07-27 15:28:10 +00:00
Adrian Prantl 960e7663f3 remove redundant check
llvm-svn: 309280
2017-07-27 15:24:20 +00:00
David Blaikie 2f0cc477ab ThinLTO: Don't import aliases of any kind (even linkonce_odr)
Summary:
Until a more advanced version of importing can be implemented for
aliases (one that imports an alias as an available_externally definition
of the aliasee), skip the narrow subset of cases that was possible but
came at a cost: aliases of linkonce_odr functions could be imported
because the linkonce_odr function could be safely duplicated from the
source module. This came/comes at the cost of not being able to 'home'
imported linkonce functions (they had to be emitted linkonce_odr in all
the destination modules (even if they weren't used by an alias) rather
than as available_externally - causing extra object size).

Tangentially, this also was the only reason ThinLTO would emit multiple
CUs in to the resulting DWARF - which happens to be a problem for
Fission (there's a fix for this in GDB but not released yet, etc).
(actually it's not the only reason - but I'm sending a patch to fix the
other reason shortly)

There's no reason to believe this particularly narrow alias importing
was especially/meaningfully important, only that it was /possible/ to
implement in this way. When a more general solution is done, it should
still satisfy the DWARF concerns above, since the import will still be
available_externally, and thus not create extra CUs.

Since now all aliases are treated the same, I removed/simplified some
test cases since they were testing corner cases where there are no
longer any corners.

Reviewers: tejohnson, mehdi_amini

Differential Revision: https://reviews.llvm.org/D35875

llvm-svn: 309278
2017-07-27 15:09:06 +00:00
Simon Pilgrim 0d543b5921 [SelectionDAG] Tidyup mask creation. NFCI.
Assign all concat elements to UNDEF and then just replace the first element, instead of copying everything individually.

llvm-svn: 309277
2017-07-27 15:08:53 +00:00
Florian Hahn db479524dd [ARM] Mark labels in skipAlignedDPRCS2Spills as fallthrough (NFC).
The comment at the top of the switch statement indicates that the
fall-through behavior is intentional. By using LLVM_FALLTHROUGH,
-Wimplicit-fallthrough are silenced, which is enabled by default in GCC
7.

llvm-svn: 309272
2017-07-27 14:37:17 +00:00
Andrew V. Tischenko e255526d0b Added cost of ZEROALL and ZEROUPPER instrs in btver2 cpu.
Differential Revision https://reviews.llvm.org/D35834

llvm-svn: 309269
2017-07-27 13:12:08 +00:00
Evgeny Astigeevich 61c1bd5abc [InlineCost, NFC] Change CallAnalyzer::isGEPFree to use TTI::getUserCost instead of TTI::getGEPCost
Currently CallAnalyzer::isGEPFree uses TTI::getGEPCost to check if GEP is free.
TTI::getGEPCost cannot handle cases when GEPs participate in Def-Use dependencies
(see https://reviews.llvm.org/D31186 for example).
There is TTI::getUserCost which can calculate the cost more accurately by
taking dependencies into account.

Differential Revision: https://reviews.llvm.org/D33685

llvm-svn: 309268
2017-07-27 12:49:27 +00:00
Daniel Sanders cbbbfe4033 [globalisel][tablegen] Ensure MatchTable's are compile-time constants with constexpr. NFC.
This should prevent any re-occurence of the problem where the table was
initialized at run-time.

llvm-svn: 309267
2017-07-27 12:47:31 +00:00
Simon Pilgrim 31f5402711 [X86][AVX] Regenerate shuffle tests with broadcast comments.
llvm-svn: 309266
2017-07-27 12:32:45 +00:00
Daniel Sanders 8e82af2be6 Re-commit: r309094 [globalisel][tablegen] Fuse the generated tables together.
Summary:
Now that we have control flow in place, fuse the per-rule tables into a
single table. This is a compile-time saving at this point. However, this will
also enable the optimization of a table so that similar instructions can be
tested together, reducing the time spent on the matching the code.

This is NFC in terms of externally visible behaviour but some internals have
changed slightly. State.MIs is no longer reset between each rule that is
attempted because it's not necessary to do so. As a consequence of this the
restriction on the order that instructions are added to State.MIs has been
relaxed to only affect recorded instructions that require new elements to be
added to the vector. GIM_RecordInsn can now write to any element from 1 to
State.MIs.size() instead of just State.MIs.size().

The compile-time regressions from the last commit were caused by the ARM target
including a non-const variable (zero_reg) in the table and therefore generating
an initializer for it. That variable is now const.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35681

llvm-svn: 309264
2017-07-27 11:03:45 +00:00
Simon Pilgrim 804cbd61e6 [X86] Adding test cases for LEA factorization (PR32755 / D35014)
Differential Revision: https://reviews.llvm.org/D35886

llvm-svn: 309262
2017-07-27 10:36:09 +00:00
Simon Pilgrim afc1ac2735 [X86] Tidyup MaskedLoad/Store mask creation. NFCI.
Assign all concat elements to zero and then just replace the first element, instead of setting them all to null and copying everything in.

llvm-svn: 309261
2017-07-27 10:29:04 +00:00
Mohammed Agabaria cef53dcb6f [TTI] fixing a bug in the isLegalMaskedScatter API
isLegalMaskedScatter called the Gather version which is a bug.
use test case is provided within the patch of AVX2 gathers at: https://reviews.llvm.org/D35772

Differential Revision: https://reviews.llvm.org/D35786

llvm-svn: 309260
2017-07-27 10:28:16 +00:00
Hiroshi Inoue 967dc58ac1 [PowerPC] enable optimizeCompareInstr for branch with static branch hint
In optimizeCompareInstr, a compare instruction is eliminated by using a record form instruction if possible.
If the branch instruction that uses the result of the compare has a static branch hint, the optimization does not happen.
This patch makes this optimization happen regardless of the branch hint by splitting branch hint and branch condition before checking the predicate to identify the possible optimizations.

Differential Revision: https://reviews.llvm.org/D35801

llvm-svn: 309255
2017-07-27 08:14:48 +00:00
Petr Hosek 88ac9a8c07 Revert "Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started""
This change is failing tests on Windows bots due to permissions.

This reverts commit r309249.

llvm-svn: 309251
2017-07-27 06:02:05 +00:00
Petr Hosek 27bcf6a680 Reland "[LLVM][llvm-objcopy] Added basic plumbing to get things started"
As discussed on llvm-dev I've implemented the first basic steps towards
llvm-objcopy/llvm-objtool (name pending).

This change adds the ability to copy (without modification) 64-bit
little endian ELF executables that have SHT_PROGBITS, SHT_NOBITS,
SHT_NULL and SHT_STRTAB sections.

Patch by Jake Ehrlich

Differential Revision: https://reviews.llvm.org/D33964

llvm-svn: 309249
2017-07-27 04:35:30 +00:00
Craig Topper 4eda7561b3 [X86] Improve the unknown stepping support for Intel CPUs in getHostCPUName
This patch improves our guessing of unknown Intel CPUs to support Goldmont and skylake-avx512.

Differential Revision: https://reviews.llvm.org/D35161

llvm-svn: 309246
2017-07-27 03:26:52 +00:00
Aditya Nandakumar 69a9971ade [GISel]: Missed passing in a parameter to addUsesFromArgs
llvm-svn: 309243
2017-07-27 02:15:34 +00:00
Eric Beckmann 11dcf62cb9 Remove check for i686.
libxml2 is supported for 32 bit, so our build system should be checking
the target rather than native os when choosing shared libs.

llvm-svn: 309242
2017-07-27 01:16:19 +00:00
Eric Beckmann f7e84c3ff2 Re-enable libxml2 tests.
llvm-svn: 309241
2017-07-27 01:11:53 +00:00
Spyridoula Gravani 73e1796da2 [DWARF] Minor code style modification, no functionality change.
llvm-svn: 309240
2017-07-27 00:59:33 +00:00
David Blaikie 2195e13676 DebugInfo: Ensure imported entities at the top level of an inlined function don't cause degenerate concrete definitions
Local imported entities at the top level of a subprogram were being
handled differently from those in nested scopes - that different
handling would cause pseudo concrete out-of-line definitions to be
created (but without any of their attributes, nor an abstract_origin) in
the case where there was no real concrete definition.

These local imported entities also only appeared in the concrete
definition where those imported entities in nested scopes appear in all
cases (abstract, concrete, and inlined). This change at least makes top
level case handle the same as the others - though there's a FIXME to
improve this to /only/ emit them into the abstract origin (though this
requires more plumbing - like the abstract subprogram and variable
handling that must defer population until the end of the unit to
discover if there is an abstract origin, or only a standalone concrete
definition).

llvm-svn: 309237
2017-07-27 00:06:53 +00:00
Eugene Zelenko 569932d1e6 [Hexagon] Fix expensive checks build bot broken in r309230.
llvm-svn: 309236
2017-07-26 23:56:29 +00:00
Petr Hosek b88afb3fbd [CMake] Disable -Werror for CMake checks
-Werror may cause some of the CMake checks to fail, so we disable
it even if it's enabled for the build itself.

Differential Revision: https://reviews.llvm.org/D35924

llvm-svn: 309235
2017-07-26 23:49:18 +00:00
Eugene Zelenko efd3d5887b [Hexagon] Partially revert r309230 which caused some build bots failures.
llvm-svn: 309233
2017-07-26 23:45:28 +00:00
Eugene Zelenko e4fc6ee790 [Hexagon] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 309230
2017-07-26 23:20:35 +00:00
Eric Beckmann 08e38d6b3d See if disabling libxml tests will pass the i686 bot.
llvm-svn: 309229
2017-07-26 23:15:44 +00:00
Reid Kleckner c9e700042b [lit] Fix race between shtest-shell and max-failures tests
Previously these tests would use the same Output directory leading to
flaky non-deterministic failures.

llvm-svn: 309227
2017-07-26 22:57:32 +00:00
Reid Kleckner 6dac9134f9 [lit] Fix shtest-shell and max-failures lit tests on Windows
Rewrite the write-to-stderr.sh and write-to-stdout-and-stderr.sh shell
scripts as python scripts and call python on them.

Fixes PR33940

llvm-svn: 309200
2017-07-26 22:21:25 +00:00
Reid Kleckner fa83d5a398 [lit] Fix shtest-output-printing.py on Windows by matching either / or \\
Fixes PR33938

llvm-svn: 309198
2017-07-26 22:11:30 +00:00
Reid Kleckner f63d4d121b [lit] Fix discovery.py on Windows by matching backslashes when necessary
Fixes PR33932

llvm-svn: 309194
2017-07-26 22:00:38 +00:00
Hiroshi Yamauchi 0445e31c88 Fix a comment (test commit).
llvm-svn: 309192
2017-07-26 21:54:43 +00:00
Reid Kleckner cb98c14f89 [lit] Un-XFAIL selecting.py test on Windows
This passes locally for me, which fails the overall lit test suite. I
can't debug a passing test, but I will try to help debug the test when
we get some failing logs.

llvm-svn: 309190
2017-07-26 21:48:41 +00:00
Stanislav Mekhanoshin 3197eb6981 [AMDGPU] Optimize SI_IF lowering for simple if regions
Currently SI_IF results in a s_and_saveexec_b64 followed by s_xor_b64.
The xor is used to extract only the changed bits. In case of a simple
if region where the only use of that value is in the SI_END_CF to
restore the old exec mask, we can omit the xor and perform an or of
the exec mask with the original exec value saved by the
s_and_saveexec_b64.

Differential Revision: https://reviews.llvm.org/D35861

llvm-svn: 309185
2017-07-26 21:29:15 +00:00
Evandro Menezes b3ed4bcb8f [ARM] Minor cosmetic edits (NFC)
Change the order of a case and the description for Exynos Mx processors.

llvm-svn: 309184
2017-07-26 21:28:20 +00:00
Evandro Menezes d192a8ae7d [AArch64] Adjust the cost model for Exynos M1 and M2
Add the information for the scalar reciprocal square root approximation.

llvm-svn: 309183
2017-07-26 21:28:15 +00:00
Eric Beckmann 1dd3130270 Close if statement in config-ix.cmake while checking for i686 arch.
Reapply "Set a different var for checking I686, because LLVM_NATIVE_ARCH is"

This reverts commit e7400d7cbc2b7539de3aa7a20adc8f4ee0cb7bef.

llvm-svn: 309181
2017-07-26 21:20:24 +00:00
Eric Beckmann ac84c74795 Revert "Set a different var for checking I686, because LLVM_NATIVE_ARCH is"
This reverts commit 38a6db6397364ee91b04afea2cdcb1b5b4d252bf.

llvm-svn: 309179
2017-07-26 21:11:07 +00:00
Wei Ding a126a13bb3 AMDGPU : Widen extending scalar loads to 32-bits.
Differential Revision: http://reviews.llvm.org/D35146

llvm-svn: 309178
2017-07-26 21:07:28 +00:00
Eric Beckmann 92d4dd0da7 Set a different var for checking I686, because LLVM_NATIVE_ARCH is
overwritten.

llvm-svn: 309177
2017-07-26 21:03:55 +00:00
Davide Italiano dae1a15a79 [gold] Relax this tests a little more.
Thanks to Peter for the report!

llvm-svn: 309176
2017-07-26 21:01:57 +00:00
Davide Italiano 853ce87a4d [gold] Relax tests to account for difference in layout across versions.
llvm-svn: 309174
2017-07-26 20:40:33 +00:00
Matt Arsenault 894e53d6ac AMDGPU: Fix using SMRD instructions for argument loads in functions
These are not actually uniform values except in kernels.

llvm-svn: 309172
2017-07-26 20:39:42 +00:00
Jakub Kuderski ca7e6f6ea8 [Dominators] Fix typos. NFC.
llvm-svn: 309170
2017-07-26 20:26:13 +00:00
Eric Beckmann b25bdc27b9 Disable libxml on i686, because it is a 32 bit architecture and
libxml2.so is for 64 bit.

llvm-svn: 309169
2017-07-26 20:22:26 +00:00
Tom Stellard 55038cd1d3 AMDGPU/GlobalISel: Mark 32-bit G_OR as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D35127

llvm-svn: 309165
2017-07-26 20:00:53 +00:00
Aditya Nandakumar e469a6f550 [GISel]: Avoid zero length array when building Instrs that don't have
uses.

Also splitting the buildSources part allows more overloads such as
adding MachineOperands directly in the arguments for buildInstr.

llvm-svn: 309163
2017-07-26 19:58:03 +00:00
Peter Collingbourne 081ffe2ff2 Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This was a use-after-free waiting to happen.

llvm-svn: 309159
2017-07-26 19:15:29 +00:00
Adam Nemet ea06e6e865 Migrate SimplifyLibCalls to new OptimizationRemarkEmitter
Summary:
This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter
API.

In fact, as SimplifyLibCalls is only ever called via InstCombine,
(as far as I can tell) the OptimizationRemarkEmitter is added there,
and then passed through to SimplifyLibCalls later.

I have avoided changing any remark text.

This closes PR33787

Patch by Sam Elliott!

Reviewers: anemet, davide

Reviewed By: anemet

Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D35608

llvm-svn: 309158
2017-07-26 19:03:18 +00:00
Andrew V. Tischenko d1fefa3d7c This patch returns proper value to indicate the case when instruction throughput can't be calculated.
Differential revision https://reviews.llvm.org/D35831

llvm-svn: 309156
2017-07-26 18:55:14 +00:00
Adrian Prantl 833ad37c90 Do a better job at emitting prefrabricated skeleton CUs.
This is a better fix than r308708 for the problem introduced in
r304020. It restores the skeleton CU testcases modified by that commit
to their original form and most importantly ensures that
frontend-generated skeleton CUs (such as used to point to Clang
modules) come after the regular CUs. This broke for DICompileUnit
nodes that don't have any immediate children because they are now
constructed lazily instead of the order in which they are listed in
!llvm.dbg.cu. After this commit we still don't guarantee that order,
but we do guarantee that empty skeletons come last.

Shipping versions of LLDB are very sensitive to the ordering of
CUs. I'll track a fix for LLDB to be more permissive separately.
This fixes a test failure in the LLDB testsuite.

rdar://problem/33357252

llvm-svn: 309154
2017-07-26 18:48:32 +00:00
Eric Beckmann 6ba5c81387 Unlink nodes instead of copying, to avoid memory problems.
llvm-svn: 309151
2017-07-26 18:33:21 +00:00
Jakub Kuderski 40f59b8596 [Dominators] Change Roots type to SmallVector
Summary: We can use the template parameter `IsPostDom` to pick an appropriate SmallVector size to store DomTree roots for dominators and postdominators. Before, the code would always allocate memory with `std::vector`.

Reviewers: dberlin, davide, sanjoy, grosser

Reviewed By: grosser

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35636

llvm-svn: 309148
2017-07-26 18:27:39 +00:00
Jakub Kuderski c271dea0a7 [Dominators] Move root-finding out of DomTreeBase and simplify it
Summary:
This patch moves root-finding logic from DominatorTreeBase to GenericDomTreeConstruction.h.
It makes the behavior simpler and more consistent by always adding a virtual root to PostDominatorTrees.

Reviewers: dberlin, davide, grosser, sanjoy

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35597

llvm-svn: 309146
2017-07-26 18:07:40 +00:00
Reid Kleckner 0ff0eacf32 Un-XFAIL some internal lit tests on Windows, they pass for me locally
llvm-svn: 309144
2017-07-26 18:04:18 +00:00
Eric Beckmann 6638ba2d75 Diffing against a file that is itself used in the test seems to be a bad
idea, because it might get locked down and rendered unopenable.

llvm-svn: 309142
2017-07-26 17:47:44 +00:00
Rafael Espindola e06e4df8be Simplify. NFC.
llvm-svn: 309141
2017-07-26 17:27:27 +00:00
George Karpenkov 2fae2465c1 Fix LIT test breakage
Differential Revision: https://reviews.llvm.org/D35867

llvm-svn: 309140
2017-07-26 17:19:36 +00:00
Simon Pilgrim 66a2eb8c77 [X86][AVX512] Regenerated and cleaned up extension tests.
llvm-svn: 309139
2017-07-26 16:47:00 +00:00
Simon Pilgrim b77cb95744 [X86] Regenerate setcc tests
llvm-svn: 309138
2017-07-26 16:45:57 +00:00
Simon Pilgrim 164160b4f6 [X86][AVX512] Regenerate shuffle tests with broadcast comments.
llvm-svn: 309137
2017-07-26 16:41:18 +00:00
Simon Pilgrim 0a7d9ac766 [X86] Regenerate memset tests
llvm-svn: 309136
2017-07-26 16:39:07 +00:00
Eric Beckmann 3f4fe8f4bd Correctly enable the llvm-mt tests, now that build flags changed.
llvm-svn: 309134
2017-07-26 16:35:44 +00:00
Reid Kleckner 43c2b131d9 Quote '?' in llvm-rc test
Summary:
Bash interperets the '?' character as matching an arbitrary character.
On systems that have a file or directory with exactly one character in
their root directory, '/?' gets reinterpreted into that pathname, which
fails to match the expected Help text for llvm-rc.
This patch quotes the '/?' to avoid that edge case.

Reviewers: mnbvmar, ecbeckmann, rnk

Reviewed By: rnk

Subscribers: dyung, ruiu, llvm-commits

Differential Revision: https://reviews.llvm.org/D35852

llvm-svn: 309133
2017-07-26 16:25:48 +00:00
Florian Hahn 239e4b9301 [Hexagon] Mark raise_relocation_error as NORETURN.
Summary:
This silences a couple of implicit fallthrough warnings with GCC 7.1 in
this file.

Reviewers: colinl, kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35889

llvm-svn: 309129
2017-07-26 16:07:51 +00:00
Dehao Chen 641f387cd0 Update the assertion to meet with the changes in r309121. (NFC)
llvm-svn: 309125
2017-07-26 15:47:00 +00:00
Simon Pilgrim 01ab86e62b [X86] Add combineBT test failure because bits have multiple uses.
llvm-svn: 309124
2017-07-26 15:41:57 +00:00
Brian Gesiak 0787253cb6 [lit] Mark several of lit's tests XFAIL on Windows
Summary:
rL257221 attempted to run lit's own test suite continuously, but that
commit was reverted because lit's test suite does not pass on Windows.
Because lit's tests do not run continuously, they often regress.

In order to un-revert rL257221, mark lit tests that fail as XFAIL for
Windows platforms.

Test Plan:
On a Windows development environment, follow the instructions in
utils/lit/README.txt to run lit's test suite:

```
utils/lit/lit.py \
    --path /path/to/your/llvm/build/bin \
    utils/lit/tests
```

Verify that the test suite is run and a successful exit code is
returned.

Reviewers: mgorny, rnk, delcypher, beanz

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35879

llvm-svn: 309123
2017-07-26 15:10:50 +00:00
Brian Gesiak 5e7c089b8a [lit] Fix type error for parallelism groups
Summary:
Whereas rL299560 and rL309071 call `parallelism_groups.items()`, under the
assumption that `parallelism_groups` is a `dict` type, the default
parameter for that attribute is a `list`. Change the default to a
`dict` for type correctness.

This regression in the unit tests would have been caught if the
unit tests were being run continously. It also would have been caught
if the lit project used a Python type checker such as `mypy`.

Test Plan:
As per the instructions in `utils/lit/README.txt`, run the lit unit
test suite:

```
utils/lit/lit.py \
    --path /path/to/your/llvm/build/bin \
    utils/lit/tests
```

Verify that the test `lit :: unit/TestRunner.py` fails before applying this
patch, but passes once this patch is applied.

Reviewers: mgorny, rnk, rafael

Reviewed By: mgorny

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35878

llvm-svn: 309122
2017-07-26 15:02:05 +00:00
Dehao Chen e90d0153ca Make new PM honor -fdebug-info-for-profiling
Summary: The new PM needs to invoke add-discriminator pass when building with -fdebug-info-for-profiling.

Reviewers: chandlerc, davidxl

Reviewed By: chandlerc

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D35744

llvm-svn: 309121
2017-07-26 15:01:20 +00:00
Brian Gesiak 6d8de5ce63 Revert "[lit] Remove dead code not referenced in the LLVM SVN repo."
Summary:
This reverts rL306623, which removed `FileBasedTest`, an abstract base class,
but did not also remove the usages of that class in the lit unit tests.
The revert fixes four test failures in the lit unit test suite.

Test plan:
As per the instructions in `utils/lit/README.txt`, run the lit unit
test suite:

```
utils/lit/lit.py \
    --path /path/to/your/llvm/build/bin \
    utils/lit/tests
```

Verify that the following tests fail before applying this patch, and
pass once the patch is applied:

```
lit :: test-data.py
lit :: test-output.py
lit :: xunit-output.py
```

In addition, run `check-llvm` to make sure the existing LLVM test suite
executes normally.

Reviewers: george.karpenkov, mgorny, dlj

Reviewed By: mgorny

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35877

llvm-svn: 309120
2017-07-26 14:59:36 +00:00
Nuno Lopes e02fcee536 [docs] change a few code-blocks to llvm from text
llvm-svn: 309117
2017-07-26 14:11:23 +00:00
Stefan Pintilie df0ee9e1b9 [NFC] test commit.
Added a comment to explain how to add a PPCISD node.

llvm-svn: 309114
2017-07-26 13:44:59 +00:00