Commit Graph

95580 Commits

Author SHA1 Message Date
Konstantin Zhuravlyov 836cbff695 [AMDGPU] Choose VMCNT, EXPCNT, LGKMCNT masks and shifts based on the isa version
Differential Revision: https://reviews.llvm.org/D24973

llvm-svn: 282877
2016-09-30 17:01:40 +00:00
Konstantin Zhuravlyov d7bdf24f32 [AMDGPU] Ask subtarget if waitcnt instruction is needed before barrier instruction
Differential Revision: https://reviews.llvm.org/D24985

llvm-svn: 282875
2016-09-30 16:50:36 +00:00
Konstantin Zhuravlyov 4658e5f7b0 [AMDGPU] Do not run scalar optimization passes at "-O0"
Differential Revision: https://reviews.llvm.org/D25055

llvm-svn: 282873
2016-09-30 16:39:24 +00:00
Artur Pilipenko 2af93490fb CVP. Turn marking adds as no wrap on by default (was turned off by 279082)
With 282650 in tree extra no wrap on adds doesn't cause regressions anymore. Reenable the optimzation.

llvm-svn: 282872
2016-09-30 16:20:08 +00:00
Matthew Simpson 7808833e28 [LV] Build all scalar steps for non-uniform induction variables
When building the steps for scalar induction variables, we previously attempted
to determine if all the scalar users of the induction variable were uniform. If
they were, we would only emit the step corresponding to vector lane zero. This
optimization was too aggressive. We generally don't know the entire set of
induction variable users that will be scalar. We have
isScalarAfterVectorization, but this is only a conservative estimate of the
instructions that will be scalarized. Thus, an induction variable may have
scalar users that aren't already known to be scalar. To avoid emitting unused
steps, we can only check that the induction variable is uniform. This should
fix PR30542.

Reference: https://llvm.org/bugs/show_bug.cgi?id=30542
llvm-svn: 282863
2016-09-30 15:13:52 +00:00
Dylan McKay 4a25499b13 [AVR] Add the ELF object file writer
Summary: This adds the ELF32 writer for AVR.

Reviewers: kparzysz

Subscribers: beanz, mgorny

Differential Revision: https://reviews.llvm.org/D25031

llvm-svn: 282856
2016-09-30 14:09:20 +00:00
Dylan McKay 309eba75b1 Revert "[RegAllocGreedy] Attempt to split unspillable live intervals"
It was accidentally committed.

llvm-svn: 282855
2016-09-30 14:05:15 +00:00
Dylan McKay 1a7bd84a92 [AVR] Add the assembly instruction printer
Summary:
This change adds the AVR assembly instruction printer.

No tests are included in this patch. I have left them downstream so we can
add them once `llc` successfully runs (there's very few components left
to upstream until this).

Reviewers: arsenm, kparzysz

Subscribers: wdng, beanz, mgorny

Differential Revision: https://reviews.llvm.org/D25028

llvm-svn: 282854
2016-09-30 14:01:50 +00:00
Dylan McKay 2a80cc688a [RegAllocGreedy] Attempt to split unspillable live intervals
Summary:
Previously, when allocating unspillable live ranges, we would never
attempt to split. We would always bail out and try last ditch graph
recoloring.

This patch changes this by attempting to split all live intervals before
performing recoloring.

This fixes LLVM bug PR14879.

I can't add test cases for any backends other than AVR because none of
them have small enough register classes to trigger the bug.

Reviewers: qcolombet

Subscribers: MatzeB

Differential Revision: https://reviews.llvm.org/D25070

llvm-svn: 282852
2016-09-30 13:59:20 +00:00
Craig Topper f3e671e020 [AVX-512] Store address operand should be an input operand for the special stack spilling pseudos for XMM16-31 and YMM16-31 without VLX.
llvm-svn: 282843
2016-09-30 05:35:47 +00:00
Craig Topper 1c01cbe9ee [AVX-512] Add the special stack spilling pseudos for XMM16-31 and YMM16-31 without VLX to teh isFrameLoadOpcode and isFrameStoreOpcode.
llvm-svn: 282842
2016-09-30 05:35:45 +00:00
Craig Topper 3f37a4180b Revert r282835 "[AVX-512] Always use the full 32 register vector classes for addRegisterClass regardless of whether AVX512/VLX is enabled or not."
Turns out this doesn't pass verify-machineinstrs.

llvm-svn: 282841
2016-09-30 05:35:42 +00:00
Kostya Serebryany cad612a472 [libfuzzer] test for c-ares CVE-2016-5180
llvm-svn: 282839
2016-09-30 05:15:45 +00:00
Adam Nemet f744ad78e9 [LDist] Port to new streaming API for opt remarks
llvm-svn: 282838
2016-09-30 04:56:25 +00:00
Craig Topper de03ff7063 [X86] Add AVX-512 VTs to findRepresentativeClass as well as v16i16 which was also missing. Change register class to include the extra 16 AVX512 registers.
I'm not completely sure what this method does or why all the 256-bit VTs returned VR128RegClass when the comments on the method definiton say it should return the largest super register class. I just figured AVX-512 should be similar.

llvm-svn: 282836
2016-09-30 04:31:37 +00:00
Craig Topper bc6e97b8f4 [AVX-512] Always use the full 32 register vector classes for addRegisterClass regardless of whether AVX512/VLX is enabled or not.
If AVX512 is disabled, the registers should already be marked reserved. Pattern predicates and register classes on instructions should take care of most of the rest. Loads/stores and physical register copies for XMM16-31 and YMM16-31 without VLX have already been taken care of.

I'm a little unclear why this changed the register allocation of the SSE2 run of the sad.ll test, but the registers selected appear to be valid after this change.

llvm-svn: 282835
2016-09-30 04:31:33 +00:00
Adam Nemet f57cc62abf [LoopUnroll] Port to the new streaming interface for opt remarks.
llvm-svn: 282834
2016-09-30 03:44:16 +00:00
Piotr Padlewski d28694739c [thinlto] Don't decay threshold for hot callsites
Summary:
We don't want to decay hot callsites to import chains of hot
callsites. The same mechanism is used in LIPO.

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24976

llvm-svn: 282833
2016-09-30 03:01:17 +00:00
Matt Arsenault 5d8eb25e78 AMDGPU: Use unsigned compare for eq/ne
For some reason there are both of these available, except
for scalar 64-bit compares which only has u64. I'm not sure
why there are both (I'm guessing it's for the one bit inputs we
don't use), but for consistency always using the
unsigned one.

llvm-svn: 282832
2016-09-30 01:50:20 +00:00
Kostya Serebryany b3949ef885 [libFuzzer] remove the code for -print_pcs=1 with the old coverage. It still works with the new one (trace-pc-guard)
llvm-svn: 282831
2016-09-30 01:24:57 +00:00
Kostya Serebryany 2c55613a08 [libFuzzer] more the feature set to InputCorpus; on feature update, change the feature counter of the old best input
llvm-svn: 282829
2016-09-30 01:19:56 +00:00
Adam Nemet fce0178847 [LoopDataPrefetch] Port to new streaming API for opt remarks
llvm-svn: 282826
2016-09-30 00:42:43 +00:00
Justin Lebar 9091055efa Move UTF functions into namespace llvm.
Summary:
This lets people link against LLVM and their own version of the UTF
library.

I determined this only affects llvm, clang, lld, and lldb by running

$ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 1 -d '/' | sort | uniq
  clang
  lld
  lldb
  llvm

Tested with

  ninja lldb
  ninja check-clang check-llvm check-lld

(ninja check-lldb doesn't complete for me with or without this patch.)

Reviewers: rnk

Subscribers: klimek, beanz, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D24996

llvm-svn: 282822
2016-09-30 00:38:45 +00:00
Adam Nemet 951c6b1955 [LV] Port the remarks in processLoop to the new streaming API
This completes LV.

llvm-svn: 282821
2016-09-30 00:29:30 +00:00
Adam Nemet 4fd9c42279 [LV] Port the last opt remark in Hints to the new streaming interface
llvm-svn: 282820
2016-09-30 00:29:25 +00:00
Reid Kleckner 147f91c88e [X86] Don't preserve Win64 SSE CSRs when SSE is disabled
Code that doesn't use floating point and doesn't use SSE (kernel code)
shouldn't save and restore SSE registers.

Fixes PR30503

llvm-svn: 282819
2016-09-30 00:17:49 +00:00
Quentin Colombet 4b36e0c409 [AArch64][RegisterBankInfo] Use static mapping for 3-operands instrs.
This uses a TableGen'ed like structure for all 3-operands instrs.
The output of the RegBankSelect pass should be identical but the
RegisterBankInfo will do less dynamic allocations.

llvm-svn: 282817
2016-09-30 00:10:00 +00:00
Quentin Colombet fdd303afe2 [AArch64][RegisterBankInfo] Add static value mapping for 3-op instrs.
This is the kind of input TableGen should generate at some point.
NFC.

llvm-svn: 282816
2016-09-30 00:09:58 +00:00
Quentin Colombet eb8d3da9a0 [AArch64][RegisterBankInfo] Check the statically created ValueMapping.
Make sure that the ValueMappings contain the value we expect at the
indices we expect.

NFC.

llvm-svn: 282815
2016-09-30 00:09:43 +00:00
Adam Nemet 877ccee8cc [LAA, LV] Port to new streaming interface for opt remarks. Update LV
(Recommit after making sure IsVerbose gets properly initialized in
DiagnosticInfoOptimizationBase.  See previous commit that takes care of
this.)

OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

llvm-svn: 282813
2016-09-30 00:01:30 +00:00
Sanjay Patel 453ceff261 [InstCombine] fix function names; NFC
Also, make foldSelectExtConst() a member of InstCombiner, remove
unnecessary parameters from its interface, and group visitSelectInst
helpers together in the header file.

llvm-svn: 282796
2016-09-29 22:18:30 +00:00
Joerg Sonnenberger 8472f847ca Make HAVE_DECL_ARC4RANDOM always defined. Sort the entry correctly.
llvm-svn: 282768
2016-09-29 21:10:38 +00:00
Joerg Sonnenberger 5c50539503 HAVE_UNWIND_BACKTRACE -> HAVE__UNWIND_BACKTRACE
Check for existance and not truth value.

llvm-svn: 282767
2016-09-29 21:07:57 +00:00
Kevin Enderby 4f229d867b Next set of additional error checks for invalid Mach-O files for the
load command that uses the MachO::entry_point_command type
but not used in llvm libObject code but used in llvm tool code.

This includes just the LC_MAIN load command.

llvm-svn: 282766
2016-09-29 21:07:29 +00:00
Adrian McCarthy d1185fc081 Clamp version number in S_COMPILE3 to avoid overflowing 16-bit field.
llvm-svn: 282761
2016-09-29 20:28:25 +00:00
Adam Nemet 556a06b1ee Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV"
This reverts commit r282758.

There are some clang failures I haven't seen.

llvm-svn: 282759
2016-09-29 20:17:37 +00:00
Adam Nemet c1d21817d1 [LAA, LV] Port to new streaming interface for opt remarks. Update LV
OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

llvm-svn: 282758
2016-09-29 20:12:18 +00:00
Quentin Colombet 1b01677e61 [RegisterBankInfo] Change the default mapping for Copy and PHI.
Instead of producing a mapping for all the operands, we only generate a
mapping for the definition. Indeed, the other operands are not
constrained by the instruction and thus, we should leave the choice to
the actual definition to do the right thing.

In pratice this is almost NFC, but with one advantage. We will have only
one instance of OperandsMapping for each copy and phi that map to one
register bank instead of one different instance for each different
number of operands for each copy and phi.

llvm-svn: 282756
2016-09-29 19:51:46 +00:00
Douglas Katzman 93a9a8397a Generalize ArgList::AddAllArgs more
llvm-svn: 282755
2016-09-29 19:47:58 +00:00
Adam Nemet 3628282a77 [LV] Port OptimizationRemarkAnalysisFPCommute and
OptimizationRemarkAnalysisAliasing to new streaming API for opt remarks

llvm-svn: 282742
2016-09-29 18:04:47 +00:00
Adam Nemet 6e1edd5d1f [LV] Convert processLoop to new streaming API for opt remarks
llvm-svn: 282740
2016-09-29 17:55:13 +00:00
Reid Kleckner e45b2c7d8e [codeview] Use character types for all byte-sized integer types
The VS debugger doesn't appear to understand the 0x68 or 0x69 type
indices, which were probably intended for use on a platform where a C
'int' is 8 bits. So, use the character types instead. Clang was already
using the character types because '[u]int8_t' is usually defined in
terms of 'char'.

See the Rust issue for screenshots of what VS does:
https://github.com/rust-lang/rust/issues/36646

Fixes PR30552

llvm-svn: 282739
2016-09-29 17:55:01 +00:00
Sanjay Patel ccc2927b69 fix formatting; NFC
llvm-svn: 282737
2016-09-29 17:48:19 +00:00
Kevin Enderby 245be3ed2a Next set of additional error checks for invalid Mach-O files for the
load command that uses the Mach::source_version_command type
but not used in llvm libObject code but used in llvm tool code.

This includes just the LC_SOURCE_VERSION load command.

llvm-svn: 282736
2016-09-29 17:45:23 +00:00
Kostya Serebryany a9b0dd0e51 [sanitizer-coverage/libFuzzer] make the guards for trace-pc 32-bit; create one array of guards per function, instead of one guard per BB. reorganize the code so that trace-pc-guard does not create unneeded globals
llvm-svn: 282735
2016-09-29 17:43:24 +00:00
Piotr Padlewski ba72b95f7b [thinlto] Add cold-callsite import heuristic
Summary:
Not tunned up heuristic, but with this small heuristic there is about
+0.10% improvement on SPEC 2006

Reviewers: tejohnson, mehdi_amini, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24940

llvm-svn: 282733
2016-09-29 17:32:07 +00:00
Douglas Katzman 3ace13adfa [X86] Avoid "unused" warnings if no asserts
llvm-svn: 282732
2016-09-29 17:26:12 +00:00
Adam Nemet eb0ba8d50f [LV] Move static createMissedAnalysis from anonymous to global namespace
This is an attempt to fix a windows bot.

llvm-svn: 282730
2016-09-29 17:25:00 +00:00
Adam Nemet 0bfa441701 [LV] Convert CostModel to use the new streaming opt remark API
Here we can already remove the member function emitAnalysis.

llvm-svn: 282729
2016-09-29 17:15:48 +00:00
Adam Nemet 70757dd95a [LV] Split most of createMissedAnalysis into a static function. NFC
This will be shared between Legality and CostModel.

llvm-svn: 282728
2016-09-29 17:05:35 +00:00
Adam Nemet 9988ca3db3 [LV] Convert all but one opt remark in Legality to new streaming interface
The last one remaining after which emitAnalysis can be removed is when
we convert the LAA's report to a vectorization report.  This requires
converting LAA to the new interface first.

llvm-svn: 282726
2016-09-29 16:49:42 +00:00
Adam Nemet 9a1a5ef212 [LV] Convert emitRemark to new opt remark streaming interface
Also renamed the function to emitRemarkWithHints to better reflect what
the function actually does.

llvm-svn: 282723
2016-09-29 16:23:12 +00:00
Kostya Serebryany a9a135b4f5 [libFuzzer] initialize ValueBitMap::NumBits
llvm-svn: 282721
2016-09-29 15:51:28 +00:00
Simon Pilgrim 97a4820ccd [X86][SSE] Added common helper for shuffle mask constant pool decodes.
The shuffle mask decodes have a large amount of repeated code extracting/splitting mask values from Constant data.

This patch pulls all of this duplicated code into a single helper function to identify undef elements and combine/split constant integer data into the requested shuffle mask elements.

Updated PSHUFB/VPERMIL/VPERMIL2/VPPERM decoders to use it (VPERMV/VPERMV3 could be converted as well in the future).

llvm-svn: 282720
2016-09-29 15:25:48 +00:00
Volkan Keles 6ec2ac0416 Test commit. NFC.
llvm-svn: 282717
2016-09-29 13:04:37 +00:00
Dylan McKay 7e91886a3f Revert "[AVR] Add instruction selection lowering code"
I accidentally comitted it.

llvm-svn: 282712
2016-09-29 12:49:18 +00:00
Dylan McKay b79c01a423 [AVR] Add instruction selection lowering code
Summary: This adds AVRISelLowering.cpp

Reviewers: kparzysz, arsenm

Subscribers: wdng, beanz, mgorny

Differential Revision: https://reviews.llvm.org/D25034

llvm-svn: 282711
2016-09-29 12:44:38 +00:00
Craig Topper d875d6b9b4 [AVX-512] Support spills of XMM16-31 and YMM16-31 when VLX isn't available.
This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT(store) or VBROADCAST(load) will be used.

Fixes one of the cases from PR29112.

llvm-svn: 282690
2016-09-29 06:07:09 +00:00
Craig Topper 7eb0e7ce1f [AVX-512] Replicate pattern from AVX to select VMOVDDUP for (v2f64 (X86VBroadcast f64:)). Add AVX512VL to command line of existing AVX2 test that hits this condition.
llvm-svn: 282688
2016-09-29 05:54:43 +00:00
Craig Topper e7f2611160 [X86] Add EVEX encoded VBROADCASTSS/SD and VPBROADCASTD/Q to execution domain fixing table.
llvm-svn: 282687
2016-09-29 05:54:39 +00:00
Craig Topper cb3ae5a03d [X86] Remove AddedComplexity adjustments that don't seem to be needed.
llvm-svn: 282686
2016-09-29 05:54:34 +00:00
Craig Topper 816a1d7783 [X86] Add VBROADCASTF128/VBROADCASTI128 to execution domain fixing tables.
llvm-svn: 282684
2016-09-29 05:54:28 +00:00
Peter Collingbourne f75609e015 Add explanatory comment.
llvm-svn: 282678
2016-09-29 03:29:28 +00:00
Eric Christopher 20ac943748 Remove an unnecessary duplicate initialization of TLOF from the Mips
AsmPrinter. This was reinitializing the Mangler after we moved the
Mangler down to TLOF and causing us to have two different unnamed
global values accessed with the same name.

This should fix the problems on the ubsan tests here:
http://lab.llvm.org:8011/builders/clang-cmake-mips/builds/15307

llvm-svn: 282675
2016-09-29 02:03:52 +00:00
Eric Christopher 1c3bb79864 Remove the default constructor and count variable from the Mangler since
we can just use the size of the DenseMap as a unique counter.

llvm-svn: 282674
2016-09-29 02:03:50 +00:00
Eric Christopher b4b75a531e Update comment about initializing TLOF with a pointer at the previous
line or the other commented out place.

llvm-svn: 282673
2016-09-29 02:03:47 +00:00
Eric Christopher 8e94895555 Tidy spelling and grammar.
llvm-svn: 282672
2016-09-29 02:03:44 +00:00
Matthias Braun aae7fe99d0 MachineFunction: Add missing newline in debug print()
Should not be a functional but an aesthetic change.

llvm-svn: 282669
2016-09-29 01:47:42 +00:00
Matt Arsenault e6740754f0 AMDGPU: Partially fix control flow at -O0
Fixes to allow spilling all registers at the end of the block
work with exec modifications. Don't emit s_and_saveexec_b64 for
if lowering, and instead emit copies. Mark control flow mask
instructions as terminators to get correct spill code placement
with fast regalloc, and then have a separate optimization pass
form the saveexec.

This should work if SGPRs are spilled to VGPRs, but
will likely fail in the case that an SGPR spills to memory
and no workitem takes a divergent branch.

llvm-svn: 282667
2016-09-29 01:44:16 +00:00
Peter Collingbourne 0d5636e517 LTO: Fix use-after-scope error.
llvm-svn: 282665
2016-09-29 01:28:36 +00:00
Lei Liu 361615cfd0 AArch64: Set shift bit of TLSLE HI12 add instruction
Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model.  This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000.

Reviewers: t.p.northover, peter.smith, rovka

Subscribers: salim.nasser, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24702

llvm-svn: 282661
2016-09-29 01:05:48 +00:00
Evgeny Stupachenko dc8a254663 Wisely choose sext or zext when widening IV.
Summary:
The patch fixes regression caused by two earlier patches D18777 and D18867.

Reviewers: reames, sanjoy

Differential Revision: http://reviews.llvm.org/D24280

From: Li Huang
llvm-svn: 282650
2016-09-28 23:39:39 +00:00
Kevin Enderby 76966bf066 Next set of additional error checks for invalid Mach-O files for the
load command that uses the Mach::rpath_command type
but not used in llvm libObject code but used in llvm tool code.

This includes just the LC_RPATH load command.

llvm-svn: 282649
2016-09-28 23:16:01 +00:00
Quentin Colombet 40cbc27ff3 [RegisterBankInfo] Uniquely generate OperandsMapping.
This is a step toward statically allocate InstructionMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.

This should already improve compile time by getting rid of a bunch of
memmove of SmallVectors.

llvm-svn: 282643
2016-09-28 22:20:49 +00:00
Quentin Colombet 97d2d21d65 [RegisterBankInfo] Rework the APIs of ValueMapping.
This is a preparatory commit for more TableGen-like structure.
NFC

llvm-svn: 282642
2016-09-28 22:20:24 +00:00
Adrian Prantl f8d10ceafe Remove dead code from LiveDebugVariables.cpp (NFC)
LiveDebugVariables doesn't propagate DBG_VALUEs accross basic block
boundaries any more; this functionality was split into LiveDebugValues.
We can thus drop the now dead references to LexicalScopes from LiveDebugVariables.

llvm-svn: 282638
2016-09-28 21:34:23 +00:00
Kevin Enderby 32359dbf6b Next set of additional error checks for invalid Mach-O files for the
other load commands that use the Mach::version_min_command type
but not used in llvm libObject code but used in llvm tool code.

This includes LC_VERSION_MIN_MACOSX, LC_VERSION_MIN_IPHONEOS,
LC_VERSION_MIN_TVOS and LC_VERSION_MIN_WATCHOS load commands.

llvm-svn: 282635
2016-09-28 21:20:45 +00:00
Dehao Chen 5461d8bdb5 Refactor the ProfileSummaryInfo to use doInitialization and doFinalization to handle Module update.
Summary: This refactors the change in r282616

Reviewers: davidxl, eraman, mehdi_amini

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D25041

llvm-svn: 282630
2016-09-28 21:00:58 +00:00
Krzysztof Parzyszek dcb1bcae0b IfConversion: Add implicit uses for redefined regs with live subregisters
Normally, if conversion would add implicit uses for redefined registers,
e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of
R0 are known to be live but not R0 itself, such implicit uses will not be
added, causing prior definitions of such subregisters and R0 itself to
become dead.

llvm-svn: 282626
2016-09-28 20:07:41 +00:00
Konstantin Zhuravlyov e14df4b236 [AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit instructions
Differential Revision: https://reviews.llvm.org/D24125

llvm-svn: 282624
2016-09-28 20:05:39 +00:00
Dehao Chen 10d0abb515 Fix the bug introduced in r282616.
llvm-svn: 282618
2016-09-28 18:54:36 +00:00
Dehao Chen 80c8ebb4d8 Fix the bug when -compile-twice is specified, the PSI will be invalidated.
Summary:
When using llc with -compile-twice, module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks if the module has changed when calling getPSI, if yes, update the module and invalidate the Summary.
The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll

Reviewers: eraman, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24993

llvm-svn: 282616
2016-09-28 18:41:14 +00:00
Artur Pilipenko b6ce6e5dac Don't look through addrspacecast in GetPointerBaseWithConstantOffset
Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value.

This is similar to D13008.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D24729

llvm-svn: 282612
2016-09-28 17:57:16 +00:00
Adrian Prantl 7f5866c227 Teach LiveDebugValues about lexical scopes.
This addresses PR26055 LiveDebugValues is very slow.

Contrary to the old LiveDebugVariables pass LiveDebugValues currently
doesn't look at the lexical scopes before inserting a DBG_VALUE
intrinsic. This means that we often propagate DBG_VALUEs much further
down than necessary. This is especially noticeable in large C++
functions with many inlined method calls that all use the same
"this"-pointer.

For example, in the following code it makes no sense to propagate the
inlined variable a from the first inlined call to f() into any of the
subsequent basic blocks, because the variable will always be out of
scope:

void sink(int a);
void __attribute((always_inline)) f(int a) { sink(a); }
void foo(int i) {
   f(i);
   if (i)
     f(i);
   f(i);
}

This patch reuses the LexicalScopes infrastructure we have for
LiveDebugVariables to take this into account.

The effect on compile time and memory consumption is quite noticeable:
I tested a benchmark that is a large C++ source with an enormous
amount of inlined "this"-pointers that would previously eat >24GiB
(most of them for DBG_VALUE intrinsics) and whose compile time was
dominated by LiveDebugValues. With this patch applied the memory
consumption is 1GiB and 1.7% of the time is spent in LiveDebugValues.

https://reviews.llvm.org/D24994
Thanks to Daniel Berlin and Keith Walker for reviewing!

llvm-svn: 282611
2016-09-28 17:51:14 +00:00
Adrian Prantl 16b2ace0ab Rewrite loops to use range-based for. (NFC)
llvm-svn: 282608
2016-09-28 17:31:17 +00:00
Artem Belevich 3e1211581c [NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions.
These are only available on sm_60+ GPUs.

Differential Revision: https://reviews.llvm.org/D24943

llvm-svn: 282607
2016-09-28 17:25:38 +00:00
Sanjoy Das f0022125e0 [SCEV] Use a SmallPtrSet as a temporary union predicate; NFC
Summary:
Instead of creating and destroying SCEVUnionPredicate instances (which
internally creates and destroys a DenseMap), use temporary SmallPtrSet
instances of remember the set of predicates that will get reified into a
SCEVUnionPredicate.

Reviewers: silviu.baranga, sbaranga

Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D25000

llvm-svn: 282606
2016-09-28 17:14:58 +00:00
Nirav Dave e524f50882 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r282600 due to test failues with MCJIT

llvm-svn: 282604
2016-09-28 16:37:50 +00:00
Dylan McKay 1f69cdb321 [AVR] Rename the builtin calling convention names
'BUILTIN' is clearer than 'RT' in this context.

llvm-svn: 282602
2016-09-28 16:04:40 +00:00
Marina Yatsina 76bfc6670b [x86] Accept 'retn' as an alias to 'ret[lqw]'\'ret' (At&t\Intel)
Implement 'retn' simply by aliasing it to the relevant 'ret' instruction

Commit on behalf of coby

Differential Revision: https://reviews.llvm.org/D24346

llvm-svn: 282601
2016-09-28 15:52:56 +00:00
Nirav Dave e17e055b75 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Simplify Consecutive Merge Store Candidate Search

  Now that address aliasing is much less conservative, push through
  simplified store merging search which only checks for parallel stores
  through the chain subgraph. This is cleaner as the separation of
  non-interfering loads/stores from the store-merging logic.

  Whem merging stores, search up the chain through a single load, and
  finds all possible stores by looking down from through a load and a
  TokenFactor to all stores visited. This improves the quality of the
  output SelectionDAG and generally the output CodeGen (with some
  exceptions).

  Additional Minor Changes:

    1. Finishes removing unused AliasLoad code
    2. Unifies the the chain aggregation in the merged stores across
       code paths
    3. Re-add the Store node to the worklist after calling
       SimplifyDemandedBits.
    4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
       arbitrary, but seemed sufficient to not cause regressions in
       tests.

  This finishes the change Matt Arsenault started in r246307 and
  jyknight's original patch.

  Many tests required some changes as memory operations are now
  reorderable. Some tests relying on the order were changed to use
  volatile memory operations

  Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.lli -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -
      The backend now generates *worse* code due to store merging
      succeeding, as we do do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

    CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll -
      This test appears to work but no longer exhibits the spill
      behavior.

Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

llvm-svn: 282600
2016-09-28 15:50:43 +00:00
Dylan McKay 536239f144 [AVR] Import the LLVM namespace inside AVRMCTargetDesc.cpp
llvm-svn: 282598
2016-09-28 15:35:26 +00:00
Dylan McKay e762094864 [AVR] Add AVRMCTargetDesc.cpp
Summary:
This adds the AVRMCTargetDesc file in tree. It allows creation of the
core classes used in the backend.

Reviewers: arsenm, kparzysz

Subscribers: wdng, beanz, mgorny

Differential Revision: https://reviews.llvm.org/D25023

llvm-svn: 282597
2016-09-28 15:31:12 +00:00
Dylan McKay d6e7fc6d9a [AVR] Update the signature of createAVRAsmBackend
It has been recently changed to also take a MCTargetOptions structure.

llvm-svn: 282594
2016-09-28 14:35:07 +00:00
Dylan McKay f010a2b41a [AVR] Enable the assembly parser
We very recently landed the code. This commit enables the parser.

It also adds a missing include to AVRAsmParser.cpp

llvm-svn: 282593
2016-09-28 14:34:42 +00:00
Sanjay Patel 220a8730fb [InstSimplify] allow or-of-icmps folds with vector splat constants
llvm-svn: 282592
2016-09-28 14:27:21 +00:00
Sanjay Patel 1b312ad42d [InstSimplify] allow and-of-icmps folds with vector splat constants
llvm-svn: 282590
2016-09-28 13:53:13 +00:00
Dylan McKay 0fe1e63837 [AVR] Merge most recent changes to AVRInstrInfo.td
This adds two new things:

- Operand types per fixup
- Atomic pseudo operations

llvm-svn: 282588
2016-09-28 13:44:02 +00:00
Dylan McKay b967d16c43 [AVR] Update the data layout
The previous data layout caused issues when dealing with atomics.

Foe example, it is illegal to load a 16-bit value with less than 16-bits
of alignment.

This changes the data layout so that all types are aligned by at least
their own width.

Interestingly, this also _slightly_ decreased register pressure in some
cases.

llvm-svn: 282587
2016-09-28 13:29:10 +00:00
Dylan McKay 35047ed741 [AVR] Handle AVR relocations when handling ELF files
llvm-svn: 282586
2016-09-28 13:23:42 +00:00
Dylan McKay 1f877f06b9 [AVR] Add assembly parser
Summary: This patch adds the AVRAsmParser library.

Reviewers: arsenm, kparzysz

Subscribers: wdng, beanz, mgorny, kparzysz, simoncook, jtbandes, llvm-commits

Differential Revision: https://reviews.llvm.org/D20046

llvm-svn: 282584
2016-09-28 13:02:57 +00:00
Guy Blank 2bdc74a471 [X86][FastISel] Use a COPY from K register to a GPR instead of a K operation
The KORTEST was introduced due to a bug where a TEST instruction used a K register.
but, turns out that the opposite case of KORTEST using a GPR is now happening

The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR.

Differential Revision: https://reviews.llvm.org/D24953

llvm-svn: 282580
2016-09-28 11:22:17 +00:00
Simon Pilgrim 55b8eaa505 Strip trailing whitespace
llvm-svn: 282579
2016-09-28 11:08:00 +00:00
Jonas Paulsson 58c5a7f55a [SystemZ] Implementation of getUnrollingPreferences().
This commit enables more unrolling for SystemZ by implementing the
SystemZTargetTransformInfo::getUnrollingPreferences() method.

It has been found that it is better to only unroll moderately, so the
DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order
to set this to a lower value for SystemZ (4).

Reviewers: Evgeny Stupachenko, Ulrich Weigand.
https://reviews.llvm.org/D24451

llvm-svn: 282570
2016-09-28 09:41:38 +00:00
Michael Kuperstein 3e06eafc20 [DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombine
This check currently doesn't seem to do anything useful on any in-tree target:
On non-x86, it always evaluates to false, so we never hit the code path that
creates the shuffle with zero.
On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to
query in general, but doesn't make sense if only restricted to zero blends.

Differential Revision: https://reviews.llvm.org/D24625

llvm-svn: 282567
2016-09-28 06:13:58 +00:00
Kostya Serebryany 3ee6c213d6 [libFuzzer] speedup TracePC::FinalizeTrace
llvm-svn: 282562
2016-09-28 01:16:24 +00:00
Adam Nemet 69330e0bc5 [LAA] Rename emitAnalysis to recordAnalys. NFC
Ever since LAA was split out into an analysis on its own, this function
stopped emitting the report directly.  Instead it stores it to be
retrieved by the client which can then emit it as its own report
(e.g. -Rpass-analysis=loop-vectorize).

llvm-svn: 282561
2016-09-28 00:58:36 +00:00
Adam Nemet c507ac96f5 [Inliner] Port all opt remarks to new streaming API
llvm-svn: 282559
2016-09-27 23:47:03 +00:00
Kevin Enderby 3e490ef94e Next set of additional error checks for invalid Mach-O files for the
other load commands that use the MachO::dylinker_command type
but not used in llvm libObject code but used in llvm tool code.

This includes LC_ID_DYLINKER, LC_LOAD_DYLINKER
and LC_DYLD_ENVIRONMENT load commands.

llvm-svn: 282553
2016-09-27 23:24:13 +00:00
Quentin Colombet c0f11a9fb8 [AArch64][RegisterBankInfo] Switch to statically allocated ValueMapping.
Another step toward TableGen'ed like structure for the RegisterBankInfo
of AArch64. By doing this, we also save a bit of compile time for the
exact same output.

llvm-svn: 282550
2016-09-27 22:55:04 +00:00
Quentin Colombet caae9cd246 [AArch64][RegisterBankInfo] Fix copy/paste in comments.
NFC.

llvm-svn: 282549
2016-09-27 22:54:57 +00:00
Sanjay Patel 764ae8bd72 [x86] add folds for FP logic with vector zeros
The 'or' case shows up in copysign. The copysign code also had 
redundant checking for a scalar zero operand with 'and', so I 
removed that. 

I'm not sure how to test vector 'and', 'andn', and 'xor' yet, 
but it seems better to just include all of the logic ops since
we're fixing 'or' anyway.

llvm-svn: 282546
2016-09-27 22:28:13 +00:00
Adam Nemet 04758ba385 Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC
With the new streaming interface, these class names need to be typed a
lot and it's way too looong.

llvm-svn: 282544
2016-09-27 22:19:23 +00:00
Geoff Berry b124331db7 [TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg().
Summary:
The current implementation of isConstantPhysReg() checks for defs of
physical registers to determine if they are constant.  Some
architectures (e.g. AArch64 XZR/WZR) have registers that are constant
and may be used as destinations to indicate the generated value is
discarded, preventing isConstantPhysReg() from returning true.  This
change adds a TargetRegisterInfo hook that overrides the no defs check
for cases such as this.

Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy

Subscribers: junbuml, aemerson, mcrosier, rengolin

Differential Revision: https://reviews.llvm.org/D24570

llvm-svn: 282543
2016-09-27 22:17:27 +00:00
Adam Nemet 1142147e41 [Inliner] Fold the analysis remark into the missed remark
There is really no reason for these to be separate.

The vectorizer started this pretty bad tradition that the text of the
missed remarks is pretty meaningless, i.e. vectorization failed.  There,
you have to query analysis to get the full picture.

I think we should just explain the reason for missing the optimization
in the missed remark when possible.  Analysis remarks should provide
information that the pass gathers regardless whether the optimization is
passing or not.

llvm-svn: 282542
2016-09-27 21:58:17 +00:00
Michael Zolotukhin 1a554be3b6 [LoopSimplify] When simplifying phis in loop-simplify, do it only if it preserves LCSSA form.
llvm-svn: 282541
2016-09-27 21:03:45 +00:00
Adam Nemet a62b7e1a28 Output optimization remarks in YAML
(Re-committed after moving the template specialization under the yaml
namespace.  GCC was complaining about this.)

This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282539
2016-09-27 20:55:07 +00:00
Adam Nemet 37c9e7f301 Sort headers
llvm-svn: 282538
2016-09-27 20:55:01 +00:00
Matthias Braun 5391ffb671 Statistic: Bring back printing on exit by default
Turns out several external projects relied on llvm printing statistics
on exit. Let's go back to this behaviour by default and have an optional
parameter to disable it.

llvm-svn: 282532
2016-09-27 19:38:55 +00:00
Sanjay Patel 43ef1ad0ba [x86] use isNullFPConstant(); NFCI
Also, put the related FP logic functions together to see the similarities. 

llvm-svn: 282522
2016-09-27 18:48:02 +00:00
Reid Kleckner 6481822e28 [DebugInfo] Add comments to phi dbg.value tracking code, NFC
LLVM developers might be surprised to learn that there are blocks
without valid insertion points (catchswitch), so it seems worth calling
that out explicitly.  Also add a FIXME about what we should really be
doing if we ever need to make optimized Windows EH code debuggable.

While I'm here, make auto usage more consistent with LLVM standards and
avoid an unecessary call to insertBefore.

llvm-svn: 282521
2016-09-27 18:45:31 +00:00
Krzysztof Parzyszek 586fc12e32 [RDF] Add "dead" flag to node attributes
llvm-svn: 282520
2016-09-27 18:24:33 +00:00
Krzysztof Parzyszek 1d32220721 [RDF] Special treatment of exception handling registers
A landing pad can have live-in registers that are defined by the runtime,
not the program (exception pointer register and exception selector
register). Make sure to recognize that case and not link these registers
with any defs in the program.
Each landing pad will have phi nodes added at the beginning to provide
definitions of these registers, but the uses of those phi nodes will not
have any reaching defs.

llvm-svn: 282519
2016-09-27 18:18:44 +00:00
Sanjoy Das 237c84540f [SCEV] Replace a struct with a function; NFC
We can do this now thanks to C++11 lambdas.

llvm-svn: 282515
2016-09-27 18:01:48 +00:00
Sanjoy Das a26021414a [SCEV] Use find instead of find_as; NFC
We don't need the extra generality here.

llvm-svn: 282514
2016-09-27 18:01:46 +00:00
Sanjoy Das c220ac79c4 [SCEV] Reduce the scope of a struct; NFC
llvm-svn: 282513
2016-09-27 18:01:44 +00:00
Sanjoy Das c46bceb632 [SCEV] Remove custom RAII wrapper; NFC
Instead use the pre-existing `scope_exit` class.

llvm-svn: 282512
2016-09-27 18:01:42 +00:00
Sanjoy Das db93375711 [SCEV] Make PendingLoopPredicates more frugal; NFCI
I don't expect `PendingLoopPredicates` to have very many
elements (e.g. when -O3'ing the sqlite3 amalgamation,
`PendingLoopPredicates` has at most 3 elements).  So now we use a
`SmallPtrSet` for it instead of the more heavyweight `DenseSet`.

llvm-svn: 282511
2016-09-27 18:01:38 +00:00
Keith Walker 83ebef5db3 Propagate DBG_VALUE entries when there are unvisited predecessors
Variables are sometimes missing their debug location information in
blocks in which the variables should be available. This would occur
when one or more predecessor blocks had not yet been visited by the
routine which propagated the information from predecessor blocks.

This is addressed by only considering predecessor blocks which have
already been visited.

The solution to this problem was suggested by Daniel Berlin on the
LLVM developer mailing list.

Differential Revision: https://reviews.llvm.org/D24927

llvm-svn: 282506
2016-09-27 16:46:07 +00:00
Adam Nemet cc2a3fa8e8 Revert "Output optimization remarks in YAML"
This reverts commit r282499.

The GCC bots are failing

llvm-svn: 282503
2016-09-27 16:39:24 +00:00
Adam Nemet 92e928c10a Output optimization remarks in YAML
This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

llvm-svn: 282499
2016-09-27 16:15:16 +00:00
Rafael Espindola eaeb6d91a1 Add xxhash to llvm.
It will be used for fast fingerprinting in lld at least.

llvm-svn: 282493
2016-09-27 15:45:57 +00:00
Konstantin Zhuravlyov da4687c531 [AMDGPU] Enable changing instprinter's behavior based on the per-function
subtarget

This is a prerequisite for coming waitcnt changes

Differential Revision: https://reviews.llvm.org/D24939

llvm-svn: 282489
2016-09-27 14:42:48 +00:00
Simon Dardis d2ed8abb15 [mips] Disable tail calls temporarily
Disable tail calls while the remaining bugs are fixed. Enable only for tests.

Reviewers: vkalintiris

Differential Review: https://reviews.llvm.org/D24912

llvm-svn: 282487
2016-09-27 13:15:54 +00:00
Simon Dardis 0486d585c5 [mips] Add rsqrt, recip for MIPS
Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for
architecture support and register usage.

Reviewers: vkalintiris, zoran.jovanoic

Differential Review: https://reviews.llvm.org/D24499

llvm-svn: 282485
2016-09-27 12:25:15 +00:00
Nemanja Ivanovic 6f22b41398 [Power9] Builtins for ELF v.2 API conformance - back end portion
This patch corresponds to review:
https://reviews.llvm.org/D24396

This patch adds support for the "vector count trailing zeroes",
"vector compare not equal" and "vector compare not equal or zero instructions"
as well as "scalar count trailing zeroes" instructions. It also changes the
vector negation to use XXLNOR (when VSX is enabled) so as not to increase
register pressure (previously this was done with a splat immediate of all
ones followed by an XXLXOR). This was done because the altivec.h
builtins (patch to follow) use vector negation and the use of an additional
register for the splat immediate is not optimal.

llvm-svn: 282478
2016-09-27 08:42:12 +00:00
Craig Topper 789888002a [X86] Use std::max to calculate alignment instead of assuming RC->getSize() will not return a value greater than 32. I think it theoretically could be 64 for AVX-512.
llvm-svn: 282471
2016-09-27 06:44:25 +00:00
Kostya Serebryany 7d6935c184 [libFuzzer] run re2 test in 8 threads by default
llvm-svn: 282469
2016-09-27 03:33:57 +00:00
Kostya Serebryany 45c144754b [sanitizer-coverage] fix a bug in trace-gep
llvm-svn: 282467
2016-09-27 01:55:08 +00:00
Kostya Serebryany 186d61801c [sanitizer-coverage] don't emit the CTOR function if nothing has been instrumented
llvm-svn: 282465
2016-09-27 01:08:33 +00:00
Ivan Krasin 4ff4f21e15 Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generation
Summary:
We don't currently need this facility for CFI. Disabling individual hot methods proved
to be a better strategy in Chrome.

Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne.

Reviewers: pcc

Subscribers: kcc

Differential Revision: https://reviews.llvm.org/D24948

llvm-svn: 282461
2016-09-27 00:29:53 +00:00
Kostya Serebryany 53543af036 [libFuzzer] add a test based on openssl-1.0.1f (finds heartbleed)
llvm-svn: 282460
2016-09-27 00:27:40 +00:00
Kostya Serebryany 5ff481fd9e [libFuzzer] add -exit_on_src_pos to test libFuzzer itself, add a test script for RE2 that uses this flag
llvm-svn: 282458
2016-09-27 00:10:20 +00:00
Peter Collingbourne 53a852b648 LowerTypeTests: Remove unused variable.
llvm-svn: 282456
2016-09-26 23:56:17 +00:00
Peter Collingbourne 6ed92e3f53 LowerTypeTests: Create LowerTypeTestsModule class and move implementation there. Related simplifications.
llvm-svn: 282455
2016-09-26 23:54:39 +00:00
Davide Italiano a9f85d68cc [CodeGen] Add support for emitting .init_array instead of .ctors on FreeBSD.
PR: 30494
llvm-svn: 282451
2016-09-26 22:53:15 +00:00
Derek Schuff 92d300eb8f [WebAssembly] Use the frame pointer instead of the stack pointer
When we have dynamic allocas we have a frame pointer, and
when we're lowering frame indexes we should make sure we use it.

Patch by Jacob Gravelle

Differential Revision: https://reviews.llvm.org/D24889

llvm-svn: 282442
2016-09-26 21:18:03 +00:00
Kevin Enderby 90986e6c7c Next set of additional error checks for invalid Mach-O files for the
other load commands that use the Mach::linkedit_data_command type
but not used in llvm libObject code but used in llvm tool code.

This includes LC_FUNCTION_STARTS, LC_SEGMENT_SPLIT_INFO
and LC_DYLIB_CODE_SIGN_DRS load commands.

llvm-svn: 282441
2016-09-26 21:11:03 +00:00
Aditya Kumar 0a48b37cfd Move computation past early return
Reviewers:
        rafael
        spatel

Differential Revision: https://reviews.llvm.org/D24843

llvm-svn: 282440
2016-09-26 21:01:13 +00:00
Piotr Padlewski d9830eb79f [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.

I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.

I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.

Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

llvm-svn: 282437
2016-09-26 20:37:32 +00:00
Nirav Dave 6477ce2697 Add support for Code16GCC
[X86] The .code16gcc directive parses X86 assembly input in 32-bit mode and
outputs in 16-bit mode. Teach parser to switch modes appropriately.

Reviewers: dwmw2, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20109

llvm-svn: 282430
2016-09-26 19:33:36 +00:00
Andrew Kaylor 595307a468 Add optimization bisect support to an optional Mips pass
Differential Revision: https://reviews.llvm.org/D19513

llvm-svn: 282428
2016-09-26 19:05:37 +00:00
Matthias Braun c603551b4e Statistic: Only print statistics on exit for -stats
Previously enabling the statistics with EnableStatistics() would lead to
them getting printed to stderr/-info-output-file on exit. However
frontends may want a way to enable statistics and do the printing on
their own instead of the forced printing on exit.

This changes the code so that only the -stats option enables printing on
exit, EnableStatistics() only enables the tracking but requires invoking
one of the PrintStatistics() variants.

Differential Revision: https://reviews.llvm.org/D24819

llvm-svn: 282425
2016-09-26 18:38:07 +00:00
Tom Stellard 1b9748c6a2 AMDGPU/SI: Don't crash on anonymous GlobalValues
Summary:
We need to call AsmPrinter::getNameWithPrefix() in order to handle
anonymous GlobalValues (e.g. @0, @1).

Reviewers: arsenm, b-sumner

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D24865

llvm-svn: 282420
2016-09-26 17:29:25 +00:00
Daniel Berlin 1e98c04226 Remove pruning of phi nodes in MemorySSA - it makes updating harder
Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24923

llvm-svn: 282419
2016-09-26 17:22:54 +00:00
Matthew Simpson b764aba2ab [LV] Scalarize instructions marked scalar after vectorization
This patch ensures that we actually scalarize instructions marked scalar after
vectorization. Previously, such instructions may have been vectorized instead.

Differential Revision: https://reviews.llvm.org/D23889

llvm-svn: 282418
2016-09-26 17:08:37 +00:00
Gor Nishanov bc0ebb383c [Coroutines] Part14: Handle coroutines with no suspend points.
Summary:
If coroutine has no suspend points, remove heap allocation and turn a coroutine into a normal function.

Also, if a pattern is detected that coroutine resumes or destroys itself prior to coro.suspend call, turn the suspend point into a simple jump to resume or cleanup label. This pattern occurs when coroutines are used to propagate errors in functions that return expected<T>.

Reviewers: majnemer

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24408

llvm-svn: 282414
2016-09-26 15:49:28 +00:00
Geoff Berry 256fcf975f [AArch64] Improve add/sub/cmp isel of uxtw forms.
Don't match the UXTW extended reg forms of ADD/ADDS/SUB/SUBS if the
32-bit to 64-bit zero-extend can be done for free by taking advantage
of the 32-bit defining instruction zeroing the upper 32-bits of the X
register destination.  This enables better instruction selection in a
few cases, such as:

  sub x0, xzr, x8
  instead of:
  mov x8, xzr
  sub x0, x8, w9, uxtw

  madd x0, x1, x1, x8
  instead of:
  mul x9, x1, x1
  add x0, x9, w8, uxtw

  cmp x2, x8
  instead of:
  sub x8, x2, w8, uxtw
  cmp x8, #0

  add x0, x8, x1, lsl #3
  instead of:
  lsl x9, x1, #3
  add x0, x9, w8, uxtw

Reviewers: t.p.northover, jmolloy

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D24747

llvm-svn: 282413
2016-09-26 15:34:47 +00:00
Evandro Menezes e45de8a5ec Add support to optionally limit the size of jump tables.
Many high-performance processors have a dedicated branch predictor for
indirect branches, commonly used with jump tables.  As sophisticated as such
branch predictors are, they tend to have well defined limits beyond which
their effectiveness is hampered or even nullified.  One such limit is the
number of possible destinations for a given indirect branches that such
branch predictors can handle.

This patch considers a limit that a target may set to the number of
destination addresses in a jump table.

Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar
<aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>.

Differential revision: https://reviews.llvm.org/D21940

llvm-svn: 282412
2016-09-26 15:32:33 +00:00
Alexey Bataev 793c946ecb [InstCombine] Fixed bug introduced in r282237
The index of the new insertelement instruction was evaluated in the
wrong way, it was considered as the index of the inserted value instead
of index of the position, where the value should be inserted.

llvm-svn: 282401
2016-09-26 13:18:59 +00:00
Andrea Di Biagio a82d52d11d [InstCombine] Teach the udiv folding logic how to handle constant expressions.
This patch fixes PR30366.

Function foldUDivShl() worked under the assumption that one of the values
in input to the function was always an instance of llvm::Instruction.
However, function visitUDivOperand() (the only user of foldUDivShl) was
clearly violating that precondition; internally, visitUDivOperand() uses pattern
matches to check the operands of a udiv. Pattern matchers for binary operators
know how to handle both Instruction and ConstantExpr values.

This patch fixes the problem in foldUDivShl(). Now we use pattern matchers
instead of explicit casts to Instruction. The reduced test case from PR30366
has been added to test file InstCombine/udiv-simplify.ll.

Differential Revision: https://reviews.llvm.org/D24565

llvm-svn: 282398
2016-09-26 12:07:23 +00:00
Dylan McKay c4ec11f451 [AVR] Add AVRMCExpr
Summary: This adds the AVRMCExpr headers and implementation.

Reviewers: arsenm, ruiu, grosbach, kparzysz

Subscribers: wdng, beanz, mgorny, kparzysz, jtbandes, llvm-commits

Differential Revision: https://reviews.llvm.org/D20503

llvm-svn: 282397
2016-09-26 11:35:32 +00:00
Sam Kolton 984461062f Revert "[AMDGPU] Disassembler: print label names in branch instructions"
This reverts commit 6c6dbe625263ec9fcf8de0df27263cf147cde550.

llvm-svn: 282396
2016-09-26 11:29:03 +00:00
Sam Kolton 1559f76257 [AMDGPU] Disassembler: print label names in branch instructions
Summary: Add AMDGPUSymbolizer for finding names for labels from ELF symbol table.

Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D24802

llvm-svn: 282394
2016-09-26 10:05:50 +00:00
James Molloy 9abb2fa5bb [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.

It also contains fixes for emitting .text relocations which made the sanitizer
bots unhappy.

llvm-svn: 282387
2016-09-26 07:26:24 +00:00
Zvi Rackover 839d15a194 [X86] Optimization for replacing LEA with MOV at frame index elimination time
Summary:
Replace a LEA instruction of the form 'lea (%esp), %ebx' --> 'mov %esp, %ebx'

MOV is preferable over LEA because usually there are more issue-slots available to execute MOVs than LEAs. Latest processors also support zero-latency MOVs.

Fixes pr29022.

Reviewers: hfinkel, delena, igorb, myatsina, mkuper

Differential Revision: https://reviews.llvm.org/D24705

llvm-svn: 282385
2016-09-26 06:42:07 +00:00
Ayman Musa d7a5ed4141 [X86][avx512] Fix bug in masked compress store.
Differential Revision: https://reviews.llvm.org/D23984

llvm-svn: 282381
2016-09-26 06:22:08 +00:00
Chandler Carruth 68abda52c2 [SCEV] Fix the order of members in the initializer list.
Noticed due to the warning on this line. Sanjoy is on
a less-than-awesome internet connection, so committing on his behalf.

llvm-svn: 282380
2016-09-26 04:49:58 +00:00
Sanjoy Das 5cb11b6423 [SCEV] Assign LoopPropertiesCache in the move constructor
In a previous change I collapsed two different caches into one.  When
doing that I noticed that ScalarEvolution's move constructor was not
moving those caches.

To keep the previous change simple, I've moved that bugfix into this
separate change.

llvm-svn: 282376
2016-09-26 02:44:10 +00:00
Sanjoy Das 5603fc00a6 [SCEV] Combine two predicates into one; NFC
Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking
the loop and maintaining similar sorts of caches.  This commit changes
SCEV to compute both the predicates via a single walk, and maintain a
single cache instead of two.

llvm-svn: 282375
2016-09-26 02:44:07 +00:00
Sanjoy Das 5c4869b39d [SCEV] Make it obvious BackedgeTakenInfo's constructor steals storage
Specifically, it moves SCEVUnionPredicates from its input into its own
storage.  Make this obvious at the type level.

llvm-svn: 282374
2016-09-26 01:10:27 +00:00
Sanjoy Das 6b76cdf0d5 [SCEV] Further isolate incidental data structure; NFC
llvm-svn: 282373
2016-09-26 01:10:25 +00:00
Sanjoy Das 7326861abd [SCEV] Simplify BackedgeTakenInfo::getMax; NFC
llvm-svn: 282372
2016-09-26 01:10:22 +00:00
Sanjoy Das e935c77e20 [SCEV] Reserve space in SmallVector; NFC
llvm-svn: 282368
2016-09-25 23:12:08 +00:00
Sanjoy Das c9bbf56358 [SCEV] Have ExitNotTakenInfo keep a pointer to its predicate; NFC
SCEVUnionPredicate is a "heavyweight" structure, so it is beneficial to
store the (optional) data out of line.

llvm-svn: 282366
2016-09-25 23:12:04 +00:00
Sanjoy Das d1eb62ad11 [SCEV] Simplify tracking ExitNotTakenInfo instances; NFC
This change simplifies a data structure optimization in the
`BackedgeTakenInfo` class for loops with exactly one computable exit.

I've sanity checked that this does not regress compile time performance,
using sqlite3's amalgamated build.

llvm-svn: 282365
2016-09-25 23:12:00 +00:00
Sanjoy Das 89eea6b2ed [SCEV] Rename a couple of fields; NFC
llvm-svn: 282364
2016-09-25 23:11:57 +00:00
Sanjoy Das bdd9710252 [SCEV] Remove incidental data structure; NFC
llvm-svn: 282363
2016-09-25 23:11:55 +00:00
Craig Topper 87155274b8 [X86] Remove what appears to be leftover MMX code involving (v1i64 scalar_to_vector).
llvm-svn: 282361
2016-09-25 16:34:11 +00:00
Craig Topper aab59a48e7 [X86] Remove patterns for scalar_to_vector from FR32/FR64 to 256-bit vectors. Lowering explicitly avoids creating this pattern.
llvm-svn: 282360
2016-09-25 16:34:09 +00:00
Craig Topper 0cc188d979 [AVX-512] Replace get512BitSuperRegister with calls to TargetRegisterInfo::getMatchingSuperReg.
llvm-svn: 282359
2016-09-25 16:34:06 +00:00
Craig Topper 60d3ef1d72 [AVX-512] Fix some patterns predicates to properly enforce priority for various versions of CVTDQ2PD instruction.
llvm-svn: 282358
2016-09-25 16:34:02 +00:00
Craig Topper 3c9faa32c1 [AVX-512] Add rounding versions of instructions to hasUndefRegUpdate.
llvm-svn: 282357
2016-09-25 16:33:59 +00:00
Craig Topper d8b2bd492c [AVX-512] Add the scalar unsigned integer to fp conversion instructions to hasUndefRegUpdate.
llvm-svn: 282356
2016-09-25 16:33:57 +00:00
Craig Topper ac941b9736 [AVX-512] Remove duplicate instructions for converting integer to scalar floating point. We can use patterns to point to the other instructions instead.
llvm-svn: 282355
2016-09-25 16:33:53 +00:00
Craig Topper 8f2e85e669 [AVX-512] Don't use two opcodes for INTR_TYPE_SCALAR_MASK_RM. The handling was such that if the second opcode was present the first was ingored, so we can just have one opcode.
llvm-svn: 282344
2016-09-25 01:03:10 +00:00
Craig Topper 1776f4c965 [X86] Teach combineShuffle to avoid creating floating point operations with integer types and integer operations with floating point types. Seems isOperationLegal lies for mismatched types and operations.
Fixes PR30511.

llvm-svn: 282341
2016-09-24 21:42:49 +00:00
Craig Topper aeca0460f3 [AVX-512] Split scalar version of X86ISD::SELECT into a separate opcode because isel is not robust with multiple type profiles for the same opcode.
llvm-svn: 282340
2016-09-24 21:42:47 +00:00
Craig Topper 7e664dad60 [AVX-512] Remove the patterns for selecting scalar VCOMI/VUCOMI instructions with SAE as there is no way to create the pattern.
llvm-svn: 282339
2016-09-24 21:42:43 +00:00
Duncan P. N. Exon Smith 11c06ea55a ObjCARC: Don't look at users of ConstantData
Stop looking at users of UndefValue and ConstantPointerNull in the
objective C ARC optimizers.  The other users aren't actually
interesting, since they're not pointing at a particular object.  I
imagine these calls could be optimized through -instcombine... maybe
they already are?

These early returns will be required at some point in the future, with a
WIP patch that asserts when someone accesses a use-list on ConstantData.

llvm-svn: 282338
2016-09-24 21:01:20 +00:00
Duncan P. N. Exon Smith b1b208a1f5 Analysis: Return early for UndefValue in computeKnownBits
There is no benefit in looking through assumptions on UndefValue to
guess known bits.  Return early to avoid walking their use-lists, and
assert that all instances of ConstantData are handled here for similar
reasons (UndefValue was the only integer/pointer holdout).

llvm-svn: 282337
2016-09-24 20:42:02 +00:00
Sanjay Patel 752ad8fde7 [x86] don't try to create a vector integer inst for an SSE1 target (PR30512)
This bug was introduced with:
http://reviews.llvm.org/rL272511

We need to restrict the lowering to v4f32 comparisons because that's all SSE1 can handle.

This should fix:
https://llvm.org/bugs/show_bug.cgi?id=28044

llvm-svn: 282336
2016-09-24 20:24:06 +00:00
Duncan P. N. Exon Smith 4fd9b7e16f Scalar: Ignore ConstantData in processAssumption
Assumptions on UndefValue and ConstantPointerNull aren't relevant to
other users.  Ignore them entirely to avoid wasting cycles walking
through their (possibly extremely extensive (cross-module)) use-lists.

It wasn't clear how to add a specific test for this, and it'll be
covered anyway by an eventual patch that asserts when trying to access
the use-list of an instance of ConstantData.

llvm-svn: 282334
2016-09-24 20:00:38 +00:00
Duncan P. N. Exon Smith b479873912 Analysis: Return early in isKnownNonNullAt for ConstantData
Check and return early for ConstantPointerNull and UndefValue
specifically in isKnownNonNullAt, and assert that ConstantData never
make it to isKnownNonNullFromDominatingCondition.

This confirms that isKnownNonNullFromDominatingCondition never walks
through the use-list of an instance of ConstantData.  Given that such
use-lists cross module boundaries, it never really made sense to do so,
and was potentially very expensive.

llvm-svn: 282333
2016-09-24 19:39:47 +00:00
Dylan McKay 907cde3cc2 [AVR] Update signature of AVRTargetObjectFile::SelectSectionForGlobal
It was changed recently, and was breaking compilation of the backend.

llvm-svn: 282329
2016-09-24 11:38:08 +00:00
Quentin Colombet d816bfb282 [RegisterBankInfo] Constify the member of the XXXMapping maps.
This makes it obvious that items in those maps behave like statically
created objects.

llvm-svn: 282327
2016-09-24 04:54:03 +00:00
Quentin Colombet a50813f608 [RegisterBankInfo] Add statistics for dynamic value mappings.
Like partial mappings, as we move toward TableGen'ed information, the
number should reach zero eventually.

llvm-svn: 282325
2016-09-24 04:53:55 +00:00
Quentin Colombet fd8c95adf4 [RegisterBankInfo] Uniquely generate ValueMapping.
This is a step toward statically allocate ValueMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.

llvm-svn: 282324
2016-09-24 04:53:52 +00:00
Quentin Colombet 8159de4de9 [RegisterBankInfo] Keep valid pointers for PartialMappings.
Previously we were using the address of the unique instance of a partial
mapping in the related map to access this instance. However, when the
map grows, the whole set of instances may be moved elsewhere and the
previous addresses are not valid anymore.

Instead, keep the address of the unique heap allocated instance of a
partial mapping.

Note: I did not see any actual bugs for that problem as the number of
partial mappings dynamically allocated is small (<= 4).

llvm-svn: 282323
2016-09-24 04:53:48 +00:00
Kostya Serebryany 273d767215 [libFuzzer] add a standalone build script
llvm-svn: 282321
2016-09-24 04:00:00 +00:00
Duncan P. N. Exon Smith c82c11428e GlobalStatus: Don't walk use-lists of ConstantData
Return early from llvm::isSafeToDestroyConstant() whenever the value
`isa<ConstantData>()`.  These constants are shared across the
LLVMContext.  We never really want to delete them here, and walking
their use-lists can be very expensive.

(This is motivated by an eventual goal of removing use-lists entirely
from ConstantData.)

llvm-svn: 282320
2016-09-24 02:30:11 +00:00
Kostya Serebryany 0800b81a21 [libFuzzer] simplify HandleTrace again, start re-running interesting units and collecting their features.
llvm-svn: 282316
2016-09-23 23:51:58 +00:00
Peter Collingbourne a638fe057c Add qualification to fix MSVC build.
llvm-svn: 282313
2016-09-23 23:23:23 +00:00
Sanjay Patel 0b36337d61 [x86] fix FCOPYSIGN lowering to create constants instead of ConstantPool loads
This is similar to:
https://reviews.llvm.org/rL279958

By not prematurely lowering to loads, we should be able to more easily eliminate
the 'or' with zero instructions seen in copysign-constant-magnitude.ll.

We should also be able to extend this code to handle vectors.

llvm-svn: 282312
2016-09-23 23:17:29 +00:00
Petr Hosek 85b2f67613 [MC] Support .ds directives in assembler parser
These directives are already supported by GNU assembler.

Differential Revision: https://reviews.llvm.org/D24740

llvm-svn: 282303
2016-09-23 21:53:36 +00:00
Matthias Braun 729c989083 llc: Add -start-before/-stop-before options
Differential Revision: https://reviews.llvm.org/D23089

llvm-svn: 282302
2016-09-23 21:46:02 +00:00
Peter Collingbourne 80186a57d6 LTO: Simplify caching interface.
The NativeObjectOutput class has a design problem: it mixes up the caching
policy with the interface for output streams, which makes the client-side
code hard to follow and would for example make it harder to replace the
cache implementation in an arbitrary client.

This change separates the two aspects by moving the caching policy
to a separate field in Config, replacing NativeObjectOutput with a
NativeObjectStream class which only deals with streams and does not need to
be overridden by most clients and introducing an AddFile callback for adding
files (e.g. from the cache) to the link.

Differential Revision: https://reviews.llvm.org/D24622

llvm-svn: 282299
2016-09-23 21:33:43 +00:00
Valery Pykhtin fbf2d93f73 [AMDGPU] Fix for bz30427: wrong MTBUF encoding on VI
Differential revision: https://reviews.llvm.org/D24875

llvm-svn: 282296
2016-09-23 21:21:21 +00:00
Kostya Serebryany 2d1d944f7e [libFuzzer] first steps in adding a proper automated test suite based on real-life code: add a script to build RE2 at a revision that has known bugs
llvm-svn: 282292
2016-09-23 20:43:22 +00:00
Kostya Serebryany 0d26de3922 [libFuzzer] reset Counters (trace-pc-guard) before every run
llvm-svn: 282284
2016-09-23 20:04:13 +00:00
Petr Hosek 4cb08ce3bc [MC] Support .dcb directives in assembler parser
These directives are already supported by GNU assembler.

Differential Revision: https://reviews.llvm.org/D24741

llvm-svn: 282283
2016-09-23 19:25:15 +00:00
Sanjay Patel 04949faf69 [TLI] isdigit / isascii / toascii param type should match return type (PR30484)
We crash in LibCallSimplifier if we don't check the validity of the function signature properly.

llvm-svn: 282278
2016-09-23 18:44:09 +00:00
Quentin Colombet 380cd88cfd [ResetMachineFunction] Populate the comments in the header of the file.
NFC

llvm-svn: 282276
2016-09-23 18:38:15 +00:00
Quentin Colombet 0f4c20a1e7 [ResetMachineFunction] Add statistic on the number of reset functions.
As the development of GlobalISel move forward, this statistic should
strictly decrease until it reaches zero. At this point, it would mean
GlobalISel can replace SDISel (at least on the tested inputs :P).

llvm-svn: 282275
2016-09-23 18:38:13 +00:00
Quentin Colombet d4c02437c1 [RegisterBankInfo] Add statistics for dynamic partial mappings.
Collect statistics about the number of partial mappings dynamically
allocated and accessed. Ultimately, when the whole TableGen
infrastructure is set, those numbers should be zero.

llvm-svn: 282274
2016-09-23 18:38:06 +00:00
Matthias Braun 1acb55e67c ScheduleDAG: Match enum names when printing sdep kinds
It is less confusing to have the same names in the debug print as the
enum members.

llvm-svn: 282273
2016-09-23 18:28:31 +00:00
Peter Collingbourne 6e8607527c BitcodeReader: Deduplicate code. NFC.
Differential Revision: https://reviews.llvm.org/D24852

llvm-svn: 282272
2016-09-23 18:27:42 +00:00
Quentin Colombet c13ea88a14 [RegBankSelect] Use DEBUG_TYPE instead of repeating the name of the pass
NFC

llvm-svn: 282267
2016-09-23 17:50:06 +00:00
Quentin Colombet 09a9edcc26 [RegisterBank] Mark the dump method with LLVM_DUMP_METHOD.
NFC

llvm-svn: 282266
2016-09-23 17:50:03 +00:00
Jun Bum Lim 3822939ba7 Enhance calcColdCallHeuristics for InvokeInst
Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst.

Reviewers: mcrosier, hfinkel, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D24868

llvm-svn: 282262
2016-09-23 17:26:14 +00:00
James Molloy 85124c76fc Revert "[ARM] Promote small global constants to constant pools"
This reverts commit r282241. It caused http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19882.

llvm-svn: 282249
2016-09-23 13:35:43 +00:00
Nemanja Ivanovic d2c3c51a70 [Power9] Exploit move and splat instructions for build_vector improvement
This patch corresponds to review:
https://reviews.llvm.org/D21135

This patch exploits the following instructions:
mtvsrws
lxvwsx
mtvsrdd
mfvsrld

In order to improve some build_vector and extractelement patterns.

llvm-svn: 282246
2016-09-23 13:25:31 +00:00
James Molloy 1ce54d6be2 [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.

It also contains fixes for emitting .text relocations which made the sanitizer
bots unhappy.

llvm-svn: 282241
2016-09-23 12:15:58 +00:00
George Rimar 4f82df52ae Revert r282238 "Revert r282235 "[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section.""
Build bot issues (http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15856/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Adwarfdump-dump-gdbindex.test)
should be fixed in that version. Issue was that MSVS does not support "%zu". Though it works fine on MSCS 2015,
Bot looks running MSVS 2013 that does not like it. MSDN also says that "z" prefix is not supported: https://msdn.microsoft.com/en-us/library/tcxf1dw6.aspx
I had to use PRId64 instead.

Original commit message:

[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section.

gold linker's --gdb-index option currently is able to create the .gdb_index section that allows GDB to locate and read the .dwo files as it needs them,
this helps reduce the total size of the object files processed by the linker.

More info about that:
https://gcc.gnu.org/wiki/DebugFission
https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html

Patch teaches dwarfdump tool to dump this section.

Differential revision: https://reviews.llvm.org/D21503

llvm-svn: 282239
2016-09-23 11:01:53 +00:00
George Rimar a348527186 Revert r282235 "[llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section."
It broke BB:
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15856

llvm-svn: 282238
2016-09-23 10:12:56 +00:00
Alexey Bataev fee9078dcd [InstCombine] Fix for PR29124: reduce insertelements to shufflevector
If inserting more than one constant into a vector:

define <4 x float> @foo(<4 x float> %x) {
  %ins1 = insertelement <4 x float> %x, float 1.0, i32 1
  %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2
  ret <4 x float> %ins2
}

InstCombine could reduce that to a shufflevector:

define <4 x float> @goo(<4 x float> %x) {
 %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3>
 ret <4 x float> %shuf
}
Also, InstCombine tries to convert shuffle instruction to single insertelement, if one of the vectors is a constant vector and only a single element from this constant should be used in shuffle, i.e.
shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float
undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> ->
insertelement <4 x float> %v, float 1.0, 1

Differential Revision: https://reviews.llvm.org/D24182

llvm-svn: 282237
2016-09-23 09:14:08 +00:00
George Rimar a77bcf5e42 [llvm-dwarfdump] - Teach dwarfdump to dump gdb-index section.
gold linker's --gdb-index option currently is able to create the .gdb_index section that allows GDB to locate and read the .dwo files as it needs them,
this helps reduce the total size of the object files processed by the linker.

More info about that:
https://gcc.gnu.org/wiki/DebugFission
https://sourceware.org/gdb/onlinedocs/gdb/Index-Section-Format.html

Patch teaches dwarfdump tool to dump this section.

Differential revision: https://reviews.llvm.org/D21503

llvm-svn: 282235
2016-09-23 09:09:26 +00:00
Valery Pykhtin 355103f6c0 [AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions
Differential revision: https://reviews.llvm.org/D24738

llvm-svn: 282234
2016-09-23 09:08:07 +00:00
Craig Topper a02e394872 [AVX-512] Split X86ISD::VFPROUND and X86ISD::VFPEXT into separate opcodes for each type constraint.
This revealed that scalar intrinsics could create nodes with a rounding mode of FROUND_CUR_DIRECTION, but the patterns didn't check for it. It just worked because isel doesn't check operand count and we had a pattern without the rounding mode argument at all.

llvm-svn: 282231
2016-09-23 06:24:43 +00:00
Craig Topper 3174b6e467 [AVX-512] Add separate ISD opcodes for each form of CVT instructions. Don't reuse non-X86 ISD opcodes with extra X86 specific arguments.
llvm-svn: 282230
2016-09-23 06:24:39 +00:00
Craig Topper d70ec9b25e [AVX-512] Use different ISD opcodes for some of the scalar intrinsic lowering. Isel is not very robust against using the same ISD opcode with different number of operands so its better to separate.
llvm-svn: 282229
2016-09-23 06:24:35 +00:00
Kostya Serebryany ce1cab169f [libFuzzer] be more precise about what we reset in TracePC
llvm-svn: 282225
2016-09-23 02:18:59 +00:00
Kostya Serebryany 16a145fd0f [libFuzzer] fix merging with trace-pc-guard
llvm-svn: 282224
2016-09-23 01:58:51 +00:00
Tom Stellard e88bbc34c6 AMDGPU/SI: Include implicit arguments in kernarg_segment_byte_size
Reviewers: arsenm

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D24835

llvm-svn: 282223
2016-09-23 01:33:26 +00:00
Kostya Serebryany 87a598e19f [libFuzzer] simplify the TracePC logic
llvm-svn: 282222
2016-09-23 01:20:07 +00:00
Quentin Colombet ceb71c2f43 [RegisterBankInfo] Mark the dump methods with LLVM_DUMP_METHOD.
NFC

llvm-svn: 282221
2016-09-23 00:59:12 +00:00
Quentin Colombet fd0ab5c660 [AArch64][RegisterBankInfo] Sanity check TableGen'ed like inputs.
Make sure the entries written to mimic the behavior of TableGen are
sane.

llvm-svn: 282220
2016-09-23 00:59:07 +00:00
Kostya Serebryany ab73c6924f [libFuzzer] move value profiling logic into TracePC
llvm-svn: 282219
2016-09-23 00:46:18 +00:00
Tom Stellard e190cd2834 Triple: Add opencl environment type
Summary:
For AMDGPU, we have been using the operating system component of the triple
for specifying the low-level runtime that is being used.  The rationale for
this is that the host operating system (e.g. Linux) is irrelevant for GPU code,
since its execution enviroment will be mostly controled by the low-level runtime
being used to execute the code.

In most cases, higher level languages have their own runtime which is
implemented on top of the low-level runtime.  The kernel ABIs of each
language mostly depend on the low-level runtime, but there may be some
slight differences between languages.  OpenCL for example, may append
additional arguments to the kernel in order to pass values like global
offsets or buffers for printf.  OpenMP, HCC, or other languages may want
to add their own values which differ from OpenCL.

The reason for adding a new opencl environment type is to make it possible for the backend
to distinguish between the ABIs of the higher-level languages and handle them correctly.
It seems cleaner to use the enviroment component for this rather than creating a new
OS type for every combination of low-level runtime / high-level language.

Reviewers: Anastasia, chandlerc

Subscribers: whchung, pekka.jaaskelainen, wdng, yaxunl, llvm-commits

Differential Revision: https://reviews.llvm.org/D24735

llvm-svn: 282218
2016-09-23 00:42:56 +00:00
Petr Hosek 2f4ac44dc3 [MC] Support skip and count for .incbin directive
These optional arguments are supported by GNU assembler.

Differential Revision: https://reviews.llvm.org/D24714

llvm-svn: 282217
2016-09-23 00:41:06 +00:00
Kostya Serebryany d28099de5d [libFuzzer] change ValueBitMap to remember the number of bits in it
llvm-svn: 282216
2016-09-23 00:22:46 +00:00
Quentin Colombet 5b16d931dc [AArch64][RegisterBankInfo] Switch to TableGen'ed like PartialMapping.
Statically instanciate the most common PartialMappings. This should
be closer to what the code would look like when TableGen support is
added for GlobalISel. As a side effect, this should improve compile
time.

llvm-svn: 282215
2016-09-23 00:14:36 +00:00
Quentin Colombet 1946580cf2 [RegisterBankInfo] Check that the mapping covers the interesting bits.
In the verify method of the ValueMapping class we used to check that the
mapping exactly matches the bits of the input value. This is problematic
for statically allocated mappings because we would need a different
mapping for each different size of the value that maps on one
instruction. For instance, with such scheme, we would need a different
mapping for a value of size 1, 5, 23 whereas they all end up on a 32-bit
wide instruction.

Therefore, change the verifier to check that the meaningful bits are
covered by the mapping instead of matching them.

llvm-svn: 282214
2016-09-23 00:14:34 +00:00
Quentin Colombet 0afa7d6b82 [RegisterBankInfo] Use array instead of SmallVector for BreakDown.
This is another step toward TableGen'ed like structures. The BreakDown of
the mapping of the value will be statically computed by TableGen, thus
we only have to point to the right entry in the table instead of
dynamically allocate the mapping for each instruction.

We still support the dynamic allocation through a factory of
PartialMapping to ease the bring-up of the targets while the TableGen
backend is not available.

llvm-svn: 282213
2016-09-23 00:14:30 +00:00
Kostya Serebryany be0ed59cdc [libFuzzer] simplify the crash minimizer; split MaxLen into two: MaxInputLen and MaxMutationLen, allow MaxMutationLen to be less than MaxInputLen
llvm-svn: 282211
2016-09-22 23:16:36 +00:00
Sanjay Patel 30ef70b090 [InstCombine] fold X urem C -> X < C ? X : X - C when C is big (PR28672)
We already have the udiv variant of this transform, so I think this is ok for 
InstCombine too even though there is an increase in IR instructions. As the 
tests and TODO comments show, the transform can lead to follow-on combines.

This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672

Differential Revision: https://reviews.llvm.org/D24527

llvm-svn: 282209
2016-09-22 22:36:26 +00:00
Davide Italiano fcee2d8001 [AsmParser] Remove unused partial template specialization.
llvm-svn: 282206
2016-09-22 22:02:59 +00:00
Matthias Braun 5f8492e2ce MachineScheduler: Slightly simplify release node
llvm-svn: 282201
2016-09-22 21:39:56 +00:00
Matthias Braun 46533e614b MachineScheduler: Remove ineffective heuristic; NFC
Currently all nodes get added to the NextSU list when they are released,
so any candidate must be in that list, making the heuristic ineffective.
Remove it for now, we can add it back later in a working fashion if
necessary.

llvm-svn: 282200
2016-09-22 21:39:52 +00:00
Hans Wennborg c7957ef86c Revert r282168 "GVN-hoist: fix store past load dependence analysis (PR30216)"
and also the dependent r282175 "GVN-hoist: do not dereference null pointers"

It's causing compiler crashes building Harfbuzz (PR30499).

llvm-svn: 282199
2016-09-22 21:20:53 +00:00