Commit Graph

274657 Commits

Author SHA1 Message Date
Benjamin Kramer 24952ce5b9 Create fewer copies of StringMaps. No functionality change intended.
llvm-svn: 316301
2017-10-22 20:16:28 +00:00
Martin Storsjo c5115b9e65 Make HIDDEN_DIRECTIVE a function-like macro. NFCI.
This avoids a hack for making it a no-op for windows.

Also explicitly check for _WIN32 instead of assuming it.

Differential Revision: https://reviews.llvm.org/D39156

llvm-svn: 316300
2017-10-22 19:39:26 +00:00
Benjamin Kramer a7c822a238 [X86] Add missing override. NFC.
llvm-svn: 316299
2017-10-22 19:16:31 +00:00
Sanjay Patel b80daf0b48 [SimplifyCFG] delay switch condition forwarding to -latesimplifycfg
As discussed in D39011:
https://reviews.llvm.org/D39011
...replacing constants with a variable is inverting the transform done
by other IR passes, so we definitely don't want to do this early. 
In fact, it's questionable whether this transform belongs in SimplifyCFG 
at all. I'll look at moving this to codegen as a follow-up step.

llvm-svn: 316298
2017-10-22 19:10:07 +00:00
Fangrui Song dc168722da [utils] Support -mtriple=powerpc64
Summary: test/CodeGen/PowerPC/pr33093.ll uses both powerpc64 (big-endian) and powerpc64le while the former was unsupported.

Subscribers: nemanjai

Differential Revision: https://reviews.llvm.org/D39164

llvm-svn: 316297
2017-10-22 18:43:23 +00:00
Simon Pilgrim ce55eab936 Strip trailing whitespace. NFCI.
llvm-svn: 316296
2017-10-22 18:38:57 +00:00
Marina Yatsina f9371d821f Add logic to greedy reg alloc to avoid bad eviction chains
This fixes bugzilla 26810
https://bugs.llvm.org/show_bug.cgi?id=26810

This is intended to prevent sequences like:
movl %ebp, 8(%esp) # 4-byte Spill
movl %ecx, %ebp
movl %ebx, %ecx
movl %edi, %ebx
movl %edx, %edi
cltd
idivl %esi
movl %edi, %edx
movl %ebx, %edi
movl %ecx, %ebx
movl %ebp, %ecx
movl 16(%esp), %ebp # 4 - byte Reload

Such sequences are created in 2 scenarios:

Scenario #1:
vreg0 is evicted from physreg0 by vreg1
Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from)
Region splitting creates a local interval because of interference with the evictor vreg1 (normally region spliiting creates 2 interval, the "by reg" and "by stack" intervals. Local interval created when interference occurs.)
one of the split intervals ends up evicting vreg2 from physreg1
Evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

Scenario #2
vreg0 is evicted from physreg0 by vreg1
vreg2 is evicted from physreg2 by vreg3 etc
Evictee vreg0 is intended for region splitting with split candidate physreg1
Region splitting creates a local interval because of interference with the evictor vreg1
one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from)
Another evictee vreg2 is intended for region splitting with split candidate physreg1
one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills

As compile time was a concern, I've added a flag to control weather we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest).

Differential Revision: https://reviews.llvm.org/D35816

Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39
llvm-svn: 316295
2017-10-22 17:59:38 +00:00
Craig Topper dac20263a1 [X86] More correctly support LIG and WIG for EVEX instructions in the disassembler tables.
This is similar to how we generate the VEX tables.

More fixes are still needed for the instructions that use EVEX.b (broadcast and embedded rounding).

llvm-svn: 316294
2017-10-22 17:22:29 +00:00
Sanjay Patel 24226504a7 [SimplifyCFG] try harder to forward switch condition to phi (PR34471)
The missed canonicalization/optimization in the motivating test from PR34471 leads to very different codegen:

  int switcher(int x) {
      switch(x) {
      case 17: return 17;
      case 19: return 19;
      case 42: return 42;
      default: break;
      }
      return 0;
    }

  int comparator(int x) {
    if (x == 17) return 17;
    if (x == 19) return 19;
    if (x == 42) return 42;
    return 0;
  }

For the first example, we use a bit-test optimization to avoid a series of compare-and-branch:
https://godbolt.org/g/BivDsw

Differential Revision: https://reviews.llvm.org/D39011

llvm-svn: 316293
2017-10-22 16:51:03 +00:00
Faisal Vali 81b756e6a3 [C++17] Fix PR34970 - tweak overload resolution for class template deduction-guides in line with WG21's p0620r0.
In order to identify the copy deduction candidate, I considered two approaches:
  - attempt to determine whether an implicit guide is a copy deduction candidate by checking certain properties of its subsituted parameter during overload-resolution.
  - using one of the many bits (WillHaveBody) from FunctionDecl (that CXXDeductionGuideDecl inherits from) that are otherwise irrelevant for deduction guides

After some brittle gymnastics w the first strategy, I settled on the second, although to avoid confusion and to give that bit a better name, i turned it into a member of an anonymous union.

Given this identification 'bit', the tweak to overload resolution was a simple reordering of the deduction guide checks (in SemaOverload.cpp::isBetterOverloadCandidate), in-line with Jason Merrill's p0620r0 drafting which made it into the working paper.  Concordant with that, I made sure the copy deduction candidate is always added.


References:
See https://bugs.llvm.org/show_bug.cgi?id=34970 
See http://wg21.link/p0620r0

llvm-svn: 316292
2017-10-22 14:45:08 +00:00
Jan Vesely 7ab2d0bdcd shared: Implement aligned vector stores (vstorea_half)
Float version passes newly posted piglit tests on turks, float and double pass on carrizo.
v2: scalar vstorea_half
v3: fix typo

Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 316291
2017-10-22 14:21:59 +00:00
Jan Vesely 12061c7125 shared: Implement aligned vector loads (vloada_half)
Passes newly posted piglits on turks and carrizo
v2: add scalar vloada_half
v3: fix typo

Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 316290
2017-10-22 14:21:56 +00:00
Momchil Velikov d6a4ab3d49 [ARM] Dynamic stack alignment for 16-bit Thumb
This patch implements dynamic stack (re-)alignment for 16-bit Thumb. When
targeting processors, which support only the 16-bit Thumb instruction set
the compiler ignores the alignment attributes of automatic variables and may
silently generate incorrect code.

Differential revision: https://reviews.llvm.org/D38143

llvm-svn: 316289
2017-10-22 11:56:35 +00:00
Guy Blank 92d5ce3bd4 [X86] Add a pass to convert instruction chains between domains.
The pass scans the function to find instruction chains that define
registers in the same domain (closures).
It then calculates the cost of converting the closure to another domain.
If found profitable, the instructions are converted to instructions in
the other domain and the register classes are changed accordingly.

This commit adds the pass infrastructure and a simple conversion from
the GPR domain to the Mask domain.

Differential Revision:
https://reviews.llvm.org/D37251

Change-Id: Ic2cf1d76598110401168326d411128ae2580a604
llvm-svn: 316288
2017-10-22 11:43:08 +00:00
Nitesh Jain 757f74c2d3 [mips] Adds support for R_MIPS_26, HIGHER, HIGHEST relocations in RuntimeDyld.
Reviewers: sdardis

Subscribers: jaydeep, bhushan, llvm-commits

Differential Revision: https://reviews.llvm.org/D38314

llvm-svn: 316287
2017-10-22 09:47:41 +00:00
Nitesh Jain cf8a5c26f9 [Compiler-rt][MIPS] Fix cross build for XRAY.
Reviewers: dberris, sdardis

Subscribers: jaydeep, bhushan, llvm-commits

Differential Revision: https://reviews.llvm.org/D38021

llvm-svn: 316286
2017-10-22 09:37:50 +00:00
Craig Topper e975127db6 [X86] Teach the disassembler that some instructions use VEX.W==0 without a corresponding VEX.W==1 instruction and we shouldn't treat them as if VEX.W is ignored.
Fixes PR11304.

llvm-svn: 316285
2017-10-22 06:18:26 +00:00
Craig Topper a33846aca6 [X86] Add VEX_WIG to applicable AVX512 instructions.
This should be NFC. Will be used in future patches to fix disassembler bugs.

llvm-svn: 316284
2017-10-22 06:18:23 +00:00
Craig Topper 1bcb0d8a7f [X86] Add VEX_WIG to VROUNDSSrr/VROUNDSSrm/VROUNDSDrr/VROUNDSDrm
llvm-svn: 316283
2017-10-22 06:18:20 +00:00
Craig Topper 158bc6474a [X86] Don't allow gather/scatter to disassembler if memory operand does not use a SIB byte.
Fixes PR34998.

llvm-svn: 316282
2017-10-22 04:32:30 +00:00
Rui Ueyama 53a9aff93e Simplify.
llvm-svn: 316281
2017-10-22 01:58:30 +00:00
Rui Ueyama 95bf509873 Assume that mergeable input sections are smaller than 4 GiB.
By assuming that mergeable input sections are smaller than 4 GiB,
lld's memory usage when linking clang with debug info drops from
2.788 GiB to 2.019 GiB (measured by valgrind, and that does not include
memory space for mmap'ed files). I think that's a reasonable assumption
given such a large RAM savings, so this patch.

According to valgrind, gold needs 3.54 GiB of RAM to do the same thing.

NB: This patch does not introduce a limitation on the size of
output sections. You can still create sections larger than 4 GiB.

llvm-svn: 316280
2017-10-21 23:20:13 +00:00
Aaron Ballman 5b4f81ec14 Reverting r316278 due to failing build bots.
http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/11896
http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/12380

llvm-svn: 316279
2017-10-21 21:52:48 +00:00
Masud Rahman 5b3fa2cd99 [libclang, bindings]: add spelling location
o) Add a 'Location' class that represents the four properties of a
    physical location

 o) Enhance 'SourceLocation' to provide 'expansion' and 'spelling'
    locations, maintaining backwards compatibility with existing code by
    forwarding the four properties to 'expansion'.

 o) Update the implementation to use 'clang_getExpansionLocation'
    instead of the deprecated 'clang_getInstantiationLocation', which
    has been present since 2011.

 o) Update the implementation of 'clang_getSpellingLocation' to actually
    obtain spelling location instead of file location.

llvm-svn: 316278
2017-10-21 20:53:49 +00:00
Simon Pilgrim ab6dbe2b29 Strip trailing whitespace. NFCI.
llvm-svn: 316277
2017-10-21 20:40:49 +00:00
Aaron Ballman fc02869c96 Reverting r316270 due to failing build bots.
http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/12899
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7951

llvm-svn: 316276
2017-10-21 20:38:15 +00:00
Aaron Ballman 6173655639 Fix a typo with -fno-double-square-bracket-attributes and add a test to demonstrate that it works as expected in C++11 mode. Additionally corrected the handling of -fdouble-square-bracket-attributes to be properly passed down to the cc1 option.
llvm-svn: 316275
2017-10-21 20:28:58 +00:00
Simon Pilgrim 3cb024490a [X86][SSE] Add extractps/pextrd equivalence to domain tables
Differential Revision: https://reviews.llvm.org/D39135

llvm-svn: 316274
2017-10-21 20:19:48 +00:00
Craig Topper ca2382d809 [X86] Fix disassembling of EVEX instructions to stop accidentally decoding the SIB index register as an XMM/YMM/ZMM register.
This introduces a new operand type to encode the whether the index register should be XMM/YMM/ZMM. And new code to fixup the results created by readSIB.

This has the nice effect of removing a bunch of code that hard coded the name of every GATHER and SCATTER instruction to map the index type.

This fixes PR32807.

llvm-svn: 316273
2017-10-21 20:03:20 +00:00
Simon Pilgrim cb028c7321 Fix MSVC 'result of 32-bit shift implicitly converted to 64 bits' warning. NFCI.
llvm-svn: 316271
2017-10-21 17:23:04 +00:00
Fangrui Song c7b749bd06 [PPC CodeGen] Fix the bitreverse.i64 intrinsic.
Summary: The two 32-bit words were swapped.

Subscribers: nemanjai, kbarton

Differential Revision: https://reviews.llvm.org/D38705

llvm-svn: 316270
2017-10-21 16:59:40 +00:00
Aaron Ballman 2b3bc4ce2b Add release notes for the recent -fdouble-square-bracket-attributes and -fno-double-square-bracket-attributes compiler flags.
llvm-svn: 316269
2017-10-21 16:45:08 +00:00
Roman Lebedev ca1aaacc32 [Sema] Fixes for enum handling for tautological comparison diagnostics
Summary:
As Mattias Eriksson has reported in PR35009, in C, for enums, the underlying type should
be used when checking for the tautological comparison, unlike C++, where the enumerator
values define the value range. So if not in CPlusPlus mode, use the enum underlying type.

Also, i have discovered a problem (a crash) when evaluating tautological-ness of the following comparison:
```
enum A { A_a = 0 };
if (a < 0) // expected-warning {{comparison of unsigned enum expression < 0 is always false}}
return 0;
```
This affects both the C and C++, but after the first fix, only C++ code was affected.
That was also fixed, while preserving (i think?) the proper diagnostic output.

And while there, attempt to enhance the test coverage.
Yes, some tests got moved around, sorry about that :)

Fixes PR35009

Reviewers: aaron.ballman, rsmith, rjmccall

Reviewed By: aaron.ballman

Subscribers: Rakete1111, efriedma, materi, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D39122

llvm-svn: 316268
2017-10-21 16:44:03 +00:00
Aaron Ballman f45629d2bb Fixing broken attribute documentation for __attribute__((noescape)); a code block was missing and the existing code block was missing a mandatory newline.
llvm-svn: 316267
2017-10-21 16:43:01 +00:00
Craig Topper 8e8b6efdc8 [ValueTracking] Remove unnecessary temporary APInt from computeNumSignBitsVectorConstant.
We can just use getNumSignBits instead of inverting negative numbers.

llvm-svn: 316266
2017-10-21 16:35:41 +00:00
Craig Topper b98ee58511 [ValueTracking] Simplify the known bits code for constant vectors a little.
Neither of these cases really require a temporary APInt outside the loop. For the ConstantDataSequential case the APInt will never be larger than 64-bits so its fine to just call getElementAsAPInt. For ConstantVector we can get the APInt by reference and only make a copy where the inversion is needed.

llvm-svn: 316265
2017-10-21 16:35:39 +00:00
Masud Rahman c739affc43 [bindings] allow null strings in Python 3
Some API calls accept 'NULL' instead of a char array (e.g. the second
argument to 'clang_ParseTranslationUnit').  For Python 3 compatibility,
all strings are passed through 'c_interop_string' which expects to
receive only 'bytes' or 'str' objects.  This change extends this
behavior to additionally allow 'None' to be supplied.

A test case was added which breaks in Python 3, and is fixed by this
change.  All the test cases pass in both, Python 2 and Python 3.

llvm-svn: 316264
2017-10-21 16:13:41 +00:00
Masud Rahman 93135cb89a Test commit
llvm-svn: 316263
2017-10-21 16:03:17 +00:00
Simon Pilgrim 7025b07828 [X86][SSE] Add missing extractps scheduling test
llvm-svn: 316262
2017-10-21 14:35:09 +00:00
David Green 907b60fbba [LoopInterchange] Fix phi node ordering miscompile.
The way that splitInnerLoopHeader splits blocks requires that
the induction PHI will be the first PHI in the inner loop
header. This makes sure that is actually the case when there
are both IV and reduction phis.

Differential Revision: https://reviews.llvm.org/D38682

llvm-svn: 316261
2017-10-21 13:58:37 +00:00
NAKAMURA Takumi f793fa3335 clang-tidy: Fix deps.
llvm-svn: 316260
2017-10-21 11:02:30 +00:00
Florian Hahn b0a263cf94 [SelectionDAG] Use dyn_cast without cast.
llvm-svn: 316258
2017-10-21 05:37:10 +00:00
Florian Hahn 3d81254b7c [SelectionDAG] Use isa to silence unused variable warning (NFC).
llvm-svn: 316257
2017-10-21 04:57:03 +00:00
Craig Topper 554151160f [SelectionDAG] Don't subject ConstantSDNodes to the depth limit in computeKnownBits and ComputeNumSignBits.
We don't need to do any additional recursion, we just need to analyze the APInt stored in the node. This matches what the ValueTracking versions do for IR.

llvm-svn: 316256
2017-10-21 03:22:13 +00:00
Craig Topper 195dad4264 [SelectionDAG] Don't subject ISD:Constant to the depth limit in TargetLowering::SimplifyDemandedBits.
Summary:
We shouldn't recurse any further but it doesn't mean we shouldn't be able to give the known bits for a constant. The caller would probably like that we always return the right answer for a constant RHS. This matches what InstCombine does in this case.

I don't have a test case because this showed up while trying to revive D31724.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D38967

llvm-svn: 316255
2017-10-21 02:27:19 +00:00
Craig Topper fcf27188d7 [X86] Do not generate __multi3 for mul i128 on X86
Summary: __multi3 is not available on x86 (32-bit). Setting lib call name for MULI_128 to nullptr forces DAGTypeLegalizer::ExpandIntRes_MUL to generate instructions for 128-bit multiply instead of a call to an undefined function.  This fixes PR20871 though it may be worth looking at why licm and indvars combine to generate 65-bit multiplies in that test.

Patch by Riyaz V Puthiyapurayil

Reviewers: craig.topper, schweitz

Reviewed By: craig.topper, schweitz

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D38668

llvm-svn: 316254
2017-10-21 02:26:00 +00:00
Eugene Zelenko fce435764e [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316253
2017-10-21 00:57:46 +00:00
Rafael Espindola 849d499e6d Don't call buildSectionOrder multiple times.
This takes linking the linux kernel from 1.52s to 0.58s.

llvm-svn: 316251
2017-10-21 00:05:01 +00:00
Sanjay Patel 0c203d6b6d [CodeGen] add tests for __builtin_sqrt*; NFC
I don't know if this is correct, but this is what we currently do.
More discussion in PR27108 and PR27435 and D27618.

llvm-svn: 316250
2017-10-20 23:32:41 +00:00
George Karpenkov bd4254c692 [Analyzer] Correctly handle parameters passed by reference when bodyfarming std::call_once
Explicitly not supporting functor objects.

Differential Revision: https://reviews.llvm.org/D39031

llvm-svn: 316249
2017-10-20 23:29:59 +00:00