Commit Graph

273145 Commits

Author SHA1 Message Date
Rafael Espindola 0038dfa19d Revert "Revert r314810: Use sched_getaffinity instead of std:🧵:hardware_concurrency."
This reverts commit r314924.

The required llvm patch was recommitted.

llvm-svn: 314933
2017-10-04 20:35:05 +00:00
Yaxun Liu 10712d9203 [OpenCL] Clean up and add missing fields for block struct
Currently block is translated to a structure equivalent to

struct Block {
  void *isa;
  int flags;
  int reserved;
  void *invoke;
  void *descriptor;
};
Except invoke, which is the pointer to the block invoke function,
all other fields are useless for OpenCL, which clutter the IR and
also waste memory since the block struct is passed to the block
invoke function as argument.

On the other hand, the size and alignment of the block struct is
not stored in the struct, which causes difficulty to implement
__enqueue_kernel as library function, since the library function
needs to know the size and alignment of the argument which needs
to be passed to the kernel.

This patch removes the useless fields from the block struct and adds
size and align fields. The equivalent block struct will become

struct Block {
  int size;
  int align;
  generic void *invoke;
 /* custom fields */
};
It also changes the pointer to the invoke function to be
a generic pointer since the address space of a function
may not be private on certain targets.

Differential Revision: https://reviews.llvm.org/D37822

llvm-svn: 314932
2017-10-04 20:32:17 +00:00
Rafael Espindola 8c0ff9508d Bring r314809 back.
But now include a check for CPU_COUNT so we still build on 10 year old
versions of glibc.

Original message:

Use sched_getaffinity instead of std:🧵:hardware_concurrency.

The issue with std:🧵:hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.

With this change a llvm program that can execute in only 2 cores will
use 2 threads, even if the machine has 32 cores.

This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.

llvm-svn: 314931
2017-10-04 20:27:01 +00:00
Sanjay Patel 4c33d5213b [SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI
This is a follow-up to https://reviews.llvm.org/D38138. 

I fixed the capitalization of some functions because we're changing those
lines anyway and that helped verify that we weren't accidentally dropping 
any options by using default param values.

llvm-svn: 314930
2017-10-04 20:26:25 +00:00
Leonard Mosescu 63ed8c6c2e LLDB cmake fix: define LLDB_CONFIGURATION_xxx based on the build type
Neither LLDB_CONFIGURATION_DEBUG nor LLDB_CONFIGURATION_RELEASE were ever set in the CMake LLDB project.

Also cleaned up a questionable #ifdef in SharingPtr.h, removing all the references to LLDB_CONFIGURATION_BUILD_AND_INTEGRATION in the process.

Differential Revision: https://reviews.llvm.org/D38552

llvm-svn: 314929
2017-10-04 20:23:56 +00:00
Xinliang David Li 7a73757358 Recommit r314561 after fixing msan build failure
(trial 2) Incoming val defined by terminator instruction which
also requires bitcasts can not be handled.

llvm-svn: 314928
2017-10-04 20:17:55 +00:00
Guozhi Wei eb301875b8 [TargetTransformInfo] Check if function pointer is valid before calling isLoweredToCall
Function isLoweredToCall can only accept non-null function pointer, but a function pointer can be null for indirect function call. So check it before calling isLoweredToCall from getInstructionLatency.

Differential Revision: https://reviews.llvm.org/D38204

llvm-svn: 314927
2017-10-04 20:14:08 +00:00
Sumanth Gundapaneni 2d9aa7432f [Hexagon] Move getHexagonTargetFeatures to Hexagon.cpp (NFC)
Differential Revision: https://reviews.llvm.org/D38548

llvm-svn: 314926
2017-10-04 19:09:29 +00:00
Jeroen Ketema feefb0870f Add vstore_half helpers for ptx
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 314925
2017-10-04 19:07:48 +00:00
Rui Ueyama 61b9ce217a Revert r314810: Use sched_getaffinity instead of std:🧵:hardware_concurrency.
This reverts commit r314810 because r314809 was reverted.

llvm-svn: 314924
2017-10-04 18:39:51 +00:00
Jun Bum Lim d40e03c2d8 Recommit : Use the basic cost if a GEP is not used as addressing mode
Recommitting r314517 with the fix for handling ConstantExpr.

Original commit message:
  Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing
  mode in the target. However, since it doesn't check its actual users, it will
  return FREE even in cases where the GEP cannot be folded away as a part of
  actual addressing mode. For example, if an user of the GEP is a call
  instruction taking the GEP as a parameter, then the GEP may not be folded in
  isel.

llvm-svn: 314923
2017-10-04 18:33:52 +00:00
Daniel Neilson bef94bcbae Revert D38481 due to missing cmake check for CPU_COUNT
Summary:
This reverts D38481. The change breaks systems with older versions of glibc. It
injects a use of CPU_COUNT() from sched.h without checking to ensure that the
function exists first.

Reviewers:

Subscribers:

llvm-svn: 314922
2017-10-04 18:19:03 +00:00
Simon Pilgrim 9edbe110e8 [X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for v8i64/v8f64 512-bit vector compare results.
AVX1/AVX2 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts

llvm-svn: 314921
2017-10-04 18:00:42 +00:00
Krzysztof Parzyszek 4697ddeea4 [Hexagon] Add a member Subtarget to HexagonInstrInfo, NFC
llvm-svn: 314920
2017-10-04 18:00:15 +00:00
Hans Wennborg 2a6c9adb2f Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"
It broke the Chromium / SQLite build; see PR34830.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>          T1 = A + B
>          T2 = T1 + 10
>          T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>          Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>          leal 1(%rax,%rcx,1), %rdx
>          leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>          leal 1(%rax,%rcx,1), %rdx
>          leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
>     Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314919
2017-10-04 17:54:06 +00:00
Jake Ehrlich 084400bad9 [llvm-objcopy] Fix major layout bugs in llvm-objcopy
Somehow a few massive errors slipped though the cracks of testing.

1. The code in Segment::finalize was left over from the old layout
algorithm. In certain situations this would cause very strange issues
with segment layout. For instance in the shift-segments.test case it
would cause the second segment to have the same offset as the first.

2. In debugging this I discovered another issue. Namely section alignment
was not being computed based on Section->Align but instead
Section->Offset which is bizarre and makes no sense. I have no clue how
it worked in the first place. This issue is also fixed

3. Fixing #2 exposed a bug where things were not being written past the end
of the file that technically should have been. This was because in
certain cases (like overlapping-segments) the end of the file wouldn't
always be bumped if the offset could be chosen relative to an existing
segment that already had it's offset chosen. For fully nested segments
this is fine but for overlapping segments this leaves the end of the
file short. So I changed how the offset is bumped when looping though
segments.

Differential Revision: https://reviews.llvm.org/D38436

llvm-svn: 314918
2017-10-04 17:44:42 +00:00
Jakub Kuderski 68fc036e1d [Dominators] Take fast path when applying <=1 updates
Summary:
This patch teaches `DT.applyUpdates` to take the fast when applying zero or just one update and makes it not run the internal batch updater machinery.

With this patch, it should no longer make sense to have a special check in user's code that checks the update sequence size before applying them, e.g.
```
if (!MyUpdates.empty())
  DT.applyUpdates(MyUpdates);
```
or
```
if (MyUpdates.size() == 1)
  if (...)
    DT.insertEdge(...)
  else
    DT.deleteEdge(...)
```

Reviewers: dberlin, brzycki, davide, grosser, sanjoy

Reviewed By: dberlin, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38541

llvm-svn: 314917
2017-10-04 17:32:55 +00:00
Simon Pilgrim b47b3f2564 [X86][SSE] Add support for lowering v8i16 binary shuffles to PACKSS/PACKUS
Missed in D38472

llvm-svn: 314916
2017-10-04 17:31:28 +00:00
Francis Ricci 954b94f5df [test] Fix append_path in the empty case
Summary:
normpath() was being called on an empty string and appended to
the environment variable in the case where the environment variable
was unset. This led to ":." being appended to the path, since
normpath() of an empty string is '.', presumably to represent cwd.

Reviewers: zturner, sqlbyme, modocache

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38542

llvm-svn: 314915
2017-10-04 17:30:28 +00:00
Craig Topper 6fb55716e9 [X86] Redefine MOVSS/MOVSD instructions to take VR128 regclass as input instead of FR32/FR64
This patch redefines the MOVSS/MOVSD instructions to take VR128 as its second input. This allows the MOVSS/SD->BLEND commute to work without requiring a COPY to be inserted.

This should fix PR33079

Overall this looks to be an improvement in the generated code. I haven't checked the EXPENSIVE_CHECKS build but I'll do that and update with results.

Differential Revision: https://reviews.llvm.org/D38449

llvm-svn: 314914
2017-10-04 17:20:12 +00:00
Jonas Toth 5a09996ff7 [clang-tidy] Emit note for variable declaration that are later deleted
This patch introduces a note for variable declaration that are later deleted.
Adds FIXME notes for possible automatic type-rewriting positions as well.

Reviewed by aaron.ballman
Differential: https://reviews.llvm.org/D38411

llvm-svn: 314913
2017-10-04 16:49:20 +00:00
Balaram Makam 45ff83ce1e "[ARM] Mark flaky test MachineBranchProb.ll unsupported again for ARM/AArch64"
r314857 changed the CFG that resulted in the flaky test MachineBranchProb.ll to
fail the bots again. Marking it as unsupported for ARM/AArch64 again until we
find the cause.

llvm-svn: 314912
2017-10-04 16:45:24 +00:00
Yonghong Song 09b01b3555 bpf: fix an insn encoding issue for neg insn
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 314911
2017-10-04 16:11:52 +00:00
Artem Dergachev 9445b89c49 [analyzer] Fix autodetection of binding types.
In ProgramState::getSVal(Location, Type) API which dereferences a pointer value,
when the optional Type parameter is not supplied and the Location is not typed,
type should have been guessed on a best-effort basis by inspecting the Location
more deeply. However, this never worked; the auto-detected type was instead
a pointer type to the correct type.

Fixed the issue and added various test cases to demonstrate which parts of the
analyzer were affected (uninitialized pointer argument checker, C++ trivial copy
modeling, Google test API modeling checker).

Additionally, autodetected void types are automatically replaced with char,
in order to simplify checker APIs. Which means that if the location is a void
pointer, getSVal() would read the first byte through this pointer
and return its symbolic value.

Fixes pr34305.

Differential Revision: https://reviews.llvm.org/D38358

llvm-svn: 314910
2017-10-04 15:59:40 +00:00
Adam Nemet 6c381b7a2e [OptRemark] Move YAML writing to IR
Before the patch this was in Analysis.  Moving it to IR and making it implicit
part of LLVMContext::diagnose allows the full opt-remark facility to be used
outside passes e.g. the pass manager.  Jessica is planning to use this to
report function size after each pass.  The same could be used for time
reports.

Tested with BUILD_SHARED_LIBS=On.

llvm-svn: 314909
2017-10-04 15:18:11 +00:00
Adam Nemet f1bea0a6ab Also update MachineORE after r314874.
llvm-svn: 314908
2017-10-04 15:18:07 +00:00
Sanjay Patel 9366b9c53b [InstCombine] add 'exact' variants of all tests; NFC
We can likely remove most of these as redundant in the near future, 
but I'm trying to make sure I don't introduce any regressions with D38514.

llvm-svn: 314907
2017-10-04 15:17:25 +00:00
Clement Courbet 98eaa88357 [NFC] clang-format lib/Transforms/Scalar/MergeICmps.cpp
llvm-svn: 314906
2017-10-04 15:13:52 +00:00
Carlo Bertolli ba1487ba69 [OpenMP] Initial implementation of teams distribute code generation
https://reviews.llvm.org/D38371

This patch implements codegen for the combined 'teams distribute" OpenMP pragma and adds regression tests for all its clauses.

llvm-svn: 314905
2017-10-04 14:12:09 +00:00
Jonas Hahnfeld cbbcd7fc1b [test] Pass in fixed triple for openmp-offload.c
This should fix the test on other architectures.

Related to: https://reviews.llvm.org/D38372

llvm-svn: 314904
2017-10-04 13:54:09 +00:00
Simon Pilgrim 46a366ccb7 [X86][SSE] Early out from ComputeNumSignBitsForTargetNode. NFCI.
Early out from vector shift by immediates that will exceed eltsize - don't bother making an unnecessary ComputeNumSignBits recursive call.

llvm-svn: 314903
2017-10-04 13:41:26 +00:00
Jonas Hahnfeld bbf56fb621 [OpenMP] Fix passing of -m arguments correctly
The recent fix in D38258 was wrong: getAuxTriple() only returns
non-null values for the CUDA toolchain. That is why the now added
test for PPC and X86 failed.

Differential Revision: https://reviews.llvm.org/D38372

llvm-svn: 314902
2017-10-04 13:32:59 +00:00
Simon Pilgrim bd5d2f0284 [X86][SSE] Add support for lowering unary shuffles to PACKSS/PACKUS
Extension to D38472

llvm-svn: 314901
2017-10-04 13:12:08 +00:00
Michael Kruse 482d3f41e5 [ScopBuilder] Introduce -polly-stmt-granularity option. NFC.
The option is introduced with only one possible value
-polly-stmt-granularity=bb which represents the current behaviour, which
is outlined into the new function buildSequentialBlockStmts().

More options will be added in future commits.

llvm-svn: 314900
2017-10-04 12:18:57 +00:00
George Rimar 3040da03e0 [gold-plugin] - Fix compilation after LLVM update (r314883). NFC.
llvm-svn: 314899
2017-10-04 11:00:30 +00:00
Dylan McKay 8dd702c1cd [AVR] Implement LPMWRdZ pseudo-instruction's expansion.
FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should
refactor a bit and unify the two

Patch by Gerdo Erdi.

llvm-svn: 314898
2017-10-04 10:37:22 +00:00
Dylan McKay 3f71f1c91e [AVR] Factor out mayLoad in tablegen patterns
Patch by Gergo Erdi.

llvm-svn: 314897
2017-10-04 10:36:07 +00:00
Dylan McKay d00f9c1ef1 [AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X`
Patch by Gergo Erdi.

llvm-svn: 314896
2017-10-04 10:33:36 +00:00
Alexander Kornienko 9ad43a1cf1 Fix assertion failure in thread safety analysis (PR34800).
Summary:
Fix an assertion failure (http://llvm.org/PR34800) and clean up unused code relevant to the fixed logic.

A bit of context: when `SExprBuilder::translateMemberExpr` is called on a member expression that involves a conversion operator, for example, `til::Project` constructor can't just call `getName()` on it, since the name is not a simple identifier. In order to handle this case I've introduced an optional string to print the member name to. I discovered that the other two `til::Project` constructors are not used, so it was better to delete them instead of ensuring they work correctly with the new logic.

Reviewers: aaron.ballman

Reviewed By: aaron.ballman

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D38458

llvm-svn: 314895
2017-10-04 10:24:36 +00:00
Tobias Grosser c52b71db15 [GPGPU] Make sure escaping invariant load hoisted scalars are preserved
We make sure that the final reload of an invariant scalar memory access uses the
same stack slot into which the invariant memory access was stored originally.
Earlier, this was broken as we introduce a new stack slot aside of the preload
stack slot, which remained uninitialized and caused our escaping loads to
contain garbage. This happened due to us clearing the pre-populated values
in EscapeMap after kernel code generation. We address this issue by preserving
the original host values and restoring them after kernel code generation.
EscapeMap is not expected to be used during kernel code generation, hence we
clear it during kernel generation to make sure that any unintended uses are
noticed.

llvm-svn: 314894
2017-10-04 10:24:23 +00:00
Dylan McKay 39069208d5 [AVR] Insert JMP for long branches
Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.

Patch by Thomas Backman.

llvm-svn: 314891
2017-10-04 09:51:28 +00:00
Dylan McKay c4b002bf5a [AVR] Fix displacement overflow for LDDW/STDW
In some cases, the code generator attempts to generate instructions such as:

lddw r24, Y+63

which expands to:

ldd r24, Y+63
ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary

This commit limits the first offset to 62, and thus the second to 63.
It also updates some asserts in AVRExpandPseudoInsts.cpp, including for
INW and OUTW, which appear to be unused.

Patch by Thomas Backman.

llvm-svn: 314890
2017-10-04 09:51:21 +00:00
George Rimar 11632a9cfb [ELF] - Get rid of precompiled input objects from testcases.
We have verneed1.so, verneed2.so files and verneed.so.sh script
to produce them. They were committed long time ago when LLD
was not yet able to produce some sections for versioning
(".gnu.version_r" I think).

There is no point to have them as binaries anymore. Patch
creates asm inputs instead based on verneed.so.sh content.

Differential revision: https://reviews.llvm.org/D38505

llvm-svn: 314889
2017-10-04 09:46:53 +00:00
Oliver Stannard 878216dd05 [ARM] Add diag string for movw/movt immediates in assembly
This adds diagnostics for invalid immediate operands to the MOVW and MOVT
instructions (ARM and Thumb).

Differential revision: https://reviews.llvm.org/D31879

llvm-svn: 314888
2017-10-04 09:24:54 +00:00
Oliver Stannard 5a7aae3a80 [ARM, Asm] Change grammar of immediate operand diagnostics
Currently, our diagnostics for assembly operands are not consistent.
Some start with (for example) "immediate operand must be ...",
and some with "operand must be an immediate ...". I think the latter
form is preferable for a few reasons:
* It's unambiguous that it is referring to the expected type of operand, not
  the type the user provided. For example, the user could provide an register
  operand, and get a message taking about an operand is if it is already an
  immediate, just not in the accepted range.
* It allows us to have a consistent style once we add diagnostics for operands
  that could take two forms, for example a label or pc-relative memory operand.

Differential revision: https://reviews.llvm.org/D36689

llvm-svn: 314887
2017-10-04 09:18:07 +00:00
Jatin Bhateja 3c29bacd43 [X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
         T1 = A + B
         T2 = T1 + 10
         T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
         Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
         leal 1(%rax,%rcx,1), %rdx
         leal 1(%rax,%rcx,2), %rcx
       will be factored as following
         leal 1(%rax,%rcx,1), %rdx
         leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy

Reviewed By: lsaba

Subscribers: jmolloy, spatel, igorb, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 314886
2017-10-04 09:02:10 +00:00
Sean Eveson ea9dceed77 [llvm-cov] Fix showing title when filtering and not outputting to a directory
Differential Revision: https://reviews.llvm.org/D38507

llvm-svn: 314885
2017-10-04 08:54:37 +00:00
George Rimar ed906d8e63 [ELF] - Update after LLVM r314883 change. NFC.
llvm-svn: 314884
2017-10-04 08:50:34 +00:00
George Rimar 099960d322 [MC] - Don't assert when non-english characters are used.
I found that llvm-mc does not like non-english characters even in comments,
which it tries to tokenize.

Problem happens because of functions like isdigit(), isalnum() which takes
int argument and expects it is not negative.
But at the same time MCParser uses char* to store input buffer poiner, char has signed value,
so it is possible to pass negative value to one of functions from above and
that triggers an assert. 
Testcase for demonstration is provided.

To fix the issue helper functions were introduced in StringExtras.h

Differential revision: https://reviews.llvm.org/D38461

llvm-svn: 314883
2017-10-04 08:50:08 +00:00
Mikael Holmen a1a3f5c5e6 Recommit [UnreachableBlockElim] Use COPY if PHI input is undef
This time invoking llc with "-march=x86-64" in the testcase, so we don't assume
the default target is x86.

Summary:
If we have

    %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
    %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

llvm-svn: 314882
2017-10-04 07:42:45 +00:00