254555 Commits

Author SHA1 Message Date
Aaron Ballman b802b8d75b Correcting several sphinx errors; should fix the LLVM documentation build.
llvm-svn: 294865
2017-02-11 18:45:24 +00:00
Simon Pilgrim 4ef9672f0f [X86][SSE] Add early-out when trying to match blend shuffle. NFCI.
llvm-svn: 294864
2017-02-11 18:06:24 +00:00
Sanjay Patel 63499b61c9 [TargetLowering] check for sign-bit comparisons in SimplifyDemandedBits
I don't know if anything other than x86 vectors is affected by this change, but this may allow 
us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises
from the fact that blendv* instructions only use the sign-bit when deciding which vector element
to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already
exists in x86 lowering; this demanded bits change just enables the transform to fire more often.

The original motivation starts with a bug for DSE of masked stores that seems completely unrelated, 
but I've explained the likely steps in this series here:
https://llvm.org/bugs/show_bug.cgi?id=11210

Differential Revision: https://reviews.llvm.org/D29687

llvm-svn: 294863
2017-02-11 18:01:55 +00:00
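
The sign-bit observation in r294863 can be illustrated with a small scalar model. This is a sketch only, not the LLVM implementation: `blendvModel` and `signCompareMask` are hypothetical names, and the four-lane `int32_t` vector is an assumption made for illustration. Because the select reads only the sign bit of each mask lane, a mask produced by an `x < 0` compare selects the same lanes as `x` itself, so the compare is redundant.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<int32_t, 4>;

// Hypothetical scalar model of a blendv-style select: for each lane, take
// the element from `b` when the sign bit of the mask lane is set, and the
// element from `a` otherwise. Only bit 31 of each mask lane is inspected.
Vec4 blendvModel(const Vec4 &a, const Vec4 &b, const Vec4 &mask) {
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = (static_cast<uint32_t>(mask[i]) >> 31) ? b[i] : a[i];
  return out;
}

// Hypothetical model of a vector "x < 0" compare: each lane becomes
// all-ones (-1) when the input lane is negative and all-zeros otherwise.
Vec4 signCompareMask(const Vec4 &x) {
  Vec4 m{};
  for (std::size_t i = 0; i < m.size(); ++i)
    m[i] = x[i] < 0 ? -1 : 0;
  return m;
}
```

In this model, `blendvModel(a, b, signCompareMask(x))` and `blendvModel(a, b, x)` produce the same vector for any `x`, which is the kind of redundancy a demanded-bits simplification can remove.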
Aaron Ballman 5c092a3438 Hopefully fixes a compile error introduced by r294861.
llvm-svn: 294862
2017-02-11 18:00:32 +00:00
Aaron Ballman 3dcb85b01f Attributes on K&R C functions should not cause incompatible function type with a redeclaration having the same attribute. Fixing this introduced a secondary problem where we were assuming that K&R functions could not be attributed types when reporting old-style function definitions that are not preceded by a prototype.
This patch fixes PR31020.

llvm-svn: 294861
2017-02-11 17:49:53 +00:00
Amaury Sechet 9df26d330f Fix typo in test filename. NFC
llvm-svn: 294860
2017-02-11 17:48:49 +00:00
Amaury Sechet 58ce15aba1 Fix indentation in X86ISelLowering. NFC
llvm-svn: 294859
2017-02-11 17:48:48 +00:00
Craig Topper 255343483d [AVX-512] Add VPMINS/MINU/MAXS/MAXU instructions to load folding tables.
llvm-svn: 294858
2017-02-11 17:35:28 +00:00
Craig Topper b2fa216dd5 [X86] Improve alphabetizing of load folding tables. NFC
llvm-svn: 294857
2017-02-11 17:35:25 +00:00
Simon Pilgrim 0e6945e48a [X86][SSE] Convert getTargetShuffleMaskIndices to use getTargetConstantBitsFromNode.
Removes duplicate constant extraction code in getTargetShuffleMaskIndices.

getTargetConstantBitsFromNode - adds support for VZEXT_MOVL(SCALAR_TO_VECTOR) and fails if the caller doesn't support undef bits.

llvm-svn: 294856
2017-02-11 17:27:21 +00:00
Saleem Abdulrasool 5b1f0edf2d docs: update docs for objc_storeStrong behaviour
objc_storeStrong does not return a value.

llvm-svn: 294855
2017-02-11 17:24:09 +00:00
Saleem Abdulrasool e60561c073 CodeGen: rename variables to adhere to naming convention
Adjust style before making more intrusive changes.  NFC.

llvm-svn: 294854
2017-02-11 17:24:07 +00:00
Saleem Abdulrasool b893ed26ec Sema: simplify conditional execution (NFC)
The conditional cast is unnecessary since we know that it will always
succeed.  NFC.

llvm-svn: 294853
2017-02-11 17:24:04 +00:00
Simon Pilgrim d59fa0e38a [X86] Merge repeated getScalarValueSizeInBits calls. NFCI.
llvm-svn: 294852
2017-02-11 16:42:07 +00:00
Daniel Berlin 22a4a01ffa NewGVN: Reverse sense of this test to make it clearer
llvm-svn: 294851
2017-02-11 15:20:15 +00:00
Daniel Berlin 1529bb93c9 NewGVN: Add missing initialization of NumFuncArgs lost due to bad merge.
llvm-svn: 294850
2017-02-11 15:13:49 +00:00
Daniel Berlin 1c08767f88 NewGVN: Rank and order commutative operands consistently.
llvm-svn: 294849
2017-02-11 15:07:01 +00:00
Simon Pilgrim 86a95c1ff7 [X86][3DNow!] Add tests to ensure PFMAX/PFMIN are not commuted.
llvm-svn: 294848
2017-02-11 14:01:37 +00:00
Simon Pilgrim 6411a0ebed [X86][3DNow!] Enable PFSUB<->PFSUBR commutation
llvm-svn: 294847
2017-02-11 13:51:14 +00:00
Simon Pilgrim 4ead1d4aa9 [X86][3DNow!] Enable commutation for PFADD/PFMUL/PFCMPEQ/PAVGUSB/PMULHRW
All commutations are confirmed to give identical results; note that PFMAX/PFMIN do not.

PFSUB<->PFSUBR should be commutable as well

llvm-svn: 294846
2017-02-11 13:32:55 +00:00
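
The PFSUB<->PFSUBR remark in r294846 can be sketched with a scalar model of the packed two-float operations. The function names below are hypothetical, and the model assumes the usual operand convention (PFSUB computes destination minus source, PFSUBR computes source minus destination); it is not taken from the LLVM sources.

```cpp
#include <array>

using V2F = std::array<float, 2>;

// Hypothetical model of PFADD: lane-wise dst + src.
V2F pfaddModel(const V2F &dst, const V2F &src) {
  return {dst[0] + src[0], dst[1] + src[1]};
}

// Hypothetical model of PFSUB: lane-wise dst - src.
V2F pfsubModel(const V2F &dst, const V2F &src) {
  return {dst[0] - src[0], dst[1] - src[1]};
}

// Hypothetical model of PFSUBR (reversed subtract): lane-wise src - dst.
V2F pfsubrModel(const V2F &dst, const V2F &src) {
  return {src[0] - dst[0], src[1] - dst[1]};
}
```

Under these assumptions, `pfaddModel(a, b)` equals `pfaddModel(b, a)`, so the operands can simply be swapped. `pfsubModel(a, b)` generally differs from `pfsubModel(b, a)`, but it equals `pfsubrModel(b, a)`, so swapping the operands is valid as long as the opcode changes too.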
Simon Pilgrim 6b4a5134af [X86][3DNow!] Add tests showing missed commutation opportunities.
llvm-svn: 294845
2017-02-11 13:00:32 +00:00
Daniel Berlin b79f53669a NewGVN: Clean up how we handle the INITIAL class so that everything in
it is dead or unreachable, as it should be.
This also makes the leader of INITIAL undef, enabling us to handle
irreducibility properly.

Summary:
This lets us verify, more than we do now, that we didn't screw up
value numbering.

Reviewers: davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D29842

llvm-svn: 294844
2017-02-11 12:48:50 +00:00
Vitaly Buka bcb6622c95 Fix "left shift of negative value -1" introduced by r294805
llvm-svn: 294843
2017-02-11 12:44:03 +00:00
Vitaly Buka d8230247c9 This reverts commits r294826 and r294781 as they break linking on powerpc.
Revert "Fix -Wsign-compare - this might not be quite right, but preserves behavior"
Revert "[XRay] Implement powerpc64le xray."

This reverts commit r294826.
This reverts commit r294781.

llvm-svn: 294842
2017-02-11 12:34:27 +00:00
Simon Pilgrim 8158816efe [X86][XOP] Regenerate XOP commutation tests.
Added 32-bit tests as well.

llvm-svn: 294841
2017-02-11 12:30:59 +00:00
Simon Pilgrim 008ba63e04 [X86][SSE] Regenerate float comparison commutation tests.
llvm-svn: 294840
2017-02-11 12:29:56 +00:00
Simon Pilgrim 0d8632f089 [X86] Regenerate CLMUL commutation tests.
llvm-svn: 294839
2017-02-11 12:23:22 +00:00
Benjamin Kramer 357c9e1a4b Make helpers static. NFC.
llvm-svn: 294838
2017-02-11 12:21:17 +00:00
Benjamin Kramer efcf06f5f2 Move symbols from the global namespace into (anonymous) namespaces. NFC.
llvm-svn: 294837
2017-02-11 11:06:55 +00:00
Roman Gareev b196055c0c Check reduction dependencies in case of the matrix multiplication optimization
To determine parameters of the matrix multiplication, we check RAW dependencies
that can be expressed using only reduction dependencies. Consequently, we
should check the reduction dependencies, if this is the case.

Reviewed-by: Tobias Grosser <tobias@grosser.es>,
             Sven Verdoolaege <skimo-polly@kotnet.org>
             Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D29814

llvm-svn: 294836
2017-02-11 09:59:09 +00:00
Roman Gareev de69293b01 [FIX] Fix the potential issue of containsOnlyMatMulDep.
llvm-svn: 294835
2017-02-11 09:48:09 +00:00
Roman Gareev 5ef7e210c0 [NFC] Fix the style issue of lib/Transform/ScheduleOptimizer.cpp.
llvm-svn: 294834
2017-02-11 08:43:41 +00:00
Ed Schouten 252da3b3b4 Remove a now unneeded __CloudABI__ check.
CloudABI has gained the setlocale() function in the meantime, meaning
there is no longer a need to conditionalize this.

llvm-svn: 294833
2017-02-11 08:33:16 +00:00
Ed Schouten de5669e46c Fix the build of thread.cpp on CloudABI.
CloudABI does provide unistd.h, but doesn't define __unix__. We need to
include this header file to make hardware_concurrency work.

llvm-svn: 294832
2017-02-11 08:30:18 +00:00
Roman Gareev afcf026d81 [NFC] Fix style issues of lib/Transform/ScheduleOptimizer.cpp.
llvm-svn: 294831
2017-02-11 07:14:37 +00:00
Craig Topper 1f6153bab4 [AVX-512] Add VPINSRB/W/D/Q instructions to load folding tables.
llvm-svn: 294830
2017-02-11 07:01:40 +00:00
Craig Topper a9818aadab [AVX-512] Fix apparent typo in instruction name VMOVSSDrr_REV->VMOVSDZrr_REV.
llvm-svn: 294829
2017-02-11 07:01:38 +00:00
Roman Gareev 3d4eae31ea Use the size of the widest type of the matrix multiplication operands
The size of the operand type is one of the parameters required
to determine the BLIS micro-kernel. If the matrix multiplication
operands have several different types, we use the size of the
widest one.

Reviewed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D29269

llvm-svn: 294828
2017-02-11 07:00:05 +00:00
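
The selection rule described in r294828 amounts to taking a maximum over the operand element sizes. A minimal sketch, assuming the sizes are already available in bits; `widestElementSizeInBits` is a hypothetical helper, not a Polly function:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: given the element sizes (in bits) of the matrix
// multiplication operands, return the size of the widest one. The caller
// must pass at least one size.
unsigned widestElementSizeInBits(const std::vector<unsigned> &sizesInBits) {
  assert(!sizesInBits.empty() && "expected at least one operand");
  return *std::max_element(sizesInBits.begin(), sizesInBits.end());
}
```

For operands with mixed 32-bit and 64-bit element types, for example, this returns 64, and that value would then feed the micro-kernel parameter computation.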
Craig Topper 3afa777f10 [AVX-512] Add VPSADBW instructions to load folding tables.
llvm-svn: 294827
2017-02-11 06:24:03 +00:00
David Blaikie 9730789ae6 Fix -Wsign-compare - this might not be quite right, but preserves behavior
llvm-svn: 294826
2017-02-11 06:07:59 +00:00
Evgeny Stupachenko 5f3d9b6c09 The patch fixes r294821
Summary:
Update register match for windows testing

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 294825
2017-02-11 05:39:00 +00:00
Craig Topper 464b8cb244 [X86] Don't base domain decisions on VEXTRACTF128/VINSERTF128 if only AVX1 is available.
Seems the execution dependency pass likes to use FP instructions when most of the consuming code is integer if a vextractf128 instruction produced the register. Without AVX2 we don't have the corresponding integer instruction available.

This patch suppresses the domain on these instructions to GenericDomain if AVX2 is not supported so that they are ignored by domain fixing. If AVX2 is supported we'll report the correct domain and allow them to switch between integer and fp.

Overall I think this produces better results in the modified test cases.

llvm-svn: 294824
2017-02-11 05:32:57 +00:00
David Blaikie a67cf0001f Fix memory leak by using unique_ptr
llvm-svn: 294823
2017-02-11 05:25:21 +00:00
Peter Collingbourne fa3175f2f6 Address Mehdi's post-commit review comments on r294795.
llvm-svn: 294822
2017-02-11 03:19:22 +00:00
Evgeny Stupachenko fe6f548d2d Fix PR23384 (under "-lsr-insns-cost" option)
Summary:
The patch adds the number of instructions generated by a solution
 to the LSR cost under the "-lsr-insns-cost" option.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D28307

From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 294821
2017-02-11 02:57:43 +00:00
Benjamin Kramer a05bdf75c0 Update XFAIL line after r294781.
llvm-svn: 294820
2017-02-11 02:00:03 +00:00
Ahmed Bougacha 8425f453ef [ARM] Make f16 interleaved accesses expensive.
There are no vldN/vstN f16 variants, even with +fullfp16.
We could use the i16 variants, but, in practice, even with +fullfp16,
the f16 sequence leading to the i16 shuffle usually gets scalarized.
We'd need to improve our support for f16 codegen before getting there.

Teach the cost model to consider f16 interleaved operations as
expensive.  Otherwise, we are all but guaranteed to end up with
a large block of scalarized vector code.

llvm-svn: 294819
2017-02-11 01:53:04 +00:00
Ahmed Bougacha fc979dc9dd [ARM] Don't lower f16 interleaved accesses.
There are no vldN/vstN f16 variants, even with +fullfp16.
We could use the i16 variants, but, in practice, even with +fullfp16,
the f16 sequence leading to the i16 shuffle usually gets scalarized.
We'd need to improve our support for f16 codegen before getting there.

Reject f16 interleaved accesses.  If we try to emit the f16 intrinsics,
we'll just end up with a selection failure.

llvm-svn: 294818
2017-02-11 01:53:00 +00:00
Ahmed Bougacha f37fb89edc [ARM] Unique some redundant CHECK lines. NFC.
llvm-svn: 294817
2017-02-11 01:52:57 +00:00
Rafael Espindola 08d6a3f133 Create only one section symbol per section.
Unfortunately, some consumers of our .o files produced with -r expect
only one section symbol per section. That is true of at least Go's
own linker.

Combining them is a somewhat convoluted process. We have to create a
symbol for every section since we don't know which ones will be
needed. The relocation sections also have to be written first to
handle the Elf_Rel addend.

I did consider a completely different approach:

We could remove the -r special case of relocation sections when
reading. We would instead have a copyRelocs function that is used
instead of scanRelocs. It would create a DynamicReloc for each
relocation and a RelocationSection for each input relocation section.

A complication of such a change is that DynamicReloc would have to take
a section index and an input section instead of a symbol, since with
-emit-relocs some DynamicRelocs would hold relocations referring to the
dynamic symbol table and others to the static symbol table.

That would be a pretty big change, and if we do it, it is probably
better to do it as a refactoring.

llvm-svn: 294816
2017-02-11 01:40:49 +00:00