Commit Graph

359890 Commits

Author SHA1 Message Date
Ulrich Weigand 4c5a93bd58 [ABI] Handle C++20 [[no_unique_address]] attribute
Many platform ABIs have special support for passing aggregates that
either just contain a single member of floatint-point type, or else
a homogeneous set of members of the same floating-point type.

When making this determination, any extra "empty" members of the
aggregate type will typically be ignored.  However, in C++ (at least
in all prior versions), no data member would actually count as empty,
even if it's type is an empty record -- it would still be considered
to take up at least one byte of space, and therefore make those ABI
special cases not apply.

This is now changing in C++20, which introduced the [[no_unique_address]]
attribute.  Members of empty record type, if they also carry this
attribute, now do *not* take up any space in the type, and therefore
the ABI special cases for single-element or homogeneous aggregates
should apply.

The C++ Itanium ABI has been updated accordingly, and GCC 10 has
added support for this new case.  This patch now adds support to
LLVM.  This is cross-platform; it affects all platforms that use
the single-element or homogeneous aggregate ABI special case and
implement this using any of the following common subroutines
in lib/CodeGen/TargetInfo.cpp:
  isEmptyField
  isEmptyRecord
  isSingleElementStruct
  isHomogeneousAggregate
2020-07-10 14:01:05 +02:00
Simon Pilgrim b69e0f674f DomTreeUpdater::dump() - use const auto& iterator in for-range-loop.
Avoids unnecessary copies and silences clang tidy warning.
2020-07-10 12:47:15 +01:00
Nathan James a25487fd8c
[clang-tidy] Use Options priority in enum options where it was missing 2020-07-10 12:27:08 +01:00
Simon Pilgrim 9ce9831289 StackSafetyAnalysis.cpp - pass ConstantRange arg as const reference.
Avoids unnecessary copies and silences clang tidy warning - we do this in most places, there are just a few that were missed.
2020-07-10 12:13:34 +01:00
Simon Pilgrim 4cc26a44ca [X86][SSE] Use shouldUseHorizontalOp helper to determine whether to use (F)HADD. NFCI. 2020-07-10 12:13:34 +01:00
dstuttar 69a89b54c6 [NFC] Change isFPPredicate comparison to ignore lower bound
Summary:
Since changing the Predicate to be an unsigned enum, the lower bound check for
isFPPredicate no longer needs to check the lower bound, since
it will always evaluate to true.

Also fixed a similar issue in SIISelLowering.cpp by removing the need for
comparing to FIRST and LAST predicates

Added an assert to the isFPPredicate comparison to flag if the
FIRST_FCMP_PREDICATE is ever changed to anything other than 0, in which case the
logic will break.

Without this change warnings are generated in VS.

Change-Id: I358f0daf28c0628c7bda8ad4cab4e1757b761bab

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83540
2020-07-10 11:57:20 +01:00
Paul Walker f78e6a3095 [SVE] Code generation for fixed length vector truncates.
Lower fixed length vector truncates to a sequence of SVE UZP1 instructions.

Differential Revision: https://reviews.llvm.org/D83395
2020-07-10 10:37:19 +00:00
Pavel Labath d372a8e8bc [lldb/pecoff] Use a different llvm createBinary overload for parsing
Change the code the use the version which accepts a memory buffer,
instead of the one taking a file name.

This ensures we are not loading the file into memory twice
(ObjectFilePECOFF also loads a copy), reducing our memory footprint, as
well as enabling additional goodies in the future, like being able to
open files which don't exist on disk (D83512).
2020-07-10 11:57:11 +02:00
Haojian Wu 5f41ca48d1 [clang-tidy] More strict on matching the standard memset function in memset-usage check.
The check assumed the matched function call has 3 arguments, but the
matcher didn't guaranteed that.

Differential Revision: https://reviews.llvm.org/D83301
2020-07-10 11:42:35 +02:00
Florian Hahn 264ab1e2c8 [LV] Pick vector loop body as insert point for SCEV expansion.
Currently the DomTree is not kept up to date for additional blocks
generated in the vector loop, for example when vectorizing with
predication. SCEVExpander relies on dominance checks when looking for
existing instructions to re-use and in some cases that can lead to the
expander picking instructions that do not actually dominate their insert
point (e.g. as in PR46525).

Unfortunately keeping the DT up-to-date is a bit tricky, because the CFG
is only patched up after generating code for a block. For now, we can
just use the vector loop header, as this ensures the inserted
instructions dominate all uses in the vector loop. There should be no
noticeable impact on the generated code, as other passes should sink
those instructions, if profitable.

Fixes PR46525.

Reviewers: Ayal, gilr, mkazantsev, dmgreen

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D83288
2020-07-10 10:37:12 +01:00
Mirko Brkusanin cf40db21af [AMDGPU][GlobalISel] Fix G_AMDGPU_TBUFFER_STORE_FORMAT mapping
Add missing mappings and tablegen definitions for TBUFFER_STORE_FORMAT.

Differential Revision: https://reviews.llvm.org/D83240
2020-07-10 11:32:32 +02:00
Simon Pilgrim 9a3e8b11a8 extractConstantWithoutWrapping - use const APInt& returned by SCEVConstant::getAPInt()
Avoids unnecessary APInt copies and silences clang tidy warning.
2020-07-10 10:24:29 +01:00
Vitaly Buka c06417b24d Fix check-all with -DLLVM_USE_SANITIZER=Address 2020-07-10 01:47:51 -07:00
Simon Pilgrim 77133cc1e2 [X86][AVX] Attempt to fold PACK(SHUFFLE(X,Y),SHUFFLE(X,Y)) -> SHUFFLE(PACK(X,Y)).
Truncations lowered as shuffles of multiple (concatenated) vectors often leave us with lane-crossing shuffles that feed a PACKSS/PACKUS, if both shuffles are fed from the same 2 vector sources, then we can PACK the sources directly and shuffle the result instead.

This is currently limited to whole i128 lanes in a 256-bit vector, but we can extend this if the need arises (but I'm not seeing many examples in real world code).
2020-07-10 09:33:27 +01:00
Valeriy Savchenko 00997d1cad [analyzer][tests] Fix zip unpacking
Differential Revision: https://reviews.llvm.org/D83374
2020-07-10 11:32:13 +03:00
Valeriy Savchenko 9c7ff0a4aa [analyzer][tests] Make test interruption safe
Differential Revision: https://reviews.llvm.org/D83373
2020-07-10 11:31:59 +03:00
Valeriy Savchenko 21bacc2154 [analyzer][tests] Measure peak memory consumption for every project
Differential Revision: https://reviews.llvm.org/D82967
2020-07-10 11:31:41 +03:00
Danila Kutenin 68c011aa08 [builtins] Optimize udivmodti4 for many platforms.
Summary:
While benchmarking uint128 division we found out that it has huge latency for small divisors

https://reviews.llvm.org/D83027

```
Benchmark                                                   Time(ns)        CPU(ns)     Iterations
--------------------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>            13.0           13.0     55000000
BM_DivideIntrinsic128UniformDivisor<__int128>                     14.3           14.3     50000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>         13.5           13.5     52000000
BM_RemainderIntrinsic128UniformDivisor<__int128>                  14.1           14.1     50000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>             153            153        5000000
BM_DivideIntrinsic128SmallDivisor<__int128>                      170            170        3000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>          153            153        5000000
BM_RemainderIntrinsic128SmallDivisor<__int128>                   155            155        5000000
```

This patch suggests a more optimized version of the division:

If the divisor is 64 bit, we can proceed with the divq instruction on x86 or constant multiplication mechanisms for other platforms. Once both divisor and dividend are not less than 2**64, we use branch free subtract algorithm, it has at most 64 cycles. After that our benchmarks improved significantly

```
Benchmark                                                   Time(ns)        CPU(ns)     Iterations
--------------------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>            11.0           11.0     64000000
BM_DivideIntrinsic128UniformDivisor<__int128>                     13.8           13.8     51000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>         11.6           11.6     61000000
BM_RemainderIntrinsic128UniformDivisor<__int128>                  13.7           13.7     52000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>              27.1           27.1     26000000
BM_DivideIntrinsic128SmallDivisor<__int128>                       29.4           29.4     24000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>           27.9           27.8     26000000
BM_RemainderIntrinsic128SmallDivisor<__int128>                    29.1           29.1     25000000
```

If not using divq instrinsics, it is still much better

```
Benchmark                                                   Time(ns)        CPU(ns)     Iterations
--------------------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>            12.2           12.2     58000000
BM_DivideIntrinsic128UniformDivisor<__int128>                     13.5           13.5     52000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>         12.7           12.7     56000000
BM_RemainderIntrinsic128UniformDivisor<__int128>                  13.7           13.7     51000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>              30.2           30.2     24000000
BM_DivideIntrinsic128SmallDivisor<__int128>                       33.2           33.2     22000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>           31.4           31.4     23000000
BM_RemainderIntrinsic128SmallDivisor<__int128>                    33.8           33.8     21000000
```

PowerPC benchmarks:

Was
```
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>            22.3           22.3     32000000
BM_DivideIntrinsic128UniformDivisor<__int128>                     23.8           23.8     30000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>         22.5           22.5     32000000
BM_RemainderIntrinsic128UniformDivisor<__int128>                  24.9           24.9     29000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>             394            394        2000000
BM_DivideIntrinsic128SmallDivisor<__int128>                      397            397        2000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>          399            399        2000000
BM_RemainderIntrinsic128SmallDivisor<__int128>                   397            397        2000000
```

With this patch
```
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>            21.7           21.7     33000000
BM_DivideIntrinsic128UniformDivisor<__int128>                     23.0           23.0     31000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>         21.9           21.9     33000000
BM_RemainderIntrinsic128UniformDivisor<__int128>                  23.9           23.9     30000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>              32.7           32.6     23000000
BM_DivideIntrinsic128SmallDivisor<__int128>                       33.4           33.4     21000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>           31.1           31.1     22000000
BM_RemainderIntrinsic128SmallDivisor<__int128>                    33.2           33.2     22000000
```

My email: danilak@google.com, I don't have commit rights

Reviewers: howard.hinnant, courbet, MaskRay

Reviewed By: courbet

Subscribers: steven.zhang, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D81809
2020-07-10 09:59:16 +02:00
Diogo Sampaio 7bf168390f [BDCE] SExt -> ZExt when no sign bits is used and instruction has multiple uses
Summary: This allows to convert any SExt to a ZExt when we know none of the extended bits are used, specially in cases where there are multiple uses of the value.

Reviewers: dmgreen, eli.friedman, spatel, lebedev.ri, nikic

Reviewed By: lebedev.ri, nikic

Subscribers: hiraditya, dmgreen, craig.topper, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60413
2020-07-10 08:34:53 +01:00
David Sherwood da731894a2 [CodeGen] Replace calls to getVectorNumElements() in DAGTypeLegalizer::SetSplitVector
In DAGTypeLegalizer::SetSplitVector I have changed calls in the assert
from getVectorNumElements() to getVectorElementCount(), since this
code path works for both fixed and scalable vectors.

This fixes up one warning in the test:

  sve-sext-zext.ll

Differential Revision: https://reviews.llvm.org/D83196
2020-07-10 08:29:17 +01:00
Thomas Lively 043eaa9a4a [WebAssembly][NFC] Simplify vector shift lowering and add tests
This patch builds on 0d7286a652 by simplifying the code for detecting
splat values and adding new tests demonstrating the lowering of
splatted absolute value shift amounts, which are common in code
generated by Halide. The lowering is very bad right now, but
subsequent patches will improve it considerably. The tests will be
useful for evaluating the improvements in those patches.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D83493
2020-07-10 00:18:59 -07:00
George Mitenkov eb6b7c5d4f [MLIR][SPIRVToLLVM] Conversion of SPIR-V struct type without offset
This patch introduces type conversion for SPIR-V structs. Since
handling offset case requires thorough testing, it was left out
for now. Hence, only structs with no offset are currently
supported. Also, structs containing member decorations cannot
be translated.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D83403
2020-07-10 10:15:45 +03:00
David Sherwood 229dfb4728 [CodeGen] Replace calls to getVectorNumElements() in SelectionDAG::SplitVector
This patch replaces some invalid calls to getVectorNumElements() with calls
to getVectorMinNumElements() instead, since the code paths changed in this
patch work for both fixed and scalable vector types.

Fixes warnings in this test:

  sve-sext-zext.ll

Differential Revision: https://reviews.llvm.org/D83203
2020-07-10 08:11:30 +01:00
Muhammad Omair Javaid a65da5f592 [LLDB] Update AArch64 Dwarf and EH frame register numbers
This patch updates ARM64_ehframe_Registers.h and ARM64_DWARF_Registers.h
with latest register numbers in line with AArch64 SVE support.

For refernce take a look at "DWARF for the ARM® 64-bit Architecture (AArch64)
with SVE support" manual from Arm.
Version used: abi_sve_aadwarf_100985_0000_00_en.pdf
2020-07-10 11:45:39 +05:00
Daniel Grumberg 50f24331fd Add diagnostic option backing field for -fansi-escape-codes
Summary:
Keep track of -fansi-escape-codes in DiagnosticOptions and move the
option to the new option parsing system.

Depends on D82860

Reviewers: Bigcheese

Subscribers: dexonsmith, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D82874
2020-07-10 07:26:56 +01:00
Zakk Chen 04b9a46c84 [RISCV] Refactor FeatureRVCHints to make ProcessorModel more intuitive
Reviewers: luismarques, asb, evandro

Reviewed By: asb, evandro

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77030
2020-07-09 23:07:39 -07:00
Nathan Ridge 98d763ad05 [clangd] Factor out some helper functions related to heuristic resolution in TargetFinder
Summary:
Two helpers are introduced:

 * Some of the logic previously in TargetFinder::Visit*() methods is
   factored out into resolveDependentExprToDecls().

 * Some of the logic in getMembersReferencedViaDependentName() is
   factored out into resolveTypeToRecordDecl().

D82739 will build on this and use these functions in new ways.

Reviewers: hokein

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83371
2020-07-10 01:58:34 -04:00
SharmaRithik e71c7b593a [CodeMoverUtils] Move OrderedInstructions to CodeMoverUtils
Summary: This patch moves OrderedInstructions to CodeMoverUtils as It was
the only place where OrderedInstructions is required.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto, fhahn, nikic
Reviewed By: Whitney, nikic
Subscribers: mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80643
2020-07-10 11:22:43 +05:30
Fangrui Song 760bbda2d8 [llvm-symbolizer][test] Fix options-from-env.test
options-from-env.test (D71668) does not test it intended to test:
`llvm-symbolizer 0x20112f` prints `0x20112f` in the absence of an environment
variable.
2020-07-09 22:39:56 -07:00
Guillaume Chatelet 30582457b4 [NFC] Separate bitcode reading for FUNC_CODE_INST_CMPXCHG(_OLD)
This is preparatory work to unable storing alignment for AtomicCmpXchgInst.
See D83136 for context and bug: https://bugs.llvm.org/show_bug.cgi?id=27168

Differential Revision: https://reviews.llvm.org/D83375
2020-07-10 04:27:39 +00:00
Richard Smith b03f1756fb [demangler] More properly save and restore the template parameter state
when parsing an encoding.
2020-07-09 21:12:51 -07:00
Petr Hosek ceb76d2fe7 [CMake][Fuchsia] Move runtimes to outer scope
This is needed for runtimes to be properly configured, addressing an
issue introduced in 53e38c85.
2020-07-09 21:07:44 -07:00
Stella Laurenzo c20c1960c1 Add Python bindings guide.
Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes

Tags: #mlir

Differential Revision: https://reviews.llvm.org/D83527
2020-07-09 20:49:39 -07:00
Richard Smith 553dbb6d7b [demangler] Don't allow the template parameters from the <encoding> in a
<local-name> to leak out into later parts of the name.

This caused us to fail to demangle certain constructs involving generic
lambdas.
2020-07-09 20:38:19 -07:00
Oliver Hunt 00c9a504ae CrashTracer: clang at clang: llvm::BitstreamWriter::ExitBlock
Add a guard for re-entering an SDiagsWriter's HandleDiagnostics
method after we've started finalizing. This is a generic catch
all for unexpected fatal errors so we don't recursive crash inside
the generic llvm error handler.

We also add logic to handle the actual error case in
llvm::~raw_fd_ostream caused by failing to clear errors before
it is destroyed.

<rdar://problem/63335596>
2020-07-09 20:27:33 -07:00
Chen Zheng f1efb8bb4b [SCEV][IndVarSimplify] insert point should not be block front.
The block front may be a PHI node, inserting a cast instructions like
BitCast, PtrToInt, IntToPtr among PHIs is not right.

Reviewed By: lebedev.ri

Differential Revision:  https://reviews.llvm.org/D80975
2020-07-09 21:56:57 -04:00
Jordan Rupprecht fbef6c55bc [lldb] Declare extern template instantiation to fix linking issues.
NativeProcessELF::GetELFImageInfoAddress<...>() is declared in NativeProcessELF.h, but only defined in NativeProcessELF.cpp. Via some optimized builds (e.g. thinlto), this instantiation may be removed when it is used in a different TU (NativeProcessELFTest.cpp).
2020-07-09 18:43:53 -07:00
Vitaly Buka 57f2a789ca [StackSafety,NFC] Reduce FunctionSummary size
Most compiler infocations will not need ParamAccess,
so we can optimize memory usage there with smaller unique_ptr
instead of empty vector.
Suggested in D80908 review.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D83458
2020-07-09 18:01:39 -07:00
Julian Lettner bed3e1a99b [Sanitizer] Update macOS version checking
Support macOS 11 in our runtime version checking code and update
`GetMacosAlignedVersionInternal()` accordingly.  This follows the
implementation of `Triple::getMacOSXVersion()` in the Clang driver.

Reviewed By: delcypher

Differential Revision: https://reviews.llvm.org/D82918
2020-07-09 17:28:01 -07:00
Richard Smith f721e0582b PR46648: Do not eagerly instantiate default arguments for a generic
lambda when instantiating a call operator specialization.

We previously incorrectly thought that such substitution was happening
in the context of substitution into a local scope, which is a context
where we should perform eager default argument instantiation.
2020-07-09 17:24:20 -07:00
Richard Smith a5569f0898 Push parameters into the local instantiation scope before instantiating
a default argument.

Default arguments can (after recent language changes) refer to
parameters of the same function. Make sure they're added to the local
instantiation scope before transforming a default argument so that we
can remap such references to them properly.
2020-07-09 17:24:20 -07:00
Richard Smith 7462793be7 Move default argument instantiation to SemaTemplateInstantiateDecl.cpp.
No functionality change intended.
2020-07-09 17:24:19 -07:00
ergawy 3847a6ae75 [MLIR][SPIRV] Support two memory access attributes in OpCopyMemory.
This commit augments spv.CopyMemory's implementation to support 2 memory
access operands. Hence, more closely following the spec. The following
changes are introduces:

- Customize logic for spv.CopyMemory serialization and deserialization.
- Add 2 additional attributes for source memory access operand.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D83241
2020-07-09 20:23:35 -04:00
Amara Emerson ce22527c0c [AArch64][GlobalISel] Add more specific debug info tests for 613f12dd8e.
As requested, these tests check for specific debug locs on the output of the
legalizer. The only one that I couldn't write was for moreElementsVector, which
AFAICT we don't trigger on AArch64.
2020-07-09 17:13:16 -07:00
Arthur Eubanks 8039d2c3bf [NFC] Derive from PassInfoMixin for no-op/printing passes
PassInfoMixin should be used for all NPM passes, rater than a custom
`name()`.

This caused ambiguous references in LegacyPassManager.cpp, so had to
remove "using namespace llvm::legacy" and move some things around.

The passes had to be moved to the llvm namespace, or else they would get
printed as "(anonymous namespace)::FooPass".

Reviewed By: ychen, asbirlea

Differential Revision: https://reviews.llvm.org/D83498
2020-07-09 16:58:30 -07:00
Wei Mi e296e9dfd6 [NFC] Change getEntryForPercentile to be a static function in ProfileSummaryBuilder.
Change file static function getEntryForPercentile to be a static member function
in ProfileSummaryBuilder so it can be used by other files.

Differential Revision: https://reviews.llvm.org/D83439
2020-07-09 16:38:19 -07:00
Wei Mi 78fe6a3ee2 [NFC] Extract the code to write instr profile into function writeInstrProfile
So that the function writeInstrProfile can be used in other places.

Differential Revision: https://reviews.llvm.org/D83521
2020-07-09 16:30:28 -07:00
Stella Laurenzo 722475a375 Initial boiler-plate for python bindings.
Summary:
* Native '_mlir' extension module.
* Python mlir/__init__.py trampoline module.
* Lit test that checks a message.
* Uses some cmake configurations that have worked for me in the past but likely needs further elaboration.

Subscribers: mgorny, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes

Tags: #mlir

Differential Revision: https://reviews.llvm.org/D83279
2020-07-09 12:03:58 -07:00
Eli Friedman 56ae2cebcd [AArch64][SVE] Add lowering for llvm.fma.
This is currently bare-bones; we aren't taking advantage of any of the
FMA variant instructions.  But it's enough to at least generate
code.

Differential Revision: https://reviews.llvm.org/D83444
2020-07-09 16:12:41 -07:00
Eric Schweitz 9263e08251 [flang] ifdef to avoid warning about supposedly dead function 2020-07-09 16:10:08 -07:00