Commit Graph

377397 Commits

Author SHA1 Message Date
Stephen Kelly 9a7fb08487 NFC: Minor cleanup of function calls 2021-01-17 18:47:17 +00:00
Kazu Hirata 50be8e4471 [TableGen] Drop redundant const from return types (NFC)
Identified with readability-const-return-type.
2021-01-17 10:39:49 -08:00
Kazu Hirata a59126115e [IRBuilder] "Zero"-initialize SmallVector (NFC) 2021-01-17 10:39:47 -08:00
Kazu Hirata 352fcfc697 [llvm] Use llvm::sort (NFC) 2021-01-17 10:39:45 -08:00
Raphael Isemann 7e9e6ac526 [lldb][docs] Fix some RST formatting errors related to code examples.
Mostly just making sure the indentation is right (SBDebugger had 0 spaces
as it was still plain text, the others had too much indentation or other
minor issues).
2021-01-17 17:41:05 +01:00
Dávid Bolvanský ed396212da [InstCombine] Transform abs pattern using multiplication to abs intrinsic (PR45691)
```
unsigned r(int v)
{
    return (1 | -(v < 0)) * v;
}

`r` is equivalent to `abs(v)`.

```

```
define <4 x i8> @src(<4 x i8> %0) {
%1:
  %2 = ashr <4 x i8> %0, { 31, undef, 31, 31 }
  %3 = or <4 x i8> %2, { 1, 1, 1, undef }
  %4 = mul nsw <4 x i8> %3, %0
  ret <4 x i8> %4
}
=>
define <4 x i8> @tgt(<4 x i8> %0) {
%1:
  %2 = icmp slt <4 x i8> %0, { 0, 0, 0, 0 }
  %3 = sub nsw <4 x i8> { 0, 0, 0, 0 }, %0
  %4 = select <4 x i1> %2, <4 x i8> %3, <4 x i8> %0
  ret <4 x i8> %4
}
Transformation seems to be correct!
```

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94874
2021-01-17 17:06:14 +01:00
Dávid Bolvanský 469ceaf538 [Tests] Add test for PR45691 2021-01-17 17:04:49 +01:00
Raphael Isemann acdc745689 [lldb][docs] Cleanup the Python doc strings for SB API classes
The first line of the doc string ends up on the SB API class summary at
the root page of the Python API  web page of LLDB. Currently many of the
descriptions are missing or are several lines which makes the table really
hard to read.

This just adds the missing docstrings where possible and fixes the formatting
where necessary.
2021-01-17 16:51:07 +01:00
Nikita Popov a13c0f62c3 [InstSimplify] Fold x*C1/C2 <= x (PR48744)
We can fold x*C1/C2 <= x to true if C1 <= C2. This is valid even
if the multiplication is not nuw: https://alive2.llvm.org/ce/z/vULors

The multiplication or division can be replaced by shifts. We don't
handle the case where both are shifts, as that should get folded
away by InstCombine.
2021-01-17 16:02:55 +01:00
Nikita Popov 4bfbfb9bcb [InstSimplify] Add tests for x*C1/C2<=x (NFC)
Tests for PR48744.
2021-01-17 16:02:55 +01:00
Utkarsh Saxena 9abbc05097
[clangd] Use !empty() instead of size()>0 2021-01-17 15:26:40 +01:00
Utkarsh Saxena 0f9908a7c9
[clangd] Use empty() instead of size()>0 2021-01-17 15:13:01 +01:00
mydeveloperday 00dc97f167 [clang-format] PR48594 BraceWrapping: SplitEmptyRecord ignored for templates
https://bugs.llvm.org/show_bug.cgi?id=48594

Empty or small templates were not being treated the same way as small classes especially when SplitEmptyRecord was set to true

This revision aims to help this by identifying a case when we should try not to merge the lines together

Reviewed By: curdeius, JohelEGP

Differential Revision: https://reviews.llvm.org/D93839
2021-01-17 11:14:33 +00:00
Raphael Isemann e7bc6c594b Reland [lldb][docs] Use sphinx instead of epydoc to generate LLDB's Python reference
The build server should now have the missing dependencies.

Original summary:

Currently LLDB uses epydoc to generate the Python API reference for the website.
epydoc however is unmaintained since more than a decade and no longer works with
Python 3. Also whatever setup we had once for generating the documentation on
the website server no longer seems to work, so the current website documentation
has been stale since more than a year.

This patch replaces epydoc with sphinx and its automodapi plugin that can
generate Python API references. LLVM already uses sphinx for the rest of the
documentation, so this way we are more consistent with the rest of LLVM. The
only new dependency is the automodapi plugin for sphinx.

This patch effectively does the following things:
* Remove the epydoc code.
* Make a new dummy Python API page in our website that just calls the Sphinx
  command for generated the API documentation.
* Add a mock _lldb module that is only used when generating the Python API.
 This way we don't have to build all of LLDB to generate the API reference.

Some notes:
* The long list of skips is necessary due to boilerplate functions that SWIG
  is generating. Sadly automodapi is not really scriptable from what I can see,
  so we have to blacklist this stuff manually.
* The .gitignore change because automodapi wants a subfolder of our
  documentation directory to place generated documentation files there. The path
  is also what is used on the website, so we can't really workaround this
  (without copying the whole `docs` dir somewhere else when we build).
* We have to use environment variables to pass our build path to our sphinx
  configuration. Sphinx doesn't support passing variables onto that script.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D94489
2021-01-17 12:13:01 +01:00
mydeveloperday 9af03864df [clang-format] Revert e9e6e3b34a
Reverting {D92753} due to issues with #pragma indentation in #ifdef/endif structure
2021-01-17 11:07:31 +00:00
Nikita Popov 0b84afa5fc Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.

-----

An alias query currently works out roughly like this:

 * Look up location pair in cache.
 * Perform BasicAA logic (including cache lookup and insertion...)
 * Perform a recursive query using BestAAResults.
   * Look up location pair in cache (and thus do not recurse into BasicAA)
   * Query all the other AA providers.
 * Query all the other AA providers.

This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().

There are some tradeoffs:

 * We can no longer pass through the precomputed underlying object
   to aliasCheck(). This is not a major concern, because nowadays
   getUnderlyingObject() is quite cheap.
 * Results from other AA providers are no longer cached inside
   BasicAA. The way this worked was already a bit iffy, in that a
   result could be cached, but if it was MayAlias, we'd still end
   up re-querying other providers anyway. If we want to cache
   non-BasicAA results, we should do that in a more principled manner.

In any case, despite those tradeoffs, this works out to be a decent
compile-time improvment. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.

Differential Revision: https://reviews.llvm.org/D90094
2021-01-17 10:34:35 +01:00
Nikita Popov b1c2f1282a [BasicAA] Move assumption tracking into AAQI
D91936 placed the tracking for the assumptions into BasicAA.
However, when recursing over phis, we may use fresh AAQI instances.
In this case AssumptionBasedResults from an inner AAQI can reesult
in a removal of an element from the outer AAQI.

To avoid this, move the tracking into AAQI. This generally makes
more sense, as the NoAlias assumptions themselves are also stored
in AAQI.

The test case only produces an assertion failure with D90094
reapplied. I think the issue exists independently of that change
as well, but I wasn't able to come up with a reproducer.
2021-01-17 10:34:35 +01:00
Fangrui Song 3809f4ebab [ELF] Support R_PPC_ADDR24 (ba foo; bla foo) 2021-01-17 00:02:13 -08:00
Kazushi (Jam) Marukawa 3cbd476c54 [VE] Support VE in libunwind
Modify libunwind to support SjLj exception handling routines for VE.
In order to do that, we need to implement not only SjLj exception
handling routines but also a Registers_ve class.  This implementation
of Registers_ve is incomplete.  We will work on it later when we need
backtrace in libunwind.

Reviewed By: #libunwind, compnerd

Differential Revision: https://reviews.llvm.org/D94591
2021-01-17 15:35:02 +09:00
Craig Topper 061f681c0d [RISCV] Remove an extra map lookup from RISCVCompressInstEmitter. NFC
When we looked up the map to see if the entry already existed,
this created the new entry for us. So save a reference to it so
we can use it to update the entry instead of looking it up again.

Also remove unnecessary StringRef constructors around string
literals on calls to this function.
2021-01-16 21:20:53 -08:00
Craig Topper 1327c730bb [RISCV] Few more minor cleanups to RISCVCompressInstEmitter. NFC
-Use StringRef instead of std::string.
-Const correct a parameter.
-Don't call StringRef::data() before printing. Just pass the StringRef.
2021-01-16 21:09:43 -08:00
Craig Topper 2b6a92625f [RISCV] Simplify mergeCondAndCode in RISCVCompressInstEmitter.cpp. NFC
Instead forming a std::string and returning it to pass into another
raw_ostream, just pass the raw_ostream as a parameter.

Take StringRef as arguments instead raw_string_ostream references
making the caller responsible for converting to strings. Use
StringRef operations instead of std::string::substr.a
2021-01-16 20:59:48 -08:00
Craig Topper 97f7e4e8c9 [RISC] Replace dyn_casts that are only checked by an assert with a cast. NFC 2021-01-16 20:23:48 -08:00
Craig Topper 633c5afccf [RISCV] Remove unneeded StringRef to std::string conversions in RISCVCompressInstEmitter. NFC
Stop concatenating std::string before streaming into a raw_ostream.
Just stream the pieces.

Remove some new lines from asserts. Remove std::string concatenation
from an assert. assert strings aren't really evaluated like this at
runtime. An assertion failure will just print exactly what's between
the parentheses in the source.
2021-01-16 20:09:45 -08:00
Fangrui Song a048ce13e3 [X86] Default to -x86-pad-for-align=false to drop assembler difference with or w/o -g
Fix PR48742: the D75203 assembler optimization locates MCRelaxableFragment's
within two MCSymbol's and relaxes some MCRelaxableFragment's to reduce the size
of a MCAlignFragment.  A -g build has more MCSymbol's and therefore may have
different assembler output (e.g. a MCRelaxableFragment (jmp) may have 5 bytes
with -O1 while 2 bytes with -O1 -g).

`.p2align 4, 0x90` is common due to loops. For a larger program, with a
lot of temporary labels, the assembly output difference is somewhat
destined. The cost seems to overweigh the benefits so we default to
-x86-pad-for-align=false until the heuristic is improved.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D94542
2021-01-16 16:39:54 -08:00
Nikita Popov 5238e7b302 [InstCombine] Replace one-use select operand based on condition
InstCombine already performs a fold where X == Y ? f(X) : Z is
transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However,
if f(X) only has one use, then we can always directly replace the
use inside the instruction. To actually be profitable, limit it to
the case where Y is a non-expr constant.

This could be further extended to replace uses further up a one-use
instruction chain, but for now this only looks one level up.

Among other things, this also subsumes D94860.

Differential Revision: https://reviews.llvm.org/D94862
2021-01-16 23:25:02 +01:00
Roman Lebedev 32fc32317a
[SimplifyCFG] markAliveBlocks(): catchswitch: preserve PostDomTree
When removing catchpad's from catchswitch, if that removes a successor,
we need to record that in DomTreeUpdater.

This fixes PostDomTree preservation failure in an existing test.
This appears to be the single issue that i see in my current test coverage.
2021-01-17 01:21:05 +03:00
David Green 1454724215 [ARM] Align blocks that are not fallthough targets
If the previous block in a function does not fallthough, adding nop's to
align it will never be executed. This means we can freely (except for
codesize) align more branches. This happens in constantislandspass (as
it cannot happen later) and only happens at aggressive optimization
levels as it does increase codesize.

Differential Revision: https://reviews.llvm.org/D94394
2021-01-16 22:19:35 +00:00
David Green 2a5b576e3e [ARM] Test for aligned blocks. NFC 2021-01-16 22:04:48 +00:00
Dávid Bolvanský bfd75bdf3f [NFC] Removed extra text in comments 2021-01-16 22:48:56 +01:00
Aart Bik d8fc27301d [mlir][sparse] improved sparse runtime support library
Added the ability to read (an extended version of) the FROSTT
file format, so that we can now read in sparse tensors of arbitrary
rank. Generalized the API to deal with more than two dimensions.

Also added the ability to sort the indices of sparse tensors
lexicographically. This is an important step towards supporting
auto gen of initialization code, since sparse storage formats
are easier to initialize if the indices are sorted. Since most
external formats don't enforce such properties, it is convenient
to have this ability in our runtime support library.

Lastly, the re-entrant problem of the original implementation
is fixed by passing an opaque object around (rather than having
a single static variable, ugh!).

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D94852
2021-01-16 12:16:10 -08:00
Shilei Tian ed939f853d [OpenMP] Added the support for hidden helper task in RTL
The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks.  We first use `pthread_create` to create a new thread, let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`. It is like the initial thread encounters a parallel region. The wrapped function for this team is, for main thread, which is the initial thread that we create via `pthread_create` on Linux, waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. For other work threads, they just do nothing. The reason that main thread needs to wait there is, in current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team which is not what we want.

Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8.

Here are some open issues to be discussed:
1. The main thread goes to sleeping when the initialization is finished. As Andrey mentioned, we might need it to be awaken from time to time to do some stuffs. What kind of update/check should be put here?

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D77609
2021-01-16 14:13:35 -05:00
Sanjay Patel 49b96cd9ef [SLP] remove opcode field from reduction data class
This is NFC-intended and another step towards supporting
intrinsics as reduction candidates.

The remaining bits of the OperationData class do not make
much sense as-is, so I will try to improve that, but I'm
trying to take minimal steps because it's still not clear
how this was intended to work.
2021-01-16 13:55:52 -05:00
Sanjay Patel fcfcc3cc6b [SLP] fix typos; NFC 2021-01-16 13:55:52 -05:00
Sanjay Patel 48dbac5b6b [SLP] remove unnecessary use of 'OperationData'
This is another NFC-intended patch to allow matching
intrinsics (example: maxnum) as candidates for reductions.

It's possible that the loop/if logic can be reduced now,
but it's still difficult to understand how this all works.
2021-01-16 13:55:52 -05:00
Dávid Bolvanský 63bedc80da [InstSimplify] Handle commutativity for 'and' and 'outer or' for (~A & B) | ~(A | B) --> ~A
Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D94870
2021-01-16 19:42:50 +01:00
David Green 372eb2bbb6 [ARM] Add low overhead loops terminators to AnalyzeBranch
This treats low overhead loop branches the same as jump tables and
indirect branches in analyzeBranch - they cannot be analyzed but the
direct branches on the end of the block may be removed. This helps
remove the unnecessary branches earlier, which can help produce better
codegen (and change block layout in a number of cases).

Differential Revision: https://reviews.llvm.org/D94392
2021-01-16 18:30:21 +00:00
David Green c1ab698dce [ARM] Remove LLC tests from transform/hardware loop tests.
We now have a lot of llc tests for hardware loops in CodeGen, which test
a larger variety of loops and are easier to maintain. This removes the
llc from mixed llc/opt tests.
2021-01-16 18:30:21 +00:00
Dávid Bolvanský 416854d0f7 [InstSimplify] Precommit new testcases; NFC 2021-01-16 19:11:58 +01:00
Kazu Hirata 2082b10d10 [llvm] Use *::empty (NFC) 2021-01-16 09:40:55 -08:00
Kazu Hirata 19aacdb715 [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-16 09:40:53 -08:00
Kazu Hirata ba0fc7e1f8 [StringExtras] Fix comment typos (NFC) 2021-01-16 09:40:51 -08:00
Florian Hahn bca16e2fbb
[LTO] Remove options to disable inlining, vectorization & GVNLoadPRE.
This patch removes some ancient options as a clean-up before moving
code-gen to use LTOBackend in D94487.

I think it would preferable to remove those ancient options, because

  1. There are no corresponding options in LTOBackend based tools,
  2. There are no unit tests for them,
  3. They are not passed through by Clang,
  4. At least for GNVLoadPRE, users could just use GVN's `enable-load-pre`.

Alternatively we could add support for those options to lto::Config &
co, but I think it would be better to remove them, unless they are
actually used in practice.

Reviewed By: steven_wu, tejohnson

Differential Revision: https://reviews.llvm.org/D94783
2021-01-16 16:29:15 +00:00
Dávid Bolvanský bdd4dda58b [InstSimplify] Update comments, remove redundant tests 2021-01-16 16:31:23 +01:00
Hsiangkai Wang 098dbf190a [RISCV] Correct alignment settings for vector registers.
According to "9. Vector Memory Alignment Constraints" in V
specification, the alignment of vector memory access is aligned to the
size of the element. In our current implementation, we support ELEN up
to 64. We could assume the alignment of vector registers is 64 under the
assumption.

Differential Revision: https://reviews.llvm.org/D94751
2021-01-16 23:21:29 +08:00
Dávid Bolvanský a4e2a5145a [InstSimplify] Add (~A & B) | ~(A | B) --> ~A 2021-01-16 15:43:34 +01:00
Dávid Bolvanský 9fc814ed59 [Tests] Added tests for new instcombine or simplification; NFC 2021-01-16 15:43:33 +01:00
James Player 25c1578a46 Fix llvm::Optional build breaks in MSVC using std::is_trivially_copyable
Current code breaks this version of MSVC due to a mismatch between `std::is_trivially_copyable` and `llvm::is_trivially_copyable` for `std::pair` instantiations.  Hence I was attempting to use `std::is_trivially_copyable` to set `llvm::is_trivially_copyable<T>::value`.

I spent some time root causing an `llvm::Optional` build error on MSVC 16.8.3 related to the change described above:

```
62>C:\src\ocg_llvm\llvm-project\llvm\include\llvm/ADT/BreadthFirstIterator.h(96,12): error C2280: 'llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>>::operator =(const llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &)': attempting to reference a deleted function (compiling source file C:\src\ocg_llvm\llvm-project\llvm\unittests\ADT\BreadthFirstIteratorTest.cpp)
...
```
The "trivial" specialization of `optional_detail::OptionalStorage` assumes that the value type is trivially copy constructible and trivially copy assignable. The specialization is invoked based on a check of `is_trivially_copyable` alone, which does not imply both `is_trivially_copy_assignable` and `is_trivially_copy_constructible` are true.

[[ https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable | According to the spec ]], a deleted assignment operator does not make `is_trivially_copyable` false. So I think all these properties need to be checked explicitly in order to specialize `OptionalStorage` to the "trivial" version:
```
/// Storage for any type.
template <typename T, bool = std::is_trivially_copy_constructible<T>::value
                          && std::is_trivially_copy_assignable<T>::value>
class OptionalStorage {
```
Above fixed my build break in MSVC, but I think we need to explicitly check `is_trivially_copy_constructible` too since it might be possible the copy constructor is deleted.  Also would be ideal to move over to `std::is_trivially_copyable` instead of the `llvm` namespace verson.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D93510
2021-01-16 09:37:04 -05:00
Stephen Kelly b765eaf9a6 [ASTMatchers] Add support for CXXRewrittenBinaryOperator
Differential Revision: https://reviews.llvm.org/D94130
2021-01-16 13:44:22 +00:00
Stephen Kelly e810e95e4b [ASTMatchers] Add binaryOperation matcher
This is a simple utility which allows matching on binaryOperator and
cxxOperatorCallExpr. It can also be extended to support
cxxRewrittenBinaryOperator.

Add generic support for MapAnyOfMatchers to auto-marshalling functions.

Differential Revision: https://reviews.llvm.org/D94129
2021-01-16 13:44:09 +00:00