Commit Graph

404582 Commits

Author SHA1 Message Date
Jez Ng ad8df21db2 [reland][lld-macho] Fix symbol relocs handling for compact unwind's functionAddress
Clang seems to emit all functionAddress relocs as section relocs, but
`ld -r` can turn those relocs into symbol ones. It turns out that we
weren't handling that case correctly when the symbol was a weak def
whose definition did not prevail.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D113702
2021-11-12 15:01:51 -05:00
Jacques Pienaar 153c298342 [mlir][shape] Add value_as_shape op
Part of the very first discussion here, but didn't upstream it before as we
didn't use it yet. Fix that for pending updates. Just adding the op here,
follow up will add the lowering to codegen.
2021-11-12 11:53:27 -08:00
Duncan P. N. Exon Smith 46a68c85bf Sema: const-qualify ParsedAttr::iterator::operator*()
`const`-qualify ParsedAttr::iterator::operator*(), clearing up confusion
about the two meanings of const for pointers/iterators. Helps unblock
removal of (non-const) iterator_facade_base::operator->().
2021-11-12 11:47:16 -08:00
Duncan P. N. Exon Smith 8b3e1adf2b IR: Avoid duplication of SwitchInst::findCaseValue(), NFC
Change the non-const version of findCaseValue() to forward to the const
version.
2021-11-12 11:44:08 -08:00
Philip Reames de2fed6152 [unroll] Keep unrolled iterations with initial iteration
The unrolling code was previously inserting new cloned blocks at the end of the function.  The result of this with typical loop structures is that the new iterations are placed far from the initial iteration.

With unrolling, the general assumption is that the a) the loop is reasonable hot, and b) the first Count-1 copies of the loop are rarely (if ever) loop exiting.  As such, placing Count-1 copies out of line is a fairly poor code placement choice.  We'd much rather fall through into the hot (non-exiting) path.  For code with branch profiles, later layout would fix this, but this may have a positive impact on non-PGO compiled code.

However, the real motivation for this change isn't performance.  Its readability and human understanding.  Having to jump around long distances in an IR file to trace an unrolled loop structure is error prone and tedious.
2021-11-12 11:40:50 -08:00
Peter Klausler da25f968a9 [flang] Runtime performance improvements to real formatted input
Profiling a basic internal real input read benchmark shows some
hot spots in the code used to prepare input for decimal-to-binary
conversion, which is of course where the time should be spent.
The library that implements decimal to/from binary conversions has
been optimized, but not the code in the Fortran runtime that calls it,
and there are some obvious light changes worth making here.

Move some member functions from *.cpp files into the class definitions
of Descriptor and IoStatementState to enable inlining and specialization.

Make GetNextInputBytes() the new basic input API within the
runtime, replacing GetCurrentChar() -- which is rewritten in terms of
GetNextInputBytes -- so that input routines can have the
ability to acquire more than one input character at a time
and amortize overhead.

These changes speed up the time to read 1M random reals
using internal I/O from a character array from 1.29s to 0.54s
on my machine, which on par with Intel Fortran and much faster than
GNU Fortran.

Differential Revision: https://reviews.llvm.org/D113697
2021-11-12 11:40:02 -08:00
Keith Smiley eb6f9f3123 [lld-macho] Fix trailing slash in oso_prefix
Previously if you passed `-oso_prefix path/to/foo/` with a trailing
slash at the end, using `real_path` would remove that slash, but that
slash is necessary to make sure OSO prefix paths end up as valid
relative paths instead of starting with `/`.

Differential Revision: https://reviews.llvm.org/D113541
2021-11-12 11:29:08 -08:00
Duncan P. N. Exon Smith 1b651be046 ADT: Fix const-correctness of iterator adaptors
This fixes const-correctness of iterator adaptors, dropping non-`const`
overloads for `operator*()`.

Iterators, like the pointers that they generalize, have two types of
`const`.

The `const` qualifier on members indicates whether the iterator itself
can be changed. This is analagous to `int *const`.

The `const` qualifier on return values of `operator*()`, `operator[]()`,
and `operator->()` controls whether the the pointed-to value can be
changed. This is analogous to `const int *`.

Since `operator*()` does not (in principle) change the iterator, then
there should only be one definition, which is `const`-qualified. E.g.,
iterators wrapping `int*` should look like:
```
int *operator*() const; // always const-qualified, no overloads
```

ba7a6b314f changed `iterator_adaptor_base`
away from this to work around bugs in other iterator adaptors. That was
already reverted. This patch adds back its test, which combined
llvm::enumerate() and llvm::make_filter_range(), adds a test for
iterator_adaptor_base itself, and cleans up the `const`-ness of the
other iterator adaptors.

This also updates the documented requirements for
`iterator_facade_base`:
```
/// OLD:
///   - const T &operator*() const;
///   - T &operator*();

/// New:
///   - T &operator*() const;
```
In a future commit we might also clean up `iterator_facade`'s overloads
of `operator->()` and `operator[]()`. These already (correctly) return
non-`const` proxies regardless of the iterator's `const` qualifier.

Differential Revision: https://reviews.llvm.org/D113158
2021-11-12 11:24:17 -08:00
Philip Reames a1b496be6c (re-)Autogen one last unroll-and-jam test
This case was complicated because someone had added new non-autogened test to an autogened file.  In particular, those new tests used two variables (%J and %j) which differeded only in capitalization.  The auto-updater doesn't distinguish case, so this meant auto-gened versions of the new tests failed with non-obvious errors.

There are two key lessons here:
1) Please don't use two values which differ only in case.  This is problematic for automatic tooling, but is also hard to understand for a human.
2) Please DO NOT add new tests to an autogened test without running autogen again.  If autogen doesn't pass on your new test, put them in a separate file.
2021-11-12 11:21:08 -08:00
Peter Klausler d1b09adeeb [flang] Fix rounding edge case in F output editing
When an Fw.d output edit descriptor has a "d" value exactly
equal to the number of zeroes after the decimal point for a value
(e.g., 0.07 with F5.1), the Fw.d output editing code needs to
do the rounding itself to either 0.0 or 0.1 after performing
a conversion without rounding (to avoid 0.04999 rounding up twice).

Differential Revision: https://reviews.llvm.org/D113698
2021-11-12 11:16:25 -08:00
Alfsonso Gregory f46f93b478 [libc++][NFC] Resolve Python 2 FIXME
We don't use Python 2 anymore, so let us do the recommended fix instead
of using the workaround made for Python 2.

Differential Revision: https://reviews.llvm.org/D107715
2021-11-12 13:55:22 -05:00
Peter Klausler 4a0af824ee [flang] Respect NO_STOP_MESSAGE=1 in runtime
When an environment variable NO_STOP_MESSAGE=1 is set,
assume that STOP statements with a successful code
have QUIET=.TRUE.

Differential Revision: https://reviews.llvm.org/D113701
2021-11-12 10:50:36 -08:00
Lang Hames 3fb641618f [ORC-RT][llvm-jitlink] Fix a buggy check in ORC-RT MachO TLV deregistration.
The check was failing because it was matching against the end of the range, not
the start.

This bug wasn't causing the ORC-RT MachO TLV regression test to fail because
we were only logging deallocation errors (including TLV deregistration errors)
and not actually returning a failure code. This commit updates llvm-jitlink to
report the errors properly.
2021-11-12 10:36:17 -08:00
Lang Hames 9d5e647428 [JITLink] Fix think-o in handwritten CWrapperFunctionResult -> Error converter.
We need to skip the length field when generating error strings.

No test case: This hand-hacked deserializer should be removed in the near future
once JITLink can use generic ORC APIs (including SPS and WrapperFunction).
2021-11-12 10:36:17 -08:00
Philip Reames f453e23e67 Autogen a bunch of unrolling tests for ease of update 2021-11-12 10:34:50 -08:00
Peter Klausler 85ec449352 [flang] Fix ORDER= argument to RESHAPE
The ORDER= argument to the transformational intrinsic function RESHAPE
was being misinterpreted in an inverted way that could be detected only
with 3-d or higher rank array.  Fix in both folding and the runtime, and
extend tests.

Differential Revision: https://reviews.llvm.org/D113699
2021-11-12 10:25:00 -08:00
Florian Hahn 03cfea68c6
[SCEV] Update SCEVLoopGuardRewriter to take SCEV -> SCEV map (NFC).
Split off refactoring from D113577 to reduce the diff. NFC as the new
interface will only be used in D113577.
2021-11-12 18:16:03 +00:00
Nawrin Sultana 7a5680233e [OpenMP] Set default blocktime to 0 for hybrid cpu
Differential Revision:https://reviews.llvm.org/D113012
2021-11-12 12:05:35 -06:00
Quinn Pham 84c5702b76 [lldb][NFC] Inclusive language: rename m_master in ASTImporterDelegate
[NFC] As part of using inclusive language within the llvm project, this patch
replaces `m_master` in `ASTImporterDelegate` with `m_main`.

Reviewed By: teemperor, clayborg

Differential Revision: https://reviews.llvm.org/D113720
2021-11-12 12:04:14 -06:00
Simon Pilgrim 3170670541 [AMDGPU] Regenerate udiv.ll tests 2021-11-12 17:57:40 +00:00
Philip Reames 5dd64ef528 Refresh an autogen test to reduce spurious diffs 2021-11-12 09:48:37 -08:00
Fangrui Song a05384dc89 [ELF] Make --no-relax disable R_X86_64_GOTPCRELX and R_X86_64_REX_GOTPCRELX GOT optimization
This brings back the original version of D81359.
I have found several use cases now.

* Unlike GNU ld, LLD's relocation processing is one pass. If we decide to
  optimize(relax) R_X86_64_{,REX_}GOTPCRELX, we will suppress GOT generation and
  cannot undo the decision later. Optimizing R_X86_64_REX_GOTPCRELX can usually
  make it easy to hit `relocation R_X86_64_REX_GOTPCRELX out of range` because
  the distance to GOT is usually shorter. Without --no-relax, the user has to
  recompile with `-Wa,-mrelax-relocations=no`.
* The option would help during my investigationg of the root cause of https://git.kernel.org/linus/09e43968db40c33a73e9ddbfd937f46d5c334924
* There is need for relaxation for AArch64 & RISC-V. Implementing this for
  x86-64 improves consistency with little target-specific cost (two-line
  X86_64.cpp change).

Reviewed By: alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D113615
2021-11-12 09:47:31 -08:00
Sam McCall 4fb62e1383 [clangd] Mark completions as plain-text when there's no snippet part
This helps nvim support the "repeat" action

Fixes https://github.com/clangd/clangd/issues/922
2021-11-12 18:44:20 +01:00
Philip Reames e01c91f242 [tests] Add coverage for cases we can prune exits when runtlme unrolling 2021-11-12 09:43:16 -08:00
Nikita Popov 1c5d636af1 [ConstantRangeTest] Add helper to enumerate APInts (NFC)
While ForeachNumInConstantRange(ConstantRange::getFull(Bits))
works, it's somewhat roundabout, and I keep looking for this
function.
2021-11-12 18:18:51 +01:00
Quinn Pham 52a3ed5b93 [lldb][NFC] Inclusive language: replace master/slave names for ptys
[NFC] This patch replaces master and slave with primary and secondary
respectively when referring to pseudoterminals/file descriptors.

Reviewed By: clayborg, teemperor

Differential Revision: https://reviews.llvm.org/D113687
2021-11-12 10:54:18 -06:00
Dmitry Vyukov 79fbba9b79 Revert "tsan: new runtime (v3)"
Summary:
This reverts commit ac95b8d954.
There is a number of bot failures:
http://45.33.8.238/mac/38755/step_4.txt
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/38135/consoleFull#-148886289949ba4694-19c4-4d7e-bec5-911270d8a58c

Reviewers: vitalybuka, melver

Subscribers:
2021-11-12 17:49:47 +01:00
Simon Pilgrim 6bb71738e2 [X86] convertShiftLeftToScale - improve vXi8 constant handling
Add support for v32i8/v64i8 converting shift-by-constant to multiply-by-constant. This helps us avoid the generic vXi8 shift lowering, and a lot of VPBLENDVB ops which can be particularly slow.

We also needed to reorder a few shift lowering patterns to prevent regressions, particularly for XOP+AVX2 (Excavator) targets (which can split to fast v16i8 shifts) and AVX512-BWI targets (which prefers to extend to fast v32i16 shifts).
2021-11-12 16:48:10 +00:00
Zarko Todorovski bd81c39107 [NFC][llvm] Remove uses of blacklist in llvm/test/Instrumentation
Small patch that changes blacklisted_global to blocked_global and a change in comments.

Reviewed By: pgousseau

Differential Revision: https://reviews.llvm.org/D113692
2021-11-12 16:44:52 +00:00
Brian Cain 2d0aede515 [libcxx] Change the type of __size to correspond
__size was declared as unsigned which was compatible with
2021-11-12 08:30:38 -08:00
Joel E. Denny c9dfe322ee [OpenMP] Fix main thread barrier for Pascal and amdgpu
Fixes what's left of https://bugs.llvm.org/show_bug.cgi?id=51781.

Reviewed By: jdoerfert, JonChesterfield, tianshilei1992

Differential Revision: https://reviews.llvm.org/D113602
2021-11-12 11:18:45 -05:00
Florian Hahn 30ebdf8a6d
[LV] Precommit test case from PR52485. 2021-11-12 16:09:19 +00:00
Jay Foad a70bbb5f7a [AMDGPU] Simplify 64-bit division/remainder expansion
The old expansion open-coded a 64-bit addition in a strange way, by
adding the high parts *without* carry-in from the low part, and then
adding the carry back in later on. Fixing this saves a couple of
instructions and makes the code much easier to understand.

Differential Revision: https://reviews.llvm.org/D113679
2021-11-12 15:48:41 +00:00
Zarko Todorovski 05f34ffa21 [clang] Inclusive language: change instances of blacklist/whitelist to allowlist/ignorelist
Change the error message to use ignorelist, and changed some variable and function
names in related code and test.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D113189
2021-11-12 15:46:16 +00:00
Kazu Hirata 99d5cbbd7e [CodeGen] Use SDNode::uses (NFC) 2021-11-12 07:33:29 -08:00
Roman Lebedev 8d35c054e3
[NFC][SROA] Add more tests for non-capturing pointer-escaping calls 2021-11-12 18:08:39 +03:00
Nicolas Vasilache 0e185ceafb [mlir] NFC - Address post-commit comments
Address comments from https://reviews.llvm.org/D113745
which landed as aa37318067
2021-11-12 15:01:29 +00:00
Justas Janickas 388e8110db [OpenCL] Constructor address space test adjusted for C++ for OpenCL 2021
Reuses C++ for OpenCL constructor address space test so that it
supports optional generic address spaces in version 2021.

Differential Revision: https://reviews.llvm.org/D110184
2021-11-12 14:31:50 +00:00
Alexey Bataev 1513ca339b [Feature][NFC]Improve test checks to avoid possible false postitive test
failures, NFC.
2021-11-12 06:28:44 -08:00
Kerry McLaughlin 7647822156 [AArch64][SVE] Remove i1 type from isElementTypeLegalForScalableVector
`collectElementTypesForWidening` collects the types of load, store and
reduction Phis in a loop. These types are later checked using
`isElementTypeLegalForScalableVector` to prevent vectorisation of
loops with instruction types that are unsupported.

This patch removes i1 from the list of types supported for scalable
vectors. This fixes an assert ("Cannot yet scalarize uniform stores") in
`setCostBasedWideningDecision` when we have a loop containing a uniform
i1 store and a scalable VF, which we cannot create a scatter for.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D113680
2021-11-12 14:24:38 +00:00
Alexey Bataev 352c46e707 [SLP]Improve vectorization of split loads.
Need to fix ther cost estimation for split loads, since we look at the
subregs already, no need to permute them, need just to estimate
subregister insert, if it is smaller than the real register. Also, using
split loads, it might be profitable already to vectorize smaller trees
with gathering of the loads.

Differential Revision: https://reviews.llvm.org/D107188
2021-11-12 06:13:22 -08:00
Simon Pilgrim 59087dce3b [X86] combineX86ShufflesConstants - constant fold from target shuffles unless optsize = true
Currently we only constant fold target shuffles if any of the sources has one use, or it would remove a variable shuffle mask - the aim being to avoid constant pool bloat.

This patch proposes we should constant fold by default and only limit this if optsize is enabled - I've added a basic test for this in vector-mul.ll (the pmuludq case is by far the most common), I can add other specific test cases if people need them.

This should permit further constant folding, break some instruction dependencies and help reduce shuffle port pressure.

Differential Revision: https://reviews.llvm.org/D113748
2021-11-12 14:02:43 +00:00
Kadir Cetinkaya ebda5e1e52
[clangd] Fix use-after-free in test 2021-11-12 14:50:23 +01:00
Dmitry Vyukov ac95b8d954 tsan: new runtime (v3)
This change switches tsan to the new runtime which features:
 - 2x smaller shadow memory (2x of app memory)
 - faster fully vectorized race detection
 - small fixed-size vector clocks (512b)
 - fast vectorized vector clock operations
 - unlimited number of alive threads/goroutimes

Depends on D112602.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D112603
2021-11-12 14:31:49 +01:00
Raphael Isemann cef1e07cc6 [lldb] Fix that the embedded Python REPL crashes if it receives SIGINT
When LLDB receives a SIGINT while running the embedded Python REPL it currently
just crashes in `ScriptInterpreterPythonImpl::Interrupt` with an error such as
the one below:

```

Fatal Python error: PyThreadState_Get: the function must be called with the GIL
held, but the GIL is released (the current Python thread state is NULL)

```

The faulty code that causes this error is this part of `ScriptInterpreterPythonImpl::Interrupt`:
```
    PyThreadState *state = PyThreadState_GET();
    if (!state)
      state = GetThreadState();
    if (state) {
      long tid = state->thread_id;
      PyThreadState_Swap(state);
      int num_threads = PyThreadState_SetAsyncExc(tid, PyExc_KeyboardInterrupt);
```

The obvious fix I tried is to just acquire the GIL before this code is running
which fixes the crash but the `KeyboardInterrupt` we want to raise immediately
is actually just queued and would only be raised once the next line of input has
been parsed (which e.g. won't interrupt Python code that is currently waiting on
a timer or IO from what I can see). Also none of the functions we call here is
marked as safe to be called from a signal handler from what I can see, so we
might still end up crashing here with some bad timing.

Python 3.2 introduced `PyErr_SetInterrupt` to solve this and the function takes
care of all the details and avoids doing anything that isn't safe to do inside a
signal handler. The only thing we need to do is to manually setup our own fake
SIGINT handler that behaves the same way as the standalone Python REPL signal
handler (which raises a KeyboardInterrupt).

From what I understand the old code used to work with Python 2 so I kept the old
code around until we officially drop support for Python 2.

There is a small gap here with Python 3.0->3.1 where we might still be crashing,
but those versions have reached their EOL more than a decade ago so I think we
don't need to bother about them.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D104886
2021-11-12 14:19:15 +01:00
Sanjay Patel bf5748a1af [x86] fold vector (X > -1) & Y to shift+andn
and (pcmpgt X, -1), Y --> pandn (vsrai X, BitWidth-1), Y

This avoids the -1 constant vector in favor of an arithmetic shift
instruction if it exists (the ISA is still not complete after all
these years...).

We catch this pattern late in combining by matching PCMPGT, so it
should not interfere with more general folds.

Differential Revision: https://reviews.llvm.org/D113603
2021-11-12 08:17:46 -05:00
Jan Svoboda ab6ef58727 [clang] NFC: Format a loop in CompilerInstance
This code will be moved to a separate function in a future patch. Reformatting now to prevent a bunch of clang-format complains on Phabricator.
2021-11-12 14:16:06 +01:00
Nicolas Vasilache 99ff697bf7 [mlir][Vector] Add support for 1D depthwise conv vectorization
At this time the 2 flavors of conv are a little too different to allow significant code sharing and other will likely come up.
so we go the easy route first by duplicating and adapting.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113758
2021-11-12 13:14:09 +00:00
Dmitry Vyukov 19c1d03f97 tsan: ignore some errors in the clone_setns test
Some bots failed with:
unshare failed: 1
https://lab.llvm.org/buildbot/#/builders/70/builds/14101

Look only for the target EINVAL error.

Differential Revision: https://reviews.llvm.org/D113759
2021-11-12 14:12:36 +01:00
Phoebe Wang 4721ee7029 Add nounwind for tests. NFC 2021-11-12 21:09:05 +08:00