Commit Graph

395825 Commits

Author SHA1 Message Date
David Green ff9958b70e [ARM] Test showing incorrect codegen when subreg liveness is enabled. NFC 2021-08-04 14:21:32 +01:00
Matthias Springer b6408fa169 [mlir] Include llvm/Support/Debug.h in Transforms/Passes.h
There are many downstream users of llvm::dbgs, which is defined in Debug.h. Before D106342, many users included that dependency transitively via the now deleted ViewRegionGraph.h. Adding it back to Transforms/Passes.h for convenience.

Differential Revision: https://reviews.llvm.org/D107451
2021-08-04 21:45:28 +09:00
Simon Pilgrim 8cd40ece70 [X86] Rename X86 tuning feature flag FeatureHasFastGather -> FeatureFastGather
Match the naming style used by the other 'FeatureFast/FeatureSlow' tuning flags.
2021-08-04 13:07:50 +01:00
Simon Pilgrim 17e8ac0703 [X86] Move FeatureFastBEXTR from bdver2 features to tuning
Noticed while looking at the feature flag renaming suggested in D107370
2021-08-04 13:07:49 +01:00
Muhammad Omair Javaid f2128abec2 [LLDB] Skip flaky tests on Arm/AArch64 Linux bots
Following LLDB tests fail randomly on LLDB Arm/AArch64 Linux buildbots.
We still not have a reliable solution for these tests to pass
consistently. I am marking them skipped for now.

TestBreakpointCallbackCommandSource.py
TestIOHandlerResize.py
TestEditline.py
TestGuiViewLarge.py
TestGuiExpandThreadsTree.py
TestGuiBreakpoints.py
2021-08-04 16:57:36 +05:00
Dmitry Vyukov e3f4c63e78 tsan: don't use spinning in __cxa_guard_acquire/pthread_once
Currently we use passive spinning with internal_sched_yield to wait
in __cxa_guard_acquire/pthread_once. Passive spinning tends to degrade
ungracefully under high load. Use FutexWait/Wake instead.

Depends on D107359.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107360
2021-08-04 13:56:33 +02:00
Jan Svoboda 2718ae397b [clang][deps] Substitute clang-scan-deps executable in lit tests
The lit tests for `clang-scan-deps` invoke the tool without going through the substitution system. While the test runner correctly picks up the `clang-scan-deps` binary from the build directory, it doesn't print its absolute path. When copying the invocations when reproducing test failures, this can result in `command not found: clang-scan-deps` errors or worse yet: pick up the system `clang-scan-deps`. This patch adds new local `%clang-scan-deps` substitution.

Reviewed By: lxfind, dblaikie

Differential Revision: https://reviews.llvm.org/D107155
2021-08-04 13:55:14 +02:00
Dmitry Vyukov 0bc626d516 tsan: refactor guard_acquire/release
Introduce named consts for magic values we use.

Differential Revision: https://reviews.llvm.org/D107445
2021-08-04 13:52:27 +02:00
Jan Svoboda 0556138624 [clang][cli] Expose -fno-cxx-modules in cc1
For some use-cases, it might be useful to be able to turn off modules for C++ in `-cc1`. (The feature is implied by `-std=C++20`.)

This patch exposes the `-fno-cxx-modules` option in `-cc1`.

Reviewed By: arphaman

Differential Revision: https://reviews.llvm.org/D106864
2021-08-04 13:46:40 +02:00
Matthias Springer 9102a16bef [mlir] Support drawing control-flow graphs in ViewOpGraph.cpp
* Add new pass option `print-data-flow-edges`, default value `true`.
* Add new pass option `print-control-flow-edges`, default value `false`.
* Remove `PrintCFGPass`. Same functionality now provided by
  `PrintOpPass`.

Differential Revision: https://reviews.llvm.org/D106342
2021-08-04 20:45:15 +09:00
Dmitry Vyukov 636428c727 tsan: unify __cxa_guard_acquire and pthread_once implementations
Currently we effectively duplicate "once" logic for __cxa_guard_acquire
and pthread_once. Unify the implementations.

This is not a no-op change:
 - constants used for pthread_once are changed to match __cxa_guard_acquire
   (__cxa_guard_acquire constants are tied to ABI, but it does not seem
   to be the case for pthread_once)
 - pthread_once now also uses PotentiallyBlockingRegion annotations
 - __cxa_guard_acquire checks thr->in_ignored_lib to skip user synchronization
It's unclear if these 2 differences are intentional or a mere sloppy inconsistency.
Since all tests still pass, let's assume the latter.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107359
2021-08-04 13:44:19 +02:00
Dmitry Vyukov 14e306fa4b tsan: use DCHECK instead of CHECK in atomic functions
Atomic functions are semi-hot in profiles.
The CHECKs verify values passed by compiler
and they never fired, so replace them with DCHECKs.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107373
2021-08-04 13:23:57 +02:00
Dmitry Vyukov d3faecbb7c tsan: minor MetaMap tweaks
1. Add some comments.
2. Use kInvalidStackID instead of literal 0.
3. Add more LIKELY/UNLIKELY.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D107371
2021-08-04 13:20:44 +02:00
David Spickett 6f8c4340c2 [llvm][MC] Disable cfi-version test for Windows on Arm
Like Windows on x86-64, Windows on arm64 uses structured
exception handling, so we don't emit .debug_frame.

See:
https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling?view=msvc-160

Reviewed By: ostannard

Differential Revision: https://reviews.llvm.org/D107440
2021-08-04 11:18:05 +00:00
Tim Northover 13e145fe76 X86: add test for realignment fix committed earlier.
Forgot "git add" for a new file.
2021-08-04 12:10:20 +01:00
Jaroslav Sevcik f968bd77bb Reland "[lldb/DWARF] Only match mangled name in full-name function lookup (with accelerators)"
Summary:

In the spirit of https://reviews.llvm.org/D70846, we only return functions with
matching mangled name from Apple/DebugNamesDWARFIndex::GetFunction if
eFunctionNameTypeFull is requested.

This speeds up lookup in the presence of large amount of class methods of the
same name (a typical examples would be constructors of templates with many
instantiations or overloaded operators).

Reviewers: labath, teemperor

Reviewed By: labath, teemperor

Subscribers: aprantl, arphaman, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D73191
2021-08-04 12:50:13 +02:00
Matthias Springer 7f163931b9 [mlir] Fix CMake linker rules for ViewOpGraph.cpp
Differential Revision: https://reviews.llvm.org/D107439
2021-08-04 19:25:15 +09:00
Serge Pavlov 0c28a7c990 Revert "Introduce intrinsic llvm.isnan"
This reverts commit 16ff91ebcc.
Several errors were reported mainly test-suite execution time. Reverted
for investigation.
2021-08-04 17:18:15 +07:00
Simon Pilgrim fc8dee1ebb [X86] Split Subtarget ISA / Security / Tuning Feature Flags Definitions. NFC
Our list of slow/fast tuning feature flags has become pretty extensive and is randomly interleaved with ISA and Security (Retpoline etc.) flags, not even based on when the ISAs/flags were introduced, making it tricky to locate them. Plus we started treating tuning flags separately some time ago, so this patch tries to group the flags to match.

I've left them mostly in the same order within each group - I'm happy to rearrange them further if there are specific ISA or Tuning flags that you think should be kept closer together.

Differential Revision: https://reviews.llvm.org/D107370
2021-08-04 11:16:36 +01:00
Kim-Anh Tran 0092dbcd80 [lldb] Fix lookup of .debug_loclists with split-dwarf
This patch fixes the lookup of locations in
.debug_loclists, if they are split in a .dwp file.

Mainly, we need to consider the cu index offsets.

Reviewed By: jankratochvil

Differential Revision: https://reviews.llvm.org/D107161
2021-08-04 11:36:44 +02:00
David Spickett b1802d694c [llvm][ExecutionEngine] Don't try to run tests on ARM64/Windows on Arm
We use CMAKE_SYSTEM_PROCESSOR to set the host_arch lit feature.
This is going to be the same value as CMAKE_HOST_SYSTEM_PROCESSOR,
which on windows is set to the value of the PROCESSOR_ARCHITECTURE
environment variable.

https://cmake.org/cmake/help/latest/variable/CMAKE_HOST_SYSTEM_PROCESSOR.html#cmake-host-system-processor

On Windows on Arm this is "ARM64", not "AArch64" as we currently
look for.

https://docs.microsoft.com/en-us/windows/win32/winprog64/wow64-implementation-details#environment-variables

Add ARM64 to the unsupported list.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D107361
2021-08-04 09:24:18 +00:00
Raphael Isemann e4977f9cb5 [lldb] Partly revert "Allow range-based for loops over DWARFDIE's children"
As pointed out in D107434 by Walter, D103172 also changed two for loops that
were actually not just iterating over some DIEs but also using the iteration
variable later on for some other things. This patch reverts the respective
faulty parts of D103172.
2021-08-04 11:05:08 +02:00
Tim Northover d7b0e5525a X86: fix frame offset calculation with mandatory tail calls
If there's a region of the stack reserved for potential tail call arguments
(only the case when we guarantee tail calls will be honoured), this is right
next to the incoming stored return address, not necessarily next to the
callee-saved area, so combining the two into a single figure leads to incorrect
offsets in some edge cases.
2021-08-04 10:02:42 +01:00
Serge Pavlov 16ff91ebcc Introduce intrinsic llvm.isnan
Clang has builtin function '__builtin_isnan', which implements C
library function 'isnan'. This function now is implemented entirely in
clang codegen, which expands the function into set of IR operations.
There are three mechanisms by which the expansion can be made.

* The most common mechanism is using an unordered comparison made by
  instruction 'fcmp uno'. This simple solution is target-independent
  and works well in most cases. It however is not suitable if floating
  point exceptions are tracked. Corresponding IEEE 754 operation and C
  function must never raise FP exception, even if the argument is a
  signaling NaN. Compare instructions usually does not have such
  property, they raise 'invalid' exception in such case. So this
  mechanism is unsuitable when exception behavior is strict. In
  particular it could result in unexpected trapping if argument is SNaN.

* Another solution was implemented in https://reviews.llvm.org/D95948.
  It is used in the cases when raising FP exceptions by 'isnan' is not
  allowed. This solution implements 'isnan' using integer operations.
  It solves the problem of exceptions, but offers one solution for all
  targets, however some can do the check in more efficient way.

* Solution implemented by https://reviews.llvm.org/D96568 introduced a
  hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
  specific code into IR. Now only SystemZ implements this hook and it
  generates a call to target specific intrinsic function.

Although these mechanisms allow to implement 'isnan' with enough
efficiency, expanding 'isnan' in clang has drawbacks:

* The operation 'isnan' is hidden behind generic integer operations or
  target-specific intrinsics. It complicates analysis and can prevent
  some optimizations.

* IR can be created by tools other than clang, in this case treatment
  of 'isnan' has to be duplicated in that tool.

Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such option is
specified, 'fcmp uno' may be optimized to 'false'. It is valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':

    std::isnan(std::numeric_limits<float>::quiet_NaN())

The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such function returns expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
by the expense of manual treatment of corner cases. If 'isnan' returns
assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion, that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.

To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by IEEE-754
and C standards in target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where is lowered in target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies requested semantics.

Differential Revision: https://reviews.llvm.org/D104854
2021-08-04 15:27:49 +07:00
Andre Vieira 2f002817fb [libc] Fix Memory Benchmarks code after rename
Differential Revision: https://reviews.llvm.org/D107376
2021-08-04 09:17:12 +01:00
Sjoerd Meijer 30fbb06979 [FuncSpec] Support specialising recursive functions
This adds support for specialising recursive functions. For example:

    int Global = 1;
    void recursiveFunc(int *arg) {
      if (*arg < 4) {
        print(*arg);
        recursiveFunc(*arg + 1);
      }
    }
    void main() {
      recursiveFunc(&Global);
    }

After 3 iterations of function specialisation, followed by inlining of the
specialised versions of recursiveFunc, the main function looks like this:

    void main() {
      print(1);
      print(2);
      print(3);
    }

To support this, the following has been added:
- Update the solver and state of the new specialised functions,
- An optimisation to propagate constant stack values after each iteration of
  function specialisation, which is necessary for the next iteration to
  recognise the constant values and trigger.

Specialising recursive functions is (at the moment) controlled by option
-func-specialization-max-iters and is opt-in for compile-time reasons. I.e.,
the default is -func-specialization-max-iters=1, but for the example above we
would need to use -func-specialization-max-iters=3. Future work is to see if we
can increase the default, or improve the cost-model/heuristics to control
compile-times.

Differential Revision: https://reviews.llvm.org/D106426
2021-08-04 08:07:04 +01:00
Senran Zhang 486b6013f9 [Support] Initialize common options in `getRegisteredOptions`
This allows users accessing options in libSupport before invoking
`cl::ParseCommandLineOptions`, and also matches the behavior before
D105959.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D106334
2021-08-03 23:59:10 -07:00
Adrian Kuegel 8385de1184 [mlir][Bazel] Adjust BUILD.bazel file.
The dependency is needed after 1b00b94ffc

Differential Revision: https://reviews.llvm.org/D107426
2021-08-04 08:51:02 +02:00
Esme-Yi 737e27f623 [llvm-readobj][XCOFF] dump the string table only if the size is bigger than 4. 2021-08-04 06:28:26 +00:00
Stephen Neuendorffer 432341d8a8 [mlir] Handle cases where transfer_read should turn into a scalar load
The existing vector transforms reduce the dimension of transfer_read
ops.  However, beyond a certain point, the vector op actually has
to be reduced to a scalar load, since we can't load a zero-dimension
vector.  This handles this case.

Note that in the longer term, it may be preferaby to support
zero-dimension vectors.  see
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097.

Differential Revision: https://reviews.llvm.org/D103432
2021-08-03 22:53:40 -07:00
hsmahesha 596e61c332 [AMDGPU] Ignore call graph node which does not have function info.
While collecting reachable callees (from kernels), ignore call graph node which
does not have associated function or associated function is not a definition.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D107329
2021-08-04 10:22:33 +05:30
Senran Zhang df4e0beaeb [NFC][ConstantFold] Check getAggregateElement before getSplatValue call
Constant::getSplatValue has O(N) time complexity in the worst case,
where N is the # of elements in a vector. So we call
Constant::getAggregateElement first and return earlier if possible to
avoid unnecessary getSplatValue calls.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D107252
2021-08-03 21:52:14 -07:00
Matthias Springer faeb7ec68b [mlir] Fix broken build in pass_manager.py
This test ensures that an error is generated from the Python side when running a module pass on a function. The test used to instantiate ViewOpGraph, however, this pass was changed into a general "any op" pass in D106253. Therefore, a different pass must be used in this test.

Differential Revision: https://reviews.llvm.org/D107424
2021-08-04 13:12:17 +09:00
Heejin Ahn 9bd02c433b [WebAssembly] Misc. cosmetic changes in EH (NFC)
- Rename `wasm.catch` intrinsic to `wasm.catch.exn`, because we are
  planning to add a separate `wasm.catch.longjmp` intrinsic which
  returns two values.
- Rename several variables
- Remove an unnecessary parameter from `canLongjmp` and `isEmAsmCall`
  from LowerEmscriptenEHSjLj pass
- Add `-verify-machineinstrs` in a test for a safety measure
- Add more comments + fix some errors in comments
- Replace `std::vector` with `SmallVector` for cases likely with small
  number of elements
- Renamed `EnableEH`/`EnableSjLj` to `EnableEmEH`/`EnableEmSjLj`: We are
  soon going to add `EnableWasmSjLj`, so this makes the distincion
  clearer

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D107405
2021-08-03 21:03:46 -07:00
Arthur Eubanks ad25344620 [MC][CodeGen] Emit constant pools earlier
Previously we would emit constant pool entries for ldr inline asm at the
very end of AsmPrinter::doFinalization(). However, if we're emitting
dwarf aranges, that would end all sections with aranges. Then if we have
constant pool entries to be emitted in those same sections, we'd hit an
assert that the section has already been ended.

We want to emit constant pool entries before emitting dwarf aranges.
This patch splits out arm32/64's constant pool entry emission into its
own MCTargetStreamer virtual method.

Fixes PR51208

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D107314
2021-08-03 20:55:31 -07:00
Matthias Springer a87be1c1bd [mlir] Truncate/skip long strings in ViewOpGraph.cpp
* New pass option `max-label-len`: Truncate attributes/result types that have more #chars.
* New pass option `print-attrs`: Activate/deactivate rendering of attributes.
* New pass option `printResultTypes`: Activate/deactivate rendering of result types.

Differential Revision: https://reviews.llvm.org/D106337
2021-08-04 12:14:46 +09:00
Vitaly Buka 3df1e7e6f0 [llvm-readobj][XCOFF] Warn about invalid offset
Followup for D105522

Differential Revision: https://reviews.llvm.org/D107398
2021-08-03 20:11:26 -07:00
Jacob Hegna b16c37fa2c [MLGO] Update the current model url for the Oz inliner model. 2021-08-04 03:09:00 +00:00
Matthias Springer 8d15b7dcba [mlir] Improve Graphviz visualization in PrintOpPass
* Visualize blocks and regions as subgraphs.
* Generate DOT file directly instead of using `GraphTraits`. `GraphTraits` does not support subgraphs.

Differential Revision: https://reviews.llvm.org/D106253
2021-08-04 11:56:26 +09:00
Christian Kandeler 159a269648 [clangd] Add new semantic token modifier "virtual"
This is needed for clients that want to highlight virtual functions
differently.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D107145
2021-08-03 19:53:01 -07:00
Aart Bik 3fc9294873 [mlir][sparse] add example to attribute doc
Also makes style consistent with the "surrounding"
text that appears on one webpage in MLIR doc

Reviewed By: grosul1

Differential Revision: https://reviews.llvm.org/D107418
2021-08-03 18:41:49 -07:00
Vitaly Buka 9ab590e3eb [msan] Add bsearch interceptor
Similar to qsort, bsearch can be called from non-instrumented
code of glibc. When it happends tls for arguments can be in uninitialized
state.

Unlike to qsort, bsearch does not move data, so we don't need to
check or initialize searched memory or key. Intrumented comparator will
do that on it's own.

Differential Revision: https://reviews.llvm.org/D107387
2021-08-03 18:39:14 -07:00
Jessica Paquette 5643736378 [AArch64][GlobalISel] Widen G_SELECT before clamping it
This allows us to handle the s88 G_SELECTS:

https://godbolt.org/z/5s18M4erY

Weird types like this can result in weird merges.

Widening to s128 first and then clamping down avoids that situation.

Differential Revision: https://reviews.llvm.org/D107415
2021-08-03 18:31:17 -07:00
Matthias Springer 767974f344 [mlir][scf] Fix bug in peelForLoop
Insertion point should be set before creating new operations.

Differential Revision: https://reviews.llvm.org/D107326
2021-08-04 10:20:46 +09:00
wlei f1affe8dc8 [llvm-profgen][CSSPGO] Support count based aggregated type of hybrid perf script
This change tried to integrate a new count based aggregated type of perf script. The only difference of the format is that an aggregated count is added at the head of the original sample which means the same samples are repeated to the given count times. This is used to reduce the perf script size.
e.g.
```
2
	          4005dc
	          400634
	          400684
	    7f68c5788793
 0x4005c8/0x4005dc/P/-/-/0  ....
```
Implemented by a dedicated PerfReader `AggregatedHybridPerfReader`.

Differential Revision: https://reviews.llvm.org/D107192
2021-08-03 17:56:35 -07:00
Rob Suderman 1b00b94ffc [mlir][tosa] Tosa shape propagation for tosa.cond_if
We can propagate the shape from tosa.cond_if operands into the true/false
regions then through the connected blocks. Then, using the tosa.yield ops
we can determine what all possible return types are.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D105940
2021-08-03 17:54:54 -07:00
Dan Liew b4121b335c [Compiler-rt] Fix running ASan/TSan unit tests under macOS 12.0.
On macOS the unit tests currently rely on libmalloc being used for
allocations (due to no functioning interceptors) but also having the
ASan/TSan allocator initialized in the same process.

This leads to crashes with the macOS 12.0 libmalloc nano allocator so
disable use of the allocator while running unit tests as a workaround.

rdar://80086125

Differential Revision: https://reviews.llvm.org/D107412
2021-08-03 17:46:27 -07:00
Rob Suderman 143edeca6d [mlir][tosa] Shape inference for a few remaining easy cases:
Handles shape inference for identity, cast, and rescale. These were missed
during the initialy elementwise work. This includes resize shape propagation
which includes both attribute and input type based propagation.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D105845
2021-08-03 17:20:32 -07:00
Shimin Cui 2d9759c790 [GlobalOpt] Fix the load types when OptimizeGlobalAddressOfMalloc
Currently, in OptimizeGlobalAddressOfMalloc, the transformation for global loads assumes that they have the same Type. With the support of ConstantExpr (https://reviews.llvm.org/D106589), this may not be true any more (as seen in the test case), and we miss the code to handle this, This is to fix that.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D107397
2021-08-03 19:22:53 -04:00
Michael Kruse ba2be8deba [clang/OpenMP][docs] Update OpenMP support list for unroll. 2021-08-03 18:11:17 -05:00