Commit Graph

369800 Commits

Author SHA1 Message Date
Ettore Tiotto e6521ce064 [NFC][PartialInliner]: Clean up code
Make member function const where possible, use LLVM_DEBUG to print debug traces
rather than a custom option, pass by reference to avoid null checking, ...

Reviewed By: fhann

Differential Revision: https://reviews.llvm.org/D89895
2020-10-22 14:40:15 -04:00
Tom Stellard 6f798e460c HowToReleaseLLVM: Clean up document and remove references to SVN
Reviewed By: hans

Differential Revision: https://reviews.llvm.org/D80395
2020-10-22 11:34:03 -07:00
Sanjay Patel f6cb7f37ff [InstSimplify] add tests for ctpop constant range; NFC 2020-10-22 14:16:48 -04:00
Jonathan Crowther 9bc02e892f [SystemZ][z/OS] Set short-enums as the default for z/OS
This patch sets short-enums to be the default for z/OS.

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D89801
2020-10-22 14:15:58 -04:00
Duncan P. N. Exon Smith 156e8b3702 clang/Basic: Remove ContentCache::getRawBuffer, NFC
Replace `ContentCache::getRawBuffer` with `getBufferDataIfLoaded` and
`getBufferIfLoaded`, excising another accessor for the underlying
`MemoryBuffer*` in favour of `StringRef` and `MemoryBufferRef`.

Differential Revision: https://reviews.llvm.org/D89445
2020-10-22 14:00:44 -04:00
Paul C. Anagnostopoulos b9eecbfada [TableGen] Update documents to make them more complete
Differential Revision: https://reviews.llvm.org/D89962
2020-10-22 13:19:19 -04:00
Vedant Kumar 3419252a79 [InstCombine] Remove dbg.values describing contents of dead allocas
When InstCombine removes an alloca, it erases the dbg.{addr,declare}
instructions which refer to the alloca. It would be better to instead
remove all debug intrinsics which describe the contents of the dead
alloca, namely all dbg.value(<dead alloca>, ..., DW_OP_deref)'s.

This effectively undoes work performed in an InstCombine run earlier in
the pipeline by LowerDbgDeclare, which inserts DW_OP_deref dbg.values
before CallInst users of an alloca. The motivating example looks like:

```
  define void @foo(i32 %0) {
    %a = alloca i32              ; This alloca is erased.
    store i32 %0, i32* %a
    dbg.value(i32 %0, "arg0")    ; This dbg.value survives.
    dbg.value(i32* %a, "arg0", DW_OP_deref)
    call void @trivially_inlinable_no_op(i32* %a)
    ret void
  }
```

If the DW_OP_deref dbg.value is not erased, it becomes dbg.value(undef)
after inlining, making "arg0" unavailable. But we already have dbg.value
descriptions of the alloca's value (from LowerDbgDeclare), so the
DW_OP_deref dbg.value cannot serve its purpose of describing an
initialization of the alloca by some callee. It invalidates other useful
dbg.values, causing large gaps in location coverage, so we should delete
it (even though doing so may cause stale dbg.values to appear, if
there's a dead store to `%a` in @trivially_inlinable_no_op).

OTOH, it wouldn't be correct to delete all dbg.value descriptions of an
alloca. Note that it's possible to describe a variable that takes on
different pointer values, e.g.:

```
  void use(int *);
  void t(int a, int b) {
    int *local = &a;     // dbg.value(i32* %a.addr, "local")
    local = &b;          // dbg.value(i32* undef, "local")
    use(&a);             //           (note: %b.addr is optimized out)
    local = &a;          // dbg.value(i32* %a.addr, "local")
  }
```

In this example, the alloca for "b" is erased, but we need to describe
the value of "local" as <unavailable> before the call to "use". This
prevents "local" from appearing to be equal to "&a" at the callsite.

rdar://66592859

Differential Revision: https://reviews.llvm.org/D85555
2020-10-22 10:00:13 -07:00
Matt Arsenault 549f326d32 AMDGPU: Cleanup MIR test
Remove registers section and compact block/register numbers
2020-10-22 12:54:35 -04:00
Arthur Eubanks 87520657b8 Revert "[Docs] Clarify that FunctionPasses can't add/remove declarations"
This reverts commit 710676cf3a.
2020-10-22 09:49:42 -07:00
Fangrui Song a8f9f08018 [ELF] Set SHF_INFO_LINK for .rel[a].plt and .rel[a].dyn
The ELF spec says

> If the sh_flags field for this section header includes the attribute SHF_INFO_LINK, then this member represents a section header table index.

Set SHF_INFO_LINK so that binary manipulation tools know that sh_info is
a section header table index instead of (the number of local symbols in the case of SHT_SYMTAB/SHT_DYNSYM).
We have already added SHF_INFO_LINK for --emit-relocs retained SHT_REL[A].

For example, we can teach llvm-objcopy to preserve the section index of the sh_info referenced section if
SHF_INFO_LINK is set. (GNU objcopy recognizes .rel[a].plt and updates
sh_info even if SHF_INFO_LINK is not set).

Reviewed By: grimar, psmith

Differential Revision: https://reviews.llvm.org/D89828
2020-10-22 09:48:19 -07:00
Raphael Isemann 5dc70332d5 Revert "[lldb] Explicitly use the configuration architecture when building test executables"
This reverts commit 41185226f6.

Causes TestQuoting to fail on Windows.
2020-10-22 18:42:19 +02:00
Nikita Popov 32b6e9a450 [DomTree] Accept Value as Def (NFC)
Non-instruction defs like arguments, constants or global values
always dominate all instructions/uses inside the function. This
case currently needs to be treated separately by the caller, see
https://reviews.llvm.org/D89623#inline-832818 for an example.

This patch makes the dominator tree APIs accept a Value instead of
an Instruction and always returns true for the non-Instruction case.

A complication here is that BasicBlocks are also Values. For that
reason we can't support the dominates(Value *, BasicBlock *)
variant, as it would conflict with dominates(BasicBlock *, BasicBlock *),
which has different semantics. For the other two APIs we assert
that the passed value is not a BasicBlock.

Differential Revision: https://reviews.llvm.org/D89632
2020-10-22 18:32:03 +02:00
Florian Hahn d842b88687 [SLP] Add tests with selects that can be turned into min/max.
AArch64 does not have a flexible vector select instruction. In some
cases, the selects can be turned into min/max however, for which there
are dedicated vector instructions on AArch64.

This patch adds some tests for such cases.
2020-10-22 17:25:28 +01:00
Tim Corringham 3c1273d737 [AMDGPU] Add amdgpu specific loop threshold metadata
Add new loop metadata amdgpu.loop.unroll.threshold to allow the initial AMDGPU
specific unroll threshold value to be specified on a loop by loop basis.

The intention is to be able to to allow more nuanced hints, e.g. specifying a
low threshold value to indicate that a loop may be unrolled if cheap enough
rather than using the all or nothing llvm.loop.unroll.disable metadata.

Differential Revision: https://reviews.llvm.org/D84779
2020-10-22 17:21:32 +01:00
Arthur Eubanks af3c51e354 [gn build] Add missing clangd dependencies
Fixes
$ ninja obj/build/rel/gen/clang-tools-extra/clangd/CompletionModel.CompletionModel.obj

Some tablegen include files from clang/include/clang/AST and
clang/include/clang/Sema need to be generated before CompletionModel is
compiled.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D89657
2020-10-22 09:04:30 -07:00
Arthur Eubanks 710676cf3a [Docs] Clarify that FunctionPasses can't add/remove declarations
In preparation for potential future concurrency, a FunctionPass
shouldn't modify anything at the module level that other FunctionPasses
can also modify.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D89890
2020-10-22 09:03:42 -07:00
Med Ismail Bennani efe62b637d [lldb/DWARF] Add support for DW_OP_implicit_value
This patch completes https://reviews.llvm.org/D83560. Now that the
compiler can emit `DW_OP_implicit_value` into DWARF expressions, lldb
needed to learn reading these opcodes for variable inspection and
expression evaluation.

This implicit location descriptor specifies an immediate value with two
operands: the length (ULEB128) followed by a block representing the value
in the target memory representation.

rdar://67406091

Differential revision: https://reviews.llvm.org/D89842

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
2020-10-22 18:02:44 +02:00
Marco Antognini a779a16993 [OpenCL] Remove unused extensions
Many non-language extensions are defined but also unused. This patch
removes them with their tests as they do not require compiler support.

The cl_khr_select_fprounding_mode extension is also removed because it
has been deprecated since OpenCL 1.1 and Clang doesn't have any specific
support for it.

The cl_khr_context_abort extension is only referred to in "The OpenCL
Specification", version 1.2 and 2.0, in Table 4.3, but no specification
is provided in "The OpenCL Extension Specification" for these versions.
Because it is both unused in Clang and lacks specification, this
extension is removed.

The following extensions are platform extensions that bring new OpenCL
APIs but do not impact the kernel language nor require compiler support.
They are therefore removed.

- cl_khr_gl_sharing, introduced in OpenCL 1.0

- cl_khr_icd, introduced in OpenCL 1.2

- cl_khr_gl_event, introduced in OpenCL 1.1
Note: this extension adds a new API to create cl_event but it also
specifies that these can only be used by clEnqueueAcquireGLObjects.
Hence, they cannot be used on the device side and the extension does
not impact the kernel language.

- cl_khr_d3d10_sharing, introduced in OpenCL 1.1

- cl_khr_d3d11_sharing, introduced in OpenCL 1.2

- cl_khr_dx9_media_sharing, introduced in OpenCL 1.2

- cl_khr_image2d_from_buffer, introduced in OpenCL 1.2

- cl_khr_initialize_memory, introduced in OpenCL 1.2

- cl_khr_gl_depth_images, introduced in OpenCL 1.2
Note: this extension is related to cl_khr_depth_images but only the
latter adds new features to the kernel language.

- cl_khr_spir, introduced in OpenCL 1.2

- cl_khr_egl_event, introduced in OpenCL 1.2
Note: this extension adds a new API to create cl_event but it also
specifies that these can only be used by clEnqueueAcquire* API
functions. Hence, they cannot be used on the device side and the
extension does not impact the kernel language.

- cl_khr_egl_image, introduced in OpenCL 1.2

- cl_khr_terminate_context, introduced in OpenCL 1.2

The minimum required OpenCL version used in OpenCLExtensions.def for
these extensions is not always correct. Removing these address that
issue.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D89372
2020-10-22 17:01:31 +01:00
Aaron En Ye Shi b2524eb944 [HIP] Fix HIP rounding math intrinsics
The __ocml_*_rte_f32 and __ocml_*_rte_f64 functions are not
available if OCML_BASIC_ROUNDED_OPERATIONS is not defined.

Reviewed By: b-sumner, yaxunl

Fixes: SWDEV-257235

Differential Revision: https://reviews.llvm.org/D89966
2020-10-22 15:57:09 +00:00
Mircea Trofin e24537d48f [NFC][MC] Use MCRegister for ReachingDefAnalysis APIs
Also updated the users of the APIs; and a drive-by small change to
RDFRegister.cpp

Differential Revision: https://reviews.llvm.org/D89912
2020-10-22 08:47:35 -07:00
Arthur Eubanks cb9ca35977 [LoopRotate][NPM] Disable header duplication under -Oz
It was already disabled under -Oz in
buildFunctionSimplificationPipeline(), but not in
buildModuleOptimizationPipeline()/addPGOInstrPasses().

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D89927
2020-10-22 08:39:12 -07:00
Jonas Devlieghere 826997c462 [lldb] Fix a regression introduced by D75730
In a new Range class was introduced to simplify and the Disassembler API
and reduce duplication. It unintentionally broke the
SBFrame::Disassemble functionality because it unconditionally converts
the number of instructions to a Range{Limit::Instructions,
num_instructions}. This is subtly different from the previous behavior,
where now we're passing a Range and assume it's valid in the callee, the
original code would propagate num_instructions and the callee would
compare the value and decided between disassembling instructions or
bytes.

Unfortunately the existing tests was not particularly strict:

  disassembly = frame.Disassemble()
  self.assertNotEqual(len(disassembly), 0, "Disassembly was empty.")

This would pass because without this patch we'd disassemble zero
instructions, resulting in an error:

  (lldb) script print(lldb.frame.Disassemble())
  error: error reading data from section __text

Differential revision: https://reviews.llvm.org/D89925
2020-10-22 08:38:03 -07:00
Eugene Zhulenev a8b0ae3bdd [mlir] Do not start threads in AsyncRuntime
pthreads is not enabled for all builds by default

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D89967
2020-10-22 08:31:30 -07:00
Teresa Johnson 5c20d7db9f [MemProf] Allow the binary to specify the profile output filename
This will allow the output directory to be specified by a build time
option, similar to the directory specified for regular PGO profiles via
-fprofile-generate=. The memory profiling instrumentation pass will
set up the variable. This is the same mechanism used by the PGO
instrumentation and runtime.

Depends on D87120 and D89629.

Differential Revision: https://reviews.llvm.org/D89086
2020-10-22 08:30:19 -07:00
Christian Sigg 9ab5362bab [mlir][gpu] NFC: switch occurrences of gpu.launch_func to custom format.
Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D89929
2020-10-22 17:27:19 +02:00
Piotr Sobczak 7ae0033ca8 [AMDGPU] Fix expansion of i16 MULH
This commit marks i16 MULH as expand in AMDGPU backend,
which is necessary after the refactoring in D80485.

Differential Revision: https://reviews.llvm.org/D89965
2020-10-22 17:05:06 +02:00
Florian Hahn c1705e0ba4 [AArch64] Add min/max cost-model tests for v2i32. 2020-10-22 16:04:13 +01:00
Evgeny Leviant ed6a91f456 [ARM][SchedModels] Convert IsLdstsoScaledPred to MCSchedPredicate
Differential revision: https://reviews.llvm.org/D89939
2020-10-22 18:03:01 +03:00
Simon Pilgrim 2692978050 [X86] X86AsmParser - make methods const where possible. NFCI.
Reported by cppcheck
2020-10-22 15:55:06 +01:00
Simon Pilgrim 091b18ba81 [X86] Return const& in IntelExprStateMachine::getIdentifierInfo(). NFCI.
Avoid unnecessary copy in X86AsmParser::ParseIntelOperand
2020-10-22 15:55:06 +01:00
Jeremy Morse 68ac02c0dd [DebugInstrRef] Pass DBG_INSTR_REFs through register allocation
Both FastRegAlloc and LiveDebugVariables/greedy need to cope with
DBG_INSTR_REFs. None of them actually need to take any action, other than
passing DBG_INSTR_REFs through: variable location information doesn't refer
to any registers at this stage.

LiveDebugVariables stashes the instruction information in a tuple, then
re-creates it later. This is only necessary as the register allocator
doesn't expect to see any debug instructions while it's working. No
equivalence classes or interval splitting is required at all!

No changes are needed for the fast register allocator, as it just ignores
debug instructions. The test added checks that both of them preserve
DBG_INSTR_REFs.

This also expands ScheduleDAGInstrs.cpp to treat DBG_INSTR_REFs the same as
DBG_VALUEs when rescheduling instructions around. The current movement of
DBG_VALUEs around is less than ideal, but it's not a regression to make
DBG_INSTR_REFs subject to the same movement.

Differential Revision: https://reviews.llvm.org/D85757
2020-10-22 15:51:22 +01:00
Florian Hahn d6efc87518 [AArch64] Add min/max cost-model tests for v4i16. 2020-10-22 15:47:50 +01:00
Raphael Isemann 30d5590d17 [lldb] Fix TestTargetAPI.py on Apple simulators
This test checks that the output of `SBTarget.GetDescription()` contains the
substrings `'a.out', 'Target', 'Module', 'Breakpoint'` in that order. This test
is currently failing on Apple simulators as apparently 'Module' can't be found
in the output after 'Target".

The reason for that is that the actual output of `SBTarget.GetDescription()` looks like this:
```
Target
  Module /build/path/lldb-test-build.noindex/python_api/target/TestTargetAPI.test_get_description_dwarf/a.out
0x7ff2b6d3f990:     ObjectFileMachO64, file = /build/path/lldb-test-build.noindex/python_api/target/TestTargetAPI.test_get_description
[...]
0x7ff307150000:   BreakpointList with 0 Breakpoints:
<LLDB module output repeats for each loaded module>
```

Clearly the string order should be `'Target', 'Module', 'a.out', 'Breakpoint'`.
However, LLDB is also a bunch of system shared libraries (libxpc.dylib,
libobjc.A.dylib, etc.) when *not* running against a simulator, we end up
unintentionally finding the `'Target', 'Module', 'Breakpoint'` substrings in the
trailing descriptions of the system modules. When running against a simulator we
however don't load shared system libraries.

This patch just moves the substrings in the correct order to make this test pass
without having any shared library modules in the description output.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D89698
2020-10-22 16:41:54 +02:00
Matt Arsenault d5c0561667 AMDGPU: Fix not always reserving VGPRs used for SGPR spilling
The VGPRs used for SGPR spills need to be reserved, even if we aren't
speculatively reserving one.

This was broken by 117e5609e9.
2020-10-22 10:19:19 -04:00
Matt Arsenault d3bcfe2a36 AMDGPU: Implement getNoPreservedMask
We don't support funclets for exception handling and I hit this when
manually reducing MIR.
2020-10-22 10:17:31 -04:00
Matt Arsenault 188df17420 ScheduleDAGInstrs: Skip debug instructions at end of scheduling region
If the end instruction of the scheduling region was a DBG_VALUE, the
uses of the debug instruction were tracked as if they were real
uses. This would then hit the deadDefHasNoUse assertion in
addVRegDefDeps if the only use was the debug instruction.
2020-10-22 10:16:45 -04:00
Jeremy Morse e3c6b0f151 Limit debug instr-referencing tests to X86
The instruction referencing work currently only works on X86, and all the
tests for it will be X86 based for the time being. Configure the whole
directory to be X86-only, seeing how I keep on landing tests that don't
have the correct REQUIRES lines.
2020-10-22 15:04:19 +01:00
Jon Chesterfield 09bc755dea [OpenMP] Emit calls to int64_t functions for amdgcn
[OpenMP] Emit calls to int64_t functions for amdgcn

Two functions, syncwarp and active_thread_mask, return lanemask_t. Currently
this is assumed to be int32, which is true for nvptx. Patch makes the type
target architecture dependent.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D89746
2020-10-22 15:02:47 +01:00
Paul C. Anagnostopoulos b2faf75568 [TableGen] Continue improving the comments for the data structures.
Differential Revision: https://reviews.llvm.org/D89901
2020-10-22 10:00:49 -04:00
Eugene Zhulenev f8fcff5a9d [mlir] Convert from Async dialect to LLVM coroutines
Lower from Async dialect to LLVM by converting async regions attached to `async.execute` operations into LLVM coroutines (https://llvm.org/docs/Coroutines.html):
1. Outline all async regions to functions
2. Add LLVM coro intrinsics to mark coroutine begin/end
3. Use MLIR conversion framework to convert all remaining async types and ops to LLVM + Async runtime function calls

All `async.await` operations inside async regions converted to coroutine suspension points. Await operation outside of a coroutine converted to the blocking wait operations.

Implement simple runtime to support concurrent execution of coroutines.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D89292
2020-10-22 06:30:46 -07:00
Raphael Isemann 41185226f6 [lldb] Explicitly use the configuration architecture when building test executables
The Darwin builder currently assumes in `getArchCFlags` that the passed `arch`
value is an actual string it can string.join with vendor/os/version/env strings:

```
   triple = '-'.join([arch, vendor, os, version, env])
```

However this is not true for most tests as we just pass down the `arch=None`
default value from `TestBase.build`. This causes that if we actually end up in
this function we just error out when concatenating `None` with the other actual
strings of vendor/os/version/env. What we should do instead is check that if
there is no test-specific architecture that we fall back to the configuration's
architecture value.

It seems we already worked around this in `builder.getArchSpec` by explicitly
falling back to the architecture specified in the configuration.

This patch just moves this fallback logic to the top `build` function so that it
affects all functions called from `TestBase.build`.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D89056
2020-10-22 15:30:25 +02:00
Paul C. Anagnostopoulos e4b4543ff0 [Clang] [TableGen] Clean up !if(!eq(bool, 1) and related booleans
Differential Revision: https://reviews.llvm.org/D89893
2020-10-22 09:29:15 -04:00
Simon Pilgrim 794dc7ad26 [CodeGen] Split MVT::changeTypeToInteger() functionality from EVT::changeTypeToInteger().
Add the MVT equivalent handling for EVT changeTypeToInteger/changeVectorElementType/changeVectorElementTypeToInteger.

All the SimpleVT code already exists inside the EVT equivalents, but by splitting this out we can use these directly inside MVT types without converting to/from EVT.
2020-10-22 14:27:42 +01:00
Evgeny Leviant 088f3c83cc [llvm-mca] Add few ldm* instructions to cortex-a57 test case 2020-10-22 16:21:40 +03:00
Alexander Belyaev 461605c418 [mlir] Add MemRefReinterpretCastOp definition to Standard.
Reuse most code for printing/parsing/verification from SubViewOp.

https://llvm.discourse.group/t/rfc-standard-memref-cast-ops/1454/15

Differential Revision: https://https://reviews.llvm.org/D89720
2020-10-22 15:17:22 +02:00
Raphael Isemann bb1d702e25 [lldb][NFC] Make GetShellSafeArgument return std::string and unittest it. 2020-10-22 14:47:10 +02:00
Florian Hahn fbb6375db0 [AArch64] Add cost model tests for min/max intrinsics. 2020-10-22 13:28:04 +01:00
Jeremy Morse cb668d2e76 Test I added requires X86 to be built.
This the second time I've stepped on this landmine, I'll look at setting
a lit local config. All the tests in this dir are going to be X86 for now.
2020-10-22 13:18:55 +01:00
Jeremy Morse d73275993b [DebugInstrRef] Substitute debug value numbers to handle optimizations
This patch touches two optimizations, TwoAddressInstruction and X86's
FixupLEAs pass, both of which optimize by re-creating instructions. For
LEAs, various bits of arithmetic are better represented as LEAs on X86,
while TwoAddressInstruction sometimes converts instrs into three address
instructions if it's profitable.

For debug instruction referencing, both of these require substitutions to
be created -- the old instruction number must be pointed to the new
instruction number, as illustrated in the added test. If this isn't done,
any variable locations based on the optimized instruction are
conservatively dropped.

Differential Revision: https://reviews.llvm.org/D85756
2020-10-22 13:01:03 +01:00
David Zarzycki 8556f38b0d [x86 testing] NFC: Create exhaustive vector popcnt ULT/UGT tests
There are bunch of optimization opportunities right now in the vector
popcnt code gen when doing simple less-than/greater-than comparisons, so
let's examine them all to ensure that things don't regress as different
scenarios are fixed. We can always delete some later once some fixes are
made.

Please note: the new files were auto-generated. If people want, I can
commit the short C code that printed out the various combinations.
2020-10-22 07:57:40 -04:00