This revealed that scalar intrinsics could create nodes with a rounding mode of FROUND_CUR_DIRECTION, but the patterns didn't check for it. It just worked because isel doesn't check operand count and we had a pattern without the rounding mode argument at all.
llvm-svn: 282231
The backend can't encode all possible values of the argument and will fail isel. Checking in the frontend presents a friendlier experience to the user.
I started with builtins that can only take _MM_CUR_DIRECTION or _MM_NO_EXC. More builtins coming in the future.
llvm-svn: 282228
Summary:
For AMDGPU, we have been using the operating system component of the triple
for specifying the low-level runtime that is being used. The rationale for
this is that the host operating system (e.g. Linux) is irrelevant for GPU code,
since its execution enviroment will be mostly controled by the low-level runtime
being used to execute the code.
In most cases, higher level languages have their own runtime which is
implemented on top of the low-level runtime. The kernel ABIs of each
language mostly depend on the low-level runtime, but there may be some
slight differences between languages. OpenCL for example, may append
additional arguments to the kernel in order to pass values like global
offsets or buffers for printf. OpenMP, HCC, or other languages may want
to add their own values which differ from OpenCL.
The reason for adding a new opencl environment type is to make it possible for the backend
to distinguish between the ABIs of the higher-level languages and handle them correctly.
It seems cleaner to use the enviroment component for this rather than creating a new
OS type for every combination of low-level runtime / high-level language.
Reviewers: Anastasia, chandlerc
Subscribers: whchung, pekka.jaaskelainen, wdng, yaxunl, llvm-commits
Differential Revision: https://reviews.llvm.org/D24735
llvm-svn: 282218
Statically instanciate the most common PartialMappings. This should
be closer to what the code would look like when TableGen support is
added for GlobalISel. As a side effect, this should improve compile
time.
llvm-svn: 282215
In the verify method of the ValueMapping class we used to check that the
mapping exactly matches the bits of the input value. This is problematic
for statically allocated mappings because we would need a different
mapping for each different size of the value that maps on one
instruction. For instance, with such scheme, we would need a different
mapping for a value of size 1, 5, 23 whereas they all end up on a 32-bit
wide instruction.
Therefore, change the verifier to check that the meaningful bits are
covered by the mapping instead of matching them.
llvm-svn: 282214
This is another step toward TableGen'ed like structures. The BreakDown of
the mapping of the value will be statically computed by TableGen, thus
we only have to point to the right entry in the table instead of
dynamically allocate the mapping for each instruction.
We still support the dynamic allocation through a factory of
PartialMapping to ease the bring-up of the targets while the TableGen
backend is not available.
llvm-svn: 282213
We already have the udiv variant of this transform, so I think this is ok for
InstCombine too even though there is an increase in IR instructions. As the
tests and TODO comments show, the transform can lead to follow-on combines.
This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672
Differential Revision: https://reviews.llvm.org/D24527
llvm-svn: 282209
Add two options to the code coverage artifact prep script:
* --use-existing-profdata: Use an existing indexed profile instead of
merging the same profiles again.
* --restrict: Restrict the coverage reporting to the given list of
source directories.
With this in place, we can teach the coverage bot how to prepare
separate reports for each of the llvm tools.
llvm-svn: 282204
We've supported restricting coverage reports to a set of files for a
long time. Add support for being able to restrict by entire directories.
I suppose this supersedes D20803.
llvm-svn: 282202
Currently all nodes get added to the NextSU list when they are released,
so any candidate must be in that list, making the heuristic ineffective.
Remove it for now, we can add it back later in a working fashion if
necessary.
llvm-svn: 282200
and also the dependent r282175 "GVN-hoist: do not dereference null pointers"
It's causing compiler crashes building Harfbuzz (PR30499).
llvm-svn: 282199
The ELF spec doesn't allow relocations to point directly to
a deduplicated COMDAT section but this unfortunately happens in
practice. Bail out early instead of crashing.
Differential Revision: https://reviews.llvm.org/D24750
llvm-svn: 282197
This code was adding an explicit dependency on libclang because lldb needs clang headers, changing this to instead depend on the clang tablegen targets means we don't have to depend on building the clang bits in libclang that lldb doesn't need.
Note this is still a bit of a hack because we're adding the dependency to all lldb libraries, instead of just the ones that need it.
llvm-svn: 282196
USR_OVF is a subregister of USR, which is a member of CtrRegs. Having both
a register and its proper subregister in the same register class has bad
consequences for lane mask calculation: based solely on the lane mask info,
USR_OVF would not appear to be a subregister of USR.
llvm-svn: 282192
This change is very mechanical. All it does is change the
signature of `Options::GetDefinitions()` and `OptionGroup::
GetDefinitions()` to return an `ArrayRef<OptionDefinition>`
instead of a `const OptionDefinition *`. In the case of the
former, it deletes the sentinel entry from every table, and
in the case of the latter, it removes the `GetNumDefinitions()`
method from the interface. These are no longer necessary as
`ArrayRef` carries its own length.
In the former case, iteration was done by using a sentinel
entry, so there was no knowledge of length. Because of this
the individual option tables were allowed to be defined below
the corresponding class (after all, only a pointer was needed).
Now, however, the length must be known at compile time to
construct the `ArrayRef`, and as a result it is necessary to
move every option table before its corresponding class. This
results in this CL looking very big, but in terms of substance
there is not much here.
Differential revision: https://reviews.llvm.org/D24834
llvm-svn: 282188
According to MSDN (see the PR), functions which don't touch any callee-saved
registers (including %rsp) don't need any unwind info.
This patch makes LLVM not emit unwind info for such functions, to save
binary size.
Differential Revision: https://reviews.llvm.org/D24748
llvm-svn: 282185
Like NetBSD, OpenBSD prefers having a consistent set of typedefs
across the architectures it supports over strictly following the ARM
ABIs. The diff below makes sure that clang's view of those types
matches OpenBSD's system header files. It also adds a test that
checks the relevant types on all OpenBSD platforms that clang works
on. Hopefully we can add mips64 and powerpc to that list in the
future.
Patch by Mark Kettenis <mark.kettenis@xs4all.nl>
llvm-svn: 282184
Atomic comparison instructions use the sub-word load instruction on
Power8 and up but the value is not sign extended prior to the signed word
compare instruction. This patch adds that sign extension.
llvm-svn: 282182