Commit Graph

5758 Commits

Author SHA1 Message Date
David Tenty 58817a0783 [clang][XCOFF] Indicate that XCOFF does not support COMDATs
Summary: XCOFF doesn't support COMDATs, so clang shouldn't emit them.

Reviewers: stevewan, sfertile, Xiangling_L

Reviewed By: sfertile

Subscribers: dschuff, aheejin, dexonsmith, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74631
2020-02-18 16:10:11 -05:00
Mikhail Maltsev 63809d365e [ARM,MVE] Add vbrsrq intrinsics family
Summary:
This patch adds a new MVE intrinsics family, `vbrsrq`: vector bit
reverse and shift right. The intrinsics are compiled into the VBRSR
instruction. Two new LLVM IR intrinsics were also added: arm.mve.vbrsr
and arm.mve.vbrsr.predicated.

Reviewers: simon_tatham, dmgreen, ostannard, MarkMurrayARM

Reviewed By: simon_tatham

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74721
2020-02-18 17:31:21 +00:00
Djordje Todorovic 2bf44d11cb Revert "Reland "[DebugInfo] Enable the debug entry values feature by default""
This reverts commit rGa82d3e8a6e67.
2020-02-18 16:38:11 +01:00
Melanie Blower cd2c5af6df Reland D74436 "Change clang option -ffp-model=precise to select ffp-contract=on""
Change clang option -ffp-model=precise, the default, to select ffp-contract=on
    The patch caused some problems for PowerPC but ibm has made
    adjustments so I am resubmitting this patch.  Additionally, Andy looked
    at the performance regressions on LNT and it looks like a loop
    unrolling decision that could be adjusted.

    Reviewers: rjmccall, Andy Kaylor

    Differential Revision: https://reviews.llvm.org/D74436
2020-02-18 06:55:36 -08:00
Djordje Todorovic a82d3e8a6e Reland "[DebugInfo] Enable the debug entry values feature by default"
This patch enables the debug entry values feature.

  - Remove the (CC1) experimental -femit-debug-entry-values option
  - Enable it for x86, arm and aarch64 targets
  - Resolve the test failures
  - Leave the llc experimental option for targets that do not
    support the CallSiteInfo yet

Differential Revision: https://reviews.llvm.org/D73534
2020-02-18 14:41:08 +01:00
Simon Tatham c32af4447f [ARM,MVE] Add the vmovnbq,vmovntq intrinsic family.
Summary:
These are in some sense the inverse of vmovl[bt]q: they take a vector
of n wide elements and truncate each to half its width. So they only
write half a vector's worth of output data, and therefore they also
take an 'inactive' parameter to provide the other half of the data in
the output vector. So vmovnb overwrites the even lanes of 'inactive'
with the narrowed values from the main input, and vmovnt overwrites
the odd lanes.

LLVM had existing codegen which generates these MVE instructions in
response to IR that takes two vectors of wide elements, or two vectors
of narrow ones. But in this case, we have one vector of each. So my
clang codegen strategy is to narrow the input vector of wide elements
by simply reinterpreting it as the output type, and then we have two
narrow vectors and can represent the operation as a vector shuffle
that interleaves lanes from both of them.

Even so, not all the cases I needed ended up being selected as a
single MVE instruction, so I've added a couple more patterns that spot
combinations of the 'MVEvmovn' and 'ARMvrev32' SDNodes which can be
generated as a VMOVN instruction with operands swapped.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74337
2020-02-18 09:34:50 +00:00
Simon Tatham 5e97940cd2 [ARM,MVE] Add the vmovlbq,vmovltq intrinsic family.
Summary:
These intrinsics take a vector of 2n elements, and return a vector of
n wider elements obtained by sign- or zero-extending every other
element of the input vector. They're represented in IR as a
shufflevector that extracts the odd or even elements of the input,
followed by a sext or zext.

Existing LLVM codegen already matches this pattern and generates the
VMOVLB instruction (which widens the even-index input lanes). But no
existing isel rule was generating VMOVLT, so I've added some. However,
the new rules currently only work in little-endian MVE, because the
pattern they expect from isel lowering includes a bitconvert which
doesn't have the right semantics in big-endian.

The output of one existing codegen test is improved by those new
rules.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74336
2020-02-18 09:34:50 +00:00
Simon Tatham 68b49f7ef4 [ARM,MVE] Add intrinsics vclzq and vclsq.
Summary:
vclzq maps nicely to the existing target-independent @llvm.ctlz IR
intrinsic. But vclsq ('count leading sign bits') has no corresponding
target-independent intrinsic, so I've made up @llvm.arm.mve.vcls.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74335
2020-02-18 09:34:50 +00:00
Simon Tatham b6236e9479 [ARM,MVE] Add the vrev16q, vrev32q, vrev64q family.
Summary:
These intrinsics just reorder the lanes of a vector, so the natural IR
representation is as a shufflevector operation. Existing LLVM codegen
already recognizes those particular shufflevectors and generates the
MVE VREV instruction.

This commit adds the unpredicated forms only.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74334
2020-02-18 09:34:50 +00:00
Simon Tatham c8b3196e54 [ARM,MVE] Add intrinsics for FP rounding operations.
Summary:
This adds the unpredicated forms of six different MVE intrinsics which
all round a vector of floating-point numbers to integer values,
leaving them still in FP format, differing only in rounding mode and
exception settings.

Five of them map to existing target-independent intrinsics in LLVM IR,
such as @llvm.trunc and @llvm.rint. The sixth, mapping to the `vrintn`
instruction, is done by inventing a target-specific intrinsic.

(`vrintn` behaves the same as `vrintx` in terms of the output value:
the side effects on the FPSCR flags are the only difference between
the two. But ACLE specifies separate user-callable intrinsics for the
two, so the side effects matter enough to make sure we generate the
right one of the two instructions in each case.)

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: miyuki

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74333
2020-02-18 09:34:50 +00:00
Simon Tatham df3ed6c0fe [ARM,MVE] Add intrinsics for int <-> float conversion.
Summary:
This adds the unpredicated versions of the family of vcvtq intrinsics
that convert between a vector of floats and a vector of the same size
of integer. These are represented in IR using the standard fptosi,
fptoui, sitofp and uitofp operations, which existing LLVM codegen
already handles.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74332
2020-02-18 09:34:50 +00:00
Simon Tatham 90dc78bc62 [ARM,MVE] Add intrinsics for abs, neg and not operations.
Summary:
This commit adds the unpredicated intrinsics for the unary operations
vabsq (absolute value), vnegq (arithmetic negation), vmvnq (bitwise
complement), vqabsq and vqnegq (saturating versions of abs and neg for
signed integers, in the sense that they give INT_MAX if an input lane
is INT_MIN).

This is done entirely in clang: all of these operations have existing
isel patterns and existing tests for them on the LLVM side, so I've
just made clang emit the same IR that those patterns already match.

Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D74331
2020-02-18 09:34:50 +00:00
Mikhail Maltsev 489f62e801 [ARM,MVE] Add vector-scalar intrinsics
Summary:
This patch adds vector-scalar variants to the following families of
MVE intrinsics:
* vaddq
* vsubq
* vmulq
* vqaddq
* vqsubq
* vhaddq
* vhsubq
* vqdmulhq
* vqrdmulhq

The vector-scalar variants perform a splat operation on the scalar
operand and then perform the same operations as their vector-vector
counterparts. Code generation is done accordingly (using LLVM IR 'insert'
and 'shuffle' operations which are later converted into an ARMvdup
SDNode).

Reviewers: simon_tatham, dmgreen, MarkMurrayARM, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74620
2020-02-17 17:47:05 +00:00
Mark de Wever 9658d895c8 [Sema] Adds the pointer-to-int-cast diagnostic
Converting a pointer to an integer whose result cannot represented in the
integer type is undefined behavior is C and prohibited in C++. C++ already
has a diagnostic when casting. This adds a diagnostic for C.

Since this diagnostic uses the range of the conversion it also modifies
int-to-pointer-cast diagnostic to use a range.

Fixes PR8718: No warning on casting between pointer and non-pointer-sized int

Differential Revision: https://reviews.llvm.org/D72231
2020-02-16 15:38:25 +01:00
Melanie Blower 9122b92f8e Revert "Reland D74436 "Change clang option -ffp-model=precise to select ffp-contract=on""
This reverts commit 0a1123eb43.
Want to revert this because it's causing trouble for PowerPC
I also fixed test fp-model.c which was looking for an incorrect error message
2020-02-14 07:32:09 -08:00
Fangrui Song 0a1123eb43 Reland D74436 "Change clang option -ffp-model=precise to select ffp-contract=on"
Buildbot are failing with the current revert status. So reland with a
fix to fp-model.c
2020-02-13 16:22:03 -08:00
Melanie Blower 88ec01ca1b Revert "Revert "Revert "Change clang option -ffp-model=precise to select ffp-contract=on"""
This reverts commit abd09053bc.
It's causing internal buildbot fails on ppc

Conflicts:
	clang/lib/Driver/ToolChains/Clang.cpp
2020-02-13 15:06:12 -08:00
Erik Pilkington e26c24b849 Revert "[IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas"
This reverts commit fafc6e4fdf.

Should fix ppc stage2 failure: http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/23546

Conflicts:
	clang/lib/CodeGen/CGCall.cpp
2020-02-12 12:26:46 -08:00
Melanie Blower abd09053bc Revert "Revert "Change clang option -ffp-model=precise to select ffp-contract=on""
This reverts commit 99c5bcbce8.
Change clang option -ffp-model=precise to select ffp-contract=on
Including some small touch-ups to the original commit

Reviewers: rjmccall, Andy Kaylor

Differential Revision: https://reviews.llvm.org/D74436
2020-02-12 07:30:43 -08:00
Djordje Todorovic 97ed706a96 Revert "[DebugInfo] Enable the debug entry values feature by default"
This reverts commit rG9f6ff07f8a39.

Found a test failure on clang-with-thin-lto-ubuntu buildbot.
2020-02-12 11:59:04 +01:00
jasonliu 55e2678fcd [clang] Add -fignore-exceptions
Summary:

This is trying to implement the functionality proposed in:
http://lists.llvm.org/pipermail/cfe-dev/2017-April/053417.html
An exception can throw, but no cleanup is going to happen.
A module compiled with exceptions on, can catch the exception throws
from module compiled with -fignore-exceptions.

The use cases for enabling this option are:
1. Performance analysis of EH instrumentation overhead
2. The ability to QA non EH functionality when EH functionality is not available.
3. User of EH enabled headers knows the calls won't throw in their program and
   wants the performance gain from ignoring EH construct.

The implementation tried to accomplish that by removing any landing pad code
 that might get generated.

Reviewed by: aaron.ballman

Differential Revision: https://reviews.llvm.org/D72644
2020-02-12 09:56:18 +00:00
Djordje Todorovic 9f6ff07f8a [DebugInfo] Enable the debug entry values feature by default
This patch enables the debug entry values feature.

  - Remove the (CC1) experimental -femit-debug-entry-values option
  - Enable it for x86, arm and aarch64 targets
  - Resolve the test failures
  - Leave the llc experimental option for targets that do not
    support the CallSiteInfo yet

Differential Revision: https://reviews.llvm.org/D73534
2020-02-12 10:25:14 +01:00
Reid Kleckner 2c6a3896ab Re-land "[MS] Overhaul how clang passes overaligned args on x86_32"
This brings back 2af74e27ed and reverts
eaabaf7e04.

The changes were correct, the code that was broken contained an ODR
violation that assumed that these types are passed equivalently:
  struct alignas(uint64_t) Wrapper { uint64_t P };
  void f(uint64_t p);
  void f(Wrapper p);

MSVC does not pass them the same way, and so clang-cl should not pass
them the same way either.
2020-02-11 16:49:28 -08:00
Melanie Blower 99c5bcbce8 Revert "Change clang option -ffp-model=precise to select ffp-contract=on"
This reverts commit 3fcdf2fa94.
Sorry I was too hasty with my commit, I will review Andy's comments
and resubmit.
2020-02-11 14:20:00 -08:00
Melanie Blower 3fcdf2fa94 Change clang option -ffp-model=precise to select ffp-contract=on
Reviewers: rjmccall

Differential Revision: https://reviews.llvm.org/D74436
2020-02-11 14:07:10 -08:00
Ian Levesque 14f870366a [xray][clang] Always add xray-skip-entry/exit and xray-ignore-loops attrs
The function attributes xray-skip-entry, xray-skip-exit, and
xray-ignore-loops were only being applied if a function had an
xray-instrument attribute, but they should apply if xray is enabled
globally too.

Differential Revision: https://reviews.llvm.org/D73842
2020-02-11 14:00:41 -08:00
Krzysztof Parzyszek 57148e0379 [Hexagon] Fix ABI info for returning HVX vectors 2020-02-11 12:38:54 -06:00
Kai Nacke a5040d5ec9 [SytemZ] Disable vector ABI when using option -march=arch[8|9|10]
When specifying -march=arch[8|9|10], those CPU types do NOT support
the vector extension. In this case the vector ABI must be disabled.
The generated data layout should NOT contain 64-v128.

Reviewers: uweigand

Differential Revision: https://reviews.llvm.org/D74146
2020-02-10 04:14:05 -05:00
serge_sans_paille e67cbac812 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 10:42:45 +01:00
serge-sans-paille 4546211600 Revert "Support -fstack-clash-protection for x86"
This reverts commit 0fd51a4554.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354
2020-02-09 10:06:31 +01:00
serge_sans_paille 0fd51a4554 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 09:35:42 +01:00
serge-sans-paille 658495e6ec Revert "Support -fstack-clash-protection for x86"
This reverts commit e229017732.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308
2020-02-08 14:26:22 +01:00
serge_sans_paille e229017732 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with better option
handling and more portable testing

Differential Revision: https://reviews.llvm.org/D68720
2020-02-08 13:31:52 +01:00
Guillaume Chatelet d65bbf81f8 [clang] Add support for __builtin_memcpy_inline
Summary: This is a follow up on D61634 and the last step to implement http://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html

Reviewers: efriedma, courbet, tejohnson

Subscribers: hiraditya, cfe-commits, llvm-commits, jdoerfert, t.p.northover

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73543
2020-02-07 23:55:26 +01:00
Erik Pilkington fafc6e4fdf [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas
These temporaries are only used in the callee, and their memory can be reused
after the call is complete.

rdar://58552124

Differential revision: https://reviews.llvm.org/D74094
2020-02-07 14:39:31 -08:00
Nico Weber b03c3d8c62 Revert "Support -fstack-clash-protection for x86"
This reverts commit 4a1a0690ad.
Breaks tests on mac and win, see https://reviews.llvm.org/D68720
2020-02-07 14:49:38 -05:00
serge_sans_paille 4a1a0690ad Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a3 with correct option
flags set.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 19:54:39 +01:00
serge-sans-paille f6d98429fc Revert "Support -fstack-clash-protection for x86"
This reverts commit 39f50da2a3.

The -fstack-clash-protection is being passed to the linker too, which
is not intended.

Reverting and fixing that in a later commit.
2020-02-07 11:36:53 +01:00
Diogo Sampaio 9d869180c4 [ARM] Follow AACPS for preserving number of loads/stores of volatile bit-fields
Summary:
Following the AAPCS, every store to a volatile bit-field requires to generate one load of that field, even if all the bits are going to be replaced.
This patch allows the user to opt-in in following such rule, whenever the a.

AAPCS Release 2019Q1.1 (https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, paragraph: Volatile bit-fields – preserving number and width of container accesses

```
When a volatile bit-field is written, and its container does not overlap with any non-bit-field member, its
container must be read exactly once and written exactly once using the access width appropriate to the
type of the container. The two accesses are not atomic.

```

Reviewers: lebedev.ri, ostannard, jfb, eli.friedman

Reviewed By: jfb

Subscribers: rsmith, rjmccall, dexonsmith, kristof.beyls, jfb, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67399
2020-02-07 10:11:54 +00:00
serge_sans_paille 39f50da2a3 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

Differential Revision: https://reviews.llvm.org/D68720
2020-02-07 10:56:15 +01:00
Craig Topper 96400ae2a4 Recommit "[FPEnv][X86] Platform-specific builtin constrained FP enablement"
With REQUIRES: x86-register-target added to the tests.

Also remove some unneeded FIXMEs

But add a FIXME for bad IR generation for FMADDSUB/FMSUBADD with
constrained FP.

Original patch by Kevin P. Neal
2020-02-06 16:54:35 -08:00
Kevin P. Neal ad0e03fd4c Revert "[FPEnv][X86] Platform-specific builtin constrained FP enablement"
This reverts commit 208470dd5d.

Tests fail:
error: unable to create target: 'No available targets are compatible with triple "x86_64-apple-darwin"'

This happens on clang-hexagon-elf, clang-cmake-armv7-quick, and
clang-cmake-armv7-quick bots.

If anyone has any suggestions on why then I'm all ears.

Differential Revision: https://reviews.llvm.org/D73570

Revert "[FPEnv][X86] Speculative fix for failures introduced by eda495426."

This reverts commit 80e17e5fcc.

The speculative fix didn't solve the test failures on Hexagon, ARMv6, and
MSVC AArch64.
2020-02-06 19:17:14 -05:00
Kevin P. Neal 80e17e5fcc [FPEnv][X86] Speculative fix for failures introduced by eda495426.
Differential Revision: https://reviews.llvm.org/D73570
2020-02-06 15:28:36 -05:00
Kevin P. Neal 208470dd5d [FPEnv][X86] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the X86-specific builtins don't
use constrained intrinsics in some cases. Fix that.

Differential Revision: https://reviews.llvm.org/D73570
2020-02-06 14:20:44 -05:00
Mikhail Maltsev 2694cc3dca [ARM][MVE] Add fixed point vector conversion intrinsics
Summary:
This patch implements the following Arm ACLE MVE intrinsics:
* vcvtq_n_*
* vcvtq_m_n_*
* vcvtq_x_n_*

and two corresponding LLVM IR intrinsics:
* int_arm_mve_vcvt_fix (vcvtq_n_*)
* int_arm_mve_vcvt_fix_predicated (vcvtq_m_n_*, vcvtq_x_n_*)

Reviewers: simon_tatham, ostannard, MarkMurrayARM, dmgreen

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74134
2020-02-06 16:49:45 +00:00
Thomas Lively 8c3e6af71b [WebAssembly] Add experimental multivalue calling ABI
Summary:
For now, this ABI simply expands all possible aggregate arguments and
returns all possible aggregates directly. This ABI will change rapidly
as we prototype and benchmark a new ABI that takes advantage of
multivalue return and possibly other changes from the MVP ABI.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72972
2020-02-04 21:09:49 -08:00
Matt Arsenault a3c814d234 Separately track input and output denormal mode
AMDGPU and x86 at least both have separate controls for whether
denormal results are flushed on output, and for whether denormals are
implicitly treated as 0 as an input. The current DAGCombiner use only
really cares about the input treatment of denormals.
2020-02-04 12:59:21 -05:00
Yonghong Song 9271cab270 [BPF] use base lvalue type for preserve_{struct,union}_access_index metadata
Linux commit
  1cf5b23988 (diff-289313b9fec99c6f0acfea19d9cfd949)
uses "#pragma clang attribute push (__attribute__((preserve_access_index)),
      apply_to = record)"
to apply CO-RE relocations to all records including the following pattern:
  #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
  typedef struct {
    int a;
  } __t;
  #pragma clang attribute pop
  int test(__t *arg) { return arg->a; }

The current approach to use struct type in the relocation record will
result in an anonymous struct, which make later type matching difficult
    in bpf loader. In fact, current BPF backend will fail the above program
with assertion:
  clang: ../lib/Target/BPF/BPFAbstractMemberAccess.cpp:796: ...
     Assertion `TypeName.size()' failed.

The patch use the base lvalue type for the "base" value to annotate
preservee_{struct,union}_access_index intrinsics. In the above example,
the type will be "__t" which preserved the type name.

Differential Revision: https://reviews.llvm.org/D73900
2020-02-04 09:28:30 -08:00
Jonas Paulsson 563e84790f [SystemZ] Support -msoft-float
This is needed when building the Linux kernel.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D72189
2020-02-04 10:32:45 -05:00
Fangrui Song dbc96b518b Revert "[CodeGenModule] Assume dso_local for -fpic -fno-semantic-interposition"
This reverts commit 789a46f2d7.

Accidentally committed.
2020-02-03 10:09:39 -08:00
Fangrui Song 789a46f2d7 [CodeGenModule] Assume dso_local for -fpic -fno-semantic-interposition
Summary:
Clang -fpic defaults to -fno-semantic-interposition (GCC -fpic defaults
to -fsemantic-interposition).
Users need to specify -fsemantic-interposition to get semantic
interposition behavior.

Semantic interposition is currently a best-effort feature. There may
still be some cases where it is not handled well.

Reviewers: peter.smith, rnk, serge-sans-paille, sfertile, jfb, jdoerfert

Subscribers: dschuff, jyknight, dylanmckay, nemanjai, jvesely, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, arphaman, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73865
2020-02-03 09:52:48 -08:00
Simon Tatham 961530fdc9 [ARM,MVE] Fix vreinterpretq in big-endian mode.
Summary:
In big-endian MVE, the simple vector load/store instructions (i.e.
both contiguous and non-widening) don't all store the bytes of a
register to memory in the same order: it matters whether you did a
VSTRB.8, VSTRH.16 or VSTRW.32. Put another way, the in-register
formats of different vector types relate to each other in a different
way from the in-memory formats.

So, if you want to 'bitcast' or 'reinterpret' one vector type as
another, you have to carefully specify which you mean: did you want to
reinterpret the //register// format of one type as that of the other,
or the //memory// format?

The ACLE `vreinterpretq` intrinsics are specified to reinterpret the
register format. But I had implemented them as LLVM IR bitcast, which
is specified for all types as a reinterpretation of the memory format.
So a `vreinterpretq` intrinsic, applied to values already in registers,
would code-generate incorrectly if compiled big-endian: instead of
emitting no code, it would emit a `vrev`.

To fix this, I've introduced a new IR intrinsic to perform a
register-format reinterpretation: `@llvm.arm.mve.vreinterpretq`. It's
implemented by a trivial isel pattern that expects the input in an
MQPR register, and just returns it unchanged.

In the clang codegen, I only emit this new intrinsic where it's
actually needed: I prefer a bitcast wherever it will have the right
effect, because LLVM understands bitcasts better. So we still generate
bitcasts in little-endian mode, and even in big-endian when you're
casting between two vector types with the same lane size.

For testing, I've moved all the codegen tests of vreinterpretq out
into their own file, so that they can have a different set of RUN
lines to check both big- and little-endian.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73786
2020-02-03 11:20:06 +00:00
Simon Tatham f8d4afc49a [ARM,MVE] Add intrinsics for v[id]dupq and v[id]wdupq.
Summary:
These instructions generate a vector of consecutive elements starting
from a given base value and incrementing by 1, 2, 4 or 8. The `wdup`
versions also wrap the values back to zero when they reach a given
limit value. The instruction updates the scalar base register so that
another use of the same instruction will continue the sequence from
where the previous one left off.

At the IR level, I've represented these instructions as a family of
target-specific intrinsics with two return values (the constructed
vector and the updated base). The user-facing ACLE API provides a set
of intrinsics that throw away the written-back base and another set
that receive it as a pointer so they can update it, plus the usual
predicated versions.

Because the intrinsics return two values (as do the underlying
instructions), the isel has to be done in C++.

This is the first family of MVE intrinsics that use the `imm_1248`
immediate type in the clang Tablegen framework, so naturally, I found
I'd given it the wrong C integer type. Also added some tests of the
check that the immediate has a legal value, because this is the first
time those particular checks have been exercised.

Finally, I also had to fix a bug in MveEmitter which failed an
assertion when I nested two `seq` nodes (the inner one used to extract
the two values from the pair returned by the IR intrinsic, and the
outer one put on by the predication multiclass).

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73357
2020-02-03 11:20:06 +00:00
Simon Tatham cf7e98e6f7 [ARM,MVE] Add intrinsics for vdupq.
Summary:
The unpredicated case of this is trivial: the clang codegen just makes
a vector splat of the input, and LLVM isel is already prepared to
handle that. For the predicated version, I've generated a `select`
between the same vector splat and the `inactive` input parameter, and
added new Tablegen isel rules to match that pattern into a predicated
`MVE_VDUP` instruction.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73356
2020-02-03 11:20:06 +00:00
Fangrui Song aed488e3a4 [Driver] Move -fsemantic-interposition decision from cc1 to driver
And add test/Driver/fsemantic-interposition.c
2020-02-02 20:45:29 -08:00
Fangrui Song 1acf129bcf [Frontend] Delete a redundant check of -pg for setFramePointer()
Driver errors if -fomit-frame-pointer is used together with -pg.
useFramePointerForTargetByDefault() returns true if -pg is specified.
=>
(!OmitFP && useFramePointerForTargetByDefault(Args, Triple)) is true
=>
We cannot get FramePointerKind::None
2020-02-01 00:29:29 -08:00
serge-sans-paille fd09f12f32 Implement -fsemantic-interposition
First attempt at implementing -fsemantic-interposition.

Rely on GlobalValue::isInterposable that already captures most of the expected
behavior.

Rely on a ModuleFlag to state whether we should respect SemanticInterposition or
not. The default remains no.

So this should be a no-op if -fsemantic-interposition isn't used, and if it is,
isInterposable being already used in most optimisation, they should honor it
properly.

Note that it only impacts architecture compiled with -fPIC and no pie.

Differential Revision: https://reviews.llvm.org/D72829
2020-01-31 14:02:33 +01:00
Craig Topper a10cec02f7 [X86] Improve X86 cmpps/cmppd/cmpss/cmpsd intrinsics with strictfp
The constrained fcmp intrinsics don't allow the TRUE/FALSE predicates.
Using them will assert. To workaround this I'm emitting the old X86 specific intrinsics that were never removed from the backend when we switched to using fcmp in IR. We have no way to mark them as being strict, but that's true of all target specific intrinsics so doesn't seem like we need to solve that here.

I've also added support for selecting between signaling and quiet.

Still need to support SAE which will require using a target specific
intrinsic. Also need to fix masking to not use an AND instruction
after the compare.

Differential Revision: https://reviews.llvm.org/D72906
2020-01-29 15:52:11 -08:00
Sanne Wouda 2939fc13c8 [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q)
Summary:
Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h),
are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated)
indices, like so:

   %shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
   %vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle)

When %v's values are known, the shufflevector is optimized away and we are no
longer able to select the lane variant of sqdmulh in the backend.

This defeats a (hand-coded) optimization that packs several constants into a
single vector and uses the lane intrinsics to reduce register pressure and
trade-off materialising several constants for a single vector load from the
constant pool, like so:

   int16x8_t v = {2,3,4,5,6,7,8,9};
   a = vqdmulh_laneq_s16(a, v, 0);
   b = vqdmulh_laneq_s16(b, v, 1);
   c = vqdmulh_laneq_s16(c, v, 2);
   d = vqdmulh_laneq_s16(d, v, 3);
   [...]

In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4%
performance difference.

We could teach the compiler to recover the lane variants, but this would likely
require its own pass.  (Alternatively, "volatile" could be used on the constants
vector, but this is a bit ugly.)

This patch instead implements the following LLVM IR intrinsics for AArch64 to
maintain the original structure through IR optmization and into instruction
selection:
- sqdmulh_lane
- sqdmulh_laneq
- sqrdmulh_lane
- sqrdmulh_laneq.

These 'lane' variants need an additional register class.  The second argument
must be in the lower half of the 64-bit NEON register file, but only when
operating on i16 elements.

Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane
(etc.) remain, so code that does not rely on NEON intrinsics to generate these
instructions is not affected.

This patch also changes clang to emit these IR intrinsics for the corresponding
NEON intrinsics (AArch64 only).

Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma

Reviewed By: efriedma

Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71469
2020-01-29 13:25:23 +00:00
Sanne Wouda cbc45e4e75 Regenerate aarch64-neon-2velem.c CHECK lines 2020-01-29 13:03:27 +00:00
Hans Wennborg eaabaf7e04 Revert "[MS] Overhaul how clang passes overaligned args on x86_32"
It broke some Chromium tests, so let's revert until it can be fixed; see
https://crbug.com/1046362

This reverts commit 2af74e27ed.
2020-01-28 22:25:07 +01:00
Aaron Ballman 5547919280 Fix a crash when casting _Complex and ignoring the results.
Performing a cast where the result is ignored caused Clang to crash when
performing codegen for the conversion:

  _Complex int a;
  void fn1() { (_Complex double) a; }

This patch addresses the crash by not trying to emit the scalar conversions,
causing it to be a noop. Fixes PR44624.
2020-01-28 13:05:56 -05:00
Wang, Pengfei 3239b5034e [FPEnv] Add pragma FP_CONTRACT support under strict FP.
Summary: Support pragma FP_CONTRACT under strict FP.

Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3

Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72820
2020-01-28 20:43:43 +08:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Fangrui Song d600ab3bb5 [Frontend] Delete some unneeded CC1 options 2020-01-23 22:01:04 -08:00
Craig Topper a2137d6e09 [X86] Add -flax-vector-conversions=none to all of the x86 vector intrinsic header tests. 2020-01-23 20:43:50 -08:00
Teresa Johnson 90e630a95e Revert "[LTO/WPD] Enable aggressive WPD under LTO option"
This reverts commit 59733525d3.

There is a windows sanitizer bot failure in one of the cfi tests
that I will need some time to figure out:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/57155/steps/stage%201%20check/logs/stdio
2020-01-23 17:29:24 -08:00
Fangrui Song 69bf40c45f [Driver][CodeGen] Support -fpatchable-function-entry=N,M and __attribute__((patchable_function_entry(N,M))) where M>0
Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D73072
2020-01-23 17:02:54 -08:00
Teresa Johnson 59733525d3 [LTO/WPD] Enable aggressive WPD under LTO option
Summary:
Third part in series to support Safe Whole Program Devirtualization
Enablement, see RFC here:
http://lists.llvm.org/pipermail/llvm-dev/2019-December/137543.html

This patch adds type test metadata under -fwhole-program-vtables,
even for classes without hidden visibility. It then changes WPD to skip
devirtualization for a virtual function call when any of the compatible
vtables has public vcall visibility.

Additionally, internal LLVM options as well as lld and gold-plugin
options are added which enable upgrading all public vcall visibility
to linkage unit (hidden) visibility during LTO. This enables the more
aggressive WPD to kick in based on LTO time knowledge of the visibility
guarantees.

Support was added to all flavors of LTO WPD (regular, hybrid and
index-only), and to both the new and old LTO APIs.

Unfortunately it was not simple to split the first and second parts of
this part of the change (the unconditional emission of type tests and
the upgrading of the vcall visiblity) as I needed a way to upgrade the
public visibility on legacy WPD llvm assembly tests that don't include
linkage unit vcall visibility specifiers, to avoid a lot of test churn.

I also added a mechanism to LowerTypeTests that allows dropping type
test assume sequences we now aggressively insert when we invoke
distributed ThinLTO backends with null indexes, which is used in testing
mode, and which doesn't invoke the normal ThinLTO backend pipeline.

Depends on D71907 and D71911.

Reviewers: pcc, evgeny777, steven_wu, espindola

Subscribers: emaste, Prazek, inglorion, arichardson, hiraditya, MaskRay, dexonsmith, dang, davidxl, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71913
2020-01-23 16:09:44 -08:00
Reid Kleckner 2af74e27ed [MS] Overhaul how clang passes overaligned args on x86_32
MSVC 2013 would refuse to pass highly aligned things (typically vectors
and aggregates) by value. Users would receive this error:
  t.cpp(11) : error C2719: 'w': formal parameter with __declspec(align('32')) won't be aligned
  t.cpp(11) : error C2719: 'q': formal parameter with __declspec(align('32')) won't be aligned

However, in MSVC 2015, this behavior was changed, and highly aligned
things are now passed indirectly. To avoid breaking backwards
incompatibility, objects that do not have a *required* high alignment
(i.e. double) are still passed directly, even though they are not
naturally aligned. This change implements the new behavior of passing
things indirectly.

The new behavior is:
- up to three vector parameters can be passed in [XYZ]MM0-2
- remaining arguments with required alignment greater than 4 bytes are
  passed indirectly

Previously, MSVC never passed things truly indirectly, meaning clang
would always apply the byval attribute to indirect arguments. We had to
go to the trouble of adding inalloca so that non-trivially copyable C++
types could be passed in place without copying the object
representation. When inalloca was added, we asserted that all arguments
passed indirectly must use byval. With this change, that assert no
longer holds, and I had to update inalloca to handle that case. The
implicit sret pointer parameter was already handled this way, and this
change generalizes some of that logic to arguments.

There are two cases that this change leaves unfixed:
1. objects that are non-trivially copyable *and* overaligned
2. vectorcall + inalloca + vectors

For case 1, I need to touch C++ ABI code in MicrosoftCXXABI.cpp, so I
want to do it in a follow-up.

For case 2, my fix is one line, but it will require updating IR tests to
use lots of inreg, so I wanted to separate it out.

Related to D71915 and D72110

Fixes most of PR44395

Reviewed By: rjmccall, craig.topper, erichkeane

Differential Revision: https://reviews.llvm.org/D72114
2020-01-23 16:04:00 -08:00
Roman Lebedev 5ffe6408ff
[Codegen] If reasonable, materialize clang's `AllocAlignAttr` as llvm's Alignment Attribute on call-site function return value
Summary:
Much like with the previous patch (D73005) with `AssumeAlignedAttr`
handling, results in mildly more readable IR,
and will improve test coverage in upcoming patch.

Note that in `AllocAlignAttr`'s case, there is no requirement
for that alignment parameter to end up being an I-C-E.

Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith

Reviewed By: erichkeane

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73006
2020-01-23 22:50:49 +03:00
Roman Lebedev e819f7c9fe
[Codegen] If reasonable, materialize clang's `AssumeAlignedAttr` as llvm's Alignment Attribute on call-site function return value
Summary:
This should be mostly NFC - we still lower the same alignment
knowledge to the IR. The main reasoning here is that
this somewhat improves readability of IR like this,
and will improve test coverage in upcoming patch.

Even though the alignment is guaranteed to always be an I-C-E,
we don't always materialize it as llvm's Alignment Attribute because:
1. There may be a non-zero offset
2. We may be sanitizing for alignment

Note that if there already was an IR alignment attribute
on return value, we union them, and thus the alignment
only ever rises.

Also, there is a second relevant clang attribute `AllocAlignAttr`,
so that is why `AbstractAssumeAlignedAttrEmitter` is templated.

Reviewers: erichkeane, jdoerfert, hfinkel, aaron.ballman, rsmith

Reviewed By: erichkeane

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73005
2020-01-23 22:50:49 +03:00
Simon Tatham 98ea4b30c2 [ARM,MVE] Make the MVE intrinsics work in C++!
Summary:
Apparently nobody has tried this in months of development. It turns
out that `FunctionDecl::getBuiltinID` will never consider a function
to be a builtin if it is in C++ and not extern "C". So none of the
function declarations in <arm_mve.h> are recognized as builtins when
clang is compiling in C++ mode: it just emits calls to them as
ordinary functions, which then turn out not to exist at link time.

The trivial fix is to wrap most of arm_mve.h in an extern "C".

Added a test in clang/test/CodeGen/arm-mve-intrinsics which checks
basic functioning of the MVE header file in C++ mode. I've filled it
with copies of existing test functions from other files in that
directory, including a few moderately tricky cases of overloading (in
particular one that relies on the strict-polymorphism attribute added
in D72518).

(I considered making //every// test in that directory compile in both
C and C++ mode and check the code generation was identical. But I
think that would increase testing time by more than the value it adds,
and also update_cc_test_checks gets confused when the output function
name varies between RUN lines.)

Reviewers: LukeGeeson, MarkMurrayARM, miyuki, dmgreen

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73268
2020-01-23 14:10:27 +00:00
Simon Tatham 4321c6af28 [ARM,MVE] Support immediate vbicq,vorrq,vmvnq intrinsics.
Summary:
Immediate vmvnq is code-generated as a simple vector constant in IR,
and left to the backend to recognize that it can be created with an
MVE VMVN instruction. The predicated version is represented as a
select between the input and the same constant, and I've added a
Tablegen isel rule to turn that into a predicated VMVN. (That should
be better than the previous VMVN + VPSEL: it's the same number of
instructions but now it can fold into an adjacent VPT block.)

The unpredicated forms of VBIC and VORR are done by enabling the same
isel lowering as for NEON, recognizing appropriate immediates and
rewriting them as ARMISD::VBICIMM / ARMISD::VORRIMM SDNodes, which I
then instruction-select into the right MVE instructions (now that I've
also reworked those instructions to use the same MC operand encoding).
In order to do that, I had to promote the Tablegen SDNode instance
`NEONvorrImm` to a general `ARMvorrImm` available in MVE as well, and
similarly for `NEONvbicImm`.

The predicated forms of VBIC and VORR are represented as a vector
select between the original input vector and the output of the
unpredicated operation. The main convenience of this is that it still
lets me use the existing isel lowering for VBICIMM/VORRIMM, and not
have to write another copy of the operand encoding translation code.

This intrinsic family is the first to use the `imm_simd` system I put
into the MveEmitter tablegen backend. So, naturally, it showed up a
bug or two (emitting bogus range checks and the like). Fixed those,
and added a full set of tests for the permissible immediates in the
existing Sema test.

Also adjusted the isel pattern for `vmovlb.u8`, which stopped matching
because lowering started turning its input into a VBICIMM. Now it
recognizes the VBICIMM instead.

Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72934
2020-01-23 11:53:52 +00:00
Russell Gallop 4662f6e1c7 [test] Avoid loop-unroll.c test getting confused by fadd in git revision
Differential Revision: https://reviews.llvm.org/D73162
2020-01-23 09:27:16 +00:00
Roman Lebedev 372cb38f45
[Codegen] Emit both AssumeAlignedAttr and AllocAlignAttr assumptions if they exist
Summary:
We shouldn't be just giving up if we find one of them
(like we currently do with `AssumeAlignedAttr`),
we should emit them all.

As the tests show, even if we materialized good knowledge
from `__attribute__((assume_aligned(32)`, it doesn't mean
`__attribute__((alloc_align([...])))` info won't be useful.
It might be, but that isn't given.

Reviewers: erichkeane, jdoerfert, aaron.ballman

Reviewed By: erichkeane

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72979
2020-01-21 21:18:27 +03:00
Kevin P. Neal 2e667d07c7 [FPEnv][SystemZ] Platform-specific builtin constrained FP enablement
When constrained floating point is enabled the SystemZ-specific builtins
don't use constrained intrinsics in some cases. Fix that.

Differential Revision: https://reviews.llvm.org/D72722
2020-01-21 12:44:39 -05:00
Krzysztof Parzyszek 305bf5b21d [Hexagon] Add support for Hexagon v67t microarchitecture (tiny core) 2020-01-21 11:35:10 -06:00
Diogo Sampaio 2147703bde Revert "[ARM] Follow AACPS standard for volatile bit-fields access width"
This reverts commit 6a24339a45.
Submitted using ide button by mistake
2020-01-21 15:31:33 +00:00
Diogo Sampaio 6a24339a45 [ARM] Follow AACPS standard for volatile bit-fields access width
Summary:
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the widht of their
declarative type. In such case:
```
struct S1 {
  short a : 1;
}
```
should be accessed using load and stores of the width
(sizeof(short)), where now the compiler does only load
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicted with C and C++ object models by creating
data race conditions that are not part of the bit-field,
e.g.
```
struct S2 {
  short a;
  int  b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.

The AAPCS Release 2019Q1.1
(https://static.docs.arm.com/ihi0042/g/aapcs32.pdf)
section 8.1 Data Types, page 35, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths
of bit-fields where the container overlaps with a non-bit-field member.
 This is because the C/C++ memory model defines these as being separate
memory locations, which can be accessed by two threads
 simultaneously. For this reason, compilers must be permitted to use a
narrower memory access width (including splitting the access
 into multiple instructions) to avoid writing to a different memory location.
```

I've updated the patch D16586 to follow such behavior by verifying that we
only change volatile bit-field access when:
 - it won't overlap with any other non-bit-field member
 - we only access memory inside the bounds of the record

Regarding the number of memory accesses, that should be preserved, that will
be implemented by D67399.

Reviewers: rsmith, rjmccall, eli.friedman, ostannard

Subscribers: ostannard, kristof.beyls, cfe-commits, carwil, olista01

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72932
2020-01-21 15:23:38 +00:00
Zakk Chen e15fb06e2d [RISCV] Pass target-abi via module flag metadata
Reviewers: lenary, asb

Reviewed By: lenary

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72755
2020-01-20 23:30:54 -08:00
Krzysztof Parzyszek c12a5917d2 [Hexagon] Add support for Hexagon/HVX v67 ISA 2020-01-20 16:16:49 -06:00
Mark Murray b10a0eb04a [ARM][MVE][Intrinsics] Take abs() of VMINNMAQ, VMAXNMAQ intrinsics' first arguments.
Summary: Fix VMINNMAQ, VMAXNMAQ intrinsics; BOTH arguments have the absolute values taken.

Reviewers: dmgreen, simon_tatham

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72830
2020-01-20 14:33:26 +00:00
Ian Levesque 97ba483026 [xray] Allow instrumenting only function entry and/or only function exit
Extend -fxray-instrumentation-bundle to split function-entry and
function-exit into two separate options, so that it is possible to
instrument only function entry or only function exit.  For use cases
that only care about one or the other this will save significant overhead
and code size.

Differential Revision: https://reviews.llvm.org/D72890
2020-01-17 13:32:34 -08:00
Ian Levesque 1d62be2441 [clang][xray] Add -fxray-ignore-loops option
XRay allows tuning by minimum function size, but also always instruments
functions with loops in them. If the minimum function size is set to a
large value the loop instrumention ends up causing most functions to be
instrumented anyway. This adds a new flag, -fxray-ignore-loops, to disable
the loop detection logic.

Differential Revision: https://reviews.llvm.org/D72873
2020-01-17 13:32:24 -08:00
Adrian Prantl 7b30370e5b Move the sysroot attribute from DIModule to DICompileUnit
[this re-applies c0176916a4
 with the correct commit message and phabricator link]

This addresses point 1 of PR44213.
https://bugs.llvm.org/show_bug.cgi?id=44213

The DW_AT_LLVM_sysroot attribute is used for Clang module debug info,
to allow LLDB to import a Clang module from source. Currently it is
part of each DW_TAG_module, however, it is the same for all modules in
a compile unit. It is more efficient and less ambiguous to store it
once in the DW_TAG_compile_unit.

This should have no effect on DWARF consumers other than LLDB.

Differential Revision: https://reviews.llvm.org/D71732
2020-01-17 12:55:40 -08:00
Adrian Prantl c17aee67f1 Revert "Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot"
This reverts commit 12e479475a.

I accidentally landed this patch with the wrong commit message ...
2020-01-17 12:52:36 -08:00
Alina Sbirlea 90bdb03727 Update clang test. 2020-01-17 11:08:59 -08:00
Sanne Wouda ecfd6d3e84 [clang] Set function attributes on SEH filter functions correctly.
Summary:
When compiling with -munwind-tables, the SEH filter funclet needs the uwtable
function attribute, which gets automatically added if we use
SetInternalFunctionAttributes.  The filter funclet is internal so this seems
appropriate.

Reviewers: rnk

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72786
2020-01-17 18:09:42 +00:00
Fangrui Song 932b5d6fca [test] Fix tests after D52810 2020-01-17 10:02:56 -08:00
Adrian Prantl 12e479475a Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot
This is a purely cosmetic change that is NFC in terms of the binary
output. I bugs me that I called the attribute DW_AT_LLVM_isysroot
since the "i" is an artifact of GCC command line option syntax
(-isysroot is in the category of -i options) and doesn't carry any
useful information otherwise.

This attribute only appears in Clang module debug info.

Differential Revision: https://reviews.llvm.org/D71722
2020-01-17 09:36:48 -08:00
serge-sans-paille d293417931 Add __warn_memset_zero_len builtin as a workaround for glibc issue
Glibc issue: https://sourceware.org/bugzilla/show_bug.cgi?id=25399
The fix consist in considering the missing function as a builtin lowered to a nop.

Differential Revision: https://reviews.llvm.org/D72869
2020-01-17 09:58:32 +01:00
serge-sans-paille d437fba8ef Reapply Allow system header to provide their own implementation of some builtin
This reverts commit 3d210ed3d1.

See https://reviews.llvm.org/D71082 for the patch and discussion that make it
possible to reapply this patch.
2020-01-17 09:58:32 +01:00
Krzysztof Parzyszek 6f3effbbf0 [Hexagon] Update autogenerated intrinsic info in clang
In addition to that, use target features to validate intrinsic
availability on a given target.
2020-01-16 14:20:12 -06:00
Amy Huang 3d210ed3d1 Revert "Allow system header to provide their own implementation of some builtin"
This reverts commit 921f871ac4 because it
causes libc++ code to trigger __warn_memset_zero_len.

See https://reviews.llvm.org/D71082.
2020-01-15 15:03:45 -08:00
Mark Murray da9d57d2c2 [ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics.
Summary: Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics and unit tests.

Reviewers: simon_tatham, miyuki, dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D72761
2020-01-15 17:20:15 +00:00
Teresa Johnson 76b92cc7c1 Fix bot by adjusting wildcard matching
I noticed one bot failure due to
24a00ef240 because the wildcard matching
was not working as intended, fixed it to act similar to other checks of
CGSCCToFunctionPassAdaptor.
2020-01-15 08:37:15 -08:00
Teresa Johnson 24a00ef240 Restore "[ThinLTO] Add additional ThinLTO pipeline testing with new PM"
This restores 2af97be802 (reverted at
6288f86e87), with all the fixes I had
applied at the time, along with a new fix for non-determinism in the
ordering of a couple of passes due to being accessed as parameters on
the same call.

I've also added --dump-input=fail to the new tests so I can more
thoroughly fix any additional failures.
2020-01-15 07:33:08 -08:00
Reid Kleckner 8e780252a7 [X86] ABI compat bugfix for MSVC vectorcall
Summary:
Before this change, X86_32ABIInfo::classifyArgument would be called
twice on vector arguments to vectorcall functions. This function has
side effects to track GPR register usage, and this would lead to
incorrect GPR usage in some cases.  The specific case I noticed is from
running out of XMM registers with mixed FP and vector arguments and no
aggregates of any kind. Consider this prototype:

  void __vectorcall vectorcall_indirect_vec(
      double xmm0, double xmm1, double xmm2, double xmm3, double xmm4,
      __m128 xmm5,
      __m128 ecx,
      int edx,
      __m128 mem);

classifyArgument has no effects when called on a plain FP type, but when
called on a vector type, it modifies FreeRegs to model GPR consumption.
However, this should not happen during the vector call first pass.

I refactored the code to unify vectorcall HVA logic with regcall HVA
logic. The conventions pass HVAs in registers differently (expanded vs.
not expanded), but if they do not fit in registers, they both pass them
indirectly by address.

Reviewers: erichkeane, craig.topper

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72110
2020-01-14 17:49:13 -08:00
Rong Xu c9ee5e996e Fix windows bot failures in c410adb092c9cb51ddb0b55862b70f2aa8c5b16f
(clang diagnostic handler for IR input files)
2020-01-14 16:32:17 -08:00