66965 Commits

Author SHA1 Message Date
Jonas Paulsson 7b63e27cc0 Temporarily run machine-verifier once in test/CodeGen/SPARC/fp128.ll, so that
it XFAIL:s also without expensive checks.

See https://reviews.llvm.org/D63973
2019-12-03 11:21:52 +01:00
Jonas Paulsson f8c0cfc24e ImplicitNullChecks: Don't add a dead definition of DepMI as live-in
This is one of the fixes needed to reapply D68267 which improves verification
of live-in lists.

Review: craig.topper
https://reviews.llvm.org/D70434
2019-12-03 11:02:53 +01:00
Djordje Todorovic 4cfceb9106 [LiveDebugValues] Introduce entry values of unmodified params
The idea is to remove front-end analysis for the parameter's value
modification and leave it to the value tracking system. In some cases the
front-end marks a parameter as modified even when the line of code that
modifies the parameter gets optimized out, so this approach will cover
even more entry values. In addition, extending the support for modified
parameters will be easier with this approach.

Since the goal is to recognize if a parameter’s value has changed, the idea
at a very high level is: If we encounter a DBG_VALUE other than the entry
value one describing the same variable (parameter), we can assume that the
variable’s value has changed and we should not track its entry value any
more. That would be the ideal scenario, but due to various LLVM optimizations,
a variable’s value could be just moved around from one register to another
(and there will be additional DBG_VALUEs describing the same variable), so
we have to recognize such situations (otherwise, we will lose a lot of entry
values) and salvage the debug entry value.

Differential Revision: https://reviews.llvm.org/D68209
2019-12-03 11:01:45 +01:00
Jonas Paulsson 4fd8f11901 [MachineVerifier] Improve checks of target instructions operands.
While working with a patch for instruction selection, the splitting of a
large immediate ended up being treated incorrectly by the backend. Where a
register operand should have been created, it instead became an immediate. To
my surprise the machine verifier failed to report this, which at the time
would have been helpful.

This patch improves the verifier so that it will report this type of error.

This patch XFAILs CodeGen/SPARC/fp128.ll, which has been reported at
https://bugs.llvm.org/show_bug.cgi?id=44091

Review: thegameg, arsenm, fhahn
https://reviews.llvm.org/D63973
2019-12-03 10:20:52 +01:00
Sourabh Singh Tomar f1e3988aa6 Recommit "[DWARF5]Addition of alignment atrribute in typedef DIE."
This revision is revised to update Go-bindings and Release Notes.

The original commit message follows.

This patch adds support for the DW_AT_alignment [DWARF5] attribute, to be emitted with a typedef DIE
when explicit alignment is specified.

Patch by Awanish Pandey <Awanish.Pandey@amd.com>

Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok,
deadalinx

Differential Revision: https://reviews.llvm.org/D70111
2019-12-03 09:51:43 +05:30
Sourabh Singh Tomar 3f3d0f4f4b [DebugInfo] Support for debug_macinfo.dwo section in llvm and llvm-dwarfdump.
This patch adds support for debug_macinfo.dwo section[pre-standardized]
to llvm and llvm-dwarfdump.

Reviewers: probinson, dblaikie, aprantl, jini.susan.george, alok

Differential Revision: https://reviews.llvm.org/D70705

Tags: #debug-info #llvm
2019-12-03 08:54:12 +05:30
Wang, Pengfei cf81714a7e [X86] Model MXCSR for AVX instructions other than AVX512
Summary: Model MXCSR for AVX instructions other than AVX512

Reviewers: craig.topper, RKSimon

Subscribers: hiraditya, llvm-commits, LuoYuanke, LiuChen3

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70875
2019-12-03 08:53:47 +08:00
Bill Wendling f61099af9e Fix failing testcase to check for the correct output 2019-12-02 16:19:35 -08:00
Bill Wendling 87f146767e Place the "cold" code piece into the same section as the original function
Summary:
This cropped up in the Linux kernel where cold code was placed in an
incompatible section.

Reviewers: compnerd, vsk, tejohnson

Reviewed By: vsk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70925
2019-12-02 15:24:59 -08:00
Amaury Séchet 77b7b23ca1 Automatically generated arm64-abi-varargs.ll . NFC 2019-12-02 22:54:51 +01:00
Francis Visoiu Mistrih 7902d6cc80 [Remarks][ThinLTO] Use the correct file extension based on the format
Since we now have multiple formats, the ThinLTO remark files should also
respect that.
2019-12-02 13:04:43 -08:00
Volkan Keles 3d02fa6da7 [GlobalISel] CombinerHelper: Fix a bug in matchCombineCopy
Summary:
When combining COPY instructions, we were replacing the destination registers
with the source register without checking register constraints. This patch adds
simple logic to check whether the constraints match before replacing registers.

Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic

Reviewed By: aditya_nandakumar

Subscribers: rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70616
2019-12-02 12:05:09 -08:00
David Green 57d96ab593 [ARM] Add some VCMP folding and canonicalisation
The VCMP instructions in MVE can accept a register or ZR, but only as
the right-hand operand. Most of the time this will already be correct
because the icmp will have been canonicalised that way already. There
are some cases in the lowering of float conditions that this will not
apply to though. This code should fix up those cases.

Differential Revision: https://reviews.llvm.org/D70822
2019-12-02 19:57:12 +00:00
David Green 63aff5cd3c [ARM] More reversed vcmp tests. NFC 2019-12-02 19:57:12 +00:00
Taewook Oh 2da205d43e Reland "b19ec1eb3d0c [BPI] Improve unreachable/ColdCall heurstics to handle loops."
Summary: b19ec1eb3d has been reverted because of the test failures
with PowerPC targets. This patch addresses the issues from the previous
commit.

Test Plan: ninja check-all. Confirmed that CodeGen/PowerPC/pr36292.ll
and CodeGen/PowerPC/sms-cpy-1.ll pass

Subscribers: llvm-commits
2019-12-02 10:28:40 -08:00
Sanjay Patel af4e59949c [InstCombine] fix undef propagation for vector urem transform (PR44186)
As described here:
https://bugs.llvm.org/show_bug.cgi?id=44186

The match() code safely allows undef values, but we can't safely
propagate a vector constant that contains an undef to the new
compare instruction.
2019-12-02 12:17:38 -05:00
Simon Tatham d173fb5d28 [ARM,MVE] Add intrinsics to deal with predicates.
Summary:
This commit adds the `vpselq` intrinsics which take an MVE predicate
word and select lanes from two vectors; the `vctp` intrinsics which
create a tail predicate word suitable for processing the first m
elements of a vector (e.g. in the last iteration of a loop); and
`vpnot`, which simply complements a predicate word and is just
syntactic sugar for the `~` operator.

The `vctp` ACLE intrinsics are lowered to the IR intrinsics we've
already added (and which D70592 just reorganized). I've filled in the
missing isel rule for VCTP64, and added another set of rules to
generate the predicated forms.

I needed one small tweak in MveEmitter to allow the `unpromoted` type
modifier to apply to predicates as well as integers, so that `vpnot`
doesn't pointlessly convert its input integer to an `<n x i1>` before
complementing it.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70485
2019-12-02 16:20:30 +00:00
Simon Tatham 48cce077ef [ARM,MVE] Rename and clean up VCTP IR intrinsics.
Summary:
D65884 added a set of Arm IR intrinsics for the MVE VCTP instruction,
to use in tail predication. But the 64-bit one doesn't work properly:
its predicate type is `<2 x i1>` / `v2i1`, which isn't a legal MVE
type (due to not having a full set of instructions that manipulate it
usefully). The test of `vctp64` in `basic-tail-pred.ll` goes through
`opt` fine, as the test expects, but if you then feed it to `llc` it
causes a type legality failure at isel time.

The usual workaround we've been using in the rest of the MVE
intrinsics family is to bodge `v2i1` into `v4i1`. So I've adjusted the
`vctp64` IR intrinsic to do that, and completely removed the code (and
test) that uses that intrinsic for 64-bit tail predication. That will
allow me to add isel rules (upcoming in D70485) that actually generate
the VCTP64 instruction.

Also renamed all four of these IR intrinsics so that they have `mve`
in the name, since its absence was confusing.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: MarkMurrayARM

Subscribers: samparker, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70592
2019-12-02 16:20:30 +00:00
Simon Tatham 01aefae4a1 [ARM,MVE] Add an InstCombine rule permitting VPNOT.
Summary:
If a user writing C code using the ACLE MVE intrinsics generates a
predicate and then complements it, then the resulting IR will use the
`pred_v2i` IR intrinsic to turn some `<n x i1>` vector into a 16-bit
integer; complement that integer; and convert back. This will generate
machine code that moves the predicate out of the `P0` register,
complements it in an integer GPR, and moves it back in again.

This InstCombine rule replaces `i2v(~v2i(x))` with a direct complement
of the original predicate vector, which we can already instruction-
select as the VPNOT instruction which complements P0 in place.

Reviewers: ostannard, MarkMurrayARM, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70484
2019-12-02 16:20:30 +00:00
Hideto Ueno 96552036e3 [Attributor] Copy or port test cases related to Attributor to `Attributor` test folder
Summary:
This patch moves the test cases related to Attributor to the `Transforms/Attributor` folder.
We have used `Transforms/FunctionAttrs` as the primary folder for Attributor tests, but we need to change the way we test now.

For the test cases where I think functionattrs doesn't infer anything (e.g. willreturn, nosync, value-simplify, h2s, etc.), I moved them with the command `git mv`.

For the test cases in which both functionattrs and attributor are tested, I copied the test to the folder and removed the checks only used by functionattrs.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70843
2019-12-02 15:36:29 +00:00
Roman Lebedev ec7436f299 Autogenerate test/Analysis/ValueTracking/non-negative-phi-bits.ll test
Forgot to stage this change into 0f22e783a0 commit.
2019-12-02 18:28:41 +03:00
Clement Courbet 3540b80fe4 [llvm-exegesis] Fix 44b9942898.
Summary:
Add missing stack release instructions in
loadImplicitRegAndFinalize.

Reviewers: pengfei, gchatelet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70903
2019-12-02 16:13:27 +01:00
Roman Lebedev 0f22e783a0 [InstCombine] Revert rL341831: relax one-use check in foldICmpAddConstant() (PR44100)
rL341831 moved one-use check higher up, restricting a few folds
that produced a single instruction from two instructions to the case
where the inner instruction would go away.

Original commit message:
> InstCombine: move hasOneUse check to the top of foldICmpAddConstant
>
> There were two combines not covered by the check before now,
> neither of which actually differed from normal in the benefit analysis.
>
> The most recent seems to be because it was just added at the top of the
> function (naturally). The older is from way back in 2008 (r46687)
> when we just didn't put those checks in so routinely, and has been
> diligently maintained since.

From the commit message alone, there doesn't seem to be a
deeper motivation, or a deeper problem that it was trying to solve,
other than 'fixing the wrong one-use check'.

As I briefly discussed on IRC with Tim, the original motivation
can no longer be recovered; too much time has passed.

However, I believe that the original fold was doing the right thing:
we should be performing such a transformation even if the inner `add`
will not go away. That will still unchain the comparison from `add`,
so it will no longer need to wait for `add` to compute.

Doing so doesn't seem to break any particular idioms,
at least as far as I can see.

References https://bugs.llvm.org/show_bug.cgi?id=44100
2019-12-02 18:06:15 +03:00
Nemanja Ivanovic 241cbf201a [PowerPC] Fix crash in peephole optimization
When converting reg+reg shifts to reg+imm rotates, we neglect to consider the
CodeGenOnly versions of the 32-bit shift mnemonics. This means we produce a
rotate with missing operands which causes a crash.

Committing this fix without review since it is non-controversial that the list
of mnemonics to consider should include the 64-bit aliases for the exact
mnemonics.

Fixes PR44183.
2019-12-02 08:56:04 -06:00
Victor Campos dcf11c5e86 [ARM][AArch64] Complex addition Neon intrinsics for Armv8.3-A
Summary:
Add support for vcadd_* family of intrinsics. This set of intrinsics is
available in Armv8.3-A.

The fp16 versions require the FP16 extension, which has been available
(opt-in) since Armv8.2-A.

Reviewers: t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70862
2019-12-02 14:38:39 +00:00
Sanjay Patel af0babc90a [InstCombine] fold copysign with constant sign argument to (fneg+)fabs
If the sign of the sign argument is known (this could be extended to use ValueTracking),
then we can use fneg+fabs to clear/set the sign bit of the magnitude argument.
http://llvm.org/docs/LangRef.html#llvm-copysign-intrinsic

This transform is already done in DAGCombiner, but we can do it sooner in IR as
suggested in PR44153:
https://bugs.llvm.org/show_bug.cgi?id=44153

We have effectively no analysis for copysign in IR, so we are taking the unusual step
of increasing the number of IR instructions for the negative constant case.

Differential Revision: https://reviews.llvm.org/D70792
2019-12-02 09:23:12 -05:00
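The copysign fold in af0babc90a can be checked numerically. The sketch below is not the InstCombine code; it only models the two cases in plain C++ with `std::copysign` and `std::fabs`, and the helper names (`foldCopysignPositive`, `foldCopysignNegative`, `sameValueAndSign`, `checkCopysignFold`) are made up for illustration. NaN inputs are left out because they never compare equal.

```cpp
#include <cassert>
#include <cmath>

// Sign argument is a non-negative constant: only the magnitude survives (fabs).
double foldCopysignPositive(double mag) { return std::fabs(mag); }

// Sign argument is a negative constant: force the sign bit on (fneg of fabs).
double foldCopysignNegative(double mag) { return -std::fabs(mag); }

// Compares value and sign bit, so +0.0 and -0.0 are told apart.
bool sameValueAndSign(double a, double b) {
  return a == b && std::signbit(a) == std::signbit(b);
}

// Hypothetical self-check over a few non-NaN inputs.
void checkCopysignFold() {
  const double inputs[] = {0.0, -0.0, 1.5, -2.25, 1e300, -1e-300};
  for (double x : inputs) {
    assert(sameValueAndSign(std::copysign(x, 1.0), foldCopysignPositive(x)));
    assert(sameValueAndSign(std::copysign(x, -1.0), foldCopysignNegative(x)));
  }
}
```

The negative case needs two operations where the original call was one, which matches the note in the commit message about increasing the instruction count for negative constants.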
Wang, Pengfei 76b70f6f75 [X86] Add initialization of FPCW in llvm-exegesis
Summary: This is a follow-up to D70874. It adds the initialization of FPCW in llvm-exegesis.

Reviewers: craig.topper, RKSimon, courbet, gchatelet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70891
2019-12-02 20:18:35 +08:00
Georgii Rymar e19f19b09f [llvm-readobj/llvm-readelf] - Simplify the code that dumps versions.
After changes introduced in D70495 and D70826 it's now possible
to significantly simplify the code we have.

This also fixes an issue: previous code assumed that version strings
should always be read from the dynamic string table. While it is
normally true, the string table should be taken from the corresponding
sh_link field.

Differential revision: https://reviews.llvm.org/D70855
2019-12-02 15:14:30 +03:00
Mark Murray 510792a2e0 [ARM][MVE][Intrinsics] Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics.
Summary: Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics and their predicated versions. Add unit tests.

Subscribers: kristof.beyls, hiraditya, dmgreen, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70829
2019-12-02 11:18:53 +00:00
David Green e9e1daf2b9 [ARM] Remove VHADD patterns
These instructions do not work quite like I expected them to. They
perform the addition and then shift in a higher precision integer, so do
not match up with the patterns that we added.

For example with s8s, adding 100 and 100 should wrap leaving the shift
to work on a negative number. VHADD will instead do the arithmetic in
higher precision, giving 100 overall. The vhadd gives a "better" result,
but not one that matches up with the input.

I am just removing the patterns here. We might be able to re-add them in
the future by checking for wrap flags or changing bitwidths. But for the
moment just remove them to remove the problem cases.
2019-12-02 10:38:14 +00:00
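The s8 example from e9e1daf2b9 can be worked through in scalar code. This is a minimal sketch of the arithmetic only, assuming the removed patterns matched an 8-bit wrapping add followed by an arithmetic shift right by one; the function names are hypothetical and nothing here is the actual ARM backend code.

```cpp
#include <cassert>
#include <cstdint>

// Floor division by two, which is what an arithmetic shift right by one computes.
int floorHalf(int v) { return v >= 0 ? v / 2 : -((-v + 1) / 2); }

// Wraps a wide value into the signed 8-bit range [-128, 127].
int wrapToS8(int wide) {
  return static_cast<int>(static_cast<unsigned>(wide + 128) & 0xFFu) - 128;
}

// Assumed shape of the IR pattern: add in s8 (wrapping), then shift.
int addThenShiftS8(std::int8_t a, std::int8_t b) {
  return floorHalf(wrapToS8(a + b));
}

// VHADD as described in the commit message: add in higher precision, then shift.
int halvingAddS8(std::int8_t a, std::int8_t b) { return floorHalf(a + b); }

// Hypothetical self-check using the values from the commit message.
void checkHalvingAdd() {
  assert(addThenShiftS8(100, 100) == -28);  // 200 wraps to -56, then halves
  assert(halvingAddS8(100, 100) == 100);    // no wrap in higher precision
  assert(addThenShiftS8(10, 20) == halvingAddS8(10, 20));  // agree without overflow
}
```

The two only disagree when the 8-bit sum overflows, which is why checking wrap flags is mentioned as a possible way to re-add the patterns.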
Wang, Pengfei 44b9942898 [X86] Add initialization of MXCSR in llvm-exegesis
Summary: This patch is used to initialize the newly added register MXCSR.

Reviewers: craig.topper, RKSimon

Subscribers: tschuett, courbet, llvm-commits, LiuChen3

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70874
2019-12-02 18:19:32 +08:00
Bjorn Pettersson a9d6b0e544 [InstCombine] Fix big-endian miscompile of (bitcast (zext/trunc (bitcast)))
Summary:
optimizeVectorResize is rewriting patterns like:
  %1 = bitcast vector %src to integer
  %2 = trunc/zext %1
  %dst = bitcast %2 to vector

Since bitcasting between integer and vector types gives
different integer values depending on endianness, we need
to take endianness into account. As it happens the old
implementation only produced the correct result for little
endian targets.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=44178

Reviewers: spatel, lattner, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, hiraditya, uabelho, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70844
2019-12-02 11:05:25 +01:00
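The endianness dependence fixed in a9d6b0e544 can be illustrated with a two-element vector. The sketch below models `bitcast <2 x i8> to i16` followed by `trunc` to `i8` for both byte orders; it is a toy model with hypothetical helper names, not the `optimizeVectorResize` implementation, and the concrete types are chosen only for the example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Models `bitcast <2 x i8> %src to i16`: element 0 sits at the lowest address,
// so it becomes the low byte on little endian and the high byte on big endian.
std::uint16_t bitcastToI16(const std::array<std::uint8_t, 2> &v, bool bigEndian) {
  const unsigned hi = bigEndian ? v[0] : v[1];
  const unsigned lo = bigEndian ? v[1] : v[0];
  return static_cast<std::uint16_t>((hi << 8) | lo);
}

// Models `trunc i16 %x to i8`: keeps the least significant byte on any target.
std::uint8_t truncToI8(std::uint16_t x) {
  return static_cast<std::uint8_t>(x & 0xFFu);
}

// Hypothetical self-check: the surviving element differs per byte order.
void checkVectorResize() {
  const std::array<std::uint8_t, 2> src = {{0x11, 0x22}};
  assert(bitcastToI16(src, false) == 0x2211);
  assert(bitcastToI16(src, true) == 0x1122);
  assert(truncToI8(bitcastToI16(src, false)) == src[0]);  // little endian keeps element 0
  assert(truncToI8(bitcastToI16(src, true)) == src[1]);   // big endian keeps element 1
}
```

A rewrite that always keeps the leading elements is therefore only right for little-endian targets, which is the miscompile the commit describes.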
Georgii Rymar 7eecf2b872 [llvm-readelf/llvm-readobj] - Check the version of SHT_GNU_verneed section entries.
It is a follow-up for D70826 and it is similar to D70810.

SHT_GNU_verneed contains the following fields:
`vn_version`: Version of structure. This value is currently set to 1, and will be reset
if the versioning implementation is incompatibly altered.
(https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html)

We should check it for correctness.

Differential revision: https://reviews.llvm.org/D70842
2019-12-02 12:57:23 +03:00
Georgii Rymar c653a52c85 [llvm-readobj/llvm-readelf] - Reimplement dumping of the SHT_GNU_verneed section.
This is similar to D70495, but for SHT_GNU_verneed section.
It solves the same problems: different implementations, lack of error reporting
and no test coverage.

Differential revision: https://reviews.llvm.org/D70826
2019-12-02 12:27:31 +03:00
Anton Afanasyev bd23859f39 [NFC] Precommit test showing SROA loses `!tbaa.struct` metadata
This issue impacts llvm.org/pr42022
2019-12-02 11:48:01 +03:00
Austin Kerbow cfbbdc83b4 AMDGPU/GlobalISel: Add AGPR bank and RegBankSelect mfma intrinsics
Differential Revision: https://reviews.llvm.org/D70871
2019-12-01 22:15:48 -08:00
Florian Hahn 19fd8925a4 Revert "[Examples] Add IRTransformations directory to examples."
This breaks LLVMExports.cmake in some build configurations.

PR44197

This reverts commits ceb72d07b0
                     7d0b1d77b3.
2019-12-01 22:20:20 +00:00
Craig Topper 67298d683c [X86][InstCombine] Move non-X86 specific instcombine test from test/CodeGen/X86/ to test/Transforms/InstCombine/ 2019-12-01 10:31:04 -08:00
Craig Topper 3dd93dc2a1 [X86][InstCombine] Move instcombine test from test/CodeGen/X86 to test/Transforms/InstCombine/ and replace grep with FileCheck 2019-12-01 10:31:04 -08:00
Nuno Lopes 89c47313c9 remove UB from test by making GV alignment explicit 2019-12-01 15:16:31 +00:00
Craig Topper 40dfc6dff1 [X86] Add floating point execution domain to comi/ucomi/cvtss2si/cvtsd2si/cvttss2si/cvttsd2si/cvtsi2ss/cvtsi2sd instructions. 2019-11-30 11:26:28 -08:00
David Green 59b56e5c57 [InstCombine] Expand usub_sat patterns to handle constants
The constants come through as add %x, -C, not a sub as would be
expected. They need some extra matchers to canonicalise them towards
usub_sat.

Differential Revision: https://reviews.llvm.org/D69514
2019-11-30 16:58:01 +00:00
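The remark in 59b56e5c57 that constants arrive as `add %x, -C` can be illustrated with wrapping unsigned arithmetic. This sketch assumes the pattern has the shape "select on `x > C`, else zero"; the exact patterns matched by the commit are not spelled out in the message, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of an unsigned saturating subtract.
std::uint32_t usubSat(std::uint32_t x, std::uint32_t c) {
  return x > c ? x - c : 0u;
}

// The same computation with the subtraction written as an add of -C,
// which is the form the commit message says constants come through in.
std::uint32_t selectOfAddNegC(std::uint32_t x, std::uint32_t c) {
  const std::uint32_t negC = 0u - c;  // two's complement negation, wraps mod 2^32
  return x > c ? x + negC : 0u;
}

// Hypothetical self-check over a few boundary values.
void checkUsubSat() {
  const std::uint32_t xs[] = {0u, 1u, 41u, 42u, 43u, 0xFFFFFFFFu};
  const std::uint32_t cs[] = {0u, 1u, 42u, 0xFFFFFFFFu};
  for (std::uint32_t x : xs)
    for (std::uint32_t c : cs)
      assert(usubSat(x, c) == selectOfAddNegC(x, c));
}
```

Because `x + (-C)` and `x - C` agree modulo 2^32, a matcher that only looks for `sub` misses the constant case, hence the extra matchers.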
David Green 3a1bef5616 [InstCombine] Adjust usub_sat fold one use checks
This adjusts the one use checks in the usub_sat fold code to not
increase instruction count, but otherwise do the fold. Reviewed as a
part of D69514.
2019-11-30 16:58:00 +00:00
David Green a46b959ebd [InstCombine] More usub_sat tests. NFC. 2019-11-30 16:58:00 +00:00
Hans Wennborg c2443155a0 Revert 651f07908a "[AArch64] Don't combine callee-save and local stack adjustment when optimizing for size"
This caused asserts (and perhaps also miscompiles) while building for Windows
on AArch64. See the discussion on D68530 for details and reproducer.

Reverting until this can be investigated and fixed.

> For arm64, D18619 introduced the ability to combine bumping the stack pointer
> upfront in case it needs to be bumped for both the callee-save area as well as
> the local stack area.
>
> That diff already remarks that "This change can cause an increase in
> instructions", but argues that even when that happens, it should be still be a
> performance benefit because the number of micro-ops is reduced.
>
> We have observed that this code-size increase can be significant in practice.
> This diff disables combining stack bumping for methods that are marked as
> optimize-for-size.
>
> Example of a prologue with the behavior before this diff (combining stack bumping when possible):
>   sub        sp, sp, #0x40
>   stp        d9, d8, [sp, #0x10]
>   stp        x20, x19, [sp, #0x20]
>   stp        x29, x30, [sp, #0x30]
>   add        x29, sp, #0x30
>   [... compute x8 somehow ...]
>   stp        x0, x8, [sp]
>
> And after this  diff, if the method is marked as optimize-for-size:
>   stp        d9, d8, [sp, #-0x30]!
>   stp        x20, x19, [sp, #0x10]
>   stp        x29, x30, [sp, #0x20]
>   add        x29, sp, #0x20
>   [... compute x8 somehow ...]
>   stp        x0, x8, [sp, #-0x10]!
>
> Note that without combining the stack bump there are two auto-decrements,
> nicely folded into the stp instructions, whereas otherwise there is a single
> sub sp, ... instruction, but not folded.
>
> Patch by Nikolai Tillmann!
>
> Differential Revision: https://reviews.llvm.org/D68530
2019-11-30 14:20:11 +01:00
Dmitri Gribenko b094258661 Updated the OCaml/bitwriter.ml test for OCaml 4.06+
Since OCaml 4.02 (released in 2014), strings and bytes are different
types, but up until OCaml 4.06, the compiler defaulted to a
compatibility mode "unsafe-string". OCaml 4.06 flips the default to
"safe-string", breaking the test.

This change should be compatible with OCaml 4.02+, but is only truly
necessary for OCaml 4.06+.

For more information, see:

https://caml.inria.fr/pub/docs/manual-ocaml/libref/String.html
https://ocaml.org/releases/4.02.html
2019-11-30 13:35:23 +01:00
Sean Fertile 26ab827c24 [PowerPC][AIX] Add support for lowering int/float/double formal arguments.
This patch adds LowerFormalArguments_AIX. Support is added for lowering
int, float, and double formal arguments into general purpose and
floating point registers only.

The AIX calling convention test cases have been redone to test for caller
and callee functionality in the same lit test.

Patch by Zarko Todorovski!

Differential Revision: https://reviews.llvm.org/D69578
2019-11-29 12:46:53 -05:00
Carey Williams 76fd58d0fe Revert "[ARM] Allocatable Global Register Variables for ARM"
This reverts commit 2d739f98d8.
2019-11-29 17:01:05 +00:00
Carey Williams c313a6bdbe Revert "[NFC] Fix test reserve_global_reg.ll after 2d739f9"
This reverts commit aea7578fad.
2019-11-29 17:00:55 +00:00
Bjorn Pettersson 363cbcc590 [InstCombine] Run the cast.ll test twice, now also testing little endian. NFC
Some tests in test/Transforms/InstCombine/cast.ll depend on
endianness. Added a second run line to run the tests with both
big and little endian. In the past we only compiled for big
endian, and then it was hard to see if any big endian bugfixes
would impact the little endian result etc.
2019-11-29 13:24:13 +01:00
Victor Campos e478385e77 [ARM] Fix instruction selection for ARMISD::CMOV with f16 type
Summary:
In the cases where the CMOV (f16) SDNode is used with condition codes
LT, LE, VC or NE, it is successfully selected into a VSEL instruction.

In the remaining cases, however, instruction selection fails since VSEL
does not support other condition codes.

This patch handles such cases by using the single-precision version of
the VMOV instruction.

Reviewers: ostannard, dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70667
2019-11-29 10:40:37 +00:00
Georgii Rymar 99adf047c8 [llvm-readelf][test] - Update comment in elf-verdef-invalid.test. NFC.
It was suggested to change it during review of D70810,
but I've forgotten to update it before commit.
2019-11-29 11:38:27 +03:00
Georgii Rymar 7ab1481361 [llvm-readelf/llvm-readobj] - Check version of SHT_GNU_verdef section entries when dumping.
Elfxx_Verdef contains the following field:

vd_version
Version revision. This field shall be set to 1.
(https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html)

Our code should check the struct version for correctness. This patch does that.
(This will help to simplify or eliminate ELFDumper<ELFT>::LoadVersionDefs() which
has its own logic to parse version definitions for no reason. It checks the
struct version currently).

Differential revision: https://reviews.llvm.org/D70810
2019-11-29 11:09:56 +03:00
Georgii Rymar 13cbcf1c1a [yaml2obj] - Add a way to describe content of the SHT_GNU_verneed section with "Content".
There is no way to set raw content for SHT_GNU_verneed section.
This patch implements it.

Differential revision: https://reviews.llvm.org/D70816
2019-11-29 10:50:00 +03:00
Hideto Ueno 6c742fdbf4 [Attributor] Deduce dereferenceable based on accessed bytes map
Summary:
This patch introduces the deduction based on load/store instructions whose pointer operand is a non-inbounds GEP instruction.
For example if we have,
```
void f(int *u){
 u[0] = 0;
 u[1] = 1;
 u[2] = 2;
}
```
then u must be dereferenceable(12).

This patch is inspired by D64258

Reviewers: jdoerfert, spatel, hfinkel, RKSimon, sstefan1, xbolva00, dtemirbulatov

Reviewed By: jdoerfert

Subscribers: jfb, lebedev.ri, xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70714
2019-11-29 06:55:58 +00:00
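The `dereferenceable(12)` result in 6c742fdbf4 follows from the three stores covering bytes 0 to 11. Here is a toy model of that deduction, assuming a 4-byte `int` and a map from byte offset to access size; the Attributor's real data structure and algorithm are not shown in the commit message, so `AccessedBytes` and `knownDereferenceableBytes` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical "accessed bytes map": byte offset from the pointer -> access size.
using AccessedBytes = std::map<std::int64_t, std::int64_t>;

// Walks the accesses in offset order and extends the known range while there
// is no gap; stops at the first byte that no access is known to touch.
std::int64_t knownDereferenceableBytes(const AccessedBytes &accessed) {
  std::int64_t known = 0;
  for (const auto &entry : accessed) {
    if (entry.first > known)
      break;  // gap: bytes in between are not known to be accessed
    known = std::max(known, entry.first + entry.second);
  }
  return known;
}

// Hypothetical self-check mirroring f(int *u) from the commit message.
void checkDereferenceable() {
  const AccessedBytes f = {{0, 4}, {4, 4}, {8, 4}};  // u[0], u[1], u[2]
  assert(knownDereferenceableBytes(f) == 12);
  const AccessedBytes withGap = {{0, 4}, {8, 4}};    // u[0] and u[2] only
  assert(knownDereferenceableBytes(withGap) == 4);
}
```

With `u[1]` missing, only the first four bytes can be claimed, since nothing proves bytes 4 to 7 are accessed.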
Hideto Ueno dfedae5001 [Attributor] Remove dereferenceable_or_null when nonnull is present
Summary: This patch prevents the simultaneous presence of the `dereferenceable` and `dereferenceable_or_null` attributes

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70789
2019-11-29 06:45:07 +00:00
Fangrui Song b0e979724f [PassInstrumentation] Remove excess newline for the new pass manager
This also removes excess newline for the legacy pass manager when -filter-print-funcs is specified.
2019-11-28 17:20:17 -08:00
Amaury Séchet ca818f4550 [DAGCombiner] Peek through vector concats when trying to combine shuffles.
Summary: This combine showed up as needed when exploring the regression when processing the DAG in topological order.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68195
2019-11-28 23:57:29 +01:00
Lang Hames 674df13b5f [ORC][JITLink] Add support for weak references, and improve handling of static
libraries.

This patch substantially updates ORCv2's lookup API in order to support weak
references, and to better support static archives. Key changes:

-- Each symbol being looked for is now associated with a SymbolLookupFlags
   value. If the associated value is SymbolLookupFlags::RequiredSymbol then
   the symbol must be defined in one of the JITDylibs being searched (or be
   able to be generated in one of these JITDylibs via an attached definition
   generator) or the lookup will fail with an error. If the associated value is
   SymbolLookupFlags::WeaklyReferencedSymbol then the symbol is permitted to be
   undefined, in which case it will simply not appear in the resulting
   SymbolMap if the rest of the lookup succeeds.

   Since lookup now requires these flags for each symbol, the lookup method now
   takes an instance of a new SymbolLookupSet type rather than a SymbolNameSet.
   SymbolLookupSet is a vector-backed set of (name, flags) pairs. Clients are
   responsible for ensuring that the set property (i.e. unique elements) holds,
   though this is usually simple and SymbolLookupSet provides convenience
   methods to support this.

-- Lookups now have an associated LookupKind value, which is either
   LookupKind::Static or LookupKind::DLSym. Definition generators can inspect
   the lookup kind when determining whether or not to generate new definitions.
   The StaticLibraryDefinitionGenerator is updated to only pull in new objects
   from the archive if the lookup kind is Static. This allows lookup to be
   re-used to emulate dlsym for JIT'd symbols without pulling in new objects
   from archives (which would not happen in a normal dlsym call).

-- JITLink is updated to allow externals to be assigned weak linkage, and
   weak externals now use the SymbolLookupFlags::WeaklyReferencedSymbol value
   for lookups. Unresolved weak references will be assigned the default value of
   zero.

Since this patch was modifying the lookup API anyway, it also replaces all of the
"MatchNonExported" boolean arguments with a "JITDylibLookupFlags" enum for
readability. If a JITDylib's associated value is
JITDylibLookupFlags::MatchExportedSymbolsOnly then the lookup will only
match against exported (non-hidden) symbols in that JITDylib. If a JITDylib's
associated value is JITDylibLookupFlags::MatchAllSymbols then the lookup will
match against any symbol defined in the JITDylib.
2019-11-28 13:30:49 -08:00
Florian Hahn ec3efcf11f [IVDescriptors] Skip FOR where we have multiple sink points for now.
This fixes a crash with instructions where multiple operands are
first-order-recurrences.
2019-11-28 22:18:47 +01:00
Austin Kerbow 256ad954a9 AMDGPU: Reuse carry out register during FI elimination
Summary:
Pre gfx9 we need to scavenge a 64-bit SGPR to use as the carry out for an Add.
If only one SGPR was available this crashed when trying to scavenge another
32-bit SGPR to materialize the offset.

Instead, reuse a 32-bit SGPR from the carry out as the offset register.

Also prefer to use vcc for the unused carry out when it is available.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70614
2019-11-28 10:13:48 -08:00
Simon Tatham acd7fe8636 [AArch64][v8.3a] Don't emit LDRA '[xN]!' alias in disassembly.
Summary:
In rG643ac6c0420b, the syntax `ldraa x1, [x0]!` was added as an alias
for `ldraa x1, [x0, #0]!`. That syntax is less obvious in meaning, and
also will not be accepted by assemblers that haven't been updated yet.
So it would be better not to emit it as the preferred disassembly for
that instruction.

This change lowers the EmitPriority of the new alias so that the more
explicit syntax `[x0, #0]!` is preferred by the disassembler. The new
syntax is still accepted by the assembler.

Reviewers: ab, ostannard

Reviewed By: ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70813
2019-11-28 15:31:59 +00:00
David Stuttard 943d8326dd AMDGPU: Fix lit test checks with dag option
Summary:
I was seeing some failures on a test with slightly different instruction
ordering. Adding in some DAG directives solved the issue.

Change-Id: If5a3d3969055fb19279943bd45161bb70a3dabce

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70531
2019-11-28 10:01:06 +00:00
Georgii Rymar 7f362f04a7 [llvm-readelf] - Make GNU style dumping of invalid SHT_GNU_verdef be consistent with LLVM style.
When we dump a SHT_GNU_verdef section whose sh_link references a non-existent section,
llvm-readobj reports a warning and continues dumping, but llvm-readelf fails with an error.

This patch fixes the issue and opens the road for further follow-ups for
improving the printGNUVersionSectionProlog().

Differential revision: https://reviews.llvm.org/D70776
2019-11-28 12:41:29 +03:00
Georgii Rymar bb7d75ef1d [llvm-readelf][llvm-readobj][test] - Cleanup test cases for versioning sections.
Currently we have 2 tests for testing versioning sections:
1) elf-versioninfo.test
2) elf-invalid-versioning.test

The first one currently checks how versioning sections are dumped +
how tools dump invalid SHT_GNU_verdef section.

The second, despite its name, contains only tests for invalid SHT_GNU_verneed sections.

In this patch I've renamed elf-invalid-versioning.test->elf-verneed-invalid.test,
and moved a few tests from elf-versioninfo.test to a new elf-verdef-invalid.test.

This will help to maintain these and new tests for broken versioning sections.

Differential revision:
2019-11-28 10:18:51 +03:00
Wang, Pengfei 1bc5c52afd [X86][NFC] Rename test file for following changes. 2019-11-28 15:03:56 +08:00
Lang Hames c33598d5e5 [JITLink] Make sure MachO/x86-64 handles 32-bit signed addends correctly.
These need to be sign extended when loading into Edge addends.
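
As a rough illustration of the sign extension described above (a sketch only, not JITLink code; the helper name `sign_extend_32` is hypothetical), a raw 32-bit field can be reinterpreted as a signed value like this:

```python
def sign_extend_32(raw: int) -> int:
    """Interpret the low 32 bits of `raw` as a two's-complement signed value."""
    raw &= 0xFFFFFFFF          # keep only the 32-bit field
    if raw & 0x80000000:       # sign bit set: the value is negative
        return raw - (1 << 32)
    return raw


# A stored addend of 0xFFFFFFFC means -4, not 4294967292.
addend = sign_extend_32(0xFFFFFFFC)
```

Without this step a negative addend would be read as a large positive offset once it is widened to 64 bits.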
2019-11-27 22:46:07 -08:00
Ehud Katz 825debe847 [InlineCost] Fix infinite loop in indirect call evaluation
Currently every time we encounter an indirect call of a known function,
we try to evaluate the inline cost of that function. In case of a
recursion, that evaluation never stops.

The solution I propose is to evaluate only the indirect call of the
function, while any further indirect calls (of a known function) will be
treated just like direct function calls, for which we never try to
evaluate the call.

Fixes PR35469.

Differential Revision: https://reviews.llvm.org/D69349
2019-11-28 08:27:50 +02:00
Craig Topper ed521fef03 [LegalTypes][X86] Add SoftenFloatOperand support for STRICT_FP_TO_SINT/STRICT_FP_TO_UINT. 2019-11-27 21:16:13 -08:00
Craig Topper 1727c4f1a2 [LegalizeTypes][X86] Add ExpandIntegerResult support for STRICT_FP_TO_SINT/STRICT_FP_TO_UINT. 2019-11-27 18:41:45 -08:00
Craig Topper 8f73a93b2d [X86] Add support for STRICT_FP_TO_UINT/SINT from fp128. 2019-11-27 18:38:32 -08:00
David Tenty 98740643f7 [AIX] Emit TOC entries for ASM printing
Summary:
Emit the correct .toc pseudo op when we change to the TOC and emit
TC entries. Make sure TOC pseudos get the right symbols via overriding
getMCSymbolForTOCPseudoMO on AIX. Add a test for TOC assembly writing
and update tests to include TOC entries.

Also make sure external globals have a csect set and handle external function descriptor (originally authored by Jason Liu) so we can emit TOC entries for them.

Reviewers: DiggerLin, sfertile, Xiangling_L, jasonliu, hubert.reinterpretcast

Reviewed By: jasonliu

Subscribers: arphaman, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70461
2019-11-27 17:20:55 -05:00
Dávid Bolvanský 40963b2bf0 Revert "[Attributor] Move pass after InstCombine to futher eliminate null pointer checks"
This reverts commit 7ca7d62c6e. Committed accidentally.
2019-11-27 22:45:47 +01:00
Stefan Pintilie 8e84c9ae99 [PowerPC] Separate Features that are known to be Power9 specific from Future CPU
The Power 9 CPU has some features that are unlikely to be passed on to future
versions of the CPU. This patch separates this out so that future CPU does not
inherit them.

Differential Revision: https://reviews.llvm.org/D70466
2019-11-27 15:40:13 -06:00
Dávid Bolvanský 7ca7d62c6e [Attributor] Move pass after InstCombine to futher eliminate null pointer checks
Summary: PR44149

Reviewers: jdoerfert

Subscribers: mehdi_amini, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70737
2019-11-27 22:36:51 +01:00
Stefan Pintilie dcceab1a0a [PowerPC] Add new Future CPU for PowerPC in LLVM
This is a continuation of D70262
The previous patch as listed above added the future CPU in clang. This patch
adds the future CPU in the PowerPC backend. At this point the patch simply
assumes that a future CPU will have the same characteristics as pwr9. Those
characteristics may change with later patches.

Differential Revision: https://reviews.llvm.org/D70333
2019-11-27 14:30:06 -06:00
Craig Topper 9283681e16 [CriticalAntiDepBreaker] Teach the regmask clobber check to check if any subregister is preserved before considering the super register clobbered
X86 has some calling conventions where bits 127:0 of a vector register are callee saved, but the upper bits aren't. Previously we could detect that the full ymm register was clobbered when the xmm portion was really preserved. This patch checks the subregisters to make sure they aren't preserved.
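
A minimal sketch of the idea, assuming a regmask is modelled as a set of preserved register names and sub-registers as a plain dictionary (both are simplifications for illustration, not the LLVM data structures):

```python
def is_clobbered(reg, preserved, subregs):
    """Treat `reg` as clobbered only if neither it nor any sub-register is preserved.

    `preserved` is the set of registers the (hypothetical) regmask keeps intact;
    `subregs` maps a register name to the names of its sub-registers.
    """
    if reg in preserved:
        return False
    return not any(sub in preserved for sub in subregs.get(reg, ()))


# Illustrative data: the low 128 bits (xmm6) are callee saved, the upper bits are not.
preserved = {"xmm6"}
subregs = {"ymm6": ["xmm6"], "ymm0": ["xmm0"]}
```

With this data, `ymm6` is not considered clobbered because its `xmm6` part is preserved, while `ymm0` is.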

Fixes PR44140

Differential Revision: https://reviews.llvm.org/D70699
2019-11-27 11:20:58 -08:00
taewookoh 5d21f75b57 Revert b19ec1eb3d
Summary: This reverts commit b19ec1eb3d as it fails powerpc tests

Subscribers: llvm-commits
2019-11-27 11:17:10 -08:00
Sanjay Patel 5c166f1d19 [x86] make SLM extract vector element more expensive than default
I'm not sure what the effect of this change will be on all of the affected
tests or a larger benchmark, but it fixes the horizontal add/sub problems
noted here:
https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc

The costs are based on reciprocal throughput numbers in Agner's tables for
PEXTR*; these appear to be very slow ops on Silvermont.

This is a small step towards the larger motivation discussed in PR43605:
https://bugs.llvm.org/show_bug.cgi?id=43605

Also, it seems likely that insert/extract is the source of perf regressions on
other CPUs (up to 30%) that were cited as part of the reason to revert D59710,
so maybe we'll extend the table-based approach to other subtargets.

Differential Revision: https://reviews.llvm.org/D70607
2019-11-27 14:08:56 -05:00
Craig Topper ebfff46c8d [LegalizeTypes][FPEnv][X86] Add initial support for softening strict fp nodes
This is based on what's required for softening fp128 operations on 32-bit X86 assuming f32/f64/f80 are legal. So there could be some things missing.

Differential Revision: https://reviews.llvm.org/D70654
2019-11-27 10:50:10 -08:00
Taewook Oh b19ec1eb3d [BPI] Improve unreachable/ColdCall heuristics to handle loops.
Summary:
While updatePostDominatedByUnreachable attempts to find basic blocks that are post-dominated by unreachable blocks, it currently cannot handle loops precisely, because it doesn't use the actual post-dominator tree analysis but relies on a heuristic of visiting basic blocks in post-order. More precisely, when the entire loop is post-dominated by the unreachable block, the current algorithm fails to detect the entire loop as post-dominated by the unreachable block: when the algorithm reaches the loop latch, it cannot tell that all of its successors (including the loop header) will "eventually" be post-dominated by the unreachable block, because the algorithm hasn't visited the loop header yet. This makes BPI assume, for the loop latch, that loop backedges are taken with 100% probability. Because of this, block frequency info sometimes marks virtually dead loops (which are post-dominated by unreachable blocks) as super hot, because a 100% backedge-taken probability makes the loop iteration count the max value. updatePostDominatedByColdCall has exactly the same problem.

To address this problem, this patch computes PostDominatedByUnreachable/PostDominatedByColdCall with the actual post-dominator tree.

Reviewers: skatkov, chandlerc, manmanren

Reviewed By: skatkov

Subscribers: manmanren, vsk, apilipenko, Carrot, qcolombet, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70104
2019-11-27 10:36:06 -08:00
Mark Murray a048bf87fb [ARM][MVE][Intrinsics] Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics. Add unit tests.
Summary: Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics. Add unit tests.

Reviewers: simon_tatham, ostannard, dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70547
2019-11-27 16:52:05 +00:00
Mark Murray e8a8dbe9c4 [ARM][MVE][Intrinsics] Add MVE VMUL intrinsics. Remove annoying "t1" from VMUL* instructions. Add unit tests.
Summary: Add MVE VMUL intrinsics. Remove annoying "t1" from VMUL* instructions. Add unit tests.

Reviewers: simon_tatham, ostannard, dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70546
2019-11-27 16:52:05 +00:00
Mark Murray f4bba07b87 [ARM][MVE][Intrinsics] Add MVE VABD intrinsics. Add unit tests.
Summary: Add MVE VABD intrinsics. Add unit tests.

Reviewers: simon_tatham, ostannard, dmgreen

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D70545
2019-11-27 16:52:04 +00:00
Sanjay Patel 5e6b728763 [InstCombine] add tests for copysign; NFC 2019-11-27 11:32:23 -05:00
Hideto Ueno 0f4383faa7 [Attributor] Handle special case when offset equals zero in nonnull deduction 2019-11-27 14:45:16 +00:00
David Green 9f15fcc271 [ARM] Replace arm_neon_vqadds with sadd_sat
This replaces the A32 NEON vqadds, vqaddu, vqsubs and vqsubu intrinsics
with the target independent sadd_sat, uadd_sat, ssub_sat and usub_sat.
This helps generate vqadds from standard IR nodes, which might be
produced from the vectoriser. The old variants are removed in the
process.

Differential Revision: https://reviews.llvm.org/D69350
2019-11-27 13:32:29 +00:00
John Brawn 3c1912a733 [ARM] Add constrained FP intrinsics test
Currently XFAILed, as there are various things that need fixing.

Differential Revision: https://reviews.llvm.org/D70599
2019-11-27 13:20:04 +00:00
Tim Northover 31c25fadcc AArch64: support the Apple NEON syntax for v8.2 crypto instructions.
Very simple change, just adding the extra syntax variant.
2019-11-27 10:54:38 +00:00
Georgii Rymar 3b35603a56 [llvm-readobj] - Always print "Predecessors" for version definition sections.
This is a follow-up discussed in D70495 thread.

The current logic is unusual for llvm-readobj: it doesn't print the predecessors
list when it is empty. This is not good for machine parsers.
D70495 had to add this condition during refactoring to reduce the amount of changes
in tests, because the original code also had similar logic.

Now it seems to be time to get rid of it. This patch does so.

Differential revision: https://reviews.llvm.org/D70717
2019-11-27 12:29:55 +03:00
Martin Storsjö 47046f05e6 [MC] Produce proper section relative relocations for COFF in .debug_frame
The third parameter to Streamer.EmitSymbolValue() is "bool
IsSectionRelative = false".

For ELF, these debug sections are mapped to address zero, so a normal,
absolute address relocation works just fine, but COFF needs a section
relative relocation, and COFF is the only target where
needsDwarfSectionOffsetDirective() returns true. This matches how
EmitSymbolValue is called elsewhere in the same source file.

Differential Revision: https://reviews.llvm.org/D70661
2019-11-27 10:44:42 +02:00
Martin Storsjö 943513b799 [X86] [Win64] Avoid truncating large (> 32 bit) stack allocations
This fixes PR44129, which was broken in a7adc3185b (in 7.0.0
and newer).

Differential Revision: https://reviews.llvm.org/D70741
2019-11-27 10:44:42 +02:00
czhengsz 98189755cd [PowerPC] [NFC] change PPCLoopPreIncPrep class name after D67088.
After https://reviews.llvm.org/D67088, the PPCLoopPreIncPrep pass can prepare more instruction forms besides the pre-inc form, such as DS/DQ forms.

This patch is a follow-up to https://reviews.llvm.org/D67088 that renames the pass.

Reviewed by: jsji

Differential Revision: https://reviews.llvm.org/D70371
2019-11-26 23:58:00 -05:00
Eric Christopher fd39b1bb20 Revert "Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.""
This reapplies: 8ff85ed905

Original commit message:

As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.

This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet,
this is just the initial start done so that people could see it and we could start
tweaking after.

Test updates: Outside of the newpm tests most of the updates are coming from either
optimization passes not run anymore (and without a compelling argument at the moment)
that were largely used for canonicalization in clang.

Original post:

http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410

This reverts commit c9ddb02659.
2019-11-26 20:28:52 -08:00
Craig Topper df773ebb5f [X86] Add test cases for constrained lrint/llrint/lround/llround to fp128-libcalls-strict. NFC 2019-11-26 15:46:29 -08:00
Sanjay Patel e177c5a00d [InstSimplify] fold copysign with same args to the arg
This is correct for any value including NaN/inf.

We don't have this fold directly in the backend either,
but x86 manages to get it after converting things to bitops.
2019-11-26 17:35:10 -05:00
Sanjay Patel 48a3a1e090 [InstSimplify] add tests for copysign; NFC 2019-11-26 17:23:30 -05:00
Sanjay Patel 8d20dd0b06 [ConstFolding] move tests for copysign; NFC
InstCombine doesn't have any transforms for copysign currently.
2019-11-26 16:54:46 -05:00
Simon Atanasyan 11074bfffe [mips] Fix sc, scs, ll, lld instructions expanding
There are a couple of bugs with the sc, scs, ll, lld instructions expanding:

1. On R6 these instructions pack the immediate offset into a 9-bit field. Currently,
if an immediate exceeds 9 bits, the assembler does not perform expansion and
just rejects the instruction.

2. In 64-bit non-PIC code, if an operand is a symbol, the assembler generates
an incorrect sequence of instructions. It uses R_MIPS_HI16 and R_MIPS_LO16
relocations and skips the R_MIPS_HIGHEST and R_MIPS_HIGHER ones.

To solve these problems this patch:
- Introduces `mem_simm9_exp` to mark 9-bit memory immediate operands
which require expansion. Probably later all `mem_simm9` operands will be
able to migrate to `mem_simm9_exp`, and we can rename it to `mem_simm9`.

- Adds a new `OPERAND_MEM_SIMM9` operand type and assigns it to
`mem_simm9_exp`. That makes the operand size known in the `processInstruction`
method, so it can decide whether the instruction needs to be expanded.

- Adds an `expandMem9Inst` method to expand instructions with a 9-bit memory
immediate operand. This method just loads the immediate into a "base"
register used by the original instruction:

   sc $2, 256($sp) => addiu  $1, $sp, 256
                      sc     $2, 0($1)

- Fix `expandMem16Inst` to support a correct set of relocations for
symbol loading in case of 64-bit non-PIC code.

   ll $12, symbol => lui    $12, 0
                         R_MIPS_HIGHEST symbol
                     daddiu $12, $12, 0
                         R_MIPS_HIGHER symbol
                     dsll   $12, $12, 16
                     daddiu $12, $12, 0
                         R_MIPS_HI16 symbol
                     dsll   $12, $12, 16
                     ll     $12, 0($12)
                         R_MIPS_LO16 symbol

- Fix `expandMem16Inst` to unify handling of 3 and 4 operands
instructions.

- Delete the now-unused `MipsTargetStreamer::emitSCWithSymOffset` method.

Task for next patches: implement expansion for other instructions that use
the `mem_simm9` operand and other `mem_simm##` operands.
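
The 9-bit expansion described above can be sketched as follows. This is an illustrative model only, not the assembler's code: the function names are hypothetical, `$1` is assumed as the scratch register (as in the example in this message), and offsets that do not fit the 16-bit `addiu` immediate are not handled.

```python
def fits_simm9(offset: int) -> bool:
    """A signed 9-bit field holds values in [-256, 255]."""
    return -256 <= offset <= 255


def expand_mem9(op: str, rt: str, offset: int, base: str, scratch: str = "$1"):
    """Return the instruction(s) to emit for `op rt, offset(base)`."""
    if fits_simm9(offset):
        # The offset fits the 9-bit field: emit the instruction unchanged.
        return [f"{op} {rt}, {offset}({base})"]
    # Otherwise fold the offset into a scratch base register first.
    return [
        f"addiu {scratch}, {base}, {offset}",
        f"{op} {rt}, 0({scratch})",
    ]
```

For `sc $2, 256($sp)` the offset 256 is one past the largest encodable value, so the sketch yields the two-instruction sequence shown in the example above.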

Differential Revision: https://reviews.llvm.org/D70648
2019-11-27 00:43:25 +03:00
Craig Topper cfce8f2cfb [X86] Add strict fp support for operations of X87 instructions
This is a follow-up patch to D68854.

This patch adds basic operations of X87 instructions, including +, -, *, / , fp extensions and fp truncations.

Patch by Chen Liu(LiuChen3)

Differential Revision: https://reviews.llvm.org/D68857
2019-11-26 10:59:41 -08:00
Craig Topper b8cb73dd38 [X86] Pre-commit test modifications for D68857. NFC
Patch by Chen Liu(LiuChen3)

Differential Revision: https://reviews.llvm.org/D70706
2019-11-26 10:33:19 -08:00
Fangrui Song cd9c915d2a [Object][RISCV][test] Improve DebugInfo/RISCV/relax-debug-frame.ll
Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D70578
2019-11-26 09:16:55 -08:00
David Green b5315ae8ff [Codegen][ARM] Add addressing modes from masked loads and stores
MVE has a basic symmetry between its normal load/store operations and
the masked variants. This means that masked loads and stores can use
pre-inc and post-inc addressing modes, just like the standard loads and
stores already do.

To enable that, this patch adds all the relevant infrastructure for
treating masked loads/stores addressing modes in the same way as normal
loads/stores.

This involves:
- Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra
   Offset operand that is added after the PtrBase.
- Extending the IndexedModeActions from 8bits to 16bits to store the
   legality of masked operations as well as normal ones. This array is
   fairly small, so doubling the size still won't make it very large.
   Offset masked loads can then be controlled with
   setIndexedMaskedLoadAction, similar to standard loads.
- The same methods that combine to indexed loads, such as
   CombineToPostIndexedLoadStore, are adjusted to handle masked loads in
   the same way.
- The ARM backend is then adjusted to make use of these indexed masked
   loads/stores.
- The X86 backend is adjusted, hopefully with no functional changes.

Differential Revision: https://reviews.llvm.org/D70176
2019-11-26 16:21:01 +00:00
David Green 549db744bd [ARM] Lots of MVE offset masked load and store tests. NFC 2019-11-26 16:21:01 +00:00
jasonliu 7707d8aa9d [XCOFF][AIX] Check linkage on the function, and two fixes for comments
This is a follow-up commit to address a post-commit comment in D70443

Differential revision: https://reviews.llvm.org/D70443
2019-11-26 16:09:31 +00:00
vpykhtin 008e65a7bf [AMDGPU] Fix emitIfBreak CF lowering: use temp reg to make register coalescer life easier.
Differential revision: https://reviews.llvm.org/D70405
2019-11-26 18:59:37 +03:00
Luís Marques 6fd4c42fa8 [LegalizeTypes][RISCV] Soften FCOPYSIGN operand
Summary: Adds support for softening FCOPYSIGN operands.
Adds RISC-V tests that exercise the new softening code.

Reviewers: asb, lenary, efriedma
Reviewed By: efriedma
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70679
2019-11-26 15:22:55 +00:00
Luís Marques d7be3eab5c [RISCV] Handle fcopysign(f32, f64) and fcopysign(f64, f32)
Summary: Adds tablegen patterns to explicitly handle fcopysign where the
magnitude and sign arguments have different types, due to the sign value casts
being removed by the DAGCombiner. Support for RV32IF follows in a separate
commit. Adds tests for all relevant scenarios except RV32IF.

Reviewers: lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70678
2019-11-26 14:26:31 +00:00
Georgii Rymar d88f67bdca [llvm-readobj/llvm-readelf] - Reimplement dumping of the SHT_GNU_verdef section.
Currently we have the following issues:
1) We have 2 different implementations with different behaviors for GNU/LLVM styles.
2) Errors are either not handled at all or we call report_fatal_error with unhelpful messages.
3) There is no test coverage even for those errors that are reported.

This patch reimplements parsing of the SHT_GNU_verdef section entries
in a single place, adds a few error messages and test coverage.

Differential revision: https://reviews.llvm.org/D70495
2019-11-26 17:15:39 +03:00
Sanjay Patel 2bd252ea89 [InferFuncAttributes][Attributor] add tests for 'dereferenceable'; NFC
Pulling a couple of extra tests out of
D64258
before abandoning in favor of
D70714
2019-11-26 09:09:13 -05:00
Georgii Rymar 64225aea8f [llvm-readobj][test] - Cleanup the many-sections.s test case.
It removes 2 precompiled binaries, which can now
be crafted with yaml2obj.

Differential revision: https://reviews.llvm.org/D70711
2019-11-26 16:56:48 +03:00
Georgii Rymar 91827ebf5e [yaml2obj] - Fix BB after «[yaml2obj] - Teach tool to describe SHT_GNU_verdef section with a "Content" property.»
Fixed a temporary file name.

BB: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/669
2019-11-26 16:05:38 +03:00
Georgii Rymar f69ac55d60 [yaml2obj] - Teach tool to describe SHT_GNU_verdef section with a "Content" property.
There is no way to set raw content for SHT_GNU_verdef section.
This patch implements it.

Differential revision: https://reviews.llvm.org/D70710
2019-11-26 15:35:05 +03:00
Alexey Lapshin e73f78acd3 [X86][MC] no error diagnostic for out-of-range jrcxz/jecxz/jcxz
Fix for PR24072:

The X86 instructions jrcxz/jecxz/jcxz perform short jumps if the rcx/ecx/cx register is 0.
The maximum relative offset for a forward short jump is 127 bytes (0x7F).
The maximum relative offset for a backward short jump is 128 bytes (0x80).

The GNU assembler warns when the distance of the jump exceeds the maximum but llvm-as does not.
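
The range limits above amount to a signed 8-bit displacement. A small sketch of such a check (the function names are hypothetical and this is not the MC layer's implementation):

```python
def fits_rel8(displacement: int) -> bool:
    """A short jump encodes its target as a signed 8-bit displacement."""
    return -128 <= displacement <= 127


def encode_rel8(displacement: int) -> int:
    """Return the displacement byte, or raise if the target is out of range."""
    if not fits_rel8(displacement):
        raise ValueError(f"short jump displacement {displacement} out of range")
    return displacement & 0xFF  # two's-complement byte
```

A forward jump of 127 bytes and a backward jump of 128 bytes are accepted; one byte further in either direction is rejected.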

Patch by Konstantin Belochapka and Alexey Lapshin

Differential Revision: https://reviews.llvm.org/D70652
2019-11-26 14:32:17 +03:00
Kerry McLaughlin 4a649ad21a [AArch64][SVE] Implement floating-point conversion intrinsics
Summary:
Adds intrinsics for the following:
  - fcvt
  - fcvtzs & fcvtzu
  - scvtf & ucvtf
  - fcvtlt, fcvtnt
  - fcvtx & fcvtxnt

Reviewers: huntergr, sdesmalen, dancgr, mgudim, efriedma

Reviewed By: sdesmalen

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cameron.mcinally, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70180
2019-11-26 10:31:47 +00:00
Sam Parker 28166816b0 [ARM][ReachingDefs] Remove dead code in loloops.
Add some more helper functions to ReachingDefs to query the uses of
a given MachineInstr and also to query whether two MachineInstrs use
the same def of a register.

For Arm, while tail-predicating, these helpers are used in the
low-overhead loops to remove the dead code that calculates the number
of loop iterations.

Differential Revision: https://reviews.llvm.org/D70240
2019-11-26 10:27:46 +00:00
Sam Parker cced971fd3 [ARM][ReachingDefs] RDA in LoLoops
Add several new methods to ReachingDefAnalysis:
- getReachingMIDef, instead of returning an integer, return the
  MachineInstr that produces the def.
- getInstFromId, return the MachineInstr to which the given integer
  corresponds.
- hasSameReachingDef, return whether two MachineInstr use the same
  def of a register.
- isRegUsedAfter, return whether a register is used after a given
  MachineInstr.

These methods have been used in ARMLowOverhead to replace searching
for uses/defs.

Differential Revision: https://reviews.llvm.org/D70009
2019-11-26 10:13:46 +00:00
Sam Parker 4a59eedd2d [ARM][ConstantIslands] Correct block size update
When inserting a non-decrementing LE, the basic block was being
resized to take into consideration that a tCMP and tBcc had been
combined into one T1 instruction. This is not true in the LE case
where we generate a CBN?Z and an LE.

Differential Revision: https://reviews.llvm.org/D70536
2019-11-26 09:55:58 +00:00
Dávid Bolvanský bb7b8540f0 [InstCombine] Optimize some memccpy calls to memcpy/null
Summary:
return memccpy(d, "helloworld", 'r', 20)
=>
return memcpy(d, "helloworld", 8 /* pos of 'r' in string */), d + 8
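
To see where the constant 8 comes from, here is a small Python model of the `memccpy` semantics (a sketch for illustration; `memccpy_model` is a hypothetical helper, and the trailing NUL of the C string literal is written out explicitly):

```python
def memccpy_model(src: bytes, c: int, n: int):
    """Model memccpy: copy at most n bytes from src, stopping after the first byte c.

    Returns (copied_bytes, end_offset), where end_offset is the offset just past
    the copied c, or None if c does not occur in the first n bytes.
    """
    window = src[:n]
    pos = window.find(bytes([c]))
    if pos == -1:
        return window, None
    return window[: pos + 1], pos + 1


copied, end = memccpy_model(b"helloworld\0", ord("r"), 20)
```

Here `'r'` sits at index 7, so 8 bytes (`hellowor`) are copied and the result points at `d + 8`, which matches the folded form above.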

Reviewers: efriedma, jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68089
2019-11-26 10:54:47 +01:00
Hideto Ueno 78a750276f [Attributor] Track a GEP Instruction in align deduction
Summary:
This patch enables us to track GEP instruction in align deduction.
If a pointer `B` is defined as `A+Offset` and known to have alignment `C`, there exists some integer Q such that
```
 A + Offset = C * Q = B
```
 So we can say that the maximum power of two which is a divisor of gcd(Offset, C) is an alignment.
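
The arithmetic can be checked with a short sketch (illustrative only; `deduced_alignment` is a hypothetical name, not an Attributor function, and `known_align` is assumed to be at least 1):

```python
from math import gcd


def deduced_alignment(offset: int, known_align: int) -> int:
    """Largest power of two dividing gcd(offset, known_align)."""
    g = gcd(offset, known_align)  # gcd(0, C) == C, so a zero offset keeps C
    return g & -g                 # isolate the lowest set bit of g


examples = {
    (4, 16): deduced_alignment(4, 16),
    (24, 16): deduced_alignment(24, 16),
    (0, 16): deduced_alignment(0, 16),
    (3, 8): deduced_alignment(3, 8),
}
```

For example, an offset of 24 with a known alignment of 16 gives gcd 8, so only 8-byte alignment can be concluded; an odd offset leaves alignment 1.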

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70392
2019-11-26 07:55:28 +00:00
Wang, Pengfei 92f1446b8b [X86] Updated strict fp scalar tests and add fp80 tests for D68857, NFC. 2019-11-26 13:44:27 +08:00
Yonghong Song 6db023b99b [BPF] add "llvm." prefix to BPF internally created globals
Currently, BPF backend creates some global variables with name like
  <type_name>:<reloc_type>:<patch_imm>$<access_str>
to carry certain information to BPF backend.

With direct clang compilation, the following code in
   llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
is triggered and the above globals are emitted to the ELF file.
(clang enabled this as opt flag -faddrsig is on by default.)
   if (TM.Options.EmitAddrsig) {
    // Emit address-significance attributes for all globals.
    OutStreamer->EmitAddrsig();
    for (const GlobalValue &GV : M.global_values())
      if (!GV.use_empty() && !GV.isThreadLocal() &&
          !GV.hasDLLImportStorageClass() && !GV.getName().startswith("llvm.") &&
          !GV.hasAtLeastLocalUnnamedAddr())
        OutStreamer->EmitAddrsigSym(getSymbol(&GV));
  }
...
 10162: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT   UND tcp_sock:0:2048$0:117
 10163: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT   UND tcp_sock:0:2112$0:126:0
 10164: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT   UND tcp_sock:1:8$0:31:6
...
While in llc, those globals are not emitted since the EmitAddrsig
option is false by default for llc. The llc flag "-addrsig" can be used to
enable the above code.

This patch adds the "llvm." prefix to these internal globals so that
they can be ignored in the above code and possibly other
places.

Differential Revision: https://reviews.llvm.org/D70703
2019-11-25 21:34:46 -08:00
Muhammad Omair Javaid c9ddb02659 Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there."
This reverts commit 8ff85ed905.

This commit introduced 9 new failures on lldb buildbot host at http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu

Following tests were failing:
    lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq1/TestAmbiguousTailCallSeq1.py
    lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq2/TestAmbiguousTailCallSeq2.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_call_site/TestDisambiguateCallSite.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_paths_to_common_sink/TestDisambiguatePathsToCommonSink.py
    lldb-api :: functionalities/tail_call_frames/disambiguate_tail_call_seq/TestDisambiguateTailCallSeq.py
    lldb-api :: functionalities/tail_call_frames/inlining_and_tail_calls/TestInliningAndTailCalls.py
    lldb-api :: functionalities/tail_call_frames/sbapi_support/TestTailCallFrameSBAPI.py
    lldb-api :: functionalities/tail_call_frames/thread_step_out_message/TestArtificialFrameStepOutMessage.py
    lldb-api :: functionalities/tail_call_frames/thread_step_out_or_return/TestSteppingOutWithArtificialFrames.py
    lldb-api :: functionalities/tail_call_frames/unambiguous_sequence/TestUnambiguousTailCalls.py

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
2019-11-26 09:32:13 +05:00
Craig Topper c43b8ec735 [X86] Add support for STRICT_FP_ROUND/STRICT_FP_EXTEND from/to fp128 to/from f32/f64/f80 in 64-bit mode.
These need to emit a libcall like we do for the non-strict version.

32-bit mode needs SoftenFloat support to be implemented for strict FP nodes.

Differential Revision: https://reviews.llvm.org/D70504
2019-11-25 18:18:39 -08:00
Evgenii Stepanov 06d1110584 Speculative fix for frame-loclist.s test on Windows.
"echo -e" treats windows paths as special characters (ex. "\b").
2019-11-25 17:51:15 -08:00
Eric Christopher 8ff85ed905 As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet,
this is just the initial start done so that people could see it and we could start
tweaking after.

Test updates: Outside of the newpm tests most of the updates are coming from either
optimization passes not run anymore (and without a compelling argument at the moment)
that were largely used for canonicalization in clang.

Original post:

http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html

Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
2019-11-25 17:16:46 -08:00
Craig Topper 3687ddef2c [X86] Add proper execution domain information to the avx512vnni instructions. 2019-11-25 17:07:35 -08:00
Evgenii Stepanov 5906fb682d Fix new llvm-symbolizer tests on Windows.
A forward-slash vs backward-slash issue.
2019-11-25 15:59:13 -08:00
Craig Topper a64dc93ab3 [X86] Add test case for pr44140. NFC 2019-11-25 15:38:24 -08:00
Evgenii Stepanov 1b42cc0df1 llvm-symbolizer: fix handling of DW_AT_specification in FRAME.
Summary:
Use getSubroutineName() to get the subroutine name; this function knows
how to handle cases when DW_TAG_subprogram refers to an earlier
declaration:

0x00000050:     DW_TAG_subprogram
                  DW_AT_linkage_name    ("_ZN1A1fEv")
                  DW_AT_name    ("f")
...
0x00000067:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000000)
                DW_AT_high_pc   (0x0000000000000020)
                DW_AT_specification     (0x00000050 "_ZN1A1fEv")
...
0x0000008c:     DW_TAG_variable

Reviewers: pcc, vitalybuka, jdoerfert

Subscribers: srhines, hiraditya, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70630
2019-11-25 15:06:07 -08:00
Evgenii Stepanov 9f60820d84 llvm-symbolizer: Support loclist in FRAME.
Summary:
Support location lists in FRAME command.
These are used for the majority of local variables in optimized code.
Also support DW_OP_breg in addition to DW_OP_fbreg when it refers to the
same register as DW_AT_frame_base.

Reviewers: pcc, jdoerfert

Subscribers: srhines, hiraditya, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70629
2019-11-25 15:06:07 -08:00
Evgenii Stepanov 1c33d7130e llvm-symbolizer: Fix FRAME handling of missing AT_name.
Summary:
In the llvm-symbolizer protocol, an empty string means end-of-output.
Do not emit an empty string when a function or a variable does not have a
name for any reason. Emit "??" instead.

Reviewers: pcc, vitalybuka, jdoerfert

Subscribers: srhines, hiraditya, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70626
2019-11-25 14:55:11 -08:00
Sanjay Patel 214683f3b2 [DAGCombiner] avoid crash on out-of-bounds insert index (PR44139)
We already have this simplification at node-creation-time, but
the test from:
https://bugs.llvm.org/show_bug.cgi?id=44139
...shows that we can combine our way to an assert/crash too.
2019-11-25 16:24:06 -05:00
Bardia Mahjour 67f0685b4d Revert "[DDG] Data Dependence Graph - Topological Sort"
Revert for now to look into the failures on x86

This reverts commit bec37c3fc7.
2019-11-25 16:17:41 -05:00
Momchil Velikov 09555ce071 [ARM] Generate CMSE instructions from CMSE intrinsics
This patch adds instruction selection patterns for the TT, TTT, TTA, and TTAT
instructions and tests for llvm.arm.cmse.tt, llvm.arm.cmse.ttt,
llvm.arm.cmse.tta, and llvm.arm.cmse.ttat intrinsics (added in a previous
patch).

Patch by Javed Absar.

Differential Revision: https://reviews.llvm.org/D70407
2019-11-25 18:26:12 +00:00
Jonas Paulsson a7d3f6933d [SystemZ] Return the right offsets from getCalleeSavedSpillSlots().
// Due to the SystemZ ABI, the DWARF CFA (Canonical Frame Address) is not
// equal to the incoming stack pointer, but to incoming stack pointer plus
// 160.  The getOffsetOfLocalArea() returned value is interpreted as "the
// offset of the local area from the CFA".

The immediate offsets into the Register save area returned by
getCalleeSavedSpillSlots() should take this offset into account, which this
patch makes sure of.

Patch and review by Ulrich Weigand.
https://reviews.llvm.org/D70427
2019-11-25 19:03:05 +01:00
Nemanja Ivanovic 7fbaa8097e [PowerPC] Fix VSX clobbers of CSR registers
If an inline asm statement clobbers a VSX register that overlaps with a
callee-saved Altivec register or FPR, we will not record the clobber and will
therefore violate the ABI. This is clearly a bug so this patch fixes it.

Differential revision: https://reviews.llvm.org/D68576
2019-11-25 11:41:34 -06:00
bmahjour bec37c3fc7 [DDG] Data Dependence Graph - Topological Sort
Summary:
In this patch the DDG DAG is sorted topologically to put the
nodes in the graph in the order that would satisfy all
dependencies. This helps transformations that would like to
generate code based on the DDG. Since the DDG is a DAG a
reverse-post-order traversal would give us the topological
ordering. This patch also sorts the basic blocks passed to
the builder based on program order to ensure that the
dependencies are computed in the correct direction.
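
The claim that a reverse post-order traversal of a DAG yields a topological ordering can be sketched as follows. This is a generic illustration, not the DDG builder's code; `topological_order` and the small graph are made up for the example.

```python
def topological_order(successors, roots):
    """Return the nodes of a DAG in reverse post-order of a depth-first walk."""
    visited = set()
    post_order = []

    def visit(node):
        if node in visited:
            return
        visited.add(node)
        for succ in successors.get(node, []):
            visit(succ)
        # A node is appended only after everything reachable from it.
        post_order.append(node)

    for root in roots:
        visit(root)
    return post_order[::-1]


# Edges point from a node to the nodes that depend on it.
deps = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
order = topological_order(deps, ["a"])
```

Every edge in `deps` goes from an earlier to a later position in `order`, which is the property a code generator walking the graph would rely on.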

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70609
2019-11-25 11:28:58 -05:00
jasonliu 906ecae2ed [AIX][XCOFF] Generate undefined symbol in symbol table for external function call
Summary:
This patch sets up the infrastructure for

 1. Associate MCSymbolXCOFF with an MCSectionXCOFF when it could not
    get implicitly associated.
 2. Generate undefined symbols. The patch itself generates undefined symbols
    for external function calls only. Generating undefined symbols for external
    global variables and external function descriptors will be handled in
    separate patch(es) after this lands.

Differential Revision: https://reviews.llvm.org/D70443
2019-11-25 15:02:01 +00:00
Jeremy Morse d9c9a4e48d [DebugInfo] Avoid register coalescing unsoundly changing DBG_VALUE locations
This is a re-land of D56151 / r364515 with a completely new implementation.

Once MIR code leaves SSA form and the liveness of a vreg is considered,
DBG_VALUE insts are able to refer to non-live vregs, because their
debug-uses do not contribute to liveness. This non-liveness becomes
problematic for optimizations like register coalescing, as they can't
``see'' the debug uses in the liveness analyses.

As a result registers get coalesced regardless of debug uses, and that can
lead to invalid variable locations containing unexpected values. In the
added test case, the first vreg operand of ADD32rr is merged with various
copies of the vreg (great for performance), but a DBG_VALUE of the
unmodified operand is blindly updated to the modified operand. This changes
what value the variable will appear to have in a debugger.

Fix this by changing any DBG_VALUE whose operand will be resurrected by
register coalescing to be a $noreg DBG_VALUE, i.e. give the variable no
location. This is an overapproximation as some coalesced locations are safe
(others are not) -- an extra domination analysis would be required to work
out which, and it would be better if we just don't generate non-live
DBG_VALUEs.

Differential Revision: https://reviews.llvm.org/D64630
2019-11-25 13:47:06 +00:00
Anna Welker 6fc3e6f2eb [ARM][MVE] Select vqneg
Adds a pattern to ARMInstrMVE.td to use a VQNEG
  instruction if an equivalent multi-instruction
  construct is found.

Differential Revision: https://reviews.llvm.org/D70491
2019-11-25 11:29:14 +00:00
OCHyams 2de23c8364 [DebugInfo@O2][Utils] Undef instead of delete dbg.values in helper func
Summary:
Related bug: https://bugs.llvm.org/show_bug.cgi?id=40648

Static helper function rewriteDebugUsers in Local.cpp deletes dbg.value
intrinsics when it cannot move or rewrite them, or salvage the deleted
instruction's value. It should instead undef them in this case.

This patch fixes that and I've added a test which covers the failing test
case in bz40648. I've updated the unit test Local.ReplaceAllDbgUsesWith
to check for this behaviour (and fixed a typo in the test which would
cause the old test to always pass).

Reviewers: aprantl, vsk, djtodoro, probinson

Reviewed By: vsk

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D70604
2019-11-25 10:55:14 +00:00
Georgii Rymar 9659464d7e [yaml2obj/obj2yaml] - Add support for SHT_LLVM_DEPENDENT_LIBRARIES sections.
This section contains strings specifying libraries to be added to the link by the linker.
The strings are encoded as standard null-terminated UTF-8 strings.

This patch adds a way to describe and dump SHT_LLVM_DEPENDENT_LIBRARIES sections.

I introduced a new YAMLFlowString type here. It is used to teach obj2yaml to dump
them like:

```
Libraries: [ foo, bar ]
```

instead of the following (if StringRef would be used):

```
Libraries:
  - foo
  - bar
```

Differential revision: https://reviews.llvm.org/D70598
2019-11-25 12:57:53 +03:00
QingShan Zhang bae5aac1ff [NFC][Test] Adding the test for bswap + logic op for PowerPC 2019-11-25 08:21:12 +00:00
Craig Topper 4f6f5bdc72 [X86] Add 32-bit RUN line to fp128-libcalls.ll. Add nounwind to test functions. NFC 2019-11-24 21:58:57 -08:00
czhengsz d1c16598b7 Revert "[PowerPC] combine rlwinm+rlwinm to rlwinm"
This reverts commit 29f6f9b2b2.
2019-11-24 22:46:26 -05:00
Seiya Nuta d72a8a4dd5 [llvm-objcopy][MachO] Implement --dump-section
Reviewers: alexshap, rupprecht, jhenderson

Reviewed By: alexshap, rupprecht, jhenderson

Subscribers: MaskRay, jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66408
2019-11-25 12:30:37 +09:00
Florian Hahn 9d24933f79 Recommit f0c2a5a "[LV] Generalize conditions for sinking instrs for first order recurrences."
This version contains 2 fixes for reported issues:
1. Make sure we do not try to sink terminator instructions.
2. Make sure we bail out, if we try to sink an instruction that needs to
   stay in place for another recurrence.

Original message:
If the recurrence PHI node has a single user, we can sink any
instruction without side effects, given that all users are dominated by
the instruction computing the incoming value of the next iteration
('Previous'). We can sink instructions that may cause traps, because
that only causes the trap to occur later, but not on any new paths.

With the relaxed check, we also have to make sure that we do not have a
direct cycle (meaning PHI user == 'Previous'), which indicates a
reduction relation, which potentially gets missed by
ReductionDescriptor.

As follow-ups, we can also sink stores, iff they do not alias with
other instructions we move them across and we could also support sinking
chains of instructions and multiple users of the PHI.

Fixes PR43398.

Reviewers: hsaito, dcaballe, Ayal, rengolin

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D69228
2019-11-24 21:21:55 +00:00
Simon Atanasyan 1de788a1f1 [mips] Split test into MIPS and microMIPS parts. NFC 2019-11-25 00:19:31 +03:00
Florian Hahn 9a432161c6 [LoopInterchange] Adjust assertions when updating successors.
Currently the assertion in updateSuccessor is overly strict in some
cases and overly relaxed in other cases. For branches to the inner and
outer loop preheader it is too strict, because they can either be
unconditional branches or conditional branches with duplicate targets.
Both cases are fine and we can allow updating multiple successors.

On the other hand, we have to at least update one successor. This patch
adds such an assertion.
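
The intended behaviour can be sketched on a plain list of successor names. This is a simplified model, not the LoopInterchange code: `update_successor` and its `must_update_at_least_one` parameter are hypothetical names chosen for the example.

```python
def update_successor(successors, old, new, must_update_at_least_one=True):
    """Replace every occurrence of `old` in `successors` with `new`, in place."""
    changed = False
    for i, succ in enumerate(successors):
        if succ == old:
            successors[i] = new
            changed = True
    # Updating several successors is fine; updating none is a bug.
    assert changed or not must_update_at_least_one, "no successor was updated"
    return changed


# A conditional branch whose two targets are the same block.
branch = ["preheader", "preheader"]
update_successor(branch, "preheader", "new.preheader")
```

Both targets of `branch` are rewritten, while a call that matches no successor trips the assertion.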
2019-11-24 19:37:16 +00:00
Sanjay Patel f575f12c64 [InstCombine] remove identity shuffle simplification for mask with undefs
And simultaneously enhance SimplifyDemandedVectorElts() to recognize that
pattern. That preserves some of the old optimizations in IR.

Given a shuffle that includes undef elements in an otherwise identity mask like:

define <4 x float> @shuffle(<4 x float> %arg) {
  %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3>
  ret <4 x float> %shuf
}

We were simplifying that to the input operand.

But as discussed in PR43958:
https://bugs.llvm.org/show_bug.cgi?id=43958
...that means that per-vector-element poison that would be stopped by the shuffle can now
leak to the result.

Also note that we still have (and there are tests for) the same transform with no undef
elements in the mask (a fully-defined identity mask). I don't think there's any
controversy about that case - it's a valid transform under any interpretation of
shufflevector/undef/poison.

Looking at a few of the diffs into codegen, I don't see any difference in final asm. So
depending on your perspective, that's good (no real loss of optimization power) or bad
(poison exists in the DAG, so we only partially fixed the bug).

Differential Revision: https://reviews.llvm.org/D70246
2019-11-24 10:06:26 -05:00
Amy Kwan d1dded28da [PowerPC] Spill CR LT bits on P9 using setb
This patch aims to spill CR[0-7]LT bits on POWER9 using the setb instruction.
The sequence on P9 to spill these bits will be:

setb %reg, %CRREG
stw %reg, $FI

Instead of the typical sequence:

mfocrf %reg, %CRREG
rlwinm %reg1, %reg, $SH, 0, 0
stw %reg1, $FI

Differential Revision: https://reviews.llvm.org/D68443
2019-11-24 00:27:40 -06:00
Thomas Raoux e0297a8bee [ModuloSchedule] Fix a bug in experimental expander
Fix two problems that popped up after my last patch. One is that the
stitching of prologue/epilogue can be wrong when reading a value from a
previous stage. Also changed how we duplicate phi instructions to avoid
generating extra phis that we delete later.

Differential Revision: https://reviews.llvm.org/D70213
2019-11-23 16:01:47 -08:00
Ehud Katz 986d8bf6fb Revert "[InlineCost] Fix infinite loop in indirect call evaluation"
This reverts commit 854e956219.
It broke tests:
Transforms/Inline/redundant-loads.ll
Transforms/SampleProfile/inline-callee-update.ll
2019-11-23 20:16:08 +02:00
Austin Kerbow fef69706dc AMDGPU: Handle waitcnt overflow
Summary:
The waitcnt pass can overflow the counters when the number of outstanding events
for a type exceeds the capacity of the counter. This can lead to inefficient
insertion of waitcnts, or to waitcnt instructions with max values for each type.
The last situation can cause an instruction which when disassembled appears to
be an illegal waitcnt without an operand.

In these cases we should add a wait for the 'counter maximum' - 1, and update the
waitcnt brackets accordingly.

Reviewers: rampitec, arsenm

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70418
2019-11-23 09:34:23 -08:00
Ehud Katz 854e956219 [InlineCost] Fix infinite loop in indirect call evaluation
Currently every time we encounter an indirect call of a known function,
we try to evaluate the inline cost of that function. In case of a
recursion, that evaluation never stops.

The solution presented is to evaluate only the indirect call of the
function, while any further indirect calls (of a known function) will be
treated just as direct function calls, for which no such evaluation is
attempted.

Fixes PR35469.

Differential Revision: https://reviews.llvm.org/D69349
2019-11-23 19:02:59 +02:00
Sourabh Singh Tomar 0e02977b6e Recommit "[DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump."
The original commit message follows.

This patch adds support for debug_loclists.dwo section in llvm and llvm-dwarfdump.
Also Fixes PR43622, PR43623.

Reviewers: dblaikie, probinson, labath, aprantl, jini.susan.george

Differential Revision: https://reviews.llvm.org/D69462
2019-11-23 20:10:23 +05:30
Sourabh Singh Tomar 02cb4b2fd6 Revert "[DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump."
This reverts commit 81b0a3284a.
Will re-apply with an updated Differential Revision, for automatic closure of
Phabricator review.
2019-11-23 19:46:07 +05:30
Sourabh Singh Tomar 81b0a3284a [DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump.
This patch adds support for debug_loclists.dwo section in llvm and llvm-dwarfdump.
Also Fixes PR43622, PR43623.

Reviewers: dblaikie, probinson, labath, aprantl, jini.susan.george

https://reviews.llvm.org/D69462
2019-11-23 10:25:11 +05:30
Francis Visoiu Mistrih 4506afe3ca [Remarks] Allow empty temporary remark files
When parsing bitstream remarks, allow external remark files to be
empty, which means there are no remarks to be parsed.

In the same way, dsymutil should not produce a remark file.
2019-11-22 15:58:12 -08:00
Evandro Menezes ff0f407e90 [MCA] Fix test cases (NFC)
Fix the test cases for Exynos M5 that break under Darwin.
2019-11-22 16:19:58 -06:00
Davide Italiano c32f0ff92f [InstCombine] Fix call guard difference with dbg
Patch by Chris Ye!

Differential Revision: https://reviews.llvm.org/D68004
2019-11-22 13:35:53 -08:00
Evandro Menezes 48b7fe02a1 [AArch64] Add the pipeline model for Exynos M5
Add the scheduling and cost models for Exynos M5.
2019-11-22 15:09:17 -06:00
Anton Afanasyev 80cd6b6e04 [SLP] Enhance SLPVectorizer to vectorize vector aggregate
Summary:
Vector aggregate is homogeneous aggregate of vectors like `{ <2 x float>, <2 x float> }`.
This patch allows `findBuildAggregate()` to consider vector aggregates as
well as scalar ones. For instance, `{ <2 x float>, <2 x float> }` maps to `<4 x float>`.

Fixes vector part of llvm.org/PR42022

Reviewers: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70068
2019-11-22 20:01:59 +03:00
Anton Afanasyev 6d73265ad8 [SLP][Test] Precommit tests for D70068 and D70587. NFC. 2019-11-22 19:47:21 +03:00
Kazu Hirata 1a58be2ac5 [JumpThreading] Use profile data even with the new pass manager
Summary:
Without this patch, the jump threading pass ignores profiling data
whenever we invoke the pass with the new pass manager.

Specifically, JumpThreadingPass::run calls runImpl with class variable
HasProfileData always set to false.  In turn, runImpl sets
HasProfileData to false again:

  HasProfileData = HasProfileData_;

In the end, we don't use profiling data at all with the new pass
manager.

This patch fixes the problem by passing F.hasProfileData() to runImpl.

The bug appears to have been introduced at:

  https://reviews.llvm.org/D41461

which removed local variable HasProfileData in JumpThreadingPass::run
even though there was one more use left in the same function.  As a
result, the remaining use ended up referring to the class variable
instead.

Note that F.hasProfileData is an extremely lightweight function, so I
don't see the need to cache its result.  Once this patch is approved,
I'm planning to stop caching the result of F.hasProfileData in
runOnFunction.

Reviewers: wmi, eli.friedman

Subscribers: hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70509
2019-11-22 08:21:48 -08:00
Yonghong Song 9e6aa81588 [BPF] Fix a recursion bug in BPF Peephole ZEXT optimization
Commit a0841dfe85 ("[BPF] Fix a bug in peephole optimization")
fixed a bug in peephole optimization. Recursion is introduced
to handle COPY and PHI instructions.

Unfortunately, multiple PHI instructions may form a cycle
and this will cause infinite recursion and an eventual segfault.
For Commit a0841dfe85, I indeed tried a few loops to ensure
that I won't see the recursion, but I did not try with
complex control flows, which, as demonstrated with the test case
in this patch, may introduce PHI cycles.

This patch fixed the issue by introducing a set to remember
visited PHI instructions. This way, cycles can be properly
detected and handled.
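
The visited-set idea can be sketched on a toy definition graph. This is not the BPF peephole code: `all_sources_ok` and the `defs` encoding are hypothetical, and treating an already visited PHI as neutral is a choice made for this sketch (a conservative implementation could return `False` there instead).

```python
def all_sources_ok(name, defs, visited=None):
    """Return True if every leaf definition reaching `name` is marked ok.

    `defs` maps a value name to ("phi", [operands]), ("copy", operand)
    or ("leaf", bool).
    """
    if visited is None:
        visited = set()
    kind, payload = defs[name]
    if kind == "leaf":
        return payload
    if kind == "copy":
        return all_sources_ok(payload, defs, visited)
    # kind == "phi": stop if this PHI is already being examined.
    if name in visited:
        return True
    visited.add(name)
    return all(all_sources_ok(op, defs, visited) for op in payload)


# Two PHIs that feed each other, as a loop nest can produce.
cyclic = {
    "a": ("phi", ["b", "x"]),
    "b": ("phi", ["a", "y"]),
    "x": ("leaf", True),
    "y": ("leaf", True),
}
result = all_sources_ok("a", cyclic)
```

Without the `visited` set, the walk from `a` to `b` and back to `a` would recurse until the stack overflows.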

Differential Revision: https://reviews.llvm.org/D70586
2019-11-22 08:05:43 -08:00
jasonliu af8576ff9d [XCOFF][AIX] Read-only data section object file generation
Summary:
This patch is a follow up on read-only assembly patch D70182.
It intends to enable object file generation for the read-only data section on AIX.

Reviewers: DiggerLin, daltenty

Differential Revision: https://reviews.llvm.org/D70455
2019-11-22 15:49:37 +00:00
Clement Courbet cb15ba84fe Reland "[DAGCombiner] Allow zextended load combines."
Check that the generated type is simple.
2019-11-22 14:47:18 +01:00
Juneyoung Lee 1465b8bc3a [Test] Fix freeze ocaml test failure 2019-11-22 22:34:37 +09:00
Roman Lebedev 96cf5c8d47 [Codegen] TargetLowering::prepareUREMEqFold(): `x u% C1 ==/!= C2` (PR35479)
Summary:
The current lowering is:
```
Name: (X % C1) == C2 -> X * C3 <= C4 || false
Pre: (C2 == 0 || C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 available in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n3, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/2xC
https://rise4fun.com/Alive/jpb5

However, we can support non-tautological cases `C1 u> C2` too.
Said handling consists of two parts:
* `C2 u<= (-1 %u C1)`. It just works. We only have to change `(X % C1) == C2` into `((X - C2) % C1) == 0`
```
Name: (X % C1) == C2 -> (X - C2) * C3 <= C4   iff C2 u<= (-1 %u C1)
Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 && C2 u<= (-1 %u C1)
%zz = and i8 C3, 0 ; trick alive into making C3 available in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = (-1 /u C1)
%n0 = sub i8 %x, C2
%n1 = mul i8 %n0, C3
%n2 = lshr i8 %n1, countTrailingZeros(C1) ; rotate right
%n3 = shl i8 %n1, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n4 = or i8 %n2, %n3 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n4, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/m4P
https://rise4fun.com/Alive/SKrx
* `C2 u> (-1 %u C1)`. We also have to change `(X % C1) == C2` into `((X - C2) % C1) == 0`,
  and we have to decrement C4:
```
Name: (X % C1) == C2 -> (X - C2) * C3 <= C4   iff C2 u> (-1 %u C1)
Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 && C2 u> (-1 %u C1)
%zz = and i8 C3, 0 ; trick alive into making C3 available in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = (-1 /u C1)-1
%n0 = sub i8 %x, C2
%n1 = mul i8 %n0, C3
%n2 = lshr i8 %n1, countTrailingZeros(C1) ; rotate right
%n3 = shl i8 %n1, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n4 = or i8 %n2, %n3 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n4, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/d40
https://rise4fun.com/Alive/8cF

I believe this concludes `x u% C1 ==/!= C2` lowering.
In fact, clang may now be better in this regard than gcc:
as can be seen from the `@t32_6_4` test, we do lower `x % 6 == 4`
via this pattern, while gcc does not: https://godbolt.org/z/XNU2z9
And all the general alive proofs say this is legal.
And manual checking agrees: https://rise4fun.com/Alive/WA2
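
For readers without access to the Alive links, the three cases above can be cross-checked by brute force on 8-bit values. The sketch below is an independent model written for illustration, not code from the patch; `urem_eq_fold` and its helpers are hypothetical names, and `pow(..., -1, ...)` needs Python 3.8 or newer.

```python
BITS = 8
MASK = (1 << BITS) - 1


def count_trailing_zeros(value):
    """Number of trailing zero bits of a nonzero integer."""
    return (value & -value).bit_length() - 1


def rotate_right(value, amount):
    """Rotate a BITS-wide value right by `amount` bits."""
    amount %= BITS
    return ((value >> amount) | (value << (BITS - amount))) & MASK


def urem_eq_fold(x, c1, c2):
    """Evaluate (x % c1) == c2 without taking a remainder of x."""
    if c1 <= c2:
        return False  # tautologically false: a remainder is always < c1
    tz = count_trailing_zeros(c1)
    c3 = pow(c1 >> tz, -1, 1 << BITS)  # inverse of the odd part of c1
    c4 = MASK // c1
    if c2 > MASK % c1:
        c4 -= 1  # one fewer x falls into this residue class
    n = ((x - c2) * c3) & MASK
    return rotate_right(n, tz) <= c4


# The `x % 6 == 4` case mentioned above, checked for every 8-bit x.
mismatches = [x for x in range(256) if urem_eq_fold(x, 6, 4) != (x % 6 == 4)]
```

`mismatches` comes out empty; the decrement of C4 is what keeps `x = 0` (where `x - 4` wraps to 252, a multiple of 6) from being accepted.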

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=35479 | PR35479 ]].

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: nick, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70053
2019-11-22 15:22:42 +03:00
Roman Lebedev 3f46022e33 [Codegen] TargetLowering::prepareUREMEqFold(): `x u% C1 ==/!= C2` with tautological C1 u<= C2 (PR35479)
Summary:
This is a preparatory cleanup before I add more
of this fold to deal with comparisons with non-zero.

In essence, the current lowering is:
```
Name: (X % C1) == 0 -> X * C3 <= C4
Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 available in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, 0
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%r = icmp ule i8 %n3, %C4
```
https://rise4fun.com/Alive/oqd

It kinda just works, really no weird edge-cases.
But it isn't all that great for when comparing with non-zero.
In particular, given `(X % C1) == C2`, there will be problems
in the always-false tautological case where `C2 u>= C1`:
https://rise4fun.com/Alive/pH3

That case is tautological, always-false:
```
Name: (X % Y) u>= Y
%o0 = urem i8 %x, %y
%r = icmp uge i8 %o0, %y
  =>
%r = false
```
https://rise4fun.com/Alive/ofu

While we can't/shouldn't get such tautological case normally,
we do deal with non-splat vectors, so unless we want to give up
in this case, we need to fixup/short-circuit such lanes.

There are two lowering variants:
1. We can blend between whatever computed result and the correct tautological result
```
Name: (X % C1) == C2 -> X * C3 <= C4 || false
Pre: (C2 == 0 || C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 available in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%res = icmp ule i8 %n3, %C4
%r = select i1 %is_tautologically_false, i1 0, i1 %res
```
https://rise4fun.com/Alive/PjT5
https://rise4fun.com/Alive/1KV

2. We can invert the comparison result
```
Name: (X % C1) == C2 -> X * C3 <= C4 || false
Pre: (C2 == 0 || C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 available in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
  =>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n3, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/2xC
https://rise4fun.com/Alive/jpb5

3. We can expand into `and`/`or`:
https://rise4fun.com/Alive/WGn
https://rise4fun.com/Alive/lcb5

Blend-one is likely better since we avoid having to load the
replacement from constant pool. `xor` is second best since
it's still pretty general. I'm not adding `and`/`or` variants.
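
The select and xor variants can be compared with a small 8-bit model, restricted to the stated precondition (`C2 == 0` or `C1 u<= C2`). This is an illustrative sketch, not code from the patch; `rotated_product`, `fold_select` and `fold_xor` are hypothetical names, and `pow(..., -1, ...)` needs Python 3.8 or newer.

```python
BITS = 8
MASK = (1 << BITS) - 1


def rotated_product(x, c1):
    """Compute rotr(x * C3, countTrailingZeros(c1)) on BITS-wide values."""
    tz = (c1 & -c1).bit_length() - 1
    c3 = pow(c1 >> tz, -1, 1 << BITS)  # inverse of the odd part of c1
    n = (x * c3) & MASK
    return ((n >> tz) | (n << (BITS - tz))) & MASK


def fold_select(x, c1, c2):
    """Variant 1: blend the computed result with the tautological one."""
    is_tautologically_false = c1 <= c2
    res = rotated_product(x, c1) <= MASK // c1
    return False if is_tautologically_false else res


def fold_xor(x, c1, c2):
    """Variant 2: force the compare to true, then invert it."""
    is_tautologically_false = c1 <= c2
    c4_fixed = MASK if is_tautologically_false else MASK // c1
    res = rotated_product(x, c1) <= c4_fixed
    return res != is_tautologically_false  # xor on booleans


# Lanes of a non-splat vector: one real test and one tautological lane.
lanes = [(6, 0), (6, 9)]
results = [[fold_xor(x, c1, c2) for x in (0, 6, 9, 12)] for c1, c2 in lanes]
```

In the tautological lane the compare against `-1` is always true, so the xor turns it into the required `false` without a separate blend of the final result.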

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: nick, hiraditya, xbolva00, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70051
2019-11-22 15:16:03 +03:00
Simon Pilgrim 5aaca2355e [X86] Updated strict fp scalar tests and add fp80 tests for D68857 2019-11-22 11:57:21 +00:00
Pavel Labath 01bb3b07c3 [DWARFVerifier] Use the new location list api
Summary:
Instead of going to the debug_loc section directly, use new
DWARFDie::getLocations instead. This means that the code will now
automatically support debug_loclists sections.

This is the last usage of the old debug_loc methods, and they can now be
removed.

Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX

Subscribers: hiraditya, probinson, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70534
2019-11-22 10:08:39 +01:00
QingShan Zhang a4cc895aee [PowerPC] Implement the vector extend sign instruction pattern match
Power9 has instructions to implement the semantics of SIGN_EXTEND_INREG for vector type.
Mark it as legal and add the match pattern.

Differential Revision: https://reviews.llvm.org/D69601
2019-11-22 08:58:27 +00:00
Clement Courbet 88e205525c Revert "[DAGCombiner] Allow zextended load combines."
Breaks some bots.
2019-11-22 09:01:08 +01:00
Clement Courbet 036790f988 [DAGCombiner] Allow zextended load combines.
Summary: or(zext(load8(base)), zext(load8(base+1))) -> zext(load16 base)

Reviewers: apilipenko, RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70487
2019-11-22 08:40:19 +01:00
czhengsz 29f6f9b2b2 [PowerPC] combine rlwinm+rlwinm to rlwinm
combine
x3 = rlwinm x3, 27, 5, 31
x3 = rlwinm x3, 19, 0, 12

to
x3 = rlwinm x3, 14, 0, 12
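
The equivalence of the two sequences above can be checked with a small model of `rlwinm` (rotate left, then AND with a mask of ones from bit MB through bit ME, where bit 0 is the most significant bit). The helper names are made up for this sketch, and it only checks this one example, not the general combine rule.

```python
M32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit value left by `shift` bits."""
    shift %= 32
    return ((value << shift) | (value >> (32 - shift))) & M32


def mask32(mb, me):
    """Ones from bit mb through bit me, counting bit 0 as the MSB."""
    head = M32 >> mb
    tail = (M32 << (31 - me)) & M32
    # When mb > me the mask wraps around the end of the word.
    return head & tail if mb <= me else head | tail


def rlwinm(rs, sh, mb, me):
    """Model of the PowerPC rlwinm instruction on 32-bit values."""
    return rotl32(rs, sh) & mask32(mb, me)


samples = [0, 1, 0x80000000, 0x12345678, 0xDEADBEEF, M32]
pairs = [(rlwinm(rlwinm(x, 27, 5, 31), 19, 0, 12), rlwinm(x, 14, 0, 12))
         for x in samples]
```

Each pair in `pairs` holds equal values: the rotations add up to 46, which is 14 modulo 32, and rotating the first mask by 19 still covers every bit kept by the second mask.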

Reviewed by: steven.zhang

Differential Revision: https://reviews.llvm.org/D70374
2019-11-22 00:00:33 -05:00
Wang, Pengfei 085d7847aa [X86] Add option 'disable-strictnode-mutation' for tests that respect
strict fp semantics. NFCI.
2019-11-22 12:26:55 +08:00
Craig Topper b29e5cdb7c [X86] Add test cases for most of the constrained fp libcalls with fp128.
Add explicit setOperation actions for some to match their non-strict
counterparts. This isn't required, but makes the code
self documenting that we didn't forget about strict fp. I've
used LibCall instead of Expand since that's more explicitly what
we want.

Only lrint/llrint/lround/llround are missing now.
2019-11-21 18:17:59 -08:00
Craig Topper fc4020dbbe [X86] Mark fp128 FMA as LibCall instead of Expand. Add STRICT_FMA as well.
The Expand code would fall back to LibCall, but this makes it
more explicit.
2019-11-21 18:17:57 -08:00
Craig Topper 7696b99258 [LegalizeDAG][X86] Add support for turning STRICT_FADD/SUB/MUL/DIV into libcalls. Use it for fp128 on x86-64.
This requires a minor hack for f32/f64 strict fadd/fsub to avoid
turning those into libcalls.
2019-11-21 16:19:25 -08:00
Craig Topper 95f44cf44a [X86] Mark vector STRICT_FADD/STRICT_FSUB as Legal and add mutation to X86ISelDAGToDAG
This prevents LegalizeVectorOps from scalarizing them. We'll need
to remove the X86 mutation code when we add isel patterns.
2019-11-21 16:19:18 -08:00
Craig Topper 0cc12b8a83 [X86] Remove regcall calling convention from fp-strict-scalar.ll. Add 32-bit and 64-bit check prefixes.
The regcall was making 32-bit mode pass things in xmm registers
which made 32-bit and 64-bit more similar. But I have upcoming
patches that require them to be separated anyway.
2019-11-21 16:18:55 -08:00
Alexander Shaposhnikov b6d3774a27 [llvm-lipo] Add support for -extract
This diff adds support for -extract.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D70522
2019-11-21 16:11:48 -08:00
Philip Reames dfb7a9091a [LoopPred] Robustly handle partially unswitched loops
We may end up with a case where we have a widenable branch above the loop, but not all widenable branches within the loop have been removed.  Since a widenable branch inhibits SCEV's ability to reason about exit counts (by design), we have a tradeoff between effectiveness of this optimization and allowing future widening of the branches within the loop.  LoopPred is thought to be one of the most important optimizations for range check elimination, so let's pay the cost.
2019-11-21 15:44:36 -08:00
Luís Marques 7bf721e59c [Object][RISCV] Resolve R_RISCV_32_PCREL
Summary: Add support for resolving `R_RISCV_32_PCREL` relocations. Those aren't
actually resolved AFAIK, but support is still needed to avoid llvm-dwarfdump
errors. The use of these relocations was introduced in D66419 but the
corresponding resolving wasn't added then. The test adds a check that should
catch future unresolved relocations.

Reviewers: asb, lenary
Reviewed By: asb
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70204
2019-11-21 23:34:05 +00:00
David Tellenbach 75434366ce [AArch64] [FrameLowering] Allow conditional insertion of CFI instruction
Summary:
The insertion of most CFI instructions during AArch64 frame lowering can
be disabled (e.g. using the function attribute `nounwind`).

This patch enables conditional insertion for one more CFI instruction.

Reviewers: t.p.northover, ostannard

Reviewed By: ostannard

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70129
2019-11-22 00:27:41 +01:00
Joel E. Denny f471eb8e99 [FileCheck] Make FILECHECK_OPTS useful for its test suite
Without this patch, `FILECHECK_OPTS` isn't propagated to FileCheck's
test suite so that `FILECHECK_OPTS` doesn't inadvertently affect test
results by affecting the output of FileCheck calls under test.  As a
result, `FILECHECK_OPTS` is useless for debugging FileCheck's test
suite.

In `llvm/test/FileCheck/lit.local.cfg`, this patch provides a new
substitution, `%ProtectFileCheckOutput`, to address this problem for
both `FILECHECK_OPTS` and the deprecated
`FILECHECK_DUMP_INPUT_ON_FAILURE`.  The rest of the patch uses
`%ProtectFileCheckOutput` throughout the test suite.

Fixes PR40284.

Reviewed By: probinson, thopre

Differential Revision: https://reviews.llvm.org/D65121
2019-11-21 18:01:12 -05:00
Vedant Kumar 844d97f650 Clang-trunk Generates Wrong Debug values with -O1
Bit-Tracking Dead Code Elimination (bdce) does not mark dbg.value as undef after
deleting an instruction, which shows an invalid state of the variable in the
debugger. This patch fixes this by marking as undef any dbg.value that depends
on a dead instruction.

This fixes https://bugs.llvm.org/show_bug.cgi?id=41925

Patch by kamlesh kumar!

Differential Revision: https://reviews.llvm.org/D70040
2019-11-21 13:53:10 -08:00
Craig Topper 1439059cc7 [X86] Change legalization action for f128 fadd/fsub/fmul/fdiv from Custom to LibCall.
The custom code just emits a libcall, but we can do the same
with generic code. The only difference is that the generic code
can form tail calls where the custom code couldn't. This is
responsible for the test changes.

This avoids needing to modify the Custom handling for strict fp.
2019-11-21 11:44:29 -08:00
Craig Topper fea8288c17 [X86] Add test case for f128 fma. NFC
This should be turned into a libcall to fmal. We already do it
correctly, but we had no test to confirm.
2019-11-21 11:44:27 -08:00
Philip Reames aaea24802b Broaden the definition of a "widenable branch"
As a reminder, a "widenable branch" is the pattern "br i1 (and i1 X, WC()), label %taken, label %untaken" where "WC" is the widenable condition intrinsic. The semantics of such a branch (derived from the semantics of WC) is that a new condition can be added into the condition arbitrarily without violating legality.

Broaden the definition in two ways:
    Allow swapped operands to the br (and X, WC()) form
    Allow widenable branch w/trivial condition (i.e. true) which takes form of br i1 WC()

The former is just general robustness (e.g. for X = non-instruction this is what instcombine produces). The latter is specifically important as partial unswitching of a widenable range check produces exactly this form above the loop.

Differential Revision: https://reviews.llvm.org/D70502
2019-11-21 10:46:16 -08:00
Philip Reames d9426c3360 [Tests] Autogenerate a bunch of SCEV trip count tests for readability. Will likely merge some of these files soon. 2019-11-21 10:46:16 -08:00
Philip Reames 70d173fb1f [SCEV] Add a mode to skip classification when printing analysis
For the various trip-count tests, the classification isn't useful and makes the auto-generated tests super verbose.  By skipping it, we make the auto-gen tests closer to the manually written ones.  Up next: auto-genning a bunch of the existing tests.
2019-11-21 10:24:19 -08:00
Philip Reames f1a9a83232 [SCEV] Be robust against IR generated by simple-loop-unswitch
Simple loop unswitch likes to leave around unsimplified and/or/xors. SCEV today bails out on these idioms, which is unfortunate in general, and specifically for the unswitch interaction.

Differential Revision: https://reviews.llvm.org/D70459
2019-11-21 09:53:43 -08:00
Fangrui Song 30ccee71ca [llvm-objcopy][MachO] Implement --strip-debug
Reviewed By: alexshap

Differential Revision: https://reviews.llvm.org/D70476
2019-11-21 09:40:34 -08:00
Fangrui Song 242002770b [llvm-objcopy][MachO] Fix symbol order in the symbol table
Only consider isUndefinedSymbol() when the symbol is not local. This
fixes an assert failure when copying the symbol table, if a n_type=0x20
symbol is followed by a n_type=0x64 symbol.
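
A rough sketch of why the locality check matters, assuming the usual Mach-O n_type bit masks (N_EXT = 0x01, N_TYPE = 0x0e, N_UNDF = 0x0) and a symbol table ordered as locals, then defined externals, then undefined externals. The helper names below are hypothetical and are not llvm-objcopy's code.

```python
# Mach-O n_type bit masks (values as in <mach-o/nlist.h>).
N_TYPE = 0x0E  # mask for the type bits
N_EXT = 0x01   # external symbol bit
N_UNDF = 0x00  # type bits of an undefined symbol


def is_local(n_type):
    return (n_type & N_EXT) == 0


def is_undefined(n_type):
    # Only a non-local symbol can count as undefined. Without the
    # locality check, a stab entry such as n_type=0x20 (type bits all
    # zero) would be misclassified as undefined.
    return not is_local(n_type) and (n_type & N_TYPE) == N_UNDF


def symbol_group(n_type):
    """0 = local, 1 = defined external, 2 = undefined external."""
    if is_local(n_type):
        return 0
    return 2 if is_undefined(n_type) else 1


def sort_symbols(symbols):
    """Stable sort of (name, n_type) pairs by group."""
    return sorted(symbols, key=lambda sym: symbol_group(sym[1]))
```

With this classification, an n_type=0x20 symbol followed by an n_type=0x64 symbol stays in its original position, since both are local.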

Reviewed By: alexshap, seiya

Differential Revision: https://reviews.llvm.org/D70475
2019-11-21 09:30:46 -08:00
Bjorn Pettersson 898de30291 [BranchFolding] Fix PR43964 about branch folder not being debug invariant
Summary:
The fix in BranchFolder for debug-invariance problems, done in
commit ec32dff0b0, actually introduced some new problems with
debug invariance.

Before that patch, ComputeCommonTailLength would move iterators
back, past debug instructions, so that ProfitableToMerge would
give consistent answers "when one block differs from the other
only by whether debugging pseudos are present at the beginning".
But the changes in ec32dff0b0 undid that by moving the iterators
forward again.

This patch refactors ComputeCommonTailLength. The function was
really complex, considering that the SkipTopCFIAndReturn part
always moved the iterators forward to the first "real" instruction
in the found tail after ec32dff0b0.

The patch also restores the logic to "back past possible debugging
pseudos at beginning of block" to make sure ProfitableToMerge
gives consistent answers independent of DBG_VALUE instructions
before the tail. That is now done by ProfitableToMerge instead of
being hidden as a side-effect in ComputeCommonTailLength.
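
The division of labour described above can be sketched with a toy model. This is a hypothetical illustration, not the BranchFolder code: blocks are lists of instruction strings, and the names is_debug, common_tail and only_debug_before are assumptions.

```python
def is_debug(instr):
    # Toy stand-in for a debug pseudo instruction.
    return instr.startswith("DBG_VALUE")


def common_tail(block_a, block_b):
    """Return (tail_len, start_a, start_b).

    tail_len counts matching non-debug instructions at the end of both
    blocks; start_a/start_b index the first real instruction of the
    tail in each block (the iterators are not moved back past debug
    pseudos here).
    """
    ia, ib = len(block_a), len(block_b)
    tail_len, start_a, start_b = 0, ia, ib
    while True:
        # Skip debug pseudos while scanning backwards.
        while ia > 0 and is_debug(block_a[ia - 1]):
            ia -= 1
        while ib > 0 and is_debug(block_b[ib - 1]):
            ib -= 1
        if ia == 0 or ib == 0 or block_a[ia - 1] != block_b[ib - 1]:
            break
        ia -= 1
        ib -= 1
        tail_len += 1
        start_a, start_b = ia, ib
    return tail_len, start_a, start_b


def only_debug_before(block, start):
    """Back past debug pseudos at the beginning of the block: True when
    nothing but debug instructions precedes the tail."""
    return all(is_debug(instr) for instr in block[:start])
```

In this model, a block ["DBG_VALUE x", "add", "ret"] and a block ["add", "ret"] both report that the tail covers the whole block, so the answer does not depend on the leading DBG_VALUE.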

Reviewers: probinson, yechunliang, jmorse

Reviewed By: jmorse

Subscribers: Orlando, mehdi_amini, dexonsmith, aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70091
2019-11-21 18:13:32 +01:00
Adrian Prantl 1b9ef3bbb5 Reduce the number of iterations in testcase. (NFC) 2019-11-21 08:32:55 -08:00