The current implementation is emitting a global constant that happens
to evaluate to the same bytes + relocation as a jump instruction on
X86. This does not work for PIE executables and shared libraries
though, because we end up with a wrong relocation type. And it has no
chance of working on ARM/AArch64 which use different relocation types
for jump instructions (R_ARM_JUMP24) that is never generated for
data.
This change replaces the constant with module-level inline assembly
followed by a hidden declaration of the jump table. Works fine for
ARM/AArch64, but has some drawbacks.
* Extra symbols are added to the static symbol table, which inflate
the size of the unstripped binary a little. Stripped binaries are not
affected. This happens because jump table declarations must be
external (because their body is in the inline asm).
* Original functions that were anonymous are now named
<original name>.cfi, and it affects symbolization sometimes. This is
necessary because the only user of these functions is the (inline
asm) jump table, so they had to be added to @llvm.used, which does
not allow unnamed functions.
llvm-svn: 286611
This is a partial revert of r244615 (http://reviews.llvm.org/D11942),
which caused a major regression in debug info quality.
Turning the artificial __MergedGlobal symbols into private symbols
(l__MergedGlobal) means that the linker will not include them in the
symbol table of the final executable. Without a symbol table entry
dsymutil is not be able to process the debug info for any of the
merged globals and thus drops the debug info for all of them.
This patch is enabling the old behavior for all MachO targets while
leaving all other targets unaffected.
rdar://problem/29160481
https://reviews.llvm.org/D26531
llvm-svn: 286607
https://reviews.llvm.org/D26526
- Fixed DW_FORM_strp to be correctly sized and extracted for DWARF64
- Added some missing strp variants as well
- Fixed comment typo
llvm-svn: 286603
In preparation for a follow on patch that improves DWARF parsing speed, clean up DWARFFormValue so that we have can get the fixed byte size of a form value given a DWARFUnit or given the version, address byte size and dwarf32/64.
This patch cleans up code so that everyone is using one of the new DWARFFormValue functions:
static Optional<uint8_t> DWARFFormValue::getFixedByteSize(dwarf::Form Form, const DWARFUnit *U = nullptr);
static Optional<uint8_t> DWARFFormValue::getFixedByteSize(dwarf::Form Form, uint16_t Version, uint8_t AddrSize, bool Dwarf32);
This patch changes DWARFFormValue::skipValue() to rely on the output of DWARFFormValue::getFixedByteSize(...) instead of duplicating the code in each function. This will reduce the number of changes we need to make to DWARF to fewer places in DWARFFormValue when we add support for new form.
This patch also starts to support DWARF64 so that we can get correct byte sizes for forms that vary according the DWARF 32/64.
To reduce the code duplication a new FormSizeHelper pure virtual class was created that can be created as a FormSizeHelperDWARFUnit when you have a DWARFUnit, or FormSizeHelperManual where you manually specify the DWARF version, address byte size and DWARF32/DWARF64. There is now a single implementation of a function that gets the fixed byte size (instead of two where one took a DWARFUnit and one took the DWARF version, address byte size and DWARFFormat enum) and one function to skip the form values.
https://reviews.llvm.org/D26526
llvm-svn: 286597
This patch corresponds to review:
https://reviews.llvm.org/D26307
Adds all the intrinsics used for various conversion builtins that will
be added to altivec.h. These are type conversions between various types of
vectors.
llvm-svn: 286596
This adds support for the compare logical and trap (memory)
instructions that were added as part of the miscellaneous
instruction extensions feature with zEC12.
llvm-svn: 286587
This adds support for the LZRF/LZRG/LLZRGF instructions that were
added on z13, and uses them for code generation were appropriate.
SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF
over RISBG where both would be possible.
llvm-svn: 286586
This adds support for the 31-to-64-bit zero extension instructions
LLGT and LLGTR and uses them for code generation where appropriate.
Since this operation can also be performed via RISBG, we have to
update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT
over RISBG in case both are possible. The patch includes some
simplification to the tryRISBGZero code; this is not intended
to cause any (further) functional change in codegen.
llvm-svn: 286585
Summary:
Split ReaderWriter.h which contains the APIs into both the BitReader and
BitWriter libraries into BitcodeReader.h and BitcodeWriter.h.
This is to address Chandler's concern about sharing the same API header
between multiple libraries (BitReader and BitWriter). That concern is
why we create a single bitcode library in our downstream build of clang,
which led to r286297 being reverted as it added a dependency that
created a cycle only when there is a single bitcode library (not two as
in upstream).
Reviewers: mehdi_amini
Subscribers: dlj, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D26502
llvm-svn: 286566
This is forcing to use Error::success(), which is in a wide majority
of cases a lot more readable.
Differential Revision: https://reviews.llvm.org/D26481
llvm-svn: 286561
This is a replacement to binutils' string tool. It prints strings found in a
binary (object file, executable, or archive library). It is rather bare and
not functionally equivalent, however, it lays the groundwork necessary for the
strings tool, enabling iterative development of features to reach feature
parity.
llvm-svn: 286556
addSchedBarrierDeps() is supposed to add use operands to the ExitSU
node. The current implementation adds uses for calls/barrier instruction
and the MBB live-outs in all other cases. The use
operands of conditional jump instructions were missed.
Also added code to macrofusion to set the latencies between nodes to
zero to avoid problems with the fusing nodes lingering around in the
pending list now.
Differential Revision: https://reviews.llvm.org/D25140
llvm-svn: 286544
When a function is inlined, each instance is optimized in their own
inlining context. This can produce different remarks all pointing to
the same source line.
This adds a new column on the source view to display the inlining
context.
llvm-svn: 286537
There is no need to track dependencies for constant physregs, as they
don't change their value no matter in what order you read/write to them.
Differential Revision: https://reviews.llvm.org/D26221
llvm-svn: 286526
The NamedRegionTimer initializer without a group name puts the Timer
into the "Misc" group and is (nearly) unused. Remove it.
The only user of this constructor appears to be the HexagonGenInsert pass,
which creates a counter without group to count the complete execution
time of that pass, however since every pass gets a counter by the
PassManager anyway this should be unnecessary. Also removed the
pointless TimerGroup there.
Differential Revision: https://reviews.llvm.org/D25582
llvm-svn: 286524
The generic infrastructure to compute the Newton series for reciprocal and
reciprocal square root was conceived to allow a target to compute the series
itself. However, the original code did not properly consider this condition
if returned by a target. This patch addresses the issues to allow a target
to compute the series on its own.
Differential revision: https://reviews.llvm.org/D22975
llvm-svn: 286523
If the inrange keyword is present before any index, loading from or
storing to any pointer derived from the getelementptr has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as inrange.
This can be used, e.g. for alias analysis or to split globals at element
boundaries where beneficial.
As previously proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html
Differential Revision: https://reviews.llvm.org/D22793
llvm-svn: 286514
When copying to/from a constant register interferences can be ignored.
Also update the documentation for isConstantPhysReg() to make it more
obvious that this transformation is valid.
Differential Revision: https://reviews.llvm.org/D26106
llvm-svn: 286503
Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata.
However there is a standard way to convey vendor specific information about how to run an ELF binary, which is called vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html).
This patch lets AMDGPU backend emits runtime metadata as a note element in .note section.
Differential Revision: https://reviews.llvm.org/D25781
llvm-svn: 286502
This makes it possible to indent a binary blob by a certain
number of bytes, and also makes some things more idiomatic.
Finally, it integrates this binary blob formatter into ScopedPrinter
which used to have its own implementation of this algorithm.
Differential Revision: https://reviews.llvm.org/D26477
llvm-svn: 286495
The r283656 did this in the remark arguments. We also need to do this
in the main function attribute as that is written to YAML as well.
llvm-svn: 286482
This restores the part of r286297 that didn't require adding a
dependency from the Analysis to Object library. There are two parts
to the original fix, and this will address the handling for the case
where locals are used in module level asm.
The part that requires functionality in libObject handles local defs
in module level asm, and was reverted because our downstream build
of clang builds lib/Bitcode into a single library, and this new
dependency introduced a cycle there. I am trying to get that fixed
(see D26502), so for now that change isn't being restored
llvm-svn: 286475
Note that the existing metadata checking was re-added by hand because the
script doesn't currently know how to generate checks for lines outside of
functions.
llvm-svn: 286460
We were failing to extract a constant splat shift value if the shifted value was being masked.
The (shl (and (setcc) N01CV) N1CV) -> (and (setcc) N01CV<<N1CV) combine was unnecessarily preventing this.
llvm-svn: 286454
These examples are variations that were inspired from a small subgraph taken
from paper.ll which are interesting as they show certain issues with infinite
loops.
llvm-svn: 286450
The version of this instruction with the .w suffix already correctly accepts
this, but the alias without the .w did not.
Differential Revision: https://reviews.llvm.org/D26499
llvm-svn: 286446
Removing the limitation in visitInsertElementInst() causes several regressions
because we're not prepared to fold sequences of shuffles or inserts and extracts
separated by shuffles. Fixing that appears to be a difficult mission because we
are purposely trying to avoid creating shuffles with arbitrary shuffle masks
because some targets may choke on those.
https://llvm.org/bugs/show_bug.cgi?id=30923
llvm-svn: 286423
Summary: This adds all of the CodeGen tests which currently pass.
Reviewers: arsenm, kparzysz
Subscribers: japaric, wdng
Differential Revision: https://reviews.llvm.org/D26388
llvm-svn: 286418
No testcase included because I can't figure out how to reduce it.
(It's easy to write a testcase where rotation clones an assume,
but that doesn't actually seem to trigger the crash in opt on
its own; maybe an issue with the laziness?)
Differential Revision: https://reviews.llvm.org/D26434
llvm-svn: 286410
Summary:
The change will test the change in r286159.
The idea behind the change: Make the dbg location different between loop header and preheader/exit. Originally, dbg location 21 exists in 3 BBs: preheader, header, critical edge (exit). Update the debug location of inside the loop header from !21 to !22 so that it will reflect the correct location.
Reviewers: probinson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26428
llvm-svn: 286403
../tools/llvm-extract/llvm-extract.cpp: In function ‘int main(int, char**)’:
warning: ISO C++ forbids zero-size array ‘argv’ [-Wpedantic]
GCC reference bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61259
llvm-svn: 286396
Summary:
Unrolled Loop Size calculations moved to a function.
Constant representing number of optimized instructions
when "back edge" becomes "fall through" replaced with
variable.
Some comments added.
Reviewers: mzolotukhin
Differential Revision: http://reviews.llvm.org/D21719
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 286389
Suspected to be the cause of a sanitizer-windows bot failure:
Assertion failed: isImm() && "Wrong MachineOperand accessor", file C:\b\slave\sanitizer-windows\llvm\include\llvm/CodeGen/MachineOperand.h, line 420
llvm-svn: 286385
A relocatable immediate is either an immediate operand or an operand that
can be relocated by the linker to an immediate, such as a regular symbol
in non-PIC code.
Start using relocImm for 32-bit and 64-bit MOV instructions, and for operands
of type "imm32_su". Remove a number of now-redundant patterns.
Differential Revision: https://reviews.llvm.org/D25812
llvm-svn: 286384
For pairs of 32-bit registers: isub_lo, isub_hi.
For pairs of vector registers: vsub_lo, vsub_hi.
Add generic subreg indices: ps_sub_lo, ps_sub_hi, and a function
HexagonRegisterInfo::getHexagonSubRegIndex(RegClass, GenericSubreg)
that returns the appropriate subreg index for RegClass.
llvm-svn: 286377
Summary:
All changes are pretty straight-forward. I chose to use TimePoints with
second precision, as that is all that seems to be required here.
Reviewers: friss, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25908
llvm-svn: 286358
The name/comment of the third argument to the ScheduleDAGMI constructor
is RemoveKillFlags and not IsPostRA. Only the comments are changed.
Review: A Trick
llvm-svn: 286350
Scalar Evolution asserts when not all the operands of an Add Recurrence
Expression are loop invariants. Loop Strength Reduction should only
create affine Add Recurrences, so that both the start and the step of
the expression are loop invariants.
Differential Revision: https://reviews.llvm.org/D26185
llvm-svn: 286347
This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions.
If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result.
It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result.
Differential Revision: https://reviews.llvm.org/D26331
llvm-svn: 286345
Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves.
Reviewers: delena, zvi, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26330
llvm-svn: 286344
Summary:
This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912.
Reviewers: delena, zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26322
llvm-svn: 286339
The BitcodeReader no longer produces BitcodeDiagnosticInfo diagnostics.
The only remaining reference was in the gold plugin; the code there has been
dead since we stopped producing InvalidBitcodeSignature error codes in r225562.
While at it remove the InvalidBitcodeSignature error code.
llvm-svn: 286326
Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code.
Reviewers: davidxl, hfinkel
Subscribers: mehdi_amini, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D26155
llvm-svn: 286325
Summary:
This is the initial version of the documentation for how to use XRay as
it stands in LLVM, Clang, and compiler-rt. We leave some room for later
expansion mentioining what is work in progress and what could be
expected moving forward.
We also give a high level overview of future work that's both ongoing
and planned.
Reviewers: echristo, dblaikie, chandlerc
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D26386
llvm-svn: 286319
The smallest tests that expose this are codegen tests (because SelectionDAGBuilder::visitSelect() uses matchSelectPattern
to create UMAX/UMIN nodes), but it's also possible to see the effects in IR alone with folds of min/max pairs.
If these were written as unsigned compares in IR, InstCombine canonicalizes the unsigned compares to signed compares.
Ie, running the optimizer pessimizes the codegen for this case without this patch:
define <4 x i32> @umax_vec(<4 x i32> %x) {
%cmp = icmp ugt <4 x i32> %x, <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647>
%sel = select <4 x i1> %cmp, <4 x i32> %x, <4 x i32> <i32 2147483647, i32 2147483647, i32 2147483647, i32 2147483647>
ret <4 x i32> %sel
}
$ ./opt umax.ll -S | ./llc -o - -mattr=avx
vpmaxud LCPI0_0(%rip), %xmm0, %xmm0
$ ./opt -instcombine umax.ll -S | ./llc -o - -mattr=avx
vpxor %xmm1, %xmm1, %xmm1
vpcmpgtd %xmm0, %xmm1, %xmm1
vmovaps LCPI0_0(%rip), %xmm2 ## xmm2 = [2147483647,2147483647,2147483647,2147483647]
vblendvps %xmm1, %xmm0, %xmm2, %xmm0
Differential Revision: https://reviews.llvm.org/D26096
llvm-svn: 286318