Before r294774, there was a problem when lowering broadcasts to use
128-bit subvectors.
When we looked through a bitcast to find the broadcast input, we'd keep
using the original type, so you'd end up with things like:
  (v8f32 (broadcast
    (v4f32 (extract_subvector
      (v8i32 V),
      ...))
  ))
r294774 fixed it to always emit subvectors with the scalar type of the
original source.
It also introduced some asserts, to check that we use scalars with
the same size, and vectors with the same number of elements.
The scalar size equality is checked earlier when looking through bitcasts,
and is a useful assert.
However, the number of elements doesn't have to be identical: we're always
going to extract a 128-bit subvector, and we can have different-size
inputs if we looked through a concat_vector to find a 256-bit source.
Relax the overzealous assert.
Replace it with a check of the original source vector being 256 or 512
bits. If it's 128 bits, we can't extract_subvector from it.
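
As a rough illustration of the relaxed check, here is a minimal standalone
sketch. It does not use the LLVM API: VecTy and canExtract128BitSubvector are
hypothetical names, and the scalar-size condition is modelled as a plain
boolean test rather than an assert.

```cpp
// Hypothetical stand-in for a vector type: total width and scalar width, in bits.
struct VecTy {
  unsigned totalBits;
  unsigned scalarBits;
};

// Sketch of the relaxed condition: the scalar sizes must match, but the
// element counts may differ. The source only has to be wide enough (256 or
// 512 bits) for a 128-bit subvector to be extracted from it.
bool canExtract128BitSubvector(const VecTy &source, const VecTy &broadcast) {
  if (source.scalarBits != broadcast.scalarBits)
    return false; // Looking through a bitcast must preserve the scalar size.
  return source.totalBits == 256 || source.totalBits == 512;
}
```

With these assumptions, a 512-bit source feeding a 256-bit broadcast is
accepted, while a 128-bit source is rejected.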
Fixes PR32371.
llvm-svn: 299490
This caused a failure in the test case:
functionalities/breakpoint/objc/TestObjCBreakpoints.py
When parsing names, we put the interesting parts of each name into
various buckets, one of which is the ObjC selector bucket. The new
C++ name parser must be interfering with this process somehow.
<rdar://problem/31439305>
llvm-svn: 299489
Decouple this setting from EnableIRPA.
To support function calls on AMDGPU, it is necessary to
report the global register usage throughout the kernel's
call graph, so callees need to be handled first.
llvm-svn: 299487
stores with some fixes.
Summary:
This enables us to cache the clobbering access for stores, despite the
fact that we can't rewrite the use-def chains themselves.
Early testing shows that, after this change, for larger testcases, it
will be a significant net positive (memory and time) to remove the
walker caching.
Reviewers: george.burgess.iv, davide
Subscribers: Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D31567
llvm-svn: 299486
Set correct default flags and section type based on its name for .text,
.data, .bss, .init_array, .fini_array, .preinit_array, .tdata, and .tbss
and support section name suffixes for .data.*, .rodata.*, .text.*,
.bss.*, .tdata.* and .tbss.* which matches the behavior of GAS.
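
The name-based defaults can be pictured with a small lookup. This is only a
sketch, not the code from this patch: SectionDefaults, matchesName and
defaultsFor are hypothetical names, the constants mirror the standard ELF
SHT_*/SHF_* values, and the flag choices reflect my understanding of the usual
GAS defaults.

```cpp
#include <cstdint>
#include <string>

// Values of the corresponding ELF SHT_* and SHF_* constants.
constexpr std::uint32_t kShtProgbits = 1, kShtNobits = 8, kShtInitArray = 14,
                        kShtFiniArray = 15, kShtPreinitArray = 16;
constexpr std::uint32_t kShfWrite = 0x1, kShfAlloc = 0x2, kShfExecinstr = 0x4,
                        kShfTls = 0x400;

struct SectionDefaults {
  std::uint32_t type;
  std::uint32_t flags;
};

// True for "base" itself and for suffixed names such as "base.foo".
static bool matchesName(const std::string &name, const std::string &base) {
  return name == base || name.compare(0, base.size() + 1, base + ".") == 0;
}

// Hypothetical helper: pick a default type and flags from the section name.
SectionDefaults defaultsFor(const std::string &name) {
  if (matchesName(name, ".text"))
    return {kShtProgbits, kShfAlloc | kShfExecinstr};
  if (matchesName(name, ".data"))
    return {kShtProgbits, kShfAlloc | kShfWrite};
  if (matchesName(name, ".rodata"))
    return {kShtProgbits, kShfAlloc};
  if (matchesName(name, ".bss"))
    return {kShtNobits, kShfAlloc | kShfWrite};
  if (matchesName(name, ".tdata"))
    return {kShtProgbits, kShfAlloc | kShfWrite | kShfTls};
  if (matchesName(name, ".tbss"))
    return {kShtNobits, kShfAlloc | kShfWrite | kShfTls};
  if (name == ".init_array")
    return {kShtInitArray, kShfAlloc | kShfWrite};
  if (name == ".fini_array")
    return {kShtFiniArray, kShfAlloc | kShfWrite};
  if (name == ".preinit_array")
    return {kShtPreinitArray, kShfAlloc | kShfWrite};
  return {kShtProgbits, 0}; // Unknown names get no special treatment here.
}
```

In this sketch a name like .text.hot picks up the same defaults as .text,
whereas .textfoo does not, because only a dot-separated suffix counts.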
Fixes PR31888.
Differential Revision: https://reviews.llvm.org/D30229
llvm-svn: 299484
This improves upon r246462: that prevented FMOVs from being emitted
for the cross-class INSERT_SUBREGs by disabling the formation of
INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting
caused us to introduce partial dependencies on the vector register.
Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that
is folded away by many patterns, including the scalar LDRS that we
want in this case.
Credit goes to Adam for finding the issue!
llvm-svn: 299482
This patch adds the option -print-resource-dir. It simply
prints the resource directory. This information will eventually
be used in compiler-rt to set up COMPILER_RT_LIBRARY_INSTALL_DIR.
Patch by Catherine Moore!
Differential Revision: https://reviews.llvm.org/D31447
llvm-svn: 299473
This works with a regular shell since the kernel can keep track of a
deleted cwd. Since we just keep a path string, the following
subprocess invocations fail.
I think this would also fail on Windows.
llvm-svn: 299471
This adds the new pragma and the first variant, contract(on/off/fast).
The pragma has the same block scope rules as STDC FP_CONTRACT, i.e. it can be
placed at the beginning of a compound statement or at file scope.
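
For illustration, a minimal sketch of the two placements described above. The
function name mulAdd is hypothetical, and compilers that do not know the
pragma are expected to ignore it.

```cpp
// File-scope form: applies to the code that follows in this file unless a
// nested pragma overrides it.
#pragma clang fp contract(off)

double mulAdd(double a, double b, double c) {
  // Block-scope form: placed at the beginning of the compound statement.
  // It allows a * b + c in this block to be contracted into a fused
  // multiply-add.
  #pragma clang fp contract(fast)
  return a * b + c;
}
```

Whether contraction actually happens depends on the target. For inputs such
as 2.0, 3.0 and 1.0 the result is 7.0 either way, since every intermediate
value is exactly representable.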
Similarly to STDC FP_CONTRACT there is no need to use attributes. First an
annotate token is inserted with the parsed details of the pragma. Then the
annotate token is parsed in the proper contexts and the Sema is updated with
the corresponding FPOptions using the shared ActOn function with STDC
FP_CONTRACT.
After this the FPOptions from the Sema is propagated into the AST expression
nodes. There is no change here.
I was going to add a 'default' option besides 'on/off/fast', similar to STDC
FP_CONTRACT, but then decided against it. I think we'd then have to make the
options uppercase to avoid using the keyword 'default'. Also, because of the
scoped activation of the pragma, I am not sure there is really a need for this.
Differential Revision: https://reviews.llvm.org/D31276
llvm-svn: 299470
With this, FMF(contract) becomes an alternative way to express the request to
contract.
These are currently only propagated for FMul, FAdd and FSub. The rest will be
added as more FMFs are hooked up for this.
This is toward fixing PR25721.
Differential Revision: https://reviews.llvm.org/D31168
llvm-svn: 299469
If an instruction has a true dependency, it makes sense to use that
register for any undef read operands in the same instruction (we'll have
to wait for that register to become available anyway). This logic
was already implemented. However, the code would then still try to
revisit that instruction and break the dependency (and always fail,
since by definition a true dependency has to be live before the
instruction). Avoid revisiting such instructions as a performance
optimization. No functional change.
Differential Revision: https://reviews.llvm.org/D30173
llvm-svn: 299467
Currently we only fold with ConstantInt RHS. This generalizes to any Constant RHS.
Differential Revision: https://reviews.llvm.org/D31610
llvm-svn: 299466
Summary:
The new test case was crashing before. Now it passes
as expected.
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D31441
llvm-svn: 299465
The ELF spec says:
all of the non-default visibility attributes, when applied to a symbol
reference, imply that a definition to satisfy that reference must be
provided within the current executable or shared object.
But we were trying to resolve those undef references to shared
symbols. That causes odd results like creating a got entry with
a relocation pointing to 0.
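
The rule quoted from the spec boils down to a single predicate. The following
is a sketch only, not lld's code: Visibility and mayResolveToSharedSymbol are
hypothetical names, and the enumerator values mirror the ELF STV_* constants.

```cpp
// Values of the ELF STV_* symbol visibility constants.
enum class Visibility { Default = 0, Internal = 1, Hidden = 2, Protected = 3 };

// Sketch: an undefined reference may be satisfied by a symbol from a shared
// object only if the reference has default visibility. Any other visibility
// requires a definition within the current executable or shared object.
bool mayResolveToSharedSymbol(Visibility referenceVisibility) {
  return referenceVisibility == Visibility::Default;
}
```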
llvm-svn: 299464
This mode is just like -mcmodel=small except that it moves the
thread pointer from TPIDR_EL0 to TPIDR_EL1.
Patch by Roland McGrath.
Differential Revision: https://reviews.llvm.org/D31624
llvm-svn: 299462
This shares detection logic with ARM(32), since AArch64 capable CPUs may
also run in 32-bit system mode.
We observe weird /proc/cpuinfo output for MSM8992 and MSM8994, where
they report all CPU cores as one single model, depending on which CPU
core the kernel is running on. As a workaround, we hardcode the known
CPU part name for these SoCs.
For big.LITTLE systems, this patch would only return the part name of
the first core (usually the little core). Proper support will be added
in a follow-up change.
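
To make the big.LITTLE limitation concrete, here is a minimal sketch of
"take the part of the first core". It is not the code from this patch:
firstCpuPart is a hypothetical helper, and it assumes the usual ARM Linux
layout in which each core has a line of the form "CPU part : 0x...".

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: return the value of the first "CPU part" line in the
// given /proc/cpuinfo text, or an empty string if there is none. Later cores
// are never inspected, which is why a big.LITTLE system reports only one part.
std::string firstCpuPart(const std::string &cpuinfo) {
  const std::string key = "CPU part";
  std::istringstream in(cpuinfo);
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, key.size(), key) != 0)
      continue; // Not a "CPU part" line.
    std::size_t colon = line.find(':');
    if (colon == std::string::npos)
      continue;
    std::size_t start = line.find_first_not_of(" \t", colon + 1);
    return start == std::string::npos ? std::string() : line.substr(start);
  }
  return std::string();
}
```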
Differential Revision: D31675
llvm-svn: 299458
MS assembly syntax provides the 'EVEN' directive as a synonym for AT&T's '.even'.
This patch includes the (small, simple) changes needed to allow it.
llvm-side:
https://reviews.llvm.org/D27417
Differential Revision: https://reviews.llvm.org/D27418
llvm-svn: 299454
MS assembly syntax provides the 'EVEN' directive as a synonym for AT&T's '.even'.
This patch includes the (small, simple) changes needed to allow it.
Test is provided at the following (clang-side) review:
https://reviews.llvm.org/D27418
Differential Revision: https://reviews.llvm.org/D27417
llvm-svn: 299453
Change the get shared class info function to only
dump its results to the inferior stdout when the
log is verbose. This matches the lldb side of the
same process, which only logs what it found if the
log is on verbose.
llvm-svn: 299451
When the ProcessAllSections flag (introduced in r204398) is set, RuntimeDyld is
supposed to make a call to the client's memory manager for every section in each
object that is loaded. Due to some missing checks, this was not happening in all
cases. This patch adds the missing cases, and fixes the Orc unit test that
verifies correct behavior for ProcessAllSections (the unit test had been
silently bailing out due to an ordering issue: a change in the test order meant
that this unit-test was running before the native target was registered. This
issue has also been fixed in this patch).
This fixes <rdar://problem/22789965>
llvm-svn: 299449
https://reviews.llvm.org/D30537 / https://reviews.llvm.org/rL296977 added these transforms
and other related transforms to the generic DAGCombiner (with a hook that x86 sets to true),
so these patterns should not exist by the time we reach the target-specific combiner hook.
llvm-svn: 299448
Fix the assertion caused by the absence of a source location for
implicitly defined types (using addImplicitTypedef()).
Sema checks expect a source location to be present, so its
absence triggers an assertion.
The change is not specific to OpenCL. But it is particularly
common for OpenCL types to be declared implicitly in Clang
to support the mode without the standard header.
Differential Revision: https://reviews.llvm.org/D31397
llvm-svn: 299447
This patch optimizes two memory intrinsic operations, memset and memcpy, based
on the profiled size of the operation. The high-level transformation is:
  mem_op(..., size)
  ==>
  switch (size) {
    case s1:
      mem_op(..., s1);
      goto merge_bb;
    case s2:
      mem_op(..., s2);
      goto merge_bb;
    ...
    default:
      mem_op(..., size);
      goto merge_bb;
  }
  merge_bb:
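
The same shape can be written by hand at the source level. This is a sketch
for illustration only: copyWithHotSizes is a hypothetical function, and the
sizes 8 and 64 are made-up stand-ins for the values that would come from the
profile.

```cpp
#include <cstddef>
#include <cstring>

// Hand-written analogue of the transformation: calls with a constant size
// give the compiler a chance to expand them inline, while the default case
// keeps the original variable-size call.
void copyWithHotSizes(void *dst, const void *src, std::size_t size) {
  switch (size) {
  case 8: // Hypothetical hot size s1.
    std::memcpy(dst, src, 8);
    break;
  case 64: // Hypothetical hot size s2.
    std::memcpy(dst, src, 64);
    break;
  default: // Any other size falls back to the generic call.
    std::memcpy(dst, src, size);
    break;
  }
}
```

Every path copies exactly size bytes, so the behavior is unchanged and only
the code chosen for the hot sizes differs.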
Differential Revision: http://reviews.llvm.org/D28966
llvm-svn: 299446