We've already got versions without the barriers, so this just adds IR-level
support for generating the new v8 ones.
rdar://problem/16227836
llvm-svn: 204813
Given that we support multiple directives that enable a particular feature
(e.g. '.set mips16'), it's best to hoist that code into a new function
so that we don't repeat the same pattern w.r.t parsing and handling error cases.
No functional changes.
llvm-svn: 204811
After some discussion on IRC, emitting a call to the library function seems
like a better default, since it will move from a compiler internal error to
a linker error, that the user can work around until LLVM is fixed.
I'm also adding a note on the responsibility of the user to confirm that
the cache was cleared on platforms where nothing is done.
llvm-svn: 204806
The directive '.option pic2' enables PIC from assembly source.
At the moment none of the macros/directives check the PIC bit
but that's going to be fixed relatively soon. For example, the
expansion of macros like 'la' depend on the relocation model.
llvm-svn: 204803
Implementing the LLVM part of the call to __builtin___clear_cache
which translates into an intrinsic @llvm.clear_cache and is lowered
by each target, either to a call to __clear_cache or nothing at all
incase the caches are unified.
Updating LangRef and adding some tests for the implemented architectures.
Other archs will have to implement the method in case this builtin
has to be compiled for it, since the default behaviour is to bail
unimplemented.
A Clang patch is required for the builtin to be lowered into the
llvm intrinsic. This will be done next.
llvm-svn: 204802
With VSX there is a real vector select instruction, and so we should use it.
Note that VSELECT will still scalarize for v2f64 because the corresponding
SetCC result type (v2i64) is not currently a legal type.
llvm-svn: 204801
These are aliases of t4-t7 and are provided for compatibility with both the
original ABI documentation (using t4-t7) and GNU As (using t0-t3)
llvm-svn: 204797
Summary: Added test cases for O32 and N32 on MIPS64.
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3175
llvm-svn: 204796
This reveals a small mistake in mips-register-names.s ($sp is tested twice and
$s8 is not tested) which will be fixed in a follow-up commit.
llvm-svn: 204792
This reverts commit r204781.
I will follow up to with msan folks to see what is what they
were trying to do with aliases to weak aliases.
llvm-svn: 204784
Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
llvm-svn: 204781
Summary:
This allows them to be used without -cc1 the same way as -I and -isystem.
Renamed the options to --system-header-prefix=/--no-system-header-prefix to avoid interference with -isystem and make the intent of the option cleaner.
Reviewers: rsmith
Reviewed By: rsmith
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D3185
llvm-svn: 204775
RoundTripYAML pass is removed from the regular execution pass in r204296,
so the safeguard to protect it from OOM error is no longer needed, because
we are sure that the pass is only used for tests, and test files are all
small.
We also want to see RoundTripYAML pass to fail in tests if it fails,
rather than silently skipping failing tests.
llvm-svn: 204772
Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table.
That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are
across domain (int <-> float).
In particular, prior to this patch we were generating:
vpbroadcastd LCPI1_0(%rip), %ymm2
vpand %ymm2, %ymm0, %ymm0
vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty
Now, we generate the following nice sequence where everything is in the float
domain:
vbroadcastss LCPI1_0(%rip), %ymm2
vandps %ymm2, %ymm0, %ymm0
vmaxps %ymm1, %ymm0, %ymm0
<rdar://problem/16354675>
llvm-svn: 204770
We need .symtab_shndxr if and only if a symbol references a section with an
index >= 0xff00.
The old code was trying to figure out if the section was needed ahead of time,
making it a fairly dependent on the code actually writing the table. It was
also somewhat conservative and would create the section in cases where it was
not needed.
If I remember correctly, the old structure was there so that the sections were
created in the same order gas creates them. That was valuable when MC's support
for ELF was new and we tested with elf-dump.py.
This patch refactors the symbol table creation to another class and makes it
obvious that .symtab_shndxr is really only created when we are about to output
a reference to a section index >= 0xff00.
While here, also improve the tests to use macros. One file is one section
short of needing .symtab_shndxr, the second one has just the right number.
llvm-svn: 204769
The VSX instruction set has two types of FMA instructions: A-type (where the
addend is taken from the output register) and M-type (where one of the product
operands is taken from the output register). This adds a small pass that runs
just after MI scheduling (and, thus, just before register allocation) that
mutates A-type instructions (that are created during isel) into M-type
instructions when:
1. This will eliminate an otherwise-necessary copy of the addend
2. One of the product operands is killed by the instruction
The "right" moment to make this decision is in between scheduling and register
allocation, because only there do we know whether or not one of the product
operands is killed by any particular instruction. Unfortunately, this also
makes the implementation somewhat complicated, because the MIs are not in SSA
form and we need to preserve the LiveIntervals analysis.
As a simple example, if we have:
%vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
%vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
%RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
...
%vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19,
%RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19
...
We can eliminate the copy by changing from the A-type to the
M-type instruction. This means:
%vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
%RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
is replaced by:
%vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9,
%RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9
and we remove: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
llvm-svn: 204768
Conceptually one of these loops is just a while-loop, but the actual code-gen
is more complicated. We don't instrument all the different control flow edges
to get accurate counts for each conditional branch, nor do I think it makes
sense to do so. Instead, make the simplifying assumption that the loop behaves
like a while-loop. Use the same branch weights for the first check for an
empty collection as would be used for the back-edge of a while loop, and use
that same weighting for the innermost loop, ignoring the possibility that there
may be some extra code to go fetch more elements.
llvm-svn: 204767
if they didn't change, just like it does for
registers. This makes life easier for kernel
debugging and any other situation where values
are read-only.
<rdar://problem/16367795>
llvm-svn: 204764