Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given
define void @my_func() {
ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias
We produce without this patch:
.weak my_alias
my_alias = my_func
.globl my_alias2
my_alias2 = my_alias
That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a
@my_alias = alias void ()* @other_func
would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.
There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.
llvm-svn: 204781
Summary:
This allows them to be used without -cc1 the same way as -I and -isystem.
Renamed the options to --system-header-prefix=/--no-system-header-prefix to avoid interference with -isystem and make the intent of the option cleaner.
Reviewers: rsmith
Reviewed By: rsmith
CC: cfe-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D3185
llvm-svn: 204775
RoundTripYAML pass is removed from the regular execution pass in r204296,
so the safeguard to protect it from OOM error is no longer needed, because
we are sure that the pass is only used for tests, and test files are all
small.
We also want to see RoundTripYAML pass to fail in tests if it fails,
rather than silently skipping failing tests.
llvm-svn: 204772
Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table.
That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are
across domain (int <-> float).
In particular, prior to this patch we were generating:
vpbroadcastd LCPI1_0(%rip), %ymm2
vpand %ymm2, %ymm0, %ymm0
vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty
Now, we generate the following nice sequence where everything is in the float
domain:
vbroadcastss LCPI1_0(%rip), %ymm2
vandps %ymm2, %ymm0, %ymm0
vmaxps %ymm1, %ymm0, %ymm0
<rdar://problem/16354675>
llvm-svn: 204770
We need .symtab_shndxr if and only if a symbol references a section with an
index >= 0xff00.
The old code was trying to figure out if the section was needed ahead of time,
making it a fairly dependent on the code actually writing the table. It was
also somewhat conservative and would create the section in cases where it was
not needed.
If I remember correctly, the old structure was there so that the sections were
created in the same order gas creates them. That was valuable when MC's support
for ELF was new and we tested with elf-dump.py.
This patch refactors the symbol table creation to another class and makes it
obvious that .symtab_shndxr is really only created when we are about to output
a reference to a section index >= 0xff00.
While here, also improve the tests to use macros. One file is one section
short of needing .symtab_shndxr, the second one has just the right number.
llvm-svn: 204769
The VSX instruction set has two types of FMA instructions: A-type (where the
addend is taken from the output register) and M-type (where one of the product
operands is taken from the output register). This adds a small pass that runs
just after MI scheduling (and, thus, just before register allocation) that
mutates A-type instructions (that are created during isel) into M-type
instructions when:
1. This will eliminate an otherwise-necessary copy of the addend
2. One of the product operands is killed by the instruction
The "right" moment to make this decision is in between scheduling and register
allocation, because only there do we know whether or not one of the product
operands is killed by any particular instruction. Unfortunately, this also
makes the implementation somewhat complicated, because the MIs are not in SSA
form and we need to preserve the LiveIntervals analysis.
As a simple example, if we have:
%vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
%vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
%RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
...
%vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19,
%RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19
...
We can eliminate the copy by changing from the A-type to the
M-type instruction. This means:
%vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
%RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
is replaced by:
%vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9,
%RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9
and we remove: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
llvm-svn: 204768
Conceptually one of these loops is just a while-loop, but the actual code-gen
is more complicated. We don't instrument all the different control flow edges
to get accurate counts for each conditional branch, nor do I think it makes
sense to do so. Instead, make the simplifying assumption that the loop behaves
like a while-loop. Use the same branch weights for the first check for an
empty collection as would be used for the back-edge of a while loop, and use
that same weighting for the innermost loop, ignoring the possibility that there
may be some extra code to go fetch more elements.
llvm-svn: 204767
if they didn't change, just like it does for
registers. This makes life easier for kernel
debugging and any other situation where values
are read-only.
<rdar://problem/16367795>
llvm-svn: 204764
For small structs, the frame format now prints them as one-liners
This follows the same definition that frame variable does for deciding what a "small struct" is, and as such should be fairly consistent with the variable display in general
llvm-svn: 204762
When cross-compiling LLVM itself the configure/make scripts get confused when
creating the needed build host tools. For example, building and configuring
like:
CC_FOR_BUILD='i686-pc-linux-gnu-gcc' CXX_FOR_BUILD='i686-pc-linux-gnu-g++'
CXX='i686-mingw32-g++' CC='i686-mingw32-gcc' LD='i686-mingw32-ld' /scratch
/meadori/llvm-trunk/src/trunk/configure --host=i686-mingw32
CC_FOR_BUILD='i686-pc-linux-gnu-gcc' CXX_FOR_BUILD='i686-pc-linux-gnu-g++'
CXX='i686-mingw32-g++' CC='i686-mingw32-gcc' LD='i686-mingw32-ld' make
causes the following build break:
checking whether the C compiler works... configure: error: cannot run C
compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details.
The 'config.log' shows that i686-mingw32-gcc is being used to create
executables for the build host.
This patch fixes the problem by propogating the names of the build host
tools via BUILD_* when configuring/making BuildTools.
Original patch by Ekaterina Sanina.
llvm-svn: 204760
Add a GetFoundationVersion() to AppleObjCRuntime
This API is used to return and cache the major version of Foundation.framework, which is potentially a useful piece of data to key off of to enable or disable certain ObjC related behaviors (especially in data formatters)
llvm-svn: 204756
If a response file is given via command line, the final command line
arguments will not appear in the log because the actual arguments are
in the given file.
This patch is to show the final command line if /verbose is specified
to help users.
llvm-svn: 204754
This change makes significant improvements in the performance of
calculating a UUID within ObjectFileELF, and handles both running
processes and core files correctly. This does lazy evaluation of
UUID generation and caches the result when calculated.
Change by Piotr Rak.
llvm-svn: 204749
Although the first two operands are the ones that can be swapped, the tied
input operand is listed before them, so we need to adjust for that.
I have a test case for this, but it goes along with an upcoming commit (so it
will come soon).
llvm-svn: 204748
Also added 'import sys' on some tests that are using non-standard
unittest2.skipUnless blocks with code that is intended to do things
that we have more specializes @* attributes for. These skip
conditions were failing to execute due to missing import, causing
darwin-only tests to run on Linux regardless. Will file a bug for
that separately.
llvm-svn: 204747
TableGen will create a lookup table for the A-type FMA instructions providing
their corresponding M-form opcodes. This will be used by upcoming commits.
llvm-svn: 204746
When there was no process, the expression options were set to not ignore breakpoints. This causes debug info to be generated and causes errors when evaluating simple expressions.
llvm-svn: 204745
Remove handling of select_cc, since it makes no sense to be there. This
now does nothing, but I'll be adding some handling of other target nodes
soon.
llvm-svn: 204743
In gcc using -Ofast forces linking of crtfastmath.o.
In the current clang crtfastmath.o is only linked when -ffast-math/-funsafe-math-optimizations passed. It can lead to performance issues, when using only -Ofast without explicit -ffast-math (I faced with it).
My patch fixes inconsistency with gcc behaviour and also introduces few tests on it.
Patch by Zinovy Nis!
Differential Revision: http://llvm-reviews.chandlerc.com/D3114
llvm-svn: 204742
Implement Pass::releaseMemory() in BlockFrequencyInfo and
MachineBlockFrequencyInfo. Just delete the private implementation when
not in use. Switch to a std::unique_ptr to make the logic more clear.
<rdar://problem/14292693>
llvm-svn: 204741
If getElementPtr uses a constant as base pointer, then make the constant opaque.
This prevents constant folding it with the offset. The offset can usually be
encoded in the load/store instruction itself and the base address doesn't have
to be rematerialized several times.
llvm-svn: 204739
The cost for the first four stackmap operands was always TCC_Free.
This is only true for the first two operands. All other operands
are TCC_Free if they are within 64bit.
llvm-svn: 204738
Usually opaque constants shouldn't be folded, unless they are simple unary
operations that don't create new constants. Although this shouldn't drop the
opaque constant flag. This commit fixes this.
Related to <rdar://problem/14774662>
llvm-svn: 204737
This used to resort to splitting the 256-bit operation into two 128-bit
shuffles and then recombining the results.
Fixes <rdar://problem/16167303>
llvm-svn: 204735
I found three implementations of this. This splits it out into a new function
and uses it from the three places.
My plan is to add a fourth use when lowering a vector_shuffle:v16i16.
Compared the assembly output of test/CodeGen/X86 before and after.
The only change is due to how the first PSHUFB was generated in
LowerVECTOR_SHUFFLEv8i16. If the shuffle mask specified undef (i.e. -1), the
old implementation would write -1 * 2 and -1 * 2 + 1 (254 and 255) in the
control mask. Now we write 0x80. These are of course interchangeable since
bit 7 decides if a constant zero is written in the result byte. The other
instances of this code use 0x80 consistently.
Related to <rdar://problem/16167303>
llvm-svn: 204734