On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0)
Fix for PR24366
Differential Revision: http://reviews.llvm.org/D14909
llvm-svn: 254495
I checked and updated the cost of AVX-512 conversion operations. Added cost of conversion operations in DQ mode.
Conversion of illegal types that requires vector split is not calculated right now (like for other X86 targets).
Differential Revision: http://reviews.llvm.org/D15074
llvm-svn: 254494
time.
The new overloaded function is used when an attribute is added to a
large number of slots of an AttributeSet (for example, to function
parameters). This is much faster than calling AttributeSet::addAttribute
once per slot, because AttributeSet::getImpl (which calls
FoldingSet::FIndNodeOrInsertPos) is called only once per function
instead of once per slot.
With this commit, clang compiles a file which used to take over 22
minutes in just 13 seconds.
rdar://problem/23581000
Differential Revision: http://reviews.llvm.org/D15085
llvm-svn: 254491
This is very rudimentary support for debug_cu_index, but it is enough to
allow llvm-dwarfdump to find the offsets for contributions and
correctly dump debug_info.
It will need to actually find the real signature of the unit and build
the real hash table with the right number of buckets, as per the DWP
specification.
It will also need to be expanded to cover the tu_index as well.
llvm-svn: 254489
For efficiency reason, when importing multiple functions for the same Module,
we can avoid reparsing it every time.
Differential Revision: http://reviews.llvm.org/D15102
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254486
When linking static archive, there is no individual module files to
load. Instead they can be mmap'ed and could be initialized from a
buffer directly. The callback provide flexibility to override the
scheme for loading module from the summary.
Differential Revision: http://reviews.llvm.org/D15101
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 254479
This is done by finding the types that are forward declarations that come from a module, and loading that module's debug info in a separate lldb_private::Module, and copying the type over into the current module using a ClangASTImporter object. ClangASTImporter objects are already used to copy types from on clang::ASTContext to another for expressions so the type copying code has been around for a while.
A new FindTypes variant was added to SymbolVendor and SymbolFile:
size_t
SymbolVendor::FindTypes (const std::vector<CompilerContext> &context, bool append, TypeMap& types);
size_t
SymbolVendor::FindTypes (const std::vector<CompilerContext> &context, bool append, TypeMap& types);
The CompilerContext is a way to represent the exact context of a type and pass it through an agnostic API boundary so that we can find that exact context elsewhere in another file. This was required here because we can have a module that has submodules, both of which have a "foo" type.
I am not able to add tests for this yet as we currently don't build our C/C++/ObjC binaries with the clang binary that we build. There are some driver issues where it can't find the header files for the C and C++ standard library which makes compiling these tests hard. We can't also guarantee that if we are building with clang that it supporst the exact format of -gmodule debugging that we are trying to test. We have had other versions of clang that had a different implementation of -gmodule debugging that we are no longer supporting, so we can't enable tests if we are building with clang without compiling something and looking at the structure of the DWARF that was generated to ensure that it is the format we can actually use.
llvm-svn: 254476
We mustn't introduce a shift of exactly 64-bits for any inputs, since that's an
UNDEF value (and worse, it's not what you want with the natural Arch64
implementation).
The generated code is pretty horrific, but I couldn't come up with an obviously
better alternative (if the amount is constant EXTR could help). Turns out
128-bit shifts are just nasty.
rdar://22491037
llvm-svn: 254475
si_int is already defined in sysroot's siginfo.h
emutls.c includes pthread.h which includes asm/siginfo.h which
in turn includes asm-generic/siginfo.h and that defines si_int.
si_int is also defined in builtin's int_types.h and this leads to
errors. This patch fixes the issue by undefining the si_int in int_types.h
Differential Revision: http://reviews.llvm.org/D15086
llvm-svn: 254472
For the struct with trailing objects, define
a member operator delete. Without this, the program
will fail when -fsized-deallocation option is used
where the wrong size will be passed to the global
delete operator.
llvm-svn: 254471
For the build set up which runs the unit tests using an emulator like QEMU,
the unit tests must be run using %run.
Differential Revision: http://reviews.llvm.org/D15081
llvm-svn: 254467
The bug is introduced in r254377 which failed some tests on ARM, where a new
probability is assigned to a successor but the provided BB may not be a
successor.
llvm-svn: 254463
The values in this field are compared against getAvailableFeatures()
which returns an uint64_t. This was causing problems in an internal
branch.
llvm-svn: 254462
Some MIPS relocations including `R_MIPS_HI16/R_MIPS_LO16` use combined
addends. Such addend is calculated using addends of both paired relocations.
Each `R_MIPS_HI16` relocation is paired with the next `R_MIPS_LO16`
relocation. ABI requires to compute such combined addend in case of REL
relocation record format only.
For details see p. 4-17 at
ftp://www.linux-mips.org/pub/linux/mips/doc/ABI/mipsabi.pdf
This patch implements lookup of the next paired relocation suing new
`InputSectionBase::findPairedRelocLocation` method. The primary
disadvantage of this approach is that we put MIPS specific logic into
the common code. The next disadvantage is that we lookup `R_MIPS_LO16`
for each `R_MIPS_HI16` relocation, while in fact multiple `R_MIPS_HI16`
might be paired with the single `R_MIPS_LO16`. From the other side
this way allows us to keep `MipsTargetInfo` class stateless and implement
later relocation handling in parallel.
This patch does not support `R_MIPS_HI16/R_MIPS_LO16` relocations against
`_gp_disp` symbol. In that case the relocations use a special formula for
the calculation. That will be implemented later.
Differential Revision: http://reviews.llvm.org/D15112
llvm-svn: 254461
Profile readers using incompatible on-disk hash table format can now share the same
implementation and interfaces.
Differential Revision: http://reviews.llvm.org/D15100
llvm-svn: 254458
ConstantDataArray::getImpl and ConstantDataVector::getImpl had a lot
of copy pasta in how they handled sequences of constants. Break that
out into a couple of simple functions.
llvm-svn: 254456
Don't use commuteInstruction, and don't commute if
doing so will not improve legality. Skip the more
complex checks for literal operands and constant bus restrictions,
which are not a concern for VOP2 instructions because src1
does not accept SGPRs or constants and few implicitly
read vcc.
This gets called quite a few times and the
attempts at commuting are a significant fraction
of the time spent in SIFixSGPRCopies, so it's
somewhat worthwhile to optimize. With this patch and others
leading up to it, this reduces the compile time of SIFixSGPRCopies
on some of the LuxMark 2 kernels from ~8ms to ~5ms on my system.
llvm-svn: 254452
The host libstdc++ may be horribly broken and we want the fake one to be
picked up. This workaround is lame but I don't see a better way.
llvm-svn: 254446