The main intention of the test was to check that
we do not crash, but as an additional check
it previously ran readobj on the input object
instead of the output.
llvm-svn: 295161
If the target of an R_MIPS_GOT16 relocation is a local symbol, its addend
holds the high 16 bits of the complete addend. To calculate the final value,
the addend of this relocation is read, shifted to the left, and combined
with the addend of the paired R_MIPS_LO16 relocation. To save the updated
addend when the linker produces relocatable output, we need to store the
high 16 bits of the addend's value. This differs from writing the relocation
result, where the linker saves a 16-bit GOT index as-is.
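A minimal sketch of the addend arithmetic described above, with
hypothetical helper names (not lld's actual code):

  #include <cstdint>

  // Combine the high half from R_MIPS_GOT16 with the sign-extended low
  // half from the paired R_MIPS_LO16 (AHL = (AHI << 16) + (short)ALO).
  int64_t combineAddends(uint16_t ahi, uint16_t alo) {
    return (int64_t(int16_t(ahi)) << 16) + int16_t(alo);
  }

  // For relocatable (-r) output, store back only the high 16 bits of the
  // updated addend; +0x8000 compensates for the low half's sign extension.
  uint16_t highHalf(int64_t a) {
    return uint16_t((uint64_t(a) + 0x8000) >> 16);
  }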
llvm-svn: 295159
Added new ThreadSanitizer annotations to remove false positives with OpenMP reductions.
Removed unused annotations from the TSan annotations header file.
Patch by Simone Atzeni!
Differential Revision: https://reviews.llvm.org/D29202
llvm-svn: 295158
Summary:
We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs.
As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast.
I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that, this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512 bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable.
This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect, but it was enough to fix a few test failures that were otherwise being caused.
Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases where we don't remove a useless insert into element 1 before broadcasting element 0.
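A hedged sketch of the canonicalization step against LLVM's SelectionDAG
API (assumes DAG, DL, VT, and Src in scope; not the actual patch):

  // If the broadcast source is wider than 128 bits, take its low 128-bit
  // subvector first so VBROADCAST always sees a 128-bit input.
  if (Src.getValueSizeInBits() > 128) {
    MVT EltVT = Src.getSimpleValueType().getVectorElementType();
    MVT SubVT = MVT::getVectorVT(EltVT, 128 / EltVT.getSizeInBits());
    Src = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, SubVT, Src,
                      DAG.getIntPtrConstant(0, DL));
  }
  SDValue Bcst = DAG.getNode(X86ISD::VBROADCAST, DL, VT, Src);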
Reviewers: delena, RKSimon, zvi
Reviewed By: zvi
Subscribers: igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D28747
llvm-svn: 295155
libunwind depends on C++ library headers. When building libunwind
as part of LLVM and libc++ is available, use its headers.
Differential Revision: https://reviews.llvm.org/D29800
llvm-svn: 295153
Summary:
The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in an extract and creates a properly aligned start index for the extract.
This range finding is unnecessary; we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each index, we can't use an extract.
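A self-contained sketch of the single-pass check (hypothetical helper,
not the actual code):

  #include <optional>
  #include <vector>

  // Returns the aligned start index if the mask describes a contiguous,
  // aligned extract from one input, or std::nullopt otherwise.
  // e.g. {4,5,6,7} -> 4; {5,6,7,8} -> nullopt (misaligned start).
  std::optional<int> findExtractStart(const std::vector<int> &Mask) {
    const int W = static_cast<int>(Mask.size());
    std::optional<int> Start;
    for (int i = 0; i != W; ++i) {
      if (Mask[i] < 0)
        continue;               // undef lane: any source element is fine
      int S = Mask[i] - i;      // start index implied by this lane
      if (S < 0 || S % W != 0)  // must be aligned to the extract width
        return std::nullopt;
      if (Start && *Start != S) // all lanes must agree on one start
        return std::nullopt;
      Start = S;
    }
    return Start;
  }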
Reviewers: zvi, RKSimon
Reviewed By: zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29926
llvm-svn: 295152
handler args.
The specialization just inherits from the std::decay'd response handler type.
This allows member functions (via MemberFunctionWrapper) to be used as async
handlers.
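As a rough illustration of the pattern (a hedged sketch, not the actual
Orc RPC code; MemberFunctionWrapper here is just a stand-in tag type):

  #include <type_traits>

  template <typename HandlerT> struct ResponseHandlerImpl {
    // primary implementation for plain handler types...
  };

  // Stand-in for the wrapper produced for member functions.
  template <typename T> struct MemberFunctionWrapper {};

  // The specialization just inherits from the std::decay'd handler type.
  template <typename T>
  struct ResponseHandlerImpl<MemberFunctionWrapper<T>>
      : ResponseHandlerImpl<std::decay_t<T>> {};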
llvm-svn: 295151
r274291 made changes to prefer calling a move constructor over calling a
copy constructor when returning from a function. This caused programs to
crash when a __block variable on the heap was moved out of and used later.
This commit fixes the bug by disallowing implicit moves out of __block
variables.
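A hypothetical reproducer sketch, assuming clang with -fblocks and the
blocks runtime (not taken from the original report):

  #include <Block.h>
  #include <cstdio>
  #include <string>

  typedef void (^Callback)(void);
  Callback stored;

  std::string f() {
    __block std::string s = "payload";
    // Block_copy moves the __block variable to the heap; the copied block
    // keeps referring to that heap copy.
    stored = Block_copy(^{ std::printf("%s\n", s.c_str()); });
    // Before this fix, the return implicitly moved out of 's', leaving
    // the heap copy in a moved-from state when 'stored' runs later.
    return s;
  }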
rdar://problem/28181080
Differential Revision: https://reviews.llvm.org/D29908
llvm-svn: 295150
The comment was somewhat misleading in that it implied that passes were not
responsible for adding new assumptions to the assumption cache. This new
wording now explicitly mentions that they are required to do so.
Differential Revision: https://reviews.llvm.org/D29977
llvm-svn: 295148
The idea is that the apply* functions will also be called when importing
devirt optimizations.
Differential Revision: https://reviews.llvm.org/D29745
llvm-svn: 295144
This is still not sufficient for lld to handle its own output when an
fde points to a discarded section. I am investigating whether it is better
to change the -r output or to make lld able to read the current version.
llvm-svn: 295141
freescale triple.
On multiarch systems, this previously caused us to stat every file in
/usr/lib/<triple> (typically several thousand files). This change halves
the runtime of a clang invocation on an empty file on my system.
llvm-svn: 295140
This patch corrects the maximum number of workgroups per CU if we have big
workgroups (more than 128). This calculation contributes to the
occupancy calculation with respect to LDS size.
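A worked sketch of the relationship, using illustrative GCN-style
constants and hypothetical names (not values or code from the patch):

  #include <algorithm>

  // Bigger workgroups span more waves, so fewer workgroups fit on a CU,
  // and each resident workgroup gets a larger share of the CU's LDS.
  // e.g. 256 work items -> 4 waves -> min(16, 40/4) = 10 workgroups/CU.
  unsigned ldsBudgetPerWG(unsigned workGroupSize, unsigned ldsPerCU) {
    const unsigned WaveSize = 64, MaxWavesPerCU = 40, MaxWGsPerCU = 16;
    unsigned wavesPerWG = (workGroupSize + WaveSize - 1) / WaveSize;
    unsigned wgsPerCU =
        std::max(1u, std::min(MaxWGsPerCU, MaxWavesPerCU / wavesPerWG));
    return ldsPerCU / wgsPerCU;
  }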
Differential Revision: https://reviews.llvm.org/D29974
llvm-svn: 295134
This is a really horrible case. If a .eh_frame section has a relocation
that points to a discarded section, it is not clear what the correct
thing to do is.
It looks like ld.bfd discards the entire .eh_frame content, while gold
discards the second relocation, leaving one frame with an fde that
refers to a bogus location. What lld does here is similar to gold's
behavior.
llvm-svn: 295133
This reverts commit r295102.
In the link of seabios the assumption seems to be that the section has
an actual address, so this is not sufficient. Changing the assembly
code to add an "a" (SHF_ALLOC) flag seems like the correct thing to do
instead of extending this hack.
Sorry about the noise.
Original message:
Relax the restriction on what relocations can be in a non-alloc section.
The main thing that they can't have is relocations that require the
creation of gots or plt. For now also accept R_PC.
Found while linking seabios.
llvm-svn: 295130
Summary: Previously the cleanups (e.g. dtor calls) were inserted into the
outer scope (e.g. the function body scope) instead of their own scope. After
the fix, the cleanups are inserted right after getting the size value.
This fixes pr30306.
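A hypothetical reproducer sketch (VLAs in C++ are a GNU/Clang extension,
so this assumes e.g. -std=gnu++14; not necessarily the pr30306 test):

  struct S {
    S();
    ~S();
    int size() const;
  };

  void f() {
    // The temporary S() lives in the VLA bound: its destructor should run
    // right after the size value is read, in its own cleanup scope, rather
    // than at the end of f()'s body scope.
    int vla[S().size()];
    (void)vla;
  }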
Reviewers: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24333
llvm-svn: 295123
that has been explicitly specialized!
We assume in various places that we can tell the template specialization kind
of a class type by looking at the declaration produced by TagType::getDecl.
That was previously not quite true: for an explicit specialization, we could
have first seen a template-id denoting the specialization (with a use that does
not trigger an implicit instantiation of the definition) and then seen the
first explicit specialization declaration. TagType::getDecl would previously
return an arbitrary declaration when called on a not-yet-defined class; it
now consistently returns the most recent declaration in that case.
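A minimal example of the sequence described above:

  template <typename T> struct A {};

  A<int> *p;  // names the specialization without requiring a definition

  // The first explicit specialization declaration appears only later;
  // TagType::getDecl() for 'A<int>' must now return this most recent
  // declaration rather than an arbitrary earlier one.
  template <> struct A<int> { void f(); };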
llvm-svn: 295118
Summary:
The YAML output produced by llvm-xray is supposed to be wrapped at the
arbitrary default of 70 columns set by `yaml::Output`. Unfortunately,
the wrapping is rather unpredictable, and can easily go past the set
number of columns, depending on the execution environment.
To make the YAML output environment-independent, disable wrapping
instead.
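The gist of the change, assuming llvm::yaml::Output's WrapColumn
constructor parameter (default 70) and a raw_ostream OS in scope:

  #include "llvm/Support/YAMLTraits.h"

  // Passing 0 as the wrap column turns wrapping off entirely.
  llvm::yaml::Output Out(OS, /*Ctxt=*/nullptr, /*WrapColumn=*/0);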
Reviewers: dberris
Reviewed By: dberris
Subscribers: fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D29962
llvm-svn: 295116
Multiple blocks in the callee can be mapped to a single cloned block
since we prune the callee as we clone it. The existing code
iterates over the value map and clones the block frequency (and
eventually scales the frequencies of the cloned blocks). Value map's
iteration is not deterministic and so the cloned block might get the
frequency of any of the original blocks. The fix is to assign the cloned
block the maximum of the original frequencies: the first block in the
sequence must have this maximum frequency and, in the call context, the
subsequent blocks must have the same frequency.
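A sketch of the order-independent computation against LLVM's API (VMap
and CalleeBFI are assumed surrounding names; a hypothetical sketch, not
the actual patch):

  #include <algorithm>

  llvm::DenseMap<llvm::BasicBlock *, uint64_t> CloneFreq;
  for (auto &KV : VMap) {
    const auto *OrigBB = llvm::dyn_cast<llvm::BasicBlock>(KV.first);
    auto *CloneBB = llvm::dyn_cast_or_null<llvm::BasicBlock>(KV.second);
    if (!OrigBB || !CloneBB)
      continue;
    uint64_t F = CalleeBFI->getBlockFreq(OrigBB).getFrequency();
    auto It = CloneFreq.find(CloneBB);
    // Take the max over all original blocks mapped to this clone, so the
    // result no longer depends on the map's iteration order.
    CloneFreq[CloneBB] = It == CloneFreq.end() ? F : std::max(It->second, F);
  }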
Differential Revision: https://reviews.llvm.org/D29696
llvm-svn: 295115