All but (v2f64 broadcast f64) are handled with VBROADCAST instructions. The v2f64 version can be handled with VMOVDDUP.
We may want to consider converting to BLENDM instructions in the two address instruction pass if its beneficial to register allocation.
llvm-svn: 291369
Summary:
The issue happens with:
%0 = ....., !tbaa !0
%1 = ....., !tbaa !1
With !0 that references !1.
In this case when loading !0 we generates a temporary for the
operand !1. We now flush it immediately and trigger the load of
!1 before moving on. If we don't we get the temporary when
attaching to %1. This is usually not an issue except that we
eagerly try to update TBAA MDNodes, which is obviously not possible
if we only have a temporary.
Differential Revision: https://reviews.llvm.org/D28423
llvm-svn: 291362
Don't print a line multiple times, each for different inlining contexts, if
nothing happened in any context. This prevents situations like this:
[[
> main:
65 | if ((i * ni + j) % 20 == 0) fprintf
> print_array:
65 | if ((i * ni + j) % 20 == 0) fprintf
]]
which could happen if different optimizations were missed in different inlining
contexts.
llvm-svn: 291361
fabs(x * x) is not generally safe to assume x is positive if x is a NaN.
This is also less general than it could be, so this will be replaced
with a transformation on the intrinsic.
llvm-svn: 291359
Summary:
Prior to this change, phi nodes were never considered defs, and so we ended up with undefined variables for any loop. Now, instead of trying to find just defs, we iterate over each actual IR value in the line, and replace them one by one with either a definition or a use.
We also don't try to match anything in the comment portions of the line.
I've tested it even on things like function pointer calls, etc, and against existing test cases uses update_test_checks
With this change, we are able to use update_tests on the cyclic cases in newgvn.
The only case i'm aware of that will misfire is if you have a string with which contains a valid token.
However, this is the same as it is now, with a slightly larger set of strings that may misfire.
Prior to this change, a test with the string " %a =" would be replaced.
Reviewers: spatel, chandlerc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28384
llvm-svn: 291357
As discussed with Dmitry (https://goo.gl/SA4izd), I would like to introduce a function to be called from a third-party library to flush the shadow memory.
In particular, we ran some experiments with our tool Archer (an OpenMP data race detector based on Tsan, https://github.com/PRUNER/archer) and flushing the memory at the end of an outer parallel region, slightly increase the runtime overhead, but reduce the memory overhead of about 30%. This feature would come very handy in case of very large OpenMP applications that may cause an "out of memory" exception when checked with Tsan.
Reviewed in: https://reviews.llvm.org/D28443
Author: Simone Atzeni (simoatze)
llvm-svn: 291346
Windows is greedy and it defines the identifier `__out` as a macro.
This patch renames all conflicting libc++ identifiers in order
to correctly work on Windows.
llvm-svn: 291345
Tar's Ustar header has the "prefix" field to store a directory
part of a filename. It is not as flexible as the PAX-extended
filename because there's still a limitation on the maximum filename
size, but it mitigates the situation.
This patch should unbreak some Windows buildbots that uses very
old tar command.
llvm-svn: 291340
Tests need to output everything into a single stream, or FileCheck is sometimes confused (buffering can cause stdout/stderr to be interleaved randomly).
llvm-svn: 291339
ELAST should point to the last valid error string value. However,
`_sys_nerr` provides the number of elements in the errlist array. Since
the index is 0-based, this is off-by-one. Adjust it accordingly.
Thanks to David Majnemer for catching this!
llvm-svn: 291336
Add an implementation for the Win32 threading model as a backing API for
the internal c++ threading interfaces. This uses the Fls* family for
the TLS (which has the support for adding termination callbacks),
CRITICAL_SECTIONs for the recursive mutex, and Slim Reader/Writer locks
(SRW locks) for non-recursive mutexes. These APIs should all be
available on Vista or newer.
llvm-svn: 291333
Summary:
On Windows the identifier `__deallocate` is defined as a macro by one of the Windows system headers. Previously libc++ worked around this by `#undef __deallocate` and generating a warning. However this causes the WIN32 version of `__threading_support` to always generate a warning on Windows. This is not OK.
This patch renames all usages of `__deallocate` internally as to not conflict with the macro.
Reviewers: mclow.lists, majnemer, rnk, rsmith, smeenai, compnerd
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28426
llvm-svn: 291332
Windows does not provide an implementation of `nanosleep`. Round up the
time duration to the nearest ms and use `Sleep`. Although this may
over-sleep, there is no hard real-time guarantee on the wake, so
sleeping a bit more is better than under-sleeping as it within the
specification.
llvm-svn: 291331
This patch adds a libc++ configuration macro for the ABI we
are targeting, either Itanium or Microsoft. For now we configure
for the Microsoft ABI when on Windows with a compiler that defines
_MSC_VER. However this is only temporary until Clang implements
builtin macros we can use.
llvm-svn: 291329
Gracefully leave code that performs function-pointer bitcasts implying
non-trivial pointer conversions alone, rather than aborting, since it's
just undefined behavior.
llvm-svn: 291326
Also move command line handling out of the pass constructor and into
a separate function.
Differential Revision: https://reviews.llvm.org/D28422
llvm-svn: 291323
This implements something like the current direction of DR1581: we use a narrow
syntactic check to determine the set of places where a constant expression
could be evaluated, and only instantiate a constexpr function or variable if
it's referenced in one of those contexts, or is odr-used.
It's not yet clear whether this is the right set of syntactic locations; we
currently consider all contexts within templates that would result in odr-uses
after instantiation, and contexts within list-initialization (narrowing
conversions take another victim...), as requiring instantiation. We could in
principle restrict the former cases more (only const integral / reference
variable initializers, and contexts in which a constant expression is required,
perhaps). However, this is sufficient to allow us to accept libstdc++ code,
which relies on GCC's behavior (which appears to be somewhat similar to this
approach).
llvm-svn: 291318