We used to have a separate multiclass for AVX2 and SSE/AVX. Now we have one multiclass and pass the relevant differences.
We were also missing load patterns, though we had them for the AVX-512 version.
llvm-svn: 311059
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.
I didn't notice any performance impact when bootstrapping clang with this patch.
The patch was originally committed in r311039 and reverted in r311049.
This revision fixes the problem with not adding a dependency on the
DominatorTreeWrapperPass for the LegacyPassManager.
Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki
Reviewed By: davide
Subscribers: grandinj, zhendongsu, llvm-commits, david2050
Differential Revision: https://reviews.llvm.org/D35869
llvm-svn: 311057
We had a lock to guard BAlloc from being used concurrently, but that
is not very easy to understand. This patch replaces it with a
std::unique_ptr.
llvm-svn: 311056
In dependent contexts we end up referencing these, so make sure they
have USRs, and have their declarations indexed. For the most part they
behave like typedefs, but we also need to worry about having multiple
using declarations with the same "name".
rdar://problem/33883650
llvm-svn: 311053
This way we can see what the current codegen looks like.
I've also explicitly added/removed the cmov attribute from the RUN lines,
so we know exactly what we're checking in the runs.
llvm-svn: 311052
The ASan runtime on many systems intercepts cxa_throw just so it
can call asan_handle_no_return first. Some newer systems such as
Fuchsia don't use interceptors on standard library functions at all,
but instead use sanitizer-instrumented versions of the standard
libraries. When libc++abi is built with ASan, cxa_throw can just
call asan_handle_no_return itself so no interceptor is required.
Patch by Roland McGrath
Differential Revision: https://reviews.llvm.org/D36599
llvm-svn: 311045
-no-integrated-as is not supported on some targets (e.g.,
x86_64-pc-windows-msvc). Testing using -save-temps is good enough to cover the
relevant logic, and that should work everywhere.
llvm-svn: 311043
Using Output.getFilename() to construct the file name used for optimization
recording in Clang::ConstructJob, when -c is provided, does not work correctly
if we're not using the integrated assembler. With -no-integrated-as (or
-save-temps) Output.getFilename() gives the name of the temporary assembly
file, not the final output file. Instead, use the final output (as provided by
-o). If this is not available, then fall back to using a name based on the
input file.
Fixes PR31532.
llvm-svn: 311041
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.
I didn't notice any performance impact when bootstrapping clang with this patch.
Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki
Reviewed By: davide
Subscribers: grandinj, zhendongsu, llvm-commits, david2050
Differential Revision: https://reviews.llvm.org/D35869
llvm-svn: 311039
This change extends MachineCopyPropagation to do COPY source forwarding.
This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
Reviewers: qcolombet, javed.absar, MatzeB, jonpa
Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny
Differential Revision: https://reviews.llvm.org/D30751
llvm-svn: 311038
Summary:
Even in the case of the input file is a preprocessed source, clang uses the file name of the preprocesses source for debug info (DW_AT_name attribute for DW_TAG_compile_unit). However, gcc uses the file name specified in the first linemarker instead. This makes more sense because the one specified in the linemarker represents the "actual" source file name.
Clang already uses the file name specified in the first linemarker for Module name (https://github.com/llvm-mirror/clang/blob/master/lib/Frontend/FrontendAction.cpp#L779) if the input is preprocessed. This patch makes clang to use the same value for debug info as well.
Reviewers: compnerd, rnk, dblaikie, rsmith
Reviewed By: rnk
Subscribers: aprantl, cfe-commits
Differential Revision: https://reviews.llvm.org/D36474
llvm-svn: 311037
This can be used to build non-sanitized and sanitized versions of
runtimes, where sanitized versions use the just built sanitizer
which in turn may use the non-sanitized version.
Differential Revision: https://reviews.llvm.org/D36348
llvm-svn: 311036
Summary:
Mark LoopDataPrefetch and AArch64FalkorHWPFFix passes as preserving
ScalarEvolution since they do not alter loop structure and should not
alter any SCEV values (though LoopDataPrefetch may introduce new
instructions that won't have cached SCEV values yet).
This can result in slight code differences, mainly w.r.t. nsw/nuw flags
on SCEVs, since these are computed somewhat lazily when a zext/sext
instruction is encountered. As a result, passes after the modified
passes may see SCEVs with more nsw/nuw flags present.
Reviewers: sanjoy, anemet
Subscribers: aemerson, rengolin, mzolotukhin, javed.absar, kristof.beyls, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D36716
llvm-svn: 311032
Debug information for TLS variables on MIPS might have R_MIPS_TLS_DTPREL32
or R_MIPS_TLS_DTPREL64 relocations. This patch adds a support for such
relocations in the `RelocVisitor`.
llvm-svn: 311031
Summary:
epoll_create() is better to be replaced by epoll_create1() with EPOLL_CLOEXEC
flag to avoid file descriptor leakage.
Differential Revision: https://reviews.llvm.org/D35367
llvm-svn: 311029
Summary:
epoll_create1() is better to set EPOLL_CLOEXEC flag to avoid file descriptor leakage.
Differential Revision: https://reviews.llvm.org/D35365
llvm-svn: 311028
Summary:
accept4() is better to set SOCK_CLOEXEC flag to avoid file descriptor leakage.
Differential Revision: https://reviews.llvm.org/D35363
llvm-svn: 311027
Summary:
accept() is better to be replaced by accept4() with SOCK_CLOEXEC
flag to avoid file descriptor leakage.
Differential Revision: https://reviews.llvm.org/D35362
llvm-svn: 311024
Specs require using fences when barrier() is invoked:
"The barrier function will either flush any variables stored in local memory
or queue a memory fence to ensure correct ordering of memory operations to local memory."
and
"The barrier function will queue a memory fence to ensure correct ordering
of memory operations to global memory."
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 311022
Summary:
This patch changes a few (small) things around for compatibility purposes for
the current Android & Fuchsia work:
- `realloc`'ing some memory that was not allocated with `malloc`, `calloc` or
`realloc`, while UB according to http://pubs.opengroup.org/onlinepubs/009695399/functions/realloc.html
is more common that one would think. We now only check this if
`DeallocationTypeMismatch` is set; change the "mismatch" error
messages to be more homogeneous;
- some sketchily written but widely used libraries expect a call to `realloc`
to copy the usable size of the old chunk to the new one instead of the
requested size. We have to begrundingly abide by this de-facto standard.
This doesn't seem to impact security either way, unless someone comes up with
something we didn't think about;
- the CRC32 intrinsics for 64-bit take a 64-bit first argument. This is
misleading as the upper 32 bits end up being ignored. This was also raising
`-Wconversion` errors. Change things to take a `u32` as first argument.
This also means we were (and are) only using 32 bits of the Cookie - not a
big thing, but worth mentioning.
- Includes-wise: prefer `stddef.h` to `cstddef`, move `scudo_flags.h` where it
is actually needed.
- Add tests for the memalign-realloc case, and the realloc-usable-size one.
(Edited typos)
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36754
llvm-svn: 311018
Summary:
This patch introduces a way of informing the (Post)DominatorTree about multiple CFG updates that happened since the last tree update. This makes performing tree updates much easier, as it internally takes care of applying the updates in lockstep with the (virtual) updates to the CFG, which is done by reverse-applying future CFG updates.
The batch updater is able to remove redundant updates that cancel each other out. In the future, it should be also possible to reorder updates to reduce the amount of work needed to perform the updates.
Reviewers: dberlin, sanjoy, grosser, davide, brzycki
Reviewed By: brzycki
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D36167
llvm-svn: 311015
To clear assumptions that are potentially invalid after trivialization, we need
to walk the use/def chain. Normally, the only way to reach an instruction with
an unsized type is via an instruction that has side effects (or otherwise will
demand its input bits). That would stop the walk. However, if we have a
readnone function that returns an unsized type (e.g., void), we must avoid
asking for the demanded bits of the function call's return value. A
void-returning readnone function is always dead (and so we can stop walking the
use/def chain here), but the check is necessary to avoid asserting.
Fixes PR34211.
llvm-svn: 311014
If worksharing construct has at least one linear item, an implicit
synchronization point must be emitted to avoid possible conflict with
the loading/storing values to the original variables. Added implicit
barrier if the linear item is found before actual start of the
worksharing construct.
llvm-svn: 311013