This patch adds the following instructions:
RBIT reverse bits within each active elemnt (predicated), e.g.
rbit z0.d, p0/m, z1.d
for 8, 16, 32 and 64 bit elements.
REV reverse order of elements in data/predicate vector
(unpredicated), e.g.
rev z0.d, z1.d
rev p0.d, p1.d
for 8, 16, 32 and 64 bit elements.
REVB reverse order of bytes within each active element, e.g.
revb z0.d, p0/m, z1.d
for 16, 32 and 64 bit elements.
REVH reverse order of 16-bit half-words within each active
element, e.g.
revh z0.d, p0/m, z1.d
for 32 and 64 bit elements.
REVW reverse order of 32-bit words within each active element,
e.g.
revw z0.d, p0/m, z1.d
for 64 bit elements.
llvm-svn: 337534
MmapFixedNoReserve does not terminate process on failure.
Failure to check its result and die will always lead to harder
to debug crashes later in execution. This was observed in Go
processes due to some address space conflicts.
Consistently check result of MmapFixedNoReserve.
While we are here also add warn_unused_result attribute
to prevent such bugs in future and change return type to bool
as that's what all callers want.
Reviewed in https://reviews.llvm.org/D49367
llvm-svn: 337531
Summary: This is intended to be used for indexing, e.g. in D49417
Reviewers: ioeric, omtcyfz
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D49540
llvm-svn: 337527
For dwarf debug info, an executable normally either contains the debug
info, or it is stripped out. To reduce the storage needed (slightly)
for the debug info kept separately from the released, stripped binaries,
one can choose to only copy the debug data from the original executable
(essentially the reverse of the strip operation), producing a file with
only debug info.
When copying the debug data from an executable with GNU objcopy,
the build id and debug directory need to reside in a separate section,
as this will be kept while the rest of the .rdata section is removed.
Differential Revision: https://reviews.llvm.org/D49352
llvm-svn: 337526
Previously, check-all failed many tests for me. It was running the
X86_64DefaultLinuxConfig, X86_64LibcxxLinuxConfig, and
X86_64StaticLibcxxLinuxConfig configs out of
llvm-build/projects/compiler-rt/test/fuzzer. Now, it runs them out of
separate subdirectories there, and most tests pass.
Reviewed By: morehouse, george.karpenkov
Differential Revision: https://reviews.llvm.org/D49249
llvm-svn: 337521
First, <experimental/filesystem> didn't correctly guard
against min/max macros. This adds the proper push/pop macro guards.
Second, an internal time helper had been renamed but the test for
it hadn't been updated. This patch updates those tests.
llvm-svn: 337520
Summary:
This patch implements directory_entry caching *almost* as specified in P0317r1. However, I explicitly chose to deviate from the standard as I'll explain below.
The approach I decided to take is a fully caching one. When `refresh()` is called, the cache is populated by calls to `stat` and `lstat` as needed.
During directory iteration the cache is only populated with the `file_type` as reported by `readdir`.
The cache can be in the following states:
* `_Empty`: There is nothing in the cache (likely due to an error)
* `_IterSymlink`: Created by directory iteration when we walk onto a symlink only the symlink file type is known.
* `_IterNonSymlink`: Created by directory iteration when we walk onto a non-symlink. Both the regular file type and symlink file type are known.
* `_RefreshSymlink` and `_RefreshNonSymlink`: A full cache created by `refresh()`. This case includes dead symlinks.
* `_RefreshSymlinkUnresolved`: A partial cache created by refresh when we fail to resolve the file pointed to by a symlink (likely due to permissions). Symlink attributes are cached, but attributes about the linked entity are not.
As mentioned, this implementation purposefully deviates from the standard. According to some readings of the specification, and the Windows filesystem implementation, the constructors and modifiers which don't pass an `error_code` must throw when the `directory_entry` points to a entity which doesn't exist. or when attribute resolution fails for another reason.
@BillyONeal has proposed a more reasonable set of requirements, where modifiers other than refresh ignore errors. This is the behavior libc++ currently implements, with the expectation some form of the new language will be accepted into the standard.
Some additional semantics which differ from the Windows implementation:
1. `refresh` will not throw when the entry doesn't exist. In this case we can still meet the functions specification, so we don't treat it as an error.
2. We don't clear the path name when a constructor fails via refresh (this will hopefully be changed in the standard as well).
It should be noted that libstdc++'s current implementation has the same behavior as libc++, except for point (2).
If the changes to the specification don't get accepted, we'll be able to make the changes later.
[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0317r1.html
Reviewers: mclow.lists, gromer, ldionne, aaron.ballman
Subscribers: BillyONeal, christof, cfe-commits
Differential Revision: https://reviews.llvm.org/D49530
llvm-svn: 337516
I'm optimistically reverting commit r337511, effectively reapplying
r337504 *without* changes.
The failing bots that had `SmallVector` in the backtrace recovered after
the unrelated commit r337508. The backtraces looked bogus anyway, with
`SmallVector::size()` calling (e.g.) `ConstantArray::get()`.
Here's the original commit message:
ADT: Shrink size of SmallVector by 8B on 64-bit platforms
Represent size and capacity directly as unsigned and calculate
`end()` using `begin() + size()`.
This limits the maximum size/capacity of a vector to UINT32_MAX.
https://reviews.llvm.org/D48518
llvm-svn: 337514
Summary:
lifetime2.C violates DR1696, which prevents reference members from being
initialized to temporaries, whose lifetime would end at the end of ctor.
Reviewers: sbc100
Subscribers: dschuff, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D49577
llvm-svn: 337512
remove dead declaration of a call instruction handling helper.
This moves to the 'harden' terminology that I've been trying to settle
on for returns. It also adds a really detailed comment explaining what
all we're trying to accomplish with return instructions and why.
Hopefully this makes it much more clear what exactly is being
"hardened".
Differential Revision: https://reviews.llvm.org/D49571
llvm-svn: 337510
It's more aggressive than we need to be, and leads to strange
workarounds in other places like call return value inference. Instead,
just directly mark an edge viable.
Tests by Florian Hahn.
Differential Revision: https://reviews.llvm.org/D49408
llvm-svn: 337507
Representing size and capacity directly as unsigned and calculate
`end()` using `begin() + size()`.
This limits the maximum size/capacity of a vector to UINT32_MAX.
https://reviews.llvm.org/D48518
llvm-svn: 337504
Summary:
Currently all type ids are emitted into the index file when it is
written. For distributed ThinLTO, that meant that all type ids were
being duplicated into every single distributed index file, regardless of
whether they were referenced, leading to huge amounts of unnecessary
duplication and size bloat.
Keep track of the type id GUIDs actually referenced by the GV summary
records being emitted, and only emit those type IDs.
Add a new test, and fix test/Assembler/thinlto-summary.ll so that all
type ids are referenced to prevent deletion in that test.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, vitalybuka, llvm-commits
Differential Revision: https://reviews.llvm.org/D49565
llvm-svn: 337503
We have a number of cases where we fail to reduce vector op widths, performing the op in a larger vector and then extracting a subvector. This is often because by default it would create illegal types.
This peephole patch attempts to handle a few common cases detailed in PR36761, which typically involved extension+conversion to vX2f64 types.
Differential Revision: https://reviews.llvm.org/D49556
llvm-svn: 337500
The last argument is expected to be the destination buffer size (or less).
Detects if it points to destination buffer size directly or via a variable.
Detects if it is an integral, try to detect if the destination buffer can receive the source length.
Updating bsd-string.c unit tests as it make it fails now.
Reviewers: george.karpenpov, NoQ
Reviewed By: george.karpenkov
Differential Revision: https://reviews.llvm.org/D48884
llvm-svn: 337499
constant, don't convert the rest into a packed struct.
If an array constant has a large non-zero portion and a large zero
portion, we want to emit the first part as an array and the rest as a
zeroinitializer if possible. This fixes a memory usage regression from
r333141 when compiling PHP.
llvm-svn: 337498
Previously, clang marked the specialization as invalid without emitting a
diagnostic. This lead to an assert in CodeGen.
rdar://41806724
Differential revision: https://reviews.llvm.org/D49085
llvm-svn: 337497
For the most part, these changes were from the RFC. I made a few minor
word/structure changes, but nothing significant. I also regenerated the
example output, and adjusted the text accordingly.
Differential Revision: https://reviews.llvm.org/D49527
llvm-svn: 337496
This particular version of GCC seems to break bitfields when a method
appears between two bitfield members.
Personally, I think it's nice to keep bitfields close together so that
it's easy to check how things are packed, so I moved the method after
SubClassData.
Fixes PR38168.
llvm-svn: 337495
It fires on things like SmallVector<std::pair<int, int>>, where we
intentionally use memcpy instead of calling the assignment operator.
This warning fires in practically every LLVM TU, so we have to do
something about it, even if we aren't interested in being 100% warning
clean with GCC.
Reported as PR37337
llvm-svn: 337492
Returning SDValue() means nothing was changed. Returning the result of CombineTo returns the first argument of CombineTo. This is specially detected by DAGCombiner as meaning that something changed, but worklist management was already taken care of.
I think the only real effect of this change is that we now properly update the Statistic the counts the number of combines performed. That's the only thing between the check for null and the check for N in the DAGCombiner.
llvm-svn: 337491
This is mostly a preparation work for adding a limited support for
select instructions. It proved to be difficult to do due to size and
irregularity of Vectorizer::isConsecutiveAccess, this is fixed here I
believe.
It also turned out that these changes make it simpler to finish one of
the TODOs and fix a number of other small issues, namely:
1. Looking through bitcasts to a type of a different size (requires
careful tracking of the original load/store size and some math
converting sizes in bytes to expected differences in indices of GEPs).
2. Reusing partial analysis of pointers done by first attempt in proving
them consecutive instead of starting from scratch. This added limited
support for nested GEPs co-existing with difficult sext/zext
instructions. This also required a careful handling of negative
differences between constant parts of offsets.
3. Handing a case where the first pointer index is not an add, but
something else (a function parameter for instance).
I observe an increased number of successful vectorizations on a large
set of shader programs. Only few shaders are affected, but those that
are affected sport >5% less loads and stores than before the patch.
Reviewed By: rampitec
Differential-Revision: https://reviews.llvm.org/D49342
llvm-svn: 337489
As we already return true from needsAggressiveScheduling() for the most recent
hardware it would be cleaner to just return true for all PowerPC hardware.
Differential Revision: https://reviews.llvm.org/D48663
llvm-svn: 337488
This change fixes possibly invalid access to the internal data structure during
library shutdown. In a heavily oversubscribed situation, the library shutdown
sequence can reach the point where resources are deallocated while there still
exist threads in their final spinning loop. The added loop in
__kmp_internal_end() checks if there are such busy-waiting threads and blocks
the shutdown sequence if that is the case. Two versions of kmp_wait_template()
are now used to minimize performance impact.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D49452
llvm-svn: 337486