I want to add 512-bits support but I first want to make sure I'm
not breaking anything obvious. This is the first of a series of commit
adding tests. The first oddity found is that Scalar from APInt(s)
always constructed signed. Maybe at some point we want to revisit
this, but at least now we have a test to document how the API behaves.
<rdar://problem/46886288>
llvm-svn: 352103
Use the -check-prefixes= feature to merge most of the WEBASSEMBLY32 and
WEBASSEMBLY64 test checks into a shared WEBASSEMBLY test check.
Differential Revision: https://reviews.llvm.org/D57153
llvm-svn: 352099
Summary:
MemorySSA needs updating each time an instruction is moved.
LICM and control flow hoisting re-hoists instructions, thus needing another update when re-moving those instructions.
Pending cleanup: the MSSA update is duplicated, should be moved inside moveInstructionBefore.
Reviewers: jnspaulsson
Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits
Differential Revision: https://reviews.llvm.org/D57176
llvm-svn: 352092
This is the first part of splitting apart https://reviews.llvm.org/D57140 into usuable pieces. Landing the tests in advance of posting a review specifically for the demanded elements part.
llvm-svn: 352091
Summary:
An alignment should be non-zero positive power-of-two, anything and everything else is UB.
We should not have that check for all these prerequisites here, it's just UB.
Also, that was likely confusing middle-end passes.
While there, `CreateIntCast()` should be called with `/*isSigned*/ false`.
Think about it, there are two explanations: "An alignment should be positive",
therefore the sign bit is unset, so `zext` and `sext` is equivalent.
Or a second one: you have `i2 0b10` - a valid alignment,
now you `sext` it: `i2 0b110` - no longer valid alignment.
Reviewers: craig.topper, jyknight, hfinkel, erichkeane, rjmccall
Reviewed By: hfinkel, rjmccall
Subscribers: hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D54653
llvm-svn: 352089
attributes from declaration over attributes from '#pragma clang attribute'
Before this commit users had an issue when using #pragma clang attribute with
availability attributes:
The explicit attribute that's specified next to the declaration is not
guaranteed to be preferred over the attribute specified in the pragma.
This commit fixes this by introducing a priority field to the availability
attribute to control how they're merged. Attributes with higher priority are
applied over attributes with lower priority for the same platform. The
implicitly inferred attributes are given the lower priority. This ensures that:
- explicit attributes are preferred over all other attributes.
- implicitly inferred attributes that are inferred from an explicit attribute
are discarded if there's an explicit attribute or an attribute specified
using a #pragma for the same platform.
- implicitly inferred attributes that are inferred from an attribute in the
#pragma are not used if there's an explicit, explicit #pragma, or an
implicit attribute inferred from an explicit attribute for the declaration.
This is the resulting ranking:
`platform availability > platform availability from pragma > inferred availability > inferred availability from pragma`
rdar://46390243
Differential Revision: https://reviews.llvm.org/D56892
llvm-svn: 352084
The unordered_set and unordered_multiset iterators are specified in the standard as follows:
using iterator = implementation-defined; // see [container.requirements]
using const_iterator = implementation-defined; // see [container.requirements]
using local_iterator = implementation-defined; // see [container.requirements]
using const_local_iterator = implementation-defined; // see [container.requirements]
The pairs iterator/const_iterator and local_iterator/const_local_iterator
are not required to be the same. The reasonable requirement would be that
iterator can convert to const_iterator and local_iterator can convert to
const_local_iterator. This patch weakens the check and makes the test
more portable.
Reviewed as https://reviews.llvm.org/D56493.
Thanks to Andrey Maksimov for the patch.
llvm-svn: 352083
Previously, we assumed that .rdata is zero-filled, so when writing
an COFF import table, we didn't write anything if the data is zero.
That assumption was wrong because .rdata can be merged with .text.
If .rdata is merged with .text, they are initialized with 0xcc which
is a trap instruction.
This patch removes that assumption from code.
Should be merged to 8.0 branch as this is a regression.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39826
Differential Revision: https://reviews.llvm.org/D57168
llvm-svn: 352082
Performing splitting early has several advantages:
- Inhibiting inlining of cold code early improves code size. Compared
to scheduling splitting at the end of the pipeline, this cuts code
size growth in half within the iOS shared cache (0.69% to 0.34%).
- Inhibiting inlining of cold code improves compile time. There's no
need to inline split cold functions, or to inline as much *within*
those split functions as they are marked `minsize`.
- During LTO, extra work is only done in the pre-link step. Less code
must be inlined during cross-module inlining.
An additional motivation here is that the most common cold regions
identified by the static/conservative splitting heuristic can (a) be
found before inlining and (b) do not grow after inlining. E.g.
__assert_fail, os_log_error.
The disadvantages are:
- Some opportunities for splitting out cold code may be missed. This
gap can potentially be narrowed by adding a worklist algorithm to the
splitting pass.
- Some opportunities to reduce code size may be lost (e.g. store
sinking, when one side of the CFG diamond is split). This does not
outweigh the code size benefits of splitting earlier.
On net, splitting early in the pipeline has substantial code size
benefits, and no major effects on memory locality or performance. We
measured memory locality using ktrace data, and consistently found that
10% fewer pages were needed to capture 95% of text page faults in key
iOS benchmarks. We measured performance on frequency-stabilized iOS
devices using LNT+externals.
This reverses course on the decision made to schedule splitting late in
r344869 (D53437).
Differential Revision: https://reviews.llvm.org/D57082
llvm-svn: 352080
Summary:
r347205 fixed a bug in FileManager: first calling
getFile(shouldOpen=false) and then getFile(shouldOpen=true) results in
the file not being open.
Unfortunately, some code was (inadvertently?) relying on this bug: when
building with a PCH, the file entries are obtained first by passing
shouldOpen=false, and then later shouldOpen=true, without any intention
of reading them. After r347205, they do get unneccesarily opened.
Aside from extra operations, this means they need to be closed. Normally
files are closed when their contents are read. As these files are never
read, they stay open until clang exits. On platforms with a low
open-files limit (e.g. Mac), this can lead to spurious file-not-found
errors when building large projects with PCH enabled, e.g.
https://bugs.chromium.org/p/chromium/issues/detail?id=924225
Fixing the callsites to pass shouldOpen=false when the file won't be
read is not quite trivial (that info isn't available at the direct
callsite), and passing shouldOpen=false is a performance regression (it
results in open+fstat pairs being replaced by stat+open).
So an ideal fix is going to be a little risky and we need some fix soon
(especially for the llvm 8 branch).
The problem addressed by r347205 is rare and has only been observed in
clangd. It was present in llvm-7, so we can live with it for now.
Reviewers: bkramer, thakis
Subscribers: ilya-biryukov, ioeric, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D57165
llvm-svn: 352079
It should be emitted when any floating-point operations (including
calls) are present in the object, not just when calls to printf/scanf
with floating point args are made.
The difference caused by this is very subtle: in static (/MT) builds,
on x86-32, in a program that uses floating point but doesn't print it,
the default x87 rounding mode may not be set properly upon
initialization.
This commit also removes the walk of the types pointed to by pointer
arguments in calls. (To assist in opaque pointer types migration --
eventually the pointee type won't be available.)
That latter implies that it will no longer consider a call like
`scanf("%f", &floatvar)` as sufficient to emit _fltused on its
own. And without _fltused, `scanf("%f")` will abort with error R6002. This
new behavior is unlikely to bite anyone in practice (you'd have to
read a float, and do nothing with it!), and also, is consistent with
MSVC.
Differential Revision: https://reviews.llvm.org/D56548
llvm-svn: 352076
Guessing that the slashes used in the scripts SECTION command was causing the
windows related failures in the added test.
Original commit message:
Small code model global variable access on PPC64 has a very limited range of
addressing. The instructions the relocations are used on add an offset in the
range [-0x8000, 0x7FFC] to the toc pointer which points to .got +0x8000, giving
an addressable range of [.got, .got + 0xFFFC]. While user code can be recompiled
with medium and large code models when the binary grows too large for small code
model, there are small code model relocations in the crt files and libgcc.a
which are typically shipped with the distros, and the ABI dictates that linkers
must allow linking of relocatable object files using different code models.
To minimze the chance of relocation overflow, any file that contains a small
code model relocation should have its .toc section placed closer to the .got
then any .toc from a file without small code model relocations.
Differential Revision: https://reviews.llvm.org/D56920
llvm-svn: 352071
This does *not* implement full SHT_GROUP semantic, yet it is a simple step forward:
Sections within a group are still considered valid, but they do not behave as
specified by the standard in case of garbage collection.
Differential Revision: https://reviews.llvm.org/D56437
llvm-svn: 352068
Write a couple of variations on vector geps w/both scalars and vectors live over safepoints. Use update_test_checks to show all the IR.
llvm-svn: 352062
After submitting https://reviews.llvm.org/D57138, I realized it was slightly more conservative than needed. The scalar indices don't appear to be a problem on a vector gep, we even had a test for that.
Differential Revision: https://reviews.llvm.org/D57161
llvm-svn: 352061
Fix remaining unittest errors caused by
__attribute__((no_caller_saved_registers))
Related commit which caused the buildbots to fail:
rL352050
llvm-svn: 352060
This is an alternative to https://reviews.llvm.org/D57103. After discussion, we dedicided to check this in as a temporary workaround, and pursue a true fix under the original thread.
The issue at hand is that the base rewriting algorithm doesn't consider the fact that GEPs can turn a scalar input into a vector of outputs. We had handling for scalar GEPs and fully vector GEPs (i.e. all vector operands), but not the scalar-base + vector-index forms. A true fix here requires treating GEP analogously to extractelement or shufflevector.
This patch is merely a workaround. It simply hides the crash at the cost of some ugly code gen for this presumable very rare pattern.
Differential Revision: https://reviews.llvm.org/D57138
llvm-svn: 352059
Summary:
This tunes several of the default parameters used within the allocator:
- disable the deallocation type mismatch on Android by default; this
was causing too many issues with third party libraries;
- change the default `SizeClassMap` to `Dense`, it caches less entries
and is way more memory efficient overall;
- relax the timing of the RSS checks, 10 times per second was too much,
lower it to 4 times (every 250ms), and update the test so that it
passes with the new default.
Reviewers: eugenis
Reviewed By: eugenis
Subscribers: srhines, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D57116
llvm-svn: 352057
I discovered that in ICC (where this list comes from), that the two
pentium_iii versions were actually identical despite the two different
names (despite them implying a difference). Because of this, they ended
up having identical manglings, which obviously caused problems when used
together.
This patch makes pentium_iii_no_xmm_regs an alias for pentium_iii so
that it can still be used, but has the same meaning as ICC. However, we
still prohibit using the two together which is different (albeit better)
behavior.
Change-Id: I4f3c9a47e48490c81525c8a3d23ed4201921b288
llvm-svn: 352054