This is an improvement when compiling with llvm. llvm doesn't inline
the call to insert, so the align is always executed and shows up in
the profile.
With gcc the call to insert is inlined and the align computation moved
and done only if needed.
With this patch we explicitly only compute it if it is needed.
In the two tests with debug info, the speedup was
scylla
master 3.008959365
patch 2.932080942 1.02621974786x faster
firefox
master 6.709823604
patch 6.592387227 1.01781393795x faster
In all others the difference was in the noise.
llvm-svn: 284249
This change adds transformations such as:
zext(or(setcc(eq, (cmp x, 0)), setcc(eq, (cmp y, 0))))
To:
srl(or(ctlz(x), ctlz(y)), log2(bitsize(x))
This optimisation is beneficial on Jaguar architecture only, where lzcnt has a good reciprocal throughput.
Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it.
For this reason the change also adds a "HasFastLZCNT" feature which gets enabled for Jaguar.
Differential Revision: https://reviews.llvm.org/D23446
llvm-svn: 284248
Summary:
* Describe new (3.3) parameter attribute group encoding, leaving old encoding there with a note about legacy
* Bring TYPE_BLOCK docs up to date
* Remove docs about obsolete (pre 3.0) TYPE_SYMTAB_BLOCK, TST_CODE_ENTRY
* Fix a couple of incorrect comments and remove one unused enum definition along the way
This addresses https://llvm.org/bugs/show_bug.cgi?id=28941.
Patch by: Ismail Badawi <ibadawi@cisco.com>
Differential Revision: https://reviews.llvm.org/D25623
llvm-svn: 284246
This test was apparently checking for 2 independent folds, but we have
plenty of tests for those individual folds already. We are lacking
vector tests, however, because we don't have the shift folds for vectors.
llvm-svn: 284243
Prefer add/zext because they are better supported in terms of value-tracking.
Note that the backend should be prepared for this IR canonicalization
(including vector types) after:
https://reviews.llvm.org/rL284015
Differential Revision: https://reviews.llvm.org/D25135
llvm-svn: 284241
Summary:
This adds a diagnostic to the misc-use-after-move check that is output when the
use happens on a later loop iteration than the move, for example:
A a;
for (int i = 0; i < 10; ++i) {
a.foo();
std::move(a);
}
This situation can be confusing to users because, in terms of source code
location, the use is above the move. This can make it look as if the warning
is a false positive, particularly if the loop is long but the use and move are
close together.
In cases like these, misc-use-after-move will now output an additional
diagnostic:
a.cpp:393:7: note: the use happens in a later loop iteration than the move
Reviewers: hokein
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D25612
llvm-svn: 284235
This is 30646.
PT_OPENBSD_RANDOMIZE
The array element specifies the location and size of a part of the memory image of the program that must be filled with random data before any code in the object is executed. The memory region specified by a segment of this type may overlap the region specified by a PT_GNU_RELRO segment, in which case the intersection will be filled with random data before being marked read-only.
Reference links:
http://man.openbsd.org/OpenBSD-current/man5/elf.5c494713c45
Differential revision: https://reviews.llvm.org/D25469
llvm-svn: 284234
Summary:
The header guard generated by clang-move isn't always a perfect
style, just avoid getting the header included multiple times during
compiling period.
Also, we can use llvm-Header-guard clang-tidy check to correct the guard
automatically.
Reviewers: ioeric
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D25610
llvm-svn: 284233
This fixes a small omission where even when __external_threading is provided,
we attempt to declare a pthread based threading API. Instead, we should leave
out everything for the __external_threading header to take care of.
The __threading_support header provides a proof-of-concept externally threaded
libc++ variant when _LIBCPP_HAS_THREAD_API_EXTERNAL is defined. But if the
__external_threading header is present, we should exclude all of that POC stuff.
Reviewers: EricWF
Differential revision: https://reviews.llvm.org/D25468
llvm-svn: 284232
Libc++ will not build with modules enabled. In order to support an in-tree
libc++ when LLVM_ENABLE_MODULES is ON we need to explicitly disable the feature.
Unfortunately the libc++ sources are fundamentally non-modular. For example
iostream.cpp defines cout, cerr, wout, ... as char buffers instead of streams
in order to better control initialization/destruction. Not shockingly Clang
diagnoses this. Many other sources files define _LIBCPP_BUILDING_FOO macros to
provide definitions for normally inline symbols (See bind.cpp). Finally The
current module.map prohibits using <strstream> in C++11 so we can't build
strstream.cpp.
I think I can fix most of these issues but until then just disable modules.
llvm-svn: 284230
There was a bug in the implementation of captured statements. If it has
a lambda expression in it and the same lambda expression is used outside
the captured region, clang produced an error:
```
error: definition with same mangled name as another definition
```
Here is an example:
```
struct A {
template <typename L>
void g(const L&) { }
};
template<typename T>
void f() {
{
A().g([](){});
}
A().g([](){});
}
int main() {
f<void>();
}
```
Error report:
```
main.cpp:3:10: error: definition with same mangled name as another
definition
void g(const L&) { }
^
main.cpp:3:10: note: previous definition is here
```
Patch fixes this bug.
llvm-svn: 284229
Issue was revealed by AFl and I was able to generate such object using yaml2obj.
When object has more than one SHT_MIPS_OPTIONS,
each except the last one is destroyed after placing into Sections array.
Sections array contains dead pointers finally. LLD may crash then.
Differential revision: https://reviews.llvm.org/D25555
llvm-svn: 284227
-z wxneeded creates a PHDR PT_OPENBSD_WXNEEDED.
PT_OPENBSD_WXNEEDED
The array element specifies that a process executing this file may need to be able to map or protect memory regions as simultaneously executable and writable. If the system is unable or unwilling to permit that for this executable then it may fail immediately. This segment type is meaningful only for executable files and is ignored in other objects.
http://man.openbsd.org/OpenBSD-current/man5/elf.5
Differential revision: https://reviews.llvm.org/D25472
llvm-svn: 284226
Summary:
This will be used for 64-bit MULHU, which is in turn used for the 64-bit
divide-by-constant optimization (see D24822).
Reviewers: arsenm, tstellarAMD
Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D25289
llvm-svn: 284224
Mostly this just means changing the triple from aarch64-apple-ios to the generic
aarch64--. Only one test needs more significant changes, but GlobalISel already
does the right thing so it's ok to just change the checks.
Differential Revision: https://reviews.llvm.org/D25532
llvm-svn: 284223
Summary:
If there are multiple <File, Replacements> pairs with the same file
path after removing dots, we only keep one pair (with path after dots being
removed) and discard the rest.
Reviewers: djasper
Subscribers: klimek, hokein, bkramer, cfe-commits
Differential Revision: https://reviews.llvm.org/D25565
llvm-svn: 284219
For compatiblity with binutils, define these instructions to take
two registers with a 16bit unsigned immediate. Both of the registers
have to be same for dahi and dati.
Reviewers: dsanders, zoran.jovanovic
Differential Review: https://reviews.llvm.org/D21473
llvm-svn: 284218
Summary:
This fixes a false-positive e.g. when string literals are returned from if statement.
This patch includes as well a small fix to includes and renames of the test suite that collided with the name of the check.
Reviewers: alexfh, hokein
Subscribers: hokein
Tags: #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D25558
llvm-svn: 284212
Committing in the name of Ziv Izhar: After check-all and LGTM .
The following patch is for compatability with Microsoft.
Microsoft ignores the keyword "short" when used after a jmp, for example:
__asm {
jmp short label
label:
}
A test for that patch will be added in another patch, since it's located in clang's codegen tests. Link will be added shortly.
link to test: https://reviews.llvm.org/D24958
Differential Revision: https://reviews.llvm.org/D24957
llvm-svn: 284211
Summary:
This patch implements the library side of P0035R4. The implementation is thanks to @rsmith.
In addition to the C++17 implementation, the library implementation can be explicitly turned on using `-faligned-allocation` in all dialects.
Reviewers: mclow.lists, rsmith
Subscribers: rsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D25591
llvm-svn: 284206
This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64 which requires a sign/zero_extend_vector_inreg to be created which requires v8i8 to be concatenated upto v64i8 and goes through this code.
llvm-svn: 284204