Commit Graph

307522 Commits

Author SHA1 Message Date
Tom Stellard 3d36e5c3e6 Only promote args when function attributes are compatible
Summary:
Check to make sure that the caller and the callee have compatible
function arguments before promoting arguments.  This uses the same
TargetTransformInfo queries that are used to determine if attributes
are compatible for inlining.

The goal here is to avoid breaking ABI when a called function's ABI
depends on a target feature that is not enabled in the caller.

This is a very conservative fix for PR37358.  Ideally we would have a more
sophisticated check for ABI compatiblity rather than checking if the
attributes are compatible for inlining.

Reviewers: echristo, chandlerc, eli.friedman, craig.topper

Reviewed By: echristo, chandlerc

Subscribers: nikic, xbolva00, rkruppe, alexcrichton, llvm-commits

Differential Revision: https://reviews.llvm.org/D53554

llvm-svn: 351296
2019-01-16 05:15:31 +00:00
Serguei Katkov a5b0e5585b [InstCombine]Avoid introduction of unaligned mem access
InstCombine is able to transform mem transfer instrinsic to alone store or store/load pair.
It might result in generation of unaligned atomic load/store which later in backend
will be transformed to libcall. It is not an evident gain and it is better to keep intrinsic as is
and handle it at backend.

Reviewers: reames, anna, apilipenko, mkazantsev
Reviewed By: reames
Subscribers: t.p.northover, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D56582

llvm-svn: 351295
2019-01-16 04:36:26 +00:00
Eric Fiselier 8e92050794 [SemaCXX] Unconfuse Clang when std::align_val_t is unscoped in C++03
When -faligned-allocation is specified in C++03 libc++ defines
std::align_val_t as an unscoped enumeration type (because Clang didn't
provide scoped enumerations as an extension until 8.0).
Unfortunately Clang confuses the `align_val_t` overloads of delete with
the sized deallocation overloads which aren't enabled. This caused Clang
to call the aligned deallocation function as if it were the sized
deallocation overload.

For example: https://godbolt.org/z/xXJELh

This patch fixes the confusion.

llvm-svn: 351294
2019-01-16 02:34:36 +00:00
Peter Collingbourne d88cbd57d3 gn build: Merge r351283.
llvm-svn: 351293
2019-01-16 02:27:12 +00:00
Eric Fiselier efdac65f70 Attempt to make test_macros.h even more minimal
llvm-svn: 351292
2019-01-16 02:16:57 +00:00
Eric Fiselier 05019eb79f Fix feature test macros for atomics/mutexes without threading
llvm-svn: 351291
2019-01-16 02:10:28 +00:00
Eric Fiselier d01a4aa068 Fix PR40230 - std::pair may have padding on FreeBSD.
Summary:
FreeBSD ships a very old and deprecated ABI for std::pair where the copy and move constructors are not allowed to be trivial. D25389 change how this was implemented by introducing a non-trivial base class. This patch, introduced in October 2016, introduced an ABI bug that caused nested `std::pair` instantiations to have padding. For example:

```
using PairT = std::pair< std::pair<char, char>, char >;
static_assert(offsetof(PairT, first) == 0, "First member should exist at offset zero"); // Fails on FreeBSD!
```

The bug occurs because the base class for the first element (the nested pair) cannot be put at offset zero because the top-level pair already has the same base class laid out there.

This patch fixes that ABI bug by templating the dummy base class on the same parameters as the pair.

Technically this fix is an ABI break for users who depend on the "broken" ABI introduced in 2016. I'm putting this up for review so that the FreeBSD maintainers can sign off on fixing the ABI by breaking the ABI.
Another option, since we have to "break" the ABI to fix it, would be to move FreeBSD off the deprecated non-trivial pair ABI instead.

Also see:

* https://llvm.org/PR40230
* https://reviews.llvm.org/D21329



Reviewers: rsmith, dim, emaste

Reviewed By: rsmith

Subscribers: mclow.lists, krytarowski, christof, ldionne, libcxx-commits

Differential Revision: https://reviews.llvm.org/D56357

llvm-svn: 351290
2019-01-16 01:54:34 +00:00
Eric Fiselier d108bf85b0 Move internal usages of `alignof`/`__alignof` to use `_LIBCPP_ALIGNOF`.
Summary:
Starting in Clang 8.0 and GCC 8.0, `alignof` and `__alignof` return different values in same cases. Specifically `alignof` and `_Alignof` return the minimum alignment for a type, where as `__alignof` returns the preferred alignment. libc++ currently uses `__alignof` but means to use `alignof`. See  llvm.org/PR39713

This patch introduces the macro `_LIBCPP_ALIGNOF` so we can control which spelling gets used.

This patch does not introduce any ABI guard to provide the old behavior with newer compilers. However, if we decide that is needed, this patch makes it trivial to implement.

I think we should commit this change immediately, and decide what we want to do about the ABI afterwards. 

Reviewers: ldionne, EricWF

Reviewed By: ldionne, EricWF

Subscribers: jyknight, christof, libcxx-commits

Differential Revision: https://reviews.llvm.org/D54814

llvm-svn: 351289
2019-01-16 01:51:12 +00:00
Julian Lettner ac855d3ea9 [TSan] Use switches when dealing with enums
Summary:
Small refactoring: replace some if-else cascades with switches so that the compiler warns us about missing cases.
Maybe found a small bug?

Reviewers: dcoughlin, kubamracek, dvyukov, delcypher, jfb

Reviewed By: dvyukov

Subscribers: llvm-commits, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D56295

llvm-svn: 351288
2019-01-16 01:45:12 +00:00
Sam Clegg 6320efb4ed [WebAssembly] Store section alignment as a power of 2
This change bumps for version number of the wasm object file
metadata.

See https://github.com/WebAssembly/tool-conventions/pull/92

Differential Revision: https://reviews.llvm.org/D56762

llvm-svn: 351287
2019-01-16 01:43:21 +00:00
Eric Fiselier 32784a740a Implement feature test macros using a script.
Summary:
This patch implements all the feature test macros libc++ currently supports, as specified by the standard or cppreference prior to C++2a.

The tests and `<version>` header are generated using a script. The script contains a table of each feature test macro, the headers it should be accessible from, and its values of each dialect of C++.
When a new feature test macro is added or needed, the table should be updated and the script re-run.



Reviewers: mclow.lists, jfb, serge-sans-paille

Reviewed By: mclow.lists

Subscribers: arphaman, jfb, ldionne, libcxx-commits

Differential Revision: https://reviews.llvm.org/D56750

llvm-svn: 351286
2019-01-16 01:37:43 +00:00
Sam Clegg 56c587adfd [WebAssembly] Store section alignment as a power of 2
This change bumps for version number of the wasm object file
metadata.

See https://github.com/WebAssembly/tool-conventions/pull/92

Differential Revision: https://reviews.llvm.org/D56758

llvm-svn: 351285
2019-01-16 01:34:48 +00:00
Eli Friedman c4c43b2bad [EH] Rename llvm.x86.seh.recoverfp intrinsic to llvm.eh.recoverfp
This is the clang counterpart to D56747.

Patch by Mandeep Singh Grang.

Differential Revision: https://reviews.llvm.org/D56748

llvm-svn: 351284
2019-01-16 00:50:44 +00:00
Aditya Nandakumar 500e3ead9f [GISel]: Add support for CSEing continuously during GISel passes.
https://reviews.llvm.org/D52803

This patch adds support to continuously CSE instructions during
each of the GISel passes. It consists of a GISelCSEInfo analysis pass
that can be used by the CSEMIRBuilder.

llvm-svn: 351283
2019-01-16 00:40:37 +00:00
Vlad Tsyrklevich e3226737ce Revert "[Tooling] Make clang-tool find libc++ dir on mac when running on a file without compilation database."
This reverts commits r351222 and r351229, they were causing ASan/MSan failures
on the sanitizer bots.

llvm-svn: 351282
2019-01-16 00:37:39 +00:00
Mandeep Singh Grang 436735c3fe [EH] Rename llvm.x86.seh.recoverfp intrinsic to llvm.eh.recoverfp
Summary:
Make recoverfp intrinsic target-independent so that it can be implemented for AArch64, etc.
Refer D53541 for the context. Clang counterpart D56748.

Reviewers: rnk, efriedma

Reviewed By: rnk, efriedma

Subscribers: javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D56747

llvm-svn: 351281
2019-01-16 00:37:13 +00:00
Jan Korous dca9c7cf24 [clangd] XPC transport layer
- New transport layer for macOS.
- XPC Framework
- Test client

Framework and client were written by Alex Lorenz.

Differential Revision: https://reviews.llvm.org/D54428

llvm-svn: 351280
2019-01-16 00:24:22 +00:00
Craig Topper 78e7fff56f [LangRef] Fix typo adress->address. NFC
llvm-svn: 351279
2019-01-16 00:21:59 +00:00
Craig Topper 0e420e6a62 [X86] Rename SHRUNKBLEND ISD node to BLENDV.
That's really what it is. If we didn't use intrinsics for BLENDVPS/BLENDVPD/PBLENDVB all the way to isel, this is the node we would use.

llvm-svn: 351278
2019-01-16 00:20:30 +00:00
Peter Collingbourne 32c46f7274 gn build: Add check-hwasan target.
The Android sanitizer tests are currently some of the most difficult
to run correctly, requiring at least 3 build directories which have
to be configured in just the right way and built in the correct order
(see e.g. [1] and the functions that it calls).

This patch adds a check-hwasan target which greatly simplifies running
the hwasan tests for gn users, taking advantage of its support for
multiple toolchains. With this the tests can be run simply by setting
an NDK path and running "ninja check-hwasan" with a compatible Android
device connected. The Linux/x86_64 and Android/aarch64 targets are
tested in parallel.

[1] https://github.com/llvm/llvm-zorg/blob/master/zorg/buildbot/builders/sanitizers/buildbot_android.sh

Differential Revision: https://reviews.llvm.org/D56713

llvm-svn: 351277
2019-01-16 00:15:25 +00:00
Alex Langford f510bc7972 [lldb-mi] Remove use of dialog box
Summary:
This really is only implemented on Windows, and it requires us to pull
in User32. This was only useful when debugging on lldb-mi on Windows, and there
doesn't seem to be a good reason why using a dialog box is better than what
exists for other platforms.

Reviewers: zturner, jingham, compnerd

Subscribers: ki.stfu

Differential Revision: https://reviews.llvm.org/D56755

llvm-svn: 351276
2019-01-16 00:09:50 +00:00
Craig Topper 34ac509ac8 [X86] Add avx512 scatter intrinsics that use a vXi1 mask instead of a scalar integer.
We're trying to have the vXi1 types in IR as much as possible. This prevents the need for bitcasts when the producer of the mask was already a vXi1 value like an icmp. The bitcasts can be subject to code motion and interfere with basic block at a time isel in bad ways.

llvm-svn: 351275
2019-01-15 23:36:25 +00:00
Adrian Prantl ee031dfacb Remove redundant check.
llvm-svn: 351274
2019-01-15 23:33:26 +00:00
Changpeng Fang 20fe3d2f35 AMDGPU: Raise the priority of MAD24 in instruction selection.
Summary:
  We have seen performance regression when v_add3 is generated. The major reason is that the v_mad pattern
is broken when v_add3 is generated. We also see the register pressure increased. While we could not properly
estimate register pressure during instruction selection, we can give mad a higher priority.

In this work, we raise the priority for mad24 in selection and resolve the performance regression.

Reviewers:
  rampitec

Differential Revision:
  https://reviews.llvm.org/D56745

llvm-svn: 351273
2019-01-15 23:12:36 +00:00
Stephen Kelly a4fd381dc2 Re-order type param children of ObjC nodes
Reviewers: aaron.ballman

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D55394

llvm-svn: 351272
2019-01-15 23:07:30 +00:00
Stephen Kelly e80e4cbd20 NFC: Some cleanups that I missed in the previous commit
llvm-svn: 351271
2019-01-15 23:05:11 +00:00
Peter Collingbourne f12c754b02 compiler-rt/test: Bring back -pie on Android.
Looks like the sanitizer-x86_64-linux-android bot started failing
because -pie is still needed when targeting API levels < 16 (which
is the case by default for arm and i686).

llvm-svn: 351270
2019-01-15 22:53:24 +00:00
Stephen Kelly 42d9950073 Re-order overrides in FunctionDecl dump
Output all content which is local to the FunctionDecl before traversing
to child AST nodes.

This is necessary so that all of the part which is local to the
FunctionDecl can be split into a different method.

Reviewers: aaron.ballman

Differential Revision: https://reviews.llvm.org/D55083

llvm-svn: 351269
2019-01-15 22:50:37 +00:00
Stephen Kelly b418937793 NFC: Replace iterator loop with cxx_range_for
llvm-svn: 351268
2019-01-15 22:45:46 +00:00
Jonas Devlieghere 7a16862745 [VFS] Add getter for mapping entries.
When generating a reproducer in LLDB we build up the mapping but don't
immediately copy over the files on the file system.

Rather than keeping a separate data structure with real and virtual
paths, we might as well reuse the entries already stored in the
YAMLVFSWriter to lazily copy over the files when needed.

llvm-svn: 351266
2019-01-15 22:36:56 +00:00
Jonas Devlieghere 1a0ce65ad3 [VFS] Move RedirectingFileSystem interface into header (NFC)
This moves the RedirectingFileSystem into the header so it can be
extended. This is needed in LLDB we need a way to obtain the external
path to deal with FILE* and file descriptor APIs.

Discussion on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-November/127755.html

Differential revision: https://reviews.llvm.org/D54277

llvm-svn: 351265
2019-01-15 22:36:41 +00:00
Adrian Prantl 222474b9ff Simplify code by using Optional::getValueOr()
llvm-svn: 351264
2019-01-15 22:30:01 +00:00
Alex Langford 76cb3089de [debugserver][CMake] Remove commented out line
This has been commented out since rL300111
(commit d742d081f3a1e7412cc609765139ba32d597ac15). Looks like it was
committed as a commented out line, so I'm removing it.

llvm-svn: 351263
2019-01-15 22:27:59 +00:00
Jonathan Metzman 9e14cccf6f [libFuzzer] Remove unstable edge handling
Summary:
Remove code for handling unstable edges from libFuzzer since
it has not been found useful.

Differential Revision: https://reviews.llvm.org/D56730

llvm-svn: 351262
2019-01-15 22:12:51 +00:00
Paul Hoad b16e288c7d [clang-tidy] add options documentation to readability-identifier-naming checker
Summary:
The documentation for this clang-checker did not explain what the options are. But this checkers only works with at least some options defined.

To discover the options, you have to read the source code. This shouldn't be necessary for users who just have access to the clang-tidy binary.

This revision, explains the options and gives an example.

Patch by MyDeveloperDay.

Reviewers: JonasToth, Eugene.Zelenko

Reviewed By: JonasToth

Subscribers: xazax.hun, Eugene.Zelenko

Differential Revision: https://reviews.llvm.org/D56563

llvm-svn: 351261
2019-01-15 22:06:49 +00:00
Peter Collingbourne 6498fbb22b compiler-rt/test: Add a couple of convenience features for Android.
Add a ANDROID_SERIAL_FOR_TESTING CMake variable. This lets you
run the tests with multiple devices attached without having to set
ANDROID_SERIAL.

Add a mechanism for pushing files to the device. Currently most
sanitizers require llvm-symbolizer and the sanitizer runtime to
be pushed to the device. This lets the sanitizer make this happen
automatically before running the tests by specifying the paths in
the lit.site.cfg file.

Differential Revision: https://reviews.llvm.org/D56712

llvm-svn: 351260
2019-01-15 22:06:48 +00:00
Jordan Rupprecht 20a817ea2a [libObject] Tweak expected error output from llvm-ar
llvm-svn: 351259
2019-01-15 22:03:08 +00:00
Peter Collingbourne 16dfb6c55b gn build: Add a stage2 host toolchain and make the hwasan runtime buildable on x86_64 Linux.
Differential Revision: https://reviews.llvm.org/D56711

llvm-svn: 351258
2019-01-15 22:02:12 +00:00
Rong Xu 3e9e7fb961 [profile] Sync up InstrProfData.inc with llvm copy /NFC
llvm-svn: 351257
2019-01-15 21:59:17 +00:00
Jordan Rupprecht 904ce9847d [llvm-ar] Resubmit recursive thin archive test with fix for full path names and better error messages
llvm-svn: 351256
2019-01-15 21:52:31 +00:00
Peter Collingbourne efe83db799 gn build: Add a resource_dir.gni file.
The path to the resource directory will end up being used in several
more places once the support for running check-hwasan lands. This
moves the definition to a central location so that it can be used
from those places.

Differential Revision: https://reviews.llvm.org/D56700

llvm-svn: 351255
2019-01-15 21:44:59 +00:00
Craig Topper b2729b14e4 [X86] Add the GCCBuiltin name back to the deprecated avx512 gather intrinsics until the clang side patch for the new versions is approved.
llvm-svn: 351254
2019-01-15 21:41:31 +00:00
Roman Lebedev fb4eed381d X86DAGToDAGISel::matchBitExtract() with truncation (PR36419)
Summary:
Previously in D54095 i have added support for extraction of `lshr` from `X` if we are to produce `BEXTR`.
That was good, but the fix was partial, there was still [[ https://bugs.llvm.org/show_bug.cgi?id=36419 | PR36419 ]].

That pattern can also appear, roughly, when you have a large (64-bit) storage, and the consume bits from it.
It will not be unexpected if you will be doing further computations in 32-bit width.
And then the current code breaks, as the tests show.

The basic idea/pattern here is following:
1. We have `i64` input
2. We perform `i64` right-shift on it.
3. We `trunc`ate that shifted value
4. We do all further work (masking) in `i32`

Since we see `trunc`ation and not `lshr`, we give up, and stop trying to extract that right-shift.
BUT. The mask is `i32`, therefore we can extend both of the operands of the masking (`and`) to `i64`
and truncate the result after masking: https://rise4fun.com/Alive/K4B
```
Name: @bextr64_32_b1 -> @bextr64_32_b0
  %shiftedval = lshr i64 %val, %numskipbits
  %truncshiftedval = trunc i64 %shiftedval to i32
  %widenumlowbits1 = zext i8 %numlowbits to i32
  %notmask1 = shl nsw i32 -1, %widenumlowbits1
  %mask1 = xor i32 %notmask1, -1
  %res = and i32 %truncshiftedval, %mask1
=>
  %shiftedval = lshr i64 %val, %numskipbits
  %widenumlowbits = zext i8 %numlowbits to i64
  %notmask = shl nsw i64 -1, %widenumlowbits
  %mask = xor i64 %notmask, -1
  %wideres = and i64 %shiftedval, %mask
  %res = trunc i64 %wideres to i32
```

Thus, we are again able to extract that `lshr` into `BEXTR`'s control.

Now, the perf (via `llvm-exegesis`) of the snippet suggests that it is not a good idea:
```
$ cat /tmp/old.s
# bextr64_32_b1
# LLVM-EXEGESIS-LIVEIN RSI
# LLVM-EXEGESIS-LIVEIN EDX
# LLVM-EXEGESIS-LIVEIN RDI
movq %rsi, %rcx
shrq %cl, %rdi
shll $8, %edx
bextrl %edx, %edi, %eax
$ cat /tmp/old.s | ./bin/llvm-exegesis -mode=latency -snippets-file=-
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-1e0082.o
---
mode:            latency
key:
  instructions:
    - 'MOV64rr RCX RSI'
    - 'SHR64rCL RDI RDI'
    - 'SHL32ri EDX EDX i_0x8'
    - 'BEXTR32rr EAX EDI EDX'
  config:          ''
  register_initial_values: []
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 0.6638, per_snippet_value: 2.6552 }
error:           ''
info:            ''
assembled_snippet: 4889F148D3EFC1E208C4E268F7C74889F148D3EFC1E208C4E268F7C74889F148D3EFC1E208C4E268F7C74889F148D3EFC1E208C4E268F7C7C3
...
$ cat /tmp/old.s | ./bin/llvm-exegesis -mode=uops -snippets-file=-
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-43e346.o
---
mode:            uops
key:
  instructions:
    - 'MOV64rr RCX RSI'
    - 'SHR64rCL RDI RDI'
    - 'SHL32ri EDX EDX i_0x8'
    - 'BEXTR32rr EAX EDI EDX'
  config:          ''
  register_initial_values: []
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: PdFPU0, value: 0, per_snippet_value: 0 }
  - { key: PdFPU1, value: 0, per_snippet_value: 0 }
  - { key: PdFPU2, value: 0, per_snippet_value: 0 }
  - { key: PdFPU3, value: 0, per_snippet_value: 0 }
  - { key: NumMicroOps, value: 1.2571, per_snippet_value: 5.0284 }
error:           ''
info:            ''
assembled_snippet: 4889F148D3EFC1E208C4E268F7C74889F148D3EFC1E208C4E268F7C74889F148D3EFC1E208C4E268F7C74889F148D3EFC1E208C4E268F7C7C3
...
```
vs
```
$ cat /tmp/new.s
# bextr64_32_b1
# LLVM-EXEGESIS-LIVEIN RDX
# LLVM-EXEGESIS-LIVEIN SIL
# LLVM-EXEGESIS-LIVEIN RDI
shlq $8, %rdx
movzbl %sil, %eax
orq %rdx, %rax
bextrq %rax, %rdi, %rax
$ cat /tmp/new.s | ./bin/llvm-exegesis -mode=latency -snippets-file=-
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-8944f1.o
---
mode:            latency
key:
  instructions:
    - 'SHL64ri RDX RDX i_0x8'
    - 'MOVZX32rr8 EAX SIL'
    - 'OR64rr RAX RAX RDX'
    - 'BEXTR64rr RAX RDI RAX'
  config:          ''
  register_initial_values: []
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 0.7454, per_snippet_value: 2.9816 }
error:           ''
info:            ''
assembled_snippet: 48C1E208400FB6C64809D0C4E2F8F7C748C1E208400FB6C64809D0C4E2F8F7C748C1E208400FB6C64809D0C4E2F8F7C748C1E208400FB6C64809D0C4E2F8F7C7C3
...
$ cat /tmp/new.s | ./bin/llvm-exegesis -mode=uops -snippets-file=-
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-da403c.o
---
mode:            uops
key:
  instructions:
    - 'SHL64ri RDX RDX i_0x8'
    - 'MOVZX32rr8 EAX SIL'
    - 'OR64rr RAX RAX RDX'
    - 'BEXTR64rr RAX RDI RAX'
  config:          ''
  register_initial_values: []
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: PdFPU0, value: 0, per_snippet_value: 0 }
  - { key: PdFPU1, value: 0, per_snippet_value: 0 }
  - { key: PdFPU2, value: 0, per_snippet_value: 0 }
  - { key: PdFPU3, value: 0, per_snippet_value: 0 }
  - { key: NumMicroOps, value: 1.2571, per_snippet_value: 5.0284 }
error:           ''
info:            ''
assembled_snippet: 48C1E208400FB6C64809D0C4E2F8F7C748C1E208400FB6C64809D0C4E2F8F7C748C1E208400FB6C64809D0C4E2F8F7C748C1E208400FB6C64809D0C4E2F8F7C7C3
...
```
^ latency increased (worse).

Except //maybe// not really.
Like with all synthetic benchmarks, they //may// be misleading.

Let's take a look on some actual real-world hotpath.
In this case it's 'my' [[ https://github.com/darktable-org/rawspeed | RawSpeed ]]'s `BitStream<>::peekBitsNoFill()`, in [[ e3316dc851/src/librawspeed/decompressors/VC5Decompressor.cpp (L814) | GoPro VC5 decompressor ]]:
```
raw.pixls.us-unique/GoPro/HERO6 Black$ /usr/src/googlebenchmark/tools/compare.py -a benchmarks ~/rawspeed/build-clangs1-{old,new}/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_min_time=0.00000001 --benchmark_repetitions=128 GOPR9172.GPR
RUNNING: /home/lebedevri/rawspeed/build-clangs1-old/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_min_time=0.00000001 --benchmark_repetitions=128 GOPR9172.GPR --benchmark_display_aggregates_only=true --benchmark_out=/tmp/tmplwbKEM
2018-12-22 21:23:03
Running /home/lebedevri/rawspeed/build-clangs1-old/src/utilities/rsbench/rsbench
Run on (8 X 4012.81 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 3.41, 2.41, 2.03
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                        Time           CPU Iterations  CPUTime,s CPUTime/WallTime     Pixels Pixels/CPUTime Pixels/WallTime Raws/CPUTime Raws/WallTime WallTime,s
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GOPR9172.GPR/threads:8/real_time_mean           40 ms         40 ms        128   0.322244          7.96974        12M       37.4457M        298.534M      3.12047       24.8778   0.040465
GOPR9172.GPR/threads:8/real_time_median         39 ms         39 ms        128   0.312606          7.99155        12M        38.387M        306.788M      3.19891       25.5656   0.039115
GOPR9172.GPR/threads:8/real_time_stddev          4 ms          3 ms        128  0.0271557         0.130575          0        2.4941M        21.3909M     0.207842       1.78257   3.81081m
RUNNING: /home/lebedevri/rawspeed/build-clangs1-new/src/utilities/rsbench/rsbench --benchmark_counters_tabular=true --benchmark_min_time=0.00000001 --benchmark_repetitions=128 GOPR9172.GPR --benchmark_display_aggregates_only=true --benchmark_out=/tmp/tmpWAkan9
2018-12-22 21:23:08
Running /home/lebedevri/rawspeed/build-clangs1-new/src/utilities/rsbench/rsbench
Run on (8 X 4013.1 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 3.78, 2.50, 2.06
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                        Time           CPU Iterations  CPUTime,s CPUTime/WallTime     Pixels Pixels/CPUTime Pixels/WallTime Raws/CPUTime Raws/WallTime WallTime,s
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GOPR9172.GPR/threads:8/real_time_mean           39 ms         39 ms        128   0.311533          7.97323        12M       38.6828M        308.471M      3.22356        25.706  0.0390928
GOPR9172.GPR/threads:8/real_time_median         38 ms         38 ms        128   0.304231          7.99005        12M       39.4437M        315.527M      3.28698        26.294  0.0380316
GOPR9172.GPR/threads:8/real_time_stddev          3 ms          3 ms        128  0.0229149         0.133814          0       2.26225M        19.1421M     0.188521       1.59517   3.13671m
Comparing /home/lebedevri/rawspeed/build-clangs1-old/src/utilities/rsbench/rsbench to /home/lebedevri/rawspeed/build-clangs1-new/src/utilities/rsbench/rsbench
Benchmark                                                 Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------------
GOPR9172.GPR/threads:8/real_time_pvalue                 0.0000          0.0000      U Test, Repetitions: 128 vs 128
GOPR9172.GPR/threads:8/real_time_mean                  -0.0339         -0.0316            40            39            40            39
GOPR9172.GPR/threads:8/real_time_median                -0.0277         -0.0274            39            38            39            38
GOPR9172.GPR/threads:8/real_time_stddev                -0.1769         -0.1267             4             3             3             3
```
I.e. this results in //roughly// -3% improvements in perf.

While this will help [[ https://bugs.llvm.org/show_bug.cgi?id=36419 | PR36419 ]], it won't address it fully.

Reviewers: RKSimon, craig.topper, andreadb, spatel

Reviewed By: craig.topper

Subscribers: courbet, llvm-commits

Differential Revision: https://reviews.llvm.org/D56052

llvm-svn: 351253
2019-01-15 21:31:18 +00:00
Peter Collingbourne f6627ce834 compiler-rt/test: Clean up Android specific workarounds in lit.common.cfg.
-pie -Wl,--enable-new-dtags are no longer needed because
the driver passes them by default as of r316606.

Prepend -fuse-ld=gold instead of appending it so that the linker can
be overridden using COMPILER_RT_TEST_COMPILER_CFLAGS.

Differential Revision: https://reviews.llvm.org/D56697

llvm-svn: 351252
2019-01-15 21:27:44 +00:00
David Callahan d129d3e93f treat invoke like call
Summary:
InvokeInst should be treated like CallInst and
assigned a separate discriminator. This is particularly
import when an Invoke is converted to a Call
during compilation and so can invalidate sample profile
data collected wtih different link time optimizations

Reviewers: twoh, Kader, danielcdh, wmi

Reviewed By: wmi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D56491

llvm-svn: 351251
2019-01-15 21:26:51 +00:00
Adrian Prantl 5b73c9bf27 Simplify Value::GetValueByteSize()
llvm-svn: 351250
2019-01-15 21:26:03 +00:00
Reid Kleckner ede93da6fe [clang-cl] Alias /Zc:alignedNew[-] to -f[no-]aligned-allocation
Implements PR40180.

clang-cl has one minor behavior difference with cl with this change.
Clang allows the user to enable the C++17 feature of aligned allocation
without enabling all of C++17, but MSVC will not call the aligned
allocation overloads unless -std:c++17 is passed. While our behavior is
technically incompatible, it would require making driver mode specific
changes to match MSVC precisely, and clang's behavior is useful because
it allows people to experiment with new C++17 features individually.
Therefore, I plan to leave it as is.

llvm-svn: 351249
2019-01-15 21:24:55 +00:00
Peter Collingbourne e6b1a3418d gn build: Move target flags from toolchain to a .gni file.
While here, add a use_lld flag and default it to true when using
clang on non-mac.

Differential Revision: https://reviews.llvm.org/D56710

llvm-svn: 351248
2019-01-15 21:24:00 +00:00
Matt Morehouse 19ff35c481 [SanitizerCoverage] Don't create comdat for interposable functions.
Summary:
Comdat groups override weak symbol behavior, allowing the linker to keep
the comdats for weak symbols in favor of comdats for strong symbols.

Fixes the issue described in:
https://bugs.chromium.org/p/chromium/issues/detail?id=918662

Reviewers: eugenis, pcc, rnk

Reviewed By: pcc, rnk

Subscribers: smeenai, rnk, bd1976llvm, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D56516

llvm-svn: 351247
2019-01-15 21:21:01 +00:00
Peter Collingbourne 4a22fb18c7 gn build: Add build files for compiler-rt/lib/{hwasan,interception,sanitizer_common,ubsan}.
This allows the hwasan runtime to be built for Android aarch64.

Differential Revision: https://reviews.llvm.org/D56628

llvm-svn: 351246
2019-01-15 21:08:21 +00:00