getLoopID has different control flow for two cases: If there is a
single loop latch and for any other number of loop latches (0 and more
than one). The latter case should return the same result if there is
only a single latch. We can save the preceding redundant search for a
latch by handling both cases with the same code.
Differential Revision: https://reviews.llvm.org/D52118
llvm-svn: 342406
We were mapping an instruction every time we saw something we couldn't map
before this. Since each illegal mapping is unique, we only have to do this once.
This makes it so that we don't map illegal instructions when the previous
mapped instruction was illegal.
In CTMark (AArch64), this results in 240 fewer instruction mappings on
average over 619 files in total. The largest improvement is 12576 fewer
mappings in one file, and the smallest is 0. The median improvement is 101
fewer mappings.
llvm-svn: 342405
Summary:
The IR reference for the `byval` attribute states:
```
This indicates that the pointer parameter should really be passed by value
to the function. The attribute implies that a hidden copy of the pointee is
made between the caller and the callee, so the callee is unable to modify
the value in the caller. This attribute is only valid on LLVM pointer arguments.
```
However, on Win64, this attribute is unimplemented and the raw pointer is
passed to the callee instead. This is problematic, because frontend authors
relying on the implicit hidden copy (as happens for every other calling
convention) will see the passed value silently (if mutable memory) or
loudly (by means of a crash) modified because the callee treats the
location as scratch memory space it is allowed to mutate.
At this point, it's worth taking a step back to understand the context.
In most calling conventions, aggregates that are too large to be passed
in registers, instead get *copied* to the stack at a fixed (computable
from the signature) offset of the stack pointer. At the LLVM, we hide
this hidden copy behind the byval attribute. The caller passes a pointer
to the desired data and the callee receives a pointer, but these pointers
are not the same. In particular, the pointer that the callee receives
points to temporary stack memory allocated as part of the call lowering.
In most calling conventions, this pointer is never realized in registers
or memory. The temporary memory is simply defined by an implicit
offset from the stack pointer at function entry.
Win64, uniquely, works differently. The structure is still passed in
memory, but instead of being stored at an implicit memory offset, the
caller computes a pointer to the temporary memory and passes it to
the callee as a regular pointer (taking up a register, or if all
registers are taken up, an additional stack slot). Presumably, this
was done to allow eliding the copy when passing aggregates through
several functions on the stack.
This explains why ignoring the `byval` attribute mostly works on Win64.
The argument simply gets passed as a pointer and as long as we're ok
with the callee trampling all over that memory, there are no ill effects.
However, it does contradict the documentation of the `byval` attribute
which specifies that there is to be an implicit copy.
Frontends can of course work around this by never emitting the `byval`
attribute for Win64 and creating `alloca`s for the requisite temporary
stack slots (and that does appear to be what frontends are doing).
However, the presence of the `byval` attribute is not a trap for
frontend authors, since it seems to work, but silently modifies the
passed memory contrary to documentation.
I see two solutions:
- Disallow the `byval` attribute in the verifier if using the Win64
calling convention.
- Make it work by simply emitting a temporary stack copy as we would
with any other calling convention (frontends can of course always
not use the attribute if they want to elide the copy).
This patch implements the second option (make it work), though I would
be fine with the first also.
Ref: https://github.com/JuliaLang/julia/issues/28338
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51842
llvm-svn: 342402
dllimported symbols go through an import stub that's called __imp_ followed by
the name the stub points to. Make that work.
Differential Revision: https://reviews.llvm.org/D52145
llvm-svn: 342401
isSupportedValue explicitly checked and accepted many types of value,
primarily for debugging reasons. Remove most of these checks and do a
bit of refactoring now that the pass is more stable. This also enables
ZExts to be sources, but this has very little practical benefit at the
moment extend instructions will still be introduced.
Differential Revision: https://reviews.llvm.org/D52080
llvm-svn: 342395
Summary:
PR37913 documents wrong behaviour for a templated exception factory function.
The check does misidentify dependent types as not derived from std::exception.
The fix to this problem is to ignore dependent types, the analysis works correctly
on the instantiated function.
Reviewers: aaron.ballman, alexfh, hokein, ilya-biryukov
Reviewed By: alexfh
Subscribers: lebedev.ri, nemanjai, mgorny, kbarton, xazax.hun, cfe-commits
Differential Revision: https://reviews.llvm.org/D48714
llvm-svn: 342393
We allow overflowing instructions if they're decreasing and only used
by an unsigned compare. Add the extra condition that the icmp cannot
be using a negative immediate.
Differential Revision: https://reviews.llvm.org/D52102
llvm-svn: 342392
Summary:
In order for this test to work the log file needs to be removed from both
from the host and device. To fix this the `rm` `RUN` lines have been
replaced with `RUN: rm` followed by `RUN: %device_rm`.
Initially I tried having it so that `RUN: %run rm` implicitly runs `rm`
on the host as well so that only one `RUN` line is needed. This
simplified writing the test however that had two large drawbacks.
* It's potentially very confusing (e.g. for use of the device scripts outside
of the lit tests) if asking for `rm` to run on device also causes files
on the host to be deleted.
* This doesn't work well with the glob patterns used in the test.
The host shell expands the `%t.log.*` glob pattern and not on the
device so we could easily miss deleting old log files from previous
test runs if the corresponding file doesn't exist on the host.
So instead deletion of files on the device and host are explicitly
separate commands.
The command to delete files from a device is provided by a new
substitution `%device_rm` as suggested by Filipe Cabecinhas.
The semantics of `%device_rm` are that:
* It provides a way remove files from a target device when
the host is not the same as the target. In the case that the
host and target are the same it is a no-op.
* It interprets shell glob patterns in the context of the device
file system instead of the host file system.
This solves the globbing problem provided the argument is quoted so
that lit's underlying shell doesn't try to expand the glob pattern.
* It supports the `-r` and `-f` flags of the `rm` command,
with the same semantics.
Right now an implementation of `%device_rm` is provided only for
ios devices. For all other devices a lit warning is emitted and
the `%device_rm` is treated as a no-op. This done to avoid changing
the behaviour for other device types but leaves room for others
to implement `%device_rm`.
The ios device implementation uses the `%run` wrapper to do the work
of removing files on a device.
The `iossim_run.py` script has been fixed so that it just runs `rm`
on the host operating system because the device and host file system
are the same.
rdar://problem/41126835
Reviewers: vsk, kubamracek, george.karpenkov, eugenis
Subscribers: #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D51648
llvm-svn: 342391
Summary:
Hello, i would like to suggest a fix for one of the checks in clang-tidy.The bug was reported in https://bugs.llvm.org/show_bug.cgi?id=32575 where you can find more information.
For example:
```
template <typename T0>
struct S {
template <typename T>
void g() const {
int a;
(void)a;
}
};
void f() {
S<int>().g<int>();
}
```
this piece of code should not trigger any warning by the check modernize-redundant-void-arg but when we execute the following command
```
clang_tidy -checks=-*,modernize-redundant-void-arg test.cpp -- -std=c++11
```
we obtain the following warning:
/Users/eco419/Desktop/clang-tidy.project/void-redundand_2/test.cpp:6:6: warning: redundant void argument list in function declaration [modernize-redundant-void-arg]
(void)a;
^~~~
Reviewers: aaron.ballman, hokein, alexfh, JonasToth
Reviewed By: aaron.ballman, JonasToth
Subscribers: JonasToth, lebedev.ri, cfe-commits
Tags: #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D52135
llvm-svn: 342388
Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912
has been fixed by rL342055.
Precommit testing performed:
* Overnight runs of csmith comparing the output between programs
compiled with gvn-hoist enabled/disabled.
* Bootstrap builds of clang with UbSan/ASan configurations.
llvm-svn: 342387
Create a temporary file in the system temporary directory instead of creating a
file in the current directory, which may be not writable. (Fix for an issue
introduced in r342283.)
llvm-svn: 342386
Summary:
Completing inside the expression command now uses the new description API
to also provide additional information to the user. For now this information
are the types of variables/fields and the signatures of completed function calls.
Reviewers: #lldb, JDevlieghere
Reviewed By: JDevlieghere
Subscribers: JDevlieghere, lldb-commits
Differential Revision: https://reviews.llvm.org/D52103
llvm-svn: 342385
Summary:
The init expression of a VarDecl is overwritten in the "To" context if we
import a VarDecl without an init expression (and with a definition). Please
refer to the added tests, especially InitAndDefinitionAreInDifferentTUs. This
patch fixes the malfunction by importing the whole Decl chain similarly as we
did that in case of FunctionDecls. We handle the init expression similarly to
a definition, alas only one init expression will be in the merged ast.
Reviewers: a_sidorin, xazax.hun, r.stahl, a.sidorin
Subscribers: rnkovacs, dkrupp, cfe-commits
Differential Revision: https://reviews.llvm.org/D51597
llvm-svn: 342384
Summary: This will be useful to generate many configurations and test instruction regimes (NaN, Inf, subnormal, normal).
Reviewers: courbet
Subscribers: mgorny, tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D51858
llvm-svn: 342369
Summary:
Merged the recently added `err_attribute_argument_negative` diagnostic
with existing `err_attribute_requires_positive_integer` diagnostic:
the former allows only strictly positive integer, while the latter
also allows zero.
Reviewers: aaron.ballman
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D51853
llvm-svn: 342367
The original was reverted due to an apparent build-bot test failure,
but it looks like this is just a flaky test.
Also added a C-interface function for large values, and updated
llvm-lto's --thinlto-cache-max-size-bytes switch to take a type larger
than int.
The maximum cache size in terms of bytes is a 64-bit number. However,
the methods to set it only took unsigned previously, which meant that
the maximum cache size could not be specified above 4GB. That's quite
small compared to the output of some projects, so it makes sense to
provide the ability to set larger values in that field.
We also needed a C-interface function that provides a greater range
than the existing thinlto_codegen_set_cache_size_bytes, which also only
takes an unsigned, so this change also adds
hinlto_codegen_set_cache_size_megabytes.
Reviewed by: mehdi_amini, tejohnson, steven_wu
Differential Revision: https://reviews.llvm.org/D52023
llvm-svn: 342366
This patch defines a new substitution and uses it to reduce
duplication in the Clang Analyzer test cases.
Differential Revision: https://reviews.llvm.org/D52036
llvm-svn: 342365
This diff adds -S as an alias for --strip-all-gnu
(for compatibility with binutils' objcopy).
Patch by Dmitry Golovin!
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D52163
llvm-svn: 342364
Summary:
To exclude thirdparty code.
To test:
With /tmp/foo.c
```
void test() {
int x;
x = 1; // warn
}
```
```
$ scan-build --exclude non-existing/ --exclude /tmp/ -v gcc -c foo.c
scan-build: Using '/usr/lib/llvm-7/bin/clang' for static analysis
scan-build: Emitting reports for this run to '/tmp/scan-build-2018-09-16-214531-8410-1'.
foo.c:3:3: warning: Value stored to 'x' is never read
x = 1; // warn
^ ~
1 warning generated.
scan-build: File '/tmp/foo.c' deleted: part of an ignored directory.
scan-build: 0 bugs found.
```
Reviewers: jroelofs
Reviewed By: jroelofs
Subscribers: whisperity, cfe-commits
Differential Revision: https://reviews.llvm.org/D52153
llvm-svn: 342359
This can be used to detect whether the code is being built with XRay
instrumentation using the __has_feature(xray_instrument) predicate.
Differential Revision: https://reviews.llvm.org/D52159
llvm-svn: 342358
Support for .preinit_array has been implemented in Fuchsia's libc,
add Fuchsia to the list of platforms that support this feature.
Differential Revision: https://reviews.llvm.org/D52155
llvm-svn: 342357
Summary:
This change makes XRay FDR mode use a single backing store for the
buffer queue, and have indexes into that backing store instead. We also
remove the reliance on the internal allocator implementation in the FDR
mode logging implementation.
In the process of making this change we found an inconsistency with the
way we're returning buffers to the queue, and how we're setting the
extents. We take the chance to simplify the way we're managing the
extents of each buffer. It turns out we do not need the indirection for
the extents, so we co-host the atomic 64-bit int with the buffer object.
It also seems that we've not been returning the buffers for the thread
running the flush functionality when writing out the files, so we can
run into a situation where we could be missing data.
We consolidate all the allocation routines now into xray_allocator.h,
where we used to have routines defined in xray_buffer_queue.cc.
Reviewers: mboerger, eizan
Subscribers: jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D52077
llvm-svn: 342356
std::vector::iterator type may be a pointer, then
iterator::value_type fails to compile since iterator is not a class,
namespace, or enumeration.
Patch by orivej (Orivej Desh)
Differential Revision: https://reviews.llvm.org/D52142
llvm-svn: 342354
For constant non-uniform cases we'll never introduce more and/andn/or selects than already occur in generic pre-SSE41 ISD::SRL lowering.
llvm-svn: 342352
This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040.
Copying the motivational statement by @evandro from that patch:
"This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp,
447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks.
Otherwise, no regressions on x86-64 or A64."
I'm proposing to add only the minimum support for a DAG node here. Since we don't have an
LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I
don't think we need to worry about DAG builder, legalization, a strict variant, etc. We
should be able to expand as needed when adding more functionality/transforms. For reference,
these are transform suggestions currently listed in SimplifyLibCalls.cpp:
// * cbrt(expN(X)) -> expN(x/3)
// * cbrt(sqrt(x)) -> pow(x,1/6)
// * cbrt(cbrt(x)) -> pow(x,1/9)
Also, given that we bail out on long double for now, there should not be any logical
differences between platforms (unless there's some platform out there that has pow()
but not cbrt()).
Differential Revision: https://reviews.llvm.org/D51753
llvm-svn: 342348
https://bugs.llvm.org/show_bug.cgi?id=38949
It's not clear to me that we even need a one-use check in this fold.
Ie, 2 independent loads might be better than a load+dependent shuffle.
Note that the existing re-use tests are not affected. We actually do form a
broadcast node in those tests now because there's no extra use of the
insert_subvector node in those cases. But something later in isel pattern
matching decides that it is not worth using a broadcast for the full load in
those tests:
Legalized selection DAG: %bb.0 'test_broadcast_2f64_4f64_reuse:'
t7: v2f64,ch = load<(load 16 from %ir.p0)> t0, t2, undef:i64
t4: i64,ch = CopyFromReg t0, Register:i64 %1
t10: ch = store<(store 16 into %ir.p1)> t7:1, t7, t4, undef:i64
t18: v4f64 = insert_subvector undef:v4f64, t7, Constant:i64<0>
t20: v4f64 = insert_subvector t18, t7, Constant:i64<2>
Becomes:
t7: v2f64,ch = load<(load 16 from %ir.p0)> t0, t2, undef:i64
t4: i64,ch = CopyFromReg t0, Register:i64 %1
t10: ch = store<(store 16 into %ir.p1)> t7:1, t7, t4, undef:i64
t21: v4f64 = X86ISD::SUBV_BROADCAST t7
ISEL: Starting selection on root node: t21: v4f64 = X86ISD::SUBV_BROADCAST t7
...
Created node: t27: v4f64 = INSERT_SUBREG IMPLICIT_DEF:v4f64, t7, TargetConstant:i32<7>
Morphed node: t21: v4f64 = VINSERTF128rr t27, t7, TargetConstant:i8<1>
llvm-svn: 342347