Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
LowerMemIntrinsics pass to cease using the old getAlignment() API of MemoryIntrinsic in
favour of getting source & dest specific alignments through the new API.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 324278
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
SimplifyLibCalls pass to cease using the old IRBuilder createMemCpy/createMemMove
single-alignment APIs in favour of the new API that allows setting source and destination
alignments independently.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, r3L24148 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 324273
The major visible difference here is that in line-table dumps,
directory and file names are wrapped in double-quotes; previously,
directory names got single quotes and file names were not quoted at
all.
The improvement in this patch is that when a DWARF v5 line table
header has indirect strings, in a verbose dump these will all have
their section[offset] printed as well as the name itself. This
matches the format used for dumping strings in the .debug_info
section.
Differential Revision: https://reviews.llvm.org/D42802
llvm-svn: 324270
The 'trivial_abi' attribute can be applied to a C++ class, struct, or
union. It makes special functions of the annotated class (the destructor
and copy/move constructors) to be trivial for the purpose of calls and,
as a result, enables the annotated class or containing classes to be
passed or returned using the C ABI for the underlying type.
When a type that is considered trivial for the purpose of calls despite
having a non-trivial destructor (which happens only when the class type
or one of its subobjects is a 'trivial_abi' class) is passed to a
function, the callee is responsible for destroying the object.
For more background, see the discussions that took place on the mailing
list:
http://lists.llvm.org/pipermail/cfe-dev/2017-November/055955.htmlhttp://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180101/thread.html#214043
rdar://problem/35204524
Differential Revision: https://reviews.llvm.org/D41039
llvm-svn: 324269
Summary:
The 32-bit division breaks SizeClassAllocator64PopulateFreeListOOM which uses
Primary that has a maximum size > 32-bit.
Reviewers: alekseyshl
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D42928
llvm-svn: 324268
As PR36225 shows, we definitely don't want to enable the
canEvaluate* logic with phis.
There's still a question of whether we should just revert
r324014 completely because it exposes a compile-time sinkhole
(although that problem might exist independently).
llvm-svn: 324266
Summary:
The signatures for the builtins @llvm.memcpy, @llvm.memmove, and @llvm.memset
where changed in rL322965. The number of arguments has decreased from five to
four with the removal of the alignment argument. Alignment is now conveyed
by supplying the align parameter attribute on the destination and/or source of
the cpy/move/set.
llvm-svn: 324265
When using Elf_Rela every tool should use the addend in the
relocation.
We have --apply-dynamic-relocs to work around bugs in tools that don't
do that.
The default value of --apply-dynamic-relocs should be false to make
sure these bugs are more easily found in the future.
llvm-svn: 324264
Summary:
In `ClassID`, make sure we use an unsigned as based for the `lbits` shift.
The previous code resulted in spurious sign extensions like for x64:
```
add esi, 0FFFFFFFFh
movsxd rcx, esi
and rcx, r15
```
The code with the `U` added is:
```
add esi, 0FFFFFFFFh
and rsi, r15
```
And for `MaxCachedHint`, use a 32-bit division instead of `64-bit`, which is
faster (https://lemire.me/blog/2017/11/16/fast-exact-integer-divisions-using-floating-point-operations/)
and already used in other parts of the code (64-bit `GetChunkIdx`, 32-bit
`GetMetaData` enforce 32-bit divisions)
Not major performance gains by any mean, but they don't hurt.
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: kubamracek, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D42916
llvm-svn: 324263
This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate.
This also allows us to match an and to a movzx after a not.
This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT.
llvm-svn: 324260
Summary:
Here are a few improvements proposed for the local cache:
- `InitCache` always read from `per_class_[1]` in the fast path. This was not
ideal as we are working with `per_class_[class_id]`. The latter offers the
same property we are looking for (eg: `max_count != 0` means initialized),
so we might as well use it and keep our memory accesses local to the same
`per_class_` element. So change `InitCache` to take the current `PerClass`
as an argument. This also makes the fast-path assembly of `Deallocate` a lot
more compact;
- Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit
counterparts, by passing the current `PerClass` as an argument. This saves
some array computations;
- As far as I can tell, `InitCache` has no place in `Drain`: it's either called
from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which
checks for `c->count` to be greater than 0 (strictly). So remove it there.
- Move the `stats_` updates to after we are done with the `per_class_` accesses
in an attempt to preserve locality once more;
- Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in
the fast path and seemed to be overly cautious failsafes;
- Mark some variables as `const`.
The overall result is cleaner more compact fast path generated code, and some
performance gains with Scudo (and likely other Sanitizers).
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D42851
llvm-svn: 324257
Davide pointed out this would be useful if the file ever needs to be
regenerated (and I certainly agree).
I also replace the test binary with a slightly smaller one -- I intended
to do this in the original commit, but I forgot to add it to the patch
as I was juggling several things at the same time.
llvm-svn: 324256
This is the instcombine part of unsigned saturation canonicalization.
Backend patches already commited:
https://reviews.llvm.org/D37510https://reviews.llvm.org/D37534
It converts unsigned saturated subtraction patterns to forms recognized
by the backend:
(a > b) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b < a) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b > a) ? 0 : a - b -> ((a > b) ? a : b) - b)
(a < b) ? 0 : a - b -> ((a > b) ? a : b) - b)
((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
Patch by Yulia Koval!
Differential Revision: https://reviews.llvm.org/D41480
llvm-svn: 324255
ObjectFileELF::GetModuleSpecifications contained a lot of tip-toing code
which was trying to avoid loading the full object file into memory. It
did this by trying to load data only up to the offset if was accessing.
However, in practice this was useless, as 99% of object files we
encounter have section headers at the end, so we would load the whole
file as soon as we start parsing the section headers.
In fact, this would break as soon as we encounter a file which does
*not* have section headers at the end (yaml2obj produces these), as the
access to .strtab (which we need to get the section names) was not
guarded by this offset check.
As this strategy was completely ineffective anyway, I do not attempt to
proliferate it further by guarding the .strtab accesses. Instead I just
lead the full file as soon as we are reasonably sure that we are indeed
processing an elf file.
If we really care about the load size here, we would need to reimplement
this to just load the bits of the object file we need, instead of
loading everything from the start of the object file to the given
offset. However, given that the OS will do this for us for free when
using mmap, I think think this is really necessary.
For testing this I check a (tiny) SO file instead of yaml2obj-ing it
because the fact that they come out first is an implementation detail of
yaml2obj that can change in the future.
llvm-svn: 324254
There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi
instructions that might have repeated operands. This is likely a source of an infinite loop.
I haven't manufactured a test case to prove that, but it should be safe to speculatively limit
this transform to binops while we try to create that test.
llvm-svn: 324252
If the upper 32 bits of a 64 bit mask are all zeros, we have special isel patterns to use a 32-bit and instead of a 64-bit and by relying on the impliciting zeroing of 32 bit ops.
This patch teachs shrinkAndImmediate not to break that optimization.
Differential Revision: https://reviews.llvm.org/D42899
llvm-svn: 324249
This broke the Chromium build; see PR36238.
> This patch is an enhancement to propagate dbg.value information when
> Phis are created on behalf of LCSSA. I noticed a case where a value
> carried across a loop was reported as <optimized out>.
>
> Specifically this case:
>
> int bar(int x, int y) {
> return x + y;
> }
>
> int foo(int size) {
> int val = 0;
> for (int i = 0; i < size; ++i) {
> val = bar(val, i); // Both val and i are correct
> }
> return val; // <optimized out>
> }
>
> In the above case, after all of the interesting computation completes
> our value is reported as "optimized out." This change will add a
> dbg.value to correct this.
>
> This patch also moves the dbg.value insertion routine from
> LoopRotation.cpp into Local.cpp, so that we can share it in both places
> (LoopRotation and LCSSA).
>
> Patch by Matt Davis!
>
> Differential Revision: https://reviews.llvm.org/D42551
llvm-svn: 324247
Summary:
When a preprocessor indent closes after the last line of normal code we do not
correctly fixup include guard indents. For example:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
#endif
#endif
incorrectly reformats to:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
# endif
#endif
To resolve this issue we must fixup levels after parseFile(). Delaying
the fixup introduces a new state, so consolidate include guard search
state into an enum.
Reviewers: krasimir, klimek
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42035
llvm-svn: 324246
The function shuffp2 was breaking up a wide shuffle into a pair of
narrower ones, except that the narrower shuffle masks were actually
uninitialized.
llvm-svn: 324243
Summary:
This complements the fixes in r323633 and r324075 which drop the
definitions of dead functions and variables, respectively.
Fixes PR36208.
Reviewers: grimar, rafael
Subscribers: mehdi_amini, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D42856
llvm-svn: 324242
Summary:
When a preprocessor indent closes after the last line of normal code we do not
correctly fixup include guard indents. For example:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
#endif
#endif
incorrectly reformats to:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
# endif
#endif
To resolve this issue we must fixup levels after parseFile(). Delaying
the fixup introduces a new state, so consolidate include guard search
state into an enum.
Reviewers: krasimir, klimek
Reviewed By: krasimir
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42035
llvm-svn: 324238
Summary:
We cannot call process_up->SetState() inside
the NativeProcessNetBSD::Factory::Launch
function because it triggers a NULL pointer
deference.
The generic code for launching a process in:
GDBRemoteCommunicationServerLLGS::LaunchProcess
sets the m_debugged_process_up pointer after
a successful call to m_process_factory.Launch().
If we attempt to call process_up->SetState()
inside a platform specific Launch function we
end up dereferencing a NULL pointer in
NativeProcessProtocol::GetCurrentThreadID().
Use the proper call process_up->SetState(,false)
that sets notify_delegates to false.
Sponsored by <The NetBSD Foundation>
Reviewers: labath, joerg
Reviewed By: labath
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D42868
llvm-svn: 324234
We've had a bug (fixed by https://reviews.llvm.org/D42828) where the
thread name was being read incorrectly. Add a test for this behavior.
llvm-svn: 324230
PPCCTRLoops transform loops using mtctr/bdnz instructions if loop trip count is known and big enough to compensate for the cost of mtctr.
But if there is a loop exit edge which is known to be frequently taken (by builtin_expect or by PGO), we should not transform the loop to avoid the cost of mtctr instruction. Here is an example of a loop with hot exit edge:
for (unsigned i = 0; i < TripCount; i++) {
// do something
if (__builtin_expect(check(), 1))
break;
// do something
}
Differential Revision: https://reviews.llvm.org/D42637
llvm-svn: 324229