This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate.
This also allows us to match an and to a movzx after a not.
This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT.
llvm-svn: 324260
Summary:
Here are a few improvements proposed for the local cache:
- `InitCache` always read from `per_class_[1]` in the fast path. This was not
ideal as we are working with `per_class_[class_id]`. The latter offers the
same property we are looking for (eg: `max_count != 0` means initialized),
so we might as well use it and keep our memory accesses local to the same
`per_class_` element. So change `InitCache` to take the current `PerClass`
as an argument. This also makes the fast-path assembly of `Deallocate` a lot
more compact;
- Change the 32-bit `Refill` & `Drain` functions to mimic their 64-bit
counterparts, by passing the current `PerClass` as an argument. This saves
some array computations;
- As far as I can tell, `InitCache` has no place in `Drain`: it's either called
from `Deallocate` which calls `InitCache`, or from the "upper" `Drain` which
checks for `c->count` to be greater than 0 (strictly). So remove it there.
- Move the `stats_` updates to after we are done with the `per_class_` accesses
in an attempt to preserve locality once more;
- Change some `CHECK` to `DCHECK`: I don't think the ones changed belonged in
the fast path and seemed to be overly cautious failsafes;
- Mark some variables as `const`.
The overall result is cleaner more compact fast path generated code, and some
performance gains with Scudo (and likely other Sanitizers).
Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D42851
llvm-svn: 324257
Davide pointed out this would be useful if the file ever needs to be
regenerated (and I certainly agree).
I also replace the test binary with a slightly smaller one -- I intended
to do this in the original commit, but I forgot to add it to the patch
as I was juggling several things at the same time.
llvm-svn: 324256
This is the instcombine part of unsigned saturation canonicalization.
Backend patches already commited:
https://reviews.llvm.org/D37510https://reviews.llvm.org/D37534
It converts unsigned saturated subtraction patterns to forms recognized
by the backend:
(a > b) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b < a) ? a - b : 0 -> ((a > b) ? a : b) - b)
(b > a) ? 0 : a - b -> ((a > b) ? a : b) - b)
(a < b) ? 0 : a - b -> ((a > b) ? a : b) - b)
((a > b) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b < a) ? b - a : 0) -> - ((a > b) ? a : b) - b)
((b > a) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
((a < b) ? 0 : b - a) -> - ((a > b) ? a : b) - b)
Patch by Yulia Koval!
Differential Revision: https://reviews.llvm.org/D41480
llvm-svn: 324255
ObjectFileELF::GetModuleSpecifications contained a lot of tip-toing code
which was trying to avoid loading the full object file into memory. It
did this by trying to load data only up to the offset if was accessing.
However, in practice this was useless, as 99% of object files we
encounter have section headers at the end, so we would load the whole
file as soon as we start parsing the section headers.
In fact, this would break as soon as we encounter a file which does
*not* have section headers at the end (yaml2obj produces these), as the
access to .strtab (which we need to get the section names) was not
guarded by this offset check.
As this strategy was completely ineffective anyway, I do not attempt to
proliferate it further by guarding the .strtab accesses. Instead I just
lead the full file as soon as we are reasonably sure that we are indeed
processing an elf file.
If we really care about the load size here, we would need to reimplement
this to just load the bits of the object file we need, instead of
loading everything from the start of the object file to the given
offset. However, given that the OS will do this for us for free when
using mmap, I think think this is really necessary.
For testing this I check a (tiny) SO file instead of yaml2obj-ing it
because the fact that they come out first is an implementation detail of
yaml2obj that can change in the future.
llvm-svn: 324254
There was a logic hole in D42739 / rL324014 because we're not accounting for select and phi
instructions that might have repeated operands. This is likely a source of an infinite loop.
I haven't manufactured a test case to prove that, but it should be safe to speculatively limit
this transform to binops while we try to create that test.
llvm-svn: 324252
If the upper 32 bits of a 64 bit mask are all zeros, we have special isel patterns to use a 32-bit and instead of a 64-bit and by relying on the impliciting zeroing of 32 bit ops.
This patch teachs shrinkAndImmediate not to break that optimization.
Differential Revision: https://reviews.llvm.org/D42899
llvm-svn: 324249
This broke the Chromium build; see PR36238.
> This patch is an enhancement to propagate dbg.value information when
> Phis are created on behalf of LCSSA. I noticed a case where a value
> carried across a loop was reported as <optimized out>.
>
> Specifically this case:
>
> int bar(int x, int y) {
> return x + y;
> }
>
> int foo(int size) {
> int val = 0;
> for (int i = 0; i < size; ++i) {
> val = bar(val, i); // Both val and i are correct
> }
> return val; // <optimized out>
> }
>
> In the above case, after all of the interesting computation completes
> our value is reported as "optimized out." This change will add a
> dbg.value to correct this.
>
> This patch also moves the dbg.value insertion routine from
> LoopRotation.cpp into Local.cpp, so that we can share it in both places
> (LoopRotation and LCSSA).
>
> Patch by Matt Davis!
>
> Differential Revision: https://reviews.llvm.org/D42551
llvm-svn: 324247
Summary:
When a preprocessor indent closes after the last line of normal code we do not
correctly fixup include guard indents. For example:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
#endif
#endif
incorrectly reformats to:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
# endif
#endif
To resolve this issue we must fixup levels after parseFile(). Delaying
the fixup introduces a new state, so consolidate include guard search
state into an enum.
Reviewers: krasimir, klimek
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42035
llvm-svn: 324246
The function shuffp2 was breaking up a wide shuffle into a pair of
narrower ones, except that the narrower shuffle masks were actually
uninitialized.
llvm-svn: 324243
Summary:
This complements the fixes in r323633 and r324075 which drop the
definitions of dead functions and variables, respectively.
Fixes PR36208.
Reviewers: grimar, rafael
Subscribers: mehdi_amini, llvm-commits, inglorion
Differential Revision: https://reviews.llvm.org/D42856
llvm-svn: 324242
Summary:
When a preprocessor indent closes after the last line of normal code we do not
correctly fixup include guard indents. For example:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
#endif
#endif
incorrectly reformats to:
#ifndef HEADER_H
#define HEADER_H
#if 1
int i;
# define A 0
# endif
#endif
To resolve this issue we must fixup levels after parseFile(). Delaying
the fixup introduces a new state, so consolidate include guard search
state into an enum.
Reviewers: krasimir, klimek
Reviewed By: krasimir
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42035
llvm-svn: 324238
Summary:
We cannot call process_up->SetState() inside
the NativeProcessNetBSD::Factory::Launch
function because it triggers a NULL pointer
deference.
The generic code for launching a process in:
GDBRemoteCommunicationServerLLGS::LaunchProcess
sets the m_debugged_process_up pointer after
a successful call to m_process_factory.Launch().
If we attempt to call process_up->SetState()
inside a platform specific Launch function we
end up dereferencing a NULL pointer in
NativeProcessProtocol::GetCurrentThreadID().
Use the proper call process_up->SetState(,false)
that sets notify_delegates to false.
Sponsored by <The NetBSD Foundation>
Reviewers: labath, joerg
Reviewed By: labath
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D42868
llvm-svn: 324234
We've had a bug (fixed by https://reviews.llvm.org/D42828) where the
thread name was being read incorrectly. Add a test for this behavior.
llvm-svn: 324230
PPCCTRLoops transform loops using mtctr/bdnz instructions if loop trip count is known and big enough to compensate for the cost of mtctr.
But if there is a loop exit edge which is known to be frequently taken (by builtin_expect or by PGO), we should not transform the loop to avoid the cost of mtctr instruction. Here is an example of a loop with hot exit edge:
for (unsigned i = 0; i < TripCount; i++) {
// do something
if (__builtin_expect(check(), 1))
break;
// do something
}
Differential Revision: https://reviews.llvm.org/D42637
llvm-svn: 324229
Summary:
Right now only the ProcResourceUnits that are directly referenced by
instructions are emitted. This change emits all of them, so that
analysis passes can use the information.
This has no functional impact. It typically adds a few entries (e.g. 4
for X86/haswell) to the generated ProcRes table.
Reviewers: gchatelet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42903
llvm-svn: 324228
This test was marked as an expected failure because of PR20231 but it
seems to consistently result in an unexpected success across the bots.
Let's try to re-enable this test again.
llvm-svn: 324227
Summary:
This changes the way we store the debug info variant to make it
available earlier in the test bringup: instead of it being set by the
test wrapper method, it is set as a *property* of the wrapper method.
This way, we can inspect it as soon as self.testMethodName is
initialized. The retrieval is implemented by a new function
TestBase.getDebugInfo(), and all that's necessary to make it work is to
change self.debug_info into self.getDebugInfo().
While searching for debug_info occurences i noticed that TestLogging is
being replicated for no good reason, so I removed the replication there.
Reviewers: aprantl, jingham
Subscribers: eraman, JDevlieghere, lldb-commits
Differential Revision: https://reviews.llvm.org/D42836
llvm-svn: 324226
I have found LLDB cannot find separate debug info of Fedora /usr/bin/gdb.
It is because:
lrwxrwxrwx 1 root root 14 Jan 25 20:41 /usr/bin/gdb -> ../libexec/gdb*
-rwxr-xr-x 1 root root 10180296 Jan 25 20:41 /usr/libexec/gdb*
ls: cannot access '/usr/lib/debug/usr/bin/gdb-8.0.1-35.fc27.x86_64.debug': No such file or directory
-r--r--r-- 1 root root 29200464 Jan 25 20:41 /usr/lib/debug/usr/libexec/gdb-8.0.1-35.fc27.x86_64.debug
FYI that -8.0.1-35.fc27.x86_64.debug may look confusing, it was always just
.debug before.
Why is /usr/bin/gdb a symlink is offtopic for this bugreport, Fedora has it so
for some reasons.
It is always safest to look at the .debug file only after resolving all
symlinks on the binary file.
Differential revision: https://reviews.llvm.org/D42853
llvm-svn: 324224
I have found the lookup by build-id
(when lookup by /usr/lib/debug/path/name/exec.debug failed) does not work as
LLDB tries the build-id hex string in uppercase but Fedora uses lowercase.
xubuntu-16.10 also uses lowercase during my test:
/usr/lib/debug/.build-id/6c/61f3566329f43d03f812ae7057e9e7391b5ff6.debug
Differential revision: https://reviews.llvm.org/D42852
llvm-svn: 324222
When resolving dynamic RELA relocations the addend is taken from the
relocation and not the place being relocated. Accordingly lld does not
write the addend field to the place like it would for a REL relocation.
Unfortunately there is some system software, in particlar dynamic loaders
such as Bionic's linker64 that use the value of the place prior to
relocation to find the offset that they have been loaded at. Both gold
and bfd control this behavior with the --[no-]apply-dynamic-relocs option.
This change implements the option and defaults it to true for compatibility
with gold and bfd.
Differential Revision: https://reviews.llvm.org/D42797
llvm-svn: 324221
We did not report valid filename for duplicate symbol error when
symbol came from binary input file.
Patch fixes it.
Differential revision: https://reviews.llvm.org/D42635
llvm-svn: 324217
The patch causes the failure of the test
compiler-rt/test/profile/Linux/counter_promo_nest.c
To unblock buildbot, revert the patch while investigation is in progress.
Differential Revision: https://reviews.llvm.org/D42691
llvm-svn: 324214
The commit rL308422 introduces a restriction for folding unconditional
branches. Specifically if empty block with unconditional branch leads to
header of the loop then elimination of this basic block is prohibited.
However it seems this condition is redundantly strict.
If elimination of this basic block does not introduce more back edges
then we can eliminate this block.
The patch implements this relax of restriction.
Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl
Reviewed By: pacxx
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42691
llvm-svn: 324208
We always created X86ISD::SHUF128 with a 64-bit element type so we can use isel patterns to detect a bitconvert to 32-bit to handle masking.
The test changes are because we also match the bitconvert even if there is no masking. This leads to unnecessary isel pattern, but it requires more multiclass hackery in tablegen to get rid of it.
llvm-svn: 324205
ScalarEvolution::isKnownPredicate invokes isLoopEntryGuardedByCond without check
that SCEV is available at entry point of the loop. It is incorrect and fixed by patch.
To bugs additionally fixed:
assert is moved after the check whether loop is not a nullptr.
Usage of isLoopEntryGuardedByCond in ScalarEvolution::isImpliedCondOperandsViaNoOverflow
is guarded by isAvailableAtLoopEntry.
Reviewers: sanjoy, mkazantsev, anna, dorit, reames
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42417
llvm-svn: 324204