If, after if-conversion, most of the instructions in the new basic block form a long and slow dependence chain, the result may be slower than cmp/branch even if the branch has a high miss rate. The reason is that if-conversion transforms the control dependence into a data dependence: a control dependence can be speculated past, so the second part can execute in parallel with the first on a modern OOO processor, but a data dependence cannot.
This patch checks for such a long dependence chain and gives up on if-conversion if it finds one.
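A minimal sketch of the trade-off (hypothetical C++, not from the patch):

  int f(int a, int b, bool p) {
    int t;
    if (p)
      t = ((a * b) ^ a) * b;  // long arithmetic dependence chain
    else
      t = a + b;
    return t * 7;
  }
  // After if-conversion, the branch becomes a select/cmov:
  //   t = p ? ((a * b) ^ a) * b : a + b;
  // The final multiply is now data-dependent on the whole chain even
  // when p is false, whereas a well-predicted branch lets the CPU
  // speculate straight past the slow arm.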
Differential Revision: https://reviews.llvm.org/D39352
llvm-svn: 321377
In https://reviews.llvm.org/rL321077 and https://reviews.llvm.org/D41231 I fixed a regression in the C API which prevented the pruning from being *effectively* disabled.
However, this approach, helpfully recommended by @labath, is cleaner.
It is also nice to be able to remove the weasel words about effectively disabling from the API comments.
Differential Revision: https://reviews.llvm.org/D41497
llvm-svn: 321376
Re-land r321234. It had to be reverted because it broke the shared
library build. The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).
Original commit message:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
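The dependency-breaking pattern, as a generic sketch (the types here are illustrative stand-ins, not the actual LLVM declarations):

  #include <functional>

  struct TTIResult {};  // stand-in for the target's cost-model result

  // Analysis-side type: it holds a callback supplied by Target, so the
  // Analysis library never has to name Target types directly.
  struct TargetIRAnalysisLike {
    std::function<TTIResult(const void *Fn)> Callback;
    TTIResult run(const void *Fn) const { return Callback(Fn); }
  };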
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
llvm-svn: 321375
Summary:
This patch, on top of https://reviews.llvm.org/D40898, contains the build system
changes necessary to enable the Solaris/x86 sanitizer port.
The only issue of note is the libclang_rt.sancov_{begin, end} libraries: clang relies on the
linker automatically defining __start_SECNAME and __stop_SECNAME labels for
sections whose names are valid C identifiers. This is a GNU ld extension not present
in the ELF gABI, also implemented by gold and lld, but not by Solaris ld. To work around
this, I automatically link the sancov_{begin,end} libraries into every executable for now.
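The linker extension in question, as a minimal standalone example (section and variable names are illustrative):

  #include <cstdio>

  // For a section whose name is a valid C identifier, GNU ld (and gold
  // and lld) automatically defines __start_<name> and __stop_<name>;
  // Solaris ld does not.
  extern char __start_mysec[], __stop_mysec[];
  __attribute__((used, section("mysec"))) static int payload[4];

  int main() {
    std::printf("section size: %ld bytes\n",
                static_cast<long>(__stop_mysec - __start_mysec));
  }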
There seems to be no way to build individual startup objects like crtbegin.o/crtend.o,
so I've followed the lead of libclang_rt.asan-preinit which also contains just a single
object.
Reviewers: kcc, alekseyshl
Reviewed By: alekseyshl
Subscribers: srhines, kubamracek, mgorny, fedor.sergeev, llvm-commits, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D40899
llvm-svn: 321373
Memory transfer instructions take two pointers. It is not defined to
which of those a noalias annotation applies. To ensure correctness,
do not add noalias annotations to memcpy/memmove instructions anymore.
This caused a miscompile with test-suite's MultiSource/Applications/obsequi.
Since r321138, the MemCpyOpt pass would remove memcpy/memmove calls known
to copy uninitialized memory. In that case, the memory was initialized
by another memcpy, but the annotation for the target pointer said it
would not alias. The annotation was actually meant for the source
pointer, which was an alloca and could not alias with the target
pointer.
llvm-svn: 321371
This seems to improve X86's ability to match this into an address computation. Otherwise the other operand gets assigned to the base register and the stack pointer + frame index ends up in the index register. But index registers can't encode ESP/RSP so we end up having to move it into another register to meet the constraint.
I could try to improve the address matcher in X86, but swapping the producer seemed easier. Several other places already have the operands in this order so this is at least consistent.
llvm-svn: 321370
Despite what the comment said, there isn't better codegen for 512-bit vectors. The 128/256/512-bit implementation just stores to memory and loads an element. There's no advantage to doing that with a larger size. In fact, in many cases it causes a stack realignment and generates worse code.
llvm-svn: 321369
Currently, the inline cost model considers a binary operator free only if both
of its operands are constants. Some simple cases, such as a + 0 and a - a, are
missed. This patch modifies visitBinaryOperator() to call SimplifyBinOp() without
going through simplifyInstruction() to get rid of the constant restriction.
Thus, visitAnd() and visitOr() are not needed.
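For example, the cost model can now see through cases like this (illustrative snippet built from the examples above):

  int g(int a) {
    int x = a + 0;  // simplifies to a
    int y = a - a;  // simplifies to 0
    return x + y;   // the whole computation folds down to a
  }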
Differential Revision: https://reviews.llvm.org/D41494
llvm-svn: 321366
Summary:
Providing aligned new/delete implementations to match ASan.
Unlike ASan, MSan and TSan do not perform any additional checks
on overaligned memory, hence no sanitizer-specific tests.
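From the user's side, these are the C++17 entry points being covered (a usage illustration, not the runtime code):

  #include <new>

  struct alignas(64) CacheLine { char Bytes[64]; };

  int main() {
    // Calls operator new(std::size_t, std::align_val_t), which the
    // sanitizer runtimes now provide alongside the plain variants.
    CacheLine *P = new CacheLine;
    delete P;  // matching aligned operator delete
  }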
Reviewers: eugenis
Subscribers: kubamracek, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D41532
llvm-svn: 321365
The compiler warns that _BSD_SOURCE is deprecated and that _DEFAULT_SOURCE
should be used instead. We keep _BSD_SOURCE for older compilers that don't
know about _DEFAULT_SOURCE.
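A sketch of the resulting guard (assumed shape, not the verbatim patch):

  // _DEFAULT_SOURCE replaces the deprecated _BSD_SOURCE in newer glibc;
  // defining both keeps older compilers/libcs working.
  #define _BSD_SOURCE
  #define _DEFAULT_SOURCE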
The linker drops the tool when linking, since there is no visible need for
the library, so we need to tell the linker that the tool should be linked
anyway.
Differential Revision: https://reviews.llvm.org/D41499
llvm-svn: 321362
The format string for hints only prints the second argument (the string) and
drops the first argument (the hint id). Depending on how you read the POSIX
text for printf, this could be valid. But for practical reasons, i.e.,
unpacking the va_list passed to printf based on the formatting information,
it makes sense to fix the implementation and not pass the id for hints.
Failing testcases were:
misc_bugs/teams-reduction.c
ompt/parallel/not_enough_threads.c
Differential Revision: https://reviews.llvm.org/D41504
llvm-svn: 321361
Currently, tests for metadata created by clang involve compiling code snippets
placed into C/C++ source files and matching interesting patterns in the
obtained textual representation of the IR. Writing such tests is a painful
process, as metadata often form complex tree-like structures while the textual
representation of the IR contains only a pile of metadata at the module end.
This change implements IR matchers that may be used to match required
patterns in the binary IR representation. In this case the metadata
structure is not broken and creation of match patterns is easier.
The change adds unit tests for TBAA metadata generation.
Differential Revision: https://reviews.llvm.org/D41433
llvm-svn: 321360
Summary:
The tool is used to generate global symbols for clangd (global code completion).
The format is YAML, which is for **experimentation** only.
Usage:
./bin/global-symbol-builder </path/to/llvm-dir> > global-symbols.yaml
TEST:
used the tool to generate global symbols for LLVM (~72MB).
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41491
llvm-svn: 321358
Summary:
Make sure we propagate environment when starting debugserver with a pre-loaded
inferior. AFAIK, RNBRunLoopLaunchInferior is only called in the pre-loaded inferior
scenario, so we can just pick up the debugserver environment instead of trying
to construct an envp from the (empty) context.
This makes debugserver pass a test added for an equivalent lldb-server fix.
Reviewers: jasonmolenda, clayborg
Subscribers: JDevlieghere, lldb-commits
Differential Revision: https://reviews.llvm.org/D41352
llvm-svn: 321355
Pointer constants are pretty rare, since we usually represent them as
integer constants and then cast to pointer. One notable exception is the
null pointer constant, which is represented directly as a G_CONSTANT 0
with pointer type. Mark it as legal and make sure it is selected like
any other integer constant.
llvm-svn: 321354
The test works fine on linux, and I believe other targets should not
have an issue with it either. If they do, we can start blacklisting
instead of whitelisting.
The idea of using "-1" as the value of the pointer on non-apple targets
backfired, as it fails the "address != LLDB_INVALID_ADDRESS" test (-1 is
the value of LLDB_INVALID_ADDRESS). However, it should be safe to use
0x100 for other targets as well. The first page of memory is generally
kept unreadable to catch null pointer dereferences.
llvm-svn: 321353
Now that the new TBAA format allows access types to be of any object
type, including aggregate ones, it becomes critical to specify the types
of all the sub-objects that such aggregates comprise as their
members. In order to meet this requirement, this patch
enables generation of field descriptors for members of array
types.
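The kind of aggregate this affects, schematically (an illustrative type, not from the patch):

  struct S {
    int Ints[4];  // with this patch, this array member gets a field
                  // descriptor of its own in S's TBAA type descriptor
    double D;
  };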
Differential Revision: https://reviews.llvm.org/D41399
llvm-svn: 321352
Now that the MDBuilder helpers generating TBAA type and access
descriptors in the new format are in place, we can teach clang to
use them when requested.
Differential Revision: https://reviews.llvm.org/D41394
llvm-svn: 321351
inlined subroutines for a given address.
This is essentially the hot path of llvm-symbolizer when extracting
inlined frames during symbolization. Previously, we would read every
subprogram and every inlined subroutine, building a std::map across the
entire PC space to the best DIE, and then do only a handful of queries
as we symbolized a backtrace. A huge fraction of the time was spent
building the map itself.
This patch changes this to a two-level system. First, we just build a map
from PC-interval to DWARF subprograms. These are required to be disjoint
and so constructing this is pretty easy. Second, we build a map *just*
for the inlined subroutines within the subprogram containing the query
address. This allows us to look at far fewer DIEs and build a *much*
smaller set of cached maps in the llvm-symbolizer case, where only a few
addresses get symbolized during the entire run.
It also builds both interval maps in a very different way. It constructs
a single flat vector of pairs that maps from offset -> index. The
indices point into collections of DIE objects, but can also be
"tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
just simplifies the data structure a bit. For inlined subroutines,
because we carefully split them as we build the map, we end up in many
cases having no holes and not having to store both start and stop
offsets.
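A minimal sketch of that flat map (illustrative types, not the actual LLVM data structure):

  #include <algorithm>
  #include <cstdint>
  #include <iterator>
  #include <utility>
  #include <vector>

  struct FlatIntervalMap {
    static constexpr uint32_t Tombstone = -1u;  // marks a gap

    // Sorted by offset; each entry starts an interval that runs until
    // the next entry's offset.
    std::vector<std::pair<uint32_t, uint32_t>> OffsetToIndex;

    // Return the DIE index whose interval covers Offset, or Tombstone.
    uint32_t lookup(uint32_t Offset) const {
      auto It = std::upper_bound(
          OffsetToIndex.begin(), OffsetToIndex.end(), Offset,
          [](uint32_t O, const std::pair<uint32_t, uint32_t> &E) {
            return O < E.first;
          });
      if (It == OffsetToIndex.begin())
        return Tombstone;  // Offset precedes the first interval
      return std::prev(It)->second;
    }
  };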
Finally, the PC ranges for the inlined subroutines are compressed into
32-bits by making them relative to the base PC of the outer subprogram.
This means that if you have a single function body with over 2gb of
executable code in it, we will stop mapping addresses past the first 2gb
of that function into inlined subroutines and just give you the
subprogram. This doesn't seem like a problem. ;]
All of this combines to make llvm-symbolizer *well* over 2x faster for
symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
tests are running >2x faster. I'm still going to look at completely
disabling symbolization there, but figured while I had a good benchmark
we should make symbolization a bit better.
Sadly, the logic to build the flat interval map for the inlined
subroutines is fairly complex. I'm not super happy about this and
welcome any simplifying suggestions.
Huge thanks to Dave Blaikie, who helped walk me through the various
things I needed to do in DWARF to make this work.
Differential Revision: https://reviews.llvm.org/D40987
llvm-svn: 321345
Summary:
It was possible, when searching for a symbol by regex in a pdb, for an invalid regex to cause an exception on Windows. This updates the code to avoid throwing an exception.
While fixing the exception, it was decided there is no reason to search for a symbol in a pdb by regex. To support this, SymbolFilePDB::FindTypes() now only searches for types by name and no longer calls FindTypesByRegEx().
Reviewers: zturner, lldb-commits
Reviewed By: zturner
Subscribers: clayborg
Differential Revision: https://reviews.llvm.org/D41086
llvm-svn: 321344
In case `@import Foo.Private` fails because the submodule doesn't exist,
look for `Foo_Private` (if available) and build/load that module
instead. In the process, emit a warning and tell the user about the
assumption.
The intention here is to assist all existing private modules owners
(in ObjC and Swift) to migrate to the new `Foo_Private` syntax.
rdar://problem/36023940
llvm-svn: 321342
The standard correctly forbids various decl-specifiers that don't make sense on non-type template parameters, such as the extern in:
template<extern int> struct X;
This patch implements those restrictions (in a fashion similar to the corresponding checks on function parameters within ActOnParamDeclarator).
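For example (the static case is an illustration of the same rule):

  template<extern int N> struct X;  // error: storage class specifier
  template<static int N> struct Y;  // error: storage class specifier
  template<int N> struct Z;         // OK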
Credit goes to miyuki (Mikhail Maltsev) for drawing attention to this issue, authoring the initial versions of this patch, and supporting the effort to re-engineer it slightly. Thank you!
For details of how this patch evolved please see: https://reviews.llvm.org/D40705
llvm-svn: 321339
an empty Python string object when it reads a 0-length
string out of memory (and a successful SBError object).
<rdar://problem/26186692>
llvm-svn: 321338
We used to advertise private modules to be declared as submodules
(Foo.Private). This has proven not to scale well, since private headers
might carry several dependencies, introducing unwanted content into the
main module and often causing dependency cycles.
Change the canonical way to name it to Foo_Private, forcing private
modules as top level ones, and provide warnings under -Wprivate-module
to suggest fixes for other private naming. Update documentation to
reflect that.
rdar://problem/31173501
llvm-svn: 321337
Previously, prefetch was only considered legal if sse was enabled, but it should be supported with 3dnow as well.
The prfchw flag now implies that at least some form of prefetch without the write hint is available, either the sse or the 3dnow version. This is true even if 3dnow and sse are explicitly disabled.
Similarly, the prefetchwt1 feature implies the availability of prefetchw and the prefetcht0/1/2/nta instructions. This way we can support _MM_HINT_ET0 using prefetchw and _MM_HINT_ET1 with prefetchwt1. And it's assumed that if we have levels for the write hint, we would have levels for the non-write hint, which is why we enable the sse prefetch instructions.
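On the intrinsics side, this enables hints like the following (a usage illustration; it requires the corresponding target features):

  #include <xmmintrin.h>

  void warm(const char *P) {
    _mm_prefetch(P, _MM_HINT_ET0);  // write hint: prefetchw
    _mm_prefetch(P, _MM_HINT_ET1);  // write hint: prefetchwt1
    _mm_prefetch(P, _MM_HINT_T0);   // non-write hint: prefetcht0
  }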
I believe this behavior is consistent with gcc. I've updated prefetch.ll to test all of these combinations.
llvm-svn: 321335
The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.
While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.
Differential Revision: https://reviews.llvm.org/D41522
llvm-svn: 321332
Summary:
This replaces calls to getEntryCount().hasValue() with hasProfileData(),
which does the same thing. This refactoring is useful to do before adding
synthetic function entry counts but also a useful cleanup IMO even
otherwise. I have used hasProfileData instead of hasRealProfileData as
David had earlier suggested since I think profile implies "real" and I
use the phrase "synthetic entry count" and not "synthetic profile count"
but I am fine calling it hasRealProfileData if you prefer.
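Schematically, at each call site (a sketch; the predicate is the one named above):

  #include "llvm/IR/Function.h"

  bool shouldUseProfile(const llvm::Function &F) {
    // Before this patch: return F.getEntryCount().hasValue();
    return F.hasProfileData();  // same check, named for what it means
  }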
Reviewers: davidxl, silvas
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41461
llvm-svn: 321331