Adopt a more strict approach regarding what marks should/can appear after a destination register, when operating upon an AVX512 platform.
Differential Revision: https://reviews.llvm.org/D35785
llvm-svn: 310467
Codegen with -polly-parallel queried the unmapped MemoryAccess, but only
the MemoryKind after mapping is relevant for codegen.
This should fix various fails of the
perf-x86_64-penryn-O3-polly-parallel-fast buildbot.
llvm-svn: 310466
The previous value of "polly-delicm" was forgotten to to be changed when
ForwardOpTree was split from DeLICM.
Thanks to Tobias for noticing!
llvm-svn: 310465
isLegalAddressingMode() has recently gained the extra optional Instruction*
parameter, and therefore it can now do the job that previously only
isFoldableMemAccess() could do.
The SystemZ implementation of isLegalAddressingMode() has gained the
functionality of checking for offsets, which used to be done with
isFoldableMemAccess().
The isFoldableMemAccess() hook has been removed everywhere.
Review: Quentin Colombet, Ulrich Weigand
https://reviews.llvm.org/D35933
llvm-svn: 310463
In the recursive call to isAMCompletelyFolded(), the passed offset should be
the sum of F.BaseOffset and Fixup.Offset.
Review: Quentin Colombet.
llvm-svn: 310462
distributeDomain() and filterKnownValInst() are used in a scop
of ForwardOpTree that limits the number of isl operations.
Therefore some isl functions may return null after any operation.
Remove assertion that assume non-null results and handle
isl_*_foreach returning isl::stat::error.
I hope this fixes the crash of the asop buildbot at ihevc_recon.c.
llvm-svn: 310461
Assert that a binary expression is actually a binary expression,
rather than potientially incorrectly attempting to handle it as a
unary expression.
This resolves PR34083.
Thanks to Simonn Pilgrim for reporting the issue!
llvm-svn: 310460
Summary:
This handles a case where the trailing '*/' of a multiline jsdoc eding in a
comment pragma wouldn't be put on a new line.
Reviewers: mprobst
Reviewed By: mprobst
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D36359
llvm-svn: 310458
The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.
Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.
Differential Revision: https://reviews.llvm.org/D36405
llvm-svn: 310457
to Nodes when removing ref edges from a RefSCC.
This map based association turns out to be pretty expensive for large
RefSCCs and pointless as we already have embedded data members inside
nodes that we use to track the DFS state. We can reuse one of those and
the map becomes unnecessary.
This also fuses the update of those numbers into the scan across the
pending stack of nodes so that we don't walk the nodes twice during the
DFS.
With this I expect the new PM to be faster than the old PM for the test
case I have been optimizing. That said, it also seems simpler and more
direct in many ways. The side storage was always pretty awkward.
The last remaining hot-spot in the profile of the LCG once this is done
will be the edge iterator walk in the DFS. I'll take a look at improving
that next.
llvm-svn: 310456
Summary:
The available platform list was previously only accessible via the
`platform list` command, this patch makes it possible to access that
list via the SBDebugger API. The active platform list has likewise
been exposed via the SBDebugger API.
Differential Revision: https://reviews.llvm.org/D35760
llvm-svn: 310452
that RefSCC still connected.
This is common and can be handled much more efficiently. As soon as we
know we've covered every node in the RefSCC with the DFS, we can simply
reset our state and return. This avoids numerous data structure updates
and other complexity.
On top of other changes, this appears to get new PM back to parity with
the old PM for a large protocol buffer message source code. The dense
map updates are very hot in this function.
llvm-svn: 310451
limited batch updates.
Specifically, allow removing multiple reference edges starting from
a common source node. There are a few constraints that play into
supporting this form of batching:
1) The way updates occur during the CGSCC walk, about the most we can
functionally batch together are those with a common source node. This
also makes the batching simpler to implement, so it seems
a worthwhile restriction.
2) The far and away hottest function for large C++ files I measured
(generated code for protocol buffers) showed a huge amount of time
was spent removing ref edges specifically, so it seems worth focusing
there.
3) The algorithm for removing ref edges is very amenable to this
restricted batching. There are just both API and implementation
special casing for the non-batch case that gets in the way. Once
removed, supporting batches is nearly trivial.
This does modify the API in an interesting way -- now, we only preserve
the target RefSCC when the RefSCC structure is unchanged. In the face of
any splits, we create brand new RefSCC objects. However, all of the
users were OK with it that I could find. Only the unittest needed
interesting updates here.
How much does batching these updates help? I instrumented the compiler
when run over a very large generated source file for a protocol buffer
and found that the majority of updates are intrinsically updating one
function at a time. However, nearly 40% of the total ref edges removed
are removed as part of a batch of removals greater than one, so these
are the cases batching can help with.
When compiling the IR for this file with 'opt' and 'O3', this patch
reduces the total time by 8-9%.
Differential Revision: https://reviews.llvm.org/D36352
llvm-svn: 310450
Previously, we used to compute this with `elementSizeInBits / 8`. This
would yield an element size of 0 when the array had element size < 8 in
bits.
To fix this, ask data layout what the size in bytes should be.
Differential Revision: https://reviews.llvm.org/D36459
llvm-svn: 310448
It was timing out on this test, but for reasons unrelated to the
specific bug it was testing for. Randomly breaking in gdb with `clang
-target i686-windows -fmsc-version=1700` reveals *many* frames from
MicrosoftCXXNameMangler. So, it would seem that some caching is needed
there, as well...
Fingers crossed that specifying a triple is sufficient to work around
this.
llvm-svn: 310444
It is possible that dependent instruction may access memory.
In this case we must reject optimization because the memory change will
be visible in null handler basic block. So we will execute an instruction which
we must not execute if check fails.
Reviewers: sanjoy, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36392
llvm-svn: 310443
For some reason I didn't see this failure the first time. The
output format changed slightly, so we just have to update the
test for the new format.
llvm-svn: 310442
In the refactor to merge the publics and globals stream, a bug
was introduced that wrote the wrong value for one of the fields
of the PublicsStreamHeader. This caused debugging in WinDbg
to break.
We had no way of dumping any of these fields, so in addition to
fixing the bug I've added dumping support for them along with a
test that verifies the correct value is written.
llvm-svn: 310439
The publics stream and globals stream are very similar. They both
contain a list of hash buckets that refer into a single shared stream,
the symbol record stream. Because of the need for each builder to manage
both an independent hash stream as well as a single shared record
stream, making the two builders be independent entities is not the right
design. This patch merges them into a single class, of which only a
single instance is needed to create all 3 streams. PublicsStreamBuilder
and GlobalsStreamBuilder are now merged into the single GSIStreamBuilder
class, which writes all 3 streams at once.
Note that this patch does not contain any functionality change. So we're
still not yet writing any records to the globals stream. All we're doing
is making it so that when we do start writing records to the globals,
this refactor won't have to be part of that patch.
Differential Revision: https://reviews.llvm.org/D36489
llvm-svn: 310438
This is a follow-up to r310436 with actual functional changes. Please
see that commit message for a description of why a cache is appearing
here.
Suggestions for less-bad ways of testing this are appreciated. :)
This fixes PR29160.
llvm-svn: 310437
This is patch 1 in a 2 patch series that aims to fix PR29160. Its goal
is to cache decl visibility/linkage for the duration of each
visibility+linkage query.
The simplest way I can see to do this is to put the visibility
calculation code that needs to (transitively) access this cache into a
class, which is what this patch does. Actual caching will come in patch
2. (Another way would be to keep the cache in ASTContext + manually
invalidate it or something, but that felt way too subtle to me.)
Caching visibility results across multiple queries seems a bit tricky,
since the user can add visibility attributes ~whenever they want, and
these attributes can apparently have far-reaching effects (e.g. class
visibility extends to its members, ...). Because a cache that's dropped
at the end of each top-level query seems to work nearly as well and
doesn't require any eviction logic, I opted for that design.
llvm-svn: 310436
Do not discard invalid Decl when searching for the operator delete function.
The lookup for this function always expects to find a result, so sometimes the
invalid Decl is the only choice possible. This fixes PR34109.
llvm-svn: 310435
Summary:
This is a pure refactoring change. It paves the way for OS-specific
implementations, such as Fuchsia's, that can do most of the
per-thread bookkeeping work in the creator thread before the new
thread actually starts. This model is simpler and cleaner, avoiding
some race issues that the interceptor code for thread creation has
to do for the existing OS-specific implementations.
Submitted on behalf of Roland McGrath.
Reviewers: vitalybuka, alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: phosek, filcab, llvm-commits, kubamracek
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D36385
llvm-svn: 310432
Converting a _Complex type to a real one simply discards the imaginary part.
This can easily lead to loss of information so for safety (and GCC
compatibility) this patch disallows that when the conversion would be implicit.
The one exception is bool, which actually compares both real and imaginary
parts and so is safe.
llvm-svn: 310427
This reverts commit r310115.
It causes a linker failure for the one of the unittests of AArch64 on one
of the linux bot:
http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429
: && /home/fedora/gcc/install/gcc-7.1.0/bin/g++ -fPIC
-fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W
-Wno-unused-parameter -Wwrite-strings -Wcast-qual
-Wno-missing-field-initializers -pedantic -Wno-long-long
-Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment
-ffunction-sections -fdata-sections -O2
-L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined
-Wl,-O3 -Wl,--gc-sections
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o -o
unittests/Target/AArch64/AArch64Tests
lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn
lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn
lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn
lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn
lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread
lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread
-Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib
&& :
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0):
undefined reference to `vtable for llvm::LegalizerInfo'
unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8):
undefined reference to `vtable for llvm::RegisterBankInfo'
The particularity of this bot is that it is built with
BUILD_SHARED_LIBS=ON
However, I was not able to reproduce the problem so far.
Reverting to unblock the bot.
llvm-svn: 310425
Before, the outliner would mark all instructions that read from/modify LR as
illegal. This doesn't handle W30, which overlaps with LR. This shouldn't be
outlined.
This commit fixes that by making modifiesRegister() and readsRegister() look at
W30 + take in a TRI argument. This makes sure that modifiesRegister() and
readsRegister() won't outline either of W30 and LR.
https://reviews.llvm.org/D36435
llvm-svn: 310422
When a new phi is generated for scalarpre of an expression, the phiTranslate cache
will become stale: Before PRE, the candidate expression must not be available in a
predecessor block, and phitranslate will cache the information. After PRE, the
expression will become available in all predecessor blocks, so the related entries
in phiTranslate cache becomes stale. The patch will simply remove the stale entries
so phiTranslate can be recomputed next time.
The stale entries in phitranslate cache will not affect correctness but will cause
missing PRE opportunity for later instructions.
Differential Revision: https://reviews.llvm.org/D36124
llvm-svn: 310421
The 9 byte nop is a suffix of the 10 byte nop, and we need at most 6
bytes.
ntdll's version of strcpy is written in assembly and is very clever.
strcat tail calls strcpy but with a slightly different arrangement of
argument registers at an alternate entry point. It looks like this:
ntdll!strcpy:
00007ffd`64e8a7a0 4c8bd9 mov r11,rcx
ntdll!__entry_from_strcat_in_strcpy:
00007ffd`64e8a7a3 482bca sub rcx,rdx
00007ffd`64e8a7a6 f6c207 test dl,7
If we overwrite more than two bytes in our interceptor, that label will
no longer be a valid instruction boundary.
By recognizing the 9 byte nop, we use the two byte backwards branch to
start our trampoline, avoiding this issue.
Fixes https://github.com/google/sanitizers/issues/829
Patch by David Major
llvm-svn: 310419