Summary:
.NAME is a bit of an odd duck, in that we should really treat it like
a template argument, but we currently don't, and so when and where
NAME is initialized and how is pretty inconsistent. Best to just avoid
using it as a field of already instantiated records, and use cast to
string instead.
Change-Id: I5a0c202401cede3d5c3827ab9c7858ea48b29108
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D43551
llvm-svn: 325794
Implement c.lui immediate constraint to [1, 31] and [0xfffe0, 0xfffff].
The RISC-V ISA describes the constraint as [1, 63], with that value
being loaded in to bits 17-12 of the destination register and sign extended
from bit 17. Therefore, this 6-bit immediate can represent values in the
ranges [1, 31] and [0xfffe0, 0xfffff].
Differential Revision: https://reviews.llvm.org/D42834
llvm-svn: 325792
- Fix for bug 36078.
- Prevent the functionattrs, function-attrs, globalopt and argpromotion passes
from changing naked functions.
- These passes can perform some alterations to the functions that should not be
applied. An example is removing parameters that are seemingly not used because
they are only referenced in the inline assembly. Another example is marking
the function as fastcc.
llvm-svn: 325788
There were no memory dependencies made between stores generated
when lowering formal arguments and loads generated when
call lowering byVal arguments which made the Post-RA scheduler
place a load before a matching store.
Make the fixed object stored to mutable so that the load
instructions can have their memory dependencies added
Set the frame object as isAliased which clears the underlying
objects vector in ScheduleDAGInstrs::buildSchedGraph().
This results in addition of all stores as dependenies for loads.
This problem appeared when passing a byVal parameter
coupled with a fastcc function call.
Differential Revision: https://reviews.llvm.org/D37515
llvm-svn: 325782
NFC intended, syndicate common code to a parametric base class. Part of the original problem is that InvokeInst is a TerminatorInst, unlike CallInst. the problem is solved by introducing a parametrized class paramtertized by its base.
Differential Revision: https://reviews.llvm.org/D40727
llvm-svn: 325778
As pointed out by @sabuasal in a comment on D23568, the logic in
RISCVMCCodeEmitter::getImmOpValue could be more defensive. Although with the
current instruction definitions it is always the case that `VK_RISCV_LO` is
always used with either an I- or S-format instruction, this may not always be
the case in the future. Add a check to ensure we will get an assertion in
debug builds if that changes.
llvm-svn: 325775
Fixup to rL325573 for large xor constants.
Thanks to Eli Friedman for the catch.
Differential revision: https://reviews.llvm.org/D43549
llvm-svn: 325761
Calling realpath is expensive but necessary to perform the uniqueing in
dsymutil. Although we already cached the results for every individual
file in the line table, we had reports of it taking 40 seconds of a 3.5
minute link.
This patch adds a second level of caching. When we do have to call
realpath, we cache its result for its parents path. We didn't replace
the existing caching, because it's fast (indexed) and saves us from
reading the line table for entries we've already seen.
For WebkitCore this results in a decrease of 11% in linking time: from
85.79 to 76.11 seconds (average over 3 runs).
Differential revision: https://reviews.llvm.org/D43511
llvm-svn: 325757
This is a follow up of r325012, that allowed half types in constant pools.
Proper alignment was enforced when a big basic block was split up, but not when
a CPE was placed before/after a block; the successor block had the wrong
alignment.
Differential Revision: https://reviews.llvm.org/D43580
llvm-svn: 325754
We looked through a BITCAST, but the bitcast might be a from a scalar type rather than a vector.
I don't have a test case. I stumbled onto it while prototyping another change that isn't ready yet.
llvm-svn: 325750
Summary:
Exposing getOffset and findFunctionSamples as members of
SampleProfile. They are intimately tied to design choices of the
sample profile format - using offsets instead of line numbers, and
traversing inlined functions stack, respectively.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43605
llvm-svn: 325747
SCEV has multiple occurences of code when we need to prove some predicate on
every iteration of a loop and do it with invocations of couple `isLoopEntryGuardedByCond`,
`isLoopBackedgeGuardedByCond`. This patch factors out these two calls into a separate
method. It is a preparation step to extend this logic: it is not the only way how we can prove
such conditions.
Differential Revision: https://reviews.llvm.org/D43373
llvm-svn: 325745
An FRem instruction inside a loop should prevent the loop from being converted
into a CTR loop since this is not an operation that is legal on any PPC
subtarget. This will always be a call to a library function which means the
loop will be invalid if this instruction is in the body.
Fixes PR36292.
llvm-svn: 325739
According to the current coverage report salvageDebugInfo() is called
5.12 million times during testing and almost always returns early.
The early return depends on LocalAsMetadata::getIfExists returning null,
which involves a DenseMap lookup in an LLVMContextImpl. We can probably
speed this up by simply checking the IsUsedByMD bit in Value.
llvm-svn: 325738
The pahole does not work with BPF backend properly:
-bash-4.2$ cat test.c
struct test_t {
int a;
int b;
};
int test(struct test_t *s) {
return s->a;
}
-bash-4.2$ clang -g -O2 -target bpf -c test.c
-bash-4.2$ pahole test.o
struct clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) {
clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); /* 0 4 */
clang version 7.0.0 (trunk 325446) (llvm/trunk 325464) clang version 7.0.0 (trunk 325446) (llvm/trunk 325464); /* 4 4 */
/* size: 8, cachelines: 1, members: 2 */
/* last cacheline: 8 bytes */
};
-bash-4.2$
The reason is that BPF backend is not yet implemented in elfutils backend
https://github.com/threatstack/elfutils/tree/master/backends
and pahole depends on elfutils for dwarf parsing and resolving relocation.
More specifically, the unsupported relocation in .debug_info for type/member name
against symbol table caused the incorrect result above. The following is
the raw .rel.debug_info for the above example,
Hex dump of section '.rel.debug_info':
0x00000000 06000000 00000000 0a000000 0b000000 ................
0x00000010 0c000000 00000000 0a000000 01000000 ................
0x00000020 12000000 00000000 0a000000 02000000 ................
0x00000030 16000000 00000000 0a000000 0e000000 ................
0x00000040 1a000000 00000000 0a000000 03000000 ................
----------------- -------- --------
reloc location type symtab index
Hex dump of section '.debug_info':
0x00000000 7b000000 04000000 00000801 00000000 {...............
0x00000010 0c000000 00000000 00000000 00000000 ................
0x00000020 00000000 00001000 00000200 00000000 ................
Based on "type", the proper value will be extracted from symbol table
and filled in .debug_info so later on .debug_info can be properly
resolved against debug strings.
There are two ways to fix this problem. One is to fix elfutils by adding
BPF support which is desirable. This could take a long time and won't work
with already deployed pahole. For a short term workaround, we can disable
dwarf cross-section relation which specifically avoids debug_info and
symbol table cross relocation. This should help any dwarf-related tool
which has not implement BPF specific relocations yet.
Now .rel.debug_info does not have any relocation for symbol table and
.debug_info itself contains necessary relocation information by itself.
Hex dump of section '.debug_info':
0x00000000 7b000000 04000000 00000801 00000000 {...............
0x00000010 0c003700 00000000 00003e00 00000000 ..7.......>.....
0x00000020 00000000 00001000 00000200 00000000 ................
location 0xc has 0, 0x12 has 0x37, 0x1a has 0x3e in place which
will be used in relocation resolution. Here, the values of 0, 0x37 and 0x3e
are offset in .debug_str section.
Please note the difference between two above .debug_info dumps.
With the fix, pahole works properly with BPF backend:
-bash-4.2$ clang -O2 -g -target bpf -c test.c
-bash-4.2$ pahole test.o
struct test_t {
int a; /* 0 4 */
int b; /* 4 4 */
/* size: 8, cachelines: 1, members: 2 */
/* last cacheline: 8 bytes */
};
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325735
The issue was that the has function was generating different results depending
on the signedness of char on the host platform. This commit fixes the issue by
explicitly using an unsigned char type to prevent sign extension and
adds some extra tests.
The original commit message was:
This patch implements a variant of the DJB hash function which folds the
input according to the algorithm in the Dwarf 5 specification (Section
6.1.1.4.5), which in turn references the Unicode Standard (Section 5.18,
"Case Mappings").
To achieve this, I have added a llvm::sys::unicode::foldCharSimple
function, which performs this mapping. The implementation of this
function was generated from the CaseMatching.txt file from the Unicode
spec using a python script (which is also included in this patch). The
script tries to optimize the function by coalescing adjecant mappings
with the same shift and stride (terms I made up). Theoretically, it
could be made a bit smarter and merge adjecant blocks that were
interrupted by only one or two characters with exceptional mapping, but
this would save only a couple of branches, while it would greatly
complicate the implementation, so I deemed it was not worth it.
Since we assume that the vast majority of the input characters will be
US-ASCII, the folding hash function has a fast-path for handling these,
and only whips out the full decode+fold+encode logic if we encounter a
character outside of this range. It might be possible to implement the
folding directly on utf8 sequences, but this would also bring a lot of
complexity for the few cases where we will actually need to process
non-ascii characters.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: mgorny, hintonda, echristo, clayborg, vleschuk, llvm-commits
Differential Revision: https://reviews.llvm.org/D42740
llvm-svn: 325732
than a shared ObjectFile/MemoryBuffer pair.
There's no need to pre-parse the buffer into an ObjectFile before passing it
down to the linking layer, and moving the parsing into the linking layer allows
us remove the parsing code at each call site.
llvm-svn: 325725
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.
llvm-svn: 325717
This patch changes hwasan inline instrumentation:
Fixes address untagging for shadow address calculation (use 0xFF instead of 0x00 for the top byte).
Emits brk instruction instead of hlt for the kernel and user space.
Use 0x900 instead of 0x100 for brk immediate (0x100 - 0x800 are unavailable in the kernel).
Fixes and adds appropriate tests.
Patch by Andrey Konovalov.
Differential Revision: https://reviews.llvm.org/D43135
llvm-svn: 325711
Add a test that checks that kernel inline instrumentation works.
Patch by Andrey Konovalov!
Differential Revision: https://reviews.llvm.org/D42473
llvm-svn: 325710
Summary:
IMAGE_REL_AMD64_ADDR32NB relocations are currently set to zero in all cases.
This patch sets the relocation to the correct value when possible and shows an error when not.
Reviewers: enderby, lhames, compnerd
Reviewed By: compnerd
Subscribers: LepelTsmok, compnerd, martell, llvm-commits
Differential Revision: https://reviews.llvm.org/D30709
llvm-svn: 325700
Enable multiple COPY hints to eliminate more COPYs during register allocation.
Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.
Review: Krzysztof Parzyszek
llvm-svn: 325697