Consider a callee function that has a call (C) within it which feeds
into the return. When we inline that callee into a callsite that has
return attributes, we can backward propagate those attributes to the
call (C) within that inlined callee body.
This is safe only if every instruction in the window between the return
value (i.e. the call C) and the return instruction is guaranteed to
transfer execution to its successor.
See added test cases.
Reviewed-By: reames, jdoerfert
Differential Revision: https://reviews.llvm.org/D76140
Summary:
https://reviews.llvm.org/D33035 added basic support for intel-pt in 2017. I
plan to improve it and use it to support reverse debugging.
I fixed a couple of issues and now this plugin works again:
1. pythonlib needed to be linked against it for the SB framework.
Linking was failing because of this.
2. the decoding functionality was broken because it lacked handling for
instruction events. It seems old versions of libipt, the actual decoding
library, didn't require these, but modern versions require them (you can
read more here
https://github.com/intel/libipt/blob/master/doc/howto_libipt.md). These
events signal overflows of the internal PT buffer in the CPU,
enable/disable events of tracing, async cpu events, interrupts, etc.
I ended up refactoring the code a little to reduce duplication.
In another diff I'll implement some basic tests.
This is a simple execution of the library:
```
(lldb) target create "/data/users/wallace/rr-project/a.out"
Current executable set to '/data/users/wallace/rr-project/a.out' (x86_64).
(lldb) plugin load liblldbIntelFeatures.so
(lldb) b main
Breakpoint 1: where = a.out`main + 8 at test.cpp:10, address = 0x00000000004007fa
(lldb) b test.cpp:14
Breakpoint 2: where = a.out`main + 50 at test.cpp:14, address = 0x0000000000400824
(lldb) r
Process 902754 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
frame #0: 0x00000000004007fa a.out`main at test.cpp:10
7 }
8
9 int main() {
-> 10 int z = 0;
11 for(int i = 0; i < 10000; i++)
12 z += fun(z);
13
Process 902754 launched: '/data/users/wallace/rr-project/a.out' (x86_64)
(lldb) processor-trace start all
(lldb) c
Process 902754 resuming
Process 902754 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 2.1
frame #0: 0x0000000000400824 a.out`main at test.cpp:14
11 for(int i = 0; i < 10000; i++)
12 z += fun(z);
13
-> 14 cout << z<< endl;
15 return 0;
16 }
(lldb) processor-trace show-instr-log
thread #1: tid=902754
0x7ffff72299b9 <+9>: addq $0x8, %rsp
0x7ffff72299bd <+13>: retq
0x4007ed <+16>: addl $0x1, %eax
0x4007f0 <+19>: leave
0x4007f1 <+20>: retq
0x400814 <+34>: addl %eax, -0x4(%rbp)
0x400817 <+37>: addl $0x1, -0x8(%rbp)
0x40081b <+41>: cmpl $0x270f, -0x8(%rbp) ; imm = 0x270F
0x400822 <+48>: jle 0x40080a ; <+24> at test.cpp:12
0x400822 <+48>: jle 0x40080a ; <+24> at test.cpp:12
```
Subscribers: mgorny, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D76872
Using the ADDITIONAL_COMPILE_FLAGS annotation, it is possible to move
these tests from .sh.cpp to .pass.cpp, making them suitable for running
on remote hosts more easily.
Move lower-affine.mlir from test/Transforms to
test/Conversion/AffineToStandard/. Other related NFC.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D77008
Registers used in any address (as well as in a few other contexts)
have special semantics when a "zero" register is used, which is
why the back-end defines extra register classes ADDR32, ADDR64 etc
to be used to prevent the register allocator from using %r0 there.
However, when writing assembler code "by hand", you sometimes need
to trigger those special semantics, and the AsmParser currently
rejects %r0 in those places. In some cases it may be possible
to write the instruction differently, but in others it is currently
not possible at all.
This check in AsmParser simply seems overly strict, so this patch
just removes the check completely. This brings the behaviour of
AsmParser in line with the GNU assembler as well.
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45092
Make InstCombine aware of the aligned_alloc library function.
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Depends on D76970.
Differential Revision: https://reviews.llvm.org/D76971
SBPlatform::GetHostPlatform was missing the reproducer instrumentation
macros. Fixed by running lldb-instr on SBPlatform.cpp:
$ ./bin/lldb-instr ../llvm-project/lldb/source/API/SBPlatform.cpp
Summary:
CGProfilePass is run by default in certain new pass manager optimization pipelines. Assemblers other than LLVM's (such as GNU as) cannot recognize the .cgprofile entries generated and emitted by this pass, causing a build-time error.
This patch adds new options in clang CodeGenOpts and PassBuilder options so that we can turn cgprofile off when not using the integrated assembler.
Reviewers: Bigcheese, xur, george.burgess.iv, chandlerc, manojgupta
Reviewed By: manojgupta
Subscribers: manojgupta, void, hiraditya, dexonsmith, llvm-commits, tcwang, llozano
Tags: #llvm, #clang
Differential Revision: https://reviews.llvm.org/D62627
Summary:
When doing cross-compilation from Linux to MacOS we don't have
access to `xcodebuild` and therefore need a way
to set the SDK version from the outside.
Fixes https://reviews.llvm.org/D68292#1853594 for me.
Reviewers: delcypher, yln
Reviewed By: delcypher
Subscribers: #julialang, mgorny, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D77026
Summary:
The basic idea is to walk through the concept definition, looking for
t.foo() where t has the constrained type.
In this patch:
- nested types are recognized and offered after ::
- variable/function members are recognized and offered after the correct
dot/arrow/colon trigger
- member functions are recognized (anything directly called). parameter
types are presumed to be the argument types. parameters are unnamed.
- result types are available when a requirement has a type constraint.
These are printed as constraints, except same_as<T> which prints as T.
Not in this patch:
- support for merging/overloading when two locations describe the same member.
The last one wins, for any given name. This is probably important...
- support for nested template members (T::x<int>)
- support for completing members of (instantiations of) template template parameters
Reviewers: nridge, saar.raz
Subscribers: mgrang, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73649
The above change used a binary literal that is not supported in c++11 mode when
using gcc. It was formalized into the c++14 standard and works when using that
mode to compile, so change the script to use c++14 instead.
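For illustration, a minimal sketch of the kind of literal involved. The
constant and helper names below are hypothetical and are not taken from the
change in question; the only point is that the `0b...` spelling is standard
from C++14 on.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mask, for illustration only; not the constant from the change.
// Binary literals (0b...) are part of the language since C++14.
constexpr std::uint8_t kLowNibbleMask = 0b00001111;

static_assert(kLowNibbleMask == 0x0F,
              "binary and hex spellings denote the same value");

// Keeps only the low four bits of a byte.
constexpr std::uint8_t lowNibble(std::uint8_t byte) {
  return static_cast<std::uint8_t>(byte & kLowNibbleMask);
}
```

Code that must still build as C++11 can spell the same mask in hex (0x0F).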
Reviewed by: dvyukov
Differential Revision: https://reviews.llvm.org/D77111
Existing tiling implementation of Linalg would still work for tiling
the batch dimensions of the convolution op.
Differential Revision: https://reviews.llvm.org/D76637
Summary: This patch preserves information from various places in EarlyCSE into assume bundles.
Reviewers: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76769
--no-threads is a name copied from gold.
gold has --no-threads, --thread-count and several other --thread-count-*.
There is a need to customize the number of threads (running several lld
processes concurrently or customizing the number of LTO threads).
Having a single --threads=N is a straightforward replacement of gold's
--no-threads + --thread-count.
--no-threads is used rarely. So just delete --no-threads instead of
keeping it for compatibility for a while.
If --threads= is specified (ELF,wasm; COFF /threads: is similar),
--thinlto-jobs= defaults to --threads=,
otherwise all available hardware threads are used.
There is currently no way to override a --threads={1,2,...}. It is still
under debate whether we should use --threads=all.
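The defaulting rule above can be sketched as follows. This is not lld's
code: the function name and parameters are hypothetical, and it assumes
that an explicit --thinlto-jobs= value takes precedence over the default
derived from --threads=.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper, not lld's implementation.
// `threads` models --threads=N, `thinltoJobs` models --thinlto-jobs=N,
// `hardwareThreads` is the number of available hardware threads.
unsigned resolveThinLTOJobs(std::optional<unsigned> threads,
                            std::optional<unsigned> thinltoJobs,
                            unsigned hardwareThreads) {
  if (thinltoJobs)
    return *thinltoJobs;    // assumption: an explicit job count wins
  if (threads)
    return *threads;        // --thinlto-jobs= defaults to --threads=
  return hardwareThreads;   // otherwise use all available hardware threads
}
```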
Reviewed By: rnk, aganea
Differential Revision: https://reviews.llvm.org/D76885
This patch fixes a crash that happens on the DWARF expression evaluator
when trying to access the top of the stack while it's empty.
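The general shape of such a guard, as a sketch only. The types and names
below are hypothetical stand-ins and do not reproduce LLDB's evaluator;
the point is to check for an empty stack and report an error instead of
reading the top element.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type; LLDB uses its own status and value classes.
struct EvalResult {
  bool ok;
  std::uint64_t value;
  std::string error;
};

// Returns the top of the expression stack, or an error if the stack is
// empty. Calling back() on an empty std::vector is undefined behaviour.
EvalResult topOfStack(const std::vector<std::uint64_t> &stack) {
  if (stack.empty())
    return {false, 0, "expression stack is empty"};
  return {true, stack.back(), ""};
}
```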
rdar://60512489
Differential Revision: https://reviews.llvm.org/D77108
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Summary:
Add support for TupleGetOp folding through InsertSlicesOp and ExtractSlicesOp.
Vector-to-vector transformations for unrolling and lowering to hardware vectors
can generate chains of structured vector operations (InsertSlicesOp,
ExtractSlicesOp and ShapeCastOp) between the producer of a hardware vector
value and its consumer. Because InsertSlicesOp, ExtractSlicesOp and ShapeCastOp
are structured, we can track the location (tuple index and vector offsets) of
the consumer vector value through the chain of structured operations to the
producer, enabling a much more powerful producer-consumer forwarding of values
through structured ops and tuple, which in turn enables a more powerful
TupleGetOp folding transformation.
Reviewers: nicolasvasilache, aartbik
Reviewed By: aartbik
Subscribers: grosul1, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76889
This makes it closer to how one would run the tests by hand, and it is
also closer to how the SSHExecutor runs the tests remotely. It also
allows using shell builtins in .sh.cpp tests when using %{exec}.
This is the Waymarking algorithm implemented as an independent utility.
The utility is operating on a range of sequential elements.
First we "tag" the elements, by calling `fillWaymarks`.
Then we can "follow" the tags from every element inside the tagged
range, and reach the "head" (the first element), by calling
`followWaymarks`.
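To illustrate the tag-then-follow idea, here is a deliberately simplified
sketch. It is not the encoding used by the utility: it stores a small
"steps back" value per element, and the names `fillWaymarksSketch`,
`followWaymarksSketch` and `kMaxStep` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: each tag has room for a step of at most 3 (two bits).
constexpr std::size_t kMaxStep = 3;

// Tag every element with how far to jump back towards the head.
// The head (index 0) gets tag 0, which terminates the walk.
void fillWaymarksSketch(std::vector<std::uint8_t> &tags) {
  for (std::size_t i = 0; i < tags.size(); ++i)
    tags[i] = static_cast<std::uint8_t>(std::min(i, kMaxStep));
}

// Follow the tags from `index` until the head is reached.
std::size_t followWaymarksSketch(const std::vector<std::uint8_t> &tags,
                                 std::size_t index) {
  while (tags[index] != 0)
    index -= tags[index];
  return index;
}
```

The walk never underflows because an element at index i is tagged with at
most i steps.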
Differential Revision: https://reviews.llvm.org/D74415
Based on the current discussion in https://llvm.org/PR45307, it seems
that it's legitimate for `temp_directory_path()` to return a path with
a trailing slash. Since `p.parent_path()` will never contain a trailing
slash, comparing it to the result of `temp_directory_path()` will fail
depending on whether `temp_directory_path()` returns a trailing slash
or not.
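A small sketch of the mismatch and one way to compare robustly, using
std::filesystem from C++17. The helper names are hypothetical and this is
not the code from the test suite.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: drop one trailing separator so that "/tmp/" and
// "/tmp" compare equal. A path ending in a separator has an empty
// filename component.
fs::path withoutTrailingSlash(const fs::path &p) {
  return p.has_filename() ? p : p.parent_path();
}

// True if `file` sits directly in `dir`, whether or not `dir` was
// spelled with a trailing slash.
bool isDirectlyInside(const fs::path &file, const fs::path &dir) {
  return file.parent_path() == withoutTrailingSlash(dir);
}
```

Comparing `file.parent_path()` against `dir` directly would fail when `dir`
is spelled "/tmp/", because path equality compares element by element and
the trailing separator contributes an empty last element.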
If canLowerByDroppingEvenElements indicates that the shuffle is an N:1 compaction pattern and the inputs are suitably sign/zero extended then we can use a chain of PACKSS/PACKUS to compact.
This helps avoid PSHUFB (and its mask load) for short shuffle chains; shuffle combining will still replace with a PSHUFB if we have enough shuffles, as getFauxShuffleMask can recognise PACKSS/PACKUS chains.
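A scalar model of why the extension requirement matters, for the 2:1 case
with 16-bit inputs. This is only an illustration, not the X86 lowering
code; `packus` models the saturating behaviour of PACKUSWB and
`keepLowBytes` models the byte shuffle that keeps the low byte of every
16-bit element. Both function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Signed 16-bit to unsigned 8-bit with unsigned saturation.
std::uint8_t saturateU8(std::int16_t v) {
  if (v < 0)
    return 0;
  if (v > 255)
    return 255;
  return static_cast<std::uint8_t>(v);
}

// Scalar model of PACKUSWB on two 8 x i16 inputs.
std::array<std::uint8_t, 16> packus(const std::array<std::int16_t, 8> &a,
                                    const std::array<std::int16_t, 8> &b) {
  std::array<std::uint8_t, 16> out{};
  for (int i = 0; i < 8; ++i) {
    out[i] = saturateU8(a[i]);
    out[i + 8] = saturateU8(b[i]);
  }
  return out;
}

// The compaction written as a plain truncation: keep the low byte of
// every 16-bit element of a, then of b.
std::array<std::uint8_t, 16>
keepLowBytes(const std::array<std::int16_t, 8> &a,
             const std::array<std::int16_t, 8> &b) {
  std::array<std::uint8_t, 16> out{};
  for (int i = 0; i < 8; ++i) {
    out[i] = static_cast<std::uint8_t>(a[i] & 0xFF);
    out[i + 8] = static_cast<std::uint8_t>(b[i] & 0xFF);
  }
  return out;
}
```

When every element is zero extended from 8 bits (0..255), saturation never
triggers and the two functions agree. For a value such as 300 they differ,
which is why the inputs have to be suitably extended first.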
Otherwise, trying to reproduce a failing filesystem test by copy-pasting
the command-line used and running that in the shell won't work, because
the shell will eat quoting around the define and we'll end up with a
non-stringized path in the .cpp file.
That way, local lit configuration files don't have to worry about
deep-copying the compiler instance of the test format, which is
arguably an implementation detail.
We pass the config to this method even though it is not used by the
current test format because this allows replacing the current test
format by other test formats that would require the config to add
new compile flags.
This reduces the complexity of our already complex global lit configuration,
and also avoids cluttering the compilation commands for all tests with
things that are only relevant to the filesystem tests.
Differential Revision: https://reviews.llvm.org/D76785
The script now includes extra info about command-line options used
when generating its advertisement heading, but we don't want that
here. This is a special-case because we have enhanced the check
lines (as noted in the 2nd comment line).
Previously, filesystem tests would require LIBCXX_FILESYSTEM_DYNAMIC_TEST_ROOT
to be present in the environment and to match the value provided when
compiling, as a macro. This has the problem that it only allows for the
filesystem tests to be run on the same machine where they were built.
Instead, we create a temporary directory for each test. Technically,
this is tricky to do because we're relying on some of the code that
we're testing to do this. However, there's no other portable way of
creating temporary directories in C++, so this is difficult to avoid.
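A sketch of what creating a per-test directory with std::filesystem can
look like. This is not the test suite's helper: `makeScratchDir`, its
prefix and its retry limit are hypothetical. It also shows the circularity
mentioned above, since the helper itself relies on `temp_directory_path`
and `create_directory`.

```cpp
#include <cassert>
#include <filesystem>
#include <random>
#include <stdexcept>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: create a fresh directory under the system
// temporary directory and return its path.
fs::path makeScratchDir(const std::string &prefix = "scratch-") {
  std::random_device rd;
  const fs::path base = fs::temp_directory_path();
  for (int attempt = 0; attempt < 100; ++attempt) {
    fs::path candidate = base / (prefix + std::to_string(rd()));
    std::error_code ec;
    // create_directory returns true only if this call created the
    // directory, so an existing name simply triggers another attempt.
    if (fs::create_directory(candidate, ec))
      return candidate;
  }
  throw std::runtime_error("could not create a scratch directory");
}
```

A test would typically remove the directory again with `fs::remove_all`
once it is done.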
Differential Revision: https://reviews.llvm.org/D76731