The approach I took was to define a dialect 'extern' attribute that a GlobalOp can take as a value to signify external linkage. I think this approach should compose well and should also work with wherever the OpaqueElements work goes in the future (since that is just another kind of attribute). I special cased the GlobalOp parser/printer for this case because it is significantly easier on the eyes.
In the discussion, Jeff Niu had proposed an alternative syntax for GlobalOp that I ended up not taking. I did try to implement it but a) I don't think it made anything easier to read in the common case, and b) it made the parsing/printing logic a lot more complicated (I think I would need a completely custom parser/printer to do it well). Please have a look at the common cases where the global type and initial value type match: I don't think how I have it is too bad. The less common cases seem ok to me.
I chose to only implement the direct, constant load op since that is non side effecting and there was still discussion pending on that.
Differential Revision: https://reviews.llvm.org/D124318
This patch changes the strategy for vectorizing freeze instructions, from
replicating multiple times to widening according to the selected VF.
Fixes #54992
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D125016
In RISCVTargetTransformInfo, enumerating processor families is not a good way to decide the unroll preferences,
because it requires listing many subtarget families and is hard to update when a new subtarget is added.
Instead, create a feature to distinguish whether targets want to use the default unroll preferences or not.
Keep TuneSiFive7 because it is a flag that indicates the subtarget family, which may be used in other places.
Differential Revision: https://reviews.llvm.org/D125741
This should fix the issues introduced by d71d1a9, which skipped all the
test setup commands.
This also fixes the test failures happening in TestAutosuggestion.py.
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
In this patch we add a function foldICmpInstWithConstantAllowUndef
to fold integer comparisons with a constant operand: icmp Pred X, C
where X is some kind of instruction and C is a constant that may contain undef elements.
We move this fold to the new function so that it can handle undef elements in a vector.
Reviewed By: spatel, RKSimon
Differential Revision: https://reviews.llvm.org/D125220
This patch subtracts 1 from the pc of any frame above frame 0 to get the
previous line entry and display the right line in the debugger.
This also rephrases an old comment from `48d157dd4`.
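The pc adjustment above can be illustrated with a minimal sketch. It is not the lldb implementation: the line table, the addresses and the helper names below are hypothetical and only show why looking up `pc - 1` lands on the call's line.

```python
import bisect

# Hypothetical line table: sorted (start_address, source_line) pairs.
LINE_TABLE = [(0x1000, 10), (0x1008, 11), (0x1010, 12), (0x1018, 13)]


def line_for_address(addr):
    """Return the source line of the entry whose range contains addr."""
    starts = [start for start, _ in LINE_TABLE]
    index = bisect.bisect_right(starts, addr) - 1
    if index < 0:
        raise ValueError("address precedes the line table")
    return LINE_TABLE[index][1]


def line_for_frame(frame_index, pc):
    """Frames above frame 0 hold a return address, so look up pc - 1."""
    lookup_pc = pc if frame_index == 0 else pc - 1
    return line_for_address(lookup_pc)
```

In this sketch a call on line 11 returns to 0x1010, which already belongs to line 12; subtracting 1 moves the lookup back into the entry for line 11.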
rdar://92686666
Differential Revision: https://reviews.llvm.org/D125928
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
When the terminal window is too small, lldb would wrap progress messages
across multiple lines which would break the progress event handling
code that is supposed to clear the message once the progress is completed.
This causes the progress message to remain on the screen, sometimes partially,
which can be confusing for the user.
To fix this issue, this patch trims the progress message to the terminal
width taking into account the progress counter leading the message for
finite progress events and also the trailing `...`.
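A minimal sketch of the trimming described above, assuming the counter is rendered as `[completed/total] ` in front of the message. The function name and the exact counter format are assumptions made for illustration, not the code from this patch.

```python
def trim_progress_message(message, term_width, completed=None, total=None):
    """Fit a progress line into term_width columns.

    Finite progress events get a leading "[completed/total] " counter
    (assumed format); every line ends with "...".
    """
    prefix = f"[{completed}/{total}] " if total is not None else ""
    ellipsis = "..."
    budget = term_width - len(prefix) - len(ellipsis)
    if budget <= 0:
        # Assumption: print nothing if even the decorations do not fit.
        return ""
    return prefix + message[:budget] + ellipsis
```

Because the counter and the trailing dots are subtracted from the budget first, the resulting line never exceeds the terminal width and therefore never wraps.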
rdar://91993836
Differential Revision: https://reviews.llvm.org/D124785
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch adds a new `use_colors` argument to the PExpect.launch
method.
As the name suggests, it allows the user to conditionally enable color
support in the debugger, which can be helpful to test functionalities that
rely on that, like progress reporting. It defaults to False.
Differential Revision: https://reviews.llvm.org/D125915
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
It can happen on macOS that the terminal doesn't report the "colors"
capability in the terminfo database, in which case `tigetnum` returns -1.
This doesn't mean, however, that the terminal doesn't support color; it
just means that the capability is absent from the terminal description.
In that case, we should still fall back to checking the $TERM
environment variable to see if it supports ANSI escape codes.
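The fallback can be sketched as follows. The function takes the value that `tigetnum("colors")` would return and the contents of $TERM as plain arguments so it can run anywhere; the function name and the list of terminal names are illustrative assumptions, not the list used by the patch.

```python
def terminal_has_colors(colors_capability, term_env):
    """Decide color support from the terminfo "colors" value and $TERM.

    colors_capability mirrors tigetnum("colors"): -2 if the name is not a
    numeric capability, -1 if it is absent from the terminal description,
    otherwise the number of colors.
    """
    if colors_capability > 0:
        return True
    if colors_capability == -1:
        # Capability is absent: fall back to a $TERM heuristic.
        # The names below are an assumption made for this sketch.
        ansi_terms = ("ansi", "xterm", "screen", "vt100", "rxvt", "linux")
        return any(name in term_env for name in ansi_terms)
    return False
```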
Differential Revision: https://reviews.llvm.org/D125914
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Make sure that we really don't emit quad-precision unless the "hard-quad-float"
feature is available. Add missing replacement instruction patterns that are
needed to emit alternative code for conditional moves of quad-precision floats.
Test from koakuma.
Reviewed By: koakuma
Differential Revision: https://reviews.llvm.org/D119104
Add a test with standard-conforming lcobound()
intrinsic function invocations. Also test that several
non-conforming lcobound() invocations generate the correct error
messages.
Differential Revision: https://reviews.llvm.org/DD123747
Add a new testcase that reproduces a bug when BOLTing current
trunk LLD bootstrapped with trunk clang. This makes it official
that we do not support this transformation but are working on
it. When the support is ready, XFAIL should be removed.
Reviewed By: maksfb, Amir, yota9
Differential Revision: https://reviews.llvm.org/D125843
The commit added a dependency on LLVMSymbolize but the
CMakeLists.txt file wasn't updated. This doesn't cause
issues for static libraries builds but breaks the shared
libraries build. This just adds the missing dependency.
This is the test which triggered my disabling of the assert in d4545e6. The
issue it reveals is basically the same as from cc0283a6, but in the cross
block case.
We visit block1, mutate the setvli (correctly), and then visit block two and
ask whether the vadd is compatible with the block state. Before mutation, it
wasn't. After mutation, it is. And thus, we have our phase 1 vs 3 difference.
Initial introduction of the new macro before obsoleting the old one - the old name was really confusing.
Also moved the SANITIZER_WATCHOS and SANITIZER_TVOS definitions under the common #if defined(__APPLE__) block.
Differential Revision: https://reviews.llvm.org/D125816
On 64-bit Windows, size_t is an alias for unsigned long long. In the strlcpy
tests, the return value of strlcpy (a size_t) is compared to an unsigned
long. On Linux unsigned long and unsigned long long are both 64 bits,
but on Windows unsigned long is 32 bits. Since the macros require
identical types for both sides, this caused a build failure on Windows.
This patch changes the constants to be explicit size_t values.
Differential Revision: https://reviews.llvm.org/D125917
An upcoming patch will extend llvm-symbolizer to provide the source line
information for global variables. The goal is to move AddressSanitizer
off of internal debug info for symbolization onto the DWARF standard
(and doing a clean-up in the process). Currently, ASan reports the line
information for constant strings if a memory safety bug happens around
them. We want to keep this behaviour, so we need to emit debuginfo for
these variables as well.
Reviewed By: dblaikie, rnk, aprantl
Differential Revision: https://reviews.llvm.org/D123534
The pattern matching and vectorization for reductions was not very
effective. Some of the possible reduction values were marked as
external arguments, SLP could not find some reduction patterns because
it attempted to vectorize pairs of binop arguments too early, and the cost of
constant reductions was not correct. The patch addresses these issues and
improves the analysis/cost estimation and vectorization of
reductions.
The most significant changes in SLP.NumVectorInstructions:
Metric: SLP.NumVectorInstructions
Program results results0 diff
test-suite :: SingleSource/Benchmarks/Adobe-C++/loop_unroll.test 920.00 3548.00 285.7%
test-suite :: SingleSource/Benchmarks/BenchmarkGame/n-body.test 66.00 122.00 84.8%
test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG/miniGMG.test 100.00 128.00 28.0%
test-suite :: MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc.test 664.00 810.00 22.0%
test-suite :: MultiSource/Benchmarks/mafft/pairlocalalign.test 592.00 687.00 16.0%
test-suite :: MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame.test 402.00 426.00 6.0%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test 1665.00 1745.00 4.8%
test-suite :: External/SPEC/CINT2017rate/500.perlbench_r/500.perlbench_r.test 135.00 139.00 3.0%
test-suite :: External/SPEC/CINT2017speed/600.perlbench_s/600.perlbench_s.test 135.00 139.00 3.0%
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test 388.00 397.00 2.3%
test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test 895.00 914.00 2.1%
test-suite :: MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm.test 240.00 244.00 1.7%
test-suite :: MultiSource/Benchmarks/mediabench/gsm/toast/toast.test 240.00 244.00 1.7%
test-suite :: External/SPEC/CINT2017speed/602.gcc_s/602.gcc_s.test 820.00 832.00 1.5%
test-suite :: External/SPEC/CINT2017rate/502.gcc_r/502.gcc_r.test 820.00 832.00 1.5%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 14804.00 14914.00 0.7%
test-suite :: MultiSource/Benchmarks/Bullet/bullet.test 8125.00 8183.00 0.7%
test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test 1330.00 1338.00 0.6%
test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test 1330.00 1338.00 0.6%
test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test 9832.00 9880.00 0.5%
test-suite :: External/SPEC/CFP2017rate/511.povray_r/511.povray_r.test 5267.00 5291.00 0.5%
test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test 4018.00 4024.00 0.1%
test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test 4018.00 4024.00 0.1%
test-suite :: External/SPEC/CFP2017speed/644.nab_s/644.nab_s.test 426.00 424.00 -0.5%
test-suite :: External/SPEC/CFP2017rate/544.nab_r/544.nab_r.test 426.00 424.00 -0.5%
test-suite :: External/SPEC/CINT2017rate/541.leela_r/541.leela_r.test 201.00 192.00 -4.5%
test-suite :: External/SPEC/CINT2017speed/641.leela_s/641.leela_s.test 201.00 192.00 -4.5%
644.nab_s and 544.nab_r - reduced number of shuffles but increased number
of useful vectorized instructions.
641.leela_s and 541.leela_r - the function
`@_ZN9FastBoard25get_pattern3_augment_specEiib` is not inlined anymore
but its body gets vectorized successfully. Before, the function was
inlined twice and vectorized just after inlining; currently this is not
required. The vector code looks pretty similar to what it was before.
Differential Revision: https://reviews.llvm.org/D111574
Summary:
Fixed the return value of curr_symbol() for Russian in libcxx/test/support/locale_helpers.h on AIX.
Reviewers: David Tenty, Mark de Wever
Differential Revision: https://reviews.llvm.org/D125801
I noticed https://reviews.llvm.org/D87415 added SDAG combines to fold
FMIN/MAX instrs with NaNs.
The patch implements the same NaN combines for GISel GMIR FMIN/MAX opcodes:
G_FMINNUM(X, NaN) -> X
G_FMAXNUM(X, NaN) -> X
G_FMINIMUM(X, NaN) -> NaN
G_FMAXIMUM(X, NaN) -> NaN
The patch adds AArch64 tests for these combines as well.
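The four folds follow from the NaN behavior of the two families of operations. A minimal sketch of those semantics on Python floats, ignoring signaling NaNs and signed zeros; the helper names simply mirror the opcodes and are not LLVM APIs:

```python
import math


def fminnum(x, y):
    """minnum-style: if one operand is NaN, return the other operand."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return min(x, y)


def fmaxnum(x, y):
    """maxnum-style: if one operand is NaN, return the other operand."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return max(x, y)


def fminimum(x, y):
    """minimum-style: NaN if either operand is NaN."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    return min(x, y)


def fmaximum(x, y):
    """maximum-style: NaN if either operand is NaN."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    return max(x, y)
```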
Reviewed by: arsenm
Differential revision: https://reviews.llvm.org/D125819
When shifting by a byte-multiple:
bswap (shl X, Y) --> lshr (bswap X), Y
bswap (lshr X, Y) --> shl (bswap X), Y
This was limited to constants as a first step in D122010 / 60820e53ec,
but issue #55327 shows a source example (and there's a test based on that here)
where a variable shift amount is used in this pattern.
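The identity can be checked numerically. A small sketch for 32-bit values, with helper names chosen for this example only; it also shows that the fold does not hold when the shift amount is not a multiple of 8:

```python
MASK32 = 0xFFFFFFFF


def bswap32(x):
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes(x.to_bytes(4, "little"), "big")


def shl32(x, y):
    """Logical shift left, truncated to 32 bits."""
    return (x << y) & MASK32


def lshr32(x, y):
    """Logical shift right on a 32-bit value."""
    return x >> y


def fold_holds(x, y):
    """Check both folds for value x and shift amount y (0 <= y < 32)."""
    shl_ok = bswap32(shl32(x, y)) == lshr32(bswap32(x), y)
    lshr_ok = bswap32(lshr32(x, y)) == shl32(bswap32(x), y)
    return shl_ok and lshr_ok
```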
Fix a couple minor details in the existing logic for calculating
saved registers and stack adjustment.
Synthesize the corresponding prologues and epilogues and print them.
(This supersedes the previous printout of one single list of stored
registers; as there are many subtle differences in how
registers are pushed/popped in various corner cases, it's better to
print the full prologue/epilogue instead of trying to condense it
into one single list.)
Print the raw values of the fields Reg, R, L (LinkRegister) and C
(Chaining) instead of only printing the derived values.
Differential Revision: https://reviews.llvm.org/D125644
We want to build libunwind, libc++abi and libc++ as universal libraries
supporting both x86_64 and arm64 architectures.
Differential Revision: https://reviews.llvm.org/D125908
Use a specialized buffer wrapper to limit the number of insertions in the
buffer. After the limit has been reached the buffer only needs to count
the number of insertions to return the buffer size required to store the
entire output.
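The idea can be sketched as a wrapper that stores characters up to a limit and only counts afterwards. This is a simplified model with hypothetical names, not the libc++ buffer class:

```python
class LimitedBuffer:
    """Store at most `limit` characters but count every insertion."""

    def __init__(self, limit):
        self.limit = limit
        self.stored = []
        self.count = 0

    def push_back(self, ch):
        # Past the limit, characters are dropped and only counted.
        if self.count < self.limit:
            self.stored.append(ch)
        self.count += 1

    def write(self, text):
        for ch in text:
            self.push_back(ch)

    def result(self):
        """Return (truncated output, size needed for the entire output)."""
        return "".join(self.stored), self.count
```

For example, writing "hello, world" into a buffer limited to 5 characters keeps "hello" and reports a required size of 12.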
Depends on D110498
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D110499
This optimizes the __format_arg_store type to allow a more efficient
storage of the basic_format_args.
It stores the data in two arrays:
- A struct with the tag of the exposition only variant's type and the
offset of the element in the data array. Since this array only depends
on the type information it's calculated at compile time and can be
shared by different instances of this class.
- The arguments converted to the types used in the exposition only
variant of basic_format_arg. This means the packed data can be
directly copied to an element of this variant.
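As a rough analogy for the two-array layout, the sketch below computes a (tag, offset) table from type names alone and packs the values into one byte array. The tags, the format codes and the function names are assumptions made for illustration; the real __format_arg_store is C++ and differs in detail.

```python
import struct

# Hypothetical tags mapped to fixed-size struct format codes.
FORMATS = {"int": "<i", "double": "<d", "bool": "<?"}


def build_layout(type_names):
    """Compute (tag, offset) per argument from type information only."""
    layout, offset = [], 0
    for name in type_names:
        layout.append((name, offset))
        offset += struct.calcsize(FORMATS[name])
    return tuple(layout), offset


def pack_args(layout, size, values):
    """Store the converted arguments back to back in one data array."""
    data = bytearray(size)
    for (name, offset), value in zip(layout, values):
        struct.pack_into(FORMATS[name], data, offset, value)
    return bytes(data)


def get_arg(layout, data, index):
    """Read one argument back using its tag and offset."""
    name, offset = layout[index]
    return struct.unpack_from(FORMATS[name], data, offset)[0]
```

Since `build_layout` depends only on the type names, its result can be computed once and reused for every call with the same argument types, which loosely mirrors the compile-time sharing described above.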
The new code uses rvalue reference arguments in preparation for P2418.
The handle class also has some changes to prepare for P2418. The real
changes for P2418 will be done separately, but these parts make it
easier to implement that paper.
Some parts of existing test code are removed since they were no longer
valid after the changes, but new tests have been added.
Implements parts of:
- P2418 Add support for std::generator-like types to std::format
Completes:
- LWG3473 Normative encouragement in non-normative note
Depends on D121138
Reviewed By: #libc, vitaut, Mordante
Differential Revision: https://reviews.llvm.org/D121514
This formatter isn't in the list of required formatters in
[format.formatter.spec]/2.2
  For each charT, the string type specializations
    template<> struct formatter<charT*, charT>;
    template<> struct formatter<const charT*, charT>;
    template<size_t N> struct formatter<const charT[N], charT>;
    template<class traits, class Allocator>
      struct formatter<basic_string<charT, traits, Allocator>, charT>;
    template<class traits>
      struct formatter<basic_string_view<charT, traits>, charT>;
Since remove_cvref_t<const charT[N]> is charT[N] the formatter is
required by
[format.functions]/25
  Preconditions: formatter<remove_cvref_t<Ti>, charT> meets the
  BasicFormatter requirements ([formatter.requirements]) for each Ti in
  Args.
Depends on D120921
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D121138