Summary:
When no scope qualifier is specified, allow completing index symbols
from any scope and insert proper automatically. This is still experimental and
hidden behind a flag.
Things missing:
- Scope proximity based scoring.
- FuzzyFind supports weighted scopes.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: kbobyrev, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52364
llvm-svn: 343248
Summary:
_Note_: I am not attached to the name `DenseSizeClassMap`, so if someone has a
better idea, feel free to suggest it.
The current pre-defined `SizeClassMap` hold a decent amount of cached entries,
either in cheer number of, or in amount of memory cached.
Empirical testing shows that more compact per-class arrays (whose sizes are
directly correlated to the number of cached entries) are beneficial to
performances, particularly in highly threaded environments.
The new proposed `SizeClassMap` has the following properties:
```
c00 => s: 0 diff: +0 00% l 0 cached: 0 0; id 0
c01 => s: 16 diff: +16 00% l 4 cached: 8 128; id 1
c02 => s: 32 diff: +16 100% l 5 cached: 8 256; id 2
c03 => s: 48 diff: +16 50% l 5 cached: 8 384; id 3
c04 => s: 64 diff: +16 33% l 6 cached: 8 512; id 4
c05 => s: 80 diff: +16 25% l 6 cached: 8 640; id 5
c06 => s: 96 diff: +16 20% l 6 cached: 8 768; id 6
c07 => s: 112 diff: +16 16% l 6 cached: 8 896; id 7
c08 => s: 128 diff: +16 14% l 7 cached: 8 1024; id 8
c09 => s: 144 diff: +16 12% l 7 cached: 7 1008; id 9
c10 => s: 160 diff: +16 11% l 7 cached: 6 960; id 10
c11 => s: 176 diff: +16 10% l 7 cached: 5 880; id 11
c12 => s: 192 diff: +16 09% l 7 cached: 5 960; id 12
c13 => s: 208 diff: +16 08% l 7 cached: 4 832; id 13
c14 => s: 224 diff: +16 07% l 7 cached: 4 896; id 14
c15 => s: 240 diff: +16 07% l 7 cached: 4 960; id 15
c16 => s: 256 diff: +16 06% l 8 cached: 4 1024; id 16
c17 => s: 320 diff: +64 25% l 8 cached: 3 960; id 49
c18 => s: 384 diff: +64 20% l 8 cached: 2 768; id 50
c19 => s: 448 diff: +64 16% l 8 cached: 2 896; id 51
c20 => s: 512 diff: +64 14% l 9 cached: 2 1024; id 48
c21 => s: 640 diff: +128 25% l 9 cached: 1 640; id 49
c22 => s: 768 diff: +128 20% l 9 cached: 1 768; id 50
c23 => s: 896 diff: +128 16% l 9 cached: 1 896; id 51
c24 => s: 1024 diff: +128 14% l 10 cached: 1 1024; id 48
c25 => s: 1280 diff: +256 25% l 10 cached: 1 1280; id 49
c26 => s: 1536 diff: +256 20% l 10 cached: 1 1536; id 50
c27 => s: 1792 diff: +256 16% l 10 cached: 1 1792; id 51
c28 => s: 2048 diff: +256 14% l 11 cached: 1 2048; id 48
c29 => s: 2560 diff: +512 25% l 11 cached: 1 2560; id 49
c30 => s: 3072 diff: +512 20% l 11 cached: 1 3072; id 50
c31 => s: 3584 diff: +512 16% l 11 cached: 1 3584; id 51
c32 => s: 4096 diff: +512 14% l 12 cached: 1 4096; id 48
c33 => s: 5120 diff: +1024 25% l 12 cached: 1 5120; id 49
c34 => s: 6144 diff: +1024 20% l 12 cached: 1 6144; id 50
c35 => s: 7168 diff: +1024 16% l 12 cached: 1 7168; id 51
c36 => s: 8192 diff: +1024 14% l 13 cached: 1 8192; id 48
c37 => s: 10240 diff: +2048 25% l 13 cached: 1 10240; id 49
c38 => s: 12288 diff: +2048 20% l 13 cached: 1 12288; id 50
c39 => s: 14336 diff: +2048 16% l 13 cached: 1 14336; id 51
c40 => s: 16384 diff: +2048 14% l 14 cached: 1 16384; id 48
c41 => s: 20480 diff: +4096 25% l 14 cached: 1 20480; id 49
c42 => s: 24576 diff: +4096 20% l 14 cached: 1 24576; id 50
c43 => s: 28672 diff: +4096 16% l 14 cached: 1 28672; id 51
c44 => s: 32768 diff: +4096 14% l 15 cached: 1 32768; id 48
c45 => s: 40960 diff: +8192 25% l 15 cached: 1 40960; id 49
c46 => s: 49152 diff: +8192 20% l 15 cached: 1 49152; id 50
c47 => s: 57344 diff: +8192 16% l 15 cached: 1 57344; id 51
c48 => s: 65536 diff: +8192 14% l 16 cached: 1 65536; id 48
c49 => s: 81920 diff: +16384 25% l 16 cached: 1 81920; id 49
c50 => s: 98304 diff: +16384 20% l 16 cached: 1 98304; id 50
c51 => s: 114688 diff: +16384 16% l 16 cached: 1 114688; id 51
c52 => s: 131072 diff: +16384 14% l 17 cached: 1 131072; id 48
c53 => s: 64 diff: +0 00% l 0 cached: 8 512; id 4
Total cached: 864928 (152/432)
```
It holds a bit less of 1MB of cached entries at most, and the cache fits in a
page.
The plan is to use this map by default for Scudo once we make sure that there
is no unforeseen impact for any of current use case.
Benchmarks give the most increase in performance (with Scudo) when looking at
highly threaded/contentious environments. For example, rcp2-benchmark
experiences a 10K QPS increase (~3%), and a decrease of 50MB for the max RSS
(~10%). On platforms like Android where we only have a couple of caches,
performance remain similar.
Reviewers: eugenis, kcc
Reviewed By: eugenis
Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits
Differential Revision: https://reviews.llvm.org/D52371
llvm-svn: 343246
Summary:
-lm is needed for these tests on Linux, but the lit config for this package automatically adds it for Linux and excludes it for Windows. So we should be able to get these tests running again by just dropping -lm and let the lit config add it when possible.
I was under the impression that -lm worked across platforms because it exists in other tests without and 'UNSUPPORTED: windows' commands (e.g. divsc3_test.c), but those are actually excluded because they 'REQUIRES: c99-complex' which is excluded from windows platforms (also by the local lit config).
I don't have easy access to a windows machine to verify this patch, but I can trigger a build bot run on clang-x64-ninja-win7 shortly after submitting.
Reviewers: hans
Subscribers: dberris, delcypher, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D52563
llvm-svn: 343245
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.
llvm-svn: 343244
- Add latency timings to GDB packet log summary if timestamps are on log
- Add the ability to plot the latencies for each packet type with --plot
- Don't crash the script when target xml register info is in wierd format
llvm-svn: 343243
'ld{{.*}}"' seems to match the complete line for me which is failing
the test. Only allow an optional '.exe' for Windows systems as most
other tests do.
Another possibility would be to collapse the greedy expression with
the next check to avoid matching the full line.
Differential Revision: https://reviews.llvm.org/D52619
llvm-svn: 343240
Now that D51487 has landed, the last machine verifier tests that failed EXPENSIVE_CHECKS builds have now been fixed/removed, so we can remove @MatzeB 's isMachineVerifierClean() hack for sparc targets.
Differential Revision: https://reviews.llvm.org/D52612
llvm-svn: 343232
Bits [23-22] are used in Add and Sub to specify the shift. The value of the
shift field must be 0x; values of 1x are unallocated. MTE adds some instructions
that use such encodings, and this patch refactors the Add/Sub class so that
another class could derive from this one to implement other encodings and other
formats of bitfields.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52489
llvm-svn: 343231
When looking for the bclib Clang considered the default library
path first while it preferred directories in LIBRARY_PATH when
constructing the invocation of nvlink. The latter actually makes
more sense because during development it allows using a non-default
runtime library. So change the search for the bclib to start
looking in directories given by LIBRARY_PATH.
Additionally add a new option --libomptarget-nvptx-path= which
will be searched first. This will be handy for testing purposes.
Differential Revision: https://reviews.llvm.org/D51686
llvm-svn: 343230
This adds two new barrier instructions which can be used to restrict
speculative execution of load instructions.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52483
llvm-svn: 343229
When C is not zero and infinites are not allowed (C / X) > 0 is a sign
test. Depending on the sign of C, the predicate must be swapped.
E.g.:
foo(double X) {
if ((-2.0 / X) <= 0) ...
}
=>
foo(double X) {
if (X >= 0) ...
}
Patch by: @marels (Martin Elshuber)
Differential Revision: https://reviews.llvm.org/D51942
llvm-svn: 343228
Summary:
Add a dominance check to ensure that the possible devirtualizable
call is actually dominated by the type test/checked load intrinsic being
analyzed. With PGO, after indirect call promotion is performed during
the compile step, followed by inlining, we may have a type test in the
promoted and inlined sequence that allows an indirect call in that
sequence to be devirtualized. That indirect call (inserted by inlining
after promotion) will share the same vtable pointer as the fallback
indirect call that cannot be devirtualized.
Before this patch the code was incorrectly devirtualizing the fallback
indirect call.
See the new test and the example described there for more details.
Reviewers: pcc, vitalybuka
Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D52514
llvm-svn: 343226
This adds new instructions used by the Branch Target Identification
feature. When this is enabled, these are the only instructions which can
be targeted by indirect branch instructions.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52485
llvm-svn: 343225
The implementation of this is in TargetParser, so we only need to add a
test for it in clang.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52492
llvm-svn: 343220
This adds some new system registers which can be used to restrict
certain types of speculative execution.
Patch by Pablo Barrio and David Spickett!
Differential revision: https://reviews.llvm.org/D52482
llvm-svn: 343218
This adds two new system registers, used to generate random numbers.
This is an optional extension to v8.5-A, and will be controlled by the
"+rng" modifier of the -march= and -mcpu= options.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52481
llvm-svn: 343217
This adds a new variant of the DC system instruction for persistent
memory.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52480
llvm-svn: 343216
This adds new system instructions which act as barriers to speculative
execution based on earlier execution within a particular execution
context.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52479
llvm-svn: 343214
This is a new barrier which limits speculative execution of the
instructions following it.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52477
llvm-svn: 343213
IslAst could mark two nested outer loops as "OutermostParallel". It
caused that the code generator tried to OpenMP-parallelize both loops,
which it is not prepared loop.
It was because the recursive AST build algorithm managed a flag
"InParallelFor" to ensure that no nested loop is also marked as
"OutermostParallel". Unfortunatetly the same flag was used by nodes
marked as SIMD, and reset to false after the SIMD node. Since loops can
be marked as SIMD inside "OutermostParallel" loops, the recursive
algorithm again tried to mark loops as "OutermostParellel" although
still nested inside another "OutermostParallel" loop.
The fix exposed another bug: The function "astScheduleDimIsParallel" was
only called when a loop was potentially "OutermostParallel" or
"InnermostParallel", but as a side-effect also determines the minimum
dependence distance. Hence, changing when we need to know whether a loop
is "OutermostParallel" also changed which loop was annotated with
"#pragma minimal dependence distance".
Moreover, some complex condition linked with "InParallelFor" determined
whether a loop should be an "InnermostParallel" loop. It missed some
situations where it would not use mark as such although being inside an
SIMD mark node, and therefore not be annotated using "#pragma simd".
The changes in particular:
1. Split the "InParallelFor" flag into an "InParallelFor" and an
"InSIMD" flag.
2. Unconditionally call "astScheduleDimIsParallel" for its side-effects
and store the result in "InParallel" for later use.
3. Simplify the condition when a loop is "InnermostParallel".
Fixes llvm.org/PR33153 and llvm.org/PR38073.
llvm-svn: 343212
This is a new barrier which limits speculative execution of the
instructions following it.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52476
llvm-svn: 343211
Summary: It is currently broken and for Sparc there is not much benefit
in using a builtin version compared to a library version. Both versions
needs to store the same four values in setjmp and flush the register
windows in longjmp. If the need for a builtin setjmp/longjmp arises there
is an improved implementation available at https://reviews.llvm.org/D50969.
Reviewers: jyknight, joerg, venkatra
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D51487
llvm-svn: 343210
These are some new variants of the "Floating-point Round to Integral"
family of instructions, which round to the nearest floating-point value
which fits in a 32- or 64-bit integer.
Patch by Pablo Barrio!
Differential revision: https://reviews.llvm.org/D52475
llvm-svn: 343209
Summary: The key is now the resource name, not the resource id.
Reviewers: gchatelet
Subscribers: tschuett, RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D52607
llvm-svn: 343208
Add cl_khr_depth_images to extension-version.cl.
Extend to_addr_builtin.cl to additionally test the built-in methods
to_private and to_local, and test assignment with to_global to
incorrect types.
Patch by Alistair Davies.
Differential Revision: https://reviews.llvm.org/D52020
llvm-svn: 343207
Summary: Use 0 as the default immediate for the UNIMP instruction.
This matches the behavior in gas.
Reviewers: jyknight, venkatra
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D51526
llvm-svn: 343203
Summary:
Partial write %PSR (WRPSR) is a SPARC V8e option that allows WRPSR
instructions to only affect the %PSR.ET field. It is supported by
the GR740 and GR716.
Reviewers: jyknight, venkatra
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D48644
llvm-svn: 343202