llvm-project

Commit Graph

Author	SHA1	Message	Date
Joel E. Denny	b3f157a900	Revert 374651: "Reland r374392: [lit] Extend internal diff to support -U" This series of patches still breaks a Windows bot. llvm-svn: 374680	2019-10-12 18:52:05 +00:00
Joel E. Denny	57046e8fd9	Revert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use it" This series of patches still breaks a Windows bot. llvm-svn: 374679	2019-10-12 18:51:51 +00:00
Joel E. Denny	9abfa58171	Revert r374653: "[lit] Fix a few oversights in r374651 that broke some bots" This series of patches still breaks a Windows bot. llvm-svn: 374678	2019-10-12 18:51:34 +00:00
Joel E. Denny	e9d3b8192e	Revert r374665: "[lit] Try yet again to fix new tests that fail on Windows bots" This series of patches still breaks a Windows bot. llvm-svn: 374677	2019-10-12 18:51:18 +00:00
Joel E. Denny	b005d9e86f	Revert r374666: "[lit] Adjust error handling for decode introduced by r374665" This series of patches still breaks a Windows bot. llvm-svn: 374676	2019-10-12 18:51:08 +00:00
Joel E. Denny	459a93659a	Revert r374671: "[lit] Try errors="ignore" for decode introduced by r374665" This series of patches still breaks a Windows bot. llvm-svn: 374675	2019-10-12 18:50:57 +00:00
Simon Pilgrim	6716512670	[X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings llvm-svn: 374674	2019-10-12 18:33:47 +00:00
Simon Pilgrim	5f2543f8dc	SymbolRecord - consistently use explicit for single operand constructors llvm-svn: 374673	2019-10-12 17:55:09 +00:00
Simon Pilgrim	936c6b57be	SymbolRecord - fix uninitialized variable warnings. NFCI. llvm-svn: 374672	2019-10-12 17:55:01 +00:00
Joel E. Denny	1e98a6c57a	[lit] Try errors="ignore" for decode introduced by r374665 Still trying to fix the same error as in r374666. llvm-svn: 374671	2019-10-12 17:23:25 +00:00
Roman Lebedev	c8ac97edc8	[NFC][LoopIdiom] Adjust FIXME to be self-explanatory llvm-svn: 374670	2019-10-12 16:48:16 +00:00
Simon Pilgrim	66417a9f03	Replace for-loop of SmallVector::push_back with SmallVector::append. NFCI. llvm-svn: 374669	2019-10-12 16:37:02 +00:00
Simon Pilgrim	37041c7d22	Fix cppcheck shadow variable name warnings. NFCI. llvm-svn: 374668	2019-10-12 16:36:52 +00:00
Simon Pilgrim	6446079add	[X86] Use any_of/all_of patterns in shuffle mask pattern recognisers. NFCI. llvm-svn: 374667	2019-10-12 16:36:44 +00:00
Joel E. Denny	64c00893fa	[lit] Adjust error handling for decode introduced by r374665 On that decode, Windows bots fail with: ``` UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128) ``` That's the same error as before r374665 except it's now at the decode before the write to stdout. llvm-svn: 374666	2019-10-12 16:25:46 +00:00
Joel E. Denny	a271acbf79	[lit] Try yet again to fix new tests that fail on Windows bots I seem to have misread the bot logs on my last attempt. When lit's internal diff runs on Windows under Python 2.7, it's text diffs not binary diffs that need decoding to avoid this error when writing the diff to stdout: ``` UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128) ``` There is no `decode` attribute in this case under Python 3.6.8 under Ubuntu, so this patch checks for the `decode` attribute before using it here. Hopefully nothing else is needed when `decode` isn't available. It might take a couple more attempts to figure out what error handling, if any, is needed for this decoding. llvm-svn: 374665	2019-10-12 16:00:35 +00:00
Joel E. Denny	8259f7ca12	Revert r374657: "[lit] Try again to fix new tests that fail on Windows bots" llvm-svn: 374664	2019-10-12 16:00:25 +00:00
Paul Hoad	1f20bc17d0	[clang-format] Proposal for clang-format to give compiler style warnings Summary: Related somewhat to {D29039} On seeing a quote on twitter by @invalidop > If it's not formatted with clang-format it's a build error. This made me want to change the way I use clang-format into a tool that could optionally show me where my source code violates clang-format style. When I'm making a change to clang-format itself, one thing I like to do to test the change is to ensure I didn't cause a huge wave of changes, what I want to do is simply run this on a known formatted directory and see if any new differences arrive in a manner I'm used to. This started me thinking that we should allow build systems to run clang-format on a whole tree and emit compiler style warnings about files that fail clang-format in a form that would make them as a warning in most build systems and because those build systems range in their construction I don't think it's unreasonable to NOT expect them to have to do the directory searching or parsing the output replacements themselves, but simply transform that into an error code when there are changes required. I am starting this by suggesting adding a -n or -dry-run command line argument which would emit a warning/error of the form Support for various common compiler command line arguments like '-Werror' and '-ferror-limit' could make this very flexible to be integrated into build systems and CI systems. 
``` > $ /usr/bin/clang-format --dry-run ClangFormat.cpp -ferror-limit=3 -fcolor-diagnostics > ClangFormat.cpp:54:29: warning: code should be clang-formatted [-Wclang-format-violations] > static cl::list<std::string> > ^ > ClangFormat.cpp:55:20: warning: code should be clang-formatted [-Wclang-format-violations] > LineRanges("lines", cl::desc("<start line>:<end line> - format a range of\n" > ^ > ClangFormat.cpp:55:77: warning: code should be clang-formatted [-Wclang-format-violations] > LineRanges("lines", cl::desc("<start line>:<end line> - format a range of\n" > ^ ``` Reviewers: mitchell-stellar, klimek, owenpan Reviewed By: klimek Subscribers: mgorny, cfe-commits Tags: #clang-format, #clang-tools-extra, #clang Differential Revision: https://reviews.llvm.org/D68554 llvm-svn: 374663	2019-10-12 15:36:05 +00:00
Roman Lebedev	76cdcf25b8	[LoopIdiomRecognize] Recommit: BCmp loop idiom recognition Summary: This is a recommit; this originally landed in rL370454 but was subsequently reverted in rL370788 due to https://bugs.llvm.org/show_bug.cgi?id=43206 The reduced testcase was added to bcmp-negative-tests.ll as @pr43206_different_loops - we must ensure that the SCEVs we got are both for the same loop we are currently investigating. Original commit message: @mclow.lists brought this issue up in IRC. It is a reasonably common problem to compare two values for equality. Those may be just some integers, strings or arrays of integers. In C, there are the `memcmp()` and `bcmp()` functions. In C++, there is the `std::equal()` algorithm. One can also write that function manually. libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ libc++ does not do anything like that, it simply relies on simple C++'s `operator==()`. https://godbolt.org/z/er0Zwf (GOOD!) So likely, there exist certain performance opportunities. Let's compare performance of naive `std::equal()` (no `memcmp()`) with one that is using `memcmp()` (in this case, compiled with modified compiler). 
{F8768213} ``` #include <algorithm> #include <cassert> #include <cmath> #include <cstdint> #include <iterator> #include <limits> #include <random> #include <type_traits> #include <utility> #include <vector> #include "benchmark/benchmark.h" template <class T> bool equal(T* a, T* a_end, T* b) noexcept { for (; a != a_end; ++a, ++b) { if (*a != *b) return false; } return true; } template <typename T> std::vector<T> getVectorOfRandomNumbers(size_t count) { std::random_device rd; std::mt19937 gen(rd()); std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(), std::numeric_limits<T>::max()); std::vector<T> v; v.reserve(count); std::generate_n(std::back_inserter(v), count, [&dis, &gen]() { return dis(gen); }); assert(v.size() == count); return v; } struct Identical { template <typename T> static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) { auto Tmp = getVectorOfRandomNumbers<T>(count); return std::make_pair(Tmp, std::move(Tmp)); } }; struct InequalHalfway { template <typename T> static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) { auto V0 = getVectorOfRandomNumbers<T>(count); auto V1 = V0; V1[V1.size() / size_t(2)]++; // just change the value. 
return std::make_pair(std::move(V0), std::move(V1)); } }; template <class T, class Gen> void BM_bcmp(benchmark::State& state) { const size_t Length = state.range(0); const std::pair<std::vector<T>, std::vector<T>> Data = Gen::template Gen<T>(Length); const std::vector<T>& a = Data.first; const std::vector<T>& b = Data.second; assert(a.size() == Length && b.size() == a.size()); benchmark::ClobberMemory(); benchmark::DoNotOptimize(a); benchmark::DoNotOptimize(a.data()); benchmark::DoNotOptimize(b); benchmark::DoNotOptimize(b.data()); for (auto _ : state) { const bool is_equal = equal(a.data(), a.data() + a.size(), b.data()); benchmark::DoNotOptimize(is_equal); } state.SetComplexityN(Length); state.counters["eltcnt"] = benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant); state.counters["eltcnt/sec"] = benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate); const size_t BytesRead = 2 * sizeof(T) * Length; state.counters["bytes_read/iteration"] = benchmark::Counter(BytesRead, benchmark::Counter::kDefaults, benchmark::Counter::OneK::kIs1024); state.counters["bytes_read/sec"] = benchmark::Counter( BytesRead, benchmark::Counter::kIsIterationInvariantRate, benchmark::Counter::OneK::kIs1024); } template <typename T> static void CustomArguments(benchmark::internal::Benchmark* b) { const size_t L2SizeBytes = []() { for (const benchmark::CPUInfo::CacheInfo& I : benchmark::CPUInfo::Get().caches) { if (I.level == 2) return I.size; } return 0; }(); // What is the largest range we can check to always fit within given L2 cache? 
const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 / /*maximal elt size*/ sizeof(T) / /*safety margin*/ 2; b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN); } BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical) ->Apply(CustomArguments<uint8_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical) ->Apply(CustomArguments<uint16_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical) ->Apply(CustomArguments<uint32_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical) ->Apply(CustomArguments<uint64_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway) ->Apply(CustomArguments<uint8_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway) ->Apply(CustomArguments<uint16_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway) ->Apply(CustomArguments<uint32_t>); BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway) ->Apply(CustomArguments<uint64_t>); ``` {F8768210} ``` $ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx 2019-04-25 21:17:11 Running build-old/test/llvm-bcmp-bench Run on (8 X 4000 MHz CPU s) CPU Caches: L1 Data 16K (x8) L1 Instruction 64K (x4) L2 Unified 2048K (x4) L3 Unified 8192K (x1) Load Average: 0.65, 3.90, 4.14 --------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... 
--------------------------------------------------------------------------------------------------- <...> BM_bcmp<uint8_t, Identical>/512000 432131 ns 432101 ns 1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s BM_bcmp<uint8_t, Identical>_BigO 0.86 N 0.86 N BM_bcmp<uint8_t, Identical>_RMS 8 % 8 % <...> BM_bcmp<uint16_t, Identical>/256000 161408 ns 161409 ns 4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s BM_bcmp<uint16_t, Identical>_BigO 0.67 N 0.67 N BM_bcmp<uint16_t, Identical>_RMS 25 % 25 % <...> BM_bcmp<uint32_t, Identical>/128000 81497 ns 81488 ns 8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s BM_bcmp<uint32_t, Identical>_BigO 0.71 N 0.71 N BM_bcmp<uint32_t, Identical>_RMS 42 % 42 % <...> BM_bcmp<uint64_t, Identical>/64000 50138 ns 50138 ns 10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s BM_bcmp<uint64_t, Identical>_BigO 0.84 N 0.84 N BM_bcmp<uint64_t, Identical>_RMS 27 % 27 % <...> BM_bcmp<uint8_t, InequalHalfway>/512000 192405 ns 192392 ns 3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s BM_bcmp<uint8_t, InequalHalfway>_BigO 0.38 N 0.38 N BM_bcmp<uint8_t, InequalHalfway>_RMS 3 % 3 % <...> BM_bcmp<uint16_t, InequalHalfway>/256000 127858 ns 127860 ns 5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s BM_bcmp<uint16_t, InequalHalfway>_BigO 0.50 N 0.50 N BM_bcmp<uint16_t, InequalHalfway>_RMS 0 % 0 % <...> BM_bcmp<uint32_t, InequalHalfway>/128000 49140 ns 49140 ns 14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s BM_bcmp<uint32_t, InequalHalfway>_BigO 0.40 N 0.40 N BM_bcmp<uint32_t, InequalHalfway>_RMS 18 % 18 % <...> BM_bcmp<uint64_t, InequalHalfway>/64000 32101 ns 32099 ns 21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s 
eltcnt=1.3943G eltcnt/sec=1.99381G/s BM_bcmp<uint64_t, InequalHalfway>_BigO 0.50 N 0.50 N BM_bcmp<uint64_t, InequalHalfway>_RMS 1 % 1 % RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0 2019-04-25 21:19:29 Running build-new/test/llvm-bcmp-bench Run on (8 X 4000 MHz CPU s) CPU Caches: L1 Data 16K (x8) L1 Instruction 64K (x4) L2 Unified 2048K (x4) L3 Unified 8192K (x1) Load Average: 1.01, 2.85, 3.71 --------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations UserCounters... --------------------------------------------------------------------------------------------------- <...> BM_bcmp<uint8_t, Identical>/512000 18593 ns 18590 ns 37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s BM_bcmp<uint8_t, Identical>_BigO 0.04 N 0.04 N BM_bcmp<uint8_t, Identical>_RMS 37 % 37 % <...> BM_bcmp<uint16_t, Identical>/256000 18950 ns 18948 ns 37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s BM_bcmp<uint16_t, Identical>_BigO 0.08 N 0.08 N BM_bcmp<uint16_t, Identical>_RMS 34 % 34 % <...> BM_bcmp<uint32_t, Identical>/128000 18627 ns 18627 ns 37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s BM_bcmp<uint32_t, Identical>_BigO 0.16 N 0.16 N BM_bcmp<uint32_t, Identical>_RMS 35 % 35 % <...> BM_bcmp<uint64_t, Identical>/64000 18855 ns 18855 ns 37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s BM_bcmp<uint64_t, Identical>_BigO 0.32 N 0.32 N BM_bcmp<uint64_t, Identical>_RMS 33 % 33 % <...> BM_bcmp<uint8_t, InequalHalfway>/512000 9570 ns 9569 ns 73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s BM_bcmp<uint8_t, InequalHalfway>_BigO 0.02 N 0.02 N BM_bcmp<uint8_t, InequalHalfway>_RMS 29 % 29 % <...> BM_bcmp<uint16_t, InequalHalfway>/256000 9547 ns 9547 ns 74343 
bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s BM_bcmp<uint16_t, InequalHalfway>_BigO 0.04 N 0.04 N BM_bcmp<uint16_t, InequalHalfway>_RMS 29 % 29 % <...> BM_bcmp<uint32_t, InequalHalfway>/128000 9396 ns 9394 ns 73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s BM_bcmp<uint32_t, InequalHalfway>_BigO 0.08 N 0.08 N BM_bcmp<uint32_t, InequalHalfway>_RMS 30 % 30 % <...> BM_bcmp<uint64_t, InequalHalfway>/64000 9499 ns 9498 ns 73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s BM_bcmp<uint64_t, InequalHalfway>_BigO 0.16 N 0.16 N BM_bcmp<uint64_t, InequalHalfway>_RMS 28 % 28 % Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench Benchmark Time CPU Time Old Time New CPU Old CPU New --------------------------------------------------------------------------------------------------------------------------------------- <...> BM_bcmp<uint8_t, Identical>/512000 -0.9570 -0.9570 432131 18593 432101 18590 <...> BM_bcmp<uint16_t, Identical>/256000 -0.8826 -0.8826 161408 18950 161409 18948 <...> BM_bcmp<uint32_t, Identical>/128000 -0.7714 -0.7714 81497 18627 81488 18627 <...> BM_bcmp<uint64_t, Identical>/64000 -0.6239 -0.6239 50138 18855 50138 18855 <...> BM_bcmp<uint8_t, InequalHalfway>/512000 -0.9503 -0.9503 192405 9570 192392 9569 <...> BM_bcmp<uint16_t, InequalHalfway>/256000 -0.9253 -0.9253 127858 9547 127860 9547 <...> BM_bcmp<uint32_t, InequalHalfway>/128000 -0.8088 -0.8088 49140 9396 49140 9394 <...> BM_bcmp<uint64_t, InequalHalfway>/64000 -0.7041 -0.7041 32101 9499 32099 9498 ``` What can we tell from the benchmark? * Performance of naive equality check somewhat improves with element size, maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s for uint64_t. I think, that instability implies performance problems. 
* Performance of `memcmp()`-aware benchmark always maxes out at around bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput of the naive variant! * eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x) and linearly decreases with element size. For uint64_t, it's ~4x+ the elements/second. * The call is obviously more pricey than the loop with a small element count. As can be seen from the full output {F8768210}, the `memcmp()` is almost universally worse, independent of the element size (and thus buffer size) when element count is less than 8. So all in all, bcmp idiom does indeed pose untapped performance headroom. This diff does implement said idiom recognition. I think a reasonable test coverage is present, but do tell if there is anything obvious missing. Now, quality. This does succeed to build and pass the test-suite, at least without any non-bundled elements. {F8768216} {F8768217} This transform fires 91 times: ``` $ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json Tests: 1149 Metric: loop-idiom.NumBCmp Program result-new MultiSourc...Benchmarks/7zip/7zip-benchmark 79.00 MultiSource/Applications/d/make_dparser 3.00 SingleSource/UnitTests/vla 2.00 MultiSource/Applications/Burg/burg 1.00 MultiSourc.../Applications/JM/lencod/lencod 1.00 MultiSource/Applications/lemon/lemon 1.00 MultiSource/Benchmarks/Bullet/bullet 1.00 MultiSourc...e/Benchmarks/MallocBench/gs/gs 1.00 MultiSourc...gs-C/TimberWolfMC/timberwolfmc 1.00 MultiSourc...Prolangs-C/simulator/simulator 1.00 ``` The size changes are: I'm not sure what's going on with SingleSource/UnitTests/vla.test yet, did not look. 
``` $ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash Tests: 1149 Same hash: 907 (filtered out) Remaining: 242 Metric: size..text Program result-old result-new diff test-suite...ingleSource/UnitTests/vla.test 753.00 833.00 10.6% test-suite...marks/7zip/7zip-benchmark.test 1001697.00 966657.00 -3.5% test-suite...ngs-C/simulator/simulator.test 32369.00 32321.00 -0.1% test-suite...plications/d/make_dparser.test 89585.00 89505.00 -0.1% test-suite...ce/Applications/Burg/burg.test 40817.00 40785.00 -0.1% test-suite.../Applications/lemon/lemon.test 47281.00 47249.00 -0.1% test-suite...TimberWolfMC/timberwolfmc.test 250065.00 250113.00 0.0% test-suite...chmarks/MallocBench/gs/gs.test 149889.00 149873.00 -0.0% test-suite...ications/JM/lencod/lencod.test 769585.00 769569.00 -0.0% test-suite.../Benchmarks/Bullet/bullet.test 770049.00 770049.00 0.0% test-suite...HMARK_ANISTROPIC_DIFFUSION/128 NaN NaN nan% test-suite...HMARK_ANISTROPIC_DIFFUSION/256 NaN NaN nan% test-suite...CHMARK_ANISTROPIC_DIFFUSION/64 NaN NaN nan% test-suite...CHMARK_ANISTROPIC_DIFFUSION/32 NaN NaN nan% test-suite...ENCHMARK_BILATERAL_FILTER/64/4 NaN NaN nan% Geomean difference nan% result-old result-new diff count 1.000000e+01 10.00000 10.000000 mean 3.152090e+05 311695.40000 0.006749 std 3.790398e+05 372091.42232 0.036605 min 7.530000e+02 833.00000 -0.034981 25% 4.243300e+04 42401.00000 -0.000866 50% 1.197370e+05 119689.00000 -0.000392 75% 6.397050e+05 639705.00000 -0.000005 max 1.001697e+06 966657.00000 0.106242 ``` I don't have timings though. And now to the code. The basic idea is to completely replace the whole loop. If we can't fully kill it, don't transform. I have left one or two comments in the code, so hopefully it can be understood. Also, there is a few TODO's that i have left for follow-ups: * widening of `memcmp()`/`bcmp()` * step smaller than the comparison size * Metadata propagation * more than two blocks as long as there is still a single backedge? 
* ??? Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet Reviewed By: courbet Subscribers: miyuki, hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists Tags: #llvm Differential Revision: https://reviews.llvm.org/D61144 llvm-svn: 374662	2019-10-12 15:35:32 +00:00
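As a minimal sketch of the idiom described in the commit message above (not the pass itself, which works on LLVM IR), the following contrasts a hand-written element-wise equality loop with the single `std::memcmp()` call it is equivalent to for a plain integer element type. The names `equalLoop`, `equalMemcmp`, and `checkVariantsAgree` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hand-written equality loop: the shape the commit message calls the
// "naive" variant. It stops at the first mismatching element.
bool equalLoop(const std::uint32_t *a, const std::uint32_t *b, std::size_t n) {
  for (std::size_t i = 0; i != n; ++i) {
    if (a[i] != b[i])
      return false;
  }
  return true;
}

// Same answer from one library call over the raw bytes. This is valid here
// because uint32_t has no padding bits, so value equality and byte equality
// coincide.
bool equalMemcmp(const std::uint32_t *a, const std::uint32_t *b, std::size_t n) {
  return std::memcmp(a, b, n * sizeof(std::uint32_t)) == 0;
}

// Usage check: both variants must give the same answer.
void checkVariantsAgree() {
  const std::uint32_t x[4] = {1, 2, 3, 4};
  const std::uint32_t y[4] = {1, 2, 3, 4};
  const std::uint32_t z[4] = {1, 2, 9, 4};
  assert(equalLoop(x, y, 4) && equalMemcmp(x, y, 4));
  assert(!equalLoop(x, z, 4) && !equalMemcmp(x, z, 4));
}
```

The byte-wise form is only a safe substitute when equal values always have equal byte representations; floating-point elements or structs with padding would not qualify.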
Roman Lebedev	45539737dd	[NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206. The transform forgot to check SCEV loop scopes. https://bugs.llvm.org/show_bug.cgi?id=43206 llvm-svn: 374661	2019-10-12 15:35:16 +00:00
Roman Lebedev	c41e9f6bbf	[NFC][LoopIdiom] Move one bcmp test into the proper place llvm-svn: 374660	2019-10-12 15:35:09 +00:00
Sylvestre Ledru	4644e9a50a	Remove a useless allocation found by scan-build - the new Dead nested assignment check llvm-svn: 374659	2019-10-12 15:24:00 +00:00
Simon Pilgrim	9f0885d38d	[X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction This should go away once D66004 has landed and we can simplify shuffle chains using demanded elts. llvm-svn: 374658	2019-10-12 15:19:13 +00:00
Joel E. Denny	1f5823b788	[lit] Try again to fix new tests that fail on Windows bots Based on the bot logs, when lit's internal diff runs on Windows, it looks like binary diffs must be decoded also for Python 2.7. Otherwise, writing the diff to stdout fails with: ``` UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128) ``` I did not need to decode using Python 2.7.15 under Ubuntu. When I do it anyway in that case, `errors="backslashreplace"` fails for me: ``` TypeError: don't know how to handle UnicodeDecodeError in error callback ``` However, `errors="ignore"` works, so this patch uses that, hoping it'll work on Windows as well. This patch leaves `errors="backslashreplace"` for Python >= 3.5 as there's no evidence yet that doesn't work and it produces more informative binary diffs. This patch also adjusts some lit tests to succeed for either error handler. This patch adjusts changes introduced by D68664. llvm-svn: 374657	2019-10-12 14:58:43 +00:00
Joel E. Denny	0e22cb6ce3	Revert r374654: "[lit] Try to fix new tests that fail on Windows bots" llvm-svn: 374656	2019-10-12 14:58:30 +00:00
Simon Pilgrim	1b59a16c0b	[CostModel][X86] Improve sum reduction costs. I can't see any notable differences in costs between SSE2 and SSE42 arches for FADD/ADD reduction, so I've lowered the target to just SSE2. I've also added vXi8 sum reduction costs in line with the PSADBW codegen and discussions on PR42674. llvm-svn: 374655	2019-10-12 13:21:50 +00:00
Joel E. Denny	ba229557dd	[lit] Try to fix new tests that fail on Windows bots llvm-svn: 374654	2019-10-12 13:08:21 +00:00
Joel E. Denny	648875bbcf	[lit] Fix a few oversights in r374651 that broke some bots llvm-svn: 374653	2019-10-12 12:32:00 +00:00
Joel E. Denny	0f80927316	[lit] Fix internal diff's --strip-trailing-cr and use it Using GNU diff, `--strip-trailing-cr` removes a `\r` appearing before a `\n` at the end of a line. Without this patch, lit's internal diff only removes `\r` if it appears as the last character. That seems useless. This patch fixes that. This patch also adds `--strip-trailing-cr` to some tests that fail on Windows bots when D68664 is applied. Based on what I see in the bot logs, I think the following is happening. In each test there, lit diff is comparing a file with `\r\n` line endings to a file with `\n` line endings. Without D68664, lit diff reads those files with Python's universal newlines support activated, causing `\r` to be dropped. However, with D68664, lit diff reads the files in binary mode instead and thus reports that every line is different, just as GNU diff does (at least under Ubuntu). Adding `--strip-trailing-cr` to those tests restores the previous behavior while permitting the behavior of lit diff to be more like GNU diff. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D68839 llvm-svn: 374652	2019-10-12 11:58:30 +00:00
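The two behaviours contrasted in the commit message above can be sketched as follows. This is an illustrative C++ sketch, not lit's actual Python implementation; the function names are hypothetical, and each line is assumed to keep its trailing `\n`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Behaviour the message describes as the old one: drop '\r' only when it is
// the very last character, so a line ending in "\r\n" is left untouched.
std::string stripIfLastChar(std::string line) {
  if (!line.empty() && line.back() == '\r')
    line.pop_back();
  return line;
}

// Behaviour the message attributes to GNU diff's --strip-trailing-cr: drop
// a '\r' that sits directly before the final '\n'.
std::string stripTrailingCR(std::string line) {
  const std::size_t n = line.size();
  if (n >= 2 && line[n - 2] == '\r' && line[n - 1] == '\n')
    line.erase(n - 2, 1);
  return line;
}

// Usage check on a Windows-style line ending.
void checkStripBehaviours() {
  assert(stripIfLastChar("a\r\n") == "a\r\n"); // old: nothing removed
  assert(stripTrailingCR("a\r\n") == "a\n");   // fixed: '\r' removed
}
```

With the second variant, a file with `\r\n` line endings and a file with `\n` line endings compare equal line by line, which is the effect the affected tests rely on.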
Joel E. Denny	92a8294f9e	Reland r374392: [lit] Extend internal diff to support -U To avoid breaking some tests, D66574, D68664, D67643, and D68668 landed together. However, D68664 introduced an issue now addressed by D68839, with which these are now all relanding. Differential Revision: https://reviews.llvm.org/D68668 llvm-svn: 374651	2019-10-12 11:58:03 +00:00
Joel E. Denny	32096a86b2	Reland r374390: [lit] Extend internal diff to support `-` argument To avoid breaking some tests, D66574, D68664, D67643, and D68668 landed together. However, D68664 introduced an issue now addressed by D68839, with which these are now all relanding. Differential Revision: https://reviews.llvm.org/D67643 llvm-svn: 374650	2019-10-12 11:57:41 +00:00
Joel E. Denny	e4f11a3192	Reland r374389: [lit] Clean up internal diff's encoding handling To avoid breaking some tests, D66574, D68664, D67643, and D68668 landed together. However, D68664 introduced an issue now addressed by D68839, with which these are now all relanding. Differential Revision: https://reviews.llvm.org/D68664 llvm-svn: 374649	2019-10-12 11:57:20 +00:00
Joel E. Denny	daf42dc36d	Reland r374388: [lit] Make internal diff work in pipelines To avoid breaking some tests, D66574, D68664, D67643, and D68668 landed together. However, D68664 introduced an issue now addressed by D68839, with which these are now all relanding. Differential Revision: https://reviews.llvm.org/D66574 llvm-svn: 374648	2019-10-12 11:56:57 +00:00
Benjamin Kramer	c5d1d56731	[Attributor] Extend anonymous namespace. NFC. llvm-svn: 374647	2019-10-12 11:01:52 +00:00
Benjamin Kramer	97c9804e06	[LV] Merge LLVM_DEBUG blocks. Avoids unused variable warnings about the range-based for loops in there. NFCI. llvm-svn: 374646	2019-10-12 10:57:22 +00:00
Craig Topper	9bd542dcd5	[X86] Use pack instructions for packus/ssat truncate patterns when 256-bit is the largest legal vector and the result type is at least 256 bits. Since the input type is larger than 256 bits, we'll need to do some concatenating to reassemble the results. The pack instructions' ability to concatenate while packing makes this a shorter/faster sequence. llvm-svn: 374643	2019-10-12 07:59:29 +00:00
Craig Topper	80a4feed7c	[X86] Test SKX cpu in the vector-trunc-packus/ssat/usat.ll tests instead of min-legal-vector-width.ll This adds "min-legal-vector-width"="256" function attributes to all the tests for a larger than 256-bit input. Also switch any larger than 512-bit inputs to use a load. This makes the arguments consistent with min-legal-vector-width attribute which should usually be at least as large as the arguments. The SKX configuration will avoid using zmm registers on the modified test cases. For many of them we should use something closer to the AVX2 codegen with pack instructions instead of the avx512 saturating truncates. llvm-svn: 374642	2019-10-12 07:59:24 +00:00
Simon Atanasyan	aeaf5f8bd3	[mips] Rely on GPR size not ABI when select instruction to load value into register llvm-svn: 374641	2019-10-12 07:42:51 +00:00
Simon Atanasyan	4a46af845f	[mips] Fix `loadImmediate` calls when load non-address values. llvm-svn: 374640	2019-10-12 07:42:44 +00:00
Martin Storsjo	fe88be8c3a	[lit] Remove setting of the target-windows feature No other OSes use a target-<os> feature, and no tests depend on it any longer. Differential Revision: https://reviews.llvm.org/D68450 llvm-svn: 374639	2019-10-12 06:40:24 +00:00
Puyan Lotfi	17bde36a03	[clang][IFS] Fixing spelling errors in interface-stubs OPT flag (NFC). This is just a long standing spelling error that was found recently. llvm-svn: 374638	2019-10-12 06:25:07 +00:00
Alexander Shaposhnikov	b42e679a4b	[llvm-lipo] Pass ArrayRef by value. Pass ArrayRef by value, fix formatting. NFC. Test plan: make check-all llvm-svn: 374637	2019-10-12 06:14:02 +00:00
Vitaly Buka	ec6bfa81b7	Revert 374629 "[sancov] Accommodate sancov and coverage report server for use under Windows" http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/27650/steps/ninja%20check%201/logs/stdio http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/31759 http://lab.llvm.org:8011/builders/clang-s390x-linux-lnt/builds/15095 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/21075 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/31759 llvm-svn: 374636	2019-10-12 05:23:43 +00:00
Hubert Tong	fce11c6904	NFC: clang-format rL374420 and adjust comment wording The commit of rL374420 had various formatting issues, including lines that exceed 80 columns. This patch applies `git clang-format` on the changes from commit `13bd3ef40d`. It further adjusts a comment to clarify the domain of inputs upon which a newly added function is meant to operate. The adjustment to the comment was suggested in a post-commit comment on D68721 and discussed off-list with @sfertile. llvm-svn: 374635	2019-10-12 04:08:31 +00:00
Zi Xuan Wu	9802268ad3	recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize In loop-vectorize, the interleave count and vector factor depend on the number of target registers. Currently, it does not estimate register pressure separately for different register classes (especially for scalar types, where float should not be counted together with int), so the estimate is not accurate. Specifically, it causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance. So we need to classify the register classes at the IR level; importantly, these are abstract register classes, not the target register classes that the backend provides in the td file. They are used to establish the mapping between the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types. For example, on the POWER target, the register count is special when VSX is enabled. When VSX is enabled, the number of int scalar registers is 32 (GPR) and float is 64 (VSR), but int and float vector registers are both 64 (VSR). So there should be 2 kinds of register class when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled. Run on the POWER target, it gives a big (+~30%) performance improvement in one specific benchmark (503.bwaves_r) of spec2017 and no other obvious degradations. Differential revision: https://reviews.llvm.org/D67148 llvm-svn: 374634	2019-10-12 02:53:04 +00:00
Puyan Lotfi	c0abc2e7f2	[clang][IFS] Updating tests to pass on -fvisibility=hidden builds (NFCi). Special thanks to JamesNagurne who got to the bottom of this; landing this on his behalf. Differential Revision: https://reviews.llvm.org/D68897 llvm-svn: 374632	2019-10-12 02:46:57 +00:00
Walter Erquinigo	af1d27e301	[platform process list] add a flag for showing the processes of all users Summary: For context: https://reviews.llvm.org/D68293 We need a way to show all the processes on android regardless of the user id. When you run `platform process list`, you only see the processes with the same user as the user that launched lldb-server. However, it's quite useful to see all the processes, though, and it will lay a foundation for full apk debugging support from lldb. Before: ``` PID PARENT USER TRIPLE NAME ====== ====== ========== ======================== ============================ 3234 1 aarch64-unknown-linux-android adbd 8034 3234 aarch64-unknown-linux-android sh 9096 3234 aarch64-unknown-linux-android sh 9098 9096 aarch64-unknown-linux-android lldb-server (lldb) ^D ``` Now: ``` (lldb) platform process list -x 205 matching processes were found on "remote-android" PID PARENT USER TRIPLE NAME ====== ====== ========== ======================== ============================ 1 0 init 524 1 init 525 1 init 531 1 ueventd 568 1 logd 569 1 aarch64-unknown-linux-android servicemanager 570 1 aarch64-unknown-linux-android hwservicemanager 571 1 aarch64-unknown-linux-android vndservicemanager 577 1 aarch64-unknown-linux-android qseecomd 580 577 aarch64-unknown-linux-android qseecomd ... 23816 979 com.android.providers.calendar 24600 979 com.verizon.mips.services 27888 979 com.hualai 28043 2378 com.android.chrome:sandboxed_process0 31449 979 com.att.shm 31779 979 com.samsung.android.authfw 31846 979 com.samsung.android.server.iris 32014 979 com.samsung.android.MtpApplication 32045 979 com.samsung.InputEventApp ``` Reviewers: labath,xiaobai,aadsm,clayborg Subscribers: > llvm-svn: 374584 llvm-svn: 374631	2019-10-12 02:36:16 +00:00
Walter Erquinigo	0f22955899	Revert "[platform process list] add a flag for showing the processes of all users" This reverts commit f670a5edfc70066872e1795d650ed6e1ac62b6a8. llvm-svn: 374630	2019-10-12 02:31:22 +00:00
Vitaly Buka	23aa2aec78	[sancov] Accommodate sancov and coverage report server for use under Windows Summary: This patch makes the following changes to SanCov and its complementary Python script in order to resolve issues pertaining to non-UNIX file paths in JSON symbolization information: * Convert all paths to use forward slash. * Update `coverage-report-server.py` to correctly handle paths to sources which contain spaces. * Remove Linux platform restriction for all SanCov unit tests. All SanCov tests passed when ran on my local Windows machine. Patch by Douglas Gliner. Reviewers: kcc, filcab, phosek, morehouse, vitalybuka, metzman Reviewed By: vitalybuka Subscribers: vsk, Dor1s, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D51018 llvm-svn: 374629	2019-10-12 02:29:26 +00:00
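The first bullet of the summary above, converting all paths to use forward slashes, can be illustrated with a small standalone sketch. It is not taken from the patch, and whether SanCov does this with its own code or with an LLVM path utility is not stated in the message; `toForwardSlashes` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: replace every backslash with a forward slash so that
// Windows-style and UNIX-style paths print the same way.
std::string toForwardSlashes(std::string path) {
  std::replace(path.begin(), path.end(), '\\', '/');
  return path;
}

// Usage check, including a path component that contains a space.
void checkToForwardSlashes() {
  assert(toForwardSlashes("C:\\src\\my project\\a.cpp") ==
         "C:/src/my project/a.cpp");
}
```

Normalizing once, before the path is written out, keeps any consumer of the output from having to handle both separator styles.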
Vitaly Buka	e8a462a019	[sancov] Use LLVM Support library JSON writer in favor of individual implementation Summary: In this diff, I've replaced the individual implementation of `JSONWriter` with `json::OStream` provided by `llvm/Support/JSON.h`. Important Note: The output format of the JSON is considerably different compared to the original implementation. Important differences include: * New line for each entry in an array (should make diffs cleaner) * No space between keys and colon in attributed object entries. * Attributes with empty strings will now print the attribute name and a quote pair rather than excluding the attribute altogether Examples of these differences can be seen in the changes to the sancov tests which compare the JSON output. Patch by Douglas Gliner. Reviewers: kcc, filcab, phosek, morehouse, vitalybuka, metzman Subscribers: mehdi_amini, dexonsmith, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D68752 llvm-svn: 374628	2019-10-12 02:29:24 +00:00


329655 Commits
