Somehow, the codegen logic for these sequences has gone completely untested
until now (note the 2 compare instructions generated per test).
There's also an *Intel* AVX optimization opportunity exposed in these cases
and the existing tests. Intel's (but not AMD's) AVX spec shows that extra FP
predicates were added, so a single comparison should always be sufficient,
and operand commutation should never be necessary.
llvm-svn: 272397
This reapplies commit r272385 with a fix. The build was failing when compiled
with gcc, but not with clang. With the fix, we now get the data layout from the
current TTI implementation, which will hopefully solve the issue.
llvm-svn: 272395
Rehashing the ExplodedNode table is very expensive. The hashing
itself is expensive, and the general activity of iterating over the
hash table is highly cache unfriendly. Instead, we guess at the
eventual size by using the maximum number of steps allowed. This
generally avoids a rehash. It is possible that we still need to
rehash if the backlog of work that is added to the worklist
significantly exceeds the number of work items that we process. Even
if we do need to rehash in that scenario, this change is still a
win, as we still have fewer rehashes that we would have prior to
this change.
For small work loads, this will increase the memory used. For large
work loads, it will somewhat reduce the memory used. Speed is
significantly increased. A large .C file took 3m53.812s to analyze
prior to this change. Now it takes 3m38.976s, for a ~6% improvement.
http://reviews.llvm.org/D20933
llvm-svn: 272394
In isPreemptible routine we interested in R_MIPS_GPREL16 relocation
only. This relocation fits 0xf. So the new mask 0xff is just to conform
the ABI specification.
llvm-svn: 272388
Summary: give users an option to show N more headers in case there are too many candidates.
Reviewers: bkramer
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D21181
llvm-svn: 272387
This patch refines the default cost for interleaved load groups having gaps. If
a load group has gaps, the legalized instructions corresponding to the unused
elements will be dead. Thus, we don't need to account for them in the cost
model. Instead, we only need to account for the fraction of legalized loads
that will actually be used.
Differential Revision: http://reviews.llvm.org/D20873
llvm-svn: 272385
Summary:
sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported.
Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier.
Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers.
Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...).
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D20968
llvm-svn: 272384
Memory operand is new for AVX512 (SSE/AVX2 didn't support it).
Also dropped the 'mask' from the tests (VPSLLDQ/VPSRLDQ don't support masked operations).
Regenerated VPALIGNR test now that the shuffle comments work
llvm-svn: 272383
It has RTTI disabled by default, so need to enable it explicitly.
Reviewers: silvas
Differential Revision: http://reviews.llvm.org/D21186
llvm-svn: 272381
Initially we wanted to check that these two relocations are not present when linking DSO because of
possible overflow in runtime. Patch moves them to writable segment in testcases to allow
proper error check to trigger.
Otherwise error message about using dynamic relocations against text segment was shown.
Differential revision: http://reviews.llvm.org/D21184
llvm-svn: 272379
It was reported in PR28020, that lld does not link code which
gold do. But in fact that is expected behavior as we do not
support DT_TEXTREL.
This patch changes error message as it can report about relocations against
text segments exclusively, other dynamic relocations errors can
be handled separately.
Differential revision: http://reviews.llvm.org/D21133
llvm-svn: 272377
End-end test with no integrated assembly should be added
at some point (not done now because some bots are not properly configured to
support -no-integrated-as)
llvm-svn: 272376
Test that __llvm_profile_set_filename invoked in
main program is 'visible' to shared lib (overriding
shared libary's profile path set on command line)
llvm-svn: 272375
This fixes the following unit tests:
FuzzerDictionary.ParseOneDictionaryEntry
FuzzerDictionary.ParseDictionaryFile
The issue appears to be mixing non-ASan-ified code (LibFuzzer) and
ASan-ified code (the unittest) as the tests would pass fine if
everything was built with ASan enabled.
I believe the issue is that different implementations of std::vector<>
are being used in LibFuzzer and outside LibFuzzer (in the unittests).
For Libcxx (I've not seen the issue manifest for libstdc++) we can disable
the ASanified std::vector<> by definining the ``_LIBCPP_HAS_NO_ASAN`` macro.
Doing this fixes the tests on OSX.
Differential Revision: http://reviews.llvm.org/D21049
llvm-svn: 272374
This is the next step towards being able to write PDBs.
MemoryBuffer is immutable, and StreamInterface is our replacement
which can be any combination of read-only, read-write, or write-only
depending on the particular implementation.
The one place where we were creating a PDBFile (in RawSession) is
updated to subclass ByteStream with a simple adapter that holds
a MemoryBuffer, and initializes the superclass with the buffer's
array, so that all the functionality of ByteStream works
transparently.
llvm-svn: 272370
This adds method and tests for writing to a PDB stream. With
this, even a PDB stream which is discontiguous can be treated
as a sequential stream of bytes for the purposes of writing.
Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21157
llvm-svn: 272369
Some calls from OMPClauseProfiler were calling the Stmt Profiler with null
pointers, but the profiler can only handle non-null pointers. Add an assert
to the VisitStmt for valid pointers, and check all calls from OMPClauseProfiler
to be non-null pointers.
llvm-svn: 272368
Crash reported in PR28023 is caused by the fact that non-type template
parameters are found by tag name lookup. In the code provided in that PR:
template<int V> struct A {
struct B {
template <int> friend struct V;
};
};
the template parameter V is found when lookup for redeclarations of 'struct V'
is made. Latter on the error about shadowing of 'V' is emitted but the semantic
context of 'struct V' is already determined wrong: 'struct A' instead of
translation unit.
The fix moves the check for shadowing toward the beginning of the method and
thus prevents from wrong context calculations.
This change fixes PR28023.
llvm-svn: 272366
- The intended use of this was just in diagnostics, so we shouldn't pay the
cost of reading these all the time.
- This will avoid including the full output of each command in tests which
fail, but the most important use case for this was to gather the output of
the specific command which failed.
llvm-svn: 272365
Summary:
Adds the struct field offset array in the struct StructInfo.
Prints struct size and field offset info in the report.
Reviewers: aizatsky
Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits, kubabrecka
Differential Revision: http://reviews.llvm.org/D21191
llvm-svn: 272363
The test case is not great espicially because it is still cumbersome to
run the regalloc pass with run-pass. (We miss a bunch of initiliazier to
be properly implemented.)
Related to llvm.org/PR27983
llvm-svn: 272360
Previously we could run only one machine pass with the run-pass option.
With that patch, we can now specify several passes with several run-pass
options (or just one option with a list of comma separated passes) and
llc will build the related pipeline.
This is great to test the interaction of two passes that are not
necessarily next to each other in the pipeline, or play with pass
ordering.
Now, we should be at parity with opt for the flexibility of running
passes.
Note: I also moved the run pass option from CommandFlags.h to llc.cpp
because, really, this is needed only there!
llvm-svn: 272356
Summary:
Adds ClInstrumentFastpath option to control fastpath instrumentation.
Avoids the load/store instrumentation for the cache fragmentation tool.
Renames cache_frag_basic.ll to working_set_slow.ll for slowpath
instrumentation test.
Adds the __esan_init check in struct_field_count_basic.ll.
Reviewers: aizatsky
Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka
Differential Revision: http://reviews.llvm.org/D21079
llvm-svn: 272355
For some reason, the conversion to taking the target lock when acquiring
the ExecutionContext was only done for some of the functions here. That was
allowing lock inversion in some complex uses.
<rdar://problem/26705635>
llvm-svn: 272354
Summary:
This documents the various relocation types that are supported by the
Radeon Open Compute (ROC) runtime (which is essentially the dynamic
linker for AMDGPU).
Only R_AMDGPU_32 is not currently supported by the ROC runtime, but
it will usually be resolved at link time by lld.
Patch by: Konstantin Zhuravlyov
Reviewers: kzhuravl, rafael
Subscribers: rafael, arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D20952
llvm-svn: 272352
The doxygen comments are automatically generated based on Sony's intrinsics docu
ment.
I got an OK from Eric Christopher to commit doxygen comments without prior code
review upstream.
llvm-svn: 272350