This patch fixes std::allocator, and more specifically, all users
of __libcpp_allocate and __libcpp_deallocate, to support over-aligned
types.
__libcpp_allocate/deallocate now take an alignment parameter, and when
the specified alignment is greater than that supported by malloc/new,
the aligned version of operator new is called (assuming it's available).
When aligned new isn't available, the old behavior has been kept, and the
alignment parameter is ignored.
This patch depends on recent changes to __builtin_operator_new/delete which
allow them to be used to call any regular new/delete operator. By using
__builtin_operator_new/delete when possible, the new/delete erasure optimization
is maintained.
llvm-svn: 328180
I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do.
This reduces the size of the optimized llc binary on my computer by 400K. Presumably because we went from 5000+ scheduler classes per CPU to ~2000.
llvm-svn: 328179
To resolve symbol context at a particular address, we need to
determine the compiland for the address. We are able to determine
the parent compiland of PDBSymbolFunc, PDBSymbolTypeUDT,
PDBSymbolTypeEnum symbols indirectly through line information.
However no such information is availabile for PDBSymbolData,
i.e. variables.
The Section Contribution table from PDBs has information about
each compiland's contribution to sections by address. For example,
a piece of a contribution looks like,
VA RelativeVA Sect No. Offset Length Compiland
14000087B0 000087B0 0001 000077B0 000000BB exe_main.obj
So given an address, it's possible to determine its compiland with
this information.
llvm-svn: 328178
Summary:
We received reports of the Objective-C style guesser getting a false
negative on header files like:
CGSize SizeOfThing(MyThing thing);
This adds more Core Graphics identifiers to the Objective-C style
guesser.
Test Plan: New tests added. Ran tests with:
% make -j12 FormatTests && ./tools/clang/unittests/Format/FormatTests
Reviewers: jolesiak, djasper
Reviewed By: jolesiak, djasper
Subscribers: krasimir, klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D44632
llvm-svn: 328175
Summary:
Previously, clang-format would insert a space between
the closing parenthesis and 'new' in the following valid Objective-C
declaration:
+ (instancetype)new;
This was because 'new' is treated as a keyword, not an identifier.
TokenAnnotator::spaceRequiredBefore() already handled the case where
r_paren came before an identifier, so this diff extends it to
handle r_paren before 'new'.
Test Plan: New tests added. Ran tests with:
% make -j12 FormatTests && ./tools/clang/unittests/Format/FormatTests
Reviewers: djasper, jolesiak, stephanemoore
Reviewed By: djasper, jolesiak, stephanemoore
Subscribers: stephanemoore, klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D44692
llvm-svn: 328174
Summary: Rewrites -Winfinite-recursion to remove the state dictionary and explore paths in loops - especially infinite loops. The new check now detects recursion in loop bodies dominated by a recursive call.
Reviewers: rsmith, rtrieu
Reviewed By: rtrieu
Subscribers: lebedev.ri, cfe-commits
Differential Revision: https://reviews.llvm.org/D43737
llvm-svn: 328173
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.
Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.
Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.
llvm-svn: 328165
Summary:
This fixes a unittest failure introduced by D44717
D44717 introduced lazy sorting of the internal data structures of the
symbol table. The AddrToMD5Map getter was potentially exposing
inconsistent (unsorted) state. We could sort in the accessor, however,
a client may store the pointer and thus bypass the internal state
management of the symbol table. The alternative in this CL blocks
direct access to the state, thus ensuring consistent
externally-observable state.
Reviewers: davidxl, xur, eraman
Reviewed By: xur
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44757
llvm-svn: 328163
The hash table is a list of buckets, and the *value* stored in
the bucket cannot be 0 since that is reserved. However, the code
here was incorrectly skipping over the 0'th bucket entirely.
The 0'th bucket is perfectly fine, just none of these buckets
can contain the value 0.
As a result, whenever there was a string where hash(S) % Size
was equal to 0, we would write the value in the next bucket
instead. We never caught this in our tests due to *another*
bug, which is that we would iterate the entire list of buckets
looking for the value, only using the hash value as a starting
point. However, the real algorithm stops when it finds 0 in
a bucket since it takes that to mean "the item is not in the
hash table".
The unit test is updated to carefully construct a set of hash
values that will cause one item to hash to 0 mod bucket count,
and the reader is also updated to return an error indicating that
the item is not found when it encounters a 0 bucket.
llvm-svn: 328162
This fixes host-side LTO during CUDA compilation. Before, LTO
pipeline construction was clashing with CUDA pipeline construction.
At the moment there's no point doing LTO on device side as each
device-side TU is a complete program. We will need to figure out
compilation pipeline construction for the device-side LTO when we
have working support for multi-TU device-side CUDA compilation.
Differential Revision: https://reviews.llvm.org/D44691
llvm-svn: 328161
This is needed for the upcoming implementation of the
new 8x32x16 and 32x8x16 variants of WMMA instructions
introduced in CUDA 9.1.
Differential Revision: https://reviews.llvm.org/D44719
llvm-svn: 328158
The "ShouldEmitMatchRegisterName" bit wasn't taking effect because the
WebAssembly target didn't point to the custom WebAssemblyAsmParser
record.
llvm-svn: 328155
This is mostly just plumbing to get a DWARFDataExtractor where we
compute abbr_offset so we can use getRelocatedValue.
This is part of PR36793.
llvm-svn: 328154
During reading C++ definition data for lambda we can access
CXXRecordDecl representing lambda before we finished reading the
definition data. This can happen by reading a captured variable which is
VarDecl, then reading its decl context which is CXXMethodDecl `operator()`,
then trying to merge redeclarable methods and accessing
enclosing CXXRecordDecl. The call stack looks roughly like
VisitCXXRecordDecl
ReadCXXRecordDefinition
VisitVarDecl
VisitCXXMethodDecl
mergeRedeclarable
getPrimaryContextForMerging
If we add fake definition data at this point, later we'll hit the assertion
Assertion failed: (!DD.IsLambda && !MergeDD.IsLambda && "faked up lambda definition?"), function MergeDefinitionData, file clang/lib/Serialization/ASTReaderDecl.cpp, line 1675.
The fix is to assign definition data before reading it. Fixes PR32556.
rdar://problem/37461072
Reviewers: rsmith, bruno
Reviewed By: rsmith
Subscribers: cfe-commits, jkorous-apple, aprantl
Differential Revision: https://reviews.llvm.org/D43494
llvm-svn: 328153
There are at least 3 problems:
1. We're distributing across large patterns, but fail to do that for the minimal patterns.
2. We're not checking uses, so we may create more instructions than we eliminate.
3. We should be able to do these transforms with less than full 'fast' fmuls.
llvm-svn: 328152
Summary:
Following-up the refactoring of mmap interceptors, adding a new common
option to detect PROT_WRITE|PROT_EXEC pages request.
Patch by David CARLIER
Reviewers: vitalybuka, vsk
Reviewed By: vitalybuka
Subscribers: krytarowski, #sanitizers
Differential Revision: https://reviews.llvm.org/D44194
llvm-svn: 328151
Summary: The global stack initialization function may be called multiple times. The initialization of the shared memory slots should only happen when the function is called for the first time for a given warp master thread.
Reviewers: grokos, carlo.bertolli, ABataev, caomhin
Reviewed By: grokos
Subscribers: guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D44754
llvm-svn: 328148
Summary: The check for the master warp must take into consideration the actual number of warps: the master warp is equal to the last active warp not necessarily WARPSIZE - 1.
Reviewers: grokos, carlo.bertolli, ABataev, caomhin
Reviewed By: grokos
Subscribers: guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D44537
llvm-svn: 328146
Summary:
This patch allows worker to have a global memory stack managed by the runtime. This patch is needed for completeness and consistency with the globalization policy: if a worker-side variable escapes the current context it then needs to be globalized.
Until now, only the master thread was allowed to have such a stack. These global values can now potentially be shared amongst workers if the semantics of the OpenMP program require it.
Reviewers: ABataev, grokos, carlo.bertolli, caomhin
Reviewed By: grokos
Subscribers: guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D44487
llvm-svn: 328144
This diff adds support for SHT_GROUP sections to llvm-objcopy.
Some sections are interrelated and comprise a group.
For example, a definition of an inline function might require,
in addition to the section containing its instructions,
a read-only data section containing literals referenced inside the function.
A section of the type SHT_GROUP contains the indices of the group members,
therefore, it needs to be updated whenever the indices change.
Similarly, the fields sh_link, sh_info should be recalculated as well.
[Resubmit r328012 with the proper handling of endianness]
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D43996
llvm-svn: 328143
We already know all the of instructions we're processing in the instruction loop belong to no class or all to the same class. So we only have to worry about remapping one class. So hoist it all out and remove the SmallPtrSet that tracked which class we'd already remapped.
I had to introduce new instruction loop inside this code to print an error message, but that only occurs on the error path.
llvm-svn: 328142
We already have an OldSCIdx variable in the outer loop here. And we already did the map lookup in the loop that populated ClassInstrs. And the outer OldSCIdx got it from ClassInstrs.
llvm-svn: 328139
Summary:
Libc++'s default allocator uses `__builtin_operator_new` and `__builtin_operator_delete` in order to allow the calls to new/delete to be ellided. However, libc++ now needs to support over-aligned types in the default allocator. In order to support this without disabling the existing optimization Clang needs to support calling the aligned new overloads from the builtins.
See llvm.org/PR22634 for more information about the libc++ bug.
This patch changes `__builtin_operator_new`/`__builtin_operator_delete` to call any usual `operator new`/`operator delete` function. It does this by performing overload resolution with the arguments passed to the builtin to determine which allocation function to call. If the selected function is not a usual allocation function a diagnostic is issued.
One open issue is if the `align_val_t` overloads should be considered "usual" when `LangOpts::AlignedAllocation` is disabled.
In order to allow libc++ to detect this new behavior the value for `__has_builtin(__builtin_operator_new)` has been updated to `201802`.
Reviewers: rsmith, majnemer, aaron.ballman, erik.pilkington, bogner, ahatanak
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43047
llvm-svn: 328134