WARNING: If you're looking at this patch because you're looking for a full
performance mitigation of the Intel JCC Erratum, this is not it!
This is a preliminary patch on the path towards mitigating the performance
regressions caused by Intel's microcode update for Jump Conditional Code
Erratum. For context, see:
https://www.intel.com/content/www/us/en/support/articles/000055650.html
The patch adds the required assembler infrastructure and command line options
needed to exercise the logic for INTERNAL TESTING. These are NOT public flags,
and should not be used for anything other than LLVM's own testing/debugging
purposes. They are likely to change both in spelling and meaning.
WARNING: This patch is knowingly incorrect in some corner cases. We need, and
do not yet provide, a mechanism to selectively enable/disable the padding.
Conversation on this will continue in parallel with work on extending this
infrastructure to support prefix padding.
The goal here is to have the assembler align specific instructions such that
they neither cross nor end at a 32 byte boundary. The impacted instructions are:
a. Conditional jump.
b. Fused conditional jump.
c. Unconditional jump.
d. Indirect jump.
e. Ret.
f. Call.
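
As a rough illustration of the goal stated above (not the code in this patch),
the check can be modelled on plain integers. The function names and the
assumption that an instruction is shorter than the boundary are hypothetical
and chosen only for this sketch:

```python
def needs_padding(offset: int, size: int, boundary: int = 32) -> bool:
    """Return True if [offset, offset + size) crosses or ends at a boundary.

    Both cases show up as the start and the end offset falling into
    different boundary-sized windows.
    """
    return offset // boundary != (offset + size) // boundary


def nop_padding(offset: int, size: int, boundary: int = 32) -> int:
    """Bytes of NOP to place before the instruction (hypothetical helper).

    Assumes 0 < size < boundary, so moving the instruction to the start of
    the next window is always enough.
    """
    if not 0 < size < boundary:
        raise ValueError("sketch assumes 0 < size < boundary")
    if needs_padding(offset, size, boundary):
        return boundary - offset % boundary
    return 0
```

For example, a 2-byte jump at offset 30 would end exactly at byte 32, so the
sketch asks for 2 bytes of padding and the jump then starts at offset 32.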
The new options for llvm-mc are:
-x86-align-branch-boundary=NUM aligns branches within NUM byte boundary.
-x86-align-branch=TYPE[+TYPE...] specifies types of branches to align.
A new MCFragment type, MCBoundaryAlignFragment, is added, which may emit
NOP to align the fused/unfused branch.
alignBranchesBegin inserts MCBoundaryAlignFragment before instructions,
alignBranchesEnd marks the end of the branch to be aligned,
relaxBoundaryAlign grows or shrinks sizes of NOP to align the target branch.
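
The padding in front of a branch depends on the sizes of everything laid out
before it, which is why it has to be recomputed as the layout changes. The
following is a simplified sketch of that dependency only; it does not model
MCBoundaryAlignFragment or the real relaxation loop, and the tuple format and
function name are assumptions made for the example:

```python
def layout_with_padding(instrs, boundary=32):
    """Lay out (size, is_branch) pairs sequentially.

    Returns a list of (padding, offset) per instruction, where padding is the
    number of NOP bytes placed in front of it and offset is where it starts.
    Only branches are padded; sizes are assumed to be smaller than boundary.
    """
    offset = 0
    placed = []
    for size, is_branch in instrs:
        pad = 0
        # A branch needs padding if it would cross or end at a boundary.
        if is_branch and offset // boundary != (offset + size) // boundary:
            pad = boundary - offset % boundary
        offset += pad
        placed.append((pad, offset))
        offset += size
    return placed
```

If an earlier instruction grows or shrinks, rerunning the pass yields a
different padding for later branches, for instance 2 bytes when 30 bytes
precede a 2-byte branch and 0 bytes when only 20 bytes precede it.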
Nop padding is disabled when the instruction may be rewritten by the linker,
such as TLS Call.
Process Note: I am landing a patch by skan as it has been LGTMed, and
continuing to iterate on the review is simply slowing us down at this point.
We can and will continue to iterate in tree.
Patch By: skan
Differential Revision: https://reviews.llvm.org/D70157
D34393 added MCCodePadder as an infrastructure for padding code with
NOP instructions. It lacked tests and has not been worked on since
then.
Intel has now worked on an assembler patch to mitigate performance loss
after applying the microcode update for the Jump Conditional Code Erratum.
https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html
This new patch shares similarity with MCCodePadder, but has a concrete
use case in mind and is being actively developed. The infrastructure it
introduces can potentially be used for general performance improvement
via alignment. Delete the unused MCCodePadder so that people can develop
the new feature from a clean state.
Reviewed By: jyknight, skan
Differential Revision: https://reviews.llvm.org/D71106
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Instruction bundling is only supported on descendants of the
MCEncodedFragment type. Moving the bundling functionality and
MCSubtargetInfo to this class makes it easier to set and extract the
MCSubtargetInfo when it is necessary.
This is a refactoring change that will make it easier to pass the
MCSubtargetInfo through to writeNops when nop padding is required.
Differential Revision: https://reviews.llvm.org/D45959
llvm-svn: 334814
Avoid requirement that number of values must be known at assembler
time.
Fixes PR33586.
Reviewers: rnk, peter.smith, echristo, jyknight
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D46703
llvm-svn: 332741
See r331124 for how I made a list of files missing the include.
I then ran this Python script:
  for f in open('filelist.txt'):
    f = f.strip()
    fl = open(f).readlines()

    found = False
    for i in xrange(len(fl)):
      p = '#include "llvm/'
      if not fl[i].startswith(p):
        continue
      if fl[i][len(p):] > 'Config':
        fl.insert(i, '#include "llvm/Config/llvm-config.h"\n')
        found = True
        break
    if not found:
      print 'not found', f
    else:
      open(f, 'w').write(''.join(fl))

and then looked through everything with `svn diff | diffstat -l | xargs -n 1000 gvim -p`
and tried to fix include ordering and whatnot.
No intended behavior change.
llvm-svn: 331184
Summary:
This fragment emits a symbol ID and will be useful for more than just Safe SEH
tables (e.g., I plan to re-use it for Control Flow Guard tables). This is
simply a rename refactor.
Reviewers: rnk
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D39770
llvm-svn: 317703
Infrastructure designed for padding code with nop instructions in key places such that performance improvement will be achieved.
The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known.
This patch by itself is an NFC. Future patches will make use of this infrastructure to implement required policies for code padding.
Reviewers:
aaboud
zvi
craig.topper
gadi.haber
Differential revision: https://reviews.llvm.org/D34393
Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e
llvm-svn: 316413
Summary:
Add LLVM_FORCE_ENABLE_DUMP cmake option, and use it along with
LLVM_ENABLE_ASSERTIONS to set LLVM_ENABLE_DUMP.
Remove NDEBUG and only use LLVM_ENABLE_DUMP to enable dump methods.
Move definition of LLVM_ENABLE_DUMP from config.h to llvm-config.h so
it'll be picked up by public headers.
Differential Revision: https://reviews.llvm.org/D38406
llvm-svn: 315590
Without this cast the "char" overload of operator<< is
chosen and the value is output as an ASCII character rather than
an integer.
Differential Revision: https://reviews.llvm.org/D34486
llvm-svn: 306039
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html).
For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif
llvm-svn: 293359
Add a SMLoc to MCExpr. Most code does not generate or consume the SMLoc (yet).
Patch by Sanne Wouda <sanne.wouda@arm.com>!
Differential Revision: https://reviews.llvm.org/D28861
llvm-svn: 292515
Many lists want to override only allocation semantics, or callbacks for
iplist. Split these up to prevent code duplication.
- Specialize ilist_alloc_traits to change the implementations of
  deleteNode() and createNode().
- One common desire is to do nothing in deleteNode() and disable
  createNode(). Specialize ilist_alloc_traits to inherit from
  ilist_noalloc_traits for that behaviour.
- Specialize ilist_callback_traits to use the addNodeToList(),
  removeNodeFromList(), and transferNodesFromList() callbacks.
As a drive-by, add some coverage to the callback-related unit tests.
llvm-svn: 280128
MCContext already has many tasks, and separating CodeView out from it is
probably a good idea. The .cv_loc tracking was modelled on the DWARF
tracking which lived directly in MCContext.
Removes the inclusion of MCCodeView.h from MCContext.h, so now there are
only 10 build actions while I hack on CodeView support instead of 265.
llvm-svn: 279847
Remove all the dead code around ilist_*sentinel_traits. This is a
follow-up to gutting them as part of r279314 (originally r278974),
staged to prevent broken builds in sub-projects.
Uses were removed from clang in r279457 and lld in r279458.
llvm-svn: 279473
Removed some unused headers, replaced some headers with forward class declarations.
Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
Patch by Eugene Kosov <claprix@yandex.ru>
Differential Revision: http://reviews.llvm.org/D19219
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
CodeView, like most other debug formats, represents the live range of a
variable so that debuggers might print them out.
They use a variety of records to represent how a particular variable
might be available (in a register, in a frame pointer, etc.) along with
a set of ranges where this debug information is relevant.
However, the format only allows us to use ranges which are limited to a
maximum of 0xF000 in size. This means that we need to split our debug
information into chunks of 0xF000.
Because the layout of code is not known until *very* late, we must use a
new fragment to record the information we need until we can know
*exactly* what the range is.
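
To make the chunking constraint concrete, here is a small sketch of splitting
a range into pieces no larger than 0xF000 once its final extent is known. It
only illustrates the arithmetic; the function name and the (start, length)
representation are assumptions for this example and not the fragment's actual
implementation:

```python
MAX_RANGE = 0xF000  # largest range size the format allows, per the text above


def split_range(start: int, length: int, max_len: int = MAX_RANGE):
    """Split [start, start + length) into (start, length) chunks <= max_len."""
    chunks = []
    while length > 0:
        n = min(length, max_len)
        chunks.append((start, n))
        start += n
        length -= n
    return chunks
```

A range of 0x1F000 bytes starting at 0 would come out as two full chunks of
0xF000 followed by a final chunk of 0x1000.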
llvm-svn: 259868
This directive emits the binary annotations that describe line and code
deltas in inlined call sites. Single-stepping through inlined frames in
windbg now works.
llvm-svn: 259535
The value size was always 1 or 0, so we don't need to store it.
In a no asserts build this takes the testcase of pr26208 from 11 to 10
seconds.
llvm-svn: 258141
header to its own header, allowing users of fragments to have a narrower
header file, and avoid circular header dependencies when getting the
definition of MCSection prior to inspecting traits on MCSection
pointers.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
Note that this doesn't in any way change the design of MC, it is just
moving code around to allow the *header files* to be more fine grained.
Without this, it is impossible to get a complete type for MCSection
where it is needed.
If anyone would prefer a different slicing of the header files, I'm
happy to oblige of course. =]
llvm-svn: 256548