Spectre variant #1 for x86.
There is a lengthy, detailed RFC thread on llvm-dev which discusses the
high level issues. High level discussion is probably best there.
I've split the design document out of this patch and will land it
separately once I update it to reflect the latest edits and updates to
the Google doc used in the RFC thread.
This patch is really just an initial step. It isn't quite ready for
prime time and is only exposed via debugging flags. It has two major
limitations currently:
1) It only supports x86-64, and only certain ABIs. Many assumptions are
currently hard-coded and need to be factored out of the code here.
2) It doesn't include any options for more fine-grained control, either
of which control flow edges are significant or which loads are
important to be hardened.
3) The code is still quite rough and the testing lighter than I'd like.
However, this is enough for people to begin using. I have had numerous
requests from people to be able to experiment with this patch to
understand the trade-offs it presents and how to use it. We would also
like to encourage work to similar effect in other toolchains.
The ARM folks are actively developing a system based on this for
AArch64. We hope to merge this with their efforts when both are far
enough along. But we also don't want to block making this available on
that effort.
Many thanks to the *numerous* people who helped along the way here. For
this patch in particular, both Eric and Craig did a ton of review to
even have confidence in it as an early, rough cut at this functionality.
Differential Revision: https://reviews.llvm.org/D44824
llvm-svn: 336990
We currently only support binary instructions in the alternate opcode shuffles.
This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:
1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.
Reapplied with fix to only accept 2 different casts if they come from the same source type (PR38154).
Differential Revision: https://reviews.llvm.org/D49135
llvm-svn: 336989
The current version of SymbolFilePDB::ParseVariableForPDBData function
always initializes variables with an empty location. This patch adds the
converter of a location information from PDB to a DWARF expression, so
it becomes possible to watch values of variables of primitive data
types. At the moment the converter supports only Static, TLS, RegRel,
Enregistered and Constant PDB location types, but it seems that it's
enough for most cases. There are still some problems with retrieving
values of variables (e.g. we can't watch variables of composite types),
but they look not relevant to the conversion to DWARF.
Patch by: Aleksandr Urakov
Differential revision: https://reviews.llvm.org/D49018
llvm-svn: 336988
begin label emitted for some routines with personality functions and
such.
Without this, we don't even recognize such functions as appearing in the
output and so don't attach any assertions to them. Happy to tweak this
or improve it if folks w/ deeper knowledge of the asm sequences that
show up here want.
llvm-svn: 336987
flow patterns including forks, merges, and even cyles.
This tries to cover a reasonably comprehensive set of patterns that
still don't require PHIs or PHI placement. The coverage was inspired by
the amazing variety of patterns produced when copy EFLAGS and restoring
it to implement Speculative Load Hardening. Without this patch, we
simply cannot make such complex and invasive changes to x86 instruction
sequences due to EFLAGS.
I've added "just" one test, but this test covers many different
complexities and corner cases of this approach. It is actually more
comprehensive, as far as I can tell, than anything that I have
encountered in the wild on SLH.
Because the test is so complex, I've tried to give somewhat thorough
comments and an ASCII-art diagram of the control flows to make it a bit
easier to read and maintain long-term.
Differential Revision: https://reviews.llvm.org/D49220
llvm-svn: 336985
This patch adds support for the following unpack instructions:
- PUNPKLO, PUNPKHI Unpack elements from low/high half and
place into elements of twice their size.
e.g. punpklo p0.h, p0.b
- UUNPKLO, UUNPKHI Unpack elements from low/high half and
SUNPKLO, SUNPKHI place into elements of twice their size
after zero- or sign-extending the values.
e.g. uunpklo z0.h, z0.b
llvm-svn: 336982
As suggested by @efriedma on D49262 - changed the extractelement to a store to prevent SimplifyDemandedVectorElts from simplifying the build vectors - this keeps the immediate generation which was the point of the tests.
llvm-svn: 336981
During the execution of long functions or functions that have a lot of
inlined code it could come to the situation where tracked value could be
transferred from one register to another. The transfer is recognized only if
destination register is a callee saved register and if source register is
killed. We do not salvage caller-saved registers since there is a great
chance that killed register would outlive it.
Patch by Nikola Prica.
Differential Revision: https://reviews.llvm.org/D44016
llvm-svn: 336978
Summary:
This test invokes undocumented behaviour that could change in
the future. Given this, it's probably best to just remove the
test.
rdar://problem/42022283
Reviewers: kubamracek
Subscribers: llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D49269
llvm-svn: 336977
Previously we iseled to blend, commuted to another blend, and then commuted back to movss/movsd or blend depending on optsize. Now we do it directly.
llvm-svn: 336976
Summary:
llvm-xray changes:
- account-mode - process-id {...} shows after thread-id
- convert-mode - process {...} shows after thread
- parses FDR and basic mode pid entries
- Checks version number for FDR log parsing.
Basic logging changes:
- Update header version from 2 -> 3
FDR logging changes:
- Update header version from 2 -> 3
- in writeBufferPreamble, there is an additional PID Metadata record (after thread id record and tsc record)
Test cases changes:
- fdr-mode.cc, fdr-single-thread.cc, fdr-thread-order.cc modified to catch process id output in the log.
Reviewers: dberris
Reviewed By: dberris
Subscribers: hiraditya, llvm-commits, #sanitizers
Differential Revision: https://reviews.llvm.org/D49153
llvm-svn: 336974
This is not an optimization we should be doing in isel. This is more suitable for a DAG combine.
My main concern is a future time when we support more FPENV. Changing a packed op to a scalar op could cause us to miss some exceptions that should have occured if we had done a packed op. A DAG combine would be better able to manage this.
llvm-svn: 336971
Summary:
This change adds support for writing out profiles at program exit.
Depends on D48653.
Reviewers: kpw, eizan
Reviewed By: kpw
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48956
llvm-svn: 336969
Summary:
Previously, when both DT and PDT are nullptrs and the UpdateStrategy is Lazy, DomTreeUpdater still pends updates inside.
After this patch, DomTreeUpdater will ignore all updates from(`applyUpdates()/insertEdge*()/deleteEdge*()`) in this case. (call `delBB()` still pends BasicBlock deletion until a flush event according to the doc).
The behavior of DomTreeUpdater previously documented won't change after the patch.
Reviewers: dmgreen, davide, kuhar, brzycki, grosser
Reviewed By: kuhar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48974
llvm-svn: 336968
-v prints all directive pattern matches.
-vv additionally prints info that might be noise to users but that can
be helpful to FileCheck developers.
To maximize code reuse and to make diagnostics more consistent, this
patch also adjusts and extends some of the existing diagnostics.
CHECK-NOT failures now report variables uses. Many more diagnostics
now report the check prefix and kind of directive.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D47114
llvm-svn: 336967
This bug was created by rL335258 because we used to always call instsimplify
after trying the associative folds. After that change it became possible
for subsequent folds to encounter unsimplified code (and potentially assert
because of it).
Instead of carrying changed state through instcombine, we can just return
immediately. This allows instsimplify to run, so we can continue assuming
that easy folds have already occurred.
llvm-svn: 336965
This re-applies r336929 with a fix to accomodate for the Mips target
scheduling multiple SelectionDAG instances into the pass pipeline.
PrologEpilogInserter and StackColoring depend on the StackProtector analysis
being alive from the point it is run until PEI, which requires that they are all
scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass
between StackProtector and PEI results in these passes being in separate
FunctionPassManagers and the StackProtector is not available for PEI.
PEI and StackColoring don't use much information from the StackProtector pass,
so transfering the required information to MachineFrameInfo is cleaner than
keeping the StackProtector pass around. This commit moves the SSP layout
information to MFI instead of keeping it in the pass.
This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587)
is a first draft of the pagerando implementation described in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.
Patch by Stephen Crane <sjc@immunant.com>
Differential Revision: https://reviews.llvm.org/D49256
llvm-svn: 336964
Summary:
This patch is crucial for proving equality laundered/stripped
pointers. eg:
bool foo(A *a) {
return a == std::launder(a);
}
Clang with -fstrict-vtable-pointers will emit something like:
define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) {
entry:
%c = bitcast %struct.A* %a to i8*
%call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c)
%0 = bitcast %struct.A* %a to i8*
%1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0)
%2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call)
%cmp = icmp eq i8* %1, %2
ret i1 %cmp
}
and because %2 can be replaced with @llvm.strip.invariant.group(%0)
and that %2 and %1 will produce the same value (because strip is readnone)
we can replace compare with true.
Reviewers: rsmith, hfinkel, majnemer, amharc, kuhar
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D47423
llvm-svn: 336963
function parameter packs.
This makes our handling of non-trailing function parameter packs
consistent between the case of deduction at the top level in a function
call and other cases where deduction encounters a non-trailing function
parameter pack.
Instead of treating a non-trailing pack and all later parameters as
being non-deduced, we treat a non-trailing pack as exactly matching
any explicitly-specified template arguments (or being an empty pack
if there are no such arguments). This corresponds to the "never deduced"
rule in [temp.deduct.call]p1, but generalized to all deduction contexts.
Non-trailing template argument packs still result in the entire
template argument list being treated as non-deduced, as specified in
[temp.deduct.type]p9.
llvm-svn: 336962
Summary: It looks like the test file was copied from TestCPPStaticMethods.py because they have the same name. This means that the two tests will try to write to the same output files and will either overwrite each other's output or occasionally cause failures because they can't both access the same file.
Reviewers: asmith, zturner
Reviewed By: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49261
llvm-svn: 336960
Summary:
This patch clears up some of the semantics within the Stage class. Now, preExecute
can be called multiple times per simulated cycle. Previously preExecute was
only called once per cycle, and postExecute could have been called multiple
times.
Now, cycleStart/cycleEnd are called only once per simulated cycle.
preExecute/postExecute can be called multiple times per cycle. This
occurs because multiple execution events can occur during a single cycle.
When stages are executed (Pipeline::runCycle), the postExecute hook will
be called only if all Stages return a success from their 'execute' callback.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb
Subscribers: tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D49250
llvm-svn: 336959
Summary:
This patch gets rid of the C-string parameter in the RawCommandObject::DoExecute function,
making the code simpler and less memory unsafe.
There seems to be a assumption in some command objects that this parameter could be a nullptr,
but from what I can see the rest of the API doesn't actually allow this (and other command
objects and related code pieces dereference this parameter without any checks).
Especially CommandObjectRegexCommand has error handling code for a nullptr that is now gone.
Reviewers: davide, jingham, teemperor
Reviewed By: teemperor
Subscribers: jingham, lldb-commits
Differential Revision: https://reviews.llvm.org/D49207
llvm-svn: 336955
Summary: The current documentation does not include the error parameter.
Reviewers: jingham, asmith
Reviewed By: jingham
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49251
llvm-svn: 336948
Basically, "AttributeList" loses all list-like mechanisms, ParsedAttributes is
switched to use a TinyPtrVector (and a ParsedAttributesView is created to
have a non-allocating attributes list). DeclaratorChunk gets the later kind,
Declarator/DeclSpec keep ParsedAttributes.
Iterators are added to the ParsedAttribute types so that for-loops work.
llvm-svn: 336945
Not all programs want section ordering when compiled with LTO.
In particular, the Linux kernel is very sensitive when it comes to linking, and
doesn't boot when each function is placed in its own sections.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D48756
llvm-svn: 336943
Summary:
This allows counters associated with unused functions to be
dead-stripped along with their functions. This approach is the same one
we used for PC tables.
Fixes an issue where LLD removes an unused PC table but leaves the 8-bit
counter.
Reviewers: eugenis
Reviewed By: eugenis
Subscribers: llvm-commits, hiraditya, kcc
Differential Revision: https://reviews.llvm.org/D49264
llvm-svn: 336941
The one I noticed is sqrtsss/sd, but there could be others.
I had to add a couple new tests that don't have insertelement in there to catch this on the fast-isel path. Otherwise we trigger an abort and use SelectionDAG.
llvm-svn: 336938