This change removes the current timers with ones that partition time properly.
The current timers are nested, so that if a new timer, B, starts when the
current timer, A, is already timing, A's time will include B's. To eliminate
this problem, the partitioned timers are designed to stop the current timer (A),
let the new timer run (B), and when the new timer is finished, restart the
previously running timer (A). With this partitioning of time, a threads' timers
all sum up to the OMP_worker_thread_life time and can now easily show the
percentage of time a thread is spending in different parts of the runtime or
user code.
There is also a new state variable associated with each thread which tells where
it is executing a task. This corresponds with the timers: OMP_task_*, e.g., if
time is spent in OMP_task_taskwait, then that thread executed tasks inside a
#pragma omp taskwait construct.
The changes are mostly changing the MACROs to use the new PARITIONED_* macros,
the new partitionedTimers class and its methods, and new state logic.
Differential Revision: http://reviews.llvm.org/D19229
llvm-svn: 268640
The instruction A2_tfrpi has a 64-bit operand, while the corresponding
intrinsic takes a 32-bit value. The actual value has only 8 significant
bits, so the difference is only in the type used to represent it.
In order to map the intrinsic to the instruction, the operand needs to
be extended to the correct type.
llvm-svn: 268635
Summary:
Some PHIs can have expressions that are not AddRecExprs due to the presence
of sext/zext instructions. In order to prevent the Loop Vectorizer from
bailing out when encountering these PHIs, we now coerce the SCEV
expressions to AddRecExprs using SCEV predicates (when possible).
We only do this when the alternative would be to not vectorize.
Reviewers: mzolotukhin, anemet
Subscribers: mssimpso, sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D17153
llvm-svn: 268633
This backend was supposed to generate C++ code which will re-construct
the LLVM IR passed as input. This seems to me to have very marginal
usefulness in the first place.
However, the code has never been updated to use IRBuilder, which makes
its current value negative -- people who look at the output may be
steered to use the *wrong* C++ APIs to construct IR.
Furthermore, it's generated code that doesn't compile since at least
2013.
Differential Revision: http://reviews.llvm.org/D19942
llvm-svn: 268631
[mips] On error, ParseDirective should always return false to signify that the
directive was understood.
Reviewers: dsanders, vkalintiris, sdardis
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: http://reviews.llvm.org/D19929
llvm-svn: 268630
Summary:
When launching ThinLTO backends in a distributed build (currently
supported in gold via the thinlto-index-only plugin option), emit
an individual index file for each backend process as described here:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098272.html
The individual index file encodes the summary and module information
required for implementing the importing/exporting decisions made
for a given module in the thin link step.
This is in place of the current mechanism that uses the combined index
to make importing decisions in each back end independently. It is an
enabler for doing global summary based optimizations in the thin link
step (which will be recorded in the individual index files), and reduces
the size of the index that must be sent to each backend process, and
the amount of work to scan it in the backends.
Rather than create entirely new ModuleSummaryIndex structures (and all
the included unique_ptrs) for each backend index file, a map is created
to record all of the GUID and summary pointers needed for a particular
index file. The IndexBitcodeWriter walks this map instead of the full
index (hiding the details of managing the appropriate summary iteration
in a new iterator subclass). This is more efficient than walking the
entire combined index and filtering out just the needed summaries during
each backend bitcode index write.
Depends on D19481.
Reviewers: joker.eph
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D19556
llvm-svn: 268627
The function only avaibleble when python is enabled. Guard the new call
in the Java plugin with LLDB_DISABLE_PYTHON until we can change
AddCXXSynthetic to be available in all case to get the build bots green
again.
llvm-svn: 268626
Both Linux and kFreeBSD use glibc, so follow similiar code paths.
Add isTargetGlibc to check for this, and use it instead of isTargetLinux
in a few places.
Fixes PR22248 for kFreeBSD.
Differential Revision: http://reviews.llvm.org/D19104
llvm-svn: 268624
now that the timeout actually means something, we see that sometimes adb is just really slow in
replying to the DONE packet during file push. Give it more time to complete.
llvm-svn: 268623
OpenMP 4.5 adds taskloop/taskloop simd directives. These directives
allow to use lastprivate clause. Patch adds codegen for this clause.
llvm-svn: 268618
Summary:
AdbClient would spin in a loop in ReadAllBytes in case the remote end was closed before reading
the requested number of bytes. Make sure we return an error in this case instead.
Reviewers: ovyalov
Subscribers: tberghammer, danalbert, lldb-commits
Differential Revision: http://reviews.llvm.org/D19916
llvm-svn: 268617
Summary:
We were trying to get a DWARFDIE from a CompileUnit belonging to a DWO file. However, this
function does not understand the die encoding used by the DWO files. Instead use GetDIE on the
SymbolFileDWARF, which is overriden in DWO to do the right thing.
Reviewers: clayborg, tberghammer
Subscribers: lldb-commits, ovyalov
Differential Revision: http://reviews.llvm.org/D19927
llvm-svn: 268615
The result type of setcc is dependent on whether or not AVX512 is
present.
We had an X86-specific DAG-combine which assumed that the result type
should be i8 when it could be i1.
This meant that we would generate illegal setccs which LowerSETCC did
not like.
Instead, use an appropriate type and zero extend to i8.
Also, there were some scenarios where the fold should have fired but
didn't because we were overly cautious about the types. This meant that
we generated:
shrl $31, %edi
andl $1, %edi
kmovw %edi, %k0
kxnorw %k0, %k0, %k1
kshiftrw $15, %k1, %k1
kxorw %k1, %k0, %k0
kmovw %k0, %eax
instead of:
testl %edi, %edi
setns %al
This fixes PR27638.
llvm-svn: 268609
If the linker requested to preserve a linkonce function, we should
honor this even if we drop all uses.
We explicitely avoid turning them into weak_odr (unlike the first
version of this patch in r267644), because the codegen can be
different on Darwin: because of `llvm::canBeOmittedFromSymbolTable()`
we may emit the symbol as weak_def_can_be_hidden instead of
weak_definition.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268607
This reverts commit r267644. Turning linkonce_odr into weak_odr is
a sementic change on Darwin: because of
`llvm::canBeOmittedFromSymbolTable()` we may emit the symbol as
weak_def_can_be_hidden instead of weak_definition.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268606
This was a remaining of a previous scheme where some IPOs were taking
place before we enter this code. This is not relevant anymore.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 268605
We were previously using an output offset of -1 for both GC'd and tail
merged pieces. We need to distinguish these two cases in order to filter
GC'd symbols from the symbol table -- we were previously asserting when we
asked for the VA of a symbol pointing into a dead piece, which would end
up asking the tail merging string table for an offset even though we hadn't
initialized it properly.
This patch fixes the bug by using an offset of -1 to exclusively mean GC'd
pieces, using 0 for tail merges, and distinguishing the tail merge case from
an offset of 0 by asking the output section whether it is tail merge.
Differential Revision: http://reviews.llvm.org/D19953
llvm-svn: 268604
It is insanely hard to write a test that works both on Windows and Unix.
I tried to workaround it with cpio's minor options, but the behaviors of
the options were myterious. It just doesn't worth to spend time on it.
And probably minor options could break buildbots that doesn't have the
GNU version of cpio command.
In this patch, I simply added a separate test file that runs only on
Windows.
llvm-svn: 268596
* an unscoped enumerator whose enumeration is a class member is itself a class
member, so can only be the subject of a class-scope using-declaration.
* a scoped enumerator cannot be the subject of a class-scope using-declaration.
llvm-svn: 268594
This is not meant to report that a value doesn't have a dynamic type - it is only meant as a mechanism to propagate actual type discovery issues (e.g. malformed type metadata for languages that have such a notion)
This information is used by ValueObjectDynamic to set its own m_error, which is a fairly sharp and heavyweight tool to begin with
For the time being, this is an architectural improvement but a practical no-op as no existing runtimes are actually setting errors
llvm-svn: 268591
The code here is recursively Select-ing a new Node to avoid issues
where N is CSE'd during replaceDAGValue and stops being valid. We can
accomplish the same goal in a more principled way by using a
HandleSDNode.
This is essentially a less dodgy fix for PR25733 than the original
attempt back in r255120.
llvm-svn: 268590
r217178 changed clang to add function attribute uwtable by default on
Win64, which caused the __eh_frame section to be emitted by default.
This commit restores the previous behavior for MachO targets.
rdar://problem/25282627
llvm-svn: 268589