* When both RTL and NEMU report a critical-error, the hardware behavior is considered correct.
Upon detecting the critical-error, the simulation should indicate a "good trap."
* If they do not report it simultaneously, it results in a diff error and triggers an abort.
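The rule above can be sketched as a small decision function. This is only an illustration: the function name and the returned strings are hypothetical and are not taken from the Difftest sources.

```python
def check_critical_error(dut_error, ref_error):
    """Classify one check given the critical-error flags of RTL (DUT) and NEMU (REF)."""
    if dut_error and ref_error:
        # Both sides agree on the critical-error: hardware behavior is correct.
        return "good_trap"
    if dut_error != ref_error:
        # Only one side reported it: treat as a diff error and abort.
        return "diff_error"
    # Neither side reported a critical-error: keep simulating.
    return "continue"
```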
When Squash is enabled, a previously squashed instrCommit may be
submitted together with a newer, already-updated ArchRegState, which
results in a register mismatch with REF.
On the difftest software side, ArchRegState is only used on an
instrCommit or an event (progress = true in code), so we add
updateDependency for ArchRegState to ensure that the state matching
the squashed commits is used, as we already did for CSR.
Previously, in PR #483, we generated the interface from the instances
collected by Gateway. However, these collected instances have
already been processed by Gateway and differ from the originals.
To generate the original interface, we use the instances from
JsonProfile directly to avoid the impact of Gateway optimization.
Also fix a missing semicolon in instance assignment.
---------
Co-authored-by: klin02 <youkunlin20@mails.ucas.ac.cn>
Previously, we could generate a JsonProfile while building a single-core DUT and build a multi-core Difftest Endpoint from that JsonProfile, which supports replicating single-core ports for CHI.
To simplify the connection between cores and difftest, this PR supports generating an Svh interface, which can then be used directly when instantiating cores and difftest.
---------
Co-authored-by: klin02 <youkunlin20@mails.ucas.ac.cn>
When the DUT hits an error and no longer commits instructions,
Difftest previously did not check the data buffered in Squash/Batch;
it only reported Timeout and finished the simulation. However, a
mismatched DiffState in the buffered data may help Difftest identify
the cause of the error.
This change supports a timeout check for Squash/Batch: when the DUT
does not commit instructions for a long time, Difftest flushes the
buffer, then submits and checks its contents.
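A minimal software model of the timeout flush described above, assuming a simple idle-cycle counter. The class, field, and parameter names are hypothetical; the real logic lives in hardware and is more involved.

```python
class SquashBuffer:
    """Toy model: buffer states and flush them once the DUT stops committing."""

    def __init__(self, timeout_cycles):
        self.timeout_cycles = timeout_cycles  # assumed threshold, in cycles
        self.idle_cycles = 0                  # cycles since the last commit
        self.pending = []                     # buffered (squashed) states
        self.submitted = []                   # states handed over for checking

    def step(self, state, has_commit):
        if state is not None:
            self.pending.append(state)
        if has_commit:
            self.idle_cycles = 0
            return
        self.idle_cycles += 1
        # No commit for a long time: flush the buffer so it can be checked.
        if self.idle_cycles >= self.timeout_cycles and self.pending:
            self.submitted.extend(self.pending)
            self.pending.clear()
            self.idle_cycles = 0
```

With `timeout_cycles=3`, a state buffered on a committing cycle is submitted after three consecutive cycles without a commit.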
---------
Co-authored-by: klin02 <youkunlin20@mails.ucas.ac.cn>
Previously, we treated data collected in the same cycle as a whole and
ended batch assembly when the step data was longer than the available
space. This resulted in bubbles in transmission and could not handle
step data longer than the max width of a single transmission.
This change supports splitting step data by collector, appending
part of the data to the output and keeping the remainder in the state.
To shorten the logic path, we divide the complex logic into three
stages.
Splitting step data into two outputs can accept at most 2*MaxByteLen
of data. However, when the previous state already contains data, the
remaining space may not be enough for an incoming peak, so we support
submitting the previous state ahead of time to leave space for the
peak.
Note that step data may be split across different batch funcs and
should be read as a whole, so we avoid buffer-zone switches when batch
is enabled.
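The splitting idea can be illustrated with a simplified software model. It assumes each collector contributes one indivisible chunk no longer than `max_len`; the function and parameter names are hypothetical, and the three-stage pipelining and the two-output limit of the hardware are not modeled.

```python
def assemble(state, chunks, max_len):
    """Append per-collector chunks to `state`, emitting a batch when it is full.

    Returns (outputs, remaining_state). `state` and each chunk are bytes.
    """
    outputs = []
    for chunk in chunks:
        if len(chunk) > max_len:
            raise ValueError("a single collector chunk must fit in one batch")
        if len(state) + len(chunk) > max_len:
            # Not enough room: submit the previous state first,
            # then start a new batch with this chunk.
            outputs.append(state)
            state = b""
        state += chunk
    return outputs, state
```

Unlike ending the batch as soon as the whole step does not fit, this keeps filling the current batch with the leading chunks of the step and carries only the remainder over.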
When the DUT executes a cbo.inval, a set is used to record its cacheline address.
Later, if there is a data mismatch between DUT and GoldenMem within the address range operated on by the cbo.inval instruction, the Pmem of REF and GoldenMem are directly updated with the DUT data.
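A minimal sketch of this bookkeeping, assuming a 64-byte cacheline and dict-based memories. All names here are hypothetical and only model the behavior described above.

```python
CACHELINE_BYTES = 64  # assumption: cacheline size is not stated in the message


class InvalTracker:
    def __init__(self):
        self.inval_lines = set()  # cacheline addresses touched by cbo.inval

    @staticmethod
    def line_of(addr):
        # Align an address down to its cacheline base.
        return addr & ~(CACHELINE_BYTES - 1)

    def record_inval(self, addr):
        self.inval_lines.add(self.line_of(addr))

    def check(self, addr, dut_data, golden_mem, ref_pmem):
        """Return True if the data matches or the mismatch is resolved by syncing."""
        if golden_mem.get(addr) == dut_data:
            return True
        if self.line_of(addr) in self.inval_lines:
            # Mismatch inside an invalidated line: trust the DUT data.
            golden_mem[addr] = dut_data
            ref_pmem[addr] = dut_data
            return True
        return False
```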
To facilitate Batch migration between DPIC and PCIe, we pack the Batch
params into an aligned array and parse it in software.
In transmission, we only need to pass a single piece of hardware
data, as svBitVecVal[] or uint8_t[], and then view it as the generated
struct in software.
Note that we also put step inside BatchInfo and call simv_nstep
inside. Step is still exposed to Top all the time for Verilator and
the Timeout Check.
We also add isFPGA to GatewayConfig and generate the BYTELEN macro
for FPGA.
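The "pass one byte array, view it as a struct" idea can be sketched with Python's `struct` module. The record layout below (type_id, count, step) is purely hypothetical and does not reflect the generated BatchInfo definition.

```python
import struct

# Hypothetical record layout, little-endian, 4 bytes per record:
#   type_id: uint8, count: uint8, step: uint16
RECORD = struct.Struct("<BBH")


def pack_records(records):
    """Hardware-side stand-in: pack (type_id, count, step) tuples into bytes."""
    return b"".join(RECORD.pack(*r) for r in records)


def parse_records(buf):
    """Software side: view the flat byte array as a sequence of records."""
    return [
        {"type_id": t, "count": c, "step": s}
        for t, c, s in RECORD.iter_unpack(buf)
    ]
```

Because the transport only sees a flat byte array, the same parsing code works regardless of whether the bytes arrived over DPI-C or another channel.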
Co-authored-by: Kami <fengkehan@bosc.ac.cn>
To support difftest for projects that use their own palladium flow,
the build of the DPILIB_EMU shared library should be moved to a
separate target.
Signed-off-by: Liu Shan <liushan@bosc.ac.cn>
Signed-off-by: Jiuyue Ma <majiuyue@bosc.ac.cn>
Co-authored-by: Liu Shan <liushan@bosc.ac.cn>
This PR adds support for non-register pending interrupts by copying
them to NEMU. These interrupts come from counter overflow and from
interrupt controllers such as PLIC, CLINT, and IMSIC. Thus, we can
diff the xip CSRs.
As a legacy issue from supporting Sv48, the default level of do_s2xlate when mode=0 is incorrect. Also, in the VS-stage-only case (onlyS1), do_s2xlate should not be performed at all. This patch fixes both problems.
TrapEvent may not raise hasTrap when another bundle is valid, so we
restore needUpdate to mark that there is a valid TrapEvent in this
cycle. In validate, we use getValid to mark a single bundle as valid,
and needUpdate to mark this cycle as valid. We transmit trapEvent
when hasTrap is set or when other bundles are valid.
Since the difftest step function may not be called every cycle, we use
the cycleCnt recorded in trapEvent instead of ticks to check the
timeout for instruction commits.
However, the commit timeout is only checked when difftest checks,
which is triggered by difftest_step. So we also check whether step
has been 0 for more than stuck_limit, for both vcs and emu.
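A minimal sketch of a commit watchdog driven by a cycle counter rather than by the number of calls. The class and parameter names are hypothetical and only illustrate why cycleCnt is the right time base when the check is not invoked every cycle.

```python
class CommitWatchdog:
    """Detect a commit timeout using the DUT cycle count as the time base."""

    def __init__(self, limit):
        self.limit = limit            # assumed timeout threshold, in cycles
        self.last_commit_cycle = 0    # cycle count at the last observed commit

    def check(self, cycle_cnt, committed):
        """Return True if no instruction has committed for more than `limit` cycles."""
        if committed:
            self.last_commit_cycle = cycle_cnt
            return False
        return cycle_cnt - self.last_commit_cycle > self.limit
```

Even if `check` is only called every few hundred cycles, the elapsed time is measured correctly because it is derived from `cycle_cnt` and not from how often the function runs.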
Previously, when InternalStep was defined, we passed difftest_step to
Batch DPIC instead of tb_top, thus putting transmission and
comparison together.
This change passes step to tb_top all the time, which indicates
that a Difftest check has been triggered.
In the Batch DPIC function, some signals, such as coreid/index/address,
serve as buffer-locating info instead of actually being transmitted.
So we cluster the transmitted data to simplify the parsing logic.
This problem was introduced by new hardware elements instantiated by
the DifftestBundle trait, which caused the locating info to be mixed
into the transmitted data.
To make the byteAlign functions compatible with both hardware and
Chisel types, we use DataMirror to check bindings.
The Trace module dumps a trace of the IOs between DUT and difftest.
With a trace file and a JSON profile, we can drive difftest without the
DUT, which speeds up repeated simulation when the DUT is unmodified
and supports modifying and iterating on difftest.
Note that we pad 100 trace entries at the end of the traceFile when
dumping, because with some acceleration features enabled, Difftest may
only finish comparison some cycles after loading the trace.
Since Delayer is now placed in GatewayEndpoint and the trace is loaded
outside it, we dump and load the trace after Delayer to ensure data is
delayed only once. As the delayed trace can be seen as a whole, we can
use bundles with valid to validate others without valid. Dumping after
Delayer also reduces the amount of trace compared to before.
As VCS disallows using an output port directly as a dpic arg, we add
io_dummy as an intermediate value.
---------
Co-authored-by: xiaokamikami <fengkehan@bosc.ac.cn>
Previously, we transmitted all DiffStates without hasValid when some
DiffState was valid in the same cycle. However, when Squash is
enabled, some States without hasValid can be validated with special
conditions. As States with SquashQueue are submitted far more often
than others, this validation greatly reduces the number of submissions
of States without hasValid.
Previously, we also used do_squash and squashDependency to filter out
invalid CSR values. This change renames squashDependency to
updateDependency and only validates CSR on commit and event, so we
remove do_squash.
As TrapEvent is also checked for the instrs Exceed case, we transmit
it when validated, rather than only on hasTrap or hasWFI.
In accelerated and multi-core verification cases, we need to adjust
or build Difftest's hardware side separately.
We use JSON to record the Difftest apply and finish info of the DUT.
Difftest can then be rebuilt for single or multiple cores according
to the Profile.
This commit avoids executing exit(0) in the younger child
after wakeup, which makes the process hang after exit. Instead,
it lets the parent process kill the younger child before waking up
the older one.
This fixes the hang of the simulation process when the
simulation encounters errors with LightSSS enabled.
Co-authored-by: ZhangZifei <zhangzifei16@mails.ucas.ac.cn>