# RV32I Micro-Core Test Strategy

## Scope

The testbench should treat the DUT as a black-box synchronous CPU core with
separate instruction and data memory interfaces. The benchmark is intended
to test end-to-end soft-processor integration rather than any one isolated
building block, so the checker should validate only observable behaviour:
instruction fetch sequencing, register-update effects made visible through
stores, branch and jump control flow, aligned word memory accesses, halt
behaviour, trap behaviour, and reset semantics.

Python is likely useful here because even short CPU programs become tedious
to audit cycle by cycle. If Python is used, it must run only offline, to
generate instruction words and expected outcomes. The runnable Verilog
testbench must hardcode the final expected values and must not execute
Python at runtime.

The concrete directed programs to implement are listed in
`test_programs.md` in this same problem directory.

After offline validation, the final `testbench.v` may hardcode the emitted
instruction words rather than loading a program image from an external
file. This is preferred once the hex has been checked against an assembler
because it keeps the runnable benchmark self-contained and avoids
simulator-specific `$readmemh` parsing quirks.
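
As a rough illustration of the hardcoding step, an offline helper could turn a validated word list into literal Verilog assignments for pasting into `testbench.v`. This is a minimal sketch: the function name `emit_verilog_init` and the array name `imem` are hypothetical, not taken from the existing testbench.

```python
def emit_verilog_init(words, array_name="imem"):
    """Return Verilog assignment lines that hardcode 32-bit words into an array.

    `array_name` is a hypothetical testbench memory name; adjust to match the
    real instruction-memory array.
    """
    lines = []
    for index, word in enumerate(words):
        if not 0 <= word <= 0xFFFFFFFF:
            raise ValueError(f"word {index} does not fit in 32 bits: {word:#x}")
        lines.append(f"{array_name}[{index}] = 32'h{word:08x};")
    return lines


# ADDI x1, x0, 5 followed by EBREAK.
for line in emit_verilog_init([0x00500093, 0x00100073]):
    print(line)
# prints:
#   imem[0] = 32'h00500093;
#   imem[1] = 32'h00100073;
```

Emitting one assignment per word keeps the image reviewable in a diff and needs no file I/O in the simulator.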

## Coverage Goals

- Reset behaviour: confirm the core starts from PC=0 after reset, clears any
  sticky `halted` or `trap` state, and does not issue a store while reset is
  asserted.
- Halt behaviour: confirm `EBREAK` raises `halted` and freezes state.
- x0 behaviour: confirm writes to register x0 are ignored and reads of x0
  behave as zero.
- ALU datapath: verify LUI, ADDI, ADD, SUB, AND, OR, and XOR produce correct
  32-bit results.
- Load/store behaviour: verify LW and SW use correct effective addresses and
  transfer exact 32-bit words on aligned accesses.
- Branch behaviour: verify both taken and not-taken BEQ/BNE cases, including
  forward and backward branches.
- Jump behaviour: verify JAL writes rd = PC + 4 and jumps to the correct
  aligned target.
- Illegal-instruction trap: verify an unsupported encoding raises `trap`
  without performing a register write or store.
- Misalignment trap: verify that misaligned LW and SW addresses, misaligned
  taken-branch targets, and misaligned JAL targets each raise `trap` and
  suppress side effects.
- Sticky terminal state: verify that after `halted` or `trap`, the DUT no
  longer updates state and `dmem_we` remains low until reset.

## Planned Directed Scenarios

- Hold reset active, then deassert it and execute a single `EBREAK`; verify
  clean startup and correct halt behaviour.
- Execute a short arithmetic program using LUI, ADDI, ADD, SUB, AND, OR, and
  XOR; store the computed results to a signature region in data memory.
- Execute a program that attempts to modify x0, then stores x0 and another
  live register to memory; verify x0 remains zero.
- Preload data memory with known words, execute LW and SW on aligned
  addresses, and verify the expected signature words after program halt.
- Execute a branch-heavy loop with both taken and not-taken BEQ/BNE cases;
  verify the loop exits correctly and writes the correct final signature.
- Execute a JAL-based control-transfer program and store the link register to
  memory so the testbench can verify rd = old PC + 4.
- Execute a program containing one illegal instruction word; verify `trap`
  asserts and no store occurs for that instruction.
- Execute separate programs that trigger a misaligned LW address, a
  misaligned SW address, a misaligned taken-branch target, and a misaligned
  JAL target; verify each one traps without committing the invalid
  operation.

## Checking Method

- Model instruction memory as a small word array addressed by `imem_addr[31:2]`.
- Model data memory as a small word array addressed by `dmem_addr[31:2]`,
  with combinational read data and synchronous writes on `dmem_we`.
- Maintain a cycle counter and a watchdog timeout for each program to detect
  hangs or wrong control flow.
- For each directed program, compare:
  - termination reason (`halted` or `trap`)
  - termination within the expected cycle budget
  - final contents of selected signature words in data memory
  - selected store events `(cycle, address, data)` for scenarios where a
    precise external trace is useful
- Treat any unexpected trap, missing trap, wrong signature word, extra store,
  or missing store as a failure.
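
The per-program comparison above can be prototyped offline before it is written in Verilog. The sketch below is only an illustration of the comparison logic; the `Outcome` record and `check_outcome` function are hypothetical names, and the final self-checking code still lives in `testbench.v`.

```python
from dataclasses import dataclass, field


@dataclass
class Outcome:
    reason: str                                 # "halted" or "trap"
    cycles: int                                 # cycles until termination
    signature: dict                             # data-memory word index -> value
    stores: list = field(default_factory=list)  # (cycle, address, data) tuples


def check_outcome(expected, observed, cycle_budget, check_trace=False):
    """Return a list of failure messages; an empty list means the program passed."""
    failures = []
    if observed.reason != expected.reason:
        failures.append(
            f"termination: expected {expected.reason}, got {observed.reason}")
    if observed.cycles > cycle_budget:
        failures.append(
            f"cycle budget exceeded: {observed.cycles} > {cycle_budget}")
    for index, want in sorted(expected.signature.items()):
        got = observed.signature.get(index)
        if got != want:
            shown = "missing" if got is None else f"{got:#010x}"
            failures.append(
                f"signature[{index}]: expected {want:#010x}, got {shown}")
    # Exact trace comparison catches extra, missing, and reordered stores.
    if check_trace and observed.stores != expected.stores:
        failures.append(
            f"store trace: expected {expected.stores}, got {observed.stores}")
    return failures


expected = Outcome("halted", 0, {0: 5}, [(3, 0, 5)])
observed = Outcome("trap", 50, {0: 6}, [])
for message in check_outcome(expected, observed, cycle_budget=20, check_trace=True):
    print(message)
```

Collecting all failures instead of stopping at the first one makes a single run more informative when a program goes wrong in several ways at once.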

## Python Use

If a helper script is added later, use it only offline to generate golden
program images and expected final states for the directed scenarios above.

Planned workflow:

1. Encode each directed program using a small Python helper that implements
   the standard RV32I encodings for only the supported subset.
2. Simulate those programs with an independent ISA-level Python reference
   model that implements:
   - PC update rules
   - x0 hardwiring
   - LUI, ADDI, ADD, SUB, AND, OR, XOR, LW, SW, BEQ, BNE, JAL, EBREAK
   - alignment checks and illegal-instruction detection
   - sticky `halted` and `trap` behaviour
3. Emit the resulting instruction memory contents, initial data memory
   contents, expected termination reason, expected signature words, and any
   expected store trace entries.
4. Copy those literal expected values into `testbench.v`. The final
   simulation must remain pure Verilog and must not call Python.
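
Step 2 of the workflow could look roughly like the sketch below. It is a minimal ISA-level model under stated assumptions, not the project's actual oracle: the class name `RefCore` is hypothetical, data memory is a dictionary keyed by byte address, fetches past the end of the image are assumed to read zero (which decodes as illegal), and stores are recorded in program order without cycle numbers.

```python
MASK32 = 0xFFFFFFFF
EBREAK = 0x00100073

# (funct7, funct3) -> operation for the supported register-register subset.
ALU_OPS = {
    (0x00, 0): lambda a, b: a + b,  # ADD
    (0x20, 0): lambda a, b: a - b,  # SUB
    (0x00, 4): lambda a, b: a ^ b,  # XOR
    (0x00, 6): lambda a, b: a | b,  # OR
    (0x00, 7): lambda a, b: a & b,  # AND
}


def sext(value, bits):
    """Sign-extend the low `bits` bits of `value` to a Python int."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


class RefCore:
    """Hypothetical ISA-level model; dmem maps byte address -> 32-bit word."""

    def __init__(self, imem, dmem=None):
        self.imem, self.dmem = list(imem), dict(dmem or {})
        self.regs, self.pc = [0] * 32, 0
        self.halted = self.trap = False
        self.stores = []  # (address, data) in program order; no cycle numbers

    def _write(self, rd, value):
        if rd != 0:  # x0 is hardwired to zero
            self.regs[rd] = value & MASK32

    def step(self):
        if self.halted or self.trap:
            return  # sticky terminal state
        index = self.pc >> 2
        # Assumption: fetching past the image reads 0, which decodes as illegal.
        word = self.imem[index] & MASK32 if index < len(self.imem) else 0
        opcode, rd, funct3 = word & 0x7F, (word >> 7) & 0x1F, (word >> 12) & 7
        rs1, rs2, funct7 = (word >> 15) & 0x1F, (word >> 20) & 0x1F, word >> 25
        a, b = self.regs[rs1], self.regs[rs2]
        next_pc = (self.pc + 4) & MASK32
        if word == EBREAK:
            self.halted = True
            return
        if opcode == 0x37:  # LUI
            self._write(rd, word & 0xFFFFF000)
        elif opcode == 0x13 and funct3 == 0:  # ADDI
            self._write(rd, a + sext(word >> 20, 12))
        elif opcode == 0x33 and (funct7, funct3) in ALU_OPS:
            self._write(rd, ALU_OPS[(funct7, funct3)](a, b))
        elif opcode in (0x03, 0x23) and funct3 == 2:  # LW / SW
            imm = word >> 20 if opcode == 0x03 else ((word >> 25) << 5) | rd
            addr = (a + sext(imm, 12)) & MASK32
            if addr & 3:  # misaligned access: trap, no side effects
                self.trap = True
                return
            if opcode == 0x03:
                self._write(rd, self.dmem.get(addr, 0))
            else:
                self.dmem[addr] = b
                self.stores.append((addr, b))
        elif opcode == 0x63 and funct3 in (0, 1):  # BEQ / BNE
            imm = sext(((word >> 31) << 12) | (((word >> 7) & 1) << 11)
                       | (((word >> 25) & 0x3F) << 5)
                       | (((word >> 8) & 0xF) << 1), 13)
            taken = (a == b) if funct3 == 0 else (a != b)
            if taken:
                next_pc = (self.pc + imm) & MASK32
                if next_pc & 3:  # only taken branches can trap on alignment
                    self.trap = True
                    return
        elif opcode == 0x6F:  # JAL
            imm = sext(((word >> 31) << 20) | (((word >> 12) & 0xFF) << 12)
                       | (((word >> 20) & 1) << 11)
                       | (((word >> 21) & 0x3FF) << 1), 21)
            target = (self.pc + imm) & MASK32
            if target & 3:  # trap before the link register is written
                self.trap = True
                return
            self._write(rd, next_pc)
            next_pc = target
        else:  # anything outside the supported subset is illegal
            self.trap = True
            return
        self.pc = next_pc

    def run(self, max_steps=1000):
        for _ in range(max_steps):
            if self.halted or self.trap:
                break
            self.step()
        return "halted" if self.halted else "trap" if self.trap else "timeout"


# ADDI x1, x0, 5; SW x1, 0(x0); EBREAK
core = RefCore([0x00500093, 0x00102023, 0x00100073])
print(core.run(), core.stores)  # halted [(0, 5)]
```

Because the model works on already-encoded words and shares no code with the RTL, its termination reason, store list, and final `dmem` contents can be emitted directly as the literals for step 3.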

For raw-word trap cases such as the illegal-instruction test, preserve the
exact intended instruction word in the final Verilog even if an external
assembler flow omits or rewrites it.

## Golden-Data Confidence

Strictly speaking, no single generator can prove its own outputs are 100%
correct. The practical goal is to make the golden data independently
derived, cross-checked, and auditable.

- Keep the oracle independent from the DUT: do not share decode or helper
  logic between the Python model and the RTL.
- Prefer assembling the directed programs with an external RISC-V assembler
  if one is available, then compare the produced words against the Python
  encoder.
- Keep each directed program short enough that the expected final signature
  can be reviewed manually.
- If a second ISA model or simulator is available later, run the same short
  programs there and compare final memory and termination outcomes before
  freezing the vectors into the Verilog testbench.
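
The encoder cross-check in the second bullet might be sketched as follows. The encodings follow the standard RV32I instruction formats for the supported subset; all function names are hypothetical, and `assembler_words` stands in for words that would in practice be parsed from an external assembler's output.

```python
EBREAK = 0x00100073


def _reg(index):
    if not 0 <= index <= 31:
        raise ValueError(f"register index out of range: {index}")
    return index


def _imm(value, bits):
    """Range-check a signed immediate and return its two's-complement field."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"immediate {value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)


def _r(funct7, funct3, rd, rs1, rs2):
    return ((funct7 << 25) | (_reg(rs2) << 20) | (_reg(rs1) << 15)
            | (funct3 << 12) | (_reg(rd) << 7) | 0x33)


def _i(opcode, funct3, rd, rs1, imm):
    return ((_imm(imm, 12) << 20) | (_reg(rs1) << 15)
            | (funct3 << 12) | (_reg(rd) << 7) | opcode)


def _b(funct3, rs1, rs2, offset):
    # Offsets must be even; an offset of 2 mod 4 is allowed on purpose so the
    # misaligned-target trap programs can be encoded.
    if offset % 2:
        raise ValueError("branch offset must be even")
    f = _imm(offset, 13)
    return ((((f >> 12) & 1) << 31) | (((f >> 5) & 0x3F) << 25)
            | (_reg(rs2) << 20) | (_reg(rs1) << 15) | (funct3 << 12)
            | (((f >> 1) & 0xF) << 8) | (((f >> 11) & 1) << 7) | 0x63)


def add(rd, rs1, rs2): return _r(0x00, 0, rd, rs1, rs2)
def sub(rd, rs1, rs2): return _r(0x20, 0, rd, rs1, rs2)
def xor_(rd, rs1, rs2): return _r(0x00, 4, rd, rs1, rs2)
def or_(rd, rs1, rs2): return _r(0x00, 6, rd, rs1, rs2)
def and_(rd, rs1, rs2): return _r(0x00, 7, rd, rs1, rs2)
def addi(rd, rs1, imm): return _i(0x13, 0, rd, rs1, imm)
def lw(rd, rs1, imm): return _i(0x03, 2, rd, rs1, imm)
def beq(rs1, rs2, offset): return _b(0, rs1, rs2, offset)
def bne(rs1, rs2, offset): return _b(1, rs1, rs2, offset)


def sw(rs2, rs1, imm):
    f = _imm(imm, 12)
    return (((f >> 5) << 25) | (_reg(rs2) << 20) | (_reg(rs1) << 15)
            | (2 << 12) | ((f & 0x1F) << 7) | 0x23)


def lui(rd, imm20):
    if not 0 <= imm20 <= 0xFFFFF:
        raise ValueError(f"LUI immediate out of range: {imm20}")
    return (imm20 << 12) | (_reg(rd) << 7) | 0x37


def jal(rd, offset):
    if offset % 2:
        raise ValueError("jump offset must be even")
    f = _imm(offset, 21)
    return ((((f >> 20) & 1) << 31) | (((f >> 1) & 0x3FF) << 21)
            | (((f >> 11) & 1) << 20) | (((f >> 12) & 0xFF) << 12)
            | (_reg(rd) << 7) | 0x6F)


def diff_words(python_words, assembler_words):
    """Return (index, python_word, assembler_word) for every disagreement."""
    mismatches = []
    if len(python_words) != len(assembler_words):
        mismatches.append(("length", len(python_words), len(assembler_words)))
    for index, (mine, theirs) in enumerate(zip(python_words, assembler_words)):
        if mine != theirs:
            mismatches.append((index, mine, theirs))
    return mismatches


python_words = [addi(1, 0, 5), add(3, 1, 2), EBREAK]
assembler_words = [0x00500093, 0x002081B3, 0x00100073]  # stand-in for tool output
print(diff_words(python_words, assembler_words))  # prints []
```

Any non-empty result points at either an encoder bug or an assembler rewrite (for example, a pseudo-instruction expansion), and both are worth resolving before the vectors are frozen.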

This keeps the benchmark deterministic and the expected results
independently cross-checked, while preserving a self-contained runnable
Verilog testbench.

The offline register dumps in `values/` are still useful while reviewing the
assembled programs, but the final self-checking Verilog should treat the CPU
as a black box and verify only externally visible behaviour: termination,
stores, and final memory signatures. This avoids depending on any debug
visibility into the DUT register file and avoids coupling the benchmark to
simulator-specific initial register conventions from external tools.