Generate ONLY the Verilog module code for the following specification.

## Problem Description

Implement a synchronous accelerator that computes the signed dot product of two streamed 8-element vectors.

INPUT VECTOR FORMAT:
- An input pair is accepted on each rising edge of `clk` for which `rst=0` and `in_valid=1`.
- Each accepted pair contains one signed 8-bit element from vector A and one signed 8-bit element from vector B.
- The signed values use two's-complement representation.
- Each vector pair consists of exactly eight accepted input pairs, numbered `(a0, b0)` through `(a7, b7)` in acceptance order.
- For accepted pairs 0 through 6 of a vector pair, `last_in` is 0.
- For accepted pair 7 of that same vector pair, `last_in` is 1.
- `last_in` is sampled only on cycles where an input pair is accepted. If `in_valid=0`, no pair is accepted and `last_in` is ignored.
- The evaluator may present vector pairs back-to-back with no idle cycle. If the final pair of one vector pair is accepted on cycle N, the first pair of the next vector pair may be accepted on cycle N+1.
- The evaluator will not insert idle cycles inside a vector pair.

DOT-PRODUCT RULE:
- For each accepted vector pair, compute:
  `dot = a0*b0 + a1*b1 + a2*b2 + a3*b3 + a4*b4 + a5*b5 + a6*b6 + a7*b7`
- Multiplication and addition must use signed arithmetic.
- Use exact arithmetic. Do not saturate, wrap, truncate, or round.
- `result` must present the exact dot product as a signed 32-bit two's-complement value.

OUTPUT TIMING:
- Each accepted vector pair must produce exactly one output cycle with `result_valid=1`.
- If the final input pair of a vector pair is accepted on cycle N (`in_valid=1` and `last_in=1` on that cycle), then on cycle N+2:
  - `result_valid` must be 1
  - `result` must equal the exact dot product for that completed vector pair
- That result must not appear on cycle N or N+1, and must not be delayed beyond cycle N+2.
- The module must continue accepting new input pairs while producing `result` for an earlier vector pair.
- Whenever no result is scheduled for a cycle, `result_valid` must be 0 and `result` must be 0.

RESET:
- `rst` is synchronous and active-high.
- While `rst=1`, do not accept input pairs, discard any partial vector pair, cancel any pending result, and drive `result_valid=0` and `result=0`.
- After reset is deasserted, the next accepted input pair becomes pair 0 of a new vector pair.

## Interface Specification

Module Name: streaming_dot_product8

Ports:
- input 1 clk // System clock
- input 1 rst // Synchronous active-high reset
- input 1 in_valid // High when `a_data`, `b_data`, and `last_in` describe an input pair to be accepted on this cycle
- input 8 a_data // Signed 8-bit two's-complement element from vector A
- input 8 b_data // Signed 8-bit two's-complement element from vector B
- input 1 last_in // Asserted only on the accepted final pair of each 8-element vector pair
- output 1 result_valid // High when `result` contains the valid dot product for one completed vector pair
- output 32 result // Signed 32-bit two's-complement dot-product result

## Requirements

- Generate ONLY the Verilog module code
- Do NOT output any reasoning, analysis, scratchpad, or tags
- Start directly with `module streaming_dot_product8` as the first line of your response
- Do NOT include any testbenches
- Do NOT include any explanations or comments outside the code
- End with `endmodule`
- Ensure the code is correct and synthesizable
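
For reference only, the dot-product rule and the N+2 output timing described in the Problem Description can be sketched as a cycle-level behavioral model in Python. This is a minimal illustration, not the requested Verilog and not a definitive golden model. The names `to_signed8` and `simulate` are hypothetical, and the model assumes that a cycle with `rst=1` clears all state and forces that cycle's output to zero, which is one reading of the RESET rules.

```python
def to_signed8(x):
    """Interpret the low 8 bits of x as a two's-complement value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def simulate(cycles):
    """Hypothetical cycle-level reference model.

    `cycles` is a list of (rst, in_valid, a_data, b_data, last_in) tuples,
    one per rising clock edge, indexed from cycle 0. Returns a list of
    (result_valid, result) tuples; entry k is the output on cycle k.
    """
    acc = 0          # running sum for the vector pair in progress
    scheduled = {}   # cycle index -> dot product due on that cycle
    outputs = []
    for k, (rst, in_valid, a_data, b_data, last_in) in enumerate(cycles):
        if rst:
            # Assumption: reset discards the partial sum and pending results.
            acc = 0
            scheduled.clear()
        elif in_valid:
            acc += to_signed8(a_data) * to_signed8(b_data)
            if last_in:
                scheduled[k + 2] = acc  # final pair on cycle N -> result on N+2
                acc = 0
        if k in scheduled:
            outputs.append((1, scheduled.pop(k)))
        else:
            outputs.append((0, 0))  # no result scheduled: both outputs are 0
    return outputs


# Usage: A = 1..8, B = all -1 (0xFF), followed by two idle cycles.
vector = [(0, 1, a, 0xFF, 1 if a == 8 else 0) for a in range(1, 9)]
idle = [(0, 0, 0, 0, 0)] * 2
print(simulate(vector + idle)[9])  # (1, -36): final pair on cycle 7, result on cycle 9
```

Python integers are unbounded, so the sum is exact. The largest possible magnitude is 8 * 128 * 128 = 131072, which fits comfortably in the signed 32-bit `result`.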