Generate ONLY the Verilog module code for the following specification.

## Problem Description

Implement a synchronous streaming stencil engine for an 8x8 tile of signed scalar-field samples.

INPUT TILE FORMAT:
- Input samples are accepted on each rising edge of `clk` for which `rst=0` and `in_valid=1`.
- `tile_start` is sampled only on cycles where an input sample is accepted.
- Each tile contains exactly 64 accepted samples arranged as an 8-row by 8-column grid.
- Within one tile, samples arrive in row-major order: row 0 columns 0 through 7, then row 1 columns 0 through 7, and so on through row 7 column 7.
- The first accepted sample of each tile has `tile_start=1`.
- Every later accepted sample of that same tile has `tile_start=0`.
- The evaluator will present only complete tiles and will not assert `tile_start=1` in the middle of a tile.
- Tiles may be separated by idle cycles with `in_valid=0`, or may arrive back-to-back with the first sample of the next tile accepted on the cycle immediately following the final sample of the previous tile.
- `sample_in` is a signed 8-bit two's-complement value.
- A cycle with `in_valid=0` accepts no sample, does not advance the tile position, and ignores `tile_start`.

STENCIL RULE:
- Let `x[r][c]` denote the accepted sample at row `r` and column `c` of the current tile.
- Valid stencil results exist only for the 6x6 interior points with `1 <= r <= 6` and `1 <= c <= 6`.
- For each interior point, compute the signed 5-point stencil:
  `y[r][c] = x[r-1][c] + x[r+1][c] + x[r][c-1] + x[r][c+1] - 4*x[r][c]`
- Use exact signed arithmetic. Do not saturate, wrap, round, or truncate.
- `stencil_out` must present the exact result as an 11-bit signed two's-complement value.

OUTPUT ORDER AND TIMING:
- For evaluation, an output is associated with each accepted input sample `x[r][c]` for which `r >= 2` and `c >= 2`.
- If the accepted input on cycle `T` is `x[r][c]` with `r >= 2` and `c >= 2`, then on cycle `T+1`:
  - `out_valid` must be 1
  - `stencil_out` must equal `y[r-1][c-1]`
- Therefore, the valid outputs for one tile appear in row-major order over the 6x6 interior points: `y[1][1], y[1][2], ..., y[1][6], y[2][1], ..., y[6][6]`
- The result associated with the sample accepted on cycle `T` must not appear on cycle `T` itself, and it must not be delayed beyond cycle `T+1`.
- Example timing: if `x[2][2]` is accepted on cycle 20, then on cycle 20 `out_valid` must be 0, and on cycle 21 `out_valid` must be 1 with `stencil_out = y[1][1]`.
- If another input sample is accepted on cycle `T+1`, the module must still output the previously scheduled stencil result on cycle `T+1` while accepting the new sample.
- The module must continue accepting new input samples while producing an output for an earlier interior point.
- On any cycle for which no output is scheduled, `out_valid` must be 0 and `stencil_out` must be 0.

RESET:
- `rst` is synchronous and active-high.
- While `rst=1`, accept no input samples, discard any partial tile state, cancel any pending output, and drive `out_valid=0` and `stencil_out=0`.
- After reset is deasserted, the next accepted input sample must again be the first sample of a new tile and must be presented with `tile_start=1`.
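
The stencil rule and output schedule above can be cross-checked against a small software reference model. The Python sketch below is illustrative only and is not part of the required Verilog deliverable. The names `stencil`, `expected_outputs`, and `accept_cycles` are hypothetical and were introduced for this example. The sketch assumes a single complete tile, no reset during the tile, and a caller-supplied list of the cycle numbers on which the 64 samples were accepted.

```python
def stencil(tile, r, c):
    """Exact signed 5-point stencil at interior point (r, c) of an 8x8 tile."""
    return (tile[r - 1][c] + tile[r + 1][c]
            + tile[r][c - 1] + tile[r][c + 1]
            - 4 * tile[r][c])


def expected_outputs(tile, accept_cycles):
    """Return {cycle: stencil_out} for every cycle on which out_valid must be 1.

    tile          -- 8x8 list of ints, each in [-128, 127]
    accept_cycles -- 64 strictly increasing cycle numbers, one per accepted
                     sample in row-major order (assumed harness input)
    Cycles absent from the result expect out_valid=0 and stencil_out=0.
    """
    schedule = {}
    for idx, cycle in enumerate(accept_cycles):
        r, c = divmod(idx, 8)          # row-major position of this sample
        if r >= 2 and c >= 2:          # accepting x[r][c] schedules y[r-1][c-1]
            schedule[cycle + 1] = stencil(tile, r - 1, c - 1)
    return schedule


# Checkerboard of extreme values: -128 where r+c is even, 127 elsewhere.
tile = [[-128 if (r + c) % 2 == 0 else 127 for c in range(8)] for r in range(8)]

# Back-to-back acceptance starting on cycle 2, so x[2][2] lands on cycle 20.
schedule = expected_outputs(tile, list(range(2, 66)))

print(len(schedule))   # 36 interior results
print(schedule[21])    # y[1][1] = 4*127 - 4*(-128) = 1020
print(schedule[22])    # y[1][2] = 4*(-128) - 4*127 = -1020
```

With this acceptance pattern, `x[2][2]` is accepted on cycle 20 and the model schedules `y[1][1]` on cycle 21, which matches the timing example. The checkerboard tile produces the extreme results 1020 and -1020, both of which fit in the 11-bit signed range of -1024 to 1023, so no saturation or truncation is needed.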
## Interface Specification

Module Name: streaming_5point_stencil

Ports:
- input 1 clk // System clock
- input 1 rst // Synchronous active-high reset
- input 1 in_valid // High when `sample_in` is an input sample to be accepted on this cycle
- input 1 tile_start // Asserted only on the first accepted sample of each 8x8 tile
- input 8 sample_in // Signed 8-bit two's-complement scalar-field sample
- output 1 out_valid // High when `stencil_out` contains the valid stencil result scheduled for this cycle
- output 11 stencil_out // Signed 11-bit two's-complement 5-point stencil result

## Requirements

- Generate ONLY the Verilog module code
- Do NOT output any reasoning, analysis, scratchpad, or tags
- Start directly with `module streaming_5point_stencil` as the first line of your response
- Do NOT include any testbenches
- Do NOT include any explanations or comments outside the code
- End with `endmodule`
- Ensure the code is correct and synthesizable