Generate ONLY the Verilog module code for the following specification.

## Problem Description

Implement a high-throughput sine and cosine calculator with single-cycle throughput. The module computes sin(angle) and cos(angle) for streaming input data. It must accept a new input sample every clock cycle (Initiation Interval = 1).

FIXED-POINT FORMAT:
- Input `angle_in`: 16-bit signed, Q2.14 format (radians). Valid range: [-pi/2, +pi/2] (approx. [-1.57, +1.57]).
- Output `sin_out` / `cos_out`: 16-bit signed, Q1.15 format. Range: [-1.0, +1.0).

INTERFACE PROTOCOL:
- Streaming interface with valid signals.
- When `in_valid` is high, `angle_in` contains valid data to process.
- After a fixed latency (implementation-defined), results appear on `sin_out` and `cos_out` with `out_valid` high.
- The module must sustain single-cycle throughput: a new input can be accepted every clock cycle regardless of latency.
- `rst` clears any in-flight computations; outputs become invalid.

ACCURACY:
- Output values must be within ±0.001 (~32 LSBs in Q1.15) of ideal values.

## Interface Specification

Module Name: cordic_sincos_perf

Ports:
- input 1 clk // Clock signal
- input 1 rst // Synchronous reset, active high
- input 16 angle_in // Input angle, signed Q2.14 fixed-point (radians)
- input 1 in_valid // High when angle_in contains valid data
- output 16 sin_out // Sine result, signed Q1.15 fixed-point
- output 16 cos_out // Cosine result, signed Q1.15 fixed-point
- output 1 out_valid // High when sin_out and cos_out contain valid results

## Requirements

- Generate ONLY the Verilog module code
- Do NOT output any reasoning, analysis, scratchpad, or tags
- Start directly with `module cordic_sincos_perf` as the first line of your response
- Do NOT include any testbenches
- Do NOT include any explanations or comments outside the code
- End with `endmodule`
- Ensure the code is correct and synthesizable
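
Non-normative note on the fixed-point formats and accuracy bound stated in the Problem Description: the Q2.14 input scaling, the Q1.15 output scaling, and the ±0.001 check can be sketched as a small software reference model. This is only an illustrative aid for checking results, not part of the requested Verilog output. All function and constant names below are hypothetical, and saturating +1.0 to 32767 is an assumption (the specification only states that the output range is [-1.0, +1.0)).

```python
import math

Q214_SCALE = 1 << 14   # Q2.14: 1.0 rad maps to 16384
Q115_SCALE = 1 << 15   # Q1.15: 1.0 would map to 32768, which does not fit in 16 signed bits
TOLERANCE = 0.001      # accuracy bound taken from the specification


def clamp16(raw):
    """Saturate an integer to the signed 16-bit range (assumed overflow policy)."""
    return max(-32768, min(32767, raw))


def angle_to_q214(radians):
    """Quantize an angle in radians to a signed 16-bit Q2.14 integer."""
    return clamp16(round(radians * Q214_SCALE))


def q214_to_radians(raw):
    """Interpret a signed Q2.14 integer as radians."""
    return raw / Q214_SCALE


def to_q115(value):
    """Quantize a real value to signed Q1.15, saturating at the range limits."""
    return clamp16(round(value * Q115_SCALE))


def q115_to_float(raw):
    """Interpret a signed Q1.15 integer as a real value."""
    return raw / Q115_SCALE


def reference_sincos(angle_raw):
    """Return ideal (sin, cos) as Q1.15 integers for a Q2.14 input angle."""
    theta = q214_to_radians(angle_raw)
    return to_q115(math.sin(theta)), to_q115(math.cos(theta))


def within_tolerance(angle_raw, sin_raw, cos_raw):
    """Check a (sin_out, cos_out) pair against the ±0.001 accuracy bound."""
    theta = q214_to_radians(angle_raw)
    sin_err = abs(q115_to_float(sin_raw) - math.sin(theta))
    cos_err = abs(q115_to_float(cos_raw) - math.cos(theta))
    return sin_err <= TOLERANCE and cos_err <= TOLERANCE
```

Under these assumptions, an input of 0 yields a cosine of 32767 rather than 32768: the ideal value 1.0 is not representable in Q1.15, and the saturated result is about one LSB low, well inside the stated bound.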