Learn why latency and cost are now first-class metrics in LLM evaluation, and how to optimize time to first token (TTFT) and token throughput for production AI.