Fissionlabs - Senior AI/ML Developer
Pune/Hyderabad
FISSION COMPUTER LABS PRIVATE LIMITED
...frameworks: Triton Inference Server, KServe, or Ray Serve.
- Familiarity with kernel optimization libraries (FlashAttention, xFormers).
Performance Engineering:
- Proven ability to optimize inference metrics: TTFT (first token latency), ITL (inter-token latency), and throughput.
- Experience profiling and resolving GPU memory [...]
Category: IT & Telecommunications