Fissionlabs - Senior AI/ML Developer
Pune/Hyderabad
FISSION COMPUTER LABS PRIVATE LIMITED
...decoding, KV cache optimization (MQA/GQA/PagedAttention), and dynamic batching.
- Deep understanding of prefill vs decode phases, memory-bound vs compute-bound operations.
- Experience with quantization methods (INT4/INT8, GPTQ, AWQ) and model parallelism strategies.
Inference Frameworks:
- Hands-on experience [...]
Category: IT & Telecommunications