Fissionlabs - Senior AI/ML Developer
Pune/Hyderabad
FISSION COMPUTER LABS PRIVATE LIMITED
...prefill vs decode phases, memory-bound vs compute-bound operations.
- Experience with quantization methods (INT4/INT8, GPTQ, AWQ) and model parallelism strategies.

Inference Frameworks:
- Hands-on experience with production inference engines: vLLM, TensorRT-LLM, DeepSpeed-Inference, or TGI.
- Proficiency with [...]
Category: IT & Telecommunications