Fissionlabs - Senior AI/ML Developer
Pune/Hyderabad
FISSION COMPUTER LABS PRIVATE LIMITED
...phases, memory-bound vs compute-bound operations.
- Experience with quantization methods (INT4/INT8, GPTQ, AWQ) and model parallelism strategies.

Inference Frameworks:
- Hands-on experience with production inference engines: vLLM, TensorRT-LLM, DeepSpeed-Inference, or TGI.
- Proficiency with serving frameworks: Triton Inference [...]
Category: IT & Telecommunications