Senior GPU optimisation engineer
Bangalore
Silverpeople
...GPU bottlenecks - memory bandwidth, kernel launch overhead, fusion opportunities, quantization constraints
- Design and implement custom kernels (CUDA/Triton/Tinygrad) for performance-critical model sections
- Perform operator fusion, graph optimization, and kernel-level scheduling improvements
- Tune models to fit GPU memory [...]
Category: IT & Telecommunications