Senior GPU Optimisation Engineer
Bangalore
Silverpeople
...performance on specific GPU hardware
- Profile models end-to-end to identify GPU bottlenecks: memory bandwidth, kernel launch overhead, fusion opportunities, quantization constraints
- Design and implement custom kernels (CUDA/Triton/Tinygrad) for performance-critical model sections
- Perform operator fusion, graph optimization, [...]
Category: IT & Telecommunications