Infobell IT - GPU Administrator
Bangalore, Infobell IT
...:
- Support data scientists/ML engineers with environment setup and troubleshooting.
- Manage libraries/frameworks: PyTorch, TensorFlow, RAPIDS, JAX, etc.
- Work with distributed training tools (NCCL, Horovod, DeepSpeed) and HPC schedulers (SLURM/Ray).

Monitoring & Troubleshooting:
- Implement monitoring tools: DCGM, [...]
Category: Office & Administration