Staff AI Runtime Engineer
Bangalore
Scaling Theory Technologies Pvt Ltd
...setups.
- Design and maintain libraries and services that support the model lifecycle: training, checkpointing, fault recovery, packaging, and deployment.
- Implement observability hooks, diagnostics, and resilience mechanisms for deep learning workloads.
- Champion best practices in CI/CD, testing, and software quality across [...]
Category: IT & Telecommunications