Senior applied scientist
Bangalore
Consulting firm
...implement quantization, pruning, knowledge distillation, and kernel optimization (CUDA, Triton, TensorRT) to achieve <100 ms latency targets.
- Build and maintain end-to-end ML data pipelines: data collection, validation, cleaning, feature engineering, ETL processes, versioning, and streaming pipelines using Apache Spark, [...]
Category: Education, Training, & Library