Job description
In this role, you will be at the forefront of evaluating multimodal and generative models for real-world health/wellbeing applications, assessing their objective quality and their alignment with human intent and perception across dimensions such as truthfulness, adaptability, and generalizability. You will build data and evaluation pipelines that use both human and synthetic data for model evaluation, leveraging ML techniques such as reinforcement learning from human feedback (RLHF) and adversarial models.

Responsibilities:
- Build the back-end system that generates and loads data from a variety of endpoints (e.g., health databases, human annotations, synthetic generations)
- Build quality and evaluation pipelines, and run model experimentation such as adversarial testing
- Build insights/interpretability tools; explore methods to understand and predict failure modes
- As a critical part of the core multimodal ML development team, innovate solutions that enhance model performance on quality metrics such as robustness and generalizability
- Team up with algorithm engineers to build end-to-end pipelines that prioritize rapid iteration in support of the reliability of a complex, multi-year project