ML Engineer – MIT AI Risk Initiative, Contractor, Part-time, 6 Months
Published on January 1, 2026 2:23 PM GMT

The MIT AI Risk Initiative is seeking support to build LLM-augmented pipelines that accelerate evidence synthesis and systematic reviews of AI risks and mitigations. The initial contract is part-time for six months, with the possibility of extension.

The immediate use case is building modules to support our review of global organizations' AI risk responses: identifying public documents, screening them for relevance, extracting claims about AI risks and mitigations, and classifying the outputs against several taxonomies.

The bigger picture is generalizing and adapting this pipeline to support living updates and extensions of our risk repository, incident tracker, mitigations review, and governance mapping work.

By contributing your skills to the MIT AI Risk Initiative, you'll help us provide the authoritative data and frameworks that enable decision-makers across the AI ecosystem to understand and address AI risks.

What you'll do:

Phase 1: Org review pipeline (Jan–Mar)
- Build/improve modules for document identification, screening, extraction, and classification
- Build/improve human validation and holdout sampling processes and interfaces so we can measure performance against humans at each step
- Integrate the modules into an end-to-end evidence synthesis pipeline
- Ship something that helps us complete the org review by ~March

Phase 2: Generalization & learning (Mar onwards)
- Refactor for reuse across other AI Risk Initiative projects (incidents, mitigations, governance mapping)
- Implement adaptive example retrieval
- Build change tracking: when prompts or criteria change, what shifts in the outputs?
- Help us understand where LLM judgments can exceed human performance and thus be fully automated, and what still needs human review (and design interfaces and processes to enable this)
- Document the architecture and findings for handoff

Required skills:
- Strong software engineering fundamentals
- Hands-on experience building LLM pipelines
- Python proficiency
- Comfort working on ambiguous problems where "what should we build?" is part of the work
- Clear communication with researchers who aren't software engineers

Nice to have:
- Prior work in research, systematic review, or annotation/labeling contexts
- Experience with evaluation, QA, or human validation
- Familiarity with embeddings and vector search for example retrieval
- API integrations (Airtable or similar) and ETL/scraping-adjacent work

Read more: https://futuretech.mit.edu/opportunities/ml-engineer—mit-ai-risk-initiative-contractor-part-time-6-months

Express interest: https://mitfuturetech.atlassian.net/jira/core/form/a35da49a-3ed9-4722-8eda-2258b30bcc29

Please share with anyone relevant.
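To give a flavor of the "adaptive example retrieval" work mentioned under Phase 2, here is a minimal illustrative sketch: given a pool of human-labeled screening examples with precomputed embeddings, retrieve the few most similar ones to include as few-shot examples when prompting an LLM about a new document. All names, data, and the toy vectors below are hypothetical; a real pipeline would use an embedding model and likely a vector store rather than hand-written vectors.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_examples(query_vec, pool, k=2):
    """Return the k labeled examples whose embeddings are closest to query_vec."""
    ranked = sorted(pool, key=lambda ex: cosine(query_vec, ex["embedding"]), reverse=True)
    return ranked[:k]

# Toy pool: each entry pairs a document snippet and its human label with a
# (fake, 3-dimensional) embedding standing in for a real model's output.
pool = [
    {"text": "Org A publishes an AI incident response plan", "label": "relevant",
     "embedding": [0.9, 0.1, 0.0]},
    {"text": "Org B quarterly earnings report", "label": "irrelevant",
     "embedding": [0.0, 0.2, 0.9]},
    {"text": "Org C risk taxonomy for frontier models", "label": "relevant",
     "embedding": [0.8, 0.3, 0.1]},
]

# A query embedding close to the risk-related snippets retrieves those first.
examples = retrieve_examples([0.85, 0.2, 0.05], pool, k=2)
print([ex["label"] for ex in examples])  # → ['relevant', 'relevant']
```

The retrieved snippets and labels would then be interpolated into the screening prompt, so the few-shot examples adapt to each new document rather than staying fixed.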
