Software Engineer – Data & Infrastructure

LLM Trust & Safety

Our client is a fast-growing AI company building foundational AI safety and reasoning systems designed to keep advanced AI models aligned, reliable, and under human oversight. As AI capabilities accelerate, the safety infrastructure behind them must scale just as quickly. The team is developing the core data, evaluation, and training systems that ensure models behave safely, consistently, and with human-aligned reasoning at scale.

If you want to build the data backbone behind cutting-edge AI safety research, this role gives you the opportunity to shape the pipelines and infrastructure that power safe, trustworthy AI.

What You’ll Do

  • Design, build, and maintain the data lake, warehouse, and ingestion pipelines that power training, evaluation, and safety research
  • Develop scalable ETL/ELT processes, ingesting structured and unstructured data from diverse internal and external sources
  • Build orchestration workflows using tools like Airflow, Prefect, Dagster, or Argo, ensuring reliability and observability
  • Collaborate with ML engineers to deliver high-quality datasets for model training, safety evaluations, and RLHF pipelines
  • Implement robust data quality checks, validation layers, and monitoring systems for safety-critical data
  • Optimize data storage, compute usage, and distributed processing systems
  • Contribute to infrastructure decisions related to storage, schema design, data governance, and scaling
  • Help develop tooling that accelerates annotation, evaluation, and rubric-driven feedback loops
  • Improve internal developer experience across data pipelines, environments, and CI/CD workflows

What We’re Looking For

  • 4+ years of experience in software or data engineering building production-grade pipelines or infrastructure
  • Strong experience with Python, SQL, and modern data engineering frameworks
  • Hands-on expertise with data lakes, warehouses, ETL pipelines, and distributed data processing
  • Familiarity with cloud infrastructure (AWS, GCP, or Azure), containerization, and orchestration
  • Experience building and scaling systems with tools like Spark, Ray, Kafka, Airflow, Dagster, or Argo
  • Strong debugging and systems-thinking mindset across data, infrastructure, and backend components
  • Understanding of versioning, schema evolution, and reliability principles for critical data assets
  • Ability to collaborate closely with ML teams and translate ambiguous requirements into clear data workflows

Bonus Points

  • Experience with:
    • DevOps / platform engineering
    • distributed compute (Ray, Spark, Kubernetes)
    • data governance, cataloging, or lineage systems
    • automated evaluation or annotation pipelines
    • safety-, ML-, or research-oriented data environments
  • Prior work supporting ML training, evaluation pipelines, or AI safety initiatives

Adam@intelletec.com