Information Technology 🏢 Full Time ⭐️ Verified

Lead Synthetic Data Engineer: AI Infrastructure for 2026

Nexus Horizon Systems
New York
Estimated Salary
USD 180,000 – USD 260,000
Live Update
15 May 2026
Deadline
15 May 2027

Job Description

We are pioneering the next generation of artificial intelligence infrastructure, building the data foundation for the year 2026 and beyond. Nexus Horizon Systems is seeking a visionary Lead Synthetic Data Engineer to spearhead the development of high-fidelity data generation models. In this role, you will bridge the gap between theoretical AI research and production-grade synthetic data, ensuring our clients have the diverse, unbiased, and scalable datasets required to train next-generation Large Language Models (LLMs) and autonomous agents.

Why Join Us?

  • Impact First: Your work will directly train the models that will power critical industries for the next decade.
  • Future-Proof: Work at the bleeding edge of Generative AI, Synthetic Data, and Privacy-Preserving Machine Learning.
  • Premium Culture: Competitive compensation, flexible remote-first policies, and top-tier mentorship.

Responsibilities

  • Architect and deploy large-scale synthetic data pipelines using Generative Adversarial Networks (GANs) and Diffusion models.
  • Collaborate with ML researchers to define data requirements for upcoming 2026 AI roadmaps.
  • Ensure generated data maintains strict fidelity to real-world distributions while eliminating bias and PII.
  • Optimize model inference speed to support real-time data generation for training workflows.
  • Implement and maintain robust data governance and compliance standards (GDPR, CCPA).
  • Mentor a team of junior data engineers and data scientists.

Qualifications

  • PhD or Master's degree in Computer Science, Statistics, or a related quantitative field.
  • 7+ years of experience in Data Engineering, Machine Learning, or AI research.
  • Deep expertise in Python, PyTorch, TensorFlow, and distributed computing and orchestration tools (Apache Spark, Kubernetes).
  • Proven track record of implementing synthetic data solutions or working with Generative AI models.
  • Strong understanding of statistical distributions, data augmentation, and model evaluation metrics.
  • Excellent communication skills with the ability to translate complex technical concepts to stakeholders.

Required Skills

Python, PyTorch, TensorFlow, Synthetic Data, GANs, Diffusion Models, Distributed Systems, Machine Learning, SQL, Data Governance, AWS, Cloud Architecture

Ready to Take This Challenge?

Make sure your resume is ready. Submit your application now before the deadline.
