Job Description
We are pioneering the next generation of artificial intelligence infrastructure, building the data foundation for 2026 and beyond. Nexus Horizon Systems is seeking a visionary Lead Synthetic Data Engineer to spearhead the development of high-fidelity data generation models. In this role, you will bridge the gap between theoretical AI research and production-grade synthetic data, ensuring our clients have the diverse, unbiased, and scalable datasets required to train next-generation large language models (LLMs) and autonomous agents.
Why Join Us?
- Impact First: The datasets you build will directly train the models that power critical industries over the next decade.
- Future-Proof: Work at the bleeding edge of Generative AI, Synthetic Data, and Privacy-Preserving Machine Learning.
- Premium Culture: Competitive compensation, flexible remote-first policies, and top-tier mentorship.
Responsibilities
- Architect and deploy large-scale synthetic data pipelines using Generative Adversarial Networks (GANs) and diffusion models.
- Collaborate with ML researchers to define data requirements for upcoming 2026 AI roadmaps.
- Ensure generated data maintains strict fidelity to real-world distributions while eliminating bias and personally identifiable information (PII).
- Optimize model inference speed to support real-time data generation for training workflows.
- Implement and maintain robust data governance and compliance standards (GDPR, CCPA).
- Mentor a team of junior data engineers and data scientists.
Qualifications
- PhD or Master's degree in Computer Science, Statistics, or a related quantitative field.
- 7+ years of experience in Data Engineering, Machine Learning, or AI research.
- Deep expertise in Python, PyTorch, TensorFlow, and distributed computing and orchestration tools (Apache Spark, Kubernetes).
- Proven track record of implementing synthetic data solutions or working with Generative AI models.
- Strong understanding of statistical distributions, data augmentation, and model evaluation metrics.
- Excellent communication skills, with the ability to explain complex technical concepts to stakeholders.