Job Description
Architecting the Intelligence of Tomorrow
We are seeking a visionary Generative AI Architect to lead the development of our next-generation LLM infrastructure. As we look toward the future, our mission is to build autonomous AI agents capable of complex reasoning and creative problem-solving.
At Nebula Future Systems, we don't just follow trends; we define the roadmap for 2026. You will work on cutting-edge multimodal models, optimizing inference for real-time applications, and ensuring our systems scale to millions of users worldwide.
Responsibilities
- Design and implement scalable inference pipelines for Large Language Models (LLMs).
- Research and deploy Retrieval-Augmented Generation (RAG) architectures to enhance model accuracy and reduce hallucinations.
- Collaborate with cross-functional teams to fine-tune foundation models on proprietary datasets.
- Optimize model latency and reduce token generation costs through aggressive quantization and pruning.
- Build robust evaluation frameworks to measure model performance and drive continuous improvement.
- Lead architectural decisions for the next evolution of our AI product suite.
Qualifications
- Master's or Ph.D. in Computer Science, Machine Learning, or a related quantitative field.
- 5+ years of experience in applied machine learning or deep learning engineering.
- Proficiency in Python and in PyTorch or TensorFlow, with a deep understanding of neural network architectures.
- Strong experience with Hugging Face Transformers, LangChain, and vector databases.
- Experience deploying models to production environments using Kubernetes and Docker.