Staff / Lead Data Engineer

MacroXStudio, Inc.


San Francisco, CA (Hybrid)
Salary: $200,907 per year

About MacroX

MacroX is building the next generation of AI-powered macroeconomic intelligence. We combine alternative data, machine learning, and large language models to help investors, businesses, and policymakers understand economic trends in real time.



The Opportunity

We are looking for a Staff / Lead Data Engineer to architect and scale our end-to-end data platform. You will lead the development of the infrastructure powering our AI, machine learning, and macroeconomic forecasting systems.

This role is ideal for an experienced engineer who enjoys building large-scale data systems, working with cutting-edge AI technologies, and shaping the technical direction of a rapidly growing company.



What You'll Do


  • Architect and manage the company’s end-to-end data platform, enabling scalable ingestion, transformation, and consumption of high-frequency data, while ensuring compatibility with downstream AI/ML pipelines and LLM applications.

  • Define and implement data engineering and MLOps standards, including unified data schemas, automated feature stores, model input pipelines, and CI/CD for model retraining and deployment.

  • Lead the integration of AI-powered data validation and anomaly detection systems to proactively identify quality issues and enhance pipeline resilience using ML-based diagnostics.

  • Collaborate with data scientists and ML engineers to productionize AI models, ensuring robust data pre-processing, real-time feature serving, and post-model monitoring infrastructure.

  • Drive the development of vector search systems (e.g., ChromaDB, FAISS) and retrieval-augmented generation (RAG) architectures to support LLM-based internal tools and customer-facing AI services.

  • Build observability and lineage systems enhanced with AI agents that flag broken data contracts, track schema drift, and auto-generate pipeline performance summaries for engineering teams.

  • Champion ethical AI data governance, ensuring pipelines enforce fairness, compliance (e.g., GDPR/CCPA), and explainability by design across ML data assets and training workflows.

  • Lead recruitment, mentorship, and technical upskilling for a hybrid team of data engineers and ML infrastructure engineers, fostering a culture of rapid iteration and responsible innovation.

  • Serve as the technical voice for AI and data strategy in leadership forums, collaborating with product, legal, and GTM teams to align data/AI investments with business and customer impact.

  • Deploy real-time streaming pipelines with AI event triggers (e.g., fraud detection, personalization), building intelligent data flows that adapt to business events using machine learning-driven routing and transformation logic.

  • This position follows a hybrid schedule: four days per week in the office and one day of telecommuting.



Minimum Requirements

    Master’s degree or foreign equivalent in Data Science or a related field, plus two (2) years of experience as a Senior Data Engineer or in a related occupation.

    Must have experience with the following:
    - Programming languages: Python, SQL, R
    - Data tools: Jupyter Notebook, JupyterLab
    - Machine learning frameworks: scikit-learn, XGBoost, LightGBM, TensorFlow, PyTorch
    - Model tracking: MLflow, Weights & Biases
    - Data processing: Pandas, NumPy, Spark (PySpark), Airflow, dbt
    - Databases: PostgreSQL, MySQL, BigQuery, Redshift, MongoDB, Cassandra
    - Cloud platforms: Google Cloud Platform (GCP), Amazon Web Services (AWS)
    - Version control & deployment: Git, GitHub, GitLab, Docker, REST APIs


    To Apply: Interested applicants may click the APPLY NOW button or link to apply for this position, or send a resume to: communications@macroxstudio.com