Own and maintain data pipelines using AWS Glue (PySpark) and S3.
Process and manage large-scale datasets across Elasticsearch and MongoDB.
Improve data workflows: transformation, cleaning, and normalization.
Enhance search infrastructure using semantic models (LLMs, embeddings, DeepL).
Monitor data quality and schema evolution.
Collaborate with backend and product teams to ensure fast and reliable data features.