Summary:
We are seeking a passionate and skilled ML Data Engineer to join our team. You will play a pivotal role in building and maintaining the data infrastructure and pipelines for our cutting-edge Generative AI applications. You will collaborate closely with the Generative AI Full Stack Architect and MLOps Engineer to ensure the quality, security, and accessibility of data for our Generative AI models.
Responsibilities:
- Design, develop, and implement data pipelines for ingesting, pre-processing, and transforming data for Generative AI model training and inference.
- Build and maintain efficient data storage solutions, including data lakes, warehouses, and databases, appropriate for large-scale generative AI datasets.
- Implement data security and governance policies to ensure the privacy and integrity of sensitive data used in Generative AI projects.
- Collaborate with data scientists and engineers to understand data requirements for Generative AI models and translate them into efficient data pipelines.
- Monitor and optimize data pipelines for performance, scalability, and cost-effectiveness.
- Stay up to date on the latest advancements in Data Engineering tools and technologies (e.g., Apache Spark, Airflow, Snowflake, Databricks) and apply them to our Generative AI platform.
- Document data pipelines and processes for clarity and transparency.
- Communicate effectively with technical and non-technical stakeholders about data quality and availability for Generative AI projects.
Qualifications:
- Bachelor’s degree in Computer Science, Data Science, Statistics, or a related field, or equivalent experience.
- 6+ years of experience in Data Engineering or related roles, such as data pipeline development, data storage, or ETL/ELT processes.
- Proven experience building and maintaining data pipelines for Machine Learning projects.
- Strong understanding of data modeling principles, data quality measures, and data security best practices.
- Proficiency in Python and SQL, plus shell scripting (e.g., Bash).
- Familiarity with cloud platforms (e.g., AWS, GCP, Azure) for data storage and processing.
- Excellent communication, collaboration, and problem-solving skills.
- Ability to work independently and as part of a team.
- Passion for Generative AI and its potential to solve real-world challenges.
Job Type: Full-time
Pay: $90,000.00 - $115,000.00 per year
Benefits:
- 401(k) matching
Compensation package:
- Yearly bonus
Experience level:
- 6 years
Schedule:
- 8-hour shift
- Monday to Friday
Experience:
- Data Engineering or related roles: 6 years (Required)
- Building and maintaining data pipelines: 4 years (Required)
- Programming languages like Python, SQL: 3 years (Required)
- Scripting languages (e.g., Bash): 3 years (Required)
- Cloud platforms: 3 years (Required)
- Machine Learning: 3 years (Required)
Work Location: Remote