- Develop and implement data integration solutions to enable seamless data movement across various systems and platforms.
- Implement efficient data workflows, data pipelines, and ETL processes that handle structured and unstructured data from various sources and ensure the timely delivery of high-quality data.
- Define data models and build data hierarchy structures to support AI/ML model integrations that are reliable and scalable.
- Transform and cleanse data to ensure accuracy, consistency, and integrity.
- Collaborate with data analysts and data scientists to understand data requirements and deliver tailored solutions.
- Troubleshoot and resolve data integration issues in a timely manner.
Data Platform Development and Maintenance:
- Design, develop, and maintain scalable data platforms that support data ingestion, storage, processing, and retrieval.
- Collaborate with cross-functional teams to ensure data platforms meet the organization's evolving data requirements.
- Regularly monitor the data platform's performance, identifying and resolving any issues or bottlenecks.
Data Quality, Governance, and Security:
- Implement and enforce data quality assurance and governance policies, ensuring compliance with relevant data protection regulations and industry best practices.
- Develop and maintain data security measures, including access controls, encryption, and data anonymization techniques.
- Monitor data usage and access patterns, proactively identifying and mitigating potential security risks.
- Collaborate with the IT and cybersecurity teams to address data-related vulnerabilities and incidents.
- Perform data profiling, data validation, and data cleansing activities to ensure data accuracy and completeness.
- Collaborate with stakeholders to identify and resolve data quality issues.
- Define and monitor data quality metrics to measure and improve data quality over time.
- Conduct regular audits and reviews to ensure adherence to data quality standards.
- Ensure data governance and compliance standards, including responsible AI principles and data privacy, are adhered to during data integration and transformation processes.
- Identify and implement performance optimization strategies for data platforms and processes.
- Optimize database design, data structures, and query performance to enhance data retrieval speed.
- Monitor and analyze data processing and query performance metrics, taking proactive actions to optimize their performance.
- Collaborate with infrastructure and network teams to ensure optimal data platform performance.
- Conduct regular performance testing and tuning activities and optimize data platforms for performance, reliability, and security.
- Document data platform architecture, data models, data flows, and technical specifications.
- Create and maintain comprehensive documentation of data engineering processes and workflows.
- Share knowledge and best practices with team members and stakeholders.
- Provide training and support to users on data engineering tools and technologies.
- Contribute to the development and enhancement of data engineering standards and guidelines.
- Continuously research, evaluate, and implement emerging technologies and best practices in data engineering to drive innovation.
- BS in a technical discipline such as Computer Science, Information Systems, Computer Engineering, or a related field. Proven experience as a Data Engineer or Database Developer, or other relevant experience and certifications, is welcome in lieu of a degree.
- Strong understanding of data engineering principles, data management, and data modeling concepts.
- Proficient in programming languages such as Python, Java, or Scala, with experience in database query languages (e.g., SQL).
- Experience with cloud-based data platforms (e.g., AWS, Azure, GCP) and associated services (e.g., S3, Redshift, BigQuery).
- Familiarity with data integration techniques, ETL frameworks (e.g., Apache Spark), and workflow management tools (e.g., Airflow).
- Experience with data streaming and real-time data processing frameworks (e.g., Kafka, Apache Flink, AWS Kinesis).
- Familiarity with machine learning and AI techniques for data analysis and prediction.
- Understanding of data security, encryption, privacy, and compliance requirements.
- Excellent problem-solving and analytical skills, with the ability to optimize data processing pipelines for performance and efficiency.
- Strong communication skills, with the ability to effectively collaborate with cross-functional teams and explain complex technical concepts to non-technical stakeholders.
- Experience with data engineering tools and frameworks such as Apache Airflow, Apache NiFi, Talend, etc.
- Experience with data science tools such as Open Data Hub (Seldon, Prometheus, Dataiku, IBM Watson Studio, etc.).
- Familiarity with deep learning, meaning machine learning based on neural networks with three or more layers that learn from large amounts of data.
- Experience with cloud and big data tools (e.g., blob storage, Redshift, Kafka, Hadoop, Spark, Hive).
- Experience with containerization technologies such as Docker or Kubernetes.
- Competitive base salary
- Annual Bonus
- Industry-leading Benefit Plans (Medical, Dental, Vision)
- Paid time off, including vacation, paid holidays, sick time, and personal days
- 401(k) Plan with company match + additional contribution
- Advancement opportunities
- Career mobility
- Education reimbursement for continued learning
- Training and Development programs
- Well-being program
- Community service and engagement programs
- Product programs
- Free drinks onsite