The Health Technologies Team conceives and proves out innovative technology for Apple's future products and features in health.
We are seeking a highly capable Data Engineer to join a multi-disciplinary team. Successful candidates will work closely with our research study leads, data scientists, and engineers to develop and support effective data analysis and machine learning workflows.
Key Qualifications
5+ years of experience in software development
Expert in at least one of the following programming languages: Python, Scala, Java
Expert in large-scale data processing using parallel computing (e.g. Apache Spark, Hadoop, Dask)
Experience with workflow orchestration (e.g., Airflow, Luigi)
Proficiency in the Python programming language
Proficiency in Python frameworks and libraries for scientific computing (e.g., NumPy, pandas, SciPy, PyTorch, PyArrow)
Experience designing and maintaining relational and file system databases (e.g., Postgres, SQL, Parquet, S3, Data Lake)
Strong understanding of infrastructure design
In-depth experience working with enterprise data engineering (DE) tools and the ability to learn and improve in-house DE tools
Experience designing and implementing custom ETL workflows
Demonstrated technical leadership and good communication skills
Description
Work closely with team members and study staff to design, build, launch and maintain systems for storing, aggregating and analyzing large amounts of data
Process, troubleshoot, and clean incoming data from human studies
Automate and monitor data ingestion and transformation pipelines, with hooks for QA, auditing, redaction and compliance checks per data management specifications
Create and maintain databases with existing and incoming clinical data
Architect data models and create tools to harmonize disparate data sources
Incorporate and comply with regulations as they pertain to electronic and clinical data and databases
Education & Experience
BS/MS in Computer Science, Engineering, Informatics, or equivalent
Additional Requirements
Desired/Preferred Qualifications:
- Experience with biomedical sensors/platforms for measuring physiological signals in the health, wellness and/or fitness realm
- Familiarity with best practices for information security, including safe harbor privacy principles for sensitive data
- Experience with machine learning development pipelines
- Experience with data modeling of diverse types of data streams
- Familiarity with AWS (or similar) cloud services and backend development
- Familiarity with development on Linux and macOS
- Familiarity with iOS and watchOS frameworks and app development