Critical Path Institute (C-Path) is a nonprofit engaged in the creation of partnerships and innovative processes that improve human health by reducing the time, cost, and risk in developing and approving new therapies. For over fifteen years, we have partnered with industry and academic experts to advance technologies across the spectrum of medical product development from research to regulatory approval. As a leading nonprofit organization dedicated to fostering collaboration and promoting data sharing in the precompetitive space, C-Path has been at the forefront of numerous advances designed to get new treatments to patients more quickly. Our continuing success is made possible by a combination of public and private support from those who share our vision to accelerate a path to a healthier world.
POSITION OVERVIEW
We are seeking an experienced candidate for the position of Data Engineer II (DE II) within the Data Engineering (DE) Team, which is part of the Data Science (DS) Team within C-Path's Data Collaboration Center (DCC). This is an Extended Temporary Employment (ETE) position through Aug. 31, 2024, with the possibility of renewal pending funding availability.
The C-Path Data Science Team is developing a sophisticated data management workflow for transforming deidentified patient data into multiple standards (SDTM, OMOP, FHIR), mapping across standards, integrating clinical and real-world patient data using ontologies, enhancing patient data with public data sources via a knowledge graph, and sharing data via our data and analytics platform. The DE II will be responsible for developing and enhancing data pipelines, interfaces, and automation tools that help us comply with FAIR data principles and data security requirements. The DE II will work as part of a team of data managers, analysts, and scientists to automate data processes or make them more efficient, to improve data quality assurance and semantic interoperability, and to transform raw data into machine-readable and analysis-ready forms.
We are seeking a very specific candidate who has experience developing data standardization, mapping, and curation workflows; working with relational and graph databases; using and mapping across biomedical data standards such as OMOP, FHIR, SDTM, or ontologies; modeling data in relational and graph structures; and working in a collaborative environment that includes not only the use of tools like git and Jira, but also regular interaction with teammates. The most successful candidate will have a deep understanding of data concepts and their applications in life science research, support data sharing and open science, and love metadata and coding. In return, we offer an innovative, supportive, and rewarding work environment.
SUPERVISORY RESPONSIBILITIES
Non-supervisory position
CORE DUTIES/RESPONSIBILITIES
The DE II will develop and maintain tools to support data managers and data analysts:
- Carry out coding and code versioning, primarily using Python and GitLab.
- Develop data pipelines that automate programming tasks using APIs and automation tools.
- Perform quality assurance on data pipelines and automation tools for incoming data curation and ETL/ELT.
- Create databases in collaboration with the infrastructure team.
- Contribute to requirement specification, testing, and documentation of pipelines, databases, and platforms.
- Support interface specifications (API or other) for federated querying and other interoperability needs.
- Support Data Managers and consortia/projects with technical aspects of data curation.
- Independently or with vendor support, develop GUIs or platforms for easy exploration of the metadata catalog.
The DE II will support the Data Engineering (DE) Team, Data Science (DS) Team, and Data Collaboration Center (DCC):
- Contribute to internal reporting to DS and project leads on development status.
- Provide support for decisions on which software and data systems the DE Team and DCC should use.
Additional duties may include:
- Present the work of the DE Team at internal and external meetings.
- Contribute to publications or presentations about the work of the DS Team, DCC, and consortium- or project-led efforts.
- Develop pipelines to automate annotation of ontology terms, and collaborate to integrate ontologies into data structures for optimal querying.
- Occasional travel for out-of-town meetings or training (less than 10%).
REQUIRED KNOWLEDGE, SKILLS AND ABILITIES
- Experience with life science data is required, especially in a biomedical research setting.
- Experience with Python, SQL, and SPARQL
- Experience that enables you to innovate data management or semantic interoperability processes (JSON, RDF, OWL, SPARQL, knowledge graphs, or similar)
- Experience managing version-controlled code using git.
- Knowledge and experience working with relational and unstructured data architectures.
- Ability to translate data science infrastructure needs and liaise with scientific and DevOps teams.
- Experience working in a cloud environment (AWS and Azure preferred).
- Knowledge of multiple kinds of databases (SQL and NoSQL), data platforms, and understanding of data modeling concepts
- Strong organizational and written and spoken communication skills.
REQUIRED EDUCATION AND EXPERIENCE
- 5+ years of directly relevant professional experience in data management, analysis, or engineering in the healthcare or life sciences field
- 3+ years of experience working with databases, including queries in SQL or SPARQL
- 3+ years of experience working with life science data, including clinical research (i.e., clinical trials) and real-world data
- 3+ years of experience with cloud computing platforms (Amazon or Azure), data and code versioning, and data transformation workflows
- 3+ years of experience using APIs for (meta)data access and management.
Critical Path Institute is an equal opportunity employer. Visit our website at
The above statements describe the general nature and level of work only. They are not an exhaustive list of all required responsibilities, duties, and skills. Other duties may be added, or this description amended, at any time.
COVID-19
All C-Path employees must be vaccinated against COVID-19 to safeguard the health of our employees, their families, and the community at large.
Reasonable Accommodation:
Newly hired employees in need of an exemption from this policy due to a medical reason or because of a sincerely held religious belief must submit a completed request for accommodation form to the human resources department to begin the interactive accommodation process as soon as possible. Accommodations will be granted where they do not cause C-Path undue hardship or pose a direct threat to the health and safety of others. Please direct any questions regarding this policy to the human resources department.