Rancho Biosciences, a Data Science service company, is seeking a Biological Data Architect to join our team. The successful candidate will be responsible for the design, development, and implementation of data models to support the storage and integration of various omics data, such as genomics, proteomics, and metabolomics, within the biological space. A strong background in data modeling, ETL (Extract, Transform, Load) pipeline development, and programming languages like R and Python is essential. The candidate must also demonstrate domain knowledge in imaging and single cell data analysis.
We are a Data Science service company working with some of the most renowned pharmaceutical companies in the world. Our team of scientists, curators, computational biologists, data scientists, and solution developers is located throughout the country; we support talented people living where they choose and working collaboratively on projects that have a real impact on human health.
Key Responsibilities:
- Design, develop, and maintain data models for Biological Databases that support the integration and storage of multiple omics data types, including genomics, proteomics, and metabolomics.
- Develop, optimize, and manage ETL pipelines for data ingestion, ensuring seamless integration of diverse datasets into the Biological Databases.
- Write and maintain robust and efficient R and/or Python scripts for data analysis, transformation, and visualization.
- Utilize SQL and command-line tools to perform efficient data manipulation and query tasks.
- Stay up-to-date with cutting-edge technologies and best practices in the field, such as Spark and other distributed computing frameworks.
- Collaborate with researchers and other stakeholders to understand their data-related needs.
- Troubleshoot and resolve data-related issues and maintain data quality and integrity in the databases.
- Collaborate with other team members and cross-functional teams to support various data-related projects and contribute to the continuous improvement of data engineering processes and best practices.
Requirements:
- Bachelor's or Master's degree in bioinformatics, computer science, biology, or a related field.
- Strong experience in data modeling and designing database structures for Biological Data.
- Proven experience in developing ETL pipelines for data ingestion, preferably in the omics domain.
- Proficiency in R and/or Python, and experience with SQL and command-line tools.
- Knowledge of Spark or other distributed computing frameworks is a plus.
- Familiarity with biological imaging, single cell data, and related analytical methods is a plus.
- Excellent analytical, problem-solving, and communication skills.
- Ability to work independently and as part of a team, and to adapt to changing priorities and technologies in a fast-paced environment.
- Excellence in managing scientific stakeholder relationships.
- Experience translating project requirements to both business process solutions and technical solutions.
- Independently driven, hardworking, and committed.
- Able to communicate effectively at all levels.
- Detail-oriented and well organized, with an ability to work collaboratively and remotely.