The Computational Catalysts group within gCS is a diverse, curious and action-driven team at the intersection of computation, engineering and science, with the ambition to advance our technical excellence. The focus of the group is on partnering with the informatics and scientific communities to create a computational and data ecosystem that powers scientific discovery and accelerates decision-making. We aim to modernize our ability to acquire, store, link, share, find and analyze data across the organization through scalable and integrated solutions that truly make every data point count.
In a collaborative effort between the Visualization and Interactive Data Analytics (VIDA) and Bioinformatics and AI/ML departments within the CBT, we are currently seeking a dynamic Summer Intern to contribute to the development of interactive exploration tools. The primary focus will be on interpreting the outcomes generated by Machine Learning models applied to regulatory DNA, such as Borzoi - Predicting RNA-seq from DNA Sequence.
As a member of a team of Summer Interns, the selected candidate will play a key role in advancing our mission to innovate scalable, interactive, and interpretable tools. These tools are designed to enhance our understanding of, and interaction with, large-scale datasets and Machine Learning models.
This internship is on-site in South San Francisco, CA.
Key responsibilities
- Work directly with research software engineers to develop visualization methods compatible with ML models of regulatory sequence
- Contextualize and predict the impact of non-coding variation using ML models
- Prototype and develop exploratory web applications
- Participate in talks, journal clubs, and general research laboratory activities
- Ideally contribute to publication(s) resulting from your summer work
Program Highlights
- Intensive 12-week, full-time (40 hours per week) paid internship.
- Program start dates are in May/June (Summer)
- A stipend, based on location, will be provided to help alleviate costs associated with the internship.
- Ownership of challenging and impactful business-critical projects.
- Work with some of the most talented people in the biotechnology industry.
Who You Are
Required education
- Must be currently enrolled in a PhD program or currently pursuing a Master's degree
Required majors:
- Computer Science, Data Visualization, AI/ML, Computational Biology or other related majors
Required skills
- Proven ability to develop interactive Data Visualization applications using JavaScript or web-based visualization technologies (e.g., d3.js or WebGL).
- Proficiency in Python, with hands-on experience in frameworks for Machine Learning inference, such as PyTorch or TensorFlow.
- Familiarity with fundamental ML concepts and interpretation techniques, such as transformers and saliency maps.
- Practical experience with popular regulatory genomics models like BPNet, Enformer, Borzoi, and DeepMEL.
- Previous educational exposure to basic concepts in cell biology and gene expression.
- Preferred: Knowledge of WebGPU and experience running Machine Learning models in the browser using WebAssembly.
Relocation benefits are not available for this job posting.
The expected salary range for this position, based on its primary location in California, is $22.00 - $55.00 per hour. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. This position also qualifies for paid holiday time off benefits.
#GNE-gCS-2024-Interns
Genentech is an equal opportunity employer, and we embrace the increasingly diverse world around us. Genentech prohibits unlawful discrimination based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin or ancestry, age, disability, marital status and veteran status.