Job Description
Location: Chicago, IL
50% work from office (fully remote is also acceptable).
Looking for 15+ years of experience.
Must-have skill sets: SLURM and big data work on GCP. Candidates must have worked hands-on with both.
We are looking for an architect who can work in infrastructure platform engineering.
They should know how infrastructure is set up for an HPC (high-performance computing) cluster.
Experience with tools that use those clusters to run AI/Client models.
Experience monitoring such infrastructure.
The tool is Posit, formerly known as RStudio.
Responsibilities
- Design, develop, and maintain features in a highly scalable and extensible AI/Client platform for large-scale applications.
- Design and develop abstractions over services in public cloud platforms (AWS, Azure, GCP).
- Work on frameworks for tracking performance, scalability, and reliability across the components of a highly extensible AI/Client platform.
- Work with architects, product managers, and software engineers across teams in a highly collaborative environment.
- Participate in technical discussions and provide insights.
- Write clean code following a test-driven methodology.
- Deliver on commitments in a timely manner following agile software development methodology.
Qualifications
- Bachelor of Science in Computer Science, Computer Engineering, or related fields.
- 15+ years of relevant experience, accompanied by a strong understanding of Computer Science fundamentals.
- High proficiency in coding with Java and Python, and deep knowledge of frameworks such as Spring Boot.
- Strong competency in object-oriented programming, data structures, and algorithms.
- Experience with large-scale distributed systems and a deep understanding of services such as EMR and Dataproc.
- Experience with public cloud platforms such as AWS or GCP.
- Knowledge of HPC, SLURM, and related technologies for high-performance computing.
- Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes), and familiarity with other orchestration systems such as Mesos, DC/OS, and Swarm.
- Strong verbal and written technical communication ability to facilitate collaboration.
- Ability to thrive in a fast-paced, hard-working, dynamic environment and value end-to-end ownership of projects.