Job Description
As AI and HPC converge and reshape not only computing but also business and society, Intel is making major bets on the future of AI and HPC in data center computing. Intel's Datacenter and AI Solutions (DAIS) organization leads Intel's HPC efforts for its CPUs and accelerators, and for the associated software and systems. The DAIS teams' work spans the full range of data center workloads, from generative AI and deep learning to media analytics, HPC, and graphics.
DAIS System Design and Enablement (SDE) is chartered to drive the design and implementation of datacenter-scale systems that deliver best-in-class performance, efficiency, and TCO for AI/ML and HPC workloads. SDE will work with key partners within Intel, as well as ODMs/OEMs, to drive integrated system stack improvements and lead the way in at-scale AI innovation.
Intel's AI datacenter team is seeking a highly motivated AI Cluster Validation Architect. Knowledge of and experience in platforms, validation at scale, network interconnects, and AI frameworks are required. The candidate will work in a multidisciplinary team to define the end-to-end validation strategy and execution plan for AI/ML clusters. The ideal candidate has a broad, system-level understanding of data center systems, networking, the Linux kernel, I/O technology, and the use of accelerators in the data center. The candidate will be responsible for defining and developing the system validation environment and test suites. Responsibilities also include developing methodologies, executing validation plans, and debugging failures. The role requires a broad understanding of multiple system areas and involves working with the Architecture, Design, and Pre-silicon Validation teams to improve post-silicon test content and provide feedback for future on-die debug features.
Qualifications
Minimum Qualifications
Bachelor’s degree with 10+ years of experience, or master’s degree with 8+ years of experience, in Computer Engineering, Electrical Engineering, or a related field.
Experience with validation/verification at post-Si, platform, and/or system levels.
Extensive knowledge of MPIs and the underlying network interconnects in HPC and AI clusters.
Experience leading a diverse technical team leveraging collaboration across multiple departments to achieve results.
Background in Linux, middleware, HPC software, and/or hardware debug in a Linux environment.
Experience in GPU, server, or high-performance computing environments.
Experience with post-Si debug and debug at scale.
Preferred Qualifications
Experience with server board, system, and datacenter design and architecture.
Technical understanding of Intel's silicon and/or platform Product Life Cycles (PLCs).
Experience presenting to executives and external partners.
Experience with JTAG-based hardware debug tools.
Knowledge of and experience with hardware system debug tools, including logic and network analyzers.
Inside this Business Group
The Data Center & Artificial Intelligence Group (DCAI) is at the heart of Intel’s transformation from a PC company to a company that runs the cloud and billions of smart, connected computing devices. The data center is the underpinning for every data-driven service, from artificial intelligence to 5G to high-performance computing, and DCAI delivers the products and technologies (spanning software, processors, storage, I/O, and networking solutions) that fuel cloud, communications, enterprise, and government data centers around the world.
Other Locations
US, OR, Hillsboro; US, CA, Santa Clara
Posting Statement
All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.
Benefits
We offer a total compensation package that ranks among the best in the industry. It consists of competitive pay, stock, and bonuses, as well as benefit programs that include health, retirement, and vacation. Find more information about all of our Amazing Benefits here.
Annual Salary Range for jobs which could be performed in US, California: $162,041.00-$259,425.00
- Salary range dependent on a number of factors including location and experience
Working Model
This role will be eligible for our hybrid work model, which allows employees to split their time between working on-site at their assigned Intel site and working off-site. In certain circumstances, the work model may change to accommodate business needs.
JobType
Hybrid