Company

Machine Learning And Ai

Address: Cupertino, CA
Category: Information Technology

Job description

The team is responsible for the maintenance and delivery of Infrastructure Services. These services are key to the development and production process of the AIML team. The team works very closely with other teams across AIML as operational subject matter experts, offering guidance and advice that enables those teams to improve their services. A successful candidate will likely have started as a Systems Administrator and moved on to development and automation over their career.

In this role, you will get to:

  • Help operate Apple’s largest infrastructure supporting millions of AIML customers
  • Manage one of the largest deployments of a logging service on AWS
  • Migrate configs and users from a legacy service to a new platform on AWS
  • Actively participate in capacity planning, scale testing, and disaster recovery exercises
  • Interact with stakeholder teams, including engineering, QA, and program management
  • Cultivate and maintain relationships with internal and external third-party vendors
  • Make changes to our environment with the purpose of pushing AIML services to the next level

Requirements

  • 10+ years of work experience in system administration
  • Expert knowledge of the Linux operating system (OS, networking, process level)
  • Experience in managing, scaling, and troubleshooting applications on AWS
  • Ability to implement and coordinate telemetry using monitoring and observability tools such as Splunk, Grafana, and Prometheus
  • Fluent in at least one scripting language (Shell, Python, Ruby, etc.)
  • Experience with at least one configuration management tool (Puppet, Chef, Ansible, Salt)
  • Strong verbal and written communication skills
  • Passionate about being a part of a tight-knit Operations team
  • A strong sense of ownership while being a team player who communicates clearly and transparently
  • Self-motivated, inquisitive, and always looking to learn more
Reference code: 9074830. Machine Learning And Ai - The previous day - 2024-04-18 05:53

Aiml - Infrastructure Services - Site Reliability Engineer, Machine Learning Platform And Infrastructure