Company

Collins Consulting: Employers

Address: Berkeley, CA
Form of work: Full-Time
Category: Information Technology

Job description

Must be a US Citizen or Green Card holder. This is a hybrid position.
Statement of Work:
GBS requires the services of a Storage System Administrator III to provide labor services supporting the hardware and software environment of the Storage Systems Group (SSG) Team at the DOE National Energy Research Scientific Computing Center (NERSC), located at the Lawrence Berkeley National Laboratory's NERSC facilities in Berkeley, CA. The hardware and software are part of a High Performance Computing (HPC) system environment and include Storage Systems, servers in support of Storage Systems, storage services, software, and network components. The work requires active interaction and participation with clients and the Team to troubleshoot and resolve technical issues with production Storage Systems.
Baseline Hardware and Software Environment Support
Baseline Equipment (estimated quantities):
  • 14 racks of Elastic Storage System computer storage - Community File System - manufactured by IBM
  • 43 disk arrays - NetApp
  • 80 storage servers - Supermicro
  • 12 Elastic Storage System enclosures - IBM
  • 44 storage servers - test development environment - Supermicro
  • 48 mid-range servers - HPE
  • 164 enterprise tape drives - installed in IBM tape libraries
  • 3 tape libraries - manufactured by IBM
  • 3 director-level Fibre Channel switches - Brocade
Baseline Software:
  • IBM Spectrum Scale
  • IBM Red Hat Linux, CentOS
  • High Performance Storage System
Required skills/Level of Experience:
(Numbered skills/experiences are a priority, listed in order of importance)
  1. Bachelor's degree and a minimum of three years of computing or storage experience, or equivalent experience.
  2. Strong understanding of Linux fundamentals, including file systems, networking, and automation tools such as Ansible or Puppet.
  3. Experience using one or more interpreted programming or scripting languages such as Python and Bash to automate system management tasks.
  4. Ability to work effectively and collaboratively on a team and on technical projects, as well as give and receive constructive feedback to foster communication and trust.
  5. Experience with hardware installation and replacement, running cables, cable management, racking systems, and labeling.
  6. Strong organizational skills and ability to effectively manage priorities across many projects ranging from immediate problem resolution to long-term strategic planning.
  7. Strong written and verbal communication skills and the ability to document and describe complex tasks to audiences of varying familiarity with storage technologies.
Task Description:
Team Interaction/Participation
  • Participate in weekly team meetings to maintain awareness of open projects and goals, to the extent allowable given the confidentiality of internal information and of activities with other vendors (NDAs, etc.)
  • Monitor Slack direct messages and channels for issues related to Storage Systems
    • limited to certain channels at the discretion of the University
  • Respond to email in a timely manner, as determined by the University Technical Representative
  • Participate as a proactive team member
  • Potential participation in on-call 24/7 responsibilities
  • Potential participation in production Storage System problem determination and resolution
    • engage with other team members for advice when in doubt and vendor support when needed
    • one-week rotation among 3-4 other individuals
    • average < 5 off-hours calls per person per year
    • 2-hour on-site response time in emergency situations
Hardware activities
  • Communicate discovered and suspected hardware issues to the storage team
    • Slack or email for awareness
    • ServiceNow ticket for tracking status and closure
  • Monitor for and respond to hardware issues on all systems from multiple vendors; open support cases with upstream vendors as needed
    • coordinate with the SSG team for replacement of components live or with downtime when required
    • monitoring requires proactive parsing of logs and checking of Graphical User Interfaces (GUIs) to detect issues, rather than reactively waiting until something fails
    • see issues through to resolution
      • e.g. disk controller failure: confirm that a replacement is requested, arrives, and is installed, and that the failed part is sent back under a return material authorization (RMA)
  • Perform an amber light walk at least weekly
  • Work with on-site technicians as needed from the University and vendors
  • Install/de-install hardware as needed
    • rack and cable both new and existing equipment
    • contribute to larger-scale integration responsibilities shared with other groups; e.g. making Storage System available to new compute resources

Software activities
At the Client's discretion:
  • Determine for all Storage System components (OS/kernel/firmware/etc.) when updates are needed
    • Read release notes and determine the impact of upgrades and the fixes they provide
    • Communicate concerns/issues to the team
    • Via GitLab issues, document the upgrade plan, date of change(s), systems involved, any issues encountered, and potential risks
  • Identify areas for routine process optimization and implement solutions
    • Automation of common tasks, contributing to monitoring infrastructure
    • Develop scripts and tools and contribute them to the internal GitLab repository
    • Contribute to integration and implementation planning for future system upgrades and deployments

Nice to have skills:
  • Demonstrated contributions to the high-performance storage community (e.g., conference presentations, open-source software). Ability to present and describe systems and issues to technical staff as well as higher-level management.
  • Understanding of file system internals, prior work developing Storage Systems, or experience troubleshooting and optimizing parallel I/O.
Refer code: 8494923. Collins Consulting: Employers - The previous day - 2024-03-08 02:48
