Senior Site Reliability Engineer

Company	Rxmg
Address	Irvine, CA
Category	Engineering/Architecture/Scientific

Job description

Required Visa Status:

Citizen	GC
US Citizen	Student Visa
H1B	CPT
OPT	H4 Spouse of H1B
GC Green Card

Employment Type:

Full Time	Part Time
Permanent	Independent - 1099
Contract – W2	C2H Independent
C2H W2	Contract – Corp 2 Corp
Contract to Hire – Corp 2 Corp

Description:

SENIOR SITE RELIABILITY ENGINEER

RXMG is a California-based digital advertising company that employs our own state-of-the-art analytical and consumer intelligence platform to match people with the products they need to enrich their financial well-being.
We seek a Senior Site Reliability Engineer to join our engineering team to help develop an inclusive, innovative, and collaborative team environment.
The ideal candidate is an experienced Senior Site Reliability Engineer with a strong technical background. A Site Reliability Engineer (SRE) is a professional uniquely positioned at the crossroads of software engineering and systems operations. Your role is to develop and implement scalable, reliable, and efficient systems, ensuring that both internal and external services meet the highest standards of uptime and performance.
You will be working 100% remotely and should be extremely comfortable working via Slack, Google Meet, Zoom, etc.

REQUIREMENTS:

4+ years of experience as a Site Reliability Engineer.
Deep understanding of containerized ecosystems
Expert working knowledge of:
Google Cloud Platform (GCP), Amazon Web Services (AWS) components, monitoring tools, and alerting systems.
NGINX/Apache configuration and PHP module installation through apt or PECL.
Firewalls, including setting up, managing, and understanding their role in network security.
Adept at managing user and file permissions across different operating systems, ensuring appropriate access rights without compromising security.
Proficiency in using ‘.htaccess’ for web server configurations, such as URL redirection and access control.
Strong understanding of various hashing algorithms, particularly for securing sensitive information and ensuring data integrity.
Experience hosting blameless postmortems to share findings, discover gaps, embrace transparency, and improve reliability across our services.
Demonstrated configuration management skills to build and maintain consistency across platform components and services.
Willingness to work Pacific Standard Time hours as well as off-hours.

Responsibilities:

Infrastructure Optimization: Tailoring our infrastructure for peak performance is especially crucial as we transition to the RXP platform.
Uptime & Platform Support: The candidate’s main focus will be to ensure the uptime and reliability of our internal platform. This includes proactive monitoring, troubleshooting, and timely resolution of any issues to maintain continuous operational functionality.
Site Monitoring & Support: The role also requires maintaining the uptime of our company websites. The candidate should be capable of managing site performance, addressing downtime promptly, and implementing strategies to enhance overall site stability.
Security: The candidate must also contribute to the security of our systems. This involves implementing basic security measures, responding to security incidents, and collaborating with the security team to uphold the integrity and safety of our digital assets.
Incident Management: Develop robust incident response protocols to address and mitigate any issues quickly, maintaining service continuity.
Lead and participate in weekend testing (e.g., capacity testing, failover, etc.).
Provide 24x7x365 on-call technical support for the Engineering and Operations team as needed.
Provide technical leadership, support, and operational oversight to sustain resiliency and high availability of critical business operations.
Monitor production, disaster recovery, and certification systems for issues. Troubleshoot and drive resolution of issues.
Analyze and optimize the performance of core platforms.
Investigate software defects.
Assist the Engineering team in resolving build/deployment issues.
Analyze application logs (e.g., GCP GKE and AWS EKS logs and various platform logs) to troubleshoot or explain perceived issues.
Execute SQL queries against a database to identify potential performance issues and/or create upgrade recommendations.
Drive capacity planning decisions for RXMG platforms and systems and support capacity planning needs.
Provide an active voice within Capacity Planning meetings with engineering and technical operations management staff.

Refer code: 8209736. Rxmg - The previous day - 2024-02-18 07:22
