A career with A Place for Mom (APFM) is an opportunity to join our rapidly expanding technology company committed to making a real difference in the lives of the families we serve and the senior living industry at large.
About A Place for Mom
We’re the leading online platform connecting families searching for senior care with a team of experienced local advisors providing insight-driven, personalized solutions. As the nation’s most trusted senior advisory service, we are a mission-based organization that enables caregivers to make the best senior living decisions for their loved ones. With hundreds of senior living experts nationwide, A Place for Mom helps hundreds of thousands of families each year simplify the process of finding the right senior care solution across home care, independent living, memory care, assisted living, and more. Our services are offered at no cost to families as we’re paid by the 14K+ communities and 3K+ providers in our network.
Recently named one of the 2022 Best Places to Work in NY and Best HR Teams by Comparably, the leading workplace culture and brand reputation platform, A Place for Mom is committed to fostering, cultivating, and preserving a culture of diversity, equity, and inclusion.
Employees who thrive at A Place for Mom live our values every day:
- Focus on Excellence
- Act with Integrity & Assume Positive Intent
- Drive Outcomes Every Day with Passion and A Sense of Mission
- Make the Lives of our Families and Customers Better, Easier and More Successful
- Realize the Full Potential in Each Team Member. Work as a Single Supportive Team
Job Description
The position
A Place for Mom is seeking a motivated and energetic Director of Site Reliability Engineering/DevOps with a strong sense of ownership and technical ability. A Place for Mom has a 100% cloud-based infrastructure and is seeking a tech leader with strong experience in Infrastructure as Code, automation, CI/CD, containers, AWS, and DevOps best practices to lead our DevOps/Site Reliability Engineering team!
Excellent communication skills are desired, as the TechOps team has developed a close working relationship with both development owners and product owners to define clear objectives and deliver fast, robust, and future-proof results.
The ideal candidate has a very strong sense of ownership and passion for learning. This position will report directly to the Vice President of Technology - Operations & CyberSecurity, who will rely on the Director - DevOps & SRE to build, lead, manage and consistently track and report on the DevOps/SRE progress for key stakeholders.
This role will require travel to our NYC office about once per month.
Job responsibilities
- Manage the cloud infrastructure and the underlying ecosystem of services and all associated components, including owning and driving the Major Incident Management process
- Mentor and guide the professional and technical development of engineers on your team and build a culture of accountability while setting the strategic direction
- Work collaboratively with development teams within and across Agile development processes to design, develop, test, implement, and support technical solutions across a full stack of development tools and technologies
- Lead the availability, resilience, and scalability of your solutions
- Bring a passion to stay on top of tech trends, experiment with and learn new technologies, participate in internal & external technology communities, and mentor other members of the engineering community
- Drive the automation of deployment, configuration management, and monitoring processes to improve efficiency and reduce manual intervention
- Review and streamline the APFM DevOps process, tools and platforms
- Evaluate and select third-party tools and services that align with the organization's needs and goals
- Develop and maintain disaster recovery plans to ensure business continuity in the event of outages or disasters
- Apply Site Reliability Engineering principles, including setting and managing Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets
- Partner with the Security Team to ensure that HIPAA, NIST and CIS controls are implemented and maintained within all environments
- Manage a geographically dispersed team
- Perform additional tasks as assigned.
Qualifications
Required Skills and Competencies
- Extensive experience with Terraform and CI/CD tools such as GitHub Actions or Jenkins.
- Extensive experience with AWS managed service offerings. Most importantly, familiarity with ECS Fargate, EC2, S3, RDS, Lambda, CloudFront, and CloudWatch X-Ray/Eventbus.
- Familiarity with New Relic or other similar APM tools.
- Experience with software monitoring and log aggregation tools.
- Strong sense of ownership and troubleshooting skills.
- Strong working knowledge of Linux and Windows operating systems
- Minimum 3 years of strong working experience around DNS and Network concepts, enabling efficient communication, scalability, security, and automation.
- Strong working knowledge of Docker or Kubernetes
- Experience designing Event Driven Architecture and Applications
- You are not afraid to question any existing processes and solutions, yet you display a keen sense of business value proposition and focus on the right priorities
- At least a Bachelor's Degree
- At least 9 years of experience in a software development environment with DevOps/SRE and CI/CD engineering responsibility
- At least 4 years of experience in people management
- At least 5 years of experience with AWS
- At least 3 years of experience with Google’s Site Reliability Engineering (SRE) methodologies, including establishing, tracking, and reporting on daily metrics for management and instilling a “manage by metrics” framework
- 5+ years of experience in a Software Engineering, SRE, or DevOps discipline
- At least 3 years of experience writing Terraform, preferably writing Terraform Modules
- The ideal candidate is an autonomous self-starter who has a passion for learning paired with a strong sense of responsibility and ownership.
- Experience “containerizing” legacy applications.
- Strong communication skills and experience working with Tech Leaders and business/product owners
- Be part of the team: be fully capable of reviewing the team’s work, offering solutions and suggestions, and troubleshooting and resolving issues
- Strong troubleshooting skills and an ability to come up with creative “outside the box” solutions in a timely and cost-effective manner
- Demonstrable track record of dealing well with ambiguity, prioritizing needs, and delivering measurable results in an agile environment
Education Requirements
- Bachelor’s Degree in Computer Science or related field, or equivalent college degree with 5+ years relevant work experience.
Additional Information
Compensation
- Base Salary: $160,000 to $170,000 + 30% Bonus
- Benefits:
- 401(k) plus match
- Dental insurance
- Health insurance
- Vision insurance
- Paid Time Off
All your information will be kept confidential according to EEO guidelines.