At Raya, Infrastructure Engineers are passionate about building and maintaining robust, scalable infrastructure that empowers product engineering teams to innovate with confidence. You are excited to work with modern cloud platforms, Kubernetes, and infrastructure-as-code technologies. The ideal candidate will have 6-8+ years of experience with demonstrated growth in previous roles, showing both technical depth and leadership in infrastructure engineering. You will join the Platform Engineering team to help evolve our infrastructure foundation during this period of rapid growth.
As a Senior Infrastructure Engineer, you’ll drive the technological advancement of our Infrastructure Platform within Raya. Leveraging best-in-class cloud technologies and container orchestration, your work will be central to delivering a reliable, secure, and scalable foundation that our product teams depend on daily. We prioritize learning and teamwork and love giving people the opportunity to champion infrastructure solutions to complex challenges while growing into the best versions of themselves.
A great candidate is excited to support our infrastructure initiatives across multi-regional environments, Kubernetes orchestration, deployment automation, and observability. With a strong focus on performance optimization, you can design infrastructure that maximizes application efficiency and resource utilization. You’re enthusiastic about leveraging AI to enhance your own infrastructure workflows, automate complex tasks, and maximize the scale and impact of your work. You design infrastructure architecture in a scalable, evidence-based manner while understanding the value of working within established systems and thoughtfully introducing improvements. Finally, you believe in Raya’s vision of enriching lives by fostering relationships through quality, in-person interactions.
We offer comprehensive medical and dental coverage, a $50-a-day food delivery budget, equity-based employment, a great culture, learning opportunities, unlimited vacation, and 12 weeks of paid parental leave. Because we value human connection, empathy, and curiosity, we also pay every employee $1,000 a year to travel somewhere in the world they have never been.
Responsibilities
Infrastructure Leadership: Design and build major new infrastructure components and platforms to support Raya’s growing needs
Kubernetes & Container Orchestration: Lead our Kubernetes strategy, designing and implementing container orchestration solutions that optimize for various application workloads
Performance Optimization: Design and optimize infrastructure for maximum application performance, focusing on memory management, resource allocation, network traffic optimization, and system efficiency
Reliability Engineering: Implement SLOs, monitoring, and observability solutions to ensure high reliability of our platform
Cloud Engineering: Apply your in-depth knowledge of AWS to design scalable, resilient architectures across multiple regions
Incident Response: Participate in on-call rotations and lead complex infrastructure incident resolution and post-incident analysis
System Evolution: Thoughtfully improve existing infrastructure through incremental enhancements while respecting operational constraints
Deployment Automation: Enhance our CI/CD pipelines and deployment strategies to enable faster, safer releases
AI-Enhanced Workflows: Integrate AI tools and capabilities into infrastructure workflows to automate complex tasks, enhance decision-making, and maximize operational efficiency
Infrastructure Security: Collaborate with security teams to implement secure-by-design infrastructure
Cost Optimization: Design cost-effective infrastructure solutions and implement optimization strategies
Team Mentorship: Contribute to engineering excellence by mentoring other infrastructure engineers
Qualifications
A BS/MS in Computer Science, Engineering, Systems Administration, or a related technical field (professional experience can substitute for a technical degree for candidates with non-engineering educational backgrounds)
6-8+ years of hands-on experience with infrastructure engineering, with a track record of designing and implementing scalable infrastructure solutions
Strong expertise in Kubernetes and Docker, with experience designing and managing production container orchestration environments
Demonstrated expertise in AWS and infrastructure-as-code tools (Terraform, CloudFormation, Pulumi, Ansible)
Experience with performance tuning and optimization of both infrastructure and applications
Experience with monitoring and observability tools (Datadog, Prometheus, Grafana)
Proficiency in scripting and automation (Python, Bash, Go, Ruby)
Experience working with and incrementally improving established infrastructure environments
Strong collaborative instincts, emphasizing open communication, transparency, and cross-team interaction
Desired Qualifications
Background in SRE (Site Reliability Engineering) practices
Experience using AI tools to enhance infrastructure workflows, automate tasks, and improve operational efficiency
Knowledge of database administration and optimization (PostgreSQL, MongoDB, Redis, Elasticsearch)
Experience with multi-regional/global infrastructure deployment and operations
Track record of successfully modernizing legacy infrastructure components
Strong understanding of Node.js performance characteristics and experience optimizing infrastructure for Node.js workloads, including memory management, CPU utilization patterns, and scaling considerations
Proficiency in application profiling and performance analysis tools
Experience with network infrastructure and security
Experience with infrastructure security and compliance controls
Understanding of cost optimization strategies in cloud environments
Experience with service mesh technologies (Istio, Linkerd, Consul)
Experience with other cloud platforms (GCP, Azure) in addition to AWS
Familiarity with disaster recovery planning and implementation
What Sets You Apart
Reliability Focus: You have a passion for building systems that are resilient, self-healing, and maintainable
Performance Optimizer: You have a keen eye for identifying bottlenecks and optimizing system performance at all levels of the stack
Problem Solver: You excel at diagnosing complex infrastructure issues and implementing effective solutions
Application-Aware Infrastructure: You design infrastructure that maximizes application performance by understanding how applications actually behave in production, with particular expertise in Node.js workloads
Pragmatic Innovator: You can balance working within existing constraints with strategically introducing improvements where they’ll have the most impact
Context Builder: You take time to understand existing systems and historical decisions before proposing changes, appreciating the journey that led to the current state
Infrastructure Vision: You think beyond immediate needs to design infrastructure that can scale and evolve with Raya’s growth
Automation Mindset: You’re driven to automate manual processes and eliminate operational toil
AI-Powered Efficiency: You’re excited about leveraging AI to amplify your capabilities, automate complex workflows, and achieve greater scale and impact in your infrastructure work
Impact-Driven: You prioritize infrastructure initiatives that maximize impact, aligning with Raya’s overarching goals
Growth-Oriented: You have a perpetual learner’s mindset, stay open to challenges, and always seek opportunities to expand your technical horizons