Platform Engineering Manager - Hybrid (Austin, TX)
James Avery Jewelry
JOB SUMMARY
The Platform Engineering Manager leads a team to design, build, and optimize a robust platform by creating reusable infrastructure templates, automated workflows, and self-service tools that enhance developer productivity. Oversees the performance, availability, and security of enterprise business systems, data platforms, and applications, ensuring consistent deployments and compliance across hybrid cloud and on-premises environments. Drives automation; monitors uptime, performance, security, and observability metrics; and ensures rapid incident resolution. Collaborates closely with IT Infrastructure, Information Security, and Software Development teams to deliver secure, scalable deployment solutions, optimize costs, and provide KPI-driven insights that foster innovation, continuous improvement, and a DevOps culture aligned with organizational goals.
WHAT YOU WILL BE DOING:
- Team Leadership and Strategy: Manage and mentor a team of platform engineers, defining strategies that align with organizational goals for developer productivity, scalability, security, cost efficiency, and application performance.
- Platform Oversight and Uptime: Ensure 99.99% uptime, optimal performance, and compliance for enterprise applications (e.g., Oracle E-Business Suite, Tomcat, Apriso) and data platforms (e.g., Oracle Database, Amazon Aurora, MySQL), emphasizing cost efficiency and scalability.
- Incident Management and Resolution: Serve as the primary escalation point for critical platform incidents, leading cross-functional teams for timely resolution, root cause analysis, and post-incident reviews to implement improvements and prevent recurrence.
- Capacity and Scalability Planning: Forecast, plan, and execute platform capacity and scalability strategies across hybrid environments (AWS, Oracle Cloud Infrastructure, on-premises) to support current and future business demands, including peak load periods.
- CI/CD and DevSecOps Pipeline Management: Design and maintain reusable CI/CD pipelines (e.g., GitLab, Jenkins) with integrated DevSecOps practices (SAST, DAST, container scanning) to ensure consistent, secure, and automated deployments.
- Automation and Process Optimization: Drive automation and efficiency in CI/CD, job scheduling, and platform operations using tools like Ansible and enterprise schedulers to streamline workflows.
- Developer Enablement and Portals: Build and enhance internal developer portals, golden paths, and self-service workflows to empower developers to deploy securely and independently.
- Developer Training and Adoption: Develop and deliver training programs to educate developers on platform tools, best practices, and self-service workflows, promoting adoption across development teams.
- Observability and Performance Monitoring: Implement observability using tools like Dynatrace for real-time monitoring, predictive analytics, automated alerting, and actionable dashboards, tracking KPIs for application availability, data platform responsiveness, and serverless functions (e.g., AWS Lambda).
- Lifecycle Management: Oversee periodic patching, upgrades, and configuration of enterprise applications and databases to maintain supportability and compliance.
- Vulnerability Patching and Security Management: Manage vulnerability patching, security updates, and compliance for owned platforms and services, ensuring a robust security posture through timely remediation, automated scanning, and alignment with security best practices.
- Collaboration with IT and Security Teams: Partner with IT Infrastructure for networking, IAM, and compute integration, and with Information Security to align platform tools with organizational security policies and compliance standards.
- Cost Optimization and Resource Management: Monitor and optimize resource usage across cloud and on-premises environments, implementing cost-saving measures like automated scaling, resource tagging, and usage analytics for serverless and database services.
- Stakeholder Reporting and Risk Management: Provide clear reporting on operational health, risks, security posture, disaster recovery alignment, and KPI trends to business, audit, and executive stakeholders, while ensuring that backups are secure and patching is timely.
WHAT IS REQUIRED:
- Bachelor’s Degree in Computer Science, Information Technology or relevant field; or equivalent combination of education and/or experience.
- 5 years of experience in platform engineering or DevOps.
- 3 years of experience in a leadership role.
- Familiarity with public clouds (e.g., AWS, OCI), containers (e.g., Docker), serverless functions (e.g., AWS Lambda), and observability tools (e.g., Dynatrace).
- Experience with IaC (e.g., Terraform, CloudFormation, AWS SAM), CI/CD pipelines (e.g., GitHub Actions, GitLab, Jenkins), and hybrid environments.
- Familiarity with application support (e.g., Oracle E-Business Suite), database management (e.g., Oracle Database), and vulnerability management.
- Proficiency in scripting (e.g., Python, Bash, Go) for automation.
- Understanding of networking, IAM, security practices, and automation tools (e.g., Ansible).
- Experience collaborating with Information Security teams for secure development and compliance.
- Strong communication and leadership skills to guide teams and engage with stakeholders.
- Demonstrated ability to deliver reliable, secure, and efficient IT services with a focus on operational excellence.
- Strong analytical and advanced problem-solving skills, with the ability to troubleshoot and resolve complex system and platform issues.
- Ability to manage multiple projects simultaneously and prioritize strategically to meet deadlines.
- Ability to travel to various work locations as needed.
PREFERRED QUALIFICATIONS:
- Experience managing platform teams in complex hybrid environments.
- Experience supporting enterprise applications and databases.
- Expertise in designing and/or delivering developer training.
- Demonstrated experience administering retail or customer-facing applications.
- Demonstrated success in cost optimization, disaster recovery (DR), vulnerability management, and secure development.
- Ability to drive DevOps culture and effectively present strategies to executives.
- AWS Certified DevOps Engineer, HashiCorp Terraform Associate, or similar professional certification.