(Sr.) DevOps Engineer (OAT & xAPPs)

Trend Micro

Software Engineering

Taipei City, Taiwan

Posted on Jun 15, 2026

Join Trend ‧ Join New Generation

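Trend Micro - Global leader in cloud security / Asia's largest software company / Operations spanning five continents / Trend's global R&D base is in Taiwan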
===============================================================

We are seeking a highly motivated and talented (Sr.) DevOps Engineer to join our dynamic team. This role offers a unique opportunity to safeguard the reliability, scalability, and security of multiple TrendAI™ Vision One services. You will start with Observed Attack Techniques, Search, Detection Model Management, and Response Management, and expand to three or more products as our platform grows. If you thrive in a challenging environment, enjoy eliminating toil through automation, and have a passion for running large-scale cloud systems with an AI-native mindset, we want to hear from you!

As a (Sr.) DevOps Engineer, you will build and maintain robust infrastructure and backend services — owning CI/CD workflows, infrastructure-as-code, and DevSecOps practices across the software development lifecycle. You will design and operate monitoring and observability systems for product and service health checks, join on-call rotations to triage and resolve customer-reported issues and service incidents, and bring an overall, big-picture perspective that lets you context-switch fluidly across multiple products.

Key Responsibilities

  • AI-Native Operations — Master and integrate AI-native tools (e.g., Claude Code, GitHub Copilot, Cursor) into every stage of your workflow. This includes "Prompt Engineering for Engineers" — writing structured, context-rich prompts — and applying context engineering and agentic workflows to produce reliable, production-ready automation and code.

  • Design, build, and maintain infrastructure-as-code across AWS and Azure using Terraform and CloudFormation.

  • Drive configuration management and server automation with Ansible.

  • Operate, scale, and harden containerized workloads on Kubernetes (K8s).

  • Build and maintain observability — trace logging, metrics, dashboards, and alerting with Grafana — to ensure fast detection and diagnosis of issues.

  • Manage and optimize managed databases (RDS), ensuring reliability, scalability, and performance.

  • Automate operational tasks and tooling with shell scripting (and Python / Node.js where applicable).

  • Identify toil, then design and implement automation to eliminate it across the SaaS platform.

  • Ensure the performance, availability, and reliability of all servers, services, and microservices; consistently meet demanding SLA targets.

  • Provide operational support and automation tools to application developers, including deployments to staging and production environments.

  • Maintain a strong focus on security and compliance within a cloud environment.

  • Maintain a strong focus on metrics and measurement: report on service levels and continuously improve them.

  • Join on-call rotations to triage and resolve service incidents.

  • Context-switch across multiple products (today two, growing to three or more), keeping an overall view of system health and consistently applying shared standards and tooling across them.

  • Research and implement new technologies and processes with the goal of improving service quality and reducing cost.

What We're Looking For

  • AI-native workflow — Familiarity with AI-native development tools (e.g., Claude Code, GitHub Copilot, Cursor) and the ability to maintain consistent AI configurations, establish a repeatable AI-native workflow, and apply context engineering / agent tools (e.g., AI Agent Skills, MCP servers).

  • At least two years of experience in cloud computing with AWS and/or Azure, including hands-on infrastructure, systems, and application architecture for large-scale, web-based applications.

  • Experience developing infrastructure-as-code with Terraform and CloudFormation.

  • Experience with configuration management / automation using Ansible.

  • Experience operating container solutions (e.g., Docker / Kubernetes).

  • Strong shell scripting skills for automation.

  • Experience with observability and troubleshooting using trace logs and Grafana (and similar monitoring / profiling tools).

  • Experience managing and optimizing relational / managed databases (e.g., RDS).

  • Experience testing and managing high-availability environments, including regular disaster recovery tests.

  • Experience with at least one CI/CD tool (e.g., GitLab, Jenkins, GitHub Actions).

  • Ability to prioritize and operate across multiple products with a strong systems-level, "overall vision" perspective — transparently communicating and justifying time investments.

  • Strong customer focus, with the ability to provide service to all levels of the organization.

Nice to Have

  • Hands-on experience with Python and/or Node.js.

  • Experience driving measurable team-level improvements through AI workflow standardization.

  • Experience leveraging AI to redefine product capabilities and deliver measurable business impact.

  • Understanding of secure coding practices and compliance within a cloud environment, including PCI, ISO 27001, and SOC 2.

Culture Fit

  • Passionate about AI and eager to share knowledge.

  • Embraces experimentation and continuous learning in a fast-evolving AI landscape.

  • Comfortable owning and switching across multiple products while keeping a high standard of reliability.

  • Everyone owns both development and testing — quality is built, not just tested.

  • Embraces DevOps culture: designing, coding, testing, and supporting customers through their issues.

  • Good English communication skills, a proactive attitude, strong problem-solving skills, and a willingness to take on challenges.

  • Effective communicator and reliable team player in agile and cross-functional environments.

Why Join Us?

  • Build an AI-native operations practice from the ground up and shape how the team integrates AI across the development lifecycle.

  • Gain broad domain knowledge across multiple products in an industry-leading security ecosystem.

  • Tackle challenging large-scale reliability and automation problems with a team that values innovation and excellence.

  • Access strong opportunities for professional growth and learning.

===============================================================
Connected Intelligence for Securing a Connected World