LOCATION: Shanghai/Nanjing, China
JOB TYPE: Technical Support Leader
REPORT TO: IT Director
COMPANY: A global data-driven, technology-enabled performance marketing agency
ROLE PURPOSE
- The Cloud Engineer role within the company is responsible for cloud platform configuration and delivery, along with providing Tier 3 support for customer solutions across multiple cloud providers (AWS, GCP, Azure). The candidate is the subject matter expert in one or more cloud platforms and is called upon for the most complex assignments. The Cloud Engineer is responsible for timely delivery of client solutions and for environment performance measurement, analysis, and tuning in a cloud hosting environment. They keep up to date on industry trends and deliver the highest level of customer service possible.
- The Cloud Engineer role will report to the Director of IT. The role will work with a team of geographically distributed Company staff and contract employees in support of client platform implementation and delivery.
ESSENTIAL TASKS AND RESPONSIBILITIES
- Lead the design and implementation/delivery of global, cost-effective, enterprise-class cloud-based platforms following DevOps practices, while maintaining compliance with all company policies, procedures and standards.
- Participate in the development of cloud technology roadmaps that align with overall company strategy and vision of re-platforming and/or moving on-premises solutions to public cloud.
- Provide support for all aspects of cloud technologies, with a heavy focus on new solution configuration, by supporting delivery architects and developers to ensure on-time delivery of new cloud-based platforms. This includes on-call and after-hours support, participation in incident triage, and following a sound troubleshooting process.
- Act as a coach and mentor for less experienced team members.
- Lead innovation, standardization and automation of cloud deployments within the IT organization.
- Lead within discipline to develop best practices, case studies, training materials, and whitepapers. Present at Architecture Review Board, Lunch and Learn sessions, and training sessions.
- Anticipate risks and constraints and proactively work on solutions to address them, escalating as appropriate.
- Gather system configuration requirements, develop the platform delivery plan for solution and services milestones, manage expectations, and coordinate delivery with developers and other client-specific team members.
- Maintain in-depth awareness of, and fluency in, the key features, functions, and industry trends in the area of expertise.
- Lead the evaluation process and recommend all product standards for Company cloud hosted infrastructure.
- Manage and prioritize multiple simultaneous incidents and Service Requests, and drive technical incidents to resolution following the Company Incident Management Process.
- Perform post-incident root cause analysis (RCA) and write the associated RCA document.
- Platform Ownership for cloud-based technology. A Platform Owner is responsible for:
- Leading the technology councils for cloud platforms
- Determining and reporting on the availability of cloud platforms
- Monitoring and alerting – Ensure the primary platform functionality is properly monitored
- Backups and Recovery – Ensure critical data is identified and protected, and that a recovery model is in place and periodically tested, leveraging modern cloud-based models (snapshots, versioning, replication)
- Capacity management – Develop capacity models including dynamic/elastic solutions, measure utilization and establish and maintain forecasts
- Incident tracking – Track the incidents that have occurred on the platform and the remediation activities performed
- Develop and present environmental metrics
- Health checks – Periodic evaluation of the platform to ensure it is operating optimally
- Creation and maintenance of product guides
- Lifecycle Management – Ensure that software and cloud platforms are up to date
  - Cloud Platform – Ensure that our deployments are supported by the vendor and proactively plan for a refresh when required
  - Software – Ensure that software follows the N-1 rule for major releases
- Uphold and promote Company's core values and culture
MEASURES OF SUCCESS:
- Consistently deliver cloud-based client platform configurations that exceed customer expectations, on time and on budget.
- Successfully resolve technical incidents/service requests in support of client platform development and delivery efforts with a high level of quality, escalating where appropriate.
- Demonstrate a high level of customer service.
- Participate in the development of Company's technology vision and strategy.
- Demonstrate the ability to work as part of a team and share knowledge with team members.
- Implement and drive adoption of support processes and structures sufficient to ensure system stability and user satisfaction.
- Attain and maintain "expert level" recognition in select areas by IT Leadership and peers.
- Contribute to the development and evolution of reference architectures.
- Earn positive recognition, as noted on periodic Employee and Customer satisfaction surveys.
SKILLS/REQUIREMENTS:
- Expert knowledge of Amazon Web Services components, including EC2, Elastic MapReduce (EMR), S3, CloudTrail, Data Pipeline, and Glue
- Deep understanding of cloud security concepts, including firewalls, VPCs, encryption, and authentication and authorization (SSO, Active Directory/LDAP, etc.) strategies
- Additional experience with other major cloud providers, including Google Cloud Platform and Azure, highly desired
- Experience with third-party cloud automation, platform management, and cost management tools such as Morpheus, RightScale, Scalr, CloudHealth, etc. highly desired
- Experience with enterprise productization processes, including developing product guides, asset management, data protection, capacity management, performance management, and lifecycle management
- Experience with enterprise system management tools
DEMONSTRATE THE ABILITY TO:
- Meet project deadlines and manage and prioritize multiple simultaneous projects while adhering to a time allocation model
- Assess the scope and impact of incidents and respond with a sense of urgency that matches the incident, following appropriate policies and procedures
- Multi-task, prioritize, manage workload and adapt to changing business conditions
- Tolerate stressful situations and remain focused under pressure
- Effectively communicate at all levels of the organization.
- Manage through conflict and challenging situations with positive outcomes for the clients and company.
- Develop and cultivate strategic relationships that benefit IT and Company.
- Make decisions and judgments based on standard procedures.
- Demonstrate critical thinking and problem-solving skills.
- Contribute to Managed Hosting standards and best practices for processes, procedures, and technical standards.
QUALIFICATIONS:
- Bachelor's degree required; Master's degree in Information Technology or Computer Science preferred
- 5+ years of experience supporting and implementing enterprise-class solutions operating in a 24/7 environment
- 1+ years of professional experience designing technology solutions
- Certifications within the engineering discipline and cloud industry (AWS, GCP, or Azure) are highly desired
- Good written and oral English communication skills