Job Description
Founded in 2012, ByteDance is a technology company operating a range of content platforms that inform, educate, entertain and inspire people across languages, cultures and geographies. With a suite of more than a dozen products, including TikTok, Douyin and Toutiao, ByteDance now has a portfolio of applications available in over 150 markets and 75 languages.
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems. Infrastructure SRE ensures that ByteDance's infrastructure services have reliability and uptime appropriate to the needs of users and to fast iterations of improvement. Our software development focuses on optimizing existing systems, building infrastructure, and eliminating manual work through automation.
As a Tech Lead, you will be responsible for building and leading a team of software and systems engineers through strong technical leadership. You are expected to set up the processes needed for efficient execution and to advocate good engineering practices. You will also regularly coordinate and communicate with other infrastructure teams as well as our users.
Responsibilities
- Build and manage the SRE/DBA team, including recruitment, training of new hires, system operation, maintenance and coordination, and team culture building.
- Develop a long-term technical plan with a clear implementation path and milestones, and continuously ensure the competitiveness of the team and its technology.
- Formulate process specifications and plans with regard to access, configuration, disaster recovery and fault handling for all critical paths of the operating platform.
- Design and implement software platforms and monitoring frameworks for efficient, automated, and intelligent service-oriented architecture (SOA) governance.
- Cooperate with the system development team to ensure system reliability throughout the entire life cycle, from system design to launch. Continuously evolve automated operations and maintenance facilities and platforms.
- Strengthen communication and cooperation with business teams, improve cross-team coordination, and ensure continuous improvement and optimization of business flows. Promote the evolution of business architecture design.
Requirements
- Bachelor's degree or above in Computer Science or a related technical discipline, with 5+ years of working experience (3+ years of R&D experience).
- A systematic approach to operations and maintenance. Familiar with Linux systems and networking. Practical experience operating and maintaining large-scale distributed systems.
- Self-driven, with strong planning and summarizing skills. Experienced in project and team management.
- Possesses a strong sense of responsibility, a proactive team spirit, and a strong ability to comprehensively analyze and solve problems.
- Experience with large-scale cloud-computing platforms is preferred.
- Experience with large-scale distributed storage, scheduling, big data computing system development or intelligent operation and maintenance is preferred.
Inquiries welcome!
Phone: 18519274080
WeChat ID: Brylin1991
Email: herocanjob@163.com