
Wikimedia Foundation
We're the nonprofit that hosts @Wikipedia, working to create a world where every human can freely share in the sum of all knowledge. Join the movement.
Job Summary
The Wikimedia Foundation is looking for a Site Reliability Engineer (SRE) to join our team, reporting to the Director of Data Engineering. As a Site Reliability Engineer on the Data Engineering team, you will be responsible for building, maintaining and operating the shared data infrastructure that empowers the use of data at the Foundation as well as the Wiki Movement. You will be part of a larger community of SREs where you’ll have plenty of space and opportunities to learn and get familiar with our tech. For more details about our stack see: https://wikitech.wikimedia.org/wiki/Data_Engineering
We are a fully remote, internationally distributed team. With international travel resuming, we are hoping to again see each other in person 2-3 times a year during one of our off-sites (the last few have been in places like Majorca, New York and Prague), the Wikimedia All Hands (once a year in San Francisco), or Wikimania, the annual international conference for the Wiki community.
You are responsible for:
- Deployment, configuration and maintenance of the distributed data systems that comprise our data platform. Our stack includes Hadoop, Kafka, Spark, Cassandra, Presto, Druid, Airflow, Superset, DataHub and Turnilo
- Monitoring of systems and services, optimization of performance and resource utilization
- Cookbook/runbook implementation for common maintenance actions
- Development and maintenance of data platform infrastructure running on Kubernetes as well as bare metal
- Automation and streamlining of tasks as well as identifying process gaps
Skills and Experience:
- At least two years of experience in an SRE/Operations/DevOps role as part of a team
- Experience supporting high availability distributed production systems
- Comfortable with configuration management and orchestration tools (Puppet, Ansible, Chef, SaltStack, etc.), and modern observability infrastructure (monitoring, metrics and logging)
- Comfortable with shell and scripting languages such as Python, Go, Bash, Ruby
- Good understanding of Linux/Unix fundamentals and debugging skills
- Excellent written and verbal communication skills
- BS or MS degree, preferably in Computer Science, or equivalent work experience
Qualities that are important to us:
- Commitment to the mission of the organization and our values
- Commitment to our guiding principles
- Commitment to diversity, equity, and inclusion
- Cross-cultural sensitivity and awareness
- Collaborative working experience
Additionally, we’d love it if you have:
- Experience implementing containerization solutions (Docker, Kubernetes)
- Experience with package management for operating systems (Debian, etc.)
- We are avid supporters (and users) of open source software; a history of contributing to open source projects is valued
- Prior participation in the Wikimedia movement