Senior Software Engineer - SRE
Posted on Friday, December 13, 2019
At SafetyCulture, we help businesses get better every day. As the operational heartbeat of working teams, our technology gives workers a voice and leaders the visibility to make smart decisions. We’re constantly evolving our platform, expanding into areas such as sensors/IoT and scalable, event-driven architecture, but we believe there’s more to be done.
Recently valued at AU$2.7Bn, we’re investing our resources into creating and shaping a better workplace for all. We are growing fast and looking for talented, self-motivated people who value collaboration, growth and learning to join our team.
As a Software Engineer in the Site Reliability Engineering team at SafetyCulture, you’ll help to design, build and run resilient systems. You live and die by Murphy’s Law, knowing that anything that can go wrong will go wrong at the worst possible moment. You will help to foster a culture of designing for and expecting failure in production systems, a culture where learning and knowledge-sharing are expected.
You love to solve sticky cross-service and cross-domain problems, and have a passion for identifying root causes in complex scenarios. You understand how important it is for teams to analyse past incidents and learn from them. Most importantly, you are a team player, are excited about the prospect of working in a fast-paced, demanding environment, and understand that learning happens at the edge of the comfort zone.
How you can have an impact
As one of a core team of experienced SREs, you will shape and mature the culture, define the processes that the development teams will follow, and help the business scale to millions of users. You will be a key driver of our observability culture, enabling teams to diagnose cross-domain issues and building a unified experience of metrics, logs and traces.
You’ll coach and educate your engineering colleagues on systems reliability and fault-tolerance best practices, identify gaps in existing systems and come up with remediation plans. You’ll improve metrics such as MTTR and MTBF, and promote a culture of sustainable incident response and blameless post-mortems. We encourage involvement in the community, open source work, attending talks and events, and experimenting with new technologies.
How you will spend your time
- Engaging with teams across Engineering on reliability and performance issues
- Building out core capabilities such as load testing, observability improvements and advanced deployment mechanisms
- Writing and maintaining Go modules that provide fundamental capabilities to our applications (e.g. observability instrumentation)
- Evolving our Incident Management processes and engaging in post incident reviews, driving our learning culture
- Educating and driving the SRE mandate across the organisation
What you'll need
- Fluency in at least one modern programming language
- A solid grounding in SRE concepts such as SLOs
- Experience with distributed systems and Unix/Linux systems internals (e.g. filesystems, inodes, system calls) or networking (e.g. TCP/IP, routing, network topologies and hardware, SDN)
- A good understanding of monitoring, logging, tracing, and observability instrumentation
- Excellent interpersonal skills and an ability to build and maintain healthy cross-team relationships
- An ability to balance your love of systems engineering with a product mindset, and empathy for your customers and your product-engineering colleagues
More than a job!
- Equity with high growth potential, and a competitive salary;
- Flexible working arrangements: we encourage you to find the best blend of working from home and from your local SafetyCulture office;
- Access to professional and personal training and development opportunities;
- Hackathons, Workshops, Lunch & Learns;
- In-house Culinary Crew serving up daily breakfast, lunch and snacks;
- Barista coffee machine, craft beer on tap, boutique wines and a range of non-alcoholic beverages;
- Wellbeing initiatives such as subsidised fitness programs, EAP services and a generous parental leave policy;
- Quarterly celebrations and team events, including the annual ShipIt global offsite;
- On-site gym, table tennis, board games, book library, and pet-friendly offices.
We’re committed to building inclusive teams and cultivating a sense of belonging so our people can bring their whole authentic selves to work each day. We seek to make reasonable adjustments throughout our recruitment process to create an even playing field for all candidates. Thanks to the tireless efforts of the entire SafetyCulture team, we’ve built an incredible culture which has seen us recognised as a Best Place to Work in Australia, the US and the UK.
Even if you don't meet every requirement listed in the ad, please consider applying for this role. We prioritise inclusion and value individuals with potential over a checklist of qualifications. Don't rule yourself out: hit that apply button if this job resonates with you. You can find out more about life at SafetyCulture via YouTube, Twitter, Instagram and LinkedIn.
To all recruitment agencies, we do not accept resumes or partnership opportunities. Please do not forward resumes to SafetyCulture or any of our employees. We are not responsible for any fees associated with unsolicited resumes.