At Snowplow, we are on a mission to empower people to use data to differentiate. We provide technology that enables customers not only to control their data, but also to do amazing things with that control.
As part of that effort, we're changing the way that people do digital analytics: we are moving companies away from letting one-size-fits-all vendors, such as Google Analytics and Adobe, dictate what should be done with their data, and enabling them to collect and own their data themselves.
Our Managed Service offering has grown significantly over the last year, and we now orchestrate and monitor the Snowplow event pipeline across more than 100 customer-owned AWS accounts, with individual accounts processing many billions of events per month.
We are looking for our second Site Reliability Engineer to help us grow to managing 1,000 and then 10,000 AWS, GCP and Azure accounts. You’ll work closely with our Tech Ops Lead on all aspects of our proprietary deployment, orchestration and monitoring stack.
The team and mission:
Technical Operations at Snowplow is responsible for two distinct domains:
Within both domains, Tech Ops at Snowplow is striving to increase service reliability, fulfil customer requests in a timely fashion, and automate recurring tasks. Task automation is essential as our customer base grows because, unlike at most software businesses, our “infrastructure estate” scales linearly with our customer numbers.
Our roadmap includes:
This is an enormously ambitious undertaking but also, we hope, a hugely exciting infrastructure automation challenge!
Today, our in-house stack uses pragmatic technologies including Docker, Ansible, Consul, CloudFormation, bash and Golang to manage our internal and customer infrastructure.
For our next level of automation, we are now exploring tools such as Terraform, Kubernetes and Vault.
On the software engineering side, you will be responsible for the implementation, deployment and stability of your systems and services. You will own your software end to end, including anything that is deployed.
On the operational side, you will join our on-call process for incident resolution and share in the regular client infrastructure work, with a strong mandate to keep automating it.
What we are looking for:
This role will be a great fit for somebody who:
This role would be a great fit for a software engineer or systems administrator who wants to transition into a full SRE role.
The integrity of our customers' systems and data underpins everything we do at Snowplow. As part of their probation, candidates will be put through a full background security check.
An important part of this role relates to out-of-hours work, particularly around:
The on-call process for the Tech Ops team is still evolving; we will discuss these requirements with short-listed candidates.
What you’ll get in return: