About the Role:
PlanGrid is looking for a Site Reliability Engineer to join our rapidly growing systems team. We're looking for people with Linux systems administration knowledge and the coding skills to automate laborious chores. We want engineers who will create systems and processes that make the engineering department more efficient. We're here to partner with developers to help them ship their code onto reliable infrastructure. We assist developers in tracking down production bugs and in instrumenting their code so they can quickly zero in on problems.
You should be passionate about getting in front of problems instead of waiting until things are on fire. If you dream of stability, think in metrics, and love building reliable systems that hum along and take care of themselves, we want you on the team.
- Implement consistent development workflows across teams
- Maintain our CI/CD pipeline and the tooling that makes it all work in a sane and reproducible way
- Preach the religion of monitoring, metrics, and logging
- Make our monitoring, metrics, and logging better and more actionable
- Help development teams gather and present metrics for their various projects and services
- Write tools to automatically deploy/scale our HTTP services and our asynchronous workers
- Lend DevOps expertise to other teams to help track down performance/stability problems
- Build, automate, update, and maintain shared infrastructure (e.g., Postgres, Redis, async worker infrastructure, and internal services)
- Plan and execute large architectural changes such as migrating backend services out of Heroku, rebuilding/simplifying our custom EC2 autoscaler, and moving towards international hosting
- Craft a brand new Docker/Kubernetes infrastructure to allow developers to ship more features and more services
- Obsess about infrastructure as code and documentation
In your first 6 months on the SRE team, you will:
- Plan and execute the migration of our core web services out of Heroku. We hope you can bring your CloudFormation, Troposphere, Docker, Kubernetes, SaltStack, and Python skills (among others) to the table, along with an opinion on how to use them
- Write new developer and deploy tools for our Kubernetes migration
- Prep for going multi-region
- Get security/permissions processes and policies in place
- Automate application deployment so that we can deploy faster and more reliably
PlanGrid solves a major problem for a 7,000-year-old industry. Construction data is shackled in legacy paper blueprints that are clunky and heavy to carry, and working from outdated plans results in enormous rework costs totaling $9 billion per year for the industry.
PlanGrid was built by builders, for builders. We’re spearheading the industry’s digitization and move to the cloud by arming construction workers with the best productivity tools. Contractors, owners, designers, and architects worldwide use PlanGrid to finish their projects on time and under budget. PlanGrid currently stores over 50 million blueprints, making us the largest digital blueprint repository in the world. We emerged from Y Combinator in 2012 and have secured over $62 million in funding from world-renowned organizations and individuals including Sequoia, Founders Fund, GV, 500 Startups, Box, Northgate, Spectrum 28, and Tenaya Capital.
- Located in San Francisco’s Mission District just one block from BART, among local shops, bars, and restaurants
- Flexible vacation
- Dog-friendly office
- Clipper Cards (for public transportation) funded by PlanGrid
- Construction site tours of the biggest projects in San Francisco using PlanGrid
- Volunteer time off: We encourage employees to give back to our local communities. We organize volunteer days and have worked with organizations such as Glide, SF/Marin Food Bank, Muttville, Family Dog Rescue, and Bryant Elementary School (as part of PlanGrid’s commitment with Circle the Schools).
- Catered lunches
- Premium medical, dental, and vision coverage for full-time employees and their dependents
- Office is wheelchair accessible