A day in the life of an SRE | Suraj Nath
Today we have Suraj Nath as part of the SRE Stories.
Suraj works as a Software Engineer at Grafana Labs on the Tempo and Grafana Cloud Traces products. Before this, he was an early hire at Clarisights. Suraj is a speaker at various technical conferences. He also runs a meetup - failuremodes.dev.
Suraj describes himself quite interestingly:
I mostly find myself busy fixing big rented computers ☁️, busy killing pods and crashing prod 💥
Let's start with our questions!
What is your work setup like? Are you a dual monitor / single monitor person? Which are the tools you cannot do without for day-to-day productivity?
I use a 14-inch 7th Gen ThinkPad X1 Carbon with Ubuntu as my work laptop. I am a single-monitor kinda person. I work remotely, so I have a dedicated home office setup. I have a Blue Snowball iCE mic and a ring light for better lighting. I heavily use Google Calendar with Reclaim.ai to build my routine, find focus time in my schedule, and be productive.
I write Go on most days, so GoLand is one tool I can't live without; GoLand makes it easy to write Go. We dogfood Grafana OnCall for OnCall management, Grafana Incident for Incident Management, and our LGTM stack for observability.
What does your typical day look like? Do you start with a dashboard and end with a dashboard? Any typical routine that you follow?
I have coworkers in the US timezone, so I start the day with a catch-up. I go through Slack, email, and GitHub notifications in the morning. We have Grafana OnCall and Alertmanager connected to a Slack channel for our service. When I am on-call, I scan that channel and read the messages from the US on-call person. I usually open service-related dashboards when I am doing a roll-out or get an alert.
Which are your go-to tools for debugging an incident?
The Grafana Cloud stack, Grafana OnCall, and Grafana Incident are the tools that I reach for when I get alerted.
Any memorable incident you helped/tracked/fixed?
We had a Sidekiq server that used to crash only on some weekends; it was deemed haunted 👽. We later found out it was a slow memory leak. For details, check out my post about it.
How many dashboards do you track over a day?
It depends on the day; if I am on-call, I have a set of 4-5 dashboards that I check when I am alerted. At times I will have too many dashboards open, and sometimes it's zero.
How do you manage burnout?
I try to take time off, disconnect, focus on my hobbies, get out to a park, or meet friends for coffee.
Follow Suraj on Twitter; he can frequently be found in one of the cafes in BLR with coffee and Grafana stickers.
If you were not an SRE, what would you be doing?
Probably Teaching or Farming.
Do you have any suggestions for questions that we can ask fellow SREs?
A question about the maturity of SRE practices at their workplace.
Where can people find you online?
I am active on Twitter and also blog regularly.
Thanks, Suraj, for taking the time and sharing your story with us.
Readers - If you are interested in appearing on this substack or want to nominate someone, please submit it here.