2 Comments

Until the last couple of years, I tended to be overconfident in rollouts, especially about not using the 'undo' button. It's a button I highly recommend being trigger-happy with :)

Often we think our changes affect only area X, and then after our release something in area Y breaks. We tend to ignore it and assume other people broke something, because we are loath to press the undo button and start from scratch. I had a case where we rolled out a big change that didn't work well, and after a couple of days of debugging I decided to roll back most of the code. I left in one small part that I was 100% sure was unrelated, and that was the most painful to roll back.

Of course, that small part was the one causing the issues and hugely inflating our costs... I only figured it out a couple of weeks later.

My lesson was that if you have even the smallest doubt (and you always should when things break after you push to production!), roll back. It's better to roll back needlessly than to be overconfident...

author

Yeah. I've seen a similar case where we were hesitant to roll back because none of the changes seemed relevant; they were just some instrumentation/logging changes. It turned out that one of those logging changes, combined with long-standing environment variables, resulted in significant inefficiency.
