This article was first published in my newsletter, Herding Lions.

In Thinking in Systems, Donella Meadows observed that when maintaining a system, the properties you can choose to optimize for include productivity, stability, and resilience. Productivity is about producing outputs: for a software team this might be completed projects, fixed bugs, or releases shipped. Stability is about consistency of output: can you maintain the same level of productivity over time? Resilience is the property of being able to recover from perturbations and unexpected changes. For a software team that might mean being able to handle losing a teammate, adapt to remote work, or maintain productivity during a tumultuous time for an organization.

Meadows’s interesting observation was that because resilience is less observable than productivity or stability, people tend to systematically underinvest in it. We can see outputs go up, and we can measure how they change over time, but resilience is by definition about response to rare and unexpected events, and is therefore hard to measure in normal circumstances. But the last year has been a reminder for many of just how important resilience is: whether dealing with moving teams to remote work, the Great Resignation, supporting parents whose kids were suddenly home with them during the day, or the trauma of illness or loss from the pandemic, we’ve all had to deal with out-of-the-ordinary circumstances when leading teams this year.

So how do we cultivate resilience on our teams? Here are a few things I’m thinking about:

  1. Reduce Work In Progress - In addition to other benefits, reducing the number of things you’re working on at a time helps a team be more resilient by encouraging collaboration and shared knowledge.
  2. Leave Margin - Don’t fill 100% of your sprint/quarter/annual planning with roadmap projects. Leave meaningful slack that can be used on technical improvements, unplanned requests, maintenance, and learning tasks. That work is valuable in its own right, but the margin that planning for it provides can also make your team more resilient to changes in plan or scope-creep surprises.
  3. Build Psychological Safety - Teams that trust each other and can talk about difficult things are less likely to have issues fester and explode, and are more able to weather storms. Building a safe environment is the right thing to do for your teammates regardless of effectiveness, but the business case here is also easy to make.
  4. Prioritize Developer Tooling & Velocity - It’s easy to adjust to change when you can move fast. Velocity makes everything easier, and great tooling also makes it easier to onboard new team members and reduces reliance on the head-canon context of your more experienced team members.
  5. Hire and Train For Redundancy In Your Team - Teams of specialists lack resiliency. If only one person can do a particular task, then you’re vulnerable to anything that removes that person: leaving for a new job, an illness, a vacation, or just an opportunity for them to be promoted into new responsibilities.
  6. Have Everybody Take Real Vacations - Vacations are good for your team, but when people truly take the time off they’re also great “chaos engineering” experiments for your team to see how well you can function without different team members. If your output is limited when somebody is on vacation, it’s a smell that you can try to correct with some of the other tactics here.
  7. Rotate Responsibilities - Similar to the last two items, we can build redundancy by making sure that we don’t rely on a single person to always fulfill certain tasks. It’s great to have experts and “owners,” but those titles shouldn’t be exclusive, and part of the responsibility should be helping others do the work and grow into co-owners.
  8. Give Everyone Context - When everybody understands the why behind a team’s work, it’s easier to adapt to changing conditions because the adapting can happen throughout the team; the leader doesn’t need to understand every change or detail.
  9. Write Documentation - Another way to make it easier to ramp people up or move people between roles is to write things down. This means processes, architecture, and how-tos. Thoughtful, well-maintained docs can add significant resilience to a team. Unfortunately, this item is also the one that I’ve never really seen done well at a team level, and it may in practice be unrealistic. At an org level, though, this is extremely important for resilience.