Automation isn’t just for scale

A few years back, in a sidebar discussion at a tech conference, one of Netflix’s engineering managers asked me if I was using any automation tools at work.

I said, “Not really. It’s a small environment and we’re not delivering any web apps that require automation for scale.”

She gave me an amused/sympathetic look and replied, “Dealing with scale isn’t the only reason to automate things.” It wasn’t condescension; she was being kind — dropping some knowledge, but I didn’t know how to respond.

A little embarrassed, I mumbled some other excuses for why automation wasn’t a good fit, said ‘nice to meet you’, and wandered off.

I cycled through my excuses, trying to figure out if they were valid. Most of the automation and config management tools I had used in the past were imperative and task-sequence based, like what you’d find in Microsoft System Center. When you have to play the “walk forward five steps, now extend left hand at 30 degrees, close fingers around peanut butter jar” programming game for smaller, legacy environments, it definitely feels “not worth it.”

Days after, the conversation still bugged me. “Why do people automate their infra? Why, really?” Even after reading a ton of articles, blog posts, and whitepapers, I still couldn’t come up with anything that wasn’t ultimately a scale use-case.

I had confirmed my bias and probably would have stopped there in similar circumstances, but what the Netflix employee said had a feeling of truth that I couldn’t let go of. I kept digging.

In order to understand the benefits and justification for automation, I started automating things.

Turns out, that engineering manager had a gift for understatement.

Livestock, not pets

I grew up in a culture of IT where servers, even PCs, were treated as special snowflakes. It took a long time to reinstall Windows + drivers + software, so you did a lot of care, feeding, and troubleshooting to make sure you didn’t have to start over from scratch.

We named servers after hobbits and constellations. We got attached to them and treated each like a pet.

“Bilbo-01 just crashed?! NOOOOOOO!”

In some ways, virtualization worsened that philosophy. Things were more abstracted, but not enough to force a mindshift. You could now move your pet servers between different hardware, reducing the reasons you would have to rebuild a particular server. At great cost, effort, and risk (“You can never patch my preciousssss.”), there are businesses running VMs that are old enough to drive.

So we ended up with thousands of VMs running thousands of apps that were set up by people who have retired, switched jobs 10 times since, or stayed and now act like fancy wizards, holding their knowledge tight to their chest.

Automation is the documentation

Let’s tackle the issue of tribal and secret knowledge first.

A big component of DevOps (and the Lean concepts that inspire it) is identifying and removing bottlenecks. Sometimes those bottlenecks are people. This doesn’t mean you have to get rid of people, but you do need to (where possible) remove any one individual as a core dependency for getting something done.

“Bob is the only person who knows how to install that app.”

“Those are Jane’s servers, you’ll have to check with her.”

“We can’t change any of this because no one knows how it works.”

At the end of the day, this is a scale problem. It’s scaling your IT to be larger than one person. Part of the solution to this problem is cross-training, but automation can also help (and prevent future stupidity).

If you use a configuration tool like Ansible or Chef, the playbooks/cookbooks become the documentation for the environment. They detail dependencies, configuration changes, and service hooks that were realistically never going to be documented otherwise. If you’ve subscribed to a declarative model of automation, the playbooks not only detail what the app stack should look like; if they’re run again, they can also enforce that the stack matches what’s in the playbook.
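To make the declarative idea concrete, here is a toy sketch in plain Python. It is not how Ansible or Chef are implemented, and the Host class, the DESIRED dictionary, and the converge function are all made-up names for illustration. The point is only that the desired state is written down as data, and running the same definition twice changes nothing the second time.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical in-memory stand-in for a server's current state."""
    packages: set = field(default_factory=set)
    services: dict = field(default_factory=dict)  # service name -> "running" / "stopped"


# The desired state doubles as documentation of what the stack needs.
# (Illustrative values, not taken from any real environment.)
DESIRED = {
    "packages": {"nginx", "postgresql"},
    "services": {"nginx": "running", "postgresql": "running"},
}


def converge(host, desired):
    """Bring the host in line with the desired state; return the changes made."""
    changes = []
    for pkg in sorted(desired["packages"] - host.packages):
        host.packages.add(pkg)
        changes.append(f"install {pkg}")
    for name, state in sorted(desired["services"].items()):
        if host.services.get(name) != state:
            host.services[name] = state
            changes.append(f"set {name} {state}")
    return changes


host = Host(packages={"nginx"}, services={"nginx": "stopped"})
first_run = converge(host, DESIRED)   # fixes whatever is missing or wrong
second_run = converge(host, DESIRED)  # nothing left to do
print(first_run)
print(second_run)
```

The second run returning an empty list is the property that makes “just run it again” safe.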

Change control

Things generally break because something changed. Maybe it’s a hardware or network failure. Maybe the software is buggy and there was a memory overrun or a cascading service failure. Maybe somebody touched something they shouldn’t have.

In olden times, a sysad would be tasked to troubleshoot the broken thing, wasting hours with Google searches and trial & error. Meanwhile, the app is down.

If you’re automating your infrastructure, that’s less of a thing. App stopped working? Re-run the playbook for the stack. Want to know why the app stopped working? Look at your run logs. Troubleshooting is still needed sometimes, but there is a lot less firefighting when you can push a simple reset button to get things back up and running. Turn it off and on again.

For approved changes, automation requires that the changes be well defined, which is a big positive that helps everyone know what’s happening and what to expect.

This type of state enforcement could equally be considered a security measure. Some people schedule plays that run through app stacks and repair/report anything that doesn’t match the expected norm.
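A scheduled check like that boils down to comparing what you expect with what you find. Here is a minimal sketch of the reporting half, assuming the expected and actual settings have already been gathered into flat dictionaries. The setting names and the find_drift function are hypothetical and do not come from any particular tool.

```python
def find_drift(expected, actual):
    """Return {setting: (expected, actual)} for every setting that differs."""
    drift = {}
    for key in sorted(expected.keys() | actual.keys()):
        want = expected.get(key, "<absent>")
        have = actual.get(key, "<absent>")
        if want != have:
            drift[key] = (want, have)
    return drift


# Illustrative data only: what the playbook says vs. what the host reports.
EXPECTED = {
    "sshd.PermitRootLogin": "no",
    "firewall.port_443": "open",
    "firewall.port_23": "closed",
}
ACTUAL = {
    "sshd.PermitRootLogin": "yes",   # somebody touched something
    "firewall.port_443": "open",
    "firewall.port_23": "closed",
    "firewall.port_8080": "open",    # not in the playbook at all
}

report = find_drift(EXPECTED, ACTUAL)
for key, (want, have) in report.items():
    print(f"DRIFT {key}: expected {want}, found {have}")
```

Anything that shows up in the report is either an unapproved change or a change that never made it into the playbook, and both are worth knowing about.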

NO MORE (or maybe less) PATCHING!

Not everyone is able to get there, but having fully automated stacks often means you can do away with OS patching. Just rebuild the stack once a month with the newest patched OS image. Boom!

If you do have to patch, you can significantly reduce your patching and service confirmation work by building the patch installs, reboots, and health checks into your automation. This helps prevent the post-patch-night “My app doesn’t work.” emails.
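As a rough sketch of what building the health checks into the patch run can look like: patch one host at a time and stop the moment a host fails its check. The patch_fleet function and the three callables passed into it are assumptions for illustration. In a real environment they would wrap whatever your tooling uses to install patches, reboot, and probe the app.

```python
def patch_fleet(hosts, apply_patches, reboot, is_healthy):
    """Patch hosts one at a time; halt at the first failed health check.

    Returns (patched_hosts, failed_host), where failed_host is None on success.
    """
    patched = []
    for host in hosts:
        apply_patches(host)
        reboot(host)
        if not is_healthy(host):
            # Stop here so one bad patch can't take down the rest of the fleet.
            return patched, host
        patched.append(host)
    return patched, None


# Stub callables so the sketch runs on its own; "web-02" pretends to break.
log = []
patched, failed = patch_fleet(
    ["web-01", "web-02", "web-03"],
    apply_patches=lambda h: log.append(("patch", h)),
    reboot=lambda h: log.append(("reboot", h)),
    is_healthy=lambda h: h != "web-02",
)
print(patched, failed)
```

Because the run halts on the first failure, web-03 is never touched, and you find out about the broken app at patch time instead of from an email the next morning.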

Fewer backups

Even with de-dupe, I can’t imagine how many petabytes of backup data are made up of OS volumes and full VMs. If you’re automating deployment and config management, the scope of what you need to back up is greatly decreased (so is your time to recover).

You’ll really just be concerned with backing up application data. Other than that, you can make compute and the VMs your app runs on disposable. So you’ll just have to worry about having your playbooks with configs in version control and some method to back up databases and storage blobs.

This rolls into DR and failover as well. In many instances, automation will enable you to do away with failover systems. Depending on your SLAs, a recovery plan could be as simple as “re-run the playbook with a different datacenter target.”
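The “different datacenter target” idea works because the stack definition and the target are separate inputs. A toy sketch, where the STACK dictionary, the datacenter names, and the plan_deploy function are all hypothetical:

```python
# Hypothetical stack definition; in practice this lives in version control.
STACK = {"app": "inventory", "image": "base-os-2024-06", "vm_count": 2}


def plan_deploy(stack, datacenter):
    """Return the ordered steps to build the stack in the given datacenter."""
    steps = [
        f"{datacenter}: create {stack['app']}-{i:02d} from image {stack['image']}"
        for i in range(1, stack["vm_count"] + 1)
    ]
    # Application data is the one thing that still has to come from backup.
    steps.append(f"{datacenter}: restore {stack['app']} data from latest backup")
    return steps


primary_plan = plan_deploy(STACK, "dc-east")
recovery_plan = plan_deploy(STACK, "dc-west")  # same definition, new target
for step in recovery_plan:
    print(step)
```

The only difference between the two plans is the target name, which is why the recovery plan can be as short as a single sentence.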

Integration tests… for infrastructure

If you truly are treating your infrastructure as code, you can write unit and integration tests for it that go past “well, the server responds to ping”. You can also deploy into test environments very easily and run those environments more cheaply because you don’t have to maintain 1:1 infra full-time.
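Here is a small sketch of a check that goes past ping: it asks the application itself whether it is healthy. To keep the example self-contained it starts a throwaway web server on localhost as a stand-in for a freshly deployed app. The /health endpoint and the check_health function are assumptions for illustration, not part of any specific product.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Stand-in for a deployed app exposing a hypothetical /health endpoint."""

    def do_GET(self):
        ok = self.path == "/health"
        body = b"ok" if ok else b"not found"
        self.send_response(200 if ok else 404)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the output quiet


def check_health(host, port, path="/health", timeout=5):
    """Return True only if the app answers the path with HTTP 200 and body 'ok'."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()
        conn.close()
        return resp.status == 200 and body == b"ok"
    except (OSError, http.client.HTTPException):
        return False


# Stand up the fake app on a free local port, test it, then tear it down.
server = HTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

healthy = check_health("127.0.0.1", port)
wrong_path = check_health("127.0.0.1", port, path="/missing")

server.shutdown()
server.server_close()
print(healthy, wrong_path)
```

In a real pipeline the same kind of check would run against the test environment right after the playbook finishes, and a failure would block the change from going any further.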

Turns out, if you make testing easier, people actually test things and you end up with better infrastructure.

This stuff is important

I get that none of these things feel very sexy, but in practice, they are game changing. As you start automating, you’ll discover that your infrastructure doesn’t work exactly like you thought it did, you’ll figure out what different apps actually need, and you’ll pull the weight of being the only person that knows something about a particular server/app off of your shoulders.

Some people like keeping secrets. They think being the only person who can do something gives them job security.

Those people are idiots. Maybe they will keep their job, but that’s not a good thing. They’ll never advance, never do anything more interesting than their current responsibilities.

Automating your infrastructure, opening up the secret knowledge to the entire team and doing away with the idea of being a hero who fights constant fires, is how you free yourself up to do better things. So build the robot, let it take over your job, and keep peeling all the layers of the onion to find work that’s more meaningful and interesting than installing patches, troubleshooting IIS, and getting griped at because “the server” is down.

You don’t have to work for a web company or be in the cloud to do this stuff (although some of the cloud toolsets are better). If you have even a small number of servers, it’s worth it. You don’t need “scale”, you just need a desire for your infrastructure not to suck.

Originally posted on

People Tech

How to lead without authority

I spent a lot of time being angry when I started my career. My employers and bosses frustrated me. My coworkers frustrated me. End users, customers, everyone frustrated me.

I got angry about decisions that made no sense to me. Most of my complaints fell into the theme of:

“If I was in charge, we’d never do X.”

If only they’d asked me first. If only I was the boss. If only…

I probably had a few legitimate criticisms and good ideas, but most of my frustration was based on the ignorance of youth and inexperience — thinking I knew more than I knew.

When I did want something changed or disagreed with a decision, my first course of action was to complain to my boss.

“This is stupid. You should fix it.”

I had no sense of agency and thought I couldn’t change anything because I didn’t have the power to. I could come up with ideas (Fun fact: ideas are easy.), but I needed someone else’s permission and authority to put them into motion.

I thought I needed control and a mandate to lead and effect change. More often than not I thought “I can’t do anything about this, other than complain.”

I was wrong.

Over time I discovered three things:

  1. True leadership is not based on authority.
  2. It’s possible (even preferable in many situations) to lead sideways.
  3. The degree to which anyone has actual control over anything or anyone is comically small.

Getting people to follow you

I’ve been very lucky in my career to have worked for good managers, although I often took them for granted. Even though I looked to their authority for solutions, with only a couple of exceptions, none of them ever told me “Do this… because I said so.”

Rather than dictating specific action, they presented a vision of what needed to be accomplished (goals), and provided me with support and breathing room to get it done. They trusted and empowered me. They made me feel important and that they genuinely cared about my well-being and personal progress.

I still thought I needed their mandate to change things, but I was able to move out of my comfort zone and build confidence in my skills and judgement.

My motivation to do well flowed from a desire to not let those managers down. I didn’t want to betray their trust or make them look bad. None of that dedication came from fear of losing my job or a respect for authority — it was because they inspired me to care. If any of them called me today and asked for help, personally or professionally, I’d be there in a heartbeat.

On the flip side, I’ve had a couple of bosses that micromanaged me (making me feel like they didn’t trust me at all) or leaned heavily on their authority to drive me and coworkers to action. I respected neither of them, although I have some sympathy for them in hindsight.

I’ve come to believe that there is no surer sign of a person’s self-perceived inadequacy (feeling in over their head or simply out of control) than when they feel the need to declare themselves “the boss”. The moment a person asserts their authority as the reason to follow them is the same moment they’ve proven they aren’t worth following.

I’ve seen that behavior in pimply-faced kids who get promotions at fast food restaurants. I’ve seen it in 60-year-old CEOs of large companies. Every time I see it, I want to pull those people aside and tell them “Shhh… Shhh… You’re OK. It will all be OK.”

Then I’d tell them three things about what real leaders do:

  1. They provide a vision of something greater than day-to-day tasks.
  2. They spend the time and emotional effort to discover what the people they’re leading care about.
  3. They trust and empower the people they’re leading, even when the stakes are high.

The scope of what can be accomplished by people who are inspired and care about the person leading them is far greater than what is done out of fear of losing one’s job or being reprimanded.

The power of soft influence

Even working for good bosses, I remained under the impression for a long time that my power to drive change had to come from them.

Yet again, I was wrong.

Without really being conscious of it, I started copying some of the behavior I saw in those I admired. I spent more time building relationships with co-workers, learning what motivated them, and sharing a little of myself in turn. I started trusting others a little more and let go of tasks and control of conversations I would have normally tried to hold tight to my chest.

I made conscious changes as well. I started asking for other people’s opinions more. Although it doesn’t come naturally to my personality, I started asking for help.

I started sharing more of my vision for the things I wanted to build and the changes I wanted to make. I worked to build consensus, soft-selling my ideas and compromising when necessary. I started letting others take ownership of my ideas as well.

And a curious thing began to happen. A lot of the things I was frustrated about and wanted to change started changing.

Much to my surprise, it was entirely possible to lead and effect change among peers without any authority at all.

Also to my surprise and frustration, the hardest thing for me to do was also the most effective in getting others to follow my lead: asking for help.

It’s one of those head-slapping things that you feel dumb about when you realize how well it works on you, but when someone asks for your help (and really means it), it makes you feel important, which in turn, makes you want to help.

Asking for help is a little like rolling over and showing your soft underbelly. Some of us have a hard time doing it because of ego and vulnerability, but if you can get past that and have confidence in your end goals, asking for help is straight up magic.

You’re saying to the other person, “You, specifically you, have the power, skills, knowledge, etc… to help me accomplish this thing. You are important to our success. You are important to me.” That’s hard to turn away from.

If this sounds a little like manipulation, it absolutely can be, but that’s fairly transparent when it happens. I think most people can tell when someone else is buttering them up or asking for something just to mooch.

If you ask for help and can get past yourself to believe that you really do need the other person’s help, you and all the others you’re leading will be able to build spaceships and cure diseases. You’ll be tapping into the real social network, the type of collaboration that got humans out of scraping by, living in caves, and into planting wheat and building cities.

Control is an illusion

I am a control freak and used to be much worse than I am now.

I thought I needed control for things to be the way I wanted them to be. I thought I needed control to change things, and I didn’t really change much because I also thought I needed someone else to provide me with that control.

Seeking out control is a good way to make yourself unhappy, because you’re never going to get it and those that think they do have control tend to look like idiots to everyone around them (see teenage fast food manager above).

It’s hard to admit you don’t have control. It’s really scary too. That anything can happen at any time and you can’t really do anything about it is a good way to give yourself nightmares.

But it’s the truth. The most we have control of is ourselves and how we react to things, and even that’s limited.

You can get mad and yell and try to change someone’s mind about something, but you can’t control what they think. Never mind that desiring that type of control is borderline psychopathic.

You can buy insurance and build your house into a fortress, but you can’t stop the freak electrical fire from burning it down while you’re out of town.

The best you can do is manage your reactions and maybe give a nudge here and there. That seems to be true for both leadership and life in general.

You don’t need control or authority to lead, because those aren’t real things. Instead what you need is empathy, vision, and a realistic understanding of what you can and cannot influence to direct your efforts.

We seek out control because it feels like an easy fix. We just need that promotion, or to be our own boss and then everything will be better. Control gives us the authority to lead the charge and get stuff done.

That’s just not how life works. Actual leadership is hard because having empathy, vision, and a detachment from control is hard. The sooner you give up chasing after control and put your efforts into building those other muscles, the sooner you’ll actually accomplish something.

Originally posted on