Toyota Production System (Lean) Terminology

I found a great article about Toyota Production System (TPS) terminology. TPS is also known as Lean and is the basis of The Goal, The Phoenix Project, and DevOps. I’ll be using the terminology a lot in the future, so take a moment to read up on the terms. A shared language helps ensure effective communication. We’ve already discussed Kanban; here are some other terms to focus on:

  • Andon
  • Kaizen – notice it’s for everyone, not specialists.
  • Nemawashi
  • Muda
  • Mura
  • Muri
  • Set-Up Time
  • Tataki Dai

DevOps: The Dev doesn’t mean what you think it means

In past discussions on DevOps, I’ve said that the Dev doesn’t stand for Developers. That probably seems odd, since in many instances it’s described as Developers + Ops. DevOps is a software development methodology, so the Dev means development. But what does that actually mean?

Development is the business side of your product pipeline, as opposed to Ops, which is the customer side. The business side entails not just your software developers, but Product, Sales, and QA (and you could even argue Marketing). These organizations help define the product requirements and identify the customers who will use the product. You need a product your customers want before the software developers can start developing. This whole side of the business needs to work in synchrony to provide the most value. Development without a product nets you nothing, and product without customers nets you increased inventory costs.

This also affects your feedback loop between Ops and Dev. The operations side of the house needs to provide feedback not just to the software developers: Product and Sales need to know how the customer’s needs were met, and QA needs to know about quality issues that slipped through. If you only talk to the developers, your feedback loop isn’t complete and you’re not implementing DevOps properly.

Celebrating your developers and ignoring the rest of development is like exercising your arms and legs but ignoring your core.

Puppet Forge Module rnelson0/certs

I just published my first Puppet module on the Forge, rnelson0/certs. It provides a single define that installs a pair of SSL files (.crt and .key) from a specified external location to the managed node. This is designed for use with apache::vhost defines that allow you to provide the names of SSL files to the vhost, but require the files to already exist on the node. I hope you find it useful. Report any issues via the GitHub issues tracker. Thanks!
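
Here’s a minimal sketch of how the two pieces fit together. The define name (certs::vhost), its source_path parameter, and the on-disk paths shown here are illustrative from memory; treat the module’s README on the Forge as the authoritative interface.

    # Copy www.example.com.crt and www.example.com.key to the managed node.
    # certs::vhost and source_path are assumed names; see the module README.
    $cname = 'www.example.com'
    certs::vhost { $cname:
      source_path => 'puppet:///site_certificates',
    }

    # apache::vhost (puppetlabs/apache) expects the files to already exist
    # on the node; the define above satisfies that requirement.
    apache::vhost { $cname:
      port     => 443,
      docroot  => "/var/www/${cname}",
      ssl      => true,
      ssl_cert => "/etc/ssl/certs/${cname}.crt",
      ssl_key  => "/etc/ssl/certs/${cname}.key",
    }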

Writing for #vDM30in30

As November rushes to its conclusion, it’s time to start some introspection about #vDM30in30. Here are some observations I’ve made, in no particular order:

  • I was asked how I write so many articles in such a short time. My most successful pattern is to identify 2-3 related concepts I want to write about and jot down 5 or 6 sentences describing them in Evernote. When it’s time to write an article, I go to Evernote, pick a concept, and give myself one hour to write. At the end of the hour, I start revising, with no time limit on that part. I’ve been able to finish most articles in 60-90 minutes this way. The concepts stay focused, the editing is good, and I’m prevented from obsessing about perfection to the point that I never publish at all. This results in shorter articles, so it won’t hold true for longer technical pieces, but I like this pattern.
  • As a consequence of this pattern, my editing process is getting tighter and tighter. Write and finish, then edit, edit, edit. I’ve noticed fewer errors in my writing overall – hopefully that’s the reality of it.
  • Writing and editing are easy; ideas are difficult. Having a stock to rely on isn’t a bad thing. I hope to end this effort with a dozen or so ideas banked for the future.
  • Thirty articles in thirty days is rough. Even with a decent process in place and much shorter articles, I don’t think I want to do this again anytime soon. I’ll stick to my once or twice a week schedule, thanks!
  • Fear is a killer. By publishing rapidly, I’ve overcome most of my fear. Previously, I would sit on a completed article for days, sometimes weeks, for fear of how it would be received. Now, I am more focused on writing for my own goals – I appreciate it when an article is well received, but it’s not the primary focus during writing and editing.
  • Even though I’m less concerned about the reception, of course I’ve looked at page view statistics. Whether I’ve published 0, 1, or 4 articles a day, page views – aggregate and per new article – seem to remain fairly consistent. There doesn’t appear to be a downside to publishing multiple articles a day. I also didn’t see any significant correlation between the day of the week or the time of publication and the number of views. This isn’t something I’ll worry about in the future.
  • I wrote about the writing process itself a few times. I found this useful myself and I hope others find it helpful as well. The 30in30 exercise, after all, was about improving my writing.

While I said I don’t want to do this again, it has been a worthwhile exercise and I think I benefited a lot. I hope the readers enjoyed it, as well! Even though this 30in30 challenge is ending, it’s not too late to start your own 30in30 challenge.

Happy Thanksgiving!

When Good Hypotheses Go Bad

I’ve written recently about the necessity of hypotheses, whether you’re writing or troubleshooting. When you craft a hypothesis, it’s based on some preconceived notion you have that you plan to test. When your hypotheses are tested, sometimes they are found wanting. It’s tempting to discard your failed hypotheses and simply move on to the next, but even a failed hypothesis can have a purpose.

Imagine for a moment that you’re sitting in front of a user’s computer, helping them out with some pesky problem. Suddenly it’s the end of the day; you’ve tried everything in your repertoire and you’re calling it quits when the user looks at you and says, “I thought it was kinda weird you tried all that. Bob did everything you did last week and he couldn’t figure it out either.” Gee, thanks, Bob! There’s not even a ticket from last week, nor did he mention talking with this user. How many hours did you just waste that you could have saved if you had known none of it would work?

Bob spent hours crafting and testing his hypotheses, but he discarded all of them, straight to the circular file. You then proceeded to craft and test many of the same hypotheses which, of course, failed again. If only there were some way we could learn from our failures… Wait, there is!

Let’s take a quick look at another example, a scientific hypothesis. A researcher crafts a hypothesis and spends $100,000 to gather preliminary data that can be submitted for a grant worth $2,000,000. If the preliminary data looks good, great – well on the way to two million in funding. If it doesn’t pan out and the hypothesis is shot, $100,000 just went down the drain. That’s the nature of science. But…

A few years go by. Another researcher comes up with the same brilliant idea and sets out to collect some preliminary data for around $100,000. Whoops: the hypothesis isn’t that brilliant, doesn’t work out, and the scientist has wasted time and money. Now science is out $200,000 on this failed hypothesis. If only she had known that someone else had tried this before, but there was nothing in the literature to indicate that anyone had. She publishes her negative results in a journal, and the next scientist with the same bright idea can see what the results will look like before investing time and money in it. Good money isn’t thrown after bad anymore.

You can help those after you (including future-you) if you take some time to record your hypotheses and how they failed. You don’t necessarily have to go into great detail (scientific papers obviously require more rigor); often just a sentence or two will work. “Traceroutes were failing at the firewall, but a packet capture on the data port showed the traffic leaving the firewall,” or, “The AC fan wouldn’t start and the capacitor looked like it might be bad, but I swapped it out for my spare cap and it still won’t start.” If it’s a really spectacular failure – something that was ohhhhh-so-close to working, or a really subtle failure – maybe it’s worthy of a full blog article.

Make sure to store this information somewhere it will be found by someone who is likely to need it. In Bob’s case, this is what the ticketing system was there for, so that others can see his previous work on an asset or for a user. At home, you might keep a journal or put a note in the margins of the AC manual. For public consumption, you might write a blog article or submit your research results to a journal. Anywhere that will help prevent someone in the future from having to waste resources to rediscover the failed hypothesis.

Try to make this part of your habit when researching and troubleshooting. State your hypothesis, test the hypothesis, and record any failures before proceeding with successes. Don’t be a Bob!

DevOps for the SysAdmin

Last Thursday, I was proud to present the following slidedeck with Byron Schaller at the Indianapolis VMUG meeting. There was one edit I wanted to make afterward, which of course is when I have the best editing thoughts 🙂

The Dev in DevOps stands for software development, not software developers.

It’s a small and subtle difference that has a huge impact.

If you haven’t spoken at your local VMUG, please give it a shot. It’s incredibly rewarding: it grows your speaking abilities, your ability to internalize and then vocalize a subject, and your participation in the VMUG. Most VMUGs struggle to find speakers. Please volunteer and give back to the community.

Hypothesis Driven Writing

I just tackled hypothesis-driven troubleshooting, which brings me to an important subject for blog writers and #vDM30in30 in particular: hypothesis-driven writing. As writers, we constantly seek to improve our abilities. One of the most important skills, in my opinion, is using a hypothesis as the foundation of your writing. Writing around a solid hypothesis results in an interesting, focused piece that engages readers and leaves them with a clear impression of what the writer wanted to say. The lack of a hypothesis results in an aimless article that leaves the reader confused and wondering what the writer was trying to convey.

As a reader, most of us find this hypothesis to be true without requiring great analysis. If an article starts out talking about the importance of OpenStack and devolves into comparing Disney films, we all feel the lack of a solid hypothesis. On the other hand, if Disney films are involved in the hypothesis, perhaps as analogies to the components of or community around OpenStack, the reader may feel rewarded and be very receptive to the writer’s goals (I challenge someone to write such an article; it would be quite the feat!). When the writer follows the hypothesis, everyone enjoys the benefits.

If we agree that good writing relies on a solid hypothesis and the writer’s adherence to it, how do we, as writers, craft an effective hypothesis? Look at the definition of hypothesis. There are many types, and the type chosen depends on the writing goal. A research paper requires a working hypothesis, one that is provisionally accepted to further research. It is constructed as a statement of expectations, such as, “We expect X to increase proportionally to the decrease of Y,” which would then be tested to determine its validity. A formal logic statement of the form “If X, then Y” is based on hypothesis X and can be the foundation of a logical proof or experiment. In an opinion piece, like a blog, a hypothesis may be crafted as a general thesis, such as, “Creating and adhering to a hypothesis is the key to good writing,” which is then examined in detail.

Now that you have a hypothesis, you need to state it. The first paragraph of your writing is where you state the hypothesis. There are many ways to do so; I follow a few guidelines:

  • Describe the general hypothesis.
  • State your specific hypothesis. Avoid terms like, “I think,” when possible.
  • Repeat your specific hypothesis.

Throughout the rest of your writing, every paragraph needs to relate to the hypothesis, either through direct support of the statement or through indirect support, such as data or analysis that bears on it. Your readers will be able to follow the thread of your writing and, hopefully, see exactly what you were trying to present to them.

In your final summary (typically the final paragraph, except in longer articles), restate the hypothesis and the supporting evidence. If you did a good job explaining yourself, this will reinforce the ideas in your readers’ minds.

As a writer, your goal is to create a valuable article. The foundation of that article is a hypothesis. It’s important to adhere to this hypothesis in order to reward your readers with a solid article. Whether you’re participating in #vDM30in30 or writing on a less frequent basis, take the time to practice hypothesis-driven writing and focus on improving this skill; both you and your readers will appreciate the results.

Hypothesis Driven Troubleshooting

John Price wrote a wonderful article about troubleshooting the other day that got me thinking about this skill. Troubleshooting is an incredibly vital skill in IT and one that many people view as innate, to the point that a common adage is, “You can’t teach troubleshooting; you have it or you don’t.” I believe that, like nearly every other skill, it is learned, and those without it should not be treated as hopeless. It may come easier to some people, but anyone can be taught the fundamentals of troubleshooting if they care to learn.

Troubleshooting is, at a bare minimum, the search for the source of a problem. Good, effective troubleshooting is a logical and systematic search for that source. The difference is driven by a scientific hypothesis: a proposed explanation for a problem that can be tested. The hypothesis might be, “The reason the internet is unavailable for users is that their internet connection is down.” This can be tested and determined to be the cause, or discarded as a failed hypothesis. The troubleshooter can then form another hypothesis, “The reason the internet is unavailable for users is that the firewall is not passing traffic,” which can then be tested. By creating and following a series of hypotheses until a valid one is found, the troubleshooter can identify a problem that can be fixed. This is the essence of the scientific method, which isn’t just for scientists anymore. Troubleshooting without a hypothesis may lead to the source of a problem, but only through random luck.

The scientific method, how to craft a hypothesis, and how to test a hypothesis are all skills that must be learned. We are not born with this knowledge; it must be taught. Some of us learn this in school as part of our formal education. Some of us learn through less formal means. In John’s article, his father taught him how to define and test a hypothesis via the Socratic method, asking John to pose and answer questions and teaching him how to narrow the possible sources down to a single one. While most of us learn these skills at a relatively young age, usually before age 20, they are teachable to anyone of any age. All it requires is a good teacher and a student willing to listen.

If someone you know does not have good troubleshooting skills and their job – or a job they want to obtain – requires it, they can be taught. If this person is your colleague or friend, do not give up on them! Become a teacher to them or find them a mentor. Perhaps they’ll teach you something along the way, and you’ll have the satisfaction of knowing that you’ve contributed to the next generation of IT leaders.

Questioning Assumptions with Intelligence

“Question everything!” You’ve heard this a million times. You probably try to do it, sometimes, too. The underlying tenets of The Goal, the Theory of Constraints, Lean, and other methodologies rely on questioning assumptions. It’s important, but what exactly does it mean, and what do you do afterward?

First off, it’s not a license to literally ask questions about every business decision at every opportunity. Many questions can be answered in your own head before you open your mouth, so there’s no need to bother others with those questions. For the rest, go back to the theory of constraints and ask yourself if it’s a bottleneck first. If not, the answer might not matter. Above all, always be courteous and understanding of the situation before speaking. If you do literally question everything, you will be treated like an a-hole of the first degree and your message will be lost. There’s a time and place for everything. Continuing on…

In the right context, “Hey, wait a minute, why exactly are we doing that?” is a good question. Sometimes there is a good answer, but other times the answer is simply, “because.” That’s not a good answer. For example, someone who lives in SoCal suggested I salt my car’s tires in the winter. Though I have lived in the north, I had never heard of doing that. I asked where they learned it. Many years ago, the person went to college in Pittsburgh and saw buckets of salt near parking areas. They saw someone else pour salt around their car’s tires, so they assumed that was what it was intended for. Turns out it was for the sidewalks.

You might find what that person did humorous, but before you snicker, look around your business – are you sure you’re not doing something simply because your colleague or predecessor did it? A long time ago, I found out I had been swapping backup tapes every morning on a system that had been decommissioned but not powered off. Whoops! This is cargo cult behavior, and we all participate in it at some point in our lives. Businesses do it A LOT. The important thing is that we come to understand what we are doing and correct the behavior.

When you do find some broken assumption, you must be smart in how you address it. Again, make sure it’s a constraint. A little salt around the tires won’t really hurt anything, but putting salt in the gas tank certainly would. Focus the efforts on the constraints. Figure out what is wrong with the assumption and how to make it right. When you find these broken assumptions, there’s no need to blame or ridicule someone. You fixed a problem, everyone should be happy! Once you make some correction, take a look at the other assumptions in your system and see if they were affected. Decisions in the fundamental parts of the system tend to have cascading effects further down the line.

This is an iterative process. If you question an assumption this year and there’s a good reason for it, you will eventually want to revisit it, maybe next year or in 5 years. Change is perpetual and you should embrace it, not flee from it.

Fortigate user permissions peculiarities

While working with a customer on their Fortigate firewalls, I was introduced to a peculiarity of how FortiOS interprets users’ diag commands. I suspect this affects multiple versions, but I don’t have the ability to test that.

  • FortiOS: 4.2.x
  • User: wild-card (TACACS)
  • Profile: super_admin_readonly

TACACS users whose permissions elevate them to the super_admin profile are unaffected. They can run diag commands unrestricted as they have full access.

TACACS users whose permissions remain at super_admin_readonly were finding that they could not run diag commands that accessed an interface, such as diag sniff packet any “icmp”. Upon further investigation, the issue was related to the IP the user connected to and the interface (“any” in the example) used in the command. For a readonly user, the any interface is off-limits; only the interfaces configured for the VDOM the user connected to are available.

In other words, if a firewall had two VDOMs, Common and DMZ, and the user connected to an IP on an interface in the Common VDOM, only that VDOM’s interfaces would be usable. For instance, diag sniff packet common-outside “icmp” would work, as would common-inside, but interfaces connected to other VDOMs are off-limits, so diag sniff packet dmz-outside “icmp” would fail. Once the end user had a list of the IP addresses and interface names, and the VDOMs they belonged to, they were able to perform all the required diagnostic commands.
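
To recap, here’s an annotated sketch of the behavior. The interface and VDOM names (common-outside, dmz-outside, and so on) are the illustrative ones from the example above, not real configuration, and the trailing comments are annotations rather than CLI syntax.

    # Logged in to an IP in the Common VDOM as a super_admin_readonly user:
    diag sniff packet common-outside "icmp"   # works: interface is in Common
    diag sniff packet common-inside "icmp"    # works: interface is in Common
    diag sniff packet any "icmp"              # fails: "any" is off-limits to readonly users
    diag sniff packet dmz-outside "icmp"      # fails: interface belongs to the DMZ VDOM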

I hope this is fixed in more recent versions, but at least there’s a workaround that makes some logical sense.