Getting started in IT: Years 0-4

Over the weekend, a really great hashtag came into existence: #FirstTechJob.


This in turn came from a great question about the requirements in job listings.


Please check out the hashtag; it’s a great sampling of the always humble, often mundane beginnings of nearly everyone in IT. Some common themes were, of course, help desk support, managing printers and email systems, and managing or running ISPs (often including modems!). My, how times have changed. It inspired me to talk a bit more about my own journey in the hope that it may help others, whether they’re just getting started or have been at it for a while.

Getting Started


My very first professional job was working for a neighbor’s local PC business in the summer between high school and college. He sold then-high-end computers (I think mostly 386s and sometimes 486s, but it’s been a while), with ISA graphics cards that took 30-60 minutes to render a single low-res frame, and he needed assistance assembling them in a timely manner. The work itself wasn’t difficult – insert Tab A into Slot B, tighten Screw C – but I asked a lot of questions and learned a good bit about hardware. I made a few bucks, mostly spent at the local laser tag arcade, and most importantly I was able to put “professional” work experience on my resume in addition to Wendy’s and KFC. Thank you, neighbor, for that first job!

After the first year of college, I lucked into a paid summer internship at a local engineering firm. The company’s owner was a friend from church, and my Dad helped me get an interview – some of that was getting me in the door, but a good bit was getting me off my butt – and I was able to parlay my summer job and my schooling into experience. I took on a number of responsibilities there over the next two summers: migrating CAD files from an old Unix terminal to the new NT 3.51 systems, then to NT 4.0; desktop support; network support; printers and plotters; and a bajillion other little things.

One memorable event was when Pittsburgh was struck by some severe weather (including tornadoes – a real rarity for that area!) and a lightning strike blew out the transformer outside our building. Always splurge on lightning suppression. Over half the hubs died, and we got a fast track to switches. In 1997-98, that was ahead of the curve for many. There were of course many less memorable, but more important, things I learned. The most important was how to provide service and support to users while maintaining a positive relationship. There were always trying people (I have actually seen someone stick a CD in a 5 1/4″ floppy drive and force the door shut, and it’s not pretty), but hey, I knew nothing about what they did, so why would I hold it against them for not being experts in my job area?

In spring of ’99, I was supposed to intern there again, but a hiring freeze changed that plan. I already had the college semester off, and it was too late to schedule classes by the time I found out, so I canvassed and found two part-time jobs where I could stay self-employed, keep my own schedule, and make money. I learned pretty quickly that I don’t want to be my own boss. That’s a lot of work, and some weeks I had fewer than three days of it! I kept at this through most of ’99 and added “Y2K preparation” to my skillset. Note: you really want to retire before 2038, when the 32-bit Unix time counter rolls over.

In December of ’99, I found a full-time job at a local IT consultancy – except they weren’t local to me, so I had to move. I am 99% certain the only reason I got the job was that I called the company every week asking if they had openings, and the owner decided it was easier to let me try the job on probation than to put me off any longer. Persistence pays off! This was my first full-time, self-sustaining job. I stayed there for three years and did a little bit of everything: customer service above all, large-scale OCR of court documents, web front-ends to those documents, wireless WAN connectivity (pre-802.11b), and network security, which I really fell in love with.

Keep Going

That covers the first four years and a bit beyond, which gave me a really great foundation for the rest of my IT career. I would like to think I’ve done fairly well since then. These jobs may not seem like the awe-inspiring jobs everyone wants, but they were good jobs, with good people, and I appreciate how lucky I was to have them. I know it can be a struggle to get those first few jobs and years of experience, so if you can’t land a dream job out of the gate, know that you can find tons of other jobs that will benefit you and your career. IT is really diverse, and you may find something you didn’t know you were looking for; if not, the experience will certainly help you with those “4+ years experience needed” jobs.

Good luck in your journey!

Ruby net/https debugging and modern protocols

I ran into a fun problem recently with Zabbix and the zabbixapi gem. During puppet runs, each puppetdb record for a Zabbix_host resource is pushed through the zabbixapi to create or update the host in the Zabbix system. When this happens, an interesting error crops up:

Error: /Stage[main]/Zabbix::Resources::Web/Zabbix_host[kickstart.example.com]: Could not evaluate: SSL_connect SYSCALL returned=5 errno=0 state=SSLv2/v3 read server hello A

If you google that error, you’ll find a lot of different causes described across a host of systems. Puppet itself is one of those systems, but it’s not the only one. All of the systems have something in common: Ruby. What they rarely have is an actual resolution. Possible causes include clocks out of sync between nodes, errors with the certificates and stores on the client or server side, and of course a bunch of “it works now!” posts with no explanation of what changed. To confuse matters even more, the Zabbix web interface works just fine in the latest browsers, so the SSL issue seems restricted to zabbixapi.
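
Before blaming the server, it helps to reproduce the handshake outside of zabbixapi. Below is a minimal sketch using Ruby’s net/http to probe which protocol versions the server will accept – the endpoint URL is hypothetical, and VERIFY_NONE is strictly a debugging convenience:

require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical Zabbix API endpoint; substitute your own server.
uri = URI('https://zabbix.example.com/api_jsonrpc.php')

[:TLSv1, :TLSv1_1, :TLSv1_2].each do |version|
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  http.ssl_version = version
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE # debugging only, never in production
  begin
    http.start { puts "#{version}: handshake succeeded" }
  rescue => e
    puts "#{version}: #{e.class} - #{e.message}"
  end
end

If only the newest versions succeed, the client is negotiating a protocol the server no longer offers, which points away from certificates and clocks.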

To find the cause, we looked at recent changes. The Apache SSLProtocol settings had been changed recently, which showed up in a previous puppet run’s output.
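
For context, a hardening change like that often looks something like this illustrative mod_ssl fragment (not our exact diff):

# Illustrative ssl.conf change: drop legacy protocols and weak ciphers.
# An older Ruby/OpenSSL client that insists on a now-disabled protocol
# fails with exactly the kind of handshake error shown above.
SSLProtocol all -SSLv2 -SSLv3 -TLSv1
SSLCipherSuite HIGH:!aNULL:!MD5:!RC4
SSLHonorCipherOrder on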


Announcement: GitHub repo for common vCenter roles

Last week, I was installing some of the vRealize Suite components and creating accounts for each component using the Principle of Least Privilege. Sometimes I was able to find vendor documentation on the required permissions, sometimes only blog posts where people guessed at them, but in almost no cases could I find automated role creation. Perhaps my google-fu is poor! Regardless, I thought it would be nice to have documented, automated role creation in a single place.

To that end, I created a repo on GitHub called vCenter-roles. I use PowerCLI to create a role with the correct permissions, and only the correct permissions. Each cmdlet will allow you to specify the role name or it will use a default. For instance, to create a role for Log Insight, just run the attached ps1 script followed by the command:

New-LogInsightRole

It’s that easy!
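
Under the hood, each function is essentially a thin wrapper around PowerCLI’s role cmdlets. Here’s a simplified sketch of the pattern – the privilege IDs below are placeholders for illustration, not the verified Log Insight set, so check the repo for the real list:

# Assumes an active Connect-VIServer session.
function New-LogInsightRole {
    param([string]$RoleName = 'LogInsight')
    # Placeholder privilege IDs; the repo documents the actual set.
    $privilegeIds = @('System.View', 'System.Read', 'Global.Licenses')
    $privileges = Get-VIPrivilege -Id $privilegeIds
    New-VIRole -Name $RoleName -Privilege $privileges
}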

I will be adding some other vRealize Suite roles as I work my way through the installation, but there are tons of other common applications out there that require their own role, and not just VMware’s own applications! I encourage you to open an issue or submit a Pull Request (PR) for any applications you use. The more roles we can collect in one place, the more helpful it is for the greater community. Thanks!

What is a backdoor?

Last month, a significant vulnerability in Fortinet devices was discovered and published. When I say significant, I mean huge – Multiple Products SSH Undocumented Login Vulnerability. In other words, there’s a username/password combination that works on every device running an affected firmware version. If you are still running an affected version, you NEED to upgrade now! This is bad in so many ways, especially following similar issues with Juniper and everything we’ve seen from Snowden’s data dumps. Fortinet responded by saying ‘This was not a “backdoor” vulnerability issue but rather a management authentication issue.’

Is that right? What is a “backdoor” and what is “management authentication”? Is there an actual difference between the two, or is it just a vendor trying to save their butt? I got into a discussion about that on Twitter.


Ethan challenged me to think about the terminology and I think I’ve come around a bit. Here’s what I now believe the two terms mean.


Root Cause Analysis: It’s Still Valid

You’ve probably heard it before: Root Cause Analysis (RCA) doesn’t exist; there’s always something under the root cause. Or: there’s no root cause, only contributing factors. This isn’t exactly untrue, of course. Rarely will we find a cause and effect so simple that we can reduce a problematic effect to a single cause. Such arguments against RCA may be grounded in truth, but they gloss over the subtleties and complexities of the actual process of analysis. They also focus on the singular, though nothing in the phrase “Root Cause Analysis” actually implies the singular. Let’s take a look at how RCA works and analyze it for ourselves.

Root Cause Analysis is the analysis of the underlying causes of an outage. We should emphasize that “causes” is plural. The primary goal is to differentiate the symptoms from the causes. This is a transformative and iterative process: you start with a symptom, such as the common “the internet is down!”, and in a series of analytical steps you narrow it down as many times as needed. That progression may look like:

  • “DNS resolutions failed”
  • “DNS server bind72 failed to restart after the configuration was updated”
  • “A DNS configuration was changed but not verified and it made its way into production”
  • “Some nodes had two resolvers, one of which was bind72 and the other was the name of a decommissioned DNS node.”

Each iteration gets us closer to a root cause. We may identify multiple root causes – in this case, lack of config validation and bad settings on some nodes. Not only are these root causes, they are actionable ones. Validation can be added to DNS configuration changes, as sketched below. Bad settings can be updated. Perhaps there’s even a cause underneath – WHY the nodes had bad settings – because RCA is an iterative process. We can also extrapolate upward to imagine what other problems could be prevented: DNS configurations surely aren’t the only configurations that need to be validated.
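
To make that first remediation concrete: a deploy pipeline can simply refuse to ship a DNS change that does not parse. Assuming BIND, a pre-deploy gate might look like the following (paths and zone are hypothetical):

# Hypothetical pre-deploy gate: validate before promoting to production.
named-checkconf /etc/named.conf || exit 1
named-checkzone example.com /var/named/example.com.zone || exit 1
echo "DNS configuration validated; safe to promote"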

Multiple causes and findings don’t invalidate Root Cause Analysis; they only strengthen the case for it. If it makes the concept easier to share, we can even call it Root Causes Analysis, to help others understand that we’re not looking for a singular cause. Regardless of what we call it, I believe it is absolutely vital that we continue such analysis and don’t throw away the practice because some people have focused on the singular. Be an advocate of proper RCA, of iterative analytical processes, and of identifying and addressing the multiple causes at hand.