VMware Home Lab 2015

Well, it’s about time I posted about my home lab, 2015 edition. Last year I was running a happy home lab on HP workstations using Opteron 2xxx processors. This year VMware released vSphere 6 and I lost support for the Opteron 2xxx processors, so I had to start over. I am a very cheap man, so I want the cheapest thing that will reasonably work. This time I wanted three nodes for a possible vSAN cluster, so here is my home lab.

Compute

After pricing out all kinds of hardware I stuck with HP workstations. They are normally server-class hardware with a long life, and the used ones tend to be lightly used. They also don’t sound like a jet engine when running. The HP version of ESXi installs out of the box without any problems and does not require any modification, which is a huge win. I wanted to make sure the processor was on the HCL (I don’t care about the server being on the HCL because I am cheap), so I browsed the options and found the E5404 was on the HCL for ESXi 6. Searching eBay I found some great deals on HP xw6600 workstations with a single quad-core E5404 processor. Normally they ship with two Broadcom 1 Gb NICs and 4GB of RAM. They max out at 32GB of RAM with a single processor. The only real downside is that you only get four cores, but it’s $79.99 each right now. Add 32GB of RAM for $88.99 and you get 4 cores and 32GB of RAM for about $169 per node. Three of these nodes form a very solid cluster.

 

Storage

Shared storage is critical to a good cluster.  You have a few options:

  1. VMware vSAN
  2. Additional PC providing NFS or iSCSI
  3. Some type of NAS (Synology)

 

vSAN requires at least one SSD and one spinning disk per node.   So you are looking at roughly $300 total to go this way.   It also requires a license from VMware.

 

Additional PC providing NFS or iSCSI – use FreeNAS or something similar. It’s cheap and easy if you have hard drives and PCs sitting around.

 

Synology – These sell for cheap on eBay and they rock. I love them… you can buy one with drives on eBay for around $150 if you are lucky.

 

I personally have vSAN and a Synology NAS in place.

 

Networking

You really need a managed gigabit switch for a VMware implementation. Buy one for $120 or so, or find one on eBay.

 

Licenses

Here is the sticky bit… you need licenses. There are two low-cost options available at this time:

  • VMware vExpert – This program provides great access to licenses and is not hard to get into at this point
  • VMUG Advantage program – for $200 a year you get licenses and lots of other benefits

 

Let me know if you have any questions.

Journey to an Automated Cloud Part 1

Are you ready to automate everything? Does your boss want some of that cloud? Everywhere I turn, people want to get into the cloud. They all want a vendor product to provide the cloud. Every vendor show I go to has hundreds of products to solve that problem. In my experience it is not a product problem that limits our journey to the cloud. In this series of articles I will explore some of my thoughts on your journey to the cloud.

 

Part 1 – Where am I and what do I want?

My thoughts for this part are best explained by an exchange from Alice in Wonderland:

 

“Would you tell me, please, which way I ought to go from here?”
“That depends a good deal on where you want to get to,” said the Cat.
“I don’t much care where–” said Alice.
“Then it doesn’t matter which way you go,” said the Cat.
“–so long as I get SOMEWHERE,” Alice added as an explanation.
“Oh, you’re sure to do that,” said the Cat, “if you only walk long enough.”

The Cat completely covers my feelings: if you don’t know what you want, it does not matter which way you go. Most engineers are stuck with the problem of lack of definition. We want to get into cloud provisioning. We want to get to 20 minute deployments of servers. We need to be more like Amazon.

 

Let’s examine these statements a little:

We want to get into cloud provisioning.

  • Does this mean you want to use public cloud for servers?
  • Does this mean you want a web portal for provisioning servers?
  • What does this mean?  It’s like me telling you I want to enforce family values… We all want family values but every family has a different definition

We want to get to 20 minute deployments of servers.

  • What type of server do you want deployed in 20 minutes?  OS, Application, three-tier?
  • What does deployment mean?  Powered on?  Able to talk on the network?  Internet facing?

We need to be more like Amazon.

  • You want to deploy unsecured operating systems without backup quickly?
  • You want to have pay as you go for our customers?

 

I want to be fair and emphasize that all these statements are valid but lack definition, and that every one of these offerings has positive sides as well. What they lack is business definition. In almost every cloud situation the business wants IT to be more agile. They want processes to go more quickly. Universally they cannot understand why provisioning a server takes so long and is so complex. Honestly, neither can I. I have made a career out of complex servers and it has to stop. So before you start down some unknown path with the Cat, ask some critical business questions (no engineer likes them, but you need to ask them).

For example:

  • What pain point are we trying to solve with the cloud?
  • What specific expectations do you have for the cloud?
  • What is the timeline for the cloud?

If products come up during any of this conversation, know that it’s normal. Business people explain technology in terms of products (for example, “I want it to be like an iPad with Dropbox”). These statements are not locking you into a product; they are helping you define requirements. Ask questions about the product to help define requirements. It is critical that you translate their products into requirements and constraints. Once you have translated their needs into requirement statements, get them to sign off on those statements.

 

Where am I?

In almost all cloud deployments it’s really about adding automation to all aspects of the service.   This allows you to be more agile to change.   Before you can begin your transformation you need to define your starting point.

[Figure: Picture1]

Is your current environment like the above picture? Do you have many hands touching the configuration of your servers and applications? Have you provided some basic automation like server cloning or configuration management? This approach is common and is really an outgrowth of the virtualization era. Let me give an example of this process:

  • Server request is provided to server team
  • Server team clones an operating system
  • Firewall team does firewall rules
  • Server team deploys application
  • Developers deploy code
  • Security team reviews server and approves
  • Server team release to production

 

This process seems simple and should be easy. This is where the people problems start. The development team has a project. The server, firewall and security teams have tasks. They do their tasks without knowledge of the development team’s project, which means that bolts will not be where they are expected and in the end something will require rework. There is tons of room for human error. Each project built this way will be unique because people are executing the steps. It gets worse as you scale up. Assume that the normal firewall worker is out sick; now we have a stand-in who cannot do the job as well, and more errors and problems are introduced. So to review:

  • Each team treats a project as a task
  • Each team executes the tasks with different priorities causing delays
  • There is lots of room for human error and mistakes hurting the timeline
  • It does not scale; it is mostly human capital

The fun part is this process is pretty good.   At least they have a defined process.

Do you have a process and is it followed?

It’s simple: individuals have processes they naturally follow. We assume that other people think and act just like us, so naturally they will follow the same process, right? Wrong. Everyone is different and does it a little differently. Many IT shops have poorly defined processes, and even when processes are defined they are rarely followed. In order to make it into the cloud you have to define your manual processes. Get them on paper with the following details:

  • What information is required to work this process
  • What information is expected to return from this process
  • Who can work this process
  • What choices need to be made as part of this process
  • What happens if a process fails in an unexpected way
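
One way to make the checklist above concrete is to write each process down as data and check it for gaps. The following is only a sketch: the process name, field names, sample values and the Get-MissingProcessField helper are all hypothetical and not taken from any product.

```powershell
# Hypothetical process definition; every name and value here is illustrative.
$process = [ordered]@{
    Name      = 'Clone operating system'
    Inputs    = @('Server name', 'OS template', 'Network')      # information required
    Outputs   = @('IP address', 'Build log')                    # information returned
    Roles     = @('Server team')                                # who can work it
    Decisions = @('Which datastore has capacity?')              # choices to be made
    OnFailure = 'Stop and hand back to the requester with the error'
}

# Return the names of any checklist fields that are missing or empty.
function Get-MissingProcessField {
    param([System.Collections.IDictionary]$Definition)
    $required = 'Inputs', 'Outputs', 'Roles', 'Decisions', 'OnFailure'
    $required | Where-Object { -not $Definition[$_] }
}

$missing = @(Get-MissingProcessField -Definition $process)
"Missing fields: $($missing.Count)"
```

A definition that leaves out one of the five details shows up immediately, which is the same hole-finding you would otherwise do by having people practice the process.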

 

Does this sound like software development to anyone else? Well, it is. Welcome to the rest of your career as a software developer, or what I like to call a process engineer. Once you have defined the process, management needs to enforce the manual process to find out where it breaks… this is the hard part. You can write down a process… you can send people to training on the process, but you cannot make them drink. All manual processes will be slower and worse at first. Change is hard (that’s part two). You have to practice the process to find the holes. Here is a logical outline to define your process:

  • Have a subject matter expert define the process on paper (electronic or otherwise)
  • Have the SME train others on the process
  • Have management encourage others to do the process
  • Have people other than the SME do the process and report back problems
  • Improve the process until it works in all situations encountered

 

Does it seem simple? Yep, it is. Does it seem like common sense? Right again. I should change the names of everything to something like points or T-shirt sizes so I can sell it, but that’s not me. It is simple to understand and hard to implement.

Quiescing Backup causing BSOD in Windows OS’s on Current VMware Tools

Evening,

I got notified of this problem earlier today thanks to my awesome BCS engineer. You can read VMware’s week-old KB here. Essentially, certain versions of VMware Tools can cause a BSOD when a quiescing operation is done. This is a big problem for API-based backups, since they use this method whenever possible. There are three solutions provided by VMware at this time:

  • Disable quiescing
  • Do not select Quiescing guest file system when taking a snapshot
  • Downgrade the VMware Tools to previous version not affected

Good news… the latest version of VMware Tools in 6 and 5.5 is affected. I was notified that a few specific versions were affected. The KB does not specify them. You can map tools version numbers to releases via this table: http://packages.vmware.com/tools/versions

I have been told the following are affected:

  • 8399 – 8.6.15
  • 9231 – 9.0.15
  • 9355 – 9.4.11
  • 9216 – 9.10.0
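
The numeric codes in the list above appear to pack the dotted version into one integer as major × 1024 + minor × 32 + patch. That encoding is my assumption, inferred from how the first three pairs line up, so treat this as a sketch and check results against the VMware table. ConvertTo-ToolsVersionString is a hypothetical helper name.

```powershell
# Decode a VMware Tools internal version number into a dotted version string.
# Assumes the encoding major*1024 + minor*32 + patch (an inference, not
# something the KB states).
function ConvertTo-ToolsVersionString {
    param([int]$InternalVersion)
    $major = $InternalVersion -shr 10            # everything above the low 10 bits
    $minor = ($InternalVersion -shr 5) -band 31  # next 5 bits
    $patch = $InternalVersion -band 31           # low 5 bits
    '{0}.{1}.{2}' -f $major, $minor, $patch
}

8399, 9231, 9355, 9216 | ForEach-Object {
    '{0} -> {1}' -f $_, (ConvertTo-ToolsVersionString $_)
}
```

Under this assumption the first three codes decode to the versions listed, but 9216 decodes to 9.0.0, while 9.10.0 would encode as 9536. That last pair is worth confirming against the table before you act on it.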

 

Here is a PowerShell snippet to locate the machines that are affected. The power state is filtered before Get-View, because the view object does not expose a top-level PowerState property:

 $VMS = Get-VM | Where-Object { $_.PowerState -ne "PoweredOff" } | Get-View | Where-Object { $_.Config.Tools.ToolsVersion -in 8399, 9231, 9355, 9216 }

 

I am not sure if there is a scripted way to downgrade the tools. Here is VMware’s method for downloading older versions of the tools. As with all my articles, you should open a VMware ticket to get specific production assistance. Let me know if you know a scripted way to downgrade tools.

 

PowerCLI locate VM’s with multiwriter

Another snippet, this one to locate VM’s with multi-writer enabled:

#Create the array
$array = @()
$vms = Get-Cluster "ClusterName" | Get-VM

#Loop over each VM looking for multi-writer settings
foreach ($vm in $vms)
{
    #Multi-writer appears as an advanced setting whose value contains "multi-writer"
    $disks = Get-AdvancedSetting -Entity $vm | ? { $_.Value -like "*multi-writer*" }

    foreach ($disk in $disks){
        $REPORT = New-Object -TypeName PSObject
        $REPORT | Add-Member -type NoteProperty -name Name -Value $vm.Name
        $REPORT | Add-Member -type NoteProperty -name VMHost -Value $vm.VMHost
        $REPORT | Add-Member -type NoteProperty -name Mode -Value $disk.Name
        $REPORT | Add-Member -type NoteProperty -name Type -Value "MultiWriter"
        $array += $REPORT
    }
}

$array | Out-GridView

PowerCLI locate all the SCSI Bus Sharing VM’s

SCSI bus sharing is another thing that stops vMotion. Here is a snippet to locate all the VM’s in a cluster that use it:

#Create the array
$array = @()
$vms = Get-Cluster "ClusterName" | Get-VM

#Loop for BusSharingMode
foreach ($vm in $vms)
{
    #Keep only controllers that share the bus (physical or virtual)
    $disks = $vm | Get-ScsiController | Where-Object {$_.BusSharingMode -eq 'Physical' -or $_.BusSharingMode -eq 'Virtual'}

    foreach ($disk in $disks){
        $REPORT = New-Object -TypeName PSObject
        $REPORT | Add-Member -type NoteProperty -name Name -Value $vm.Name
        $REPORT | Add-Member -type NoteProperty -name VMHost -Value $vm.VMHost
        $REPORT | Add-Member -type NoteProperty -name Mode -Value $disk.BusSharingMode
        $REPORT | Add-Member -type NoteProperty -name Type -Value "BusSharing"
        $array += $REPORT
    }
}

$array | Out-GridView

PowerCLI How to locate all RDM’s in a cluster

I love RDM’s. They were a royal pain for manageability until vSphere 6 (you can vMotion RDM’s in 6). Here is a snippet that will allow you to locate all RDM’s in a cluster:

#Create the array
$array = @()
$vms = Get-Cluster "ClusterName" | Get-VM

#Loop over each VM looking for raw device mappings
foreach ($vm in $vms)
{
    #Both physical and virtual compatibility mode RDMs
    $disks = $vm | Get-HardDisk -DiskType "RawPhysical","RawVirtual"

    foreach ($disk in $disks){
        $REPORT = New-Object -TypeName PSObject
        $REPORT | Add-Member -type NoteProperty -name Name -Value $vm.Name
        $REPORT | Add-Member -type NoteProperty -name VMHost -Value $vm.VMHost
        $REPORT | Add-Member -type NoteProperty -name Mode -Value $disk.DiskType
        $REPORT | Add-Member -type NoteProperty -name Type -Value "RDM"
        $array += $REPORT
    }
}

$array | Out-GridView
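
The three snippets above all build their report rows the same way, one Add-Member call per column. On PowerShell 3.0 or later the same row can be built in one expression with [pscustomobject]. This is a sketch of that alternative rather than a change to the scripts: New-ReportRow is a hypothetical helper, and the sample values stand in for what Get-VM and friends would return, so it runs without PowerCLI.

```powershell
# Build one report row with the same four columns the scripts above use.
function New-ReportRow {
    param(
        [string]$Name,
        [string]$VMHost,
        [string]$Mode,
        [string]$Type
    )
    [pscustomobject]@{
        Name   = $Name
        VMHost = $VMHost
        Mode   = $Mode
        Type   = $Type
    }
}

# Made-up sample data in place of real cluster output.
$rows = @(
    New-ReportRow -Name 'vm01' -VMHost 'esx01' -Mode 'RawPhysical' -Type 'RDM'
    New-ReportRow -Name 'vm02' -VMHost 'esx02' -Mode 'Virtual'     -Type 'BusSharing'
)

$rows | Format-Table -AutoSize
```

Because every row has the same columns, the output of all three checks could be collected into one array and sent to Out-GridView together.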

What is a Server Architect?

When I started my career I wanted to work with computers. I knew that being a programmer was not for me; I liked to play with hardware and the big picture. So I dabbled in PC support and quickly learned that I did not like being reactive. Some jobs are mostly reactive, for example a firefighter. They train and prep, but most of their job is waiting for an emergency so they can react. It’s impossible to be 100% proactive as a firefighter. They have safety prevention and work to limit the effects of fire on the loss of life, but in the end they are still waiting for a fire. PC support was the same model. You wait for someone to break something, then you fix it. I have seen some really great PC support teams that are very proactive with training and locking down the PC. At the end of the day you are still waiting to react. I wanted to be more proactive, resolving problems before they force a reaction. I went into systems administration convinced that computers will do what I tell them and that I can enforce better outcomes. I spent a number of years focused on Linux-based servers, working to create a very well-managed solution that would allow us to not be reactive. I felt very successful in this journey, to the point that I became bored and started looking for new challenges. Faced with many years left in my career, I needed the next step. That next step seemed to be Systems Architect.

What is an Architect?

In order to define an architect we should look to people who hold the title outside computers.   Building architects are the easiest.   A home architect takes into account many factors and produces a physical design for the builder to follow.    Some of the factors a building architect has to consider are the following:

  • Building code
  • City regulations
  • Lot size
  • Available funds
  • Customer requests and needs
  • Best practices

 

Each of these things can really be categorized into two columns:

  • Requirements – Things that must happen
  • Constraints – Things that limit or control what must happen

 

For example:

  • Building code – Constraints
  • City regulations – Constraints
  • Lot size – Constraints
  • Available funds – Constraints
  • Customer requests and needs – Requirements
  • Best practices – Things to keep in mind

 

Notice how best practices are not requirements or constraints. It may be best practice to have a bathroom on the main floor, but it’s not a requirement. In IT this is true as well. Systems Architects take information from the customer, their personal knowledge, and the constraints and form a solution. Each solution should represent the requirements and constraints of the project. An architect should understand building practices but does not have to be a practiced builder. They need to understand the inner workings and requirements of each design choice. For example, if I put down laminate flooring I need an underlay to reduce noise. An architect should be the master of proactive administration, looking to reduce risk in a design and meet customer needs. Each systems architect needs a methodology to ensure they don’t miss critical steps in the process. In systems I like to use the conceptual, logical and physical design model. An architect does not form the perfect solution. They form the solution that meets the customer’s needs with an eye to the elements of design.

Elements of Design

Early in my career I struggled to understand the elements of design. What critical thinking should I use to make sure my architecture will work well? VMware introduced me to the elements of design, which mirrored my own really well. I use the term RAMPS to remember them:

  • Recoverability – How do you recover the design from a failure? What are the recovery requirements?
  • Availability – How do you ensure availability of the solution? What options do you have?
  • Manageability – Is the design too complex and costly to manage? How do you manage it?
  • Performance – Does the design meet performance needs and take growth into account?
  • Security – Does the design meet security needs and requirements?

 

I would like to illustrate the elements of design with a simple scenario. The customer wants to deploy a web server running Drupal for a new brand site. The following questions might help you figure out the requirements and constraints while ensuring the solution meets RAMPS.

Recoverability:

  • What is the expected RTO? (Recovery time objective: how long to get it back into service after a full failure)
  • What is the expected RPO? (Recovery point objective: how much data is OK to lose in a failure scenario)
  • How do you expect to back up the application, database and user-generated data?

Availability:

  • Are there off hours for the application?
  • How much planned downtime is acceptable for the application per month?
  • What is the cost per minute of unplanned downtime?
  • Do you have an SLA (Service level agreement or objective) with your customers?
  • Who are your customers and where will they be accessing the application from?
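
It can help the availability conversation to turn a percentage into minutes. The sketch below assumes a 30-day month and treats all downtime as counting against the target. Both are simplifications of mine, and Get-DowntimeBudgetMinutes is a hypothetical helper name.

```powershell
# Convert an availability target (percent) into allowed downtime per month.
# Assumes a 30-day month unless told otherwise.
function Get-DowntimeBudgetMinutes {
    param(
        [double]$AvailabilityPercent,
        [int]$DaysInMonth = 30
    )
    $minutesInMonth = $DaysInMonth * 24 * 60
    [math]::Round($minutesInMonth * (100 - $AvailabilityPercent) / 100, 2)
}

99, 99.9, 99.99 | ForEach-Object {
    '{0}% -> {1} minutes per month' -f $_, (Get-DowntimeBudgetMinutes -AvailabilityPercent $_)
}
```

A 99.9% target leaves roughly 43 minutes a month. Multiply the budget by the customer's cost per minute of downtime and you have a number the business can weigh against the cost of the design.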

Manageability:

  • How do you expect to make changes to the application?
  • What are the roles involved in this project? (Form a RACI)
  • How often do you expect the content to change?
  • Are there any unique requirements around the application that we need to know?

Performance:

  • How many concurrent users do you expect?
  • How large is the application?  Do you have any test metrics or data to show usage patterns or expectations?
  • What is reasonable response time from the application?
  • Any unique performance requirements?
  • How much network bandwidth do you expect the solution to use?

Security:

  • Does your application require a login? Where are the credentials kept?
  • What type of data is stored in your application? Is it sensitive?
  • What is the cost of a data breach on this application?
  • Are there any security policies from the organization that should be taken into account?

 

A lot of these questions will yield no answer or unknowns. The performance metrics are a particularly sticky question. This is where our friend the assumption comes to town. When you don’t know, write down an assumption so people understand what you designed to when information was lacking. For example, the customer may share that they have no idea how many concurrent users will use the application. You should make an educated guess about the number, get the customer to sign off, and move forward.

 

So what is an Architect?

So what really is an architect? It’s someone who attempts to meld best practices with customer requirements to form a usable solution. A systems architect has to take into account all types of things like:

  • Interconnections between logical and physical elements
  • Building space
  • Capacity
  • Logical architecture of the solution
  • Cost
  • Power
  • Best practices
  • Current practices that are constraints
  • etc…

It’s a fun job that changes each day.  If you do it correctly you should be reducing the reactive nature of your systems architecture.   You need to plan, document, study then plan again.   It’s a detail job that requires lots of thought but mostly lots of reworking and negotiation.

Negotiation?

Yep, in order to architect you need a customer. When building a house, everyone wants a huge house with gold walls. You need to manage expectations to successfully complete the solution. You have to negotiate. The first rule of negotiation is simple: every answer is a “yes, however”. The customer can have anything they want; as an architect you have to help them understand the impact. Every choice has an impact, just like every action has a reaction. If you want gold walls, the cost will be impacted. If you want no bathroom on the first floor, expect to be an expert stair climber. Being an architect is as much about people skills as technical skills.