Pernix Data 30 days later

I have been interested in PernixData since its initial release. The idea of using flash to accelerate storage is not new to me. Anyone who reads my blog-based rants has found that I am a huge supporter of larger caches on storage arrays. I have always found that having more cache makes up for most speed issues on the drives. My thinking is simple: if 90% of my storage reads and writes come from cache, I run at near the speed of the cache. Spend your money on a larger cache instead of faster spinning disks and performance improves. Almost every storage array vendor has been using SSDs to speed up arrays for the last four years. They all suffer from the same problem: they treat all I/Os equally, without any knowledge of the workload.
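
To put rough numbers on that 90% idea, here is a minimal PowerShell sketch of the usual weighted-average estimate. The function name is made up for this post, and the latency figures (0.2 ms for a cache hit, 10 ms for a trip to spinning disk) are illustrative assumptions, not measurements from any array.

```powershell
# Rough estimate only: average latency as a weighted mix of cache hits and disk misses.
# The latency values used below are illustrative assumptions, not measurements.
function Get-EffectiveLatencyMs {
    param(
        [double]$HitRatio,        # fraction of I/O served from cache, 0.0 to 1.0
        [double]$CacheLatencyMs,  # assumed latency of a cache hit
        [double]$DiskLatencyMs    # assumed latency of a trip to spinning disk
    )
    ($HitRatio * $CacheLatencyMs) + ((1 - $HitRatio) * $DiskLatencyMs)
}

# 90% of I/O from cache, assuming 0.2 ms cache hits and 10 ms disk misses
$estimate = Get-EffectiveLatencyMs -HitRatio 0.9 -CacheLatencyMs 0.2 -DiskLatencyMs 10
"Estimated average latency: $estimate ms"
```

With those assumed numbers the average works out to about 1.18 ms, versus 10 ms with no cache at all. Notice that the 10% of misses still make up most of that average, which is why a bigger cache and a higher hit ratio matter so much.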

The Netflix problem

The best way to explain this problem is with Netflix. They have implemented a system where you rate a show with stars. It then compares your ratings against everyone else’s and locates users with ratings similar to yours. Once located, it uses those users’ ratings to find new shows for you. This is great… assuming you have 100% the same taste in shows as that user. The system has advanced a lot in the last five years and is now a much more accurate and complex algorithm. It’s pretty accurate for me except for one problem… my wife and kids share the same Netflix account and my children love to rate everything. This produces the world’s worst set of recommendations… I get little-girl TV shows mixed with Downton Abbey and sci-fi movies. It’s a mess… Netflix literally has no idea what shows to recommend to me. The same problem exists for storage arrays with cache. Choosing which data should be cached for reads is hard because there are lots of different workloads competing for the cache. I don’t want to devalue the algorithms used by storage vendors; much like Netflix’s, they are an evolving work of art. But with everyone profiled into one mass, everyone’s performance suffers. Netflix understood this problem and created user profiles to solve it. They added simple versions of localized intelligence to the process. These pockets of intelligent ratings are used to provide recommendations for the local needs.

Pernix is the intelligent user profile

Pernix is just like Netflix user profiles: it’s installed locally on each ESXi server. It caches for that ESXi host (and replicates writes for others). It can be configured to cache everything on a host, a datastore or a virtual machine. It provides the following features:

  • The only local SSD write cache that I know of outside hyper-converged solutions
  • Local SSD read cache
  • Great management interface
  • Metrics on usage
  • Replication of writes to multiple SSDs for data protection

 

Pernix is built for vSphere

Pernix installs as a VIB into the kernel and does not require a reboot. It has a Web Client interface and a C# client interface. It does require a Windows server and SQL Server for reporting. It is quick and easy to install and operate. The cache can be SSDs, or memory for pure speed. Pernix works only in vSphere, so it’s 100% customized for vSphere.

Lab

My local Pernix SEs were kind enough to provide me a download and license for PernixData. My home lab has been documented on this blog before, but the current solution is three HP nodes with 32GB of RAM each, as shown below:

[Image: node1]

I added a 120GB SanDisk SSD to each node for this test. My storage ‘array’ is an older Synology NAS with two mirrored 2TB 7,200 RPM disks, served via iSCSI and NFS. My rough math says I should be getting about 80 IOPS total from this setup, which really sucks, but oddly it has always worked for me. I didn’t have any desire to create artificial workloads for my tests; I just wanted to see how Pernix accelerated my everyday workload. All of these tests were done on vSphere 5.5 U2.
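
For anyone wondering where a number like 80 IOPS comes from, the usual back-of-the-napkin estimate for a single spinning disk is one second divided by the average seek time plus the average rotational latency. A minimal sketch follows. The function name is hypothetical and the 8.5 ms seek time is an assumed typical value for a 7,200 RPM drive, not something measured on this NAS. Since the two disks are mirrored, every write has to land on both, so the pair is treated as roughly one disk’s worth.

```powershell
# Back-of-the-napkin IOPS estimate for one spinning disk.
# AvgSeekMs is an assumption; check the datasheet for a real drive.
function Get-DiskIopsEstimate {
    param(
        [int]$Rpm,
        [double]$AvgSeekMs
    )
    # Average rotational latency is half a revolution, in milliseconds
    $rotationalMs = 0.5 * (60000 / $Rpm)
    # One second (1000 ms) divided by the time for one average I/O
    1000 / ($AvgSeekMs + $rotationalMs)
}

$iops = Get-DiskIopsEstimate -Rpm 7200 -AvgSeekMs 8.5
"Estimated IOPS: {0}" -f [math]::Round($iops)
```

That lands at roughly 79 IOPS, in line with the rough math above.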

Pernix Look and feel

Pernix provides a simple and powerful user interface.  I really like the experience even in the web client.   They use pictures to quickly show you where problems exist.

[Image: pernix1]

 

As you can see, lots of data is presented in a great graphical interface. They also provide performance charts on every resource using Pernix. Without reading any manual other than the Pernix quick start guide, I was able to install their solution in 15 minutes and have it caching my whole environment. It was awesome.

How do we determine storage performance?

This is a constant question. Every vendor has a different metric they want to use to show why their solution is better. If it’s a Fibre Channel array they want to talk about latency, then IOPS. If it’s an all-flash NAS it’s IOPS, then latency. So we will use these two metrics for the tests:

  • Latency – time it takes to commit a write or get a read
  • IOPS – input/output operations per second

I wanted to avoid using Pernix’s awesome graphs for my tests so I chose to use vRealize Operations to provide all recorded metrics.

 

Baseline

The VM that gives my environment the biggest storage workout is vRealize Log Insight. It has been known to record 300 IOPS in this environment. Generating IOPS is easy: just click around the prebuilt dashboards in the interface with the time slider set to all time. Read IOPS fly up like crazy. So my baseline numbers before Pernix are as follows:

  • Max IOPS: 350
  • Max Latency: 19 ms
  • Average Latency: 4 ms

 

Now with Pernix

I set up Pernix to cache all virtual machines in my datacenter. With Pernix in place I clicked around on multiple days and performed lots of random searches. I loaded down a SQL Server with lots of garbage inserts to create writes. Nothing perfectly scientific with control groups; I just wanted to kick the tires. After a month with Pernix I got the following metrics:

  • Max IOPS: 4,000
  • Max Latency: 14 ms
  • Average Latency: 1.2 ms

 

So the results clearly show a massive increase in IOPS. Some may say, sure, you are using SSDs for the first time, which is true. But the gain is not just raw SSD speed: latency is greatly improved as well, which reflects the cache being local to the host. Imagine using enterprise-worthy SSDs with much larger capacity. So will Pernix improve storage performance? The simple answer is it depends, but there is a very good chance.
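
Here is the before and after comparison worked out from the two sets of numbers above. It is plain arithmetic on the vRealize Operations figures already listed; the variable names are just for illustration.

```powershell
# Figures taken from the Baseline and "Now with Pernix" lists in this post
$baseline = @{ MaxIops = 350;  MaxLatencyMs = 19; AvgLatencyMs = 4 }
$pernix   = @{ MaxIops = 4000; MaxLatencyMs = 14; AvgLatencyMs = 1.2 }

# How many times higher the peak IOPS is
$iopsGain = $pernix.MaxIops / $baseline.MaxIops

# Percentage reduction in latency (positive means lower latency)
$avgLatencyDropPct = 100 * (1 - ($pernix.AvgLatencyMs / $baseline.AvgLatencyMs))
$maxLatencyDropPct = 100 * (1 - ($pernix.MaxLatencyMs / $baseline.MaxLatencyMs))

"Peak IOPS gain: {0:N1}x" -f $iopsGain
"Average latency drop: {0:N0}%" -f $avgLatencyDropPct
"Max latency drop: {0:N0}%" -f $maxLatencyDropPct
```

That is roughly an 11x jump in peak IOPS, a 70% drop in average latency and about a 26% drop in max latency.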

Use Cases

With my home lab hat removed I need to talk about some enterprise use cases:

  • Any environment or workload where you need to reduce latency
  • Any environment where a workload needs more IOPS than most solutions can deliver

Both of these use cases make sense where high latency or a lack of IOPS has a direct cost. Pernix can be used as a general speed enhancer on some slower environments or to improve legacy arrays. It does push toward a scale-up approach to clustering: Pernix is licensed per node, so larger cluster nodes with larger SSDs will cost less than lots of small nodes. Putting in larger nodes does have a big impact on failure domains, which should be taken into account.

My only Gripe

My only gripe with Pernix is the cost. Compared to large storage arrays it is really cheap. The problem is budgets… when I need more storage performance, it is the storage team that buys more storage arrays, not the compute team. Getting that budget transferred is hard because storage budgets are thin already. This will change: hyper-converged is becoming widely accepted and Pernix will really shine in that world. Pernix just released the read cache for free, making it a very tempting product. They are a smart company with a great product. They are on the right path, bringing storage performance as close to the workload as possible with an added element of intelligence.

VMworld Tips for 2015

I have had the pleasure of attending a number of VMworld events including two in San Francisco.   I figured since we are about a month out it would be time to post some helpful tips I have learned along the way.  Please feel free to comment with additional tips.

Food

The food at this venue is universally hated. It personally does not agree with me. It’s not VMware’s fault; the food is provided by the venue. I think there are two factors that contribute to my dislike of the food:

  • There is too much food (snacks every two hours with drinks plus two meals)
  • The food is not the kind of food I would eat every day

I am a person who eats the same thing for breakfast and lunch every day… all the options really don’t agree with me. So my advice is to pace yourself. Don’t overeat the convention food. San Francisco has tons of really good restaurants, so you might be tempted to skip the meal situation 100%. This is a major mistake… meal time is the best peer networking you will be able to do. The tables are full of hundreds of people with the same problems as you. I started eating before the meal and just going for the networking. I highly recommend not skipping the meals, for the networking alone.

If you are looking for food options similar to what you might eat at home try the following (see map for locations):

  • Mel’s Drive-In (Blue) ($12 per plate) – it’s a good American food location
  • Whole Foods (Green) – They are a market but include a food bar with lots of hot and cold options ready to eat
  • Target (Brown) – It’s a Target store downtown which has a lot of food options and the cheapest prices in the city as far as I can tell (it’s also right across the street from the convention). Bring your own bags; they will not provide plastic bags (city rules)
  • Metreon – It’s like a mall with a movie theater across the street from the convention. They have a food court with mostly Asian options on the first floor
  • Westfield San Francisco Centre (Purple) – it’s a real mall with a movie theater and lots of food options

[Image: map of the food locations listed above]

 

Drinks

I don’t drink any type of alcohol so I don’t know any of the good spots to drink. As for non-alcoholic drinks, your conference attendance normally comes with a backpack and water bottle. The venue does not provide any bottled water. They have cooled water available everywhere but no bottles. They offer coffee and tea all day long and soft drinks in the afternoon. I highly recommend that you drink lots of water… you’re out of town and want to flush with water.

Hotels

You booked early and got a close hotel, right? My first year I didn’t and was stuck at Fisherman’s Wharf, which was a lot of fun… it’s a tourist trap with lots of seafood. It was fun… other than the 20 minute bus ride to the convention each day, or the 45 minute walk. If you booked a convention hotel they have shuttles for free. Just catch them outside the West building. At this point I hope you have a hotel.

Dress

You will see just about every type of dress code. The CIOs and VMware brass will be wearing sport coats and dress shoes. Duncan Epping will be wearing a T-shirt and Levi’s. It really just depends on what you want to wear. I do recommend you wear comfortable shoes (I always wear tennis shoes) because you will do a ton of walking and standing. You really want the comfortable shoes.

Attitude

This is not a party convention during the day. Respectful attitudes and conversation are a good idea. You will be knee-deep in vendors, customers and partners, all with their own agendas. Be respectful of everyone… no one enjoys having their product thrown under the bus by a random passerby.

Parties

There are at least 3 parties each night. Talk to your vendors before you go to get invites. The official VMworld party is different each year but normally is drinks, food, music and games / sports. It is loud, if you like that sort of thing. I understand it will be at the baseball stadium again this year… which was awesome in the past. Last time they had stadium food (all you can eat… don’t eat that much), carnival games and of course the band. It was a lot of fun. Don’t kill yourself with parties… it will remove your ability to learn from the sessions.

Transport

The forms of transport available are:

  • BART (metro train system) – They have an app for smartphones that rocks. It’s very inexpensive and safe to ride. It’s a great cheap way to get from the airport to your hotel. If you plan on riding BART get the app. You can also see routes here: https://www.bart.gov/schedules/bystation
  • Cabs – Yep they are available and a pain to hail… you stand by the side and stick out your hand… make sure they turn on the meter…
  • Uber – Yep it’s available and works great, just download the app and set it up before you need a ride
  • Cable cars – the classic cable cars are more a tourist attraction than a method of transport. If you really want to ride one (for example Market Street to Fisherman’s Wharf) just walk past the first stop to the next stop on the line and you will avoid all the crowds at the first stop
  • Cars – don’t rent a car… parking is really expensive
  • Bikes – you can rent bikes everywhere
  • Walking – San Francisco is a very walking-friendly city. The areas near the convention are pretty safe, though I would avoid finding yourself alone on a dark street in parts of SF.

VMworld sessions

The first instinct is to attend everything. Bad idea. Do not schedule back-to-back sessions… most will require walking between buildings. The sessions are draining; on a good day you will only get to four. Remember that most major sessions are recorded and available afterwards so you can rewatch. This year there is a new type of session called quick talk (30 minutes). I think these sessions will be really awesome and look forward to attending a lot of them.

Vendor Hall

The vendor hall is awesome. It’s a great place to meet vendors and understand their wares on the surface. It’s my experience that most vendors don’t bring their highly technical people to VMworld. It’s mostly sales people. It’s also too noisy for any real technical conversation. I have my best conversations with the small new vendors. If it’s swag you want then this is the place. Every kind of swag will be available. Near the end of the conference they have an awards show where VMware awards new vendors and products (titles like most innovative new product, etc.). Watch for this list/presentation and visit these vendors.

Training

VMware offers discounted training the week before VMworld if you want to attend.

Certification

VMware certifications are normally 50-75% off so it’s a great time to get some certifications done.

Hands on Labs

This is a sight to see… VMware provides on-site labs for hundreds of people. Most run in vCloud Air and are awesome. I personally skip this event, since all the labs are available two months later to be done in your own home for free. If you are feeling tired and want some computer time it’s a good place to hide (I prefer the movie theater). For last year’s labs go to hol.vmware.com

What to bring

It’s hard to determine what to carry for the conference.  I would carry as little as possible.   I suggest you bring the following:

  • Notepad – yep I am old school (replace with iPad or tablet of choice if you prefer)
  • Backpack – you will get stuff to carry
  • Business cards – yes, bring lots with you, it’s an awesome networking experience
  • Twitter – it’s used a ton, so get a Twitter account to track activities and goings on
  • Extra phone batteries – I use my phone a ton in between sessions and with the app… you want a small phone battery add-on

Outside the conference

The city is awesome. Fisherman’s Wharf and Chinatown are great. The redwoods north of town also rock. Honestly I could walk around this city for years, it’s a ton of fun.

 

I hope to see you there this year. Feel free to follow me on Twitter @Gortees

VMware Home Lab 2015

Well it’s about time I post about my home lab, 2015 edition. Last year I was running a happy home lab on HP workstations using Opteron 2xxx processors. This year VMware released vSphere 6 and I lost support for the Opteron 2xxx processors. So I had to start over again. I am a very cheap man so I want the cheapest thing that will reasonably work. This time I wanted to get three nodes for possible vSAN, so here is my home lab.

Compute

After pricing out all kinds of stuff I stuck with the HP workstations. They are normally server-class hardware with great life left and lightly used. They also don’t sound like a jet engine when running. The HP version of ESXi installs out of the box without any problems and does not require any modification, which is a huge win. I wanted to make sure the processor was on the HCL (I don’t care about the server being on the HCL because I am cheap) so I browsed options and found the E5404 was on the HCL for ESXi 6. Searching eBay I found some great deals on HP xw6600 workstations with a single quad-core E5404 processor. Normally they ship with two Broadcom 1Gb NICs and 4GB of RAM. They max out at 32GB of RAM with a single processor. The only real downside is you only get four cores, but it’s $79.99 each right now. Add 32GB of RAM for $88.99 and you get 4 cores and 32GB of RAM for about $169 per node. This forms a very solid cluster with three nodes.
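
The per-node math as a quick sketch, using the two eBay prices quoted above (prices will obviously move around, so treat the inputs as a snapshot). The variable names are just for illustration.

```powershell
# Prices quoted in this post; the "d" suffix makes them decimals so the cents add up exactly
$workstation = 79.99d   # HP xw6600 with one quad-core E5404
$ram         = 88.99d   # 32GB of RAM
$nodes       = 3

$perNode = $workstation + $ram
$cluster = $perNode * $nodes

# Each node contributes 4 cores and 32GB of RAM
"Per node: {0:N2}, cluster of {1}: {2:N2}, total cores: {3}, total RAM: {4}GB" -f $perNode, $nodes, $cluster, (4 * $nodes), (32 * $nodes)
```

That comes to 168.98 per node and 506.94 for the three-node cluster, which buys 12 cores and 96GB of RAM in total.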

 

Storage

Shared storage is critical to a good cluster.  You have a few options:

  1. VMware vSAN
  2. Additional PC providing NFS or iSCSI
  3. Some type of NAS (Synology)

 

vSAN requires at least one SSD and one spinning disk per node.   So you are looking at roughly $300 total to go this way.   It also requires a license from VMware.

 

Additional PC providing NFS or iSCSI – use FreeNAS or something similar. It’s cheap and easy if you have hard drives and PCs sitting around.

 

Synology – These sell for cheap on eBay and rock so much. I love them… you can buy one with drives on eBay for around $150 if you are lucky.

 

I personally have vSAN and a Synology NAS in place.

 

Networking

You really need a managed gigabit switch for a VMware implementation. Buy one for $120 or so or find one on eBay.

 

Licenses

Here is the sticky bit… you need licenses. There are two low-cost options available at this time:

  • VMware vExpert – This program provides great access to licenses and is not hard to get into at this point
  • VMUG Advantage program – for $200 a year you get licenses and lots of other benefits

 

Let me know if you have any questions.

Journey to an Automated Cloud Part 1

Are you ready to automate everything? Does your boss want some of that cloud? Well, everywhere I turn people want to get into the cloud. They all want a vendor product to provide the cloud. Every vendor show I go to has hundreds of products to solve that problem. In my experience it is not a product problem that limits our journey to the cloud. In this series of articles I will explore some of my thoughts on your journey to the cloud.

 

Part 1 – Where am I and what do I want?

My thoughts for this part are best explained by an exchange from Alice in Wonderland:

 

“Would you tell me, please, which way I ought to go from here?”
“That depends a good deal on where you want to get to,” said the Cat.
“I don’t much care where–” said Alice.
“Then it doesn’t matter which way you go,” said the Cat.
“–so long as I get SOMEWHERE,” Alice added as an explanation.
“Oh, you’re sure to do that,” said the Cat, “if you only walk long enough.”

The Cat completely covers my feelings: if you don’t know what you want, it does not matter which way you go. Most engineers are stuck with a lack of definition. We want to get into cloud provisioning. We want to get to 20 minute deployments of servers. We need to be more like Amazon.

 

Let’s examine these statements a little:

We want to get into cloud provisioning.

  • Does this mean you want to use public cloud for servers?
  • Does this mean you want a web portal for provisioning servers?
  • What does this mean?  It’s like me telling you I want to enforce family values… We all want family values but every family has a different definition

We want to get to 20 minute deployments of servers.

  • What type of server do you want deployed in 20 minutes?  OS, Application, three-tier?
  • What does deployment mean?  Powered on?  Able to talk on the network?  Internet facing?

We need to be more like Amazon.

  • You want to deploy unsecured operating systems without backup quickly?
  • You want to have pay as you go for our customers?

 

I want to be fair and emphasize that all these statements are valid but lack definition, and that every one of these offerings has positive sides as well. What they lack is business definition. In almost every cloud situation the business wants IT to be more agile. They want processes to go more quickly. Universally they cannot understand why provisioning a server takes so long and is so complex. Honestly neither can I. I have made a career out of complex servers and it has to stop. So before you start down some unknown paths with the Cat, ask some critical business questions (no engineer likes them but you need to ask them).

For example:

  • What pain point are we trying to solve with the cloud
  • What specific expectations do you have for the cloud
  • What is the timeline for the cloud

If products come up during any of this conversation, know that it’s normal. Business people explain technology in terms of products (for example, “I want it to be like an iPad with Dropbox”). These statements are not locking you into a product; they are helping you define requirements. Ask questions about the product to help define requirements. It is critical that you translate their products into requirements and constraints. Once you have translated their needs into requirements statements, get them to sign off on them.

 

Where am I?

In almost all cloud deployments it’s really about adding automation to all aspects of the service.   This allows you to be more agile to change.   Before you can begin your transformation you need to define your starting point.

[Image: Picture1]

Is your current environment like the above picture? Do you have many hands touching the configuration of your servers and applications? Have you provided some basic automation like server cloning or configuration management? This approach is common and really an outgrowth of the virtualization era. Let me give an example of this process:

  • Server request is provided to server team
  • Server team clones an operating system
  • Firewall team does firewall rules
  • Server team deploys application
  • Developers deploy code
  • Security team reviews server and approves
  • Server team releases to production

 

This process seems simple and should be easy. This is where the people problems start. The development team has a project. The server, firewall and security teams have tasks. They do their tasks without knowledge of the development team’s project, which means that bolts will not be where they are expected and in the end something will require rework. There is tons of room for human error. Each project built this way will be unique because people are executing the steps. It gets worse as you scale up. Assume that the normal firewall worker is out sick; now we have a stand-in who cannot do the job as well. More errors and problems are introduced. So to review:

  • Each team treats a project as a task
  • Each team executes the tasks with different priorities causing delays
  • There is lots of room for human error and mistakes hurting the timeline
  • It does not scale; it’s mostly human capital

The fun part is this process is pretty good.   At least they have a defined process.

Do you have a process and is it followed?

It’s simple: individuals have processes they naturally follow. We assume that other people think and act just like us, so naturally they will follow the same process, right? Wrong. Everyone is different and does it a little differently. Many IT shops have poorly defined processes, and even when processes are defined they are rarely followed. In order to make it into the cloud you have to define your manual processes. Get them on paper with the following details:

  • What information is required to work this process
  • What information is expected to return from this process
  • Who can work this process
  • What choices need to be made as part of this process
  • What happens if a process fails in an unexpected way
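
One way to capture those details is as structured data rather than free-form notes, so nothing on the list gets skipped. Below is a hypothetical sketch for the “server team clones an operating system” step from the earlier example. Every field name and value here is made up for illustration.

```powershell
# Hypothetical process definition; field names and values are illustrative only
$process = [ordered]@{
    Name      = 'Clone operating system'
    Inputs    = @('Server name', 'Operating system', 'Network')      # information required
    Outputs   = @('Hostname', 'IP address')                          # information returned
    Owners    = @('Server team')                                     # who can work the process
    Choices   = @('Which template to clone')                         # decisions to be made
    OnFailure = 'Stop, record the error and notify the requester'    # unexpected failure
}

# Flag any detail that was left empty before the process is handed to others
$required = 'Inputs', 'Outputs', 'Owners', 'Choices', 'OnFailure'
$missing  = @($required | Where-Object { -not $process[$_] })

if ($missing.Count -gt 0) {
    "Process '$($process.Name)' is missing: $($missing -join ', ')"
} else {
    "Process '$($process.Name)' is fully defined"
}
```

The check at the end is the useful part: a process that cannot fill in every field is not ready to be handed to someone other than the person who wrote it.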

 

Does this sound like software development to anyone else? Well, it is. Welcome to the rest of your career as a software developer, or what I like to call a process engineer. Once you have defined the process, management needs to enforce the manual process to find out where it breaks… this is the hard part. You can write down a process… you can send people to training on the process, but you cannot make them drink. All manual processes will be slower and worse at first. Change is hard (that’s part two). You have to practice the process to find the holes. Here is a logical outline to define your process:

  • Have a subject matter expert define the process on paper (electronic or otherwise)
  • Have the SME train others on the process
  • Have management encourage others to do the process
  • Have people other than SME do the process and report back problems
  • Improve the process until it works in all situations encountered

 

Does it seem simple? Yep, it is. Does it seem like common sense? Right again. I should change the names of everything to something like points or T-shirt sizes so I can sell it, but that’s not me. It is simple to understand and hard to implement.

Quiescing Backup causing BSOD in Windows OS’s on Current VMware Tools

Evening,

I got notified of this problem earlier today thanks to my awesome BCS engineer. You can read VMware’s week-old KB here. Essentially, certain versions of Tools can cause a BSOD when a quiescing operation is done. This is a big problem for API-based backups since they use this method whenever possible. There are three solutions provided by VMware at this time:

  • Disable quiescing
  • Do not select Quiescing guest file system when taking a snapshot
  • Downgrade the VMware Tools to previous version not affected

Good news… the latest version of VMware Tools in 6 and 5.5 is affected. I was notified that a few specific versions were affected. The KB does not specify them. You can map versions to build numbers via this table: http://packages.vmware.com/tools/versions

I have been told the following are affected:

  • 8399 – 8.6.15
  • 9231 – 9.0.15
  • 9355 – 9.4.11
  • 9216 – 9.10.0

 

Here is a PowerShell snippet to locate the machines that are affected:

 $affected = 8399, 9231, 9355, 9216   # affected Tools build numbers listed above
 $VMS = Get-VM | Get-View | Where-Object { $_.Runtime.PowerState -ne "poweredOff" -and $affected -contains $_.Config.Tools.ToolsVersion }

 

I am not sure if there is a scripted way to downgrade the Tools. Here is VMware’s method to download older Tools versions. As with all my articles, you should open a VMware ticket to get specific production assistance. Let me know if you know a scripted way to downgrade Tools.

 

PowerCLI locate VMs with multi-writer

Another snippet to locate VMs with multi-writer enabled:

# Collect one report row per VM advanced setting that enables multi-writer
$array = @()
$vms = Get-Cluster "ClusterName" | Get-VM

foreach ($vm in $vms) {
    # Multi-writer shows up as an advanced setting whose value contains "multi-writer"
    $disks = Get-AdvancedSetting -Entity $vm | Where-Object { $_.Value -like "*multi-writer*" }

    foreach ($disk in $disks) {
        $report = New-Object -TypeName PSObject
        $report | Add-Member -MemberType NoteProperty -Name Name -Value $vm.Name
        $report | Add-Member -MemberType NoteProperty -Name VMHost -Value $vm.VMHost
        $report | Add-Member -MemberType NoteProperty -Name Mode -Value $disk.Name
        $report | Add-Member -MemberType NoteProperty -Name Type -Value "MultiWriter"
        $array += $report
    }
}

$array | Out-GridView

PowerCLI locate all the SCSI Bus Sharing VMs

More things that stop vMotion, like SCSI bus sharing. Here is a snippet to locate all of them in a cluster:

# Collect one report row per SCSI controller that has bus sharing enabled
$array = @()
$vms = Get-Cluster "ClusterName" | Get-VM

foreach ($vm in $vms) {
    # Keep only controllers set to Physical or Virtual bus sharing
    $controllers = $vm | Get-ScsiController | Where-Object { $_.BusSharingMode -eq 'Physical' -or $_.BusSharingMode -eq 'Virtual' }

    foreach ($controller in $controllers) {
        $report = New-Object -TypeName PSObject
        $report | Add-Member -MemberType NoteProperty -Name Name -Value $vm.Name
        $report | Add-Member -MemberType NoteProperty -Name VMHost -Value $vm.VMHost
        $report | Add-Member -MemberType NoteProperty -Name Mode -Value $controller.BusSharingMode
        $report | Add-Member -MemberType NoteProperty -Name Type -Value "BusSharing"
        $array += $report
    }
}

$array | Out-GridView