Does Cloud + REST API spell the end of the GUI?

Fun question: does the API spell the end of the GUI?

I started my career as a Solaris and Linux administrator, mostly because I felt that working in Windows Server took away most of my control. I loved configuring a web server in text and having full control. I loved having to understand what each variable did so I could tune my web server to meet my needs. It was a great job, which led into configuration management with Puppet. Full control and text once again…

This evening I was working with the REST API for NSX on a side project, and to confirm the results of my query I just used REST… I got my answer in a millisecond… I could not have refreshed the GUI that quickly. It was so easy, and it reminded me of the good old Linux days, long forgotten now that I am an architect.

Make no mistake, it’s a coder’s world out there, and infrastructure folks need to get comfortable with APIs and code. The future is a process of automating different units together using APIs. Working with REST has taught me so much about the platform. You start to understand how the solution was built. It exposes workflows that help you build efficiency…

I suggest that if you really want to understand your product, you need to learn its API. If it does not have an API, consider a different product. I know GUIs will be around, but I do believe they will continue to lose value in enterprise deployments. Strap on your code and join the power users.

Advice to VCDX candidates from a Double VCDX

“Sometimes it’s the journey that teaches you a lot about your destination.”  – Drake

Update: I have updated the wording of the constraints section to reflect a Twitter comment. Thanks for the fix to the wording and for reading.

The VMware Certified Design Expert certification (VCDX) represents the highest tier of VMware’s certifications. I recently contributed to a panel of VCDXs at VMworld. Candidates considering the VCDX certification had the opportunity to ask the panel questions. The questions illustrated that candidates were concerned about the Herculean effort required to achieve the certification. I wanted to take this opportunity to share some guidance I have learned as a mentor. I believe anyone can become a VCDX. It does require some hard work, but it is very achievable.

 

Requirements, Constraints, Assumptions and Risks

Becoming a VMware certified design expert does not mean you have to be the most technical person in the room.   It does mean you have to know how to align technology to business needs.    My experience has taught me that I can tell if a proposal for VCDX will be successful right away based upon requirements, constraints, assumptions and risks.   The ability to gather business and technical requirements is a key skill for any design expert.   Your technical requirements should be aligned to the business requirements. It’s important to understand the difference between business and technical requirements:

  • Business Requirements – Define how the delivered product provides value. Other words often used are outcomes or expected benefits. For example, the solution must meet regulatory compliance.
  • Technical Requirements – Define the technical “must haves” to achieve the outcome. For example, the solution must be able to fail over and fail back from a disaster and support an RTO of four hours.

Many VCDX documents are solely focused on technical requirements and miss the “why” that drives the design.   Understanding the difference between requirements and constraints is another challenge for many candidates:

  • Requirements – Things the design must meet, such as: establish an RTO of four hours or provide capacity for twenty percent growth for the next three years.
  • Constraints – Things that form limits or boundaries that apply to the design. For example, a specific vendor relationship or reuse of current hardware. Constraints should be met by the design unless they are resolved through conflict resolution.

Once you have established your requirements and constraints you are left with assumptions and risks:

  • Assumptions – Things you believe to be true but cannot verify. For example, storage usage will grow at the same rate as compute usage, or the sample data provided represents reality.
  • Risks – Anything that threatens the project’s ability to meet the business requirements. Any risks you identify should be documented in this section. Every project has risks, for example staff skills or timelines.

 

Correctly creating requirements and constraints that align with the elements of design is critical to a successful submission. Identification of assumptions and risks provides important protections to the architect. The goal of a VCDX design is to align technology to meet the requirements and constraints, not to provide the best technology mix.

 

Elements of Design

When working with infrastructure, VMware has designated five elements that should be considered in each design choice.  Each design choice should be evaluated against the elements of design for impact.  I personally like to use the acronym RAMPS to help me remember these elements:

  • Recoverability – The choice’s effect on disaster recovery
  • Availability – The choice’s effect on the SLA
  • Manageability – The choice’s effect on management cost
  • Performance – The choice’s effect on performance
  • Security – The choice’s effect on security

It is not uncommon for availability, recoverability, security or performance to have a negative impact on manageability. Not all choices can have a net benefit to all elements of design. The tie breaker for these conflicts should be the requirements. Conflicts between design elements may exist even after evaluating the requirements. This is what the conflict resolution section is for. Conflict resolution is where the customer of the solution acknowledges the conflict and mitigates it in some form. Make sure your design has conflicts. Each requirement and constraint should be aligned to an element of design. When gathering business requirements, consider the RAMPS impact of each requirement to help gather a full list of requirements and constraints. Each technical requirement or constraint should be aligned to a single element of RAMPS.
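The alignment rule above (each technical requirement or constraint maps to a single RAMPS element) can be checked mechanically. The following is only a minimal Python sketch: the `DesignItem` class, the identifiers such as `R01`, and the `misaligned` helper are hypothetical names invented for illustration and are not part of the VCDX blueprint.

```python
from dataclasses import dataclass, field

# The five elements of design, as listed in the RAMPS acronym above.
RAMPS = ("Recoverability", "Availability", "Manageability", "Performance", "Security")


@dataclass
class DesignItem:
    """A technical requirement or constraint (hypothetical structure)."""
    identifier: str
    kind: str  # "requirement" or "constraint"
    text: str
    elements: list = field(default_factory=list)  # RAMPS elements it is aligned to


def misaligned(items):
    """Return identifiers of items not aligned to exactly one valid RAMPS element."""
    return [
        item.identifier
        for item in items
        if len(item.elements) != 1 or item.elements[0] not in RAMPS
    ]


items = [
    DesignItem("R01", "requirement", "Support an RTO of four hours", ["Recoverability"]),
    DesignItem("C01", "constraint", "Reuse current hardware", []),
    DesignItem("R02", "requirement", "Encrypt all replication traffic", ["Performance", "Security"]),
]
flagged = misaligned(items)  # items that still need a single RAMPS element
```

An item that lands in the flagged list is either missing an element or is really two requirements written as one, which is usually a sign it should be split.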

 

Fun with Formats

Every single candidate struggles with document format. The VCDX requires far more detail than most enterprise designs. Format paralysis has slowed, if not stopped, many candidates. My suggestion is to identify an outline that aligns with the blueprint:

  • Overview
  • Requirements, constraints, assumptions and risks
  • Conceptual architecture
  • Logical architecture
  • Physical architecture
  • Security
  • Appendix

 

Each of the different layers of architecture should address the sub-elements: compute, storage, networking, applications, recovery, virtual machine, management, etc. You cannot pay lip service to conceptual and logical architecture. They must be developed just like physical architecture. Design choices should be justified against RAMPS, with conflicts identified. The secret is to determine a format and start writing; don’t get stuck on format. In the end, the format is not as important as the content, assuming the reviewer can locate the items required in the blueprint.

 

Time Management

Every candidate struggles with time. We have family, friends, hobbies, faith and work competing with the VCDX goal. My advice is to set a goal with a timeline. Agree upon a set time each day. Exercise discipline to work on the VCDX during that time and you will achieve your goal. I used 8:00 – 9:00 PM each night. It was after my kids’ bedtime and before spending time with my wife. I had to sacrifice computer game time, social media time and blogging time, but after six months I was done. This model has worked for me to achieve two VCDX certifications and has put me on the path to my third. I’d like to end where I began: I believe everyone can achieve this certification with hard work. To start, get a mentor by visiting vcdx.vmware.com and searching the list of mentors, which includes me.

Redeploy NSX Edges to a different cluster / datacenter

First Issue: my bad

I ran into an interesting issue in my home lab. I recently replaced all my older HP servers with Intel NUCs. I could not be happier with the results. Once I replaced all the ESXi hosts, I mounted the storage and started up my virtual machines, including vCenter. Once vCenter and NSX Manager were available, I moved all the ESXi hosts to the distributed switch. This normal process was complicated by NSX. I should have added the ESXi hosts to the transport zone, allowing NSX to join the distributed switch. Failure to do this made the NSX VXLAN process fail. I could not prepare the hosts… ultimately I removed the VXLAN entries from the distributed switch and then re-prepared, which re-created the VXLAN entries on the switch. (This is not a good idea in production, so follow the correct path.)

Second Issue: nice to know

This process generated a second issue: the original cluster and datacenter on which my NSX Edges used to live were gone. I assumed that I could just redeploy the NSX Edges from the manager. While this is true, the configuration assumes that it will be deploying the Edges to the same datacenter, resource pool and potentially the same host as when they were created. So if I have a failure and expect to just bring up NSX Manager and redeploy to a new cluster, it will not work. You have to adjust the parameters for the Edges first, which you can do via the API or the GUI. I wanted to demonstrate the API method:

I needed to change the resource pool, datastore, and host for my Edge. I identified my Edge via the identifier name in the GUI (edge-8 for me), grabbed my favorite REST tool (Postman) and queried the current state of that Edge.

This returned the configuration for this Edge device. If you need to identify all Edges, query the list of Edges instead.
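The two queries described above can be sketched in Python. This is a rough sketch only: it assumes the read paths follow the same /api/4.0/edges pattern as the PUT used later in this post, that NSX Manager accepts HTTP basic authentication, and that the host name `nsx.lab.local` stands in for your own manager. The function names are invented for illustration.

```python
import base64
import ssl
import urllib.request

# Assumption: reads use the same base path as the PUT shown later in this post.
API_ROOT = "/api/4.0/edges"


def all_edges_url(manager: str) -> str:
    """URL that lists every Edge known to the manager."""
    return f"https://{manager}{API_ROOT}"


def edge_appliances_url(manager: str, edge_id: str) -> str:
    """URL for the appliance configuration of one Edge, e.g. edge-8."""
    return f"https://{manager}{API_ROOT}/{edge_id}/appliances"


def nsx_get(url: str, user: str, password: str, verify_tls: bool = True) -> str:
    """Send a GET with basic authentication and return the XML response as text."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/xml"},
    )
    context = ssl.create_default_context()
    if not verify_tls:
        # Home labs often use self-signed certificates; do not do this in production.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return response.read().decode("utf-8")


# Hypothetical manager name used only to show the resulting URLs.
single_edge = edge_appliances_url("nsx.lab.local", "edge-8")
every_edge = all_edges_url("nsx.lab.local")
```

Calling `nsx_get(single_edge, user, password)` would then return the same kind of XML that a REST client such as Postman shows.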

Then I needed the VMware identifiers for the resource pool, datastore and host. These can all be gathered via the REST API, but I went for PowerShell because it was faster for me, and looked them up in PowerCLI.

 

 

Once I had identified them, I was ready to form my adjusted query:

 

<appliances>
  <applianceSize>compact</applianceSize>
  <appliance>
    <highAvailabilityIndex>0</highAvailabilityIndex>
    <vcUuid>500cfc30-5b2a-6bae-32a3-360e0315ccd3</vcUuid>
    <vmId>vm-924</vmId>
    <resourcePoolId>domain-c861</resourcePoolId>
    <resourcePoolName>domain-c861</resourcePoolName>
    <datastoreId>datastore-865</datastoreId>
    <datastoreName>datastore-865</datastoreName>
    <hostId>host-881</hostId>
    <vmFolderId>group-v122</vmFolderId>
    <vmFolderName>NSX</vmFolderName>
    <vmHostname>esg1-0</vmHostname>
    <vmName>ESG-1-0</vmName>
    <deployed>true</deployed>
    <cpuReservation>
      <limit>-1</limit>
      <reservation>1000</reservation>
    </cpuReservation>
    <memoryReservation>
      <limit>-1</limit>
      <reservation>512</reservation>
    </memoryReservation>
    <edgeId>edge-9</edgeId>
    <configuredResourcePool>
      <id>domain-c26</id>
      <name>domain-c26</name>
      <isValid>false</isValid>
    </configuredResourcePool>
    <configuredDataStore>
      <id>datastore-31</id>
      <isValid>false</isValid>
    </configuredDataStore>
    <configuredHost>
      <id>host-29</id>
      <isValid>false</isValid>
    </configuredHost>
    <configuredVmFolder>
      <id>group-v122</id>
      <name>NSX</name>
      <isValid>true</isValid>
    </configuredVmFolder>
  </appliance>
  <deployAppliances>true</deployAppliances>
</appliances>

I used a PUT against https://{nsx-manager-ip}/api/4.0/edges/{edgeId}/appliances with the above body sent as application/xml. I was then able to redeploy my Edge devices without any problems.
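For anyone who prefers a script over a REST client, the edit-and-PUT step can be sketched in Python. This is a hedged sketch, not the exact procedure used above: `retarget_appliances` and `build_put_request` are names invented for illustration, the host name `nsx.lab.local` is a placeholder, and the sketch only swaps the three identifiers discussed in this post, leaving every other element as the manager returned it.

```python
import urllib.request
import xml.etree.ElementTree as ET


def retarget_appliances(xml_text, resource_pool_id, datastore_id, host_id):
    """Return the <appliances> XML with new placement identifiers on each appliance."""
    root = ET.fromstring(xml_text)
    new_values = (
        ("resourcePoolId", resource_pool_id),
        ("datastoreId", datastore_id),
        ("hostId", host_id),
    )
    for appliance in root.findall("appliance"):
        for tag, value in new_values:
            element = appliance.find(tag)
            if element is None:
                # Add the element when the fetched configuration did not include it.
                element = ET.SubElement(appliance, tag)
            element.text = value
    return ET.tostring(root, encoding="unicode")


def build_put_request(manager, edge_id, body):
    """Build (but do not send) the PUT against the appliances URL from this post."""
    url = f"https://{manager}/api/4.0/edges/{edge_id}/appliances"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="PUT",
    )


# Trimmed-down stand-in for the configuration returned by the manager.
current = (
    "<appliances><applianceSize>compact</applianceSize><appliance>"
    "<resourcePoolId>domain-c26</resourcePoolId>"
    "<datastoreId>datastore-31</datastoreId>"
    "<hostId>host-29</hostId>"
    "</appliance></appliances>"
)
updated = retarget_appliances(current, "domain-c861", "datastore-865", "host-881")
request = build_put_request("nsx.lab.local", "edge-8", updated)
```

Authentication and the actual send are left out on purpose; add the same credentials you would use in your REST client before passing the request to `urllib.request.urlopen`.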

PowerShell Functions for tags

Some quick PowerShell functions for tags in ESXi, enjoy:

Public cloud has forced change

Readers will immediately cry foul at this title. Public cloud adoption is not huge, and even in the most die-hard cloud-only shops it’s only around 40%. I believe public cloud has had, and will continue to have, a transformative effect on private cloud. The presence of a second option has forced the current option’s hand. I will not detail the challenges in public cloud adoption; that is a blog post for another day. I want to focus on the elements that public cloud’s presence has forced into our private on-prem clouds:

  • Life cycle management for the hypervisor is now table stakes – gone are the days of hypervisor-specific teams – you can roll that cost into the operational budget on public cloud. Quite simply, upgrading / maintaining / tweaking the hypervisor needs to become easier and cost a lot less OpEx.
  • Delivery of traditional IT services needs to become transparent and quick – the buzzword agility applies – the business does not care how many engineers it takes to screw in the server, they just want it now
  • Consumers of IT services don’t like limits or scale issues – all on-prem offerings need to have some form of elasticity
  • No one really wants or needs IaaS (Infrastructure as a Service); they want a platform, because only platforms provide value to developers, who in turn provide value to the business. A platform has to include multiple servers, networking / networking constructs, security, and authentication.
  • Cost is important like never before… if you don’t control / understand your cost, comparisons will be made to public cloud and you will lose.
  • Service catalogs are only useful if they change and are responsive to business needs (think application development life cycle)
  • Infrastructure people need to learn from development – the future is automated and created by developers who understand infrastructure – you can try to stand still but it will not last.
  • IT shops now want to spend IT budget incrementally; very few shops want to buy IT as a CapEx spend every three years

 

To be clear, I believe public and especially hybrid cloud should be part of every IT strategy. It’s a critical reality in our world. I also believe that private cloud is here to stay for many years, but expectations will continue to change based upon public cloud experience.

The real question for me is how the new edge of IoT will force public cloud’s hand.

Operational aspects of HCI video

While attending VMworld 2017 I presented on some of the operational aspects of hyper-converged infrastructure.   I believe the key takeaways were:

  • To gain the real benefits, hyper-converged has to be more than just storage
  • Hyper-converged has a different scalability model (more linear)
  • Hyper-converged requires a different organizational structure to be successful
  • Hyper-converged performance and availability are policy based instead of location based

You can watch the video here:

 

One question I was asked after the presentation was about scalability. It was a really good question, so I wanted to answer it here. Let’s assume that you start with a 3 node cluster. After three years you have added 12 nodes for a total of 15 nodes. At some point newer hardware types become available. Does that mean that you now need to buy 15 brand new nodes, turning incremental growth into a major investment?

  • The answer is yes and no. At some point you do have to make the initial three-node investment in the new hardware, but you should do it long before your current cluster reaches end of life, so that you can grow the new hardware organically before then. This should be taken into account in your growth models.
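A growth model like this comes down to simple arithmetic. The sketch below is illustrative only: the yearly purchase figures and the five-year hardware lifetime are made-up assumptions, and `nodes_in_service` is a hypothetical helper, not something from the presentation.

```python
def nodes_in_service(purchases, lifetime_years):
    """Count nodes still in service each year.

    purchases[y] is the number of nodes bought in year y; a node bought in
    year y is assumed to serve for lifetime_years years, starting with year y.
    """
    totals = []
    for year in range(len(purchases)):
        # Oldest purchase year whose nodes are still inside their lifetime.
        first_live = max(0, year - lifetime_years + 1)
        totals.append(sum(purchases[first_live:year + 1]))
    return totals


# Year 0: the initial 3 nodes. Years 1 to 3: 12 more nodes, reaching 15.
# Year 4: a 3 node start on newer hardware, bought before the first nodes retire.
# Year 5: the original 3 nodes drop out while incremental growth continues.
purchases = [3, 4, 4, 4, 3, 4]
in_service = nodes_in_service(purchases, lifetime_years=5)
```

Under these assumptions the largest purchase in any year is four nodes; the new hardware generation enters as another three node cluster instead of a fifteen node replacement.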

Thanks to all who attended.

How to operate IT with full Velocity

I was honored to be able to present this past week at VMworld 2017. I have always been a huge supporter of vBrownBag and was really happy to present for them at VMworld again this year. One of my presentations was on how to operate IT with full velocity, as a follow-up to my post on how to make IT Agile. The session was recorded and posted on YouTube. You can view the brief (12 minute) talk here:

Please let me know if you have any feedback or thoughts.