Cisco ACI and Software Defined Networking – Part 2

This month’s blog post is a continuation of last October’s post on Cisco ACI and Software Defined Networking – Part 1. In part 1, I gave some history on traditional networking and discussed how networking is evolving to accommodate the next generation data center and cloud computing with Software Defined Networking or SDN. This month we dive into Cisco’s approach to SDN – Cisco Application Centric Infrastructure or ACI.

At first glance

Cisco ACI is made up of two components. First, the Application Policy Infrastructure Controller (APIC) is the management software component for ACI. Its responsibility is to manage and apply network policy on the fabric. The APIC is an appliance based on a Cisco UCS C220 M3 (first generation) or C220 M4 (second generation) 1U rack-mount server, and APICs are typically deployed as a cluster of three or more for added resiliency. Second, the Nexus 9000 switches are the hardware component. They can run in traditional Nexus (standalone NX-OS) mode or in ACI mode. In ACI mode, all management and configuration happens at the APIC level. In Nexus mode, management happens at the individual switch level.

Here is a list of ACI compatible hardware.

Object Model, Tenants and Contexts

The object model is the foundation of ACI and where it truly derives its power. For those who are familiar with programming, the object model that ACI uses operates like an object in object-oriented programming: the object has a set of attributes that can be modified and shared across the entire model. This is a huge departure from how we managed switches in the past, individually, using a flat configuration file. In programming terms, the old way of configuring switches was a lot like a procedural program: each line in the startup configuration file was read into memory and became the running configuration.

The ACI object model is made up of tenants, which can be used by different customers, business units, or departments within an organization. When ACI is first turned on it creates a default tenant. From there, additional tenants can be created based on the needs of the organization.
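Because everything in ACI is an object, the configuration can also be read and written programmatically through the APIC's REST API. Below is a rough Python sketch (not production code) that logs in to an APIC and lists the tenants it knows about; the controller address and credentials are placeholders, and certificate checking is disabled only for lab use.

    # Rough sketch: authenticate to the APIC and list tenant objects (class fvTenant).
    import requests

    APIC = "https://apic.example.com"   # hypothetical controller address

    session = requests.Session()
    session.verify = False              # lab only; use a trusted certificate in production

    # The APIC returns a session cookie on login, which the Session object keeps for later calls.
    login = {"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}
    session.post(APIC + "/api/aaaLogin.json", json=login).raise_for_status()

    # Query all objects of the tenant class in the object model.
    resp = session.get(APIC + "/api/class/fvTenant.json")
    for obj in resp.json()["imdata"]:
        print(obj["fvTenant"]["attributes"]["name"])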

Tenants are broken up into contexts which are different IP spaces or subnets with their own Virtual Routing and Forwarding (VRF) instance(s). Contexts are very similar to VLANs but are much more configurable and less limited than the traditional VLAN.

Endpoint Groups and Policies

Endpoint Groups (EPGs) are groupings of endpoint devices that share the same set of network services and ultimately represent an application or business unit. An EPG member can be a physical NIC, virtual NIC, port group, IP address, VLAN, VXLAN, or DNS name. EPGs allow the network engineer to logically segregate the network based on the application. In the past, this would typically be done with VLANs, which logically segment the network for performance or security reasons but add complexity that isn't always necessary. By default, a device can't communicate on the network until policy allows it. This behavior is more like a Fibre Channel SAN, where zoning must explicitly permit communication, and less like an Ethernet LAN.

A policy consists of a source EPG (sEPG) and a destination EPG (dEPG). Policies can contain ingress and egress rules that are used for access control, Quality of Service (QoS), or other network-related services. Once you are in an endpoint group you can communicate within it as long as you have IP reachability. A policy allows you to create an application group (web, app, and database servers) and control the network communication between each tier. A policy essentially defines a security zone for a particular application. A policy enforcement matrix lays out sEPGs and dEPGs in a grid, and where they intersect is where policy is enforced.
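One way to picture the enforcement matrix is as a lookup keyed by source and destination EPG. The snippet below is purely illustrative (the EPG names and rules are made up, and this is not how the APIC stores policy internally); it just shows the idea that traffic is dropped unless a policy exists where the sEPG and dEPG intersect.

    # Illustrative only: a policy enforcement matrix keyed by (source EPG, destination EPG).
    policy_matrix = {
        ("web", "app"): ["permit tcp/8080"],
        ("app", "db"):  ["permit tcp/1433"],
    }

    def allowed(s_epg, d_epg, rule):
        # No entry at the intersection means the traffic is not permitted.
        return rule in policy_matrix.get((s_epg, d_epg), [])

    print(allowed("web", "app", "permit tcp/8080"))  # True
    print(allowed("web", "db", "permit tcp/1433"))   # False - no policy between web and db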

Contracts and Filters

Contracts define how EPGs communicate with each other, similar to a contract you sign that defines an agreed-upon outcome. In the ACI world, a contract is a set of rules that defines how the network will operate within a policy. Contracts can either be provided or consumed by an EPG. Filters are used to permit and deny traffic at Layers 2, 3, and 4, and are applied to both inbound and outbound interfaces. A filter is essentially an Access Control List (ACL) on the network.

Application Network Profiles

An Application Network Profile groups everything together (EPGs, contracts, and filters) and dictates how traffic behaves on the network fabric for a specific application. For those familiar with the UCS platform, an application network profile is to the fabric what a service profile is to a server: once defined, it gives the network hardware an identity.

Private Networks, Bridge Domains and Subnets

A private network is simply an L3 forwarding domain. When added to a context it acts just like a VRF in the traditional network world, which allows private networks to have overlapping IP addresses without conflict. Bridge Domains (BDs) are responsible for L2 forwarding, like a VLAN. The difference is that you aren't subject to the limitations of VLANs on a traditional network, like the 4,096 VLAN limit. A subnet is defined under a Bridge Domain and creates a gateway, much like a Switch Virtual Interface (SVI). The gateway is a logical interface that can exist on multiple nodes in the fabric.

In this post, I just scratched the surface of Cisco's ACI by covering some basic concepts and terminology. Cisco's ACI, and SDN in general, are changing the way network administrators and engineers approach the design and administration of networks. New skills like basic scripting and programming will be required of network engineers as software takes a more predominant role in the data center.

Cisco ACI and Software Defined Networking – Part 1

Software Defined Networking or SDN has started to take the networking world by storm in the last few years. The goal of SDN is to bring the benefits of virtualization to the networking world, like we have seen in the server world. Decoupling the software from the networking stack gives organizations the agility, intelligence, and centralized management needed to address rapidly changing environments.

Cisco was a little late entering the SDN game. Back in November 2013 they announced Cisco Application Centric Infrastructure (ACI) along with the $863 million purchase of Insieme Networks. The goal of ACI is to deploy network services to meet the needs of a specific application. This approach changed the paradigm for how we build and deploy networks. As of this writing, Cisco ACI is only supported on the Nexus 9000 platform and is “hardware ready” on the Nexus 5600s. Cisco ACI is managed by an Application Policy Infrastructure Controller or APIC.

So how does SDN change the way we look at networks? To really understand this shift we need to look back on how we built networks.

Traditional Networks

The traditional network was built using a three-tier design that is still widely used today. Those who have been involved with Cisco are most likely familiar with the three-tier network design – core, aggregation, and access layers. Here is a breakdown of each layer.

Core Layer – The Core is the backbone of the network and is where speed and reliability are important. No routing or network services operate at this layer.

Aggregation Layer – The Aggregation layer is concerned with routing between subnets, QoS, Security and other Network services.

Access Layer – The access layer contains endpoint devices like desktops, servers, printers, etc. Its priority is delivering packets to the appropriate endpoint device.

This three-tier design was solid for campus networks but started to run into limitations in the data center. The primary cause was server virtualization. When VMware started to explode onto the scene 8 to 10 years ago, we saw a large increase in east-west traffic. In large-scale VMware deployments, the three-tier design started to run into scalability issues. 10 Gigabit Ethernet helped address some of those challenges but still didn't solve everything.

There were also challenges around redundancy in the three-tier design. Redundancy is imperative inside the data center, but the protocols we used in the networks of yesterday aren't sufficient for today's data center and the cloud. Spanning Tree Protocol (STP) was designed to prevent loops in Ethernet networks, and variations of it have been around for decades. In order for STP to prevent loops it has to block one of the redundant ports in the network. From a design standpoint this isn't very efficient and can waste a lot of potential bandwidth. The other challenge was slow re-convergence of the network after a failure, meaning that some kind of outage typically happened, depending on the size of the network.

Network virtualization technologies like Virtual Port Channel (vPC) and Virtual Switching System (VSS) helped address some of these issues by creating logical switches that appear to STP as a single switch with a single path. In that scenario spanning tree does not block the redundant link, giving you more efficient use of your bandwidth. This is more of a smoke-and-mirrors method, but it works well to address the challenges around Spanning Tree.

Traditional network management was primarily done via the CLI or sometimes via a management GUI. Both methods were limited in large-scale environments, and automation was a challenge at best. Cisco IOS operated using a flat configuration file and didn't have any APIs that allowed for network programmability. So most network administrators would copy the running configuration from one switch and paste it onto another switch with a few tweaks done in a text editor. This method was only so scalable and was prone to human error.
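To show why even basic scripting beats copy and paste, here is a small Python sketch that stamps out per-switch configurations from one template. The hostnames, VLAN, and addresses are hypothetical; the point is that the per-device tweaks happen in data, not in a text editor.

    # Minimal template-driven configuration sketch; all values are made up for illustration.
    switches = [
        {"hostname": "access-sw-01", "vlan_id": 110, "svi_ip": "10.1.10.2"},
        {"hostname": "access-sw-02", "vlan_id": 110, "svi_ip": "10.1.10.3"},
    ]

    template = (
        "hostname {hostname}\n"
        "vlan {vlan_id}\n"
        "interface Vlan{vlan_id}\n"
        " ip address {svi_ip} 255.255.255.0\n"
    )

    for sw in switches:
        # One rendering step per device removes the hand edits where mistakes creep in.
        print(template.format(**sw))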

The biggest limitation was the device-by-device approach to administration and the inconsistent or orphaned configurations it produced. This typically caused issues that were difficult to troubleshoot and very time consuming to remediate. The open nature of these networks also tended to create insecure networks, because security wasn't tightly integrated into the design and didn't start from a restrictive approach. Lack of full traffic visibility became a big challenge, especially in virtualized environments where the network team didn't have visibility.

Spine-Leaf Networks and ACI

An ACI network uses a leaf-spine architecture, which collapses and simplifies the three-tier design into a Clos architecture. In this topology we have more of a mesh, with all leaf switches connecting to the spines and vice versa. Leafs don't connect to each other, and spines don't connect to each other either. This design reduces latency and gives you an optimal fabric where all paths on the network are forwarding. Here is the breakdown.

Leaf – The Leaf is the access layer where endpoint devices connect.

Spine – The spine is the backbone layer that interconnects the leaf switches.

Pretty simple, right? The spine-leaf design doesn't remove the limitations of Spanning Tree so much as replace it: FabricPath (or TRILL, the open standard) is used for loop prevention instead. FabricPath treats loop prevention like a link-state routing protocol, the kind designed for very large WAN environments. Using FabricPath allows a data center network to scale beyond what is possible with Spanning Tree, while allowing all paths on the network to be active and in a forwarding state. It gives the network the level of intelligence you see in large WAN networks, where being able to scale is key.

Overlay Networking

A leaf-spine topology utilizes VXLAN, or Virtual Extensible LAN, as an overlay network to scale from an IP subnet standpoint. VXLAN is a tunneling protocol that encapsulates a virtual network. VXLAN supports up to 16 million logical networks, a considerable increase over the 4,096 logical networks that VLANs support. This is accomplished by adding a 24-bit segment ID to the frame. But adding support for additional logical networks is not the only thing that VXLAN brings to the table. VXLAN allows for the migration of virtual machines across long distances, which is important for the software-defined data center. In addition, it allows for centralized management of network traffic, which makes the network operate as more of a collective and not as individual independent devices. NVGRE is another network overlay technology that's gaining momentum. I will discuss that one in a future blog post.
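The jump from roughly four thousand to sixteen million segments falls straight out of the header sizes: a VLAN ID is 12 bits, while the VXLAN segment ID is 24 bits. A quick sanity check:

    # VLAN ID: 12 bits, VXLAN Network Identifier (VNI): 24 bits
    print(2 ** 12)   # 4,096 possible VLANs
    print(2 ** 24)   # 16,777,216 possible VXLAN segments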

Now that we know some of the fundamental components that make up software-defined networking, we can dive deeper into ACI and break down how it works. Stay tuned for my next blog post.

Zerto – A game changer for DR

In our industry, Disaster Recovery (DR) has always been a difficult discussion for a lot of people. It's typically something that most organizations don't want to tackle for a variety of reasons, ranging from the complexity from a technology standpoint to the financial obligation that's needed for an effective DR strategy. The advent of virtualization was a major step in making DR simpler, but there were still challenges around the order of operations of applications and the reliance on a run book, which inevitably brings in the possibility of human error. Zerto is truly a game changer when it comes to tools that help organizations tackle the challenge of DR.

Brief history

Zerto is an Israel-based company. It was founded by Ziv and Oded Kedem, who also founded Kashya back in 2000. When Kashya was introduced it too was a game changer when it came to DR and data protection. Back then DR was a very manual process and SAN replication was the tool used to replicate data between a primary and a secondary site. The challenge with SAN replication is that it's typically expensive, complex to implement, and dependent on hardware. Kashya moved replication to the host or switch level (depending on the solution) using a product called Cisco SANTap. Not only did it mirror data, it brought continuous data protection (CDP) to the table. For everything that Kashya brought, it still didn't address the complexity and management challenges of the overall solution.

In 2006 EMC purchased Kashya and it became EMC RecoverPoint. After selling the company, Ziv and Oded started a new project called Zerto. Zerto addressed some of the challenges that Kashya had by moving the replication engine from the SAN to the hypervisor. Replicating at the hypervisor level simplified the overall installation of Zerto: there are no additional hardware components or overly complex configurations on the SAN. It is a software-only appliance that was purpose built for virtual environments. Zerto also addressed some of the shortcomings of another product, VMware Site Recovery Manager (SRM). Replication at the hypervisor level also made it easy for Ziv and Oded to add DR automation and orchestration to make Zerto a complete DR solution.

How does Zerto work?

Zerto needs a Virtual Replication Appliance (VRA) installed on each ESXi or virtual host in the environment. The VRA is a virtual appliance running a form of Debian Linux. The VRA splits incoming IO in the virtual SCSI stack using an IOVP (I/O Vendor Partner Package) driver that resides on each host. This gives it more direct access to the IO than the VMware vStorage APIs and Changed Block Tracking used by a lot of backup software. Zerto uses asynchronous replication.

Replication supports WAN compression and throttling automatically, and it's intelligent enough to detect changes in the link and roll back any actions that didn't complete on the other side. This is a nice feature for those who have limited bandwidth or don't want to congest the WAN.

Zerto doesn't leverage snapshot technology like a lot of replication products; instead it uses journaling. The journal holds all data changes, which allows you to roll back to a specific point in time. This gives you low RPOs (Recovery Point Objectives) that can't be achieved by snapshot-based technologies. Journaling is effective and can deliver tight RPOs, but it does come with some caveats. The further back you want to go, the larger the journal needs to be in order to accommodate those requests. The same goes for IO-intensive applications: if there is a high rate of data change, then you will need adequate journal space to accommodate it. So size accordingly and have reasonable requirements.
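As a back-of-the-napkin example of that sizing point, journal capacity is roughly the sustained change rate multiplied by the history window you want to keep. The numbers below are hypothetical; use Zerto's own sizing guidance for a real deployment.

    # Hypothetical numbers, for illustration only.
    change_rate_mb_per_sec = 5    # average change rate of the protected VMs
    history_hours = 24            # how far back you want to be able to roll back

    journal_gb = change_rate_mb_per_sec * 3600 * history_hours / 1024
    print(f"Rough journal size needed: {journal_gb:.0f} GB")   # ~422 GB in this example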

Management

The Zerto Virtual Manager (ZVM) is a small piece of Windows software that is installed on the vCenter or SCVMM server. The ZVM's job is to monitor replication, define Virtual Protection Groups or VPGs (we'll get to those later), and ultimately protect VMs. This is in contrast to the VRA, whose sole job is to split IO and replicate data. It's recommended that you have one vCenter or SCVMM server per data center so that in the event of a disaster you can fail over your site. Both servers should be protected to prevent downtime: if the vCenter or SCVMM server is down, not only can't you manage your virtual environment, but the ZVM will be unavailable as well. Each ZVM needs to communicate with the ZVM at the other site over TCP port 4005, so make sure that this port is open on any firewall between your production and DR sites. The Zerto Virtual Manager can be accessed via port 9080, and the GUI is HTML5.
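A quick way to confirm those ports are reachable between sites before pairing them is a simple TCP check. The hostnames below are placeholders; the port numbers come from the paragraph above.

    # Hypothetical hostnames; ports from the text above (4005 ZVM-to-ZVM, 9080 for the GUI).
    import socket

    def port_open(host, port, timeout=3):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    for host, port in [("zvm-dr.example.com", 4005), ("zvm-prod.example.com", 9080)]:
        print(host, port, "open" if port_open(host, port) else "blocked")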

DR Automation and Orchestration

Zerto uses a Virtual Protection Group (VPG) to protect virtual machines. A VPG is typically used on a per-application basis. It provides virtual machine consistency groups and defines the RPO, CDP (journal) history, WAN optimization and compression settings, failover configuration, and boot order, which controls the order the virtual machines come up in. The VPG is where you put your DR plan to work. With properly configured VPGs you have a working DR plan for your virtual environment, which helps drive down RTO (Recovery Time Objective).

Zerto also allows for point-in-time DR testing that is non-disruptive and doesn't impact replication or the production environment. This is a great way to test your environment on a regular basis without having to deal with outages and/or after-hours work. Note: this doesn't replace the need to perform an actual test. I always recommend that organizations test their BCDR plan once a year at a minimum.

Conclusion

In conclusion, Zerto brings replication and DR automation to both SMB and enterprise customers. It is simple to deploy, configure, and administer, which gives it an advantage over more complex solutions like VMware SRM and storage-array-based replication. In addition, it is a tool that has become storage and hypervisor agnostic, taking the dependence on hardware and hypervisor out of the equation and allowing customers to make more economical decisions when evaluating a DR strategy.

It should be noted that Zerto doesn't replace the need for a good Business Continuity and Disaster Recovery plan. I have seen some people think that a product like Zerto will just magically create a DR plan for their environment. For all the cool things it does, it can't understand your business and create an effective DR plan for you. Zerto is only a tool that helps put that plan to work and makes it actionable. In a future blog post I'll talk more about Business Continuity and DR planning.

Tintri VM-Aware Storage vs. VMware VVOLs on Traditional Storage

With the release of VMware's vSphere 6 and VVOLs, I have had several customers ask me questions regarding VVOLs and how they compare to Tintri's VM-aware storage platform. There is some confusion around both technologies. For example, there is a notion that VVOLs give non-Tintri storage arrays the same capabilities as Tintri. Now, there are some similarities between the two, but it's important to understand that both technologies are trying to solve the same problem from different perspectives. So I decided to dive a little deeper into both technologies and hopefully clear up some of the confusion.

What problems are we trying to solve?

Before we dive into VVOLs and Tintri's VM-aware storage, it's important to understand what problem we are trying to solve. One of the big challenges as a VMware environment starts to grow is storage management and data protection. VMware has operated on a data store model since its initial release a decade ago. A data store can be a VMFS (VMware File System) data store on top of a block storage device like an FC or iSCSI LUN, or it can be file based using NFS. These data stores are basically pools of storage for VMs, and herein lies the issue. Each data store can have dozens or even hundreds of VMs, each with its own storage characteristics and application needs. In the data store model your point of management is the data store, which is cumbersome from a performance, troubleshooting, and data protection standpoint. This is exacerbated in larger enterprise environments when you're dealing with thousands of VMs. So how do we change this paradigm? We move away from the data store model to a VM-based storage model. What does that mean? It means that we change the focus from the data store to what really matters most: the VM and, most importantly, the application. This is what is ultimately brought to the table by both VVOLs and Tintri's VM-aware storage.

What is Tintri’s VM-Aware storage?

Tintri released VM-aware storage back in 2011 with the announcement of their VMstore platform, a few years before VVOLs were officially released with vSphere 6. VM-aware storage allows Tintri to gain visibility into the virtual machines running on top of the Tintri file system. This visibility is gained through VAAI (vStorage APIs for Array Integration), which allows Tintri to connect to vCenter and gain an understanding of the virtual machine environment. Tintri is NFS only, which makes integration easier and overall improves the VMware storage experience. The Tintri VMstore doesn't operate in volumes but in VMs, which makes overall storage management a lot easier and more efficient.

Tintri leverages zero-based pointer snapshot technology, which makes snapshots and clones almost instantaneous to create. That, coupled with VM-level storage management, makes it easier to design a data protection strategy. This same snapshot technology is used for replication, and this too is at the VM level and not volume based, which helps you design more granular RTOs/RPOs for disaster recovery.

Tintri's VM-aware storage also has benefits from a performance standpoint. Performance characteristics like QoS and latency can be fine-tuned for individual VMs, versus other storage platforms where VMs have to be grouped by LUN or volume. This approach gives the VMware administrator more flexibility when it comes to carving out performance characteristics. This overall simplification of storage makes it easier for a “non-storage” administrator to provision storage that works and performs out of the box.

What are VVOLs?

VVOLs, or Virtual Volumes, are similar to Tintri's VM-aware storage but also have some noticeable differences. VVOLs allow storage vendors to gain a level of integration and better management capabilities at the VM level. This integration is accomplished using VASA, or vSphere APIs for Storage Awareness. VASA lies between the ESXi host and the storage array and facilitates the communication between the two. VASA 2.0, which was released with vSphere 6.0, is the only version that supports VVOLs; the previous version of VASA that ran on vSphere 5 is not supported. So an upgrade of your VMware environment will be in order if you are running an older version of VMware.

On the storage side, there are a couple of things to take into consideration. First, does my storage array support VVOLs? If the answer is no, then newer hardware will be needed in order to get support; even with vSphere 6, you still need the storage array to support VVOLs. Second, if your storage array does support VVOLs, a software or firmware upgrade may still be needed in order to turn on VVOL support. If you fall under the first option, you will need to purchase a new storage array. Just remember that not all storage arrays are equal when it comes to VVOLs. Each implementation will be dependent on the overall quality of the vendor's product.

Once you get things squared away on both the VMware and storage sides, there is an important consideration with VVOLs that needs to be mentioned as you start the design phase. Each VM will need 3 to 4 VVOLs at a minimum in order to get the full benefits of VVOLs: a configuration VVOL, a vSwap VVOL, a data VVOL, and a snapshot VVOL. If you are presenting multiple data VMDK files to a VM, then those will need to be VVOLs as well. So a storage array's ability to scale is important as the number of VMs in your environment increases. This goes back to the previous paragraph: pick your storage vendor and array wisely.
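To put that scaling concern in perspective, here is a rough estimate of how quickly VVOL counts add up. Every number below is an assumption for illustration; compare a similar calculation against your array vendor's published VVOL maximums.

    # Hypothetical environment used only to illustrate VVOL counts.
    vms = 1000
    base_vvols_per_vm = 4        # config, swap, data, and snapshot VVOLs at a minimum
    extra_data_disks_per_vm = 2  # each additional data VMDK is another VVOL

    total_vvols = vms * (base_vvols_per_vm + extra_data_disks_per_vm)
    print(total_vvols)           # 6,000 VVOLs for a fairly ordinary 1,000-VM environment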

How do they compare?

So how does Tintri's VM-aware storage compare to VVOLs? Let's break this down by first discussing where the two technologies overlap.

VM Level Snapshots: Tintri and VVOLs both support VM-level snapshots, which is where both technologies can help with a data protection strategy: the ability to snapshot at the VM level versus the data store level. The difference lies in the fact that VVOLs will leverage whatever snapshot technology is available on the storage side. So it goes back to understanding that each storage platform is different, and with snapshots your mileage could vary when it comes to overhead.

VM Level Clones: Tintri and VVOLs both support VM-level clones. This too (like snapshots) moves cloning from the data store level to the VM level. As with snapshots, there are different implementations of cloning, so if cloning is an important part of your overall strategy for development or test/dev, it's important to understand the underlying implementation. Some cloning technologies are more space efficient than others.

VM Level Replication: This is where you start seeing some differences between VVOLs and Tintri. Tintri supports VM-level replication. The ability to replicate at the VM level is a nice feature for more granular failover of VMs and allows for more tightly controlled RPOs/RTOs. With VVOLs you can replicate at the VVOL level, which at first glance might seem to be the better option. However, if you think about it, this is where you could run into issues from a storage management perspective. In most cases you want to keep the same RPO/RTO for an entire VM. I'm guessing that in some instances having it at a per-VVOL level would have advantages, depending on the use case.

VM Level QoS: Tintri has full support for VM-level QoS. However, VVOLs don't have any ability to control QoS or add that functionality to a storage array. Storage QoS is still a fairly new feature; it is not supported by all storage vendors, and the ones that do support it are LUN or volume based. With that said, there are some inherent advantages to VM-level QoS. Each QoS policy can be fine-tuned for each VM in a Tintri solution, but there could be some advantages to the LUN level as well. It all depends on how your environment would best benefit from QoS.

Another big take away is that Tintri supports vSphere 4, 5 and 6. VVOLs are a new feature of vSphere 6 which will require an upgrade of your VMware environment and most likely some kind of hardware and/or software upgrade to your storage environment. For a more detailed breakdown check out this whitepaper by Tintri.

So you can see that both Tintri's VM-aware storage and VVOLs are changing the paradigm for how storage will be administered going forward in a virtual environment. It is important to understand how each technology works, and that both have advantages even if there is some overlap in features. The challenge with VVOLs right now is the fact that this is a new technology and you are dependent on the storage vendor and the quality of their APIs and feature set. For example, the maximum number of VVOLs supported as of this writing ranges from the thousands to over a million depending on the storage vendor, and that scalability will be needed for large and growing virtual environments. Tintri has been doing VM-aware storage since their inception several years ago, and once VVOLs mature a bit, Tintri and VMware will have some of the best integration when it comes to bridging the gap between storage and virtualization.

OpenStack Primer

The OpenStack Summit has ended here in Texas, so I decided to write a high-level overview of the OpenStack platform. Many of my customers and readers have inquired about OpenStack as the discussion around cloud in a lot of organizations starts to heat up. So what is OpenStack? Why would I want to use OpenStack? These are some of the questions I will attempt to answer. However, I want to add a disclaimer here: I'm not an OpenStack guru, and this was just as much a learning exercise for me as I hope it will be for you.

Introduction and History

OpenStack is a collection of open source technologies that form a cloud computing framework, creating large pools of compute, networking, and storage resources that are centrally managed through a dashboard while empowering users to provision resources themselves. It also has a command line interface and a large collection of APIs, which makes it highly customizable when it comes to meeting the needs of an organization. As an open technology, the goal of OpenStack is to promote collaboration between developers and partners in the spirit of transparency.

OpenStack started back in 2010. It was created by Rackspace and NASA as an answer to Amazon’s AWS public cloud offering. It was a combination of NASA’s Nebula Computing Platform and Rackspace’s Cloud Files Platform.

Organizational Overview

OpenStack is maintained by a Board of Directors, a Technical Committee, and a User Committee, collectively known as the OpenStack Foundation. The goal of the OpenStack Foundation is to promote, maintain, and serve the entire ecosystem. Contributions can come in the form of individual development, manufacturer development, and documentation. Each area plays a key role in the development of OpenStack and in making sure the product goes through a proper development cycle.

OpenStack has a fairly simple release naming format: the release names are in alphabetical order, which makes it easier to determine the version and the release date. The development cycles are typically around 6 months, and the release cycle is pretty straightforward.

Planning & Design: Lasts around four weeks, with the final revision approved at the design summit

Implementation: This is when the rubber hits the road: the coding begins, along with the creation of the documentation. The implementation is broken up into iterations

Release Candidates: Testing and bug fixes are addressed at this stage. No more features are added after RC1 or Release Candidate 1.

Release Day: The last release candidates are published

Here is a link to the latest releases.

OpenStack Components

OpenStack is a modular platform that manages the compute, networking, and storage components within the cloud infrastructure. This modular design allows for flexibility and the mobility of your application(s) using exposed APIs. These components, in OpenStack lingo, are known as “Projects”. Each project has a team leader who is responsible for the overall development and interoperability with vendors and partners. OpenStack has developed a nomenclature that describes a project's role and status within the OpenStack organization. This link shows the project types.

OpenStack Requirements

The following is needed to build an OpenStack environment.

Host Operating System (Windows, Linux, VMware, etc.): A host operating system is needed for the compute platform. That can be a bare metal instance or, more commonly, a virtualization platform like VMware, Hyper-V, or KVM. And that brings me to one of the biggest misconceptions about OpenStack: OpenStack is not a replacement for your hypervisor. OpenStack is a cloud framework that ties everything together; you still need a host operating system.

Database Service: Like most software nowadays, a database backend is needed to power the OpenStack engine. The Database service is powered by MySQL.

Message Queuing Service: Message queuing is the process that allows a group of application components to form a much larger application. Without a message queuing service, OpenStack components would not be able to communicate and scale; each component or project would be a separate, independent application. RabbitMQ, Qpid, and ZeroMQ are the supported message queuing services.

Networking (Layer 2 and Layer 3): This is a no brainer. Without networking your users wouldn’t be able to gain access to applications or services.

Key Services:

OpenStack is made up of several key services which, at a minimum, are needed to deploy an OpenStack cloud environment.

  1. Compute Services (Nova)
  2. Networking Services (Nova or Neutron)
  3. Storage (Cinder or SWIFT)
  4. Identity Services (Keystone)
  5. Image Services (Glance)

Each of these services is a building block for an OpenStack deployment. Let's explore each of them and get a better understanding of how they work.
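To show how these services hang together from a user's point of view, here is a hedged sketch using the openstacksdk Python library to touch each of the core projects. The endpoint, credentials, and names are placeholders, and the exact calls available depend on your SDK and cloud versions.

    # Sketch only: the cloud endpoint and credentials below are placeholders.
    import openstack

    conn = openstack.connect(
        auth_url="http://controller.example.com:5000/v3",   # Keystone (identity)
        project_name="demo", username="demo", password="secret",
        user_domain_name="Default", project_domain_name="Default",
    )

    for server in conn.compute.servers():      # Nova
        print("server:", server.name)
    for image in conn.image.images():          # Glance
        print("image:", image.name)
    for net in conn.network.networks():        # Neutron
        print("network:", net.name)
    for vol in conn.block_storage.volumes():   # Cinder (exposed as conn.volume in older SDK releases)
        print("volume:", vol.name)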

Nova

Nova is the control plane, or management layer, for the compute environment. Your instances run on top of Nova. Under Nova is your hypervisor environment; most hypervisors (as I mentioned earlier) are supported, including VMware, Hyper-V, and KVM. For each hypervisor instance there is a corresponding Nova instance, so the hypervisor acts as the data layer and Nova is the management layer. This is an important distinction to understand going forward. Those with a Cisco background understand control and data planes: think of the control plane as the management layer and the data plane as the data layer in a Nova compute environment.

Nova Networking

Nova comes with standard networking components: L2/L3 connectivity, DHCP services, and VLANs. Networks can range from simple flat networks to multi-VLAN networks, which gives most organizations the flexibility they need in an OpenStack deployment. However, Nova runs into some scalability issues in larger deployments, and that is where Neutron comes into play.

Neutron Networking

Neutron is a software-defined networking platform designed for larger organizations and service providers. As mentioned before, Nova networking has some inherent scalability limits which could cause challenges for large OpenStack deployments. One of the biggest was the 4,096 VLAN limit, which for most organizations is more than enough but could be a challenge for larger service providers. It also didn't do well from an interoperability standpoint outside of the OpenStack environment. Finally, the code base for Nova was becoming by far the largest of the projects, and it quickly made sense to create a separate project to handle networking.

Neutron introduced support for advanced features like stretched and overlay networks. This opened the door for third-party plugins, which allowed for seamless integration with networks outside of the OpenStack ecosystem. Some of the newer encapsulations introduced were VXLAN and GRE.

Keystone

Keystone is OpenStack's identity service. Identity services provide a means to authenticate and authorize users in an OpenStack cloud environment; these can be users accessing a specific application or service. The OpenStack identity service uses a token-based system as a way to authorize a user to a specific set of resources. Keystone can be standalone or can integrate with Microsoft Active Directory.
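The token exchange itself is just an HTTP call against Keystone's v3 API. The sketch below requests a project-scoped token with the requests library; the endpoint, user, and project names are placeholders.

    # Sketch: request a project-scoped token from Keystone v3 (names and URL are placeholders).
    import requests

    body = {"auth": {
        "identity": {"methods": ["password"],
                     "password": {"user": {"name": "demo",
                                           "domain": {"name": "Default"},
                                           "password": "secret"}}},
        "scope": {"project": {"name": "demo", "domain": {"name": "Default"}}},
    }}

    resp = requests.post("http://controller.example.com:5000/v3/auth/tokens", json=body)
    token = resp.headers["X-Subject-Token"]   # sent as the X-Auth-Token header to other services
    print("token issued, expires:", resp.json()["token"]["expires_at"])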

Glance

If you are familiar with any hypervisor platform, then you are familiar with virtual machine images, and this is where OpenStack Glance comes into the picture. Images allow you to create prebuilt virtual machine templates for rapid deployment of virtual machine guests. Glance can manage images at a global or tenant level depending on the use case. Images are typically stored in a dedicated volume, which can be located locally or in a public cloud like AWS.

CINDER

Cinder is the OpenStack block storage platform. It's comparable to AWS's Elastic Block Store: volumes are persistent when an instance (virtual machine) is terminated, unlike ephemeral storage, which is deleted when an instance is terminated. A lot of storage vendors are contributing to the Cinder platform, allowing for growth in features and functionality.

Cinder attaches and detaches volumes using iSCSI. A wide variety of features are available, including snapshots and cloning. The feature list of Cinder will continue to grow as the project matures and integration with manufacturers like NetApp, EMC, and others improves.

SWIFT

Swift is the OpenStack object storage project. It is similar to Amazon's S3. Object storage allows applications to access web objects, which makes accessing data a whole lot simpler for developers. Swift has been around since the beginning of OpenStack and can be used as a standalone product. Swift can leverage commodity servers and disk drives (JBODs) to create a distributed storage environment that is highly scalable. RAID is not needed in a Swift deployment; instead it leverages three-replica or erasure coding data protection schemes. Three-replica, or object copy, is pretty straightforward: there are three copies of each object in the object storage environment, which can be spread across three data centers. This is effective but also carries a lot of overhead. Erasure coding removes some of the overhead by striping the data across nodes or data centers, much like RAID 5 does across disk drives, and is far more efficient from a capacity standpoint. For more info on object storage please check out my post here.

Whew! That was a lot of information! As you can see, OpenStack is a very complex and extensive solution, and we just barely scratched the surface; each of these areas could easily be a week-long course. My goal was to get you familiar with the terminology and understand at a high level how OpenStack works. I personally believe that as the technology matures we'll see more adoption. At this time large organizations and service providers are the most likely candidates; smaller shops just don't have the time or the expertise to put into OpenStack.

What does the NetApp purchase of SolidFire Really Mean?

Just before Christmas there was an announcement that NetApp had purchased SolidFire for $870 million in cash. The announcement flew under the radar for most of the industry due to the holiday season. It came as somewhat of a surprise to me that NetApp decided to pull the trigger on purchasing a flash storage vendor. I remember hearing rumblings a few months ago that FlashRay wasn't going well for NetApp and began to wonder if NetApp was going to pull back and reevaluate their storage play. However, I still didn't think NetApp was going to go the acquisition route; I figured they would push forward with their current flash offerings.

Why the Purchase?

There was a good podcast by Justin Warren at The Eigencast where he interviews both George Kurian, CEO of NetApp, and Dave Wright, CEO of SolidFire. George Kurian was pretty transparent when he discussed FlashRay, the challenges NetApp was having internally getting the product to match their vision, and the challenges of the market. The bottom line: FlashRay was not ready for primetime, further delays were going to hurt NetApp in the flash marketplace, and NetApp needed an acquisition to make up for lost time. So here comes SolidFire!

Brief History of NetApp All-Flash

I thought it would be interesting to look at NetApp's history in all-flash. NetApp has been leveraging hybrid flash for some time but is fairly new to the all-flash storage market. Just a note: you could do an all-flash FAS prior to the release of the FAS8000 series, but NetApp didn't productize it like they did with the AFF platform back in 2014 when the new FAS8000s came out.

All-Flash FAS (AFF)

Starting with the All-Flash FAS (AFF): this is what NetApp calls their enterprise platform. It runs the Data ONTAP operating system and utilizes WAFL, which has been the core of their product line for quite some time. The bottom line is that all the software features we have grown to love with NetApp are available with the AFF platform.

EF-Series

NetApp acquired LSI's Engenio storage division back in 2011. The E-Series storage platform was designed for environments that need a simple, fast, block-only storage platform without the bells and whistles. The E-Series doesn't run Data ONTAP; it runs SANtricity. The EF-Series is the all-flash E-Series, and it brings high performance but little in the way of feature functionality. The big difference between the EF-Series and AFF is that the EF-Series has lower latency and higher IOPS, but for most customers the AFF will meet their performance requirements.

FlashRay

NetApp's FlashRay was supposed to address the big knock on NetApp, which has been ease of use. The big success that Pure and other flash startups have had is ease of use, which has especially resonated with environments that don't have a storage expert or storage team on staff. FlashRay also ran another operating system, called Mars OS, which gave NetApp a third operating system. The reason for not using ONTAP was its inherent complexity and big memory footprint, but now, with the purchase of SolidFire, we can officially say that FlashRay is dead.

SolidFire

Now comes SolidFire. SolidFire's role at NetApp will be to replace FlashRay. SolidFire is a highly distributed architecture that will fill the gap that AFF and the EF-Series can't fill. It also addresses the main goal of FlashRay, which was ease of use from a management perspective. NetApp plans on keeping the SolidFire name and hopes this will keep Pure, XtremIO, and others from taking market share. I will be doing a more detailed blog post on SolidFire in the future.

What does it mean for NetApp going forward?

NetApp was late to the game when it came to all-flash. During NetApp Insight they were very honest in saying that they didn't give the market what it was demanding, and that opened the door to the competition. They instead were too focused on hybrid flash solutions like Flash Cache and Flash Pools, which are still very relevant but lack the sexiness of flash-only. In NetApp's defense, most customers don't need all-flash, but what they need and what they want can be two very different things. Plus, companies like Pure did a great job marketing flash and finding creative ways to make their product affordable. The big question I have about the SolidFire acquisition is how managing three software platforms will impact NetApp going forward; this was the same question I had about FlashRay as well. What attracted me to NetApp, like other customers, was the one software operating system that was consistent across platforms. That made the product simple to digest and a lot more appealing than EMC, which seemed to have two or three of everything. I know that NetApp started to realize that Data ONTAP wasn't a fit for every customer, which makes some sense, and having other offerings will help address customer needs without confusing them in the process. We'll have to see how this strategy plays out.

What does the Dell Acquisition of EMC mean to the Industry?

Last month the biggest tech deal of all time took place and sent shock waves through the industry: Dell buys EMC for $67 billion. EMC will go private after a long stint as a publicly traded company and will be run by CEO Michael Dell. Long-time EMC CEO Joe Tucci will retire. VMware will still operate as a separate organization and will remain publicly traded. What makes this deal so fascinating is the fact that Dell is the much smaller company; Dell's value is almost a third of EMC's purchase price.

I have been on both sides of the fence as a partner and a customer when it comes to Dell and EMC. The two companies are very different from a cultural standpoint, and it will be interesting to see how two very large companies assimilate. It is my belief this could be a huge issue in the coming years and could potentially cause a lot of turnover. EMC has the reputation in the field of being cutthroat: they have no problem going around someone even if it ultimately harms the relationship, which is typically due to aggressive numbers that put a lot of pressure on the sales teams. EMC also puts a high value on their product, which typically translates to a higher price. Dell, on the other hand, has always been a little more laid back in their approach. It always seemed to be more about the price than the technology, and most customers looking at Dell are doing so because of that reputation for being price sensitive, which a lot of times puts them more in the commercial or SMB space – a space where EMC doesn't typically play.

Neither EMC nor Dell can be classified as industry innovators; they have grown through acquisitions. Some of EMC's most notable acquisitions were Data General, Data Domain, and Isilon. Dell's were EqualLogic and Compellent. EMC gives Dell more of an enterprise storage and backup play. With EqualLogic already being phased out, my guess is that Compellent will be next and Dell will keep the majority of EMC's storage portfolio; the Dell products will either be dropped or sold off to pay down debt. From all reports, it sounds like VMware will still operate independently like it did under EMC. I wonder if VMware will be sold off, or if Dell will try to control the company by purchasing shares of its stock. VMware made up the majority of EMC's valuation and has been the leader in virtualization for the past 10 years.

So what does the future hold for Dell? It’s hard to say how things will look after the dust settles. Dell will now have the full stack which will allow them to better compete with HP, Cisco and Oracle. I do see the EMC and Cisco relationship starting to fade long term. It sounds like in the short term VCE and the VBlock will continue to be supported but in my opinion the future is uncertain for both. My guess is Michael Dell wants to move towards a “one vendor for all” model like HP in the next several years. If that’s the case, I see the end for both VCE and VBlock.

How does Dell's acquisition of EMC impact their competition? Well, I think this is a good thing, especially for NetApp, which has had its own set of challenges, the biggest being the move from 7-Mode to clustered ONTAP. I think for the next couple of years Dell is going to be focused on merging two very complex companies and large product lines, so it's going to take some time before they start gaining momentum. This move could make some existing EMC accounts more vulnerable, especially those die-hard EMC accounts that don't necessarily like Dell's technology.

It will be an interesting couple of years to see how things evolve with Dell. Dell acquired a lot of debt with this acquisition, and I'm curious how this will impact R&D in the long run. Plus, it will be interesting to see how or if Dell's competition will answer. Will someone else do a large acquisition of their own? I still think there is a possibility that Cisco will purchase NetApp one day; Cisco will need to make a move on a storage vendor as the industry starts to consolidate more. In my opinion, the NetApp and Cisco relationship had started to weaken a bit even prior to the EMC acquisition, mainly due to the fact that Cisco decided to diversify its partnerships with storage vendors like Nimble and Pure. I guess only time will tell.

An Introduction to Object Storage

What is object storage? Object storage is a term used to describe a new storage paradigm. It was created to address the new challenges we are starting to see around the large growth in unstructured data, which some analysts say will grow 40% in the next 10 years. Today's traditional storage technologies aren't able to scale to deal with this growth. But before we go any deeper with object storage, let's get a refresher on file- and block-based storage architectures. It's important to understand these technologies first and how cloud storage is different.

SAN Storage – This technology has been around for decades and has been the foundation for both direct-attached storage (DAS) and storage area network (SAN) technologies. In block storage, blocks are numbered and stored in a table, and the OS references the table to access the appropriate block(s). In Windows this was FAT or NTFS, and in the UNIX world it was called the superblock. This model was limited to the OS or kernel level. The challenge here was scalability from the server and file system perspective. Storage arrays addressed some of the challenges by centralizing storage and allowing for more growth, but we still had operating system and file system constraints.

NAS Storage – This technology presents files over the network using SMB (CIFS) and NFS. It still references blocks and uses a file system like WAFL, ZFS, etc. This model functions at the user level versus the host level; think of it as being one layer above the SAN storage example above. You still need a block storage device, but you are using a protocol like SMB or NFS to access the data and not iSCSI or Fibre Channel. Compared to SAN, NAS storage is typically easier to manage and can hide some of the complexities of SAN or block-based storage from the user and administrator. Plus, you aren't constrained at the OS or server level with this particular solution. Now that NAS performance is comparable to iSCSI and FC, it is a great option for a lot of workloads.

Object Storage – This is the new paradigm. It is similar to NAS but uses objects, not files, and makes them available via HTTP using a protocol called REST. REST is a lightweight protocol for accessing data over the web that sends command operators (GET, PUT, DELETE, etc.) via HTTP. Object storage still uses block storage under the covers, much like NAS, but in a much simpler format. It also uses a different form of data protection to address some challenges in older RAID technology. In addition, it allows multiple software platforms to access data without dealing with the complexities of FC, iSCSI, NFS, or SMB. So from a software developer's perspective it makes accessing data a whole lot simpler.

High-level View

Your typical object storage environment consists of a cluster of Linux servers or appliances (sometimes called nodes) behind a load balancer. When a request comes from the client, the load balancer (depending on the algorithm) determines which node will receive the request. In this example, there are six nodes. The request comes in and is sent to node one. The client writes the file to node one, and the cluster makes copies of this file on node three and node six. This provides redundancy and replaces the need for RAID in this solution. This is called a three-object copy. All copies of files are available for both read and write functions, which is important. These nodes could be in the same data center or in geographically different locations. This is a fairly simplified design, but it gives you a high-level view of how object storage works.

What is REST?

We briefly touched on REST earlier in this post. Representational State Transfer (REST) is a newer approach that was originally designed to solve some of the challenges in the world of web development, more specifically around better ways of doing web services and the challenges of using a protocol like Simple Object Access Protocol (SOAP). What made REST appealing for object storage was its extremely lightweight and highly customizable nature. Amazon S3 is a customized version of REST that was developed by Amazon Web Services (AWS), and S3 is quickly becoming the protocol of choice for object storage. There are other protocols out there, like the Cloud Data Management Interface (CDMI) jointly developed by SNIA and the OpenStack object storage protocol (Swift), but it's clear that for now S3 is the leader.
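Because S3 is just REST with a well-known set of operations, working with an S3-compatible object store from code is very compact. Here is a hedged sketch using the boto3 library; the endpoint, bucket, and credentials are placeholders, and S3-compatible platforms vary in which features they actually implement.

    # Sketch: basic object operations against an S3-compatible endpoint (placeholders throughout).
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://objects.example.com",   # any S3-compatible object store
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    s3.put_object(Bucket="demo", Key="reports/2016.txt", Body=b"hello object storage")  # HTTP PUT
    obj = s3.get_object(Bucket="demo", Key="reports/2016.txt")                          # HTTP GET
    print(obj["Body"].read())
    s3.delete_object(Bucket="demo", Key="reports/2016.txt")                             # HTTP DELETE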

Protection Types:

Replica – Replica is the object copy model and is the most rudimentary protection method. For every file written to a node in the cluster, two more copies are written to other nodes in the cluster. This is fairly simple to implement but triples the amount of raw capacity needed, which isn't very efficient. The most common is the “three copies” protection scheme. This is still widely used by a lot of cloud providers but doesn't have much of a future in the enterprise space.

Erasure Coding – Erasure coding was a technology developed by NASA for deep space telemetry because of the high rate of signal loss in space. The algorithm was capable of reconstructing a signal even with 30% to 40% signal loss. It was later found that this algorithm works well as a data protection scheme in distributed cloud storage environments, where you are dealing with the challenges of WAN connectivity.

How does erasure coding work? In an object storage model, erasure coding splits the file into segments and adds a hash to each segment. This hash creates a file protection mechanism (metadata) that doesn't double or triple the file size like we saw with the replica model. This saves disk space, which makes it a more cost-efficient and better-performing model in most deployments. It can also survive several device failures, depending on how it is deployed. Because of this protection scheme, large object deployments are typically its sweet spot.
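A quick capacity comparison shows why erasure coding is attractive at scale. The example below assumes a hypothetical 8+4 layout (eight data fragments plus four parity fragments); real products use a variety of schemes.

    # Hypothetical example: protect 100 TB of usable data.
    usable_tb = 100

    replica_raw = usable_tb * 3                      # three full copies
    ec_data, ec_parity = 8, 4                        # an assumed 8+4 erasure coding layout
    ec_raw = usable_tb * (ec_data + ec_parity) / ec_data

    print("3-copy replica raw capacity:", replica_raw, "TB")   # 300 TB
    print("8+4 erasure coding raw capacity:", ec_raw, "TB")    # 150.0 TB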

The challenge with large deployments that span geographic locations is that they require a lot of backend infrastructure to support the design, even with erasure coding. Depending on the rate of change, the amount of data that traverses the WAN could potentially be quite substantial. So replication of data is still typically used for situations where data needs to be sent to a remote location, with erasure coding taking place at a single location.

NoSQL DB – NoSQL is a non-relational database designed for large-scale deployments, where unstructured data is stored across multiple nodes or servers. This is a distributed architecture that scales horizontally as data grows with no decrease in performance. This is the opposite of relational databases, which typically require more compute horsepower and only scale vertically, meaning you need a larger box when you have consumed all resources. NoSQL originally took off with the growth in Web 2.0 applications, but we're now seeing it used in big data applications and cloud storage. The ability of NoSQL to scale makes it a good choice for object storage where small objects are used.

Use Cases:

Object storage still hasn't been widely adopted by the enterprise. Part of the challenge is its limited use cases and the small number of products available on the market. As mentioned earlier, object storage was designed to address a need in software development to access data without the complexities of other, older technologies, and to address the massive growth in unstructured data. We should start to see more adoption of these types of technologies in the next 5 to 10 years; analysts are predicting unstructured data to grow as much as 20 to 40 times, and the only way to address this massive growth is object storage. Backup and archive is another area where we could see object storage take off in the next few years. It's a great way to back up data to disk and start phasing out that tape environment. We are already starting to see a lot of backup vendors developing gateways; you need what is called a gateway server or appliance to convert NAS/SAN-based protocols to REST or S3. At some point you'll see media servers natively support this function as it grows in popularity. This was the same challenge backup vendors had when writing to disk 10 years ago: a VTL had to be used to present disk-based storage because backup software could only write to tape.

Conclusion

As mentioned earlier, object-based storage is a great solution for unstructured data, backup, and archive. However, it is not a good option for virtualization, databases, and other high-I/O workloads; block- and file-based storage that's flash optimized is a better solution for those types of workloads. I do see the gap shrinking between object-based storage and the more traditional methodologies, e.g. SAN and NAS, in the next 10 years.

Where does All-Flash Storage Make Sense?

This is the million-dollar question for a lot of organizations. Flash storage is an exciting technology, and it has gotten a lot of momentum in the last couple of years. The goal of this article is to dive into the use cases for all-flash storage and how to determine whether you need an all-flash array or whether a hybrid array would be a better option.

History of Flash

Before we dive into the all-flash and hybrid storage discussion, let's take a brief look at the history of SSD and flash technology. There is some confusion around the terms Flash and SSD. Let's start with the term SSD. SSD stands for solid-state drive. SSDs are not composed of any moving parts like a hard-disk drive (HDD); HDDs have platters that spin and a head that moves across them to read data. SSDs were originally DRAM based, using capacitors to hold memory state. Then SSDs moved to flash-based technology, using NAND to hold the memory state. Once SSDs moved to NAND technology, SSD and flash became the same thing. Now all SSDs are flash based, so the terms can and will be used interchangeably.

Flash-based SSD drives are broken into four categories. TLC (Triple-Level Cell) drives store three bits per cell and are typically used in consumer equipment, not the enterprise, so we won't spend a whole lot of time here; the bottom line is that they lack the performance and reliability needed in the enterprise. SLC (Single-Level Cell) drives are high-performance drives and are typically the most costly. SLC stores one bit per cell, which increases reliability (in theory), but the trade-off is a decrease in capacity. MLC (Multi-Level Cell) drives have average performance and less reliability when compared to SLC drives. MLC stores two bits per cell, which increases capacity but also increases the wear and tear on the drive. eMLC (Enterprise Multi-Level Cell) drives offer better performance and endurance than standard MLC drives, increased capacity, and a better price point than SLC drives. In other words, eMLC bridges the gap between MLC and SLC, giving you almost the best of both worlds.

Types of Flash Storage:

Now let's elevate the topic and discuss the different options around Flash technologies. I think it's important to understand things at a high level before getting into vendors and features.

All-Flash Arrays are composed entirely of SSDs and can deliver high performance for certain workloads. With the price of SSD drives dropping over the last several years, we have seen a huge jump in All-Flash storage arrays from both the big storage giants and the smaller startup players. The cost of SSDs is still higher than HDDs, but some vendors have incorporated deduplication and compression into their sizing to bridge the gap. This somewhat controversial strategy has made the effective cost per GB comparable with spinning drives. These two catalysts have lowered the barrier of entry for Flash, and it's now more accessible than ever to smaller shops.
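
As a back-of-the-napkin illustration of how data reduction changes the math, here is a minimal sketch with assumed prices and an assumed reduction ratio; real numbers vary by vendor and by workload.

```python
# Effective cost per GB after deduplication/compression.
# Prices and the 4:1 reduction ratio are illustrative assumptions only.

raw_flash_cost_per_gb = 2.00   # assumed $/GB for raw SSD capacity
raw_hdd_cost_per_gb = 0.50     # assumed $/GB for raw 10K SAS capacity
data_reduction_ratio = 4.0     # assumed 4:1 dedupe + compression

effective_flash_cost = raw_flash_cost_per_gb / data_reduction_ratio
print(f"Effective flash cost: ${effective_flash_cost:.2f}/GB vs ${raw_hdd_cost_per_gb:.2f}/GB for HDD")
```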

Hybrid Arrays are a combination of HDDs and SSDs. They do a great job of increasing performance by using Flash while allowing for growth in capacity using HDD. This is achieved in a couple of different ways. One approach leverages a small amount of Flash for random reads (which is what Flash is good at) while writing everything else to either SAS or SATA drives; this has been around for some time. Another method captures random reads and writes in Flash and then writes the data sequentially to SATA. This method lowers the cost per GB even more because it removes the need for 10K or 15K SAS drives. Both approaches are a great fit for most environments and can support diverse workloads and requirements.
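
Here is a minimal sketch of the first approach: serving random reads from a small flash tier when possible and falling back to the HDD tier on a miss. It is a conceptual illustration of the caching idea, not how any particular vendor implements tiering.

```python
# Conceptual read path for a hybrid array: a small flash cache in front of HDD.
from collections import OrderedDict

class HybridReadPath:
    def __init__(self, cache_blocks: int):
        self.flash_cache = OrderedDict()   # LRU cache of hot blocks
        self.cache_blocks = cache_blocks

    def read(self, block_id: int) -> str:
        if block_id in self.flash_cache:
            self.flash_cache.move_to_end(block_id)      # keep hot blocks hot
            return "flash hit (sub-millisecond)"
        # Miss: read from spinning disk, then promote the block into flash.
        self.flash_cache[block_id] = True
        if len(self.flash_cache) > self.cache_blocks:
            self.flash_cache.popitem(last=False)         # evict the coldest block
        return "HDD read (several milliseconds), promoted to flash"

array = HybridReadPath(cache_blocks=2)
print(array.read(7))   # first access comes from HDD
print(array.read(7))   # repeat access is served from flash
```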

Traditional Arrays are composed entirely of HDDs and have been the staple of the industry for over 10 years. They still make sense for a lot of environments and can support a variety of workloads. Both SAS and SATA still have a place: SAS is a great option for more performance-demanding workloads, and SATA is great for dense capacity needs. However, there are a couple of downsides we're starting to see. The old way of sizing for performance is no longer the best option, because increasing spindle count to size for IOPS doesn't scale very well. With larger-capacity drives (up to 6TB for SATA and 1.2TB for SAS), you can end up buying a lot of capacity just to reach the spindle count needed to meet an application's IOPS requirement. The other challenge is that the more spindles you throw at an application, the more latency you can introduce.
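
To see why sizing by spindle count stops scaling, here is a minimal sketch using commonly cited per-drive IOPS figures. The per-drive IOPS numbers, capacities, and the 10,000 IOPS requirement are rough rule-of-thumb assumptions, not vendor specifications.

```python
# Spindle-count sizing: how many drives does a 10,000 IOPS requirement take?
# Per-drive IOPS and capacities are rough rule-of-thumb assumptions.
import math

required_iops = 10_000
drives = {
    "10K SAS (1.2TB)": {"iops": 140, "capacity_tb": 1.2},
    "7.2K SATA (6TB)": {"iops": 80,  "capacity_tb": 6.0},
}

for name, d in drives.items():
    count = math.ceil(required_iops / d["iops"])
    print(f"{name}: {count} drives, {count * d['capacity_tb']:.0f} TB raw just to hit the IOPS target")
```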

Flash, Performance, and Workload:

So here is the good stuff. What workloads benefit the most from Flash storage or a Flash/hybrid mix? Typically, it's small-block, random-I/O, latency-sensitive applications that will see the most benefit from any type of Flash storage. Let's break it down by application and workload.

Databases (SQL, Oracle, SAP HANA) – Databases are typically latency sensitive and very I/O intensive. They traditionally see a lot of benefit from Flash and will perform better. The challenge comes when there are inefficiencies in the database that are being masked by hardware. This ends up being a "kick the can down the road" approach that will most likely come back to bite you at some point. Identifying and fixing poor database design will help improve performance as well.

Server Virtualization (VMware, Hyper-V) – Virtualization workloads are small, random I/O that will also benefit from Flash. The VM system .vmdk files are mostly read, while the data .vmdks depend on the application running on the server.

Desktop Virtualization (Horizon View, Citrix XenDesktop) – Just like server virtualization, desktop virtualization is highly random I/O that tends to be bursty, especially during boot storms and virus-scanning operations.

Unstructured Data – Most file server data won't really see a lot of benefit from Flash; this workload typically isn't I/O intensive.

Disk-to-Disk Backups – Backup data typically streams sequentially, which isn't a good fit for Flash.

Know Your Workload:

How do you better understand your workload? There are tools out there that can measure performance at both the storage array level and the OS/application layer. Any time you are monitoring performance, you need to collect a baseline. The longer you can run a performance-gathering tool, the more accurate your data will be. It's a good idea to gather at least a month's worth of data to capture any monthly business processes that might otherwise be missed. Once you have this data, evaluate your IOPS, latency, and read/write ratio and see if there is any way to improve performance. Use this data in conjunction with what you are hearing from your users or customers. If your users or customers aren't complaining about performance issues, why make an investment in Flash storage? Sometimes we tend to over-architect our environments when there isn't a need. This is why, from a business perspective, it's important to understand your end-user experience coupled with any performance data you have. Decreasing latency in an environment that isn't experiencing any application latency issues isn't a good business decision, especially if it incurs additional cost.
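
Once you have baseline samples, the evaluation itself is simple arithmetic. Here is a minimal sketch, using made-up sample data, of the kind of summary worth producing: average and peak IOPS, read/write ratio, and average versus peak latency.

```python
# Summarizing a performance baseline: IOPS, read/write ratio, latency.
# The per-interval sample data below is made up for illustration.
import statistics

read_iops  = [800, 950, 1200, 700, 4000, 900]
write_iops = [300, 280, 350, 260, 900, 310]
latency_ms = [2.1, 2.4, 1.9, 2.2, 8.5, 2.0]

total_iops = [r + w for r, w in zip(read_iops, write_iops)]
read_ratio = sum(read_iops) / (sum(read_iops) + sum(write_iops))

print(f"Avg IOPS: {statistics.mean(total_iops):.0f}, Peak IOPS: {max(total_iops)}")
print(f"Read/Write ratio: {read_ratio:.0%} reads")
print(f"Avg latency: {statistics.mean(latency_ms):.1f} ms, Peak latency: {max(latency_ms)} ms")
```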

In conclusion, we covered a broad spectrum of topics. We started with the history of SSDs and Flash technology. Then we moved into the Hybrid and All-Flash array overview and how the technologies differ. Finally, we discussed which workloads work best with Flash and how to determine whether it's needed for your environment.

How to build a Private Cloud in Seven Steps

Here are the seven steps in building a private cloud: Consolidation, Virtualization, Standardization, Service Levels, Automation, Orchestration, and Chargeback/Showback. Each of these key areas needs to be achieved before you can move to a private cloud. As a CIO or IT Director, it's important to take a step-by-step approach. Before you jump in, however, it's a good idea to perform a cloud enablement assessment of your environment to get an idea of where you are from an organizational and infrastructure standpoint before you move toward your goal of a private cloud.

  1. Consolidation: Consolidation has been around for a long time. It started with the move to centralized storage in the 1990s. Both Storage Area Networks (SAN) and Network Attached Storage (NAS) gained momentum as a way for an organization to scale storage based on the needs of the business rather than being confined to a single server. That was followed by a move in server technology toward blade servers, and multicore CPU technology a few years later created an environment for dense compute in a relatively small form factor. This set the stage for server virtualization to take off, which we will cover in the next section. Last but not least, there was a move to 10Gb Ethernet networks (and now 40Gb and 100Gb) that allows for the consolidation of several 1Gb Ethernet connections with different protocols onto a single wire; Cisco calls this the unified fabric. All of these need to be in place.

  2. Virtualization: Like consolidation, we have seen virtualization take off in the last 10 years, and it's now pretty common to see some sort of virtualization in place. Server virtualization was the first to be widely adopted last decade with VMware. With most applications now supporting server virtualization, there really isn't a reason not to virtualize unless there's a unique requirement. With that said, server virtualization is a critical step in moving toward the internal cloud, and getting your environment as close to 100% virtualized as possible will help in that process. Virtualization is not confined to servers: both network and storage now support some form of virtualization. Software Defined Networking (SDN) elevates the network and makes it better suited for the automation and orchestration phases; both Cisco ACI and VMware NSX are players in this space. The same goes for storage, where Software Defined Storage (SDS) technologies let you abstract the hardware, allowing for unified management and functionality. I see both of these continuing to evolve in the coming years.

  3. Standardization: You can't have a cloud strategy without standards in place. This is a critical step and is where a lot of organizations get tripped up. What are standards? Standards are any formal documentation that describes a process within an IT organization. A standard can document a procedure or define a group of strategic vendors. Either way, standards bring predictability into the organization and prevent "one-offs" in the data center.

  4. Service Levels: A Service Level Agreement (SLA) defines a group of metrics that measure the service an IT department provides to an application owner, business unit, or department. The metrics typically include uptime, response time, and performance benchmarks, along with a change management process for planned outages. Like standards, these need to be in place.

  5. Automation: Automation at some level has been around in IT and the data center for years. Scripting has been the tool of choice for systems administrators to automate repetitive tasks, freeing up time to focus on higher-level, project-oriented work. In a private cloud, this is taken to a new level. Automation is key, but it can be a challenge to deploy at large scale. The automation products you use and how they're deployed will depend on the standards your organization has developed and your underlying architecture.

  6. Orchestration: Once automation is complete, it sets the stage for the orchestration phase. Orchestration is the ability to provision resources based on requests from the business using a self-service model. However, before that can take place, a service catalog needs to be created with defined workflows. Again, this will depend on the standards your organization has in place and the underlying architecture. Standardizing on a few vendors will help in the integration and deployment process. Cisco, VMware, and others have developed orchestration products, and automation and orchestration tools are typically coupled together.

  7. Chargeback/Showback: This seems like a pipe dream for most IT Managers and CIOs, but it's an important step in the overall cloud strategy. The chargeback model is pretty straightforward: it is typically employed by hosting providers that charge tenants a monthly fee for resource consumption, and it has been around for some time. But there has been some momentum outside of the service provider space to position IT as a profit center in the corporate enterprise. This can be an aggressive goal if IT hasn't operated in such a role, and it typically requires a massive shift in thinking for the organization. The showback model tends to work better for most enterprise environments and is simpler to implement, but it still has its own set of challenges. The showback model allows IT to track resource consumption in a monetary format, which lets IT justify costs for budgetary purposes and show who is consuming the most resources in the enterprise (see the sketch after this list).

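As a concrete illustration of the showback model described in step 7, here is a minimal sketch that prices a department's monthly consumption using assumed unit rates. The rates, resource categories, and department figures are placeholders, not a recommended pricing model.

```python
# Showback: express resource consumption in a monetary format.
# Unit rates and the consumption figures are illustrative assumptions.

unit_rates = {"vcpu": 15.00, "ram_gb": 5.00, "storage_gb": 0.10}  # $/month

consumption = {
    "Finance":     {"vcpu": 24, "ram_gb": 96,  "storage_gb": 4000},
    "Engineering": {"vcpu": 64, "ram_gb": 256, "storage_gb": 12000},
}

for dept, usage in consumption.items():
    monthly = sum(usage[resource] * unit_rates[resource] for resource in usage)
    print(f"{dept}: ${monthly:,.2f}/month")
```
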
Technology is only one piece of the private cloud puzzle. People, process, and unfortunately politics also play a big role. Find out what the organization's goals are and see how a move toward a private cloud will help address those challenges. Once those are defined, you can approach it from a technology perspective.