
Hyper-converged infrastructure – Part 2 : Planning a Cisco HyperFlex deployment

I recently got the chance to deploy a Cisco HyperFlex solution that is composed of 3 Cisco HX nodes in my home lab. As a result, I wanted to share my experience with that new technology (for me). If you do not really know what all this “Hyperconverged Infrastructure hype” is all about, you can read an introduction here.

Cisco eased our job by releasing a pre-installation spreadsheet, and it is very important to read that document with great attention. It will allow you to prepare the baseline of your HC infrastructure. The installation is very straightforward once all the requirements are met. The HX infrastructure has an important peculiarity: it is very very very (did I say very) sensitive. If one single requirement is not met, the installation will stall and you will be in a delicate situation because you may have to wipe the servers and restart the process. As a result, you could lose precious hours.

Cisco has a way to automate the deployment and to manage your HX cluster. The HX installer will interact with the Cisco UCSM, the vCenter, and the Cisco HX servers.

It is especially relevant to note that the Cisco HX servers are tightly integrated with all the components described in the picture below:

HyperFlex Software versions.

As usual with this kind of deployment, you have to make sure that every version running in your environment is supported. We will run the 2.1(1b) version in our lab and will upgrade to 2.5 at a later time. We need to make sure that our FI UCS Manager is running 3.1(2g).

In addition, the dedicated vCenter that we will use is running release 6.0 U3 with Enterprise Plus licenses.

Nodes requirements.

You cannot install fewer than 3 nodes in a Cisco HyperFlex cluster. Because the HX solution is very sensitive, it is mandatory to have some consistency across the nodes regarding the following parameters:

  • VLAN IDs
  • Credentials 
  • SSH must be enabled
  • DNS and NTP
  • VMware vSphere installed.

Network requirements.

First of all, the HyperFlex solutions require several subnets to manage and operate the cluster.

We will segment these different types of traffic using 4 VLANs:

  • Management Traffic subnet: This dedicated subnet will be used in order for the vCenter to contact the ESXi server. It will also be used to manage the storage cluster.
    • VLAN 210: 10.22.210.0/24
  • Data Traffic subnet: This subnet is used to transport the storage data and HX Data Platform replication
    • VLAN 212: 10.22.212.0/24
  • vMotion Network: self-explanatory
    • VLAN 213: 10.22.213.0/24
  • VM Network: self-explanatory
    • VLAN 211: 10.22.211.0/24
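
Since the installer is so picky, it can help to sanity-check the plan before typing it into the spreadsheet. The snippet below is only a minimal sketch: the `VLAN_PLAN` names and the `validate_plan` helper are my own illustration (not part of any Cisco tool), and it only checks that VLAN IDs are unique and that the subnets do not overlap.

```python
import ipaddress
from itertools import combinations

# VLAN plan copied from the list above: name -> (VLAN ID, subnet).
# The dictionary keys are labels chosen for this example only.
VLAN_PLAN = {
    "management": (210, "10.22.210.0/24"),
    "vm-network": (211, "10.22.211.0/24"),
    "data": (212, "10.22.212.0/24"),
    "vmotion": (213, "10.22.213.0/24"),
}


def validate_plan(plan):
    """Return a list of problems found in the plan (an empty list means OK)."""
    problems = []

    # Every traffic type must live in its own VLAN.
    vlan_ids = [vlan_id for vlan_id, _ in plan.values()]
    if len(set(vlan_ids)) != len(vlan_ids):
        problems.append("duplicate VLAN ID")

    # No two subnets may overlap.
    networks = {name: ipaddress.ip_network(cidr) for name, (_, cidr) in plan.items()}
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")

    return problems


print(validate_plan(VLAN_PLAN) or "VLAN plan looks consistent")
```

If you change a subnet by mistake (for example a /23 that swallows its neighbor), the function reports which two entries collide.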

Here is how we will assign IP addresses to our cluster:

UCSM Requirements.

We also need to assign IP addresses for the UCS Manager Fabric Interconnect that will be connected to our Nexus 5548:

  • Cluster IP Address: 
    • 10.22.210.9
  • FI-A IP Address:
    • 10.22.210.10
  • FI-B IP Address:
    • 10.22.210.11
  • A pool of IP for KVM:
    • 10.22.210.15-20
  • MAC Pool Prefix:
    • 00:25:B5:A0
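
A quick way to double-check these values is to expand the KVM pool and confirm that every address sits in the management subnet (VLAN 210). This is a rough sketch under my own assumptions: `expand_range` and `outside_mgmt` are hypothetical helper names, and the range notation is the "last octet" shorthand used in the list above.

```python
import ipaddress

MGMT_SUBNET = ipaddress.ip_network("10.22.210.0/24")  # VLAN 210 from the plan


def expand_range(spec):
    """Expand '10.22.210.15-20' (last-octet range) into a list of addresses."""
    base, _, last = spec.rpartition("-")
    first_ip = ipaddress.ip_address(base)
    first_octet = int(base.rsplit(".", 1)[1])
    count = int(last) - first_octet + 1
    return [first_ip + offset for offset in range(count)]


def outside_mgmt(addresses):
    """Return the addresses (as strings) that are NOT in the management subnet."""
    return [str(ip) for ip in addresses if ip not in MGMT_SUBNET]


# Values copied from the list above.
UCSM_ADDRESSES = [
    ipaddress.ip_address("10.22.210.9"),   # cluster IP
    ipaddress.ip_address("10.22.210.10"),  # FI-A
    ipaddress.ip_address("10.22.210.11"),  # FI-B
]
KVM_POOL = expand_range("10.22.210.15-20")

print("KVM pool:", [str(ip) for ip in KVM_POOL])
print("Outside management subnet:", outside_mgmt(UCSM_ADDRESSES + KVM_POOL))
```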

 

DNS Requirements.

It is a best practice to use DNS entries in your network to manage your ESXi servers. Here we will use one DNS A record per node to manage the ESXi server. The vCenter, Fabric Interconnect and HX Installer will also have one.

The list below will show all the DNS entries I have used for this lab:

  • srv-hx-fi
    • 10.22.210.9
  • srv-hx-fi-a
    • 10.22.210.10
  • srv-hx-fi-b
    • 10.22.210.11
  • srv-hx-esxi-01
    • 10.22.210.30
  • srv-hx-esxi-02
    • 10.22.210.31
  • srv-hx-esxi-03
    • 10.22.210.32
  • srv-hx-installer
    • 10.22.210.211
  • srv-hx-vc
    • 10.22.210.210
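
Because a single wrong record can stall the installation, I find it worth verifying the records before starting. Here is a minimal sketch, assuming the short hostnames above resolve as-is from the machine where you run it (append your own DNS suffix otherwise). The `check_dns` helper is my own illustration; the resolver is passed in as a parameter so the logic can be tried without touching a real DNS server.

```python
import socket

# Hostname -> expected IP, copied from the list above.
EXPECTED = {
    "srv-hx-fi": "10.22.210.9",
    "srv-hx-fi-a": "10.22.210.10",
    "srv-hx-fi-b": "10.22.210.11",
    "srv-hx-esxi-01": "10.22.210.30",
    "srv-hx-esxi-02": "10.22.210.31",
    "srv-hx-esxi-03": "10.22.210.32",
    "srv-hx-installer": "10.22.210.211",
    "srv-hx-vc": "10.22.210.210",
}


def check_dns(expected, resolve=socket.gethostbyname):
    """Return {hostname: problem} for every record that is missing or wrong."""
    problems = {}
    for host, want in expected.items():
        try:
            got = resolve(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems[host] = f"lookup failed: {exc}"
            continue
        if got != want:
            problems[host] = f"resolves to {got}, expected {want}"
    return problems
```

Calling `check_dns(EXPECTED)` from a machine in the management VLAN should return an empty dictionary when every A record is in place.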

This sounds very basic, but it is CRITICAL that these steps are performed PRIOR to any deployment; otherwise you will waste a lot of time trying to recover (at some point you would have to wipe your servers and reinstall a custom ESXi image on each one).

In the next blog post, I will show how to install the vCenter, the Fabric Interconnect and the HX installer needed for the HyperFlex deployment.

In conclusion, do not hesitate to leave a comment to let me know if you encountered any issue while planning your deployment.

Thanks for reading!  

My CCIE Journey – Act II

In fact the title should be “My CCIE Journey – Act III” but I don’t want to use that one because I had a bad experience with the CCIE Voice lab exam 🙂

There are many (very good) links about that specific subject but I wanted to give my own opinion as well :). Here is a list (incomplete for sure) of the people that have blogged about their CCIE DC lab experience :

I shared my journey towards the CCIE RS in 2011 and I wanted to do the same for this one. I passed the CCIE DC lab exam one month ago and it was tough, long, hard, arduous, baffling, difficult, exacting, exhausting, hard (yeah I already used it, on purpose 🙂 ), intractable, perplexing, puzzling, strenuous, thorny, troublesome, uphill.

After I failed my CCIE Voice exam, my frustration was so high that I needed a break from Voice for a while. The Data Center exams had been released by Cisco and I had always wanted to be involved in a Data Center infrastructure project. I immediately decided to jump into the DC field and start to climb the (infinite) ladder.

At that time my DC infrastructure background wasn’t enough to pass the CCIE DC Written, so I decided to spend a year reading books and solidifying my knowledge.

First and foremost, the CCIE DC blueprint is like any CCIE blueprint: it is VERY large. As an expert that will face customers and other experts, you definitely have to dig very deep to understand what’s going on in every section of your infrastructure (Compute / Storage / Infrastructure).

In my previous CCIE Journey post I used this expression from Brian McGahan: “a CCIE journey is not a short race, it is a marathon”. 4 years later, this applies even more. If you have a family, you had better have a very supportive wife/husband. My wife is the most supportive person I’ve ever met.

We had our 3rd baby 10 months ago and my daughter couldn’t sleep at night. My wife was taking care of all 3 children 24/7 while I was studying. She even stayed at my parents’ home for several weeks to make my study time more efficient. After all, I can say that we are both CCIE RS-DC right now :). She deserves the title as much as I do … I am pretty sure that the CCIE exam is easier than taking care of the children. What I am trying to say here is that you have to be dedicated to this exam.

CCIE Written Preparation

As I already mentioned, I read LOTS and LOTS of books. I will give you my list very soon but first I would like to start with one of the best technical books I have read in my entire career.

Data Center Virtualization Fundamentals, written by Gustavo Santana, is definitely the best Data Center book out there. If you have some Routing and Switching skills, you have probably read the very famous Routing TCP/IP books (Volume 1 covers IGP and Volume 2 covers BGP, Multicast and IPv6). All I can say is that Santana is as awesome as Doyle. I don’t want to overemphasize but I really enjoyed every word of the book.


The other books are the following:

  • Cisco UCS (a bit outdated but still nice to understand)


I also read some free ebooks written by EMC and IBM. To me these 2 books regarding Storage Area Networks are great free resources:

I was almost ready to sit the CCIE DC Written exam but I decided to solidify all the theory I had gained throughout the year. In order to do that I took a look at CCIE training vendors.

I have had very good experiences with all the main vendors, and this is probably the most frequently asked question so far: “Which vendor did you use for your preparation?”

First, I never really picked a vendor. I prefer to choose an instructor. I went with INE and Micronics Training for my CCIE RS because I heard from close friends that Brian McGahan and Narbik were top-notch instructors (and they are). For my Voice studies, I went with IPX because Vik Malhi is the best Voice trainer I’ve ever met (since then, Vik has started his own training company, CollabCert; you should definitely give it a try if you are interested in collaboration). So in my opinion, students should not pick a vendor; they should pick an instructor who meets their personal requirements. Maybe McGahan, Kocharian and Malhi are not the best for you but I can tell you from my personal experience that they are the best for me.

Choose wisely! A training vendor’s business is to make your study time efficient.

I bought an All Access Pass from INE and decided to enroll in the CCIE Data Center Written Bootcamp. If you want to have a look at the teaching style:

The INE videos cover the whole blueprint: Nexus / Storage / UCS.

There is another useful (free) resource available for you guys: the Cisco Live Portal. This is the place to watch deep dive videos on every Cisco topic! For the DC stuff there are many listed by Brian McGahan in his “how to pass the CCIE DC” blog post.

I passed my CCIE DC written exam on my second try. It was a really tough exam …

In order to track my studies during the journey, I used Trello and I love this app. Here is an example of how I managed my tasks:

Trello_DC

CCIE LAB Preparation

The lab is a completely different story and I didn’t really rely on any vendor for the workbooks. I used INE and IPX for my online bootcamps but I will cover that later.

So regarding the workbooks, I didn’t really use any of them … I just did a few labs here and there from both vendors but I didn’t really like them. I just wanted to read the config guide, build the infrastructure and then run every show command I could.

For CCIE RS and Collaboration, it is very easy to host a rack in your home or at work. For the DC track, things can get trickier since you will need an N7K (with VDCs you slice your switch into multiple virtual switches, don’t worry it is part of the blueprint 🙂 ), 2x N5K, 2x Nexus 2232PP (in order to run FCoE), 2x MDS (9222 is my choice) and a small JBOD (I will make a separate post to show you how to build the cheapest JBOD ever 🙂 ).

INE and IPX racks can be very busy if you want to book the racks with UCS … I also recommend using the Cisco UCS Platform Emulator on your own laptop (it runs on ESXi as well if you have a virtualization lab). You can do almost everything with it (except booting your favorite Operating System / Hypervisor).

My local Cisco SE (Vincent, thank you so much!) was kind enough to let me borrow 2x N5K with some FEX and 2x MDS 9222i. I built a cheap JBOD and I could test 100% of the storage features for the lab exam.

I think the most valuable resource for practice is the Cisco Partner Education Collection.

There are so many labs and so much hardware there (sometimes fully booked of course) that you can spend countless hours on labs … Joel Sprague (who is an MVE [Most Valuable Engineer] I met during my studies) did a very good job by posting all the valuable labs that you can do with the Cisco PEC. I didn’t do ALL of them but the vPC / FabricPath / UCS / N1000v ones are definitely mandatory … The UCS lab is one of the best because you can boot from SAN and the UCS is yours for 8 hours, for free. Nothing can beat that!


Even if you are studying for the CCIE lab exam and you know that you are going to spend 8 tough hours configuring weird things, you still need to read a lot in order to configure your infrastructure.

I would recommend reading almost all the configuration guides related to the blueprint for the Nexus. For UCS and MDS, you can check them periodically but there is no need to read everything like you should for the Nexus part.

I have watched both INE and IPX videos regarding the CCIE lab exam; the McGahan and Rick Mur videos are perfect! For INE, McGahan was in charge of storage and Nexus while Snow was in charge of UCS.

I also attended 2 CCIE online bootcamps: one from INE (McGahan again) and one from IPX with Jason Lunde. Both did a great job.

McGahan is definitely the big player here; his complete set of videos (Nexus – Storage – Lab Cram Session) is simply awesome. It covers way more than you need for the CCIE DC exam.

Here is a preview of his DC lab cram session:

BM DC Lab
There are plenty of other nice resources that other CCIE DCs have published on their own blogs. Here are the 3 I used during my studies:

CCIE LAB Exam

I decided to book the CCIE lab for the day before my vacation started because I didn’t want to go on vacation with the CCIE still in mind 🙂

So I went to Brussels on July 10th and I was very pleased with the proctor (if you are reading this, I would like to thank you. The experience was great). The exam is fair: it is hard but fair. There was no second-guessing like I had in Voice. Questions were very precise and if I didn’t understand everything in the question, the task title made it click in my head: “Gotcha”.

You have to CAREFULLY read the tasks. If Cisco is asking for an ACL named MYCCIEDCLAB, you will not get the point if you configure it MYCCIEDCLAb. Even if your configuration is correct, they will look for the right naming convention. If you want to prevent all sorts of easy mistakes, your best weapon is the CTRL+C , CTRL+V. I can tell you this is the best thing you will ever need in the lab. Notepad is so useful as well !

During your daily job you would still do it, right? What if you want to configure VLANs 100, 200, 300, 400, 500 and 600 on all your devices (let’s assume VTP is bad … wait a minute … it is bad .. in my opinion 🙂 )? You would open a notepad, type your commands, and paste them into all devices, right?

My advice is to do the same for your CCIE Labs.
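
If you prefer generating the snippet over typing it, the idea can be sketched in a few lines. This is only an illustration of the "type once, paste everywhere" habit: the `vlan_paste_block` helper is a name I made up, and the generated lines use the usual comma-separated `vlan` list syntax in configuration mode.

```python
def vlan_paste_block(vlan_ids):
    """Build one paste-ready snippet that creates every VLAN in vlan_ids."""
    # Sort and de-duplicate so the same block is produced every time.
    vlan_list = ",".join(str(v) for v in sorted(set(vlan_ids)))
    return "\n".join(["configure terminal", f"vlan {vlan_list}", "end"])


print(vlan_paste_block([100, 200, 300, 400, 500, 600]))
```

The same text is then pasted into every device, which removes the risk of a typo on one switch out of six.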

As Brian McGahan said, I did my happy dance when I saw the UCS B-Series booting ESXi 🙂

I finished the lab with 1 hour left. Now the critical thing to do was to stay there and look for small mistakes I could have made during these very long 8 hours. I found some, and for every task I checked that what I did was still working and that 100% of the requirements were met.

Finally I left the building and asked the proctor when I could expect the results to be delivered. He told me : “within few hours” . I thought he was making fun of me but he was right.

I went to the airport to meet a friend from Belgium and I received the score report notification.

I was thrilled to see the result : “PASS”

The exam can be tough but again it is doable. During my studies I met a much better DC engineer than me, and he failed the exam twice 🙁 . So please be sure to read slowly and try to understand what they really want…

So what’s next for me now that I am a double CCIE? At the beginning of the post I said that I started to climb the infinite ladder; what does that really mean? It means that being a CCIE does not let me rest and assume that my knowledge will stay at the same level throughout my career. People who think they are done with learning are wrong.

Knowledge has to be sustained! I still have to work on every protocol if I want my knowledge to stay intact. I also have to learn emerging technologies like DevOps (not new but still new to me) / ACI / NSX etc. in order to become a better engineer!

I hope you enjoyed the blogpost and in the meantime, if you have some questions, you can leave a comment below.

 

Nicolas

FabricPath Multidestination Trees

FabricPath has many advantages over the classical Spanning Tree Protocol, mainly because it can use ECMP (Equal Cost Multi Path) routing.

For unicast frames it uses the well-known Switch-ID that is inserted in a FabricPath header. This will be explained in a future post for sure. I have been intrigued by how multidestination frames (unknown unicast, multicast, broadcast) are forwarded within a FabricPath network.

FabricPath uses 2 multidestination trees to forward multidestination frames: Tree 1 and Tree 2

This sounds familiar to many network engineers and it has some similarities with Spanning Tree. The protocol starts by building tree 1, and in order to do that, the topology will elect a root based on 2 factors:

  • FabricPath priority : <1-255> Unlike Spanning Tree, where lower is better, the election picks the switch with the highest priority.
  • SystemID : a MAC address that belongs to the backplane of the switch

If there is a tie in the priority, the system-id will be the tie-breaker.

Once the root of tree 1 has been elected, FabricPath will elect the root of the second tree based on the same factors described above. The trees are both identified by an FTag field within the header.
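
The election logic can be sketched as a simple sort. Treat this as an illustration under stated assumptions, not as the exact NX-OS algorithm: I assume that the highest system-id wins a priority tie and that the runner-up of the same ranking becomes root of tree 2. The priorities, and the system-ids of N7K-2 and N5K-2, are made up for the example; the other two system-ids are the ones from my lab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    name: str
    priority: int    # FabricPath root priority, 1-255, higher wins
    system_id: str   # dotted MAC notation, e.g. "18ef.63e3.cec4"

    def election_key(self):
        # Compare priority first, then the system-id as a 48-bit number.
        return (self.priority, int(self.system_id.replace(".", ""), 16))


def elect_roots(switches):
    """Return (root of tree 1, root of tree 2) under the assumptions above."""
    ranked = sorted(switches, key=Switch.election_key, reverse=True)
    return ranked[0], ranked[1]


SWITCHES = [
    Switch("N5K-1", 64, "0005.73ca.9001"),
    Switch("N5K-2", 64, "0005.73ca.a001"),   # hypothetical system-id
    Switch("N7K-1", 255, "18ef.63e3.cec4"),  # hypothetical priority
    Switch("N7K-2", 254, "18ef.63e3.d0c4"),  # hypothetical priority and system-id
]

root1, root2 = elect_roots(SWITCHES)
print(f"Tree 1 root: {root1.name}, tree 2 root: {root2.name}")
```

Raising the priority on the two spines is what pins the roots where we want them; without it the election would fall back to the system-id comparison.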

The multidestination tree 1 is bound to FTag 1 and will forward unknown unicast, broadcast and multicast frames.

The multidestination tree 2 is bound to FTag 2 and will forward multicast frames only. We can see here that multicast frames will be load balanced on both trees in the topology. Broadcast and unknown unicast will be forwarded on the multidestination tree 1 only.
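
To make the rule concrete, here is a small sketch of how an ingress switch could pick the FTag for a frame. The broadcast and unknown unicast branches follow the rule described above; the multicast branch uses a CRC of an arbitrary flow key, which is purely my assumption to illustrate load balancing and not the hash the hardware really uses. The function and parameter names are hypothetical.

```python
import zlib

BROADCAST = "ffff.ffff.ffff"


def is_multicast(mac):
    """True when the group bit (lowest bit of the first octet) is set."""
    first_octet = int(mac.replace(".", "")[:2], 16)
    return bool(first_octet & 0x01)


def select_ftag(dst_mac, known_unicast_macs, flow_key=b""):
    """Return the FTag (1 or 2) for a multidestination frame, else None."""
    mac = dst_mac.lower()
    if mac == BROADCAST:
        return 1                              # broadcast: tree 1 only
    if is_multicast(mac):
        return 1 + zlib.crc32(flow_key) % 2   # multicast: spread over both trees
    if mac not in known_unicast_macs:
        return 1                              # unknown unicast: tree 1 only
    return None                               # known unicast: forwarded by Switch-ID


print(select_ftag("FFFF.FFFF.FFFF", set()))
print(select_ftag("0050.56aa.bbcc", {"0050.56aa.bbcc"}))
```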

Here is the topology we will be working on:

 

FabricPath_Topology
 

Let’s assume that the FabricPath adjacencies are formed and stable.

We want the Nexus 7K spine switches to be roots of the multidestination trees. N7K-1 will be root of multidestination tree 1 and N7K-2 will be root of the multidestination tree 2.

Now let’s check how the system has built the trees and how we can verify it.

As you can see here, the system-ID looks like a MAC address and as explained above it is indeed a MAC address that belongs to the backplane of the Nexus.

For example on N5K-1 (SystemID: 0005.73ca.9001), the backplane has 96 MAC addresses that are usable.

The range of MAC addresses is the following for that particular switch: [00-05-73-ca-8f-ca to 00-05-73-ca-90-20]

For example on N7K-1 (SystemID: 18ef.63e3.cec4), the backplane has 128 MAC addresses that are usable.

The range of MAC addresses is the following for that particular switch: [18-ef-63-e3-ce-c4 to 18-ef-63-e3-cf-44]

Let’s check the topology now:

We can confirm that N7K-1 and N7K-2 are roots for the multidestination trees 1 and 2 respectively.

Now let’s check how the physical tree is built and which links will be used to forward multidestination frames:

The command is self-explanatory: the metric mentioned for a multidestination tree is from the root of that tree to that switch-id.

Let’s check what this command looks like from the N5K-1 point of view:

It means the following:

  • From the root of the tree (switchID 71) to the switch 52, the metric is 40 (which is the default for a 10G link in IS-IS) and N5K-1 has to use interface E1/4
  • From the root of the tree (switchID 71) to the switch 71 (itself) the metric is 0 (obviously) and N5K-1 has to use interface E1/4
  • From the root of the tree (switchID 71) to the switch 72, the metric is 80 (crossing 2x 10Gb/s links) and N5K-1 has to use interface E1/4

If you type the same command on all the switches in the fabricpath domain, you can draw the following topology:

FabricPath_FTag_1
Reminder : Metric is 40 for a 10Gb/s link
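
The metrics can be reproduced with a tiny shortest-path computation. This is a sketch under my own assumptions: the `LINKS` list assumes a full mesh of 10Gb/s links between the two spines (switch-IDs 71 and 72) and the two leaves (51 and 52), and `metrics_from` is a plain Dijkstra written for the example, not what IS-IS literally runs on the box.

```python
import heapq

LINK_METRIC_10G = 40  # metric of one 10Gb/s link, as stated above

# Assumed lab wiring: every leaf (51, 52) is connected to every spine (71, 72).
LINKS = [(71, 51), (71, 52), (72, 51), (72, 52)]


def metrics_from(root, links, metric=LINK_METRIC_10G):
    """Return {switch_id: lowest metric from root} using Dijkstra."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    best = {root: 0}
    queue = [(0, root)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for nxt in neighbors.get(node, []):
            new_cost = cost + metric
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt))
    return best


print(metrics_from(71, LINKS))
```

From root 71 this gives 0 for switch 71, 40 for switches 51 and 52, and 80 for switch 72, which matches the three bullets above.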

Let’s see how a broadcast is then forwarded within the FabricPath domain and more specifically on the multidestination tree 1:

Let’s say Server1 (which is on the left) sends a regular broadcast Ethernet frame (Destination MAC: FFFF.FFFF.FFFF) to find out which MAC should be used to reach Server2. The broadcast is forwarded within the classical Ethernet domain and reaches N5K-1.

N5K-1 recognizes that the frame is a broadcast and decides to forward it on tree one using port E1/4. A FabricPath header is added to the ethernet frame with the following characteristics:

  • OUTER DA : FFFF.FFFF.FFFF. This is a special case: when a frame is a broadcast, the inner DA is copied to the outer DA. (For your reference, unknown unicast FabricPath frames have a special OUTER DA: 01:0F:FF:C1:01:C0)
  • OUTER SA : SwitchID 51 (I do not cover SSID and LID on purpose 🙂 )
  • FTAG : 1
  • TTL : 32

N7K-1 receives the frame and recognizes that it is a broadcast assigned to FTAG1. Normally a FabricPath switch uses the Switch-ID to forward frames but since this is a broadcast, the FTAG will be used in order to make a forwarding decision.

FTAG1 has 1 link to N5K-2 (E1/28). The TTL is decremented and the frame is forwarded on that link.
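
The per-hop behavior described above can be sketched as follows. It is only a model for illustration: the `FabricPathFrame` class keeps just the four header fields listed earlier, the ingress port name `e1/27` is hypothetical (only E1/28 towards N5K-2 comes from my lab), and dropping the frame when the TTL runs out is my assumption of the usual behavior.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FabricPathFrame:
    outer_da: str
    outer_sa_switch_id: int
    ftag: int
    ttl: int = 32


def forward_multidestination(frame, ingress_port, tree_ports):
    """Return [(egress_port, frame_copy)] for one hop on the frame's tree."""
    if frame.ttl <= 1:
        return []  # TTL would expire on this hop: drop the frame
    out = replace(frame, ttl=frame.ttl - 1)
    ports = tree_ports.get(frame.ftag, [])
    # Flood on every port of the tree except the one the frame came in on.
    return [(port, out) for port in ports if port != ingress_port]


# Ports of N7K-1 that belong to each tree (e1/27 is a made-up name).
N7K1_TREE_PORTS = {1: ["e1/27", "e1/28"]}

broadcast = FabricPathFrame("ffff.ffff.ffff", 51, ftag=1)
for port, copy in forward_multidestination(broadcast, "e1/27", N7K1_TREE_PORTS):
    print(port, copy)
```

Because the lookup is keyed on the FTag rather than on a destination Switch-ID, the same function works for broadcast, unknown unicast and multicast frames.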

and so on and so on…..

Obviously if a broadcast was sent from Server2, N5K-2 would forward the frame onto multidestination tree 1 using ports E1/8 to N7K-2 and E1/4 to N7K-1.

The fact that there is no redundant link for multidestination frames guarantees that there will be no looped frames in the FabricPath domain.

Multicast frames (non broadcast and non unknown unicast) are load balanced between FTAG 1 and FTAG2.

If you use the same command as I typed above, you can draw a topology for FTAG2:

 


FabricPath_FTag_2
 

Cisco Nexus L3 daughter card

One of my customers had an issue regarding a Nexus 5K and its L3 daughter card.

Everything was fine on the switch except that no adjacency could be formed with its neighbors.

I took a quick look at the licensing and everything was all right:

Then I checked that some L3 features were up and running (if you have the LAN_ENTERPRISE_SERVICES license installed, it will be displayed as “In use” right after you activate an advanced L3 feature like BGP; for more information regarding NX-OS licensing, please refer to Cisco NX-OS Licensing):

So everything is fine but my port is still in a down/down state:

Then I took a look at the inventory to check the L3 daughter card and was quite surprised by the output:

It is very easy to misread this output (especially at 2am) because your eyes can focus on the “L3 Daughter Card” words and in fact you should read : “O2 NON L3 Daughter Card”

The N55-DL2 is the regular L2 card that is bundled with a regular Nexus 5548.

The N55-D160L3 or V2 is needed to allow the L3 ports to go in an up/up state 🙂
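
Since the description string is so easy to misread, checking the product ID instead is safer. A minimal sketch, assuming the V2 card is spelled `N55-D160L3-V2` (my assumption, check it against your own inventory) and using a helper name I made up:

```python
# Product IDs of the L3 daughter cards named above. The exact "-V2" spelling
# is an assumption for this example.
L3_DAUGHTER_CARD_PIDS = {"N55-D160L3", "N55-D160L3-V2"}


def is_l3_daughter_card(pid):
    """True only when the PID is one of the known L3 daughter cards."""
    return pid.strip().upper() in L3_DAUGHTER_CARD_PIDS


for pid in ("N55-DL2", "N55-D160L3"):
    print(pid, "->", "L3 capable" if is_l3_daughter_card(pid) else "L2 only")
```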

Pretty basic stuff but still worth a reminder.

Nicolas

 

PS: This command is very interesting to check if L3 is globally activated on your N5k