All posts by Nicolas Michel

CUCM 10.5 Upgrade issue

Hey everyone.

 

I have just finished my upgrade to CUCM 10.5.2 and I faced an issue at the end of the upgrade.

Of course, this always happens after you have spent some hours waiting for the upgrade to succeed 🙂

According to the very good Cisco DocWiki, VMware Tools are specialized drivers for virtual hardware that are installed in the UC applications when they are running virtualized. It is very important that the VMware Tools version running in the UC application be in sync with the version of ESXi being used.

This clearly states that it is more or less mandatory to install or upgrade your VMware Tools after every upgrade.

Depending on the version and application you are running, there are several methods to upgrade your VMware Tools:

  • Method 1: Using a COP File: This is deprecated and used only for 8.0 UC servers. Definitely not something you will deal with if you are deploying a new UC infrastructure
  • Method 2: Using the CLI: This method will be used if you run UCCX 8.5(1)+ or 8.5 UC Servers or CUCM IM and P version 9. This is disruptive and the server will reboot twice.
  • Method 3: Upgrade from VI Client. Very easy method. Will install while the server is in production and it is not disruptive at all. 
  • Method 4: Auto Upgrade during a server power cycle. Not disruptive at all and will auto upgrade.

Please bear in mind that most of the recent UC applications are compatible with both Method 3 and Method 4. I generally like to enable auto upgrade.

So let’s get back to our issue.

After I completed my upgrade, I went to the vCenter client to install VMware Tools, but it was just not working. I even rebooted my server twice and still had the same issue (the image below only represents a similar issue).

VMwaretools
 

I had a look at the excellent support community and the Bug Search Tool… and I was lucky enough to find that I was not the only one facing that issue 🙂

Indeed, there is a recent bug hitting CUCM 10.5 (and other apps, I believe): CSCul78735.

SELinux is preventing VMware Tools from being installed on the server. SELinux enforces access control security policies on the server and was introduced in CUCM 8.6 (in fact it replaced Cisco Security Agent).

What we need to do is bypass the SELinux policies, and here is the (very complicated) procedure:
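
The original listing did not survive on this page. As a hedged sketch, assuming the workaround is simply to put SELinux in permissive mode from the CUCM platform CLI, the sequence would look roughly like this (`admin:` is the usual platform CLI prompt):

```text
admin:utils os secure status
admin:utils os secure permissive
admin:utils os secure status
```

The first and last commands only display the current SELinux mode so you can confirm the change. Once VMware Tools is installed, `utils os secure enforce` puts SELinux back in enforcing mode.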

Once that is done, you need to install VMware Tools again. It worked for me.

VMwaretools_OK
 

Do not forget to set SELinux back to enforcing mode after you are done with the VMware Tools installation.

Hope it can help you in your future upgrades 🙂

Cheers

Nicolas

Cisco ISLB Issue

Usually people blog on a certain topic because they want to share their knowledge of a certain protocol or product.

Today I'll take another approach and actually do the exact opposite. I have an issue with iSLB, which provides load balancing for my iSCSI sessions. Today I will go through each step needed to make it work. I have failed this configuration a LOT of times and I have followed the same steps over and over. I decided to write a blog post about it to keep track of what I should do the next time I want to configure it.

I did not play with VRRP yet, but that could be a topic for a follow-up blog post.

The topology is the same as in my previous blog posts related to the MDS.

 

Device_Alias
 

The difference here is that both MDS switches will have an iSCSI interface bound to their Gigabit Ethernet interface (iscsi 1/1 mapped to gig 1/1).

ISLB on Cisco MDS
 

I will start from scratch and setup the infrastructure:
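
The listing for this step is missing here. A minimal sketch of the base setup, assuming the JBOD sits on fc1/5 of MDS01 and fc1/1 is an ISL towards MDS02 (both interface numbers are assumptions for illustration), might be:

```text
! MDS01: create VSAN 10 and place the JBOD port in it (fc1/5 is an assumption)
vsan database
  vsan 10
  vsan 10 interface fc1/5
! ISL towards MDS02 (fc1/1 is an assumption)
interface fc1/1
  switchport mode E
  no shutdown
! verification
show flogi database vsan 10
show fcns database vsan 10
```

`show flogi database` is the check to run on MDS01 (local fabric logins), while `show fcns database` on MDS02 confirms the JBOD is visible across the ISL.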

The output above proves that the JBOD has registered with the fabric and that VSAN 10 is running on the E port between MDS01 and MDS02. Another proof is that the FCNS database on MDS02 contains the JBOD pWWN.

Now we will set up a device-alias and activate a test zoneset on VSAN 10, because iSLB requires an already active zoneset if you want to use the auto-zone feature. If you do NOT have an active zoneset, you will have to perform the zoning configuration manually.
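
As a rough sketch of that step (the alias, zone and zoneset names are hypothetical, and the pWWN is assumed to be the JBOD's), the configuration could look like this:

```text
! device-alias for the JBOD (name and pWWN are assumptions)
device-alias database
  device-alias name JBOD pwwn 21:00:00:18:62:8d:e8:b7
device-alias commit
! a test zone and zoneset, only so that VSAN 10 has an active zoneset
zone name Z_TEST vsan 10
  member device-alias JBOD
zoneset name ZS_TEST vsan 10
  member Z_TEST
zoneset activate name ZS_TEST vsan 10
show zoneset active vsan 10
```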

 

Now we can start our ISLB configuration. Again we will first configure the infrastructure and check that both iSCSI interfaces are reachable from the L2 domain.
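
A minimal sketch of the iSCSI plumbing on one switch, assuming the IPS ports are in module 1 and using a documentation address (192.0.2.1 is a placeholder, not the address used in the lab):

```text
feature iscsi
iscsi enable module 1
interface GigabitEthernet1/1
  ip address 192.0.2.1 255.255.255.0
  no shutdown
interface iscsi1/1
  no shutdown
```

A plain `ping` from each server towards the Gigabit Ethernet address is then enough to confirm L2 reachability before touching iSLB itself.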

 

ISLB configuration can now start and you will see it is very brief:

We first need to check the IQN of our servers.


IQN Win 2008
IQN Win 2012
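
The iSLB listing itself is not reproduced on this page. The following is only a sketch under stated assumptions: the IQN is a made-up example, the target pWWN is assumed to be the JBOD's, and the exact set of sub-commands used in the original lab is unknown.

```text
! enable CFS distribution for iSLB
islb distribute
! one initiator entry per server (the IQN below is hypothetical)
islb initiator name iqn.1991-05.com.microsoft:srv01.example.local
  static pWWN system-assign 1
  vsan 10
  target pwwn 21:00:00:18:62:8d:e8:b7
! push the pending configuration to the fabric
islb commit
show islb initiator configured
```

Because iSLB is a CFS application, nothing is applied on either switch until `islb commit` is issued.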

 

The configuration has been committed, and MDS02 should now have the iSLB configuration and the zoning configured on it:

All is fine here, and none of the iSCSI initiators have logged in to the fabric yet:

Let’s now activate debugs on both switches and try to initiate a Fabric Login from the iSCSI initiators (Server 01 first then Server 02)


SRV01_LOGIN
SRV01_OK_MDS01
 

MDS01 has performed a FLOGI onto itself on the VSAN10 and it has been mapped to interface iSCSI 1/1.

We can also see that the initiator has been correctly mapped to the JBOD

Let’s now try with server 02

SRV02_LOGIN
SRV02_OK_MDS02

Note that MDS02 will only see one FLOGI, while MDS01 will see two: one from its local FC disk and one from its iSCSI initiator.

Both servers are able to map the drive and everybody is happy 🙂

Nicolas

As I mentioned at the beginning of the post, I did not play with VRRP on purpose, and I will cover that in a follow-up blog post 🙂

Cisco MDS Port-Security with Auto-Learning

I have been learning about Cisco MDS port-security recently and I have been struggling with this feature because it was different from what I expected. I was expecting something very similar to (and as easy as) the good old Ethernet port-security feature.

This is clearly not the case, and I will show you how to configure basic port-security using auto-learning. You can still manually configure entries on the MDS, but I wanted to check how the feature interacts with CFS and how it is implemented.

We will use the same topology as the one we used previously:

 


Device_Alias
VSAN 10 is the only VSAN created in the topology for clarity’s sake.

As with every feature in NX-OS, it has to be activated on both MDS switches:

Since we want to play with auto-learning and CFS distribution, we need to enable distribution, since it is not enabled by default.
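
The two listings for these steps are missing here. Assuming a release that uses the `feature` syntax, a minimal sketch would be:

```text
! on each MDS
feature port-security
port-security distribute
! verification
show port-security status
show cfs application
```

`show cfs application` lists, per switch, which CFS applications are enabled, which is a quick way to see whether distribution has been turned on locally.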

As we can see above, enabling the distribution of the port-security feature on one switch does not replicate it to the other switches in the fabric. The behavior here is different from what we experience when activating enhanced zoning within a storage fabric.

We do have to activate it on the other switches as well.

As soon as it is done, we need to learn some WWNs into the fabric. As soon as you activate port-security for a particular VSAN, auto-learning is automagically (typo made on purpose and copyrighted by Vik Malhi 🙂 ) started as well.
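
A sketch of the activation step, followed by a check of the CFS lock it creates (the original output is not reproduced here):

```text
port-security activate vsan 10
show cfs lock
```

With distribution enabled, the activation is held as a pending change and the fabric stays locked for that application until the change is committed or the session is cleared.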

The output above shows us that the fabric has been locked for this particular VSAN and application.

In order to remove the lock and spread the configuration into the fabric, we need to commit the changes we’ve done here:
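
The commit is issued per VSAN. A minimal sketch:

```text
port-security commit vsan 10
show port-security status
```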

So learning is enabled and a database has been activated as well. The analogy with zoning holds: there is a config database and an active database. The active database has been replicated to the other switches but the config database has not. Sounds like basic zoning, right? The problem here is that the config database has NOT been populated even on MDS01, where we typed the configuration. So we need to copy the active database to the config database on both MDS switches.

Let’s check what’s in the database first:
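
The output itself is not reproduced on this page, but the two databases can be displayed with commands along these lines:

```text
! active (enforced) database
show port-security database active vsan 10
! configuration database
show port-security database vsan 10
```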

On MDS01, we can see 3 WWN :

  • 21:00:00:18:62:8d:e8:b7(pwwn) is the pWWN owned by my JBOD, attached to the logging-in point 20:05:00:0d:ec:71:f1:40 on int fc1/5
  • 20:00:00:0d:ec:94:3c:c0(swwn) is the sWWN owned by MDS02, attached to the logging-in point 20:01:00:0d:ec:71:f1:40 on int fc1/1
  • 20:00:00:0d:ec:94:3c:c0(swwn) is the sWWN owned by MDS02, attached to the logging-in point 20:01:00:0d:ec:71:f1:40 on int fc1/2

The logging-in point here is tied to the switch where we type the commands. We can verify this by comparing it with the local switch WWN (sWWN):
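
The local switch WWN can be displayed with the command below (the original output is not reproduced here):

```text
show wwn switch
```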

We will have the same kind of output on MDS02 :

The tricky part here is that you cannot copy the active database to the config database if auto-learn is running on the VSAN:

So we need to de-activate that feature:
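
A sketch of the whole sequence, assuming CFS distribution is still enabled (so the change has to be committed before the copy):

```text
! stop auto-learning on VSAN 10 and push the change
no port-security auto-learn vsan 10
port-security commit vsan 10
! copy the active database into the config database
port-security database copy vsan 10
show port-security database vsan 10
```

The last command should now list the same entries as the active database.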

After a copy run start (copy running-config startup-config), we should be good to go!

But we have to bear in mind that since auto-learning is now DISABLED, any array that tries to log in to the fabric will be blocked 🙂

Feel free to comment or correct me by posting a comment below 🙂

Nicolas

EDIT:

If you now try to connect an Array to the fabric here is what you will have 🙂

Device Alias on Cisco MDS

It is definitely not convenient to configure a zone or any CFS application using WWpN.

20:ab:3d:2c:4f:89:fa:ab is not very human-readable, and it is definitely not efficient to keep track of the WWpN in your MDS configuration.

Device-alias is a proprietary feature created by Cisco to make your life much easier. It maps a human-readable name to a WWpN.

Let’s first set up our infrastructure for this blog post. It’s a rather simple topology, but it sums up what we are trying to achieve:

 

Device_Alias
 

Let’s first set up our VSANs, FC domains and E ports on both MDS switches.
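
The original listings are missing from this page. A minimal sketch, assuming fc1/1 is an ISL and fc1/5 is the JBOD port (both interface numbers are assumptions), could be:

```text
! both switches
vsan database
  vsan 100
  vsan 200
! MDS02 only: the lowest priority value wins the principal switch election
fcdomain priority 1 vsan 100
fcdomain priority 1 vsan 200
fcdomain restart vsan 100
fcdomain restart vsan 200
! ISL between the switches, repeated for each link (fc1/1 is an assumption)
interface fc1/1
  switchport mode E
  switchport trunk mode on
  no shutdown
! switch hosting the JBOD (fc1/5 is an assumption)
vsan database
  vsan 100 interface fc1/5
! verification
show fcdomain vsan 100
show interface trunk vsan 100
```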

 

I wanted MDS02 to be the principal switch for both VSANs in our very large fabric. 🙂

 

The JBOD has to be placed in VSAN 100.

 

The Fabric is stable and both VSANS are being forwarded on both trunks.

 

Device-alias has two configuration modes:

  • Basic: Applications that use device-aliases, such as zoning, will expand them to regular WWpNs.
  • Enhanced: Applications that use device-aliases, such as zoning, will keep track of the alias and use it in its native format.

What does that really mean? We will configure two aliases and two zones using both modes, and you will be able to see for yourself what Cisco wanted to achieve.

Let’s check that the whole fabric (2 switches 🙂 ) is in basic device-alias mode:

Now let’s configure the device-alias for the JBOD and for a fictitious initiator
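
A sketch of that configuration, with made-up pWWNs (both values below are placeholders, not the ones used in the lab):

```text
show device-alias status
device-alias database
  device-alias name JBOD pwwn 21:00:00:aa:bb:cc:dd:01
  device-alias name INITIATOR pwwn 21:00:00:aa:bb:cc:dd:02
show cfs lock
```

At this point the entries are only pending on the local switch.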

CFS should lock the Fabric from a configuration point of view, no other users would be able to override the configuration unless they clear the lock.

Committing the device-alias changes is mandatory in order to spread the entries throughout the whole fabric.
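
In its simplest form the commit and the follow-up check look like this (the show command is meant to be run on the other switch as well):

```text
device-alias commit
show device-alias database
```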

As we can see, the aliases have been synchronized. Now it is time to configure a zone and check how the basic device-alias mode differs from the enhanced device-alias mode.

The zoneset needs to be activated (INITIATOR does not have a fcid because it is a fake WWpN and hence the fact that it has not registered to the fabric)
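
A minimal sketch of the zoning for VSAN 100, using hypothetical zone and zoneset names:

```text
zone name Z_BASIC vsan 100
  member device-alias JBOD
  member device-alias INITIATOR
zoneset name ZS_BASIC vsan 100
  member Z_BASIC
zoneset activate name ZS_BASIC vsan 100
show zoneset active vsan 100
show zoneset vsan 100
```

The last two commands display the active zoneset and the full zoneset respectively.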

As we can see, the active zoneset has both the device-alias and the WWpN. The same thing happens for the full zoneset.

Now let’s add more aliases, but with the enhanced mode feature this time. You will see that enhanced mode needs to be configured on only one switch in the fabric. It is automatically replicated to all the other switches in the fabric using CFS.
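
Switching mode is itself a CFS change, so it needs a commit. A sketch:

```text
device-alias mode enhanced
device-alias commit
show device-alias status
```

Running `show device-alias status` on the second switch is a simple way to confirm that the mode change was replicated.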

Device-alias configuration is similar to the one we did previously

The zoneset application should now use the device-alias mode to display the aliases using the native format instead of WWpN

We can now check the final result of our tests:

The VSAN 100 zoneset is using device-alias in basic mode, while the VSAN 200 zoneset is using enhanced mode.

It has been a long post and some readers might find it annoying, but I wanted to scrutinize every step of the configuration to understand how it really works behind the scenes.

Please feel free to share your experience or comments.

 Nicolas

vPC order of operations

Cisco Nexus can be very temperamental or capricious (pick the one you prefer 🙂 ) and the vPC technology is not an isolated case.

There is a certain order in which to configure vPC, and we will go through it in this blog post.

The following topology will be used:

 


vPC diagram
 

Enabling the feature

Obviously we need to activate the vPC and LACP feature in order to build a proper vPC configuration.
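
On both Nexus switches this comes down to:

```text
feature vpc
feature lacp
show feature | include vpc|lacp
```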

 

Peer Keepalive connectivity

The Peer Keepalive link is used to detect failures between the peers. It is definitely not used in the data plane. On the N5K, the management interface is usually used both for the Peer Keepalive link and for management purposes. On the N7K, Cisco does not recommend using the supervisor management interface as the vPC peer-keepalive link, since it can introduce a failure if that supervisor crashes.

In our example we will use the management interface in order to have IP connectivity between our vPC peers.

Now let’s check if N5K-1 can reach N5K-2

The behaviour above is normal since we are trying to ping 10.2.8.84 using the global RIB and the mgmt 0 interface does not belong to that RIB. Instead it does belong to the management vrf . So if we add the “vrf management”  keywords, everything should be fine !
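
In other words, the working form of the test from N5K-1 is:

```text
ping 10.2.8.84 vrf management
```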

and it is 🙂

 

vPC domain and vPC peer link configuration

Now let’s create the vPC domain and the vPC peer link.

Please be aware that it is a best practice to configure a unique vPC domain ID for every pair of Nexus switches. The vPC domain ID is used to build a system MAC address. So if four Nexus switches are connected together, forming two vPC domains, and both vPC domains end up with the same system MAC address, you will experience something funny 🙂

I decided that N5K-1 will be elected as the Primary vPC peer and N5K-2 will remain Secondary vPC peer.

Please note that if you do not specify a VRF keyword for the peer-keepalive destination command, the switch automatically uses the management VRF.
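
A sketch of the domain configuration on N5K-1. The domain ID and the priority value are assumptions chosen for illustration; 10.2.8.84 is the mgmt 0 address of N5K-2 used in the ping test:

```text
vpc domain 1
  ! the lower role priority wins the primary role
  role priority 1000
  peer-keepalive destination 10.2.8.84
show vpc peer-keepalive
```

N5K-2 would mirror this with the same domain ID, a higher role priority (or the default), and the mgmt 0 address of N5K-1 as the keepalive destination.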

As we can see above, the vPC keepalive status is OK since we have reachability between our mgmt 0 interfaces.

We will now create our vPC peer-link, which will mainly be used for CFS and many other things (they will be detailed in a future blog post).
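
A minimal sketch of the peer-link, assuming port-channel 1 built from ethernet 1/1 and 1/2 (all three numbers are assumptions):

```text
interface ethernet 1/1-2
  switchport mode trunk
  channel-group 1 mode active
interface port-channel 1
  switchport mode trunk
  vpc peer-link
```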

Enabling the vPC peer-link on the port-channel automatically sets the spanning-tree port type to network for that interface. This means that we just enabled Bridge Assurance on that link. In a vPC topology, Cisco recommends enabling Bridge Assurance only on the vPC peer-link.

We can now enable the interface to check the status of our port-channel.

Our port-channel is functioning properly, so we can now check the status of our vPC peer-link.

So far, so good!

Do you remember when I explained the dependency between the vPC domain ID and the vPC system MAC address? 🙂 Here is the result:
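
The output is not reproduced on this page. The system MAC can be read from the role display:

```text
show vpc role
```

Look for the vPC system-mac field and compare it between two vPC domains to check that they do not collide.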



vPC Configuration

We will now create a vPC towards an LACP neighbor which in this case will be a server.

Here is the configuration used on both Nexus
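
The listing is missing here. A sketch under stated assumptions (ethernet 1/10, port-channel 10, vPC number 10 and the access VLAN are all hypothetical values):

```text
interface ethernet 1/10
  switchport mode access
  switchport access vlan 10
  channel-group 10 mode active
interface port-channel 10
  switchport mode access
  switchport access vlan 10
  vpc 10
! verification
show vpc
show vpc consistency-parameters global
```

The vPC number has to match on both peers. The port-channel number does not strictly have to, but keeping them identical makes troubleshooting easier.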

Let’s assume that our server’s NIC team is already configured for LACP. Let’s now check the vPC status

Consistency parameters are matching on both vPC peers and our vPC towards that server is functional 🙂

 

LACP verification

The following output will confirm that we are indeed having an LACP port-channel between the server and the vPC peers
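
The output is not reproduced on this page. Assuming the server-facing bundle is port-channel 10 (a hypothetical number), these commands give the relevant view:

```text
show port-channel summary
show lacp neighbor interface port-channel 10
```

In the summary, the LACP protocol column and the (P) flag on the member port confirm that the bundle was negotiated rather than forced on.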

This post was a bit basic, but a vPC can get tricky to troubleshoot, so be sure to follow the steps above in order to save time in your deployments and labs.

We will dig deeper into vPC tricks soon 🙂

Nicolas