Carefully planning ESX4 and HP Storageworks EVA

As I mentioned in my post about Lessons Learned on ESX4 rollout, we had a pretty serious hiccup with our storage and the ESX systems in December while trying to bring up our ESX4 environment.  The primary trouble we uncovered was what I’ll call “controller ping-pong”.

An EVA normally has two controllers (maybe more; I’m not primarily a storage guy), and those handle all the requests received through the SAN.  For every LUN, one controller is its master.  Both controllers can accept requests for the LUN, but only the master actually services the access.  If the controller on fabric A is the master but the controller on fabric B is getting more requests, eventually the EVA swaps control of the LUN to the controller on fabric B, wherever the majority of requests are coming from.
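
As a rough illustration of that ownership swap, here is a toy Python model. It is only a sketch: the EVA's real thresholds and timing aren't covered in this post, so the simple-majority rule and the names `next_owner`, `requests_a`, and `requests_b` are assumptions made for the example.

```python
def next_owner(current_owner, requests_a, requests_b):
    """Return which controller ('A' or 'B') should own the LUN next.

    Assumed rule (not HP's actual algorithm): ownership moves only when
    the non-owning controller saw strictly more requests than the owner.
    """
    if current_owner not in ("A", "B"):
        raise ValueError("owner must be 'A' or 'B'")
    if current_owner == "A" and requests_b > requests_a:
        return "B"
    if current_owner == "B" and requests_a > requests_b:
        return "A"
    return current_owner


# Most requests arrive on fabric B, so ownership moves to B.
print(next_owner("A", requests_a=20, requests_b=80))
# Most requests arrive on fabric A, so ownership stays with A.
print(next_owner("A", requests_a=80, requests_b=20))
```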

This behavior only becomes a problem if you have hosts configured to access the LUN on different fabrics.  ESX4 is ALUA (asymmetric logical unit access) aware, meaning it should automatically determine the optimal path to a LUN.  In the case of an EVA, I’m told by HP support, the array is supposed to answer an ALUA request for the optimal path with the controller that is the master over the LUN.

If you, like us, have an ESX 3.5 cluster with preferred paths set up, you should proceed with caution.  The ALUA information apparently isn’t shared between clusters.  If your clusters get different optimal paths, you could end up with controller ping-pong: requests are sent down both fabrics, and the request volume shifts between the two, with more on Fabric A followed by more on Fabric B, forcing the EVA to switch masters each time.
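
To make the ping-pong concrete, the sketch below counts ownership flips over a series of time windows. It assumes the busier fabric's controller always takes over, which is a simplification and not the EVA's documented behavior, and the request counts are made-up numbers.

```python
def count_ownership_changes(windows, start_owner="A"):
    """Count how often LUN ownership flips across a series of time windows.

    `windows` is a list of (requests_on_a, requests_on_b) tuples.
    Assumed rule: the controller on the busier fabric takes ownership;
    a tie leaves ownership where it is.
    """
    owner = start_owner
    flips = 0
    for on_a, on_b in windows:
        busier = owner
        if on_a > on_b:
            busier = "A"
        elif on_b > on_a:
            busier = "B"
        if busier != owner:
            owner = busier
            flips += 1
    return flips


# Both clusters use fabric A: load varies, but one fabric always dominates.
aligned = [(150, 0), (90, 0), (200, 0), (120, 0)]

# ESX 3.5 cluster on fabric A, ESX4 cluster on fabric B: the busier
# fabric alternates as each cluster's load rises and falls.
split = [(100, 50), (40, 90), (120, 60), (30, 110)]

print(count_ownership_changes(aligned))  # 0
print(count_ownership_changes(split))    # 3
```

With every host on the same fabric the owner never changes; with the clusters split across fabrics, ownership flips in three of the four windows.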

So, while in a migratory state, I think my safest route is to configure the ESX4 hosts to use a preferred path that matches the ESX 3.5 cluster nodes.  I hate to move away from the default ESX configuration, and this isn’t an official recommendation from HP support, but it certainly makes the most sense to explicitly define the paths being used (except in a failure).
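
One way to sanity-check a mixed environment is to compare the preferred fabric each host uses per LUN and flag any LUN that is split. The sketch below works on a hand-built inventory; the host names, LUN names, and the `find_split_luns` helper are all hypothetical, and gathering the real path settings from each host is left out.

```python
from collections import defaultdict


def find_split_luns(host_paths):
    """Return LUNs that different hosts reach over different fabrics.

    `host_paths` maps host name -> {LUN name -> preferred fabric}.
    The result maps each split LUN to the sorted list of fabrics in use.
    """
    fabrics_by_lun = defaultdict(set)
    for luns in host_paths.values():
        for lun, fabric in luns.items():
            fabrics_by_lun[lun].add(fabric)
    return {
        lun: sorted(fabrics)
        for lun, fabrics in fabrics_by_lun.items()
        if len(fabrics) > 1
    }


# Hypothetical inventory: the ESX4 host disagrees with the 3.5 hosts on LUN2.
host_paths = {
    "esx35-01": {"LUN1": "A", "LUN2": "B"},
    "esx35-02": {"LUN1": "A", "LUN2": "B"},
    "esx4-01": {"LUN1": "A", "LUN2": "A"},
}

print(find_split_luns(host_paths))  # {'LUN2': ['A', 'B']}
```

Any LUN that shows up in the result is a candidate for ping-pong until the hosts agree on one fabric for it.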

I post this because I feel like there have to be other HP Storageworks customers who have the same situation or have experienced something similar.  I would love to hear from you…


About the Post

Author Information

Philip is an IT solutions engineer working for AmWINS Group, Inc., an insurance brokerage firm in Charlotte, NC. With a focus on data center technologies, he has built a career helping his customers and his employers deploy better IT solutions to solve their problems. Philip holds certifications in VMware and Microsoft technologies, and he is a technical jack of all trades who is passionate about IT infrastructure and all things Apple. He's a part-time blogger and author here at Techazine.com.

3 Responses to “Carefully planning ESX4 and HP Storageworks EVA”

  1. Yuri Semenikhin #

    Hi,
    I have a problem with EVA too. I have a deployment at one of our customers on an EVA 8400. I configured ESX4 to use the RR path policy (the default PSP for EVA is MRU), and the result was that too many reservations occurred when the EVA switched the primary controller for some LUNs. After analyzing the logs I changed the PSP to MRU, and after this change, no problems!

    January 12, 2010 at 11:44 am
    • Philip #

      Yuri, sorry for the very slow reply. It is interesting that you found that. We are using an EVA 6000 as our primary array, and we’ve had many problems with our 6400 array used as the replication destination at our secondary site. The x400 series EVAs seem to have some issues in general, although the latest code for them seems to have helped sort out a lot of issues. From our understanding, the 8400 saw the fewest problems because of faster controllers, etc., the 6400 a few more, and the 4400s a lot of problems. Anyway, we have set our ESX hosts to use Round Robin in accordance with the HP Best Practices and so far see few problems, mostly just occasional “failed on physical path” errors in the vmkernel log.

      April 7, 2010 at 3:48 pm

Trackbacks/Pingbacks

  1. Tech Talk » Blog Archive » Best practices for VMware ESX4 with HP EVA storage - April 7, 2010

    […] In previous versions of ESX, the desired storage setting was fixed path for the EVA.  In our case, we simultaneously presented the ESX3.5 and ESX4 hosts to the same LUNs, meaning some were fixed and some were set to the ESX4 default, which was MRU.  This caused problems.  After initial issues, we backed away and presented one LUN at a time, performed our VMotions and then unpresented the LUN from the old cluster.  This prevented any flapping issues between controllers.  (See my earlier post about our problems.) […]
