<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>108.bz &#187; IBM</title>
	<atom:link href="http://www.108.bz/posts/tag/ibm/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.108.bz</link>
	<description>Wandering futilities...</description>
	<lastBuildDate>Fri, 27 May 2011 09:08:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>About path thrashing and why you should always zone</title>
		<link>http://www.108.bz/posts/it/about-path-thrashing-and-why-you-should-always-zone/</link>
		<comments>http://www.108.bz/posts/it/about-path-thrashing-and-why-you-should-always-zone/#comments</comments>
		<pubDate>Wed, 20 Apr 2011 08:58:36 +0000</pubDate>
		<dc:creator>Giuliano</dc:creator>
				<category><![CDATA[IT]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[SAN]]></category>
		<category><![CDATA[Troubleshooting]]></category>
		<category><![CDATA[VMware]]></category>

		<guid isPermaLink="false">http://www.108.bz/?p=703</guid>
		<description><![CDATA[So, Customer starts updating all of his VMware ESX hosts and things take a turn for the worse. VMs slow to a crawl (ping response times anywhere from 0 to 1000ms), console access through vSphere client doesn&#8217;t always work, and hosts&#8217; CPU usage is unnaturally high. The cause is apparent: path thrashing. Path thrashing happens when, for some reason, [...]]]></description>
			<content:encoded><![CDATA[<p>So, Customer starts updating all of his VMware ESX hosts and things take a turn for the worse. VMs slow to a crawl (ping response times anywhere from 0 to 1000ms), console access through vSphere client doesn&#8217;t always work, and hosts&#8217; CPU usage is unnaturally high. The cause is apparent: path thrashing.<br />
Path thrashing happens when, for some reason, SCSI LUNs are continuously reassigned from one controller (Target) to another. ESX has a hard time &#8220;bouncing&#8221; I/O back and forth onto the right Fibre Channel path. On Active/Passive SAN arrays, a LUN can be &#8220;owned&#8221; by just one controller at a time. If the LUN owner has to change because of a hardware failure (path, Controller, SFP/GBIC, FC switch, &#8230;) or because the Initiator requests it, the LUN itself has to &#8220;trespass&#8221; (in EMC parlance), that is, transition to another controller. The &#8220;command&#8221; to do so can be issued by the Initiator or internally by the storage subsystem.<br />
Back to today&#8217;s case, I was dealing with an IBM DS4800 where LUNs flipped like mad between controller A and B. How to stop it quickly?</p>
<ul>
<li>If anything, the flipping shows that failover works as expected (VMs don&#8217;t crash despite the chaos).</li>
<li>That said, could I just disconnect a controller? Not really, because the same storage system hosts an Oracle RAC cluster, humming along happily, unaffected by the issue.</li>
<li>I need a way to selectively &#8220;hide&#8221; a controller from one or more hosts. I can do it easily by tweaking the SAN zoning configuration.</li>
</ul>
<p>A Zone (much like a VLAN) is basically a group of WWNs (or ports). Objects in the Zone can only talk to each other. While creating Zones, it is common practice to &#8220;go minimal&#8221;: they should contain as few members as possible. I usually name them like this:<br />
&nbsp;&nbsp;&nbsp;&nbsp;<span style="font-family: Bitstream Vera Sans Mono,Courier New,monospace;">Z_HOSTNAME_P1_DS4800_CA1_CB1</span><br />
HBA Port 1 of HOSTNAME can see Controller A/Port 1 and Controller B/Port 1 of the DS4800.<br />
Thus, going through each ESX server&#8217;s Zone, I just remove the Controller that the host shouldn&#8217;t see. Path thrashing is temporarily stopped.<br />
The above rant serves mainly as a pro-zoning argument: &#8220;If every HBA port has to access every Controller&#8217;s port, why implement zoning?&#8221; As you just read, zoning saved me from serious trouble today.<br />
About the &#8220;real&#8221; issue: it was ultimately caused by a thing called &#8220;Auto Volume Transfer&#8221; (AVT)<sup class='footnote'><a href='#fn-703-1' id='fnref-703-1'>1</a></sup>. Let&#8217;s say that a LUN is assigned to controller A, but I/O for the LUN is issued to controller B. With AVT switched on, the storage system will automatically transfer the LUN from A to B.<br />
The Customer&#8217;s ESX servers are all (correctly) configured to use the &#8220;Most Recently Used&#8221; (MRU) <a href="http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&#038;cmd=displayKC&#038;externalId=1003973">path</a> to a LUN. It seems that ESX, from a certain version on, issues I/O on the standby path, causing havoc if AVT is on. I can&#8217;t tell if that&#8217;s because it is fooled into thinking that the storage is an Active/Active one or if it just sort of periodically &#8220;probes&#8221; standby paths.<br />
How do you switch AVT off? By using the DS &#8220;Storage Manager&#8221; and changing the ESX Hosts&#8217; type from &#8220;Linux&#8221; (or whatever) to &#8220;LNXCLVMWARE&#8221;. This applies to all of the LSI-derived Storage Systems (IBM, SUN StorageTek, Engenio, &#8230;). The latter host type is the right one to use when hooking an ESX cluster to an IBM DS Storage System. But &#8220;Linux&#8221; seems to do just fine on not so new ESX hosts version 4.1.x &#8230; When AVT is off, the Storage will trespass LUNs on its own only in the event of an internal hardware failure; normally, LUN ownership is handled by the multipathing software on the Host.</p>
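<p>To make the AVT behaviour described above concrete, here is a toy Python model. It is only a sketch, not anything the DS4800 firmware or ESX actually runs: the <code>Lun</code> class, the <code>simulate</code> function and the two-host access pattern are all made up for illustration. Real arrays trigger AVT only on specific SCSI commands, and the assumption that a non-owning controller simply does not serve I/O when AVT is off is a simplification.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class Lun:
    """Toy LUN on an Active/Passive array (hypothetical model, for illustration only)."""
    owner: str = "A"    # controller that currently owns the LUN
    transfers: int = 0  # number of ownership changes so far

    def io(self, controller, avt):
        """Issue one I/O through `controller`; return True if it was served."""
        if controller == self.owner:
            return True
        if avt:
            # AVT on: the array moves the LUN to the controller that received the I/O.
            self.owner = controller
            self.transfers += 1
            return True
        # AVT off (assumption): the non-owning controller does not serve the I/O,
        # and ownership changes are left to the host multipathing software.
        return False


def simulate(avt, rounds=100):
    """Alternate I/O between controller A and controller B, count ownership transfers."""
    lun = Lun()
    for _ in range(rounds):
        lun.io("A", avt)  # one host using the path to controller A
        lun.io("B", avt)  # another host touching the path to controller B
    return lun.transfers


print("AVT on: ", simulate(avt=True))   # ownership flips on nearly every I/O
print("AVT off:", simulate(avt=False))  # ownership never moves
```
</pre>
<p>In this model, hiding controller B from the hosts (what the zoning change did) amounts to removing the second <code>io</code> call, which also brings the transfer count down to zero.</p>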
<p>More reading on the subject:</p>
<p>[<a href="https://www.ibm.com/developerworks/forums/message.jspa?messageID=14532649">1</a>] Differences between the &#8220;Linux&#8221; and &#8220;LNXCLVMWARE&#8221; host types.<br />
[<a href="http://webcache.googleusercontent.com/search?q=cache:SP6Ytyb4-0YJ:https://selfservice.lsi.com/service/main.jsp%3Bjsessionid%3D0637AE74501C3CB54E44A071BEFF108D%3Ft%3DsolutionTab%26ft%3DsearchTab%26ps%3DsolutionPanels%26locale%3Den_US%26_dyncharset%3DUTF-8%26curResURL%3D%252Fservice%252Fmain.jsp%253Bjsessionid%253D0637AE74501C3CB54E44A071BEFF108D%253F_dyncharset%253DUTF-8%2526_dynSessConf%253D3633170421896306112%2526t%253DsearchTab%2526locale%253Den_US%2526_dyncharset%253DUTF-8%2526topicName%253D%2526sfield%253D%2526dosearch%253Dtrue%2526searchstring%253DQuery%25252520does%25252520not%25252520work%25252520in%25252520WINS%2526useFocusTopic%253Dtrue%2526focusTopic%253D9000029%26solutionId%3DLSI7423%26isSrch%3DYes+site:selfservice.lsi.com+AVT&#038;cd=1&#038;hl=en&#038;ct=clnk&#038;client=opera&#038;source=www.google.com">2</a>] <i>How does Auto Volume Transfer (AVT) work?</i> Courtesy of Google&#8217;s cache. Lists which SCSI commands trigger AVT.<br />
[<a href="https://www.ibm.com/developerworks/mydeveloperworks/blogs/VirtuallySpeaking/entry/vmware_scsi_errors_and_conditions_ibm_ds_storage_systems1?lang=en">3</a>] A really nice blog post about the same issue described here. (Found, of course, when I was writing mine)</p>
<div class='footnotes'>
<div class='footnotedivider'></div>
<ol>
<li id='fn-703-1'>or even &#8220;Auto Disk Transfer&#8221; (ADT) <span class='footnotereverse'><a href='#fnref-703-1'>&#8617;</a></span></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.108.bz/posts/it/about-path-thrashing-and-why-you-should-always-zone/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Unknown devices on IBM servers</title>
		<link>http://www.108.bz/posts/it/unknown-devices-on-ibm-servers/</link>
		<comments>http://www.108.bz/posts/it/unknown-devices-on-ibm-servers/#comments</comments>
		<pubDate>Tue, 19 Jan 2010 11:23:32 +0000</pubDate>
		<dc:creator>Giuliano</dc:creator>
				<category><![CDATA[IT]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Install]]></category>
		<category><![CDATA[Legacy]]></category>
		<category><![CDATA[Server]]></category>

		<guid isPermaLink="false">http://www.108.bz/?p=202</guid>
		<description><![CDATA[When installing Windows on IBM servers, without using IBM ServerGuide, you&#8217;ll sometimes end up with two unknown devices in Device Manager: ASF Table ACPI\ASF0001\2&#38;DABA3FF&#38;0 and: IBM Active PCI Device ACPI\IBM37D4\2&#38;DABA3FF&#38;0 To deal with the first one, see document MIGR-43764 and download the driver mentioned there (it&#8217;s called &#8220;25k9219.zip&#8221;). The latter can be fixed by installing the &#8220;IBM [...]]]></description>
			<content:encoded><![CDATA[<p>When installing Windows on IBM servers, <em>without</em> using IBM ServerGuide, you&#8217;ll sometimes end up with two unknown devices in Device Manager:</p>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:550px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">ASF Table<br />
ACPI\ASF0001\2&amp;DABA3FF&amp;0</div></div>
<p>and:</p>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:550px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">IBM Active PCI Device<br />
ACPI\IBM37D4\2&amp;DABA3FF&amp;0</div></div>
<p>To deal with the first one, see document <a href="http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-43764&#038;brandind=5000008">MIGR-43764</a> and download the driver mentioned there (it&#8217;s called &#8220;25k9219.zip&#8221;).<br />
The latter can be fixed by installing the &#8220;IBM Active PCI Software&#8221;; you can find it on your server&#8217;s support page, e.g. <a href="http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-4J2QEQ&#038;brandind=5000008">here</a> (&#8220;90p4169.exe&#8221;).</p>
<p>Also, document <a href="http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-51940&#038;brandind=5000008">MIGR-51940</a>, <i>Installing Microsoft Windows Server 2003 version 1.0 &#8211; Servers</i>, proves useful.</p>
<p>And a last bit: if you&#8217;re in a hurry or haven&#8217;t got the CD handy, ServeRAID Manager Software can be installed by simply copying its folder from another server. It usually works. :)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.108.bz/posts/it/unknown-devices-on-ibm-servers/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

