<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>108.bz &#187; Network Protocol Analysis</title>
	<atom:link href="http://www.108.bz/posts/tag/network-protocol-analysis/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.108.bz</link>
	<description>Wandering futilities...</description>
	<lastBuildDate>Fri, 27 May 2011 09:08:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>Dumping streaming media in 25 lines of Perl</title>
		<link>http://www.108.bz/posts/it/dumping-streaming-media-in-25-lines-of-perl/</link>
		<comments>http://www.108.bz/posts/it/dumping-streaming-media-in-25-lines-of-perl/#comments</comments>
		<pubDate>Thu, 13 May 2010 10:11:23 +0000</pubDate>
		<dc:creator>Giuliano</dc:creator>
				<category><![CDATA[IT]]></category>
		<category><![CDATA[Network Protocol Analysis]]></category>
		<category><![CDATA[Perl]]></category>
		<category><![CDATA[Reverse Engineering]]></category>

		<guid isPermaLink="false">http://www.108.bz/?p=459</guid>
		<description><![CDATA[Analysing TCP-based protocols often means dealing with TCP sessions (also called streams or flows). A TCP connection, from an application&#8217;s point of view, is much like a bidirectional file descriptor through which ordered data can be read or written. &#8220;On the wire&#8221;, though, data is not ordered at all. It is split into packets, [...]]]></description>
			<content:encoded><![CDATA[<p>Analysing TCP-based protocols often means dealing with TCP <i>sessions</i> (also called streams or flows).<br />
A TCP connection, from an application&#8217;s point of view, is much like a bidirectional file descriptor through which ordered data can be read or written. &#8220;On the wire&#8221;, though, data is not ordered at all. It is split into packets, which may arrive out of order and interleaved with other traffic. You can capture packets using a sniffer, but to make any sense of them you also need an analyzer able to reorder and reassemble them. <a href="http://www.wireshark.org">Wireshark</a>, for instance, doubles as a sniffer and an analyzer, backed by the ubiquitous <a href="http://en.wikipedia.org/wiki/Libpcap">libpcap</a>.</p>
<p>Imagine having captured 1 GB worth of traffic. We would like to pinpoint a single TCP session, isolating it from the rest. Here&#8217;s how we could proceed:</p>
<ul>
<li>Identify the source/destination addresses and source/destination ports we&#8217;re interested in. Then throw away any packet that doesn&#8217;t match this tuple. That&#8217;s what Wireshark basically does when you select a packet, right click and hit &#8220;Follow TCP Stream&#8221;. If the same tuple doesn&#8217;t get reused for another, unrelated, session, this method works just fine<sup class='footnote'><a href='#fn-459-1' id='fnref-459-1'>1</a></sup>.</li>
<li>Reorder/reassemble packets.</li>
<li>Extract packets&#8217; payload.</li>
<li>Present the payload in a way that makes sense. That depends on the L7 protocol. HTTP without <a href="http://en.wikipedia.org/wiki/HTTP_persistent_connection">keep-alive</a> is strictly request/response: first print what the client sent to the server (outbound traffic), then what the server answered (inbound traffic). Other protocols may behave differently, and you may choose to separate inbound traffic from outbound, or rely on timing to correctly present the dialogue between peers.</li>
</ul>
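<p>The steps above can be sketched in a few lines of plain Perl. This is only an illustration under loud assumptions: the packets are hypothetical, already-decoded hashes (the field names <i>src</i>, <i>sport</i>, <i>dst</i>, <i>dport</i>, <i>seq</i> and <i>payload</i> are made up for the example), and retransmissions, overlapping segments and sequence number wrap-around are ignored. A real analyzer has to handle all of those.</p>

```perl
use strict;
use warnings;

# Hypothetical, already-decoded packets, deliberately out of order and mixed
# with unrelated traffic. A real capture would come from libpcap.
my @packets = (
    { src => '10.0.0.2', sport => 40000, dst => '10.0.0.1', dport => 80,
      seq => 9, payload => "TP/1.0\r\n\r\n" },
    { src => '10.0.0.1', sport => 80, dst => '10.0.0.2', dport => 40000,
      seq => 1, payload => "HTTP/1.0 200 OK\r\n\r\n" },
    { src => '10.0.0.3', sport => 50000, dst => '10.0.0.1', dport => 22,
      seq => 1, payload => 'unrelated' },
    { src => '10.0.0.2', sport => 40000, dst => '10.0.0.1', dport => 80,
      seq => 1, payload => 'GET / HT' },
);

# Step 1: keep only the packets travelling in one direction of the session.
sub one_direction {
    my ($pkts, $src, $sport, $dst, $dport) = @_;
    return grep {
        $_->{src} eq $src && $_->{sport} == $sport
            && $_->{dst} eq $dst && $_->{dport} == $dport
    } @$pkts;
}

# Steps 2 and 3: order by sequence number, then join the payloads.
# Simplification: no retransmissions, overlaps or sequence wrap-around.
sub reassemble {
    my @pkts = @_;
    return join '', map { $_->{payload} }
        sort { $a->{seq} <=> $b->{seq} } @pkts;
}

my $request  = reassemble(one_direction(\@packets, '10.0.0.2', 40000, '10.0.0.1', 80));
my $response = reassemble(one_direction(\@packets, '10.0.0.1', 80, '10.0.0.2', 40000));

# Step 4: for request/response HTTP, show the client side first.
print "client said:\n$request";
print "server said:\n$response";
```

<p>Each direction is filtered separately because a session is identified by the same tuple with source and destination swapped, and each side has its own sequence numbers.</p>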
<p>Besides Wireshark, there are tools that do just that and can also be automated. See <a href="http://www.wireshark.org/docs/man-pages/tshark.html">TShark</a> or <a href="http://www.circlemud.org/~jelson/software/tcpflow/">tcpflow</a>.</p>
<p>What if you want to script everything and build your own TCP analyzer? The Perl module <a href="http://search.cpan.org/search?query=Net%3A%3AAnalysis&#038;mode=module">Net::Analysis</a> is surprisingly convenient for the task. It does the dirty work described above and hands your code TCP sessions that are ready to be processed.</p>
<p>Practical goal: saving MP3 files streamed by <a href="http://grooveshark.com">Grooveshark</a>. Disclaimer: I&#8217;m by no means pushing anyone to illegally download stuff, this is just a working, sensible, instructional example that uses a song freely available anyway (by Revolution Void, check them out <a href="http://www.jamendo.com/en/artist/revolutionvoid/">here</a>, they&#8217;re great).</p>
<p><span style="font-family: Bitstream Vera Sans Mono,Courier New,monospace;">GroovesharkListener.pm</span> extends <span style="font-family: Bitstream Vera Sans Mono,Courier New,monospace;">Net::Analysis::Listener::HTTP</span>. It sniffs all traffic to and from port 80 and, as soon as it sees an HTTP response whose content type contains &#8220;audio&#8221;, dumps the response body to a file and quits. Simple as that.</p>
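<p>Put the module where Perl can find it (given its package name, as <span style="font-family: Bitstream Vera Sans Mono,Courier New,monospace;">Net/Analysis/Listener/GroovesharkListener.pm</span> under a directory in <span style="font-family: Bitstream Vera Sans Mono,Courier New,monospace;">@INC</span>) and then launch, as root:</p>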
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:550px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"># perl -MNet::Analysis -e main GroovesharkListener 'port 80'<br />
(starting live capture)<br />
/crossdomain.xml<br />
text/xml<br />
/service.php?addSongsToQueueExt<br />
text/html; charset=UTF-8<br />
/static/amazonart/m8c8c9f4291508bca130c1caac2bda75b.png<br />
image/png<br />
[...some more cruft...]<br />
/stream.php<br />
audio/mpeg<br />
Dumping 8481224 bytes to groovesharkgyzBy.mp3 be patient...<br />
<br />
# id3v2 -l groovesharkgyzBy.mp3<br />
id3v1 tag info for groovesharkgyzBy.mp3:<br />
Title &nbsp;: Invisible Walls &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Artist: Revolution Void &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
Album &nbsp;: Increase the Dosage &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Year: 2004, Genre: Other (12)<br />
Comment: http://www.jamendo.com/ &nbsp; &nbsp; &nbsp; &nbsp; Track: 1</div></div>
<p>That&#8217;s it. Just one more thing: <span style="font-family: Bitstream Vera Sans Mono,Courier New,monospace;">Net::Analysis</span> doesn&#8217;t allow you to select a specific network interface; it just picks the first available one. I wrote a small <a href='http://www.108.bz/wp-content/uploads/2010/05/NetAnalysis_device_support_in_live_capture.diff_.txt'>patch</a> to address this shortcoming. It adds a &#8220;<span style="font-family: Bitstream Vera Sans Mono,Courier New,monospace;">device=</span>&#8221; parameter that you can use this way:</p>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:550px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"># perl -MNet::Analysis -e main GroovesharkListener,device=wlan1 'port 80'</div></div>
<p>And here&#8217;s what <span style="font-family: Bitstream Vera Sans Mono,Courier New,monospace;">GroovesharkListener.pm</span> looks like:</p>
<div class="codecolorer-container perl blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:550px;height:300px;"><div class="perl codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #666666; font-style: italic;"># choose a song</span><br />
<span style="color: #666666; font-style: italic;"># run (as root or via sudo):</span><br />
<span style="color: #666666; font-style: italic;"># &nbsp; perl -MNet::Analysis -e main GroovesharkListener 'port 80'</span><br />
<span style="color: #666666; font-style: italic;"># click &quot;play&quot; and wait for the file to be dumped...</span><br />
<span style="color: #666666; font-style: italic;"># &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; -- Giuliano - http://www.108.bz</span><br />
<a href="http://perldoc.perl.org/functions/package.html"><span style="color: #000066;">package</span></a> Net<span style="color: #339933;">::</span><span style="color: #006600;">Analysis</span><span style="color: #339933;">::</span><span style="color: #006600;">Listener</span><span style="color: #339933;">::</span><span style="color: #006600;">GroovesharkListener</span><span style="color: #339933;">;</span><br />
<span style="color: #000000; font-weight: bold;">use</span> strict<span style="color: #339933;">;</span><br />
<span style="color: #000000; font-weight: bold;">use</span> base <a href="http://perldoc.perl.org/functions/qw.html"><span style="color: #000066;">qw</span></a><span style="color: #009900;">&#40;</span>Net<span style="color: #339933;">::</span><span style="color: #006600;">Analysis</span><span style="color: #339933;">::</span><span style="color: #006600;">Listener</span><span style="color: #339933;">::</span><span style="color: #006600;">HTTP</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><br />
<span style="color: #000000; font-weight: bold;">use</span> File<span style="color: #339933;">::</span><span style="color: #006600;">Temp</span><span style="color: #339933;">;</span><br />
<br />
<span style="color: #000000; font-weight: bold;">sub</span> http_transaction <span style="color: #009900;">&#123;</span><br />
&nbsp; &nbsp; <span style="color: #b1b100;">my</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$self</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">$args</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">@_</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; <span style="color: #b1b100;">my</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$http_req</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$args</span><span style="color: #339933;">-&gt;</span><span style="color: #009900;">&#123;</span>req<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span> <br />
&nbsp; &nbsp; <span style="color: #b1b100;">my</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$http_resp</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$args</span><span style="color: #339933;">-&gt;</span><span style="color: #009900;">&#123;</span>resp<span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span> <br />
<br />
&nbsp; &nbsp; <a href="http://perldoc.perl.org/functions/print.html"><span style="color: #000066;">print</span></a> <span style="color: #0000ff;">$http_req</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">uri</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; <span style="color: #b1b100;">my</span> <span style="color: #0000ff;">$content_type</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">$http_resp</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">header</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">'Content-Type'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; <a href="http://perldoc.perl.org/functions/print.html"><span style="color: #000066;">print</span></a> <span style="color: #ff0000;">&quot;$content_type<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; <span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$content_type</span> <span style="color: #339933;">=~</span> <span style="color: #009966; font-style: italic;">/audio/i</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #b1b100;">my</span> <span style="color: #0000ff;">$fh</span> <span style="color: #339933;">=</span> File<span style="color: #339933;">::</span><span style="color: #006600;">Temp</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">new</span><span style="color: #009900;">&#40;</span>TEMPLATE <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'groovesharkXXXXX'</span><span style="color: #339933;">,</span> <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; SUFFIX &nbsp; <span style="color: #339933;">=&gt;</span> <span style="color: #ff0000;">'.mp3'</span><span style="color: #339933;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; UNLINK &nbsp; <span style="color: #339933;">=&gt;</span> <span style="color: #cc66cc;">0</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <a href="http://perldoc.perl.org/functions/print.html"><span style="color: #000066;">print</span></a> <span style="color: #ff0000;">&quot;Dumping &quot;</span><span style="color: #339933;">.</span><a href="http://perldoc.perl.org/functions/length.html"><span style="color: #000066;">length</span></a><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">$http_resp</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">content</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">.</span><span style="color: #ff0000;">&quot; bytes to &quot;</span><span style="color: #339933;">.</span><span style="color: #0000ff;">$fh</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">filename</span><span style="color: #339933;">.</span><span style="color: #ff0000;">&quot; be patient...<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span><br />
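&nbsp; &nbsp; &nbsp; &nbsp; <a href="http://perldoc.perl.org/functions/binmode.html"><span style="color: #000066;">binmode</span></a> <span style="color: #0000ff;">$fh</span><span style="color: #339933;">;</span> <span style="color: #666666; font-style: italic;"># write the MP3 bytes untranslated</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <a href="http://perldoc.perl.org/functions/print.html"><span style="color: #000066;">print</span></a> <span style="color: #0000ff;">$fh</span> <span style="color: #0000ff;">$http_resp</span><span style="color: #339933;">-&gt;</span><span style="color: #006600;">content</span><span style="color: #339933;">;</span><br />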
&nbsp; &nbsp; &nbsp; &nbsp; <a href="http://perldoc.perl.org/functions/exit.html"><span style="color: #000066;">exit</span></a><span style="color: #339933;">;</span><br />
&nbsp; &nbsp; <span style="color: #009900;">&#125;</span><br />
<span style="color: #009900;">&#125;</span><br />
<br />
<span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span></div></div>
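<p>The listener above treats any content type containing &#8220;audio&#8221; as a hit, and assumes the header is present. A stricter test is easy to sketch; note that <i>is_audio</i> is a hypothetical helper written for this example, not part of Net::Analysis or of the listener itself.</p>

```perl
use strict;
use warnings;

# Hypothetical helper: true when a Content-Type header value names an
# audio media type. Parameters such as "; charset=UTF-8" are ignored.
sub is_audio {
    my ($content_type) = @_;
    return 0 unless defined $content_type;        # header may be absent
    my ($media_type) = split /;/, $content_type;  # drop any parameters
    return 0 unless defined $media_type;          # empty header value
    $media_type =~ s/^\s+|\s+$//g;                # trim surrounding whitespace
    return $media_type =~ m{^audio/}i ? 1 : 0;
}

for my $type ('audio/mpeg', 'text/html; charset=UTF-8', 'image/png') {
    my $verdict = is_audio($type) ? 'audio' : 'other';
    print "$verdict: $type\n";
}
```

<p>Anchoring the match at the start of the media type avoids false positives on types that merely contain the word, and the <i>defined</i> checks cope with responses that carry no Content-Type header at all.</p>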
<div class='footnotes'>
<div class='footnotedivider'></div>
<ol>
<li id='fn-459-1'>Newer Wireshark versions use the &#8220;tcp.stream eq <i>x</i>&#8221; display filter instead. <span class='footnotereverse'><a href='#fnref-459-1'>&#8617;</a></span></li>
</ol>
</div>
 ]]></content:encoded>
			<wfw:commentRss>http://www.108.bz/posts/it/dumping-streaming-media-in-25-lines-of-perl/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

