<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cluster Connection &#187; Top500</title>
	<atom:link href="http://www.clusterconnection.com/tag/top500/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.clusterconnection.com</link>
	<description>Simplify HPC. Share the knowledge.</description>
	<lastBuildDate>Fri, 30 Dec 2011 21:23:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>Clustering in the Cloud</title>
		<link>http://www.clusterconnection.com/2009/10/clustering-in-the-cloud/</link>
		<comments>http://www.clusterconnection.com/2009/10/clustering-in-the-cloud/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 22:45:49 +0000</pubDate>
		<dc:creator>Douglas Eadline</dc:creator>
				<category><![CDATA[Blogs]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[grid]]></category>
		<category><![CDATA[HPC]]></category>
		<category><![CDATA[InfiniBand]]></category>
		<category><![CDATA[Top500]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://www.clusterconnection.com/2009/10/clustering-in-the-cloud/</guid>
		<description><![CDATA[Are clouds a good place to build HPC clusters? The use of virtualization and multi-core processors has made cloud computing an option for many users. The ability to buy cloud time as you need it and not purchase hardware is certainly attractive from a financial standpoint. The concept is not new and has its [...]]]></description>
			<content:encoded><![CDATA[<p><em>Are clouds a good place to build HPC clusters?</em></p>
<p>The use of virtualization and multi-core processors has made cloud computing an option for many users. The ability to buy <em>cloud</em> time as you need it and not purchase hardware is certainly attractive from a financial standpoint. The concept is not new and has its roots in time-shared mainframes and grid computing. One might assume the vast amount of computing resources in clouds may make them ideal candidates for HPC clustering. Unfortunately, it is not as simple as collecting cores.</p>
<p>One of the issues facing clouds is I/O. Basically, I/O is often not predictable or repeatable. From a storage standpoint read and write times can be fast, but not always fast. In terms of messages between servers, most clouds do not support high performance interconnects and similarly make no guarantees as to latency or bandwidth consistency.  While grids paid attention to certain HPC performance guarantees in terms of I/O, clouds, in order to offer ease of use, have declined such guarantees. Unless a cloud has been specifically designed for HPC, the user cannot expect consistent and/or high performance. There are two papers which discuss this very idea. The first paper looks at <a href="http://www.usenix.org/publications/login/2008-10/openpdfs/walker.pdf">Benchmarking Amazon EC2 for High-performance Scientific Computing</a> and the second paper asks, <a href="http://www.cs.utexas.edu/users/pauldj/pubs/uchpc09.pdf">Can Cloud Computing Reach The TOP500?</a>. Both papers conclude that the cloud is not mature enough for HPC applications.</p>
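<p>One way to put a number on "not predictable or repeatable" is to time the same operation several times and look at the spread. The sketch below is only an illustration: the function name <em>variability</em> and the timing values are made up for this example and do not come from either paper.</p>

```python
import statistics

def variability(samples):
    """Return (mean, population std dev, coefficient of variation) for timings."""
    mean = statistics.fmean(samples)
    spread = statistics.pstdev(samples)
    # A coefficient of variation near 0 means the timings are repeatable.
    cv = spread / mean if mean else 0.0
    return mean, spread, cv

# Hypothetical timings in seconds, invented for illustration only.
steady = [1.0, 1.0, 1.0, 1.0]
erratic = [1.0, 3.0, 1.0, 3.0]

for name, samples in (("steady", steady), ("erratic", erratic)):
    mean, spread, cv = variability(samples)
    print(f"{name}: mean={mean:.2f}s stdev={spread:.2f}s cv={cv:.2f}")
```

<p>Two systems can share the same average time and still behave very differently for a tightly coupled parallel job, because the slowest transfer tends to set the pace for every node that is waiting on it.</p>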
<p>The limitations of the cloud become more apparent when one looks a little deeper at HPC applications. First, many applications rely on <em>user space</em> communication (i.e. high performance MPI programs transfer data directly from one node to another without using kernel services.) Such a <em>close to the wire</em> operation runs counter to the virtualization model. Secondly, as reported in the first paper (above), the performance of OpenMP applications was reduced by 7-21% when running in the EC2 cloud.</p>
<p>Recently Penguin Computing began offering POD (Penguin on Demand) for HPC cloud computing. The POD cloud offers both Ethernet and InfiniBand connections between nodes thus providing a dedicated high performance computing environment. This service can be considered a specialized HPC cloud.</p>
<p>There are some other important issues to consider with cloud computing -- security and reliability. When data leaves your domain over the Internet it is virtually impossible to guarantee 100% security. If your organization can live with this situation, using the cloud may be an option. If on the other hand, you need to keep a tight rein on your data, then you may not want to be injecting it into the cloud. The other issue is reliability. If your day to day operations are based on using a cloud, then a contingency plan is a must. Interruptions in Internet traffic due to congestion or hardware failures can be common in some areas. In addition, the cloud provider may have issues (even go out of business) and thus not meet the service requirements.</p>
<p>I believe the cloud is an interesting model, but it is not a real solution for HPC (in its current form). My issue with clouds is that they are often categorized as "grid like" and then are somehow (incorrectly) considered "HPC like." Cloud offers utility computing like grid promised, but has pushed the application layer further away from the hardware. HPC practitioners spend a lot of time making sure the application is as close to the hardware as possible. At this point in time, HPC in the cloud is more of a curiosity than a solution. When examining HPC benchmarks it becomes clear that clouds are not the best means to provide HPC cycles. Whether efforts like POD can meet the HPC users' needs in the cloud is still unknown.</p>
<p>To be fair, there are some HPC applications that lend themselves to clouds quite well (i.e., those that do not require predictable I/O). <a href="http://folding.stanford.edu/">Folding@home</a> and <a href="http://setiathome.berkeley.edu/">SETI@home</a> are two good examples. These applications could easily run in a cloud (in a sense they do run in the Internet cloud). Keep in mind they have been designed to work in a robust distributed fashion and are not virtualized. Clouds can be enticing and even enabling for some applications, but remember a collection of servers (in the cloud or in a rack) does not a cluster make.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.clusterconnection.com/2009/10/clustering-in-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HPL 101</title>
		<link>http://www.clusterconnection.com/2009/07/hpl-101/</link>
		<comments>http://www.clusterconnection.com/2009/07/hpl-101/#comments</comments>
		<pubDate>Thu, 23 Jul 2009 19:50:01 +0000</pubDate>
		<dc:creator>Douglas Eadline</dc:creator>
				<category><![CDATA[Blogs]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[BLAS]]></category>
		<category><![CDATA[HPL]]></category>
		<category><![CDATA[linear algebra]]></category>
		<category><![CDATA[Linpack]]></category>
		<category><![CDATA[Math Kernel Library]]></category>
		<category><![CDATA[Top500]]></category>

		<guid isPermaLink="false">http://www.clusterconnection.com/2009/07/hpl-101/</guid>
		<description><![CDATA[Ever wonder what the Top500 list actually measures? Every six months the largest computers in the world are ranked as to how many floating point operations per second (FLOPS) they can perform. The results are tabulated on the Top500 list. The actual benchmark is called HPL, which stands for High Performance Linpack. The Linpack benchmark [...]]]></description>
			<content:encoded><![CDATA[<p><em>Ever wonder what the Top500 list actually measures? </em></p>
<p>Every six months the largest computers in the world are ranked as to how many floating point operations per second (FLOPS) they can perform. The results are tabulated on the <a href="http://www.top500.org/">Top500</a> list. The actual benchmark is called HPL, which stands for High Performance Linpack. The Linpack benchmark was designed to measure the floating point performance of various systems and the HPL version is designed to run on parallel computers like clusters.</p>
<p>The Linpack problem is something you may have seen in high school. Given a set of linear equations of the form,</p>
<table class="equation" border="0" cellspacing="0" cellpadding="0" width="90%" align="center">
<tbody>
<tr>
<td></td>
<td width="50%"></td>
<td colspan="2" align="center">3<em>x</em> + 2<em>y</em> - <em>z</em> =  1</td>
<td width="50%"></td>
</tr>
</tbody>
</table>
<table class="equation" border="0" cellspacing="0" cellpadding="0" width="90%" align="center">
<tbody>
<tr>
<td></td>
<td width="50%"></td>
<td colspan="2" align="center">2<em>x</em> - 2<em>y</em> + 4<em>z</em> =  -2</td>
<td width="50%"></td>
</tr>
</tbody>
</table>
<table class="equation" border="0" cellspacing="0" cellpadding="0" width="90%" align="center">
<tbody>
<tr>
<td></td>
<td width="50%"></td>
<td colspan="2" align="center">-<em>x</em> + <sup>1</sup>/<sub>2</sub><em>y</em> - <em>z</em> =  0</td>
<td width="50%"></td>
</tr>
</tbody>
</table>
<p>Solve for <em>x</em>, <em>y</em>, and <em>z</em>. If you recall further, a system of linear equations can be generalized as the following:</p>
<table class="equation" border="0" cellspacing="0" cellpadding="0" width="90%" align="center">
<tbody>
<tr>
<td></td>
<td width="50%"></td>
<td colspan="2" align="center"><em>A</em> ⋅ <em>x</em> =  <em>b</em></td>
<td width="50%"></td>
</tr>
</tbody>
</table>
<p>Where <em>A</em> is a square matrix of the coefficients (the <em>3</em>, <em>2</em> values etc. in the above equations), <em>x</em> is a vector list of the "unknowns" (<em>x</em>, <em>y</em>, <em>z</em>) and <em>b</em> is a vector list of the "answers" (1, -2, 0). The Linpack benchmark solves for the unknowns. The size of the test is the number of equations and is often referred to as <strong>N</strong>. If <strong>N</strong> were 10, then <em>A</em> would be a 10x10 matrix, <em>x</em> would be a list of 10 unknowns, and <em>b</em> would be a list of 10 answers.</p>
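<p>To make the <em>A</em> ⋅ <em>x</em> = <em>b</em> form concrete, here is a minimal sketch that solves the three equations above with textbook Gaussian elimination and partial pivoting. It is a small-scale illustration only, not the HPL code; the function name <em>solve</em> is chosen for this example, and exact fractions are used so the answer comes out without rounding.</p>

```python
from fractions import Fraction

def solve(A, b):
    """Solve A . x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Build the augmented matrix [A | b] using exact fractions.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick the row with the largest pivot in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every row below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution, from the last unknown to the first.
    x = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x

# Coefficients and right-hand sides of the three equations above (N = 3).
A = [[3, 2, -1], [2, -2, 4], [-1, Fraction(1, 2), -1]]
b = [1, -2, 0]
x, y, z = solve(A, b)
print(x, y, z)  # prints: 1 -2 -2
```

<p>Substituting <em>x</em> = 1, <em>y</em> = -2, <em>z</em> = -2 back into each equation confirms the result. The benchmark does the same kind of work, only with <strong>N</strong> in the thousands or millions and the matrix spread across many nodes.</p>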
<p>In terms of memory, if <strong>N</strong> were 10,000, then a desktop computer would need about 1 GB of memory. If <strong>N</strong> were 1,000,000 then 7.5 TB of memory would be needed. Enter the cluster, where HPL is designed to distribute these really large problems over the nodes. The grunt work on each node is actually done by the BLAS library, which stands for Basic Linear Algebra Subprograms. These routines are used to solve the subproblem given to each node. At various points, the nodes must exchange information using MPI (Message Passing Interface).</p>
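<p>The memory figures follow from the size of <em>A</em>, which holds <strong>N</strong> x <strong>N</strong> values. The sketch below assumes 8-byte double-precision values and counts only the matrix itself (the vectors and any workspace are ignored), so treat it as a back-of-the-envelope estimate. The function name <em>matrix_bytes</em> is made up for this example.</p>

```python
def matrix_bytes(n, bytes_per_value=8):
    """Bytes needed to hold an n x n matrix, assuming 8-byte doubles by default."""
    return n * n * bytes_per_value

for n in (10_000, 1_000_000):
    size = matrix_bytes(n)
    # Report both decimal gigabytes and binary tebibytes.
    print(f"N={n:,}: {size / 1e9:,.1f} GB ({size / 2**40:.2f} TiB)")
```

<p>Under these assumptions the estimate comes to roughly 0.8 GB for <strong>N</strong> = 10,000 and about 8 TB (a little over 7 TiB) for <strong>N</strong> = 1,000,000, in the same range as the figures quoted above. Note that a 100-fold increase in <strong>N</strong> costs 10,000 times the memory.</p>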
<p>There are various checks in the program to make sure the calculations are correct. HPL also creates random data for each size problem. To keep things fair, the random data is the same for each value of <strong>N</strong>. The BLAS library is available in many forms. There are several optimized versions that can make a huge difference in performance. For instance, on Intel processors, Intel offers the <a href="http://software.intel.com/en-us/articles/intel-mkl/">Math Kernel Library</a> (MKL) that contains very fast hand-optimized math routines (including the BLAS libraries).</p>
<p>In addition to problem size, there are many other tunable parameters for the HPL benchmark. The benchmark can take hours or days to run, thus getting a good "HPL number" can take a long time. It also requires the entire cluster, something that often disappoints regular users. There are many applications that use HPL type math and thus the benchmark is very relevant for some users. In other cases, where users' codes are not solving <em>dense linear equations</em>, the benchmark offers a good historical measure of HPC progress, but offers little insight in terms of their application performance. Next time you hear HPL, BLAS, Linpack, MKL, and other funny sounding acronyms you will know it is just a bunch of software for solving big versions of those little high school math problems.</p>
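<p>For a rough feel of how problem size, run time, and FLOPS relate, the sketch below uses the standard leading-order operation count for solving a dense system by LU factorization, about (2/3) <strong>N</strong><sup>3</sup>. This is an approximation that drops lower-order terms, so it will not match the number HPL reports exactly. The function names and the example cluster speed are hypothetical.</p>

```python
def lu_flops(n):
    """Leading-order operation count for solving an n x n dense system via LU."""
    return (2.0 / 3.0) * n ** 3

def gflops(n, seconds):
    """Approximate sustained rate in GFLOPS for a run that took `seconds`."""
    return lu_flops(n) / seconds / 1e9

def runtime_hours(n, sustained_gflops):
    """Approximate run time in hours at a given sustained GFLOPS rate."""
    return lu_flops(n) / (sustained_gflops * 1e9) / 3600.0

# Hypothetical cluster sustaining 1,000 GFLOPS; problem sizes chosen for illustration.
for n in (100_000, 1_000_000):
    print(f"N={n:,}: about {runtime_hours(n, 1_000.0):.1f} hours")
```

<p>Because the work grows with the cube of <strong>N</strong> while memory grows with the square, making the problem ten times larger means roughly a thousand times more arithmetic. That is one reason a large run can tie up the whole cluster for so long.</p>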
]]></content:encoded>
			<wfw:commentRss>http://www.clusterconnection.com/2009/07/hpl-101/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interconnects: 10GigE and InfiniBand</title>
		<link>http://www.clusterconnection.com/2009/07/interconnects-10gige-and-infiniband/</link>
		<comments>http://www.clusterconnection.com/2009/07/interconnects-10gige-and-infiniband/#comments</comments>
		<pubDate>Wed, 08 Jul 2009 17:29:38 +0000</pubDate>
		<dc:creator>Douglas Eadline</dc:creator>
				<category><![CDATA[Blogs]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[10 Gigabit Ethernet]]></category>
		<category><![CDATA[Gigabit Ethernet]]></category>
		<category><![CDATA[InfiniBand]]></category>
		<category><![CDATA[Top500]]></category>

		<guid isPermaLink="false">http://www.clusterconnection.com/?p=1206</guid>
		<description><![CDATA[It is a two-horse race, but one horse is still in the barn Ask anyone, "What are the two choices for HPC interconnects?"  and they will tell you "InfiniBand and 10 Gigabit Ethernet (10 GigE)." For the most part they are correct, but 10GigE is just entering the HPC market. It has not even landed [...]]]></description>
			<content:encoded><![CDATA[<p><em>It is a two-horse race, but one horse is still in the barn</em></p>
<p>Ask anyone, "What are the two choices for HPC interconnects?"  and they will tell you "<a href="http://en.wikipedia.org/wiki/InfiniBand">InfiniBand</a> and <a href="http://en.wikipedia.org/wiki/10_Gigabit_Ethernet">10 Gigabit Ethernet</a> (10 GigE)." For the  most part they are correct, but 10GigE is just entering the HPC market.  It has not even landed in the <a href="http://www.top500.org/">Top500</a> arena, although users are still confident it will show up soon.  On the other hand, InfiniBand use is climbing steadily.</p>
<p>The following table shows the June 2008 and 2009 interconnect families on the Top500 list.<br />
I grouped any interconnect that had less than 1% share in 2009 into the "other" category.</p>
<p align="center">
<table border="1" cellpadding="5">
<tbody>
<tr>
<th>Interconnect</th>
<th>6/2008</th>
<th>6/2009</th>
</tr>
<tr>
<td>Myrinet</td>
<td>2.4</td>
<td>2.0</td>
</tr>
<tr>
<td>GigE</td>
<td>56.6</td>
<td>56.4</td>
</tr>
<tr>
<td>IB</td>
<td>24.2</td>
<td>30.2</td>
</tr>
<tr>
<td>Proprietary</td>
<td>8.2</td>
<td>8.4</td>
</tr>
<tr>
<td>Other</td>
<td>8.6</td>
<td>3.0</td>
</tr>
</tbody>
</table>
<p>The first thing to notice is that GigE still dominates the list with a 56% share as it did a year ago. Also of note, number 16 on the list, from the University of Toronto, used Quad Xeon E55xx and GigE! The only other big change is the continued growth of InfiniBand (from 24% to 30%) and the decrease of the "other" category.</p>
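<p>The year-over-year shifts can be read straight off the table. As a small illustration, the sketch below copies the table values and computes the change in percentage points for each family; the names <em>shares</em> and <em>share_changes</em> are made up for this example.</p>

```python
# Percent share of Top500 systems, copied from the table above: (6/2008, 6/2009).
shares = {
    "Myrinet": (2.4, 2.0),
    "GigE": (56.6, 56.4),
    "IB": (24.2, 30.2),
    "Proprietary": (8.2, 8.4),
    "Other": (8.6, 3.0),
}

def share_changes(table):
    """Change in percentage points from the first column to the second."""
    return {name: round(after - before, 1) for name, (before, after) in table.items()}

for name, delta in share_changes(shares).items():
    print(f"{name:12s} {delta:+.1f} points")
```

<p>InfiniBand gains 6.0 points and "Other" loses 5.6, while GigE moves by only 0.2 points, which matches the reading given above.</p>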
<p>So why the confidence in 10 GigE? Simple: at one point GigE was as expensive as 10 GigE is today, but due to the commodity uptake, the price came down to the "it is free on the motherboard" option. Plus, many users like the "plug and play" nature of Ethernet as it is a well-understood technology.</p>
<p>In closing, the Top500 is a single benchmark and not the only measure of HPC interconnects, but it does provide an interesting snapshot of what people are using. Right now it seems the biggest competitor to 10 GigE might be GigE.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.clusterconnection.com/2009/07/interconnects-10gige-and-infiniband/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Who Uses HPC?</title>
		<link>http://www.clusterconnection.com/2009/06/who-uses-hpc/</link>
		<comments>http://www.clusterconnection.com/2009/06/who-uses-hpc/#comments</comments>
		<pubDate>Tue, 30 Jun 2009 19:09:59 +0000</pubDate>
		<dc:creator>Douglas Eadline</dc:creator>
				<category><![CDATA[Blogs]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[applications]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[digital content creation]]></category>
		<category><![CDATA[HPC]]></category>
		<category><![CDATA[oil and gas]]></category>
		<category><![CDATA[Top500]]></category>
		<category><![CDATA[users]]></category>
		<category><![CDATA[weather]]></category>

		<guid isPermaLink="false">http://www.clusterconnection.com/?p=1146</guid>
		<description><![CDATA[High Performance Computing is not just for rocket scientists Many of the big headlines in HPC come from the Top500 List. While this list is valuable in its own right, it does not speak to the many quiet breakthroughs and advances made possible by cluster computing. HPC is now more than a method for government [...]]]></description>
			<content:encoded><![CDATA[<p><em>High Performance Computing is not just for rocket scientists</em><br />
Many of the big headlines in HPC come from the <a href="http://www.top500.org">Top500 List</a>. While this list is valuable in its own right, it does not speak to the many quiet breakthroughs and advances made possible by cluster computing.<br />
HPC is now more than a method for government labs or universities to push the limits of science. It has become a tool for creating products and content, solving problems, and optimizing processes. For example, most people would be surprised to learn that HPC has touched everything from <a href="http://www.compete.org/images/uploads/File/PDF%20Files/HPC_Secret%20Life%20of%20Coffee_052308.pdf">coffee</a>, to <a href="http://www.hpcwire.com/offthewire/17884709.html">bathing suits</a>, to <a href="http://www.compete.org/images/uploads/File/PDF%20Files/HPC_Whirlpool_032009.pdf">washing machines</a>, and the list is growing. Visit <a href="http://www.compete.org/about-us/initiatives/hpc/">The Council on Competitiveness</a> for more examples.<br />
Of course there are more traditional areas that touch our daily lives, where HPC has become an indispensable tool. These include the bio-sciences, where human genome data is deciphered and bio-molecules are studied to better understand and improve our quality of life. Oil discovery and recovery would be much more of a coin-toss (and much more expensive) without the use of HPC. The weather forecast you looked at today is the product of many HPC cycles running 24x7 on a cluster. And finally, without HPC we would not have Shrek (or Donkey), who invite us to laugh and cry as they move about the big screen in ways never before possible.</p>
<p>The answer to "Who Uses HPC?" is undoubtedly the scientists and engineers that are developing the products and processes that touch our lives. Perhaps the more important question is who benefits from HPC. The answer to that is, of course, "you and I." As our ability to capture and manipulate "our world" in digital form continues to grow, so does our ability to make better decisions and create a better future.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.clusterconnection.com/2009/06/who-uses-hpc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Capacity vs Capability Clusters</title>
		<link>http://www.clusterconnection.com/2009/06/capacity-vs-capability-clusters/</link>
		<comments>http://www.clusterconnection.com/2009/06/capacity-vs-capability-clusters/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 16:35:54 +0000</pubDate>
		<dc:creator>Douglas Eadline</dc:creator>
				<category><![CDATA[Blogs]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[Capability Clusters]]></category>
		<category><![CDATA[Capacity Clusters]]></category>
		<category><![CDATA[Processor Cores]]></category>
		<category><![CDATA[Top500]]></category>
		<category><![CDATA[x86]]></category>

		<guid isPermaLink="false">http://www.clusterconnection.com/?p=1060</guid>
		<description><![CDATA[Does your HPC cluster need 10,000 (or more!) cores? Probably not. Everyone in the high-performance computing industry watches the Top500 List. Twice a year the world's fastest computers (mostly clusters) are ranked by how well they run a very large benchmark program. The Top500 List is an interesting competition that measures a great deal of [...]]]></description>
			<content:encoded><![CDATA[<p>Does your HPC cluster need 10,000 (or more!) cores? Probably not.</p>
<p>Everyone in the high-performance computing industry watches the <a href="http://top500.org">Top500 List</a>. Twice a year the world's fastest computers (mostly clusters) are ranked by how well they run a very large benchmark program.</p>
<p>The Top500 List is an interesting competition that measures a great deal of computing muscle, but it also helps track the history of HPC systems; the type and number of processors, operating systems, amounts of memory, and system architecture are all detailed for each system on the list.</p>
<p>The ranking extends back to 1993 when the list began. As a matter of fact, x86 clusters have only recently joined the list. The fastest x86 cluster in November 2008 used 51,200 cores to run the benchmark.</p>
<p>While these levels of computing are heroic, they actually don't reflect how most HPC systems are constructed.</p>
<p>So, how many cores does an average HPC program use? It depends on who you ask, but there seem to be three distinct types of cluster systems:</p>
<ul>
<li>Small (64 processor cores or less)</li>
<li>Large (over 64 cores but below 10,000 cores)</li>
<li>Staggering (over 10,000 cores)</li>
</ul>
<p>While 10,000 cores is an exciting number to visualize, clusters that use the smallest number of cores actually represent the largest segment of the HPC market.</p>
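<p>The three size classes can be written down as a simple rule. The sketch below is just a restatement of the list above; the function name <em>cluster_class</em> is made up, and because the list does not say where a cluster with exactly 10,000 cores belongs, this example assumes it counts as Staggering.</p>

```python
def cluster_class(cores):
    """Map a core count to the size classes listed above."""
    if cores <= 64:
        return "Small"
    # Assumption: exactly 10,000 cores is treated as Staggering.
    if cores < 10_000:
        return "Large"
    return "Staggering"

for cores in (16, 64, 512, 51_200):
    print(f"{cores:,} cores: {cluster_class(cores)}")
```

<p>The 51,200-core x86 system mentioned earlier lands firmly in the Staggering class, while a typical departmental cluster falls in Small.</p>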
<p>In general, users that run smaller programs often share a cluster with other users. Conversely, the mammoth programs that use thousands of cores often consume every core in the cluster. To differentiate between these two types of usage, clusters are often classified as either a <strong>Capacity</strong> or <strong>Capability</strong> system.</p>
<p>Capacity clusters are the most common and are used to deliver a certain amount of "computing capacity" to the end users. For instance, a capacity cluster may support hundreds of users running any number of programs. These programs require a number of cores much less than the total number of cores in a cluster. In these types of clusters, all the compute resources are managed by a job scheduler which determines which  programs (jobs) run on the cluster.</p>
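<p>To illustrate what "managed by a job scheduler" means, here is a toy first-come, first-served sketch. It is not how any particular scheduler works: real schedulers add priorities, time limits, backfill, and much more. The job names, core counts, and the function name <em>schedule</em> are all hypothetical.</p>

```python
from collections import deque

def schedule(total_cores, jobs):
    """First-come, first-served: start queued jobs while enough cores are free.

    `jobs` is a list of (name, cores_requested) pairs in arrival order.
    Returns (running job names, waiting job names, free cores).
    """
    free = total_cores
    queue = deque(jobs)
    running = []
    # Strict ordering: stop at the first job that does not fit.
    while queue and queue[0][1] <= free:
        name, cores = queue.popleft()
        free -= cores
        running.append(name)
    waiting = [name for name, _ in queue]
    return running, waiting, free

# Hypothetical 64-core capacity cluster with four queued jobs.
jobs = [("job-a", 32), ("job-b", 16), ("job-c", 64), ("job-d", 8)]
running, waiting, free = schedule(64, jobs)
print("running:", running)
print("waiting:", waiting)
print("free cores:", free)
```

<p>In this run the first two jobs start, the 64-core request has to wait for the cluster to drain, and the small job behind it waits too even though 16 cores sit idle. Filling that kind of gap is exactly what production schedulers spend their effort on.</p>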
<p>A Capability cluster is designed to handle (or be capable of) running large groundbreaking programs that were previously not possible to run. These systems usually push the limits of cluster technology because large numbers of systems must work together for long periods of time.  In contrast, a capacity cluster has the ability to tolerate failure and continue running user programs.</p>
<p>Chances are if you are using a cluster it is a capacity system. If that is the case, the Top500 might interest you, but it probably has very little to do with your performance. If you are one of the few high-end capability users, the Top500 is just for you.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.clusterconnection.com/2009/06/capacity-vs-capability-clusters/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

