Virtualization and Cluster HPC

August 3rd, 2009 3:38 pm
Posted by Douglas Eadline

How close is a virtualized HPC environment?

While HPC virtualization is still very young, the advantages it offers are highly attractive. In particular, the ability to migrate a live process opens up many interesting scenarios. First, live migration would allow schedulers to preempt a running job so that higher-priority jobs can run. For instance, a multi-node job could be moved from one cluster to another, or even paused for an indefinite amount of time before resuming. Indeed, an entire cluster could be upgraded while the running jobs are paused. Because the new hardware supplies the same virtualized environment, the applications would not know the difference.

There is no doubt virtualization has much to offer HPC. There are, however, some obstacles that need to be overcome before virtualization can be useful for HPC. If you recall, much of the performance achieved by HPC comes from working as close to the hardware as possible. User-space communication is one example of this idea. (User-space communication, or "kernel bypass," allows a process on one node to send data directly to a process on another node without using kernel services.) In a virtualized environment, a software layer sits between the hardware and the user application. While this layer guarantees interoperability, it does not guarantee performance, and it could add overhead to every communication. There has been some work in this area, and the results demonstrated by the Now lab at Ohio State University have shown that virtualized I/O can be highly efficient.

In addition to I/O, there is also the memory management issue. Here, kernel memory management can be stressed by large HPC data structures on big-memory machines. For instance, using the standard Linux 4K page size, stepping through a 16 GB application may require roughly 4 million page references that will need to be pushed in and out of the cache. Clearly, this process could be improved even for the non-virtualized environment, and placing this level of memory management on top of a virtual memory layer could cause a further slowdown for big HPC applications. Fortunately, this issue is also under study, and some initial results are again promising.
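
To make the arithmetic above concrete, here is a small back-of-the-envelope sketch. It only counts how many pages are needed to cover a given memory footprint. The 2 MB page size is my own addition for comparison (a common large-page size on x86-64 Linux), not something measured or discussed above, and the function name is purely illustrative.

```python
def pages_needed(footprint_bytes, page_bytes):
    """Number of pages required to cover a memory footprint, rounded up."""
    return -(-footprint_bytes // page_bytes)  # ceiling division


KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

footprint = 16 * GIB  # the 16 GB application from the text

# Standard 4K pages, as described in the text.
small_pages = pages_needed(footprint, 4 * KIB)

# Assumption for comparison only: 2 MB large pages.
large_pages = pages_needed(footprint, 2 * MIB)

print(f"4K pages needed:   {small_pages:,}")
print(f"2 MB pages needed: {large_pages:,}")
print(f"Reduction factor:  {small_pages // large_pages}x")
```

With 4K pages the count comes out to 4,194,304, which is the "4 million" figure. With the assumed 2 MB pages it drops to 8,192, which is one reason larger pages are often suggested for big-memory applications. This says nothing about actual run-time cost, only about how many page mappings have to be tracked.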

Of course, there are many other issues including the type of virtualization and the specific user application(s) in question. The good news seems to be that virtualization may eventually score a big play in HPC. If I were asked about HPC virtualization, I would probably say, "Not just yet, but we are getting there."


Author Info

Dr. Douglas Eadline has worked with parallel computers since 1988 (anyone remember the Inmos Transputer?). After co-authoring the original Beowulf How-To, he continued to write extensively about Linux HPC clustering and parallel software issues. Much of Doug's early experience has been in software tools and application performance. He has been building and using Linux clusters since 1995. Doug holds a Ph.D. in Chemistry from Lehigh University.