Today I came across a blog post featuring the process of how to deploy Hadoop environment onto the Windows Azure platform. For those who are not from the HPC (High-Performance Computing) world, Hadoop is an distributed computing platform, based on a Apache Foundation OSS project that supports data-intensive distributed applications. The Hadoop platform includes the Hadoop Distributed File system (HDFS) and an implementation of MapReduce software framework under a free license.
MapReduce is basically a framework for processing huge datasets on certain kinds of distributable problems using a large number of computers (nodes) and as you can imagine cloud platforms like Windows Azure are ideal for such implementation.
On the following blog post you can see all the details about the implementation of this approach on top of Windows Azure roles and with a template project in Visual Studio 2010. Really nice and smooth!!