Group targets open source cloud computing

HP, Intel, Yahoo detail efforts on software stack
Executives from Hewlett-Packard, Intel and Yahoo are calling for developers to create an open source software standard for cloud computing. They showed their work on the effort—some of it still at a very early stage—at a gathering of researchers creating a test bed for cloud services.

The trio joined with three academic research institutes to form the Open Cirrus group in July 2008, each dedicating computer servers with a total of 1,000 cores to form a distributed network of systems as a research platform. On Monday, three more research groups joined the effort: the Russian Academy of Sciences, South Korea's Electronics and Telecommunications Research Institute and MIMOS, an R&D arm of Malaysia's Ministry of Science, Technology and Innovation.

"We want to rally the larger research community around the vision of this open source cloud services stack," said Andrew Chien, vice president of research at Intel, at the first annual gathering of the partners here, echoing calls made by a group of 38 vendors in March.

Cloud services are essentially unused parts of large data centers rented out to host third-party applications. Amazon, Google and Microsoft have announced cloud service offerings, but each uses some of its own proprietary software to implement them.

An open source stack could prevent the spread of proprietary offerings and spur innovation, as the so-called LAMP stack of Linux, Apache, MySQL and PHP/Perl/Python did for Web 2.0 developers, Chien said.

"The LAMP open source stack was given credit for creation of a whole raft of Internet startups and tech innovation, and I think the same could prove true for an open cloud stack," Chien told the gathering of about 50 researchers hosted at HP headquarters.

"A million processors could be the province of a handful of companies in five years, or they may be broadly available to academic and startup communities and other businesses whether they be telecoms or entertainment companies," he said.

Commercial cloud services make their own brands of application frameworks available to users today. But they do not release the kinds of virtualization and monitoring tools, storage file systems and job schedulers they use to execute and manage those applications, Chien said.

The Intel researcher characterized the current variety of software tools for cloud computing as very fragmented.

"It's very hard to put together a clean, robust stack," said Chien. "We spend a lot of time in the industry building shims or glue to hold these pieces together."

The software is mature enough to make decisions about a standard stack now, he said, answering a question from one researcher. "We're in a landscape where a number of players are moving forward, and I don't expect their stacks to remain static," he said.

Sketches of an open cloud stack

Chien outlined elements of a possible open source stack based around Hadoop, an open source version of Google's MapReduce software used by Yahoo and others for distributed computing. The Open Cirrus stack also could include a low-level hardware abstraction layer from HP called Physical Resource Set (PRS), a virtualization layer from Intel called Tashi and several other emerging pieces.

Presentations on the software projects made it clear some are still at an early stage of development.

PRS aims to be a platform for allocating and managing hardware resources between data centers, running below the level of virtual machines. It could be used in cases where developers want to avoid the overhead of VMs or enforce policies about sharing between data centers, said Kevin Lai, an HP Labs researcher.

HP has an early version of the code running on CentOS using XML remote procedure calls and is developing a version for Ubuntu. Several key features are still in development, including support for variable pricing levels and integration with Hadoop.

PRS is currently based on iLO, a proprietary protocol HP uses to manage data centers. Lai said it should be ported to the more open IPMI protocol in the future. In addition, researchers from HP and Intel agreed they need to work together to integrate PRS with Tashi.

Michael Ryan, an Intel researcher, described Tashi as a virtualization layer for large data sets that aims to run across separate data centers that house both computers and storage systems in close proximity.

Tashi automatically generates and manages virtual machines in response to user application code. Intel has Tashi running on about 80 nodes of the systems it has linked to the Open Cirrus network.

Although many technical details are unresolved, executives made clear the business rationale for the open source effort.

"We want to raise the level of abstraction our developers work at so they can focus on applications-level innovation," said Shelton Shugar, senior vice president of cloud computing at Yahoo. The Web company's main aims are "improving speed of innovation in analyzing data and getting new products to market," he added.

Shugar declined to specify how many servers Yahoo manages, a closely guarded secret for many Web companies. But he did say the company supports 500 million unique users per month, hosts hundreds of petabytes of storage and can handle hundreds of thousands of transactions per second across tens of data centers it operates worldwide.

"We are interested in projects that make use of all nine data centers [Open Cirrus will be] making available," Shugar said.

"That's a very big deal because our social apps need to traverse the globe," he said. "Someone may write data here that needs to go to five different places and then get updated from somewhere else on the globe," he added.

The shift to global applications services "has big business model implications for IT vendors" and will influence everything from hardware interconnects to data structures and app design, said Russ Daniels, chief technology officer for cloud services strategy at HP.

Intel's Chien speculated that Google alone may operate as many as two million servers today, up from estimates of half a million just two years ago. Market researchers project cloud services could consume a quarter of computer servers by 2012, creating a market estimated to be worth tens of billions of dollars, he said.

"Whatever cloud computing is, and there is some debate on the definition, it seems it does matter," Chien said.

By Rick Merritt
Source: EE Times

Copyright © 2009 TechInsights, a Division of United Business Media LLC All rights reserved.


