Overview of Grid Computing

Web Services is starting to cede its pre-eminent position in IT hyperbuzz to Grid Computing. As always, the reason is the same: the major vendors are staking out positions on a technology frontier that they believe will either unlock the next great killer app or lift one or several of their product lines out of the doldrums. Cisco is the early adopter, followed by HP, IBM, Sun and a brace of start-ups promising On Demand or Grid Computing based on the Utility Model. For example, IBM has been flaunting the US Open website as a perfect example of On Demand computing: in late August it ramps up for 100 times the demand it will see throughout the rest of the year. Oracle has released a new version of its database, Oracle 10g, which is geared to take advantage of on-demand grid computing.

Make no mistake, there is real technology being developed in the trenches; but how well it can or should be expected to meet current IT needs and requirements is an open question. Grid computing has been a long time hatching, going back to the days of fail-safe and time-sharing processing in the late 1960s and early 1970s - MIT's Multics was designed to be a computing utility service. The basic idea of sharing processing loads among several computing systems has long been a target of many IT shops and vendors. The trick is to match transiently unused or under-exploited computing power and resources with scattered requesters of computing over a network or campus of systems. Grid Computing has been done successfully, notably in research institutions, with applications such as MMS and Neph for weather simulation or Cactus and Tardis for astrophysics (see Globus website reference). And commercially successful systems range from the Sun GridEngine clusters used to design and streamline McLaren racing cars for every F1 race to the Pratt & Whitney grid that helps design aircraft engines. These successes reflect the fact that vast improvements in hardware are finally being matched to and managed by better and more standardized distributed processing software.

Web Services and distributed software for dynamic sharing and recovery tasks, among others, have helped drive and perfect new grid software. On many large computational projects, such as genome identification, weather pattern recognition, or complex molecular modeling, marshaling spare computing resources is the only way the tasks can hope to get done. So many major vendors, recognizing the favorable convergence of resources and interests, are racing to develop positions in the fast-emerging field. For example, during the summer of 2002 HP and IDC conducted a series of seminars entitled "Building Grids: Hype Meets Reality" whose theme was that medium to large scale businesses can do grid computing now. The 451 Group has recently released a 215-page report arguing that the next 18 months will see the inflection point for the growth of grid computing. The result is that the emergence of grid computing may also be attracting more than its fair share of hype (see the Articles references).

Grid computing can be divided into four main categories: 1) local clusters, which manage many resources and many tasks but for only one system or project and on one network; 2) campus grids, which typically add many user systems or projects to the mix and usually many clusters of computing power to be managed, with tasks that may be spread over a wider network but usually within the same firewall; 3) Net taskers, which are single projects like SETI or FightAids that marshal worldwide, Internet-accessible resources for their individual purposes; and 4) global grids, which have many tasks or projects plus many owners, again accessing and utilizing shared resources anywhere on the Net and/or other special access networks. Three things favor adoption. First, within the last 5 years the huge amount of research and development done by universities and research institutes in sharing and effectively utilizing HPC (High Performance Computing) resources has advanced the state of the art of Grid computing to such a degree that many vendors (see the references for our extensive list of system vendors) are commercializing these applications. For example, projects on local clusters and campus grids are well served by software from a number of vendors, from the small Info Designs DeskGrid to Entropia, Platform Computing, Sun and others (see our list of software vendors in the references). Second, Grid computing is well served by the new Web Services technologies. And third, a relatively long history of application in university settings will provide the needed consulting expertise to launch these projects. So look for a fairly rapid take-off of Grid computing.
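As a rough aid to telling the four categories apart, here is a minimal sketch in Python. The GridProfile record and its field names are our own illustrative assumptions, not part of any vendor's product; the rules simply restate the distinctions drawn above (one project or many, inside one firewall or across the Net).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridProfile:
    """Hypothetical description of a deployment; field names are illustrative."""
    projects: int          # independent tasks/projects sharing the resources
    spans_internet: bool   # True if resources lie outside a single firewall
    clusters: int = 1      # separately managed pools of machines


def classify(profile: GridProfile) -> str:
    """Map a deployment onto one of the four categories described in the text."""
    if profile.spans_internet:
        # Internet-wide: a single project is a Net tasker, many make a global grid.
        return "net tasker" if profile.projects == 1 else "global grid"
    # Inside one firewall: one project on one cluster is a local cluster;
    # more projects or more clusters make it a campus grid.
    if profile.projects == 1 and profile.clusters == 1:
        return "local cluster"
    return "campus grid"


print(classify(GridProfile(projects=1, spans_internet=True)))
```

Real deployments blur these lines (a campus grid may reach a partner site, for instance), so treat the function as a memory aid rather than a definition.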

Resources

There are a couple of technologies very close to Grid computing which derive from or share its technologies: autonomic processing and P2P (Peer to Peer) processing. Autonomic processing is about enabling programs to be more resilient and self-sufficient in their operation. Autonomic computing enhances the desktop OS, client programs and server systems so that they can be more self-optimizing in performance, self-configuring at start-up, self-recovering (self-healing) and self-protecting against outside attack. Some parts of these functions are already available in various server and desktop client operating systems. What grid computing in conjunction with Autonomic computing is doing is helping to establish common protocols and standards for these important tasks. Fortunately, the Grid and Autonomic developers have a strong research and development track record in the university and institutional arena to draw upon (see www.gridcomputing.com reference). Peer to Peer processing is about one or more clients on the Internet sharing primarily data, but also processing, with minimal intermediate server intervention. Again, the discovery and brokering of services used in P2P are influencing Grid computing software.

As one might imagine given the nature and rate of change of the technology, the books and articles on the topic fall into two categories: 1) buried in the journals of academia, or 2) to be found in books and white papers but with either a proprietary bias or slightly out of date. The latter is true of two of the best books on Grid computing. Foster's The Grid is from April of 1998 but, despite the rapid advancements since, still manages to collect together articles on the core issues and opportunities in Grid computing. Pfister's book is from the same time frame, but its emphasis on the detailed issues of hardware clustering and sharing stands the test of time, as contemporary readers give it 5-star ratings. In contrast, the articles on Grid computing to be found in the trade press are primarily news items and status reports on who is leading the race to dominate the market. There is also a website, www.gridcomputingplanet.com, devoted to news, reviews, and other links; but this reviewer prefers the links-only site www.gridcomputing.com for the best pointers on where on the Grid to go.

Do It Yourself

The best way to get first-hand experience with Grid Computing is to do it yourself. There are a number of P2P grids that users can participate in by loaning spare CPU minutes over the Net to various causes. Perhaps the most famous is SETI@home, which is using spare CPU cycles throughout the world to help analyze various extraterrestrial electromagnetic signals for signs of intelligence. Other Grid projects that you can lend CPU time to include fighting AIDS, doing genome research, participating in stock market forecasting using neural networks, or finding the 5 largest prime numbers. So this is an opportunity to test how unobtrusive sharing CPU cycles really is. And users can even set up their own Grids by downloading the small, Windows-only InfoDesign's DeskGrid, some of the gaming experiences to be found at Butterfly.NET, or the full-scale yet essentially free Sun GridEngine, which works on Linux and Solaris (the complete source code and copious documentation are available at gridengine.sunsource.net). MindElectric and Entropia also have downloads available for use on Windows or Mac machines.
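To make the idea of lending CPU cycles concrete, here is a minimal sketch of how a volunteer project might carve a big job into independent work units and merge what comes back. It is not the code of SETI@home or any project listed here; the function names, the prime-search job and the trial-division test are all our own simplifying assumptions.

```python
def is_prime(n: int) -> bool:
    """Trial division; deliberately simple, not what a real project would use."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def make_work_units(start: int, stop: int, unit_size: int) -> list[range]:
    """Split [start, stop) into independent chunks that can be handed out."""
    return [range(lo, min(lo + unit_size, stop))
            for lo in range(start, stop, unit_size)]


def run_work_unit(unit: range) -> list[int]:
    """What a volunteer machine would compute during its idle time."""
    return [n for n in unit if is_prime(n)]


def merge_results(partials: list[list[int]]) -> list[int]:
    """The coordinating server combines answers, whatever order they arrive in."""
    return sorted(n for part in partials for n in part)


units = make_work_units(1, 101, unit_size=25)
# Simulate volunteers returning their results out of order.
results = [run_work_unit(u) for u in reversed(units)]
primes = merge_results(results)
print(len(units), primes[-5:])  # 4 [73, 79, 83, 89, 97]
```

The point of the design is that no work unit depends on another, so a slow or vanished volunteer delays only its own chunk, which the server can simply hand to someone else.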

What we found very useful at both the commercial and P2P sites is the strong community available for help and assistance. This is helpful both for setting up clients and, especially, when setting up a Grid server. Sun's documentation and community support for GridEngine were very impressive; but support for Windows clients requires users to get special add-on software. In contrast, Entropia and MindElectric make less of their software innards and documentation available but do support Windows machines quite easily in their clusters. Finally, we appreciated being able to share with MoneyBee, where we got some direct benefit from accessing the stock market forecasting results to which we had contributed a small part.

Ubero offers a mix of fee or free grid computing projects that users can sign onto, thus giving individuals and organizations a choice, and a small ROI, in how they share their excess CPU power. Of course a lot of major vendors like Intel and IBM see this as an opportunity; however, the global Grid software is still not up to the task of transparently sharing and trading resources among users who may also be suppliers or competitors at other times.

Generally, Grid software draws on a wide set of client resources, from supercomputers through midrange servers to workstations and PC desktops. This is the challenge of Grid distribution and scheduling software: to find and allocate tasks among a myriad of heterogeneous machines with varying capabilities, and to do so while handling the inevitable exception and failover conditions. Thus the attraction of do-it-yourself is being able to try the software first as a client and then as a Grid server, in order to get a front-line feel for how well Grid software handles these tasks. And as a Grid server, one can graduate step by step from local clusters to a global grid. It is helpful that Grid computing has so many entry points.
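The scheduling challenge described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how GridEngine or any product mentioned here actually schedules: the Machine record, its relative "speed" score, and the greedy earliest-finish rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    speed: float             # relative speed; a hypothetical capability score
    busy_until: float = 0.0  # simulated time at which its queue drains
    alive: bool = True


def assign(tasks: dict[str, float], machines: list[Machine]) -> dict[str, str]:
    """Place each task (name -> work) on the live machine that finishes it soonest."""
    placement = {}
    # Largest tasks first, so big jobs get first pick of the fast machines.
    for task, work in sorted(tasks.items(), key=lambda kv: -kv[1]):
        live = [m for m in machines if m.alive]
        if not live:
            raise RuntimeError("no live machines left in the grid")
        best = min(live, key=lambda m: m.busy_until + work / m.speed)
        best.busy_until += work / best.speed
        placement[task] = best.name
    return placement


def fail_over(dead: str, placement: dict[str, str], tasks: dict[str, float],
              machines: list[Machine]) -> dict[str, str]:
    """Mark one machine as failed and re-place only the tasks it was holding."""
    for m in machines:
        if m.name == dead:
            m.alive = False
    orphaned = {t: tasks[t] for t, host in placement.items() if host == dead}
    updated = dict(placement)
    updated.update(assign(orphaned, machines))
    return updated


machines = [Machine("supercomputer", 10.0), Machine("server", 4.0),
            Machine("desktop", 1.0)]
tasks = {"simulate": 40.0, "render": 8.0, "index": 2.0}

placement = assign(tasks, machines)
placement = fail_over("server", placement, tasks, machines)
print(placement)
```

In this run the midrange server first picks up the "render" job, and when it drops out that job moves to the supercomputer while the other placements stay put. Real schedulers must also weigh memory, data locality, priorities and partially completed work, all of which this sketch ignores.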

However, a grid is at the ultimate point of optimization in an organization's systems. First, a grid presumes there is an adequate set of hardware and software systems in place meeting the front-line operational needs, and then the planning needs, of the organization. Then a grid also relies on very high standards of security, reliability and interoperability in those same systems. Finally, a grid will test the savvy of both the operations and development staff in customizing systems to meet specific grid-based requirements.

In effect, a grid environment and an on-demand, utility-like computing model is where many organizations ultimately want to be, but without the disastrous blackout liabilities of a tightly linked utility. And for some disciplined organizations such goals are reachable. However, the current reality is several steps away from the necessary and sufficient conditions. Microsoft will have to deliver substantially more on its trustworthy computing, high reliability, and interoperability initiatives. Database and middleware vendors likewise will have interoperability and reliability gaps to close, while hardware and network vendors will have to deliver easily installed and robust security mechanisms. In short, for many organizations Grid and On Demand computing will be a goal, an ideal to be attained.

Summary

During the recent pullback in IT investments, companies are looking hard at how to get economies in their IT function. And there, sitting on their desktops and networks, is an enormous resource of idle CPU time waiting to be tapped and harnessed for more productive return. Some companies, like Ford, Pratt & Whitney and Nortel, are taking advantage of this spare capacity, using grid software to do everything from modeling and simulations to bread-and-butter compute-bound engineering and market analysis. But many are doing so quietly, generally keeping mum about what goes on after hours. And why not? Grid computing offers not only real and substantial cost savings but also competitive advantage, as it is a solid development testbed for emerging Web Services, Autonomic computing and other potentially high pay-off computing strategies. Do-nothing CIOs may come to see the truth of a hybrid adage: "idle CPU minds mean having the devil to pay later".

References:

Articles:
Grid Computing by Mitchell Waldrop, MIT Technology Review, May 2002 - provides a good overview of the major commercial and academic players in Grid computing

Books:
The Grid: Blueprint for a New Computing Infrastructure by Ian Foster, Morgan Kaufmann 1998 - has remarkable timeliness in defining the nature and issues of grid computing
In Search of Clusters (2nd Edition) by Gregory F. Pfister, Prentice Hall 1998 - discusses core problems in cluster and grid computing in clear terms
Grid Computing: A Practical Guide by Ahmar Abbas, Charles River Media - recent, November 2003, book on grids

Do-it-yourself:
SETI - first of the global grid projects; UC Berkeley looking for signs of intelligence in radio signals from space
MoneyBee - Ever wondered if AI could give you an advantage in predicting the stock markets? Lend some CPU time and get some answers at MoneyBee.
DaliWorld - shows where in the world the fish you have "adopted" have traveled
FightAids - Grid project devoted to fighting AIDS through drug analyses passed on to the Scripps Institute
GIMPS - project is trying to find the 5 largest prime numbers in the world
COSM - Stanford needs help in simulating the dynamics of protein folding in gene research
Sun's GridEngine - Get the full Sun GridEngine 5.3 and set up your own local grid to do compute sharing

Vendors:
Altair - has OpenPBS (Open Portable Batch System), a good test bed for batched grid apps (Photoshop users take note).
Avaki - has campus and global grid solutions centering on enterprise information integration.
Butterfly.net - the ultimate gaming experience powered by clusters of PCs and then only one.
Centrata - data center management software that performs grid-like site scheduling, backup and recovery functions.
Data Synapse - does clustering with emphasis on ease of conversion of legacy apps to Grid processing.
Ejasent - provides software to help setup and do policy-based application control for On Demand computing
Enigmatec - delivers a self managing grid computing platform
Entropia - enables local clustering of networks of PCs for non-disruptive CPU sharing.
Frontier - prepares Java programs for being distributed over a local cluster for processing.
Gridfrastructure - provides local and campus Grid infrastructure tools for managing the grid.
Grid Systems - provides local and campus grid processing with specialized templates for converting tasks to the grid
Gridiron Software - tools make it very simple to add parallel distributed processing to your tasks
Oracle - had done quite a lot prior to Oracle 10g to enable grid computing in its products.
Platform Computing - has grid solutions resold by IBM and HP.
Powerllel - provides tools to parallelize applications for use in MPP and Grid computing.
Sun - has cluster, campus, and global grid solutions using heterogeneous platforms and open source standards
Symbiant - does consulting and software in the P2P and Grid Computing world
TheMindelectric - Gaia is P2P/Web Services/Grid computing software that automatically manages tough load balancing, clustering and failover tasks.
United Devices - MetaProcessor is campus wide grid software

Standards
Grid Forum - grid standards making group active in such areas as architectures, security, P2P processes, scheduling and resource management, etc.
Distributed Resource Management Application API (DRMAA, pronounced "drama") - Grid Forum standard supported by IBM, Intel, Platform Computing, Sun
Globus - The Globus Project's OGSA (Open Grid Services Architecture) will add Web services and appear in Globus Toolkit 3.0 with other Grid Computing software.

Websites:
www.dsonline.computer.org - IEEE distributed systems online site has info on all aspects of distributed processing
Gridcomputing.com - all the info on university, research institute work in Grid computing with scores of solid links
Gridcomputingplanet.com - site devoted to grid computing news, resources, and commercialization
Globus.org - global GRID applications like MMS and Neph in weather, Tardis in astrophysics, etc
Links on the Grid - a very good set of links to all things Grid

©Imagenation 2001-2004