Tuesday, May 19, 2009

Zettabytes: Juiced Computers?

For the past several years, this blogger has marveled at the infinite capacity of Blogger (owned by Google) to accept and store (for nearly 6 years now) all of the gobbledygook that comprises this blog. Text, image files, and video clips do not faze Blogger. Not once has the uploading of all of this stuff prompted a groan, a burp, or a yelp from the Blogger sysops (system operators) or admins (system administrators). Multiply that by gazillions of geeks like this blogger and that is a lot of stuff being uploaded to an unknown number of mainframe computers. A Google search for the number of blogs on Blogger produced only one nugget: Blogger limits the number of blogs that can reside within a single Google account. Limits? Reading further, each blogger is limited to one hundred individual blogs in a single Google account! Outrageous! 100 blogs per blogger? The scale is beyond comprehension. Now, a geek reporter announces that computer servers on virtual PEDs (Performance Enhancing Drugs) are on the horizon. The reporter focuses on Facebook rather than Blogger, but since this blogger also has a Facebook page, the shoe fits. The astronauts who repaired the Hubble Telescope were constantly on alert to avoid "space junk." Cyberspace is filled with junk like this blog. Duck! If this is (fair & balanced) cyberspace junk, so be it.

[x Salon]
The GigaOm Network — Social Networking & Dawn Of The Zettabyte Era
By Om Malik

Earlier today [05/18/09], I stopped by at the Social Graph Symposium at Sun Microsystems’ Menlo Park campus. The event, which attracted some of the most well-known experts on social networks and social graphs, was organized to look at the various challenges and opportunities being presented by the increased socialization of the web.

And there is no opportunity bigger than the one offered by the computational needs of this new social web. As we discussed during our first Structure conference (and we will continue to discuss at our upcoming Structure 09 conference on June 25th), the social Internet has made it easy for anyone to create, publish, distribute and consume content. Such content can range from blog posts to YouTube videos to Flickr photos to simple, 140-character tweets.

But small drops of water will lead a bathtub to overflow if the pipes are clogged, and this is the challenge faced by the underpinnings of the web. “In the next 12 months there will be a zettabyte of information on the Internet,” said Dr. James Baty, distinguished engineer, VP and chief technology officer of Sun Microsystems. A zettabyte is the equivalent of 1 billion terabytes — or nearly a billion times the data stored on the various drives in my apartment.
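
For readers who want to check the zettabyte comparison, here is a minimal sketch. It assumes decimal (SI) units, where a terabyte is 10^12 bytes and a zettabyte is 10^21 bytes; the table and the function name are made up for this illustration and are not from the article.

```python
# Decimal (SI) byte units, stored as powers of ten.
# Assumption: the article means SI units, not binary ones (tebibytes, zebibytes).
SI_EXPONENT = {
    "terabyte": 12,
    "petabyte": 15,
    "exabyte": 18,
    "zettabyte": 21,
}


def units_per(big: str, small: str) -> int:
    """Return how many `small` units fit into one `big` unit."""
    return 10 ** (SI_EXPONENT[big] - SI_EXPONENT[small])


# One zettabyte expressed in terabytes: 10**9, i.e. one billion.
print(units_per("zettabyte", "terabyte"))
```

Under that assumption, "nearly a billion times" the drives in one apartment implies a little more than one terabyte of home storage.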

That explains why Sun was interested in hosting the event — after all, online social interactions are key drivers of the massive explosion of data on the Internet. To help you better understand the magnitude of growth, let me share a couple of data points from a previous post about Facebook’s photo service.

  • Facebook users have uploaded more than 15 billion photos to date, making it the biggest photo-sharing site on the web.

  • For each uploaded photo, Facebook generates and stores four images of different sizes, which translates into a total of 60 billion images and 1.5 petabytes of storage.

  • Facebook adds 220 million new photos per week, or roughly 25 terabytes of additional storage.

Facebook is trying to use a smart-software approach to manage this data deluge. The story is no different at, say, MySpace, Twitter or any big social web company. More and more companies are turning to Hadoop and other software written for the ultra web. Gary Orenstein in a recent post outlined the various systems that have emerged to capitalize on the data mining renaissance.
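
The Facebook photo figures quoted above can be cross-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes decimal units (1 petabyte = 10^15 bytes, 1 terabyte = 10^12 bytes) and that the four stored sizes of each photo average out to the same per-image size everywhere. The variable names are invented for the illustration.

```python
# Figures as quoted in the article.
PHOTOS_UPLOADED = 15_000_000_000          # "more than 15 billion photos"
IMAGES_PER_PHOTO = 4                      # four sizes stored per upload
TOTAL_STORAGE_BYTES = 1_500_000_000_000_000   # 1.5 petabytes (decimal, assumed)
NEW_PHOTOS_PER_WEEK = 220_000_000         # "220 million new photos per week"

# Total stored images: 15 billion uploads times 4 sizes = 60 billion.
total_images = PHOTOS_UPLOADED * IMAGES_PER_PHOTO

# Average size of one stored image implied by the totals: 25,000 bytes.
avg_image_bytes = TOTAL_STORAGE_BYTES / total_images

# Weekly growth implied by that average, in terabytes: 22.0.
implied_weekly_tb = (
    NEW_PHOTOS_PER_WEEK * IMAGES_PER_PHOTO * avg_image_bytes / 10**12
)

print(total_images, avg_image_bytes, implied_weekly_tb)
```

The implied 22 terabytes per week sits close to the article's "roughly 25 terabytes," so the quoted numbers are consistent with one another at about 25 kilobytes per stored image.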

Sun wants to understand the computational needs of a web that is driven by real-time social interactions. For the longest time, the world has been OK with batch processing of data that took hours. Not any more: the web (and the Internet) is becoming a real-time proposition. Analyzing that data in real time will take a lot of computing horsepower.

“The computational challenge of looking at the unstructured data and mining that data is immense,” Dr. Baty said. We are at a tipping point, he said, that will see computer clusters of today bulk up to levels that would put even steroid-enhanced baseball players to shame. “We are going to go from tens of thousands of (processor) cores in cluster to hundreds of thousands of (processor) cores,” he said. ♥

[Om Perkash Malik is the founder of Giga Omni Media, Inc. and executive editor for technology blog GigaOM. Malik graduated from St. Stephen’s College in New Delhi in 1986, with an honors degree in chemistry. He wrote Broadbandits: Inside the $750 Billion Telecom Heist in 2003.]

Copyright © 2009 Salon Media Group, Inc.

Copyright © 2009 Sapper's (Fair & Balanced) Rants & Raves