Casey O’Donnell’s recent post at Culture Digitally reminded me of a conversation I had not long ago with another researcher, in which we realized that, although both of us were speaking about “the cloud,” each of us was talking about slightly different things. That conversation grew into an email thread that I’ll adapt here, in which I tried to “disambiguate” (oh, lovely Wikipedia word) the various meanings that have grown up around “cloud computing.”
I feel like clarification, however small, is important because, as Tarleton Gillespie has pointed out with his example of the “politics of platforms,” when you see a good deal of equivocation around a particular word, it’s often an indicator that people are using it strategically in ways that are important to grasp for social critics and social scientists alike.
And people certainly equivocate over what they mean by “cloud computing.” I count at least three related, but distinct, meanings for “cloud computing” in popular use these days: one having to do with technical infrastructure, another with user experience, and a third with business models.
Below, I try to pick apart these meanings. Note that I’m intending this post for those without an extensive technical background, but readers with technical expertise should feel free to chime in with critiques, clarifications, and other comments. Those interested in a more extensive discussion should check out the excellent Wikipedia article on cloud computing, which does a superb job of breaking down the various uses of the term and the history of the technologies that surround it.
First there’s the technical component. While cloud computing has a long and storied history, for the sake of convenience I’m going to sketch its technical underpinnings here in terms of relatively recent and simple developments and examples.
Interactive websites, like blogs, social networks, e-commerce sites, etc., are commonly referred to as “web applications.” And, as a point of reference, many web applications are called “three-tiered” applications, in that different parts of their code run in three different environments. One of these “tiers” is the database management system, which stores and, upon request, coughs up all the data used in the web application. If your web application is something relatively simple, like a blog, the database management system is storing and handling a database full of things like blog posts, comments, user login information, and so forth. The database management tier runs on a server, but it is compartmentalized such that it can run on an entirely different server from the rest of the server-side software described in the next tier.
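For readers who like to see things concretely, here is a toy sketch of what the database tier does for a blog, using Python’s built-in SQLite engine. The table name and the sample post are invented for illustration; a real site would run a dedicated database server rather than an in-memory database.

```python
import sqlite3

# Toy database tier for a blog: one table of posts.
conn = sqlite3.connect(":memory:")  # a real site would use a dedicated DB server
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute(
    "INSERT INTO posts (title, body) VALUES (?, ?)",
    ("Hello, cloud", "A first post."),
)
conn.commit()

# The server-side tier (described next) would issue queries like this
# each time a user requests a page.
rows = conn.execute("SELECT title, body FROM posts").fetchall()
print(rows)
```

The point of the sketch is simply that this tier knows nothing about webpages; it only stores records and coughs them up on request.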
The second tier of the web application is the server-side software, which is just what it sounds like: software that’s running on the web server. This part of a web application is responsible for responding to requests from users, sending their information to the database, and using information returned by the database to deliver contextually specific webpages for the user (e.g., showing you a Facebook page populated with your friends’ information, or a page of search results for the keywords you entered). The third tier is the client-side code: the HTML, CSS, and JavaScript that the server sends along to your web browser, which renders and runs it on your own computer or device.
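The server-side tier’s job can be sketched in a few lines. This is a deliberately simplified illustration, not how any particular site is built: the “database” here is faked with a Python dictionary so the example is self-contained, and the function name is invented.

```python
# Toy server-side tier: turn a user's request into a page, using data
# from the database tier (faked here with a dict for self-containment).
FAKE_DB = {1: {"title": "Hello, cloud", "body": "A first post."}}

def render_post(post_id: int) -> str:
    """Build a contextually specific page for one user's request."""
    post = FAKE_DB.get(post_id)
    if post is None:
        return "<h1>404 Not Found</h1>"
    return f"<h1>{post['title']}</h1><p>{post['body']}</p>"

print(render_post(1))   # the HTML the browser tier would receive
print(render_post(99))  # an unknown post yields an error page
```

Notice that the same code produces a different page for each request: that, in miniature, is what “contextually specific” means above.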
Note that of the three tiers I’ve described, only one (the browser tier) actually runs on your own computer or device. The other two run on the server, and this means that sophisticated and/or well-trafficked websites, while they may not use up many resources on your own computer, can end up putting quite a lot of strain, in terms of storage and memory, on the server running the site. To overcome these challenges, particularly high-traffic sites like Amazon and Google, which also run sophisticated web applications housing lots of data, spent a lot of time and money figuring out ways to distribute the data storage and processing tasks of the first and second tiers across numerous computers, rather than assigning them to a single beleaguered server.
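One simple way to get a feel for what “distributing storage across numerous computers” means is sharding: deterministically assigning each record to one server in a pool based on its key. This is only a sketch of the general idea, not how Amazon or Google actually do it, and the server names are made up.

```python
import hashlib

# Hypothetical pool of database servers ("shards").
SERVERS = ["db-server-0", "db-server-1", "db-server-2"]

def server_for(key: str) -> str:
    """Pick a server deterministically from a record's key."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

# Every lookup for the same user lands on the same server,
# while users as a whole are spread across the pool.
assert server_for("user:alice") == server_for("user:alice")
print({k: server_for(k) for k in ["user:alice", "user:bob", "user:carol"]})
```

Real systems add many refinements (replication, rebalancing when servers are added or removed), but the basic trick of spreading load across machines is the same.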
This is an oversimplified, but hopefully reasonably accurate, description of what cloud computing means from a technical perspective. There are, of course, some caveats. First, in addition to companies like Google and Amazon, there were other, less consumer-oriented companies, like IBM and Oracle, who did a lot of pioneering in these areas, and the development of the technologies behind cloud computing long predates the emergence of the term. Second, many web applications also use APIs, RSS, and the like to provide data to software other than browsers (e.g., native mobile and desktop apps). And third, cloud computing can involve the creation of distributed computing resources for applications entirely unrelated to the web.
With regard to the second meaning of “the cloud,” user experience, many users have found web applications supplanting functionality they traditionally associated with desktop applications and a local hard drive. The ostensible advantages for end users are clear enough. If you use a web application like Google Drive to work on and store your documents, you can save hard drive space on your own computer, you don’t have to pay for software like Microsoft Office, you can access your data from anywhere using any operating system that will run a full-featured web browser, and, since much of the computing horsepower behind the web application comes from the server rather than your local machine, you can get away with purchasing less expensive hardware.
In many cases, this sort of user experience is underpinned by a distributed computing infrastructure that fits the technical description of cloud computing as I’ve defined it above. However, regardless of whether the actual server infrastructure underlying a particular application is distributed or not, the notion of using online services to store data or get work done has been popularized/marketed to users as keeping one’s information “in the cloud.”
The third sense in which I hear “cloud computing” tossed around is as a business model. The neat technical trick described above, of distributing server-side software and database functionality across a large number of computers, has turned out to be salable, in that lots of developers want to be able to write and host their web applications without worrying about how system resources like storage space and memory will scale as they take on more users and page views. So companies like Amazon (via Amazon Web Services, “AWS” for short) and Google (via App Engine) began selling their distributed computing power to developers, much like an ordinary web hosting service, except that instead of simply leasing space on a single server, you leased access to the company’s distributed (a.k.a. “cloud”) computing resources for storing your data and server-side code.
This business model of setting up a distributed computing infrastructure to sell to application developers has become increasingly common, as has the practice of building online applications that depend on cloud computing resources (e.g., Google Drive or Dropbox) and marketing them directly to end users. This business strategy allows companies to sell access to their infrastructure and/or applications for a recurring fee (or to sell ads against them in perpetuity in the case of “free” services). For example, where a university might once have paid a one-time fee to license a copy of Microsoft Outlook and Word, it may now instead pay a recurring fee to use Gmail and Google Docs as part of Google’s “Apps for Education” program.
Similarly, beyond the cloud services that we encounter daily as consumers, there are a huge variety of cloud services now being developed for specialized industries, from medical records management to human resources.
Many cloud computing services, like App Engine or AWS, act as intermediaries, delivering data and software created by independent developers and content providers. In writing about cases like these, I occasionally use the term “transparent intermediaries,” since part of the promise of services like App Engine is that users will remain unaware of the intermediary’s role in delivering a client’s branded product.
What’s Interesting Here and Why’s It Important?
With regard to why the cloud, and its various definitions, matter to social critics and communication researchers, Jonathan Zittrain’s book The Future of the Internet and How to Stop It comes to mind. It does a nice job of depicting both the advantages and the pitfalls of centralized computing power for society and end users. When our applications and data are stored in Google or Facebook’s software and infrastructure, we trade a sizable amount of agency for the conveniences they provide. We’re subject to their terms of service, and we also experience a lack of control over the features we have access to. By comparison, until recently, if we didn’t like the changes made to a new version of Adobe Photoshop, we could continue to use the old one for years, whereas we’re immediately and permanently stuck with any change Google makes to Gmail or Google Drive—and, now, any change Adobe makes to its Creative Cloud. Moreover, users who trade down to tablets or lower-powered hardware with the intent of ditching desktop applications in favor of web-based ones are especially vulnerable to issues like these. And Zittrain points out that as our data and our computing resources become increasingly centralized, they also become more vulnerable to regulatory pressures.
Issues of lock-in always loom large with cloud computing, as with any “software-as-a-service” business strategy. In the context of social networks, folks like Søren Petersen have done a nice job of pointing out that even the ability to take your data with you when you leave a service doesn’t truly free the user from issues of lock-in, as much of the value that online services add has to do with the context in which your content is used and displayed, rather than with the data itself. This is, I imagine, as true for clients of enterprise cloud computing applications as it is for users of Facebook.
More generally, as I mentioned at the outset of this post, the multiplicity of ways in which companies and individuals use the term “cloud computing” also suggests that the term may be deployed strategically, akin to the way Tarleton Gillespie has documented how companies and developers advantage themselves by using the term “platform” differently in distinct social and political contexts.
Lastly, the looming presence of Oracle in many areas of cloud computing adds another wrinkle. Tensions between proprietary and open source software development strategies have run high in recent years as various companies have tried new ways of monetizing open source projects that have at times run counter to the spirit of commons-based peer production on which open source has long been predicated. (Tony Liao’s forthcoming work on the Android Developer Challenge contains a nice discussion of some of these issues.)
Oracle has found itself at the center of many of these debates. As a corporation it has a reputation of being somewhat less than friendly in its policies toward open source software communities, but at the same time it has taken over administrative oversight of numerous major open source projects in recent years as it has bought out competitors like Sun, which had been heavily involved in open source development. For example, Oracle now oversees MySQL, the open source database management software that underpins a huge portion of the web’s interactive sites, and Solaris, a once largely open-source operating system that had competed nimbly with Linux. And Oracle only recently ceded control of OpenOffice, an open source competitor to Microsoft Office, but not before most of the software’s original developer community had abandoned ship.
Since tensions between open source and proprietary software development strategies are at the center of many debates about the locus of user agency in cloud computing, Oracle’s involvement in various cloud computing tussles may ultimately add some interesting threads to the narratives surrounding the still-dynamic term “cloud computing.”