
Measuring “the Cloud”: performance could be better

Better understanding of causes of delays

Storing information “in the Cloud” is rapidly gaining in popularity. Yet just how do these services really work? Researchers from the University of Twente’s Centre for Telematics and Information Technology (CTIT) have completed the first comprehensive analysis of Dropbox, a popular service that already has 100 million users. One shortcoming of this service is that its performance depends heavily on the physical distance to the Dropbox servers. The researchers will present their findings at a major forum, the Internet Measurement Conference (IMC2012) in Boston.

Users are fully aware of the advantages of cloud services. You can access your data from anywhere in the world, using a PC, laptop, tablet, or mobile phone. On the downside, there is no systems manager to call who can “fix things” if your data fails to appear. Researchers from the CTIT’s Design and Analysis of Communication Systems (DACS) group, together with counterparts from the Politecnico di Torino, have made the first detailed performance measurements of Dropbox (currently the most popular cloud storage service). For instance, they have examined the data exchange processes, and checked how and where the information is stored.

Hashing

Dropbox stores its information on Amazon’s servers, which are located on the west coast of the United States. The administrative functions, such as hashing (splitting the data into chunks and computing a fingerprint for each one), take place on its own servers. If the hashes show that a file (or a part thereof) has already been stored, then Dropbox will not transmit that information a second time. Other cloud services also use this approach to limit the volume of data exchanged. One condition for the efficient use of hashes is that users must not encrypt their data, because encrypted copies of identical content generally no longer produce matching hashes. If users ignore this requirement, they find that Dropbox’s performance drops sharply.
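
The hash check described above can be illustrated with a minimal Python sketch. It is not Dropbox’s actual implementation: the function names, the tiny chunk size, the choice of SHA-256 and the in-memory set standing in for the server’s record of stored chunks are all assumptions made for illustration.

```python
import hashlib

# Tiny chunk size so the example is easy to follow (an assumption;
# a real service would use far larger chunks).
CHUNK_SIZE = 4


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Cut the data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunks_to_upload(data: bytes, known_hashes: set) -> list:
    """Return only the chunks whose hash is not yet in `known_hashes`.

    `known_hashes` is a hypothetical stand-in for the server-side record
    of chunks that are already stored; it is updated in place.
    """
    to_send = []
    for chunk in split_into_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in known_hashes:
            known_hashes.add(digest)
            to_send.append(chunk)
    return to_send


server_hashes = set()
first = chunks_to_upload(b"aaaabbbbaaaa", server_hashes)   # repeated chunk sent once
second = chunks_to_upload(b"aaaabbbbcccc", server_hashes)  # only the new chunk is sent
```

In this sketch the second upload transmits a single chunk, because the other two are already known. If each user encrypted the data with a different key before uploading, identical files would yield different bytes and therefore different hashes, so nothing could be skipped.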

Something else that is not apparent to users is the physical distance to the servers on which their data is stored. Amazon has servers throughout the world, yet Dropbox only uses those that are situated on the west coast of the US. Together with the hashing operation, this can lead to a significant drop in performance. Users who are accustomed to having immediate access to their data just have to wait longer.
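
One way to see why distance matters is a rough back-of-the-envelope model, sketched here in Python. It assumes that chunks are sent one at a time and that each chunk waits one network round trip before the next is sent. That sequential behaviour, the function name and all of the numbers are assumptions for illustration, not measurements from the study.

```python
def transfer_time_s(n_chunks: int, chunk_bytes: int,
                    rtt_s: float, bandwidth_bps: float) -> float:
    """Estimate the upload time under a simple sequential model.

    Assumption: every chunk costs its transmission time plus one full
    round trip spent waiting for an acknowledgement.
    """
    sending = n_chunks * chunk_bytes * 8 / bandwidth_bps  # time on the wire
    waiting = n_chunks * rtt_s                            # time spent waiting
    return sending + waiting


# Same data and same link speed, with a nearby and a distant server
# (hypothetical round-trip times of 10 ms and 150 ms).
nearby = transfer_time_s(100, 100_000, 0.010, 8_000_000)
distant = transfer_time_s(100, 100_000, 0.150, 8_000_000)
```

Under these assumptions the time on the wire is identical in both cases, while the waiting time grows in proportion to the round-trip time, so the distant server is slower even though the user’s connection has not changed.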

The paper entitled “Inside Dropbox: understanding personal cloud storage services” by Idilio Drago (UT), Marco Mellia (Politecnico di Torino), Maurizio Munafò (Politecnico di Torino), Anna Sperotto (UT), Ramin Sadre (UT) and Aiko Pras (UT) will be presented on 16 November at the Internet Measurement Conference in Boston. Copies of the paper are available on request.