Wednesday, May 5, 2010

Strange use of my profile data on Gmail…

Yesterday I sent out a calendar invite to the CTO of a well-known cloud computing startup. When I view the response in Outlook on my PC, I see the acceptance email addressed to my Gmail address.

On my iPhone, the acceptance email from this person was addressed to a friend of mine. When I tap this friend’s name in the “To” field, I see both his Gmail address and my Gmail address in the details about my friend.

Now all this is very confusing because I sent my calendar invite from my work email and not from Gmail. There is a connection, though: I have set my work email as the forwarding address for emails received on my Gmail account that match a set of filter criteria. More specifically, I have Google Alerts directed to my Gmail account, and I forward those alerts to my work email address. Another important fact is that this startup uses Gmail for its corporate email.

So it is clear that when I send a calendar invite from my work email account to people at this company, it hits the Gmail servers. Gmail at that point appears to be using my work email address to look up my personal Gmail account and is sending the responses to both my work email and my Gmail account. This is a big data privacy and security issue. When enterprises worry about moving their applications and data to the Cloud, it is these kinds of leaks (inadvertent or otherwise) that they fear, and rightly so.
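To make the suspected behavior concrete, here is a minimal Python sketch. It is purely hypothetical: I have no knowledge of how Gmail actually resolves recipients, and the table names, function names, and addresses below are invented for illustration. It only shows how a provider-side table that links several addresses to one identity could cause a reply to fan out to an address the sender never used.

```python
# Hypothetical illustration only: this is NOT Gmail's actual code or data model.

# Assumed provider-side table mapping each known address to a single identity.
# A forwarding rule on the personal account is one way the two could get linked.
IDENTITY_BY_ADDRESS = {
    "me@work.example": "identity-1",
    "me@gmail.example": "identity-1",
}

# Assumed reverse table: every address the provider associates with an identity.
ADDRESSES_BY_IDENTITY = {
    "identity-1": ["me@work.example", "me@gmail.example"],
}


def resolve_reply_recipients_linked(sender_address):
    """Suspected behavior: reply to every address linked to the sender's identity."""
    identity = IDENTITY_BY_ADDRESS.get(sender_address)
    if identity is None:
        # Unknown sender: fall back to the address the message came from.
        return [sender_address]
    return list(ADDRESSES_BY_IDENTITY[identity])


def resolve_reply_recipients_isolated(sender_address):
    """Expected behavior: reply only to the address the invite was sent from."""
    return [sender_address]


if __name__ == "__main__":
    print(resolve_reply_recipients_linked("me@work.example"))
    print(resolve_reply_recipients_isolated("me@work.example"))
```

In the linked version the personal address shows up in the recipient list even though the invite came only from the work address, which is the kind of leak described above.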

In the traditional IT model, the likelihood of such events is low because there is a clear physical and ownership separation between my personal data and my enterprise data. Google has a natural desire to maintain a single identity across all my aliases/accounts, which may or may not be what I want. Its massive analytics infrastructure operates across all data regardless of ownership/tenancy boundaries.

While this may allow Google to offer valuable products and services to me, like highly personalized search results and recommendations, it exposes more information than I am comfortable disclosing, and worst of all, I don’t even fully understand what it might expose.

This is a small but illustrative example of why it is imperative that enterprises follow the model of “trust but verify” when engaging Cloud providers. Cloud providers must share and expose their architectural details for third-party audit. There may be a case for identifying functions that are more prone to creating privacy/security risks, like analytics, which by design are meant to extract identifiable information from large data sets regardless of tenant boundaries. Access to these functions and to their output data should be controlled, and the controls should be made transparent. In fact, enterprise tenants should be able to specify and verify which functions, such as analytics, are allowed on specific data. Consumers have little leverage with large corporations: we value the functionality these companies deliver far more than the potential loss of privacy/security, so we are unlikely to vote with our mouse clicks.
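As a rough sketch of what “specify and verify” could look like for a tenant, here is a small Python example. Everything in it is an assumption made for illustration (the `TenantPolicy` class, the data class names, and the function names); it does not describe any real provider’s API. The tenant declares which functions may run on which class of data, every check is recorded, and the record can be handed to an auditor.

```python
from dataclasses import dataclass, field


@dataclass
class TenantPolicy:
    """Hypothetical per-tenant allow list: data class -> permitted functions."""
    tenant: str
    allowed: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def permit(self, data_class, function):
        # The tenant specifies that `function` may operate on `data_class`.
        self.allowed.setdefault(data_class, set()).add(function)

    def check(self, data_class, function):
        # Deny by default; record every decision so it can be verified later.
        decision = function in self.allowed.get(data_class, set())
        self.audit_log.append((data_class, function, decision))
        return decision


if __name__ == "__main__":
    policy = TenantPolicy(tenant="example-enterprise")
    policy.permit("email", "spam_filtering")

    print(policy.check("email", "spam_filtering"))
    print(policy.check("email", "cross_tenant_analytics"))
    print(policy.audit_log)
```

The deny-by-default choice matters here: a function such as cross-tenant analytics is refused unless the tenant has explicitly allowed it, and the audit log gives the tenant something concrete to verify.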