Utility-based data marketplaces

In "Computer Mediated Transactions," Hal Varian offers an insightful look at how and why innovation has accelerated so rapidly on the internet. The piece traces the historical development of the internet starting in the 1990s, but it also makes some prescient predictions about the future. Given that it was written in 2010, I found the ‘Deployment of Applications’ section particularly compelling in light of the developments that have taken place since. Most notably, Varian states “in the future it is likely that there will be a number of cloud computing vendors that will offer computing on a utility-based model. This production model dramatically reduces the entry costs of offering online services and will likely lead to a significant increase in businesses that provide such specialized services (Armbrust et al. 2009)” (Varian, 2010). Although a number of the established cloud service providers (Google, Amazon, etc.) have made efforts in this space, I believe that Snowflake is perhaps the best example of the computing future Varian describes.

Snowflake was founded in 2012 as a private company and went public in 2020 in one of the most notable technology IPOs to date. It offers cloud-based data storage and analytics, a model that has become known as ‘data warehouse as a service.’ What is perhaps most interesting about Snowflake’s capabilities and business model is the decoupling of storage from compute. Customers pay next to nothing to store their data with Snowflake and are charged on a consumption basis as they run queries against that data. More importantly, Snowflake has created a data marketplace in which customers can share (or sell) data with one another, allowing small and large businesses alike to join datasets both internally and externally. In an age where data has become a strategic differentiator for nearly every business, the ability to democratize data access through shared infrastructure is quite compelling. However, the question remains whether shared infrastructure and utility-based pricing will in fact lead to a more democratic data ecosystem.

I agree with Varian's assertion that utility-based production models point to an exciting future, but I question the implication that they will be a net benefit to data consumers. In the past decade, we have seen news and communications democratized on the internet through businesses like Facebook and Twitter. However, companies like Facebook have struggled to offer a democratic communications utility while also bearing responsibility for what is shared on their platforms (often to the detriment of consumers). I wonder to what extent this offers a cautionary tale for data marketplaces like Snowflake. In the near term, businesses are likely to see lower costs of data analysis and easier access to data they may not have been able to query before. But if Snowflake were to grow the way Facebook did, at what point would it begin to lose control over, and insight into, what types of data are shared and with whom? More importantly, if we believe data computing is in fact a utility, to what extent do we want such a utility completely controlled by one (or a select few) private companies?
