揭秘Databricks成长故事:看准云技术,他创建了一家市值280亿美元的公司|GGV投资笔记第九十一期
作者:Glenn Solomon
Databricks即将到来的IPO是最受期待的IPO之一。Databricks成立于2013年,提供AI赋能的开源数据分析平台,如其联合创始人Ali Ghodsi所述,“该平台获取海量的企业数据,并在此基础上通过机器学习和数据科学做出预测”。已有超过5000家公司使用Databricks由开源驱动的湖仓一体架构来处理、加工和分析其非结构化和半结构化数据,可见Databricks确实找对了方向。
Databricks近期融资16亿美元,公司市值超过380亿美元,其成功可见一斑。我最近和Ali讨论了他如何将一个开源项目发展为一家几百亿美元级的公司,以及企业家可以从他的经历中获得哪些经验。
这一项目是Ali在加州大学伯克利分校做研究员时协助创建的,他现在仍是该校计算机科学系的兼职教授。以下是我们谈话的节选。
Glenn:Databricks是如何帮助公司分析其公司数据的?
Ali:实际上,使用Databricks的方式有无数种,客户所做的那些了不起的事情令我们叹为观止。例如,再生元制药公司使用我们的ML算法检测与慢性肝脏疾病有关的DNA基因,随后他们可以开发针对这一特殊基因的靶向药物。Comcast等公司使用Databricks驱动声控遥控器。当你对着遥控器讲话时,声音数据会进入云,Databricks便通过机器学习进行处理,如果它明白了你的指令,就会将电视换到相应频道。在疫情期间,医院使用Databricks实时掌握急诊室的使用情况,这样就可以将救护车上的患者转到另一家有床位的医院。金融服务公司会分析卫星数据,预测应投资哪一全球领域和公司。壳牌公司使用Databricks监控2亿个阀门的传感器数据,预测是否有任何阀门会断裂,这样就可以提前更换,维持系统的运行、节约成本并确保员工安全。
Glenn:在加州大学伯克利分校时,你作为访问学者帮助构建了Databricks的基础开源代码。请讲讲你从黑客到创始人的历程?
Ali:伯克利分校有一位非常出色的计算机科学教授Dave Patterson,他向学生开放实验室和办公空间,让我们进行头脑风暴和协作。我们当中有计算机科学家、工程师、数学家和ML专家,大家一起工作,看看我们能创造出什么,Apache Spark由此诞生。最初版本是为了更快地将巨量的数据集加载到内存中。Spark为Databricks的大部分工作奠定了基础。因为自2009年在伯克利时我就一直致力于核心技术,所以2013年我和联合创始人成立Databricks时,也深度参与到产品的创建和工程开发中。
Glenn:对于Databricks经历的超级增长,你为此做过规划吗?
Ali:在早期,我们制定了一个快速发展计划,设定了清晰的公司规模发展目标。但我们的早期计划是有一天将公司以1亿或2亿美元卖出去,显然我们低估了公司的潜力。
但我们做对的一件事是,我们认定了一个趋势,很多人认为这一趋势永远不会真正到来,或者至少几十年内不会到来。我们相信云可以容纳所有数据,并不需要本地解决方案。每个人都认为我们太疯狂了,因为很多公司已经在数据中心领域投资了几十亿美元,人们需要本地解决方案。一个潜在客户甚至给我们2000万美元,要求打造一个我们软件的本地版本。
这一要求很难拒绝,但我们仍坚定地只做云版本,早期的很多质疑者现在都在用我们的SaaS产品。大胆行事是使公司真正获得巨大成功的唯一途径,因为如果你只是以些许不同的方式来做同一件事情,大公司便会吞并你。他们会复制你的战略并做得更好,因为他们有更多的资金和工程师。因此你必须考虑未来会发生什么,然后为那个未来创建一家公司。
Glenn:你如何基于免费开源软件(这里指Apache Spark)打造一家公司,同时又成功盈利?你是如何巧妙平衡的?
Ali:开源是一把双刃剑。我们在伯克利创建了Spark,但随后我们创建Databricks时,我们构建了新的专有引擎Ignite。我们很快意识到只有开源才能推动真正的巨大增长。我们是一家企业级公司,但我们的增长却像Facebook或Twitter等B2C公司,因为开源社区为我们带来了极为迅速的口碑传播。但问题是如何让人们为产品付费。开发者会在会议上找到我们,同我们合影,告诉我们Databricks如何深刻改变了他们的生活,但随后我们问他们是否愿意为我们的SaaS服务付费时,他们会说,“为什么要付费呢,你们向我们免费提供了软件”。因此我们不得不考虑应该在开源版本中留下什么,以确保开源版本对开发者是真正有价值的,同时考虑应该在SaaS版本中加入什么,以创造足够高的价值使公司愿意付费。
Glenn:你们是如何盈利的?
Ali:我们有一个非常独特的模式,称为SaaS开源。它与本地开源非常不同:在本地开源模式下,你可以下载免费软件,开源公司还会打造增加了额外功能的付费版本供下载。对我们来说,我们只有SaaS,因此我们只是在后台不断更新产品。
我们就这些开发工作以及软件的运行、运营和托管向客户收费。我们也持续为完全免费的Databricks开源版本做贡献,但我们的SaaS服务有更多企业感兴趣的特性,比如可靠性、可用性和可扩展性。我们一直在做SaaS,并且只做SaaS,没有本地版本的收入可以依赖,因此我们必须从一开始就在云交付的各个方面做到出色。
Glenn:你们持续创新的秘诀是什么?
Ali:保持相对敏捷是促进我们创新的一个因素;我们现有超过1700名员工。但自第1天起,我们就一直牢记史蒂夫·乔布斯的一句话——你应该“在别人干掉你之前干掉自己”,而且要永远在未来科技和当前利益之间选择前者。
我们所有的开发者都相信你应该“杀死你的挚爱”;我们没有办公室政治,也不执着于某一种做事方法。我们的创新基因一部分来自最早创建公司的二三十个人,他们是伯克利分校的研究员,视自己为真理的追求者,坚信应该由数据来做决定,而不是人。
有时我不得不阻止创新,因为我们的技术团队总是“嘿,我们应该做这个了不起的新项目”,他们会在所做的一切努力还没成功前就想抛弃这些努力重新再来。
Glenn:你提到你们会在今年上市。Databricks下一步有何打算?
Ali:我们谈到做好了IPO的准备,但这对我们来说只是垫脚石。我们会存续很长、很长一段时间。我们的市场体量巨大,而我们只触及了表面。我们见证了客户使用他们的数据所做的事情,这十分了不起,但还会有更多的机会出现。
人们忘了IPO中的“I”代表什么:它代表“首次”(initial)。这也是我们对IPO的看法,它只是一个新征程的起点。
以下为英文版:
How Databricks CEO And Cofounder Ali Ghodsi Bet Big On The Cloud To Build A $28B Company
Databricks is one of the most anticipated upcoming IPOs. Founded in 2013, Databricks is an AI-enabled, open source data analytics platform company that, as co-founder Ali Ghodsi describes it, “takes massive, massive amounts of enterprise data and does machine learning and data science on top of it to predict things.” Databricks must be onto something, since more than 5,000 companies use the company’s open source-driven lakehouse architecture to process, engineer, and analyze their unstructured and semi-structured data. Databricks’ success is evident from its most recent $1.6 billion financing, which valued the company at more than $38 billion. I recently sat down with Ali to discuss how he turned an open source project he helped start as a researcher at UC Berkeley (where he still serves as an adjunct professor in the computer science department) into a multibillion-dollar company, and what lessons entrepreneurs can learn from his journey. Here are some excerpts from our conversation.
Glenn:Tell us a little bit about how Databricks is helping companies analyze their data.
Ali:There are really an infinite number of ways to use Databricks, and we’ve been blown away by all the cool things our customers are doing. For example, Regeneron uses our ML algorithms to detect the gene in DNA that's responsible for chronic liver disease, and then they were able to develop a drug that targeted that particular gene. Or a company like Comcast uses Databricks to make their voice-activated remote controls work. When you talk to the remote control, that voice data goes into the cloud for Databricks to process using machine learning, and it figures out what you said and directs the TV to the right channel. And during the pandemic, hospitals used Databricks to get a real-time picture of how full their ERs were so they could redirect patients in ambulances to different hospitals that had space. Financial services firms are analyzing satellite data to make predictions about which global sectors and companies to invest in. Shell uses Databricks to monitor sensor data from 200 million valves to predict if any are going to break, so they can replace them ahead of time to keep systems running, save money, and ensure employees stay safe.
Glenn:You helped build the foundational open source code for Databricks as a visiting scholar at UC Berkeley. Tell us about the journey of hacker-to-founder.
Ali:There is an incredible computer science professor at Berkeley, Dave Patterson, who just opened up labs and office space to students and said let’s brainstorm and collaborate. We had computer scientists, engineers, mathematicians, and ML experts, all just working together to see what we could create, and out of that came Apache Spark. The earliest version was built to make it faster to load huge datasets into memory. Spark forms the foundation of much of what we’ve built at Databricks. When I co-founded Databricks in 2013, I was deeply involved in product creation and engineering because I had been working on the core technologies since 2009 at Berkeley.
Glenn:Did you plan for the hypergrowth Databricks has experienced?
Ali:Since the early days, we had a plan to grow fast, with clear goals for how big we wanted the company to be. But our early vision was to one day sell the company for $100 million or $200 million, so clearly we underestimated the potential. What we did right is that we bet on a trend that lots of people said would never really take off, or at least wouldn’t for a few more decades. Our belief was that the cloud would house all data and that no on-prem solution would be needed. Everyone told us we were crazy and needed an on-prem solution since companies had invested billions in data centers. One potential customer even offered us $20 million to build an on-prem version of our software. That was hard to turn down, but we remained steadfast in cloud-only, and a lot of those early doubters now use our SaaS product. Being bold is the only way to create really, really successful companies, because if you're trying to do one thing a little bit differently, the big companies will eat you up. They'll copy your strategy and do it better because they have more money and more engineers. So you have to think about what will happen in the future and then build a company for that future.
Glenn:How do you build a company on free open source, in this case Apache Spark, but also succeed in making money? How have you walked that fine line?
Ali:Open source is a double-edged sword. We built Spark at Berkeley, but then when we started Databricks, we built a new proprietary engine called Ignite. We quickly realized only open source would fuel really big growth. We’re an enterprise company but we’ve grown more like a B2C company such as Facebook or Twitter, because we’ve had such incredible viral evangelism from the open source community. The challenge, though, was getting anyone to pay for our product. Developers would come up to us at conferences and want selfies and tell us how much we changed their lives, but then we’d ask them if they’d like to pay for our SaaS services, and they’d say, “why would we do that, you guys give us the software for free.” So we had to figure out what to leave in the open source version, to ensure it was really valuable to developers, and what to include in our SaaS version to make it valuable enough to companies that they’d pay for it.
Glenn:How do you make money?
Ali:We have a pretty unique model we call SaaS open source. It’s very different from on-prem open source where you can download the free software, and an open source company creates paid versions with extra features to download as well. For us, we are only SaaS so we just continually update the product in the background. We charge customers for this development as well as running, operating, and hosting the software. We also contribute constantly to the open source version of Databricks that’s entirely free, but our SaaS offering just has lots more features of interest to enterprises such as reliability, availability, and scalability. We have always been SaaS and only SaaS, with no crutch of on-prem revenue, so we had to get really good at delivering everything in the cloud from day one.
Glenn:What’s your secret to continued innovation?
Ali:One thing that has helped us innovate is staying relatively nimble; we have more than 1,700 employees today. But also, since day one, we always thought of that Steve Jobs quote about how you should “cannibalize yourself before someone else does,” and that you always pick the future technology instead of current revenue. All of our developers have the mindset that you should “kill your darlings”; there was no politics and no attachment to a certain way of doing things. Part of our innovative nature is that the first 20 or 30 people who built the early company were researchers from Berkeley who saw themselves as truth-seekers and believed the data should always decide, not the humans. Sometimes I’ve had to put the brake on innovation because the tech team is always like, “hey, we have this new great thing,” and they want to abandon everything they did yesterday before it even has a chance to become successful.
Glenn:You’ve said you’ll go public this year. What’s next for Databricks?
Ali:We’ve talked about being IPO ready, but it’s just a stepping stone for us. We are going to be around for a long, long time. Our market is gigantic and we’ve only just scratched the surface. We see the things our customers are doing with their data, and it’s incredible, but there’s so much more opportunity. People forget what the “I” means in IPO; it’s for “initial,” and that’s how we see it: just the start of a new journey.
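*About the author: Glenn Solomon is a managing partner at GGV Capital, a global venture capital firm that focuses on local founders. He invests in enterprise technology startups from seed to growth stage across several key areas, including open source, cloud services, infrastructure, and cybersecurity. He has more than 20 years of venture investing experience and has helped nine companies go public over the past decade. He also hosts the podcast “Founder Real Talk,” where he interviews founders and startup executives about the challenges they face and how they grow through adversity.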
网址: 揭秘Databricks成长故事:看准云技术,他创建了一家市值280亿美元的公司|GGV投资笔记第九十一期 http://www.xishuta.com/zhidaoview22135.html