Three takeaways from Data + AI Summit

A few weeks ago I had the privilege of attending the Data + AI Summit in San Francisco. It was one of the best conferences I have attended, for several reasons. The main one is that it reinforced my belief that Open Source models will play an important role in our everyday lives (it’s not just ChatGPT behind an API). I was able to witness firsthand the power of Open Source and how it can spread across different domains, particularly AI. It was not just wishful thinking on my part, but something that’s actually taking shape at an accelerating pace. Here are the three things I learned at the Data + AI Summit:

The Open Source community is huge 

The Data + AI Summit organized by Databricks brought in 12,000 attendees to the Moscone Convention Center in San Francisco, plus 75,000 online participants. These are huge numbers, very much comparable with successful Open Source events like CloudNativeCon + KubeCon by the Linux Foundation.

I had the opportunity to talk with several attendees, all of whom were very enthusiastic about Open Source and AI. Many attendees are doing important work to advance the field. One example is Berkeley Artificial Intelligence Research (BAIR), which unites UC Berkeley researchers working on fundamental advances in computer vision, machine learning, natural language processing, planning, control, robotics and more. The Summit brought together an interesting mix of Open Source developers, researchers and businesses.

There’s a high demand for Open Source models

Databricks, the company founded by the original creators of Apache Spark, is experiencing high demand for Open Source models from its existing client base, even more so than for proprietary models. This was very surprising to learn, and it suggests that businesses are looking to fully own their AI stack.

For this very reason, Databricks is betting big on Open Source models. A few months ago, the company released Dolly 2.0, which it describes as the first open, instruction-following large language model (LLM) for commercial use. At the Summit, Databricks CEO Ali Ghodsi reaffirmed the company's commitment to promoting Open Source models as a path towards democratizing AI and, as part of this commitment, announced the acquisition of MosaicML for $1.3 billion. MosaicML is known for its state-of-the-art MPT large language models.

Open Source models have a huge potential

At the Summit, there were many interesting talks, including keynotes from high-profile individuals like Satya Nadella, Marc Andreessen and Eric Schmidt. But, for me, the most interesting talks were the ones that demonstrated how Open Source LLMs can provide many benefits compared to proprietary solutions: more control, stronger privacy, reduced costs, better results and improved performance.

I was also amazed to see how Databricks was able to incorporate AI into its software. Until now, you had the option of using SQL or Python to interact with Apache Spark. But writing the right query or code can often be challenging. So I was delighted to watch the demos where they introduced English as the new programming language for interacting with Spark. The user explains in plain English what they want to accomplish, and the AI assistant translates that into SQL or Python. This will make the software much more accessible and will increase the productivity of all users, from newbies to experts.

Final takeaways

Overall, attending the Data + AI Summit was a wonderful experience. It was great to connect with so many members of the Open Source community and share our enthusiasm for a brighter future, where Open Source models will play a key role in making our daily lives more productive and in helping us make sense of the ever-growing data surrounding us. Additionally, Open Source models will enable individuals and businesses to take full ownership of their data and software.

If you are interested in learning more about Open Source and AI, please join our “Deep Dive: Defining Open Source AI” series. CFPs for the online webinars are open and we are looking for proposals that discuss the importance of Open Source models and the impact of AI on society.

