Today and tomorrow, Snowflake's Build EMEA and move(data) by Airbyte events are key highlights for the data engineering community. I will be watching all the sessions. What about you?
About one-third of the talks will cover LLMs, retrieval-augmented generation (RAG), fine-tuning language models on your own data, and chatting with your data.
Mehdi Ouazza will share insights on LLMs for data engineers, Artyom Keydunov will discuss integrating the semantic layer with LLMs, and Harrison Chase will explore building confidence in LLM applications. Jerry Liu will provide real-world insights into RAGs, along with many other notable speakers from Weights & Biases, LlamaIndex, LangChain, MotherDuck, and more.
Which sessions do you plan to attend?
Just earned my certificate from the Databricks Spark Wars Hackathon, powered by Celebal Technologies!
Thanks to an incredible team and some amazing challenges, I've leveled up my data skills.
Excited to apply this knowledge in future projects!
#celebaltechnologies #databricks #hackathon
Why did the RDD, DataFrame, and Dataset go to therapy?
📊 The RDD said, 'I feel too unstructured.'
🤔 The DataFrame said, 'I need more schema in my life.'
😄 And the Dataset said, 'I'm just trying to keep everyone happy and optimized!'🔥
#BigDataHumor #Spark #DataEngineering #BigData
Matei Zaharia Opens Unity Catalog Code to the Community: A Landmark Open Source Move by Databricks
I have to say that the decision by Matei Zaharia, co-founder and chief technologist of Databricks, to open the Unity Catalog code to the entire community, effectively making it open source, surprised me. This move marks a significant step towards greater transparency, collaboration, and innovation in data management.
Why is this decision important?
Collaboration and Innovation: Opening the Unity Catalog code allows developers, data scientists, and engineers worldwide to contribute to and improve the platform. This will not only accelerate innovation but also ensure that Unity Catalog evolves in line with the real and immediate needs of its users.
Transparency and Trust: Transparency is a cornerstone of the open source world. Making Unity Catalog’s code public enhances user trust, as they can now inspect the code, understand its workings, and contribute to making it even more robust and secure.
Benefits for the Community and Businesses: This initiative benefits not only the open source community but also Databricks. When community interests align with business interests, a virtuous ecosystem is created where both can thrive. Companies can benefit from the innovations and improvements contributed by the community, while the community gains access to more powerful and flexible tools.
A Special Thank You to Databricks
A special thank you goes to Databricks for supporting this decision. Opening Unity Catalog is an excellent example of how community interests can go hand in hand with business interests. Thank you, Databricks, for your commitment to fostering a culture of openness and collaboration.
The opening of the Unity Catalog code is a significant milestone that promises long-term benefits for all stakeholders involved. With this move, Databricks not only strengthens its commitment to open source but also enhances trust and collaboration with the global data community.
#OpenSource #UnityCatalog #Databricks #Collaboration #Innovation
𝗕𝗟𝗢𝗚 𝗣𝗢𝗦𝗧: 𝗧𝗼𝗽 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀 𝗳𝗿𝗼𝗺 𝘁𝗵𝗲 𝗗𝗮𝘁𝗮𝗯𝗿𝗶𝗰𝗸𝘀 𝘀𝘂𝗺𝗺𝗶𝘁 𝟮𝟬𝟮𝟰 🖋
The Databricks Summit in San Francisco from June 10-13 was an incredible event, filled with the latest updates, announcements, and features on Databricks Delta Lake, Spark, and more.
With 10 diverse tracks such as "Data Engineering and Streaming" and "Data Governance," the summit offered a wide range of knowledge through sessions led by experts. The event presented business use cases, cutting-edge technology demos, and new announcements to shape the future of data.
We attended several of these sessions online and are happy to share some of the key insights and topics that captured our attention.
𝗖𝘂𝗿𝗶𝗼𝘂𝘀 𝘁𝗼 𝗹𝗲𝗮𝗿𝗻 𝗺𝗼𝗿𝗲? 𝗖𝗵𝗲𝗰𝗸 𝗼𝘂𝘁 𝗼𝘂𝗿 𝗯𝗹𝗼𝗴 𝗽𝗼𝘀𝘁 𝗳𝗼𝗿 𝗮 𝗱𝗲𝘁𝗮𝗶𝗹𝗲𝗱 𝗿𝗲𝗰𝗮𝗽!
https://ow.ly/qgoh50SmZZJ
#DatabricksSummit #DataEngineering #DeltaLake #DataGovernance #Spark #TechInnovation #DataScience #IntellusGroup #Aivix
Before Unity Catalog, each Databricks workspace had its own Hive metastore, where metadata was stored and accessible only from within that workspace; objects were confined to the workspace that created them.
Databricks Unity Catalog enables the creation of a centralized metastore in which objects are stored and accessible across all workspaces attached to it.
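To make the contrast concrete, here is a small illustrative sketch (the catalog, schema, and table names are hypothetical): a workspace-local Hive metastore addresses objects with a two-level name, while Unity Catalog introduces a three-level namespace resolvable from any attached workspace.

```python
# Illustrative sketch only -- "prod", "sales", and "orders" are made-up
# names, not from any real deployment.

def hive_name(schema: str, table: str) -> str:
    """Two-level name in a workspace-local Hive metastore:
    visible only inside the workspace that owns the metastore."""
    return f"{schema}.{table}"

def unity_name(catalog: str, schema: str, table: str) -> str:
    """Three-level name under Unity Catalog: resolvable from any
    workspace attached to the centralized metastore."""
    return f"{catalog}.{schema}.{table}"

# The same logical table, addressed both ways:
print(hive_name("sales", "orders"))           # sales.orders
print(unity_name("prod", "sales", "orders"))  # prod.sales.orders
```

The extra top level (the catalog) is what lets one metastore serve many workspaces without name collisions.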
Delve into Unity Catalog's role in overcoming data governance challenges with Maulik Dixit in this blog series.
Chapter Two:
https://lnkd.in/detsMufd
Chapter One:
https://lnkd.in/diU7UwXE
Watch out for the next chapter to learn how you can organize objects in Unity Catalog.
#unitycatalog #databricks #datagovernance #Tredence #BeyondPossible
How do you get your #LLM into the top 10 of Spider, a widely used benchmark for text-to-SQL tasks?
Discover how we reached 79.9% (a 19-point boost over baseline) on the Spider dev dataset with Meta Llama 3 8B through prompting and fine-tuning on Databricks. Apply this approach to improve LLM outputs!
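As a rough, hedged sketch of the prompting side of this approach (not the exact Databricks recipe — the table DDL and prompt wording below are illustrative assumptions), text-to-SQL prompts typically ground the model in the database schema before posing the question; fine-tuning then trains on many such (prompt, gold SQL) pairs:

```python
# Hypothetical Spider-style schema, for illustration only.
EXAMPLE_SCHEMA = """\
CREATE TABLE singer (singer_id INT, name TEXT, country TEXT, age INT);
CREATE TABLE concert (concert_id INT, singer_id INT, year INT);"""

def build_text_to_sql_prompt(question: str, schema_ddl: str) -> str:
    """Prepend the database schema so the model can only reference
    real tables and columns -- the core idea behind schema-aware
    prompting for text-to-SQL tasks."""
    return (
        "You are a text-to-SQL assistant. Using only the tables below, "
        "reply with a single SQL query and nothing else.\n\n"
        f"Schema:\n{schema_ddl}\n\n"
        f"Question: {question}\nSQL:"
    )

prompt = build_text_to_sql_prompt(
    "How many singers are from France?", EXAMPLE_SCHEMA
)
```

The resulting string would be sent to the model (or paired with a gold query as a fine-tuning example); ending the prompt at "SQL:" nudges the model to emit only the query.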