With the announcement of S3-native streams (Freight clusters), here is a commentary on Confluent's strategy regarding object storage, streaming, and an open data architecture.
https://lnkd.in/dTg6DKVy
Jack Vanlightly Honestly, I really enjoyed the writing.
But I do disagree with all the Tableflow stuff, which makes me wonder how much R&D has taken place beyond the transactional properties.
From a practical perspective:
1. Tableflow moves the offloading we have historically done with Flink into tiered storage directly. Yes, it's more convenient, but it's again either batch OR stream, not batch AND stream.
2. The stream/table duality itself confirms this: it's either batch or stream; there is no unification, and you need to convert between them.
3. Iceberg itself can't support streaming, so you are offloading from tiered storage (which by nature adds extra latency), running compaction, only to stream back (where Iceberg has many limitations by design), and more.
4. At the same time (you might have a different view), I have seen this Kafka-table-Kafka implementation in practice, but in reality there was almost zero market demand for such a use case (especially once latency is introduced).
All these fancy things look to me like a variation of the Lambda architecture (Confluent advocated Kappa in the first place); the only difference (no matter what wrapper you add on top) is that the batch layer uses Iceberg.
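The stream/table duality mentioned in point 2 can be sketched in plain Python (an illustration of the concept only, not Tableflow or Kafka Streams code): a table is a fold over a changelog stream, and a stream is the diff between table snapshots. Converting in either direction is an explicit step, which is the point — there is no single abstraction that is both at once.

```python
def stream_to_table(changelog):
    """Fold a changelog stream of (key, value) events into a table.

    A value of None acts as a tombstone and deletes the key."""
    table = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)
        else:
            table[key] = value
    return table

def table_to_stream(old, new):
    """Diff two table snapshots back into a changelog stream."""
    events = []
    for key, value in new.items():
        if old.get(key) != value:
            events.append((key, value))   # insert or update
    for key in old:
        if key not in new:
            events.append((key, None))    # delete (tombstone)
    return events

changelog = [("a", 1), ("b", 2), ("a", 3), ("b", None)]
table = stream_to_table(changelog)        # {"a": 3}
replay = table_to_stream({}, table)       # [("a", 3)]
```

Note that `table_to_stream({}, table)` only recovers the *latest* state as events; the intermediate history ("a", 1) and the life of "b" are gone, which is one reason round-tripping Kafka → table → Kafka loses information and adds latency.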
Databricks named a Leader in the 2024 #ForresterWave for Data Lakehouses, with the highest scores in both strategy and current offering categories. Databricks is the pioneer of the lakehouse, and the standard for data architectures!
Learn why we were named a leader in the full report.
Databricks is named the #1 Leader in the 2024 #ForresterWave for Data Lakehouses, with the highest scores in both the Strategy and Current Offerings categories! As the pioneer of the Lakehouse architecture, it's truly thrilling to see Databricks set the standard for next gen architectures that support any data use case.
Learn more in the full report.
This paper set out to "discover" a set of design principles and rules for Cloud-based Big Data platforms in complex, heterogeneous environments. The design scope comprises Big Data's significance, challenges, and architectural impacts. Using a methodology we call Reverse Engineered Design Science Research (REDSR), artifacts from leading vendors were used to elicit essential and common design principles and rules. We conclude that there is little to choose between major cloud vendor architectures.
#bigdata #digitalplatforms
Stephen Wingreen Purna Naga Sai Mannava
📈📊 Increasing evidence suggests that Kafka is evolving into a new form of data lake. ❓ Why is that❓
🔹 For starters, Kafka has all the data lake properties!
🔹 Kafka also has the potential to serve as the new data lake in production.
🤔 What do you think: Will Kafka replace existing data lake management frameworks?
📣👇 Check out what Yingjun Wu, CEO at RisingWave Labs, has to say about this: https://lnkd.in/g7xGprWp #dataprocessing #kafka #datalake #streamprocessing #datalakehouse
Good video explaining what an open data architecture actually is and how its next-generation capabilities solve many challenging problems for enterprises.
“Discover the power of WatsonX.data, a data store built on an open data lakehouse. See how the solution can help address data management challenges such as reducing data warehouse costs and unifying data across any hybrid cloud and on-premises environments.”
#openlakehousearchitecture #watsonx.data #data https://lnkd.in/dxWwTUnc
Forrester named Databricks a Leader with the highest score in both the strategy and current offering categories among all vendors, with 5/5 scores across 19 criteria.
intelia are specialised implementation partners of #Databricks. Get in touch today to find out how we can help your organisation take advantage of its capabilities and drive more value from your data.
#intelia #databricks #data #datastrategy https://lnkd.in/g5SNiUvQ
Databricks named a Leader in the 2024 #ForresterWave for Data Lakehouses!
As pioneers of the lakehouse data architecture, we’re pleased to be placed highest in both the strategy and current offering categories. Read the report to see our cited strengths in several areas, including:
• Security & Governance
• GenAI/LLM
• Data storage and formats
• Data ingestion/pipeline
• Data models
"The heart of the #datamesh beats in real-time with #apachekafka"
If there were a buzzword of the hour, it would undoubtedly be “data mesh”! This new architectural paradigm unlocks analytic and transactional data at scale and enables rapid access to an ever-growing number of distributed domain datasets for various usage scenarios.
The data mesh addresses the most common weaknesses of the traditional centralized #datalake or data platform architecture. And the heart of a decentralized data mesh infrastructure must be real-time, reliable, and scalable.
Learn how the de facto standard for data streaming, Apache #kafka, plays a crucial role in building a data mesh and how it complements (not replaces!) data lakes, #datawarehouse, and other data platforms:
https://lnkd.in/emgZNmsn
Senior Data Engineer | Data Architect | Data Science | Data Mesh | Data Governance | 4x Databricks certified | 2x AWS certified | 1x CDMP certified | Medium Writer | Turning Data into Business Growth | Nuremberg, Germany
𝗗𝗮𝘁𝗮 𝗟𝗮𝗸𝗲𝗵𝗼𝘂𝘀𝗲 𝗖𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗲𝘀: 𝗥𝗲𝗰𝗼𝗺𝗺𝗲𝗻𝗱𝗮𝘁𝗶𝗼𝗻𝘀 𝗳𝗼𝗿 𝗦𝘂𝗰𝗰𝗲𝘀𝘀
The transition to data lakehouses presents significant challenges and opportunities for organizations across various industries. Traditional data warehouses and data lakes often fall short in meeting the demands of modern businesses due to limitations in agility, scalability, integration, and governance. However, data lakehouses offer a solution by providing unified platforms with advanced AI capabilities to address these shortcomings and accelerate analytical use cases.
𝗖𝗵𝗮𝗹𝗹𝗲𝗻𝗴𝗲𝘀
𝗚𝗲𝗻𝗔𝗜 𝗜𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗶𝗼𝗻: As genAI capabilities continue to evolve, organizations must carefully assess solutions that offer foundational genAI capabilities, such as natural language query and data intelligence, to simplify lakehouse development and deployment.
𝗜𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗲𝗱 𝗘𝘅𝗽𝗲𝗿𝗶𝗲𝗻𝗰𝗲: End-to-end integration is crucial for accelerating analytical use cases. Organizations should seek lakehouse vendors that provide integrated solutions encompassing streaming, transformation, workload management, integration, governance, and security.
𝗣𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲 𝗢𝗽𝘁𝗶𝗺𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Large and complex data warehouses often require months to set up and configure. Look for vendors offering solutions with deep integration with table formats, built-in automated performance optimization, advanced workload management, and parallel data processing to ensure performance at the speed of business.
𝗘𝗺𝗯𝗿𝗮𝗰𝗶𝗻𝗴 𝗱𝗮𝘁𝗮 𝗹𝗮𝗸𝗲𝗵𝗼𝘂𝘀𝗲𝘀 𝗰𝗮𝗻 𝗿𝗲𝘃𝗼𝗹𝘂𝘁𝗶𝗼𝗻𝗶𝘇𝗲 𝗵𝗼𝘄 𝗼𝗿𝗴𝗮𝗻𝗶𝘇𝗮𝘁𝗶𝗼𝗻𝘀 𝗺𝗮𝗻𝗮𝗴𝗲 𝗮𝗻𝗱 𝗱𝗲𝗿𝗶𝘃𝗲 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀 𝗳𝗿𝗼𝗺 𝘁𝗵𝗲𝗶𝗿 𝗱𝗮𝘁𝗮. By addressing challenges related to genAI integration, integrated experiences, and performance optimization, organizations can unlock the full potential of data lakehouses to drive innovation and competitiveness in today's data-driven landscape.
#DataLakehouses #DataManagement #AI #Analytics #BigData #CloudComputing #DataIntegration #DataScience #DigitalTransformation #MachineLearning #Technology #BusinessIntelligence #ForresterWave #DataEngineering #GenerativeAI