Publication | Closed Access
Effective Multi-stream Joining in Apache Samza Framework
Citations: 17
References: 8
Year: 2016
Unknown Venue
Cluster Computing, Engineering, Computer Architecture, Data Streaming Architecture, Streaming Data, Data Science, Management, Data Integration, Parallel Computing, Data Management, Stream Processing, Stream Joining, Streaming Engine, Computer Engineering, Business Pipelines, Computer Science, Data Stream Management, Apache Samza Framework, Cloud Computing, Parallel Programming, Apache Samza, Big Data
The increasing adoption of Big Data in business environments has driven the need for real-time stream joining. Multi-stream joining is an important type of stream processing at today's Internet companies, where it is used to generate higher-quality data in business pipelines. Multi-stream joining can follow one of two models: (1) All-In-One (AIO) joining and (2) Step-By-Step (SBS) joining. Each model has advantages and disadvantages with regard to memory footprint, joining latency, deployment complexity, and other factors. In this work, we analyze the performance tradeoffs between these two models using Apache Samza.
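The contrast between the two models can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: AIO is read here as a single operator that buffers all input streams and emits a record once every stream has contributed a value for a key, and SBS as a cascade of two-way joins. The function names (`join_all_in_one`, `join_two`, `join_step_by_step`) are hypothetical, the inputs are small in-memory lists rather than unbounded streams, and nothing here uses the Apache Samza API.

```python
from collections import defaultdict


def join_all_in_one(streams):
    """Assumed AIO reading: one operator buffers every stream at once."""
    buffers = defaultdict(dict)  # key -> {stream index: latest value}
    for idx, stream in enumerate(streams):
        for key, value in stream:
            buffers[key][idx] = value
    n = len(streams)
    # Emit only keys for which all n streams have supplied a value.
    return {
        key: tuple(vals[i] for i in range(n))
        for key, vals in buffers.items()
        if len(vals) == n
    }


def join_two(left, right):
    """Inner join of partial results (key -> tuple) with one more stream."""
    right_map = dict(right)
    return {
        key: vals + (right_map[key],)
        for key, vals in left.items()
        if key in right_map
    }


def join_step_by_step(streams):
    """Assumed SBS reading: a chain of two-way joins, one stream per step."""
    result = {key: (value,) for key, value in streams[0]}
    for stream in streams[1:]:
        result = join_two(result, stream)
    return result


# Hypothetical sample data: three keyed streams.
clicks = [("u1", "c1"), ("u2", "c2")]
views = [("u1", "v1"), ("u2", "v2"), ("u3", "v3")]
buys = [("u1", "b1")]

aio = join_all_in_one([clicks, views, buys])
sbs = join_step_by_step([clicks, views, buys])
```

Under these assumptions both functions yield the same joined records. The difference lies in where state is held: the AIO sketch keeps one buffer that covers every stream, while the SBS sketch carries only the intermediate result from step to step, which would correspond to separate jobs in a real deployment.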