A Beginner's Guide to Data Pipeline Architecture

Bivek Subedi
5 min read · Oct 14, 2023

fig: Basic data pipeline

We have all seen dashboards with eye-catching visuals that provide great insights. So how is raw data converted into such polished, insightful reports? In this blog, we are going to talk about the whole data pipeline architecture.

A data architecture is designed around the requirements of the project. It mainly depends on how data is updated in the warehouse or data store: data can arrive in real time, or it can be non-real-time. Depending on the requirement, either real-time stream ingestion or non-real-time ingestion is used. There are basically three types of streaming architecture for storing data in a warehouse. They are:

  1. Batch Streaming Architecture:
    A batch streaming pipeline is designed for data that arrives in batches or in bulk. Such data is collected over a fixed time frame, which may be weeks, months, or years. It might come as Excel files, database files, or CSV files, which data engineers load into the main database. Because this data is not real-time, a batch streaming architecture is used.
  2. Real-Time Streaming Architecture:
    A real-time streaming pipeline is designed for data that arrives in real time. This data must be updated as soon as it is generated. Real-time data is usually generated by real-time devices such as radars and satellites. It might also come from log files. Real-time data are…
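
To make the difference between the two ingestion styles above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the table name `readings`, the columns `sensor` and `value`, and the helper names `create_warehouse`, `batch_load`, and `stream_load` are all hypothetical, not part of any particular tool.

```python
import csv
import io
import sqlite3
from typing import Iterable, Iterator


def create_warehouse() -> sqlite3.Connection:
    """Create an in-memory SQLite database that stands in for the warehouse (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    return conn


def batch_load(conn: sqlite3.Connection, csv_text: str) -> int:
    """Batch ingestion: parse a whole CSV export, then insert every row in one transaction."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(row["sensor"], float(row["value"])) for row in reader]
    with conn:  # commits once, after all rows are inserted
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return len(rows)


def stream_load(conn: sqlite3.Connection, events: Iterable[dict]) -> Iterator[int]:
    """Streaming ingestion: insert and commit each event as soon as it arrives.

    Yields the running row count after every event, so a consumer sees
    the table grow one record at a time.
    """
    for event in events:
        with conn:  # commits after every single event
            conn.execute(
                "INSERT INTO readings VALUES (?, ?)",
                (event["sensor"], float(event["value"])),
            )
        yield conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

For example, calling `batch_load(conn, csv_text)` on a CSV export makes all of its rows visible at once, while iterating over `stream_load(conn, events)` makes each event queryable immediately after it is produced. In a real pipeline, the list of events would typically be replaced by a message queue or a device feed.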

Written by Bivek Subedi

(BI/DATA) Engineer at Sursa Tech
