Big Data Solution Pipelines using Open Source Technologies and Public Cloud

Data pipelines are a crucial component of any big data solution. A pipeline is software that handles streaming and batch processing of data, applying a series of transformations as the data moves from source to destination.
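To make the idea concrete, here is a minimal sketch in plain Python of one chain of transformations applied in both batch and streaming style. The record format, the function names (`parse`, `is_valid`, `pipeline`) and the sample data are assumptions made up for illustration; a real pipeline would run on a dedicated processing framework or cloud service rather than on in-memory lists.

```python
from typing import Iterable, Iterator


def parse(line: str) -> dict:
    """Transformation 1: turn a raw 'user, amount' line into a record."""
    user, amount = line.split(",")
    return {"user": user.strip(), "amount": float(amount)}


def is_valid(record: dict) -> bool:
    """Transformation 2: keep only records with a positive amount."""
    return record["amount"] > 0


def pipeline(lines: Iterable[str]) -> Iterator[dict]:
    """Apply the transformations to each input line, one record at a time."""
    for line in lines:
        record = parse(line)
        if is_valid(record):
            yield record


# Batch: the whole (hypothetical) data set is available up front.
batch_input = ["alice, 10.5", "bob, -3", "carol, 7"]
batch_result = list(pipeline(batch_input))


# Streaming: records arrive one by one from a (hypothetical) event source.
def event_source() -> Iterator[str]:
    yield "dave, 2.0"
    yield "erin, 0"
    yield "frank, 4.5"


stream_result = []
for record in pipeline(event_source()):
    # Each record is handled as soon as it arrives.
    stream_result.append(record)
```

The same transformation logic serves both modes. The difference lies in whether the input is a finite data set read all at once or an open-ended sequence consumed as it arrives.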

This blog post describes the big data streaming and batch processing options available on private clusters built with open source technologies and on serverless public cloud infrastructure such as AWS.