While data mining and machine learning share certain characteristics, these two terms don’t mean the same thing. Both fall under the general category of data science, which uses scientific methods, systems, processes, and algorithms to extract knowledge from data. But there are some important differences between the two. […]
Data pipelines are a crucial component of any big data solution. A data pipeline is software that handles streaming and batch processing, applying a series of transformations to the data as it moves from source to destination.
This post describes the big data streaming and batch processing options available on private clusters built with open source technologies and on serverless public cloud infrastructure such as AWS.
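To make the idea of a pipeline concrete, here is a minimal sketch in plain Python of data passing through a chain of transformations. It is only an illustration and does not use any particular framework. The record layout (`user,amount`) and the function names `parse`, `drop_non_positive`, `add_cents`, and `run_pipeline` are assumptions made up for this example.

```python
from typing import Callable, Iterable, Iterator

Record = dict
Stage = Callable[[Iterator], Iterator]


def parse(lines: Iterator[str]) -> Iterator[Record]:
    """Turn raw 'user,amount' lines into records (hypothetical input format)."""
    for line in lines:
        user, amount = line.strip().split(",")
        yield {"user": user, "amount": float(amount)}


def drop_non_positive(records: Iterator[Record]) -> Iterator[Record]:
    """Filter out records whose amount is zero or negative."""
    return (r for r in records if r["amount"] > 0)


def add_cents(records: Iterator[Record]) -> Iterator[Record]:
    """Enrich each record with the amount expressed in whole cents."""
    for r in records:
        yield {**r, "cents": round(r["amount"] * 100)}


def run_pipeline(source: Iterable, *stages: Stage) -> Iterator:
    """Chain the stages lazily: each stage consumes the previous one's output."""
    stream = iter(source)
    for stage in stages:
        stream = stage(stream)
    return stream


# Batch case: the whole input is available up front as a list.
batch = ["alice,12.50", "bob,-3.00", "carol,7.25"]
results = list(run_pipeline(batch, parse, drop_non_positive, add_cents))
print(results)
```

Because every stage is a lazy generator, the same `run_pipeline` call also accepts a source that produces lines one at a time, which is a rough stand-in for the streaming case. Real pipeline tools add what this sketch leaves out, such as distribution across machines, fault tolerance, and scheduling.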