Introduction

Storage is now cheap, whereas compute remains expensive. A traditional map-reduce cluster that is kept running perpetually will therefore rack up enormous costs, especially if the map-reduce process is triggered only sporadically. This article explores the automation of a big data processing pipeline while maintaining low cost and…