Top 5 challenges and recipes when starting an Apache Spark project
Intro

Apache Spark is one of the most advanced and powerful computation engines for big data analytics. Besides the data engine, it also provides libraries for streaming (Spark Streaming), machine learning (MLlib), and graph processing (GraphX). Historically, Spark emerged as a successor to the Hadoop ecosystem, with the following key advantages: Spark provides…