Spark Interview Questions
Q1. Which languages can a Spark application be written in?
A1. Scala, Java, Python and R.
Q2. How can you access Cassandra from Spark?
A2. Use the Spark Cassandra Connector.
Q3. What exactly is Spark?
A3. Spark is a Big Data processing framework that makes big data applications easy to develop and fast to run. Much of its performance benefit comes from in-memory computation. It can access data stored in different sources such as HDFS, HBase and Cassandra.
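Q4. You need to develop an application with iterative algorithms. Would you prefer MapReduce or Spark?
A4. Spark. An iterative algorithm reads the same data on every pass, and Spark can keep that data in memory between iterations, whereas MapReduce writes intermediate results to disk after each job.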
Q5. What operations does an RDD support?
A5. An RDD supports two kinds of operations: transformations and actions.
Q6. What exactly is a transformation?
A6. A transformation is a function applied to an RDD that produces a new RDD. Transformations are lazy: they are not executed until an action is called.
Q7. What are some examples of transformations?
A7. map() and filter().
Q8. What are actions, then?
A8. An action triggers the execution of all previously defined transformations and returns the result. It is the way to get data from an RDD to the local system/machine.
Q9. What are some examples of actions?
A9. reduce() and take().
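The lazy behaviour described in A5 to A9 can be sketched in plain Python. This is a toy model, not Spark's API: `ToyRDD`, `square` and `calls` are hypothetical names made up for illustration, and the sketch ignores partitioning and distribution entirely. It only shows that transformations are recorded and nothing runs until an action is called.

```python
from functools import reduce as _reduce
from itertools import islice


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's class).

    Transformations are only recorded; actions run the recorded steps.
    """

    def __init__(self, data, steps=()):
        self._data = list(data)
        self._steps = tuple(steps)  # pending transformations, in order

    # Transformations: return a new ToyRDD and do no work yet.
    def map(self, fn):
        return ToyRDD(self._data, self._steps + (("map", fn),))

    def filter(self, pred):
        return ToyRDD(self._data, self._steps + (("filter", pred),))

    def _compute(self):
        # Chain the recorded steps into one lazy iterator.
        items = iter(self._data)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: run the pipeline and return a plain Python value.
    def reduce(self, fn):
        return _reduce(fn, self._compute())

    def take(self, n):
        return list(islice(self._compute(), n))


calls = []  # records every input that square() actually processes


def square(x):
    calls.append(x)
    return x * x


rdd = ToyRDD([1, 2, 3, 4, 5])
even_squares = rdd.map(square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)  # still 0: no action has run yet

total = even_squares.reduce(lambda a, b: a + b)  # action: 4 + 16 = 20
first = even_squares.take(1)                     # action: [4]
```

In real Spark the method names map(), filter(), reduce() and take() are the same, but the work is spread across a cluster and the result of an action comes back to the driver program.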
Q10. Can you use Hive with Spark?
A10. Yes, Spark supports Hive.
Q11. What are the components of the Spark ecosystem?
A11. The Spark ecosystem includes the following components:
a. Spark SQL
b. Spark Streaming
Q12. Which component would you use for machine learning?