The number of stages in a job is usually equal to the number of RDDs in the DAG. However, the scheduler can truncate the lineage when _______
Answer:
Checkpointing
Explanation:
A Resilient Distributed Dataset (RDD) is the basic data structure of Spark. It is an immutable, distributed collection of objects.
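Each RDD records its lineage, that is, the chain of parent RDDs and transformations it was derived from. The scheduler can truncate that lineage when an RDD has been checkpointed.
Checkpointing truncates the RDD lineage graph: calling checkpoint() on an RDD saves its data to reliable storage, either a distributed file system (for example HDFS) or, with local checkpointing, the local file system. Once the data is saved, the RDD no longer depends on its parent RDDs and is read back from the checkpoint instead of being recomputed.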
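To make the idea of lineage truncation concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: the class ToyRDD and its methods map, collect, lineage_depth and checkpoint are hypothetical names invented for this illustration. It also checkpoints eagerly, whereas in Spark the directory is set with SparkContext.setCheckpointDir and RDD.checkpoint() only marks the RDD, with the data written when an action runs.

```python
import json
import os
import tempfile


class ToyRDD:
    """Hypothetical stand-in for an RDD: a parent pointer plus a transformation."""

    def __init__(self, parent=None, fn=None, source=None):
        self.parent = parent  # upstream ToyRDD, or None for a root
        self.fn = fn          # function applied to each parent element
        self.source = source  # materialized data, only set on a root

    def map(self, fn):
        # Lazy: records the dependency and computes nothing yet.
        return ToyRDD(parent=self, fn=fn)

    def collect(self):
        # Recompute by walking the lineage back to a root.
        if self.parent is None:
            return list(self.source)
        return [self.fn(x) for x in self.parent.collect()]

    def lineage_depth(self):
        # Number of ancestors that must be replayed to rebuild this dataset.
        return 0 if self.parent is None else 1 + self.parent.lineage_depth()

    def checkpoint(self, directory):
        # Materialize the data, persist it, then drop the parent reference.
        data = self.collect()
        path = os.path.join(directory, "checkpoint.json")
        with open(path, "w") as f:
            json.dump(data, f)
        with open(path) as f:
            self.source = json.load(f)  # from now on, read from the checkpoint
        self.parent = None              # lineage is truncated here
        self.fn = None
        return path


base = ToyRDD(source=[1, 2, 3])
derived = base.map(lambda x: x + 1).map(lambda x: x * 10)

depth_before = derived.lineage_depth()  # two map steps above this dataset

with tempfile.TemporaryDirectory() as checkpoint_dir:
    derived.checkpoint(checkpoint_dir)

depth_after = derived.lineage_depth()   # no ancestors left to replay
result = derived.collect()
```

After the checkpoint, the dataset behaves like a root: recovering it means reading saved data rather than replaying every transformation, which is why checkpointing is useful for long lineage chains.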