Secondary Sort
The repartitionAndSortWithinPartitions transformation repartitions
the dataset according to a partitioner and, within each resulting
partition, sorts records by their keys. Because the sort is pushed down
into the shuffle machinery, large amounts of data can be spilled to disk
efficiently, and sorting can be combined with other operations.
For example, Apache Hive on Spark uses this transformation inside its
join implementation. It is also a vital building block of the secondary
sort pattern, in which you group records by key and then, when iterating
over the values that correspond to a key, see them in a particular order.
This need arises in algorithms that group events by user and then analyze
each user's events in the order in which they occurred.
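
The secondary sort pattern described above can be sketched without a cluster. The following is a minimal plain-Python simulation, not Spark code: the helper names (repartition_and_sort, events_by_user, user_partitioner) and the sample events are hypothetical and exist only for illustration. It assumes each record is keyed by a composite (user, timestamp) pair, that the partitioner looks only at the user, and that the sort uses the whole key.

```python
from itertools import groupby
from operator import itemgetter


def repartition_and_sort(records, num_partitions, partition_func):
    """Simulate the transformation: route each (key, value) record to a
    partition chosen from its key, then sort every partition by key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_func(key) % num_partitions].append((key, value))
    for part in partitions:
        part.sort(key=itemgetter(0))  # sort by the full composite key
    return partitions


def events_by_user(partition):
    """Walk one sorted partition and yield (user, events in time order).
    Grouping consecutive records works because the sort placed all of a
    user's records next to each other."""
    for user, group in groupby(partition, key=lambda rec: rec[0][0]):
        yield user, [(key[1], value) for key, value in group]


def user_partitioner(key):
    """Hypothetical partitioner: uses only the user part of the key, so all
    of a user's events land in the same partition. A character-code sum
    stands in for a real hash to keep the example deterministic."""
    return sum(map(ord, key[0]))


# Hypothetical sample data: ((user, timestamp), event)
events = [
    (("bob", 3), "logout"),
    (("alice", 2), "click"),
    (("bob", 1), "login"),
    (("alice", 1), "login"),
    (("alice", 5), "purchase"),
]

partitions = repartition_and_sort(events, 2, user_partitioner)
result = {
    user: user_events
    for part in partitions
    for user, user_events in events_by_user(part)
}

for user, user_events in sorted(result.items()):
    print(user, user_events)
```

The design choice that matters here is the split of responsibilities: the partitioner sees only the user, which keeps each user's events together, while the sort sees the full key, which orders those events by time. In Spark itself the sort happens during the shuffle rather than in memory, so the per-user iteration can stream through values that are already ordered.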
