pyspark.RDD.rightOuterJoin
- RDD.rightOuterJoin(other, numPartitions=None)
- Perform a right outer join of self and other.
- For each element (k, w) in other, the resulting RDD will contain either all pairs (k, (v, w)) for each v in self with key k, or the pair (k, (None, w)) if no element in self has key k.
- Hash-partitions the resulting RDD into the given number of partitions.
- New in version 0.7.0.
- Parameters
  - other: the other RDD in the join
  - numPartitions: the number of partitions to hash-partition the resulting RDD into (optional, defaults to None)
- Returns
  - An RDD of pairs (k, (v, w)), with v set to None for keys of other that have no match in self
- Examples
  >>> rdd1 = sc.parallelize([("a", 1), ("b", 4)])
  >>> rdd2 = sc.parallelize([("a", 2)])
  >>> sorted(rdd2.rightOuterJoin(rdd1).collect())
  [('a', (2, 1)), ('b', (None, 4))]
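The join semantics described above can also be modeled in plain Python, which may help when reasoning about results without a running Spark context. This is only an illustrative sketch: `right_outer_join` is a hypothetical helper, not part of PySpark, and it operates on in-memory lists of (key, value) pairs. It ignores partitioning entirely.

```python
def right_outer_join(left, right):
    """Model left.rightOuterJoin(right) on plain lists of (key, value) pairs.

    Hypothetical helper for illustration only; not a PySpark API.
    """
    # Group the left-side values by key.
    left_by_key = {}
    for k, v in left:
        left_by_key.setdefault(k, []).append(v)

    joined = []
    for k, w in right:
        # Every right-side element is kept. A key with no match on the
        # left is paired with None; otherwise one pair per matching v.
        for v in left_by_key.get(k, [None]):
            joined.append((k, (v, w)))
    return joined


left = [("a", 2), ("a", 3)]
right = [("a", 1), ("b", 4)]
result = sorted(right_outer_join(left, right))
print(result)  # [('a', (2, 1)), ('a', (3, 1)), ('b', (None, 4))]
```

Note that a key appearing several times in self produces one output pair per matching value, while a key that appears only in self (and not in other) is dropped.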