For example, you can use `cogroup` and then filter:
    ## This depends on empty resultiterable.ResultIterable
    ## evaluating to False
    intersection_rdd = rdd1.cogroup(rdd2).filter(lambda x: x[1][0] and x[1][1])
    intersection_rdd.map(lambda x: (x[0], (list(x[1][0]), list(x[1][1])))).collect()

    ## [('www.page1.html', (['word1', 'word3'], [7.3])),
    ##  ('www.page2.html', (['word1'], [1.25]))]
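To see why the filter keeps only keys present on both sides, here is a rough plain-Python sketch that mimics what `cogroup` followed by the truthiness filter does. It does not use Spark. The helper `cogroup_local` and the input lists `pairs1` and `pairs2` are hypothetical, chosen only so the result matches the output shown above; the `www.page3.html` and `www.page4.html` entries are made up to illustrate keys that get dropped.

```python
from collections import defaultdict

def cogroup_local(left, right):
    """Hypothetical stand-in for RDD.cogroup on plain lists of (key, value) pairs.

    Returns a list of (key, (left_values, right_values)), sorted by key.
    A key missing from one side gets an empty list on that side.
    """
    grouped = defaultdict(lambda: ([], []))
    for key, value in left:
        grouped[key][0].append(value)
    for key, value in right:
        grouped[key][1].append(value)
    return sorted(grouped.items(), key=lambda item: item[0])

# Assumed sample data (not from the original post).
pairs1 = [
    ("www.page1.html", "word1"),
    ("www.page1.html", "word3"),
    ("www.page2.html", "word1"),
    ("www.page3.html", "word2"),  # only in pairs1, so it is filtered out
]
pairs2 = [
    ("www.page1.html", 7.3),
    ("www.page2.html", 1.25),
    ("www.page4.html", 0.5),      # only in pairs2, so it is filtered out
]

# Same predicate as the Spark filter: both groups must be non-empty.
# An empty list is falsy, just as the snippet above relies on an empty
# ResultIterable being falsy.
intersection = [
    (key, groups)
    for key, groups in cogroup_local(pairs1, pairs2)
    if groups[0] and groups[1]
]

print(intersection)
```

Because the predicate relies on truthiness of the grouped values, a key survives only when both sides contributed at least one value, which is what makes the result an intersection on keys.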