Trident groupBy operation
The groupBy operation doesn't involve any repartitioning. It converts the input stream into a grouped stream. The main function of the groupBy operation is to modify the behavior of subsequent aggregate functions.

groupBy before partitionAggregate
If the groupBy operation is used before a partitionAggregate, then the partitionAggregate will run the aggregate on each group created within the partition.
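
The per-partition behavior can be pictured with a small plain-Java model. This is only a sketch and not Trident code: the class name, the method name, and the choice of a count as the aggregate are all assumptions made for illustration. Each inner list stands for one partition, and each string is the value of the grouping field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of groupBy followed by partitionAggregate (not the Trident API).
public class GroupedPartitionAggregateSketch {

    // Counts tuples per group key separately inside every partition.
    // No tuple ever moves to another partition.
    static List<Map<String, Long>> countPerGroupWithinPartitions(List<List<String>> partitions) {
        List<Map<String, Long>> result = new ArrayList<>();
        for (List<String> partition : partitions) {
            // TreeMap keeps the keys sorted so the printed output is stable.
            Map<String, Long> counts = new TreeMap<>();
            for (String key : partition) {
                counts.merge(key, 1L, Long::sum);
            }
            result.add(counts);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "a"));
        // prints [{a=2, b=1}, {a=1, b=1}]
        System.out.println(countPerGroupWithinPartitions(partitions));
    }
}
```

The same key can appear in several partitions, so the model yields one partial result per group per partition rather than one global result per group.
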
groupBy before aggregate
If the groupBy operation is used before an aggregate, then the input tuples are first repartitioned, and the aggregate operation is then performed on each group.
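
A plain-Java model of this two-step behavior might look as follows. It is a sketch under stated assumptions and not Trident code: the class and method names are hypothetical, the aggregate is assumed to be a count, and routing by the hash of the grouping value is only an assumed stand-in for whatever repartitioning Trident actually performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of groupBy followed by aggregate (not the Trident API).
public class GroupedAggregateSketch {

    // Step 1: repartition so that all tuples with the same key land in the same partition.
    // Hash-based routing is an assumption made for this illustration.
    static List<List<String>> repartitionByKey(List<List<String>> partitions, int targetPartitions) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < targetPartitions; i++) {
            out.add(new ArrayList<>());
        }
        for (List<String> partition : partitions) {
            for (String key : partition) {
                out.get(Math.floorMod(key.hashCode(), targetPartitions)).add(key);
            }
        }
        return out;
    }

    // Step 2: aggregate (here: count) each group. After step 1 every group lives in
    // exactly one partition, so each key receives a single, complete count.
    static Map<String, Long> countPerGroup(List<List<String>> partitions, int targetPartitions) {
        Map<String, Long> totals = new TreeMap<>();
        for (List<String> partition : repartitionByKey(partitions, targetPartitions)) {
            for (String key : partition) {
                totals.merge(key, 1L, Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "a"));
        // prints {a=3, b=2}
        System.out.println(countPerGroup(partitions, 2));
    }
}
```

Because the tuples are regrouped before counting, the model yields one result per group across the whole input instead of one partial result per partition.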