pyspark.sql.functions.grouping_id#
- pyspark.sql.functions.grouping_id(*cols)[source]#
- Aggregate function: returns the level of grouping, equals to - (grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + … + grouping(cn) - New in version 2.0.0. - Changed in version 3.4.0: Supports Spark Connect. - Parameters
- colsColumnor column name
- columns to check for. 
 
- cols
- Returns
- Column
- returns level of the grouping it relates to. 
 
 - Notes - The list of columns should match with grouping columns exactly, or empty (means all the grouping columns). - Examples - >>> from pyspark.sql import functions as sf >>> df = spark.createDataFrame( ... [(1, "a", "a"), (3, "a", "a"), (4, "b", "c")], ["c1", "c2", "c3"]) >>> df.cube("c2", "c3").agg(sf.grouping_id(), sf.sum("c1")).orderBy("c2", "c3").show() +----+----+-------------+-------+ | c2| c3|grouping_id()|sum(c1)| +----+----+-------------+-------+ |NULL|NULL| 3| 8| |NULL| a| 2| 4| |NULL| c| 2| 4| | a|NULL| 1| 4| | a| a| 0| 4| | b|NULL| 1| 4| | b| c| 0| 4| +----+----+-------------+-------+