pyspark.sql.Column.dropFields#
- Column.dropFields(*fieldNames)[source]#
- An expression that drops fields in - StructTypeby name. This is a no-op if the schema doesn’t contain field name(s).- New in version 3.1.0. - Changed in version 3.4.0: Supports Spark Connect. - Parameters
- fieldNamesstr
- Desired field names (collects all positional arguments passed) The result will drop at a location if any field matches in the Column. 
 
- Returns
- Column
- Column representing whether each element of Column with field dropped by fieldName. 
 
 - Examples - >>> from pyspark.sql import Row >>> from pyspark.sql.functions import col, lit >>> df = spark.createDataFrame([ ... Row(a=Row(b=1, c=2, d=3, e=Row(f=4, g=5, h=6)))]) >>> df.withColumn('a', df['a'].dropFields('b')).show() +-----------------+ | a| +-----------------+ |{2, 3, {4, 5, 6}}| +-----------------+ - >>> df.withColumn('a', df['a'].dropFields('b', 'c')).show() +--------------+ | a| +--------------+ |{3, {4, 5, 6}}| +--------------+ - This method supports dropping multiple nested fields directly e.g. - >>> df.withColumn("a", col("a").dropFields("e.g", "e.h")).show() +--------------+ | a| +--------------+ |{1, 2, 3, {4}}| +--------------+ - However, if you are going to add/replace multiple nested fields, it is preferred to extract out the nested struct before adding/replacing multiple fields e.g. - >>> df.select(col("a").withField( ... "e", col("a.e").dropFields("g", "h")).alias("a") ... ).show() +--------------+ | a| +--------------+ |{1, 2, 3, {4}}| +--------------+