Pandas Map, Apply and ApplyMap
Data preprocessing is an important step for data analysts, and pandas has several built-in methods that help with it. In this post we will see how to apply functions with apply( ) and applymap( ), and how to substitute values with map( ).
Map Method:
The map( ) method discussed here works on a pandas Series. What it does depends on the argument passed, which can be a function, a dictionary, or another Series.
When you call map( ) on a Series with a function, it applies the function to each element and returns the transformed Series.
To understand map( ), let’s first create a Series (the snippets in this post assume pandas and NumPy are imported as pd and np):
s=pd.Series(['USA','INDIA',np.nan,'UK'])
map( ) accepts a dictionary or a Series. Values that are not found are converted to NaN.
s.map({'USA':'US','INDIA':'IND'})
map( ) also accepts functions.
s.map('I like {}'.format)
To avoid applying the function to missing values, pass na_action='ignore'.
s.map('I like {}'.format,na_action='ignore')
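The three calls above can be put side by side in one small script. This is a minimal sketch that reuses the Series from this section; the variable names by_dict, by_func and by_func_skip_na are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(['USA', 'INDIA', np.nan, 'UK'])

# Dictionary lookup: 'UK' is not a key, so it becomes NaN; the original NaN stays NaN.
by_dict = s.map({'USA': 'US', 'INDIA': 'IND'})

# Function without na_action: the function also receives the missing value,
# so it is formatted into the string like any other element.
by_func = s.map('I like {}'.format)

# Function with na_action='ignore': the missing value is passed through untouched.
by_func_skip_na = s.map('I like {}'.format, na_action='ignore')

print(by_dict.tolist())
print(by_func.tolist())
print(by_func_skip_na.tolist())
```

Comparing the last two printed lists shows the effect of na_action: without it the missing entry turns into the text 'I like nan', with it the entry remains missing.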
Data Set for Demonstrating apply( ) and applymap( ) method
df = pd.DataFrame(np.random.randint(10, size=(6, 6), dtype=int),
                  columns=('c1', 'c2', 'c3', 'c4', 'c5', 'c6'),
                  index=['r1', 'r2', 'r3', 'r4', 'r5', 'r6'])
Apply Method:
The apply( ) method works on both a pandas DataFrame and a pandas Series. On a DataFrame, the function receives each row or each column as a Series; on a Series, the function is applied to each individual element.
If we have a function that calculates the sum of the values, we can apply it to rows or columns using apply( ). Let’s look at an example.
def total(values):
    # values is one row or one column, passed in as a Series
    return values.sum()
Let’s apply the function to our data frame to get the sum of each row.
# sum of each row
df.apply(total, axis=1)
To get the sum of each column, use axis=0.
# sum of each column
df.apply(total, axis=0)
Lambda with apply( ):
We can also use lambda functions with the apply( ) method.
# Gives the sum of each row
df.apply(lambda x : x.sum(),axis=1)
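Because the data frame above is filled with random numbers, its sums change on every run. The sketch below uses a small fixed frame (hypothetical data, named demo here) so the effect of axis is easy to check by hand, and it also shows apply( ) on a single Series.

```python
import pandas as pd

# Small fixed frame so the results are predictable (illustrative data only).
demo = pd.DataFrame({'c1': [1, 2, 3], 'c2': [10, 20, 30]},
                    index=['r1', 'r2', 'r3'])

# axis=0 (the default): the function receives each column as a Series.
col_totals = demo.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row as a Series.
row_totals = demo.apply(lambda row: row.sum(), axis=1)

# On a Series, apply calls the function once per element.
doubled = demo['c1'].apply(lambda v: v * 2)

print(col_totals.to_dict())  # {'c1': 6, 'c2': 60}
print(row_totals.to_dict())  # {'r1': 11, 'r2': 22, 'r3': 33}
print(doubled.tolist())      # [2, 4, 6]
```

For a plain sum, the built-in demo.sum(axis=0) and demo.sum(axis=1) give the same totals; apply( ) becomes more useful when the per-row or per-column logic has no ready-made method.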
Apply Map Method:
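The applymap( ) method works on a pandas DataFrame and applies the function to every element individually. From pandas 2.1 onward, applymap( ) is deprecated and the same operation is available as DataFrame.map( ).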
# gives the squared elements in the entire data frame
df.applymap(lambda x : x**2)
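Since the name of the element-wise DataFrame method depends on the installed pandas version (applymap in older releases, DataFrame.map from 2.1), one way to write code that runs on both is a small wrapper. The helper name elementwise and the frame demo are assumptions made for this sketch, not part of pandas.

```python
import pandas as pd

demo = pd.DataFrame({'c1': [1, 2], 'c2': [3, 4]})


def elementwise(frame, func):
    """Hypothetical helper: call func on every cell of frame."""
    if hasattr(frame, 'map'):      # DataFrame.map exists from pandas 2.1
        return frame.map(func)
    return frame.applymap(func)    # fallback for older pandas


squared = elementwise(demo, lambda x: x ** 2)
print(squared.values.tolist())  # [[1, 9], [4, 16]]
```

For simple arithmetic like squaring, the vectorized expression demo ** 2 produces the same values and is usually faster; element-wise mapping is mainly worthwhile when the function cannot be expressed as a vectorized operation.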
Finally, if a data frame contains NaN values, we can still use applymap( ) to get the desired outcome.
df_s = pd.DataFrame(s)
df_s
Since df_s is a DataFrame, the Series method map( ) shown earlier does not apply to it, so we use the element-wise applymap( ) instead.
df_s.applymap(lambda x:
              'I Like {}'.format(x) if pd.notna(x) else '-')
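If the missing values should simply stay missing rather than be replaced with '-', the element-wise DataFrame method also takes na_action='ignore', just like Series.map( ). The sketch below assumes a pandas version where this parameter is available on the DataFrame method (it was added to applymap in pandas 1.2), and it picks DataFrame.map or applymap depending on what is installed; the name cellwise is illustrative.

```python
import numpy as np
import pandas as pd

df_s = pd.DataFrame(pd.Series(['USA', 'INDIA', np.nan, 'UK']))

# Use DataFrame.map where it exists (pandas 2.1+), otherwise applymap.
cellwise = df_s.map if hasattr(df_s, 'map') else df_s.applymap

# na_action='ignore' skips missing cells instead of passing them to the function.
labelled = cellwise('I Like {}'.format, na_action='ignore')

print(labelled[0].tolist())
```

The conditional lambda in the example above is still the right tool when missing cells need a placeholder value; na_action='ignore' is the shorter option when they should be left as they are.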
Conclusion:
With the three methods map( ), apply( ) and applymap( ), functions can be applied to the whole data frame or to part of it:
map( ) works element-wise on a single Series,
apply( ) passes entire rows or columns of a DataFrame to the function (and works element-wise on a Series), and
applymap( ) works element-wise on the entire data frame.
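The difference between the three comes down to what the function receives on each call. A quick way to see this is to pass a function that only reports whether its argument is a Series. The frame demo and the helper is_series are assumptions made for this sketch.

```python
import pandas as pd

demo = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})


def is_series(obj):
    # True when the function was handed a whole row/column, False for a single value.
    return isinstance(obj, pd.Series)


# Series.map: called once per element of one column.
map_sees_series = demo['a'].map(is_series).tolist()

# DataFrame.apply: called once per column (axis=0), each passed as a Series.
apply_sees_series = demo.apply(is_series, axis=0).tolist()

# Element-wise DataFrame method: DataFrame.map on pandas 2.1+, applymap before that.
cellwise = demo.map if hasattr(demo, 'map') else demo.applymap
cell_sees_series = cellwise(is_series).values.ravel().tolist()

print(map_sees_series)    # [False, False]
print(apply_sees_series)  # [True, True]
print(cell_sees_series)   # [False, False, False, False]
```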
Happy reading!!!