When you are working with a new Pandas DataFrame, these attributes and methods will give you insights into key aspects of the data.
The dir function lets you look at all of the attributes that a Python object has.
dir(df)
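The output of dir is long for a dataframe, so it can help to filter it. The sketch below is one way to do that; the sample data and the name public_names are made up for illustration.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [31, 25, 47]})

# dir() returns a list of attribute names as strings.
# Dropping names that start with an underscore leaves the public ones.
public_names = [name for name in dir(df) if not name.startswith("_")]

print(len(public_names))
print("shape" in public_names, "head" in public_names)
```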
The shape attribute returns a tuple of integers indicating the number of elements stored along each dimension. A dataframe is two-dimensional, so for N rows and M columns, shape will be (N, M).
df.shape
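Because shape is a plain tuple, it can be unpacked into a row count and a column count. A minimal sketch, using a small made-up dataframe:

```python
import pandas as pd

# Hypothetical example data: 3 rows, 2 columns.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [31, 25, 47]})

# Unpack the (rows, columns) tuple into two variables.
n_rows, n_cols = df.shape

print(df.shape)        # (3, 2)
print(n_rows, n_cols)  # 3 2
```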
You may be working with a dataframe that has hundreds or thousands of rows. To get a glimpse of the data inside a dataframe without printing out all of the values, you can use the head and tail methods.
head returns the first n rows of the dataframe (5 by default).
df.head()   # returns the first 5 rows (positions 0-4)
df.head(n)  # returns the first n rows
tail returns the last n rows of the dataframe (5 by default).
df.tail()   # returns the last 5 rows
df.tail(n)  # returns the last n rows
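To see both methods on a dataframe that is too long to print comfortably, here is a small sketch. The 100-row dataframe and its value column are invented for the example.

```python
import pandas as pd

# Hypothetical example data: 100 rows holding the numbers 0 to 99.
df = pd.DataFrame({"value": range(100)})

first_five = df.head()    # no argument: the first 5 rows
first_three = df.head(3)  # the first 3 rows
last_five = df.tail()     # no argument: the last 5 rows
last_two = df.tail(2)     # the last 2 rows

print(first_three)
print(last_two)
```

Both methods return a new dataframe, so the result can be stored in a variable or passed to other methods.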
The count method of a dataframe shows you the number of non-null entries in each column.
df.count()
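The effect of missing values on count is easiest to see with a small example. The data below is made up, with one missing value in each of the first two columns.

```python
import numpy as np
import pandas as pd

# Hypothetical example data with two missing values.
df = pd.DataFrame({
    "name": ["Ann", "Bob", None],
    "age": [31, np.nan, 47],
    "city": ["Oslo", "Rome", "Lima"],
})

# count() returns a Series with one value per column.
counts = df.count()
print(counts)
```

Here name and age each have a count of 2, while city has a count of 3, even though the dataframe has 3 rows.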
Check whether any column contains missing values. The result has one True or False value per column.
pd.isnull(df).any()
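As a sketch of what this check returns, the example below runs it on made-up data. It also shows one possible follow-up that is not part of the check itself: summing the null mask to count the missing values in each column.

```python
import numpy as np
import pandas as pd

# Hypothetical example data with two missing values.
df = pd.DataFrame({
    "name": ["Ann", "Bob", None],
    "age": [31, np.nan, 47],
    "city": ["Oslo", "Rome", "Lima"],
})

# One boolean per column: True if the column has at least one missing value.
has_missing = pd.isnull(df).any()

# Optional follow-up: how many values are missing in each column.
missing_per_column = df.isnull().sum()

# A single boolean for the whole dataframe.
any_missing_at_all = has_missing.any()

print(has_missing)
print(missing_per_column)
```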
The info method of the dataframe prints a summary of the data. It tells you:
- The number of entries in the df
- The names of the columns
- The number of columns
- The number of non-null entries in each column
- The dtype of each column
- Whether a column has null values (its non-null count is lower than the number of entries in the df)
df.info()
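Note that info prints its summary rather than returning it. If you want the text as a string, one option is to pass a buffer through the buf parameter, as in this sketch with made-up data:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical example data with two missing values.
df = pd.DataFrame({
    "name": ["Ann", "Bob", None],
    "age": [31, np.nan, 47],
    "city": ["Oslo", "Rome", "Lima"],
})

# Prints the summary to the screen; the return value is None.
result = df.info()

# Write the same summary into an in-memory text buffer instead.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()

print(summary)
```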
