## IPL Analysis:

Contents

- 1 IPL Analysis:
- 2 Step 1: Importing Libraries
- 3 Step 2: Download The Dataset
- 4 Step 3: Importing The Dataset
- 5 IPL DataSet Information :
- 6 IPL Analysis 1: List Of Seasons
- 7 IPL Analysis 2: IPL Matches Season Wise
- 8 IPL Analysis 3: Most IPL Matches Played In a Stadium
- 9 IPL Analysis 4: Number of IPL Matches Played By Each Team
- 10 IPL Analysis 5: Most Runs Scored by IPL Teams
- 11 IPL Analysis 6: Most IPL Runs by a Batsman
- 12 Average Run by Teams in Powerplay
- 13 Most IPL Century by a Player
- 14 Most IPL Fifty by a Player
- 15 Most Sixes in an IPL Inning
- 16 Most (4s) hit by a Batsman
- 17 Most runs in an IPL season by Player
- 18 No. of Sixes in IPL Seasons
- 19 Highest Individual IPL Score
- 20 Most run conceded by a bowler in an inning
- 21 Most IPL Wickets by a Bowler
- 22 Most Dot Ball by a Bowler
- 23 Most Wickets by an IPL Team
- 24 Most No Balls by an IPL team

IPL analysis plays a major role in owning a team, making decisions, and deciding the batting or bowling order. We will use a dataset from Kaggle and try to extract insights from it. We will use only pandas, so the only other thing you need for the analysis is the dataset itself.

## Step 1: Importing Libraries

We will start by importing all the necessary libraries before analyzing the data.

`import pandas as pd`

## Step 2: Download The Dataset

We will work with IPL data from 2008 to 2020. Before anything else, download the dataset from Kaggle.

## Step 3: Importing The Dataset

```
df = pd.read_csv('IPL Ball-by-Ball 2008-2020.csv')
df.head()
```
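The analyses in this article refer to columns such as `match_id`, `season` and `striker`, so it may be worth checking that the loaded file actually contains them before going further. A minimal sketch, assuming a hypothetical helper `missing_columns` (not part of pandas) and a tiny in-memory frame standing in for the CSV:

```python
import pandas as pd

def missing_columns(frame, required):
    """Return the required column names that are absent from frame, in order."""
    return [name for name in required if name not in frame.columns]

# Tiny stand-in for the real CSV; these values are made up for illustration.
sample = pd.DataFrame({
    "match_id": [1, 1],
    "season": [2008, 2008],
    "striker": ["A", "B"],
})

# Lists whichever of the requested columns the frame does not have.
print(missing_columns(sample, ["match_id", "season", "venue"]))
```

On the real data you would pass `df` instead of `sample`, with the list of columns you plan to use.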

## IPL DataSet Information :

```
df.info()
```

```
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 193468 non-null int64
1 inning 193468 non-null int64
2 over 193468 non-null int64
3 ball 193468 non-null int64
4 batsman 193468 non-null object
5 non_striker 193468 non-null object
6 bowler 193468 non-null object
7 batsman_runs 193468 non-null int64
8 extra_runs 193468 non-null int64
9 total_runs 193468 non-null int64
10 non_boundary 193468 non-null int64
11 is_wicket 193468 non-null int64
12 dismissal_kind 9495 non-null object
13 player_dismissed 9495 non-null object
14 fielder 6784 non-null object
15 extras_type 10233 non-null object
16 batting_team 193468 non-null object
17 bowling_team 193277 non-null object
dtypes: int64(9), object(9)
memory usage: 26.6+ MB
```

## IPL Analysis 1: List Of Seasons

You can list every season in the dataset by calling `unique()` on the `season` column, which returns each season only once:

`df.season.unique()`

```
array([2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2019,
2018, 2020, 2021], dtype=int64)
```
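`unique()` returns values in order of first appearance, which is why the seasons above are not strictly ascending. If a sorted list is preferred, something like the following sketch should work; the small frame here is made-up sample data, not the IPL file:

```python
import pandas as pd

# Made-up sample: seasons appear out of order, with repeats.
seasons_df = pd.DataFrame({"season": [2008, 2008, 2019, 2018, 2018, 2020]})

# unique() keeps the order in which each value first appears.
unique_seasons = seasons_df["season"].unique()

# Sorting afterwards gives a chronological list of plain Python ints.
sorted_seasons = sorted(unique_seasons.tolist())
print(sorted_seasons)
```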

## IPL Analysis 2: IPL Matches Season Wise

The number of IPL matches played in each season can be determined from `match_id`:

`df.groupby(['match_id','season']).count().index.droplevel(level=0).value_counts().sort_index().plot(kind='bar')`
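An alternative way to get the same counts, sketched here on made-up sample data, is to count the distinct `match_id` values within each season using `nunique()`. The column names are assumed to match the ones used above:

```python
import pandas as pd

# Made-up ball-by-ball rows: matches 1 and 2 in 2008, match 3 in 2009.
balls = pd.DataFrame({
    "match_id": [1, 1, 2, 2, 3],
    "season":   [2008, 2008, 2008, 2008, 2009],
})

# Each match has many rows, so count distinct match ids per season.
matches_per_season = balls.groupby("season")["match_id"].nunique()
print(matches_per_season)
```

Appending `.plot(kind='bar')` to the result would give a bar chart like the one produced by the one-liner above.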

## IPL Analysis 3: Most IPL Matches Played In a Stadium

According to our analysis, most IPL matches have been played at **M. Chinnaswamy Stadium**.

We group by **venue and match ID** to count how many matches were played at each stadium.

```
%matplotlib inline
df.groupby(['venue','match_id']).count().droplevel(level=1).index.value_counts().sort_values(ascending=False)[:10].plot(kind='bar')
```

## IPL Analysis 4: Number of IPL Matches Played By Each Team

`df.groupby('bowling_team')['match_id'].nunique().sort_values(ascending=False).plot(kind='barh')`
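Because each row of a ball-by-ball dataset is one delivery, counting matches per team means counting distinct `match_id` values rather than rows. The sketch below uses made-up sample data and assumes the `batting_team`, `bowling_team` and `match_id` columns; it counts a team whether it batted or bowled in a match:

```python
import pandas as pd

# Made-up rows: MI plays CSK in match 1 and RCB in match 2.
balls = pd.DataFrame({
    "match_id":     [1, 1, 1, 2, 2],
    "batting_team": ["MI", "MI", "CSK", "MI", "RCB"],
    "bowling_team": ["CSK", "CSK", "MI", "RCB", "MI"],
})

# Stack both team columns under one name so either role counts as an appearance.
appearances = pd.concat([
    balls[["match_id", "batting_team"]].rename(columns={"batting_team": "team"}),
    balls[["match_id", "bowling_team"]].rename(columns={"bowling_team": "team"}),
])

# Distinct matches per team, largest first.
matches_per_team = (
    appearances.groupby("team")["match_id"].nunique().sort_values(ascending=False)
)
print(matches_per_team)
```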

## IPL Analysis 5: Most Runs Scored by IPL Teams

We group by batting team and add up all the runs scored by each team.

No wonder **Mumbai Indians** top the list.

```
%matplotlib inline
df.groupby(['batting_team'])['run'].sum().sort_values(ascending=False).plot(kind='barh')
```

## IPL Analysis 6: Most IPL Runs by a Batsman

We group by **striker** and add up all the runs. **Virat Kohli** tops the list.

```
df.groupby(['striker'])['runs_off_bat'].sum().sort_values(ascending=False)[:10].plot(kind='bar')
```

## Average Run by Teams in Powerplay

`df[df['over']<6].groupby(['match_id','batting_team'])['run'].sum().groupby('batting_team').mean().sort_values(ascending=False)[2:].plot(kind='barh')`
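The one-liner above does three things: it keeps only the powerplay deliveries, totals the runs for each team in each match, and then averages those totals per team. A step-by-step sketch on made-up sample data, assuming over numbers start at 0 so that `over < 6` selects the first six overs:

```python
import pandas as pd

# Made-up rows; large per-row run values just keep the sample short.
balls = pd.DataFrame({
    "match_id":     [1, 1, 1, 2, 2, 2],
    "batting_team": ["MI", "MI", "MI", "MI", "MI", "CSK"],
    "over":         [0, 5, 6, 1, 2, 3],
    "run":          [4, 6, 6, 1, 1, 2],
})

# Step 1: keep powerplay deliveries only (overs 0 to 5 under the assumption above).
powerplay = balls[balls["over"] < 6]

# Step 2: total powerplay runs for each team in each match.
per_match = powerplay.groupby(["match_id", "batting_team"])["run"].sum()

# Step 3: average those per-match totals for each team.
average = per_match.groupby(level="batting_team").mean()
print(average)
```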

## Most IPL Century by a Player

`runs = df.groupby(['striker','match_id'])['runs_off_bat'].sum()`

`runs[runs >= 100].droplevel(level=1).groupby('striker').count().sort_values(ascending=False)[:10].plot(kind='barh')`
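The idea is to total each batsman's runs per match first and only then count the matches in which that total reached 100. A small sketch on made-up sample data, using the same assumed column names:

```python
import pandas as pd

# Made-up rows; in the real data each row is one ball, so values would be 0 to 6.
balls = pd.DataFrame({
    "striker":      ["A", "A", "A", "B", "B", "A", "A"],
    "match_id":     [1, 1, 1, 1, 1, 2, 2],
    "runs_off_bat": [60, 40, 4, 50, 49, 6, 6],
})

# Runs per batsman per match.
runs = balls.groupby(["striker", "match_id"])["runs_off_bat"].sum()

# Keep innings of 100 or more, then count them per batsman.
centuries = runs[runs >= 100].groupby(level="striker").count()
print(centuries)
```

Changing the threshold from 100 to 50 gives the count of fifty-plus scores, which also includes the centuries.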

## Most IPL Fifty by a Player

`runs = df.groupby(['striker','start_date'])['runs_off_bat'].sum()`

`data = runs[runs >= 50].droplevel(level=1).groupby('striker').count().sort_values(ascending=False)[:10].plot(kind='barh')`

## Most Sixes in an IPL Inning

`df[df['runs_off_bat'] == 6].groupby(['start_date','striker']).count()['season'].sort_values(ascending=False).droplevel(level=0)[:10].plot(kind='barh')`

## Most (4s) hit by a Batsman

`data = df[df['runs_off_bat'] == 4]['striker'].value_counts()[:10].plot(kind='bar')`

## Most runs in an IPL season by Player

`df.groupby(['striker','season'])['runs_off_bat'].sum().sort_values(ascending=False)[:10].plot(kind='bar')`

## No. of Sixes in IPL Seasons

`data = df[df['runs_off_bat'] == 6].groupby('season').count()['match_id'].sort_values(ascending=False).plot(kind='barh')`

## Highest Individual IPL Score

`df.groupby(['striker','start_date'])['runs_off_bat'].sum().sort_values(ascending=False)[:10].plot(kind='barh')`

## Most run conceded by a bowler in an inning

`df.groupby(['bowler','start_date'])['run'].sum().droplevel(level=1).sort_values(ascending=False)[:10].plot(kind='barh')`

## Most IPL Wickets by a Bowler

`lst = ['caught', 'bowled', 'lbw', 'stumped', 'caught and bowled', 'hit wicket']`

`df[df['wicket_type'].isin(lst)]['bowler'].value_counts()[:10].plot(kind='barh')`
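Only some dismissal types are credited to the bowler; a run out, for example, is not. One way to filter for them is an exact membership test with `isin`, which also treats missing values as non-matches. A sketch on made-up sample data, assuming the `wicket_type` and `bowler` columns used above:

```python
import pandas as pd

# Dismissal types credited to the bowler.
bowler_dismissals = [
    "caught", "bowled", "lbw", "stumped", "caught and bowled", "hit wicket",
]

# Made-up rows: a run out and a delivery with no wicket should not count.
balls = pd.DataFrame({
    "bowler":      ["X", "X", "Y", "Y", "X"],
    "wicket_type": ["caught", "run out", "bowled", None, "lbw"],
})

# isin matches whole values only, so partial strings and blanks are excluded.
credited = balls[balls["wicket_type"].isin(bowler_dismissals)]
wickets = credited["bowler"].value_counts()
print(wickets)
```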

## Most Dot Ball by a Bowler

`data = df[df['run'] == 0].groupby('bowler').count()['match_id'].sort_values(ascending=False)[:10].plot(kind='barh')`

## Most Wickets by an IPL Team

`lst = ['caught', 'bowled', 'lbw', 'stumped', 'caught and bowled', 'hit wicket']`

`df[df['wicket_type'].isin(lst)]['bowling_team'].value_counts().plot(kind='barh')`

Total extras for each batting team can be plotted with a similar grouping:

`df.groupby(['batting_team'])['extras'].agg('sum').sort_values(ascending=False).plot(kind='barh')`

## Most No Balls by an IPL team

`df.groupby(['bowling_team'])['noballs'].agg('sum').sort_values(ascending=False).plot(kind='bar')`

As you can see, we have analyzed quite a lot using pandas and its matplotlib-based plotting. Analyses like these can already inform some very important decisions. Now imagine a data analyst examining the data in depth and extracting insights far more complex than these.

This is what a data analyst does in any company: improve the decision-making process by providing insights like these.

For official IPL statistics, see IPLT20.com, the Indian Premier League's official website.
