
Anime Dataset Visualisation

In [ ]:
import pandas as pd
In [ ]:
url = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-04-23/raw_anime.csv"
df = pd.read_csv(url)
In [ ]:
df.head()
Out[ ]:
animeID name title_english title_japanese title_synonyms type source producers genre studio ... scored_by rank popularity members favorites synopsis background premiered broadcast related
0 1 Cowboy Bebop Cowboy Bebop カウボーイビバップ [] TV Original ['Bandai Visual'] ['Action', 'Adventure', 'Comedy', 'Drama', 'Sc... ['Sunrise'] ... 405664.0 26.0 39.0 795733.0 43460.0 In the year 2071, humanity has colonized sever... When Cowboy Bebop first aired in spring of 199... Spring 1998 Saturdays at 01:00 (JST) {'Adaptation': [{'mal_id': 173, 'type': 'manga...
1 5 Cowboy Bebop: Tengoku no Tobira Cowboy Bebop: The Movie カウボーイビバップ 天国の扉 ["Cowboy Bebop: Knockin' on Heaven's Door"] Movie Original ['Sunrise', 'Bandai Visual'] ['Action', 'Drama', 'Mystery', 'Sci-Fi', 'Space'] ['Bones'] ... 120243.0 164.0 449.0 197791.0 776.0 Another day, another bounty—such is the life o... NaN NaN NaN {'Parent story': [{'mal_id': 1, 'type': 'anime...
2 6 Trigun Trigun トライガン [] TV Manga ['Victor Entertainment'] ['Action', 'Sci-Fi', 'Adventure', 'Comedy', 'D... ['Madhouse'] ... 212537.0 255.0 146.0 408548.0 10432.0 Vash the Stampede is the man with a $$60,000,0... The Japanese release by Victor Entertainment h... Spring 1998 Thursdays at 01:15 (JST) {'Adaptation': [{'mal_id': 703, 'type': 'manga...
3 7 Witch Hunter Robin Witch Hunter Robin Witch Hunter ROBIN ['WHR'] TV Original ['Bandai Visual'] ['Action', 'Magic', 'Police', 'Supernatural', ... ['Sunrise'] ... 32837.0 2371.0 1171.0 79397.0 537.0 Witches are individuals with special powers li... NaN Summer 2002 Tuesdays at Unknown {}
4 8 Bouken Ou Beet Beet the Vandel Buster 冒険王ビィト ['Adventure King Beet'] TV Manga ['TV Tokyo', 'Dentsu'] ['Adventure', 'Fantasy', 'Shounen', 'Supernatu... ['Toei Animation'] ... 4894.0 3544.0 3704.0 11708.0 14.0 It is the dark century and the people are suff... NaN Fall 2004 Thursdays at 18:30 (JST) {'Adaptation': [{'mal_id': 1348, 'type': 'mang...

5 rows × 27 columns

In [ ]:
df.columns
Out[ ]:
Index(['animeID', 'name', 'title_english', 'title_japanese', 'title_synonyms',
       'type', 'source', 'producers', 'genre', 'studio', 'episodes', 'status',
       'airing', 'aired', 'duration', 'rating', 'score', 'scored_by', 'rank',
       'popularity', 'members', 'favorites', 'synopsis', 'background',
       'premiered', 'broadcast', 'related'],
      dtype='object')
In [ ]:
df.dtypes
Out[ ]:
animeID             int64
name               object
title_english      object
title_japanese     object
title_synonyms     object
type               object
source             object
producers          object
genre              object
studio             object
episodes          float64
status             object
airing             object
aired              object
duration           object
rating             object
score             float64
scored_by         float64
rank              float64
popularity        float64
members           float64
favorites         float64
synopsis           object
background         object
premiered          object
broadcast          object
related            object
dtype: object

Checking for Null Values

In [ ]:
df.shape[0]
Out[ ]:
15278
In [ ]:
df.isnull().sum()
Out[ ]:
animeID               0
name                  0
title_english      9156
title_japanese       48
title_synonyms        5
type                  5
source                5
producers             5
genre                 5
studio                5
episodes            546
status                5
airing                5
aired                 5
duration              5
rating                5
score               500
scored_by             5
rank               1609
popularity            5
members               5
favorites             5
synopsis            713
background        14160
premiered         11099
broadcast         10876
related               5
dtype: int64

The columns title_synonyms, type, source, producers, genre, studio, status, airing, aired, duration, rating, scored_by, popularity, members, favorites and related each have 5 null values. Let's drop the rows where any of these columns is missing.

In [ ]:
# drop rows with missing values only in specific columns
df.dropna(subset=['title_synonyms', 'type', 'source', 'producers', 'genre', 'studio', 'status', 'airing', 'aired', 'duration', 'rating', 'scored_by', 'popularity', 'members', 'favorites', 'related'], inplace=True)
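The column list above was typed out by hand. As a rough sketch of an alternative, the same subset could be derived from the null counts. The toy frame and the names `toy`, `sparse_cols` and `cleaned` are made up for illustration and are not part of the original notebook.

```python
import pandas as pd

# Toy frame standing in for the anime data (values are invented).
toy = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "type": ["TV", None, "Movie", "TV"],
    "source": ["Manga", None, "Original", "Manga"],
    "background": [None, None, None, "note"],
})

null_counts = toy.isnull().sum()

# Columns whose missing count equals the smallest non-zero count
# (that count is 5 in the real dataset and 1 in this toy frame).
smallest = null_counts[null_counts > 0].min()
sparse_cols = null_counts[null_counts == smallest].index.tolist()

# Drop only the rows that are missing a value in one of those columns.
cleaned = toy.dropna(subset=sparse_cols)
print(sparse_cols)
print(len(cleaned))
```

This assumes the columns to clean are exactly those sharing the smallest non-zero missing count, which happens to hold for the counts shown above but may not hold for other data.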

The dataset has 15,278 rows, and background is null in 14,160 of them (about 93%), so let's drop this column.

In [ ]:
df.drop('background', axis=1, inplace=True)
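The decision to drop a column can also be expressed as a missing-value ratio. The following is a minimal sketch on invented data; the 0.5 cut-off and the names `toy`, `threshold` and `too_sparse` are assumptions for illustration, not values used in the notebook.

```python
import pandas as pd

# Toy frame with one mostly-empty column (values are invented).
toy = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "background": [None, None, None, "note"],
    "score": [8.1, None, 7.3, 6.9],
})

# Fraction of missing values per column (mean of the boolean null mask).
missing_ratio = toy.isnull().mean()

threshold = 0.5  # hypothetical cut-off, chosen only for this sketch
too_sparse = missing_ratio[missing_ratio > threshold].index.tolist()

# Drop every column whose missing ratio exceeds the cut-off.
trimmed = toy.drop(columns=too_sparse)
print(missing_ratio)
print(trimmed.columns.tolist())
```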
In [ ]:
df.columns
Out[ ]:
Index(['animeID', 'name', 'title_english', 'title_japanese', 'title_synonyms',
       'type', 'source', 'producers', 'genre', 'studio', 'episodes', 'status',
       'airing', 'aired', 'duration', 'rating', 'score', 'scored_by', 'rank',
       'popularity', 'members', 'favorites', 'synopsis', 'premiered',
       'broadcast', 'related'],
      dtype='object')
In [ ]:
df.isnull().sum()
Out[ ]:
animeID               0
name                  0
title_english      9151
title_japanese       43
title_synonyms        0
type                  0
source                0
producers             0
genre                 0
studio                0
episodes            541
status                0
airing                0
aired                 0
duration              0
rating                0
score               495
scored_by             0
rank               1604
popularity            0
members               0
favorites             0
synopsis            708
premiered         11094
broadcast         10871
related               0
dtype: int64

Several of the main columns still have many missing values:

  • title_english is missing in 9,151 rows, but we need these names for the visualisation, so let's try to translate title_japanese into English for those rows.
In [ ]:
!pip install googletrans==4.0.0-rc1
In [ ]:
from googletrans import Translator
translator = Translator()

for i, row in df.iterrows():
    # only translate rows that lack an English title but have a Japanese one
    if pd.isna(row['title_english']) and pd.notna(row['title_japanese']):
        try:
            # translate 'title_japanese' to english
            translated = translator.translate(row['title_japanese'], dest='en').text
            # fill missing values in 'title_english' column
            df.at[i, 'title_english'] = translated
        except Exception as exc:
            # report the failure and leave 'title_english' missing for this row
            print(f"Translation failed for row {i}: {exc}")
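The fill-in loop can be sketched in a form that runs offline and records which rows failed. Here `fill_missing_titles`, `translate_fn` and `fake_translate` are hypothetical names introduced for illustration; `fake_translate` is only a stand-in, and in the notebook a small wrapper around the translator call would be passed instead.

```python
import pandas as pd

def fill_missing_titles(frame, translate_fn):
    """Fill missing title_english values from title_japanese.

    translate_fn is a hypothetical callable (str -> str).
    Returns a filled copy and the index labels that failed.
    """
    out = frame.copy()
    needs_fill = out["title_english"].isna() & out["title_japanese"].notna()
    cache = {}   # reuse results for repeated Japanese titles
    failed = []  # index labels whose translation raised an error
    for idx in out.index[needs_fill]:
        source_title = out.at[idx, "title_japanese"]
        try:
            if source_title not in cache:
                cache[source_title] = translate_fn(source_title)
            out.at[idx, "title_english"] = cache[source_title]
        except Exception:
            failed.append(idx)
    return out, failed

# Stand-in translator so the sketch runs without network access.
def fake_translate(text):
    if text == "bad":
        raise ValueError("simulated failure")
    return text.upper()

toy = pd.DataFrame({
    "title_english": ["Trigun", None, None, None, None],
    "title_japanese": ["x", "abc", "abc", "bad", None],
})
filled, failed = fill_missing_titles(toy, fake_translate)
print(filled)
print(failed)
```

Collecting the failed labels in a list, rather than only printing them, makes it possible to retry just those rows later.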
In [ ]:
df.isnull().sum()
Out[ ]:
animeID               0
name                  0
title_english      4214
title_japanese       43
title_synonyms        0
type                  0
source                0
producers             0
genre                 0
studio                0
episodes            541
status                0
airing                0
aired                 0
duration              0
rating                0
score               495
scored_by             0
rank               1604
popularity            0
members               0
favorites             0
synopsis            708
premiered         11094
broadcast         10871
related               0
dtype: int64

Saving Data

In [ ]:
df.to_csv('preprocessed.csv')
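By default, to_csv also writes the DataFrame index, which is where the 'Unnamed: 0' column comes from once the file is read back. A small sketch of the difference, using an in-memory buffer and invented rows (the names `toy`, `cols_default` and `cols_clean` are illustrative only):

```python
import io
import pandas as pd

toy = pd.DataFrame({"name": ["Cowboy Bebop", "Trigun"], "score": [8.81, 8.30]})

# Default behaviour: the index is written as an unnamed first column.
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
cols_default = pd.read_csv(with_index).columns.tolist()

# index=False leaves the index out of the file.
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
cols_clean = pd.read_csv(without_index).columns.tolist()

print(cols_default)
print(cols_clean)
```

The notebook keeps the default and drops 'Unnamed: 0' in a later step; passing index=False when saving would be one way to avoid that extra column in the first place.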

Loading Translated Dataset

In [ ]:
path = "/content/preprocessed.csv"
translated_anime = pd.read_csv(path)

translated_anime.head(3)
Out[ ]:
Unnamed: 0 animeID name title_english title_japanese title_synonyms type source producers genre ... score scored_by rank popularity members favorites synopsis premiered broadcast related
0 0 1 Cowboy Bebop Cowboy Bebop カウボーイビバップ [] TV Original ['Bandai Visual'] ['Action', 'Adventure', 'Comedy', 'Drama', 'Sc... ... 8.81 405664.0 26.0 39.0 795733.0 43460.0 In the year 2071, humanity has colonized sever... Spring 1998 Saturdays at 01:00 (JST) {'Adaptation': [{'mal_id': 173, 'type': 'manga...
1 1 5 Cowboy Bebop: Tengoku no Tobira Cowboy Bebop: The Movie カウボーイビバップ 天国の扉 ["Cowboy Bebop: Knockin' on Heaven's Door"] Movie Original ['Sunrise', 'Bandai Visual'] ['Action', 'Drama', 'Mystery', 'Sci-Fi', 'Space'] ... 8.41 120243.0 164.0 449.0 197791.0 776.0 Another day, another bounty—such is the life o... NaN NaN {'Parent story': [{'mal_id': 1, 'type': 'anime...
2 2 6 Trigun Trigun トライガン [] TV Manga ['Victor Entertainment'] ['Action', 'Sci-Fi', 'Adventure', 'Comedy', 'D... ... 8.30 212537.0 255.0 146.0 408548.0 10432.0 Vash the Stampede is the man with a $$60,000,0... Spring 1998 Thursdays at 01:15 (JST) {'Adaptation': [{'mal_id': 703, 'type': 'manga...

3 rows × 27 columns

In [ ]:
translated_anime.isnull().sum()
Out[ ]:
Unnamed: 0            0
animeID               0
name                  0
title_english      4214
title_japanese       43
title_synonyms        0
type                  0
source                0
producers             0
genre                 0
studio                0
episodes            541
status                0
airing                0
aired                 0
duration              0
rating                0
score               495
scored_by             0
rank               1604
popularity            0
members               0
favorites             0
synopsis            708
premiered         11094
broadcast         10871
related               0
dtype: int64
In [ ]:
translated_anime.shape[0]
Out[ ]:
15273
In [ ]:
# drop rows with missing values only in specific columns
translated_anime.dropna(subset=['title_english', 'title_japanese', 'episodes','score','rank','synopsis','premiered','broadcast'], inplace=True)
In [ ]:
translated_anime.columns
Out[ ]:
Index(['Unnamed: 0', 'animeID', 'name', 'title_english', 'title_japanese',
       'title_synonyms', 'type', 'source', 'producers', 'genre', 'studio',
       'episodes', 'status', 'airing', 'aired', 'duration', 'rating', 'score',
       'scored_by', 'rank', 'popularity', 'members', 'favorites', 'synopsis',
       'premiered', 'broadcast', 'related'],
      dtype='object')
In [ ]:
translated_anime.head(3)
Out[ ]:
Unnamed: 0 animeID name title_english title_japanese title_synonyms type source producers genre ... score scored_by rank popularity members favorites synopsis premiered broadcast related
0 0 1 Cowboy Bebop Cowboy Bebop カウボーイビバップ [] TV Original ['Bandai Visual'] ['Action', 'Adventure', 'Comedy', 'Drama', 'Sc... ... 8.81 405664.0 26.0 39.0 795733.0 43460.0 In the year 2071, humanity has colonized sever... Spring 1998 Saturdays at 01:00 (JST) {'Adaptation': [{'mal_id': 173, 'type': 'manga...
2 2 6 Trigun Trigun トライガン [] TV Manga ['Victor Entertainment'] ['Action', 'Sci-Fi', 'Adventure', 'Comedy', 'D... ... 8.30 212537.0 255.0 146.0 408548.0 10432.0 Vash the Stampede is the man with a $$60,000,0... Spring 1998 Thursdays at 01:15 (JST) {'Adaptation': [{'mal_id': 703, 'type': 'manga...
3 3 7 Witch Hunter Robin Witch Hunter Robin Witch Hunter ROBIN ['WHR'] TV Original ['Bandai Visual'] ['Action', 'Magic', 'Police', 'Supernatural', ... ... 7.33 32837.0 2371.0 1171.0 79397.0 537.0 Witches are individuals with special powers li... Summer 2002 Tuesdays at Unknown {}

3 rows × 27 columns

Removing Redundant Columns

  • Unnamed: 0
  • animeID
  • title_synonyms
  • members
  • synopsis
  • related
In [ ]:
translated_anime.drop(['Unnamed: 0','animeID','title_synonyms','members','synopsis','related'],axis = 1, inplace = True)
In [ ]:
translated_anime.head(3)
Out[ ]:
name title_english title_japanese type source producers genre studio episodes status ... aired duration rating score scored_by rank popularity favorites premiered broadcast
0 Cowboy Bebop Cowboy Bebop カウボーイビバップ TV Original ['Bandai Visual'] ['Action', 'Adventure', 'Comedy', 'Drama', 'Sc... ['Sunrise'] 26.0 Finished Airing ... {'from': '1998-04-03T00:00:00+00:00', 'to': '1... 24 min per ep R - 17+ (violence & profanity) 8.81 405664.0 26.0 39.0 43460.0 Spring 1998 Saturdays at 01:00 (JST)
2 Trigun Trigun トライガン TV Manga ['Victor Entertainment'] ['Action', 'Sci-Fi', 'Adventure', 'Comedy', 'D... ['Madhouse'] 26.0 Finished Airing ... {'from': '1998-04-01T00:00:00+00:00', 'to': '1... 24 min per ep PG-13 - Teens 13 or older 8.30 212537.0 255.0 146.0 10432.0 Spring 1998 Thursdays at 01:15 (JST)
3 Witch Hunter Robin Witch Hunter Robin Witch Hunter ROBIN TV Original ['Bandai Visual'] ['Action', 'Magic', 'Police', 'Supernatural', ... ['Sunrise'] 26.0 Finished Airing ... {'from': '2002-07-02T00:00:00+00:00', 'to': '2... 25 min per ep PG-13 - Teens 13 or older 7.33 32837.0 2371.0 1171.0 537.0 Summer 2002 Tuesdays at Unknown

3 rows × 21 columns

In [ ]:
translated_anime.columns
Out[ ]:
Index(['name', 'title_english', 'title_japanese', 'type', 'source',
       'producers', 'genre', 'studio', 'episodes', 'status', 'airing', 'aired',
       'duration', 'rating', 'score', 'scored_by', 'rank', 'popularity',
       'favorites', 'premiered', 'broadcast'],
      dtype='object')
In [ ]:
translated_anime.drop(['status','aired','airing'], axis=1, inplace=True)
In [ ]:
translated_anime.head(3)
Out[ ]:
name title_english title_japanese type source producers genre studio episodes duration rating score scored_by rank popularity favorites premiered broadcast
0 Cowboy Bebop Cowboy Bebop カウボーイビバップ TV Original ['Bandai Visual'] ['Action', 'Adventure', 'Comedy', 'Drama', 'Sc... ['Sunrise'] 26.0 24 min per ep R - 17+ (violence & profanity) 8.81 405664.0 26.0 39.0 43460.0 Spring 1998 Saturdays at 01:00 (JST)
2 Trigun Trigun トライガン TV Manga ['Victor Entertainment'] ['Action', 'Sci-Fi', 'Adventure', 'Comedy', 'D... ['Madhouse'] 26.0 24 min per ep PG-13 - Teens 13 or older 8.30 212537.0 255.0 146.0 10432.0 Spring 1998 Thursdays at 01:15 (JST)
3 Witch Hunter Robin Witch Hunter Robin Witch Hunter ROBIN TV Original ['Bandai Visual'] ['Action', 'Magic', 'Police', 'Supernatural', ... ['Sunrise'] 26.0 25 min per ep PG-13 - Teens 13 or older 7.33 32837.0 2371.0 1171.0 537.0 Summer 2002 Tuesdays at Unknown
In [ ]:
translated_anime.columns
Out[ ]:
Index(['name', 'title_english', 'title_japanese', 'type', 'source',
       'producers', 'genre', 'studio', 'episodes', 'duration', 'rating',
       'score', 'scored_by', 'rank', 'popularity', 'favorites', 'premiered',
       'broadcast'],
      dtype='object')

Splitting the premiered column into premiered_season and premiered_year for better understanding

In [ ]:
translated_anime[['premiered_season', 'premiered_year']] = translated_anime['premiered'].str.split(' ', n=1, expand=True)
translated_anime.drop(['premiered'], axis=1, inplace=True)
translated_anime.head(3)
Out[ ]:
name title_english title_japanese type source producers genre studio episodes duration rating score scored_by rank popularity favorites broadcast premiered_season premiered_year
0 Cowboy Bebop Cowboy Bebop カウボーイビバップ TV Original ['Bandai Visual'] ['Action', 'Adventure', 'Comedy', 'Drama', 'Sc... ['Sunrise'] 26.0 24 min per ep R - 17+ (violence & profanity) 8.81 405664.0 26.0 39.0 43460.0 Saturdays at 01:00 (JST) Spring 1998
2 Trigun Trigun トライガン TV Manga ['Victor Entertainment'] ['Action', 'Sci-Fi', 'Adventure', 'Comedy', 'D... ['Madhouse'] 26.0 24 min per ep PG-13 - Teens 13 or older 8.30 212537.0 255.0 146.0 10432.0 Thursdays at 01:15 (JST) Spring 1998
3 Witch Hunter Robin Witch Hunter Robin Witch Hunter ROBIN TV Original ['Bandai Visual'] ['Action', 'Magic', 'Police', 'Supernatural', ... ['Sunrise'] 26.0 25 min per ep PG-13 - Teens 13 or older 7.33 32837.0 2371.0 1171.0 537.0 Tuesdays at Unknown Summer 2002
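After a string split, premiered_year still holds text such as '1998' rather than a number. A possible sketch of the split followed by a numeric conversion, on invented rows (the use of pd.to_numeric and the nullable Int64 dtype is an assumption for illustration, not something the notebook does):

```python
import pandas as pd

toy = pd.DataFrame({"premiered": ["Spring 1998", "Summer 2002", "Fall 2004"]})

# Split once on the first space: season on the left, year on the right.
parts = toy["premiered"].str.split(" ", n=1, expand=True)
toy["premiered_season"] = parts[0]

# Convert the year text to integers; values that cannot be parsed become <NA>.
toy["premiered_year"] = pd.to_numeric(parts[1], errors="coerce").astype("Int64")

toy = toy.drop(columns=["premiered"])
print(toy)
print(toy.dtypes)
```

A numeric year column sorts and bins correctly on a chart axis, which a text column may not.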
In [ ]:
translated_anime.dtypes
Out[ ]:
name                 object
title_english        object
title_japanese       object
type                 object
source               object
producers            object
genre                object
studio               object
episodes            float64
duration             object
rating               object
score               float64
scored_by           float64
rank                float64
popularity          float64
favorites           float64
broadcast            object
premiered_season     object
premiered_year       object
dtype: object
In [ ]:
translated_anime.isnull().sum()
Out[ ]:
name                0
title_english       0
title_japanese      0
type                0
source              0
producers           0
genre               0
studio              0
episodes            0
duration            0
rating              0
score               0
scored_by           0
rank                0
popularity          0
favorites           0
broadcast           0
premiered_season    0
premiered_year      0
dtype: int64
In [ ]:
translated_anime.shape[0]
Out[ ]:
3295
In [ ]:
translated_anime['producers'].unique()
Out[ ]:
array(["['Bandai Visual']", "['Victor Entertainment']",
       "['TV Tokyo', 'Dentsu']", ...,
       "['Studio Mausu', 'Namu Animation']",
       "['DAX Production', 'Twin Planet']", "['Polygon Magic']"],
      dtype=object)

Editing Producer Column by extracting value from list

We first use the ast.literal_eval() function to convert each string representation of a list into an actual list of strings. Then we use the .str accessor to get the first element of each list, so only the first listed producer is kept.

In [ ]:
import ast

# Convert 'producers' column to list data type
translated_anime['producers'] = translated_anime['producers'].apply(lambda x: ast.literal_eval(x) if isinstance(x, str) else x)

# Extract the first element from each list in 'producers' column
translated_anime['producers'] = translated_anime['producers'].str[0]
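To see what these two steps do on individual values, here is a small sketch on invented strings. The series `raw` is illustrative; the '[]' entry is included to show what happens with an empty list.

```python
import ast
import pandas as pd

# String representations of lists, as stored in the CSV.
raw = pd.Series(["['Bandai Visual']", "['TV Tokyo', 'Dentsu']", "[]"])

# Parse each string into a real Python list.
parsed = raw.apply(ast.literal_eval)

# .str[0] takes the first element of each list; an empty list yields NaN.
first = parsed.str[0]
print(first)
```

Rows whose list is empty end up as missing values after this step, which would be consistent with producers and studio showing nulls later even though they had none before the conversion.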
In [ ]:
translated_anime.head(3)
Out[ ]:
name title_english title_japanese type source producers genre studio episodes duration rating score scored_by rank popularity favorites broadcast premiered_season premiered_year
0 Cowboy Bebop Cowboy Bebop カウボーイビバップ TV Original Bandai Visual ['Action', 'Adventure', 'Comedy', 'Drama', 'Sc... ['Sunrise'] 26.0 24 min per ep R - 17+ (violence & profanity) 8.81 405664.0 26.0 39.0 43460.0 Saturdays at 01:00 (JST) Spring 1998
2 Trigun Trigun トライガン TV Manga Victor Entertainment ['Action', 'Sci-Fi', 'Adventure', 'Comedy', 'D... ['Madhouse'] 26.0 24 min per ep PG-13 - Teens 13 or older 8.30 212537.0 255.0 146.0 10432.0 Thursdays at 01:15 (JST) Spring 1998
3 Witch Hunter Robin Witch Hunter Robin Witch Hunter ROBIN TV Original Bandai Visual ['Action', 'Magic', 'Police', 'Supernatural', ... ['Sunrise'] 26.0 25 min per ep PG-13 - Teens 13 or older 7.33 32837.0 2371.0 1171.0 537.0 Tuesdays at Unknown Summer 2002

Assigning Genre

Each anime has a list of genres, which might lead to confusion, so we use the random library to assign a single randomly chosen genre to each anime.

We first use the ast.literal_eval() function to convert each string representation of a list into an actual list of strings. Then we use the random.randint() function to generate a random integer between 0 and the length of the genre list minus one (since indexing starts at zero). Finally, we use this random integer to select a genre from the list by index and assign it to the 'genre' column.

Note that we use if x to check whether a row's genre list is empty; if it is, we assign an empty string to that row's 'genre' value.

In [ ]:
import random

# Convert 'genre' column to list data type
translated_anime['genre'] = translated_anime['genre'].apply(lambda x: ast.literal_eval(x) if isinstance(x, str) else x)

# Randomly select a genre from each list in 'genre' column
translated_anime['genre'] = translated_anime['genre'].apply(lambda x: x[random.randint(0, len(x)-1)] if x else '')
translated_anime.head(3)
Out[ ]:
name title_english title_japanese type source producers genre studio episodes duration rating score scored_by rank popularity favorites broadcast premiered_season premiered_year
0 Cowboy Bebop Cowboy Bebop カウボーイビバップ TV Original Bandai Visual Sci-Fi ['Sunrise'] 26.0 24 min per ep R - 17+ (violence & profanity) 8.81 405664.0 26.0 39.0 43460.0 Saturdays at 01:00 (JST) Spring 1998
2 Trigun Trigun トライガン TV Manga Victor Entertainment Sci-Fi ['Madhouse'] 26.0 24 min per ep PG-13 - Teens 13 or older 8.30 212537.0 255.0 146.0 10432.0 Thursdays at 01:15 (JST) Spring 1998
3 Witch Hunter Robin Witch Hunter Robin Witch Hunter ROBIN TV Original Bandai Visual Magic ['Sunrise'] 26.0 25 min per ep PG-13 - Teens 13 or older 7.33 32837.0 2371.0 1171.0 537.0 Tuesdays at Unknown Summer 2002
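Because the genre is picked at random, each run of the cell can assign different genres. A minimal sketch of a reproducible variant, assuming a fixed seed is acceptable (the seed value 42 and the helper `pick_genre` are made up for illustration):

```python
import random
import pandas as pd

# Already-parsed genre lists (values are invented).
genres = pd.Series([["Action", "Sci-Fi", "Comedy"], ["Magic"], []])

def pick_genre(options, rng):
    """Return one random genre, or an empty string for an empty list."""
    return rng.choice(options) if options else ""

# Two generators with the same seed produce the same picks.
rng_a = random.Random(42)
picked_a = genres.apply(lambda g: pick_genre(g, rng_a))

rng_b = random.Random(42)
picked_b = genres.apply(lambda g: pick_genre(g, rng_b))

print(picked_a.tolist())
print(picked_a.tolist() == picked_b.tolist())
```

rng.choice(options) is equivalent in effect to indexing with random.randint(0, len(x)-1), and a dedicated random.Random instance keeps the seed from affecting other code.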

Assigning Studio

In [ ]:
# Convert 'studio' column to list data type
translated_anime['studio'] = translated_anime['studio'].apply(lambda x: ast.literal_eval(x) if isinstance(x, str) else x)

# Extract the first element from each list in 'studio' column
translated_anime['studio'] = translated_anime['studio'].str[0]
In [ ]:
translated_anime.head(3)
Out[ ]:
name title_english title_japanese type source producers genre studio episodes duration rating score scored_by rank popularity favorites broadcast premiered_season premiered_year
0 Cowboy Bebop Cowboy Bebop カウボーイビバップ TV Original Bandai Visual Sci-Fi Sunrise 26.0 24 min per ep R - 17+ (violence & profanity) 8.81 405664.0 26.0 39.0 43460.0 Saturdays at 01:00 (JST) Spring 1998
2 Trigun Trigun トライガン TV Manga Victor Entertainment Sci-Fi Madhouse 26.0 24 min per ep PG-13 - Teens 13 or older 8.30 212537.0 255.0 146.0 10432.0 Thursdays at 01:15 (JST) Spring 1998
3 Witch Hunter Robin Witch Hunter Robin Witch Hunter ROBIN TV Original Bandai Visual Magic Sunrise 26.0 25 min per ep PG-13 - Teens 13 or older 7.33 32837.0 2371.0 1171.0 537.0 Tuesdays at Unknown Summer 2002
In [11]:
translated_anime.isnull().sum()
Out[11]:
Unnamed: 0            0
name                  0
title_english         0
title_japanese        0
type                  0
source                0
producers           683
genre                 0
studio              405
episodes              0
duration              0
rating                0
score                 0
scored_by             0
rank                  0
popularity            0
favorites             0
broadcast             0
premiered_season      0
premiered_year        0
dtype: int64

The producers and studio columns have 683 and 405 missing values respectively, so let's drop those rows.

In [12]:
# drop rows with missing values only in specific columns
translated_anime.dropna(subset=['producers','studio'], inplace=True)

Saving Dataset in CSV Format

In [13]:
df_anime_preprocessed = translated_anime.copy()
df_anime_preprocessed.to_csv('anime_dataset.csv')

We will upload this anime_dataset.csv to our GitHub repo at https://github.com/vaibhavhariramani/Anime_Dataset_Visualisation

Loading Dataset from GitHub

In [1]:
import pandas as pd
In [8]:
url = "https://raw.githubusercontent.com/vaibhavhariramani/Anime_Dataset_Visualisation/main/anime_dataset.csv"
df_anime_preprocessed = pd.read_csv(url)
In [9]:
translated_anime.dtypes
Out[9]:
Unnamed: 0            int64
name                 object
title_english        object
title_japanese       object
type                 object
source               object
producers            object
genre                object
studio               object
episodes            float64
duration             object
rating               object
score               float64
scored_by           float64
rank                float64
popularity          float64
favorites           float64
broadcast            object
premiered_season     object
premiered_year        int64
dtype: object
In [10]:
translated_anime.isnull().sum()
Out[10]:
Unnamed: 0            0
name                  0
title_english         0
title_japanese        0
type                  0
source                0
producers           683
genre                 0
studio              405
episodes              0
duration              0
rating                0
score                 0
scored_by             0
rank                  0
popularity            0
favorites             0
broadcast             0
premiered_season      0
premiered_year        0
dtype: int64