This Jupyter Notebook contains an analysis of a Netflix dataset, exploring various aspects of the available movies and TV shows.
This project aims to provide insights into the Netflix catalog, answering questions such as:
- Who are the most prolific directors on Netflix?
- What are the most common genres for movies and TV shows?
- Which countries produce the most content?
- How can we filter and analyze the dataset based on specific criteria (e.g., finding movies with Tom Cruise)?
The dataset used in this analysis is assumed to be a CSV file containing information about Netflix shows, including:
- Show ID
- Category (Movie or TV Show)
- Title
- Director
- Cast
- Country
- Release Year
- Rating
- Duration
- Listed In (Genres)
- Description
- Data Loading: The dataset is loaded into a Pandas DataFrame.
- Exploratory Data Analysis (EDA): Basic statistics and visualizations are used to understand the dataset's structure and content.
- Specific Queries: The notebook demonstrates how to filter and analyze the data based on specific criteria, such as:
- Finding the top 10 directors with the most content.
- Identifying movies categorized as comedies from the United Kingdom.
- Locating movies featuring Tom Cruise in the cast.
- Python 3.x
- Pandas
- (Potentially other libraries depending on the visualizations used)
- Install the required dependencies:
!pip install pandas
- Open the Jupyter Notebook.
- Run the cells to perform the analysis.
- More in-depth visualizations could be added to enhance the analysis.
- Additional filtering and analysis criteria could be explored based on specific research questions.
- The analysis could be extended to include sentiment analysis of show descriptions or user reviews.
Contributions to this project are welcome! If you find any issues or have suggestions for improvements, feel free to open an issue or submit a pull request.