FanFic Explorer

This Python script scrapes data from the "Archive of Our Own" (AO3) website, focusing on fan fiction works related to movies and games. It retrieves detailed information about each work (called an "article" below), including its author, summary, tags, and more.

Features

Retrieve a list of movies and games.
Search for specific movies or games through local or online search.
Obtain detailed information about related articles, including:
- Article title and URL
- Author's name and URL
- Tags associated with the article (genre, warnings, etc.)
- Summary of the article
- Detailed metadata and chapter content
Save the scraped data as JSON files.
Convert the extracted data into pandas DataFrames for easy manipulation and analysis.

Requirements

Python 3.x
requests
BeautifulSoup4
tqdm
pandas

You can install the required packages using pip:

pip install requests beautifulsoup4 tqdm pandas

Usage

Clone the repository

Start by cloning this repository to your local machine using the following command:
```
git clone https://github.com/heib6xinyu/FanFic-Explorer-Exploring-Fan-Fiction-Works-from-AO3.git
```
Then navigate to the directory where the repository was cloned.

Running the script

To run the script, use the following command in your terminal:
```
python main.py
```
When executed, the script first fetches the lists of movies and games. You can then search these lists by calling the local_search function with a search keyword.

Example:
```
results = local_search('The Witcher')
```
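
To illustrate what a local search over the fetched lists might look like, here is a minimal sketch. It assumes the lists are held in a dictionary mapping names to URLs. The function name local_search_sketch, the catalog structure, and the example.org URLs are hypothetical and are not taken from main.py, whose actual implementation may differ.
```python
def local_search_sketch(keyword, catalog):
    """Return (name, url) pairs whose name contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [(name, url) for name, url in catalog.items() if needle in name.casefold()]


# Hypothetical catalog; the real script builds its lists by fetching them from AO3.
catalog = {
    "The Witcher (Video Game)": "https://example.org/tags/witcher-game",
    "The Witcher (TV)": "https://example.org/tags/witcher-tv",
    "Portal (Video Game)": "https://example.org/tags/portal",
}

matches = local_search_sketch("the witcher", catalog)
for name, url in matches:
    print(name, url)
```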
To scrape data related to a specific movie or game, use the get_all_info function with the appropriate parameters.

Example:
```
related_articles, related_articles_detail = get_all_info(name, url, num)
```

Data storage

The script saves the detailed information of the articles in JSON format in the working directory. These files are named related_articles.json and related_articles_detail.json.
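
As a rough sketch of how such a file can be written and read back with the standard json module: the record fields below (title, url, author) and the helper names save_json and load_json are assumptions for illustration, and the sketch writes to a temporary directory rather than the working directory so that it does not overwrite real output.
```python
import json
import tempfile
from pathlib import Path

# Hypothetical records; the real field names are defined by main.py.
related_articles = [
    {"title": "Example Work", "url": "https://example.org/works/1", "author": "example_author"},
]


def save_json(records, path):
    """Write records to path as UTF-8 JSON, keeping non-ASCII characters readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def load_json(path):
    """Read a JSON file back into Python objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


out_path = Path(tempfile.mkdtemp()) / "related_articles.json"
save_json(related_articles, out_path)
loaded = load_json(out_path)
```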

Data analysis

You can convert the retrieved data into a pandas DataFrame for further analysis or export it into different formats (like CSV or Excel) using pandas functionalities.
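
A minimal sketch of that conversion, assuming the scraped data is a list of dictionaries with one dictionary per article. The field names (title, author, tags) are hypothetical, so adjust them to match the JSON the script actually produces.
```python
import pandas as pd

# Hypothetical records standing in for the contents of related_articles.json.
records = [
    {"title": "Example Work A", "author": "author_one", "tags": ["Fluff", "Humor"]},
    {"title": "Example Work B", "author": "author_two", "tags": ["Angst"]},
]

df = pd.DataFrame(records)

# Derive a simple numeric column from the list-valued tags column.
df["tag_count"] = df["tags"].apply(len)

# With no path argument, to_csv returns the CSV as a string; pass a filename to write a file.
csv_text = df.drop(columns=["tags"]).to_csv(index=False)
print(csv_text)
```

Passing a filename such as "articles.csv" to to_csv writes the file to disk instead of returning a string.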

Caution

This script is for educational purposes only. Please respect Archive of Our Own's Terms of Service. Do not use this script to overload their servers or infringe on the privacy of content creators.
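
One common way to avoid overloading a server is to pause between consecutive requests. The sketch below illustrates the idea. The function name fetch_politely, its parameters, and the five-second delay are assumptions for illustration and are not part of main.py or of any AO3 guidance. The fetch function is passed in, so the example runs without network access.
```python
import time


def fetch_politely(urls, fetch, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        results.append(fetch(url))
    return results


# Demonstration with a stub fetch and a recording sleep, so nothing is downloaded.
pauses = []
pages = fetch_politely(
    ["https://example.org/a", "https://example.org/b", "https://example.org/c"],
    fetch=lambda url: f"<html>{url}</html>",
    delay_seconds=5.0,
    sleep=pauses.append,
)
print(len(pages), pauses)
```

In real use, fetch could be something like lambda url: requests.get(url, timeout=30).text, with the default time.sleep left in place.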
