Skip to content

A Medical microdata Scrapper, meticulously crafted in Java to extract and process valuable information from various sources.

License

Notifications You must be signed in to change notification settings

fkitsantas/Java-Medical-microdata-Scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Java-Medical-microdata-Scraper

Java-Medical-microdata-Scrapper is a Java web scraping tool specifically designed for extracting medical microdata from web pages. This tool allows you to scrape information such as company name, drug name, drug class, indications, dosage form, side effects, and warnings from target URLs. The extracted data can be printed to the console and saved in an XML format for further analysis or storage.

Installation

To use Java-Medical-microdata-Scraper, follow these steps:

  1. Clone the repository to your local machine:

git clone https://github.com/fkitsantas/Java-Medical-microdata-Scrapper.git

  1. Import the project into your preferred Java IDE.

  2. Build the project to resolve any dependencies.

  3. Run the MedDataScraper class to execute the scraper.

Usage

The scraper is designed to extract medical microdata from specific web pages. By default, it is set to scrape data from the URL http://linter.structured-data.org/examples/schema.org/Drug-TreatmentIndication-MedicalContraindication-273-rdfa.html. You can modify the URL in the code to scrape data from your desired source.

The extracted data will be printed to the console, displaying drug information including drug name, company name, drug class, indications, dosage form, side effects, and warnings.

The scraped data will also be saved in an XML format. The XML file will be named Scraped_Medical.xml and will be stored in the project directory.

Dependencies

Java-Medical-microdata-Scrapper tool requires the following dependencies:

  • Java 8 or above
  • DOM API
  • XML API

These dependencies are included in the standard Java Development Kit (JDK) libraries, so there's no need to install any additional packages.

About

A Medical microdata Scrapper, meticulously crafted in Java to extract and process valuable information from various sources.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages