Skip to content

Using a large language model and exploratory data analysis to analyse the customer side of health insurance transcripts

Notifications You must be signed in to change notification settings

Tai-O/call-analytics

Repository files navigation

Customer Call Analytics

This project provides tools for analyzing customer service call transcripts using various NLP techniques including sentiment analysis, topic modeling, and LLM-based analysis.

Features

  • Sentiment analysis of customer conversations
  • Topic modeling to identify key themes
  • Word frequency analysis
  • Custom LLM analysis using Azure OpenAI
  • Vector database integration for efficient transcript retrieval

Project Structure

call_analytics/
├── src/                    # Source code
│   ├── data_loader.py     # Transcript loading and processing
│   ├── sentiment_analyzer.py # Sentiment analysis
│   ├── topic_modeling.py  # Topic modeling
│   ├── word_analysis.py   # Word frequency analysis
│   └── llm_analyzer.py    # LLM integration
├── config/                 # Configuration files
│   └── config.py          # Project configuration
└── utils/                  # Utility functions
    └── text_processing.py  # Text preprocessing utilities

Installation

  1. Clone the repository:
git clone https://github.com/Tai-O/call-analytics.git
cd call-analytics
  1. Create a virtual environment:
python -m venv venv
source venv/bin/activate
  1. Install dependencies:
pip install -r requirements.txt
  1. Download required NLTK data:
import nltk
nltk.download('all')

Configuration

Get Open AI API KEY

  • Create an account on https://platform.openai.com/api-keys and get an API KEY (Needed to access GPT 3.5 turbo model)
  • Open command prompt (Windows)
  •   setx OPENAI_API_KEY "api-key-here"
  • Permanent setup: To make the setup permanent, add the variable through the system properties as follows:

Right-click on 'This PC' or 'My Computer' and select 'Properties'.

  • Click on 'Advanced system settings'.
  • Click the 'Environment Variables' button.
  • In the 'System variables' section, click 'New...' and enter OPENAI_API_KEY as the variable name and your API key as the variable value.
  • Verification: To verify the setup, reopen the command prompt and type the command below. It should display your API key: echo %OPENAI_API_KEY%

Update the Azure OpenAI credentials in config/config.py:

AZURE_CONFIG = {
    "endpoint": "endpoint",
    "deployment": "deployment",
    "api_key": "api-key"
}

Usage

Basic Analysis

from src.data_loader import read_transcripts, extract_member_lines
from src.sentiment_analyzer import analyze_sentiment, plot_sentiment_distribution

# Load transcripts
transcripts = read_transcripts('path/to/transcripts')
member_lines = extract_member_lines(transcripts)

# Analyze sentiment
sentiment_scores = [(line, analyze_sentiment(line)) for line in member_lines]
plot_sentiment_distribution(sentiment_scores)

LLM Analysis

from src.llm_analyzer import LLMAnalyzer
from config.config import AZURE_CONFIG, LLM_PROMPT_TEMPLATE

# Initialize LLM analyzer
analyzer = LLMAnalyzer(
    azure_endpoint=AZURE_CONFIG["endpoint"],
    api_key=AZURE_CONFIG["api_key"]
)

# Create vector database
collection = analyzer.create_vector_db("transcripts", member_lines)

# Generate insights
query = "Identify the sentiment of the call"
context = analyzer.query_vector_db(collection, query)
response = analyzer.generate_response(
    LLM_PROMPT_TEMPLATE.format(query, context)
)

About

Using a large language model and exploratory data analysis to analyse the customer side of health insurance transcripts

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published