diff --git a/README.md b/README.md index c517b95..6419ff8 100644 --- a/README.md +++ b/README.md @@ -243,6 +243,19 @@ Explore our extensive list of GenAI agent implementations, ranging from simple t #### Implementation 🛠️ • Implement a multi-step process involving question anonymization, high-level planning, task breakdown, adaptive information retrieval and question answering, continuous re-planning, and rigorous answer verification to ensure grounded and accurate responses. +20. **[Social Media Manager Agent 📸](https://github.com/NirDiamant/GenAI_Agents/blob/main/all_agents_tutorials/Social_Media_manager_langgraph.ipynb)** + + #### Overview 🔎 + A sophisticated Social Media Manager agent built with LangGraph that automates content sourcing and LinkedIn post creation. The agent supports multiple content sources including Towards Data Science articles, Reddit posts, YouTube transcripts, audio transcriptions, and LinkedIn profiles. It uses a state-based workflow to process content, generate engaging posts in a consistent style, and can optionally publish directly to LinkedIn using browser automation. + + #### Implementation 🛠️ + Utilizes LangGraph's StateGraph to orchestrate a multi-step workflow: + 1. Content Router - Analyzes user intent to determine appropriate content source + 2. Content Fetching - Dedicated nodes for each source (YouTube, Reddit, TDS, etc.) + 3. Post Generation - Uses GPT models to transform source content into LinkedIn-style posts + 4. LinkedIn Integration - Optional automated posting using Playwright for browser automation + + The system maintains state throughout the process and includes error handling and checkpoint management using MemorySaver. 
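The four-step workflow described above can be sketched as a minimal, framework-free pipeline. This is only an illustration of the control flow; the function and key names below are hypothetical and not the notebook's actual identifiers:

```python
from typing import Dict

def content_router(state: Dict) -> Dict:
    # Pick a content source from the user's request. A simple keyword match
    # stands in for the LLM-based intent classification used in the notebook.
    text = state["user_input"].lower()
    if "youtube" in text:
        state["source"] = "youtube"
    elif "reddit" in text:
        state["source"] = "reddit"
    else:
        state["source"] = "tds"
    return state

def fetch_content(state: Dict) -> Dict:
    # Placeholder fetcher; the real agent has a dedicated node per source.
    state["content"] = f"(content fetched from {state['source']})"
    return state

def generate_post(state: Dict) -> Dict:
    # Placeholder for the GPT-backed post generator.
    state["post"] = f"LinkedIn post based on {state['content']}"
    return state

def run_pipeline(user_input: str) -> Dict:
    # State flows through router -> fetch -> generate, as in the workflow above.
    state: Dict = {"user_input": user_input}
    for step in (content_router, fetch_content, generate_post):
        state = step(state)
    return state

result = run_pipeline("Make a post from this YouTube video")
print(result["source"])  # prints "youtube"
```

In the actual tutorial, LangGraph's StateGraph replaces this plain loop, adding conditional edges, message-history tracking, and checkpointing via MemorySaver.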
## Getting Started diff --git a/all_agents_tutorials/Social_Media_manager_langgraph.ipynb b/all_agents_tutorials/Social_Media_manager_langgraph.ipynb new file mode 100644 index 0000000..c567fc9 --- /dev/null +++ b/all_agents_tutorials/Social_Media_manager_langgraph.ipynb @@ -0,0 +1,1278 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "s8I5PTPhHdEY" + }, + "source": [ + "# Building a Social Media Manager with LangGraph: A Tutorial\n", + "\n", + "## Overview\n", + "\n", + "This tutorial guides you through creating a Social Media Manager using LangGraph, demonstrating how to build a stateful application that can process content from multiple sources and generate LinkedIn posts. The agent showcases how to handle complex workflows with multiple content sources, AI-powered content generation, and automated social media posting.\n", + "\n", + "## Motivation\n", + "\n", + "Managing social media content creation involves multiple steps: content sourcing, processing, generation, and posting. LangGraph provides an excellent framework for orchestrating these steps in a maintainable way. This Social Media Manager demonstrates how to create a robust workflow that can handle various content sources while maintaining clear separation of concerns.\n", + "\n", + "## Key Components\n", + "\n", + "1. **StateGraph**: Core workflow manager defining the content processing pipeline\n", + "2. **AgentState**: Custom type tracking conversation history, content items, and generated posts\n", + "3. **Content Sources**:\n", + " - YouTube transcription\n", + " - Reddit summaries\n", + " - Towards Data Science articles\n", + " - LinkedIn profile posts\n", + " - Audio transcription\n", + "4. **LLM Integration**: Using LLM for content generation and decision making\n", + "5. **LinkedIn Automation**: Automated posting using Playwright\n", + "\n", + "## Method Details\n", + "\n", + "Our Social Media Manager follows a multi-step process:\n", + "\n", + "1. 
**Content Router**:\n", + " - Analyzes user intent to determine content source\n", + " - Routes to appropriate content fetching node\n", + "\n", + "2. **Content Fetching**:\n", + " - Each source has a dedicated node (YouTube, Reddit, etc.)\n", + " - Handles authentication and content extraction\n", + " - Processes content into a consistent format\n", + "\n", + "3. **Post Generation**:\n", + " - Uses GPT-4 to transform source content into LinkedIn-style posts\n", + " - Maintains consistent tone and formatting\n", + "\n", + "4. **LinkedIn Posting**:\n", + " - Automated browser control for posting\n", + " - User confirmation before posting\n", + " - Error handling and session management\n", + "\n", + "The workflow is managed by LangGraph, ensuring proper state transitions and error handling throughout the process.\n", + "\n", + "\n", + "## Conclusion\n", + "\n", + "This Social Media Manager demonstrates LangGraph's capability to handle complex, real-world applications. The graph-based structure provides clear separation of concerns, making it easy to add new content sources or modify the posting workflow. The integration of AI for content generation and decision-making showcases how modern language models can be effectively incorporated into practical applications.\n", + "\n", + "This example serves as a foundation for developers looking to build sophisticated content management systems, showing how to handle multiple data sources, state management, and automated social media interactions within a structured framework." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BX9gpUclNO3v" + }, + "source": [ + "# Install dependencies" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "E2Fze8AGNabY", + "outputId": "2bb6f3f3-448e-4327-d3c2-c8cdb548045e" + }, + "outputs": [], + "source": [ + "!pip install beautifulsoup4 langchain_groq langchain_core langchain_openai langgraph selenium rich langchain_anthropic whisper praw python-dotenv youtube_transcript_api gradio langserve PyPDF2 playwright lxml sse_starlette\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "yXUZY6wgxMSl", + "outputId": "527d16d2-9620-4187-856a-5ecf92f55397" + }, + "outputs": [], + "source": [ + "# Install Playwright browsers and system dependencies\n", + "!playwright install\n", + "\n", + "!apt-get update\n", + "!apt-get install -y \\\n", + " libwoff1 \\\n", + " libharfbuzz-icu0 \\\n", + " libgstreamer-plugins-base1.0-0 \\\n", + " libgstreamer-gl1.0-0 \\\n", + " libgstreamer-plugins-bad1.0-0 \\\n", + " libenchant-2-2 \\\n", + " libsecret-1-0 \\\n", + " libhyphen0 \\\n", + " libmanette-0.2-0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3K-wcxINIRPa" + }, + "source": [ + "### Setup and Imports\n", + "First, let's import the necessary modules and set up our environment. Create a `.env` file in the root of the project and add your keys. If you're using Groq, make sure to set the GROQ_API_KEY environment variable; note that the current model selection only supports an 8k context window. 
Use OpenAI for the best results.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Ra2FHEPcG6M3" + }, + "outputs": [], + "source": [ + "# Standard library imports\n", + "import asyncio\n", + "import os\n", + "import re\n", + "from typing import Any, Annotated, Dict, List, Optional, Tuple\n", + "from uuid import uuid4\n", + "\n", + "# Third-party imports\n", + "from bs4 import BeautifulSoup\n", + "from dotenv import load_dotenv\n", + "from IPython.display import display, Image\n", + "from langchain_core.messages import AIMessage, AnyMessage, HumanMessage, SystemMessage, ToolMessage\n", + "from langchain_core.runnables.graph import MermaidDrawMethod\n", + "from langchain_openai import ChatOpenAI\n", + "from langgraph.checkpoint.memory import MemorySaver\n", + "from langgraph.graph import END, StateGraph\n", + "from langgraph.graph.message import add_messages\n", + "from openai import OpenAI\n", + "from langchain_groq import ChatGroq\n", + "from playwright.async_api import (\n", + " Page,\n", + " TimeoutError,\n", + " async_playwright,\n", + " Browser,\n", + " Playwright\n", + ")\n", + "import praw\n", + "from pydantic import BaseModel, Field\n", + "import requests\n", + "from youtube_transcript_api import YouTubeTranscriptApi\n", + "\n", + "# Load environment variables\n", + "load_dotenv()\n", + "\n", + "groq_key = os.getenv('GROQ_API_KEY')\n", + "openai_key = os.getenv('OPENAI_API_KEY')\n", + "\n", + "if not groq_key and not openai_key:\n", + " raise ValueError(\"Either GROQ_API_KEY or OPENAI_API_KEY must be set\")\n", + "\n", + "if openai_key:\n", + " print(\"OPENAI_API_KEY is set in environment variables\")\n", + " model = ChatOpenAI(temperature=0, model=\"gpt-4o-2024-08-06\")\n", + " os.environ[\"OPENAI_API_KEY\"] = openai_key\n", + "elif groq_key:\n", + " # Avoid printing the API key itself\n", + " print(\"GROQ_API_KEY is set in environment variables\")\n", + " model = ChatGroq(temperature=0, model=\"llama3-groq-70b-8192-tool-use-preview\", 
base_url=\"https://api.groq.com\")\n", + " os.environ[\"GROQ_API_KEY\"] = groq_key\n", + "\n", + "# LinkedIn configuration (default to '' so a missing variable doesn't raise a TypeError)\n", + "os.environ[\"LINKEDIN_EMAIL\"] = os.getenv('LINKEDIN_EMAIL', '')\n", + "os.environ[\"LINKEDIN_PASSWORD\"] = os.getenv('LINKEDIN_PASSWORD', '')\n", + "os.environ[\"LINKEDIN_PROFILE_NAME\"] = os.getenv('LINKEDIN_PROFILE_NAME', '')\n", + "\n", + "# Langchain configuration\n", + "os.environ[\"LANGCHAIN_TRACING_V2\"] = os.getenv('LANGCHAIN_TRACING_V2', '')\n", + "os.environ[\"LANGCHAIN_API_KEY\"] = os.getenv('LANGCHAIN_API_KEY', '')\n", + "os.environ[\"LANGCHAIN_ENDPOINT\"] = os.getenv('LANGCHAIN_ENDPOINT', '')\n", + "os.environ[\"LANGCHAIN_PROJECT\"] = os.getenv('LANGCHAIN_PROJECT', '')\n", + "\n", + "# PRAW https://praw.readthedocs.io/en/stable/getting_started/quick_start.html\n", + "os.environ[\"PRAW_CLIENT_ID\"] = os.getenv('PRAW_CLIENT_ID', '')\n", + "os.environ[\"PRAW_CLIENT_SECRET\"] = os.getenv('PRAW_CLIENT_SECRET', '')\n", + "os.environ[\"PRAW_USER_AGENT\"] = os.getenv('PRAW_USER_AGENT', '')\n", + "os.environ[\"PRAW_USERNAME\"] = os.getenv('PRAW_USERNAME', '')\n", + "os.environ[\"PRAW_PASSWORD\"] = os.getenv('PRAW_PASSWORD', '')\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pk1CUGevIbTB" + }, + "source": [ + "### Define Pydantic schemas for structured LLM outputs that we will use in the nodes for deterministic routing."
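To make the routing idea concrete before the schemas are defined, here is a minimal sketch of the pattern. The schema, its field names, and the hand-written dict (standing in for what a structured-output LLM call would return) are all illustrative, not the notebook's actual definitions:

```python
from pydantic import BaseModel, Field

class RouteDecision(BaseModel):
    # Mirrors the shape of the decision schemas: a boolean flag plus
    # the model's stated reason for the decision.
    use_youtube: bool = Field(description="Whether to create a post from YouTube")
    reason: str = Field(description="Reason for the decision")

# A structured-output call returns a validated instance; here we simulate
# the raw payload an LLM might produce and validate it ourselves.
raw = {"use_youtube": True, "reason": "The user pasted a YouTube link"}
decision = RouteDecision(**raw)

# Deterministic routing: branch on typed fields instead of parsing free text.
next_node = "transcribe_youtube" if decision.use_youtube else "content_router"
print(next_node)  # prints "transcribe_youtube"
```

Because the output is validated against a schema, the graph's conditional edges can branch on booleans rather than fragile string matching.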
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "x9lr5BCSIa66" + }, + "outputs": [], + "source": [ + "class AudioTranscriptionDecision(BaseModel):\n", + " \"\"\"\n", + " Whether the user wants to create a post from audio\n", + " \"\"\"\n", + " transcribe_audio: bool = Field(description=\"Whether to create a post from audio\")\n", + " reason: str = Field(description=\"Reason for the decision\")\n", + "\n", + "class YoutubeTranscriptionDecision(BaseModel):\n", + " \"\"\"\n", + " Whether the user wants to create a post from youtube\n", + " If the url is not provided, url MUST be set to 'URL NOT PROVIDED'\n", + " \"\"\"\n", + " transcribe_youtube: bool = Field(description=\"Whether to create a post from YouTube\")\n", + " reason: str = Field(description=\"Reason for the decision\")\n", + " url: str = Field(description=\"The YouTube URL to parse. If the url is not provided, url MUST be set to 'URL NOT PROVIDED'\", default=\"URL NOT PROVIDED\")\n", + "\n", + "class RedditSummaryDecision(BaseModel):\n", + " \"\"\"\n", + " Whether the user wants to create a post from reddit\n", + " \"\"\"\n", + " summarize_reddit: bool = Field(description=\"Whether to create a post from Reddit\")\n", + " reason: str = Field(description=\"Reason for the decision\")\n", + "\n", + "class TowardsDataScienceDecision(BaseModel):\n", + " \"\"\"\n", + " Whether the user wants to create a post from towardsdatascience\n", + " \"\"\"\n", + " fetch_tds_articles: bool = Field(description=\"Whether to create a post from Towards Data Science\")\n", + " reason: str = Field(description=\"Reason for the decision\")\n", + "\n", + "class LinkedInProfileDecision(BaseModel):\n", + " \"\"\"\n", + " Whether the user wants to create a post from linkedin profile\n", + " \"\"\"\n", + " fetch_linkedin_posts: bool = Field(description=\"Whether to create a post from LinkedIn profile\")\n", + " reason: str = Field(description=\"Reason for the decision\")\n", + "\n", + "class 
ExitDecision(BaseModel):\n", + " \"\"\"Model for exit decision\"\"\"\n", + " should_exit: bool = Field(description=\"Whether the user wants to exit the application\")\n", + " reason: str = Field(description=\"Reason for the exit decision\")\n", + "\n", + "class UserIntentClassification(BaseModel):\n", + " \"\"\"\n", + " Use this to classify the user's intent, whether they want to create a post from audio, youtube, reddit, or towardsdatascience,\n", + " \"\"\"\n", + " reddit_summary_decision: RedditSummaryDecision\n", + " towards_data_science_decision: TowardsDataScienceDecision\n", + " linkedin_profile_decision: LinkedInProfileDecision\n", + " audio_transcription_decision: AudioTranscriptionDecision\n", + " youtube_transcription_decision: YoutubeTranscriptionDecision\n", + " exit_decision: ExitDecision\n", + "\n", + "\n", + "class YouTubeURLParser(BaseModel):\n", + " \"\"\"\n", + " Parse the YouTube URL from the user's input\n", + " \"\"\"\n", + " url: str = Field(description=\"The YouTube URL to parse\")\n", + "\n", + "class RedditFetchParams(BaseModel):\n", + " \"\"\"\n", + " Parse the number of posts and subreddit from the user's input\n", + " \"\"\"\n", + " post_count: int = Field(description=\"The number of posts to fetch from Reddit\")\n", + " subreddit: str = Field(description=\"The subreddit to fetch posts from\")\n", + "\n", + "\n", + "class ContentItem(BaseModel):\n", + " \"\"\"\n", + " A model representing a single content item.\n", + "\n", + " This class encapsulates a piece of content as a string, which could be text from\n", + " various sources like LinkedIn posts, Reddit posts, or transcribed content.\n", + "\n", + " Attributes:\n", + " content (str): The actual content text\n", + " \"\"\"\n", + " content: str = Field(description=\"The actual content\")\n", + "\n", + "class LinkedInPostDecision(BaseModel):\n", + " \"\"\"\n", + " A model representing a decision about posting content to LinkedIn.\n", + "\n", + " This class contains information about 
whether content should be posted to LinkedIn,\n", + " along with confidence level and reasoning for the decision.\n", + "\n", + " Attributes:\n", + " should_post (bool): Whether the content should be posted to LinkedIn\n", + " confidence (float): A value between 0 and 1 indicating confidence in the decision\n", + " reasoning (str): Detailed explanation for why this decision was made\n", + " \"\"\"\n", + " should_post: bool = Field(description=\"Whether the user wants to post to LinkedIn\")\n", + " confidence: float = Field(description=\"Confidence level of the decision\", ge=0, le=1)\n", + " reasoning: str = Field(description=\"Explanation for the decision\")\n", + "\n", + "class AgentState(BaseModel):\n", + " \"\"\"\n", + " A model representing the current state of the agent.\n", + "\n", + " This class tracks the ongoing conversation, planned actions, collected content,\n", + " and generated posts throughout the agent's execution.\n", + "\n", + " Attributes:\n", + " messages (List[AnyMessage]): History of messages in the conversation\n", + " next_action (Optional[str]): The next action the agent should take\n", + " content_items (Optional[List[ContentItem]]): Collection of content gathered\n", + " generated_posts (Optional[List[str]]): Posts that have been generated\n", + " \"\"\"\n", + " messages: Annotated[List[AnyMessage], add_messages] = Field(default_factory=list)\n", + " next_action: Annotated[Optional[str], Field(default=None)]\n", + " content_items: Annotated[Optional[List[ContentItem]], Field(default=None)]\n", + " generated_posts: Annotated[Optional[List[str]], Field(default=None)]\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "j8tSKCdnIxtb" + }, + "source": [ + "## Browser Actions\n", + "\n", + "This section explains the browser actions used in the Social Media Manager application. 
These actions interact with the LinkedIn website to perform tasks such as logging in and creating posts.\n", + "\n", + "### Initialize Browser\n", + "\n", + "This action starts a Chromium browser instance using Playwright. It sets up a new browser context and a new page within that context.\n", + "This allows us to interact with the browser for web scraping and automated tasks.\n", + "\n", + "### Close Browser\n", + "\n", + "This action closes the browser instance that was previously opened using Playwright. This is important to release resources and prevent the browser from running unnecessarily.\n", + "\n", + "### Login to LinkedIn\n", + "\n", + "This action uses the `LoginPage` class to log in to the LinkedIn website. It navigates to the login page and enters the provided credentials (LinkedIn email and password) to authenticate the user. Once the login process is complete, it waits for the user to be redirected to the LinkedIn feed, confirming a successful login.\n", + "\n", + "\n", + "### Create Post\n", + "\n", + "This action creates a new LinkedIn post using the `FeedPage` class. It first waits for the feed to load, then clicks a button to start the post creation process. Then it fills the provided content into the text area and clicks the \"Post\" button to publish the post on LinkedIn.\n", + "\n", + "### Get LinkedIn Posts\n", + "\n", + "This action retrieves LinkedIn posts from a specific profile using the `ProfilePage` class. It navigates to the user's recent activity page and uses BeautifulSoup to extract text from the posts displayed. The extracted post content is then returned as a list of ContentItem objects. It also handles scrolling the page to load more posts.\n", + "\n", + "These browser actions play a crucial role in the overall functionality of the Social Media Manager. They enable automated interactions with LinkedIn, supporting features like post creation and content extraction. 
Through the Playwright library, the agent can mimic user behavior, performing various actions within the browser environment.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "id": "yhjLD0YsIxJV" + }, + "outputs": [], + "source": [ + "class FeedPage:\n", + " \"\"\"\n", + " Class for interacting with LinkedIn feed page and creating posts.\n", + "\n", + " Attributes:\n", + " page (Page): Playwright page object\n", + " start_post_button (Locator): Locator for the \"Start post\" button\n", + " post_text_area (Locator): Locator for the post text input area\n", + " post_button (Locator): Locator for the \"Post\" button\n", + " \"\"\"\n", + " def __init__(self, page: Page):\n", + " self.page = page\n", + " # Updated selectors\n", + " self.start_post_button = page.locator(\"button[class*='share-box-feed-entry__trigger']\")\n", + " self.post_text_area = page.locator(\"div[role='textbox']\")\n", + " self.post_button = page.locator(\"button[class*='share-actions__primary-action']\")\n", + "\n", + " async def create_post(self, content: str):\n", + " \"\"\"\n", + " Creates a new LinkedIn post with the provided content.\n", + "\n", + " Args:\n", + " content (str): The text content to post\n", + "\n", + " Raises:\n", + " TimeoutError: If any element takes too long to appear\n", + " Exception: If post creation fails for any reason\n", + " \"\"\"\n", + " try:\n", + " print(\"Waiting for feed to load...\")\n", + " await self.wait_for_feed_load()\n", + "\n", + " print(\"Clicking 'Start post' button...\")\n", + " await self.start_post_button.wait_for(state=\"visible\", timeout=10000)\n", + " await self.start_post_button.click()\n", + " await asyncio.sleep(2)\n", + "\n", + " print(\"Waiting for post editor...\")\n", + " await self.page.wait_for_selector(\"div[role='textbox']\", state=\"visible\", timeout=10000)\n", + "\n", + " print(\"Filling post content...\")\n", + " # Try multiple selector strategies\n", + " text_area = await 
self.page.wait_for_selector(\"div[role='textbox'], div[contenteditable='true'], .ql-editor\",\n", + " state=\"visible\",\n", + " timeout=10000)\n", + " if text_area:\n", + " await text_area.fill(content)\n", + " else:\n", + " raise Exception(\"Could not find post text area\")\n", + "\n", + " await asyncio.sleep(2)\n", + "\n", + " print(\"Clicking post button...\")\n", + " post_button = await self.page.wait_for_selector(\n", + " \"button[class*='share-actions__primary-action'], button[class*='share-box_actions']\",\n", + " state=\"visible\",\n", + " timeout=10000\n", + " )\n", + " if post_button:\n", + " await post_button.click()\n", + " else:\n", + " raise Exception(\"Could not find post button\")\n", + "\n", + " # Wait for post to complete\n", + " await asyncio.sleep(5)\n", + " print(\"Post created successfully!\")\n", + "\n", + " except TimeoutError as e:\n", + " print(f\"Timeout error: {str(e)}\")\n", + " await self.page.screenshot(path=\"post_error.png\")\n", + " raise Exception(f\"Failed to create post: Timeout - {str(e)}\")\n", + " except Exception as e:\n", + " print(f\"Error creating post: {str(e)}\")\n", + " await self.page.screenshot(path=\"post_error.png\")\n", + " raise Exception(f\"Failed to create post: {str(e)}\")\n", + "\n", + " async def wait_for_feed_load(self):\n", + " \"\"\"\n", + " Waits for the LinkedIn feed to load by checking for feed indicators.\n", + "\n", + " Raises:\n", + " Exception: If feed fails to load within timeout\n", + " \"\"\"\n", + " try:\n", + " # Wait for multiple possible feed indicators\n", + " await self.page.wait_for_selector(\n", + " \".feed-shared-update-v2, .share-box-feed-entry__trigger\",\n", + " state=\"visible\",\n", + " timeout=15000\n", + " )\n", + " except TimeoutError:\n", + " print(\"Feed load timeout - taking screenshot for debugging\")\n", + " await self.page.screenshot(path=\"feed_load_error.png\")\n", + " raise Exception(\"Feed failed to load: Timeout\")\n", + "\n", + "class LoginPage:\n", + " \"\"\"\n", + 
" Class for handling LinkedIn login process.\n", + "\n", + " Attributes:\n", + " page (Page): Playwright page object\n", + " email_input (Locator): Locator for email input field\n", + " password_input (Locator): Locator for password input field\n", + " login_button (Locator): Locator for login submit button\n", + " pin_input (Locator): Locator for verification code input\n", + " \"\"\"\n", + " def __init__(self, page):\n", + " self.page = page\n", + " self.email_input = page.get_by_label(\"Email or Phone\")\n", + " self.password_input = page.get_by_label(\"Password\")\n", + " self.login_button = page.locator('button[data-litms-control-urn=\"login-submit\"]')\n", + " self.pin_input = page.locator('input[name=\"pin\"]') # For verification code\n", + "\n", + " async def login(self):\n", + " \"\"\"\n", + " Performs LinkedIn login process with credentials from environment variables.\n", + " Handles verification code if required.\n", + "\n", + " Raises:\n", + " Exception: If login fails for any reason\n", + " \"\"\"\n", + " try:\n", + " print(\"Starting LinkedIn login process...\")\n", + " username, password = os.getenv(\"LINKEDIN_EMAIL\"), os.getenv(\"LINKEDIN_PASSWORD\")\n", + "\n", + " print(\"Navigating to LinkedIn login page...\")\n", + " await self.page.goto(\"https://www.linkedin.com/login\", timeout=15000)\n", + " await asyncio.sleep(2)\n", + "\n", + " print(\"Filling login form...\")\n", + " await self.email_input.fill(username)\n", + " await asyncio.sleep(1)\n", + " await self.password_input.fill(password)\n", + " await asyncio.sleep(1)\n", + "\n", + " print(\"Clicking login button...\")\n", + " await self.login_button.click()\n", + "\n", + " # Wait for either verification page or successful login\n", + " try:\n", + " # Check for verification code input\n", + " verification_selector = 'input[name=\"pin\"], input[name=\"verification-code\"]'\n", + " await self.page.wait_for_selector(verification_selector, timeout=10000)\n", + "\n", + " print(\"\\nVerification 
code required!\")\n", + " print(\"Please check your email/phone for the verification code\")\n", + " verification_code = input(\"Enter the verification code: \")\n", + "\n", + " # Fill in verification code\n", + " verification_input = await self.page.query_selector(verification_selector)\n", + " await verification_input.fill(verification_code)\n", + "\n", + " # Click submit button (different selectors possible)\n", + " submit_button = await self.page.query_selector('button[type=\"submit\"]')\n", + " if submit_button:\n", + " await submit_button.click()\n", + "\n", + " # Wait for successful login after verification\n", + " print(\"Waiting for successful login after verification...\")\n", + "\n", + " except TimeoutError:\n", + " # No verification required, continue with normal login flow\n", + " print(\"No verification code required, continuing...\")\n", + "\n", + " # Final check for successful login\n", + " try:\n", + " await self.page.wait_for_function(\n", + " \"\"\"() => {\n", + " return window.location.href.includes('/feed') ||\n", + " window.location.href.includes('/checkpoint') ||\n", + " window.location.href.includes('/home')\n", + " }\"\"\",\n", + " timeout=30000\n", + " )\n", + " print(\"Successfully logged in!\")\n", + "\n", + " except TimeoutError:\n", + " # page.url is a property in Playwright, not a coroutine\n", + " current_url = self.page.url\n", + " print(f\"Login failed. 
Current URL: {current_url}\")\n", + " await self.page.screenshot(path=\"login_error.png\")\n", + " raise Exception(\"Login process timed out\")\n", + "\n", + " except Exception as e:\n", + " print(f\"Login failed with error: {str(e)}\")\n", + " await self.page.screenshot(path=\"login_error.png\")\n", + " raise Exception(f\"Login failed: {str(e)}\")\n", + "\n", + "async def login_to_linkedin(page: Page) -> None:\n", + " \"\"\"Helper function to login to LinkedIn using LoginPage class\"\"\"\n", + " login_page = LoginPage(page)\n", + " await login_page.login()\n", + "\n", + "class ProfilePage:\n", + " \"\"\"\n", + " Class for interacting with LinkedIn profile pages and retrieving posts.\n", + "\n", + " Attributes:\n", + " page (Page): Playwright page object\n", + " base_url (str): Base URL for LinkedIn profiles\n", + " linkedin_profile_name (str): Username of the profile to scrape\n", + " \"\"\"\n", + " def __init__(self, page: Page):\n", + " self.page = page\n", + " self.base_url = \"https://www.linkedin.com/in\"\n", + " # Read the profile name from the environment, falling back to a default\n", + " self.linkedin_profile_name = os.getenv(\"LINKEDIN_PROFILE_NAME\", \"shreyshahh\")\n", + "\n", + " async def get_linkedin_posts(self) -> List[ContentItem]:\n", + " \"\"\"\n", + " Retrieves recent LinkedIn posts from a user's profile.\n", + "\n", + " Returns:\n", + " List[ContentItem]: List of posts as ContentItem objects\n", + " \"\"\"\n", + " await self.page.goto(f\"{self.base_url}/{self.linkedin_profile_name}/recent-activity/all/\")\n", + " await asyncio.sleep(3)\n", + " for _ in range(2):\n", + " await self.page.evaluate(\"window.scrollTo(0, document.body.scrollHeight)\")\n", + " await asyncio.sleep(2)\n", + "\n", + " linkedin_soup = BeautifulSoup(await self.page.content(), \"lxml\")\n", + " containers = [c for c in linkedin_soup.find_all(\"div\", {\"class\": \"feed-shared-update-v2\"})\n", + " if \"activity\" in c.get(\"data-urn\", \"\")]\n", + "\n", + " posts = []\n", + " for i, container in enumerate(containers):\n", + " element = container.find(\"div\", {\"class\": 
\"update-components-text\"})\n", + " if element and element.text.strip():\n", + " posts.append(ContentItem(content=element.text.strip()))\n", + " return posts[:5]\n", + "\n", + "async def initialize_browser(headless: bool = True) -> Tuple[Playwright, Browser, Page]:\n", + " \"\"\"\n", + " Initializes a Playwright browser instance with appropriate settings.\n", + "\n", + " Args:\n", + " headless (bool): Whether to run browser in headless mode\n", + "\n", + " Returns:\n", + " Tuple[Playwright, Browser, Page]: Initialized Playwright objects\n", + " \"\"\"\n", + " playwright = await async_playwright().start()\n", + "\n", + " # Configure browser (use headless=True in Colab, where no display is available)\n", + " browser = await playwright.chromium.launch(\n", + " headless=headless,\n", + " args=[\n", + " '--no-sandbox',\n", + " '--disable-dev-shm-usage',\n", + " '--disable-blink-features=AutomationControlled'\n", + " ]\n", + " )\n", + "\n", + " # Create context with realistic browser settings\n", + " context = await browser.new_context(\n", + " viewport={'width': 1920, 'height': 1080},\n", + " user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36'\n", + " )\n", + "\n", + " page = await context.new_page()\n", + " return playwright, browser, page\n", + "\n", + "async def close_browser(playwright: Playwright, browser: Browser) -> None:\n", + " \"\"\"Closes browser and stops playwright instance\"\"\"\n", + " await browser.close()\n", + " await playwright.stop()\n", + "\n", + "\n", + "def fetch_url_content(url):\n", + " \"\"\"\n", + " Fetches content from a URL using requests.\n", + "\n", + " Args:\n", + " url (str): URL to fetch content from\n", + "\n", + " Returns:\n", + " Optional[bytes]: Response content if successful, None 
otherwise\n", + " \"\"\"\n", + " response = requests.get(url)\n", + " return response.content if response.status_code == 200 else None\n", + "\n", + "def parse_article_content(article_url):\n", + " \"\"\"\n", + " Parses article content from a URL, removing unwanted elements.\n", + "\n", + " Args:\n", + " article_url (str): URL of article to parse\n", + "\n", + " Returns:\n", + " Optional[str]: Cleaned article content if successful, None otherwise\n", + " \"\"\"\n", + " if not (content := fetch_url_content(article_url)):\n", + " return None\n", + "\n", + " article_soup = BeautifulSoup(content, \"html.parser\")\n", + " full_content = \"\\n\".join(p.get_text() for p in article_soup.find_all(\"p\"))\n", + "\n", + " unwanted = [\"Sign up\\nSign in\\nSign up\\nSign in\\nMariya Mansurova\\nFollow\\nTowards Data Science\\n--\\nListen\\nShare\",\n", + " \"\"\"\\n--\\n--\\nTowards Data Science\\nData & Product Analytics Lead at Wise | ClickHouse Evangelist\\nHelp\\nStatus\\nAbout\\nCareers\\nPress\\nBlog\\nPrivacy\\nTerms\\nText to speech\\nTeams\"\"\"]\n", + "\n", + " return re.sub(r\"[^\\x00-\\x7F]+\", \"\", \"\".join(full_content.replace(s, \"\") for s in unwanted))\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "feinrQ2NRLz1" + }, + "source": [ + "# Define prompts for summarizing and creating the post.\n", + "\n", + "Provide your writing style to create similar content. Provide at least 5 examples" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "CZpxzx-8RKrv" + }, + "outputs": [], + "source": [ + "\n", + "prompt = \"\"\"\n", + "Analyze these LinkedIn posts:\n", + "\n", + "WRITER POSTS:\n", + "______________________\n", + "A lot of people are not going to like this.\n", + "\n", + "AI employees are taking over phone calls:\n", + "\n", + "This is Bland AI. 
And it's changing everything.\n", + "\n", + "If businesses don't adapt to this new tech,\n", + "they will be left behind.\n", + "\n", + "What does that mean?\n", + "\n", + "→ AI handles millions of calls 24/7.\n", + "→ AI talks in any language or voice.\n", + "→ AI integrates with data systems seamlessly.\n", + "→ AI customized for customer service, HR, or sales.\n", + "\n", + "Bland AI is leading this change. Period.\n", + "\n", + "If you haven't already,\n", + "do it before others.\n", + "\n", + "♻️ Repost this if you think it's the future.\n", + "\n", + "PS: If you want to stay updated with genAI\n", + "\n", + "1. Scroll to the top.\n", + "2. Follow Shrey Shah to never miss a post.\n", + "_____________________________\n", + "\n", + "Create a LinkedIn post about this topic in the same style:\n", + "{topic}\n", + "\n", + "Guidelines:\n", + "- Match the tone and structure\n", + "- Use similar formatting (bullet points, emojis)\n", + "- Include hashtags\n", + "- Add links as [text](url) if provided\n", + "- Focus on the main topic\n", + "- Keep it professional\n", + "\n", + "Write only the final post content.\n", + "\"\"\"\n", + "\n", + "reddit_summarization_prompt = \"\"\"\n", + "Summarize this Reddit content:\n", + "\n", + "Title: {title}\n", + "Content: {body}\n", + "Comments: {comments}\n", + "\n", + "Guidelines:\n", + "- Extract key points and insights\n", + "- Include any links/references\n", + "- Focus on the main topic\n", + "- Make it LinkedIn-friendly\n", + "- 3-5 paragraphs\n", + "- No Reddit-specific terms\n", + "- Professional tone\n", + "\n", + "Write only the final summary.\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "F55QQlS-MQ3T" + }, + "source": [ + "# Define Agent Functions\n", + "Now we'll define the main functions that our agent will use.\n", + "\n", + "### Core Functions:\n", + "\n", + "- **`fetch_tds_articles`:** Retrieves articles from Towards Data Science and prepares them for potential posting.\n", + "- 
**`fetch_linkedin_profile_posts`:** Extracts recent posts from a specified LinkedIn profile.\n", + "- **`transcribe_audio`:** Transcribes audio files (e.g., podcasts, voice recordings) using OpenAI's Whisper.\n", + "- **`transcribe_youtube`:** Transcribes YouTube videos using the YouTubeTranscriptApi.\n", + "- **`summarize_reddit`:** Fetches posts from a given subreddit and generates a concise summary for each one.\n", + "- **`create_post`:** Generates a social media post based on a provided content item (e.g., article, audio transcript, Reddit summary).\n", + "- **`should_post_to_linkedin`:** Determines if the user wishes to publish a specific post to LinkedIn, based on the user's prompt.\n", + "- **`post_to_linkedin`:** Publishes a social media post to the user's LinkedIn profile using the Playwright library to interact with the browser.\n", + "- **`determine_next_action`:** Determines the next action in the workflow based on the current agent state.\n", + "- **`user_intent_classification`:** Classifies user intent from conversation messages using an AI model.\n", + "- **`content_router`:** Routes the workflow based on classified user intent and manages content source selection.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "id": "_Y9svkBTJ-2_" + }, + "outputs": [], + "source": [ + "# Fetch articles from Towards Data Science\n", + "def fetch_tds_articles(state: AgentState) -> AgentState:\n", + " print(\"Fetching articles from Towards Data Science...\")\n", + " page_content = fetch_url_content(\"https://towardsdatascience.com/latest\")\n", + " if not page_content:\n", + " print(\"Failed to fetch TDS content\")\n", + " return {\"content_items\": []}\n", + "\n", + " soup = BeautifulSoup(page_content, \"html.parser\")\n", + " content_items = [ContentItem(content=content) for article in soup.find_all(\"div\", class_=\"postArticle\", limit=5)\n", + " if (link_tag := article.find(\"a\", {\"data-action\": \"open-post\"}))\n", + " and (content 
:= parse_article_content(link_tag[\"href\"]))]\n", + "\n", + " print(f\"Found {len(content_items)} TDS articles\")\n", + " return {\"content_items\": content_items}\n", + "\n", + "async def fetch_linkedin_profile_posts(state: AgentState) -> AgentState:\n", + " print(\"Fetching LinkedIn profile posts...\")\n", + " playwright, browser, page = await initialize_browser(headless=False)\n", + " try:\n", + " await login_to_linkedin(page)\n", + " # Scrape the most recent posts from the profile page\n", + " posts = await ProfilePage(page).get_linkedin_posts()\n", + " print(f\"Found {len(posts)} LinkedIn posts\")\n", + " return {\"content_items\": posts}\n", + " finally:\n", + " await close_browser(playwright, browser)\n", + "\n", + "# Transcribe audio file\n", + "async def transcribe_audio(state: AgentState, audio_file: str = \"./audio.mp3\", openai_model: str = \"whisper-1\") -> AgentState:\n", + " print(f\"Transcribing audio file: {audio_file}\")\n", + " with open(audio_file, \"rb\") as audio:\n", + " transcription = OpenAI().audio.transcriptions.create(model=openai_model, file=audio, response_format=\"text\")\n", + " print(\"Audio transcription complete\")\n", + " return {\"content_items\": [ContentItem(content=transcription)]}\n", + "\n", + "# Transcribe YouTube video\n", + "async def transcribe_youtube(state: AgentState) -> AgentState:\n", + " print(\"Transcribing YouTube video...\")\n", + " user_input = state.messages[-1].content if state.messages else \"\"\n", + " video_id = model.with_structured_output(YouTubeURLParser).invoke([SystemMessage(content=\"Parse the YouTube URL and return the video ID\"), HumanMessage(content=user_input)]).url.split(\"v=\")[1]\n", + " print(f\"Processing video ID: {video_id}\")\n", + " transcript = \" \".join(entry[\"text\"] for entry in YouTubeTranscriptApi.get_transcript(video_id))\n", + " return {\"messages\": [AIMessage(content=transcript)], \"content_items\": [ContentItem(content=transcript)]}\n", + "\n", + "\n", + "# 
Summarize Reddit posts\n", + "async def summarize_reddit(state: AgentState, post_count: int = 2, subreddit: str = \"LangChain\") -> AgentState:\n", + " print(f\"Summarizing {post_count} posts from r/{subreddit}\")\n", + " reddit = praw.Reddit(**{k.lower()[5:]: v for k, v in os.environ.items() if k.startswith(\"PRAW_\")})\n", + " content_items = []\n", + " # Fetch one extra submission and skip the first, which is usually a pinned post\n", + " for submission in list(reddit.subreddit(subreddit).hot(limit=post_count + 1))[1:]:\n", + " print(f\"Processing submission: {submission.title}\")\n", + " submission.comments.replace_more(limit=None)\n", + " summary = model.invoke([HumanMessage(content=reddit_summarization_prompt.format(\n", + " title=submission.title,\n", + " body=submission.selftext or \"No content\",\n", + " comments=\"\\n\".join(c.body for c in submission.comments.list()[:5])\n", + " ))]).content\n", + " content_items.append(ContentItem(content=summary))\n", + " return {\"content_items\": content_items}\n", + "\n", + "# Create social media post\n", + "async def create_post(state: AgentState) -> AgentState:\n", + " generated_posts: List[str] = []\n", + " if state.content_items:\n", + " print(f\"Creating posts from {len(state.content_items)} content items\")\n", + " for i, content_item in enumerate(state.content_items, 1):\n", + " print(f\"Generating post {i}...\")\n", + " final_prompt = prompt.format(topic=content_item.content)\n", + " messages = [\n", + " HumanMessage(content=final_prompt),\n", + " ]\n", + " response: AIMessage = await model.ainvoke(messages)\n", + " generated_posts.append(response.content)\n", + " return {\"generated_posts\": generated_posts}\n", + "\n", + "# Check if post should be published to LinkedIn\n", + "async def should_post_to_linkedin(post: str) -> bool:\n", + " print(\"Checking if post should be published to LinkedIn...\")\n", + " user_input = input(f\"Do you want to post this content to LinkedIn?\\n\\n{post}\\n\\nEnter 'yes' to post or 'no' to skip: \")\n", + " analysis = await 
model.with_structured_output(LinkedInPostDecision).ainvoke([\n", + " SystemMessage(content=\"Analyze the user's response to determine if they want to post the content to LinkedIn. Provide a decision, confidence level, and reasoning.\"),\n", + " HumanMessage(content=f\"User's response: {user_input}\")\n", + " ])\n", + " print(f\"Post decision: {analysis.should_post}\")\n", + " return analysis.should_post\n", + "\n", + "# Post content to LinkedIn\n", + "async def post_to_linkedin(state: AgentState) -> None:\n", + " if not state.generated_posts:\n", + " print(\"No posts to publish\")\n", + " return\n", + "\n", + " print(\"Initializing LinkedIn posting process...\")\n", + " playwright, browser, page = await initialize_browser(headless=True)\n", + "\n", + " try:\n", + " print(\"Logging into LinkedIn...\")\n", + " await login_to_linkedin(page)\n", + "\n", + " # Wait for navigation after login\n", + " await asyncio.sleep(5)\n", + "\n", + " print(\"Navigating to LinkedIn feed...\")\n", + " await page.goto(\"https://www.linkedin.com/feed/\", timeout=15000)\n", + " await asyncio.sleep(3)\n", + "\n", + " feed_page = FeedPage(page)\n", + "\n", + " for i, post in enumerate(state.generated_posts, 1):\n", + " print(f\"Processing post {i}/{len(state.generated_posts)}\")\n", + "\n", + " max_retries = 3\n", + " for attempt in range(max_retries):\n", + " try:\n", + " if await should_post_to_linkedin(post):\n", + " print(f\"Attempt {attempt + 1} to create post...\")\n", + " await feed_page.create_post(post)\n", + " print(\"Successfully posted to LinkedIn\")\n", + " break\n", + " else:\n", + " print(\"Post skipped\")\n", + " break\n", + " except Exception as e:\n", + " print(f\"Attempt {attempt + 1} failed: {str(e)}\")\n", + " if attempt == max_retries - 1:\n", + " raise\n", + " await asyncio.sleep(5)\n", + "\n", + " # Wait between posts\n", + " await asyncio.sleep(5)\n", + "\n", + " except Exception as e:\n", + " print(f\"Error in LinkedIn posting process: {str(e)}\")\n", + " await 
page.screenshot(path=\"linkedin_error.png\")\n", + " raise\n", + " finally:\n", + " await close_browser(playwright, browser)\n", + "\n", + " return {\"messages\": [AIMessage(content=\"Post created on LinkedIn. Do you want to create another post from any other source?\")]}\n", + "\n", + "# Determine the next action in the workflow\n", + "def determine_next_action(state: AgentState) -> str:\n", + " \"\"\"\n", + " Determines the next action in the workflow based on the current agent state and user intent.\n", + "\n", + " Args:\n", + " state (AgentState): The current state containing:\n", + " - messages: List of conversation messages\n", + " - next_action: The next action to be taken\n", + "\n", + " Returns:\n", + " str: The next action to take or END if should exit\n", + "\n", + " Examples:\n", + " >>> # Continue case\n", + " >>> state = AgentState(next_action=\"GET CONTENT FROM YOUTUBE\")\n", + " >>> determine_next_action(state)\n", + " \"GET CONTENT FROM YOUTUBE\"\n", + "\n", + " >>> # Exit case\n", + " >>> state = AgentState(messages=[HumanMessage(content=\"no thanks, I'm done\")])\n", + " >>> determine_next_action(state)\n", + " \"__end__\"\n", + " \"\"\"\n", + " # Check for exit keywords in the last message\n", + " if state.messages:\n", + " last_message = state.messages[-1].content.lower()\n", + " exit_keywords = ['exit', 'quit', 'done', 'no', 'stop', 'bye', 'goodbye', 'end']\n", + "\n", + " if any(keyword in last_message for keyword in exit_keywords):\n", + " print(\"Exit request detected\")\n", + " return END\n", + "\n", + " print(f\"Next action determined: {state.next_action or END}\")\n", + " return state.next_action or END\n", + "\n", + "# Classify user intent from messages\n", + "async def user_intent_classification(state: AgentState) -> Dict[str, Any]:\n", + " \"\"\"\n", + " Classifies user intent including exit requests.\n", + " \"\"\"\n", + " print(\"Classifying user intent...\")\n", + " system_message = \"\"\"Classify the user's intent based on their 
messages:\n", + " 1. Determine if they want to exit (look for words like 'exit', 'quit', 'done', 'no', 'stop', 'bye')\n", + " 2. If not exiting, determine if they want to create a post from:\n", + " - audio\n", + " - YouTube\n", + " - Reddit\n", + " - Towards Data Science\n", + " - LinkedIn profile\n", + " Extract any relevant information from the user's messages to help with classification.\n", + " \"\"\"\n", + "\n", + " filtered_messages = [msg for msg in state.messages if not isinstance(msg, ToolMessage)]\n", + " result = await model.with_structured_output(UserIntentClassification).ainvoke([\n", + " SystemMessage(content=system_message),\n", + " *filtered_messages\n", + " ])\n", + " print(f\"Classification result: {result}\")\n", + " return result\n", + "\n", + "# Route content based on user intent\n", + "async def content_router(state: AgentState) -> Dict[str, Any]:\n", + " \"\"\"\n", + " Routes the workflow based on classified user intent and manages content source selection.\n", + " Includes exit handling.\n", + " \"\"\"\n", + " print(\"Routing content based on user intent...\")\n", + "\n", + " # Check if this is the first message\n", + " if not state.messages:\n", + " return {\n", + " \"messages\": [\n", + " AIMessage(content=\"\"\"Hello, what kind of post do you want to create?\n", + " Here are the options:\n", + " 1. youtube\n", + " 2. reddit\n", + " 3. towardsdatascience\n", + " 4. audio transcript\n", + " 5. linkedin profile\n", + " Or type 'exit' to quit.\"\"\")\n", + " ],\n", + " \"next_action\": END\n", + " }\n", + "\n", + " user_intent = await user_intent_classification(state)\n", + "\n", + " # Check for exit intent first\n", + " if user_intent.exit_decision.should_exit:\n", + " return {\n", + " \"messages\": [AIMessage(content=\"Thank you for using the service. 
Goodbye!\")],\n", + " \"next_action\": END\n", + " }\n", + "\n", + " # Handle YouTube URL request\n", + " if user_intent.youtube_transcription_decision.transcribe_youtube:\n", + " if user_intent.youtube_transcription_decision.url == \"URL NOT PROVIDED\":\n", + " return {\n", + " \"messages\": [AIMessage(content=\"Please provide a YouTube URL\")],\n", + " \"next_action\": END\n", + " }\n", + " else:\n", + " return {\"next_action\": \"GET CONTENT FROM YOUTUBE\"}\n", + "\n", + " # Map intents to actions\n", + " intent_mapping = {\n", + " user_intent.audio_transcription_decision.transcribe_audio: \"GET CONTENT FROM AUDIO\",\n", + " user_intent.reddit_summary_decision.summarize_reddit: \"GET CONTENT FROM REDDIT\",\n", + " user_intent.towards_data_science_decision.fetch_tds_articles: \"GET CONTENT FROM TOWARDS DATA SCIENCE\",\n", + " user_intent.linkedin_profile_decision.fetch_linkedin_posts: \"GET CONTENT FROM LINKEDIN PROFILE\"\n", + " }\n", + "\n", + " # Find the first true intent and get its action\n", + " for intent_condition, action in intent_mapping.items():\n", + " if intent_condition:\n", + " print(f\"Selected next action: {action}\")\n", + " return {\"next_action\": action}\n", + "\n", + " # Default action if no intent matches\n", + " default_action = \"GET CONTENT FROM TOWARDS DATA SCIENCE\"\n", + " print(f\"No specific intent matched. Selected default action: {default_action}\")\n", + " return {\"next_action\": default_action}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "e9fuhiNrMuj1" + }, + "source": [ + "## Create and Compile the Graph\n", + "Now we'll create our LangGraph workflow and compile it." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "ZAM8ORyIJKRY", + "outputId": "2686dc81-099c-48bc-b543-2493a1e0a85e" + }, + "outputs": [], + "source": [ + "router_paths = {\n", + " \"GET CONTENT FROM AUDIO\": \"GET CONTENT FROM AUDIO\",\n", + " \"GET CONTENT FROM YOUTUBE\": \"GET CONTENT FROM YOUTUBE\",\n", + " \"GET CONTENT FROM REDDIT\": \"GET CONTENT FROM REDDIT\",\n", + " \"GET CONTENT FROM TOWARDS DATA SCIENCE\": \"GET CONTENT FROM TOWARDS DATA SCIENCE\",\n", + " \"GET CONTENT FROM LINKEDIN PROFILE\": \"GET CONTENT FROM LINKEDIN PROFILE\",\n", + " \"CREATE POST\": \"CREATE POST\",\n", + " END: END,\n", + "}\n", + "\n", + "# Build the workflow graph\n", + "def build_workflow() -> StateGraph:\n", + " print(\"Building workflow graph...\")\n", + " workflow = StateGraph(AgentState)\n", + " workflow.add_node(\"ROUTER\", content_router)\n", + " workflow.add_node(\"GET CONTENT FROM AUDIO\", transcribe_audio)\n", + " workflow.add_node(\"GET CONTENT FROM YOUTUBE\", transcribe_youtube)\n", + " workflow.add_node(\"GET CONTENT FROM REDDIT\", summarize_reddit)\n", + " workflow.add_node(\"GET CONTENT FROM TOWARDS DATA SCIENCE\", fetch_tds_articles)\n", + " workflow.add_node(\"GET CONTENT FROM LINKEDIN PROFILE\", fetch_linkedin_profile_posts)\n", + " workflow.add_node(\"CREATE POST\", create_post)\n", + " workflow.add_node(\"POST TO LINKEDIN\", post_to_linkedin)\n", + "\n", + " workflow.set_entry_point(\"ROUTER\")\n", + " workflow.add_conditional_edges(\"ROUTER\", determine_next_action, router_paths)\n", + "\n", + " for source in [\"GET CONTENT FROM AUDIO\", \"GET CONTENT FROM YOUTUBE\", \"GET CONTENT FROM REDDIT\",\n", + " \"GET CONTENT FROM TOWARDS DATA SCIENCE\", \"GET CONTENT FROM LINKEDIN PROFILE\"]:\n", + " workflow.add_edge(source, \"CREATE POST\")\n", + "\n", + " workflow.add_edge(\"CREATE POST\", \"POST TO LINKEDIN\")\n", + " workflow.add_edge(\"POST TO LINKEDIN\", 
END)\n", + "\n", + " print(\"Workflow graph built successfully\")\n", + " return workflow\n", + "\n", + "workflow_builder: StateGraph = build_workflow()\n", + "memory: MemorySaver = MemorySaver()\n", + "workflow: Any = workflow_builder.compile(checkpointer=memory)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ObbnljoZP2fq" + }, + "source": [ + "# Display the graph structure" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 371 + }, + "id": "4YyECaP8OXTt", + "outputId": "9ab14edb-5b5a-4731-f27b-7667efc156a2" + }, + "outputs": [], + "source": [ + "\n", + "display(\n", + " Image(\n", + " workflow.get_graph().draw_mermaid_png(\n", + " draw_method=MermaidDrawMethod.API,\n", + " )\n", + " )\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "IquSAejVRnUH" + }, + "source": [ + "# Run the graph" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "uX-P_aGDM2YS", + "outputId": "c2d0c41f-8716-4ace-b383-f11cef5affac" + }, + "outputs": [], + "source": [ + "user_request = \"Create a post from towardsdatascience\"\n", + "result = None\n", + "config = {\"configurable\": {\"thread_id\": uuid4()}}\n", + "default = AgentState()\n", + "default.messages.append(\n", + "    HumanMessage(content=user_request)\n", + ")\n", + "print(\"Starting main workflow loop...\")\n", + "while True:\n", + "    try:\n", + "        result = await workflow.ainvoke(input=result if result else default, config=config)\n", + "\n", + "        # Check if we should exit\n", + "        if result.get(\"next_action\") == END and any(\n", + "            msg.content.lower() == \"thank you for using the service. 
goodbye!\"\n", + " for msg in result.get(\"messages\", [])\n", + " ):\n", + " print(\"Exiting workflow\")\n", + " break\n", + "\n", + " user_input = input(result[\"messages\"][-1].content + \"\\n(Type 'exit' to quit): \")\n", + " result[\"messages\"].append(HumanMessage(content=user_input))\n", + "\n", + " except Exception as e:\n", + " print(f\"Error in workflow: {str(e)}\")\n", + " break" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Sa9MPR1JQGZE" + }, + "source": [ + "# Use case examples" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "id": "AjcdEE_0QJ3t" + }, + "outputs": [], + "source": [ + "user_request = \"Create a post from linkedin\" # you can define how many posts you want to create\n", + "\n", + "# OR\n", + "\n", + "user_request = \"Create a post from reddit\" # you can define how many posts you want to create\n", + "\n", + "# OR\n", + "\n", + "user_request = \"Create a post from towardsdatascience\" # you can define how many posts you want to create\n", + "\n", + "# OR\n", + "\n", + "user_request = \"Create a post from audio\" # you must upload an audio file first and update the file path\n", + "\n", + "# OR\n", + "\n", + "user_request = \"Create a post from youtube\"\n" + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.8" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}