[Playtime Session] Implement voice input functionality for mic button #95

Open
15 tasks
Rushi-faldu opened this issue Dec 2, 2024 · 0 comments
Labels
dev-work (Issues that need to be developed), playtime-session (All the issues related to the playtime session development)

Rushi-faldu commented Dec 2, 2024

Purpose

Enable interactive voice-based commands using Genkit’s generative AI framework for real-time user interactions during playtime sessions.

User story

As a parent, I want to use voice commands during a playtime session to repeat instructions, pause, skip, or modify activities, so the interaction feels natural and personalized.

Acceptance criteria:

  • The mic button activates voice input using Genkit’s Speech-to-Text (STT) module to capture real-time user queries and commands, such as 'Yes' or 'Ready' for positive confirmation, or 'Repeat', 'Skip', and 'Pause'.
  • If the user gives the 'Repeat' command, then the agent fetches the last activity step and reads it aloud.
  • If the user gives the 'Pause' command, the agent pauses the current activity and provides a user confirmation message, e.g., "Activity paused. Say 'resume' to continue."
  • Map each of the commands above to its respective action.
  • The agent should also accept personalized modifications from users (e.g., "Can I use blocks instead of cups?"), adapt the instruction dynamically using generative AI, and deliver the updated instruction.
  • If the voice command is unclear or not recognized, provide a fallback response, e.g., "I didn’t catch that. Could you say it again?"
  • Use audio feedback to confirm actions, e.g., "Repeating the instruction." or "Skipping to the next activity."
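
The command mapping and fallback described in the criteria above could be sketched roughly as follows. This is only an illustration under stated assumptions: it takes a transcript string that some STT step has already produced, it does not call any Genkit API, and every name in it (`parseCommand`, `CommandResult`, `KEYWORDS`, `FEEDBACK`) is hypothetical rather than taken from the project. Feedback strings not quoted in this issue are placeholders, and treating any longer unmatched utterance as a modification request is a crude stand-in for real intent detection.

```typescript
// Hypothetical sketch: maps an already-transcribed utterance to a command.
// None of these names come from the project's codebase or from Genkit.
type Command =
  | "confirm" | "repeat" | "pause" | "resume" | "skip" | "modify" | "unknown";

interface CommandResult {
  command: Command;
  feedback: string; // text to be read aloud as audio confirmation
}

// Keyword table, checked in order. "skip" is recognized here only so that
// feedback can be given; the skip action itself is listed as out of scope.
const KEYWORDS: Array<[Command, string[]]> = [
  ["repeat", ["repeat"]],
  ["pause", ["pause"]],
  ["resume", ["resume"]],
  ["skip", ["skip"]],
  ["confirm", ["yes", "ready"]],
];

const FEEDBACK: Record<Command, string> = {
  repeat: "Repeating the instruction.",
  pause: "Activity paused. Say 'resume' to continue.",
  skip: "Skipping to the next activity.",
  resume: "Resuming the activity.",       // placeholder wording (assumption)
  confirm: "Great, let's begin.",         // placeholder wording (assumption)
  modify: "Updating the instruction.",    // placeholder wording (assumption)
  unknown: "I didn't catch that. Could you say it again?",
};

function parseCommand(transcript: string): CommandResult {
  // Lowercase, replace punctuation with spaces, and split into words.
  const words = transcript
    .toLowerCase()
    .replace(/[^a-z\s']/g, " ")
    .split(/\s+/)
    .filter(Boolean);

  for (const [command, keywords] of KEYWORDS) {
    if (keywords.some((k) => words.indexOf(k) !== -1)) {
      return { command, feedback: FEEDBACK[command] };
    }
  }

  // Assumption: a longer utterance with no keyword is a free-form request
  // (e.g. a material substitution) to hand to the generative model.
  if (words.length >= 4) {
    return { command: "modify", feedback: FEEDBACK.modify };
  }

  return { command: "unknown", feedback: FEEDBACK.unknown };
}
```

For example, `parseCommand("Can I use blocks instead of cups?")` yields the `modify` command, while an empty or one-word unrecognized transcript falls through to the fallback response. Keeping the parser a pure function of the transcript makes it easy to unit test separately from microphone handling and speech output.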

Other notes

Out of scope:

  • If the user gives the 'Skip' command, the agent skips the current activity and prompts the Generative AI agent to suggest the next one.
  • If the mic button is idle for 5 seconds, it should turn off automatically for privacy and stop listening until the user clicks it again.

Design notes

  • Wireframes are here

  • Design specifics are here


Definition of Done (DoD)

  • The feature has been fully implemented, thoroughly tested, and integrated into the app with no critical bugs or performance issues.
  • Manual testing is complete, verifying that the feature meets all acceptance criteria.
  • Major features are documented inside the code, with clear explanations of functionality, usage, and any relevant edge cases.
  • The feature has undergone peer review, with the code approved by at least one other team member to ensure quality and maintainability.
  • The feature branch has been successfully merged into the main branch, the pull request is closed, and the branch is deleted.
  • The feature’s UI/UX has been reviewed to ensure consistency and alignment with design guidelines.

Rushi-faldu converted this from a draft issue Dec 2, 2024
saramakishti changed the title from "[Playtime Session] implement voice input functionality for mic button" to "[Playtime Session] Implement voice input functionality for mic button" Dec 3, 2024
saramakishti added the dev-work and playtime-session labels Dec 3, 2024
saramakishti moved this to Product Backlog in amos2024ws04-feature-board Dec 3, 2024
Rushi-faldu moved this to Product Backlog in amos2024ws04-feature-board Dec 11, 2024
Projects
Status: Product Backlog
Development

No branches or pull requests

2 participants