Releases · chigkim/VOCR

29 Sep 21:32

chigkim

v2.1.0

400a8a3

VOCR v2.1.0 Latest

Changelog

v2.1.0

Find after scan: Command+Control+F. Thanks @vick08
More options after taking a picture with the camera. Thanks @vick08

Contributors

vick08

24 May 14:52

chigkim

v2.0.1

a8db631

VOCR v2.0.1

Changelog

v2.0.1

VOCR no longer crashes when VoiceOver is not running. Instead, it speaks using the system speech synthesizer.

v2.0.0

Ask AI (OpenAI GPT-4, Ollama)
Explore with AI (OpenAI GPT-4 Only)
Capture image with a camera and ask AI: Command+Shift+Control+C
Supports FaceTime, external, and iPhone cameras (with Continuity Camera feature)
Open image files from Finder
VOCR Menu: Command+Shift+Control+S
Real-time OCR: Command+Shift+Control+R
Object detection for icons
Auto Scan
Customize shortcuts
Auto updates
Save last image
Save OCR results
Faster VOCursor scanning
Target window feature
Disable mouse movement
Launch on login
Startup sound
Logger

22 May 18:15

chigkim

v2.0.0

42e9a92

VOCR v2.0.0

Changelog

Ask AI (OpenAI GPT-4, Ollama)
Explore with AI (OpenAI GPT-4 Only)
Capture image with a camera and ask AI: Command+Shift+Control+C
Supports FaceTime, external, and iPhone cameras (with Continuity Camera feature)
Open image files from Finder
VOCR Menu: Command+Shift+Control+S
Real-time OCR: Command+Shift+Control+R
Object detection for icons
Auto Scan
Customize shortcuts
Auto updates
Save last image
Save OCR results
Faster VOCursor scanning
Target window feature
Disable mouse movement
Launch on login
Startup sound
Logger

20 May 21:33

chigkim

v2.0.0-beta.3

c31cf53

VOCR v2.0.0-beta.3 Pre-release

Changelog

Camera Capture: Command+Control+Shift+C
Settings > Choose camera to select an external camera
Uses gpt-4o for 50% cheaper and faster responses.
A shortcut cannot be deleted.
New updates submenu
You can toggle automatic update checks and automatic update installation from the menu.
Pre-release channel
Kills other running instances of VOCR.
Store API key in Keychain
- Quit VOCR
- Delete ~/Library/Preferences/com.chikim.VOCR.plist permanently with Command+Option+Delete
- Reboot.
Added permission for notification center
Fixed the menu not working after closing a window.
Logger recreates the log file if it has been deleted.
Check for update when launching
Increased timeout for request to 10 minutes
Play sound when VOCR is launched and ready.
Alert update through notification center
Fixed error when encountering Ollama model with no families.
Realtime OCR shortcut toggles the feature.
Autoupdater
Implemented logger
Asks which Ollama model to use if multiple clip models are found.
You can also select a model for Ollama by clicking Ollama in the model menu.
Ask for a prompt after taking a screenshot.
New prompt for explore
Explore no longer generates images meant for debugging.
Presents the same menu when launched by shortcut or clicking statusbar.
Reports more errors when request fails.
Cancels previous request when making new request
Ollama support
Uses the original screenshot resolution instead of the window resolution in points, except in explore mode.
New Workflow: Use Command+Control+Shift+W/V to set the target to a window/VOCursor and perform the OCR scan. After that, the features such as real-time OCR, explore, and ask will use the target.
Reset shortcut if there are different features after an update
Bug fix: global shortcuts sometimes not active
Customize shortcuts
Token usage at the end of description
Support system prompt for GPT
Setting to toggle use last prompt without asking
Save last screenshot
Dismiss menu with command+Z instead of esc if realtime or navigation is active.
You can just press return to ask GPT without editing.
Changed diff algorithm for less verbose realtime OCR.
Realtime OCR remains active at its initial location, allowing you to move the VOCursor during the process. To perform realtime OCR in a different location, stop the OCR, move the VOCursor, then restart realtime OCR.
Realtime OCR of VOCursor: Command+Control+Shift+r
You can toggle object detection from the settings.
OCR Window: Command+Control+Shift+w
OCR VOCursor: Command+Control+Shift+v
Ask GPT about VOCursor: Command+Control+Shift+a
Settings: Command+Control+Shift+S
Faster screenshot of VOCursor
Open an image file in VOCR from Finder to ask GPT
The GPT response is copied to the clipboard, so you can paste it somewhere if you miss it.
Object Detection through rectangles: Any boxes without text such as icons.
Moved save OCR result to the menu.
Moved target window to settings menu.
Auto Scan: Thanks @vick08
Readme Improvement: Thanks @ssawczyn

The GPT features utilize GPT-4V, and they require your own OpenAI API key.

The usage cost from VOCR is an estimate. For the official usage and cost, please refer to the Usage Dashboard on the OpenAI website. You can also set a monthly limit and alert there.

Explore feature only works with GPT, and location information from the model is extremely unreliable and inaccurate.

Instruction for Ollama

Download Ollama and install.
Open terminal, and type "ollama pull llava" without the quotes.
Wait for Ollama to finish downloading the model.
Quit terminal
Go to VOCR menu > Settings > Models and select Ollama
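
For readers who want to script against the same local setup, here is a minimal Python sketch of how a client could send an image and a prompt to the llava model pulled above. It assumes Ollama's default local endpoint (http://localhost:11434/api/generate). The function names `build_llava_request` and `ask_ollama` are hypothetical, and this is not VOCR's own implementation. The 10-minute timeout simply mirrors the request timeout mentioned in the changelog.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_llava_request(image_bytes: bytes, prompt: str, model: str = "llava") -> dict:
    """Build the JSON body for one non-streaming question about an image."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ask_ollama(image_bytes: bytes, prompt: str, timeout: float = 600.0) -> str:
    """Send the request to the local server and return the model's text reply."""
    body = json.dumps(build_llava_request(image_bytes, prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `ask_ollama(open("screenshot.png", "rb").read(), "Describe this image.")` would only work while the Ollama server is running and the model has finished downloading; the file name here is a placeholder.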

Experimental

These features may not make it into the public release.

Identify object when navigation is active: Command+Control+I
Explore window with GPT: Command+Control+Shift+e
An option to use a local model such as LLaVA through llama.cpp instead of GPT.

Warning: Setting up your own llama.cpp server is very complex.

26 Jan 14:54

chigkim

v2.0.0-beta.2

d7b8202

VOCR v2.0.0-beta.2 Pre-release

Changelog

New updates submenu
You can toggle automatic update checks and automatic update installation from the menu.
Pre-release channel
Kills other running instances of VOCR.
Store API key in Keychain
- Quit VOCR
- Delete ~/Library/Preferences/com.chikim.VOCR.plist permanently with Command+Option+Delete
- Reboot.
Added permission for notification center
Fixed the menu not working after closing a window.
Logger recreates the log file if it has been deleted.
Check for update when launching
Increased timeout for request to 10 minutes
Play sound when VOCR is launched and ready.
Alert update through notification center
Fixed error when encountering Ollama model with no families.
Realtime OCR shortcut toggles the feature.
Autoupdater
Implemented logger
Asks which Ollama model to use if multiple clip models are found.
You can also select a model for Ollama by clicking Ollama in the model menu.
Ask for a prompt after taking a screenshot.
New prompt for explore
Explore no longer generates images meant for debugging.
Presents the same menu when launched by shortcut or clicking statusbar.
Reports more errors when request fails.
Cancels previous request when making new request
Ollama support
Uses the original screenshot resolution instead of the window resolution in points, except in explore mode.
New Workflow: Use Command+Control+Shift+W/V to set the target to a window/VOCursor and perform the OCR scan. After that, the features such as real-time OCR, explore, and ask will use the target.
Reset shortcut if there are different features after an update
Bug fix: global shortcuts sometimes not active
Customize shortcuts
Token usage at the end of description
Support system prompt for GPT
Setting to toggle use last prompt without asking
Save last screenshot
Dismiss menu with command+Z instead of esc if realtime or navigation is active.
You can just press return to ask GPT without editing.
Changed diff algorithm for less verbose realtime OCR.
Realtime OCR remains active at its initial location, allowing you to move the VOCursor during the process. To perform realtime OCR in a different location, stop the OCR, move the VOCursor, then restart realtime OCR.
Realtime OCR of VOCursor: Command+Control+Shift+r
You can toggle object detection from the settings.
OCR Window: Command+Control+Shift+w
OCR VOCursor: Command+Control+Shift+v
Ask GPT about VOCursor: Command+Control+Shift+a
Settings: Command+Control+Shift+S
Faster screenshot of VOCursor
Open an image file in VOCR from Finder to ask GPT
The GPT response is copied to the clipboard, so you can paste it somewhere if you miss it.
Object Detection through rectangles: Any boxes without text such as icons.
Moved save OCR result to the menu.
Moved target window to settings menu.
Auto Scan: Thanks @vick08
Readme Improvement: Thanks @ssawczyn

The GPT features utilize GPT-4V, and they require your own OpenAI API key.

The usage cost from VOCR is an estimate. For the official usage and cost, please refer to the Usage Dashboard on the OpenAI website. You can also set a monthly limit and alert there.

Explore feature only works with GPT, and location information from the model is extremely unreliable and inaccurate.

Instruction for Ollama

Download Ollama and install.
Open terminal, and type "ollama run llava" without the quotes.
Wait until you get the prompt >>> send a message
Then type /bye and press return
Quit terminal
Go to VOCR menu > Settings > Models and select Ollama
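
The changelog above mentions asking which model to use when multiple clip models are found, and a fix for Ollama models with no families. A rough Python sketch of that kind of filtering, assuming the JSON shape returned by Ollama's /api/tags endpoint, could look like the following. The helper name `clip_models` and the sample data are made up for illustration, and VOCR's actual logic may differ.

```python
def clip_models(tags_response: dict) -> list[str]:
    """Return names of installed models whose families include "clip".

    `tags_response` is assumed to be the parsed JSON from Ollama's
    /api/tags endpoint. Models with missing or null "families" are
    skipped instead of raising an error.
    """
    names = []
    for model in tags_response.get("models", []):
        details = model.get("details") or {}
        families = details.get("families") or []  # may be absent or null
        if "clip" in families:
            names.append(model["name"])
    return names


# Illustrative sample only; real responses carry more fields per model.
sample = {
    "models": [
        {"name": "llava:latest", "details": {"families": ["llama", "clip"]}},
        {"name": "llama3:latest", "details": {"families": ["llama"]}},
        {"name": "custom:latest", "details": {"families": None}},
    ]
}
candidates = clip_models(sample)
```

If `candidates` holds more than one name, a client would need to ask the user which model to use, which is the behavior the changelog describes.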

Experimental

These features may not make it into the public release.

Identify object when navigation is active: Command+Control+I
Explore window with GPT: Command+Control+Shift+e
An option to use a local model such as LLaVA through llama.cpp instead of GPT.

Warning: Setting up your own llama.cpp server is very complex.

21 Jan 12:31

chigkim

v2.0.0-beta.1

2e27dbd

VOCR v2.0.0-beta.1 Pre-release

Changelog

Pre-release channel
Kills other running instances of VOCR.
Store API key in Keychain
- Quit VOCR
- Delete ~/Library/Preferences/com.chikim.VOCR.plist permanently with Command+Option+Delete
- Reboot.
Added permission for notification center
Fixed the menu not working after closing a window.
Logger recreates the log file if it has been deleted.
Check for update when launching
Increased timeout for request to 10 minutes
Play sound when VOCR is launched and ready.
Alert update through notification center
Fixed error when encountering Ollama model with no families.
Realtime OCR shortcut toggles the feature.
Autoupdater
Implemented logger
Asks which Ollama model to use if multiple clip models are found.
You can also select a model for Ollama by clicking Ollama in the model menu.
Ask for a prompt after taking a screenshot.
New prompt for explore
Explore no longer generates images meant for debugging.
Presents the same menu when launched by shortcut or clicking statusbar.
Reports more errors when request fails.
Cancels previous request when making new request
Ollama support
Uses the original screenshot resolution instead of the window resolution in points, except in explore mode.
New Workflow: Use Command+Control+Shift+W/V to set the target to a window/VOCursor and perform the OCR scan. After that, the features such as real-time OCR, explore, and ask will use the target.
Reset shortcut if there are different features after an update
Bug fix: global shortcuts sometimes not active
Customize shortcuts
Token usage at the end of description
Support system prompt for GPT
Setting to toggle use last prompt without asking
Save last screenshot
Dismiss menu with command+Z instead of esc if realtime or navigation is active.
You can just press return to ask GPT without editing.
Changed diff algorithm for less verbose realtime OCR.
Realtime OCR remains active at its initial location, allowing you to move the VOCursor during the process. To perform realtime OCR in a different location, stop the OCR, move the VOCursor, then restart realtime OCR.
Realtime OCR of VOCursor: Command+Control+Shift+r
You can toggle object detection from the settings.
OCR Window: Command+Control+Shift+w
OCR VOCursor: Command+Control+Shift+v
Ask GPT about VOCursor: Command+Control+Shift+a
Settings: Command+Control+Shift+S
Faster screenshot of VOCursor
Open an image file in VOCR from Finder to ask GPT
The GPT response is copied to the clipboard, so you can paste it somewhere if you miss it.
Object Detection through rectangles: Any boxes without text such as icons.
Moved save OCR result to the menu.
Moved target window to settings menu.
Auto Scan: Thanks @vick08
Readme Improvement: Thanks @ssawczyn
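
One item in the list above, "Changed diff algorithm for less verbose realtime OCR", describes announcing less text when the screen changes. The release notes do not say which algorithm VOCR uses, so the following Python sketch only illustrates the general idea with the standard library's `difflib`: compare the previous and current OCR lines and keep only the lines that are new. The function name `new_lines` is hypothetical.

```python
import difflib


def new_lines(previous: list[str], current: list[str]) -> list[str]:
    """Return lines of `current` that were inserted or changed since `previous`."""
    matcher = difflib.SequenceMatcher(a=previous, b=current, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both introduce text not present before;
        # "equal" and "delete" add nothing worth announcing.
        if tag in ("insert", "replace"):
            added.extend(current[j1:j2])
    return added
```

With a previous snapshot of `["Downloading", "10%"]` and a current one of `["Downloading", "20%"]`, only `"20%"` would be returned, so unchanged text is not repeated.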

The GPT features utilize GPT-4V, and they require your own OpenAI API key.

The usage cost from VOCR is an estimate. For the official usage and cost, please refer to the Usage Dashboard on the OpenAI website. You can also set a monthly limit and alert there.

Explore feature only works with GPT, and location information from the model is extremely unreliable and inaccurate.

Instruction for Ollama

Download Ollama and install.
Open terminal, and type "ollama run llava" without the quotes.
Wait until you get the prompt >>> send a message
Then type /bye and press return
Quit terminal
Go to VOCR menu > Settings > Models and select Ollama

Experimental

These features may not make it into the public release.

Identify object when navigation is active: Command+Control+I
Explore window with GPT: Command+Control+Shift+e
An option to use a local model such as LLaVA through llama.cpp instead of GPT.

Warning: Setting up your own llama.cpp server is very complex.

29 Mar 15:06

chigkim

v1.0.0-beta.2

5695406

VOCR v1.0.0-beta.2 Pre-release

Fixed crashing when there's no window. Thanks @vick08

07 May 22:48

chigkim

v1.0.0-beta.1

577859f

VOCR v1.0.0-beta.1 Pre-release

HIGHLY EXPERIMENTAL: USE AT YOUR OWN RISK!

You can now choose sound output for positional audio feedback.

08 Apr 19:55

chigkim

v1.0.0-alpha.3

3fe1695

VOCR v1.0.0-alpha.3 Pre-release

HIGHLY EXPERIMENTAL: USE AT YOUR OWN RISK

08 Apr 09:26

chigkim

v1.0.0-alpha.2

c5b035e

VOCR v1.0.0-alpha.2 Pre-release

HIGHLY EXPERIMENTAL: USE AT YOUR OWN RISK
Added import feature
