the-mad-statter/ctakes-smokingstatus-4.0-bin

ctakes-smokingstatus-4.0-bin

Project Status: Inactive. The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows.

License: Apache 2.0

About

Apache cTAKES™ is a natural language processing system for extracting information from the clinical free text of electronic medical records.

This version of the smoking status collection processing engine processes flat files to classify patient records into five pre-determined categories:

  1. past smoker (P)¹
  2. current smoker (C)¹
  3. smoker (S)
  4. nonsmoker (N)
  5. unknown (U)

¹ Past and current smokers are distinguished based on temporal expressions in the patient's medical records.

Requirements

Java 1.8 is required to run cTAKES.
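A quick way to check the installed runtime before installing is sketched below. It assumes that "Java 1.8" means a runtime whose version string starts with `1.8`; the helper name `is_java8` is hypothetical and not part of cTAKES.

```shell
# Hypothetical helper: succeeds when the given version string denotes Java 8,
# which reports itself as 1.8 or 1.8.x.
is_java8() {
  case "$1" in
    1.8|1.8.*) return 0 ;;
    *) return 1 ;;
  esac
}

# `java -version` writes to stderr; the version is the quoted part of the first line.
installed=$(java -version 2>&1 | head -n 1 | cut -s -d'"' -f2)

if is_java8 "$installed"; then
  echo "Java 8 found: $installed"
else
  echo "Java 8 not found (got: ${installed:-none})" >&2
fi
```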

Installation

Scripts

Install scripts for Windows and Linux are located in the Scripts directory.

UMLS Access Rights

In the initial setup, cTAKES recognizes only a few sample concepts in text. To perform named entity recognition or concept identification for anything beyond these few words, you need to provide UMLS credentials to cTAKES. If you do not have a UMLS API Key, you may request one at UMLS Terminology Services. After obtaining a key, there are two methods for supplying it to Apache cTAKES (an operating system variable or a Java command parameter), each with two naming options (ctakes.umls_apikey or umlsKey).

Method 1: Operating System Variable

Set either ctakes.umls_apikey or umlsKey as an operating system variable:

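| Option | Windows | Linux |
| --- | --- | --- |
| 1 | `set ctakes.umls_apikey=MY_UMLS_KEY` | `export ctakes.umls_apikey=MY_UMLS_KEY` |
| 2 | `set umlsKey=MY_UMLS_KEY` | `export umlsKey=MY_UMLS_KEY` |

Note: POSIX shells such as bash do not accept a dot in a variable name, so the Linux `export` form of option 1 is rejected as an invalid identifier. Option 2 avoids this.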
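As a minimal Linux sketch of Method 1, the lines below use option 2 (`umlsKey`), whose name is a valid shell identifier, and then confirm that a child process can see the variable. `MY_UMLS_KEY` is a placeholder, as in the table.

```shell
# Export the key so that processes started from this shell inherit it.
export umlsKey=MY_UMLS_KEY   # replace with your actual key

# Any child process, such as the launcher script, inherits the exported variable.
sh -c 'echo "umlsKey is ${umlsKey:+set}"'
```

If the dotted name is preferred, one possible approach is to pass it through `env` for a single command, for example `env 'ctakes.umls_apikey=MY_UMLS_KEY' ./bin/runSmokingStatusCPE.sh`. This is untested against cTAKES here.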

Method 2: Java Command Parameter

Set either -Dctakes.umls_apikey or -DumlsKey in your Java command parameters.

Once you have your UMLS API Key, find the line in each script that runs java and add the chosen parameter, with your actual key substituted, to the java command. The examples below omit everything after -cp because that part of the line does not need to change. Do not delete it.

| Option | Code |
| --- | --- |
| 1 | `java -Dctakes.umls_apikey=MY_UMLS_KEY -cp ...` |
| 2 | `java -DumlsKey=MY_UMLS_KEY -cp ...` |
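The edit can also be scripted. The sketch below is illustrative only: `add_umls_key` is a hypothetical helper, and it assumes that the java command line in the script begins with the word `java` and that the key contains no `|`, `&`, or backslash characters. It keeps a `.bak` copy of the original file and leaves the rest of the line after -cp untouched.

```shell
# Hypothetical helper: insert -DumlsKey=<key> directly after "java" on each
# line of a launcher script that starts with a java command.
add_umls_key() {
  script="$1"
  key="$2"

  # Do nothing if the parameter is already present.
  if grep -q -- '-DumlsKey=' "$script"; then
    return 0
  fi

  # Rewrite in place, saving the original as <script>.bak.
  sed -i.bak "s|^\([[:space:]]*java\)[[:space:]]|\1 -DumlsKey=${key} |" "$script"
}

# Example (not run here): add_umls_key bin/runSmokingStatusCPE.sh MY_UMLS_KEY
```

Inspect the script after running the helper to confirm that the java line has the expected form.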

Usage

Windows

| Step | Windows |
| --- | --- |
| 1. Place patient note files | `C:\apache-ctakes-4.0.0.1\testdata\smoking\testinput` |
| 2. Run the smoking status pipeline | `cd C:\apache-ctakes-4.0.0.1`, then `bin\runSmokingStatusCPE.bat` |
| 3. Results are written to | `C:\apache-ctakes-4.0.0.1\testdata\smoking\testoutput\results.txt` |

Linux

| Step | Linux |
| --- | --- |
| 1. Place patient note files | `/usr/local/apache-ctakes-4.0.0.1/testdata/smoking/testinput` |
| 2. Run the smoking status pipeline | `cd /usr/local/apache-ctakes-4.0.0.1`, then `./bin/runSmokingStatusCPE.sh` |
| 3. Results are written to | `/usr/local/apache-ctakes-4.0.0.1/testdata/smoking/testoutput/results.txt` |
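The three Linux steps can be combined into one function, sketched below. The function name `run_smoking_status` and its two arguments are hypothetical. The paths under the install directory are the ones listed above.

```shell
# Hypothetical wrapper: stage the notes, run the pipeline, print the results path.
run_smoking_status() {
  ctakes_home="$1"   # install root, e.g. /usr/local/apache-ctakes-4.0.0.1
  notes_dir="$2"     # directory holding the patient note files
  input_dir="$ctakes_home/testdata/smoking/testinput"
  results="$ctakes_home/testdata/smoking/testoutput/results.txt"

  # Step 1: place the patient note files in the input directory.
  mkdir -p "$input_dir" || return 1
  cp "$notes_dir"/* "$input_dir"/ || return 1

  # Step 2: run the pipeline from the install root.
  (cd "$ctakes_home" && ./bin/runSmokingStatusCPE.sh) || return 1

  # Step 3: report where the results were written.
  if [ ! -f "$results" ]; then
    echo "no results found at $results" >&2
    return 1
  fi
  echo "$results"
}

# Example (not run here):
# run_smoking_status /usr/local/apache-ctakes-4.0.0.1 "$HOME/notes"
```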

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

About

Washington University in Saint Louis

Established in 1853, Washington University in Saint Louis is among the world’s leaders in teaching, research, patient care, and service to society. Boasting 24 Nobel laureates to date, the University is ranked 7th in the world for most cited researchers, received the 4th highest amount of NIH medical research grants among medical schools in 2019, and was tied for 1st in the United States for genetics and genomics in 2018. The University is committed to learning and exploration, discovery and impact, and intellectual passions and challenging the unknown.