feat: roundtrip SFM data #18

Merged: 5 commits, Dec 20, 2023
16 changes: 6 additions & 10 deletions README.md
@@ -1,7 +1,7 @@

# sfm-utils
- Utilities to parse text files (e.g. from Toolbox) and migrate SFM markers into a format suitable for Paratext.
- Each book input is written to individual .sfm files.
+ Utilities to parse book translations in SFM text files (.txt, .rtf, .sfm) into JSON objects, and then write out the books into SFM suitable for Paratext or .tsv.
+ When directories are processed, each book input is written to individual .sfm files.

Assumptions:
* Each text file is for a single chapter of a book
@@ -13,15 +13,16 @@ Note for developers: Replace `sfm-utils.exe` references with `node dist/index.js

Command-line
```bash
- Usage: sfm-utils.exe -p p_arg [-t t_arg | -d d_arg | -j j_arg | -s s_arg]
+ Usage: sfm-utils.exe -p p_arg [-f f_arg | -t t_arg | -d d_arg | -j j_arg | -s s_arg]
```

Parameters
```bash
Required
-p [Paratext project name (can be 3-character abbreviation)]

- Optional - one of:
+ Optional for processing txt or sfm files - one of:
+ -f [A single SFM file (can be an entire book)]
-t [A single Toolbox text file (one chapter of a book)]
-d [Directory of Toolbox text files for a single book (one chapter per file)]
-j [JSON file representing a single book - used for testing conversion to SFM]
@@ -34,11 +35,6 @@ Parameters
```

### Help
- Obtaining the sfm-utils version:
- ```bash
- sfm-utils.exe --version
- ```
-
For additional help:
```bash
sfm-utils.exe -h
@@ -49,7 +45,7 @@ sfm-utils.exe -h

## Developer Setup
These utilities require Git, Node.js, and TypeScript (installed locally).
- Back translations in .rtf text files will also need UnRTF installed for converting the Rich Text format.
+ Back translations in .rtf text files will also need UnRTF installed for converting the Rich Text format (only works on Linux).

### Install Git
Download and install Git
43 changes: 28 additions & 15 deletions src/books.ts
@@ -719,9 +719,9 @@ export const bookInfo: bookType[] = [
code: "XXE",
name: "Extra Book E",
num: 98,
- chapters: 999,
- versesInChapter: [0],
- verses: 0
+ chapters: 1,
+ versesInChapter: [0, 462],
+ verses: 462
},
{
code: "XXF",
@@ -743,13 +743,24 @@ export const bookInfo: bookType[] = [
//#endregion

/**
- * Description of the unit within a chapter.
+ * Description of the unit within a chapter or header.
*/
export type unitSubtype =
- "padding" |
- "chapter" |
- "verse" |
- "section";
+ "padding" |
+
+ // Units in headers
+ "header" |
+ "toc1" |
+ "toc2" |
+ "toc3" |
+ "main_title" |
+ "chapter_label" |
+
+ // Units in chapters
+ "chapter" |
+ "verse" |
+ "section" |
+ "paragraph";

export interface unitType {
type: unitSubtype,
@@ -762,7 +773,8 @@ export interface unitType {
export interface objType {
header: {
projectName: string,
- bookInfo: bookType
+ bookInfo: bookType,
+ markers: unitType[]
},
content: unitType[]
}
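
Reviewer sketch of the new header shape: the snippet below builds a book object whose header carries its own `markers` list, matching the updated `objType`. The types are re-declared locally so it stands alone. Only the `type` field of `unitType` is visible in this diff, so the `text` field, the `Sketch*` names, the helper `isHeaderUnit`, the project name `ABC`, and the sample marker text are all assumptions for illustration, not code from the repository.

```typescript
// Local copy of the updated union from src/books.ts.
type UnitSubtype =
  | "padding"
  | "header" | "toc1" | "toc2" | "toc3" | "main_title" | "chapter_label"
  | "chapter" | "verse" | "section" | "paragraph";

// Hypothetical: only `type` is shown in the diff; `text` is an assumed field.
interface SketchUnit {
  type: UnitSubtype;
  text?: string;
}

// Trimmed to the bookType fields visible in the XXE entry.
interface SketchBook {
  code: string;
  name: string;
  num: number;
  chapters: number;
  versesInChapter: number[];
  verses: number;
}

interface SketchObj {
  header: {
    projectName: string;
    bookInfo: SketchBook;
    markers: SketchUnit[]; // new in this PR
  };
  content: SketchUnit[];
}

// Subtypes grouped under "Units in headers" in the diff.
const HEADER_SUBTYPES: UnitSubtype[] = [
  "header", "toc1", "toc2", "toc3", "main_title", "chapter_label",
];

// Hypothetical helper: true when a unit belongs in header.markers.
function isHeaderUnit(unit: SketchUnit): boolean {
  return HEADER_SUBTYPES.indexOf(unit.type) !== -1;
}

// Values copied from the updated XXE entry.
const extraBookE: SketchBook = {
  code: "XXE",
  name: "Extra Book E",
  num: 98,
  chapters: 1,
  versesInChapter: [0, 462],
  verses: 462,
};

const sample: SketchObj = {
  header: {
    projectName: "ABC", // assumed project name
    bookInfo: extraBookE,
    markers: [
      { type: "header", text: "Extra Book E" },
      { type: "main_title", text: "Extra Book E" },
    ],
  },
  content: [],
};
```

Keeping header units in `header.markers` rather than in `content` appears to be what lets an SFM file be read and written back without losing its header lines.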
@@ -771,7 +783,8 @@ export const PLACEHOLDER_BOOK: bookType = bookInfo[0];
export const PLACEHOLDER_BOOK_OBJ: objType = {
"header": {
"projectName" : "",
- "bookInfo" : PLACEHOLDER_BOOK
+ "bookInfo" : PLACEHOLDER_BOOK,
+ "markers": []
},
"content": []
}
@@ -862,7 +875,7 @@ export function getBookByName(name: string): bookType {
case 'I Corinthians':
case '1Corinthians':
case 'x1Corinthians':
- case '1 Cor':
+ case '1 Cor':
bookName = "1 Corinthians";
break;
case '2Corinthians':
@@ -880,25 +893,25 @@ export function getBookByName(name: string): bookType {
break;
case 'Phil':
bookName = 'Philippians';
- break;
+ break;
case '1Thessalonians':
- case '1 Thess':
+ case '1 Thess':
bookName = '1 Thessalonians';
break;
case '2Thessalonians':
case '2 Thess':
bookName = '2 Thessalonians';
break;
case '1Timothy':
- case '1 Tim':
+ case '1 Tim':
bookName = '1 Timothy';
break;
case '2Timothy':
case '2 Tim':
bookName = '2 Timothy';
break;
case '1Peter':
- case '1 Pet':
+ case '1 Pet':
bookName = '1 Peter';
break;
case '2Peter':
88 changes: 83 additions & 5 deletions src/index.ts
@@ -4,6 +4,7 @@ import { CommanderError, program } from 'commander';
import * as fs from 'fs';
import * as backTranslation from './backTranslation.js';
import * as books from './books.js';
+ import * as path from 'path';
import * as toolbox from './toolbox.js';
import require from './cjs-require.js';
import * as sfm from './sfm.js';
@@ -19,6 +20,7 @@ program
.description("Utilities to 1) parse Toolbox text files into JSON Objects. " +
"2) take a JSON file and write out an .SFM file for Paratext.")
.option("-b, --back <path to single text file>", "path to back translation rtf text file")
+ .option("-f, --sfm <path to single SFM file>", "path to SFM file")
.option("-t, --text <path to single text file>", "path to a Toolbox text file")
.option("-bd, --backDirectory <path to directory containing rtf text files>", "path to directory containing multiple RTF text files")
.option("-d, --directory <path to directory containing text files>", "path to directory containing multiple Toolbox text files")
@@ -44,6 +46,9 @@ if (debugMode) {
if (options.back) {
console.log(`Back Translation text file path: "${options.back}"`);
}
+ if (options.sfm) {
+ console.log(`SFM file path: "${options.sfm}"`);
+ }
if (options.text) {
console.log(`Toolbox text file path: "${options.text}"`);
}
@@ -86,6 +91,10 @@ if (options.back && !fs.existsSync(options.back)) {
console.error("Can't open back translation text file " + options.back);
process.exit(1);
}
+ if (options.sfm && !fs.existsSync(options.sfm)) {
+ console.error("Can't open SFM file " + options.sfm);
+ process.exit(1);
+ }
if (options.backDirectory && !fs.existsSync(options.backDirectory)) {
console.error("Can't open back translation directory " + options.backDirectory);
process.exit(1);
@@ -104,9 +113,9 @@ if (options.superDirectory && !fs.existsSync(options.superDirectory)) {
}

// Validate one of the optional parameters is given
- if (!options.back && !options.text && !options.backDirectory && !options.directory &&
+ if (!options.back && !options.sfm && !options.text && !options.backDirectory && !options.directory &&
!options.json && !options.backSuperDirectory && !options.superDirectory) {
- console.error("Need to pass another optional parameter [-b -t -bd -d -j -bs or -s]");
+ console.error("Need to pass another optional parameter [-b -f -t -bd -d -j -bs or -s]");
process.exit(1);
}
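
Reviewer sketch: the growing chain of `!options.x &&` checks above could also be expressed as a lookup over a list of option keys, which makes adding a flag such as `-f` a one-line change. This is only an illustration of the check, not the code in this PR; `InputOptions`, `INPUT_KEYS`, and `hasInputOption` are hypothetical names, and the option keys simply mirror those tested in the condition above.

```typescript
// Hypothetical shape: keys mirror the options checked in src/index.ts.
interface InputOptions {
  back?: string;
  sfm?: string;
  text?: string;
  backDirectory?: string;
  directory?: string;
  json?: string;
  backSuperDirectory?: string;
  superDirectory?: string;
}

// Every option that counts as "an input was given".
const INPUT_KEYS: (keyof InputOptions)[] = [
  "back", "sfm", "text", "backDirectory", "directory",
  "json", "backSuperDirectory", "superDirectory",
];

// True when at least one input option has a non-empty value.
function hasInputOption(options: InputOptions): boolean {
  return INPUT_KEYS.some((key) => Boolean(options[key]));
}
```

With this shape, the validation step would reduce to a single `if (!hasInputOption(options))` guard before printing the error and exiting.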

@@ -121,6 +130,9 @@ if (options.json) {
// Parse an rtf text file into a JSON object
const bookObj: books.objType = books.PLACEHOLDER_BOOK_OBJ;
processBackText(options.back, bookObj);
+ } else if (options.sfm) {
+ const bookObj: books.objType = books.PLACEHOLDER_BOOK_OBJ;
+ processSFMText(options.sfm, bookObj);
} else if (options.text) {
// Parse a txt file into a JSON object
const bookObj: books.objType = books.PLACEHOLDER_BOOK_OBJ;
@@ -320,6 +332,69 @@ function processText(filepath: string, bookObj: books.objType): books.objType {
return bookObj;
}

+ /**
+ * Take an SFM file and make a JSON book type object
+ * @param {string} filepath - file path of a single text file
+ * @param {books.bookType} bookObj - the book object to modify
+ * @returns {books.bookType} bookObj - modified book object
+ */
+ function processSFMText(filepath: string, bookObj: books.objType): books.objType {
+ const bookInfo = toolbox.getBookAndChapter(filepath);
+ const currentChapter = bookInfo.chapterNumber;
+ const bookType = books.getBookByName(bookInfo.bookName);
+ if (bookInfo.bookName === "Placeholder") {
+ // Skip invalid book name
+ console.warn('Skipping invalid book name');
+ return bookObj;
+ } else if (currentChapter > bookType.chapters) {
+ // Skip invalid chapter number
+ console.warn('Skipping invalid chapter number ' + currentChapter + ' when ' +
+ bookObj.header.bookInfo.name + ' only has ' + bookType.chapters + ' chapters.');
+ return bookObj;
+ }
+
+ if (bookObj.content.length == 0) {
+ bookObj = toolbox.initializeBookObj(bookInfo.bookName, options.projectName);
+ }
+
+ if (!bookObj.content[currentChapter]) {
+ console.error(`${bookInfo.bookName} has insufficient chapters allocated to handle ${currentChapter}. Exiting`);
+ process.exit(1);
+ }
+ // Initialize all chapters for book
+ for (let ch:number=1; ch<= bookObj.content.length-1; ch++) {
+ if (bookObj.content[ch].type != "chapter") {
+ // Initialize current chapter
+ bookObj.content[ch].type = "chapter";
+ bookObj.content[ch].content = [];
+ }
+ }
+ sfm.updateObj(bookObj, filepath, s, debugMode);
+
+ // For single file parameter, write valid output
+ if (options.text && bookObj.header.bookInfo.code !== "000") {
+ // For testing, write out book JSON Object
+ writeJSON(bookObj);
+
+ //valid JSON Object to SFM
+ sfm.convertToSFM(bookObj, s);
+ } else if (options.sfm && bookObj.header.bookInfo.code !== "000") {
+ const basename = path.parse(path.basename(filepath)).name;
+
+ // For testing, write out book JSON Object
+ writeJSON(bookObj, basename + '.json');
+
+ if (bookObj.header.bookInfo.code == 'XXE') {
+ // Special SFM file written to TSV
+ sfm.convertToTSV(bookObj, basename);
+ } else {
+ //valid JSON Object to SFM
+ sfm.convertToSFM(bookObj, s);
+ }
+ }
+
+ return bookObj;
+ }
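
Reviewer sketch of the output branch at the end of `processSFMText` for the `-f` case: a book whose code is the placeholder `"000"` produces no output, the special book `XXE` is written as TSV, and every other book is written as SFM. The function name `chooseOutputFormat` and the `OutputFormat` type are hypothetical; the real code calls `sfm.convertToTSV` and `sfm.convertToSFM` directly, and also writes a JSON copy when debug mode is on.

```typescript
// Hypothetical labels for the three outcomes of the -f branch.
type OutputFormat = "none" | "tsv" | "sfm";

// Mirrors the book-code checks in processSFMText for a single SFM file.
function chooseOutputFormat(bookCode: string): OutputFormat {
  if (bookCode === "000") {
    // Placeholder book: nothing valid to write.
    return "none";
  }
  if (bookCode === "XXE") {
    // Special SFM file written to TSV.
    return "tsv";
  }
  // Any other valid book goes back out as SFM.
  return "sfm";
}
```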

/**
* Take a JSON file and make an SFM file
@@ -350,18 +425,21 @@ async function processJSON(filepath: string){

/**
* Write JSON file (for testing purposes).
- * Filename will be [##][XYZ][Project name].json
+ * If filename not provided, it will be [##][XYZ][Project name].json
* ## - 2-digit book number
* XYZ - 3 character book code
* Project name - Paratext project name
* @param {books.bookType} bookObj - the book object to write to file
+ * @param {filename} string - filename to write.
*/
- function writeJSON(bookObj: books.objType) {
+ function writeJSON(bookObj: books.objType, filename : string = '') {
if (debugMode) {
// Add leading 0 if book number < 10
const padZero = bookObj.header.bookInfo.num < 10 ? '0' : '';
- const filename = padZero + bookObj.header.bookInfo.num +
+ if (filename == '') {
+ filename = padZero + bookObj.header.bookInfo.num +
bookObj.header.bookInfo.code + bookObj.header.projectName + '.json';
+ }
fs.writeFileSync('./' + filename, JSON.stringify(bookObj, null, 2));
console.info(`Writing out "${filename}"`);
}
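
Reviewer sketch of the filename rule documented above: when no filename is passed, the name is built as `[##][XYZ][Project name].json`, with a leading zero for book numbers below 10. The helper names `defaultJsonFilename` and `resolveJsonFilename` are hypothetical, and the book number, code, and project name in the usage lines are sample values rather than data from this PR.

```typescript
// Builds [##][XYZ][Project name].json, padding single-digit book numbers.
function defaultJsonFilename(num: number, code: string, projectName: string): string {
  const padZero = num < 10 ? "0" : "";
  return padZero + num + code + projectName + ".json";
}

// Uses the caller's filename when given, otherwise falls back to the default.
function resolveJsonFilename(
  num: number,
  code: string,
  projectName: string,
  filename: string = ""
): string {
  return filename === "" ? defaultJsonFilename(num, code, projectName) : filename;
}

// Sample values, assumed for illustration.
const fallbackName = resolveJsonFilename(98, "XXE", "ABC"); // "98XXEABC.json"
const paddedName = resolveJsonFilename(1, "GEN", "ABC"); // "01GENABC.json"
const explicitName = resolveJsonFilename(98, "XXE", "ABC", "notes.json"); // "notes.json"
```

Separating the name computation from the file write keeps the rule easy to test without touching the file system.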