Handling Unicode quotation marks in Chinese and English references #184

Open
zwz opened this issue Jan 26, 2024 · 7 comments

Comments

@zwz

zwz commented Jan 26, 2024

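While converting bibliography formats with bibutils, I noticed that it turns the ASCII quotation marks ' and " into the curly quotes ‘, ’, “, ” (the Pinyin input method on macOS produces the same quotes).

I then found that these quotes end up looking full-width in the PDF, regardless of whether the entry is Chinese or English.
For English entries this looks ugly and out of place.

Could automatic handling of these quotes be added?

Here is an MWE: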

\documentclass[]{ctexart}
\usepackage[backend=biber,
  style=gb7714-2015]{biblatex}
\begin{filecontents}[force,noheader]{\jobname.bib}
@article{A01,
  author = {Author, A.},
  year = {2001},
  title = {“Test” User’ Manual},
  journaltitle = {A journal},
  number = {1},
  pages = {1--4},
}
@article{A02,
  author = {张三},
  year = {2001},
  title = {“测试”手册},
  journaltitle = {测试},
  number = {1},
  pages = {1--4},
}
\end{filecontents}
\addbibresource{\jobname.bib}
\begin{document}

\fullcite{A01,A02}

\end{document}
@hushidong
Owner

hushidong commented Jan 26, 2024 via email

@zwz
Author

zwz commented Jan 27, 2024

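Really? I checked in WPS, and there they appear to be half-width.

[Screenshot, 2024-01-27 9:37 AM]

I then inspected two of these quote characters in Emacs. Their character information is as follows: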

            character: “ (displayed as “) (codepoint 8220, #o20034, #x201c)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x201C
               script: symbol
               syntax: . 	which means: punctuation
             category: .:Base, <:Not at eol, c:Chinese, h:Korean, j:Japanese
             to input: type "C-x 8 RET 201c" or "C-x 8 RET LEFT DOUBLE QUOTATION MARK"
          buffer code: #xE2 #x80 #x9C
            file code: #xE2 #x80 #x9C (encoded by coding system utf-8)
              display: by this font (glyph code):
    mac-ct:-*-Menlo-normal-normal-normal-*-13-*-*-*-m-0-iso10646-1 (#x6CC)



            character: ’ (displayed as ’) (codepoint 8217, #o20031, #x2019)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x2019
               script: symbol
               syntax: . 	which means: punctuation
             category: .:Base, >:Not at bol, c:Chinese, h:Korean, j:Japanese
             to input: type "C-x 8 RET 2019" or "C-x 8 RET RIGHT SINGLE QUOTATION MARK"
          buffer code: #xE2 #x80 #x99
            file code: #xE2 #x80 #x99 (encoded by coding system utf-8)
              display: by this font (glyph code):
    mac-ct:-*-Menlo-normal-normal-normal-*-13-*-*-*-m-0-iso10646-1 (#x6C9)

@zepinglee
Contributor

Unicode curly quotes have no full-width or half-width variants; the glyph that is displayed depends only on the font.

@zwz
Author

zwz commented Jan 27, 2024

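> Unicode curly quotes have no full-width or half-width variants; the glyph that is displayed depends only on the font.

Right, I thought so too.
There is no reason for bibutils to emit full-width quotes.

I have posted the detailed information for these characters above.
Could they be handled automatically based on their code points, separately for Chinese and English entries?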

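@zwz zwz changed the title from "Handling quotation marks in Chinese and English references" to "Handling Unicode quotation marks in Chinese and English references" Jan 27, 2024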
@zepinglee
Contributor

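> I have posted the detailed information for these characters above. Could they be handled automatically based on their code points, separately for Chinese and English entries?

The language of an entry can be set, but the main obstacle is that the underlying xeCJK and luatex-ja cannot yet conveniently switch the font used for curly quotes between the Chinese and Western fonts.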

@hushidong
Owner

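It is still better to specify them explicitly:

For Chinese entries, type the full-width quotes directly with the input method: “, ”, ‘, ’
For English entries, use the half-width symbols from the default English input: ``, ", `, '

Although `` and ` are also mapped to U+201C and U+2018, they are set in the default Western font. The other characters are distinguished correctly.

For example: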

\documentclass[]{ctexart}
\usepackage[backend=biber,
  style=gb7714-2015]{biblatex}
\begin{filecontents}[force,noheader]{\jobname.bib}
@article{A01,
  author = {Author, A.},
  year = {2001},
  title = {“Test” User’ Manual}, #“=U+201C ”=U+201D ‘=U+2018 ’=U+2019
  journaltitle = {A journal},
  number = {1},
  pages = {1--4},
}

@article{A03,
  author = {Author, A.},
  year = {2001},
  title = {``Test" User' Manual}, #"=U+0022 '=U+0027
  journaltitle = {A journal},
  number = {1},
  pages = {1--4},
}

@article{A04,
  author = {张三},
  year = {2001},
  title = {``测试"手册},
  journaltitle = {测试},
  number = {1},
  pages = {1--4},
}

@article{A02,
  author = {张三},
  year = {2001},
  title = {“测试”手册},
  journaltitle = {测试},
  number = {1},
  pages = {1--4},
}
\end{filecontents}
\addbibresource{\jobname.bib}
\begin{document}
\nocite{*}

\printbibliography

\end{document}

Result:

[image]

After switching to a different font:



\documentclass[]{ctexart}
\usepackage[concrete]{fontsetup} 
\usepackage[backend=biber,
  style=gb7714-2015]{biblatex}
\begin{filecontents}[force,noheader]{\jobname.bib}
@article{A01,
  author = {Author, A.},
  year = {2001},
  title = {“Test” User’ Manual}, #“=U+201C ”=U+201D ‘=U+2018 ’=U+2019
  journaltitle = {A journal},
  number = {1},
  pages = {1--4},
}

@article{A03,
  author = {Author, A.},
  year = {2001},
  title = {``Test" User' Manual}, #"=U+0022 '=U+0027
  journaltitle = {A journal},
  number = {1},
  pages = {1--4},
}

@article{A04,
  author = {张三},
  year = {2001},
  title = {``测试"手册},
  journaltitle = {测试},
  number = {1},
  pages = {1--4},
}

@article{A02,
  author = {张三},
  year = {2001},
  title = {“测试”手册},
  journaltitle = {测试},
  number = {1},
  pages = {1--4},
}
\end{filecontents}
\addbibresource{\jobname.bib}
\begin{document}
\nocite{*}

\printbibliography

\end{document}

Result:

[image]
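
For `.bib` files that already contain curly quotes (for example, bibutils output), the convention above could perhaps be applied automatically with a biber source map. The following is only a sketch under several assumptions: the biblatex version supports the `notmatch` step option, the map does not conflict with any source map the style itself declares, and an entry counts as "English" when its `title` contains no Han character. Only the `title` field is handled here.

```latex
\usepackage[backend=biber,
  style=gb7714-2015]{biblatex}
% Sketch: rewrite curly quotes to TeX quote input in titles without Han characters.
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map[overwrite]{
      % Leave this entry alone if its title contains any Han character.
      \step[fieldsource=title, notmatch=\regexp{\p{Han}}, final]
      % U+201C, U+201D -> `` and ''
      \step[fieldsource=title, match=\regexp{\x{201C}}, replace=\regexp{``}]
      \step[fieldsource=title, match=\regexp{\x{201D}}, replace=\regexp{''}]
      % U+2018, U+2019 -> ` and '
      \step[fieldsource=title, match=\regexp{\x{2018}}, replace=\regexp{`}]
      \step[fieldsource=title, match=\regexp{\x{2019}}, replace=\regexp{'}]
    }
  }
}
```

With such a map, an entry like A01 would be read as if its title had been typed with ASCII quote input, as in A03, while A02 would be left unchanged. Other fields such as `booktitle` or `journaltitle` would need their own steps.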

@zepinglee
Contributor

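That is how xeCJK is currently designed.

I recall someone proposing that the font mapping of these characters be changed whenever the language is switched through a mechanism such as babel (that is, babel compatibility), but that would probably require modifying a lot of low-level code.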
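
Short of per-language switching, one document-wide workaround is to move the curly quotes out of xeCJK's full-width punctuation classes so that they are set in the Western font. This is a minimal sketch, assuming XeLaTeX with xeCJK loaded through ctex and the class names `HalfLeft` and `HalfRight` from the xeCJK manual. The trade-off is that quotes in Chinese text also switch to the Western font.

```latex
\documentclass{ctexart} % compile with XeLaTeX so that ctex loads xeCJK
% Reclassify the curly quotes as Western (half-width) punctuation.
% This applies to the whole document, Chinese text included.
\xeCJKDeclareCharClass{HalfLeft}{"2018, "201C}
\xeCJKDeclareCharClass{HalfRight}{"2019, "201D}
\begin{document}
“Test” User’s Manual

“测试”手册
\end{document}
```

This fits a bibliography that is mostly English. For a mostly Chinese one, the default classes plus ASCII quote input in the English entries, as suggested earlier in the thread, is probably the simpler choice.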
