oezi
Posts: 4
Joined: Wed Dec 02, 2009 10:42 pm

Copy and Paste Character Substitution

Post by oezi »

I would like to improve how text copied from my PDF comes out when pasted. In particular, I have trouble with

Code: Select all

``quotation''
and ligatures fi, ff, etc.

I am using fontenc with T1, which looks nice, but when I copy from the following MWE

Code: Select all

\documentclass{article}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc} 
\usepackage[english]{babel}
\usepackage{lmodern}   
\begin{document} 
``finding the buffet''
\end{document}
and paste into a UTF-8 text editor I get:

Code: Select all

000000019  E2 80 9C EF AC 81 6E 64-69 6E 67 20 74 68 65 20   |“ﬁnding the |
000000029  62 75 EF AC 80 65 74 E2-80 9D                     |buﬀet”       |
But I would much rather get

Code: Select all

0000000C1  22 66 69 6E 64 69 6E 67-20 74 68 65 20 62 75 66   |"finding the buf|
0000000D1  66 65 74 22                                       |fet"            |
I am not sure whether it is possible to include in the PDF a second character encoding that is used only for copying.

Just to clarify:
* Ligatures look beautiful and I want to keep them
* Only copy and paste could be better
* Acrobat is smart about it and resolves the letters, but other readers have more problems.

Any ideas?
Last edited by oezi on Fri Dec 04, 2009 9:35 pm, edited 1 time in total.

localghost
Site Moderator
Posts: 9201
Joined: Fri Feb 02, 2007 12:06 pm

Copy and Paste Character Substitution

Post by localghost »

Try the cmap and/or microtype packages.
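
As a minimal sketch of that suggestion, applied to the MWE from the first post: this assumes the document is compiled with pdfLaTeX, and the placement of cmap before fontenc follows the cmap documentation as I understand it. Whether ligatures then paste as separate letters still depends on the contents of the installed t1.cmap and on the PDF viewer. microtype is loaded only because it was mentioned; it does not itself add copy/paste mappings. Its \DisableLigatures command would remove the ligatures from the printed output too, which is not what was asked for.

```latex
\documentclass{article}
\usepackage{cmap}              % load early, before fontenc, so the character maps get attached to the fonts
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[english]{babel}
\usepackage{lmodern}
\usepackage{microtype}         % optional: micro-typography only, not needed for cmap to work
\begin{document}
``finding the buffet''
\end{document}
```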


Best regards and welcome to the board
Thorsten

¹ System: openSUSE 42.2 (Linux 4.4.52), TeX Live 2016 (vanilla), TeXworks 0.6.1
oezi
Posts: 4
Joined: Wed Dec 02, 2009 10:42 pm

Re: Copy and Paste Character Substitution

Post by oezi »

Thank you Thorsten!

That did the trick!
Last edited by oezi on Mon Dec 07, 2009 1:44 pm, edited 1 time in total.
oezi
Posts: 4
Joined: Wed Dec 02, 2009 10:42 pm

Copy and Paste Character Substitution

Post by oezi »

Just in case anybody has the same problem:

Put

Code: Select all

\usepackage{cmap}
at the top of your LaTeX preamble, before fontenc is loaded. To change `` and '' to ", edit t1.cmap in your LaTeX installation by adding the following two lines:

Code: Select all

<10> <0022>
<11> <0022>
While Acrobat does not seem to care, evince and Foxit now copy "straight" quotes.
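
If editing a file inside the TeX installation is undesirable, a possible alternative is to set the mappings from within the document using pdfTeX's own glyph-to-Unicode mechanism. The following is an untested sketch, not what was used in this thread. It assumes pdfLaTeX, uses glyphtounicode in place of cmap, and assumes that a later \pdfglyphtounicode line overrides the default entry loaded from glyphtounicode.tex. The glyph names quotedblleft and quotedblright are the usual names for the glyphs in T1 slots 10 and 11.

```latex
\documentclass{article}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[english]{babel}
\usepackage{lmodern}
% Assumption: pdfTeX engine; these primitives are not available in XeTeX.
\input{glyphtounicode}     % default glyph-name -> Unicode table shipped with pdfTeX
\pdfgentounicode=1         % write ToUnicode maps into the PDF
% Assumed overrides: make both curly double quotes copy as U+0022 (")
\pdfglyphtounicode{quotedblleft}{0022}
\pdfglyphtounicode{quotedblright}{0022}
\begin{document}
``finding the buffet''
\end{document}
```

The printed output keeps the curly quotes and ligatures; only the text extracted by the viewer should change, and results may again differ between Acrobat, evince and Foxit.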
localghost
Site Moderator
Posts: 9201
Joined: Fri Feb 02, 2007 12:06 pm

Copy and Paste Character Substitution

Post by localghost »

oezi wrote: Thank you Thomas! […]
Who is Thomas?
oezi
Posts: 4
Joined: Wed Dec 02, 2009 10:42 pm

Re: Copy and Paste Character Substitution

Post by oezi »

:shock: Sorry Thorsten! Of course I meant you! :D