Since a .tex file is already compatible with essentially any text editor, you may be wondering why anyone would need to convert one to .txt format. I would have wondered the same thing, until I ran into the need myself.
The issue was that I had a nicely formatted .tex document, an article that I had translated into a foreign language with the aid of my computer. I had one fairly competent translator check the machine translation and make some corrections, and I thought I was ready to go. Then another translator had a look and offered to make further improvements, to which I gladly assented. This is where I ran into a problem with the nicely formatted file.
This translator, although well qualified and fairly capable in technical matters, was not at all familiar with TeX/LaTeX formatting. So I couldn't really give him the document for correcting in the format that suited me best (.tex). At the same time, getting the corrected text back from him in a format like .pdf or .doc would further complicate my task of returning it to its nicely formatted .tex state. Thus, I decided that .txt would be the most neutral format in which to give the translator the computer-translated text for further revision. But how to do that?
Well, I had already created a PDF of the document, so I had that to work from. Using sed or awk to strip all the TeX formatting codes out of the .tex source would be an option for someone far better versed in those utilities than I am. But even that might prove a fairly involved and time-consuming task.
Some online searching revealed another possible solution: the utility pdftotext. It seemed worth a try.
Sure enough, running pdftotext file.pdf file.txt gave quite good results. There were a few anomalies I needed to clean up, but not many. I'd say the whole process took about 15 minutes in total, after which I had a .txt version of this 5k-word file that I could submit to the translator.
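I didn't keep a list of the anomalies, so the following is only a rough sketch of how that sort of cleanup could be automated, not a record of what I actually fixed. It assumes two typical quirks of pdftotext output: the form-feed characters it writes at page breaks, and words hyphenated across line ends by LaTeX. The function name clean_pdftotext_output is made up for illustration.

```python
import re


def clean_pdftotext_output(text: str) -> str:
    """Tidy plain text produced by pdftotext (assumed anomalies only)."""
    # pdftotext marks page breaks with a form-feed character; drop them.
    text = text.replace("\f", "")
    # Rejoin words split by a hyphen at a line end: "trans-\nlation" -> "translation".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of three or more newlines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Trim surrounding whitespace and end with exactly one newline.
    return text.strip() + "\n"
```

Be aware that the hyphen rule also joins genuine compounds that happen to break at a line end (well-\nqualified would become wellqualified), so the result still deserves a read-through by eye.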
So, in the unlikely event that you need to convert your .tex file to .txt, I can recommend the routine of first converting it to .pdf, then converting the resulting file to .txt. I should mention that this file didn't contain graphics, tables, or charts, so I can't vouch for how well the method works on files containing more complex elements like those. Probably, the less complex the document, the more successfully it will convert. So there you have it: a method for converting .tex to .pdf to .txt.
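For anyone who wants to script the two steps, here is a minimal sketch in Python. It assumes the PDF is built with pdflatex (any other TeX engine would do) and that both pdflatex and pdftotext are on the PATH. The function names conversion_commands and convert are my own invention for this example.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def conversion_commands(tex_path: str) -> list[list[str]]:
    """Return the two commands: .tex -> .pdf, then .pdf -> .txt."""
    tex = Path(tex_path)
    pdf = tex.with_suffix(".pdf")
    txt = tex.with_suffix(".txt")
    return [
        # Assumption: pdflatex is the engine used to build the PDF.
        ["pdflatex", "-interaction=nonstopmode", tex.name],
        # Same invocation as above: pdftotext file.pdf file.txt
        ["pdftotext", pdf.name, txt.name],
    ]


def convert(tex_path: str) -> None:
    """Run both steps inside the directory that holds the .tex file."""
    workdir = Path(tex_path).resolve().parent
    for cmd in conversion_commands(tex_path):
        # check=True stops the pipeline if either tool reports an error.
        subprocess.run(cmd, cwd=workdir, check=True)
```

Calling convert("article.tex") should leave article.pdf and article.txt next to the source file. A document with cross-references may need more than one pdflatex pass, which this sketch does not attempt.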