This series is written by a representative of the latter group, which is comprised mostly of what might be called "productivity users" (perhaps "tinkerly productivity users?"). Though my lack of training precludes me from writing code or improving anyone else's, I can, nonetheless, try to figure out creative ways of utilizing open source programs. And again, because of my lack of expertise, though I may be capable of deploying open source programs in creative ways, my modest technical acumen hinders me from utilizing those programs in what may be the optimal ways. The open-source character of this series, then, consists in my presentation to the community of open source users and programmers of my own crude and halting attempts at accomplishing computing tasks, in the hope that those who are more knowledgeable than me can offer advice, alternatives, and corrections. The desired end result is the discovery, through a communal process, of optimal and/or alternate ways of accomplishing the sorts of tasks that I and other open source productivity users need to perform.
Showing posts with label txt. Show all posts

Saturday, October 19, 2024

TeX/LaTeX: tex to txt

Since a .tex file is already essentially compatible with all text editors, you may be wondering why anyone would need to do any sort of conversion to .txt format. I would also have wondered why such a task would need to be done--that is, until I ran into the need myself.

But I did run into such a need. The issue was that I had a nicely formatted .tex document--an article, actually, that I had translated, with the aid of my computer, into a foreign language. I had one fairly competent translator check the machine translation over and do some corrections and thought I was ready to go. Then another translator had a look and offered to do further improvements, to which I gladly assented. This is where I ran into a problem with the nicely formatted file.

This translator, although quite well-qualified and fairly capable in matters technical, was nonetheless not at all familiar with TeX/LaTeX formatting. So I couldn't really give him the document for correcting in the format that was optimal for me (.tex). At the same time, getting the corrected text back from him in a format like .pdf or .doc would further complicate my task of returning it to its nicely formatted .tex state. Thus, I decided that .txt would be the most neutral format to use for providing the translator with the computer-translated text for further revision. But how to do that?

Well, I had already created a pdf of the document, so I had that to work from. Using sed or awk to strip out all the TeX formatting codes would be an option for someone far better versed in those utilities than I am. But even that might prove a fairly involved and time-consuming task.
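For anyone curious what the sed route might look like, here is a minimal sketch. The function name strip_tex is made up for illustration, and the patterns are an assumption about what a simple article contains: they handle only comments, \begin/\end lines, and one-argument commands such as \section{...} or \emph{...}. Nested braces, math, and macros with several arguments would need considerably more work, which is rather the point made above.

```shell
# strip_tex: hypothetical helper, a crude filter and not a real TeX parser.
# Reads LaTeX on stdin and writes roughly plain text to stdout.
strip_tex() {
  sed -E \
    -e 's/(^|[^\\])%.*$/\1/' \
    -e 's/\\(begin|end)\{[^}]*\}//g' \
    -e 's/\\[A-Za-z]+\*?(\[[^]]*\])?\{([^}]*)\}/\2/g' \
    -e 's/\\[A-Za-z]+\*?//g' \
    -e 's/\\([%&$#_])/\1/g' \
    -e 's/[{}]//g'
  # In order: drop unescaped % comments, drop \begin{..}/\end{..},
  # keep only the argument of \cmd[opt]{arg}, drop bare \cmd,
  # unescape \% \& \$ \# \_, and remove any leftover braces.
}

# Possible usage:
#   strip_tex < file.tex > file.txt
```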

Some on-line searching revealed another possible solution: it involved using the utility pdftotext. It seemed worth a try.

Sure enough, running pdftotext file.pdf file.txt actually gave quite good results. There were a few anomalies I needed to clean up, but they were fairly few in number. I'd say the whole process took about 15 minutes total, after which I had a .txt version of this 5k-word file that I could submit to the translator.
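Some of the clean-up can probably be automated. The sketch below is only a guess at typical leftovers, not a record of the anomalies in this particular file: pdftotext puts a form-feed character between pages, and extracted text often contains runs of blank lines. The helper name clean_txt is hypothetical.

```shell
# clean_txt: hypothetical helper that tidies pdftotext output read from stdin.
clean_txt() {
  # Delete form-feed page breaks, then squeeze runs of blank lines into one.
  tr -d '\f' | cat -s
}

# Possible usage (giving "-" as the output file sends the text to stdout;
# -layout asks pdftotext to keep the original physical layout):
#   pdftotext -layout file.pdf - | clean_txt > file.txt
```

pdftotext also has a -nopgbrk switch that suppresses the form feeds at the source, which may make the tr step unnecessary.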

So, in the unlikely event that you need to convert a .tex file to .txt, I can recommend the routine of first converting it to .pdf, then converting the resulting file to .txt. I should mention that this file didn't contain graphics, tables, or charts of any sort, so I can't vouch for how the method would work on files containing more complex elements such as those. Probably, the less complex the document is, the more successfully it will convert. So there you have it: a method for converting .tex to .pdf to .txt.
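The two steps could be wrapped in a small shell function along these lines. This is a sketch under assumptions: the name tex_to_txt is invented, pdflatex is assumed to be the engine that builds the document (the post does not say which one was used), and the .tex file is assumed to sit in the current directory.

```shell
# tex_to_txt: hypothetical wrapper for the .tex -> .pdf -> .txt routine.
# Assumes pdflatex and pdftotext are installed and the file is in the
# current directory.
tex_to_txt() {
  base="${1%.tex}"
  [ -f "$base.tex" ] || { echo "no such file: $base.tex" >&2; return 1; }
  # Build the PDF; nonstopmode keeps pdflatex from pausing on errors.
  pdflatex -interaction=nonstopmode "$base.tex" >/dev/null || return 1
  # Extract the plain text from the freshly built PDF.
  pdftotext "$base.pdf" "$base.txt"
}

# Possible usage:
#   tex_to_txt article.tex    # leaves article.pdf and article.txt behind
```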

Saturday, April 25, 2020

Another imagemagick trick: txt to png

I ran across an article some time ago that discussed exporting text to an image format using imagemagick's convert command. Although it looked interesting, I wasn't sure at the time whether it would actually be useful to me in any real-world way. Since I'm now considering various schemes for making paper back-ups of some sensitive data, I've begun investigating this means of making an image from text again. So I thought I'd provide a bit of documentation here and just make a comment or two on some recent tests.

So the current iteration of the command I've been testing is

convert -size 800x -pointsize 6 caption:@my-data.txt my-data.png

The -size switch performs the obvious function of setting the image size, or at least its width. The -pointsize switch defines the size of font that will be used in the text that will appear in the image--in the case of this example, a very small font. I'm sure font face can be specified as well, though I am not experimenting with that at this time.
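For what it's worth, the font face is set with the -font switch, and convert -list font shows the names available on a given system. The wrapper below is only a sketch: the name txt_to_png is made up, and DejaVu-Sans-Mono in the usage line is an assumed font name that may not exist on every machine. On ImageMagick 7 the same arguments should be given to magick rather than convert.

```shell
# txt_to_png: hypothetical wrapper around the command shown above.
#   $1 = text file, $2 = output image, $3 = optional font name
txt_to_png() {
  if [ -n "$3" ]; then
    # Same command with an explicit font face added.
    convert -size 800x -font "$3" -pointsize 6 caption:@"$1" "$2"
  else
    convert -size 800x -pointsize 6 caption:@"$1" "$2"
  fi
}

# Possible usage (check the font name first with: convert -list font):
#   txt_to_png my-data.txt my-data.png DejaVu-Sans-Mono
```

A monospaced face seems a sensible choice for data that is laid out in columns.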

In the example given here, the name of a text file is specified. Long lines are split by the program according to the width specified, but if no width is specified the width of the image will correspond to the longest line length. The output of a command can also be directed to an image. Slightly different syntax from what is seen in this example would need to be used in that case, of course.
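As a sketch of the command-output case: writing @- in place of @filename tells convert to read the text from standard input, so a command can be piped straight into it. The function name cmd_to_png is hypothetical, and label is used here instead of caption so that lines are kept as the command printed them.

```shell
# cmd_to_png: hypothetical helper that renders a command's output as an image.
#   $1 = output image, remaining arguments = the command to run
cmd_to_png() {
  out="$1"
  shift
  # "@-" makes convert read the text from stdin; 2>&1 also captures any
  # error messages the command prints.
  "$@" 2>&1 | convert -pointsize 6 label:@- "$out"
}

# Possible usage:
#   cmd_to_png listing.png ls -l
```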

Another convert text input that works similarly to caption is label. It seemed to me that caption was more relevant to the task I was experimenting with, since a larger amount of text could be involved and caption wraps long lines to the given width, which label does not.

The experiments I've been conducting are for the possible purpose of making a paper back-up of the sensitive data. An image file of the data could be created, printed, then the image file erased.
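The create, print, erase sequence might be scripted roughly as follows. This is a sketch under several assumptions that are not from my tests: paper_backup is an invented name, lp is assumed to be the printing command (as on CUPS systems), and shred from GNU coreutils is assumed for the erasing step. Note that shred's overwriting may not be effective on some filesystems and on SSDs.

```shell
# paper_backup: hypothetical helper for the create/print/erase workflow.
#   $1 = text file holding the data (the text file itself is left alone)
paper_backup() {
  img="${1%.txt}.png"
  # Render the text to an image, send it to the default printer,
  # then overwrite and remove the image file.
  convert -size 800x -pointsize 6 caption:@"$1" "$img" &&
    lp "$img" &&
    shred -u "$img"
}

# Possible usage:
#   paper_backup my-data.txt
```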

Finally, I recently discovered that there is a fork of imagemagick called graphicsmagick. I have not looked into that very deeply or used it so far. But I will be investigating further.

For reference I got my original introduction to this neat feature from an article at https://www.ostechnix.com/save-linux-command-output-image-file/

More can be found at http://www.imagemagick.org/Usage/text/