Field Notes of an Audacious Amateur: A second addendum to the second installment

I really do have things other than outlining with nano to write about. Really I do. For example, there's the project of installing the Tinycore distribution on some older machines in our computer lab to write about--something I did about a month ago and about which I've already started an article; there's an article about the newsbeuter rss client; one about how to set a weather map as the desktop background; and so forth. But I've gone on kind of a jag with this nano project lately, and it's complex and foreign enough to me that if I don't record it now, I'm liable to forget important details. So, you're forced to endure another installment on it. :)

What follows may be the last entry on this topic for a while--we'll see whether any enhancements will be forthcoming soon. If not, I'll probably move on to some of those other important writing projects after this entry. Anyway, on to the topic at hand.

With help from some generous folks over at the linux questions forum, I now have three scripts to share. All the scripts--one written in perl, one in python, and one in awk--add mark-up to the nano outlines I create, mark-up that causes them, once pdflatex has been run on them, to transform into documents that print nicely on paper. I'll paste the code for each of the three scripts below, both so that I'll be less likely to forget how all this works and for the possible benefit of others who want to do something similar to what I'm doing. I'll start with the perl script.

The following perl script will add TeX/LaTeX mark-up to the outline files I create with nano. The code can be seen in the following graphic (thanks, blogspot, for making it such a PITA to post code snippets here that I have to upload graphic files in order to show them, and sincere thanks to formatmysourcecode for providing the real and effective solution evident below):

#!/usr/bin/perl
# run outl2tex.pl as follows: outl2tex.pl file.outl > file.tex

print <<END
\\documentclass[14pt]{extarticle}
\\usepackage{cjwoutl}
\\usepackage[top=1in,bottom=1in,left=1in,right=1in]{geometry}
\\usepackage{underlin}
\\setlength{\\headsep}{12pt}
\\pagestyle{myheadings}
\\markright{\\today{\\hfill ***Header*title*here***\\hfill}}
\\linespread{1.3} % gives 1.5 line spacing
\\pagestyle{underline}
\\begin{document}
\\begin{outline}[new] 
END
;

while (<>) {
  s/^(\t*)=(.*)/"$1\\outl{".((length $1) + 1)."}$2"/e;
  print;
}

print <<END
\\end{outline}
\\end{document} 
END
;

(the code pasting above was made easy and possible by the source code formatter available at http://formatmysourcecode.blogspot.com/)

As will be obvious, I've named the script outl2tex.pl. As should also be clear, the script is to be run by issuing the command outl2tex.pl name_of_outl_file.outl > name_of_tex_file.tex
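
For anyone who, like me, finds the one-line perl substitution a bit dense, here is a rough sketch of what I understand it to be doing, written out in python. The function name mark_up and the sample lines are my own inventions for illustration and are not part of the script above: a line that starts with zero or more tabs followed by an equals sign gets the equals sign swapped for \outl{N}, where N is the number of tabs plus one, and every other line passes through untouched.

```python
import re


def mark_up(line):
    """Swap a leading '=' bullet (after optional tabs) for \\outl{N}."""
    def repl(match):
        tabs, rest = match.group(1), match.group(2)
        # No tabs means level 1, one tab means level 2, and so on.
        return "{0}\\outl{{{1}}}{2}".format(tabs, len(tabs) + 1, rest)
    return re.sub(r"^(\t*)=(.*)", repl, line)


if __name__ == "__main__":
    # Hypothetical sample lines, just to show the effect.
    for sample in ["=Top level", "\t=Second level", "\t\t=Third level", "plain text"]:
        print(mark_up(sample))
```

Lines that don't start with the "pseudo-bullet" are left exactly as they were, which is why ordinary text under a bullet survives the conversion.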

The next script--actually scripts--I'll show are the python ones. Why two? Well, it seems python is a rapidly-developing scripting language--something I found out when trying to run the first script that was created for me: it returned an error on my machine. At the same time, it ran fine on the machine of the fellow who created it.

Well, as it turns out, I have version 3.2.2 installed on my machine, while he developed the script on a machine that had version 2.6 on it. So, the following script is the one he originally wrote and that ran on his computer, and one which should work for python version 2.x (he wrote it for 2.6, I ran it successfully with version 2.7, but whether it works for all 2.x versions I cannot say for certain).

#!/usr/bin/python
# for use with versions 2.x of python
# run outl2tex.py as follows: outl2tex.py file.outl > file.tex

import sys
import re

if( len( sys.argv ) != 2 ):
    print >> sys.stderr, "{0} requires one filename to process.".format( sys.argv[0].split('/')[-1] )
    sys.exit( 1)

try:
    rawOutline = open( sys.argv[1], 'r' )
except:
    print >> sys.stderr, "Unable to open {0} for reading".format( sys.argv[1] )
    sys.exit( 2 )

print ( '\\documentclass[14pt]{extarticle}\n'
        '\\usepackage{cjwoutl}\n'
        '\\usepackage[top=1in,bottom=1in,left=1in,right=1in]{geometry}\n'
        '\\usepackage{underlin}\n'
        '\\setlength{\\headsep}{12pt}\n'
        '\\pagestyle{myheadings}\n'
        '\\markright{\\today{\\hfill ***Header*title*here***\\hfill}}\n'
        '\\linespread{1.3} % gives 1.5 line spacing\n'
        '\\pagestyle{underline}\n'
        '\\begin{document}\n'
        '\\begin{outline}[new]\n' )

for inputLine in rawOutline:
    reMatches = re.match( r"(\t*)=(.*)", inputLine )
    if( reMatches == None ):
        print inputLine.rstrip()
    else:
        tabCount = len( reMatches.group(1).split('\t') )
        print "{0}\\outl{{{1:d}}}{2}".format( reMatches.group(1), tabCount, reMatches.group(2) )

print ( '\\end{outline}\n'
        '\\end{document}\n' )

This script is run in essentially the same way as the perl one.

Next, the python 3.x script. As with the 2.x version listed above, I've confirmed that this one runs using version 3.2.2, though I am uncertain whether it can be expected to run under all 3.x versions of python.

#!/usr/bin/python
# for use with versions 3.x of python
# run outl2tex.py as follows: outl2tex.py file.outl > file.tex

import sys
import re

if( len( sys.argv ) != 2 ):
    print ( "{0} requires a filename to process.".format( sys.argv[0].split('/')[-1] ), file=sys.stderr )
    sys.exit( 1 )

try:
    rawOutline = open( sys.argv[1], 'r' )
except:
    print ( "Unable to open {0} for reading".format( sys.argv[1] ), file=sys.stderr )
    sys.exit( 2 )

print ( '\\documentclass[14pt]{extarticle}\n'
        '\\usepackage{cjwoutl}\n'
        '\\usepackage[top=1in,bottom=1in,left=1in,right=1in]{geometry}\n'
        '\\usepackage{underlin}\n'
        '\\setlength{\\headsep}{12pt}\n'
        '\\pagestyle{myheadings}\n'
        '\\markright{\\today{\\hfill ***Header*title*here***\\hfill}}\n'
        '\\linespread{1.3} % gives 1.5 line spacing\n'
        '\\pagestyle{underline}\n'
        '\\begin{document}\n'
        '\\begin{outline}[new]\n' )

for inputLine in rawOutline:
    reMatches = re.match( r"(\t*)=(.*)", inputLine )
    if( reMatches == None ):
        print ( inputLine.rstrip() )
    else:
        tabCount = len( reMatches.group(1).split('\t') )
        print ( "{0}\\outl{{{1:d}}}{2}".format( reMatches.group(1), tabCount, reMatches.group(2) ) )

print ( '\\end{outline}\n'
        '\\end{document}\n' )
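
As far as I can tell, the only things that differ between the two python scripts are the print statements. What follows is my own hedged sketch, not something the original author wrote, of how one file might serve both versions: the line from __future__ import print_function gives python 2.6 and later the print() function and does nothing on 3.x. The names convert and complain are hypothetical, and I've left out the preamble and the file handling to keep the sketch short.

```python
from __future__ import print_function  # harmless on 3.x; gives 2.6+ the print() function

import re
import sys

BULLET = re.compile(r"(\t*)=(.*)")


def convert(lines):
    """Return the outline lines with each '=' bullet turned into \\outl{N}."""
    converted = []
    for line in lines:
        match = BULLET.match(line)
        if match is None:
            converted.append(line.rstrip())
        else:
            level = len(match.group(1)) + 1  # one tab deeper means one level deeper
            converted.append("{0}\\outl{{{1:d}}}{2}".format(
                match.group(1), level, match.group(2)))
    return converted


def complain(message):
    """Write an error message to stderr under either major version."""
    print(message, file=sys.stderr)


if __name__ == "__main__":
    # Hypothetical sample outline standing in for a real .outl file.
    for out in convert(["=Groceries\n", "\t=Milk\n", "a plain line\n"]):
        print(out)
```

I have not tried this on every 2.x or 3.x release, so treat it as a starting point rather than a tested replacement for the two scripts above.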

Finally, here's the awk script, and it's a long 'un (unfortunately this script is now broken owing to some updates I made to the template file):

#!/usr/bin/awk -f
#
# -v tab=8
#       set tab stops at every eight columns (the default).
#
# -v template=template.tex
#       set the path to the LaTeX template file.
#
# run outl2tex.awk as follows: outl2tex.awk file.outl > file.tex

# Convert tabs to spaces.
function detab(detab_line) {

    if (length(tabsp) != tab) {
    }

    while ((detab_pos = index(detab_line, "\t")) > 0)
        detab_line = substr(detab_line, 1, detab_pos - 1) substr(tabsp, detab_pos % tab) substr(detab_line, detab_pos + 1)

    return detab_line
}

BEGIN {
    # Set tab width to default, unless set on the command line.
    if (tab < 1)
        tab = 8

    # Set template name to default, unless set on the command line.
    if (length(template) < 1)
        template = "template.tex"

    # Record separator is a newline, including trailing whitespace.
    RS = "[\t\v\f ]*(\r\n|\n\r|\r|\n)"

    # Field separator is consecutive whitespace.
    FS = "[\t\v\f ]+"

    # Configuration -- parsed from magic comments.
    split("", config)
    config["tab"] = tab
    config["template"] = template

    # We are not working on anything yet.
    template = ""
    header = ""
    footer = ""
    split("", outline)
    outline[0] = 1
    maxspaces  = 0
    CURR = ""
}

CURR != FILENAME {

    # Empty line?
    if ($0 ~ /^[\t ]*$/)
        next        

    # Configuration comment?
    if ($0 ~ /^%[\t ]*[A-Za-z][0-9A-Za-z]*[\t ]*:/) {
        name = $0
        sub(/^%[\t ]*/, "", name)
        sub(/[\t ]*:.*$/, "", name)
        value = $0
        sub(/^[^:]*:[\t ]*/, "", value)

        # Make the name case-insensitive.
        temp = name
        name = ""
        for (i = 1; i <= length(temp); i++) {
            c = substr(temp, i, 1)
            uc = toupper(c)
            lc = tolower(c)
            if (uc != lc)
                name = name "[" uc lc "]"
            else
                name = name c
        }

        config[name] = value
        next
    }

    # Comment line (skipped)?
    if ($0 ~ /^[\t ]*%/)
        next

    # This is the first line of actual content.
    CURR = FILENAME

    # Set up tabs as currently specified.
    tab = int(config["tab"])
    tabsp = "                "
    while (length(tabsp) < tab)
        tabsp = tabsp tabsp
    tabsp = substr(tabsp, 1, tab)

    # Have we used a template yet?
    if (length(template) < 1) {
        # No, read it.
        template = config["template"]
        if (length(template) < 1) template = "-"
        OLDRS = RS
        RS = "(\r\n|\n\r|\r|\n)"

        while ((getline line < template) > 0) {
            # Content marker line?
            if (line ~ /^[\t\v\f ]*[Cc][Oo][Nn][Tt][Ee][Nn][Tt][\t\v\f ]*$/)
                break

            # Outline level definition?
            if (line ~ /^%[\t ]*\\outl{/) {
                level = line
                sub(/^[^{]*{/, "", level)
                sub(/}.*$/, "", level)
                level = int(level)

                line = detab(line)
                sub(/\\.*$/, "", line)
                sub(/%/, "", line)
                spaces = length(line)
                outline[spaces] = level
                if (spaces > maxspaces)
                    maxspaces = spaces
                continue
            }

            # Default value definition?
            if (line ~ /^%[\t ]*[A-Z][0-9A-Za-z]*:/) {
                name = line
                sub(/^%[\t ]*/, "", name)
                sub(/[\t ]*:.*$/, "", name)
                value = line
                sub(/^[^:]*:[\t ]*/, "", value)

                # Make the name case-insensitive.
                temp = name
                name = ""
                for (i = 1; i <= length(temp); i++) {
                    c = substr(temp, i, 1)
                    uc = toupper(c)
                    lc = tolower(c)
                    if (uc != lc)
                        name = name "[" uc lc "]"
                    else
                        name = name c
                }

                # If not in config already, set.
                if (!(name in config))
                    config[name] = value
                continue
            }

            # Comment line?
            if (line ~ /^[\t ]*%/)
                continue

            # Ordinary header line. Remove comment.
            sub(/[\t ]%.*$/, "", line)
            header = header line "\n"
        }

        # The rest belongs to footer.
        while ((getline line < template) > 0)
            footer = footer line "\n"

        close(template)
        RS = OLDRS

        # Fill in the outline levels.
        level = outline[0]
        for (spaces = 1; spaces < maxspaces; spaces++)
            if (spaces in outline)
                level = outline[spaces]
            else
                outline[spaces] = level

        # Replace all known ~Name~ in the template.
        for (name in config) {
            gsub("~" name "~", config[name], header)
            gsub("~" name "~", config[name], footer)
        }

        # Replace all other ~Name~ entries in the template with empty strings.
        gsub(/~[A-Z][0-9A-Za-z]*~/, "", header)
        gsub(/~[A-Z][0-9A-Za-z]*~/, "", footer)

        # Emit the template.
        printf("%s", header)
    }
}

/^[\t ]*=/ {
    line = $0
    prefix = index(line, "=") - 1

    # Indentation size in spaces.
    spaces = length(detab(substr(line, 1, prefix)))

    # Find out the outline level for this indentation.
    if (spaces > maxspaces)
        level = outline[maxspaces]
    else
        level = outline[spaces]

    # Add outline level definition.
    line = substr(line, 1, prefix) "\\outl{" level "}" substr(line, prefix + 2)

    printf("%s\n", line)
    next
}

{   printf("%s\n", $0)
}

END {
    printf("%s", footer)
}

As you'll note, this script is quite a bit more complex than the others. It expects, for example, that the TeX template is located in the directory in which the script is run, since it reads values from there (neither the perl nor the python scripts expect such a template file). If it doesn't find the template, it won't process the .outl file.

Furthermore, certain variables can be entered on the command line with this script--unlike the others: the number of spaces that make up a tab space can be fed to it, or an alternate name for the template file--even the title can be set as a command line option using -v title="Title". The title can likewise be entered as a commented line at the top of the outline file after the fashion % Title: A Nice Outline (the percentage sign is how commented lines are to be formed using TeX/LaTeX mark-up).
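
To make the commented-line idea a little more concrete, here is a small python sketch of how such "magic comments" at the top of an outline file might be picked out. This is my own approximation and not a translation of the awk script: the function name read_config and the sample lines are hypothetical, and where the awk script builds case-insensitive patterns out of the names, I simply lower-case them.

```python
import re

# A configuration comment looks like "% Name: value".
CONFIG_LINE = re.compile(r"^%[\t ]*([A-Za-z][0-9A-Za-z]*)[\t ]*:[\t ]*(.*)$")


def read_config(lines, defaults=None):
    """Collect '% Name: value' settings found before the first content line."""
    config = dict(defaults or {})
    for line in lines:
        stripped = line.rstrip()
        if not stripped.strip():
            continue  # skip empty lines
        match = CONFIG_LINE.match(stripped)
        if match:
            config[match.group(1).lower()] = match.group(2)
            continue
        if stripped.lstrip().startswith("%"):
            continue  # an ordinary comment, ignored
        break  # first line of real content: stop looking
    return config


if __name__ == "__main__":
    # Hypothetical outline header, for illustration only.
    sample = ["% Title: A Nice Outline", "% tab: 4", "=First point"]
    print(read_config(sample, {"tab": "8"}))
```

Values given in the file override the defaults passed in, which roughly mirrors the way a setting in the outline file takes the place of the built-in tab width.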

That about wraps up this second addendum to outlining with nano. The only other addition I can make for the time being is to point out that I've slightly modified the TeX/LaTeX preamble for my outlines, adding in the header, instead of the author's name, mark-up that will print there today's date (\today)--information more apropos for my purposes. See below the slightly modified template:

\documentclass[14pt]{extarticle}
\usepackage{cjwoutl}
\usepackage[top=1in,bottom=1in,left=1in,right=1in]{geometry}
\usepackage{underlin}
\setlength{\headsep}{12pt}
\pagestyle{myheadings}
\markright{\today{\hfill ***Header*title*here***\hfill}}
\linespread{1.3} % gives 1.5 line spacing
\pagestyle{underline} %this along with underlin package give hrule in header
\begin{document}
\begin{outline}[new]
%\outl{1}
%       \outl{2}
%               \outl{3}
%                       \outl{4}
%                               \outl{5}
%                                       \outl{6}
%                                               \outl{7}
%                                                       \outl{8}
%                                                               \outl{9}
%                                                                       \outl{10}
\end{outline}
\end{document}

I've also altered slightly my .nanorc file so that it includes my preferred "pseudo-bullet"--the equals sign. What that means is that any line that does not begin with the equals sign--regardless of whether it's indented or not--does not get color highlighting applied. Lines that do begin with the equals sign--indented or not--do get color highlighting applied. The relevant section of the updated .nanorc is:

syntax "outl" "\.outl$"
color brightwhite "(^)=.*$"
color brightred "(^[[:blank:]])=.*$"
color brightgreen "(^[[:blank:]]{2})=.*$"
color magenta "(^[[:blank:]]{3})=.*$"
color brightblue "(^[[:blank:]]{4})=.*$"
color brightyellow "(^[[:blank:]]{5})=.*$"
color cyan "(^[[:blank:]]{6})=.*$"
color brightred "(^[[:blank:]]{7})=.*$"
color brightgreen "(^[[:blank:]]{8})=.*$"
color magenta "(^[[:blank:]]{9})=.*$"
color brightblue "(^[[:blank:]]{10})=.*$"

That's pretty much it. The only other modification that might be done involves quotation marks: since TeX/LaTeX expects, not " but `` (double back-ticks) as the opening quotation marks, my files produce the wrong sort of open quotation mark (back-facing instead of front-facing). If one of the above scripts could be modified to detect the opening quotation marks and replace them with the needed double back-ticks, it would make processing of the outline files more complete (ugh, I've just realized I have a similar problem with single quotes or what are sometimes called apostrophes). Otherwise, a search-and-replace will be in order prior to running pdflatex on the converted outlines.
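
Here, for what it's worth, is a rough python sketch of the sort of quotation-mark fix I have in mind. It is only a guess at a workable rule and none of the three scripts does this: a quotation mark at the start of the text, or right after whitespace or an opening bracket, is treated as an opening mark, and any other double quotation mark as a closing one. The function name texify_quotes is hypothetical, and nested or otherwise unusual quoting will surely fool it.

```python
import re


def texify_quotes(text):
    """Turn straight quotation marks into TeX-style opening and closing marks."""
    # Opening double quote: at the start, or after whitespace or an opening bracket.
    text = re.sub(r'(^|[\s(\[{])"', r"\1``", text)
    # Every remaining double quote is taken to be a closing one.
    text = text.replace('"', "''")
    # Opening single quote, same rule; apostrophes and closing quotes stay as '.
    text = re.sub(r"(^|[\s(\[{])'", r"\1`", text)
    return text


if __name__ == "__main__":
    # Hypothetical sample sentence.
    print(texify_quotes('He said "hello" and it wasn\'t a \'joke\' at all'))
```

Something along these lines could be applied to each line just before it is printed in either python script, though I haven't tried wiring it in.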

Ya know, come to think of it, working with TeX/LaTeX is a real PITA. The final output is stunning, and no word processor I've ever dealt with comes anywhere close to matching it aesthetically. But it's still a PITA. And that's gonna have to do it for now.

Thursday, February 2, 2012
