Table Manipulations

Some table manipulation programs, listed below. Most convert a table from one format to another.

HTML makes a good output format. You can get an ASCII pretty-print by piping the HTML to lynx, e.g. tsv2html table.tsv | lynx -stdin -dump
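As a rough illustration of the TSV-to-HTML conversion described above, here is a minimal Python sketch. It is not the code of tsv2html.py or any other script listed here; the function name tsv_to_html and the header parameter are hypothetical, and it assumes plain tab-separated input with no quoting.

```python
import html


def tsv_to_html(lines, header=True):
    """Convert an iterable of tab-separated lines into an HTML table string.

    Assumption: when header is True, the first line holds column headings
    and is emitted with th cells; every other line uses td cells.
    """
    out = ["<table>"]
    for i, line in enumerate(lines):
        cells = line.rstrip("\n").split("\t")
        tag = "th" if header and i == 0 else "td"
        # Escape &, < and > so cell text cannot break the markup.
        row = "".join(f"<{tag}>{html.escape(c)}</{tag}>" for c in cells)
        out.append(f"<tr>{row}</tr>")
    out.append("</table>")
    return "\n".join(out)
```

Calling print(tsv_to_html(sys.stdin)) after import sys would turn this into a filter whose output can be piped to lynx -stdin -dump in the same way.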

columns.awk
columns.py
columns.rb
csv2tsv.c
csv2tsv.rb
csv2tsv2.c
db2tsv.awk
fillHTMLfromTSV.awk
fs.awk
qTable.awk
tsv2html
tsv2html.awk
tsv2html.pl
tsv2html.py
tsv2html.rb
tsv2html.sed
tsv2html3.awk
tsv2htmlplus.awk
unquote.awk
rdb2html.awk
tsv2rdb.awk
tsv2txt.awk
txt2tsv.awk

The Perl program, csv2tsv.pl, does not work, as can be seen by testing it with test.csv. Compare its output with:
./csv2tsv <test.csv | unquote.awk -F \\t OFS=\\t | cat -vt
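For comparison, a CSV-to-TSV conversion that copes with quoted fields can be sketched in Python with the standard csv module. This is only an illustration, not a port of csv2tsv.c or unquote.awk. The function name csv_to_tsv is hypothetical, and replacing embedded tabs and newlines with a space is an assumption about the desired output.

```python
import csv
import io
import re


def csv_to_tsv(text):
    """Convert CSV text to TSV text, removing the CSV quoting.

    Assumption: a run of tabs, carriage returns, or newlines inside a
    field is replaced by one space, so each record stays on one line
    and the field count is preserved.
    """
    # newline="" lets the csv module see embedded line breaks unchanged.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = []
    for row in reader:
        cleaned = [re.sub(r"[\t\r\n]+", " ", field) for field in row]
        out.append("\t".join(cleaned) + "\n")
    return "".join(out)
```

Quoted commas, doubled quotes, and line breaks inside quotes are the cases worth checking against test.csv.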

For other text formats, see ESR's The Art of Unix Programming.

Metadata

Simple tables contain just data in rows and columns. Metadata can be introduced in several ways. One could consider HTML tags to be metadata, but the tr and td tags alone are not really metadata. Attribute values in HTML tags could contain metadata.

The simplest and most common bit of metadata is a row of column headings in the first line. Some utilities, like cut, paste, and even join, still work with such files. A heading line breaks sort, but try head -1 file.tsv; sed '1d' file.tsv | sort -t"^I" ... (where ^I stands for a literal tab character). Many of the above scripts are designed to allow or even expect such headings. The th HTML tag can be used for them.

Other metadata is sometimes introduced with a #, which is interpreted as a comment to be ignored by shell scripts, Perl, Ruby, and other languages. RFC 822 style headers or #%key=value lines can be used to store a hash of metadata. The above scripts handle neither convention. See also Relation ASCII.
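The metadata conventions described above could be handled along the following lines. This is a minimal sketch under stated assumptions, not something the listed scripts do. The names read_table and sort_rows are hypothetical. It assumes that #%key=value lines carry metadata, that other # lines are comments, and that the first remaining line is the heading row.

```python
def read_table(lines):
    """Split TSV lines into (metadata, header, rows).

    Assumptions: '#%key=value' lines are metadata, other '#' lines are
    comments, blank lines are skipped, and the first remaining line
    holds the column headings.
    """
    meta, header, rows = {}, None, []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith("#%"):
            key, _, value = line[2:].partition("=")
            meta[key.strip()] = value.strip()
        elif line.startswith("#"):
            continue  # plain comment, ignored
        elif header is None:
            header = line.split("\t")
        else:
            rows.append(line.split("\t"))
    return meta, header, rows


def sort_rows(header, rows, column):
    """Sort data rows by a named column; the heading row is never sorted."""
    idx = header.index(column)
    return sorted(rows, key=lambda r: r[idx])
```

Keeping the heading row apart from the data rows plays the same role as the head/sed trick: the headings stay on top while only the data is sorted.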


Eric@BlossomAssociates.Net 2005-12-02