X BitMaps: extract image data and display in ascii terminal
Intro
While being distracted from my previous distractions from an earlier distraction (fonts), I was intrigued by: [Great 202 Jailbreak - Computerphile]
The report [a Summer Vacation: Digital Restoration and Typesetter Forensics] included a link to [archive made available of Martin W. Guy's backup to tape from the 80s], where the authors found some data they used either directly or to confirm their earlier guesses about construction of the document. This appears to have taken about 6-8 weeks of work to rebuild one printed report from various information they were able to find or still had in hand. But I digress.
Within the archive index were images described as: Mike Hawley's collection of tiny X bitmaps (Dec 1988), including: [Brian Kernighan].
Unknown image type
After clicking the extension-less file I saw:
#define bwk_width 48
#define bwk_height 48
static char bwk_bits[] = {
 0x00, 0x00, 0xc0, 0x3f, 0x00, 0x00, 0x00, 0x00, 0xf8, 0xea, 0x01, 0x00, ...
Hoping to find an application that could display this source code, I saved it to disk and tried file:
bwk.image.c_source: ASCII text
Seeing that this was C source code, I assumed it was meant to be compiled directly into a larger C application. What I could have done was attempt to identify the file with:
test | result |
---|---|
ffprobe | bwk.image.c_source: Invalid data found when processing input |
gimp | bwk.image.c_source' failed: Unknown file type |
imageinfo | XBM X Windows system bitmap (black and white) 1850 8 48x48 (using: imageinfo --format --fmtdscr --size --depth --geom bwk.image.c_source) |
imagemagick identify | XBM 48x48 48x48+0+0 8-bit sRGB 2c 1.85KB 0.000u 0:00.000 |
Python workout
Since none of the things I tried had made me any the wiser, I considered starting a C app that would include the file, plus some code to view it somehow. Expanding my Python skills was more important, so I began looking at the structure of the file to plan how to proceed:
- read the file
- get the width
- get the height
- get the image data
- transform / feed into an image creation library to create a png/bmp, whatever was easiest
- not knowing the file format, I decided to also grab the name (bwk) from the defines, assuming that a file could define more than one image, in which case you need to pick the matching defines and data for a single image
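The steps in that plan can be sketched roughly as follows. This is only an illustration, not the code I ended up with: the function names parse_xbm and xbm_to_ascii are made up for the example, it assumes the X11 flavour of the format (one byte per 0x.. value, rows padded to whole bytes, lowest bit of each byte being the leftmost pixel), and it renders to text instead of feeding an image library.

```python
import re


def parse_xbm(text):
    """Return (name, width, height, data) from XBM source text.

    Hypothetical helper; assumes a single image per file.
    """
    width = re.search(r'#define\s+(\w+)_width\s+(\d+)', text)
    height = re.search(r'#define\s+(\w+)_height\s+(\d+)', text)
    bits = re.search(r'(\w+)_bits\s*\[\]\s*=\s*\{(.*?)\}', text, re.DOTALL)
    if not (width and height and bits):
        raise ValueError('does not look like an XBM file')
    # Collect every 0x.. token between the braces as one byte.
    tokens = re.findall(r'0[xX][0-9a-fA-F]+', bits.group(2))
    data = bytes(int(tok, 16) for tok in tokens)
    return width.group(1), int(width.group(2)), int(height.group(2)), data


def xbm_to_ascii(width, height, data, on='#', off='.'):
    """Render the bitmap as text, one character per pixel."""
    row_bytes = (width + 7) // 8  # each row is padded to a whole byte
    lines = []
    for y in range(height):
        row = data[y * row_bytes:(y + 1) * row_bytes]
        # Bit 0 of each byte is the leftmost of its eight pixels.
        lines.append(''.join(
            on if (row[x // 8] >> (x % 8)) & 1 else off
            for x in range(width)))
    return '\n'.join(lines)
```

Calling parse_xbm on the file contents and passing the width, height and data on to xbm_to_ascii gives a string that can simply be printed in the terminal.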
Python development environment
Half the problem is finding and setting up a dev environment that speeds up development. I started with: python3, gedit, gnome-terminal, firefox (google, python manual, stackoverflow). Hacking involved trying stuff in the python3 interpreter, and then copy-pasting it into my code.py in gedit.
Later I started using bluefish editor, with a custom command for python:
gnome-terminal --geometry=100x50+1200+0 --working-directory='%c' -e "bash -c \"python3 '%f'; read -n1 junk\""
Clicking Python starts the terminal, with the correct directory, starts python3 with the file in the editor, and pauses the terminal output until a key is pressed - necessary to see interpreter messages and my hacking output.
file read
Getting the text of the file into a string in memory was easy:
# fhand = open('bwk.image.c_source.txt')  # the other input file I tried
with open('test.xbm') as fhand:  # closes the file once it has been read
    sDataRaw = fhand.read()
regular expressions
I learnt a lot about regexes by using the re module, and then the extended regex module, to detect conforming file content. The [regex builder/tester] was useful. At first I tried to match the two #define lines and extract the match group data, leaving the pixel data for a second regex.
import re
...
# Python's re module has no POSIX classes such as [[:alpha:]], so ASCII ranges are used here
matchobject = re.search(r'.*#define ([A-Za-z]{1,3})_([A-Za-z]{4,6}) ([0-9]{1,2}).*', sDataRaw)
if matchobject:
    print(matchobject)
However, this would only show the first match. I extended this to match the overall file structure, extracting: imagename1, metric1, value1, imagename2, metric2, value2, imagename3, followed by data that looked like a C array of 0xab hex values. Since I needed multiple matches, I switched to the regex library instead:
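As an aside, the standard library can also hand back every match rather than only the first, via re.finditer. A minimal sketch, assuming plain ASCII ranges in place of POSIX classes (which re does not understand); the names DEFINE and find_defines are invented for this example:

```python
import re

# One pattern for a single "#define name_metric value" line.
DEFINE = re.compile(r'#define ([A-Za-z]{1,8})_([A-Za-z]{4,6}) ([0-9]{1,4})')


def find_defines(text):
    """Return a (name, metric, value) tuple for every #define that matches."""
    return [(m.group(1), m.group(2), int(m.group(3)))
            for m in DEFINE.finditer(text)]
```

Run over the header of the bwk file, this returns one tuple for the width define and one for the height define.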
import regex
...
pattern = regex.compile(
    r'#define ([[:alpha:]]{1,8})_([[:alpha:]]{4,6}) ([[:digit:]]{1,2})\n'
    r'.*#define ([[:alpha:]]{1,8})_([[:alpha:]]{4,6}) ([[:digit:]]{1,2})'
    r'.*static char ([[:alpha:]]{1,8})_bits\[\] *= *'
    r'.*[, \n0x[:xdigit:]]+\};',
    regex.DOTALL)