Arthur Tabachneck, Thornhill, Ontario
William Klein, Toronto, Ontario
Orlando, Florida, April 22-25, 2012
I Know Where You Were Last Summer
or
If a Picture is Worth a Thousand Words, How Much is a Picture+Its Metadata Worth?
We’ve all heard the expression "A picture is worth a thousand words"
[Comic strip shown on this slide during the presentation]
Copyright: Bronnian Comics. Reprinted with permission – Torstar Syndication Services
The comic strip on the last screen was actually a jpg file
What if you wanted to use SAS to create a web page that contained
a collection of your pictures
in the order that the pictures were taken
with links that, when clicked, would bring up Google Earth to show where the pictures were taken
Not difficult to do if you are familiar with
using a pipe to find all of the files
the Exif and JFIF formats used for most jpg files
writing code to read hexadecimal characters
writing code to read big/little endian numbers
using functions to find strings
using base SAS to create html files
using the Google Map API
Would you like to know how to build such pages, quickly and easily, plus gain some new skills?
all of the necessary code is provided and explained in this paper
"Excuse me. Is this the Society for Asking Stupid Questions?"
A jpg file is comprised of one really long record
Hex: ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 60
Pos: 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16
Hex: 00 60 00 00 ff e1 01 0c 45 78 69 66 00 00 49 49
Pos: 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Hex: 2a 00 08 00 00 00 02 00 25 88 04 00 01 00 00 00
Pos: 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
Hex: 92 00 00 00 9b 9c 01 00 6c 00 00 00 26 00 00 00
Pos: 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
Hex: 00 00 00 00 43 00 68 00 75 00 63 00 6b 00 6c 00
Pos: 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80

What the highlighted bytes mean:
ff e0 = application use marker
4a 46 49 46 = JFIF tag (JPEG File Interchange Format)
45 78 69 66 = Exif tag (Exchangeable Image File Format)
49 49 = endian identifier, 6 characters past the 1st character of the string Exif; this is Exif relative position 0 (position 31)
08 00 00 00 = location of number of Exif tags, i.e., position 31+'08'x or 31+8=39
02 00 = number of Exif tags
25 88 = GPS (Global Positioning System) Exif tag
92 00 = location of GPS (Global Positioning System) info, i.e., position 31+'92'x or 31+146=177
9b 9c = title tag
6c 00 = title length, i.e., '6c'x or 108 characters
26 00 = location of title info, i.e., position 31+'26'x or 31+38=69
C h u c k l = title

Endian identifier: 49 49 = little endian, 4d 4d = big endian
08 00 00 00 read as big endian = 134,217,728; read as little endian = 8
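The position arithmetic above can be checked with a few lines of SAS. This is only a sketch: it rebuilds the first 64 bytes of the sample record from the hex dump, assumes an ASCII session (so that 'Exif' matches 45 78 69 66), and hard-codes the little endian informats because the sample's endian identifier is 49 49. The variable names are made up for the illustration and are not the ones used in the program.

```sas
data _null_;
  /* first 64 bytes of the sample record, copied from the hex dump */
  rec='ffd8ffe000104a464946000101010060'x||
      '00600000ffe1010c4578696600004949'x||
      '2a000800000002002588040001000000'x||
      '920000009b9c01006c00000026000000'x;

  /* Exif relative position 0: 6 characters past the E in Exif */
  base=index(rec,'Exif')+6;                                  /* 25+6=31 */

  /* 4-byte offset, stored 4 bytes past the base, of the tag count */
  count_pos=base+input(substr(rec,base+4,4),pibr4.);         /* 31+8=39 */
  n_tags=input(substr(rec,count_pos,2),pibr2.);              /* 2 */

  /* each tag entry is 12 bytes: tag(2) type(2) count(4) value or offset(4) */
  tag1=input(substr(rec,count_pos+2,2),pibr2.);              /* 34853, GPS tag */
  gps_pos=base+input(substr(rec,count_pos+2+8,4),pibr4.);    /* 31+146=177 */

  tag2=input(substr(rec,count_pos+14,2),pibr2.);             /* 40091, title tag */
  title_len=input(substr(rec,count_pos+14+4,4),pibr4.);      /* 108 */
  title_pos=base+input(substr(rec,count_pos+14+8,4),pibr4.); /* 31+38=69 */

  put base= count_pos= n_tags= tag1= gps_pos= tag2= title_len= title_pos=;
run;
```

The values written to the log should match the positions listed above: 31, 39, 2 tags, 177 for the GPS info, and a 108-character title starting at position 69.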
What the program does
There are eight macro variables you can set to make the program:
display the page title you want
display the note you want shown
display a caption before picture titles
display specific text if a picture doesn’t have a title
display a picture number
include or exclude GPS hyperlinks
Plus you can specify
where your jpg files are located
the name of your output file
the command you want used to find your jpg files
This presentation, the code and a paper on the topic are all available on the past presentation’s page for December 9, 2011 at: www.torsas.ca
How the program works: indicate what you want by changing the macro variable assignments in the first eight lines
%let path=c:\;                  /* where your jpg files are located */
%let outfile=my_pictures.html;  /* the filename you want for your output */
%let caption=Caption;           /* the caption you want shown before picture titles */
%let webpagetitle=Our Meeting;  /* the title you want shown at the top of the web page */
%let webpagenote=;              /* the note you want shown below the web page title */
%let afterpicturenumber=yes;    /* whether you want numbers shown to the right of each picture */
%let defaultpicturetitle=;      /* what you want shown if a picture doesn't have a title */
%let includeGPSlinks=yes;       /* whether you want GPS links to be included or excluded */
If you’re not on Windows, modify the filename statement
filename indata pipe "dir &path.*.jpg /b";
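As a rough illustration of such a modification, here is a sketch for a Unix-like host. The folder name and the shell command are assumptions, not something taken from the presentation, and the session must be allowed to run operating system commands. The command changes into the folder first so that only bare file names come back, which is what "dir /b" returns on Windows and what the program expects when it prefixes each name with &path.

```sas
/* hypothetical folder on a Unix-like host; note the trailing slash */
%let path=/home/me/pictures/;

/* cd into the folder so that ls returns file names without the path */
filename indata pipe "cd &path. && ls *.jpg";

/* quick check: list what the pipe returns */
data _null_;
  infile indata truncover;
  input picture $100.;
  put picture=;
run;
```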
If you aren’t interested in how the code works, you already have everything you need.
i.e.,
1. if desired, modify the macro variables
2. if needed, change the filename statement
3. run the code
How the program works: a format is created to correctly input big and little endian numbers
proc format;
  value tendian
    18761='pibr.'
    19789='s370fpib.';
run;
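To see what the format buys you, the sketch below looks up the informat name for each endian identifier (18761 is '4949'x, 19789 is '4D4D'x) and then reads the same four bytes with it through INPUTN. The test bytes and the loop are made up for the illustration; only the format itself comes from the program.

```sas
proc format;
  value tendian 18761='pibr.' 19789='s370fpib.';
run;

data _null_;
  length endian $9;
  four_bytes='08000000'x;
  /* try both endian identifiers: 49 49 and 4d 4d */
  do marker='4949'x,'4D4D'x;
    /* the identifier, read as a 2-byte integer, selects the informat name */
    endian=put(input(marker,pibr2.),tendian.);
    /* read the same 4 bytes with the selected informat */
    value=inputn(four_bytes,endian,4);
    put endian= value=;
  end;
run;
```

With pibr. the four bytes should come back as 8, and with s370fpib. as 134217728, the two values shown earlier for little and big endian.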
How the program works: a datastep to parse the desired data
filename indata pipe "dir &path.*.jpg /b";
data want (keep=pi: dt: c: title w: he:);
  length fil2read title $420;
  format dt_taken datetime19.;
  format lat lon $1.;
  length coordinates $35;
  infile indata truncover;
  informat picture $100.;
  input picture &;
  fil2read="&path."||picture;
  done=0;
  infile dummy filevar=fil2read RECFM=n lrecl=15000 end=done;
How the program works: read first 15,000 characters of each file
  do while(not done);
    input VAR1 $char15000.;
    coordinates="No GPS";
    if length("&defaultpicturetitle") lt 2 then title=picture;
    else title="&defaultpicturetitle";
    pheight=0;
    pwidth=0;
    jheight=0;
    jwidth=0;
    portrait=0;
    dt_taken=0;
    positionx = index(var1,'Exif');
How the program works: parse locations and initialize variables
    if positionx gt 0 then do;
      positionx+6;
      endian=put(input(substr(var1,positionx,2),pibr2.),tendian.);
      numberx=inputn(substr(var1,positionx+
        inputn(substr(var1,positionx+4,4),endian,4),2),endian,2);
      offset=positionx+inputn(substr(var1,positionx+4,4),endian,4)+2;
      gps_offset=0;
      subject_offset=0;
      date_offset=0;
      last_offset=0;
      orientation=0;
How the program works: parse IFD0 tags
      do i=0 to numberx-1;
        xtag=inputn(substr(var1,offset+i*12,2),endian,2);
        if xtag eq 274 then orientation=
          inputn(substr(var1,offset+i*12+8,2),endian,2);
        else if xtag eq 34665 then do;
          numberx2L=positionx+inputn(substr(var1,offset+i*12+8,4),endian,4);
          numberx2=inputn(substr(var1,numberx2L,2),endian,2);
          last_offset=positionx+inputn(substr(var1,offset+i*12+8,4),endian,4)+2;
        end;
How the program works: parse IFD0 tags (continued)
        else if xtag eq 34853 then do;
          gps_bytes=inputn(substr(var1,offset+i*12+2,2),endian,2);
          if gps_bytes eq 2 then gps_offset=positionx+
            inputn(substr(var1,offset+i*12+10,2),endian,2);
          else gps_offset=positionx+
            inputn(substr(var1,offset+i*12+8,4),endian,4);
        end;
        else if xtag eq 40091 then do;
          title_offset=positionx+inputn(substr(var1,offset+i*12+8,4),endian,4);
          title_length=inputn(substr(var1,offset+i*12+4,4),endian,4);
        end;
      end;
How the program works: parse IFD tags
      if last_offset then do;
        do i=0 to numberx2-1;
          xtag=inputn(substr(var1,last_offset+i*12,2),endian,2);
          if xtag eq 36867 then date_offset=
            inputn(substr(var1,last_offset+i*12+8,4),endian,4);
          else if xtag eq 40962 then pwidth=
            inputn(substr(var1,last_offset+i*12+8,4),endian,4);
          else if xtag eq 40963 then pheight=
            inputn(substr(var1,last_offset+i*12+8,4),endian,4);
        end;
      end;
How the program works: get title info
      if 1<=title_offset<=15000-title_length then
        title=compress(inputc(substr(var1,title_offset,title_length),
          title_length),,"c");
How the program works: get Global Positioning System (GPS) info
      if 1<=gps_offset<=15000-24 then do;
        numberx=inputn(substr(var1,gps_offset,2),endian,2);
        offset=gps_offset+2;
        do i=0 to numberx-1;
          xtag=inputn(substr(var1,offset+i*12,2),endian,2);
          if xtag eq 1 then lat=input(substr(var1,offset+i*12+8,1),$1.);
          else if xtag eq 2 then do;
            /* the offset field is 4 bytes, so read all 4 */
            lat_offset=positionx+inputn(substr(var1,offset+i*12+8,4),endian,4);
            /* degrees, minutes and seconds are each stored as numerator/denominator */
            latdeg=inputn(substr(var1,lat_offset+ 0,4),endian,4)/
                   inputn(substr(var1,lat_offset+ 4,4),endian,4);
            latmin=inputn(substr(var1,lat_offset+ 8,4),endian,4)/
                   inputn(substr(var1,lat_offset+12,4),endian,4);
            latsec=inputn(substr(var1,lat_offset+16,4),endian,4)/
                   inputn(substr(var1,lat_offset+20,4),endian,4);
          end;
How the program works: get GPS info (continued)
          else if xtag eq 3 then lon=input(substr(var1,offset+i*12+8,1),$1.);
          else if xtag eq 4 then do;
            /* the offset field is 4 bytes, so read all 4 */
            lon_offset=positionx+inputn(substr(var1,offset+i*12+8,4),endian,4);
            londeg=inputn(substr(var1,lon_offset+ 0,4),endian,4)/
                   inputn(substr(var1,lon_offset+ 4,4),endian,4);
            lonmin=inputn(substr(var1,lon_offset+ 8,4),endian,4)/
                   inputn(substr(var1,lon_offset+12,4),endian,4);
            lonsec=inputn(substr(var1,lon_offset+16,4),endian,4)/
                   inputn(substr(var1,lon_offset+20,4),endian,4);
          end;
        end;
How the program works: get GPS info (continued) and get date info
        /* north and east are positive, south and west are negative */
        decimal_lat=ifn(lat eq "N",latdeg+(latmin*60+latsec)/3600,
                        -1*(latdeg+(latmin*60+latsec)/3600));
        decimal_lon=ifn(lon eq "E",londeg+(lonmin*60+lonsec)/3600,
                        -1*(londeg+(lonmin*60+lonsec)/3600));
        if nmiss(of dec:) lt 1 then
          coordinates=catx(",",decimal_lat,decimal_lon);
      end;
      if 1<=date_offset<=15000-19 then dt_taken=
        input(substr(var1,positionx+date_offset,19),anydtdtm19.);
    end;
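A small worked example of the degrees, minutes and seconds conversion may help. The coordinates below are invented for the illustration (roughly the Toronto area), and the sign handling is written as a multiplier rather than the two-branch form used in the program.

```sas
data _null_;
  /* made-up reading: 43 deg 48 min 36 sec N, 79 deg 25 min 12 sec W */
  lat='N'; latdeg=43; latmin=48; latsec=36;
  lon='W'; londeg=79; lonmin=25; lonsec=12;

  /* minutes*60+seconds gives seconds; dividing by 3600 gives a fraction of a degree */
  decimal_lat=ifn(lat eq 'N',1,-1)*(latdeg+(latmin*60+latsec)/3600);
  decimal_lon=ifn(lon eq 'E',1,-1)*(londeg+(lonmin*60+lonsec)/3600);

  put decimal_lat= decimal_lon=;
run;
```

Here 48*60+36=2916 seconds is 0.81 of a degree, so the latitude is 43.81; the longitude works out to 79.42 and is negated because it is west.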
How the program works: if necessary, get alternative height and width data
    if index(var1,'JFIF') then do;
      positionx = index(var1,'FFC0'x);
      if positionx gt 0 then do;
        jheight=input(substr(var1,positionx+5,2),s370fpib2.);
        jwidth=input(substr(var1,positionx+7,2),s370fpib2.);
        if jheight gt 0 and pheight le 0 then do;
          pheight=jheight;
          pwidth=jwidth;
        end;
        else if jheight gt 0 and pheight gt 0 then do;
          if input(substr(var1,3,2),s370fpib2.) eq 65504 then portrait=1;
        end;
      end;
    end;
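The offsets +5 and +7 follow from the layout of the segment that starts with ff c0: the 2-byte marker, a 2-byte segment length, a 1-byte precision, then the 2-byte height and the 2-byte width, always big endian. The sketch below builds a made-up 9-byte segment for a 450 by 600 picture to show the two reads; the byte values are invented for the illustration.

```sas
data _null_;
  /* made-up segment: marker ff c0, length 00 11, precision 08,
     height 01 c2 (450), width 02 58 (600) */
  seg='ffc0001108'x||'01c2'x||'0258'x;

  positionx=index(seg,'FFC0'x);
  jheight=input(substr(seg,positionx+5,2),s370fpib2.);
  jwidth=input(substr(seg,positionx+7,2),s370fpib2.);

  put jheight= jwidth=;
run;
```

The log should show jheight=450 and jwidth=600.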
How the program works: add caption and correct height and width
    if pheight eq 0 then do;
      pheight=450;
      pwidth=600;
    end;
    if upcase(substr(reverse(scan(picture,-2,".")),1,4)) eq "TREV"
      then portrait=1;
    title="&caption"||title;
    if portrait and pwidth gt pheight then do;
      holdw=pwidth;
      pwidth=pheight;
      pheight=holdw;
    end;
How the program works: if necessary, re-size pictures
    if pheight gt pwidth then do;
      height=pheight+(pheight*(600/pheight-1));
      width=600*(1+(pwidth/pheight-1));
    end;
    else if pwidth gt pheight then do;
      height=600*(1+(pheight/pwidth-1));
      width=pwidth+(pwidth*(600/pwidth-1));
    end;
    else do;
      height=600;
      width=600;
    end;
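Algebraically, the formulas above reduce to scaling the longer side to 600 pixels and shrinking the other side by the same factor. The sketch below shows that simplified form on a made-up 3000 by 4000 picture; the scale variable is introduced only for the illustration and is not part of the program.

```sas
data _null_;
  /* made-up landscape picture */
  pheight=3000;
  pwidth=4000;

  /* one factor that brings the longer side to 600 pixels */
  scale=600/max(pheight,pwidth);
  height=pheight*scale;
  width=pwidth*scale;

  put height= width=;
run;
```

For this picture the log should show height=450 and width=600, the same result the three-branch version gives.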
How the program works: loose ends and wrap up
    if dt_taken eq 0 then do;
      _dt_pattern_num=prxparse(
        "/\d\d\d\d\:\d\d\:\d\d\ \d\d\:\d\d\:\d\d/o");
      date_offset=prxmatch(_dt_pattern_num,var1);
      if date_offset then dt_taken=
        input(substr(var1,date_offset,19),anydtdtm19.);
    end;
    output;
    done=1;
  end;
run;
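The fallback search for a date can be tried on its own. The sketch below looks for the yyyy:mm:dd hh:mm:ss pattern in a made-up string and, as an alternative to the ANYDTDTM informat used in the program, builds the datetime from the individual pieces. The sample text and variable names are assumptions for the illustration.

```sas
data _null_;
  /* made-up record with a timestamp buried in it */
  rec='....'||'2011:07:04 12:30:00'||'....';

  /* same pattern as the program, written with a {4} quantifier */
  re=prxparse('/\d{4}:\d\d:\d\d \d\d:\d\d:\d\d/');
  pos=prxmatch(re,rec);

  if pos then do;
    stamp=substr(rec,pos,19);
    /* assemble the datetime from year, month, day, hour, minute and second */
    dt=dhms(mdy(input(substr(stamp,6,2),2.),
                input(substr(stamp,9,2),2.),
                input(substr(stamp,1,4),4.)),
            input(substr(stamp,12,2),2.),
            input(substr(stamp,15,2),2.),
            input(substr(stamp,18,2),2.));
    put pos= stamp= dt= datetime19.;
  end;
run;
```

In this example the match starts at position 5, right after the four filler characters.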
How the program works: sort the file by dates pictures were taken
proc sort data=want;
  by dt_taken;
run;
How the program works: create html file
data _null_;
  file "&path.&outfile";
  set want end=done;
  if _n_=1 then put
      '<html> '
    / '<head> '
    / "<title>&webpagetitle</title> "
    / '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    / '<style type="text/css">'
    / '<!-- '
    / '.style4 { '
    / ' font-size: 36px; '
    / ' color: #0033FF; '
    / '} '
    / '--> '
    / '</style> '
    / '<style type="text/css">'
    / '<!-- '
    / '.style2 { '
    / ' font-size: 20px; '
    / ' color: #0033FF; '
    / '} '
    / '--> '
    / '</style> '
    / '</head> '
    / '<body>'
    / '<p class="style4"> ' "&webpagetitle</p>"
    / '<p class="style2">' "&webpagenote</p>"
How the program works: create html file (continued)
    / '<table width="600" border="0" cellpadding="0" cellspacing="0">'
    / ' <tr><td>';
  if coordinates eq "No GPS" or upcase("&includeGPSlinks.") eq "NO" then
    put '<img src=' picture 'width="' width '" height="' height
        '"></td></tr><tr>';
  else
    put '<a href="http://maps.google.com/maps?f=q&source=s_q&hl=en&geocode=&q='
        coordinates '&aq=&ie=UTF8&t=f&z=16&vpsrc=0&ecpose='
        coordinates ',1025.22,0,0,0"> <img src=' picture
        'width="' width '" height="' height '"></a></td></tr><tr>';
  if upcase("&afterpicturenumber") eq "YES" then
    put '<td width=" ' width ' " style="color:blue">' title
        '<br/><br/></td><td width=" ' width ' " align="right"> ' _n_
        '<br/><br/></td></tr>';
  else
    put ' <td width=" ' width ' " style="color:blue">' title
        '<br/><br/></td></tr>';
  if done then put
    / '</table>'
    / '</body>'
    / '</html>';
run;
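For readers who only want the link-building idea, here is a stripped-down sketch that writes one hyperlink to the log. The coordinates are made up, and the shorter URL form (https://www.google.com/maps/search/?api=1&query=latitude,longitude) is an assumption swapped in for the longer query string used in the program. The pieces are kept in single quotes so that SAS does not try to resolve the ampersands as macro variables.

```sas
data _null_;
  length coordinates $35 link $200;

  /* made-up decimal coordinates, joined the same way the program does */
  coordinates=catx(',',43.81,-79.42);

  /* cats removes leading and trailing blanks, so no stray space lands inside the URL */
  link=cats('<a href="https://www.google.com/maps/search/?api=1&query=',
            coordinates,
            '">Where was this taken?</a>');

  put link;
run;
```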
Author Contact Information
Your comments and questions are valued and encouraged.
Contact the authors:
Arthur Tabachneck, Ph.D.
myQNA, Inc.
Thornhill, Ontario
e-mail: [email protected]
William Klein, Ph.D.
Toronto, Ontario
e-mail: [email protected]