1
The Latest Web Developments: How Do I Deploy Them?
Brian Kelly
UK Web Focus
Email Address: [email protected]
UKOLN
University of Bath
http://www.ukoln.ac.uk/
UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee of the Higher Education Funding Councils, as well as by project funding from the JISC's Electronic Libraries Programme and the European Union. UKOLN also receives support from the University of Bath where it is based.
2
Contents
• Background
• Web Developments:
– Data Formats
– Transport
– Addressing
– Metadata
• Deployment Issues
• Questions
Aims of Talk
• To give an overview of the Web architecture
• To review new web developments
• To address implementation models
3
Background
UK Web Focus:
• National web coordination post based at UKOLN, University of Bath
• Responsible for tracking web developments and informing and advising UK HE community
• Represents JISC on W3C
W3C (World Wide Web Consortium):
• International organisation responsible for coordinating web standards
• See <URL: http://www.w3.org/>
• See list of recommendations, working drafts and notes at <URL: http://www.w3.org/TR/>
4
Web and Standardisation
W3C
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member and public review
• UK members include JISC, UKERNA, Southampton and Bristol
IETF
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"
ISO
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produce robust standards
Proprietary
• De facto standards
• Often initially appealing (cf PowerPoint)
• May emerge as standards
[Diagram labels giving example protocols: PNG, HTML, Z39.50, Java? / PNG, HTML, HTTP / HTTP, URN / HTML extensions, PDF and Java?]
5
The Web Vision
Tim Berners-Lee's vision for the Web:
• Evolvability is critical
• Automation of information management: If a decision can be made by machine, it should
• All structured data formats should be based on XML
• Migrate HTML to XML
• All logical assertions to map onto RDF model
• All metadata to use RDF
See keynote talk at WWW 7 conference at <URL: http://www.w3.org/Talks/1998/0415-Evolvability/slide1-1.htm>
6
Web Protocols
Web initially based on three simple protocols:
• Data Formats: HTML (HyperText Markup Language) provides the data format for native documents
• Addressing: URLs (Uniform Resource Locators) provide an addressing mechanism for web resources
• Transport: HTTP (HyperText Transfer Protocol) defines transfer of resources between client and server
[Diagram: Data format (HTML), Addressing (URL), Transport (HTTP)]
7
HTML History
HTML 1.0: Unpublished specification.
HTML 2.0: Spec. based on innovations from NCSA (forms and inline images!)
HTML 3.0: Proposed spec. (renamed from HTML+). Very comprehensive. Failed to complete IETF standardisation. Little implementation experience.
Proprietary: Introduction of proprietary HTML elements by Netscape and Microsoft
HTML 3.2: Spec. based on description of mainstream innovations in marketplace
HTML 4.0: Current recommendation (1998)
[Timeline dates shown on slide: 1992, 1994, 1994-5, 1995, 1997, 1998]
Dilemma: Proprietary extensions cause problems. But experiments are needed.
8
HTML 4.0, CSS 2.0 and DOM
HTML 4.0 used in conjunction with CSS 2.0 (Cascading Style Sheets) and the DOM provides an architecturally pure, yet functionally rich environment
HTML 4.0 : W3C-Rec
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing
CSS 2.0 : W3C-Rec
• Support for all HTML formatting
• Positioning of HTML elements
• Multiple media support
CSS Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with known bugs
DOM : W3C-WD
• Document Object Model
• Hooks for scripting languages
• Permits changes to HTML & CSS properties and content (DHTML)
9
HTML Limitations
HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
– Time-consuming standardisation process (<ABBREV>)
– Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
– Covers specialist area (maths, music, ...)
– Application-specific (<STUD-NUM>)
• HTML is a display (output) format
• HTML's lack of arbitrary structure limits functionality:
– Find all memos copied to John Smith
– How many unique tracks on Jackson Browne CDs
10
XML
XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENT-NUMBER>, <PART-NO>, etc)
• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft, etc.)
• Various XML DTDs already agreed (MathML, CML)
• Support in Netscape 5 and IE 5
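To illustrate why arbitrary elements matter, here is a minimal Python sketch of the "find all memos copied to John Smith" query mentioned under HTML Limitations. The element names (MEMOS, MEMO, TO, CC, SUBJECT) and the sample data are invented for illustration and do not come from any agreed DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo markup: the element names are assumptions for this sketch.
MEMOS = """<MEMOS>
  <MEMO><TO>Ann Jones</TO><CC>John Smith</CC><SUBJECT>Budget</SUBJECT></MEMO>
  <MEMO><TO>John Smith</TO><SUBJECT>Timetable</SUBJECT></MEMO>
  <MEMO><TO>Pat Lee</TO><CC>John Smith</CC><CC>Ann Jones</CC><SUBJECT>Servers</SUBJECT></MEMO>
</MEMOS>"""


def memos_copied_to(xml_text: str, name: str) -> list:
    """Return the subjects of all memos that list `name` in a CC element."""
    root = ET.fromstring(xml_text)
    return [
        memo.findtext("SUBJECT")
        for memo in root.findall("MEMO")
        if any(cc.text == name for cc in memo.findall("CC"))
    ]


if __name__ == "__main__":
    print(memos_copied_to(MEMOS, "John Smith"))
```

The same question cannot be answered reliably from HTML, where the recipient and copy list would only be distinguishable by their visual formatting.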
11
XML Concepts
Well-formed XML resources:
• Make end-tags explicit: <LI>...</LI>
• Make empty elements explicit: <IMG .../>
• Quote attributes: <IMG SRC="logo" HEIGHT="20"/>
• Use consistent upper/lower case
Valid XML resources:
• Need DTD
XML Namespaces:
• Mechanism for ensuring unique XML elements:
  <?xml:namespace ns="http://foo.org/1998-001" prefix="i"?>
  <P>Insert <i:PART>M-471</i:PART></P>
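The well-formedness rules above can be checked mechanically. The following is a small sketch using the XML parser in Python's standard library; the sample strings are illustrative, and the check covers well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Illustrative samples, one pair per rule on the slide.
SAMPLES = {
    "implicit end-tags": "<UL><LI>One<LI>Two</UL>",
    "explicit end-tags": "<UL><LI>One</LI><LI>Two</LI></UL>",
    "unquoted attributes": "<IMG SRC=logo HEIGHT=20/>",
    "quoted attributes": '<IMG SRC="logo" HEIGHT="20"/>',
    "inconsistent case": "<P>text</p>",
    "consistent case": "<P>text</P>",
}

if __name__ == "__main__":
    for label, markup in SAMPLES.items():
        print(f"{label}: {is_well_formed(markup)}")
```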
12
XML Deployment
Ariadne issue 15 has article on "What Is XML?"
Describes how XML support can be provided:
• Natively by new browsers
• Back end conversion of XML to HTML
• Client-side conversion of XML to HTML / CSS
• Java rendering of XML
Examples of intermediaries
See http://www.ariadne.ac.uk/issue15/what-is/
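As a rough sketch of the back end conversion option, the Python below maps application-specific XML elements onto HTML elements before delivery. The element names and the mapping table are assumptions made for this example; a real service would more likely use a stylesheet language such as XSL.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from application-specific XML elements to HTML elements.
ELEMENT_MAP = {"MEMO": "div", "TITLE": "h1", "TO": "p", "BODY": "p"}


def render(element: ET.Element) -> str:
    """Render one XML element (and its children) as an HTML fragment."""
    tag = ELEMENT_MAP.get(element.tag, "span")  # unknown elements become <span>
    parts = [f'<{tag} class="{escape(element.tag.lower())}">']
    if element.text:
        parts.append(escape(element.text))
    for child in element:
        parts.append(render(child))
        if child.tail:
            parts.append(escape(child.tail))
    parts.append(f"</{tag}>")
    return "".join(parts)


def xml_to_html(source: str) -> str:
    """Convert an XML document string into an HTML fragment."""
    return render(ET.fromstring(source))


if __name__ == "__main__":
    print(xml_to_html("<MEMO><TITLE>Budget</TITLE><TO>John Smith</TO></MEMO>"))
```

The original element name is kept as a class attribute so that a style sheet can still distinguish the elements after conversion.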
13
XLink, XPointer and XSL
XLink will provide sophisticated hyperlinking missing in HTML:
• Links that lead user to multiple destinations
• Bidirectional links
• Links with special behaviors:
– Expand-in-place / Replace / Create new window
– Link on load / Link on user action
• Link databases
XPointer will provide access to arbitrary portions of XML resource. Interesting IPR issues!
XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents)
<commentary xml:link="extended" inline="false">
  <locator href="smith2.1" role="Essay"/>
  <locator href="jones1.4" role="Rebuttal"/>
  <locator href="robin3.2" role="Comparison"/>
</commentary>
14
Addressing
URLs (e.g. http://www.bristol-poly.ac.uk/depts/music/latest.html) have limitations:
• Lack of long-term persistency
– Organisation changes name
– Department / Product scrapped
– Directory structure reorganised
• Inability to support multiple versions of resources (mirroring)
URNs (Uniform Resource Names):
• Proposed as solution
• Difficult to implement (no W3C activity in this area)
15
Addressing - Solutions
DOIs (Digital Object Identifiers):
• Proposed by publishing industry as a solution
• Aimed at supporting rights ownership
• Business model needed
PURLs (Persistent URLs):
• Provide single level of redirection
Pragmatic Solution:
• URLs don't break - people break them
• Design URLs to have long life-span
Further Information:
<URL: http://www.ukoln.ac.uk/metadata/resources/urn/>
<URL: http://hosted.ukoln.ac.uk/biblink/wp2/links.html>
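The "single level of redirection" idea behind PURLs can be sketched as a lookup table that maps a persistent path to the resource's current location. The table contents and the path below are hypothetical, and the sketch only returns the status code and target rather than running a real HTTP server.

```python
from typing import Optional, Tuple

# Hypothetical redirection table: persistent path -> current location.
PURL_TABLE = {
    "/net/music/latest": "http://www.bristol-poly.ac.uk/depts/music/latest.html",
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return an HTTP-style (status, location) pair for a persistent path."""
    target = PURL_TABLE.get(path)
    if target is None:
        return 404, None
    return 302, target


if __name__ == "__main__":
    print(resolve("/net/music/latest"))
```

If the organisation changes its name or reorganises its directories, only the table entry needs to be updated; links to the persistent path keep working.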
16
Transport
HTTP/0.9 and HTTP/1.0:
• Design flaws and implementation problems
HTTP/1.1:
• Addresses some of these problems
• 60% server support
• Performance benefits! (60% packet traffic reduction)
• Is acting as fire-fighter
• Not sufficiently flexible or extensible
HTTP/NG:
• Radical redesign using object-oriented technologies
• Undergoing trials
• Gradual transition (using proxies)
17
Metadata
Metadata - the missing architectural component from the initial implementation of the web
[Diagram: Metadata (PICS, TCN, MCF, DSig, DC, ...) alongside Addressing (URL), Data format (HTML) and Transport (HTTP)]
Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management
18
Metadata Examples
DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 2.0 will be based on RDF and will support signed assertions:
– This page is from the University of Bath
– This page is a legally-binding list of courses provided by the University
P3P (Platform for Privacy Preferences):
• Developing methods for exchanging Privacy Practices of Web sites and users
Note that discussions about additional rights management metadata are currently taking place
19
RDF
RDF (Resource Description Framework):
• Highlight of WWW 7 conference
• Provides a metadata framework ("machine understandable metadata for the web")
• Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping
• Based on a formal data model (directed labelled graphs)
• Applications include:
– cataloging resources
– resource discovery
– electronic commerce
– intelligent agents
– digital signatures
– content rating
– intellectual property rights
– privacy
20
Browser Support for RDF
Mozilla (Netscape's source code release) provides support for RDF.
Mozilla supports site maps in RDF, as well as bookmarks and history lists
See Netscape's or HotWired home page for a link to the RDF file.
[Diagram: Trusted 3rd Party Metadata; Embedded Metadata, e.g. sitemaps. Image from http://purl.oclc.org/net/eric/talks/www7/devday/]
21
Deployment Issues
Various interesting new technologies have been outlined
How can they be deployed in our environment?
Should we:
• Ignore them?
• Accept them fully?
• Accept them partly?
22
Ignore New Developments
We can choose to ignore new developments, and continue to use HTML 3.2:
• Safe option, with no new training, support or software costs
• Experience in effectiveness, limitations, etc.
• Fails to address current performance problems
• Fails to address accessibility problems
• Fails to provide new functionality
• Service likely to look "old-fashioned" compared with competition
23
Fully Accept New Developments
We can choose to move wholesale to, say, HTML 4.0 and CSS 2.0:
• Can be exciting to be at leading edge
• Performance benefits
• Accessibility benefits
• Based on open standards
• Provides motivation for users to upgrade browsers
• Likely to be solution at some point (cf. Gopher)
• Backwards compatibility problems with old browsers
• Costly to deploy new authoring tools, training, ...
• Likely to be bugs and incompatibilities with new tools and browsers
24
Implement "Safe" Solutions
An alternative is to use "safe" parts of technologies which are backwards compatible and avoid major browser bugs
• Attractive sounding compromise position
• Lose some functionality, but not all
• Can be difficult or expensive to find "safe" options (does .margin-left work on IE on SGI?)
• Tools may not allow safe options to be chosen
• Lack of validation tools for checking conformance with restricted set of specification
Note: See <URL: www.webreview.com/guides/style/insafegrid.htm> for unsafe CSS 2.0 properties
25
Decision Time
What would you opt for?
Stick with current technologies
• Cheap, default option. Continuation of performance and accessibility problems. Unlikely to be long term solution.
Deploy new technologies
• More expensive option. Functionality, performance and accessibility benefits. Access problems for old browsers.
Use "safe" new technologies
• May require home-grown tools and support. Avoids some of the problems of other solutions.
26
An Alternative
An alternative approach to deploying new technologies is available:
• Use more intelligent server-side software
• Use "proxies" to address limitations of browser technologies. The term intermediary was used in a paper [1] at the WWW 7 conference to describe this approach
• Protocol solutions, such as Transparent Content Negotiation (TCN)
[1] "Intermediaries: New Places For Producing and Manipulating Web Content"
27
Intelligent Server Software
Simple model:
• Server receives request for resource
• Server delivers resource to client
More sophisticated model:
• Server receives request for resource
• Server processes header information from client
• Server delivers resource to client based on client information
This is referred to as browser-sniffing or user-agent negotiation
Note that server support is now available in Apache and in server add-ons such as PHP/FI and MS Active Server Pages
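The more sophisticated model can be sketched as a function that inspects the User-Agent request header and picks a resource accordingly. This is an illustrative Python sketch, not how Apache, PHP/FI or Active Server Pages implement it; the style sheet file names are hypothetical, and the matching rules are deliberately crude.

```python
def choose_stylesheet(user_agent: str) -> str:
    """Pick a style sheet based on the client's User-Agent header."""
    ua = user_agent.lower()
    # Internet Explorer also announces itself as "Mozilla/4.0 (compatible; MSIE ...)",
    # so the MSIE test has to come before the Netscape test.
    if "msie" in ua:
        return "style-ie.css"          # hypothetical file name
    if ua.startswith("mozilla/4"):
        return "style-netscape4.css"   # hypothetical file name
    return "style-basic.css"           # safe default for everything else


if __name__ == "__main__":
    for agent in (
        "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)",
        "Mozilla/4.05 [en] (Win95; I)",
        "Lynx/2.8rel.2 libwww-FM/2.14",
    ):
        print(agent, "->", choose_stylesheet(agent))
```

The ordering of the tests shows why browser sniffing is fragile: the rules depend on conventions in header strings rather than on declared capabilities.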
28
W3C CSS Gallery
Portion of CSS file for IE (total 797 lines)
W3C have a link to a core style sampler service.
The service provides 8 core style sheets which can be freely linked to.
The style sheets use "browser sniffing". Different style sheets are delivered to different browsers.
H1, H2, H3, H4, H5, H6, .. {color: black; background: white}
Portion of CSS file for Netscape (total 169 lines)
H1 {font-family: Tahoma, ... font-size-adjust: .53; margin-top: 1.33em; font-weight: 500; ...}
29
Java Intermediaries
Netscape and Internet Explorer don't support MathML
Who cares? MathML Java renderers are available
This concept can be generalised to deploying support for other new markup languages.
For example see the Displets work at http://www.cs.unibo.it/~fabio/displet/
30
Deploying URNs
Problem
Today's browsers can't process URNs, such as:
urn:doi:10.1000/1
Possible Solution
• A separate program could resolve URNs into URLs
• Andy Powell (UKOLN) has demonstrated use of Netscape's autoproxy to pass on URNs of the format above to Squid for resolution [1]
• Example of use of an intermediary to deploy new technologies not supported by current browsers
[1] "Resolving DOI Based URNs Using Squid" at http://mirrored.ukoln.ac.uk/lis-journals/dlib/dlib/dlib/june98/06powell.html
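A separate program that resolves URNs into URLs could look something like the Python sketch below. It is not the Squid-based approach described in [1]; the resolver address is a made-up placeholder, and only the urn:doi: form shown above is handled.

```python
from typing import Optional

# Hypothetical resolver base URLs, keyed by URN namespace.
RESOLVERS = {"doi": "http://resolver.example.org/doi/"}


def resolve_urn(urn: str) -> Optional[str]:
    """Map a URN such as urn:doi:10.1000/1 to a URL, or None if unsupported."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn":
        return None  # not a URN at all
    base = RESOLVERS.get(parts[1].lower())
    if base is None:
        return None  # no resolver known for this namespace
    return base + parts[2]


if __name__ == "__main__":
    print(resolve_urn("urn:doi:10.1000/1"))
```

An intermediary such as a proxy would call a function like this for any request it recognises as a URN and pass ordinary URLs through unchanged.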
31
Intermediaries
Intermediaries:
• Enable new functionality to be introduced to the web without extending the client or the server
• Intermediaries can be implemented using proxies
• Intermediaries can be used for applications such as web personalisation, document caching, content distillation and protocol extension
• Demonstration available using WBI (Web Browser Intelligence)
• See <URL: http://wwwcssrv.almaden.ibm.com/wbi/>
• Another example for web accessibility at <URL: http://www.inf.ethz.ch/department/IS/ea/blinds/>
32
Web Applications
An Example
• We're familiar with HTML validation services (e.g. HENSA mirror)
• We can "go there" and use the service
• We can also have a link from the page which will run the service (rather than just go to the form)
• Consider:
– Web page is in Bath
– User is in Sheffield
– Application is in Kent
• An example of a web (intermediary?) application
33
Examples
Examples of remote web applications include:
• Link checking
• Website analysis
• Document format conversion
• Accessibility support
Imagine an intermediary service which called an XML-to-HTML conversion service if the browser agent didn't support XML
http://www.ukoln.ac.uk/web-focus/webwatch/services/url-info/
http://wheel.compose.cs.cmu.edu:8001/cgi-bin/browse/objweb
34
Content Negotiation
Transparent Content Negotiation (TCN):
• Method of deploying new formats
  Client: ACCEPT image/gif, image/png
  Server: If foo.png exists, send, else foo.gif
• Used for logos on W3C website
• Not widely deployed
Transparent Feature Negotiation:
• Proposal for deploying new HTML elements
• Over-engineered? Requires naming authority
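The client/server exchange in the TCN example can be approximated with the Python sketch below. It is a simplification for illustration only: it ignores quality values and the variant-list machinery of the TCN proposal, and simply returns the first variant, in the server's order of preference, that exists and that the client accepts.

```python
from typing import Iterable, Optional, Set, Tuple


def accepted_types(accept_header: str) -> Set[str]:
    """Parse an Accept header into a set of media types (parameters ignored)."""
    return {
        item.split(";")[0].strip().lower()
        for item in accept_header.split(",")
        if item.strip()
    }


def is_acceptable(media_type: str, accepted: Set[str]) -> bool:
    """True if the media type matches exactly or via a wildcard."""
    major = media_type.split("/")[0]
    return media_type in accepted or f"{major}/*" in accepted or "*/*" in accepted


def choose_variant(
    accept_header: str,
    variants: Iterable[Tuple[str, str]],
    existing_files: Set[str],
) -> Optional[str]:
    """Return the first existing variant the client accepts, or None."""
    accepted = accepted_types(accept_header)
    for media_type, filename in variants:
        if filename in existing_files and is_acceptable(media_type, accepted):
            return filename
    return None


# Server preference: PNG first, GIF as the fallback, as on the slide.
VARIANTS = [("image/png", "foo.png"), ("image/gif", "foo.gif")]

if __name__ == "__main__":
    print(choose_variant("image/gif, image/png", VARIANTS, {"foo.png", "foo.gif"}))
```

A browser that does not list image/png in its Accept header continues to receive the GIF, so the new format can be introduced without breaking older clients.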
35
Fourth and Fifth Ways
Several other options for deploying new web technologies (e.g. on low spec PCs):
Run Browser on Server
• Use Windows Terminal Server, Citrix, etc.
• Browser runs on NT server
Deploy JavaPC (e.g. for DOS)
• Use the JavaPC and run HotJava browser (min. spec 486 PC with 8Mb)
Opera
• Supports CSS, Frames, … on 486 PCs (8Mb)
• See <URL: http://www.operasoftware.com/>
36
Conclusions
To conclude:
• New web protocols are still being developed
• Deployment of new technologies can be expensive or time-consuming, but is likely to be needed
• Various deployment models:
– Don't implement
– Implement fully
– Implement via proxy
– Others (thin clients, …)
• We can't do it all ourselves
• Experience in developing (wide-area) web applications will help in developing intermediaries