Interaction models between Query Clients,
Information Resources & Discovery Services
Mark Harrison
Assumptions
Connectivity, Availability
• EPCIS instances generally connected to the Internet and generally reliable
• EPCIS instances may have downtime (e.g. for maintenance)
• Volume of queries to an EPCIS is approximately 10% of the volume of capture events (i.e. an EPCIS handles more capture events than query requests)
• Address of an EPCIS may change infrequently
Trust and Confidentiality
• Provider of a Discovery Service is expected to be trustworthy and to act in the interest of the resources (e.g. EPCIS instances)
Requirements
• Client Queries must be treated confidentially by a DS
• DS records (typically EPC, resource URL) must be treated confidentially by a DS
• Latency times and response times should be minimized
• The Query Response must be complete - i.e. it must contain all answers by resources that have willingly chosen to provide an answer
Queries & Data - Assumptions
• Clients can specify either:
– a full query (including EPC and other parameters)
OR
– only the EPC identifier
• The EPC number represents the query key
• Resources (e.g. EPCIS) can publish to a DS:
– Full EPCIS events (including business step, etc.)
OR
– EPC identifier only
Queries & Data - Assumptions
Discovery Services might hold / handle one of:
• (EPC, resource reference e.g. EPCIS URL)
• Fully replicated EPCIS events
• EPCs of interest to various clients (clients' interests)
• Full client queries (e.g. EPCIS queries)
Interaction modes
• One-off queries
– Assist with gathering of historical data (e.g. trace) up to time of query (DS provides referrals or forwards query)
– Synchronous response may be possible
• Standing queries
– Be notified of future updates from new resources (e.g. companies who handle the object at future times)
– Only asynchronous notification possible
Interaction mode & Transient Connectivity
• Quick response times are a key requirement for a DS (quick ~ up to 5 seconds / synchronous response) (questionnaire / interview responses).
• Predictable response time is required
• EPCIS resources are likely to have permanent connectivity to the network
Data Ownership & Trust
• Data ownership a key concern.
• Users may be reluctant to share more than minimum necessary data with a DS - or sharing of additional data should be optional.
• Reject models that require the resource owner (e.g. company having an EPCIS) to share detailed information with a DS without first gaining details of which clients require the detailed access and being able to refuse or negotiate this access.
Threats & concerns relating to Discovery Services
• Revealing sensitive information (volumes, flows of goods) to unauthorized parties
– e.g. where resources lose control over which clients see links
– e.g. 'harvesting' of info from client queries by 'honeypot' resources
• Excessive network traffic / unnecessary messages
– New vulnerabilities / mechanism for Denial of Service attacks?
• Slow response times
– Inability to provide synchronous response
– Waiting for response from underlying / proxy query - and maintaining session state
• Manageability / Complexity of specifying access control policies
– versus making a separate assessment for each query / each new client
– reuse / enforcement at both EPCIS and DS layers of architecture
– need for consistency (synchronization?) between a resource's policies at EPCIS & DS layers
• A Discovery Service may need to restrict which clients and which resources can use the DS (to limit DoS, honeypot attacks)
Clients, Intermediary & Resources
• Query Client: wishes to retrieve information (e.g. event data) from one or more organizations
• Intermediary (e.g. DS): maintains an internal list of associations and can help clients find resources (or vice versa)
• Resource (e.g. EPCIS): holds information about individual objects. Could be an EPCIS - but also web pages, web services, XML data and other service types
A note about the diagrams that follow...
[Figure: familiar directory-like diagram, with Query Clients and several Resources (e.g. EPCIS) connected to one Intermediary (e.g. Discovery Service)]
[Figure: interaction diagram with one Query Client, one Intermediary (e.g. Discovery Service) and one Resource (e.g. EPCIS). The client can be any client; the resource can be any EPCIS (or even another DS or other kinds of services)]
N.B.1. Even though the following diagrams only show one client and one resource, this is for clarity, to show the sequence of interactions - it does not imply that only one client or one resource can connect to each DS
N.B.2. When a resource is shown sending a message to an intermediary DS, this does not require additional EPCIS functionality. The resource may consist of an EPCIS repository and a separate 'DS publishing application' that publishes selected events (or fields derived from EPCIS events) to a Discovery Service.
Different message flow sequences
• We'll look at different message flow sequences between Client, Resource & Intermediary (Discovery Service)
• ... and analyze their merits in terms of:
– impact on performance
– response time to queries
– confidentiality of resource's info
– confidentiality of client's query
• Consider three phases of interaction...
– Setup: Client & Resource interact with DS to register interests & capabilities and negotiate security rights
– Discovery: Provides either client or resource with sufficient info to initiate service fulfillment
– Service Fulfillment: Resource becomes aware of client request and is able to meet it
Models with a Discovery phase
[Figure: classification of the four models with a Discovery phase: Directory of Resources, Directory of Clients, Notification of Resources and Notification of Clients. The models are grouped by interaction style (Synchronous Request/Response or Asynchronous Publish & Subscribe) and by role ("Client is querying, Resource is publishing", where the client may be unknown, or "Client is publishing, Resource is querying", where the resource may be unknown)]
[Figure: Directory of Resources. Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup, Discovery and Fulfillment phases. Message labels: EPC, URL; EPC; URL; full EPCIS query; EPCIS result-set; EPC, URL, serviceType]
[Figure: Directory of Clients. Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup, Discovery and Fulfillment phases. Message labels: ? EPCs; EPC, URL; EPC, Client ID; full EPCIS query; EPCIS result-set]
[Figure: Notification of Resources. Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup, Discovery and Fulfillment phases. Message labels: EPC, URL; EPC, Client ID; full EPCIS query; EPCIS result-set]
[Figure: Notification of Clients. Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup, Discovery and Fulfillment phases. Message labels: EPC; EPC, URL; EPC, Client ID; full EPCIS query; EPCIS result-set; EPC, URL, serviceType]
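Models without a distinct Discovery phase
[Figure: classification of the four models without a distinct Discovery phase: Meta Resource, Meta Client, Notification of Events and Query Propagation. The models are grouped by interaction style (Synchronous Request/Response or Asynchronous Publish & Subscribe) and by direction (Resource to Client or Client to Resource)]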
[Figure: Meta Resource. Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup and Fulfillment phases. Message labels: EPCIS events; full EPCIS query; EPCIS result-set]
[Figure: Meta Client. Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup, Discovery and Fulfillment phases. Message labels: full EPCIS query, ClientID; any queries?; queries, ClientID; ClientID, EPCIS queries; EPCIS result-set]
[Figure: Notification of Events. Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup and Fulfillment phases. Message labels: full EPCIS query, ClientID; ClientID, EPCIS queries; full EPCIS events]
[Figure: Query Propagation. Interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup, Discovery and Fulfillment phases. Message labels: EPC, URL; full EPCIS query, Client ID; EPCIS result-set; EPC, URL, serviceType]
Evaluation
Model | Protection of Client Confidentiality | Protection of Resource Confidentiality | Response Latency | Status
Directory of Resources | Good | Concern | Good | Candidate
Directory of Clients | Concern | Good | Poor | Reject for one-off queries
Notification of Resources | Good | Concern | Good | Candidate
Notification of Clients | Concern | Good | Concern | Candidate
Meta Resource | Good | Poor | Good | Reject
Meta Client | Concern | Good | Poor | Reject for one-off queries
Notification of Events | Good | Poor | Good | Reject
Query Propagation | Concern | Good | Concern | Candidate
Types of data stored at intermediary (DS)
Model | Type of data stored at intermediary to complete discovery
Directory of Clients | Clients' query key (i.e. EPCs)
Notification of Resources | Clients' query key (i.e. EPCs)
Meta Client | Full client queries
Notification of Events | Full client queries
Notification of Clients | Resource keys (i.e. EPCs) and resource refs
Directory of Resources | Resource keys (i.e. EPCs) and resource refs
Query Propagation | Resource keys (i.e. EPCs) and resource refs
Meta Resource | N/A - full replication of resources' data (e.g. all EPCIS events)
[Figure: the remaining models paired by query type, each shown as an interaction diagram between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS) across the Setup, Discovery and Fulfillment phases. Directory of Resources (one-off queries) and Notification of Resources (standing queries) use the messages "EPC, URL, serviceType" and "EPC, Client ID". Query Propagation (one-off queries) and Meta Client (standing queries) use the messages "EPC, URL, serviceType" and "EPCIS query, Client ID"]
Directory Service
[Figure: interaction diagrams for Directory of Resources (one-off queries) and Notification of Resources (standing queries) between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), across the Setup, Discovery and Fulfillment phases. Message labels: EPC, URL, serviceType; EPC, Client ID]
• 'immediate' response - resources do not need to be available at time of DS query
• DS does not need to maintain session information for each DS query
• client can choose which links to follow and can adjust its query for each resource
• DS needs to store and enforce access control policies on behalf of resources in addition to the access control mechanism that each resource provides
• resources need to trust the DS operator to enforce their policies on their behalf
• resource does not find out client ID until client chooses to make an EPCIS query
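Directory Service (example)
[Figure: worked example with one Query Client, one DS and three resources]
• Resource data (EPC, Data): EPCIS 1 holds (456, XXX); EPCIS 2 holds (123, YYY); EPCIS 3 holds (123, ZZZ)
• DS records (EPC, Resource): (456, EPCIS 1); (123, EPCIS 2); (123, EPCIS 3)
• Query Client asks the DS: "Who has info about EPC 123?"
• DS returns (EPC, Resource): (123, EPCIS 2); (123, EPCIS 3)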
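The lookup in the Directory Service example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, the client IDs and the shape of the access policy (a set of allowed client IDs per resource) are hypothetical and do not come from the slides or from any EPCglobal specification.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryService:
    """Toy directory of (EPC, resource) links with per-resource access policies."""
    records: list = field(default_factory=list)   # (epc, resource_id) pairs
    policies: dict = field(default_factory=dict)  # resource_id -> allowed client IDs

    def publish(self, epc, resource_id, allowed_clients):
        """Setup: a resource publishes a link plus the policy the DS must enforce."""
        self.records.append((epc, resource_id))
        self.policies.setdefault(resource_id, set()).update(allowed_clients)

    def lookup(self, epc, client_id):
        """Discovery: return only the links this client is allowed to see."""
        return [
            resource_id
            for record_epc, resource_id in self.records
            if record_epc == epc and client_id in self.policies[resource_id]
        ]


# Hypothetical client IDs; EPCs and resource names follow the example above.
ds = DirectoryService()
ds.publish("456", "EPCIS 1", {"client-a"})
ds.publish("123", "EPCIS 2", {"client-a", "client-b"})
ds.publish("123", "EPCIS 3", {"client-a"})
```

Note that the DS answers from its own records, so no resource has to be online at lookup time, but the DS has to hold and enforce every resource's policy.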
Query Relay
[Figure: interaction diagrams for Query Propagation (one-off queries) and Meta Client (standing queries) between Query Client, Intermediary (e.g. DS) and Resource (e.g. EPCIS), across the Setup, Discovery and Fulfillment phases. Message labels: EPC, URL, serviceType; EPCIS query, Client ID]
• DS does not return any resource data to clients
• DS propagates client query to resources and client must wait for them to respond
• The same client query will be sent to all resources - and the client has no visibility nor control over which publishers will receive their query (all receive the same query) - but has the convenience of not having to make iterative follow-up EPCIS queries
• DS does not store fine-grained access control policies on behalf of resources - access control is done by each resource independently, with knowledge of client ID
• Each resource can log all (successful/failed) attempts to access their data
• Resources can deny access to certain clients without making client aware of denial
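Query Relay (example)
[Figure: worked example with one Query Client, one DS and three resources]
• Resource data (EPC, Data): EPCIS 1 holds (456, XXX); EPCIS 2 holds (123, YYY); EPCIS 3 holds (123, ZZZ)
• DS routing entries (EPC, Resource): (456, EPCIS 1); (123, EPCIS 2); (123, EPCIS 3)
• Query Client asks the DS: "Who has info about EPC 123?"
• DS forwards the query to EPCIS 2 and EPCIS 3 rather than returning links to the client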
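A minimal Python sketch of the Query Relay example follows. It assumes, hypothetically, that each resource keeps its own list of allowed client IDs and replies straight to a listener supplied by the client; the class names, the `reply_to` callback and the client IDs are illustrative only.

```python
class Resource:
    """Toy EPCIS-like resource that enforces its own access control."""

    def __init__(self, name, data, allowed_clients):
        self.name = name
        self.data = data                      # epc -> event data
        self.allowed_clients = allowed_clients
        self.access_log = []                  # (client_id, epc, granted)

    def handle_query(self, epc, client_id, reply_to):
        granted = client_id in self.allowed_clients and epc in self.data
        self.access_log.append((client_id, epc, granted))
        if granted:
            reply_to(self.data[epc])
        # A refusal is silent: the client cannot tell "denied" from "no data".


class QueryRelay:
    """Toy relay: stores routing entries only, no resource policies."""

    def __init__(self):
        self.routes = {}  # epc -> list of resources

    def register(self, epc, resource):
        self.routes.setdefault(epc, []).append(resource)

    def relay(self, epc, client_id, reply_to):
        """Forward the same query, with the client ID, to every matching resource."""
        for resource in self.routes.get(epc, []):
            resource.handle_query(epc, client_id, reply_to)


epcis1 = Resource("EPCIS 1", {"456": "XXX"}, {"client-a"})
epcis2 = Resource("EPCIS 2", {"123": "YYY"}, {"client-a"})
epcis3 = Resource("EPCIS 3", {"123": "ZZZ"}, set())  # trusts nobody yet

relay = QueryRelay()
relay.register("456", epcis1)
relay.register("123", epcis2)
relay.register("123", epcis3)

inbox = []  # stands in for the client's listener
relay.relay("123", "client-a", inbox.append)
```

In this sketch EPCIS 3 logs the failed attempt but sends nothing back, which is the "deny without making the client aware" behaviour listed above.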
Routing asynchronous replies back to clients
• Some models require a DS or a resource (e.g. EPCIS) to respond asynchronously to a client.
• Client might specify a return address of a listener or client proxy that is reachable (e.g. in DMZ, not behind firewall)
• In some models, the response does not come from the DS but from an unexpected / unknown resource
– Will need mutual authentication + establishment of mutual trust
• Routing resource responses back via a DS may be an option
– allows consolidation of responses from resources
– allows decoupling of client / resource address info
– DS may / may not maintain state / session info (# expected replies)
– burden of maintaining client session information adds to complexity and also adds to scalability problems
Clients receiving replies - problems of slow / withheld responses
• Clients may have to receive and combine/correlate independent responses from multiple resources
• Problems:
– Will client know how many replies to expect for each DS query?
– Were all resources willing to reply to the client?
– Will some resources be slow to respond?
– How should a resource respond (without revealing its ID) when it has information and may be willing to co-operate but still needs further client credentials / justification / negotiation with client?
• Possible options:
– DS might return to client # of resources the DS forwarded query to. Risky?
– Use timeout intervals for receipt of responses. But may miss slow replies.
– Resource might send an opaque token to client via a DS, with DS acting as a go-between to help facilitate initial negotiations between client and resource (by passing messages)
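The timeout option can be illustrated with Python's `asyncio.wait`, which returns whatever has completed when the interval expires. The resources here are simulated by sleeps, and the names, delays and timeout value are assumptions chosen only to show how a slow reply gets missed.

```python
import asyncio


async def query_resource(name, delay):
    """Stand-in for a resource that answers a relayed query after `delay` seconds."""
    await asyncio.sleep(delay)
    return name, f"result-set from {name}"


async def collect_replies(delays, timeout):
    """Wait up to `timeout` seconds; return (replies, resources that were too slow)."""
    tasks = {asyncio.create_task(query_resource(n, d)): n for n, d in delays.items()}
    done, pending = await asyncio.wait(list(tasks), timeout=timeout)
    for task in pending:
        task.cancel()  # give up on slow resources; their replies are missed
    replies = dict(task.result() for task in done)
    late = sorted(tasks[task] for task in pending)
    return replies, late


def run_example():
    # Hypothetical delays: EPCIS 2 answers quickly, EPCIS 3 is far too slow.
    return asyncio.run(collect_replies({"EPCIS 2": 0.01, "EPCIS 3": 2.0}, timeout=0.3))
```

The client in this sketch knows which resources were late only because it created the tasks itself. In a relay model it would need the DS to tell it how many replies to expect, which is the "Risky?" option above.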
Analysis of design options - Security & Trust
Confidentiality of Client queries
• In Query Relay model, risk of 'harvesting' by rogue resources ('honeypots').
• Clients may need to check plausibility of records asserted by resources. (e.g. an object cannot be within physical custody of two organizations at the same time)
– Might need 'business step' to understand whether physical custody is being claimed. (to allow for non-custodial resource owners [e.g. insurers ])
• Blacklists and whitelists
– Sent by client with client's query, possibly cached in DS
– Use of blacklists to prevent relaying of queries to known competitors or dubious resources
– Use of whitelists to restrict forwarding to only a set of trusted business partners.
– May prevent client from discovering unknown yet trustworthy resources that hold relevant information
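As a rough illustration of how a relay might apply client-supplied lists before forwarding a query, consider the sketch below. The function name and the resource IDs are hypothetical; the slides do not prescribe any particular filtering API.

```python
def select_relay_targets(candidates, whitelist=None, blacklist=frozenset()):
    """Return the resources a relay may forward the client's query to.

    A blacklist always excludes; a whitelist, when given, admits only its members.
    """
    targets = []
    for resource_id in candidates:
        if resource_id in blacklist:
            continue
        if whitelist is not None and resource_id not in whitelist:
            continue
        targets.append(resource_id)
    return targets


candidates = ["partner-epcis", "competitor-epcis", "unknown-epcis"]
```

With only a blacklist, `unknown-epcis` still receives the query; with a whitelist it is silently skipped, which is the discovery trade-off noted in the last bullet.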
Analysis of design options - Security & Trust
Confidentiality of Resource information
• In Directory Service model, release of records (links) to client should be controlled via access control policies specified by the resource owner and enforced by the DS operator
• DS operator may also (need to) specify and enforce overall policies for their DS (e.g. who can query, who can publish, regulators' policies) that over-ride individual policies specified by resource owners.
• Resource owners should be aware of these DS policies before publishing to it
• In Directory Service model, scalability & management of security policies is a major concern
• If resource policy hides resource ID from unknown clients, how could those clients begin to negotiate with the resource?
– Possible mediation role for DS using temporary token + relaying of messages?
• In Query Relay model, resource can decide whether to allow access
– Without delegation to a DS
– Also taking into account real-time info (e.g. current load on resource)
• Query Relay model may work better for unsolicited client communications
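Overcoming 'deadlock' before trust is established
[Figure: message exchange between Resource (e.g. EPCIS), Discovery Service and Query Client]
• Resource to Discovery Service: "I hold information about EPC xyz - but hide my real ID & contact info from unauthorized / unknown clients"
• Discovery Service to Query Client: Opaque token 0A8274B2845EF
• Query Client to Discovery Service: Request for access, quoting token
• Discovery Service to Resource: Request for access from client with ID...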
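The token exchange above might look like the following Python sketch. Everything here is an assumption made for illustration: the class and method names, the token format (random hex from the standard `secrets` module) and the in-memory list standing in for message delivery to the resource.

```python
import secrets


class MediatingDS:
    """Toy DS that hides resource identities behind opaque tokens."""

    def __init__(self):
        self._hidden_holders = {}  # epc -> resource IDs that asked to stay hidden
        self._tokens = {}          # opaque token -> resource ID
        self.forwarded = []        # (resource_id, client_id, message) passed on

    def publish_hidden(self, epc, resource_id):
        """Resource: 'I hold information about this EPC, but hide my ID.'"""
        self._hidden_holders.setdefault(epc, []).append(resource_id)

    def lookup(self, epc):
        """Return one fresh opaque token per hidden holder, never the real ID."""
        tokens = []
        for resource_id in self._hidden_holders.get(epc, []):
            token = secrets.token_hex(8).upper()
            self._tokens[token] = resource_id
            tokens.append(token)
        return tokens

    def request_access(self, token, client_id, message):
        """Relay a client's request to whichever resource the token stands for."""
        resource_id = self._tokens.get(token)
        if resource_id is None:
            return False
        self.forwarded.append((resource_id, client_id, message))
        return True


ds = MediatingDS()
ds.publish_hidden("xyz", "EPCIS 2")
tokens = ds.lookup("xyz")
accepted = ds.request_access(tokens[0], "client-a", "request for access")
```

The resource learns the client ID from the forwarded request and can then decide whether to reveal itself, while the client only ever sees the token.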
Analysis of design options - Security & Trust
Information Integrity
• Must prevent compromising the integrity of information held within DS
• Deletion / change of information only in accordance with security policies
• Resource should retain right to modify or delete - but this might be over-ridden by DS policies (e.g. to maintain a journal for regulated supply chains)
– Delete => mark as void
– Modify/update => mark as void and re-assert
• May need to consider digitally signing:
– Client queries (signed by Client)
– DS records (signed by the resource owner / publisher)
– Responses (signed by DS or by resource if responding directly)
• Potential problem of embedding URLs within DS records since any modification to the URL may break the original digital signature for each record.
– Consider decoupling URL from each DS record - store in 'Resource Profile' instead
• May need DS to indicate to the client whether it was able to validate signatures
– For signed DS records it received, where the record is not returned in full to a client
– Whether or not the underlying DS record was / was not signed, validated / did not validate
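The URL problem can be demonstrated with a small Python sketch. Note the substitution: it uses an HMAC from the standard library as a stand-in for a real digital signature (a production design would more likely use public-key signatures), and the key, field names and URLs are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_record(record, key):
    """Return a hex MAC over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_record(record, key, signature):
    """Check the record against a previously issued signature."""
    return hmac.compare_digest(sign_record(record, key), signature)


publisher_key = b"hypothetical-publisher-key"

# Record with the URL embedded: any change of address invalidates the signature.
embedded = {"epc": "123", "resource_id": "EPCIS 2", "url": "https://old.example/epcis"}
embedded_sig = sign_record(embedded, publisher_key)
moved = dict(embedded, url="https://new.example/epcis")

# Record that only names the resource: the URL can live (and change) elsewhere.
decoupled = {"epc": "123", "resource_id": "EPCIS 2"}
decoupled_sig = sign_record(decoupled, publisher_key)
```

Here `moved` no longer verifies against `embedded_sig`, whereas the decoupled record stays valid however often the resource's address changes.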
Analysis of design options - Security & Trust
Service Availability
• DS should be designed to be resilient against Denial-of-Service attacks.
• The DS design should not compromise the clients or resources or make them more vulnerable to Denial-of-Service attacks.
• In Directory Service model, URL of resource is only released to clients fulfilling access policy restrictions - helps prevent attacks on resources
• However, resources under attack may need to change their address (URL)
• Need ability to decouple current (possibly mutable) URL of a resource from its immutable resource ID - rather than embedding URL within each DS record. (See also previous slide re digitally signed records)
• In Query Relay design, if clients rely on propagation of full (EPCIS) query via a query relay DS, they would be particularly dependent on DS availability (unless they have previously cached URLs of relevant resources)
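Decoupling of URLs from Discovery Service records
[Figure: within the Discovery Service, each DS Record refers via its ResourceID to a Resource Profile that describes the Resource (e.g. EPCIS)]
• DS Record: EPC or ID, Timestamp, ResourceID, [other metadata]
• Resource Profile: ResourceID, URL, serviceType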
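One way to picture the decoupling is the Python sketch below, in which DS records carry only an immutable ResourceID and the current URL is looked up in a separate profile. The field names mirror the diagram labels; the dataclass layout, the `resolve` helper and the example URLs and timestamps are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DSRecord:
    epc: str
    timestamp: str
    resource_id: str  # immutable reference; no URL is stored in the record


@dataclass
class ResourceProfile:
    resource_id: str
    url: str          # mutable: may change, e.g. if the resource is under attack
    service_type: str


def resolve(record, profiles):
    """Follow a record's ResourceID to the resource's current URL and service type."""
    profile = profiles[record.resource_id]
    return profile.url, profile.service_type


profiles = {"res-2": ResourceProfile("res-2", "https://old.example/epcis", "EPCIS")}
records = [
    DSRecord("123", "2008-06-01T10:00:00Z", "res-2"),
    DSRecord("124", "2008-06-01T10:05:00Z", "res-2"),
]

before = [resolve(r, profiles) for r in records]
profiles["res-2"].url = "https://new.example/epcis"  # one update, no record touched
after = [resolve(r, profiles) for r in records]
```

A single profile update redirects every record for that resource, and the records themselves (and any signatures over them) are left unchanged.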
Analysis of design options - Security & Trust
Attack Scenarios
• Possible misuse of Query Relay design to launch DoS attacks on resources.
– Possible countermeasures: client authentication with DS, limit how frequently client may make queries
• Registration of non-existent resource addresses for already assigned EPCs
– Increases network load, slower responses (timeouts, retries)
– Query Relay and each client of a Directory Service could identify resources that persistently fail - and remove from resource cache or add them to blacklists
• Registration of existent resource addresses - but of incorrect service type
– Countermeasure: authenticate resources before allowing them to publish
• Impersonation of valid clients by malicious clients to mislead DS or resources
– Countermeasure: authentication of DS clients
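The "limit how frequently client may make queries" countermeasure could be implemented in many ways; the sketch below uses a simple sliding window per authenticated client ID. The class name, the limits and the explicit `now` argument (passed in so the behaviour is deterministic) are illustrative assumptions.

```python
from collections import defaultdict, deque


class QueryRateLimiter:
    """Allow each authenticated client at most `max_queries` per `window` seconds."""

    def __init__(self, max_queries, window):
        self.max_queries = max_queries
        self.window = window
        self._history = defaultdict(deque)  # client_id -> times of accepted queries

    def allow(self, client_id, now):
        history = self._history[client_id]
        # Forget accepted queries that have fallen out of the window.
        while history and now - history[0] >= self.window:
            history.popleft()
        if len(history) >= self.max_queries:
            return False  # over the limit: do not relay this query
        history.append(now)
        return True


limiter = QueryRateLimiter(max_queries=2, window=10.0)
decisions = [
    limiter.allow("client-a", 0.0),
    limiter.allow("client-a", 1.0),
    limiter.allow("client-a", 2.0),   # third query inside the window
    limiter.allow("client-b", 2.0),   # other clients are unaffected
    limiter.allow("client-a", 10.0),  # the query at t=0 has expired
]
```

Such a limit is only meaningful if the client ID is authenticated, otherwise an attacker can simply rotate identities.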
Analysis of design options - Security & Trust
Inter-working with NATs and Firewalls
• Clients must be able to interact with a DS from behind a firewall or Network Address Translation (NAT) box.
• Stateful firewalls match returning traffic with outbound addresses
• Problem of responses from unexpected network addresses (especially in Query Relay model variant when responses are not returned via the DS but directly from resources)
• Can also be a problem when sending responses via a message transport network. (address of message router might not be expected/recognized)
• Client may need to provide client proxy (listener address) in DMZ for receiving inbound responses, (allow for inspection while quarantined)
Analysis of design options - Security & Trust
Management of Access Control Policies
• Need high-level policies about which clients / resources can interact with DS
• Resources need to be able to restrict which clients can access their information (including the links to their information)
• For all models, the underlying resources need an access control mechanism
• For the Directory Service model, resources may need to be able to specify fine-grained access control policies to be enforced by the DS without the DS needing to contact the resource to check authorization
– May be considered as a subset of access control policy for underlying resource
– In Directory Service model, DS holds significant amount of policy state information - but management of DS policy may only be marginally more than management of underlying resource policy
– Maybe even possibility of using a common policy language / framework for both
• For Query Relay, policies are stored and enforced primarily at each underlying resource, although less granular policies may be pushed to DS / network to reduce load on resources
• Directory Service model provides clients some opportunity to avoid 'honeypot' harvesting attack ( by allowing inspection of link before contact )
Analysis of design options - Network Performance / Resilience
• Persistent state information on a Discovery Service
– EPC - resource links (both models)
– Security Policies (more detailed for Directory Service model)
– Client subscriptions to new DS records (so new resources can be found)
• Management
– Client subscriptions should be self-managing with automated removal
– DS may also provide automated retention management of records (Time-to-live / renewal of lease)
• Transient state information on a Discovery Service
– Client session information (especially for Query Relay model) (manages correlation of responses from resources with client's query)
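Time-to-live retention with lease renewal can be sketched as follows. This is an assumption-laden illustration: the class name, the single global TTL and the rule that re-publishing renews the lease are choices made here, not requirements stated in the slides.

```python
class LeasedRecords:
    """Toy record store in which every DS record expires unless its lease is renewed."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._expiry = {}  # (epc, resource_id) -> time at which the record lapses

    def publish(self, epc, resource_id, now):
        """Publishing again before expiry renews the lease."""
        self._expiry[(epc, resource_id)] = now + self.ttl

    def lookup(self, epc, now):
        """Return resources whose lease for this EPC is still running."""
        return sorted(r for (e, r), t in self._expiry.items() if e == epc and t > now)

    def purge(self, now):
        """Automated retention management: drop and report lapsed records."""
        expired = [key for key, t in self._expiry.items() if t <= now]
        for key in expired:
            del self._expiry[key]
        return expired


store = LeasedRecords(ttl=100)
store.publish("123", "EPCIS 2", now=0)
store.publish("123", "EPCIS 3", now=0)
store.publish("123", "EPCIS 2", now=50)  # renewal: lease now runs to t=150
```

At t=120 only EPCIS 2 is still returned, so a resource that disappears without cleaning up stops being advertised once its lease runs out.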
Analysis of design options - Network Performance / Resilience
Transaction Duration, Transparency & Predictability
• Client should receive response with minimum delay / response time
• Client should be able to manage communication with DS and resources
• Short, predictable response times preferred
• Client should be able to detect failed communications
• Client should be able to selectively retry only the communications that failed
• In Directory Service model, client has maximum control of communications with DS and underlying resources
• In Query Relay model, client must wait to ensure that all resources have had sufficient time to respond.
– Difficulty in knowing which resources have failed to respond (especially if time interval is too short).
– Possibly difficult to selectively retry the communication
Analysis of design options - Network Performance / Resilience
Caching to improve performance (within a Discovery Service or by clients)
• DS maintains an internal cache of resource availability
– May be insufficient to answer an EPCIS query directed at a DS
• Client can cache responses from a DS for future use with the same EPCs
• Potential problem with Query Propagation model:
– Client's cache may be missing potentially relevant resources because a previous client query to Query Relay network was too specific (so only some resources responded to the query)
– Possible solution is for a resource to identify itself as a potential resource for a specific EPC, even if it had no results for the more specific query from the client
Analysis of design options - Network Performance / Resilience
Processing Load on DS
• DS should be able to handle multiple simultaneous client communications
• Requests should be handled quickly, with minimal computational effort
• Processing load depends on:
– matching of client's request to DS records or routing tables
– retrieval and enforcement of applicable security policies
• Note about supporting additional metadata fields (e.g. bizStep)
– May add to complexity of DS search
– May result in finer-grained security policies
– May require post-processing of results to limit visibility of additional metadata
Conclusions
• Considered different models for interactions between clients, resources and intermediaries such as Discovery Services
• Choice depends on impact on security, performance & scalability
– Not necessarily a single solution for all kinds of supply chains
– Friendly community supply chains vs strongly competitive vs highly regulated
• Directory Service is traditional well-proven approach but has unique challenges as a Discovery Service:
– Delegated control and scalable expression, evaluation and enforcement of security policies
• Query Relay model perhaps less obvious - but routing networks are established e.g. in peer-to-peer content retrieval networks.
– Major challenges are detection and prevention of honeypots for harvesting information from client queries, and of injection of false information to mislead or cause disruption
– Need secure resource registration and policing of resource behaviour
– Need secure client registration and policing to prevent DoS attacks on resources
Further reading and Acknowledgements
These slides were prepared for the EPCglobal Data Discovery JRG face-to-face meeting (Alpharetta, June 2008) and are based on section B of deliverable D2.4 from the BRIDGE project:
BRIDGE WP02 High Level Design Discovery Services
http://www.bridge-project.eu/index.php/public-deliverables/en/
I would like to acknowledge that D2.4 B was jointly authored by:
• Trevor Burbridge (BT)
• Oliver Kasten (SAP)
• Cosmin Condea (SAP)
• Mark Harrison (University of Cambridge, Auto-ID Lab)
with additional inputs from:
• Nicholas Pauvre (GS1 France)
• members of AT4 wireless (leader of BRIDGE WP2 [DS])
A paper from the SAP team within BRIDGE on the Query Relay model appeared in the proceedings of the Internet of Things 2008 conference:
http://www.springerlink.com/content/v568wv5751r1187q/
http://dblp.uni-trier.de/rec/bibtex/conf/iot/KurschnerCKT08