Data Handling System (DHS)

October 3, 2011

Bruce Cowan

DHS Responsibilities

● Receive science data from ATST virtual cameras

● Route data appropriately

● Provide means to process science data to reduce its volume

● Provide means of producing and displaying quality assurance data

● Receive header info from all ATST systems and associate it with the corresponding science data

● Retain calibration data for subsequent use

● Distribute science data and header info for external use

4.3-2100/2110

Basic DHS Structure

[Diagram: Virtual Camera 1, 2, ... X → Camera Line 1, 2, ... X → Storage; Header DB; Calibration DB; Data Distribution]

4.3-22XX

Basic DHS Structure

[Diagram labels: Virtual Camera 1, VC 2, VC 3, VC 4, Virtual Camera X; Virtual Camera Lines 1 to 4; Camera Lines 1, 2, ... X, each with Storage; Header DB; Calibration DB; Data Distribution]

4.3-22XX

Configurable Data Routing

[Diagram: Virtual Cameras routed to Storage]

4.3-2230/2370

Configurable Data Routing

[Diagram: Virtual Cameras routed to Storage]

4.3-2230/2370

Data Routing

● Flexibility achieved using RTI DDS publish/subscribe over multicast

● Applications (components) “discover” each other using multicast, so no data path has to be predefined for each sender. A sender simply declares that it will “publish” data of a certain topic/size, and all receivers “subscribing” to that topic/size automatically receive any messages that are published.

● Multicast reduces network traffic compared to unicast (point-to-point). A single message can be received by multiple clients.

4.3-2230/2370
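
The decoupling described on this slide can be illustrated with a minimal in-memory sketch. It does not use RTI DDS or any network transport; `TopicBus`, `subscribe`, `publish`, and the topic name are hypothetical names invented for illustration, and matching is done on topic name only (not on size). The point is just that the sender names a topic and never holds a list of receivers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for topic-based publish/subscribe.
// This is NOT the RTI DDS API; it only illustrates sender/receiver decoupling.
final class TopicBus {
    private final Map<String, List<Consumer<byte[]>>> subscribers = new HashMap<>();

    // A receiver registers interest in a topic by name.
    void subscribe(String topic, Consumer<byte[]> receiver) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(receiver);
    }

    // A sender publishes to a topic without knowing who is listening.
    // Returns the number of receivers the message was delivered to.
    int publish(String topic, byte[] message) {
        List<Consumer<byte[]>> receivers =
                subscribers.getOrDefault(topic, Collections.emptyList());
        for (Consumer<byte[]> receiver : receivers) {
            receiver.accept(message);
        }
        return receivers.size();
    }

    public static void main(String[] args) {
        TopicBus bus = new TopicBus();
        // Two independent receivers subscribe to the same (illustrative) topic.
        bus.subscribe("cameraLine1.raw", m -> System.out.println("store got " + m.length + " bytes"));
        bus.subscribe("cameraLine1.raw", m -> System.out.println("probe got " + m.length + " bytes"));
        int delivered = bus.publish("cameraLine1.raw", new byte[1024]);
        System.out.println("delivered to " + delivered + " receivers");
    }
}
```

Adding a third receiver requires no change on the publishing side, which is the property that makes the routing reconfigurable.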

Data Routing - Unicast

[Diagram: Sender, Switch, Node 1, Node 2, Node 3]

Unicast: A separate message must be sent to each receiver, so network traffic increases in proportion to the number of recipients. The sender must know the addresses of all receivers.

Data Routing - Multicast

[Diagram: Sender, Switch, Node 1, Node 2, Node 3]

Multicast: A single message is sent to a “group” address, and the switch makes a copy to send to all machines that have joined the group. Interested receivers just register with the switch to be included.
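
The difference in sender-side traffic between the two schemes can be put into numbers with a small sketch. This is a simplified model, not a measurement: `RoutingTraffic` and its methods are hypothetical names, protocol overhead is ignored, and the 32 MB message size is assumed to be in decimal megabytes.

```java
// Simplified traffic model; names are hypothetical and protocol overhead is ignored.
final class RoutingTraffic {
    // Unicast: the sender transmits one copy per receiver.
    static long unicastSenderBytes(long messageBytes, int receivers) {
        return messageBytes * receivers;
    }

    // Multicast: the sender transmits one copy to the group address,
    // and the switch replicates it to each group member.
    static long multicastSenderBytes(long messageBytes) {
        return messageBytes;
    }

    public static void main(String[] args) {
        long frame = 32_000_000L; // one 32 MB message, assuming decimal megabytes
        for (int receivers = 1; receivers <= 3; receivers++) {
            System.out.println(receivers + " receiver(s): unicast="
                    + unicastSenderBytes(frame, receivers) + " B, multicast="
                    + multicastSenderBytes(frame) + " B");
        }
    }
}
```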

DHS Physical Overview

DHS Software Layers

[Diagram: layered stack. DHS Applications: Data Processing Pipeline (DPP), Quality Assurance System (QAS), Data Storage + Distribution (DSD); Databases / Data Stores; Bulk Data Transport (BDT)]

Bulk Data Transport (BDT)

● Manages large-volume, high-rate data flows within the DHS

● Provides a way of creating a Camera Line, by defining a data route through a sequence of computer nodes. This route may be different from one experiment to the next.

● Two classes of data:

  ● Science data (up to 960 MB/s for each camera line)

  ● Quality assurance data (much slower, < 100 MB/s)

● Data routing requirements are accomplished using the publish/subscribe abilities of Data Distribution Service (DDS) middleware. Science data will flow over 10 GbE, and QA data over standard gigabit.

4.3-22XX

BDT 10GbE Sustained Speed Test

Simulated camera line - Various data sizes

VBI's 32MB @ 30Hz requirement drove BDT development, as evidenced by the 32MB “sweet spot” in the table.

4.3-2240/2440
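
The 960 MB/s camera-line figure follows directly from the VBI requirement (32 MB × 30 Hz), and it can be compared against raw link rates with a short calculation. The sketch below assumes decimal megabytes and ignores all protocol overhead, so the link figures are upper bounds; `LinkBudget` and its methods are hypothetical names.

```java
// Back-of-the-envelope link budget; decimal units, protocol overhead ignored.
final class LinkBudget {
    // Sustained rate needed for fixed-size frames at a fixed frame rate.
    static double requiredMBps(double frameMB, double framesPerSecond) {
        return frameMB * framesPerSecond;
    }

    // Raw line rate of an Ethernet link in MB/s (8 bits per byte).
    static double linkMBps(double gigabitsPerSecond) {
        return gigabitsPerSecond * 1000.0 / 8.0;
    }

    public static void main(String[] args) {
        double required = requiredMBps(32.0, 30.0); // VBI: 32 MB at 30 Hz
        double tenGbE = linkMBps(10.0);
        double gigabit = linkMBps(1.0);
        System.out.printf("required: %.0f MB/s%n", required);
        System.out.printf("10 GbE raw: %.0f MB/s (%.1f%% used)%n",
                tenGbE, 100.0 * required / tenGbE);
        System.out.printf("1 GbE raw: %.0f MB/s%n", gigabit);
    }
}
```

Under these assumptions the science stream uses roughly three quarters of a 10 GbE link, while the QA stream (< 100 MB/s) fits within a gigabit link.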

Quality Assurance Support (QAS)

● Allows system users to view and analyze data as it is collected, so they can adjust the ATST system to improve data quality

● Two aspects to quality assurance:

  ● Quick Look Display – Near real-time display of raw data with simple image manipulation capability.

  ● Detailed Display – More thorough checking of data with data-specific processing that may introduce some delay. This processing is accomplished via plugins supplied by the instrument manufacturers, which are inserted into the data stream leading to the display.

● The BDT data stream is accessed by insertion of a QAS Probe, which will send sampled data to an associated QAS Sink for display. The current baseline choice for display functionality is DS9.

4.3-2505/2510

Quality Assurance System

[Diagram: Transfer Node with QAS Probe → Gigabit → QAS Sink → QL Display; Processing Node with QAS Probe and Instrument Developer Plugin → Gigabit → QAS Sink → Detailed Display]

4.3-2505/2510

Data Storage and Delivery (DSD)

● Acceptance of science information from source application nodes and the persistence of that information to storage media

● Long-term storage of calibration data needed for summit processing

● Long-term storage of engineering data and system logs

● Repackaging and distribution of science data and information to external sites

4.3-26XX

Data Processing Pipeline (DPP)

● Allows for the insertion of extra steps into the BDT data stream. This can be done to either:

  ● Sample data from the primary stream without affecting the data within it. This is the mechanism used by the Quality Assurance System to access the stream for display purposes.

  ● Modify data available to all downstream components. Some instruments, such as the VBI, need significant processing resources to reduce the raw data into a product that can be transferred off-site in a timely fashion.

● Each Camera Line is capable of having its own independent DPP.

● The DPP is not responsible for the algorithms employed in the processing, but manages a plugin's access to the data stream.

4.3-23XX

DPP Configuration

DataProcessingComponent/QasDDProbeComponent configuration attributes:

● dhs.cameraLine – camera line number

● dhs.topicName – subscribed topic name

● dhs.maxData – subscribed maximum data size (in bytes)

● dhs.qas.ddHandlerClass – the full class name of the IDataHandler plugin

● dhs.repubTopicName – the topic name under which to publish the plugin’s data product

● dhs.maxPluginData – the maximum data size (in bytes) of the plugin’s published data

The IDataHandler plugin may require configuration. In addition to any custom, implementation-specific attributes, the following are available for plugins that need to subscribe to more than the one primary data stream:

● dhs.subtopic.cameraLine.# – the subtopic’s camera line

● dhs.subtopic.maxData.# – the maximum data size (in bytes) for the subtopic

● dhs.subtopic.name.# – the subtopic’s topic name

● dhs.subtopic.key.# – the subtopic “key”, used when more than one subtopic is configured

To allow for multiple subtopics, the “#” must be replaced with an unbroken numerical sequence starting with “1” for the first subtopic.

4.3-2330
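
The numbering rule for subtopic attributes can be illustrated with a small parser. This is only a sketch: it reads from a plain `Map<String, String>` rather than the real attribute table, `SubtopicConfig` is a hypothetical class, and the topic names and values are made up. How the actual DHS component reads these attributes is not shown in the slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for one subtopic's settings; not part of the DHS API.
final class SubtopicConfig {
    final int cameraLine;
    final long maxData;
    final String name;
    final String key; // may be null when only one subtopic is configured

    SubtopicConfig(int cameraLine, long maxData, String name, String key) {
        this.cameraLine = cameraLine;
        this.maxData = maxData;
        this.name = name;
        this.key = key;
    }

    // Reads subtopics 1, 2, 3, ... and stops at the first missing index,
    // so entries after a gap in the numbering are never seen.
    static List<SubtopicConfig> parse(Map<String, String> attrs) {
        List<SubtopicConfig> result = new ArrayList<>();
        for (int i = 1; attrs.containsKey("dhs.subtopic.name." + i); i++) {
            int line = Integer.parseInt(attrs.get("dhs.subtopic.cameraLine." + i));
            long max = Long.parseLong(attrs.get("dhs.subtopic.maxData." + i));
            String key = attrs.get("dhs.subtopic.key." + i);
            result.add(new SubtopicConfig(line, max, attrs.get("dhs.subtopic.name." + i), key));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        // Illustrative values only.
        attrs.put("dhs.subtopic.cameraLine.1", "2");
        attrs.put("dhs.subtopic.maxData.1", "33554432");
        attrs.put("dhs.subtopic.name.1", "darkFrames");
        attrs.put("dhs.subtopic.key.1", "dark");
        attrs.put("dhs.subtopic.cameraLine.2", "3");
        attrs.put("dhs.subtopic.maxData.2", "33554432");
        attrs.put("dhs.subtopic.name.2", "gainTable");
        attrs.put("dhs.subtopic.key.2", "gain");
        for (SubtopicConfig s : parse(attrs)) {
            System.out.println(s.key + " -> " + s.name + " on camera line " + s.cameraLine);
        }
    }
}
```

A reader that walks the indices in order, as this one does, silently ignores anything after a gap, which is one plausible reason for requiring an unbroken sequence.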

IDataHandler Interface (for Plugins)

public interface IDataHandler {

    // Allow plugin opportunity to allocate resources
    public void onDoInit();

    // Allows the plugin to receive the component's doSet() attributes
    public void set(IAttributeTable table);

    // Get array of event names this plugin wishes to subscribe to
    public String[] getEventNames();

    // Notify receipt of subscribed event
    public void eventNotify(String eventName, IAttributeTable eventValue);

    // Receives the data buffer, plus the publisher on which to optionally
    // publish the new data product.
    public void process(IBdtBuffer buffer, IBdtPublisher pub);

    // If subscriptions to other BDT streams have been configured, this method
    // delivers the incoming content. The "key" is a string specified in the
    // configuration to help distinguish multiple subtopics. It may also be
    // null if there is only one.
    public void subTopicReceive(String key, IBdtBuffer buffer);

    // Allow plugin opportunity to de-allocate resources
    public void onDoUninit();
}

4.3-2330
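
To make the shape of a plugin concrete, here is a minimal sketch of a handler that only counts the buffers it receives. The method signatures are copied from the slide, but everything else is an assumption: `IAttributeTable`, `IBdtBuffer`, and `IBdtPublisher` are declared as empty placeholder interfaces because their real definitions are not shown, `CountingHandler` is a hypothetical class, and the call order used in `main` (init, process, uninit) is a guess at the lifecycle rather than documented behavior.

```java
// Placeholder types: the real ATST interfaces are not shown in the slides,
// so these empty declarations exist only to let the sketch compile.
interface IAttributeTable {}
interface IBdtBuffer {}
interface IBdtPublisher {}

// Method signatures as listed on the slide.
interface IDataHandler {
    void onDoInit();
    void set(IAttributeTable table);
    String[] getEventNames();
    void eventNotify(String eventName, IAttributeTable eventValue);
    void process(IBdtBuffer buffer, IBdtPublisher pub);
    void subTopicReceive(String key, IBdtBuffer buffer);
    void onDoUninit();
}

// Hypothetical plugin that counts buffers and publishes nothing.
final class CountingHandler implements IDataHandler {
    private long processed;
    private long subTopicBuffers;

    public void onDoInit() { processed = 0; subTopicBuffers = 0; }
    public void set(IAttributeTable table) { /* no configuration needed */ }
    public String[] getEventNames() { return new String[0]; } // no event subscriptions
    public void eventNotify(String eventName, IAttributeTable eventValue) { /* unused */ }
    public void process(IBdtBuffer buffer, IBdtPublisher pub) { processed++; }
    public void subTopicReceive(String key, IBdtBuffer buffer) { subTopicBuffers++; }
    public void onDoUninit() { /* nothing to release */ }

    long processedCount() { return processed; }
    long subTopicCount() { return subTopicBuffers; }

    public static void main(String[] args) {
        CountingHandler handler = new CountingHandler();
        handler.onDoInit();
        handler.process(new IBdtBuffer() {}, new IBdtPublisher() {});
        handler.subTopicReceive(null, new IBdtBuffer() {}); // key may be null with one subtopic
        handler.onDoUninit();
        System.out.println("primary buffers: " + handler.processedCount()
                + ", subtopic buffers: " + handler.subTopicCount());
    }
}
```

A plugin that produces a data product would use the publisher passed to `process`; that call is omitted here because the publisher's methods are not given in the slides.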

Camera Line Detail 1

[Diagram labels: Virtual Camera; Transfer Store; Transfer Node; Quick Look Display; zero or more Processing Nodes; Detailed Display; Camera Store]

Camera Line Detail 2

[Diagram: camera line annotated with publisher (Pub) and subscriber (Sub) endpoints. Labels: Virtual Camera; Transfer Store; Transfer Node; QAS Probe sending data to a QAS Sink for the Quick Look Display; zero or more Processing Nodes; QAS Probe sending data to a QAS Sink for the Detailed Display; Camera Store]

DSD Data Flow

4.3-26XX

Data Transfer to ATST Base Facility

TeraPac3

Portable Rugged Hard Drive Array Briefcase

● 8 x 3.5” SATA drives provide up to 24 TB with currently available drives.

● Pelican watertight roller case for transport

DHS Hardware

● Storage technologies are expected to evolve before final deployment of the DHS, but its design can be extrapolated from currently available and proposed next-generation hardware.

● Drive Controller – SAS 2.0

  ● 8 ports operating at 600 MB/s per port; v3.0 will be 1.2 GB/s per port

  ● LSI SAS 9269-8i benchmarked at 2875 MB/s read, 1800 MB/s write

● Host Bus – PCI-Express v2.0

  ● 500 MB/s per lane vs 250 MB/s for v1.0. (SAS 2.0 can't run full speed)

  ● Computers with PCIe v3.0 (1 GB/s per lane) now being sold.

● Network Connectivity – 10 GbE

● Storage – Solid State Drives (SSD)

  ● Too costly now, but SSD enterprise arrays already at 12.8 GB/s read, 9.7 GB/s write
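
One way to read the remark that SAS 2.0 cannot run at full speed is to compare the controller's aggregate port bandwidth with the host bus bandwidth. The sketch below does that arithmetic using the per-port and per-lane figures from the slide. The lane count of 8 is an assumption (the slides do not state the slot width), and `StorageBusBudget` is a hypothetical name.

```java
// Compares aggregate SAS port bandwidth with PCIe slot bandwidth.
// Per-port and per-lane figures come from the slide; the lane count is assumed.
final class StorageBusBudget {
    static int sasAggregateMBps(int ports, int mbpsPerPort) {
        return ports * mbpsPerPort;
    }

    static int pcieMBps(int lanes, int mbpsPerLane) {
        return lanes * mbpsPerLane;
    }

    public static void main(String[] args) {
        int lanes = 8; // ASSUMPTION: controller sits in an x8 slot
        int sas2 = sasAggregateMBps(8, 600);   // SAS 2.0: 8 ports at 600 MB/s
        int pcie2 = pcieMBps(lanes, 500);      // PCIe v2.0: 500 MB/s per lane
        int pcie3 = pcieMBps(lanes, 1000);     // PCIe v3.0: 1 GB/s per lane
        System.out.println("SAS 2.0 aggregate: " + sas2 + " MB/s");
        System.out.println("PCIe 2.0 x" + lanes + ": " + pcie2 + " MB/s, bus-limited: " + (sas2 > pcie2));
        System.out.println("PCIe 3.0 x" + lanes + ": " + pcie3 + " MB/s, bus-limited: " + (sas2 > pcie3));
    }
}
```

Under the x8 assumption, the eight SAS 2.0 ports together exceed what a PCIe v2.0 slot can carry, while a PCIe v3.0 slot would leave headroom.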

Mini DHS - Functionality

● Test platform for instrument/camera developers

● Receive and store data from a single Virtual Camera via 10 GbE

● May not provide the full-speed capture (960 MB/s) of a full-blown DHS Camera Line, but should be close. Capture duration will be limited due to drive array capacity.

● Fixed QAS Probe monitors camera data stream, and publishes QAS Display data to gigabit port.

● Receive and store camera header data

● Access Portal accepts HTTP requests to build FITS file exports from camera and header data

● Supports routing data through an external Processing Node for DPPS testing.

Mini DHS - Proposed Hardware

● Intel Xeon X5660 processor

  ● 2.8 GHz, 12 MB cache, 6 cores + hyper-threading

  ● Motherboard upgradeable to second CPU

● 24 GB DDR3 1333 MHz RAM

  ● Motherboard upgradeable to 48 GB

● 9 x 300/600 GB Seagate Cheetah 15K.7 SAS 2.0 drives

  ● 1 drive for O/S+CSF/ICE/etc, 8 in RAID 0 array for fast storage

● LSI SAS9260-8i SAS 2.0 PCI-Express 2.0 RAID controller

● Intel X520-DA2 10-GbE dual-port PCI-Express 2.0 network card

● Dual-port gigabit ethernet on motherboard
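
The limit on capture duration for this configuration can be estimated from the array size and the target capture rate. The sketch below assumes decimal units, the full 960 MB/s rate, and no filesystem or RAID overhead, so it gives an upper bound; `CaptureDuration` is a hypothetical name.

```java
// Rough capture-duration estimate for a RAID 0 array; decimal units, no overhead.
final class CaptureDuration {
    // Seconds until the array is full at a sustained write rate.
    static double secondsToFill(int drives, double driveGB, double rateMBps) {
        return drives * driveGB * 1000.0 / rateMBps;
    }

    public static void main(String[] args) {
        int dataDrives = 8;        // 8 drives in the RAID 0 array
        double rateMBps = 960.0;   // full camera-line rate
        double[] driveSizesGB = {300.0, 600.0};
        for (double size : driveSizesGB) {
            double seconds = secondsToFill(dataDrives, size, rateMBps);
            System.out.printf("%.0f GB drives: %.0f s (about %.0f minutes)%n",
                    size, seconds, seconds / 60.0);
        }
    }
}
```

With these assumptions the array holds well under two hours of full-rate data in either drive size.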

Compliance Matrix

● Separate file...