part 1: introduction to seiscomp - gfz-potsdam.de

Post on 31-Jul-2022


Part 1: Introduction to SeisComP

What is SeisComP?

The Seismological Communication Processor (SeisComP) is a concept for a networked seismographic system, originally developed for the GEOFON network and further extended within the projects MEREDIAN (“Mediterranean-European Rapid Earthquake Data Information and Archiving Network”) and GITEWS (“German Indian ocean Tsunami Early Warning System”). SeisComP development is coordinated by the GEOFON software development group at GFZ Potsdam (geofon_devel@gfz-potsdam.de).

SeisComP is free software, consisting of several sub-packages.

Tasks of SeisComP

● Data acquisition
● Data quality control
● Data recording
● Real-time communication
● Network status monitoring
● Real-time data processing
● Issuing event alerts
● Waveform archiving
● Waveform data distribution

What is SeedLink?

SeedLink is a data acquisition system that lies at the core of the SeisComP acquisition package. SeedLink clients connect to the server using a TCP/IP application-level protocol (the SeedLink protocol).
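To make the protocol a little more concrete, here is a minimal sketch of how a client might split one SeedLink data packet. It assumes the SeedLink 3 packet layout (an 8-byte header made of the letters SL and a six-digit hexadecimal sequence number, followed by a 512-byte Mini-SEED record), which is not stated in these slides. The names parse_packet and SeedLinkPacket are hypothetical and not part of libslink.

```python
from typing import NamedTuple

HEADER_SIZE = 8    # assumption: "SL" + 6 hexadecimal sequence digits
RECORD_SIZE = 512  # assumption: fixed-size Mini-SEED record


class SeedLinkPacket(NamedTuple):
    sequence: int  # packet sequence number
    record: bytes  # raw Mini-SEED record


def parse_packet(packet: bytes) -> SeedLinkPacket:
    """Split one SeedLink data packet into sequence number and record."""
    expected = HEADER_SIZE + RECORD_SIZE
    if len(packet) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(packet)}")
    if packet[:2] != b"SL":
        raise ValueError("missing 'SL' signature")
    # The six characters after "SL" are the sequence number in hex.
    sequence = int(packet[2:HEADER_SIZE].decode("ascii"), 16)
    return SeedLinkPacket(sequence, packet[HEADER_SIZE:])
```

A real client would normally leave this to libslink or JSeedLink; the sketch only illustrates why the sequence number lets a client resume a stream where it left off.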

The data source of a SeedLink server can be anything that is supported by a SeedLink plugin, a small program that sends data to the SeedLink server. The plugin API is defined in C, but wrappers exist for C++, Java and Python. Data supplied by a plugin can be in the form of Mini-SEED packets or just raw integer samples with accompanying timing information.

Digitizer support

Plugins for the following digitizers have been implemented by GFZ, ORFEUS and others:

● Quanterra Q380/Q680, Q4120 and Q730 (via Comserv)
● Quanterra Q330 (UDP/IP)
● EarthData PS2400 and PS6-24
● Lennartz M24, PCM 5800 and MARS-88
● Güralp DM24
● Kinemetrics K2
● Geotech DR-24
● Nanometrics HRD-24
● SARA SADC10/18/20/30
● Lacrosse 2300 weather station

SeedLink clients

SeisComP includes the following standard clients:

● slarchive saves data to disk in Mini-SEED format, using SDS (SeisComP Data Structure), BUD (Buffer of Uniform Data) or a user-defined directory structure.

● slqplot is used to plot the traces in real time, either in an X window or into a file.

● slinktool is used mainly for testing the server and for getting information about the available stations, time spans of streams, gaps, etc.
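As an illustration of the SDS layout used by slarchive, the following sketch builds the file path for one channel and one day. The layout (year/network/station/channel.type/file, with a three-digit day of year) is taken from the published SDS definition rather than from these slides, so treat it as an assumption; sds_path is a hypothetical helper, not part of SeisComP.

```python
import posixpath
from datetime import date


def sds_path(root, net, sta, loc, chan, day, rec_type="D"):
    """Return the SDS file path for one channel and one day.

    Assumed layout:
    <root>/<year>/<net>/<sta>/<chan>.<type>/<net>.<sta>.<loc>.<chan>.<type>.<year>.<doy>
    where type "D" denotes waveform data.
    """
    year = f"{day.year:04d}"
    doy = f"{day.timetuple().tm_yday:03d}"  # day of year, zero-padded
    filename = f"{net}.{sta}.{loc}.{chan}.{rec_type}.{year}.{doy}"
    return posixpath.join(root, year, net, sta, f"{chan}.{rec_type}", filename)


print(sds_path("/archive", "GE", "WLF", "", "BHZ", date(2005, 9, 1)))
# prints /archive/2005/GE/WLF/BHZ.D/GE.WLF..BHZ.D.2005.244
```

An empty location code leaves two consecutive dots in the file name, as in the example.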

Using the libslink software library or its Java counterpart JSeedLink, C and Java programmers can create custom SeedLink clients for their own real-time data processing applications without having to know the details of the SeedLink protocol. libslink supports Linux/UNIX, Windows and Mac OS X platforms.

Data import and export

In addition to plugins that talk directly to a digitizer, plugins for exporting data from the following data acquisition systems are available:

● IRIS/GSN Live Internet Seismic Server (LISS)
● IRIS/IDA Near Real-Time System (NRTS)
● Earthworm
● Kinemetrics' Antelope
● Nanometrics' NAQS
● Güralp's SCREAM
● RefTek's RTPD

Earthworm, Antelope and NAQS can also work as SeedLink clients, importing data from SeedLink.

Overview of SeedLink connectivity

Part 2: What's new in SeisComP 2.5

In short

Major new features in SeisComP 2.5 are:

● integration of former add-on packages (SHM, AutoLoc2);
● modular and extensible configuration script;
● configuration profiles.

Modular structure of SeisComP

Earlier SeisComP versions supported only data acquisition. Add-on data processing and analysis packages, such as AutoLoc and SHM, were available, but were not supported by the make_key/make_conf scripts.

SeisComP 2.5 introduces a new directory structure, which makes it easy to manage sub-packages. The modular configuration script supports all included sub-packages and can be easily extended to support additional ones.

SeisComP now resides in $HOME/seiscomp, which contains a sub-directory for each sub-package, as well as key for top-level “key files” and pkg for package configuration modules. The latter are shell scripts that must implement a certain interface. If such a script is added to pkg, the package is automatically recognized by the configuration script.
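The automatic recognition described above amounts to scanning the pkg directory. The real configuration script is a shell script whose interface is not shown in these slides, so the following is only a rough Python sketch of the idea; discover_packages is a hypothetical name, and the assumption that each module is a plain file named after its package is mine.

```python
from pathlib import Path


def discover_packages(seiscomp_root):
    """Return the sorted names of package modules found in <root>/pkg.

    Assumption: every regular file in pkg is one package
    configuration module, named after its package.
    """
    pkg_dir = Path(seiscomp_root) / "pkg"
    if not pkg_dir.is_dir():
        return []
    return sorted(p.name for p in pkg_dir.iterdir() if p.is_file())
```

With this design, adding support for a new sub-package means dropping one file into pkg; no central list has to be edited.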

Sub-packages

SeisComP 2.5 consists of the following sub-packages:

● acquisition: SeedLink server and its plugins, slarchive.

● autopick (part of SeisPy): creates a pick list, which can be used for automatic location of events and triggered recording of waveforms.

● autoloc (part of SeisPy): uses the pick list to create a list of automatic earthquake locations.

● slmon (part of SeisPy): creates a web page that displays data and feed latency of stations.

● seisgram (based on SeisGram2K by Anthony Lomax): shows real-time waveforms from multiple stations.

● qplot: displays waveforms in a format that resembles drum recordings; can also create GIF files to be displayed on a web page.

● analysis (based on Seismic Handler by Klaus Stammler): can be used to analyse waveforms and correct automatic picks manually.

Key files

The concept of key files, used by the acquisition package since SeisComP 1.0, has been extended to all sub-packages.

In SeisComP 2.5, there are two levels of key files:

● Top-level key files, located in $HOME/seiscomp/key/, contain parameters that are of interest to many packages, e.g., station latitude/longitude. In future SeisComP versions, this information might be taken from the inventory database.

● Package-level key files, located in $HOME/seiscomp/package/key/, contain parameters that are of interest to a specific package. The Q330 authorization code (acquisition) and STA length (autopick) fall into this category.

Key files (contd.)

Both levels described previously contain four types of key files:

● global: parameters that are common to all stations, e.g., the name of the data center.
● network: parameters that are common to all stations of one network. The name of the network is an obvious example.
● station: parameters of a station.
● profile: contains the same parameters as a “station” key file, but is shared by one or more stations.

Configuration profiles

A configuration profile is a set of station parameters that is shared by one or more stations. It is possible to define individual profiles for each sub-package. For example, you can define an acquisition profile for each SeedLink server you use, which saves you from the tedious job of entering the same station parameters over and over again.

Likewise, it might be useful to define profiles for other sub-packages, such as an autopick profile for each set of stations using different instrumentation.
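One way to picture how the four key file types combine is a layered merge, sketched below. The precedence (global, then network, then profile, then station) is an assumption, since the slides do not state how conflicts are resolved, and all parameter names and values are hypothetical.

```python
def resolve_station_params(global_keys, network_keys, station_keys, profile_keys=None):
    """Merge key-file levels into one parameter set for a station.

    Assumed precedence (not stated in the slides): global < network <
    profile < station, i.e. more specific levels override general ones.
    """
    merged = {}
    for level in (global_keys, network_keys, profile_keys or {}, station_keys):
        merged.update(level)
    return merged


# Hypothetical parameter names and values, for illustration only.
global_keys = {"DATACENTER": "Example Data Center"}
network_keys = {"NET_NAME": "Example Network"}
autopick_profile = {"STA_LENGTH": "2"}
station_keys = {"LATITUDE": "54.5"}

params = resolve_station_params(global_keys, network_keys, station_keys, autopick_profile)
```

Because the profile dictionary is shared, several stations with the same instrumentation can reuse it and only supply their own station-specific values.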

Plugin templates

New plugins are automatically recognized by the configuration script when templates and a special key file are added to $HOME/seiscomp/acquisition/templates/source/

It is possible to define multiple specializations of existing plugins, e.g., a chain_plugin specialization that renames networks, stations or streams, creates downsampled streams and so on.

When plugin specializations are used, it is rarely necessary to modify configuration files by hand and risk losing the changes when the files are overwritten.

Distribution

SeisComP 2.5 is distributed as a single tar file containing both source and precompiled binaries (the idea was borrowed from Seismic Handler). The package is installed by simply unpacking it in a chosen directory and executing the setup script. If necessary, the installed package can be recompiled by entering a couple of “make” commands as described in the README.

Part 3: Towards SeisComP 3.0

SeisComP releases

Release  Date           Highlights
1.0      February 2001  SeedLink 2.0 (plugin interface)
                        Plugins for EarthData PS2400 and Lennartz M24
1.1      August 2001    SeedLink 2.1 (streams.xml, improved buffer structure)
                        make_conf/make_key scripts
                        LISS plugin, SeedLink->Antelope connectivity
1.1.5    January 2002   SeedLink 2.5 (multi-station mode)
1.1.6    March 2002     GIF live seismograms
2.0      October 2003   SeedLink 3.0 (INFO request, time window extraction)
                        libslink, chain_plugin, Comserv-independence
2.1      June 2004      Python add-on package (SeisPy) incl. AutoLoc2
                        chain_plugin extension interface, triggered streams
2.5      March 2006     Integration of add-on packages, modular config script
3.0      future?        Database integration, distributed architecture

Metadata (inventory)

SeedLink can only transfer waveform data. Thus, the data is useless for seismologists unless they have obtained metadata (station coordinates, response, etc.) via other channels or they make assumptions about station hardware (seismometer type, etc.). In earlier SeisComP versions, metadata did not exist at all; SeisComP 2.5 includes limited metadata, which in simple cases (standard orientation, no history) is enough to create dataless SEED volumes locally. SeisComP 3.0 will include a full inventory database; metadata can be transferred via TCP/IP using ArcLink, SeedLink's companion protocol.

Distributed architecture

Data processing modules, most notably autopick and autoloc, are offered for SeisComP 2.x as an add-on, but all data processing must take place on a single computer. Secondly, if a user modifies automatic picks manually, the modified picks will not end up in the database and the event location and magnitude will not be updated.

In order to solve the above problems, a new data processing concept is being developed. According to the new concept, all modules communicate via a so-called “microkernel”. If, e.g., a new pick comes in or an old pick is modified, all relevant modules are notified, so the location and magnitude can be recalculated. The modules can run on different machines and communicate over the network; it is thus possible to transfer only picks instead of complete waveform data to save network bandwidth. More processing modules, such as modules for calculating different magnitudes, can be added to the system dynamically.
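The notification idea behind the “microkernel” can be sketched as a small in-process publish/subscribe bus. This is only an illustration of the concept under my own assumptions: the class MessageBus, the topic names and the message fields are hypothetical, and the real system is meant to work across machines rather than inside one process.

```python
from collections import defaultdict


class MessageBus:
    """Toy stand-in for the messaging "microkernel": routes messages by topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Notify every module that registered interest in this topic.
        for callback in list(self._subscribers[topic]):
            callback(message)


bus = MessageBus()
log = []


def locator(pick):
    """Hypothetical locator: reacts to a pick and publishes a new origin."""
    log.append(f"relocating with pick {pick['station']}")
    bus.publish("origin", {"picks": [pick["station"]]})


def magnitude(origin):
    """Hypothetical magnitude module: reacts to a new or updated origin."""
    log.append(f"recomputing magnitude from {len(origin['picks'])} pick(s)")


bus.subscribe("pick", locator)
bus.subscribe("origin", magnitude)
bus.publish("pick", {"station": "RGN", "time": "2005-09-01T00:00:05"})
```

Adding another magnitude type only requires one more subscriber on the origin topic; the picker and locator stay unchanged, which is the point of the decoupling.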

Database

The SeisComP database has two major components:

● Inventory DB contains all information that is needed for IDL:iris.edu/Fissures/IfNetwork/Instrumentation:1.0 and dataless SEED volumes.

● Event DB contains event information, such as origins, magnitudes, pick associations, etc.

These components should be viewed as groups of tables in a single (MySQL, etc.) database. In addition to the two groups, there might be tables that contain user profiles, etc.

[Figure: the SeisComP DB, comprising the Inventory DB and the Event DB.]

XML schema

Database schema is accompanied by an XML schema for data exchange. Both are mappings of the general SeisComP object model.

[Figure: the object model (UML diagram) is mapped both to a DB schema (relational database, with table columns such as Net_code, Net_start, Net_end, Sta_code, Sta_start, Sta_end and Latitude) and to an XML schema (XML document, with station elements nested inside network elements, e.g. a network with code “GE” and description “GEOFON” containing a station with code “RGN”).]
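The idea that the database schema and the XML schema are two mappings of one object model can be sketched with a pair of small classes. Only the attributes code and description are used here; the class names, the function names and the station description are hypothetical, and the real schemas carry many more fields.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Station:
    code: str
    description: str


@dataclass
class Network:
    code: str
    description: str
    stations: list = field(default_factory=list)


def network_to_xml(net):
    """Map the object to an XML element with nested station elements."""
    elem = ET.Element("network", code=net.code, description=net.description)
    for sta in net.stations:
        ET.SubElement(elem, "station", code=sta.code, description=sta.description)
    return elem


def network_to_rows(net):
    """Map the same object to flat rows, as a relational table would store it."""
    network_row = (net.code, net.description)
    station_rows = [(net.code, s.code, s.description) for s in net.stations]
    return network_row, station_rows


net = Network("GE", "GEOFON", [Station("RGN", "example station")])
xml_elem = network_to_xml(net)
network_row, station_rows = network_to_rows(net)
```

In the XML form the hierarchy is expressed by nesting, while in the relational form each station row repeats the network code as a key.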

Provisional SeisComP inventory model

[Figure: UML diagram of the inventory model. Entities and attributes:]

● network: code, description, start_time, end_time, institutions, class, type
● station: code, description, start_time, end_time, latitude, longitude, elevation, depth
● seis stream: code, loc_id, start_time, end_time, depth, format, seismometer, seismometer_sn, datalogger, datalogger_sn, sample_rate
● aux stream: code, loc_id, start_time, end_time, format, device, source_id
● component: code, azimuth, dip
● QC log: start_time, end_time, message
● outage: start_time, end_time
● seismometer: name, description, manufacturer, etc.
● datalogger: name, description, manufacturer, etc.
● decimation: sample_rate, filter_chain
● calibration: sn, index, gain0, gain1, gain2
● FIR resp.: name, coefficients, etc.
● PAZ resp.: name, coefficients, etc.

The aux device and routing entities are shown on the continuation slide.

Provisional SeisComP inventory model (contd.)

[Figure: UML diagram, continued. In addition to network, seismometer, datalogger, FIR resp. and PAZ resp., the inventory contains:]

● routing: network, station
● arclink: start_time, end_time, address, priority
● seedlink: address, priority
● aux device: name, description, model, manufacturer
● aux source: id, description, unit, conversion, sample_rate

SeisComP inventory model and XML-SEED

XML-SEED is an XML format in which each SEED blockette is directly mapped to an XML element. Thus XML-SEED, like SEED, is a relatively low-level format, which is good for building self-consistent volumes for data exchange, but not very useful for maintaining large networks. In particular:

● SEED does not have a notion of a digitizer; only FIR filters are specified for each channel. It is difficult, if not impossible, to repair SEED volumes if one particular digitizer model turns out to have an incorrect FIR filter: one has to find all SEED volumes where this digitizer is used (based on additional knowledge) and fix all streams that use the FIR filter in question.

● SEED does not distinguish between component, channel and stream, so it is necessary to specify, e.g., orientation and sample rate independently for each stream, even though VHZ, LHZ and BHZ always have the same orientation and BHZ, BHN and BHE always have the same sample rate. If an orientation or sample rate turns out to be incorrect, all affected streams must be modified.

SeisComP inventory model and XML-SEED (contd.)

The primary goal of the SeisComP inventory model is to simplify the maintenance of large heterogeneous seismic networks consisting of multiple sub-networks. We avoid duplicating information: there is one definition for each seismometer and digitizer, and streams refer to these definitions. Thus, if a digitizer has an incorrect FIR filter or a seismometer has an incorrect gain, it has to be corrected in only one place.
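The benefit of referring to a single definition can be shown with a few lines of Python. This is a simplified sketch, not the actual schema: the classes Datalogger and Stream, the field fir_filter and the values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Datalogger:
    name: str
    fir_filter: str


@dataclass
class Stream:
    code: str
    datalogger: Datalogger  # a reference to the shared definition, not a copy


# One digitizer definition, referenced by three streams.
logger = Datalogger(name="example-logger", fir_filter="FIR_v1")
streams = [Stream("BHZ", logger), Stream("BHN", logger), Stream("BHE", logger)]

# Correcting the filter once is visible through every stream.
logger.fir_filter = "FIR_v2"
filters = [s.datalogger.fir_filter for s in streams]
```

In a SEED-like layout each stream would carry its own copy of the filter, so the same correction would have to be repeated for every stream and every volume.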

The inventory model is “DHI-compatible” in the sense that it specifies all information needed for DHI. Some of this information, such as the digitizer model, is missing in SEED. Creating a DHI instrumentation object (IDL:iris.edu/Fissures/IfNetwork/Instrumentation:1.0) from an instance of the inventory model (either database or XML file) is straightforward.

We have developed a Python script that creates dataless SEED from an instance of the inventory model. XML-SEED will be supported in the near future.

Metadata exchange

[Figure: metadata exchange. ArcLink clients and ArcLink servers exchange XML over TCP/IP; on each side the inventory is held in a DB accessed via SQL, either directly or through the ArcLink library, and can be output as dataless SEED or XML.]

Part 4: ArcLink/DHI

ArcLink node

[Figure: an ArcLink node. The server listens on TCP port 18001 and talks to (1..n) request handlers through a request pipe and a status pipe. Request handlers use an inventory module, an event module and a waveform module to access the data repository (inventory DB, data archive, event DB) and deliver a data product (SEED or XML file).]

Configurable parameters:

● Maximum number of parallel request handlers (i.e., number of requests processed in parallel).
● Number of pre-forked request handlers.
● Length of request queue.
● Maximum number of parallel TCP/IP connections.
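The interplay of these parameters can be sketched as a simple admission rule. The slides do not say what the server does when the queue is full, so rejecting the request is an assumption here, and NodeConfig and admit_request are hypothetical names rather than actual ArcLink configuration keys.

```python
from dataclasses import dataclass


@dataclass
class NodeConfig:
    max_handlers: int     # maximum number of parallel request handlers
    prefork: int          # number of pre-forked request handlers
    queue_length: int     # length of the request queue
    max_connections: int  # maximum number of parallel TCP/IP connections


def admit_request(cfg, running, queued):
    """Decide what happens to a new request (assumed policy)."""
    if running < cfg.max_handlers:
        return "run"     # a handler is free, process immediately
    if queued < cfg.queue_length:
        return "queue"   # all handlers busy, wait in the queue
    return "reject"      # assumption: a full queue means rejection


cfg = NodeConfig(max_handlers=4, prefork=2, queue_length=10, max_connections=50)
decision = admit_request(cfg, running=4, queued=3)
```

Pre-forking only affects how quickly a free handler can start, so it does not appear in the admission rule itself.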

ArcLink network

[Figure: an ArcLink network. A client has a primary connection to one ArcLink node and secondary connections to other nodes; the network contains active and passive ArcLink nodes, and inventory DB synchronization takes place between nodes.]

ArcLink user interfaces

[Figure: user interfaces to the ArcLink network: command-line client (arclinktool), WWW interface, breq_fast, AutoDRM, and DHI with the clients JWeed, Vase and JPlotResp.]

Request format

General request format:

    REQUEST request_type attributes
    start_time end_time network station stream location constraints
    ...
    END

Sample inventory request:

    REQUEST INVENTORY instruments=true outages=false logs=true
    2005,09,01,00,00,00 2005,09,01,00,10,00 * . restricted=false
    2005,09,01,00,00,00 2005,09,01,00,10,00 * * BH* * latmin=50 latmax=60
    END

Sample waveform request:

    REQUEST WAVEFORM format=FSEED
    2005,09,01,00,00,00 2005,09,01,00,10,00 GE WLF BH*
    2005,09,01,00,00,00 2005,09,01,00,10,00 GE KBS BHZ 00
    END
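A request of this shape is easy to assemble programmatically. The sketch below reproduces the sample waveform request; build_request and format_time are hypothetical helpers (not part of arclinktool), and joining lines with a plain newline is an assumption about the wire format.

```python
from datetime import datetime


def format_time(t):
    """Format a datetime as year,month,day,hour,minute,second."""
    return t.strftime("%Y,%m,%d,%H,%M,%S")


def build_request(request_type, attributes, lines):
    """Build a request string.

    attributes: dict of attribute name to value for the REQUEST line.
    lines: iterable of (start, end, field, ...) tuples; the trailing
    fields (network, station, stream, location, constraints) are strings.
    """
    header = " ".join(
        ["REQUEST", request_type] + [f"{k}={v}" for k, v in attributes.items()]
    )
    body = [
        " ".join([format_time(start), format_time(end), *fields])
        for start, end, *fields in lines
    ]
    return "\n".join([header, *body, "END"])


start = datetime(2005, 9, 1, 0, 0, 0)
end = datetime(2005, 9, 1, 0, 10, 0)
request = build_request(
    "WAVEFORM",
    {"format": "FSEED"},
    [(start, end, "GE", "WLF", "BH*"), (start, end, "GE", "KBS", "BHZ", "00")],
)
```

Keeping the trailing fields as a variable-length tuple mirrors the samples, where the location code and constraints are optional.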

Request types

Request type  Attributes                                                 Product format
INVENTORY     instruments=true|false outages=true|false logs=true|false  XML
RESPONSE      format=SEED|XSEED                                          Dataless SEED, XML (XML-SEED)
WAVEFORM      format=MSEED|FSEED|XSEED                                   Mini-SEED, Full SEED, XML (XML-SEED)
EVENTS        picks=true|false                                           XML (QuakeML)
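The table of request types translates directly into a lookup that a client could use to check attributes before sending a request. This is a sketch based only on the table; the names ALLOWED_ATTRIBUTES and validate_attributes are hypothetical, and a real server may accept further attributes.

```python
BOOL = {"true", "false"}

# Request type -> attribute name -> allowed values, as listed in the table.
ALLOWED_ATTRIBUTES = {
    "INVENTORY": {"instruments": BOOL, "outages": BOOL, "logs": BOOL},
    "RESPONSE": {"format": {"SEED", "XSEED"}},
    "WAVEFORM": {"format": {"MSEED", "FSEED", "XSEED"}},
    "EVENTS": {"picks": BOOL},
}


def validate_attributes(request_type, attributes):
    """Raise ValueError if an attribute or value is not in the table."""
    allowed = ALLOWED_ATTRIBUTES.get(request_type)
    if allowed is None:
        raise ValueError(f"unknown request type: {request_type}")
    for name, value in attributes.items():
        if name not in allowed:
            raise ValueError(f"{request_type} has no attribute {name}")
        if value not in allowed[name]:
            raise ValueError(f"invalid value for {name}: {value}")
    return True
```

For example, validate_attributes("WAVEFORM", {"format": "FSEED"}) passes, while a WAVEFORM request with format=SEED is rejected because SEED is only listed for RESPONSE.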

CORBA & DHI

Common Object Request Broker Architecture (CORBA) is a standard for distributed computing, i.e., running a single application on multiple physical computers. The interface of a CORBA object can be used on the client side like a normal Java/C++ object; the implementation of the object runs on the server side.

Data Handling Interface (DHI) is a specification of CORBA objects for accessing seismological data.

[Figure: the DHI server holds the object implementations and the DHI client holds the object interfaces; the ORBs on both sides communicate via an inter-ORB protocol.]

Problems with CORBA

CORBA is an elegant solution for systems in which all components are trusted, but security problems arise when clients are executed by untrusted users on the Internet. Network-specific security checks even conflict with the fundamental idea of CORBA, which is to hide from the programmer that the objects are not running on the local machine.

For the same reason, CORBA does not provide any standard method for determining the IP address of a client. The notion of “logging in” does not exist in CORBA: the client typically obtains a reference to an existing CORBA object via a name server; this object is then used to create other CORBA objects. It is not possible to see which users hold which objects in the server, because CORBA works behind the scenes and the programmer has very little control over it. Combined with the lack of user authentication in DHI, this makes it very difficult to compile usage statistics.

Problems with DHI

Due to the lack of user authentication, it is not possible to provide access to restricted datasets or to compile user-based statistics.

A fast and stable network connection is required, because CORBA connections have relatively high overhead and it is not possible to resume a broken download.

Download size must be limited to a few megabytes to avoid running out of memory. Nevertheless, the server may run out of memory if many clients request data at the same time.

No routing. The user must know in advance from which data center to request the data.

Problems with DHI (contd.)

Mini-SEED headers are removed. Even though Mini-SEED data can be reconstructed on the client side, some information, such as timing quality, is lost.

No support for non-waveform Mini-SEED data (e.g., log records).

Even though it is theoretically possible to construct full SEED on the client side, no software exists to accomplish that.

Too much flexibility: many interfaces are not implemented in any of the existing servers. Developers of DHI clients need to find out by trial and error which interfaces are implemented in which servers.

No documentation.

ArcLink + DHI = ?

Despite these problems, DHI is attractive due to its range of GUI clients. Therefore we have implemented a DHI server that acts as a proxy to the ArcLink network. Users can install this package in the local network of their institute.

This approach solves many of the problems:

● It is possible to set a username and password in the ArcLink/DHI configuration, so the administrator of the ArcLink server can compile institute-based statistics and provide access to restricted datasets.

● Using ArcLink/DHI, a user connects to a “virtual” data center: ArcLink provides request routing, which is missing in DHI.

● There are no security problems, because the users are trusted.

● With a fast local network, CORBA overhead is not a problem.
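Request routing can be pictured as a table lookup keyed by network and station, with a priority attached to each server address. The sketch below is an assumption about how such a table might be used: the function route, the convention that an empty station means "whole network", the rule that a lower number means higher priority, and the example addresses are all hypothetical.

```python
def route(table, network, station):
    """Return candidate server addresses, best priority first.

    Station-specific entries win over network-wide entries (station "").
    Assumption: a lower priority number means a more preferred server.
    """
    exact = [r for r in table if r["network"] == network and r["station"] == station]
    fallback = [r for r in table if r["network"] == network and r["station"] == ""]
    matches = exact or fallback
    return [r["address"] for r in sorted(matches, key=lambda r: r["priority"])]


# Hypothetical routing entries.
routing_table = [
    {"network": "GE", "station": "", "address": "node-a.example.org:18001", "priority": 1},
    {"network": "GE", "station": "", "address": "node-b.example.org:18001", "priority": 2},
    {"network": "GE", "station": "KBS", "address": "node-c.example.org:18001", "priority": 1},
]

servers = route(routing_table, "GE", "WLF")
```

With such a table a client only has to contact one node; the node can tell it which other nodes to try, and in which order, for data it does not hold itself.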

To do...

Reorganize GEOFON inventory DB.

Review and “standardize” inventory schema:

● add QC logs, aux streams and outages;
● allow reuse of network code for temporary nets;
● add “restricted” attribute to individual streams;
● add “last_modified” attribute;
● add (default) “depth” attribute to station;
● try to use FISSURES terminology where applicable;
● ...

Event request.

Inventory DB synchronization.

To do... (contd.)

Full SEED (mostly done) and XML-SEED (should be easy).

breq_fast and AutoDRM interfaces.

Tools to edit inventory DB.

Server improvements:

● user profile management;
● request priorities;
● blocking download;
● save/restore mechanism to allow server restart without canceling requests.

Appendix

autopick/autoloc

slmon

seisgram

qplot

analysis

Equipment used in GEOFON stations

STS-2 VBB seismometer

EarthData digitizer + SeisComP box

Seismic vault equipment

Seismometer STS-2

STS-2 Hostbox

Digitizer

Powerbox

Serial Line Driver

GPS Receiver

Recording site equipment

Directional WLAN Antenna

SeisComP Recorder Box

Powerbox

Batteries Charger

Transformer Box

SeedLink data flow in GFZ

Network data processing

Data processing products

The end
