
From the ChannelArchiver to the Best Ever Archive Utility, Yet

[email protected]

July 2009

Managed by UT-Battelle for the U.S. Department of Energy

Channel Archiver

[Diagram labels: CSS-based OPI; XML-RPC; Data Server; Binary Data Files; Archive Engine; config.xml; “ASCII” Config.; Channel Access; IOC]

History
• ~2000: Started by Bob Dalesio
• ~2003: Index Tools, Data Server
• ~2007: CSS Client


Problems

• Data file format optimized to write many samples quickly
– More than 40000/second
– ... but we only used maybe 1000/sec
– ... and many ill-configured or duplicate channels
• Headaches with data maintenance:
– Scripts to restart engines, copy data, update indices
– Index time grows with data
– Stuck when index files reach 2GB
– SNS users faced with ~80 sub-archives
– No clue what needs fixing after network/power problems
– No idea who contributes how many samples
– No way to remove selected channels or time ranges
– Improving on this means implementing an RDB
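Once samples live in a relational database, "who contributes how many samples" is a single aggregate query. A minimal sketch, using Python's built-in sqlite3 as a stand-in for Oracle/MySQL; the `sample` table, its columns, and the channel names are hypothetical, not the actual schema:

```python
import sqlite3

def samples_per_channel(conn):
    """Return (channel, sample_count) rows, busiest channel first."""
    return conn.execute(
        "SELECT channel, COUNT(*) AS n FROM sample "
        "GROUP BY channel ORDER BY n DESC, channel"
    ).fetchall()

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (channel TEXT, stamp REAL, value REAL)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [("noisy_pv", t, 0.0) for t in range(5)] + [("quiet_pv", 0, 1.0)],
)

for channel, count in samples_per_channel(conn):
    print(channel, count)
```

A report like this is one way to spot ill-configured channels that produce far more samples than intended.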


New Setup

[Diagram labels: CSS-based OPI; RDB (Oracle/MySQL); Samples; Config.; ArchiveEngine; config.xml; EngineConfigImport; Channel Access; IOC; Other tools for config & samples]


CSS Data Browser Handles Both

• New URL
• Just one ‘RDB’ sub-archive
• Old and new data can be combined in one plot


Web Interface to Engine Config

• Tomcat/JSP/Servlets to view and edit


Web Config View: Channel Stats


Web Config View: Sample Stats

[Screenshot with annotations “OK” and “??”]


Stats

• 38 sample engines, 83000 channels
• Host that runs sampling engine:
– CPU load 45%, zero disk I/O wait, very responsive
• Oracle
– Cluster
– Sample tables partitioned by day
– 8000 samples/sec peak in write tests
– Operationally maybe ¼ of that
• Better configuration would likely have fewer samples/sec
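The figures above imply a few derived numbers. A back-of-the-envelope sketch, assuming the operational rate is exactly one quarter of the write-test peak and stays steady over a day (both simplifications of "maybe ¼ of that"):

```python
PEAK_WRITE_RATE = 8000        # samples/sec, peak in write tests (from the slide)
OPERATIONAL_FRACTION = 1 / 4  # "maybe 1/4 of that", taken literally here
ENGINES = 38
CHANNELS = 83000
SECONDS_PER_DAY = 24 * 60 * 60

# Assumed steady operational rate, samples/sec.
operational_rate = PEAK_WRITE_RATE * OPERATIONAL_FRACTION
# Rows that would land in one daily partition at that rate.
samples_per_day = operational_rate * SECONDS_PER_DAY
# Average load per engine and per channel.
channels_per_engine = CHANNELS / ENGINES
rate_per_channel = operational_rate / CHANNELS

print(f"operational rate:    {operational_rate:.0f} samples/sec")
print(f"per daily partition: {samples_per_day:,.0f} samples")
print(f"channels per engine: {channels_per_engine:.0f}")
print(f"average per channel: {rate_per_channel:.3f} samples/sec")
```

Under these assumptions each daily partition would hold on the order of 170 million samples, which gives a sense of scale for the data reduction mentioned under Summary, Status.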


Summary, Status

• At SNS, BEAUtY replaced Channel Archiver
– Parallel operation for ~2 months
– Turned old sample engine off this month
• About a year of testing, many Oracle setup issues
– Oracle cluster setup
– Updated partitioning
• Next
– Data reduction: Replace Oracle partitions of old data with reduced channel/sample count
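The slides do not say how the sample count would be reduced. One common approach, shown here purely as an assumption, is to average raw samples into fixed-width time bins; the function name and the (stamp, value) tuple layout are hypothetical:

```python
from collections import defaultdict

def reduce_samples(samples, bin_seconds):
    """Average (stamp, value) pairs into fixed-width time bins.

    Returns a list of (bin_start, mean_value, sample_count), sorted by time.
    """
    bins = defaultdict(list)
    for stamp, value in samples:
        # Round the time stamp down to the start of its bin.
        bin_start = (stamp // bin_seconds) * bin_seconds
        bins[bin_start].append(value)
    return [
        (start, sum(values) / len(values), len(values))
        for start, values in sorted(bins.items())
    ]

raw = [(0, 1.0), (10, 3.0), (59, 2.0), (60, 10.0), (125, 4.0)]
for row in reduce_samples(raw, 60):
    print(row)
```

Keeping the per-bin count alongside the mean lets a viewer tell a well-sampled bin from one that holds a single point.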


Stuff


Hurdles

• Months: Get new Oracle server configured
– Interface cards for storage array, fiber switches
– Firewall holes for office access, backup, admin
• Changes in 10g
– SELECT MIN(stamp), MAX(stamp) -> NULL, NULL
– No “range” partitioning on Index-Organized-Tables
• Configuration issues
– ORA-01654: unable to extend …
– ORA-00257: archiver error
• What used to be impossible is now “easy”, but still expensive
– DELETE FROM SAMPLES WHERE …
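Two of the points above can be illustrated with a small sketch: removing one channel's samples for a time range, and coping with a MIN/MAX query that comes back as NULL, NULL. It uses Python's built-in sqlite3 rather than Oracle, so it does not reproduce any 10g-specific behavior. Apart from the `SAMPLES` table and `stamp` column named on the slide, every column, channel name, and helper function is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (channel TEXT, stamp INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [("pv_a", t, float(t)) for t in range(10)] + [("pv_b", 3, 0.5)],
)

def delete_range(conn, channel, start, end):
    """Delete one channel's samples with start <= stamp < end; return rows removed."""
    cur = conn.execute(
        "DELETE FROM samples WHERE channel = ? AND stamp >= ? AND stamp < ?",
        (channel, start, end),
    )
    conn.commit()
    return cur.rowcount

def time_range(conn, channel):
    """Return (first, last) stamp, or None when the channel has no samples."""
    lo, hi = conn.execute(
        "SELECT MIN(stamp), MAX(stamp) FROM samples WHERE channel = ?",
        (channel,),
    ).fetchone()
    # With no matching rows, SQLite returns one row of NULLs (None, None).
    return None if lo is None else (lo, hi)

print(delete_range(conn, "pv_a", 2, 5))
print(time_range(conn, "pv_a"))
print(time_range(conn, "no_such_pv"))
```

On a table with hundreds of millions of rows, a delete like this still has to find and rewrite every matching row, which is presumably what the slide means by "still expensive".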


Configure Sample Engines

• Configuration is in RDB
– Directly use SQL
– EngineConfigImport for legacy config files
– View/Edit via web
• Hierarchical (as before)
– Sampling engine (name, where to run, …)
  • Groups
    – Channels
• No more duplicate channels!
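The engine, group, channel hierarchy maps naturally onto three tables, and a uniqueness constraint is one way a database can rule out duplicate channels. The slides do not show the real schema or say how duplicates are prevented, so the following is only a sketch with hypothetical table and column names, again using sqlite3 in place of Oracle/MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE engine  (id INTEGER PRIMARY KEY, name TEXT UNIQUE, host TEXT);
    CREATE TABLE grp     (id INTEGER PRIMARY KEY, name TEXT,
                          engine_id INTEGER NOT NULL REFERENCES engine(id));
    -- UNIQUE on the channel name: a channel can belong to only one group,
    -- and therefore to only one engine.
    CREATE TABLE channel (id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                          grp_id INTEGER NOT NULL REFERENCES grp(id));
""")
conn.execute("INSERT INTO engine VALUES (1, 'engine_1', 'host_1')")
conn.execute("INSERT INTO grp VALUES (1, 'group_1', 1)")
conn.execute("INSERT INTO channel VALUES (1, 'pv_x', 1)")

def add_channel(conn, name, grp_id):
    """Insert a channel; return False if the database rejects it
    (duplicate name or unknown group)."""
    try:
        conn.execute(
            "INSERT INTO channel (name, grp_id) VALUES (?, ?)", (name, grp_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(add_channel(conn, "pv_y", 1))  # new name
print(add_channel(conn, "pv_x", 1))  # name already configured
```

Enforcing the rule in the schema means every path into the configuration (direct SQL, an import tool, a web editor) is subject to the same check.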


Other Ideas

• JLab’s MyA
– Operational, but
  • Wrapper code around MySQL to create ‘cluster’
  • Handles less meta info (units, limits, …)
  • Viewer is one-of C++/TCL/Tk
• Gabriele Carcassi mentioned RRDtool
– Toolset for logging data with data-aging
– Command-line RPN tools, web viewer
– May be faster than general-purpose RDB
  • but lacks advantage of general-purpose RDB