Post on 06-Jan-2018

THE 1ST NATIONAL PUC DOCKETS DATABASE:

AEE POWERSUITE

Eric Fitz, Director, Engineering and Product Development

NARUC Subcommittee on Information Services

November 2014

Advanced Energy Economy

AEE is a national association of business leaders who are making the global energy system more secure, clean, and affordable.

Mission: Transform public policy to enable rapid growth of advanced energy companies.

AEE Membership Across Technologies

Two Energy Policy Data Problems:

1) Fragmented Data

2) Big Data

INDUSTRY DATA IS FRAGMENTED

[Diagram: separate data sources, including industry stakeholder groups, NREL, EIA, DSIRE, OpenEI, and state PUC databases (CA, MA, IL, TX, CT)]

You must follow dozens of data sources to track important issues.

Big Data

Policy work is plagued by the three “Vs”:
• Volume of policy data
• Variety of legislative/regulatory processes
• Velocity of data change

AEE DIGITAL PLATFORM VISION

[Diagram: the same data sources (industry stakeholder groups, NREL, EIA, DSIRE, OpenEI, and state PUC databases for CA, MA, IL, TX, CT) feeding a single AEE Big Data Asset]

The Solution – AEE’s PowerSuite

PowerSuite is a robust set of tools, including BillBoard, DocketDash, and PowerPortal, that allows you to search, track, and collaborate on energy legislation and utility regulatory proceedings from across the country through one easy-to-use interface.

PowerSuite Products

Review of Features

Core Features

Search
• First national PUC database
• Advanced energy focused bills
• Simple interface

Track
• Email notifications
• Favorites
• Reporting

Collaborate
• Summaries
• Priority and Position
• Comments

User Testimonial

Jim Kennerly, Senior Policy Analyst

“PowerSuite is really amazing…I've already discovered some incentives in California (tax exemptions and such) we didn't even have in the database! This is really going to help us tremendously - great product.”

DEMO

DocketDash System Details

DocketDash Coverage: 46 States + DC

[Map of state coverage. Legend: Under Development; Quality Assessment (QA) Pending; Review Completed]

DocketDash Key Stats

• Dockets: 190K
• Documents: 2.6M (900GB of PDFs)
• Pages: 32M (60GB of raw text)
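To put these totals in proportion, the short sketch below derives a few averages from the figures on this slide. It assumes that "GB" means 10**9 bytes and that the rounded slide totals are close enough for rough estimates; the variable names are illustrative only and are not part of DocketDash.

```python
# Totals as stated on the Key Stats slide (rounded figures).
DOCKETS = 190_000
DOCUMENTS = 2_600_000
PAGES = 32_000_000
RAW_TEXT_BYTES = 60 * 10**9   # assumption: 1 GB = 10**9 bytes
PDF_BYTES = 900 * 10**9       # assumption: 1 GB = 10**9 bytes

# Simple averages derived from the totals above.
docs_per_docket = DOCUMENTS / DOCKETS
pages_per_document = PAGES / DOCUMENTS
text_bytes_per_page = RAW_TEXT_BYTES / PAGES
pdf_kb_per_document = PDF_BYTES / DOCUMENTS / 1000

print(f"Documents per docket: {docs_per_docket:.1f}")
print(f"Pages per document:   {pages_per_document:.1f}")
print(f"Text bytes per page:  {text_bytes_per_page:.0f}")
print(f"PDF KB per document:  {pdf_kb_per_document:.0f}")
```

On these rounded inputs, an average docket holds roughly 14 documents and an average document runs about 12 pages.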

Number of Pages: Wikipedia vs. DocketDash

[Bar chart, number of pages in millions: Wikipedia 34M*, DocketDash 32M]

*As of November 2014, http://en.Wikipedia.Org/wiki/wikipedia:statistics

DocketDash will surpass Wikipedia’s total content in a few months.

DocketDash Technology Stack

[Diagram: bills and dockets from state PUCs (CA, MA, IL, TX, CT) pass through the stages Collect, Adapt, Process, Store, and Index into the AEE Big Data Asset, which feeds Display (User Interface)]

Technology Stack Detail

Collect
• Dynamic docket metadata collection at off-peak hours (Docket #, Title, Description, Parties, Date...)

Adapt
• Map source schema to AEE standard

Download
• Queue downloads and identify scanned documents

Process
• OCR pipeline (OCR = Optical Character Recognition): Scanned Document → Reassembled PDF and Extracted Text
• 20 CPU-Years

Validate
• Review metadata and check for failures
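The Adapt, Download, and Validate steps listed above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `StandardDocket` fields follow the metadata named on the slide, but the per-state field names in `FIELD_MAPS`, the sample record, and the text-length heuristic in `looks_scanned` are hypothetical and do not describe AEE's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class StandardDocket:
    """Hypothetical stand-in for a standardized docket record."""
    state: str
    docket_number: str
    title: str
    description: str
    date_filed: str
    parties: list = field(default_factory=list)

# Hypothetical per-state maps: standard field -> field name on the source site.
FIELD_MAPS = {
    "CA": {"docket_number": "ProceedingNo", "title": "Caption",
           "description": "Summary", "parties": "Parties",
           "date_filed": "FiledDate"},
    "TX": {"docket_number": "control_no", "title": "style",
           "description": "desc", "parties": "participants",
           "date_filed": "filed"},
}

def adapt(state, raw):
    """Adapt step: map one source record onto the standard schema."""
    mapping = FIELD_MAPS[state]
    values = {std: (raw.get(src) or "") for std, src in mapping.items()}
    # Assumption: parties arrive as one semicolon-separated string.
    parties = [p.strip() for p in values["parties"].split(";") if p.strip()]
    return StandardDocket(
        state=state,
        docket_number=values["docket_number"].strip(),
        title=values["title"].strip(),
        description=values["description"].strip(),
        date_filed=values["date_filed"].strip(),
        parties=parties,
    )

def validate(docket):
    """Validate step: return a list of problems (empty means the record passed)."""
    problems = []
    for name in ("docket_number", "title", "date_filed"):
        if not getattr(docket, name):
            problems.append(f"missing {name}")
    return problems

def looks_scanned(page_texts, min_chars_per_page=50):
    """Heuristic: pages with almost no extractable text suggest a scanned PDF
    that should be routed to OCR. The threshold is an arbitrary assumption."""
    if not page_texts:
        return True
    average = sum(len(t.strip()) for t in page_texts) / len(page_texts)
    return average < min_chars_per_page

# Made-up source record, for illustration only.
raw_tx = {"control_no": " 12345 ", "style": "Example rate case", "desc": "",
          "participants": "Utility A; Commission Staff", "filed": "2014-11-01"}
docket = adapt("TX", raw_tx)
print(docket.parties)
print(validate(docket))
print(looks_scanned(["", "  "]))
```

Keeping one small field map per state confines the state-by-state differences to configuration, so the validation and download logic can stay the same for every source.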

What Have We Learned?

• PUC docket sites vary dramatically state by state
  • Usability
    • Permalinks
    • Search
  • Data structure
  • Nomenclature
  • Digital vs. paper system
• Creating a standardized docket system is hard

QUESTIONS?

Create an account today > PowerSuite.aee.net

PowerSuite is FREE for federal, state, and municipal government employees
