Georgios Chatzichristos
Operational Security Unit - ENISA
04 06 2019
OPEN-CSAM
INFORMATION AGGREGATOR AND REPORTING TOOL USING AI AND NATURAL LANGUAGE PROCESSING
THE GOAL
Help decision makers make better decisions!
THE TRIGGER
Open Cyber Security Awareness Machine
BluePrint
Technical
Operational
The Vision
Make ENISA an open-source information hub, with good training data for AI available to all:
● Threat analysts
● Academia
● Essential services providers
● Researchers
● Cyber security professionals
● CSIRTs
Diagram labels: Training data for AI; Use services; Contribute to QoS
The process
● Monitor & cluster trending open source information
● Cybersecurity search engine
● Reporting
NLP
● Continuous monitoring
● Daily/Weekly/Monthly/Yearly stats
● Trending terms in Tweets
● Trending terms in News
● ENISA’s terms
● ENISA’s topics
AI
● Continuous monitoring
● Daily/Weekly/Monthly/Yearly stats
Knowledge Graph
● Hardcoded
● Used to drive AI
Searching
Reporting
OPENCSAM
CURRENT STATUS OF DEVELOPMENT
EAU DE WEB
OPENCSAM STATUS
3 main directions:
● Scalable backend using Django REST API
● Friendly UX for all user types
● Proof-of-concept implementations for:
○ Knowledge Graph enrichment (w/ Twitter hashtags and PageRank)
○ News clustering (w/ Universal Sentence Encoder)
○ Text summarization (w/ Universal Sentence Encoder)
○ Classify web content for KG terms (w/ TensorFlow and corpus-derived FastText SkipGram word embeddings)
○ Classify web corpus (w/ USE & seed corpus)
OPENCSAM ARCHITECTURE
OPENCSAM - BACKEND
● Django REST API based backend
● Celery for queue management and scrapers
● PostgreSQL database
● Elasticsearch for document management
● Docker for easy deployment
OPENCSAM - UI DEVELOPMENT
● Developed wireframes for main modules:
○ Knowledge graph
○ Reporting/summarisation
○ Trending terms
○ Search
○ Source administration
OPENCSAM - UI DEVELOPMENT
● Dynamic KG editor prototype
● Term evolution over time
OPENCSAM - KG SUGGESTED TERMS
• Extract emerging terms (hashtag co-occurrence)
• Generate co-occurrence graph
• Score concepts with PageRank
• Detect variations in rank over time
• Suggest terms with the highest positive variation
• The super-user adds the terms to the KG
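The steps on this slide can be sketched in plain Python. This is a minimal illustration, not the project's code: the function names (`cooccurrence_graph`, `pagerank`, `suggest_terms`), the damping factor of 0.85, the comparison of exactly two time windows, and the sample hashtags are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(tweets):
    """Build an undirected weighted graph: hashtags are nodes, and each edge
    weight counts the tweets in which both hashtags appear."""
    graph = defaultdict(lambda: defaultdict(int))
    for hashtags in tweets:
        for a, b in combinations(sorted(set(hashtags)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (damping value is an assumption)."""
    nodes = list(graph)
    if not nodes:
        return {}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour m passes on a share of its rank proportional
            # to the weight of the edge between m and n.
            incoming = sum(rank[m] * w / out_weight[m] for m, w in graph[n].items())
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def suggest_terms(previous_tweets, current_tweets, top_n=3):
    """Return hashtags with the largest positive PageRank change between
    two time windows, highest variation first."""
    before = pagerank(cooccurrence_graph(previous_tweets))
    after = pagerank(cooccurrence_graph(current_tweets))
    variation = {tag: score - before.get(tag, 0.0) for tag, score in after.items()}
    rising = [(tag, delta) for tag, delta in variation.items() if delta > 0]
    return sorted(rising, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical data: one list of hashtags per tweet, for two time windows.
last_week = [["malware", "phishing"], ["malware", "ransomware"],
             ["phishing", "malware"]]
this_week = [["ransomware", "wannacry"], ["ransomware", "malware"],
             ["ransomware", "wannacry"], ["wannacry", "smb"]]
suggestions = suggest_terms(last_week, this_week)
for tag, delta in suggestions:
    print(f"{tag}: {delta:+.3f}")
```

The returned list is only a set of candidates; as on the slide, a super-user would still decide which terms enter the KG.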
OPENCSAM – KEYPHRASES
• Build a set of corpus-wide common keyphrases
○ Bigrams and trigrams, scored using Normalized Pointwise Mutual Information (NPMI)
○ Keyphrases of 1 to 4 elements, ranked with PositionRank
• Cluster the keyphrases using word embeddings
• Build a co-occurrence graph, use traversal distances as a feature in the text classification task
• Propose for the Knowledge Graph, based on quality
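As an illustration of the NPMI scoring mentioned on this slide, here is a small sketch for bigrams only. NPMI is PMI divided by -log p(x, y), which bounds the score at 1 for words that only ever occur together. The function name `npmi_bigrams`, the `min_count` threshold, and the sample sentences are assumptions; the project's actual scoring code may differ.

```python
import math
from collections import Counter


def npmi_bigrams(sentences, min_count=2):
    """Score adjacent word pairs with NPMI.

    Probabilities are estimated over bigram positions, so p(x) is the share
    of bigrams starting with x and p(y) the share ending with y.
    """
    bigrams = Counter()
    for tokens in sentences:
        bigrams.update(zip(tokens, tokens[1:]))
    total = sum(bigrams.values())

    first_counts = Counter()
    second_counts = Counter()
    for (first, second), count in bigrams.items():
        first_counts[first] += count
        second_counts[second] += count

    scores = {}
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable estimates
        p_xy = count / total
        if p_xy == 1.0:
            scores[(first, second)] = 1.0  # only one bigram type in the corpus
            continue
        p_x = first_counts[first] / total
        p_y = second_counts[second] / total
        pmi = math.log(p_xy / (p_x * p_y))
        scores[(first, second)] = pmi / -math.log(p_xy)
    return scores


# Hypothetical tokenised sentences.
sentences = [
    ["denial", "of", "service", "attack", "reported"],
    ["new", "denial", "of", "service", "campaign"],
    ["zero", "day", "attack", "reported"],
    ["zero", "day", "exploit", "of", "service", "provider"],
]
scores = npmi_bigrams(sentences)
for pair, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(" ".join(pair), round(score, 3))
```

Trigrams could be scored the same way by treating an already accepted bigram as a single token and running a second pass.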
OPENCSAM – CLUSTER NEWS
High-quality sentence encodings from Universal Sentence Encoder (USE)
● Seed topics with “topic words”
● Encode topic words and news titles to vectors
● Set seed-word vectors as cluster centers
● Group titles around cluster centers
Future:
● Replace USE with BERT, fine-tuned for our corpus
● Explore VLAWE (Vector of Locally-Aggregated Word Embeddings)
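The seeded grouping described on this slide can be sketched as a nearest-center assignment. To keep the example self-contained, `encode` below is a toy bag-of-words stand-in, not Universal Sentence Encoder; the function names, the `min_similarity` threshold, the "unassigned" bucket, and the sample topics and titles are all assumptions.

```python
import math
from collections import Counter


def encode(text):
    """Toy stand-in for a sentence encoder: a sparse bag-of-words vector.
    A real setup would call a model such as Universal Sentence Encoder here."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def cluster_titles(seed_topics, titles, min_similarity=0.1):
    """Assign each title to the topic whose seed vector is most similar.
    Titles scoring below min_similarity (an assumed threshold) stay unassigned."""
    centers = {topic: encode(" ".join(words)) for topic, words in seed_topics.items()}
    clusters = {topic: [] for topic in seed_topics}
    clusters["unassigned"] = []
    for title in titles:
        vector = encode(title)
        best_topic, best_score = max(
            ((topic, cosine(vector, center)) for topic, center in centers.items()),
            key=lambda item: item[1],
        )
        target = best_topic if best_score >= min_similarity else "unassigned"
        clusters[target].append(title)
    return clusters


# Hypothetical seed topics and news titles.
seed_topics = {
    "ransomware": ["ransomware", "encryption", "ransom"],
    "phishing": ["phishing", "email", "credentials"],
}
titles = [
    "Hospital hit by ransomware demanding ransom",
    "Phishing email steals bank credentials",
    "Quarterly earnings beat expectations",
]
clusters = cluster_titles(seed_topics, titles)
for topic, grouped in clusters.items():
    print(topic, grouped)
```

Because the cluster centers are fixed by the seed words, no iterative re-estimation is needed; swapping `encode` for a real sentence encoder would let titles match topics without sharing exact words.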
OPENCSAM – SUMMARISATION
● Extractive summarization (best ranked phrases)
● Users can select phrases to be kept/removed
● Included text comes from multiple articles
● Algorithm uses Universal Sentence Encoder sentence embeddings
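A rough sketch of extractive summarisation over several articles, with user keep/remove choices, follows. The slide does not state the ranking rule, so ranking by cosine similarity to the centroid of all sentence vectors is an assumption, as are the function names and sample articles. `encode` is again a toy bag-of-words stand-in for Universal Sentence Encoder.

```python
import math
from collections import Counter


def encode(sentence):
    """Toy stand-in for Universal Sentence Encoder: bag-of-words counts."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def summarise(articles, max_sentences=2, keep=(), remove=()):
    """Pick the best-ranked sentences across all articles.

    Sentences are ranked by similarity to the centroid of all sentence
    vectors (an assumed rule). Sentences in `keep` are always included
    first; sentences in `remove` are never included.
    """
    sentences = [s.strip() for text in articles for s in text.split(".") if s.strip()]
    vectors = {s: encode(s) for s in sentences}
    centroid = Counter()
    for vector in vectors.values():
        centroid.update(vector)  # summed vector; scale does not affect cosine
    ranked = sorted(vectors, key=lambda s: cosine(vectors[s], centroid), reverse=True)
    chosen = [s for s in keep if s in vectors]
    for sentence in ranked:
        if len(chosen) >= max_sentences:
            break
        if sentence not in chosen and sentence not in remove:
            chosen.append(sentence)
    return chosen


# Hypothetical articles on the same event.
articles = [
    "Ransomware attack hits city hospital. Staff switched to paper records.",
    "The ransomware attack encrypted hospital servers. Officials refused to pay.",
]
summary = summarise(articles)
without_top = summarise(articles, remove=("The ransomware attack encrypted hospital servers",))
with_kept = summarise(articles, keep=("Officials refused to pay",))
print(summary)
```

The `keep` and `remove` arguments model the user selection step on the slide: the ranking proposes sentences, and the user's choices override it.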
JOIN THE OPENCSAM COMMUNITY
https://github.com/enisaeu/OpenCSAM
THANK YOU FOR YOUR ATTENTION
Vasilissis Sofias Str 1, Maroussi 151 24, Attiki, Greece
+30 28 14 40 9711
www.enisa.europa.eu