graph-based analysis of javascript repositories · 2018. 2. 13. · graph-based analysis of...

21
© Tresorit - Confidential Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage company

Upload: others

Post on 13-Oct-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

© Tresorit - Confidential

Graph-based analysis of JavaScript repositoriesAdam LippaiWeb team lead of Tresorit, the encrypted cloud storage company

Page 2: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

ingraphGraph database engine for incremental evaluation of openCypher queries

https://github.com/szarnyasg

Similar to Neo4j + incremental evaluation

2

Page 3: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

openCypher – pattern matching

3

A B

C

D

user friend foaf

A B C

A B D

Page 4: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Incremental evaluation

1. A v B v C

2. A v B v C v D

(In reality it’s more complex, the actual algorithm is called RETE and it’s based on radix trees)

4

Page 5: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

5

Page 6: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

6

Page 7: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

7

Page 8: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Why is static analysis important?

• QA is expensive• Money: Get the bugs fixed in the earliest stage, cut the

administration and release overhead

• Developer experience: less round-trips -> better focus on one task

• Learning by example

• Insights for project and code health

• It scales across companies• Patterns that lead to bugs can be shared

• Find bugs in your code already found by Microsoft, Facebook, Google

8

Page 9: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

What does the new database enable?• Granularity & scope

• Developer empowerment

• Maintainability

9

Page 10: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Granularity – now

• Linters work within files

• TypeScript Compiler and other IDE tools create “interfaces of the imported files” for specific use-cases (e.g. type inference)

10

Page 11: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Granularity – future

• Complete project• Every JS file, together

• Multiple projects• Every project – think npm install or npm update

• Over time, over Git branches• A project doesn’t have 10000 states, but 1 initial state

and 9999 changes

11

Page 12: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Developer empowerment – now

• CTRL + F

• Find class/method/function in IDE

• Class maps for OOP

• Scaffolded Babylon, Acorn or TS compiler script

• Generated + searchable docs based on JSDoc

12

Page 13: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Developer empowerment – future

• Where is this code used?

• What parts of the code can modify this variable?

• What side effects can this call or assignment have?

• Did I change my libs API? Is it a breaking change?

• How to structure my code?

• Where to cut modules and bundles?

-> ingraph enables such queries

13

Page 14: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Maintainability – now

• We have unified data structures (similar AST formats)

• De facto standard language: XPath

• Unique visitor patterns

• Hard testability of plugin system• Plugins mutate state

• Problem of “multi-pass” analysis

14

Page 15: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Maintainability – future

• Adding abstraction without losing information

• Common declarative query language – openCypher

15

MATCH (bi:BindingIdentifier)

<-[:binding]-()-->

(be:BinaryExpression)

-[:right]->(right:Expression)

WHERE be.operator = 'Div'

AND right.value = 0.0

RETURN bi

Page 16: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

var foo = 1 / 0;

16

VariableDeclarator

BindingIdentifiername = `foo`

BinaryExpressionoperator = `Div`

Expressionvalue = 1.0

Expressionvalue = 0.0

bi be

right

MATCH (bi:BindingIdentifier)

<-[:binding]-()-->

(be:BinaryExpression)

-[:right]->(right:Expression)

WHERE be.operator = 'Div'

AND right.value = 0.0

RETURN bi

Page 17: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Graphs are powerful

• Existing optimal algorithms and good heuristics instead of „not that bad code”

• Incremental query caching is possible – eg. RETE or TREAT

17

Page 18: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Use cases for incremental pattern matching• Type propagation and checking (type inference)

• Dead code elimination

• (Asynchronous) code flow checks – can the program reach a specified state, can a value be undefined etc.

• Fuzzing like behavior, e.g. integration test generation

• Code vectorization -> AI

18

Page 19: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

The right tool is

• Declarative• what instead of how

• Stateful and incremental• cache the existing knowledge

• Instant• inside your IDE

19

Page 20: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Codemodel-rifle

20

1. Parsing JS using Shift Java

2. Transforming

3. Loading the model into Neo4j or ingraph

4. Executing queries on top of it

• https://github.com/steindani

• https://github.com/luczsoma

Page 21: Graph-based analysis of JavaScript repositories · 2018. 2. 13. · Graph-based analysis of JavaScript repositories Adam Lippai Web team lead of Tresorit, the encrypted cloud storage

Thank you!

21