MODI: Mobile Deep Inference Made Efficient by Edge Computing
Samuel S. Ogden, Tian Guo


Page 1: MODI: Mobile Deep Inference Made Efficient by Edge Computing

MODI: Mobile Deep Inference Made Efficient by Edge Computing

Samuel S. Ogden, Tian Guo

Page 2:

Sneak Peek

• The aim is a dynamic solution to a dynamic problem that has previously been solved statically

[email protected]

Page 3:

Background – Mobile Inference

• Using deep learning models in mobile applications
─ Increasingly common to use deep learning models within mobile applications
§ Image recognition
§ Speech recognition


• Two major metrics
─ Model Accuracy
─ End-to-end latency

• On-device vs. Remote inference
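The two inference modes compose end-to-end latency from different pieces, which is why neither wins universally. A minimal sketch of that decomposition (all timing values are hypothetical, chosen only to illustrate when each mode can come out ahead):

```python
# Hypothetical latency breakdown for the two inference modes.
# The numbers below are illustrative, not measurements from the talk.

def on_device_latency(load_s, inference_s):
    """On-device: model load (if not already cached) plus local inference."""
    return load_s + inference_s

def remote_latency(upload_s, inference_s, download_s):
    """Remote: input upload, server-side inference, result download."""
    return upload_s + inference_s + download_s

# A phone running a small quantized model vs. a remote server that
# infers quickly but sits behind a non-trivial uplink:
local = on_device_latency(load_s=0.4, inference_s=0.3)
remote = remote_latency(upload_s=0.5, inference_s=0.1, download_s=0.05)
print(f"on-device: {local:.2f}s, remote: {remote:.2f}s")
```

With these made-up numbers the two modes are nearly tied, so a small change in network speed or model cache state flips the winner.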

Page 4:

Mobile Deep Inference Limitations

• Highly resource-constrained devices
─ Battery constraints
─ Limited hardware in the general case

• Highly dynamic environment
─ Variable network conditions

• Common approaches are statically applied
─ Choosing a one-size-fits-all model for on-device inference
─ Using the same remote API for all inference requests


Page 5:

Our Vision: MODI

• How do we balance accuracy and latency based on dynamic constraints?

• Provide a wide array of models
─ Model-usage families and derived models

• Dynamically choose inference location and model
─ Make decision based on inference environment
§ e.g., network, power, model availability

• Make the choice transparent
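One way to read this slide: selection is a function of the observed environment. A hedged sketch of such a selector (the candidate models, their statistics, the latency budget, and the battery rule are all invented for illustration; they are not MODI's actual policy):

```python
# Illustrative (model, location) selector. Candidate stats and
# thresholds are hypothetical, not taken from the MODI paper.

CANDIDATES = [
    # (name, location, est_latency_s, accuracy)
    ("inception_quantized", "device", 0.35, 0.72),
    ("inception_full",      "device", 0.90, 0.78),
    ("inception_full",      "edge",   0.20, 0.78),  # excludes network time
]

def select(network_up_mbps, battery_pct, input_kb=300):
    """Pick the (model, location) with the best accuracy that still
    meets a latency budget, given the current environment."""
    budget_s = 1.0
    best = None
    for name, loc, lat, acc in CANDIDATES:
        total = lat
        if loc == "edge":
            if network_up_mbps <= 0:
                continue  # no connectivity: remote is unavailable
            total += (input_kb * 8 / 1000) / network_up_mbps  # upload time
        if loc == "device" and battery_pct < 10 and network_up_mbps > 0:
            continue  # conserve a nearly drained battery if we can offload
        if total <= budget_s and (best is None or acc > best[3] or
                                  (acc == best[3] and total < best[4])):
            best = (name, loc, lat, acc, total)
    return best[:2] if best else None

print(select(network_up_mbps=20, battery_pct=80))   # fast uplink
print(select(network_up_mbps=0.1, battery_pct=80))  # slow uplink
```

The point of the sketch is the shape of the decision, not the thresholds: with a fast uplink the higher-accuracy edge option wins, while a slow uplink pushes the same request back onto the device, transparently to the application.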


Page 6:

MODI: System design


[System diagram: a Mobile Device (MODI on-device inference) and Edge Servers (MODI remote inference) exchange requests and responses; both report metadata to a Central Manager (MODI models & stats), which distributes models back to them.]


Page 9:

Design Principles

• Maximize usage of on-device resources

• Storage and analysis of metadata

• Dynamic model selection


Page 10:

Design Questions

• Which compression techniques are useful?

• Which model versions to store where?

• When to offload to edge servers?



Page 12:

Results – Model Compression

Storage requirements reduced by 75% for quantized models


InceptionV3 image classification model [1] running on a Google Pixel 2 device

[1] https://arxiv.org/abs/1512.00567
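The 75% figure is what you would expect from 32-bit to 8-bit post-training quantization. A toy reproduction of that arithmetic on random weights (the affine quantization scheme and the use of NumPy here are my assumptions for illustration, not the exact TensorFlow path used in the talk):

```python
import gzip
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=100_000).astype(np.float32)  # stand-in model weights

# Affine post-training quantization: map the float32 range onto uint8.
lo, hi = float(weights.min()), float(weights.max())
scale = (hi - lo) / 255.0
q = np.round((weights - lo) / scale).astype(np.uint8)

raw, quant = weights.nbytes, q.nbytes
print(f"float32: {raw} B, uint8: {quant} B "
      f"({100 * (1 - quant / raw):.0f}% smaller)")  # 75% smaller

# gzip shaves off a bit more on top of quantization.
gz = len(gzip.compress(q.tobytes()))
print(f"uint8 + gzip: {gz} B")

# Dequantize to gauge the precision loss that drives the accuracy drop.
deq = q.astype(np.float32) * scale + lo
print(f"max abs weight error: {np.abs(weights - deq).max():.4f}")
```

Four bytes per weight become one, giving the 75% storage reduction mechanically; the rounding error introduced in the last step is the source of the accuracy cost reported on the next slide.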

Page 13:

Results – Model Compression

Load time reduced by up to 66%
─ Leads to ~6% reduction in accuracy


InceptionV3 image classification model [1] running on a Google Pixel 2 device

[1] https://arxiv.org/abs/1512.00567

Page 14:

Design Questions

• Which compression techniques are useful?
─ Quantization and gzip significantly reduce model size

• Which model versions to store where?

• When to offload to edge servers?


Page 15:

Results – Model Comparison across devices

Pixel 2 over 2.5x faster than older devices
─ Specialized deep-learning hardware


InceptionV3 image classification model optimized for inference

Page 16:

Design Questions

• Which compression techniques are useful?
─ Quantization and gzip significantly reduce model size

• Which model versions to store where?
─ Mobile devices can reduce runtime up to 2.4x

• When to offload to edge servers?


Page 17:

Results – Inference Offloading Feasibility

Network transfer is up to 66.7% of end-to-end time


Used AWS t2.medium instance running InceptionV3
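The 66.7% number is simply the transfer share of end-to-end remote-inference time. A small sketch of that decomposition (the timing values are hypothetical; only the ratio matters):

```python
def network_fraction(upload_s, inference_s, download_s):
    """Fraction of end-to-end remote-inference time spent on the network."""
    total = upload_s + inference_s + download_s
    return (upload_s + download_s) / total

# If transfer takes twice as long as server-side inference, the
# network accounts for two thirds of end-to-end time.
frac = network_fraction(upload_s=0.5, inference_s=0.3, download_s=0.1)
print(f"network share: {frac:.1%}")  # 66.7%
```

When the network dominates like this, a faster edge server barely helps; only moving the computation closer (or shrinking the input) does.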

Page 18:

Design Questions

• Which compression techniques are useful?
─ Quantization and gzip significantly reduce model size

• Which model versions to store where?
─ Mobile devices can reduce runtime up to 2.4x

• When to offload to edge servers?
─ Slower networks would hinder remote inference


Page 19:

Conclusions & Questions

• Key points:
─ MODI allows for dynamic mobile inference model selection through post-training model management
─ Enables greater flexibility for mobile deep inference

• Controversial:
─ Whether a low-tier AWS instance is representative of edge servers

• Looking forward:
─ Integrating MODI with existing deep learning frameworks
─ Exploring explicit trade-off points between on-device and remote inference
─ Exploring how far into the edge is ideal for remote inference
─ What other devices could this be used for?

