open standard apis for augmented reality€¦ · cv libraries . application ... easy integration...
TRANSCRIPT
© Copyright Khronos Group 2014 - Page 1
Open Standard APIs for Augmented Reality
Neil Trevett Vice President Mobile Ecosystem, NVIDIA
President, Khronos Group
© Copyright Khronos Group 2014 - Page 2
Khronos Connects Software to Silicon
Open Consortium creating ROYALTY-FREE, OPEN STANDARD APIs for hardware acceleration
Defining the roadmap for low-level silicon interfaces needed on every platform
Graphics, compute, rich media, vision, sensor and camera
processing
Rigorous specifications AND conformance tests for cross-
vendor portability
Acceleration APIs BY the Industry
FOR the Industry
Well over a BILLION people use Khronos APIs Every Day…
© Copyright Khronos Group 2014 - Page 3
Khronos Standards
Visual Computing - 3D Graphics - Heterogeneous Parallel Computing
3D Asset Handling - 3D authoring asset interchange
- 3D asset transmission format with compression
Acceleration in HTML5 - 3D in browser – no Plug-in
- Heterogeneous computing for JavaScript
Camera Control API
Over 100 companies defining royalty-free APIs to connect software to silicon
Sensor Processing - Vision Acceleration - Camera Control - Sensor Fusion
© Copyright Khronos Group 2014 - Page 4
Vision Pipeline Challenges and Opportunities
• Light / Proximity
• 2 cameras
• 3 microphones
• Touch
• Position - GPS - WiFi (fingerprint) - Cellular trilateration - NFC/Bluetooth Beacons
• Accelerometer
• Magnetometer
• Gyroscope
• Pressure / Temp / Humidity
19
Sensor Proliferation Diverse sensor awareness of the user and surroundings
• Camera sensors >20MPix • Novel sensor configurations • Stereo pairs • Active Structured Light • Active TOF • Plenoptic Arrays
Growing Camera Diversity Capturing color, range
and lightfields
Diverse Vision Processors Driving for high performance
and low power
• Camera ISPs • Dedicated vision IP blocks • DSPs and DSP arrays • Programmable GPUs • Multi-core CPUs
Flexible sensor and camera control to generate
required image stream
Use best processing available for image stream processing –
with code portability
Control/fuse vision data by/with all other sensor data
on device Camera Control API
© Copyright Khronos Group 2014 - Page 5
Vision Processing Power Efficiency • GPUs are more power efficient than CPUs at vision acceleration
- When exploiting data parallelism can be x10 as efficient – and getting better!
• SOCs have space for transistors – but can’t turn on at same time! - Would exceed Thermal Design Point of mobile devices
• Dedicated units can further increase locality and parallelism for efficiency - Dark Silicon - specialized hardware – only turned on when needed
• Ultra-low power vision scanners will become essential for wearables - Sensor and visual awareness to trigger
full vision acceleration subsystems
Pow
er E
ffic
ienc
y
Computation Flexibility
Enabling new mobile vision-based experiences requires pushing computation onto
GPUs and dedicated hardware
Dedicated Hardware
GPU Compute
Multi-core CPU
X1
X10
X100
© Copyright Khronos Group 2014 - Page 6
OpenVX – Power Efficient Vision Acceleration • Out-of-the-Box vision acceleration framework
- Low-power, real-time, mobile and embedded
• Performance portability - ISPs, dedicated vision hardware,
DSPs and DSP arrays, GPUs, Multi-core CPUs
• Suited for low-power, always-on acceleration - Can run solely on dedicated vision hardware
• Foundational API for vision acceleration - Can be used by middleware or applications
• Complementary to OpenCV - Which is great for prototyping
• Khronos open source sample implementation - To be released with final specification Open source sample
implementation Hardware vendor implementations
OpenCV open source library
Other higher-level CV libraries
Application
© Copyright Khronos Group 2014 - Page 7
OpenVX Graphs – The Key to Efficiency • Vision processing directed graphs for power and performance efficiency
- Each Node can be implemented in software or accelerated hardware - Nodes may be fused by the implementation to eliminate memory transfers - Processing can be tiled to keep data entirely in local memory/cache
• VXU Utility Library for access to single nodes - Easy way to start using OpenVX by calling each node independently
• EGLStreams can provide data and event interop with other Khronos APIs - BUT use of other Khronos APIs are not mandated
OpenVX Node
OpenVX Node
OpenVX Node
OpenVX Node
Downstream Application Processing
Native Camera Control
Example OpenVX Graph
© Copyright Khronos Group 2014 - Page 8
OpenVX and OpenCV are Complementary
Governance Community driven open source with no formal specification
Formal specification defined and implemented by hardware vendors
Conformance No conformance tests for consistency and every vendor implements different subset
Full conformance test suite / process creates a reliable acceleration platform
Portability APIs can vary depending on processor Hardware abstracted for portability
Scope Very wide
1000s of imaging and vision functions Multiple camera APIs/interfaces
Tight focus on hardware accelerated functions for mobile vision Use external camera API
Efficiency Memory-based architecture Each operation reads and writes memory
Graph-based execution Optimizable computation, data transfer
Use Case Rapid experimentation Production development & deployment
© Copyright Khronos Group 2014 - Page 9
OpenCL as Parallel Language Backend
JavaScript binding for initiation of OpenCL C kernels
River Trail Language
extensions to JavaScript
MulticoreWare open source project on Bitbucket
OpenCL provides vendor optimized, cross-platform, cross-vendor access to
heterogeneous compute resources
Harlan High level language for GPU
programming
Compiler directives for
Fortran, C and C++
Java language extensions
for parallelism
PyOpenCL Python
wrapper around OpenCL
Language for image
processing and computational photography
Embedded array
language for Haskell
© Copyright Khronos Group 2014 - Page 10
OpenVX and OpenCL are Complementary
Use Case General Heterogeneous programming
Domain targeted Vision processing
Ease of Use General-purpose math libraries with no built-in vision functions
Fully implemented vision operators and framework ‘out of the box’
Architecture Language-based – needs online compilation
Library-based - no online compiler required
Target Hardware
‘Exposed’ architected memory model – can impact performance portability
Abstracted node and memory model - diverse implementations can be optimized
for power and performance
Precision Full IEEE floating point mandated Minimal floating point requirements –optimized for vision operators
It is possible to use OpenCL to build OpenVX Nodes on programmable devices BUT - do we need definition of an efficient, vision-capable OpenCL Device?
Precisely defined precision for image and vision operations?
© Copyright Khronos Group 2014 - Page 11
Khronos APIs for Vision Processing
GPU Compute Shaders (OpenGL 4.X and OpenGL ES 3.1) Pervasively available on almost any mobile device or OS Easy integration into graphics apps – no compute API interop needed Program in GLSL not C Limited to acceleration on a single GPU
General Purpose Heterogeneous Programming Framework Flexible, low-level access to any devices with OpenCL compiler Single programming and run-time framework for CPUs, GPUs, DSPs, hardware Open standard for any device or OS – being used as backed by many languages and frameworks Needs full compiler stack and IEEE precision
Out of the Box Vision Framework - Operators and graph framework library Can run on dedicated hardware – no compiler needed Easier performance portability to diverse hardware Suited for low-power, always-on acceleration Fixed set of operators – but can be extended
© Copyright Khronos Group 2014 - Page 12
Need for Camera Control API • Advanced control of ISP and camera subsystem – with cross-platform portability
- Generate sophisticated image stream for advanced imaging & vision apps
• No platform API currently fulfills all developer requirements - Portable access to growing sensor diversity: e.g. depth sensors and sensor arrays - Cross sensor synch: e.g. synch of camera and MEMS sensors - Advanced, high-frequency per-frame burst control of camera/sensor: e.g. ROI - Multiple input, output re-circulating streams with RAW, Bayer or YUV Processing
Image Signal Processor (ISP)
Image/Vision Applications
Khronos Camera Control API Defines control of Sensor, Color Filter Array
Lens, Flash, Focus, Aperture
Auto Exposure (AE) Auto White Balance (AWB)
Auto Focus (AF)
EGLStreams
© Copyright Khronos Group 2014 - Page 13
Low-level Sensor Abstraction API
Apps Need Sophisticated Access to Sensor Data Without coding to specific
sensor hardware
Apps request semantic sensor information StreamInput defines possible requests, e.g.
Read Physical or Virtual Sensors e.g. “Game Quaternion” Context detection e.g. “Am I in an elevator?”
StreamInput processing graph provides optimized sensor data stream
High-value, smart sensor fusion middleware can connect to apps in a portable way
Apps can gain ‘magical’ situational awareness
Advanced Sensors Everywhere Multi-axis motion/position, quaternions,
context-awareness, gestures, activity monitoring, health and environmental sensors Sensor Discoverability
Sensor Code Portability
© Copyright Khronos Group 2014 - Page 14
Khronos APIs for Augmented Reality
Advanced Camera Control and stream
generation
3D Rendering and Video Composition
On GPU
Audio Rendering
Application on CPUs, GPUs
and DSPs
Sensor Fusion
Vision Processing
MEMS Sensors
Camera Control API
EGLStream - stream data
between APIs
Precision timestamps on all sensor samples
AR needs not just advanced sensor processing, vision acceleration, computation and rendering - but also for
all these subsystems to work efficiently together
© Copyright Khronos Group 2014 - Page 15
Web Acceleration APIs • Khronos and W3C liaison for Web APIs
- Leverage proven native APIs - Fast API development/deployment - Designed by hardware community - Familiar foundation reduces
developer learning curve
Native APIs shipping or Khronos working group
JavaScript API shipping, acceleration being developed or work underway
WebVX? Vision
Processing
WebKAM? Camera
control and video
processing
Possible future JavaScript APIs or acceleration
WebStream? Sensor Fusion
Native
JavaScript Canvas
Path Rendering
Camera Control
HTML
© Copyright Khronos Group 2014 - Page 16
COLLADA and glTF Open Source Ecosystem
Tool Interop
Three.js glTF Importer. Rest3D initiative
COLLADA2GLTF Translator
OpenCOLLADA Importer/Exporter
and COLLADA Conformance Tests
On GitHUB
Pervasive WebGL deployment
Other authoring formats
Web-based Tools
https://github.com/KhronosGroup/glTF
https://github.com/KhronosGroup/OpenCOLLADA
https://github.com/KhronosGroup/COLLADA-CTS
© Copyright Khronos Group 2014 - Page 17
Initial Compression Results • Compression Efficiency
- Gzip (default level=6) - OpenCTM (default settings) - Open3DGC and Webgl-loader
- Positions on 14 bits - Normals and texCoords on 10 bits
Open3DGC is 5x-9x more efficient than Gzip 1.3x-2.4x more efficient than OpenCTM and 1.2x-1.5x more efficient than webgl-loader
0
100
200
300
400
CAD(3748 models)
3D Scanned(78 models)
MPEG dataset(1211 models)
Size
(MBy
tes)
Gzip
OpenCTM
Webgl-loader + Gzip
Open3DGC-ASCII + Gzip
Open3DGC-Binary
© Copyright Khronos Group 2014 - Page 18
Summary • Khronos is building a family of interoperating APIs and formats relevant to
portable and power-efficient Augmented Reality • Any company is welcome to join Khronos to influence the direction of mobile
and web-based AR! - $15K annual membership fee for access to all Khronos API working groups - Well-defined IP framework protects your IP and conformant implementations
• www.khronos.org - [email protected]