neon - arm


Upload: kuba

Post on 20-Jan-2016


TRANSCRIPT

Page 1: NEON - ARM

23.2.2014 NEON - ARM

http://www.arm.com/products/processors/technologies/neon.php 1/3

NEON

The ARM® NEON™ general-purpose SIMD engine efficiently processes current and future multimedia formats, enhancing the user experience.

NEON technology can accelerate multimedia and signal processing algorithms such as video encode/decode, 2D/3D graphics, gaming, audio and speech processing, image processing, telephony, and sound synthesis, delivering at least 3x the performance of ARMv5 and at least 2x the performance of ARMv6 SIMD.

Cleanly architected NEON technology works seamlessly with its own independent pipeline and register file.

NEON technology is a 128-bit SIMD (Single Instruction, Multiple Data) architecture extension for the ARM Cortex™-A series processors, designed to provide flexible and powerful acceleration for consumer multimedia applications, delivering a significantly enhanced user experience. It has 32 registers, 64 bits wide (dual view as 16 registers, 128 bits wide).
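The dual view of the register file can be pictured with a small plain-C model. This is only an illustrative sketch: the type `neon_regfile_model` and the helpers `write_d` and `read_q` are hypothetical names, and real code never addresses registers this way. It relies on the ARMv7 convention that 128-bit register Qn overlays the 64-bit pair D(2n) and D(2n+1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the NEON register file: 32 x 64-bit D registers,
   which can also be viewed as 16 x 128-bit Q registers.
   Illustration only; hardware registers are not accessed like this. */
typedef struct {
    uint64_t d[32];
} neon_regfile_model;

/* Write 64-bit register Dn (n in 0..31). */
static void write_d(neon_regfile_model *rf, unsigned n, uint64_t value)
{
    assert(rf != NULL && n < 32);
    rf->d[n] = value;
}

/* Read 128-bit register Qn (n in 0..15) as its two 64-bit halves. */
static void read_q(const neon_regfile_model *rf, unsigned n, uint64_t out[2])
{
    assert(rf != NULL && out != NULL && n < 16);
    out[0] = rf->d[2 * n];      /* low half is D(2n)    */
    out[1] = rf->d[2 * n + 1];  /* high half is D(2n+1) */
}
```

In this model, writing D6 and D7 and then reading Q3 returns the same 128 bits, which is why the same storage can be counted as either 32 or 16 registers.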

NEON instructions perform "Packed SIMD" processing:

Registers are considered as vectors of elements of the same data type

Data types can be: signed/unsigned 8-bit, 16-bit, 32-bit, and 64-bit integers, and single-precision floating point

Instructions perform the same operation in all lanes
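A minimal sketch of packed SIMD processing, written with the C intrinsics from `arm_neon.h`: sixteen unsigned 8-bit lanes are added by one operation. The function name `add_u8` and the `HAVE_NEON` macro are illustrative assumptions, and a scalar loop stands in on targets without NEON so the example builds anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Adds two arrays of unsigned 8-bit elements; each sum wraps modulo 256. */
static void add_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    size_t i = 0;
#if HAVE_NEON
    /* One 128-bit register holds 16 lanes of uint8_t; the same add is
       applied to every lane. */
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(dst + i, vaddq_u8(va, vb));
    }
#endif
    /* Scalar tail (and full fallback when NEON is unavailable). */
    for (; i < n; ++i) {
        dst[i] = (uint8_t)(a[i] + b[i]);
    }
}
```

The vector loop and the scalar loop compute the same result; the difference is only how many elements each iteration handles.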

The ARM Cortex™-A series processors with NEON technology, as well as ARM's Mali multimedia hardware solutions, are used in multimedia applications ranging from smartphones and mobile computing devices to HDTV.

NEON Enhancing User Experience

NEON enhances many multimedia user experiences:

Watch any video in any format

Edit and enhance captured videos - video stabilization

Anti-aliased rendering and compositing

Game processing

Process multi-megapixel photos quickly

Voice recognition

Powerful multichannel hi-fi audio processing

NEON Features and Benefits

NEON supports the widest range of multimedia codecs used for internet applications:

Many soft codec standards: MPEG-4, H.264, On2 VP6/7/8, Real, AVS

Ideal solution for normal size "internet streaming" decode of various formats

Not just for codecs - also applicable to 2D and 3D graphics and other vector processing

Off the shelf tools, OS support, and ecosystem support

Fewer cycles needed:

NEON gives a 60-150% performance boost on complex video codecs

Individual simple DSP algorithms can show a larger performance boost (4x-8x)

Processor can sleep sooner, resulting in overall dynamic power saving

NEON technology features a number of elements to increase performance and simplify software development, such as:

Aligned and unaligned data access allows for efficient vectorization of SIMD operations.

Clean instruction set architecture designed for autovectorizing compilers and hand coding.

Efficient access to packed arrays such as ARGB or xyz coordinates



Support for both integer and floating point operations ensures adaptability to a broad range of applications, from codecs to High Performance Computing to 3D graphics.

Tight coupling to the ARM processor provides a single instruction stream and a unified view of memory, presenting a single development platform target with a simpler tool flow.

The large NEON register file with its dual 128-bit/64-bit views enables efficient handling of data and minimizes access to memory, enhancing data throughput.
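The efficient access to packed arrays mentioned in the list above can be sketched with the structure-load intrinsic `vld4_u8`, which reads interleaved data and de-interleaves it into separate registers in one step. The function name `split_argb` is hypothetical, and the sketch assumes pixels are stored in memory as the byte sequence A, R, G, B. No alignment is required of the input pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Splits packed ARGB pixels (assumed byte order A,R,G,B) into four planes. */
static void split_argb(const uint8_t *argb, size_t pixels,
                       uint8_t *a, uint8_t *r, uint8_t *g, uint8_t *b)
{
    assert(argb != NULL && a != NULL && r != NULL && g != NULL && b != NULL);
    size_t i = 0;
#if HAVE_NEON
    for (; i + 8 <= pixels; i += 8) {
        /* Loads 32 bytes (8 pixels) and de-interleaves them:
           val[0] holds the 8 A bytes, val[1] the R bytes, and so on. */
        uint8x8x4_t px = vld4_u8(argb + 4 * i);
        vst1_u8(a + i, px.val[0]);
        vst1_u8(r + i, px.val[1]);
        vst1_u8(g + i, px.val[2]);
        vst1_u8(b + i, px.val[3]);
    }
#endif
    /* Scalar tail (and full fallback when NEON is unavailable). */
    for (; i < pixels; ++i) {
        a[i] = argb[4 * i + 0];
        r[i] = argb[4 * i + 1];
        g[i] = argb[4 * i + 2];
        b[i] = argb[4 * i + 3];
    }
}
```

Once the channels sit in separate registers, each one can be processed with ordinary lane-wise operations; `vld3_u8` plays the same role for three-component data such as xyz coordinates.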

How to use NEON

OpenMAX DL library:

Recommended approach to accelerate AV codecs

Libraries released in source form, free-of-charge from the ARM website

Supports the following formats: MPEG-4 simple profile, H.264 baseline, JPEG, MP3, AAC

Supports the following functions: FIR, IIR, FFT, Dot Product, Color space conversion, de-blocking, de-ringing, rotation, scaling, composition

Vectorizing compilers:

Exploits NEON SIMD automatically with existing source code

Supported by ARM RealView Development Suite (v3.1 Pro and later)

Supported by gcc in versions 2007q3 and later
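As a rough illustration of "existing source code" that a vectorizing compiler can map onto NEON, the loop below is plain C with no NEON-specific constructs. The function name `mix_s16` is hypothetical. Whether it is actually vectorized depends on the toolchain and options; with GCC on 32-bit ARM, options along the lines of `-O3 -mfpu=neon` are the usual starting point, but treat that as an assumption to check against the compiler documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Averages two streams of signed 16-bit samples.
   `restrict` tells the compiler the arrays do not overlap, which removes
   a common obstacle to automatic vectorization. */
void mix_s16(int16_t *restrict dst, const int16_t *restrict a,
             const int16_t *restrict b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* The sum is computed in int, so it cannot overflow 16 bits;
           division by 2 truncates toward zero. */
        dst[i] = (int16_t)((a[i] + b[i]) / 2);
    }
}
```

A simple counted loop with independent iterations and unit-stride accesses, as here, is the shape auto-vectorizers handle most readily.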

C intrinsics:

C function call interface to NEON operations

Supports all data types and operations supported by NEON

Supported in ARM RealView Development Suite (version 3.1 and later) and gcc version 2007q3 and later
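A short sketch of the C intrinsics interface, using a single-precision dot product as the example. The function name `dot_f32` and the `HAVE_NEON` macro are illustrative; the intrinsics themselves (`vld1q_f32`, `vmlaq_f32`, `vpadd_f32`, and so on) come from `arm_neon.h`. A scalar path is included so the code also builds on targets without NEON. Note that the vector path sums in a different order from the scalar path, so results may differ in the last bits for general floating-point input.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Returns the dot product of two float arrays of length n. */
static float dot_f32(const float *a, const float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    float sum = 0.0f;
    size_t i = 0;
#if HAVE_NEON
    float32x4_t acc = vdupq_n_f32(0.0f);   /* four running partial sums */
    for (; i + 4 <= n; i += 4) {
        /* acc += a[i..i+3] * b[i..i+3], lane by lane */
        acc = vmlaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
    }
    /* Reduce the four lanes to one scalar. */
    float32x2_t half = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
    half = vpadd_f32(half, half);
    sum = vget_lane_f32(half, 0);
#endif
    /* Scalar tail (and full fallback when NEON is unavailable). */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Each intrinsic reads like a function call, but the compiler handles register allocation and scheduling, which is the main difference from writing the same sequence in assembler.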

Assembler:

For those who really want to optimize at the lowest level

Supported in ARM's RealView Development Suite (version 3.1 and later) and gcc version 2007q3 and later

NEON Support in the Open Source Community

NEON is currently supported in the following Open Source projects:

Android - NEON optimizations

Skia library, S32A_D565_Opaque is 5x faster using NEON

Ubuntu 09.04 supports NEON:

NEON versions of critical shared libraries

BlueZ - official Linux Bluetooth protocol stack

NEON SBC audio encoder

Pixman (part of the Cairo 2D graphics library)

Compositing/alpha blending

X.Org, Mozilla Firefox, Fennec and Webkit browsers

e.g. fbCompositeSolidMask_nx8x0565neon is 8x faster using NEON

ffmpeg - libavcodec

LGPL media player used in many Linux distributions

Video: MPEG-2, MPEG-4ASP, H.264 (AVC), VC1

Audio: Ogg Vorbis

x264 - Google Summer of Code 2009

GPL H.264 encoder, e.g. for video conferencing
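Compositing and alpha blending, listed for Pixman above, is typical of the per-pixel arithmetic these projects move onto NEON. The sketch below is not taken from Pixman or any other project named here; `blend_u8` is a hypothetical function that blends one 8-bit channel with a constant alpha, using widening multiplies so the intermediate products fit in 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* dst = round((src * alpha + dst * (255 - alpha)) / 255) per element. */
static void blend_u8(uint8_t *dst, const uint8_t *src, uint8_t alpha, size_t n)
{
    assert(dst != NULL && src != NULL);
    const uint8_t inv = (uint8_t)(255 - alpha);
    size_t i = 0;
#if HAVE_NEON
    const uint8x8_t va = vdup_n_u8(alpha);
    const uint8x8_t vinv = vdup_n_u8(inv);
    for (; i + 8 <= n; i += 8) {
        /* Widening multiply-accumulate: 8-bit inputs, 16-bit sums. */
        uint16x8_t x = vmull_u8(vld1_u8(src + i), va);
        x = vmlal_u8(x, vld1_u8(dst + i), vinv);
        /* Rounded division by 255: t = x + 128; (t + (t >> 8)) >> 8. */
        uint16x8_t t = vaddq_u16(x, vdupq_n_u16(128));
        t = vaddq_u16(t, vshrq_n_u16(t, 8));
        vst1_u8(dst + i, vshrn_n_u16(t, 8));
    }
#endif
    /* Scalar tail (and full fallback), using the same arithmetic. */
    for (; i < n; ++i) {
        uint32_t t = (uint32_t)src[i] * alpha + (uint32_t)dst[i] * inv + 128u;
        dst[i] = (uint8_t)((t + (t >> 8)) >> 8);
    }
}
```

Both paths use the same integer formula, so they produce identical output; an alpha of 0 leaves `dst` unchanged and an alpha of 255 copies `src`.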

NEON technology is supported by the industry’s largest network of Partners – the ARM Connected Community. Leading silicon, systems, design support and software providers come together to provide a complete and optimized solution for products based on NEON technology.

Company Application

H.264, VC1, MPEG-4

VP6/7, MPEG-4, VC1, H.264, video stabilization

MPEG-4, MPEG-2, H.263, H.264, WMV9, VC1