device creation_marco cornero_st-ericsson

Upload: see2009

Post on 30-May-2018


  • 8/14/2019 Device Creation_Marco Cornero_ST-Ericsson

    1/22

    Enabling Symmetric Multi-processing for Symbian Mobile Devices

    Marco Cornero
    Advanced Computing Architectures, 3G Multimedia
    ST-Ericsson


    2/22

    Agenda

    Motivations
      Convergence and parallelism

    SMP for the Mobile
      Vision and implementation at ST-Ericsson

    Symbian SMP

    SMP Performance Scalability


    3/22

    Convergence

    GPRS: 56-114 kbit/s
    WCDMA: 384 kbit/s
    HSPA: 3.6-14 Mbit/s
    HSPA+: 21-42 Mbit/s
    LTE: >100 Mbit/s

    1000 times increased peak rate in 10 years


    4/22

    Augmented Reality: the Beginning...


    5/22

    Virtual Reality

    IEEE Spectrum - Augmented Reality in a Contact Lens, by Babak A. Parviz, September 2009
    http://www.spectrum.ieee.org/biomedical/bionics/augmented-reality-in-a-contact-lens/

    6/22

    Back to Reality


    7/22

    Technology Lessons

    Source: Intel 2001


    8/22

    Technology Lessons

    Parallelism is the only way to scale performance while remaining within the technology sweet spots

    Dynamic power: $P \propto f\,C\,V^2$

    [Figure: Fmax vs Area. Axes: Area, Frequency. Curves: Fmax(Area) @ Vmin and Fmax(Area) @ Vmin+0.1V. Annotations: technology-dependent sweet spot, diminishing returns]
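
The power relation on this slide (dynamic power proportional to frequency, switched capacitance and the square of the supply voltage) can be turned into a small worked comparison. The sketch below is only an illustration: the function names are hypothetical, all quantities are normalized to a single baseline core, and the throughput model assumes perfectly parallel work, which real workloads rarely achieve.

```cpp
#include <cassert>

// Relative dynamic power of `cores` identical cores, from P ∝ f * C * V^2.
// Voltage and frequency are normalized so that one baseline core has
// voltage = 1.0, frequency = 1.0 and therefore power = 1.0.
double relativeDynamicPower(int cores, double voltage, double frequency) {
    assert(cores > 0 && voltage > 0.0 && frequency > 0.0);
    return cores * voltage * voltage * frequency;
}

// Relative peak throughput under the idealized assumption that the
// workload splits perfectly across cores.
double relativeThroughput(int cores, double frequency) {
    assert(cores > 0 && frequency > 0.0);
    return cores * frequency;
}
```

As a usage example with made-up numbers: if doubling the clock of one core required 1.3 times the baseline voltage, `relativeDynamicPower(1, 1.3, 2.0)` would be about 3.38, whereas two cores at baseline voltage and frequency give `relativeDynamicPower(2, 1.0, 1.0)` equal to 2.0 for the same idealized throughput of 2.0.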


    9/22

    Mobile Chips Architecture Evolution

    Yesterday: CPU, DSP, HW Acc.
    Today: Host CPU, multiple DSPs, Graphics, HW Acc.
    Tomorrow: multiple Host CPUs, Graphics+MM?, Prog. Multimedia, Reconfig HW Acc.

    Multiple technology reasons
      Programmable and scalable performance
      Power
      Design for manufacturability: regularity, voltage/frequency islands, redundancy


    10/22

    ST-Ericsson U8500 And SMP

    U8500: integrated smartphone platform
      Full HD 1080p camcorder
      Dual display
      Advanced 3D graphics
      HSPA modem
      Dual-core Cortex-A9

    Dual-core Cortex-A9
      Both with NEON extensions
      32KB Data + 32KB Code L1 cache per core
      L1 cache coherency
      512KB shared L2 cache

    Power management
      Adaptive Voltage Scaling (AVS): compensation of process variations
      Dynamic Voltage and Frequency Scaling (DVFS)
      Wait For Interrupt (WFI): very little dynamic power consumption
      WFI - Retention mode (ST-Ericsson implementation)
        Low leakage with register and RAM (cache) contents preserved
        Fast wakeup: ~50 µs


    11/22

    SMP & Operating Systems

    The magic is done by cache coherency HW

    The Hardware
      Processors, each with its own L1 cache
      Cache coherency
      Shared L2 cache
      Memory

    The Programmer's View
      Parallel applications with shared data
      Standard APIs
      SMP operating system
      Scalable computing engine (N CPUs) with shared memory


    12/22

    Symbian SMP: New Concepts

    Single OS image managing multiple cores

    Automatic scheduling and load balancing of threads across multiple cores

    True concurrency reinforces the need to use specific APIs for thread sync and ordering
      Cannot rely on explicit interrupt disabling or thread priorities
      Need to use appropriate locking mechanisms
      Efficient spin locks for SMP

    Memory barriers
      Single core guarantees memory consistency
      On SMP, explicit memory barriers are needed to ensure memory consistency
      Normally hidden within API implementations

    Dedicated DFC queue per driver to encourage exploitation of SMP

    [Diagram: single OS image with automatic scheduling and load balancing across CPUs]

    Interrupt disabling on one CPU:
      NKern::DisableAllInterrupts();   // Only this core affected!

    Spin lock around the critical region on each CPU:
      SPIN_LOCK(lock);
      // critical region
      SPIN_UNLOCK(lock);

    One CPU writes x, another CPU reads x: sync + memory barrier needed
      x = ...;
      ... = x;
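
The points on this slide about spin locks and about memory barriers being hidden inside API implementations can be illustrated with a small portable sketch. This is standard C++ using `std::atomic`, not the Symbian kernel's `SPIN_LOCK` implementation; the `SpinLock` class and `countWithTwoThreads` helper are hypothetical names introduced only for this example. The acquire and release orderings play the role of the barriers: callers of `lock()` and `unlock()` never write a barrier themselves.

```cpp
#include <atomic>
#include <thread>

// Minimal spin lock built on an atomic flag (illustrative, not Symbian code).
class SpinLock {
public:
    void lock() {
        // exchange() atomically sets the flag and returns the previous value.
        // Acquire ordering ensures that accesses inside the critical region
        // are not reordered before the lock is taken.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // be polite while spinning
        }
    }

    void unlock() {
        // Release ordering makes writes done in the critical region
        // visible to the next thread that acquires the lock.
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Two threads each add `perThread` to a shared counter under the lock.
long countWithTwoThreads(long perThread) {
    SpinLock lock;
    long counter = 0;  // plain variable, protected only by the lock

    auto work = [&] {
        for (long i = 0; i < perThread; ++i) {
            lock.lock();
            ++counter;  // critical region
            lock.unlock();
        }
    };

    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    return counter;
}
```

Without the lock, the two increments could interleave on different cores and lose updates. Disabling interrupts on one core would not prevent that, which is the point the slide makes.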


    13/22

    Symbian SMP: How-not-to

    Non-synchronized IPC calls
      e.g. client not explicitly waiting for server request to complete

    Unsafe assumptions on thread ordering, based on thread priority

    Use of interrupt disabling to guarantee execution exclusivity
      No longer works, as this only locks interrupts on one core

    Access of shared data without proper locking

    Symbian Crazy Scheduler can be used for testing on mono-core architectures

    SMP-Safe improves code quality
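
The first pitfall above, a client that does not explicitly wait for a server request to complete, can be sketched in portable C++. This is not Symbian IPC: `handleRequest` and `clientCall` are hypothetical names, and `std::async` merely stands in for a server running on another thread (and possibly another core). The point is that the client obtains the reply only through an explicit wait, never by assuming the server has already run because of thread priorities.

```cpp
#include <future>
#include <string>

// Hypothetical server-side handler, executed on a separate thread.
std::string handleRequest(const std::string& request) {
    return "done:" + request;
}

// Client side: issue the request asynchronously, then wait explicitly.
std::string clientCall(const std::string& request) {
    std::future<std::string> reply =
        std::async(std::launch::async, handleRequest, request);

    // The client may do unrelated work here, but it must not touch
    // the result until the request has completed.

    return reply.get();  // explicit wait for completion
}
```

On a single core, code that skips the wait can appear to work when the server thread happens to have higher priority. On SMP both threads may run at the same time, so only the explicit wait is safe.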


    14/22

    SMP-Safe

    [Figure: Symbian OS architecture annotated for SMP. Legend: New for SMP, Minor SMP update, SMP-Safe, Compatible Mode. User side: existing applications (inside the compatibility box), SMP-safe applications, system servers, F32 file server, FSY, EUSER, HAL. Kernel side (SW): EKERN, nanokernel, memory model, variant, ASSP, KEXT, LDD, PDD, media driver. HW: CPU, MMU, timers, PIC, peripherals, media.]

    Source: Symmetric Multi-Processing with Symbian OS, Jason Parker, October 2008. Copyright 2008 Symbian Software Ltd.


    15/22

    Symbian SMP Power Management

    SMP provides a wealth of power optimization combinations - that is why it is interesting
      DVFS
      CPU DPS
      DPS + DVFS
      Turbo-modes
      Asymmetric architectures

    [Figure: four CPU0/CPU1 load charts (baseline with WFI, DVFS, CPU DPS, DPS + DVFS); bars show 100%, 60% and 40% load, with the remaining time spent in WFI or with the core OFF]

    Sophisticated SMP power management support in Symbian
      Load balancing
      Enhanced idle handling
      Workload predictor, core control (DVFS, sleep modes)
      User-side power policy management and use-case hinting


    16/22

    SMP Performance Scalability: System Servers

    Opportunistic approach (especially on dual-core): just relax
      Symbian asynchronous client-server architecture is already very SMP-friendly to start with
      100+ servers on a typical handset configuration
      Some servers already multithreaded

    SMP-Enhanced (ongoing right now)
      Analysis of potential inefficiencies on critical use cases, and elimination of bottlenecks

    SMP-Optimized
      Explicit parallelization of remaining critical code

    [Diagram: client application connected to 100+ servers (file server, multimedia, communication, ...): plenty of parallelism]


    17/22

    Multimedia is MP-Friendly by Nature

    Space: work spread across CPU0 and CPU1
    Time: filter1 and filter2 running on CPU0 and CPU1
    Implementation: applications, MM frameworks, OpenMax components, NMF components, NMF SMP EE and SMP OS, on CPU0, CPU1 and DSP

    NMF: ST-Ericsson Nomadik Multi-Processing Framework
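
One reading of the "Space" label on this slide is data parallelism: the same filter is applied to different parts of a frame on different cores. The sketch below illustrates that reading in portable C++ under stated assumptions. The `invert` filter and the function names are hypothetical, and this is not the NMF or OpenMax API. The "Time" case (filter1 and filter2 pipelined across cores) would need a queue between stages and is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-sample filter: inverts an 8-bit sample.
inline std::uint8_t invert(std::uint8_t sample) {
    return static_cast<std::uint8_t>(255 - sample);
}

// Applies the filter in place to samples in the range [begin, end).
void filterRange(std::vector<std::uint8_t>& frame,
                 std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        frame[i] = invert(frame[i]);
    }
}

// Splits the frame in two halves and filters them concurrently.
// The halves are disjoint, so no locking is needed.
void filterFrameOnTwoCores(std::vector<std::uint8_t>& frame) {
    const std::size_t mid = frame.size() / 2;
    std::thread firstHalf(filterRange, std::ref(frame), std::size_t{0}, mid);
    filterRange(frame, mid, frame.size());  // calling thread takes the rest
    firstHalf.join();
}
```

Whether the two halves actually run on different cores is left to the SMP scheduler, in line with the automatic load balancing described earlier in the talk.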


    18/22

    Multimedia SMP-Optimized Open SW, e.g. ffmpeg-mt: Speedups on Our Platform

    [Figure: H.264 decoder]


    19/22

    Web Browsing

    Web engine cores today are not SMP-friendly
      Advanced research programs ongoing, e.g. Berkeley parallel browser project and others (http://parlab.eecs.berkeley.edu/research/80)

    But the web engine is not alone in the system, far from that
      Multimedia, communication, plugins, user interface

    Today we already measure a considerable speedup from the browser out of the box
      Without counting SMP-optimized MM

    20/22

    Web Browsing: the Best Has Yet to Come

    Multi-tab, multi-task
      All browsers are moving there
      Ideal for SMP scalability

    HTML5 Web Workers
      Ability to spawn JavaScript background threads
      They open the doors to Web 3.0
      Ideal for SMP scalability

    [Figure: speedup (axis from 0 to 2.5) with 1, 2 and 4 workers]

    Moonbat suite (modified SunSpider to make use of Web Workers)
    http://www.yafla.com/dforbes/resources/moonbat/moonbat-driver.html

    21/22

    Conclusions

    Convergence
      Imagination is the limit
      Well, power is the other limit

    Today's multi-processing
      Symbian is ready
      Scalability: opportunistic approach enough to saturate today's modest parallelism

    Programming for future architectures
      More programming effort from the SW community will be needed
      Heterogeneous multi-multi-processing
      Not just for HPC labs, but in everyone's hands
      It is going to be compelling and competitive, because the reward is worth it!

    [Diagram: today (Host CPU, multiple DSPs, Graphics, HW Acc.) and tomorrow (multiple Host CPUs, Graphics/Multimedia, Reconfig HW Acc.)]


    22/22

    THANK YOU!