Module 5: Programmable Components in SoC I
Chanho Lee (Soongsil University, School of Information, Communications and Electronic Engineering)
2 Copyright 2003ⓒ
SoC Architecture 5. Programmable processor components in SoC I
Contents
1. Introduction
    1.1 RISC machine
    1.2 About the ARM architecture
    1.3 Architecture versions
    1.4 Performance comparison
2. Processor architecture
    2.1 Processor modes
    2.2 Registers
    2.3 Instruction format
    2.4 About Thumb instructions
    2.5 Memory model
3. Organization
    3.1 3-stage pipeline organization
    3.2 5-stage pipeline organization
    3.3 Multiplier
4. Processor cores
    4.1 Architecture evolutions
    4.2 ARM7 Thumb family
    4.3 StrongARM
    4.4 ARM9 family
    4.5 ARM9E family
    4.6 ARM10 family
    4.7 ARM11 family
    4.8 X-Scale
5. ARM development environment
    5.1 Real-time debug and trace
    5.2 On-chip debug technology
    5.3 ARM development environment
    5.4 RealView development tools
6. IP solutions
    6.1 AMBA
    6.2 PrimeCell peripherals
7. ARM applications
    7.1 Network microcontroller
    7.2 The Psion Series 5MX
    7.3 GSM system
    7.4 OneC VWS22100 GSM chip
1. Introduction
1.1 RISC machine
RISC architecture [1]
• Fixed instruction size (e.g., 32 bits)
• Load-store architecture
    • Operands must be located in registers
    • The operation result is put into a register
• Large register file
• Simple addressing modes
RISC organization
• Hard-wired instruction decoding logic
• Pipelined execution
• Single-cycle execution
Advantages
• Simple hardware
    • Small die size
    • Low power consumption
• Simple decoding
• Higher performance
    • Easy to implement an effective pipelined structure
Disadvantage
• Poor code density
    • RISC has a fixed-size instruction format
    • Small number of instructions
Summary of the 80386 and MIPS R2000 architectures [17]
Feature | MIPS R2000 | Intel 80386
Date announced | 1986 | 1985
Instruction size (bits) | 32 | Variable
Address space (size, model) | 32 bits, flat | 32 bits, segmented with paging support
Data alignment | Aligned | No
Data addressing modes | 2 | 11
Protection | Page | Segmented scheme
Integer registers (number, model, size) | 31 GPR x 32 bits | 8 GPR x 32 bits, 6 segment registers x 16 bits, 2 other x 16 bits
Separate floating-point registers | 16 x 32 or 16 x 64 bits | 8 x 80 bits
Floating-point format | IEEE 754 single, double | IEEE 754 single, double, extended
1.2 About the ARM architecture
The ARM architecture [2]
• RISC + additional features
• Occupies almost 75% of the 32-bit embedded RISC microprocessor market
Additional features of ARM
• Auto-increment/decrement addressing modes
• A single data-processing instruction can perform both ALU and shifter operations
• Load/store multiple instructions
• Conditional execution
1.3 Architecture versions [3]
v4
• The oldest version of the architecture supported today
• 32-bit address space
• T variant: 16-bit Thumb instruction set
• M variant: long multiply (64-bit result)
v5
• Improved ARM/Thumb interworking
• CLZ (count leading zeros) instruction
• E variant: enhanced DSP instruction set
• J variant: acceleration of Java bytecode execution
v6
• Improved memory system
• Support for Single Instruction Multiple Data (SIMD)
Architecture variants
• T: 16-bit Thumb instruction set
• D: on-chip debug support
• M: hardware long multiplier
• I: EmbeddedICE
• E: DSP extension
• S: synthesizable core
• J: Jazelle Java accelerator
• ~20: with cache and MMU
• ~40: with cache and a protection unit rather than an MMU
• ~22: smaller cache than ~20
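The suffix letters above can be read mechanically off a core name. The following is a minimal sketch, assuming names of the form ARM<number><letters>[-<letters>]; the function name `decode_core_name` and the table `VARIANT_FEATURES` are hypothetical helpers introduced for illustration, and the sketch ignores the numeric ~20/~22/~40 cache conventions as well as any features that a letter implies without spelling out.

```python
import re

# Suffix letters exactly as listed in the variant table above.
VARIANT_FEATURES = {
    "T": "16-bit Thumb instruction set",
    "D": "on-chip debug support",
    "M": "hardware long multiplier",
    "I": "EmbeddedICE",
    "E": "DSP extension",
    "J": "Jazelle Java accelerator",
    "S": "synthesizable core",
}


def decode_core_name(name: str) -> list[str]:
    """Return the features named by the suffix letters of an ARM core name.

    Hypothetical helper: only the letters in VARIANT_FEATURES are decoded.
    """
    match = re.fullmatch(r"ARM(\d+)([A-Z]*)(?:-([A-Z]+))?", name.upper())
    if match is None:
        raise ValueError(f"unrecognized core name: {name!r}")
    letters = match.group(2) + (match.group(3) or "")
    return [VARIANT_FEATURES[c] for c in letters if c in VARIANT_FEATURES]
```

For example, `decode_core_name("ARM7TDMI")` lists the Thumb, debug, multiplier, and EmbeddedICE features in that order.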
1.4 Performance comparison
2. Processor Architecture [2]
2.1 Processor modes
Mode | Reg. suffix | CPSR[4:0] | Use
User | usr | 10000 | Normal program execution mode with restricted system resources
FIQ | fiq | 10001 | Processing fast interrupts
IRQ | irq | 10010 | Processing general-purpose interrupts
Supervisor | svc | 10011 | Processing software interrupts
Abort | abt | 10111 | Processing memory faults
Undefined | und | 11011 | Handling undefined instruction traps
System | sys (= usr) | 11111 | Running privileged OS tasks (ARM architecture v4 and above)
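As a small illustration of the table above, the mode can be recovered by masking the low five bits of the CPSR. This is a sketch only; `MODE_BITS` and `processor_mode` are names made up for this example.

```python
# CPSR[4:0] encodings taken from the processor mode table above.
MODE_BITS = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abt",
    0b11011: "und",
    0b11111: "sys",
}


def processor_mode(cpsr: int) -> str:
    """Return the mode suffix selected by CPSR[4:0] (hypothetical helper)."""
    mode = cpsr & 0x1F  # keep only the five mode bits
    if mode not in MODE_BITS:
        raise ValueError(f"reserved mode bits: {mode:05b}")
    return MODE_BITS[mode]
```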
2.2 Registers
Visible registers
• 31 general-purpose registers, 6 program status registers
• At any time, 16 general-purpose registers and one or two status registers are visible, depending on the processor mode
General-purpose registers (GPR)
• Unbanked registers, R0-R7, R15
    • The same physical registers in all processor modes
• Banked registers, R8-R14
    • The physical register referred to by each of them depends on the current processor mode
Special functions of R13-R15
• Stack pointer (R13)
• Link register (R14): saves the return address
• Program counter (R15): points to the address of the instruction to be fetched
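The banking rule can be sketched as a lookup from (mode, register number) to a physical register. The per-mode detail used here (FIQ banks R8-R14, the other exception modes bank only R13-R14, and System shares the User registers) comes from the ARM architecture rather than from the slide; the names `BANKED` and `physical_register` are hypothetical.

```python
# Which register numbers have a private copy in each mode.
# User and System modes share the same physical registers.
BANKED = {
    "fiq": set(range(8, 15)),  # R8-R14
    "irq": {13, 14},
    "svc": {13, 14},
    "abt": {13, 14},
    "und": {13, 14},
}


def physical_register(mode: str, n: int) -> str:
    """Name the physical register that Rn refers to in the given mode."""
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0..15")
    if n in BANKED.get(mode, ()):
        return f"R{n}_{mode}"
    return f"R{n}"


ALL_MODES = ("usr", "fiq", "irq", "svc", "abt", "und", "sys")
# Counting the distinct physical names reproduces the 31 GPRs quoted above.
DISTINCT = {physical_register(m, n) for m in ALL_MODES for n in range(16)}
```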
Program status registers (PSR)
• CPSR (Current PSR)
• SPSR (Saved PSR)
    • Each exception mode has an SPSR
    • It preserves the value of the CPSR when the exception occurs
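How the SPSR preserves the CPSR can be shown with a simplified model of exception entry. This sketch assumes the usual ARM behavior (copy CPSR to the SPSR of the new mode, save the return address in its link register, switch the mode bits, clear the T bit, and disable IRQ); it leaves out FIQ masking and vector addresses, and `enter_exception` is an illustrative name, not an ARM-defined routine.

```python
MODE_BITS = {"fiq": 0b10001, "irq": 0b10010, "svc": 0b10011,
             "abt": 0b10111, "und": 0b11011}
T_BIT = 1 << 5  # Thumb state flag in the CPSR
I_BIT = 1 << 7  # IRQ disable flag in the CPSR


def enter_exception(cpsr: int, mode: str, return_address: int):
    """Return (new_cpsr, spsr_mode, lr_mode) for a simplified exception entry."""
    spsr_mode = cpsr                      # SPSR_<mode> preserves the old CPSR
    lr_mode = return_address              # R14_<mode> holds the return address
    new_cpsr = (cpsr & ~0x1F) | MODE_BITS[mode]
    new_cpsr &= ~T_BIT                    # exceptions are handled in ARM state
    new_cpsr |= I_BIT                     # IRQs are disabled on entry
    return new_cpsr, spsr_mode, lr_mode
```

Returning from the handler restores the CPSR from the saved SPSR, which also restores the original mode and Thumb state.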
2.3 Instruction format
[Figure: datapath for a data-processing instruction. The register bank supplies Rn and Rm; Rm passes through the shifter (#shift, Sh) into the ALU; the opcode selects the ALU operation; the result is written to Rd; condition evaluation on the flags drives the load enable; the flags are updated if S == 1.]
Bits  | 31-28 | 27-26 | 25 | 24-21  | 20 | 19-16 | 15-12 | 11-7   | 6-5 | 4 | 3-0
Field | cond  | 00    | #  | opcode | S  | Rn    | Rd    | #shift | Sh  | 0 | Rm
Example: ADDEQS Rd, Rn, Rm, LSL #2
• 3-address format
• Conditional execution
• Specification of flag update
• A shifted operand
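To make the field layout concrete, the sketch below packs a data-processing instruction with a register operand shifted by an immediate amount. The condition, opcode, and shift-type values are the standard ARM encodings; only a few are listed, the helper name `encode_dp_register` is hypothetical, and shift amounts are restricted to 0-31 without modeling the special meaning of a zero amount for LSR, ASR, and ROR.

```python
COND = {"EQ": 0b0000, "NE": 0b0001, "AL": 0b1110}
OPCODE = {"AND": 0b0000, "SUB": 0b0010, "ADD": 0b0100,
          "ORR": 0b1100, "MOV": 0b1101}
SHIFT = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10, "ROR": 0b11}


def encode_dp_register(op, cond, set_flags, rd, rn, rm,
                       shift_type="LSL", shift_amount=0):
    """Encode <op><cond>{S} Rd, Rn, Rm, <shift_type> #<shift_amount>."""
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 15:
            raise ValueError("register number must be 0..15")
    if not 0 <= shift_amount <= 31:
        raise ValueError("shift amount must be 0..31")
    return (
        (COND[cond] << 28)             # bits 31-28: condition
        | (0b00 << 26)                 # bits 27-26: data-processing class
        | (0 << 25)                    # bit 25 (#): 0 = register operand
        | (OPCODE[op] << 21)           # bits 24-21: opcode
        | (int(bool(set_flags)) << 20) # bit 20: S, update flags
        | (rn << 16)                   # bits 19-16: first operand register
        | (rd << 12)                   # bits 15-12: destination register
        | (shift_amount << 7)          # bits 11-7: shift amount
        | (SHIFT[shift_type] << 5)     # bits 6-5: shift type
        | rm                           # bits 3-0: second operand (bit 4 = 0)
    )
```

With these assumptions, `ADDEQS r0, r1, r2, LSL #2` packs to 0x00910102, and the same instruction with the AL (always) condition packs to 0xE0910102.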
2.4 About Thumb instructions
Thumb instruction set
• A re-encoded subset of the most commonly used ARM instructions
• 16-bit format: allows better code density
• 32-bit performance at 8/16-bit system cost
• At least a small amount of 32-bit ARM code is still needed
    • On an exception, the processor switches to ARM state
    • PSR-manipulating instructions are available only in ARM state
Thumb state
• T bit in CPSR == 1
Thumb entry
• By executing a BX instruction
Registers
• Visible GPRs
    • Lo registers (r0-r7)
• Special-purpose registers, accessed by some Thumb instructions
    • Program counter (r15)
    • Link register (r14)
    • Stack pointer (r13)
• Restricted register access
    • A few instructions allow the 'High' registers (r8-r15) to be specified
Thumb-ARM similarities
• Load-store architecture
• Support for 8-bit bytes, 16-bit half-words, and 32-bit words
    • Half-words are aligned on 2-byte boundaries
    • Words are aligned on 4-byte boundaries
• A 32-bit unsegmented memory
Thumb-ARM differences
• All Thumb instructions except branches are executed unconditionally
• 2-address format
• Fewer addressing modes than ARM
2.4.1 Thumb implementation [1]
Implementation in a 3-stage pipeline
[Figure: the Thumb decompressor in the instruction pipeline. Data in from memory passes through a mux that selects the high or low half-word and then through the Thumb decompressor; a second mux selects the ARM or Thumb stream and feeds the ARM instruction decoder. Immediate fields from the instruction pipeline drive the B operand bus.]
Instruction mapping
Thumb code: ADD|SUB Rd, #<imm8>
Bits  | 15-13 | 12-11 | 10-8 | 7-0
Value | 001   | 10    | Rd   | #imm8
• 001: major opcode, format 3 (MOV/CMP/ADD/SUB with immediate)
• 10: minor opcode denoting ADD
• Rd: destination and source register
• #imm8: immediate value
Equivalent ARM code: ADDS Rd, Rd, #<imm8>
Bits  | 31-28 | 27-26 | 25 | 24-21 | 20 | 19-16 | 15-12 | 11-8 | 7-0
Value | 1110  | 00    | 1  | 0100  | 1  | 0 Rd  | 0 Rd  | 0000 | #imm8
• 1110: 'always' condition
• 0100 with S = 1: ADD and set the condition codes
• 0 Rd: the 3-bit Thumb register number extended to 4 bits
• 0000: zero shift
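The mapping above amounts to moving bit fields from the 16-bit Thumb encoding into the 32-bit ARM encoding. The sketch below handles only the ADD and SUB cases of format 3; treating SUB symmetrically (ARM opcode 0010) follows the standard ARM encodings and is not shown on the slide, and the function name is hypothetical.

```python
def decompress_thumb_add_sub_imm8(halfword: int) -> int:
    """Map Thumb 'ADD|SUB Rd, #imm8' to ARM 'ADDS|SUBS Rd, Rd, #imm8'."""
    if not 0 <= halfword <= 0xFFFF:
        raise ValueError("a Thumb instruction is 16 bits")
    if halfword >> 13 != 0b001:
        raise ValueError("not a format 3 (immediate) instruction")
    minor = (halfword >> 11) & 0b11   # 10 = ADD, 11 = SUB
    rd = (halfword >> 8) & 0b111      # 3-bit Lo register number
    imm8 = halfword & 0xFF
    arm_opcode = {0b10: 0b0100, 0b11: 0b0010}
    if minor not in arm_opcode:
        raise ValueError("only ADD and SUB are handled in this sketch")
    return (
        (0b1110 << 28)                # 'always' condition
        | (1 << 25)                   # immediate operand
        | (arm_opcode[minor] << 21)   # ADD or SUB
        | (1 << 20)                   # S bit: set the condition codes
        | (rd << 16)                  # Rn = Rd (zero-extended to 4 bits)
        | (rd << 12)                  # Rd
        | imm8                        # rotate field 0000, then imm8
    )
```

For example, the Thumb half-word 0x3105 (`ADD r1, #5`) maps to 0xE2911005 (`ADDS r1, r1, #5`).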
3. Organization
3.1 3-stage pipeline
Organization
• Address generating block
    • Address register
    • Incrementer
    • Address selector
• Register bank
    • 31 GPRs, 6 PSRs
    • 2 read ports, 1 write port
    • Additional 1 read port and 1 write port for the PC
• Barrel shifter
• ALU
• I/O registers
    • Instruction pipeline
    • Read data register
    • Byte replicator
• Control logic
    • External interface
    • Instruction decoder
    • Datapath control
[Figure: 3-stage pipeline organization, showing the address register and incrementer (A[31:0]), the register bank with the PC, the multiply register, the barrel shifter, the ALU, the data in and data out registers (D[31:0]), and the instruction decode and control block, connected by the A bus, B bus, and ALU bus.]
Pipeline stages
• Fetch
    • Instruction fetch from memory
• Decode
    • Instruction decoding
    • Datapath control signals are prepared for the next cycle
• Execute
    • Reading registers
    • Shift and ALU operations
    • Writing back to the register bank
Pipeline timing for data-processing (DP) instructions:
DP      F  D  E
PC+i       F  D  E
PC+2i         F  D  E
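The staggered timing of the three stages can be reproduced with a few lines of code. This is a sketch for the ideal case only, where every instruction is a single-cycle data-processing instruction; `pipeline_rows` and `total_cycles` are illustrative names.

```python
def pipeline_rows(n_instructions, stages=("F", "D", "E")):
    """One row per instruction; '.' marks cycles before it is fetched."""
    return [["."] * i + list(stages) for i in range(n_instructions)]


def total_cycles(n_instructions, depth=3):
    """Cycles to finish n single-cycle instructions in an ideal pipeline."""
    if n_instructions == 0:
        return 0
    return depth + n_instructions - 1


if __name__ == "__main__":
    for row in pipeline_rows(3):
        print(" ".join(row))
```

Once the pipeline is full, one instruction completes per cycle, so three instructions finish in five cycles rather than nine.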
LDR
LDR      F  D  E1 (calc)  E2 (xfer)  E3 (move)
following instructions: F  D  E each
Branch
B        F  D  E1  E2  E3
PC+i     F  (discarded)
PC+2i    F  (discarded)
T        F  D  E
T+i      F  D  E
3.1.1 Multiple load/store instruction
LDM
LDM      F  D  A1  A2  A3  …  An
                   L1  L2  …  Ln-1  Ln
                       M1  …  Mn-2  Mn-1  Mn
following instructions: F  D  E each
3.2 5-stage pipeline organization
To increase performance [1]
• Increase the clock rate
    • Simplify each pipeline stage
    • Increase the number of pipeline stages
• Reduce the average number of clock cycles per instruction (CPI)
    • Avoid the von Neumann bottleneck
    • Exploit a Harvard architecture
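The two levers listed above correspond to the two terms a designer can change in the standard expression for program execution time. The symbols are chosen here for illustration: N_inst is the number of instructions executed, CPI is the average number of clock cycles per instruction, and f_clk is the clock frequency.

```latex
T_{\text{prog}} = \frac{N_{\text{inst}} \times \text{CPI}}{f_{\text{clk}}}
```

For a fixed program (fixed N_inst), a deeper pipeline raises f_clk, while separate instruction and data memories lower the CPI by removing the conflict between instruction fetches and data accesses.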
Organization
• Harvard architecture
    • Separate caches
• Register bank
    • 3 read ports, 2 write ports
• Additional address incrementer for multiple load/store
• Forwarding paths to resolve data dependencies
[Figure: 5-stage pipeline organization. Stages: fetch (I-cache, next pc, +4), instruction decode (I decode, register read, immediate fields), execute (shift, mul, ALU, forwarding paths), buffer/data (D-cache, load/store address, byte replicator, rotate/sign extend), and write-back (register write). Next-PC sources include pc + 4, pc + 8 (r15), B/BL, MOV pc, SUBS pc, and LDR pc; the address path supports pre-index, post-index, and LDM/STM.]
Pipeline comparison
ARM7TDMI: Fetch (instruction fetch) | Decode (Thumb decompress, ARM decode, reg read) | Execute (shift/ALU, reg write)
ARM9TDMI: Fetch (instruction fetch) | Decode (decode, reg read) | Execute (shift/ALU) | Memory (data memory access) | Write (reg write)
Interlock [6]
    LDR rN, [..]      ; load rN from somewhere
    ADD r2, r1, rN    ; and use it immediately
• The ADD instruction cannot start until the data is returned from the load
• The ADD instruction has to delay entering the execute stage of the pipeline by one cycle
PC behavior [1]
• The 5-stage pipeline emulates the behavior of the 3-stage designs
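The load-use interlock described above can be counted with a simple scan over an instruction sequence. This sketch assumes exactly one stall cycle whenever an instruction reads the register loaded by the LDR immediately before it, as stated for the 5-stage pipeline; the `Instr` tuple and the function name are hypothetical, and other hazards are ignored.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    op: str        # mnemonic, e.g. "LDR" or "ADD"
    dest: str      # destination register
    sources: tuple # source registers read by the instruction


def load_use_stalls(program) -> int:
    """Count one-cycle interlocks caused by using a loaded value at once."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev.op == "LDR" and prev.dest in cur.sources:
            stalls += 1
    return stalls
```

Placing an independent instruction between the LDR and the dependent ADD removes the stall, which is why compilers try to schedule loads early.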
LDR
LDR      F  D  E  M  W
ADD      F  D  E  M  W
following instructions: F  D  E  M  W each
Branch
B        F  D  E1  E2  E3  M  W
next two instructions: F only (discarded)
target   F  D  E  M  W
Separate caches
• The instruction cache and data cache are accessible at the same time
3.3 Multiplier [1]
Low-cost multiplication hardware
• 32-bit results for multiply and multiply-accumulate
• No longer used in recent cores
• Shift and add: uses the barrel shifter and ALU to generate a 2-bit product in each cycle → 16 cycles in the worst case
• Early termination logic
• Employs the modified Booth's algorithm (radix-4)
High-performance multiplication
• 64-bit results for multiply and multiply-accumulate
• Employs a 32x8 multiplier
    • 4 layers of carry-save adder array, each handling two multiplier bits
    • Multiplies eight bits per cycle
• 4 cycles in the worst case
• Early termination logic
[Figure: high-performance multiplier organization. Rs is shifted right 8 bits per cycle into the carry-save adders, which combine Rm with the partial sum and partial carry registers (initialized for MLA); the sum and carry are rotated 8 bits per cycle, and the ALU adds the partials.]
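The cycle counts quoted for both multipliers follow from how many multiplier bits are consumed per cycle and from early termination. The sketch below is a simplified unsigned shift-and-add model, not the Booth-recoded carry-save hardware: it stops as soon as the remaining bits of Rs are all zero. The function name `multiply` and its parameters are assumptions made for this example.

```python
def multiply(rm: int, rs: int, bits_per_cycle: int = 8, width: int = 32):
    """Return (product, cycles) for an unsigned multiply of rm by rs.

    Each cycle consumes `bits_per_cycle` bits of rs; the loop terminates
    early once no set bits remain in rs.
    """
    mask = (1 << width) - 1
    rm &= mask
    rs &= mask
    chunk_mask = (1 << bits_per_cycle) - 1
    product = 0
    shift = 0
    cycles = 0
    while True:
        chunk = rs & chunk_mask            # multiplier bits for this cycle
        product += (rm * chunk) << shift   # add the shifted partial product
        rs >>= bits_per_cycle
        shift += bits_per_cycle
        cycles += 1
        if rs == 0 or shift >= width:      # early termination or done
            break
    return product, cycles
```

With `bits_per_cycle=2` the worst case takes 16 cycles, and with `bits_per_cycle=8` it takes 4, matching the figures above; a small multiplier such as 3 finishes in a single cycle in either configuration.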
4. Processor Cores
4.1 Architecture evolutions
4.2 ARM7 Thumb family [7]
ARM7 Thumb family (v4T)
• Low-power, 32-bit RISC cores optimized for cost- and power-sensitive applications
• 3-stage pipeline
• Unified bus interface
ARM7TDMI [1]
• Base integer core (hard macrocell)
    • A 3-volt-compatible rework of the ARM6 32-bit integer core
• Low-power, fully static design
• 3-stage pipeline
• Unified bus interface
• The Thumb 16-bit compressed instruction set
• On-chip debug support
    • Interface for direct connection to the Embedded Trace Macrocell
    • JTAG interface unit
• Enhanced multiplier yielding a full 64-bit result
• EmbeddedICE hardware to give on-chip breakpoint and watchpoint support
ARM7TDMI-S
• A synthesizable version of the ARM7TDMI
• Delivered as a high-level language module
• The core can be synthesized with reduced functionality
ARM720T macrocell
• High-performance processor for systems requiring full virtual memory management and protected execution spaces
• Additional features
    • 8K unified cache
    • Memory management unit
    • Write buffer
    • AMBA AHB bus interface
ARM7EJ-S
• Enhanced core
• ARM v5TEJ
• Jazelle technology
    • Hardware acceleration of Java bytecode execution
• DSP extensions
    • 16-bit data operations
    • Saturating, signed arithmetic
    • Enhanced MAC operations
Performance
4.4 ARM9 family [8]
ARM9 family (v4T)
• Very high-performance, low-power 32-bit RISC cores for a wide variety of cost- and power-sensitive applications
• ARM and Thumb instruction sets
• 5-stage pipeline
• Up to 300 MIPS (Dhrystone 2.1) in a typical 0.13 µm process
• Single 32-bit AMBA interconnect interface
• MMU supporting virtual memory systems
• Harvard architecture
• 8-entry write buffer
ARM920T and ARM922T macrocells
• Support platform operating systems such as Linux
• 16K I-cache and 16K D-cache (ARM920T) or 8K I-cache and 8K D-cache (ARM922T)
• MMU
• AMBA bus interface
• Embedded Trace Macrocell
ARM940T
• Applications such as DSL modem chipsets
• 4K I-cache and 4K D-cache
• Protection unit rather than an MMU
Performance
4.5 ARM9E family [10]
ARM9E family (v5TE)
• Single-core solutions for microcontroller, DSP, and Java applications
• Synthesizable soft IP
• 5-stage integer pipeline
• Harvard architecture
• ARM, Thumb, and DSP instruction sets
• ARM Jazelle technology for Java acceleration (ARM926EJ-S)
• Up to 300 MIPS (Dhrystone 2.1) in a typical 0.13 µm process
• Integrated real-time trace and debug support
• Optional VFP9 coprocessor for floating-point operations
• High-performance AHB system
• Memory management unit
• 16-entry write buffer
The DSP extensions
• Single-cycle 16x16 and 32x16 MAC (multiply-accumulate) operations
• Enhanced saturation arithmetic behavior and performance
Tightly Coupled Memory
• TCMs are intended for storing real-time code and data
• Accesses to TCMs are deterministic and do not incur access penalties
• Cache preload instructions
4.8 XScale
• Intel ARM v5TE architecture
• Intel superpipelined RISC technology
    • 7-stage integer pipeline
    • MAC pipeline with early termination
    • 8-stage memory pipeline
• Branch target buffer (BTB)
• Separate caches and MMUs
    • 32K I-cache, 32K D-cache
• Multiply-accumulate coprocessor
    • Provides 40-bit accumulation of 16x16, dual 16x16 (SIMD), and 16x32 signed multiplies
• Clock and power management
    • Supports dynamic clock and voltage scaling
• Performance monitoring unit
    • Two 32-bit event counters and one 32-bit clock counter
5. ARM development environment
ARM Developer Suite (ADS)
• Integrated Development Environment (IDE)
    • CodeWarrior IDE: editing, compilation, …
    • AXD debugger: GUI debug environment
    • ARMulator (software emulator)
Debug hardware
• Multi-ICE
    • JTAG-based in-circuit emulator
    • Controls EmbeddedICE-RT and ETM logic
• MultiTrace
    • Trace port analyzer unit
    • Passively collects information from the ETM
6. IP Solutions [14]
6.1 AMBA
The de facto standard for on-chip buses
• AMBA is an open standard on-chip bus specification
The Advanced High-performance Bus (AHB)
• Connects high-performance system modules
• Single clock edge
• Supports burst and split transactions
• Centrally multiplexed bus scheme
AHB-Lite
• A subset of the full AHB specification
• A single bus master is used
Multi-layer AHB
• Multiple bus masters
The Advanced Peripheral Bus (APB)
• Simpler bus protocol designed for peripherals
• Connects to the system bus via a bridge
6.2 PrimeCell Peripherals [14]
Re-usable soft IP macrocells developed to enable the rapid assembly of SoC designs
Ready to use, fully verified and compliant with the AMBA on chip bus standard
Fully packaged, ready to use soft IP macrocells
Rapid and easy integration into AMBA-based SoC designs
Royalty-free license for single or multiple use
Supplied in VHDL and Verilog HDL with synthesis scripts
Software device drivers are included as source code
7. ARM applications
7.1 The Psion Series 5MX
[Figure: Psion Series 5MX hardware organization. An ARM7100 connects to ROM, DRAM, a 640 x 240 LCD, a digitizing tablet through an ADC, the PSU, PC cards, Flash, an audio codec, infrared (IrDA Tx/Rx), RS232, and the keyboard.]
[Figure: ARM7100 organization. An ARM710a (ARM7 core, MMU, 8 Kbyte cache) sits on an AMBA bus together with the LCD controller, DRAM controller (DRA (13), RAS/CAS (8)), external bus control (address (28), data (32), WE/OE (2), expansion), interrupt controller, counter/timers, UART, sync serial, codec interface with FIFOs, parallel I/O, PSU control, power management, a clock PLL, and an RTC oscillator; the clock inputs are 3.6864 MHz and 32.768 kHz.]
Summary
• The Advanced RISC Machine
    • Enhanced RISC architecture
    • Simple hardware but an effective instruction set
• Thumb instruction set
    • High-density code on ARM cores
• ARM offers a wide range of processor cores
• ARM Ltd. provides designers with a fully integrated development environment
• ARM cores are widely used in embedded markets
References
[1] Steve Furber, "ARM System-on-Chip Architecture", 2nd ed., Addison-Wesley, 2000
[2] "ARM Architecture Reference Manual", ARM Ltd., June 2000
[3] "ARM Architecture Version 6 (v6) White Paper", ARM Ltd., January 2002
[4] "Improving ARM code density and performance", ARM Ltd., June 2003
[5] Application Note 04, "Programmer's Model for Big-Endian ARM", ARM Ltd., December 1994
[6] "ARM9TDMI Rev3 Technical Reference Manual", March 2000
[7] "ARM7 Family Flyer", ARM Ltd.
[8] "ARM9 Family Flyer", ARM Ltd.
[9] "ARM9E Family Flyer", ARM Ltd.
[10] "ARM10E Family Flyer", ARM Ltd.
[11] "White paper - The ARM11 Microarchitecture", ARM Ltd., April 2002
[12] "Intel XScale Microarchitecture Technical Summary", Intel Co., 2000
[13] "ARM debugging techniques for embedded systems using real-time software trace", ARM Ltd., 2002
[14] "ARM Product Backgrounder", ARM Ltd., November 2003
[15] "Samsung communication MCU S3C4510", Samsung Electronics Co., Ltd.
[16] "Sceptre HPE EDGE/GPRS/GSM High performance solution", Agere Systems Inc., November 2003
[17] Yi Gao, Shilang Tang, Zhongli Ding, "Comparison between CISC and RISC", University of Maryland
[18] ARM Application Note 29, "Interfacing a memory system to the ARM7TDMI without using AMBA", ARM Ltd., December 1995
[19] Arvind Krishnaswamy, Rajiv Gupta, "Profile guided selection of ARM and Thumb instructions", The University of Arizona