Challenges & Implications For VLSI Architectures For Multimedia Processing
Vineet Sahula
sahula@ieee.org
Deptt of ECE
Malaviya National Institute of Technology
Jaipur
Outline
• Motivation & challenges
• Choice of architectures
• Tasks in Multimedia processing
• Design optimization approach
– Throughput enhancement
– Power optimization
• Medium level
• Tasks link simple data structures (pixels) to symbolic information
– Data dependent decisions & lower regularity
• High level
• Operations on symbols & complex objects of variable sizes
– Highly data dependent computation-flow
– Advance prediction not possible
• Throughput (MOPS)
– Motion estimation (~80%)
• Not matched by a GPP
• Needs specific optimizations
– ASICs
[Figure: energy efficiency (MOPS/mW) versus flexibility for dedicated HW, reconfigurable HW, DSP/ASIPs, and programmable processors]
• Software programmable
– General purpose (GPP)
– Application specific (DSP/ASIP)
• Hardware programmable (CPLD/FPGA)
• Dedicated hardware (ASIC)
IETE'05 VLSI Arch. for Multimedia 7
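Data-Path & Control
• Example: z = (a+b) + (c+d)
• Dedicated HW
– 2 time steps with 2 ALUs
– 1 time step with 3 ALUs
• Control FSM
– 1-hot encoded [HW control]
• Micro-program control [Control memory]
[Figure: data path with inputs a, b, c, d, multiplexers Mx and My, register R with load signal LR, and output z; control FSM with states Start, S1, S2, Stop; signal values shown: Mx=1, My=0, LR=1 and Mx=0, My=1, LR=1]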
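The trade-off between time steps and ALU count for z = (a+b) + (c+d) can be sketched in software. The following is a minimal behavioural model, not the circuit from the slide: the state names, register names r1 and r2, and both function names are assumptions made for illustration, and the one-hot codes simply show that each state owns exactly one bit.

```python
# Behavioural sketch of z = (a + b) + (c + d) on dedicated hardware.
# All names here are illustrative assumptions, not the original design.

# One-hot state codes: exactly one bit is set in each state.
S1, S2, STOP = 0b001, 0b010, 0b100


def run_two_alu(a, b, c, d):
    """Two ALUs sequenced by a one-hot FSM; returns (z, time steps)."""
    state = S1
    r1 = r2 = z = 0              # data-path registers
    steps = 0
    while state != STOP:
        if state == S1:
            # Time step 1: both ALUs work in parallel.
            r1 = a + b           # ALU 1
            r2 = c + d           # ALU 2
            state = S2
        elif state == S2:
            # Time step 2: one ALU combines the partial sums.
            z = r1 + r2
            state = STOP
        steps += 1
    return z, steps


def run_three_alu(a, b, c, d):
    """Three chained ALUs produce z in a single, longer time step."""
    return (a + b) + (c + d), 1


print(run_two_alu(1, 2, 3, 4))    # two steps, two ALUs
print(run_three_alu(1, 2, 3, 4))  # one step, three ALUs
```

Both variants compute the same z; they differ only in how many adders are instantiated and how many control states are needed.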
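Programmable Processors
• Example: z = (a+b) + (c+d)
• Instruction sequence
– Load R1
– Load R2
– R3 ← R1 + R2
– R3 ← R3 + R4
• DSP: Multiply-Accumulate
[Figure: memory, register bank (Rx, Ry), and instruction register (I-Reg) under HW or microprogram control]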
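For comparison with dedicated hardware, the same z = (a+b) + (c+d) can be run as an instruction sequence on a load/store machine. This is a rough sketch under stated assumptions: the opcodes LOAD and ADD, the tuple encoding, the one-instruction-per-cycle cost, and the `execute` helper are all hypothetical and do not model any real processor.

```python
# Sketch of a load/store register machine. The instruction names and
# encoding are hypothetical, chosen only to illustrate the idea.

def execute(program, memory):
    """Run a list of instruction tuples; return (registers, cycle count)."""
    regs = {}
    cycles = 0
    for instr in program:          # assume one instruction per cycle
        op = instr[0]
        if op == "LOAD":           # ("LOAD", dest_reg, memory_name)
            _, dest, name = instr
            regs[dest] = memory[name]
        elif op == "ADD":          # ("ADD", dest_reg, src_reg1, src_reg2)
            _, dest, s1, s2 = instr
            regs[dest] = regs[s1] + regs[s2]
        else:
            raise ValueError(f"unknown opcode: {op}")
        cycles += 1
    return regs, cycles


memory = {"a": 1, "b": 2, "c": 3, "d": 4}
program = [
    ("LOAD", "R1", "a"),
    ("LOAD", "R2", "b"),
    ("ADD", "R3", "R1", "R2"),     # R3 <- R1 + R2
    ("LOAD", "R1", "c"),
    ("LOAD", "R2", "d"),
    ("ADD", "R4", "R1", "R2"),     # R4 <- R1 + R2
    ("ADD", "R3", "R3", "R4"),     # R3 <- R3 + R4
]
regs, cycles = execute(program, memory)
print(regs["R3"], cycles)          # prints "10 7"
```

Under these assumptions the programmable machine needs seven instruction cycles where the dedicated data path needs one or two time steps, which is the overhead paid for flexibility.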
Architecture Characteristics
• Processors
• Instruction set is fixed/customized
• Algorithm changes adapted through SW rewriting
• Power & computation-time overheads are large
• Reconfigurable HW
• Architecture at logic level is fixed
• Architecture reconfiguration requires interconnection
programming
• Dedicated HW
• HW can’t be reconfigured
• Can be extremely power-efficient and high performance
Dedicated HW
• Suitable for the most compute-intensive low-level tasks
• Functionality is fixed
• Redesign means new design
[2] D. Chauhan et al., Hardware Design evaluation for fast motion estimation, B. Tech. Thesis, MNIT Jaipur, 2004
[3] Govind S. and V. Sahula, ASIP Design Space exploration for motion estimation, IEEE VDAT, 2003
Dedicated HW Implementation
• 2D DCT/IDCT for Video codec
– Matrix multiplication: regular and parallelizable
• Motion Estimation
– Estimate MV through block matching
• Very regular & parallelizable
– Minimizing a distortion metric
– Mean absolute difference (MAD)
– Object based ?
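Block matching with MAD as the distortion metric can be sketched as an exhaustive (full) search. This is one common variant, written here as a plain software model; the function names, the frame layout as nested lists indexed [row][column], and the search range parameter p are assumptions for illustration, not the hardware design the slides refer to.

```python
# Full-search block matching with mean absolute difference (MAD).
# Frames are nested lists indexed as frame[row][column].

def mad(cur, ref, bx, by, dx, dy, n):
    """MAD between the n x n block of `cur` at (bx, by) and the block
    of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)


def full_search(cur, ref, bx, by, n, p):
    """Test every displacement in [-p, p]; return (best MV, its MAD)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, float("inf")
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width):
                continue
            if not (0 <= by + dy and by + dy + n <= height):
                continue
            cost = mad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Every candidate displacement runs the same fixed loop nest with no data-dependent branching inside `mad`, which is the regularity that makes the task a good fit for parallel dedicated hardware.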
Media Processor Chips
• Philips TriMedia
• Audio/visual, graphics, communication tasks
• VLIW
– 25 FUs: ALUs, multipliers, FP units
• AT&T AVP4000
• DSP
• 3 ASICs
[Figure: data path producing outputs O1, O2, …, Oj, …]
• Critical path delay, TD
– From primary input Ii to primary output Oi
– TD in ns
• Throughput: rate of producing outputs per second
– Throughput = number of operations/sec, much higher than 1/TD
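One common way throughput comes to exceed 1/TD is pipelining the data path. The small calculation below is a sketch under assumptions that are not in the slides: the critical path splits into equal stages, each stage adds a fixed register overhead, and the function name and parameters are hypothetical.

```python
def pipeline_throughput(t_d_ns, stages, t_reg_ns=0.0):
    """Return (unpipelined, pipelined) throughput in results per second.

    Assumes the critical path of t_d_ns is split into `stages` equal
    stages, each followed by a register with overhead t_reg_ns.
    """
    clock_ns = t_d_ns / stages + t_reg_ns    # pipelined clock period
    return 1e9 / t_d_ns, 1e9 / clock_ns


# Example: a 20 ns critical path cut into 4 stages, 1 ns register overhead.
base, piped = pipeline_throughput(20.0, 4, 1.0)
print(f"{base / 1e6:.1f} M results/s unpipelined")
print(f"{piped / 1e6:.1f} M results/s pipelined")
```

The latency of a single result does not improve in this model; only the rate at which results emerge does.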
Area-pipeline
[4] G. Singh, Low power Floating Point Arithmetic circuits, M. Tech. Thesis, MNIT, 2003
[5] P. Jain, V. Sahula, Low power IPP characterization for small digital circuits, IEEE VDAT, 2002