Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
0% found this document useful (0 votes)
333 views410 pages

Microprocessors in Robotic and Manufacturing Systems (Muhadharaty) PDF

Download as pdf or txt
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 410

Microprocessors in Robotic and Manufacturing Systems

International Series on
MICROPROCESSOR-BASED SYSTEMS ENGINEERING

VOLUME 6

Editor
Professor S. G. Tzafestas, National Technical University, Athens, Greece

Editorial Advisory Board


Professor C. S. Chen, University 0/ Akron, Ohio, u.S.A.
Professor F. Harashima, University o/Tokyo, Tokyo, Japan
Professor G. Messina, University o/Catania, Catania, Italy
Professor N. K. Sinha, McMaster University, Hamilton, Ontario, Canada
Professor D. Tabak, Ben Gurion University o/the Negev, Beer Sheva, Israel

The titles published in this series are listed at the end of this volume.
Microprocessors in
Robotic and Manufacturing
Systems

edited by

SPYROS G. TZAFESTAS
Department of Electrical Engineering,
National Technical University of Athens,
Athens, Greece

SPRINGER SCIENCE+BUSINESS MEDIA, B.V.


Library of Congress Cataloging-in-Publication Data
Microprocessors In robotlc and manufacturlng systems I edited by
Spyros Tzafestas.
p. cm. -- (InternatIonal serles on mlcroprocessor-baseo
systems englneerlng i v. 6)
Includes bibllographlcal references and lndexes.
ISBN 978-94-010-5694-6 ISBN 978-94-011-3812-3 (eBook)
DOI 10.1007/978-94-011-3812-3
,. Robots--Control systems. 2. Mlcroprocessors. 3. Computer
lntegrated manufacturlng systems. 1. Tzafestas. S. G.• 1939-
II. Series.
TJ21'.35.M53 1991
629.S·92--dc20 91-29676

ISBN 978-94"010-5694-6

Printed an acidjree paper

AII Rights Reserved


© 1991 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1991
Softcover reprint ofthe hardcover Ist edition 1991
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
inc1uding photocopying, recording or by any informat ion storage and
retrieval system, without written permission from the copyright owner.
CONTENTS

Preface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Contributors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

PARr 1
ROBOTIC COMPUTATIONS

CHAPTER 1
COMPUTATION OF ROBOT DYNAMICS BY A
MULTIPROCESSOR SCHEME
Yuan F. Zheng

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Tree-Height Reduction by Newton-Euler State Space Equations. . 6
3. The Multiprocessor System and the Task Allocation. . . . . . . . . . . . 17
4. Performance Evaluation .................................... , 22
5. Parallel Processing for a Two Link Robot System. . . . . . . . . . . . . . .. 24
6. Conclusion ............................................... 27
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

CHAPTER 2
AUTOMATIC SCHEDULING OF THE NEWTON-EULER
INVERSE DYNAMICS
Erik H. D'Hollander

1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2. Task Granularity and Communication Overhead. . . . . . . . . . . . . . . . 32
3. The Newton-Euler Inverse Dynamics. . . . . . . . . . . . . . . . . . . . . . . . . 35
4. Automatic Data-Flow Analysis: the LEM Analyzer ............... , 37
5. Parallel Execution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
6. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 45
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

CHAPTER 3
NUMERICAL APPLICATIONS OF DSPs IN ROBOTIC
COMPUTATIONS
A. Di Stefano and O. Mirabella

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2. A Survey of the Main Characteristics of DSPs . . . . . . . . . . . . . . . . . . 52
3. Numerical Applications in Robotic Computations. . . . . . . . . . . . . .. 61
4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
VI

CHAPTER 4
PARALLEL PROCESSING OF ROBOT CONTROL
AND SIMULATION
Hironori Kasahara

1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2. Parallel Processing of Robot-Arm Control Computation. . . . . . . . .. 80
3. Parallel Processing of Robot Dynamics Simulation. . . . . . . . . . . . . . 92
4. SurnmaIY and Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 98
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 98

PARl'2
ROBOTIC CONTROL AND VISION

CHAPTERS
MICROPROCESSOR-BASED CONTROLLERS
FOR ROBOTIC MANIPULATORS
J.A. Tenreiro Machado and J.L. Martins de Carvalho

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
2. Survey of Microprocessor-Based Controllers. . . . . . . . . . . . . . . . .. 104
3. Techniques to Improve Microprocessor-Based Controller
Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 120
Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

CHAPTER 6
DESIGN ASPECTS OF A ROBOT COORDINATED
BY A DESKTOP COMPUTER
John Billingsley

1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 131
2. The Supervisory System ................................... , 132
3. Kinematics and Their Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 132
4. Polynomial Fitting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 139
5. Implementing the Difference Equations. . . . . . . . . . . . . . . . . . . . .. 141
6. Axis Command and Coordination. . . . . . . . . . . . . . . . . . . . . . . . . . .. 143
7. Implementing the Position Control Loop . . . . . . . . . . . . . . . . . . . . .. 147
8. Priority of Operations in the Axis Controller. . . . . . . . . . . . . . . . . .. 150
9. Conclusions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 152
Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 152
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 152
vii

CHAPTER 7
A PC SIMULATION PROGRAM FOR COMPARING
PERFORMANCES OF ROBOT CONTROL ALGORITHMS
J. O'Shea and R. Benali

l. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
2. Trajectory Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
3. Robot Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
4. Robot Dynanlics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
5. Some Control Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
6. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

CHAPTERS
COLLISION STRATEGIES FOR ROBOT RETREAT
AND RESISTANCE
R.E. Goddard. K.L. Boyer and H. Hemami

l. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
2. Pre-Collision Visual Sensing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 178
3. Post-Collision Tactile Sensing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
4. Control Requirements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 205
5. Four link Biped. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6. Discussion and Conclusions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
References. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213

CHAPTER 9
A DISTRIBUTED CONTROL NETWORK
FOR SENSORY ROBOTICS
A.T. de Almeida. U.C. Nunes. J.M. Dias. H.J. AratYo
and J. Batista

l. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
2. Distributed system architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
3. Modelling the environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
4. Vision system .......................................... 221
5. Manipulator gripper and associated sensors ................. 227
6. Sensor data integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
7. Software conSiderations .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
8. Conclusions ............................................ 233
References .............................................. 234
viii

CHAPTER 10
NEURAL NETWORKS AND ROBOT VISION
Erik H. D'HoUander

1. Neural networks: what's in a name? . . . . . . . . . . . . . . . . . . . . . . . .. 237


2. Neural networks operation ................................ 238
3. The linear pattern associator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
4. The nonlinear multilayered pattern associator . . . . . . . . . . . . . . . 242
5. The auto assoctator ...................................... 249
6. Constraint satisfaction network ............................ 251
7. Competitive learning model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 253
8. Neural networks as parallel distributed processing systems .... 255
9. Conclusion .............................................. 256
References .............................................. 256

PARTS
MANUFACTURING SYSTEMS

CHAPTER 11
MICROPROCESSORS IN DATA ACQUISITION SYSTEMS
FOR PROCESS CONTROL
A.M. Weilert

1. Microprocessor based data acquisition systems. . . . . . . . . . . . . . .. 261


2. Process control. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 270
3. Integration/automation with microprocessor based systems. . . .. 274
References .............................................. 277

CHAPTER 12
DESIGN AND ANALYSIS OF A MODULAR CNC SYSTEM
Y. Altintas

1. Introduction ............................................ 280


2. Physical components of the CNC milling machine. . . . . . . . . . . .. 282
3. Transfer functions of the feed drive control system . . . . . . . . . . .. 283
4. State space modelling of the feed drive control system ........ 290
5. Experimental verification of the feed drive servo model. . . . . . .. 294
6. Time domain Simulation of contouring errors . . . . . . . . . . . . . . . .. 299
7. Conclusion .............................................. 309
References ..................................... . . . . . . . .. 309
ix

CHAPTER 13
COMPUTER / PROGRAMMABLE CONTROL
OF A FLEXIBLE MANUFACTURING CELL
J. Richard. F. LePage and G. Morel

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 311
2. Cell functional requirements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
3. Structural requirements .................................. 314
4. Study and realization ........... " . . . . . . . . . . . . . . . . . . . . . . . . . 320
5. Conclusion .............................................. 332
Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
References ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 332

CHAPTER 14
ROBOT AND PLC SUPPORT SYSTEM
BASED ON FmER OPTIC MAP NETWORK
P. Hodaie. B. Chan. B. Szabados and O. Storoshchuk

1. Introduction ..................,......................... 335


2. Local area network selection criteria ....................... 337
3. Network design ......................................... 338
4. Support application softwares ............................. 340
5. ConcluSions and recommendations ......................... 350

CHAPTER 15
EXPERIMENTAL RESULTS OF PARAMETER
IDENTIFICATION AND NONLINEAR CONTROL
ON A PM STEPPER MOTOR
M. Aiello. R. Rekowski. M. Bodson. J. Chiasson
and D. Schuerer

l. Introduction ., ................ , . . . . . . . . . . . . . . . . . . . . . . . . .. 353


2. Feedback linearization controller ................ , ..... ,.... 354
3. Model validation and parameter identification. . . . . . . . . . . . . . . .. 358
4. Experimental set up and results. . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
5. Future research. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 366
Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 366
References ........................................ . . . . .. 367
x

CHAPTER 16
DIGITAL SIGNAL PROCESSOR IMPLEMENTATION
OF A KALMAN FILTER FOR DISK DRIVE
HEAD POSITIONING MECHANISM
J.A. Lowe and F.L. Lewis

1. Introduction .................................... . . . . . . . .. 369


2. The head-positioning actuator. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 370
3. Kalman ffiter design ........ . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 373
4. Kalman ffiter Implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 376
5. Conclusion ............ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
Acknowledgemenm ........................................ 383
References .............................................. 383

CHAPTER 17
SIMULATION ANALYSIS OF FLEXIBLE MANUFACTURING
SYSTEMS USING RENSAM
Jorge Haddock and YOWlg Chul. Kim

1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 386
2. RENSAM System Description. . . . . . . . . . . . . . .. . . . . . . . . . . . . .. 386
3. Case Study: CompanyC..................... ...... ......... 388
4. RENSAMResulm......................................... 392
5. Summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 396
References. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 402

AlTIliOR INDEX. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 403


S~INDEX ........................................... 405
PREFACE

Microprocessors play a dominant role in computer technology and


have contributed uniquely in the development of many new concepts and
design techniques for modem industrial systems. This contribution is
excessively high in the area of robotic and manufacturing systems.
However, it is the editor's feeling that a reference book describing this
contribution in a cohesive way and covering the major hardware and
software issues is lacking. The purpose of this book is exactly to fill in this
gap through the collection and presentation of the experience of a
number of experts and professionals working in different academic and
industrial environments.
The book is divided in three parts.
Part 1 involves the first four chapters and deals with the utilization of
microprocessors and digital signal processors ( DSPs ) for the
computation of robot dynamics. The emphasis here is on parallel
computation with particular problems attacked being task granularity,
task allocation/scheduling and communication issues. Chapter I, by Zheng
and Hemami, is concerned with the real-time multiprocessor
computation of torques in robot control systems via the Newton-Euler
equations. This reduces substantially the height of the evaluation tree
which leads to more effective parallel processing. Chapter 2, by
D'Hollander, examines thoroughly the automatic scheduling of the
Newton-Euler inverse dynamic equations. The automatic program
decomposition and scheduling techniques developed are embedded in a
tool used to generate multiprocessor schedules from a high-level language
program. Chapter 3, by Di Stefano and Mirabella deals with the application
of DSPs in robotic computations. A number of numerical algorithms are
designed and implemented using the TMS320 family. Chapter 4, by
Kasahara, proposes several parallel processing schemes on multiprocessor
systems for the Newton-Euler or Walker-OrIn's robot control computation
and dynamic simulation.
Part 2 contains six chapters devoted to the computations involved in
robot controllers and vision systems. Chapter 5, by Tenreiro Machado and
Martins de Carvalho, is devoted to the real-time multiprocessor
implementation of nonconventional robot controllers, such as sliding
xi
xii

controllers and learning controllers. In Chapter 6. Billingsley studies


several design aspects of a robot coordinated by a desktop computer. In
Chapter 7. O'Shea and Benali propose a benchmark test together with a
PC based simulation program for comparing the performances of robot
control algorithms. Then they apply these tests to some robot control
algorithms for the PUMA 560 robot. In Chapter 8. Goddard. Boyer and
Hemami are concerned with robot systems where collision avoidance is
not possible. They examine the technical issues for a retreat strategy and
a resistance strategy. and discuss the realization of these strategies using
microprocessors. In Chapter 9. deAlmeida. Nunes. Dias. Araujo and
Batista. discuss the problems of sensory integration and real-time path
control. They propose a distributed hierarchical architecture that involves
vision. tactile. range and proximity sensors. Part 2. ends up with Chapter
10. where D'Hollander discusses the basic neural network techniques
(pattern recognition. blurred image restoration. unsupervised
classification) applicable to robotic vision. D'Hollander also provides an
analysis and comparison of the fundamental neural network models which
if blended can be used to treat complex scene analysis situations.
Part 3 presents a number of important applications of
microprocessors to manufacturing systems. In Chapter 11. Weilert
provides a compilation of issues regarding the impact of microprocessors
on data acquisition systems that involve sensors. acquisition circuitry and
control units. process control measurement devices. etc. The discussion
is centered on the practical rather than the theoretical aspects of these
topiCS. In Chapter 12. Altintas deals with the utilization of
microprocessors in unmanned machining systems particularly with
modular computer numerical control ( CNC). The discussion is based on
a detailed analysis. design and performance evaluation of a modular
research CNC system for a milling machine. In Chapter 13. Richard. Le
Page and Morel provide a thorough study and implementation of a
programmable control loop of a flexible manufacturing cell. Both the
functional and stuctural requirements (distributed control architecture.
communication architecture) are examined. and an experimental
laboratory cell is realized and tested. In Chapter 14. Hodaie. Chan.
Szabados and Storoschchuk discuss the importance and applicability of
microcomputer-based Local Area Networks ( LANs ) to the factory floor
that involves robotic systems and programmable logic controllers ( PLCs ).
Their discussion includes design and technical issues and is supported by
a description of a system of this type actually implemented. In Chapter
15. Aiello. Rekowski. Bodson. Chiasson and Schuerer deal with the
application of nonlinear controllers for the fast and accurate positioning of
permanent magnet ( PM ) stepping motors. They include a thorough
mathematical modelling and analysis. as well as a successful
implementation based on the Motorola DSP56001ADS digital signal
processor. a PC. two PWM amplifiers and a PM stepper motor. In Chapter
16. Lowe and Lewis show how a Kalman filter ( KF ) can be used for
estimating the state of a disk drive head positioning mechanism. They
describe a DSP implementation together with experimental results which
xiii

show that the KF needs less than 2% of the computational bandwidth of


the DSP at a 250 I.I.S sampling period. The book ends with Chapter 17
where Haddock and Kim present and discuss a generic simulation model
( called RENSAM ) for manufacturing systems that was developed at
Rensselaer Polytechnic Institute.
Taken together the above contributions provide a well balanced and
representative picture of the capabilities of present day microprocessors
to carry out efficiently the computations involved in real-scale robotic and
manufacturing systems. Their efficiency was shown to be much better if a
parallel approach together with appropriate communication concepts and
technologies are used.
The book is appropriate for the researcher and professional in this
field. but it can be used profitably by postgraduate and senior
undergraduate students working in this area.
I am grateful to all contributors for their excellent up-to-date
contributions and for their help throughout the lengthy publication
process. I hope that this book will really achieve its goal by providing an
important addition to the current literature of microprocessor-based
systems. Finally. I would like to address my thanks to the editorial staff of
Kluwer AcademiC Publishers for their continuous support in my efforts.

July 1991 Spyros Tzafestas


CONTRIBUTORS

AIELLO M. Aerotech. Inc .• Zeta Drive. Pgh. PA, U.S.A


ALTINTAS Y. British Columbia University. Vancouver.
B.C .• Canada
ARAUJOH.J. Coimbra University. COimbra. Portugal
BATISTAJ. Coimbra University. COimbra. Portugal
BENALIR Ecole Polytechnique de Montreal. Montreal. Canada
BIILINGSLEY J. Portsmouth Polytechnic. Portsmouth. England
BODSONM. Carnegie Mellon University. Pgh. PA, U.S.A
BOYERKL. The Ohio State University. Columbus. Ohio. U.S.A
CHANB. Mc Master University. Hamilton. Ontario. Canada
CHIASSONJ. Pittsburgh University. Pgh. PA, U.S.A
DE ALMEIDA AT. COimbra University. COimbra. Portugal
D'HOLLANDER E. State University of Ghent. Ghent. Belgium
DIASJ.M. Coimbra University. COimbra. Portugal
DI STEFANO A Catania University. Catania. Italy
GODDARDRE. The Ohio State University. Columbus. Ohio. U.S.A
HADOOCKJ. Rensselaer PolytechniC Institute.
Troy. New York. U.S.A
HEMAMI H. The Ohio State University. Columbus. Ohio. U.S.A
HODAIEP. Mc Master University. Hamilton. Ontario. Canada
KASAHARAH. Waseda University. Tokyo. Japan
KIM. Y.C. Rensselaer Polytechnic Institute.
Troy. New York. U.S.A
LEPAGEF. Centre de Recherche en Automatique de Nancy.
Vandoeuvre les Nancy. France
LEWISF.L. The University of Texas at Arlington. Fort Worth.
Texas. U.S.A
LOWEJA The University of Texas at Arlington. Fort Worth.
Texas. U.S.A.
MARTINS J.L.
DECARVAUIO Porto University. Porto. Portugal
MIRABELLA O. Catania University. Catania. Italy
MORELG. Centre de Recherche en Automatique de Nancy.
Vandoeuvre les Nancy. France
NUNES U.C. Coimbra University. COimbra. Portugal
O'SHEAJ. Ecole Polytechnique de Montreal.
Montreal. Canada
REKOWSKIR Aerotech. Inc .• Zeta Drive. Pgh. PA, U.S.A
RICHARDJ. Centre de Reserche en Automatique de Nancy.
Vandoeuvre les Nancy. France
SCHUERERD. Pittsburgh University. Pgh. PA, U.S.A
STOROSCHCHUK O. General Motors of Canada. Oshawa. Ontario. Canada
SZABADOSB. Mc Master University. Hamilton. Ontario. Canada
TENREIRO
MACHADOJA Porto University. Porto. Portugal
WEILERT AM. Hewlett-Packard. Colorado. U.S.A
ZHENGY.F. The Ohio State University. Columbus. Ohio. U.S.A
xv
PART!

ROBOTIC COMPUTATIONS
CHAPTER 1

COMPUTATION OF ROBOT DYNAMICS BY A


MULTIPROCESSOR SCHEME

Yuan F. Zheng
Hooshang Hemami
Dept. of Electrical Engineering
The Ohio State University
Columbus, OH 43210

ABSTRACT. The computation of applied torques in real time in the dynamic con-
trol of robot systems is complicated and time consuming. The Newton-Euler state
space formulation is used for computing the dynamics. By using this formulation,
the backward recursion for calculating angular velocities and angular accelerations
is eliminated. The calculation of lineal' acceleration is simplified. It involves only
two steps. This reduces the height of the evaluation tree in calculating the applied
torques which makes the parallel processing more effective. A multiprocessor sys-
tem composed of a central CPU and a group of satellite CPU's is suggested for
implementing the computations. The task of each satellite CPU is t.o take care of
one link of the system by calculating all it.s related data. The task of the CPU is to
compute the applied torques. Because of this arrangement, t.he required software
system is the same for all satellite CPU's. The proposed multiprocessing scheme
results in a flexible and modular system which is adaptive for a variety of dynamic
configurations. Computer simulation results of this strategy are presented to show
that the suggested multiprocessing scheme achieves a good speedup factor over the
uniprocessing system.

I Introduction

Robotic systems are widely used for industrial automation. The dynamics of
the system are not taken into account in the design of control strategy for the ma-
jority of today's industrial manipulators [1-4J. However, in cases of fast motion
and mechanical configurations with strongly coupled elements, the dynamics of the
system are needed in the derivation of control forces [5J. Because of the highly non-
3
4

linear ano coupleo charader of mll1t.ihody lin kage systems, the dynamics eq11flt.ions
of the system are very complicated. Early formulations of dynamics were based on
Lagrange's equations[6].
Lagrange's formulation is inefficient in computing the applied torgues when po-
sitions, velocities and desired accelerations are given [7,8]. As a result it is very
time consuming in real time control cases. Effort.s have been made to render the La-
grangian dynamics comput.ationally feasible. The Coriolis and the centrifugal forces
are the greatest calculation burden in the dynamics. A frequent.ly used method is
to eliminate these furces and consequently simplify the dynamics. However, t.he
contributions from Coriolis and centrifugal terms are significant. at fast speeds of
movements. Ignoring these t.erms will result in notable errors in the computed
t.orgues. Several met.hods have been proposed t.o raise t.he efficiency in comput.a-
tion of the applied torgues. Luh et al [7] proposed an approach based on recursive
Newton-Euler formulation of the dynamics. This method greatly reduces t.he cal-
culation time. The Lagrange formulation for a six-joint robot manipulation needs
66,271 multiplications and 51,548 additions. The Newt.on-Euler method requires
only 8.52 multiplications and 738 additions[I]. Later, Hollerbach[8] suggested a
recursive Lagrangian formulation for manipulator dynamics. His method requires
2,195 multiplications and 1,719 additions if 3 x 3 rather than 4 x 4 t.ransformationma-
trices are used. It seems that the recursive Newton-Euler is more efficient than the
recursive Lagrangian formualtion. However, Silver[9] showed that. with the proper
choice of the represent.ation for the rotational dynamics, the recursive Lagrangian
formualtion requires exactly the same computation as the recursive Newton-Euler
formualtion.
The recursive method as developed in [7] and [8] appears to be computationally
efficient. The computation is only feasible for real-time purposes with computers
that have reasonable computation power such as PDP 11 's[7]. It is desirable that
the computation of the applied torques for multi body systems be executable by
low cost microcomputers. To achieve real time control with microcomput.ers that
have limit.ed computat.ional power requires furt.her improvement of comput.ation ef-
ficiency. No method other than the recursive methods of [7] and [8] can offer better
computational efficiency. Computation parallelism then becomes the natural and
the best choice since parallelism can offer increased speed of computation without
the improvement of operation speed for individual hardware components. Recently,
5

Zakharov[10] summarizeo t.hree mf't.horls for parallelism, namdy pipf'lining, t.hf' lise
of multiple functional units within a uniprocessor, and the system with many pro-
cessors. The first and the second methods need specially designed processors. With
the complicated computation task of system dynamics, the design of these types of
processor is certainly a challenging problem. Lathrop proposed a parallel process-
ing algorithm for low-pair kinematic chains, suitable for special-purpose hardware
of VLSI[llJ. However, the design and manufacturing of such a processor is very
expensive. On the other hand, with the advent of low cost micro-processors, the
third method, multiprocessor systems become very attractive.
Luh and Lin[12] first proposed a parallel processing system which utilizes one
CPU for each link of a robot manipulator. The recursive Newton-Euler method
was employed for computational algorithm. For a multiple processor system, the
scheduling of parallel computation is of central importance. It directly affects com-
putational efficiency of the parallel processing system. The main efforts in [12]
were actually devoted to finding an optiomal schedule for the parallel computation.
Unfortunately, the recursive method itself inherently possesses serial computations
that cannot be executed concurrently, even though more than one CPU is available.
The result is that an optimal scheduling is hard to find and the improvement is very
limited. In the recursive methods there are two recursions: the backward and the
forward. The backward recursion propagates angular velocities, angular velocities,
angular accelerations, linear accelerations, total link forces, and total link torques
from the base to the end link[7]. The forward recursion propagates the forces and
applied torques exerted on individual joints from the end link of the manipulator to
the base[7]. Because of these two recursions, the dynamics behavior of one link af-
fects that of its adjacent links. Since each CPU is responsible for calculating all the
above parameters of the corresponding link it has to idle until its required data are
sent in by its adjacent CPU's [12J. This serial precedence imposes an optimization
problem in the assignment of computational subtasks to individual CPU's. A so
called "variable" branch-and-bound methods was developed in [12J, which schedules
the computation of tasks by distributing the load in some sequential order among
the CPU's under the series-papallel constraints, such that the time for completing
the computation is minimum. However, because of the two recursions, the im-
provement of computational efficiency is greatly affected by the series constraints
and the heavy communication among CPU's. The method is based on "search and
6

try" pro('ednre, it. t.akes great. effort to determine the final minimum-time schedule.
Furthermore, the subtasks to be processed have different sequences in individual
CPU's. Consequently, adding or dropping one link of the system will totally dis-
turb the previous schedule and the entirely complicated procedure will have to be
restarted again.
In this paper, a new multiprocessing scheme is suggested for the parallel compu-
tation of llluitibody system dynamirs. The Newton-Euler state space formulation
developed by Hemami[13] is utilized to compute the dynamics of the system. The
latter formulation keeps the computational efficiency of the recursive Newton-Euler
formulation. The backward recursion for angular velocities and angular accelera-
tions, however, are eliminated. The computation of linear velocities is not required
in the Newton-Euler state space formulation. The backward recursion for linear
accelerations is simplified to only two steps of a series precedence. Furthmore,
the multiprocessor system suggested has a central CPU in addition to the satellite
CPU's. The applied torgues are calculated in the central CPU such that heavy
communication among satellite CPU's is avoided. Because of the above considera-
tions, the parallel processing becomes more effective and the computation time may
be further reduced. Even more important is the fact that the software is the same
for each satellite processor.

II Tree-Height Reduction by Newton-Euler State Space Equations

For a multi body system of n joints, it is assumed that the system has a sufficient
number of sensors to determine at any time the joint position and velocities. Given
the desired joint accelerations, it is the goal to compute general torgues at each
joint. The whole data processing procedure is divided into a number of tasks. If
each task is regared as an atom, the computation procedure can be expressed as a
tree[14]. For example, consider the evaluation of a third order polynomial:

f(x) = ao + ;r(nl + X[02 + xa!l]) (1)

If each arithmetic operation is considered as an atom, the evaluation of f(:r) can


be depicted as a parse tree (Fig. 1). One can see from Fig. 1 that the tree requires
six steps for the evaluation of f( x). Thus the tree has a height of six. Since no
two steps of the tree can be executed simultaneously, the computation cannot be
7

ST~
STEP 5

STEP

STEP 1

Fig. 1: Tree Expression of Eq. (2)


STEP n
CALCULATE Wn
qn qn
STEP n-l
CALCULATE Wn-l
qn-l qn-l
STEP 2
CALCULATE W 2
q2 q2
STEP 1
CALCULATE W l
ql ql
Fig. 2: Tree expression for calculation of Wi's by using recursive formulation.
8

speeded IIp by using more t.han onp prOCPSSOf. TherefofP, t.o fPflllCP computation
time, one must first reduce the height of an evaluation tree so that each step in the
tree could have as many atoms as possible.
In t.he recursive Newton-Euler formualt.ion, t.he three-height. is a large number.
For example, the angualr velocities and angualr accelerations are expressed in the
recursive Newton-Euler method[7] as follows:

Wi - 1 + Zoqi-l,
(2)
Wi - 1 + Zoqi-l + Wi - 1 X Zoqi
where Wi is the angular velocity of link i a 3 x 1 vector; qi is the joint velocit.y of joint
i-a scalar and Zo = [00 l]T. The evalutation of Wi and Wi needs Wi - 1 and W i- 1 of
the previous link. To calculate the angular velocities and accelerations of all links,
the tree expression is as shown in Fig. 2. This kind of tree structure is not desired
since there is only one atom in each step. Additional processors do not speed up
the computation. The same situation applies in evaluating linear velocit.ies, linear
accelerations, total forces and tot.al torgues[7]. As a result, calculat.ing t.he final ap-
plied torgues for a multi body system with n links needs more than n st.eps; and, the
task scheduling becomes extremely difficult. For any multibody system with more
than five joints, it is very hard to find an optiomal schedule on a digital comput.er
with a reasonable amount of computing t.ime [12]. In t.he Newton-Euler stat.e space
formulation of the system dynamics suggested in [13], two coordinat.e systems are
assumed for each link - an inertial coordinate syst.em and a body coordinate system.
The origin of the latter is located at the center of gravity and t.he axes are among
the principal axes of the body (Fig. 3). Let the state variables be Xi, .Y";, 0i and
Wi; where (Ji and Wi are the Euler angles and the angular velocities expressed in
the body coordinate system. The Newton-Euler State Space equations for each link
are [13].

Oi B(O;)Wi ,
Wi J i- 1 [J(Wi ) + N;],
(3)
Xi ..fYi,
and Xi M-1Fi

where Mi is the mass matrix, Ji is t.he moment of inertia matrix, and N; and Fi are
the total forces and total torgues acting on link i, as defined in [7]. The matrices
B(O;) and vector J(W;} are
9

x 3.
1

o
x 2 .
1

k.
1

Fig. 3 : Body coordinate system and inertial coordinate system.


10

- sin Bi3 - cos Bi3


~---
0 -_._--
cos Bi2 cos Bi2

B( B;) = (4)
cos 8i3
° sin8 i3

tan 8iZ sin 0;3 1 - tan BiZ cos 0;3

From Eq. (3) one can obtain Wi and TV; without the backward recursion of
Eq. (1):

Wi = B- 1 (Oi)8;,

and

(6)

where e is the desired acceleration and Oi and 8 are the states of the
i i ith link which
may be detected by appropriate gyroscope type of sensors. Alterantively, angle g
and joint angular rate q (Fig. 4) may be sensed. For example, in the two-link system
of Fig. 4, link 1 and link 2 each have one degree of freedom. Link 1 can rotate along
the fixed axis X03 on the platform. Link 2 can rotate along 01Xl1. The joint angle
q1 is defined as the angle between axes X 02 and X l2 . The joint angle q2 is defined
as the angle between axes 01X13 and 02XZ3' Let Oll, 012 and 813 be the Euler
angles of link 1, and let 821 , On and 023 be the Euler angles of link 2. It follows
that 011 = g1, 8 12 = 0, and 013 = 0, and B21 = g1,
On = g2, and 823 = 0. Generally,
the dependence of the Euler angle 8 on the joint angle q can be described by a set
of parametric relations as follows:

, i = 1,2, ... j (7)

where the elements of matrix C are constants (0, 1, and -1), j is the number of
links in the system, and n is the total degrees of freedom of the system. If each
joint has only one degree of freedom, j is equal to n. With the calculation of Bi and
Bi
11

x 23

o
./ ~ X 02

Fig. 4.a : A t.wo-link system.


12

as st.ep 1, t.he eval11at.ion of TVi 11$ f'tf'P 2, 1111" rompl1t.f1t.iOll trpe will havp f1 h{'i~ht.
of 2 (Fig. 5). Furthermore, since the calculation of Wi does not. require knowledge of
the angular velocity of the other links, the tree height is independent of t.he number
of joints. The same situat.ion happens during the calcualtion of Wi.
In order to calculate the external force Fi on link i, one needs to knows Xi in
Eq. (3). For any two connected links i and i +1, there exists a holonomics connection
constraint (Fig. 3) expressed by

(8)

where the terms are

Ai( (}i) 3 x 3 matrix, transform a vector from the body coordinate syst.em to the
inertial coordinate system,

Ki vector from the center of the body coordinate origin to the upper joint (along
body axis X 3i ),

Li vector from the center of the body coordinate origin to the lower joint (along
body axis X3;)'

By differentiating Eq. (8) twice, one gets

(9)

and

(10)

where WWi is a 3 X 3 skew symmetric matrix whose elements are related to the
vector Wi as follows [13]:

WWi= [ W.~ (11 )


-W.2
13

STEP 1

x
2

Xl ,
X i3

X i3
,,
X i2

x~2
S~rEP 2
0

,,,

STEP 3

" ,

Fig. 4.b : Euler angles.


14

STEP 2
CALCULATE W. s ~

STEP 1

ql qn
CALCULATE e. and e.
. .•
~ ~

ql qn
q2

q2

Fig. 5 : Tree expression for calculation of W/s by Newton-Euler state space


formulation.
15

The t.f.'rmfl f( TC IInrl [,"; are Rimilarly rlefl1lf'd. H appearfl t.hllt to (,R]rJ.JRte of; hy
Eq. (10), backward recursion is still required. But it actually takes only two steps
to complete the computation of Xi. The computation of Xi may be arranged as
follows.
Let

(12)

and

(13)

The linear accelerations of all the links can then be expressed as

Xl = --AWKI'
X2 AW(LI - Kd - AWK2 ,
(14)

where

AW(Li - K;) = AWL i - AWKi .

In the first step, AWKi required by Xi, AW(Li - K;) required by X i+1, X iH ,···
and X n , are calculated. In the second step Xi is then calculated by Eq. (14). Since
all the WIs and Wis are available at the same time as it was pointed out earler,
the computation of AW Ki and AW( Li - I() for all the links can be carried out in
parallel. However, the calculation of Xn needs more time to be completed than those
of other linear accelerations. It appears that the value of XIs are not all available
at the same point of time. One may notice that the second step represented by
Eq. (14) only has a number of additions varying from zero for link 1 to (n - 1)
for link n. Clearly, step 2 needs much less computation time than step 1 does,
which includes matrix multiplications. Therefore step 1 is significant enough to
represent the parallel computation of the linear accelerations. The tree structure
for computation of all XIs is now as in Fig. 6. One may recall that in the recursive
Newton-Euler method of [7], the angular velocities and angular accelerations are
calculated one by one. The calculation of the linear accelerations, which needs
16

X2

STEP 2

STEP 1

Fig. 6 : Tree expression for calculation of Xi'S.


information of angular velocities and angular accelerations, can only be done by
one as well. This will result in a tree structure like that of Fig. 2, and additional
processors do not reduce the computation time.
With the above discussion of the tree-height reduction, one can now summarize
the procedure for computing Fi and Ni in the following steps.

Step 1: Calculate Oi and iJi by Eq. (7)

Step 2: Calculate Wi by Eq. (6)

Step 3: Cnlc111ntf' TTT; hy Eq. (fl)

Step 4: Calculate AW K;, AW(Li - K;} by Eq. (12) and Eq. (13)

Step 5: Calculate Xi by Eq. (14)

Step 6: Calculate Fi by Eq. (3)


17

St.ep 7: Cnlrlllnt.p Ni hy F,q. (:3)

As pointed out previously, there do not exist recursive relations between the links;
all the F!s and NIs can be computed in parallel by more than one processor.
To calculate Ti-the general torgue for link i, the following equations are em-
ployed[13].

(15)

(16)

( 17)

(18)

(19)

where 11, is the last link of the system. There exists a forward recursion. If Eqs. (15)
to (19) are assigned in one atom, the tree structure after calculation of .{Yfs is as
shown in Fig. 7.
The dynamics of a multi body system has now been assigned an evaluation tree
by using Newton-Euler state space formulation. Notable tree-height reduction has
been achieved. If each atom is regarded as a task, t.he next step is the task allocation
among the multiprocessors. This will be discussed in the next section.

III The Multiprocessor System and the Task Allocation

The evaluation tree for calculating the applied torques was derived in the last
section. The tree height is significantly reduced by using Newton-Euler state space
formulation. To effectively execut.e the parallel ('ompllt.ation as schecluled, the ar-
chitecture of the multiprocessor system and the task allocation are also important.
The first point of interest about the architechture is the number of CPU's. The sys-
tem, specially designed for the parallel computation of the applied torgues, should
have 11, satellite CPU's (one CPU for each link) and a central CPU. Later it will be
made dear that the addition of the central CPU greatly reduces the COllllllunic.ation
18

Fig. 7 : Tree expression for calculation of general torques.


19

CENTRAL
PROCESSOR
I/O

+ t+
(2) SATTELITE (2) SATTELITE (2) SATTELITE
PROCESSOR PROCESSOR PROCESSOR

- I/O
t
-- +
I/O

+
"-
t
I/O

r r
Fig. 8 : Block diagram of multiprocessor system.

among satellite CPU's and also makes the task allocation much easier than with-
out the central CPU. The second aspect concering the system architect.ure is the
coupling among the CPU's. A multiprocessor system can be either tightly coupled,
meaning that the system has a global memory shared by all the CPU's, or loosely
coupled, meaning that the system has no global memory [15J. For a tightly coupled
system, communication between processors occurs through global memory. A means
must be provided to resolve the conflicts for the access of the shared system. For a
loosely coupled system, the communication is done through input/output channels
[15J. For the purpose of simplicity and modularity, the loosely coupled system is
chosen. Each satellite processor of the system should have local memory and enough
I/O capability. The general block diagram of the suggested multiprocessor system
is shown in Fig. 8.
With t,hE" evalmltion t.rE"e and t.he sperifif'd systf'm Sh·l1rt.lll'e, the task allnration
becomes easy and obvious. In the last section, it was stated that F; and Ni could be
calculated in the seven steps. All the steps are the same for every link of a 111ult.ibody
system, except for step 5 in which the last link (link 5) needs 1110re additions in the
20

comput.at.ion of t.hf' linf'ar aC(,f'lf'rat.ioll. Howf'vf'r, if (1 IJ) is modifiNl R..s

Xi = AW(L i - n+1 + K;-n+1) + AW(L i - n+2 + K i - n+2) + ...


+AW(L i _ 1 - Ki - 1 ) - AWKi , i = 1,2, . .. n (20)

with

AW(L. _ K.) = { 0 j ~0
J J AW(Lj _ K j ) j >0
Step 5 will be the same for all the links in terms of computation time. If each step is
considered as a task, all the seven tasks for one link are assigned to the satellite CPU.
Since there are as many satellite CPU's as the number of links, all the calculation of
Fis and NIs can be carried out in para.l1el and completed simultaneously. Anot.her
advantage is t.hat the program in every satellite CPU is exactly the same. This
unique feature of the software plus the loosely coupled structure of the hardware
make the mechanism easy to be adopted to different robotic systems.
The final goal, however, is to calculate the applied t.orgue TIs. The calculation
of Tis has been assigned in an atom. If the task is allocated at one of t.he satellite
CPU's, two problems will arise:

1. Software uniformity for satellite CPU is destoryed.

2. Excessive idle time for the ot.her processor may result.

One approach is to furt.her divide the task into small subtasks and distribute them
among satellite CPU's. This may reduce idle time, but software uniformity is de-
stroyed and heavy communication between satellite processors arises.
An alternative approach is to allocate the task to central CPU. As a result,
the software uniformity for satellite CPU's is maintained and heavy communication
between satellite CPU's is reduced. Most of the necessary communication is between
the central CPU and satellite CPU's. It may be noticed that after satellite CPU's
complf't.e t.he calculat.ion of F;'s and N,'s and iwno t.llf'm t.o t.hf' cf'ntral ('PTT, f1lf'Y
are totally idle until the computation in t.he central CPU is completed. To reduce
the idle time, one has to speed up the computation in the cent.ral CPU. The task for
calcualting Ti's is divided into two subtasks. Substask 1 includes every evaluation
which does not require the information of F/s and N:s; therefore, subtask 1 can
be executed in parallel with t.he execution of t.asks in sat.ellite CPU's. Subtask 2
21

SA TTElliTE CPU

EXECUTION TIME
COMMUNICATION
FOR STEP 1 TO 4 TIME EXECUTION TIME IDLE TIME

nT C1

CENTRAL CPU
EXECUTION TIME
FOR SUBTASK 2
EXECUTION TIME
FOR SUBTASK 1 IDLE TIME COMMUNICATION
TIME

t 3
nT c2

Fig. 9 : Time scheduling for central CPU and satellite CPU's.

finishes the stage of calculation which needs Fi'S and Ni's. For example, Eq. (19)
is used to calculate Tn. It could be rearranged as

(21)

The evaluation of A~_l An, K K'!; and LLnA~ in Eq. (21) is assigned into subtask 1.
With these terms known, less time is required to evalua.t.e Tn by subtask 2. Conse-
quently, the idle time of the satellite CPU's is much reduced. The execut.ion times
for the satellite CPU's and the central CPU are described in Fig. 9.
22

IV Performance Evaluation

The performance of a multiprocessor system is evaluated by a speedup factor


Sm. Let tm be the time required to perform some calculation using more than one
processor and tu be the time required by a single processor. The speedup factor of
the m-processor calculation over a uniprocessor is defined as

S -~ (22)
m - tm

Suppose that there are n links in the multi body system and one processor. To
calculate Fi and N i , the processor is required to execute step 1 through st.ep 7.
From Fig. 9, the time consumed by this calculation is

(23)

where

To calculate all the FIs and NIs for the system requires

(24)

The time required to calculate applied torgues is

(25)

where t~, t4 and t5 are denoted in Fig. 9. Since t~2 is the time required for all n
applied torques, it is directly proportional to n. Eq. (25) can then be expressed as

where

T3 = ~ and T4 = i5 - t4 .
n n
The total time required is then

(26)

If multiprocessor system is used, the time required is


23

The time interval ts includes COllllllUication time between satellite CPU's, t2 - tt,
and that between satellite and central CPU's, t4 - t3. Both are directly propotional
to (n - 1) and n respectively. They may be expressed as

and

For small values of n, one could have

(27)

The total time required by the multiprocessor system is

The speedup factor is

Sm= t::= n(T1 +T2 +T3 +T4 ) (28)


t~ Tl + T2 + n(Tc1 + Tc2 + T4 ) - Tel
where all the parameters are summarized as follows

Tl Execution time for calculating AWJ(i and AW(Li - J(i).


T2 Execution time for calculating Xi. F; and N i.
T3 Execution time for subtask 1 when n = 1.
T4 Execution time for subtask 2 when n = 1.
Tel = Communication time for sending AW(Li - J(i)'
Tc2 Communication time for sending Fi and N i .
n Number of links multilillk system.

For large values of n, one may have

(29)

The total time required becomes


24

The sppedup factor is

Tl + T2 + T3 + T4 (31)
Tc2 + T3 + T4
One may notice that Sm is now a constant and has nothing to do with t.he number
of links. But Tl and T2 are much greater than Tc2 , T3 and T4 as it will be seen in
the next. section, the speedup fact.or will always be greater than 1.
The numerators in both (28) and (31) are fixed for any particular value of n. To
increase the speedup factor, one may reduce t.he denominators in which t.he COlll-

munication times should be as small as possible. The interprocessor communication


time is the bottleneck of a multiprocessor system. The time intervals Tel and Tc2
are the times required for sending only 3 x 1 vector. Therefore, there does not exist
heavy communication between the processors.

V Parallel Processing for a Two Link Robot System

To test the effectiveness of the parallel processing strategy discussed in t.he


previous sections, a Fortran program was writt.en and applied to a two link robot
system (Fig. 10). The reason for chossing two link system is because the nUlllber of
links will not affect execution time for calculating Fis and NIs; however, the small
number of links may render simplicity of simulation. Both joint.s J 1 and J 2 have
three degrees of freedom and J 1 is fixed on the platforlll. The dynamic equations
of this two body system are

li+Ni'
BiWi ,
(32)
Xi,
7n..'Y; Fi , i = 1,2

r2 - 7n2g,
Af A2(N2 + h - J( J(2 Afr 2), (33)
and
Tl A 1 (T2 + Nl + 11 - J(J(lr 1 + LL 1 Afr2).
25

J2

Fig. 10 : A two-link robot.

Eqs. (32) is used to calculate FIs and NIs. The calculation is divided into seven steps
as indicated previously, and tasks for each link are allocated into two processors.
Eqs. (33) is used to calculate the final torgues. The task of calculating T/s is divided
into two subtasks. Subtask 1 includes:

1. Calculat.e Ai = AT .4 2 , a 3 x 3 matrix.
2. Calculate K A2 = K K2Af, a 3 x 3 matrix.

3. Calculate K Al = K K1Af, a 3 x 3 matrix.

4. Calculate LA = LL1A[, a 3 x 3 matrix.


26

Suht.ask 2 inclndes:

2. Calculat.e r2 = F2 + m~2g, a 3 x 1 vector.

The program is execut.ed on a multiprocessor syst.em developed at The Ohio State


Unversity. It is a loosely coupled system which has the sallle structure as the one
discussed in the last section. The details of the system are not presented here. A
special program called int.erprocessor communication mudule is used t.o send dat.a
between the processors. It. takes 22.65 fl8ec for the transmitting task to transmit a
two-byte datum. The receiving task needs 2.5.05 flsec to read in a two-byte satum.
The transmission of AvV( Li - 1\i) by the satellite CPU's involves a three element
vector. As a result

Tel = (22.05 + 25.0,5) x 3 = 143.1 flsec = .143 msec

The data transmitted from satellite CPU's to the central CPU are F/s and Nfs.
Each is a three element vector. Therefore, the total time of Tc2 is

Tc2 = (22.65 + 25.05) x 3 x 2 = 286.2 flsec = .286 msec

The execution time of the task itself is distrihuted as follows

Time required by executing step 1 to step 4:


Tl = 38 ms.

Time required by executing step 5 to step 7:


T2 = 22 ms.

Time required by executing subt.ask 1 in the central CPU:


2T3 = 30 ms.

Time required by executing subt.ask 2 in the cent.ral CPU:


2T4 = 20 ms.
27

Since

2T3 30111,8 < 38111,8 + .143111,8 + 22111,8 + 2 x .286111,8


Tl + Tel + T2 + 2Tc2 ,

(28) is employed to calculate the speedup factor:

--~--
2{38+22+15+10)
__.
._--_.. - - ----_. ------
-------------~-----------~ --

38 + 22 + 2 x (.143 + .286 + 10) - .143


2.106.

All the above execution times include the evaluation of trigonometric functions. If
one implements is a table lookup for these complex function, execution time can be
further reduced.
The above executive and communication times can be used to find the speedup
factor for a general industrial manipulator with six joints. Since

6T3 90111,8 > 38111,8 + 5 x .1437n8 + 22111,8 + 6 x .286111,8


Tl + 5Tc1 + T2 + 6Tc2

Eq. (31) should be used to calculate out the speedup factor:

S = 38. +......20 + 15. +..10 = 3.36


m .286 + 15 + 10
Thus the speedup factor for an industrial robot with six links is 3.36. The number
of processors is seven. For cheap processors, reducing computation time by a factor
of 3.36 is significant.

VI Conclusion

By using Newton-Euler state space formulation, the height of the evaluation


tree of the applied torgues is reduced. Because of the tree-height reduction, parallel
processing becomes more effect.ive ann t.he comp1lt.at.ion t.ime may he reduced. A
multiprocessor system is especially designed for the parallel processing of the multi-
body system. One CPU, called the satellite CPU, is assigned for the link of the
system to calculated Fi and N i . Another CPU, the central CPU, is assigned to cal-
culate all the applied torgues. All the programs in individual CPU's are identical.
This architecture makes it easy to add or drop links to a multi body system. By
28

using t.h~ cf'nt.rn.l CPlT, hC'n.vy comnnmicatioTl haffic het.ween t.hf' ~n.t "llit ... nplT'~ is
much reduced, and the complicated process of task allocation among the satellite
CPU's is avoided.
The parallel processing strategy developed here is applied to a two link robot
system. The speedup factor is 2.106. The simulation results are generalized to
industrial robots with six links. A speedup factor of 3.36 is obt.ained.
29

REFERENCES

[1] J .Y.S. Luh, "An Anatomy of Industrial Robots and Their Controls," IEEE
Trans. Automat. Contr. Vol. AC-28, no.2, pp. 133-153.

[2] Y.F. Zheng and F. R. Sias, "Design and Motion Control of Pratical Biped
Robot.s," International Journal of Robotics and Automation, Vol. 3 , 110.2, pp.
70-78, 1988.

[3] D.E. Whitney, "Resolved Motion Rate Control of Manipulators and Human
Prostheses," IEEE Trans. Man-Machines Syst., Vol.MMS-10, no. 2, pp. 47-53,
June 1969.

[4] C.S.G. Lee "Robot. Arm Kinemat.ics, Dynamics and Cont.rol," IEEE COM-
PUTER, PP.62-80, December, 1982.

[5] M. Vukobratovic, and D. Stokic, "Is Dynamic Control Needed in Robotic Sys-
t.ems, and, if so, to What Ext.ent?", The International Journal of Robotics
Research, Vol.2, no.2, Summer 1983.

[6] R.P. Paul, Robot Manipulators: Mathematics, Programming, and Control,


Cambridge, Ma., M.LT. Press, 1981.

[7] J .S. Luh, M. Walker, and R. Paul, "On-line Computational Scheme for Mechan-
ical Manipualtors," Trans. ASME, J. Dynamics Syst., Meas. Contrl., Vol.102,
pp.69-76, June 1980.

[8] J .M. Hollerbach, "A Recursive Lagrangian Formulation of Manipulator Dy-


namics and a Comparative Study of Dynamics Formulation Complexity," IEEE
Trans. Syst., Man, Cybern., Vol. SMC-10, no.11, pp.730-736, November, 1980.

[9] W.M. Silver, "On the Equivalence of Lagrangian and Newton-Euler Dynamics
for Manipulators," The International Journal of Robotics Research, Vol. 1 , no.2,
Summer 1982.

[10] V. Zakharov, "Parallelism and Array Processing," IEEE Trans. Computer, Vol.
C-33, no. 1, pp 45-78, 1984.
30

[11] n. H. LII.throp, "Parallplis111 ill !\ rms lI11rl Lf'gs", l\f.S. Thf'sis, Till" l\f IIssadl11 Sf'Hs
Institute of Techuology, December, 1983.

[12] J .Y.S. Luh, and C.S. Lin, "Scheduling of Parallel Computation for a Computer-
Controlled Mechanical Manipulator," IEEE Trans. Syst., Man, Cybern. Vol.
SMC-12, no. 2 , pp 214-234, Marchi April, 1982.

[13] H. Hemami, "A State Space Model for Interconnected Rigid Bodies," IEEE
Trans. Atdoma. Contr., Vol. AC-27. no.2, pp 376-382, April, 1982.

[14] D.J. Kuck, "A Survey of Parallel Machine Organization and Program," Com-
puting Surv£Y:J(ACMj, Vol. 9 , no.l, pp. 29-.59, March, 1977.

[1.5] B.H. Liebowitz, "Multiple Processor Minicomputer System Part 1: Design


Concepts." Compute!' Design, pp. 87-9.5, Oct. 1978.
CHAPTER 2

AUTOMATIC SCHEDULING OF THE NEWTON-EULER


INVERSE DYNAMICS

Erik H. DlIollander
State University of Ghent
Department of Electrical Engmeenng
St.-Pietersnieuwstraat 41. B-9000 Ghent. Belgium

Abstract
The robust control of a robot arm requires the fast calculation of the dynamic
equations. Up to now the Newton-Euler formulation is considered to offer the most
efficient implementation, with a workload proportional to the number of links. However,
most low cost micro-controllers are unable to deliver the processing speed required for
operation in a real-time environment. Therefore several authors suggest a parallel
implementation. Using a careful manual decomposition of the taskgraph, speedups of
roughly equal to the number of links are reported in the literature. In this chapter
some general techniques for the automatic program decomposition and scheduling are
presented. These methods are embedded in the LEM Analyzer, a tool to generate
multiprocessor schedules from a high-level language program. A simulator allows to
measure the parallelism and estimate the performance of the execution. The automated
analysis is applied to the inverse dynamics problem. It compares favorably with a
manual decomposition and scheduling. Moreover, the LEM Analyzer holds the promise
to decompose and simulate many other robot-arm configurations in a large range of
multiprocessor configurations.

1 Introduction
The real-time control of a multiple link robot arm involves the computation of the inverse
dynamics. The dynamic equations of a n link manipulator describe the interaction between
the link control and the external forces. The Newton-Euler formulation of these equations
[19] provides a solution in O(n) steps. This makes a real-time control feasible, if the
resonance frequencies of the links remain below 60 Hz [20]. However, as noted in [26], a
60 Hz sampling rate is estimated too low for a realistic servo-control. In [22] a 300 Hz
frequency or 3.33 ms sampling period is recommended for the control of a PUMA robot.
A higher control loop frequency would even be preferred if possible. In this chapter we
will show that with the actual 32-bit micro-processors, a parallel execution of the inverse
dynamic equations is possible in .7168 IDS, giving a robust 1.395 KHz sampling rate.
Four approaches exist to speed up the computations. These are the development of
ASIC chips (Application Specific IC's), [2,15,9,16], pipelined array processors [25], a RISC
architecture [3] and several multiprocessor implementations [18,22,12,13,26,23].
31
32

The VLSI developments are bit-serial architectures [2,9,15,16] or vector processors [21].
They represent ad-hoc solutions for the specific nature of the robotics vector computations.
Pipelined array processors require large vectors to compensate for the expensive start-up
time of the pipe. Since most robotics matrices have a dimension of merely 3 x 3, this might
block out the pipelined systems all together. However, Wander and Tesar [25] devised a
special "base plus offset" addressing method which allows to align the data of consecutive
operations in long pipelines. This results in a total of 5.5 ms for the dynamics calculation.
A RISC approach for robot control is the JIFFE Very Large Instruction Word processor
[3]. Andersson uses two Arithmetic and Logical Units (ALUs) in parallel, fed with 200 bit
wide microcode instructions. An Am29331 program sequencer points into a 64K memory
address space. With 20 MFlops per ALU, the peak performance of the JIFFE processor
is 40 MFlops on 32 bit floating point numbers. The dynamic model of Izaguirre and Paul
[11] executes in 47.8 JJS, i.e. close to 20 MFlops.
Finally, several authors present multiprocessing solutions because it is natural to dis-
tribute the load of the relatively independent link equations over low cost micro-controllers.
For example, Luh et a1. [18] assign one controller per link to calculate the dynamics of the
Stanford arm. With 1 and 6 processors they obtain a processing time of 24.8 ms and 9.67
ms respectively, yielding a speedup of 2.56. Kasahara and Narita [12,13] schedule the tasks
on an arbitrary number of processors. Their modified critical path algorithm results in ex-
ecution times of 24.83 ms and 5.73 ms for 1 and 6 processors respectively, giving a speedup
of 4.33. Vukobratovic et a1. [23] use their own computer-generated numeric-symbolic dy-
namic model [24]. The calculations are organized into a many pseudo-independent subtasks.
These are distributed using a largest-processing-time-first list-processing algorithm [4]. For
the Stanford arm benchmark, the processing time is reduced from 28.7 ms to 4.81 ms with
6 processors, giving a speedup of 5.97. Recently, Wang et a1. [26] implemented the inverse
dynamics equations on a four processor MC68020jMC6881 system. Following a manual
decomposition of the Newton-Euler equations, the tasks are scheduled by the method of
Kasahara [12,13]. The measured execution times are 5.477 ms and 1.48 ms with 1 and 4
processors respectively, yielding a speedup of 3.70.
A comparison ofthe different speedup figures in the multiprocessing publications suggests
that the expected parallelism of the inverse dynamics equations is less than 10. The obtained
speedup critically depends on the task decomposition and scheduling method. In this
chapter we present several methods to automate the task decomposition and scheduling
of the inverse dynamic equations. These techniques are embedded in the LEM Analyzer
and apply to other models as well, such as the inverse kinematics or the joint coordinates
equations. The taskgraph is generated from a Fortran program describing the equations.
Then the task are scheduled using a simple but efficient list-scheduling algorithm. In this
way the inverse dynamics shows a speedup of 7.73 with 9 processors.

2 Task Granularity and Communication Overhead


When a program is parallelized, a task graph is built. The nodes denote the tasks to be
executed and the edges represent the dataflow and precedence constraints between the
tasks.
For optimal performance, a tradeoff between the task size and the parallelism has to be
33

made. Small tasks increase the parallelism as well as the communication overhead. At the
other hand, large tasks cause processor idling times. We define the following task graph
parameters.
The task granularity r is defined as the average duration of a task 'f divided by the
duration of the whole program E.

r='f/E. (1)
The communication density of a task graph, p, measures the amount of inter-task com-
munication, c, with respect to the effective processing time, E - c.

p = c/(E - c). (2)


The communication density is largely determined by the taskgranularity r and varies
between zero (no communication) and infinity (no processing). The critical path length, L,
is the execution time of the longest chain of sequential tasks in the taskgraph.
Define the speedup Si =tl/ti, where it and ti mean the execution times on 1 and i
processors, respectively. Since the execution time is at least L, the upper speedup limit of
a program with a total workload E is

Sm,IJZ = E/L. (3)


The performance of a multiprocessed program principally depends on the critical path
length, L and on the communication density, p. IT the communication overhead is ignored,
bounds for the speedup on a p-processor system are obtained as follows. The fastest exe-
cution occurs when the workload off the critical path, E - L, is carried out in parallel with
the critical path. This gives

tp,m,in = max(L,E/p) (4)


Sp,m,IJZ = min(p, E/ L). (5)
In the worst case, the precedence constraints force the E - L workload to start only
when the critical path is finished. To be consistent with the definition of largest path,
the non-critical tasks must be executable in zero time on an infinite number of processors.
Therefore, the E - L workload is called an impulse task [6]. The worst case execution time
and speedup on p processors then becomes

tp,m,IJZ = L + (E - L)/p (6)


Sp,m,in = E /tp,m.tJz
= E/[L + (E - L)/p] = p/[l + (p - l)/Sm.tJz]. (7)

The half-performance number of processors, Pl/2, giving a speedup Sp = E/2L, is

Pl/2 = W-1. (8)


34

~fficiency
1.0

0.9

0.8

0.7

0.6
0.0 0.1 0.2 0.3 0.4
~

Figure 1: Speedup reduction for communication densities p = 0···4. Processor, memory


combinations (p, m): a) 4 p, 1 m, b) 8 p, 2 m, c) 12 p, 3 m, d) 16 p, 4 m. For the critical
density p = f3erit, the efficiency varies between 77% and 79%.

The impact of the inter-processor communication is a function of the communication


density. The fraction of time a task communicates or executes is p/(1 + p) and 1/(1 +
p) respectively. Assume a balanced communication between p processors and m shared
memories. Then the m memory paths saturate when the memory access time of (p - 1)
processors exceeds the execution time of 1 processor, i.e. when (p - l)p/m ~ 1. This
happens for the critical communication density p = f3eril' with

f3eril = m/(p - 1). (9)


The communication overhead has been analyzed by a closed queueing network model [7].
This shows that near the critical communication level, f3eril' the communication bandwidth
remains between 77% and 79% of the maximum value (fig. 1). In some applications
however, hot-spots may cause a temporary but severe communication block, e.g., when all
processors try to update the same memory location.
35

3 The Newton-Euler Inverse Dynamics


The Newton-Euler equations have been formulated by Luh et al. [18]. The equations
describing the i-th link in a n-link manipulator are given below.

(I) A?Wi = Ai-l(A


i O
i_lwi-l + IiZOqi
·}
(2) veel li(A?_lwi-l} x (ZOq;)
(3) A?Wi A~-l[(A?_lWi_l + li(zoqi + veel}]
(4) vee2 Ai-l(A
i o ·
i_1Vi-l + (1-I )i zoqi.. }
(5) vee3 = (A?Wi) x [(A?Wi) x (A?p*;)+
2(1-li )(A;-lzoqi)]
(6) vec4 (A?Wi) x (A?p' i)
(7) A?Vi vec2 + vee3 + vee4
(8) vee5 (A?Wi) x [(A?Wi) x (A?Si)]
(9) vec6 (A?Wi) x (A?Si)
(10) A?Fi = mi( vee5 + vee6 + A?Vi)
(ll) vec7 (A?JiAh)(A?w;)
(12) vec8 (A?Wi) x [(A?JiAh}(A?Wi)]
(13) A?Ni = vee7 + vee8
(14) A?fi = A~+1(A?+lfi+d + A?Fi
(15) vec9 = (A?+lpi) X (A?+li+d
(16) veel0 = Aii+l (Ai+lDi+l
0
+ vee9)
(17) vecll A?(pi + Si) x (A?Fi)
(18) A?Di veclO + vecll + A?Ni

{ (A?Di)T(A;-lzo) + biqi, if link Ii = 1


Ti
(A?fi)T(A;-lzO) + biqi, if link Ii = 0

Ii = 1 if link i is rotational
Ii = 0 if link i is translational
These equations compute the external forces Fi and the external moments Ni exerted on
link i by progressively calculating the linear velocity Vi and acceleration Vi, the angular
velocity Wi and acceleration Wi, from the base to the hand. Then the applied torques
and forces Ti are calculated in reverse order from the hand to the base [18]. Matrix A~_l
transforms a vector between coordinate systems i - I and i, Zo = [00 I]T. Some variables
of link i are represented with respect to the coordinate system associated with link i - l.
These are the origin of coordinate system i, pi, and the angular velocity and acceleration
qi and qi. fi and Di represent the force and moment vectors exerted on link i by link i - l.
The input variables are n, the number of joints, and fn+l, Dn+l, the force and moment
exerted by the hand (i.e. link n) on the load. The outputs are Ti, the torques or forces
to be applied for each link i. Constants are mi, the total mass, J i , the inertia matrix in
base coordinates, Si, the center of mass of link i, relative to its own coordinates and bi , the
damping coefficient. The derivation of these formulas can be found in [18].
A Fortran-77 program for the Stanford arm is derived from the Newton-Euler equations
(fig. 2).
36

do i = 1,6
ao(i) fl (ao(i-l»
if (i.ne.l .and. i.ne.3) then
veel = f2 (ao(i-l»
endif
aod(i) = f3 (aod(i-l),veel)
vee2 = f4 (avd(i-l»
if (i.ne.l .and. i.ne.4 . . and. i.ne.5) then
vee3 f5 (ao(i»
vee4 = f6 (aod(i»
endif
avd(i) f7 (vee2,vee3,vee4)
vee5 = f8 (ao(i»
vee6 f9 (aod(i»
agf(i) flO (vee5 , vee6 , avd(i»
vee7 fl1 (aod(i»
if (i.ne.12) then
vee8 = f12 (ao(i»
endif
agn(i) = f13 (vee7,vee8)
akf(i) = f14 (akf(i+l),agf(i»
if (i.It.4) then
vee9 = f15 (akf(i»
endif
if (i.It.6) then
vee10 = f16 (akn(i+1),vee9)
vee11 f17 (agf(i»
endif
akn(i) f18 (vee10 , vee11 , agn(i»
enddo

Figure 2: Fortran program of the Newton-Euler equations for the Stanford arm.
37

IF-tests are used to skip the zero operations reflecting the physical properties of the
manipulator. The functions fi 0, i = 1,,,,,18 represent the 18 Newton-Euler equations.
We assume that the torque Ti is calculated together with the force f; or the moment D;.

4 Automatic Data-Flow Analysis: the LEM Analyzer


The source program describing the robot dynamic equations is analyzed and parallelized
by the LEM Analyzer. Then the partitioned program is simulated on up to 10 processors,
observing the precedence constraints and the communication overhead. In this section the
partitioning and scheduling methods of the analyzer are described.
The LEM Analyzer consists offour parts, the do-loop unroller, the task graph generator,
the scheduler and the simulator.

4.1 Do-loop Unroller


A linear list of statements without cycles is generated by unrolling the do-loops and eval-
uating the IF-tests. The substitution of the do-loop variable by its actual values generates
a program of 95 statements. E.g., the first statements become

ao(1) = f1 (ao(O»
aod(l) f3 (aod(0),vec1)
vec2 = f4 (avd(O»
avd(l) = f7 (vec2,vec3,vec4)
vec5 f8 (ao(l»

The loop creates side-effects when the same variables obtain new values in subsequent
iterations. For example, the values of vectors vec2 and vec5 in iteration i+1 must not be
stored before the values of iteration i are consumed by the successor tasks. Redundant
dependencies are avoided by creating new versions for the variables in each iteration of the
loop.

4.2 Taskgraph Generation


A statement is a single task. The taskgraph is represented by a table where an entry consists
of the function to be evaluated and an argument list. The first statements are represented
in Table 1.

4.3 Scheduling
The scheduling problem is conveniently formalized as follows [4]. Given a taskgraph of the
program, G = {J, <*, e} where

• J = is a set of task nodes, J = {T;},


• <* = the precedence constraints. Ti <*T; indicates that T; uses the results of Ti,
• Ti = e(T;) = the execution time of task Ti.
38

Nr Function Input Arguments Output


Tl fl ao(O) ao(1)
T2 f3 aod(O) , vecl aod(1)
T3 f4 avd(O) vec2
T4 f7 vec2, vec3, vec4 avd(l)
T5 fS ao(1) vec5

Table 1: A tabular description of a task graph segment representing the precedence relations,
e.g. T5 uses the output of T1.

Find a partition of the tasks J on p processors, P = {A l ,· •• , Ap} such that the maximal
execution time of all processors

tma:t max (L T;)


= i€[l,p] ;€A;

is minimized.
It is well known that the scheduling problem of an acyclic taskgraph is NP-complete
[5J. Therefore suboptimal heuristics must be used. The LEM Analyzer uses the following
simple, yet efficient LPTF (Longest Processing Time First) list algorithm [lOJ.

4,4 LPTF algorithm


1. Calculate the tasklevels. The level of a task is the longest path from this task to
an end-task in the taskgraph. The level of a task represents the minimal remaining
execution time when the task starts.

2. Arrange the tasks by decreasing task-levels.

3. Assign the highest level task without unfinished predecessors, to the first free proces-
sor.

4. Repeat from 3 until all tasks are assigned.

The list-algorithm finds a solution within 5% of the minimal scheduling time with a
probability of 95% [IJ. Upper bounds for the execution time exist when the task graph is a
tree [14J, or the tasks are independent [8J.

r
tp $ to + Tma:t - Tma:tlp1 (tree-graph)
tp $ (4/3 - l/3p) to (independent tasks)

In these equations to represents the execution time of the optimal schedule on p processors
and Tma:t is the maximal task length.
39

r-----------,
LM 1 f-- P1 I
I"
I " " I
I I
LM 2 f-- P2 I
I " " " I
· I I
· I I
· · I
Crossbar Switch
I
· · I I
I I
LM p - Pp I
---"------- J
'\
1
L_"

M1 M2 . . . Mm

Figure 3: Simulated architectures. The number of processors, p, and shared memories, m,


is selectable. The code is stored in local memories. The data is stored in local (LM) or in
shared global memories (M).

4.5 The Simulator


The LEM Analyzer contains a simulator for architectures with up to 50 processors and
memories. The processors communicate with the shared memory by a single bus or by a
crossbar switch (fig. 3).
The simulation provides a detailed timing analysis and indicates the parallelism in the
program as well as the communication overhead. A Gantt·chart shows the execution of
each processor.
The main performance measures are the critical path length, the bus-load, the processor
utilization, the communication overhead and the speedup. The input for the simulator
is the task graph produced by the analyzer, the task duration times and the number of
processors and memories.

5 Parallel Execution
5.1 Program-dependent Speedup
The simulated architectures are based on the multiprocessor system for robotic applications,
developed by Wang et 801. [26J. The shared memory architecture has 4 MC68020 processors
and MC6881 floating-point coprocessors, communicating over a single VME bus. They
measured the execution time of the statements in figure 2 for the 6 links. We used the same
execution time, yet averaged over all links. In this way the duration time of a particular
statement is independent of the link number.
40

Link
Task 1 2 3 4 5 6 Eqn. Time (ps)
ao 2 16 34 51 66 81 1 47
vecl 17 52 67 82 2 32
aod 3 18 35 53 68 83 3 59
vec2 4 19 36 54 69 84 4 35
vec3 20 37 85 5 84
vec4 21 38 86 6 34
avd 5 22 39 55 70 87 7 43
vec5 6 23 40 56 71 88 8 85
vec6 7 24 41 57 72 89 9 50
agf 8 25 42 58 73 90 10 77
vec7 9 26 43 59 74 91 11 42
vec8 27 44 60 75 92 12 94
agn 10 28 45 61 76 93 13 44
akf 11 29 46 62 77 94 14 78
vec9 12 30 47 15 43
vecl0 13 31 48 63 78 16 69
vecll 14 32 49 64 79 17 57
akn 15 33 50 65 80 95 18 69

Table 2: Tasks and task numbers generated by the analyzer; associated execution times.

The program for the Stanford manipulator (figure 2), yields 95 tasks. Table 2 represents
the tasks and their duration times.
Task 1 represents a startup and synchronization task between the consecutive updates.
The total execution time on a single processor is 5.542 ms. The task-level and precedence
analysis reveals a critical path of L=0.7168 ms. Consequently, the maximal speedup of the
Newton-Euler equations is Sma., = 7.7316.
Neglecting the overhead, the optimal speedup is obtained with 9 processors and the
Gantt-chart in fig. 4 shows the minimal execution time of 0.7168 ms.
The precedence constraints create a delay at the start of the calculations, which prevent
a maximum speedup with fewer processors.

5.2 Communication Overhead


The memory access time for one vector read- or write-operation is estimated at 6 = 6Aps.
Obviously, the execution time is affected by the inter-processor communication. The com-
munication overhead depends on the memory access time, the interconnection network and
the placement of the variables in the memories. Following the task assignment, the variables
are distributed over the local and the global memories. To reduce the bus load, variables
shared by tasks running on the same processor are stored in the local memory. Remov-
ing the local variables from the shared memory contributes significantly to the speedup.
41

P2 d3 17 U 8 1 1 1 1 46 1471 48 1 nl
53 68 83

P3 d415119~21151 141 181 1 85 189159173179180 I


P4 d171181351381 25 I 42 186143114 I 63 I 15 I
P5
~f61 37 1671 60 1721121741 49 161 I 50 I
P6 d91101 40 I 56 I 88 130 191 I 64 176 1 65 I
P7 I 1341521241661821571 75 I 13 I 78 I 33 I
P8
=-:J 23 1 44 1 71 1 92 1 31 128193lJYl
P9
d
I
.13
27 126 1 11 1 29 58 132 145 1 62 I
.26 .38 .51 .64
I
.n ms
.00

Figure 4: Optimal schedule of the Newton-Euler dynamic equations on 9 processors, without


communication overhead. The tasks are numbered according to the Table 2. Negative levels
indicate processors waiting for data from other processors.

The fraction of local memory accesses and its effect on the speedup are depicted in fig.
5. Apparently memory balancing pays off in single-bus systems, where the communica-
tion bandwidth is relatively low. Actually with more than 4 processors, the bus becomes
saturated and therefore a small increase in local data accesses has a marked effect on the
speedup.
Fig. 6 shows a detailed time chart of a single-bus shared memory multiprocessor with
9 processors. Compared with the conflict-free case (fig. 4), the bus saturation causes the
speedup to drop from 7.7 to 4.5.
The bus contention is reduced using m > 1 memories. E.g., the speedup of a 9 x 9
crossbar system increases to 7.6 and a 10 x 10 system delivers the maximal theoretical
speedup. The speedup values with and without overhead are given in Table 3.
Now it is useful to compare the simulation times with the results measured on the 4
processor MC68020jMC6881 multiprocessor system in [26]. Wang measured the execution
times on one to four processors, both with all variables in the shared memory and with
the local data kept in local memory. Table 4 compares the results with the simulated
execution of the LEM Analyzer schedules. The simulated execution times appear slightly
over-estimated, especially when the local memory optimization is in effect. This can be
explained by the average memory access times. Since the memory access times were not
mentioned in [26], a data transfer was estimated at 6.4 Jl.S. Actually the memory access
times of the multiprocessor in [26] are smaller. Moreover, the local memory is faster, since
the measured execution time on 1 processor is lower with a local than with a global memory.
42

Fraction Speedup
100% 5.0
a o

-'--+---+-+--+--+ b

50% 2.5

o% '--_~_~_~_~_ ___' 0.0

o 2 4 6 8 10

Figure 5: Effect of the local memory optimization. a) Fraction of the data which can be
stored locally, b) speedup with only global data, c) speedup after removal of the local data
from the shared memory. The results are for a configuration of 4 processors and 1 memory.

I I I ! I I I I I
.00 .13 .26 .38 .51 .64 .77 .90 1.02 1.15 1.28 ms

Figure 6: Execution of the Newton-Euler dynamic equations on 9 processors and 1 shared


memory. The unnumbered low- and high-level tasks represent shared memory reads and
writes respectively. Negative levels indicate idle processors, waiting for data from other
processors, or waiting for a shared memory access.
43

Proc. Sp 1 memo 2mem. 10 memo


1 1.000 1.000 1.000 1.000
2 1.982 1.942 1.977 1.977
3 2.966 2.794 2.858 2.887
4 3.832 3.305 3.654 3.798
5 4.758 3.883 4.396 4.656
6 5.587 4.184 5.006 5.413
7 6.321 4.184 5.481 6.099
8 6.928 4.184 5.773 6.713
9 7.732 4.487 6.511 7.664
10 7.732 4.464 6.415 7.732

Table 3: Speedup neglecting the communication overhead, Sp, and speedup in the presence
of bus conflicts with 1,2 and 10 shared memories.

global memory local memory optimization


Proc. Measured Simulated Diff. Measured Simulated Diff.
1 5.463 5.542 0.08 5.168 5.542 0.37
2 2.977 3.008 0.03 2.679 2.854 0.18
3 2.114 2.221 0.11 1.856 1.984 0.13
4 1.758 1.818 0.06 1.472 1.677 0.20

Table 4: Comparison of the execution times from Wang et a1. [26], and the simulation
times obtained with the LEM Analyzer.
44

Processor utilization
1.00

0.80
a

c
0.60

b
0.40
o 2 4 6 8 10

Figure 7: Processor utilization a) without communication overhead, b-d) affected by bus


conflicts in a system with I, 2 and 10 shared memories.

5.3 Processor Utilization


The processor utilization in a p-processor system is Up = Sp/p. The utilization deteriorates
• when p exceeds the theoretical speedup limit, Smax = E / L, or

• when the interconnection network saturates.

For p» Smaz, the processor utilization behaves like Up I':j Smax/P.


When P I':j Smax, the bus conflicts become the main bottleneck (fig. 7). Although
additional memories help to diminish the communication overhead, increasing the number
of memories shows only a marginal effect on the performance. The required number of
memories to avoid a bus saturation is given by equation 9. The average execution- and
communication-times in the taskgraph are 4396.8 J.lS and 1145.6 J.lS respectively. This
gives a communication density of Pt= 0.2606. The actual inter-processor traffic density
is P = Up(m)pt, since the processors are active only a fraction Up(m) of the time. Note
that Up(m) depends on number of shared memories, m. Thus, the minimum number of
memories to avoid a saturation of the interconnection network, mmin, is the solution of the
equation
mmin = ptpUp(mmin).
With mmin memories, the total idle time and the communication overhead is shown in
Table 5. It turns out that the communication overhead remains acceptable. The communi-
cation overhead with one additional memory is also depicted, reflecting the effect of extra
memories.
45

Proc. mmin Idle (%) C(mmin) (%) C(mmin+1) (%)


1 1 0.0 0.0 0.0
2 2 2.9 2.0 0.2
3 2 6.9 5.8 3.6
4 2 1704 13.7 4.6
5 2 12.1 7.6 2.7
6 2 16.6 lOA 6.1
7 2 21.7 13.3 8.7
8 3 27.8 16.7 10.7
9 3 27.7 15.8 6.7
10 3 35.9 17.0 5.1

Table 5: mmin: minimal number of memories and busses to avoid traffic congestion. Idle:
total processor idling time. C(mmin) and C(mmin+1): communication delay for saturation
conditions and for one extra shared memory.

Source Processors Time (ms) Speedup Algorithm


Luh ('82) 6 9.67 2.57 N-E
Kasahara ('85) 6 5.73 4.33 N-E
Vukobratovic ('88) 6 4.81 5.97 [24]
Wang ('89) 4 1048 3.70 N-E
LEM Analyzer ('90) 9 0.72 7.73 N-E

Table 6: Comparison between various approaches to parallelize the inverse dynamics equa-
tions.

6 Conclusion
A uniform methodology to distribute the workload of robotic applications has been pre-
sented. It is applicable to algorithms which can be converted into a direct acyclic task graph.
The LEM Analyzer applies do-loop unrolling, task graph generation, side-effect elimina-
tion and list scheduling. In this way a program is prepared for execution on a p-processor
system with 1 to m memories and a single-bus or a crossbar interconnection network.
The analyzer is extended with a simulator. This provides a sophisticated toolbox to study
the program parallelism, the inter-processor communication and the effective speedup on
systems with up to 50 processors and memories.
A model of the communication overhead provides a formula (eq. 9) to calculate the
number of memories for an adequate communication bandwidth. Furthermore, the memory
allocation scheme balances the variables over the local and shared memories.
Finally, the automatic generated parallel schedules for the Newton-Euler dynamics com-
pare favorably with the manual analyses available in the literature (see Table 6).
46

References
[IJ Adam T.L., Chandy K.M., Dickinson J.R., A comparison of list schedules for parallel
processing systems, Communications of the ACM 17, 12, 685-690, 1974.

[2J Amin-Javaheri M., Orin D.E., A systolic architecture for computation of the manipu-
lator inertia matrix, IEEE International Conference on Robotics and Automation 87,
Vol. 2, pp. 647-653, 1987.

[3J Andersson R.L., Computer Architectures for Robot Control: A Comparison and a New
Processor Delivering 20 Real Mjlops, IEEE International Conference on Robotics and
Automation 89, pp. 1162-1167,1989.

[4J Coffmann Jr., E.G., Denning P.J., Operating Systems Theory" Prentice Hall, 1973.

[5J Cook Stephen A., An overview of computational complexity, Communications of the


ACM 26, 6, 401-408, 1983.

[6J D'Holiander E.H., Performance Modelling of Supercomputer Architectures and Algo-


rithms, In "Scientific Computing on Supercomputers", Devreese J.T. and Van Camp
P.E. (Eds.), Plenum Press, Chapter 1, pp. 3-28, 1989.

[7J D'Hollander E.H., Speedup bounds for continuous system simulation on a homogeneous
multiprocessor, Proceedings of the IntI. Conf. on Parallel Processing '81, pp. 318-324,
1981

[8] Graham R.L., Bounds on Certain Multiprocessing Anomalies, SIAM Journal of Applied
Mathematics, Vol. 17, 2, pp. 416-429, 1969.

[9] Harber R.G., Hu X., Li J., Bass S.C., The Application of Bit-Serial CORDIC Com-
putational Units to the Design of Inverse Kinematic Processors, IEEE International
Conference on Robotics and Automation, Vol. 2, pp. 1152-1163, 1988.

[lOJ Hu T.C., Parallel sequencing and assembly line problems, Op. Res., 9, 6, 841-848, 1961.

[l1J Izaguirre A., Paul R.P., Automatic Generation of the Dynamic Equations of the Robot
Manipulators Using a LISP Program, IEEE International Conference on Robotics and
Automation, Vol. 1., pp. 220-226, April 1986.

[12J Kasahara H., Narita S., Parallel Processing of Robot-Arm Control Computation on
a Multimicroprocessor System, IEEE Journal of Robotics and Automation, Vol. 1, 2
(June), pp. 105-113, 1985.

[13J Kasahara H., Narita S., Practical muLtiprocessor scheduling algorithms for efficient
parallel processing, IEEE Trans. on Computers 33, 1023-1029, 1984.

[14] Kaufman M.T., An almost optimal algorithm for the assembly line problem, IEEE
Trans. on Computers-23, 11, 1169-1174, 1974.
47

[15] Lee C.S. George, Chang Po Rong, Efficient Parallel Algorithms for Robot Inverse
Dynamics Computation, IEEE Trans. on Systems, Man and Cybernetics, Vol. 16, 4,
pp. 532-542, 1986.

[16] Lee C.S.G., Chang P.R., A Maximum Pipelined CORDIC Architecture for Inverse
Kinematic Position Computation, IEEE Journal of Robotics and Automation, Vol. 3,
5, pp. 445-458, 1987.
[17] Ling Y.L.C., Sadayappan P., Olson K.W., Orin D.E., A VLSI Robotics Vector Proces-
sor for Real- Time Control, IEEE International Conference on Robotics and Automa-
tion, Vol. 1, pp. 303-308, 1988.
[18] Luh J.Y.S., Lin C.S., Scheduling of Parallel Computation for a Computer-Controlled
Mechanical Manipulator, IEEE Transactions on Systems Man and Cybernetics, Vol.
12, pp. 214-234, 1982.
[19] Luh J.Y.S., Walker M.W., Paul R.P.C., On-line Computational Scheme for Mechanical
Manipulators, Trans. of ASME J. of Dynam. Syst. Meas. and Control, Vol. 102,6, pp.
69-76, 1980.

[20] Luh J.Y.S., Walker M.W., Paul R.P.C., Resolved-acceleration control of Mechanical
Manipulators, IEEE Transactions on Automatic Control, Vol. 25, 3, pp. 468-474, 1980.

[21] Narasimhan S., Siegel D.M., Hollerbach J.M., Condor: A Revised Architecture for
Controlling the Utah-MIT Hand, IEEE International Conference on Robotics and Au-
tomation, Vol. 2, pp. 1116-1119, March, 1987.

[22] Nigam R., Lee C.S.G., A Multiprocessor-Based Controller for the Control of Mechanical
Manipulators, IEEE Journal of Robotics and Automation, Vol. 1, 4, pp. 173-181, 1985.

[23] Vukobratovic M., Kircanski N., Li S.G., An Approach to Parallel Processing of Dy-
namic Robot Models, The IntI. Journal of Robotics Research, pp. 64-71, 1988.

[24] Vukobratovic M., Kircanski N., Computer-assisted generation of robot dynamic models
in analytical form, Acta Appl. Math., Vol. 3, pp. 49-70, 1985.

[25] Wander J.P., Tesar D., Pipelined Computation of Manipulator Modeling Matrices,
IEEE Journal of Robotics and Automation, Vol. 3, 6 (December), pp. 556-566, 1987.

[26] Wang W-S., Chen K-K., Lai Y-S., Liu C-H., Implementation of II Multiprocessor Sys-
tem for Real- Time Inverse Dynamics Computation, IEEE International Conference on
Robotics and Automation, pp. 1174-1179,1989.
CHAPTER 3

NUMERICAL APPUCATIONS OF DSPs IN ROBOTIC


COMPUTATIONS

A. Di Stefano, O. Mirabella
Informatics and Telecommunications Institute
Engineering Faculty - Catania University
V.le A. Doria, 6 - 95125 CATANIA, ITALY

ABSTRACT. DSPs have been realized in the last years to meet


the need of high computational performance, even if reducing
the other general purpose features, as for example easy
programming or the possibility to obtain efficient compilers.
Aim of the paper is to apply the DSPs for realizing real time
numerical computations in the field of robotics. After a
survey of some characteristics of DSPs, the paper deals with
the design of the main numerical algorithms for robotics,
pointing out the architectural solutions which fit for
effective implementations. Frames of programs referring to
TMS320 family assembler are shown in the discussion about the
implementation solutions and examples got in literature are
evalutated.
1. Introduction
A Digital Signal Processor (DSP) is a specialized device
which uses digital techniques to implement filters and
algorithms for signal processing and process control. In
these applications, which feature intensive computation and
real time constraints, the use of digital techniques provides
greater accuracy, immunity to noise and a wider dynamic range
than implementation using ana logic techniques. Digital
circuits are, in fact, more stable and less sensitive to
noise, temperature and the effects of aging on components. In
addition, in digital systems the noise due to the
arithmetical rounding of computations can be controlled as
desired by adopting the necessary degree of numerical
accuracy.
To satisfy the time constraints the application imposes,
the first implementations of DSP systems used devices with
TTL technology small-scale integration and hard-wired logic.
Various multipliers, adders and accumulators were hard-wired
49
50

so as to obtain the required functions in hardware. It was


thus possible to obtain systems which featured high speed but
were costly and not very flexible. For these reasons, DSP
system applications were confined for a number of years to
military and industrial fields.
General purpose microprocessors are not quick enough to
perform all the operations required in digital signal
processing applications. They are characterized by complex
functions which allow sophisticated memory management and
flexible interaction with I/O devices, they possess extended
sets of instructions and allow the design of efficient
compilers, but they are heavy and slow to perform operations.
In the last few years, however, technological progress in
the production of VLSI components has led to the development
of programmable single-chip DSP devices which specialize in
operations such as "multiply-accumulate", of fundamental
importance for signal processing applications like digital
filters or spectral analysis. A typical example of
application is the Finite Impulse Response (FIR) filter shown
in Fig. 1, whose response is determined by:
f
Y(n) E X(n-k) . b(k)
k=O
where the coefficients from b(O) to b(f) represent the
coefficients of the filter, while the values x(n) represent
the input data. The importance of multiply-add operations in
such an application are immediately noticeable [1].

x(n)
-'--'-r---/ Z - 1 Z -1
z-· ·····8
-
j o Del1lY
EB Add
[> ~1ultiDly

Figure 1. Basic structure of a FIR filter.


In general, DSPs can be said to prove efficient in all
those applications whose implementation algorithms present
51

the following characteristics:


intensive use of addition and multiplication, especially
multiply-accumulate operations (MAC).
intensive use of shift operations in order to manage
data on different scales.
strictly iterative operations on series with a linear
index.
The presence of instructions for signal processing,
however, would not be sufficient to obtain the high
performance levels of modern DSPs, if it were not coupled
with the following features:
pipelined architecture with a high degree of parallelism
[2] ;
the presence of on-chip RAM which allows faster
management of a significant number of sample data from
the off-chip memory;
a hardwired design for heavier instructions such as
multiplication or division (instead of a microprogrammed
implementation) which makes execution faster [3];
arithmetical accuracy to keep errors within acceptable
tolerance limits during iterations.
In general, it must be stressed that the basic philosophy of
all programmable DSPs is that of obtaining maximum
performance in terms of computational throughput without
considering other characteristics such as easy programming or
the possibility to obtain efficient compilers for high-level
languages. From this point of view, in fact, on account of
the particular architecture used in DSPs, they usually offer
quite bad results as compared to those obtained in assembler
programming. For this reason, DSP manufacturers offer a wide
range of application programs ready for use, rather than
trying to produce efficient compilers.
In the following sections the authors present a brief
survey of the main characteristics of some of the most common
DSP components, examining certain numerical applications in
the field of robotics.
1.1. Organization of the Paper

The first point to be dealt with in section 2 concerns the


numerical formats used in DSPs, with a discussion of the
implications of use of fixed- and floating- point notation.
Subsequently the authors examine various architectures
which differ on account of their different data paths and
memory organization. In particular they examine the Texas
Instruments TMS320 family, which represents a reference point
in the field of DSPs as it is the most widely-used in a vast
52

range of applications.
Finally, with reference to certain problems typical of
robotics (concerning the cinematics and dynamics of the
movement of a robot arm and image processing), they examine
the various approaches and implementation problems. In this
section they also discuss the main parts of certain assembler
routines for the solution of specific numerical calculation
problems.
2. A Survey of the Main Characteristics of DSPs
2.1. Arithmetic

Typical DSP applications make intensive use of mathematical


operations. To get the most out of these components it is
indispensable to know how DSPs represent and treat numbers.
A real number in binary form is usually represented by a
sign, a radix point and a magnitude. The sign indicates
whether the number is positive or negative, while the radix
point separates the integer part of the number from the
fractional part. Maximum magnitude is linked to the number of
bits used to represent a number.
The sign is represented by using the most significant bit.
Negative numbers can be represented in various formats, the
most common of which is the twos complement format which has
the advantage of providing a single representation of zero.
The position of the radix point can be specified in two
different ways:

the fixed point format, in which the radix point is


placed in a pre-established position and separates
the integer and the fractional part of the number;
The floating point format, which uses two numbers to
represent a value: a mantissa and an exponent.
The fixed point format is used in most DSPs (only the most
recent generations of DSPs, which feature high-level
performance, use the floating point), and provides greater
speed at a lower cost (using the same technology).
Unfortunately the fixed point requires accurate programming
to avoid overflow errors and loss of accuracy in
calculations.
Overflow errors occur when the attempt is made to record a
word the size of which is greater than the capacity of the
register. In fixed-point DSPs this may occur in the
accumulator as a consequence of addition operations (two
numbers of N bits may generate a number of N+1 bits), and
above all multiplication operations (the product of two
numbers of N bits may generate a number of 2N bits). For this
reason in 16-bit DSPs, the size of the Aritmetic Logic Unit
(ALU) and the accumulator is not less than 32 bits, as in the
53

TMS32010 [4] (see Fig. 4 in par 2.3.1 for a simplified


diagram of the internal architecture).
This solution may, however, not be sufficient because
subsequent addition or multiplication operations may still
make the size of the word greater than that of the
accumulator. A partial solution consists of further
increasing the size of the ALU and the accumulator, as in the
Motorola DSP56001 or the AT&T WEDSP16. The data path of the
latter is shown in Fig. 2; the accumulator and the ALU have
36 bits (instead of 32) and so there are 4 bits more space to
ensure against overflow.

~ ! Y DATA BUS

Ix REG.(16) IVL REG( 16) VH REG( 16)J

L MULTIPLIER
(16x16)
I--
X DATA
P REGISTER
BUS
(32)

1
SHIFT ( - 2 , 0 , +2)

1 L

"- MUX

I
/

QfSHIFT(q

1
Ao ACCUMULATOR
(36)

Al ACCUMULATOR
(36)

I
L
SHIFT I SAT.

1
Figure 2. Simplified architecture of the arithmetic
module in the DSP16.
Another possible solution to the problem of overflow
consists of using saturation arithmetic, by means of which,
in the occurrence of overflow, the maximum ALU value
(positive or negative) that can be represented is loaded into
the accumulator. This course of action (which is possible,
for example, in the TMS32010 where it is activated by the
54

instruction Set OVerflow Mode, SOVM), leads to loss of


accuracy, but this loss is much less than that usually caused
by overflow. The problem of loss of accuracy is linked to
that of overflow in that the latter is avoided by trying to
limit the size of the numbers being operated on, at the
expense of accuracy.
Use of the fixed point representation has the advantage of
keeping the position of the radix point fixed. Whereas on the
one hand this simplifies numerical operations and saves
memory, on the other it limits the width and accuracy of the
numbers that can be represented. So when it is necessary to
operate on a wide range of variable numbers, or a high level
of precision is required, the floating point seems to be the
most convenient solution.
A floating point number consists of a Mantissa M(x) and an
exponent E(x) such that:
x M(x)' 2 E (X)

The two parts of the number x are stored as fixed point


numbers; the mantissa is in an entirely fractional form,
while the exponent is in a integer form. The most common
format is a 32-bit one, 24 bits of which are for the mantissa
and 8 for the exponent. This provides a dynamic range of over
1500 dB. When two floating point numbers, x and y, are
multiplied, the result, R, will be of the following type:
z M(x)' M(y) . 2 E (X)+E(y)

that is, it will be composed of two independent operations:


the product of the mantissa and the sum of the exponents.
Consequently, the register in which the product of the two
mantissas will be stored must have a sufficient capacity if
maximum accuracy is to be maintained. If the mantissa is of
24 bits, the register must have 48 bits. In practice, the
size of the register is less (e.g. 32 bits are destined to
this purpose in the TMS32030), accepting a loss of accuracy
[5] •
After each floating point operation, a number has to be
normalized to eliminate all the redundant bits so as to
achieve the greatest possible accuracy with the number of
bits available. This is obtained by a shift operation which
eliminates insignificant bits, and an adjustment of the
exponent which takes the scaling operation into account.
2.2. Data Paths and Memory Management
Since modern digital computers have been developed, the most
common architecture is that of Von Neumann, in which there is
no pre-established distinction between the areas of memory
reserved for data and those devoted to the storage of
55

programs. In practice, all the accessible memory addresses


can contain either a code to be executed (i.e. instructions)
or data.
The typical activity of a Von Neumann processor alternates
sequentially between two operations - extraction of an
instruction from the memory (FETCH) and execution of the
instruction after decoding (DECODE-EXECUTE). So programs are
sequentially executed as a new instruction is not taken into
consideration until the previous one has been completely
executed.
Using this approach, to increase the speed at which
programs are executed, it is nessary to reduce execution time
for a single instruction, on which the overall speed of the
processor depends. This places serious limits on the maximum
speed that can be obtained, because it is difficult to go
below certain execution times. In addition, the Von Neumann
architecture does not allow efficient application of
pipelining and parallel processing techniques and, on account
of its inability to manage memory efficiently, it is nowadays
considered a bottleneck for the intensive numerical
calculation applications for which a DSP is designed.
For this reason Programmable DSPs usually adopt the
Harward architecture [2], the only exception being AT&T
DSP32C, which we shall discuss below. This architecture is
highly suitable for parallel processing and pipelining as the
data and program areas of the memory are completely separate.
The separation is physical as there exist two separate buses,
one to access the data memory area and another to access the
program area. This means that it is possible to extract both
data and programmes at the same time by parallelizing the
fetch phase for one instruction and the execute phase for the
previous one.

CLOCK 24
DATA

14
1 data
14
-.
MEMORV
16K x 16

PROGRAM ~ 1-
MEMORV address
ADSP
24
16K x 24 2100
data 1""'- 14
add~ 1/0
DEVICE
24

Figure 3. Simplified architecture of a DSP2100.


56

An extremely effective example of the potential offered by


the Harward architecture is represented by the ADSP-2100 by
Analog Devices [6] (see Fig. 3 for simplified architecture),
which can address an external data memory Ml of 16Kx16 bits
and a program memory M2 of 16Kx24bits, which can also be
configured as a data memory. It also has a cache memory of
16x24 bits to memorize short sequences of instructions, which
provides a high degree of parallelism in operations since, by
using M2 as a data memory, two operands can be taken and an
instruction executed in a single cycle.
In order to exploit the presence of independent external
memories and provide efficient pipelining, the use of
separate address generators for M1 and M2 is useful, as it
allows the addresses of subsequent operands to be calculated
while the current operands are being extracted.
An extension of the concept of multiple memory is to be
found in the DSP5600 by Motorola [7] which contains a pair of
internal data memories, X and Y, of 256x24 bits each and a
program memory of 2Kx24 bits. These memories can be
externally expanded to 64Kx24 bits each. The presence of
memories inside the chip allows the execution of operations
at maximum speed, while the presence of parallel paths for
data transfer between the memories and the arithmetical unit
allows the processor to extract the two operands and the next
instruction in a single cycle.
The size of the internal memories is sufficient for
complex operations without having to access the external
memory area. A 256-point FFT, for example, requires only 512
words of memory, 256 of which are for data and 256 for the
coefficients. Nevertheless there are a great number of
applications for which the internal memory is insufficient
and it is necessary to use external memories. In this case,
in order to maintain parallelism with the internal memories,
the off-chip memories have to use three non-multiplexed
external buses (one for each memory space), resulting in an
obvious increase in the number of pins and thus in the size
of the chip. This approach is only adopted in LM32900 by
National, which requires 96 pins for the addresses and data
buses alone. In most DSPs, including the DSP5600, the
external buses are multiplexed and so access to external
memories cannot occur in parallel, thus slowing operations
down as compared with those using internal memory.
A different approach is used in AT&T's DSP32C [8] which,
unlike most other DSPs, uses a Von Neumann architecture.
There are no different buses for the various memory areas,
but a single high-speed bus both for data and programs. This
bus provides access to a memory area in half a cycle, so in
one instruction each of the two memories can be accessed. It
is therefore possible, for example, to remove the two
operands from the internal memory, perform a multiplication-
accumulation operation, and write the result in the memory in
57

a single instruction cycle.


The architecture of DSP32 is very similar to that of a
microcontroller. The presence of a single bus simplifies
programming of the device because it is no longer necessary
to distinguish the data area from the program area, the
memory being seen as a common resource. On the other hand, in
order to exploit to the full the virtual parallelism offered
by the bus it is necessary for the data in operations with
several operands to be distributed in two separate memory
banks.
For reasons of space, the survey presented here is
necessarily confined to the examination of certain typical
cases.
The following section presents two DSPs by Texas
Instruments which will be used to give some examples of DSP
application in robotics.
2.3. The TMS320 Family

The TMS320 family, produced by Texas Instruments, is the


best-known family of DSPs. The original DSP is TMS32010,
developed in 1982, which is currently the most widely-used
thanks to its simplicity, its suitability for a large number
of applications, its low cost and the availability of a range
of application software. Subsequently more powerful DSPs have
been developed - TMS32020, TMS32025 and TMS32030.
In the following sections, however, reference is only to
TMS32010 and TMS32025, on which the examples of application
have been implemented.

2.3.1. TMS32010

A feature common to all the DSPs in the TMS320 family is the


use of a modified version of the Harward architecture which
increases the flexibility of the device as it allows the
transfer of data between the data memory and the program
memory.

TABLE 1

MAIN FEATURES OF TMS32010

200 nS instruction cycl e


2B8 words on-chip RAM
32-bit ALU end AccumUlator
16 x 16 multiply in 200 nS
Externel memory expansion to 8Kbytes
Two pen:lllel shifters
16 bit bi di ret! omll Date Bus
Single 5-volt supply
58

The basic characteristics of the TMS32010 are represented


in Table 1, and a reduced version of the architecture is
given in Fig. 4.

PROGRAM BUS DATA BUS

DATA BUS

Figure 4. Simplified architecture of T.I. TMS32010

Arithmetical Elements
The basic arithmetical elements contained in the chip are
the ALU, the accumulator, the multiplier and the shift
registers. After being removed from the RAM, data are sent to
the ALU by a shift register which can shift a word from 0 to
15 bits to the left, according to the value specified in the
instruction.
In the ALU the data can be stored, added to or subtracted
from the accumulator. As the accumulator is of 32 bits, the
data is stored in two locationss in the 16-bit data memory.
The multiplier block is composed of three units: the 16-
bit register T to memorize the multiplicand, the 32-bit
register P to memorize the result of the product and a
multiplication matrix for the fixed-point 16x16 product.
Data and Program Memory
The data memory is present in the chip and is made up of
144 16-bit words. It is subdivided into page zero, from place
o to 127, and page one from 128 to 144. Addressing can be
direct, immediate or indirect. In the latter case, the 8
least significant bits of the auxiliary registers are used,
which are able to address all 144 locations in the on-chip
59

RAM. The use of auxiliary registers is particularly useful in


certain instructions which automatically increase or decrease
their content.
The addressable program memory is of 4Kx16 bits and is
entirely off-chip. Transfer of data between the program
memory and the data memory is possible by means of two
particular instructions - Table read (TBLR) and Table write
(TBLW) .

I/O Functions
The TMS32010 can interface up to 8 external peripherals
which can be used as both input and output. The peripherals
are selected by decoding the first three pins (AO-A2) of the
address bus. The two instructions which allow data to be
transferred between the processor and a peripheral are IN and
OUT.

2.3.2. TMS32025

This component has a more powerful architecture than


TMS32010, the main differences being an increase in speed,
greater memory availability and a greater degree of
parallelism due to the presence of a central ALU (CALU) and
an Auxiliary Register Arithmetic Unit (ARAU). This
parallelism means a more powerful set of arithmetical, logic
and bit-manipulating operations that can be performed in a
single machine cycle.
The main features of this DSP are summarized in Table 2,
and a simplified version of the internal architecture is
given in Fig. 5.

TABLE 2
MAIN FEATURES OF TMS32025

100 nS instruction cycle


544 words of on-chi pRAM
126 Kwords of Data/Program space
32-bi t ALU and Accumul ator
Si xteen I nput/Output channel s
ei ght Auxil i ary Regi sters and dedi cated Aritmeti c Unit
16-bit parallel multiprocessor interface
Double-buffered static Serial Port
On-chi p ti mer for control operations
Si ngl e 5-vol t suppl y
60

iiROGR~J-
COUNT.
control C

I!>TAlUS I
stgnols 0
N IPROGR.
memory T ROM ISTACK I
R

I
select 0
clock L.
PROGRAM I BUS
stgnols
I
. DATA BUS l
l

I-J. ~
TR( 16)

I ARP(3) MULTIPLIER
ARO ... AR7
-'lARB( 3) I I SHIFTER 1
PR(16)
externol
dot%ddr. T ! ~ 1 SHIFTER 1
bus
I-
ARAU
J~ !
~
-r
\. MUX /

I
T~ ALU (32) 'i
RAM
(B2)
I
1
l RAM
(B1)
RAM
(BO)
1 ACCH( 16) ACCL( 16)1

SHIFTER I
~ 1
/ MUX \.
I
T
DATA BUS

Figure 5. Simplified architecture of the T.I. TMS32025

Arithmetical Elements
The CALU contains a 16-bit scaling shifter, a 16x16
parallel multiplier, a 32-bit ALU, a 32-bit accumulator and
some scalers which are available at the outputs of both the
accumulator and the multiplier.
ALU input data are provided by the accumulator and the
multiplier or the scaling shifter which can shift the data 0
to 16 bits to the left. There are several additional shifters
in the CALU which align the products in fractional
arithmetic, avoid overflow in multiplications with multiple
accumulations and normalize floating-point numbers contained
in the accumulator.
TMS32025 also has 8 auxiliary registers (ARO-AR7), which
can be used for indirect RAM addressing or for temporary
memorization. These registers can be manipulated by the ARAU,
which can serve as an additional Arithmetic Unit as the file
of the auxiliary registers can communicate directly with the
RAM. The ARAU has a 16-bit arithmetic with no sign, while the
CAIU has a 32 bits two's complement arithmetic.
61

Data and Program Memory


The RAM in the device is composed of 544 16-bit words and
can be divided into three blocks: BO, B1 and B2. BO has a
memory of 256 words and can be configured (through the
instruction CFND or CFNP) as a data or program memory. B1 and
B2 together have 288 words and can only be used as a data
memory. TMS32025 can also address 64K bits of off-chip
program memory.
There are three separate address spaces for the program
memory, data memory and I/O devices. Six peripheral registers
are also mapped in the memory - the serial port register, the
timer register, the period register, the interrupt mask
register and the global memory allocation register.
I/O Functions
TMS32025 uses a memory-mapped technique to address I/O
devices. It supports 16 parallel input channels and the same
numer of output channels. It also has an on-chip serial port
which can interface serial devices such as codecs, serial A/D
converters and others.
The architecture of TMS32025 has been designed in such a
way as to provide for multiprocessor applications, reducing
the need for additional hardware to a minimum. This can be
obtained by using the serial port to connect two processors,
or through parallel communication using a common memory.
3. Numerical Applications in Robotic Computations
3.1. DSP based algorithms for kinematics and dynamics

The motion control and mechanics of the robot manipulator are


led by kinematic and dynamic equations. The algorithms
proposed in literature to solve these equations follow either
the closed approach (the best-known examples of which are the
inverse transformation of Paul et al. [9], the helicoidal
algebra of Kohli and Soni [10] and the double matrices of
Denavit and Hartenberg [11]) or the recursive approach of
Goldenberg and Lawrence [12] or Wang and Ravani [13].
The main limit of the closed forms solutions is that they
are always manipulator dependent and so cannot be used for
general-purpose control or simulation programs. The recursive
approach is more frequently adopted because it allows a more
general solution, even though the computation is more
complex. It is therefore important, above all in the
application of recursive algorithms, to ensure a high
computation rate which will allow convergence conditions to
be reached in a short time. It is mainly in this type of
approach that DSP programs are advantageous, as they greatly
reduce the time it takes to execute each step in the
recursion.
Anyway, the computational kinematics and dynamics always
62

involve solving certain classical algorithms for numerical


computation such as product, transpose and inversion of
matrices, and the roots of trigonometric functions (sin(x),
cos(x) and atan2(x,y». These are the algorithms we shall
deal with in the following sections in order to demonstrate
the versatility of DSPSi their implementation refers to
TMS32025.

3.1.1. Vectorial Product of Matrices


Although the algorithms of Strassen (14] and Pan (15] for the
vectorial product of matrices are characterized by a lower
number of operations than the row-per_column algorithm
derived from the definition itself, we have implemented the
latter on a DSP. In fact, it requires more elementary
operations such as product and sum/difference calculations,
but the set of registers and instructions offered by TMS32025
make the approach based on the row-per_column definition more
efficient.

CNFP * Configure block BO as Program


* Memory.
LARK 0,4 * Load Auxiliary Register ARO with
* the Matrix Dimension.
LAR I,X * Load ARI with the addr. of first
* element of 2nd matrix.
LARK 2,3 * Load AR2 with the number of
* columns of 2nd matrix (used as a
* loop counter).
LOOPI SBRK 15 * Subtr. 15 to current register ARI
ZAC * Reset Accumulator.
MPYK 0 * Reset P-Register.
RPTK 3
MAC >FFOO,*O+ * Multiply 1st row of A by the
* column of B inherent the current
* loop.
APAC * add P register to Accumulator.
LARP 4 * Load Aux. Reg. Pointer with 4.
SACH *+,4.2 * Store high Accumulator with
* shift 4 in the location pointed
* by AR4. Set ARP to 2 (for a new
* loop count).
BANZ LOOPl,*-,1 * If AR2 is not zero then
* decrement AR2, Jump to specified
* PMA and load ARP with 1.

Figure 6. The main module of the program for Product of


Matrices. It performs the product of a row by 3
columns.
63

The product of matrices can be obtained rapidly, if the


order of the matrices is fixed, by using certain
instructions which exploit the internal architecture of the
TMS32025. The example given in Fig. 6 refers to a 4x4 matrix.
Many problems concerning robot kinematics deal with
vectorial products of transformation matrices of this size.
To calculate the product of matrices of any size, even those
which are not square, it is generally sufficient to add a
number of program blocks like the one shown in Fig. 6,
equivalent to the number of rows in the first matrix.
The two matrices are introduced by rows into the memory
blocks BO and Bl. As we said previously, the TMS32025 allows
block BO to be used both as program memory (located from FFOO
onwards) and as data memory (from 200 onwards) by means of
the instructions CNFD and CNFP. By means of the instruction
CNFP, BO is configured as program memory. This allows the
row-per_column product to be obtained by using the MAC
instruction in combination with RPTK.
RPTK <n>, where n is constant, is an instruction which
allows execution of the next single instruction to be
repeated n+l times without having to go through the fetch
phase again.
It is worthwhile mentioning here that whereas TMS32020 has
two pipeline levels, thus allowing a prefetch phase, TMS32025
has got three, due to the additional presence of the 16-bit
registers Micro Call Stack (MCS) and the PreFetch Counter
(PFC) .
The PFC contains the address of the instruction currently
being "prefetched", but it can also be used to address the
data memory (when instructions like MAC/MACD, TBLR/TBLW, etc.
are used). In this case the MCS is used to register
temporarily the content of the PFC.
MAC <PMA>, <indirect DMA address> (i.e. Program Memory
Address and Data Memory Address) multiplies the value of the
program memory (specified by the PMA) by the value of the
data memory (indirectly given by an auxiliary register), sums
the result of the previous product to the accumulator, and
increases the PMA and modifies the DMA as specified by
addressing. All these operations are performed in just two
clock cycles if it is not used with RPTK, or in a single
clock cycle if combined with RPTK.
For example, MAC >FFOO,*O+, in execution of an
instruction, increases the current register pointed by the
ARP (Auxiliary Register Pointer) by the value contained in
ARO. Use of MAC together with the instruction RPTK shows all
the potential of the TMS32025 internal architecture; the
fetch phase of the instruction is, in fact, only performed
once for all the n+l repeats and is recorded in the MCS,
while the two operands enter the multiplier almost
simultaneously by means of the two buses, the data and
program buses. At the same time the ARAU (Auxiliary Register
Arithmetic Unit) executes the register operation specified by
indirect addressing, which in our case consists of adding the
content of the ARO register to the content of the current
register. In this way we can obtain N products and N sums in
only N+2 clock cycles, as compared with 2N.
The program calculates the vectorial product of 4x4
matrices in nv = 330 clock cycles. Shifting is necessary to
bring the result to the required format because if the input
data are in a Q12 format, the result of a product will be in
a Q24 format, so it is necessary to shift 12 bits to the
right. This is achieved quite simply by means of the
instruction SACH.
The limitation of this program is that it is necessary to
have previous knowledge of the order of the two matrices, due
to the fact that the MAC instruction does not allow a
variable to be the first operand, requ1r1ng instead a
constant representing the address of the program memory in
which the first value to be multiplied is stored. To overcome
this problem, the instruction must be abandoned and
consequently the calculation time for the product of two 4x4
matrices increases up to 829 clock cycles.
For a known order or any order of matrices, the same
programs realized with TMS32010 require 1000 and 1216 clock
cycles respectively. In addition, to be able to obtain the
product the second matrix has to be introduced by columns,
and so if this is not possible the product has to be preceded
by a transpose which leads to an 86 clock cycle increase in
calculation time.
Example
The vectorial product of matrices, above all of
transformation matrices, is one of the most frequently
recurring computations in manipulator kinematics and
dynamics. Of the many possible examples, the one given here
concerns calculation of the Jacobian matrix.
The Jacobian of an n_DOF manipulator is a 6xn matrix,
which transforms the differential displacement of the joint
space into the cartesian space. The Jacobian matrix
computation occurs in several applications, for example in
iterative solution of kinematics equations [12] or in
calculation of motion force control of manipulators [9].
Calculation of the Jacobian can be derived in both an
analytical and an iterative form. If, for instance, we refer
to a 6_DOF manipulator, the analytical method suggested by
K.S. Fu et alii [16] calculates the i-th column Ji of the
Jacobian relating to the configuration space vector qER 6
consisting of joint variables 9i. For a prismatic joint:

Zi-l
65

Where Zi-l is the versor along the movement axis of


joint i expressed in the base coordinates system, while i-1 P6
is the position of the origin of the system of co-ordinates
of the end-effector with respect to the (i-1)-th system of
coordinates expressed in the base reference system.
Referring to the representation of Denavit-Hartenberg
[11], the vector Zi-l is given by the third column of the
matrix

while the vector i-1 P6 is the fourth column of the matrix

The i-th column of the jacobian can therefore be derived


by calculating all the transformation matrices A and their
vectorial product according to (1) and (2).
For a 6-link arm, it would be necessary to obtain 15
vectorial ~~?ducts of 4x4 matrices and calculate the values
Zi-l and P6 for a Ji. But using the previous formulae
recursively:
(1' )

(2' )

and as the transformation matrices i-1 Ai are orthonormal


(i.e .. their inverse coincides with the transpose), the Zi-l
and ~-lp6 values for all 6 Ji are computed with 9 vectorial
products and 6 transposes of 4x4 matrices. On the TMS32025
the matrix transpose routine takes nt = 79 clock cyles.
Besides the above mentioned operations, the Jacobian also
requi+es computation of the 6 vectorial products of vectors
Zi_1X~-lp6. On the TMS32025 this routine takes np = 38 clock
cycles.
The total computation time for the Jacobian is:
9·nv+6·nt+6·np = 3672 clock cycles
3.1.2. atan2 function

A function which recurs in all problems concerning robot


kinematics and dynamics is the two-operand function
atan2(x,y) , which gives the arc-tangent of the ratio x/y,
related to one of the four quadrants by evaluating the sign
of the two operands. This would obviously not be obtainable
as a simple atan(z), where z = x/y, as tan(z) = tan(z+180o) =
-tan(-z).
The algorithm first calculates the z=x/y relation for
which the arc-tangent is to be obtained, and then the value
66

(CS) to add to the result in order to find the right


quadrant:
1) if x>O, y>O then the arc is in the first quadrant, CS=O.
2) if x>O, y<O then the arc is in the fourth quadrant,
CS=-90° .
3) i f x<O, y>O then the arc is in the second quadrant,
CS=90°.
4 ) i f x<O, y<O then the arc is in the third quadrant,
CS=180 ° .
Having calculated the value of z=x/y and the appropriate
quadrant, atan2(z) can be calculated from the series
expansion of atan(z), remembering that the radius of
convergence is Izl<l which allows rotation angles of less
than 45°. To extend this series expansion outside the
convergence radius it is sufficient to use the trigonometric
formula atan(z)=900-atan(1/z). In fact for Izl>l it is
(l/I Z I)<1·
Anyway, the expansion atan(z)=Ez 2n + 1 /(2n+l) does not
provide a good approximation for z values close to 1. The
errors measured in this range reach maximum values of 1%.
If the domain is limited to the interval ]0,1[, we can use
the optimized series expansion [17], valid for z<l and
extended to values z~l using the trigonometric equation
atan(z)=900-atan(l/z). In this way we calculate the
arctangent of a number less than 1.
So the atan2 program consists of calculating the quadrant
to which the result is to be related, evaluating x/y and
calculating the arctangent.
In the case of a numerator which is less than the
denominator, division in TMS32025 can be performed rapidly in
the following way:
ZALH NUM
RPTK 14
SUBC DEN
ZALH loads the upper part of the accumulator with the content
of the specified DMA by cancelling the lower part, while the
instruction SUBC performs conditional subtraction of the
content of the specified DMA from the accumulator [18]. Here
the output is in Q15.
The arctangent is calculated by a modified series
expansion, with optimized coefficients in such a way as to
reduce errors. The arctangent therefore becomes:
atan(z) = z·(a+z·(b+z· (c+z'(d+z'e»»
The loop which performs the necessary function is shown in
Fig. 7. It is supposed that parameters e,d,c,b and a reside
67

in that order in locations 200-204.

LRLK 2,>200 *
LAC *+,0,1 *
SACL W * Load W with parameter "e".
LARK 1,4 * Initialize Loop counter.
LT X
LOOP LARP 2
MPY W
PAC
ADD *+,15,1 * Add to accumulator the current
* register shifted 15, increment
* current register and load ARP
* with 1.
SACH W,l * the obtained result is converted
* into Q15 format.
BANZ LOOP

Figure 7. The main module of the program for atan2


computation.
After obtaining the result of the arctangent, it is
related to the appropriate quadrant by simply adding a
suitable previously calculated value (CS). Calculation time
is na = 130 clock cycles, whereas the same program with the
TMS32010 takes 185 clock cycles.
Example
The atan2 function, like sin and cos, frequently recurs in
various solutions for inverse kinematics, as in the form
proposed by Paul et alii [9] or the calculation of residuals
in the recursive method of Goldenberg and Lawrence [12].
In the latter method, the condition of convergence is
given by
R(q)= (Rx, Ry, Rz, R~, Re, R~) = 0

which at each step refers to calculation of the vector of


residual position
Rx = n B. (pt_pB)
Ry = OB.(pt_pB)
Rz = a B. (pt_pB)

and residual orientation, expressed for example by Euler


angles
R~ atan2[(OB· a t),(a B·a t )]
R9 atan2[«na·at)CosR~+(oB·at)sinR~),(aa·at)]
68

R~ = atan2[(-(na·at)sinR¢+(oa·nt)COSR¢),
(-(na'ot)sinR¢+(oa'ot)cOsR¢)]

where, the index t and a refer to target and actual frames, p


is the position vector and (n, 0, a) is the orientation unit
vector of the end effector.
Each recursive step requires computation of the inner
products and differences of vectors, execution time for which
are ni = 10 and nd = 23, and of the trigonometric functions
sin, cos and atan2.
Implementation of the sin function has an execution time
of ns = 151 clock cycles for angles ~90, and 155 for angles
>90. To be more precise, calculation of the residual vector
takes

1·nd+3·ni = 53 clock cycles for Rx, Ry and Rz


6'ni+2'ns+3'na = 752 clock cycles for R¢, R9 and R
thus each iteration requires 805 clock cycles.
3.1.3. Floating point matrix inversion
The inverse of a NxN matrix A can be achieved using the
classical Gaussian elimination method, since finding the A- 1
is equivalent to solve the system of linear equations AX=I
whith X the nxn unknown matrix.
The floating point representation of the information
allows to obtain accuracy in the results, which can vary over
a very high range.
The method consists in reducing the complete matrix
B=[AII] of N raws and 2N columns in such a way A is
transformed into the upper diagonal and then into the
identity matrix. At the end of this transformation, the
matrix that was initially I becomes A- 1 •
The flow diagram given in Fig. 8 illustrates the main
modules of the program follows an inizialization phase where
the matrix B is introduced in memory in order of raws. The
modules are:

DIVF which registers in the variables MPR and EPR


respectively the mantissa and exponent of the inverse of Akk
and puts 4000,1 in location Akk.
PROD which multiplies the whole of the k-th raw by the
value (MPR,EPR).
QWE which performs the operation
Aij = Aij Aik' Akj where i=k+1 .. N.
It calls the PRoe routine described in Fig. 9 which performs
the internal cycle
-Akj . Aik -> PR
PR + Aij -> Aij where j=k+1 .. 2N.
69

DATA INPUT
- INITIALIZATION -
!
r---+< k = 1.2•...... N
I

n
1 Alele= 4000 .. 21 1
DIVF

I ( MPR. EPR> = A~d

J = K+l. K+2 •... 2N>


PROD
IAlej = Alej"(MPR.EPR>1

-+( 1= K+I. K+2 • ... N >


OWE
AjIc => (W.O> 1
1

I
-+(J = K+ I. K+2 • .... 2N>

1- Alej" (W.O> => PR 1


PROC.
+
Ajj = Aij + PR

rl
1 1 SUM

CN~~-I ~:~ ...


2>
K =
1

J.
END
1 1

Figure 8. Flow diagram of the matrix inversion routine.

The floating point sum for calculation of Aij in each


iteration is performed by the routine SUM, which is not
explained here for the sake of brevity.
After the first phase, the initial matrix has been reduced
to the upper triangle with unity in the diagonal elements.
while the identity matrix is the lower triangle.
At this point, it is sufficient to repeat the routine QWE
N-1 times in order to reduce the initial matrix to a unit
one. At the end, the inverse matrix will appear in the place
of the identity matrix.
PROe uses as input values the content of the register
AR1=<Akj>, the variable PR = (MPR,EPR) and the content of the
register AR2=<Aij>. It uses AR3 as a loop counter and AR5 as
a temporary register.
The result of the product is to be found in the product
register PR in a Q30 format; to use it in the Q15 format a
70

1-bit shift to the left has to be made and the upper part
taken. This is performed by combining instructions SPM and
SPH.

PROC SPM 1 * Set to 1 two bits of PM field


* in the Status Register SP1.
LARP 1
LT *+
MPY MPR
SPH W
LAC *+,0,5
ADD EPR
SACL Q * store into Q the sum of exponents.
ZALH W * Store into W the product of the
* two mantissas with left shift 15.
BZ PRB
LAR 5,Q
NORM *-
SACH w
LAC W
SUB OTTO
BZ OT2 * Normalize the results and test
* overflow.
JP SAR 5,Q
LARP 2
CALL SUM, *+
BANZ LOOP
RET

RB LARP 2 **
ADRK 2 If the result is zero, then
LARP 3 * increment Auxiliary Register AR2
BANZ LOOP * to process next element without
RET * calling "SUM".

OT2 LALK >4000 **


NEG If the mantissa >8000 then
SACL W * shift right, set to COOO
MAR *+ * and increment the exponent to
B JP * avoid overflow.

Figure 9. The block which performs the Reduction of a Row.

SPH <v> stores the upper PR in the variable v, while SPM


<n> indicates how the product has to be shifted according to
the value of the constant n, as given by a suitable code.
This allows the result to be put into the desired format
without any increase in time.
Another important instruction used in this routine is
71

NORM, which allows us to work with a floating point.


Normalizing a number means shifting it to the left until a
number between 0.1 and 0.99999 or between -0.999 and -0.1 is
obtained, and obviously, to obtain the correct result,
decreasing the exponent by the same amount.
The instruction NORM *- executes an XOR on the last two
bits in the accumulator and if the result is 0 it sets the
Test Control (TC) bit to 0, decreases the current register
and shifts the content of the accumulator 1-bit to the left.
So if we put the mantissa of the number to be normalized in
the upper part of the accumulator and the exponent in the
current register, at the end of the instruction NORM we will
have the mantissa of the normalized number in the upper part
of the accumulator and the exponent in the current register.
One of the most common applications of matrix inversion is
the inverse of the Jacobian of the manipulator, which recurs
in various recursive solutions, such as that of Goldenberg et
alii, quoted above.

3.2. FFT
A typical DSP application concerns the Fourier Transform
which plays a key role in image preprocessing of robot vision
apparatus or for the spectral analysis in measurement and
control apparatus.
The spectral analysis of digital signals is performed in
the frequency domain. The Fourier transform converts
information from the time domain into the frequency domain.
When digital signals are in use, data need to be computed by
the Discrete Fourier Transform (DTF), given by:

N-1 nk
X(k) = E x(n) W k = O,l, ... ,N-1
n=O N
-j2rr/N
where W = e is known as the fase [19 ]
N

The direct computation of X(k) requires about N2 complex


operations, then for large values N the computation time
increases rapidly. Using the radix-2 FFT (Fast Fourier
Transform) algorithm developed by Cooley and Tukey, that use
Decimation in Time (DIT) method to compute DFT, the complex
operations to compute is approximately N/2 . 10g2 N
and then the computation time is considerably reduced.
Radix-2 DIT FFT algorithm [20] utilizes the symmetry and
periodicity properties, given by:

k k+(N!2)
Simmetry Property: W -W
N N
72

k k+N
- Periodicity Property: W W
N N
An N-point OTF can be evaluated through this algorithm as
a collection of 2-point OTFs or butterflies. If the X(K) is
an N-point OFT, the radix-2 OIT FFT al~orithm sequences
through a total of M stages, where N = 2, with N/2
butterflies per stage, giving a total of (N/2)log2
butterflies to be computed. The general radix-2 butterfly in
the m-th stage of an N-point OIT FFT is shown in Fig. 10.

Pm Pm + 1

Qm W<X:==Qm+l
Figure 10. Radix-2 butterfly.

The input/output equations are given by:

k
P m+l Pm + W Qm (3)
N

k
Qm+l Pm - W Qm (4 )
N

In general inputs and outputs of butterfly are complex


numbers, such as the twiddle factor, given by:

Pm PR + j PI (5 )

Qm QR + j QI (6)

k -j(2rr/N)k
W e cos (x) - j sin(x) (7)
N

where x = (2rr/N)k;

by substituting the values from (5), (6) and (7), the equation:

(3) and (4) become:

P m +l (PR+A) + j (PI+B) (8)

Qm+l (PR-A) + j (PI-B) (9)


73

where:

A QR cos(x) + QI sin(x)

B QI cos (x) QR sin(x)


The operations required by the butterfly routine are
represented in Fig. 11 where:

KF=(NPS/2)-l
DF=NPS/2 J=l

PR=XR(I)/2 PI=XI(I)/2
QR=XR(L)/2 QI=XI(L)/2

BUTTERFLY ROUTINE

XR(I)=PR' XI(I)=pr
XR(L)=QR' XI(L)=QI'

I,;~~)
~'- ..

YES

! CI=CI+11+!.----.-<: CI=CF

YES

YES

Figure 11. Flow diagram of the FFT routine.


74

LT XL
MPY COS
PAC
LT YL
MPY SIN
APAC
SACH XT,1 * XT XL*COS+YL*SIN
LT YL
MPY COS
PAC
LT XL
MPY SIN
SPAC
SACH YT,1 * YT YL*COS-XL*SIN
LAC XI
ADD XT
SACL XI * XI XI+XT
SUB XT,1
SACL XL * XL XI-XT
LAC YI
ADD YT
SACL YI * YI YI+YT
SUB YT,1
SACL YL * YL YI-YT

Figure 12. The program segment for Butterfly computation.

PR,PI and QR,QI are real and immaginary parts of the


first and second value before the butterfly, see
equation (3) and (4);
PRO,PIO and QRo,QIo are real and immaginary parts of the
first and second value after the butterfly, see equation
( 8) and (9);

The necessary program segment for butterfly computation


implemented on a TMS32010 is reported in Fig. 12. It must be
pointed out that this operation is characterized by an
algorithm which makes a great use of loops. FFT execution
time can be gratly reduced by avoiding the use of loops and
writing the program as a string of instructions, without
using counters, pointers or branches, in straight-line
coding. Unfortunately this means a greater memory consumption
which may be eccessive for large transforms (1024 points).
Fig.13 compares FFT execution times with straight-line and
looped programs.
75

Performance of FFT implementation on TMS320 10

Strai ght -1 i ne-coded Program versus Looped-coded Program

NUMBER OF POI NTS REQUI RED TI ME MEMORV SIZE

(poi nts) (micro Sec.) (words)

8 (S.U 35.2 182

8 (Looped) 158.4 140

16(S.U 108.8 554

1 6( Looped) 388 148

32(S.U 291 1474

32( Looped) 839 156

64(S.L.) 737 3722

64(Looped) 1901 164

Figure 13. FFT execution times with straight-line and


looped programs
4. Conclusions

In the previous sections the authors have presented a survey


of the functions offered by DSPs, pointing out their ability
to provide an efficient solution to problems presenting
highly complex computations, which are typical of robotics.
DSP applicability is obviously not confined to the topics
dealt with here, but can be extended to various other fields
of application (e.g. process control).
The authors' aim was not so much that of providing an
exhaustive solution to a particular robotics problem, as
discussing DSP design methodology.
References
[1] S. Magar, "Trends in Digital Signal Processing
Architectures", Wescon 82 Conference Record (1982).
[ 2] R. Simar, "Performance of Harward Architecture in
TMS320", Mini/Micro West 1983 Computer Conference and
Exibition (1983).
[3] K.McDonough, E. Caudel, S. Magar and A.Leigh,
"Microcomputer with 32-Bit Aritmetic Does High Precision
Number Crunching", Electronics, vol.55, No.4, 1982.
76

[4] L. Pagnucco, C. Garcia, "A 16/32-Bit Architecture for


Signal Processing", Mini/Micro West 83 Computer
Conference and Exibition, 1983.
[5] R.Simar, T.Leigh, P.Koeppen, J.Leach, J.Potts and
D.Blolock "A 40MFLOPS Digital Signal Processor: The
First Supercomputer on a Chip" Proc. of ICASSP 87, april
1987
[6] "ADSP-2100 User's Manual Architecture" Analog Devices
Inc. 1989
[7] K.Kloker, "Motorola DSP5600 Digital Signal Processor",
IEEE Micro, December 1986.
[8] R.N.Kershaw, L.E.Bays, R.L.Freyrnan, J.J.Klinikowssi,
C.R.Miller, K.Mondal, H.S.Moscovitz, W.A,Stoker and
L.V.Tran, " A Programmable Digital Signal Processor with
32-Bit Floating Point Aritmetic", ISSCC 85 Digest of
Technical Papers, Feb.1985.
[9] P.R.Paul, C.H.Wu, "Resolved Motion Force Control of
Robot Manipulators", IEEE Trans. Syst. Man, Cybern.,
vol.SMC-12, no.3, 1982.

[10] Kohli, D.Soni, "Kinematic Analysis of Spatial Mechanism


via Successive Screw Displacement" J. Engr. for Industry
Trans. ASME, vol.2, serie B pagg.730-747, 1975.
[11] J.Denavit, and R.S.Hartenberg, "A Kinematic Notation for
Lower-pair Mechanisms Based on Matrices", ASME J. Appl.
Mech. pp.215-221, 1955.
[12] A.A. Goldenberg and D.L.Lawrence, "A Generalized Solution
to the Inverse Kinematics of Robotic manipulators", ASME
J. Dynamic Syst., Meas., Contr., vol.107, Mar. 1985.
[13] T.L.Wang and B.Ravani, "Recursive Computations of
Kinematic and Dynamic Equations for Mechanical
manipulators", IEEE J. Robotics Aut., vol. RA-1, no.3,
Sept 1985.
[14] V.Strassen, "Gaussian Elimination is not optimal",
Numer. Math. no.13, pp.354-356, 1969.
[15] V.Pan, "Strassen Alghoritm is not optimal. Trilinear
Technique of aggregating, uniting and cancelling for
constructing fast algorithms for Matrix multiplication",
Proc. 19th Annual Syrnp. Found. Compo Science, Ann Arbor,
MI, 1978.
77

[16] K.S. Fu, R.C.Gonzalez, C.S.George Lee, "Robotics Control


Sensing Vision and Intelligence", McGraw-Hill, New York,
NY, 1987.
[17] D.E.Knuth, "The Art of Computer Programming: Volume2,
Seminumerical Algorithms", Second Edition. Reading MA:
Addison-Wesley, 1969.
[18] "TMS32025 User's Guide" Texas Instrument 1987

[19] R. Bracewell, "The Fourier Transform and Its


Applications" MCGraw-Hill, New York, NY, USA, 1987.

[20] H.W.Johnson, C.S.Burrus, "An in-order, in-place Radix-2


FFT" IEEE ICASSP Proc.28A, 1984.
CHAPTER 4

PARAlLEL PROCESSING OF ROBOT CONTROL


AND SIMULATION

Bironori Kasahara
Department of Electrical Engineering, Waseda University
3-4-1 Ohkubo Shinjuku-ku, Tokyo, 169, Japan

ABSTRACT
This chapter describes parallel processing of robot-arm control
computation and simulation. The parallel processing of robot control
computation has attracted much attention to develop a cost-effective,
compact, and advanced robot controller which allows a robot system to
perform very complicated operations quickly and accurately even in
hazardous environments, such as space and ocean floor. Also, parallel
processing of robot motion simulation is important to efficiently
develop the advanced robot hardwares and the controllers. This chapter
introduces several parallel processing schemes on multiprocessor systems
for the robot-arm control computation and the robot-arm simulation.
More concretely, first of all, parallel processing schemes for the
Newton-Euler equations for dynamic control are introduced. Secondly, how
to implement the schemes on an actual multiprocessor system is
discussed. Thirdly, parallel processing schemes for Walker & Orin's
algorithm for dynamics .simulation and an implementation of the schemes
on a multiprocessor system are described. Finally, future directions of
the parallel processing of the robot control and simulation are briefly
discussed.

1. INTRODUCTION

Along with the introduction of robot systems into CIM (Computer


Integrated Manufacturing) and hazardous jobs in space, ocean floor, and
radio active pollution areas, the demand for more advanced robot systems
which can quickly and accurately perform complicated operations has been
increasing. In order to develop the advanced robot systems, putting
more advanced control schemes to practical use and developing robot
motion simulators for supporting the efficient design of robot systems
are required. For example, making the dynamic control fit for practical
use has been desired for the development of advanced robot-arms because
the dynamic control allows a robot-arm to accurately trace a desirable
trajectory even if weight of a work load is unknown or variable (Luh
et.al.,1980; Luh & Lin, 1983). Also, need for a real-time robot
dynamics simulator has been increasing for efficiently designing a
complicated robot-arm hardware and for quickly evaluating performance of
the control schemes without an actual robot hardware.
The real-time computational loads required for the robot-arm dynamic
control or the dynamics simulation, however, is very large (Luh & Lin,
79
80

1983) . In other words, expensive mini- or supermini-computers having


high processing capabilities is often needed to implement real-time
robot-arm dynamic controllers and simulators. In order to improve the
processing capability and the cost performance of the controllers and
simulators, recently, much attention has been paid to parallel
processing of the robot control and simulation computation (Kasahara &
Narita, 1985a, 1987; Lathrop, 1985; Liao & Chern, 1985; Luh and Lin,
1983; Orin et.al., 1984; Vukobratovic et.al., 1987).
Several types of parallel processing techniques can be applied, such as,
the vector pipeline machines, the processor array computers, the data-
flow machines and the multiprocessor systems (Hwang & Briggs, 1984), for
the parallel processing of the robot-arm control and the simulation.
The vector pipeline machines, which has been used by many minisuper- or
super-computers, are not, however, suitable for parallel processing of
the robot control computation and the simulation computation because the
computation includes operations of a lot of small matrices, or short
vector operations. Generally, the vector machines can not efficiently
handle an operation of small matrices since the initial pipe-delay,
which is the delay time until the first data pass through the pipeline,
is relatively large compared with the total processing time of the
operation. The processor array computers are usually suitable for the
matrix operations though they have some problems to efficiently process
parts other than matrix operations and even matrix operations which
includes various sizes of matrices which does not fit the size of the
processor array. The data-flow machines, which has been known as a non
Von Neumann type computer, also may have difficulties to
deterministically manage sampling times for on-line real-time control
and simulation because data driven execution of the data-flow machines
is generally non-deterministic. The multiprocessor systems (Gajski &
Peir, 1985) appear to be suitable for real-time robot control and
simulation since they can process small matrices efficiently and manage
the sampling time. Furthermore, multiprocessor systems composed of
inexpensive microprocessors are cost-effective.
Taking into consideration above facts, this chapter mainly reviews
recent researches on parallel processing using multiprocessor systems of
robot dynamics computation for control and simulation.

2. PARALLEL PROCESSING OF ROBOT-ARM CONTROL COMPUTATION

This section describes parallel processing schemes of the Newton Euler


equations for robot-arm control and implementation of a scheme on an
actual multiprocessor system.

2.1 Robot-arm Control COmputation

Robot-arm control computation generally involves a lot of operations of


small matrices related with coordinate transformations. For example,
Newton-Euler equations for motion (Luh et. al. ,1980), which are often
used to calculate force or torque applied to each robot joint given a
desirable motion trajectory (the inverse dynamic problem), consists of
81

AOw;= I A :-I(A ?_IW;_I + ZoQ;), if link i is rotational,

I
, A;-I(A?_,w;_,), if link i is translational,

A ;-I[A ?_IW;_I + ZoQ; + (A?_ IWi-l) x (Zolii)],


A ow; = if link i is rotational,

, A :-I(A ?_Iwi-/l, if link i is translational,

(A ~w;) x (A ?pf) + (A ?w;) x [(A ?w;) x (A ?pil] + A :-'(A ?_Iv;-Il,

if link i is rotational.

A ~v;= A ;-I(ZoIl;+A ?_Iv;-ll + (A ?_lW;) x (A ?pfj

+ 2(A ?w;) x (A;-'Zoli;) + (A ?w;l x [(A ?w;l x (A ?pil] ,

if link i is translational,

A ?N, = (A ?J;A :J(A ?w;) + (A ?w;) x [(A ?J;A ~(A ?w;l]

A ?n;=A ;+'[A?+ In;+1 + (A?+ IPfj X (A?nh+ I)]

I
+ (A ?pi +A?s;l x (A?F;) + A?N;

(A ?ni)'(A ;-1 40) + b,qi' if link i is rotational,

T; = (A ?/;)' (A ;-1 40) + b,qi' if link i is translational.

• The subscript i stands for the ith link of the manipulator. The equations
above the dashed line are calculated repeatedly for i = 1 to n (the number of
links) and those below the line for i = n to 1 in the reverse direction. A;+ I and
Ji are 3 x 3 matrices; 41, qi' mi, bi, and 1', are scalars. The remaining variables
are all 3 x 1 vectors.

Fig. 1. Newton-Euler equations for robot-arm dynamic control.


82

many 3 by 3 matrix operations and 3 by 1 vector operations as shown in


Fig. 1. It has been known that the Newton-Euler equations has the
lowest time complexity among algorithms to solve the inverse dynamic
problem. The time complexity is O(n) where 0 is the order function and n
is a number of robot joints. In Fig. 1, the subscript i stands for the
i-th joint of the arm. The equations above the dashed line are
calculated repeatedly for i=1 to n and those below the line for i=n to
.
1 in the reverse direction.
. ..
In these equations, variables Ai L+ 1 and J i
are 3 by 3 matrLces, qi' qi' mi , b i , and ti are scalars and the
remaining variables are all 3 by 1 vectors. The parallel processing of
the Newton-Euler equations has been studied by many researchers since
Luh (Luh & Lin, 1983) proposed the parallel processing for the first
time. The next section reviews several research efforts in this field.

2.2 Parallel Processing Schemes of Robot-Arm Control Computation

Parallel processing schemes for the Newton-Euler equations which have so


far been proposed can be divided into two different approaches; namely,
hardware oriented approaches and software oriented approaches. In
hardware oriented approaches, dedicated parallel processing hardwares
are designed exclusively for the Newton-Euler equations. In software
oriented approaches, parallelizing software, like scheduling software to
assign the Newton-Euler equations onto parallel processors, is developed
for an ordinary multiprocessor system. In the following, these
approaches are briefly explained.

2.2.1 Hardware Oriented Approach

As an example of dedicated multiprocessor systems for Newton-Euler


equations, there is an architecture based on the pipeline concept
proposed by Nigam and Lee (Nigam & Lee, 1985). Here, the pipeline
concept means that a processing-unit is divided into m cascaded
subprocessing-units with the unit processing time and different data are
continuously input into one end of the subprocessing-units every unit
time. In this method, if the same calculation is repeated many times
for different data, the total processing time will be reduced to 11m
times as long as a sequential processing time. In order to extract such
a pipelining effect, Nigam and Lee analyzed a computational flow, or a
data flow (Padua et.al., 1980), in the computation for each robot joint
and represented the flow by a graph shown in Fig. 2. In this figure,
an equation and its computational amount are shown inside each block and
a data dependency between equations is represented by an edge between
the blocks. Next, these blocks are partitioned into seven groups, each
of which is assigned to a processor, having nearly equal computational
amount as shown in Fig. 3 ..
Taking into consideration a fact that neighboring processors need a lot
of data transfers, Nigam and Lee tightly connected neighboring
processors by a local bus between each neighboring processor-pair.
Furthermore, in order to allow data transfers among any processors, they
relatively loosely connected all processors by a global bus.
In this system, calculation for each robot joint is repeated for i=l to
83

2
[ 4

'-
5 6

"ll
Ajw; - A~v. - A~ta AO,
I I
- I I
-
-
~~~~·11
1 126 adds - 90 adds 54 7 8
54 adds
54 mutt. - 162
mull. ~Ir 108 mutt. 72

A<?w.
I I
r--
l[ A?n.
I I
t---
1
I

42 adds 130 adds f - - 12 adds


54 mult. 130 mult. 18 mult.

'-
- AjN;

103 adds
- 144 mult.

Fig. 2. Data flow of Newton-Euler equations analyzed by Nigam & Lee.

Mechanical manipulator

Fig. 3. An multiprocessor architecture proposed by Nigam & Lee.


84

n based on the pipeline concept and force or torque for each joint is
obtained every unit time in a steady state. This scheme is effective if
a robot-arm has a lot of joints, or degrees of freedom (O.O.F.). It is
not, however, efficient for present robot-arms having only five or six
joints, since the initial pipe delay makes processing efficiency low.
In order to solve this problem, more flexible parallel processing
schemes are required.

2.2.2 Software Oriented Approach

Key problems for parallel processing on a general-purpose multiprocessor


system are how to decompose the computation of the Newton-Euler
equations into several small parts of computation called tasks and how
to allocate, or schedule, the tasks to processors. This paragraph
describes software parallel processing schemes to solve these problems.

1) Decomposition of the control computation into tasks


The first step in parallel processing on a multiprocessor is to
decompose the control computation into tasks, which are small units of
computation to be allocated to parallel processors.
In the decomposition of the Newton-Euler equations, we must determine
the optimal task size from the following several possible task sizes, or
task grains. The first possible task grain is an equation level grain
in which each equation is defined as a task. The second one is a matrix
or vector operation level grain in which one or several matrix/vector
operations are treated as a task. The last one is a scalar operation
level grain in which one or several arithmetic operations correspond to
a task. The determination of the task size has a strong influence on the
performance of parallel processing (Gajski & Peir, 1985; Kasahara &
Narita, 1985a,b; Kruatrachue & Lewis, 1988). At a glance, the finest
task grain seems to give us the shortest parallel processing time since
it can extract the maximum parallelism. In practice, however, it does
not always give us the shortest processing time because it suffers from
large processing overheads such as data transfer overhead and task
synchronization overhead. On the contrary, if too coarse task grain is
chosen, only limited parallelism can be exploited though the overheads
are relatively small. Therefore, in order to get the minimum processing
time, the task grain must be decided taking into account the processing
capability of each processor element, the data transfer capability among
processor elements, synchronization overhead (Hwang & Briggs, 1984) and
the capability of the scheduling algorithm.
In this chapter, the equation level task grain, which was employed by
many researches (Luh & Lin, 1983; Kasahara & Narita, 1985), is chosen as
an example for explanation since the following parallel processing
scheme can equally be applied to other task grains. When we decompose
Newton-Euler equations for the Stanford manipulator with six D.O.F. into
tasks by using equation level task grain, we have fifty-four tasks. We
generate, however, eighty-eight tasks by decomposing several tasks with
large processing time into smaller tasks because large imbalance of the
task processing time sometimes causes load imbalance among processors.
There exist precedence constraints among the generated tasks caused by
85

data dependencies (Padua & Wolfe, 1986). In other words, if a vector


variable defined by task i is referred by another task j, task j must
wait completion of task i. The precedence constraints, or the
execution order constraints, among the tasks for the Newton-Euler
equations can be represented by a directed acyclic graph called the
task graph (Kasahara & Narita, 1984a,1985a) as shown in Fig. 4. Fig. 4
represents the task graph for the Stanford manipulator. In the task
graph, the nodes represent the tasks and the arcs among nodes show the
precedence constraints. For example, the arcs among nodes 1, 4 and 6
mean that task 6 must wait to begin execution until tasks 1 and 4 are
completed. The processing time of each task on a processor is attached
beside the node. Task processing time is not, however, a deterministic
value since the time for floating point operations may vary with the
values of operands or data. In light of this fact, an average task
processing time (Kasahara & Narita, 1985a,b) is used for the scheduling
described in the next section.
Once the task graph is generated by the data-flow among tasks, the
lower bound of the parallel processing time can be obtained as the
critical path length tcr of the task graph (Coffmann, 1976; Kasahara &
Narita, 1984a). Therefore, the most efficient parallel processing means
to attain tcr by using the minimum number of processors because we can
not obtain a shorter processing time than tcr even if more processors
are used.

2) Task scheduling
In order to process the set of tasks on a multiprocessor system
efficiently, an assignment of the tasks onto the parallel processors and
an execution order of the tasks assigned to the same processor must be
determined optimally. The problem which determines the optimal task
assignment and the optimal execution order can be treated as the
traditional multiprocessor scheduling problem of which the objective
function is the minimization of the parallel processing time, or the
schedule length (Coffmann, 1976; Kasahara & Narita, 1984a, 1985a). The
scheduling problem, however, has been known as one of the most
intractable optimization problems, for which no practical optimization
algorithms had been developed for nearly 20 years. More precisely
speaking, the problem belongs to the class of "strong" NP-hard problems
(Garey & Johnson, 1979).
In order to decrease the difficulty of the scheduling problem in hand,
Luh and Lin fixed the number of processors at six, which is equal to the
number of robot joints, and assigned tasks related to each joint onto a
processor. After this simplification, they tried to optimize only the
execution order of the tasks assigned to each processor by taking into
consideration the precedence constraints among tasks assigned to other
processors. Finally, they obtained a schedule which reduces the
processing time for the eighty-eight tasks from 24.8 [ms) for a single
processor to 9.67[ms) for six processors as shown in Table 1.
On the other hand, Kasahara and Narita successfully solved the original
intractable multiprocessor scheduling problem by developing two
practical algorithms named CP/MISF (Critical Path / Most Immediate
Successors First) and DF/IHS (Depth First / Implicit Heuristic Search)
86

C 0

Fig. 4. A task graph for Newton-Euler equations applied to Stanford


manipulator with 6 D.O.F ••
87

(Kasahara & Narita, 1984a). The former is a heuristic scheduling


algorithm and quickly gives us accurate approximate solutions even on a
microcomputer. Its time complexity is o(n 2 +mn), where 0 means the
order function, n is the number of tasks and m is the number of
processors. since CP/MISF is a heuristic algorithm, it is difficult to
know if the solution is optimal. The latter is a practical
optimization algorithm which combines CP/MISF and a branch & bound
method and allows us to obtain optimal solutions or approximate solution
with guaranteed accuracy in a few minutes on a main-frame computer. The
solutions given by DF/IHS are always more accurate than those by
CP/MISF. The proper use of the both algorithms depends on the following
factors; such as, required accuracy of solutions, processing capability
of a computer to be used for scheduling and a number of tasks to be
scheduled. For example, DF/IHS is more suitable for parallel processing
of the robot control because a robot controller is not frequently
modified after it is completed. Therefore, it is important to reduce
parallel processing time of the control computation by use of DF/IHS
even if DF/IHS needs long scheduling time. On the other hand, CP/MISF
is suitable for parallel processing of the robot simulation because
parallelized programs for the simulation are frequently modified
together with modifications of robot configuration during design of
robot hardwares. Details of CP/MISF algorithm are describes in section
3, which discusses parallel processing schemes of the simulation, though
details of DF/IHS are not given in this chapter for lack of space.
By using DF/IHS algorithm, Kasahara and Narita solved the original
problem with the eighty-eight .tasks for the Stanford manipulator for an
arbitrary number of processors. Table 1 lists the scheduled results by
DF / IHS, where the number of processor is varied from 1 to 7. The
schedules by DF /IHS are optimal or near optimal for all cases. Also,
the schedule length or processing time for three processors is shorter
than Luh' s schedule for 6 processors. Furthermore, if six processors
are used, the processing time by DF/IHS is 40 percent shorter than Luh's
schedule. The Gantt chart in Fig. 5 shows an optimal schedule for three
processor. In this figure, PI, P2 and P3 means processor No.,
horizontal axis of each row shows elapsed time and figures inside
rectangles represent task number. From the figure, we notice that an
optimal schedule sometimes contradicts human intuition.
In terms of scheduling of the Newton-Euler equations, Bahren, Toomarian
and Protopopescu (Bahren et.at., 1987) reported that they could have
almost the same length of a schedule for four processors by using neural
networks as that by the DF/IHS algorithm. This report is interesting not
only from an aspect of robotics but also of combinatorial optimization.
The application of the neural networks to the scheduling problem has
not, however, been very effective because it takes a long time to have
a good solution and also needs special know-how to adjust parameters of
neural networks.
CP/MISF and DF/IHS algorithms mentioned before do not take into
consideration data transfer overhead among processors explicitly because
they take care of relatively coarse grain tasks, namely equation level
task grain for the robot control. Scheduling algorithms should,
however, take in to account the data transfer overhead when finer grain
88

TABLE 1 Comparison of schedules by Luh et.al. and Kasahara et.al.

No. of processors Processing time (ms)

Luh & Lin Kasahara & Narita

1 24.80 24.83
2 12.42
3 8.43
4 6.59
5 5.86
6 9.67 5.73
7 5.69

53 55 62

61 37 65 32 I
66
! !

3.0 4.5 5.0

Fig. s. An optimal schedule for three processor& obtained by the DF/IHS


algorithm.
89

tasks are treated or when a multiprocessor system with low data transfer
capability is used. To this end, scheduling algorithms considering data
transfer have been actively researched (Kasahara & Narita, 1984b;
Kasahara et.a!., 1990) though development of practical optimization
algorithms is very intractable. Recently, a heuristic algorithm named
CP/OT/MISF, which is an extended version of CP/MISF to handle data
transfer, was developed. The effectiveness of the CP/OT/MISF algorithm
has already been confirmed on a multiprocessor system (Kasahara et.al.,
1990).

2.3 Parallel Processing on an Actual Multiprocessor System

As mentioned above, many types of theoretical parallel processing


schemes have so far been proposed. However, few parallel processing
schemes have been implemented on actual parallel processing systems
because implementation of special purpose parallel processing hardwares
is, generally, costly and time-consuming.
In this section, implementation of the software oriented parallel
processing scheme using the OF/IHS on an ordinary shared memory
multiprocessor is described as an example of successful parallel
processing on an actual multiprocessor system.

2.3.1 An Experimental Shared Memory Multiprocessor System

Fig. 6 shows the configuration of the experimental multiprocessor


system which was developed to evaluate practicality and usefulness of
the parallel processing scheme using OF/IHS (Kasahara and Narita,
1985a). This multiprocessor system was developed by modifying off the
shelf plant controllers. The system consists of seven Processor
Elements (PEs), a Common Memory (CM), a single bus, and a serial
communication unit (SCU) to control the data transfer between a Host
Computer and PEs. This system configuration is very simple in an
attempt to clearly evaluate the performance of the scheme using
the static multiprocessor scheduling algorithms. Each PE is
composed of a l6-bit microprocessor intel 8086, its co-processor for
floating-point operations 8087 and a local memory (16KB ROM, 32KB RAM).
The data transfers among the PEs is made via the common memory. The
bus access control is made by a distributed bus arbiter on each PE.
Each arbiter controls the bus accesses based on a priority which is
given a priori to each PE and SCU. The HC is a 16-bit personal
computer and connected to the multiprocessor unit by a serial
communication link (RS-232C).

2.3.2 Generation of Machine Codes

In order to execute the robot control program in parallel on the


multiprocessor system, we must generate parallel machine codes for
processors based on a schedule generated by OF/IHS. The machine code
for each PE mainly consists of executable codes for tasks assigned to
the PE and the codes for synchronization among the tasks allocated to
different PEs. The task codes are optimized so that the registers in
90

8087 can be used effectively. The synchronization among tasks is made


by the version number method because the synchronization problem in the
present case becomes a one-writer, some-reader problem. In the version
number method, a writer task, which writes the output data of the task
(the shared data) onto the common memory (CM) in order to send the data
to other tasks allocated to other processors, updates the version number
on the CM after writing the shared data. The reader tasks then confirm
the update of the version number and read the shared data. The version
number, which is a function of a number of iterations, is assigned to
each shared data on the CM. Every PE has the same version number
during one iteration, or during one sampling period, and increase the
version number independently after completion of execution of the tasks
for the iteration. In the code generation, the codes to set the
version number are inserted after writer tasks and the codes to check
the up-date of the version number are inserted before reader tasks. The
set and check of the version number causes synchronization overhead.
The overhead, however, can be minimized by removing redundant
synchronization codes (Kasahara & Narita, 19B5b, 19B7) detailed in
Section 3 .•
Here, it should be noted that the machine code for a PE is completely
different from the code for another PE. The generated machine code for
each PE is down-loaded onto the local memory of the PE and executed by
the PE in cooperation with the other PEs during an iteration. The
synchronization between successive iterations and management of sampling
time are performed by PE 1. PE 1 gives all PEs the trigger to start the
next iteration after it checks the termination of the execution of all
PEs and of the I/O.

2.3.3 Performance of Parallel Processing

This section describes the results of parallel processing of dynamic


control computation for the Stanford manipulator. Curve A in Fig. 7
shows schedule length of optimal schedules generated by DF/IHS and
curve B represents the processing time measured on the actual
multiprocessor system. Curve A can also be regarded as the minimum
processing time, or the ideal processing time, for each number of
processors without any overheads. Curve A reaches the critical path
length, which is the lower bound of parallel processing time, when five
processors are used. This fact shows that use of more than five
processors is unnecessary when the equation level task size is adopted
on the experimental multiprocessor system. Curve B is very close to the
ideal curve. This fact means that efficient parallel processing with
low overhead was realized on the actual multiprocessor system. In
addition, the measured processing time is reduced from 29.7[ms] for
single processor to IS.l[ms] for two processors, 10.1[ms] for three
processors and 6.7[ms] for five processors which is almost t cr • This
processing time of 6.7[ms] easily clears the longest sampling period of
1/60[s] for real time operation stated by Luh (Luh and Lin, 19B3).
From this result, we can see that we can easily implement the software-
oriented parallel processing scheme on a simple shared memory
multiprocessor system and can have good parallel processing performance.
91

GLOBAL

PEl PE2 PE7

Fig. 6. A shared memory multimicroprocessor system developed to evaluate a


software oriented parallel processing scheme.

III
E
Q)
20
E
~

OJ
c
'iii
en
Q)
u
...
0
a.
.J.
~ 10
iii
...
III 6.7
n. Ie( 6,S

3
2 4 5 6 7
Number of Processors
Fig. 7. Parallel processing time of robot dynamic control computation on
an actual multiprocessor system.
92

3.PARALLEL PROCESSING OF ROBOT DYNAMICS SIMULATION

This section introduces parallel processing schemes so far proposed for


robot-arm dynamics simulation. This section consists of description of
Walker and Orin's algorithm for robot-arm simulation, parallel
processing schemes for the algorithm and an example of implementation of
a parallel processing scheme on a multiprocessor system.

3.1 Robot Dynamics Simulation

The Walker and Or in's algorithm shown in Fig. 8, which is handled in


this section, is one of the most efficient algorithms with computational
complexity of o(n 3 ), where n is number of robot joints.
In Fig. 8, equations (1) to (12) calculate the bias vector, which is the
summation of gravity, Coriolis, centrifugal and external forces.
These equations are Newton-Euler equations. Let the joint (angle)
displacement q and the (angular) velocity q be the current value and set
the (angular) acceleration q equal to zero. Then equations (1) to (8)
are calculated repeatedly for i=l to the number of links of the
manipulator n as indicated by "do i=1 to n" and "end do" in the
figure and equations (9) to (12) for i=n to 1 in the reverse direction.
In equations (13) to (28), the inertia matrix H is computed. By solving
the linear equations (29) using the bias vector and the inertia matrix
H and the torques and the forces applied to the joints, the (angular)
acceleration of each joint q is computed. Then q and q at the next time
instance are calculated by numerical integration.
Therefore, this algorithm includes a lot of small-matrix operations,
namely, operations of 3 by 3 matrices and 3 by I vectors, as well as
the robot control, solution of linear equations and numerical
integration. In addition, the computational amount of the simulation is
much larger than that of the control.

3.2 Parallel Processing Schemes of Robot-Arm Simulation

For parallel processing of the Walker and Orin's simulation algorithm,


both hardware oriented and software oriented parallel processing schemes
have also been proposed.

3.2.1 Hardware oriented parallel processing scheme

As a hardware oriented approach, Liao and Chern have proposed a VLSI


array processor named CBAP (Cross-Bus Array Processor) as shown in Fig.
9 for parallel processing of the Walker and Orin's algorithm (Liao &
Chern, 1985). They reported that processing time of the simulation
program is theoretically reduced to one third of sequential processing
time by using 6 by 6 CBAP, namely, 36 processing elements. Practicality
of the CBAP is not , however, clear because the CBAP has not been
implemented. In other words, we can not know that CBAP suffers from how
large processing overhead caused by data transfer among processors and
synchronization which usually deteriorates parallel processing
93

do 1=1 lo n,
Al",= {AI·tA!.,W,.,+ .. oi,) (lInk I : ROlallonall(l)
A:-tA:.,w,.,) <link I : Tr:wlallonal)(2)

ATl'4I=(AT J ,A~) (AT c:,,)+(ATw,)x «AT J ,Ah(AT WI)) (8)


end do;
do I", 10 I:

A: f,"'AnA:., f,.,HA1F, (9)

A! n I=A:' \AT., n ,,' + (AT., p ,»)( (AT., f't'»


+IAh,+Al .. )XIAlF,HAlN, (10)
bl JIA:n,)lIA:·' •• )+b,~, (R)(l1l
• '1= t<AT f,)T(A:- ' z.>+b, ~I mm)
end do:

MII=mll (13)

A:'IDII=A:.dA!'n+A!p,,) (14)

A:"EIIAr' =A:'I (A!EIIA:)A:-! (15)


d. j",-I I. I:
MJ=M,..+mJ (16)

A~-1 0 I=A:.a<A~ oJ)


1 1
=A ., (M,lm, (A, .,+A, p,)+M,., lA,•
., •
o,.,+A, p,lI) (17)

Al·, E,AI·'=Al·, IA1E,.,AI+AI J ,AI


+M,., ((AI o,.,+A!p,-A! o')'IAI o",+Alp,-Al a') I
-IA! 0 ,., +AI p ,-Ala,) IAI c,., +AlP,-A! 0 11'1
+rn, I(A~ 'I+A' PI-A' c ,). (A~ .,+A' PI-A' 0 ,) 1
- (Al • ,+A' PI-A' 0 .)(Al. J +~, p , .... A' a ,)'1) Ai" (18)

,-I F ,= {&IXM,A'-,O'
(Ill (IS)
A'
(00 M,)I ml2O)
A''-I N,_- {IAI.'E,AI")" (Il) em
0 m(22)

A~-,l ,=A"1 F, (23)

A'-, n,=A'-, N,+<A' .. 0" X(A'-, F,> (24)


do I=j-I 10 I;

A~-I f ,=A: .• (A~ f 'f') (25)

AT .• DI=AI .• (ATn,.,+(AT p,)X(AT f ,f')) (26)


end do;
end dOi
d. j", I. I:
do I=J 10 I:

H 1,= { IA1.,n,1. (Jl) (27)

(AT •• til. m(28)


end dOj
end do;
H'q=t-bi •• (29)

.tte" n ; tud»er or Unk. bl,Ml,ml :Icllar


A~tJhEI,1 : 3X3utrlx H I nXn .. Irll
WI, W I, V I, VI, P I, • I, F I, N I, f I, n I, 0 j: 3 x 1 VIC tor

t,bi'l,q,q,q :nX1Yeolor ZI=(OOI)1


Fig. 8 Walker & Orin's algorithm for robot-arm dynamics simulation.
94

performance.

3.2.2 Software oriented parallel processing scheme

The software oriented parallel processing scheme using static


multiprocessor scheduling algorithms mentioned in the previous section
can be also applied to parallel processing of the Walker and Orin's
algorithm. In parallel processing of the simulation, simulation
program, or parallelized machine code, should be re-generated frequently
together with modification of simulated robot configuration. Therefore,
we need a user-friendly compiler which automatically generates parallel
machine codes for a given robot configuration. More concretely, the
special purpose parallelizing compiler generates a sequential simulation
program for the given robot configuration, decomposes the program into
tasks, generates a task graph, schedules tasks to processors, and
generates optimized parallel machine codes for processors (Kasahara &
Narita, 1987) .

1) Task Generation
The simulation program, or equations to be calculated for the
simulation, are automatically derived from Walker and Orin's algorithm
by the special purpose compiler when the robot configuration, e.g. the
degrees of freedom, the type of joints, the length of links etc., is
given. The first step in parallel processing is to decompose these
equations into tasks with suitable sizes. The suitable task grain
should be determined taking into account the various factors mentioned
in the previous section. The equation level task grain is again chosen
taking into consideration the capabilities of the multiprocessor system
shown in Fig. 6.
In generation of tasks by the compiler, the following optimization are
made:
a) In order to extract parallelism to the full extent, an array
expansion technique, which is a variation of scalar expansion
technique (Padua & Wolfe, 1986), has been used. It transforms a
program part to be executed serially by output dependencies (Padua,
Kuck & Lawrie, 1980), into a parallelizable program by substituting
a vector variable by a two-dimensional array variable.
b) The equations, which can be computed at compile-time, are computed by
the compiler to minimize the computation to be executed at rue-time.
c) The Gauss-Jordan method with pivot selection has been used to solve
the linear equation. The dividing operations by the pivots and the
up-date operations for each row have been treated as a task to
generate large-size tasks.
d) For numerical integration, Euler, Trapezoidal, and 3-rd and 4-th
Adams-Bas.hforth methods are provided. A user can choose a desirable
integration method. The integration calculation for each variable is
treated as a task.
There exist data dependencies, or precedence constraints, among the
generated tasks for the simulation, as well as for the control. Fig.
10 represents the task graph for the Stanford manipulator. It involves
197 nodes.
95

0,

0,

0,

Oul~ut I \ Inpul Bu,


Bus Re9ister I,
• Reqister

Fig. 9 A VLSI array processor proposed by Liao and Chern.

Fig. 10 A Task graph for robot dynamics simulation


96

2) Scheduling
The generated tasks are scheduled onto processors by using DF/IHS and
CP/MISF. As mentioned before, CP/MISF is more suitable than DF/IHS for
the parallelizing compiler for the simulation because we would like to
have a solution quickly on a microcomputer.
CP/MISF algorithm is presented briefly below.
Step1: Determine the "level" for each task. The level Ii of task Ti is
defined as the longest path from the exit node to node Ni
corresponding to T i •
Step2: Construct the priority list in the descending order of Ii and the
number of immediately succeeding tasks.
Step3: Execute list scheduling (Coffmann, 1976), namely data driven type
scheduling, on the basis of the priority list.
The worst-case performance of CP/MISF, i.e., the error of the worst-case
solution t obtained by CP/MISF from the true optimal solution t opt is
given by t«2-1/m)t opt . CP/MISF gives us a good schedule for the robot
simulation in a few second on a 16 bit personal computer.

3) Code Generation
The compiler on the HC in Fig. 6 generates the machine codes to be
executed on each PE based on a scheduled result. The code generation
process for the simulation is almost the same as for the control. Here,
a code optimization technique to reduce synchronization overhead is
introduced which was not detailed in Section 2.
For example, let tasks A, Band C be allocated to processor 1 and tasks
D and E to processors 2 and 3 respectively as shown in Fig .11. Then,
task B does not need to check the version number set by task A. This
stems from the fact that both tasks are allocated to the same
processor. Also, task C and E have no need to check the version number
set by task D since the termination of task D has been confirmed by task
B. The elimination of redundant synchronization codes reduces
synchronization overhead significantly.

3.3 Parallel Processing of Simulation on a Multiprocessor System

This section introduces performance of parallel processing of dynamics


simulation of the stanford manipulator on the multi-microprocessor
system in Fig. 6 when the parallel processing scheme using CP/MISF is
applied to the simulation. In this case, the special purpose compiler
for parallel processing of the simulation on the HC, or 16 bit
microcomputer, automatically generates generates parallel machine codes
and downloads the codes onto the local memories on PEs after a robot
configuration is input by a user.
Curve A of Fig. 12 shows the theoretical scheduled results by CP/MISF.
Curve B shows the processing time measured on the multiprocessor system.
The scheduled results again express the ideal processing times without
synchronization overhead for each number of processors. Curve A reaches
the critical path length for six processors. Curve B is very close to
the ideal curve. Furthermore, the measured processing time shown by
97

Pi P2 P3
~ Precedence
relations
FS Flag set
Fe Flag check
o Unnecessary

Fig. 11 Minimization of synchronization overhead.

100

O~--~--~--~---J----~--~-
I 2 J 4 5 6 7
Number of processors
Fig. 12 Pa·rallel processing time of robot dynamics simulation on an actual
multiprocessor system.
98

Curve B is reduced from 84.97[ms] for single processor to 44.40[ms] for


two processors, 30.24[ms] for three processors and 17.93[ms], which is
almost t cr ' for six processors. This processing time of 17.93[ms] almost
clears the longest sampling period of 1/60[s] for real time operation
although the experimental multiprocessor system has only low processing
capability. If the 5[MHz] clock rate of 8086 is changed into 8[MHz], the
processing time may be reduced by 11 [ms] . In addition, the processor
utilization factor is very high; 96% for two processors, 94% for three
processors and 79% even for six processors by which the critical path
length is almost realized.

4.Summary and Future Directions

This chapter has described several parallel processing schemes for the
robot dynamic control computation and the robot dynamics simulation.
Especially, the software oriented parallel processing schemes using the
static multiprocessor scheduling algorithms are theoretically and
practically efficient and can be easily implemented on a simple
inexpensive multimicroprocessor system.
As prospects for future, it is expected that a multimicroprocessor robot
controller will be more important to realize a compact powerful
controller to control an advanced autonomous robot system having various
sensors including a vision sensor and an auditory sensor and
intellectual functions using neural networks. Several research efforts
have been performed in this direction. For example, the HERMIES project
developing such a controller by using a hypercube multiprocessor on the
market (NCUBE) with sixty-four processors (Bahren, 1987). Also, the
OSCAR project developing an advanced shared memory multiprocessor
system composed of sixteen 5-MFLOPS RISC processors and software
oriented parallel processing schemes for the robot control and
simulation, neural network simulation, logical programs and numerical
programs (Kasahara et.al., 1990).
Finally, researches on parallel algorithms for the robot control and
simulation also seem to be getting more important (Fijany and Bejczy,
1989a,b) together with advance of multiprocessor systems and
parallelization techniques of the sequential control and simulation
algorithms mentioned in this chapter.

REFERENCES

Bahren, J., Toomarian, N. and Protopopescu, V.(1987), "Optimization of


the computational load of a hypercube supercomputer onboard a mobile
robot," Applied Optics, Vol 26, No 23, pp 5007-5014.
Coffman, E. G. (1976), "Computer and Job-shop Scheduling Theory" New
York: Wiley.
Fijany, A. and Bejczy, K. (1989a), "Parallel algorithms and
Architecture for Computation of Manipulator Inverse Dynamics", Proc.
Workshop on Parallel Algorithms and Architectures in Robotics of
1989 IEEE Int. Conf. on Robotics and Automation.
99

Fijany, A. and Bejczy, A. K. (1989b) " A Class of Parallel Algorithms


for Computation of the Manipulator Inertia Matrix," IEEE Trans.
Robotics & Auto., Vol. RA-S, No.5, pp.600-61S, Oct. 1989.
Gajski, D. D. and Peir, J. K.(l98S) "Essential issues in multiprocessor
systems," Computer, Vol 18, No 6, pp 9-27.
Garey, M. R. and Johnson, D S, (1979) "Computers and Intractability: A
Guide to the Theory of NP-Completeness" San Francisco: Freeman.
Hwang, k. and F. A. Briggs,(1984),"Computer Architecture and Parallel
Processing," McGRAW-HILL.
Kasahara, H. and Narita, S. (1984a), "Practical multiprocessor
scheduling algorithms for efficient parallel processing," IEEE
Trans. Comput. Vol C-33, No 11, pp 1023-1029.
Kasahara, H. and Narita, S. (1984b), "Load distribution among real-time
control computers connected via communication media," Proc. IFAC
9th World Congress, Pergamon Press: Oxford.
Kasahara, H. and Narita, S. (198Sa), "Parallel processing of robot arm
control computation on a multimicroprocessor system," IEEE J. of
Robotics and Automation, Vol RA-1, No 2, pp 104-113.
Kasahara, H. and Narita, S. (198Sb) "An approach to supercomputing
using multiprocessor scheduling algorithms," Proc. IEEE 1st Int.
Conf. on Supercomputing Systems, pp 139-148.
Kasahara, H., Fujii, H. and Iwata, M.(1987), "Parallel processing of
robot motion simulation," Proc. 10th World Congress, Pergamon
Press:Oxford.
Kasahara, H., Honda, H., Narita S., (1990) , "Parallel Processing of
Near Fine Grain Tasks Using Static Scheduling on OSCAR (Optimally
SCheduled Advanced multiprocessoR)", Proc. IEEE ACM
Supercomputing'90.
Kruatrachue, B. and Lewis, T. (1988), "Grain size determination for
parallel processing," IEEE Software, No 1, pp23-32.
Lathrop, R. H.(198S), "Parallelism in manipulator dynamics," Int. J. of
Robotics Research, Vol 3, pp 80-102.
Liao, F. Y. and Chern, M. Y.(198S), "Robot manipulator dynamic
computation on a VLSI array processors," Proc. 1st IEEE Int. Conf.
on Supercomputing Systems, pp 116-121.
Luh, J.Y.S., Walker, M.W. and Paul, R.P.C. (1980),
100

Padua, D. A, Kuck, D.J. and Lawrie, D. H. (1980) "High-speed


multiprocessors and compilation techniques," IEEE Trans. Comput.,
Vol C-29, No 9 pp 763-776.
Padua, D.A. and Wolfe,M. (1986), "Advanced Compiler Optimization for
Supercomputers," C.ACM, Vo1.29, No.12, pp.1184-1201.
Vukobratovic, M. K., Kircanski, N. M. and Lee, S. G.(1987), "An
approach to parallel processing of dynamic robot models," Proc. 10th
World Congress, Pergamon Press: Oxford.
Walker, M. W. and Orin, D. E.(1981), "Efficient dynamic computer
simulation of robotic mechanism," Proc. Jt. Automatic Control
Conf ..
PART 2

ROBOTIC CONTROL AND VISION


CHAPTER 5

MICROPROCESSOR-BASED CONTROLLERS
FOR ROBOTIC MANIPULATORS

JA Tenrefro Machado and J.L. Martins de Carvalho


University of Porto, Faculty of Engineering
Department of Electrical and Computer Enginneering
Rua dos Bragas, 4099 Porto Codex:
Portugal

ABSTRACT. Robotic manipulators pose a challenging problem to


control system theory. Classical controllers, such as the
well known PID controller still used in present day indu-
strial robots, are inadequate for high performance manipu-
lators. This problem motivated the emergence of a new class
of controllers based on different concepts, namely non-linear
and adaptive controllers. Unfortunately, these complex algor-
ithms require powerful computer structures. While monoproces-
sor systems may be non-economical, multi-microprocessor
architectures are still in a research stage and the total
computational efficiency is far from desirable. To surpass
these limitations new controller structures were devised.
Some are based on the reformulation of the overall control
concepts. In this line of thought methods like sliding
controllers and learning controllers are being actively
investigated. In the former the algorithms are easy to
implement using standard hardware and applications have
already been reported. With respect to learning controllers,
they are still in a research stage; nevertheless, preliminary
results indicate that a considerable computational reduction
can be achieved by a more intelligent use of the microproces-
sor memory. Alternative strategies based on a more sound
allocation of the computing tasks were also suggested:
Techniques such as multirate sampling, preview schemes and
dedicated compilers have achieved promising results. This
chapter focuses on the aforementioned control methods having
in mind its real-time implementation on microprocessor-based
structures.

1. I NTRODUCT I ON

Since the advent of robotic technology manipulator control


has been an area of active research. These mechanical devices
composed of several links connected through revolute or
103
1~

prismatic joints, have complex dynamic phenomena which make


difficult the development of efficient controllers. Classical
systems such as the well known (linear) PID controllers are
still used in present day industrial robots; however, they
are inappropriate for high performances. In fact PID systems
lead to limited path tracking accuracy and may exhibit
vibrations at high speeds. The reduced efficiency of these
machines, when compared with the human arm, motivated the
appearance of controllers based on different concepts. Des-
pite the large variety that as been reported until now, the
technical literature is scarce on issues such as:
Experimental data and implementation details of
microprocessor control of industrial robots.
Comparison of control systems from the point of view
of resulting performances.
Evaluation of the actual computational load posed by
different algorithms and its dependence on issues such as
programming language and computer architecture.
Part 1 of this chapter reviews the current state-of-the-art
in microprocessor-based controllers. Part 2 presents general
techniques amenable to real-time implementations. By putting
together these techniques, which are scattered through out
the literature, we hope to clarify the issues arising from
technology limitations and those dependent on the micropro-
cessor control capabilities.

2. SURVEY OF MICROPROCESSOR-BASED CONTROLLERS

Robot controllers proposed so far can be divided into the


following categories:
• Linear Controllers
• Nonlinear Model-Based Controllers
• Variable Structure Controllers
• Adaptive Controllers
Learning Controllers
We begin with a description of each one, followed by a
comparison of their performances and complexity.

2.1. Linear Controllers

Linear PID schemes were the first choice for the manipulator
control problem. In fact, the vast experience accumulated in
the process industries with PID controllers naturally led to
this option. In this scheme each joint controller is indepen-
dent of each other and its tuning is done by model free
heuristic procedures. Coupling between joints is regarded by
the controller as disturbances. Consequently, the higher the
robot speed/acceleration the higher the coupling effects and
the worst the controller performances. Yet, besides its
fundamental limitations, this controller exhibits two charac-
teristics - simplicity and decentralized architecture - that
105

explain its widespread use in industrial robotics. In fact,


these properties make its real time implementation straight-
forward with present day microprocessors. Usually, the imple-
mentation follows a simple "master-slave" architecture: one
master processor evaluates the reference trajectories while a
slave processor, on each joint, computes the control law.
Linear strategies are appealing because they are easy to
understand and to analyse. At present they are still an
object of research, and can be divided into three main
categories:
• Modification of the robot mechanical structure.
• Integration of sensory information.
• Improvement of the control algorithm.
In the first case performance improvement is achieved by
modifying the mechanical characteristics. Simplification of
the non-linear dynamics allows the efficient use of linear
controllers [1-3]. Examples are the use of counterwights [4-
5], closed loop mechanical structures [6-7], pendulum robots
[8-9] and parallel robots [10]. Other studies [11-12] empha-
sizing the redesign of the mechanical structure suggest that
many non-linear terms of the manipulator dynamics can be
eliminated. However, it is still not clear the connection
between such approaches and more general kinematic/dynamical
optimization criteria [13-17].
The second approach is a shift of the control problem
towards greater sensory needs. In other words, one exploits
sensory redundancy as a means to improve performances, albeit
keeping a low computational complexity. We mention, for
example, studies making use of acceleration [18-21], torque
[22-23] and electric current [24].
The third category aims at the improvement of the algorithm
itself. Theoretical and laboratory experiments [25-27] and
simulation studies [28] have been reported. Recent implemen-
tations [29-31] have shown that good performances can be
attained with no additional complexity.

2.2. Non-Linear Model-Based Controllers

The explicit consideration of the kinematic and dynamic model


of the robot manipulator allows more sophisticated control
strategies than the preceding ones. Non-linear Model Based
Controllers use such models and, by means of appropriate
feedforward and feedback loops, they provide an approximate
overall decoupled linear performance [32-33].
The most popular non-linear control structures can be
grouped into three categories:
• Feedforward Control
• Computed Torque Control
• Resolved-Acceleration Control
The later is designed to work in the operational space while
the first two deal with joint space coordinates. Examples of
microprocessor implementations can be found in [34-38].
106

The feed forward control schemes [39) use the inverse


dynamic model to compute a feedforward torque; linear control
actions can then be implemented by means of feedback.
Experimental evaluations of this method and its connection
with the computed-torque method are reported in [40-42).
The computed-torque method [32] also uses the inverse
dynamic model but inside feedback and feedforward loops. With
an error-free parameter estimation the dynamics becomes that
of n uncoupled second-order systems, where n is the number of
joints. Deviations from this ideal situation are unavoidable
and this has stimulated several experimental [41,42,48-49)
and simulation supported theoretical studies [43-47).
The resolved-acceleration control algorithm [50) being
designed for the operational space is more difficult to
analyse and to implement in real-time.
The disadvantages of the above methods lies in the computa-
tional burden posed by the mathematical model which dictates
low sampling frequencies and, consequently, losses of con-
troller performance [47,51]. This state of affairs has
spurred the development of more efficient computational
schemes. In essence these efforts have been addressed to:
• The mathematical model itself, or
• Multi-processor based computational structures.
In the first case, and in order of inception, we have
numerical-recursive algorithms based on the Lagrange-Euler
(LE) [52] and Newton-Euler (NE) [53] formalism, hybrid
algorithms i. e. a mixture of numerical and symbolic calcula-
tions [54] and computations via symbolic computing [55-60).
In the second case the suggested structures take advantage of
parallelism in the on-line computations [61-67]. In this
spirit we also mention recent studies addressing the use of
special microprocessors [68) and transputers [69-70). Further
improvements are to be expected with new parallel structures
based on symbolic expression calculations [71].

2.3. Variable Structure Controllers

The idea of variable structure systems (VSS) appeared about


three decades ago and since then it has found application in
the control of aircraft, power system, robots and other
plants.
The most salient feature of this method is the ability to
change the controller structure in accordance with the plant
state. Associated with this change of structure, control
actions lead the plant into the so called sliding mode (72).
While the system remains in the sliding mode it is insensi-
tive to disturbances and parameter variations. Besides, the
dynamic characteristics of the sliding mode are imposed by
the controller designer and are independent of the plant
dynamics.
Such properties make this technique rather attractive for
robot control: insensitivity to disturbances (while in sli-
107

ding mode) eliminate link interactions, therefore allowing a


decentralized controller structure and simple models for
controller design [73-74]. Unfortunately there are still
practical problems that must be overcome. The first concerns
the reaching phase; in fact convergence in only ensured when
the system enters the sliding mode, but there is no guarantee
that the system will always reach it. The second arises when
the system is in sliding mode: unless the dynamic character-
istics of the sliding mode coincide with one of the robot's
natural modes, the controller has to change its structure
infinitely often. Because, in practice, only a finite number
of switches is possible in a finite time interval, the robot
will not follow exactly the desired trajectory; instead it
will exhibit high frequency oscillations around it. This
phenomenon, also known as "chattering" may excite undesirable
resonances in the mechanical structure.
A variety of techniques to attenuate these problems, and
amenable to microprocessor implementation, have been sugge-
sted in the literature. We mention for example, the use of a
simplified robot model [75], several "smoothing techniques"
of the sliding mode [76-80], the use of feedforward [73,81],
variable structure model following control [82] and the
integration of acceleration feedback [83]. Experiments [73,
78-81] have confirmed the usefulness of the above techniques
and demonstrated the aptitude of the VSS algorithms to
microprocessor implementation.

2.4. Adaptive Controllers

The fundamental difference between fixed feedback control


systems and adaptive control systems resides in the fact that
the later "adapt" their characteristics to the changing
dynamics of the controller process.
General purpose adaptive algorithms can be classified as
model reference adaptive control (MRAC) and self-tuning
adaptive control (STAC). The basic idea of MRAC is to
synthesize the joint torques which force the robot to behave
in a desirable manner as prescribed by a reference model. In
STAC the robot is modelled as a linear, time-varying,
discrete-time process, which is updated (identified) at every
sampling instant, and on the basis of which the controller is
designated to achieve a prescribed goal.
In the case of rigid robots it appears that dedicated
adaptive algorithms have advantages over general purpose ones
[84]. Because rigid robots can be modelled fairly accurately,
this allows the implementation of simpler adaptive schemes.
The adaptive algorithms can then be used only to compensate
for model inadequacies. Significant examples can be mentio-
ned, namely, Model Reference Adaptive Systems [85-88], Self-
Tuning Controllers [89-91], adaptive controllers using feed-
forward [92-95] and the adaptive version of the computed
torque method [96]. Unfortunately, and to the best of the
108

authors knowledge. most of the implementations of the above


techniques on actual mechanical manipulators [87.91.93.95-99]
have shown that further work is still necessary before
adaptive robot control starts to have some industrial impact.

2.5. Learning Controllers

The ease of human beings in learning. executing and control-


ling fast and accurate movements. has spurred interest in
control structures of the human brain.
Pioneering work based on models of the cerebellum for arm
control [100-101]. and later resumed [102-106] seem natural
in the present wave of interest in neural networks [107-109].
A different approach has been adopted by other researchers.
this time based on knowledge-based control schemes [110-113].
All of these strategies display the ability of not only
learning a given trajectory but also to generalize that
knowledge to other trajectories.
Algorithms based on non-biological principles. displaying
learning capabilities. however without being capable of
generalizing it [114-116]. seem to be more amenable to
implementation in microprocessor-based systems. In fact the
mentioned strategies barter real-time arithmetic capabilities
(which are important in conventional controllers) for big
memory capabilities. required to store information instead of
computing it. However. neural networks seem even to question
the fundamentals of present day microcomputers. The second
strategy. based on the use of RAM in conjunction with the
application of iterative algorithms. is. from inception.
better adapted to the standard microprocessor structure.
Finally we mention fuzzy rule control schemes [113.117]. A
fuzzy rule-based controller does not exhibit. in itself. such
learning capabilities. rather it is capable to integrate in
its structure the experience gained by human operators.
Therefore. we can say that here there is also a (indirect)
transfer from arithmetic capabilities to memory requirements
to handle the set of appropriate fuzzy rules.

2.6. Controller Performance Comparison and Complexity


Evaluation

Comparison of controller performances and computational load


evaluation are important from both practical and theorectical
viewpoints. However. the answer to these questions is not
easy and the studies addressing this problem are scarce. This
can be explained by the large number of factors involved.
namely:
• The nature of the task.
Intrinsic kinematic and dynamic robot characteristics.
Controller capabilities .
• Sensor characteristics.
This multitude of factors has led to comparisons based only
lW

on either path tracking accuracy or algorithm computational


load, the other factors not being explicitly considered.
With reference to the comparisons of kinematic and dynamic
performances, with emphasis on path tracking accuracy, we
mention (34,40,41,42,48,49,98,99,118], while an evaluation of
the computational load of several algorithms can be found in
(34,35,119]. A closer look at these studies reveals that a
great deal of effort is still necessary before we can handle
the above issues under a unified framework.

3. TECHNIQUES TO IMPROVE MICROPROCESSOR-BASED CONTROLLER


IMPLEMENTATION

As discussed before, the greater the complexity of the


controller the greater the computational requirements. It is
then appropriate to question to what extent it is possible to
implement a given algorithm. In the affirmative case the
question still remains about the most adequate techniques or
strategies to do the job. In order to answer these questions
we state the problem as follows:
Evaluation of the computational load with respect to
the different logical and arithmetic operations. Evaluation
of the capabilities of several computing systems (software
and hardware).
Techniques to improve the real-time performances of
the controller.

3.1. Evaluation of the Computational Load and of


Software/Hardware Capabilities

The evaluation of the computational load required by an


algorithm, as well as the capabilities of a given computatio-
nal system (software/hardware), are essential steps in any
preliminary study regarding its future implementation.
Many of the algorithms suggested in the literature neglect
these points, and no assessment is made about computational
requirements. Some studies take into account the computatio-
nal load based on the required number of sums and multiplica-
tions, while others only mention the maximum sampling fre-
quencies achieved after the algorithm has been installed in a
given system. Obviously any of these approaches is far from
satisfactory, because they do not take into account all the
"factors" involved. The difficulty of the problem lies
precisely in the large number of factors involved, such as
the type of microprocessor, clock frequency, memory wait
states, type and version of the compiler, type and accuracy
of the operations, etc.
Although not considering all possible combinations of
operations/compiler/processor/coprocessor, the data displayed
in Tables 1-5 attempts to clarify these issues. They show the
minimum and maximum times required by each arithmetic or
TABLE 1. Computation times for the IBM PS/2 series o

Computer PS/2-8530-021 PS/2-8560-071 PS/2-8570-A21


-~~

!--.

OS MSDOS 3.3 MSDOS 3.3 MSDOS 3.3

Clock 8 MHz 10 MHz 25 MHz

Processor 8086 80286 80386


-~~--
----
Prog.Lang TC V2.0 jJC V4.0 TC V2.0 ).IC V4.0 TC V2.0 jJC V4.0

Operation Computation Time (jJsec)

ADD real 540-660 210-280 284-330 110-138 83-93 31-40

MULT real 870-940 540-610 362-386 186-198 110-114 55-59

SIN 3500-4300 6600-7450 1150-1480 2343-2710 335-418 692-807

COS 3400-4400 7100-7700 1090-1490 2727-2857 324-418 802-846

ASIN 6580-9560 7650-10700 2300-3520 2803-4137 664-1016 840-1231


--
ACOS 7240-9780 7950-10450 2640-3630 2963-3920 763-1049 889-1176

ATAN 4600-6700 6300-7450 1480-2250 2323-2783 434-643 697-835


------ "----

SQRT 1220-1540 700-1100 494-550 307-478 148-163 93-139


-~-~

ADD int 8.3 7.5 4.1 4.0 1.2 1 .2


-- -.--f-----._-

MULT int 22.5 24.0 5.2 4.5 1.5 1.3

IF 9.1 12.0 4.5 6.0 1.1 1.5


TABLE 2. Computation times for the IBM PS/2 series
I
Computer PS/2-8530-021 PS/2-8560-071 PS/2-8570-A21 ,

OS MSDOS 3.3 MSDOS 3.3 MSDOS 3.3

Clock 8 MHz 10 MHz 25 MHz

Proc/Copr 8086/8087 80286/80287 80386/80387

Prog.Lang TC V2.0 JlC V4.0 TC V2.0 JlC V4.0 TC V2.0 JlC V4.0

Operation Computation Time (Jlsec)

ADD real 52-55 54-66 62-65 61-63 9-11 9-11

MULT real 57-63 60-66 70-72 69-72 9-11 9-11


- t---
SIN 300-360 440-550 312-357 493-546 27-39 104-110

COS 300-360 430-550 305-357 500-547 27-39 98-110

ASIN 220-305 320-390 247-338 325-390 43-66 65-77

ACOS 245-330 320-440 259-345 344-409 44-66 65-83

ATAN 220-275 220-330 214-266 240-280 39-55 50-61

SQRT 76-80 159-165 76-77 127-128 14-17 31-33

ADD int 8.3 7.5 4.1 4.0 1.2 1.2

MULT int 22.5 24.0 5.2 4.5 1.5 1.3

IF 9.1 12.0 4.5 6.0 1.1 1.5


TABLE 3. Computation times for several systems tv

Computer VAX 11~750 ).lVAX~II Q3 VAXSTATION DECSTATION


--,---- ..

OS VMS 4.7 Ultrix 2.2~1 VMS 5.1 Ultrix V3.1

Clock 16.7 MHz


.. ~ ..
- -- .. 1---- ..
Proe/Copr Proe. + FPA Proe. + FPA Proe. +FPA Proe. + FPA

Prog.Lang VAX C V2.4 System C System C System C

Operation Computation Time ().lsee)

ADD real 9~10 10~1l 2.9~3.3 0.16~0.33

MULT real 9~10 10~11 3.0~3.3 0.16~0.33

SIN 64~73 35 ·-40 19~22 5.0~8.3

COS 72~90 133~138 19~22 6.7~8.3

ASIN 72~89 37~40 20~23 8.3~10

ACOS 82~100 70~73 22~25 8.3~10


- ------- ---,-'"'.---~----

ATAN 63~80 167~173 17~20 6.7~10

SQRT 57~67 30~33 15~18 1.6~1.7


.--------~---.- .,-
---~- f--.
ADD int 2.0 3.5 0.5 0.024
f-- ._-- ---- -
MULT int 1.9 7.7 0.5 0.024
1----. .--,---- - - - - " "

IF 5.3 4.3 1.1 0.055


~ ... ----_ .. _-_._-
'-
TABLE 4. Computation times for several systems

Computer HP 9000-330 SUN 3-60 SUN 4-110 Apollo DN3500

OS HP-UX 6.01 Unix 4.2 Unix 4.3.2 BSD 4.2 Do/IX

Clock 16.7 MHz 20 MHz 14.3 MHz 25 MHz

Proc/Copr 68020/68881 68020/68881 Sys4 SPARC 68030/68882

Prog.Lang System C System C GNU Vl.25 System C System C

Operation Computation Time ().Jsec)

ADD real 12-16 18-19 11-13 3-4 7.3

MULT real 8-14 15-16 11-12 3 7.3-7.5

SIN 25-38 17-20 20-25 5-7 23-25

COS 27-42 18-23 22-28 5-8 25-27

ASIN 23-30 22-25 23-27 27-32 143-145

ACOS 22-27 20-25 23-30 25-33 138-140

ATAN 18-22 17-20 20-27 3-7 25-27

SQRT 22-25 15-22 23-27 3-7 27-28

ADD int 1.3 1 .2 0.5 0.9 0.4

MULT int 3.5 3.3 0.5 2.8 0.6

IF 1 .7 1.6 1 .6 1.2 1.3


-~~
- --- ------ -_._------ ~--- -- ---- ----

w
-I:.

TABLE 5. Computation Time of transputers

Transputer T414-20 T800-20


' - - - - - - - - - 1---

Clock 20 MHz 20 MHz List of Symbols

FPA no yes TC - Turbo C


)..IC - Microsoft C
Prog.Lang. PC V2.0 PC V2.0 Occam 2 PC - 3L Parallel C

Operation Computation Time ()..Isec)


Operation Precision
ADD real 50-60 4-5 1 .8-2.0
Real - 8 bytes
MULT real 50-60 4-5 2.5-2.6 Integer - 4 bytes

SIN 150-160 30-40 38.8-40.7

COS 70-80 10-20 34.8-36.0


----
ASIN 80-90 10-20 30.6-47.3
t----
ACOS 80-90 10-20 31.1-46.6
r.----------- --
ATAN 90-100 10-20 36.8-36.9
-- - - - -
SQRT 50-60 10-20 13.0-13.8
r--------------
ADD int 3.85 1.65 0.66
--
MULT int 3.95 1.85 2.54

IF 4.32 2.08 0.922


115

logical operation for a given system (software/hardware).


Among the large number of possibilities we have chosen those
combinations more relevant in controller implementation.
Inspecting the data several conclusions can be drawn:
Trignometric operations are the most time consuming.
Logical and integer arithmetic operations are the
fastest.
The arithmetic coprocessor is essential to speed-up
floating point operations.
Logical and integer arithmetic operations are not
affected by the presence or absence of coprocessors.
For a given hardware, large variations of computing
time may occur depending on the compiler being used.
Theoretically, the speed of calculations increases
linearly with the clock frequency. However, for large fre-
quencies, wait states may occur when accessing memory. In
this case the way memory is organized is the determining
factor in the full use of that velocity [120] .
• Reduced Instruction Set Computer (RISC) architectures
appear to be far more efficient than conventional Complex
Instruction Set Computers (CISC).
In order to provide the reader with a better perspective of
these properties, we have decided to measure the frequency of
calculation for an adequate benchmark. Because extrapolation
from generic benchmarks are questionable [120-122], we have
decided to select a benchmark capable of reflecting the
requirements normally associated with robot control. Thus, we
show in Fig. 1 the frequencies of calculation of the inverse
dynamics for the six degrees of freedom (d.o.f.) PUMA 560
manipulator and for the systems mentioned in Tables 1 to 5.
The inverse dynamics equations already include the simplifi-
cations presented in [123]. Some improvements introduced by
the authors have reduced the number of required operations to
5 sines, 6 cosines, 89 sums and 151 multiplications.
Besides the already mentioned properties, this benchmark
(that uses only floating point operations) reveals, rather
unexpectedly, a better performance of the system 8086/8087 (8
MHz) in comparison with the 80286/80287 (10 MHz).
In conclusion, the results in Fig. 1 show that, by the time
the whole of the control algorithm is implemented, which
contains several sub-algorithms dealing with kinematics,
dynamics, control and trajectory planning the computational
load can easily reach levels incompatible with the use of
(desirable) high sampling frequencies.

3.2. Techniques to Improve the Real-Time Performances of


Microprocessor-Based Controllers

Modern robot control algorithms pose very stringent computa-


tional requirements. Although technological progress makes
available systems with ever increasing performances, the
truth is that their use as robot controllers may not be
Legend a-

1 IBM PS/2-8530-021
8086, 8 MHz, IIC
2 IBM PS/2-8560-071
10· 80286, 10 MHz, IIC
3 IBM PS/2-8570-A21
80386, 25 MHz, IIC
4 IBM PS/2-8560-071
11 80286/80287,
i /il"'.':I.' Ililll' '1'111
IW~I'
:,11,::"; illi,lIll '1.'1
",I
l''.,,1 10 MHz. TC
10 3 i'
I
',',I!: ~'I " III,!.!,:II ,. :II'I!i'" 5 IBM PS/2-8530-021
illll:d,I"',lllii 8086/8087. 8 MHz. TC
ill'ii' Ii"III"f!!i "it"I'l,il 'IH. " ,!ill';I'1111,' Il jl~. ii I:.
6 VAX 11-750
i,l! illl'l il I! 7 IIVAX-II Q3
, '"11111'
I'll: ;::1; iq\W~!
l.I.I I;.'1 "I,.11'11'1':1: ; :.' i" 8 HP 9000-330.
Computing .i,lh;' I'!I/II::: i' /1;' II:!' ',11 ,iiilip ,I'
,"I: 68020/68881. 16.7 MHz
Frequency 10' "1111111/. tll l'i."I···· ~1 1: 1" '11,1.""/1 1,,, ..
I, I"! III II' ~."II"I:
'I1li'Iiii !ii!) ,'I! 9 SUN 3-60, 68020/68881
(Hz) 20 MHz, GNU V1,25
I'
It!!:1 !i,1 11:"I'1'
i~'I' IIi:1i ~i'1111l' 'I ;lli'l :11:'
"~iill ,1,
i'. ."I "!I I I
'I' 10 IBM PS/2-8570-A21
1111/111 11111.111 !!llii.'.?I' ,I i i 'il!iil!1 :11111 80386/80387 25 MHz TC
1'1,1 1'1' ':'I'd I'l ' ",' .
, I,'111'I,,;li r:I"'
Hiii:,il "!!'
I'll"",!!I 11I11
\ ': "11"'1'1'
I'll! I lit
11 Apollo DN3500
... "11111, ...... .1:::... 111/:.11" .... ,1,1/1" .... ',111'1'1 .....1"111'
68030/68882, 25 MHz
10 I: ,I,! lillll":
j.j h" ~ II
'i'::;!
i'lllI':!!j ) 11 .I,IUt ,III'Ii ""I I! 12 VAXSTATION
il/illi: 111111" '1'llli'I'
i,'"I '~ "'I"
iilill :I! 13 T800-20. Occam 2
i!lil: 'ill': i:ill:': ;1'/;: [II'I!,I )':II!! illllli!1 (8 bytes)
l 1:.,.': 111:1
i,l'I' hi'!
;' :,,1';:"1
!I'.. ,III':"'11'"
I'j f,j
il l 'l i,l ill!I'
"'I .I!W:
'I' hI:i!,!!I!III!II' 14 T800-20. Occam 2
1. 'I ~1 ,J.!i .', ~:fn ! 11:1, ti"
1.' i Ir'·~ ill'itJ; .I,!
111 '/111' (4 bytes)
JIi Iii I '1,,1111 1,+,1;1111.; 'III/i':, i,Ililii!+IIIIi,J!:lllIIllil.;
;;;!~ IE:; )' :1': C '1;' '1 :~l "i .I!. ':, I;:i "1:' :;
15 SUN 4-110, Sys4 SPARC
1 :::~i ,'il·· ;i!;, 14.3 MHz
16 DECSTATION 3100. MIPS
16.67 MHz
Microsystems

Fig. 1. Comparison chart of the computing frequencies for PUMA 560 dynamic equations
(5 sines, 6 cosines, 89 sums, 151 multiplications) using the microsystems
considered in Tables 1 to 5.
For the systems tested with two alternative compilers, it is depicted only
the result of the fastest code.
Floating point operations are evaluated using the standard precision of 8
bytes, except for microsystem 14 which uses 4 byte precision operations.
117

economically feasible. In the sequel we present a set of


techniques to improve the real-time performances of the
controller, which are, to a large extent "hardware and
technology independent".
We group these techniques into six categories:
• Assembly language programming.
• Low precision arithmetic calculations.
• Use of memory.
• Multirate schemes.
• Preview techniques.
• Dedicated compilers.
which are now discussed in the sequel.

3.2.1. Assembly Language Programming. This is a well known


technique and constitutes a natural first choice to improve
the real-time performances of the controller. In fact direct
programming in assembly language allows considerable optimi-
zation of the generated code. However, modern control algor-
ithms are very complex and that makes their programming an
assembly very time consuming. This is the reason why only a
few researchers have adopted this strategy [36-38,74,98] as
an alternative to high level languages such as FORTRAN or C.

3.2.2. Low Precision Arithmetic Cal.~~_!~ions. This option has


been one of the principal alternatives to the assembly
language. The most common technique consists in giving up
floating point calculations in favour of fixed point arithme-
tic [41,124,125].
At a first sight it would seem more practical to reduce
only the accuracy of the floating point calculations. Howe-
ver, this is somewhat deceptive because many of the high
level languages implement low accuracy floating point opera-
tions on the basis of operations with standard accuracy.
Therefore, and contrary to our expectations, there is no
improvement of the computation times. Clearly, if the high
level language has distinct implementations for different
accuracies then computation can be speeded up. This is shown
in Fig. 2 for several floating point arithmetic operations,
on the IMS T414-20 and IMS T800-20 transputers, programmed in
Occam 2. For the IMS T414-20 (supporting floating point
operations in software) one achieves higher speed up than for
the IMS T800-20 (which has a 64-bit floating point unit on-
chip) .
Alternative techniques to achieve higher sampling rates
through a compromise to data precision representation will be
discussed below in sections 3.2.3. and 3.2.6.

3.2.3. Use of Memory. This method transfers, in part or


completely, the load of on-line calculations to memory-based
evaluations. A common procedure consists of replacing the
floating point calculations of trignometric functions by
memory tables [37,124,125]. Although this method can be
00

10'

Legend
1 0 2 tUmllilill'l'ii""" "'II~IIII!!IH'i:I!'" .," 'liHllii:iIltl' . "ilI!il~II:!JII IliilllllUII:II' ,.," ""111111111 ,', .. ~ T414-20. 8 bytes
lEi:11 T414-20. 4 bytes
Computing
Time 1111~~llrllr"" i'rlllll'!IIII::1 II 1:'1 T800-20. 8 bytes
(Jjsec) I :1' 1 1 ,:11,:(
II,iil :: 11il:T11 illllllill:I'
Illl: :i:li ~t" l Ilil~Il~,l rl~'i, :'lr: ~I ~11~llilllrl lr" 1 I I ~Illlll
ill,lllltriTi 1I1I1 1~':!":IIIII:I[;:!
u: T800-20. 4 bytes
,1"iIIUIUUI:I:I:III:,::::::I::,i""IIDIIIIIW:I:I:iU:,:::"II::.:I"",1111111101:1'1'1:11:::\11 :,I",,·'III!III!!I:I'I:II:'::'::I1:::.I.. ·IIIIIIIIII:I'I:I:III:":'::I1:'::1,,,,1111111111'1:1'1:11:.,::::1,,,..,
10

1
SIN COS ASIN ACOS ATAN SQRT

Floating Point Operations

Fig. 2. Comparison chart of the computing time for several floating point operations
in transputers T414-20 and T800-20 using 8 bytes and 4 bytes precision
(source code programmed in Occam 2).
119

generalized to a greater portion of the algorithm, the


application of this technique to control algorithms has not
received much attention to date [47] with the exception of
the already mentioned learning controllers. Therefore, a
large field of research remains open.

3.2.4. Multirate Schemes~ The effects of the sampling fre-


quency upon the performances of a digital robot controller
are very important [51]. In a control system, made of several
feedback and feedforward paths, it is natural to expect
different speeds of response along them. Therefore, it is
reasonable to allocate different sampling (and computing)
rates to such paths; this forms the basis of multirate
schemes [126-127]. In this way the computing power is
assigned to each loop (or variable) in accordance to its
needs, allowing therefore a more rational management of the
system resources (47,60].

3.2.5. Preview Techniques. Because microprocessors have limi-


ted computational capabilities the maximum allowable sampling
frequency of the controller is bounded above. At the same
time this gives rise to a time delay between the instant
sensors are read and the control command is issued.
Preview techniques attempt to minimize this detrimental
effect (128]. Their application to the control of robot
manipulators has shown to be simple and efficient
[47,129,130].

Compilers.
~-!-2-,-~-,_Dedicated The implementation of a control
algorithm implicitly assumes the representation of process
variables by means of real numbers. To these real numbers
corresponds a computer internal representation by means of
floating point structures. This representation, requiring
several bytes, is in sharp contrast with the 8 to 16 bit
accuracy of standard A/D and D/A converters.
The present authors have developed a method for implemen-
ting real-time control algorithms [131-132] that eliminate
this discrepancy, by optimizing the chain

Data collection --) Proccessing --) Control action

In their method, an "uniform" representation for the variab-


les is adopted, that is, real world and computer internal
variables have identical accuracies. Once the admissible
range of a variable is known, quantization levels can be
defined as a function of the accuracy of the A/D and D/A
converters; to each of these levels, an integer binary code
is assigned. Then the control algorithm can be processed by
Boolean Algebra operations upon the bits representing the
variables. The Boolean Algebra operations are optimized by
means of Binary Decision Diagrams (BDD's), that are extremely
efficient, once converted to IF_THEN_ELSE structures as shown
120

by Tables 1 to 5.
This method is still in a research stage; however, it has
already suggested interesting extensions to RISC computatio-
nal structures and parallel architectures. In the first case
the extension is motivated by the fact that processing by
means of BDD's only requires microprocessors with a reduced
number of instructions; in the second case the advantage
stems from the simplicity of task allocation to parallel
processors.

4. CONCLUSIONS

The scientific literature is fertile in controllers for


robotic manipulators. However, the validation of these algor-
ithms through practical implementations is still confined to
a few examples. Moreover, the high computational burden posed
by many of these algorithms preclude its industrial applica-
tion using present day microcomputers. This situation can be
overcome through the development of control strategies more
adapted to microprocessor-based structures. The analysis of
both the computational requirements and microcomputer capabi-
lities reveals that limitations can be alleviated through the
adoption of general techniques associated with the data
representation. Furthermore, the use of these techniques
achieves not only a more efficient management of the computa-
tional resources but also provides a deeper insight on
developments towards future control architectures.

5. ACKNOWLEDGEMENTS

The authors would like to thank Antonio M. C. Costa, Jose C.


S. Alves, A. Augusto Sousa and Jyrki Leppanen for their help
in the computation of Tables 1 to 5.

6. REFERENCES

(1] Sweet, L.M. and Good, M.C. (1985) 'Redefinition of the


Robot Motion-Control Problem', IEEE Control Systems
Magazine 5, 18-24.
(2] Pak, H.A. and Turner, P.J. (1986) 'Optimal Tracking
Controller Design for Invariant Dynamics Direct-Drive
Arms', J. Dynamic Syst. Measurement, Contr., Trans. ASME,
360-365.
(3] Youcef-Toumi, K. and Kuo, A.T.Y. (1987) 'High Speed
Trajectory Control of a Direct-Drive Manipulator', in
Proc. 26th IEEE Conf. on Decision and Control, Los
Angeles, CA.
(4] Machado, J.A.T. and Carvalho, J.L.M. de (1987) 'Dynamics
and Control of Robot Manipulators', INESC Internal
121

Report RI/55/87.
(5] Chung, W.-K. and Cho, H.S. (1988) 'On the Dynamic
Characteristics of a Balanced PUMA-760 Robot', IEEE
Trans. Indust. Electron., IE-35, 222-230.
[6] Asada, H. and Youcef-Toumi, K. (1984) 'Analysis and
Design of a Direct-Drive Arm with a Five-Bar-Link
Parallel Drive Mechanism', J. Dynamic Syst. Measurement,
Contr., Trans. ASME, 106, 225-230.
[7) Kazerooni, H. (1989) 'Statically Balanced Direct Drive
Manipulator', Robotica, 7, 143-149.
[8] Asada, H., Kanade, T. and Takeyama, I. (1983) 'Control
of a Direct-Drive Arm', J. Dynamic Syst. Measurement,
Contr., Trans. ASME, 105, 136-142.
(9] Ostman, I., Allared, C.-A. and Holmqvist, U. (1985)
'Pendulum Robot: Six Degrees of Freedom Without Any
Backlash for Fast and Precise Assembly', Asea Journal,
58, 12-17.
(10] Dwolatzky, B. and Thornton, S.G. (1988) 'The GEC
Tetrabot - a Serial-Parallel Topology Robot: Control
Design Aspects', in Proc. lEE Control 88, Oxford, UK.
[11) Yang, D.C.H. and Tzeng, S.W. (1986) 'Simplification and
Linearization of Manipulator Dynamics by the Design of
Inertia Distribution', The Int. J. Robotics Research,
5, 120-128.
(12] Youcef-Toumi, K. and Asada, H. (1987) 'The Design of
Open-Loop Manipulator Arms With Decoupled and
Configuration-Invariant Inertia Tensors', J. Dynamic
Syst. Measurement, Contr., Trans. ASME, 109, 268-275.
(13] Tsai, Y.C. and Soni, A.H. (1981) 'Accessible Regions and
Synthesis of Robot Arms', J. Mechanical Design, Trans.
ASME, 103, 803-811.
[14] Asada, H. (1983) 'A Geometrical Representation of
Manipulator Dynamics and Its Application to Arm Design',
J. Dynamic Syst. Measurement, Contr., Trans. ASME, 105,
131-142.
(15] Yoshikawa, T. (1985) 'Manipulability of Robotic
Mechanisms', The Int. J. Robotics Research, 4, 3-9.
[16] Yoshikawa, T. (1986) 'Analysis and Design of Articulated
Robot Arms from the Viewpoint of Dynamic Manipulability',
Robotics Research: The Third International Symposium,
MIT Press.
(17] Galhano, A.M.F., Machado, J, .A.T. and Carvalho, J.L.M.
de (1990) 'On the Analysis and Design of Robot
Manipulators: A Statistical Approach', in Proc. 11th
IFAC World Congress, Tallinn, USSR.
[18] Futami, S., Kyura, N. and Hara, S. (1983) 'Vibration
Absorption Control of Industrial Robots by Acceleration
Feedback', IEEE Trans. Indust. Electron .• IE-30,
299-305.
(19] Luo. G.-L. and Saridis, G.N. (1984) 'Robust Compensation
for a Robotic Manipulator', IEEE Trans. Automat. Contr.,
AC-29, 564-567.
122

(20] Luo, G.L. and Saridis, G.N. (1985) 'L-Q Design of PID
Controllers for Robot Arms', IEEE J. Robotics and
Automation, RA-l, 152-159.
(21] Studenny, J. and Belanger, P.R. (1986) 'Robot
Manipulator Control by Acceleration Feedback: Stability,
Design and Performance Issues', in Proc. 25th IEEE Conf.
on Decision and Control, Athens, Greece.
[22] Van Brussel, H. Van and Vastmans, L. (1984) 'A
Compensation Method for Controlling Time-Varying Robot
Configurations', in Proc. 14th ISIR, Gothenburg, Sweden.
(23] Luh, J.Y.S., Fisher, W.D. and Paul, R.P.C. (1983)
'Joint Torque Control by a Direct Feedback for
Industrial Robots', IEEE Trans. Automatic Control,
AC-28, 153-161.
[24] O'Shea, J. and Turgeon, A.B. (1986) 'Benefits of
Positive Current Feedback Inner Loop for Robot Servos',
in Proc. IFAC Theory of Robots, Vienna, Austria.
[25] Seraji, H., Jamshidi, M., Kim, Y.T. and Shahinpoor, M.
(1986) 'Linear Multivariable Control of Two-Link
Robots', J. Robotic Systems, 3, 349-365.
(26] Stone, H.W. and Neuman, C.P. (1984) 'Dynamic Modeling of
a Three Degree-of-Freedom Robotic Manipulator', IEEE
Trans. Syst., Man, Cybern., SMC-14, 643-654.
[27] Good, M.C., Sweet, L.M. and Strobel, K.L. (1985)
'Dynamic Models for Control System Design of Integrated
Robot and Drive Systems', J. Dynamic Syst. Measurement,
Contr., Trans. ASME, 107, 53-59.
[28] Golla, D.F., Garg, S.C. and Hughes, P.C. (1981) 'Linear
State-Feedback Control of Manipulators', Mechanism and
Machine Theory, 16, 93-103.
(29] Ohishi, K., Ogino, Y. and Hotta, M. (1988)
'Decentralized Manipulator Joint Control Based on Model
Following Acceleration Control System', in Proc. 1988
IEEE Int. Workshop on Intelligent Robots and Systems,
Tokyo, Japan.
(30] Hsia, T.C.S. (1989) 'A New Technique for Robust Control
of Servo Systems', IEEE Trans. Indust. Electron., IE-36,
1-7.
(31] Chen, Y. (1989) 'Replacing a PID Controller by a Lag-
Lead Compensator for a Robot - A Frequency Response
Approach', IEEE Trans. Robotics and Automation, 5,
174-182.
(32] Bejczy, A.K. (1974) 'Robot Arm Dynamics and Control',
Jet Propulsion Lab. Tech. Memo 33-669.
(33] Freund, E. (1982) 'Fast Nonlinear Control with Arbitrary
Pole-Placement for Industrial Robots and Manipulators',
The Int. J. Robotics Research, 1, 65-78.
[34] Katie, D. and Vukobratovic, M. (1986) 'The Influence of
Actuator Model Complexity on Control Synthesis for High
Performance Robot Trajectory Tracking', in Proc. IFAC
Theory of Robots, Vienna, Austria.
[35] Vukobratovic, M. and Kircanski, N. (1986) 'Numerical
123

Complexity of Decentralized Dynamic Control Laws for


Manipulator Systems', Robotica, 4, 255-262.
[36] Freund, E. and Klein, H.J. (1984) 'Practical Results
With Nonlinear Control Strategy for Computer-Controlled
Industrial Robots', in Proc. 14th ISIR, Gothenburg,
Sweden.
[37] Spong, M.W., Thorp, J.S. and Kleinwaks, J.M. (1987)
'Robust Microprocessor Control of Robot Manipulators',
Automatica, 23, 373-379.
[38] Lin, S.-K. (1987) 'Microprocessor Implementation of
the Inverse Dynamical System for Robot Control', in
Proc. 10th IFAC World Congress, Munich, FRG.
[39] Luh, J.Y.S. (1983) 'Conventional Controller Design for
Industrial Robots - A Tutorial', IEEE Trans. Syst., Man,
Cybern., SMC-13, 298-316.
[40] An, C.H., Atkeson, C.G. and Hollerbach, J.M. (1986)
'Experimental Determination of the Effect of Feedforward
Control on Trajectory Tracking Errors', in Proc. 1986
IEEE Int. Conf. on Robotics and Automation, San
Francisco, CA.
[41] An, C.H., Atkeson, C.G., Griffiths, J.D. and Hollerbach,
J.M. (1987) 'Experimental Evaluation of Feedforward and
Computed Torque Control', in Proc. 1987 IEEE Int. Conf.
on Robotics and Automation, Raleigh, NC.
(42] Khosla, P.K. and Kanade, T. (1988) 'Experimental
Evaluation of Nonlinear Feedback and Feedforward Control
Schemes for Manipulators', The Int. J. Robotics
Research, 7, 18-28.
(43] Hewit, J.R. and Burdess, J.S. (1981) 'Fast Dynamic
Decoupled Control for Robotics, Using Active Force
Control', Mechanism and Machine Theory, 16, 535-542.
[44] Sahba, M. and Mayne, D.Q. (1984) 'Computer-Aided Design
of Nonlinear Controllers for Torque Controlled Robot
Arms', lEE Proc., 131, Pt. D, 8-14.
(45] Tourassis, V.D. and Neuman, C.P. (1985) 'Robust
Nonlinear Feedback Control for Robotic Manipulators',
lEE Proc., 132, Pt. D, 134-143.
[46) Neuman, C.P. and Tourassis, V.D. (1987) 'Robust Discrete
Nonlinear Feedback Control for Robotic Manipulators', J.
Robotic Systems, 4, 115-143.
[47] Machado, J.A.T. and Carvalho, J.L.M. de (1989)
'Engineering Design of a Multirate Nonlinear Controller
for Robot Manipulators', J. Robotic Systems, 6, 1-17.
[48) Leahy. M.B. and Saridis, G.N. (1987) 'Compensation of
Unmodeled PUMA Manipulator Dynamics', in Proc. 1987 IEEE
Int. Conf. on Robotics and Automation, Raleigh, NC.
(49] Leahy, M.B., Valavanis, K.P. and Saridis, G.N. (1989)
'Evaluation of Dynamic Models for PUMA Robot Control',
IEEE Trans. Robotics and Automation, 5, 242-245.
[50] Luh, J.Y.S., Walker, M.W. and Paul, R.P.C. (1980)
'Resolved-Acceleration Control of Mechanical
Manipulators', IEEE Trans. Automat. Contr., AC-25,
124

468-474.
(51] Khosla, P.K., (1987) 'Effect of Control Sampling Rates
on Model-Based Manipulator Control Schemes', JPL
Workshop on Space Telerobotics, Pasadena, CA.
[52] Hollerbach, J.M. (1980) 'A Recursive Lagrangian
Formulation of Manipulator Dynamics and a Comparative
Study of Dynamics Formulation Complexity', IEEE Trans.
Syst., Man, Cybern., SMC-I0, 730-736.
[53] Luh, J.Y.S., Walker, M.W. and Paul, R.P.C. (1980) 'On-
Line Computational Scheme for Mechanical Manipulators',
J. Dynamic Syst. Measurement, Contr., Trans. ASME, 102,
69-76.
(54] Horak, D.T. (1984) 'A Simplified Modeling and
Computational Scheme for Manipulator Dynamics', J.
Dynamic Syst. Measurement, Contr., Trans. ASME, 106,
350-353.
(55] Leu, M.C. and Hemati, N. (1986) 'Automated Symbolic
Derivation of Dynamic Equations of Motion for Robotic
Manipulators', J. Dynamic Syst. Measurement, Contr.,
Trans. ASME, 108, 172-179.
(56] Koplik, J. and Leu, M.C. (1986) 'Computer Generation of
Robot Dynamics Equations and the Related Issues', J.
Robotic Systems, 3, 301-319.
[57] Cheng, P.-Y., Weng, C.-I. and Chen, C.-K. (1988)
'Symbolic Derivation of Dynamic Equations of Motion for
Robot Manipulators Using Piogram Symbolic Method', IEEE
J. Robotics and Automation, 4, 599-609.
[58] Neuman, C.P. and Murray, J.J. (1987) 'Customized
Computational Robot Dynamics', J. Robotic Systems, 4,
503-526.
[59] Neuman, C.P. and Murray, J.J. (1987) 'Symbolically
Efficient Formulations for Computational Robot
Dynamics', J. Robotic Systems, 4, 743-769.
[60] Kircanski, M., Vukobratovic, M., Kircanski, N. and
Timcenko, T. (1988) 'A New Program Package for the
Generation of Efficient Manipulator Kinematic and
Dynamic Equations in Symbolic Form', Robotica, 6,
311-318.
[61] Luh, J.Y.S. and Lin, C.S. (1982) 'Scheduling of Parallel
Computation for a Computer-Controlled Mechanical
Manipulator', IEEE Trans. Syst., Man, Cybern.,
SMC-12, 214-234.
(62] Kasahara, H. and Narita, S. (1985) 'Parallel Processing
of Robot-Arm Control Computation on a
Multimicroprocessor System', IEEE J. Robotics and
Automation, RA-l,104-113.
(63] Nigam, R. and Lee, C.S.G. (1985) 'A Multiprocessor-Based
Controller for the Control of Mechanical Manipulators',
IEEE J. Robotics and Automation, RA-1, 173-182.
[64] Binder, E.E. and Herzog, J.H. (1986) 'Distributed
Computer Architecture and Fast Parallel Algorithms in
Real-Time Robot Control', IEEE Trans. Syst., Man,
125

Cybern., SMC-16, 543-549.


[65] Watanabe, T., Kametani, M., Kawata, K. and Tetsuya, K.
(1986) 'Improvement in the Computing Time of Robot
Manipulators Using a Multimicroprocessor', J. Dynamic
Syst .. Measurement, Contr., Trans. ASME, 108, 190-197.
[66] Vukobratovic, M., Kircanski, N. and Li, S.G. (1988) 'An
Approach to Parallel Processing of Dynamic Robot
Models', The Int. J. Robotics Research, 7, 64-71.
[67] Chen, C.L., Lee, C.S.G. and Lou, E.S.H. (1988)
'Efficient Scheduling Algorithms for Robot Inverse
Dynamics Computation on a Multiprocessor System', IEEE
Trans. Syst., Man, Cybern., 18, 729-743.
(68] Kabuka, M. and Escoto, R. (1989) 'Real-Time
Implementation of the Newton-Euler Equations of Motion
on the NEC uPD77230 DSP', IEEE Micro, .9, 66-76.
[69] Jones, D.I. and Entwistle, P.M. (1988) 'Parallel
Computation of an Algorithm in Robotic Control', in
Proc. lEE Control 88, Oxford, UK.
(70] Geffin, S. and Furht, B. (1989) 'Transputer-Based
Dataflow Multiprocessor for Robot Arm Control',
Microprocessors and Microsystems, 13, 219-226.
(71) Murray, J.J. and Neuman, C.P. (1988) 'Organizing
Customized Robot Dynamics Algorithms for Efficient
Numerical Evaluation', IEEE Trans. Syst., Man, Cybern.,
18, 115-125.
(72) Utkin, V.I. (1977) 'Variable Structure Systems with
Sliding Modes', IEEE Trans. Automat. Contr., AC-22,
212-222.
(73] Young, K.-K.D. (1978) 'Controller Design for a
Manipulator Using Theory of Variable Structure Systems',
IEEE Trans. Syst., Man, Cybern., SMC-8,101-109.
[74] Klein, C.A. and Maney, J.J. (1979) 'Real-Time Control of
a Multiple-Element Mechanical Linkage with a
Microcomputer', IEEE Trans. Indust. Electron., IECI-26,
227-234.
(75] Morgan, R.G. and Ozguner, U. (1985) 'A Decentralized
Variable Structure Control Algorithm for Robotic
Manipulators', IEEE J. Robotics and Automation, RA-1,
57-65.
(76] Slotine, J.-J. and Sastry, S.S. (1983) 'Tracking Control
of Non-Linear Systems Using Sliding Surfaces, with
Application to Robot Manipulators', Int. J. Control,
38, 465-492.
(77] Slotine, J.-J.E. (1985) 'The Robust Control of Robot
Manipulators', The Int. J. Robotics Research, 4, 49-64.
(78] Staszulonek, A. and Brussel, H. Van (1986) 'Inertially
Decoupled, Sliding Mode Controller Design for Trace and
Pick-Up Robot', in Proc. 15th ISIR, Brussels, Belgium.
(79) Hashimoto, H., Maruyama, K. and Harashima, F. (1987)
'A Microprocessor-Based Manipulator Control with Sliding
Mode', IEEE Trans. Indust. Electron., IE-34, 11-18.
(80] Hashimoto, H., Slotine, J.-J.E., Arai, Y. and Harashima,
126

F. (1989) 'Implementation of VSS Control to Robotic


Manipulators - Smoothing Modification', IEEE Trans.
Indust. Electron., 36, 321-329.
[81) Hiroi, M., Hojo, M., Abe, Y. and Dote, Y. (1986)
'Microprocessor-Based Decoupled Control of Manipulator
Using Modified Model-Following Method with Sliding
Mode', IEEE Trans. Indust. Electron., IE-33, 110-113.
[82) Young, K.-K.D. (1988) 'A Variable Structure Model
Following Control Design for Robotics Applications',
IEEE J. Robotics and Automation, 4, 556-561.
[83) Machado, J.A.T. and Carvalho, J.L.M. de (1988) 'A New
Variable Structure Controller for Robot Manipulators',
Third IEEE Int. Symp. on Intelligent Control, Arlington,
Virginia.
[84] Landau, I.D. (1989) 'Adaptive Control - Robustness and
performance enhancement', Proc. IFAC Symp. on Adaptive
Systems in Control and Signal Processing, Glasgow, UK.
[85) Dubowsky, S. and DesForges, D.T. (1979) 'The Application
of Model-Referenced Adaptive Control to Robotic
Manipulators', J. Dynamic Syst. Measurement, Contr.,
Trans. ASME, 101, 193-200.
[86) Balestrino, A., Maria, G. de and Sciavicco, L. (1983)
'An Adaptive Model Following Control for Robotic
Manipulators', J. Dynamic Syst. Measurement, Contr.,
Trans. ASME, 105, 143-151.
[87) Anex, R.P. and Hubbard, M. (1984) 'Modelling and
Adaptive Control of a Mechanical Manipulator', J.
Dynamic Syst. Measurement, Contr., Trans. ASME, 106,
211-217.
(88] Horowitz, R. and Tomizuka, M. (1986) 'An Adaptive
Control Scheme for Mechanical Manipulators -
Compensation of Nonlinearity and Decoupling Control', J.
Dynamic Syst. Measurement, Contr., Trans. ASME, 108,
127-135.
(89] Koivo, A.J. and Guo, T.-H. (1983) 'Adaptive Linear
Controller for Robotic Manipulators', IEEE Trans.
Automat. Contr., AC-28, 162-171.
(90) Koivo, A.J. (1985) 'Self-Tuning Manipulator Control in
Cartesian Base Coordinate System', J. Dynamic Syst.
Measurement, Contr., Trans. ASME, 107, 316-323.
(91] Lemos, J.M., Coito, F., Garcia, F., Silvestre, C.,
Shirley, P., Concei9ao, P. and Sentieiro, J.S. 'Long-
Range Adaptive Control Algorithms for Robotic
Applications', in 'Progress in Robotics and Intelligent
Systems', C. Y. Ho and George W. Zobrist (ed.).
[92) Lee, C.S.G. and Lee, B.H. (1984) 'Resolved Motion
Adaptive Control for Mechanical Manipulators', J.
Dynamic Syst. Measurement, Contr., Trans. ASME, 106,
134-142.
[93] Slotine, J.-J.E. and Li, W. (1987) 'Adaptive Manipulator
Control', in Proc. 1987 IEEE Int. Conf. on Robotics and
Automation, Raleigh, NC.
127

[94] Slotine, J.-J.E. and Li, W. (1989) 'Composite Adaptive


Control of Robot Manipulators', Automatica, 25, 509-519.
(95] Seraji, H. (1989) 'Decentralized Adaptive Control of
Manipulators: Theory, Simulation and Experimentation',
IEEE Trans. Robotics and Automation, 5, 183-201.
[96] Craig, J.J., Hsu, P. and Sastry, S.S. (1987) 'Adaptive
Control of Mechanical Manipulators', The Int. J.
Robotics Research, 6, 16-28.
(97] Tomizuka, M., Horowitz, R., Anwar, G. and Jia, Y.L.
(1988) 'Implementation of Adaptive Techniques for
Motion Control of Robotic Manipulators', J. Dynamic
Syst. Measurement, Contr., Trans. ASME, 110, 62-69.
(98] Murphy, S.H. and Saridis, G.N. (1987) 'Experimental
Evaluation of Two Forms of Manipulator Adaptive
Control', in Proc. 26th IEEE Conf. on Decision and
Control, Los Angeles, CA.
(99] Leahy, M.B., Murphy, S.H., Al-Jaar, R. and Saridis, G.N.
(1987) 'Experimental Evaluation of a Puma Manipulator
Model-Referenced Adaptive Controller', in Proc. 26th
IEEE Conf. on Decision and Control, Los Angeles, CA.
(100]Albus, J.S. (19751 'A New Approach to Manipulator
Control: The Cerebellar Model Articulation Controller
(CMACI', J. Dynamic Syst. Measurement, Contr., Trans.
ASME, 97,220-227.
(101]Albus, J.S. (1975) 'Data Storage in the Cerebellar Model
Articulation Controller (CMAC), , J. Dynamic Syst.
Measurement, Contr., Trans. ASME, 97, 228-233.
[102]Horn, B.K.P. and Raibert, M.H. (1977) 'Configuration
Space Control', MIT AI Memo no. 458.
[103]Raibert, M.H. (1978) 'A Model for Sensorimotor Control
and Learning', Biological Cybernetics, 29, 29-36.
[104]Hirzinger, G. (1986) 'Robot Systems Completely Based on
Sensory Feedback', IEEE Trans. Indust. Electron., IE-33,
105-109.
[105]Miller III, W.T., Glanz, F.H. and Kraft III, L.G. (1987)
'Application of a General Learning Algorithm to the
Control of Robotic Manipulators', The Int. J. Robotics
Research, 6, 84-98.
[106]Miller III, W.T. (1989) 'Real-Time Application of Neural
Networks for Sensor-Based Control of Robots with
Vision', IEEE Trans. Syst., Man, Cybern., 19, 825-831.
[107)Kawato, M., Uno, Y., Isobe, M. and Suzuki, R. (1988)
'Hierarchical Neural Network Model for Voluntary
Movements with Application to Robotics', IEEE Control
Systems Magazine, 8, 8-1S.
[108)Guez, A. and Selinsky, J. (1988) 'A Treinable
Neuromorphic Controller', J. Robotic Systems, 5,
363-388.
[109]Kuperstein, M. and Rubinstein, J. (1989) 'Implementation
of an Adaptive Neural Controller for Sensory-Motor
Coordination', IEEE Control Systems Magazine, 9,
25-30.
128

(110]Bekey, G.A. and Tomovic, R. (1986) 'Robot Control By


Reflex Actions', in Proc. 1986 IEEE Int. Conf. on
Robotics and Automation, San Francisco, CA.
[lll]Suganuma, Y. and Ito, M. (1986) 'Knowledge-Based
Learning Control', Control-Theory and Advanced
Technology, 2, 367-383.
(112]Suganuma, Y. and Ito, M. (1989) 'Learning in Movement
and Control', IEEE Trans. Syst., Man, Cybern., 19,
258-270.
(1l3]Silva, C.W. de and MacFarlane, A.G.J. (1989) 'Knowledge-
Based Control Approach for Robotic Manipulators', Int.
J. Control, 50, 249-273.
[114]Kawamura, S., Miyazaki, F. and Arimoto, S. (1988)
'Realization of Robot Motion Based on a Learning
Method', IEEE Trans. Syst., Man, Cybern., 18, 126-134.
(1l5]Konishi, Y., Aoyama, T. and Inasaki, I. (1988)
'Trajectory Generation and Control of a Five-Bar-Link
Parallel Direct-Drive Robot', Robotics & Computer-
Integrated Manufacturing, 4, 395-402.
(116]Bien, S.-R.O.Z. and Suh, I.H. (1988) 'An Iterative
Learning Control Method with Application for the Robot
Manipulator', IEEE J. Robotics and Automation, 4,
508-514.
(117]Mandic, N.J., Scharf, E.M. and Mamdani, E.H. (1985)
'Pratical Application of a Heuristic Fuzzy Rule-Based
Controller to the Dynamic Control of a Robot Arm', lEE
Proc., 132, Pt. D, 190-203.
(118]Liu, C.-H. (1986) 'A Comparison of Controller Design and
Simulation for an Industrial Manipulator', IEEE Trans.
Indust. Electron., IE-33, 59-65.
[119]Liu, C.-H. and Chen, Y.-M. (1986) 'Multi-Microprocessor-
Based Cartesian-Space Control Techniques for a
Mechanical Manipulator', IEEE J. Robotics and
Automation, RA-2, 110-115.
[120]Lua, K.T. (1989) 'Relative Performance Measurement of
80386, 80286 and 8088 Personal Computer Systems',
Microprocessing and Microprogramming, 26, 85-95.
(121]Price, W.J. (1989) 'A Benchmark Tutorial', IEEE Micro,
9,28-43.
(122] 'Lies, damned lies and benchmarks' in 'The Transputer
Applications Notebook: Systems and Performance', INMOS,
258-279, 1989.
(123]Armstrong, B., Khatib, O. and Burdick, J. (1986) 'The
Explicit Dynamic Model and Inertial Parameters of the
PUMA 560 Arm', in Proc. 1986 IEEE Int. Conf. on Robotics
and Automation, San Francisco, CA.
(124]Snyder, W.E. (1985) 'Industrial Robots: Computer
Interfacing and Control', Prentice-Hall.
(125]Suehiro, T. and Takase, K. (1985) 'A Manipulation System
Based on Direct-Computational Task-Coordinate Servoing',
Robotics Research: The Second International Symposium,
MIT Press.
129

(126]Glasson, D.P. (1983) 'Development and Applications


of Multirate Digital Control', IEEE Control Systems
Magazine, 3, 2-8.
(127]Berg, M.C., Amit, N. and Powell, J.D. (1988) 'Multirate
Digital Control System Design', IEEE Trans. Automat.
Contr., 33, 1139-1150.
[128]Vinante, C.D., Bermudez, C. and Tarre, F. (1986)
'Predictive Compensation Implemented with a
Microprocessor', IEEE Control Systems Magazine, 6,
40-43.
[129]Yoshimoto, K. and Wakatsuki, K. (1984) 'Application on
the Preview Tracking Control Algorithm to Servoing a
Robot Manipulator', Robotics Research: The First
International Symposium, MIT Press.
[130]Yoshimoto, K. and Sugiuchi, H. (1985) 'Trajectory
Control of Robot Manipulator Based on the Preview
Tracking Control Algorithm', Robotics Research: The
Second International Symposium, MIT Press.
[131]Machado, J.A.T., Carvalho, J.L.M. de, Matos, J.A.S.
and Costa, A.M.C. (1988) 'A New Computational Scheme for
Robot Manipulators', NATO Advanced Research Workshop on
Robots with Redundancy: Design, Sensing and Control,
Italy.
[132]Machado, J.A.T., Carvalho, J.L.M. de, Matos, J.A.S.
and Costa, A.M.C. (1988) 'Robot Manipulator Dynamics -
Towards Better Computational Algorithms', in Proc. IFAC
Symp. on Robot Control, Karlsruhe, FRG.
CHAPTERS

DESIGN ASPECTS OF A ROBOT COORDINATED


BY A DESKTOP COMPUTER

JOHN BILLINGSLEY,
Professor of Robotics,
Portsmouth Polytechnic,
Portsmouth P013DJ, UK

1. Introduction

An industrial robot is usually purchased complete with mechanical structure, drive electronics,
controlling computer and embedded programming language. The robot is a single unit, to be
used or abandoned as a whole, even though it contains a mix of technologies including that
volatile and obsolescent commodity, software. Purchasers are likely to find manufacturers
unwilling to upgrade an older model to take advantage of a software improvement, recommend-
ing replacement of the whole system instead.
The users of computer systems are becoming increasingly discerning when upgrading their
equipment. They see that the speed advantage of a new processor or operating system can be
offset by any loss of compatibility with existing software packages or high-cost peripherals.
Accordingly they seek to upgrade in a progressive way, transferring the assets of the old system
to the new and insisting on compatibility.
Components of, say, a desk-top publishing system are "unbundled". The software package
can be installed on any computer within a given range, and drivers are available within the
package to exploit any useful printer, laser or dot-matrix, which the user may wish to employ.
Computer, software or printer can be replaced or upgraded without trashing the other two
components of the system.
It is suggested that an unbundled approach would be attractive to the industrial robot market,
the software and supervisory systems being offered on a main-stream desktop computer. With
the low-cost norm being a multi-megabyte memory plus fast hard disk, this will outperform many
times over the LSI-II minicomputers and other computer systems customarily found embedded
in commercial robots. This computer would be interfaced with drive systems to control the axes,
which in turn must have sufficient "intelligence" to require relatively infrequent updating with
a succession of target points.
The nub of the system is the development of an acceptable general interface, with associated
protocol, which will allow easy substitution of supervisory computers or of robot configurations.
With such a standard, the way is open for the development of '~third-party" software packages
for specific tasks, such as loading and unloading of a particular machine tool. The three
components of the system, software, computer and robot, now reflect the example of the
provision by a software house of a word-processing package, for use on an IBM PC to drive an
Epson printer. Products from several manufacturers can be mixed, and if any part becomes
superseded by a new development it can be replaced without the loss of use of the other two.
131
132

To pursue these possibilities, a Mitsubishi RM-501 small robot was acquired by Portsmouth
Polytechnic Robotics Group, naked of its controlling computer and drive system. Controllers
have been developed for the individual axes, together with an interface to a supervisory
computer. Although a great deal of work remains to be accomplished, the philosophy underlying
the design is set out below.

2. The supervisory system

Robot programming can be divided into several phases. One of these concerns the entry and
editing of coordinate data, sometimes with graphical assistance. Many packages now exist for
performing this task, running on a desktop machine or larger; these at present require onward
transmission of the program to the robot controller. A market-standard microcomputer system
can of course perform this part of the operation without question.
In the next phase of programming, coordinates are corrected (or are sometimes entered ab
initio) by guiding the robot through its sequence with the assistance of a teach-pendant. Now
the computer commands the axes to move in response to key-strokes or button-presses by the
operator. These may relate directly to individual axis movements, or may be transformed from
"world" or "tool" coordinates to make the human driving task easier. Solution of the inverse
kinematic equations is necessary if "world mode" or "tool mode" entry is required.
In the command language, there may be provision for specifying that a move should be made
in a straight line, rather than at uniform axis speeds. Some machines which offer world mode
programming, however, record only the joint coordinates for later replay, relying on the operator
to record sufficient intermediate points if straight-line movement is required. Move interpola-
tion is given some prominent consideration in the present study.
Modern sixteen-bit and thirty-two bit microcomputers are at least as powerful as the
processors at the heart of most robots. For high speed performance, however, there is an
advantage to be gained in reducing the number of interpolated points at which joint angles must
be calculated for producing smooth world-mode movement. Departing from the elementary
strategy of demanding a constant rate of change ofjoint angles between supervisory commands,
many authors suggest the use of fifth-order polynomial segments to achieve continuity of
acceleration at the blends. Here it is proposed that a cubic interpolation method will provide
an adequate compromise, giving acceptably small departures of the path while the slight
acceleration demand transients are smoothed by the electronic and mechanical responses. Sets
of coefficients for a third order difference equation are transmitted from the supervising
computer system to each of the axis controllers, which have a common "handshake" system to
synchronise their operation.

3. Kinematics and their inverse.

The Mitsubishi RM-501 is a five axis robot. A general inverse for tool position does nottherefore
exist, and a restriction on tool movement is imposed. The necessary constraint is easily
expressed, however, as will be seen.
133

.!lPproach

.QPening

Figure 1. Gripper Coordinates.

Using the notation ofPauP, we may start with the approach vector !t, the opening vector Q and
the normal vector n defined to be coincident with the z, y and x axes at the base, see figure 2. The
first robot axis is the rotation of the base about the z axis. This and successive transformations
will represent the stages shown in figure 2:

(a) Al = Rotate about .!! - base rotation.


(b) A2 = Rotate about n - shoulder elevation.
(c) GI = Translate along.!! - shoulder to elbow.
(d) ~ = Rotate about n - elbow bend.
(e) G2 = Translate along.!! - elbow to wrist.
(f) A4 = Rotate about n - wrist bend.
(g) As = Rotate about .!! - wrist twist.
(h) G3 = Translate along.!! - wrist to tool tip.

Now Al can be represented in matrix form as:

[c.
~l
-Sl 0
Al = Sl ci 0
0 0 1
0 0 0

where ci and Sl are sin(9-I) and cos(91) respectively. Similarly we have:

A2
.[j 0
c2
S2
0
0
-S2
c2
0 ~l
134

~""'Q
1ft T1= ~A2 rotot~
~
b -

0- Gz
I ~Ironslol~
'""',-'
;le'" n"'*'. . rolate
A3
I ,
I ,'<::1I~t:
,
I ....... n
, I" g I ,-
I

d " g

Figure 2. The elemelltary movemellts which combine to define the robot's position.
135

"[j ~l
0 0
G1 1 0
0 1
0 0

where T is the length of the "humerus",

[; ~1
0 0
~ c3 -S3
S3 c3
0 0

[; ~l
0 0
Gz 1 0
" 0 1
0 0

where U is the length of the "ulna",

[; ~1
0 0
A4 c4 -S4
S4 c4
0 0

~1
-ss 0
As [ Ss
c, Cs 0
0 0 1
0 0 0

[; ~l
0 0
and G3 1 0
" 0 1
0 0

where V is the length of the hand.


136

We could now express the total transformation as the matrix product:

which we alternatively define as Ts' represented as

[~:
=
0x

0y
ax
ay
px]
py
0z az pz
0 0 1

where (Px' Py' Pz) is the vector coordinate of the tool tip, and the first three columns represent
the cartesian components of the vectors fl, Q and E. respectively.
Now we can express Ts as:

(3.1)

If the term in the brackets is denoted as ITs' it can be expanded as

]
-S5 0 0
ITs = [ ss·c)4
C, cs·c)4 -s)4 -s234.V-S23.U-s2.T
ss·s234 cs·s)4 c234 c)4.V +c).U +c2.T
0 0 0 1 (3.2)

where c234 represents COS(ft2 + 83 + ft.), and so on.


Performing the final multiplication reveals that the end effector coordinates are given by:

Px sl·s)4.V + SI·S).U + SI·S2· T


Py -C I·s234.V CI·s23.U CI·s2·T
Pz c234.V + c).U + c2.T (3.3)

Again following Paul we can approach the task of solving for the inverse kinematic relationship
by noting that:
137

whence

The inverse of AI is here equal to its transpose, and we fmd:

cl.n. +sl·ny CI'OX +sl'Oy cl.a. +sl·ay CI.P. +sl'PY


ITs = -sl.n. +cl.ny -SI'O. +cl'oy -sl.a. +cl·ay -SI'P. +cI'py
n 0 a P.
• • •
0 0 0 1 (3.4)

Now we can equate the 1,4 terms in equations 3.2. and 3.4. to arrive at

(3.5)

while the 1,3 terms give

(3.6)

Together, these equations express the restrictions imposed on the tool direction vector by the
lack of a sixth axis.
Equating further terms give the following equations:
3,3 gives:
a. = c234 (3.7)
2,3 gives:
-SI· a• + cl·ay = S234 (3.8)
2,4 gives:
-SI'P, + cI'py = -S234' V - S23' U - s2· T (3.9)
3,4 gives:
(3.10)

Now the equations can be solved systematically. Using the unambiguous ATAN2 function,
where ATAN2(A.sin9, A.cos9) = e, we have from 3.5:
138

and from 3.7 and 3.8:

Now we calculate R (3.11)

and S (3.12)

Equations 3.9 and 3.10 reveal that

(3.13)

and (3.14)

Squaring and adding yields:

where ~ = ATAN2(S, R)

Hence = sin-I 'P + R2 + S2 _ U2 -~


2 T (R2 + S2)1/2 (3.15)

and similarly

= sin-I U2 + R2 + S2 - T2 - 82 _ ~
2 T (R2 + S2)1/2 (3.16)

whence 84 can be deduced from 3.12.


It has been seen that some restriction must be imposed on the tool position to make it
accessible from the five-axis movement. This implies that while the tool tip position is accurately
interpolated in terms of its cartesian coordinates, true constant-tool-angle straight line motion
is not in general possible. Instead we may choose to interpolate the values of 82 + 83 + 84 and
8s linearly between the end points.
We now have a sequence of times at which we compute demanded values of each of the five axis
angles. In the next section, cubic polynomials are fitted to sequences of values, so that further
smoothing will take place in each axis.
139

4. Polynomial fitting.

The computation of polynomial approximations to interpolate between evaluations of the


inverse kinematics is not a novel suggestion; a succession of coordinate sets might be output as
axis position demands. In this study, however, the execution of a cubic curve is implicit in the
implementation of each axis, which receives a command in terms of position demand and its frrst
three derivatives. The position demand is modified during execution by repeatedly adding a
velocity element to it. The velocity parameter is in turn subjected to the addition of an
acceleration term at each iteration, while the acceleration again changes through the addition
of a final term. These four parameters, the initial position and its three derivatives, must be
determined from past, present and future computed targets.

I ..----- I I
r:
I

n = -1
J:::::::::i:------']
n = 0 n =1 n=2 time

Figure 3. Fitting a cubic to the endpoints and to previous and following points as"pathjinders".

The cubic is fitted through four values at equal intervals of time. The times of these evaluations
can be considered as nT, where n takes values -I, 0, 1 and 2. This stresses that the resulting cubic
will be employed between the values of t = 0 and t = T, using the previous point at t = -T and a
future point at t=2T as "pathfinders".
There can of course be a discontinuity of derivative between segments, which might have been
eliminated by matching a higher order curve instead. The purpose of the interpolation, however,
is merely to improve on a simple linear interpolation between computed targets which are
themselves samples of a smooth curve. Further smoothing will result from the dynamics of the
servomotor, and so the added complexity of matching the derivative does not appear to be
warranted.
The calculation can be derived by considering a set of four polynomials, orthogonal at the
sample points where they take the following values:

n= -1 0 1 2

Po(nT): 1 1 1 1

PI(nT): -1 -1 1 1

P 2(nT): 1 -1 -1 1

P3(nT): -1 1 -1 1 (4.1)
140

We see that the fitted curve is given by

x(nT) = L v, P,(nT)
,=0,3

where V
r 1/4 L P,(nT) x(nT)
(4.2)
.=·1,2

We find that to satisfy equation 4.1

PI(nT) = (-2 n3 + 3 n2 + 5 n - 3) /3

PinT) = n2 - n - 1

i.e.

Po(nT) 3 0 0 0 1
PI(nT) 1 -3 5 3 -2 n
PinT) 3 -3 -3 3 0 n2
P3 (nT) 3 -4 -6 4 n3 (4.3)

Now if

we have from 4.2 and 4.3:

ao 3 -3 -3 3 Vo
al 1 0 5 -3 -4 VI
a2 3 0 3 3 -6 v2
a3 0 -2 0 4 V3
141

3 -3 -3 3 1 1 1 1
x(-1) ]
.1 0 5 -3 -4 -1 -1 1 1 x( 0)
12 0 3 3 -6 1 -1 -1 1 x(T)
0 -2 0 4 -1 1 -1 1 x(2T)

whence

ao 0 6 0 0 x(-T)
al 1 -2 -3 6 -1 x( 0)
a2 6 3 -6 3 0 x(T)
~ -1 3 -3 1 x(2T) (4.4)

5. Implementing the difference equations

At each iteration of the processor controlling one of the robot axes, a proportion of the velocity
parameter, XI' will be added to the position demand xo:

For computational convenience N might be taken as 256, with the implication that the velocity
is scaled to represent the distance covered in 256 computation iterations. Time t will then be
represented by r /N, where r is the number of iterations so far performed. Velocity is updated
by acceleration in the same way, as is acceleration in its turn. The execution of the repeated
summation process can be performed by modifying each variable either before or after it has
been used to update a lower derivative, i.e. by performing the sequence of operations

X. . .
:= X + X +1/N (5.1)

in descending or ascending order of the subscript. In the first case, ifwe consider a constant third
derivative term of value 1 the result is seen to be

xo(r/N) = r(r + 1)(r + 2)/(6 Nl)

while in the second it is

xo(r/N) = r(r - l)(r - 2)/(6 NJ)

In either case the result will differ from t3/ 6 to a significant extent unless N is extremely large.
142

For general digital simulation, there are many integration algorithms which replace the simple
summation with a more complicated procedure to give a closer approximation to continuous
integration. In this instance, there is a much simpler solution. The inverse operation ofthe post-
updating method in which assignment 5.1 is applied in ascending order of subscript is the forward
difference operator.
If we take successive forward differences ofthe function Xo = t 3 (t = r/N) we find for r = 0, 1,
2 ... a set of sequences of values as follows:

N3xo(r) 0, 1, 8, 27, 64, 125 ...

N2 x1 (r) 1, 7, 19, 37, 61 ...

N x2(r) 6, 12, 18, 24 ...

~(r) 6, 6, 6 ...

Similarly for the function Xo = t 2we have

N2xoCr) 0, 1, 4, 9,

N x1(r) 1, 3, 5 ...

x2(r) 2, 2 ...

~(r) 0 ...

thus if we set up the variables with initial values

then the correct function will be accurately generated by the repeated addition process. For the
particular value N =256 we can represent the coefficients as hexadecimal "decimals" as follows:
143

x., 0 1 0 0 x(-T)
Xl -0.54DS8 0.80FF8 1.007F8 -0.2AAA8 X( 0)
X2 O.FF -1.FD O.FD 0.01 x(T)
'S -1 3 -3 1 x(2T)

6. Axis command and coordination

To enable the system to function as a robot, some sort of interface must be provided to link the
coordinating microcomputer to the individual axis control cards. This functional link is "private"
to the system which forms the robot and is specific to the task of providing command and status
monitoring in the most expedient way. Of course the computer must respect general commu-
nication structures and protocols in its dealings with any external network.

With independent processors controlling the individual axes, it is necessary to ensure that the
solutions of the difference equations keep in step. A three-line handshake is provided for this
purpose, in structure not unlike the handshake associated with the IEEE-488 bus. Each card has
two wire-or-ed outputs, each presenting a high or low resistance to ground. These are labelled
''waiting'' and "computing". A master card monitors these signals, and when appropriate issues
a signal "proceed".

Axes complete computation at I Computation


staggered times 1 resumes
-, -, 1- r
computintlL-_ _B--L..:_ _C_1L-1_ _ _ _ _ _B--L.:_A--1:_c---,1

I
1

_____________
All have released

waiting ~~M~:_~~:
Added
delay

proceed

I
Figure 4. Handshake example for three axes.
144

As each processor completes its iteration, it pulls "computing" false and releases "waiting".
Only when all the control cards have reached the waiting state will the "waiting" line go high. At
this point the master card pulls "proceed" false. In response, each processor releases
"computing" after pulling down "waiting". Thus only when all the cards have acknowledged the
low transition of "proceed" will "computing" be seen to go high. As the final step in the cycle,
"proceed" is taken high and the computation in each card is allowed to perform a further
iteration.
The "proceed" signal can be constructed using two NO R gates cross-connected as a bistable.
In practice, however, additional gating and timing circuits are added so that the supervisory
computer can regulate the speed of operation, permitting the execution of single steps if they are
called for.
Within the axis control card, the repeated computation is performed which yields the "target
position" and "target velocity". The motor carries a position transducer and a tachometer, so
that feedback of "position now" and ''velocity now" can be applied to achieve position control.
An analysis of the design of this loop is given in the next section.
Now the supervisory system must be able to issue commands to each processor, to transmit
sets of parameters for each move and to receive status and position information. In a journal
papers it was suggested that each command could be structured as a succession of eight-bit bytes,
the hexadecimal code SO being given special significance as a control character. Here the byte
following the code SOh determines the command, which will in general be followed by a list of
data bytes. Whenever a data byte is required to have value SOh, it is represented by two bytes
SOh FFh.
The list which follows contains those commands which are seen to be necessary. A code has
been suggested for each command, but if the scheme should find acceptance as a general
standard then the final set of codes might be the subject of considerable discussion.

(i) Output commands:

Olh: Add the following coefficients which define a move to the queue of moves awaiting
execution. The coefficient set consists of:
number of calculation iterations (2 bytes),
initial position (3 bytes),
velocity (2 bytes),
acceleration (2 bytes),
jerk (2 bytes).

02h: Delete a move from the end of the queue. On the status bits, show the number of moves
still queued, or FFh if this command is invalid.

03h: Delete all queued moves, but complete the present move.
145

04h: Stop and clear. Set "all moves complete".

10h: Resume operation at the next "proceed" state of the handshake lines; this recovers from
the "stop" condition (llh).

llh: Stop and lock. Position control continues, but the position target freezes. The handshake
is frozen to halt all other channels.

12h: Stop and stand by. Position control continues with frozen target. The card releases both
handshake lines to allow other cards to proce,ed.

13h: Go limp. This is always preceded by a "stop" command. The motor drive is then set to
zero by the expedient of repeatedly copying the "actual position" value to the target position.
Another "stop" command will restore locked position control, with the present actual position
retained as target.

14h: Rotate until sense switch changes state. Two bytes set a velocity, another two set a limit
to the manoeuvre. The status byte shows 00 or 01 while running, according to the present value
of the switch. It shows 81h or 80h on successful completion, or FOh or Flh if the movement was
exhausted without reaching the sensor switching position.

ISh: Reset the axis counter, to make the "actual position" the reference zero position.

20h: Set normal mode: lock handshake when data points are exhausted.

21h: Set "ragged" mode: release handshake line whenever data is exhausted.

(ii) Commands for input to the supervisory computer:

40h: Show status byte - the normal state. The eight-bit reply channel is updated with the card
status whenever it changes. The eight parallel bits can be read by the master machine without
use of any handshake. Three bits of the status indicate the number of queued moves remaining
to be performed - with a limit of seven. To avoid transient errors due to asynchronous reading,
a "Gray code" format can be used. Other bits indicate the partial receipt of a command string
and the synchronisation status in terms of "computing" or "waiting".

41h: Copy the status ofthe movement in progress to a buffer - steps remaining, position, velocity,
acceleration, jerk. These can then be read in sequence through the status byte, which now
performs a data handshake. Since the status channel does not carry commands, the code 80h
requires no special treatment.
146

(iii) Special development commands:

To assist in developing the axis control software, provision has been made to download machine
code into the controller's RAM, and to transfer execution to any desired experimental routine.

60h: Read memory byte. The command is followed by two address bytes in the order high-low
(note that if any such byte is 80h, it must be transmitted as two bytes 80h FFh). The required
byte is copied to the status port. When it is read, the handshake causes the next byte to be output,
continuing until a further command is received.

6th: Write to memory. The command is followed by two address bytes, then by a succession
of bytes to be deposited in memory from that address onwards. Any byte of value 80h is
transmitted as 80h FFh. Any other command ends the transfer sequence.

63h: Execute memory. The command is followed by two address bytes. Care must be taken
to ensure that the new program is linked to the communication routine.

The hardware implementation of the link now remains to be defined. A number of choices
are available for the transmission technique. A serial line could be used, in the manner of a local
area network, giving savings in wiring. In common with several other bus structures, this can give
rise to "listener-talker" problems. If two separate lines are used for input and output, each
processor card must still have the facility to sieve all the received messages and to respond to
those which concern it. This in turn requires frequent interrupt activity if a single processor is
to be used per axis.
The method chosen for the experimental system maps each axis card into a separate group
of addresses in the memory or I/O map of the supervisory machine. Although this is to some
extent hardware specific, there is a commonality which runs across all present machines in the
provision of an expansion bus. This always carries address and data lines and a read/not-write
line or its equivalent. The same signals are to be found among those of a VME bus.
A single decoder on the master card enables a "peripheral interface adaptor" PIA chip
corresponding to one of the axis cards. This device has two ports, and one is here configured as
an input while the other becomes an output. Each port is linked pin-for-pin to its counterpart
on a second PIA, which is in turn connected to the internal bus of the axis controller. Two
handshake lines per port link the devices, which are designed to react to data transfers and to
reflect the status "new" or "taken" in an addressable register. Interrupts can also be generated
as required.
The pair of PIA's offer a low cost solution to the problem of providing a one-byte cache for each
axis in each direction, giving a fast asynchronous link with full semaphore protection of the data
transfer.
Such a technique can achieve very efficient data packing and gives high speed of communi-
cation over dedicated parallel channels. It was used in the prototype of a six-legged walking
robot. s A subsequent pneumatically powered walking robot was designed for low-cost fabrica-
tion and required greater simplicity of communication and interfacing. For this, a serial
multidrop system has been used. The most significant nibble of each transmitted character
147

defmes its control function, such as "select hoard" or "pulse control valve", whilst the least
significant nibble carries a data value. In this system, the command traffic is relatively light and
it has been possible to spare the time for a one-byte response after each command.

7. Implementing the position control loop.

Much theoretical research has been devoted of late to the design of control loops which com-
pensate for inter-axis forces; the solutions which seem to find favour usually involve large
computational resources. Their design process is often based on the optimisation of an integral-
square-of-error cost function and great care is taken by the authors that the servo amplifier
should not saturate - the linear theory would then not be applicable. The concern for theoretical
respectability will usually result in performanc:e which is inferior to that of a system designed by
traditional methods.
When designing autopilot actuators twenty-five years ago, an important concern was to close
a "velodyne" loop around the servomotor with as high a tachometer gain as possible. The
response to force disturbances was then minimal; full saturated drive torque could be applied
for a velocity error of as little as 0.02% of the available speed range.
Any control signal injected into this loop consequently represented a demand for servomotor
velocity. A feedback signal was applied which was proportional to servo position error and very
large gains were again possible. Full saturating servo torque was achieved for a very small error
- the limitation was found in the accuracy of the transducer, rather than in the performance of
the control loop. In effect, the error due to an applied force was negligihle, less than that defmed
by the mechanical rigidity of the structure, until the disturbing force became equal to the limit
of the servomotor driving force. For greater forces control was lost completely - as it would be
under any controlling algorithm.
The most convenient method of analysing the saturating second-order system is hy means of
the phase-plane diagram, in which velocity is plotted against position error. When velocity
demand is made proportional to position error, the plane is divided into three regions by two
parallel oblique lines. The "proportional band", where the drive is not saturated, separates the
regions where the drive is at a positive extreme and at a negative extreme.
The behaviour of the controlled system is represented in the phase-plane by "trajectories",
curves along which the point representing the system's state will proceed. Under constant
acceleration or deceleration these will be parabolae, see figure 5 (reproduced from "Control-
ling with Computers"8 with the permission of the publishers.)
In the proportional band, the drive to the servo amplifier may be expressed as:

d = -A * (position error + k * velocity)


Outside the proportional band, the drive takes a maximum or minimum limiting value + / -u....
corresponding to the sign of d. As the overall gain A is increased, the proportional band narrows
to a "switching line", with slope -11k. The trajectory of the servomotor must now locally follow
one or other of two parabolic paths, according to the sign of d.
148

velocity axis

position ->
U = -umox

Proportional
band

Figure 5. Phase-plane trajectories of a second-order system with saturating drive.

velocity

trajectory

position el1Vr

Figure 6. A switching line showing sliding Figure 7. A trajectory which overshoots

Figure 6 shows that in some circumstances a parabolic path appears impossible. When the
trajectory crosses the switching line from the negative-drive region to the positive-drive one, the
positive-drive trajectory is seen to carry it immediately back into the negative-drive region. The
drive oscillates rapidly. The system performance may, however, be quite acceptable. The rapid
drive switching causes the servo state to follow the switching line, in what is termed "sliding"
motion. If the switching line has slope 11k, it is defined by the equation:

x+ kx = 0

Any state which follows this path will therefore satisfy the above equation, which can be solved
as:

The path of the servo state represents an exponential decay of error having time-constant k.
149

A slight proportional band can in practice modify the performance to remove the rapid
switching of the drive, replacing it with a steady value corresponding to the mean of the mark-
space ratio. This has the particular advantage of allowing the servo to come to rest with an
ensuing steady drive of zero.
Figure 7 shows that for a certain size of step disturbance, the accelerating trajectory cuts the
switching line at a point from which the decelerating trajectory passes through the origin. The
motor moves ftrst under full acceleration then under full deceleration to reach the origin in
minimum possible time. It also shows that when the parabola intersects the switching line
outside this point, an overshoot is inevitable. If the designer succumbs to the temptation to
reduce k to achieve faster response, the size of position step which can be achieved without
overshoot is reduced.

With a dual-slope switching line,


faster final settling is possible
without the risk of overshoot.

Figure 9. With a velocity limit, overshoots Figure 10.A dual-slope switching line.
from large displacements are avoided.
The answer to the design dilemma lies in nonlinear control. If the position error term is
limited in magnitude, giving a limit to the velocity demanded, then the phase-plane appears as
in figure 8: there is no overshoot. Alternatively a dual-slope curve of demanded velocity versus
error can give the performance of ftgure 9. The function which gives time-optimal control from
all errors is the switching parabola of figure 10.
i

u =-1
11- +1

Figure 10. The time-optimal switching curve, fonned


Time·optimal swilching
from two parabolae. curve for.f = II. where IUIEiI
150

To have chosen time-optimal control carries a large performance penalty in practice. Ma-
noeuvres are performed at the maximum speed from which the motor can theoretically come
to rest without overshoot. Disturbing forces and changes in inertial parameters will cause the
trajectory to deviate from the predicted one, and oscillatory behaviour will probably be the
result.
Suppose that a parabolic switching line is used nevertheless, but that the parabola represents
a deceleration which is only one half of the performance of which the motor is capable. For a
step change of target, the motor will again start by applying full acceleration. On reaching the
switching curve, the system will now enter a sliding mode in which the state follows the curve with
mean deceleration as defined by the parabola. Any disturbing forces which are less than one half
offull drive in magnitude will fail to detach the systems state from the sliding trajectory; they will
only modify the mean value of the motor torque.
The cost of this immunity to disturbances is an increase in settling time of 40% after the sliding
curve has been reached (a factor of 1.4), but since full acceleration is applied before reaching the
switching curve this increase may be only 20% overall for a step change of target position.
These discussions have assumed that the position error is represented by a continuous signal.
Often the position measurement will take the form of an incremetal optical transducer and
counter, so that the error signal changes in discrete values. The switching curve will now appear
as a staircase of steps, as the target speed changes for each value of the error count. In such a
system, the control is likely to be implemented by means of a microprocessor. Now the velocity
demand signal can be read from a table, where the index is error size, to give any switching curve
which may be desired.9
Control of a multi-axis robot with servomotor loops designed in this fashion will be immune
to substantial disturbance forces and to errors in predicted inertial parameters. As the operating
speed is pushed upwards, however, a point is reached where acceleration and coriolis forces
exceed the safety margin allowed. The robot path will break free from the target trajectory and
large errors can result. To avoid this, the path must be followed at speeds within the drive
limitation of the robot axes - a limitation common to all possible control strategies. To ensure
error-free control it is now necessary to moderate the target speed at which the spatial trajectory
is to be followed. Such path planning is a problem in the realm of dynamic programming.

8. Priority of operations in the axis controller

The axis processor has three tasks:


(i) To provide position control in the axis servo,
(ii) To process commands from the coordinator,
(iii) To compute a target coordinate from the control polynomial, participating in the synchro-
nising handshake.

(i) Performance will deteriorate significantly if the position control loop is not serviced
regularly, and so this is initiated by a high priority interrupt. The controller design used in the
experimental system involves a novel power amplifier, published in a paper coauthored by
D. Sanders3. Incremental encoder signals are processed by a 74LS2000 device, so that the
151

position can be read at each control cycle, and a corrective signal computed for application to
the drive amplifier.
Care must be taken to ensure that the interrupt does not result in a target position being used
which has been incompletely updated. Since the controlling function is initiated by the "Non
Maskable Interrupt" the target position data must be duplicated, flags being set before and
cleared after each replica is updated.

(ii) The supervisory computer must receive a swift response, but a delay equal to the position
control service time can be acceptable. Receipt of a byte on the command bus is thus serviced
by a lower level interrupt.
Commands must be able to initiate or inhibit control action, and to reschedule tasks at the
lower level too. A slightly unusual software structure has therefore been used, which accom-
plishes the necessary tasks without the complexity of a conventional multi-tasking operating
system.
The interrupt routine first fetches the byte from the communications interface. It tests it for
the "command" value 80h, in which case it sets a flag and returns forthwith (after restoring
appropriate registers).
If the "command" flag is not set, the byte is a data item for a command routine already in
progress. Two bytes are read from memory and pushed onto the stack, and a "return from
subroutine" is performed. This transfers control into the appropriate part of a routine which has
previously been invoked, which deals promptly with the input byte and then CALLs ENDINT
if a further byte is expected. ENDINT pops the return address bytes off the stack, saving them
for use when the next interrupt occurs, and then performs an interrupt return. If the command
task is complete, a jump is instead made to IDLE, which again calls ENDINT, jumping back to
IDLE as each non-command byte is received.
If the "command" flag is set, and the received byte is FFh, then the flag is reset, value 80h is
substituted for the byte, and computation proceeds as in the preceding paragraph.
If the "command" flag is set, then the flag is reset and the received byte is used as an index
into a pair of tables giving the address of the command routine to be initiated. These are pushed
onto the stack, and a subroutine RETURN is performed as before. When command is recovered
through ENDINT, a resumption address will be saved for the new routine.
This method of operation clearly imposes certain restrictions on all command routines, since
they must leave the stack clean for the interrupt return. In particular, although the command
routine may contain subroutines, none of these must include a call to ENDINT to fetch another
byte. Register values are lost between calls, and so their values are saved in locations particular
to the command level.

(iii) This task includes the three-line synchronisation handshake. Since this is designed to
reflect the readiness of all axis controllers to proceed, it would be pointless to cause computation
of the target coordinates to be interrupted by an event on the "proceed" line. Instead the com-
putation closes with a polling loop on the "proceed" status, from which the two levels of inter-
rupt can seize command as required.
152

9. Conclusion

A system has been outlined which will allow a general desktop computer to be used as the co-
ordinator and software vehicle of a robot. Interfacing is performed via VME or general
expansion bus to individual axis controller cards, which contain a minimum of software to
support position control, polynomial target interpolation and communication with the coordi-
nating machine.
The robot programming language has not featured in the description, since a major feature
of the proposal is that a variety of languages, be they general or application specific, can be
supported on the system at the behest of the user. Since the host computer is connected into
the system through a single plug-in card, and since only one memory offset address is involved
in the "personalisation" of the interface, almost complete portability can be achieved among
machines offering a given operating system.
The resulting "unbundling" can go far to preserve the durability of investment in both soft-
ware and hardware.

10. Acknowledgement.

The purchase of the Mitsubishi Robot was made possible through a grant from the Science and
Engineering Research Council under the initiative "Application of Computers in Manufactur-
ing Engineering".

11. References

1. R. Paul: Robot Manipulators, MIT Press, 1982.


2. J.J. Craig: Introduction to Robotics Mechanics and Control, Addison Wesley, 1986.
3. D. Sanders and J. Billingsley: Servomotor Power Amplifier, Electronic Engineering, August
1986, vol. 58, no. 716, pp 47-49.
4. J. Billingsley, D.R.S. Hedgeland and J.L. Moughton: A Multiprocessor Research Facility for
Control Applications, Proc. 2nd lEE Conference "Trends in On-line Computer Control
Systems", Sheffield, April 1975.
5. A.A. Collie, J. Billingsley and L. Hatley: The Development of a Pneumatically Powered
Walking Robot Base, Proc.I.Mech.E. Conference "UK Research in Advanced Manufacture",
London, December 1986.
6. J. Billingsley: A system for partitioned control of a robot, lEE Proceedings part D, vol. 134
no. 5, Sept 1987, pp 309-316.
7. J. Billingsley: Controlling with computers, Control theory and practical digital systems, pub.
McGraw-Hill 1989, ISBN 0-07-084193-4
8. J. Billingsley, H. Singh: The use of a microcomputer as a dedicated position controller, Proc.
IFAC/IMEKO international symposium on application of microprocessors for instrumenta-
tion and automatic control, London, November 1980, pp 93-97.
CHAPTER 7

A PC SIMULATION PROGRAM FOR COMPARING


PERFORMANCES OF ROBOT CONTROL ALGORITHMS

J. O'SHEA, R. BENALI
Ecole Poly technique de Montreal
C.P. 6079, Succ. A
Montreal H3C 3A7, Canada

ABSTRACT. In order to compare the performances of robot control


algorithms, a benchmark test, based on the Puma 560 tracing straight
line trajectories, is proposed. For simulating that test with a personal
computer, subroutines of a program written in Pascal are described and
their 1 istings are suppl ied. Tests of some robot control algorithms
serve as examples.
1. Introduction

The motivation for this work came mostly from the need to test quickly
and inexpensively various robot control algorithms, proposed in the
literature, on a comparative basis. Simulation permits to carry that
evaluation easily and provides for direct comparison if it is limited to
only one type of robot. Since the Puma-560 is one of the most popular
robot and is well documented, it appeared as a natural choice to serve
as a comparative simulation model.
Generally, there exists two kinds of robot computer simulation. One
consists of displaying on a screen an animated picture of a robot
performing a task. CAnA [1] is a good example of such a type of
simulation. This is done mostly for determining or validating the
location of the work station relative to the robot and other machine-
tools. The second corresponds to the traditional way of simulating
processes or systems by plotting curves of variables of interest versus
time. This is specifically the type of simulation dealt with in this
paper.
Whereas reference [2] uses the Newton-Euler formulation of the equations
of motion which were programmed in Fortran to run on a CYBER 175
computer, this paper presents a study based on the Lagrange formulation,
while the simulation program is written in Pascal and runs on a PC.
153
154

However, the general scheme proposed in [2] has been retained and
appears herein as figure 1.

TRAJECTORY GENERATION
IN CARTESIAN COORDINATES

/
TRANSFORMATION FROM CARTESIAN
TO JOINT COORDINATES
l
l 9d
.....-----1-' CONTROL ALGORITHMS SIMULATION I
~ u
1-------'-1 ACTUATORS SIMULA TlON l
, Fm

1 - - - - - + 11 ROBOT DYNAMICS SIMULATION l


, 9
,1 ACTUAL POSITION AND SPEED COMPUTATIONS
1 - - -.. I

L-_ _ _ _ _ _ _ _ ~
Ie. e

Figure 1: Robot Simulation Program Organisation

For testing the performances of different robot control algorithms, it


was felt that simulating the first three degrees of freedom (the wrist
motion having much less influence on trajectory errors) was sufficient.
In principles, however, nothing prevents one to take advantage of the
modularity of the program presented below and extend it to a full six
degree-of-freedom robot simulation.
Actua 11 y , the computer used for th i s work is a bas i c I BM- PC equ i pped
with a 8087 co-processor, 640K RAM, a 10 megabyte hard disk and a 5 1/4
inch floppy disk drive. The simulation algorithm for the robot dynamics
being customized [3] and being 1imited to three degree-of-freedom and
taking a long time to run (between 10 and 25 minutes) obviously
explained the great disparity of the means between [2] and this paper.
The program consists of three modules called respectively "trajectory
155

generation", "controller" and "robot dynamics". The subroutines they


contain constitute topics that are dealt with in the following sections
where the respective listings also appear.
2. Trajectory generation
Seeking a benchmark trajectory for comparing the performances of control
algorithms, in the case of revolute-joint robot, one finds the straight
line a good candidate. While it is easy to visualize geometrically and
define mathematically, it presents a difficult task, when it is
sufficiently long, for each rotary joint to execute. One must remark
that "trajectory" here means the locus traced in cartesian workspace by
the distal end of the robot third link which corresponds to the centre
of the wrist axes. For that point to move smoothly, the programming
method resorts to polynomial functions. The 4-3-4 trajectory [3] was
adopted.
Briefly, the projections of the trajectory, on the inertial axes in the
workspace (see figure 2), are divided into three segments and the motion
along those segments obeys, in our case, a fourth order time polynomial
for the first segment, a third order for the second segment and a forth
order for the third segment.

Figure 2: Puma 560


156

Based on that approach and arbitrarily allowing a one-second time


interval for each segment, a trajectory generation program was written
for simulating the trajectory in the workspace.
g[3,6]:=-(e*t1*(t3*t3»/(6);
g[3,7J :=-(2*e*t1);
9 [3,8] :=(e*t1);
g[3,9]:=-(e*t1)/(6);
{ Trajectory data readings } g[4,1]:=-(6*h*t2)/(t1);
g[4,2]:=(3*t2*h);
readln(f1,ts1,ts2,ts3,tsim,dt); g[4,3]:=-(t1*t2*h)/2;
for i: =1 to 6 do g[4,4]:=(3*t2*an);
begin g[4,5]:=-(3*t2*t3*e);
for j:=1 to 8 do g[4,6]:=-(t2*e*(t3*t3»/(2);
begin g[4,7J:=-(6*e*t2);
read(f1,ae[i,j]); g[4,8]:=(3*t2*e);
encl·, g[4,9]:=-(e*t2)/(2);
end; g[5,1]:=(2*af)/(t1);
g[5,2] :=-af;
{ Polynomial coefficients calculation } g[5,3]:=(t1*af)/6;
g[5,4] :=-am;
t1:=ts1;t2:=ts2-ts1;t3:=ts3-ts2; g[5,5]:=e*t3*(t1+3*t2);
a:=(3*t1+9*t2-t1*t2-6*t2*t2)/(2*(t1+2*t2»; g[5,6]:=(e*(t3*t3)*(tl+3*t2»/6;
b:=(t3*(t1+6*t2-3»/(t2*(t1+2*t2»; g[5,7J:=2*e*(t1+3*t2);
e:=(t2*t2)/(t3*(t1+2*t2)*(2*a+t3»; g[5,8]:=-e*(t1+3*t2);
an:=(1+e*b*(t1+2*t2»/(t1+2*t2); g[S,9]:=(e*(t1+3*t2»/6;
am: =( (t2)/( t1+2*t2) )+(e*b*( t1+3*t2»; g[6,1]:=-(2*ak)/(t1);
af:=(am*t2)-(t3*e*(t1+3*t2»; g[6,2] :=ak;
h:=(an*t2)-(t3*e); g[6,3]:=-(t1*ak)/6;
ak:=(t3*(b*t2-t3»/(2*a+t3); g[6,4]:=(b*t3)/(2*a+t3);
al:=(t1/(t1+2*t2»+(t1*e*b); g[6,S]:=-(t3*t3)/(2*a+t3);
az:=al-«t1*t3*e)/t2); g[6,6]:=(a*(t3*t3»/(3*(2*a+t3»;
g[6,7J:=(4*a)/(2*a+t3);
g[1,1]:=2*(1+az); g[6,8]:=-(2*a)/(2*a+t3);
g[1,2]:=-(t1*az); g[6,9]:=(a)/(3*(2*a+t3»;
g[1,3]:=-«t1*t1)*(1-az»/6; g[7,1]:=(6*ak)/(t1);
g[1,4]:=-(t1*al)/(t2); g[7,2]:=-(3*ak);
g[1,5]:=(e*t3*(t1*t1»/(t2); g[7,3]:=(t1*ak)/2;
g[1,6]:=(e*(t1*t1)*(t3*t3»/(6*t2); g[7,4]:=-(3*t3*b)/(2*a+t3);
g[1,7J:=(2*e*(t1*tl»/(t2); g[7,5]:=(3*(t3*t3»/(2*a+t3);
g[1,8]:=-(e*(t1*t1»/(t2); g[7,6]:=-(a*(t3*t3»/(2*a+t3);
g[1,9]:=(e*(t1*t1»/(6*t2); g[7,7J:=(6*t3)/(2*a+t3);
g[2,1]:=-(1+2*az); g[7,8]:=-(3*t3)/(2*a+t3);
g[2,2]:=(tl*az); g[7,9]:=(t3)/(2*(2*a+t3»;
g[2,3]:=«t1*t1)*(1-az»/6; g[8,1]:=-(6*ak)/(tl);
g[2,4]:=(t1*al)/(t2); g[8,2] :=(3*ak);
g[2,5]:=-(e*t3*(t1*t1»/(t2); g[8,3]:=-(t1*ak)/2;
g[2,6]:=-(e*(t1*t1)*(t3*t3»/(6*t2); g[8,4]:=(3*t3*b)/(2*a+t3);
g[2,7J:=-(2*e*(t1*t1»/(t2); g[8,5]:=-(3*(t3*t3»/(2*a+t3);
g[2,8]:=(e*(t1*t1»/(t2); g[8,6]:=(a*(t3*t3»/(2*a+t3);
g[2,9]:=-(e*(t1*t1»/(6*t2); g[8,7J:=-(4*a+8*t3)/(2*a+t3);
g[3,1]:=(2*t2*(1-az»/(t1); g[8,8]:=(4*a+5*t3)/(2*a+t3);
g[3,2]:=-(1-az)*t2; g[8,9]:=-(a+t3)/(2*a+t3);
g[3,3]:=(t1*t2*(1-az»/6; g[9,1]:=(2*ak)/(t1);
g[3,4] :=al; g[9,2] :=-ak;
g[3,5]:=-(e*t3*t1); g[9,3] :=(t1*ak)/6;
157

g[9,4]:=-(t3*b)/(2*a+t3); ( Trajectory Generation }


g[9,5]:=(t3*t3)/(2*a+t3); ts:=O;
g[9,6]:=-(a*(t3*t3»/(6*a+3*t3); REPEAT
g[9,n:=(2*a+3*t3)/(2*a+t3); begin
g[9,8]:=-(2*a+2*t3)/(2*a+t3); n:=1;
g[9,9]:=(4*a+3*t3)/(12*a+6*t3); REPEAT
for m:=1 to 6 do begin
begin IF ts<=ts1 THEN
begin
d [1,ml :=ae[m,2]-ae[m,1]- (ae[m,5]*t1) t:=(ts-tsO)/(ts1-tsO);
-«ae[m, n*(t1*t1 »/2); p[n]:=cof[n,1,O]+cof[n,1,1]*t
d[2,ml :=-ae[m,5]- (ae[m, n*t1); +cof[n,1,2]*t*t+cof [n,1 ,3]*t*t*t+
d [3,ml :=-ae[m, n; cof [n,1 ,4]*t*t*t*t;
d[4,ml :=ae[m,3]-ae[m,2]; end
d[5,ml :=0; ELSE IF (ts>ts1) and (ts<=ts2) THEN
d[6,ml :=0; begin
d[7,ml:=ae[m,4]-ae[m,3]; t:=(ts-ts1)/(ts2-ts1);
d[8,ml:=t3*ae[m,6]; p[n]:=cof[n,2,O]+cof[n,2,1]*t
d[9,ml:=(t3*t3)*ae[m,8]; +cof [n,2,2]*t*t+cof [n,2,3]*t*t*t;
end
as[m,1]:=ae[m,5]*t1; ELSE IF (ts>ts2) and (ts<=ts3) THEN
aa[m,1]:=«t1*t1)*ae[m,n)/2; begin
end; t:=(ts-ts2)/(ts3-ts2);
for i :=1 to 9 do p[n]:=cof[n,3,O]+cof[n,3,1]*t
for j:=1 to 6 do +cof [n,3,2] *t*t+cof [n,3,3]*t*t*t+
begin cof[n,3,4]*t*t*t*t;
c[i,j]:=O; end
for k:=1 to 9 do ELSE IF ts>ts3 THEN
c [i, j] :=c [i, j]+g [i ,k] *d[k, j]; begin
end; p[n] :=ae[n,4];
end;
for n:=1 to 6 do n:=n+1;
begin end;
cof[n,1,O]:=ae[n,1]; UNTIL n>nmax;
cof[n,1,1]:=as[n,1];
cof[n,1,2]:=aa[n,1]; ( Recording of Trajectory on Fi le }
cof[n,1,3]:=c[1,n];
cof [n, 1,4] :=c [2,n]; writeln(f2, ts, I I ,p[1], I I ,p[2], I I,

cof[n,2,O]:=ae[n,2]; p[3] );
cof[n,2,1]:=c[3,n];
cof[n,2,2]:=c[4,n]; ts:=ts+dt;
cof[n,2,3]:=c[5,n]; end-,
cof[n,3,O]:=ae[n,3]; UNTIL ts>=tsim ;
cof[n,3,1]:=c[6,n];
cof [n,3,2] :=c [7 ,n];
cof[n,3,3]:=c[8,n];
cof[n,3,4]:=c[9,n];
end; tsO:=O;

Now, the position vector P=(Px,Py'Pz)t, which points from the orlgln of
the arm coord i nate system to the di sta 1 end of 1ink 3, is known as a
function of time. Since the velocity of the effector along the
trajectory will also be required, one needs a subroutine for computing
the time derivative of the desired trajectory in the workspace:
158

px(t i} - Px(t i-1}


Vx =
ti - t i-1
Py(t;} - Py(t i-1 }
vy = (2.3)
ti - t i-1
Pz (t i ) - Pz(t i-1 }
vz
ti - t i-1

Where ti - t i-1 = sampl ing period


That subroutine is called VIT-DER for desired velocity and its listing
appears below:
Procedure VIT_DER(position,velocity:string); < Velocity calculation }
vx:=(px2-pxl)/(t2-tl);
vy:=(py2-pyl )/(t2-tl);
vz:=(pz2-pzl)/(t2-tl);
< Position reading }
readln(fl,tl,pxl,pyl,pzl); { Velocity recording }
writeln(f2,tl,' ',vx,' ',vy,' ',vz);
REPEAT
begin pxl :=px2;pyl :=py2;pzl :=pz2; tl :=t2;
end;
( Position reading } UNTIL eof(fl) ;
readln(fl, t2,px2,py2,pz2); End; {VIT_DER}

2.1. INVERSE KINEMATICS


2.1.1. Desired Joint Positions.
Knowing the effector trajectory in the workspace, a coordinate
transformation program is then required to get the joint trajectory. One
is thus led to the inverse kinematics problem. For the first three
joints of the Puma robot, the solution can be found directly in [4]
where a position vector p = (Px' Py' Pr}t which points from the origin of
the arm coordinate frame (x o' Yo' zo) to the distal end of link 3, now
serves to determine the desired angular position ad i for each joint (see
figure 2). The simulation subroutine called CAR-POL, associated to the
listing below, performs the calculations.
159

sb:=sqrt(l-cb*cb);
bl:=sa*cb+(arm*elbow)*ca*sb;
b2:=ca*cb-(arm*elbow)*sa*sb;
p[2] :=atn(bl ,b2);
REPEAT rl:=sqrt(px*px+py*py+pz*pz-c6*c6);
begin cc:=(a2*a2+(d4*d4+a3*a3)-rl*rl)
c6:=0.149;d4:=0.433; /(2*a2*sqrt(d4*d4+a3*a3»;
a2:=0.431; sc:=arm*elbow*sqrt(l-cc*cc);
a3:=·0.02032; sb:=d4/(sqrt(d4*d4+a3*a3»;
cb:=abs(a3)/sqrt(d4*d4+a3*a3);
( Cartesian position reading } bl:=sc*cb-cc*sb;
readln(fl, t,px,py,pz); b2: =cc*cb+sc*sb;
p[3]:=atn(bl,b2);
{ Inverse kinematic calculation }
arm:=l; { Recording of joint coordinates }
bl:=(-arm*py*sqrt(px*px+py*py-c6*c6) -
px*c6)/(px*px+py*py); writeln(f2, t, I I ,p[l], I I ,p[2], I I ,p[3]);
b2:=(-arm*px*sqrt(px*px+py*py-c6*c6) + end;
PY*c6)/(px*px+py*py); UNTIL eof(fl) ;
p[l] :=atn(bl ,b2);
elbow:=l;arm:=l ;
sa:=-pz/(sqrt(px*px+py*py+pz*pz-c6*c6»;
ca:=-arm*sqrt(px*px+py*py-c6*c6)
/(sqrt(px*px+ py*py+pz*pz-c6*c6»;
cb:=(px*px+py*py+pz*pz-c6*c6+a2*a2-(d4*d4
+a3*a3»/(2*a2*sqrt(px*px+py*py+pz*pz
-c6*c6»;

The vector 9d (t) can now be input to the robot controller.


2.2.2. Desired Joint Ve7ocities.
In order to also input the desired joint velocities to the control
system, one needs to revert to the inverse kinematics formulation but
this time replacing Px' Py, Pz with vx' vy, vz.
Then the CAR-POL subroutine serves a second time for transforming vx '
Vy~ V z obtained by means of VIT -OER into the desired joint velocities
• a • d • d
91 (t), 92 (t), 93 (t).
Both 9d and ad are now ready to serve as input vectors to the robot
controller.
3. Robot Control
The module called COM-ROB comprises routines for simulating the control
algorithms, the actuators, the dynamics of the robot and the direct
kinematics.
160

3.1. CONTROL ALGORITHM


As the whole simulation is designed for testing control algorithms, COM-
ROB allows for substitution of the control algorithms subroutines.
Descri pt ions of different algorithms take place in the fifth sect i on.
For now, it suffices to say that these subroutines must begin with the
jOint position and velocity errors and end with the control signals y
which constitute the inputs to the actuators.
3.2. ACTUATOR MODEL
In the Puma, armature controlled DC-motors coupled to gear reducers are
used as actuators. The block diagram of one of the actuator is shown on
figure 3 where the symbols are defined as follows:
Laplace variable
torque constant of motor, in N-m/amp
Armature current in amperes
e.m.f. constant of motor, in volts/rad/sec
armature resistance of motor
armature inductance of motor
voltage applied of the armature of motor
gear ratio of motor
torque generated at the output of the gear reducer and
applied to the associated joint, in N-m.

- :
MOTOR
---- ------ -- -- - -------------------,
torqGe GEAR Fm El

L1 1:E~R 1_--------'
,-----------,

-KY-~r-v-\a-/-(R-+-S0l_t_~-
u ~AANIPULATOR
f7
e,-
REDUCER

co::te'l_ •
e.m.f. REDUCER
._--------------------------------

Figure 3: Actuator Model


161

The subroutine MOTORS computes the torques applied to the manipulator


joints. First the armature current must be determined. The DC motor
model
u - Kb nm 9 = RI + L dI/dt (3.1 )

once discretized and rearranged for programmation, yields

where k is the sampling instant and T is the sampling period.


But the joint velocity, 9, has yet to be computed. This will be done
later by a subroutine calle~ DYN-ROB which simulates the manipulator.
For now let us assume that 9 is available in memory. Once the armature
current is computed, a test checks wether or not it exceeds the physical
limits. Then the output torque is simply computed as follows
(3.3)

This ends the subroutine MOTORS. The listing appears below:


Procedure MOTORS(u,v:vec_real;varuO:vec_real; { Test of current limit }
dt:real;var fm:vec_real); IF abs(cour[n]»limite THEN
begin
Begin{MOTORS} writeln('saturation of joint ',n,'
{ Setting of initial motor parameters } cur rent = " cur [n]) ;
( Motor torque calculation ) end;
n:=1; n:=n+1;
end;
REPEAT UNTIL n>nnax ;
begin n:=1;

( Net armature voltage ) End; (MOTORS)


urn] :=u[n] -v[n]*kb*nm[n];

( Armature current )
cur[n]:=(u[n]*dt+l*uO[n])/(r*dt+I);
( Motor torque )
fm[n] :=cur[n]*nm[n]*lca;
uO[n] :=u[n];
162

4. ROBOT DYNAMICS
4.1. MANIPULATOR MODEL
The Lagrange formulation [5] of the manipulator dynamics gives a
convenient simulation model:

Em = 0(9) ~ + £(9,9) (4.1 )


where,
Em vector of torques applied on each joint by their
associated actuator
0(9) inertia tensor (matrix)
"
~ joint acceleration vector designated "ace" in the
listing
£(9,9) the sum of vectors representing the Coriolis and
centrifugal accelerations as well as the gravity
effects and the friction torques.

0 11 012 013
0(9) 012 022 023
013 023 033

ett 02e
1 + 01 + fbI + fci
P(9,e)= e0 e + °2 + fb2 + fc2
a03a
t + 03 + fb3 + fc3

0111 0112 °113


0112 0122 0123
0113 0123 0133

0211 0212 0213


0212 0222 0223
0213 0223 0233
163

0 311 0 312 0 313


03(9} 0 312 0 322 0 323
0 313 0 323 0 333

and fbi viscous friction of joint


fC I coulomb friction of joint
The subroutine MOD-ROB actually performs the computations related to the
robot dynamics. The explicit form of matrix 0 has been obtained through
calculations from equation 6.66 of [6] and the numerical values for the
Puma 560 are taken from [7] except for the val ues of fbi and fC I which
must be assumed.
Since the joint position is the desired output, equation (4.1) can be
rewritten as follows:

(4.2)
In order to solve for 9, a double integration needs to be performed. But
this constitutes the task for another subroutine called OYN-ROB to be
described in the next section. Further to the above considerations, the
listing of MOD-ROB has been developed, and is shown below:
Procedure MOD_ROB (fm,p,v:vec_real; di j [1,3]:"' 013*( (x3*d3 + a3*y3 + a3*d3)*823 -
vartau:vec_real;var acc:vec_real;mc :real); z3*d3*c23) ;

di j [2,3]:= m3*«a2*x3 +a2*a3)*c3 + a2*z3*s3


+2*a3*x3 + a3*a3 +k3yy);
{ Robot dynamics calculations }
di j[2,1]:= di j[1,2];
dij[1,1]:= ia1 + m1*k1yy di j[3,ll:= di j[1,3];
+ m2*(k2xx*s2*s2 + k2yy*c2*c2 + a2*a2*c2*c2 + di j [3,2]:= di j [2,3];
2*a2*x2*c2*c2)
+ m3*(k3xx*s23*s23 + k3zz*c23*c23 + d3*d3 + dijk[1,1,1] :=0;
a2*a2*c2*c2
+ a3*a3*c23*c23 + 2*a2*a3*c2*c23 dijk[1,1,2] :=
+ 2*x3*(a2*c2*c23 + a3*c23*c23) m2*«k2xx-k2yy-a2*a2-2*a2*x2)*c2*s2-a2*y2*c22
+ 2*y3*d3 + 2*z3*(a3*c23*s23 + a2*c2*s23»; )
+m3*(k3xx*(c2*s2+c3*s3-2*s2*83*s23)
dij[2,2]:= ia2 + m2*(k2zz + a2*a2 + 2*a2*x2) +k3zz*(2*s2*s3*s23-c2*s2-c3*s3)
+m3*«2*a2*a3+2*a2*x3)*c3 + 2*a2*z3*s3 + k3yy +x3*(-2*a2*c2*s23+4*a3*s2*s3*s23+a2*s3-2*a3*c
+ a2*a2+a3*a3+2*a3*x3); 2*s2-2*a3*c3*s3)
+z3*(a2*c2*c23-a2*s2*s23+2*a3*c23*c23-a3)
dij[3,3]:= ia3 + m3*(k3yy + a3*a3 + 2*a3*x3); +a2*a3*s3-2*a2*a3*c2*s23-a2*a2*c2*s2+2*a3*a3*
82*83*s23
dij[1,2]:= m2*a2*z2*s2 -a3*a3*(c2*s2+c3*s3»;
+ m3*«d3*x3 + a3*y3 +a3*d3)*s23 dijk[1,1,3] :=
+ (a2*y3 +a2*d3)*s2 - d3*z3*c23); m3*(k3xx*(c2*s2+c3*s3-2*s2*s3*s23)
+k3zz*(2*s2*s3*s23-c2*s2-s3*c3)
164

+x3*(4*a3*s2*s3*s23-2*a3*c2*s2-2*a3*c3*s3-a2* { Final torque for acc =0 }


c2*s23)
+z3*(2*a3*c23*c23+a2*c2*c23-a3) tau[IJ:= fm[1J - di [1J
+2*a3*a3*s2*s3*s23-a2*a3*c2*s23-a3*a3*c2*s2-a -dijk[I,I,I]*v[I]*v[l]
3*a3*c3*s3); -2*dijk[I,I,2]*v[11*v[2]
-2*dijk[I,I,3]*v[11*v[3]
dijk[I,2,1J :=dijk[I,I,21; -2*dijk[I,2,3]*v[2]*v[31
-dijk[I,2,2]*v[2]*v[2]
dijk[I,2,21:= m2*a2*z2*c2 -dijk[I,3,3]*v[3]*v[3];
+m3*(d3*z3*s23+(d3*x3+a3*y3+a3*d3)*c23);
tau [2] : = fm [2] - di [2]
di jk.[I,2,31:= -dijk[2,I,I]*v[I]*v[l]
m3*(d3*z3*s23+(d3*x3+a3*y3+a3*d3)*c23); -2*dijk[2,2,1]*v[I]*v[2]
-2*dijk[2,3,1]*v[I]*v[31
dijk[I,3,1J:= dijk[I,I,31; -2*dijk[2,2,3]*v[2]*v[3]
dijk[I,3,21:= dijk[I,2,31; -dijk[2,2,2J*v[2]*v[2J
-dijk[2,3,31*v[3]*v[31;
dijk[I,3,31 :=
m3*(d3*z3*s23+(d3*x3+a3*y3+a3*d3)*c23); tau[3]:= fm[3] - di [3]
-dijk[3,I,I]*v[11*v[1J
dijk[2,I,1J:=-dijk[I,I,2]; -2*dijk[3,2,IJ*v[11*v[2]
di jk [2,1,2]:= 0; -2*dijk[3,3,1]*v[11*v[3]
dijk[2,I,3]:= 0; -2*di jk [3,3, 2] *v[21 *v[3]
dijk[2,2,1J :=dijk[2,I,2]; -dijk[3,2,2]*v[2]*v[2]
dijk[2,2,2J:= 0; -dijk[3,3,3]*v[31*v[3];

dijk[2,2,31 := { Acceleration calculation }


m3*«-a2*x3-a2*a3)*s3+a2*z3*c3); { 1- Inversion of Dij }
i_dij[I,I] :=
dijk[2,3,1J:= 0; di j [2,2J*di j [3,31-di j [3,2] *di j [2,3J;
dijk[2,3,2]:= dijk[2,2,3]; i_di j[I,2]:=
-di j [2,1] *di j [3,3] +di j [3,1] *di j [2,3];
di jk [2,3,3]:= i_di j [1,3]:=
m3*«-a2*x3-a2*a3)*s3+a2*z3*c3); di j [2, 11*di j [3,21-di j [3, 11*di j [2,2J;
i_di j [2,2]:=
di jk[3,I,11 :=-di jk[I,I,3]; di j [I,I]*di j [3,3]-di j [3, 1]*di j [1,3];
dijk[3,I,2] :=-dijk[2,I,3]; i_di j [2,3]:=
di jk [3,1,3]:= 0; -di j [I,1J*di j [3,21+di j [3,1J*di j [1,21;
dijk[3,2,1J:= dijk[3,I,2]; i_di j [3,3]:=
di jk [3, 2,2] :=-di jk [2, 2,3]; di j[I,I] *di j [2,21-di j [2,1] *di j [1,2];
dijk[3,2,3]:= 0; i_dij[2,1]:= i_dij[I,21;
di jk[3,3,1]:= di jk[3, 1,3]; i_dij[3,1]:= i_dij[I,31;
dijk[3,3,2]:= dijk[3,2,3]; i_dij[3,2]:= i_dij[2,31;
dijk[3,3,31:= 0;
{ Determinant calculation }
di [1]:= 0; det dij := dij[I,I]*i dij[I,I] +
dij[I,2]*i_dij[I,2] -
di[2]:=-m2*g*«x2+a2)*c2-y2*s2) + dij[I,31*i_dij[I,3];
-m3*g*(x3*c23+z3*s23+a3*c23+a2*c2);
{ 2- Acceleration }
di[3]:= -m3*g*(x3*c23+z3*s23+a3*c23); acc[I]:=I/det_dij*(i_dij[I,11*tau[IJ +
i_dij[I,2]*tau[2] + i_dij[I,3]*tau[3]);
{ Effects of friction forces }
di [1]:= di [1J+ibl*v[I]+icl*sgn(v[1J);
di[2]:= di[2]+ib2*v[2]+ic2*sgn(v[21);
di[3]:= di[3]+ib3*v[3]+ic3*sgn(v[3]);
165

acc[2):=1/det_dij*(i_dij[2,1)*tau[1) + acc[3):=1/det_dij*(i_dij[3,1)*tau[1) +
i_dij[2,2)*tau[2) + i_dij[2,3)*tau[3); i_dij[3,2)*tau[2) + i_dij[3,3)*tau[3);

End; {MOD_ROB>

4.2. INTEGRATION OF JOINT ACCELERATION


The joint acceleration being the desired end result of the robot
dynamics simulation, a double integration must be performed in order to
feedback the actual joint velocity and position to the controller. A
Runge-Kutta fourth order method serves for that purpose. Its algorithmic
form has been used in the subroutine DYN-ROB and can be recognized in
the listing below:
Procedure DYN_ROB(fm:vec_real;varp: MOD_ROB(fm,px,vx,tau,acc,mc);
vec_real;var v:vec_real;dt,mc:real); n:=1;
REPEAT
begin
f3x[n):=h*(v[n)+f2y[n)/2);
( Initial values > f3y [n) : =h*acc [n) ;
h:=dt; px [n) : =p [n) +f3x [n) ;
n:=1; vx[n) :=v[n)+f3y[n);
px[l] :=0;px[2) :=0;px[3) :=0; n:=n+1;
vx[1):=0;vx[2):=0;vx[3):=0; end;
MOD_ROB(fm,p,v,tau,acc,mc); UNTI L n>nnax ;
n:=1; MOD_ROB(fm,px,vx,tau,acc,mc);
REPEAT n:=1;
begin REPEAT
f1x[n) :"'h*v[n); begin
f1y[n):=h*acc[n); f4x[n):=h*(v[n)+f3y[n);
px[n):=p[n)+f1x[n)/2; f4y [n) : =h*acc [n) ;
vx [n) : =v [n) +f 1y [n]f2; n:=n+1;
n:=n+1; end;
end; UNTIL n>nnax ;
UNTI L n>nnax ;
( Velocity and positional calculations >
MOD_ROB(fm, px,vx, tau, acc,mc); n:=1;
n:=1; REPEAT
REPEAT begin
begin p[n) :=p[n)+(f1x[n)+2*f2x[n)
f2x[n) :=h*(v[n)+f1y[n)f2); +2*f3x [n) +f4x [n) )/6;
f2y[n):=h*acc[n); v[n):=v[n)+(f1y[n)+2*f2y[n)
px[n):=p[n)+f2x[n)/2; +2*f3y[n) +f4y [n) )/6;
vx[n):=v[n)+f2y[n)/2; n:=n+1;
n:=n+1; end;
end; UNTIL n>nnax ;
UNTI L n>nnax ;

As a compromise, values for "h" between 4 and 20 milliseconds allow for


acceptable simulation speed and computation accuracy.
166

4.3. EFFECTOR TRAJECTORY IN THE WORKSPACE


The symbols p and v in DYN-ROB, as seen above, stand for 9 and 9, the
joint position and velocity. In order to obtain the actual trajectory of
the end effector (distal end of link 3), one needs to solve the direct
kinematics problem.
Thi s requi res to determine the homogenous transformat ion matrix °T4
which is the chain product of the successive coordinate transformation
matrices
(4.3)
where
cos9 i -cosaisin9 i sina isin9 i aicos9 i
sin9 i cosaicos9 i -sinaicos9 i aisin9 i (4.4)
o sinai cosa i di
o o o 1

and
c i = cos 9i
si = sin 9i
for the case of the Puma (figure 2), the derivation of the i- 1Ai matrices
gives

c1 0 -Sl 0
Sl 0 c1 0
0 -1 0 0
0 0 0 1

c2 -s2 0 a2c2
S2 c2 0 a2s2
0 0 1 0
0 0 0 1

(4.5)
c3 0 -S3 a3c3
s3 0 c3 a3s3
0 1 0 d3
0 0 0 1
167

1 0 o o
o 1 o o
o 0 1 o
o 0 o 1

From the fourth column of °T4, the coordinates of the distal end of link
3 relative to the base are found to be
px = a3c1c23 + d4c1s23 + a2c1c2 - d3 s 1
py = a3 s 1C 23 + d4 s1 S23 + a2 s 1c2 + d3c1 (4.6)
pz = -a3 s23 + d4c23 - a2 s2
where
C 23 = cos( 92 + 93 )
S23 = sin ( 92 + 93 )

Table 4.1 contains the numerical values for the Puma-S60 parameters [7]
needed in equation 4.6 to compute the trajectory in workspace.
Table 4.1 Parameters of the Puma-S60

Frame 9; 0; a; d;

1 91 -90 0 0
2 92 0 0.4318 0
3 93 90 -0.0191 0.1505
4 0 0 0 0.4331

From equation 4.6, a short subroutine called POL-CAR was written. Its
1isting follows:
Procedure POL_CAR(positionjp,pos_car:string); { Manipulator parameters }
d3:=0.149;d4:=0.433;
a2:=0.431;
83:=-0_02032;
REPEAT { Readings of polar coordinates }
begin readln(f1, t,p[ll,p[21,p(31);
{ Cartesian coordinates calculation }
168

px:=a3*c1*c23+d4*c1*s23+a2*c1*c2-d3*s1; { Recording of cartesian coordinates }


py:=a3*s1*c23+d4*s1*s23+a2*s1*c2+d3*c1; writeln(f2, t,px,' ',py,' ',pz);
pz:=-a3*s23+d4*c23-a2*s2; end;
UNTIL eof(f1) ;

5. Some Control Algorithms


As examples of different control strategies that can be tested and
compared with the simulation program described in the previous sections,
four types of control laws are now briefly summarized.
For comparing performances allowed by different algorithms, two straight
line trajectories were chosen. One originates at (-0.5, 0.4, 0.1) and
terminates at ( 0.7, 0.4, 0.1) while the second one starts at (0.1, 0.2,
0.1) and ends at (0.1, 0.65, 0.1). Those coordinates are relative to the
inertial frame Xo' Yo, Zo shown on figure 2.
The objective being to compare different types of robot controllers, the
simulation program, due to its modularity, allows for changing control
law without requiring further modifications. Therefore a whole series of
control algorithms can be tested at once. The control algorithms
presented below were so tested in previous studies.
5.1. PIO CONTROLLER
The PIO compensator usually serves as a basis for comparisons when one
wants to establish the merits of a new algorithm. Besides, this is the
type of controller used in industrial robots. The PIO controller
studied hereafter consists actually of a PI controller with tachometric
feedback. Figure 4 shows its block-diagram: there is one servoloop per
degree of freedom without any interact ion compensat ion. Th is
corresponds to the controller configuration of most industrial robots.
169

~u
p (8,8)

__ K_a--.J~
I
!-r----l8
ROBOT
f 8

Figure 4: PID Controller

From the block-diagram, one can write the control algorithm:


(k-1)T
u Kp (9 - 9d ) + '<v (9 - 9d ) + Ki I (9 - 9d ) (5.1)
j=O
where
Kp proportional gain
Ky derivative gain
Kj = integral gain
T sampling period
nT = actual sampling instant
For tuning the controllers, suggested initial values are
Kp = 80 Kj = 12

In the short sub-routine presented below and associated to equation 5.1,


p and v stand respectively for 9 and e.
Procedure PID (p,pd,v,vd:vec_real; REPEAT
var u:vec_real;var s_err:vec_real;var begin
err:vec_real;kc :mat_real); err [n] :=pd[n]·p[n];
s_err[n]:=s_err[n]+err[n];
Begin{PID} n:=n+1;
end;
{ Error and error integral Calculations } n>nmax ;
UNTI L
n:=1; { Control variable calculation }
170

n:=1; n:=n+1;
REPEAT end;
begin UNTIL n>nnax ;
urn] :=kc[n,1]*err[n]+kc[n,2]*(vd[n]-v[n])+
kc [n,3]*s_err [n]; End;{PID}

5.2. POSITIVE-CURRENT FEEDBACK [8]


Like the PIO controller, the positive-current feedback (PCF) controller
permits only to servo each degree of freedom separately. The
interact i on between the 1inks is seen as a disturbance, at the loop
level. The design philosophy counts on the good disturbance rejection
capabil ity of PCF to obtain improved performances. From the block-
diagram of figure 5, an expression for a control algorithm is obtained:
(5.2)

where:
Ki = current gain
F m = torque at t = (k - l)T as computed by the motor subroutine
nm = gear ratio
Ka = motor torque constant

pee,S)

K_O~~
L -_ _
t- ROBOT
G T_G
lLJ

Figure 5: PCF Controller

The subroutine for the PCF algorithm is a particular case of the next
algorithm. Therefore no listing is given for now. Suggested numerical
values for initially tuning the controller are Kp = 80 Kv = 10 Ki = 8.
171

5.3. POSITIVE-CURRENT FEEDBACK + PID [9]


Further improvements can be obtained by combining both PCF + PID.
Besides, anticipative action can be~ added to the PID for still better
performances. The resulting controller configuration appears as a block
diagram in figure 6. The control algorithm, now called PID/CF, becomes
(K-l)T
u = Kl (ed-e) + K2 (ed-e) + K3[fnl(n mKa )] + K4 }; (ed-e) + Ks ed (5.3)
j=O
where
Kl proportional gain
K2 derivative gain
K3 current gain
K4 integral gain
Ks anticipative gain

'----------1 K2 1 - - - - '

Figure 6: PCF + PID Controller Block-Diagram

Suggested values for initially tuning the controller are


Kl = -3
Ks = -28

The following listing permits to simulate that controller.


Procedure RPC (p,pd,v,vd:vec_real;var
u:vec_real;var s_err:vec_real;var { Error and error integral Calculations
err:vec_real;kc :mat_real; }
fm:vec_real); n:=1;
REPEAT
Begin{RPC} begin
172

err[n] :=pd[n]-p[n]; REPEAT


s_err[n]:=s_err[n]+err[n]; begin
n:=n+1; u[n] :=kc [n, 1]*err [n]+kc [n,2]*(vd[n]-v[n])+
end; kc[n,3]*(fm[n]/nm[n]/ka) +
UNTI L n>max ; kc[n,4]*s_err[n]+ kc[n,5]*pd[n];
{ Control variable calculation } n:=n+1;
n:=1; end;
UNTI L n>max ;

End;{RPC}

5_4. AUTO-ADAPTATION BASED ON ACCELERATION FEEDBACK [10]


An auto-adaptive algorithm for compensating the torques due to the
Coriolis and centrifugal acceleratipns proposed in [10] and revisited in
[11] needs that a, bes ides a and a be fedback. For that reason, thi s
compensation is named AAF. The block-diagram of figure 7 describes its
structure.

-
e
e
Figure 7: AAF Controller

All the variables shown are vectors: the algorithm, this time, takes
into account the interaction between the degrees-of-freedom. As before,
position and velocity feedbacks are used but the new feature comes from
the generation of the compensat i ng torque tc' Idea 11 y, the matri x Dc
must be identical to the inertia tensor of the robot. However, keeping
Dc constant does not deteriorate appreci ably the performances. Hence,
the algorithm becomes:
(5.4)
where,
173

Kcl proportional gains (diagonal matrix)


Kc2 derivative gains (diagonal matrix)
and
tc (k) = tc (k-l) + Dc (y - ~) (5.5)
finally
(5.6)
In the listing below, numerical values are suggested for Dc' if assumed
constant. Past experience suggests, for initial tuning, to take these
numerical values

KC2 5

The associated subroutine can be implemented by means of this listing:


Procedure ARA (p,pd,v,vd:vec_real;var { Compensating torque calculation }
u:vec_real;var err:vec_real;kc :mat_real); n:=1;
REPEAT
Begin{ARA} begin
tc [n] :=tc [n] -di j [n, 1] *(acc [1] -uc [1] )
{ Error Calculation } -di j [n,2] *(acc [2] -uc [2] )
n:=1; -dij [n,3]*(acc[3] -uc[3]);
REPEAT n:=n+1;
begin end;
err [n,O] :=pd[n] -p[n]; UNTIL n>I'IIIIIX ;
n:=n+1; { Control variable calculation }
end; n:=1;
UNTIL n>I'IIIIIX ; REPEAT
begin
{ Calculation of uc } u[n] :=di j [n, 1]*uc[1]+di j [n,2]*uc [2]
n:=1; +di j [n,3] *uc [3] +tc [n];
REPEAT n:=n+1;
begin end;
uc[n] :=kc[n.1]*err[n] UNTIL n>I'IIIIIX ;
+kc [n,2]*(vd[n] -v[n] );
n:=n+1; End; {ARA}
end;
UNTIL n>I'IIIIIX ;

6. Conclusions
More on these algorithms and thei r performances can be found in the
references as their analysis does not presently constitute the
objective.
174

The software described in the previous sections was prepared with great
care for modularity thus favoring the ease of modifications. In
particular, it permits one to replace compensators at will without
having to rewrite other parts of the program. In this way, the need to
compare the performances of the various robot control algorithms
proposed in the literature can be satisfied.
Moreover, the lack of benchmark model for testing robot control
algorithms has been remedied. The situation had forced authors to use
hypothetical models for demonstrating the goodness of their controllers
leaving the reader with no direct basis for comparisons. But,
simulating a real robot (like the Puma 560 for which all the data are
available) undergoing the task to move objects along the straight line
trajectories proposed in this paper, offers a definitive benchmark
solution.

REFERENCES
[1] ANONYMOUS (1986), CATIA User's Manual - Robotics Module, Dassault
Systems, Suresnes France.
[2] WALKER M.W. and ORIN D.E. (1982), "Efficient Dynamic Computer
Simulation of Robotic Mechanisms", ASME Journal of Dynamic
Systems, Measurement and Control, 205-211.
[3] NEUMAN C.P. and MURRAY J.J. (1987), "The Complete Dynamic Model
and Customized Algorithms of the Puma Robot", IEEE Trans. on
Systems, Man, cybernetics, vol. SMC-17, no. 4, 635-644.
[4] FU K.S., GONZALEZ R.C. and LEE C.S.G. (1987), Robotics: Control,
Sensing, Vision and Intelligence, McGraw-Hill, New York.
[5] KOIVO, A.J. (1989), Fundamentals for Control of Robotic
Manipulators, Wiley, New York.
[6] PAUL R.P. (1981), Robot Manipulator: Mathematics, Programming, and
Control, MIT Press, Cambridge, Mass.
[7] TARN T.J., BEJCZY A.K., HAN S. and YUM X. (1985), "Inertia
Parameters of Puma 560 Robot Arm", Robotics Laboratory Report SSM-
RL-85-02, Saint-Louis.
[8] O'SHEA J. and TURGEON, A.B. (1988), "Benefits of Positive Current-
Feedback Inner Loop for Robot Servos", IFAC Proceedings on "Theory
of Robots", Pergamon Press, Oxford, 185-191.
175

[9] O'SHEA J •• BENALI R. and TURGEON A.B. (1990), "An Improved Robot
Controller Based on Positive Current Feedback Plus PIO", IMACS
Proceedings of MIM-S2 '90, Brussels.
[10] CHAUDET R. and O'SHEA J. (1983), "Val idation of an Adaptive Robot
Control Structure by Means of Simulation", IMACS Proceedings on
"Simulation in Engineering Sciences", Elsevier Pub., 411-416.
[11] GOREl R. and O'SHEAJ. (1989), "Robot Positionning Control: Survey
and Comparative Analysis", Proceedings Incarf 89: 4th Int. Conf.
on CAD, CAM, Robotics and Factories of the Future, vol. 1, New-
Delhi, 639-648.
CHAPTERS

COLUSION STRATEGIES FOR ROBOT RETREAT


AND RESISTANCE

R.E. GODDARD, K.L. BOYER, and H. HEMAMI


Department of Electrical Engineering
The Ohio State Universit:1J
Columbus, Ohio 43210

Abstract

The best collision strategy for a robot is avoidance. If avoidance is not possible or feasible
the next best strategy, under certain circumstances, may be 1) partial judicious retreat
in order to ameliorate the consequences of collision, or 2) to resist the collision impact in
order to maintain stability. The retreat and resistance strategies share common elements of
design: sensory requirements, processing and transformations modules, and control struc-
tures. Much of the system design is amenable to microprocessor based implementation.
The subject of this chapter is to present the technical implementation of both of these
strategies via an example, and to discuss a microprocessor realization of them. The article
is divided into six: sections. Section 1 introduces the two strategies and the technical issues
via an example. Section 2 is concerned with pre-collision sensing and estimation require-
ments. Section 3 deals with the sensory requirements after collision. Section 4 presents
the control requirements. Section 5 offers some illustrative numerical simulations of re-
treat or resistance. Section 6 summarizes the major attributes of the retreat and resistance
strategies.

1. Introduction

Collision avoidance for robots has been the subject of many recent studies[1,2,3,4]. In
most of these studies, the robot is in full control of th(; situation and total avoidance of
the collision with the environment is planned in advance. When total avoidance is not
possible or feasible, the next best strategy, under certain circumstances may be partial
retreat in order to ameliorate the consequences of collision or to resist the collision impact
and dissipate the collision energy in order to maintain stability. Examples can be drawn
from the field of human activity in general, and athletics in particular, to illustrate partial
retreat or resistance. In soccer or football a player may retreat from collisions in order
to maintain speed or possession of the ball. A wrestler must deliberately resist collision
or forces that he encounters in contact with the opponent, and must sustain his postural
stance against steady force exerted by the opponent. There may also be interim maneuvers
between the extremes of resistance and retreat. The athlete (robot) may have to reorient
itself, i.e., change its attitude, in order to negotiate a retreat or to resist more effectively.
Many of these reactions are more accurately characterized as reflexive than as preplanned,
or structured, activities.
177
178

If the resistance or retreat is local, i.e., the magnitude of the force or the resulting distances
in retreat are small, both situations can be viewed as extreme cases of the mechanical
impedance of a linear system. Suppose the robotic system is linearized about an equilib-
rium point from which retreat or resistance is to take place, and suppose an analogy is made
between force and voltage on the one hand and velocity and current on the other. Then
the classical concept of impedance can be extended to the linearized robotic system[5,6J. In
resistance the robotic system should display high impedance (it should exhibit stiffness) in
the direction of the force. This means small velocity and, consequently, small displacement
should result. In retreat, small forces should produce high withdrawal velocity and corre-
spondingly large displacement. In this case, the system presents low impedance (exhibits
compliance) in the direction of the applied force.

In this chapter the common sensory, processing, and control requirements of both strategies
are discussed via an illustrative example. The example is a four segment robot anchored
to the ground with a state space of the angles and angular velocities of the segments with
respect to an inertial frame of reference.

2. Pre-Collision Visual Sensing

In this section we will describe a strongly goal-oriented stereo vision system charged with
the task of establishing the possibility of collision with an inbound airborne projectile. From
an input sequence of stereo video image pairs, the system produces a current best estimate
of:

• Collision likelihood
• Impact position
• Impact vector
• Impact velocity

Such a system could form a component of a larger, more general, VISIon system. The
restriction to visual sensing merits further discussion. Progress has been made recently in
sensor design and integration [7,8,9J. We limit the sensory configuration to focus on its
contribution and requirements.

The processing demands imposed by this particular problem dictate a special case solution;
such a system should necessarily be, in some sense, minimalist. By this we mean the system
should construct a scene description just sufficiently rich to solve the problem at hand
and should do no more processing than is absolutely necessary. In addition, the imaging
resolution should be just sufficient. Extracting additional information and constructing
179

higher level scene representations wastes energy and computational resources and injects
an unnecessary degree of complexity, increasing the likelihood of malfunction.

This goal directed component approach to visual processing seems to have at least an ap-
proximate counterpart in the evolution of biological vision systems [10]. Biological systems
form representations and extract descriptions on a scale commensurate with the organism's
ambulatory capabilities and adapted to its specific needs!. Simple organisms which cannot
move far or fast need only sense the local environment; it is all the information they can
effectively use. Additionally, limited reasoning capability which does not allow for long
range planning or communications would be incapable of dealing with such information.
Distant vision capability is really only useful in a task planning operation as opposed to a
reflexive response to local conditions. The subject action of this chapter falls more into the
category of reflex than conscious planning. The system should not operate at any level of
abstraction which is not absolutely necessary.

2.1 BASIC SENSORY SYSTEM CONFIGURATION

The major requirement for either resistance or retreat is a reasonably accurate estimate of
the object's velocity vector and point of impact. If the objective is to resist the impact,
the robot will also require a priori knowledge of the mass of the projectile and whether the
force of the collision will be impulsive (short duration) or steady (sustained or lingering).
We are not supposing that the latter two pieces of information can be garnered from the
visual input, so they are assumed known from some other source.
We envision a system consisting of a stereo pair of video cameras mounted on the robot's
head 2 • The stereo vision system allows us to do triangulation ranging to recover the distance
to the inbound projectile. In building a (general purpose) visual stereo system, two problems
are paramount:

The Correspondence Problem: The identification of corresponding scene features in


the two images.
Camera Calibration: The precise determination of the position and orientation of each
camera relative to the other or to a world coordinate frame, as well as any significant
geometric distortions in the lenses or resulting from the physical construction of the
cameras.

The solution to the correspondence problem is required to establish the local triangulation
used to compute the location of physical scene features. Camera calibration allows the
triangulation to be accomplished in real coordinates. Additionally, a well-calibrated camera
lFor example, the vision system of the frog, which is concerned mostly with acquiring food, is little more
than a "fly detector."
2In this sense, the design is blatantly anthropomorphic.
180

system is needed for many stereo correspondence algorithms, since this provides geometric
constraints which are used to direct the search process( es).

The stereo correspondence problem has been pursued by many researchers [11-25] 3. These
techniques generally operate at the pixel (point) based level with little aid or guidance from
any sort of structural representation of the scene content. As an exception, Boyer and Kak
[13] showed that the rich information content of a structural description of the scene may be
used in solving the stereo correspondence problem for a very simple class of scene (binary
skeletons); these experiments provided a reasonable first proof of concept for structural
stereo. None of the other systems cited above implement the notion of structural stereo
as defined by Boyer 4. In recent work, Horaud and Skordas [15] present a scheme in which
a relational graph is built for each image and then correspondence is recovered by finding
maximal cliques in a correspondence graph. Arcs in the correspondence graph represent
consistent matches with respect to the information encoded in the pair of relational graphs.

Most proposed solutions for stereo vision rely on the assumption that the images are regis-
tered in epipolar geometry. That is, scan lines are thought to be epipolar lines, or related
to them in a known way. For certain applications this condition is forced by aligning the
cameras perfectly parallel, although this condition is very difficult to achieve in practice.
However, even if it can be accomplished to acceptable accuracy, it is quite likely that this
condition will change in time. For example, in a moving robot physical disturbances and
vibrations may cause misalignment; the camera positions should be periodically surveyed
and readjusted if necessary. Although one of the authors (Boyer) is currently engaged in
research designed to recover correspondence from uncalibrated stereo camera systems, that
is a problem quite distinct from the one we consider here. For this discussion, we will
assume that the camera pair is calibrated with respect to an egocentric robot coordinate
frame.

With respect to the tracking or active vergence problem which is closely related, we also
point to some especially noteworthy work done by Ballard, et al.[16]. Cepstral filtering is
used to control the camera vergence and "track" an object in the field of view. Much of this
work is predicated on the observation that active eye and head movement can introduce a
great deal of useful information of a very succinct nature to aid in resolving ambiguity and
limiting the massive amount of computation typically associated with such problems.

At this point we take a departure from most prior stereopsis work, including our own,
and propose a system based on associative memory. The purpose of the memory is to
immediately associate a set of motor commands and other behavioral actions with a set of
input visual patterns in the two cameras. That is, rather than explicitly computing point
correspondences and object positions in world coordinates and trying to reason forward
from this information to a plan of action, we are trying to capture the essence of reflex
3We recommend Barnard and Fischler [11] and Kak [12] for excellent summary reviews of several classic
stereopsis algorithms.
4The term could reasonably be applied to the system of Ayache and Faverjon[14]' although it is actually
quite different and still requires a priori knowledge of the epipolar geometry.
181

behavior through the action of associative memory. The explicit construction of point
correspondences and 3D scene descriptions, followed by online velocity and point of impact
calculations, is prohibitively expensive from a computational point of view for the problem
at hand. Again, we are proposing a strongly goal directed system which can serve as a
building block; it is not general purpose in a standalone sense. Learned patterns on the
four image planes, left and right at two discrete but closely spaced instants in time, will be
used directly to infer the robot's reaction, if any is required. We will refer to such a set of
four images as an image foursome. This will be a continuing online process as the projectile
approaches and does not require simplifying straightline trajectory assumptions and the
like to make the problem tractable. Knowledge of the ballistic behavior of projectiles can
be learned synthetically offline and stored in the associative memory.

The task of the associative memory is first to determine whether an impact is imminent.
To do this effectively, the memory must be able to recognize with reasonable accuracy all
possible object position and trajectory combinations likely to cause such impact within its
field of views. The stereoscopic field of view for which we expect the associative memory to
be responsible is termed its domain of responsibility. Given an incoming projectile anywhere
within this volume of space, we expect that the associative memory will be able to recognize
the condition as a threat (if indeed it is) and associate with that recognized condition an
appropriate set of responses.

The essential concept behind an associative memory implementation of reflexive behavior is


this. We will store some sizeable set of reference patterns derived from possible input image
foursomes. Each of these patterns will describe a physical configuration of the domain of
responsibility for the memory. For our purposes, the description of the configuration of this
domain should include whether or not a potentially threatening projectile is present and,
if so, what its current position and velocity vector are. This does not mean that we need
to compute position and velocity explicitly. Rather, it means that the patterns we extract
should implicitly contain that information.

The set of patterns we store should effectively cover the domain of responsibility; holes in
the coverage will correspond to windows of vulnerability for the robot. The robot will be
incapable of appropriate action if the physical configuration of the domain of responsibility
is not recognized. Associated with each pattern is information specifying the appropriate
reflexive action based on the current state of the environment.

2.2 SYSTEM DESIGN CRITERIA AND ASSUMPTIONS

We have already outlined the general criteria of our goal oriented visual system. Speed
is certainly the most critical factor in its design. To this end, we are assuming that the
input scene is a single dark blob silhouetted against a bland, bright sky. Simple gray level
5 Forthis discussion, we will assume that the robot is not turning its head to scan the sky for threats,
although that could readily be incorporated as layer above our paradigm.
182

thresholding will be used to segment the images into binary form. We will also work with
limited resolution. We do not need to see the individual facets of the ball surface; neither
do we need precise delineation of the ball perimeter. Camera resolutions on the order of
500 X 500 will be substantially subsampled.

Nelson [17] outlines three criteria required of an associative memory system designed to
perform visual homing6 . Similar criteria applicable to our problem are as follows:

• The sensing and extraction of patterns from the domain of responsibility should be
such that the reference patterns and their associated actions vary smoothly with the
configuration of that domain. Simply put, if the projectile position and velocity are
perturbed only slightly, the resulting pattern and the robot's response should also
be perturbed only slightly. This allows a reference pattern to effectively span some
reasonable volume of the domain of responsibility and so limits the number of reference
patterns which must be stored. This concept is analogous to a Nyquist sampling
criterion with respect to the mapping from the physical domain of responsibility to
pattern space and, ultimately, the associated action space. We will see that the
physical aspects of our problem (i.e., fundamental ballistics) ensure that this condition
is easily satisfied.
• To limit the degree of false recognition, the probability that highly similar patterns
will be produced from highly dissimilar configurations of the domain of responsibility
should be low. Of course, this point is fairly obvious. Because of the simplicity we
assume in the visual input, the source of false matches in our case will predominantly
consist of segmentation errors in the original images, which will be determined by
the probability density function of the input noise. Because camera resolutions on
the order of 500 X 500 will be substantially subsampled, considerable averaging will
occur and the noise variance will be accordingly reduced. In general, this requirement
dictates that the mapping of raw data into patterns should be as nearly unique as
possible. Practically, there should be little or no repeated image structure or distinct
configurations of the physical domain which lead to redundant patterns.
• The patterns should be compact, easily generated, am! fast to compare because, re-
spectively, we will need to store a large number of them, we must produce them in
real time from the raw stream of input data, and we must compare a freshly computed
input pattern with all stored patterns of interest (what we mean by "of interest" will
become clearer later) within a time interval which allows real time performance.

Definitions

In keeping with Nelson's formalism, we define a pattern as an n-tuple P = (aI, a2, ... , an)
which is an element of pattern space, P:

(2.1)
6We have learned much from Nelson's work; in some ways the pre-collision sensing problem is a dual to
the homing problem which he has formulated very nicely.
183

where AI, . .. , An represent finite sets of features. That is, ak E Ak and an image foursome
is described by a set of discrete primitives. There is no relational information captured
explicitly in this formalism, in contrast to the parametric structural description formalism
of Boyer and Kak, or many of the other graph-theoretic representations. These more so-
phisticated representations are inappropriate for the reflexive action under consideration;
they are simply richer than necessary and too costly to consider here. This approach only
captures relational information implicitly in the positions of the primitives in the pattern
n-tuple.

We will refer to the stored patterns acquired in the training phase as reference patterns.
The pattern extracted from the incoming data as the system is in operation at any instant
of time will be referred to as the index pattern. The system will, in response to an index
pattern, produce that stored reference pattern closest to it, according to an appropriate
distance metric. If this reference pattern is sufficiently close to the index pattern, the
reflexive action associated with that reference pattern will be executed. This describes the
behavior of a rudimentary associative memory, essentially an inexact lookup table or nearest
neighbor classifier. For our problem, the appeal of such a configuration is in its simplicity;
it offers the minimalist characteristics that we require to meet our time demands.

2.3 SYSTEM DESIGN

We will now establish some of the pertinent design parameters of an associative memory
visual sensor processor such as we propose. This should not be taken as a definitive system
design, but rather as a preliminary investigation into the feasibility of implementing such a
system.

Pattern Construction and Imaging Resolution

The first item to be considered is the mapping from image foursomes to elements of pattern
space:
if>:IxIxIxI>--->P (2.2)
We will denote the four images in the foursome by h(tO),!R(tO),h(t1),!R(tI) to signify
left and right images at sample times to, tl'

We construct an 8-element vector formatted as shown in Figure 2.1. Each of the eight
components specifies the row or column location of the centroid of the I-valued pixels
(object pixels having value 1, background pixels having value 0) in one of the four images
in the foursome. In this case, each of the £II; in the pattern formalism above is a number:
ak E {O,l, ... ,N -I} = Ak, Vk = 1, ... ,8 (2.3)
and the size of our pattern space is N8,which is likely to be quite large, indeed. This is
desirable to increase the mean separation between the stored reference patterns and so to
control the false match probability. We are assuming that an online estimate of centroid
position can be computed, possibly optically.
184

left . Right

J I

@ e
I- -
;;':e
\ I
V
<t>


"- Time I
A
Time 1+1

Figure 2.1: Constructing elements of pattern space from an input image foursome.

We will compare our patterns using a distance metric built on the well-known chessboard
distance. The chessboard distance between two points PI = (r}, CI) and P2 = (r2, C2) is
defined as follows:
(2.4)
This distance metric has the endearing characteristic of being fast to compute and relatively
meaningful physically. This will be applied to the projectile centroid position in each of the
four images. This involves four calculations, each comparing two members of the pattern
8-tuples. We then assume the distances in each position to be approximately additive with
respect to physical discrepancies.

Let Bk represent the pair (Ak, Ak+1), k odd, with the Ak arranged such that the pairs
Bk correspond to centroid positions in (row, column) format. Then we define the pattern
distance as follows:
8
dp(Px,PII ) =E ds[Bk(Px),Bk(PII )) (2.5)
k=l
kodd
where Bk(Px ) stands for the value of Bk in pattern Px , which in turn represents the pattern
constructed from an image foursome corresponding to a configuration, x, of D.
We will now select an appropriate imaging resolution. Assuming the object to be detected
and tracked is something like a soccer ball, we first establish the relationship among the
object's diameter, its detectable range, and the image resolution. In Figure 2.2 we depict
the perspective transform in one dimension. Assume a field of view 'IjJ, focal length f, image
185

plane physical extent s, and sampling interval ~ over the image plane. For an object of
diameter d tangent to the optic axis at a range (from the optical center) rand subtending
an angle equivalent to that of a single image sample, we can write:

(2.6)

But it is also easy to see the following:

2~ = tan (~) .~ s = 2ftan (~) (2.7)

So the sampling interval is simply:

~ = !... =:
N
2f tan
N
(!I!.)
2
(2.8)

Combining these relationships, we get:


Nd
(2.9)

With a field of view of 45°, N = 32 samples, and d = 1 unit, the range rmax ~ 39 units. So
a soccer ball about one 1/3 meter in diameter can be just isolated at a range of about 13
meters with only 32 X 32 resolution. Although the range is not spectacular, we will design
our system around this modest resolution. As the projectile approaches, the perspective
transform naturally ensures that our estimate of its position and velocity will improve as
its image covers a greater portion of the image plane.

Under these conditions, our patterns are vectors of length 8 with 32 choices in each com-
ponent. The pattern space is of size 328 ~ 1.1 X 10 12 , which is certainly large.

In the discussion which follows, we will have occasion to use the 1/3 meter object diameter
approximation often in computing estimates of the allowable image plane motion and in
counting the number of patterns to be stored. The reader should bear in mind that we
are assuming something like a soccer ball for a projectile. Obviously, if we knew the exact
diameter of the projectile we could compute range directly from image plane size through
the perspective transform using only a single camera. Again, this discussion is more an
assessment of feasibility than a detailed design exercise.

Selecting the Reference Patterns: Geneml Principles

Given our definition of the pattern space, P, we next select a set of reference patterns,
R ~ P, to be stored together with the associated behavioral specifications. There are two
conditions imposed on the reference pattern selection:

Condition 1: The reference patterns should be drawn from configurations of the domain
of responsibility, D, in such a way that every "interesting" arrangement of this domain
186

r Optic AHis

Image Plane

Figure 2.2: Computing the range at which the projectile just occupies a single pixel.

produces an index pattern matching one of the reference patterns according to some
criterion. We use a threshold on the pattern distance. That is,

("Ix E D)(3y E ~-l(R» 3 dp(P"" PII ) ~ t (2.10)

where x represents a possible configuration of the domain of responsibility and y


represents a configuration for which a reference pattern has been stored. P"" PII are
the respective patterns computed by the mapping ~ applied to the image foursome
acquired under physical configurations x, y, and t is the maximum distance between
two patterns considered to match one another 7 •
This condition is designed to ensure that we have no windows of vulnerability in the
domain of responsibility. Should this condition not hold, there will be physical con-
figurations of D (Le., combinations of projectile position and velocity) which will not
be recognized and for which appropriate action cannot be taken. This does not mean
that all visible combinations of position and velocity must be accommodated in the
memory, however. We may restrict our attention to those physical configurations for
7To avoid circumlocution:, we will refer to the mapping ~ as taking us either from D to P or from the
image foursome to P, whichever is more convenient. There should be little cause for confusion because
the image foursome is so closely identified with the configuration of D. We also assume ~ to be invertible,
which holds in our case.
187

which collision is possible in the immediate future. In this sense, we refine our defi-
nition of the domain of responsibility. Any physical configuration of this space which
is known a priori to be non-threatening can be left unrepresented in the associative
memory. In effect, we are treating our "don't know" and "don't care" conditions
(ignorance and apathy) alike. If the system doesn't care about a given configuration
(because it cannot result in a collision) it is not necessary to recognize it.
For any y E ~-l(R), the set of all local x such that dp(Px , Py) ~ t is termed the
recognition neighborhood of y. Condition 1 states that the union of all the recognition
neighborhoods of the stored patterns must cover the domain of responsibility. It is
important to recall that the recognition neighborhood is a region of the domain of
responsibility and so is identified with physical quantities; it is not a neighborhood in
the abstract pattern space. For our problem, the physical quantities are the object
position and velocity vector, defining a six dimensional Euclidean space.
Condition 2: The mapping, ~, from the physical domain to the abstract pattern space
should be such that, if Px matches Py then x and y should be near one another in D.
That is, the mapping from the physical world through the sensing and pattern con-
struction process should produce similar patterns for similar physical configurations.
Mathematically:

(V'x)(V'y) 3 dp(P",Py) ~ t => d(x,y) < rex) (2.11)

where d(.,.) is a distance measure on the physical configuration of D and r(.) is a


function which assesses the accuracy of the recognition at each point in D. In our case,
d corresponds to a distance metric on the 6-dimensional representation of D in terms
of three object position coordinates and three motion vector components. Naturally,
the (possibly nonisotropic) Euclidean distance is the most obvious choice.

The function r(.) is defined in the physical space as is the recognition neighborhood. In-
deed, we want to design the system so that rex) at each point just includes the recognition
neighborhood and as little else as possible. Then, of course, we want the union of the
recognition neighborhoods to cover the domain of responsibility. Figure 2.3 illustrates this
relationship. Two simplified (two dimensional) domains are depicted. On the left is the
domain of responsibility, which is actually a 6-dimensional real Euclidean space correspond-
ing to the physical quantities of position and velocity. On the right is the pattern space,
which is actually an 8-dimensional space (four row positions, four column positions) with
32 discrete values defined on each axis. In domain D is a configuration y corresponding to
a 6-dimensional vector (or a 6-tuple) of projectile position and velocity components. Sup-
pose this position in D is selected as a reference position. ~ maps this vector of physical
quantities onto the pattern space, P. Py = ~(y) is therefore stored as one of our reference
patterns, together with the associated robotic response. There is a region in pattern space
lying within a radius t (pattern distance) of the reference pattern Py • The inverse image of
this region is a region in D containing y which is the recognition neighborhood of y, and
which we denote as n(y). In a well designed system, this neighborhood should be simply
188

connected. Ideally, it is some sort of convex n-ball (n = 6 here) centered about y, but this
may be difficult to achieve (or verify) in some applications.

The accuracy criterion, r(.) then (indirectly) dictates the spacing of our reference patterns,
R, in the pattern space and the pattern distance threshold, t. That is, when we map the
set of matching patterns back onto the physical space, we want the resulting recognition
neighborhood to lie entirely within the accuracy limit:

(2.12)

Thus, we select our reference set and distance thresholds in pattern space to ensure the
accuracy we need with respect to physical space. If the recognition neighborhood is much
smaller than the region defined by r, the matching criterion will be overly demanding. We
will accurately determine the parameters of the physical environment (presumably more
accurately than necessary) when matches occur, but will be forced to store more vectors to
cover D, or risk a larger degree of vulnerability.

If the recognition neighborhood is larger than that defined by r, we will not achieve the
desired system accuracy. Matching an index pattern to one of the stored reference patterns
will not determine the physical parameters of the projectile's position and velocity to a
sufficient degree of accuracy. In the event that the recognition neighborhoods overlap, we
take the reference pattern closest to the index pattern and this may offer enough accuracy.
Despite this ameliorating feature, the computational burden is increased as a consequence
of the additional matches from which one must be selected. The system is most compu-
tationally efficient when one, and only one, reference pattern is found which matches the
input index pattern.

Selecting the Reference Patterns: Specifics

In our design, we adopt the position that our accuracy is limited by the coarse resolution
of our cameras s . The number and selection of reference patterns is determined by covering
the image foursome possibilities which imply impact with a projectile.

To get some feel for the size of the problem, we first consider the enormously impractical
brute force implementation in which all possible binary image foursomes having simply
connected blobs of approximate circular symmetry are encoded. At the farthest detection
range, the projectile subtends a single pixel. With no regard for the physics of the situation,
we have 32 2 valid positions for the projectile in each of four (independent) images, suggesting
that we have:
(2.13)
patterns to store just at this resolution, which would completely fill our pattern space.
It's obviously not worth pursuing this line of reasoning without introducing some physical
constraints.

8Indeed, with respect to motion our 32 x 32 images may not be enough in practice.
189

o p
{H I d (P ,P ) ( t)
P H Y

-I
n(yl = <1> {H I d (P ,P 1< t) P • <1>(y)
P H Y Y

Figure 2.3: The relationship among the physical domain, pattern space, the recognition
neighborhood, and the accuracy criterion.

Therefore, we make the following assumptions:

Al: The cameras are aligned in epipolar geometry such that corresponding points in the
two images lie along a common raster scan line. This removes a degree of freedom in
positioning the projectile in the right image given its position in the left. The row
positions of corresponding image points must be the same (or very nearly so).
A2: The camera optical axes are parallel (this is related to Al). Under these conditions,
the disparity between the projectile position in the two images is directly proportional
to range and, since range is unipolar, the disparity is unipolar as well. This restricts
the position of the projectile along the right image epipolar raster to one side of its
position along the left image raster. Specifically, the disparity (left column position
minus right column position) must be positive; any point in the right image must lie
to the left of its correspondent's position in the left image.
A3: We are only concerned with inbound, threatening projectiles. Therefore, only pat-
terns derived from image foursomes which describe such events require recognition.
This restricts the polarity of the change in disparity with respect to time (between
successive image pairs).
190

A4: There is a limit on the magnitude of the inbound velocity. The projectile may not
approach arbitrarily quickly. Even in the event that the visual system could recog-
nize the threat, there will be mechanical inertia and other system time constants to
consider. Again borrowing a design principle from biological systems, it is neither nec-
essary nor desirable to acquire information from the sensory system which cannot be
used due to other, say ambulatory, limitations. The inbound velocity limit establishes
a limit on the magnitude of the rate of change of disparity with respect to time.
A5: The visual system will not provide updates once the projectile covers more than one-
fourth ofthe image plane(s). At that point, collision is imminent, and the current best
estimate of the impact parameters is made available to the tactile sensory subsystem
and the robot controllers.
A6: The fields of view of the two cameras do not completely overlap (obviously). We will
assume that all threats are inbound such that they appear in both cameras. Obviously,
a projectile could approach in such a manner as to be visible in one camera only, but
we cannot perform triangulation ranging in that instance anyway. The domain of
responsibility only incorporates that volume of space visible to both cameras.

To decompose the counting problem, we will partition the domain of responsibility with
respect to range. The range quantization will be inferred by the size ofthe imaged projectile.
We will first count the number of patterns to be stored when the projectile is at a range such
that it occupies a single pixel. Next we will consider the number of patterns necessary to
describe the situation when the projectile has a diameter of two pixels, and so on. This may
not exhaust the necessary pattern set, but should provide us with a reasonable estimate
of the size of R. To make the discussion easier to follow, we will use the term "n pixel
range" to mean that portion of the domain of responsibility for which the imaged projectile
appears with a diameter of about n pixels.

Physical Constraints: Limiting the Choices

Figure 2.4 depicts a side view of the domain of responsibility (or, rather, the volume of
physical space it represents). The cameras are positioned such that the lower edge of the
field of view is parallel with the level ground plane. It is easy to see from the perspective
transform (recall the maximum range calculation) that the range at which the projectile
just subtends n pixels is given by:

Nd
Tn = 2n tan(~) (2.14)

SO that
Tnt = n2 (2.15)
Tn2 nl
From this relationship we observe the nature of the n-pixel range(s) as shown in the
diagram. As illustrated there, region Tl covers that range interval for which the projectile
covers at least a single pixel, but less than two full pixels, and so on. Therefore, the outer
191

h
Ground PIDne

Figure 2.4: Establishing physical limits on the vertical image motion of the projectile cen-
troid. Not drawn to scale.

limit of Tl is Tmax. We will refer to the Tn plane as the plane normal to the optic axis at
the outer limit of region Tn, SO the Tl plane is at a distance Tmax and Tn = Tmax/n.

To impose the motion constraints, we first observe that any inbound projectile which is
expected to collide with the robot must be (eventually) downward moving in the image
plane(s). Even if an object is launched on an upward trajectory which is threatening, we
assume that condition can be detected after the apex9 • Thus, we may restrict our reference
patterns to those for which the object is non ascending in both cameras. Given a pair
of centroid positions for images h(to),IR(to), the centroids in h(tI),IR(tI) should be no
higher.
The next limit to be estimated is the rate of change of centroid position. This is determined
by the speed of the projectile and its local trajectory. From Figurefig:incoming we see that
the fastest image plane motion for a given physical speed results when the rate of descent is
the steepest possible. Again, we only concern ourselves with trajectories which can result in
90f course, the system would certainly be more intelligent if it could infer collision while a projectile is
still upward bound, but we assume that it is rare that the projectile will be upward bound within D; i.e.,
it is on its downward arc before the robot sees it.
192

impact. For Tl (the 1-pixel range) that trajectory is denoted v. Obviously, as we approach
the cameras, the steepest possible path gets steeper, so the motion limit between successive
frames increases.

Careful computation of the limit as a function of range requires that we specify some of
the parameters of the system. Although our objective is a feasibility analysis rather than
a detailed design, we will specify certain parameters. To a reasonable extent, we will
try to develop general expressions. When necessary or practical, we assume the following
specifications:

• The image frame rate is 30 frames/sec. This corresponds to conventional video, ne-
glecting the 2:1 interleaved scan.
• The maximum projectile speed is about 20 meter/sec. This correponds roughly to
the speeds one might encounter in a foot ball or soccer game.
• The robot is 2 meters tall. More precisely, the camera lens centers are at h = 2 meters
above the ground plane.
• The closest approach for which we expect to have a chance to react corresponds to
T16, at which point the projectile occupies one-fourth of the image plane.

• To simplify matters, we will take Tmax to be 13 meters.


• The camera baseline (horizontal, or x distance between the two lens centers) is 1/3
meter.
• The other camera field of view and position parameters are as given above.

Figure 2.5 illustrates the extremal conditions for a given range limit. Tn is the distance
along the optic axis normal to the plane over which the projectile just subtends an n-pixel
diameter. The vector v is the trajectory of interest. We are interested in the speed of the
projection of v onto the Tn plane and then the perspective projection of that component of
motion onto the image plane.

First we observe the following about the incoming angle of inclination, a:

a = tan - 1 [ex + h]
---;:r- (2.16)

where exis the vertical distance from the lower limit of the field of view to the upper limit
at range Tn, h is the height of the camera lens centers above the ground plane, and T~ is the
horizontal distance from the lens centers to the intersection of the Tn plane and the upper
limit of the field of view.

Next we notice:
T~ = Tx cos 1/J (2.17)
193

r~ (imege plene not shown)

Figure 2.5: Computing the maximum image plane speed for a given range region.

where, as before, t/J is the angle of view, and Tx is the distance along the uppermost ray in
the field of view to the Tn plane. Since the angle between this ray and the optic axis is t/J/2,
we have:
(2.18)

By setting t/J = 45°, the T x , ex, T~ triangle is both right and isoceles, so we can see:
(2.19)

We also notice:

and "Y = 11" - Q - f3 = -11"2 + -t/J2 - Q (2.20)

And the magnitude of the projection of v onto the Tn plane is, with t/J = 7:
Iv;' I = Ivlcos"Y (2.21)
= IVlcos{511" -tan- 1 [TnSeC(i)cos(-V+h]} (2.22)
8 Tnsec(i)cos(7)
with Tn = Tmax/n, as above.
194

The perspective transform then lets us write the image plane speed in pixels per second,
according to our earlier results, as:

IViI = 3nlv-;'1 (2.23)

with the real space velocity expressed in meters per second and the image plane velocity
expressed in pixels per second. This is because we arranged Tn to correspond to the range
at which a 1/3 meter diameter object spans 1 pixel. The maximum distance the projectile
centroid can then move per frame, which will (partially) determine the pattern density is
then easily seen to be:
1vp_1_- IViI
30 (2.24)

Figure 2.6 plots this relationship as image plane speed in pixels per frame vs. the ap-
proximate range in meters. Perhaps more interesting from a pattern counting standpoint,
however, is the curve plotted in Figure 2.7 which depicts the same relationship, except im-
age plane speed is normalized to object diameters and the range is plotted as the integer n
corresponding to object diameter units (which also reverses the polarity of the x-axis). The
curve is clearly asymptotic to some final value, indicating that in terms of object diameters
the rate of translation in the image plane saturates as the projectile gets very close. It is
easy to see, since limn_oo I = 11"/8, that this curve is asymptotic to about 1.85 diameters
per frame. We naturally quantize D (and, hence P) in terms of the projectile diameter as
seen in the image. By attaching an estimate of the object diameter to the pattern vector,
we can scale our required accuracy accordingly, and this limit makes this approach very
convenient. This, in turn, will make estimating the number of pattern vectors to be stored
for a given range region relatively easy.

To a first approximation, the limit on horizontal image plane motion from frame to frame
is the same as its vertical limit, except that the motion can be in either direction. However,
once motion is established in one direction, we expect it to continue more or less consistently.
This assumption is somewhat pessimistic, since the robot is unlikely to be as wide as it is
tall. Therefore, the lateral motion parameters of concern lie in a smaller range than do the
vertical.

The next limit on the image plane position and motion to be considered is that of the
disparity in centroid position between the members of the stereo pair. Under our simple
camera geometry assumption we can write:

VI JJ.1
r
Vr JJ.1
r

where r is the range, b is the x-baseline, and the cameras are perfectly aligned in V. The
world coordinates (for this calculation) are defined relative to the left camera optic axis and
with x- and v-axes aligned with the horizontal and vertical dimensions, respectively, of
the left image. The right camera optic axis lies at (b,O) in this coordinate system.
195

Image Plane Speed vs. Range

30 ~--'-----~----~----~----'-----'-----~I

25 I .,
1\ I I

(P;.~: r.:m', :: b~-If-_-+_---+----+----+---I!


:-\1
-it--
1
1

10 I

I :.~ I
5'-1
o
o 2 4 6 8 10 12 14
Range (meters)

Figure 2.6: Maximum downward image plane speed as a function of projectile range.

With the x-disparity denoted as h and defined as left image position minus right image
position (and noting that the y-disparity is zero), we have

h =x/ - Xr = -bfr ~ 0 (2.25)

which produces a constraint on the allowable position of the projectile centroid as seen in
the right image, given its location in the left.
Additionally, since we have a limit on the maximum velocity of the projectile, this infers
a limit on the maximum rate of change of the x-disparity as a function of the projectile
range.
ah a [ bf ] -bf .
at = at r(t) = r2(t) r(t) (2.26)

with 1'(t) the rate of change of range in, say, meters per second. So with b, f, r expressed in
meters this expression yields the rate of change of disparity in meters per second, which is
not especially convenient. Since we would rather specify the angle of view and number of
image plane samples than the precise focal length, and since we wish to deal in spatial units
of pixels, we use the same set of substitutions used above in calculating maximum range to
196

Normalized Object Motion vs. Range Region

1.8

1.6 I I
I i ..... 1---1 ~t-
1.4
.---'

./ I
I ~
1.2 -

Diameters per
!I
J---1,
I
Frame 0.8

0.6 Ji
i I
0.4
, I
! ,
i I
0.2

o i I
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Object Diameter (pixels)

Figure 2.7: Normalized object motion versus normalized range in units of object diameter
in pixels.

arrive at the following:


6 _ Nb
(2.27)
p - 2rtan(*)
for the image plane disparity in pixels and:

(2.28)

for the rate of disparity change in pixels per second. Of course, we can always normalize
the range into object diameter units as above, and this is again convenient.
Plotting these relationships, we gain additional insight into the pattern selection and count-
ing problem. Figure 2.8 plots the disparity as a function of the range region, measured in
object diameters as usual. We see that the disparity is directly proportional (and for our
parameters, the constant of proportionality is just below unity) to the object diameter in
pixels. This is the sort of tight physical constraint we desire 1o .
IOThis relationship is a hyperbola in pixels vs. range in meters. The simplicity of this plot over the hyperbolic
is the motivation for working in object diameters.
197

Disparity vs. Range Region

Disparity
(pixels)

2 3 4 5 Ei 7 8 9 10 11 12 13 14 15 16
Object Diameter (pixels)

Figure 2.8: Disparity as a function of object diameter.

Additionally, in Figure 2.9 we plot the maximum rate of disparity change in pixels per frame
as a function of range. The maximum rate is computed by simply assuming the projectile
is moving precisely parallel to the optic axes at 20 meters per second. Again, it is more
useful from a pattern counting standpoint to consider the relationship in terms of object
diameters, and this depicted in Figure 2.10. The rate of disparity change measured this
way increases linearly with object diameter as the projectile approaches, with a constant of
proportionality of about 0.05. This, of course, is not surprising.

Accumcy Considemtions

Now that we have established limits on the physical motion of the projectile which can
be used to limit the memory requirements, we must consider the demands of accuracy.
There are two aspects to this question: accuracy of object position and accuracy of object
trajectory. The reader should bear in mind that the robot's response to the applied force
will not be determined by the visual input alone, nor even primarily. The purpose of the
visual system is to prepare the robot's tactile sensors, controllers, and so on either to elude
the collision, if possible, or to respond to an unavoidable collision. The detailed impact data
necessary for collision response will originate in the tactile sensor array, not in the visual
processing. Thus, pinpoint accuracy is not a requisite for the visual sensor.
198

Disparity Rate vs. Range

14

I I

f}+
12

10

F~T
---~

Disparity Change I

(pixels/frame)
6
I i
,
J
I
I
I
I I
4

~w
I

2 I
I I , I
L

I I'll I
oL ---~ - - -

o 2 4 6 8 10 12 14
Range (meters)

Figure 2.9: Frame to frame disparity change in pixels as a function of true range to the
projectile.

Clearly, accuracy becomes more important as the projectile approaches and collision is
imminent. The perspective transform assists us in this regard since each pixel subtends a
progressively smaller area of space as the range is reduced. But in addition to this effect,
in any given range region, we could elect to encode only a subset of the possible centroid
position 4-tuples for the reference set and set the matching pattern distance threshold
higher accordingly. As the object approaches, the field of view is gradually filled and the
likelihood of impact grows more certain as the possible trajectory is restricted. To a first
approximation, we will assume that we can use the advantage of the perspective transform
to let us represent the physical situation more coarsely (in image terms) as the object
approaches.
Counting the Reference Pattern Set
The foregoing set of plots can now be used to design the reference pattern set. Although
the detailed design of this set lies beyond the scope of this discussion, we will use these
plots to build an estimate of both the size of the pattern set and to suggest a mechanism for
inducing an effective topological structure on the memory to improve its efficiency. We will
assume that the object diameter is estimated from the average number of pixels it occupies
over the four images (roughly, the square root) and attached as part of the pattern. This
199

Normalized Disparity Change vs. Range Region

0.8

0.7 /
0.6 /
0.5 /
Disparity Change /
/'
(dia/frame) 0.4
0.3

0.2 /
0.1 /
~
o
o 2 4 6 8 10 12 14 16
Object Diameter (pixels)

Figure 2.10: Normalized frame-to-frame disparity change as a function of object diameter.

constrains the operating point along the various plots. We will further assume, to simplify
matters, that the range of possible projectile sizes and diameter estimation accuracy is such
that our position on these plots (in range expressed as object diameters) is known to within
±2 pixels.
By using the perspective transform to our advantage, we will store representative patterns
for non-overlapping projectile image positions. We begin counting by noting that, over
range region r n , there are:
32
n
choices for the column position of the projectile in h(to) and a like number for its row
position l l .
Once the h(to) image position is specified, the physical constraints on disparity (Figure 2.8)
leave us with only 3 (roughly) choices for the projectile position in IR(tO)' This is because
the disparity (for a projectile of about 1/3 meter diameter) is going to be about equal to
the object diameter in pixels. Since this information is part of the pattern, we have some
idea as to the range to the projectile, and so we have some idea of the expected disparity.
11 Recallthat we denote the four images in the foursome by h(tO),lR(tO),h(t,),lR(tl) to signify left and
right images at sample times to, t 1 •
200

Also, the patterns only represent positions to an accuracy of a diameter. Of course, if more
variety exists in our set of possible projectiles, more options will have to be accommodated
in the reference set.

With the first two images of the foursome specified, we now turn to the limits on object
motion to constrain the position in the second pair of the foursome. From Figure 2.7, we
see that the maximum vertical motion of the projectile will be less than two diameters.
Laterally, we will make the assumption that, since the robot is taller than it is wide, that
the next position will lie within one diameter to either side. Furthermore, we note that,
under the maximum velocity constraint, the projectile motion cannot attain its extreme
horizontal and vertical values simultaneously. Therefore, we conclude that given a position
in 1£(to), there are (in units of diameters) 7 possibilities for the object position in 1£(t1).
These are the current position and its neighbors to either side, the position just below the
current position and its neighbors td either side, and the position two diameters below the
current position. Motion which moves the imaged projectile out of this neighborhood will
not result in collision.

Finally, once the object position in 1£(td is specified, the position in I R(tl) is constrained
in precisely the same manner as IR(tO) is with respect to 1£(t o), giving 3 possibilities. Thus,
the total reference pattern count is approximately:
16 32 32
JRJ ~ I:-x-x3x7x3 (2.29)
n=l n n
1
L 2"
16
6.5 X 10 4 (2.30)
n
n=l

1.02 X 10 5 (2.31 )

Total Memory Size and Topological Structure

We have about 100,000 patterns to store. With five bits to specify the centroid location
in each dimension of each of the four images plus four bits to specify an estimate of the
projectile diameter in the image, each pattern consumes 44 bits. Thus a total pattern
memory capacity in the range of 4.4 Mbits (or slightly more than 1/2 MByte) is needed to
store the reference pattern set. This is well within current capabilities, even at high speed.

With each reference pattern, we will also have to store an encoded version of the robot's
response, approximate point of impact, and approximate time of impact. The point of
impact estimate will first be to the segment of the robot; 2 bits suffice to represent that
information. The time of impact (more precisely, time until impact) can be reliably resolved
no finer than the video frame rate, we expect. Although we haven't specified how slow the
projectile may go, 64 frames of video gives just over two seconds warning, which should be
sufficient. This requires another 6 bits.

The response options are limited by the precision with which the impact force vector can
be estimated. Of course, we at least have to specify the .lNoidance, retreat, resistance, or
201

no-action option. But, given any of the first three, we have to roughly specify the means of
doing so, with the details provided by the tactile sensor and processor. We estimate that
12 bits should provide sufficient flexibility, probably overmatching our impact parameter
estimation capability, anyway. This results in a convenient 64-bit total word length (44
for the reference pattern plus 20 for the specified action) and a total associative memory
capacity requirement of about 6.4 Mbits, or 800 KBytes.
To summarize, an associative memory forms the current best estimate of the time of impact,
point of impact, and the velocity of the projectile at impact time in the inertial coordinate
system. The input to the associative memory is a nine tuple consisting of a four tuple of
centroids and a crude estimate of the diameter of the projectile - all contained in a 44 bit
word. The output of the associative memory is a 20 bit word: 2 bits on strategy selection,
2 bits to specify the segment of impact, 6 bits for time until inpact, with the remaining 10
bits identifying the vector location of impact on the given segment and the impact velocity
vector.
Finally, we offer a few comments concerning topological structuring. Since the image four-
somes are built from overlapping pairs (the second pair of one foursome is the first pair
of the next) we can immediately restrict our attention to that portion of the memory con-
taining reference patterns whose "heads" correspond, more or less, to the current "tail."
Additionally, we can restrict the search within this region to the subregion containing those
patterns whose "tails" are possible, given the same motion and disparity constraints derived
above. Thus, the memory should be organized to take advantage of this natural structure.
This is an area of ongoing study, as are the crucial accuracy issues.

3. Post-Collision Tactile Sensing

For lack of space, we do not consider the collision itself. Some issues related to robotic
collision are discussed in [18].

We assume that a sustained force of variable magnitude and direction is applied to an


arbitrary point on the surface of the robot. As we show later, the controller needs knowledge
of this vector force, i.e., the vector force must be measured and expressed in the inertial
coordinate system for inclusion in the controller.

In Figure 3.1 we illustrate the basic coordinate frame relationships we envision for the robot.
........."... .,.
202

nbo\..rment

Figure 3.1: The robot is anchored at the origin of the inertial frame, O. B is a point in the
inertial frame. Bl is its image in camera (eye) 1. Dl is a vector from the center of gravity
of the head, CG h , to the origin of the image frame of eye 1. The image frame of the second
eye is not shown (segments not drawn to scale).

3.1 SENSOR STRUCTURE

The surface of the robot must be covered with sensors that collectively identify the location
of the force, its direction relative to the appropriate segment, and its magnitude.
From simpler tactile sensors with additional torque, position, and stress and strain sensors
we can estimate the magnitude, direction, and location of the disturbance force. In this
chapter we limit ourselves to the first approach, shifting the burden from the computing
machine to the sensor, although sensory systems of the type we envision do not yet exist
[19]. A sufficiently dense uniform grid of pressure detecting sensors is assumed to exist over
the surface of each segment of the robot, embedded in some skin-like viscoelastic material.
For each segment, an associative memory identifies the three segment-centered coordinates
203

of every point on the surface in response to excitation of the sensor elements by applied
pressure.

3.2 SENSOR PROCESSING

A subsystem is assumed that measures the magnitude and local direction of the impinging
force in that segment's coordinate system. It is worth noting that the neural processes
that supply this information in the central nervous system of living systems have not been
adequately identified nor have the processes been confirmed. Similarly artificial skin and
robots with tactile information processing capabilities do not yet exist to handle situations
of this complexity level. Consequently any mechanism discussed here is at best a theoretical
prototype. Additional uses of tactile information in exploration and shape perception are
found in [20,21]. The processing discussed here is shown schematically in Figure 3.2 and
divides naturally into two parts:

1. The set of excited sensors is treated as a three dimensional pattern and is reduced
to a skeleton, i.e., a sequence of sensors that lie parallel to the impinging force by a
lateral inhibition or nonmaximum suppression operation.
2. From the addresses of the skeletal sequence, the latter three components of impinging
force are computed in the segment's coordinate system.

Figure 3.2: Schematic of neighborhood information processing to extract the impinging


force vector expressed in inertial coordinates.
204

3.3 TRANSFORMATIONS

We assume the associative memory provides a sensor's position in a local 3D coordinate


system. Let this position vector be Xl. Suppose the origin of this local coordinate system
is at a point 0 1 , specified in the coordinate system of the segment by vector D 1 . The
coordinate system of each segment is at the center of gravity of that segment and its three
axes align with the principal axes of the segment. The principal axes are a unique set of
axes about which the moment of inertia matrix of the segment is diagonal. Finally there
is the inertial coordinate system. The above three coordinate systems correspond to those
of Figure 3.1. The image coordinate system of the eye is analogous to the local coordinate
system. The segment coordinate system is analogous to the head coordinate system of
Figure 3.1 and the inertial coordinate system is the same for both cases.
Let T be the orthogonal transformation from the local coordinate system to the segment
coordinate system. Let the coordinates of the sensor be x. in the segment coordinate

(3.1)

Finally let Xi be the coordinates of the sensor in the inertial coordinate system

Xi = CG.(8) + A.(8.)x. (3.2)

where CG.( 8) is the location of the center of gravity of the segment in the inertial coordinate
system and A.(8.) is the 3 X 3 orthogonal transformation from the segment of interest to the
inertial coordinate system. The 3-vector 8. describes the three Euler angles of the segment
[22,23]. The vector 8 in our notation represents the totality of all the degrees of freedom -
Euler angles of the robot. However, CG.(8), in general, depends only on a subset of these
angles, i.e., those of all the segments that are below the segment of interest.

When a vector is given by its components in the local system Fe, its components in the
inertial coordinate system are given by

Fi = A.(8 s )T Fe (3.3)

If the local coordinate system is the same as the segment coordinate system, the tranfor-
mation T is the identity matrix.
205

4. Control Requirements

There are two major requirements that must be specified prior to the robot's engagement
in resistance or retreat.

1. A subset of the robot states must be relegated [24,25) to the task of resistance or
retreat. The robot may be engaged in other tasks, and it is desirable not to interrupt
them.
2. The controller structure must be known, i.e., a topology of the feed forward and
feedback mechanisms must be selected.

We will not attempt to define any criteria or conditions for the selection of the subset of
states to which resistance or retreat is relegated. This selection depends on other tasks the
robot is involved in, and its future tasks and objectives, and is a function of higher levels
of intelligence and expertise of the robot. vVe assume the subset is a priori selected. One
underlying implication of relegation of control is that, due to higher level objectives, all
states are well specified so no redundancy or uncertainty can exist in the maneuver to be
undertaken.

4.1 THE CONTROLLER STRUCTURE

For the controller structure we select heuristically combined feed forward and feedback
mechanisms. The feed forward component consists of two parts: direct actuator input as
reference input, desired reference state, and/or gain adjustment for the feedback loop. The
feedback component consists of state feedback with variable gains to be set by the feed
forward mechanism. This structure is schematically shown in Figure 4.1. The main charac-
teristic of this design is the flexibility of control. Under certain circumstances the feedback
mechanism with the feed forward gain maybe sufficient. For other cases the feed forward
may be sufficient. Under other circumstances the two components can act together. The
forward gain system G 1 adjusts the eigenvalues of the closed loop system in the vicinity
of an operating point. Eigenvalues with laf!~e negative real parts correspond to high stiff-
ness in the corresponding subspace. Eigenvalues having small negative real parts provide
compliance in the direction of the corresponding subspace [26). Mason and Salisbury have
proposed a similar mechanism for controllin!~ a finger [27).

A mechanism that is not considered here is to dynamically or magnetically hold the actu-
ators and lock them in a particular position. This is an effective mechanism for making a
system very rigid and stiff. Natural systems apparently use an analogous mechanism for
resisting motion by coactivation of systems of protagonist-antagonist muscles in such a way
that the muscles are very taut and behave like metal cables. They inhibit relative motion
between the limb segments.
206

a.m .....t - - - - - ,
0,

Gain edjlllt.ment - - - - - - ' "


0.

.,.,.. ......... - - - - - - - - - - - - - '


...Ie

Figure 4.1: Topology of the feedback structure. State feedback with forward gain G},
forward actuator gain adjustment G 2 , direct forward actuator input, and desired reference
state input.

4.2 INPUT RELEGATION

With these preliminaries suppose a force F of constant magnitude and direction is applied
to point A of the segment n. Suppose force F is expressed in the inertial coordinate system
and point f is specified by its vector VI in the local coordinate system of segment n. The
coordinates of point f in the inertial system are given by
(4.1)
IT vector 0 is disturbed by .6.0
8L
dL= -d0 (4.2)
80
The incremental work of F is given by
8L T
dw =< dL, F >= d0 T - - F
80

dw = 8L T F (4.3)
d0 80
Equation (4.3) defines the influence of F on the dynamics of the robot. The system actuators
must counteract 8w/80 in the dynamics. One way to do this is to relegate the counteraction
as the sole function of a particular set of inputs [24]. Alternatively, pseudoinverse methods
[28] may be used to distribute this effort among all the inputs.
207

4.3 STATE RELEGATION

The velocity vector of point f in inertial coordinates is expressed in terms of 0 as

dL = 8L0 (4.4)
dt 80
This equation partially defines the incremental constraints on 0. Traditionally this equation
has been solved by pseudoinverse methods, and the retreat is relegated to all the position
states [28]. With state relegation, one may relegate the retreat action to three specific angles
of the robot. On the other hand, the situation allows one to define additional constraints to
maintain stability by keeping the projection of the center of gravity vector under the support
surface or keep the head segments containing the vision system stationary or provide specific
synergies among the position states to simplify the decision of the controller.

S. Four Link Biped

S.l THE MODEL

As an example we consider a four segment planar robot anchored at the origin, with four
degrees of freedom 0 T = (81) 82 , 83 , 84 ) and four moment of force actuators at the joints
UT = (Ul, U2, U3, U4) [29]. The equations of motion for this robot can be symbolically
written in a standard robotic notation

J(0)0 + B(0)0 2 + G(0)g = EU (5.1)

where J(0)0 contains inertial terms, B(0)0 2 is the centrifugal vector, G(0)g is the grav-
itational vector (g=9.8 m/s-s) and

E~ U-~ -~ -~ 1 (5.2)

Robot variables and parameters are defined in Figure 5.1. The human-like physical param-
eters of this robot are given in Table 5.1. We will consider a case of resistance and a case
of retreat for the four degree of freedom planar biped.
208

n~-~--_r1......,CC,._I __ J,
......04--...1.,;.;.:"...----+.;...._.
UI

Figure 5.1: Robot variables and parameters definitions

segment m 1 k j
number (kg) (m) (m) (kg-m-m)
1 7.770 .560 .260 .170
2 13.244 .375 .161 .130
3 40.628 .662 .331 1.25
4 8.358 1.000 .450 .585

Table 5.1: Robot parameter values

5.2 RESISTANCE TO STEADY FORCE

The resistance strategy is selected when projectile collision can not be avoided and insuf-
ficient time remains for retreat. A projectile, with some impact velocity, which directly
strikes the robot will rapidly acquire zero velocity in a short time interval. The robot
will experience a disturbance force during this short time which can be approximated as a
constant force F.
The resistance strategy selected here is to hold a constant position. The desired robotic
response (superscript d is for desired and indicates a feed forward term in the controller) is
e e
e d equal a constant (which will be the value of e at impact) and d = d = 0
209

We devise a rigorous test for the resistance strategy by applying a steady force F to the
robot and not just a transient force. The dynamics are affected according to the work done
by F
.. •2 {JLT
J(0)0 + B(0)0 + G(0)g = EU + -a0F (5.3)

and feed forward is obtained from inverse statics evauated on 0 d

Ud = (~)(G(0d)g- :~:F) (5.4)

The incremental eigenvalues should naturally be chosen with knowledge of the time con-
stants and power capabilty of the DC motors, mechanical strength of the robot's struc-
ture;etc. The feedbacks are a priori calculated, somewhat arbitrarily, to set all eight eigen-
values of the four planar segments of the linearized system at -3. The corresponding state
feedback matrix K is given in Table 5.2 where K feeds back state pertubations
~U = J((~0, ~0) (5.5)

-783.4 -462.15 -425.34 -13.563 -267.17 -164.64 -160.82 -5.7426


-207.9 -351.48 -330.66 -11.018 -1:18.6 -90.855 -97.702 -4.0459
-97.228 -65.108 -267.25 -9.3134 -64.18 -43.405 -55.443 -2.9097
-2.5454 -1.7043 -3.0086 -6.3048 -1.6968 -1.1362 -2.0058 -.90394

Table 5.2: Gain matrix K for resistance

The controller consists additively of


(5.6)
All the attributes of the generalized controller of Figure 4.1 are not included in Equa-
tion (5.6). The feedback ~U models the reflexive human response and the feed forward U d
is a voluntary response due to sensory detection of the physical impact, although the vision
system can allow some anticipation. Simulation of the controller is as follows: a horizontal
force F = (100, O)T newtons applied at joint 3 and 0 d = (7r /2, 7r /2, 7r /2, 7r, 2) is the impact
position, as shown in Figure 5.2. A sensory dynamic lag time of .2 seconds is allowed to
detect the existence of the impact force. The feed forward U d is added after sensor lag.
Figure 5.3 shows the trajectories of the four angles as functions of time. With the selected
feedbacks, the robot requires about 2.5 seconds to regain the original stance at impact and
successfully resist the force.

5.3 RETREAT FROM A PROJECTILE

The retreat strategy is chosen when a collision is not avoidable by evasive action but there is
some time to ameliorate the effect on the robot of a projectile impact. The retreat strategy
210

-'W-
-
Figure 5.2: Robotic resistance to a projectile impact at joint 3.

--.....-....

•••
...._. ..
•••
·~:~.~~.~.4~-.~.~.-L~.~.~~.~.~~~~~.~4~ • .•. .t_••.
Figure 5.3: Resistance simulation - actual and desired angles

chosen here is to make the estimated point of impact acquire the estimated projectile velocity
in the time remaining prior to impact. This strategy has the desireable property that the
sudden change in projectile velocity due to striking a stationary robot does not occur, which
in turn yields a small impact force. Subsequent to impact, the retreat strategy is to return
211

the robot to some stationary position by bringing all angular velocities 0 to zero. The
estimated point of impact will be at joint 3 as shown in Figure 5.4 We relegate the retreat

••

Figure 5.4: Robotic retreat from projectile approaching joint 3

strategy to the lower three angles 9b 92 , 93 and chose that 84 is zero( the head remains
stationary). The estimated velocity vector of the projectile is assumed to remain constant
at dL/dt = (.2,ol meters per second. From the given constant retreat velocity, dL/dt, the
minimum norm of angular velocities (B·t, e~,eg) is computed and et = 0 is appended to
e
give 0 d • The nomimal d is obtained directly as d(0 d )/dt.
Feed forward U d is obtain from the inverse dynamics, Equation (5.1), evaluated on 0 d ,ed ,
and 0.
The poles of the closed loop are chosen all at -10 resulting in the state feedback matrix of
Table 5.3 where K feeds back velocity pertubations l:1U = K l:10. The results of digital

-308.76 -135.68 -222.6 -8.176


-94.471 -126.97 -138.73 -6.2838
-85.764 -47.891 -92.329 -4.7899
-1.8922 -1.4938 -3.2834 -1.5066

Table 5.3: Gain matrix K for recession

computer simulations of this maneuver are given in Figure 5.5 which shows the angular
velocities WI = 8b W2 = 82 , W3 = 83 , and W4 = 84 of the robot in response to the retreat
command and Figure 5.6 which shows the actual and desired translational velocity of the
estimated impact point in both the horizontal and vertical direction.
212

....
..• - •• •• I • I •

- - - - - - .......... ..-.c,
••

••
I' I
••
,
•• I •
- _ ...,... _ _ _ _ "'".-....cl "' ••

• • ••
•I• • •:•••
••
II
•• f-
••
•:~.

............
.
~

.. I. I I I • • .t-
YfI'It_. I •
•• •• ••._'1&('I •
••

Figure 5.5: Retreat simulation - actual and desired angular velocities w

••tr~~;;~~~~~~~~A;;'~'~~~~~~"~'~~~
,- .
•• ••
...
.. ••
• J

I
;'.f-
:: ....
......
• - ....- - - - - -...---------·1
~
·
.....
.. ..
·
•• •...._.I' "' •• -.
..... I

.....•."' "'
I I
·
••
• I

Figure 5.6: Retreat simulation - actual and desired estimated impact point velocities
213

6. Discussion and Conclusions

We have provided a rudimentary method for resistance and retreat of robots when colli-
sion or contact are unavoidable. The method requires a vision system for velocity, force,
and point of contact estimates. A uniform distribution of pressure sensors on the sur-
face of the robot, an associative memory, a nonmaximum suppression technique, and some
transformations are needed for force sensing. We have provided sufficient detail so that a
microprocessor implementation of all the above can easily be envisioned. Due to lack of
space the actual implementation is not attempted and is left for the reader.
The controller structure is to some extent standard. We used input relegation to counteract
the external disturbance force and provide resistance, as opposed to impedance control by
changing the feedback gains. We used state relegation for avoidance as opposed to general
pseudoinverse methods.
The digital computer simulations illustrate how these methods work and that both the
resistance and retreat strategies seem to be feasible. Much remains to be done on these issues
and this chapter should be considered as a motivational discussion and as an introduction
to the subject.

7. References

[1] L. A. Loeff and A. H. Soni, "An algorithm for computer guidance of a manipulator in
between obstacles," Journal of Engineering for Industry, vol. 97, pp. 836-843, 1975.
[2] T. Lozano--Perez, "Automatic planning of manipulator transfer movements," IEEE
Transactions on Systems, Man, and Cybernetics, vol. 11, pp. 681-689, 1981.
[3] R. A. Brooks, "Planning collision-tree motions for pick-and-place operations," Inter-
national Journal of Robotics Research, vol. 2, pp. 19-44, 1983.
[4] O. Khatib, "Real-time obstacle avoidance for manipulators and mobile robots," Inter-
national Journal of Robotics Research, vol. 5, pp. 90-98, 1986.
[5] N. Hogan, "Impedance control: An approach to manipulation," ASME Journal of
Dynamic Systems, Measurment, and Control, vol. 107, pp. 1-24, 1985.
[6] M. R. Cutkosky and 1. Kao, "Computing and controlling the compliance of a robotic
hand," IEEE Transactions on Robotics and Automation, vol. 5, pp. 151-165,1989.
[7] A. Pugh, Ed., Robot Sensors, Vol. 2, Tactile and Non- Vision. United Kingdom:
Springer-Verlag, 1986.
214

[8] D. Siegel, I. Garabieta, and J. M. Hollerbach, "An integrated tactile and thermal sen-
sor ," in Proceedings of the IEEE International Conference on Robotics and Automation,
pp. 1286-1291, 1986.
[9] Y. F. Zheng, "Integration of multiple sensors into a robotic system and its performance
evaluation," IEEE Transactions on Robotics and Automation, vol. 5, pp. 658-669,1989.
[10] M. A. Fischler and O. Firschein, Intelligence: The Eye, the Brain, and the Computer.
Reading, MA: Addison-Wesley, 1987.
[11] S. T. Barnard and M. A. Fischler, "Computational stereo," ACM Computing Surveys,
vol. 14, pp. 553-572, December 1982.
[12] A. C. Kak, "Depth perception for robots," in Handbook of Industrial Robotics, S. Nof,
Ed., pp. 272-319, New York: John Wiley, 1986.
[13] K. L. Boyer and A. C. Kak, "Structural stereopsis ior 3-d vision," IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 10, pp. 144-166, March 1988.
[14] N. Ayache and B. Faverjon, "Efficient registration of stereo images by matching graph
descriptions of edge segments," International Journal of Computer Vision, vol. 1,
pp. 107-131, 1987.
[15] R. Horaud and T. Skirdas, "Stereo correspondence through feature grouping and maxi-
mal cliques," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11,
no. 11, pp. 1168-1180, November 1989.
[16] D. H. Ballard and A. Ozcandarli, "Eye fixation and early vision: kinetic depth," in
Proceedings Second International Conference on Computer Vision, pp. 524-531, 1988.
[17] R. C. Nelson, "Visual homing using an associative memory," in Proceedings of the
DARPA Image Understanding Workshop, pp. 245-262, 1989.
[18] Y. F. Zheng and H. Hemami, "Mathematical modeling of a robot collision with its
environment," Journal of Robotic Systems, vol. 2, pp. 289-307, 1985.
[19] J. Webster, Tactile Sensors for Robotics and Medzcine. New York: John Wiley and
Sons, 1988.
[20] H. Hemami and R. E. Goddard, "Recognition of geometric shape by a robotic system,"
Journal of Robotic Systems, vol. 4, pp. 237-257, 1987.
[21] H. Hemami, J. Bay, and R. Goddard, "A conceptual framework for tactually guided
exploration and shape perception," IEEE Transactions on Biomedical Engineering,
vol. 35, pp. 99-109, 1988.
[22] K. S. Fu, R. C. Gonzalez, and C. S. Lee, Robotics. New York: McGraw Hill Book Co.,
1987.
[23] H. Hemami, "A state space model for interconnected rigid bodies," IEEE Transactions
on Automatic Control, vol. 27, pp. 376-382, 1982.
215

[24) H. Hemami, C. Wongchaisuwat, and J. Brinker, "A heuristic study of relegation of


control in constrained robotic systems," Transactions of the ASME, vol. 109, pp. 224-
231,1987.
[25) H. Hemami and H. Ong, "Philosophy, structure, and examples of relegated control,"
Journal of Intelligent and Robotic Systems, vol. 2, pp. 53-72, 1989.
[26) A. Z. Ceranowicz, B. F. Wyman, and H. Hemami, "Control of constrained systems of
controllability index two," IEEE Transactions on Automatic Control, vol. 25, pp. 1102-
1111, 1980.
[27) M. T. Mason and J. K. Salisbury, Jr., Robot Hands and the Mechanics of Manipulation.
Cambridge: The MIT Press, 1985.
[28) C. Klein and C. Huang, "Reveiw of pseudoinverse control with kinematically redundant
manipulators," IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, pp. 245-
250,1983.
[29) B. Khosravi-Sichani, S. Yurkovich, and H. Hemami, "Control of a fourlink biped in a
back somersault maneuver," IEEE Transactions on Systems, Man, and Cybernetics,
vol. 17, pp. 303-311, 1987.
CHAPTER 9

A DISTRIBUTED CONTROL NETWORK


FOR SENSORY ROBOTICS

AT.de Almeida, U.C.Nunes, J.M.Dias, H.J.Araujo and J.Batista


Electrical Engineering Dept, University of Coimbra
300 Coimbra, Portugal

ABSTRACf. Sensor-based robots can increase significantly the scope of


applications of robots in manufacturing namely for flexible assembly. This paper
describes the development of a robotic system used for the evaluation of sensory
integration techniques and real-time path control strategies. The robotic system
has a distributed hierarchical architecture, featuring vision, force, tactile, range
and proximity sensors. The approaches for integration of the sensorial
information are discussed.

1. INTRODUCTION

In spite of recent developments towards task-level robotics, enormous


limitations still exist at the algorithmic and physical levels preventing the complete
implementation of such a flexible system. Among those limitations, it can be
mentioned the lack of reliable and feasible sensors and the lack of efficient methods to
cope with problems in domains such as environment modeling, geometric reasoning,
planning collision-free motions and planning sensory strategies. The integration of
these specialized modules in a complete control system poses another research
challenge. Such a system should be made up at high-level with powerful algorithms,
performing automatic planning of actions, and at low-level with run-time capabilities
to monitor and control the course of those actions under sensory feedback. It is well
understood that sensors play a central role in achieving the desirable system
flexibility, reliability and autonomy. Due to this fact, work dealing with various
aspects related to sensors such as sensor development, sensory data processing,
sensory integration, sensor fusion, etc., has been intensive.
A robotic system with the capacity of synthesizing and executing reliable
motions needs in first place tools for gathering information from its surroundings. At a
later stage, it should carry out the analysis of the sensory raw data and the required
data transformations in order to keep an updated world model of the environment.
Based on the on-line sensory information and on the pre-built model of the
environment, planning operations, such as the selection of feasible and suitable grasp
configurations can take place. At the low-level, the specification of sensor-based
217
218

control strategies is a key point - it has been verified that the difficulty of generating
accurate fine motions for assembly, is still one of the most serious factors precluding
flexible manufacturing.
The sensory information described until now is mainly concerned with the
specification and execution of isolated pieces of specialized motions. The degree of
autonomy of a system is related with its capacity to select the suitable specialized
motion along the time according to the state of the task under execution. That is, a
system to be autonomous needs a module for planning sensory strategies and for
planning appropriate recovery actions once an error state is determined [1,2]. As an
example, this planning module should provide solutions for the following questions:
When should a vision system be used to cope with an uncertain object position? What
sensors to use to decide for the end of an action? What sensors to test to decide for the
success/failure of an action? What sensors to scan continuously in order to minimize,
or prevent, secondary effects due to an error or a failure? How to process sensory data
to extract oustanding information?
This paper is concerned with the development of a robotic system which aims at
enabling the evaluation of sensory integration techniques as well as the evaluation of
real-time path control strategies. Most commercial robots, as in the case of the six-axis
PUMA 560, do not have enough flexibility to allow an efficient use of sensory
information. For that, the use of an external processing system is needed. For
connecting an external system to a PUMA controller there are several possible
configurations [3]. One of them is to use both the channels provided by the VAL-II
system [4], the real-time path control and the supervisory control channels. Despite the
limitations of such a configuration, it provides enough power and flexibility for many
applications. This is the configuration adopted in our robotic system approach, shown
in Fig.I. The robotic system exhibits an hierarchical structure and is based on a
commercial distributed control network, the Bitbus serial control bus [5].
Presently the main devices of the robotic cell are: a six-axis manipulator PUMA
560 equipped with a wrist force sensor, two CCD cameras and a two finger gripper
with proximity and tactile sensors; a conveyor belt equipped with a set of devices and
a rotary table with pneumatic indexing. There are several sensors and actuators
associated with these devices which in the case of the conveyor belt are the brake and
the motor speed controller, sensors to monitor the motor speed and miniature optical
devices to detect objects on the conveyor. The gripper is pneumatically controlled by
using a pneumatic proportional valve which allows continuous variation of the force
applied to the handling objects. It is equipped with an optical sensor, to detect the
presence of objects between the fingers, tactile sensors and ultrasonic ranging sensors.
The actuators and sensing devices of the conveyor, indexing table and gripper are
connected to the Bitbus serial network through a microcontroller-based modular
system. A set of boards 'corresponding to a variety of functions has been implemented
around this modular system with which is possible to accomplish diverse control and
monitoring applications. The vision system is based on a host computer, a frame
grabber and several image processing units. The frame grabber can receive images
219

'-
SUPERVISOR

BITBUS
t
L BITBUS_
Wires (2.4Ml>'1)
Main oct

1
BITBUSSIa.. I
I
BITBUSSta... I
1
BITBUSSIa", I BITBUSSlave I
LOCAL DDCMP IMAGE LOCAL
PROCESSING
-
SUPERVISOR INTERFACE CONTROLLER
UNIT L..- SYSTEM '-

OIL,
RS 232
I DDCMP
~~ Polltlon.r '"

--1----
nd~_ .. ____ Proto,eol
Coauyor
BI lB •• MalOr "
L Ink. Actuatores
ALTER

Gripper ControUer
Ultrasonic System
Port
2 CCD
carner••
I
Wrist force Sensor

r
~~
Tactile Sensor

fa
PUMA
Controller
( )
J
Il

Fig. 1. Schematic of distributed sensor-based robotic system.

from up to four cameras with a resolution of 512x512 pixels and has a programmable
pseudo-colour output. The image processing units, based on a very fast digital signal
processor, carry out the feature extraction operations on the images obtained by the
frame grabber. Several image processing units can be connected in a parallel-processor
archi tecture.

2. DISTRIBUTED SYSTEM ARCHITECTURE

The control system in use exhibits an hierarchical structure of the form


illustrated in Fig. 1. This computational structure can be useful for the following tasks:

(i) To support simple automatic assembly applications in real-time. The


computational structure should be powerful enough to allow the study of assembly
applications with sensory requirements. It should also be flexible enough in order to
cope with the study and the development of different assembly subtasks.
220

(ii) To allow the access to the cell resources from other computational systems
working at a higher level. At this level it is expected to investigate problems such as
three-dimensional vision integration, path planning, domain modelling, grasp
planning, etc.

In order to achieve the first purpose, the control system was implemented
following a distributed model based on a high-speed short-message channel and using
slaves with a high processing capacity. The communication between the Supervisor
and the distributed slave systems uses a serial network in a bus topology (the Bitbus).
This network has enough attributes which lead to acttractive solutions for a wide
spectrum of local control applications, particularly at the sensor and actuator level. It
exhibits a good performance in handling short messages, namely allowing high order
rates of information transmission, up to 2.4 Mb/s in hal-duplex, synchronous mode.
For the second purpose we have connected the Supervisor to an Ethernet on which we
have Sun workstations.
In functional terms we can divide the control system in three parts, the planning
processing system at the highest level, a logic branching control structure at the
intermediate level and a real-time continuous controller at the lowest level. The
principal component at the lowest level is the Local Supervisor which is connected to
the PUMA controller through the real-time path control channel (Alter). This is the
level which supports the implementation of real-time continuous sensory-feedback
control strategies, that is, the implementation of fine motions (motions in contact,
surface following motions using proximity sensing, etc). At the intermediate level
sequences of discrete actions actuating upon the distributed devices can be
programmed. Examples of these operations are the examination of a sensor state, a
request of raw data from a specific sensor, or a command to trigger a MOVE action by
the PUMA robot. Physically this layer is connected to the manipulator robot controller
through the supervisory channel. The highest-level is the planning and high-level
information processing system.

3. MODELLING THE ENVIRONMENT

To automatically perform useful tasks, the manipulator system must have


knowledge about its universe. More precisely, it needs a data structure to integrate all
the information captured by the sensors. This structure must have the capability to
filter and merge all this information, modelling it so that it can be useful for the
manipulator tasks. In principle, this structure should build the map or model the
environment without human intervention and this map should not come from a data
base. This principle is one of our goals when attempting to build an autonomous
manipulator system that uses sensorial information as interface to the universe.
The structure used to model the environment can assume different topologies
depending on the tasks on which we are interested. The selection between these
221

different topologies depends on the type of environment and task in hands. It is easy to
conclude that several different levels of representation, corresponding to different
resolutions of the universe of the manipulator are necessary. For example, a symbolic
description is desired if high-level goals can be specified. In that case the model
should be hierarchical, built bottom-up with sensor data at the lowest level and with
symbolic and topological information at higher levels. This way, reasoning and
planning use the data previously processed at higher levels with the necessary details
coming from lower levels. Taking these problems into account we propose three
different levels of representation for the working universe:

(i) - Low resolution map with the symbolic and topological information;
(ii) - Medium resolution map with some dominant features;
(iii)- High resolution map with dense data.

The low resolution map is the main source of information for the task or path
planners, because it normally contains the position and identification of the obstacles
in the universe. The family of planners using the concept of configuration space [6]
uses discrete obstacles represented by 2D polygons which can be obtained from the
last level of data representation explained above. At the low resolution level we
describe only discrete obstacles without explicitly describing the local shape of the
environment. At the medium resolution level, we include a description of the
environment through surface patches that correspond to significant environment
features. These features include regions with the same kind of properties such as
colour, texture and the discontinuities between these homogeneous regions. This map
uses more dense information about the universe, grouping (or segmenting) portions of
regions based on some criteria and defining better the discontinuities (edges). To
obtain this map a more dense information map obtained in an early stage is used. In
that map all the information coming from different sensors is combined, validated and
placed into the high resolution map.
In this paper we comment the interconnection between the different processing
stages and the sensors used to obtain a high-resolution map. We also comment how
these tasks are performed and how the information coming from the several different
sensors is made available in the distributed network.

4. VISION SYSTEM

Vision is one of the most powerful sensors that can be used in robotics. The
visual data is rich enough to be used in tasks such as parts assembly using visual
feedback or, as the main source of information about the manipulator universe.
Different techniques from Computer Vision can be used to extract different types of
information from the image data. That information can be employed to make up the
222

high-resolution map. We are interested in extracting the three-dimensional structure of


the world using techniques such as stereo, motion or depth from focus.
Regarding the vision system we have developed two distinct elements. One
refers to a specialized architecture for real time image processing and another deals
with the basic software modules - camera calibration and an image processing library.

4.1 Architecture for image processing

A vision system was developed for image processing in the fields of robotic and
products inspection [7]. Taking advantage of the flexibility of a programmable Digital
Signal Processor (DSP) and being built around an industrial standard bus, the system
can be configured in a number of ways allowing a wide spectrum of applications - see
Fig.2. Working under the control of a host computer, the vision system, in its
minimum configuration, is composed of two boards which are capable of processing
images of 512x512 pixels with 256 grey levels per pixel. These boards are a video
acquisition and display board and an intelligent image processing unit.
The vision system, developed around a general-purpose digital signal processor
in the processing units, allows the construction of a Single-Instruction-Multiple-Data
(SIMD) structure suitable for vision processing without compromising flexibility. On
the other hand, the modular approach taken in the design of the system permits its
adaptation to particular applications.
The video-acquisition display and transmission board [8] is self-contained unit
interfacing this system to standard black-and-white TV cameras using the CCIR
(Comite Consultative International des Radio-communications) norm and to black-
and-white monitors (see Fig.3). Up to four video inputs can be connected to the
acquisition board which uses an 8-bit analog-to-digital flash converter to digitize one

V1DOO
AQUlSrmN
DISR.AY
UNtr

Fig. 2. Block diagram of the VMEbus-based vision system.


223

n VIDEO IN
nomthc
Cameras

~
~ II [1}--
~! our
Tothc
Mooita

Fig.3. Schematic of the acquisition, display and transmission board.

of the selected channels. This provides information associated with multiple views of
the working cell. Before being stored in memory, the digitized sample passes through
a look-up table of 256 entries allowing simple processing, such as binarization of an
image, in real time. The display section of the board uses a conventional CRT
controller and a colour look-up table which incorporates three 8-bit digital-to-analog
converters to generate a RGB video signal with 256 colours from a palette of 256K
possibilities. The acquisition board has the capacity to store two complete images of
512x512 pixels on its internal video memory which is accessible to the host. This
board also has the capability of transmitting, through a secondary bus, the digitized
video which becomes available to the intelligent image processing boards connected to
the system. When activated by the host, the transmitting section of the board sends two
samples of video data and their corresponding addresses every 125 ns.
Each of the intelligent image processing boards [9] used in the vision system
consists of four parts: a video input module, digital signal processor (DSP), a local
memory and the interface to the main bus (VMEbus), as shown schematically in Fig.4.

DIGITAL SIGNAL
FROCESSOR

IMAGE
MEMORY

Fig. 4. Schematic of the intelligent image processing board.


224

The video input module uses both the data and the addresses appearing on the
video bus to place an image, or part of it, in memory. Under the command of the host
computer, the video interface module may be configured to acquired images of
different sizes (windows) and different resolutions. In choosing a digital signal
processor for this system, a special consideration was given to the capability of
addressing a large range of external memory, as it is required in image processing. The
intelligence of the board is only local and the DSP processor is totally controlled by
the host computer. This intelligent board can store and process a complete 8-bit
512x512 image, i.e. 256K pixels. This information is physically stored in static RAM
memory which can be accessed by three elements: the video bus, the host computer
and the digital signal processor. An important aspect of the video memory
organization refers to the arbitration between the three sources of access. The video
bus has the highest priority, having the whole memory available during the total
transfer of an image. The VME host computer has the responsibility of haiting the
DSP processor before making an access to the video memory, avoiding in this way any
conflicts in the access to the shared memory. The VME interface module of the
processing board permits the access to the video memory, the access to the digital DSP
programme memory, the control of the video interface module and the control of a
number of registers to operate the board.

4.2 Software tools- Image Processing Library

When we face a vision problem we can try to solve it by implementing the


specific algorithms we recognise adequate. However this solution is specific and
usually not flexible, that is, even if the problem changes slightly most of the software
has to be rewritten. When considering the solution of a range of problems in the same
field (a robot cell, in our case) a much more flexible approach consists in developing a
general purpose library containing the most used algorithms and operators. When we
try to solve a specific vision problem using the library it may be necessary to develop
new operators. This way the library provides a flexible means to integrate all the
software developed and to make it available to future purposes.
Thus we have developed a general purpose software [10] in order to provide the
user with a flexible image processing tool. The main goal of this software is to provide
the user with a friendly interface to manage an image data base and to apply to the
images an operator or a sequence of image processing operators.
This library has been developed taking into account the following constraints:

-Interactivity: This is essential for such software. Usually the user wish to be
able to select the operators, and to control the sequence of processing steps;
-Easy access:Provides a direct access to standard images files;
-Modularity: This feature allows the easy integration of new operators.
225

This image processing library integrates three major modules:


- an Interface module based on interactive menus;
- a Supervisor module oriented to an efficient image data base management;
- an Operator module which allows an easy, efficient and independent operator
management.
The interface module provides a man-machine interface based on interactive
menus, with an efficient subroutine management, allowing a perfect control over
incorrect sequential task selections. This module also presents a direct dialogue with
the user from which he may select all the parameters needed for a special purpose. It
also presents a high-level status and help menus.
The supervisor module is responsible for a correct management of image data. In
this library all the images are stored as raster files, and each image has its own image
descriptor in a round chain of descriptors . The supervisor module deals with three
types of images: The frame-buffer images, the chain images and the stored images.
The frame buffer images are the images acquired and stored in the frame-store. The
chain images represents all the images available for processing. These chain images
includes those obtained from the frame-buffer and also some of the stored images.
The chain images are temporary image files, but they can be transformed into stored
images by user decision. When the chain has a lot of images, the user can update the
images selecting from the chain images the image for processing. This software only
processes images that belong to the chain.
In this library, the image processing operators are not developed in the library
but externally. They are used as child processes which are created by the operator
module. With this procedure, all the operators can be developed as closed programs
which can be executed independently from the library or called from it.
This image processing library has been intensively used to solve several kinds of
problems related to image processing and automation. A large group of camera
calibration techniques has been implemented [11] on this software within the scope of
the sensory robotic system.

4.3 Software tools- camera calibration

In order to build the world model it is important to compute the depth map of the
environment. For that purpose two cameras are attached to the gripper's end-effector.
To compute depth we need to find out, with reasonable accuracy, the relative
transformation between the 2-D image and the referential representing the tool. The
solution of this problem is obtained by performing a calibration of the stereo pair of
cameras. As a result of this process, we obtain two (3x4) matrices which express the
above mentioned transformations. If the transformation between origins of the two
cameras' referentials is known, then the coordinates of any point in the 3D world can
be calculated by using its projections on the left and right images.
The basic problem in camera calibration is to obtain the relationship between the
three-dimensional world and the two-dimensional images captured by the cameras-
226

Fig. 5. Geometry of the stereo pair. The {LEFT} and {RIGHT} referentials are the
three-dimensional referentials of the cameras. The relationship between the cameras
referentials and the world referential is known.

see Fig. 5. This relationship can be expressed by a linear equation (1) where (X,Y) are
the coordinates of the projections in the image of a point p(x,y,z) in the 3D world, s is
a scale factor and CAM is known as calibration matrix.

(1)

By algebraic manipulation, (1) can be expressed in the equivalent form:

[~] = Ta (2)

where a is a vector made up of the 11 unknowns of CAM in a vector form (element


(3,4) is set to one - see [12,13] for details) and T is a matrix whose elements are
function of the three-dimensional points p(x,y,z).
For evaluation of T we analyze an image of a grid of points, whose world
coordinates are known. By using the correspondence between the points in the 3-D
scene and the corresponding points in the 2-D image we can write a set of equations
using the relationship (2). This system can be solved using more points than those
227

strictly necessary, by using extra measurements to offset the noise effects or filter the
data. For this purpose a solution for T can be obtained by a least-squares estimation.
To obtain this solution all available points should be used in the computation of T
[11].
Experiments have been done to obtain an optimal estimation of T using a
recursive form of the least-squares estimation [14]. This interesting solution is possible
by exploring the capability of capturing several pictures in different positions of the
manipulator universe [15].

5. MANIPULATOR GRIPPER AND ASSOCIATED SENSORS

The gripper is pneumatically controlled using a pneumatic proportional valve


which allows continuous variation of the force applied to the handling objects. For that
purpose, we use force sensing resistors (FSRs) manufactured by Interlink which are
polymer thick film devices. These are piezoresitive devices exhibiting a decreasing
resistance with increasing force applied normal to the device surface. We use simple
FSRs as well as force and position sensing resistors (FPSRs) which measure both the
position and the amount of an applied force along a planar or curved surface. The
contact forces are sensed and based on the difference between the desired force and the
sensed force, a pulse-width-modulated (PWM) signal is generated to control the
proportional valve pressure. The analog to digital conversion is accomplished by a 12-
bit ADC converter with serial output. The next stage of the signal processing is
performed in a physically separated intelligent system which has interface registers,
the PWM circuit, and a microcontroller board with the serial interface Bitbus. A pair
of infrared LED and photodiode installed near the bottom of the fingers is used to
detect the presence of objects within the fingers. Additionaly, we have an ultrasonic
system and a wrist sensor attached to the last link of the manipulador. Presently we
are also developing a tactile sensing array using the FSR technology. The raw data
from all these sensors is driven to the Local Supervisor. The main purpose of this
sensing information is to be used as feedback in real time control strategies. In the
Fig.6 is shown the schematic of the Local Supervisor with the sensors and actuators
under its control.
The complete system exhibits four levels of hierarchy. The lowest-level is
composed by the transducers and actuators themselves. At the next higher level we
have the pre-processing systems which carry out the operations of conditioning,
filtering, analog-to-digital convertion, tactile array scanning and PWM signal
generator. The analog-to-digital convertions, except for the Astek wrist force sensor,
are performed in a similar way. We use 12-bit ADC converters with serial output
which allow the implementation of the complete process of analog-to-digital-to-serial-
to-parallel, from the pre-processing level to the micro-based processing level, to be
accomplished in a minimum time of 5 J.lSec. This level is connected to the next higher
level through digital and power links. This pre-processing modules were designed to
228

BilBus Interconned

I
BITBUS SI"'I

Aller
LOCAL
' - SUPERVISOR

RS2

BItB
Lin
.n
::"~ ---------------- -------- ---~---------
Ultrasonic Gripp.r Tactile
Processor Controller Processor

t.1
.nd .... ---_ .. --- ... --- ---- .. --_ .... -- -_ .. ---_ .. ---_ .. --_ .... ----
Po..• r
ASTEK
Lin k.
Force
Sensor
Pre-Processing Pre-Processing Pre-Processing PUMA
Controller
Ultrasonic
PWM WRIST
Ranginl Scanning
Aduator Sensor
Modules

On.
Out or
Three
/
Ullrasonic
.
Tunsducer.
IIPro
P•••••
/'
~.I¥e
licl
rlioaal Simple
~
FS R s
I ON/OFF
Sensors
I ""FSR Matri.1

Fig.6 . Local Supervisor - system set-up and information flow.

be placed on the manipulator arm, close to the transducers, in order to achieve better
noise immunity. At the next processing level operations such as the frame formatting
of the raw data and the servo-control loops are executed. This level is micro-based and
is connected to the Local Supervisor through serial links (RS232 and/or RS485). The
real time implementation of sensor-based control is made possible due to the
multiprocessor structure described and due to the use of digital signal processor
based boards [16] (based on a TMS320C30), at the Local Supervisor, in digital
filtering and in the implementation of control algorithms. This board has two
synchronous serial interfaces aJlowing its direct connection to lower-level processing
systems. These DSP boards also allow the real-time processing of vision if required.

5.1 Force and tactile sensing

In flexible assembly it is required that the robot comes into physical contact with
parts, and continuous force control is necessary. Compliance can be achieved by using
passive and/or active methods_ Passive methods can be implemented by using
mechanical elastic devices as the remote center compliance (RCC), or its enhanced
229

version the instrumented remote center compliance (IRCC) which also provides
displacement sensing [17]. Wrist force sensors for sensing contact forces are used in
active compliance. Most popular stiff wrist force sensors are of strain-gauge type but
other types have been developed (see [18] for an optical 6-axial force sensor). It has
been widely reported that using a stiff sensor in a stiff environment can cause stability
problems [19]. One solution for the problem might be to use a combined stiff-
compliant sensor as described in [20]. Another approach to solve the stability problem
using only a stiff sensor has been reported by c.H.An and J.Hollerback [21,22] relying
on a fast open-loop joint torque control, to provide stability, and a lower bandwidth
outer loop with wrist force sensing to provide extra accuracy. We are using a stiff
Astek FS6-120A six-axis force sensor of strain-gauge type. The sensor carries out the
conditioning process and formats the force information in the form of a RS-232 serial
data stream. The sensor resolves data into a specified orthogonal coordinate frame at a
measurement rate up to 240 Hz. Currently we are working on the development of
algorithms for force control by analysing the behavior of the manipulator system when
constrained by different stiffnesses. We use the real time path control interface
provided by the PUMA to modify the robot's trajectory. Simultaneously we measure
the forces applied against surfaces caracterized by different passive compliances (we
use springs with different stiffnesses). Additionally we measure the position of the
end-effector during the motion using proximity sensors.
With regards to tactile sensing we are developing a 256 element tactile sensor using
a piezoresistive thick-film polymer manufactured by Interlink which is referred as a
force sensing resistor matrix. The transducer provides a spatial resolution of 0.156 cm
with a sensing area of 2.5 cm x 2.5 cm. The pre-processing board has the scanning
array circuit which implements a pixel isolation technique as described in [23,24]. The
same board has the analog-to-digital convertion circuit with serial output.

5.2 Ultrasonic system

Non-contact sensing is important to many robotic applications. Seam tracking


for arc welding [25], path control [26], grasping [27], surface following and guarded
motions are examples of applications whieh require non-contact sensing feedback.
Among the various sensors which belong to this class are the ultrasonic sensors, whose
application in robotics has been considerable since they are inexpensive, they give
range information and also give some insight on surface orientation. In our
application, the goal is to obtain 3-D surface information and to achieve three-
dimensional path control using three ultrasonic sensors. Each sensor has an ultrasonic
ranging module with a single transmitter and receiver transducer. The transducers have
a resonant frequency of about 215 KHz and the ranging modules have two outputs, a
digital signal whose pulse width is directly proportional to the distance and an
amplified analog signal composed by all the echoes reflected. The processing of these
signals is carried out by two physically separated boards, as depicted in the Fig. 6. One
of them is a remote pre-processing board, one per each transducer, which is placed on
230

the manipulator close to the transducer. The other is a microcontroller based board
with serial interface. This board has the capacity of processing simultaneously data
from three remote modules.
The repetition rate of the trigger pulse applied to the ultrasonic ranging module,
which controls the ultrasound emissions, is programmable with discrete values
between 61 Hz and 400Hz. This capability allows for variable triggering as a function
of ranges of distances. This is of great importance when real-time performance is
required. The analog signal is rectified and converted to digital using a 12-bit ADC
with serial output. At the micro-based board this signal is converted to parallel and
stored in a FIFO memory. The disposal of this memory permits the parallel
processing up to three analog signals with high sampling rates. The pulse width is
measured with 12-bit resolution.by an interval counter at the micro-based board.
Geometrical aspects related to the use of the ultrasonic system in tasks such as
the evaluation of planar-surface equations and manipulator surface following were
described in [15].

4.3 Real-time path control interface system

For the real-time path control communications we have developed a PC-AT bus
board based on a microcontroller of the Intel MCS51 family. It has serial interfaces,
RS232 and RS485, allowing its use as a node in a Bitbus network. The memory
addressing space of the board is also mapped into the memory space of the PC-AT.
Besides this facility for downloading programmes and data, the module has a
multiprocessor interface device which provides a parallel, asynchronous
communications channel for passing messages and data between the two processor
systems. This circuit implements a shared memory space with logic to resolve access
conflicts, that is it implements the mutual exclusion logic.
The software which implements the real-time path control communication is
divided into two physically separated modules. One module is written in PL/M-51 and
lies in the previously described board. This programme was designed to maintain the
real-time communications in a running state overcoming the tight demands of time
imposed by the VAL-II system. It consists on the lowest-level protocol layer
discharging the PC for the processing of application software. The second software
module is a PC resident programme performing the interface between the lowest-level
layer and the application software. From the user point of view this module appears as
a set of procedures which allow him to control the robot's trajectory in real time. Once
the normal communications state is achieved (the running state) the application
programme has accessible data concerning the state of the communication and the set-
point (or current-location) of the end-effector. Additionally, this interface is in charge
of sending control data to define/modify the robot's trajectory. The exchange data rate
is fixed at the value of 36 Hz. To provide for the task's synchronization the interface
can trigger synchronous interrupts in respect with the start of reception of each
message from VAL-II.
231

In order to design algorithms for sensor-based control geometrical task


constraints and dynamic constraints must be taken into account. With regards to the
dynamic constraints, an accurate model of the real time behavior of all the components
in use including the system manipulator, the external control system, sensors and
materials being handled, must be obtained. For the identification of the manipulator
system we have followed an approach of considering it as a black blox problem. Thus,
it was only considered the input/output relationship without using any knowledge of
the dynamics of the manipulator structure or of the behavior of its controller
processing system. A set of experiments was performed in order to obtain its transfer
function [28]. Based on the results of the experiments we have analyzed some models
including the model proposed by Wampler [29] and another model described by a
single delay term. Both the models aim only at describing the behavior of the system
within a narrow bandwidth, that is for low velocities and for the robot moving in free
space.

6. SENSOR DATA INTEGRATION

In general, a single sensor only provides partial and often noisy measurements
from the environment. To obtain enough information for complex tasks characterized
by high degree of autonomy and flexibility the use of a set of sensors forming a multi-
sensor system is required. It is evident that the combination of several sensor data
produces a richer and more reliable perception of the environment. There are some
types of information which can only be derived from the combination of several
sensor data, not being available from a single sensor source. Data combination can
also reduce errors generated by inaccurate sensor data interpretation models or by
sensors working beyond their accuracy range. Furthermore, multiple sensors can
complement each other to provide better understanding of the scene, and the
redundancy of information allows the use multi-sources to achieve better accuracy.
The development of sensor data integration techniques has relatively short
history but the research on the subject has been very intensive [30].
The multi-sensor integration techniques under development are intended to build
the high-resolution map as described previously. To achieve such goal we use two
different forms of interaction between the data supplied by the sensors: competitive
integration and complementary integration. The competitive integration occurs when
two or more sensors supply information of the same features from the same universe.
In this case, the aim is to get more accurate information by utilizing the redundant
information. The complementary integration occurs when the information provided by
two or more sensors deals with different features from the universe. The combined
information is composed of several non-overlapping information pieces coming from
different sources.
The competitive integration technique is explored on the information obtained
from the vision system. More precisely, techniques such as stereovision, motion and
232

focus are used independently to obtain information about the three dimensional
structure of the universe. The information that it is in search are points and lines.
These techniques provide more accurate information about those three-dimensional
primitives.
The complementary integration is used to enrich the information of the high-
resolution map, as in case of using ultrasonic and force sensors which give information
about surfaces and stiffness of the objects in the universe.
The study of data integration techniques, known as cooperative integration,
which uses the information given by different sensors to infer features, are left for
future research. Namely, we expect to explore the combination of ultrasonic and vision
data to obtain three-dimensional primitives such as planes and curved surfaces.
For the competitive data integralion we have adopted a probabilistic approach,
describing the uncertainty in each geometric feature as a probabilistic Gaussian
distribution [31,32]. As it has been pointed out this approach forms a valid basis
framework for the formulation of mathematical tools in order to describe and to
manipulate spatial uncertainty.

7. SOFTWARE CONSIDERATIONS

The support software of the overall distributed system under development is


described by the diagram of the Fig.7. In a symplified way the following functional
layers can be outlined:
(i) A Data Link which implements a subset of the synchronous data link control
protocol (SDLC);
(ii) A Transaction layer which defines a message format that assures the
transparency in the exchange of information between the master and slave tasks;
(iii) A Cell application support layer which implements several interface procedures
between the user programmes and the low-level procedures distributed throughout the
robotic system;
(iv) The application layer which corresponds to the user programmes and
applications. This layer includes utility software such as: development tools; software
for file transfer and configuration of the robotic system.
The first two layers follow the Bitbus specifications, and are assured by
firmware (the Intel 8044 Bitbus-Enhanced-Microcontroller). The application support
layer aims at performing the interface between the user programmes and the
distributed system devices. From the user point of view, this layer should appear as a
set of procedures allowing him to get the state of the robotic system and to initiate
remote actions in a transparent way (without knowledge of the physical configuration
of the cell). For that purpose, a data base having information about the physical
configuration and the system dynamic status is being specified. The bottom of this
layer is composed by the frame processor. It consists on the division of a message into
frames of fixed lenght of 13 bytes each. Each frame is transmitted by following an
233

SETUP of CELL
~TON
USER PROGRAM

CELL APPLICATION
SUPPORT

DATA LINK

Fig. 7. Software structure diagram.

order/reply structure (that is, a message in the Bitbus context). The connections to the
manipulator controller are assured by two channels whose processing is performed in
different slave units, as shown in Fig.1. The supervisory communications which use
the digital data communications message protocol (OOCMP) are assured by the
OOCMP Interface Unit while the real-time path control communications are
controlled by the Local Supervisor as described previously. Both communications
software belong to the Cell application support layer of the hierarchical control
system. Still at this layer we are developing functions which can be grouped in two
main classes: 1) Software which uses the remote access and control functions (RAC
functions of the Bitbus [5]); 2) Software that implements control operations at the
sensor and actuator level. As examples of those operations are the initialization of a
sensor processing module, request of raw data from a certain sensor, activation of a
fine motion and a command to trigger a MOVE action by the PUMA robot.

8. CONCLUSIONS

A distributed system has been described which aims to support the multi-sensor
integration in a robotic environment. For the communications, a high-speed serial
network was chosen exploiting its features of low cost, easy implementation and good
performance on short message transfers. On the other hand, slaves featuring high
capacity of data processing to enable real-time sensor-based control where developed.
The integration of several types of sensors around the manipulator allows the
estimation of a multi-level world model and thus enables the evaluation of
sophisticated assembly tasks. The development of algorithms for data integration and
data representation are under research. The aim is to build descriptions of the three-
dimensional world using information from various sensors - complementary and
234

redundant information. At another level, data processing methods and control


strategies for the implementation of fine motions are being investigated. Force, tactile
and proximity sensors are the information being used at this level. One of the most
serious problems we face in the implementation of accurate fine motions is due to the
integral use of the VAL II controller. This controller is very inflexible allowing only
the implementation of limited path modification strategies.

9. REFERENCES

[1] P.M.Taylor, G.E.Taylor, I.Halleron and X.K.Song, 'Analysis of a probabilistic


framework for an intelligent workcell', Nato Asi Series, Vol F.64 on Sensory robotics
for the handling of limp materials,P.M.Taylor Ed, Springer-Verlag, 1990, pp.275-297.
[2] B.R.Donald, Error detection and recovery in robotics, Springer-Verlag, 1989.
[3] AGoldenberg and L.Chan, 'An approach to real-time control of robots in task
space. Application to control of PUMA 560 without VAL-II', IEEE Trans. Ind.
Electron. 35, (2) (May 1988) 231-238.
[4] B.E.Shimano, 'A robot programming system incorporating real-time and
supervisory control: VAL-II', Proc.Robots 8, 1984.
[5] The Bitbus interconnect serial control bus specification, in: Distributed Control
Modules Databook, Intel Corporation, 1984, pp.100-147.
[6] T.Lozano-Perez,'A simple motion-planning algorithm for general robot
manipulators', IEEE Journ.of RobAutom., RA-3, (3) (June 1987) 224-238.
[7] L. Sa, J. M. Dias and V. Silva, 'A modular approach to image processing through
the VMEbus', I VMEbus Research Conference, Zurich, 1988.
[8] V. Silva and L.Sa, 'ACTIVE - Sistema para aquisicsiio em computador e
transmissiio de imagens video', 3/l Simposio de Electronica das Telecomunicar;6es,
Porto, 4-6 May 1988 (in Portuguese).
[9] J.M.Dias and L.Sa, 'Modulo de processamento digital de imagem com
microprocessador de sinal', 3/l Simposio de Electronica das Telecomunicar;6es, Porto,
4-6 May 1988 (in Portuguese).
[10] J. P. Batista, J. Dias, H. Araujo and A Tracsa de Almeida, 'Biblioteca de
Program as para Processamento de imagem' 3/l Simposio de Electronica das
Telecomunicar;oes, Porto, 4-6 May 1988 (in Portuguese).
[11] J. P. Batista, J. Dias, H. Araujo and A Tracsa de Almeida, 'Analise quantitativa
dos metodos de calibracsiio de camaras de video', 3/l Encontro Portugues de
Reconhecimento de Padroes,Aveiro, Feb. 1991 (in Portuguese).
[12] G.Toscani, 'Systeme de Calibration optique et perception du mouvement en
Vision artificielle', PhD thesis, Paris-Orsay, 1987.
[13] O.D.Faugeras and G.Toscani, 'The calibration problem for stereo', IEEE Proc.
CVPR' 86, Miami Beach, FI., 1986, pp. 15-20.
[14] J.M. Mendel, 'Lessons in Digital Estimation Theory', Prentice-Hall, 1987.
235

[15] U.Nunes, J.Dias and AT.Almeida, 'Geometric modeling aspects on vision and
ultrasonic range sensing for robotic applications', NATO ASI on Expert Systems and
Robotics, Greece, July 1990.
[16] J.Silvestre, AMendes, L.Sa and V.Silva, 'Frame Grabber com capacidade de
processamento para PC-AT', Internal Technical Report, DEE-University of Coimbra,
1990 (in Portuguese).
[17] y.xu, R.P.Paul and P.Corke, 'Hybrid position force control of robot manipulator
with an instrumented compliant wrist', First Int.Symp. on Experimental Robotics I,
Springer-Verlag, Eds. V.Hayward and O.Khatib, 1990, pp.244-270.
[18] S.Hirose and K. Yoneda, 'Development of optical 6-axial force sensor and its
signal calibration considering non-linear interference'JEEE Int.Conf.RobAutom,
1990, pp.46-53.
[19] R.P.Paul,'Problems and research issues associated with the hybrid control of force
and displacement', IEEE Int.Conf.RobAutom., 1987, pp.1966-1971.
[20] J.Dietrich, G.Hirzinger, B.Gombert and J.Schottet, 'On a unified concept for a
new generation of light-weight robots', First Int.Symp. on Experimental Robotics I,
Springer-Verlag, Eds. V.Hayward and O.Khatib, 1990, pp.287-303.
[21] C.H.An and J.Hollerback, 'Dynamic stability issues in force control of
manipulators', IEEE Int.Conf.RobAutom., 1987, pp.890-896.
[22] C.H.An, C.Atkeson and J.Hollerback, 'Model-based control of a direct drive arm,
PartlI:Control', IEEE Int.Conf.RobAutom., 1988, pp.1386-1391.
[23] T.H Speeter, 'A tactile sensing system for robotic manipulation', Int.Journ.Rob.
Research, 9 (6) (December 1990) 25-36.
[24] B.Tise, 'A compact high resolution piezoresistive digital tactile sensor', IEEE
Int.Conf.RobAutom., 1988, pp.760-764.
[25] B.Maqueira, C.Umeagukwu and J.Jarzynski, 'Application of ultrasonic sensors to
robotic seam tracking', IEEE Trans.Robot.autom, 5 (June 1989) 337-343.
[26] K.Furuta and M.Sampei, 'Path control of a three-dimensional linear motional
mechanical system using laser', IEEE Trans.lnd.Electron., 35 (1) (Feb. 1988) 52-59.
[27] B.Espiau and J.Y.Catros, 'Use of optical reflectance in robotics applications',
IEEE Trans.syst.Man Cybern.l0 (12) (December 1980) 903-912.
[28] U.Nunes, P.Faia, R.Araujo and AT.Almeida, 'Integrac;ao de informac;ao de
proximidade numa celula rob6tica de montagem', Proc. RECPAD 91, Portugal,1991
(in Portuguese).
[29] C.Wampler, 'Multiprocessor control of a telemanipulator with spatial proximity
sensors', Int.Journ.Rob.Research, 6 (1) (Spring 1984) 40-50.
[30] R.C. Luo and M.G.Kay, , Multisensor Integration and Fusion in Intelligent
Systems', IEEE Trans. on Sys, Man, and Cyb., 19 (5), (Sep./Oct. 1989),901-931.
[31] H.F.Durrant-Whyte, 'Integrated, coordenation and control of multi-sensor robot
system', Kluwer Acad.Publ., 1988.
[32] R. Smith, M. Self, and P. Cheeseman, 'Estimating Uncertainty Spatial
Relationships in Robotics', in Autonomous Robot Vehicles, I.J. Cox & G.T. Wilfong
Eds, Springer-Verlag, 1990.
CHAPTER 10

NEURAL NETWORKS AND ROBOT VISION

Erik H. D'Hollander
State University of Ghent
Department of Electrical Engineering
St.-Pietersnieuwstraat 41, B-9000 Ghent. Belgium

Abstract
Traditionally robot vision and robot control applications are written in imperative
languages, such as Fortran, Pascal, Modula or C. Since the development of Artificial
Intelligence techniques, specialized robot languages emerged. The upper layers of robot
languages are programmed in declarative languages, such as Lisp and Prolog. Recently,
neural computing has become a third exciting programming alternative which comple-
ments the other programming styles. Neural programming paradigms apply to both
robot vision and robot control. In this chapter the basic neural network techniques
applicable to robot vision are explored. Among these are pattern recognition, blurred
image restoration and unsupervised classification. It is likely that a judicious blend
of these techniques will be used in complex scene analysis. Neural networks are char-
acterized by an interconnection structure and a learning rule. These offer a wealth
of combinations for each particular application. Therefore the major neural network
models are analyzed and compared.

1 Neural Networks: what's in a name?


Neural computing employs models of elementary brain functions, the neurons, as problem
solving building blocks. A neural network typically consists of several layers which are
uniformly interconnected. The data of the problem are fed into the input units and this
causes neural signals to propagate dynamically through the network until a steady state is
reached. At that time the result is obtained from the output neurons or from the internal
state of the network. In a computerized neural network, the basic components are modeled
as follows (fig. 1).

• The neurons are processing units, U;.

• The state of a unit i is determined by the activation value, a;, a function of the
weighted sum of the input values, the net input, neti. The output value, 0i, is a
function of the activation ai.

• A weight matrix, W, with weights Wi; indicates the connection strength from neuron
j to neuron i. Hence row i contains the connection strengths at the synapses of neuron
i, coming from the neurons i
237
238

Oi =f(ai}

ai = F(neti}

Figure 1: Basic components of a neural unit

• The net input of the activation function, net;, equals the weighted sum of the outputs
OJ of the neurons connected to neuron i.

• The activation function a; = F(net;} is a non-decreasing function of its net input


and determines the neuron's internal state.

• The output function 0; = f(a;} calculates the effect the current activation will have
on the other units. Typical examples are a linear function, a threshold function, or
a probability distribution, determining the likelihood that the output will be either 0
or 1.

A single neural unit therefore is governed by the following equations:

net; L W;jOj
j

act; F(net;)
0; f{act;)

2 Neural Networks Operation


A neural network can be trained to recognize a relation between input and output patterns.
In the learning phase the network adapts its weights according to some learning rule. In
a supervised learning scheme, a set of input and target patterns is repeatedly presented
to the network. The learning algorithm modifies the weights in each cycle, according to
a criterion function which minimizes the error between the output pattern and the target
pattern. Eventually the weights stabilize when the error function is minimized. In an
unsupervised learning scheme, there are no target patterns. In fact, the input patterns
are recognized and classified autonomously according to some distance function. In this
case, a global minimization will organize the samples into clusters of matching similarity.
After training, the network can solve the following type of problems.

PATTERN CLASSIFICATION. A new pattern can be classified according to the proper-


ties in the learning patterns. When the pattern coordinates takes only binary values
and the number of classes is two, the network is called a perceptron.
239

PATTERN ASSOCIATION. Given a new pattern, the system associates the pattern with
a template that was stored in the training phase. The template most closely matches
the presentEld pattern. This phenomenon is known as the associative memory
property.

AUTO ASSOCIATION. In the training phase, a set of correct patterns is presented. This
corresponds to a pattern classifier in which the output and the input patterns are the
same in the training phase. After training, the network is capable to complete in-
complete or distorted patterns, based on the similarity with the prototype patterns
stored in the network.

CONSTRAINT SATISFACTION. In this paradigm, the nodes represent facts and the
activation of a node indicates the presence or the absence of a fact. Input nodes de-
scribe the known (observed) facts in the external world. During the learning process,
the network modifies its internal state as to be the most consistent with the external
world. Apart from external influences, the nodes affect each other through the fixed
weight connections. The weights between two units indicate the degree to which two
facts can coexist. The global quality is measured by a goodness value, E;j w;ja;aj,
and the activations are modified to maximize this expression. There is an analogy
with expert systems, because implicitly the network contains a knowledge base, or
more precisely, a relation between knowledge facts. Given a set of inputs patterns,
the network settles down into a state most feasible with the presented input patterns.
Therefore the constraint satisfaction model can be used e.g., for rule detection or
hypothesis testing. Learning paradigms for this type of network include the Harmony
theory [14], the Boltzmann machine [6] and the Schema model [13].

COMPETITIVE LEARNING. This is a form of pattern classification by unsupervised


learning [12]. The elements of an input pattern represent the presence (1) or absence
(0) of features. The next higher layer contains n units, and during the training, the
input patterns are grouped into n classes where each unit represents a particular class.
The learning is competitive in the sense that only the unit with the highest activation
is allowed to fire and to modify the weights in the network. As a consequence, the
winning unit improves the future recognition of similar patterns. In this way the
intra-class distance becomes minimal and the inter-class distance maximal.

In the following sections the learning mechanisms of these models are explored. The
examples are programmed using the McLelland and Rumelhart PC-neural network software
[9].

3 The Linear Pattern Associator


We first consider the simplest form of a neural network and relate its paradigm to two well
known mathematical problems.
240

Figure 2: The linear pattern associator

3.1 The Linear Case


Derivation of the Delta Learning Rule
Consider one layer of ni input units, in which each unit is connected to no output units.
The output of the network, Op = [i l ... inot, is a linear function of the input pattern
Ip = [i 1 ... in.]T, (fig. 2),

ni
0, =L w'jij. (1)
j=l

The weight W,j equals the strength of the connection from input j to output i. In the
training phase, the weights of this system are adjusted in order to reproduce a given set of
np input/target pattern pairs, Ip, Tp, p = 1, ... , np. Therefore the learning rule minimizes
the difference vector Ep = Tp - Op or

Ep = 1/2L(t, - 0;)2. (2)


Vi
Applying the steepest descent rule, one finds

8Ep
-f--
8w'j
8Ep 80,
-f----
80, 8w'j'
which yields the linear delta rule

(3)

Here t, and 0i represent the i - th coordinate of the target and the input patterns Tp and
Ip, and f is the learning rate parameter. The delta rule is applied to the np input patterns
until 2:;~1 Ep is minimized. In the linear network model, the Delta Rule is guaranteed to
converge to the unique set of weights which minimizes the quadratic error function (eq. 2).
241

3.2 The LPA, the Pseudo-Inverse and the Multiple Linear Regression
Analysis
The linear pattern associator (LPA) relates to two well known mathematical problems.
First, consider the input and target matricE~s Inixnp = [h ... Inp ] and Tnoxnp = [Tl ... Tnp],
where Ii and Ti are input and target pattern column vectors, with ni and no elements
respectively. The learning problem then comes down to solving the matrix equation

WI=T (4)
for the weight matrix W noxni.
When I is square, i.e. np = ni, and the input patterns are linearly independent, the
exact solution is found by a matrix inversion. However, in the general case p =I ni, and
then an approximate solution is calculated using the pseudo-inverse:

(5)
Kohonen [7] has shown that this formula yields the same weights as those obtained by the
delta rule. However, the closed formula (eq. 5) often is more efficient than the iterative
delta rule (eq. 3) .
By another approach, equation (1) can be considered as a multiple linear regression
problem, which predicts 0i from a weighted combination of the input variables i;. The delta
rule solves n linear regression problems and the row i of the weight matrix represent the
regression coefficients for unit i. Hence there is an evident isomorphism between the linear
neural network and two well known problems from the linear algebra and the statistics.

3.3 Example: the 7-Segment Display


A simple problem is to recognize the digits formed by a 7-segment display, available as a
readout on many electronic devices. The- input patterns are represented by a binary vector
with 7 elements (a-f in fig. 3). The output is a 10 element binary vector, indicating the
recognized number.
The target matrix is a lO-dimensional identity matrix, E, while the input patterns are
stored columnwise in the input matrix hxlO,

0 1 1 0 1 0 1 1 1 1
0 0 0 1 1 1 0 1 1 1
1 1 1 1 0 0 1 1 1 1
T lOx10 = WlOx7 0 1 1 1 1 1 1 1 1 0
0 1 0 0 0 1 0 1 0 1
1 0 1 1 1 1 1 1 1 1
0 1 1 0 1 1 0 1 0 1

Clearly, this is an overdetermined linear system. The best weights are found by the
delta rule or by the pseudo-inverse. The delta rule minimizes the total sum of squares
tss = liT - W Iii. After 1000 steps the total sum of squares of the errors over the 10
patterns for each unit, gives tss = 3.0009. However, an exact calculation using the pseudo-
inverse gives tss = 3.0000, and this is calculated in much less time. The output after
242

0
0
a 1
a 1

7
b 0
0
c 1

-Of
b£d C 0
d 1
0
e 0 0
f 0
9
9 0
0

"3"
• Input
• Output

Figure 3: A 7-segment display pattern associator

training looks as follows:

62 -19 23 15 -23 4 15 -3 -11 18


-19 80 18 1 -18 -2 1 24 -4 -1
23 18 72 5 28 -10 5 -4 -14 -5
15 1 5 80 -5 6 -20 6 26 -14
-23 -18 28 -5 72 10 -5 4 14 5
O=W!=
4 -2 -10 6 10 88 6 20 -18 -6
15 1 5 -20 -5 6 80 6 26 -14
-3 24 -4 6 4 20 6 29 14 27
-11 -4 -14 26 14 -18 26 14 56 8
18 -1 -5 -14 5 -6 -14 27 8 80

Since the target is a unity matrix, the output of the network is poor. Nevertheless,
a nonlinear transformation, selecting the maximal element in each column, generates the
right answer.

4 The Nonlinear Multilayered Pattern Associator


4.1 The limitations of linear networks, the necessity of nonlinear acti-
vation functions and the use of Hidden Units
Linear networks are inadequate because they use only one layer of neurons to store patterns.
However, multilayer linear networks reduce to a single layer network, due to the associative
243

aj aj

bias bias
o o act.
1/ /
1
netj netj
Threshold function Quasi-linear function

P(aj = 1)

bias
o /
netj
Probability function
Figure 4: Commonly used nonlinear activation functions

property of linear systems. For example, suppose that the output of one layer is fed into a
second linear network. Then the input-output relation of this two layer system is described
by the equations 0 = W 2 (W 1 I) = WI, where W = W 2W 1 represents the weight matrix
of the equivalent one layer system. This limitation is called the linear predictability
constraint.
The only way to increase the functionality of neural nets is to introduce nonlinearities
in the activation function. To keep the analogy with a biological neuron, the nonlinear
activation must preserve the significance of the excitatory and inhibitory connections. This
implies that an increase of the of an excitatory input causes an increased output, while a
decrease results in a decreased output. Therefore the activation function must be uniformly
non-decreasing for all inputs. Only then the excitatory or inhibitory character of the con-
nections are maintained. The most common non-decreasing activation functions are (see
figure 4):
- a linear threshold function. The binary output is 1 when the net input exceeds a
certain limit, otherwise it is o. A variant of this scheme is to add a bias-term to the
net input of each unit and let the unit fire if the result becomes positive. The biases
244

are equivalent to weights of input units which are always on and they can be trained
as ordinary weights.

- a quasi-linear function. The net input is fed into a nonlinear squashing function, which
maps into the interval [0,1]. The squashing function is typically a non-decreasing
sigmoid, such as
1
f(x) = 1 + e-z/T
This function divides the net input into three regions: a region of 'low activation'
(x « T), where the input is not significant enough to fire the unit, a 'saturation
region' (x » T) which limits the activation, and a 'center region' where the output is
quasi-linear in function of the input. With this division, the unit conforms to many
physical systems that have a limited dynamic range.

- a probability function. The output takes the value 0 or 1 depending on a nonlinear


probability function of the input.

The 7-segment display network using a linear threshold activation function


The maximum value in each column of the target matrix T corresponds to the single
unit which should be one. It appears that these values are also the maxima in each row.
Consequently, there exist threshold values which can serve as biases in a linear threshold
activation function. In fact, when biases are added to the 7-segments network, the linear
delta rule finds the solution based on a linear threshold activation in 53 cycles.

4.2 Multilayer Nonlinear Networks


In principle there always exists a nonlinear function which maps a finite number of input
patterns to matching output patterns. However, there are two constraints which limit the
functionality of the neural model.

1. The activation function has only one input, net;, which is a weighted sum of the
inputs i j . Consequently it is impossible to distinguish the activity of the individual
units.

2. The activation a; = F(net;) is monotonically non-decreasing. This precludes activa-


tions having an extremum in the input net;.

These limitations - which led to the criticism of Minsky and Papert [11], can be overcome
by adding one or more layers between the input and the output layer. These hidden
units are capable to extract additional features, calculate intermediate results and generate
extrema. Together with the nonlinear activation, the additional layers enhance the network
performance. The weights of this network are trained by the back-propagation algorithm.

4.3 The Back Propagation Algorithm


In order to train the weights of the hidden units, the delta rule is generalized as follows.
The net input of a unit is a weighted sum of the activations in the previous layer:
245
n

net; =: L W;jOj (6)


j=1

where OJ represents the output of the units in the upstream layer. We have OJ = i; if
the unit is connected to input units, otherwise OJ equals the output of a hidden unit. The
output is calculated by a nonlinear activation function f,

(7)
The delta rule minimizes the sum of squares Ep of the differences between the training
patterns and the output patterns of the network (equation 2). By the steepest descent rule,
the weights are changed according to the derivative of the error function:

or using the chain rule,

8Ep 8nd;
L:::..w;j = - ·8net;
f----
8w;j
(8)

According to equation (6) the last partial derivative is

Now define the first derivative factor in (eq. 8):

8; = -liEp /8net;. (9)


Again applying the chain rule, one obtains

_ 8Ep
8net;
__ 8Ep 80;
(10)
80; 8net;
Substituting the 8-values into equation (8) leads to the generalized delta rule

L:::..w;j = f8;oj.
The variable 8; (equation 10) is determined using equation (7), 80;/ 8net;
Now, consider two cases

a) the unit is an output unit


then

as in the linear case,


246

Ok

{Wij} Ok'
{OJ} -----ftJ

hidden unit Ok"

downstream layer

Figure 5: Connections of a hidden unit

and the t5-factor (eq. 10) for the output layer becomes

15; = (t; - o;)f'(net;) (12)

b) the unit is a hidden unit (see figure 5)


Then the hidden unit is related to the next layer, by introducing the effect of OJ on
the net inputs netk of the units in the following layer downstream the network, using
the chain rule:

t5Ep/t5o; = t t5Ep t5netk


k=lt5netk 150;
(13)

where k refers to the units in the next layer. Now, by definition, t5Ep/t5netk = -I5Ie,
and t5netJe/t5o; = WIe;, (eq. 6). Thus equation (13) becomes
n
t5Ep/t5o; =L -t5kwk;.
1e=1

Using equation (10), this leads to the t5-factor for the hidden units:

n
15; = t'(net;) L I5le W/ci. (14)
k=l
247

The algorithm is summarized as follows.

BACK PROPAGATION PROCEDURE

1. Present an input pattern, i, and let it propagate through the network, to yield the
output pattern, o.

2. Apply the generalized delta rule (eqn. 11), in combination with equation (12) for the
output layer.

3. Apply the generalized delta rule (eqn. 11) in combination with equation (14) for the
hidden layers, using the 8 values from the previous step.

4.4 Selecting an Activation Function


Both equations (12) and (14) make use of the derivative of the activation function, 0; =
f(net;). Therefore a non-differentiable function such as the linear threshold function is
excluded. However if one uses the squashing function
1
f(x) =
1 + e- z '
the derivative becomes quite simple:

f'(x) = f(x)(1 - f(x»,


and this yields the 8-values (equations 12 and 14):

8; = (t; - 0;)0;(1 - 0;) (15)


for the output layer and
n
8; = 0;(1 - o;} L 8k Wk; (16)
k=l
for the hidden layers.

4.5 The XOR problem with hiddE!D units


In order to illustrate the backpropagation algorithm, we consider one of the simplest input-
output patterns relations which cannot be learned by a linear single layered network. Con-
sider a linear network with 2 input units and 1 output unit (fig. 6-a).
The corresponding weight equations are shown in Table 1. Clearly, the weights cannot be
1, while their sum equals o. Moreover, there exists no nonlinear, non-decreasing activation
function of one input net; = w1i l + w2i2 to yield the XOR function.
Since it is desirable to maintain a non-decreasing activation function, such that a stronger
excitation delivers a stronger output, it is necessary to introduce a layer of additional units
between the input and the output.
Therefore a network with one hidden unit is considered (fig. 6-b). The output unit is
connected to both input units and to the hidden unit. Starting with random weights and
248

out

out

a) b)

Figure 6: a) The linear XOR network, b) The XOR network with 1 hidden unit, h, and 2
bias terms, bo and bh . The output equals out = Wlil + W2i2 + WhOh + boo The bias values
are trained as ordinary weights.

'I '2 out out = Wlil + W2i2


0 0 0 0=0
0 1 1 1 = W2
1 0 1 1 = WI
1 1 0 0= WI + W2

Table 1: Weight equations for the linear XOR network.


249

5 .-----------------------______~
~----
0.4 -r------------,
,// Wl

Wh (A)

Cycles [1-100]

Cycles [1-332]

Figure 7: Weights evolution during the training phase

using the back propagation algorithm just described, the weights are trained in 332 cycles
(fig. 7). Three learning phases are observed.

1. In the first 100 cycles, the system is squeezing the weights to the value of zero. This is
the minimal state of energy, giving the same output of 0.5 for all input combinations.

2. Next, the hidden unit begins to recognize the special condition i1 = i2 = 1 and adjust
its weights in opposite directions. In this way the double one leads to a zero input.
However, the hidden unit still produc:es a positive output, due to the positive bias,
bh . Since the weight Wh is negative, the hidden unit acts as inhibitor to the positive
inputs il, i 2 • The same inhibiting activation results if both inputs are zero.

3. At a slower pace the other weights settle down to cover the two remaining states.

4. When the weights stabilize around c:ycle 300, the single excitatory weight to the
output, WI, ensures a 1 when the state [il i 2]=[1 0]. Finally, the state [0 1] inhibits
the output and the hidden unit. However, the strong positive bias of the output unit
ensures that a 1 is generated.

5 The Auto Associator


S.l Learning with Feedback
The pattern associator recognizes and classifies trained patterns. However, in realistic
robot vision applications, the input pattern is often noisy or incomplete. In that case, we
would like to reconstruct the correct pattern. The auto associator tries to achieve this by
250

-
-
Inputs Outputs

-
Feedback

Figure 8: Auto associ at or network. The output is fed back and becomes an internal input
to the network

connecting the output back to the input units. In this way the network is fed by a weighted
sum of the external input and the output of the network in the previous cycle (fig. 8).
The outputs are not connected to the same inputs, since the network needs to develop
inter-pattern relations. In the learning phase, an input pattern is presented to the network
and after a fixed number of feedback cycles, the output is compared with the external input
pattern. The differences are accumulated in the error vector. Likewise the other training
patterns are successively clamped to the input units. When all patterns are presented,
the weights are adapted according to the delta rule and a new training epoch starts. The
training ends after the total sum of squares of the errors is minimized and the weights are
stable.
In a completion test, a distorted or incomplete pattern is presented. The network runs the
same number of cycles as in the training period, or more. Usually, the internal input pattern
which is fed back from the output will gradually override the incomplete external pattern.
In the end the output converges to one of the learned patterns. When the external input
pattern is too erroneous, the auto associator ends up in a local minimum corresponding to
an unknown pattern.

5.2 A 7-segment display Auto Associator


As an example a single layer network with 10 units was trained to memorize the patterns
of a 7-segment display. The outputs are clipped at the maximum value 1. This prevents
the activations to blow-up due to a positive feedback, a typical problem with the auto
associator. _The clipping auto associator network is called a 'Brain State in the Box' [1].
During the feedback cycles t, the external and internal inputs, extinput and intinput are
combined by the formula:

o;(t + 1) = (decay)o;(t) + (estr)extinput; + (istr)intinput;(t)


251

Correct Defect Segment Good Error


Symbol a b c d e f g
1 X X 0 2
2 2 6 2 3 2 3 2
3 3 5 3 3 9 3 2
4 X X 4 4 2 2
5 5 5 X 5 5 9 4 1
6 6 6 5 6 6 4 1
7 7 7 7 7 4 0
8 8 8 8 8 8 8 8 7 0
9 9 9 9 9 5 0
0 0 0 0 0 X 0 5 1
Totals 37 11

Table 2: Pattern completion test with the Auto-Associator. Learning and completion
phases uses 25 and 100 feedback cycles respectively. The parameters used are estr = istr =
decay = .15, f = .05. After learning, total sum of squares tss = 3.0545. Removal of one leg
in the 7-segment digits still results in a 77% hit-rate.

where estr and istr are the strength factors for the external and the feedback inputs. The
activation is gradually built up, where the decay parameter « 1) acts as an attenuator.
After settlement of the output, the weights are adapted according to a formula similar
to the delta rule:

~weighti(t) = f(extinputi)(extinputi - intinput;)

Reconstructing incomplete 7-segment numbers


The training phase requires several hundred iterations to recognize the patterns repre-
senting the digits 0 to 9. In the completion test, incomplete patterns are composed by
removing one segment from the valid patterns. This corresponds with an average segment
reduction of 23% and makes several digits unrecognizable. Nevertheless, as is shown in
Table 2, the associ at or is still able to reconstruct 77% of the incomplete patterns.

6 Constraint Satisfaction Network


6.1 Problem solving by interactive activation
Scene analysis is a challenging task facing general purpose robot systems. Even with sophis-
ticated pattern recognition and reconstruction techniques, it is likely that isolated objects
in a scene are wrongly classified. Therefore it is useful to classify the objects based on the
environmental context. For example the error rate of character recognition is significantly
reduced by spelling checking. In scene analysis a similar scheme can be used, when objects
are grouped into consistent clusters. In this way a chair and a table fit together, while a
table and a car belong to different worlds. In a constraint satisfaction neural network used
for scene analysis, the nodes represent objects and the weights between two nodes represent
252

the likelihood that these nodes appear in the same context. Negative weights indicate that
the objects normally do not occur in the same scene, while positive weights indicate they
do.
A vision system recognizes the individual objects and activates the corresponding nodes
via the external input input;. Then all nodes interact to maximize the global feasibility of
the network. The global goodness accumulates the local goodnesses of each unit:

goodness = Lgoodness;.

The goodness of each unit is


goodness; = net;a;.
where the net input of unit i is given by

net; =L w;jaj + input; + bias;.

Besides the activation from the other units, aj, there is an external input and a bias term.
The bias value bias; represents the a priori knowledge that unit i should be on or off. The
net input informs the unit that its activation should be increased or decreased to improve
its local goodness. A possible update rule which increases the value of goodness; and keeps
the activation within the interval [0,1] is:

a;(t + 1) = a;(t) + net;(1 - a;(t))

if the net; value is positive and

a;(t + 1) = a;(t) + net;a;(t)


if the net; value is negative.
Here t represents the update cycle number. The constraint satisfaction therefore implies
that after several updates the network converges to the most feasible global state.

6.2 Local Maxima and Simulated Annealing


It is well known that in the constraint satisfaction models, often the network slips into a
local maximum. Indeed, when a unit tries to improve its own goodness value, it is possible
that it forms a coalition with other strongly connected units. By the time these units should
change independently, their mutual interaction cannot be disrupted. In order to avoid a
definitive commitment of a unit in the early stages of the update process, the activation is
made binary and update rule becomes probabilistic [5]:
1
P[a;(t + 1) = 1] = tiT
1 + e- ne i

The parameter T is called the temperature, and the network operating in this fashion is
called a Boltzmann machine (fig. 9).
If the temperature T = 0, the update rule becomes the threshold function

a;(t + 1) = 2sign(net;) - 1.
253

P(8i 11 )

0.5
neti

o
o
Figure 9: Activation function used for simulated annealing, T represents the temperature.

For high temperatures (T -> 00), the activation values move around rather arbitrarily:
P[a;(t + 1)] = .5
At a low temperature the units react solely on the net; value. The gradual lowering of the
temperature T is similar to the annealing of steel to form a crystalline structure. Therefore
this technique to learn the activation of the units is called simulated annealing.

7 Competitive Learning Model


7.1 Learning Rule
A notable property of neural networks is their ability to partition a set of input patterns
into different classes. This is useful in robot vision to discriminate clusters of similar
objects. The competitive learning paradigm implements an unsupervised learning scheme
on a network with two layers: an input and a classification layer. The principle is simple: a
pattern activates several units in the classification layer, but only the unit with the highest
activation is allowed to fire. The winning unit adapts its weights in order to respond more
easily to the same input. Similar input patterns will activate the same unit. When they do,
a cluster is formed and the weights are affected by all units in the cluster. When a pattern
activates another unit, a new cluster of input patterns is formed.
In this scheme the number of classes is fixed to the number of units in the classification
layer. The weights for each unit are normalized, such that E j W;j = 1. The winning unit
will increase the weights of the active using the formula

llW;j = t(l/nactive - W;j).


254

0 1 2 3 4
17 20 20
17 17 49 20 20 25 25
20 20 25
17 17 49 20 20 25
17 20 20
5 6 7 8 9
20 25 14 20
20 20 25 14 14 20 20
20 20 25 14 20
20 20 20 25 14 14 20
20 20 14

Figure 10: Weight matrix of the 7-segment model after 80 cycles, using the Competitive
Learning paradigm. Dots represent 0 and values are given in percent. The weights are
placed at the corresponding segment locations.

The weights of the nonactive units are reduced by an amount

!:::..Wij = -f.Wi;
Here nactive is the number of active units and f. is the learning rate parameter. This rule
maintains a normalized total weight into unit i. A formal analysis [12J learns that the
weights converge to conditional probability values:

Wij = nac~iveP(input j activelit unit j wins).

7.2 Application: Classifying 7-segment Numbers


In sections 3 and 4 neural networks recognize 7-segment digits while in section 5 distorted
7-segment numbers are restored. Now the competitive learning model is used to recognize
the 10 digits without help of the target patterns. Starting with a random weight matrix,
the weights after 80 training cycles are shown in fig. 10
Clearly, the learning law has found the correct weights. Since there is only one pattern
per class, the normalized weights are equal to the conditional probability that a segment
is on when the unit fires. Nevertheless, the classification is ambiguous, because for some
patterns there is more than one winning unit. E.g., the output unit '1' wins ex aequo with
the input patterns '3', '4', '7', '8', '9' and '0'. In these numbers the segments c and fare
on and give a 100% activation to unit '1'. A tie break is possible by decreasing the weights
of unit 1 slightly. In the same way the weights of number '8' need a small increase in order
to discriminate it from the other units.
255

7.3 Clustering and Cascading CL-networks


Competitive learning (CL) networks can be extended horizontally and vertically, leading
to clusters and cascades of several classification layers. In a single cluster CL-network, the
entire set of input patterns is partitioned. Horizontal clustering allows to partition the input
patterns into different clusters. When the inputs are connected to all clusters, the various
clusters present many classifications or interpretations of the same patterns. A vertical
alignment of clusters allows a hierarchical feature detection which extracts new features
from the accumulated information in the lower layers. A sophisticated example of such a
hierarchical network is the selective attention model for overlapping patterns, developed by
Fukushima [2,3].

7.4 Hierarchical neural networks


Although neural networks are amenable to low level feature extraction and classification,
they are no panacea for real world vision problems. A sensible approach is to combine
neural networks and expert systems in a hierarchical vision system. The neurons in the
lowest layer are sensitive for elementary picture elements (pixels). The next layer joins the
pixels into lines, circles, squares or other shapes. In the higher layers these shapes form
more compound parts. In this way the original picture information is compressed into a set
of structural entities. The expert system uses its knowledge base to recognize the objects
and establish meaningful relationships between them. This approach has been followed by
Hartmann [4].

8 Neural Networks as Parallel Distributed Processing Sys-


tems
Neural networks are sometimes called parallel distributed processing systems. This de-
scription stresses two facts. First a large neural network contains a massive number of
parallel operating processing units, which are the building blocks of connectionist computer
architectures. Second, the information is distributed and requires a fast and global commu-
nication network between the units. Thus connectionist architectures need a fast, reliable
and flexible interconnection network. In fact, the capacity of a neural network to store
information is directly related to the fan-out of each unit, that is the number of connection
paths with other neurons. Some authors predict that this number should encompass several
thousands [8].
Finally, a neural unit and its connections is a building block similar to an assembler
or high level language instruction. From the large diversity of programming languages
and programs it may be expected that there are numerous neural networks and learning
paradigms to be found for many applications. And if we carry this comparison a bit
further, some talented neural engineers might end up inventing a structured neural network
development language.
256

9 Conclusion
In the last decades new programming paradigms such as declarative languages and neural
computing offer a wealth of new ways to implement robotic applications. The wide func-
tionality of neural networks will give it a place in robot vision and control. This chapter
has stressed the strong aspects of neural networks, such as associativity, adaptability and
unsupervised pattern classification. These properties make neural networks a serious can-
didate to replace the highly complicated feature extracting algorithms in hierarchical vision
systems.

References
[1] Anderson, J.A. (1983) Cognitive and Psychological Computation with Neural Models,
IEEE Transactions on Systems, Man and Cybernetics, Vol. 13, pp. 799-815.

[2] Fukushima, K. (1988) A neural network for Visual Pattern Recognition, IEEE Com-
puter, Vol. 21, 3, pp. 65-75.

[3] Fukushima, K. (1987) Neural network model for selective attention in visual pattern
recognition and associative recall, Applied Optics, Vol. 26, 23, pp. 4985-4992.

[4] Hartmann, G. (1988) Mapping Images to a Hierarchical Data Structure - a Way


to Knowledge-Based Pattern Recognition, Neural Computers, Rolf Eckmiller and
Christoph v.d. Malsburg (Editors), NATO ASI Series, pp. 91-100.

[5] Hinton, G.E. and Sejnowski, T.J. (1986) Learning and Relearning in Boltzmann Ma-
chines, Parallel Distributed Processing, Vol. 1: Foundations, D.E. Rumelhart, J.L.
McClelland and the PDP research group, Chapter 7, pp. 182-317.

[6] Hinton, G.E. and Sejnowski, T.J. (1983) Optimal Perceptual Inference, Proc. IEEE
Conf. on Computer Vision and Pattern Recognition, pp. 448-453.

[7] Kohonen, T. (1984) Self-organization and Associative Memory, Springer Verlag, Berlin.

[8] McClelland, J.L. (1986) Resource Requirements of Standard and Programmable Nets,
Parallel Distributed Processing, Vol. 1: Foundations, D.E. Rumelhart, J.L. McClelland
and the PDP research group, pp. 461-487.

[9] McClelland, J.L. and Rumelhart, D.E. (1988) Explorations in Parallel Distributed Pro-
cessing, A Handbook of Models, Programs and Exercises, The MIT Press.

[10] Myake S., Kawato M., Sonohara N., Hongo S., Inui T. (1989) em Applications in
Image Processing and Pattern Recognition, Flanders Technology Seminar on Neural
Networks, April, 9 pp.

[11] Minsky, M. and Papert, S. (1969) Perceptrons, MIT Press.


[12] Rumelhart, D.E. and Zipser, D. (1986) Feature Discovery by Competitive Learning,
Parallel Distributed Processing, Vol. 1: Foundations, D.E. Rumelhart, J.L. McClelland
and the PDP research group, Chapter 5, pp. 151-193.
257

[13J Rumelhart D.E., Smolensky R., McClelland J.L., Hinton G.E. (1986) Schemata and
Sequential Thought Processes in PDP models, Parallel Distributed Processing, Vol. 2:
Psychological and Biological Models, D.E. Rumelhart, J.L. McClelland and the PDP
research group, Chapter 14, pp. 7-57.

[14J Smolensky, P. (1986) Information Processing in Dynamical Systems, Parallel Dis-


tributed Processing, Vol. 1: Foundations, D.E. Rumelhart, J.L. McClelland and the
PDP research group, Chapter 6, pp. 194-281.
PARTS

MANUFACTURING SYSTEMS
CHAPTER 11

MICROPROCESSORS IN DATA ACQUISITION SYSTEMS


FOR PROCESS CONTROL

A. M. WEILERI'
Hewlett-Packard Colorado Integrated Circuit Division
637 Mansfield Dr.
Fort Collins, 00 80525, USA

ABSIRACI'. Microprocessors have brought about major changes in the


field of data acquisition systems and process control. A compilation
has been produced of the impact microprocessors have had on data
acquisition systems incl\.ldi.nJ sensors, acquisition circuitry and
control units, process control incl\.ldi.nJ methodologies, measurable
parameters and means of physical action and the integration and
automation of acquisition and control inc:l\.ldi.nJ the realm of
possibilities, practical limitations, actual applications and
possibilities for future implementations. A practical viewpoint has
been applied to all of these areas with an emphasis less on the
theoretical and more on sources of specific, hard information. Based
upon the wide availability of hard information and the obvious growth
in microprocessor technology and capabilities the importance and growth
of microprocessor based systems seems assured.

1. MICBOPRCX::ESSOR Bi\SED DATA ACQUISITION SYsrEMS

Data acquisition systems are not new. Traditional control systems


ranging from the simple home thermostat to the automatic flow controls
in a steam power plant all apply some level of data acquisition and
have for years. Essentially, all control systems that attempt to
proact or react to the controlled system's state require data
acquisition. 'lhe recent ,in relative terms, development of
microprocessors have brought a number of significant changes in the
requirements, demands and possibilities for data acquisition. While
many of the core, traditional aspects of data acquisition have not
changed and will be included briefly in the following IiIUll\mary, the
adaptability of microprocessors has broadened the ~ of
possibilities while the rigidity of microprocessors ,in terms of inp.rt
signal requirements for instance, has placed heretofore unknown
restraints on sensors, signal conditioning and other aspects of
acquisition which will be the emphasis of this section.
261
262

1.1. Sensors
1.1.1. General Sensor Requirements. In order to acquire data there
must be a source of data. '!he goal of any sensor is to provide some
quantifyable measure of some Ibysical characteristic of the real world.
'!he basic requirement of any sensor technology is the ability to make
the conversion to a tangible form. '!his key conversion is in turn tied
closely to some principle of basic Ibysics or chemistry. For instance,
flow measurements rely on Bernoulli's principle, pressure readings on
the ideal gas law, thermocouple temperature readings on the Peltier
effect. '!he wide range of sensors for a wider range of measurements is
testimony to the imaginative application of basic scientific principles
and even simple empirical relationships to the key conversion of one
Ibysical condition to another, gauged cbndition. Physical parameters
from temperature to charge mobility in a semiconductor can be sensed
and converted into a tangible representation. Before the advent of
microprocessor based acquisition the tangible representation could be
virtually unlimited: mechanical mechanisms driving needles showing
pressure, expanding fluids moving up scaled glass columns to indicate
temperature, varying electrical resistance caused by stretching wires
to indicate strain. '!he situation is still such that even if there is
no commercial sensor for the parameter required a principle or
relationship can be determined that will allow construction of a useful
sensor or allow an available sensor to measure some subsidiary effect
to construe the desired value.
1.1.2. Sensor Limitations For Microprocessor Systems. '!he
introduction of microprocessors into the data acquisition path has
imposed previously unnecessary restrictions on the sensors used to
generate the data. '!he most obvious limitation is the need for the
sensor to provide data in a form that can be either directly or
indirectly converted into an electrical signal. Gone are the days of
the simple mechanical gauge reading. '!he preferred sensors are those
that directly generate an electrical signal obviating the need for a
mechanical to electrical conversion (Basi~lly a sensor for the
sensor). A brief listing of direct electrical output sensors and the
parameters they measure is shown in '!able 1. A simple perusal of
experimental handbooks such as Holman [1] will provide a broadly based
a:>Verage of sensor technology, rut perhaps a resource closer to the
cutt.:inJ edge is the wide range of sensor manufacturer catalogs such as
Omega [2] am Davis [3] which list page after page (Many are the size
of a large city's Iilone book) of the latest in sensors for almost any
imaginable application.
263

TABLE 1. Direct Electrical CMtplt Sensors

Sensor Measurement Technique

'lhen!Dcouple ~ture Peltier Voltage


Resistance 'Ihernaneter Temperature RvsT
'Ihermistor Temperature RvsT

Potentiometer I.ocation /Motion R vs length


Differential Transformer Displacement Inductive Coupling

Piezoelectric Crystal Force/stress Piezoelectric V


Resistance strain Gauge Force/stress R vs wire Area

Resistive Pressure Gauge Pressure R vs Pressure

Hot wire Anenoneter Air Flow R vs Flow

1.2. Acquisition Of Information

While the sensor is the first point of contact between the real world
and the control system, the acquisition circuitry is the point at which
the information from the sensors urrlergoes its second conversion ('!he
first was the sensors conversion from Iilysical parameter to tangible
representation). '!he acquisition circuitry conditions and converts the
signals from the sensor (Preferably electrical signals) to a form
usable for display, control or complter inplt. '!he acquisition
circuitry can be used to convert the sensor signal directly into a
visible form as with a chart recorder or meter readout. '!he
acquisition unit can simply condition the signals into a form suitable
for use in a traditional analog control system. '!he acquisition
circuitry can be used to convert the sensor signal into a digital
representation suitable for inplt to a comp.1ter or digital control
system. since the primary focus of this treatment is microprocessor
based systems, the conversion of sensor data to digital form will be
the principle emlilasis.

1.2.1. Signal Conditioning. '!he initial step of any acquisition is


conditioning the sensor signal into a form (Voltage, current,
frequency, etc.) compatible with the later stages of the acquisition
circuitry. While there are a number of sensors which produce
electrical out~ directly, sensors which do not produce electrical
output are still in wide use and must have their outputs converted to
electrical signals by awerrlin;J an electrically based sensor to the
nonelectrical output of the primary sensor ,as in attaching a variable
resistor to the pointer of a Bourdon tube pressure gauge. '!he
conditioning of even the electrical signals is rEqUired because the
signal may be .i.ncompatible with the inp.1t rEqUirements of the digital
or analog-to-digital circuitry. For instance, if the signal ranges
from -20 to +20 Volts and the inp.rt circuit requires 0 to 5 Volts, an
operational amplifier circuit to convert the signal to a 0 to 40 Volt
264

range followed by a voltage divider to convert to 0 to 5 Volts. The


traditional signal conditioning methods used with nondigital systems
may be applied to conditioning in the predigital SEct:ions of a
digitally based acquisition system. Typical methods include
operational amplifiers for voltage range expansion and offsets,
resistance voltage dividers to shrink voltage ranges, active or
passive filters to remove either low or high frequency noise, and
frequency-to-voltage converters. Again, a general text such as Holman
[4] will provide an overview of conditioning circuitry while a more
practically oriented source such as Graf [5] will provide specific
examples.

1.2.2. Direct Digital Input. There are a fair number of applications


in which the conditioned signal ¥y be used directly by the digital
microprocessor circuitry. Signals from limit switches, keyboard
control hlttons, thumbwheel switches outp.rtting binary numbers and
zero-clipped analog signals can all be sent directly to a
microprocessor port. Some of these inIXlts are simple go/no-go signals
which comprise the rna jority of the data required to run even a complex
system which usually includes a large number of valve open/close, door
open/close, IXlIDP On/off and other components. In these simple cases a
o Volt - off, 5 Volt - on signal is connected directly to an input port
on the microprocessor system. However, beyond this simple level,
parameters such as low frequency signals where the frequency is the
important parameters can be conditioned and inp.rt to the microprocessor
which can calculate the frequency. Actually, with the high speed
microprocessors available frequencies up to 1 MHz can be sampled
easily. In fact, it is even possible to store and reproduce
understandable digital speech by transforming the inIXlt speech waveform
into a digital signal where high levels replace the portions of the
original waveform above a set zero level and low levels replace the
original signal below the zero level (This is zero clipping) and
storing or outputting this series of ones and zeros directly through a
port of the microprocessor system.

1.2.3. Analog-'lb-Digital Conversion. For situations where the actual


value of a parameter is required rather than just whether the signal is
high or low, analog-to-digital conversion is required to convert the
continuously variable data signal into a discretized, digital, binary
number representation suitable for microprocessor input. Analog-to-
digital converters are available to fit a wide range of requirements
for inIXlt ranges, resolution, speed, and multi-channel capabilities.
Specification information on individual converters can be easily
obtained from the manufacturer's component data handbooks such as Texas
Instruments [6], Analog Devices (7] and Signetics [8]. Thanks to
advances in integrated circuit technology almost any conceivable
combination of features is available in a single package. In light of
these facts, the problem of analog-to-digital conversion is reduced
primarily to the understanding of the requirements leading to the
device specifications. The key characteristics which can be selected
include: the number of bits in the digital output, the conversion
265

speed, the format of the digital output (Binary coded decimal or simple
binary), and the type of conversion cirouitry employed. '!he A-to-D
characteristics can in turn can be matched to the requirements of the
acquisition system as a whole correspondingly: signal resolution where
more bits provide finer resolution, signal frequency where conversion
speed provides a hard limit to the maximum measurable frequency, use of
direct readout vs computer inplt where the choice between binary, BCD
and 7-segment LED C>tl"tp.1.t is made, and alEolute accuracy required which
often keys the choice between the more accurate rut slow dual-slope
converters and the less accurate rut much faster successive-
approximation converters. Other factors contrituting to the final
converter choice are mainly involved with the ease of interfacing and
use with a microprocessor based system. For instance, is the data
output in serial or parallel format, and which would the particular
complter system being used prefer? Converters are available using both
formats. Likewise, are multiple input channels needed, how many and
can they be of mixed types (Some analog, some digital)? Converters are
available which include the circuitry to decode multiple channels, even
of mixed types which greatly simplifies the task of multi channel
acquisition since one chip will do the whole jab. Or, if advanced
features are not needed, can a simpler, cheaper component be used?
Once the system requirements have been converted into characteristics,
the component data books can be consulted to find the closest match of
available components to desired characteristics. In fact, the data
books are such a valuable resource because, in addition to supplying
the appropriate part numbers for c:i:rx::uits meeting the user's
requirements, the books list the wide variety of circuits available,
providing inspiration and insight into alternative methods, unknown
alternatives and circuit suggestions for anyone simply paging through.
Table 2 provides a brief glimp;e into the available range of analog-to-
digi tal converters.

TABLE 2. Typical analog-to-digital converter characteristics

Parameter Min Max Comments


----
# of Bits 3 1/2 14 Resolution=Vrange / (2"#Bits-1)

Conversion 1E-6 333E-3 Max Frequency = 2 / Conv. t


Time (sec)

# Signals 1 19 Analog multiplexing Wilt in

Conversion Successive approximation, Dual-slope, "Flash"


techniques

1.3. Control
Having combined a sensor and acquisition circuitry the final block
required for a complete acquisition system is a means of controlling
the action of the sensor and acquisition circuitry. Whether the
266

controller is a simple circuit to connect data to a display or a


complex microprocessor based system performing complex calculations and
acting on the acquired information, the role of the controller is
essentially the same. '!he control unit acts to control the sensor-
ac::quisition circuitry to insure that the data acquired is valid and
then acts to perform an appropriate action based on the data, even if
the appropriate action is to simply display whatever data is acquired.
'!he key control that the control unit must exert is the control of
timing. Timing is critical in that most. sensors require a set amount
of time, especially after a change in the measured parameter, before
the measured value will be stable enough to provide meaningful data,
the signal conditioning circuitry requires a set amount of time before
the output signal accurately reflects the correctly conditioned input
signal, and finally, the analog-to-digital conversion circuit requires
a set amount of time for its actions. '!he control unit must coordinate
the timing of the signals to start each portion of the acquisition
allowing for the proper delays to insure valid data at each transfer of
information. On the surface it may appear that the wide variety of
sensors, acquisition circuits, output requirements and control units
would requisite a custom solution for each system; however, the real
world concerns of maintenance, availability of replacement parts and
the reduction of development overhead provide a driving force for the
use of standards. '!he use and development of standards aid the
ability to apply the efforts and equipment of any manufacturer adhering
to the standard, the ease of repair and replacement of standard based
boards or modules and the ability to reconfigure and reuse portions of
the system easily by interchanging standard based equipment. '!he
powerful advantages of standardization can even lead to the not
necessarily poor choice of using a standard with poor performance
because it is standard. '!he overall advantage of a standard is, of
course, that it is standard. Unfortunately, while most standards are
advantageous the fact that the number of standards applicable to
hardware, software and data transmission has grown to the point where
choosing which standards to use in which situations has become almost
as complex as specifying a completely custom system. While an in depth
study of even one area of standardization is beyond the scope of this
presentation, a brief enumeration of standards for hardware, software
and data transmission can provide a starting point for developing an
image of the useful range of the dominant standards.

1.3.1. Microprocessor Hardware standards. The prime area of hardware


standardization is in the backplane bJs which caries the control
signals, data bits and address bits accessing both the system memory
and any addressable interface circuitry. Bus standards can be
differentiated along several lines including the number of address
bits, number of data bits, bus speed, control lines for multi or co-
processing and any data/address multiplexing schemes. The factors
listed above can be augmented with size, availability, software
support, and microprocessors available. Simply put, the choice of a
bus, like the choice of any component, can be driven by the
requirements of the system which can be translated into desired
267

characteristics for the hls. For instance, if a hls speed higher than
8 MHz is required, the I.B.M.-P.C. bus is unsuitable but the E.I.S.A.
bJs would serve. Likewise, if the software to be used on the system is
only available for Motorola 68000 series processors only, the I.B.M.
bJsses which are primarily oriented to Intel microprocessors would be
at a distinct disadvantage. IDoking at an number of the key factors
provides the following image of the bJs world. Common 8 data bit
busses include S.T.D., and the I.B.M.-P.C. bus while 16 bit or more
busses include the I.B.M.-A.T., I.B.M.-M.C., Apple NuBus and the
E.I.S.A. bus. Of these the I.B.M.-P.C. and A.T. busses could be
considered dominant if only due to the vast number of systems,
interface cards and software for the I.B.M. bJsses. A relatively
recent development specifically for instrumentation and acquisition
systems is the V.X.I. bus which is a version of the V.M.E. bus
developed by Digital Equipment with extensions to allow direct
installation of instrumentation, measurement or interface cards for
data acquisition with standard control protocols and software. A
corresporrling list can be developed for which bJsses suitor are
primarily based on which microprocessor families. Intel
microprocessors are available for all of the I.B.M. bJsses, E.I.S.A.,
V.X.I and S.T.D. bJsses while Motorola processors can be found on
S.T.D., NuBus and v.x.I. cards primarily. overall, however, the choice
of a bJs is primarily a choice of user preference based mainly on
microprocessors supported and software availability. In fact, so
entrenched are many of the proponents of the various bJsses that
arguments between the camps can take on a nearly religious fervor.
Journals such as Byte [9] and Personal Engineering [10] commonly
provide up-to-date technical information on the latest hardware
standards, de-facto and otherwise.

1.3.2. Microprocessor Software Standards. '!he two main areas of


software standardization are operating system standards and progranuning
language standards. Unfortunately, at least for most data acquisition
applications the software for actually controlling the instrument,
measurement or interface hardware is usually not standardized beyond
each manufacturer providing a common software base for their own
equipment. But, fortunately, given a standard operating system, a
standard language and careful stnlctured programming the equipment
dependent portions of the control software can be minimized and
isolated to easily updated modules. In terms of standardized
programming languages the choice may again be one of personal
preference between B1\SIC with its easy to learn and use
dlaracteristics, Pascal with the ruilt in block structure and typed
variables which contribute to structured programming and C which allows
direct access and control of the hardware and an ~nably standard
interface regardless of the operating system. Of course, deperrling
upon the requirements of the application in terms of speed and the need
for direct control, assembly language may be the only viable
alternative. Despite the peculiarities of assembly language pertaining
to the specific microprocessor used, the various microprocessor
families from the major manufacturers have maintained enough continuity
268

and compatibility through the development generations that even the


assembly languages have achieved some degree of standardization.
Likewise, the choice of an operat~ system hinges upon user preference
and availability. '!he dominant operating system for Intel
microprocessors is Microsoft M.S.-D.O.S. due to the vast number of
I.B.M.-P.C.'s and compatibles. Motorola 68000 series microprocessors
are supported by the Apple Macin1josh operating system. '!he unix
operating system is available for a wide range of both Intel and
Motorola processors with its broadly based support for systems software
development. For embedded systems not requiring a complete operating
system rut merely a development support system, language cross
compilers, cross assemblers, emulators, simulators and deb.1ggers are
available for most popular (And some not so popular) microprocessors
and execut~ on most major personal computer and engineer~
workstation hardware/software configurations. In fact, one of the
prime hardware selection criterion is simply whether the software
preferred for development is available for the hardware. '!he trade
journals again provide specifics on available software through both
feature articles and advertisements.
1.3.3. Data Oommunication standards. Of all of the standards
discussed thus far the data communication standards available are
perhaps the least steeped in personal preference. The coordination of
communication between separate pieces of equipment is understandably an
area requiring a great deal of standardization since without
standardization each system would not know what signals to expect or
what the other system is expecting. Oommunication standards rely on
combining control lines with well defined functions with the data
lines. For instance, the transmitt~ system might monitor a ready-
for-data line wait~ for the line to go high prior to sending data
serially on a single data line or in parallel on multiple lines.
standards for parallel communication are the Centronics interface in
common use for printers and over short distances and the IEEE-488 or
GPIB t:us which was developed by Hewlett-Packard primarily for
interfacing instruments and peri(i1erals or computers and which allows
for communication over longer distances and individual address~ of
multiple devices on the same cabling network. Serial interface
standards include RS-232 which again is predominantly a short distance,
low speed computer-peri!ileral standard and RS-422 for higher speeds and
longer distances. Further standards exist for network based
communications based mainly upon the format of the net (Ring or star
form). Typically the tradeoffs encountered when selecting a data
communication standard are along the lines of parallel with fast
transfer rates rut tulky cables VB serial with slower transfers rut
thinner cables and inexpensive short distance formats VB more expensive
long distance systems. Of course, in a data acxpisition system there
is also the conc:ern of transmitt~ analog signals to and from the
conditioning circuitry. Unfortunately, few standards exist in this
area even for components as basic as connectors. For instance,
connections can be made with BNC coaxial, banana plugs, RCA plugs, F56
or F59 coaxial, triaxial, !ilone plugs or DB9 or DB25 connectors ,and
269

with no set standard for which cormectors are used for what, it comes
as no surprise that adapters from one type to another are part of a
rooming blsiness. However, it should be noted that ideally the analog
signals are converted to a digital form as soon and closely as possible
to the source and can then be sent via more reliable, standard digital
methods. In the case of data communication standards, which are more
fully and carefully regulated than most of the hardware or software
standards discussed above, the actual standard p.lblication, such as
IEEE [11] for the GPIB standard, provides highly detailed information
on all aspects of the standard. Alternatively, since the actual
standard may in fact be too detailed, application oriented texts such
as Short [12] provide more practically oriented information.

1.4. Acquisition System Integration

As is usually the case, the careful selection of the latest and


greatest in each aspect of a total acquisition system, sensors,
conditioning circuitry, acquisition circuits and controller, can easily
lead to a tempermental, difficult to use, impossible to maintain system
unless considE'..ration is also focused on how each individual component
can be integrated with the other aspects of the system to produce a
cohesive whole. '!he keys to successful system integration can be
roiled down to: ease of use, roblstness and adaptability. '!he
implications of ease of use are reasonably obvious. If a single
component of the system is difficult to use, the entire system will
inherit the difficulty, and it is usually the case that the more
difficult a system is to use, the less it will be used. 'lhus, sensors
should be chosen so that they don't require extensive calibration prior
to use; conditioning and acquisition circuitry should be chosen so that
it doesn't require fine tuning constantly and the control unit should
strive for the ideal start/stop button usage. Second, robustness or
resistance to failure has similar pretensions. A system is only as
reliable as its least reliable part, and you can't use a system
frequently if it frequently fails. So again, sensors should seek to
avoid, as much as possible, fine wires, thin glass and tightly limited
ranges; conditioning and acquisition circuits should be solid-state and
modular if possible; control units should gracefully harrlle unexpecterl
events and connectors should be eliminated if possible and if not
eliminated then made water proof cable-yank proof and impossible to
hook up incorrectly. Finally, one of the key advantages that
microprocessor based acquisition systems hold over their more
conventional rounterparts is adaptability, but this flexibility can
only be plt to fullest use if it is considered a vital part of the
overall system goal. Common pitfalls which can render a potentially
open, flexible system closed and black-boxish are trying to perform too
many functions in hardware rather than software, not including
additional capabilities in terms of input channels, bls slots or
storage capacity beyond the needs of the current usage, and not
providing a means of outputting information in an irxiustry starrlarci,
machine readable form so that if all else fails, another complter can
get the information and act accordingly. Although it is difficult to
270

avoid setting limits, in some cases arbitrary, on a system at some


point in development, the limits can always be made broad enough that
they don't interfere with even unexpected and extreme demands which
might be made on the system in the future. Unfortunately, for many
systems the designers imagine the most that anyone would ever need and
design a system to fit while the users struggle and work around the
limitations and complain that the designers just didn't have a wild
enough imagination. Keeping all of this in mind will hopefully yield
an easy to use, rorust, adaptable f.iystem to acquire data which is
vitally important to the next aspect of this treatment since you can't
control something if you don't have any data.

2. PRO:ESS CONIROL

'!he first building block of an automatic control system is the data


acquisition system. '!he second is the actual control system which
takes the input from the data acquisition system, evaluates it and
performs an awropr iate action to bring about a desired change in the
real-world system being controlled. Much like the data acquisition
system the control system can be again sul:xlivided into the evaluation
related circuitry or software, the measured parameters and the
measurement characteristics and the action producing hardware.

2.1 . Evaluation Methods

'!he initial input to the control system is, of course, data from the
sensors via the data acquisition system. '!his input is the indicator
the control system has of the actual current state of the real-world
system being controlled. Given the current state of the real-world,
the control system has a number of possible methods to evaluate the
current state and determine the action or outp.lt needed to bring the
real-world system to the state specified by the set point of the
control system. Traditional feedback control methods including
proportional, proportional-integral, proportional-integral-<llfferential
and on/off can all be applied to the signal to generate the appropriate
output to the controlled system. Ia;s traditional feedforward control
methods can also be applied although they generally require a fairly
complete understanding of the system so that the effect of variation in
any of the measured variables is known and can be compensated for
before the variation is transferred to the system as a whole. '!he
mathematical background for all of these traditional control methods
can be found in a general text. such as Weber [13]. Unfortunately, most
texts on control limit themselves to the mathematical realm while
treating the actual hardware as black boxes. In fact, to gain an
understanding of the actual hardware required for to put the
mathematics into practice, a source such as Senturia & Wedlock [14]
which covers the application of operational amplifiers to signal
summing, inversion, proportional amplification, integration and
differentiation is needed and even the electronically oriented texts
will only provide insight into what needs to be done to the input
271

signal in an electrically oriented system. '!he advent of


microprocessor based control systems has not changed the basic
mathematics of control tut it has altered the actual implementation am
broadened. the scope of applications for some methods. One of the major
changes brought about by microprocessor based systems is the expansion
of the number of parameters or channels that can be controlled with a
reasonable level of reliability, complexity and space. '!he electronics
and logic encompassed by the microprocessor make the monitoring and
control of a large number of channels possible within a small, solid-
state, programmable package. '!he next major change is that the
mathematics involved in the control can be more complex am more easily
modified than with a traditional hard-wired controller. '!he ability to
use software to control the mathematical parameters allows for
definite, repeatable adjustments and allows for the complex
calculations required for accurate feedforward control thus broadening
the range of possible applications for feedforward control. Even
calculations which were previously beyond the realm of controllers may
be approximated by applying traditional numerical methods for matrix
manip..llation, integration, differentiation am curve fitting as
detailed in texts such as carnahan, lUther & wilkes [15] or James,
Smith & wolford [16]. Finally, the microprocessor has made the
application of statistical control methods possible on an automatic
basis. Now, typically, statistical control methods as outlined in
general statistics texts such as Miller & Freund [17] are applied in a
fairly manual mooe, measure, record, calculate control limits and apply
human intervention if the process passes the limits. with
microprocessors the measurement, recording, control limit calculation
and intervention may all be automatic. In fact, the ability of the
microprocessor to record and average can improve the reliability of
even traditional proportional, integral or differential based control
by only applying the control mathematics to a pre-averaged signal. All
of the above changes brought on by the microprocessor stem from the key
differences between the microprocessor and a hard-wired logic system:
programability, memory capabilities, conditional decision making
abilities and the inherent ability to deal with multiple parallel data
channels. Of course, the capability to cope with increased complexity
in evaluation of input signals does not eliminate the need for the
inp,rt signals to meet certain requirements which will be detailed
below.

2.2. Measures

'!he inp,rt to any sort of process control system whether automatic or


not must satisfy a number of requirements in order to provide a rablst
link between the Iilysical situation and the control action required.
'!he two major requirements are ease of measurement and correlation to
the desired outcome. First, even though there are sensors and
techniques for measuring even the most esoteric of parameters, not all
parameters are suitable for use as inplts to a control system.
Generally, the more difficult a parameter is to measure, the higher the
signal to noise ratio will be, the longer the measurement will take and
272

the more sensitive the system will be to uncontrollable fluctuations in


the environment. Hence, to be suitable for input to a control system,
a parameter should be measured simply, quickly and with a low signal to
noise ratio. Keeping the input signal simple and robJst will help
avoid the cascading problems of amplified noise in successive stages of
the system. Second, the correlation of the input signal to the
desired outcome should be as simple and direct as possible. AI though
the world is certainly full of non-linear, non-polynomial
relationships, the world is also full of simple approximation
tedmiques which should be appliEtl to the fullest extent possible to
the connection between the input signal and the output control signal.
'!he reasons for simplicity are myriad ,rut again, the amplification of
noise is the principle worry. It is obvious that a correlation that
bases the control output on a simple linear scaling will produce
significantly less magnification of input errors than a correlation
relating the output to the square, cube or exponential of the input.
other concerns involve factors such as the requirement of complex
circuitry or software to perform complex calculations, and the greater
likelihood of singularities or exception cases in a complex model. '!he
fact that the microprocessor based control system can handle the
complexity does not eliminate the tendency of systems with greater
complexity to suffer more severely from problems ,and in fact the
frequency of sutstantial software flaws increases significantly with
increasing complexity.

2.3. Actions

Having sensed the parameter of interest, conditioned the signal,


converted for the controller and correlated the input to a required
outp.It to retain control, the final step in the system is the actual
controlling action. Like the other portion of the system that deals
directly with the environment, the sensor, there are many options for
performing the control action, and again, the implications for
microprocessor based systems are highlighted here.

2.3.1. Digital-to-Analog Conversion. As a complement to the final


stage of aa;IUisition, analog-to-digital conversion, the initial step in
many control actions is the conversion of a computer outplt digital
value to an analog signal. In cases such as resistance heater control
and other situations where an electrical signal can act directly on the
environment l.D1der control, the digital-to-analog conversion is the only
interface and action necessary. Being complementary to analog-to-
digital conversion (Many analog-tcMtigital converters contain a
digital-to-analog converter) a similar variety and range of converters
are available and documented in the same component hand1::ooks as the
analog-to-digital converters discussed previously. Even if the
environment can not be directly effected by an electrical signal, the
analog outp.It signal can often be employed as an input to a subsidiary
control system which can effect the controlled system. For instance,
the analog output can be fed to a pump speed controller as a varying
setpoint which will result in changing the pump speed to adjust a flow
273

or pressure related to the pump. In fact, many components such as mass


flow controllers, motor speed controllers and valve controls are
available which are specifically designed to operate off of a direct
electrical signal, usually between zero and five Volts.

2.3.2. On/Off Controls. As with digital-to-analog conversion, on/off


controls have an inp.rt related complement in direct digital inp.rts. As
mentioned previously, a complex machine can have numerous open/close,
on/off type inp.1ts for items such as doors and valves; likewise,
solenoids, lights, warning blzzers and on/off valves can be acted upon
by a simple on/off digital outplt. Microprocessor systems can easily
be set up with outp.rt ports and individual bits attached to outp.rt
lines toggled on and off via software. Some simple signal
conditioning, usually to txlost the current levels available and isolate
the microprocessor circuitry from any high voltages or currents is
required to drive items such as relays or solenoids ,rut for many
blzzers and light-emitting-diodes, almost no conditioning is required.
Again, as with direct digital int7J,t, a number of fairly complex actions
can be accomplished with the awarent simplicity of an on/off signal.
For instance, in addition to providing the capability to turn a motor
on or off, zero or full speed, a properly modulated signal (Usually,
pllse width modulation) can be used to provide continuous variable
speed control of a motor employing a completely digital signal.
Likewise, the ability to pllse a line at a desired frequency can be
used as a sourrl generator if the line is used to drive an audio speaker
as evidenced by the explosion of beeping, tune-playing greeting cards,
toys and car dashtx:Iards.

2.3.3. Servos. For a large number of positioning or speed control


related actions, a servo position or speed controlled motor is an ideal
outt7J,t device. Proportional, continuous control over full position or
speed ranges allows for fine control of motion. '!he servo based
systems also incorporate their own internal feedback controls to
provide precise closed loop action. Interfacing to servos can be
aided by available hardware systems which are themselves acquisition
and control units inp,rting a desired position or speed in digital form
and performing the required calculations to drive the motor to the
requested position or speed. '!here are even sqi1isticated servo
controllers which accept complex commands in mnemonic languages and
convert these into elaborate motions. lacking the complex control
hardware, some servos, notably the servos used for radio controlled
cars, boats and aircraft, can be controlled with a direct zero to five
Volt, pllse-width-modulated signal. Another advantage of servos is
that, if the requirements for positional accuracy are fairly loose,
simple, low-cx:>st permanent magnet D.C. motors can be utilized.

2.3.4. stepper Motors. If high positional accuracy is required, D.C.


st.Egler motors are the usual choice for the acting device. As with
servos there are hardware stepper motor controllers which accept direct
digital inp.rts and even complex commands and convert these to a desired
amount of rotation or even a sequence of motions. Again, like servos,
274

lacking a complex motor control system, direct digital outp,rt lines can
be used (With appropriate corrlitionirg for the high currents and
voltages involved) to directly drive the stepper motor by OU"I:pJ.tting
the bits required to appropriately power the :irrlividual coils of the
motor. Like many of the other aspects of microprocessor based
acquisition and control systems, the stepper motors and controllers
available cover such a broad range that some of the best sources of
information are the manufacturer publications and industry journals
such as Chilton's [18].

'!he final combination of an acquisition system with a control


methodology and some means of action has taken on new aspects,
requirements, and possibilities with the increasing application of
microprocessors and related technologies to each and every portion of
the system and the integration into a coherent whole. '!he range of
possibilities, practicalities, actualities and eventualities has
virtually exploded with the capacity, speed and flexibility of the
microprocessor.

3.1. Possibilities

While traditional, hard-wired systems were limited in terms of the


mathematics they could perform, the ability to conditionally act on the
data they were receiving and in data storage and transmission
capabilities, microprocessor based systems suffer from few limitations
in any of these areas. '!he reduction or elimination of restrictions on
the key functions of a control system has broadened the possible
applications for control systems, especially automatic control systems.
FUrther, even within the recent development history of microprocessors,
the increase in compltational capacity has been astounding. One means
of analyzing the developments in microprocessors is to equate them with
the history of comp.rters as a whole only time lagged. For instance,
while the usual comparison made is between some of the first electronic
comp.rters such as ENIAC which occupied warehouse size tuildings and a
modern microprocessor, a more telling comparison is that the latest
microprocessors and the personal comp.rters based on them provide
performance equal to mini complters of only five to ten years ago. In
fact, some of the most recent framings of this comparison are between
the desktop P.C.'s of the 1990's and the car-sized minicomp.rters of the
1980's. Essentially, microprocessors are becoming mainframes in terms
of comp,rtational capacity and speed ,and thus, systems that would have
required a mainframe or mini to implement automatic control can now be
run with a desktop system. A case could be made, that the massive
banks of mainframe computers used to control the Apollo moon landings
could be reduced to one or two P.C.'s on a desk and perhaps more
tellingly that the banks of mini comp.rters installed only ten or
fifteen years ago to control the space shuttle missions could be
reduced to a dozen or so P.C.'s. Arguably, with the exception of speed
275

related limitations, there are effectively no boundaries to what can be


controlled employing microprocessor based systems. Of course, simply
because almost every imaginable application is possible does not imply
that every application is ~ctical.

3.2. Practicalities

In any given situation there is always a differentiation between what


is possible and what is practical. '!he limitations of size, Dldget,
and schedule pare down the realm of the possible into the scope of the
practical. With traditional, hard-wired systems the space and effort
required to set up a system are the primary roadblocks to successful
system integration. Again, however, the increasing capabilities of
microprocessor based system have overcome many of the typical practical
limitations. If size is a oonc::ern, there are single chip
microoontrollers which include analog input and output channels which
can encompass an entire system in a two inch long, 1/4 inch thick,
inch wide space. Cost oonc::erns can be eliminated if some performance
can be traded off to allow a low-cost, single chip system to be used.
Even tight schedule oonc::erns can be reduced if modular, industry
standard microprocessor based, programmable sul:systems are employed.
However, even microprocessor based systems can not avoid certain
inherent limitations to control. If the system being controlled
exhibits severe instabilities, extreme sensitivity to initial
conditions or a need for unattainable accuracy or speed in the control
action, control may, in fact, be impossible for any system at least on
a short term basis. '!he issues of speed and accuracy are perhaps the
only intractable oonc::erns related to the hardware since there are
finite limits to speed based on processor speed and to accuracy based
on the maximum number of bits available in analog-to-digital
converters. Of course, the continuing development in microprocessor
technology has led to increasing processor speed and in increasing the
bit capacity of all aspects of both the microprocessor and the
acquisition circuitry which implies that even if certain applications
are beyond current technology, they may soon be not only practical rut
perhaps even ordinary.

3.3. Actualities

Somewhere in the intersection of possibilities and practicalities are


the actualities. In light of the computing power now available, there
has been an explosion in the number of control systems, an expansion in
their complexity, and a contraction in size, cost and development
effort. '!he result of these developments is that complex control
systems are becoming more and more prevalent outside the realm of
industrial environments. For instance, the typical home furnace
thermostat which used to be a simple bimetalic strip on-off switch has
become an electronic, multiple setpoint, time cycle controller
occupying the same space and costing only minimally more than the
simple mechanical system. Correspondingly, the level of systems in
manufacturing situations has climbed as well. For instance, systems
276

that previously controlled only the central core of the process and
left loading and unloading manual now employ the logic and multi-
channel capabilities of the microprocessor and the actuating
capabilities of integrated circuit based motor controllers to load and
unload material, sense and move doors and tracks and completely
automate the process. Similarly, control systems that previously were
hard-set for a single control point employ the programability and
memory capacity of the microprocessor to enable multiple ~ of
action and control points or even continuously variable control points
allowing for smooth ramped transitions between system states and the
ability to store and operate a large number of different processes in a
single piece of equipment. Further, the ability of the microprocessor
to automate statistical process control is now employed to enforce
large scale control on a sequence of processes by measuring pertinent
parameters on the actual product at various stages of manufacture and
feeding the parametric information, including current values, trends,
derived relationships and other factors, back. to the specific processes
which effect the measured parameter. However, at least currently, the
connection between the parametric measurement and statistical control
0I.ltpJts and the individual processes usually consists of a message to a
human operator or engineer who then takes the Iilysical action required
to bring the system back. into control. Of course, as the discussion of
the current situation above points out, tomorrow will certainly hold
advances that will further expand the scope and ability of
microprocessor based acquisition and control systems.

3.4. Eventualities

Given the rate of development in both microprocessor technology and


software development, the future undoubtedly holds exciting
possibilities for microprocessor based acquisition and control systems.
Prime areas where developments are likely and significant research
opportunities exist are the areas of artificial intelligence: expert
systems, adaptive and predictive controls. As mentioned above, the
typical connection between statistical control 0I.ltpJts, even if the
acquisition and statistical calculations have been automated, is a
human link; however, since the human action required tends to be
restricted to a fairly small realm of knowledge and rules, the
application of an expert system is a natural extension to the automatic
parameter measurement, statistical control system. In many situations
the outpxt of the statistical control system merely alerts an operator
and generates a IilOne call to an engineer who reacts by determining the
state of the process (Is the liquid source bottle temperature less than
66 C? Did the deposition pressure exceed 500 m'Ibrr? Was it stable?) and
makes a recommendation to the operator to correct the situation
(Increase the source bottle temperature by 2 degrees. Increase the
deposition time by 30 secorrls. Call a technician, there is serious
trouble!). '!he facts about the system and the rules-of-thumb the
engineer applies can be incorporated into an expert system as outlined
in Barr & Feigenbaum [19] to replace the interaction of operator and
engineer with an interaction between the operator and the expert system
277

which doesn't need to go home and sleep or ever take a vacation. '!he
next logical extension is to simply eliminate the interaction and have
the expert system monitor the statistical control output and control
the system to the point where the expert system can make the
corrections itself. Likewise, advances in microprocessors have made at
least initial excursions into adaptive control systems possible.
Future developments will expand the areas where adaptive control can be
awli~, brirgirg the ability to adjust the mathematics ~ to control
the system based on the performance of the system and producing control
that can adapt to major environmental up;;ets by altering the control
equations and. generating continually refined and improved control.
Similarly, expanding comp.ltational abilities will continue to expand
the applicability of predictive controls which will awly a highly
specific, exactirg mathematical mcdel of the system itself, not merely
the control units, to predict the action of the system under the effect
of changes in intuts or environmental coroitions and adjust the
controll~ parameters by awlyirg the model in reverse to determine the
control actions requi~ to correct the inplt imbalances and maintain
control. A combination of expert systems and predictive control could
even be appli~ to a system which would employ the knowledge of the
system's performance under certain circumstances in the past to control
the system when the circumstance recurs. For instance, if the
controll~ system always goes out of temperature control after a shift
change this type of data can be coll~ into a statistical datamse
which the expert system can query to determine control actions when the
situation recurs. In some ways this is how weather prediction is
perform~; for instance, if it has rained three days in a row, past
statistics show that 50% of the time it will rain on the fourth day.
other possibilities for the future include the increasirg awlication
of parallel processors, neural networks and perhaIE even optical
computers to data acquisition and control systems. In conclusion, the
past has provid~ a firm base of mathematics, sensors and signal
coroitionirg, the present an expanding realm of microprocessor control
to enable the control of complex, multifaceted, configurable systems
and the future apparently holds nothing rut expansion of intelligence
and capabilities to brirg more and more systems into the space of
controlability •

4. REFERENCES

1. Holman, J. P. (1978) EXperimental Methods For Engineers, McGraw-


Hill, New York.
2. OJnega (1986) Omega Complete Temperature Measurement Handbook and
Encyclopedia, cm:ga Engineering Inc., stamford, cr.
3. Davis (1989} Davis Instrumentation 1989-1990 catalog, Davis
Instrl.noont Manufacturing caapany, BaltiJrore, MD.
4. Holman,. J. P. (1978) Experimental Methods For Engineers, McGraw-
Hill, New York.
5. Graf, R. F. (1985) '!he Encyclopedia Of Electronic Circuits, Tab
Books Inc., Blue Ridge Summit, PA.
278

6. Texas Instruments (1987) Interface Circuits Data Book, Texas


Instruments, Dallas, TX.
7. Analog Devices (1989) 1989/90 Data Conversion Products Databook.,
Analog Devices Inc., Norwood, MA.
8. Signetics (1987) Signetics 1987 Linear Data Manual Volume 2:
Industrial, Signetics Corporation, SUnnyvale, CA.
9. Byte (1989) Byte Magazine, lJk:Graw-Hill, Peterborough, NH.
10. Personal Engineering (1989) Personal Engineering And
Instrumentation News, Personal Engineering Communications,
Brookline, MA.
11. IEEE (1983) IEEE Standard Digital Interface For Programmable
Instrumentation, '!he Institute of Electrical and Electronics
Engineers, New York.
12. Short, K. L. (1981) Microprocessors And ProgrammEri logic,
Englewood Cliffs, NU.
13. Weber, T. W. (1973) An Introduction 'Ib Process Dynamics And
Control, John Wiley & Sons, New York.
14. Senturia S. D. & Wedlock B. D. (1975) Electronic Circuits And
Applications, John Wiley & Sons, New York.
15. Carnahan B., Luther H. A. & Wilkes J. O. (1969) Applied Numerical
Methods, John Wiley and Sons, New York.
16. James M. L., Smith G. M. & Wolford J. C. (1967) Applied Numerical
Methods for Digital Computation with FORlRAN, International
Textbook Company I SCranton, PA.
17. Miller I. & Freund J. E. (1977) Probability and statistics for
Engineers, Prentice-Hall Inc., Englewood Cliffs, NU.
18. alllton's IAN (1989) 'Special Product Report: Motion Control',
alllton's Instrumentation & Automation News 37 #1, 25-33.
19. Barr A. & Feigenbaum E. A. Ed. (1982) The Handbook Of Artificial
Intelligence VoltUlle 2, William Kaufmann, Inc., Los Altos, CA.
CHAPTER 12

DESIGN AND ANALYSIS OF A MODULAR CNC SYSTEM

Dr. Y. Altlntas
The University of British Columbia
Department of Mechanical Engineering
Vancouver, B.C., VeT lW5, Canada

Abstract
Recent focus of manufacturing research has been in the develop-
ment of unmanned machining systems. Modular Computer Numerical
Control (CNC) units, which control and monitor many tightly coupled
machining tasks, are required. This article describes the detailed en-
gineering design and analysis of a modular, research CNC system for
a milling machine. Continuous and discrete transfer functions of a
feed drive servo system are derived and experimentally verified. A
state space model of a feed drive servo is developed for a time domain
analysis of actual position, velocity and armature current for given
position input and machining force disturbances. A complete experi-
mental verification of the system's model is presented. The influence
of servo dynamics mismatch, varying velocity profiles and cutting force
disturbances on the accuracy of contour machining are discussed.

Nomenclature
B: Viscous damping coefficient.
H g : Tachometer constant.
Ia : Armature current.
J.: Equivalent inertia reflected to the motor shaft.
Ka: Armature current feedback gain.
Kb: Voltage constant of the motor. (back EMF.)
Kd : D / A converter gain.
K. : Encoder gain.
KI: Velocity error preamplifer gain.
Kp : Digital filter gain.
K t : Torque constant of the motor.
K,,: PWM voltage gain.
La: Armature inductance of the motor.
Ra: Armature resistance of the motor.
279
280

8g : Velocity command signal gain of the amplifier.


Te : Cutting torque reflected on the motor shaft.
Tg : Tachometer feedback signal gain.
Tm : Motor's useful torque.
V.. : Armature voltage.
v., : Velocity Command signal.
X .. : Actual position of the table.
Xr: Position command.
W : Angular velocity of the motor shaft.
a, b : Zero and pole of the digital filter.
fe : Feeding velocity command.
p : Pitch of the leads crew shaft.
x : State output vector.
u : State input vector.
~ : Velocity control loop damping ratio.
Wn: Natural frequency of the velocity control servo.

1 Introduction
The present goal of manufacturing research is to develop flexible, self adjusting and
unattended intelligent machining systems. The limited presence of operators at
unmanned machine tools leaves the supervision, monitoring and control tasks to
computer controllers. Some of the important machining process monitoring and
control tasks are: in-process detection of tool wear and breakage, chatter detection
and avoidance, adaptive control of chip load for maximum metal removal, geometric
adaptive control for precision machining and in-process gauging [1, 2]. These tasks
must be executed in parallel since they have a coupled effect on the cutting process.
In order to implement the unmanned machining tasks, the machine tool must be
equipped with sensors and a multi-processor based Computer Numerical Controller
(CNC) where computations demanding tasks are assigned to dedicated processors.
Communication between tasks, therefore processors, must be carried out in short
intervals. For example; the Adaptive Control task must be able to send feeding
velocity updates at control intervals of 10 to 20 ms to the amplifiers. Although some
commercial CNC systems are multi-processor based, they don't provide the flexibility
to add sensing, monitoring and control modules. This article describes the design of a
very modular research CNC system which allows integration of additional monitoring
and control modules. The system enables simultaneous testing of chatter avoidance,
tool condition monitoring and adaptive control algorithms, which are coupled in
practical machining operations. The modular CNC system consists of off the shelf
units, and can be easily realized at other research laboratories.
Although considerable progress in CNC technology has paralleled developments
281

in electronics technology, a detailed analysis of feed drive servo systems for the
purpose of precision multi-axes contouring, adaptive control and monitoring of ma.-
chining process has not been studied sufficiently. Significant early work in the area
was primarily carried out by Bollinger [3, 4], Stute [5, 6] and Koren [7, 8]. They
concentrated on engineering design methods of eNe Systems where approximate
models of the feed drive servo were sufficient. However, unmanned machining tasks,
in particular precision contouring require complete modelling and analysis of the feed
drive control system.
The first comprehensive analysis of contour machining errors was presented by
Poo, Bollinger and Younkin [9]. Poo et al assumed a second order dynamics which
makes the feed drive servo Type 1 control system. They showed the significance of
steady state contouring errors when the gains of the two feed drive axis are mis-
matched. Koren proposed a cross-coupled controller to compensate the steady state
errors in machining [10]. Recently Srinivasan et al proposed optimal control of con-
touring errors for dual axis feed drive systems [12, 13, 14]. The previous researchers
have treated the velocity servo as a first order system excepting Doraiswami et al
[11] who included the electrical dynamics of the servo motor in the controller. In-
stead of designing a cross-coupled controller Doraiswami et al tried to minimize the
steady state errors at each axis. Although the previous researchers proposed control
algorithms to improve the contouring accuracy, none of them was able to reduce the
contouring error to zero especially at transients.
In this article, a De motor based velocity servo is treated as a second order
dynamic system, and that the complete feed drive system is a fourth order discrete-
time system with a digital compensating filter. Transfer functions of both velocity
and position loops are identified, and experimentally verified. A discrete-time state
space model of the feed drive servo is presented. The model allows simulation of all
state variables (i.e. actual position, velocity and motor current) for a given position,
feeding velocity and cutting force disturbance. This is particularly important for the
analysis of motor current and velocity changes under cutting force disturbances.
The article includes the analysis of multi-axis contouring errors in eNe machine
tools. There are two fundamental sources which influence the contouring accuracy.
The first type is due to the dynamic positioning performance of the servo drive
controller. In order to contour a desired tool path trajectory, the dynamics of each
axis used in contouring must be matched. However, since each controller drives
different parts in a machine tool, the inertias and friction in the guideways may differ
at each axis. Even if the axis are matched perfectly, due to the velocity lag produced
by the dynamics of the controller, contouring errors exists during machining parts
with severe transients. The second type of machining error is caused by the cutting
forces transmitted to the controller as disturbance torque, which takes away some
of the useful torque required in contour positioning. In the article, the contouring
errors are analyzed and simulated for practical operating conditions of a machine
tool.
282
POSITION FEEDBACK

DC
IBM PC SERVO
MOTOR
MPLIFIER
VELOCITY FEEDBACK
PC BUS
TO MULTIBUS
ADAPTOR
POSITION FEEDBACK

MULTIBUS
DR
RS 232

MUL TIBUS
CARD CAGE POSITION FEEDBACK

1====== =1 DC
SERVO
VT 100
MOTOR
MPLIFIER
VELOCITY FEEDBACK

Figure 1: Architecture of the UBC-CNC Milling Machine.

The remainder of the article is organized as follows. Section 2 explains the archi-
tecture and physical components of the developed modular CNC system for a milling
machine. Transfer functions of the continuous and digital control components of the
feed drive system are given in section 3. Section 4 presents state space modelling.
Experimental results, which verify the feed drive model, are presented in section 5.
Time domain simulation of contouring three transient workpiece profiles, a square
corner, a diamond and an are, are presented in section 6. The paper concludes with
recommendations for CNC design and tool path planning.

2 Physical Components of the CNC Milling Ma-


chine
The research CNC milling machine may be introduced in three major parts; machine
tool, feed drive units and the control system as shown in Fig.I.
The retrofitted machine_tool is a vertical milling machine with a 5kW AC motor
connected to a spindle drive gear box. The three feeding axes (x,y and z) of the
machine have recirculating ball screw drives with 600, 400 and 120 mm travel limits
respectively. All three linear axes are driven by pulse width modulated (PWM)
permanent magnet DC motors [15] which are directly connected to the lead screw
283

shafts. The motors are capable of delivering sufficient continuous torque during
machining at desired feeding velocities.
A Motorola. M68008 processor based motion control board [16] was used to con-
trol feeding velocity and positions of the three linear axes of the machine. The
motion control card has 16 I/O lines for logic control functions. Coolant, spindle,
travel limit, feed-hold, amplifier shut-down and other auxiliary logic control signals
are wired to the I/O lines. Contouring (i.e. linear and circular interpolation), accel-
eration, deceleration and digital compensation functions are also carried out by the
motion control board. The board resides in a MULTIBUS card cage which can take
several motion control cards (Le. for an additional rotary axis control), data acquisi-
tion boards and single board dedicated process control computers (i.e. tool condition
monitoring, process identification and adaptive control) for modular unmanned ma-
chining research. In order to use the existing peripherals of the personal computers,
the Multibus is connected to the PC bus with a bus interface adaptor. An Intel
80386 CPU with an 80387 mathematics co-processor based PC was used as a master
computer to distribute monitoring and control tasks to other computer modules on
the Multibus. A full technical wiring diagram of the CNC unit is provided in Fig. 2
for readers. The machine tool can be used as a standard CNC machine by loading an
in-house developed ISO NC language emulator in PC. Feeding velocity, acceleration,
deceleration and digital filter parameters can be changed in real time by sending the
desired values from the PC to the motion control unit via the bus adaptor. This
feature is particularly important and essential for adaptive control and machine tool
monitoring research.

3 Transfer Functions of the Feed Drive Control


System
Each feed drive control system has two cascaded controlloopsj an angular velocity
control loop which consists of an analog amplifier electronics, a DC motor-table
assembly and a velocity feedback unit (tachogenerator)j and the position control loop
which includes a digital motion controller with a position feedback unit (encoder).

3.1 Velocity Control Loop


Unidrive power amplifiers [15] with a current feedback loop are used for the control
of Pulse Width Modulated (PWM) permanent magnet DC servo motors (Fig. 3).
The operational block diagram of the velocity control system, Fig. 4, was derived
from detailed analysis and measurements done on the power amplifier unit and DC
motor-table assembly.
The power amplifier accepts velocity command signals from the DMC-230 digital
motion control unit [16]. The signal is buffered with a differential preamplifier S9
N
00
Baldor DC """
Servo "olor D..e 230 "olion Conlroller GRAYHILL
INDEX 70RCK24 ~~
+5 V NIP EN +5 I/O "odule "llll~g
CHA GND GND ~ "achme
~er GND ,..--- F~ GND ~~ ~ PINDLE ON
• CHB '-+++-++-_---1RUf
INDEX 1fT
H i ~~ ~P[NDLE
NEE UP orr
Tac, leter + .--- NIP ERROR U ~, I; NEE lINN
- +5 JX ~~ 1 DDLANT
AXIS + I CHA IN7 5 11 AKEZ
+ lit-COMMON!! ~: I~ ~ ~~S-~APE/MAN
e SIGNAL HDT[R COMMAND ~ ~ 7 = [MIT
+12 iT~' ~~ m 5 .!V TIMER
T cull!! hD N 1 ~e:'"" AP~S}~T
1_ I~ L i..'YM I-- ~ IDE HOLD
LDOT ;10" N.C ,!'" .n..lrW.lTI-- ~
Swi' hes N.C. IN'I::~'1IKI-- JY < To T CUllS CIr'CUIts) ~O
~____~~~N~ PC/A 0
Unidrive Axis JZ ( To Z CUllS CIr'CUIts ) Ullrallnk _
Conlrol Unil ..ULTIBUS f9 Lt~~~r c=J I
I "'-L N.. an1:acts OR
~
CHAPOS-f- 0
VI f r CH{El,.~ VT100
Limil Switch Relay Board,· D.... 900 ~,.. XX PosItion
v.tocrty
I~ E'i' X V.L COIOI'IQrId
I I ,. ~ X Tac"-ter
r-, +:;:5 t ~ X Motor' C....-.nt
~t T I Buffer BOard
I rr RESET
•• • • _.. _ • .........r AB[RT

Figure 2: Electronics wiring diagram of the CNC feed drive control system.
285

MECHANICAL
+1
'{
J
0---'

ELECTRICAL
Figure 3: Schematic diagram of the Permanent Magnet Armature controlled DC
Servo Motor.

and compared with an actual velocity signal measured by the tachogenerator feed-
back unit. The resulting velocity error (in Volts) signal is converted to a desired
current by the current amplifier which has a gain of K/. The amplifier output cur-
rent is compared with the armature feedback current signal pulled from a current
sense coupler. The PWM circuit, which is modelled as a gain of K,,, accepts the
current error and outputs a mean voltage of V" to the armature terminal of the DC
motor. The transfer function between the input velocity command signal (v,,) and
the motor's angular shaft velocity (W) is found as follows; The voltage applied on
the armature of the motors is used to overcome losses in the resistance, inductance
and back electromotor voltage (Fig. 3),

(1)
The armature current produces a magnetic field between the armature and constant
field stator, which gives a torque proportional to the rotation of the motor.
(2)
The motor torque is spent in accelerating the equivalent inertia, and overcoming the
viscous friction and cutting torque reflected on the motor's shaft.
dW
Tm = Je Tt +B·W +Tc. (3)
Taking the Laplace transforms of (1,2,3), the second order dynamics of the ve-
locity controller and its parameters are derived from the control block diagram (Fig.
4) and given as follows:
t-..>
00
0'

r_ _ ...... IIWW de. ~ IiU'I


n_ ....- Up-.... ~
+
Xr I I Iv~ n I"«~ XO
-r--l" I • _.1 _ I I c:: ,----r [count]
k:cuIt]

IIWW to-...a
~

IIEI..IEITY
fUJIIM:I(
M

I'IISn1IJj fUJIIM:I(

Figure 4: Block Diagram of the Feed Drive Position Control System.


287

Table 1: Parameters of Feed Drive System

B 0.09 Nm/(rad/sec) K" 21.934 V/V


By 0.08872 V/(rad/sec) Kt 0.3 Nm/A
Je 0.0036 kgm 2 La 2 mB
Ka 0.0643 A/A Ra 0.4 n
Kb 0.3 V/(rad/sec) 8g 0.0648 V/V
Kd 0.0781 V / count Ty 0.13183 V/V
Ke 636.62 counts / (rad / sec) a -0.8825
KJ 25.5 A/V b 0
Kp 2.5 p 5.08 mm
Continuous current supply= 15 A Peak current supply= 30 A

W(s) Kl
(4)
l/;,(s) = s2+K2S+K3'
where the parameters Kh K2 and K3 are defined:

Je La
The parameters K t , Ra, La, Kb and By are provided by the manufacturers of the
amplifier and DC motor. The remaining power amplifier parameters (Ka, KJ, K",
8g and Tg) were measured from the circuits, by applying known signals to the input
port of the circuit in question, and then measuring the corresponding output with a
voltmeter and a currentmeter.
It must be noted that depending on the friction characteristics of the machine
tool's guideway, it may be necessary to include static friction as an additional dis-
turbance torque to the velocity loop. An average viscous damping coefficient of
B = 0.09(Nm/(rad/sec)) is identified by measuring armature current at various
steady state feeding velocities on the research machine, see (2) and (3). Factory
specifications and measured values of the servo feed drives are listed in Table 1.
288

3.2 Position Control Loop


Each position control loop has an up-down counter which receives a reference com-
mand position and actual position of the table measured by the shaft encoder. The
position is fed into the up-down counter in the form of pulse trains whose frequency
is equal to the command and actual feeding velocities. Henceforth, each position
pulse will be called a count. A two channel encoder with one thousand slots and a
quadrature decoding circuit is used, to obtain a corresponding count length of:
P
1 count = 4000 [roml,
where p is the pitch of the feed screw in rom. The research CNC machine's feed
screw has a pitch of p = 5.08 rom, thus 1 count = 0.00127 rom. The gain of the
encoder is given as,

(5)

The up-down counter acts like an integrator which contains the instantaneous posi-
tion error. Since the input to the controller is either parabolic (i.e. acceleration and
deceleration period) or ramp type (i.e. steady state velocity), the up down counter
and encoder gain can be cascaded to the end of the velocity loop as:
Xa(s) Ke
(6)
W(s) =-;-
Combining (6) with the transfer function of the velocity loop (4) yields to the transfer
function of the continuous part of the feed drive control system:

G (s) _ Kl Ke (7)
c - s2 + K2 s + Ka s
The velocity control signal v;, of the digital motion control unit is applied to the
power amplifier at T = 1 IDS intervals via a D/ A converter which has a gain of Kd.
The zero order hold equivalent of the Gc(s) for one millisecond sampling interval is:

or which, after taking the z transform, becomes

Gc(z) = KdKIKe b2 z 2 + btz + bo . (8)


Ka (z - 1)(Z2 + alz + ao)
The derived expressions for the parameters of Gc(z) are given in Table 3, and the
calculated values of all transfer functions are given in Table 2. Discrete position
error E(z) is the difference between the reference and actual position of the table:
289

Table 2: Calculated Values of the Feed Drive Servo

Kl 1510156.0 f32 2.2925


K2 930.178 (31 -2.177
K3 303093.90 f30 -0.552
b2 0.4xlO- 4 0:3 -2.1791
b1 1.27xlO- 4 0:2 1.6491
bo 0.25x10- 4 0:1 -0.4444
al -1.20204 0:0 -0.01272
ao 0.39448 Kcl 0.0229

E(z) = X~(z) - Xa(Z). (9)


The position error is passed through a digital compensation filter whose parame-
ters are tuned to provide a smooth transient response without any overshoot. The
structure of the digital filter D (z) is:
z+a
D(z) = Kp --b.
z+
(10)

The closed loop transfer function of the complete feed drive control system is derived
as follows:

or

(11)

The parameters of the closed loop transfer function Gc1(z) are given in Table 3.
There are two critical performance requirements from the feed drive servo: The
first being the smooth transient response in order to avoid oscillatory cutting during
velocity changes; and the second is to minimize the steady state position error (i.e.
following error) in order to achieve precision multi-axis contouring. At steady state
ramping of the position with a feeding velocity of Ie (counts/sec), the following error
is derived as [17]:

Ii Ie T (12)
e.. = z~ (z -1)D(z)Ge(z)"
290

Substituting (8) and (10) into (12) gives the parametric expression for the following
error in the feed drives as:
feK3T (1 + b)(l + al + ao)
(13)
e•• = KlKeKpKd . (1 + a)(b2 + bt + bo)'
It is evident that, the higher the open loop transfer function gain, (i.e. D(z)Ge{z)),
the lower the following error will be, which is desired for precision multi-axis contour-
ing. However, high open loop gain may result in an undesirable oscillatory response
in the servo. The digital filter parameters were tuned to achieve an optimum feed
drive servo response.

4 State Space Mode'lling of the Feed Drive Cont-


rol System
A state space representation of the feed drive servo was developed in order to verify
the derived model using experimentally measured time domain response data. The
servo is again divided into continuous and discrete sections.
The continuous part of the system consists of the velocity control loop (7) and
the up-down counter (6). Three states- the armature current la, the angular velocity
W and the actual position X a- are derived from Fig. 4 as:

Xe(t) = Ae xe(t) + Be ue(t), (14)


where state vector (xe(t)) and input vector (ue(t)) are defined as
xe(t) = [la(t) , W(t), Xa(t)V,
ue(t) = [lI;,(t), Te(t)]T.
Ae and Be are constant matrices:
K.Ka±Ra Kb+K.KITKHK
La La 0
Ae = III
J.
B
- J. 0
0 Ke 0

j.
K.KIS.
La
Be= 0 - ,0
J.

0 0

The state equation (14) represents the continuous part of the feed drive servo system,
which has the following discrete equivalent solution for an observation interval of
T = 1ms [17]:
291

Table 3: Coefficients of Gc(z) and Gclz

b2 = T - ~ e-e wnT sin(wdT)


Wd
- ~: {1 - e-e wnT [::; sin(wdT) + cOS(Wd)]}

2 e-ewn T[sin(wd T ) - 'llcOSIWd


( T)]
bl = Wd
. (Wd T) -K2 -~wn
+ -K2 (1 - e-2eW nT) - 2 e-eWnT szn
K3 K3 Wd
1
Te-2ewnT __ e-ewnT sin(wd T )
bo
Wd
+ K2 { e-2ewnT + e-ewnT [~Wn sin(wdT ) _ cos (WdT)] }
K3 Wd
al = -2 e-e wnT cOS(WdT)
e-2ewnT
ao =
bt +a ~
f32 = ~
bo+abt
f3l = ~
a bo
f30 = b2
~ 1
a3 = al + Kl KeK3Kd Kp -
Ke Kd Kp (' b)
a2 ao - al + Kl K3 1'1 - a 2
Kl Ke Kd Kp (b o-a b1 ) --ao
al = K3
Kl Ke Kd Kp
ao = K3
. a· bo

Kl Ke Kd Kp b2
Kcl
K3
292

(15)

where state and input vectors at sampling interval k are defined as:

xe(k) = [Ia(k) W(k) Xa(k)]T


ue(k) = [v;,(k) Te(k)]T
~(T) = eAcT = [¢'i(T)] , (i,j = 1,2,3)
H(T) = It' eActdt· Be = [h ii ] ,(i = 1,2,3),(j = 1,2).
The matrix ~(T) is computed from the eigenvalues of the Ae matrix for the discrete-
time equivalent of the continuo~s-time system.
The discrete-time components of the position control loop consists of digital filter
D(z) and DjA converter gain Kd. The velocity command signal can be expressed in
the z-domain as:
z+a
v;,(z) = Kp- b . K d · [Xr(z) - Xa(z)]. (16)
z+
The equation can be rearranged as:

(17)
where

(18)

The new variables v;, and ltd can be treated as fourth state respectively. After rear-
ranging equations (18) and (16) and taking their inverse z-transforms, the following
discrete-time state equations are obtained:

ltd(k + 1) = -b Vd(k) + Kp Kd (a - b) [Xr(k) - Xa(k)] }


(19)
v;,(k) = Kp Kd [Xr(k) - Xa(k)] + Vd(k)
The discrete-time state equation (19) can be combined with the state equation (15)
which represents the discrete-time equivalent of the continuous part of the feed drive
servo. An algebraic rearrangement of the equations yields the following complete
state equations for the feed drive servo:

x(k + 1) = G(T). x(k) + r(T) . u(k) }


(20)
y(k) = C· x(k) + D· u(k)

where state, input and output vectors are defined respectively as:
293

}J(k) If,,(k)

x(k) =
1,,(k)
W(k)
u(k) = [ X,(k)
Tc(k)
1' y(k) =
1,,(k)
W(k)
X,,(k) Xa(k)

The state matrix G(T), input matrix r(T), output matrix C(T) and transmission
matrix D(T) are defined in order as follows:

-b 0 0 -KpKd(a - b)
hu <Pu <P12 <P13 - huKpKd
G(T) =
h21 <P21 <P22 <P23 - h21KpK d
h31 <P31 <P32 <P33 - h31KpK d

KpKd(a - b) 0
hllKpKd. h12
r(T) =
h21KpKd h22
h31KpKd h32

o 1 0 o o o
D=
001 o o o
000 1 o o

The output vector y( k) gives access to three useful dynamic parameters in the
feed drive servo, namely: the armature current, angular velocity and the position
of the table for a given position command and applied cutting torque. Other states
in the control system can be easily found by multiplying them with the appropriate
gains according to the block diagram shown in Fig. 4.
294

5 Experimental Verification of the Feed Drive


Servo Model
The model of the feed drive servo was experimentally verified in both the frequency
and time domains. The verification is carried out separately for the velocity and
position loops.

5.1 Velocity Loop


The transfer function of the velocity control, as derived in (4) can be rewritten as:

W(S) Kl
(21)
v., (8) = 8 2 + 2 e Wn 8 + W~ .
Here the natural frequency (w n ) and damping ratio (0 of the velocity loop are defined
as follows:

Wn = Fa [rad/sec] , e= 2V.l\.3
K~ < 1,

where Kb K2 > o. The time domain step response of this underdamped velocity
servo is expressed as:

(22)

Experimental step response of the velocity loop was obtained with and without the
table assembly by applying v., = 1.0 volt to the terminals of the amplifier and
simultaneously measuring the velocity from the tachogenerator. Analytical (22) and
measured step responses of the velocity control servo are shown in Fig. 5. The
step response results measured without the table indicate that the theoretical and
experimental results are in agreement. There is a high frequency component of
the step response when the motor is attached to the leadscrew, which is due to
the mechanical dynamics of the leadscrew and table assembly. Overall, the tuned
dynamics of the velocity loop are sufficiently fast for the controlled milling machine.
The frequency response of the velocity loop was analyzed in order to determine
the bandwidth of the controller. The magnitude ratio (M(w)) and the phase angle
(cP( w»of the closed velocity loop controller can be derived from (4) as:

M(w) = 2010g(K1/ K 3) - 2010g[(1 - W2/W!)2 + (2ew/Wn)2]i }


(23)
cP(w) = -tan-l [(2f.l:!L)/(1
Wn
- (.l!L
LeI",
)2].
295

,
Theoretical Experimental


--
,.-...
[/)

~
~
1-1
'-"

....u
~
~
I

.9
~ •
Table connected

• •
TIme (ms)

• Theoretical


--
,.-...
[/).
Experimental
~1-1 I
'-"

....~u • Table disconnected


.9
~


-I

• II

Time (ms)
• • •

Figure 5: Step Response of the Velocity Loop.


296

The frequency response measurements were carried out with a Fourier analyzer.
The analyzer's random signal generator output was fed to the input terminal of the
amplifier, and the velocity was measured from the tachogenerator output (Hg) in
volts. The simulated and measured frequency response results are shown in Fig.6.
Although the model and experimental responses agreed fairly well, there were some
deviations. A separate frequency response measurement was carried out by removing
the motor from the table. When the model was modified to include only the motor's
shaft inertia and viscous damping, the measured and simulated frequency response
results were in perfect agreement. Hence, it is concluded that the small deviations
in the frequency response test with the table assembly is due to the presence of
unmodelled nonlinear friction and mec\tanical dynamics of the feed drive assembly.
The results also indicate that the servo seems to be able to follow feeding velocity
changes up-to 60 Hz which is the bandwidth of the velocity loop, see Fig. 6.

5.2 Position Loop


In addition to the velocity loop, the position loop consists of an encoder, digital
filter and integrator (i.e.up-down counter) which are mostly digital and their transfer
functions are precisely known, see Fig. 4. Digital filter parameters are tuned in such
a way that the position loop is well damped and the rise time is reasonably fast. The
following digital filter was selected and found to be satisfactory:

D{z) = 2.5 z - 0.8825


z
The remaining parameters of the position loop are listed in Table 1. The calculated
values of the state space equation (20) are given as:

0 0 0 0.1724
13.5962 0.5085 -2.4203 -2.6555
G{T) =
0.6231 0.0617 0.8638 -0.1217
0.1392 0.0219 0.6028 0.9728

-0.1724 0
2.6555 0.3749
r(T) =
0.1217 -0.2634
0.0272 -0.0859
297

I~------------------------------~
-I Experimental
...
...
,,--... ... ~~~~
,.0 -II
~-II
Q)
"'d -14
Theoretical
.....E -11

~ -11
::g-ll
...
-Ii
-II
-II
-10
I 11
• Frequency
• • M

(Hz)
.. 'III ..
I

-11 Experimental

-.
-II
,,--...
bO
Q)
"'d
'--"
Q) -a
~
..cl -II
P-t
-II

-n

-II

-II

-UI
. Frequency• (Hz) . .
Figure 6: Frequency Response of the Velocity Loop. Top: Table connected, Bottom:
Table disconnected.
298

10
Experimental
18

1.
!~ •• ,VON .... ' ..... '''h' ,........
l'

--
,.-.. 12
UJ
Theoretical
'"0 10
~
r-.
'-"
8
.....~

~

u
.s ,
~
a
0

-2
0.00 10.00 aoo •.00 ••00 100.00 1111.00 1".00 180.00

Time (ms)
Figure 7: Transient feeding velocity response of the feed drive servo.

1 0 0 -0.195 0.195 0
o 1 0 o o o
D=
o 0 1 o o o
000 1 o o
The transient response of the position loop was experimentally tested by sending
a leedhold command to the motion controller during steady state feeding of the table.
The feedhold command sets the feeding velocity command to zero and bypasses the
deceleration velocity shapero The resulting feeding velocity response is shown in
Fig. 7, where it is apparent that the analytical and experimental values are in close
agreement.
The steady state position error can be calculated by substituting the servo pa-
rameter values in (13), thus e.. = 0.0137 Ie [counts]. The steady state error for
Ie = 10000 counts/sec (12.7mm/sec) feeding velocity was measured by reading the
299

contents of the up-down counter from the motion-control computer. An average fol-
lowing error of eu = 129 counts (O.1638mm) was observed which is 8 counts(O.Olmm)
less than the analytical prediction. Measurements across a range of velocities showed
that the steady state error is not quite linear, this being mainly due to effect of the
nonlinear friction characteristics of the table which are not accounted in the linear
model.

6 Time Domain Simulation of Contouring Errors


The following errorfor a single axis machining is given in (9), which is the difference
between the instantaneous reference position Xr(t) = fe(t) . t, and actual position
(Xa(t)) of the table. In the startup or stoppage periods of the feed drive, accelera-
tion or deceleration commands are given to the servos until the velocity reaches to
the steady state feed or zero respectively. Acceleration and deceleration periods are
determined by the rise time of the servo, and is typically within lOOms for medium
size machines such as the one presented here. Thus the reference position input is
parabolic during acceleration and deceleration, and ramp type during steady state
machining along linear paths. The velocity profile of an axis for a linear tool path
is shown in Fig. 8. When tool path directions continuously vary during contour
machining such as circular cuts, the velocities of two drives may vary differently and
continuously. The servos are continuously excited by cutting forces, which are trans-
mitted as torque disturbances to the motor shafts. The cutting torque disturbances
may be in the form of step changes (i.e. changes in the axial depth of cut), periodic
(i.e. pulsating milling forces) or ramp type when the axial depth of cut linearly
increases (i.e. machining a slanted profile). When two axis move simultaneously
during contouring operations, time varying velocity commands and cutting torque
disturbances produce deviations from the desired tool paths. Such errors are called
contouring errors in CNC machining, and have to be kept within the tolerance of
the parts to be machined.
The contouring errors in multi-axis machining are best analyzed by time domain
simulations. A standard machine tool accuracy test workpiece shown in Fig. 9 is
used for the simulations. The test workpiece, which has a full circle, a square and
a diamond, contains the most severe transients. For example, contouring the circle
requires the continuous variation in the feeding velocity commands of the drives.
Profiling a right angle results in a stoppage of one axis and acceleration of the other
at the corners. The diamond trajectory demands sudden velocity direction reversal
in one of the axis. At the transient locations of the workpiece (i.e., at the corners
of the square and diamond, and throughout the arc), contouring errors occur due to
dynamics of the servo.
A block diagram of a two axis contouring servo is shown in Fig. 10. Here, the
reference position command (Xr ) is passed from the interpolation algorithm which
300

30~--------------------------------'

I
Jc
constant
,
feed
20 - -.....-....:..-----""\.

acceleration deceleration
/'
'"

O-+---r-~-...,...-"""\.
o 0.1
Ie
Figure 8: Velocity profile of an axis for a linear tool path. (Note: Acceleration period
o S t S O.ls -+ I = at; O.ls S t S (tJ - 0.1) -+ I = Ie; Deceleration period
(t - O.ls) S tJ -+ I = Ie - at. Ie =constant feed value, a = Ie/0.1 velocity
dependent acceleration.)

generates independent position command pulses to x and y axis feed drive servos.
The disturbance cutting torque components on the x and y feed drive shafts are
shown as Tea; and Tcy respectively in Fig. 10. Both drives have identical control
structures, and the state space equation 20, which allows time varying position and
cutting torque disturbance inputs, is used for both axis in the following simulation
cases. Feeding velocity of 1= 20mm/sec(15748count/sec) is used in all simulations
indicated otherwise. The circular arc is generated by sending following position
commands to x - axis and y-axis feed drive servos:

x(kT) = Rsin(we·kT) }
(24)
y(kT) = Rcos(we · kT)

where R is the radius of arc, T = 1ms is the sampling period, k is the sampling
counter. The angular velocity for circular contouring (we) is calculated from the
feeding velocity command and the radius of the arc (i.e., We = 1/ R). The velocity
profiles, which have acceleration and deceleration periods at the geometric transients,
are shown in Fig. 11 for the tool paths simulated.
301

200~-----------------------------,

150

IIII 100
.~
>.
50

o 50 100 150 200

x-axis (mm)
Figure 9: Standard workpiece profile to test contouring accuracy of CNC machine
tools.

6.1 Contouring with Perfectly Matched Axis


Transfer functions of both feed drive servos are tuned to be identical or dynamically
matched. The cornering error produced due to decelerated stoppage of one axis and
accelerated start-up of the other is shown in Fig. 12. Since the x - axis is unable
to reach to the corner, it continues to move until the position error in that axis
reduces to zero. Since the y-axis is commanded to move before the x - axis
stops, a contouring error is produced at the right angle. For a feeding velocity of
f = 20mm/sec, a contouring error ofO.0025mm is observed at the corner. Evidently,
the contouring errors increase proportionally with higher feeds. Similar contouring
error occurs in the diamond profile contouring as shown in Fig. 13.a. Here, y-axis
direction is revt:rsed with controlled acceleration and deceleration at the transient
intersection point. Note that the velocity of the x-axis must also be decelerated and
accelerated at the same time in order to keep correct tracking command, although
the feeding direction remains the same, see Fig. l1.b. It is difficult to contour
sharp corners at constant velocity (i.e., no acceleration and deceleration) for high
productivity machining. This is difficult to achieve at large feeding velocities since
the contouring error is directly proportional to the feed value. Machining of the
diamond at constant feeding velocity (f == 20mm/ sec) is simulated. The contouring
error became 0.20 mm (Fig. 14.a) which is 10 times larger than contouring with
a controlled velocity (Fig. 13.a). Since there are no sharp geometric changes
302

Tcx(t)

c
0 Xo.
:p
d £
Xr(t) 0 ..s::
Q. +'
I....
III
s::0 Tcy(t)
+' (J)
c
...... <[

u Yo.
z
u

Figure 10: Block diagram of a two-axis contouring system.

during machining of an arc segment, the contouring errors are negligibly small, see
Fig. 15.a. The correct strategy to avoid the error at transient corners of the profile
is to decelerate the axes until the position errors become zero in an NC block before
the next one is executed to follow a new trajectory. However, if the geometry has
many small geometric segments, it may be very unproductive to stop and start at
short time intervals. The sharp corners are smoothed out by the dynamics of the
servos as illustrated at constant feed contouring. The process planner has to consider
the tradeoffs between the precision and the productivity, which is obviously a part
tolerance dependent decision.

6.2 Contouring with Mismatched axes


Two physical feed drive systems may have slightly different transfer function pa-
rameters. The amplifier gains Sg,K a and tachogenerator gain Tg are adjusted to
match transient responses, or overall transfer functions of both drive systems. It is
reasonable to assume that the settings of the amplifiers and motor constants remain
unchanged during the operation of machine tool. However, the friction coefficients
in the guideways are velocity dependent, and in some guideways position depen-
dent as well. For the simulation of mismatched dynamics, the friction coefficient
of Bx = 0.09 Nm/(rad/sec) and By = 0.3 Nm/(rad/sec) are assumed for the x
and y feed drives respectively. The values of the friction coefficients were obtained
from the experimental measurements carried out on the research milling machine.
The use of different friction coefficients on both drives represent axis mismatch in
contouring operations. The contouring errors, resulting from the mismatched
friction coefficients, are shown in Fig. 12.b to Fig. 15.b for square, diamond and
arc profiling respectively. The contouring error in square cornering becomes 0.06mm
which is 3 times larger than contouring with the matched dynamics. This is ex-
303

20 mm/sec

o
0.1 0.1

-20 mm/sec
------------------------~----~
20 mm/sec

o
a)
-20 mm/sec --------------- .. ----------------_'--_-J/

Time (8)

20 mm/sec

-20 mm/sec

20 mm/sec

-20 mm/sec

Time (8)

Figure 11: Velocity command profiles for the test geometry. Top: contouring of a
square corner, bottom: contouring of a diamond.
304

150.6,------------------.

. DESIRED

,......., 150 I
/'
--~tI.l
.~ 149.5
ACTtJAL

I
>.

149
a)
148.6
149.8 149.9 150 1SO.1
x-axis

150.6

DESIRED

150
I
/'
,.......,

--~
tI.l
.~ 149.5
ACTUAL

I
>.

149
b)
148.6
149.8 149.9 150 1SO.1

x-axis
Figure 12: Simulation of contouring a square corner. Top: cornering with perfectly
matched axis dynamics, bottom: cornering with mismatched dynamics.
305

150.2~-------------------'

150

1
........
149 8
.

fIJ 149.6
.~
I

>'149.4

149.2
a)
149+-~~r-""-.-.-r-r-.'-'-~~r-,,,,~
99 99.5 100 100.5 101
x-axis (nun)

150.2
DESIRED

150

1
J
149 8
.
"--'
fIJ 149.6
.~
I
>'149.4

149.2 ACTUAL
b)

149~~~-r-''-T-~~-r,-~~~-r-r'-'
99 99.5 100 100.5 101
x-axis (nun)
Figure 13: Simulation of contouring a diamond profile. Top: contouring with per-
fectly matched dynamics, bottom: Contouring with mismatched dynamics.
306

151.-----------------------------------~

150

,,--..

§
'-'"
149

tf.l

.~ 148
>.

147

146
97 99 101 103
x-axis (nun)
101

DESIRED
150 /

....--..
§ 149
'-'"
tf.l

.~ 148
I
>.

147
b)
146
97 99 101 103
x-axis (nun)

Figure 14: Simulation of contouring a diamond profile with a constant velocity. Top:
contouring with perfectly matched dynamics, bottom: Contouring with mismatched
dynamics. (Note: The velocity of the y axis is stopped and reversed stepwise at the
diamond point.)
307

pected since the friction coefficient is also approximately 3 times larger in one of the
axis. The velocity profile for diamond machining, which requires the coordination
of both axis, is shown in Fig. l1.b. There is a contouring error of 0.14mm along
the linear profile due to different following error amplitudes in each axis. Again, if a
constant velocity policy is used in the machining of sharp corners, the amplitude of
contouring error becomes much larger due to influence of mismatched dynamics, see
Fig. 14.b. A stable tracking error is predicted in the contouring of an arc as shown
in Fig. 15.b. It is rather difficult to compensate the contouring errors produced by
the mismatched dynamics with the current structure of a controller.

6.3 The Influence of Cutting Torque on Contouring Ac-


curacy
The feeding forces are transmitted to the DC motor shaft as disturbance torques via
leadscrew and torque reduction gear pairs. The reflected torque disturbance Te can
be estimated by [18],
p
Te=Fe· - - (25)
211'v
where Fe[N] is the feed force, p is the pitch and v is the efficiency of the ball screw
drive system. In the research milling machine, the DC motors are directly coupled
with the feed screws (Le. no gear reduction) which have a pitch of p = 5.08mm/rev
and the nut efficiency is v = 0.89. Thus the reflected torque is Te = 0.945 x
1O-3Fe [Nm]. Therefore for a practical maximum feed force values of Fe = 2000N,
the disturbance torque reflected on the motor will be only Te = 1.8Nm.
Steady state error produced by a cutting torque disturbance (Te) is calculated to
be e. = 7.2516 Te counts (0.0092Te mm). The total following error produced by the
dynamics of the servo and torque disturbance is e. = 0.0137 . Ie + 0.0092 . Te mm
where le(mm/s) and Te(Nm). In turning and milling operations, uneven cutting
load distribution on the drives may be present. The contouring error produced by
both uneven torque distribution and dynamics of the servo was simulated. The
cutting torque disturbance of T"", = 1.8 Nm and Tcy = 0.18 Nm are applied on the
x and y drives respectively while moving the feeding velocity at I = 20 mm/ sec
on the diamond. Simulation conditions were same as the in Fig. 13.a, except the
presence of additional torque disturbance. The total contouring error at the diamond
point increased to 0.24mm, where 0.2mm of it is contributed by the servo dynamics
and 0.04 rom is caused by the uneven cutting torque disturbance. The result shows
that the influence of cutting torque disturbance on the contouring error is much less
than the errors caused by the servo dynamics on this type of machine tool drives.
The feed drive design parameters may further amplify or attenuate the influence
of disturbance cutting torque on the contouring accuracy. The large encoder and
motor torque constants, and the use of gear reductions attenuate the contouring
308

86.5,-------------------.

86

1
'-"
CI.l 85.5
.~
»
I

85

ACTUAL
a)

84.5
84 84.5 85 85.5 86
x-axis (nun)
86.5

DESm.ED

86
..--..
§
'-"
CI.l 85.5
.~
»
I

85
b)
ACTUAL

84.5
84 84.5 85 85.5 86
x-axis (mm)
Figure 15: Contouring of an arc profile. Top: contouring with perfectly matched
dynamics, bottom: Contouring with mismatched dynamics.
309

errors produced by the cutting torque disturbances, which may be important to


consider in the design of heavy duty and precision machine tools.

7 Conclusion
A practical and analytical guide for the design principles of a modular research CNC
system is presented. The architecture of the controller, transfer functions and state
space models of the physical and digital control modules of a feed drive servo are
substantiated with experimental verifications. The developed model can be used
in time domain simulations of positioning accuracy, variations in the current and
velocity states for time varying position and cutting torque disturbances. Analysis
of contouring errors for dual axis feed drive systems are presented. It is shown that
when the dynamics of both axis are matched, there is no machining error when
the geometry changes smoothly. At sharp geometric transients, where velocities are
commanded to change suddenly, the contouring errors are shown to be unavoidable.
It is shown that for the feed drives which have lead screws and gear reductions,
the influence of the reflected cutting torque on the contouring accuracy is negligibly
small at practical machining feedrates.

References
[1] J.Tlusty, G.C.Andrews, 'Critical Review of Sensors for Unmanned Machining',
Annals of GIRP, voI.32/2, p.563-572, 1983.

[2] Y. Altintas, I. Yellowley and J. Tlusty, 'The Detection of Tool Breakage in


Milling Operations', ASME Journal of Engineering for Industry, Vol.UO,
pp.271-277, 1988.
[3] N.A. Duffie, J.G.Bollinger, 'Distributed Computing Systems for Multiple Pro-
cessor Industrial Control', Annals of CIRP, vol.29/l, pp.357-362, 1979.

[4] J.G.Bollinger, G.Stute and H. Van Brussel, 'Digital Controls and Feed Drives:
State of the Art and New Developments', Annals of CIRP, vol.29/2, pp. 497-
5061982.

[5] G.Stute, H.Worn, 'A Modular Function Oriented Multiprocessor NC System',


Annals of CIRP, voI.27/l, pp. 261-264, 1978.

[6] G.Stute, P.Klemm, 'The Application of a Modular Multiprocessor NC System',


Int.J. of Machine Tool Design and Research, vol.22, pp. 215-222, 1981.

[7] Y. Koren, 'Design of Computer Control for Manufacturing Systems', Trans. of


ASME, J.Eng.Ind., vol.lOl, no.3, pp.326-332, August, 1979.
310

[8] Y.Koren, 'Computer Control of Manufacturing Systems', McGraw Hill, 1983.

[9] A.N.Poo, C.Younkin and J.G.Bollinger, 'Dynamic Errors in Type I Contouring


Errors', IEEE Transactions on Industry Applications, Vol.IA-B, NoA, July
1972.

[10] Y. Koren, 'Cross-coupled Computer Control for Manufacturing Systems', Trans.


of ASME, J.Meas. Dyn. Contr., vol.l02, noA, pp.265-272, December,1980.

[11] R.Doraiswami, A.Gulliver, 'A Control Strategy for Computer Numerical Con-
trol Machine Exhibiting Precision and Rapidity', Transactions of ASME, J. of
Dynamical Systems, Measurement and Control, voI.106, pp.56-62, March 1984.

[12] P.K.Kulkarni, K.Srinivasan, 'Cross Coupled Compensators for Contouring Con-


trol of Multi-Axial Machine Tools', Proceedings of 13th. North American Met-
alworking Research Con., pp.558-566,May 1985.

[13] K.Srinivasan, 'A General Multivariable Structure for Coordinated Control Ap-
plications', Proceeding of American Control Conference, June 1987.
[14] P.K.Kulkarni, K. Srinivasan, 'Optimal Contouring Control of Multi-Axial Feed
Drive Servomechanisms', Transactions of ASME, Jour'nal of Engineering for
Industry, May 1989.

[15J BaZdor DC Servo Motor and UMI Amplifier Manual, Baldor Servo Products,
Fremont, CA, 1987.

[16] DMC-230 Motion Control Unit Manual, Galil Motion Control Inc., Palo Alto,
CA,1987.

[17] K.Ogata, Discrete Time Control Systems, Prentice Hall, ppA50-455, 1987.
[18J Waler,S., Stute,G., Electrical Feed Drive for Machine Tools, Edited by Hans
Gross, John Wiley & Sons Ltd, 1983.
CHAPTER 13

COMPUTER I PROGRAMMABLE CONTROL


OF A FLEXIBLE MANUFACTURING CELL

J.RICHARD, F.LEPAGE, G.MOREL


Centre de Recherche en Automatique de Nancy
CNRS URA821
BP.239
54506 VANDOEUVRE LES NANCY CEDEX

ABSlRACT: This chapter presents the study and realization of a control loop flexible cell. The first part
is a description of the functional requirements, which define the functions that have to be made by the
physical manufacturing system, the industrial information system, and the manufacturing controller. The
structural requirements are described in the second part. They specify both the distributed control architecture
at cell and machine level, and the communication architecture. The control loop of the cell is achieved by
the quality control of the products and the monitoring of the equipment. The third part comprises the study
and the realization of an experimental cell in the laboratory. Many functions form the running at the cell
level. We only present the coordination between cell and station, and the planning of part handling. The
realization of the communication functions is presented; MAP, Ethernet, 3+ Open, and Bitbus networks are
used. The machine level is formed by three main components : the machine computer, supporting the
supervision, the numerical controller, and the fieldbus which for example achieves control and monitoring
of the fixturing device. The control loop of the machining process by dimensional measurement, is achieved
directly on the machine by gauging or by a vision system. It permits to correct in real time the machining
process, which achieves to keep optimal quality.

1. Introduction

The control of manufacturing units relies upon the implementation of many functions, including the
design of the products, the design of the associated manufacturing posts and the maintenance of the
means of production. Only the integration of these functions allows to search for a global
optimization of the process.
The integration of the control functions needs a suitable computer structure, which is
necessarily distributed, because the intelligence of the various functions is geographically
distributed. This requires methods and efficient tools in order to use and maintain:
- communication networks which carry different types of information,
- databases which store this information, and have to achieve their coherence and
consistence,
- real time and distributed computer systems which have permanently to co-operate.
311
312

Obviously reliable operating and quality assurance are essential to ensure economical interest.
This article presents the implementation of these principles limited at the level of a
manufacturing cell. With the experience of the study and realization of an automated manufacturing
island, [ I], we have developped a flexible manufacturing cell, which uses the same concepts:
- utilization of quality control as feedback,
- maximal flexibility of the cell related to new production,
- distribution of the control.
New concepts have been inserted:
- control loop for the equipment at each level,
- development of the distribution of decisions taken related to work chain defaults,
- global management of the information.

This cell can be integrated in to the global shop management. The integration of this concept
is only possible by the use of evolved computer systems, special local area networks which assume
the distribution of functions, and multi-task real time operating systems.

The project is presented in three parts:


- the functional requirements which define the functions to integrate,
- the structural requirements which define the functionnal structure,
- the study and the realization as they have been made at laboratory.

2. Cell Functional Requirements

2.1. INTRODUCTION

The cell is materially integrated in the workshop structure as a product transformation unit and as a
manufacturing information unit. It is also integrated as a structural entity of the manufacturing
function which is able to plan its own manufacturing activity. The cell's real performance depends
on the functional specifications described in this paragraph.

2.2. THE PHYSICAL MANUFACTURING SYSTEM

In the workshop, machine regrouping and consequently cell machines is a result of group
structuration technologies. The cell is a transformation entity of material flow composed of parts
flow and tools flow. The cell can be in the situation of bottleneck or non bottleneck.
- Non bottleneck. It does not slow down the product flow. Its local productivity is not an
optimization criterion. However, it is avantageous to use idle time to optimize quality assurance and
so, to increase speed of material flow in downstream cells, which can be in a bottle neck situation.
- Bottleneck. It slows down the product flow. All possibilities which can increase material
flow speed must be studied,especially by using possibilities of freedom degrees which belong to
manufacturing process.Productivity research is not an optimization criterion for the cell,but only for
all the manufacturing process. The manufacturing cell plan is determined by workshop
313

planning .This uses a global productivity criterion ,which involves interdependance between
manufacturing cells, to achieve the lowest production costs.
To satisfy these constraints, he manufacturing flexibility must be raised to a maximum,as
well as the decisionnal autonomy which is distributed on all the machines of the cell.This
decisionnal autonomy must be integrated in the decisionnal structure defined by the workshop
planning system.
On the other hand, tools data must respect standardization of manufacturing data to satisfy
manufacturing flexibility allowed by the manufacturing process.

2.3. THE INDUSTRIAL INFORMATION SYSTEM

In the CIM organization, manufacturing data must be considered as belonging to enterprise


patrimony as well as material process. The information system must ensure principally two
functions :
- data integrity
- data coherence taking in account the existing data management system
- material connection, realized with local area network, between all cell equipment is
necessary to build a CIM structure; however it is the manufacturing data management which
provides the gateway to CIM.
The methodological integration of data requires to divide data shared between the different
components of CIM structure from the data of each component.
The solution must as far as possible be in accordance with specifications and data exchange
standards [STEP, PDES, MAP].
The data management system is a technological process that must be controlled like other
manufacturing processes to ensure it's reliability. This data automatisation function affects the
transport and data storage components and allows to take in to account detection defaults,
configuration and performance change, to manage the information system.

2.4. THE MANUFACTURING PLANNING SYSTEM

2.4.1. The Planning. The planning system partly implemented in the workshop, sends to the cell,
manufacturing orders which are always linked to a product. These orders can be issued from
workshop or other cells and depend on the planning method used [2] (MRP, OPT, PBC, ... ).
Each method has its own field application specific to enterprise production type. Their
objective is the product flow regulation in the workshop. Their differences are assessment of
criterion of performances and method of batch definition.
Cell planning must be adaptable to every manufacturing planning system and must:
- accept random or sample time distribution for manufacturing orders
- take in account the criterion of the manufacturing planning method
- regulate its own product flow in accordance with batch size and the ultimate date defined by
manufacturing planning system
-permit as soon as possible that the machines of cell choice product affectation to treat
equipment defaults in real time. This product affectation must respect manufacturing rules, function
314

of evaluation criterion, manufacturing method and bottleneck or non bottleneck situation of the cell
for this product.
2.4.2. Quality. The quality control of the cell is a factor of optimal regulation of parts flow and
must be integrated in to the manufacturing process. This quality function must have an autonomous
level to take decisions, to control and maintain product quality [3].
The cell must have its own equipment to realize quality control defined in computer aided
design process. It must decide to perform a control operation and if necessary a second operation
function of technical possibilities defined by CAD and of manufacturing rules. This control loop
function must be applied as much to product quality as to equipment control, it means to the whole
manufacturing process. It is necessary to implement analysis and diagnostics methodology based
on conformity control data and state equipments. This control loop function is able to predict
defaults as much as possible and eventually to cure them. The cell maintenance function is closely
linked to cell quality function.
- The cell maintenance function monitors all manufacturing means which are assigned to the
cell. The control loop function at the cell performs with all data given by the process sensors allows
to monitor equipment and eventually to predict defaults. So a predictive maintenance politic can be
defined by the planning system to avoid failures and to ensure maximal product flow.

2.4.4. Computer Aided Design and Manufacturing. The control loop functional constraints can
only be satisfied with a research of flexibility in manufacturing sheet. A half automstised data
process generates a flexible manufacturing sheet described with operation entities, therefore several
possibilities are able to be performed on cell, depending of equipment availability.
Entity concept increases a commom functional perception between different actors of the
manufacturing process (CAD, CAM, CAP, operator, ... ).
These functional entities (manufacturing, handling, qualification) are directly performed on
the different processes of corresponding machines.

3.Structural Requirements

3.1. DIS1RIBUTED CONTROL ARCHITECTURE

3.1.1. Introduction. The ideas thai prevail in design and realization of a flexible cell control
structure, are the following:
- to respond to the production objectives prepared at shop level, therefore being able to
operate with various management rules,
- to change the cell configuration in real time, in order to mitigate the consequences of a
hardware failure. or to suit the production variations.
- to allow _the intervention of a supervision operator,
- to decentralize intelligence. by treating localy as much as possible. what may be done. from
a control or diagnosis point of view [4].
- to structure the control in layers. in order to create an independance of the upper layer with
regard to the hardware.
- to integrate quality control in order to ensure a feedback to the machining.
315

- to use a virtual machine model [5]. This model is organized into two layers : the
input/output posts layer, interfaced with the cell, and the station layer. The latter is itself split into
three sub-layers: supervision, process and ressources.
The on-line evaluation of the parts evolution is a great deal simplified if the cell is compared
with a set of input/output posts that mayor may not accept continually the product. The working
posts where the operations (machining, control, headling)are linked up , are intemaly controlled by
the machines (figure 1).

manufacturing plan
planning
sheet manufacturing orders

manufacturing plan rules

part flow planning


on input/output posts
part
post I/O I/O I/O handling
post nO i post nO j post nO k
level control

supervision SUPER VISION


subllevel

manufac-
Q H turing
] process
T
Trans- Quali- control
Hand-
~ subllevel forma- fication
ling
by
entitles
tion

D
nume- vision
resource rical control-
subllevel control- ler
ler

Figure 1 Structural model of the cell


316

3.1.2. Control Distribution. The control architecture is divided into two levels:
- the cell level control, that has to manage the part flow in the cell,
- the machine level control that has to ensure the product manufacture [6].

3.1.2.1. Cell Level. The cell level control distributes a set of tasks to each piece of equipment that
depends on:
-the technical possibilites of each one;
- the analysis of the management function: production instructions, manufacturing orders
coming from real time scheduling, management rules, supervision operator decision;
- the manufacturing data: planning sheets, entities coming from Computer Aided
Manufacturing.

3.1.2.2. Machine Level. The machine level control communicates with the cell level by the state of
input/output posts. The planning sheet corresponding to each part on a post is formed by a
sequence of entities of machining qualification and handling. The entities order depends on the
results of the qualification operation and equipements state. The machining of several parts on a
palet in working position may be realized in parallel in order to reduce the idle time corresponding
to tool changes. This possibility requires the creation of an upper control level on the machine.
While on classical numerical controlled machines the part program generates the driving of
operating functions, this new constraint implies to dissociate two functions indeed: to realize the
parts and to drive the machine operating functions. The sequence of the functions transformation,
qualification and handling, is elaborated by the supervision sub-level, from the work to do, the
management of the quality, the state of machine resources, with the objective to obtain the minimal
delay or the maximal quality depending of the cell context.
The process sub-level is divided into three functional modules: transformation, qualification,
handling. It drives the machine operating functions and gives execution reports to the supervisor
that is so able to control the manufacturing quality and the equipment.. Qualification entities include
analysis of measure results and name of object to correct: tool wear compensation, dimension of
finished part, ... The supervision sub-level applies this correction to the next entities.

3.1.3. Communication Architecture. The cell level control and corresponding information
management are implemented on a computer named cell computer. The intelligent functions on each
machine forming the cell are implemented on a computer named machine computer. These
computers support, of course, the communication functions inherent in distributed systems.
A great distribution implies a great communication flow. This communication is characterized
by:
- the global throughput,
- the flow constitution: periodic or aperiodic messages, short or long messages,
- the constraints related to the transport delay,
- the type of information: text. image, voice.
In the studied cell, the analysis of information exchanges characteristics between the remote
modules shows the existence of three communication classes (figure 2) :
- Oasse 1 : communication between cell computer and others macro-elements of the company
such as others cells. computer-aided design shop, computer-aided planning shop .... Messages are
317

long and aperiodic, transport delays are not strictly limited but have to be maintained at in acceptable
values. The throughput is high. Text only is carried, but images will probabily be exchanged later.

Class 1 Communication
I I I I
Cell Cell CAD/CAM CAP
Computer Computer shop shop

I
Class 2 Commumcanon
I I I I
Machine Machine Measuring Vision
Computer Machine
Computer Controller
Controller

I I I
Class 3 Commumcanon
I I

I I I
I I

sensor sensor I actuator actuator I

Figure 2 : Communication Architecture

- Oasse 2 : communication between control elements in the cell : cell computer, machine
computers, measuring machine controller, vision controller, ... Messages have middle sizes.
Transport delays are limited to ensure real time operation.

- Classe 3 : communication between process elements, i.e. sensors and actuators, and the
control elements. Messages are short. Transport delays have to be very small.

Each class has a corresponding adapted local area network. This network must also deliver
information without transmission errors, and must be reliable.
Note: this architecture is not hierarchic. Communication may be shared in classe 2 and classe
3 for cell elements,_ if necessary.

3.2. CONTROL LOOP OF THE CELL

3.2.1. Control loop of the cell by means of quality control. Optimal control of manufacturing
process must be achieved by means of control loops at several levels of the system in order to
realize correction (figure 3). The feedback controls require :
318

- to take into account measurements directly on the process with specific sensors
(temperature. vibrations. force .... ).
- to measure directly the part on the machine by probing (part orientation. half finished cut .
... ) or by means of a vision system (surface roughness •... ).
- to control tool wear and clamping devices.
- and after part machining. to inspect the part on a measuring machine.
With this informtion. corrections on the machining process can be performed in delayed time
defined by measuring process :
- immediatly with direct measures on the process with specific sensors. to correct feed and
speed or to stop the machine if a default is detected.
- with the following part machining a handling entity after measure with probing of part
orientation or half finished cut depth.
-with the following part after inspection on measuring machine. To achieve the feedback
controls on different process levels it is necessary to transmit. store and elaborate new data and to
integrate it in the cell control structure.
This cell control structure has several levels of data processing:
- machine tool level. The computer. connected to numerical control and specific equipment.
stores and analyses quality data issued from sensors. probing or vision system. Machine tool
corrections are directly performed on the machining process (speed. feed) or on the following entity
- cell level. Quality inspection of finished part on measuring machine. off line of machining
process. allows a control loop of quality with elaboration of a new manufacturing cell plan.

3.2.2. Control Loop of Equipment.. Control loop of equipment requires the modelling of the
behaviour of cell operating part or resources of the machine. The objective is to define monitors and
behaviour filters in order to pre-process information and thus to detect failures and there origins.
The functionalities of behaviour filters and monitors are appreciably distinctive. because a
filter may directly act on the actuators (figure 4).
These monitors and behaviour filters are integrated in the resource sub-level of the machine.
They return information to process modules during operation. or to the supervision sub-level at the
end of operation. For instance the behaviour of a tool is characterized by the wear monitor during
machining operation. and by information related to quality control. life duration. wear ratio •...
The control loop may be realized in short at several levels depending on information type :
- information coming from operating part by behaviour filter.
- information coming from machining by predictive model related to the dimension
fluctuation. by tool wear monitor. by model related to adaptative control •...
- information coming from probing or vision by position transformation model related to the
balancing.
319

cell
revised process plan computer inspection results
J

- -
Local area netWork:
I

,
position
or tool
I machine I
computer I intennittent
gauging
I I machine
computer
I
correction on machine tool

, I measuring
machine
I
I
Inumerical
in[ control in process
process measures
I
actions or
corrections
,
~ machining sensors\--
process
gauging,

Figure 3: Structure of control loops in the flexible manufacturing cell.

..
Functional
"I To acwators
CON1ROLLER orders ...
BEHAVIOUR OPERATING

PART Fll..TER PART


Operating ./
From sensors
part'state

MONITOR

Figure 4 - Behaviour fUter and monitor


320

4. Study and Realization

4.1. CONTROL CELL LEVEL

4.1.1. Introduction. The realization report of this level is intentionally limited to the coordination
function between cell and station level and to flow product planning according to manufacturing
planning rules.

4.1.2. Coordination between Cell and Station. When the cell decides to carry a new part to a
station, when the station is already machining, the reception point must not be the same as the part
being machined.
The station has to look logically at his point so that the cell does not use it.
For the same reason, the station does not put the part back on a place which is free for the
cell.
This reasoning is true for all objects exchanged by cell and stations. We have defined a
informational structure which insures that cell and stations have access to a place or its content with
mutual exclusion.
This structure is called reception post. It can be described by a diagram of the possible states
of the post (figure 5). Each state commutation is transmitted to the cell or the station. These
messages allow coordination between cell and stations.

STATE MESSAGE FROM

... - - -
waiting
station
objet is taken station

+
working

t ~ --- work end station

...
waiting
transfert

- - -
~
objet is taken robot

empty
... - - - objet is deposited robot

Figure 5 - First modelling of a reception object post


321

When a station wants to evacuate a part. palet. or resource (Le. weared tool). it modifies the
state of the post belonging the object by sending the message "work finished" to the structure of the
reception object post. This structure points out its new state to the cell. which is informed on line
that a new object must be shifted.
When the handling device affected to this deplacement has physically pulled off the object. it
informs the reception post which state becomes "free". Also the reception post informs the cell
which will be able to choose this reception post for reception of another handling operation.
When the same handling device puts the object on its new place. it informs the associated
reception post which commutes in state "waiting station".
The reception post informs of its new state the station which knows that it has to receive the
object.The station carries out its application on the object (machining for instance) and forbids the
reception post access to the cell by commuting it in the state "waiting". In particular. the cell can not
dispose of this post while it is no longer waiting for the station. For the cell control. a machine is
considered like a physical entity composed by one or several station themselves composed by one
or several posts.
A graphical tool1ike GRAFCET is particularly fitted to describe the control structure of object
handling and can generate directly an executable code by using advancing and interpretation rules
without ambiguity.
The original vocation of functional diagram GRAFCET is to be a specification tool which
improves the methodological approach for definition and conception of the control structure [7].
This is achieved independently of realizational consideration of the automated system.
Based on step and transitions ideas. it permits us to describe all expected behaviours of a
process confronted with sensitive events to detect all incoherences and to express merely notions
like parallelism. synchronism and cooperation.
The rigourous formalism and deterministic solution which it supplies. allows the
transformation of this description tool into a true programmation language which principal
characteristic is to be not only structured but also structures the application.
For a great number of robotic applications. programming consists of moving and action
sequence description of parts and tools. depending on inside or outside data. So we think that
Grafcet may be applied to robotics.
We have developped a programming tool GRAFCET. called GRAFCET LM. comprising a
graphic editor and a translator compiler providing a source program in LM language [8]. This tool
generates in a first step the application structure. then allows touching up. principaly to integrate
concepts which cannot be directly included in GRAFCET : for instance teaching mode moving.
Figure 6 presents an example of GRAFCET LM programmation for exchange between two
posts.

4.1.3. Planning in the Cell


4.1.3.1. Introduction. According to the planning method used in the workshop the manufacturing
orders can have different formulations for the cell.
With KANBAN. a raw parts supplying can be enough to represent a manufacturing order
transmited by a downstream cell.
With MRP a workshop messages comes before supplying and defines in particular the latest
date for the end of production. With OPT. the cell has the initiative for manufacturing launching. it
consults the limited security stocks of downstream cells.
322

What ever the manufacturing planning method used in the workshop, the aim is to supply the
ordered parts on the requested date.
- Computed by the workshop with MRP
- function of cell estimated manufacturing times with Kanban and OPT.

4.1.3.2. Assignment rule. Before determining the destination of an object to move, the cell reads
the report of the previous work on this part.
- If this report is negative, the cell can determine second operation solutions which are a
function of :
- the object type
- the possibilities of second operation in the part sheet.
Among the objects which are waiting for a transfer, some are resources to evacuate (weared
tool, ... ) or new resources for a station. They do not receive machining a measurement
operations in the station, then the associated working load is lower (time to set up in the
station) than working load for a manufacturing operation.
The associated operations are automated reconfiguration operations of the cell can receive
variable priority degrees.
The simplest planning which can be realized in the cell consists on an equal repartition of
working loads on the reception posts, on the corresponding stations, so that the station does not
become bottleneck in OPT planning. The working load is translated in working time.
A station does not load up its working capacity time, otherwise the cell can become a
bottleneck for the workshop.
When the cell plans handling operations, it considers the working load generated by
machining or control operations on the parts, and also by operation tool control or sharpening.
The cell selects the object destinations which increase as little as possible the working load of
the stations. The cell does not consider station working load but working load of reception post.

4.1.3.3. Impossible assignment.A file of free posts is managed by cell level which receives post
messages. Among free posts, it is possible that no one is physically able to accept objet to move for
two reasons :
- posts which can physically accept it have no more space,
- posts cannot accept it while there are no stations which can achieve the associated
operations.
In the first case, it is necessary to wait until a post becomes free, or to put the object on a
transit stock, or a less prioritary object which would unblock the post.
In the second case, the cell searches in its database for the missing resources and for the
station able to achieve the operation.
If the missing resources are present on other cell posts, in particular in transit stock, the cell
is able to drive these post states, so as to plan their transfers to the concerning station. In the other
case, it has to request necessary resources.
The cell must verify that the station can physically receive the sent resources.
323

wrist free

request of grip or placing post i grip j part k

prepare post i
(grip, placing) part k

prepare grip j (grip, placing) part k

exchange part k on post i

close grip j

close post i

Figure 6 : Exchange between posts models

4.2. COMMUNICATION IN THE CELL

4.2.1. Introduction. In order to simplify the realization a two level communication structure is used
(figure 7):
- a cell level that includes classes I and 2 defined in the structural requirements,
- a field level corresponding to classe 3.
This section only presents the cell level. Note that the regrouping of classes I and 2 at cell
level does not limit the transposition of software developments because these use application service
elements normally included on netwOlKs dedicated to either two classes.

4.2.2. Used Networks. MAP is naturally used.


324

MAP specifies a set of standard communication services for factory automation. The
specifications provide compatibility from the application layer to the physical layer. MAP uses the
ISO 8802.4. Token-passing bus on broadband bus medium for the factory backbone, and one
carrierband bus medium for subnets.
Full MAP 3.0 is in conformity with OSI model. It is used as the backbone network of the
factory. Three OSI services are currently provided for the user (figure 8) :
- MMS (Manufacturing Message Specification) ;
- FTAM (File Transfer, Access and Management) ;
- ISO/CCITT X500 Directory Services.

Application layer interface

Directory
7b MMS flAM Service
I----~...........- - - I
ROSE
7a ACSE

6 Presentation layer

Figure 8 - Application layer structure

The MAP controller used is the Concord MAPware TM Series 1200.


The Series 1200 PC Bus Controler is a high speed MAP communications controller for the
IBM PC and compatibles. The Series 1200 Controller card with Concord MAP 3.0 software
provides layer 2 (data link) through 7 (application) functionnaly. Layer 1 (physical) functionnaly is
provided by an internal or external modem.
The Series 1200 Controller is a single printed circuit card which takes one slot in a PC
backplane. The Series 1200 may be configured with either a 5 Mbps carrierband or a 10 Mbps
broadband modem. The 5 Mbps carrierband is used here because it is cheaper.
MMS software is SISCO Inc implementation, named MMS-EASE. This software is in fact
a library of programs written in C language. This software is run on PC micro-computer under
MSIDOS or OS/2 operating systems. Control loop is realized with a behaviour model [9]. Figure 9
shows the used MAP communication feature.
325

But the poor perfonnance of FTAM set a problem as the non-existence today of a database
management system operating on MAP. Therefore we used also an Ethernet network with the
netwoJX operating system 3+ OPEN LAN Manager from 3 COM Corporation. On this system has
been installed SQL Server. This multi-users relational database manager has been chosen because
of its opening and perfonnances qualities.

4.3 THE MACHINE LEVEL ELEMENTS

4.3.1. Used hardware. The experimental machine is created from a previous bought by the
laboratory machine tool. This machine is a CU 60 type machining center from Graffenstaden
Company controlled by a CNC NUM 760F.
This machine is a good answer to a range of present industrial requirements but has not all
the technical characteristics required in this project. Specially miss the possibility to developp smart
functions dedicated to control loop and a fieldbus allowing extensions with sensors and actuators.
The machine supervision software is implemented in a PS/2 micro-computer built on AT
bus and runs under OS/2 operating system. It communicates with the numerical controller by RS
232 link using a simple exchange protocol.
As fieldbus Bitbus from Intel Corporation, has been chosen because it is already used in the
industrial environment. This network extends the programmable logic controller capacities of CNC.
The programmable logic controller connection to the network uses a Bitbus node and a RS 232
link.
4.3.2. The machine computer. This micro-computer is chosen as it can work under the
multi-task operating system OS/2 and has an AT bus that authorizes a large choice of interface
boards. Its memory capacity is 8 mega-bytes, its hard disk capacity is 40 mega-bytes. The various
control tasks are developped using OBJECT!! object-oriented language. OS/2 and OBJECT!! offer
comfortable developpment tools and product user-friendly applications. This micro-computer
supports the communication on MAP and Ethernet networks. MMS-EASE and 3+ OPEN LAN
Manager therefore run on it .

4.3.3. The numerical controller. The NUM 760 controller has a modular architecture
described in figure 10. It includes three main processors:
- the numerical control processor that manages the part programs in memory, controls their
execution, realizes calculation and drives the axis; it is directly linked with the panel processor;
- the PLC (Programmable Logic controller) processor that processes the logical control
functions of the machine, especially those related to tools magazin and palets;
- the LPC (Link Processor Controller) that controls the exchanges between either previous
processors and external systems according to a user programmable protocol. These three
processors exchange infonnation on a common bus using control blocks.
A constructer specific real time operating system is active on each processor. It is the same on
PLC and LPC and it is open. This allows the developpment of new tasks written in an assembly
language similar, but not the same, as ASM 6809 from Motorola. The program debugging is
realized under the system using a specific utility program.
326

PS/2 PS/2 AT BUS


r-----
SQL
CEU...
OOMPUTER
SERVER - ----
OS/2 OS/2

- J t Ethernet ..
-- II'-
MAP
J
I

-
.
, , - --
'If ,
PS/2 AT BUS PC
MACHINE
OOMPUTER IPCX
---- - 344
OS/2

1'
RS 232 RS 232" BrrBUS '"
DA1EM J
.....
it
L.P.C P.L.C
~6809 6809
iA.
NC Processor
I.... IRCB
11 68000 r--
44/10

~ ~rfh..m..- i
I
Machine Tool CU 60
Station 1 : Machining Center Station 2
- --
Figure 7 - Computers, controllers and networks in the cell

4.3.4. Control by the fieldbus : application to futuring device. The PLC functionalities
extension across Bitbus network allows to decentralize into Bitbus nodes the tending of operating
parts by behaviour filters implementation.
The Bitbus network is formed by nodes that are built around a microcontroller monochip
Intel 8044. This microcontroller is composed of a 8051 Central Processor Unit and a serial interface
unit. The real-time multi-tasks operating system iDCX 51 run on this microcontroller, and is the
heart of Bitbus software.
327

Application Program

API
Computer
MMS-EASE

Computer
Layer 7
Bus

ACSE - ISO 8650

Layer 6 Presentation ISO 8823

Layer 5 Session ISO 8327 MAP


Controller
Layer 4 Transport ISO 8073

Layer 3 Network ISO 8473


LLC LLCI - ISO 8802.2
Layer 2 ----------------
MAC Token Bus - ISO 8802.4
cable

J
Internal
Layer 1 Broadband / Carrierband or MODEM
External

MEDIUM

Figure 9 - MAP communication system

Bitbus application layer is based on a task integrated in the firmware with iDCX. This task
named RAC (Remote Access and Control) is a special purpose task that allows the user to transfer
commands and program variables to and from Bitbus controllers to obtain the status of Input/Output
or data lines, or reverse the state of an Input/Output line or read and write memory ,etc ... Each node
accepts a maximum of eight tasks : the RAC task is one, and seven may be created by the user.
Bitbus uses a master - slave communication. Remote task communication is performed by a set of
interactions (figure 11).
A toolbox provides the user with software utilities that speed and simplify development and
debugging of applications.
The behaviour filter is implemented on Bitbus node using six tasks to control the fixturing
device: MAIN task, INTERRUPf tasks, PREPARE task (place, take), PLACE task, TAKE task,
CLOSE task (place, take).
328

PLC Processor
6809

NC Processor
I 68000 I
IE~=I8K
32 outputs

Tape reader'
Punch
I
I RAM8K P
L
N
Operator panel C
C
Serial link
B r------, U
B 32 inputs
U ' Memory extension!
A xis s ,
'
S
I

-+
Feed reference
Coordinates
16
32Ko
I
L."
"- '..
8
PLC controlled
2 axis board
for PLC program I
Measures
Machine zero B L ______ .J
B
Feed reference
I Coordinates
I Measures
T
T
NCMemory s S
Machine zero
LPC Processor
REPROM
+ I 6809
PLC
interrupt board
RAM EEPROM
40K

IRAM 16KI

RS 232 links to
computer or network interface

Figure 10 - NUM 7fJJ controller architecture

The objective of MAIN task is the reading and interpretation of commands, the management
of operating states, initialization of control field and sequences starting. It is a non-interrupted
tasks.
The INTERRUPT task has a priority on all others tasks. It interrupts the present sequence
and awaits either a default acknowledgement order or an initialization order.
The PREPARE task realizes the real actuators driving in order to unlock the part, and the
supervision of the action. This task may be interrupted.
The PLACE task corresponds to the supervision of the part placing by the robot. It is a non-
interrupted task.
329

SLAVE 1 SLAVE 2

TASK? TASK?

TASK 2 TASK 2

TASKl TASKl

1 ~ COMMUNICATlONBE1WEENLOCAL TASKS
TASK!
2 ~ COMMAND MESSAGE, FROM MASTER

3 ~ REPONSE MESSAGE, FROM SLAVE TASK 2

4~ COMMUNICATlONBE1WEEN NODE
TASK?

MASTER

Figure 11 - Communication between tasks

Figure 12 - Links between tasks


330

The TAKE task corresponds to the supervision of the part taking by the robot. It is a non-
interrupted task.
The CLOSE task realizes the real actuators driving in order to lock the part, and the
supervision of action. This task may be interrupted.
Figure 12 shows the link between the tasks.

4.4. CONTROL LOOP ON MACHINING PROCESS

4.4.1. Introduction. Control loop on machining process is a process performed on the machine tool
using the feedback of dimensional control of part, tool and clamping device. Two dimensional
technologies are integrated on the machine:
- measurement on the machine with a touch probe type "Renishaw" and by using encoder
position signals as position sensors;
- measurement by vision system with a fixed camera or moved by a robot.
These two systems have their own application fields, however they can have an intersection
domain.

4.4.2. The measure with a touch probe on machine tool. Most high technology machine tools are
equipped with this functional possibility. The principal restrictions are:
- Analysis of measure results are performed in NC program by using program parameters.
Mathematical operations, inevitably restricted, aIJow only paraxial control, which limits surface
identification.
-This probing function, specific to each numerical control technic, is not integrable in a
CAD-CAM software.
To resolve the problem caused by these two constraints, the solution is to use the
communication and data exchange possibilities of the NC and to implement analysis measurement
function on the machine computer.
The probing programs executable on the machine tool are limited to probing path in a
moving mode and encoder information transfer by LPC processor to the machine computer. These
programs are generated by a CAD-CAM software using entities description of manufacturing sheet.
The generated code is an ISO code in which we have integrated sequences which permit
synchronization of results transfer (figure 13).
When the geometrical entity is probed, encoder information are analysed on the
qualification sublevel of machine control. This function achieves:
- the transformation of encoder information in part reference,
- probing touch radius compensation with normal approach,
- identification of simple geometrical entity (straightline, plane, circle, cylinder, cone,
sphere) with shape default and positionning in part reference,
-balancing oftheorical points in relation to measured points in part reference,
- statistical analysis of machining dimensions and prediction of fluctuations rIO].
331

NlOOOXA YAZA
N20 ElOOOl=O --> semaphore initialization
N3000l 010 XB YB ZB --> probing B
N40 ElOOOl=l --> measure to transfer
N5000 XA YAZA
N60 00 XC YC ZC
N70 ElOOOl=O

Figure 13 - Probing sequence on machine tool

From this analysis, the diagnosis function detects the following defaults:
- bad part positioning,
- weared tool,
- machining dimension fluctuation.
These are corrected by the station level by :
- a new part positioning,
- radius tool wear compensation,
- machining dimension correction.

4.4.3. The measure with vision system. The functional possibilities are large in control loop cell :
dimensional inspection, localisation and recognition of part and tool, robot guiding. Although less
accuracy than touch probing for dimensional inspection, the advantage of vision system is data
availability and speed acquisition. The vision processor assigned to the machine is connected to the
machine computer on the same level that the other machine equipment.
332

A vision control interpreter has been developped to perfonn data processing execution
using simple description. Part description with dimensional entities allows automated parametering
of shape recognition and localisation modules.
The implemented shape recognition method perfonns noisy scenes recognition or
incomplete representation.
The object modelling with entities is improved progressly by threshold adjustment and by
an entities research strategy with likeness factor.
The object localisation makes easier their grasping by the handling system and is perfonned
in stereovision to define the object position in the working space. As for inspection with touch
probing on the machine, the defaults diagnosis and corrective actions calculation are achieved in the
machine computer.

5. Conclusion

The concepts synthesis developped in the control loop cell obviously expresses the fact that
cell perfonnances are tightly dependent on those of the infonnation system. Heterogeneity of
industrial equipment and their technical characteristics constrain the architecture and restrict
functional integration. They get illusory optimal exploitation of high technologies.
The electronic industrial equipment, numerical controllers, local area networks, etc ... have
to evolve to a larger capacity of data management. They have also to use more systematical
standards of data representation, data transmission, operating system. The future machines will
integrate directly enhanced control loop functions perfonned with larger instrumentation more
modularly implemented on ficldbus.

Acknowledgements

The authors expess their gratitude to MRT (Ministere de la Recherche et de la Technologic - France)
which supports the project.

References

(1) E. BAJIC, B. SALZEMANN, F. LEPAGE, G. MOREL, J. RICHARD


"flexible manufacturing cell: a programmable system for higher productivity and quality",
Applied Modelling and simulation of technological system, P. Borne, S.G. Tzafestas, Editors,
Proceedings of the first IMACS symposium on modelling and siulation for control of lumped and
distributed parameters systems. p 494 - 502.
Lille, France 3-6 june 1986.

(2) E.M. GOLDRATT


"Computerized shop floor scheduling",
International journal of Production Research, Vol. 26, number 3, p.443-455. March 1988.
333

(3) DNP MURTHY, I. DJAMALUNDIN


"Quality control in a single state production system: open and close loop policies",
International journal of Production Research. Vol. 28, number 12, p. 2219-2242. April 1990.

(4) NORIHISA KOMODA, KAZDO KERA, TAKEAKI KULO


"An autonomous decentralized control system for factory automation",
Computer IEEE journal, p. 73-83, December 1984.

(5) J. HUYSENTRUYT
"CIM - OSA, a computer integrated manufacturing based on the open system approach",
Proceedings of the 5th international Conference on flexible manufacturing system, p. 215-224.
Strafford Upon Avon, 3 - 5/11/1986.

(6) G. CHRYSSOLOURIS, M. DOMROESE, L. ZSOLDOS


"A decision - making strategy for machining control" ,
CIRP Annals 1990, Vol. 39/1/1990, p. 501-504.
Berlin, 26/8 - 1/9/1990.

(7) M. BLANCHARD
"Comprendre, mai'trlser et appliquer Ie Grafcet",
Automatisation et Production, Editions CEPADUES, 1982.

(8) F. MUNERATO
"Robotisation integree d'un flot de production manufacturiere",
These de I'Universite de Nancy 1
Nancy, 18 april 1988.

(9) F. LEPAGE, F. ARLAL, J.Y. BRON, P. MANGIN


"A behavior model for the management of application services in manufacturing",
Proceedings of the 14th IEEE Conference on Local Computer Networks, P; 125-129
Minneapolis, 10 - 12 october 1989.

(1O)J. RICHARD, G.RIS, M. VERON


"Spectrum analysis for statistical control in CIM",
Proceedings of the IFIP WG 5.3 International Conference on Modelling and simulation for
optimization of manufacturing system design and application.
Tempe, 8-10/11/1989.
CHAPTER 14

ROBOT'AND PLC SUPPORT SYSTEM


BASED ON FIBER OPTIC MAP NETWORK

P.Hodaie, B.Chan, B.Szabados O.Storoshchuk


Dept. ofECE General Motors of Canada
McMaster University, Oshawa, Ont., Canada
Hamilton, Ont. Canada
1..8S 4L7

ABS1RACf. This chapter describes how factory floor devices can be sup-
ported, using microcomputer based Local Area Networks. We start by introducing the
problems faced in computer integrated manufacturing (CIM) environment in general
and why there is a need to support production devices such as robots and Program-
mable Logic Controllers (PLC). After stating the minimum functions required from
any support system, we describe one such a system implemented in General Motors Car
Assembly plant in Oshawa, Canada. Most of the design and development of this system
had to be done in-house, with the assistance of McMaster University engineers, since
this was the first time technologies such as fibre optic media and Manufacturing
Automation Protocol (MAP) were used in full scale production environment. The
design philosophy and some technical aspects of this implementation are also included
here. There are several areas which need further work; we will discuss present efforts
and future directions.

1. INTRODUCflON

There has been an increasing level of interest in computer integrated manufacturing


(CIM) within North American and European industries in the last decade, mainly
influenced by the need to improve world-wide competitiveness. The automotive
industry in particular must deploy CIM rapidly to meet the challenge from developing
nations, who have cost benefits not achievable in North America and Europe.

The use of integrated manufacturing systems impose new problems of information


transmission, processing and storage. The central issue is the need for efficient
information flow between the computer-based equipment operating at each level
within manufacturing systems. This equipment, however, is inevitably supplied by a
multitude of vendors (for example, there are some 200 manufacturers of industrial
robots alone, each basing its robot controller on its own choice of computer hardware).
Most of these devices have their local programs and proprietary operating systems.
335
336

These softwares represent a large percentage of the total automation cost, hence
maintenance is critical.

It is estimated that only 15 percent of the 50,000 programmable tools, instruments,


controls, and systems already installed at General Motors facilities worldwide are able
to communicate with one another. When such communication does occur, it is costly,
accounting for up to 50 percent of the total expense of automation due to the custom
hardware and software interfaces needed. The result of this partial adaptation of CIM,
is many isolated "Islands of Automation" within a single manufacturing plant.

General Motors has stated that by 1990 it would have some 200,000 programmable
devices installed. By that time, enormous numbers of such devices will be operating
throughout the spectrum of manufacturing industries worldwide. With such an increase
in the number of devices, the communications problem would become unmanageable
if no solution was found.

To solve this monumental problem General Motors has spearheaded development of


Manufacturing Automation Protocol (MAP), which is based on the International
Standards Organization (ISO) seven layer model for Open Systems Interconnection
(OSI). Each of these layers is base on internationally accepted standards. Many
companies other than General Motors support MAP; with some 2500 members, the
World Federation of MAP User Group may be the largest specialized networking user
body in the world. This was specially evident at the Enterprise Networking Event'88 in
Baltimore, where more than 55 vendors demonstrated MAP products in real-world
environment.

Similar problems were faced in Oshawa Plant in 1982, when we first started to design
the plant Robot and PLC support systems. The General Motors Car Assembly plant in
Oshawa is one of the largest manufacturing facilities in the world. The plant covers an
area of 634 acres with over 147 acres under one roof. The plant workforce totals over
10,000 people, the size of a small town. As part of GM CPC (Chevrolet Pontiac
Canada), Oshawa is responsible for worldwide production of the Buick Regal and
Chevy Lumina models. The plant has two shifts in operation five days a week, with a
production rate of2 cars per minute. Oshawa is part of the GM10 program, with world
class manufacturing as a primary goal. Over $1 billion dollars have been spent on
automation, making it one of the most advanced plants in GM.

At that time there were over 500 robots and 150 PLCs operating throughout the plant,
none of which were linked to the host computers or file servers. Software maintenance
in these devices had to performed manually, which meant for each division of the plant
some body had to go around saving the software of every device on floppy discs or
cartridge tapes which were kept by each shift supervisor for that division. The back up
337

process is necessary in order to quickly restore to operation a device that suffers


memory corruption. This backup procedure was very costly due to the unreliability of
the disks or tapes, both vulnerable to being lost or damaged. The first objective,
therefore, was to automate the back up process by linking all devices to a host. The long
term goal was to fully integrate the plant floor device and systems. The only logical way
to link so many devices was to set up a local area network (LAN) on the plant floor.

2. LOCAL AREA NE1WORK SELECTION CRfI'ERIA

The manufacturing environment has unique characteristics which impose additional


restrictions on the LAN installed in such an environment. After a careful study of
existing LAN protocols and media, it was decided to base the planned network on fiber
optic Manufacturing Automation Protocol, since it satisfies all the required criteria.
These criteria are listed bellow:

- Simplified multivendor connectivity - The plant floor is very much a multi-vendor


environment with many different types of devices (from mainframes to sensors). The
LAN chosen must support all types and brands of equipment; hence, the network
must be based on an accepted standard. Using MAP, we have one wiring system, with
common protocols, and no gateways. Maintenance of proprietary protocols and
interfaces is eliminated, allowing applications portability.

- Noise immunity - Compared with other environments, such as office or a develop-


ment laboratory, the factory floor is very noisy (i.e. electromagnetic noise). It is
important to use a media that is immune to this noise in the LAN installation. Fiber
optic media was chosen due to a total immunity to electromagnetic noise, cost
effective and, ease of maintenance.

- FleXIbility - Network expansion is a critical issue. It is desirable to expand the


network if need be at later date, rather than having to build a separate network for
new stations. Fiber optic media can handle a large number of nodes over several
square kilometres, thus any network based on fiber can easily be expanded. Unlike
broadband where a major backbone installation is done up front, fiber optic is
configured in a way that lets us add media as needed and grow our network over time.

- Functionality - As the size (number of nodes) of any network is increased, so should


its functionality, since the new devices added may require new services. For example
consider a network which initially had only robots on it, and then expanded to also
support simple sensors. For robot support this system only had to provide file
transfer services, but with new sensors added, it is also required to support variable
access. This functionality upgrade is fully supported by MAP.
338

- Economica1- In general, MAP purchase prices are still higher than those of proprie-
tary networks. However, this is just one component of network cost of ownership.
MAP development and maintenance costs are significantly lower. Wiring costs go
down as well, with one common wiring system. Corporate dollars freed up from these
lower costs go into additional training, test equipment and support for MAP, thus
extending our expertise and ability to use MAP effectively. The biggest dollar
benefits though, are those resulting from the increased functionality, reliability and
performance that MAP provides.

3. NE1WORK DESIGN

Three networks have been implemented to data in the Oshawa plant, two for robot
support and one for PLC support. The two robot networks cover a 3000 foot by 1500 ft.
area, linking 515 robots to a host microcomputer and a network manager console
(which is also a microcomputer). The PLC network alone covers a 30 foot by 100 foot
area and contains 150 PLCs. The three networks have the same basic conceptual
design, as shown in Figure 1. Therefore they can be joined forming a single network at
any time. But since each network supports a different production line, it is configured
as a stand- alone system. In Figure 2 a single tapped bus is shown, but the difficulty of
passively tapping fiber optic media dictates the use of a tree topology using passive
couplers and active stars, as shown in Figure 2. The tree topology emulates the passively
tapped bus in that each signal is seen by each station only once, and the packets exit the
network without the intervention of the transmitting station (as in the bus). Each node
has two cables; one for transmission and one for receiving. Eight pairs are connected
to a passive splitter/combiner, which connects a pair of cables to an active concentrator
upstream. Each concentrator can handle 32 pairs (or 256 nodes) from the passive
splitter/combiners. The active concentrators are then connected to a passive star hub.
Transmissions pass from a node on the transmit line, up through the Combiner, through
the active concentrator to the hub, and relayed back to all nodes on the receive line to
be received by the intended node.

The physical and media layers of the network were designed to comply with the IEEE
802.4 specification, chapters 16 and 17. The fiber optic specifications are summarized
below.

Signalling Rate: 10 Mb/s, Manchester encoded.

Transmitter Characteristics: 850 nm wavelength, < 10 ns rise/fall times, 40 to


l00uW peak power into 62.5/125 fiber (3 dB lower than IEEE spec), < 60 nm
FWHM, > 13 dB extinction ratio.
339

Host Network]
Computer Manager

File
System

FIGURE 1. Layout of the robot networks

Transmit Receive

8x1 Combiner

T R

FIGURE 2. Media Topology


340

Receiver Characteristics: 850 nm wavelength, < 5 ns pulse width distortion,


Sensitivity .8 uW or better, Dynamic range 25 dB.

Passive Couplers: 8 x 1 combiners, 1 x 8 splitters, 8 x 8 stars.

Fiber Cable: Mostly ruggedized duplex 100/140 urn, with some 62.5/125 used in
point to point links.

Active Star: All ports meet station transmitter and receiver specifications, all
signals are re-timed to prevent jitter accumulation.

The industrial devices supported by these networks are GM Fanuc robots, Cincinati
Millacron robots and Allen Bradly PLC3 PLCs. The hosts and network manager
consoles are Intel 80286 and Motorola 680XO based microcomputers. Active and
passive components for the media are supplied by Thomas & Betts, At&T, AMP, and
CD Networks. The network controller cards are supplied by Motorola, Ungermann-
Brass, Concord Communication, and Allen Bradley. The resulting network is a true
multi-vendor environment.

4. SUPPORT APPUCATION SOFIWARES

Although we can all appreciate and get excited about MAP technology, the real interest
is in the applications and the benefits they provide the factory. MAP is simply a
networking infrastructure that our support applications use to achieve reliable commu-
nication. With our wiring in place, and interoperability among several vendors, we can
add new applications quickly, and build on this infrastructure. The application soft-
wares developed in-house for these networks are written in C programming language
and run on the hosts under UNIX operating system. To clearly understand what is
involved in supporting plant floor devices, we describe an application we use to support
our production robots. This application is called Robot Support System (RSS) and run
on dedicated IBM PC/AT compatible (this software can also run on Motorola 1131
VME machine).

Basically, the software allows users to:

a) Upload programs from robots and store them on disk. This can be done via user
requests at a system terminal or form any robot on the MAP network.

b) Download programs from the Host's disk into the robot's memory. Again, this can
be done from either a system terminal or directly from any of the robots.
341

c) In addition, the system allows users at a system terminal to examine file


directories on the robots, delete a file on the robot, move files, and perform a
number of related supporting functions.

The host machine (while very low-cost) takes advantage of the increasing power and
capacity of microcomputer technology. This "little" machine provides very satisfactory
performance and response. In addition, 120 Megabytes of disk storage provide enough
space for thousands of robot programs. Separate space is provided for both the Current
and the Previous working copies, as well as a more secure Master (Production) copy.
There are also some sophisticated features you might not expect on a small system such
as this. For example, the software has been designed to provide logging for all
transactions, built-in debug capabilities, "locking" to avoid dual-simultaneous file
access, on screen help, and powerful ALLFILES function to make the user's job easier.

Based upon the plant's requirements, the system manager can configure a series of
menus starting from the Home Menu through the Robot Menus. Between the Home
Menu and Robot Menu there can be up to 6 Area Menus. Figure 3 illustrates an
example of a generic menu tree structure for a plant. There are two types of menus that
can be defined: an Area Menu and a Robot Menu. An Area Menu is a subdivision of
the plant, which eventually isolates a group of robots. A Robot Menu identifies the
actual names of robots located within a specific plant area.

Once running, RSS allows the user to select a specific robot through a multiple menu
selection process. This process guides the user in a hierarchical fashion from the shop
areas in the plant, the line or sub-areas within the shop, the functions available, to the
robot files.

The user must make a selection at each decision point. For example, the user will first
make a choice between HOME and UTIUTIES. If HOME is chosen, the user will then
make a choice between BODY and TRIM areas. If BODY is chosen, the user will then
choose LEFf or RIGHT within sub-area SIDES, and so on.

4.1 System Structure

RSS is structured in a modular fashion. This provides for reasonably manageable tasks
for development, and allows for the maintainability of the software.

The system is separated into the following main parts:


-User interfaces.
The interactive components that a user sees and manipulates to accomplish the
tasks of retrieving, storing, and maintaining the robot file and program informa-
tion.
342

EXAMPLE MENU TREE STRUCfURE

WGOUT HOME UTILITY

UTILITY FUNCTIONS

TRIM ROBOTS
SIDES END
LEFT FRONT REAR
ROBOTS ROBOTS ROBOTS ROl
R02
R06
R04 ROS
R09 RlO
RIGHT I
FUNCTIONS
ROBOTS I
FILES
R07 R08
Figure 3. Generic Menu Tree Structure for a Plant.
-Function control servers.
The components that manage the logic required to perform the functions
requested by the user.

-Communication servers.
The functions that perform the task of moving files and data between the robot
and the file storage system.

-File maintenance servers.


The components that manage the storage and retrieval of data within the file
storage system.

-Log file functions.


This allows all communication and interactive transactions to be logged to a file,
there by providing for error recovery and providing a trace capability to deter-
mine the cause of a particular problem.
343

-Menu maintenance functions.


Menus are configurable, allowing each plant to create its own unique access
screens. However, the control software does not change even though the screens
are different.

-Terminal configurations.
This system provides for access using many different terminal systems. Terminal
keyboard variations are placed in a configuration file, eliminating the need for
different software when a new or different terminal is added to the system. There
is no dependence on a particular vendor's terminal.

42 RSS Use During Initial Robot Programming

In many cases, a robot will be programmed by the equipment supplier prior to delivery
to the plant. If this is the case, the supplier will probably leave the program in the robot.
This will normally be the case if the robot uses Bubble memory. The supplier also
provides an up-to-date copy of all robot programs on some other media such as cassette
tape or floppy disk.

When the robot program is supplied by the supplier as described above, the robot
program is uploaded from the robot's memory into the RSS system as soon as the RSS
Manager has added the new robot to the network and RSS. Upon uploading the robot
program (or multiple programs if more than one resides on this device), the tape or disk
provided with the robot is also loaded into the robot and uploaded. The first upload will
be automatically copied from the "CURRENT" directory into the "PREVIOUS"
directory, while the "CURRENT" directory will contain the tape image. Next the user
does a compare to see if the two are identical. If there are any differences, the OEM
should be required to explain the discrepancy.

When the robot is programmed by the supplier, the plant probably still has to do some
fine tuning of the robot path after the arm is mounted in its final production position.
During this fine tuning process, the programmer saves his work (Upload into RSS host)
on a periodic basis, especially if it has been very time consuming.

If the entire robot program is to be developed at the plant, then it will be up to the
programmer to do uploads at an interval that is convenient, yet minimizes the potential
lost effort. The point to be remembered by the programmer is that each upload goes
into the "CURRENT" directory, and forces the "CURRENT" copy to be moved into
the "PREVIOUS" directory. At this point, the older "PREVIOUS" copy is destroyed.
If the user feels there is any chance that he might need to go back to an older version,
he should logon to RSS and copy the "PREVIOUS" file into his personal directory.
344

4.3 Making a MASTER Copy of a Robot Program

In all cases, once the robot program is finalized and metal has been run through the
station to verify a working program, the user Uploads the finished file to the RSS host.
The programmer will then see to it that this working version is copied into the
"MASTER" directory on RSS. In general, all copies into the "MASTER" directory are
to be made with great care. This copy must be known to be correct and must function
when loaded into the robot.

It is absolutely essential that a person is assigned to be the RSS system manager. All the
other RSS users should know who he is and how he can be contacted. It is also a good
idea that to have a user coordinator for each shift to assist the system manager and help
when the system manager is not present.

4.4 Security

A key administrative function of the system manager is to ensure that each RSS
function can only be accessed by authorized personnel. A security system has been
developed to simplify this job. This system has two modes, "Password Protection" mode
and "Full 3-D Protection" mode. In "Password Protection" mode the system utilizes a
multiple password approach to access the robot and utility functions. Also, a user who
has been granted password permission to a specific function may perform that function
on any device in any area ofthe plant. In contrast, the "Full 3-D Protection" mode relies
on a three-dimensional database (User, Device, and Function) to control the access to
RSS functions. This mode is very flexible, the permission for a user to perform any
function can be granted or denied on any device in any designated plant area. The Full
3-D protection mode can be used in conjunction with Password Protection mode, or the
system may be configured to such that the 3-D mode will serve as an alternative to the
standard Password mode. The user login ID and corresponding password are main-
tained. At the point in which the user attempts to perform a robot function or utility,
the system checks to determine whether or not the user should be permitted to access
the respective RSS functionality, based upon a predefined set of plant rules. The result
of this access check is relayed to RSS. If the permission has been denied, the user will
not be ;illowed to access the selected functionality. All failed attempts to access are
logged. If the 3-D mode grants the user access, the system then uses the password from
the Password Protection mode to supplement the 3-D result. If a non-zero security level
has been assigned to the respective function by the RSS Manager, the user will be
prompted to enter the appropriate password for that security level. The access to the
selected function is granted only if the user responds with the correct password. The
standard security system of any RSS with 3-D Security option installation may be
disabled by assigning a security level of zero to all functions/utilities.
345

45 Further Modifications to Robot Programs

The precautions on replacing the MASTER copy of a robot program will continue to
apply to all program modifications. While experience or process adjustments will
demand that the robot programs (logic or path) be changed, all such changes must be
officially approved before the new master program can be copied into the RSS
"MASTER" directory. The official approval process will vary from one plant to
another, and may even vary from time to time within the same plant. In general, it will
consist of identifying the person who made the change and the reason for the change.
The justification should be confirmed by the manager responsible for the area of the
plant containing the robot.

As a precaution against accidentally picking up unauthorized changes along with the


desired change, a detail compare should be done against the existing "MASTER" file.
If the only difference is the change that was approved, then it is safe to replace the
existing master with the new version.

As a safety against accidentally destroying the only working version of a robot program,
two steps are strongly recommended"

1) Copy the existing master file to someone's personal directory prior to copying the
"CURRENT" to "MASTER".

2) Make sure that the robot actually runs with the version that is to become the new
master.

4.6 Complete List of RSS Functions and Utilities

The following list identifies all the functions supported by RSS.

1) Display the directory of the robot. Displays all the files located on a specific
robot. The robot will send its directory up to RSS. For viewing only.

2) Display the RSS Pry directory. Displays all the files located within the RSS
Previous (work) directory associated with a specific robot. For viewing only.

3) Display the RSS Cur directory. Displays all the files located within the RSS
Current (work) directory associated with a specific robot. For viewing only.

4) Display the RSS Mst directory. Displays all the files located within the RSS
Master (production) directory associated with a specific robot. For viewing
only.
346

5) Upload one or more robot files. Copies one or more robot files/programs
from a robot to the robot's Current (work) directory on the RSS host. The
MENU will receive a current directory from the robot and display the
directory listing on the screen. The user may then select one or more files
from the list to be Uploaded.

6) Upload all robot files. Copies all robot files/programs from a robot and
stores them in the robot's Current (work) directory on the RSS host. The
MENU will receive a current directory from the robot and display the
directory listing on the screen.

7) Download one or more RSS Cur files to robot. Copies one or more robot
files/programs located within the robot's Current (work) directory on the
RSS host to the robot. The MENU will display the RSS Current directory
listing for that robot and allow the user to select one or more files from the
list to be Downloaded to the robot.

8) Download one or more RSS Mst files to the robot. Copies one or more robot
files/programs located within the robot's Master (production) directory on
the RSS host to the robot. The MENU will display the RSS Master directory
listing for that robot and allow the user to select one or more files from the
list to be Downloaded to the robot.

9) Download all the RSS Cur files to the robot. Copies all robot files/programs
located within the robot's Current (work) directory on the RSS host to the
robot. The MENU will display the RSS Current directory listing for that
robot.

10) Download all the RSS Mst files to the robot. Copies all robot files/programs
located within the robot's Master (production) directory on the RSS host to
the robot. The MENU will display the RSS Master directory listing for that
robot.

11) Delete a file from the robot. Remove a single file from a robot. This function
must be repeated for each file to be removed. The MENU will receive a
current directory from the robot and display the directory listing on the
screen. The user may then select one file from the list to be deleted from the
robot.

12) Delete a RSS file from the Prv directory. Removes a single file from the RSS
Previous directory for a particular robot. This function must be repeated for
each file to be removed. The MENU will receive a directory and display the
347

directory listing on the screen. The user may then select one file from the list
to be removed.

13) Delete a RSS file from the Cur directory. Removes a single file from the
RSS Current (work) directory for a particular robot. This function must be
repeated for each file to be removed. The MENU will receive a directory
and display the directory listing on the screen. The user may then select one
file from the list to be deleted.

14) Delete a RSS file from the Mst directory. Removes a single file from the
RSS Master (production) directory for a particular robot. This function
must be repeated for each file to be removed. The MENU will receive a
directory and display the directory listing on the screen. The user may then
select one file from the list to be deleted.

15) Copy one or more RSS robot files from Cur to Mst directory. Copies one or
more files located within the RSS Current (work) directory of a specific
robot to the RSS Master (production) directory of that robot.

16) Copy one or more RSS robot files from Pry to HOME directory. Copies one
or more files located within the RSS Previous directory of a specific robot
to the user's HOME directory.

17) Copy one or more RSS robot files from Cur to HOME directory. Copies one
or more files located within the RSS Current (work) directory of a specific
robot to the user's HOME directory.

18) Copy one or more RSS robot files from Mst to HOME directory. Copies one
or more files located within the RSS Current (work) directory of a specific
robot to the user's HOME directory.

19) Copy all RSS robot files from Cur to Mst directory. Copies all files located
within the RSS Current (work) directory of a specific robot to the RSS
Master (production) directory of that robot.

20) Copy all RSS robot files from Prv to HOME directory. Copies all files
located within the RSS Previous directory of a specific robot to the user's
HOME directory.

21) Copy all RSS robot files from Cur to HOME directory. Copies all files
located within the RSS Current (work) directory of a specific robot to the
user's HOME directory.
348

22) Copy all RSS robot files from Mst to HOME directory. Copies all files
located within the RSS Master (production) directory of a specific robot to
the user's HOME directory.

23) Display a RSS robot file from the Pry directory. Displays the contents of a
RSS Previous file on the user's terminal.

24) Display a RSS robot file from the Cur directory. Displays the contents of a
RSS Current (work) file on the user's terminal.

25) Display a RSS robot file from the Mst directory. Displays the contents of a
RSS Master (production) file on the user's terminal.

26) Compare file from the Cur directory with Mst and Prv directories. Com-
pares a file in Current with a file in Master and then Previous directories.
If the file does not exist in Master and/ or Previous, a message pertaining to
such is displayed.

27) Archive all saved log files onto a floppy diskette. Initializes the log file
system and archives all previous log files onto a floppy diskette.

28) Retrieve log files from archived floppy diskette. Retrieves logged files from
an archive floppy, and copies them into the current user's personal directory
(HOME Directory).

29) Mail a message to the RSS manager. To be used to tell the system manager
about any problems and/or to indicate if a program should be put in the
Master (production) directory.

30) Change RSS password. Allows all RSS users to change their RSS password.
Prompts user for old password, asks for new password, and then asks user
to re-enter new password.

31) List RSS users. Checks validity of RSS user setup. Gives a list of RSS users
by UNIX ID names. Also gives a list of all defined RSS Security Levels.

32) Backup robot directory. Copies all RSS robot host directories and files onto
backup tape system.

33) Backup HOME directory. Copies user's HOME directory onto a floppy
diskette.
349

34) Restore robot directory. Copies RSS robot host directories and files from
backup tape system onto the host directory.

35) Restore HOME directory. Copies user's HOME directory from backup
floppy diskette onto the host directory.

4.7 Other Notes on RSS

It is extremely important that the end users of the RSS system (people on manufactur-
ing floor) are aware of what the system can do for them and how to use it. We have
ensured that these individuals receive RSS training to satisfy their needs.

RSS is most useful during periods such as plant startup or changeover when numerous
robot changes are being made. Therefore it is important that the plant network stay up
during these production down times.

As a programming aid, the RRS system can be used to accomplish what is called "Cross-
Load" function. When a program from one robot can be used with little or no
modification in another robot, it can be copied from one robot to the other. To do this,
the file is first uploaded from the first robot, then copied to any directory of the second
robot via user's own directory and, finally downloaded into the second robot.

The value of RSS, and of any similar support system for programmable devices, lies in
getting the device back to a known and running condition quickly. Many hours are spent
in programming such a device. All of this work is at risk of being lost if the robot's
memory is wiped out by natural disaster, operator error, or internal device failure. In
addition to all this programming effort being lost, significant amounts of downtime for
the entire plant can result from such a memory loss.

But just having a support system like RSS available does not guarantee that the plant
is not vulnerable to these kinds of disasters. That can only be assured if the plant is using
the available tools to protect against major losses. In other words, the real benefits can
only be realized by establishing and following good procedures.

4.8. Factory Automation Support Tools (FAST) for PLCs

As stated before we have over 200 PLCs that provide control to various factory
automation and conveyance system. As with robots, PLC programs require frequent
backup and management as well, therefore all of the RSS functionality described above
are also supported by FAST system for PLCs. In addition, download capability is
essential for recovery since the plant may shut down if we lose one PLC. Our PLCs
control the assembly area of the plant where all cars must pass through a series of
350

unique, sequential stations. If one PLC goes down, there is not an alternate line or route
that can accomplish the task of the failed node, and production is stopped.

Using microcomputer hosts, we currently have connectivity to the PLCs. Our PLC
support system, known as FAST (factory automation support tools), provides upload/
download capability and lives up to its name. The primary problem we needed to solve
is one of speed. Manual tape backup takes 45 minutes per PLC. Backup over Allen
Bradley Data Highway takes about 20 minutes per node. Our MAP network accom-
plishes backup in less than 1 minute. In the event of a plant shutdown due to a PLC
failure, recovery can be accomplished in 1 minute instead of 20.

5. CONCLUSIONS AND RECOMMENDATIONS

Restoration to a known production state in case of failure is the minimum functionality


required to support any type of factory floor device. For intelligent (microprocessor
based) devices such as robots and PLCs, this restoration involves an efficient transfer
of software to and from a reliable storage facility. In this chapter we have described how
this can be achieved using a microcomputer based local area network. A system we have
designed and developed for General Motors Car Body Assembly Plant in Oshawa is
used as an example throughout the chapter.

Above the basic restoration, there are other useful functions a support system should
provide. For example in a typical environment where softwares are modified frequently
to meet changing production requirements, program version control is essential. The
benefit of version control function is in achieving administrative control.

The issue of security becomes important, as the functionality of any support system
increases. Since some of the functions of the support system can effect production, only
authorized personnel should have access to these functions. In section dealing with
security, we have fully covered this issue and given an example of a 3-dimensional
security system built on top of a database.

New applications are presently being developed for the system. Real-time monitoring
of the plant, involving both addition of new devices and new functions (such as variable
access), is a prime example. New applications may be added to the network without
major upheaval since the system is easily expandable and enhanced functionality is
already in place.

A major issue that is becoming increasingly urgent is the integration of all applications
running on a network. Until this integration is achieved, it will not be possible to utilized
the benefits of our plant communication system. We propose development of a global
351

application interface environment which would lay above the network providing a
common application access, and development environment to all applications.

The network is best justified in economic terms. Consider, for example, a comparison
of times required for file backup on a PLC using tape/disk, a Data Highway network,
and MAP. A typical PLC download takes approximately 45 minutes (from experience)
using tape/disk, 20 minutes using Data Highway, and less than one minute using MAP.
In the event of a plant shutdown due to a PLC failure, recovery can be accomplished
in 1 minute instead of 20. Plant down time costs are estimated at $3,000 to $5,000 per
minute. This translates to a $57,000 to $95,000 saving every time one PLC crash causes
a plant shutdown. The system can pay for itself quite quickly. As new applications are
developed, systems such as this one will likely prove invaluable in keeping North
American and European industries competitive.
CHAPTER 15

EXPERIMENTAL RESULTS OF PARAMETER


IDENTIFICATION AND NONLINEAR CONTROL
ON A PM STEPPER MOTOR
Michael Aiello and Ronald Rekowski
Aerotech. Inc .• 10 1 Zeta Drive. Pgh. PA 15238

Marc Bodson
Department of Electrical and Computer Engineering
Carnegie Mellon University. Pgh. PA 15213

John Chiasson and David Schuerer


Department of Electrical Engineering. 348 BEH
University of Pittsburgh. Pgh. PPA 15261

Abstract
This paper discusses the application of modern nonlinear control
theory to the fast and accurate positioning of permanent magnet (PM)
stepping motors. The mathematical model of a PM stepper motor is given and
a control algorithm based on the recently developed feedback linearization
approach which uses only position measurements form an optical encoder is
described. Furthermore, a least squares parameter identification algorithm
to determine the resistance, inductance, torque/back-emf constant, moment
of inertia and viscous friction coefficient of the motor are presented.
The identification and control algorithms are implemented on an
experimental set-up consisting 0/ the Motorola DSP56001ADS Digital Signal
Processing System, a personal computer, two PWM amplifiers, and a PM
stepper motor. The results of the identification and control algoritms are
presented.
1. Introduction
DC motors have traditionally been used in positioning systems
precisely because they are easy (relatively speaking) to control. This
ease of control is due to the fact that a DC motor is a linear system.
Originally, stepper motors were designed to provide precise positioning
control to within an integer number of steps (e.g., 200 steps for a
resolution of 1.80 ) without using sensors. That is, they are open loop
stable to any step position and consequently no feedback is needed to
control them. Furthermore, by a more precise control of the phase
currents, microstepping can be ac~eved in which there are, for example,
2000 steps for a resolution of .18 which is the resolution limit of a
typical position transducer (optical encoder). But, using the stepper
motor in an open loop configuration results in very low performance. In
particular, the PM stepper has a step response with a significant overshoot
and a long settling time. Furthermore, at high rotor speeds, the phase
currents cannot build up nor decay fast enough to keep up with the applied
353
354

voltage frequency. These are major obstacles to using the stepper


as a fast accurate positioning system.
The objective of the work presented here is two-fold: (i) Presentation
of experimental results using least squares identification methods to
validate the existing nonlinear mathematical model [3][4][5] of the PM
stepping motor and determine its parameter values (ii) Implementation of a
nonlinear controller on the Motorola DSP5600IADS Digital Signal Processing
System and associated hardware along with a study of its tracking
capabilities.

This paper is a progress report about on-going research. The first


set of results are reported in [1] and [9]. The outline of the rest of
paper is as follows: Section 2 gives a nonlinear mathematical model of the
PM stepper motor and describes the exact linearizing controller that was
implemented to position the motor (more detail is given than reported in
[1]); Section 3 describes the least squares identification technique used
to validate the model and find its parameters; Section 4 describes the
experimental setup and presents the results of the identification
experiments and of the implementation of the nonlinear controller;
Finally, in Section 5, future research considerations are discussed.

2. Feedback Linearization Controller


The equations describing the PM stepper motor are well documented
[3][4][6][9]. See [3] for an especially clear derivation of these
equations and [9] for a qualitative description of the operation of a
stepper motor.

di
~ = [va -Ria + Kmrosin(Nre))/L
di
W = [vb - Rib - Kmrocos (Nre)]/L
(1)
dro
Of= [-Km a r e) + Kmi bcos(Nre) - Bro]/J
i sin(N
de
Of =ro
ia,ib and va, Vb are the currents and voltages in phases A and B,
respectively. L and R are the self-inductance and resistance of each phase
winding, K is the motor torque (back-emf) constant, N is the number of
m r
rotor teeth, J is the rotor inertia, B is the viscous friction constant, ro
is the rotor speed and e is the motor position. This model neglects the
slight magnetic coupling between phases, the small change in L as a
function of position e due to the rotor teeth [3], the variation in L as a
function of phase current level, and the detent torque. The results given

below suggest that the model given in (1) is sufficient for the control
355

design.
It is proposed here that the high rotor speed control problem briefly
mentioned in the introductory section can be circumvented by using the full
(four state) mathematical model in the design of a nonlinear feedback
controller. Since this model incorporates the speed dependent phase shift
between phase voltage and phase current [6], the control law will then
automatically take this dependency into account. In particular, a modified
exact feedback linearization [10] controller is designed using this full
order model. Applying the exact linearization method to the PM stepper
motor was first considered by Zribi [7] and later refined in Zribi and
Chiasson [8].

It should be pointed out that ruc et al. [2] considered the feedback
linearization technique on variable reluctance stepper motors and an
implementation of this controller was reported in [14].

The feedback linearization technique is a combination of a nonlinear


state-space transformation combined with state feedback so that the
resulting system is linear [8][9][10][11]. For the particular application
considered here (Le., the stepper motor), this technique is equivalent to
the DQ transformation of electric machine theory [8]. The DQ
transformation is given by

id cos(Ne) sin(N e) 0 0 i
r r •
i
q ~ -sin(Nre) cos(Ne)
r
0 0 ib (2)
00 0 0 1 0 00
e 0 0 0 1 e

[::l ~ [.:::::: ::::::l [::l (3)

In these new state variables, the system becomes

~t' = -Ri d + NR ooLiq + v d

l~
-dt"'
= -Ri q - NR ooLid - Km00 + v q (4)

JdOO = K i - Boo
iff mq

de
at = 00
356

Using the nonlinear feedback given by


vd = Rid - NRroLi q + v*d
(5)
V
q
= Riq + NRroLid + Kmro + v*q

results in the linear system

id 0 0 0 0 id 1 0

[:~l
Li
q
0 0 0 0 i
q
0 1
d
(If Jro = 0 Km -B 0 ro + 0 0
(6)

e 0 0 1 0 e 0 0

Practical Issues in Implementing an Exact Linearization Controller


The key to using the above decoupled control law is to choose an
appropiate desired direct current idd for the feedback law to track. To
understand this, note from the second equation of system (4) the back-emf in
the DQ coordinates is given by K ro. Choosing v as given in (5), shows that
m q
v must cancel this back-emf along with the phase resistance and inductance
q
voltage drops given by Riq + NRroLid. In a typical stepper motor, the
amplifiers will quickly saturate since v would be asking for more voltage
q
than is available. For example, a typical value of K is.2 and for speeds
m
of 500rpm, lOOOrpm, lS00rpm and 2000rpm, the corresponding values of the
back-emf Km 0> would be 1Ovolts, 2Ovolts, 3Ovolts and 4Ovolts, respectively.
However, the supply voltage for the system is only 40 volts. To get around
this, the field-weakening approach is used in which vd is chosen to make id
negative. As seen from the second equation of system (4), the term NRroLid
will then help to cancel the back-emf term K 0>.
m
Another issue in implementing the above control algorithm is that the
PWM amplifier adds high frequency noise to the measurements of the phase
currents i., ib which in turn makes them useless for feedback. Analog
Butterworth filters were used to filter a great deal (but not all) of this
noise. However, these anti-aliasing filters add unacceptable phase shifts
to the current measurements. (Simulations have even shown that these phase
357

shifts will not allow the exact linearization controller to work.) We


presently circumvent this problem by using for feedback only the position
and speed measurements, while using the desired currents i ,i in place
qd dd
of the actual currents iq and i.
d
Therefore, this is a combination of a
feedbacklfeedforward controller. The feedforward control for the currents
is used since it is insensitive to sensor noise and delay. However,
feedback control is necessary to reduce the effect of input disturbances and
unmodeled effects. The results presented in section 4 demonstrate this
controller to be a reasonable compromise to the existing state of affairs.

The desired current i is computed using the desired trajectory by


defining i
qd
~ (Ja +Bco
d dm
)/,l, where a d
and co are the desired acceleration
d
and speed, respectively. A glance at equation (4) shows that this is the
appropriate choice for i . On the other hand, i is defmed by
qd dd

(7)

This expression is derived straightforwardly by maximizing the torque


(Le., i ) under constant speed conditions with v2 + v 2 ~ V for some fixed
q • b
V. This is equivalent to using the optimum (in the sense of producing the
maximum torque at constant speed) lead-angle [4][6][3] at constant speed.
Note that i dd is negative which helps to cancel out the back-emf so that the
applied voltage v can be used for increasing i and hence the acceleration
q q
without saturating the supply voltage.Interestingly, Brown et. aI. [15] have
shown that the time optimal control (point to point with free end velocity)
of a stepper motor is only 1% better than using the optimum lead angle
controller.

As mentioned above, we feed back the position and speed while


feedforwarding the desired currents vis-a-vis

v =-Ri -NroLi
d dd R qd
(8)
v q = - Riqd - NRroLiddm
-K co

Then, using (4) and (8), the closed-loop system may be written as follows:
358

* = R( id - id d) + NReoL(iq - i qd )

T2i.., = R (i - i q d) - NReoL(id - i dd ) (9)


-d t" q

Jdeo
Of
= K i - Beo
mq

de
Of = eo

Next. i qd is modified to introduce a feedback action from the position


error. A simple PID algorithm was found to be useful for that purpose. As
seen above. i has a direct influence on the speed of the motor. e.g.• if
q
the motor is too slow. more i is needed and vice-versa. This motivates the
q
choice of iqd as

iqd ~ (Ja.d+Beo)1K
d m - K
-1"
r(e-ed)dt - KP(e-ed) - KD(ro-eo)d (10)

To increase i • the PID controller will cause v to be more negative and.


~ d
as a consequence of this field weakening. cause eo to increase.

3. Model Validation and Parameter Identification


The control law proposed in section 2 depends on parameters R. L. K •
m
J. and B. It may be expected that the optimization of closed-loop stepper
motor responses will strongly depend on the precise determination of its
dynamic characteristics. Since some of the parameters may be varying with
time. on-line adaptation is also desirable. In [9]. we proposed a technique
to identify the unknown parameters based on input/output data obtained in
normal operation of the stepper motor. We now report some results obtained
with this technique.

Identification Method

While the dynamic model of the stepper motor is nonlinear. the


dependence of the right-hand side of (2) is linear in the parameters
Kt .....K s defined below.

Specifically. if we define
359

K
XI I
0
X2 K2
X = , V = [::]. K = K3 ' r(x,v) = 0 (11)
x3 0
x4 K4
x3
Ks
and

-x
I
-x
2
0 0
x sin(N x) -x3cos(N x ) 0 0
3 r4 r 4
W(x,V) = 0 0 -x sin(N x )+x cos(N x )
1 r4 2 r4 0
0 0 -1 0
v
a Vb 0 0
(12)
Then (1) may be rewritten as

X = WT(x,v)K + r(x,v) (13)

If 'C L *- 0 and is unknown, it may be considered an additional parameter K 6 ,


with minor adjustments.

The linearity of eqn. (13) in the parameters make the system amenable
to standard identification techniques. We assume that x and V are known
exactly, as well as x.
1\
Denote by K(t) the estimate of K at time t, and bye., the
1
so-called identifier error
T 1\ •
e. = W (x,v)K + r(x,v) - x (14)
1

= WT(x,v)(K - 1\
K) (15)

The identifier error e. is available (see (14» and reflects the


1\ 1
parameter
error R = K - K (see (15». A least-squares identification algorithm can be
1\
obtained so that K(t) minimizes the error

J
I

E.(t) = I e.('C) 12d'C (16)


1 1
o
360

Specifically,
1\
K(t) = [So W(x(t),v(t»W{x(t),v(t»
t J t
dtf I[ W(x(t),v(t»x(t)dt] (17)
o
1\
assuming that the inverse exists. In the absence of noise, K(t) =K as soon
1\
as the inverse exists. Otherwise, K minimizes (16).

An approximate recursive form of (17) may also be obtained. leading to


a standard recursive least-squares algorithm (see Sastry and Bodson [12]).
1\
K(t) = - gP(t)W(x(t),v(t»{WT(x(t),v(t»K(t)-x(t)}
1\.
(18)

p(t) = -gP(t)W(x(t),v(t»WT(x(t),v(t»P(t) (19)

where ~(O), P(O) = pT(O) > 0, and g > 0 are arbitrary, and P(t) is a 5x5
symmetric matrix.

A drawback of the approach is that the derivative of the state x is


necessary. Since it is not directly measurable, an approximate derivative
must be implemented. The problem may be circumvented at the price of some
additional complexity in the identification algorithm (cf.Hsu et al. [13]).
In the results presented here, the identification algorithm was done
offline.

4. Experimental Set Up and Results


Experimental evaluation of the above control and identification
algorithms were carried out as part of a University-Industry cooperative
effort between the University of Pittsburgh, Carnegie Mellon University, and
the Aerotech Corporation, Pittsburgh. The experimental set-up motor
consists of a two-phase stepper motor with an optical encoder, a Motorola
DSP560001ADS digital signal processing system, two amplifiers, four AID's
for each of the two motor voltages and currents and two D/A's for the
control signal. Two separate systems were built: one each for the
identification and controller experiments, respectively.
The motor used for the identification experiments has a 6 Ampere
nominal phase current and a 265W maximum input power and is controlled
through a voltage regulated pulse-width modulated (PWM) ±80V source. The
manufacturer's data supplied for this motor was R = 0.250, L = 0.023H, Km =
0.44 N-m/A and J = 18.7xlO·sKg-m-sec2•

The motor used for the controller evaluation experiments has a 6 Ampere
nominal phase current and a 70W maximum input power and is controlled
through a voltage regulated pUlse-width modulated ±40V source. The
361

manufacturer's data supplied for this motor was R = 0.50, L = 1. 25mH, Km


(not given) and J = 2.7xlO-SKg-m-sec2 •
The control algorithm was implemented on a Motorola DSP56001 digital
signal processing chip. The DSP has a clock frequency of 20.48 MHz and a
capacity to process 10.25 million instructions/sec. The output interface of
the DSP system is achieved through two 12-bit D/A converters. The
measurements of v.' Vb' i., ib are obtained through 4 8-bit NO converters.
All four readings v.' Vb' ia' ib were needed for the identification
experiments while the readings of i and i were used to compare the actual
• b
currents with the desired currents. The voltages v ,v were limited to a
• b
resolution .3125 volt because the AID's had only 8 bits to measure the
full-scale 80 volt source.

Pulses from an optical encoder were used to provide discrete


measurements of position and speed to the DSP. The position was obtained a
by counting the pulses from the 2000 pulse/rev optical encoder while the
velocity measurements were obtained by counting the number of pulses per
unit time.

Controller Evaluation

A reference (desired) velocity profile along with the actual (measured)


velocity is shown below. To see how well the controller is tracking, the
velocity error is shown below. The maximum errors occur when the motor is
stopping and starting.
• 00r-----~~----~~----~~----~------~------_.

i
......
III

~
g
c::I

~
to> -100
S~
> -200

-300

-.OO~----~------~~----~~----~~----~~----~
o 0.2 0 .• 0.6 0.8 1.2

Velocity Profile Time in sec


362

15

10

U
IaI
en
"-
~
e
c.:
0
c.:
c.:
IaI -5

-10

-15
0 0.2 0.4 0.6 0.8 1.2

TIME (SEC)

Velocity Error
The actual and desired positions are now shown below followed by the
position error. Note, as expected, the maximum position error occurs when
the velocity is at a maximum.
90

60

70

Ii> 60
:r.
!i
Q
<C 50
~
:r. 40
0
E
II)
0 30
c..

20

10

0
0 0.2 0.4 0.6 0.8 1.2

TIME (SEC)

Actual and Desired Positions


363

0.3r-------~~._--~-------r------~------~------_.

0.2

til 0.1
~Q
g
Z
o
E
~ -0.1
c..

-0.2

-0.3L-------~------~------~------~~--~~----~
o 0.2 0.4 0.6 0.8 1.2

Position Error TIME (SEC)

Although the currents i d and i q were not fed back, there were measured
and plotted below. The fIrst plot below is that of the desired and measured
direct current while the next plot is of the desired and measured quadrature
currents. As seen below, there is quite a discrepancy between them.

0.5

in
c.. -0.5
:E
~
fZ
r.:I
-1

III:
III:
::> -1.5
u

-2

-2.5

-3
0 0.2 0.4 0.6 0.8 1 1.2

Desired and Actual Direct Current Time in sec


364
.r-------~------~------_r------~------~------_,

_.L-______ ______ ______


~ ~ _ L_ _ _ _ _ _ ~ ______ ~ ______ ~

o 0.2 0.4 0.6 0.8 1.2

TIME (SEC)

Desired and Actual Quadrature Current


The primary reason for this poor current tracking is the phase shift
due to the (butterworth) filtering of the currents. The phase shift due to
the butterworth filters was simulated for the desired currents and plotted
below along with the measured currents. Note the dramatic increase in
tracking for id.

0.5

iil -0.5
c.
::&
~
-1
~
loll

""
~
Co)
-1.5

-2

-2.5

-3
0 0.2 0.4 0.6 0.8 1.2

Actual and Phase-Shifted Desired Direct Current Time(sec)


365

..
3
,I
~--- '
"
:' \"- .. _--\
,:

/
2 \,
I \
Vi' '\
0-
:I
~
0
~
III
0::
0::
::> -1
u

-2
,,
-3

-4L-------~------L-----·--~------~------~----~
o 0.2 0.4 0.6 0,8 1.2

TIME (SEC)

Actual and Phase-Shifted Desired Quadrature Current

Results of the Identification Experiments

The measured and ftltered position and phase currents (i I ,ib) are shown
below in the dashed curves. The constant voltages of vI = 0, vb = 4 volts
were used along with off-line calculations of speed, acceleration and
derivatives of the currents as "inputs" to the above identification
algorithm. The resulting values of the parameters were: R = .62 ohms,
L = .OO31H, Km = .34 Nt-m/amp, J = 14.05xlO- SKg-m2 and B = -.0013 Kg-m2/sec.
The resulting error index was 14.2% The "negative" friction coefficient is
attributed to the fact that the maximum speed attained in the step response
is only 9 rads/sec and consequently the resulting friction torque Bro is too
small to have an effect. That is, B is un-identifiable from step response
data.

The parameter values identified above were then used in a SIMNON


simulation of the stepper motor equation and are shown in the solid plots
below.
366
0.04

0.033

0.03

! 0.025

.~ 0.02
'i
.... 0.013

0.01

0.003

°0L-~------0-.OO~3---------0.~0-1--------0~.0~1~3--------~0~.02

Time (oec)

Actual and simulated (using identified parameters) step responses

5. Future Research

We have demonstrated how modem nonlinear control techniques may be


used on industrial type problems. The results presented here are quite
promising, although much further work is needed. In particular, work is now
in progress to filter!compensate the current measurements i d,iq in order to
use them for feedback. Furthermore, the motor values obtained from the
identification experiments will be used in the control algorithm. This will
be reported in a later paper.

A patent is pending on the methodologies/hardware setup presented in


this paper.
Acknowledgements

The authors are grateful to the Motorola Corporation for donating the
56001ADS development system which made possible the implementation of the
controllers.
Marc Bodson and John Chiasson are especially grateful to Mr. Stephen
Botos, President of the Aerotech Corporation (Pittsburgh), for his
commitment to this project in the form of equipment, technical and financial
support.
367

References
[1] M. Aiello, M. Bodson, J. Chiasson, R. Rekowski and D. Schuerer,
"Experimental Results of a Nonlinear Controller Applied to the Positioning
of a Pennanent Magnet Stepper Motor" 1990 Symposium on Incremental Motion
Control, Systems & Devices, June 12-14, San Jose, California.
[2] Ilic-Spong, M., R. Marino, S.M. Peresada and D.G. Taylor, "Feedback
Linearizing Control of Switched Reluctance Motors," IEEE Trans. on
Autom.Control, Vol. AC-32, No.5, May 1987.
[3] Kenjo, T. Stepping Motors and Their Microprocessor Controls, Clarendon
Press, Oxford, 1984.

[4] Kuo, B.C. and J. Tal, Incremental Motion Control, Step Motors and
Control Systems, Vol. n, SRL Publishing, Champaign, IL, 1978.

[5] White, G. "Open-loop Step Motor Controls for Print-Wheel Drives," in


Proc. 9th Annual Symposium on Incremental Motion Control Systems and
Devices, B.C. Kuo, Editor, June 1980.
[6] Acamley, P.P., Stepping Motors: A Guide to Modern Theory and
Practice, P. Peregrinus, Ltd., Stevenage, UK, 1982.
[7] Zribi, M., Control of a PM Stepper Motor Using Modern Nonlinear Control
Techniques, M.S. Thesis, Purdue University, 1987.
[8] Zribi, M. and J. Chiasson, "Control of a PM Stepper Motor by Exact
Linearization," to be published in the IEEE Transactions Automatic Control,
Feb. 1991.

[9] Bodson, M., and 1. Chiasson, "Application of Nonlinear Control


Methods to the Positioning of a Permanent Magnet Stepper Motor" Proceedings
of 24th CDC, Austin, TX, Dec. 1989.
[10] Su, R., G. Meyer and L.R. Hunt, "Design of Multi-input Nonlinear
Systems," in Differential Geometric Control Theory, R.W. Brochett, R.S.
Millman and H.J. Sussman, Editors, Birkhauser, Boston, MA, pp. 268-298,
1983.
[11] Sommer, R. "Control Design for Multivariable Time Varying Systems,"
Int. J. Control, Vol. 31, No.5, pp. 883-891, 1980.
[12] Sastry, S. and M. Bodson, Adaptive control: Stability, Convergence and
Robustness, Prentice Hall, Englewood-Cliffs, New Jersey, 1989.
[13] Hsu, P., M. Bodson, S. Sastry, and B.Paden, "Adaptive Identification
and Control of Manipulators without using Joint Accelerations", Proc. of the
IEEE Conference on Robotics and Automation, Raleigh, NC, pp. 1210-1215,
1987.
368

[14] Taylor, D.G., M.J. Woolley, and M. Ilic, "Design and Implementation of
a Linearizing and Decoupling Feedback Transfonnation for Switched Reluctance
Motors", Proceedings 17th Symposium on Incremental Motion Control Systems
and Devices, Champaign, Illinois, June 1988.
[15] Brown, R.H., Y.Zhu and X. Feng, A new relaxation Algorithm for the Time
Optimal Control Problems of Hybrid Step Motors", Proceedings 28th CDC Tampa,
FL, Dec. 1989.
CHAPTER 16

DIGITAL SIGNAL PROCESSOR IMPLEMENTATION


OF A KALMAN FILTER FOR DISK DRIVE
HEAD POSITIONING MECHANISM

J. A. Lowe and F. L. Lewis


The University of Texas at Arlington
Automation and Robotics Research Institute
7300 Jack Newell Blvd S
Fort Worth, Texas 76118 USA

ABSTRACT. A Kalman filter for estimating the state of a disk drive


head-positioning mechanism has been implemented on a Texas Instruments
TMS320C25 digital signal processor (DSP). The predictive formulation of
the Kalman filter is employed to avoid potential difficulties with
computational delays. Quantization effects of the filter are measured
by comparing the performance of the 16-bit fixed-point DSP filter with a
64-bit floating point filter implemented on a general purpose computer.
Results of the implementation show that the Kalman filter uses less than
two percent of the computational bandwidth of the DSP at a sample period
of 250 #s, leaving more than ninety-eight percent available for
executing the control algorithm and other tasks.

1. INTRODUCTION
The first production disk drive, introduced by IBM in 1957, used
mechanical detents to position the head and had a track density of 8
tracks per centimeter [1]. Disk drives have since incorporated a number
of significant innovations. Modern high performance disk drives use
voice coil actuators to position the heads under closed-loop control and
have track densities in excess of 400 tracks per centimeter. A history
of the evolution of disk drive technology is given in [2].
As track densities increased beyond 40 tracks per centimeter, it
became necessary to incorporate a position sensing mechanism in the disk
drive head-positioning assembly so that feedback control techniques
could be employed. This was accomplished by means of a dedicated servo
surface on which incremental position information is encoded. Position
measurements from a servo surface are typically available at a 250 kHz
to 500 kHz sample rate, which is essentially continuous when compared to
the bandwidth of typical head-position controllers. A velocity signal
is usually generated directly from the high speed position measurements
without resorting to optimal estimation methods. An obvious
disadvantage of the servo surface method of position sensing is that an
entire surface that could have been used for data is used for position
information; a less obvious disadvantage is that thermal errors make it
369
370

difficult to keep the heads on other platters aligned with the head on
the servo surface to the accuracy required for very high track density
drives. More details on servo surface encoding and decoding can be
found in [3] and f4].
Many modern disk drives have now dispensed with the dedicated servo
surface in favor of position information embedded on each data surface.
Because the position information must now be interleaved with data,
position measurements are typically available at only a 3 kHz to 4 kHz
sample rate. This requires that estimation methods be used to derive
the velocity. Although high speed microprocessors are typically
utilized for this task, the high speed arithmetic capability of a
digital signal processor makes it an attractive alternative.
A Kalman filter for a disk drive with embedded head-position
information is implemented in this paper. The filter estimates any bias
force acting on the head as well as the position and velocity of the
head. The filter is formulated as a one-step predictor so that
computational delay is not an issue.
The dynamics of the head-positioning actuator are described in
Section 2. Section 3 gives the detailed design of the Kalman filter.
Implementation of the filter on a digital signal processor is covered in
Section 4, along with an example of the performance of the filter for a
sample trajectory.

2. THE HEAD-POSITIONING ACTUATOR


We first need to describe the dynamics of the head-positioning actuator
so that they may be incorporated in the Kalman filter equations. The
dynamics will be derived in continuous time and then converted to
discrete time, as required for the Kalman filter. Once we have derived
the dynamics for the general case, we will define the parameters and
performance requirements for a fictitious--but realistic--disk drive
which will be referred to throughout the rest of the paper.
2.1. Actuator Dynamics
The head-positioning mechanism for a disk drive can be either rotary or
of the linear voice coil actuator type. The latter is generally used in
high performance disk drives and will be considered here. The dynamics
are described by

y- = k
-u (2.1)
m

where m is the mass of the coil and carriage assembly, k (NjV) is


the product of the motor force constant and power amplifier gain, y(t)
is the head position, and control input u(t) is motor voltage [5J. By
absorbing the constant kjm into the definition of the input, we may
write the state-variable description
371

x = [ ~~ Jx + [ ~ Ju, (2.2)

with x = [ d v ]T, d(t) the head position, and v(t) the head
velocity, where u(t) is now in units of acceleration.
Ve should like to consider a more detailed model which includes a
bias force and process and measurement noises. The theory of the Kalman
filter will then allow us to incorporate the known characteristics of
the noise processes into the filter design.
The model we will use of the disk drive head-positioning system is
given by

i = [ ~ g~ ]x + [ ~ ]u + • = l'x + B'u + G'w, (2.3)

where the state is x = [ d v f ]T, with d(t) the head position and
v(t) the head velocity. The control input is u(t), and
w(t) = [ Wd Wv Wf]T is process noise which disturbs the state
components. Ve have selected the units so that the system matrices AC,
BC, GC have the simple elements shown (i.e. GC = I).
The bias force f(t) (equivalent driver input) is included to
model forces on the actuator which do not vary with time. Possible
sources of these constant forces are windage due to the rotation of the
platters, the flexible leads which connect to the heads, and tilt of the
disk drive. Since the bias is unknown and must be estimated, it is
necessary to include it as a state. Note that even though the bias is
not a function of time, it may be a function of other parameters such as
head position. To describe f, we have selected the Brownian motion
model, which is an integrator driven by white noise Wf(t).
Position measurements are made, so that the measured output is
y = [ 1 0 0 ]x + n = ex + n (2.4)
where net) is measurement noise.
To design a digital filter we must first sample the continuous
dynamics (2.3), (2.4). For a sample period of T and allowing the
control input u(t) to change only at the sample instants, the discrete
dynamics are

(2.6)

The relationship of the sampled noise statistics to the continuous noise


statistics will be described in Section 3.
372

A remark is in order at this point concerning the control input u.


The head-positioning actuator is invariably driven by a current mode
(transconductance) amplifier limited by the voltage of its power supply.
When the amplifier is not saturated, the control input calculated by the
control algorithm is directly proportional to the current in the coil,
which in turn is proportional to the force applied to the head carriage.
When the amplifier is saturated, however, the control input calculated
by the control algorithm is not proportional to the force applied to the
head carriage. Since the need to position the heads in minimum time
dictates that the amplifier be driven into saturation during
acceleration, the control input Uk provided to the Kalman filter must
actually be sensed from the coil current. If instead the control input
calculated by the control algorithm is provided to the Kalman filter,
then the estimation errors will increase while the amplifier is
saturated.
2.2. Actuator Paraaeters and Performance Requirements
There are two control problems associated with the head-positioning
mechanism. One is the track access or seek problem. There, the head
must be driven from an initial position to a certain track in minimum
time. The second problem is that of track following, where a servo
control system must be designed to hold the head above a specified track
while reading from or writing to the disk. The Kalman filter runs
continuously since state estimates are needed for both control modes;
however, the resolution required of the state estimates and the accuracy
of the position measurements differ from the track-access mode to the
track-following mode.
The accuracy of the position measurements is reduced during a seek
due to the fact that the embedded position values are not impulse
sampled. At high head velocities, the resolution of the position
measurements can typically go down to the track level, and the accuracy
of the velocity estimate correspondingly suffers [3]. Fortunately, it
is not necessary for the velocity estimate to be as accurate at the
middle of a seek when the velocity is high as at the end of a seek when
the velocity is low.
The primary function of the track-following controller is to reject
disturbances such as constant bias forces, vibrations, and track runout
due to spindle eccentricity. The magnitude of the track-following error
must typically be held to less than 57. of the track width in order to
avoid data errors due to interference from adjacent tracks.
Since the Kalman filter equations will be implemented using
fixed-point arithmetic, it is necessary to define numerical values for
certain parameters of the actuator so that the equations can be scaled
appropriately. The parameters in Table I, which are similar to those
given by Stich [6] for an actual disk drive, will be utilized when the
filter is implemented.
373

TABLE I. Head-Positioning Actuator Parameters


Parameter Value Units
sample period (T) 0.25 ms
track pitch 43 tracks/mm
maximum position error %0.05 tracks

3. mm fILTER DESIGN
Ve will use the recursive a priori formulation [7] to design a Kalman
filter for the discretized dynamics. The predictive Kalman filter
provides a predictive estimate one sample period in the future, since
the measurement and control input at time kT are used to estimate the
state at time k(T+1). Use of the predictive formulation avoids the
need to include an extra state in the discretized actuator model as in
[6] in order to account for the computation delay.
3.1. lalaan filter for.ulation
The recursive a priori formulation of the Kalman filter equations for
the discretized actuator model (2.5) and (2.6) is given by

Kk = PkCT(CPkC T + R)-l (3.1)


xk+1 = AXk + BUk + AKk(Yk - Cxk) (3.2)
Pk+1 = A(I - KkC)PkA T + GQG T, (3.3)

where (3.1) is the Kalman gain, (3.2) is the estimate recursion, and
(3.3) is the error covariance recursion. Ve have assumed that nk and
Wk are white noise processes uncorrelated with the initial conditions
of the actuator states and with each other. Ro[k] is the covariance of
the measurement noise nk and Qo[k] is the covariance of the process
noise Wk, where 6[k] is the Kronecker delta. The superscript If-If
denotes an a priori value (i.e. one not updated to reflect the current
measurement).
3.2. Discretization of the Noise Processes
Ve now need to relate the spectral densities of the discrete noise
processes to the spectral densities of the continuous noise processes in
the continuous model of the actuator. Suppose the covariance of the
continuous process noise w(t) is QC6(t). Then the covariance matrix
of the discrete process noise Wk is given by
374

(3.4)

Equation (3.4) may conveniently be approximated by

(3.5)
To discretize the measurement noise we will use an approximation
given in Kailath [8]. If the covariance of the continuous measurement
noise n(t) is RC6(t), then the covariance matrix of the discrete
measurement noise nk may be approximated by

(3.6)

3.3. lalaan filter Equations


Ve will now derive and simplify the Kalman filter update at time k in
scalar form for the discretized model of the head-positioning actuator.
For more complicated systems, such analytic simplification is
intractable, and some square-root algorithm such as those of Bierman [9]
should be used to implement the Riccati equation. Straightforward
implementation of the Kalman filter using matrix multiplications should
be avoided at all costs, since it is invariably accompanied by numerical
problems.
As we proceed, we shall define intermediate scalar variables for
ease of programming later. Ve suppress the time index k from all scalar
variables. Since the covariance matrix of the measurement noise is
simply a scalar, we shall represent it by r instead of R.
Denote the (symmetric) a priori error covariance at time k as

Pk = [ ~~P3 P5~~ P6~: 1' (3.7)

where Pi are scalars. The Kalman gain is computed using the following
steps.

(3.8)

(3.9)

The Riccati equation error covariance update may be written as


375

(3.10)

Therefore, the error covariance update may be computed using the steps:

PI' P2' P3' ]


- [ P2' P4' P5' . (3.11)
P3' P5' P6'
(Note that this is just the a posteriori error covariance Pk; as such,
it is symmetric.)
PI'+P2'T+P3'T2/2 P2'+P4'T+P5'T2/2 P3'+P5'T+PS'T2/2 ]
A(I-KkC)Pk = [ P2'+P3'T P4'+P5'T P5'+P6'T
P3' P5' Ps'

(3.12)

And, finally,

ql+q2T+q3T2/2 q2+q3T q3] [qd 0 0 ]


= [ q4+q5T+q6T2/2 q5+q6T qs + 0 qv 0 . (3.13)
q7+q sT+qgT2/2 q8+qgT qg 0 0 qf
The Kalman filter dynamics may be written in terms of scalar
components as

(3.14)

(3.15)

(3.16)

where the filter gains Li are given by

(3.17)
376

Note that for notational simplicity we have suppressed the superscript


II -II denoting that these are a priori estimates.

4. lalaan filter IapleJIentation


The scalar Kalman filter equations derived in Section 3 have been
implemented on a Texas Instruments TIS320C25 digital signal processor.
Several other manufacturers produce digital signal processors with
similar capabilities.
The C25 is a second-generation 16-bit fixed-point digital signal
processor with a cycle time of 0.1 microseconds. A second-generation
DSP was chosen instead of a third- or fourth-generation floating-point
DSP because the benefits of using floating-point arithmetic were deemed
insufficient to justify the additional cost in this application. A
first-generation DSP, such as the Texas Instruments TMS320C10 with a
cycle time of 0.2 microseconds, is a viable alternative to the C25, but
the slower execution time of the Kalman filter would leave less
computational time available for performing the other calculations
involved in controlling the head position. A decision on which
particular digital signal processor to use in an actual disk drive, or
indeed whether to use a digital signal processor at all, would depend on
many factors which have not been considered here.
One important factor to consider when choosing a digital signal
processor for any control task is the amount of internal program (ROM)
and data (RAM) memory. Vhenever possible, a digital signal processor
with sufficient internal memory should be utilized so that the decreased
reliability and increased expense associated with external memory can be
avoided. The C25 has 544 words of internal data memory and 4K words of
internal program memory. As will be shown, the Kalman filter for the
head-positioning actuator uses only a small fraction of the available
internal memory resources.
The Kalman gains in the Kalman filter are time-variant, but soon
settle to constant values. The settling time of the Kalman gains is a
function of the spectral densities of the measurement and process noise.
It is possible to use only the steady-state gain values in the filter,
which considerably decreases the execution time and code size of the
filter. The effect of using only the steady-state gain values on the
performance of the Kalman filter is to lengthen the convergence time of
the state estimates. It is possible to perform an initialization
sequence when power is first applied to the disk drive in order to allow
time for the state estimates to converge. Since the Kalman filter which
estimates the states of the head-positioning actuator runs continuously,
the steady-state Kalman filter may then be employed without any loss in
estimation accuracy. This fact makes it unlikely that a time-varying
Kalman filter would be used in an actual disk drive head-positioning
system. Nonetheless, we shall describe the implementation of both the
time-varying and the steady-state Kalman filter in order that the
execution times and memory requirements may be compared. We shall only
discuss the estimation accuracy of the steady-state filter in detail,
since the performance of the two filters is identical once the state
377

estimates have initially converged and the gains of the time-varying


filter have reached steady state.
4.1. The Steady--State Ialun filter
Before either the time-varying or steady-state Kalman filter can be
implemented, we must first assign numerical values to the continuous
noise spectral densities. Recalling the discussion in Section 2.2
concerning the relationship between head velocity and the accuracy of
the position measurement, it might be appropriate to use one value for
the measurement noise spectral density during track following and a
larger value during track seeking. We only need to use one value in
order to evaluate the performance of the filter, so we will use

RC = r C = 0.01 tracks 2 . (4.1)


For the process noise spectral density we will use

QC = diag { 0.0001, 0.01, 0.0001 } (4.2)


where the units are tracks2jms2, tracks2jms4, and tracks2jms6,
respectively. These choices for the process noise spectral density are
rather arbitrary, but were chosen so that primarily the second state
derivative (i.e. acceleration) is excited.
Another detail which must be attended to before the coding of the
filter can begin is to choose initial values for the states of the
filter. The choice of initial values determines the initial value Po
of the error covariance matrix. Using the initial measurement of
position we may obtain the initial position estimate

d(O) = y(O). (4.3)

The error covariance of d(O) is simply r C , the covariance of y(O).


The initial velocity estimate may be computed after the first two
position measurements using
A(O) -_ y(T)-y(O)
v T . (4.4)

The error covariance of v{O) is thus 2rc jT2. The bias force estimate
£(0) may be taken as zero and the error covariance as some large value
such as 100 tracks2jms4. Our initial error covariance is thus

Po = diag { r C , 2rc jT2 , 100 }. (4.5)


Now that we have determined values for the continuous noise
spectral densities and the initial error covariance, we can calculate
378

the steady-state filter gains. First, the spectral densities of the


discrete noise processes must be calculated as in Section 3.2. Then the
error covariance update equation is iterated until the filter gains
given by (3.17) converge. The steady-state filter gains are

[ ~~ ]
L3
= [ ~:~~~:~ ms-
0.02107 ms~
1 ]. (4.6)

Since the filter equations will be implemented in fixed-point


arithmetic on the TMS320C25 digital signal processor, we must determine
the scale of each variable and parameter in the filter equations
(3.14)-(3.16). Each variable is scaled by determining the maximum value
the variable could attain based on the known characteristics of the
actuator. The variable is then scaled such that the range accommodates
the maximum value. The resolution of the variable is thus by default
determined by the number of bits remaining in the 16-bit word, with one
bit reserved to indicate sign in the two's-complement number format.
This scaling technique results in adequate resolution for the velocity
and bias estimates, but the position estimate poses a problem. Typical
disk drives have several hundred tracks, yet need resolution of
0.01-0.001 tracks in order to be able to follow a track accurately.
This exceeds the dynamic range available in a 16-bit word.
Nevertheless, we have elected to use only a single 16-bit word for the
position estimate and have scaled it such that it has the necessary
resolution. If this filter was used in an actual disk drive, the range
of the position estimate would have to be augmented externally to the
filter. Once the variables have been scaled, the parameters are scaled
such that the multiplication of a variable by a parameter results in an
appropriately scaled product. This technique avoids the need to shift
intermediate products in order to align the binary points for addition.
The range and resolution of each variable and parameter are shown in
Table II.
The steady-state Kalman filter for the head-positioning actuator
was implemented on the TMS320C25 digital signal processor. All
intermediate products are accumulated in the 32-bit accumulator, but are
rounded to 16-bit words for storage. A second version of the
steady-state filter was also implemented with truncation rather than
rounding for purpose of comparison. The instruction set of the C25 is
such that the code size and execution speed of the two implementations
are identical. The Kalman filter implementation (including all
initialization) requires 116 words of program memory and 18 words of
data memory. Each iteration of the filter requires 41 cycles, which is
equivalent to 4.1 microseconds for a cycle time of 0.1 microseconds.
The filter thus utilizes only 1.670 of the sample period of 250
microseconds.
The performance of the Kalman filter can be verified by numerically
generating a sample trajectory of the head-positioning actuator. The
control input and position measurement for such a sample trajectory are
shown in Figures 1 and 2, respectively. Gaussian process and
measurement noise was used in the simulation. The control input is
379

Table II. Scaling of Filter Variables and Parameters


Variable or Range Resolution Units
Parameter

d 25 2-10 tracks
v 25 2-10 tracks/ms
f 24 2-11 tracks/ms 2
y 25 2-10 tracks
u 24 2-11 tracks/ms 2
T (v) 20 2-15 ms
T2/2 (f) 21 2-14 ms 2
T2/2 (u) 21 2-14 ms 2
LI 2° 2-15
T (f) 21 2-14 ms
T (u) 21 2-14 ms
L2 2° 2-15 ms- I
L3 2-1 2-16 ms-2
l/T 23 2-12 ms- I

similar to that used when seeking a track. The head carriage is


accelerated for 10 milliseconds, coasts for 5 milliseconds, and then is
decelerated for 10 milliseconds. The initial position and velocity are
both zero, and the bias force is -0.05 tracks/ms 2• The states of the
actuator and the state estimates produced by the Kalman filter (with
rounding) are plotted in Figures 3, 4, and 5. The position and velocity
estimates converge quickly, but the bias force estimate has not yet
converged in the 25 milliseconds shown on the plot. This slow
convergence would not be a problem in an actual disk drive since the
Kalman filter would be running continuously and the state estimates
could be allowed to converge during an initialization sequence.
The same steady-state Kalman filter equations as were implemented
on the C25 in fixed-point arithmetic were also implemented on a general
purpose computer using 64-bit floating point arithmetic. The
floating-point filter can be used as a benchmark by which to evaluate
the roundoff or truncation errors of the two implementations of the
fixed-point filter. The roundoff and truncation error of the position
estimate for the sample trajectory we have just discussed are shown in
Figure 6. The roundoff error is on the order of 0.001 tracks with no
bias. Roundoff error thus consumes only about 27. of our error budget of
0.05 tracks. The truncation error has a bias on the order of 0.01
tracks, which ,is enough to adversely affect the performance of a
track-following controller. This clearly illustrates the necessity to
380

---
N

o<' 30
~
]'
u

e 20
~ ~
N 20 -
<
..[10 -
t:: 16
e
(1J

~ 0
(1J
M
- 12
e
L---- ;3
III
~
(1J

'=' 8
.s
-10 - ~
c::

1-
0
20 -
....
.r:!
III
4
0
8
~
-30 I I I I I I
0
o 4 8 12 16 20 24 0 4 8 12 16 20 24
Time (ms) Time (ms)

figure 1. Control Input to the figure 2. Measured Head Position


Head-Positioning Actuator

Table III. Roundoff and Truncation Error in the Steady-State


Kalman Filter
State Roundoff Error Truncation Error Units
lean Std Dev lean Std Dev

a -0.6 1.1 -8.3 8.9 tracks x 10-3


v -0.7 1.5 -10.1 10.9 tracksJms x 10-3
f -0.8 1.0 -6.6 7.0 tracks Jms 2 x 10-3

use rounding rather than truncation, especially since there is no


penalty in terms of code size or execution speed. The mean and standard
deviation of the roundoff and truncation error in each of the state
estimates over the sample trajectory are given in Table III.
381

2.0 -r------,.~----____,
20 1.6
~

~
III E 1.2
III

........
~
u 16 ~ 0.8
ec:: 12
~

~
l:l 0.4
'-'
0
.....J:1 a
·0
0.0
III
0 8
p.... .9-0.4
Q.I

4 :;> -0.8

-1.2
0
0 4 8 12 16 20 24 o 4 8 12 16 20 24
Time (ms) Time (ms)

figure 3. Actual and Estimated figure 4. Actual and Estimated


Head Position Head Velocity

~
NI
0 ~

<
I

< 0
,....;
0
0
,....; -2 :.<
:.< III -2
~
N U
<III -4 ~
-4
Actual l:l
E
........
'-'
s..
III
~ -6 0
s..
-6
u s..
~ ~
l:l c:: -8
'-' -8 0
~ .J:1
·iii -10
~ 0
-10 p....
-12
0 4 8 12 16 20 24 0 4 8 12 16 20 24
Time (ms) Time (ms)

figure 5. Actual and Estimated figure 6. Roundoff and Trun-


Bias cation Error in Position Estimate
382

4.2. The Ti.e-Varying lalaan Filter


A time-varying Kalman filter to estimate the states of the
head-positioning actuator was also implemented on the TMS320C25. We
chose to pre-compute the Kalman gains rather than to calculate them
within the filter itself. The decision to pre-compute the gains or to
calculate them on-line involves a tradeoff between execution speed and
code size. Our implementation with pre-computed gains utilizes 141
words of program memory for the filter code, plus 552 words of program
memory for the gain table. The gain table is terminated when the gains
have converged to the steady-state values. Each iteration of the filter
requires 157 clock cycles (15.7 ~s). Even though the time-varying
Kalman filter was not implemented with on-line computation of the filter
gains, we can estimate based on the number of arithmetic operations
involved that it would require approximately 510 words of program memory
and execute at the rate of 700 clock cycles (70 ~s) per iteration.
Hence, the choice to use pre-computed gains results in much faster
execution speed at the expense of a small penalty in program memory
requirements.
Since the filter gains are no longer constant, they must be
re-scaled. Each filter gain is scaled by determining its maximum value
from the floating-point simulation and fixing the range such that the
maximum value is accommodated. The range and resolution of the filter
gains in the time-varying implementation of the Kalman filter are shown
in Table IV. The re-scaling of the filter gains makes it necessary to
perform some shifting of intermediate products in order to align binary
points for addition. This shifting is quite time consuming because
products which are shifted out of range are forced to saturate instead
of being permitted to overflow. The shifting-with-saturation operation
accounts for the majority of the increase in execution time from the
steady-state filter to the time-varying filter.
Table IV. Re-scaling of Filter Gains for Time-Varying
Kalman Filter
Filter Gain Range Resolution Units

Ll 22 2-13
L2 24 2-11
L3 25 2-10

5. CONCLUSION
We have presented in detail the design and implementation of a Kalman
filter for a disk drive head-positioning actuator on a fixed-point
digital signal processor. The results show that a second-generation
digital signal processor is capable of executing an iteration of the
383

filter in a small fraction of the available time and without the need
for any external memory. In fact, a digital signal processor clearly
provides more computational capacity than required if the processor is
only to be used for the Kalman filter and the track-following and
track-seeking control algorithms. The decision to employ a digital
signal processor rather than a traditional microprocessor in an actual
disk drive would likely hinge on the potential to utilize the excess
computational capacity of the DSP to perform in software tasks that
would otherwise require additional hardware.

6. AClNOllLEDGElENTS
This material is based upon work supported under a National Science
Foundation Graduate Fellowship.

7. REFERENCES
[1] T. Noyes and Y. E. Dickinson, "The Random-Access Memory Accounting
Machine," IBI Journal of Research and Development, vol. 1, pp.
72-75, Jan. 1957.
[2] J. I. Harker et aI, "A Quarter Century of Disk File Innovation,"
IBI Journal of Research and De'lJelopment, vol. 25, pp. 677-689, Sep.
1981.
[3] J. V. Espy, "Techniques of Position Control of High Performance,
High Capacity Disk File Systems," Proceedings, Symposium on
Incremental lotion Control Systems, pp. 177-210.
[4] R. K. Oswald, "Design of a Disk File Head-Positioning Servo," IBI
Journal of Research and Development, vol. 18, pp. 506-512, Nov.
1974.
[5] R. F. Bell et aI, "Head Positioning in a Large Disc Drive,"
Hewlett-Packard Journal, pp. 14-20, Jan. 1984.
[6] I. C. Stich, "Digital Servo Algorithm for Disk Actuator Control,"
Conference on Applied lotion Control, pp. 35-41, 1987.
[7] F. L. Lewis, Optimal Estimation with an Introduction to Stochastic
Control Theory, Viley, 1986.
[8] T. Kailath, Lectures on Yiener and Kalman Filtering, Springer-
Verlag, 1981.
[9] G. J. Bierman, Factorization lethods for Discrete Sequential
Estimation, Academic Press, 1977.
CHAPTER 17

SIMULATION ANALYSIS OF FLEXIBLE MANUFACTURING


SYSTEMS USING RENSAM

Jorge Haddock
Young Chul Kim
Department of Decision Sciences and Engineering Systems
and Center for Manufacturing Productivity
Rensselaer Polytechnic Institute
Troy, New York 12180-3590
Telephone: 518-276-8099
BTINET Address: FFYE@RPITSMTS

ABSTRACT

Simulation has become the most widely used computer based performance
evaluation tool for FMS design and analysis due to the complexity of the interaction of the
dynamic components in these systems. Simulation models can incorporate a great level
of detail and capture the time dynamic behavior of these systems. RENSAM
(&msselaer's Simulator for Automated Manufacturing) is a generic simulation model that
takes advantage of the underlying structure of manufacturing systems. RENSAM is an
'adaptive system' in that it consists of a generic GPSS simulation model which provides a
high level of flexibility in modeling a wide spectrum of FMS configurations. This paper
presents a description of RENSAM system and its capabilities as an analysis tool are
illustrated by means of a case study.
This project has been partially funded by Alcoa, GE, GM, Kodak, RCA and New
York State Center for Advanced Technology.

385
386

INTRODUCTION

The use of simulation represents a viable approach for the design and analysis of flexible
manufacturing systems (FMS) because these systems tend to be more complex than the traditional
manufacturing systems. In fact. simulation has become the most widely used computer based
performance evaluation tool for FMS design and analysis due to the complexity of the interaction of the
dynamic components in these systems (Surl 1984). Simulation models can incorporate a great level of
detail and capture the time dynamic behavior of these systems. Events such as machine breakdowns.
as well as the monitoring of the state of the system. are significant events in any FMS and the ability
to include them in the model usually proves to be significant Also. simulation models are appropriate
when there is significant blocking in the system (Surl and Diehl 1987). Moreover. simulation is the only
available tool to study the system's transient behavior. Thus. simulation techniques enable the creation
of realistic models and provide a favorable environment for FMS design and analysis. Simulation
languages. such as GPSS. SIMAN and SLAM. have made feasible the modeling of large. complex
models in a reasonable time frame. However, the development of simulation models still require a
significant level of expertise and is usually time consuming and tedious. Other simulation systems
similar in software structure are specialized simulation languages such as GEMS (Phillips et al. 1979)
and MAP (Rolston 1984). These systems are easier to use. but can model a narrower spectrum of
system characteristics and also require training.
Other commercially available tools have also been developed to assist in the simulation process.
These packages include GEFMS. SIMFACTORY. STAR*CELL, and XCELL. All of these packages
provide dynamic animation and require no programming. These systems can be classified as simulation
generators. The concept of a generator is to translate a development-oriented description of the system
into a simulation model (Haddock 1987; 1988). Thus. some of the difficulties associated with simulation
modeling can be overcome by using a simulation model generator.
A similar approach involves generic simulation models such as RENSAM ~sselaer's
~imulator for .6.utomated Manufacturing) that take advantage of the underlying structure of the systems
(Foley, Jain and Haddock 1987). RENSAM is innovative because it is an 'adaptive system' that consists
of a generic GPSS simulation model which provides a high level of flexibility in modeling a wide
spectrum of FMS configurations.
This paper presents a description of RENSAM system and its capabilities as an analysis tool are
illustrated by means of a case study.

RENSAM SYSTEM DESCRIPTION

RENSAM was developed as a test bed for issues in scheduling and control of FMS. RENSAM
has been designed as a completely data driven generalized simulation generator with three major
components. pre-processor. generic simulation model. and post processor. The user interfaces (pre and
post-processors) have "been developed in Turbo-Pascal and run on an IBM-PC. The generic simulation
model has been developed in GPSS/H and is installed on a MicroVAX IT cluster.
Compared to general and special-purpose simulation languages. RENSAM provides more ease
of use and requires less simulation expertise from the user. This is. however. at the expense of higher
domain orientation and less flexibility in model development. Compared to a specific simulation model.
RENSAM provides more flexibility and lower domain orientation. but is is slightly more difficult to use.
Figure 1 shows an overview of the configuration of RENSAM. The user interacts with the
preprocessor to create a data file containing the description of the FMS. The generic simulation model
simulates the FMS according to a supplied schedule or based on the operational policies contained in
the FMS description file. The results of the simulation are stored in three files. One file consists of a
387

PRE-PROCESSOR~ USER
I~
FMS
SCHEDULE
DESCRIPT ION
.... '., .•......•..
", II
SIMULAlION
'.1
~ i'

--------- - MODEL --------


t.--- \~ ----~

EVENT SUMMAR~ PERFORMANCE


HISTORY FILE FILE REPORT
--.....
----.--- \l/

~
POSl-
--.--
--- - PROCESSOR --.---.
---.
j,---- "V --~

GRAPHICS
GANTT CHART ANIMATION
I
DISPLA~S
~ I!I I~
rtUSER
Figure 1. RENSAM CONFIGURAnON
388

formatted perfonnance report that is genented for direct use. The other two files can be down-loaded
to an IBM-PC for detailed processing using the post-processor.
The data required to describe the FMS is elicited from the user through prompts. The data can
be classified in three groups: FMS description, e.g. work station, material handling system, layout
descriptions, and pallet/fixture availabilities; part descriptions, e.g., opention sequence for each part
type and resource requirement for each opention; and, scheduling parameters, e.g., tool loading and
schedule specifications.
The capabilities of the simulation model can be described in two categories, FMS simulation
capabilities and information and control capabilities. For instance, in terms of the FMS simulation
capabilities, different work station configuntions can be simulated. A station can have more than one
machine, an internal material handling device and stonge space. Each machine can have a number of
tool groups in its magazine. Storage can be modeled as either shared or dedicated input/output buffers.
Both discrete (e.g. AGVs, fork lifts) and continuous systems (e.g. conveyors) can be simulated. In terms
of the infonnation and control capabilities, extensive information and control capabilities have been built
in the generic simulation model to allow experimentation with different scheduling and control stIategies.
Salient capabilities include: part release control, different mechanisms are available to control the
release of parts to the system; part routing control, different mechanisms are available to detennine the
routing of parts through the system; real-time control, real-time control routines respond to occurrences
of interruptions during simulated opention of an FMS; and, final performance reports.
The post-processor is a user friendly interface for detailed analysis of simulation results. An
abstIact animation of the simulation is provided which displays the dynamics of the system opention.
The user can also examine gnphic displays for statistics such as machine utilizations and availability,
part flowtimes, distribution of flowtimes among process and waiting times, material handling system
utilization and availability, fixture utilizations, Gantt charts, and comparison of two different scenarios
using aggregate statistics.

CASE STUDY: COMPANY C

The manufacturing organization of Company C is considering the installation of an FMS. The


company's initial goals were to obtain an understanding of the FMS technology and to determine if such
a system would be compatible for their workload. One potential problem was that the parts to be
produced changed frequently. Even though perpetual changes were inevitable, the company still
expected to continue to produce a large variety of parts in small batch sizes. To obtain a better
understanding of the proposed FMS, a study was perfonned using RENSAM. The main objective of
this study consisted of detennining the adequacy of the specified system to meet the annual demand
(Table 1). With this infonnation further studies of alternative systems were performed. Some results
for the alternative systems will be shown on this paper, even though an optimal solution is not presented.
The ultimate goal consists in obtaining with the aid of RENSAM and in minimal time, a system with
adequate number of resources to permit reduction of costs, maintenance of quality and establish policies
such as 'just-in-time' deliveries.

FMS Environment

The manufacturing division is involved in low volume production of parts used to construct
electronic equipment (Table 1 contains sample order quantities) and they are considering the installation
of a FMS.
The manufacturing group functions as a 'in house vendor' for the rest of the organization. Due
to this status. there are a few special considentions which must be made in reviewing the
389

TABLE 1: ORDER QUAt."ITITIES

PART TYPICAL ANNUAL


NAME QRDER QUANTITY QUANTITY

1. A 38 450
2. B 141 1.700
3. C 141 1.700
4. D 141 1.700
5. E 50 591
6. F 75 900
7. G 75 900
8. H 75 900
9. I 75 900
10. J 75 900
11. K 34 400
12. L 83 1.000
13. M 108 1.300
14. N 234 2.800
15. 0 15 80
16. P 8 90
17. Q 166 2.250
18. R 167 2.000
19. S 50 600
20. T 16 70
390

appropriateness of a FMS for the group. First, as an 'in house vendor', the organization is not run as
a profit producing division of the fmn. As a result, the manufacturing group is not privy to long range
marketing or sales forecasts on a regular basis. The group. therefore. only plans for one or two mooths
in advance. Second. the potential pool of customers is limited to other divisions within the firm. even
though the other divisions are permitted to go elsewhere for services. There are several steady
customers. but it is the organization's policy to accept important orders without advance notice. As a
consequence of this policy, even the division's short term plans are frequently interrupted.

Operating calendar. The company works on a thirteen period operating calendar. There is a three shift
operation (twenty four hours/day), five days per week. Saturdays are left for unplanned work and
machine maintenance.

System specifications. The proposed FMS is being designed to handle over two thousand parts.
consisting of a wide spectrum of materials and sizes. As this entails an abundance of data. a
representative sample of twenty parts has been selected to study the system.
Preliminary research has suggested that the FMS include the following:
• 5 machining centers (1 backup station)
• I inventory station (CMM)
• 1 wash station
• 2 load stations (manual)
• I fixture preparation station
• 1 tool change station
• 1 tool setting station
• 1 battery charge station (manual recharging of AGVs)
·3 or4 AGVs.
The planned workload for the manufacturing organization. along with long and short term forecasts of
the planned production requirements were used to establish this configuration. Figure 2 presents a
diagram of the system layout. Studies to determine the exact number of AGVs, inspection stations and
wash stations have not yet been completed. and the quantities provided are the result of preliminary
research.

Macbine-resouce capacities. Each of the five machining centers has a two pallet exchange. There are
two tool magazines on each machine. each with forty-five spaces for tools.
A 'riser' (also referred to as a 'stump' or 'tombstone') which holds additional parts on each of
its sides, sits on each pallet. The riser is used to reduce tooling costs, set-up times and downtimes.
Risers are mounted onto the center of the machining pallet. They provide the means by which up to
five sides of a part may be machined in one set-up.
Sub plates which are either aluminum or steel (flat) plates are secured to the riser. The sub
plates are used in drilling operations to prevent penetration of the riser. Sub plates are assigned to
specific parts and are not interchangeable.

Machine-resource reliability. The estimate of five machines, for the FMS. includes one machine to
be used as a backup. The estimated machine utilization is only eighty percent, including this additional
machine.
The company has estimated the 'tool lives' and is incorporating this knowledge into the FMS.
so that tools may be changed on a regular basis. The plans do not include, however, a means by which
the FMS would automatically detect broken tools.
Parts going to the coordinate measuring machine (CMM) must first be cleansed. It takes
approximately three to five minutes per pallet at the wash station. To date. there has not been a study
ChlpiCoolant System

-e -B
~~ II II
Cli [nopection Station Hash Station
..-xi"'I
! 'xtrw Q
Storage Area

AS/RS
~!l
c:::J ii
load Station.
(ltanual) • >----,--.,.--,,--:-,-,----,...---"""'-/
Control
Centar

I~'I:' elf ~lr'~·"~'


~.::.1 f '\ t '\ t::::·
load Stillon

8lttery CJlIrgtn,
Stat!~n Dd'
Too I Changer ,. C=o
~il.~
Fixture
~ Tool Setting
a Prep. Statlo1
E9
D(Hanual Recharge of ACV.)
o
Figure 2. SYSTEM LAYOUT w
10
-
392
to verify that only one CMM and one wash station is sufficient

Material handling. Preliminary studies suggest the use of three or four AGVs. The AGVs are battery
operated and time to recharge each of them will be allotted for by the FMS.

Routing. The routes through the FMS are easily depicted by using the diagram of the system (Figure
3) in conjunction with the list of operations for each part in Table 2. Note that every operation in Table
2 actually consists of a load, operation, wash, inspection, and unload sequence on the same fIxture. In
the actual simulation, many load and unload operations exist due to the actual difference in time caused
by the type of fIxture used and the type of part.

Goals. Given the information provided previously, the following goals were set for analysis with the
use of RENSAM. The fust step is to determine whether the proposed confIguration. as well as the
alternative systems to be developed and examined, reach steady state. If this condition does not hold.
the RENSAM results do not reflect the actual performance of the system in the long run. Second, based
on the results of whether a steady system, alternative systems shall be developed and analyzed through
RENSAM. By comparison of the results and the use of RENSAM as a tool. an adequate FMS system
shall be determined for the projected demand. Summarizing. the critical fIgures to be determined are
the combination of machining workstations. load/unload stations. wash stations. inspection stations. and
AGVs in which the main workstation will be working at an eighty percent level.

Assumptions for RENSAM. Given the data of fIve working days per week, each consisting of twenty
four hours. we initially ran the simulation to obtain data for 60 days. Thus, the run length of the
simulation in time units is: 60 days'" 24 hrs/day ... 600 time units/hr = 864.000 time units. In order to
determine if the data obtained is reaching a steady state, runs for 80 and 100 days were performed.
Other general assumptions were made for this case study. First, since backup workstations are
not currently modeled by the RENSAM model. we simply assume four workstations with immediate
repair time. That is the backup workstation becomes available immediately. However. the mean failure
time was arbitrarily decided as 144.000 time units. Identically. the same backup assumption was made
for the material handling system where the number of AGVs was set to four. The mean failure time
is set to 582,000 time units. Second, the AGV travelling speed is assumed to be 100 ft/min. Third. the
distance between stations shall be assumed to be as shown in Figure 2. Fourth. the movement time
between buffers is zero since the data provided assumes the operation times to include them. Fifth. for
simulation pu1llOSes, the wash and inspection stations are modeled as two different machine types located
at different stations, even though, they may in actuality be manual stations. Thus, three types of
machines exist totalling to a number of six machines in the system. Next. a single tool group is
assumed. Finally. the number of ftxtures available to the system is assumed to be the number of
operations using the fIXture type plus one or two additional fIXtures (see Table 3).
In addition, the interarrival schedule matrix determines order quantity. This has been set to
monthly batch sizes. which are referred as typical order quantities in Table 1. Also. two particular
decision rules are used for this case. First. the rule of smallest work in queue to choose the next station
is used. Second, the FIFO job selection rule is used.

RENSAM RESULTS

The original system which consists of four machining workstations. two load/unload stations. one
wash station, one inspection station. and four AGV's did not produce any parts due to blocking
conditions which were attributed to the low number of load/unload stations. Thus. through the use of
393

TABLE 2: OPERATIONS AND FIXTURE TYPES FOR PARTS

PART FIXTURE OPERATION


NAME TYPE REQD SEQUENCE

A 1
2

B.C.D 2 3
2 4

E 5
6
7

F 8
3 9

G 5 12
H 10

I 10
11

J 10
11

K 4 14
4 14

L 5 15
4 16

M 17

N 2 18
2 8
2 4

0 1 19
6 20

P 21
22
23

Q 24
394

TABLE 2: OPERATIONS AND F1XTURE TYPES FOR PARTS (CONT.)

PART FIXTURE OPERATION


NAME TYPE REQD SEQUENCE

R 6 26
6 27

S 4 28
4 29

T 4 30

Note that every operation actually consists of an load, operation, wash, inspection,
and unload sequence on the same fixture. .
395

TABLE 3: NUMBER OF,FIXTURES AVAILABLE

FJXTURE NUMBER NUMBER A V AIL\BLE

15

2 3

3 2

4 5

5 3

6 4

7 2
396

the pre-processor. the capacity of loadlunload stations was immediately doubled. For the run length of
60 days. the system produced parts without the occurrence of blocking conditions. In order to determine
steady state. results of 80 and 100 day run lengths were used. The production figures for the different
runs are shown on Table 4. This table seems to indicate that production quantities are leveling off for
some parts. however. these figures alone are not concrete evidence of steady state.
Further examining data, cumulative average flowtime figures for 1000 days were plotted for every
part type produced in order to determine if steady state is being reached. Figure 3 is a graph of
cumulative average flowtime for part type A. For this particular part. the flowtime averages at around
3.8 hours when the demand rate calls for production of a part at every 14 hours. As shown for part A.
most parts show flowtimes lower than the demand rates. however for parts with very high demand this
is not true. Figure 4 shows part N as an example of this. The figure suggests that the average flowtime
that part N has an average flowtime of 8.2 hours. but the demand rate requires production of a part every
2.26 hours. With the offset of production rates. and the graphical figures. the system seems to be
reaching steady state. However. if one examines the machine statistics closely. the machining
workstations were utilized at levels over 97% in the 80th day. but at below 90% by the l00th day due
to blocking conditions at all stations (see Table 5). Obviously. with the gridlock condition, the system
will not reach steady state. In addition. the inspection and wash stations were only being utilized at a
33% level which indicated that there is no need for additional stations of this type. Finally, the material
handling system was not fully utilized indicating an excess of AGV's for the system.
Thus. the next obvious step was to increase the number of machining stations. The next system
to be modelled was comprised of five machining stations, four load/unload stations, and two AGV·s.
The simulation was run for 100 days and some critical results are found as shown in Table 6. The
results show that the increase of parts completed, as well as the higher utilization of the load-unload
stations, indicate that the flow of parts has defmitely improved as compared to the previous system.
However. the utilization of the machines at the main workstations are still at the 97% level which may
indicate future system blocking. Moreover. it is important to remember that the goal of this study is to
achieve a machine utilization level of 80%.
Further alternatives were quickly implemented through the use of the pre-processor. and results
indicated that with an increase in the machining workstations. it was necessary to increase the capacity
of the load/unload stations. as well as the material handling system. The final goal was attained with
a system configuration of seven machining workstations. six load/unload· stations, one washing station.
one inspection station. and a material handling system of five AGVs. Table 7 shows some critical
results for this system. An average utilization level of 80% level is achieved with this system and
furthermore. the production flowtime seems to be adequate for the demand projected. If demand is
increased in the future. at least there is a 20% utilization level that this system provides as a safety
factor. The wash and inspection stations are below 50% utilization levels. indicating that single stations
more than satisfy the demand. The additional results reveal that there is no substantial accumulation in
the queues system and the material handling system is not used to its full capacity.

SUMMARY

The use of a simulation generator such as RENSAM assists in the process of modeling different
FMS configurations. The user friendly interfaces facilitate quick modeling and analysis of the simulation
model. Once the data is entered and thus the model created, several alternatives are easily modeled and
analyzed. RENSAM has been successfully used in this study. The goals were accomplished by
designing an appropriate system configuration.
397
TABLE 4: FINAL INVENTORIES

4 MACHINING STATIONS. 4 LOAD/UNLOAD. 4 AGVs

PART 60 DAYS 80 DAYS 100 DAYS


A 100 132 149
B.C.D 263 369 429
E 124 167 184
F 174 222 253
G 174 222 253
H 174 221 252
I 174 221 252
J 174 221 252
K 92 119 136
L 185 237 268
M 210 287 318
N 262 367 427
0 45 60 62
P 24 32 34
Q 268 345 390
R 268 346 391
S 124 167 184
T 48 64 66
398

Cumulative Averaoe Flowtime


"","'t A
4 1

:I
"
::I."
~ :I ?

!
::I."
!
:I 5

3."
3.3

:I ;0
1"1 21 411 -4., 5.., 8.., 71 91 1iiI"1 "'01 11.., 121 131 1 ....,

Figure 3. CUMULATIVE AVERAGE FLOWTIME FOR PART TYPE A

Cumulative Averace Flowtime


~.
11.
11 ••
11

10.'
10.'
10.1-
10.2

~
10
a.1
!
! ...
a.1

o•

••
.. .......
••
••
• of trWWMt. r_

Figure 4. CUMULATIVE AVERAGE FLOWTIME FOR PART TYPE N


399

TABLE 5: MACHINE STATISTICS

4 MACHmING STATIONS, 4- LOAD/UNLOAD, 4- AGVs

MACHINE STATISTICS: (80 DAYS)


Mactrine --Utilization-During-- Fraction Fraction Pans
Number Processing Setup Blocked Failed Entered
1 0.9803 0.0000 0.0137 0.0000 1873
2 0.9782 0.0000 0.0141 0.0000 1892
3 0.9771 0.0000 0.0145 0.0000 1917
4 0.9775 0.0000 0.0142 0.0000 1921
5 0.3364 0.0000 0.0035 0.0000 7599
6 0.3364 0.0000 0.2154 0.0000 7599

Average 0.7643 0.0000 0.0459 0.0000

Machine Avg.Opn Current


Number Time Status

602.92 BUSY
2 595.63 BUSY
3 587.16 BUSY
4 586.21 BUSY
5 51.00 IDLE
6 51.00 BLOCK

MACHINE STATISTICS: (100 DAYS)

Mactrine -Utiliz:ltion-During-- Fraction Fraction Pans


Number Processing Setup Blocked Failed Entered

1 0.8919 0.0000 0.0128 0.0000 2121


2 0.8902 0.0000 0.0128 0.0000 2147
3 0.8891 0.0000 0.0133 0.0000 2171
4 0.8893 0.0000 0.0129 0.0000 2178
5 0.3051 0.0000 0.0031 0.0000 8614
6 0.3050 0.0000 0.1978 0.0000 8613

Average 0.6951 0.0000 0.0421 0.0000

Machine Avg.Opn Current


Number Time Status

1 605.54 BLOCK
2 597.05 BLOCK
3 589.72 BLOCK
4 587.99 IDLE
5 51.00 BLOCK
6 51.00 BLOCK
400
TABLE 6: RESULTS FOR FIVE WORKSTATION SYSTEM

FOUR LOAD/UNLOAD STATIONS


TWO AGVs
100 DAYS

PART TYPE PARTS COMPLETED AVERAGE FLOWTIME


A 185 253202
B,C,D 605 3600.54
E 232 4938.34
F 332 3239.64
G 333 1263.97
H 332 1942.34
I 332 2000.l9
J 332 1987.70
K 168 3422.99
L 364 3164.87
M 464 1261.33
N 560 5145.01
o 75 3134.03
P 40 3522.45
Q 645 2607.89
R 648 2662.03
S 232 2447.06
T 80 1197.14

MACHINE STATISTICS :

Machine --Utilization-During-- Fraction Fraction Pans


Number Processing Setup Blocked Failed Entered
1 0.9750 0.0000 0.0204 0.0000 2340
2 0.9754 0.0000 0.0202 0.0000 2377
3 0.9745 0.0000 0.0203 0.0000 2374
4 0.9715 0.0000 0.0217 0.0000 2423
5 0.9699 0.0000 0.0190 0.0000 2371
6 0.4208 0.0000 ·0.0002 0.0000 11880
7 0.4208 0.0000 0.1576 0.0000 11880

Average 0.8154 0.0000 0.0371 0.0000

Machine Avg.Opn Curren [


Number Time S[atus
1 600.02 BUSY
2 590.88 BUSY
3 591.12 BUSY
4 577.36 BUSY
5 589.05 BUSY
6 51.00 IDLE
7 51.00 IDLE
401

TABLE 7: RESULTS FOR SEVEN WORKSTATION SYSTEM

SIX LOADIUNLOAD STATIONS


FIVE AGVs
100 DAYS
PART TYPE PARTS COMPLETED AVERAGE FLOWTIME
A 190 2132.37
B,C,D 792 3048.04
E 250 4358.81
F 369 2804.53
G 370 1073.03
H 369 1561.19
I 369 1596.24
J 369 1563.01
K 170 2930.51
L 400 2748.69
M 501 1000.81
N 658 4318.60
o 75 2809.89
P 40 2850.50
Q 732 2620.78
R 736 2158.66
S 250 2075.32
T 80 1018.67

MACHlNE STATISTICS:
Machine -Utilization-During-- Fraction Fraction Pans
Number Processing Setup Blocked Failed Entered
1 0.8963 0.0000 0.0289 0.0000 2081
2 0.8760 0.0000 0.0287 0.0000 2075
3 0.8602 0.0000 0.0294 0.0000 2004
4 0.8065 0.0000 0.0283 0.0000 1943
5 0.7307 0.0000 0.0300 0.0000 1838
6 0.7037 0.0000 0.0284 0.0000 1786
7 0.6953 0.0000 0.0285 0.0000 1726
8 0.4762 0.0000 0.0022 0.0000 13446
9 0.4762 0.0000 0.0555 0.0000 13446
Average 0.7246 0.0000 0.0289 0.0000

Machine Avg.Opn Current


Number Time Status
1 620.19 BUSY
2 607.94 BUSY
3 618.12 BUSY
4 597.73 BUSY
5 572.50 BUSY
6 567.41 BUSY
7 580.12 BUSY
8 51.00 IDLE
9 51.00 BUSY
402

REFERENCES

Foley, W., S. Jain and J. Haddock, 1990, "Using Simulation Generators for Modeling Flexible
Manufacturing Systems," Progress in Simulation, (J. V. Leonard and G. W. Zobrist, Co-Editors),
(Technical Report 37-87-137), forthcoming.

Haddock, J .• 1987, "An Expert System Framework Based on Simulation Generator," Simulation 48:2.
45-53.

Haddock, J., 1988, "A Simulation Generator for Flexible Manufacturing Systems Design and Control,"
liE Transactions 20:1, 22-31.

Phillips, D.T., M. Handwerker, and G.L. Hogg, 1979, "GEFMS: A Generalized Manufacturing
Simulator," Computers Industrial Engineering, 3, 225-233.

Rolston, L.J., 1984, "Modeling Flexible Manufacturing Systems with MAP/I," Proceedings: First
ORSAITIMS Flexible Manufacturing Systems Conference: Operations Research Models Applications,
University of Michigan. Michigan, Ann Arbor, August, 199-204.

Suri. R., 1984, ,. An Overview of Evaluative Models for Flexible Manufacturing Systems," Proceedings
of First ORSAITIMS Conference on FMS, 8-15.

Suri, R., and G. Diehl, 1987, "Rough-Cut Modeling: An Alternative to Simulation," elM Review,
Winter, 25-33.
AUTHOR INDEX

AIELLO M. 353
ALTINTAS Y. 279
ARAUJOH.J. 217
BATISTA J. 217
BENALI R. 153
BILLINGSLEY J. 131
BODSONM. 353
BOYERK.L. 177
CHANB. 335
CHIASSONJ. 353
DE ALMEIDA AT. 217
D'HOLLANDER E. 31, 237
DIASJ.M. 217
DI STEFANO A 49
GODDARDRE. 177
HADOOCKJ. 385
HEMAMI H. 3, 177
HODAIE P. 335
KASAHARAH. 79
KIM, Y.C. 385
LEPAGEF. 311
LEWISF.L. 369
LOWEJA 369
MARTINS J.L. DE CARVALHO 103
MIRABELLA O. 49
MORELG. 311
NUNES U.C. 217
O'SHEAJ. 153
REKOWSKI R. 353
RICHARDJ. 311
SCHUERERD. 353
STOROSCHCHUK O. 335
SZABADOSB. 335
TENREIRO MACHADO J.A 103
WEILERT AM. 261
ZHENGY.F. 3

403
SUBJECT INDEX

Acquisition of information 263


Acquisition system integration 269
Actions 272
Activation function 247
Actuator dynamics 370
Actuator parameters 372
Adaptive controllers 107
Analog- digital conversion 264
Arithmetic of DSP 52
Assembly language programming 117
Auto-adaptation feedback (AAF) 172
Auto-associator 249. 250
Axis command/coordination 143
Axis controller 150

Back propagation 244


- algorithms 244. 247

Camera calibration 225


Cell functional of requirements 312. 316
Cell planning 321
CNC milling machine 282
Collision avoidance 177
Communication architecture 316. 323
Communication overhead 32. 40
Competitive learning model 253
Computer control 311
Computer integrated manufacturing 335
Constraint satisfaction network 251
Contouring accuracy 307
Contouring errors 299
ContOuring with matched axis 301
ContOuring with mismatched axis 302
Control 265
Control algorithms 160. 168
Control cell level 320
Control loop of cells 317
Controller complexity 108

405
406

Controller performance 108


Controllers 164
Controller structure 205
Cutting torque 307

Data acquisition systems 261


Data communication standards 268
Data flow-analysis 37
Delta learning rule 240
Desk top computer 131
Difference equations implementation 141
Digital signal processor (DSP) 49, 52
Distributed control architecture 314
Distributed system architecture 141
Dynamics algOrithm using DSP 61

Feedback linearization controller 354


Feed drive control 283, 290. 294
FIR filter 50
Floating-point matrix inversion 68
FMS 385, 386
Force sensing 228

Hardware - oriented approach 82, 92


Heads positioning actuator 370
Hidden units 247

Identification experiments 365


Imaging resolution 183
Image plane speed 193
Image processing 222
-library 224
Initial robot programming 343
Input delegation 206
Interactive activation 251
Inverse kinematics 158

Joint acceleration integration 165

Kalman filter design 373


- implementation 376
407

Kinematics 61, 132


Kinematics algorithm using DSP 61

LearnIng model 253


Unear controllers 104
Unear pattern associator (LPA) 239
Local area networks 337
Longest processing time first algorithm 38

Machine level elements 325


Machining process control 330
Manipulator gripper 227
Manufacturing planning system 313
Manufacturing systems 385, 386
Map controller 324
Material handling 392
Measures 271
Memory management of DSP 54
MIcroprocessor-based controller 103, 104, 109
MIcroprocessor S/W standards 267
Modular CNC system 313
Multilayered pattern associator 242
Multiprocessor system 17, 83
Multirate schemes 119

Network desIgn 338


Neural networks 237, 238
Neurons 237
Newton Euler equations 6, 31, 81
Nonlinear model-based controllers 105
Nonlinear pattern associator 242
Numerical controller 325

Parallel computation 17, 24


Parallel execution 39
Parallel processIng 78, 80, 89
Parameter Identification 358
PC simulation program 153, 154
Perlormanceevaluation 22, 90, 385, 386
Perlormnce requirements 372
Permanent magnet stepper motor (PMS) 353
PID controller 168
Polynomial fittlng 139
408

Position control loop 147


Positive current feedback 170
- plus PID controller 171
Processor utilization 43

Quadrature current 364

Real time
- controller performance 115
- path control 230
Reference pattern 188
- set 198
RENSAM 385, 386
Resistance to steady force 208
Retreat from a projectile 209
Robot control 159
- computation 80, 82
- performance 154
Robot dynamics 3,162
Routing 392

Scheduling 37, 386


Sensor data Integration 231
Sensor processing 203
Sensor structure 202
Sensory configuration 179
Simulated annealing 252
Simulation 386
Simulation analySIS 385, 386
Simulation generator 386
Simulation languages 386
Simulation models 385, 386
Simulation on a multiprocessor 96
Software oriented approach 84, 94
State relegation 207
Steady state Kalman fllter 377
Supervisory system 132
Support application software 340

Tactile sensing 228


Task granularity 32
Task graph 32, 86
Task scheduling 228
409

Time scheduling for central CP 20


Time varying Kalman filter 382
Trajectory generation 155
Tree height reduction 6

Ultrasonic system 229

Vartable structure controllers 106


Vision system 221
Visual sensing 178
International Series on
MICROPROCESSOR-BASED SYSTEMS ENGINEERING

Editor: Professor S. G. Tzafestas, National Technical University, Athens, Greece

1. S.G. Tzafestas (ed.): Microprocessors in Signal Processing, Measurement and


Control. 1983 ISBN 90-277-1497-5
2. G. Conte and D. Del Corso (eds.): Multi-Microprocessor Systems for Real-Time
Applications. 1985 ISBN 90-277-2054-1
3. C.J. Georgopoulos: Interface Fundamentals in Microprocessor-Controlled Systems.
1985 ISBN 90-277-2127-0
4. N.K. Sinha (ed.): Microprocessor-Based Control Systems. 1986 ISBN 90-277-2287-0
5. S.G. Tzafestas and 1.K. Pal (eds.): Real Time Microcomputer Control of Industrial
Processes. 1990 ISBN 0-7923-0779-8
6. S.G. Tzafestas (ed.): Microprocessors in Robotic and Manufacturing Systems. 1991
ISBN 0-7923-0780-1
7. N.K. Sinha and G.P. Rao (eds.): IdentificatIon of Continuous-Time Systems. Methodol-
ogy and Computer Implementation. 1991 ISBN 0-7923-1336-4

KLUWER ACADEMIC PUBLISHERS - DORDRECHT / BOSTON / LONDON

You might also like