Introduction to Autonomous Robots: Mechanisms, Sensors, Actuators, and Algorithms

Chapter 1: Fundamentals of Autonomous Robots

[mh]Introduction to Robotics Technology

Robotics is the interdisciplinary study and practice of the design, construction, operation, and use of
robots. Within mechanical engineering, robotics is the design and construction of the physical structures
of robots, while in computer science, robotics focuses on robotic automation algorithms. Other disciplines
contributing to robotics include electrical, control, software, information, electronic, telecommunication,
computer, mechatronic, and materials engineering.

The goal of most robotics is to design machines that can help and assist humans. Many robots are built to
do jobs that are hazardous to people, such as finding survivors in unstable ruins, and exploring space,
mines and shipwrecks. Others replace people in jobs that are boring, repetitive, or unpleasant, such as
cleaning, monitoring, transporting, and assembling. Today, robotics is a rapidly growing field: as technological advances continue, researchers design and build new robots that serve a variety of practical purposes.

[h]Robotics aspects

There are many types of robots; they are used in many different environments and for many different
uses. Although diverse in application and form, they all share three basic aspects when it comes to their
design and construction:

1. Mechanical construction: a frame, form or shape designed to achieve a particular task. For
example, a robot designed to travel across heavy dirt or mud might use caterpillar tracks. Origami
inspired robots can sense and analyze in extreme environments. The mechanical aspect of the robot
is mostly the creator's solution to completing the assigned task and dealing with the physics of the
environment around it. Form follows function.
2. Electrical components that power and control the machinery. For example, the robot with caterpillar tracks would need some kind of power to move the track treads. That power comes in the form of electricity, which has to travel through a wire and originate from a battery, a basic electrical circuit. Even petrol-powered machines that get their power mainly from petrol still require an electric current to start the combustion process, which is why most petrol-powered machines, like cars, have batteries. The electrical aspect of robots is used for movement (through motors), sensing (where electrical signals are used to measure things like heat, sound, position, and energy status), and operation (robots need some level of electrical energy supplied to their motors and sensors in order to activate and perform basic operations).
3. Software. A program is how a robot decides when or how to do something. In the caterpillar track example, a robot that needs to move across a muddy road may have the correct mechanical construction and receive the correct amount of power from its battery, but it would not be able to go anywhere without a program telling it to move. Programs are the core essence of a robot: it could have excellent mechanical and electrical construction, but if its program is poorly structured, its performance will be very poor (or it may not perform at all). There are three different types of robotic programs: remote control, artificial intelligence, and hybrid. A robot with remote-control programming has a preexisting set of commands that it will only perform if and when it receives a signal from a control source, typically a human being with a remote control. It is perhaps more appropriate to view devices controlled primarily by human commands as falling within the discipline of automation rather than robotics. Robots that use artificial intelligence interact with their environment on their own without a control source, and can determine reactions to objects and problems they encounter using their preexisting programming. A hybrid is a form of programming that incorporates both AI and remote-control functions; a minimal sketch of this hybrid style follows this list.
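
As a rough illustration of the hybrid style just described, the sketch below mixes the two modes: remote commands take priority when present, and an on-board decision function takes over otherwise. All function names here are hypothetical placeholders, not any real robot API.

```python
# Minimal sketch of a "hybrid" robot program: remote commands take priority,
# otherwise the robot falls back to its own (AI) behaviour.
# All functions are hypothetical placeholders.

import time

def read_remote_command():
    """Return a command string from the operator, or None if nothing was sent."""
    return None  # placeholder: no operator input in this sketch

def read_sensors():
    return {"path_clear": True}  # placeholder sensor reading

def plan_autonomously(sensor_data):
    """Very crude 'AI' layer: decide a motion command from sensor readings."""
    return "forward" if sensor_data.get("path_clear", True) else "stop"

def drive(command):
    print(f"executing: {command}")

def control_loop(cycles=3, period_s=0.1):
    for _ in range(cycles):
        remote = read_remote_command()
        command = remote if remote is not None else plan_autonomously(read_sensors())
        drive(command)
        time.sleep(period_s)

if __name__ == "__main__":
    control_loop()
```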

[h]Applied robotics

As more and more robots are designed for specific tasks, classifying them by application becomes more relevant. For example, many robots are designed for assembly work and may not be readily adaptable to other applications; they are termed "assembly robots". For seam welding, some suppliers provide complete welding systems with the robot, i.e., the welding equipment along with other material handling facilities like turntables, as an integrated unit. Such an integrated robotic system is called a "welding robot" even though its discrete manipulator unit could be adapted to a variety of tasks. Some robots are specifically designed for heavy-load manipulation and are labeled "heavy-duty robots".

Current and potential applications include:

 Manufacturing. Robots have been increasingly used in manufacturing since the 1960s. According
to the Robotic Industries Association US data, in 2016 the automotive industry was the main
customer of industrial robots, with 52% of total sales. In the auto industry, robots can account for
more than half of the "labor". There are even "lights-out" factories, such as an IBM keyboard
manufacturing plant in Texas that was fully automated as early as 2003.
 Autonomous transport including self-driving cars and airplane autopilot
 Domestic robots including Robotic vacuum cleaners.
 Construction robots. Construction robots can be separated into three types: traditional robots,
robotic arm, and robotic exoskeleton.
 Agricultural robots. The use of robots in agriculture is closely linked to the concept of AI-assisted
precision agriculture and drone usage.
 Medical robots of various types ; and Robot-assisted surgery designed and used in clinics.
 Food processing. Commercial examples of kitchen automation are Flippy (burgers), Zume Pizza
(pizza), Cafe X (coffee), Makr Shakr (cocktails), Frobot (frozen yogurts), Sally (salads), salad or
food bowl robots manufactured by Dexai (a Draper Laboratory spinoff, operating on military
bases), and integrated food bowl assembly systems manufactured by Spyce Kitchen (acquired by
Sweetgreen) and Silicon Valley startup Hyphen. Home examples are Rotimatic (flatbreads baking)
and Boris (dishwasher loading). Other examples may include manufacturing technologies based on
3D Food Printing.
 Automated mining.
 Space exploration, including Mars rovers.
 Cleanup of contaminated areas, such as toxic waste or nuclear facilities.
 Robotic lawn mowers and Sports field line marking.
 Robot sports for entertainment and education, including Robot combat, Autonomous racing, drone
racing, and FIRST Robotics.
 Military robots.

At present, mostly (lead–acid) batteries are used as a power source. Many different types of batteries can be used as a power source for robots. They range from lead–acid batteries, which are safe and have relatively long shelf lives but are rather heavy, to silver–cadmium batteries, which are much smaller in volume but currently much more expensive. Designing a battery-powered robot requires taking into account factors such as safety, cycle lifetime, and weight. Generators, often some type of internal combustion engine, can also be used. However, such designs are often mechanically complex, need fuel, require heat dissipation, and are relatively heavy. A tether connecting the robot to a power supply removes the power supply from the robot entirely. This has the advantage of saving weight and space by moving all power generation and storage components elsewhere. However, this design does come with the drawback of constantly having a cable connected to the robot, which can be difficult to manage. Potential power sources include:

 pneumatic (compressed gases)
 Solar power (using the sun's energy and converting it into electrical power)
 hydraulics (liquids)
 flywheel energy storage
 organic garbage (through anaerobic digestion)
 nuclear

[h]Actuation

Actuators are the "muscles" of a robot, the parts which convert stored energy into movement. By far the
most popular actuators are electric motors that rotate a wheel or gear, and linear actuators that control
industrial robots in factories. There are some recent advances in alternative types of actuators, powered by
electricity, chemicals, or compressed air.

[h]Electric motors

The vast majority of robots use electric motors, often brushed and brushless DC motors in portable robots
or AC motors in industrial robots and CNC machines. These motors are often preferred in systems with
lighter loads, and where the predominant form of motion is rotational.

[h]Linear actuators

Various types of linear actuators move in and out instead of spinning, and often change direction more quickly, particularly when very large forces are needed, such as in industrial robotics. They are typically powered by compressed air (pneumatic actuators) or oil (hydraulic actuators). Linear actuators can also be powered by electricity, in which case they usually consist of a motor and a leadscrew. Another common type is a mechanical linear actuator, such as a rack and pinion in a car.

[h]Series elastic actuators

Series elastic actuation (SEA) relies on the idea of introducing intentional elasticity between the motor
actuator and the load for robust force control. Due to the resultant lower reflected inertia, series elastic
actuation improves safety when a robot interacts with the environment (e.g., humans or workpieces) or
during collisions. Furthermore, it also provides energy efficiency and shock absorption (mechanical
filtering) while reducing excessive wear on the transmission and other mechanical components. This
approach has successfully been employed in various robots, particularly advanced manufacturing robots
and walking humanoid robots.
The controller design of a series elastic actuator is most often performed within the passivity framework
as it ensures the safety of interaction with unstructured environments. Despite its remarkable stability and
robustness, this framework suffers from the stringent limitations imposed on the controller which may
trade off performance. The reader is referred to surveys that summarize the common controller architectures for SEA along with the corresponding sufficient passivity conditions. One recent study has derived the necessary and sufficient passivity conditions for one of the most common impedance control architectures, namely velocity-sourced SEA. This work is of particular importance as it derives the non-conservative passivity bounds in an SEA scheme for the first time, which allows a larger selection of control gains.
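
To make the series-elastic idea concrete, here is a minimal sketch assuming an ideal torsional spring of known stiffness between motor and load: the spring deflection gives the transmitted torque, and a proportional law on the torque error produces a motor velocity command in the velocity-sourced style mentioned above. The stiffness and gain values are illustrative, not taken from any particular robot.

```python
# Series-elastic sketch: torque is read off the spring deflection, and a simple
# proportional law turns the torque error into a motor velocity command.
# SPRING_STIFFNESS and KP_FORCE are assumed, illustrative values.

SPRING_STIFFNESS = 300.0   # N*m/rad, assumed spring constant
KP_FORCE = 2.0             # proportional gain on torque error

def estimated_torque(theta_motor: float, theta_load: float) -> float:
    """Torque transmitted through the series spring."""
    return SPRING_STIFFNESS * (theta_motor - theta_load)

def motor_velocity_command(torque_desired: float,
                           theta_motor: float,
                           theta_load: float) -> float:
    """Velocity command that drives the spring deflection toward the desired torque."""
    torque_error = torque_desired - estimated_torque(theta_motor, theta_load)
    return KP_FORCE * torque_error / SPRING_STIFFNESS

# Example: ask for 6 N*m while the spring currently transmits 3 N*m.
if __name__ == "__main__":
    print(motor_velocity_command(6.0, 0.02, 0.01))  # positive -> wind the spring up further
```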

[h]Air muscles

Pneumatic artificial muscles, also known as air muscles, are special tubes that expand (typically up to 42%) when air is forced inside them. They are used in some robot applications.

[h]Wire muscles

Muscle wire, also known as shape memory alloy, Nitinol® or Flexinol® wire, is a material that contracts slightly (under 5%) when electricity is applied. It has been used in some small robot applications.

[h]Electroactive polymers

EAPs or EPAMs are plastic materials that can contract substantially (up to 380% activation strain) in response to electricity. They have been used in the facial muscles and arms of humanoid robots, and to enable new robots to float, fly, swim, or walk.

[h]Piezo motors

Recent alternatives to DC motors are piezo motors or ultrasonic motors. These work on a fundamentally
different principle, whereby tiny piezoceramic elements, vibrating many thousands of times per second,
cause linear or rotary motion. There are different mechanisms of operation; one type uses the vibration of
the piezo elements to step the motor in a circle or a straight line. Another type uses the piezo elements to
cause a nut to vibrate or to drive a screw. The advantages of these motors are nanometer resolution,
speed, and available force for their size. These motors are already available commercially and being used
on some robots.

[h]Elastic nanotubes

Elastic nanotubes are a promising artificial muscle technology in early-stage experimental development.
The absence of defects in carbon nanotubes enables these filaments to deform elastically by several
percent, with energy storage levels of perhaps 10 J/cm3 for metal nanotubes. Human biceps could be
replaced with an 8 mm diameter wire of this material. Such compact "muscle" might allow future robots
to outrun and outjump humans.

[h]Sensing

Current robotic and prosthetic hands receive far less tactile information than the human hand. Recent
research has developed a tactile sensor array that mimics the mechanical properties and touch receptors of
human fingertips. The sensor array is constructed as a rigid core surrounded by conductive fluid
contained by an elastomeric skin. Electrodes are mounted on the surface of the rigid core and are
connected to an impedance-measuring device within the core. When the artificial skin touches an object, the fluid path around the electrodes is deformed, producing impedance changes that map the forces received from the object. The researchers expect that an important function of such artificial fingertips will be adjusting the robotic grip on held objects.
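
A hypothetical sketch of how such an impedance-based fingertip might be read out is given below: each electrode's impedance change relative to a no-contact baseline is scaled by a calibration factor to estimate local contact force. The baseline and calibration values are invented for illustration and are not from the cited sensor.

```python
# Hypothetical readout of an impedance-sensing fingertip: impedance drops at each
# electrode (relative to a no-contact baseline) are mapped to force estimates.
# All numbers are invented for illustration.

import numpy as np

baseline_impedance = np.array([10.0, 10.0, 10.0, 10.0])   # kOhm, no-contact values
calibration = np.array([0.5, 0.5, 0.5, 0.5])               # N per kOhm of change (assumed)

def contact_forces(measured_impedance: np.ndarray) -> np.ndarray:
    """Map impedance changes at each electrode to an estimated normal force."""
    delta = baseline_impedance - measured_impedance          # assume contact lowers impedance
    return np.clip(delta, 0.0, None) * calibration

if __name__ == "__main__":
    print(contact_forces(np.array([10.0, 9.2, 8.5, 10.0])))  # force concentrated mid-fingertip
```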

Scientists from several European countries and Israel developed a prosthetic hand in 2009, called
SmartHand, which functions like a real one —allowing patients to write with it, type on a keyboard, play
piano, and perform other fine movements. The prosthesis has sensors which enable the patient to sense
real feelings in its fingertips.

A definition of robotic manipulation has been provided by Matt Mason as: "manipulation refers to an
agent's control of its environment through selective contact".

Robots need to manipulate objects: pick up, modify, destroy, move, or otherwise have an effect on them. Thus the functional end of a robot arm, intended to produce the effect (whether a hand or a tool), is often referred to as the end effector, while the "arm" is referred to as the manipulator. Most robot arms have replaceable end effectors, each allowing them to perform some small range of tasks. Some have a fixed manipulator that cannot be replaced, while a few have one very general-purpose manipulator, for example, a humanoid hand.

[h]Mechanical grippers

One of the most common types of end effector is the "gripper". In its simplest manifestation, it consists of just two fingers that can open and close to pick up and let go of a range of small objects. Fingers can, for
example, be made of a chain with a metal wire running through it. Hands that resemble and work more
like a human hand include the Shadow Hand and the Robonaut hand. Hands that are of a mid-level
complexity include the Delft hand. Mechanical grippers can come in various types, including friction and
encompassing jaws. Friction jaws use all the force of the gripper to hold the object in place using friction.
Encompassing jaws cradle the object in place, using less friction.

[h]Suction end-effectors

Suction end-effectors, powered by vacuum generators, are very simple astrictive devices that can hold
very large loads provided the prehension surface is smooth enough to ensure suction.

Pick and place robots for electronic components, and for large objects like car windscreens, often use very simple vacuum end-effectors.

Suction is a widely used type of end-effector in industry, in part because the natural compliance of soft suction end-effectors can enable a robot to be more robust in the presence of imperfect robotic perception.
As an example: consider the case of a robot vision system that estimates the position of a water bottle but
has 1 centimeter of error. While this may cause a rigid mechanical gripper to puncture the water bottle,
the soft suction end-effector may just bend slightly and conform to the shape of the water bottle surface.

[h]General purpose effectors

Some advanced robots are beginning to use fully humanoid hands, like the Shadow Hand, MANUS, and
the Schunk hand. They have powerful robot dexterity intelligence (RDI), with as many as 20 degrees of
freedom and hundreds of tactile sensors.

[h]Rolling robots

For simplicity, most mobile robots have four wheels or a number of continuous tracks. Some researchers
have tried to create more complex wheeled robots with only one or two wheels. These can have certain
advantages such as greater efficiency and reduced parts, as well as allowing a robot to navigate in
confined places that a four-wheeled robot would not be able to.

[h]Two-wheeled balancing robots

Balancing robots generally use a gyroscope to detect how much the robot is falling and then drive the wheels proportionally in the same direction, counterbalancing the fall hundreds of times per second, based on the dynamics of an inverted pendulum; a minimal control sketch follows below. Many different balancing robots have been designed. While the Segway is not commonly thought of as a robot, it can be thought of as a component of a robot; when used as such, Segway refers to them as RMPs (Robotic Mobility Platforms). An example of this use is NASA's Robonaut, which has been mounted on a Segway.
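
A minimal sketch of this balancing loop, assuming a tilt estimate is already available from the gyroscope (e.g., via a complementary filter), is shown below; the PD gains are illustrative only, not from any particular robot.

```python
# Balancing sketch: read a tilt estimate and drive the wheels in the direction of
# the fall with a PD law, executed hundreds of times per second. Gains are illustrative.

KP = 40.0   # proportional gain on tilt angle (rad)
KD = 2.5    # derivative gain on tilt rate (rad/s)

def wheel_command(tilt_rad: float, tilt_rate_rad_s: float) -> float:
    """Torque/throttle command: push the wheels under the falling robot."""
    return KP * tilt_rad + KD * tilt_rate_rad_s

# Falling forward by 0.05 rad and accelerating: command is positive (drive forward).
if __name__ == "__main__":
    print(wheel_command(0.05, 0.2))
```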

[h]One-wheeled balancing robots

A one-wheeled balancing robot is an extension of a two-wheeled balancing robot so that it can move in
any 2D direction using a round ball as its only wheel. Several one-wheeled balancing robots have been
designed recently, such as Carnegie Mellon University's "Ballbot" which is the approximate height and
width of a person, and Tohoku Gakuin University's "BallIP". Because of the long, thin shape and ability
to maneuver in tight spaces, they have the potential to function better than other robots in environments
with people.

[h]Spherical orb robots

Several attempts have been made to build robots that are completely enclosed in a spherical ball, either by spinning a weight inside the ball or by rotating the outer shell of the sphere. These have also been referred to as an orb bot or a ball bot.

[h]Six-wheeled robots

Using six wheels instead of four wheels can give better traction or grip in outdoor terrain such as on rocky
dirt or grass.

[h]Tracked robots

Figure: TALON military robots used by the United States Army

Tank tracks provide even more traction than a six-wheeled robot. Tracked wheels behave as if they were made of hundreds of wheels and are therefore very common for outdoor and military robots, where the robot must drive on very rough terrain. However, they are difficult to use indoors, such as on carpets and smooth floors. Examples include NASA's Urban Robot "Urbie".

[h]Walking robots

Walking is a difficult and dynamic problem to solve. Several robots have been made which can walk reliably on two legs; however, none have yet been made which are as robust as a human. There has been much study of human-inspired walking, such as at the AMBER lab, which was established in 2008 by the Mechanical Engineering Department at Texas A&M University. Many other robots have been built that walk on more than two legs, because these robots are significantly easier to construct. Walking robots can be used on uneven terrain, where they can provide better mobility and energy efficiency than other locomotion methods. Typically, robots on two legs can walk well on flat floors and can occasionally walk up stairs. None can walk over rocky, uneven terrain. Some of the methods which have been tried are:

The zero moment point (ZMP) algorithm is used by robots such as Honda's ASIMO. The robot's onboard computer tries to keep the total inertial forces (the combination of Earth's gravity and the acceleration and deceleration of walking) exactly opposed by the floor reaction force (the force of the floor pushing back on the robot's foot). In this way, the two forces cancel out, leaving no moment (force causing the robot to rotate and fall over). However, this is not exactly how a human walks, and the difference is obvious to human observers, some of whom have pointed out that ASIMO walks as if it needs the lavatory. ASIMO's walking algorithm is not static, and some dynamic balancing is used. However, it still requires a smooth surface to walk on.
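
The ZMP itself is easy to compute under the common cart-table (linear inverted pendulum) simplification, in which the horizontal ZMP position equals the CoM position shifted by (CoM height / g) times the horizontal CoM acceleration. The sketch below illustrates that relation and the inside-the-foot check; it is a textbook simplification, not Honda's implementation, and the numbers are illustrative.

```python
# ZMP under the cart-table / linear-inverted-pendulum simplification: the point on
# the ground where gravity plus the CoM inertial force produce no tipping moment.
# Keeping this point inside the support foot prevents rotation about the foot edge.

G = 9.81  # m/s^2

def zmp_x(com_x: float, com_height: float, com_accel_x: float) -> float:
    """Horizontal ZMP position for a CoM at height com_height accelerating at com_accel_x."""
    return com_x - (com_height / G) * com_accel_x

def zmp_inside_foot(zmp: float, foot_min_x: float, foot_max_x: float) -> bool:
    return foot_min_x <= zmp <= foot_max_x

if __name__ == "__main__":
    z = zmp_x(com_x=0.0, com_height=0.8, com_accel_x=1.0)   # CoM accelerating forward
    print(z, zmp_inside_foot(z, foot_min_x=-0.10, foot_max_x=0.15))
```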

[h]Hopping

Several robots built in the 1980s by Marc Raibert at the MIT Leg Laboratory successfully demonstrated very dynamic walking. Initially, a robot with only one leg and a very small foot could stay upright simply by hopping. The movement is the same as that of a person on a pogo stick: as the robot falls to one side, it jumps slightly in that direction in order to catch itself. Soon, the algorithm was generalised to two and four legs. A bipedal robot was demonstrated running and even performing somersaults. A quadruped was also demonstrated which could trot, run, pace, and bound. For a full list of these robots, see the MIT Leg Lab Robots page.

[h]Dynamic balancing (controlled falling)

A more advanced way for a robot to walk is by using a dynamic balancing algorithm, which is potentially
more robust than the Zero Moment Point technique, as it constantly monitors the robot's motion, and
places the feet in order to maintain stability. This technique was recently demonstrated by Anybots'
Dexter Robot, which is so stable, it can even jump. Another example is the TU Delft Flame.

[mh] Evolution of Autonomous Systems

Autonomous driving is a field with a promising future. Human society is extremely dependent on transportation. According to statistics, there are more than 1.4 billion vehicles in the world. Tens of millions or even hundreds of millions of drivers operate vehicles around the world every day, which is a very large consumption of manpower. Intelligent autonomous vehicles, including driverless vehicles, have become a hot topic for regulators and industry in recent years. Intelligent autonomous driving has been defined as the direction of development for the next 10 years.

Based on the degree to which the driving automation system can perform dynamic driving tasks, the role it is assigned in the execution of dynamic driving tasks, and whether design operating condition restrictions apply, driving automation is divided into Levels 0–5.

Level    Name                     Vehicle motion control   Targets and events detection and response
Level 0  No Automation            Driver                   Driver
Level 1  Driver Assistance        Driver and system        Driver
Level 2  Partial Automation       Driver and system        Driver
Level 3  Conditional Automation   System                   Driver or system
Level 4  High Automation          System                   System
Level 5  Full Automation          System                   System

Table: Driving automation levels and their division elements.

Level 0 (No Automation): The driver is in full control of the vehicle. The driving automation system cannot continuously control vehicle motion in dynamic driving tasks, but it may have the ability to continuously detect and respond to some targets and events in dynamic driving tasks. Typical vehicles currently on the road are classified as Level 0.

Level 1 (Driver Assistance): The driving automation system continuously controls vehicle motion in dynamic driving tasks within its designed operating conditions and has the ability to detect and appropriately respond to some of the targets and events related to vehicle motion control. In other words, Level 1 automated systems are sometimes able to assist the driver with certain driving tasks.

Level 2 (Partial Automation): The driving automation system continuously controls vehicle motion in dynamic driving tasks within the conditions for which it is designed to operate and has the capability to detect and appropriately respond to some of the targets and events that are related to vehicle motion control. In other words, the automation system is capable of performing certain driving tasks, but the driver needs to monitor the driving environment and complete the remainder. Drivers also need to be ready to take over driving when problems arise at any time. At this level, the system requires the driver to be ready to correct errors in perception and judgment; this capability is already offered by most car companies. Level 2 operation can be split by speed and environment into different usage scenarios, such as low-speed traffic jams on ring roads, fast travel on highways, and automatic parking with the driver in the car. Most of the new vehicles on the market are classified as Level 1 or Level 2. These new vehicles are equipped with assisted driving features such as lane centering and speed control, which are useful for both parking and driving. However, the car remains in the hands of the driver.

Level 3 (Conditional Automation): The automation system is capable of both performing certain driving tasks and monitoring the driving environment in certain situations, but the driver must be ready to regain driving control when requested to do so by the automated system. At this level, drivers are still unable to sleep or rest deeply. Currently, Level 3 is the highest level that commercial vehicles can reach. Conditional automation requires that the vehicle be able to drive autonomously under ideal conditions, such as at a given speed and road type, but there are many limitations when these conditions are exceeded. The meaningful deployments seen so far are limited to highway conditions.

Level 4 (High Automation): The automation system is able to perform driving tasks and monitor the driving environment in certain environments and under specific conditions. Specifically, the driving automation system continuously performs all dynamic driving tasks and handles their fallback within its designed operating conditions. In this stage, within the range in which automated driving can operate, all driving-related tasks are independent of the driver; responsibility for perceiving the environment rests entirely with the automated driving system. Level 4 vehicles are almost fully automated, but their automation systems can only be used in known use cases. They cannot be applied when driving off-road or in extreme weather conditions; in these situations, the driver must steer the vehicle. Most Level 4 deployments are currently set in urban conditions. Expected uses include fully automated valet parking or direct integration with a taxi service.

Level 5 (Full Automation): The automation system can perform all driving tasks under all conditions. Specifically, the driving automation system continuously performs all dynamic driving tasks and handles their fallback under any drivable condition. Level 5 vehicles are truly driverless cars. To reach Level 5, a vehicle must be able to navigate autonomously through any road condition or hazardous obstacle.

Level 1 and Level 2 cannot be considered autonomous driving; instead, they are Advanced Driver Assistance Systems (ADAS). Level 3 and above can be called autonomous driving. The majority of Level 3 vehicles are still test vehicles and are not really commercially available. Current autonomous driving is still not perfect: although the car may be operating automatically, the driver must be alert and ready to take over the vehicle to deal with accidents. In the future, autonomous driving will gradually become a mature and reliable technology. It is not yet mature, but it has a bright future.

[h]Development history

In August 1925, a radio-controlled car called "American Miracle" was unveiled by Francis P. Houdina, a U.S. Army electrical engineer. By radio control, the steering wheel, clutch, brakes, and other parts of the vehicle could be remotely operated. According to the New York Times, the radio-controlled vehicle could start its engine, shift its gears, and honk its horn as if a phantom hand were at the wheel. It was a long way from "automated driving," but it was the first documented self-driving vehicle in human history.

In 1939, General Motors exhibited the world's first self-driving concept vehicle, Futurama, at the New York World's Fair. Futurama was an electric car guided by a radio-controlled electromagnetic field generated by magnetized metal spikes embedded in the road. It was not until 1958, however, that General Motors brought this concept vehicle to life. General Motors embedded sensors called pickup coils in the front of the car. The current flowing through wires embedded in the road could then be manipulated to tell the vehicle to move the steering wheel to the left or right.

During the next nearly 20 years, the development of autonomous driving technology hit a bottleneck and progressed slowly, with no significant breakthroughs. It was not until the 1970s, especially after computers and IT technology began to develop at a rapid pace, that autonomous driving technology again entered a period of rapid development. In 1977, Japan's Tsukuba Mechanical Engineering Laboratory improved on the pulse-signal control method previously used by General Motors. They used a camera system that forwarded data to a computer to process road images. This allowed the car to follow white road markings at 30 kilometers per hour on its own, though it still needed the assistance of steel rails. A decade later, German researchers improved the camera system and developed VaMoRs, a vehicle equipped with cameras that could drive safely at 90 km/h. As technology has advanced, the environment detection and reaction abilities of self-driving cars have also improved.

In 1984, the Defense Advanced Research Projects Agency (DARPA), in partnership with the Army, launched the Autonomous Land Vehicle (ALV) program. The goal of this program was to give vehicles full autonomy, allowing them to detect terrain through cameras and compute navigation and driving routes through computer systems. The second DARPA Grand Challenge, in 2005, was the first time in history that five driverless cars successfully navigated a desert track with rough road conditions using a recognition system.

Since 2009, Google has been developing a driverless car project, initially in secret, which is now known as Waymo. In 2010, it was reported that Google had tested self-driving cars at night on highways for 1,000 miles without human intervention, and for a total of more than 140,000 miles with occasional human intervention. In 2014, Google demonstrated a prototype of a driverless car without a steering wheel, gas pedal, or brake pedal, making it 100% self-driving. As driverless programs steadily advance, the potential and opportunities of "autonomous driving" are being discovered by more and more people.

[h]Industrial

In the last 10 years, self-driving cars have become a key area of interest for many companies. There have been two waves of evolution in the autonomous driving industry. The first was triggered by the expectation of industrializing autonomous driving, with many landmark events such as Waymo's spin-off, General Motors' acquisition of Cruise, and Ford's investment in Argo AI. The background of this evolution is that artificial intelligence technology improved tremendously, and deep learning started to be applied at large scale in perception technology, such as image recognition. Sensor technology also developed greatly. However, at that time, the long-tail problem became an important constraint on the implementation of high-level autonomous driving. Specific challenges exist in the robustness of hardware, the redundancy of the system, and the completeness of testing.

The second evolution comes from the commercialization of autonomous driving. After 3–4 years of technology development, core autonomous driving technologies such as LiDAR, chips, perception, and decision algorithms have developed further. Autonomous driving in specific scenarios (such as mines, ports, and airports) can now be commercialized.

At present, the global autonomous vehicle industry is trending in a positive direction. However, few areas can achieve mass production. Autonomous vehicle technology is developing together with 5G communication technology and related new energy vehicle technologies.

[h]Why intelligent autonomous vehicles have a huge potential

Autonomous vehicle technology has the potential to transform commuting and long-distance travel experiences, take people away from high-risk work environments, and allow for a higher degree of development and collaboration across industries. It is also the key to building our cities of the future.

[h]Living

In the future, humans' relationship with vehicles will be redefined, reducing carbon emissions and paving the way for more sustainable lifestyles. It is estimated that 30% of greenhouse gas emissions in the United States come from transportation. Autonomous vehicles will be able to travel more efficiently on the road than vehicles operated by human drivers, which will lead to a reduction in greenhouse gas emissions. Also, vehicles can be grouped into "platoons", reducing the number of accidents and keeping traffic flowing continuously. Less congestion means that passengers can get to their destinations faster and spend less time on the road, which in turn improves fuel efficiency and reduces CO2 emissions. Vehicles will also likely communicate with road infrastructure, such as traffic signals, to adjust fuel consumption and emissions accordingly.

Cab services, car sharing, and public transportation will become faster and cheaper as the autonomous
vehicles revolution progresses. The cost of these services is therefore expected to decrease as
maintenance, gas, and labor requirements decrease. In this way, the cost gaps between purchasing one’s
own car and using these travel services will be stretched to a degree that will redefine how people travel.
More importantly, the reduced cost of transportation services will drive economic mobility for vulnerable
populations. Geographic locations that previously have been inaccessible to certain populations due to
commuting costs will become accessible, resulting in some new beneficial effects on the working
population.

When autonomous vehicles become available, navigation will be more efficient and traffic congestion will be reduced. Also, because cabs and car-sharing services will be cheaper than purchasing a car, there will be fewer cars on the road. These changes will ultimately improve commuting efficiency. The saved commute time can be spent on work, socializing, and relaxing. This reduces the anxiety of commuters when they arrive at work and allows them to work in a better condition. In addition, the reduction in travel time also helps to improve daily productivity.

[h]Human nature

Laziness is often the first driver of technological progress. Looking back at the history of technological
progress, people have invented home appliances in order to reduce the labor of household chores. To save
the pain of walking, human beings have made carriages, bicycles, motorcycles, cars, trains, and airplanes.
Humans hate repetitive and inefficient tasks. They are too lazy to do it themselves. Therefore, they let the
tools do it and automate the repetitive tasks. Driving is a relatively repetitive and inefficient task.
Developing tools to achieve automation is clearly in line with the trend of technological development.
From efficiency perspective, the popularity of intelligent autonomous vehicles can improve efficiency and
save time.

[h]Safety

The safety of intelligent autonomous vehicles is defined as their ability to keep people safe and reduce accident rates when the vehicles are operating properly. Research shows that intelligent autonomous vehicles are safer than the average human driver. Firstly, machines are much better at perception than humans. An autonomous vehicle has a variety of sharp sensors, radar, and cameras, which can perceive a wider range than a human driver can while driving. These hardware upgrades allow vehicles to "see" the world farther, wider, and more clearly than humans can. They can also look in many different directions at the same time, which is beyond the range of human perception. For example, Tesla has eight cameras mounted around the body (at the front, on the sides, and at the rear), providing a 360-degree view and a range of 250 meters. A forward-facing radar provides clearer and more accurate detection data in adverse weather conditions (such as rain, fog, and smog). All of these capabilities are beyond human perception. Therefore, autonomous vehicles can make decisions earlier than humans and react faster, which makes driving safer. Furthermore, vehicle-to-vehicle communication will become possible for autonomous vehicles; more communication in various scenarios will further improve safety.

Secondly, autonomous vehicles are more energetic than humans. Globally, fatigued driving has become one of the major causes of traffic accidents. According to the National Highway Traffic Safety Administration, there are about 100,000 traffic accidents on U.S. roads each year caused by drivers falling asleep while driving. Studies show that approximately 94% of crashes are caused by human error. The World Health Organization estimates that more than 1.3 million people die in road traffic accidents each year. The number of deaths and injuries from car accidents caused by distracted drivers continues to increase. Autonomous vehicles do not need a human driver; instead, the driver is a computer that runs a large amount of code and connects to different sensors inside and outside the vehicle. The data and sensors are connected to the cloud and can model the external environment around the vehicle in real time. In this way, the intelligent autonomous vehicle can anticipate the actions that need to be taken based on the current surrounding traffic conditions. These actions are performed consistently regardless of climate, environmental, and traffic conditions. People get fatigued, while autonomous vehicles do not. Thus, the safety level of autonomous vehicle driving is higher than that of human drivers.

Thirdly, autonomous vehicles are more rational than humans. People have emotions and may take dangerous actions out of panic or rage. Autonomous vehicles do not make these mistakes, which is a major advantage. At present, autonomous vehicles may be worse than humans at making decisions, especially in the face of various extreme situations and uncertainty. However, this area is constantly improving. One aim of intelligent autonomous vehicle research is to improve their ability to deal with various extreme situations and to cover as many possible extreme cases as possible, increasing safety. In addition, the optimal future situation is that all vehicles are intelligent and autonomous, which makes most driving behavior predictable. It is also acknowledged that autonomous vehicles cannot guarantee a 100% safety rate; they will have failures and flawed algorithms. However, humans cannot achieve a 100% safety rate either. As long as intelligent autonomous vehicles outperform humans, they will help reduce the number of deaths in traffic accidents worldwide each year.

[h]Key technologies needed for future intelligent autonomous vehicles

From a future perspective, intelligent autonomous vehicles will develop from low speed to high speed, from carrying goods to carrying people, and from commercial to civil use. An autonomous vehicle is a complex engineering system that requires the integration and precise cooperation of various technologies, including the algorithms, the client (on-vehicle) system, and the cloud platform. The algorithms include sensing, which is used to extract meaningful information from the raw sensor data; localization, which is used to precisely control the driving direction of the vehicle; and perception, which is used to understand the surrounding environment of the vehicle and provide safe and reliable planning for the vehicle's travel and arrival. To implement these algorithms, expertise is expected in the following areas: computer vision (including image classification, target detection, target recognition, and semantic segmentation); Kalman filtering (including vehicle position prediction, measurement, updating, and multi-sensor data fusion); Markov localization (including vehicle motion models for localization, Markov processes, Bayesian principles and position filtering, target observation localization, and particle filtering); vehicle control decisions (PID control, including lane departure error control and adaptive adjustment of PID hyperparameters); and model predictive control (MPC, including vehicle dynamics, vehicle trajectory prediction, and finding optimal execution parameters such as steering angle and acceleration). The client system consists of the operating system and the hardware system, which cooperate with the algorithms to meet the requirements of real-time performance, reliability, safety, and low energy consumption. The cloud platform provides offline computing and storage functions to support testing of continuously updated algorithms, generation of high-precision maps, and large-scale deep learning model training.
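
As a small concrete example of the Kalman filtering mentioned above, the following one-dimensional sketch fuses a constant-velocity prediction with a noisy position measurement in a single predict/update step. The noise variances and measurements are made up for illustration; a real vehicle estimator would be multidimensional and fuse several sensors.

```python
# One-dimensional Kalman filter sketch for vehicle position: predict with a
# constant-velocity model, then correct with a noisy position measurement.

def kalman_step(x, p, v, dt, z, q=0.1, r=1.0):
    """One predict/update cycle.
    x, p : position estimate and its variance
    v    : assumed velocity (m/s), dt: time step (s)
    z    : measured position, q/r: process/measurement noise variances (assumed)
    """
    # Predict
    x_pred = x + v * dt
    p_pred = p + q
    # Update
    k = p_pred / (p_pred + r)          # Kalman gain
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new

if __name__ == "__main__":
    x, p = 0.0, 1.0
    for z in [1.1, 2.05, 2.9]:          # noisy GPS-like readings at 1 s intervals
        x, p = kalman_step(x, p, v=1.0, dt=1.0, z=z)
        print(round(x, 3), round(p, 3))
```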

[h]Scenario operation

The future of intelligent autonomous driving also requires cooperation between technology and scenario operation. Structured scenarios can already support Level 4 intelligent autonomous vehicle operation. However, different scenarios have different requirements for technical details. Mining scenarios involve magnetic field interference, poor GPS signal, and a harsh, dusty working environment. Port scenarios often encounter rain and wet weather. Airport logistics requires high security. Under these conditions, intelligent autonomous vehicles need to be connected to the overall operation management and dispatching system. Therefore, a deep understanding of the scenario and targeted changes to the algorithms are needed in order to deploy intelligent autonomous vehicles. Also, continuous data acquisition through operation is needed to continuously iterate on the technology.

[h]Machine learning

The machine learning behind autonomous driving is an interesting problem. The dataset of autonomous driving is on-policy, meaning that it changes with the driving strategy. Also, not all data are useful: for autonomous driving, a lot of data are monotonous and repetitive, such as driving in nice weather with no cars or pedestrians around, which does not help much to improve the driving strategy. How to overcome the shortcomings of each sensor to provide real-time, accurate, and non-redundant environmental information is a further requirement of deep learning.

Another area that needs to be solved in the future is the iterative improvement of the driving strategy. Assume that the driving strategy performs poorly at the beginning and requires human intervention every kilometer; for every kilometer, some important data are collected, such as the video and radar data from the few seconds before the intervention. These data are then used to train the current model and learn a better strategy that avoids such situations. With better strategies, manual intervention is reduced to once every 10 kilometers, and the new data are used to train the current model again. Learning cycles are thus built up. From this learning process, it will be found that the better the driving strategy, the less frequently manual intervention occurs, the less useful training data can be obtained, and the harder it is to continue improving. As the quality of driving improves, the decay in the rate of manual intervention becomes very slow. The key question then is whether it can surpass the level of a real driver. Real scenarios contain a variety of strange and bizarre situations, and there will be some corner cases never seen by the algorithm. As long as the machine learning algorithm still needs a lot of data, the driving strategy seems unlikely to reach human level. Some people think these problems can be solved by adding more sensors. However, as the number of sensors increases, the quality of the signals improves at first, but then the deployment, maintenance, and coordination costs rise significantly.
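
The intervention-driven loop described above can be summarized schematically as below. Everything here is a toy model: the policy object, drive_and_log, and retrain are hypothetical placeholders, and the assumption that each retraining round halves the intervention rate exists only to illustrate the diminishing returns discussed in the text.

```python
# Toy sketch of the intervention-driven data loop: drive with the current policy,
# log data around each human takeover, retrain on those logs, repeat.

def drive_and_log(policy, km: float):
    """Pretend to drive `km` kilometres, returning (interventions, logged_clips)."""
    interventions = max(1, int(km * policy["intervention_rate"]))
    return interventions, [f"clip_{i}" for i in range(interventions)]

def retrain(policy, clips):
    """Assume each round of retraining on intervention clips halves the failure rate."""
    policy = dict(policy)
    policy["intervention_rate"] *= 0.5
    return policy

if __name__ == "__main__":
    policy = {"intervention_rate": 1.0}          # 1 intervention per km initially
    for generation in range(4):
        n, clips = drive_and_log(policy, km=100)
        policy = retrain(policy, clips)
        print(f"gen {generation}: {n} interventions per 100 km")
```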

Current research on perception algorithms is mainly focused on vision. Research on applying deep learning to target recognition, segmentation, and tracking is very active. However, what will happen if there is a person on the road wearing a t-shirt painted with a stop sign? More problems arise with adversarial examples: a stop sign with a few sticky notes may be recognized as a yield sign, and printing a few special patterns on clothing may make a person effectively invisible to the detector. These are still adversarial examples at the visual level. In the future, strategy-level adversarial problems may emerge as more self-driving cars become available.

Using artificial intelligence to make decisions is also imperative. At present, a series of research studies on decision making, such as deep reinforcement learning, predictive learning, and imitation learning, are gradually being carried out.

[h]Sensors

Autonomous vehicles require the use of multiple sensors. At the same time, this is a challenge for autonomous vehicles: how to balance these different sensing modalities. To overcome this challenge, an intelligent way to combine the various sensors is needed in order to provide a safe, high-quality autonomous driving system without significantly increasing size, weight, power, or cost. The improvement of vehicle sensing focuses on collecting data from individual sensors and applying sensor fusion strategies to maximize complementarity and compensate for the weaknesses of different sensors in various conditions. For example, cameras are excellent at recognizing signs and colors, but perform poorly in bad weather or low-light environments; these weaknesses can be compensated for by radar. However, current systems are mainly independent, with little interaction between individual sensing systems. Also, no single sensing approach can be applied to all applications or environmental conditions.
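
A very small example of the fusion idea, assuming each sensor's range estimate comes with an (invented) variance, is an inverse-variance weighted average: when fog inflates the camera's uncertainty, the radar automatically dominates the fused estimate. Real systems use far richer fusion (e.g., Kalman or particle filters over full object states); this is only a sketch.

```python
# Inverse-variance fusion of a camera range estimate and a radar range estimate.
# Variances are illustrative, not from any real sensor datasheet.

def fuse_ranges(camera_range, camera_var, radar_range, radar_var):
    """Inverse-variance weighted average of two range measurements (metres)."""
    w_cam = 1.0 / camera_var
    w_rad = 1.0 / radar_var
    return (w_cam * camera_range + w_rad * radar_range) / (w_cam + w_rad)

if __name__ == "__main__":
    # Clear day: both sensors trusted roughly equally.
    print(fuse_ranges(50.0, 1.0, 52.0, 1.0))      # ~51 m
    # Heavy fog: camera variance inflated, radar dominates the estimate.
    print(fuse_ranges(40.0, 25.0, 52.0, 1.0))     # ~51.5 m
```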

[h]Road condition recognition

For road condition recognition, vehicles need to monitor and recognize surrounding obstacles, traffic signals, pedestrians, and the states of other vehicles. Inertial Measurement Units (IMUs) can detect sudden jumps or deviations caused by potholes or obstacles. Through real-time connectivity, these data can be sent to a central database and used to warn other vehicles about potholes or obstacles. The same is true for camera, radar, LiDAR, and other sensor data. These data are compiled, analyzed, and fused so that the vehicle can use them to make predictions about its driving environment. This allows the vehicle to become a learning machine, promising better and safer decisions than humans make. To date, however, very few intelligent autonomous vehicles have been able to achieve this.
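
A hedged sketch of the IMU-based pothole flagging idea: watch vertical acceleration for spikes that exceed a threshold above a short running average. The window length and threshold are arbitrary illustrative choices, not values from any production system.

```python
# Flag samples where vertical acceleration jumps well above the recent average.

from collections import deque

def pothole_events(accel_z_samples, window=5, threshold=3.0):
    """Yield sample indices where vertical acceleration jumps past the threshold (m/s^2)."""
    recent = deque(maxlen=window)
    for i, a in enumerate(accel_z_samples):
        if len(recent) == window and abs(a - sum(recent) / window) > threshold:
            yield i
        recent.append(a)

if __name__ == "__main__":
    samples = [9.8, 9.7, 9.9, 9.8, 9.8, 14.5, 9.8, 9.7]   # spike at index 5
    print(list(pothole_events(samples)))                   # -> [5]
```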

[h]Storage

The data generated by intelligent autonomous vehicles are massive. These data are mostly images, sound, and other natural information. They are unstructured data, whose volumes are much larger than those of structured data. In addition, intelligent autonomous vehicles have very demanding real-time requirements. According to calculations, the amount of data collected by autonomous driving technology is as high as 500 GB to 1 TB per hour per vehicle. The core of the data flow in vehicles is the storage of data. The memory capacity of a typical computer is a few to dozens of gigabytes (GB); the memory capacity of a large server is in the hundreds of GB. It is nearly impossible to store all the data in the memory of the onboard computer of an intelligent autonomous vehicle under the existing hardware system. A potential solution is to load the data onto high-efficiency hard disk storage and send it to memory and the CPU or GPU for computation when needed.
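
A quick back-of-the-envelope check of the volumes quoted above shows why on-board RAM alone cannot hold the data and why staging on disk is suggested; the 8-hour figure is an assumed driving duration used only for illustration.

```python
# Rough arithmetic on the 0.5-1 TB/hour figure quoted in the text.

def daily_data_tb(hours_driven: float, tb_per_hour: float) -> float:
    return hours_driven * tb_per_hour

if __name__ == "__main__":
    for rate in (0.5, 1.0):                       # TB/hour, the range given above
        print(f"{rate} TB/h -> {daily_data_tb(8, rate):.1f} TB per 8-hour day")
```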

Another problem encountered in data storage for autonomous driving is the environment. A computer hard drive is a fragile component, yet vehicles are often used in harsh environments. When the vehicle is on the road, it experiences constant vibration, extreme weather, and even sudden power outages or accidents. The hard drive, which is the "data tank" for autonomous driving, must ensure efficient and stable operation under all these circumstances. Otherwise, the data will not reach the computing unit, which would lead to serious consequences.

[h]Electromagnetic interference (EMI)

There are two types of electromagnetic interference (EMI) emissions: conducted and radiated. Conducted emissions are coupled into the product through wires and traces. Since the noise is limited to a specific terminal or connector in the design, compliance with conducted emission requirements can usually be ensured relatively easily early in development with the help of a good layout or filter design. However, radiated emissions are a different matter. Anything on a circuit board that carries current radiates electromagnetic fields: every trace on a circuit board is an antenna, and every copper layer is a resonator. Anything other than a pure sine wave or DC voltage generates noise across the signal spectrum. The power-supply designer does not know how bad the radiated emissions will be until the system is tested, and to make things worse, radiated emissions can only be tested formally after the design is essentially complete. Filters are often used to attenuate signal strength at a specific frequency or over a certain frequency range, thereby reducing EMI. In addition, radiated energy that propagates through space can be attenuated by adding metal and magnetic shielding. EMI cannot be eliminated, but it can be attenuated to a level acceptable to other communications and digital devices. This is an area that needs further investigation in the future.
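
As a minimal illustration of frequency-selective attenuation, the sketch below evaluates the magnitude response of a first-order RC low-pass filter: signals well above the cutoff are attenuated at roughly 20 dB per decade. The cutoff frequency is an arbitrary example value; real EMI filters are usually higher-order and tuned to specific emission bands.

```python
# First-order RC low-pass response: |H(f)| = 1 / sqrt(1 + (f/fc)^2), expressed in dB.

import math

def rc_attenuation_db(freq_hz: float, cutoff_hz: float) -> float:
    """Magnitude response of a first-order RC low-pass filter, in dB."""
    ratio = freq_hz / cutoff_hz
    return -10.0 * math.log10(1.0 + ratio * ratio)

if __name__ == "__main__":
    cutoff = 1e6                                  # 1 MHz cutoff (example value)
    for f in (1e6, 1e7, 1e8):                     # at, 10x and 100x the cutoff
        print(f"{f:.0e} Hz: {rc_attenuation_db(f, cutoff):.1f} dB")
```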

[h]Chips

Under the trend of the "software-defined car," chips, operating systems, algorithms, and data form the closed loop of the intelligent autonomous vehicle computing ecosystem. Chips are the core of intelligent autonomous vehicle ecosystem development, which gives rise to a strong demand for autonomous vehicle chips. The value of the chips in an autonomous vehicle is not small: the value of all the chips in a cell phone is around a hundred dollars, but the value of all the chips in an autonomous vehicle is up to hundreds of dollars. At the same time, there are many kinds of chips in autonomous vehicles. Family cars need hundreds of chips, for functions ranging from tire pressure monitoring, sunroofs, and lights to back-up cameras, back-up radar, and remote-control keys. If any chip is missing, the car cannot be delivered.

In 2022, the global "chip shortage" still had not been effectively alleviated. The most acute shortage is of MCU chips, which are the most common chips in autonomous vehicles; window control, seat control, and the Electronic Control Unit (ECU) are inseparable from them. MCUs account for about 30% of the total number of chips in autonomous vehicles, and a car needs dozens to hundreds of MCU chips. The shortage has tempted many dealers to stockpile chips for high profits, which makes the shortage even more severe.

A chip is a component that contains integrated circuits. Chips can be roughly divided into two categories: functional chips (such as CPUs and communication base station processing chips) and storage chips. To manufacture a chip, the industry first needs to design it; the functions the chip will have are determined at this design step, which requires professionals to design the circuits. Next is production, which is also the most tedious step. Finally comes packaging, in which the finished chip is mounted into a saleable product. Among these three steps, the most difficult is the design, while the easiest is the packaging.

The chip crisis is bringing a change that reshapes the autonomous vehicle business model. First, "zero inventory" management can no longer adapt to this development; the single-minded pursuit of supply chain cost savings brings a greater risk of hidden dangers. Secondly, cooperation with chip foundries needs to be more open. Previously, vehicle chip foundries were relatively closed and were unable to respond flexibly to unexpected situations. How to cooperate with more chip suppliers on an equal footing is a problem that must be faced in the future. Finally, autonomous vehicle chips need to follow the chip industry's upgrades. Currently, vehicle chips are mainly produced on 8-inch production lines. However, from a technical point of view, 8-inch wafers are "past tense" compared with 12-inch wafers, and from an economic efficiency point of view, the 12-inch production line is more efficient than the 8-inch one.

Autonomous driving also makes the E/E (electrical/electronic) architecture of vehicles a new trend. The E/E architecture of a vehicle refers to the layout of the electrical and electronic systems in the vehicle design. The concept was first proposed by Delphi in 2007. The electrical architecture is dedicated to providing integrated electrical system layout solutions for automakers to address the increasing electrification of vehicles. The original E/E architecture was a distributed architecture: each controller targets one function, which makes adding a function simple and quick by adding a controller. In the era of largely mechanical vehicles, because there were not many electrical and electronic components, the distributed architecture was still adequate. As the level of automotive intelligence increased, the distributed architecture began to be caught off guard by the explosive growth of electrical and electronic components.

Today's autonomous vehicles not only need strong acceleration but also need to be able to see and hear in all directions and support human-machine interaction at any time. With so many functions to achieve, the number of ECUs has increased dramatically, the complexity of the wiring harness in the car has risen, and the maintenance and updating of electrical equipment has become cumbersome. How can these functions be kept from becoming "congested" during use, and how can different ECU functions be called at the same time to react quickly when dealing with complex instructions? The wave of vehicle electrification makes these problems even more difficult.

A new, centralized E/E architecture has emerged. It divides the whole vehicle into several domains according to the functions of the electronic components (such as the powertrain domain, vehicle safety domain, intelligent cabin domain, and intelligent driving domain). Then, a domain controller (DCU) with stronger processing power is used to unify the control of each domain. The emergence of the centralized architecture has led to a significant increase in the integration of vehicle functions: the roles of individual ECUs are consolidated under integrated management, and complex data processing and control functions are unified in the DCU. As the signals received and analyzed become more complex, the E/E architecture will also evolve toward multi-domain controllers (MDCs). Because of this new trend in automotive E/E architecture, the integration of controllers and the introduction of domain controllers will affect the chip market at the same time. DCUs, ECUs, and other electronic devices will also be further developed.

The centralized EE architecture also places new demands on the automotive software architecture. With the centralization of the EE architecture, domain controllers and the central computing platform are deployed in a hierarchical or service-oriented architecture, and the number of ECUs is significantly reduced. The underlying hardware platform needs to provide more powerful computing support, and the software is no longer developed for fixed hardware but must be portable, iterable, and scalable. Therefore, at the level of software architecture, automotive software is gradually being upgraded from a signal-oriented architecture to a service-oriented architecture to better decouple hardware and software and enable rapid software iteration.
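
To make the difference concrete, the following minimal Python sketch contrasts a signal-oriented read of a raw bus value with a service-oriented call behind an abstract interface. It is an illustration only: the names (SpeedService, CanSpeedProvider, plan_overtake), the byte layout and the scaling factor are hypothetical and do not correspond to any specific automotive middleware or AUTOSAR API.

```python
from dataclasses import dataclass

# --- Signal-oriented style: consumers depend on a fixed signal layout ---
def read_speed_signal(frame: bytes) -> float:
    # Hypothetical 2-byte big-endian speed signal at a fixed offset with fixed scaling
    raw = int.from_bytes(frame[0:2], "big")
    return raw * 0.01  # km/h, hard-wired to the hardware layout

# --- Service-oriented style: consumers depend only on an abstract service ---
class SpeedService:
    """Abstract service; the provider may sit on any ECU or domain controller."""
    def vehicle_speed_kmh(self) -> float:
        raise NotImplementedError

@dataclass
class CanSpeedProvider(SpeedService):
    frame: bytes
    def vehicle_speed_kmh(self) -> float:
        return read_speed_signal(self.frame)  # hardware detail hidden behind the service

def plan_overtake(speed_service: SpeedService) -> bool:
    # Application code is decoupled from the signal layout and can be ported or updated
    return speed_service.vehicle_speed_kmh() > 60.0

if __name__ == "__main__":
    provider = CanSpeedProvider(frame=bytes([0x1F, 0x40]))  # 0x1F40 = 8000 -> 80.0 km/h
    print(plan_overtake(provider))  # True
```

Because the application logic depends only on the service interface, the provider can move between ECUs or domain controllers without changing the consumer code, which is the decoupling the paragraph above describes.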

[h]Navigation

In the future, more advanced driver assistance features that can be synchronized with navigation and GPS
systems will be developed. Data on every situation that a self-driving car might encounter will be collected and accumulated. Mapping companies need to enhance 3D mapping data of cities. Automakers and high-
tech automotive system suppliers need to work closely with each other to ensure that light detectors,
LIDAR, radar sensors, GPS, and cameras work together.

[h]Iteration

There are currently two types of iterations: evolutionary and revolutionary. The evolutionary route
enables accumulation of more perception and driving data (such as scenario road conditions). These data
(such as calibrated image data and corner case driving road conditions) and decision algorithms can be
migrated to Level 4 intelligent autonomous vehicles. The revolutionary route aims to develop fully autonomous
vehicles directly. It remains unclear which path alone will be successful. However, the more likely
outcome is a symbiotic fusion of the two.

[h]How intelligent autonomous vehicles affect the future

The automobile was invented in 1886. In the more than 100 years since then, human social life has been transformed: daily life, housing, work, and entertainment have all changed dramatically. Autonomous vehicles may change human life once again in the following areas.

The value chain of the automobile industry will be reconstructed with the development of intelligent
autonomous vehicles. Nowadays, vehicles are mainly hardware devices, and safety is the core. However,
the core of the vehicles is changing from hardware equipment to IT services, including autonomous
driving systems and vehicle networking systems. In addition, automobile insurance is changing. In the future, insurance for human driving may no longer be needed, and without that demand insurance companies will need to seek new business. Insurance for autonomous vehicles, covering risks such as network security and system failure, may become a new trend.
Because of the installation of cameras, radar, LIDAR, and artificial intelligence systems, the cost of autonomous vehicles will be high. This means intelligent autonomous vehicles are more likely to be used first in specific transportation industries and by specific groups. Seniors and people with disabilities may be the first group of users, since intelligent autonomous vehicles give them the freedom to travel without relying on friends and family. Ride-hailing, buses, taxis, and logistics vehicles may be the first services to adopt autonomous vehicles, which would significantly reduce traffic congestion and environmental degradation. As public transportation changes, offline shopping malls will be disrupted too. From small street stores to large shopping malls, retail today usually depends on large parking lots underneath or nearby. In the future, however, once self-driving becomes widespread, cars will hardly ever sit idle, and people travelling and shopping will no longer need to consider whether there is a large parking lot near the mall. This will have a direct impact on shopping malls.

[h]Barriers in intelligent autonomous vehicles development

In the future, it seems inevitable that autonomous driving will replace human driving in most cases. However, these new technologies have not yet been tested over long periods of time and are not guaranteed to be free of pitfalls. The pitfalls mainly include reliability, legal, and ethical issues, and they may hinder the development of intelligent autonomous vehicles.

Reliability of intelligent autonomous vehicles is defined as the ability to perform a given function without
failure for a given period of time and under certain conditions. Failures can come from various directions
such as control system failures and being hacked. Generally, the more complex, automated, and intelligent a system is, the greater the risk of potential instability it faces. For example, in Arizona, USA, a self-driving road-test vehicle struck and killed a cyclist. There was a person behind the wheel at the time of the accident, but that person was not actually steering the vehicle. Accidents of this kind reduce public confidence in the ability of autonomous vehicles.

Ethical and legal questions are among the most serious issues that autonomous vehicles must resolve if they are to move forward, especially regarding the definition of responsibility for accidents. They include several sub-questions. First, can an autonomous driving system be a legally recognized driver? The function of driving a vehicle has historically been assigned to a licensed human driver, and this determination will directly affect whether a self-driving car can be considered a legal driver. Second is the issue of liability: if an autonomous vehicle malfunctions and causes damage, can the owner seek compensation from the insurance company, and how should liability be allocated between the human driver and the manufacturer in an accident? Legislation on intelligent autonomous vehicles depends on the simultaneous development of technology, regulations, and public consensus. Third is the issue of ethics: in an emergency, should a self-driving car protect the people inside the car or the people outside it? Traditional cars are centered on protecting the driver and passengers because a human driver makes the decision, but self-driving cars face such ethical controversies because their behavior can be pre-engineered. Moreover, when hitting one of two persons is unavoidable in certain situations, who should be the victim? Driven by Moore’s law, self-driving technology will continue to mature and become safer, yet legal and ethical issues may become the biggest obstacle to the development of intelligent autonomous vehicles.

[mh] Cloud Robotics and Autonomous Vehicles

The advancements in the vehicular industry have immensely benefited related industries and served humanity by increasing the efficiency of our routine activities. Take agriculture: a piece of land can be cultivated far more quickly today (using tractors and other equipment) than 100 years ago. The same applies to transportation: one can now travel from one point to another far more quickly than a few decades ago. However, the vehicular industry still requires further advances to reduce or eliminate human error and involvement. In the USA alone, motor-vehicle traffic injuries result in around 34,000 deaths every year and are the leading cause of death for people aged between 4 and 34 . Worldwide, over 1.2 million people die every year as a result of road traffic accidents, and between 20 and 50 million more suffer non-fatal injuries, including physical disabilities . More than 90 percent of these accidents are caused by human error . Hence, there is a need for technology that always pays attention to the road, never gets distracted, and requires little or no human involvement. Such goals can be achieved through autonomous vehicles.
Figure :Different types of vehicles used in real life applications for travelling in different mediums such
as land, water, air and space.
Vehicles can be generally categorized into on-road, off-road, water, aerial, space and amphibious vehicles. Figure(a)–(f) shows examples of different types of vehicles from real life. On-road vehicles require a paved or gravel surface for driving; they include sports cars, passenger cars, pickup trucks and school/passenger buses. Off-road vehicles are capable of driving both on and off paved surfaces. They are generally characterized by large deep-tread tires or caterpillar tracks, flexible suspension and open treads; tractors, bulldozers, tanks and 4WD army trucks are examples of this type. Both on-road and off-road vehicles use land as their medium of travel. Water vehicles are capable of travelling on water, under water, or both; ships, boats and submarines are examples. Aerial vehicles are capable of flying by gaining support from the air; aeroplanes, helicopters and drones are examples. Space vehicles are capable of travelling in outer space and are used to carry payloads, such as humans or satellites, between the Earth and outer space. They are rocket-powered vehicles, which also require an oxidizer in order to operate in the vacuum of space; spacecraft and rockets are examples. Amphibious vehicles inherit the characteristics of multiple travel mediums (land, water, air or space) and can travel efficiently in each of them. The AeroMobil 3.0 and the LARC-V (Lighter, Amphibious Resupply, Cargo, 5 ton) are examples of amphibious vehicles.

Automation of these different types of vehicles can increase the safety, reliability, robustness and efficiency of such systems through standardization of procedural operations with minimal human intervention. With advancing technology, autonomous vehicles have recently come to the forefront of public interest and active discussion. According to a recent survey conducted in the USA, UK and Australia, 56.8% of people had a positive opinion, 29.4% were neutral and only 13.8% had a negative opinion about autonomous or self-driving vehicles . These statistics give a good picture of the general public’s interest in autonomous vehicles; however, people still have high levels of concern regarding safety, privacy and performance. There are generally five levels of autonomous or self-driving vehicles, ranging from Level 0 to Level 4 . A brief description of these levels is as follows:

 •

Level 0 means no automation.

 •

Level 1 achieves critical function-specific automation.

 •

Level 2 achieves combined functions automation by coordinating two or more Level 1 functions.

 •

Level 3 provides limited self-driven automation.

 •

Level 4 provides completely self-driven unmanned vehicle. The vehicle controls all its operations
by itself.

These levels of automation are drafted for on-road vehicles; however, they can be extended to other types of vehicles. According to a recent survey conducted in the USA, UK and Australia, 29.9% of people were very concerned, 30.5% were moderately concerned, 27.5% were slightly concerned and only 12.1% were not at all concerned about riding in a Level-4 autonomous vehicle . These statistics show that major efforts are still required before autonomous vehicles become usable and acceptable in real-world practical applications.

Autonomous vehicles are an active area of research with numerous challenging applications. The earlier implementations of autonomous vehicles and other autonomous systems were standalone, and most existing implementations still operate independently . In such standalone implementations, the system is limited to its onboard capabilities, such as memory, computation, data and programs, and the vehicles cannot interact with each other or access each other’s information or information about their surroundings . To achieve Level-4 autonomous
vehicles and self-driven automation in other robotic systems, it is important to overcome these limitations
and go beyond the onboard capabilities of such systems. With the advent of Internet and emerging
advances in the cloud robotics paradigm, new approaches have been enabled where systems are not
limited to the onboard capabilities and the processing is also performed remotely on the cloud to support
different operations. The cloud-based implementation of different types of vehicles has been illustrated in
Figure. The cloud-based infrastructure has potential to enable a wide range of applications and new
paradigms in robotics and automation systems. Autonomous vehicle is one example of such systems,
which can highly benefit from cloud infrastructure and can overcome the limitations posed by standalone
implementations. Cloud infrastructure enables ubiquitous, convenient and on-demand network access to a
shared pool of configurable computing resources . These computing resources can include services,
storage, servers, networks and applications . These resources can be rapidly provisioned and released with
minimal service provider interaction or management effort . The cloud model is characterized by five essential characteristics (on-demand self-service, broad network access, resource pooling, rapid elasticity and measured service), three service models and four deployment models ; see Ref. for more detail.

Figure :Example of vehicular cloud where vehicles act as nodes to access shared pool of computing
resources, including services, storage, servers, networks and applications.
An example of a cloud-based implementation is the online document-processing facility offered by Microsoft through Office 365 and OneDrive. One can perform different operations online, such as creating, editing and sharing MS Office documents, without installing MS Office locally. The documents’ data and the software installations reside on remote cloud servers, which can be accessed through the Internet. These servers share their computing capabilities, such as processors, storage and memory. Cloud-based implementations provide economies of scale and take care of software and hardware updates. The infrastructure also facilitates backup of data as well as sharing of resources across different applications and users.

Cloud-enabled robots and autonomous systems are not limited to onboard capabilities and rely on data
from a cloud network to support their different operations. In 2010, James Kuffner explained the potential
benefit of cloud-enabled robots and coined the term “Cloud Robotics” . The vehicular industry is rapidly evolving; nowadays, vehicles are equipped with a wide range of sensors and cloud connectivity, which provide drivers with desired information such as weather forecasts, GPS location, traffic and road conditions, directions and the time to reach the destination via different alternative paths and speeds. Through cloud robotics, we can develop a network of autonomous vehicles , in which autonomous vehicles collaborate with each other or perform computing activities that they cannot perform locally, in order to achieve their well-defined utility functions . Google’s self-driving car
demonstrates the cloud-robotics-based implementation of the autonomous vehicles. It uses cloud services
for the accurate localization and manoeuvring. Google tested the autonomous vehicle deployment on
different types of cars including Audi TT, Lexus RX450h, Toyota Prius and their own custom vehicle .
As of March 2016, Google had tested their autonomous self-driven vehicles a total of 1,498,214 miles
(2,411,142 km) . The project is limited to an on-road implementation of the autonomous vehicle, and it still has many limitations that need to be addressed before it can be released commercially. These limitations include driving in heavy rain or snowy weather, driving through unmapped intersections or routes, unnecessary veering due to difficulty in identifying objects, and other limitations of the LIDAR technology in detecting certain signals , etc.

The rest of the chapter is organized as follows. In Section, we present the historical background of the evolution of autonomous vehicles, cloud computing and cloud-enabled autonomous vehicles; in this section we also present different high-level architectures of autonomous vehicles and cloud-enabled autonomous vehicles proposed in the literature. In Section, we discuss five potential benefits of cloud-enabled autonomous vehicles, namely cloud computing, big data, open-source/open-access, system learning and crowdsourcing. Section describes active research challenges and possible future directions in the field, and the conclusion appears in Section.

Autonomous vehicles are an active area of research with a rich history. Research on autonomous driving performed in the early 1980s and 1990s demonstrated the possibility of developing vehicles that can control their movements in complex environments . Initial prototypes of autonomous vehicles were limited to indoor use . The first study on a vision-guided road vehicle was performed at the Mechanical Engineering Laboratory, Tsukuba, Japan, using vertical stereo cameras . In the early 1980s, a completely onboard autonomous vision system for vehicles was developed and deployed using digital microprocessors . The first milestone towards the development of road vehicles with machine vision was reached in the late 1980s, with fully autonomous longitudinal and lateral control demonstrated by the computer-vision test vehicle VaMoRs of UBM (Universität der Bundeswehr München) on a free stretch of Autobahn over 20 km at 96 km/h . These encouraging results led to the inclusion of computer vision for vehicles in the European EUREKA project “Prometheus” (1987–1994) and also spurred European car manufacturers and universities to pursue research in the field . The major focus of these developments was to demonstrate which functions of autonomous vehicles could be automated through computer vision . The promising results of such demonstrations kicked off many initiatives .

The Intermodal Surface Transportation Efficiency Act (ISTEA), a transportation authorization bill passed in 1991, instructed the United States Department of Transportation (USDOT) to demonstrate an autonomous vehicle and highway system by 1997 . The USDOT built partnerships with academia, state, local and private sectors in conducting the program, and made extraordinary progress with revolutionary developments in vehicle safety and information systems . The USDOT program also motivated the US Federal Highway Administration (FHWA) to start the National Automated Highway System Consortium (NAHSC) program with different partners, including California PATH, General Motors, Carnegie Mellon University, Hughes, Caltrans, Lockheed Martin, Bechtel, Parsons Brinckerhoff and Delco Electronics . The Intelligent Vehicle Initiative (IVI) program was announced in 1997 and authorized in the 1998 Transportation Equity Act for the 21st Century (TEA-21), with the purpose of speeding up the development of driver-assistance systems by focusing on crash prevention rather than mitigation and on vehicle-based rather than highway-based applications . The IVI program resulted in a number of successful developments and deployments in field operational tests, including commercial applications such as lane change assist, lane departure/merge warning, adaptive cruise control, adaptive forward crash warning and vehicle stability systems . Commercial versions of these systems were manufactured shortly afterwards, and their evolution and industrial penetration have been increasing ever since .

The increasing advances in computational and sensing technologies have further spurred interest in the field of autonomous vehicles and in developing cost-effective systems . Present state-of-the-art autonomous vehicles can sense their local environment, identify different objects, reason about how their environment evolves and plan complex motions while obeying various rules. Many advances have been made in the last decade and a half, as evidenced by different successful demonstrations and competitions. The most prominent historical series of such competitions was organized by the US Department of Defense under the Defense Advanced Research Projects Agency (DARPA). The competition was initially launched as the DARPA Grand Challenge (DGC), and five such challenges have been held so far: the first in March 2004, the second in October 2005, the third in November 2007, the fourth from October 2012 to June 2015 and the fifth from January to April 2013 . The fourth challenge, named the DARPA Robotics Challenge (DRC), was aimed at the development of semi-autonomous emergency maintenance ground robots , and the fifth challenge, named Fast Adaptable Next-Generation Ground Vehicle (FANG GV), was aimed at adaptive designs of vehicles . Neither FANG nor DRC targeted self-driven/autonomous/robotic vehicles; hence, they are not discussed further in this chapter. In the first three challenges, hundreds of autonomous vehicles from the USA and around the world participated and exhibited different levels of capability. The first and second challenges were aimed at examining the vehicles’ ability in an off-road environment: autonomous vehicles had to navigate a desert course of up to 240 km at speeds up to 80 km/h . In the first competition only five vehicles travelled more than a mile, and the furthest of them covered only 7.32 miles (11.78 km) . None of the vehicles completed the route; hence, there was no winner, and the second challenge was scheduled for October 2005. In the second challenge, five vehicles successfully completed the route, and Stanley from Stanford University, Palo Alto, California, took first place. The third challenge, named the DARPA Urban Challenge (DUC), was shifted to an urban area. The route involved 60 miles of travel to be completed within 6 h, and vehicles had to obey all traffic regulations during their autonomous driving. Out of the 11 final teams, only 6 successfully completed the route, and Boss from Carnegie Mellon University, Pittsburgh, Pennsylvania, took first place .
The teams participating in the DGC competitions adopted different types of system architectures, all of them standalone implementations. At a high level, however, they decomposed the system architecture into four basic subsystems, namely sensing, perception, planning and control (see Figure for a pictorial representation) . The sensing unit takes raw measurements from different on-/off-board sensors (e.g. GPS, radar, LIDAR, odometer, vision, inertial measurement unit, etc.) for perceiving the static and dynamic environment. The sensing unit passes the raw data to the perception unit, which then generates usable information about the vehicle (e.g. pose, map-relative estimates) and its environment (e.g. lanes, other vehicles, obstacles) based on the provided data. The planner unit takes these estimates from the perception unit, reasons about the provided information and plans the vehicle’s actions in the environment (path, behavioural, escalation and map planning, etc.) so as to maximize its well-defined utility function. Finally, the planner unit passes the resulting commands to the control unit, which is responsible for actuating the vehicle.

Figure :High-level system architecture of the autonomous vehicles.
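
As a rough illustration of this four-subsystem decomposition, the following self-contained Python sketch wires sensing, perception, planning and control into a single loop. All names and numbers (RawMeasurement, WorldEstimate, the 5 m obstacle threshold, the hard-coded sensor values) are hypothetical placeholders and are not taken from any DARPA team's implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RawMeasurement:          # output of the sensing unit (GPS, LIDAR, odometry, ...)
    sensor: str
    values: List[float]

@dataclass
class WorldEstimate:           # output of the perception unit
    pose: List[float]          # e.g. [x, y, heading]
    obstacles: List[List[float]]

@dataclass
class Plan:                    # output of the planner unit
    target_speed: float
    steering_angle: float

def sense() -> List[RawMeasurement]:
    # Placeholder: in a real vehicle these come from hardware drivers
    return [RawMeasurement("gps", [10.0, 5.0]), RawMeasurement("lidar", [12.0, 5.0])]

def perceive(measurements: List[RawMeasurement]) -> WorldEstimate:
    gps = next(m for m in measurements if m.sensor == "gps")
    lidar = [m.values for m in measurements if m.sensor == "lidar"]
    return WorldEstimate(pose=[gps.values[0], gps.values[1], 0.0], obstacles=lidar)

def plan(world: WorldEstimate) -> Plan:
    # Slow down if any obstacle is within 5 m ahead of the estimated pose
    near = any(abs(o[0] - world.pose[0]) < 5.0 for o in world.obstacles)
    return Plan(target_speed=2.0 if near else 10.0, steering_angle=0.0)

def control(p: Plan) -> None:
    print(f"actuate: speed={p.target_speed} m/s, steering={p.steering_angle} rad")

if __name__ == "__main__":
    control(plan(perceive(sense())))   # one pass of the sense-perceive-plan-control loop
```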

One of the main lessons learned from the DARPA challenges was the need for autonomous vehicles to be connected, that is, to be able to interact with each other and to have access to each other’s information or information about their surroundings . This also gives some idea of the importance of cloud infrastructure in realizing the vision of autonomous vehicles. Cloud computing emerged and evolved during the 2000s. Amazon introduced its Elastic Compute Cloud (EC2) as a web service in 2006 , aiming to provide resizable computing capacity on cloud servers . In 2008, NASA’s OpenNebula became the first open-source software to provide private and hybrid clouds . In the same year, Azure was announced by Microsoft, aiming to provide cloud computing services, and it was released in early 2010 . In mid-2010, the OpenStack project was jointly launched by NASA and Rackspace Hosting, with the intention of helping organizations set up cloud computing services (mostly IaaS) on standard hardware . Oracle Cloud was announced by Oracle in 2012, with the aim of providing access to an integrated set of IT solutions, including SaaS, PaaS and IaaS .

The importance of connecting machines in manufacturing automation systems through networking was realized over three decades ago, when General Motors developed the Manufacturing Automation Protocol in the 1980s . Before the advent of the World Wide Web (WWW), different types of incompatible protocols were adopted by different vendors. In the early 1990s, the advent of the WWW promoted the Hypertext Transfer Protocol (HTTP) over the Internet Protocol (IP) . In 1994, the first industrial robot was connected to the WWW so that it could be teleoperated by different users through a graphical user interface . In the mid- and late 1990s, different types of robots were integrated with the web to explore robustness and interface issues, which initiated study in a new field known as “Networked Robotics” . In 1997, Inaba et al. investigated the benefits of remote computing to accomplish control of remote-brained robots . The IEEE Robotics and Automation Society established its technical committee on networked robotics in 2001 . The initial focus of the committee was on Internet-based teleoperated robots, which was later extended to a wider range of applications . In 2006, the MobEyes system was proposed, which exploits vehicular sensor networks to record the surrounding environment and events for the purpose of urban monitoring. The RoboEarth project was announced in 2009, with the purpose of creating a WWW for robots, such that they can share their data and learn from each other . In 2010, James Kuffner explained the concept of “remote-brained” robots (i.e. the physical separation of the robot hardware and the software) and introduced the term “Cloud Robotics”, together with potential applications and benefits of cloud-enabled robots . In the same year, the term “Internet of Things (IoT)” was introduced to exploit the network of physical things (e.g. vehicles, household appliances, buildings, etc.) that consist of sensors, software and network connectivity for exchanging information with other objects . Different vehicular ad-hoc networks (VANETs) were proposed in 2011, with the purpose of providing several cloud services for next-generation automotive systems . The term “Industry 4.0” was introduced in the same year for the fourth industrial revolution, with the purpose of using networking to build on the first three industrial revolutions .

In 2012, M. Gerla discussed different design principles, issues and potential applications of Vehicular Cloud Computing (VCC) . In the same year, S. Kumar et al. proposed an Octree-based cloud-assisted design for the autonomous driving of vehicles, to assist them in planning their trajectories . The high-level system architecture can be visualized as presented in Figure. The purpose of the sensing, planner and controller units is the same as explained for the previous Figure, with the modification that the perception unit is merged into the planner unit and the planner unit is divided into two sub-units, namely the onboard planner and the planner over the cloud. The two planner units can communicate with each other to exchange the desired information and to support vehicle-to-vehicle (V2V) communication. The cloud planner can request sensor data from the various autonomous vehicles, which is then aggregated to generate information about obstacles, path planning, localization, emergency control, etc. The onboard planner unit communicates with the cloud planner to plan the optimal trajectory and passes the resulting commands to the controller unit, which then actuates the vehicle as required.
Figure :High-level system architecture of the cloud-assisted design of the autonomous vehicles.
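
The split between an onboard planner and a planner over the cloud can be sketched in plain Python as below. The CloudPlanner here is simulated in-process rather than being a real networked service, and all class names, thresholds and speeds are illustrative assumptions rather than part of the cited design.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class SensorReport:
    vehicle_id: str
    obstacles: List[List[float]]     # obstacle positions reported by this vehicle

class CloudPlanner:
    """Aggregates reports from many vehicles and answers planning queries."""
    def __init__(self) -> None:
        self.global_obstacles: Dict[str, List[List[float]]] = {}

    def upload(self, report: SensorReport) -> None:
        self.global_obstacles[report.vehicle_id] = report.obstacles

    def suggest_speed(self, vehicle_id: str, position: List[float]) -> float:
        # Use everyone's reports (V2V via the cloud), not just this vehicle's own data
        all_obs = [o for obs in self.global_obstacles.values() for o in obs]
        near = any(abs(o[0] - position[0]) < 10.0 for o in all_obs)
        return 3.0 if near else 15.0

class OnboardPlanner:
    def __init__(self, vehicle_id: str, cloud: CloudPlanner) -> None:
        self.vehicle_id, self.cloud = vehicle_id, cloud

    def step(self, position: List[float], local_obstacles: List[List[float]]) -> float:
        self.cloud.upload(SensorReport(self.vehicle_id, local_obstacles))
        return self.cloud.suggest_speed(self.vehicle_id, position)   # passed to the controller

if __name__ == "__main__":
    cloud = CloudPlanner()
    a = OnboardPlanner("car_a", cloud)
    b = OnboardPlanner("car_b", cloud)
    a.step([0.0, 0.0], [[50.0, 0.0]])   # car_a reports a far-away obstacle
    print(b.step([45.0, 0.0], []))      # car_b is warned through the cloud: prints 3.0
```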

In the same year, 2012, the term “Industrial Internet” was introduced by General Electric, with the purpose of connecting industrial equipment over networks to exchange data . In 2014, Gerla et al. investigated the vehicular cloud and concluded that it will be a core system for autonomous vehicles and will make further advances possible . In the same year, Ashutosh Saxena announced the “RoboBrain” project, with the aim of building a massive online brain for all the robots of the world from publicly available Internet data . In early 2016, HERE announced the launch of its cloud-based mapping service for autonomous vehicles, aiming to enhance the automated driving features of vehicles . In February 2016, Maglaras et al. investigated the concept of the Social Internet of Vehicles (SIoV) and discussed its design principles, potential applications and research issues .

[h]Potential benefits

As discussed in the previous section, cloud-based automation has gained massive interest from researchers around the world and has sparked many initiatives, such as RoboEarth, remote-brained robots, IoT, Industry 4.0, VCC, the Industrial Internet, RoboBrain and SIoV. In this section, we discuss the potential of the cloud to enhance vehicle automation (and that of robotic automation systems in general) by improving performance through five potential benefits, as follows:

[h]Cloud computing

Autonomous vehicles require intensive parallel computation to process sensor data and to plan efficient paths in real-world environments . It is certainly not practical to deploy massive onboard computing power with each autonomous vehicle; such deployments would be cost-intensive and would still face limitations in parallel processing. The cloud provides massively parallel, on-demand computation, up to the computing power of supercomputers , which is not possible in standalone onboard implementations. Nowadays, a wide range of commercial offerings (including Amazon’s EC2 , Microsoft’s Azure and Google’s Compute Engine ) provide cloud computing services, with the aim of giving access to tens of thousands of processors for on-demand computing tasks . Initially, such services were used mainly by web and mobile app developers; however, they are increasingly used in technical high-performance applications. Cloud computing can be used for computationally intensive tasks, such as quantifying uncertainties in models, sensing and control, analysing videos and images, growing large sampling-based graphs (e.g. rapidly-exploring random trees, RRTs), mapping, etc. . Many applications require real-time processing of computational tasks; in such applications the cloud can be prone to varying network latency and quality of service (QoS), which is an active research area .
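
One common way to cope with this, sketched below under simplifying assumptions, is to give each offloaded request a deadline and fall back to a coarser onboard computation when the cloud does not answer in time. The functions cloud_trajectory and onboard_trajectory are stand-ins (the "cloud" is just a slow local function here), not a real service API.

```python
import concurrent.futures
import time

def cloud_trajectory(samples: int) -> str:
    time.sleep(0.5)                      # simulated network + server time
    return f"fine trajectory from {samples} cloud-sampled candidates"

def onboard_trajectory() -> str:
    return "coarse trajectory from a small onboard candidate set"

def plan_with_deadline(deadline_s: float) -> str:
    """Offload when the cloud answers in time, otherwise keep the onboard plan."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(cloud_trajectory, samples=100_000)
        try:
            return future.result(timeout=deadline_s)
        except concurrent.futures.TimeoutError:
            future.cancel()
            return onboard_trajectory()

if __name__ == "__main__":
    print(plan_with_deadline(deadline_s=1.0))   # cloud result arrives in time
    print(plan_with_deadline(deadline_s=0.1))   # falls back to the onboard planner
```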

[h]Big data

Big data refers to extremely large collections of datasets that cannot be handled with conventional database systems and that require analysis to find patterns, associations, trends, etc. . Autonomous vehicles require access to vast amounts of data, for example sensor network data, maps, images, videos, weather forecasts, programs and algorithms, which cannot be maintained on board and which surpass the processing capabilities of conventional database systems. Cloud infrastructure offers access to virtually unlimited, on-demand elastic storage on cloud servers, which can store such large collections of big data and also facilitate intensive computations on them . Shared access to big datasets can also enable more accurate machine learning for autonomous vehicles, which helps the planners make better decisions. It is important to recognize that big datasets may require high-performance IaaS tools for performing intensive computations on such gigantic amounts of data; these may include Amazon’s EC2 , Microsoft’s Azure and Google’s Compute Engine , as described in the previous section. Active research challenges in cloud-based big data storage include defining cross-platform formats, working with sparse representations for efficient processing, and developing new approaches that are more robust to dirty data .

[h]Open-source/open-access

Open-source refers to free access to the original source code of software (or to the design models, in the case of hardware), which can be modified and redistributed without discrimination . Open-access refers to free access to algorithms, publications, libraries, designs, models, maps, datasets, standards, competitions, etc. . In an open set-up, different organizations and researchers contribute and share such resources to facilitate their development, adoption and distribution. For standalone autonomous vehicles, it is not possible to maintain such open-source software and resources and to take full advantage of these facilities. Cloud infrastructure helps by providing well-organized access to such pools of resources . A prominent example of the success of open resources in the scientific community is the Robot Operating System (ROS), which provides robotics tools and libraries that facilitate the development of robotics applications . Furthermore, many simulation tools and libraries (e.g. GraspIt, Bullet, Gazebo, OpenRAVE, etc.) are available open-source and can be customized to the application’s requirements, which can certainly speed up research and development activities.

[h]System learning

System learning refers to the collective learning of all the agents (e.g. autonomous vehicles) in the system. Autonomous vehicles need to learn from each other’s experiences: for example, if a vehicle identifies a new situation that was not part of the initial system, then the learning outcome of that instance needs to be reflected in all the vehicles in the system. Accomplishing such goals is not possible with standalone implementations of autonomous vehicles . Cloud infrastructure enables shared access to the data , and instances of physical trials and new experiences are also stored in that shared pool for the collective learning of all the vehicles. Instances can hold initial and anticipated conditions, boundary conditions and the outcome of the execution. A good example of collective learning is the “Lightning” framework, which indexes the paths of different robots in the system over several tasks and then uses cloud computing for path planning and for adapting those paths to new situations .
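
In the same spirit (though not the actual Lightning implementation), the sketch below shows a shared pool of past planning instances hosted "in the cloud": one vehicle records what worked in a situation, and another reuses the nearest prior experience. The Experience fields and the two-number situation encoding are illustrative assumptions.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Experience:
    situation: Tuple[float, float]   # e.g. (obstacle distance in m, road curvature in 1/m)
    action: str                      # plan that worked in that situation

class SharedExperiencePool:
    """Cloud-hosted pool written to and read by every vehicle in the fleet."""
    def __init__(self) -> None:
        self._pool: List[Experience] = []

    def record(self, exp: Experience) -> None:
        self._pool.append(exp)

    def nearest(self, situation: Tuple[float, float]) -> Optional[Experience]:
        if not self._pool:
            return None
        return min(self._pool, key=lambda e: math.dist(e.situation, situation))

if __name__ == "__main__":
    pool = SharedExperiencePool()
    # Vehicle A encounters a new situation and stores what worked
    pool.record(Experience((12.0, 0.02), "slow to 20 km/h and keep lane"))
    # Vehicle B later faces a similar situation and reuses A's experience
    match = pool.nearest((11.0, 0.025))
    print(match.action if match else "no prior experience")
```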

[h]Crowdsourcing

Crowdsourcing can be defined as a process for obtaining desired information, services, input or ideas on a specific task (one that surpasses computer capabilities) from humans, typically over the Internet . In the case of autonomous vehicles, crowdsourcing can be used to solve a number of problems; for example, during operation vehicles may identify new obstacles or routes that were not previously labelled and that require human input. Standalone implementations limit the vehicles’ ability to accomplish such objectives . Cloud-enabled systems facilitate crowdsourcing activities with a specific crowd or with the general cloud crowd . Cloud-based crowdsourcing has captured much attention from researchers and enterprises seeking to automate different processes . A prominent example of cloud-based crowdsourcing is Amazon’s Mechanical Turk (MTurk), which provides a marketplace for tasks that surpass computer capabilities and require human intelligence .

[h]Research challenges and future directions

In this section, we have summarized different potential research challenges and future directions for
cloud-enabled autonomous vehicles.

 •

Effective load balancing: New algorithms and policies are required for balancing computations
between vehicle’s onboard and cloud computers.

 •

Scalable parallelization: Advances in cloud infrastructure are required so that cloud computing parallelization schemes can scale with the size of the autonomous vehicular system.

 •

Effective sampling and scaling of data: New algorithms and approaches are required, which scale
to the size of big data and are more robust to dirty data .

 •

Ensure privacy and security: The data collected by different autonomous vehicles (using sensors, cameras, route maps, etc.) can include potential secrets (e.g. private home data, corporate business plans, etc.), and over the cloud it can be prone to theft or criminal use. Hence, the privacy and security of data over the cloud need to be ensured.

 •

Ensure control and safety: The control of autonomous vehicles over the cloud can be exposed to potential hacking threats. A hacker could remotely control the vehicle and use it for unethical purposes or to cause damage. Hence, the control and safety of the vehicle need to be ensured.

 •

Cope with varying network latency and QoS: For real-time applications, new algorithms and
approaches are needed to handle varying network latency and QoS.

 •

Fault-tolerant control: For an autonomous vehicular system, failures can lead to undesirable hazardous situations and hence are not acceptable. New approaches are required for onboard and cloud-based fault-tolerant control.

 •

Verification and validation of the system: A primary problem for the autonomous vehicular system
is the ability to substantiate that the system can operate safely, effectively and robustly, with safety
being the major concern . New methods are required for verifying and validating the desired
functioning of the autonomous vehicular system .

 •

Standards and protocols: Research into new standards and protocols is required, for example to define cross-platform data formats and to work with sparse representations for efficient processing.

 •

Crowdsourcing quality control: Crowdsourcing is normally prone to generating noisier or erroneous data . Hence, new mechanisms are required for improving and ensuring the quality of data collected through crowdsourcing.

Chapter 2: Robotic Mechanisms

[mh] Design Considerations for Robotic Mechanisms

The mechatronic design of robotic hands is a very complex task, which involves different aspects of
mechanics, actuation, and control. In most cases, inspiration is taken from the human hand, which is able to grasp and manipulate objects of different sizes and shapes, but whose functionality and versatility are very difficult to mimic. Human hand strength and dexterity involve a complex geometry of cantilevered joints, ligaments, and musculotendinous elements that must be analyzed as a coordinated entity. Furthermore, the actuation redundancy of the muscles generates forces across joints and tissues, while perception ability and intricate mechanics complicate its dynamic and functional analyses.
Considering these factors, it is evident that the design of highly adaptable, sensor-based robotic hands is still quite a challenging objective, and in a number of cases the resulting devices are still confined to the research laboratory.

A number of robotic hand implementations can be found in the literature. The selection of leading hand designs reported here is limited in scope, addressing mechanical architecture rather than control or sensing schemes. Moreover, because this work concentrates on finger synthesis and design, the thumb is excluded from the description, as are two-fingered constructions, because most of them were designed to work as grippers and would not integrate into a multi-finger configuration.

Significant tendon-operated hands are the Stanford/JPL hand and the Utah/MIT hand. The first one has three 3-DOF fingers, each of which has a double-jointed head knuckle providing 90° of pitch and yaw and a distal knuckle with a range of ±135°. The Utah/MIT dextrous hand has three fingers with 4-DOFs each; each digit of this hand has a non-anthropomorphic head-knuckle design that excludes circumduction. The inclusion of three fingers minimizes reliance on friction and adds redundant support to manipulation tasks. Each N-DOF finger is controlled by 2N independent actuators and tension cables. Although these two prototypes exhibit good overall behaviour, they suffer from limited power transmission capability.

The prototype of the DLR hand possesses specially designed actuators and sensors integrated in the hand’s palm and fingers. This prototype has four fingers with 3-DOFs each: a 2-DOF base joint gives ±45° of flexion and ±30° of abduction/adduction, and a 1-DOF knuckle provides 135° of flexion. The distal joint, which is passively driven, is capable of flexing 110°.

A prototype of an anthropomorphic mechanical hand with pneumatic actuation has been developed at the Polytechnic of Turin; it has four fingers with 1-DOF each and is controlled through PWM-modulated digital valves.

Following this latter basic idea, several articulated finger mechanisms with only 1-DOF were designed and built at the University of Cassino, and some prototypes were developed that allow suitable grasping tests to be carried out on different objects.

More recently, the concept of underactuation was introduced and used for the design of articulated finger mechanisms at Laval University, Québec.

The underactuation concept concerns the possibility of designing a mechanical system with fewer control inputs than DOFs. Thus, underactuated robotic hands can be considered a good compromise between manipulation flexibility and reduced control complexity, and they are attractive for a large number of applications, both industrial and non-conventional.

[h]The underactuation concept

Over the last decades, increasing interest has been focused on the design and control of underactuated mechanical systems, which can be defined as systems whose number of control inputs (i.e. active joints) is smaller than their number of DOFs. This class of mechanical systems can be found in real life; examples include, but are not limited to, surface vessels, spacecraft, underwater vehicles, helicopters, road vehicles, and robots.

The underactuation property may arise from one of the following reasons:
 • the dynamics of the system (e.g. aircraft, spacecraft, helicopters, underwater vehicles);
 • the need for cost reduction or practical purposes (e.g. satellites);
 • actuator failure (e.g. in a surface vessel or aircraft).

Furthermore, underactuation can also be imposed artificially to obtain complex low-order nonlinear systems for gaining insight into control theory and developing new strategies. However, the benefits of underactuation extend beyond a simple reduction of mechanical complexity, in particular for devices in which the distribution of wrenches is of fundamental importance. An example is the automobile differential, in which an underactuated mechanism is commonly used to distribute the engine power to two wheels. The differential incorporates an additional DOF to balance the torque delivered to each wheel. The differential fundamentally operates on wheel torques instead of rotations; aided by passive mechanisms, the wheels can rotate along complex relative trajectories, maintaining traction on the ground without closed-loop active control.

Several examples found in robotics can be considered underactuated systems, such as legged robots, underwater and flying robots, and grasping and manipulation robots.

In particular, underactuated robotic hands are an intermediate solution between robotic hands for manipulation, which have the advantages of being versatile and guaranteeing a stable grasp but are expensive, complex to control and equipped with many actuators, and robotic grippers, whose advantages are simplified control and few actuators but which have the drawbacks of being task-specific and of performing a less stable grasp.

In an underactuated mechanism, actuators are replaced by passive elastic elements (e.g. springs) or limit switches. These elements are small and lightweight and allow a reduction in the number of actuators. They may be considered as passive elements that increase the adaptability of the mechanism to the shape of the grasped object, but they cannot, and need not, be handled by the control system.

The correct choice of the arrangement and functional characteristics of the elastic elements or mechanical limits (mechanical stops) ensures the proper execution of the grasping sequence. In a generic grasping sequence, with a regularly shaped object in a fixed position, the different phases can be clearly distinguished, as shown in Fig..

In Fig.a the finger is in its initial configuration and no external forces are acting. In Fig.b the proximal phalanx is in contact with the object. In Fig.c the middle phalanx, after a rotation relative to the proximal phalanx, comes into contact with the object. In this configuration, the first two phalanges cannot move, because of the object itself. Finally, in Fig.d, the finger has completed its adaptation to the object, and all three phalanges are in contact with it. A similar sequence can be described for an irregularly shaped object, as shown in Fig., where it is worth noting the adaptation of the finger to the irregular object shape.

An underactuated mechanism allows objects to be grasped in a way that is more natural and closer to the movement of the human hand. The geometric configuration of the finger is automatically determined by the external constraints related to the shape of the object and does not require coordinated actuation of the individual phalanges. It is important to note that the sequences shown in Fig. and Fig. can be obtained with a continuous motion given by a single actuator.

Few underactuated finger mechanisms for robotic hands have been proposed in the literature. Some of them are based on linkages, while others are based on tendon-actuated mechanisms. Tendon systems are generally limited to rather small grasping forces and suffer from friction and elasticity. Hence, for applications in which large grasping forces are required, linkage mechanisms are usually preferred, and this Chapter focuses on the study of the latter type of mechanism.

Figure :A sequence for grasping a regularly shaped object: a) starting phase; b) first phalange is in its final configuration; c) second phalange is in its final configuration; d) third phalange is in its final configuration.

Figure :A sequence for grasping an irregularly shaped object: a) starting phase; b) first phalange is in its final configuration; c) second phalange is in its final configuration; d) third phalange is in its final configuration.

An example of underactuation based on cable transmission is shown in Fig.a; it consists of a cable system which, when properly tensioned, acts so as to close the fingers and grasp the object.

Underactuation based on link transmission, or linkages, consists of a mechanism with multiple DOFs in which an appropriate use of passive joints enables the object to be completely enveloped, so as to ensure a stable grasp. An example of this system is shown in Fig.b. This type of solution for robotic hands has been developed for industrial or space applications with the aim of increasing functionality without overly complicating the mechanism, while ensuring good adaptability to the grasped object.
Figure :Examples of underactuation systems: a) tendon-actuated mechanism; b) linkage mechanism; c)
differential mechanism; d) hybrid mechanism.

A differential mechanism, shown in Fig.c, is a device, usually but not necessarily based on gears, capable of transmitting torque and rotation through three shafts. It is almost always used in one of two ways: either it receives one input and provides two outputs, as found in most automobiles, or it combines two inputs to create an output that is the sum, difference, or average of the inputs. Differential mechanisms have unique features, such as the ability to control many DOFs with a single actuator together with mechanical stops or elastic limits. The differential gear commonly used in cars distributes the engine torque to the two driven wheels according to the torque acting on each wheel. Applying this solution to robotic hands, the actuation can be distributed to the joints according to the reaction forces acting on each phalanx during operation.
Hybrid solutions have also been developed; they make use of planetary gears and linkages, together with mechanical stops or elastic elements. An example is shown in Fig.d.

[h]Design of underactuated finger mechanism

An anthropomorphic robotic finger usually consists of 2-3 hinge-like joints that articulate the phalanges. In addition to the pitch enabled by a pivoting joint, the head knuckle sometimes also provides yaw movement. The condyloid nature of the human metacarpophalangeal joint is often approximated by two rotary joints or, as in the case under study, simplified to a single revolute joint.

Keeping the size and shape of the robot hand consistent with its human counterpart facilitates automatic grasping and the sensible use of conventional tools designed for human finger placement. This holds true for many manipulative applications, especially in prosthetics and tele-manipulation, where an accurate model of the human hand enables more intuitive control of the slave device. Regarding the actuation system, in most cases the adopted solutions do not attempt to mimic human capabilities but only retain some of the pertinent characteristics of force generation, since the complex functionality of tendons and muscles has to be replaced and somewhat simplified by linear or rotary actuators and revolute joints.

The design of the finger mechanism proposed here uses the concept of underactuation applied to mechanical hands. Specifically, underactuation allows the use of n – m actuators to control n DOFs, where m passive elastic elements replace actuators, as shown in Fig.. Thus, the concept of underactuation is used to design a suitable finger mechanism for mechanical hands, which can automatically envelop objects of different sizes and shapes through simple, stable grasping sequences and does not require active coordination of the phalanges. Referring to Figs. 4 and 5, the underactuated finger mechanism of Ca.U.M.Ha. (Cassino-Underactuated-Multifinger-Hand) is composed of three links mj for j = 1, 2, 3, which correspond to the proximal, median and distal phalanges, respectively. The dimensions of the simplified sketch reported in Fig. have been chosen according to the overall characteristics of the human finger given in Table. In particular, in Fig., θiM are the maximal angles of rotation, and the torsion springs are denoted by S1 and S2. In the kinematic scheme of Fig., the two four-bar linkages A, B, C, D and B, E, F, G are connected in series through the rigid body B, C, G for transmitting the motion to the median and distal phalanges, where the rigid body A, D, P represents the distal phalange. Like the human finger, the links mj (j = 1, 2, 3) are provided with suitable mechanical stops in order to avoid hyper-extension and hyper-flexion of the finger mechanism. Both revolute joints in A and B are provided with torsion springs in order to obtain a statically determinate system in each configuration of the finger mechanism.
Figure :Simplified sketch of underactuated finger mechanism.

Phalanx    Length         Angle
m1         l1 = 43 mm     θ1M = 83°
m2         l2 = 25 mm     θ2M = 105°
m3         l3 = 23 mm     θ3M = 78°

Table:Characteristics of an index human finger.


Figure :Kinematic sketch of the underactuated finger mechanism.

[h]Optimal kinematic synthesis

The optimal dimensional synthesis of the function-generating linkage shown in Fig., which is used as the transmission system from the pneumatic cylinder to the three phalanges of the proposed underactuated finger mechanism, is formulated by using Freudenstein’s equations and the transmission defect as an index of merit of the force transmission. The three linkages connected in series are synthesized in the following, starting from the four-bar linkage that moves the third phalanx.

[h]Synthesis of the four-bar linkage A, B, C, D

By considering the four-bar linkage A, B, C, D in Fig., and referring to Fig., Freudenstein’s equations can be expressed in the form
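In the classical three-precision-point formulation, these equations can be sketched as follows; the assignment of the coefficients K1, K2 and K3 to the link lengths assumes the textbook convention and is indicative of, rather than identical to, the original Eq. (1), and analogous forms apply to the linkage BEFG and, with modifications, to the slider-crank EHI:

```latex
% Three-precision-point Freudenstein equations for the four-bar ABCD
% (frame l_2; moving links a, b, c; input angles \varepsilon_i; output angles \rho_i).
% The mapping of K_1, K_2, K_3 onto the link lengths assumes the classical convention.
K_1 \cos\varepsilon_i - K_2 \cos\rho_i + K_3 = \cos(\varepsilon_i - \rho_i), \qquad i = 1, 2, 3,
\qquad
K_1 = \frac{l_2}{a}, \quad
K_2 = \frac{l_2}{c}, \quad
K_3 = \frac{a^2 - b^2 + c^2 + l_2^2}{2\,a\,c}.
```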

where l2 is the length of the second phalanx, a, b and c are the lengths of the links AD, DC and CB
respectively, and εi and ρi for i = 1, 2, 3 are the input and output angles of the four-bar linkage ABCD.

Equations (1) can be solved when three positions 1), 2) and 3) of both links BC and AD are given through the pairs of angles (εi, ρi) for i = 1, 2, 3. According to a suitable mechanical design of the finger (a zoomed view is reported in Fig.), some design parameters are assumed, such as α = 50° for the link AD, and γ = 40° and β1 = 25° for the link BC; the pairs of angles (ε1 = 115°, ρ1 = 130°) and (ε3 = 140°, ρ3 = 208°) are then obtained for the starting 1) and final 3) configurations, respectively. Angle ρ3 is given by the sum of ρ1 and θ3M. Since only two of the three pairs of angles required by Freudenstein’s equations are assigned as design specifications of the function-generating four-bar linkage ABCD, an optimization procedure in terms of force transmission has been developed, assuming as starting values for (ε2, ρ2) the middle positions between 1) and 3) of links BC and AD, respectively.

The transmission quality of the four-bar linkage is defined as the integral of the square of the cosine of the transmission angle, and the complement of this quantity is defined as the “transmission defect” z’. The optimal values of the pair of angles (ε2, ρ2) are obtained through the optimization of the transmission defect z’. In particular, the outcome of the computation has given (ε2 = 132.5°, ρ2 = 180.1°) and, consequently, a = 22.6 mm, b = 58.3 mm and c = 70.9 mm have been obtained from Eq. (1) and Eq. (2).
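
As a numerical illustration of the three-precision-point step (not of the transmission-defect optimization itself), the following self-contained Python sketch solves the linear Freudenstein system for the precision points quoted above and recovers link lengths under the classical coefficient mapping given earlier. Because that mapping and the angle conventions are assumptions, the output will not in general reproduce the dimensions a = 22.6 mm, b = 58.3 mm and c = 70.9 mm reported in the text.

```python
import math

def freudenstein_synthesis(pairs_deg, frame):
    """Three-precision-point synthesis of a function-generating four-bar linkage.

    pairs_deg : three (input, output) angle pairs in degrees, e.g. (eps_i, rho_i)
    frame     : length of the fixed link (taken here as l2)
    Returns (a, b, c) under the classical mapping K1 = frame/a, K2 = frame/c,
    K3 = (a^2 - b^2 + c^2 + frame^2) / (2*a*c), which is an assumption.
    """
    # Build the linear system K1*cos(eps) - K2*cos(rho) + K3 = cos(eps - rho)
    A, rhs = [], []
    for eps_deg, rho_deg in pairs_deg:
        eps, rho = math.radians(eps_deg), math.radians(rho_deg)
        A.append([math.cos(eps), -math.cos(rho), 1.0])
        rhs.append(math.cos(eps - rho))

    # Solve the 3x3 system by Cramer's rule (no external libraries needed)
    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)
    K = []
    for col in range(3):
        m = [row[:] for row in A]
        for i in range(3):
            m[i][col] = rhs[i]
        K.append(det3(m) / d)
    K1, K2, K3 = K

    a = frame / K1
    c = frame / K2
    b = math.sqrt(a * a + c * c + frame * frame - 2.0 * a * c * K3)
    return a, b, c

if __name__ == "__main__":
    # Precision points quoted in the text for the ABCD linkage (l2 = 25 mm)
    pairs = [(115.0, 130.0), (132.5, 180.1), (140.0, 208.0)]
    a, b, c = freudenstein_synthesis(pairs, frame=25.0)
    print(f"a = {a:.1f} mm, b = {b:.1f} mm, c = {c:.1f} mm")
```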

It is worth noting that, as reported in Fig.a to Fig.c, these plots give many design solutions; the choice can be related to the specific application and design requirements. In the case under study, the parameters ε2 and ρ2 have been chosen so as to maximize the mean value of the transmission angle. The transmission angle μ1 versus the input angle ε for the synthesized mechanism is shown in Fig.d.

The Figure shows a parametric study of the parameters a, b, c as a function of ε2 and ρ2. The colour scale represents the relative link length. In each plot the circle represents the choice made for ε2 and ρ2, assuming the length a = 23 mm for the case under study.
Figure :Sketch for the kinematic synthesis of the four bar linkage ABCD, shown in Fig..

Figure :Mechanical design detail used to define the angle α and the link length a of A, B, C, D in Fig..
Figure :Map of the link length versus the angles ε2 and ρ2: a) link AD; b) link DC; c) link BC; d) transmission angle μ1 versus angle ε for the moving link c.

[h]Synthesis of the four-bar linkage B, E, F, G

The same method has been applied to the synthesis of the function-generating four-bar linkage BEFG. In fact, referring to Fig., Freudenstein’s equations can be expressed in an analogous form, where l1 is the length of the first phalanx, d, e and f are the lengths of the links BG, GF and FE, respectively, and ψi and φi for i = 1, 2, 3 are the input and output angles of the four-bar linkage BEFG.

As for the four-bar linkage ABCD, Eq. (5) can be solved when three positions 1), 2) and 3) of both links EF and BG are given through the pairs of angles (ψi, φi) for i = 1, 2, 3. In particular, according to a suitable mechanical design of the finger, the design parameters γ = 40° and β2 = 30°, together with an angle of 10°, are assumed empirically. Consequently, the pairs of angles (ψ1 = 80°, φ1 = 60°) and (ψ3 = 140°, φ3 = 190°) are obtained for the starting 1) and final 3) positions of both links EF and BG.

Since only two of the three pairs of angles required by Freudenstein's equations are assigned as design specifications of the function-generating four-bar linkage BEFG, an optimization procedure in terms of force transmission has been carried out by assuming as starting values of the optimization the pair (ψ2, φ2) that corresponds to the middle positions, between 1) and 3), of links EF and BG, respectively. The transmission defect z′ of the function-generating four-bar linkage BEFG takes the form
The optimal values of the pair of angles (ψ2, φ2) are obtained and the outcome of the computation gives (ψ2 = 115.5°, φ2 = 133.7°). Consequently, d = 53.4 mm, e = 96.3 mm and f = 104.9 mm have been obtained from Eq. (5) and Eq. (6). Figure shows a parametric study of the d, e, f parameters as a function of ψ2 and φ2. The colour scale represents the relative link length. For each plot, the circle represents the choice that has been made for ψ2 and φ2 for the case under study. The diagram of the transmission angle µ2 versus the input angle ψ of the moving link EF of the synthesized mechanism BEFG is shown in Fig. d.

Figure :Sketch for the kinematic synthesis of the four-bar linkage BEFG.
Figure :Map of the link length versus the angles ψ2 and φ2; a) link BG; b) link GF; c) link EF; d) transmission angle μ2 versus angle ψ of the moving link EF.

[h]Synthesis of the slider-crank mechanism EHI

As for both four-bar linkages ABCD and BEFG, the offset slider-crank mechanism EHI of Fig. is synthesized by using Freudenstein's equations, which take the form where o_f is the offset, g and h are the lengths of the links EH and HI respectively, and xi and λi for i = 1, 2, 3 are the input displacement of the piston and the output rotation angle of the link EH of the slider-crank mechanism EHI. Eq. (9) can be solved when three positions 1), 2) and 3) of both the piston and link EH are given through the pairs of parameters (xi, λi) for i = 1, 2, 3. In particular, according to a suitable mechanical design of the finger, the design parameters (x1 = 0 mm, λ1 = 37°) and (x3 = 75 mm, λ3 = 180°) are assumed empirically for the starting 1) and final 3) positions of both the piston and link EH. The optimization procedure in terms of force transmission has been carried out by assuming as starting values of the optimization the middle positions, between 1) and 3), of the piston and link EH, respectively. The transmission defect z′ of the function-generating slider-crank mechanism EHI takes the form

The optimal values of the pair of parameters (x2, λ2) are obtained, and the outcome of the computation has given (x2 = 47.5 mm, λ2 = 126.9°). Consequently, o_f = 43.4 mm, g = 35.7 mm and h = 74.7 mm have been obtained from Eq. (9) and Eq. (10).

Figures 12a, 12b and 12c show a parametric study of the parameters g, h and o_f as a function of λ2 and x2. The colour scale represents the relative link length and, for each plot, the marked circle represents the choice that has been made for the values λ2 and x2. The diagram of the transmission angle µ3 versus the input displacement x of the moving piston of the synthesized slider-crank mechanism EHI is shown in Fig.d.
Figure :Kinematic scheme of the offset slider-crank mechanism EHI.
Figure :Map of the link length versus the parameters λ2 and x2; a) link EI; b) link HI; c) link EH; d) transmission angle μ3 versus the distance x of the moving link.

[h]Mechanical design

Figure shows a front-view drawing of the designed underactuated finger mechanism. In particular, EHI indicates the slider-crank mechanism, ABCD indicates the first four-bar linkage, and BEFG indicates the second four-bar linkage. In order to obtain the underactuated finger mechanism, two torsion springs (S1 and S2) have been used at joints A and B, indicated with 1 and 2, respectively.

Aluminium has been selected for its lightness and low cost. It has the disadvantage of low hardness; therefore, ferrules have been considered for the manufacturing of the revolute joints. In particular, in Fig., it is possible to note that the linkage that produces the finger motion is always on the upper side of the phalanges. This avoids mechanical interference between the grasped object and the links of the mechanism. Furthermore, the finger is asymmetric, because a suitable side is needed on which to mount the torsion springs.

Figure :Mechanical design of the underactuated finger.

Each phalanx has a flat surface to interact with the object to be grasped. This flat surface also allows force sensors to be added in order to develop a suitable force control of the robotic hand prototype, as reported in Section.

The combined operation of the four underactuated fingers gives an additional auto-adaptability to the Ca.U.M.Ha. robotic hand, because each finger can reach a different closure configuration according to the shape and size of the object to grasp. This behaviour is due to the uniform distribution of the air pressure inside the pneumatic tank and pushing chambers, as will be described below.

[h]Actuation and control

The layout of the electro-pneumatic circuit of the proposed closed-loop pressure control system is sketched in Fig., where the pressure POUT in the rigid tank is controlled by means of two PWM-modulated pneumatic digital valves V1 and V2, which are connected in supply, at the supply pressure PS, and in exhaust, at the atmospheric pressure PA, respectively.

Thus, both valves V1 and V2 approximate the behaviour of a three-way proportional flow valve, which allows the pressure regulation in the tank. These valves are controlled through the voltage control signals VPWM1 and VPWM2, which are modulated in PWM at 24 V, as required by the valves V1 and V2. These signals are given by a specific electronic board supplied at 24 V, which generates both signals VPWM1 and VPWM2 and amplifies to 24 V the input signal VPWM, which lies within a lower voltage range. The PWM-modulated control signal VPWM is generated via software by means of a suitable LabVIEW program.

The feedback signal VF/B is given by the pressure transducer Tp, with static gain KT = 1 V/bar, which is installed directly on the rigid tank.
Thus, a typical PID compensation of the error between the input electric signal VSET and the feedback electric signal VF/B is carried out through a PC controller, which is equipped with the electronic board PCI-6052-E and a terminal block SCB-68.
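As a minimal sketch of this control idea (not the LabVIEW implementation used in the project), the following Python loop computes a PID action on the pressure error and maps it onto duty cycles for the supply and exhaust valves; the gains, the sampling period and the duty-cycle mapping are assumptions.

KP, KI, KD = 1.8, 0.5, 0.0          # assumed PID gains (the chapter studies Kp between 0.3 and 2.4)
T = 0.05                            # assumed sampling/PWM period in seconds (T = 50 ms)
integral, prev_error = 0.0, 0.0

def pid_step(v_set, v_fb):
    """One PID step on the voltage error VSET - VF/B; returns duty cycles for V1 (supply) and V2 (exhaust)."""
    global integral, prev_error
    error = v_set - v_fb
    integral += error * T
    derivative = (error - prev_error) / T
    prev_error = error
    u = KP * error + KI * integral + KD * derivative
    u = max(-1.0, min(1.0, u))      # saturate the control action to one PWM period
    if u >= 0.0:
        return u, 0.0               # positive action: modulate the supply valve V1
    return 0.0, -u                  # negative action: modulate the exhaust valve V2

# Hypothetical loop; read_setpoint(), read_pressure_feedback() and set_pwm() stand in for the DAQ interface:
#   d1, d2 = pid_step(read_setpoint(), read_pressure_feedback())
#   set_pwm(valve=1, duty=d1); set_pwm(valve=2, duty=d2)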

Figure :Scheme for the pressure control of the robotic hand prototype finger.

[h]Experimental test-bed

The closed-loop pressure control system and the test bed of Fig. have been designed and built according to the scheme of Fig.. In particular, this test bed is mainly composed of: 1) and 2), two 2/2 (two-way/two-position) pneumatic digital valves of type SMC VQ21A1-5Y0-C6-F-Q; 3) a Festo tank with a volume of 0.4 l; 4) a pressure transducer of type GS Sensor XPM5-10G, connected to an electronic board of type PCI-6052-E with the terminal block SCB-68, which is connected to the PC in order to generate the control signal VPWM; 5) a specific electronic board to split and amplify at 24 V the control signals VPWM1 and VPWM2.

The electronic circuit of Fig.b splits and amplifies the modulated electric signal VPWM that comes from the PWM driver into the signals VPWM1 and VPWM2, which control the digital valves V1 and V2, respectively. This circuit is composed of a photodiode FD, three equal electric resistors R1, a MOSFET M and a diode D. In fact, the working range of the electronic board NI DAQ AT MIO-16E-2 is amplified from its original range to the working range [0 / +24] V of the digital valves by means of the electric supply at 24 V DC. Moreover, this signal is decomposed and sent alternately to V1 and V2 by means of the MOSFET M.
A suitable software application in the form of a virtual instrument has been conceived and implemented by using the LabVIEW software, as shown in Fig.. This solution gives the possibility of using the electronic board NI DAQ PCI-6052-E for driving the PWM-modulated pneumatic digital valves and acquiring both voltage signals VSET and VF/B of the proposed closed-loop pressure control system. Thus, the program can be considered as composed of three main blocks: the first acquires the analog signals at a suitable scan rate, the second performs the PID compensation of the pressure error, and the third generates the PWM signal.

Figure :Test-bed a) of the proposed closed-loop pressure control system and b) a scheme of the electronic
circuit.
Figure :LabVIEW program for controlling the pressure in the tank through PWM-modulated pneumatic digital valves.

[h]Experimental results

The static and dynamic performances of the proposed closed-loop pressure control system have been analyzed by using the test bed of Fig.. Some experimental results in the time domain are reported in Fig. in order to show the effects of the proportional gain Kp of the PID compensator. In particular, the reference and output pressure signals PSET and POUT are compared by increasing the value of the proportional gain Kp from 0.3 to 2.4, as shown in Figs. 17a to 17d, respectively. Taking into account that the pressure transducer Tp is characterized by a static gain KT = 1 V/bar, the pressure diagrams of PSET and POUT show the same shape and values as the corresponding voltage diagrams VSET and VF/B, respectively. Moreover, the diagram of Fig.c shows a good behaviour at high values of Kp, even if some instability of the system may appear, as shown in Fig.d for Kp = 2.4. The experimental closed-loop frequency response of the proposed pressure control system has been obtained by using a Gain-Phase Analyzer of type SI 1253. The Bode diagrams of Fig.a and 18b have been obtained for periods of the PWM modulation of T = 50 ms and T = 100 ms, respectively.

Thus, the diagrams of the pressure signals PSET and POUT versus time, which have been acquired through the LabVIEW data-acquisition system, are shown in continuous and dash-dot lines, respectively. In particular, Fig.a and 19b show both frequency responses of Fig.a and 18b in the time domain for a sinusoidal PSET pressure signal with frequency f = 0.1 Hz, average value Av = 3 bar rel and amplitude A = 2 bar rel. Likewise to the diagrams of Fig., and still referring to the Bode diagrams of Fig., the frequency responses in the time domain for a PSET with frequency f = 1.5 Hz are shown in Fig.a and 20b for T = 50 ms and T = 100 ms, respectively.

Figure :Effects of the proportional gain: a) Kp = 0.3; b) Kp = 0.9; c) Kp = 1.8; d) Kp = 2.4.


Figure :Closed-loop frequency responses of the proposed pressure control system for different periods of
the PWM modulation; a) T = 50 ms; b) T = 100 ms.

Figure :Frequency responses in the time domain for a sinusoidal PSET with f = 0.1 Hz, Av = 3 bar rel and A = 2 bar rel: a) T = 50 ms; b) T = 100 ms.
Figure :Frequency responses in the time domain for a sinusoidal PSET with f = 1.5 Hz, Av = 3 V and A = 2 V: a) T = 50 ms; b) T = 100 ms.

In this chapter, the mechatronic design of the Ca.U.M.Ha. (Cassino-Underactuated-Multifinger-Hand) robotic hand has been reported. In particular, the underactuation concept has been addressed through several examples, and the kinematic synthesis and the mechatronic design have been developed for a finger mechanism of the robotic hand. As a result, the Ca.U.M.Ha. robotic hand shows a robust and efficient design, which gives good flexibility and versatility in grasping operations at low cost. The kinematic synthesis and optimization of the underactuated finger mechanism of Ca.U.M.Ha. have been formulated and implemented. In particular, two function-generating four-bar linkages and one offset slider-crank mechanism have been synthesized by using Freudenstein's equations and optimizing the force transmission, which can be considered a critical issue because of the large rotation angles of the phalanxes. A closed-loop pressure control system based on PWM-modulated pneumatic digital valves has been designed and experimentally tested in order to determine and analyze its static and dynamic performances. The proposed and tested closed-loop control system is applied to the Ca.U.M.Ha. robotic hand in order to control the actuating force of the pneumatic cylinders of the articulated fingers. Consequently, a control of the grasping force has been developed and tested according to a robust and low-cost design of the robotic hand.

Chapter 3: Sensing Technologies for Robots

[mh] Role and Importance of Sensors in Robotics

In robotics, perception is understood as a system that endows the robot with the ability to perceive,
comprehend, and reason about the surrounding environment. The key components of a perception system
are essentially sensory data processing, data representation (environment modeling), and ML-based
algorithms, as illustrated in Figure. Since strong AI is still far from being achieved in real-world robotics
applications, this chapter is about weak AI, i.e., standard machine learning approaches.
Figure :Key modules of a typical robotic perception system: sensory data processing (focusing here on
visual and range perception); data representations specific for the tasks at hand; algorithms for data
analysis and interpretation (using AI/ML methods); and planning and execution of actions for robot-
environment interaction.

Robotic perception is crucial for a robot to make decisions, plan, and operate in real-world environments,
by means of numerous functionalities and operations from occupancy grid mapping to object detection.
Some examples of robotic perception subareas, including autonomous robot-vehicles, are obstacle detection, object recognition, semantic place classification, 3D environment representation, gesture and voice recognition, activity classification, terrain classification, road detection, vehicle detection, pedestrian detection, object tracking, human detection, and environment change detection.

Nowadays, most robotic perception systems use machine learning (ML) techniques, ranging from classical to deep-learning approaches. Machine learning for robotic perception can be in the form of unsupervised learning, supervised classifiers using handcrafted features, deep-learning neural networks (e.g., convolutional neural networks (CNNs)), or even a combination of multiple methods.

Regardless of the ML approach considered, data from sensor(s) are the key ingredient in robotic
perception. Data can come from a single or multiple sensors, usually mounted onboard the robot, but can
also come from the infrastructure or from another robot (e.g., cameras mounted on UAVs flying nearby).
In multiple-sensor perception, whether single-modality or multimodal, an efficient approach is usually necessary to combine and process the data from the sensors before an ML method can be employed. Data
alignment and calibration steps are necessary depending on the nature of the problem and the type of
sensors used.
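As a small illustration of such an alignment step, the following sketch applies an assumed extrinsic calibration, expressed as a 4x4 homogeneous transform, to bring points from a sensor frame into a common robot base frame before fusion; the transform values and frame names are placeholders rather than parameters from any particular system.

import numpy as np

def to_base_frame(points_sensor, T_base_sensor):
    """Transform an (N, 3) array of points from the sensor frame into the robot base frame."""
    n = points_sensor.shape[0]
    homogeneous = np.hstack((points_sensor, np.ones((n, 1))))      # (N, 4) homogeneous points
    return (T_base_sensor @ homogeneous.T).T[:, :3]

# Placeholder extrinsics: sensor mounted 0.2 m above the base and rotated 90 degrees about z (assumed values).
T_base_sensor = np.array([[0.0, -1.0, 0.0, 0.00],
                          [1.0,  0.0, 0.0, 0.00],
                          [0.0,  0.0, 1.0, 0.20],
                          [0.0,  0.0, 0.0, 1.00]])

points = np.array([[1.0, 0.0, 0.5], [0.5, 0.2, 0.3]])
print(to_base_frame(points, T_base_sensor))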

Sensor-based environment representation/mapping is a very important part of a robotic perception system.


Mapping here encompasses both the acquisition of a metric model and its semantic interpretation, and is
therefore a synonym of environment/scene representation. This semantic mapping process uses ML at
various levels, e.g., reasoning on volumetric occupancy and occlusions, or identifying, describing, and
matching optimally the local regions from different time-stamps/models, i.e., not only higher level
interpretations. However, in the majority of applications, the primary role of environment mapping is to
model data from exteroceptive sensors, mounted onboard the robot, in order to enable reasoning and
inference regarding the real-world environment where the robot operates.

Robot perception functions, like localization and navigation, are dependent on the environment where the
robot operates. Essentially, a robot is designed to operate in two categories of environments: indoors or
outdoors. Therefore, different assumptions can be incorporated in the mapping (representation) and
perception systems considering indoor or outdoor environments. Moreover, the sensors used are different
depending on the environment, and therefore, the sensory data to be processed by a perception system
will not be the same for indoors and outdoors scenarios. An example to clarify the differences and
challenges between a mobile robot navigating in an indoor versus outdoor environment is the ground, or
terrain, where the robot operates. Most indoor robots assume that the ground is regular and flat, which somewhat simplifies the environment representation models; on the other hand, for field (outdoor) robots, the terrain is quite often far from regular and, as a consequence, the environment modeling is itself a challenge; without a proper representation, the subsequent perception tasks are negatively affected. Moreover, outdoors, robotic perception has to deal with weather conditions and with variations in light intensity and spectrum.

Similar scenario-specific differences exist in virtually all use-cases of robotic vision, as exemplified by the 2016 Amazon Picking Challenge participants' survey, which points to the need for complex yet robust solutions; picking is therefore considered one of the most difficult tasks in the pick-and-place application domain. Moreover, one of the participating teams from 2016 benchmarked a pose estimation method on a warehouse logistics dataset and found large variations in performance depending on clutter level and object type. Thus, perception systems currently require expert knowledge in order to select, adapt, extend, and fine-tune the various employed components.

Apart from the increased training data sizes and robustness, the end-to-end training aspect of deep-
learning (DL) approaches made the development of perception systems easier and more accessible for
newcomers, as one can obtain the desired results directly from raw data in many cases, by providing a
large number of training examples. The method selection often boils down to obtaining the latest
pretrained network from an online repository and fine-tuning it to the problem at hand, hiding all the
traditional feature detection, description, filtering, matching, optimization steps behind a relatively unified
framework. Unfortunately, at the moment an off-the-shelf DL solution for every problem does not exist,
or at least no usable pretrained network, making the need for huge amounts of training data apparent.
Therefore, large datasets are a valuable asset for modern AI/ML. A large number of datasets exist for perception tasks as well, with a survey of RGB-D datasets presented by Firman, and even tools for synthetically generating sensor-based datasets, e.g., the work presented by Handa et al., which is available online: http://robotvault.bitbucket.org/. However, the danger is to overfit to such benchmarks, as the deployment environment of mobile robots is almost sure to differ from the one used in teaching the robot to perceive and understand the surrounding environment. Thus, the suggestions formulated by Wagstaff still hold true today and should be taken to heart by researchers and practitioners.
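A minimal sketch of this fine-tuning workflow, assuming PyTorch and torchvision are available; the backbone choice, the number of classes and the hyperparameters are placeholders, and the task-specific data loading is left out.

import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10          # placeholder: number of object classes in the target task

# Obtain a pretrained backbone and adapt its last layer (torchvision >= 0.13 weight naming assumed).
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Optionally freeze the early layers and fine-tune only the classifier head.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc")

optimizer = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a batch from a task-specific dataset (data loading not shown)."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()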

As pointed out recently by Sünderhauf et al., robotic perception (also designated robotic vision) differs from traditional computer vision in the sense that, in robotics, the outputs of a perception system result in decisions and actions in the real world. Therefore, perception is a very important part of a complex, embodied, active, and goal-driven robotic system. As exemplified by Sünderhauf et al., robotic perception has to translate images (or scans, or point-clouds) into actions, whereas most computer vision applications take images and translate the outputs into information.

Among the numerous approaches used in environment representation for mobile robotics, and for autonomous robotic-vehicles, the most influential approach is occupancy grid mapping. This 2D mapping is still used in many mobile platforms due to its efficiency, probabilistic framework, and fast implementation. Although many approaches use 2D-based representations to model the real world, 2.5D and 3D representation models are presently becoming more common. The main reasons for using higher-dimensional representations are essentially twofold: (1) robots are required to navigate and make decisions in more complex environments where 2D representations are insufficient; (2) current 3D sensor technologies are affordable and reliable, and therefore 3D environment representations have become attainable. Moreover, recent advances in software tools, like ROS and PCL, and also the advent of methods like Octomap, developed by Hornung et al., have been contributing to the increased use of 3D environment representations.
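To make the occupancy-grid idea concrete, the following sketch keeps a small 2D grid in log-odds form and updates the cells traversed by a single range measurement; the cell size, the inverse sensor model and the sampling-based ray traversal (used here instead of a proper Bresenham walk) are simplifying assumptions.

import numpy as np

GRID = np.zeros((100, 100))                              # log-odds occupancy grid
RES = 0.05                                               # assumed cell size: 5 cm
L_OCC, L_FREE = np.log(0.7 / 0.3), np.log(0.3 / 0.7)     # assumed inverse sensor model

def integrate_beam(x0, y0, x1, y1, n_steps=100):
    """Update the grid along a beam from the sensor at (x0, y0) to the hit point (x1, y1), in metres."""
    for i in range(n_steps + 1):
        t = i / n_steps
        cx = int((x0 + t * (x1 - x0)) / RES)
        cy = int((y0 + t * (y1 - y0)) / RES)
        if 0 <= cx < GRID.shape[0] and 0 <= cy < GRID.shape[1]:
            GRID[cx, cy] += L_OCC if i == n_steps else L_FREE   # endpoint occupied, rest free

integrate_beam(0.0, 0.0, 2.0, 1.0)                       # one simulated range return
occupancy_probability = 1.0 - 1.0 / (1.0 + np.exp(GRID)) # recover probabilities from log-odds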

The advent and proliferation of RGB-D sensors has enabled the construction of larger and ever-more detailed 3D maps. In addition, considerable effort has been made in the semantic labeling of these maps, at pixel and voxel levels. Most of the relevant approaches can be split into two main trends: methods designed for online use and those designed for offline use. Online methods process data as they are being acquired by the mobile robot, and generate a semantic map incrementally. These methods are usually coupled with a SLAM framework, which ensures the geometric consistency of the map. Building maps of the environment is a crucial part of any robotic system and arguably one of the most researched areas in robotics. Early work coupled mapping with localization as part of the simultaneous localization and mapping (SLAM) problem. More recent work has focused on dealing with or incorporating time-dependencies (short or long term) into the underlying structure, using grid maps, pose-graph representations, or the normal distribution transform (NDT).

As presented by Hermans et al., RGB-D data are processed by a random forest-based classifier to predict semantic labels; these labels are further regularized through the conditional random field (CRF) method proposed by Krahenbuhl and Koltun. Similarly, McCormac et al. use the elastic fusion SLAM algorithm proposed by Whelan et al. to fuse CNN predictions about the scene into a geometrically consistent map. In the work of Sünderhauf et al., a CNN is used to incrementally build a semantic map, with the aim of extending the number of classes supported by the CNN by complementing it with a series of one-vs-all classifiers which can be trained online. A number of semantic mapping approaches are designed to operate offline, taking as input a complete map of the environment. In the methods described by Ambrus et al. and Armeni et al., large-scale point clouds of indoor buildings are processed, and then, after segmenting the input data, the methods' outputs are in the form of a set of “rooms.” Ambrus et al. use a 2D cell-complex graph-cut approach to compute the segmentation, with the main limitation that only single-floor buildings can be processed, while Armeni et al. process multifloor structures by detecting the spaces between the walls, ceilings, etc., with the limitation that the building walls have to be axis-aligned (i.e., the Manhattan-world assumption). Similarly, in the work proposed by Mura et al., a large point cloud of an indoor structure is processed by making use of a 3D cell-complex structure, and a mesh containing the semantic segmentation of the input data is output. However, the main limitation is that the approach requires knowledge of the positions from which the environment was scanned when the input data were collected.

The recent work presented by Brucker et al. builds on the segmentation of Ambrus et al. and explores ways of fusing different types of information, such as the presence of objects and cues about the types of rooms, to obtain a semantic segmentation of the environment. The aim of the work presented by Brucker et al. is to obtain an intuitive and human-like labeling of the environment while at the same time preserving as many of the semantic features as possible. Also, Brucker et al. use a conditional random field (CRF) for the fusion of the various heterogeneous data sources, and inference is done using a Gibbs sampling technique.

Processing sensory data and storing it in a representation of the environment (i.e., a map of the
environment) has been and continues to be an active area in robotics research, including autonomous
driving system (or autonomous robotic-vehicles). The approaches covered range from metric
representations (2D or 3D) to higher semantic or topological maps, and all serve specific purposes key to
the successful operation of a mobile robot, such as localization, navigation, object detection,
manipulation, etc. Moreover, the ability to construct a geometrically accurate map further annotated with
semantic information also can be used in other applications such as building management or architecture,
or can be further fed back into a robotic system, increasing the awareness of its surroundings and thus
improving its ability to perform certain tasks in human-populated environments (e.g., finding a cup is
more likely to be successful if the robot knows a priori which room is the kitchen and how to get there).

[h]Artificial intelligence and machine learning applied to robotic perception

Once a robot is (self) localized, it can proceed with the execution of its task. In the case of autonomous
mobile manipulators, this involves localizing the objects of interest in the operating environment and
grasping them. In a typical setup, the robot navigates to the region of interest, observes the current scene
to build a 3D map for collision-free grasp planning and for localizing target objects. The target could be a
table or container where something has to be put down, or an object to be picked up. Especially in the
latter case, estimating all 6 degrees of freedom of an object is necessary. Subsequently, a motion and a
grasp are computed and executed. There are cases where a tighter integration of perception and
manipulation is required, e.g., for high-precision manipulation, where approaches like visual servoing are
employed. However, in every application, there is potential for improvement by treating perception and manipulation together.

Perception and manipulation are complementary ways to understand and interact with the environment and, according to the common coding theory developed and presented by Sperry, they are also inextricably linked in the brain. The importance of a tight link between perception and action for artificial agents was recognized by Turing, who suggested equipping computers “with the best sense organs that money can buy” and letting them learn from gathered experiences until they pass his famous test.

The argument for embodied learning and grounding of new information evolved through the works of Steels and Brooks, and of Vernon; more recently, robot perception has come to involve planning and interactive segmentation. In this regard, perception and action reciprocally inform each other in order to obtain the best results for locating objects. In this context, the localization problem involves segmenting objects, but also knowing their position and orientation relative to the robot in order to facilitate manipulation. The problem of object pose estimation, an important prerequisite for model-based robotic grasping, uses in most cases precomputed grasp points, as described by Ferrari and Canny. This topic can be categorized into either template/descriptor-based approaches or, alternatively, local feature/patch-based approaches. In both cases, an ever-recurring approach is that bottom-up, data-driven hypothesis generation is followed and verified by top-down, concept-driven models. Such mechanisms are assumed, as addressed by Frisby and Stone, to resemble the human visual system.


The importance of the mobile manipulation and perception areas has been signaled by the (not only academic) interest spurred by events like the Amazon Robotics (formerly Picking) Challenge and the workshop series at the recent major computer vision conferences associated with the SIXD Challenge. However, current solutions are either heavily tailored to a specific application, requiring specific engineering during deployment, or their generality makes them too slow or imprecise to fulfill the tight time-constraints of industrial applications. While deep learning holds the potential to both improve accuracy (i.e., classification or recognition performance) and increase execution speed, more work on transfer learning, in the sense of generalization improvement, is required to apply models learned in the real world also to unseen (new) environments. Domain adaptation and domain randomization (i.e., image augmentations) seem to be important directions to pursue, and should be explored not only for vision/camera cases, but also for LiDAR-based perception.
Usually, in traditional mobile robot manipulation use-cases, the navigation and manipulation capabilities
of a robot can be exploited to let the robot gather data about objects autonomously. This can involve, for
instance, observing an object of interest from multiple viewpoints in order to allow a better object model
estimation, or even in-hand modeling. In the case of perception for mobile robots and autonomous (robot) vehicles, such options are not available; thus, their perception systems have to be trained offline. However, besides AI/ML-based algorithms and higher-level perception, for autonomous driving applications, environment representation (including multisensor fusion) is of primary concern.

The development of advanced perception for (fully) autonomous driving has been a subject of interest since the 1980s, with a period of strong development due to the DARPA Challenges and the European ELROB challenges; more recently, it has regained considerable interest from the automotive and robotics industries and from academia. Research in self-driving cars, also referred to as autonomous robot-cars, is closely related to mobile robotics, and many important works in this field have been published in well-known conferences and journals devoted to robotics. Autonomous driving systems (ADS) comprise, basically, perception (including sensor fusion and environment modeling/representation), localization, and navigation (path planning, trajectory following, control) and, more recently, cooperation (V2X-based communication technologies). However, the cornerstone of an ADS is the perception system, because it is involved in most of the essential tasks for safe driving, such as the segmentation, detection and recognition of roads, lane markings, pedestrians and other vulnerable road users (e.g., cyclists), other vehicles, traffic signals, crosswalks, and the numerous other types of objects and obstacles that can be found on the roads. In addition to the sensors (e.g., cameras, LiDAR, radar, and “new” solid-state LiDAR technology) and the models used in ADS, the common denominator in a perception system consists of AI/ML algorithms, where deep learning is the leading technique for semantic segmentation and object detection.

One of the current trends in autonomous vehicles and robotics is the promising idea of incorporating cooperative information, from a connected environment/infrastructure, into the decision loop of the robotic perception system. The rationale is to improve robustness and safety by providing complementary information to the perception system; for example, the position and identification of a given object or obstacle on the road could be reported (e.g., broadcast through a communication network) in advance to an autonomous car, moments before the object/obstacle is within the onboard sensors' field of view or range.

[h]The Strands project

The EU FP7 Strands project is formed by a consortium of six universities and two industrial partners.
The aim of the project is to develop the next generation of intelligent mobile robots, capable of operating
alongside humans for extended periods of time. While research into mobile robotic technology has been
very active over the last few decades, robotic systems that can operate robustly, for extended periods of
time, in human-populated environments remain a rarity. Strands aims to fill this gap and to provide robots
that are intelligent, robust, and can provide useful functions in real-world security and care scenarios.
Importantly, the extended operation times imply that the robotic systems developed have to be able to
cope with an ever-increasing amount of data, as well as to be able to deal with the complex and
unstructured real world.
Figure :The Strands project.

Figure shows a high-level overview of the Strands system: the mobile robot navigates autonomously between a number of predefined waypoints. A task-scheduling mechanism dictates when the robot should visit which waypoints, depending on the tasks the robot has to accomplish on any given day. The perception system consists, at the lowest level, of a module which builds local metric maps at the waypoints visited by the robot. These local maps are updated over time, as the robot revisits the same locations in the environment, and they are further used to segment out the dynamic objects from the static scene. The dynamic segmentations are used as cues for higher-level behaviors, such as triggering a data acquisition and object modeling step, whereby the robot navigates around the detected object to collect additional data which are fused into a canonical model of the object. The data can further be used to generate a textured mesh, from which a convolutional neural network can be trained to successfully recognize the object in future observations. The dynamics detected in the environment can be used to detect patterns, either through spectral analysis or as part of a multitarget tracking system based on a Rao-Blackwellized particle filter.
Figure :The Strands system—Overview.

In addition to the detection and modeling of objects, the Strands perception system also focuses on the detection of people. Beyer et al. present a method to continuously estimate the head pose of people, while in related work laser and RGB-D data are combined to reliably detect humans and to allow human-aware navigation approaches which make the robot more socially acceptable. Beyer et al. also propose a CNN-based system which uses laser scanner data to detect objects; the usefulness of the approach is demonstrated in the care scenario, where it is used to detect wheelchairs and walkers.

Robust perception algorithms that can operate reliably for extended periods of time are one of the
cornerstones of the Strands system. However, any algorithm deployed on the robot has to be not only
robust, but also able to scale as the robot makes more observations and collects more information about
the world. One of the key parts that would enable the successful operation of such a robotic system is a
perception stack that is able to continuously integrate observations about the world, extract relevant parts
as well as build models that understand and are able to predict what the environment will look like in the
future. This spatio-temporal understanding is crucial, as it allows a mobile robot to compress the data
acquired during months of autonomous operation into models that can be used to refine the robot’s
operation over time. Modeling periodicities in the environment and integrating them into a planning
pipeline is further investigated by Fentanes et al., while Santos et al. build spatio-temporal models of the
environment and use them for exploration through an information-theoretic approach which predicts the
potential gain of observing particular areas of the world at different points in time.
[h]The RobDREAM project

Advanced robots operating in complex and dynamic environments require intelligent perception
algorithms to navigate collision-free, analyze scenes, recognize relevant objects, and manipulate them.
Nowadays, the perception of mobile manipulation systems often fails if the context changes due to a variation, e.g., in the lighting conditions, the utilized objects, the manipulation area, or the environment. Then, a robotics expert is needed to adjust the parameters of the perception algorithm and the utilized sensor, or even to select a better method or sensor. Thus, a high-level cognitive ability that is required for operating alongside humans is to continuously improve performance based on introspection.
This adaptability to changing situations requires different aspects of machine learning, e.g., storing
experiences for life-long learning, generating annotated datasets for supervised learning through user
interaction, Bayesian optimization to avoid brute-force search in high-dimensional data, and a unified
representation of data and meta-data to facilitate knowledge transfer.

The RobDREAM consortium automated and integrated different aspects of these. Specifically, in the
EU’s H2020 RobDREAM project, a mobile manipulator was used to showcase the intuitive programming
and simplified setup of robotic applications enabled by automatically tuning task execution pipelines
according to user-defined performance criteria.

As illustrated in Figure, this was achieved by a semantically annotated logging of perceptual episodic
memories that can be queried intuitively in order to analyze the performance of the system in different
contexts. Then, a ground truth annotation tool can be used by the user to mark satisfying results, or
correct unsatisfying ones, where the suggestions and interactive capabilities of the system reduced the
cognitive load of this often complicated task (especially when it comes to 6 DoF pose annotations), as
shown in user studies involving computer vision expert and nonexpert users alike.

Figure :Schematics of the RobDREAM approach

These annotations are then used by a Bayesian optimization framework to tune the off-the-shelf pipeline to the specific scenarios the robot encounters, thereby incrementally improving the performance of the system. The project did not focus only on perception, but on other key technologies for mobile manipulation as well. Bayesian optimization and other techniques were used to adapt the navigation, manipulation, and grasping capabilities independently of each other and of the perception capabilities. However, the combinatorial complexity of the joint parameter space of all the involved steps was too much even for such intelligent meta-learners. The final industrially relevant use-case demo featured the kitting and mounting of electric cabinet board elements, for which a pose-annotated database was built using two RGB-D cameras and released to the public.
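A compact sketch of this kind of pipeline tuning, assuming the scikit-optimize library; the parameter names, their ranges and the synthetic objective are placeholders, and in a real setup the objective would run the perception pipeline and score it against the user's ground-truth annotations.

from skopt import gp_minimize
from skopt.space import Integer, Real

# Hypothetical tunable parameters of a perception pipeline (names and ranges are assumptions).
search_space = [
    Real(0.001, 0.05, name="voxel_size"),        # point-cloud downsampling resolution
    Integer(50, 2000, name="ransac_iterations"), # iterations of a model-fitting step
    Real(0.1, 0.9, name="score_threshold"),      # detection acceptance threshold
]

def evaluate_pipeline(params):
    """Placeholder objective: in practice this would run the perception pipeline with the
    given parameters and return an error measured against annotated ground truth."""
    voxel_size, ransac_iterations, score_threshold = params
    return (voxel_size - 0.01) ** 2 + (score_threshold - 0.5) ** 2 + 1.0 / ransac_iterations

result = gp_minimize(evaluate_pipeline, search_space, n_calls=30, random_state=0)
print("best parameters:", result.x, "lowest error:", result.fun)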
[h]The SPENCER project

When deploying robots in scenarios where they need to share the environment and interact with a large
number of people, it is increasingly important that their functionalities are “socially aware.” This means
that they respect the personal space of encountered persons, do not navigate such that they cut through queues or groups, etc. Such functionalities go beyond the usual focus of robotics research groups, while academics
focusing on user experience typically do not have the means to develop radically new robots. However,
the EU’s FP7 program funded such an interdisciplinary project, called SPENCER, driven by an end-user
in the aviation industry.

Since around 80% of the passenger traffic at different hubs, including Schiphol in Amsterdam, consists of passengers who are transferring from one flight to another, KLM is interested in an efficient management of their movements. For example, when transfer times are short and finding one's way in a big airport is difficult due to language and alphabet barriers, people are at risk of losing their connection. In such and similar cases, robotic assistants that can be deployed and booked flexibly can possibly help alleviate some of the problem. This use-case was explored by the SPENCER demonstrator for smart passenger-flow management and mobile information provision, but similar solutions are required in other domains as well.
Figure :Concept and results of the SPENCER project

The SPENCER consortium integrated the developed technologies onto a robot platform whose task consists in picking up short-transfer-time passenger groups at their gate of arrival, identifying them with an onboard boarding pass reader, guiding them to the Schengen barrier and instructing them to use the priority track. Additionally, the platform was equipped with a KLM information kiosk and provided services to passengers in need of help.
In crowded environments such as airports, generating short and safe paths for mobile robots is still difficult. Thus, social scene understanding and long-term prediction of human motion in crowds are not sufficiently solved, yet they are highly relevant for all robots that need to quickly navigate in human environments, possibly under temporal constraints. Social scene understanding means, in part, that reliable tracking and prediction of people's motion with low uncertainty are available, which is particularly hard if there are many occlusions and many fast changes of motion direction. Classical path planning approaches often result in an overconstrained or overly cautious robot that either fails to produce a feasible and safe path in the crowd, or plans a large and suboptimal detour to avoid people in the scene.

[h]The AUTOCITS project

The AUTOCITS project will carry out a comprehensive assessment of cooperative systems and autonomous driving by deploying real-world Pilots, and will study and review regulations related to automated and autonomous driving. AUTOCITS, cofinanced by the European Union through the Connecting Europe Facility (CEF) Program, aims to facilitate the deployment of autonomous vehicles on European roads and to use connected/cooperative intelligent transport systems (C-ITS) services to share information between autonomous vehicles and infrastructure, by means of V2V and V2I communication technology, in order to improve safety and to facilitate the coexistence of autonomous cars in real-world traffic conditions. The AUTOCITS Pilots, involving connected and autonomous vehicles, will be deployed in three major European cities in “the Atlantic Corridor of the European Network”: Lisbon (Portugal), Madrid (Spain), and Paris (France).

A number of technologies are involved in AUTOCITS, ranging from the onboard and road-side units (OBU, RSU) to the autonomous driving systems that equip the cars. Today, the autonomous and/or automated driving technology we see on the roads belongs to levels 3 or 4. In AUTOCITS, the Pilots' deployment will be of level 3 to 4. In this context, it is important to note that level 5 cars (i.e., 100% self-driving or fully automated cars, for which the steering wheel would be unnecessary) operating on real-world roads and streets are still far from reality.

We can say that the perception system is in charge of all tasks related to object and event detection and
response (OEDR). Therefore, a perception system—including of course its software modules—is
responsible for sensing, understanding, and reasoning about the autonomous car’s surroundings. Within a
connected and cooperative environment, connected cars would leverage and complement onboard sensor
data by using information from vehicular communication systems (i.e., V2X technology): information
from other connected vehicles, from infrastructure, and road users (and vice-versa).

[h]Conclusions and remarks

So just how capable are current perception and AI, and how close did/can they get to human-level performance? Szeliski, in his introductory book on computer vision, argued that traditional vision struggled to reach the performance of a 2-year-old child, but today's CNNs reach super-human classification performance on restricted domains.

The recent surge and interest in deep-learning methods for perception has greatly improved performance
in a variety of tasks such as object detection, recognition, semantic segmentation, etc. One of the main
reasons for these advancements is that working on perception systems lends itself easily to offline
experimentation on publicly available datasets, and comparison to other methods via standard
benchmarks and competitions.
Machine learning (ML) and deep learning (DL), the latter being one of the most used keywords at recent robotics conferences, are consolidated topics embraced by the robotics community nowadays. While one can interpret the filters of CNNs as Gabor filters and assume them to be analogous to functions of the visual cortex, currently deep learning is a purely nonsymbolic approach to AI/ML, and is thus not expected to produce “strong” AI/ML. However, even at the current level, its usefulness is undeniable, and perhaps the most eloquent example comes from the world of autonomous driving, which brings together the robotics and computer vision communities. A number of other robotics-related products are starting to be commercially available for increasingly complex tasks such as visual question answering systems, video captioning and activity recognition, large-scale human detection and tracking in videos, or anomaly detection in images for factory automation.

[mh] Sensory Data Acquisition and Processing

Assistance systems in Ambient Assisted Living and in medical care have to recognize relevant situations that require fast assistive intervention. Former projects in this field, like tecla or PAUL, focused on the application of the new AAL technologies in AAL test beds in order to obtain information about the acceptance level of the technologies and the different new applications for the patients. Additionally, business models have been drafted to establish a successful AAL business area in the future.

The clinically established measurement technology for diagnostics, monitoring and risk stratification does not translate directly to the outpatient area (ambulant or domestic environment). The key challenge is that many relevant situations are only noticeable when various sensor modalities are merged, such as for the discrimination between pathological, emotional or stress-induced increases of the heart rate. This is only possible by using a combination of multiple different sensors. The same applies to the analysis of the joint kinematics of everyday activities, which requires more inertial sensors with higher accuracy.

The next generation of radio networks (5G) opens up new possibilities for real-time communication in all areas of life, with very low latency and high data rates. One speaks of a so-called tactile Internet. People come into contact with their surroundings through their senses, which involve several different reaction times. Here, muscular, audio-visual and tactile response times are of particular importance. The typical muscular response time is around 1 second, that of hearing around 100 ms, while the visual response time is in the range of 10 ms.

In the case of active control of an object, such as a car or a machine, the information must first be recorded while a reaction is carried out at the same time. The familiar use of a touch screen requires that the finger be moved in a controlled manner across the screen. It is therefore necessary that the touch screen achieves a response time of less than 1 ms in order not to produce any noticeable delay in the visual impression. In the case of an active prosthesis, as applied in this study, the response time must be below 10 ms to provide a practical basis for its use in daily life. Therefore, fast sensor data frameworks are needed to analyze the conditions for real-time identification and subsequently provide medically valid, corresponding assistance.

The aim of the fast care project was to develop a real-time sensor data analysis framework for intelligent assistance systems in the areas of Ambient Assisted Living (AAL), eHealth, mHealth, tele-rehabilitation and tele-care. It provides a medically valid, integrated real-time situation picture based on a distributed, ad hoc networking, everyday-usable and energy-efficient sensor infrastructure with a latency of less than several milliseconds. The integrated situation picture, which includes physiological, cognitive and kinematic information of the patient, is generated by the intelligent fusion of sensor data. It can serve as a basis both for the rapid detection of risks and dangerous situations and for everyday medical assistance systems that autonomously intervene in real time and allow active telemedical feedback.

In this chapter of the book, after an introduction, the technical goals and implementation options of a fast sensor network with real-time data analysis are presented, followed by the structure of the overall system. Next, the details of the technological concept, such as data fusion and telemetry, are presented, and all relevant interfaces for real-time applications are discussed in detail. In the following section, the hardware, the sensors/actuators and the specific installation of the demonstrator in laboratory operation are discussed. Subsequently, details of the individual sensor systems and the corresponding visualization of the sensor data by an Avatar are presented. Then, the acceptance tests for the use of the sensor components of the demonstrator are analyzed and discussed. Finally, a summary with a view of upcoming developments is given at the end.

[h]System setup

The basis of a medically valid, integrated real-time picture of the situation is an ad hoc interconnected sensor infrastructure. Its latency must be very low to fulfill the requirements of a haptic working network. Here, physiological, cognitive and kinematic information of a patient is captured with the help of intelligent sensor data fusion. These data can be combined to provide an integrated picture of the patient's physical and mental situation. In this way, it should be ensured that the framework can be used for applications in which feedback has to be embedded synchronously. This can be realized in the visual, auditive, tactile or proprioceptive channels of perception, such as in the field of support of motor function and kinematics for rehabilitation and for active prosthetics and orthotics.

Figure shows an overview of the system concept of the project approach for an integrated sensor infrastructure in the home of an elderly person. It brings together GPS data, air pressure and temperature data, vital parameters, cameras, optical sensors and so-called inertial measurement units (IMUs).
Figure :Integrated system concept.

These sensor data are collected in real time and buffered in a database system. From this database, an integrated real-time situation analysis is generated that touches on three areas of human life: first, the kinematic data, such as localization, movement and posture; second, the cognitive area, covering awareness, emotionality and mental clarity; and third, the physiological data, in which cardiovascular, metabolic and neurological data can be recorded and analyzed.

The entirety of the data gathered in the home of the person can be evaluated integratively and can accordingly provide a precise analysis of his or her health. In this project, apart from the emotional and neurological aspects, all the addressed areas were recorded and evaluated. After evaluating the situation analysis, actuators are employed for rehabilitation, in the special case of an active foot prosthesis, which can adjust to different heel heights, adapt automatically to different floor conditions, or support rapid walking. Furthermore, the client is provided with a real-time display of his vital parameters by a so-called Smart Home Assistant, which can give helpful health support to the client.

For a real-time application, it is necessary that the latency between sensor detection and actuator actuation is less than several milliseconds. This ensures a so-called haptic functionality of the system and can be achieved with the help of new radio technologies and fast network technologies such as FTTH and the fifth generation of mobile radio networks (5G). To ensure the privacy of the data, all data are stored and evaluated on a so-called home server situated in the client's apartment. Further intervention options are possible via a secure cloud connection to medical services or to the system administrators for possible updates of the sensor and actuator components.

The challenge of distributed, real-time medical sensor technology and signal processing is addressed by means of sensor-based data processing and sensor hubs, optical sensors, hardware system optimization, the development of distributed systems, as well as by interface network sensors. The focus of the project was on the intelligent fusion of sensor and actuator data as well as their evaluation and delivery in real time. In order to meet this objective, the following developments took place in the Ambient Assisted Living (AAL) Lab of the Harz University of Applied Sciences in Wernigerode:

 Analysis of requirements
 Data acquisition
 Data analysis
 Data fusion
 Acceptance analysis
 Situation detection and assistance in real-time

Figure :Application of fast care real-time sensor system.

The objective of distributed, real-time medical sensor technology and signal processing is to obtain an evaluation of the patient's situation from the available data in real time. The main application focus is in the area of orthopedic devices. For example, the optimization of the leg prosthesis' damping members and active foot positioning points shall be executed online. Currently, these parameters are tuned offline and by hand by orthopedic technicians, with variable quality. This often leads to suboptimally adapted orthopedic devices, whose functionality and efficacy are correspondingly limited, and therefore to an unsatisfactory rehabilitation outcome. This system approach of integrating the sensors into an active foot prosthesis is called a real-time active prosthetics/orthotics controller. Another part of the project addresses the online estimation of the cognitive condition, motion analysis for rehabilitation, and cardiopulmonary performance.

[h]Technological concept

Based on the project goals, the technical and content requirements of the technological topics to be
worked on were specified, categorized and summarized by the individual partners. The basic
requirements are listed in the following areas:

1. Hardware/sensors,
2. Network,
3. Data analysis,
4. Actuators/intervention/feedback

The system diagram of the research approach of the fast care framework is shown in Figure. The fast care framework is the technical basis for the realization of the fast care project, which implements the fusion of heterogeneous sensors via heterogeneous networks. The basic idea of the fast care framework is to derive a condition from the past and current states of the sensory data by using different newly developed sensor applications, covering the following areas and interfaces. From the network-topological representation, a breakdown of the network interfaces used was made, as specified by the project partners. Based on this, a suitable communication protocol was selected for the individual implementations. Communication via MQTT forms the basis of the communication between the sensor applications and the real-time controller depicted in Figure. On the left side of the figure, the sensor applications are shown, consisting of a Kinect system for motion data, inertial measurement units (IMUs) for the detection of movements of the body and of objects in a fixed sequence for the analysis of a workout in a kitchen, motion sensors/actuators in an active intelligent prosthesis, a camera-based heart rate and breathing sensor, and finally a special sensor for volatile organic compounds in the room air. The prosthesis, body and object sensors are connected via a smartphone and Bluetooth Low Energy, and the smartphone transfers the data to the real-time controller.
Figure :Network topology.

In total, the seven sensor components are listed there on the left: the active prosthesis, the heart rate measurement, the respiratory rate measurement, the detection of VOC components in the breathing air, the detection of movement in the room, the measurement of room temperature and humidity, as well as the use of the emergency button. Each uses the corresponding network structure according to the blocks shown in the sketch.

After the individual implementations of the interfaces, a suitable software communication server was selected. The MQTT protocol was implemented using a real-time capable Linux variant. Suitable hardware was procured by the project partner Harz University of Applied Sciences, a suitable operating system was installed, and the MQTT software server "mosquitto" was installed and configured. The definition of topics (message channels) and the specification of the data formats were necessary for smooth communication of the individual partner realizations, both internally and with each other. A detailed description of the communication formats between the sensors built by the partners and the MQTT server can be found in the final design plan of the fast care project.
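For illustration, a minimal sketch of how a sensor application could publish a JSON measurement to the mosquitto broker is given below, using the Python paho-mqtt client. The host name, topic name, and payload fields are illustrative assumptions and do not reproduce the project's actual topic definitions or data formats.

import json
import time

import paho.mqtt.publish as publish

BROKER_HOST = "realtime-controller.local"   # assumed address of the real-time controller
TOPIC = "fastcare/vital/heart_rate"         # assumed topic naming scheme

payload = {
    "sensor_id": "cbppg-01",                # hypothetical sensor identifier
    "timestamp": time.time(),
    "heart_rate_bpm": 72.4,                 # value would come from the measurement pipeline
}

# QoS 1 asks the broker to acknowledge receipt of the message.
publish.single(TOPIC, json.dumps(payload), hostname=BROKER_HOST, port=1883, qos=1)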

At the beginning of the project, the communication protocols to be used between the individual project partners for data exchange were discussed and clearly defined. The network interfaces used in the project are essentially Bluetooth LE transmission, Wi-Fi transmission, and wired transmission via Ethernet 802.3. Furthermore, wireless transmission via LTE or 4G+ was used by several partners. This resulted in a very broad range of transmission scenarios. An overview of the transmission technology from the sensor infrastructure to the real-time controller and the forwarding to the real-time visualization is depicted in the Figure:

Table:Overview of network interface parts used in fast care.

Figure :Network infrastructure .

After the data has been transferred to the real-time controller, it is available in the form of JSON objects stored on the Linux system of the server. At the same time, an integrative situation analysis of the sensor data is carried out, and the corresponding information is transferred via the public network to a cloud server for real-time visualization, which generates a website presenting the evaluated real-time data in the form of an Avatar.
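The receiving side can be sketched in a few lines as well: the real-time controller subscribes to the partner topics, decodes the JSON objects, and hands them to the situation analysis. The wildcard topic and message handling below are assumptions, not the project's actual implementation.

import json

import paho.mqtt.subscribe as subscribe

def on_message(client, userdata, message):
    # Decode the JSON object published by a sensor application.
    data = json.loads(message.payload.decode("utf-8"))
    # Placeholder for the integrative situation analysis / data fusion step.
    print(f"{message.topic}: {data}")

# "fastcare/#" is an assumed wildcard covering all partner topics; this call
# blocks and dispatches every incoming message to on_message.
subscribe.callback(on_message, "fastcare/#", hostname="localhost", port=1883)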
[h]Hardware, sensors, actors

In this part, all of the hardware components developed in the project are described. On the one hand, this includes sensors with the task of capturing a physical measured variable such as motion, VOC gas, heart rate, etc. Furthermore, sensor modules with combined sensors have been developed, which form a functional unit with actuators, e.g. the electronically controllable lower leg prosthesis. For a better overview of the components used by the individual partners, a matrix of the components used by all partners and their network interfaces was created.

Columns (left to right): Kinect | IMUs (Body) | IMUs (Object) | Prosthesis | Smartphone | Camera | VOC Sen. | Cloud | Real-time controller | Terminal

HSH: + + +
TUD: + ++ +
OvGU: + + +
URO: + + ++++ +++ +++
EX: +++ + + ++
BST: + +
OBO: + ++
HO: + + +

Table:Types of hardware components used by the cooperation partners.

In the following subsections all of the used hardware and all sensors/actors are collected and described.

[h]AAL lab installation

Rapid and intelligent sensors and actuators, improved motion pattern recognition, and intelligent algorithms for real-time network integration in three demonstrators of the AAL-Lab serve as solution approaches. Within the fast care project, a real-time network integration with demonstrators was carried out at the AAL-Lab of the Harz University. The various partial results of the project partners have been collected and integrated in the AAL-Lab. The integration at the AAL-Lab was performed in the form of a show flat, with a focus on user friendliness and the interaction with the user. The Figure illustrates the realized structure of the AAL-Lab with various elements for monitoring and evaluating the measured vital data. The lab includes the following parts: sensors on the walls (pulse, blood pressure, breathing frequency, motion/position, VOC breath analysis), the e-rehabilitation workout, and the real-time controller PC.
Figure :AAL lab of the Harz university; sketch of installations; (a) sensors on the walls: Pulse, blood
pressure, breathing frequency, skin resistance, motion/position, VOC breath analysis, (b) E-rehabilitation,
(c) real-time controller.

The Figure shows a photograph of the laboratory, including a sofa, several armchairs, a bed, and all the sensor components attached in the room. The room has been deliberately furnished in an old-fashioned style to create a pleasant atmosphere for the examinations. After the technology was installed, the acceptance tests were carried out in this environment.

Figure :Photograph of AAL lab.

[h]E-rehabilitation system

The Kinect sensor used by the Otto von Guericke University in fast care is a physical device with depth sensor technology, an integrated color camera, an infrared transmitter, and a microphone array that detects the position, movement, and voices of people. The Table shows the data of the KINECT depth sensor, while the Figure shows the workout scene. The application guides the patient through a therapeutic workout and gives real-time information and helpful feedback on how to move correctly. Additionally, a gait analysis can be performed by the use of IMUs positioned at the feet, shown in the Figure. More detailed information can be found in Stoutz et al.

Figure :Setup of the gait measurements for e-rehabilitation of Otto von Guericke university; above left:
IMU application at the feet; above right: Therapeutic movements with avatar; lower middle: Presentation
of gait analysis measurement.
Feature: Description
Depth sensor (512 × 424, 30 Hz; FOV: 70 × 60; single mode: 0.5–4.5 m): Optimized 3D visualization, detection of smaller objects in particular, and stable body tracking
1080p color camera (30 Hz; 15 Hz in poor lighting conditions): Camera with 1080p resolution
New active infrared functions (512 × 424, 30 Hz): IR functions for lighting-independent observations
Multi-array microphone: Four microphones to locate the sound source and the direction of the audio wave
Interfaces: Kinect AUX (USB), Kinect2 AUX (USB)

Table:Data of the used KINECT sensor system for e-rehabilitation.

[h]Inertial measurement unit (IMU)

The IMU used by the project partners Otto Bock HealthCare GmbH, Otto von Guericke University, and University of Rostock is an inertial measurement unit. It is a self-contained measuring system that continuously records, analyzes, and, if necessary, pre-processes defined physical parameters (e.g. movement, acceleration, pressure) and forwards them to downstream communication and network protocols. A distinction is made between two application modes. On the one hand, the IMUs can be installed on an object, e.g. in a kitchen appliance, which describes the use of "IMU on object" and provides measurement data for further analysis. The other area of application is the use of an IMU attached with suitable holders to the body of a person, which describes the use of an "inertial sensor on body" and also provides measurement data for further analysis. The project partner Bosch Sensortec GmbH developed and produces the IMUs used in the fast care project.
Figure :Structure of the inertial measurement unit network.

[h]Camera-based vital parameter sensor

The camera-based vital sensors used by the project partner Technical University Dresden are based on one or more camera systems with an associated, spectrally controllable lighting system and generate a spatial image of the surroundings as a database for further evaluations. Camera-based photoplethysmography (cbPPG) remotely detects the volume pulse of cardiac ejection in the peripheral circulation. The system measures the heart rate and the breathing rate contactlessly with a camera system in real time. More detailed information is given in the work of the Technical University of Dresden, Institute of Biomedical Technologies, by Zaunseder et al. The camera-based system records the changes at the surface of the face with fast data acquisition.
Figure :Camera-based vital sensors, 1: Measurement unit, 2: Camera and lighting system 1, 3: Central display of real-time measurement, 4: Measurement system 1 during application, 5: Measurement system 2 during application, 6: Camera and lighting system 2.

The exposure with an LED light source in a special spectral range is necessary to obtain a particularly good contrast. The raw image data are sent directly to a controller and evaluated there. The evaluated data (heart rate, respiratory rate) are transferred directly as a JSON object to the real-time controller via Ethernet cabling at 1 Gbit/s and stored there in the MQTT server. The representation of the respiratory rate and the heart rate is then realized in real time in the Avatar.
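The basic cbPPG principle can be illustrated with a short sketch that is not the TU Dresden implementation: the green channel is averaged over a facial region of interest for every frame, and the dominant spectral peak within a plausible pulse band is taken as the heart rate. Frame rate, region of interest, and band limits are assumptions.

import numpy as np

def estimate_heart_rate(frames, fps, roi):
    """frames: iterable of HxWx3 RGB arrays; roi: (y0, y1, x0, x1) face region."""
    y0, y1, x0, x1 = roi
    # Spatially averaged green-channel signal, one sample per frame.
    signal = np.array([f[y0:y1, x0:x1, 1].mean() for f in frames], dtype=float)
    signal -= signal.mean()

    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fps)

    # Restrict the search to a pulse band of 0.7-3.0 Hz (42-180 bpm).
    band = (freqs >= 0.7) & (freqs <= 3.0)
    peak_freq = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_freq  # beats per minute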

[h]VOC air sensor

As part of the BMBF-funded "fast care" project, HarzOptics GmbH has developed components for a distributed sensor network for the spectroscopic analysis of air. The sensor system analyzes the air in a room by measuring the optical spectral content of volatile organic components (VOC). Characteristic absorptions of VOC gases are analyzed, which can indicate the onset of certain clinical conditions. In addition to assessing the quality of indoor air for AAL applications, this system is also to be used for the detection of VOCs in breathing gas. Since the presence of certain VOCs in exhaled air enables conclusions to be drawn about diseases such as lung cancer or metabolic disorders, the integration of non-invasive permanent gas analysis into real-time medical care is becoming possible, also in view of increasing bandwidths and decreasing latency times.

The air sensor is part of a more complex system, the basic mode of operation of which can be seen in the Figure. Data recorded by a sensor (e.g. CO2 concentration) are transferred as voltage values to an Arduino board, which converts the values into volume concentrations, converts the resulting data into an MQTT-compliant format, and transmits it to a real-time server. The data is displayed using a special real-time Avatar sketch, which is presented in chapter 4.10 in more detail. If limits are exceeded, a warning or recommendation is issued (e.g. "Please open window and ventilate" or "Please consult a doctor"). In addition to the data from this sensor, the MQTT server also receives data from other sensors developed by other project partners. These are also visualized in the Avatar (see Figure):

Figure :VOC sensor setup.

Since the spectrum could not be recorded using an optical spectrometer due to a lack of sensitivity, an alternative setup with laser sources was implemented. The wavelengths used here correspond to the previously determined absorptions of the relevant substances and are recorded by a broadband optical sensor. If the target substances are present in the air, the light from the laser source is attenuated in accordance with their concentration, which reduces the voltage values at the sensor output so that the volume concentration can be determined. The temperature sensitivity of the sensor and amplifier is still causing problems.
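The conversion chain described above can be sketched as follows: a raw sensor voltage is mapped to a volume concentration, packed into an MQTT-compliant JSON message, and checked against a ventilation limit. The calibration constants, threshold, topic, and broker address are illustrative assumptions, not the calibration of the real sensor.

import json
import time

import paho.mqtt.publish as publish

CAL_SLOPE = 400.0       # assumed calibration: ppm per volt of sensor output
CO2_LIMIT_PPM = 1500.0  # assumed ventilation threshold

def voltage_to_ppm(voltage):
    # Assumed linear calibration from output voltage to volume concentration.
    return max(0.0, voltage) * CAL_SLOPE

def publish_voc_reading(voltage):
    ppm = voltage_to_ppm(voltage)
    message = {
        "sensor_id": "voc-01",              # hypothetical identifier
        "timestamp": time.time(),
        "co2_ppm": round(ppm, 1),
        "recommendation": "ventilate" if ppm > CO2_LIMIT_PPM else "ok",
    }
    publish.single("fastcare/air/voc", json.dumps(message), hostname="localhost")

publish_voc_reading(voltage=3.2)   # example call with a made-up voltage reading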

[h]Active prosthesis

Under the catchphrase "active prosthesis", Otto Bock HealthCare GmbH summarizes its IMUs worn on the body, an associated analysis and evaluation unit, and the control of an active prosthetic foot. The aim is to provide automatic adjustment of an active prosthetic foot using a long-term gait analysis based on the foot, knee, and joint angles. The realization of the complete measurement system is described in more detail by Albrecht-Laatsch. The current status quo of prosthesis adaptation is that clients rarely come in to have their prostheses adapted for rehabilitation and check-ups. Therefore, the prosthesis is usually only adapted for one type of gait. In addition, developers rarely speak to users, so few everyday problems flow into development.

The goal of developing the active prosthesis in the fast care project was to get a better picture of real prosthesis usage, as well as to make adaptation to the real needs of the user easier and faster. This was achieved with a remote connection to the active prosthetic foot, used for remote diagnosis and automatic adaptation to the conditions of use.

Implementation was achieved with the help of motion sensors (IMUs), whose measured values were used both locally and remotely. This eliminates the need for regular visits to the gait laboratory, and long-term recording takes place in a relaxed environment. In addition, incorrect movement patterns can be recognized and corrected early. The adaptation takes place automatically and can be initiated from a remote location. With the active prosthetic foot, the heel height and the active gait support could be adjusted automatically by the software. This reduces fatigue, as the motor supports the push-off of the leg. The support is regulated depending on the speed. For experts in the laboratory, the gait diagram is displayed remotely in real time, and further parameters of the prosthesis can be remotely adjusted by the experts in fine-tuning mode. The test of the automatic adaptation of the prosthesis was performed in the laboratory, as depicted in the working scene of the Figure:

Figure :Active prosthesis motion sensor with feedback for gait optimization.

[h]Bluetooth beacons

The University of Rostock uses BLE beacons to locate its IMUs in the room. These beacons are distributed at fixed positions in the room and, via field-strength measurements, allow the IMUs to make statements about the movements and accelerations of people in the space. The sensors provide information for a kitchen task assessment dataset. This dataset contains normal behavior as well as erroneous behavior due to dementia, recorded with wearable sensors as well as with sensors attached to objects. The scene of the kitchen task workout is depicted in the Figure:

Figure :Motion analysis of a cooking process with IMUs and inference methods at the University of Rostock.

In this workout, a test client prepares a pudding meal that is clearly defined in a few simple steps. The process goes from the compilation of the ingredients, through the cooking itself, to completion and decanting the pudding into several cups. All sub-processes are analyzed in detail and appropriate help is provided if the wrong ingredients or the wrong wooden spoon are used, while all objects in the environment the person is working with are equipped with IMU sensors.

The kitchen task is annotated with a semantic annotation scheme. This scheme gives information about the observed motions and the errors made while performing the workout. The data are split into sensor and video data. The video data are collected by several cameras, while the sensor data comprise accelerations collected in parallel from the body-worn IMU sensors and additionally from the objects used. The complete data set consists of several normal and erroneous runs. To obtain information about the erroneous runs, the clients deliberately made errors in the workout. The data consist of action data as well as the object being manipulated and the client working with it. More information about the sensor application for analyzing the erroneous behavior can be found in Hein et al.

[h]Emergency button and temperature/humidity sensors

As an additional sensor system, the Exelonix company implemented an NB-IoT sensor as a push button, which transmits its sensor data in JSON format to the real-time server via the existing public 4G+ radio network. The emergency is displayed in real time on the visualization server. In a real deployment, this could then be transmitted to the 24/7 service of a nursing provider. A second sensor that also works via NB-IoT transmission is a motion-sensitive sensor. This has been installed to register movements in the room and additionally to transmit the room temperature and air pressure to the real-time server via the public radio network. In this case, too, the data is transmitted in JSON format. Further information on the exact key data of the sensors can be found in the publications by Stege et al.

Figure :Sensor modules of Exelonix, left: IoT emergency button via 4G+; right: IoT temperature, air
pressure and motion sensor via 4G+.

[h]Real-time controller

Within the fast care project, the Harz University of Applied Sciences developed a real-time platform for the sensor data fusion of the partners' partial realizations. For this purpose, a Linux-based application server was configured based on the communication protocol (MQTT) selected for the project. This "real-time controller", on which all information converges, forms the central sensor data fusion. The device is a rack-mounted server PC with an Intel i7 CPU and 16 GByte of 1600 MHz DDR3 memory, which is depicted in the Figure. The Linux version is "Red Hat Enterprise Linux Server release 7.7 (Maipo)". The network interfaces are two 1 Gbit/s IEEE 802.3 ports and a "Realtek Semiconductor Co., Ltd. RTL8192EE PCIe Wireless Network Adapter". More detailed information can be found in the so-called final design plan of the fast care project.
Figure :Real-time controller with MQTT server.

[h]Sensor data visualization

The project partners agreed on the technical implementation of the data fusion on the planned real-time server and the development of a user interface. After the data collection of all partners, these data are evaluated centrally on the real-time controller. The user should receive feedback about the obtained information. This feedback is based on the visualization of the situation analysis. The main view of the real-time visualization is shown in the Figure. With its end customer platform, Exelonix GmbH forms the technological basis for the visualization in the fast care project. All sensor data collected in the MQTT server of the Harz University of Applied Sciences are evaluated and visualized using the end customer platform from Exelonix, in a web page to which only the project partners had access. The "technical information and data packets" received on the real-time controller are transformed and prepared into a form that can be interpreted by those in need of care, their relatives, and experts. Among other things, time courses and histories are added.
Figure :Real-time visualization of the measured sensor data.

The visualization is shown in the Figure. An Avatar appears on the left, in which both the heart rate and the breathing rate are visualized as movements of the heart and chest. On the right side of the picture there is a heart icon showing the heart rate and a lung icon showing the respiratory rate. Furthermore, the data of the Exelonix sensors are shown: the emergency button status, the room temperature, and the room humidity. An indication of the condition of the indoor air is shown directly below these displays; in this case, the icon of a green cloud shows that the indoor air is in good condition.

Additional sensor data is depicted on the Avatar sketch. In the hip, knee, and ankle area of the legs, information about the battery states of the IMUs used for recording the posture and knee angle is shown. The knee angle measured at the leg with the prosthesis is shown online in the graphic on the right, in degrees over time while walking.

The patient's gait parameters, which are also recorded by the IMUs on the hips, knees, and ankles, can be seen online to the right of the two icons for step width and foot lifting height. This allows the gait to be assessed and improved in situ for rehabilitation purposes.

In addition to this main page of the real-time display, a sub-page has been created for each partner application, in which the details of the individual sensor elements and their operation are summarized. The details of the partners' real-time visualizations can be found in particular in the final design plan, published by Kußmann et al.

[h]User acceptance studies


In addition to the technical development activities, an acceptance analysis was carried out at the AAL-Lab of the Harz University. As a result of the project, fast care aims to develop feasible products and create the medical foundations for an interaction in real time.

For the acceptance analysis of the system, a small sample of 20 subjects from different age groups was interviewed. The following figure shows the distribution by gender and age. Although this study is not representative, it gives a first insight into the evaluation of the developed technology.

Figure :Age and gender distribution of the testing persons.

During the survey, the subjects had to assess both the individual systems of the project partners and the overall system. The survey results for the entire system were very positive: 60% of the respondents stated that they would like to use the technology privately, 70% of the respondents would like to have access to the technology, 35% would be willing to buy the presented technology, and 95% see a great benefit for themselves and for others in the tested technology.
Figure :Use of the presented technologies.

In another part of the test, the sample's affinity for technology was queried. On average, the confidence in one's own skills when dealing with new technology was rated with 3.33 out of 5 points, the willingness to use new and unknown technology with 4 out of 5 points, and the degree of technical overload with only 2.13 out of 5 points. As a result, the test subjects showed a great willingness to use new technologies and did not feel overwhelmed by the technology used.
Figure :Technical affinity of the test persons.

The Figure illustrates that the subsystem of the project partner Otto Bock was rated positively by the test subjects. The success of the measurement was rated on average with 4.35 out of 5 points, the success of the calibration with 3.97 out of 5 points, and the intelligibility of the display with 3.27 out of 5 points. The women rated the manageability of the system, with 4.08 out of 5 points, slightly better than the men, with 3.44 out of 5 points.
Figure :Evaluation of the application of the active prosthetic foot.

The gait analysis of the project partner Otto von Guericke University was rated very positively by the subjects, with 4.27 out of 5 points, and the technology used by the OvGU Kinect system with 3.9 out of 5 points. The more the test subjects were overwhelmed by the technology, the more negatively the system was rated.

Figure :Evaluation of the applications of the demonstrators of OvGU and TU Dresden.


Analyzing the system of the TU Dresden, the success of the measurement was rated 4.05 out of 5 points and the comprehensibility of the instructions likewise with 4.05 out of 5 points. The instructions were rated as less comprehensible by test subjects who felt overwhelmed by the technology. The intelligibility of the display and the results was rated with 3.58 out of 5 points.

In the fast care project, a real-time capable sensor data analysis framework in the field of ambient assisted living was developed. The project realized a medically valid, integrated real-time picture of the patient's situation by using several interconnected sensor-actor infrastructures with a latency of less than 10 ms. The implemented sensor structure records the heart rate, the breathing rate, and the VOC content of the room air, analyzes the gait for rehabilitation, and measures the temperature and humidity in the room. An emergency button has also been integrated.

An active prosthetic foot was used as a special application of the sensor-actor system. Its running parameters can be measured online, and the prosthesis can automatically adapt to the floor covering and the walking demands via the network. This means that users have an intelligent active prosthesis at their disposal to help them cope with everyday life more easily.

It was shown that even with a heterogeneous network consisting of the components WiFi, Bluetooth LE,
Gigabit LAN and 4G+, real-time operation was possible for the use of the AAL components. Even the
display of the measured data, which was transferred to a website via the cloud, only showed latencies of
an additional few milliseconds. This made it possible to create a real-time image in the form of an Avatar
for all vital parameters and the automatic setting of the active prosthetic foot, which enables the client to
notice his physical condition in situ.

In addition to the technical development activities, an acceptance analysis was carried out at the demonstrator in the AAL laboratory. The survey results for the entire system were very positive: 60% of the respondents stated that they would like to use the technology privately, 70% would like to have access to the technology, 35% would be willing to buy the presented technology, and 95% see a great benefit for themselves and for others in the tested technology.

Unfortunately, some slow network technologies such as Bluetooth LE had to be used to carry out the project. It is to be expected that, with the full expansion of the networks to the fifth generation (5G), there will be a further significant leap in transmission speed and transmission quality. It is therefore to be expected that eHealth applications in the home area can be implemented in real time in the near future. After the data fusion, further processing with the help of artificial intelligence will bring further benefits to the client for maintaining physical and mental health.

Chapter 4: Actuation Systems in Robotics

[mh] Modelling of piezoelectric stick-slip actuators

The piezoelectric stick-slip actuator is a type of actuator that uses the inverse piezoelectric effect and the friction principle to realize a stepping displacement. Its motion process is complex and influenced by many factors. Therefore, a model of a piezoelectric stick-slip actuator needs to reflect as many of its motion laws as possible. A comprehensive, representative model of the piezoelectric stick-slip drive can be established by analyzing the stick-slip drive principle and building on a simplified model. On the one hand, the influence of each factor on the system can then be analyzed by simulation. On the other hand, for control methods that contain a system model, the model is the theoretical basis of the control, and its accuracy directly affects the performance of the control system.
A typical piezoelectric stack-friction rod actuator consists of four principal parts: the fixed part, the piezoelectric stack, the friction rod, and the slider. The piezoelectric element elongates and contracts under the action of a sawtooth wave drive signal, driving the friction rod into a corresponding reciprocating motion. The slider is displaced forward by the frictional force with the friction rod and its own inertia. The basic drive principle of a piezoelectric stick-slip actuator is illustrated in the Figure. The drive process within one cycle is divided into a stick phase and a slip phase, and its force analysis is shown in Figure b. In the stick phase, when a slowly increasing voltage is applied to the piezoelectric element, the element slowly extends and drives the friction rod to the right. At this time, the friction force between the slider and the friction rod is greater than the inertia force, so the slider and the friction rod remain relatively stationary and move together to the right by S0. In the slip phase, the voltage applied to the piezoelectric element drops quickly, and the element contracts rapidly back to its initial position. At this time, the inertia force of the slider dominates the friction force, so the slider does not follow the friction rod completely and is only dragged back by a small displacement S1 to the left. The effective step in each cycle is ΔS = S0 − S1. The drive can achieve continuous motion to the right by continuously repeating this process.

Figure :Operation principle. (a) Driving principle of the stick-slip actuator. (b) Force analysis of the
stick-slip actuator. (c) The actual object of the stick-slip actuator.

Research shows that the modeling accuracy of piezoelectric stick-slip actuators is mainly determined by the following aspects: the hysteresis effect of the piezoelectric stack, the relationship between the driving voltage and the driving force, the model of the mechanical structure of the piezoelectric stick-slip actuator, and the friction model between the slider and the friction rod. Therefore, a representative integrated model of piezoelectric stick-slip actuators is discussed based on these factors in this chapter.

[h]Electromechanical coupling model for piezoelectric stick-slip actuators

The piezoelectric actuator part is an electromechanical coupling system. When a certain drive voltage is applied, the piezoelectric actuator generates a certain displacement and output force because of the inverse piezoelectric effect. The model of the piezoelectric actuator therefore needs to reflect the relationship between the driving voltage and both the deformation and the output force it generates, which are coupled with each other.

To quantitatively analyze a system, it is necessary to describe its dynamics through a mathematical model. This allows more information about the system to be captured, resulting in better system control. Adriaens et al. pointed out that if the piezoelectric positioning system is properly designed, a second-order approximate modeling approach can represent the dynamics of the system very well. Typically, it can be viewed as a simplified spring-damper-mass second-order system with a friction bar. Its linear dynamics is represented as

mẍ + cẋ + kx = Fp − Ff        (E1)

where x is the displacement of the slider, c and k denote the damping and stiffness of the piezoelectric drive stack, and m denotes the total mass of the piezoelectric stick-slip actuator. Fp is the output force of the piezoelectric stack, and Ff is the frictional reaction force between the slider and the friction rod.

Both hysteresis and creep effects occur when a voltage is applied across the piezoelectric stack. When the nonlinear characteristics of the system are not considered, the relationship between the input voltage and the output force of the piezoelectric actuator can be expressed as

Fp = Kh·u(t)        (E2)

where Kh is the conversion ratio from input voltage to output force and u(t) is the drive voltage applied to the piezoelectric driver.
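A minimal numerical sketch of this linear part of the model is given below: the second-order dynamics of Eq. (E1), driven by Fp = Kh·u(t) from Eq. (E2) with a sawtooth voltage, is integrated with a simple explicit Euler scheme. All parameter values are illustrative assumptions rather than identified values of a real actuator, and the friction force is neglected here.

import numpy as np

m, c, k = 0.02, 5.0, 2.0e5       # mass [kg], damping [N·s/m], stiffness [N/m] (assumed)
Kh = 1.0                         # voltage-to-force conversion ratio [N/V] (assumed)
dt, T = 1e-5, 0.02               # integration step and duration [s]

t = np.arange(0.0, T, dt)
u = 50.0 * (t % 0.005) / 0.005   # 200 Hz sawtooth drive voltage, 0-50 V (assumed)

x = np.zeros_like(t)             # displacement
v = np.zeros_like(t)             # velocity
Ff = 0.0                         # friction neglected in this linear sketch

for i in range(1, t.size):
    Fp = Kh * u[i - 1]
    a = (Fp - Ff - c * v[i - 1] - k * x[i - 1]) / m
    v[i] = v[i - 1] + a * dt
    x[i] = x[i - 1] + v[i - 1] * dt

print(f"final displacement: {x[-1] * 1e6:.3f} um")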

[h]Hysteresis model for piezoelectric stick-slip actuators

In the ideal case, the output displacement of the piezoelectric stick-slip actuator is linearly related to the input control voltage. Due to the inherent characteristics of the material, certain nonlinear effects such as hysteresis and creep occur in the actual driving process. However, because the creep effect is very small, it is mostly ignored, and the hysteresis effect is mainly considered in the existing literature. The hysteresis phenomenon refers to the non-coincidence of the displacement curves for rising and falling drive voltage applied to the piezoelectric driver. Using a hysteresis model H(t) to represent the force generated by the piezoelectric driver, the equation becomes

mẍ + cẋ + kx = H(t) − Ff        (E3)

Currently, hysteresis models can be broadly classified into physical and phenomenological models based on the modeling principle. Physical models are based on the physical properties of materials; the Jiles-Atherton model and the Ikuta-K model are common examples. However, because physical models depend on the physical properties of the specific material, their implementation is difficult, which limits the generality of physical hysteresis models. Phenomenological models are based on the input-output relations of hysteresis systems and are described using similar mathematical models. Based on the modeling approach, there are three broad categories: operator hysteresis models, differential equation hysteresis models, and intelligent hysteresis models.

[h]Operator hysteresis model

The common operator hysteresis models are the Preisach model, Prandtl-Ishlinskii (PI) model, and
Krasnosel'skii-Pokrovskii (KP) model.

The Preisach model was first used to describe the hysteresis phenomenon in ferromagnetic materials. It was then gradually extended to describe the hysteresis behavior of smart materials, such as piezoelectric ceramics and magnetically controlled shape memory alloys, and is one of the most frequently studied nonlinear hysteresis models. The model consists of an integral accumulation of relay operators over the Preisach plane. The relay operator is shown in Figure a, and the model's mathematical expression is given by

y(t) = ∬P μ(α,β)·γαβ[u(t)] dα dβ        (E4)

where u(t) and y(t) represent the input and output of the Preisach model, μ(α,β) is the weight function of the relay operator over the Preisach plane, γαβ represents the output of the relay operator, and α and β are the switching thresholds of the relay operator.

Figure :Operators. (a) Relay operator. (b) Play operator. (c) Stop operator.

Based on previous work, Li et al. proposed that a multilayer neural network can be used to approximate the Preisach model; any neural network training algorithm can then be used to identify the model. This makes the model more flexible in adapting to different working conditions than the conventional model. Later, Li et al. proposed a transformation operator for neural networks that can transform the multi-valued hysteresis mapping into a one-to-one mapping. By adjusting its weights, the neural network model can be made applicable to different operating conditions, which solves the drawback that the Preisach model cannot be updated online.

Both the PI model and the KP model evolved from the Preisach model. The PI model uses a single threshold and two continuous hysteresis operators, the play operator and the stop operator, which are mutually inverse, as shown in Figure b and c. Therefore, the inverse PI model can be derived using the stop operator, which makes it easy to design feedforward compensation controllers from the inverse model. The KP model uses a modified and improved play operator with a corresponding superimposed density function for hysteresis modeling. While the traditional play operator has only one threshold that determines its width and symmetry, the KP operator has two different thresholds, enabling it to describe more complex hysteresis nonlinear behavior.
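A minimal sketch of the play operator underlying the PI model is shown below: for a threshold r, the output only follows the input once the input leaves the dead band of width 2r around the previous output, and a PI model superposes several such operators. The thresholds and weights are illustrative assumptions.

import numpy as np

def play_operator(u, r, y0=0.0):
    # Discrete play (backlash) operator with threshold r applied to the signal u.
    y = np.empty_like(u)
    y_prev = y0
    for i, ui in enumerate(u):
        y_prev = min(max(y_prev, ui - r), ui + r)   # clamp previous output to [u-r, u+r]
        y[i] = y_prev
    return y

def pi_model(u, thresholds, weights):
    # Prandtl-Ishlinskii output as a weighted superposition of play operators.
    return sum(w * play_operator(u, r) for w, r in zip(weights, thresholds))

t = np.linspace(0.0, 2.0 * np.pi, 500)
u = np.sin(t)                                       # normalized input voltage (assumed)
y = pi_model(u, thresholds=[0.0, 0.1, 0.2, 0.3], weights=[0.5, 0.3, 0.15, 0.05])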

[h]Differential equation hysteresis model

The common differential equation hysteresis models include the Duhem model, the Bouc-Wen model, and the Backlash-like model. Duhem proposed the Duhem model in the form of a differential equation. The model was later improved by Coleman and Hodgdon and applied to describe the hysteresis behavior of piezoelectric ceramics. Its common mathematical expression is given by

ẋ = αD·|u̇|·[f(u) − x] + g(u)·u̇        (E5)

where x and u represent the output displacement and input voltage of the Duhem model, and αD is a model parameter of the Duhem model, which is a positive constant. The functions f(u) and g(u) determine the shape and performance of the input-output hysteresis curve of the Duhem model.

Although the Duhem model is also applied to describe the hysteresis of piezoelectric ceramics, its application in engineering is greatly limited by the difficulty of deriving its inverse model. Su et al. proposed a simplified dynamic hysteresis Backlash-like model based on the Duhem model. This model has fewer parameters than the Duhem model and is a first-order differential equation with the mathematical expression

ẋ = αB·|u̇|·[c·u − x] + βB·u̇        (E6)

where x and u represent the output displacement and input voltage of the Backlash-like model, and αB, c, and βB are constants.
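For illustration, Eq. (E6) can be integrated with a simple Euler scheme as in the sketch below; the parameter values and input signal are assumptions chosen only to make the hysteresis loop visible when x is plotted against u.

import numpy as np

alpha_B, c_B, beta_B = 1.0, 3.0, 0.5   # assumed Backlash-like model parameters
dt = 1e-3

t = np.arange(0.0, 4.0 * np.pi, dt)
u = np.sin(t)                          # normalized input voltage (assumed)

x = np.zeros_like(u)
for i in range(1, u.size):
    du = (u[i] - u[i - 1]) / dt
    dx = alpha_B * abs(du) * (c_B * u[i - 1] - x[i - 1]) + beta_B * du
    x[i] = x[i - 1] + dx * dt          # Euler step of Eq. (E6)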

The Bouc-Wen model was originally proposed as a differential equation by Bouc and was later refined by Wen into the current Bouc-Wen model. The classical Bouc-Wen model can describe a large class of hysteresis phenomena and has a concise expression, given as follows:

y(t) = k·v(t) − h(t)
ḣ(t) = α·v̇(t) − β·|v̇(t)|·|h(t)|^(n−1)·h(t) − γ·v̇(t)·|h(t)|^n        (E7)

where k denotes the scale factor from the system input to the output, and α, β, γ, and n denote the parameters of the hysteresis part of the model. The Bouc-Wen output y(t) consists of a proportional linear part k·v(t) and a hysteretic nonlinear part h(t).

The Bouc-Wen model is simple in form and has few identification parameters, which is convenient for controller design. However, it cannot completely describe the hysteresis characteristics of piezoelectric ceramics; its accuracy is low, and it is only applicable to single-frequency signals. The Bouc-Wen model has difficulty accurately describing the hysteresis phenomenon under signals of varying frequency. Therefore, the application of this model in practical engineering is greatly limited.
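Despite these limitations, the model is easy to simulate. The sketch below integrates Eq. (E7) with an explicit Euler scheme; all parameter values are illustrative assumptions, and plotting y against v reveals the hysteresis loop.

import numpy as np

k_bw, alpha, beta, gamma, n = 1.0, 0.6, 0.4, 0.2, 1   # assumed Bouc-Wen parameters
dt = 1e-3

t = np.arange(0.0, 4.0 * np.pi, dt)
v = np.sin(t)                                         # input voltage (normalized, assumed)

h = np.zeros_like(v)
for i in range(1, v.size):
    dv = (v[i] - v[i - 1]) / dt
    dh = (alpha * dv
          - beta * abs(dv) * abs(h[i - 1]) ** (n - 1) * h[i - 1]
          - gamma * dv * abs(h[i - 1]) ** n)
    h[i] = h[i - 1] + dh * dt                         # Euler step of the hysteretic state

y = k_bw * v - h                                      # Bouc-Wen output of Eq. (E7)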

[h]Intelligent hysteresis model

In addition to the above hysteresis models, there are some other classes of hysteresis models used for smart materials, such as neural network models, polynomial models, and other nonlinear models. Gan et al. proposed a polynomial model for the hysteretic nonlinearity of piezoelectric actuators; experimental results show that the proposed model has higher modeling accuracy than the conventional PI model. Cheng et al. proposed a nonlinear model predictive method. First, a multilayer neural network is used to identify a nonlinear autoregressive moving average model of the piezoelectric ceramics. Then, the tracking control problem is transformed into an optimization problem for model prediction. Finally, the Levenberg-Marquardt method is used to compute the numerical solution of the nonlinear minimization.

There are some other models as well. For example, Zhang et al. proposed a rate-dependent Rayleigh model to describe the hysteresis characteristics of a piezoelectric drive system; the parameters of the rate-dependent Rayleigh model were obtained and validated based on functional and experimental data. Li et al. proposed a simplified interval type 2 (IT2) fuzzy system for hysteresis modeling of piezoelectric drives; in the experiments, gradient-based and inverse identification methods are used to identify the IT2 fuzzy hysteresis model. Although these models are not as widely applied as the three major classes of models, they can often achieve good results when dealing with the hysteresis characteristics of specific situations.

[h]Selection of friction model

The output of the drive system is ultimately transferred to the slider in the form of friction, so the choice
of friction model will directly affect the accuracy of the stick-slip drive platform model. At present, with
the in-depth research of many international scholars on friction models, a variety of friction models have
been established, which can be broadly divided into two categories, static friction models and dynamic
friction models. Static friction models describe the friction force as a function of relative velocity. The
dynamic friction model describes the friction force as a function of relative velocity and displacement. In
contrast to static friction, which only considers the case where the relative velocity is not zero, the
dynamic friction model uses differential equations to refer to the case where the relative velocity speed is
zero. Therefore, in terms of accuracy, the dynamic friction model is more comprehensive and realistic
than the static model. However, in addition to the accuracy of the friction model, it is also necessary to
consider the complexity of the model, not all cases need to use the dynamic friction model.

[h]Static friction model

The most widely used static friction models can be broadly classified into the Coulomb family of models and the Stribeck model. Leonardo da Vinci was the first to discover that friction is related to the load of an object and constructed a model which considers the friction force to be proportional to the load and opposite to the direction of motion. This model was later improved and is called the Coulomb model, with the friction expression

Ff = Fc·sgn(v)        (E8)

where Ff is the friction force, Fc is the Coulomb friction force, and sgn(v) is the sign function.

In some studies, classical friction models have been used to represent the friction between the slider and the friction rod. The four static friction models commonly used in the early days are shown in the Figure. However, the slider step is only a few tens of nanometers to a few micrometers. The friction in this regime is determined by the pre-slip displacement, i.e., the small motion of the object before true sliding begins. When the friction surface is rough, the Coulomb friction model cannot accurately predict the friction force in the pre-slip regime. Experiments have shown that in the pre-slip domain the friction force depends on the micro-displacement between the two contacting surfaces. The Coulomb friction model does not capture this effect and results in a relatively large error in the system. Therefore, a more accurate friction model is needed to describe the friction between the drive block and the terminal output.

Figure :Four classical static friction models. (a) Classical Coulomb friction model. (b) Coulomb plus viscous friction model. (c) Static, Coulomb, and viscous friction model. (d) Stribeck model.

These models need to exhibit some important static and dynamic properties of friction, such as the Stribeck effect, Coulomb friction, viscous friction, and pre-slip displacement. Li et al. proposed the use of the Stribeck model, which was the first model to describe the transition process between static and dynamic friction. Its mathematical expression is as follows:

Ff = [Fc + (Fs − Fc)·e^(−|v/vs|^ςs)]·sgn(v) + b·v        (E9)

where Fs is the maximum static friction, Fc is the Coulomb friction force, b is the viscous friction coefficient, vs is the Stribeck velocity, and ςs is an empirical constant.

It not only reflects the linear relationship between dynamic friction and velocity but also expresses the
change of friction during the transition between dynamic and static friction. It lays the groundwork for
future research and the establishment of a dynamic friction model.
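Eq. (E9) can be evaluated directly, as in the short sketch below; the parameter values are illustrative assumptions, and plotting Ff over v reproduces the characteristic Stribeck dip between static and Coulomb friction.

import numpy as np

Fc, Fs = 1.0, 1.5        # Coulomb and maximum static friction [N] (assumed)
b = 0.4                  # viscous friction coefficient [N·s/m] (assumed)
vs, sigma_s = 0.05, 2.0  # Stribeck velocity [m/s] and empirical exponent (assumed)

def stribeck_friction(v):
    # Direct evaluation of Eq. (E9).
    return (Fc + (Fs - Fc) * np.exp(-np.abs(v / vs) ** sigma_s)) * np.sign(v) + b * v

v = np.linspace(-0.5, 0.5, 1001)
Ff = stribeck_friction(v)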

[h]Dynamic friction models

Dahl et al. proposed the Dahl model, which is a dynamic friction model. It describes for the first time the pre-slip displacement in a friction model, represented by the differential equation

dFf/dx = σ·[1 − (Ff/Fc)·sgn(v)]^α        (E10)

where x is the relative displacement, σ is the stiffness coefficient, and α determines the shape of the curve.

However, the model does not capture the variation of the static friction phase and does not explain the Stribeck phenomenon, so it is far from an adequate description of the stick-slip friction interface. Nevertheless, due to its simple and compact form, it provides a solid foundation for subsequent dynamic friction modeling.

Based on the study of the Dahl model, Canudas de Wit et al. proposed the LuGre model. Its governing equations are

ż = v − α0·|v|·z / g(v)
Ff = α0·z + α1·ż + α2·v        (E11)

where α0 denotes the stiffness coefficient of the elastic bristles, α1 denotes the damping coefficient, α2 is the viscous friction coefficient, v is the relative velocity of the surfaces, z is the mean deformation of the bristles at the friction surface, and g(v) describes the Stribeck effect.

The LuGre model introduced the Stribeck effect in addition to adopting the idea of pre-slip displacement from the Dahl model. The bristle concept was designed to address the changing behavior of the static friction phase. Swevers et al. proposed a dynamic friction model with a new structure: a non-local memory hysteresis function and the modeling of arbitrary transition curves were added on the basis of the LuGre model. This allows the model to accurately describe experimentally obtained friction characteristics, Stribeck friction during slip, hysteresis behavior, and stick-slip behavior. Since the structure of the obtained model is flexible, it can be further extended and generalized.
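A minimal simulation sketch of the LuGre equations in (E11) is given below: the bristle state z is integrated alongside a prescribed relative velocity profile. The parameter values and the chosen form of g(v) are illustrative assumptions.

import numpy as np

alpha0, alpha1, alpha2 = 1.0e5, 300.0, 0.4   # bristle stiffness, damping, viscous coefficient (assumed)
Fc, Fs, vs = 1.0, 1.5, 0.05                  # Coulomb friction, static friction, Stribeck velocity (assumed)
dt = 1e-5

def g(v):
    # Stribeck function used inside the LuGre model (assumed parameterization).
    return Fc + (Fs - Fc) * np.exp(-(v / vs) ** 2)

t = np.arange(0.0, 0.2, dt)
v = 0.02 * np.sin(2.0 * np.pi * 5.0 * t)     # prescribed relative velocity [m/s]

z = 0.0
Ff = np.zeros_like(t)
for i in range(1, t.size):
    dz = v[i] - alpha0 * abs(v[i]) * z / g(v[i])   # bristle state dynamics of Eq. (E11)
    z += dz * dt
    Ff[i] = alpha0 * z + alpha1 * dz + alpha2 * v[i]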

[h]Comprehensive model of piezoelectric stick-slip actuators


By combining the above models and considering other influencing factors, a comprehensive dynamics model can be obtained that accounts for the electrical model of the piezoelectric stick-slip actuator, the hysteresis effect, the linear dynamic behavior, and the friction characteristics of the system, as shown in the Figure:

Figure :Flowchart of a comprehensive model of a piezoelectric stick-slip actuator.

In the stick-slip drive system, the output force of the piezoelectric stack is first obtained from the change of voltage across the piezoelectric element. Then, the electromechanical conversion model of the drive transmission system, composed of the piezoelectric stack and the flexible transmission mechanism, converts this into the displacement and force output of the transmission system. Finally, the displacement is transferred to the slider by the friction conversion, and the final displacement output is obtained. In the equations of the integrated model, H(t) can be given by one of the hysteresis models above [i.e., (E4)–(E7)]; Ff can be given by one of the friction models [i.e., (E8)–(E11)]; xs is the backward displacement of the slider under the action of dynamic friction; and xe is the forward displacement of the slider.


Wang et al. developed a kinetic friction model of the actuator and investigated the effect of the input drive voltage on the stick-slip motion of the actuator through simulation. Nguyen et al. used the method of dimensionality reduction to describe the frictional contact behavior of a stick-slip microactuator; the model accurately predicts the frictional contact behavior of the actuator at different geometric scales without using any empirical parameters. Piezoelectric stick-slip actuators also have more complex dynamic characteristics, and the ability to simulate a wider range of dynamic characteristics is an increasingly active research direction. Shao et al. found that the contact behavior of the piezoelectric feed element produced an inconsistent displacement response, so the Hunt-Crossley contact model, the LuGre model, and a distributed parameter method combined with the Bouc-Wen hysteresis model were used to model the stick-slip actuator. This model can effectively reproduce the step inconsistency in the forward and backward directions of the actuator. Wang et al. proposed a stick-slip piezoelectric actuator dynamics model considering the overall system deformation. The model introduced stiffness and damping coefficients for the whole system and, for the first time, successfully simulated three single-step characteristics, namely backward motion, smooth motion, and a sudden jump. Due to the large number of parameters in the dynamic model, the accurate identification of each parameter is quite difficult. Therefore, more accurate identification of simulation parameters needs further research, which may be our future work.

Chapter 5: Control Schemes of Piezoelectric Stick-Slip Actuators

The piezoelectric stick-slip actuator adds mechanical structures, such as friction rods and linear crossed roller guides, to the piezoelectric stack. A sawtooth wave signal is applied to the piezoelectric stack to achieve stepping motion with large stroke and high precision. In practical applications, the traditional mechanical structure alone can no longer meet the real positioning accuracy requirements due to the complex nonlinear effects in the system and the influence of external disturbances. Therefore, intelligent control algorithms combined with computer hardware are usually introduced to eliminate or reduce the impact of these problems on motion accuracy. This section discusses existing controller design methods in two parts: open-loop control and closed-loop control.

Control schemes and hardware facilities for piezoelectric stick-slip actuators are briefly described in most of the literature. Depending on whether a closed loop is formed, the main categories are feedforward control, feedback control, and composite control combining feedforward and feedback. Inverse model-based control is mostly feedforward control, which is usually used to compensate for the hysteresis characteristics of piezoelectric stick-slip actuators. The control system is shown in the Figure, where yd is the expected input, vinv is the theoretical input corresponding to the expected output value, and y is the actual output value.


Figure :Feedforward control system principle based on inverse model.

Feedback control feeds back the actual measured displacement in real time through measurement equipment such as sensors, and the feedback value is one of the inputs of the controller. Feedback control can effectively improve the robustness of the control system; its structure is shown in the Figure, where v is the output of the feedback controller.

Figure :Feedback control system principle.

In actual control, simple feedforward control or feedback control alone cannot meet the control demand. Therefore, in most cases researchers combine the advantages of feedforward and feedback control, and a compound control scheme of feedforward plus feedback is usually applied to the position tracking control of piezoelectric stick-slip actuators. The control system is shown in the Figure:

Figure :Feedforward and feedback compound control system.
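A minimal sketch of such a compound scheme is given below: an assumed inverse-hysteresis feedforward term supplies most of the drive signal and a PI feedback term corrects the remaining tracking error. The gains, the placeholder inverse model, and the interface are illustrative assumptions, not a specific published design.

class PIController:
    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0

    def update(self, error):
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral

def inverse_hysteresis(y_desired):
    # Placeholder for an inverse hysteresis model (e.g. an inverse PI model);
    # a constant gain is used here only as a stand-in.
    return 10.0 * y_desired

def composite_control_step(y_desired, y_measured, feedback):
    # One control cycle: u = feedforward(yd) + feedback(yd - y).
    u_ff = inverse_hysteresis(y_desired)
    u_fb = feedback.update(y_desired - y_measured)
    return u_ff + u_fb

controller = PIController(kp=5.0, ki=50.0, dt=1e-3)
u = composite_control_step(y_desired=1.0, y_measured=0.92, feedback=controller)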

[mh] Conventional open-loop control of piezoelectric stick-slip actuators

Feedforward control is essentially an open-loop control. However, in the overall control of piezoelectric stick-slip actuators, the advantages of this control method are very limited. Many scholars have found that, since the voltage changes for each step displacement are very fast, there is almost no hysteresis or creep effect, and the use of feedforward control in this motion mode is not necessary.

Feedforward control is often used to improve the quality of the end motion output of piezoelectric stick-slip actuators. Holub et al. compensated for the piezoelectric hysteresis by varying the amplitude of the input voltage based on position errors and hysteresis modeling. Chang et al. compensated for hysteresis by changing the phase lag of the actuator.

Feedforward control of end-effectors is challenging, as the control accuracy of feedforward control is heavily dependent on the accuracy of the above model. The main difficulty comes from the many problems in the real system, including the hysteresis and creep of the piezoelectric element, the nonlinearity of the frictional motion, the vibration between the stick and slip points, the wear of the material between the mover and the stator, and other uncertainties.

Another feedforward approach used by some scholars is charge control to compensate for the hysteresis in the actuator. Špiller et al. developed a hybrid charge-controlled driver that generates a high-voltage asymmetric sawtooth wave and feeds it into the capacitive load to compensate for the piezoelectric hysteresis as well as to achieve a fast retraction. A simplified circuit diagram of the piezoelectric driver is shown in Figure a. This control method, which combines a charge control scheme with a switch, is an effective solution, and the proposed hybrid amplifier achieves better motion linearity, as shown in Figure b.
Figure :(a) A simplified circuit diagram. (b) Experimental result.

The control of piezoelectric stick-slip actuators is usually divided into a single-step control stage and a sub-step control stage, also known as step mode and scan mode. Step mode means that the piezoelectric stick-slip actuator moves forward with a fixed step size and a fixed frequency while it is far from the desired position. Once the error between the actual position and the desired position is less than the single-step displacement of the actuator, precise positioning is realized by controlling the elongation of the piezoelectric stack; this is called the scanning control stage. In stepping mode the voltage changes rapidly and there is almost no creep, and high control accuracy is not required. Therefore, in the control of stick-slip motion there are few cases of overall feedforward control, and feedforward control is usually implemented in the scanning control stage.
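A minimal sketch of this two-stage strategy is given below; the step size, gain, tolerance, and driver interface are illustrative assumptions.

STEP_SIZE = 1.0e-6   # assumed effective single-step displacement [m]
SCAN_GAIN = 0.8      # assumed proportional gain for the scan (fine) stage
TOLERANCE = 5.0e-9   # assumed positioning tolerance [m]

def position_control_step(target, position):
    # Return (mode, command) for one control cycle of a hypothetical driver.
    error = target - position
    if abs(error) > STEP_SIZE:
        # Step mode: issue one full sawtooth step in the direction of the error.
        return "step", STEP_SIZE if error > 0 else -STEP_SIZE
    if abs(error) > TOLERANCE:
        # Scan mode: command a proportional stack elongation within one step.
        return "scan", SCAN_GAIN * error
    return "hold", 0.0

print(position_control_step(target=25.0e-6, position=0.0))       # far away: ('step', 1e-06)
print(position_control_step(target=25.0e-6, position=24.7e-6))   # close: ('scan', ~2.4e-07)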
The core component of the piezoelectric stick-slip actuator is the piezoelectric stack, which moves through the inverse piezoelectric effect. Compared with the motion control of the piezoelectric stick-slip actuator as a whole, feedforward motion control of the piezoelectric stack actuator is more mature. Chen of Harbin Institute of Technology defined a new function, named the mirror function, which connects the dynamic hysteresis model with the classical Preisach model, and established a new dynamic hysteresis model to describe the input-output relationship of the piezoelectric actuator under different conditions. On this basis, a feedforward control scheme based on the inverse dynamic hysteresis model was designed.

In addition, Ha et al. experimentally identified the hysteresis parameters of the Bouc-Wen model and, on this basis, designed a feedforward compensator to compensate for the influence of the hysteresis nonlinearity; the simulation results of the compensator and the designed voltage waveform are given to realize the feedforward control of the piezoelectric stack. Wei et al. proposed a feedforward controller based on an improved rate-dependent PI hysteresis inverse model, which achieved the expected effect. In recent years, Zhang et al. also proposed a third-order rate-dependent Rayleigh model to describe the hysteresis nonlinearity of piezoelectric stacks and a feedforward control scheme based on the inverse of this model, whose effectiveness was verified through experiments. Feedforward control often plays an important role in hysteresis compensation. In future piezoelectric stick-slip drive controller designs, the existing piezoelectric stack feedforward control methods can serve as a reference for realizing feedforward control in the scanning stage. Simple feedforward control has poor robustness in application, so most researchers use compound control to improve the control accuracy.
[h]Conventional closed-loop control of piezoelectric stick-slip actuators

Piezoelectric stick-slip actuators are affected in practice by factors such as environmental vibration and their own nonlinear characteristics, and their controllability deteriorates. Therefore, appropriate closed-loop control methods are needed to meet the actual working requirements. Zhong et al. found that differences in surface roughness and wear can cause inconsistent velocities during the movement of a piezoelectric stick-slip actuator. Therefore, a double closed-loop controller for velocity and position was designed to achieve high-accuracy positioning; its principle is shown in the Figure. The experimental results show that the standard deviation of the speed is less than 0.1 mm/s and the repeated positioning accuracy reaches 80 nm, both of which represent a good control effect. The controller design includes a more accurate speed control scheme, which is of high reference value for realizing fast positioning of piezoelectric stick-slip actuators. Rong et al. introduced strain gauges as positioning sensors for a precision manipulator with piezoelectric stick-slip actuators and developed a displacement-prediction method based on them. A feedforward PID control method is used throughout the system to improve its dynamic performance. Figure a shows the simulated positioning performance of the displacement-prediction control (target displacement of 5.5 μm), and Figure b compares the displacement response under open-loop control with that under displacement-prediction control (target displacement of 20 μm). The experimental results show that the 200 nm steady-state error of the proposed control method is much lower than that of the open-loop control. Commonly used classical control methods have difficulty achieving high control accuracy in practical applications due to parameter limitations, weak self-adjustment, and poor robustness.

Figure :Double closed-loop control principle.


Figure :(a) Simulation of displacement-prediction control positioning performance (target displacement
of 20 μm). (b) Comparison between displacement response under the open-loop control and displacement
response under the displacement-prediction control (target displacement of 20 μm).
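
The sketch below shows one way such a velocity/position double closed loop can be arranged in code: an outer position PID produces a velocity setpoint, and an inner velocity PID adjusts the sawtooth drive frequency. This is a hedged illustration of the general structure, not the controller of the cited work; the gains, limits, and the mapping of the inner-loop output to drive frequency are assumptions.

class PID:
    """Minimal discrete PID with a clamped integrator and output limits."""
    def __init__(self, kp, ki, kd, dt, out_limits):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.lo, self.hi = out_limits
        self.integ = 0.0
        self.prev_err = 0.0

    def update(self, err):
        self.integ = min(max(self.integ + err * self.dt, self.lo), self.hi)
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        out = self.kp * err + self.ki * self.integ + self.kd * deriv
        return min(max(out, self.lo), self.hi)

dt = 1e-3
# Outer loop: position error [mm] -> velocity setpoint [mm/s]
pos_loop = PID(kp=50.0, ki=5.0, kd=0.0, dt=dt, out_limits=(-2.0, 2.0))
# Inner loop: velocity error [mm/s] -> sawtooth drive frequency [Hz]
vel_loop = PID(kp=400.0, ki=80.0, kd=0.0, dt=dt, out_limits=(0.0, 2000.0))

def control_step(target_pos, meas_pos, meas_vel):
    """One sample of the double closed loop (direction handling omitted)."""
    v_ref = pos_loop.update(target_pos - meas_pos)
    drive_freq = vel_loop.update(v_ref - meas_vel)
    return drive_freq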

Rakotondrabe et al. designed a micro-positioning device based on a stick-slip actuator. The control process is divided into a step mode and a scan mode, and the scan mode is precisely controlled by a PI controller. Theik et al. used an inertial piezoelectric actuator to suppress the vibration of a suspended handle and designed three controllers: a manually tuned PID, a self-tuning PID, and a PID-AFC controller. Experimental comparison showed that the PID-AFC controller achieved the best damping effect, and the larger the mass of the inertia block, the more pronounced the damping. These control methods are developed from classical control theory for the practical control of piezoelectric stick-slip actuators.

[h]Intelligent control of piezoelectric stick-slip actuators

By introducing intelligent control algorithms, such as sliding mode control and neural network algorithms, self-adjusting control of piezoelectric stick-slip actuators can be realized. Closed-loop control with feedback is a common control mode for piezoelectric stick-slip actuators; it can effectively compensate for the effects of hysteresis nonlinearity, complex friction relationships, and external interference on positioning accuracy, and it improves the robustness of the controller. Closed-loop control is mainly implemented in two ways. The first is control of the driving-signal voltage amplitude, which adjusts the single-step size of the piezoelectric stick-slip actuator. The other is control of the driving-signal frequency: by adjusting the frequency, the speed of the piezoelectric stick-slip actuator can be controlled. Cao et al. proposed a sliding mode control method based on a linear autoregressive model combined with proportional-integral-derivative control. It addresses the degradation of control performance caused by the hysteresis of the piezoelectric stack and by the nonlinear friction between the end-effector and the worktable. First, an ARX model of the system is designed and its state-space description obtained. Then sliding mode control is introduced, with PID control acting as the frequency-switching controller within the sliding mode scheme, so that the error tends to zero and better speed control is achieved.
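
To give a flavor of sliding mode control in this setting, the following sketch implements a basic sliding-mode law for an assumed nominal second-order model, with a boundary-layer (saturated) switching term to limit chattering. It is a generic illustration rather than the cited ARX-based scheme; the model coefficients and gains are assumptions.

import numpy as np

# Assumed nominal second-order model:  x1' = x2,  x2' = -a1*x2 - a0*x1 + b*u + d(t)
a0, a1, b = 0.0, 50.0, 800.0
lam, K, phi = 120.0, 40.0, 0.05   # surface slope, switching gain, boundary-layer width

def smc_step(x1, x2, r, r_dot, r_ddot):
    """One sliding-mode update: s = e_dot + lam*e with e = x1 - r,
    u = u_eq - K*sat(s/phi), where u_eq makes s_dot = 0 for the nominal model."""
    e, e_dot = x1 - r, x2 - r_dot
    s = e_dot + lam * e
    u_eq = (r_ddot + a1 * x2 + a0 * x1 - lam * e_dot) / b
    u_sw = -K * np.clip(s / phi, -1.0, 1.0)   # saturated switching term limits chattering
    return u_eq + u_sw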

In addition to embedding an explicit mathematical model in the controller, the controller can also be designed by applying a neural network algorithm for online model identification. Cheng et al. proposed a neural-network-based controller to reduce the effect of the complex nonlinearities between the end-effector and the driving object. The block diagram of the overall controller is shown in Figure. The control paradigm of piezoelectric stick-slip actuators is usually divided into two phases: the one-step control phase and the sub-step control phase. In the one-step control phase, the actuator is driven by continuous sawtooth-wave excitation; once the error between the desired and actual positions is smaller than the maximum single-step displacement, the controller switches to the sub-step control phase. In the experiments, the steady-state tracking error was kept within 50 nm, realizing ultra-precise motion control at the nanometer level.
Figure :The schematic of the overall controller in the sub-step control phase. ref(tk) is the desired reference of the end-effector; yef(tk) is the real displacement of the end-effector; ŷef(tk) is the estimated displacement of the end-effector; ŷ(tk) is the desired displacement of the driving object/PEA; v(tk) is the input voltage applied to the PEA; y(tk) is the real displacement of the driving object/PEA.
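
The two-phase paradigm described above can be summarized by a small piece of switching logic: keep issuing coarse sawtooth steps while the error exceeds one maximum step, then hand over to the fine (scanning-stage) controller. The sketch below only illustrates that logic; coarse_drive and fine_controller are hypothetical placeholders for the drive electronics and the chosen sub-step controller (PID, neural network, etc.).

def hybrid_positioning_step(error, max_step, coarse_drive, fine_controller):
    """One decision of the two-phase control paradigm.
    coarse_drive(): issue one sawtooth step (one-step phase, stick-slip stepping).
    fine_controller(error): scanning-stage command (sub-step phase).
    Returns True once the sub-step phase is active."""
    if abs(error) > max_step:
        coarse_drive()        # keep stepping: error still larger than one step
        return False
    fine_controller(error)    # within one step: fine, continuous positioning
    return True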

Oubellil et al. applied proportional control to the macro-motion control of nanorobots based on the piezoelectric stick-slip motion principle. Under macro-motion control, the amplitude and frequency of the sawtooth voltage signal are adjusted by proportional control. When switching to the scanning control mode, a Hammerstein dynamic model based on the PI hysteresis model is established, and an H∞ robust control scheme based on this model is designed. The hybrid stepping/scanning controller can effectively meet the multiple objectives of stability, robustness, hysteresis compensation, and accuracy for the nanorobot. Oubellil et al. also applied piezoelectric stick-slip actuators to the nanorobot system of a fast scanning probe microscope. To meet the fast-scanning requirements on closed-loop bandwidth and vibration reduction, an uncertainty model of the piezoelectric actuator was defined by a multi-linear approximation method, and a 2-DOF H∞ control scheme was designed to provide robust positioning performance for the nanorobot system. Fast and accurate positioning of the piezoelectric stick-slip actuator was thus realized.

Beyond its own characteristics, the model of a piezoelectric stick-slip actuator can also draw on modeling approaches developed for the piezoelectric stack; because the stack is structurally embedded in the actuator, the stack model can, in a sense, be regarded as a submodel of the actuator model. Likewise, control methods developed for the piezoelectric stack, such as model-based feedforward control, inverse control, sliding mode control, active disturbance rejection control, and various intelligent control methods, can be applied to the precise control of piezoelectric stick-slip actuators. Research on the piezoelectric stack is therefore also of reference value for the precise control of piezoelectric stick-slip actuators.

Sliding mode controllers often appear in the control of the nonlinear systems formed by piezoelectric stack actuators. Sliding mode control is an effective and simple way to deal with defects and uncertainties in a nonlinear system model, and it does not depend on an accurate mathematical model, which makes it popular in the nonlinear control of piezoelectric actuators. Li et al. proposed a sliding mode controller with disturbance estimation for piezoelectric actuators. The Bouc-Wen model is chosen to describe the input-output relationship of the piezoelectric actuator, and a particle swarm optimization algorithm is used for real-time identification of the model parameters. Considering external disturbances and the actuator's own uncertainties, adaptive control rules are introduced to adjust the controller parameters. Experimental results show that the proposed controller can significantly improve the transient response speed of the system. Mishra et al. designed a new continuous third-order sliding mode robust control scheme for a hinged piezoelectric actuator; to ensure the overall stability of the closed-loop system, a disturbance estimator was designed to counteract the effects of external disturbances and nonlinearities. Xu Q et al. proposed an enhanced model predictive discrete sliding mode control (MPDSMC) with a proportional-integral (PI) sliding mode function and a novel continuous third-order integral terminal sliding mode control (3-ITSMC) strategy.
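
The parameter-identification step mentioned above (fitting Bouc-Wen parameters with particle swarm optimization) can be sketched as follows. This is a plain, offline PSO fit over recorded input/output data, intended only to illustrate the idea; the search bounds, swarm settings, and the four-parameter Bouc-Wen form are assumptions rather than the cited authors' exact formulation.

import numpy as np

def simulate_bouc_wen(params, u, dt):
    """Simulate y = d_p*u - h with the Bouc-Wen hysteresis state h."""
    d_p, alpha, beta, gamma = params
    y = np.zeros_like(u)
    h, u_prev = 0.0, u[0]
    for k, uk in enumerate(u):
        du = (uk - u_prev) / dt
        h += dt * (alpha * d_p * du - beta * abs(du) * h - gamma * du * abs(h))
        y[k] = d_p * uk - h
        u_prev = uk
    return y

def pso_identify(u, y_meas, dt, n_particles=30, n_iter=100, seed=0):
    """Plain PSO over (d_p, alpha, beta, gamma) minimizing the mean squared error."""
    rng = np.random.default_rng(seed)
    lo = np.array([0.1, 0.0, 0.0, 0.0])     # assumed lower bounds
    hi = np.array([2.0, 1.0, 0.1, 0.05])    # assumed upper bounds
    x = rng.uniform(lo, hi, size=(n_particles, 4))
    v = np.zeros_like(x)
    pbest, pbest_cost = x.copy(), np.full(n_particles, np.inf)
    gbest, gbest_cost = x[0].copy(), np.inf
    for _ in range(n_iter):
        for i in range(n_particles):
            cost = np.mean((simulate_bouc_wen(x[i], u, dt) - y_meas) ** 2)
            if cost < pbest_cost[i]:
                pbest_cost[i], pbest[i] = cost, x[i].copy()
            if cost < gbest_cost:
                gbest_cost, gbest = cost, x[i].copy()
        r1, r2 = rng.random((n_particles, 4)), rng.random((n_particles, 4))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
    return gbest, gbest_cost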

Because of the hysteresis nonlinearity of piezoelectric actuators and the presence of system vibration and external disturbances, strong robustness is usually required of the controller. Wei et al. proposed a variable-bandwidth active disturbance rejection control method for piezoelectric actuators. The control method for the nanopositioning system is based on a cascade of the hysteresis model and the system structure. All uncertainties and disturbances not captured by the model are estimated by a time-varying extended state observer (TESO). A variable-bandwidth controller based on the control error is then designed. The control system is shown in Figure, where z1, z2, and z3 are the states of the time-varying extended state observer, b0 is an adjustable coefficient, and d is a disturbance. A series of experiments shows that the proposed controller has a faster response and stronger disturbance rejection than the traditional active disturbance rejection controller.

Figure :The variable bandwidth active disturbance rejection control.
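
A minimal sketch of the ADRC structure underlying such schemes is given below: a third-order linear extended state observer estimates the output, its derivative, and the lumped (total) disturbance, and the control law cancels the estimated disturbance. In a variable-bandwidth design like the one described above, the observer bandwidth omega_o would additionally be scheduled on the control error; here it is kept constant for simplicity, and all gains are illustrative assumptions.

class LinearESO:
    """Third-order linear extended state observer for a plant of the form
    y'' = f(y, y', d) + b0*u: z1 tracks y, z2 tracks y', z3 tracks the total disturbance f."""
    def __init__(self, omega_o, b0, dt):
        # observer gains from the common bandwidth parameterization
        self.l1, self.l2, self.l3 = 3 * omega_o, 3 * omega_o ** 2, omega_o ** 3
        self.b0, self.dt = b0, dt
        self.z1 = self.z2 = self.z3 = 0.0

    def update(self, y, u):
        e = y - self.z1
        self.z1 += self.dt * (self.z2 + self.l1 * e)
        self.z2 += self.dt * (self.z3 + self.l2 * e + self.b0 * u)
        self.z3 += self.dt * (self.l3 * e)
        return self.z1, self.z2, self.z3

def adrc_control(r, eso, kp, kd):
    """PD on the observer states plus cancellation of the estimated disturbance."""
    return (kp * (r - eso.z1) - kd * eso.z2 - eso.z3) / eso.b0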

Neural networks are widely used in the design of adaptive controllers for nonlinear systems because of their strong self-learning ability. In view of the system uncertainty and hysteresis nonlinearity of piezoelectric actuators, Li et al. proposed a neural network self-tuning control method. Two nonlinear function variables related to the hysteresis output are established, and two neural networks are introduced to identify these two hysteresis function variables online. Experiments verify that the neural network self-tuning controller achieves good trajectory-tracking performance. Napole et al. proposed a new method combining the super-twisting algorithm (STA) and an artificial neural network (ANN) to improve the tracking accuracy of a high-voltage stack actuator. Lin et al. proposed a dynamic Petri fuzzy cerebellar model articulation controller (DPFC) for a magnetic levitation system (MLS) and a two-axis linear piezoelectric ceramic motor (LPCM) drive system, used to control the position of the MLS metal ball and the trajectory tracking of the two-axis LPCM drive system. The experimental results also show that this method can obtain a high-precision trajectory-tracking response.
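
As a minimal illustration of online neural-network identification of an unknown nonlinearity (in the spirit of the self-tuning schemes above, though much simpler than the cited two-network design), the sketch below uses a radial-basis-function network whose weights are updated sample by sample with an LMS-style gradient step. All structural choices (centers, width, learning rate) are assumptions.

import numpy as np

class OnlineRBF:
    """Minimal radial-basis-function approximator with online weight updates,
    used here to identify an unknown scalar nonlinearity from streaming samples."""
    def __init__(self, centers, width, lr=0.05):
        self.centers = np.asarray(centers, dtype=float)
        self.width = width
        self.lr = lr
        self.w = np.zeros(len(self.centers))

    def _phi(self, x):
        return np.exp(-((x - self.centers) ** 2) / (2 * self.width ** 2))

    def predict(self, x):
        return float(self.w @ self._phi(x))

    def update(self, x, target):
        """One gradient step on the squared prediction error."""
        phi = self._phi(x)
        err = target - self.w @ phi
        self.w += self.lr * err * phi
        return err

# Example: learn an unknown mapping g(u) online from (input, measured output) pairs.
net = OnlineRBF(centers=np.linspace(0.0, 10.0, 21), width=0.5)
for u, y in [(1.0, 0.8), (2.5, 2.1), (4.0, 3.9)]:
    net.update(u, y)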

The neural network has a strong self-learning ability and can approximate complex nonlinear functions, so it plays an important role in the design of controllers for piezoelectric stick-slip actuators. In addition to neural network and sliding mode control, data-driven model-free adaptive control is also suitable for systems with model uncertainty. Model-free adaptive control (MFAC) is a typical data-driven control method, first proposed in Hou Z's doctoral thesis in 1994. Over the past two decades, the continuous development and improvement of its theory, together with successful practical applications in motor control, the chemical industry, machinery, and other fields, have made MFAC a control theory with a systematic and rigorous framework. Regarding the application of model-free control to the piezoelectric stack, Muhammad designed a data-driven feedforward controller and feedback controller. To avoid chattering caused by noise and to protect the convergence of the learning process, several parameter rules are also proposed. The experimental results show that the controller can realize high-precision position tracking at low frequency.
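
The core of compact-form MFAC can be sketched in a few lines: a pseudo-partial-derivative (PPD) is estimated from input/output increments only, then used in an incremental control law. The following is a minimal, textbook-style sketch assuming a single-input single-output loop; the tuning constants and reset rule are illustrative, and the cited data-driven feedforward/feedback design adds further parameter rules not shown here.

class CFDL_MFAC:
    """Compact-form dynamic-linearization MFAC: the pseudo-partial-derivative phi
    is estimated from I/O data only, then used in an incremental control law."""
    def __init__(self, eta=0.5, mu=1.0, rho=0.6, lam=0.1, phi0=1.0, eps=1e-5):
        self.eta, self.mu, self.rho, self.lam = eta, mu, rho, lam
        self.phi0, self.eps = phi0, eps
        self.phi = phi0
        self.u_prev = 0.0
        self.y_prev = 0.0
        self.du_prev = 0.0

    def step(self, y, y_ref_next):
        dy = y - self.y_prev
        # pseudo-partial-derivative estimation from the last input/output increments
        self.phi += (self.eta * self.du_prev / (self.mu + self.du_prev ** 2)) * \
                    (dy - self.phi * self.du_prev)
        # reset rule keeps the estimate well conditioned
        if abs(self.phi) < self.eps or abs(self.du_prev) < self.eps:
            self.phi = self.phi0
        # incremental control law
        du = (self.rho * self.phi / (self.lam + self.phi ** 2)) * (y_ref_next - y)
        u = self.u_prev + du
        self.du_prev, self.u_prev, self.y_prev = du, u, y
        return u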

Piezoelectric stick-slip actuators have great potential in the field of precision operation. However, both in experiments and in applications, the hysteresis nonlinearity of the piezoelectric stick-slip actuator, the complex frictional motion relationship in the driving mechanism, its vibration, and external disturbances greatly affect its motion control accuracy, so that the actuator cannot achieve the ideal output performance. In this paper, the modeling and control of piezoelectric stick-slip actuators are summarized and reviewed.

In terms of modeling, the existing mathematical models describing the hysteresis characteristics of piezoelectric stick-slip actuators and the mathematical models of the complex friction relationships in their structures are introduced. The hysteresis models mainly include the Prandtl-Ishlinskii (PI) model, the Krasnosel'skii-Pokrovskii (KP) model, the Preisach model, the Bouc-Wen model, and the Rayleigh model. For the friction model, the existing dynamic and static friction models are introduced. The model of a piezoelectric stick-slip actuator usually combines a hysteresis model and a friction model. The modeling part summarizes the mathematical models of piezoelectric stick-slip actuators proposed in the literature, providing a reference for the control and model analysis of these actuators.

In terms of control, this paper summarizes, for both open-loop and closed-loop control, the efforts made to improve control accuracy and reviews many control cases, such as feedforward control, sliding mode control, PID control, and neural network control. In the future development of piezoelectric stick-slip actuators, opportunities and difficulties coexist. Appropriate control methods can effectively compensate for the output performance limitations of piezoelectric stick-slip actuators and enable them to meet practical needs in various complex environments.

Building on this work, a more comprehensive dynamic model can be developed in the future by analyzing the characteristics of piezoelectric stick-slip actuators in depth and extending it to actuators with different mechanical structures. By combining various control methods to eliminate system nonlinearities, motion with higher accuracy and precision can be achieved. With the integration of intelligent control and piezoelectric actuation, piezoelectric stick-slip actuators will be applied in more fields in the future.
Preface

In the rapidly advancing landscape of robotics, the journey towards autonomy has emerged as a defining
quest, reshaping industries, augmenting human capabilities, and provoking profound questions about the
future of work and society. In this introductory exploration, we embark on a voyage into the heart of
autonomous robots, unraveling the intricate tapestry of mechanisms, sensors, actuators, and algorithms
that bestow these machines with the ability to perceive, reason, and act in the world.

At its essence, autonomy epitomizes the pinnacle of robotic evolution, endowing machines with a degree
of self-governance and decision-making prowess previously confined to the realm of science fiction. It
represents the convergence of diverse disciplines, melding together insights from mechanical engineering,
computer science, electrical engineering, and cognitive psychology to create a cohesive framework for
intelligent robotic systems.

Mechanisms serve as the bedrock upon which autonomous robots stand, encompassing a myriad of physical components ranging from joints and linkages to wheels and manipulators. These mechanical structures give form to robotic motion, enabling machines to traverse environments, manipulate objects, and interact with the world in meaningful ways.

Sensors act as the eyes and ears of autonomous robots, imbuing them with the capacity to perceive and
interpret their surroundings. From cameras and lidars to ultrasonic sensors and inertial measurement units,
these sensory apparatuses capture rich streams of data that serve as the raw material for perception
algorithms, enabling robots to navigate dynamic environments with dexterity and precision.

Actuators form the bridge between intention and action, translating computational commands into
tangible movements and exerting forces upon the physical world. Whether in the form of motors,
pneumatic systems, or shape-memory alloys, these actuators endow robots with the ability to execute
tasks with finesse, grace, and efficiency.

Algorithms serve as the cognitive engine driving the autonomy revolution, orchestrating the fusion of
sensory inputs, decision-making processes, and motor commands into a cohesive symphony of robotic
behavior. From classical control techniques to cutting-edge machine learning algorithms, these
computational constructs empower robots to adapt, learn, and evolve in response to changing
environments and tasks.

As we embark on this odyssey through the realm of autonomous robots, we are poised at the dawn of a
new era in human-machine collaboration, where the boundaries between the artificial and the natural blur,
and the possibilities for innovation and discovery are limited only by the bounds of our imagination.

About the book

In the exploration of autonomous robots, we delve into their core components—mechanisms, sensors,
actuators, and algorithms—pioneering a journey into the frontier of robotic autonomy. This quest
represents a fusion of disciplines, blending mechanical engineering, computer science, and cognitive
psychology to create intelligent robotic systems.

Mechanisms form the foundation of autonomous robots, comprising various physical elements like joints
and wheels that enable movement and manipulation. Sensors act as the sensory organs, providing robots
with the ability to perceive and interpret their surroundings through data streams from cameras, lidars,
and other devices. Actuators translate computational instructions into physical actions, allowing robots to
interact with the environment.

Algorithms serve as the brainpower behind autonomy, orchestrating sensory inputs, decision-making
processes, and motor commands to facilitate adaptive behavior and learning. This interdisciplinary
endeavor heralds a new era of collaboration between humans and machines, where boundaries between
artificial and natural realms blur, and innovation flourishes boundlessly.
