Design of Computer Vision and Robotics Learning Kit
Abstract
Computer vision is the branch of computer science that focuses on the understanding,
analysis, and processing of images and videos by computers. Along with robotics, computer
vision is used to automate tasks in the automotive, manufacturing, and numerous other industries,
giving rise to intelligent systems. This is especially useful as the need for increased automation
in warehouses and factories has risen steeply. A Python-based computer vision learning kit has
been created to detect colors and shapes within a workspace and sort the corresponding objects.
The application implements matrix operations and image processing algorithms, such as the
Hough circle transform, to perform object tracking and color and shape detection. A graphical user interface
was created to help the user start, stop, and monitor the computer vision process. Currently, the
application is being implemented on a three degrees of freedom pick and place robot. Design
details and results will be discussed in the paper.
Keywords
I. Introduction
Robotics is, without doubt, one of the engineering subjects that continues to bring
technology forward. From cleaning floors to handling packages, robots have taken considerable
importance in our society. With the fast growth of automation and artificial intelligence, robotics
is set to become the backbone of tomorrow’s world. To be able to compete with others in the
future, numerous countries are investing resources in robotics programs at every academic level,
especially at the high school and college levels. In the US, despite these efforts, exposure to
robotics has remained uneven across schools.
For the past two years, as part of its outreach efforts, the College of Engineering and
Computer Science at Arkansas State University (A-State) has been holding engineering summer
camps for area high school students on its main campus; the design and building of robots were
part of the various camp activities. One significant observation was the difference in the level of
students’ exposure to robotics, which ranged from a basic introduction to more advanced levels,
at which students had already worked with advanced kits and taken part in competitions. This
range of exposure showed that a wide gap exists among the area high school students, a gap
which needed to be addressed.
One of the ideas put forth was to design a hands-on project for the school students which
will enable them to learn and program a scaled-down robot incorporating different features. This
project was undertaken by two of the students as part of the graduate Intermediate Robotics class
project in Spring 2022.
II. Overview
The objective of this work was to design a robot kit that can be used to teach the basics
of computer vision with Python, applied to robotics, at the high school and college levels, while
at the same time providing critical hands-on experience with this type of design project. The learning tool consists
of a three-degrees-of-freedom robotic arm equipped with a camera for computer vision. Below
the camera is a workspace fixed to a panel, all mounted on a portable station. The Python
computer vision program runs on a Raspberry Pi 4, sending commands to an Arduino Uno board.
When assembled and programmed, the robot will be capable of sorting objects by color and
shape. After using this tool and going through the content, high-school-level learners are
expected to understand basic computer vision with Python. For users with a college-level
background, the takeaways go further: in addition to computer vision and image processing,
users will be introduced to inverse kinematics, robot arm kinematic diagrams, homogeneous
transformation, and basic user interface (UI) design. By increasing the number of degrees of
freedom or adding elements such as artificial intelligence and machine learning, one can increase
the level of difficulty and go into greater depth in the field of robotics.
Robotic arms (as in a basic pick-and-place robot) constitute one of the best starting points
for anyone to become familiar with robotics. It is envisaged that the kit and the documentation
will eventually become an open-source project, with a design that is easily replicable at a low
cost. After design optimization, the replication of the system is expected to cost approximately
five hundred dollars, with most of the cost going toward the boards (Raspberry Pi and Arduino)
and other electronic components. The goal is not simply to assemble the tool but to learn the
engineering aspects of the project, such as working with robotic arms and implementing a
computer vision algorithm, for different applications.
The following sections provide the details of the hardware, software, and system design,
which went into the building of this educational kit. Even though the project touches on multiple
topics, the area of interest will be computer vision. Each of the engineering aspects reflects the
individual steps necessary to replicate the assembly and programming of the arm. For each
engineering aspect/step, the key material learned by the student or user will be discussed, as well
as the additional material available to raise the complexity of the system to the next level. By the
end of the learning process, the students will be able to implement computer vision into the
system enabling the robot arm to sort objects based on color. In this project, the robot will sort
red and blue objects within a given workspace.
III. Hardware
The main component of the kit is the pick-and-place robot; here an existing three-
degrees-of-freedom articulated manipulator is used as the robotic arm[1], and most of the parts can
be 3D printed. With three degrees of freedom, students can learn about the basics of inverse and
forward kinematics. Figure 1 below shows the CAD of the pick-and-place robot used for the kit.
The motion of the robot is driven by three stepper motors, controlled by an Arduino, which give
feedback on their positions via limit switches. Objects are detected through a 1080p webcam
which feeds data to the Raspberry Pi 4. Arduino and Raspberry Pi boards are two of the most
widely used boards in university and DIY electronics projects. Moreover, the communication between
the two systems is well documented and accessible.
The motion of the robot is controlled by an Arduino Uno microcontroller via stepper
motors, which provide feedback on their positions with limit switches. Other electrical
components will be discussed in their respective sections. At this point in the learning process,
students will learn how to identify types of joints and draw kinematic diagrams for
various three-degrees-of-freedom arms, including the articulated manipulator, as shown in Figure 2.
To find the inverse kinematics equations of the robot, only basic trigonometry and geometry
are needed to get started with this part of the learning process. By projecting the kinematic
diagram of the robot onto the top and front planes, students deduce the relationships between the
joint angles and the members of the robot. These relationships will later be used during the
programming phase of the project. Figures 3 through 5 show how the projections are used to find
the inverse kinematics of the robot.
The first angle equation is derived by projecting the kinematic diagram on the x-y plane
and utilizing Pythagoras’ Theorem. Rt is the projection of the distance between the center of the
base and the end effector.
The two remaining angles are found by projecting the robot on the x-z plane. Rs is the
projection of the distance between the second joint and the end-effector. The expressions of the
angles omega and alpha are derived from the Pythagorean Theorem and the Law of Cosines,
respectively.
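To make these relationships concrete, the equations can be written out for a generic three-degrees-of-freedom articulated arm. The link-length symbols $L_1$, $L_2$, and $L_3$ and the target coordinates $(x, y, z)$ used below are illustrative assumptions rather than the exact notation of Figures 3 through 5:

$$\theta_1 = \operatorname{atan2}(y, x), \qquad R_t = \sqrt{x^2 + y^2}, \qquad R_s = \sqrt{R_t^{\,2} + (z - L_1)^2}$$

$$\omega = \operatorname{atan2}\!\left(z - L_1,\; R_t\right), \qquad \alpha = \cos^{-1}\!\left(\frac{L_2^{\,2} + L_3^{\,2} - R_s^{\,2}}{2\,L_2 L_3}\right)$$

Here $\theta_1$ is the base angle found in the x-y plane, while $\omega$ and $\alpha$ are the angles found in the x-z plane, from which the shoulder and elbow joint angles of the arm follow.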
Considering the robot geometry obtained with the above diagrams, the students can
design a robot workspace and station. In this case, the workspace was designed using
SolidWorks. Figure 6 below shows a top view of the robot workspace and station.
One important feature of the panel is the modularity of the components. Learners who
wish to increase the level of complexity can replace the robot with one that possesses more
degrees of freedom.
IV. Software
At this stage of the learning, the student will be introduced to the OpenCV[2] library for
Python. OpenCV is one of the most widely used computer vision libraries, with numerous
open-source projects built on it. As the learners advance, they will go from learning how to set up
a camera with OpenCV to writing various image processing algorithms. In addition to OpenCV,
students will be exposed to other important libraries such as NumPy[3] and Queue[4]. For people
getting started with programming, Python is considered one of the best languages. In addition to
its accessibility, the community around Python is large, and resources are available to
programmers of every level. With additional research, students, particularly college students, will
be able to expand the scope of this project and implement new algorithms and components.
IV.1 OpenCV
For non-college learners, this section is where most of their time will be spent.
Before starting to utilize OpenCV, students will get a brief introduction to the science
behind cameras and to existing color systems (RGB, BGR, HSV, grayscale, and black-and-white).
Moreover, the learners will understand how cameras can be used as sensors and how images are
turned into matrices. After getting an overview of computer vision, students will start writing
scripts and using basic functions such as “cv.VideoCapture” or “cv.imshow.”
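A minimal sketch of such a first script is shown below; it assumes a webcam at device index 0 and uses only standard OpenCV calls.

```python
import cv2 as cv

# Open the default webcam (device index 0 is an assumption; adjust if needed)
cap = cv.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()                 # grab one frame as a NumPy matrix (BGR)
    if not ok:
        break
    cv.imshow("Camera", frame)             # display the raw camera feed
    if cv.waitKey(1) & 0xFF == ord("q"):   # press 'q' to quit
        break

cap.release()
cv.destroyAllWindows()
```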
In the object detection portion, students will learn how to detect contours,
apply masks to images, and write algorithms to detect various shapes such as rectangles,
triangles, and circles. For this project, particular attention is given to the use of the Hough
Circles algorithm[5]. Figure 7 below shows a code snippet of how the Hough circle
algorithm is used.
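Since the snippet in Figure 7 is not reproduced here, the following is a minimal sketch of how such a call can look. The file name, blur kernel, and the dp, minDist, param, and radius values are illustrative assumptions that would be tuned to the actual workspace and camera height.

```python
import cv2 as cv
import numpy as np

# Load an image of the workspace (or take a frame from cv.VideoCapture)
frame = cv.imread("workspace.jpg")
gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
gray = cv.medianBlur(gray, 5)                   # reduce noise before the transform

# Detect circles with the Hough gradient method
circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, dp=1, minDist=40,
                          param1=100, param2=30, minRadius=10, maxRadius=60)

if circles is not None:
    for x, y, r in np.uint16(np.around(circles[0])):
        cv.circle(frame, (x, y), r, (0, 255, 0), 2)     # outline the detected circle
        cv.circle(frame, (x, y), 2, (0, 0, 255), 3)     # mark its center

cv.imshow("Detected circles", frame)
cv.waitKey(0)
```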
After learning how to apply different masks to a frame, students will use OpenCV
built-in functions to detect colors. In this case, the learners will manually
input the values of the colors they wish to detect, taken from color code tables and charts.
Subsequently, the students will learn how to extract values from various color systems such as
HSV and BGR with the help of OpenCV. Figure 8 shows a code snippet of color detection on a
camera frame.
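As a reference for the color detection step, the sketch below applies an HSV mask with cv.inRange; the HSV bounds shown are rough, commonly used ranges for blue and are an assumption, not the values from Figure 8.

```python
import cv2 as cv
import numpy as np

frame = cv.imread("workspace.jpg")              # illustrative input image
hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)      # convert from BGR to HSV

lower_blue = np.array([100, 100, 50])           # lower HSV bound for blue (assumed)
upper_blue = np.array([130, 255, 255])          # upper HSV bound for blue (assumed)

mask = cv.inRange(hsv, lower_blue, upper_blue)  # white where the pixel falls in range
blue_only = cv.bitwise_and(frame, frame, mask=mask)

cv.imshow("Blue objects", blue_only)
cv.waitKey(0)
```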
Students will get the basics of homogeneous transformation at this stage, using a rotation
matrix and a displacement vector to convert object coordinates from the camera frame to the robot frame.
Figure 9 below shows a top view of the workspace with the identified frames and dimensions.
Even though understanding matrix operations requires some knowledge of linear algebra,
students who are not familiar with it will still grasp the fundamentals of homogeneous transformation.
Moreover, they will be capable of identifying the elements in the homogeneous transformation
expression, which is given as
$$\begin{bmatrix} X_R \\ Y_R \\ Z_R \\ 1 \end{bmatrix} = H_C^R \begin{bmatrix} X_C \\ Y_C \\ Z_C \\ 1 \end{bmatrix}$$
where:
$X$, $Y$, and $Z$ are the object coordinates, with $R$ and $C$ designating the robot and camera frames;
$R_C^R$ is the rotation matrix that aligns the robot arm axes with the camera axes;
$T_C^R$ is the displacement vector, representing the distance from the center of the robot arm to the
center of the camera. Together, $R_C^R$ and $T_C^R$ make up the homogeneous transformation matrix $H_C^R$. $T_C^R$ is given by
$$T_C^R = \begin{bmatrix} X_C \\ Y_C \\ Z_C \\ 1 \end{bmatrix}$$
Referring to the robot geometry and the dimensions of the workspace in Figure 9, the
displacement vector is given by
$$T_C^R = \begin{bmatrix} -130 \\ 270 \\ 0 \\ 1 \end{bmatrix}$$
When familiar with the concepts, learners will move on to the coding aspect, which
consists of turning the homogeneous transformation into a Python script. Figure 10 below shows
a Python script of the homogeneous transformation used for the robot.
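Since the script in Figure 10 is not reproduced here, the following is a minimal sketch of what such a conversion can look like with NumPy. The rotation between the two frames depends on how the camera is mounted, so the angle used below is an assumption; only the displacement values (-130, 270, 0) come from the paper.

```python
import numpy as np

# Assumed rotation of the camera axes relative to the robot axes (180 deg about z);
# the actual angle depends on the camera mounting.
theta = np.pi
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
t = np.array([-130, 270, 0])   # displacement from the robot center to the camera center (mm)

# Build the 4x4 homogeneous transformation H_C^R from R and t
H = np.eye(4)
H[:3, :3] = R
H[:3, 3] = t

def camera_to_robot(point_camera_mm):
    """Convert an (x, y, z) point from the camera frame to the robot frame."""
    p = np.append(np.asarray(point_camera_mm, dtype=float), 1.0)  # homogeneous coordinates
    return (H @ p)[:3]

# Example: a detected circle center at (60, 40, 0) mm in the camera frame
print(camera_to_robot([60, 40, 0]))
```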
Once the circles have been detected and identified by color, students will proceed to write
a sorting algorithm and send the sorting sequence to the microcontroller (Arduino Uno board).
In the case of this project, the sorting algorithm will separate the blue and red circles by putting
them in their respective baskets (see 3D CAD). The circles will be sorted with respect to the
distances between their centers and the buckets’ reference points, as shown in Figure 11 below.
To generate the G-code sorting sequence, students will write a Python script that will
track the gripper’s position. After sorting the circles’ distances from shortest to longest, the G-
code will be generated by finding the difference between the x and y coordinates of the bucket
reference point and those of the closest circle. Figure 12 below shows a snippet of the G-code
generation. The code also includes G-code instructions to control the z-axis of the robot (to drop
and pick up circles).
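As a rough sketch of this step, the script below sorts the detected circle centers by distance to a bucket reference point and emits relative G-code moves. The coordinate values, the bucket reference point, and the assumption of relative positioning are all illustrative; the snippet in Figure 12 is not reproduced here.

```python
import math

bucket_ref = (-60.0, 220.0)                                # assumed bucket reference point (mm)
circles = [(10.0, 150.0), (-40.0, 180.0), (35.0, 200.0)]   # detected circle centers (mm, robot frame)

# Sort circles from the shortest to the longest distance to the bucket
circles.sort(key=lambda c: math.hypot(c[0] - bucket_ref[0], c[1] - bucket_ref[1]))

gripper = [0.0, 0.0]          # tracked gripper position (mm); relative moves are assumed
gcode = []
for cx, cy in circles:
    # Relative move from the current gripper position to the circle center
    gcode.append(f"G0 X{cx - gripper[0]:.1f} Y{cy - gripper[1]:.1f}")
    gcode.append("G0 Z-20")   # lower the end effector to pick up the circle (illustrative value)
    gcode.append("G0 Z20")
    # Move from the circle to the bucket reference point and drop the circle
    gcode.append(f"G0 X{bucket_ref[0] - cx:.1f} Y{bucket_ref[1] - cy:.1f}")
    gcode.append("G0 Z-20")
    gcode.append("G0 Z20")
    gripper = list(bucket_ref)   # the gripper now rests over the bucket

print("\n".join(gcode))
```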
Communication with the Arduino board is handled through a serial connection,
enabled by the PySerial library. The G-code is converted to strings with UTF-8 encoding. On
the microcontroller side, all the C++ files and libraries have been made available by the creator
of the arm. To fully understand this part of the project, students will need a good
knowledge of C or C++ programming. Learners lacking C or C++ knowledge will still have
the chance to go over the basics of Arduino programming, using stepper motors and microstep
drivers.
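A minimal sketch of this serial link is shown below; the port name, baud rate, and example commands are assumptions and would be matched to the actual Arduino setup.

```python
import time
import serial   # PySerial

# Example G-code lines; in the full project these come from the generation step above.
gcode_lines = ["G0 X40.0 Y-25.0", "G0 Z-20", "G0 Z20"]

# Port name and baud rate are assumptions; check the Arduino IDE for the real values.
with serial.Serial("/dev/ttyACM0", 115200, timeout=1) as arduino:
    time.sleep(2)                                     # let the Arduino reset after the port opens
    for line in gcode_lines:
        arduino.write((line + "\n").encode("utf-8"))  # send the command as a UTF-8 string
        reply = arduino.readline().decode("utf-8", errors="ignore").strip()
        print(line, "->", reply)                      # print any acknowledgement from the firmware
```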
V. User Interface
The user interface design for the robot is not a mandatory step, and at this point of the
project, students should be able to make the robot run and sort the objects on the platform.
Nevertheless, it is practical to have a user interface to monitor the number of objects in the
workspace and provide the user with the camera point of view. For this purpose, students will be
introduced to PyQt5[6], a free graphical user interface framework for Python. Learners will design a
simple UI with output labels and widgets. The goal is to have a label displaying the processed
camera feed from Python and to display information such as the number of blue and red circles, the
status of the robot, and the time it took for sorting. Figure 13 shows a screen capture of a sample
UI created with PyQt5 Designer.
The essential elements that the learners can take away are how to import a UI into Python
with the pyuic5 tool and how to use threads and signals in Python to display the camera feed on the
UI’s camera label.
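Since the actual UI file is produced with the designer, the sketch below uses a bare QLabel as a stand-in for the UI's camera label; the class names and the lambda slot are illustrative assumptions rather than the project's exact code. It shows the general pattern of a worker thread that emits frames through a signal.

```python
import sys
import cv2 as cv
from PyQt5 import QtCore, QtGui, QtWidgets

class CameraThread(QtCore.QThread):
    frame_ready = QtCore.pyqtSignal(QtGui.QImage)   # signal carrying one processed frame

    def run(self):
        cap = cv.VideoCapture(0)                    # default webcam (assumed index)
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            rgb = cv.cvtColor(frame, cv.COLOR_BGR2RGB)
            h, w, ch = rgb.shape
            # Copy so the QImage does not point at a freed NumPy buffer
            image = QtGui.QImage(rgb.data, w, h, ch * w,
                                 QtGui.QImage.Format_RGB888).copy()
            self.frame_ready.emit(image)

app = QtWidgets.QApplication(sys.argv)
label = QtWidgets.QLabel("Waiting for camera...")   # stands in for the UI's camera label
thread = CameraThread()
thread.frame_ready.connect(lambda img: label.setPixmap(QtGui.QPixmap.fromImage(img)))
thread.start()
label.show()
sys.exit(app.exec_())
```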
VI. Summary
A fully operational pick-and-place robot with three degrees of freedom was designed and
fabricated to incorporate computer vision. The system was successfully demonstrated: blue and
red discs were sorted into their respective groups and placed in their designated
locations. A short demo clip was posted on YouTube[7] to show how the system works. Various
stages of learning were listed with the objectives to be accomplished at each stage. The system
can be made more complex depending on requirements.
Prospective students will complete the various steps in the learning stages, after
which they will be capable of completing the final assembly of the robot and the base station.
Depending on the resources available, the station can be built with a variety of materials and
configurations. The prototype was built with an aluminum frame and laser-cut plexiglass sides.
The UI is displayed on a 20 in. monitor mounted to the frame. Figure 14 shows the final
product from different angles. It is hoped that this system will be introduced during our outreach
visits to high schools as well as in the engineering camps being planned. Another possible
application for this kit is to introduce this as a project to incoming engineering freshmen as part
of their first-year experience.
References
Dan Kilula
Dan Kilula is currently pursuing a Master of Science in Engineering at Arkansas State University
(A-State). After obtaining his BSME at A-State in 2021, he decided to continue his engineering
studies with an emphasis on robotics and computer vision.
Mohamed Adawi