(Kostanay, Kostanay State University named after A. Baitursynov) MACHINE VISION SYSTEM FOR MOBILE ROBOTS
Abstract. In modern robotics, the development of systems for spatial orientation and navigation of mobile robots remains one of the most relevant tasks. In an information-measuring and control system of a robot with remote control or autonomous operation, data analysis and formation of the control goal are based on sensory information, and up to 80% of this information is delivered through vision [1]. The aim of this work is to create several hardware and software options for building computer vision systems (CVS) for their further use as individual embeddable modules in other projects focused on different applications and control tasks. In this case, the CVS should solve the problems of identifying obstacles, measuring distances, creating terrain maps and forming the movement route of the robot to a specified goal. The computational capability of the onboard control system and the characteristics of the sensor cameras (TV, webcams, cameras in the IR range) set the main constraints in solving the navigation tasks of the robot. We examined several structures of CVS designed to solve problems of different complexity in mobile robot (MR) control. Conventionally, they can be related to three types of tasks: remote control with CVS, autonomous behavior of the robot, and the simplest tasks of controlling game robots.
Computer vision system for MR remote control
The vision system was built on an iMX-233 debugging board based on an ARM926EJ-S microcontroller operating at a frequency of 454 MHz. The board includes the following: DDR dynamic RAM with a capacity of 64 MB; flash memory with a capacity of 256 MB; I/O interfaces, USB 2.0, SD/MMC card, audio, video (analog), I2C, SPI and Ethernet 100M. Two Logitech C270 webcams are used in the system; the webcams are connected to the motherboard via the USB interface. The developed algorithms use libraries and functions of the OpenCV computer vision library. An algorithm was developed that determines the distance to points in space in front of the stereo camera and builds an approximate map of distances [2, 3]. The algorithm compares key points obtained using the SURF method on the left and right images. The real distance from the stereo camera to the point is calculated for all found pairs of points. For this purpose, the two webcams are rigidly fixed with respect to each other so that their central rays are parallel, and the cameras are calibrated with OpenCV. To create the map of distances, the edges of objects must be known, so the calculated distances are approximated by filling the area inside each contour. Borders are highlighted on the left image using the Canny edge detector [4].
Then, the image with edges is subtracted from the original image, and several subsequent transformations clarifying the edges of the images are carried out. The points obtained after the comparison are passed to the cvTriangulatePoints() function, which also takes the matrix parameters of the stereo cameras as input parameters. At the output, the matrix of distances to points is generated. The views from the left and right cameras are shown in Fig. 1. Edges of objects are marked on the image from the left camera.
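As an illustration of this step, a minimal sketch in Python is given below (not the authors' code): key points of the left and right images are matched and passed to the triangulation function of OpenCV. The projection matrices P1 and P2 are assumed to come from a prior OpenCV stereo calibration, and SURF requires the opencv-contrib build.

import cv2
import numpy as np

def distances_from_stereo_pair(img_left, img_right, P1, P2):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    kp_l, des_l = surf.detectAndCompute(img_left, None)
    kp_r, des_r = surf.detectAndCompute(img_right, None)

    # Match descriptors and keep only unambiguous pairs (Lowe's ratio test).
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_l, des_r, k=2)
    good = [m for m, n in matches if m.distance < 0.7 * n.distance]

    pts_l = np.float32([kp_l[m.queryIdx].pt for m in good]).T   # 2 x N
    pts_r = np.float32([kp_r[m.trainIdx].pt for m in good]).T   # 2 x N

    # Triangulate: homogeneous 4 x N result, converted to 3-D points.
    pts4d = cv2.triangulatePoints(P1, P2, pts_l, pts_r)
    pts3d = (pts4d[:3] / pts4d[3]).T                            # N x 3

    # Euclidean distance from the camera to every matched point.
    return np.linalg.norm(pts3d, axis=1), pts_l.T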
Fig. 1. Image from the left (a) and the right (b) cameras with the marked edges of objects
Fig. 2. Map of distances
After finding the key points and calculating the distances, the map of distances (Fig. 2) is filled. A lighter shade means a nearer contour, and a darker shade means a farther one. No key points were found in the black contours. A program for spatial visualization of the distance map was created using OpenGL and OpenCV facilities. The visualization of the distance map is shown in Fig. 3.
A redder area means a nearer contour, and a bluer area means a farther one. The advantages of the developed algorithm are the ability to detect obstacles, sufficiently accurate determination of distances to nearby objects, and good separation of objects with contrasting borders from the background.
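One possible way to fill the grayscale map of distances of Fig. 2 can be sketched as follows; this is an assumption about the implementation, where each detected contour already has an estimated distance (for example, the mean of the triangulated points falling inside it), nearer contours are drawn lighter, and contours without key points stay black.

import cv2
import numpy as np

def fill_distance_map(shape, contours, distances, d_max=5.0):
    dist_map = np.zeros(shape, dtype=np.uint8)
    for contour, d in zip(contours, distances):
        if d is None:          # no key points inside this contour
            continue
        shade = int(255 * max(0.0, 1.0 - min(d, d_max) / d_max))
        cv2.drawContours(dist_map, [contour], -1, color=shade, thickness=-1)
    return dist_map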
A USB-N10 Wi-Fi transceiver is connected to the controller board via a USB port. The transmitted data are encrypted with the WPA2 standard, supported in software by the wpa_supplicant package, whose eponymous process is launched in service mode. The video data processing is performed on the remote computer.
Fig. 3. Spatial visualization of distance map
Like many algorithms that process a stereo image from webcams, the implemented algorithm has significant drawbacks: edges between objects of similar brightness are poorly detected; the contours are located along a plane, so there is no three-dimensional image of objects; and there are a number of other disadvantages typical for video cameras.
Computer vision system based on structured light cameras
New opportunities in image processing appeared in CVS based on structured-light cameras [5, 6].
Cameras of this type include the MS Kinect and ASUS Xtion Pro Live sensor cameras. The cameras work on the basis of technology developed by the PrimeSense company [7]. Such a camera combines a projector that emits light in the IR range in the form of a pseudo-random pattern, a specially calibrated monochrome CMOS sensor that captures the resulting picture, a color RGB camera and a microphone array. The IR camera is used for receiving data about distance, and the RGB camera is used for telecontrol of the robot. The distance is determined by the distortion of the known radiated pattern in the resulting picture. The distance calculation is performed on a controller built into the camera.
At the same time, the initial data are represented by a three-dimensional array of points, which must be transformed for particular purposes, for example, object recognition or reconstruction of surfaces.
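A minimal sketch (not the authors' code) of how the depth image of a structured-light camera turns into such a three-dimensional array of points uses the pinhole camera model; the intrinsics fx, fy, cx, cy are assumed to come from the camera driver, and the depth is in millimetres, as OpenNI reports it.

import numpy as np

def depth_to_points(depth_mm, fx, fy, cx, cy):
    h, w = depth_mm.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_mm.astype(np.float32) / 1000.0        # metres
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.dstack((x, y, z)).reshape(-1, 3)
    return points[points[:, 2] > 0]                 # drop pixels with no depth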
Most of the methods used at the various stages of transformation have a computational complexity that grows rapidly with the number of points in the processed data. To ensure the execution of navigation tasks in a reasonable time, the data received from the structured-light camera must be pre-filtered to reduce the number of points without loss of information about the obstacles. In the developed algorithm [8] of image processing for obstacle identification, the following order of basic operations is suggested: deletion of unnecessary points, noise reduction, reduction of the density of the cloud, allocation of the principal planes, construction of descriptors of point clouds, classification of objects, and estimation of the distance to the object. A layout of the CVS was developed on the basis of the suggested method; the layout consists of the MS Kinect camera and a laptop with an Intel SU7300 processor at a clock frequency of 1.3 GHz [8]. Experimental studies were carried out to recognize the following types of typical obstacles: three-dimensional objects of simple geometric shapes, a doorway, and stairs; the purpose was to measure the speed and accuracy of recognition of various objects. The time spent on individual stages of the algorithm when processing each image was measured during the experiments. The total processing time per image did not exceed 0.7 s, and the accuracy of recognition was not less than 0.65. A structured-light camera is a powerful software and hardware complex of optical spatial perception, which is most effective for small mobile robots, where the physical sizes of the sensor cameras and onboard control units are highly important.
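The "allocation of the principal planes" step from the order of operations above can be illustrated with a simple RANSAC plane fit over the reduced cloud; in practice the authors rely on ready-made point cloud tools, so this NumPy version only shows the idea, and the threshold value is an assumption.

import numpy as np

def ransac_plane(points, n_iter=200, threshold=0.02, seed=0):
    rng = np.random.default_rng(seed)
    best_inliers = np.array([], dtype=int)
    best_model = None
    for _ in range(n_iter):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue                                  # degenerate sample
        normal /= norm
        d = -normal.dot(sample[0])
        dist = np.abs(points.dot(normal) + d)         # point-to-plane distance
        inliers = np.where(dist < threshold)[0]
        if len(inliers) > len(best_inliers):
            best_inliers, best_model = inliers, (normal, d)
    return best_model, best_inliers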
Currently, low-power computing units include systems with single-core processors with frequencies up to 1 GHz, a small amount of RAM (512 MB) and no discrete graphics accelerator. Based on these restrictive parameters, the components of a CVS for small robots were selected; in particular, the structure of the system included an ASUS Xtion Pro Live camera, which is smaller than the MS Kinect, a Raspberry Pi Model B single-board computer and a Wi-Fi radio module. All devices are connected via the USB interface. The computer runs the Debian Linux OS.
Open-source libraries were used for image processing: the ROS Groovy Galapagos library for solving typical robotic tasks and the PCL library for working with point clouds.
Algorithms of obstacle detection were developed that use a sequence of operations with a three-dimensional point cloud similar to the one in the CVS layout based on the MS Kinect camera. However, in this case, all transformation stages are performed on the Raspberry Pi board computer. One of the transformation stages is illustrated in Fig. 4.
Fig. 4. The camera image (a) and its corresponding point cloud (b)
Density is reduced at the first stage of transformation of the initial point cloud. A voxel grid is used in order to reduce the number of points without distortion of objects. The voxel grid is built on the cloud and all points within each cell are approximated by their centroid. This method is slightly slower than the approximation by the cell center, but it helps to avoid distortion.
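The voxel-grid reduction described above can be sketched as follows: every point is assigned to a cell of the grid, and all points of a cell are replaced by their centroid. The cell size used here is an assumed value, not the one used in the working system.

import numpy as np

def voxel_grid_centroids(points, leaf_size=0.05):
    voxel_idx = np.floor(points / leaf_size).astype(np.int64)
    # Group points that fall into the same voxel and average them.
    _, inverse = np.unique(voxel_idx, axis=0, return_inverse=True)
    n_voxels = inverse.max() + 1
    sums = np.zeros((n_voxels, 3))
    counts = np.zeros(n_voxels)
    np.add.at(sums, inverse, points)
    np.add.at(counts, inverse, 1)
    return sums / counts[:, None]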
The following stages of transformation of the point cloud depend on the set goal. Two variants of the algorithm were implemented. The first algorithm solves the problem of obstacle detection.
The capabilities of the depth map were used to measure the distance to nearby objects, with restrictions on the height and distance of objects and the volume of the classifier. In the second algorithm, a map of the surrounding terrain and the route of robot movement are built from the data of the pre-reduced point cloud. Work with the camera is supported by the freely distributed OpenNI2 driver.
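A hedged sketch of the first variant is given below: after the cloud is reduced, the points are cropped by the height and range limits mentioned above, and the nearest remaining point gives the distance to the closest obstacle. The concrete limits here are assumptions, not the authors' values.

import numpy as np

def nearest_obstacle_distance(points, min_h=0.05, max_h=1.5, max_range=4.0):
    # points are (x, y, z) in the camera frame, y pointing down, z forward.
    height = -points[:, 1]                 # rough height relative to the camera axis
    in_band = (height > min_h) & (height < max_h) & (points[:, 2] < max_range)
    obstacles = points[in_band]
    if obstacles.size == 0:
        return None                        # nothing within the restrictions
    return float(np.min(np.linalg.norm(obstacles, axis=1)))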
Computer vision system for small mobile robots
The hardware base of the CVS has a minimum set of technical facilities: a Raspberry Pi Model B single-board microcomputer and a Raspberry Pi Camera (a specialized video camera for the Raspberry Pi).
In addition to standard software, the vision system includes Raspbian (a free operating system for the Raspberry Pi built on the basis of Debian) and the OpenCV libraries; the following software was also used:
- MMAL (a framework that provides an interface for interacting with the camera module);
- raspicam (a C++ library that allows the camera to be controlled programmatically and used in conjunction with OpenCV).
The working principle of the CVS is based on recognition of patterns in images obtained from the video camera in real time. The system allows the robot to follow a marker and control the distance to it. The marker is highlighted in the image on the basis of color and shape. A ball painted in a single color is used as the marker in order to simplify the algorithm. The use of such a marker guarantees the invariance of the contour and area of the marker in the received image in any projection, provided that the distance to the marker is unchanged.
The algorithm controlling the robot's movement to the marker consists of three main sequentially executed stages. The most time-consuming subtasks are reading a shot from the camera and further processing of the received shot in order to find the marker in it. The third subtask is the formation of commands to control the motors, and it does not require complex calculations.
Reading a shot from the camera. The speed and accuracy of the robot's reaction to a change in the marker position, and the volume of resources spent on processing the video stream, depend on the frame rate of the video stream.
Initially, a simple Logitech C270 webcam was used instead of the specialized camera for the Raspberry Pi.
However, during image processing the load on the processor never dropped below 95% at an input video stream resolution of 320x240 pixels and a frequency of 24 shots per second. The high load was due to the fact that the video stream was decoded and processed only by the processor of the Raspberry microcomputer.
The Logitech C270 camera uses the Motion JPEG compression format, which is not supported by the GPU co-processor of the microcomputer.
This problem is solved by connecting the Raspberry Pi Camera [10]. The Raspberry video accelerator supports hardware-accelerated decoding of video streams in the H264, MPEG-1 and MPEG-2 formats. The use of hardware acceleration made it possible to significantly reduce the load on the central processor and to increase the speed: video is decoded at a resolution of 640x480 points and a frequency of 29 shots per second, with a processor load of less than 70%.
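A minimal capture loop is sketched below, using the Python picamera interface as a stand-in for the C++ raspicam library named above; it delivers BGR frames straight to OpenCV at the 640x480 resolution mentioned in the text (the frame rate here is rounded and the loop body is a placeholder).

import cv2
from picamera import PiCamera
from picamera.array import PiRGBArray

camera = PiCamera(resolution=(640, 480), framerate=30)
raw = PiRGBArray(camera, size=(640, 480))

for frame in camera.capture_continuous(raw, format="bgr", use_video_port=True):
    shot = frame.array                    # numpy BGR image for OpenCV
    # ... marker search goes here ...
    raw.truncate(0)                       # reset the buffer for the next shot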
Processing of the received shot: search for the marker. Image processing can be divided into several stages: conversion of the shot into an easily recognizable color model; applying a color filter to the shot; highlighting the contour of the marker; and clipping of objects of the same color but different shape.
The marker in the shot is searched for using the facilities of the OpenCV library. In accordance with the proposed algorithm, color filtering of the processed shot is performed first; then contours in the shot are searched for. The contour that meets the definition of a marker, that is, a circle, is highlighted. The ratio of the perimeter and area of a circle is used to check the shape of the contours and detect circles: for an ideal circle 4πS / P² = 1,
where S is the area of the contour in pixels and P is the perimeter of the contour in pixels.
Each found contour is checked for compliance with this condition. There are other, more flexible and accurate algorithms for finding figures according to their shape, but they require high computing power, which would affect the reaction time of the robot. At this stage, the coordinates of the center of the found contour are also calculated in order to determine the direction of motion of the robot.
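The contour check can be sketched as follows (a minimal version, not the authors' code): the area-to-squared-perimeter ratio of each contour is compared with that of an ideal circle, and the center of an accepted contour is taken for steering; the tolerance and minimum-area values are assumptions.

import cv2
import numpy as np

def find_marker(mask, tolerance=0.2):
    # OpenCV 4.x signature: (contours, hierarchy).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        S = cv2.contourArea(c)
        P = cv2.arcLength(c, True)
        if S < 50 or P == 0:
            continue                                  # too small or degenerate
        if abs(4 * np.pi * S / (P * P) - 1.0) < tolerance:
            m = cv2.moments(c)
            cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
            return c, (int(cx), int(cy))              # contour and its center
    return None, None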
The choice of the color model on which marker recognition is based deserves special attention. Selection of the optimal color model becomes crucial when the primary criterion of the object search is color. The use of the RGB color space is inefficient for this type of search. When the object may be illuminated with varying intensity, it is impossible to define ranges for the components of the RGB space such that each component is limited to an integral interval. The nonlinearity of this color space does not allow certain color intervals to be allocated over the inseparable components R, G and B. Because of this feature of RGB, defining the marker by color alone is unreliable: if the illumination changes, the object is no longer tracked by the CVS. In addition, the color represented in the RGB space depends on the shooting device, which makes it impossible to synchronize the ranges of color space components across different cameras. Similar problems were identified when using the HSV color space.
It was established experimentally that the optimal color spaces for the given task are Luv, Lab and YCbCr, which do not have the above disadvantages of the RGB and HSV color spaces. At the same time, these spaces are not universal, and different tasks require different approaches. In the working version of the CVS the Lab space is used, as it showed the best results in a number of practical tests. Fig. 5 illustrates the shot obtained after processing when using the Lab color space. The original shot is shown in Fig. 7, b. As a result of color filtering, the shot shown in Fig. 5 has a depth of 1 bit; regions belonging to the required range of colors are indicated by white pixels. After the search for closed areas in the shot of Fig. 5, a container of regions is formed.
Fig. 5. Color filtering of the shot
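A sketch of this filtering stage in the Lab space chosen above is given below: the shot is converted from BGR to Lab and thresholded into a 1-bit mask in which pixels of the marker color are white. The threshold bounds are placeholder values, not the calibrated range of the working system.

import cv2
import numpy as np

def colour_filter_lab(shot_bgr, lower=(20, 150, 120), upper=(255, 200, 180)):
    lab = cv2.cvtColor(shot_bgr, cv2.COLOR_BGR2LAB)
    mask = cv2.inRange(lab, np.array(lower, np.uint8), np.array(upper, np.uint8))
    # Optional clean-up of speckle noise before the contour search.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    return mask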
To search for closed areas, the edge-detection function implemented in the OpenCV library, the Canny edge detector, is used. The result is an array of contours, among which the objects required for the specific task are searched for. It is possible to search for several contours and to analyze their position relative to each other.
The formation of the control commands of the motors. This subtask contains the algorithm of formation of the control commands for the motors of the robot, and it affects the adequacy of the device's response to the detection of the object. A block diagram of the algorithm is shown in Fig. 6.
Fig. 6. Block diagram of motion control of the robot
The algorithm described above runs on the Raspberry Pi, which transmits control commands using a serial UART port to the drive control unit.
The conditions are based on the analysis of the visible area of the marker (the number of pixels), which is calculated in the previous step. The motion of the robot is directed toward the center of the object. The robot stops rotating toward the object only when the center of the object is located in a window of the shot set in advance (Fig. 7, a). The smaller the window, the more precise the positioning of the robot. The minimum possible size of this window depends on many factors, but mainly on the smoothness of the robot's motion and the quality of the image received from the camera.
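A hedged sketch of the command-formation logic of Fig. 6 is given below: the marker position relative to a central window selects the turning direction, its visible area selects forward or backward motion, and the command is then sent over the UART as described above. The command bytes, window size and area limits are assumptions for illustration only.

import serial

uart = serial.Serial("/dev/ttyAMA0", baudrate=9600)

def form_command(cx, area, frame_width=640, window=80,
                 area_near=9000, area_far=4000):
    centre = frame_width // 2
    if cx is None:
        return b"S"                       # marker lost: stop and wait
    if cx < centre - window // 2:
        return b"L"                       # object on the left: turn left
    if cx > centre + window // 2:
        return b"R"                       # object on the right: turn right
    if area < area_far:
        return b"F"                       # object far: move forward
    if area > area_near:
        return b"B"                       # object too close: move back
    return b"S"                           # inside the window and at range: stop

uart.write(form_command(cx=320, area=6000))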
Fig. 7. The analysis of the position of the marker (a) and object detection (b)
Conclusion
A prototype of a mobile robot on a wheeled platform with an installed computer vision system was created in several hardware and software variants. Video cameras and infrared structured-light cameras were examined as sensor elements of the CVS. The hardware platform was also implemented in several variants: based on 32-bit controllers with ARM9 and Cortex-M4F architectures and on the Raspberry Pi Model B single-board computer. The experiments showed that the limited computational power of the onboard systems in most cases allows only remote or supervisory control of the robot. Complex tasks of navigation and route planning of robot motion in a nondeterministic environment in real time can be solved only with multicore processors, including graphics accelerators and DSP processors.
Layouts of the CVS were developed for experimental verification of the developed algorithms of image recognition, distance measurement and route planning. The developed vision systems will make it possible to develop and test new algorithms of navigation and control of mobile robots with the most promising sensor controllers.
REFERENCES
[1] Andreyev V.P., Kirsanov K.B., Kostin A.V., et al. Mobile technological robots and simulators: integrative software of group interaction // Information-measuring and control systems. No. 4. Moscow, 2013. pp. 74-79.
[2] Bykov S.A., Yeremenko A.V., Gavrilov A.V., Skakunov V.N. Adaptation of computer vision algorithms to control systems of walking machines // The news of Volgograd State Technical University. Actual problems of control, computer science and informatics in technical systems: interuniversity collection of scientific articles / VolgSTU. Volgograd, 2011. Edition 10, No. 3. pp. 52-56.
[3] Bradski G., Kaehler A. Learning OpenCV. Sebastopol, CA, USA: O'Reilly, 2008. 555 p.
[4] Andreev A., Zhoga V., Serov V., Skakunov V. The Control System of the Eight-Legged Mobile Walking Robot // 11th Joint Conference on Knowledge-Based Software Engineering (JCKBSE), 17-20 September 2014, Volgograd, Russia. Knowledge-Based Software Engineering, Communications in Computer and Information Science, Vol. 466, 2014. pp. 383-392.
[5] Zhoga V., Gavrilov A., Gerasun V., Nesmianov I., Pavlovsky V., Skakunov V., Bogatyrev V., Golubev D., Dyashkin-Titov V., Vorobieva N. Walking Mobile Robot with Manipulator-Tripod // Proceedings of Romansy 2014 XX CISM-IFToMM Symposium on Theory and Practice of Robots and Manipulators. Series: Mechanisms and Machine Science, Vol. 22. Springer International Publishing Switzerland, 2014. pp. 463-471.
[6] Sanmartín G. et al. PrimeSense sensors as a low-cost method for motion capture on clinical tests // Spanish Computer Graphics Conference. The Eurographics Association, 2012. pp. 133-136.
[7] Bykov S.A., Leontev V.G., Skakunov V.N. Application of the method of analysis of three-dimensional point clouds in computer vision systems of robots // The news of Volgograd State Technical University. 2012. Vol. 4, No. 13. pp. 37-41.