Robot Hearing and Vision

Machine hearing involves the detection of acoustic waves, along with amplification and analysis of the resulting audio signals. Machine vision involves intercepting visible, infrared (IR), or ultraviolet (UV) radiation and translating this energy into electronic images. Machine hearing and vision can allow robots to locate, and in some cases classify or identify, objects in the environment.

Binaural Hearing

Even with your eyes closed, you can usually tell from which direction a sound is coming. This is because you have binaural hearing. Sound arrives at your left ear with a different intensity, and in a different phase, than it arrives at your right ear. Your brain processes this information, allowing you to locate the source of the sound, with certain limitations. If you are confused, you can turn your head until the direction becomes apparent.
 
Robots can be equipped with binaural hearing. Two acoustic transducers are positioned, one on either side of the robot’s head. A microprocessor compares the relative phase and intensity of signals from the two transducers. This lets the robot determine, within certain limitations, the direction from which sound is coming. If the robot is confused, it can turn until the confusion is eliminated and a meaningful bearing is obtained. If the robot can move around and take bearings from more than one position, a more accurate determination of the source location is possible if the source is not too far away.
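As a concrete illustration, the following Python sketch estimates a bearing from the interaural time difference between two microphone signals, using cross-correlation to find the lag between the channels. The microphone spacing and sample rate here are assumed example values, not part of any particular robot design.

import numpy as np

SPEED_OF_SOUND = 343.0   # m/s in air at room temperature
MIC_SPACING = 0.20       # meters between the two "ears" (assumed)
SAMPLE_RATE = 44100      # samples per second (assumed)

def bearing_from_itd(left: np.ndarray, right: np.ndarray) -> float:
    """Estimate the bearing of a sound source, in degrees off center,
    from the interaural time difference between two microphones."""
    # Cross-correlate the channels; the peak offset is the lag in samples.
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    delay = lag / SAMPLE_RATE                     # seconds
    # Far-field geometry: path difference = spacing * sin(bearing).
    ratio = np.clip(SPEED_OF_SOUND * delay / MIC_SPACING, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))

Note that the arcsine covers only the front half plane; a source directly behind the robot gives the same reading as one directly ahead. This is the mathematical face of the confusion mentioned above, and turning the head (or the robot) resolves it.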

Visible-Light Vision

A visible-light robotic vision system must have a device for receiving incoming images. This is usually a charge-coupled device (CCD) video camera, similar to the type used in home video cameras. The camera produces an analog video signal, which an analog-to-digital converter (ADC) renders in digital form. The digital signal is then cleaned up by means of digital signal processing (DSP). The resulting data goes to the robot controller.
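The chain from camera to controller can be sketched in a few lines of Python with the OpenCV library. On modern hardware the ADC stage happens on the camera's own sensor board, so cap.read() already returns digitized frames; the controller hand-off below is a hypothetical stub.

import cv2

def send_to_controller(image) -> None:
    """Stand-in for the robot controller interface (hypothetical)."""
    pass

cap = cv2.VideoCapture(0)                 # open the first attached camera
while True:
    ok, frame = cap.read()                # grab one digitized frame
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clean = cv2.GaussianBlur(gray, (5, 5), 0)   # simple DSP: noise reduction
    send_to_controller(clean)             # hand the cleaned image onward
cap.release()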

The moving image, received from the camera and processed by the circuitry, contains an enormous amount of information. It’s easy to present a robot controller with a detailed and meaningful moving image. But getting the robot controller to know what’s happening, and to determine whether or not these events are significant, is another problem altogether.

Optical Sensitivity and Resolution

Optical sensitivity is the ability of a machine vision system to see in dim light or to detect weak impulses at invisible wavelengths. In some environments, high optical sensitivity is necessary. In others, it is not needed and might not be wanted. A robot that works in bright sunlight doesn’t need to be able to see well in a dark cave. A robot designed for working in mines, pipes, or caverns must be able to see in dim light, using a system that might be blinded by ordinary daylight.

Optical resolution is the extent to which a machine vision system can differentiate between objects that are close together in the field of vision. The better the optical resolution, the keener the vision. Human eyes have excellent optical resolution, but machines can be designed to have superior resolution.

In general, the better the optical resolution, the more confined the field of vision must be. To understand why this is true, think of a telescope. The higher the magnification, the better its optical resolution will be, up to a certain maximum useful magnification. Increasing the magnification reduces the angle, or field, of vision. Zeroing in on one object or zone is done at the expense of other objects or zones.
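This trade-off can be made concrete with the standard field-of-view formula for a simple lens: a sensor of width w behind a lens of focal length f sees an angle of 2 arctan(w / 2f). The Python fragment below, using arbitrary example values, shows the field narrowing as the focal length (and hence the magnification) grows.

import math

def field_of_view_deg(sensor_width_mm: float, focal_length_mm: float) -> float:
    """Angular field of view of a simple lens, in degrees."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

for f in (25, 50, 100, 200):
    print(f"focal length {f:3d} mm -> field of view {field_of_view_deg(36, f):5.1f} deg")

Doubling the focal length roughly halves the field: zeroing in on one object really is done at the expense of the others.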

Optical sensitivity and resolution are interdependent. If all other factors remain constant, improved sensitivity causes a sacrifice in resolution. Also, the better the optical resolution, the more incident light the system needs to function well. A good analogy is camera film (the old-fashioned kind). The fastest, most sensitive films produce coarse, grainy images, while slow, fine-grained films resolve excellent detail but demand more light. The corollary is that if you want excellent detail in a photograph, you must expose the film for a comparatively long time.

Invisible and Passive Vision

Robots have an advantage over people when it comes to vision. Machines can see at wavelengths to which humans are blind. Human eyes are sensitive to electromagnetic (EM) waves whose length ranges from 390 to 750 nanometers (nm). The nanometer is a thousand-millionth (10^-9) of a meter. The longest visible wavelengths look red. As the wavelength gets shorter, the color changes through orange, yellow, green, blue, and indigo. The shortest waves look violet. Infrared (IR) energy is at wavelengths somewhat longer than 750 nm. Ultraviolet (UV) energy is at wavelengths somewhat shorter than 390 nm.
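These band limits are easy to state in code; the fragment below simply restates the figures just quoted.

def classify_wavelength(nm: float) -> str:
    """Name the EM band for a wavelength given in nanometers."""
    if nm < 390:
        return "ultraviolet"
    if nm <= 750:
        return "visible"
    return "infrared"

assert classify_wavelength(300) == "ultraviolet"
assert classify_wavelength(550) == "visible"    # green light
assert classify_wavelength(900) == "infrared"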

Machines, and most nonhuman living species, can see energy in a range of wavelengths that differs somewhat from the range of wavelengths to which human eyes respond. For example, insects can see UV that humans cannot, but insects are blind to red and orange light that humans can see. (Have you used orange bug lights when camping to keep the flying pests from coming around at night, or those UV devices that attract bugs and then zap them?)

A robot can be designed to see IR and/or UV, as well as (or instead of) visible light, because video cameras can be sensitive to a range of wavelengths much wider than the range humans can see. Robots can be made to see in an environment that is dark and cold, and that radiates too little energy to be detected at any electromagnetic wavelength. In these cases the robot provides its own illumination. This can be a simple lamp, a laser, an IR device, or a UV device. Radar and sonar can also be used.

Binocular Vision

Binocular machine vision is the analog of binocular human vision. It is sometimes called stereo vision or stereoscopic vision. In humans, binocular vision allows perception of depth. With only one eye (that is, with monocular vision) you can infer depth only to a limited extent, and that perception is entirely dependent on your knowledge of the environment or scene you are observing. Almost everyone has had the experience of being fooled when looking at a scene with one eye covered or blocked. A nearby pole and a distant tower might seem to be adjacent, when in fact they are a city block apart.

Figure: Binocular machine vision. Two different views of the same object are combined to achieve a sense of depth and perspective.

The figure above shows the basic concept of binocular robot vision. High-resolution video cameras and a sufficiently powerful robot controller are essential components of such a system.
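The geometry behind the figure is ordinary triangulation. For two identical, parallel cameras separated by a baseline B, a feature that appears shifted by a disparity of d pixels between the two images, viewed through lenses of focal length f (expressed in pixels), lies at depth Z = fB/d. The sketch below uses illustrative numbers only.

def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in meters of a point seen with a horizontal offset of
    disparity_px pixels between the left and right images."""
    if disparity_px <= 0:
        raise ValueError("zero disparity: point at infinity or mismatched feature")
    return focal_px * baseline_m / disparity_px

# A feature shifted 24 px between cameras 0.12 m apart, focal length 800 px:
print(depth_from_disparity(800, 0.12, 24))   # -> 4.0 (meters)

The formula also shows why a wider camera spacing improves depth perception at long range: a larger baseline produces a larger, more easily measured disparity for the same distant object.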

Color Sensing

Robot vision systems often function only in grayscale, like old-fashioned 1950s television. But color sensing can be added, in a manner similar to the way it is added to television systems. Color sensing can help a robot with AI figure out what an object is. Is that horizontal surface a parking lot, or is it a grassy yard? Sometimes, objects have regions of different colors that have identical brightness as seen by a grayscale system. Such objects, obviously, can be evaluated in more detail with a color-sensing system than with a vision system that sees only shades of gray.
 
In a typical color-sensing vision system, three grayscale cameras are used. Each camera has a color filter in its lens. One filter passes red light, another passes green light, and another passes blue light. These are the three primary colors. All possible hues, levels of brightness, and levels of saturation are made up of these three colors in various ratios. The signals from the three cameras are processed by a microcomputer, and the result is fed to the robot controller.
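Assuming the three cameras are aligned and deliver same-sized grayscale frames, combining their outputs into a single color image is a simple stacking operation, as this illustrative Python fragment shows.

import numpy as np

def merge_rgb(red: np.ndarray, green: np.ndarray, blue: np.ndarray) -> np.ndarray:
    """Stack the red-, green-, and blue-filtered frames into one RGB image."""
    return np.dstack([red, green, blue])

# Synthetic 2 x 2 frames (0-255 grayscale) standing in for the three cameras:
r = np.full((2, 2), 200, dtype=np.uint8)
g = np.full((2, 2), 120, dtype=np.uint8)
b = np.zeros((2, 2), dtype=np.uint8)
print(merge_rgb(r, g, b).shape)   # -> (2, 2, 3)

In practice the three frames must be registered pixel for pixel; any misalignment shows up as colored fringes at the edges of objects.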

The Eye-in-Hand System

Figure: A robotic eye-in-hand system.

To assist a robot gripper (hand) in finding its way, a camera can be placed in the gripper mechanism. The camera must be equipped for work at close range, from about 1 m down to 1 mm or less, and the positioning error must be kept as small as possible. To make sure that the camera gets a good image, lamps are included in the gripper along with the camera (see the figure above). This so-called eye-in-hand system can be used to precisely measure the distance between the gripper and the object it is seeking. It can also make positive identification of the object.
 
The eye-in-hand system operates as a servo: the robot is equipped with, or has access to, a computer that processes the data from the camera and sends corrective instructions back to the gripper. Most eye-in-hand systems use visible light for guidance and manipulation. Infrared (IR) can be used when it is necessary for the robot gripper to sense differences in temperature.
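The servo principle can be sketched as a simple proportional loop: measure the remaining gripper-to-object distance with the camera, command a move proportional to the error, and repeat until the error is negligible. Every name below is a hypothetical stand-in for a real robot's interfaces.

def servo_to_object(measure_distance_m, move_gripper_m,
                    target_m=0.001, gain=0.5, max_steps=100):
    """Close the camera-to-gripper loop until the gripper is within
    target_m meters of the object, or give up after max_steps cycles."""
    for _ in range(max_steps):
        error = measure_distance_m() - target_m   # camera reading vs. goal
        if abs(error) < 1e-4:                     # close enough: done
            return True
        move_gripper_m(gain * error)              # proportional correction
    return False

# Simulated usage: the gripper starts 0.2 m away; each move shrinks the gap.
state = {"d": 0.2}
done = servo_to_object(lambda: state["d"],
                       lambda step: state.__setitem__("d", state["d"] - step))
print(done, round(state["d"], 4))   # -> True 0.0011 (approximately)

A gain below 1 makes the approach gradual and stable; a gain at or above 1 would overshoot the target, which is the classic hazard in any servo loop.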

The Flying Eyeball

In environments hostile to humans, robots find many uses, from manufacturing to exploration. One such device, especially useful underwater, has been called a flying eyeball. A cable, containing the robot in a special launcher housing, is dropped from a boat. When the launcher gets to the desired depth, it lets out the robot, which is connected to the launcher by a tether. The tether and the drop cable convey data back to the boat.
 
In some cases, the tether for a flying eyeball can be eliminated, and a wireless link can be used to convey data from the robot to the launcher. The link is usually in the IR or visible red portion of the spectrum. The robot contains a video camera and one or more lamps to illuminate the underwater environment. It also has a set of thrusters (jets or propellers) that let it move around according to control commands sent from the boat. Human operators on board the boat watch the images and guide the robot.