The Power of Computer Vision
To start, “computer vision” is an umbrella term. It covers a variety of technologies aimed at training computers to process images and video the way humans do. The goal is for computers to be able to recognize subjects within a picture and make statements about how they relate to each other. Shown an image of a beach, for example, a computer could do more than note the location of specific colored pixels. It could produce observations like:
- “There is a beach scene.”
- “This is a woman, specifically this woman from a linked database.”
- “The car is moving in this direction.”
- “The buildings are this far apart.”
- Space and deep ocean exploration
- Biohazardous sites
- Sensitive manufacturing processes
Barriers To Advancement
Making computer vision work requires both technical savvy and a very good understanding of mathematics. It’s an interdisciplinary process that needs a high-level team to work. Talent like that isn’t cheap, which limits the number of teams that can afford to devote time to innovation. It’s important to recognize that vision is far more complex than it seems. Humans can handle a variety of conditions that stupefy computers. What are the biggest challenges?
Occlusion
Images that are partly obscured, like when a person is standing behind a car, can confuse an algorithm that tries to identify the top half of a person as an independent object.
Scale
Computers can have trouble telling whether an item is far away or just small.
Complex background
Dense or texturally complicated backgrounds can be mistaken for additional items. That slows down the analysis process and might even produce false positives. The internet is full of “accidental face recognitions” where computers tag kneecaps or tree knots as people.
Intraclass variation
There’s no single “pattern” for what most items are. Humans comprehend a huge amount of variety in color, shape, size, and material, but that’s a difficult concept for computers. Dogs and cars are good examples: both come in a wide range of shapes, sizes, and colors, yet each has to be recognized as a single class.
Computer Vision in Action
Despite the challenges, computer vision has matured into something with real enterprise value. It’s being used in ways most people probably haven’t even considered. Some of the most exciting applications include:
- Optical Character Recognition (OCR): Reading handwritten and PDF documents and converting them into text documents
- Face and Object Detection and Recognition: Identifying, sorting and classifying images, including correlating those images with examples in a linked database
- Special Effects: Matching and lining up effects to real world footage
- Sports: Action recognition and quality assessment
- Smart Cars: Navigating live environments in real time
- Games: Assessing user input (like drawings or photos)
- Mobile Apps: Giving information based on images (such as identifying items or translating signs using the device camera)
- Robotics: Processing surroundings and distinguishing between items when performing or triggering tasks
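The face and object recognition item above mentions correlating images with examples in a linked database. One way to picture that step is as a nearest-match lookup. The sketch below is only an illustration under stated assumptions: it assumes some upstream model (not covered here) has already turned each image into a short list of numbers, and the names `database`, `best_match`, and the 0.8 threshold are hypothetical. It compares a new image's numbers against each stored example using cosine similarity and reports the closest one, or no match if nothing is close enough.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def best_match(query, database, threshold=0.8):
    """Return (label, score) for the most similar stored vector.

    The label is None when the best score falls below the threshold,
    which stands in for "this is not anyone in the database".
    """
    best_label, best_score = None, -1.0
    for label, vector in database.items():
        score = cosine_similarity(query, vector)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical, hand-made feature vectors standing in for real model output.
database = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

query = [0.85, 0.15, 0.35]  # features of a newly captured image (made up)
label, score = best_match(query, database)
print(label, round(score, 3))
```

The threshold matters as much as the lookup: set too low, it invites the kind of false positives described under complex backgrounds, and set too high, it rejects legitimate matches that vary in lighting or angle.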
Looking Ahead
Computer vision is one of the easiest tech terms to define, but it has been one of the most difficult skills to teach computers. The progress made so far has opened a whole new world of possibilities for enterprise. What lies ahead is sure to be amazing.
Could computer vision add functionality to your next software project? Talk with one of our technology experts to explore what’s possible!