Computer vision
===Recognition===
The classical problem in computer vision, image processing, and [[machine vision]] is that of determining whether or not the image data contains some specific object, feature, or activity. Different varieties of recognition problem are described in the literature.<ref name=Forsyth2012>{{cite book |last1=Forsyth|first1=David |last2=Ponce |first2=Jean |date=2012 |title=Computer vision: a modern approach |publisher=Pearson}}</ref>
* '''[[Object recognition]]''' (also called '''object classification'''){{spaced ndash}}one or several pre-specified or learned objects or object classes can be recognized, usually together with their 2D positions in the image or 3D poses in the scene. Blippar, [[Google Goggles]], and LikeThat provide stand-alone programs that illustrate this functionality.
* '''Identification'''{{spaced ndash}}an individual instance of an object is recognized. Examples include identification of a specific person's face or fingerprint, [[handwriting recognition|identification of handwritten digits]], or the identification of a specific vehicle.
* '''[[Object detection|Detection]]'''{{spaced ndash}}the image data are scanned for specific objects along with their locations. Examples include the detection of an obstacle in a car's field of view, of possible abnormal cells or tissues in medical images, or of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used to find smaller regions of interesting image data, which can then be analyzed by more computationally demanding techniques to produce a correct interpretation.

Currently, the best algorithms for such tasks are based on [[convolutional neural network]]s.
An illustration of their capabilities is given by the [[ImageNet#ImageNet Challenge|ImageNet Large Scale Visual Recognition Challenge]]; this is a benchmark in object classification and detection, with millions of images and 1000 object classes used in the competition.<ref name=":2">{{Cite journal|last1=Russakovsky|first1=Olga|last2=Deng|first2=Jia|last3=Su|first3=Hao|last4=Krause|first4=Jonathan|last5=Satheesh|first5=Sanjeev|last6=Ma|first6=Sean|last7=Huang|first7=Zhiheng|last8=Karpathy|first8=Andrej|last9=Khosla|first9=Aditya|last10=Bernstein|first10=Michael|last11=Berg|first11=Alexander C.|date=December 2015|title=ImageNet Large Scale Visual Recognition Challenge|url=http://link.springer.com/10.1007/s11263-015-0816-y|journal=International Journal of Computer Vision|volume=115|issue=3|pages=211–252|doi=10.1007/s11263-015-0816-y|arxiv=1409.0575 |issn=0920-5691|hdl=1721.1/104944|s2cid=2930547|hdl-access=free|access-date=2020-11-20|archive-date=2023-03-15|archive-url=https://web.archive.org/web/20230315180822/https://link.springer.com/article/10.1007/s11263-015-0816-y|url-status=live}}</ref> Performance of convolutional neural networks on the ImageNet tests is now close to that of humans.<ref name=":2" /> The best algorithms still struggle with objects that are small or thin, such as a small ant on the stem of a flower or a person holding a quill in their hand. They also have trouble with images that have been distorted with filters (an increasingly common phenomenon with modern digital cameras). By contrast, those kinds of images rarely trouble humans. Humans, however, tend to have trouble with other issues.
For example, they are not good at classifying objects into fine-grained classes, such as the particular breed of dog or species of bird, whereas convolutional neural networks handle this with ease.{{citation needed|date=June 2020}}

Several specialized tasks based on recognition exist, such as:
* '''[[Content-based image retrieval]]'''{{spaced ndash}}finding all images in a larger set of images which have a specific content. The content can be specified in different ways, for example in terms of similarity relative to a target image (give me all images similar to image X) by utilizing [[reverse image search]] techniques, or in terms of high-level search criteria given as text input (give me all images which contain many houses, are taken during winter and have no cars in them).
[[File: SRT Shape Recognition Technology.png|thumb|Computer vision for [[people counter]] purposes in public places, malls, shopping centers]]
* '''[[Pose (computer vision)|Pose estimation]]'''{{spaced ndash}}estimating the position or orientation of a specific object relative to the camera. An example application for this technique would be assisting a robot arm in retrieving objects from a conveyor belt in an [[assembly line]] situation or picking parts from a bin.
* '''[[Optical character recognition]]''' (OCR){{spaced ndash}}identifying [[Character (computing)|characters]] in images of printed or handwritten text, usually with a view to encoding the text in a format more amenable to editing or [[Search index|indexing]] (''e.g.'' [[ASCII]]). A related task is reading of 2D codes such as [[Data Matrix|data matrix]] and [[QR code|QR]] codes.
* '''[[Facial recognition system|Facial recognition]]'''{{spaced ndash}}a technology that enables the matching of faces in digital images or video frames to a face database, now widely used for unlocking mobile phones, smart door locks, and similar applications.<ref>{{Cite web |last=Quinn |first=Arthur |date=2022-10-09 |title=AI Image Recognition: Inevitable Trending of Modern Lifestyle |url=https://topten.ai/ai-image-recognition/ |access-date=2022-12-23 |website=TopTen.ai |archive-date=2022-12-02 |archive-url=https://web.archive.org/web/20221202063116/https://topten.ai/ai-image-recognition/ |url-status=live }}</ref>
* '''[[Emotion recognition]]'''{{spaced ndash}}a subset of facial recognition, emotion recognition refers to the process of classifying human [[Emotion|emotions]]. Psychologists caution, however, that internal emotions cannot be reliably detected from faces.<ref>{{Cite journal |last1=Barrett |first1=Lisa Feldman |last2=Adolphs |first2=Ralph |last3=Marsella |first3=Stacy |last4=Martinez |first4=Aleix M. |last5=Pollak |first5=Seth D. |date=July 2019 |title=Emotional Expressions Reconsidered: Challenges to Inferring Emotion From Human Facial Movements |journal=Psychological Science in the Public Interest |volume=20 |issue=1 |pages=1–68 |doi=10.1177/1529100619832930 |issn=1529-1006 |pmc=6640856 |pmid=31313636}}</ref>
* '''[[Pattern recognition|Shape Recognition Technology]]''' (SRT){{spaced ndash}}used in [[people counter]] systems to differentiate human beings (head and shoulder patterns) from objects.
* '''[[Activity recognition|Human activity recognition]]'''{{spaced ndash}}recognizing an activity from a series of video frames, such as whether a person is picking up an object or walking.
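
The similarity-based form of content-based image retrieval listed above ("give me all images similar to image X") can be illustrated with a minimal sketch. It assumes grayscale images stored as nested lists of 0 to 255 intensities and compares them by the cosine similarity of coarse intensity histograms. The function names and toy data are hypothetical and chosen only for illustration; practical systems generally rely on much richer features, such as those learned by convolutional neural networks.

```python
import math


def intensity_histogram(image, bins=4, max_value=255):
    """Return a normalized histogram of grayscale pixel intensities."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            # Map the intensity range [0, max_value] onto bin indices 0..bins-1.
            index = min(pixel * bins // (max_value + 1), bins - 1)
            counts[index] += 1
            total += 1
    return [c / total for c in counts] if total else counts


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(query, collection):
    """Return (name, score) pairs for the collection, most similar first."""
    query_hist = intensity_histogram(query)
    scored = [
        (name, cosine_similarity(query_hist, intensity_histogram(image)))
        for name, image in collection.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection of 2x2 grayscale images (illustration only).
collection = {
    "dark": [[10, 20], [30, 40]],
    "bright": [[250, 240], [230, 220]],
    "mixed": [[10, 240], [20, 250]],
}
query = [[5, 15], [25, 35]]  # a dark query image

ranking = rank_by_similarity(query, collection)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting the dark image ranks first, the half-dark image second, and the bright image last. A histogram discards all spatial layout, so the sketch shows only the ranking step of retrieval and not a feature representation suited to real images.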