# https://openalex.org/T11714
# This cluster of papers focuses on the development and improvement of visual question answering systems, image captioning techniques, and neural networks for understanding and generating descriptions of images and videos. The research involves semantic reasoning, multimodal fusion, scene graph generation, attention mechanisms, and deep learning approaches to bridge the gap between vision and language.
@prefix oasubfields: <https://openalex.org/subfields/> .
@prefix openalex: <https://lambdamusic.github.io/openalex-hacks/ontology/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<https://openalex.org/T11714> a skos:Concept ;
    rdfs:label "Visual Question Answering in Images and Videos"@en ;
    rdfs:isDefinedBy openalex: ;
    owl:sameAs <https://en.wikipedia.org/wiki/Visual_question_answering>,
        <https://openalex.org/T11714> ;
    skos:broader oasubfields:1707 ;
    skos:definition "This cluster of papers focuses on the development and improvement of visual question answering systems, image captioning techniques, and neural networks for understanding and generating descriptions of images and videos. The research involves semantic reasoning, multimodal fusion, scene graph generation, attention mechanisms, and deep learning approaches to bridge the gap between vision and language."@en ;
    skos:inScheme openalex: ;
    skos:prefLabel "Visual Question Answering in Images and Videos"@en ;
    openalex:cited_by_count 529438 ;
    openalex:works_count 24730 .