Describing Images using a Multilayer Framework based on Qualitative Spatial Models

Tao Wang, University of Bremen
Hui Shi, University of Bremen

Keywords

image processing, spatial perception, spatial models, qualitative spatial models

Abstract

To date, most research in image processing has relied on quantitative representations of image features based on pixel values; humans, however, often use abstract and semantic knowledge to describe and analyze images. To enhance cognitive adequacy and tractability, we present a multilayer framework based on qualitative spatial models. The layout features of segmented images are defined by the qualitative spatial models we introduce and are represented as a set of qualitative spatial constraints. By assigning different semantic and contextual knowledge, the image segments and the qualitative spatial constraints can be interpreted from different perspectives. Finally, the knowledge layer of the framework enables us to describe the image in a natural way by integrating the domain-specific semantic constraints and the spatial constraints.
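
As a rough illustration of what "a set of qualitative spatial constraints" over image segments might look like, the following Python sketch derives coarse relations from bounding boxes and turns them into simple sentences. It is not the model defined in the paper: the projection-based relations, the `Segment` class, the function names, and the example labels are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Segment:
    """Hypothetical image segment: a semantic label plus a bounding box.

    Image coordinates are assumed, so y grows downward.
    """
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def axis_relation(a_min, a_max, b_min, b_max, before, after):
    """Qualitative relation of interval a to interval b on one axis."""
    if a_max <= b_min:
        return before
    if a_min >= b_max:
        return after
    return "overlaps"


def spatial_constraints(segments):
    """Return one (a, horizontal, vertical, b) tuple per ordered pair."""
    constraints = set()
    for a, b in permutations(segments, 2):
        h = axis_relation(a.x_min, a.x_max, b.x_min, b.x_max,
                          "left_of", "right_of")
        v = axis_relation(a.y_min, a.y_max, b.y_min, b.y_max,
                          "above", "below")
        constraints.add((a.label, h, v, b.label))
    return constraints


def describe(constraints):
    """Verbalize the constraints, skipping axes where the boxes overlap."""
    sentences = []
    for a, h, v, b in sorted(constraints):
        parts = [r.replace("_", " ") for r in (v, h) if r != "overlaps"]
        if parts:
            sentences.append(f"The {a} is {' and '.join(parts)} the {b}.")
    return sentences


# Made-up segments of a seaside image (labels and coordinates are invented).
segments = [
    Segment("sky", 0, 0, 100, 40),
    Segment("sea", 0, 40, 100, 100),
    Segment("boat", 60, 50, 80, 70),
]
for sentence in describe(spatial_constraints(segments)):
    print(sentence)
```

The sketch keeps the quantitative step (bounding boxes) separate from the qualitative one (relation tuples), so the same constraint set could in principle be verbalized differently depending on the semantic knowledge attached to each label.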

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Recommended Citation

Wang, Tao and Shi, Hui (2015) "Describing Images using a Multilayer Framework based on Qualitative Spatial Models," Baltic International Yearbook of Cognition, Logic and Communication: Vol. 10. https://doi.org/10.4148/1944-3676.1104



Included in

Cognition and Perception Commons, Graphics and Human Computer Interfaces Commons

