The workshop will be held at the Hanse-Wissenschaftskolleg in Delmenhorst on March 3–4, 2016.

Time Event

Thursday, March 3, 2016
10:00 Welcome
10:15 Introduction: Wai-Tat Fu and Holger Schultheis
10:30 Speaker 1: Marios Avraamides
10:45 Speaker 2: Tobias Meilinger
11:00 Speaker 3: Bernhard Riecke
11:15 Coffee and Break-out discussion 1: Example questions: What is the nature of spatial representations? What are crucial properties?
12:30 Lunch
13:30 Speaker 4: Sven Bertel
13:45 Speaker 5: John Kelleher
14:00 Speaker 6: Franz-Benjamin Mocnik
14:15 Break-out discussion 2: Example questions: To what extent (and how) are representations and models of spatial cognition different from or similar to other models? How can these models be useful for applications?
15:30 Coffee & cake
15:45 Speaker 7: Simon Dobnik
16:00 Speaker 8: Hanspeter Mallot
16:15 Speaker 9: Tyler Thrash
16:30 Break-out discussion 3: Example questions: What role can and should computational modeling play in spatial cognition research (if any)? What other methodological approaches are useful?
17:45 Wrap-up discussion
18:00/18:30 Dinner
19:30 Poster session / Going for a beer
Friday, March 4, 2016
10:00 Conclusions from Breakout 1: Marios Avraamides, Tobias Meilinger, Bernhard Riecke
10:30 Conclusions from Breakout 2: Sven Bertel, John Kelleher, Franz-Benjamin Mocnik
11:00 Conclusions from Breakout 3: Simon Dobnik, Hanspeter Mallot, Tyler Thrash
11:30 Future meetings / How to strengthen Spatial Cognition / Spatial Modeling Research?
12:30 Lunch

Break-out discussion: three small groups retreat for 30–45 minutes to discuss some of the central questions of the workshop. Invited speakers will lead the discussion in each group and summarize it at the end of the session. In addition, the discussion leaders will collectively conclude by presenting their answers to the key questions on Friday.

Talk titles and abstracts:

Marios Avraamides

Marios is a cognitive psychologist interested in studying the cognitive processes that support spatial cognition and navigation. He is based at the University of Cyprus, where he runs the Experimental Psychology Lab.


Are there multiple systems for spatial memory?


Research with “offline” spatial tasks suggests that people maintain allocentric spatial memories, organized along stable preferred orientations that are determined by the convergence of environmental and social cues. At the same time, “online” tasks such as spatial updating highlight the importance of egocentric relations for moment-to-moment cognition about space. The question that arises is: do we encode and maintain spatial information simultaneously in distinct systems that rely on different reference frames? If so, how do these systems interface? I will present results from behavioural studies suggesting the presence of multiple systems of spatial memory, and I will discuss how and when these systems may function together.

Sven Bertel

Sven is a junior professor of usability at Bauhaus-Universität Weimar. His research interests include cognitive user diversity and its consequences for what constitutes good usability; adaptive user interfaces; visuo-spatial reasoning; architectural design support; and usability of mobile devices.


Mental and Physical Touch-Based Rotation Processes


The ability to mentally rotate an object is a frequently studied aspect of spatial intelligence, important for performance in various visual and spatial tasks, not least within STEM domains. It can be expected that enriching education to include adequate training of spatial skills will increase participation in STEM domains. I will present and discuss a study from an interdisciplinary project that focused on the question of whether, for elementary school students, solving spatial tasks can be enhanced by using touch gestures on mobile devices. Two conditions of spatial rotation tasks were compared using a within- and between-subject design: an interactive, touch-based app that allows users to physically rotate objects, and a paper-based, static version. The results indicate an additive, enhancing effect of the touch-based, dynamic interaction mode, especially for children who are already capable of solving the tasks using mental rotation processes. I will also present results from a subsequent analysis and modelling of physical rotation trajectories, aimed at differentiating trajectories obtained from correctly and incorrectly answered tasks and, eventually, at deriving general and individual strategies employed by successful and unsuccessful rotators.

Simon Dobnik

Simon Dobnik is a Senior Lecturer in Computational Linguistics at the Department of Philosophy, Linguistics and Theory of Science at the University of Gothenburg, Sweden. He is a member of the Centre for Language Technology (CLT) and the Centre for Linguistic Theory and Studies in Probability (CLASP), both at the University of Gothenburg. His research interests include spatial cognition, computational models of language and perception, human-robot interaction, situated spoken dialogue systems, and computational representations of meaning (semantics).


Interfacing Language, Spatial Perception and Cognition in Type Theory with Records


Computational modelling of perception, action, language, and cognition introduces several requirements on a formal semantic theory and its practical implementations: (i) interfacing discrete conceptual knowledge and continuous real-valued sensory readings; (ii) information fusion of knowledge from several modalities; (iii) dynamic adaptation of semantic representations/knowledge as agents experience new situations through linguistic interaction and perception. Using examples of semantic representations of spatial descriptions we show how Type Theory with Records (TTR), a framework with origins in (natural language) computational semantics, addresses these requirements.
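Requirement (i), interfacing discrete conceptual knowledge with continuous sensory readings, can be illustrated in miniature. The sketch below is a loose Python analogy, not the TTR formalism itself; the field names, the distance threshold, and the classifier are invented for illustration. It shows the core idea of a record (a situated observation) being judged against a record type whose fields pair a real-valued reading with a discrete concept:

```python
# A loose analogy (not TTR itself) of a record meeting a record type:
# the record combines a continuous sensory reading with a discrete
# conceptual judgement, and a classifier grounds the concept in the reading.

def is_near(distance_m, threshold=1.5):
    """Toy classifier mapping a real-valued reading to a discrete judgement.
    The 1.5 m threshold is an arbitrary illustrative choice."""
    return distance_m < threshold

# A record type: each field name is paired with the type its value must have.
record_type = {"distance_m": float, "near": bool}

def check(record, rtype):
    """A record is of the type if every field is present with the right type."""
    return all(isinstance(record.get(f), t) for f, t in rtype.items())

# A situated observation: continuous sensor reading plus its discrete label.
observation = {"distance_m": 0.8, "near": is_near(0.8)}
print(check(observation, record_type))  # True
```

In the actual framework, the classifier itself would be part of the type judgement, which is what allows semantic representations to be updated as the agent perceives new situations.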

Hanspeter Mallot

Hanspeter A. Mallot is interested in the computational mechanisms underlying human spatial cognition and visual perception. Recent publications address the interaction of working and long-term memories of space, the recognition of places, and the perception of ego-motion from optic flow. He is currently a professor of Cognitive Neuroscience and Speaker of the Department of Biology at the University of Tübingen, Germany.


Views, places, regions: granularity and topology of cognitive maps

Hanspeter A. Mallot, Wiebke Schick, Banafsheh Grochulla

In his 1932 book “Purposive Behavior in Animals and Men”, Edward C. Tolman published an account of the memory structure underlying purposive behavior in animals, for which he later coined the term “cognitive map”. It is essentially a graph composed of states that the animal knows and can be in, and actions known to effect transitions from one such state to another. While it is natural to interpret the nodes of the graph (the states) as places, other interpretations are possible. In the talk, I will present evidence from behavioral experiments supporting the existence of at least two other types of nodes in the cognitive graph: oriented views and higher-level regions. Evidence for view-based representation of places comes from spatial recall in imagery of distant places, in which some views are recalled more often than others. Evidence for regions as higher-level nodes of the cognitive graph comes from the choice between equidistant routes in route planning, which is affected by the regional layout. In the emerging picture, the cognitive graph is a multi-layered structure that includes a hierarchy of spatial representations of varying granularity as well as action representations at different levels of abstraction. In an accompanying poster, we will present evidence for the role of language (place names) in the formation of hierarchical representations of space.
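The state-action graph described in the abstract can be sketched as a small data structure. The places, action names, and route-planning search below are invented for illustration and are not from the talk; they only show the bare structure of states connected by labelled actions:

```python
from collections import deque

# A minimal sketch of Tolman's cognitive graph: nodes are known states
# (places here, but views or regions would work the same way), and each
# outgoing edge is an action known to effect a transition to another state.
cognitive_graph = {
    "market_square": {"walk_north": "church", "walk_east": "bridge"},
    "church": {"walk_south": "market_square"},
    "bridge": {"walk_west": "market_square", "cross": "old_town"},
    "old_town": {"cross_back": "bridge"},
}

def plan_route(graph, start, goal):
    """Breadth-first search for a shortest action sequence from start to goal."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, nxt in graph[state].items():
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, actions + [action]))
    return None  # goal not reachable

print(plan_route(cognitive_graph, "market_square", "old_town"))  # ['walk_east', 'cross']
```

The multi-layered structure the talk argues for would add further node types (views, regions) and edges between layers; this flat single-layer graph is only the starting point of that picture.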

Tobias Meilinger

Tobias Meilinger studied Psychology in Würzburg and conducted his PhD in Freiburg. He worked in Tübingen, Paris, and Tokyo and is currently leader of the Social and Spatial Cognition group at the Max Planck Institute for Biological Cybernetics in Tübingen. His research focuses on how humans represent and process their environment.


Constraints on models of human survey estimation – evidence from a learning study

Tobias Meilinger, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Jon Rebane, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Agnes Henson, Leeds Beckett University, UK
Heinrich H. Bülthoff, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Hanspeter A. Mallot, Tübingen University, Germany

Survey estimates such as pointing, straight-line distance estimation, or finding novel shortcuts to distant locations are common tasks. Although the reference frames and brain areas involved have been examined, the underlying processing is largely unknown. We examined the development of survey knowledge with experience to tap into the underlying processes. Participants learned a simple multi-corridor layout by walking forwards and backwards through a virtual environment. Throughout learning, participants were repeatedly asked to perform pairwise pointing from each segment border to each other segment border. Pointing latency increased with pointing distance and decreased with pointing experience, rather than with learning experience. From this, we conclude that participants did not access an encoded representation when performing survey tasks, but instead constructed the estimates on the fly; this construction was quicker for nearby goals and quickened with repeated construction, but not with learning of the underlying elements. This could relate to the successive firing of place cells representing locations along a route from the current location to the target, or to the construction of a mental model of non-visible object locations. Furthermore, participants made systematic errors in pointing; for example, they mixed up turns or forgot segments. Modelling of the underlying representations based on different error sources suggests that participants did not create one unified representation when internally constructing the experimental environment. Instead, they constructed a separate representation for at least each orientation in which the environment was navigated. There was no indication that this separation changed with experience. We conclude that survey estimates are constructed on the fly and are based on multiple representational units.

Franz-Benjamin Mocnik

Franz-Benjamin Mocnik is a visiting researcher at the University of Bremen and a Twin Fellow at the HWK. He earned his PhD in Geoinformation from the Vienna University of Technology and a diploma in mathematics from the University of Bonn. His background is very broad: Franz-Benjamin has interests in spatial science, cognitive science, mathematics, and physics, as well as philosophy.


Modelling Cognition by Structural Affordances of the Environment


Human behaviour is restricted by the affordances offered by the environment. Research on human cognition can thus either focus on humans and how they interact with the world, or it can focus on which affordances the environment provides for human perception, cognition, and action.

This talk presents an example of how the structure of a map – i.e., a part of the environment – influences human cognition: by conveying subconscious assumptions, by offering only information subjected to inflexible categorization and generalization, and by offering much information about how places relate but little about the places themselves. This structure of a map is contrasted with the structure of texts and their influence on human cognition.

Bernhard Riecke

Bernhard is a psycho-physicist and cognitive scientist who’s excited about studying how humans orient and behave in virtual and real environments, and about applying this to improve human-computer interaction. He’s pursuing this at the School of Interactive Arts & Technology (SIAT) at Simon Fraser University near Vancouver, Canada, which is a great place to be if you like being in nature and feel like you don’t fit any more into any of the traditional departments.


Qualitative Modeling of Spatial Orientation Processes and Concurrent Reference Frame Conflicts using Logical Propositions


While navigation in the real world can already be difficult enough, this challenge tends to increase for travel through imagined or computer-mediated environments like virtual environments and tele-presence applications. We propose that this can, at least in part, be attributed to a concurrent reference frame conflict in working memory between one’s automatically activated primary embodied egocentric representation of one’s immediate physical environment on the one hand and the additional 1st person (egocentric/embodied) reference frame provided by the computer-mediated or imagined environment on the other hand.

Here, we will use a visually presented framework that integrates logical propositions (i.e., necessary and sufficient conditions) to discuss different underlying processes and implications. For example, we propose that alignment (and thus lack of interference) between concurrent egocentric reference frames is a necessary prerequisite for a number of desirable spatial orientation characteristics such as ease of adopting a new perspective, low cognitive load, and automatic spatial updating. Conversely, we propose that increasing concurrent reference frame conflict can lead to increased cognitive load, difficulty of perspective changes, impaired spatial updating, and thus reduced task performance in numerous applications such as telepresence or motion simulation in VR.

Tyler Thrash

Tyler is a postdoctoral researcher at the Chair of Cognitive Science at ETH Zurich. He studies spatial cognition and navigation from a largely ecological (Gibsonian) perspective. With this approach, he attempts to explain higher-level cognition (e.g., biases in spatial memory) in terms of lower-level, perceptual processes (e.g., visual exposure to environmental structure).

Formalizing the Cognitive Map

Since the introduction of the term “cognitive map” by Edward Tolman in 1948, research in spatial cognition has significantly advanced our understanding of spatial memory. However, a strict (testable) definition of the term “cognitive map” remains elusive. While some researchers define cognitive maps with respect to reference frame (e.g., Lloyd, 1989), others refer to scope (e.g., Shettleworth, 2010), metric (e.g., Golledge & Hubert, 1982), or level of abstraction (e.g., Portugali, 1996). The present work describes a computational model for disentangling these various aspects of cognitive maps and providing evidence for or against specific possible definitions. This model was constructed by combining the Generalized Context Model (originally intended for investigating perceptual classification; Nosofsky, 1986) and the Category-Adjustment Model (originally intended for investigating spatial categorization; Huttenlocher, Hedges, & Duncan, 1991). The proposed model may allow researchers to predict different types of spatial responses (e.g., sketch maps, pointing judgments) using the locations of objects to be remembered.
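The two components the abstract names can be sketched separately and then combined. The toy code below is illustrative only: the prototype locations, the sensitivity parameter `c`, and the weight `lam` are invented, and the real model is more elaborate. It shows a GCM-style exponential similarity rule assigning a location to a spatial category, followed by a Category-Adjustment-style weighted estimate that biases the remembered location toward the category prototype:

```python
import math

# Toy illustration of combining a GCM-style similarity rule (Nosofsky,
# 1986) with a Category-Adjustment-style estimate (Huttenlocher et al.,
# 1991). Prototypes, c, and lam are invented values for illustration.
prototypes = {"left_region": (2.0, 5.0), "right_region": (8.0, 5.0)}

def similarity(p, q, c=1.0):
    """GCM: similarity decays exponentially with distance; c is sensitivity."""
    return math.exp(-c * math.hypot(p[0] - q[0], p[1] - q[1]))

def assign_category(location):
    """Assign the location to the spatial category with the most similar prototype."""
    return max(prototypes, key=lambda k: similarity(location, prototypes[k]))

def remembered_location(location, lam=0.7):
    """CAM: weighted average of the fine-grained memory trace and the
    category prototype; lam is the weight on the fine-grained trace."""
    px, py = prototypes[assign_category(location)]
    x, y = location
    return (lam * x + (1 - lam) * px, lam * y + (1 - lam) * py)

# A location near the left prototype is remembered as biased toward it:
print(remembered_location((3.0, 6.0)))  # approximately (2.7, 5.7)
```

Fitting parameters like `lam` and `c` to different response types (sketch maps, pointing judgments) is the kind of comparison such a model would support.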

John Kelleher

John Kelleher is a lecturer in the School of Computing at the Dublin Institute of Technology and a funded investigator in the ADAPT Research Centre. John recently published a textbook on machine learning with MIT Press. John's primary interest in the area of spatial cognition is developing cognitively inspired models that enable computational systems (such as robots) to ground spatial language in sensor data. In particular, John is interested in the role of perceptual phenomena (such as object occlusions and viewer perspective) in the semantics of spatial terms.

What is not where: the challenge of integrating spatial representations into deep learning architectures


In recent years, deep learning models have resulted in major breakthroughs in the areas of language and vision processing. The standard deep learning architecture for processing image data is the Convolutional Neural Network (CNN). CNN models have been very successful in learning to identify what is in an image. Furthermore, there have been exciting breakthroughs in image captioning systems, in which CNN image processing models are integrated with Recurrent Neural Network (RNN) language models. These hybrid (CNN+RNN) deep learning architectures are able to generate descriptions (often including spatial descriptions) of the contents of images. This talk, however, critically evaluates these deep learning models with respect to their ability to explicitly model spatial concepts. In particular, the talk will argue that these architectures rely on the RNN language model to select the spatial relationships used in the generated descriptions, rather than actually grounding spatial relationships in the image itself. This critique is based on the architecture of CNN models: I argue that the disconnected nature of CNN models (due to pooling layers) means that it is not feasible for these models to explicitly learn grounded spatial representations. The purpose of the critique is to open a discussion regarding how best to integrate spatial representations into deep learning architectures.
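The pooling argument can be made concrete with a deliberately simplified toy, which is not from the talk: the feature channels, grids, and activation values below are invented. Global max pooling over a per-object feature channel preserves *what* is in the scene (each object's peak activation) while discarding *where* it is, so two scenes with opposite spatial relations become indistinguishable after pooling:

```python
# Toy illustration of pooling discarding location: global max pooling
# keeps each channel's peak activation ("what") but drops its position
# ("where"), so opposite spatial arrangements pool to the same summary.

def global_max_pool(channels):
    """Collapse each channel's 2D feature map to a single activation."""
    return {name: max(max(row) for row in grid) for name, grid in channels.items()}

# Two scenes: "cup left of box" vs. "box left of cup".
cup_left_of_box = {
    "cup": [[0.9, 0.0], [0.0, 0.0]],
    "box": [[0.0, 0.9], [0.0, 0.0]],
}
box_left_of_cup = {
    "cup": [[0.0, 0.9], [0.0, 0.0]],
    "box": [[0.9, 0.0], [0.0, 0.0]],
}

# The pooled summaries are identical although the spatial relation differs:
print(global_max_pool(cup_left_of_box) == global_max_pool(box_left_of_cup))  # True
```

Real CNNs pool locally and repeatedly rather than in one global step, but the cumulative effect on positional information is the issue the talk raises: whatever selects "left of" versus "right of" in the caption cannot be reading it off this kind of summary.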