  • Research Article
  • Open access

Towards Automation 2.0: A Neurocognitive Model for Environment Recognition, Decision-Making, and Action Execution


The penetration of building automation by information technology is far from saturated. Today's systems must not only be reliable and fault tolerant; they must also take energy efficiency and flexibility in overall consumption into account. Meeting the quality and comfort goals of building automation while at the same time optimizing for energy, carbon footprint, and cost efficiency requires systems that can handle large amounts of information and negotiate a system behaviour that resolves conflicting demands, that is, a decision-making process. In recent years, research has started to focus on bionic principles for designing new concepts in this area. The information processing principles of the human mind have turned out to be of particular interest, as the mind is capable of processing huge amounts of sensory data and making adequate decisions for (re-)actions based on the analysed data. In this paper, we discuss how a bionic approach can solve the upcoming problems of energy-optimal systems. We introduce a recently developed model for environment recognition and decision-making processes that is based on research findings from different disciplines of brain research. This model is the foundation for applications in intelligent building automation that have to deal with information from home and office environments. All of these applications consist of a combination of communicating nodes and have many, partly contradictory goals.

1. Introduction

Over the last decades, automation technology has made serious progress in observing and controlling processes in order to automate them. Prominent examples of research areas addressing this issue are the discipline of data fusion [1] and the field of fuzzy control [2]. In factory environments, where the number of possible situations and states is limited and usually well known, observing and controlling most industrial processes is a tedious but achievable task. However, the situation changes if we shift from industrial to less organized environments like offices or private homes. Here, the number of possible objects, events, and scenarios, and of ways to react to them, is almost infinite. Interacting in such real-world situations and fulfilling goals has turned out to be far from trivial [3, 4]. Existing approaches are challenged by the abundance of data and the ways in which it should be analyzed and responded to [5, 6]. The challenge that these approaches cannot meet is to find an appropriate behaviour in the light of multiple, partly contradictory goals.

Building automation today is a network of embedded systems that are interconnected by standardized fieldbus protocols. In larger office buildings, some thousand embedded controllers, sensors, and actuators are installed and take care of user comfort and safety. The installations in a building are separated into different industries, which have grown historically, have no tradition of achieving common goals together, and have only recently started to cooperate. Each industry prefers to have separate installations rather than sharing, for example, sensor information with other industries. The control of the HVAC system and that of the lighting operate separately, without regard to occupancy or sunblinds. Other information sources like the outside temperature, humidity, or irradiation are available only to a single industry (if they are regarded at all). While it is possible to operate a building in such a way and still maintain a certain level of comfort, it is impossible to achieve other goals like maximizing energy efficiency. This is only possible when all industries cooperate, share information and infrastructure, and can be controlled in a holistic way.

The next challenge is to find mechanisms to control the complexity of such an integrated system. When merging all available subsystems in a building, the number of possible states rises exponentially and is not manageable with classic approaches. Instead, the subsystems have to be controlled by a management system that makes global decisions and resolves conflicts. Programming in the classical sense, that is, predefining the behaviour of the system in all possible situations, is no longer an option; instead, adaptability and the ability to make decisions are required.

In recent times, research in this field has started to focus on bionic concepts that look to nature as an archetype [7–11]. Taking these concepts as a basis for the development of technical systems appears to be a very reasonable idea: animals and humans have the capability to perceive and (re-)act on their environment very efficiently [12, 13]. Their mind reconstructs the environment from the incoming stream of (often ambiguous) sensory information, generates unambiguous interpretations of the world on a more abstract level, evaluates these perceptions, and makes adequate decisions in order to act or react on them. To do so, evolution has equipped our brains with highly efficient circuits [14]. Deciphering these circuits and mechanisms and translating them into technically implementable concepts would without doubt lead to a revolution in machine intelligence and bring additional economic benefits when applied to technical systems [15]. Optimizing for energy efficiency is a task that requires a holistic view of the whole system with all its boundary conditions and ambiguous interconnections. Especially if humans are involved in the system, as in energy optimization for buildings, even the description of the system is a complex task. An alternative to manual modelling is required and can be found in the abilities of the human mind. A key ability is the creation of models of the real world together with the necessary evaluation of objects and events in this world: when the system has to make decisions about the control strategy of a building in order to, for example, minimize energy consumption, it needs fast evaluations of the building status and of the ability of subcomponents to contribute to a reduction of consumption. Thus, not only perception of the current situation is required, but also an evaluation with respect to a certain goal.
This concept is the translation of what emotions are in the human mind: fast evaluations of objects and events in the surrounding world, achieved by multiple levels of processing that cooperate to create an abstract image of the world focusing on the relevant information. By exposing an individual to many different situations over its lifetime, emotions are built and refined. The foundation is laid by experiencing situations that have different impacts on the individual. Some emotions exist already at an early stage, since they are vital for survival; others develop at later stages [16–18]. Lab situations of the kind used today for training systems do not occur in the real world. The real world always presents an amalgamation of different types of inputs, in which relevant information is embedded in a bulk of irrelevant information. The challenge lies in identifying the data that have an impact on the individual. By linking the perception of objects and events with emotions, that is, with the evaluation of their possible impact, a mechanism is found that enables us to act and react in complex situations.

Energy management of office, public, and residential buildings creates such complex situations. The operation of the building has to be optimized towards different goals: it should be energy efficient and have a low carbon footprint, but also run at the lowest possible cost. These optimizations have to be seen in the light of other operational parameters like maintaining maximum comfort for the users with regard to temperature, humidity, and lighting. To do so, the system has to take the occupancy of rooms and user behaviour into account. At the same time, a building may have different sources of thermal and electric energy: the electric grid, the thermal grid, and several sources of renewable energy like solar thermal systems, wind generation, heat pumps, and photovoltaic systems. Finally, the building management system should optimize its electric consumption from the grid in order to avoid peak loads. While the necessary hardware and IT infrastructure is already in place today, there is still a lot of work to be done to find the right methods for processing the available information in a way that allows for multigoal optimization and flexible reaction to unexpected situations. We try to fill this gap with the bionic approach described in this paper. The enormous potential of such innovative bionic approaches was taken up by a research team around Dietmar Dietrich in the year 2000; an interdisciplinary team of scientists at the ICT (Institute of Computer Technology), Vienna University of Technology, works on the development of next-generation intelligent automation systems for building automation, interactive environments, autonomous agents, and autonomous robots based on neurocognitive concepts [19–24]. The outcome of this effort is illustrated in the following in the form of a neuro-cognitive model for environment recognition, decision-making, and action execution.

2. Neuro-Cognitive Model for Environment Recognition, Decision Making, and Action Execution

An overview of the developed model is given in Figure 1. The model consists of various interconnected modules. The arrows indicate informational and/or control flows between the different units. The functionality of the different blocks of the model and their interaction will be explained step by step in this section. The model development started from the latest research findings from the disciplines of neurophysiology and neuropsychology about the function of the brain in the process of environment recognition, decision-making, and action execution.

Figure 1

Overview of the neuro-cognitive model for environment recognition, decision-making, and action execution.

According to the neuroscientist and psychoanalyst M. Solms and Turnbull [25], in nature, the purpose of these processes can be summarized in one phrase: "survival of the organism". In order to survive, an individual has to search for and get the resources its organism currently needs (food, water, oxygen, sexual partner for reproduction) from the environment. To do so, it has to be able to recognize (perceive) its environment and its current bodily needs (internal states). For this purpose, the body of the individual is equipped with different sensors (sensory receptors). The processing of the information coming from these sensors takes place in the mind of the individual. Based on this information, it is decided what actions to execute in order to satisfy the needs of the body. For this purpose, the body is equipped with a number of actuators to act on the internal states and the environment.

The architecture of the mind considers two key ideas of the neuro-cognitive picture. The first is that human intelligence is based on a mixture of low-level and high-level mechanisms. Low-level responses are relatively predefined and may not always be accurate, but they are quick and provide the system with a basic mode of functioning in terms of built-in goals and behavioural responses. The second key idea of the model is the use of emotions as an evaluation mechanism on all levels of the architecture. Through emotions, the system can learn values along with the information it acquires.

The four main blocks of the mind are the recognition module, the pre-decision module, the decision module, and the execution module. The recognition module is responsible for the processing of incoming sensory data in order to perceive the environment and internal states of the body. The pre-decision module and the decision module are responsible for deciding what actions to take based on all available incoming information. In the pre-decision module, these mechanisms are mainly based on predefined low-level processes that guarantee a fast reaction in critical situations. The decision module is based on higher-level mechanisms that require more time-consuming reasoning processes. The execution module is responsible for the control of the actuators in order to correctly execute the selected actions.

In the architecture, there exist several types of memory. Perceptual memory is used extensively by the recognition module while processing sensory input data. It comprises information about what different objects look like, what sounds they emit, what texture they have, how they smell, and so forth. A suggestion for how to represent perceptual memory computationally with respect to its neuro-cognitive basis is given in Section 4. For facilitating perception and resolving ambiguous perceptual information, knowledge stored in the semantic memory is needed. It contains facts and rules about the environment, for example, what kinds of objects exist, how they are related to each other, what the physical rules of the world are, and so forth. In a similar way, semantic memory also supports the decision-making process. Semantic memory is acquired from episodic memory. Episodic memory consists of previously experienced episodes. An episode is a sequence of situations. These episodes have generally been given an emotional rating and support the decision-making process. Procedural memory is used in the execution module and comprises the necessary information for the execution of routine behaviours. A suggestion for the computational representation of procedural memory considering its neuro-cognitive archetype is given in Section 4. Working memory is conceptualized as an active, explicit kind of short-term memory that supports higher-level cognitive operations by holding goal-specific information and streamlining the information flow to the cognitive processes.

The whole decision-making and behaviour selection process runs as a loop and can be described as follows: external stimuli originating from the environment are processed by the recognition module using knowledge stored in the perceptual and the semantic memory. The resulting representation of the current situation is first passed on to the basic emotions module of the pre-decision unit. The recognition module also perceives internal stimuli from the body in order to watch over the internal needs of the system, which are represented by internal variables. Each of these variables manages an essential resource of the system that has to be kept within a certain range, for example, its energy level. If one of the internal variables of the recognition module is about to exceed its limits, it signals this to the drives module, which in turn raises the intensity of a corresponding drive, for example, hunger in the case of low energy. There exists a threshold for hunger. If it is passed, the action tendency to search for food is invoked. If the basic emotions module does not release a competing action tendency, the decision to search for food is passed on to the execution unit. The basic emotions module gets its input from the recognition module and the drives module. It connects stereotypical situations with action tendencies that are appropriate with a high probability. For instance, if an object is hindering the satisfaction of an active drive, the system will become angry, which leads to "aggressive" behaviour where the system "impulsively" attempts to remove the obstacle. For this purpose, it initiates a predefined coping reaction. Each basic emotion is connected with a specific kind of behavioural tendency or action, for instance, fear with fleeing (being cautious), disgust with the avoidance of contact, and playfulness with the exploration of new situations.
An important task of the basic emotions module is to label the behaviour or action the system has finally carried out as "good" or "bad". This rating is based on the perceived consequences (mainly on the internal states) of the executed actions. Successful actions are rewarded with lust; unsuccessful behaviour leads to avoidance. Through basic emotions, the system can switch between various modes of behaviour based on the perception of simple, but still characteristic external or internal stimuli. This helps to focus the attention by narrowing the set of possible actions and the set of possible "perceptions". The system starts to actively look for special features of the environment while suppressing others.
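The drive mechanism described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `Drive`, the function `pre_decision`, the linear mapping from shortfall to intensity, and all numeric values are assumptions chosen only for illustration. The sketch also assumes that a tendency released by the basic emotions module simply takes precedence over the drive-based tendency.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Drive:
    """A drive whose intensity rises as an internal variable leaves its range."""
    name: str
    action_tendency: str
    threshold: float  # hypothetical value; the paper gives no numbers
    intensity: float = 0.0

    def update(self, value: float, lower_limit: float) -> None:
        # Assumed mapping: intensity grows linearly with the shortfall
        # of the internal variable below its lower limit.
        self.intensity = max(0.0, lower_limit - value)

    def tendency(self) -> Optional[str]:
        # The action tendency is invoked only once the threshold is passed.
        return self.action_tendency if self.intensity > self.threshold else None


def pre_decision(drive: Drive, competing_tendency: Optional[str]) -> Optional[str]:
    """Return the action handed to the execution unit, or None.

    Assumption: a tendency released by the basic emotions module
    (for example "flee") overrides the drive-based tendency.
    """
    if competing_tendency is not None:
        return competing_tendency
    return drive.tendency()


hunger = Drive(name="hunger", action_tendency="search_for_food", threshold=0.2)
hunger.update(value=0.1, lower_limit=0.5)  # energy level well below its limit
print(pre_decision(hunger, competing_tendency=None))    # search_for_food
print(pre_decision(hunger, competing_tendency="flee"))  # flee
```

When neither a drive threshold is passed nor a basic emotion releases a tendency, the function returns `None`, which corresponds to the case in which the situation is handed over to the decision module.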

If the pre-decision module does not trigger a response, perceived situations are handed over to the decision module. In the decision module, an emotional rating takes place again, this time by the complex emotions module. Here, current situations are matched with one or more social emotions like contempt, shame, compassion, and so forth. Additionally, current desires influence the decision process. The decision module heavily interacts with episodic memory. The episodic memory is searched for situations similar to the current one, including their emotional ratings. Furthermore, the semantic memory can provide factual knowledge of how to react to a certain situation. If no similar situation can be found, the planning module (acting-as-if module) is activated, which mentally simulates different responses to a situation as well as their potential outcomes. After a final decision on how to react to a certain situation has been made, the corresponding behaviours/actions have to be carried out physically. While actions carried out by the pre-decision unit are of a reactive nature with the aim of keeping the system from harm in a dangerous situation, actions coming from the decision unit are of a more complex nature and allocate more complex patterns from the procedural memory. One important fact is that the higher-level decisions from the decision module can inhibit (suppress) the execution of actions selected by the pre-decision module.
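The interplay between the two decision levels can be sketched as follows. This is a simplified illustration under stated assumptions, not the model's actual implementation: situations are represented as hypothetical sets of feature labels, similarity is measured with the Jaccard index (a swapped-in measure; the paper does not specify one), and the names `Episode`, `similarity`, `decide`, and `select_action` are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Episode:
    features: FrozenSet[str]  # hypothetical description of a past situation
    action: str               # action taken in that episode
    rating: float             # emotional rating: > 0 "good", < 0 "bad"


def similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard index of two feature sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def decide(situation: FrozenSet[str], episodes: List[Episode],
           plan: Callable[[FrozenSet[str]], str],
           min_similarity: float = 0.5) -> str:
    """Decision module: reuse a well-rated similar episode, else plan."""
    candidates = [e for e in episodes
                  if e.rating > 0
                  and similarity(e.features, situation) >= min_similarity]
    if candidates:
        return max(candidates, key=lambda e: e.rating).action
    # No similar situation found: fall back to the planning module.
    return plan(situation)


def select_action(situation: FrozenSet[str], reactive: Optional[str],
                  inhibited: bool, episodes: List[Episode],
                  plan: Callable[[FrozenSet[str]], str]) -> str:
    """A reactive action is executed unless the decision module inhibits it."""
    if reactive is not None and not inhibited:
        return reactive
    return decide(situation, episodes, plan)


episodes = [
    Episode(frozenset({"energy_source", "alone"}), "ask_for_help", 0.8),
    Episode(frozenset({"energy_source", "alone"}), "approach_alone", -0.6),
]
now = frozenset({"energy_source", "alone", "hungry"})
explore = lambda s: "explore"

print(select_action(now, None, False, episodes, explore))    # ask_for_help
print(select_action(frozenset({"unknown_object"}), None, False,
                    episodes, explore))                      # explore
print(select_action(now, "flee", False, episodes, explore))  # flee
print(select_action(now, "flee", True, episodes, explore))   # ask_for_help
```

In this sketch, inhibition is passed in as a flag; in the model it would itself be an outcome of the higher-level evaluation.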

3. Model Implementation and Use Case Description

For a first verification and evaluation of the model, it was implemented as a computational simulation in a virtual environment [6]. Autonomous agents are embedded in this virtual environment [22, 26–28]. Each of these autonomous agents uses an instance of the model described in Section 2 as its control unit. The agents can navigate through a three-dimensional world. They can perceive their environment through simplified sense organs. They can detect the presence of other agents and energy sources. The goal set for the agents is to survive in the environment as long as possible. Agents compete in different groups and try to find an optimum strategy in diverse (unknown) situations. Therefore, they continuously have to make decisions about how to (re-)act on the environment. The starting point for decision-making is always both the internal states of the body and the external perceptions of the environment.

One of the use cases for evaluating the model functionality was the so-called cooperation for energy recovery scenario occurring between two or more agents in the virtual environment. This example scenario is now explained in more detail to clarify the concept of decision-making according to the model. In the cooperation for energy recovery scenario, a virtual agent (Agent A) recognizes an energy source in the environment based on the perceptual knowledge about the possible appearances of energy sources stored in his perceptual memory. From the semantic memory, he retrieves the information that he cannot consume this energy source alone, but would require the help of other agents.

In Figure 2(a), the internal states (basic emotions, complex emotions, desires, drives) of Agent A are depicted. The agent feels hunger. However, he is also afraid because of the danger connected to approaching this energy source, which he experienced previously. This event was stored in his episodic memory. Nevertheless, the hunger is stronger than the fear. Furthermore, the agent feels the desire to get food and has the hope that another agent will assist him in this task. Both the pre-decision and the decision unit are therefore in accordance, and a request for cooperation for energy recovery is sent to two other agents (Agent B and Agent C) via the execution module. Both agents receive this request via their recognition units. Based on their internal states, they decide whether or not to cooperate in cracking the food source. States that influence this decision are whether the agents feel hunger themselves, whether they feel a need for social interaction, whether they feel fear, and so forth. In Figures 2(b) and 2(c), the internal states of Agent B and Agent C are shown at the moment they receive Agent A's request. The internal states of Agent B show a high level of fear and moderate levels of pride and reproach, because he does not want to admit his fear and is afraid of being blamed for not helping. Therefore, although he feels the drive to care about Agent A and the desire to interact socially with him, the basic emotion of fear overrules all other internal states, and the request of Agent A is rejected. Agent C, in contrast, shows a high level of lust, a low level of fear, and a high level of hope of becoming friends with Agent A and interacting socially with him in the future if he supports him. Although his hunger level and his desire for food are low, he therefore answers Agent A's request positively.
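The outcome for Agents B and C can be reproduced with a toy rule. This is only a sketch under assumptions: the function `answers_positively`, the veto level for fear, and all numeric state values are hypothetical and are not taken from the simulation; they merely mirror the qualitative levels ("high", "moderate", "low") given in the text.

```python
from typing import Mapping


def answers_positively(states: Mapping[str, float],
                       fear_veto: float = 0.7) -> bool:
    """Decide whether an agent accepts a cooperation request.

    Assumed rule: a high level of the basic emotion fear overrules all
    other internal states; otherwise the strongest motive must outweigh
    the remaining fear.
    """
    fear = states.get("fear", 0.0)
    if fear >= fear_veto:
        return False
    motives = ("hunger", "hope", "social_desire", "care")
    strongest_motive = max(states.get(m, 0.0) for m in motives)
    return strongest_motive > fear


# Hypothetical levels in [0, 1] that follow the verbal description.
agent_b = {"fear": 0.9, "pride": 0.5, "reproach": 0.5,
           "care": 0.6, "social_desire": 0.6}
agent_c = {"lust": 0.8, "fear": 0.1, "hope": 0.8, "hunger": 0.2}

print(answers_positively(agent_b))  # False: fear overrules care and desire
print(answers_positively(agent_c))  # True: hope outweighs the low fear
```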

Figure 2

Internal states of Agents A, B, and C in the decision-making process of the cooperation for energy recovery scenario. (a) Internal states of Agent A that lead to the formulation of a request. (b) Internal states of Agent B that lead to the rejection of the request. (c) Internal states of Agent C that lead to a positive answer.

4. Neurosymbolic Intelligence

The model introduced in Section 2 presents a general framework for environment recognition, decision-making, and action execution in automation systems based on neuro-cognitive insights about the human brain. The first simulation and validation of this framework was presented in Section 3. In this simulation, the different modules were implemented in a rule-based form (hard-coded rules and fuzzy rules) in order to determine output data based on incoming data. In further development steps, the aim was to substitute these rules by approaches that are closer to the neurophysiological and neuropsychological information processing principles of the brain. The result of this research effort was the elaboration of the so-called neuro-symbolic information processing principle [3]. The first module to which this method was applied was the recognition module [29]. In later steps, it was also attempted to apply this mechanism to the action execution module and to the representation of emotions, drives, and desires. An overview of the neuro-symbolic principle is given in the following, with a focus on the recognition system and further remarks on the application to other areas.

4.1. Neuro-Symbolic Recognition

Figure 3 gives an overview of the neuro-symbolic recognition model. Recognition, also referred to as perception, always starts with sensor values. These sensor data are processed in a neuro-symbolic network, which comprises the perceptual memory, and result in the perception of what is going on in the environment. The perception process is assisted by semantic memory and provides output information to the episodic memory and the decision-making modules. The neuro-symbolic network is the central element of the model and is concerned with the so-called neuro-symbolic information processing. Due to length constraints of this paper, we focus only on the description of this module.

Figure 3

Overview of the neuro-symbolic recognition model.

The basic information processing units of the neuro-symbolic network are so-called neuro-symbols. The idea of using neuro-symbols as elementary information processing units came from the following observation: in the brain, information is processed by neurons. However, humans do not think in terms of firing nerve cells but in terms of symbols. In perception, these symbols are perceptual images like a face, a person, a melody, a voice, and so forth. Neural and symbolic information processing can be seen as information processing in the brain on two different levels of abstraction. Nevertheless, there seems to exist a correlation between these two levels. In fact, neurons have been found in the brain that react, for instance, exclusively if a face is perceived in the environment [30–32]. This fact can be seen as evidence for such a correlation and was the motivation for using neuro-symbols as basic information processing units. Neuro-symbols show certain characteristics of neurons and others of symbols. Analyses of structures in the human mind have shown that certain characteristics and mechanisms are repeated on different levels, for example, afference and efference. This repetition of characteristics is a key element of the concept of neuro-symbolic processing.

In perception, neuro-symbols represent perceptual images—symbolic information—like persons, faces, voices, melodies, textures, odours, and so forth. Each neuro-symbol has an activation degree. This activation degree indicates whether the perceptual image it represents is currently present in the environment. Neuro-symbols have several inputs and one output. Via the inputs, information about the activation degree of other neuro-symbols is collected. These activation degrees are then summed up and result in the activation degree of the particular neuro-symbol. If this sum exceeds a certain threshold value, the neuro-symbol is activated and information about its own activation degree is transmitted via the output to other neuro-symbols. Neuro-symbols can process information that comes in concurrently, within a certain time window, or in a certain succession. Additionally, neuro-symbols can have so-called properties, which specify them in more detail. One important example for such a property is the location of the perceptual images in the environment.
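The summation-and-threshold behaviour of a single neuro-symbol can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class layout, the example symbols "eyes" and "mouth", the thresholds, and the location value are assumptions. Only concurrent inputs are modelled; time windows and successions of inputs are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NeuroSymbol:
    """A unit with several inputs, one output, and an activation degree."""
    name: str
    threshold: float  # hypothetical value
    inputs: List["NeuroSymbol"] = field(default_factory=list)
    properties: Dict[str, object] = field(default_factory=dict)
    activation: float = 0.0

    def output(self) -> float:
        # The activation degree is transmitted only above the threshold.
        return self.activation if self.activation > self.threshold else 0.0

    def update(self) -> float:
        # Sum the activation degrees received from the input neuro-symbols.
        self.activation = sum(s.output() for s in self.inputs)
        return self.output()


# Lowest-level symbols: in this sketch their activation is set directly,
# as if it had been derived from sensor data.
eyes = NeuroSymbol("eyes", threshold=0.5, activation=1.0)
mouth = NeuroSymbol("mouth", threshold=0.5, activation=1.0)
face = NeuroSymbol("face", threshold=1.5, inputs=[eyes, mouth],
                   properties={"location": "door"})

print(face.update())  # 2.0: both inputs are active, threshold exceeded
mouth.activation = 0.0
print(face.update())  # 0.0: the sum 1.0 does not exceed the threshold 1.5
```

The `properties` dictionary stands in for the properties mentioned above, such as the location of the perceptual image in the environment.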

To perform complex tasks, neuro-symbols are combined and structured to neuro-symbolic networks. As archetype for this neuro-symbolic architecture, the structural organization of the perceptual system of the human brain as described by Luria [32] is taken. According to Luria, the starting point for perception are the sensory receptors of different modalities (visual, acoustic, somatosensory, gustatory, and olfactory perception). The information from these receptors is then processed in three hierarchical levels. In the first two levels, the information of each sensory modality is processed separately and in parallel. In the third one, the information of all sensory modalities is merged and results in a multimodal (modality neutral) perception of the environment. In the first level, simple features are extracted from the incoming sensory data. In the first level of the visual system, neurons fire to features like edges, lines, colours, movements of a certain velocity and into a certain direction, and so forth. In the second level, a combination of extracted features results in a quite complex representation of all aspects of the particular perceptual modality. In the visual system, perceptual images like faces, a person, or other objects are perceived at this level. On the highest level, the perceptual aspects of all modalities are merged. An example would be to perceive the visual shape of a person, a voice, and a certain odour and conclude that all this information belongs to a particular person currently talking.

In analogy to this modular hierarchical structure of the perceptual system of the human brain, neuro-symbols are structured into neuro-symbolic networks (see Figure 4). Here too, sensor data are the starting point for perception. These input data are processed in different hierarchical levels into more and more complex neuro-symbolic information until they result in a multimodal perception of the environment. Neuro-symbols of different hierarchical levels are labelled differently according to their function. Neuro-symbols of the first level are called feature neuro-symbols, neuro-symbols of the next two levels are labelled subunimodal and unimodal neuro-symbols, and the neuro-symbols of the highest levels are referred to as multimodal neuro-symbols and scenario neuro-symbols. Neuro-symbols of one level form the symbol alphabet for the next higher level. Each neuro-symbol of the higher level is activated by a certain combination of neuro-symbols of the level below. Concerning the sensor modalities, sensors can be used that have an analogy in human sensory perception, like video cameras for visual perception, microphones for acoustic perception, tactile sensors for tactile perception, and chemical sensors for olfactory perception. Furthermore, sensors can be used that have no analogy in the human senses, like sensors for the perception of electricity or magnetism. Which sensor data trigger which neuro-symbols, and which lower-level neuro-symbols activate which neuro-symbols of the next higher level, is defined by the connections between them. There exist forward connections as well as feedback connections. These connections are not fixed structures; they can be learned from examples [15]. Learning allows great flexibility and adaptation of the system, because learning is a process that involves all levels of the network.
In the current approach, learning is intended to modify the connections between neuro-symbols, but future approaches will also change the structure of the network itself, thus allowing increased flexibility and creation of new neuro-symbols.

Figure 4

Neuro-symbolic network.

4.2. Neuro-Symbolic Implementation and Use Case Description

To verify the concepts of neuro-symbolic recognition, they were applied to a building automation environment. Specifically, the test environment was the office kitchen of the Institute of Computer Technology (ICT) at the Vienna University of Technology [33, 34]. The kitchen comprises a table with eight chairs and a kitchen cabinet including a stove, a sink, a dishwasher, and a coffee machine. For testing the recognition model, the kitchen was equipped with sensors of different types: tactile floor sensors, motion detectors, door contact sensors, window contacts, light barriers, temperature sensors, a humidity sensor, brightness sensors, a microphone, and a camera. From these sensor data, different scenarios had to be perceived following the information processing principles proposed in Section 4.1. Because these measures turned the kitchen into an "intelligent" system capable of autonomously perceiving what is going on in it, it was named the Smart Kitchen.

In Figure 5, the neuro-symbol hierarchy for the detection of the three most typical events occurring in the kitchen during working hours is presented: "prepare coffee", "kitchen party", and "meeting". It is shown how level-by-level more and more meaningful and interpretable neuro-symbols are generated from partly redundant sensor data until they result in an activation of the neuro-symbols "prepare coffee", "kitchen party", and "meeting". The redundancy in sensor data allows a certain level of fault tolerance in detection. An activation of a neuro-symbol of the highest level indicates that the event it represents has been perceived in the kitchen.

Figure 5

Neuro-symbolic network for detecting the scenarios "meeting", "kitchen party", and "prepare coffee".

The event "prepare coffee" is the situation occurring most often in the kitchen and represents the activity that one or more of the employees come(s) into the kitchen, operate(s) the coffee machine, and leave(s) the kitchen again. The detection of this scenario is based on data from the video camera, the microphone, the tactile floor sensors, and the motion detectors. From the floor sensors and motion detectors, it is perceived where in the room a dynamic (moving) object is present. Together with an image processing algorithm analyzing the video data, it is concluded where in the room a person is present. The information from these sensors is partly redundant, which makes the perception more robust. In case a person is perceived close to the coffee machine and the acoustic noise emitted by the coffee machine is detected, the neuro-symbol "prepare coffee" is activated.
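The detection of the "prepare coffee" event can be sketched as a two-level combination of partly redundant sensor evidence. The sketch rests on assumptions: the sensor keys are hypothetical names, the sensor readings are reduced to booleans, and the rule that at least two of the three position sources must agree is one possible way to obtain the robustness mentioned above, not necessarily the one used in the Smart Kitchen.

```python
from typing import Iterable, Mapping


def activated(inputs: Iterable[bool], threshold: int) -> bool:
    """A symbol fires when the number of active inputs exceeds the threshold."""
    return sum(inputs) > threshold


def prepare_coffee(sensors: Mapping[str, bool]) -> bool:
    # Lower level: a person is perceived close to the coffee machine if at
    # least two of the three partly redundant sources agree (assumed rule).
    person_near_machine = activated(
        [
            sensors["floor_near_machine"],
            sensors["motion_near_machine"],
            sensors["camera_person_near_machine"],
        ],
        threshold=1,
    )
    # Acoustic evidence: the noise emitted by the coffee machine.
    machine_running = sensors["mic_coffee_machine_noise"]
    # Highest level: both perceptions are required for the scenario.
    return activated([person_near_machine, machine_running], threshold=1)


readings = {
    "floor_near_machine": True,
    "motion_near_machine": False,  # e.g. a failed motion detector
    "camera_person_near_machine": True,
    "mic_coffee_machine_noise": True,
}
print(prepare_coffee(readings))  # True: one faulty source is tolerated

readings["mic_coffee_machine_noise"] = False
print(prepare_coffee(readings))  # False: no coffee machine noise detected
```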

The "kitchen party" scenario generically describes a get-together of a number of people in the kitchen for an informal gathering, usually accompanied by food and drinks. Such informal gatherings benefit social networking and the quick exchange of ideas. This scenario is detected from the same sensor types as the "prepare coffee" event. In this case, however, two or more persons have to be detected based on video data and data from the tactile floor sensors and motion detectors. Additionally, food and drinks on the table have to be identified from the video data, and voices from the microphone data.

The "meeting" scenario describes a formal get-together for working purposes. It is usually characterized by a number of people seated in a regular arrangement around the table. They have papers or laptops to read and writing tools with them. The number of people talking at the same time is smaller and the overall noise level is lower than in the "kitchen party" scenario.
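How the two group scenarios could be told apart at the top level is sketched below. The cue names in `KitchenCues`, the requirement of at least two persons for a meeting, and the purely Boolean rules are assumptions; the cues are taken to be delivered by lower neuro-symbol layers, and noise level and the number of simultaneous speakers are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KitchenCues:
    """Mid-level percepts assumed to come from lower neuro-symbol layers."""
    person_count: int = 0
    food_and_drinks_on_table: bool = False
    voices_detected: bool = False
    seated_around_table: bool = False
    work_materials_on_table: bool = False  # papers, laptops, writing tools


def detect_group_scenarios(cues: KitchenCues) -> set:
    """Return the group scenarios whose required cues are all present."""
    active = set()
    several_people = cues.person_count >= 2
    # Kitchen party: several persons, food and drinks on the table, voices.
    if several_people and cues.food_and_drinks_on_table and cues.voices_detected:
        active.add("kitchen party")
    # Meeting (assumed rule): several seated persons with work materials.
    if several_people and cues.seated_around_table and cues.work_materials_on_table:
        active.add("meeting")
    return active
```

Returning a set rather than a single label keeps the sketch honest about ambiguous situations, for example a working lunch in which both sets of cues are present.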

The information about perceived scenarios from the recognition module is constantly passed to the decision units. Depending on which event occurs, there are different requirements concerning lighting and heating or cooling. Based on the perceived event and additional sensor information about the current temperature, brightness level, position of the sunblinds, and window status (open/closed), a decision is made on how to regulate heating, air conditioning, lighting, the position of the sunblinds, and so forth. For the "prepare coffee" scenario, for instance, standard lighting conditions are provided (main light switched on) if the outside light is not sufficient. No special adaptations are made in heating or cooling, as the person(s) are present in the room for only a few minutes, which is below the time constant of the heating and air conditioning system. The "kitchen party" event does not require particular adjustments in lighting either. However, while the "prepare coffee" scenario is a spontaneous event, the "kitchen party" can be scheduled in advance, since the facility management has access to the room schedule. This is important because the cooling or heating load is considerable and requires preparation of the room climate. Such a scenario generally lasts about 30 minutes; the impact of the (human) heat load depends, amongst other factors, on the current inside and outside temperature. In the "meeting" scenario, lighting needs special adaptation. If the outside light is not sufficient, a light above the table is switched on in addition to the main light. If laptops are used and direct sunlight shines on the screens, the sunblinds are closed. Adaptations in heating or air conditioning are made in a similar way as for the "kitchen party" scenario.
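The scenario-dependent decisions described above can be condensed into a small rule table. The following Python sketch is only one possible reading: the names `RoomState` and `decide`, the command keys, and the choice to apply standard lighting to the "kitchen party" as well are assumptions, and the actual decision units of the model are considerably richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoomState:
    """Additional sensor information used by the decision (simplified)."""
    outside_light_sufficient: bool
    laptops_in_use: bool = False
    direct_sun_on_screens: bool = False


def decide(scenario: str, state: RoomState) -> dict:
    """Map a perceived scenario and the room state to actuator commands."""
    dark = not state.outside_light_sufficient
    actions = {
        "main_light": dark,        # standard lighting when outside light is low
        "table_light": False,
        "close_sunblinds": False,
        "adapt_climate": False,    # stays off for short "prepare coffee" visits
    }
    if scenario == "kitchen party":
        actions["adapt_climate"] = True
    elif scenario == "meeting":
        actions["table_light"] = dark
        actions["close_sunblinds"] = (
            state.laptops_in_use and state.direct_sun_on_screens
        )
        actions["adapt_climate"] = True
    elif scenario != "prepare coffee":
        raise ValueError(f"unknown scenario: {scenario!r}")
    return actions
```

For example, `decide("meeting", RoomState(outside_light_sufficient=True, laptops_in_use=True, direct_sun_on_screens=True))` leaves both lights off but closes the sunblinds and requests climate adaptation.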

The Smart Kitchen is a good example of the complex interactions between different subsystems that operate in a building or room. To achieve maximum energy efficiency, the system needs to know about room occupancy. Lighting conditions have to be adapted by electric light and sunblinds depending on outside light conditions and on the activity of the user, for example, operating the coffee machine, reading journals that are on display in the kitchen, holding a meeting, or coming together for an informal break. The room climate has to be maintained, but only upon occupancy. Since the climate has much longer reaction times than, for example, lighting, the system has to either predict usage [35] or keep the climate permanently at comfort level, which is not energy efficient. Instead, the system has to operate the room in comfort mode (if it is occupied) or in pre-comfort mode (if it is unoccupied). In pre-comfort mode, the room can be operated under more relaxed conditions regarding temperature and humidity. This degree of freedom in turn allows for flexibility in the usage of renewable energy sources and for cost optimization (e.g., by cooling the room in summer at times when energy from the grid is cheap or when renewable energy is available). Lighting conditions are extremely critical, since human users react sensitively to changes, so the number of changes has to be kept to a minimum. Furthermore, there is no common lighting level for a room; it strongly depends on the geometry of and obstacles in the room as well as on the lighting installation. Maintaining a high level of comfort while at the same time optimizing for all other goals (energy efficiency, costs, usage of renewables) is a most challenging task that can be approached satisfactorily by the presented model.
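The interplay of comfort mode, pre-comfort mode, and opportunistic cooling can be sketched as follows. The temperature bands are hypothetical values chosen for illustration, as are the names `Band`, `select_band`, and `should_cool`; humidity and heating are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """Admissible room temperature range in degrees Celsius."""
    low_c: float
    high_c: float


COMFORT = Band(low_c=21.0, high_c=24.0)      # hypothetical values
PRE_COMFORT = Band(low_c=18.0, high_c=27.0)  # hypothetical, more relaxed


def select_band(occupied: bool) -> Band:
    """Comfort mode if the room is occupied, pre-comfort mode otherwise."""
    return COMFORT if occupied else PRE_COMFORT


def should_cool(temp_c: float, occupied: bool, cheap_or_renewable: bool) -> bool:
    """Decide whether to cool the room in summer operation."""
    if temp_c > select_band(occupied).high_c:
        return True  # the active band is violated, cooling is mandatory
    # In pre-comfort mode, the slack in the band is used to pre-cool towards
    # the comfort band when energy is cheap or renewable energy is available.
    return (not occupied) and cheap_or_renewable and temp_c > COMFORT.high_c
```

At 25 °C, an occupied room is cooled immediately, whereas an unoccupied room is cooled only if cheap or renewable energy is available at that moment.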

4.3. Further Neuro-Symbolic Representations

Similar to the recognition module of the model depicted in Figure 1, the neuro-symbolic information representation and processing principle can also be applied to the action execution unit for the representation of procedural memory. As described by Goldstein [30], the motor cortex, which is responsible for action planning and action execution, is organized in a modular hierarchical manner, like the perceptual system. In contrast to the recognition unit, the information flow in the action execution unit is directed top-down from higher to lower levels. In the recognition unit, neuro-symbols receive information from various sources and are only activated if their activation degree exceeds a certain threshold; motor neuro-symbols work the other way around. Their task is to distribute information about a planned action to various targets, and they therefore activate various neuro-symbols of the next lower level. At the highest level, neuro-symbols represent whole action plans as a reaction to a certain situation. Based on this, neuro-symbols representing the different sub-tasks of this action plan are activated in a certain sequence at the level below. From layer to layer, these action commands become more and more detailed until the last layer comprises neuro-symbols that directly result in the activation of certain muscles and muscle groups in a certain sequence. In technical systems, these muscle activations can be substituted by the activation of certain actuators or the triggering of alerts. Again, neuro-symbols of a lower level are the symbol alphabet of the level above and therefore allow a flexible reuse of defined structures.
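The top-down expansion of an action plan into actuator commands can be illustrated with a short sketch. The hierarchy below, including all symbol and command names, is a hypothetical example for a technical system and is not taken from the model's implementation.

```python
# Each motor neuro-symbol expands, in order, into symbols of the next lower
# level. Names that do not appear as keys are treated as actuator commands,
# i.e. symbols of the lowest level. All names are hypothetical.
MOTOR_HIERARCHY = {
    "prepare room for meeting": ["provide meeting light", "shade screens"],
    "provide meeting light": ["switch_on_main_light", "switch_on_table_light"],
    "shade screens": ["close_sunblinds"],
    # A second plan reuses a lower-level symbol of the same alphabet.
    "provide standard light": ["switch_on_main_light"],
}


def expand(symbol: str, hierarchy: dict = MOTOR_HIERARCHY) -> list:
    """Expand a motor neuro-symbol top-down into an ordered command sequence."""
    children = hierarchy.get(symbol)
    if children is None:
        return [symbol]  # lowest level: directly drives an actuator
    commands = []
    for child in children:
        commands.extend(expand(child, hierarchy))
    return commands
```

Calling `expand("prepare room for meeting")` yields the sequence `switch_on_main_light`, `switch_on_table_light`, `close_sunblinds`, while `expand("provide standard light")` reuses the first of these commands, which reflects the reuse of lower-level symbols as an alphabet for the level above.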

Besides recognition and action execution, neuro-symbols can also serve for the representation of emotions as used in the pre-decision and decision modules of Figure 1. In this case, neuro-symbols represent emotional states like lust, anger, panic, fear, hope, pride, and so forth. The activation of these neuro-symbols is triggered by sensory receptors perceiving the internal states of the body, by neuro-symbols of the recognition unit, or by higher cognitive activities. Further details concerning the representation of emotions via neuro-symbols and the structure of such neuro-symbolic networks have already been discussed in [25].

A representation similar to that of emotions might also be conceivable for drives and desires. Apart from this, an interesting next step would be to explore whether other types of memory (episodic memory, semantic memory, and working memory) can also be represented by the neuro-symbolic coding scheme, and to investigate how all these different neuro-symbolic representations interact in the process of decision making.

5. Conclusion

In this paper, the issue of maintaining quality and comfort goals in building automation while at the same time optimizing towards energy efficiency was addressed by presenting a bionic model for environment recognition, decision making, and action execution. The model incorporates concepts like emotions, drives, desires, perceptual memory, procedural memory, episodic memory, and semantic memory and provides significant schematic and analytical insights into processes taking place in the mind, with a clarity not seen so far. By these mechanisms, it becomes possible to handle large amounts of information and negotiate a system behaviour that resolves conflicting demands. In this sense, the presented model is a first step towards a future generation of truly "intelligent" automation systems.


References

  1. Llinas J, Bowman C, Rogova G, Steinberg A, Waltz E, White F: Revisiting the JDL data fusion model II. Proceedings of the 7th International Conference on Information Fusion (FUSION '04), July 2004 1218-1230.

  2. Passino K, Yurkovich S: Fuzzy Control. Addison-Wesley, New York, NY, USA; 1998.

  3. Velik R, Bruckner D: Neuro-symbolic networks: introduction to a new information processing principle. Proceedings of the 6th IEEE International Conference on Industrial Informatics (INDIN '08), July 2008 1042-1047.

  4. Velik R: Towards human-like machine perception 2.0. International Review on Computers and Software. In press, Special Section on Advanced Artificial Networks

  5. Velik R, Lang R, Bruckner D, Deutsch T: Emulating the perceptual system of the brain for the purpose of sensor fusion. In Human-Computer System Interaction: Innovation in Hybrid System Intelligence. Springer, Berlin, Germany; 2009.

  6. Dietrich D, Fodor G, Zucker G, Bruckner D: Simulating the Mind: A Technical Neuropsychoanalytical Approach. Springer, Berlin, Germany; 2008.

  7. Dietrich D, Sauter T: Evolution potentials for fieldbus systems. Proceedings of the IEEE International Workshop on Factory Communication Systems (WFCS '00), 2000 343-350.

  8. Russ G: Situation-dependent behavior in building automation, Ph.D. thesis. Vienna University of Technology, Vienna, Austria; 2003.

  9. Tamarit-Fuertes C: Automation system perception—first step towards perceptive awareness, Ph.D. thesis. Vienna University of Technology, Vienna, Austria; 2003.

  10. Pratl G: Processing and Symbolization of Ambient Sensor Data, Ph.D. thesis. Vienna University of Technology, Vienna, Austria; 2006.

  11. Velik R: Quo Vadis, intelligent machine? BRAIN. Broad Research in Artificial Intelligence and Neuroscience 2010,1(4).

  12. Perlovsky LI, Weijers B, Mutz CW: Cognitive foundations for model-based sensor fusion. Signal Processing, Sensor Fusion, and Target Recognition XII, April 2003, Orlando, Fla, USA, Proceedings of SPIE 494-501.

  13. Davis J: Biological sensor fusion inspires novel system design. Proceedings of the Joint Service Combat Identification Systems Conference, 1997

  14. Velik R: From single neuron-firing to consciousness—towards the true solution of the binding problem. Neuroscience and Biobehavioral Reviews 2010,34(7):993-1001. 10.1016/j.neubiorev.2009.11.014

  15. Velik R: A Bionic Model for Human-Like Machine Perception. VHS; 2008.

  16. Burgstaller W, Lang R, Pörscht P, Velik R: Technical model for basic and complex emotions. Proceedings of the 5th IEEE International Conference on Industrial Informatics (INDIN '07), June 2007, Vienna, Austria 1033-1038.

  17. Burgstaller W: Interpretation of situations in buildings, Ph.D. thesis. Vienna University of Technology, Vienna, Austria; 2007.

  18. Velik R: Why machines cannot feel. Minds and Machines 2010,20(1):1-18. 10.1007/s11023-010-9186-y

  19. Deutsch T, Lang R, Pratl G, Brainin E, Teicher S: Applying psychoanalytic and neuro-scientific models to automation. Proceedings of the 2nd IET International Conference on Intelligent Environments (IE '06), July 2006, Athens, Greece 111-118.

  20. Lang R, Bruckner D, Velik R, Deutsch T: Scenario recognition in modern building automation. International Journal of Intelligent Systems and Technologies 2009,4(1):36-44.

  21. Velik R, Bruckner D: A bionic approach to dynamic, multimodal scene perception and interpretation in buildings. International Journal of Intelligent Systems and Technologies 2009,4(1):1-9.

  22. Velik R, Zucker G: Autonomous perception and decision making in building automation. IEEE Transactions on Industrial Electronics 2010,57(11):3645-3652.

  23. Bruckner D, Velik R: Behavior learning in dwelling environments with hidden Markov models. IEEE Transactions on Industrial Electronics 2010,57(11):3653-3660.

  24. Velik R, Boley H: Neurosymbolic alerting rules. IEEE Transactions on Industrial Electronics 2010,57(11):3661-3668.

  25. Solms M, Turnbull O: The Brain and the Inner World—An Introduction to the Neuroscience of Subjective Experience. Other Press, New York, NY, USA; 2002.

  26. Roesener C: Adaptive behavior arbitration for mobile service robots in building automation, Ph.D. thesis. Vienna University of Technology, Vienna, Austria; 2007.

  27. Palensky B: From neuro-psychoanalysis to cognitive and affective automation systems, Ph.D. thesis. Vienna University of Technology, Institute of Computer Technology, Vienna, Austria; 2008.

  28. Lang R: A decision unit for autonomous agents based on the theory of psychoanalysis, Ph.D. thesis. Vienna University of Technology, Institute of Computer Technology, Vienna, Austria; 2010.

  29. Velik R: A bionic model for human-like machine perception, Ph.D. thesis. Vienna University of Technology, Institute of Computer Technology, Vienna, Austria; 2008.

  30. Goldstein E: Wahrnehmungspsychologie. Spektrum Akademischer; 2002.

  31. Goldstein E: Sensation and Perception. Thomson Wadsworth; 2007.

  32. Luria A: The Working Brain—An Introduction in Neuropsychology. Basic Books; 1973.

  33. Goetzinger S: Scenario recognition based on a bionic model for multi-level symbolization, M.S. thesis. Vienna University of Technology, Vienna, Austria; 2006.

  34. Richtsfeld A: Szenarienerkennung durch symbolische datenverarbeitung mit fuzzy-logic, M.S. thesis. University of Technology; 2007.

  35. Bruckner D: Probabilistic models in building automation: recognizing scenarios with statistical methods, Ph.D. thesis. Vienna University of Technology, Vienna, Austria; 2007.


Author information

Corresponding author

Correspondence to Gerhard Zucker.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Velik, R., Zucker, G. & Dietrich, D. Towards Automation 2.0: A Neurocognitive Model for Environment Recognition, Decision-Making, and Action Execution. J Embedded Systems 2011, 707410 (2011).
