Our expertise in technology, cognitive science, mental health and the philosophy of learning enables STEER to advise on responsible applications of machine technologies to education systems.
'AI already drives our lives. Directing where it takes us depends not on our IQ, or our EQ, but on our control of our cognitive steering.'
Dr Simon Walker, CEO, STEER
Uniquely human cognition
In 2011 the UK education company STEER began a programme of research around cognitive processes which could be described as ‘machine-resistant’. The programme was led by Dr Simon Walker, an applied cognitive biologist.
Whilst advances in machine learning have accelerated rapidly, the basic architecture and coding principles of machine learning were established back in the 1980s. Applied at scale, across networked machines harvesting a wider variety of data, these processes have achieved rapid gains in replicating, and in some cases beating, instances of human cognitive skill previously considered uniquely human.
The STEER board anticipated that it would become increasingly important to identify cognitive functions that humans possess but that machines are unlikely to replicate. The goals were to inform sustainable and strategic educational directions; safeguard uniquely human skills; qualify the remit of AI in education; inform the moral development of AI; and clarify themes of human identity.
To progress these goals, STEER drew on specialisms from philosophy, psychology, cognitive and computer science, education and pedagogy. The STEER programme centred on a series of large quantitative school studies testing a theoretical data model which, previous research had indicated, might describe cognitive functions likely to be uniquely human. Cognitive functions which did not conform to an algorithmic, computational description were considered machine-resistant.
The programme's methodologies, outcomes and conclusions were published in a series of studies between 2012 and 2015. The term 'steering cognition', or simply 'steering', was used to refer to the cognitive function described.
In 2018 STEER began work to develop a framework for rating AI-based educational technologies. This framework seeks to integrate the full range of perspectives considered relevant to educational decision-making: moral framework, philosophical considerations, practical limitations, commercial realities, political risks, cognitive foundations, pedagogical processes and sound educational outcomes.
The development of AI-based educational technology should be understood within a longer tradition of methods used to improve education. Lessons can be learned from the successes and failures of previous examples. In particular, educational technologies have frequently been mistaken for educational benefits in themselves, rather than being seen simply as potential channels which may or may not enhance educational outcomes. Technology developers have sometimes lacked an understanding of pedagogy, adolescent psychology and school realities, leading them to claim benefits which then prove unachievable.
At the same time, teachers have sometimes been seduced by the promise that new hardware and software will dramatically improve or alter the fundamental basis of teaching and learning.
The promises of AI are currently over-hyped. A healthy degree of scepticism is required to see beyond the hype and to evaluate the risks and downsides as well as the possible gains.
The draft STEER framework, which can be downloaded below, is intended to contribute an explicit and reasonable set of questions with which to scrutinise new AI educational advances. A final version will be published after review and consultation with other experts.
Responsible policies for the application of AI in education
AI questions some basic assumptions about being human. The goal of education is to preserve, extend and enrich human endeavour and cooperation on the planet. In the light of this, AI raises challenging questions for educational policy makers, which involve careful and wise political, commercial, moral, societal, psychological, industrial, economic and philosophical consideration.
STEER encourages policy makers to take a long term perspective including the following considerations:
A basic question for AI-based applications is the extent to which any new pedagogical technology enhances uniquely human cognitive abilities. Without a proper answer to this question, we run both a political and an existential risk. The political risk is that societies may perceive AI as a fundamental threat to human progress and control, and oppose its use at every level. The existential risk is that our confidence in, and investment in, future artistic, intellectual, scientific and literary progress becomes uncertain if we cannot clearly distinguish the unique value of human activity.
How a human thinks differently from a machine must be answered at a mechanistic level, a cognitive level, a psychological level and a social level. It is important that these distinctions do not remain in the papers of academics but are communicated clearly and effectively to inform educational ideology and teaching practice. Teachers will need to understand how their maths, or science, or history lesson contributes a kind of knowledge and skill which cannot be replicated by a machine; otherwise their confidence in their enterprise may ebb away.
Related to this is the need to explicitly teach students a theory of knowledge. Students will need to distinguish between data, information, knowledge, abstraction and felt experience, so that they can discriminate between lower-level machine analytics and higher-level data interpretation.
Like any other field, educational AI is being developed in a market in which both private and state actors act to exert influence and control. Understanding this, it is important to recognise that machines are morally blind, but not morally neutral. Machine learning algorithms must first be trained on a set of data in order to recognise patterns. The data they are trained on determines the patterns they detect and the meanings they ascribe to them. If the dataset is distorted (for example, a dataset with a disproportionate amount of negative data on violence committed by women), then the machine will be trained to see women as more violent and to predict that pattern in future.
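The mechanism can be made concrete with a toy sketch. Everything below is invented for illustration: the dataset, the numbers and the function names are ours, and the 'model' is just frequency counting rather than any real machine learning system.

```python
from collections import Counter

# Invented toy dataset: (group, violent_incident) records in which
# violent cases from group "B" are deliberately over-collected.
biased_sample = (
    [("A", False)] * 90 + [("A", True)] * 10 +   # group A: 10% of records violent
    [("B", False)] * 50 + [("B", True)] * 50     # group B: skewed to 50% violent
)

def train(records):
    """'Training' here is just counting: the model's only knowledge
    is the per-group violence rate observed in its training data."""
    totals, violent = Counter(), Counter()
    for group, is_violent in records:
        totals[group] += 1
        violent[group] += int(is_violent)
    return {group: violent[group] / totals[group] for group in totals}

model = train(biased_sample)
# The model faithfully reproduces the distortion in its training set:
# it now 'predicts' group B to be five times more violent than group A.
print(model["A"])  # 0.1
print(model["B"])  # 0.5
```

The point is not that the arithmetic is wrong; it is perfectly correct. The point is that the machine has no way to question the sample it was given.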
The implication for educational policy makers is that they must consider how their education systems will train, supervise and regulate each generation of engineers to develop socially responsible code. By way of historical analogy, in mediaeval Europe a religious priesthood politically controlled society through its exclusive access to, and understanding of, Latin biblical texts. Machine coding creates a new kind of priesthood, authoring the code on which society relies for its knowledge of the world. The new priesthood resides in anonymous warehouses and offices in California, Oregon, Moscow, Beijing, Pyongyang and London. Educational policy makers must consider specifically how those who occupy such warehouses are educated.
In addition, educational policy makers must consider the ethical and regulatory framework within which they will engage with third parties in mass data collection.
Educational digital technologies inherently construct an additional interface through which a student engages with the learning process. Inevitably, the social-psychological dynamic of learning is changed. For example, facial or other biometric monitoring alters the dynamic fundamentally. The learner becomes scrutinised, potentially on a continuous basis, and may not be aware of the uses to which the data is put. As a kind of educational surveillance, this is likely to create a state of hypervigilance, whose effects on the student's ability and willingness to learn need to be much more fully investigated. In addition, adolescence is a stage when students socially monitor what they present and what they hide; students who hide more are at higher mental health risk. An inadvertent consequence of AI-enhanced student monitoring may therefore be to amplify the rising psychological health risks already faced by adolescents in many countries.
This one example illustrates how AI-technology should be scrutinised by a range of disciplines and practitioners.
Fundamentally, AI-based educational applications outsource cognitive load from the learner to an external digital environment. In investing in adaptive educational environments, policy makers must review the evidence of inadvertent cognitive regressions.
Some studies provide direct evidence that the more learners rely on [some] adaptive digital resources, the more unrealistically inflated their self-evaluation of their own cognitive skills becomes. These findings come against a wider background of stalling and reversing Flynn effects (the generational IQ gains seen through the twentieth century) across developed countries, attributed by some researchers to social and environmental effects, including the ubiquity of technology.
To pose the key question: will AI construct adaptive educational roads for users to better tailor their learning, or will it provide the equivalent of an autonomous educational car in which the learner’s decisions are made by the machine rather than the driver?
The previous point then poses the next consideration for policy makers. AI offers responsive, adaptive educational road designs to the learner. This has the potential to accelerate the acquisition of knowledge as the learner follows the adaptive road, without expending the cognitive labour of searching for, discriminating between and linking knowledge themselves. However, higher-order thinking involves the ability to navigate epistemological landscapes in which there are no pre-existing roads. Such thinking is essential beyond the classroom; it is required to solve all new social, cognitive and artistic challenges.
In addition, this is also the skill which will enable humans to collaborate effectively with machines in a future economy. Machine intelligence relies on the identification of the right question and the right dataset in the first place; to extend the analogy, the ability of AI to enhance learning depends on the human capacity to steer its powerful engine in the right direction.
Education is the environment which must train this metacognitive steering skill, which centres around self-regulation, map-making and cognitive flexibility. However, to date education has found it easier to measure academic engine power and progress (grades) rather than metacognitive steering skills. AI puts pressure on policy makers to develop a partner curriculum to target, measure and improve metacognitive skills.
AI which is of value is AI which works in the real world. For education, this means technologies which work within the limitations of available time, hardware and software, and teacher skills. Policy makers should ask: how feasible is it for teachers to use this technology in class and in school? What constraints of time, skill or cost exist?
What intangible obstacles have been considered? Schools are uniquely complex environments, educating young people who are on a journey of physical, intellectual, social and emotional change. What can seem like a perfectly acceptable activity to an adult can be excruciatingly embarrassing for an awkward teenager. Consider, for example, the reluctance of students to use 'wearable technologies' in the social setting of the classroom. Equally, technology may enhance the ability of young people to overcome adolescent social anxieties, for example by collaborating in online rather than face-to-face discussion.
Why is steering cognition machine-resistant?
Machine intelligence is progressing at a tremendous rate; it was once considered impossible for a machine to beat a Grandmaster at chess. In 2016, Google DeepMind's AlphaGo overcame top professional players at the far less constrained game of Go. Surely it is likely that, at some point, machines will be able to replicate the brain functions which constitute steering cognition?
This is indeed possible, but we believe it is a long way off and may in fact never be achievable. The reason is that machine intelligence and steering cognition use two different processing architectures: machine intelligence is built on an algorithmic architecture, whilst steering cognition is built on an associative one.
The associative architecture of steering cognition
The central function of steering cognition is as a mental simulator, which enables the brain to ‘manipulate or turn round’ novel, external data to work out what kind of data it is, how to attend to it, where to locate it in our long term memory and how to act back out into our environment in response to it.
To visualise the function of steering cognition in the brain, think of the board game ‘Downfall’. The object of the game is to get counters of the right colour into the right container at the bottom, via a series of cogs. The counters are like ‘data’ from the outside world, which come in a huge array of forms and colours. The containers at the bottom are like our existing memory in our minds: our brain creates long term memories by categorising counters into groups with similar associations (shape, size, smell, feel) so that we can recognise a leaf as a leaf, or a smile as a smile.
To get the coloured counters from the varied outside world into the right containers that exist in our minds, we need an equivalent of the ‘Downfall cogs’ in between. Steering cognition is our ‘Downfall cogs’. The steering cognition cogs enable our brains to turn round the data, work out what kind of data it is, whether we have seen it before and how it relates to our existing categories.
This turning takes place in our working memory and involves our imagination: we use our imagination to mentally simulate whether we have experienced this data before, how we reacted to it if we did, how much attention to pay to it now, and how to relate it to our existing memories.
The brain’s steering cognition cogs have to be very flexible, constantly adjusting between and within the varied languages through which we ‘read the world’. We read ‘emotional’ data, ‘social’ data, ‘spatial/physical’ data, ‘numerical’ data and, of course, ‘linguistic’ data.
In the course of an everyday task, like shopping in the supermarket or chatting with friends over a drink, ‘the meaning’ is contained not in one language, but within the fluid combination of these different data languages (gestures, tones, words, numbers, ideas etc.). Steering cognition prioritises, adjusts and regulates our limited attentional focus between these different kinds of data, in order to detect the meaning of the whole. When it fails to regulate appropriately, our attention and subsequent action can become biased, focused on some languages over others.
The ability to recruit imagination is critical to achieve such attentional regulation, because imagination allows us to ‘see ourselves in relation to’ new data experiences. The imagination ‘puts us into the picture’ so to speak in the first person; in this way it ‘recruits and associates’ past emotional, social, linguistic, numerical memories with the new experience; new data is initially not processed procedurally and atomistically, but holistically and integratively.
Often, those associations are oblique, ambiguous, unresolved and putative, waiting to be more fully crystallised as we anchor the new more closely into our existing structures of memory and meaning (which is why the imagination is also the realm of metaphor, symbol, allusion and inference). Tolerance of the unresolved is critical: by sustaining and retaining such putative ‘associations’ in our mental simulation circuitry, we can create time to make connections with almost any new experience or kind of data. This allows us to adjust to, process and incorporate into our internal memory a much richer, more combinative and unpredictable stream of information than any other animal species can.
Machines, on the other hand, are restricted to data types for which they already have an existing internal coding architecture; otherwise the external data ‘does not fit’. Internet companies have faced, and had to overcome, this problem in its simplest form. Different applications (Facebook, Instagram, Blogger) are coded in different data architectures. When we post a picture on Instagram and want to link it to our Facebook page, the data has to be translated through an external, third-party interface programme (an API). It cannot be read directly.
Internet companies are having to create more and more APIs to link different applications; maybe one day they will have created thousands. The brain has to have a ‘steering cognition’ API for an almost limitless array of external data grammars. This helps us appreciate how difficult it is for any cognitive processor (e.g. the brain or a computer) to process data from an external source whose structure is unpredictable and different from that held in its internal database (memory). It is for this reason that machines are only intelligent when faced with narrowly predictable and routine environments, e.g. a chess game, maths problems or stock-market trading judgements: in each, the data comes in one format.
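As a minimal sketch of what such an API layer does, consider translating one application's data grammar into another's. The field names below are invented for illustration; they are not the real Instagram or Facebook schemas.

```python
# Invented schemas: the same photo post in two different internal
# data architectures. Neither application can read the other's
# format directly; the adapter function below is the only bridge.
instagram_style = {
    "img_url": "https://example.com/sunset.jpg",
    "caption": "Sunset over the bay",
    "ts": 1700000000,
}

def to_facebook_style(post):
    """A toy 'API': translate one fixed data grammar into another.
    It works only because both formats are known in advance."""
    return {
        "picture": post["img_url"],
        "message": post["caption"],
        "created_time": post["ts"],
    }

facebook_style = to_facebook_style(instagram_style)
```

Each such adapter handles exactly one pair of fixed formats; steering cognition, by contrast, improvises the translation for data grammars it has never met before.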
The algorithmic architecture of machine and analytical learning
The brain’s almost limitless ability to process varied data grammars relies on its capacity to ‘associate’ unrelated data via its mental simulator, the imagination. The imagination’s ‘associative processing’ is what makes steering cognition critically and uniquely human. Machines, by contrast, make judgements by following programmed algorithms (tens of thousands of procedures in a step-by-step sequence to arrive at an answer). In this, machines analyse data much as the brain sieves through, analyses, computes and finds patterns in its existing retained memories.
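To make ‘algorithmic’ concrete: an algorithm is a fixed, fully specified sequence of steps over data in a single expected format. Binary search is a standard textbook example; the code below is our own illustration, not part of the STEER research.

```python
def binary_search(sorted_numbers, target):
    """A fixed, step-by-step procedure: every decision is specified in
    advance, and the input must arrive in exactly one expected format
    (a sorted list of numbers). No step involves association or imagination."""
    lo, hi = 0, len(sorted_numbers) - 1
    while lo <= hi:
        mid = (lo + hi) // 2              # deterministic rule at every step
        if sorted_numbers[mid] == target:
            return mid
        elif sorted_numbers[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None                            # target absent: the procedure halts

# Even numbers 0..198; the value 84 sits at index 42.
position = binary_search(list(range(0, 200, 2)), 84)
```

Hand the same procedure an unsorted list, or data in a grammar it does not expect, and it fails silently: the algorithm has no way to notice that the input no longer fits.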
We have tests to measure the brain’s ability to process internal data algorithmically: we call them IQ tests. The capacity of the brain to perform such complex, procedural mental tasks quickly and accurately is given the term ‘general intelligence’. The term is somewhat misleading, as it suggests the overall capacity of the brain to learn. However, general intelligence is not the intelligence of the brain overall; rather, it is the ability of the brain to process and use existing, learned data algorithmically.
Machines have a superior potential to replicate and surpass the algorithmic (IQ) functions of the brain. Machine learning systems are, and currently can only be, coded as algorithms. They will inevitably become more powerful at mastering datasets like those they have encountered before.
But it is our steering cognition, rather than algorithmic processing, that enables us as humans to process data in the complexity of the real and living world.
This leads to a fairly confident set of five reasons why machines will not be able to replicate steering cognition:
1. Humans are involved in every situation we process: machines cannot be
When we ‘read’ a social situation we are also a contributor to that social situation: an agent as well as a reader. Human knowledge therefore always involves conscious mental representation of ourselves, and as such requires the capacity to become aware both of ourselves and of the state of the other people around us. A machine will never be able to do this because, by definition, it is non-human and therefore will not be ‘included’ as a contributor to the external social dataset by another human being. Our ability to ‘imagine’ ourselves in the first person, and to ‘imagine’ the state of another (empathy), centrally requires a community of imaginative participants. A singular machine intelligence would need to become a community of machine intelligences, and even then would only potentially be able to understand its own kind.
2. Steering cognition requires the capacity to self-represent
Fundamentally, steering cognition requires the thinker to ‘see themselves’ as an entity participating in a situation. The capacity to ‘self-represent’ is an emergent property of the brain’s interactions, an understanding of which remains beyond the grasp of philosophers, let alone computer scientists. The most sophisticated neural machine architectures have been developed without any understanding of what constitutes, neurally, a state of self-conscious self-representation.
3. Much external data requires associative and ambiguous processing
For example, what does a child waving mean? The answer is determined by wider context, personal history and other factors not immediately discernible from the dataset. This requires associating symbolic, gestural and metaphorical data to form a composite picture involving memory and personal story. Machines cannot do that. Philip K. Dick’s Do Androids Dream of Electric Sheep? explored this problem poignantly in 1968.
4. Data processing requires a person to move ‘into’ and ‘out of’ a situation mentally in their head, from first to third person in order to understand it
For example, how do you help a child with a nasty cough? Appropriate action requires the capacity to ‘stand in her shoes’ and feel the illness, to be gentle and kind, but also to ‘step back’ and consider medical data, temperature, symptoms etc. Real-world intelligence requires the ability to move from ‘object’ to ‘subject’ moment by moment depending on the structure of the data presenting itself. Machines cannot detect when to make those judgements because they cannot discriminate between the subtle, unpredicted data types presenting themselves moment by moment.
5. Human beings fake
Is the ill child faking to get off school? Does my smile mean I agree with you, or am I hiding my real reaction? Human cognition involves social codes of disclosure and shame which are cultural as well as personal. They are critical for social cohesion and influence; we use our self-presentation to exert influence upon other human beings. The HAL problem in 2001: A Space Odyssey explores machine and human non-disclosure: HAL, the computer, withholds information from the human astronaut, who works out that HAL is withholding and works around him. HAL, on the other hand, is flummoxed when the astronaut becomes non-disclosing, because it cannot compute what his intentions are.