The aim of this paper is to reconstruct and critically assess the evidential relationship between neuroscience and educational practice. To do this, I reconstruct a standard way in which evidence from neuroscience is used to support recommendations about educational practice, that is, testing pedagogical interventions using neuroimaging methods, and I discuss and critically assess the inference behind this approach. I argue that this inference rests on problematic assumptions and, therefore, that neuroimaging intervention studies have no special evidential status as a basis for educational practice. I conclude by arguing that these limitations could be resolved by integrating evidence from neurocognitive and educational science.
Keywords: Educational neuroscience, evidence-based educational practice, neuroimaging, interdisciplinary integration
The aim of this paper is to reconstruct and assess the relationship between neuroscience and educational practice, from the perspective of philosophy of science. To do this, I reconstruct a standard way in which evidence from neuroscience is used to support recommendations about educational practice, that is, testing pedagogical interventions using neuroimaging methods; and discuss and critically assess the inferences behind this approach.
In this paper, I consider the relationship between neuroscience and education as a special case of the concept of evidence-based practice in education.1 The idea that educational practice should be informed by evidence has nowadays become a common slogan. An indication of this tendency is the flourishing of initiatives whose main objective is to facilitate and improve the collaboration between science and practice. Examples of such initiatives are the What Works Clearinghouse set up by the US Department of Education2 and the Evidence for Policy and Practice Centre in London.3 A rather recent development in the discourse about evidence-based practice in education is the introduction of neuroscience as a specific type of evidence base for educational practice (The Royal Society, 2011a; The Royal Society, 2011b). This new tendency has been met with both acclaim and criticism. Its advocates welcome the introduction of the clarity and generality of the hard sciences into the context-dependent field of educational practice (Carew & Magsamen, 2010). Critics, on the other hand, argue that knowledge from neuroscience lacks the context-sensitivity that is necessary for educational practice (Bruer, 1997; Lee & Ng, 2011; Stern, 2005; Smeyers, 2016).
Despite this lively discussion, a number of foundational issues have not received careful consideration in the literature. Examples of such issues are:
What evidential role (if any) can neuroscience play for educational practice?
What suggestions can be drawn from neuroscience about educational practice?
Does the available evidence warrant these suggestions?
By contrast, philosophy of science has in recent years devoted considerable attention to the methodological foundations of the evidence-based movement (Cartwright & Stegenga, 2011; Grüne-Yanoff, 2016; Marchionni & Reijula, 2019). Therefore, in this paper, I employ the theoretical resources of the present philosophical discussion about evidence-based practice and policy in order to reconstruct what it means to base educational practice on evidence from neuroscience, and to provide a critical assessment of the inferences that are used to justify this evidential relation. After clarifying the theoretical perspective on evidence and evidence-based practice assumed in this paper, I begin my argument by reconstructing the inference involved in testing pedagogical interventions by means of neuroimaging. The next section is the central step of my argument, in which I critically assess the soundness of this inference by discussing its basic assumptions. Finally, I suggest a way in which the problems I identify in the previous section could be resolved. I conclude by wrapping up the discussion and reconnecting with the aims of the paper.
2. Educational neuroscience and educational practice
Educational neuroscience is a relatively recent interdisciplinary field that employs the methods and results of cognitive neuroscience to approach issues that are relevant for educational science and pedagogy. Although not every educational neuroscientist suggests that empirical results of neuroscience should work as an evidence base for educational practice or policy, there have been examples of such an interpretation. For instance, the Royal Society report “Brain Waves 2: Neuroscience: implications for education and lifelong learning” states:
Education is about enhancing learning, and neuroscience is about understanding the mental processes involved in learning. This common ground suggests a future in which educational practice can be transformed by science, just as medical practice was transformed by science about a century ago. In this report we consider some of the key insights from neuroscience that could eventually lead to such a transformation. (The Royal Society, 2011b, p. 2)
The same report states also the following:
Neuroscience evidence should inform the assessment of different education policy options and their impacts where available and relevant. Neuroscience evidence should also be considered in diverse policy areas such as health and employment. (The Royal Society, 2011b, p. 19)
These passages suggest that neuroscience should be used in education in order to establish evidence-based practice and that the relationship between evidence and practice/policy should be the same as in medical practice. Therefore, the question is in what concrete way neuroscience can or should work as evidence for educational practice. Xavier Seron (2012) recently argued that if neuroscience is to be used to ground recommendations for practice, this should be done
[f]ollow[ing] the principles of evidence-based medicine, which are classically used in the health and educational sciences to establish the effectiveness of any intervention: they should create randomized groups, make pre- and post-intervention assessments, measure the long-term effects of the programmes and so on. (Seron, 2012, p. 102)
Testing interventions by means of randomized tests is, as Seron argues, a well-established method in evidence-based medicine. For this reason, I will consider how the same standard approach is applied when evidence from neuroscience is used as a base for educational practice, that is, when neuroimaging methods are used to estimate the differential effect of one or more interventions with relevance for some educational phenomenon (henceforth I will use the term ‘pedagogical intervention’).
In order to reconstruct this inference, I have considered a convenience sample consisting of all research articles4 (n = 122) published in the two journals dedicated to educational neuroscience (Mind, Brain and Education and Trends in Neuroscience and Education) during the period 2015-2019. Of the 122 articles, I identified 15 articles that fit the search criteria, i.e. testing a pedagogical intervention (or an intervening factor that can be connected to some teaching strategy) and using neuroimaging (Aar et al., 2019; Antonenko et al., 2019; Daly et al., 2019; Horowitz-Kraus, 2015; Karlsson Wirebring et al., 2015; H. S. Lee et al., 2015; Ludyga et al., 2018; Nenciovici et al., 2018; Nissim et al., 2018; Pietto et al., 2018; Rominger et al., 2017; Rosenberg-Lee et al., 2018; Sanger & Dorjee, 2016; Taillan et al., 2015; Takeuchi et al., 2019). I have also searched the What Works Clearinghouse archive using the terms “neuro*,” “*imaging,” “ERP,” “fMRI,” “MRI,” “MEG,” “EEG,” “fNIRS,” “Event Related,” “Magnetic Resonance,” “*encephalo*,” “tomogr*,” and “spectro*.” Only one result fit the search criteria (Neville et al., 2013). Many of these studies do not explicitly aim at formulating a classroom recommendation. Therefore, in this paper, I will only discuss whether it is possible to derive a classroom recommendation stating that teachers should employ the specific interventions from these intervention studies. The details of the selected 16 articles are summarized in Table 1.
Table 1. A summary of the selected studies.
The focus on interventions and intervening factors constrains the concept of recommendation for educational practice. The kind of recommendation that I consider in my discussion has the form “given some educational goal S, and a set of alternative courses of action I1–In, if empirical evidence supports some Im in contrast to all available alternatives relative to S, then a teacher should choose Im in relation to S.”
Therefore, Im can be anything from a specific educational intervention to a less strictly defined instructional strategy. The content of a recommendation is therefore provided by the empirical studies that support it. Consider two examples from the studies above. The first is the Reading Acceleration Program (RAP) discussed in Horowitz-Kraus (2015): this is a computer-based reading fluency training that improves word-decoding accuracy and reading. The study describes the effect of 8 weeks of training on executive functions. Therefore, the recommendation that can get support from the study is “If the goal is improving executive functions, the RAP training should be preferred to standard instruction.” The second example is the difference between teaching mathematics by presenting pupils with a solution and letting them create their own solution, discussed in Karlsson Wirebring et al. (2015). In this case, the researchers discuss a more general instructional approach that can be applied in many different ways. The suggestion that can be derived from this study is that if the goal is to obtain long-lasting learning effects, then letting pupils create their own solution should be preferred.
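The form of recommendation described above can also be written compactly. The following is only a sketch: the evidence symbol $E$ and the predicates $\mathrm{Supp}_E$ and $\mathrm{Rec}$ are notation assumed here for illustration and are not taken from the studies under discussion.

```latex
% S: educational goal; I_1, ..., I_n: alternative courses of action.
% E, Supp_E and Rec are illustrative notation (assumptions, not from the source studies).
\[
\Bigl(\forall j \in \{1,\dots,n\},\ j \neq m:\
  \mathrm{Supp}_E\bigl(I_m \succ I_j \mid S\bigr)\Bigr)
\;\Longrightarrow\;
\mathrm{Rec}\bigl(I_m \mid S\bigr)
\]
```

Read: if the evidence $E$ supports $I_m$ over every available alternative relative to $S$, then $I_m$ is recommended for $S$. In the RAP example, $S$ is the improvement of executive functions, $I_m$ is the RAP training, and the alternative is standard instruction.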
3. Evidence for theory and evidence for practice
This paper is concerned with the concept of evidential reasoning and especially with the issue of basing recommendations for educational practice on empirical evidence from neuroscience. In this section, I clarify the specific philosophical perspective that I assume concerning this issue.
The first item in my framework is the concept of evidence. From a philosophical perspective, evidential inferences involve constructing a logically valid argument using some evidential base and possibly a theory as premises and some hypothetical claim (or its negation) as conclusion. This requires making several assumptions, and these assumptions must be justifiable. From this perspective, assessing evidential reasoning can be understood as a case-by-case evaluation of the specific assumptions involved in the support of an empirical hypothesis. In this paper, I assume this methodological stance and reconstruct specific evidential inferences, concentrating on what assumptions must be in place in order for the inference to be admissible.
The second item in my framework consists of a distinction between evidence for theory and evidence for practice. Typically, a theoretical claim expresses a causal effect that we can expect from a series of intervening factors, assuming that everything resembles some class of optimal (normal) conditions. Theoretical claims are abstract models of their targets, which justifies the elimination of disturbing contextual factors and the isolation of relevant causal factors. The methods used to support theoretical claims ideally fit the status of these claims. This means that, for instance, specific methods such as experimental ones are used to support theoretical claims, because they help the process of isolation and elimination. Recommendations for practice are a different kind of claim, and for this reason, it is plausible to expect that they require other types of evidence. Such recommendations differ from theoretical claims in that many normality assumptions necessary for some theoretical claim might generally not apply to a related practical situation. For instance, once we attempt to extrapolate a laboratory result to a real-life setting, some (maybe unknown) disturbing factors that were eliminated in the lab play a role, which makes it difficult to evaluate what the expected effect will be when we move from the lab to the real-life situation. Therefore, the evidence supporting a causal process in the lab might not be sufficient to support the same causal process in real-life conditions. Consequently, the evidence supporting a theoretical claim could be insufficient to support a recommendation for practice.
The final item in my framework consists in the claim that practical recommendations require mechanistic evidence. Arguably, the context-sensitivity of practical recommendations entails that such claims must be supported by mechanistic evidence. Mechanistic evidence is the kind of evidence that supports a model of the entities and activities connecting an intervening factor with the expected effect (Illari & Williamson, 2012). It has been recently argued (Grüne-Yanoff, 2016; Marchionni & Reijula, 2019) that evidence-based policy requires mechanistic evidence. Grüne-Yanoff contrasts mechanistic evidence with difference-making evidence, i.e., the evidence that supports a claim that an intervention contributes to a difference in an expected effect. According to him, policy interventions that are only supported by difference-making evidence can fail to be “effective, robust, persistent or welfare-improving” (2016, p. 463). This is because these aspects of policy interventions are sensitive to which underlying mechanism produces the effect. Take effectiveness as an example. Knowing that the intervention makes a difference to the target variable in experimental conditions does not guarantee that the policy will be effective in the expected social setting. Instead, mechanistic evidence is helpful in order to extrapolate from an experimental to a real setting, as it is able to account for the extent to which a causal relation between intervention and effect is modulated or mediated by further factors (Steel, 2007). Knowledge about the presence or absence of such further factors, together with mechanistic evidence, allows us to form a judgment of what will happen when we apply the intervention in a real-life setting. Marchionni and Reijula (2019) provide arguments for similar claims and argue that evidence-based policy requires mechanistic evidence, which they construe as evidence about the causal path leading from an intervention to its effect.
Evidence from neuroscience seems at first sight to be a very good candidate for mechanistic evidence. According to Zednik (2014), “There is a widespread consensus in philosophy of science that neuroscientists provide mechanistic explanations. That is, they seek the discovery and description of the mechanisms responsible for the behavioral and neurological phenomena being explained. This consensus is supported by a growing philosophical literature on past and present examples from various branches of neuroscience” (2014, p. 1). Therefore, evidence from neuroscience should fit the demands of evidence-based practice in education well. The brain structures that are indirectly observed using neuroimaging techniques work as entities and activities that mediate the causal connection between intervention (e.g. the use of a particular instructional strategy) and effect (the learning outcome).
The authors cited in this section argue that the issue is not which methodology or research design is more suited for supporting a recommendation for practice. Different studies using different methodologies can be helpful in different situations. Rather, providing evidence for practice is a matter of what model is supported. If the model abstracts from contextual conditions and depicts a causal relation that obtains in idealized circumstances, then it is a theoretical model. The more it accounts for what the causal relationship looks like in more concrete circumstances, the more it is suited as a model for practice. For instance, computational models of learning are abstract and theoretical, whereas the models used in ethnography are more practice-oriented. Moreover, if the model describes the differential effect comparing two scenarios (introducing an intervention vs. not doing it) then it is a differential model. The more it accounts for how the effect is brought about by describing a causal path, the more it is a mechanistic model. Models supported by randomized controlled trials are typically differential, whereas models developed by means of process tracing, structural models and agent-based models are typically mechanistic. Some mechanistic models can be theoretical if they are highly idealized. For instance, agent-based models of school segregation are mechanistic but idealized (Stoica & Flache, 2015). In the same way, some differential models can describe practice by testing interventions in the field, but fail to provide a mechanism.
Although the authors I have discussed above focus on policy rather than practice, their arguments can be easily transferred to the issue of practice recommendations, as there are some relevant similarities between the two issues. First, both classroom recommendations and policies are formal instrumental recommendations. These recommendations put forward a decision problem focused on a desired effect and recommend a course of action in order to obtain that effect. Secondly, both classroom recommendations and policies are context-sensitive. For these reasons, it is plausible to expect that classroom recommendations, in the same way as policies, require mechanistic evidence.
Considering all these claims, the assessment of the two forms of relationships between neuroscience and educational practice will focus on whether the issue of the difference between evidence for theory and evidence for practice is accounted for and if the results from neuroscience can be considered as mechanistic evidence.
4. Reconstructing the inference from neuroimaging intervention studies to classroom recommendations
In this section, I reconstruct the inferences involved in the process of drawing conclusions about specific educational interventions from neuroimaging results. This kind of inference involves using neuroimaging as a means to measure the effect of an intervention. Thus, the neuroimaging result is supposed to support a claim concerning the difference that a particular intervention makes for an effect that is measured at the brain level.
The inferential scheme used in the 16 selected examples is summarized below in Table 2. In order to clarify this inference, let us consider a paradigmatic example among the selected studies, i.e. the study of a family-based intervention discussed by Neville et al. (2013). This intervention is discussed in two journal articles (Giuliano et al., 2018; Neville et al., 2013). These studies aim at evaluating the effect of a specific family-based pedagogical intervention for children from lower socio-economic status (SES) families on selective attention (which is assumed to be crucial for learning). The intervention consisted of a child component (a set of group activities) and a parent component (including activities, instructions and weekly support). The selected individuals belonged to low-SES families and were randomly assigned to the intervention and to the two contrasts: a non-intervention contrast and a contrast consisting of an intervention similar in content and intensity, but without a parent component. Pre- and post-measurements were administered to all participants. These measurements included neuroimaging and behavioral tests. The behavioral tests aimed at measuring “non-verbal IQ, receptive language, and preliteracy skills by testers blind to children’s experimental group, and also using parent and teacher reports of children’s social skills and problem behaviors” (Neville et al., 2013, p. 12139). Finally, parents were asked to assess their stress and ability to parent.
Table 2. Schematic representation of the inference involved in supporting a classroom recommendation by means of a neuroimaging intervention study.
The analysis of the neuroimaging data reveals a significant difference in the pattern of activation in the area related to selective attention in the test group (and no significant difference in the controls). The results also show greater improvement in all behavioral tests and in the stress self-assessment in the test group than in the controls. The researchers conclude that interventions that include parents and the home environment might have a higher chance to ‘narrow the large and growing gap in school readiness and academic achievement between higher and lower SES children’ (Neville et al., 2013, p. 12142). The inferential structure of this example is summarized in Table 2.
In Table 2, the symbol >* means ‘is significantly larger than’. This schematic summary should give us a picture of the inferences that are sufficient for a basic assessment. We can therefore move on to the assessment of this scheme.
5. Assessment of the inferential scheme
My argument in this section works in the following way:
(1) I start by reconstructing the assumptions necessary for the inference from neuroimaging intervention studies to pedagogical interventions.
(2) I discuss whether, in the light of (1), the evidence that is provided can be considered as mechanistic evidence, that is, evidence that supports a mechanistic model of the targeted educational phenomenon.
(3) Whenever I identify a problem concerning (2) above, I discuss whether the problem is specific to neuroscience or a general problem of evidence-based practice in education.
The pattern described in Table 2 rests on some crucial assumptions. The first is a background neurocognitive theory. In the example, this is explicitly stated at the beginning of the Neville et al. (2013) study. According to this background theory, the area of the brain that is responsible for selective attention is characterized by high plasticity, that is, it can change as a result of life experience. As supported by further results, traumatic experiences can affect selective attention negatively. The study cites results connecting SES to selective attention. Selective attention is further assumed to be a basic building block for the human capacity of learning. The second assumption states that patterns of observed brain activation indicate the right kind of function (in the example, selective attention). This assumption is typically justified using methodological devices in the analysis of neuroimaging data (Wright, 2018). The behavioral measures are also assumed to be able to track the right cognitive function, which is normally an assumption justified by the background theory of the test battery used. Furthermore, it is assumed that the target system of the neuroimaging and behavioral measurements is identical or similar to the target system of the relevant classroom recommendation. Finally, the researchers make a theoretical assumption about which differences in intervention are responsible for the observed differences in the patterns of brain region activation and cognitive function. Put differently, researchers have a theoretical hypothesis about what difference between the intervention and the control tracks the actual relevant causal factor. In the example, the relevant difference is the inclusion of a parent component in the test intervention.
As I argue in the remainder of this section, the last two assumptions raise some crucial issues. The first problem concerns the assumption that the target of the measurements is identical or similar to the target of the recommendations. In all 16 examples, the relevant intervention is administered in school or similar-to-school conditions. However, the conditions in which the neuroimaging and behavioral measurements are performed varied. Six studies used neuroimaging technologies (magnetic resonance and magnetoencephalography) that required the participating individual to sit or lie inside a large scanner in a laboratory or in a hospital. Ten cases (including Neville et al.) employed neuroimaging technology such as EEG and fNIRS, which allowed the involved individuals to move more freely. Among these cases, two studies specified that the measurements were taken in a quiet separate room (Rominger et al., 2017; Taillan et al., 2015); two studies specified that the measurements were taken in a quiet separate room within the school premises (Horowitz-Kraus, 2015; Sanger & Dorjee, 2016); two studies specified that measurements were performed in lab conditions (Giuliano et al., 2018; Ludyga et al., 2018); three studies provided unclear descriptions of the measurement location (Antonenko et al., 2019; Takeuchi et al., 2019; Daly et al., 2019); and finally, one study (Pietto et al., 2018) stated that the neuroimaging measurements were taken under natural field conditions. To the extent that natural field conditions were not established during these measurements, we run into a problem that I call the discrepancy between learning-in-the-lab and learning-in-the-classroom. As discussed in section 3, measured effects in the lab or otherwise artificial conditions work best as evidence for theory, since laboratory settings are set up to isolate relevant factors and eliminate contextual disturbance.
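The assumptions just listed can be laid out as premises of a single argument. The sketch below is not a reproduction of Table 2: the labels A1 to A5, the symbols $\Delta_T$ and $\Delta_C$ (the pre-to-post change measured in the test and control groups), and the predicate $\mathrm{Rec}$ are introduced here purely for illustration, and reducing the study's findings to a single comparison is a simplification. The symbol $>^{*}$ is read as 'is significantly larger than'.

```latex
% Illustrative layout only; labels and symbols are assumptions, not the paper's Table 2.
% Requires the amsmath package for align* and \text.
\begin{align*}
&\text{(A1)} && \text{A background neurocognitive theory links a function } F \text{ to learning.}\\
&\text{(A2)} && \text{The observed patterns of brain activation indicate } F.\\
&\text{(A3)} && \text{The behavioral measures track } F.\\
&\text{(A4)} && \text{The target system of the measurements is identical or similar}\\
&            && \text{to the target system of the recommendation.}\\
&\text{(A5)} && \text{The difference between intervention } I \text{ and control } C
                \text{ tracks the relevant causal factor.}\\
&\text{(E)}  && \Delta_T >^{*} \Delta_C\\
&\therefore  && \mathrm{Rec}(I \mid S)
\end{align*}
```

In the Neville et al. (2013) example, $F$ is selective attention and the causal factor in (A5) is the parent component of the test intervention.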
Therefore, the target of these measurements is an abstract theoretical event, a component of a theoretical model, which we call learning-in-the-lab. In contrast, the target of a classroom recommendation is a different event, in which contextual factors play a role, which we call learning-in-the-classroom. These two events are plausibly related, but their relation requires the specification of a translation theory, describing the similarity between the learning-in-the-lab and the learning-in-the-classroom events. In simple terms, even if the intervention happened under natural conditions, the measured effect of the intervention might not have been exactly the same as that which is relevant for teachers’ decisions in the classroom. As a result, if we want to use a result such as the study by Neville et al. to ground a practice recommendation, we might need further evidence that the studied intervention not only caused a difference in selective attention when the individuals were solving tasks in a lab environment, but also that the measurement in the lab tracks the features of the targeted property that are salient in a classroom environment.
For instance, according to Caparos and Linnell (2010), selective attention is a dual mechanism in which distractors are managed by perceptual and cognitive control. The former is affected by perceptual load, whereas the latter is affected by cognitive load. The measurements in Neville et al. seem to track perceptual control, by introducing visual and auditory distractors. However, the salient distracting events in a classroom might be both perceptual and cognitive, since factors like anxiety, stress or the complexity of the content of instruction might play a role. Perhaps stress and conceptual complexity are more salient features of selective attention than perceptual distractions. A theory of learning-in-the-classroom might tell. Or perhaps something other than perceptual and cognitive load is the relevant constitutive factor of selective attention in a classroom environment. Well-grounded recommendations require a specification of the salient features of the targeted property under natural conditions. To achieve this, we need a theoretical understanding of what it is to learn successfully in a school environment, that is, a theory of learning-in-the-classroom. Neuroimaging studies should therefore account for how their measures track the features of the targeted property that are salient in a classroom environment, according to a theory of learning-in-the-classroom.
A further problem is what I call the missing intervention theory. Simply put, this problem is related to the common level of complexity that characterizes interventions in the educational field and in other social contexts. Ideally, we operationalize interventions as consisting of discrete events. However, in many of the selected studies (Aar et al., 2019; Rominger et al., 2017; Rosenberg-Lee et al., 2018; Horowitz-Kraus, 2015; Neville et al., 2013), interventions consisted of many steps. These steps can include providing the involved individuals with specific information or even training, requiring the individuals to participate in activities such as workshops. As a result, the way in which the participants construct the intervention by using it can sometimes differ from the way in which the researchers conceive the intervention. Therefore, whereas it is common to think about intervention and outcome as two discrete factors and about mechanism as a network of entities and activities found ‘in between’ these two factors, in many cases it would be more correct to think about the intervention itself as an independent mechanism. In the case of the family-based intervention, the difference between the two main compared interventions is the parent component in the test intervention, which is not present in the control. As I discussed above in this section, it is assumed that the parent component is ‘the difference that makes a difference’. However, thus constructed, the difference between the interventions is not sufficient to extrapolate the claim that ‘the parent component is the difference-maker about the selective attention outcome’ to a different context. In fact, from the perspective of the participants of the intervention, the difference might be about more than the parent component of the intervention. 
Maybe the participants in the test group experienced that the teachers, schools, parents, and involved social workers made a difference that, in fact, constituted several different interventions. This would suggest that the part of the intervention that is identical between treatment and control (the child component) is not really identical in the two groups. At the same time, the parent component involved training and materials. The parents could have constructed these events in different ways. This would suggest that rather than one difference between treatment and control, we would have several different treatments, depending on how the individual constructed the interventions in which they were involved. In other words, in order to extrapolate the results, we need an intervention theory that explains what difference exists between intervention and control when the intervention is implemented. Such a theory should specify how – through which entities and activities – the intervention is constructed when put in use.5
Both problems are ultimately special cases of the problem of extrapolation I introduced in section 3. In the case of the first problem, extrapolation requires the assumption that the measurements are tracking salient classroom features of the targeted property. This is a problem of mechanistic relevance since the measurements might fail to track the relevant mechanism behind the targeted property. In the case of the second problem, we need a model that specifies the contribution of different contextual factors related to how the intervention is constructed in its use for the targeted effect, and evidence for this model. Since the selected intervention studies do not provide such evidence, there is a problem of mechanistic deficit affecting the inference to specific interventions. Both problems rest on unjustified assumptions, i.e. a) that the learning-in-the-lab outcome tracks the salient features of the learning-in-the-classroom property, and b) that the difference between the intervention and the control tracks a real difference when the intervention is put in use. The lack of justification for these two assumptions entails that the evidence that is gathered in the neuroimaging intervention studies does not support a mechanistic model of the targeted effect.6 Therefore, these problems affect the capacity of neuroimaging intervention studies to support a recommendation for practice. If we want to derive a recommendation from these studies alone, we must make very strong assumptions that are not justified. If we want to justify these assumptions, we need further mechanistic evidence.
It is important to stress that the issues above are not problems of incorrect application of methods. The fact that the selected studies did not perform their measurements in the field and failed to account for how contextual factors modulate or mediate the causal relation between intervention and effect is not a consequence of a failure in applying method or design. Rather, the aim of these studies is not to support a recommendation for practice, but to generate scientific understanding and, for that aim, the designs and methods used are appropriate. Therefore, this discussion is not about what the selected studies did wrong (they did not do anything wrong) but rather about what kind of evidence we need to inform teachers’ practice.
Finally, are these specific problems of neuroimaging studies or special cases of general problems of evidence-based practice? If we consider these problems only from a methodological perspective, then these are examples of extrapolation problems of intervention studies. The use of neuroimaging is not a necessary condition for the problems to occur. However, two considerations are in order.
First, recall that I argued that the kind of evidence that we present in favor of a classroom recommendation must support a model of the intervention that accounts for the entities and activities that modulate and mediate the main effect, and that evidence from neuroscience seems at first sight to be a very good candidate for mechanistic evidence. The problems above indicate a problem of mechanistic deficit and a problem of mechanistic relevance in the neuroimaging evidence with respect to a possible classroom recommendation. The specific limitation of using neuroscience as an evidential base for educational practice consists in the fact that, even if the mechanistic character of neurocognitive theories would suggest otherwise, such theories seem to be unable to provide the right kind of mechanism to ground a classroom recommendation. This is a central problem for the prospect of neuro-based practice. The added value of including neuroimaging in studies of educational interventions seems to be that of providing a mechanistic model of the intervention, but the mechanism that we get is not sufficient to inform practice.
Secondly, once we put this discussion in the context of the idea of neuro-based practice, we can see that these problems have a special relevance for neuroscience. Recall the claim contained in the Royal Society report, according to which “Neuroscience evidence should inform the assessment of different education policy options and their impacts where available and relevant” (Royal Society, 2011b, p. 19). Researchers have advanced similar claims. For instance, according to Carew and Magsamen,
We could continue to imagine a million things that are all possible when fueled by evidence-based rigorous neuroscience research that can be translated to practical application and tested for their efficacy through the creation of research schools, informal learning testing, and other measures. These game-changers for education and learning are within our reach. (2010, p. 686)
In both cases, a special evidential status seems to be attributed to neuroscience in basing educational practice. De Smedt et al. (2010, p. 100) describe neuroimaging as “a unique source of evidence”. It seems that, among all sources of evidence, neuroscience is taken to be the best candidate on which to base educational practice. Possibly, there is a tacit assumption of ontological fundamentality of brain processes for educational phenomena. On this assumption, the processes described by neuroscience are more fundamental to educational phenomena than the entities and activities described by any other relevant science (psychology, education, sociology, ethnography, philosophy and so on), and this fundamentality makes the results of neuroscience evidentially stronger and more relevant than any other result.7 From this perspective, the two problems I discussed in this section have a special relevance for neuroscience, viz. they show that no special evidential status can be attributed to neuroscience in the grounding of educational practice. This is an important claim that should be clear to teachers, principals, local administrators and policy makers. Neuroimaging studies of pedagogical interventions are crucial to our understanding of the effectiveness of instructional methods, but the use of neuroimaging does not enable us to derive classroom recommendations directly from these studies.
Before concluding this paper, I will address a final issue. The mechanistic deficits could be resolved using evidence that does not come from neuroscience. This issue will be the subject of the next section: how can we provide justification for the assumptions discussed in this section?
6. Integration to the rescue
Whereas section 5 focused on the critical assessment of how classroom recommendations are based on neuroscientific results, this section offers a positive argument about how classroom recommendations could be based on results from neuroscience, together with other results. As I argue below, some of the issues discussed in section 5 could be resolved or at least managed by integrating methodological approaches that are typical in educational research into neuroimaging studies of interventions.
Let us start with the justification of the assumption of similarity between learning-in-the-lab and learning-in-the-classroom. The justification of this assumption requires a translation theory from learning-in-the-lab to learning-in-the-classroom, a concern expressed by many other scholars (De Smedt et al., 2010; Turner, 2011; De Smedt et al., 2011; Lee & Ng, 2011; Nes, 2011; Howard-Jones, 2013; Stafford‐Brizard et al., 2017), who stress that such translations can be achieved by means of interdisciplinary efforts. For instance, Howard-Jones (2013) uses the term “bridging studies” to describe such translation theories. A translation would in this case consist in a specific study that provides evidence for the claim of similarity between learning-in-the-lab and learning-in-the-classroom.
Such a translation theory requires two building blocks. First, we need a practice-based theory of learning-in-the-classroom that provides a model of learning that is context-specific enough to be used by teachers. This is not a mere specification of the outcome variable, but a full-blooded mechanistic model of the target phenomenon, which should account for the salient features of the targeted property and the classroom factors that affect these features (e.g. the relation between anxiety and cognitive load and its role in selective attention). Practice-based theories are abundant in educational research. For instance, case-study methodology and ethnographic studies of educational practices are able to support the kind of mechanistic theories we are looking for. Secondly, we need a translation technique that specifies whether the neuroimaging measurement is relevant enough to support a classroom recommendation. This translation can be made concrete in at least two ways. One way is to design a study of educational practices that analyzes, in a classroom environment, the features of the property targeted by an existing neuroimaging intervention study. This could help practitioners translate the learning-in-the-lab effect into a learning-in-the-classroom effect, by estimating the possible differences between salient features. Another strategy is to incorporate a practice-based observation of the targeted property, performed as the initial phase of a neuroimaging intervention study, in order to specify in advance the salient mechanism that needs to be tracked using neuroimaging.
The second assumption is that the difference between intervention and control tracks a real difference in the application of the intervention. As I discussed in section 5, this assumption could be justified by evidence about how the individuals involved construct the intervention by putting it into use. Such evidence could be obtained by, for instance, performing an ethnographic study of the targeted intervention between the pre- and post-intervention neuroimaging measures. Including a practice-based study of the intervention as a part of neuroimaging studies of educational interventions could entail the specification of the target variable (instead of a sharp difference between intervention and control, the ethnographic study might reveal that the individuals involved construct together several different interventions), as well as the identification of relations connecting the different interventions to one another, specifying which causal processes lead to the differentiation of several interventions in use. Such an integrative approach could account for how the different interventions in use modulate the effect on the target variable, providing the kind of mechanistic evidence necessary to support recommendations for practice.
The integrative approach I suggest here is not widespread in educational neuroscience (for an example, see Frankenberg et al., 2019) and is not free from problems. Several scholars have warned about the risk that interdisciplinary integration might lead to dilution (Priaulx & Weinel, 2014). For instance, Turner (2011) argues that educational neuroscience typically entails that the role of educational theory is secondary and limited to the specification of variables of interest. Even if the interdisciplinary integration I suggest here attributes a central role to educational research, researchers should consider the rewards and pitfalls of such collaborations carefully.
This paper has been motivated by two aims: a) reconstructing the evidential relationship between neuroscience and educational practice, and b) assessing the inference involved in this evidential relationship. As for the first aim, I have reconstructed the inference involved in supporting classroom recommendations using neuroimaging intervention studies.
As for the second aim, I have argued that some important methodological issues affect this inferential scheme. I identified two problematic assumptions: a) the identity (or similarity) between learning-in-the-lab and learning-in-the-classroom, and b) the claim that the difference between intervention and control successfully tracks a real difference in the application of the intervention.
These assumptions entail a mechanistic deficit and a problem of mechanistic relevance. In order to justify these assumptions, an account of the similarity between learning-in-the-lab and learning-in-the-classroom and of the mechanism of the intervention-in-use is necessary, together with evidence supporting these mechanistic models. I argued that, without such mechanistic evidence, no classroom recommendation derived from neuroscientific results can be considered well grounded.
Finally, I have argued for the potential rewards of integrating neuroimaging studies with practice-based studies of specific educational contexts, such as ethnographic studies of pedagogical practices. I therefore hope that this paper will inspire researchers in educational neuroscience to start a discussion of the possibilities and challenges of integrative methodologies. These integrative strategies have the potential to make neuroimaging results more useful and informative for teachers, by producing well-grounded, context-sensitive recommendations for practice.
Many thanks to my colleagues at the SITE research seminar and to all people attending the network for developmental and neuroscience in education (NUNU) at Stockholm University, for providing useful insights about earlier drafts of this paper. Special thanks to Eric Pakulak for helping me with some technical issues concerning neuroimaging, and to Hillevi Lenz-Taguchi and Bettina Vogt for providing many insightful comments. Thanks to two anonymous reviewers for identifying several important issues in an earlier draft of this paper.
1 For this reason, I will make the strong assumption that there is a general value in basing teachers’ decision-making on empirical evidence. This is of course far from uncontroversial. It could be argued that teachers should not base their decision-making on empirical evidence, but rather on their professional competence. On this view, the role of empirical evidence is not to provide suggestions on what decisions to make but rather to generate theoretical clarification that enriches teachers’ professional competence. For reasons of space, I cannot discuss this problem in this paper, but the reader should be aware that my conclusions rest on this assumption.
2 https://ies.ed.gov/ncee/wwc/.
3 https://eppi.ioe.ac.uk/cms/.
4 By research articles, I mean here articles that describe single empirical studies. This excludes review studies, opinions and theoretical studies.
5 Many researchers working with intervention studies use fidelity measures as a way of estimating the difference between how the intervention is constructed in theory and how it is applied in practice during a trial (Wickersham et al., 2011). This is a way of describing whether the intervention was correctly applied. However, in the case of incorrect application, fidelity measures are unable to specify which unaccounted-for factors made a contribution (either positive or negative) to the outcome.
6 Many subfields of psychology and medical science have started to use more complex statistical models as a means to improve the external validity of experimental and intervention studies. Examples of such tools are path models. These models are complex statistical tools that output a structural model that includes contextual factors and specifies the contribution of these factors to the targeted output. These models are better at supporting practice recommendations. However, three concerns must be taken into consideration. 1) Path models are scarce in neuroimaging studies of educational interventions (none of the 16 studies considered in this paper employed such models). 2) Path models cannot solve the discrepancy between learning-in-the-lab and learning-in-the-classroom without help from background theory or further mechanistic evidence. 3) Path models cannot solve the problem of the missing intervention theory without a background understanding or further mechanistic evidence of how participants construct the intervention in use.
7 If a source of evidence A is stronger than another source B, then, whenever A and B contradict one another, A is more credible than B. If a source of evidence A is more relevant than B, then the claims supported by A provide better understanding about the target than B.