Open issues in fMRI / Discussions

From FMRI Methods

fMRI has often been criticized for being an indirect and somewhat ambiguous measure of neural processing. Furthermore, it has been difficult to develop widely agreed-upon standards related to conducting, analyzing, reporting and interpreting results from fMRI studies.

Although progress is constantly being made on these problems, it can still be difficult to deal with the everyday "challenges" that arise when practicing fMRI research. The purpose of this page is to raise some of these confusing or controversial issues, in the hope that this will motivate discussions and long-term developments in the respective areas.

General fMRI

Many critiques have been raised in recent years, arguing that fMRI is an unreliable method and should not be used for exploring mental processing. Are any of these justified?

Such critiques have indeed been raised, mainly motivated by the fact that fMRI represents a hemodynamic method driven by metabolic processing within the brain coupled with the fact that the relationship between neural and vascular processing is still not fully specified. Therefore, it has been suggested that fMRI, being an indirect and ambiguous measure of neural processes, is not useful for investigating either neural or the emergent mental processes.

Quite a few of the individual arguments raised on this topic have significant merit: it is true that the nature of neurovascular coupling is still being investigated and that the measured BOLD effect is not always robust and can be subject to some rather systematic biases. Furthermore, it is also true that some features of the BOLD signal as well as its relationship to neural processing are still not fully understood (e.g., the functional significance of the initial dip; the signature of inhibitory processing within the BOLD signal, etc.). Therefore, these critiques should be taken seriously and need to be considered when evaluating and interpreting the results from single fMRI experiments.

However, this does not in any way imply that fMRI is not a useful measure of neural and mental processes. Although their mutual relationships are still not fully specified, it has by now clearly and repeatedly been shown that there is a systematic mapping between mental and neural processes as well as between neural and metabolic processes in the brain. Even though these mappings may be quite complex and context-dependent, their systematic nature is sufficient to justify studying neural processes to better understand mental ones, as well as using hemodynamic methods to study neural processing. Furthermore, even if the nature of the BOLD signal were never fully specified and a certain degree of ambiguity always remained when relating it to the underlying neural processing, this would still not discredit its potential benefits. Since every other dependent variable used in cognitive neuroscience (e.g., reaction time in behavioral studies, event-related potentials / fields in EEG / MEG, local field potentials in invasive electrophysiological recordings, etc.) suffers from the same malady, it would only be fair to apply the same metric when judging the validity and usefulness of the BOLD signal.

Finally, it needs to be emphasized that the indirect and still not fully specified relationship between BOLD and neural processes somewhat limits the interpretations and conclusions that can be drawn from single fMRI studies. One should therefore always validate and constrain these interpretations using results from other fMRI studies as well as findings obtained by studying the same phenomenon with different methods.


Significance criteria for different types of analysis

Bayesian analysis

In contrast to classical approaches, for which guidelines on analyzing and reporting fMRI results are abundant, standards are somewhat scarcer for Bayesian analysis. Specifically, it is not clear whether there is an accepted criterion beyond which identified activations can be considered significant and, even more importantly, whether such a standard is even needed.

In a way, levels of significance reflect a “frequentist” logic which does not necessarily apply to Bayesian analysis. The questions behind the two analyses are different: in the first case one evaluates the probability of observing an activation at least this strong if the null hypothesis were true (i.e., no difference between conditions), and in the second case one obtains the probability that the alternative hypothesis is true (i.e., a difference between conditions). Therefore, it has been argued that significance criteria and corrections for multiple comparisons make no sense in a Bayesian context. All that being true, this issue is still somewhat confusing.
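The difference between the two questions can be made concrete with a toy calculation for a single voxel. The following is only a minimal sketch under strong simplifying assumptions that are not part of the discussion above: the effect estimate is treated as Gaussian with a known standard error, and the Bayesian analysis uses a zero-mean Gaussian prior. The function names, the prior width and the effect-size threshold `gamma` are all hypothetical and do not correspond to any particular fMRI package.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def one_sided_p_value(estimate, std_error):
    """Frequentist question: P(estimate at least this large | true effect is zero)."""
    return 1.0 - STD_NORMAL.cdf(estimate / std_error)


def posterior_prob_exceeds(estimate, std_error, prior_sd, gamma=0.0):
    """Bayesian question: P(true effect > gamma | data).

    Assumes a zero-mean Gaussian prior with standard deviation prior_sd
    (a hypothetical choice made only for this illustration).
    """
    # Conjugate normal-normal update: the estimate is shrunk toward zero.
    shrink = prior_sd**2 / (prior_sd**2 + std_error**2)
    post_mean = shrink * estimate
    post_sd = (shrink * std_error**2) ** 0.5
    return 1.0 - NormalDist(post_mean, post_sd).cdf(gamma)


# Hypothetical contrast estimate for one voxel (arbitrary units).
estimate, std_error = 1.0, 0.4
print("p-value under H0:        ", one_sided_p_value(estimate, std_error))
print("P(effect > 0 | data):    ", posterior_prob_exceeds(estimate, std_error, prior_sd=1.0))
print("P(effect > 0.5 | data):  ", posterior_prob_exceeds(estimate, std_error, prior_sd=1.0, gamma=0.5))
```

In this toy model, a very wide prior makes the posterior probability of a positive effect approach one minus the one-sided p-value, while a narrower prior or a non-zero `gamma` makes the two numbers diverge. Any threshold on the posterior probability is therefore not automatically comparable to a threshold on p.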

In practice, applying frequentist and Bayesian approaches to the same data often yields partly different results. The obtained patterns usually do not diverge substantially, but the final results (and the conclusions drawn from them) strongly depend on the applied thresholds, which are not balanced across the two approaches. Furthermore, the fact that it is possible to apply Bayesian and frequentist analyses in the same context in a rather interchangeable manner (or even to treat this as a post-measurement decision) signals that the questions for which the two analyses have occasionally been used are not so fundamentally different after all. In this case, one can choose based on the benefits of one method over the other (e.g., Bayes may be less sensitive to outliers or even “logically” superior). So, what does it mean when a voxel which seems to have a probability of more than 99% of differentiating between two conditions cannot even be considered a false positive in the other analysis?


Multi-voxel pattern analysis

It is quite clear how the significance of pattern classification accuracy is tested in multivariate analysis of fMRI data. It is also clear that any level of classification accuracy which significantly exceeds chance level can be considered meaningful in some way. However, part of that meaning may be lost if different levels of accuracy (e.g., 56%, 67%, 79% or 92%) are all treated as equally (and not infrequently as “highly”) significant. Therefore, it may be important and useful to try to develop some provisional standards which would allow differentiating between these levels (similar to, e.g., guidelines on how different sizes of significant correlation coefficients should be interpreted). In addition, it might be useful to start comparing levels of classification accuracy between different conditions, or between different regions whose activation patterns can be used to differentiate conditions (using t-tests, ANOVAs or non-parametric tests).
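To illustrate why very different accuracies can all end up labeled “significant”, here is a minimal sketch of one possible test against chance, an exact one-sided binomial test. It assumes a two-class problem with a chance level of 50% and, importantly, independent test trials, which is often not the case for cross-validated fMRI data (permutation tests are a common alternative). The number of trials and the function name are hypothetical.

```python
from math import comb


def binomial_p_value(n_correct, n_trials, chance=0.5):
    """Exact one-sided p-value: P(at least n_correct hits | pure guessing)."""
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


n_trials = 400  # hypothetical number of independent test trials
for accuracy in (0.56, 0.67, 0.79, 0.92):
    n_correct = round(accuracy * n_trials)
    p = binomial_p_value(n_correct, n_trials)
    print(f"accuracy = {accuracy:.0%}   p = {p:.3g}")
```

Under these assumptions, all four accuracies fall below a conventional 0.05 threshold, although they differ greatly in how well the conditions can actually be told apart. The p-value alone therefore says little about the size of the effect, which is the motivation for reporting and comparing the accuracies themselves.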


Other

Some notes on how to proceed after identifying significantly activated voxels in an fMRI study can be found here.


Reporting activations in Tables

  • Is there a guideline for reporting activations which extend over several areas?

When reporting activations in tables, many papers report the maximum peak of activity and occasionally provide the coordinates of some, but not all, local maxima. However, the latter is optional and, even when local maxima are reported, it is still unclear how these coordinates were chosen (most probably so that all or most of the areas encompassed by the extended activation are mentioned). Furthermore, it is not really clear how these non-uniform practices are treated in different meta-analytic approaches.


  • Is it possible to at least suggest some standards of labeling activated areas within tables?

Most often, when labeling activated regions, structural information (e.g., sulci or gyri) takes precedence over functional information. However, there are not many standards related to this issue. For example, it is not clear whether one should report a functional area only if a functional localizer was conducted and, if not, only speculate about the involvement of a functional area in the Discussion.

Regardless of the answer to the previous question, it is not even clear how to define a functional area. This seems to include quite an unbalanced mix of, e.g., LOC, FFA, PPA, FEF, but also premotor cortex. In addition, there is a good chance that a functional region may also represent a structural entity (just as an example, it seems that IFJ may be such a case). This is even more complicated because in practice it would sometimes be much more convenient to use a functional term: e.g., a rather restricted activation in IFJ would, in stricter terms, have to be reported as IFS/iPrCS or IFG/MFG/PrCG (BA 6/9/44), which may look like half of the lateral frontal cortex but actually is not: in this case, saying less is actually more. Of course, this can be disambiguated by specifying the extent of the activation and attaching a figure, but even when all these are provided it may still be good to have maximally precise individual information.


  • Is it possible to agree on terminology used for naming areas?

Even when reporting the same activations, different papers still occasionally use different labels for these regions. For example, the mesial portion of the superior frontal gyrus is sometimes termed the medial frontal gyrus (abbreviated as MFG, an acronym usually used for the middle frontal gyrus). These practices can be misleading and confusing, especially if one does not have enough knowledge to judge which of the alternatives is actually better suited for the area they refer to (or, much worse, does not realize that two labels refer to the same area). Of course, one can always check and clear up the confusion, and there are great sources which offer this information. However, if these sources are not used as a default in the first place, then the problem is not so trivial after all.


  • Is it possible to “in principle” agree on the optimal level of reporting activated areas?

Finally, there is the issue of the optimal level of reporting (e.g., DLPFC vs. MFG; IPL vs. SMG). Is it possible to identify such a level (e.g., the more specific, the better), or does this depend on factors such as the research question, the size of the activated areas, etc.?


Other

There are many other issues which could be mentioned in this context, some of which have been recently discussed: signatures of inhibitory processing in the measured BOLD response, functional significance of the initial dip, standards for conducting ROI analysis (especially when used as an exclusive analysis; choosing the criteria for defining ROIs), (non)linearity of the BOLD response, importance of spontaneous BOLD fluctuations, separating early from late or bottom-up from top-down activity, etc.

It is not possible to discuss all of them here, but they should be taken into consideration when planning, conducting, analyzing, interpreting and reporting an fMRI study, as this may lead to significant improvements of such a study. It would also be worthwhile to do this in advance, just as is commonly done with every other method. To quote R.A. Fisher: “To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of”. The same can be said for fMRI.