Abstract
Magnetoencephalography (MEG) has a high temporal resolution well-suited for studying perceptual learning. However, to identify where learning happens in the brain, one needs to apply source localization techniques to project MEG sensor data into brain space. Previous source localization methods, such as the short-time Fourier transform (STFT) method by Gramfort et al. [6], produced intriguing results, but they were not designed to incorporate trial-by-trial learning effects. Here we modify the approach in [6] to produce an STFT-based source localization method (STFT-R) that includes an additional regression of the STFT components on covariates such as the behavioral learning curve. We also exploit a hierarchical \(L_{21}\) penalty to induce structured sparsity of the STFT components and to emphasize signals from regions of interest (ROIs) that are selected according to prior knowledge. In reconstructing ROI source signals from simulated data, STFT-R achieved smaller errors than a two-step method based on the popular minimum-norm estimate (MNE), and in a real-world human learning experiment, STFT-R yielded more interpretable results about which time-frequency components of the ROI signals were correlated with learning.
Notes
- 1. Note that in this case, the variance of the sensor noise in each trial was proportional to the source signals. This violated the i.i.d. sensor-noise assumption of both STFT-R and MNE-R. We compared how well the two methods tolerated such heteroskedasticity.
References
Bach, F., Jenatton, R., Mairal, J., Obozinski, G.: Optimization with sparsity-inducing penalties. CoRR abs/1108.0775 (2011). http://arXiv.org/abs/1108.0775
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Dale, A.M., Liu, A.K., Fischl, B.R., Buckner, R.L., Belliveau, J.W., Lewine, J.D., Halgren, E.: Dynamic statistical parametric mapping: combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron 26(1), 55–67 (2000)
Galka, A., Yamashita, O., Ozaki, T., Biscay, R., Valdes-Sosa, P.: A solution to the dynamical inverse problem of EEG generation using spatiotemporal Kalman filtering. NeuroImage 23, 435–453 (2004)
Gauthier, I., Tarr, M.J., Moylan, J., Skudlarski, P., Gore, J.C., Anderson, A.W.: The fusiform face area is part of a network that processes faces at the individual level. J. Cogn. Neurosci. 12(3), 495–504 (2000)
Gramfort, A., Strohmeier, D., Haueisen, J., Hamalainen, M., Kowalski, M.: Time-frequency mixed-norm estimates: sparse M/EEG imaging with non-stationary source activations. NeuroImage 70, 410–422 (2013)
Gramfort, A., Luessi, M., Larson, E., Engemann, D.A., Strohmeier, D., Brodbeck, C., Parkkonen, L., Hamalainen, M.S.: MNE software for processing MEG and EEG data. NeuroImage 86, 446–460 (2014)
Hamalainen, M., Ilmoniemi, R.: Interpreting magnetic fields of the brain: minimum norm estimates. Med. Biol. Eng. Comput. 32, 35–42 (1994)
Hamalainen, M., Hari, R., Ilmoniemi, R.J., Knuutila, J., Lounasmaa, O.V.: Magnetoencephalography - theory, instrumentation, and applications to noninvasive studies of the working human brain. Rev. Mod. Phys. 65, 413–497 (1993)
Henson, R.N., Wakeman, D.G., Litvak, V., Friston, K.J.: A parametric empirical bayesian framework for the EEG/MEG inverse problem: generative models for multi-subject and multi-modal integration. Front. Hum. Neurosci. 5, 76 (2011)
Jenatton, R., Mairal, J., Obozinski, G., Bach, F.: Proximal methods for hierarchical sparse coding. J. Mach. Learn. Res. 12, 2297–2334 (2011)
Kanwisher, N., McDermott, J., Chun, M.M.: The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci. 17(11), 4302–4311 (1997)
Lamus, C., Hamalainen, M.S., Temereanca, S., Brown, E.N., Purdon, P.L.: A spatiotemporal dynamic distributed solution to the MEG inverse problem. NeuroImage 63, 894–909 (2012)
Mattout, J., Phillips, C., Penny, W.D., Rugg, M.D., Friston, K.J.: MEG source localization under multiple constraints: an extended Bayesian framework. NeuroImage 30(3), 753–767 (2006)
Pascual-Marqui, R.: Standardized low resolution brain electromagnetic tomography (sLORETA): technical details. Methods Find. Exp. Clin. Pharmacol. 24, 5–12 (2002)
Pitcher, D., Walsh, V., Duchaine, B.: The role of the occipital face area in the cortical face perception network. Exp. Brain Res. 209(4), 481–493 (2011)
Stine, R.A.: Bootstrap prediction intervals for regression. J. Am. Stat. Assoc. 80, 1026–1031 (1985)
Tanaka, J.W., Curran, T., Porterfield, A.L., Collins, D.: Activation of preexisting and acquired face representations: the N250 event-related potential as an index of face familiarity. J. Cogn. Neurosci. 18(9), 1488–1497 (2006)
Xu, Y.: Cortical spatiotemporal plasticity in visual category learning. Doctoral dissertation (2013)
Acknowledgements
This work was funded by the Multi-Modal Neuroimaging Training Program (MNTP) fellowship from the NIH (5R90DA023420-08, 5R90DA023420-09) and the Richard King Mellon Foundation. We also thank Yang Xu and the MNE-python user group for their help.
Appendix
1.1 Appendix 1
Short-Time Fourier Transform (STFT). Our approach builds on the STFT implemented by Gramfort et al. in [6]. Given a time series \( \varvec{U} = \{U(t), t = 1,\cdots ,T\}\), a time step \(\tau _0\) and a window size \(T_0\), we define the STFT as
\[ V(\tau , \omega _h) = \sum _{t=1}^{T} U(t)\, K(t-\tau )\, e^{-i \omega _h t} \]
for \(\omega _h = 2\pi h/T_0, h = 0,1,\cdots , T_0/2\) and \(\tau = \tau _0, 2\tau _0, \cdots , n_0 \tau _0 \), where \(K(t-\tau )\) is a window function centered at \(\tau \), and \(n_0 = T/\tau _0\). We concatenate the STFT components at all time points and frequencies into a single vector \( \varvec{V} \in \mathbb {C}^{s}\), where \(s = (T_0/2+1) \times n_0\). Following the notation in [6], we also call the \(K(t-\tau ) e^{-i \omega _h t}\) terms STFT dictionary functions, and use a matrix's Hermitian transpose \(\varvec{\varPhi ^H}\) to denote them, i.e. \( (\varvec{U}^T)_{1\times T} = ({\varvec{V}}^T)_{1\times s} (\varvec{\varPhi ^H})_{s \times T}\).
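To make the construction concrete, the following is a minimal numpy sketch of such a dictionary. The Hann window, the code layout, and the least-squares computation of the coefficients are our own assumptions for illustration; the definition above only requires some window \(K\) centered at \(\tau \).

```python
import numpy as np

def stft_dictionary(T, T0, tau0):
    """Rows are STFT dictionary functions K(t - tau) * exp(-1j * omega_h * t),
    forming an (s x T) matrix playing the role of Phi^H. A Hann window is
    assumed for K; any window centered at tau would do."""
    n0 = T // tau0
    omegas = 2 * np.pi * np.arange(T0 // 2 + 1) / T0     # omega_h, h = 0..T0/2
    taus = tau0 * np.arange(1, n0 + 1)                   # tau = tau0, ..., n0 * tau0
    t = np.arange(1, T + 1, dtype=float)
    atoms = []
    for tau in taus:
        # Hann window centered at tau, zero outside (tau - T0/2, tau + T0/2)
        K = np.where(np.abs(t - tau) < T0 / 2,
                     0.5 * (1.0 + np.cos(2 * np.pi * (t - tau) / T0)),
                     0.0)
        for w in omegas:
            atoms.append(K * np.exp(-1j * w * t))
    return np.asarray(atoms)                             # s = (T0/2 + 1) * n0 rows

T, T0, tau0 = 160, 16, 4
PhiH = stft_dictionary(T, T0, tau0)
U = np.random.randn(T)
# One way to obtain coefficients V satisfying U^T = V^T Phi^H: a least-squares
# fit (the dictionary is overcomplete, so this picks the minimum-norm V).
V = np.linalg.lstsq(PhiH.T, U.astype(complex), rcond=None)[0]
err = np.linalg.norm((V[None, :] @ PhiH).real.ravel() - U)
print(f"reconstruction error: {err:.2e}")  # near zero when the atoms span R^T
```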
1.2 Appendix 2
The Karush-Kuhn-Tucker Conditions. Here we derive the Karush-Kuhn-Tucker (KKT) conditions for the hierarchical \(L_{21}\) problem. Since the term \( f(\varvec{z}) = \frac{1}{2} \sum _{r=1}^q ||\varvec{M} ^{(r)} - \varvec{G}( \sum _{k=1}^p X_k^{(r)} \varvec{Z}_k ) \varvec{\varPhi ^H}||_F^2\) is essentially a sum of squared errors of a linear problem, we can re-write it as \( f(\varvec{z}) = \frac{1}{2} || \varvec{b} - \varvec{A} \varvec{z} ||^2\), where \(\varvec{z}\) again is the vector obtained by concatenating the entries of \(\varvec{Z}\), \(\varvec{b}\) is the vector obtained by concatenating \(\varvec{M}^{(1)}, \cdots , \varvec{M}^{(q)}\), and \( \varvec{A}\) is the linear operator such that \(\varvec{A} \varvec{z}\) is the concatenation of \(\varvec{G}( \sum _{k=1}^p X_k^{(r)} \varvec{Z}_k ) \varvec{\varPhi ^H}, r = 1,\cdots , q\). Note that although \(\varvec{z}\) is a complex vector, we can further reduce the problem to a real-valued one by rearranging the real and imaginary parts of \(\varvec{z}\) and \( \varvec{A}\). Here, for simplicity, we only derive the KKT conditions for the real case. Again we use \(\{g_1, \cdots , g_h, \cdots , g_N \}\) to denote our ordered hierarchical group set, and \(\lambda _h\) to denote the corresponding penalty for group \(g_h\). We also define diagonal matrices \(\varvec{D}_h\) such that
\[ (\varvec{D}_h)_{jj} = \begin{cases} 1 & \text{if } j \in g_h, \\ 0 & \text{otherwise,} \end{cases} \]
so that the non-zero elements of \(\varvec{D}_h \varvec{z}\) are exactly \(\varvec{z} |_{g_h}\). With this simplified notation, we re-cast the original problem into a standard formulation:
\[ \min _{\varvec{z}} \; \frac{1}{2} \Vert \varvec{b} - \varvec{A} \varvec{z}\Vert _2^2 + \sum _{h=1}^{N} \lambda _h \Vert \varvec{D}_h \varvec{z}\Vert _2. \qquad (7) \]
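Problem (7) can be tackled with proximal gradient methods such as FISTA [2], which require the proximal operator of the hierarchical penalty \(\sum _h \lambda _h \Vert \varvec{D}_h \varvec{z}\Vert _2\). Below is a minimal numpy sketch of that operator, using the composition result of Jenatton et al. [11] for tree-structured groups; the assumed group ordering (each group visited before any group containing it) and the array layout are our own illustration.

```python
import numpy as np

def prox_hierarchical_l21(z, groups, lambdas):
    """Proximal operator of sum_h lambda_h * ||z|_{g_h}||_2 for a
    tree-structured group set: by Jenatton et al. [11], composing the
    group-wise shrinkage operators, visiting each group before any group
    that contains it, yields the exact prox. `groups` is assumed ordered
    this way; each entry is an integer index array."""
    z = z.copy()
    for g, lam in zip(groups, lambdas):
        norm = np.linalg.norm(z[g])
        # group soft-thresholding: zero the group or shrink its L2 norm by lam
        z[g] = 0.0 if norm <= lam else z[g] * (1.0 - lam / norm)
    return z

# toy example: a leaf group nested inside a root group
z = np.array([3.0, -1.0, 0.5, 2.0])
groups = [np.array([2, 3]), np.array([0, 1, 2, 3])]   # leaf first, then root
print(prox_hierarchical_l21(z, groups, lambdas=[0.5, 1.0]))
```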
To better describe the KKT conditions, we introduce the auxiliary variables \(\varvec{u} = \varvec{A} \varvec{z}\) and \(\varvec{v}_h = \varvec{D}_h \varvec{z}\). Then (7) is equivalent to
\[ \min _{\varvec{z}, \varvec{u}, \varvec{v}_h} \; \frac{1}{2} \Vert \varvec{b} - \varvec{u}\Vert _2^2 + \sum _{h=1}^{N} \lambda _h \Vert \varvec{v}_h\Vert _2 \quad \text{s.t. } \varvec{u} = \varvec{A}\varvec{z}, \; \varvec{v}_h = \varvec{D}_h \varvec{z}, \; h = 1, \cdots , N. \]
The corresponding Lagrange function is
\[ L(\varvec{z}, \varvec{u}, \{\varvec{v}_h\}, \varvec{\mu }, \{\varvec{\xi }_h\}) = \frac{1}{2} \Vert \varvec{b} - \varvec{u}\Vert _2^2 + \sum _{h=1}^{N} \lambda _h \Vert \varvec{v}_h\Vert _2 + \varvec{\mu }^T (\varvec{A}\varvec{z} - \varvec{u}) + \sum _{h=1}^{N} \varvec{\xi }_h^T (\varvec{D}_h \varvec{z} - \varvec{v}_h), \]
where \(\varvec{\mu }\) and the \(\varvec{\xi }_h\)'s are Lagrange multipliers. At the optimum, the following KKT conditions hold:
\[ \varvec{u} - \varvec{b} - \varvec{\mu } = \varvec{0}, \qquad (8) \]
\[ \varvec{A}^T \varvec{\mu } + \sum _h \varvec{D}_h \varvec{\xi }_h = \varvec{0}, \qquad (9) \]
\[ \varvec{\xi }_h \in \lambda _h \, \partial \Vert \varvec{v}_h\Vert _2, \quad h = 1, \cdots , N, \qquad (10) \]
where \(\partial { \Vert \cdot \Vert _2}\) is the subgradient of the \(L_2\) norm. From (8) we have \( \varvec{\mu } = \varvec{u} - \varvec{b}\), so (9) becomes \( \varvec{A}^T (\varvec{u} - \varvec{b}) + \sum _h \varvec{D}_h \varvec{\xi }_h = 0 \). Plugging in \(\varvec{u}= \varvec{Az}\), we see that the first term \( \varvec{A}^T ( \varvec{u} - \varvec{b}) = \varvec{A}^T (\varvec{A} \varvec{z} - \varvec{b})\) is the gradient of \( f(\varvec{z}) = \frac{1}{2} \Vert \varvec{b} - \varvec{A} \varvec{z}\Vert _2^2\). For a solution \(\varvec{z}_0\), once we plug in \( \varvec{v}_h = \varvec{D}_h \varvec{z}_0 \), the KKT conditions become
\[ \nabla f(\varvec{z})|_{\varvec{z} = \varvec{z}_0} + \sum _h \varvec{D}_h \varvec{\xi }_h = \varvec{0}, \qquad (11) \]
\[ \varvec{\xi }_h \in \lambda _h \, \partial \Vert \varvec{D}_h \varvec{z}_0\Vert _2, \quad h = 1, \cdots , N. \qquad (12) \]
In (12), according to the definition of the subgradient of the \(L_2\) norm, we have
\[ \varvec{\xi }_h = \lambda _h \frac{\varvec{D}_h \varvec{z}_0}{\Vert \varvec{D}_h \varvec{z}_0\Vert _2} \quad \text{if } \varvec{D}_h \varvec{z}_0 \ne \varvec{0}, \qquad \Vert \varvec{\xi }_h\Vert _2 \le \lambda _h \quad \text{if } \varvec{D}_h \varvec{z}_0 = \varvec{0}. \]
Therefore we can determine whether (11) and (12) hold by solving the following problem:
\[ \min _{\{\varvec{\xi }_h\}} \; \frac{1}{2} \Big \Vert \nabla f(\varvec{z})|_{\varvec{z} = \varvec{z}_0} + \sum _h \varvec{D}_h \varvec{\xi }_h \Big \Vert _2^2 \quad \text{s.t. } \Vert \varvec{\xi }_h\Vert _2 \le \lambda _h \text{ for } \varvec{D}_h \varvec{z}_0 = \varvec{0}, \; \varvec{\xi }_h = \lambda _h \frac{\varvec{D}_h \varvec{z}_0}{\Vert \varvec{D}_h \varvec{z}_0\Vert _2} \text{ otherwise}, \]
which is a standard group lasso problem with no overlap, and can be solved by coordinate descent. We define the value of \(\frac{1}{2} \Vert \nabla f(\varvec{z})|_{\varvec{z} = \varvec{z}_0} + \sum _h \varvec{D}_h \varvec{\xi }_h\Vert _2^2\) at the optimum as a measure of violation of the KKT conditions.
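A small numpy sketch of this check follows. The block coordinate descent over the \(\varvec{\xi }_h\)'s and all variable names are our own illustration of the procedure just described, not the paper's code.

```python
import numpy as np

def kkt_violation(grad_f, z0, groups, lambdas, n_sweeps=200):
    """Block coordinate descent on the xi_h to evaluate the KKT check:
    minimize (1/2)||grad_f + sum_h D_h xi_h||_2^2 subject to
    ||xi_h||_2 <= lambda_h for groups with z0|_{g_h} = 0, with xi_h fixed
    at lambda_h * z0|_{g_h} / ||z0|_{g_h}||_2 for the remaining groups.
    Returns the optimal objective, the measure of KKT violation."""
    residual = grad_f.astype(float)             # tracks grad_f + sum_h D_h xi_h
    xis = [np.zeros(len(g)) for g in groups]
    free = []                                   # groups whose xi_h is a variable
    for h, (g, lam) in enumerate(zip(groups, lambdas)):
        norm_z = np.linalg.norm(z0[g])
        if norm_z > 0:                          # xi_h fixed by the equality in (12)
            xis[h] = lam * z0[g] / norm_z
            residual[g] += xis[h]
        else:
            free.append(h)
    for _ in range(n_sweeps):
        for h in free:
            g, lam = groups[h], lambdas[h]
            residual[g] -= xis[h]               # remove the current contribution
            r = residual[g]
            norm_r = np.linalg.norm(r)
            # best xi_h projects -r onto the L2 ball of radius lambda_h
            xis[h] = -r if norm_r <= lam else -(lam / norm_r) * r
            residual[g] += xis[h]
    return 0.5 * np.linalg.norm(residual) ** 2
```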
Let \(f_{J}\) be the function \(f\) constrained to an index set \(J\). Because the gradient of \(f\) is linear, if \(\varvec{z}_0\) only has non-zero entries in \(J\), then the entries of \(\nabla f( \varvec{z})\) in \(J\) are equal to \(\nabla f_{J} (\varvec{z} |_J)\) at \(\varvec{z} = \varvec{z}_0\). In addition, the \(\varvec{\xi }_h\)'s are separable across groups. Therefore, if \(\varvec{z}_0\) is an optimal solution to the problem constrained to \(J\), the KKT conditions are already met for the entries in \(J\) (i.e. \( \left( \nabla f(\varvec{z})|_{\varvec{z} = \varvec{z}_0} + \sum _h \varvec{D}_h \varvec{\xi }_h\right) |_{J} = 0\)); for \(g_h \not \subset J\), we use \(\frac{1}{2}\Vert \left( \nabla f(\varvec{z})|_{\varvec{z} = \varvec{z}_0} + \sum _h \varvec{D}_h \varvec{\xi }_h\right) |_{g_h} \Vert ^2\) at the optimum as a measure of how much the elements in group \(g_h\) violate the KKT conditions; this serves as the criterion when we greedily add groups, as sketched below (see Algorithm 2).
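To illustrate how this criterion drives the active-set strategy, here is a hypothetical skeleton of the greedy loop just described; `solve_restricted` and `group_violation` are placeholder names, and the paper's actual Algorithm 2 is not reproduced here.

```python
def greedy_group_selection(solve_restricted, group_violation, groups,
                           tol=1e-6, n_add=1):
    """Hypothetical skeleton of the greedy strategy described above.
    `solve_restricted(J)` is assumed to return the optimum z0 of the
    problem constrained to the index set J; `group_violation(z0, h)` the
    per-group violation (1/2)||(grad f + sum_h D_h xi_h)|_{g_h}||^2."""
    J = set(groups[0])                          # seed, e.g. the first group
    while True:
        z0 = solve_restricted(J)
        # score only groups not already contained in the active set J
        scores = {h: group_violation(z0, h)
                  for h, g in enumerate(groups) if not set(g) <= J}
        if not scores or max(scores.values()) <= tol:
            return z0, J                        # KKT conditions met up to tol
        # absorb the most violating group(s) into J and re-solve
        for h in sorted(scores, key=scores.get, reverse=True)[:n_add]:
            J |= set(groups[h])
```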
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Yang, Y., Tarr, M.J., Kass, R.E. (2016). Estimating Learning Effects: A Short-Time Fourier Transform Regression Model for MEG Source Localization. In: Rish, I., Langs, G., Wehbe, L., Cecchi, G., Chang, Km., Murphy, B. (eds) Machine Learning and Interpretation in Neuroimaging. MLINI 2013, MLINI 2014. Lecture Notes in Computer Science, vol 9444. Springer, Cham. https://doi.org/10.1007/978-3-319-45174-9_8
DOI: https://doi.org/10.1007/978-3-319-45174-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45173-2
Online ISBN: 978-3-319-45174-9