My scientific interests are in the field of machine learning and inference from empirical data. In particular, I study kernel methods for extracting regularities from possibly high-dimensional data. These regularities are usually statistical ones; however, in recent years I have also become interested in methods for finding causal structures that underlie statistical dependences. I have worked on a number of different applications of machine learning; in our field, you get "to play in everyone's backyard." Most recently, I have been trying to play in the backyard of astronomers and photographers.
With the growing interest in (how to make money with) big data, machine learning has significantly gained in popularity. We published an article in the German newspaper FAZ in January 2015, discussing some of the implications. Disclaimer: the newspaper added some text that appears above our names; this was not written or approved by us.
In March 2018, I published an article about the cybernetic revolution in the German newspaper SZ. It starts with the thesis that the current revolution is about processing (generating, converting, industrializing) information in much the same way the first two industrial revolutions dealt with processing (generating, converting, industrializing) energy. I have occasionally put forward this thesis (but I'm sure I am not the only one who thinks of it this way), for instance during an NYU symposium on the future of AI in January 2016 (here are some notes written by Max Tegmark). The article also provides recommendations on what Europe should do to keep up with the development.
My department and/or members of the department (incl. myself) receive funding from a number of sources including Max Planck, the DFG, the Alexander-von-Humboldt foundation, Amazon, Google, Bosch, Facebook, the BMBF (German Ministry of Science), the EU, the ETH Zürich, the Land Baden-Wuerttemberg, the Koerber foundation, CIFAR, and the Stanford Center on Philanthropy and Civil Society.
M.Sc. in mathematics and Lionel Cooper Memorial Prize, University of London (1992)
Diplom in physics (Tübingen, 1994)
doctorate in computer science from the Technical University Berlin (1997); thesis on Support Vector Learning (main advisor: V. Vapnik, AT&T Bell Labs) won the annual dissertation prize of the German Association for Computer Science (GI)
If you'd like to contact me, please consider these two notes:
1. I recently became co-editor-in-chief of JMLR. I work for JMLR because I believe in its open access model, but it takes a lot of time. During my JMLR term, please don't try to convince me to take on other journal or grant reviewing duties.
2. I am not very organized with my e-mail, so if you want to apply for a position in my lab, please send your application only to Sekretariat-Schoelkopf@tuebingen.mpg.de. Note that we do not respond to non-personalized applications that look like they are being sent to a large number of places simultaneously.
We are always happy to receive outstanding applications for PhD positions and postdocs.
In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, 38, pages: 847-855, JMLR Workshop and Conference Proceedings, (Editors: Lebanon, G. and Vishwanathan, S.V.N.), JMLR.org, AISTATS, 2015 (inproceedings)
In 24th International Joint Conference on Artificial Intelligence, Machine Learning Track, pages: 3561-3568, (Editors: Yang, Q. and Wooldridge, M.), AAAI Press, Palo Alto, California USA, IJCAI15, 2015 (inproceedings)
The Kalman filter is a well-established approach to obtaining information on the time-dependent state of a system from noisy observations. It was developed in the context of the Apollo project to determine the deviation of the true trajectory of a rocket from the desired trajectory. Afterwards it was applied to many different systems with small numbers of components of the respective state vector (typically about 10). In all cases the equation of motion for the state vector was known exactly. Fast dissipative magnetization dynamics is often investigated by x-ray magnetic circular dichroism movies (XMCD movies), which are often very noisy. In this situation the number of components of the state vector is extremely large (about 10^5), and the equation of motion for the dissipative magnetization dynamics (especially the values of the material parameters of this equation) is not well known. In the present paper it is shown by theoretical considerations that, nevertheless, there is no problem in principle with using the Kalman filter to denoise XMCD movies of fast dissipative magnetization dynamics.
In Proceedings of the 32nd International Conference on Machine Learning, 37, pages: 1452–1461, JMLR Workshop and Conference Proceedings, (Editors: F. Bach and D. Blei), JMLR, ICML, 2015 (inproceedings)
In 6th International Workshop on Machine Learning in Medical Imaging, 9352, pages: 52-60, Lecture Notes in Computer Science, (Editors: L. Zhou, L. Wang, Q. Wang and Y. Shi), Springer, MLMI, 2015 (inproceedings)
In Proceedings of the 32nd International Conference on Machine Learning, 37, pages: 1898–1906, JMLR Workshop and Conference Proceedings, (Editors: F. Bach and D. Blei), JMLR, ICML, 2015 (inproceedings)
In Proceedings of The 32nd International Conference on Machine Learning, 37, pages: 2218–2226, JMLR Workshop and Conference Proceedings, (Editors: Bach, F. and Blei, D.), JMLR, ICML, 2015 (inproceedings)
Foreman-Mackey, D., Montet, B., Hogg, D., Morton, T., Wang, D., Schölkopf, B.
The Astrophysical Journal, 806(2), 2015 (article)
Photometry of stars from the K2 extension of NASA’s Kepler mission is afflicted by systematic effects caused by small (few-pixel) drifts in the telescope pointing and other spacecraft issues. We present a method for searching K2 light curves for evidence of exoplanets by simultaneously fitting for these systematics and the transit signals of interest. This method is more computationally expensive than standard search algorithms but we demonstrate that it can be efficiently implemented and used to discover transit signals. We apply this method to the full Campaign 1 data set and report a list of 36 planet candidates transiting 31 stars, along with an analysis of the pipeline performance and detection efficiency based on artificial signal injections and recoveries. For all planet candidates, we present posterior distributions on the properties of each system based strictly on the transit observables.
In Proceedings of the 32nd International Conference on Machine Learning, 37, pages: 1917–1925, JMLR Workshop and Conference Proceedings, (Editors: F. Bach and D. Blei), JMLR, ICML, 2015 (inproceedings)
IEEE International Conference on Computer Vision (ICCV 2015), Workshop on Inverse Rendering, 2015, Note: This work has been presented as a poster and is not included in the workshop proceedings. (poster)
Pickup, L., Zheng, P., Donglai, W., YiChang, S., Changshui, Z., Zisserman, A., Schölkopf, B., Freeman, W.
Seeing the Arrow of Time
Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pages: 2043-2050, IEEE, CVPR, June 2014 (conference)
Journal of Neural Engineering, 11(2):026006, 2014 (article)
Objective. Patients in the completely locked-in state (CLIS), due to, for example, amyotrophic lateral sclerosis (ALS), no longer possess voluntary muscle control. Assessing attention and cognitive function in these patients during the course of the disease is a challenging but essential task for both nursing staff and physicians. Approach. An electrophysiological cognition test battery, including auditory and semantic stimuli, was applied in a late-stage ALS patient at four different time points during a six-month epidural electrocorticography (ECoG) recording period. Event-related cortical potentials (ERP), together with changes in the ECoG signal spectrum, were recorded via 128 channels that partially covered the left frontal, temporal and parietal cortex. Main results. Auditory but not semantic stimuli induced significant and reproducible ERP projecting to specific temporal and parietal cortical areas. N1/P2 responses could be detected throughout the whole study period. The highest P3 ERP was measured immediately after the patient's last communication through voluntary muscle control, which was paralleled by low theta and high gamma spectral power. Three months after the patient's last communication, i.e., in the CLIS, P3 responses could no longer be detected. At the same time, increased activity in low-frequency bands and a sharp drop of gamma spectral power were recorded. Significance. Cortical electrophysiological measures indicate at least partially intact attention and cognitive function during sparse volitional motor control for communication. Although the P3 ERP and frequency-specific changes in the ECoG spectrum may serve as indicators for CLIS, a close-meshed monitoring will be required to define the exact time point of the transition.
In Proceedings of the Eighth International Conference on Weblogs and Social Media, pages: 170-179, (Editors: E. Adar, P. Resnick, M. De Choudhury, B. Hogan, and A. Oh), AAAI Press, ICWSM, 2014 (inproceedings)
In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI2014), pages: 132-141, (Editors: Nevin L. Zhang and Jin Tian), AUAI Press Corvallis, Oregon, UAI2014, 2014 (inproceedings)
Time plays an essential role in the diffusion of information, influence, and disease over networks. In many cases we can only observe when a node is activated by a contagion—when a node learns about a piece of information, makes a decision, adopts a new behavior, or becomes infected with a disease. However, the underlying network connectivity and transmission rates between nodes are unknown. Inferring the underlying diffusion dynamics is important because it leads to new insights and enables forecasting, as well as influencing or containing information propagation. In this paper we model diffusion as a continuous temporal process occurring at different rates over a latent, unobserved network that may change over time. Given information diffusion data, we infer the edges and dynamics of the underlying network. Our model naturally imposes sparse solutions and requires no parameter tuning. We develop an efficient inference algorithm that uses stochastic convex optimization to compute online estimates of the edges and transmission rates. We evaluate our method by tracking information diffusion among 3.3 million mainstream media sites and blogs, and experiment with more than 179 million different instances of information spreading over the network in a one-year period. We apply our network inference algorithm to the top 5,000 media sites and blogs and report several interesting observations. First, information pathways for general recurrent topics are more stable across time than for on-going news events. Second, clusters of news media sites and blogs often emerge and vanish in a matter of days for on-going news events. Finally, major events, for example, large scale civil unrest as in the Libyan civil war or Syrian uprising, increase the number of information pathways among blogs, and also increase the network centrality of blogs and social media sites.
Journal of Neural Engineering, 11(5):056015, 2014 (article)
Objective. Brain–computer interface (BCI) systems are often based on motor- and/or sensory processes that are known to be impaired in late stages of amyotrophic lateral sclerosis (ALS). We propose a novel BCI designed for patients in late stages of ALS that only requires high-level cognitive processes to transmit information from the user to the BCI. Approach. We trained subjects via EEG-based neurofeedback to self-regulate the amplitude of gamma-oscillations in the superior parietal cortex (SPC). We argue that parietal gamma-oscillations are likely to be associated with high-level attentional processes, thereby providing a communication channel that does not rely on the integrity of sensory- and/or motor-pathways impaired in late stages of ALS. Main results. Healthy subjects quickly learned to self-regulate gamma-power in the SPC by alternating between states of focused attention and relaxed wakefulness, resulting in an average decoding accuracy of 70.2%. One locked-in ALS patient (ALS-FRS-R score of zero) achieved an average decoding accuracy significantly above chance-level though insufficient for communication (55.8%). Significance. Self-regulation of gamma-power in the SPC is a feasible paradigm for brain–computer interfacing and may be preserved in late stages of ALS. This provides a novel approach to testing whether completely locked-in ALS patients retain the capacity for goal-directed thinking.
In Advances in Neural Information Processing Systems 27, pages: 1-9, (Editors: Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence and K.Q. Weinberger), Curran Associates, Inc., 28th Annual Conference on Neural Information Processing Systems (NIPS), 2014 (inproceedings)
In Regularization, Optimization, Kernels, and Support Vector Machines, pages: 427-456, 19, Chapman & Hall/CRC Machine Learning & Pattern Recognition, (Editors: Suykens, J. A. K., Signoretto, M. and Argyriou, A.), Chapman and Hall/CRC, Boca Raton, USA, 2014 (inbook)