Satellite missions and ground-based observations generate datasets of unprecedented quality and size. New fields such as exoplanet science and gravitational wave physics have emerged that would be impossible without large-scale data analysis.
The Kepler space telescope, launched in 2009 and retired in 2018, monitored the brightness of 150,000 stars to find exoplanets, which reveal themselves through temporary dips in brightness when they occlude their host star. The measurements are corrupted by systematic noise introduced by the telescope. However, since the stars can be assumed to be causally independent of each other (being light years apart) as well as of the instrument noise, we can denoise the signal of a single star by removing all information that can be explained by the measurements of the other stars. We call this method half-sibling regression and, subject to an additivity assumption, provide theoretical guarantees on the quality of its reconstruction [ ]. We used half-sibling regression to develop a practical method to denoise pixel light curves [ ] and an exoplanet search pipeline which discovered 21 subsequently confirmed exoplanets.
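The idea of removing everything that the other stars can explain can be illustrated with a minimal sketch. It is not the pipeline used on Kepler data: the linear ridge regressor, the function name `half_sibling_denoise`, the `ridge` parameter and the synthetic light curves are all assumptions made for illustration. Under the additivity assumption, the target light curve is the star's own signal plus a function of the shared instrument noise, so subtracting the prediction obtained from the other stars leaves an estimate of the star's own signal.

```python
import numpy as np

def half_sibling_denoise(target, others, ridge=1e-3):
    """Subtract from `target` the part predictable from `others`.

    target: array of shape (T,), light curve of the star of interest.
    others: array of shape (T, K), light curves of K other stars.
    A linear ridge regression is assumed here purely for illustration.
    """
    X = others - others.mean(axis=0)   # centre each predictor star
    y = target - target.mean()         # centre the target star
    # Ridge solution w = (X^T X + ridge * I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)
    return y - X @ w                   # residual = estimate of the star's own signal

# Synthetic example (hypothetical data, not Kepler measurements).
rng = np.random.default_rng(0)
T, K = 2000, 20
systematics = 0.05 * np.cumsum(rng.normal(size=T))      # shared instrument drift
gains = rng.uniform(0.5, 1.5, size=K)                   # per-star sensitivity to the drift
others = systematics[:, None] * gains + rng.normal(scale=0.01, size=(T, K))

transit = np.zeros(T)
transit[900:950] = -0.5                                 # box-shaped transit dip
target = transit + 1.2 * systematics + rng.normal(scale=0.01, size=T)

cleaned = half_sibling_denoise(target, others)

truth = transit - transit.mean()
raw_error = np.std((target - target.mean()) - truth)
cleaned_error = np.std(cleaned - truth)
print(f"error before: {raw_error:.3f}, after: {cleaned_error:.3f}")
```

Because the other stars carry no information about the target's transit, the regression can only pick up the shared systematics, and the dip survives the subtraction. A more flexible regressor could replace the ridge step without changing the structure of the sketch.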
More recently, we have focused on detecting single exoplanet transits using supervised learning. Detecting transits is already hard when periodically recurring transit events are available, and harder still when only a single event is observed. In this setting, we need to resort to a strong data-driven model based on a large set of representative transits [ ]. This is of current interest as NASA's TESS mission is starting to release data, and many of the most interesting planets will only have a single transit.
Gravitational wave detection
The detection of gravitational waves from a binary black hole merger in 2015 was a milestone in modern physics. However, despite the unparalleled sensitivity of the LIGO detectors, data analysis remains a challenge. We have developed a dilated, fully convolutional neural network that operates directly on the time-series strain data to identify simulated GW signals from black hole mergers in real, non-Gaussian background measurements from the LIGO detectors. The system runs efficiently and in real time on strain data of arbitrary length from any number of detectors [ ]. It has the potential to develop into a complementary trigger generator in the existing LIGO search pipeline. To explore this, several department members currently belong to the LSC (LIGO Scientific Collaboration).
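To illustrate why a dilated, fully convolutional architecture can process strain data of arbitrary length from several detectors, the following is a minimal NumPy sketch of the forward pass. It is not the published network: the layer count, kernel size, dilation schedule, channel widths, random untrained weights and the function names `dilated_conv1d`, `receptive_field` and `fully_conv_net` are all assumptions chosen for illustration. Each detector is treated as one input channel, and because there are no fully connected layers, the same weights yield one score per valid time step for any input length.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Valid' dilated 1D convolution (cross-correlation form).

    x: array (C_in, T); kernel: array (C_out, C_in, k).
    Returns an array of shape (C_out, T - (k - 1) * dilation).
    """
    c_out, c_in, k = kernel.shape
    t_out = x.shape[1] - (k - 1) * dilation
    out = np.zeros((c_out, t_out))
    for j in range(k):
        # Tap j reads the input shifted by j * dilation samples.
        out += kernel[:, :, j] @ x[:, j * dilation : j * dilation + t_out]
    return out

def receptive_field(kernel_size, dilations):
    """Number of input samples that influence a single output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)

def fully_conv_net(strain, weights, dilations):
    """Map strain (n_detectors, T) to one score in (0, 1) per valid time step."""
    h = strain
    for w, d in zip(weights[:-1], dilations[:-1]):
        h = np.maximum(dilated_conv1d(h, w, d), 0.0)    # ReLU hidden layers
    logits = dilated_conv1d(h, weights[-1], dilations[-1])
    return 1.0 / (1.0 + np.exp(-logits[0]))             # sigmoid on the single output channel

# Hypothetical configuration: 2 detectors, kernel size 2, doubling dilations.
rng = np.random.default_rng(0)
dilations = [1, 2, 4, 8]
channels = [2, 8, 8, 8, 1]
weights = [0.1 * rng.normal(size=(channels[i + 1], channels[i], 2))
           for i in range(len(dilations))]

short_strain = rng.normal(size=(2, 4096))    # stand-in for two-detector strain data
long_strain = rng.normal(size=(2, 10000))    # a different length, same weights

scores_short = fully_conv_net(short_strain, weights, dilations)
scores_long = fully_conv_net(long_strain, weights, dilations)
print(receptive_field(2, dilations), scores_short.shape, scores_long.shape)
```

Doubling the dilation in each layer makes the receptive field grow quickly with depth while the number of weights per layer stays fixed. Adding a detector only changes the number of input channels of the first layer.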