**New homepage**: https://sites.google.com/site/yevgenyseldin

**Affiliations:**

- Research Scientist, Max Planck Institute for Intelligent Systems (Oct 2009 - present)
- Honorary Research Associate, Department of Computer Science, University College London (Aug 2011 - present)

**Education:**

I did my Ph.D. at the Hebrew University of Jerusalem under the supervision of Prof. Naftali Tishby.

**Research Interests:**

Machine Learning, Reinforcement Learning, Information Theory.

My current research is focused on data-dependent analysis of reinforcement learning (via application of PAC-Bayesian analysis).

I am also taking part in the CompLACS research project.

**Tutorials:**

- PAC-Bayesian Analysis and Its Applications (together with François Laviolette and John Shawe-Taylor). Given at ECML-PKDD-2012.
- PAC-Bayesian Analysis in Supervised, Unsupervised, and Reinforcement Learning (together with François Laviolette and John Shawe-Taylor). Given at ICML-2012.

**Events that I organized:**

- Multi-Trade-offs in Machine Learning workshop at NIPS-2012 (together with Guy Lever, John Shawe-Taylor, Nicolò Cesa-Bianchi, Koby Crammer, François Laviolette, Gábor Lugosi, and Peter Bartlett).
- New Frontiers in Model Order Selection workshop at NIPS-2011 (together with Koby Crammer, Nicolò Cesa-Bianchi, François Laviolette, and John Shawe-Taylor).
- Empirical Inference Symposium in honor of the 75th anniversary of Prof. Vladimir N. Vapnik (together with Bernhard Schölkopf and Krikamol Muandet).

**Teaching:**

- Teaching Assistant (frontal) in the **Introduction to Machine Learning** course at The Hebrew University of Jerusalem. Fall 2007 and 2008.

- Teaching Assistant (frontal) in the **Introduction to Linear Systems** course at The Hebrew University of Jerusalem. Spring 2006, 2007, 2008, and 2009.

**A quick guide to my main publications:**

**PAC-Bayesian analysis of martingales and its application to multiarmed bandits with side information:** Please see our PAC-Bayesian Inequalities for Martingales draft for general PAC-Bayesian inequalities for martingales (joint work with François Laviolette, Nicolò Cesa-Bianchi, John Shawe-Taylor, and Peter Auer, accepted to IEEE Transactions on Information Theory). These inequalities make it possible to control the concentration of weighted averages of multiple (possibly uncountably many) simultaneously evolving and interdependent martingales. We apply these inequalities to multiarmed bandits with side information in our NIPS-2011 paper PAC-Bayesian Analysis of Contextual Bandits (joint work with Peter Auer, François Laviolette, John Shawe-Taylor, and Ronald Ortner). Our analysis makes it possible to provide the algorithm with a large amount of side information, let the algorithm decide which side information is relevant for the task, and penalize the algorithm only for the side information that it actually uses. We also provide an algorithm for multiarmed bandits with side information whose computational complexity is independent of the amount of side information and linear in the number of actions.

**PAC-Bayesian analysis of co-clustering, matrix tri-factorization and graphical models:** This series of works is best summarized in our JMLR paper PAC-Bayesian Analysis of Co-clustering and Beyond (joint work with Naftali Tishby). We derive generalization bounds and regularized optimization algorithms for co-clustering and matrix tri-factorization. The generalization bounds obtained for co-clustering suggest that co-clustering should optimize a trade-off between empirical data fit and the mutual information that the clusters preserve on the row and column variables. To the best of our knowledge, this is the first generalization analysis of co-clustering and matrix tri-factorization and the first time regularization terms have been derived for these problems. Our approach of formulating unsupervised learning problems as prediction problems can be extended to virtually any unsupervised learning task, and our generalization bounds can be further extended to tree-shaped graphical models.

**Multilevel models for image processing:** We were among the first to apply multilevel unsupervised learning in image analysis, where the first level identified a "dictionary" of common textures within a collection of images and the second level used this dictionary to perform joint unsupervised segmentation of the images. See our Unsupervised Clustering of Images using their Joint Segmentation by Yevgeny Seldin, Sonia Starik and Michael Werman.

**Unsupervised sequence segmentation by mixtures of variable memory Markov sources:** We designed an algorithm for unsupervised segmentation of sequences into alternating variable memory Markov sources (implemented as Prediction Suffix Trees). The algorithm was shown to be successful in identifying domains in protein sequences. See our Bioinformatics publication Markovian domain fingerprinting: statistical segmentation of protein sequences by Gill Bejerano, Yevgeny Seldin, Hanah Margalit and Naftali Tishby for a summary of the biological results, and our ICML-2001 paper Unsupervised Sequence Segmentation by a Mixture of Switching Variable Memory Markov Sources by Yevgeny Seldin, Gill Bejerano, and Naftali Tishby for more details about the algorithm.

**PAC-Bayes-Empirical-Bernstein Inequality**
In *Advances in Neural Information Processing Systems 26*, pages: 109-117, (Editors: C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K.Q. Weinberger), 27th Annual Conference on Neural Information Processing Systems (NIPS), 2013 (inproceedings)

**On the Relations and Differences between Popper Dimension, Exclusion Dimension and VC-Dimension**
In *Empirical Inference - Festschrift in Honor of Vladimir N. Vapnik*, pages: 53-57, 6, (Editors: Schölkopf, B., Luo, Z. and Vovk, V.), Springer, 2013 (inbook)

**Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments**
In *Proceedings of the Tenth European Workshop on Reinforcement Learning*, pages: 103-116, (Editors: MP Deisenroth and C Szepesvári and J Peters), JMLR, EWRL, 2013 (inproceedings)

**PAC-Bayesian Analysis: A Link Between Inference and Statistical Physics**
Workshop on Statistical Physics of Inference and Control Theory, 2012 (talk)

**PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits**
In *JMLR Workshop and Conference Proceedings 26*, pages: 98-111, JMLR, Cambridge, MA, USA, On-line Trading of Exploration and Exploitation 2, April 2012 (inproceedings)

**PAC-Bayesian Analysis of Supervised, Unsupervised, and Reinforcement Learning**
Tutorial at the 29th International Conference on Machine Learning (ICML), 2012 (talk)

**PAC-Bayesian Inequalities for Martingales**
*IEEE Transactions on Information Theory*, 58(12):7086-7093, June 2012 (article)

**PAC-Bayesian Analysis and Its Applications**
Tutorial at The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), 2012 (talk)

**PAC-Bayesian Analysis of the Exploration-Exploitation Trade-off**
In *ICML Workshop on Online Trading of Exploration and Exploitation 2*, pages: 1-8, July 2011 (inproceedings)

**PAC-Bayesian Analysis of Contextual Bandits**
In *Advances in Neural Information Processing Systems 24*, pages: 1683-1691, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

**PAC-Bayesian Analysis of Martingales and Multiarmed Bandits**
Max Planck Institute for Biological Cybernetics, Tübingen, Germany, May 2011 (techreport)

**A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering**
Max Planck Institute for Biological Cybernetics, Tübingen, Germany, September 2010 (techreport)

**PAC-Bayesian Analysis in Unsupervised Learning**
Foundations and New Trends of PAC Bayesian Learning Workshop, March 2010 (talk)

**A PAC-Bayesian Analysis of Co-clustering, Graph Clustering, and Pairwise Clustering**
In *ICML 2010 Workshop on Social Analytics: Learning from human interactions*, pages: 1-5, ICML Workshop on Social Analytics: Learning from human interactions, June 2010 (inproceedings)

**PAC-Bayesian Bounds for Discrete Density Estimation and Co-clustering Analysis**
*Workshop "Foundations and New Trends of PAC Bayesian Learning"*, March 2010 (poster)

**PAC-Bayesian Analysis of Co-clustering and Beyond**
*Journal of Machine Learning Research*, 11, pages: 3595-3646, December 2010 (article)

**A PAC-Bayesian Approach to Structure Learning**
The Hebrew University of Jerusalem, Israel, September 2009 (phdthesis)

**PAC-Bayesian Approach to Formulation of Clustering Objectives**
NIPS Workshop on "Clustering: Science or Art? Towards Principled Approaches", December 2009 (talk)

**A PAC-Bayesian Approach to Formulation of Clustering Objectives**
In *Proceedings of the NIPS 2009 Workshop "Clustering: Science or Art? Towards Principled Approaches"*, pages: 1-4, NIPS Workshop "Clustering: Science or Art? Towards Principled Approaches", December 2009 (inproceedings)

**PAC-Bayesian Generalization Bound for Density Estimation with Application to Co-clustering**
In *Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (AISTATS 2009)*, *JMLR Workshop and Conference Proceedings Volume 5: AISTATS 2009*, pages: 472-479, MIT Press, Cambridge, MA, USA, April 2009 (inproceedings)
