**Navigation**

**Contact us**

- Scientific Leader
- tel: +33 3 20 43 68 76

- Team Assistant
- tel: +33 3 59 57 78 45

**Research Organizations**

**Current Collaborations**

**Related Inria teams**

*© 2016-2018 Modal-Team. All rights reserved.*

**Seminars**

**Usual day**: Tuesday at 11.00.

**Place**: Inria Lille - Nord Europe.

**How to get there**: directions in French, in English.

**Organizers**: Alain Celisse and Hemant Tyagi

**Calendar feed**: iCalendar (hosted by the seminars platform of University of Lille)

Most slides are available: check past sessions and archives.

**Archives**: 2018-2019, 2017-2018, 2016-2017, 2015-2016, 2014-2015, 2013-2014.

**Date**: March 31, 2020 (Tuesday) at 11.00 (Plenary room)
**Affiliation**: ENSAE CREST
**Webpage**: Link
**Title**: TBA
**Abstract**: TBA

**Date**: March 3, 2020 (Tuesday) at 11.00 (Plenary room)
**Affiliation**: CNRS & École Normale Supérieure, Paris
**Webpage**: Link
**Title**: TBA
**Abstract**: TBA

**Date**: February 11, 2020 (Tuesday) at 11.00 (Plenary room)
**Affiliation**: University of Stuttgart
**Webpage**: Link
**Title**: TBA
**Abstract**: TBA

**Date**: January 28, 2020 (Tuesday) at 11.00 (Room A11)
**Affiliation**: Inria Lille - Nord Europe
**Webpage**:
**Title**: Early stopping: some new ideas towards optimality
**Abstract**: Early stopping is a regularization technique based on choosing a stopping time for an iterative algorithm. Its main advantages are a significant reduction in computational resources and possibly better statistical performance. This talk focuses on the nonparametric regression setting in a reproducing kernel Hilbert space (RKHS). We perform empirical risk minimization in the RKHS with two iterative learning algorithms: gradient descent and (iterative) kernel ridge regression. To avoid overfitting in these two algorithms, we consider the so-called discrepancy-principle early stopping rule. We quantify its performance by means of an oracle-type inequality and show that, under some mild conditions, this stopping rule is not appropriate for various kernel classes. To solve this problem, we propose a new stopping rule, called the smoothed discrepancy principle, and show that it can be proved optimal in the minimax sense under some conditions. The talk ends with simulation experiments that validate the theoretical results.
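The discrepancy principle mentioned in the abstract can be sketched in a few lines; this is a minimal illustration (Gaussian kernel, noise level `sigma` assumed known), not the speaker's implementation:

```python
import numpy as np

def gaussian_kernel(X, Y, bw=0.2):
    """Gaussian kernel matrix between 1-D sample vectors X and Y."""
    return np.exp(-(X[:, None] - Y[None, :]) ** 2 / (2 * bw ** 2))

def gd_early_stopping(K, y, sigma, max_iter=2000):
    """Kernel gradient descent, stopped by the discrepancy principle.

    Stops at the first iteration t with ||y - f_t|| / sqrt(n) <= sigma,
    where sigma is the (assumed known) noise level.
    """
    n = len(y)
    lr = 1.0 / np.linalg.eigvalsh(K).max()   # step size 1 / lambda_max(K)
    alpha = np.zeros(n)
    for t in range(1, max_iter + 1):
        resid = K @ alpha - y                # f_t(x_i) - y_i on the sample
        if np.linalg.norm(resid) / np.sqrt(n) <= sigma:
            return alpha, t                  # discrepancy principle fires
        alpha = alpha - lr * resid           # functional gradient step
    return alpha, max_iter
```

Iterating to `max_iter` would interpolate the noise; the rule stops once the training residual is explained by noise alone.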

**Date**: January 21, 2020 (Tuesday) at 11.00 (Plenary room)
**Affiliation**: University of Lorraine
**Webpage**: Link
**Title**: Decision Tree for uncertainty measures
**Abstract**: Ensemble methods are popular machine learning techniques that are powerful for both classification and prediction problems. A set of classifiers is constructed, and the classification or prediction of a new data instance is obtained by taking a weighted vote. These classifiers can be regression or classification trees. In practice, a tree is a piecewise-constant estimator on partitions obtained from the data. These partitions are induced by recursive dyadic splits of the set of input variables. For example, CART (Classification And Regression Trees) is an algorithm that constructs such a tree. The goal is to partition the space of input variable values into the K most "homogeneous" disjoint regions possible. More precisely, each partitioning value has to minimize a risk function. In practice, however, experimental measurements may be observed with uncertainty. This work proposes to extend the CART algorithm to this kind of data. We present an induced model adapted to uncertain data, together with a prediction rule and a split rule for tree construction that take into account the uncertainty of each quantitative observation in the database. Joint work with Sami ALKHOURY, Myriam TAMI, Emilie DEVIJVER, Eric GAUSSIER and Total SA.
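The CART split rule described above (each partitioning value minimizes a risk function) can be illustrated for standard, non-uncertain regression data; this toy sketch scans candidate thresholds and keeps the one minimizing the summed squared error of the two child nodes:

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, risk) of the best binary split of a 1-D feature.

    The risk of a split is the sum of squared deviations from the child
    means, i.e. the CART regression criterion.
    """
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_thr, best_risk = None, np.inf
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue                      # no threshold between equal values
        left, right = y[:i], y[i:]
        risk = ((left - left.mean()) ** 2).sum() \
             + ((right - right.mean()) ** 2).sum()
        if risk < best_risk:
            best_thr = (x[i] + x[i - 1]) / 2   # midpoint threshold
            best_risk = risk
    return best_thr, best_risk
```

A full tree applies this recursively to each child node; the talk's contribution is to replace this criterion with one adapted to uncertain observations.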

**Date**: January 14, 2020 (Tuesday) at 11.00 (Room A11)
**Affiliation**: Inria Lille - Nord Europe
**Webpage**:
**Title**: Optimal Transport and its Application to Domain Adaptation
**Abstract**: The optimal transport problem was first formalized by Monge in the 18th century. The initial problem consists of finding the best plan to move a mass of dirt into a given hole while minimizing the effort. Recently, the problem and the associated Wasserstein distance have attracted a lot of attention and have been used in various machine learning tasks. This talk focuses on the application of optimal transport to domain adaptation problems. In such a scenario, one is given a set of points sampled from a source distribution and wants to adapt them to match the distribution of the target domain. Optimal transport will be compared with other domain adaptation methods, and some computational details will also be discussed.
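The adaptation step described above can be sketched with the entropic-regularized (Sinkhorn) transport plan between two small point clouds, followed by a barycentric mapping of the source points; this is a generic illustration, not the speaker's method:

```python
import numpy as np

def sinkhorn_plan(C, eps=0.1, n_iter=500):
    """Entropic-regularized OT plan between uniform marginals (Sinkhorn)."""
    n, m = C.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-C / eps)
    v = np.ones(m)
    for _ in range(n_iter):          # alternate marginal scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_adapt(Xs, Xt, eps=0.1):
    """Push each source point to the plan-weighted average of targets."""
    C = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(-1)  # squared Euclidean
    P = sinkhorn_plan(C, eps)
    return (P @ Xt) / P.sum(1, keepdims=True)             # barycentric map
```

For a pure translation of the source cloud, the transported points land (up to entropic blur) on their translates, which is the optimal map for the squared-Euclidean cost.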

**Date**: December 3, 2019 (Tuesday) at 11.00 (Plenary Room)
**Affiliation**: University of Lyon 2
**Webpage**: Link
**Title**: A new and simpler approach to the analysis of Robust PCA
**Abstract**: In standard PCA, one collects n observed vectors x_1, ..., x_n in R^d and forms the matrix X = [x_1, ..., x_n]. One then finds the best rank-r approximation by thresholding the singular values as prescribed by the Eckart-Young theorem. One of the main drawbacks of PCA is that the singular vectors are very sensitive to outliers. Robust PCA (RPCA) is a kind of matrix decomposition that allows one to recover a low-rank matrix corrupted by both noise and outliers with potentially very large values. RPCA has been used extensively in a very wide range of applications, from genetics to video processing. One of the main ideas behind Robust PCA is to reconstruct the matrices L_0 and S_0 by solving the convex programming problem

min ||L||_* + λ ||S||_1  s.t.  ||X − L − S|| ≤ η.

The original theoretical analysis of this estimator's accuracy is well known to be technically very involved. The goal of the present work is to propose an elementary analysis of Robust PCA using the descent-cone approach of Amelunxen et al. and a null-space-type property on the eigenvectors of the original low-rank matrix.
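A standard way to solve this convex program in the equality-constrained case (η = 0, so X = L + S) is the inexact augmented Lagrangian method with singular-value and soft thresholding; a compact sketch, not tied to the speaker's analysis:

```python
import numpy as np

def shrink(M, tau):
    """Entrywise soft-thresholding: prox of tau * l1-norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca_ialm(X, lam=None, n_iter=200, tol=1e-7):
    """Inexact ALM for  min ||L||_* + lam ||S||_1  s.t.  X = L + S."""
    m, n = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))       # standard RPCA choice
    norm_X = np.linalg.norm(X)
    Y = np.zeros_like(X)                     # dual variable
    S = np.zeros_like(X)
    mu = 1.25 / np.linalg.norm(X, 2)         # penalty parameter
    for _ in range(n_iter):
        # L-step: singular value thresholding of X - S + Y/mu
        U, s, Vt = np.linalg.svd(X - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # S-step: soft-thresholding of X - L + Y/mu
        S = shrink(X - L + Y / mu, lam / mu)
        R = X - L - S
        Y = Y + mu * R                       # dual ascent on the constraint
        mu = min(mu * 1.5, 1e7)              # increase penalty
        if np.linalg.norm(R) / norm_X < tol:
            break
    return L, S
```

On a synthetic low-rank matrix with a few large sparse corruptions, this recovers L_0 and S_0 essentially exactly, which is the recovery phenomenon the talk's analysis explains.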

**Date**: October 15, 2019 (Tuesday) at 11.00 (Plenary Room)
**Affiliation**: Indian Institute of Science (IISc)
**Webpage**:
**Title**: Euclidean k-Means with Center Proximity
**Abstract**: Clustering is an important tool in any data science toolkit. Most popular clustering algorithms partition the given data into disjoint clusters by optimizing a certain global objective such as the k-means. The implicit assumption in doing so is that an optimal solution for this objective would recover the underlying ground truth clustering. However, many such objectives are NP-hard to optimize in the worst case, e.g., k-centre, k-median, k-means. Moreover, an optimal solution need not satisfy certain properties desired from the ground truth clustering, e.g., balance, stability. In practice, however, Lloyd's algorithm, k-means++, and their variants perform well on most real-world data sets. This dichotomy between theoretical intractability and empirically observed efficiency has led to the CDNM thesis: "Clustering is Difficult only when it does Not Matter"! In most real-world data sets, the underlying ground-truth clustering is unambiguous and stable under small perturbations of data. We will highlight these issues with the example of k-means clustering. In this talk, we will explore a notion of stability called centre-proximity and give an algorithm and lower bounds for stable instances. This is joint work with Dr Amit Deshpande (Microsoft Research, India) and Dr Anand Louis (Indian Institute of Science, India).
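Lloyd's algorithm, the heuristic whose empirical success motivates the CDNM thesis above, is short enough to state in full; a plain sketch with random initialization (k-means++ seeding would be the usual refinement):

```python
import numpy as np

def lloyd(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm for k-means with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # assignment step: nearest center for every point
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        # update step: each center moves to the mean of its cluster
        new_centers = np.array([
            X[labels == j].mean(0) if np.any(labels == j) else centers[j]
            for j in range(k)])
        if np.allclose(new_centers, centers):
            break                        # local optimum reached
        centers = new_centers
    return centers, labels
```

On well-separated data (a "stable" instance in the talk's sense), this local search recovers the ground-truth partition despite the worst-case NP-hardness of the k-means objective.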

**Date**: October 8, 2019 (Tuesday) at 11.00 (Plenary Room)
**Affiliation**: SINTEF
**Webpage**: Link.
**Title**: Machine learning in the real world
**Abstract**: Machine learning algorithms are flexible and powerful, but their data requirements are high and rarely met by the available data. Real-world data is often medium-sized (relative to problem size), noisy, and full of missing values. At the same time, to deploy machine learning in industrial settings, models must be robust, explainable, and have quantified uncertainties. I will show practical examples of these challenges from our recent projects and some case-by-case solutions, but also highlight remaining issues.

**Date**: September 23, 2019 (Monday) at 11.00 (Room A00)
**Affiliation**: University of Tokyo
**Webpage**: Link.
**Title**: Unsupervised Domain Adaptation Based on Source-guided Discrepancy
**Abstract**: Unsupervised domain adaptation is the problem setting where the data generating distributions in the source and target domains are different, and labels in the target domain are unavailable. One important question in unsupervised domain adaptation is how to measure the difference between the source and target domains. A previously proposed discrepancy that does not use the source domain labels requires high computational cost to estimate and may lead to a loose generalization error bound in the target domain. To mitigate these problems, we propose a novel discrepancy called source-guided discrepancy (S-disc), which exploits labels in the source domain. As a consequence, S-disc can be computed efficiently with a finite sample convergence guarantee. In addition, we show that S-disc can provide a tighter generalization error bound than the one based on an existing discrepancy. Finally, we report experimental results that demonstrate the advantages of S-disc over the existing discrepancies.
**Slides**: Link

**Date**: September 11, 2019 (Wednesday) at 14.00 (Plenary Room)
**Affiliation**: KU Leuven
**Webpage**: Link.
**Title**: Landmark sampling, diversity and kernel methods
**Abstract**: In machine learning, there is a revived interest in kernel methods, e.g. for designing interpretable convolutional networks or in the context of Gaussian processes. More generally, in kernel-based learning, a central question concerns large-scale approximations of the kernel matrix. A popular method for finding a low-rank approximation of kernel matrices is the so-called Nyström method, which relies on the sampling of 'good' landmark points in a dataset. We will discuss an approach for selecting 'diverse' landmarks with some theoretical guarantees. Our work makes a connection between kernelized Christoffel functions, ridge leverage scores and determinantal point processes.
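The Nyström method named in the abstract builds a rank-m approximation K ≈ C W⁺ Cᵀ from m landmark points, where C is the kernel block between all points and the landmarks and W the block among the landmarks; a minimal sketch with a Gaussian kernel and hand-picked landmarks (the talk is about choosing them well):

```python
import numpy as np

def gaussian_kernel(X, Y, bw=0.5):
    """Gaussian kernel matrix between rows of X and rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def nystrom(X, landmarks, bw=0.5):
    """Rank-m Nystrom approximation K_hat = C W^+ C^T of the kernel matrix,
    built from the m landmark indices in `landmarks`."""
    C = gaussian_kernel(X, X[landmarks], bw)   # n x m cross block
    W = C[landmarks]                           # m x m landmark block
    return C @ np.linalg.pinv(W) @ C.T
```

The approximation is exact on the landmark block and has rank at most m, which is why the quality of the chosen landmarks (the 'diversity' studied in the talk) governs the overall error.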


seminars.txt · Last modified: 2020/01/17 14:01 by tyagi