Machine Learning Workshop

Andrew Wan, IIIS, Tsinghua University

Title: Efficient learning with noisy data
Abstract: The hypotheses used by learning algorithms (for example linear separators) may not fit the data seen in real-world applications (the data may have been corrupted by some noise or generated by a different hypothesis to begin with). I will introduce the theoretical models used to study these scenarios, including agnostic learning, in which we make no assumption about the target function which labeled the data. Recent results have generalized two of the most important algorithms in computational learning theory to work in this model, and I will discuss some of our work that applies these algorithms (for example, to agnostically learn subclasses of polynomial size DNF formulas) and recent workin progress which attempts to adapt new algorithms to the agnostic setting.