Search Mailing List Archives

Limit search to: Subject & Body Subject Author
Sort by: Reverse Sort
Limit to: All This Week Last Week This Month Last Month
Select Date Range     through    

[theory-seminar] Theory lunch Thursday, w. Ramya Vinayak

Gregory Valiant gvaliant at
Mon Aug 26 09:56:12 PDT 2019

Hi Friends,

Ramya will be visiting us from UW this week, and she will be giving a
Theory lunch talk (noon, Gates 463a). Title/abstract are below.




Crowd-sourced Clustering: Robust Algorithms and Query Design


We consider the problem of crowdsourced clustering – the task of clustering
items using answers from non-expert crowd workers who can answer
similarity-comparison queries of the type: “Are items i and j in the same
cluster?” Since the workers on crowdsourcing platforms are not experts,
they provide noisy answers. Further, due to budget constraints, we cannot
make all possible comparisons between items in the dataset. Thus, it is
important to design algorithms that can work with noise and partial data,
and to design queries that reduce the noise in the responses.

In the first part of the talk, we consider convex optimization based
clustering algorithms which work with noisy and partially observed graphs
and aim to recover the low-rank matrix that encodes underlying cluster
structure. We provide explicit conditions in terms of the size and sparsity
of clusters, and the number of observations required to guarantee
successful recovery of the clusters. While we apply these algorithms to
crowdsourced clustering problems, they can generally be applied to any
partially observed graphs. In the second part, we demonstrate that random
triangle queries (where three items are compared per query) provide less
noisy data, as well as greater quantity of data, for a fixed query budget,
as compared to random edge queries (where two items are compared per
query). We extend the analysis of convex clustering algorithms to show that
the exact recovery guarantees hold for triangle queries despite involving
dependent edges. We complement our theoretical results with experiments on
real datasets on Amazon Mechanical Turk.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

More information about the theory-seminar mailing list