Testing Collections of Properties
Reut Levi, Dana Ron, Ronitt Rubinfeld
ICS 2011
Shopping distribution
What properties do your distributions have?
Testing closeness of two distributions: trend change?
[Figure: transactions in California vs. transactions in New York]
Testing independence:
Shopping patterns: independent of zip code?
This work: Many distributions
One distribution:
D is an arbitrary black-box distribution over [n] that generates i.i.d. samples.
What is the sample complexity in terms of n? (Can it be sublinear?)
[Diagram: D → samples → Test → Pass/Fail?]
Some answers…
Uniformity: $\Theta(n^{1/2})$ [Goldreich, Ron 00], [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], [Paninski 08]
Identity: $\Theta(n^{1/2})$ [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01]
Closeness: $\Theta(n^{2/3})$ [Batu, Fortnow, Rubinfeld, Smith, White], [Valiant 08]
Independence: $O(n_1^{2/3} n_2^{1/3})$, $\Omega(n_1^{2/3} n_2^{1/3})$ [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], this work
Entropy: $n^{1/\beta^2 + o(1)}$ [Batu, Dasgupta, Kumar, Rubinfeld 05], [Valiant 08]
Support size: $\Theta(n/\log n)$ [Raskhodnikova, Ron, Shpilka, Smith 09], [Valiant, Valiant 10]
Monotonicity on total order: $\Theta(n^{1/2})$ [Batu, Kumar, Rubinfeld 04]
Monotonicity on poset: $n^{1-o(1)}$ [Bhattacharyya, Fischer, Rubinfeld, Valiant 10]
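One standard way to see why roughly $n^{1/2}$ samples can suffice for uniformity is collision counting: the uniform distribution on [n] has pairwise collision probability exactly 1/n, while any distribution that is ε-far from uniform in L1 distance has collision probability at least (1 + ε²)/n. The sketch below is only an illustration of that idea, not necessarily the tester from the cited papers; the function names and the midpoint threshold (1 + ε²/2)/n are assumptions made here.

```python
from collections import Counter


def collision_rate(samples):
    """Fraction of unordered sample pairs that hold the same value."""
    s = len(samples)
    if s < 2:
        raise ValueError("need at least two samples")
    counts = Counter(samples)
    collisions = sum(c * (c - 1) // 2 for c in counts.values())
    return collisions / (s * (s - 1) // 2)


def looks_uniform(samples, n, eps):
    """Accept if the empirical collision rate is below a midpoint threshold.

    Assumed threshold: halfway between 1/n (uniform) and (1 + eps^2)/n
    (the smallest collision probability of a distribution eps-far in L1).
    """
    threshold = (1 + eps ** 2 / 2) / n
    return collision_rate(samples) <= threshold
```

A usage sketch: draw samples from the black box D, then call `looks_uniform(samples, n, eps)`. How many samples are needed for a reliable answer depends on n and ε and is not fixed by this code.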
Collection of distributions:
Two models:
Sampling model: get $(i, x)$ for a random $i$ and $x \sim D_i$
Query model: get $(i, x)$ for a query $i$ chosen by the tester and $x \sim D_i$
Sample complexity in terms of n, m?
Further refinement: is the distribution on the i's known or unknown?
[Diagram: $D_1, D_2, \dots, D_m$ → samples → Test → Pass/Fail?]
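The difference between the two access models can be sketched as two methods on one object. This is a toy simulation under stated assumptions: the class name `Collection`, its method names, the 0-based indices, and the explicit probability vectors are all hypothetical, introduced here for illustration only. A real tester sees only the returned pairs, never the vectors.

```python
import random


class Collection:
    """Toy collection of m distributions over {0, ..., n-1} (hypothetical API)."""

    def __init__(self, dists, weights=None, seed=None):
        self.dists = dists                      # m probability vectors of length n
        self.m = len(dists)
        self.n = len(dists[0])
        # Weights on the indices i; uniform if not given.
        self.weights = weights if weights is not None else [1 / self.m] * self.m
        self.rng = random.Random(seed)

    def _draw_from(self, i):
        return self.rng.choices(range(self.n), weights=self.dists[i])[0]

    def sample(self):
        """Sampling model: i is random (drawn by weight), then x ~ D_i."""
        i = self.rng.choices(range(self.m), weights=self.weights)[0]
        return i, self._draw_from(i)

    def query(self, i):
        """Query model: the tester picks i, then x ~ D_i."""
        return i, self._draw_from(i)
```

In the "unknown weights" refinement, the tester would have access to `sample()` but not to the `weights` attribute.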
Properties considered:
Equivalence: all distributions are equal
"Clusterability": distributions can be clustered into k clusters such that, within a cluster, all distributions are close
Equivalence vs. independence
Process of drawing pairs: draw $i \in [m]$, then $x \sim D_i$; output $(i, x)$
Easy fact: the two coordinates of $(i, x)$ are independent iff all the $D_i$'s are equal
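The easy fact can be checked in a few lines. The notation $w_i$ for the probability of drawing index $i$ and $\bar{D}$ for the marginal on $x$ is introduced here as an assumption, and the argument assumes every $w_i > 0$.

```latex
\begin{align*}
\Pr[(i, x)] &= w_i \, D_i(x), \qquad
\bar{D}(x) = \sum_{j=1}^{m} w_j \, D_j(x). \\
\text{Independence} &\iff w_i \, D_i(x) = w_i \, \bar{D}(x)
  \quad \text{for all } i \in [m],\ x \in [n] \\
&\iff D_i = \bar{D} \quad \text{for all } i \in [m]
  \qquad (\text{since } w_i > 0) \\
&\iff D_1 = D_2 = \dots = D_m .
\end{align*}
```

The last step uses that if all $D_i$ equal a common $D$, then $\bar{D} = \sum_j w_j D = D$, and conversely that all $D_i$ equal to $\bar{D}$ are equal to each other.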
Results
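Def: $(D_1, \dots, D_m)$ has the Equivalence property if $D_i = D_{i'}$ for all $1 \le i, i' \le m$.

| | Lower Bound | Upper Bound |
|---|---|---|
| n > m | $\Omega(n^{2/3} m^{1/3})$ | $\tilde{O}(n^{2/3} m^{1/3})$ (unknown weights) |
| m > n | $\Omega(n^{1/2} m^{1/2})$ | $\tilde{O}(n^{1/2} m^{1/2})$ (known weights) |

Also yields a "tight" lower bound for independence testing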
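To get a feel for the two regimes, here is a rough worked example. The parameter values are chosen here purely for illustration, and constants, polylogarithmic factors, and the dependence on the distance parameter are ignored.

```latex
\begin{align*}
n = 10^{6},\ m = 10^{3}\ (n > m):\quad
  & n^{2/3} m^{1/3} = 10^{4} \cdot 10 = 10^{5},
  && \text{versus } nm = 10^{9}. \\
n = 10^{2},\ m = 10^{4}\ (m > n):\quad
  & n^{1/2} m^{1/2} = 10 \cdot 10^{2} = 10^{3},
  && \text{versus } nm = 10^{6}.
\end{align*}
```

In both cases the bound is far below $nm$, the size of the joint domain of pairs $(i, x)$.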
Clusterability
Can we cluster the distributions such that, in each cluster, the distributions are (very) close?
Sample complexity of the test is $O(k n^{2/3})$, for n = domain size, k = number of clusters
No dependence on the number of distributions
Closeness requirement is very stringent
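To make the property itself concrete, the sketch below checks whether a proposed clustering is valid when the distributions are fully known. It is not the sublinear tester from the talk. It assumes L1 distance as the notion of "close" and a hypothetical closeness threshold `beta`; the slides do not fix either, and the function names are made up for illustration.

```python
def l1_distance(p, q):
    """L1 distance between two probability vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


def is_valid_clustering(dists, labels, beta):
    """True if every two distributions with the same label are within beta in L1.

    dists:  list of probability vectors (assumed fully known here)
    labels: labels[j] is the cluster index assigned to dists[j]
    beta:   assumed closeness threshold (hypothetical parameter)
    """
    for a in range(len(dists)):
        for b in range(a + 1, len(dists)):
            if labels[a] == labels[b] and l1_distance(dists[a], dists[b]) > beta:
                return False
    return True
```

With `beta = 0` and a single cluster, this reduces to checking the Equivalence property exactly.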
Open Questions
• Clusterability in the sampling model, with a less stringent notion of close
• Other properties of collections?
• E.g., are all distributions shifts of each other?
Thank you