Kilian Q. Weinberger

Associate Professor of Computer Science

CSE 519T: Advanced Machine Learning (Fall 2014)

Assoc. Prof. Kilian Weinberger

Course Number: CSE 519T
Credit: 3 Units
Times: 4:00pm-5:30pm Tuesdays and Thursdays
Room: Mallinckrodt 302
Office Hours: Fridays 10am, Jolley 407
Exam: TBD

Class Schedule:

Prerequisites:

- CSE 517a, CSE 511a
- Knowledge of Python / Matlab
- Coursera ML course
- Some basic knowledge of statistics, probability theory, matrix algebra

The goal of this course is to provide further depth beyond the existing introductory-level courses CSE 517a and CSE 511a for students with a strong interest in machine learning.

We will cover three high-level topics:
1. Large Scale Machine Learning
2. Structured Prediction
3. Deep Learning

Paper order and sign-up sheet here:

1. Large Scale Machine Learning:

Leon Bottou, Olivier Bousquet, The Tradeoffs of Large Scale Learning (paper)
K. Q. Weinberger, A. Dasgupta, J. Langford, A. Smola, J. Attenberg. Feature Hashing for Large Scale Multitask Learning. (paper)
Rie Johnson and Tong Zhang. Learning nonlinear functions using regularized greedy forest (code, paper)
Alekh Agarwal, Olivier Chapelle, Miroslav Dudik, John Langford, A Reliable Effective Terascale Linear Learning System (paper)
G. Mann, R. McDonald, M. Mohri, N. Silberman, and D. Walker, Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models (paper)
Hans Peter Graf, Eric Cosatto, Leon Bottou, Igor Durdanovic, Vladimir Vapnik, Parallel Support Vector Machines: The Cascade SVM (paper)
Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin and Joseph M. Hellerstein, Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud (paper)
Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar, Michael I. Jordan, A Scalable Bootstrap for Massive Data (paper)
Ali Rahimi and Ben Recht, Random Features for Large-Scale Kernel Machines (paper)
Andrei Broder, On the resemblance and containment of documents (paper, wiki)
Gal Chechik, Varun Sharma, Uri Shalit, Samy Bengio, Large Scale Online Learning of Image Similarity Through Ranking. (paper)
Alexandr Andoni and Piotr Indyk, Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions. (paper, paper)
Tong Zhang, Solving large scale linear prediction problems using stochastic gradient descent algorithms. (paper)
S Shalev-Shwartz, Y Singer, N Srebro, Pegasos: Primal Estimated sub-GrAdient SOlver for SVM (paper)
M. Zinkevich, M. Weimer, A. Smola, and L. Li, Parallelized Stochastic Gradient Descent (paper)
Elad Hazan, Tomer Koren, Nathan Srebro, Beating SGD: Learning SVMs in Sublinear Time (paper)

2. Resource Efficient Learning:

3. Deep Learning:


Optional course books:

Hal Daumé III: A Course in Machine Learning
David MacKay: Information Theory, Inference, and Learning Algorithms

For purchase:
Chris Bishop: Pattern Recognition and Machine Learning