I am a PhD student at the University of Washington advised by Luke Zettlemoyer, working on representation learning and on neuro-inspired and hardware-optimized deep learning. Previously, I interned at the UCL Machine Reading Group, where I was advised by Sebastian Riedel and worked on information retrieval and link prediction in knowledge graphs. I did my master's in computer science at the University of Lugano, where I was advised by Fabio Crestani. During my master's, I did two research internships at Microsoft under Chris Basoglu, where I worked on speech recognition and memory-efficient deep learning algorithms. I did my bachelor's in applied mathematics with The Open University while working as a software engineer in the automation industry.
In 2013, I also took part in Kaggle competitions where I peaked at world rank 63 (top 0.22%).
Feel free to contact me at lastname@cs.washington.edu. If you have questions regarding deep learning, I prefer that you post them as comments on one of my blog posts (or on this page if they do not fit any blog post); that way, everyone can benefit from your questions and my answers.
Research Interests
My main research thesis is that computationally efficient methods will accelerate progress in, and deepen our understanding of, deep learning. In particular, I am interested in:
- Sparse training and inference
- Mechanistic interpretability of deep learning
- Neuro-inspired deep learning
- Hardware-optimized deep learning
- Making large models more accessible
- Low-bit quantization of large language models (see the sketch after this list)
Selected Publications
2022
The case for 4-bit precision: k-bit Inference Scaling Laws. Tim Dettmers, Luke Zettlemoyer. In submission. [arXiv]
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale. Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer. NeurIPS 2022. [arXiv] [blog post]
Petals: Collaborative Inference and Fine-tuning of Large Models. Alexander Borzunov*, Dmitry Baranchuk*, Tim Dettmers*, Max Ryabinin*, Younes Belkada, Artem Chumachenko, Pavel Samygin, Colin Raffel. In submission. [arXiv] [website] [chatbot]
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models. Margaret Li, Suchin Gururangan, Tim Dettmers, Mike Lewis, Tim Althoff, Noah A. Smith, Luke Zettlemoyer. In submission. [arXiv]
2021
8-bit Optimizers via Block-wise Quantization. Tim Dettmers, Mike Lewis, Sam Shleifer, Luke Zettlemoyer. ICLR 2022 (Spotlight). [arXiv] [library] [video]
BASE Layers: Simplifying Training of Large, Sparse Models. Mike Lewis, Shruti Bhosale, Tim Dettmers, Naman Goyal, Luke Zettlemoyer. ICML 2021. [arXiv] [code]
2019
Sparse Networks from Scratch: Faster Training without Losing Performance. Tim Dettmers, Luke Zettlemoyer. [arXiv] [bib] [code] [blog post] [Q&A]
2018
Convolutional 2D Knowledge Graph Embeddings. Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel. AAAI 2018. [arXiv] [bib] [code] [data] [Q&A]
2016
8-Bit Approximations for Parallelism in Deep Learning. Tim Dettmers. ICLR 2016. [arXiv] [bib] [code] [data]
Awards & Honors
2018/2019 Jeff Dean – Heidi Hopper Endowed Regental Fellowship
2016/2017 Google Scholarship for Students with Disabilities
2016 ICLR 2016 Travel Award
2011/2012 Best regional graduate: Mathematical-technical Software Developer
Research Internships
2017-09 – Now UCL Machine Reading Group, London
2017-06 – 2017-09 Microsoft Research, Redmond
2017-01 – 2017-06 UCL Machine Reading Group, London
2016-06 – 2016-09 Microsoft Research, Redmond
Service
I serve as a reviewer for the journal Knowledge and Information Systems.