Microsoft AI, Redmond, USA
Member of Technical Staff | 22 Jan 2024 – present |
Pretraining team focusing on synthetic data.
Microsoft Research, Redmond, USA
Senior Researcher | 18 Sept 2020 – 21 Jan 2024 |
Post Doctoral Researcher | 2 Apr 2018 – 17 Sept 2020 |
Research Intern | 6 June 2016 – 9 Sept 2016 |
Research Intern | 2 Feb 2015 – 8 May 2015 |
Joined Sébastien Bubeck’s Physics of AGI team in mid-2023 to build the Phi small language models. I worked on synthetic data generation and LLM-based data filtering, ran ablations, helped organize early efforts on coding capabilities, added necessary features to our PyTorch inference stack, and worked to ensure our evaluations were uncontaminated.
Built MSCCL, a programmable communication library for GPUs, which delivered significant speedups for machine learning workloads both internally and at a key partner. I wrote compilers, led the language design, and built algorithm synthesis tools.
Designed a novel derivative-based regex matcher and shipped it as the first guaranteed-linear-time engine in .NET 7. Partnered with CredScan to help them achieve a double-digit end-to-end throughput improvement.
Partnered with AzureML to ship a novel gradient aggregation technique for massively distributed ML training into Uber’s Horovod training library.
Led the CHET/EVA compiler projects to make homomorphic encryption accessible to non-experts.
Developed a stream comprehension compiler and adapted the approach to improve query compilation in the SCOPE language.
Aalto University, Espoo, Finland
Doctoral Student | 1 Sept 2013 – 31 Mar 2018 |
Research Assistant | 1 June 2010 – 31 Aug 2013 |
Research Assistant | 1 June 2009 – 31 Aug 2009 |
Extended a Java dynamic symbolic execution tool to support multi-threaded programs.
Developed a verification tool for C programs on LLVM and participated in SV-COMP.
Optofidelity Ltd.
Trainee | 17 June 2008 – 29 June 2008 |
Trainee | 5 June 2007 – 24 Aug 2007 |
Nokia Research Center
Trainee | 26 June 2006 – 28 July 2006 |
Trainee | 1 Sept 2004 – 31 May 2006 |
Aalto University, Department of Computer Science
Doctor of Science (Technology) | Sept 2013 – Mar 2018 |
Master of Science (Technology) | Sept 2011 – Aug 2013 |
Helsinki University, Department of Computer Science
Bachelor of Science | Sept 2007 – Apr 2011 |
Marah Abdin, Sam Jacobs, Ammar Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Amit Garg, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olatunji Ruwase, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Masahiro Tanaka, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yue Zhang, Yunan Zhang, Xiren Zhou. Phi-3 technical report: a highly capable language model locally on your phone. Whitepaper, 2024. arXiv
Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li. Textbooks are all you need. Whitepaper, 2023. arXiv
Margus Veanes, Thomas Ball, Gabriel Ebner, Olli Saarikivi. Symbolic automata: ω-regularity modulo theories. Preprint, 2023. arXiv
Olli Saarikivi, Margus Veanes, Stephen Toub, Daniel Moseley, Jose Perez Rodriguez. Finite automaton construction using regular expression derivatives to simulate behavior of a backtracking engine. US Patent 11,983,223, 2024. Google Patents
Abhinav Jangda, Saeed Maleki, Maryam Mehri Dehnavi, Madan Musuvathi, Olli Saarikivi. A framework for fine-grained synchronization of dependent GPU kernels. CGO 2024. DOI arXiv
Zhiqi Lin, Youshan Miao, Guanbin Xu, Cheng Li, Olli Saarikivi, Saeed Maleki, Fan Yang. Tessel: boosting distributed execution of large DNN models via flexible schedule search. HPCA 2024. DOI arXiv
Dan Moseley, Mario Nishio, Jose Perez Rodriguez, Olli Saarikivi, Stephen Toub, Margus Veanes, Tiki Wan, Eric Xu. Derivative based nonbacktracking real-world regex matching with backtracking semantics. PLDI 2023. DOI
Meghan Cowan, Saeed Maleki, Madanlal Musuvathi, Olli Saarikivi, Yifan Xiong. MSCCLang: Microsoft collective communication language. ASPLOS 2023. DOI arXiv
Aashaka Shah, Vijay Chidambaram, Meghan Cowan, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Jacob Nelson, Olli Saarikivi, Rachee Singh. TACCL: guiding collective algorithm synthesis using communication sketches. NSDI 2023. USENIX arXiv
Abhinav Jangda, Jun Huang, Guodong Liu, Amir Hossein Nodehi Sabet, Saeed Maleki, Youshan Miao, Madanlal Musuvathi, Todd Mytkowicz, Olli Saarikivi. Breaking the computation and communication abstraction barrier in distributed machine learning workloads. ASPLOS 2022. DOI arXiv
Zixian Cai, Zhengyang Liu, Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz, Jacob Nelson, Olli Saarikivi. Synthesizing optimal collective algorithms. PPoPP 2021. DOI arXiv
Gurbinder Gill, Roshan Dathathri, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Olli Saarikivi. Distributed training of embeddings using graph analytics. IPDPS 2021. DOI arXiv
Sangeeta Chowdhary, Wei Dai, Kim Laine, Olli Saarikivi. EVA improved: compiler and extension library for CKKS. WAHC 2021. DOI
Madanlal Musuvathi, Kim Laine, Kristin Lauter, Hao Chen, Olli Saarikivi, Saeed Maleki, Roshan Dathathri, Todd Mytkowicz. Homomorphic evaluation of tensor programs. US Patent 11,177,935, 2021. Google Patents
Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Olli Saarikivi, Tianju Xu, Vadim Eksarevskiy, Jaliya Ekanayake, Emad Barsoum. Scaling distributed training with adaptive summation. MLSys 2021. PDF arXiv
Lenka Turoňová, Lukáš Holík, Ondřej Lengál, Olli Saarikivi, Margus Veanes, Tomáš Vojnar. Regex matching with counting-set automata. OOPSLA 2020. DOI
Roshan Dathathri, Blagovesta Kostova, Olli Saarikivi, Wei Dai, Kim Laine, Madan Musuvathi. EVA: an encrypted vector arithmetic language and compiler for efficient homomorphic computation. PLDI 2020. DOI
Lukáš Holík, Ondřej Lengál, Olli Saarikivi, Lenka Turoňová, Margus Veanes, Tomáš Vojnar. Succinct determinisation of counting automata via sphere construction. APLAS 2019. DOI arXiv
Roshan Dathathri, Olli Saarikivi, Hao Chen, Kim Laine, Kristin Lauter, Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz. CHET: an optimizing compiler for fully-homomorphic neural-network inferencing. PLDI 2019. DOI arXiv
Olli Saarikivi, Margus Veanes, Tiki Wan, Eric Xu. Symbolic regex matcher. TACAS 2019. DOI
Olli Saarikivi, Margus Veanes. Minimization of symbolic transducers. CAV 2017. DOI
Olli Saarikivi, Margus Veanes, Todd Mytkowicz, Madan Musuvathi. Fusing effectful comprehensions. PLDI 2017. DOI
Olli Saarikivi, Margus Veanes. Translating C# to branching symbolic transducers. LPAR 2017 Short Presentations. PDF
Olli Saarikivi, Hernán Ponce de León, Kari Kähkönen, Keijo Heljanko, Javier Esparza. Minimizing test suites with unfoldings of multithreaded programs. TECS 16(2), 2017. DOI
Olli Saarikivi, Keijo Heljanko. LCTD: Tests-guided proofs for C programs on LLVM (competition contribution). TACAS 2016. DOI
Olli Saarikivi, Keijo Heljanko. LCTD: Test-guided proofs for C programs on LLVM. JLAMP 85(6), 2016. DOI
Kari Kähkönen, Olli Saarikivi, Keijo Heljanko. Unfolding based automated testing of multithreaded programs. ASE 22(4), 2015. DOI
Hernán Ponce de León, Olli Saarikivi, Kari Kähkönen, Keijo Heljanko, Javier Esparza. Unfolding based minimal test suites for testing multithreaded programs. ACSD 2015. DOI
Olli Saarikivi, Keijo Heljanko. Reporting races in dynamic partial order reduction. NFM 2015. DOI
Olli Saarikivi. Test-guided proofs for C programs on LLVM. Master’s Thesis, Aalto University, Finland, 2013. PDF
Kari Kähkönen, Olli Saarikivi, Keijo Heljanko. LCT: A parallel distributed testing tool for multithreaded Java programs. PDMC 2012. DOI
Kari Kähkönen, Olli Saarikivi, Keijo Heljanko. Using unfoldings in automated testing of multithreaded programs. ASE 2012. DOI
Olli Saarikivi, Kari Kähkönen, Keijo Heljanko. Improving dynamic partial order reductions for concolic testing. ACSD 2012. DOI
Kari Kähkönen, Tuomas Launiainen, Olli Saarikivi, Janne Kauttio, Keijo Heljanko, Ilkka Niemelä. LCT: An open source concolic testing tool for Java programs. BYTECODE 2011. PDF