Craig Citro

I'm a software engineer working primarily on Google Colaboratory, and I generally spend my time building tools to make researchers' lives easier.

I'm a former number theorist (PhD, UCLA, 2009); I studied p-adic L-functions and modular forms, especially their computational aspects. Since coming to Google, I've worked primarily on tools for data science, including Google BigQuery, TensorFlow, and Colaboratory. But all that aside, my biggest claim to fame was being an extra on Buffy the Vampire Slayer.

Authored Publications
    API Usability at Scale
    Luke Church
    Proceedings of the 26th annual workshop of the Psychology of Programming Interest Group (2016)
    Designing and maintaining useful and usable APIs remains challenging. At Google we manage hundreds of APIs. In this article we report on the experience of doing so and describe six on-going challenges: resource allocation, empirically-grounded guidelines, communicating issues, supporting API evolution over time, usable auth, and usable client libraries at scale.
    TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
    Ashish Agarwal
    Ian Goodfellow
    Andrew Harp
    Yangqing Jia
    Rafal Jozefowicz
    Lukasz Kaiser
    Manjunath Kudlur
    Dan Mané
    Rajat Monga
    Chris Olah
    Mike Schuster
    Jonathon Shlens
    Benoit Steiner
    Ilya Sutskever
    Kunal Talwar
    Paul Tucker
    Vijay Vasudevan
    Pete Warden
    Yuan Yu
    Xiaoqiang Zheng
    tensorflow.org (2015)
    TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.
    Systematic Analysis of Challenge-Driven Improvements in Molecular Prognostic Models for Breast Cancer
    Adam Margolin
    Erhan Bilal
    Erich Huang
    Ben Sauerwine
    Nicole Deflaux
    Lamia Youseff
    Joseph L. Hellerstein
    Science Translational Medicine, vol. 5.181 (2013), 181re1-181re1
    Although molecular prognostics in breast cancer are among the most successful examples of translating genomic analysis to clinical applications, optimal approaches to breast cancer clinical risk prediction remain controversial. The Sage Bionetworks–DREAM Breast Cancer Prognosis Challenge (BCC) is a crowdsourced research study for breast cancer prognostic modeling using genome-scale data. The BCC provided a community of data analysts with a common platform for data access and blinded evaluation of model accuracy in predicting breast cancer survival on the basis of gene expression data, copy number data, and clinical covariates. This approach offered the opportunity to assess whether a crowdsourced community Challenge would generate models of breast cancer prognosis commensurate with or exceeding current best-in-class approaches. The BCC comprised multiple rounds of blinded evaluations on held-out portions of data on 1981 patients, resulting in more than 1400 models submitted as open source code. Participants then retrained their models on the full data set of 1981 samples and submitted up to five models for validation in a newly generated data set of 184 breast cancer patients. Analysis of the BCC results suggests that the best-performing modeling strategy outperformed previously reported methods in blinded evaluations; model performance was consistent across several independent evaluations; and aggregating community-developed models achieved performance on par with the best-performing individual models.
    Cython: The Best of Both Worlds
    Stefan Behnel
    Lisandro Dalcin
    Dag Sverre Seljebotn
    Kurt Smith
    Computing in Science and Engineering, vol. 13.2 (2011), pp. 31-39
    Cython is an extension to the Python language that allows explicit type declarations and is compiled directly to C. This addresses Python's large overhead for numerical loops and the difficulty of efficiently making use of existing C and Fortran code, which Cython code can interact with natively. The Cython language combines the speed of C with the power and simplicity of the Python language.