Sameer Agarwal

I am a software engineer at Google. I work on problems in computer vision using methods from optimization and algebra.

Before coming to Google, I was a postdoc at the University of Washington, a graduate student at University of California, San Diego and an undergraduate at the Indian Institute of Technology, Kanpur.

Authored Publications
    The Geometry of Rank Drop in a Class of Face-Splitting Matrix Products
    Erin Connelly
    Alperen Ali Ergur
    Rekha R. Thomas
    arXiv (2023)
    Given $k \leq 6$ points $(x_i,y_i) \in \mathbb{P}^2 \times \mathbb{P}^2$, we characterize rank deficiency of the $k \times 9$ matrix $Z_k$ with rows $x_i^\top \otimes y_i^\top$ in terms of the geometry of the point configurations $\{x_i\}$ and $\{y_i\}$. While this question comes from computer vision, the answer relies on tools from classical algebraic geometry: For $k \leq 5$, the geometry of the rank-drop locus is characterized by cross-ratios and basic (projective) geometry of point configurations. For the case $k=6$, the rank-drop locus is captured by the classical theory of cubic surfaces.
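As an illustration of the matrix $Z_k$ described in the abstract above, here is a minimal sketch in Python with NumPy. It is not taken from the paper: the function names `face_splitting_matrix` and `is_rank_deficient`, the tolerance, and the sample points are all assumptions made for this example. It stacks the rows $x_i^\top \otimes y_i^\top$ and checks numerically whether the rank falls below $k$.

```python
import numpy as np


def face_splitting_matrix(xs, ys):
    """Stack the rows kron(x_i, y_i) into a k x 9 matrix (hypothetical helper)."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    return np.stack([np.kron(x, y) for x, y in zip(xs, ys)])


def is_rank_deficient(xs, ys, tol=1e-9):
    """Return True when the numerical rank of Z_k is below k (assumed tolerance)."""
    Z = face_splitting_matrix(xs, ys)
    return bool(np.linalg.matrix_rank(Z, tol=tol) < Z.shape[0])


# Four homogeneous points in P^2, written as 3-vectors.
ys = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]

# Taking x_i = y_i gives four linearly independent rows.
generic = is_rank_deficient(ys, ys)

# If every x_i is the same point, each row is x (kron) y_i, so the rows
# span a space of dimension at most 3 and the rank must drop.
repeated = is_rank_deficient([[1, 0, 1]] * 4, ys)
```

In this sketch `generic` is `False` and `repeated` is `True`. The numerical rank test is only a stand-in for the geometric characterization the paper gives; it detects rank drop but says nothing about why it occurs.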
    On the local stability of semidefinite relaxations
    Diego Cifuentes
    Pablo A. Parrilo
    Rekha R. Thomas
    Mathematical Programming, vol. 193 (2022), pp. 629-663
    We consider a parametric family of quadratically constrained quadratic programs and their associated semidefinite programming (SDP) relaxations. Given a nominal value of the parameter at which the SDP relaxation is exact, we study conditions (and quantitative bounds) under which the relaxation will continue to be exact as the parameter moves in a neighborhood around the nominal value. Our framework captures a wide array of statistical estimation problems including tensor principal component analysis, rotation synchronization, orthogonal Procrustes, camera triangulation and resectioning, essential matrix estimation, system identification, and approximate GCD. Our results can also be used to analyze the stability of SOS relaxations of general polynomial optimization problems.
    An Atlas for the Pinhole Camera
    Timothy Duff
    Max Lieblich
    Rekha R. Thomas
    Foundations of Computational Mathematics (FOCM) (2022)
    We introduce an atlas of algebro-geometric objects associated with image formation in pinhole cameras. The nodes of the atlas are algebraic varieties or their vanishing ideals related to each other by projection or elimination and restriction or specialization, respectively. This atlas offers a unifying framework for the study of problems in 3D computer vision. We initiate the study of the atlas by completely characterizing a part of the atlas stemming from the triangulation problem. We conclude with several open problems and generalizations of the atlas.
    The Chiral Domain of a Camera Arrangement
    Andrew Pryhuber
    Rainer Sinn
    Rekha R. Thomas
    Journal of Mathematical Imaging & Vision (JMIV) (2022)
    We introduce the chiral domain of an arrangement of cameras A = {A_1, ..., A_m}, which is the subset of P^3 visible in A. It generalizes the classical definition of chirality to include all of P^3 and offers a unifying framework for studying multiview chirality. We give an algebraic description of the chiral domain which allows us to define and describe the chiral version of Triggs’ joint image. We then use the chiral domain to re-derive and extend prior results on chirality due to Hartley.
    Ideals of the Multiview Variety
    Andrew Pryhuber
    Rekha R. Thomas
    IEEE Transactions on Pattern Analysis & Machine Intelligence (PAMI), vol. n/a (2021)
    The multiview variety of an arrangement of cameras is the Zariski closure of the images of world points in the cameras. The prime vanishing ideal of this complex projective variety is called the multiview ideal. We show that the bifocal and trifocal polynomials from the cameras generate the multiview ideal when the foci are distinct. In the computer vision literature, many sets of (determinantal) polynomials have been proposed to describe the multiview variety. We establish precise algebraic relationships between the multiview ideal and these various ideals. When the camera foci are noncoplanar, we prove that the ideal of bifocal polynomials saturates to give the multiview ideal. Finally, we prove that all the ideals we consider coincide when dehomogenized, to cut out the space of finite images.
    Jump: Virtual Reality Video
    Robert Anderson
    Noah Snavely
    Carlos Hernandez Esteban
    Steven M. Seitz
    SIGGRAPH Asia (2016)
    We present Jump, a practical system for capturing high resolution, omnidirectional stereo (ODS) video suitable for wide scale consumption in currently available virtual reality (VR) headsets. Our system consists of a video camera built using off-the-shelf components and a fully automatic stitching pipeline capable of capturing video content in the ODS format. We have discovered and analyzed the distortions inherent to ODS when used for VR display as well as those introduced by our capture method and show that they are small enough to make this approach suitable for capturing a wide variety of scenes. Our stitching algorithm produces robust results by reducing the problem to one of pairwise image interpolation followed by compositing. We introduce novel optical flow and compositing methods designed specifically for this task. Our algorithm is temporally coherent and efficient, is currently running at scale on a distributed computing platform, and is capable of processing hours of footage each day.
    On The Existence of Epipolar Matrices
    Hon Leung Lee
    Bernd Sturmfels
    Rekha R. Thomas
    International Journal of Computer Vision (2016), pp. 1-13
    This paper considers the foundational question of the existence of a fundamental (resp. essential) matrix given $m$ point correspondences in two views. We present a complete answer for the existence of fundamental matrices for any value of $m$. Using examples, we disprove the widely held belief that fundamental matrices always exist whenever $m \leq 7$. At the same time, we prove that they exist unconditionally when $m \leq 5$. Under a mild genericity condition, we show that an essential matrix always exists when $m \leq 4$. We also characterize the six and seven point configurations in two views for which all matrices satisfying the epipolar constraint have rank at most one.
    Visibility Based Preconditioning for Bundle Adjustment
    Avanish Kushal
    IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2012)
    We present Visibility Based Preconditioning (VBP), a new technique for efficiently solving the linear least squares problems that arise in bundle adjustment. Using the camera-point visibility structure of the scene, we describe the construction of two preconditioners. These preconditioners, when combined with an inexact step Levenberg-Marquardt algorithm, offer state of the art performance on the BAL data set, with a 3-5x reduction in execution time over currently available methods while delivering comparable or better solution quality.
    Refractive Height Fields from Single and Multiple Images
    Qi Shan
    Brian Curless
    IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2012)
    We propose a novel framework for reconstructing homogeneous, transparent, refractive height-fields from a single viewpoint. The height-field is imaged against a known planar background, or sequence of backgrounds. Unlike existing approaches that do a point-by-point reconstruction – which is known to have intractable ambiguities – our method estimates and optimizes for the entire height-field at the same time. The formulation supports shape recovery from measured distortions (deflections) or directly from the images themselves, including from a single image. We report results for a variety of refractive height-fields showing significant improvement over prior art.
    A QCQP Approach to Triangulation
    Chris Aholt
    Rekha Thomas
    European Conference on Computer Vision, Springer Verlag (2012)
    Triangulation of a three-dimensional point from n ≥ 2 two-dimensional images can be formulated as a quadratically constrained quadratic program. We propose an algorithm to extract candidate solutions to this problem from its semidefinite programming relaxations. We then describe a sufficient condition and a polynomial time test for certifying when such a solution is optimal. This test has no false positives. Experiments indicate that false negatives are rare, and the algorithm has excellent performance in practice. We explain this phenomenon in terms of the geometry of the triangulation problem.
    Schematic Surface Reconstruction
    Changchang Wu
    Brian Curless
    Steven M. Seitz
    IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2012)
    This paper introduces a schematic representation for architectural scenes together with robust algorithms for reconstruction from sparse 3D point cloud data. The schematic models architecture as a network of transport curves, approximating a floorplan, with associated profile curves, together comprising an interconnected set of swept surfaces. The representation is extremely concise, composed of a handful of planar curves, and easily interpretable by humans. The approach also provides a principled mechanism for interpolating a dense surface, and enables filling in holes in the data, by means of a pipeline that employs a global optimization over all parameters. By incorporating a displacement map on top of the schematic surface, it is possible to recover fine details. Experiments show the ability to reconstruct extremely clean and simple models from sparse structure-from-motion point clouds of complex architectural scenes.
    Multicore Bundle Adjustment
    Changchang Wu
    Brian Curless
    Steven Seitz
    Proc. IEEE Conf. on Computer Vision and Pattern Recognition (2011), pp. 3057-3064
    The emergence of multi-core computers represents a fundamental shift, with major implications for the design of computer vision algorithms. Most computers sold today have a multicore CPU with 2-16 cores and a GPU with anywhere from 4 to 128 cores. Exploiting this hardware parallelism will be key to the success and scalability of computer vision algorithms in the future. In this project, we consider the design and implementation of new inexact Newton type Bundle Adjustment algorithms that exploit hardware parallelism for efficiently solving large scale 3D scene reconstruction problems. We explore the use of multicore CPU as well as multicore GPUs for this purpose. We show that overcoming the severe memory and bandwidth limitations of current generation GPUs not only leads to more space efficient algorithms, but also to surprising savings in runtime. Our CPU based system is up to ten times and our GPU based system is up to thirty times faster than the current state of the art methods, while maintaining comparable convergence behavior.
    Building Rome in a day
    Yasutaka Furukawa
    Noah Snavely
    Ian Simon
    Brian Curless
    Steven M. Seitz
    Rick Szeliski
    Communications of the ACM, vol. 54 (2011), pp. 105-112
    We present a system that can reconstruct 3D geometry from large, unorganized collections of photographs such as those found by searching for a given city (e.g., Rome) on Internet photo-sharing sites. Our system is built on a set of new, distributed computer vision algorithms for image matching and 3D reconstruction, designed to maximize parallelism at each stage of the pipeline and to scale gracefully with both the size of the problem and the amount of available computation. Our experimental results demonstrate that it is now possible to reconstruct city-scale image collections with more than a hundred thousand images in less than a day.