Sameer Agarwal
I am a software engineer at Google. I work on problems in computer vision using methods from optimization and algebra.
Before coming to Google, I was a postdoc at the University of Washington, a graduate student at the University of California, San Diego, and an undergraduate at the Indian Institute of Technology, Kanpur.
Authored Publications
Given $k \leq 6$ points $(x_i,y_i) \in \mathbb{P}^2 \times \mathbb{P}^2$, we characterize rank deficiency of the $k \times 9$ matrix $Z_k$ with rows $x_i^\top \otimes y_i^\top$ in terms of the geometry of the point configurations $\{x_i\}$ and $\{y_i\}$.
While this question comes from computer vision, the answer relies on tools from classical algebraic geometry: for $k \leq 5$, the geometry of the rank-drop locus is characterized by cross-ratios and basic (projective) geometry of point configurations. For the case $k=6$, the rank-drop locus is captured by the classical theory of cubic surfaces.
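As a rough illustration of the matrix described in this abstract, the sketch below builds $Z_k$ from homogeneous coordinates and computes its numerical rank. The function name `correspondence_matrix`, the random test points, and the degenerate configuration are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def correspondence_matrix(xs, ys):
    """Stack the rows x_i^T (Kronecker product) y_i^T into a k x 9 matrix."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    return np.stack([np.kron(x, y) for x, y in zip(xs, ys)])

# Generic configuration: k = 6 random point pairs in homogeneous coordinates.
rng = np.random.default_rng(0)
k = 6
Z_generic = correspondence_matrix(rng.standard_normal((k, 3)),
                                  rng.standard_normal((k, 3)))
rank_generic = np.linalg.matrix_rank(Z_generic)

# A simple degenerate configuration (chosen for illustration only):
# all x_i coincide and y_3 lies on the line through y_1 and y_2,
# so the third row is the sum of the first two.
x = np.array([1.0, 2.0, 1.0])
y1 = np.array([1.0, 0.0, 1.0])
y2 = np.array([0.0, 1.0, 1.0])
y3 = y1 + y2
Z_degenerate = correspondence_matrix([x, x, x], [y1, y2, y3])
rank_degenerate = np.linalg.matrix_rank(Z_degenerate)

print(rank_generic, rank_degenerate)
```

The generic matrix has full row rank, while the degenerate one drops rank because its rows are linearly dependent by construction.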
The Chiral Domain of a Camera Arrangement
Andrew Pryhuber
Rainer Sinn
Rekha R. Thomas
Journal of Mathematical Imaging & Vision (JMIV) (2022)
We introduce the chiral domain of an arrangement of cameras A = {A_1, ..., A_m}, which is the subset of P^3 visible in A. It generalizes the classical definition of chirality to include all of P^3 and offers a unifying framework for studying multiview chirality. We give an algebraic description of the chiral domain which allows us to define and describe the chiral version of Triggs’ joint image. We then use the chiral domain to re-derive and extend prior results on chirality due to Hartley.
On the local stability of semidefinite relaxations
Diego Cifuentes
Pablo A. Parrilo
Rekha R. Thomas
Mathematical Programming, 193 (2022), pp. 629-663
We consider a parametric family of quadratically constrained quadratic programs and their associated semidefinite programming (SDP) relaxations. Given a nominal value of the parameter at which the SDP relaxation is exact, we study conditions (and quantitative bounds) under which the relaxation will continue to be exact as the parameter moves in a neighborhood around the nominal value. Our framework captures a wide array of statistical estimation problems including tensor principal component analysis, rotation synchronization, orthogonal Procrustes, camera triangulation and resectioning, essential matrix estimation, system identification, and approximate GCD. Our results can also be used to analyze the stability of SOS relaxations of general polynomial optimization problems.
An Atlas for the Pinhole Camera
Timothy Duff
Max Lieblich
Rekha R. Thomas
Foundations of Computational Mathematics (FOCM) (2022)
We introduce an atlas of algebro-geometric objects associated with image formation in pinhole cameras. The nodes of the atlas are algebraic varieties or their vanishing ideals related to each other by projection or elimination and restriction or specialization, respectively. This atlas offers a unifying framework for the study of problems in 3D computer vision. We initiate the study of the atlas by completely characterizing a part of the atlas stemming from the triangulation problem. We conclude with several open problems and generalizations of the atlas.
Ideals of the Multiview Variety
Andrew Pryhuber
Rekha R. Thomas
IEEE Transactions on Pattern Analysis & Machine Intelligence (PAMI), n/a (2021)
The multiview variety of an arrangement of cameras is the Zariski closure of the images of world points in the cameras. The prime vanishing ideal of this complex projective variety is called the multiview ideal. We show that the bifocal and trifocal polynomials from the cameras generate the multiview ideal when the foci are distinct. In the computer vision literature, many sets of (determinantal) polynomials have been proposed to describe the multiview variety. We establish precise algebraic relationships between the multiview ideal and these various ideals. When the camera foci are noncoplanar, we prove that the ideal of bifocal polynomials saturates to give the multiview ideal. Finally, we prove that all the ideals we consider coincide when dehomogenized, to cut out the space of finite images.
Jump: Virtual Reality Video
Robert Anderson
Carlos Hernandez Esteban
Steven M. Seitz
SIGGRAPH Asia (2016)
We present Jump, a practical system for capturing high resolution, omnidirectional stereo (ODS) video suitable for wide scale consumption in currently available virtual reality (VR) headsets. Our system consists of a video camera built using off-the-shelf components and a fully automatic stitching pipeline capable of capturing video content in the ODS format. We have discovered and analyzed the distortions inherent to ODS when used for VR display as well as those introduced by our capture method and show that they are small enough to make this approach suitable for capturing a wide variety of scenes. Our stitching algorithm produces robust results by reducing the problem to one of pairwise image interpolation followed by compositing. We introduce novel optical flow and compositing methods designed specifically for this task. Our algorithm is temporally coherent and efficient, is currently running at scale on a distributed computing platform, and is capable of processing hours of footage each day.
On The Existence of Epipolar Matrices
Hon Leung Lee
Bernd Sturmfels
Rekha R. Thomas
International Journal of Computer Vision (2016), pp. 1-13
This paper considers the foundational question of the existence of a fundamental (resp. essential) matrix given $m$ point correspondences in two views. We present a complete answer for the existence of fundamental matrices for any value of $m$. Using examples, we disprove the widely held belief that fundamental matrices always exist whenever $m \leq 7$. At the same time, we prove that they exist unconditionally when $m \leq 5$. Under a mild genericity condition, we show that an essential matrix always exists when $m \leq 4$. We also characterize the six and seven point configurations in two views for which all matrices satisfying the epipolar constraint have rank at most one.
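To make the epipolar constraint mentioned in this abstract concrete, here is a minimal sketch assuming the common convention $y^\top F x = 0$ for a correspondence $(x, y)$; the paper may use a different ordering of the two views. Each correspondence contributes one linear equation in the nine entries of $F$, so for eight generic correspondences consistent with a rank-2 matrix the stacked system has a one-dimensional null space. The matrix `F_true`, the helper `sample_correspondence`, and the synthetic data are hypothetical and serve only this example; the sketch does not reproduce the paper's existence results for $m \leq 7$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ground truth: project a random 3x3 matrix onto rank 2.
U, s, Vt = np.linalg.svd(rng.standard_normal((3, 3)))
F_true = U @ np.diag([s[0], s[1], 0.0]) @ Vt

def sample_correspondence(F, rng):
    """Return (x, y) with y^T F x = 0 by placing y on the epipolar line F x."""
    x = rng.standard_normal(3)
    line = F @ x
    y = np.cross(line, rng.standard_normal(3))  # orthogonal to `line`
    return x, y

pairs = [sample_correspondence(F_true, rng) for _ in range(8)]

# y^T F x = (x^T kron y^T) vec(F), with vec stacking the columns of F.
Z = np.stack([np.kron(x, y) for x, y in pairs])  # shape (8, 9)
residuals = np.array([y @ F_true @ x for x, y in pairs])

# The right singular vector for the zero singular value spans the null space.
_, _, Vt_z = np.linalg.svd(Z)
F_est = Vt_z[-1].reshape(3, 3, order="F")  # undo the column-major vec

# F_est has unit Frobenius norm, so this is |cos| of the angle to F_true.
alignment = abs(np.sum(F_est * F_true)) / np.linalg.norm(F_true)
print(np.max(np.abs(residuals)), alignment)
```

With fewer than eight correspondences the null space has dimension greater than one, and whether it contains a matrix of rank exactly two is the question the paper addresses.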
A QCQP Approach to Triangulation
Chris Aholt
Rekha Thomas
European Conference on Computer Vision, Springer Verlag (2012)
Triangulation of a three-dimensional point from n ≥ 2 two-dimensional images can be formulated as a quadratically constrained quadratic program. We propose an algorithm to extract candidate solutions to this problem from its semidefinite programming relaxations. We then describe a sufficient condition and a polynomial time test for certifying when such a solution is optimal. This test has no false positives. Experiments indicate that false negatives are rare, and the algorithm has excellent performance in practice. We explain this phenomenon in terms of the geometry of the triangulation problem.
Visibility Based Preconditioning for Bundle Adjustment
Avanish Kushal
IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2012)
We present Visibility Based Preconditioning (VBP), a new technique for efficiently solving the linear least squares problems that arise in bundle adjustment. Using the camera-point visibility structure of the scene, we describe the construction of two preconditioners. These preconditioners, when combined with an inexact step Levenberg-Marquardt algorithm, offer state-of-the-art performance on the BAL data set, with a 3-5x reduction in execution time over currently available methods while delivering comparable or better solution quality.
Schematic Surface Reconstruction
Changchang Wu
Brian Curless
Steven M. Seitz
IEEE Conference on Computer Vision and Pattern Recognition, IEEE (2012)
This paper introduces a schematic representation for architectural scenes together with robust algorithms for reconstruction from sparse 3D point cloud data. The schematic models architecture as a network of transport curves, approximating a floorplan, with associated profile curves, together comprising an interconnected set of swept surfaces. The representation is extremely concise, composed of a handful of planar curves, and easily interpretable by humans. The approach also provides a principled mechanism for interpolating a dense surface, and enables filling in holes in the data, by means of a pipeline that employs a global optimization over all parameters. By incorporating a displacement map on top of the schematic surface, it is possible to recover fine details. Experiments show the ability to reconstruct extremely clean and simple models from sparse structure-from-motion point clouds of complex architectural scenes.